Ultra-scale Playbook - Data Parallelism

Notes on training LLMs with a data parallelism strategy

May 17, 2025 · 4 min · 844 words

Ultra-scale Playbook - Train on a single GPU

Notes on the Ultra-scale Playbook - training an LLM on a single GPU

April 27, 2025 · 4 min · 786 words