Ultrascale Playbook - Tensor and Sequence Parallelism
Notes on training LLMs using tensor and sequence parallelism
Notes on training LLMs using pipeline parallelism
My talk on training LLMs at PyData MCR
Notes on choosing appropriate batch size and compute for training LLMs
Notes on training LLMs using sharding strategies