Zero Bubble Pipeline Parallelism Titelbild

Zero Bubble Pipeline Parallelism

Zero Bubble Pipeline Parallelism

Jetzt kostenlos hören, ohne Abo

Details anzeigen

Nur 0,99 € pro Monat für die ersten 3 Monate

Danach 9.95 € pro Monat. Bedingungen gelten.

Über diesen Titel

Core idea is think about backward pass into two flows, one to compute grad wrt to parameters, and one to compute grad wrt to output of last layer, schedule so that you are always working instead of waiting (bubble). Read full paper: https://arxiv.org/abs/2401.10241 Tags: Systems and Performance, Deep Learning, Machine Learning
Noch keine Rezensionen vorhanden