Efficient forward & backward #36

Merged

dbaranchuk merged 13 commits into main from efficient-forward-backward

Jul 23, 2022

Collaborator

dbaranchuk commented Jul 22, 2022

Fault-tolerant sequential forward and backward passes;
Efficient forward and backward calls that process an input batch in parallel by splitting it into small sub-batches;
Updated server deployment scripts.
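
The first two items can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `FlakyServer`, `sequential_forward`, `split_batch`, and `parallel_forward` are hypothetical names, plain Python lists stand in for tensors, and a failed hop is simply retried on the same server, whereas a real client would more likely re-route to another server.

```python
import asyncio


class FlakyServer:
    """Hypothetical stand-in for a remote server that holds a span of blocks."""

    def __init__(self, scale, fail_times=0):
        self.scale = scale            # toy "computation": multiply inputs by scale
        self.fail_times = fail_times  # number of initial calls that will fail

    async def forward(self, rows):
        await asyncio.sleep(0)  # yield control, as a network call would
        if self.fail_times > 0:
            self.fail_times -= 1
            raise ConnectionError("server unavailable")
        return [[x * self.scale for x in row] for row in rows]


async def sequential_forward(rows, servers, max_retries=3):
    """Pass rows through every server in order, retrying a hop that fails."""
    for server in servers:
        for attempt in range(max_retries):
            try:
                rows = await server.forward(rows)
                break
            except ConnectionError:
                if attempt == max_retries - 1:
                    raise  # assumption: give up after max_retries attempts
    return rows


def split_batch(rows, sub_batch_size):
    """Split a batch (list of rows) into consecutive sub-batches."""
    return [rows[i:i + sub_batch_size] for i in range(0, len(rows), sub_batch_size)]


async def parallel_forward(rows, servers, sub_batch_size):
    """Run each sub-batch through the server chain concurrently, then re-join."""
    chunks = split_batch(rows, sub_batch_size)
    results = await asyncio.gather(
        *(sequential_forward(chunk, servers) for chunk in chunks)
    )
    # gather preserves argument order, so the rows come back in input order
    return [row for chunk in results for row in chunk]


if __name__ == "__main__":
    chain = [FlakyServer(2, fail_times=1), FlakyServer(3)]
    print(asyncio.run(parallel_forward([[1, 2], [3, 4], [5, 6]], chain, 2)))
```

Because each sub-batch walks the chain independently, different sub-batches can occupy different servers at the same moment, which is one plausible reading of why the "Sequential + Parallel" configuration keeps more than one GPU busy.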

Some measurements:

Config: Bloom 6b3 | input_ids.shape = (32, 128) | A100 | 3 servers, 10 blocks each

Naive (current)
84.2 sec / it | 1 of 3 GPUs is utilized at a time

Sequential
37.4 sec / it | 1 of 3 GPUs is utilized at a time

Sequential + Parallel
19.4 sec / it | 2 or 3 of 3 GPUs are utilized at a time

Reference Gradient Checkpointing (Single GPU A100)
8.8 sec / it | 100% GPU utilization

Reference (Single GPU A100)
5.7 sec / it | 100% GPU utilization

dbaranchuk added 2 commits

July 22, 2022 21:50


efficient forward & backward

f976191

update server deployment scripts

3cc4e0b

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic reviewed

src/client/remote_model.py (resolved)

justheuristic reviewed

src/client/remote_model.py (resolved)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic approved these changes

src/client/async_forward_backward.py (outdated, resolved)

Collaborator

justheuristic commented Jul 22, 2022

Please add a test case for test_remote_sequential that covers the case where it actually splits tokens w.r.t. batches, OR create an issue and tag me :)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

Collaborator

justheuristic commented Jul 22, 2022

[minor, reminder: I'd guess that test_full_model will fix itself after merging #35]

justheuristic reviewed

src/client/async_forward_backward.py (outdated, resolved)

dbaranchuk added 7 commits

July 23, 2022 02:17


Merge remote-tracking branch 'origin/main' into efficient-forward-backward

b60eedc

address the comments

ed86b36

address the comments

d1abb2f

Delete async_forward_backward.py

ba9f3ae

remove redundant imports

180d91b

Merge branch 'efficient-forward-backward' of github.com:learning-at-home/bloom-demo into efficient-forward-backward

44e33f9

black & isort

c925f4d

dbaranchuk marked this pull request as ready for review

July 22, 2022 23:52

dbaranchuk added 4 commits

July 23, 2022 03:02


address comments & black & isort

3dfa956

isort

d9d73cb

rename functions & add descriptions

a2d020a

add todo & black

d940083

dbaranchuk merged commit 6573076 into main
