
Efficient forward & backward #36

Merged
dbaranchuk merged 13 commits into main on Jul 23, 2022

Conversation

@dbaranchuk (Collaborator) commented Jul 22, 2022

  • Fault-tolerant sequential forward and backward passes;
  • Efficient forward and backward calls that process an input batch in parallel by splitting it into small sub-batches (see the sketch after this list);
  • Updated server deployment scripts.
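A minimal sketch of the parallel-dispatch idea (the actual implementation lives in src/client/async_forward_backward.py; the helpers `run_remote_forward` and `parallel_forward` below are hypothetical illustrations, not the PR's API):

```python
import asyncio
import torch

async def run_remote_forward(sub_batch: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a fault-tolerant remote forward call."""
    await asyncio.sleep(0)  # placeholder for network I/O to a remote server
    return sub_batch  # a real call would return the transformer block outputs

async def parallel_forward(input_ids: torch.Tensor, max_sub_batch: int = 8) -> torch.Tensor:
    # Split the (batch, seq_len) input into small sub-batches along the batch dim...
    sub_batches = torch.split(input_ids, max_sub_batch, dim=0)
    # ...and dispatch them concurrently so several servers are busy at the same time.
    outputs = await asyncio.gather(*(run_remote_forward(sb) for sb in sub_batches))
    return torch.cat(outputs, dim=0)

if __name__ == "__main__":
    batch = torch.zeros(32, 128, dtype=torch.long)  # matches input_ids.shape = (32, 128) below
    result = asyncio.run(parallel_forward(batch))
    print(result.shape)  # torch.Size([32, 128])
```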

Some measurements:

Config: Bloom 6b3 | input_ids.shape = (32, 128) | A100 | 3 servers, 10 blocks each

| Setup | Time per iteration | GPU utilization |
| --- | --- | --- |
| Naive (current) | 84.2 sec / it | 1/3 GPUs utilized at a time |
| Sequential | 37.4 sec / it | 1/3 GPUs utilized at a time |
| Sequential + Parallel | 19.4 sec / it | 2/3 or 3/3 GPUs utilized at a time |
| Reference: gradient checkpointing (single A100) | 8.8 sec / it | 100% |
| Reference (single A100) | 5.7 sec / it | 100% |

Review thread on src/client/async_forward_backward.py (outdated, resolved)
@justheuristic (Collaborator)

Please add a test case for test_remote_sequential when it actually splits tokens w.r.t. batches OR create an issue and tag me :)

@justheuristic (Collaborator)

[minor, reminder: i'd guess that the test_full_model will fix itself after merging #35 ]

@dbaranchuk dbaranchuk marked this pull request as ready for review July 22, 2022 23:52
@dbaranchuk dbaranchuk merged commit 6573076 into main Jul 23, 2022