
Update model #17

Merged: 5 commits merged into main on Jul 5, 2022

Conversation

@dbaranchuk (Collaborator) commented Jul 4, 2022

  • Default for the client's model parameters: `requires_grad=False`;
  • Interface: an "abstract" base class `BloomForYou`. `DistributedBloomForYou` inherits from it, and all subsequent classes for the various BLOOM applications inherit from `DistributedBloomForYou`;
  • `lm_head` replaced by `h @ word_embs` for `DistributedBloomForCausalLM`;
  • Updated converted model without `lm_head`: `dbaranchuk/test-bloomd-6b3`;
  • `h @ word_embs` is very slow in fp16 on CPU, so we keep fp32 for now. TODO @dbaranchuk: understand why fp16 is slow on CPU. If everything is correct, then (TODO @dbaranchuk) keep the embeddings in fp16 and cast to fp32 on the fly;
  • Minor bug fix in `run_local_server.sh`;
  • Updated `tests/test_full_model.py`.
    Test results for `REF_NAME='bigscience/bloom-6b3'`:
    CPU vs GPU (A100), fp32: passes with `forward_atol=1e-2`, `inference_atol=1e-2`
    CPU vs CPU, fp32: passes with `forward_atol=1e-4`, `inference_atol=1e-3`
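The `lm_head -> h @ word_embs` change ties the output projection to the input embedding matrix instead of storing a separate `lm_head` weight. A minimal sketch of the idea (function name and toy shapes are illustrative, not from this PR):

```python
import torch

def tied_lm_head(hidden_states: torch.Tensor,
                 word_embeddings: torch.nn.Embedding) -> torch.Tensor:
    # Project hidden states onto the vocabulary by reusing the input
    # embedding matrix: logits = h @ word_embs^T. No lm_head weight is stored.
    return hidden_states @ word_embeddings.weight.T

# toy shapes: batch=2, seq=4, hidden=8, vocab=16
emb = torch.nn.Embedding(16, 8)
h = torch.randn(2, 4, 8)
logits = tied_lm_head(h, emb)  # shape (2, 4, 16)
```

This is why the converted checkpoint no longer needs an `lm_head` tensor: the logits come directly from the (transposed) word-embedding matrix.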
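The TODO about keeping embeddings in fp16 while computing in fp32 can be sketched as follows, together with the new `requires_grad=False` default for client-side parameters (names and shapes are illustrative assumptions):

```python
import torch

emb = torch.nn.Embedding(16, 8)
emb.weight.data = emb.weight.data.half()  # store embeddings in fp16 to save memory
emb.weight.requires_grad = False          # client-side parameters are frozen by default

h = torch.randn(2, 4, 8)  # fp32 hidden states
# cast the embedding matrix to fp32 on the fly, so the matmul itself
# runs in fp32 on CPU (avoiding the observed fp16 slowdown)
logits = h @ emb.weight.float().T
assert logits.dtype == torch.float32
```

Storage stays at half precision; only the temporary fp32 copy used by the matmul costs extra memory during the forward pass.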
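The reported tolerances (`forward_atol`, `inference_atol`) suggest an elementwise comparison against the reference model with an absolute tolerance. A hedged sketch of such a check (the helper name is hypothetical, not the PR's actual test code):

```python
import torch

def outputs_match(ref: torch.Tensor, out: torch.Tensor, atol: float) -> bool:
    # Compare full-model outputs elementwise with an absolute tolerance,
    # e.g. atol=1e-2 for the CPU-vs-GPU fp32 comparison reported above.
    return torch.allclose(ref, out, atol=atol, rtol=0.0)

ref = torch.zeros(3)
out = torch.tensor([0.005, -0.003, 0.008])
assert outputs_match(ref, out, atol=1e-2)       # within 1e-2 everywhere
assert not outputs_match(ref, out, atol=1e-3)   # 0.008 exceeds 1e-3
```

The looser CPU-vs-GPU tolerance reflects accumulated floating-point differences between the two backends, while identical-backend runs match more tightly.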

@dbaranchuk dbaranchuk merged commit 0b5a689 into main Jul 5, 2022