This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
Using the bnb.optim.Adam8bit optimizer in place of torch.optim.Adam causes a crash after a handful of batches:
12it [00:22, 1.82s/it]
Error an illegal memory access was encountered at line 198 in file /home/alyssa/gpt_math/bitsandbytes/csrc/ops.cu
I am fine-tuning Hugging Face's version of the gpt2-large model on an RTX 3090 (Ampere) GPU with CUDA version 11.6 and NVIDIA driver version 510.73.05. I have tried compiling bitsandbytes from source on my machine, and I have tried the set_optim_to_run_embedding_in_fp32 trick from huggingface/transformers#14819; neither changed the behavior. Running with the standard PyTorch Adam optimizer works fine. nvidia-smi shows 16 GB of memory in use on a GPU with 24 GB, so the GPU should not be anywhere close to running out of memory.
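A minimal sketch of the optimizer swap described above, for anyone trying to reproduce the crash. The helper name build_optimizer and the learning rate are placeholders and are not taken from the original training script. Setting CUDA_LAUNCH_BLOCKING=1 before CUDA is initialized makes kernel launches synchronous, which may help the illegal memory access get reported at the kernel that actually triggers it rather than at a later call.

```python
import os

# Assumption: this runs before any CUDA call so the setting takes effect.
# Synchronous launches slow training down; use this for debugging only.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def build_optimizer(params, lr=1e-5, use_8bit=True):
    """Return bnb.optim.Adam8bit or torch.optim.Adam for the given parameters.

    `build_optimizer` and the default `lr` are hypothetical; only the two
    optimizer classes come from the report above.
    """
    if use_8bit:
        # Imported lazily so the fallback path works without bitsandbytes.
        import bitsandbytes as bnb
        return bnb.optim.Adam8bit(params, lr=lr)
    import torch
    return torch.optim.Adam(params, lr=lr)
```

Calling build_optimizer(model.parameters(), use_8bit=False) gives the plain Adam baseline that is reported to work, so the two runs differ only in the optimizer.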
rationalism changed the title from "8-bit optimizer crashes when fine-tuning gpt-2 large" to "8-bit optimizer crashes when fine-tuning gpt2-large" on Jun 30, 2022