This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

8-bit optimizer crashes when fine-tuning gpt2-large #26

Open
rationalism opened this issue Jun 30, 2022 · 0 comments
rationalism commented Jun 30, 2022

Using the bnb.optim.Adam8bit optimizer in place of torch.optim.Adam causes a crash after a handful of batches:

12it [00:22, 1.82s/it]Error an illegal memory access was encountered at line 198 in file /home/alyssa/gpt_math/bitsandbytes/csrc/ops.cu

I am fine-tuning Hugging Face's gpt2-large model on an RTX 3090 (Ampere) with CUDA 11.6 and NVIDIA driver 510.73.05. I have tried compiling bitsandbytes from source on my machine, and the set_optim_to_run_embedding_in_fp32 trick from huggingface/transformers#14819; neither changed the behavior. Running with the standard PyTorch Adam optimizer works fine. nvidia-smi shows 16 GB of memory used on a 24 GB GPU, so it should be nowhere near running out of memory.
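For context, a minimal sketch of the drop-in swap described above. The `make_optimizer` helper, the learning rate, and the `use_8bit` flag are illustrative assumptions, not taken verbatim from my training script:

```python
def make_optimizer(params, lr=1e-5, use_8bit=True):
    """Build the training optimizer.

    bnb.optim.Adam8bit takes the same (params, lr=...) constructor
    arguments as torch.optim.Adam, so it is used as a drop-in
    replacement. Passing use_8bit=False reproduces the plain-PyTorch
    configuration that runs without the illegal-memory-access error.
    """
    if use_8bit:
        import bitsandbytes as bnb  # stores optimizer state in 8 bits
        return bnb.optim.Adam8bit(params, lr=lr)
    import torch
    return torch.optim.Adam(params, lr=lr)
```

With a Hugging Face model this would be called as `make_optimizer(model.parameters())`; the crash appears a few batches into training only on the `use_8bit=True` path.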

@rationalism rationalism changed the title 8-bit optimizer crashes when fine-tuning gpt-2 large 8-bit optimizer crashes when fine-tuning gpt2-large Jun 30, 2022