Returned non-zero exit status 3221225477 - can't train lora #2800
Comments
I have a 3060 12GB. I trained a LoRA just the day before yesterday, and the same config doesn't work today after I updated yesterday.
So you have the same error? Have you tried going back to 41afd26?
Yes, the same error: "returned non-zero exit status 3221225477".
Thanks, I'll try it.
@Telllinex This might help: Also, why did you close this issue as completed? The issue is still there; we just reverted back to a previous commit, right?
I just think no one cares about it because it works at the last commit for everyone, but OK, I'll reopen :)
Tried it again after switching to 41afd26 and still got the same error. Maybe I need to increase the pagefile size?
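As a side note (my own observation, not from the thread): converting that exit status to hex shows it is the Windows NTSTATUS code 0xC0000005 (STATUS_ACCESS_VIOLATION), i.e. the process crashed on a bad memory access rather than exiting cleanly, which is why out-of-memory and pagefile size are plausible suspects:

```python
# Exit status 3221225477 decoded as a Windows NTSTATUS value.
code = 3221225477
print(hex(code))  # 0xc0000005 -> STATUS_ACCESS_VIOLATION (access violation crash)
```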
I don't know about this particular error, but 16GB RAM seems too low to train a Flux LoRA; it takes up a lot of memory.
See, I get around 10 s/it with a batch size of 1 on my 3060, so your lack of RAM is definitely worsening your speed. Make the batch size 2; it'll decrease the speed (s/it) a bit, but total steps will be halved, so overall it'll be faster.
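The trade-off above can be sketched with back-of-envelope arithmetic (the dataset size and the batch-size-2 step time are assumed for illustration; only the ~10 s/it figure comes from the thread):

```python
# total time ≈ number of steps × seconds per step
images = 1000                 # hypothetical dataset size
s_per_it_bs1 = 10.0           # ~10 s/it at batch size 1 (reported above)
s_per_it_bs2 = 14.0           # assumed: each step is somewhat slower at batch size 2

steps_bs1 = images // 1       # one image per step
steps_bs2 = images // 2       # two images per step -> half the steps

print(steps_bs1 * s_per_it_bs1)  # 10000.0 seconds
print(steps_bs2 * s_per_it_bs2)  # 7000.0 seconds -- halving the steps wins overall
```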
I am just happy I can train it on a VM I have for free without loading my PC and paying the electricity bill; it has 1/2 of an RTX 6000 Grid (Turing), 12GB VRAM and 16GB RAM, as you know.
Yes, the VRAM usage keeps going up and down. I don't know the reason.
Hello, here are the console logs.
Maybe I can provide detailed logs to identify what's wrong? This GPU does not have support for bf16, so I disabled it by switching to fp16, but the problem seems to be elsewhere. I have 16GB RAM, a 32GB pagefile, and 12GB VRAM.
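For context (a general hardware note, not specific to this repo): native bf16 support requires CUDA compute capability 8.0 (Ampere) or newer, and Turing cards such as the Quadro RTX 6000 are capability 7.5, which is why the fp16 fallback is needed here. A minimal sketch of that check, with `supports_bf16` being a hypothetical helper:

```python
# bf16 tensor ops need compute capability >= 8.0 (Ampere generation or newer).
def supports_bf16(major: int, minor: int) -> bool:
    """True if a GPU with this CUDA compute capability runs bf16 natively."""
    return (major, minor) >= (8, 0)

print(supports_bf16(7, 5))  # False -> Turing (e.g. RTX 6000 Grid), fall back to fp16
print(supports_bf16(8, 6))  # True  -> Ampere (e.g. RTX 3060)
```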