[Request] A method to resume training with a different batch size while keeping your G epoch and nkimg value. #92

Open
nom57 opened this issue Sep 8, 2022 · 3 comments

Comments


nom57 commented Sep 8, 2022

On SG2 and SG3 (provided you use a modified fork), you can resume training with a completely different batch size and still keep your tick / nkimg progress by specifying it with the --nkimg kwarg; for example, --nkimg=2500 resumes training with an assumed progress of 2500 kimg.

SGXL resets kimg to 0 if you change the batch size.

I have found it extremely useful to start with a very low batch size, such as --batch=2 --glr=0.0008 --dlr=0.0006, to improve diversity, and then switch to a batch size of 32 / 64 / 128 for better FID once FID starts to bottom out with batch=2.

However, because the augmentation state, the G epoch, and kimg all reset in SGXL when doing this, I am having a really bad time.
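
For reference, here is a minimal sketch of what the requested option could look like inside a StyleGAN-style training loop. The names used (training_loop, cur_nimg, cur_tick, kimg_per_tick) follow common StyleGAN conventions but are assumptions for illustration, not SGXL's actual internals.

```python
# Minimal sketch, assuming a StyleGAN-style training loop. The names below
# (nkimg, cur_nimg, cur_tick, kimg_per_tick, batch_size) are illustrative
# assumptions, not the actual SGXL implementation.

def training_loop(total_kimg=25000, kimg_per_tick=4, batch_size=64, nkimg=0):
    # Instead of always starting at 0, seed the counters from --nkimg so that
    # tick/kimg bookkeeping (and anything scheduled on it, e.g. augmentation
    # or EMA ramps) continues from the previous run even if batch_size changed.
    cur_nimg = nkimg * 1000          # e.g. --nkimg=2500 -> 2,500,000 images seen
    cur_tick = cur_nimg // (kimg_per_tick * 1000)

    while cur_nimg < total_kimg * 1000:
        # ... one training step on batch_size images ...
        cur_nimg += batch_size
        if cur_nimg >= (cur_tick + 1) * kimg_per_tick * 1000:
            cur_tick += 1
            # ... per-tick maintenance: logging, snapshots, metrics ...
```

Calling training_loop(nkimg=2500, batch_size=64) would then continue the tick/kimg bookkeeping from 2500 kimg instead of restarting at 0, regardless of the new batch size.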


nom57 commented Sep 8, 2022

This method of training can push recall to 0.7+ instead of 0.5 on many datasets while still reaching the same FID (after also bottoming out FID at batch size 128), so recall is tremendously better with this method. With batch=2 it also converges much faster, so the first part of training, which focuses on diversity, is very quick.


nom57 commented Sep 8, 2022

For example, the Pokemon dataset can reach a recall of 0.787 at 64x64 with this method @xl-sr,
so an --nkimg resume feature would be tremendously helpful.


nom57 commented Sep 9, 2022

Update:

This seems to favor SG2-ADA much more than SGXL; SGXL can easily collapse with low batch sizes, so it is hard to tame. Still, a batch size of 2-16 for the first 144 kimg and then switching to a batch size of 64 or 128 seems beneficial on unimodal datasets for better recall and faster training.
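
As a concrete illustration of the requested workflow, a two-phase run might look like the sketch below. Only --batch, --glr, --dlr, and --nkimg come from this thread; every other flag, path, and filename is an assumption about a StyleGAN-style train.py, and --nkimg itself is the feature being requested here, not something SGXL currently supports.

```python
# Hypothetical two-phase launcher. Only --batch/--glr/--dlr/--nkimg are taken
# from this thread; every other flag, path, and filename is an assumption
# about a StyleGAN-style train.py, and --nkimg is the requested feature.
import subprocess

# Phase 1: tiny batch for diversity, roughly the first 144 kimg.
subprocess.run([
    "python", "train.py",
    "--outdir=runs/phase1", "--data=dataset.zip",   # assumed flags
    "--batch=2", "--glr=0.0008", "--dlr=0.0006",
    "--kimg=144",                                   # assumed stopping-point flag
], check=True)

# Phase 2: resume the phase-1 snapshot at a larger batch while keeping the
# kimg/tick counters, which is exactly what --nkimg would enable in SGXL.
subprocess.run([
    "python", "train.py",
    "--outdir=runs/phase2", "--data=dataset.zip",
    "--batch=64",
    "--resume=runs/phase1/network-snapshot-000144.pkl",  # assumed path format
    "--nkimg=144",                                        # requested feature
], check=True)
```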
