Skip to content

Standardizing weights to accelerate micro-batch training with gluon

Notifications You must be signed in to change notification settings

seujung/WeightStandardization_gluon

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 

Repository files navigation

Weight Standardization

Weight Standardization (WS) is a normalization method to accelerate micro-batch training. Micro-batch training is hard because small batch sizes are not enough for training networks with Batch Normalization (BN), while other normalization methods that do not rely on batch knowledge still have difficulty matching the performances of BN in large-batch training.

Our WS ends this problem because when used with Group Normalization and trained with 1 image/GPU, WS is able to match or outperform the performances of BN trained with large batch sizes with only 2 more lines of code. So if you are facing any micro-batch training problem, please do yourself a favor and try Weight Standardization. You will be surprised by how well it performs.

Test Result with Gluon

  • TBD

Reference

Original Repo

About

Standardizing weights to accelerate micro-batch training with gluon

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages