More scripts #61
models the lyrics, and we use its last layer to produce keys/values that are attended to by the decoder transformer
- Single Encoder Decoder: This is a simplification where we combine them into a single model. We merge the text vocab
and VQ vocab into a single large vocab, and the lyric tokens and VQ tokens into a single longer sequence of tokens which
we autoregressively model together.
👍
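To make the "Single Encoder Decoder" description above concrete, here is a minimal sketch of merging the two vocabularies by offsetting the VQ ids past the text ids. The vocab sizes, the helper name `merge_sequences`, and the choice to place lyric tokens before VQ tokens are all assumptions for illustration, not taken from the diff.

```python
# Hypothetical sizes, chosen only for illustration.
TEXT_VOCAB_SIZE = 80
VQ_VOCAB_SIZE = 2048


def merge_sequences(lyric_tokens, vq_tokens, text_vocab_size):
    """Shift VQ ids past the text ids, then concatenate into one sequence.

    Assumption: lyric tokens come first, followed by the VQ tokens.
    """
    shifted_vq = [token + text_vocab_size for token in vq_tokens]
    return list(lyric_tokens) + shifted_vq


lyrics = [5, 17, 3]    # ids in [0, TEXT_VOCAB_SIZE)
codes = [0, 2047, 12]  # ids in [0, VQ_VOCAB_SIZE)

merged = merge_sequences(lyrics, codes, TEXT_VOCAB_SIZE)
merged_vocab_size = TEXT_VOCAB_SIZE + VQ_VOCAB_SIZE

print(merged)             # [5, 17, 3, 80, 2127, 92]
print(merged_vocab_size)  # 2128
```

With the offset applied, every id in the merged sequence is unique to either the text or the VQ vocabulary, so a single autoregressive model can be trained over the whole sequence.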
README.md
Outdated
mpiexec -n {ngpus} python jukebox/train.py --hps=vqvae,small_prior,all_fp16,cpu_ema --name=pretrained_vqvae_small_prior_labels \
--sample_length=1048576 --bs=4 --aug_shift --aug_blend --audio_files_dir={audio_files_dir} \
--labels=True --train --test --prior --levels=3 --level=2 --weight_decay=0.01 --save_iters=1000 \
--labels_v3=True --y_bins=({artists},{genres}) --max_bow_genre_size=1 --min_duration=60.0 --max_duration=600.0 --t_bins=64
Might be useful to add that --min_duration and --max_duration should be decided based on the dataset people have, and that --min_duration should be at least ~23.8 seconds so that there's always a context we can train on.
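The ~23.8 second figure appears to follow from the `--sample_length=1048576` value in the command above. A quick check, assuming a 44.1 kHz sample rate (the rate is not stated in this thread, and `duration_is_long_enough` is a hypothetical helper):

```python
SAMPLE_LENGTH = 1048576  # from --sample_length in the command above
SAMPLE_RATE = 44100      # assumption: audio sampled at 44.1 kHz

# Seconds of audio covered by one training context at this level.
context_seconds = SAMPLE_LENGTH / SAMPLE_RATE


def duration_is_long_enough(min_duration, sample_length=SAMPLE_LENGTH,
                            sample_rate=SAMPLE_RATE):
    """Return True if every kept track can supply at least one full context."""
    return min_duration >= sample_length / sample_rate


print(round(context_seconds, 1))      # 23.8
print(duration_is_long_enough(60.0))  # True
print(duration_is_long_enough(20.0))  # False
```

If `--sample_length` or the sample rate changes, the lower bound on `--min_duration` changes with it.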
jukebox/hparams.py
Outdated
n_ctx=6144,
prior_width=1024,
prior_depth=48,
heads=4,
Nitpick: in general, LGTM, but perhaps 2 heads would be enough? The 1B lyrics model uses 2 heads as well.
Good point, changing!
LGTM, left some comments
No description provided.