XaMiL

This is an experiment in generating sheet music using a transformer model. I use Karpathy's NanoGPT for the model. The only code unique to this repo is the part that transforms MusicXML files into a sequence of tokens.

My method of tokenizing MusicXML is in two parts:

1. Establish a minimal base set of tokens that removes much of the redundancy inherent in an XML file (e.g. end tags, which can be generated at inference time).
2. Use a BPE-like algorithm to create additional tokens by repeatedly merging the most common pairs of tokens.
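To make the first part concrete, here is a minimal sketch of one way to drop named end tags. It is not the tokenizer in this repo: the `CLOSE` marker, `to_tokens`, and `to_xml` are hypothetical names, and the sketch assumes that a single generic close token is enough because the matching end tag can be recovered from a stack when decoding. It also ignores attributes and tail text for brevity.

```python
import xml.etree.ElementTree as ET

CLOSE = "</>"  # hypothetical generic close marker (an assumption, not from the repo)


def to_tokens(xml_text):
    """Flatten XML into open-tag tokens, text tokens, and a generic close token."""
    tokens = []

    def visit(elem):
        tokens.append(f"<{elem.tag}>")
        text = (elem.text or "").strip()
        if text:
            tokens.append(text)
        for child in elem:
            visit(child)
        # One shared token replaces every named end tag.
        tokens.append(CLOSE)

    visit(ET.fromstring(xml_text))
    return tokens


def to_xml(tokens):
    """Rebuild XML text, regenerating named end tags from a stack of open tags."""
    out, stack = [], []
    for tok in tokens:
        if tok == CLOSE:
            out.append(f"</{stack.pop()}>")
        elif tok.startswith("<"):
            stack.append(tok[1:-1])
            out.append(tok)
        else:
            out.append(tok)
    return "".join(out)


snippet = "<note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration></note>"
tokens = to_tokens(snippet)
print(tokens)
print(to_xml(tokens) == snippet)
```

With this scheme the vocabulary needs one close token in total rather than one end tag per element name, which is the kind of redundancy the base token set is meant to remove.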

My initial attempt (2023) used only Part 1, which yielded a few hundred base tokens. In early 2024, I added Part 2, which lets me raise the vocab to an arbitrarily large size. In one experiment I went to 20k tokens, which reduced the token count of my overall training set by more than 10x. This also has the benefit of making my context length effectively 10x longer and inference 10x faster, since each token now covers more of the file.
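The effect of Part 2 on sequence length can be sketched with a small, generic BPE-style merge loop. This is an illustration under my own assumptions, not the code in `bpe.py`: the function names are hypothetical, merged tokens are formed by simple string concatenation, and the loop stops early once no pair occurs at least twice.

```python
from collections import Counter


def merge_pair(seq, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_merges(seq, num_merges):
    """Greedily merge the most common adjacent pair, up to `num_merges` times."""
    merges = []
    for _ in range(num_merges):
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, count = counts.most_common(1)[0]
        if count < 2:
            break  # nothing left that is worth merging
        seq = merge_pair(seq, pair, pair[0] + pair[1])
        merges.append(pair)
    return seq, merges


base = ["<note>", "<pitch>", "C", "<note>", "<pitch>", "D", "<note>", "<pitch>", "C"]
merged, merges = learn_merges(base, num_merges=2)
print(len(base), "->", len(merged))
print(merges)
```

Each merge adds one token to the vocabulary and shortens every sequence that contains the pair, so a larger vocab means fewer tokens per file, which is where the longer effective context and faster inference come from.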

This is still highly experimental, and I haven't done a thorough analysis. Still, the validation loss (NLL) went from about 10 to about 1.8 (i.e. from 1/20000 random guessing to roughly 1/6 per token, which is some decent learning). The generated MusicXML also looks coherent.
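The conversion between loss and "1 in N" odds above can be checked with a few lines of arithmetic. This sketch assumes the loss is a per-token cross-entropy measured in nats (natural log), which is how PyTorch's cross-entropy reports it; the helper names are hypothetical.

```python
import math


def uniform_nll(vocab_size):
    """NLL of guessing uniformly at random over the vocabulary, in nats."""
    return math.log(vocab_size)


def nll_to_odds(nll):
    """Convert a per-token NLL into the N of an equivalent '1 in N' guess."""
    return math.exp(nll)


print(f"random guessing over 20k tokens: NLL = {uniform_nll(20_000):.2f}")
print(f"NLL of 1.8 is about 1 in {nll_to_odds(1.8):.2f}")
```

So a loss near 10 matches uniform guessing over a 20k vocab, and a loss of 1.8 corresponds to the model giving the correct next token roughly a 1 in 6 probability on average (in the geometric-mean sense).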

Use

Find good XML files (single staff for now) to train on.

python find_good_files.py --xml_folder_root=/path/to/xmls

Prep data by creating vocab, extracting tokens, and saving files for validation.

python prep.py

Train the model as long as you want.

python model_train.py

Run inference.

python model_infer.py

Contributing

See CONTRIBUTING.md for details.

License

MIT License; see LICENSE for details.

Disclaimer

This project is not an official Google project. It is not supported by Google and Google specifically disclaims all warranties as to its quality, merchantability, or fitness for a particular purpose.
