- Please download pretrained audio encoders from PANNs or HTSAT. We have also uploaded the audio encoders we used here. Put them under `pretrained_models/audio_encoders`.
- You can configure training settings in the yaml files under the `settings` directory.
- Our dataloader reads json files, in which the `audio` key holds the path to the audio clip on your computer or server.
- Run `pretrain.py` for pretraining, and `train.py` for finetuning or training from scratch.
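
Since the dataloader resolves each clip through the `audio` key, it can help to check that every path exists before starting a run. The following is a minimal sketch, not part of this repository: the helper name `find_missing_audio` is hypothetical, and it assumes the json file is either a top-level list of records or a dict that keeps the records under a `data` key. Adjust it to match the layout of your own files.

```python
import json
import os
import tempfile


def find_missing_audio(json_path):
    """Return the `audio` paths listed in json_path that do not exist on disk."""
    with open(json_path, "r", encoding="utf-8") as f:
        content = json.load(f)
    # Assumption: records are the top-level list, or sit under a "data" key.
    records = content["data"] if isinstance(content, dict) else content
    return [r["audio"] for r in records if not os.path.isfile(r["audio"])]


if __name__ == "__main__":
    # Build a throwaway json file with one existing and one missing clip.
    with tempfile.TemporaryDirectory() as tmp:
        existing = os.path.join(tmp, "clip_0.wav")
        open(existing, "wb").close()
        missing = os.path.join(tmp, "clip_1.wav")
        json_path = os.path.join(tmp, "train.json")
        with open(json_path, "w", encoding="utf-8") as f:
            json.dump({"data": [{"audio": existing}, {"audio": missing}]}, f)
        # Only the path of the clip that was never created is reported.
        print(find_missing_audio(json_path))
```

An empty list means every referenced clip was found at the recorded path.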
We provide pretrained audio-language retrieval models for reproducing our results.
They can be downloaded from Google Drive.