Updated README.md
Added new links for experiment checkpoints
justachetan authored Dec 30, 2022
1 parent 960cd02 commit 809401d
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions README.md
@@ -275,17 +275,17 @@ Additionally, we also only used the data for one female speaker per emotion. The

| Approach | Dataset | Result Dumps | Optimiser | Learning Rate | Training Script | Slides |
| -- | -- | -- | -- | -- | -- | -- |
-| [Approach 1](#x-approach-1-fine-tuning-a-vanilla-tacotron-model-on-ravdess-pre-trained-on-lj-speech) | RAVDESS (angry) | [`approach_1`](https://drive.google.com/drive/folders/16Ljdx-KVlVCwNnN7_8emntt1OjI2qfuz?usp=sharing) | Adam | 2e-3 | [`train.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train.py) | [\[slides\]](https://docs.google.com/presentation/d/1CuOiThrBodv6HRp5gCdFN8dnUwkVpFYs7DRP1DS2nKk/edit?usp=sharing) |
-| [Approach 2](#x-approach-2-using-a-smaller-learning-rate-for-fine-tuning) | RAVDESS (angry) | [`approach_2`](https://drive.google.com/open?id=1VJb5_GjZGWdGvTnRWG2fqXpFht9sMnuC) | Adam | 2e-5 | [`train.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train.py) | [\[slides\]](https://docs.google.com/presentation/d/1pZSFGBPgVUbMk3gFBctaJTEg7VvYQGp1sxEKh0sQ7OA/edit?usp=sharing) |
-| [Approach 3](#x-approach-3-using-a-smaller-learning-rate-and-sgd-for-fine-tuning) | RAVDESS (angry) | [`approach_3`](https://drive.google.com/drive/folders/19r3BXQKjfLWxhJHjv9OfS_U8djAP8VnV?usp=sharing) | SGD | 2e-5 | [`train_sgd.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_sgd.py) | [\[slides\]](https://docs.google.com/presentation/d/1Q8WbV8xmzZNUv8k62t2FgyN2ZxS5AWQI7y4mpuwomH0/edit?usp=sharing) |
-| [Approach 4](#x-approach-4-freezing-the-encoder-and-postnet) | RAVDESS (angry) | [`approach_4`](https://drive.google.com/open?id=11SxFVEtQDSBIlz549oXs9M27AcztjfJE) | SGD | 2e-5 | [`train_fr_enc_sgd.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_enc_sgd.py) | [\[slides\]](https://docs.google.com/presentation/d/1d8VMrd8vAVRE7PcDd3vKdCgFBVf27qDNqPbWMU1QKj4/edit?usp=sharing) |
-| [Approach 5](#x-approach-5-freezing-the-encoder-and-postnet-and-switching-back-to-adam) | RAVDESS (angry) | [`approach_5`](https://drive.google.com/open?id=1-2bCkUftdJidkIB5uuD2yYOGm-g9_B8S) | Adam | 2e-5 | [`train_fr_enc_adam.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_enc_adam.py) | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
-| [Approach 6](#white_check_mark-approach-6-freezing-just-the-post-net-using-adam-with-low-initial-learning-rate-training-on-emov-db) | EMOV-DB (each emotion, one speaker) | [`approach_6`](https://drive.google.com/open?id=18VZBbNImoZmN2NrbZgziVoDWF5nXXLqY) | Adam | 2e-5 | [`train_fr_postnet_adam.py `](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_postnet_adam.py) | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
-| [Approach 7](#x-approach-7-fine-tuning-the-text2mel-module-of-the-dc-tts-model-on-emov-db-pre-trained-on-lj-speech) | EMOV-DB (angry) | [`approach_7`](https://drive.google.com/open?id=1HaVlGpiVFKG40vLFlorvtFzkFAHqZSjj) | \- | \- | \- | [\[slides\]](https://docs.google.com/presentation/d/1d8VMrd8vAVRE7PcDd3vKdCgFBVf27qDNqPbWMU1QKj4/edit?usp=sharing) |
-| [Approach 8](#white_check_mark-approach-8-fine-tuning-only-on-one-speaker-with-reduced-top_db-and-monotonic-attention) | EMOV-DB (each emotion, one speaker) | [`approach_8`](https://drive.google.com/open?id=1UIbVj-KjI1YJh6dDZTTJNUYHo4jAKZ1h) | \- | \- | \- | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
-
-- The pre-trained model for Tacotron, trained on LJ Speech is available here: [`pretrained_ljspeech_tacotron`](https://drive.google.com/open?id=1fgh_1asVi5fsFo_PMyVGQfNOn7kD1xyE)
-- The pre-trained model for DC-TTS, trained on LJ Speech is available here: [`pretrained_ljspeech_dctts`](https://drive.google.com/drive/folders/10nz8_0O4g5vc1K0pEoiP4QTeBzA2s-sl?usp=sharing)
+| [Approach 1](#x-approach-1-fine-tuning-a-vanilla-tacotron-model-on-ravdess-pre-trained-on-lj-speech) | RAVDESS (angry) | [`approach_1`](https://drive.google.com/drive/folders/19ocJG_CpVG9gJ8KF_p92GNpv2f_DOCGd?usp=sharing) | Adam | 2e-3 | [`train.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train.py) | [\[slides\]](https://docs.google.com/presentation/d/1CuOiThrBodv6HRp5gCdFN8dnUwkVpFYs7DRP1DS2nKk/edit?usp=sharing) |
+| [Approach 2](#x-approach-2-using-a-smaller-learning-rate-for-fine-tuning) | RAVDESS (angry) | [`approach_2`](https://drive.google.com/drive/folders/1eLuDRauXKsxroh1duoe7IxGZmvt2sIq5?usp=sharing) | Adam | 2e-5 | [`train.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train.py) | [\[slides\]](https://docs.google.com/presentation/d/1pZSFGBPgVUbMk3gFBctaJTEg7VvYQGp1sxEKh0sQ7OA/edit?usp=sharing) |
+| [Approach 3](#x-approach-3-using-a-smaller-learning-rate-and-sgd-for-fine-tuning) | RAVDESS (angry) | [`approach_3`](https://drive.google.com/drive/folders/1fMGpkF7_Wu46qRaLlBDPjqfzn3t1Jzgu?usp=sharing) | SGD | 2e-5 | [`train_sgd.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_sgd.py) | [\[slides\]](https://docs.google.com/presentation/d/1Q8WbV8xmzZNUv8k62t2FgyN2ZxS5AWQI7y4mpuwomH0/edit?usp=sharing) |
+| [Approach 4](#x-approach-4-freezing-the-encoder-and-postnet) | RAVDESS (angry) | [`approach_4`](https://drive.google.com/drive/folders/1I8M0fdXn6hPM_8jAlgixazeaOAhv89dT?usp=sharing) | SGD | 2e-5 | [`train_fr_enc_sgd.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_enc_sgd.py) | [\[slides\]](https://docs.google.com/presentation/d/1d8VMrd8vAVRE7PcDd3vKdCgFBVf27qDNqPbWMU1QKj4/edit?usp=sharing) |
+| [Approach 5](#x-approach-5-freezing-the-encoder-and-postnet-and-switching-back-to-adam) | RAVDESS (angry) | [`approach_5`](https://drive.google.com/drive/folders/1W9tisfZUDc5xHOKME_qn1j8tKcLIUHNQ?usp=sharing) | Adam | 2e-5 | [`train_fr_enc_adam.py`](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_enc_adam.py) | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
+| [Approach 6](#white_check_mark-approach-6-freezing-just-the-post-net-using-adam-with-low-initial-learning-rate-training-on-emov-db) | EMOV-DB (each emotion, one speaker) | [`approach_6`](https://drive.google.com/drive/folders/1gHPS94Bd0ilCYoiFJ25peAn0s64Sn_MV?usp=sharing) | Adam | 2e-5 | [`train_fr_postnet_adam.py `](https://github.com/Emotional-Text-to-Speech/tacotron_pytorch/blob/master/train_fr_postnet_adam.py) | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
+| [Approach 7](#x-approach-7-fine-tuning-the-text2mel-module-of-the-dc-tts-model-on-emov-db-pre-trained-on-lj-speech) | EMOV-DB (angry) | [`approach_7`](https://drive.google.com/drive/folders/1cmKx_-u8zv39K5Fs-rn9-9zgpRELpZoZ?usp=sharing) | \- | \- | \- | [\[slides\]](https://docs.google.com/presentation/d/1d8VMrd8vAVRE7PcDd3vKdCgFBVf27qDNqPbWMU1QKj4/edit?usp=sharing) |
+| [Approach 8](#white_check_mark-approach-8-fine-tuning-only-on-one-speaker-with-reduced-top_db-and-monotonic-attention) | EMOV-DB (each emotion, one speaker) | [`approach_8`](https://drive.google.com/drive/folders/1dVNzAICx2sZc0kM3FjlbByPOJ5QH-ni-?usp=sharing) | \- | \- | \- | [\[slides\]](https://docs.google.com/presentation/d/1Y99-84_6S35PRPEKQ1oYmD3kghqDU1D1rUECHwKogNE/edit?usp=sharing) |
+
+- The pre-trained model for Tacotron, trained on LJ Speech is available here: [`pretrained_ljspeech_tacotron`](https://drive.google.com/drive/folders/1kfqN_b0UFhWofCDrXka8sMS-5PAfvU-F?usp=sharing)
+- The pre-trained model for DC-TTS, trained on LJ Speech is available here: [`pretrained_ljspeech_dctts`](https://drive.google.com/drive/folders/1Ak2UMygytv5I3edRUBgspAH18NwNzAuy?usp=sharing)

# Demonstration

