Getting IndexError: list index out of range error with libritts-train-clean-100 dataset #329

Closed
mlrober opened this issue Sep 10, 2021 · 5 comments

Comments

@mlrober

mlrober commented Sep 10, 2021

Hi Team,

Thanks for this wonderful project and tool.

While running this tool from the macOS binary, I took the following steps:

  1. I used the LibriTTS train-clean-100 dataset for alignment.
  2. When I execute the command below, I get the error shown after it.
    Command:
    mfa align -d -c ./LibriTTS FastSpeech2/lexicon/librispeech-lexicon.txt english ./FastSpeech2/preprocessed_data/LibriTTS

Error:

Traceback (most recent call last):
File "/Users/rajasekar.a/tts/vir/bin/mfa", line 8, in <module>
sys.exit(main())
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/command_line/mfa.py", line 381, in main
run_adapt_model(args, unknown, acoustic_languages)
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/command_line/align.py", line 184, in run_align_corpus
from montreal_forced_aligner.command_line.mfa import align_parser, fix_path, unfix_path, acoustic_languages,
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/command_line/align.py", line 131, in align_corpus
except Exception as _:
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/aligner/pretrained.py", line 146, in export_textgrids
convert_ali_to_textgrids(self.align_config, output_directory, ali_directory, self.dictionary,
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/multiprocessing/alignment.py", line 588, in convert_ali_to_textgrids
parsed = parse_ctm(word_ctm_path, corpus, dictionary, mode='word')
File "/Users/rajasekar.a/tts/vir/lib/python3.9/site-packages/montreal_forced_aligner/textgrid.py", line 44, in parse_ctm
cur = current_labels[cur_ind]
IndexError: list index out of range

Can someone help me to resolve this issue?

@Weijia221B

I got the same error and have no idea why it happened.

@chaksam

chaksam commented Sep 15, 2021

I'm getting the same error while processing the LJSpeech dataset on an Ubuntu system:


INFO - Setting up corpus information...
INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 13100.0
INFO - Parsing dictionary without pronunciation probabilities without silence probabilities
INFO - Creating dictionary information...
INFO - Setting up training data...
Generating base features (mfcc)...
Calculating CMVN...
INFO - Done with setup!
INFO - Performing first-pass alignment...
INFO - Calculating fMLLR for speaker adaptation...
INFO - Performing second-pass alignment...
Traceback (most recent call last):
File "/home/anaconda3/envs/aligner/bin/mfa", line 8, in <module>
sys.exit(main())
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/command_line/mfa.py", line 379, in main
run_align_corpus(args, unknown, acoustic_languages)
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/command_line/align.py", line 179, in run_align_corpus
align_corpus(args, unknown_args)
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/command_line/align.py", line 127, in align_corpus
a.export_textgrids(args.output_directory)
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/aligner/pretrained.py", line 146, in export_textgrids
convert_ali_to_textgrids(self.align_config, output_directory, ali_directory, self.dictionary,
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/multiprocessing/alignment.py", line 588, in convert_ali_to_textgrids
parsed = parse_ctm(word_ctm_path, corpus, dictionary, mode='word')
File "/home/anaconda3/envs/aligner/lib/python3.8/site-packages/montreal_forced_aligner/textgrid.py", line 40, in parse_ctm
cur = current_labels[cur_ind]
IndexError: list index out of range

@hal3003

hal3003 commented Sep 21, 2021

Hi there,

I'm encountering the same issue on Fedora 34. I have a custom dataset consisting of four recordings and plain-text transcriptions. I generate a dictionary using
mfa g2p english_g2p ~/test_data/input/003.txt ~/test_data/003.dict
and then run the alignment using
mfa align ~/test_data/input/ ~/test_data/003.dict english ~/test_data/output --clean --verbose
The output of that command, including the error message, looks like this:

All required kaldi binaries were found!
Cleaning old directory!
/home/user/Documents/MFA/input/align.log
INFO - Setting up corpus information...
INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 1.0
INFO - Parsing dictionary without pronunciation probabilities without silence probabilities
INFO - Creating dictionary information...
INFO - Setting up training data...
Generating base features (mfcc)...
Calculating CMVN...
INFO - Done with setup!
INFO - Performing first-pass alignment...
INFO - Calculating fMLLR for speaker adaptation...
INFO - Performing second-pass alignment...
Traceback (most recent call last):
File "/home/user/.virtualenvs/mfa/bin/mfa", line 8, in <module>
sys.exit(main())
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/command_line/mfa.py", line 379, in main
run_align_corpus(args, unknown, acoustic_languages)
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/command_line/align.py", line 179, in run_align_corpus
align_corpus(args, unknown_args)
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/command_line/align.py", line 127, in align_corpus
a.export_textgrids(args.output_directory)
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/aligner/pretrained.py", line 146, in export_textgrids
convert_ali_to_textgrids(self.align_config, output_directory, ali_directory, self.dictionary,
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/multiprocessing/alignment.py", line 588, in convert_ali_to_textgrids
parsed = parse_ctm(word_ctm_path, corpus, dictionary, mode='word')
File "/home/user/.virtualenvs/mfa/lib/python3.9/site-packages/montreal_forced_aligner/textgrid.py", line 108, in parse_ctm
cur = current_labels[cur_ind]
IndexError: list index out of range

Notably, the error occurs only for one of the files I'm trying to align.
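
Since only one file triggers the error, it may help to compare the transcripts against the dictionary before aligning. The sketch below is a rough pre-check, not part of MFA: it assumes the dictionary is a plain-text file with one entry per line and the word as the first whitespace-separated field, and it assumes (without confirmation from this thread) that tokens missing from the dictionary could be related to the failure. The function names `load_lexicon_words` and `find_unknown_words` are made up for illustration.

```python
from pathlib import Path


def load_lexicon_words(lexicon_path):
    """Collect the first whitespace-separated field of each non-empty line.

    Words are lowercased so that an upper-case lexicon still matches
    lower-case transcripts.
    """
    words = set()
    with open(lexicon_path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if fields:
                words.add(fields[0].lower())
    return words


def find_unknown_words(corpus_dir, lexicon_words, pattern="*.txt"):
    """Map each transcript path to the sorted tokens missing from the lexicon.

    Tokens are split on whitespace only, so attached punctuation is kept
    and will be reported as part of the token. This is cruder than any
    normalization the aligner itself applies (an assumption of this sketch).
    """
    report = {}
    for transcript in sorted(Path(corpus_dir).rglob(pattern)):
        tokens = transcript.read_text(encoding="utf-8").lower().split()
        missing = sorted({token for token in tokens if token not in lexicon_words})
        if missing:
            report[transcript] = missing
    return report
```

Calling `find_unknown_words(corpus_dir, load_lexicon_words(dict_path))` with the corpus directory and dictionary passed to `mfa align` returns only the transcripts that contain at least one unmatched token, which gives a short list of files to inspect by hand. A file appearing in that list is a hint, not a diagnosis.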

@mlrober
Author

mlrober commented Sep 21, 2021

Hi Admin,

Can you help us with this issue?

@mmcauliffe
Member

mmcauliffe commented Sep 21, 2021 via email
