[scripts] Improve how combine_ali_dirs.sh gets job-specific filenames (
KimJeongSun authored and danpovey committed Nov 15, 2019
1 parent f679c78 commit ab36598
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion egs/wsj/s5/steps/combine_ali_dirs.sh
@@ -166,10 +166,13 @@ do_combine() {
 # Merge (presumed already sorted) scp's into a single script.
 sort -m $temp_dir/$ark.*.scp > $temp_dir/$ark.scp || exit 1

+inputs=$(for n in `seq $nj`; do echo $temp_dir/$ark.$n.scp; done)
+utils/split_scp.pl --utt2spk=$data/utt2spk $temp_dir/$ark.scp $inputs
+
 echo "$0: Splitting combined $entities into $nj archives on speaker boundary."
 $cmd JOB=1:$nj $dest/log/chop_combined_$entities.JOB.log \
 $copy_program \
-"scp:utils/split_scp.pl --utt2spk=$data/utt2spk --one-based -j $nj JOB $temp_dir/$ark.scp |" \
+"scp:$temp_dir/$ark.JOB.scp" \
 "ark:| gzip -c > $dest/$ark.JOB.gz" || exit 1

 # Get some interesting stats, and signal an error if error threshold exceeded.
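The file-naming pattern in this change can be tried outside Kaldi with toy data. The following is only a rough sketch: the scratch directory, the `ali` basename, the `src1`/`src2` file names, and the utterance entries are all made up for illustration, and `utils/split_scp.pl` is not invoked (in the commit it receives the merged scp followed by the per-job file names built here).

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the merge and per-job file-name steps with toy data.
set -euo pipefail
export LC_ALL=C           # byte-wise sort order, so the merge is deterministic

temp_dir=$(mktemp -d)     # hypothetical scratch directory
ark=ali                   # hypothetical archive basename
nj=3                      # hypothetical number of jobs

# Two toy scp files, each already sorted by utterance ID.
printf 'utt1 a.ark:10\nutt3 a.ark:30\n' > "$temp_dir/$ark.src1.scp"
printf 'utt2 b.ark:20\nutt4 b.ark:40\n' > "$temp_dir/$ark.src2.scp"

# Merge the pre-sorted inputs; -m merges without re-sorting each file.
sort -m "$temp_dir/$ark".*.scp > "$temp_dir/$ark.scp"

# Build one file name per job: $ark.1.scp ... $ark.$nj.scp.
inputs=$(for n in $(seq "$nj"); do echo "$temp_dir/$ark.$n.scp"; done)

cat "$temp_dir/$ark.scp"
echo "$inputs"
```

Because the per-job names are written once up front, each parallel job can then read its own `$ark.JOB.scp` directly instead of re-running the split inside a pipe.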

2 comments on commit ab36598

@marvin-nj
Contributor

Dear Dan, I am confused about why we do not combine the "trans.job" files at the same time, since we would apply the fMLLR transforms in the build-tree stage. Looking forward to your response, thank you!

@danpovey
Contributor

I think it's because, for the purposes for which we needed that script, we didn't need the fMLLR transforms. If you need those for some reason, you could of course update the script and make a PR.
