Commit

[scripts] Fixed the possible zero discounting constant issue in make_kn_lm.py (#4687)
huangruizhe committed Jan 26, 2022
1 parent 4609ea1 commit 7f3d3da
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion egs/wsj/s5/utils/lang/make_kn_lm.py
@@ -165,7 +165,9 @@ def cal_discounting_constants(self):
                 n1 += stat[1]
                 n2 += stat[2]
             assert n1 + 2 * n2 > 0
-            self.d.append(n1 * 1.0 / (n1 + 2 * n2))
+            self.d.append(max(0.001, n1 * 1.0) / (n1 + 2 * n2))  # We are doing this max(0.001, xxx) to avoid zero discounting constant D due to n1=0,
+            # which could happen if the number of symbols is small.
+            # Otherwise, zero discounting constant can cause division by zero in computing BOW.

     def cal_f(self):
         # f(a_z) is a probability distribution of word sequence a_z.
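
The effect of this change can be illustrated with a small standalone sketch. It is not part of the commit: the function names `discount_before` and `discount_after` are hypothetical and exist only to put the old and new expressions side by side. Note that the floor of 0.001 is applied to the numerator `n1`, not to the final ratio, so with `n1 = 0` the new constant is `0.001 / (2 * n2)`, which is small but strictly positive.

```python
def discount_before(n1, n2):
    """Pre-commit expression: D = n1 / (n1 + 2 * n2)."""
    assert n1 + 2 * n2 > 0
    return n1 * 1.0 / (n1 + 2 * n2)


def discount_after(n1, n2):
    """Post-commit expression: the numerator n1 is floored at 0.001."""
    assert n1 + 2 * n2 > 0
    return max(0.001, n1 * 1.0) / (n1 + 2 * n2)


if __name__ == "__main__":
    # (n1, n2) pairs: counts of n-grams seen exactly once and exactly twice.
    # The first pair has n1 = 0, the case the commit comment describes.
    for n1, n2 in [(0, 5), (3, 1)]:
        print(n1, n2, discount_before(n1, n2), discount_after(n1, n2))
```

For `n1 = 0, n2 = 5` the old expression returns exactly `0.0`, while the new one returns roughly `1e-4`. For any `n1 >= 1` the two expressions agree, so existing models with non-degenerate counts should be unaffected.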