This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

t2t-decoder bug when using decode_from_file #376

Closed
vince62s opened this issue Oct 25, 2017 · 3 comments


@vince62s
Contributor

As mentioned on Gitter, there is a bug when decoding (I think it appeared in 1.2.5).
In the last batch, it sends an empty line after the last line of the batch, and then it throws this error:

INFO:tensorflow:Inference results INPUT:
INFO:tensorflow:Inference results OUTPUT: Précisions
2017-10-24 21:19:07.312405: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: exceptions.StopIteration:
2017-10-24 21:19:07.312474: W tensorflow/core/framework/op_kernel.cc:1192] Out of range: exceptions.StopIteration:
[[Node: PyFunc = PyFunc[Tin=[], Tout=[DT_INT32, DT_INT32], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

@vince62s
Contributor Author

@lukaszkaiser @rsepassi
1.2.6 fixes the error, BUT it does not fix the fact that a decode_from_file with N lines produces a decode_to_file with N+1 lines, because an extra empty sequence is added at the end of the last batch.
Hope this is clear.

@vince62s vince62s changed the title t2t-decoder bug when using decode_to_file t2t-decoder bug when using decode_from_file Nov 3, 2017
@vince62s
Contributor Author

vince62s commented Nov 3, 2017

There are actually two different issues.
The first one was introduced by the custom delimiter commit, here: 545ec34#diff-189703c3c1f690231282073cea1cec41R499

text.split(delimiter) adds an empty element at the end of the list when the text ends with the delimiter.

An easy fix is to change the following line to:
inputs = [record.strip() for record in records[:-1]]
This solves the extra-line issue from my first post.
BUT it does not solve the second issue:
W tensorflow/core/framework/op_kernel.cc:1192] Out of range: exceptions.StopIteration:
which means there is another place with an index issue.

help ....
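
For reference, the split behaviour behind the extra line can be reproduced in plain Python. This is only a minimal sketch, not the tensor2tensor code: `split_records` is a hypothetical helper, and the newline delimiter is an assumption. It drops the last element only when that element is actually empty, since an unconditional `records[:-1]` would discard a real line whenever the input file does not end with the delimiter.

```python
def split_records(text, delimiter="\n"):
    """Split text into stripped records, dropping only a trailing empty one.

    Hypothetical helper for illustration; not part of tensor2tensor.
    """
    records = text.split(delimiter)
    # str.split leaves a final "" when the text ends with the delimiter.
    if records and records[-1] == "":
        records = records[:-1]
    return [record.strip() for record in records]


with_trailing = "first\nsecond\n"     # file ends with the delimiter
without_trailing = "first\nsecond"    # file has no trailing delimiter

print(with_trailing.split("\n"))        # ['first', 'second', '']
print(split_records(with_trailing))     # ['first', 'second']
print(split_records(without_trailing))  # ['first', 'second']
```

With this guard, N input lines give N records in both cases, whether or not the file ends with a trailing delimiter.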

@vince62s vince62s mentioned this issue Nov 4, 2017
@martinpopel
Contributor

In my case (t2t 1.2.6 translate_ende_wmt32k) the extra line is not empty, but "Herr Präsident!" (always the same line, when translating newstest 2013, 2014 or 2016).
