Some vertical texts are not recognised correctly.
This vertical text should be read from right to left.
After it reads "ったく", Tesseract seems to treat it as the shorter edge, so it rotates the image the wrong way, and recognition fails.
A possible solution is to split the text image and put the text on a single line before processing it with Tesseract.
A perfect result was achieved: "ったく 戦い方もロクに 知らないくせに 抵抗しやがって"
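The splitting step could look roughly like the sketch below. It is only an illustration, not what was actually used for the result above: it assumes the image has already been binarised into rows of 0 (background) and 1 (ink), and the helper names `find_text_columns` and `crop_column` are made up for this example. It finds the blank gaps between vertical text columns and returns the columns from right to left, so each one can be cropped and passed to Tesseract separately.

```python
def find_text_columns(pixels, min_gap=1):
    """Return (start, end) x-ranges of text columns, ordered right to left.

    pixels  -- list of rows, each a list of 0 (background) or 1 (ink)
    min_gap -- number of consecutive blank pixel columns that separates
               two text columns (hypothetical tuning parameter)
    The end index is exclusive.
    """
    if not pixels:
        return []
    width = len(pixels[0])
    # Vertical projection: does pixel column x contain any ink at all?
    has_ink = [any(row[x] for row in pixels) for x in range(width)]

    columns = []
    start = None  # x where the current text column began
    gap = 0       # blank pixel columns seen since the last ink
    for x, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # Close the text column at the last x that had ink.
                columns.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The text column runs to the image edge (minus trailing blanks).
        columns.append((start, width - gap))

    # Vertical Japanese text is read from the rightmost column first.
    columns.reverse()
    return columns


def crop_column(pixels, start, end):
    """Cut one text column out of the image as a new list of rows."""
    return [row[start:end] for row in pixels]
```

Each cropped column would then be recognised on its own, and the results joined in the returned order.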
I see. I changed it to 3 from 1 and it works, too.
It still doesn't recognise "ったく" and similar cases.
The Tesseract jpn_vert.traineddata doesn't read these two.
But if you combine them vertically, it reads at least "村に行って馬を借りてくるわ", though still not the "それに" part.
https://github.com/tesseract-ocr/tessdata/blob/master/best/jpn.traineddata
Very nice.
https://github.com/tesseract-ocr/tessdata/blob/master/best/jpn_vert.traineddata
It does not work in PSM 6 (default) mode.
If you add the "-l jpn+jpn_vert" option, it reads vertical text horizontally and therefore fails.
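For anyone trying different combinations, here is a small sketch of how the command line could be put together. The function name `build_tesseract_command` is made up for this example. `-l` and `--psm` are real Tesseract options, and PSM 5 is documented as "a single uniform block of vertically aligned text", but whether that mode fixes the cases in this thread is an assumption that has not been tested here.

```python
def build_tesseract_command(image_path, lang="jpn_vert", psm=5):
    """Build the argument list for one Tesseract run.

    lang -- traineddata name; several languages are joined with "+"
            and no spaces, e.g. "jpn+jpn_vert"
    psm  -- page segmentation mode; 5 is the vertical-block mode
            (assumed to be a reasonable choice, not verified here)
    """
    # "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]
```

The list can be passed to `subprocess.run(..., capture_output=True, text=True)` to collect the recognised text.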