@破破的桥 A text recognition question, asked on someone else's behalf. #43
Comments
https://code.google.com/p/tesseract-ocr/ Tesseract, the most widely used open-source OCR software, licensed under Apache 2.0. It has been improved extensively by Google.
memect:AnswerReady https://github.com/memect/hao/blob/master/awesome/ocr-tools.md
memect:CardReady http://hao.memect.com/?tag=ocr-tools
memect:weiboReady http://www.weibo.com/5220650532/BgFEdjQG7
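极客杨's OCR toolbox: Tesseract is currently the most widely used free, open-source OCR tool (backed by Google). Commercial products include ABBYY's FineReader, as well as Adobe; Chinese products include 文通 and 汉王. The current hot topic is porting OCR to smartphones to open up new input channels: iOS has Tesseract-based implementations, and Android has the Qualcomm Vuforia API. Resource card stream: http://t.cn/RPiRyYc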
@ S还未完成
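极客杨: The key is still parameter tuning. The main points: different languages have different initial settings; colored or gradient backgrounds greatly reduce recognition accuracy, so the image needs to be converted to black-and-white/grayscale first (OpenCV is worth a try). Two articles are recommended: one is an overview of Tesseract (2007), and the other reports the problems Tesseract runs into when processing color images.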
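To make the "convert to black-and-white/grayscale first" advice concrete, here is a minimal sketch in plain Python. It is not taken from the two articles: it assumes the image is already loaded as rows of `(r, g, b)` tuples (a hypothetical in-memory format), uses the common BT.601 luminance weights for grayscale, and picks the black/white cutoff with Otsu's method, which is one reasonable choice among several. With OpenCV the same two steps would normally be done with `cv2.cvtColor` and `cv2.threshold` using the `cv2.THRESH_OTSU` flag.

```python
# Sketch only: a dependency-free stand-in for the grayscale + black/white step.
# The input format (rows of (r, g, b) tuples) is an assumption for illustration.

def to_gray(pixels):
    """Convert rows of (r, g, b) tuples to 0-255 luminance values (BT.601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in pixels
    ]

def otsu_threshold(gray):
    """Return the threshold t that maximises the between-class variance
    of the two groups 'value <= t' and 'value > t' (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # every pixel is already in the dark class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray, threshold):
    """Map values above the threshold to white (255), the rest to black (0)."""
    return [[255 if v > threshold else 0 for v in row] for row in gray]

if __name__ == "__main__":
    # Dark "text" pixels on two slightly different light background shades.
    image = [
        [(200, 220, 240), (20, 30, 40), (180, 200, 230)],
        [(180, 200, 230), (20, 30, 40), (200, 220, 240)],
    ]
    gray = to_gray(image)
    print(binarize(gray, otsu_threshold(gray)))
```

A single global threshold like this works when the background is only mildly uneven; for strong gradients an adaptive (per-region) threshold is usually the better fit.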
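极客杨's OCR toolbox: Tesseract is currently the most widely used free, open-source OCR tool (backed by Google). Commercial products include ABBYY's FineReader, as well as Adobe; Chinese products include 文通 and 汉王. Beyond regular desktop use, Tesseract has also been ported to smartphones. Resource card stream: http://hao.memect.com/?tag=ocr-tools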
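@好东西传送门 A text recognition question, asked on someone else's behalf. For text like the image below, which mixes Chinese characters, English words, and digits, traditional text recognition software performs very poorly. Is there a suitable low-cost way to convert images like this into text files while guaranteeing a certain recognition rate (say 90%)? And what if non-text photos are mixed in as well?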
http://www.weibo.com/1459358890/BgFoRwPgG
http://ww4.sinaimg.cn/bmiddle/56fc0caagw1ej06diuyz2j20b90m0afi.jpg
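The question does not say how the 90% recognition rate would be measured. One common way, shown here purely as an assumption, is character-level accuracy: transcribe a few sample images by hand, then compare each OCR result against the hand transcription using edit distance. The function names below are made up for this sketch.

```python
# Sketch only: character accuracy = 1 - edit_distance / len(reference).
# This metric is an assumption; the question does not define "recognition rate".

def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]

def char_accuracy(reference, ocr_output):
    """Fraction of the reference text recovered, clamped to the range [0, 1]."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    return max(0.0, 1.0 - edit_distance(reference, ocr_output) / len(reference))

if __name__ == "__main__":
    # Mixed Chinese / English / digit sample with one misread digit.
    print(char_accuracy("中文ABC123", "中文ABC128"))
```

Because Python strings are sequences of Unicode code points, Chinese characters, Latin letters, and digits each count as one character here, which suits mixed-script text.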