deep-learning-playgroud

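Handwritten digit recognition

Trained a simple CNN model on the MNIST dataset with the TensorFlow framework.
Then tried a more complex model, one that reaches 99.7% accuracy on the Kaggle MNIST dataset: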

https://www.kaggle.com/c/digit-recognizer

https://www.kaggle.com/yassineghouzam/introduction-to-cnn-keras-0-997-top-6
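Tried the VGG16 model:

Applying the classic VGG16 model, rewritten in Keras, to handwritten digit recognition: https://www.cnblogs.com/LHWorldBlog/p/8677131.html

It turned out that VGG16 expects a 224×224×3 input, which is overkill for this task.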
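To make the size mismatch concrete, here is a minimal plain-Python sketch that upscales a 28×28 grayscale image (the MNIST digit size) to 224×224×3 by nearest-neighbour repetition and channel replication. The helper name `upscale_to_vgg_input` is hypothetical, and this is only an illustration of the shape arithmetic, not the preprocessing used in the linked article.

```python
def upscale_to_vgg_input(image, target=224):
    """Nearest-neighbour upscale of a square grayscale image to target x target x 3."""
    size = len(image)
    if target % size != 0:
        raise ValueError("target must be a multiple of the image size")
    factor = target // size  # 224 // 28 == 8
    out = []
    for row in image:
        # Repeat each pixel `factor` times horizontally and copy it into 3 channels.
        wide_row = [[pixel, pixel, pixel] for pixel in row for _ in range(factor)]
        # Repeat each widened row `factor` times vertically.
        for _ in range(factor):
            out.append([list(px) for px in wide_row])
    return out


mnist_like = [[0] * 28 for _ in range(28)]  # stand-in for one MNIST digit
vgg_input = upscale_to_vgg_input(mnist_like)

values_before = 28 * 28        # 784 input values
values_after = 224 * 224 * 3   # 150528 input values
print(values_after // values_before)  # 192
```

Every digit would carry 192 times more input values without gaining any information, which is why a smaller network is a better fit here.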
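Then switched to a CRNN model, following the article "Recurrent Convolutional Neural Network For Svhn" and the GitHub project https://github.com/JimLee4530/RCNN

This generalized well (single handwritten digit recognition).
Along the way, learned Keras and TensorFlow from scratch. Reference links: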

https://keras-zh.readthedocs.io/

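Official Keras documentation in Chinese

The most complete collection of TensorFlow learning resources

python - Keras: how to save a model and continue training?

Keras ModelCheckpoint: saving the best model weights during training
- Saving the best model in Keras
- How to set up checkpoints for deep learning models in Keras
  - How to Check-Point Deep Learning Models in Keras
How to model Convolutional recurrent network ( CRNN ) in Keras
Learned the importance of data normalization: the same model converges much faster once its input data is normalized.

Machine learning: the purpose, effect, and use cases of standardization/normalization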
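The effect of normalization on convergence can be reproduced on a toy problem. The sketch below fits a line with plain gradient descent, once on raw pixel-scale inputs (0 to 255) and once on inputs scaled to 0 to 1. The data, the `train` helper, and both learning rates are assumptions picked by hand so that each run stays stable; this is not the model from this repository.

```python
def train(xs, ys, lr, steps):
    """Fit y = w*x + b by gradient descent on the mean squared error; return the final loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


raw = [0.0, 51.0, 102.0, 153.0, 204.0, 255.0]   # pixel-scale inputs
normalized = [x / 255.0 for x in raw]           # same inputs scaled to [0, 1]
targets = [2.0 * x + 1.0 for x in normalized]   # exactly linear targets

# The raw-scale run needs a tiny learning rate to avoid diverging.
loss_raw = train(raw, targets, lr=2e-5, steps=500)
loss_normalized = train(normalized, targets, lr=0.5, steps=500)
print(loss_raw, loss_normalized)
```

With these settings the normalized run ends at a far lower loss after the same 500 steps. The raw-scale run is held back by the very small learning rate it needs to stay stable, so its bias term barely moves.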
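Tips on loading and saving models and on using multiple GPUs. The benefit of saving only the weights is a smaller model file.

python - Keras: how to save a model and continue training?

Keras ModelCheckpoint: saving the best model weights during training

Saving the best model in Keras

How to set up checkpoints for deep learning models in Keras - How to Check-Point Deep Learning Models in Keras

Loading pretrained models and saving models in TensorFlow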
To train models offline in Python and serve predictions online in Java:

Calling Keras and TensorFlow models from Java

Converting a Keras h5 model to a TensorFlow pb model
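How the frameworks use multiple GPUs:

Training a network on several GPUs at once with Keras - official docs: multi_gpu_model

Multi-GPU training, single-GPU saving (a model trained on multiple GPUs fails when tested on a single GPU) - Keras multi-GPU training, single-GPU prediction

Keras multi-GPU training and the problem of loaded weights having no effect

Keras official docs - How to run Keras on a GPU?

TensorFlow official docs - Using GPUs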
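Image preprocessing in Keras: Keras official docs - ImageDataGenerator
Converting the training data to npz files makes loading faster: https://github.com/victoriest/deep-learning-playgroud/blob/master/handwrite_digit_ocr/npz_util.py
Generators used when training Keras models, which I still have not fully mastered: A detailed example of how to use data generators with Keras

Two ways to train a Keras model, fit and fit_generator (saves memory) - Keras data generators: subclass keras.utils.Sequence and combine it with fit_generator for memory-efficient training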

How to use Keras fit and fit_generator (a hands-on tutorial)
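The idea behind both approaches can be sketched without Keras. `fit_generator` consumes a generator that yields batches indefinitely, while the `keras.utils.Sequence` style exposes batches by index through `__len__` and `__getitem__`. The names `batch_generator` and `BatchSequence` below are hypothetical, and the class is a plain stand-in that does not inherit from Keras.

```python
import math


def batch_generator(samples, labels, batch_size):
    """Yield (x_batch, y_batch) pairs forever, one batch at a time."""
    while True:
        for start in range(0, len(samples), batch_size):
            end = start + batch_size
            yield samples[start:end], labels[start:end]


class BatchSequence:
    """Index-based variant; in Keras this would subclass keras.utils.Sequence."""

    def __init__(self, samples, labels, batch_size):
        self.samples = samples
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches in one epoch.
        return math.ceil(len(self.samples) / self.batch_size)

    def __getitem__(self, index):
        # Only this slice has to be in memory when the batch is requested.
        start = index * self.batch_size
        end = start + self.batch_size
        return self.samples[start:end], self.labels[start:end]


samples = list(range(10))
labels = [s % 2 for s in samples]

gen = batch_generator(samples, labels, batch_size=4)
first_x, first_y = next(gen)

seq = BatchSequence(samples, labels, batch_size=4)
last_x, last_y = seq[len(seq) - 1]
```

In a real pipeline `__getitem__` would read and preprocess files from disk, so only one batch is held in memory at a time.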

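Update 2019-07-22: handwritten English letter recognition works with a slightly modified RCNN model, trained and tested on the EMNIST dataset, with an accuracy of about 95%.

Change made: the output dimension of Y goes from 10 to 26, one class per English letter, case-insensitive.

EMNIST dataset

EMNIST dataset loading library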
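On the label side, moving from 10 to 26 outputs amounts to a case-insensitive mapping from letters to class indices. The sketch below shows one way to build it; the helper names and the 0-based index convention are assumptions for illustration and may differ from how the EMNIST loader numbers its labels.

```python
import string

NUM_CLASSES = 26  # was 10 for the digit model


def letter_to_index(letter):
    """Map 'a'..'z' or 'A'..'Z' to 0..25, ignoring case."""
    lowered = letter.lower()
    if len(lowered) != 1 or lowered not in string.ascii_lowercase:
        raise ValueError(f"not an English letter: {letter!r}")
    return ord(lowered) - ord("a")


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a one-hot target vector of length num_classes."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec


def index_to_letter(index):
    """Turn a predicted class index back into a lowercase letter."""
    return string.ascii_lowercase[index]


target = one_hot(letter_to_index("B"))
```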
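Update 2019-07-26: handwritten English word and phrase recognition

Reference GitHub project: https://github.com/githubharald/SimpleHTR

Used the recognition model from that project. To make the recognized text more accurate (correcting spelling errors, such as "book" being recognized as "buuk"), the CTCWordBeamSearch decoder has to be added.

It turned out that this decoder relies on TensorFlow custom ops, which could not be used on Windows, so an alternative was needed:

pattern, a Python-based natural language processing toolkit.

Network problems during use can cause this error: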
```
zipfile.BadZipFile: File is not a zip file
```
Find the path the file was downloaded to, delete the file, and download it again.
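The kind of post-correction described above (turning "buuk" back into "book") can be illustrated with a dictionary lookup based on edit distance. This is a standard-library sketch with a hypothetical four-word vocabulary. It is not the API of the pattern toolkit and not the CTCWordBeamSearch decoder.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct(word, vocabulary):
    """Return the vocabulary word closest to `word` (the first one wins on ties)."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))


VOCABULARY = ["book", "read", "word", "page"]  # hypothetical word list
corrected = correct("buuk", VOCABULARY)
print(corrected)  # book
```

The result depends entirely on the vocabulary: with a larger word list that also contains "bulk", the same input would be corrected to "bulk", which is one edit away. Real correctors therefore also weigh word frequency or context.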

To recognize phrases, a word segmentation algorithm was added: https://github.com/githubharald/WordSegmentation
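As a rough picture of what word segmentation does, the sketch below splits a text line into words by looking for runs of empty columns in its vertical ink profile. This projection-profile method is a deliberately simplified stand-in, not the algorithm implemented in the linked project; `split_words` and `min_gap` are hypothetical names.

```python
def split_words(column_ink, min_gap=3):
    """Return (start, end) column ranges of the words in a text-line image.

    column_ink[i] is the number of ink pixels in column i. A run of at least
    `min_gap` empty columns is treated as a space between two words.
    """
    words = []
    start = None     # first column of the word being built
    last_ink = None  # most recent column that contained ink
    for col, ink in enumerate(column_ink):
        if ink == 0:
            continue
        if start is None:
            start = col
        elif col - last_ink - 1 >= min_gap:
            # The gap is wide enough: close the current word and start a new one.
            words.append((start, last_ink + 1))
            start = col
        last_ink = col
    if start is not None:
        words.append((start, last_ink + 1))
    return words


profile = [0, 2, 3, 0, 1, 0, 0, 0, 0, 4, 4, 2, 0]
segments = split_words(profile)
print(segments)  # [(1, 5), (9, 12)]
```

Each returned range can then be cropped out and passed to the word recognizer separately.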

In addition, loading multiple Keras models inside Flask kept raising this error:
```
ValueError: Tensor Tensor is not an element of this graph
```
Forcing Flask into single-threaded mode fixes it:
```
if __name__ == '__main__':
    app.run(host="0.0.0.0", port=8080, threaded=False)
```
Alternatively, use a production-grade WSGI container directly.

Reference articles:

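Document edge detection

The goal is to locate the document region in a photo. Two models were trained and tested.

Path 1:

Followed the GitHub projects https://github.com/senliuy/Keras_HED_with_model and https://github.com/lc82111/Keras_HED:

1. Download the training data from http://vcl.ucsd.edu/hed/HED-BSDS.tar and extract it into the project root.
2. Download the pretrained model: search https://github.com/fchollet/deep-learning-models/releases for the file 'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5', then download it and copy it into the ./models directory.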
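Before starting training, it can help to confirm that both downloads ended up where the steps above put them. The check below is a small sketch; the function name `missing_hed_files` is hypothetical, and the extracted directory name `HED-BSDS` is an assumption based on the archive name.

```python
from pathlib import Path

WEIGHTS_NAME = "vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5"


def missing_hed_files(project_root, data_dir_name="HED-BSDS"):
    """Return the expected paths that do not exist under project_root."""
    root = Path(project_root)
    expected = [
        root / data_dir_name,            # extracted training data (assumed directory name)
        root / "models" / WEIGHTS_NAME,  # pretrained VGG16 weights
    ]
    return [path for path in expected if not path.exists()]


# Report anything still missing in the current directory.
for path in missing_hed_files("."):
    print(f"missing: {path}")
```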

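Path 2:

Followed the article "Document detection in practice with deep learning" (深度学习实践文档检测) and the GitHub project https://github.com/RRanddom/tf_doc_localisation to experiment with document edge detection.

Variable-length text recognition

Followed the GitHub project https://github.com/YCG09/chinese_ocr

Pitfalls encountered:

Base projects:
- ctpn: https://github.com/eragonruan/text-detection-ctpn
- chinese_ocr: https://github.com/YCG09/chinese_ocr

First download both projects and install the Python dependencies listed in their READMEs.

The main project used here is chinese_ocr. It contains a ctpn directory holding a CTPN model, which is the troublesome part, so the notes below focus on it:

On Windows, two (three?) libraries have to be compiled from C. They live in ctpn/lib/utils. On Linux, running make.sh is enough, but on Windows several steps are needed. From the command line, change into that directory and run: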

    cython bbox.pyx
    cython cython_nms.pyx
    cython nms.pyx
    cython gpu_nms.pyx    # optional, GPU only

    python setup.py build_ext --inplace

As expected, this fails. At that point you need the setup.py from the same directory of the ctpn project:

    from distutils.core import setup

    import numpy as np
    from Cython.Build import cythonize

    # NumPy headers are needed to compile the generated C code.
    numpy_include = np.get_include()
    # setup(ext_modules=cythonize("bbox.pyx"), include_dirs=[numpy_include])
    setup(ext_modules=cythonize("cython_nms.pyx"), include_dirs=[numpy_include])

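Copy the compiled artifacts into the utils directory of the ctpn project. You will also run into this problem:

"ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'" for sample_with_gt_wrapper

Change the type to intp_t and recompile. See CharlesShang/FastMaskRCNN#163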
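The mismatch comes down to C integer widths. To my understanding, Cython's `np.int_t` corresponds to C `long`, which is 4 bytes on 64-bit Windows, while the array holds 8-byte `long long` values; `np.intp_t` is pointer-sized and therefore matches on 64-bit builds. The standard-library sketch below prints the sizes on the current machine. The mapping of these C types to the Cython names is stated as an assumption, not taken from the ctpn sources.

```python
import ctypes

long_size = ctypes.sizeof(ctypes.c_long)           # assumed width of np.int_t
long_long_size = ctypes.sizeof(ctypes.c_longlong)  # width of the int64 array data
intp_size = ctypes.sizeof(ctypes.c_ssize_t)        # pointer-sized, like np.intp_t

print(f"long:      {long_size} bytes")
print(f"long long: {long_long_size} bytes")
print(f"ssize_t:   {intp_size} bytes")

if long_size != long_long_size:
    print("long is narrower than long long here, so an int_t buffer cannot hold int64 data")
```

On 64-bit Linux all three sizes are 8 bytes, which would explain why the same code builds and runs there without changes.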

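Reference links: "Notes on battling CPTN (a text recognition network)", from https://www.jianshu.com/p/027e9399e699

"Deploying a CTPN environment on win10 + TensorFlow CPU", from https://blog.csdn.net/u010554381/article/details/86519960

"Setting up the text-detection-ctpn environment for text recognition", from https://blog.csdn.net/qq_35513792/article/details/89174958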

https://github.com/Li-Ming-Fan/OCR-DETECTION-CTPN eragonruan/text-detection-ctpn#73

References

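Referenced Git projects:

GitHub projects with commonly used pretrained model data

Official TensorFlow pretrained models - TensorFlow-Slim image classification model library

A table comparing many models across multiple datasets - What is the class of this image ?

Datasets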
