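The training and test sets are clearly case-sensitive, so why does the code convert everything to lowercase? Was the test-set LMDB preprocessed so that the labels of the test sets (e.g. CUTE80) are all lowercase? The ground-truth labels do contain uppercase letters.
In dataset.py, label = str(txn.get(label_key.encode()).decode('utf-8')) returns labels with both uppercase and lowercase letters for the training set, but only lowercase labels for the test set, even though the original labels contain both cases. At what point were they converted to lowercase, and why is this done?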
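Hello. This setting simply follows prior work: earlier papers evaluate case-insensitively, so we do the same for a fair comparison. ICDAR also officially provides a case-insensitive ranking, which is an important metric in its own right. That is why this repo does not distinguish case. If you want case-sensitive recognition, you can modify the code slightly; the original dataset files are all available, so it should not be hard to do.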
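To make the difference between the two protocols concrete, here is a minimal sketch of word-level accuracy computed with and without case sensitivity. The function name `word_accuracy`, its `case_sensitive` flag, and the sample strings are made up for illustration; this is not the evaluation code from this repo.

```python
def word_accuracy(predictions, labels, case_sensitive=False):
    """Fraction of predictions that exactly match their label.

    Hypothetical helper for illustration only. With case_sensitive=False,
    both strings are lowercased before comparison.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0

    correct = 0
    for pred, gt in zip(predictions, labels):
        if not case_sensitive:
            pred, gt = pred.lower(), gt.lower()
        if pred == gt:
            correct += 1
    return correct / len(labels)


# Made-up sample data: only "SALE" matches when case is respected.
preds = ["Coffee", "SALE", "exit"]
gts = ["COFFEE", "SALE", "Exit"]

insensitive = word_accuracy(preds, gts)                      # all three match
sensitive = word_accuracy(preds, gts, case_sensitive=True)   # one of three matches
```

The same predictions score very differently under the two protocols, which is why results are only comparable when they use the same convention.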
Regarding case insensitivity: would changing chara = text_tmp[i][j].lower() to chara = text_tmp[i][j] be enough to solve it? @Canjie-Luo
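One thing that may be worth checking alongside that change: if the label encoder looks characters up in a fixed alphabet, the alphabet also has to contain uppercase letters, otherwise removing .lower() alone would leave uppercase characters unmapped. The sketch below assumes such an alphabet-based encoder. The names `encode_label`, `LOWER_ALPHABET`, and `CASED_ALPHABET`, and the choice to reserve index 0, are hypothetical and have not been checked against this repo.

```python
import string

# Hypothetical alphabets; the repo's actual character set may differ.
LOWER_ALPHABET = string.digits + string.ascii_lowercase
CASED_ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_label(text, alphabet, case_sensitive=False):
    """Map each character to a 1-based index into `alphabet`.

    Index 0 is left unused here (assumed reserved, e.g. for a blank or
    end-of-sequence symbol). Raises KeyError for unknown characters.
    """
    char_to_index = {ch: i + 1 for i, ch in enumerate(alphabet)}
    indices = []
    for ch in text:
        # Mirrors the proposed change: skip .lower() when case matters.
        chara = ch if case_sensitive else ch.lower()
        if chara not in char_to_index:
            raise KeyError(f"character {chara!r} is not in the alphabet")
        indices.append(char_to_index[chara])
    return indices


lowercased = encode_label("Ab", LOWER_ALPHABET)                   # 'A' is folded to 'a'
cased = encode_label("Ab", CASED_ALPHABET, case_sensitive=True)   # 'A' keeps its own index
```

Under this assumption, a case-sensitive setup would also need a larger output layer (one class per alphabet entry) and a case-preserving copy of the test labels.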