Stars
中英文敏感词、语言检测、中外手机/电话归属地/运营商查询、名字推断性别、手机号抽取、身份证抽取、邮箱抽取、中日文人名库、中文缩写库、拆字词典、词汇情感值、停用词、反动词表、暴恐词表、繁简体转换、英文模拟中文发音、汪峰歌词生成器、职业名称词库、同义词库、反义词库、否定词库、汽车品牌词库、汽车零件词库、连续英文切割、各种中文词向量、公司名字大全、古诗词库、IT词库、财经词库、成语词库、地名词库、…
A natural language interface for computers
Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and…
The official GitHub page for the survey paper "A Survey of Large Language Models".
Chinese version of GPT2 training code, using BERT tokenizer.
pycorrector is a toolkit for text error correction. 文本纠错,实现了Kenlm,T5,MacBERT,ChatGLM3,LLaMA等模型应用在纠错场景,开箱即用。
一个基于HuggingFace开发的大语言模型训练、测试工具。支持各模型的webui、终端预测,低参数量及全参数模型训练(预训练、SFT、RM、PPO、DPO)和融合、量化。
Mimix: A Text Generation Tool and Pretrained Chinese Models
vba34520 / chineseocr_lite
Forked from DayBreak-u/chineseocr_litePython构建快速高效的中文文字识别OCR
Agent that writes consistent and interesting long stories for any fiction form
高可用高性能定制化 Python Flask MVC http://www.54php.cn