Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Python 2,443 172 Updated Jun 26, 2024

NVIDIA / ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Python 2,533 284 Updated Jun 22, 2024

chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen a…

TypeScript 29,477 5,171 Updated Jul 2, 2024

alisen39 / TrWebOCR

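An open-source, easy-to-use offline Chinese OCR tool with recognition accuracy comparable to that of major vendors. It provides an easy-to-use web page and web API, convenient for everyday work or for calls from other programs.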

Python 2,544 586 Updated Jun 14, 2023

DayBreak-u / chineseocr_lite

Ultra-lightweight Chinese OCR. Supports vertical text recognition and ncnn, mnn, and tnn inference (dbnet (1.8M) + crnn (2.5M) + anglenet (378KB)); the total model size is only 4.7M.

C++ 11,621 2,238 Updated Aug 14, 2023

myhub / tr

Free Offline OCR: an offline SDK for Chinese text detection and recognition

Python 1,234 376 Updated May 31, 2024

Ucas-HaoranWei / Vary

[ECCV2024] Official code implementation of Vary: Scaling Up the Vision Vocabulary of Large Vision Language Models.

Python 1,641 150 Updated Jul 2, 2024

li-plus / chatglm.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4

C++ 2,807 325 Updated Jun 24, 2024

yuguo-Jack / GLM-Pretrain-in-Megatron-DeepSpeed

GLM-Pretrain in Megatron-Deepspeed for DCU

Python 7 1 Updated Aug 31, 2023

xinsblog / chatglm-tiny

Train a small chatglm model from scratch

Python 48 10 Updated Oct 10, 2023

yangjianxin1 / Firefly-LLaMA2-Chinese

Firefly Chinese LLaMA-2 large model. Supports incremental pre-training of large models such as Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, and Bloom.

Python 382 27 Updated Oct 21, 2023

shibing624 / MedicalGPT

MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.

Python 2,927 450 Updated Jun 28, 2024

OrionStarAI / OrionStar-Yi-34B-Chat

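OrionStar-Yi-34B-Chat is an open-source Chinese-English chat model, fine-tuned by OrionStar from the open-source Yi-34B model on more than 150,000 high-quality corpus entries.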

Python 257 28 Updated Apr 9, 2024

xverse-ai / XVERSE-13B

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc.

Python 648 58 Updated Apr 9, 2024

km1994 / LLMs_interview_notes

This repository mainly collects interview questions for large language model (LLM) algorithm engineers.

1,140 89 Updated Mar 31, 2024

facert / awesome-spider

A collection of web crawlers

21,886 4,791 Updated Sep 27, 2023

lixiang0 / WEB_KG

Crawls Chinese-language Baidu Baike pages, extracts triples, and builds a Chinese knowledge graph

Python 912 188 Updated Jul 20, 2020

cirosantilli / china-dictatorship

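Anti-Chinese-Communist-Party political propaganda repository. Anti Chinese government propaganda. Users who live in China under their real names, please do not star this, or the police will invite you for tea. FAQ, news collection, and restaurant and music recommendations. 卐 Long live Xi 卐. Coronavirus censorship, Hao Haidong, Xinjiang re-education centers, June Fourth incident, Falun Gong, 996.ICU, 709 crackdown, Panama Papers, Deng Jiagui, low-end population, Tibetan unrest. Friends who live in China and have real name on…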

HTML 1,877 218 Updated Jun 24, 2024

wjn1996 / scrapy_for_zh_wiki

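Crawls Chinese Wikipedia using a scrapy-based hierarchical priority queue method and automatically extracts structured and semi-structured data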

Python 122 18 Updated Apr 27, 2023

scrapy / scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Python 51,585 10,410 Updated Jul 2, 2024

Edy-Barraza / Transformer_Distillation

Knowledge Distillation For Transformer Language Models

Python 51 11 Updated Jan 3, 2024

microsoft / xtreme-distil-transformers

XtremeDistil framework for distilling/compressing massive multilingual neural network models to tiny and efficient models for AI at scale

Python 153 14 Updated Dec 20, 2023

ruiqianheartseed

Stars

google-research / t5x

codertimo / BERT-pytorch

google-research / bert

THUDM / GLM-4

Tencent / HunyuanDiT

lucidrains / x-clip

Cheneng / DPCNN

run-llama / rags

bclavie / RAGatouille