State-of-the-Art Text Embeddings

Python 14,971 2,445 Updated Sep 30, 2024

PyTorch implementation of the InfoNCE loss for self-supervised learning.

Python 470 39 Updated Nov 17, 2023

Embedding, NMT, Text_Classification, Text_Generation, NER etc.

Python 556 116 Updated Jun 12, 2023
Python 5 Updated Aug 25, 2023

Retrieval and Retrieval-augmented LLMs

Python 7,021 513 Updated Sep 26, 2024

A collection of awesome papers on RAG for AIGC. We propose a taxonomy of RAG foundations, enhancements, and applications in the paper "Retrieval-Augmented Generation for AI-Generated Content: A Survey".

1,170 84 Updated Aug 20, 2024
Jupyter Notebook 9,321 645 Updated Jul 29, 2024

Train transformer language models with reinforcement learning.

Python 9,604 1,201 Updated Oct 4, 2024

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,568 2,213 Updated Jul 29, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

975 83 Updated Sep 28, 2024

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.

Python 11,291 1,008 Updated Oct 4, 2024

Code and documentation to train Stanford's Alpaca models, and generate the data.

Python 29,389 4,032 Updated Jul 17, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,591 2,159 Updated Aug 12, 2024

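Provides a practical interactive interface for LLMs such as GPT/GLM, specially optimized for reading, polishing, and writing papers. Modular design; supports custom shortcut buttons & function plugins; supports project analysis & self-explanation for Python, C++, and other languages; PDF/LaTeX paper translation & summarization; parallel querying of multiple LLMs; and local models such as chatglm3. Integrates Tongyi Qianwen, deepseekcoder, iFlytek Spark, ERNIE Bot, llama2, rwkv, claude2, m…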

Python 64,510 7,973 Updated Oct 1, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,329 936 Updated Oct 1, 2024

Let us control diffusion models!

Python 29,943 2,704 Updated Feb 25, 2024

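Xiaohongshu notes | comment crawler, Douyin videos | comment crawler, Kuaishou videos | comment crawler, Bilibili videos | comment crawler, Weibo posts | comment crawler, Baidu Tieba posts | Baidu Tieba comment-reply crawler, Zhihu Q&A articles | comment crawler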

Python 16,798 5,336 Updated Sep 28, 2024

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…

Jupyter Notebook 11,999 1,828 Updated Oct 4, 2024

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Python 1,992 206 Updated Aug 15, 2024

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

Python 1,434 138 Updated Sep 30, 2024

Tech Enthusiasts Weekly, published every Friday

46,795 2,844 Updated Sep 27, 2024

List of Dirty, Naughty, Obscene, and Otherwise Bad Words

2,899 662 Updated Aug 5, 2024

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 132,919 26,515 Updated Oct 4, 2024

Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models

Python 4,762 470 Updated Sep 27, 2024

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,864 3,909 Updated Oct 1, 2024

Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long-context models

Python 7,063 578 Updated Sep 23, 2024

Code and data accompanying Natural Language Processing with PyTorch published by O'Reilly Media https://amzn.to/3JUgR2L

Jupyter Notebook 1,983 807 Updated Mar 12, 2023

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…

Python 12,004 2,923 Updated Sep 30, 2024

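MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus that aims to match the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.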

3,416 233 Updated Oct 3, 2024