- shanghai
LLM
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
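LoRA's core idea can be sketched in a few lines: the frozen pretrained weight W is augmented with a trainable low-rank update B·A, so only the small factors are trained. A minimal numpy illustration (hypothetical shapes, not loralib's actual API):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 16, 4  # illustrative dimensions; r is the LoRA rank

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable, initialized to zero

def lora_forward(x, scale=1.0):
    # y = W x + scale * B A x; with B = 0 at init, this equals the base model
    return W @ x + scale * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identity at initialization
```

Because B starts at zero, training begins from the pretrained model's behavior, and only r·(d_in + d_out) parameters are updated instead of d_in·d_out.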
ChatGLM-6B: An Open Bilingual Dialogue Language Model
A curated list of open-source Chinese large language models, focusing on smaller models that can be privately deployed at low training cost, covering base models, domain-specific fine-tunes and applications, datasets, and tutorials.
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Transformer: PyTorch Implementation of "Attention Is All You Need"
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Prefix-Tuning: Optimizing Continuous Prompts for Generation
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
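The deep-prompt-tuning idea behind this line can be sketched simply: trainable prompt vectors are prepended to the attended-over sequence at every layer while the backbone stays frozen. A hedged numpy sketch with illustrative names and shapes (not the repository's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_tokens, n_prompt, n_layers = 8, 5, 3, 2  # illustrative sizes

# One trainable prompt matrix per layer; everything else is frozen.
prompts = [rng.standard_normal((n_prompt, d)) * 0.02 for _ in range(n_layers)]

def attend(q, kv):
    # plain scaled dot-product attention over the (prompt + token) sequence
    scores = q @ kv.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

h = rng.standard_normal((n_tokens, d))
for p in prompts:
    kv = np.concatenate([p, h], axis=0)  # prepend this layer's prompts
    h = attend(h, kv)

print(h.shape)  # (5, 8): token representations, now conditioned on the prompts
```

Injecting prompts at every layer (rather than only at the input, as in shallow prompt tuning) is what lets the method stay competitive with full fine-tuning across model scales.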
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
This repository contains demos I made with the Transformers library by HuggingFace.
A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
CoreNet: A library for training deep neural networks
Unsupervised text tokenizer for Neural Network-based text generation.
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Google AI 2018 BERT pytorch implementation
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"