View wuji3's full-sized avatar
Stars

LLM

25 repositories

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 9,988 634 Updated May 2, 2024

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,171 5,171 Updated Jun 27, 2024

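A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.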

13,915 1,281 Updated Jul 21, 2024

Inference code for Llama models

Python 54,856 9,400 Updated Jul 25, 2024

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Jupyter Notebook 2,487 332 Updated Aug 4, 2024

Transformer: PyTorch Implementation of "Attention Is All You Need"

Python 2,580 396 Updated Apr 17, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 35,352 5,472 Updated Aug 2, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 35,087 3,688 Updated Jul 28, 2024

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Python 873 158 Updated Apr 26, 2024

The official Meta Llama 3 GitHub site

Python 25,180 2,777 Updated Jul 31, 2024

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…

Jupyter Notebook 11,125 1,572 Updated Aug 2, 2024

An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks

Python 1,948 196 Updated Nov 16, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 34,123 3,994 Updated Aug 4, 2024

This repository contains demos I made with the Transformers library by HuggingFace.

Jupyter Notebook 8,727 1,361 Updated Jul 23, 2024

A WebUI for Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 28,366 3,476 Updated Aug 2, 2024

CoreNet: A library for training deep neural networks

Python 6,858 528 Updated May 28, 2024

Unsupervised text tokenizer for Neural Network-based text generation.

C++ 9,909 1,148 Updated Aug 1, 2024

Python 7,040 545 Updated Jul 25, 2024

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 29,945 6,344 Updated Jul 26, 2024

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM

Python 7,659 672 Updated Jan 14, 2024

Google AI 2018 BERT pytorch implementation

Python 6,116 1,290 Updated Sep 15, 2023

Pytorch Implementation of Google BERT

Python 587 181 Updated Mar 29, 2020

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 12,893 1,041 Updated Jul 30, 2024

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Python 8,163 576 Updated Aug 3, 2024

Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer"

Python 6,054 750 Updated Jun 28, 2024