datatrove Public
Forked from huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Python · Apache License 2.0 · Updated Aug 12, 2024
nanotron Public
Forked from huggingface/nanotron
Minimalistic large language model 3D-parallelism training
Python · Apache License 2.0 · Updated Aug 11, 2024
DoLa Public
Forked from voidism/DoLa
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
Python · Updated Feb 6, 2024
flan-alpaca-lora Public
Forked from Reason-Wang/flan-alpaca-lora
This repository contains the code to train Flan-T5 with Alpaca instructions and low-rank adaptation.
Python · Updated Jun 17, 2023
korean-safety-benchmarks Public
Forked from naver-ai/korean-safety-benchmarks
Python · MIT License · Updated Jun 2, 2023
imbalanced-dataset-sampler Public
Forked from ufoym/imbalanced-dataset-sampler
A (PyTorch) imbalanced dataset sampler for oversampling low-frequency classes and undersampling high-frequency ones.
Python · MIT License · Updated Apr 4, 2023
dense_passage_retriever Public
DPR code implementation. Data: KorQuAD; backbone: KLUE-BERT
Python · Updated Mar 30, 2023
KoChatGPT Public
Forked from kkang09/KoChatGPT
Korean datasets for each of the three steps of ChatGPT's RLHF training.
Jupyter Notebook · Updated Mar 22, 2023
plain-transformers Public
Forked from c00k1ez/plain-transformers
Transformer models implementation for training from scratch.
Python · Apache License 2.0 · Updated Dec 28, 2022