wangruihui0429

0 followers · 1 following

Stars

SORRY-Bench / sorry-bench

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

Jupyter Notebook 31 Updated Jun 27, 2024

ChenDelong1999 / Linguistic-Similarity

Official repo of paper "Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models"

4 Updated Sep 20, 2024

zhangpeii / Edu-Values

1 Updated Sep 19, 2024

yuchenlin / rebiber

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python 2,597 157 Updated Aug 18, 2024

Zeyi-Lin / HivisionIDPhotos

⚡️HivisionIDPhotos: a lightweight and efficient AI tool and algorithm for generating ID photos.

Python 10,594 1,036 Updated Sep 28, 2024

wdndev / llm_interview_note

Notes covering the knowledge and interview questions relevant to large language model (LLM) algorithm and application engineers

HTML 2,929 339 Updated Aug 19, 2024

datawhalechina / self-llm

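"A Guide to Consuming Open-Source Large Models": a tutorial for quickly deploying open-source large models in a Linux environment, tailored to beginners in China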

Jupyter Notebook 8,237 986 Updated Sep 29, 2024

STAIR-BUPT / JailBench

JailBench: a Chinese-language dataset for evaluating the risk of jailbreak attacks on large language models

12 1 Updated Jul 13, 2024

anthropics / courses

Anthropic's educational courses

Jupyter Notebook 5,480 420 Updated Sep 18, 2024

ystemsrx / Qwen2-Boundless

A model fine-tuned from Qwen2-1.5B-Instruct, capable of handling sensitive topics such as violence, explicit content, and illegal activity.

Python 124 30 Updated Aug 22, 2024

esbatmop / MNBVC

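MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus that aims to match the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) text. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing stories, chat logs, and more.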

3,416 233 Updated Oct 3, 2024

tml-epfl / llm-past-tense

Does Refusal Training in LLMs Generalize to the Past Tense? [arXiv, July 2024]

Python 50 6 Updated Oct 3, 2024

epfl-dlab / llm-latent-language

Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".

Jupyter Notebook 52 12 Updated Mar 11, 2024

takagi97 / LLMs-are-parallel-multilingual-learners

The implementation of "Large Language Models are Parallel Multilingual Learners".

Roff 10 Updated Mar 18, 2024

RUCAIBox / Language-Specific-Neurons

Python 41 4 Updated Jul 8, 2024

pillowsofwind / Course-Correction

[EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences"

Python 19 1 Updated Oct 2, 2024

ThuCCSLab / JailbreakEval

A collection of automated evaluators for assessing jailbreak attempts.

Python 59 8 Updated Jun 26, 2024

AetherPrior / TrickLLM

This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashishta*, Atharva Naik*, Somak Aditya, a…

Jupyter Notebook 5 2 Updated May 22, 2024

zhouying20 / HMGC

COLING'24 Humanizing Machine-Generated Content: Evading AI-Text Detection through Adversarial Attack

Python 26 2 Updated Apr 10, 2024

ydyjya / LLM-IHS-Explanation

Jupyter Notebook 29 2 Updated Jun 13, 2024

luciusssss / mc2_corpus

[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)

Python 16 1 Updated Jun 15, 2024

Tavely / Popular-AI-tools-list-by-category

A list of the most popular AI tools of 2024, sorted by category

165 23 Updated Aug 5, 2024

thu-coai / SafeUnlearning

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks

Python 20 1 Updated Jul 9, 2024

stanfordnlp / pyreft

ReFT: Representation Finetuning for Language Models

Python 1,118 95 Updated Sep 29, 2024

ChristopheZhao / SFT_data_generation

Uses an LLM to generate instruction-tuning data for a specific scenario.

Python 14 5 Updated May 2, 2024

Ed-Zh / PARDEN

HTML 8 1 Updated May 14, 2024

raywu0123 / Chinese_Foreign_Affairs_Spokepersons_Remarks

Dataset of remarks by China's Ministry of Foreign Affairs

Python 5 1 Updated Mar 17, 2020

Libr-AI / do-not-answer

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

Jupyter Notebook 166 23 Updated Jun 7, 2024

Princeton-SysML / Jailbreak_LLM

Jupyter Notebook 144 14 Updated Nov 26, 2023

chujiezheng / LLM-Safeguard

Official repository for ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"

Python 65 6 Updated Sep 5, 2024