Chao Fang fantasysee

🍀

PhD Student @ Integrated Circuits and Intelligent Systems (ICAIS) Lab, Nanjing University

19 followers · 124 following

Nanjing University
Nanjing, China

Starred repositories

henryzhongsc / longctx_bench

Official implementation for Yuan & Liu & Zhong et al., KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches. EMNLP Findings 2024

Python 36 2 Updated Oct 5, 2024

ucb-bar / RoSE

A unified simulation platform that combines hardware and software, enabling pre-silicon, full-stack, closed-loop evaluation of your robotic system.

Python 34 4 Updated Sep 27, 2024

KULeuven-MICAS / snax_cluster

Forked from pulp-platform/snitch_cluster

A heterogeneous accelerator-centric compute cluster

SystemVerilog 9 9 Updated Oct 5, 2024

tukl-msd / DRAMPower

Fast and accurate DRAM power and energy estimation tool

C++ 122 47 Updated Oct 1, 2024

opengear-project / GEAR

GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

Python 137 11 Updated Jul 12, 2024

jy-yuan / KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 217 21 Updated Aug 27, 2024

NewT123-WM / tnlearn

A Python package that uses task-based neurons to build neural networks.

Python 133 3 Updated Aug 22, 2024

OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 688 53 Updated Jul 24, 2024

TurakhiaLab / TALCO

C 9 1 Updated Aug 13, 2024

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

18,037 1,456 Updated Oct 2, 2024

SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C++ 7,901 407 Updated Sep 6, 2024

trevorpogue / algebraic-nnhw

Deep learning accelerator architectures requiring half the multipliers

Python 260 15 Updated Mar 28, 2024

SJTU-ReArch-Group / Paper-Reading-List

81 7 Updated May 14, 2024

Xiuyu-Li / q-diffusion

[ICCV 2023] Q-Diffusion: Quantizing Diffusion Models.

Python 315 21 Updated Mar 21, 2024

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

Python 10,129 798 Updated Aug 20, 2024

SamsungLabs / Sparse-Multi-DNN-Scheduling

Open-source artifacts and code for our MICRO'23 paper titled “Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads”.

Python 32 Updated Sep 18, 2023