wiluen

yilun wang wiluen

2 followers · 1 following

https://wiluen.github.io/

Lists (1)

🚀 My stack

2 repositories

Stars

feifeibear / LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Python 530 51 Updated Aug 22, 2024

AnswerDotAI / cold-compress

Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of GPT-Fast, a simple, PyTorch-native generation codebase.

Python 82 8 Updated Aug 9, 2024

machilusZ / FastGen

This repo contains the source code for: Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

29 Updated Aug 14, 2024

llm-db / FineInfer

Deferred Continuous Batching in Resource-Efficient Large Language Model Serving (EuroMLSys 2024)

Python 11 1 Updated May 28, 2024

poloclub / transformer-explainer

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 2,680 233 Updated Sep 30, 2024

zhangxjohn / LLM-Agent-Benchmark-List

A benchmark list for evaluation of large language models.

60 1 Updated Jul 8, 2024

wiluen / DeepCAT

Python 2 Updated Sep 2, 2024

wiluen / FaaSConf

Python 1 Updated Sep 11, 2024

chenzomi12 / AISystem

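AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks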

Jupyter Notebook 10,643 1,537 Updated Sep 29, 2024

backprop-ai / vllm-benchmark

Benchmarking the serving capabilities of vLLM

Python 16 4 Updated Aug 20, 2024

RayeRen / acad-homepage.github.io

AcadHomepage: A Modern and Responsive Academic Personal Homepage

SCSS 1,346 2,532 Updated Oct 4, 2024

Emerging-AI / ENOVA

A deployment, monitoring and autoscaling service towards serverless LLM serving.

Python 153 25 Updated Sep 28, 2024

liuxu77 / UniTime

UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting (WWW 2024)

Python 66 5 Updated Feb 24, 2024

microservices-demo / microservices-demo

Deployment scripts & config for Sock Shop

Python 3,629 2,812 Updated Dec 5, 2023

alpa-projects / alpa

Training and serving large-scale neural networks with auto parallelization.

Python 3,052 353 Updated Dec 9, 2023

raoyongming / GFNet

[NeurIPS 2021] [T-PAMI] Global Filter Networks for Image Classification

Jupyter Notebook 436 40 Updated Jun 12, 2023

Masterleia / TSF_LSTF_Compare

A comparison of time series forecasting models, especially for LSTF, including Informer, Autoformer, Reformer, Pyraformer, FEDformer, Transformer, MTGNN, LSTNet, and Graph WaveNet

Python 91 13 Updated Sep 30, 2022

RuifMaxx / Paper-List-of-cloud-resource-management

12 Updated Dec 29, 2022

IntelligentDDS / Uni-AD

Share or Not Share? Towards the Practicability of Deep Models for Unsupervised Anomaly Detection in Modern Online Systems (ISSRE'22)

Python 8 Updated Feb 16, 2023

stevinc / Transformer_Timeseries

PyTorch code for Google's Temporal Fusion Transformer

Python 79 24 Updated May 2, 2022

google / cluster-data

Borg cluster traces from Google

TeX 874 187 Updated Jun 26, 2024

icanforce / Orion-OSDI22

Serverless optimizations

Python 50 17 Updated Feb 25, 2024

microsoft / Moonlit

This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.

Python 73 7 Updated Sep 17, 2024

James-QiuHaoran / Tools

This repository consists of useful tools or guides for system software development or anything interesting.

Python 10 2 Updated Sep 20, 2024

IBM / multi-cloud-configuration-dataset

Dataset containing runtimes and estimated costs for various workloads across different cloud providers and configuration settings.

Jupyter Notebook 9 4 Updated Jun 10, 2022

Sizeless / ReplicationPackage

JavaScript 13 4 Updated Oct 13, 2021

MBtech / rethinking-serverless

Repo containing data and code for serverless paper

Jupyter Notebook 8 2 Updated Jun 23, 2023

hsy23 / ECWDataset

17 2 Updated Dec 11, 2023

hsy23 / KDD23_DynEformer

PPIO workload prediction framework code

Python 14 2 Updated Jul 22, 2024

microsoft / maro

Multi-Agent Resource Optimization (MARO) platform is an instance of Reinforcement Learning as a Service (RaaS) for real-world resource optimization problems.

Python 846 152 Updated Feb 23, 2024
