Stars
⚡ Dynamically generated stats for your GitHub READMEs
Easy and Efficient Quantization for Transformers
Easy and Efficient Transformer: Scalable Inference Solution for Large NLP Models
The Triton TensorRT-LLM Backend
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Several simple examples for popular neural network toolkits calling custom CUDA operators.
Ongoing research training transformer models at scale
A hands-on, step-by-step practical guide to Hugging Face Transformers. Course videos are updated in sync on Bilibili and YouTube.
Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.
Original book source code, exercise answers, and personal notes for "C++ Primer Plus, 6th Edition" (Chinese edition), for study and discussion only.
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
Code for Deep Anomaly Detection on Attributed Networks (SDM2019)
Solving algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
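A PyTorch implementation of Faster R-CNN that can be trained on data in the VOC dataset format.
Knowledge, code, and approaches for competitions in data mining, computer vision, natural language processing, and recommender systems.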