🎯 Focusing
Stanford CheXpert Competition top 1,
AICAS 2024 top 1,
PyTorch and TensorRT committer
Starred repositories written in CUDA: 1
Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).