🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated sporadically: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
cuda
cuda-kernels
gemm
softmax
cuda-programming
layernorm
gemv
elementwise
rmsnorm
flash-attention
flash-attention-2
warp-reduce
block-reduce
Updated Jun 28, 2024 - Cuda