14 stars written in Cuda

LLM training in simple, raw C/CUDA

Cuda 23,848 2,669 Updated Oct 2, 2024

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,675 447 Updated Oct 9, 2023

How to optimize some algorithms in CUDA.

Cuda 1,485 122 Updated Oct 8, 2024

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,194 138 Updated Jul 31, 2024

Efficient GPU kernels for block-sparse matrix multiplication and convolution

Cuda 1,021 200 Updated Jun 8, 2023

CUDA Kernel Benchmarking Library

Cuda 491 63 Updated Jun 5, 2024

Distributed multigrid linear solver library on GPU

Cuda 483 139 Updated Aug 14, 2024

Fast CUDA matrix multiplication from scratch

Cuda 441 61 Updated Dec 28, 2023

Chamfer Distance in Pytorch with f-score

Cuda 326 43 Updated Jan 8, 2021

Step-by-step optimization of CUDA SGEMM

Cuda 216 36 Updated Mar 30, 2022

CUDA implementation of the fundamental sum reduce operation. Aims to be as optimized as reasonable.

Cuda 35 10 Updated Jul 19, 2017

CUDA program to perform logistic regression on any CSV data file, with the training output in the 2nd column.

Cuda 5 2 Updated Mar 22, 2016

Using CUDA to calculate normals given an n-n matrix. The language of the comments is Spanish.

Cuda 2 Updated Apr 9, 2017