From 35cd4e65ba365fba56f5da3240f65536dbfae06f Mon Sep 17 00:00:00 2001
From: Simon Boehm
Date: Tue, 3 Jan 2023 23:04:13 +0100
Subject: [PATCH] Update README.md

---
 README.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1a1fb1b..fd570ea 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,7 @@
 
 Step-by-step optimization of matrix multiplication, implemented in CUDA.
 For an explanation of each kernel, see [siboehm.com/CUDA-MMM](https://siboehm.com/articles/22/CUDA-MMM).
+This repo is inspired by [wangzyon/NVIDIA_SGEMM_PRACTICE](https://github.com/wangzyon/NVIDIA_SGEMM_PRACTICE).
 
 ## Overview
 
@@ -33,4 +34,4 @@ GFLOPs at matrix size 4092x4092:
 1. `mkdir build && cd build && cmake .. -GNinja && ninja`
 1. `./sgemm `
 
-For profiling, download [NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute).
\ No newline at end of file
+For profiling, download [NVIDIA Nsight Compute](https://developer.nvidia.com/nsight-compute).