Commit
K10: Rm swizzling, didn't make a diff
siboehm committed Feb 25, 2023
1 parent 9a4dd77 commit f2c7982
Showing 1 changed file with 2 additions and 10 deletions.
12 changes: 2 additions & 10 deletions src/kernels/10_kernel_warptiling.cuh
@@ -26,16 +26,8 @@ template <const int BM, const int BN, const int BK, const int WM, const int WN,
 __global__ void __launch_bounds__(NUM_THREADS)
     sgemmWarptiling(int M, int N, int K, float alpha, float *A, float *B,
                     float beta, float *C) {
-  const uint SWIZZLE = 4;
-  // which swizzle block are we in
-  const uint swizzleBlockIdx = blockIdx.x / (SWIZZLE * SWIZZLE);
-  // index inside the swizzle block
-  const uint swizzleIdx = blockIdx.x % (SWIZZLE * SWIZZLE);
-
-  const uint cCol =
-      (swizzleBlockIdx % (N / BN / SWIZZLE)) * SWIZZLE + (swizzleIdx % SWIZZLE);
-  const uint cRow =
-      (swizzleBlockIdx / (N / BN / SWIZZLE)) * SWIZZLE + (swizzleIdx / SWIZZLE);
+  const uint cRow = blockIdx.y;
+  const uint cCol = blockIdx.x;

   // Placement of the warp in the threadblock tile
   const uint warpIdx = threadIdx.x / WARPSIZE; // the warp this thread is in
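
For readers comparing the two index schemes in this hunk, here is a minimal host-side sketch of the arithmetic. It is not code from the repository: `Tile`, `swizzledTile`, `plainTile`, and `countDistinctTiles` are hypothetical names, `tilesPerRow` stands in for `N / BN`, and the sketch assumes that `N / BN` is divisible by the swizzle width and that the swizzled variant was launched with a 1D grid (the launch code is not part of this diff).

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical helper type: the (row, col) of the output tile of C
// that one threadblock is responsible for.
struct Tile {
  unsigned row;
  unsigned col;
};

// The mapping removed by this commit, written as a plain function.
// A linear block index is split into groups of swizzle * swizzle
// consecutive blocks, and each group covers a square patch of tiles.
// tilesPerRow plays the role of N / BN (assumed divisible by swizzle).
Tile swizzledTile(unsigned blockIdxX, unsigned tilesPerRow, unsigned swizzle) {
  const unsigned swizzleBlockIdx = blockIdxX / (swizzle * swizzle); // which patch
  const unsigned swizzleIdx = blockIdxX % (swizzle * swizzle);      // index inside the patch
  const unsigned patchesPerRow = tilesPerRow / swizzle;
  const unsigned col = (swizzleBlockIdx % patchesPerRow) * swizzle + (swizzleIdx % swizzle);
  const unsigned row = (swizzleBlockIdx / patchesPerRow) * swizzle + (swizzleIdx / swizzle);
  return {row, col};
}

// The mapping the commit switches to: a 2D grid indexes the tile directly.
Tile plainTile(unsigned blockIdxX, unsigned blockIdxY) {
  return {blockIdxY, blockIdxX};
}

// Counts how many distinct tiles the swizzled mapping reaches when every
// linear block index in a tilesPerRow x tilesPerCol grid is visited once.
unsigned countDistinctTiles(unsigned tilesPerRow, unsigned tilesPerCol, unsigned swizzle) {
  std::set<std::pair<unsigned, unsigned>> seen;
  for (unsigned b = 0; b < tilesPerRow * tilesPerCol; ++b) {
    const Tile t = swizzledTile(b, tilesPerRow, swizzle);
    seen.insert({t.row, t.col});
  }
  return static_cast<unsigned>(seen.size());
}
```

With `tilesPerRow = 8` and a swizzle width of 4, linear blocks 0 to 15 fill the top-left 4x4 patch of tiles before block 16 moves on to the patch to its right. Both mappings visit every tile exactly once and differ only in which block index lands on which tile.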