Change how users access OpenCL kernels #966

SteveBronder · 2018-08-04T18:09:45Z

Checklist

Math issue Seperate OpenCL kernel access into it's own class #973

Copyright holder: Rok Češnovar and Erik Štrumbelj (Faculty of Computer and Information Science, University of Ljubljana); Steve Bronder

- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

the code base is stable
- all unit tests pass
- continuous integration passes
the changes are maintainable
- the code design is idiomatic C++ or there's a good reason it's not
- please include appropriate documentation
the changes are tested
the changes adhere to the Math library's C++ standards

Summary

Easier Access To Kernels

This PR creates a kernel_cl_base singleton to manage the compiled kernels and a kernel_cl friend adapter class that gives users access to those kernels, along with helper functions. Currently, kernels are compiled from within the opencl_context_base class. The GPU team is concerned that this class is becoming too bulky, so we created a new kernel_cl class that users can use to access kernels. At present, users run something like the following to compile an OpenCL kernel and send arguments over to it:

cl::Kernel kernel = opencl_context.get_kernel("add");
opencl_context.set_kernel_args(kernel, C.buffer(), A.buffer(), B.buffer(),
                                   A.rows(), A.cols());

Users can now call the following:

auto add_kernel = kernel_cl.add(C.buffer(), A.buffer(), B.buffer(), A.rows(), A.cols());
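A minimal sketch of how a single variadic call might replace the separate "get kernel, then set arguments" steps. It is not the code in this PR: mock_kernel, set_args and make_kernel are hypothetical names, and mock_kernel only mimics the shape of cl::Kernel::setArg(index, value) so that the example runs without an OpenCL device.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cl::Kernel: records each argument as text
// instead of forwarding it to an OpenCL device.
struct mock_kernel {
  std::vector<std::string> args;

  template <typename T>
  void setArg(std::size_t index, const T& value) {
    if (args.size() <= index) args.resize(index + 1);
    args[index] = std::to_string(value);
  }
};

// Base case: no arguments left to set.
inline void set_args(mock_kernel&, std::size_t) {}

// Recursive case: set argument `index`, then recurse on the rest.
template <typename T, typename... Rest>
void set_args(mock_kernel& k, std::size_t index, const T& first,
              const Rest&... rest) {
  k.setArg(index, first);
  set_args(k, index + 1, rest...);
}

// Hypothetical one-call helper: create the kernel and set all arguments.
template <typename... Args>
mock_kernel make_kernel(const Args&... args) {
  mock_kernel k;
  set_args(k, 0, args...);
  assert(k.args.size() == sizeof...(Args));  // every argument was recorded
  return k;
}
```

Under these assumptions, make_kernel(3, 4, 5) yields a kernel whose recorded arguments are "3", "4" and "5", in that order. The point is that the argument index is tracked by the helper rather than by the caller.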

Remove Kernel Groups

This PR also removes the kernel groups, so each kernel is now compiled into its own program. Originally the OpenCL context compiled kernels that were grouped together (e.g. 'cholesky', 'checks', etc.) into a single program, the idea being that if multiple kernels are used together it is smarter to compile all of them at once. While compiling multiple kernels into a single program is faster, as we add more kernels, choosing how to label and group them would become difficult and too heuristic. In the tests we ran, compiling the kernels separately did not seem to produce a noticeable amount of overhead relative to the lifetime of the executable.
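The trade-off can be illustrated with a small mock that only counts which kernels get built. This is a sketch under stated assumptions: the group names 'cholesky' and 'checks' come from the text above, but the kernel names, kernel_groups, grouped_compiler and per_kernel_compiler are all made up for illustration, and nothing here calls OpenCL.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using group_map = std::map<std::string, std::vector<std::string>>;

// Hypothetical groups; the kernel names inside each group are invented.
inline const group_map& kernel_groups() {
  static const group_map groups = {
      {"checks", {"check_nan", "check_symmetric", "check_diagonal_zeros"}},
      {"cholesky", {"cholesky_block", "cholesky_update"}}};
  return groups;
}

// Old scheme: requesting one kernel builds every kernel in its group.
struct grouped_compiler {
  std::set<std::string> compiled;

  void request(const std::string& name) {
    bool found = false;
    for (const auto& group : kernel_groups()) {
      for (const auto& kernel : group.second) {
        if (kernel == name) {
          compiled.insert(group.second.begin(), group.second.end());
          found = true;
        }
      }
    }
    assert(found && "kernel is not in any group");
  }
};

// New scheme: each kernel is its own program, built only when requested.
struct per_kernel_compiler {
  std::set<std::string> compiled;

  void request(const std::string& name) { compiled.insert(name); }
};
```

In this mock, requesting "check_nan" builds three kernels under the grouped scheme and one under the per-kernel scheme. The per-kernel scheme also needs no grouping table at all, which is the maintenance burden the paragraph above describes.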

Tests

The kernel_cl test checks whether the transpose kernel can be compiled. To run the test:

echo "STAN_OPENCL=true" >> make/local
echo "OPENCL_PLATFORM_ID=0" >> make/local
echo "OPENCL_DEVICE_ID=0" >> make/local
make test/unit/math/gpu/kernel_cl_test && ./test/unit/math/gpu/kernel_cl_test

Side Effects

None

Additional Notes

Leave additional notes for the code reviewer here.

syclik · 2018-08-07T13:53:23Z

Please create a new issue. It helps us manage this stuff; if there's a single issue per pull request, we can use "fixes #___" to automatically link and close issues.

syclik · 2018-08-07T13:57:06Z

whoa. I just read the description and that's great. what's with the naming? why kernel_cl_base?

If I'm understanding correctly, the constructor to kernel_cl looks up the compiled kernel by string?

rok-cesnovar · 2018-08-07T14:12:40Z

We are following the same naming convention we set up in the opencl_context (opencl_context_base is the singleton, opencl_context is the friend class).

kernel_cl checks whether a kernel with the provided name has already been compiled. If so, it sets the member that holds the compiled kernel. If it wasn't compiled yet, it compiles it first.
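The singleton-plus-friend arrangement and the compile-on-first-use lookup described above could look roughly like the following. This is a sketch, not the PR's implementation: kernel_table_base, kernel_handle and compiled_kernel are hypothetical names, the "sources" are placeholder strings, and "compiling" is simulated by inserting into a map.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a compiled cl::Kernel.
struct compiled_kernel {
  std::string name;
};

// Singleton that owns the kernel sources and the compiled kernels.
// Everything is private; only the friend adapter can reach it.
class kernel_table_base {
  friend class kernel_handle;

  std::map<std::string, std::string> sources_;       // name -> source
  std::map<std::string, compiled_kernel> compiled_;  // name -> compiled
  int compile_count_ = 0;

  kernel_table_base() {
    sources_["add"] = "/* placeholder source */";
    sources_["transpose"] = "/* placeholder source */";
  }

  static kernel_table_base& instance() {
    static kernel_table_base table;  // constructed on first use
    return table;
  }

 public:
  kernel_table_base(const kernel_table_base&) = delete;
  kernel_table_base& operator=(const kernel_table_base&) = delete;
};

// Friend adapter: looks a kernel up by name, compiling it if needed.
class kernel_handle {
 public:
  explicit kernel_handle(const std::string& name) : kernel_(nullptr) {
    kernel_table_base& table = kernel_table_base::instance();
    if (table.sources_.find(name) == table.sources_.end()) {
      throw std::domain_error("unknown kernel: " + name);
    }
    auto it = table.compiled_.find(name);
    if (it == table.compiled_.end()) {
      ++table.compile_count_;  // simulated compilation
      it = table.compiled_.emplace(name, compiled_kernel{name}).first;
    }
    kernel_ = &it->second;  // std::map nodes stay valid after inserts
  }

  const compiled_kernel& kernel() const {
    assert(kernel_ != nullptr);
    return *kernel_;
  }

  static int compile_count() {
    return kernel_table_base::instance().compile_count_;
  }

 private:
  const compiled_kernel* kernel_;
};
```

With this sketch, constructing kernel_handle("add") twice triggers one simulated compilation and both handles refer to the same stored kernel, while an unknown name throws std::domain_error.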

One side effect of this PR is also that it organizes how we treat the kernel parameters (like LOWER, UPPER,...) a lot better.

SteveBronder · 2018-08-07T15:42:23Z

Along with what Rok mentioned above, I want to make a PR soon that changes all of the names from *_gpu_* to *_cl_*. I think it looks nicer and makes more sense. I'll file an issue soon describing the naming change in more detail.

seantalts · 2018-08-09T12:44:26Z

Can we also make it so that the cl files are not string literals but would actually be readable by an IDE? :D

Also I'm happy to review GPU stuff - going through a workshop with Rok and Erik now on it and it's really cool. Let me know if I can take this off your plate, @syclik

syclik · 2018-08-09T13:04:11Z

@seantalts, please and thank you! I've been buried trying to get the make stuff to work consistently. (It's working through Math and Stan now! Now onto CmdStan.)

SteveBronder · 2018-08-09T16:05:41Z

> Can we also make it so that the cl files are not string literals but would actually be readable by an IDE?

@seantalts I would absolutely love this but am having a hard time figuring out how to do it. When we tried reading them in with something like an ifstream we would get the instantiation fiasco error. I also wasn't sure how to specify the file locations, since we are a header-only library.

The only other way I've seen to do it is via a stringify macro, as in this Stack Overflow question. When I tried this it also threw a bunch of errors, though maybe it is worth a second look.

If you know a better way to do it then I'm 100% about it!
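For readers unfamiliar with the stringify approach mentioned above, here is a minimal sketch. The macro is a common preprocessor idiom; the variadic form, the add kernel body and the names add_kernel_source and check_source are assumptions made for illustration, not code taken from this PR.

```cpp
#include <cassert>
#include <string>

// Turns its argument tokens into a string literal. The variadic form is
// used so that top-level commas in the source do not split the argument.
#define STRINGIFY(...) #__VA_ARGS__

// Hypothetical kernel written as plain OpenCL C, so an editor can
// highlight it; the preprocessor converts it to a const char*.
static const char* add_kernel_source = STRINGIFY(
    __kernel void add(__global double* C, __global const double* A,
                      __global const double* B, int rows, int cols) {
      int i = get_global_id(0);
      int j = get_global_id(1);
      if (i < rows && j < cols) {
        C[j * rows + i] = A[j * rows + i] + B[j * rows + i];
      }
    });

// Sanity check: stringification drops leading whitespace, so the text
// starts with the first token of the kernel.
inline void check_source() {
  assert(std::string(add_kernel_source).rfind("__kernel", 0) == 0);
}
```

Some known limits of this idiom: the kernel text cannot contain preprocessor directives of its own, parentheses must be balanced, and runs of whitespace (including newlines) collapse to single spaces, so compiler messages for the kernel will not have useful line numbers.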

> Also I'm happy to review GPU stuff

!! V appreciated !!

seantalts · 2018-08-19T09:39:02Z

Also just a note for those following this thread: some discussion has continued on the proposal above here: bstatcomp#7

syclik · 2018-08-20T17:08:01Z

Without digging through notes, one of the things was the use of forward declarations with the implementation somewhere else. I think the only place where we have that in the math library now is where it's absolutely necessary: in the memory allocation area. Consistency was another thing: within the same class, some things were declared and defined in place, while others were declared and then defined elsewhere. That was a bit confusing. If it wasn't detrimental to the code, I thought it was better for it to look like the rest of the codebase unless there was a good reason otherwise. As for adding methods to matrix_gpu, there wasn't an objection to making the class heavier.

seantalts · 2018-08-23T12:20:10Z

Just waiting for tests to pass now.

seantalts · 2018-08-23T13:05:55Z

hmmm: http://d1m1s1b1.stat.columbia.edu:8080/blue/organizations/jenkins/Math%20Pipeline/detail/PR-966/23/pipeline

seantalts · 2018-08-23T13:21:15Z

Okay, I think I fixed that one by adding the GPU #ifdefs.
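As a rough illustration of that kind of guard (a sketch only: the STAN_OPENCL macro name comes from this PR, while transpose_kernel_name and opencl_enabled are hypothetical), everything that depends on OpenCL sits behind the macro, so a build without it sees no OpenCL-dependent code:

```cpp
#include <cassert>
#include <string>

#ifdef STAN_OPENCL
// OpenCL includes and kernel definitions would go here. Without the
// macro, none of this is seen by the compiler, so a header test that
// includes the file on a machine with no OpenCL headers still compiles.
inline std::string transpose_kernel_name() { return "transpose"; }
#endif

// Hypothetical helper that reports which branch was compiled.
inline bool opencl_enabled() {
#ifdef STAN_OPENCL
  return true;
#else
  return false;
#endif
}
```

Compiling with -DSTAN_OPENCL switches the guarded section on; without the flag only opencl_enabled() exists and it returns false.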

rok-cesnovar · 2018-08-23T18:41:14Z

Thanks @seantalts for all the help and effort in helping us improve this! Really appreciated.

seantalts · 2018-08-24T06:29:15Z

The GPU tests no longer run on a PR, and they broke on develop, I think because we all forgot about a GPU test that isn't in the GPU folder:
test/unit/math/prim/mat/fun/opencl_copy_test.cpp
Should we move that into the gpu test folder? Alternatively, the test compiles again when #include <stan/math/gpu/copy.hpp> is added.

rok-cesnovar · 2018-08-24T09:29:46Z

I would move it. Can we do it in #1001?

SteveBronder and others added 14 commits July 30, 2018 00:45

First iteration, does not work

79c365a

Adds kernel_cl across gpu functions. Getting seg_fault on tranpose test. Pretty sure it's due to how I'm making the compiler options for the kernel

2ab6ce5

add test for kernel_cl

621d8a2

added throwing with wrong kernel name, fixed seg fault

1f2aad1

removing unneccesary kernel from set_args

de7b5b0

fix for the bug in sub_block

35d75c6

Added NOLINT to the include statements in kernel_cl for brining in kernels. Going to find better way to bring those in

adfbcd1

Merge branch 'kernel_cl' of https://github.com/bstatcomp/math into kernel_cl

b324f01

Change name of map that holds kernels to kernel_table

2927e93

Merge commit '6d968e60bb633ce3b809225c037bedd3b51fa463' into HEAD

9a614c0

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

15d590d

Include constants header and remove [in] for kernel param

8564d93

Remove kernel groups

d00e347

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

adb1eff

SteveBronder requested a review from syclik August 5, 2018 06:56

SteveBronder added feature code cleanup labels Aug 5, 2018

SteveBronder self-assigned this Aug 5, 2018

SteveBronder changed the title ~~[WIP] Change how users access OpenCL kernels~~ Change how users access OpenCL kernels Aug 5, 2018

SteveBronder added the gpu label Aug 5, 2018

SteveBronder mentioned this pull request Aug 9, 2018

Seperate OpenCL kernel access into it's own class #973

Closed

syclik requested review from seantalts and removed request for syclik August 9, 2018 13:12

seantalts and others added 11 commits August 20, 2018 18:33

snapshot of a work in progress design experiment

0b4f512

Fix bugs

a68f51e

new kernel enqueing in non templated functions

afc73c5

added the rest of the kernels, all gpu tests pass

5508137

changed to global_range, removed semicolons :)

6f8bd4a

...

fba61eb

Changes all the kernel files so that they are placed into const char*'s within each kernel file. Doxygen works but the object holding the kernel code is undocumented. All of the kernel structs are moved into the respective kernel file

aadb3ce

fix lint issues

4189530

Move STRINGIFY to a single location in kernel_cl.hpp

6a2014c

Merge pull request #7 from stan-dev/kcl

3625c8f

[WIP] [WIP] [WIP] a way of kerneling

Merge commit '68b8f7e2effb1a23abe8524ff429a212653b53a3' into HEAD

58943c7

seantalts previously approved these changes Aug 23, 2018

View reviewed changes

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/stable/2017-11-14)

1aa47d8

stan-buildbot dismissed seantalts’s stale review via 1aa47d8 August 23, 2018 12:19

seantalts previously approved these changes Aug 23, 2018

View reviewed changes

Remove extra things from doxygen.cfg that existed for .cl files

e738921

SteveBronder dismissed seantalts’s stale review via e738921 August 23, 2018 12:23

Add #ifdef STAN_OPENCL to kernels to fix header-tests

93ed1a6

seantalts approved these changes Aug 23, 2018

View reviewed changes

rok-cesnovar merged commit 0cdab8e into stan-dev:develop Aug 23, 2018

seantalts mentioned this pull request Aug 24, 2018

Add missing include to fix gpu test. #1001

Merged

1 task

SteveBronder deleted the kernel_cl branch May 22, 2019 03:45
