-
-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gpu setup [second stage] #720
Conversation
I'm seeing that this lead to a failure on jenkins
http://d1m1s1b1.stat.columbia.edu:8080/job/Math%20Pipeline/job/PR-720/6/consoleText I don't touch the chi_square_cdf_etc test tho? It also compiles and runs fine on my local I'm also seeing
in the console output Do I have to set up OpenCL on Jenkins like in Travis? |
26b7cbb
to
8df83d0
Compare
I'm not sure what error you're seeing out of Jenkins. I followed the link and clicked on logs. It looks like a linker error due to this pull request?
|
I think this is because of
(here) which I think I can make static and clear up those error. A similar error was causing failures in EDIT: No actually it's that the multiple translation units test is not getting the |
Ok. That makes sense. Looks like you got it going, at least for Travis. I'm not sure what's happening with Jenkins yet. |
a4f32f6
to
65d755f
Compare
This is large because the GPU kernels are also included in this commit. What will the review process be for the kernels? Should I include them in a separate PR? |
3cfb5d6
to
b24d239
Compare
make/default_compiler_options
Outdated
@@ -12,8 +12,18 @@ AR = ar | |||
# CPPFLAGS are used for both C and C++ compilation | |||
CPPFLAGS = -DNO_FPRINTF_OUTPUT -pipe | |||
# CXXFLAGS are just used for C++ | |||
CXXFLAGS = -Wall -I . -isystem $(EIGEN) -isystem $(BOOST) -isystem $(CVODES)/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized | |||
GTEST_CXXFLAGS = -DGTEST_USE_OWN_TR1_TUPLE | |||
CXXFLAGS = -Wall -I . -isystem $(EIGEN) -isystem $(BOOST) -isystem $(CVODES)/include -isystem $(OPENCL) -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -DSTAN_GPU -DSTAN_DEVICE_CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@seantalts et al.
For removing sudo
in the jenkins file, can I add something for CXXFLAGS
like
ifeq (${TRAVIS},true)
CXXFLAGS+=-L${HOME}/AMDAPPSDK/lib/x86_64/
endif
make/default_compiler_options
Outdated
ifeq ($(shell echo $(TRAVIS_OS_NAME)),linux) | ||
POST_LDLIBS+=-L/home/travis/AMDAPPSDK/lib/x86_64/ | ||
endif | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this which is working on travis without sudo
make/default_compiler_options
Outdated
else | ||
POST_LDLIBS+=-lOpenCL | ||
endif | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relevant side note: I have no idea how to compile this for windows. Does Jenkins test on Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jenkins does test on Windows.
stan/math/prim/mat/fun/ocl_gpu.hpp
Outdated
@@ -0,0 +1,296 @@ | |||
#ifndef STAN_MATH_PRIM_MAT_FUN_OCL_HPP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was not really sure where to put this file. It's purpose is to set up the gpu backend. Should this be in some more internal folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let me think about it. We have the autodiff stack sitting in stan/math/memory/stack_alloc.hpp
.
stan/math/prim/mat/fun/ocl_gpu.hpp
Outdated
; // NOLINT | ||
kernel_strings["basic_matrix"] = | ||
#include <stan/math/prim/mat/kern_gpu/basic_matrix.cl> // NOLINT | ||
; // NOLINT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These #includes
are here because when I was trying to read the kernels in with ifstream
I was not guaranteed that they would be loaded by the time this map was called to compile the kernel. (Some sort of call before initialization error?)
If there is a prettier / better way to do this I am for it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you point me to a reference to the kernels? They're just strings that are stored?
} | ||
|
||
|
||
)=====" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all of these kernels makes the review too bulky I can remove them or just take a subset
from here
Que? Is there someway to restart a part of the overall testing like the jobs in travis? EDIT: Jenkins is not throwing this error anymore |
983887e
to
8e3a36b
Compare
make/default_compiler_options
Outdated
@@ -12,8 +12,24 @@ AR = ar | |||
# CPPFLAGS are used for both C and C++ compilation | |||
CPPFLAGS = -DNO_FPRINTF_OUTPUT -pipe | |||
# CXXFLAGS are just used for C++ | |||
CXXFLAGS = -Wall -I . -isystem $(EIGEN) -isystem $(BOOST) -isystem $(CVODES)/include -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized | |||
CXXFLAGS = -Wall -I . -isystem $(EIGEN) -isystem $(BOOST) -isystem $(CVODES)/include -isystem $(OPENCL) -std=c++1y -DBOOST_RESULT_OF_USE_TR1 -DBOOST_NO_DECLTYPE -DBOOST_DISABLE_ASSERTS -DBOOST_PHOENIX_NO_VARIADIC_EXPRESSION -Wno-unused-function -Wno-uninitialized -DSTAN_GPU -DSTAN_DEVICE_CPU |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need to change this. This forces STAN_GPU
to be defined for all possible uses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's STAN_DEVICE_CPU
? That looks new?
make/default_compiler_options
Outdated
else | ||
POST_LDLIBS+=-lOpenCL | ||
endif | ||
endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jenkins does test on Windows.
make/tests
Outdated
@@ -13,7 +13,7 @@ $(GTEST)/src/gtest-all.o: CXXFLAGS += $(GTEST_CXXFLAGS) | |||
|
|||
test/%$(EXE) : CXXFLAGS += $(GTEST_CXXFLAGS) | |||
test/%$(EXE) : test/%.o $(GTEST_MAIN) $(GTEST)/src/gtest-all.o | |||
$(LINK.cpp) -o $@ $^ | |||
$(LINK.cpp) -o $@ $^ $(POST_LDLIBS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll fix this.
What's with the two macros now?
Are they doc'ed anywhere? |
@SteveBronder, we're going to update the make portion of this. So you know what's coming up, what we're going to do is define a make variable. When that's set, we will add the compiler flags to |
@SteveBronder, I just updated the PR. @seantalts, I could use some guidance on how to do the travis thing right. Do we want to test everything with and without GPU? |
This tells the compiler to compile the OpenCL code for the CPU instead of the GPU so we can run on travis. |
I'm cool with this |
Sorry, can you take a step back and describe the two macros? I don't get it. If there are two defined macros, that's 4 configurations. Under what conditions are the four different conditions used? |
@syclik I think we should do GPU tests on Jenkins only. I doubt travis even supports GPUs, but we wouldn't want to rely on it there anyway. |
@seantalts, that sounds good. We'll test both CPU and GPU on Jenkins. |
stan/math/gpu/opencl_context.hpp
Outdated
device.getInfo<CL_DEVICE_AVAILABLE>() << "\n"; | ||
} | ||
} catch (const cl::Error& e) { | ||
msg << "\tno GPU devices in the specified platform" << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Instead of having a message here could we make a call to check_opencl_error()
? I think it would throw a -1001: "CL_PLATFORM_NOT_FOUND_KHR
or if not we can figure out what error code it returns and setup an error code for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we throw at all since having additional platforms without GPUs is not a problem and could happen?
We are just outputing the information here.
This is only a problem if the user selecta the platform that has no GPU devices. But that is handled elsewhere and does throw an exception, as it should :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we can throw a more informative error message then "C++ exception with description "clGetPlatformIDs" than it's a good idea to catch and handle it
stan/math/gpu/opencl_context.hpp
Outdated
if (OPENCL_PLATFORM_ID >= platforms_.size()) { | ||
system_error("OpenCL Initialization", "[Platform]", -1, | ||
"CL_INVALID_PLATFORM"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user specified an ID that did not exist, the test produced a seg fault. We now output the following:
[----------] Global test environment set-up.
[----------] 3 tests from MathGpu
[ RUN ] MathGpu.getInfo
unknown file: Failure
C++ exception with description "OpenCL Initialization: [Platform] CL_INVALID_PLATFORM: Unknown error -1" thrown in the test body.
[ FAILED ] MathGpu.getInfo (29 ms)
stan/math/gpu/opencl_context.hpp
Outdated
if (OPENCL_DEVICE_ID >= devices_.size()) { | ||
system_error("OpenCL Initialization", "[Device]", -1, | ||
"CL_INVALID_DEVICE"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user specified an ID that did not exist, the test produced a seg fault. We now output the following:
[ RUN ] MathGpu.getInfo
unknown file: Failure
C++ exception with description "opencl_context: clGetDeviceIDs CL_DEVICE_NOT_FOUND: Unknown error -1" thrown in the test body.
[ FAILED ] MathGpu.getInfo (31 ms)
stan/math/gpu/opencl_context.hpp
Outdated
msg << "\tno GPU devices in the specified platform" | ||
<< "\n"; | ||
msg << "\tno GPU devices in the platform with ID " | ||
<< platform_id << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an example output listing all the platforms on a system. The first 2 have no GPU devices and the last one has 1 GPU device.
Number of Platforms: 3
Platform ID: 0
Platform Name: Experimental OpenCL 2.1 CPU Only Platform
Platform Vendor: Intel(R) Corporation
no GPU devices in the platform with ID 1
Platform ID: 1
Platform Name: Intel(R) OpenCL
Platform Vendor: Intel(R) Corporation
no GPU devices in the platform with ID 2
Platform ID: 2
Platform Name: NVIDIA CUDA
Platform Vendor: NVIDIA Corporation
Device 0:
Device Name: Graphics Device
Device Type: 4
Device Vendor: NVIDIA Corporation
Device Max Compute Units: 30
Device Global Memory: 12780699648
Device Max Clock Frequency: 1582
Device Max Allocateable Memory: 3195174912
Device Local Memory: 49152
Device Available: 1
I made a few changes on how we handle if a user specifies a platform ID that does not exist. I made review comments for the changes. I also tested what happens if we try this on a system that has no OpenCL platforms but has the OpenCL library. It produces the following output
|
On a side note, I played around with running these tests on Windows with an NVIDIA GPU. Getting to compile with OpenCL ( I used a toolchain based on gcc and mingw-w64 via RTools) was surprisingly smooth. There is however again the problem with the
Is there any less dirty way of doing this? |
@seantalts can you help Ben setup the GPU stuff on the linux box? I can also stop by Friday afternoon to help get things set up if you are available Ben |
Tests passing on my Debian Stable laptop if I use g++ (not sure what combination of things actually were necessary to fix it), but it still crashes if I use clang++. And the server still crashes if I use clang++ but has a linker error if I use g++. |
For the clang++ error did you try removing the gtests.so file? I have to do
that when I switch between compilers. I assumed that was a Stan thing but
if it's a GPU thing then I can make a PR `make clean-all` clean the
gtest.so out as well.
Is the linker error the same one you received before?
…On Tue, Apr 24, 2018, 11:53 PM bgoodri ***@***.***> wrote:
Tests passing on my Debian Stable laptop if I use g++ (not sure what
combination of things actually were necessary to fix it), but it still
crashes if I use clang++. And the server still crashes if I use clang++ but
has a linker error if I use g++.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#720 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AFlfz2APlaQK5EW8hwLz5vJ5C3d5GrCKks5tr_NKgaJpZM4RdE70>
.
|
stan/math/gpu/opencl_context.hpp
Outdated
@@ -194,7 +194,7 @@ class opencl_context { | |||
// and mark them as compiled. | |||
for (auto kern : kernel_info()) { | |||
if (strcmp(kern.second.group, kernel_group) == 0) { | |||
kernels()[(kern.first)] = cl::Kernel(program_, kern.first, &err); | |||
opencl_context_base::getInstance().kernels[(kern.first)] = cl::Kernel(program_, kern.first, &err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to change this, as I was getting the "CL_INVALID_KERNEL" when using the kernel (retrieved with get_kernel()). Retrieving the name returned an empty string. I added a test to check this.
this is necessary to get g++ to work
With Ben's PR above I think this is all good to go? |
Yeah, looks great! Thanks @SteveBronder, @rok-cesnovar, and @bgoodri for working out the practical issues of GPU! |
Submission Checklist
./runTests.py test/unit
make cpplint
Summary:
OPENCL_DEVICE_ID
andOPENCL_PLATFORM_ID
.The OpenCL context is built in the Meyer's singleton style with a friend class to act as the API for accessing the singleton. So users will access the global instance
opencl_context
which will then call the members ofopencl_context_base::get_instance()
. See the code here for an example. I'm not sure if there is a name for this style, but it is based on the magic card Mimic Vat so maybe that is an appropriate name.Rok and I thought it would be better to break the GPU PR into several smaller PRs that will flow by:
Intended Effect:
Allows for compilation of GPU kernels
How to Verify:
To test the context:
Side Effects:
None
Documentation:
https://github.com/stan-dev/math/wiki/OpenCL-GPU-Routines
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
OpenCL headers are copyrighted by The Khronos Group Inc.
Rok Češnovar and Erik Štrumbelj (Faculty of Computer and Information Science, University of Ljubljana)
Steve Bronder