gpu stanmathcl [WIP] #655

Closed
wants to merge 112 commits into from
112 commits
ba4dd04
initial commit
SteveBronder Oct 4, 2017
2eef1f2
random fixings
SteveBronder Oct 5, 2017
0c6efac
lintr things
SteveBronder Oct 5, 2017
b0f6dc4
Started adding templating for the gpu functions
SteveBronder Oct 5, 2017
ac0a834
Built tests and rev cholesk with new code. Seg Fault on test
SteveBronder Oct 6, 2017
8db822c
moved basic_matrix_gpu funcs to funcs
SteveBronder Oct 6, 2017
f3a300c
simple test for chol segfault
SteveBronder Oct 9, 2017
f96cb58
added GPU tests, failing
SteveBronder Oct 12, 2017
264d0bf
Trying to run tests on tests
SteveBronder Oct 13, 2017
598fdb0
added some things for checking test vals
SteveBronder Oct 13, 2017
96a89ad
Update code to only return lower left triangular for cholesky
SteveBronder Oct 13, 2017
83916c7
gpu_chol now gives back the original matrix when the size is zero. Te…
SteveBronder Oct 13, 2017
3cf7021
tests pass if the second condition of check_pos_definite() is comment…
SteveBronder Oct 14, 2017
af87eb3
tests pass if the second condition of check_pos_definite() is comment…
SteveBronder Oct 14, 2017
f02225f
Initial commit with the following changes:
rok-cesnovar Oct 17, 2017
e91a870
- reading kernels from files
rok-cesnovar Oct 20, 2017
618f145
all tests pass on smaller block sizes for inverse
rok-cesnovar Oct 21, 2017
e851141
- fixed the inverse semifrequent nan error, all test pass
rok-cesnovar Oct 22, 2017
f4cb8eb
- missing kernel changes from the previous commit
rok-cesnovar Oct 22, 2017
cfa79f0
- added nan value checks for the GPU
rok-cesnovar Oct 23, 2017
3385baf
- changed the kernel source reading
rok-cesnovar Oct 24, 2017
66da004
- use of a more safe transpose, minor cleanup
rok-cesnovar Oct 24, 2017
103d92b
code passes most of cpplint
SteveBronder Oct 26, 2017
5c6ea69
passing cpplint
SteveBronder Oct 26, 2017
4590b03
- removed the loops for transposing the matrices before copy to GPU
rok-cesnovar Oct 27, 2017
8bd0b8e
- changed all dimension checks to existent /prim/mat/err checks
rok-cesnovar Oct 27, 2017
b3386c5
- added doxygen comments; only prim and rev cholesky_decompose_gpu.hp…
rok-cesnovar Oct 27, 2017
38b65c3
- added the rest of doxygen comments
rok-cesnovar Oct 27, 2017
64e1ff4
- added the copy_submatrix function
rok-cesnovar Oct 27, 2017
fa56656
- passes cpplint
rok-cesnovar Oct 27, 2017
17434a9
- implemented the blocked version of the rev/cholesky_decompose on th…
rok-cesnovar Oct 29, 2017
c1a8e5e
shortened multiply_with_scalar to multiply. Started building tests
SteveBronder Oct 29, 2017
a67548a
GPU matrix tests all passing
SteveBronder Oct 30, 2017
2578605
Added OpenCL headers
SteveBronder Oct 30, 2017
80c41cf
fix link to OpenCL from hpp to h
SteveBronder Oct 30, 2017
a7fe86a
fix link to OpenCL from hpp to h
SteveBronder Oct 30, 2017
e7e8b3e
Removed old OpenCL, using C++ opencl header file
SteveBronder Oct 30, 2017
6686a4f
Change OpenCL header
SteveBronder Oct 30, 2017
760daaf
opencl headers
SteveBronder Oct 30, 2017
6047f22
placed transpose test in wrong place
SteveBronder Oct 30, 2017
9684668
update check_diagonal_zero test
SteveBronder Oct 30, 2017
5d7cfd6
fix check_gpu_tests
SteveBronder Oct 30, 2017
22f3ded
Forgot : and ; in check_gpu_test
SteveBronder Oct 30, 2017
31d110a
add test for zero matrix, identity matrix, and subtract
SteveBronder Oct 31, 2017
e0fafde
- fixed seg fault when creating matrices with size 0
rok-cesnovar Oct 31, 2017
3052592
- order of parameters in subtract now matches the add/multiply order …
rok-cesnovar Oct 31, 2017
af7fa93
- cpplint errors cleanup
rok-cesnovar Oct 31, 2017
0802f69
- changed add to copy on output, tests changed
rok-cesnovar Nov 3, 2017
7ae8263
- changed subtract to copy on output, test fixed
rok-cesnovar Nov 3, 2017
4c7bb3a
- transpose copies on output
rok-cesnovar Nov 3, 2017
c43a944
- all multiplies except for diagonal copy on output
rok-cesnovar Nov 3, 2017
8757bb5
- changed the use of multiply in rev/cholesky_decompose
rok-cesnovar Nov 3, 2017
c4501b4
- inverse copy on output
rok-cesnovar Nov 3, 2017
a1ff9ce
- fixed /rev/cholesky_decompose_gpu with the copy on output functions
rok-cesnovar Nov 3, 2017
6109dc8
added tests for matrix_gpu, the copy functions, and inverse function.
SteveBronder Nov 5, 2017
13b8410
Fix up tests for add and subtract
SteveBronder Nov 5, 2017
51f688d
update check_nan tests. We have nothing to test infinite values which…
SteveBronder Nov 5, 2017
1751f7e
add AMD OpenCL install to travis.yml so that CPU version of OpenCL is…
SteveBronder Nov 8, 2017
db6f798
- set device filter to CPU to run tests
rok-cesnovar Nov 8, 2017
81f6249
- minor test fixes, added the AMD double pragma to the kernels
rok-cesnovar Nov 8, 2017
df71123
- cpplint cleanup, kernel init test
rok-cesnovar Nov 8, 2017
153fd31
- checking symmetric in /prim/cholesky_decompose
rok-cesnovar Nov 8, 2017
c4e48f2
add POST_LDLIBS when compiling .o file, this avoids warnings in clang…
SteveBronder Nov 9, 2017
c8ca938
Update AMD SDK to 3.0
SteveBronder Nov 10, 2017
f0c4a61
move libopencl.so symbolic link line in .travis.yml
SteveBronder Nov 10, 2017
2133c14
- context passed by reference
rok-cesnovar Nov 14, 2017
f35359b
move libopencl.so symbolic link line in .travis.yml (reverted from co…
rok-cesnovar Nov 14, 2017
0ad3dc7
Update AMD SDK to 3.0 (reverted from commit e1ed076099ece4ac529e61aff…
rok-cesnovar Nov 14, 2017
bac8c78
- default work-group size is 16x16 to support all CPUs
rok-cesnovar Nov 15, 2017
a93f5bb
- block in cholesky step to 32
rok-cesnovar Nov 15, 2017
d1d9d0d
- added function name to check_opencl for easier debugging
rok-cesnovar Nov 15, 2017
80fd359
- testing the Travis OpenCL device
rok-cesnovar Nov 16, 2017
53a5ca6
- smaller multiply size for CPU OpenCL
rok-cesnovar Nov 16, 2017
8cb7c76
- missing kernel size
rok-cesnovar Nov 16, 2017
f1afe40
- added missing checks in copy_submatrix
rok-cesnovar Nov 16, 2017
96ac7df
- matrix multiply with matrix of size() 0 returns immediately
rok-cesnovar Nov 16, 2017
ee2c22c
- added multiply with self transposed
rok-cesnovar Nov 17, 2017
b39b9b1
- removed test for removed kernels
rok-cesnovar Nov 18, 2017
c4a5f78
- cholesky now reuses the inverse and zeros kernels
rok-cesnovar Nov 19, 2017
c54ba4a
- two level cholesky decompose
rok-cesnovar Nov 19, 2017
a7f0436
- added check_symmetric_gpu, tests
rok-cesnovar Nov 20, 2017
4a91a88
removed mac specific compilations for OpenCL
SteveBronder Nov 21, 2017
599d11b
fix cpplint errors
SteveBronder Nov 21, 2017
103a6f7
- updated the GPU kernels for readability
rok-cesnovar Nov 21, 2017
801a065
- cleanup of tests due to removed kernels
rok-cesnovar Nov 21, 2017
51861d4
- switched all kernels to column-major
rok-cesnovar Nov 21, 2017
1227672
- optimizations on the matrix multiplies
rok-cesnovar Nov 22, 2017
754ed8d
- cleanup of check_opencl, does not throw on 0
rok-cesnovar Nov 26, 2017
1f9fbe6
- more changes for matrices with const rows and cols
rok-cesnovar Nov 29, 2017
8aca5b1
- appropriate scopes for openCL try/catch for all added functions
rok-cesnovar Nov 29, 2017
70deee5
- removed the use of enqueueCopyBuffer
rok-cesnovar Nov 29, 2017
ab33aef
Add docs for pos_def_check in check_diagonal_zeros and check_nan for …
SteveBronder Dec 5, 2017
baf39a1
Attempt to merge / rebase fork with origin/stanMathcl
SteveBronder Dec 5, 2017
70ce68b
change device type to cpu
SteveBronder Dec 6, 2017
242b823
fix lint warnings in test and header files
SteveBronder Dec 6, 2017
77d331c
add -framework OpenCL and a check for mac in the default_compiler_opt…
SteveBronder Dec 6, 2017
9010fb6
Change device type to CPU
SteveBronder Dec 6, 2017
0717822
added define for OpenCL exceptions in check_opencl. In a previous com…
SteveBronder Dec 7, 2017
1cd8ab6
Add POST_LDLIBS to multiple_translation_units test
SteveBronder Dec 7, 2017
cde6ff7
inlined functions to avoid multiple translation unit error
SteveBronder Dec 9, 2017
58e380b
...
SteveBronder Dec 9, 2017
2c43e0e
add static to global kernel maps and inlined copy function
SteveBronder Dec 9, 2017
26f79dc
include <exception> for check_opencl
SteveBronder Dec 9, 2017
2b1a6a3
add scale file for domain error in check_opencl
SteveBronder Dec 9, 2017
e686c52
add char* to check_ocl_error
SteveBronder Dec 9, 2017
895ce30
Use a string to throw a domain error for OpenCL failures
SteveBronder Dec 11, 2017
8077b41
...
SteveBronder Dec 11, 2017
54d9c19
Add constant tolerance header file to check_gpu.hpp
SteveBronder Dec 11, 2017
eb174f2
add basic_matrix_gpu to headers for multipy_gpu
SteveBronder Dec 11, 2017
fa0376b
update AMD SDK download script
SteveBronder Dec 28, 2017
64e0317
Max workgroup size is considered for determining workgroup sizes
rok-cesnovar Jan 6, 2018
b2be485
- removed residual debug info
rok-cesnovar Jan 6, 2018
update AMD SDK download script
SteveBronder committed Dec 28, 2017
commit fa0376b2df6e0d375a7115eaf724d0e5f6254ee7
14 changes: 6 additions & 8 deletions .travis/amd_sdk.sh
@@ -3,17 +3,16 @@
 # Original script from https://github.com/gregvw/amd_sdk/

 # Location from which get nonce and file name from
-URL="http://developer.amd.com/amd-accelerated-parallel-processing-app-sdk/"
-URLDOWN="http://developer.amd.com/amd-license-agreement-appsdk/"
+URL="https://developer.amd.com/amd-accelerated-parallel-processing-app-sdk/"
+URLDOWN="https://developer.amd.com/amd-license-agreement-appsdk/"

 NONCE1_STRING='name="amd_developer_central_downloads_page_nonce"'
 FILE_STRING='name="f"'
 POSTID_STRING='name="post_id"'
 NONCE2_STRING='name="amd_developer_central_nonce"'

-# This gets the second latest (2.9.1 ATM, latest is 3.0)
-# For newest: FORM=`wget -qO - $URL | sed -n '/download-2/,/64-bit/p'`
-FORM=`wget -qO - $URL | sed -n '/download-2/,/64-bit/p'`
+#For newest FORM=`wget -qO - $URL | sed -n '/download-2/,/64-bit/p'`
+FORM=`wget --no-check-certificate -qO - $URL | sed -n '/download-5/,/64-bit/p'`

 # Get nonce from form
 NONCE1=`echo $FORM | awk -F ${NONCE1_STRING} '{print $2}'`
@@ -30,11 +29,10 @@ FILE=`echo $FORM | awk -F ${FILE_STRING} '{print $2}'`
 FILE=`echo $FILE | awk -F'"' '{print $2}'`
 echo $FILE

-FORM=`wget -qO - $URLDOWN --post-data "amd_developer_central_downloads_page_nonce=${NONCE1}&f=${FILE}&post_id=${POSTID}"`
+FORM=`wget --no-check-certificate -qO - $URLDOWN --post-data "amd_developer_central_downloads_page_nonce=${NONCE1}&f=${FILE}&post_id=${POSTID}"`

 NONCE2=`echo $FORM | awk -F ${NONCE2_STRING} '{print $2}'`
 NONCE2=`echo $NONCE2 | awk -F'"' '{print $2}'`
 echo $NONCE2

-wget --content-disposition --trust-server-names $URLDOWN --post-data "amd_developer_central_nonce=${NONCE2}&f=${FILE}" -O AMD-SDK.tar.bz2;
-
+wget --no-check-certificate --content-disposition --trust-server-names $URLDOWN --post-data "amd_developer_central_nonce=${NONCE2}&f=${FILE}" -O AMD-SDK.tar.bz2;
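
For readers following the script, the scraping technique it relies on (a sed address range to isolate one download form, then two awk splits to pull out an attribute value) can be tried offline. The sketch below is illustrative only: the PAGE markup and the values abc123 and sdk.tar.bz2 are made up, since the real AMD page is not shown in this diff, and the command substitutions are chained in a pipe rather than written as two assignments.

```shell
#!/bin/sh
# Hypothetical page fragment (assumption): stands in for the output of
# `wget -qO - $URL`. The real markup may differ.
PAGE='<div id="download-2">
<input type="hidden" name="f" value="old.tar.bz2" />
64-bit
</div>
<div id="download-5">
<input type="hidden" name="amd_developer_central_downloads_page_nonce" value="abc123" />
<input type="hidden" name="f" value="sdk.tar.bz2" />
64-bit
</div>'

NONCE1_STRING='name="amd_developer_central_downloads_page_nonce"'
FILE_STRING='name="f"'

# Print only the lines from the first match of "download-5" through the
# next line that matches "64-bit".
FORM=$(printf '%s\n' "$PAGE" | sed -n '/download-5/,/64-bit/p')

# $FORM is left unquoted on purpose, as in the script: word splitting
# collapses the form onto one line so awk sees a single record.
# First awk: split on the attribute string, keep what follows it.
# Second awk: split on double quotes, field 2 is the first quoted value.
NONCE1=$(echo $FORM | awk -F "${NONCE1_STRING}" '{print $2}' | awk -F'"' '{print $2}')
FILE=$(echo $FORM | awk -F "${FILE_STRING}" '{print $2}' | awk -F'"' '{print $2}')

echo "$NONCE1"   # prints abc123
echo "$FILE"     # prints sdk.tar.bz2
```

Because the selection depends on the literal strings download-5 and 64-bit appearing in the page, any change to the page layout would silently yield empty values, so checking that NONCE1 and FILE are non-empty before the POST requests may be worth considering.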