Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change how users access OpenCL kernels #966

Merged
merged 47 commits into from
Aug 23, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
79c365a
First iteration, does not work
SteveBronder Jul 30, 2018
2ab6ce5
Adds kernel_cl across gpu functions. Getting seg_fault on tranpose te…
SteveBronder Aug 1, 2018
621d8a2
add test for kernel_cl
SteveBronder Aug 1, 2018
1f2aad1
added throwing with wrong kernel name, fixed seg fault
rok-cesnovar Aug 1, 2018
de7b5b0
removing unneccesary kernel from set_args
rok-cesnovar Aug 1, 2018
35d75c6
fix for the bug in sub_block
rok-cesnovar Aug 1, 2018
adfbcd1
Added NOLINT to the include statements in kernel_cl for brining in ke…
SteveBronder Aug 1, 2018
b324f01
Merge branch 'kernel_cl' of https://github.com/bstatcomp/math into ke…
SteveBronder Aug 1, 2018
2927e93
Change name of map that holds kernels to kernel_table
SteveBronder Aug 3, 2018
9a614c0
Merge commit '6d968e60bb633ce3b809225c037bedd3b51fa463' into HEAD
yashikno Aug 4, 2018
15d590d
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 4, 2018
8564d93
Include constants header and remove [in] for kernel param
SteveBronder Aug 4, 2018
d00e347
Remove kernel groups
SteveBronder Aug 5, 2018
adb1eff
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 5, 2018
1e5f13e
testing stringify
rok-cesnovar Aug 9, 2018
e5de811
macros test
rok-cesnovar Aug 10, 2018
931e62b
Merge commit '88efab8ef687c00170fd970d252a07274dee8500' into HEAD
yashikno Aug 10, 2018
9f6a9f9
[Jenkins] auto-formatting by clang-format version 5.0.2-svn328729-1~e…
stan-buildbot Aug 10, 2018
229bb19
Renamed kernels to remove the _kernel part of the name.
SteveBronder Aug 12, 2018
f680bef
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 12, 2018
f809a5b
typo
SteveBronder Aug 12, 2018
f634ed1
typo
SteveBronder Aug 12, 2018
9050d5c
Add line to each kernel doc directing users to helper macros
SteveBronder Aug 12, 2018
5fb4289
now using scoped enums
rok-cesnovar Aug 15, 2018
e8eec3b
Merge commit '6c83362fa6486e00ad395a40557d6807facd326c' into HEAD
yashikno Aug 15, 2018
7471093
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 15, 2018
3bacc32
moved compiling to the kernel_cl_base constructor, compiles everythin…
rok-cesnovar Aug 15, 2018
3424025
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 15, 2018
d20ac48
Make all kernels compiled at the first construction of the kernel_cl_…
SteveBronder Aug 16, 2018
62d339b
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 16, 2018
d1c4ec3
Makes methods in kernel_cl so that developers can access the kernel
SteveBronder Aug 17, 2018
5de2818
merge to remote
SteveBronder Aug 17, 2018
f8db5fd
Merge remote-tracking branch 'upstream/develop' into kernel_cl
SteveBronder Aug 18, 2018
0b4f512
snapshot of a work in progress design experiment
seantalts Aug 18, 2018
a68f51e
Fix bugs
seantalts Aug 22, 2018
afc73c5
new kernel enqueing in non templated functions
rok-cesnovar Aug 22, 2018
5508137
added the rest of the kernels, all gpu tests pass
rok-cesnovar Aug 22, 2018
6f8bd4a
changed to global_range, removed semicolons :)
rok-cesnovar Aug 22, 2018
fba61eb
...
rok-cesnovar Aug 22, 2018
aadb3ce
Changes all the kernel files so that they are placed into const char*…
SteveBronder Aug 23, 2018
4189530
fix lint issues
SteveBronder Aug 23, 2018
6a2014c
Move STRINGIFY to a single location in kernel_cl.hpp
seantalts Aug 23, 2018
3625c8f
Merge pull request #7 from stan-dev/kcl
SteveBronder Aug 23, 2018
58943c7
Merge commit '68b8f7e2effb1a23abe8524ff429a212653b53a3' into HEAD
yashikno Aug 23, 2018
1aa47d8
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 23, 2018
e738921
Remove extra things from doxygen.cfg that existed for .cl files
SteveBronder Aug 23, 2018
93ed1a6
Add #ifdef STAN_OPENCL to kernels to fix header-tests
seantalts Aug 23, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Changes all the kernel files so that they are placed into const char*…
…'s within each kernel file. Doxygen works but the object holding the kernel code is undocumented. All of the kernel structs are moved into the respective kernel file
  • Loading branch information
SteveBronder committed Aug 23, 2018
commit aadb3ce1a1b69313e7ce0840c806b48ae95369e3
7 changes: 4 additions & 3 deletions doxygen/doxygen.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -281,7 +281,7 @@ OPTIMIZE_OUTPUT_VHDL = NO
# Note that for custom extensions you also need to set FILE_PATTERNS otherwise
# the files are not read by doxygen.

EXTENSION_MAPPING =
EXTENSION_MAPPING = cl=C

# If the MARKDOWN_SUPPORT tag is enabled then doxygen pre-processes all comments
# according to the Markdown format, which allows for more readable
Expand Down Expand Up @@ -782,7 +782,8 @@ INPUT_ENCODING = UTF-8

FILE_PATTERNS = *.hpp \
*.cpp \
*.dox
*.dox \
*.cl

# The RECURSIVE tag can be used to specify whether or not subdirectories should
# be searched for input files as well.
Expand Down Expand Up @@ -824,7 +825,7 @@ EXCLUDE_PATTERNS =
# Note that the wildcards are matched against the file with absolute path, so to
# exclude all test directories use the pattern */test/*

EXCLUDE_SYMBOLS =
EXCLUDE_SYMBOLS = STRINGIFY

# The EXAMPLE_PATH tag can be used to specify one or more files or directories
# that contain example code fragments that are included (see the \include
Expand Down
1 change: 1 addition & 0 deletions stan/math/gpu/add.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define STAN_MATH_GPU_ADD_HPP
#ifdef STAN_OPENCL
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/kernels/add.hpp>
#include <stan/math/gpu/err/check_matching_dims.hpp>
#include <CL/cl.hpp>

Expand Down
6 changes: 3 additions & 3 deletions stan/math/gpu/copy.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,7 @@
#include <stan/math/gpu/opencl_context.hpp>
#include <stan/math/gpu/kernel_cl.hpp>
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/kernels/copy.hpp>
#include <stan/math/prim/mat/fun/Eigen.hpp>
#include <stan/math/prim/scal/err/check_size_match.hpp>
#include <CL/cl.hpp>
Expand Down Expand Up @@ -111,9 +112,8 @@ inline void copy(matrix_gpu& dst, const matrix_gpu& src) {
* see the matrix_gpu(matrix_gpu&) constructor
* for explanation
*/
opencl_kernels::copy(cl::NDRange(dst.rows(), dst.cols()),
src.buffer(), dst.buffer(),
dst.rows(), dst.cols());
opencl_kernels::copy(cl::NDRange(dst.rows(), dst.cols()), src.buffer(),
dst.buffer(), dst.rows(), dst.cols());
} catch (const cl::Error& e) {
std::cout << e.err() << std::endl;
check_opencl_error("copy GPU->GPU", e);
Expand Down
1 change: 1 addition & 0 deletions stan/math/gpu/copy_triangular.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
#include <stan/math/gpu/constants.hpp>
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/copy.hpp>
#include <stan/math/gpu/kernels/copy_triangular.hpp>
#include <CL/cl.hpp>

namespace stan {
Expand Down
1 change: 1 addition & 0 deletions stan/math/gpu/err/check_diagonal_zeros.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define STAN_MATH_GPU_ERR_CHECK_DIAGONAL_ZEROS_HPP
#ifdef STAN_OPENCL
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/kernels/check_diagonal_zeros.hpp>
#include <stan/math/prim/scal/err/domain_error.hpp>

namespace stan {
Expand Down
1 change: 1 addition & 0 deletions stan/math/gpu/err/check_nan.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define STAN_MATH_GPU_ERR_CHECK_NAN_HPP
#ifdef STAN_OPENCL
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/kernels/check_nan.hpp>
#include <stan/math/prim/scal/err/domain_error.hpp>

namespace stan {
Expand Down
2 changes: 2 additions & 0 deletions stan/math/gpu/err/check_symmetric.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
#ifdef STAN_OPENCL
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/prim/scal/err/domain_error.hpp>
#include <stan/math/gpu/kernels/check_symmetric.hpp>


namespace stan {
namespace math {
Expand Down
5 changes: 3 additions & 2 deletions stan/math/gpu/identity.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#define STAN_MATH_GPU_IDENTITY_HPP
#ifdef STAN_OPENCL
#include <stan/math/gpu/matrix_gpu.hpp>
#include <stan/math/gpu/kernels/identity.hpp>
#include <CL/cl.hpp>

namespace stan {
Expand All @@ -23,8 +24,8 @@ inline matrix_gpu identity(int rows_cols) {
cl::CommandQueue cmdQueue = opencl_context.queue();

try {
opencl_kernels::identity(cl::NDRange(A.rows(), A.cols()),
A.buffer(), A.rows(), A.cols());
opencl_kernels::identity(cl::NDRange(A.rows(), A.cols()), A.buffer(),
A.rows(), A.cols());
} catch (const cl::Error& e) {
check_opencl_error("identity", e);
}
Expand Down
57 changes: 3 additions & 54 deletions stan/math/gpu/kernel_cl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
#ifdef STAN_OPENCL
#include <stan/math/gpu/opencl_context.hpp>
#include <stan/math/gpu/constants.hpp>
#include <stan/math/gpu/kernels/helpers.hpp>
#include <CL/cl.hpp>
#include <string>
#include <algorithm>
Expand All @@ -13,9 +14,6 @@ namespace stan {
namespace math {
namespace {

std::string helpers = // Helper macros for the kernels.
#include <stan/math/gpu/kernels/helpers.cl> // NOLINT
; // NOLINT
// Holds Default parameter values for each Kernel.
typedef std::map<const char*, int> map_base_opts;
map_base_opts base_opts
Expand All @@ -31,7 +29,7 @@ auto compile_kernel(const char* name, const char* source) {
kernel_opts += std::string(" -D") + comp_opts.first + "="
+ std::to_string(comp_opts.second);
}
std::string kernel_source(helpers);
std::string kernel_source(opencl_kernels::helpers);
kernel_source.append(source);
try {
cl::Program::Sources src(1, std::make_pair(kernel_source.c_str(),
Expand All @@ -58,9 +56,7 @@ class kernel_functor {
kernel_functor(const char* name, const char* source)
: kernel_(compile_kernel(name, source)) {}

auto operator()() const {
return cl::make_kernel<Args...>(kernel_);
}
auto operator()() const { return cl::make_kernel<Args...>(kernel_); }
};

template <typename... Args>
Expand All @@ -75,53 +71,6 @@ struct global_range_kernel {
}
};

const global_range_kernel<cl::Buffer, int, int> identity("identity",
#include <stan/math/gpu/kernels/identity_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int> copy("copy",
#include <stan/math/gpu/kernels/copy_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int> transpose(
"transpose",
#include <stan/math/gpu/kernels/transpose_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, cl::Buffer, int, int> add(
"add",
#include <stan/math/gpu/kernels/add_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, cl::Buffer, int, int>
subtract("subtract",
#include <stan/math/gpu/kernels/subtract_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int, int, int, int, int,
int, int, int, int>
sub_block("sub_block",
#include <stan/math/gpu/kernels/sub_block.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int>
check_diagonal_zeros("is_zero_on_diagonal",
#include <stan/math/gpu/kernels/check_diagonal_zeros.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int> check_nan("is_nan",
#include <stan/math/gpu/kernels/check_nan.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int, const double>
check_symmetric("is_symmetric",
#include <stan/math/gpu/kernels/check_symmetric.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, cl::Buffer, int, int, TriangularViewGPU>
copy_triangular("copy_triangular",
#include <stan/math/gpu/kernels/copy_triangular_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, int, int, TriangularViewGPU> zeros(
"zeros",
#include <stan/math/gpu/kernels/zeros_matrix.cl> // NOLINT
); // NOLINT
const global_range_kernel<cl::Buffer, int, int, TriangularMapGPU>
triangular_transpose("triangular_transpose",
#include <stan/math/gpu/kernels/triangular_transpose.cl> // NOLINT
); // NOLINT

} // namespace opencl_kernels
} // namespace math
} // namespace stan
Expand Down
51 changes: 51 additions & 0 deletions stan/math/gpu/kernels/add.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
#ifndef STAN_MATH_GPU_KERNELS_ADD_HPP
#define STAN_MATH_GPU_KERNELS_ADD_HPP

#ifndef STRINGIFY
#define STRINGIFY(src) #src
#endif

#include <stan/math/gpu/kernel_cl.hpp>

namespace stan {
namespace math {
namespace opencl_kernels {
// \cond
const char *add_kernel_code = STRINGIFY(
// \endcond
/**
* Matrix addition on the GPU
*
* @param[out] C Output matrix.
* @param[in] A LHS of matrix addition.
* @param[in] B RHS of matrix addition.
* @param rows Number of rows for matrix A.
* @param cols Number of cols for matrix A.
* @note Code is a <code>const char*</code> held in
* <code>add_kernel_code.</code>
* This kernel uses the helper macros available in helpers.cl.
*/
__kernel void add(__global write_only double *C,
__global read_only double *A,
__global read_only double *B, read_only unsigned int rows,
read_only unsigned int cols) {
int i = get_global_id(0);
int j = get_global_id(1);
if (i < rows && j < cols) {
C(i, j) = A(i, j) + B(i, j);
}
}
// \cond
);
// \endcond

/**
* See the docs for \link kernels/add.hpp add() \endlink
*/
const global_range_kernel<cl::Buffer, cl::Buffer, cl::Buffer, int, int> add(
"add", add_kernel_code);

} // namespace opencl_kernels
} // namespace math
} // namespace stan
#endif
24 changes: 0 additions & 24 deletions stan/math/gpu/kernels/add_matrix.cl

This file was deleted.

25 changes: 0 additions & 25 deletions stan/math/gpu/kernels/check_diagonal_zeros.cl

This file was deleted.

53 changes: 53 additions & 0 deletions stan/math/gpu/kernels/check_diagonal_zeros.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
#ifndef STAN_MATH_GPU_KERNELS_CHECK_DIAGONAL_ZEROS_HPP
#define STAN_MATH_GPU_KERNELS_CHECK_DIAGONAL_ZEROS_HPP

#ifndef STRINGIFY
#define STRINGIFY(src) #src
#endif

#include <stan/math/gpu/kernel_cl.hpp>

namespace stan {
namespace math {
namespace opencl_kernels {
// \cond
const char *is_zero_on_diagonal_kernel_code = STRINGIFY(
// \endcond
/**
* Check if the <code>matrix_gpu</code> has zeros on the diagonal
*
* @param[in] A Matrix to check.
* @param[out] flag the flag to be written to if any diagonal is zero.
* @param rows The number of rows for A.
* @param cols The number of cols of A.
* @note Code is a <code>const char*</code> held in
* <code>is_zero_on_diagonal_kernel_code.</code>
* Kernel for stan/math/gpu/err/check_diagonal_zeros.hpp.
* This kernel uses the helper macros available in helpers.cl.
*/
__kernel void is_zero_on_diagonal(
__global read_only double *A, __global int *flag,
read_only unsigned int rows, write_only unsigned int cols) {
const int i = get_global_id(0);
if (i < rows && i < cols) {
if (A(i, i) == 0) {
flag[0] = 1;
}
}
}
// \cond
);
// \endcond

/**
* See the docs for \link kernels/check_diagonal_zeros.hpp
* check_diagonal_zeros() \endlink
*/
const global_range_kernel<cl::Buffer, cl::Buffer, int, int>
check_diagonal_zeros("is_zero_on_diagonal",
is_zero_on_diagonal_kernel_code);

} // namespace opencl_kernels
} // namespace math
} // namespace stan
#endif
26 changes: 0 additions & 26 deletions stan/math/gpu/kernels/check_nan.cl

This file was deleted.

Loading