Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change how users access OpenCL kernels #966

Merged
merged 47 commits into from
Aug 23, 2018
Merged
Changes from 1 commit
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
79c365a
First iteration, does not work
SteveBronder Jul 30, 2018
2ab6ce5
Adds kernel_cl across gpu functions. Getting seg_fault on tranpose te…
SteveBronder Aug 1, 2018
621d8a2
add test for kernel_cl
SteveBronder Aug 1, 2018
1f2aad1
added throwing with wrong kernel name, fixed seg fault
rok-cesnovar Aug 1, 2018
de7b5b0
removing unneccesary kernel from set_args
rok-cesnovar Aug 1, 2018
35d75c6
fix for the bug in sub_block
rok-cesnovar Aug 1, 2018
adfbcd1
Added NOLINT to the include statements in kernel_cl for brining in ke…
SteveBronder Aug 1, 2018
b324f01
Merge branch 'kernel_cl' of https://github.com/bstatcomp/math into ke…
SteveBronder Aug 1, 2018
2927e93
Change name of map that holds kernels to kernel_table
SteveBronder Aug 3, 2018
9a614c0
Merge commit '6d968e60bb633ce3b809225c037bedd3b51fa463' into HEAD
yashikno Aug 4, 2018
15d590d
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 4, 2018
8564d93
Include constants header and remove [in] for kernel param
SteveBronder Aug 4, 2018
d00e347
Remove kernel groups
SteveBronder Aug 5, 2018
adb1eff
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 5, 2018
1e5f13e
testing stringify
rok-cesnovar Aug 9, 2018
e5de811
macros test
rok-cesnovar Aug 10, 2018
931e62b
Merge commit '88efab8ef687c00170fd970d252a07274dee8500' into HEAD
yashikno Aug 10, 2018
9f6a9f9
[Jenkins] auto-formatting by clang-format version 5.0.2-svn328729-1~e…
stan-buildbot Aug 10, 2018
229bb19
Renamed kernels to remove the _kernel part of the name.
SteveBronder Aug 12, 2018
f680bef
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 12, 2018
f809a5b
typo
SteveBronder Aug 12, 2018
f634ed1
typo
SteveBronder Aug 12, 2018
9050d5c
Add line to each kernel doc directing users to helper macros
SteveBronder Aug 12, 2018
5fb4289
now using scoped enums
rok-cesnovar Aug 15, 2018
e8eec3b
Merge commit '6c83362fa6486e00ad395a40557d6807facd326c' into HEAD
yashikno Aug 15, 2018
7471093
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 15, 2018
3bacc32
moved compiling to the kernel_cl_base constructor, compiles everythin…
rok-cesnovar Aug 15, 2018
3424025
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 15, 2018
d20ac48
Make all kernels compiled at the first construction of the kernel_cl_…
SteveBronder Aug 16, 2018
62d339b
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 16, 2018
d1c4ec3
Makes methods in kernel_cl so that developers can access the kernel
SteveBronder Aug 17, 2018
5de2818
merge to remote
SteveBronder Aug 17, 2018
f8db5fd
Merge remote-tracking branch 'upstream/develop' into kernel_cl
SteveBronder Aug 18, 2018
0b4f512
snapshot of a work in progress design experiment
seantalts Aug 18, 2018
a68f51e
Fix bugs
seantalts Aug 22, 2018
afc73c5
new kernel enqueing in non templated functions
rok-cesnovar Aug 22, 2018
5508137
added the rest of the kernels, all gpu tests pass
rok-cesnovar Aug 22, 2018
6f8bd4a
changed to global_range, removed semicolons :)
rok-cesnovar Aug 22, 2018
fba61eb
...
rok-cesnovar Aug 22, 2018
aadb3ce
Changes all the kernel files so that they are placed into const char*…
SteveBronder Aug 23, 2018
4189530
fix lint issues
SteveBronder Aug 23, 2018
6a2014c
Move STRINGIFY to a single location in kernel_cl.hpp
seantalts Aug 23, 2018
3625c8f
Merge pull request #7 from stan-dev/kcl
SteveBronder Aug 23, 2018
58943c7
Merge commit '68b8f7e2effb1a23abe8524ff429a212653b53a3' into HEAD
yashikno Aug 23, 2018
1aa47d8
[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…
stan-buildbot Aug 23, 2018
e738921
Remove extra things from doxygen.cfg that existed for .cl files
SteveBronder Aug 23, 2018
93ed1a6
Add #ifdef STAN_OPENCL to kernels to fix header-tests
seantalts Aug 23, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Added NOLINT to the include statements in kernel_cl for brining in ke…
…rnels. Going to find better way to bring those in
  • Loading branch information
SteveBronder committed Aug 1, 2018
commit adfbcd115258a64c956bbd3c429a58d0988e15f7
148 changes: 79 additions & 69 deletions stan/math/gpu/kernel_cl.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,9 @@
#ifdef STAN_OPENCL

#include <stan/math/gpu/opencl_context.hpp>
#include <stan/math/gpu/constants.hpp>
#include <map>
#include <CL/cl.hpp>
#include <string>
#include <map>
#include <vector>

namespace stan {
Expand All @@ -14,52 +14,51 @@ namespace math {
class kernel_cl_base {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like you might want to hide this kernel_cl_base class as an implementation detail behind kernel_cl, which it seems like other Math library developers should want to use instead, right? You can do that by nesting this class one further, inside the stan::math::internal namespace. I'm not 100% sure this suggestion is correct, but it might make sense if you want to limit the developer API to the exposed kernel_cl functionality (which I think you might but I'm not sure) and it doesn't depend on knowing how kernel_cl_base is implemented (ideally it doesn't or is explained at the kernel_cl API layer)

friend class kernel_cl;

private:
private:
const char* copy_matrix_kernel =
#include <stan/math/gpu/kernels/copy_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/copy_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* transpose_matrix_kernel =
#include <stan/math/gpu/kernels/transpose_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/transpose_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* zeros_matrix_kernel =
#include <stan/math/gpu/kernels/zeros_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/zeros_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* identity_matrix_kernel =
#include <stan/math/gpu/kernels/identity_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/identity_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* copy_triangular_matrix_kernel =
#include <stan/math/gpu/kernels/copy_triangular_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/copy_triangular_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* copy_triangular_transposed_matrix_kernel =
#include <stan/math/gpu/kernels/triangular_transpose_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/triangular_transpose_kernel.cl> // NOLINT
; // NOLINT
const char* copy_submatrix_kernel =
#include <stan/math/gpu/kernels/sub_block_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/sub_block_kernel.cl> // NOLINT
; // NOLINT
const char* check_nan_kernel =
#include <stan/math/gpu/kernels/check_nan_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/check_nan_kernel.cl> // NOLINT
; // NOLINT
const char* check_diagonal_zeros_kernel =
#include <stan/math/gpu/kernels/check_diagonal_zeros_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/check_diagonal_zeros_kernel.cl> // NOLINT
; // NOLINT
const char* check_symmetric_kernel =
#include <stan/math/gpu/kernels/check_symmetric_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/check_symmetric_kernel.cl> // NOLINT
; // NOLINT
const char* subtract_symmetric_kernel =
#include <stan/math/gpu/kernels/subtract_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/subtract_matrix_kernel.cl> // NOLINT
; // NOLINT
const char* add_symmetric_kernel =
#include <stan/math/gpu/kernels/add_matrix_kernel.cl>
; // NOLINT
#include <stan/math/gpu/kernels/add_matrix_kernel.cl> // NOLINT
; // NOLINT

protected:
protected:
typedef std::map<const char*, int> map_base_opts;
const map_base_opts base_opts = {
{"LOWER", gpu::Lower},
{"UPPER", gpu::Upper},
{"ENTIRE", gpu::Entire},
{"UPPER_TO_LOWER", gpu::UpperToLower},
{"LOWER_TO_UPPER", gpu::LowerToUpper}};
const map_base_opts base_opts = {{"LOWER", gpu::Lower},
{"UPPER", gpu::Upper},
{"ENTIRE", gpu::Entire},
{"UPPER_TO_LOWER", gpu::UpperToLower},
{"LOWER_TO_UPPER", gpu::LowerToUpper}};

/** Holds meta information about a kernel.
* @param exists a bool to identify whether a kernel has been compiled.
Expand All @@ -77,29 +76,41 @@ class kernel_cl_base {
* Map of a kernel name (first) and it's meta information (second).
*/
typedef std::map<const char*, kernel_meta_info> map_kernel_info;
const map_kernel_info kernel_info = {
{"dummy", { false, "timing", {},
"__kernel void dummy(__global const int* foo) { };"}},
{"dummy2", {false, "timing", {},
"__kernel void dummy2(__global const int* foo) { };"}},
{"copy", {false, "basic_matrix", {}, copy_matrix_kernel}},
{"transpose", {false, "basic_matrix", {}, transpose_matrix_kernel}},
{"zeros", {false, "basic_matrix",
{"LOWER", "UPPER", "ENTIRE"}, zeros_matrix_kernel}},
{"identity", {false, "basic_matrix", {}, identity_matrix_kernel}},
{"copy_triangular", {false, "basic_matrix", {},
copy_triangular_matrix_kernel}},
{"copy_triangular_transposed", {false, "basic_matrix",
{"LOWER_TO_UPPER", "UPPER_TO_LOWER"},
copy_triangular_transposed_matrix_kernel}},
{"copy_submatrix", {false, "basic_matrix", {}, copy_submatrix_kernel}},
{"add", {false, "basic_matrix", {}, add_symmetric_kernel}},
{"subtract", {false, "basic_matrix", {}, subtract_symmetric_kernel}},
{"is_nan", {false, "check", {""}, check_nan_kernel}},
{"is_zero_on_diagonal", {false, "check", {}, check_diagonal_zeros_kernel}},
{"is_symmetric", {false, "check", {}, check_symmetric_kernel}}};
const map_kernel_info kernel_info
= {{"dummy",
{false,
"timing",
{},
"__kernel void dummy(__global const int* foo) { };"}},
{"dummy2",
{false,
"timing",
{},
"__kernel void dummy2(__global const int* foo) { };"}},
{"copy", {false, "basic_matrix", {}, copy_matrix_kernel}},
{"transpose", {false, "basic_matrix", {}, transpose_matrix_kernel}},
{"zeros",
{false,
"basic_matrix",
{"LOWER", "UPPER", "ENTIRE"},
zeros_matrix_kernel}},
{"identity", {false, "basic_matrix", {}, identity_matrix_kernel}},
{"copy_triangular",
{false, "basic_matrix", {}, copy_triangular_matrix_kernel}},
{"copy_triangular_transposed",
{false,
"basic_matrix",
{"LOWER_TO_UPPER", "UPPER_TO_LOWER"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a chance those gpu options can turn into enums of some kind? If those gpu::UpperToLower become scoped enums, these can become the type of that enum instead of strings, e.g.

enum class GpuOpt {upper, lower, ... };
...
struct kernel_meta_info {
    bool exists;
    std::vector<GpuOpt> opts;
    const char* raw_code;
  };

If you need to be able to set which integer each constant refers to, you can do that, though I don't think you can have two that point to the same int. But I think you'd actually rather use the scoped enum type in the template parameters later on rather than ints, template <GpuOpt triangular_map = GpuOpt::LowerToUpper>...?

copy_triangular_transposed_matrix_kernel}},
{"copy_submatrix", {false, "basic_matrix", {}, copy_submatrix_kernel}},
{"add", {false, "basic_matrix", {}, add_symmetric_kernel}},
{"subtract", {false, "basic_matrix", {}, subtract_symmetric_kernel}},
{"is_nan", {false, "check", {""}, check_nan_kernel}},
{"is_zero_on_diagonal",
{false, "check", {}, check_diagonal_zeros_kernel}},
{"is_symmetric", {false, "check", {}, check_symmetric_kernel}}};
typedef std::map<const char*, cl::Kernel> map_kernel;
map_kernel kernels; // The compiled kernels
map_kernel kernels; // The compiled kernels

static kernel_cl_base& getInstance() {
static kernel_cl_base instance_;
Expand All @@ -111,7 +122,7 @@ class kernel_cl_base {
};

class kernel_cl {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class could use a doc string too I think

public:
public:
cl::Kernel compiled_;
/**
* Compiles all the kernels in the specified group. The side effect of this
Expand All @@ -130,17 +141,17 @@ class kernel_cl {
std::string kernel_source = "";
if (this->kernel_info().count(kernel_name) == 0) {
// throws if the kernel does not exist
domain_error("compiling kernels", kernel_name,
" kernel does not exist", "");
domain_error("compiling kernels", kernel_name, " kernel does not exist",
"");
}
const char* kernel_group = this->kernel_info()[kernel_name].group;
for (auto kern : this->kernel_info()) {
if (strcmp(kern.second.group, kernel_group) == 0) {
kernel_source += kern.second.raw_code;
for (auto comp_opts : kern.second.opts) {
if (strcmp(comp_opts, "") != 0) {
kernel_opts += std::string(" -D") + comp_opts + "=" +
std::to_string(this->base_options()[comp_opts]);
kernel_opts += std::string(" -D") + comp_opts + "="
+ std::to_string(this->base_options()[comp_opts]);
}
}
}
Expand Down Expand Up @@ -186,7 +197,7 @@ class kernel_cl {
*
* @param[in] kernel_name The kernel name.
*/
explicit kernel_cl(const char* kernel_name) {
explicit kernel_cl(const char* kernel_name) {
// Compile the kernel group and return the kernel
if (!this->kernel_info()[kernel_name].exists) {
this->compile_kernel_group(kernel_name);
Expand Down Expand Up @@ -217,9 +228,8 @@ class kernel_cl {
* simpleopencl.blogspot.com/2013/04/calling-kernels-with-large-number-of.html
*/
template <typename T, typename... Args>
inline void recursive_args(cl::Kernel& kernel, int i,
const T& first_arg,
const Args&... extra_args) {
inline void recursive_args(cl::Kernel& kernel, int i, const T& first_arg,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think these might be better off private or in the internal namespace (they don't seem to need to be methods)

const Args&... extra_args) {
kernel.setArg(i, first_arg);
this->recursive_args(kernel, i + 1, extra_args...);
}
Expand Down Expand Up @@ -259,8 +269,8 @@ class kernel_cl {
return kernel_cl_base::getInstance().base_opts;
}
};
}
}
} // namespace math
} // namespace stan

#endif
#endif
#endif
#endif