Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds require_* template type traits #1344

Merged
merged 74 commits into from
Oct 9, 2019
Merged

Conversation

SteveBronder
Copy link
Collaborator

@SteveBronder SteveBronder commented Sep 7, 2019

Summary

Out of the 2.8K of line additions 1.6K are tests. Most of the other changes are changing enable_if to require. The only files I would say have meaningful changes/additions are

  1. (new) is_matrix_cl
  2. (change) matrix_cl is used as a test bed to show some of the features below
  3. (new) matrix_cl scalar_type gives scalar_type_t for matrix_cl
  4. (new) matrix_cl value_type gives value_type_t for matrix_cl
  5. (new) require_generics.hpp Holds all of the require aliases explained below

Adds higher order functions for enabling and disabling functions through type traits. This new type system is what I'm going to call "legacy concepts" aka I think the closest thing you can get to C++20 concepts without having the requires keyword.

Legacy concepts are done through enable_if_t aliases which are called require_*. There are 5 basic types of requires.

  1. requires_*_t: This means the function requires a certain type to turn on. For instance
template <typename T, require_arithmetic_t<T>...>
T add(T x, T y) { return x + y; }
  1. require_not_*_t : When we want to disable for a type
// Only works for non-arithmetic types
template <typename T, require_not_arithmetic_t<T>...>
T add(T x, T y) { return x + y; }
  1. require_all_*_t : Takes a parameter pack of types to enable if all types satisfy the check.
// Works if T1 and T2 are arithmetic
template <typename T1, typename T2, require_all_arithmetic_t<T1, T2>...>
auto add(T1 x, T2 y) { return x + y; }
  1. require_any_*_t : Takes a parameter pack of types to enable if any of the types satisfy the check.
// Works if either T1 or T2 are arithmetic
template <typename T1, typename T2, require_any_arithmetic_t<T1, T2>...>
auto add(T1 x, T2 y) { return x + y; }
  1. require_not_any_*_t : Takes a parameter pack of types to enable if any one of the types are not satisfied.
// Works if either neither T1 or T2 are arithmetic
template <typename T1, typename T2, require_not_any_arithmetic_t<T1, T2>...>
auto add(T1 x, T2 y) { return x + y; }
  1. require_not_all_*_t : Takes a parameter pack of types to enable if all of the types are not satisfied.
// Works if neither T1 and T2 are arithmetic
template <typename T1, typename T2, require_not_all_arithmetic_t<T1, T2>...>
auto add(T1 x, T2 y) { return x + y; }

The above are built out for the meta programs to detect

is_arithmetic
is_floating_point
is_double_or_int
is_same
is_var
is_var_or_arithmetic
is_fvar
is_var_or_fvar
is_stan_scalar (either var, fvar, or arithmetic)
is_std_vector
is_vector
is_vector_like
is_eigen
is_eigen_vector

in addition, std::vector and Eigen types have additional requires to detect if the value_type (the first underlying type) or the scalar_type (the containers underlying scalar type) satisfy a condition to enable a class or function.

The container requires have a _vt and _st to symbolize the above. A function that accepts eigen matrices of whose whose value type is floating point types can be defined as

template <typename Mat1, typename Mat2,
  require_all_eigen_vt<is_floating_point, Mat1, Mat2>...>
auto add(Mat1&& A, Mat2&& B) { return A + B;}

A function that accepts standard vectors of Eigen vectors whose scalar type is floating point types can be defined as

template <typename Vec1, typename Vec2,
  require_all_std_vector_vt<is_eigen_vector, Vec1, Vec2>...,
  require_all_std_vector_st<is_floating_point, Vec1, Vec2>...>
auto add(Vec1&& A, Vec2&& B) { 
  std::vector<decltype<A[0] + B[0]>> return_vec;
  std::transform(A.begin(), A.end(), B.begin(), return_vec.begin(), 
    [](auto&& x, auto&& y) {
        return x + y;
    });
  return return_vec;
}

Background

I'd like to start using perfect forwarding semantics in the math library. There's a few reasons for this

  1. A lot of folks (including me) are looking to make Stan more performant through parallelization, but I think there is still a lot of money on the table when it comes to how we move and manage our memory in single threaded execution.

  2. If we want to add sparse matrices we either have to use more general templates in functions that can take in Eigen matrices or write overloads specializing on those. So for addition we would need signatures for adding sparse-sparse, sparse-dense, dense-sparse, and dense-dense matrices. That's not super fun, but the good news is that with more general templates (as perfect forwarding demands) we can support sparse matrices while getting better performance on the things we do now.

  3. When using OpenCL with a cpu, if the Eigen type or vector coming in is a non const rvalue, we can pretty safely steal that memory to do a 'zero cost copy' and not pay any copy cost while getting the multicore performance.

You can see a blog post by Meyer's here that goes in depth on what perfect forwarding means and how to implement it I'll give a tl;dr below. But the tl;dr of the tl'dr is that using perfect forwarding semantics allow rvalues to be moved instead of copied to lvalues in the case of const Blah& in signatures, but perfect forwarding is very hard to implement correctly.

This PR will make signatures like this

template <typename T1, typename T2, int R, int C>
inline Eigen::Matrix<return_type_t<T1, T2>, R, C> a_func(
    const Eigen::Matrix<T1, R, C>& m1, const Eigen::Matrix<T2, R, C>& m2) {
   check_nan(m1);
   check_nan(m2);
   Eigen::Matrix<return_type_t<T1, T2>, -1, -1> B = another_func(m1, m2);
    return B;
}

Look like this

template <typename Mat1, typename Mat2, 
require_all_eigen_vt<is_arithmetic, Mat1, Mat2>...> // (1)
inline auto a_func(Mat1&& m1, Mat2&& m2) { // (2)
   check_nan(m1);
   check_nan(m2);
   // If m1 and/or m2 is an rvalue it will be moved over to this function
  // instead of copied to an lvalue
   auto B = another_func(std::forward<Mat1>(m1), std::forward<Mat2>(m2)); // (3)
   return B;

Where

(1) require_*_eigen_vt is a template template alias around std::enable_if_t that checks whether each type is both derived from the EigenBase class and has an underlying arithmetic value type. There are two requires_* for each container, require_*_vt and require_*_st where vt stands for value_type and st stands for scalar_type. When methods use containers of containers we want to verify both the underlying container is the correct type and the scalar type of the container is the correct type. For example, if we accept a vector of eigen vectors with arithmetic scalars for a certain function this would have the template.

template <typename Vec, require_std_vector_vt<is_eigen, Vec>..., require_std_vector_st<std::is_arithmetic, Vec>...>
auto my_func(Vec&& v) {
/// do stuff
}

(2) The ... comes from this blog post and may seem a bit odd. But it's a clever way to force the requirement type trait to execute through parameter pack expansion. In the case of success the parameter pack require_all_eigen_t<condition, Mat1, Mat2>... has nothing in it and so nothing happens, but because we forced execution, failure will SFINAE out the signature for this function.

(3) In the signature the types only have to satisfy the requirements of (1), meaning that they can be Eigen matrices, expressions of eigen matrices, const/nonconst, and lvalues or rvalue refererences. The && in the signature acts like an or statement with lvalue (&) being true rvalue (&&) being false. If the type coming in is an lvalue reference then the type signature does Mat1&& || & produces an Mat1&. If the type coming in is an rvalue then the type signature does Mat1&& || && and produces Mat1&&.

One other important note for this, while in the process of implimenting this, in functions like add that take in two Eigen expressions, we will have to surround the return statements with (return_type).eval() and also perform an .eval() before coefficient access. The .eval() on return is necessary for now since all the other math library functions can only take in Eigen matrices. Once all the signatures in the library can take in general Eigen types we can take off the training wheels and get a lot of benefit from Eigen's buildup of expression templates though the stan math functions.

(3) The std::forward in the call of another_func allows the matrices to keep their value type as they are passed to the next function. It's important to note that std::forward is used at the last call site before the end of the function. If m1 was an rvalue and we used std::forward in check_nan then by the time we tried using it for another_func the memory would have been moved and the program would seg fault.

So that's pretty cool, we work with pretty large objects so not having to make a copy in each signature is good! There's some bureaucracy around the process but overall things are not too rough.

Out of the 2300 new lines in this pr, 1500 lines of this PR are testing, while out of the other 800 or so I'd say only 150 lines are actually meaningful. The important lines are at the top of require_generics.hpp where we define our base aliases that are used in the rest of the PR.

require_t is an alias for enable_if_t that takes in a type trait with a static member named value. it will enable or disable a signature depending on whether value is true or false. The rest of those base aliases are just clever manipulations of that. require_not_t being std::enable_if_t<!...>. require_all_t and require_any_t are variadic templated aliases for enable if that use conjunction<...> and disjunction<...> to turn on and off signatures depending on whether all (logical and) or any (logical or) of the type traits are satisfied, respectively.

The weirdest addition in this PR is the requirement type traits for containers which I'll paste here and then explain. In order to write something simplish like require_any_vector_t<is_var, Var1, Var2> we need the following

template <template <class...> class ContainerCheck, template <class...> class TypeCheck, class... Check>
 using container_type_check_base = bool_constant<math::conjunction<
                             ContainerCheck<std::decay_t<Check>>...,
                             TypeCheck<value_type_t<Check>>...>::value>;


 template <template <class...> class TypeCheck, class... Check>
 struct is_std_vector_check : container_type_check_base<is_std_vector, TypeCheck, Check...> {};

 template <template <class...> class TypeCheck, class... Check>
 using require_std_vector_t = require_t<is_std_vector_check<TypeCheck, Check...>>;

container_type_check_base is a template <template template>, this means it takes in two types that will be templated in the alias. So here both ContainerCheck and TypeCheck use the the template type Check when they are defined in the alias. The type TypeCheck uses value_type_t to extract the underlying type from the variadic template types Check. Check is a variadic template type mostly to deal with when either ContainerCheck or TypeCheck have more than one template type in their arguments, though in most cases this is usually something of the form

template <typename T, typename = void>

which is used frequently for partial specializations for type traits.

the struct alias is_std_vector_check is a partial specialization of container_type_check_base,

 template <template <class...> class TypeCheck, class... Check>
 struct is_std_vector_check : container_type_check_base<is_std_vector, TypeCheck, Check...> {};

It takes in only one templated template type and sets the first as is_std_vector. We then finish things off by making the final alias that gives us our nice pretty requirement

 template <template <class...> class TypeCheck, class... Check>
 using require_std_vector_t = require_t<is_std_vector_check<TypeCheck, Check...>>;

And voila! We can now hide away all the dirt while using the above to make a bunch of pretty requires methods, which tbh are just copy/pasted versions of require_std_vector

To example what this looks like I've added the requires and perfect forwarding templates to the matrix_cl constructor.

Edit: Sadly this won't work yet on the stuff in Stan math since the Stan upstream library has very fixed signatures. So this PR will need to happen in math, then I'll have to jump over to the Stan repo and implement the new signature style there first

Tests

Tests exist for checking all of the higher order functions included in require_generics.hpp files and can be tested with

./runTests.py ./test/unit/math -f require

These tests use the utilities in test/unit/math/require_utils.hpp which contain helper structs to define testing for unary and variadic requires aliases as well as larger structs containing the actual google tests used for the individual tests.

Side Effects

value type has also been moved from the math namespace to the stan namespace where the rest of the type traits exist.

Checklist

  • Math issue Update internals to use more modern c++ #1308

  • Copyright holder: Steve Bronder

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ (lol) and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.98)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.0)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.03)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.98)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.01)
Result: 1.00418259361
Commit hash: ce53688

@SteveBronder SteveBronder self-assigned this Sep 8, 2019
@SteveBronder SteveBronder changed the title [WIP] Adds require_* template type traits Adds require_* template type traits Sep 8, 2019
@SteveBronder
Copy link
Collaborator Author

Alright so all the tests are passing so I think this is ready for a general discussion.

Should this be broken up into multiple PRs? The new code is only about 500 lines while the rest are tests that are pretty copy/pasta

@SteveBronder
Copy link
Collaborator Author

@syclik I'm tagging you because I'm not to sure how to move forward

@bob-carpenter
Copy link
Contributor

Thanks so much for the amazingly well documented PR.

I agree it's important to eliminate copying and believe this is the right, modern C++ way to do it.

I'm pretty sure we don't have any pointer types anywhere where we'd use scalar_type to extract their base scalar types, but we'd definitely want it to keep going through pointers, so that could be fixed in an independent PR or included with this one if it's necessary.

Does is_arithmetic include our autodiff types now or was the first example meant to only apply to primitives?

P.S. That's "et voila" when you get to France :-)

…8729-1~exp1~20180509124008.99 (branches/release_50)"

This reverts commit c804df3.
@SteveBronder
Copy link
Collaborator Author

Thanks so much for the amazingly well documented PR.
I agree it's important to eliminate copying and believe this is the right, modern C++ way to do it.

Thanks!! Before the PR is merged I'll probably add this PRs summary to either the docs or the wiki.

I'm pretty sure we don't have any pointer types anywhere where we'd use scalar_type to extract their base scalar types, but we'd definitely want it to keep going through pointers, so that could be fixed in an independent PR or included with this one if it's necessary.

I can make this in a separate PR

Does is_arithmetic include our autodiff types now or was the first example meant to only apply to primitives?

It's only to primitives. is_stan_scalar is the one that works on all of our types while is_ad_type would work only on fvars and vars.

I still can't decide whether we would want a require_eigen<Mat1> type trait that generally accepts any eigen container. With how non-informative the signatures will be I think require_eigen_t<is_stan_scalar, Mat1>(1) gives actually useful info relative to require_eigen<Mat1>(2).

(2) removes the _t since in my mind palace the _t means "an eigen container with type blah. If we are generally checking it's an eigen container I think it's good to remove that.

P.S. That's "et voila" when you get to France :-)

I'll be caché'ing as we et voila across Paris ;-)

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.98)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.95)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.03)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.03)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.05)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.02)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.01)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.97)
Result: 1.00148479396
Commit hash: 989b127

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 1.01)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.99)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.02)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.0)
(performance.compilation, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 1.01)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 1.01)
Result: 1.00332099665
Commit hash: c8c698d

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.96)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.04)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.99)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 0.99)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 1.0)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.87)
Result: 0.99311276361
Commit hash: 906a24a

Copy link
Contributor

@t4c1 t4c1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you fixed what I asked for in previous reviews, there are some new changes I think need some fixes.

Comment on lines 79 to 80
using require_all_not_same_t
= require_all_not_t<std::is_same<std::decay_t<T>, std::decay_t<Types>>...>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is a bit ambiguous. It could suggests this requires that at least one of types is different than others or that all types are different. I think a better name for what this does this would be require_any_not_same_t.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I flat out need to rethink how same works here. Reading it now it checks that the Types match with just T, but it really should check that each type is the same. If i figure that out I think this will make more sense / I can have an any for same as well

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think just changing it to any makes sense (also i don't feel like writing out the more complicated thing which I'm not sure we would use that much)

cl::Context& ctx = opencl_context.context();
cl::CommandQueue& queue = opencl_context.queue();
// creates the OpenCL buffer to copy the Eigen
// matrix to the OpenCL device
buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * size());
cl::Event write_event;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you move this outside the loop? We need to add all write events, not just the last one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes! So, do we? I think it depends on whether OpenCL does async when it writes to an individual buffer or if writes to a single buffer happen sequentially. Maybe there is not a guarantee around that. If I can find something in the docs that shows that is true then I'll revert this and else we can discuss it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think we do, fixed this

queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE,
sizeof(double) * offset_size,
sizeof(double) * rows_, A[i].data(),
&this->read_events(), &write_event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a constructor - this matrix was not used before. So it has no read events and this change is unnecessary.

@@ -324,7 +324,7 @@ class matrix_cl<T, enable_if_arithmetic<T>> {
buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * A.size());
cl::Event transfer_event;
queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE, 0, sizeof(T) * A.size(),
A.data(), nullptr, &transfer_event);
A.data(), &this->read_events(), &transfer_event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again there are no read events in a new matrix.

queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE, 0, sizeof(T), &A, NULL,
&transfer_event);
queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE, 0, sizeof(T), &A,
&this->read_events(), &transfer_event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No read events in a new matrix

@@ -377,7 +378,7 @@ class matrix_cl<T, enable_if_arithmetic<T>> {
buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * A.size());
cl::Event transfer_event;
queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE, 0, sizeof(T) * A.size(),
A.data(), nullptr, &transfer_event);
A.data(), &this->read_events(), &transfer_event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again

stan/math/opencl/matrix_cl.hpp Show resolved Hide resolved
@@ -406,7 +408,7 @@ class matrix_cl<T, enable_if_arithmetic<T>> {
buffer_cl_ = cl::Buffer(ctx, CL_MEM_READ_WRITE, sizeof(T) * size());
cl::Event transfer_event;
queue.enqueueWriteBuffer(buffer_cl_, CL_FALSE, 0, sizeof(T) * size(), A,
NULL, &transfer_event);
&this->read_events(), &transfer_event);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again

stan/math/opencl/matrix_cl.hpp Show resolved Hide resolved
stan/math/opencl/matrix_cl.hpp Show resolved Hide resolved
@syclik
Copy link
Member

syclik commented Oct 4, 2019

Dope!

I want to make sure the doc you wrote up here:

  1. Doesn’t get lost
  2. Is easy to find by other devs.

I don’t think you need more doc, I just think it’d be a shame if it was buried in this PR and we’d have to find the PR to know what happened.

Would it make sense to use something like doygen’s modules to put your text into the codebase and have it render like this: https://mc-stan.org/math/modules.html ?

One question about naming (this is a question, not a suggestion or required change): you’ve got “not” mixed in with “or” like require_not_double_or_int_t and I recognize you’re negating the “double or int” condition. Should this be “not double and not int” or maybe “not double nor int”? In words, it’s hard to put parens, so it's read as “(not double) or (int)” instead of “not (double or int).” This is a weird edge case so it’s really a question. As long as it’s clearly documented, it’s ok (and we don’t have a foreseeable need for “(not double) or (int)” type of function).

@SteveBronder
Copy link
Collaborator Author

Dope!

Ty!

I don’t think you need more doc, I just think it’d be a shame if it was buried in this PR and we’d have to find the PR to know what happened.

Would it make sense to use something like doygen’s modules to put your text into the codebase and have it render like this: https://mc-stan.org/math/modules.html ?

Yes def, I've been thinking about that more and more, it would be nice to have modules for

  1. This
  2. OpenCL
  3. TBB
  4. A shortened version of the math paper detailing how we do things

I'll look into this later today

One question about naming (this is a question, not a suggestion or required change): you’ve got “not” mixed in with “or” like require_not_double_or_int_t and I recognize you’re negating the “double or int” condition. Should this be “not double and not int” or maybe “not double nor int”? In words, it’s hard to put parens, so it's read as “(not double) or (int)” instead of “not (double or int).” This is a weird edge case so it’s really a question. As long as it’s clearly documented, it’s ok (and we don’t have a foreseeable need for “(not double) or (int)” type of function).

Yeah it's a good Q, naming things is hard! I wanted to keep all of the names the same to avoid gotchas, but I def agree that not_double_or_int could be read in a lot of different ways. I'll think about this. It's not used anywhere so maybe the answer is to just remove it

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.97)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 1.01)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.03)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.01)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 0.99)
(performance.compilation, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.98)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 0.96)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 0.99)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.95)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.88)
Result: 0.98646109232
Commit hash: 906a24a

@SteveBronder
Copy link
Collaborator Author

@t4c1 Alrighty! I think I addressed all your review comments. I switched require_all_not_same to require_any_not_same since I think that makes more sense and removed the this->read_events() from the constructors. in addition I noticed that we commonly want to check for the same scalar or value type as a class template in our member functions so I added a _st and _vtvariant ofrequire_same`.

@syclik do you mind if I add the doxygen module in a separate PR? This one is already v big.

@stan-buildbot
Copy link
Contributor

(stat_comp_benchmarks/benchmarks/gp_pois_regr/gp_pois_regr.stan, 0.99)
(stat_comp_benchmarks/benchmarks/low_dim_corr_gauss/low_dim_corr_gauss.stan, 0.97)
(stat_comp_benchmarks/benchmarks/irt_2pl/irt_2pl.stan, 1.0)
(stat_comp_benchmarks/benchmarks/pkpd/one_comp_mm_elim_abs.stan, 0.97)
(stat_comp_benchmarks/benchmarks/eight_schools/eight_schools.stan, 1.02)
(stat_comp_benchmarks/benchmarks/gp_regr/gp_regr.stan, 1.0)
(stat_comp_benchmarks/benchmarks/arK/arK.stan, 1.01)
(performance.compilation, 1.02)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan, 1.01)
(stat_comp_benchmarks/benchmarks/low_dim_gauss_mix/low_dim_gauss_mix.stan, 0.99)
(stat_comp_benchmarks/benchmarks/sir/sir.stan, 1.02)
(stat_comp_benchmarks/benchmarks/pkpd/sim_one_comp_mm_elim_abs.stan, 0.98)
(stat_comp_benchmarks/benchmarks/garch/garch.stan, 1.0)
(stat_comp_benchmarks/benchmarks/gp_regr/gen_gp_data.stan, 0.94)
(stat_comp_benchmarks/benchmarks/arma/arma.stan, 0.87)
Result: 0.98548819891
Commit hash: 5e73b0f

@SteveBronder
Copy link
Collaborator Author

Alrighty! @t4c1 I did a look over the code and think I got all of your review comments and this is passing so if you want to have a look over I think it's good to go!

@syclik
Copy link
Member

syclik commented Oct 9, 2019 via email

Copy link
Contributor

@t4c1 t4c1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Will someone else look this over or shall we merge?

@SteveBronder SteveBronder merged commit 0b88b1b into develop Oct 9, 2019
@SteveBronder
Copy link
Collaborator Author

Sorry -- missed the notification earlier. If it's not going into this PR,
then can you open a new issue with enough detail on what needs to be done?
Maybe someone else can tackle the doc or at the very least, it'll remind
you that there's outstanding work to be done on this PR.

Gonna do it today!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants