-
-
Notifications
You must be signed in to change notification settings - Fork 187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimization #2551
optimization #2551
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -321,6 +321,16 @@ class results_cl { | |
std::get<Is>(results_), std::get<Is>(exprs.expressions_))...)); | ||
}); | ||
} | ||
template <typename... T_expressions, | ||
typename = std::enable_if_t<sizeof...(T_results) | ||
== sizeof...(T_expressions)>> | ||
void operator=(expressions_cl<T_expressions...>&& exprs) { | ||
index_apply<sizeof...(T_expressions)>([this, &exprs](auto... Is) mutable { | ||
assignment_impl(std::tuple_cat(make_assignment_pair( | ||
std::get<Is>(results_), | ||
std::forward<T_expressions>(std::get<Is>(exprs.expressions_)))...)); | ||
}); | ||
} | ||
|
||
/** | ||
* Incrementing \c results_ object by \c expressions_cl object | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Missing docs for the templates. Those are pretty complicated requires so please write out what's going on there There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Template is doced on line 339. The require is also relatively simple - just the number of results and expressions must match. |
||
|
@@ -531,11 +541,16 @@ class results_cl { | |
* @param expression expression | ||
* @return a tuple of pair of result and expression | ||
*/ | ||
template <typename T_result, typename T_expression, | ||
require_all_not_t<is_without_output<T_expression>, | ||
conjunction<internal::is_scalar_check<T_result>, | ||
std::is_arithmetic<std::decay_t< | ||
T_expression>>>>* = nullptr> | ||
template < | ||
typename T_result, typename T_expression, | ||
require_all_not_t< | ||
is_without_output<T_expression>, | ||
conjunction<internal::is_scalar_check<T_result>, | ||
std::is_arithmetic<std::decay_t<T_expression>>>, | ||
conjunction< | ||
is_matrix_cl<T_result>, is_matrix_cl<T_expression>, | ||
std::is_same<value_type_t<T_expression>, value_type_t<T_result>>, | ||
std::is_rvalue_reference<T_expression&&>>>* = nullptr> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If this is strictly for rvalue references why are we forwarding in the below? And why does this one need to be an rvalue reference? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are two things that can be r- or l-value references. One is tuple (which in this case must be a rvalue reference and than there are tuple elements that can be either. |
||
static auto make_assignment_pair(T_result&& result, | ||
T_expression&& expression) { | ||
return std::make_tuple( | ||
|
@@ -576,6 +591,22 @@ class results_cl { | |
} | ||
return std::make_tuple(); | ||
} | ||
|
||
/** | ||
* Optimized move assignment of a `matrix_cl` into another `matrix_cl`. | ||
* @param result result - check | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What does |
||
* @param pass bool scalar | ||
* @return an empty tuple | ||
*/ | ||
template <typename T_result, typename T_matrix, | ||
require_all_matrix_cl_t<T_result, T_matrix>* = nullptr, | ||
require_all_st_same<T_result, T_matrix>* = nullptr, | ||
require_t<std::is_rvalue_reference<T_matrix&&>>* = nullptr> | ||
static std::tuple<> make_assignment_pair(T_result&& result, | ||
T_matrix&& matrix) { | ||
result = std::move(matrix); | ||
return std::make_tuple(); | ||
} | ||
SteveBronder marked this conversation as resolved.
Show resolved
Hide resolved
|
||
}; | ||
|
||
/** | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the input is && can't you always move instead of forward? And I think this can be made a lot simpler with
for_each()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also can make assignment pair just return back an std::pair?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some make_assignment_pair overloads need to return nothing, which can not be achieved with std::pair. Right now they return an empty tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. The tuple itself might be a rvalue reference, but the elements in it can still be lvalue references, which must not be moved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to implement this with
for_each()
, but I can't expand theT_expressions
parameter pack, so I can not call thestd::forward
. This will need to stay as it is (also in this casefor_each()
would not really simplify the code).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But for that pair, since the rhs is an rvalue doesn't it use the
make_assignment_pair()
that doesn't call a kernel that you wrote in this PR here? If that test should throw we should check that the result and matrix have the same size before doing the moveThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right about this not being implemented within a kernel. The test, however operates on lvalues, so the new
make_assignment_pair()
from this PR is not used and kernel would be used if both operations had the same size. As they do not, it correctly throws.I agree it not exactly nice to have different error handling behavior between l- and rvalues. However, changing the new (rvalue) case to also throw would be a bit hard, as this check would have to be made before the
make_assignment_pair()
call. That would be a bit more work than I am willing to go into for this PR. Anyway, as it is now in this PR, it works correctly, even if a bit unintuitively.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m1_cl - m2_cl
in the test we are talking about is an rvalue though and we are assigning it to amatrix_cl<>
that is uninitialized.EXPECT_THROW(stan::math::results(res1_cl, res2_cl) = stan::math::expressions(m1_cl - m2_cl, m3_cl - m4_cl), std::invalid_argument);
Do we want to always be throwing in the case of incorrect size assignment here? In that case I think it would just be changing the
matrix_cl
rvalue assign operator to check sizes before doing anything. But imo it feels like it makes sense to allow this since the rhs is replacing the buffer of the matrix_cl.The other test where we have an expression of a certain size on the lhs still throws
Is there a reason we can't just check the rows and columns of the assigner and assignee and if they mismatch throw an error?
What's the intuition behind not wanting to allow resizing in the second test case? tmk Eigen only stops resizing in the case of an Eigen Map or fixed size matrices of mismatched sizes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, the reason for not using the new code in that test is the fact that the right hand side expression is not a plain matrix. Reminder of my previous post is still true.
That is debatable. If we have two assignments of different sizes we can not do them in the same kernel so we throw. If one of these can be handled by matrix_cl move assignment (the new code in this PR) than we don't need to. Although we still could to be consistent with throwing when right hand side is a rvalue and we can not use move assignment.
We already do that and that is not the issue here. The issue is if we want to assign two expressions in the same call and they have different size (the sizes of results match expressions or can be resized so they do).
It is not about resizing. See my previous answer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry which post? And if what I wrote should not be using the new
make_assignment_pair()
can you add a require to guard against that? Though tbh I'm still confused why it can't use the new one with the move assignment since we are assigning an rvalue expression to theres1/2_cl
It sounds like we should have another test that is something like