
operator+ for var matrices and matrix of vars #2115

Merged: 50 commits into develop, Nov 16, 2020

Conversation

SteveBronder (Collaborator)

Summary

This adds add() and operator+ for var matrices, and adds overloads of add() that use reverse_pass_callback() for matrices of vars. I can add subtraction here as well, since the code would look nearly the same.
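As a rough illustration of the reverse_pass_callback() pattern this PR uses, here is a minimal sketch of addition for two var<Matrix> operands. This is not the exact code being merged; the free-function name add_varmat is hypothetical, and only Stan Math's var_value / reverse_pass_callback machinery is assumed.

#include <stan/math/rev.hpp>

// Hypothetical sketch, not the code merged in this PR.
inline stan::math::var_value<Eigen::MatrixXd> add_varmat(
    const stan::math::var_value<Eigen::MatrixXd>& a,
    const stan::math::var_value<Eigen::MatrixXd>& b) {
  // Forward pass: one vectorized sum over the value matrices.
  stan::math::var_value<Eigen::MatrixXd> ret(a.val() + b.val());
  // Reverse pass: d(a + b)/da = d(a + b)/db = I, so the result's adjoint
  // is accumulated into both operands' adjoints. var_value has pointer
  // semantics, so capturing by value is cheap.
  stan::math::reverse_pass_callback([ret, a, b]() mutable {
    a.adj() += ret.adj();
    b.adj() += ret.adj();
  });
  return ret;
}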

Tests

Tests were added for operator+ for all the mixed types it can accept. The tests use add(), because add() now has a specialization for var types that diverts to operator+.
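For reference, a delegation like the one described can look roughly like this; the require_any_var_t constraint here is an assumption for illustration, not necessarily what the PR uses:

template <typename T1, typename T2,
          require_any_var_t<T1, T2>* = nullptr>
inline auto add(const T1& a, const T2& b) {
  // Delegate to operator+ so add(a, b) and a + b share one
  // reverse-mode implementation (and one set of tests).
  return a + b;
}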

Side Effects

None

Release notes

Adds addition overloads for var matrices and matrices of vars

Checklist

  • Math issue: Make functions with custom autodiff var<mat> friendly (#2101)

  • Copyright holder: (fill in copyright holder information)

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass (make test-headers)
    • dependencies checks pass (make test-math-dependencies)
    • docs build (make doxygen)
    • code passes the built-in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@SteveBronder SteveBronder changed the title Feature/varmat operatorplus operator+ for var matrices and matrix of vars Sep 30, 2020
@SteveBronder (Collaborator, Author)

I have a branch below that does the same thing for subtract as well. I'll post some speed tests tonight:

https://github.com/stan-dev/math/tree/feature/varmat-operatorminus

@andrjohns (Collaborator) left a comment

This all looks great! A couple of minor changes in there.

I might be a bit of a pain though. I tried this reverse_pass_callback approach with the unary functions and performance got worse, since the multiple passes over the same inputs to pull out values and adjoints 'cost' more than was gained by the callback. I wonder if this will have the same problem?

Would you mind benchmarking these against an apply_scalar_binary implementation?

Something like:

template <typename T1, typename T2, require_any_container_t<T1, T2>* = nullptr>
inline auto operator+(const T1& a, const T2& b) {
  return apply_scalar_binary(
      a, b, [&](const auto& c, const auto& d) { return add(c, d); });
}

Where there are only scalar overloads for add (i.e., comment out the container specialisations or something).

Let me know if you don't have the time and I can work them up over the weekend as well.


/**
* Return <code>true</code> if <code>y</code> is less or equal to
* <code>high</code>.
andrjohns (Collaborator):

The doc and variable names don't quite line up with the function, since it's testing for equality rather than less-or-equal.

return (y.array() == x).all();
}

template <typename T_y, typename T_high, require_std_vector_t<T_y>* = nullptr>
andrjohns (Collaborator):

Will need to add doc for these other specialisations as well.

* to low and if and element of y or high is NaN
*/
template <typename T_y, typename T_high, require_eigen_t<T_y>* = nullptr>
inline bool is_equal(const T_y& y, const T_high& x) {
andrjohns (Collaborator):

You'll probably also need to add is_equal(container, container) specialisations, as well as is_equal(scalar, container).

At least for the second one you can just call the existing specialisations with the arguments reversed, so it's not all bad.
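For instance, the scalar/container case might simply forward to the container/scalar overload with the arguments swapped; this is a sketch, and the require_stan_scalar_t / require_container_t constraints are assumptions:

template <typename T_y, typename T_high,
          require_stan_scalar_t<T_y>* = nullptr,
          require_container_t<T_high>* = nullptr>
inline bool is_equal(const T_y& y, const T_high& x) {
  // Equality is symmetric, so reuse the existing container/scalar overload.
  return is_equal(x, y);
}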

}

template <typename T_y, typename T_high, require_std_vector_t<T_y>* = nullptr>
inline bool is_equal(const T_y& y, const T_high& x) {
andrjohns (Collaborator):

Could always just define the Eigen specialisations and then Map any std::vectors and pass those to the Eigen versions. That will help cut down on the code for the other combinations.
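A sketch of that Map-and-delegate idea (assuming the std::vector holds arithmetic scalars and that an Eigen specialisation of is_equal exists to forward to):

template <typename T_y, typename T_high,
          require_std_vector_t<T_y>* = nullptr>
inline bool is_equal(const T_y& y, const T_high& x) {
  // View the std::vector's storage as an Eigen column vector (no copy)
  // and delegate to the Eigen specialisation.
  using map_t
      = Eigen::Map<const Eigen::Matrix<value_type_t<T_y>, Eigen::Dynamic, 1>>;
  return is_equal(map_t(y.data(), y.size()), x);
}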

(Two resolved review threads on stan/math/prim/err/is_matching_dims.hpp)
@@ -20,6 +20,11 @@ inline bool is_nan(T x) {
return std::isnan(x);
}

template <typename T, typename = require_eigen_t<T>>
inline bool is_nan(const T& x) {
return Eigen::isnan(x.array()).any();
andrjohns (Collaborator):
Eigen has a neat little member function just for this:

Suggested change:
- return Eigen::isnan(x.array()).any();
+ return x.hasNaN();

(Two resolved review threads on stan/math/rev/core/operator_addition.hpp)
*/
template <typename Var, typename EigMat,
require_eigen_vt<std::is_arithmetic, EigMat>* = nullptr,
require_var_vt<std::is_arithmetic, Var>* = nullptr>
andrjohns (Collaborator):

Not related to this pull request, but I didn't realise we had this require_var_vt; that will simplify some of the wacky templating I've been resorting to. These require generics have been such a good addition!

return {new internal::add_vv_vari(a.vi_, b.vi_)};
var ret(a.val() + b.val());
if (unlikely(is_any_nan(a.val(), b.val()))) {
reverse_pass_callback([a, b]() mutable {
A contributor commented:

I don't think special handling of NaNs benefits either correctness or performance. The same propagation of NaNs happens in the general branch as well. Maybe benchmark it?
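A tiny standalone illustration of the reviewer's point: IEEE-754 arithmetic makes NaN sticky, so the generic adjoint accumulation already propagates it without a dedicated branch (hypothetical example, not PR code):

#include <cassert>
#include <cmath>

int main() {
  double ret_adj = std::nan("");  // adjoint of the result is NaN
  double a_adj = 0.0;
  a_adj += ret_adj;               // the generic accumulation step
  assert(std::isnan(a_adj));      // NaN propagated without special casing
  return 0;
}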

@SteveBronder (Collaborator, Author)

Woof, weird error I need to sort out, but the good news is that the benchmarks are very cool.

For Matrix<var>:

[benchmark plot]

Then for var<Matrix>:

[benchmark plot]

@andrjohns (Collaborator)

Just realised that my apply_scalar_binary suggestion doesn't make much sense, since there aren't scalar overloads of var<matrix>, so feel free to ignore me on that one.

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change (1 - new/old)
gp_pois_regr/gp_pois_regr.stan 3.14 3.52 0.89 -12.02% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.98 -2.03% slower
eight_schools/eight_schools.stan 0.11 0.12 0.95 -5.56% slower
gp_regr/gp_regr.stan 0.18 0.17 1.04 3.56% faster
irt_2pl/irt_2pl.stan 5.65 5.72 0.99 -1.22% slower
performance.compilation 90.72 88.63 1.02 2.3% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.62 8.42 1.02 2.35% faster
pkpd/one_comp_mm_elim_abs.stan 30.87 29.69 1.04 3.83% faster
sir/sir.stan 128.69 138.22 0.93 -7.41% slower
gp_regr/gen_gp_data.stan 0.04 0.04 1.0 -0.32% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 2.96 1.0 -0.2% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.41 0.93 -7.25% slower
arK/arK.stan 1.8 2.49 0.72 -38.78% slower
arma/arma.stan 0.61 0.61 1.0 -0.1% slower
garch/garch.stan 0.7 0.71 0.98 -1.53% slower
Mean result: 0.966346746938

Jenkins Console Log
Blue Ocean
Commit hash: e793e6c


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@SteveBronder (Collaborator, Author)

If anyone has a minute this should be ready for review!

@andrjohns (Collaborator)

Sorry about the delay, will take a look at this today

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change (1 - new/old)
gp_pois_regr/gp_pois_regr.stan 3.13 3.39 0.92 -8.43% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.99 -1.18% slower
eight_schools/eight_schools.stan 0.12 0.11 1.02 2.07% faster
gp_regr/gp_regr.stan 0.17 0.17 1.0 0.29% faster
irt_2pl/irt_2pl.stan 5.71 5.68 1.01 0.56% faster
performance.compilation 90.88 88.48 1.03 2.64% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.51 8.47 1.0 0.41% faster
pkpd/one_comp_mm_elim_abs.stan 29.88 29.28 1.02 1.99% faster
sir/sir.stan 143.61 138.41 1.04 3.62% faster
gp_regr/gen_gp_data.stan 0.04 0.04 1.0 0.16% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.95 3.0 0.98 -1.54% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.38 0.39 0.98 -1.79% slower
arK/arK.stan 1.78 1.76 1.01 1.37% faster
arma/arma.stan 0.6 0.61 0.99 -1.3% slower
garch/garch.stan 0.75 0.58 1.28 21.73% faster
Mean result: 1.01845671308

Jenkins Console Log
Blue Ocean
Commit hash: e4615df


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@andrjohns (Collaborator) left a comment

A couple of very minor queries, but otherwise looks great! Thanks for going through the apply_scalar_binary benchmarking rigmarole.

template<typename T = Scalar>
EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE
std::enable_if_t<std::is_pointer<T>::value, reverse_return_t<T>>
coeffRef(T &v) { return v->adj_; }
andrjohns (Collaborator):
Is this needed? Shouldn't mat.adj().coeffRef(i,j) 'just work'?

SteveBronder (Collaborator, Author):

Oh hm, maybe I did something stupid. Let me look at this.

SteveBronder (Collaborator, Author):
Totally unnecessary, deleted!

(Resolved review thread on stan/math/rev/core/operator_addition.hpp)
@bbbales2 (Member) commented Nov 9, 2020

@SteveBronder yoyo Steve, can you do the polishing on this? I want my operator+

@SteveBronder (Collaborator, Author)

Yes! Trying to finish up the slicing stuff today, but I will clean up based on the review comments (thanks @andrjohns!) tomorrow.

@bbbales2 (Member) commented Nov 12, 2020

Gogogogogo (edit: that's my generic cheer squad)

@SteveBronder (Collaborator, Author)

Ooof, yeah, sorry, I'll try to fix this up tonight or tomorrow. The assign and subset tests are way more work than I thought.

@bbbales2 (Member)

> Ooof

The cheer squad appreciates the effort

@andrjohns (Collaborator)

> The cheer squad appreciates the effort

+1 Cheer

Let me know when I can take a gander at this again

@SteveBronder (Collaborator, Author)

@andrjohns should be ready to rock!

andrjohns previously approved these changes Nov 13, 2020
@andrjohns (Collaborator) left a comment
All good here! Woohoo!

@SteveBronder (Collaborator, Author)

@andrjohns can you re-click the approve thing? Something happened in Jenkins and I had to kick this off again.

@SteveBronder (Collaborator, Author)

@serban-nicusor-toptal is there something up with Jenkins? The normal button to view Jenkins stuff is gone from the PR, and this PR has been waiting since Friday. Is there a backlog of PRs?

@serban-nicusor-toptal (Contributor)

I had to reboot Jenkins, and it may have corrupted the job URL here. This is the last build that ran; I've also restarted it, and the button should be back too.

@SteveBronder (Collaborator, Author)

Cool, much appreciated!

@stan-buildbot (Contributor)


Name Old Result New Result Ratio Performance change (1 - new/old)
gp_pois_regr/gp_pois_regr.stan 3.13 3.58 0.87 -14.52% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 0.97 -3.12% slower
eight_schools/eight_schools.stan 0.12 0.12 0.99 -0.82% slower
gp_regr/gp_regr.stan 0.17 0.16 1.07 6.4% faster
irt_2pl/irt_2pl.stan 5.67 5.71 0.99 -0.6% slower
performance.compilation 86.46 85.4 1.01 1.22% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.62 8.49 1.02 1.53% faster
pkpd/one_comp_mm_elim_abs.stan 31.63 30.5 1.04 3.57% faster
sir/sir.stan 135.87 134.57 1.01 0.96% faster
gp_regr/gen_gp_data.stan 0.04 0.04 0.99 -1.15% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.99 2.95 1.01 1.28% faster
pkpd/sim_one_comp_mm_elim_abs.stan 0.37 0.39 0.96 -4.02% slower
arK/arK.stan 1.76 1.79 0.99 -1.48% slower
arma/arma.stan 0.59 0.61 0.98 -2.33% slower
garch/garch.stan 0.75 0.6 1.25 19.84% faster
Mean result: 1.0096526338

Jenkins Console Log
Blue Ocean
Commit hash: 1507a74


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@SteveBronder SteveBronder merged commit f5a051e into develop Nov 16, 2020