Add Eigen Dense and Sparse vari_value types #1952

SteveBronder · 2020-06-28T23:05:38Z

Summary

Adds a dense and sparse matrix specialization to vari_value. The dense and sparse vari_value matrix types use an Eigen::Map to hold stack allocated memory for the val_ and adj_.

Tests

This needs more testing, but I could use some help figuring out what needs to be done. Right now I'm testing that

- Eigen matrix expressions can be passed into the constructor
- Standard var_value(var_value()) and var_value(vari_value()) construction works
- Member vals are the same as the input vals for the above cases

Side Effects

Yes, we need to be careful when constructing vari_value and var_value types that the template type is a plain Eigen type (i.e., not an expression, just a Matrix, Array, SparseMatrix, etc.). I think I'll add an is_plain_type type trait to check that the template arguments for vari_value are actually plain types.
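As a rough illustration of the kind of check described here, the sketch below compares a type against the type it evaluates to. It does not use Eigen: `ToyMatrix`, `ToySumExpr`, `is_plain_type_sketch`, and `toy_vari` are hypothetical stand-ins, and the nested `PlainObject` typedef is assumed to play the role it plays in the diff further down. The real trait may well be implemented differently.

```cpp
#include <cassert>
#include <type_traits>

// Toy stand-ins (NOT Eigen): a concrete matrix and a lazy "a + b" expression.
struct ToyMatrix {
  using PlainObject = ToyMatrix;  // a plain type evaluates to itself
};
struct ToySumExpr {
  using PlainObject = ToyMatrix;  // an expression evaluates to a ToyMatrix
};

// Hypothetical trait: T is "plain" when it is already its own evaluated type.
// std::decay strips references and cv-qualifiers before comparing.
template <typename T>
struct is_plain_type_sketch
    : std::is_same<typename std::decay<T>::type,
                   typename std::decay<T>::type::PlainObject> {};

// Hypothetical vari-like holder that rejects expression types at compile time.
template <typename T>
struct toy_vari {
  static_assert(is_plain_type_sketch<T>::value,
                "toy_vari requires a plain (non-expression) type");
};
```

With this in place, `toy_vari<ToyMatrix>` compiles while `toy_vari<ToySumExpr>` fails with the static_assert message, which is the failure mode the trait is meant to surface early.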

Release notes

Allows vari to hold an Eigen type

Checklist

Math issue Meta Issue for Static Matrices #1875
Copyright holder: Steve Bronder

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
- unit tests pass (to run, use: ./runTests.py test/unit)
- header checks pass, (make test-headers)
- dependencies checks pass, (make test-math-dependencies)
- docs build, (make doxygen)
- code passes the built in C++ standards checks (make cpplint)
the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested


stan-buildbot · 2020-08-05T15:23:58Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.01	4.01	1.0	0.16% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.98	-2.37% slower
eight_schools/eight_schools.stan	0.09	0.09	1.02	1.68% faster
gp_regr/gp_regr.stan	0.19	0.19	0.98	-1.97% slower
irt_2pl/irt_2pl.stan	5.33	5.41	0.98	-1.54% slower
performance.compilation	87.91	85.88	1.02	2.31% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.57	7.83	0.97	-3.38% slower
pkpd/one_comp_mm_elim_abs.stan	27.55	27.24	1.01	1.14% faster
sir/sir.stan	110.93	133.16	0.83	-20.04% slower
gp_regr/gen_gp_data.stan	0.04	0.04	0.99	-0.83% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.99	3.01	0.99	-0.52% slower
pkpd/sim_one_comp_mm_elim_abs.stan	0.41	0.38	1.07	6.92% faster
arK/arK.stan	1.72	1.8	0.96	-4.2% slower
arma/arma.stan	0.59	0.59	1.01	0.67% faster
garch/garch.stan	0.59	0.53	1.11	10.25% faster
Mean result: 0.995881756989

Jenkins Console Log
Blue Ocean
Commit hash: e290f12

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

bbbales2 · 2020-08-05T19:32:19Z

stan/math/opencl/rev/vari.hpp

+   * will be called in the reverse order of construction.
+   *
+   * @tparam S A `matrix_cl` or kernel generator expression type that is
+   * convertible to `value_type`


value_type isn't here anymore

bbbales2 · 2020-08-05T19:32:35Z

stan/math/opencl/rev/vari.hpp

+   *  will be called in the reverse order of construction.
+   *
+   * @tparam S A `matrix_cl` or kernel generator expression type that is
+   * convertible to `value_type`


value_type isn't here anymore


SteveBronder · 2020-08-05T22:40:58Z

@bbbales2 as we discussed, I added a grad() method that just starts at the bottom of the stack and calls each chain() without setting any adjoints first. I also made the grad() method in var_value exist only if the inner type is a scalar.
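One common way to make a member function exist only for scalar inner types is SFINAE on a defaulted template parameter. The sketch below is an assumption about the general technique, not the code from this PR: `toy_var` and `has_grad` are hypothetical names, and `std::is_floating_point` stands in for whatever "is a scalar" condition the branch actually uses.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Toy wrapper, not Stan's var_value: holds a value of type T.
template <typename T>
struct toy_var {
  T val_{};
  bool grad_called_ = false;

  // grad() takes part in overload resolution only when T is a scalar.
  // The defaulted parameter U makes the condition depend on a template
  // parameter of grad() itself, so a failed condition removes the member
  // instead of producing a hard error.
  template <typename U = T,
            typename std::enable_if<std::is_floating_point<U>::value>::type* =
                nullptr>
  void grad() {
    grad_called_ = true;
  }
};

// Hypothetical detection helper: true when X has a callable grad().
template <typename X, typename = void>
struct has_grad : std::false_type {};
template <typename X>
struct has_grad<X, decltype(std::declval<X&>().grad())> : std::true_type {};
```

Here `has_grad<toy_var<double>>::value` is true, while `has_grad<toy_var<std::vector<double>>>::value` is false, so calling grad() on a container-valued wrapper is rejected at compile time.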

@t4c1 do you want to get the operands_and_partials branch approved/merged in here first before approving/merging this branch to develop? I'm fine either way

SteveBronder · 2020-08-05T22:57:03Z

@serban-nicusor-toptal getting a Jenkins error I haven't seen before

+ git pull
Already up to date.
+ git checkout develop
error: pathspec 'develop' did not match any file(s) known to git
script returned exit code 1

Do you know what's up with that?

serban-nicusor-toptal · 2020-08-06T00:35:57Z

Hey @SteveBronder everything is back to normal, sorry for the trouble!
Jenkins was only fetching this PR from the origin so I had to make sure it checks out all of them.

bbbales2 · 2020-08-06T01:03:44Z

stan/math/rev/core/grad.hpp

+ * <p>This function does not recover any memory from the computation.
+ *
+ */
+static void grad() {


The duplicate code here can be eliminated pretty easily.

Either call this grad inside the other grad, or give the other one a default nullptr argument:

template <typename Vari>
static void grad(Vari* vi) {
  vi->init_dependent();
  grad();
}

or

template <typename Vari>
static void grad(Vari* vi = nullptr) {
  if (vi) {
    vi->init_dependent();
  }
  size_t end = ChainableStack::instance_->var_stack_.size();
  size_t beginning = empty_nested() ? 0 : end - nested_size();
  for (size_t i = end; i-- > beginning;) {
    ChainableStack::instance_->var_stack_[i]->chain();
  }
}

and delete the second definition.
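To make the first option concrete, here is a self-contained sketch of a reverse sweep shared by two overloads. Everything in it is a toy: `toy_vari`, `toy_stack`, and `toy_grad` are hypothetical names, each node has at most one operand, and nested stacks are ignored. It uses the two-function form because a default function argument on its own does not let the compiler deduce `Vari` for a call with no arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy node standing in for a vari: a value, an adjoint, and a chain() rule.
struct toy_vari {
  double val_;
  double adj_;
  toy_vari* operand_;  // single input, or nullptr for a leaf
  double partial_;     // d(this)/d(operand)

  void init_dependent() { adj_ = 1.0; }
  void chain() {
    if (operand_) operand_->adj_ += adj_ * partial_;
  }
};

// Toy tape standing in for the autodiff stack.
static std::vector<toy_vari*>& toy_stack() {
  static std::vector<toy_vari*> stack;
  return stack;
}

// Reverse sweep only: assumes the caller has already seeded the adjoints.
static void toy_grad() {
  std::vector<toy_vari*>& s = toy_stack();
  for (std::size_t i = s.size(); i-- > 0;) {
    s[i]->chain();
  }
}

// Seed the dependent variable, then reuse the sweep above.
template <typename Vari>
static void toy_grad(Vari* vi) {
  vi->init_dependent();
  toy_grad();
}
```

Pushing nodes for x, y = 2 * x, and z = y * y onto the toy stack in that order and calling `toy_grad(&z)` propagates adjoints from z back to x in a single pass, with the loop written only once.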

SteveBronder · 2020-08-06T03:09:41Z

@serban-nicusor-toptal much appreciated!

stan-buildbot · 2020-08-06T05:32:59Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.18	4.0	1.05	4.31% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	1.05	4.33% faster
eight_schools/eight_schools.stan	0.09	0.09	0.98	-2.49% slower
gp_regr/gp_regr.stan	0.2	0.21	0.96	-3.88% slower
irt_2pl/irt_2pl.stan	5.41	5.46	0.99	-0.9% slower
performance.compilation	86.97	85.53	1.02	1.66% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.82	7.85	1.0	-0.33% slower
pkpd/one_comp_mm_elim_abs.stan	27.06	28.97	0.93	-7.08% slower
sir/sir.stan	132.12	130.02	1.02	1.6% faster
gp_regr/gen_gp_data.stan	0.05	0.05	0.96	-3.83% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.01	3.0	1.0	0.46% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.39	0.39	1.0	0.06% faster
arK/arK.stan	1.78	1.8	0.99	-1.37% slower
arma/arma.stan	0.59	0.73	0.81	-23.67% slower
garch/garch.stan	0.53	0.58	0.93	-8.02% slower
Mean result: 0.978176087394

Jenkins Console Log
Blue Ocean
Commit hash: c6af0b0


t4c1 · 2020-08-06T06:17:12Z

> @t4c1 do you want to get the operands_and_partials branch approved/merged in here first before approving/merging this branch to develop? I'm fine either way

I don't think you need to wait for me. If this gets ready before ops_partials, go ahead and merge it.

t4c1

A few more details and this is done.

t4c1 · 2020-08-06T06:26:15Z

stan/math/rev/core/var.hpp

@@ -76,14 +86,13 @@ class var_value<T, require_floating_point_t<T>> {
   * @param x Value of the variable.
   */
  template <typename S, require_convertible_t<S&, value_type>* = nullptr>
-  var_value(const S& x) : vi_(new vari_type(x, false)) {}  // NOLINT
+  var_value(S&& x) : vi_(new vari_type(x, false)) {}  // NOLINT


Suggested change

-  var_value(S&& x) : vi_(new vari_type(x, false)) {}  // NOLINT
+  var_value(const S& x) : vi_(new vari_type(x, false)) {}  // NOLINT

t4c1 · 2020-08-06T06:54:31Z

stan/math/rev/core/vari.hpp

+  static constexpr Eigen::Index RowsAtCompileTime
+      = PlainObject::RowsAtCompileTime;
  /**
   * Number of columns known at compile time
   */
-  static constexpr Eigen::Index ColsAtCompileTime = Scalar::ColsAtCompileTime;
+  static constexpr Eigen::Index ColsAtCompileTime
+      = PlainObject::ColsAtCompileTime;


Eigen uses enums to store these. I like our approach with constexprs better, but I think the type should be int.

andrjohns · Aug 6, 2020

I thought that enums were preferred since the static constexpr doesn't always operate as purely compile-time (based on this writeup):

> Static constant members are lvalues (see Appendix B). So, if we have a declaration such as
> void foo(int const&);
> and we pass it the result of a metaprogram:
> foo(Pow3<7>::value);
> a compiler must pass the address of Pow3<7>::value, and that forces the compiler to instantiate and allocate the definition for the static member. As a result, the computation is no longer limited to a pure “compile-time” effect.
> Enumeration values aren’t lvalues (i.e., they don’t have an address). So, when we pass them by reference, no static memory is used. It’s almost exactly as if you passed the computed value as a literal. The first edition of this book therefore preferred the use of enumerator constants for this kind of applications.
> C++11, however, introduced constexpr static data members, and those are not limited to integral types. They do not solve the address issue raised above, but in spite of that shortcoming they are now a common way to produce results of metaprograms.

t4c1 · Aug 6, 2020

I did not know that, and I have to say I don't fully understand it. How can enums be passed around without taking memory?

Anyway, I am fine with changing this to an enum if that is preferred.

andrjohns · Aug 6, 2020

I definitely don't know enough to have a preference here; it's just something I stumbled on recently that I thought was interesting.

SteveBronder · Aug 6, 2020

Reading the section, I think the only concern is when the static constexpr is used by a function taking a reference? imo I like the static constexpr version, and I don't think the performance difference here would be drastic either way.
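The difference under discussion can be seen in a small sketch. The names `Pow3Enum`, `Pow3Constexpr`, and `by_ref` are made up for illustration (the quoted passage only names `Pow3` and `foo`), and the note about C++17 is a general language fact rather than anything specific to this PR.

```cpp
#include <cassert>

// Compile-time 3^N stored as an enumerator: a pure value with no address.
template <int N>
struct Pow3Enum {
  enum { value = 3 * Pow3Enum<N - 1>::value };
};
template <>
struct Pow3Enum<0> {
  enum { value = 1 };
};

// Compile-time 3^N stored as a static constexpr member: an lvalue.
template <int N>
struct Pow3Constexpr {
  static constexpr int value = 3 * Pow3Constexpr<N - 1>::value;
};
template <>
struct Pow3Constexpr<0> {
  static constexpr int value = 1;
};

// Out-of-class definitions. Before C++17 these are needed once the member is
// bound to a reference (odr-used); from C++17 on, static constexpr members
// are implicitly inline and these lines are redundant but still accepted.
template <int N>
constexpr int Pow3Constexpr<N>::value;
constexpr int Pow3Constexpr<0>::value;

// Taking the argument by reference is what exposes the difference.
inline int by_ref(int const& x) { return x; }

static_assert(Pow3Enum<3>::value == 27, "enum form is a constant expression");
static_assert(Pow3Constexpr<3>::value == 27,
              "constexpr form is a constant expression");
```

Calling `by_ref(Pow3Enum<7>::value)` binds the reference to a temporary int, so no static storage is involved. Calling `by_ref(Pow3Constexpr<7>::value)` binds to the member itself, which is why the member needs a definition in pre-C++17 code. When the constants are only read by value, as with RowsAtCompileTime and ColsAtCompileTime in template arguments, neither form should cost anything at run time.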

t4c1 · 2020-08-06T06:56:21Z

stan/math/rev/core/vari.hpp

-        adj_(x),
-        chainable_alloc() {
+  template <typename S, require_convertible_t<S&, T>* = nullptr>
+  explicit vari_value(S&& x) : val_(x), adj_(x), chainable_alloc() {


Suggested change

-  explicit vari_value(S&& x) : val_(x), adj_(x), chainable_alloc() {
+  explicit vari_value(const S& x) : val_(x), adj_(x), chainable_alloc() {

SteveBronder · Aug 6, 2020

Actually, should these be forwarding? The sparse one holds an actual sparse matrix, while the dense matrix one should use const S& since it always allocates memory.

SteveBronder · Aug 6, 2020

I changed them all to be const S&

t4c1 · Aug 6, 2020

I thought sparse vari used maps too. Then you are right and forwarding can be taken advantage of. In that case you also need to call forward on the assignment.
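The trade-off in this thread can be sketched with two toy holders. Both are hypothetical and use `std::vector<double>` in place of a matrix: `owning_holder` keeps the object it is given, so forwarding can save a copy, while `copying_holder` always copies elements into memory it allocated itself, so `const S&` loses nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Owns its storage, like a vari that holds an actual sparse matrix.
struct owning_holder {
  std::vector<double> val_;

  // Forwarding reference: an rvalue argument is moved into val_, an lvalue
  // is copied. Without std::forward, x is a named lvalue and always copies.
  template <typename S,
            typename std::enable_if<
                std::is_convertible<S&, std::vector<double>>::value>::type* =
                nullptr>
  explicit owning_holder(S&& x) : val_(std::forward<S>(x)) {}
};

// Copies into memory it allocated itself, like a Map over arena memory.
// Nothing can be taken over from the argument, so const S& is sufficient.
struct copying_holder {
  std::vector<double> buffer_;

  template <typename S>
  explicit copying_holder(const S& x) : buffer_(x.size()) {
    for (std::size_t i = 0; i < x.size(); ++i) buffer_[i] = x[i];
  }
};
```

Constructing `owning_holder` from `std::move(src)` takes over the vector's buffer, whereas constructing it from `src` leaves the source intact. `copying_holder` behaves the same for both value categories, which is the reason given above for keeping `const S&` on the dense constructors.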

t4c1 · 2020-08-06T06:56:37Z

stan/math/rev/core/vari.hpp

-        adj_(x),
-        chainable_alloc() {
+  template <typename S, require_convertible_t<S&, T>* = nullptr>
+  vari_value(S&& x, bool stacked) : val_(x), adj_(x), chainable_alloc() {


Suggested change

-  vari_value(S&& x, bool stacked) : val_(x), adj_(x), chainable_alloc() {
+  vari_value(const S& x, bool stacked) : val_(x), adj_(x), chainable_alloc() {


stan-buildbot · 2020-08-07T01:25:11Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.05	4.13	0.98	-1.97% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.96	-4.04% slower
eight_schools/eight_schools.stan	0.09	0.09	0.95	-5.77% slower
gp_regr/gp_regr.stan	0.19	0.19	1.0	0.31% faster
irt_2pl/irt_2pl.stan	5.39	5.41	1.0	-0.39% slower
performance.compilation	87.26	85.93	1.02	1.52% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.82	7.66	1.02	2.02% faster
pkpd/one_comp_mm_elim_abs.stan	27.8	27.16	1.02	2.3% faster
sir/sir.stan	130.29	131.59	0.99	-1.0% slower
gp_regr/gen_gp_data.stan	0.04	0.05	0.99	-1.19% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.02	2.99	1.01	1.04% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.41	0.39	1.06	5.9% faster
arK/arK.stan	1.78	1.8	0.99	-1.37% slower
arma/arma.stan	0.59	0.73	0.81	-23.5% slower
garch/garch.stan	0.53	0.57	0.93	-7.55% slower
Mean result: 0.981587372336

Jenkins Console Log
Blue Ocean
Commit hash: 32045cc


SteveBronder · 2020-08-07T03:03:05Z

Cool cool so I think this is ready to go!

serban-nicusor-toptal and others added 30 commits July 18, 2019 23:24

- 4727ac7 Fixed typo in release notes
- 2db4561 Updated 2.21.0 release notes.
- a707039 Revert "release/v2.21.0: updating version numbers" (This reverts commit a50deb9.)
- 2f037ae Release v3.0.0
- 31151b2 Merge branch 'master' of github.com:stan-dev/math
- d4e4f15 Fixed conflict in stan/math/prim/fun/log_softmax.hpp
- 3f8fae4 Updated math version
- 067add1 Reverted release of 3.1.1
- b96e401 Fixed conflict in stan/math/prim/fun/log_softmax.hpp
- ab50cb5 Fixed conflict in stan/math/prim/fun/log_softmax.hpp
- ba8e315 Updated math to v3.1.1
- ea955b8 Reverted release 3.1.1
- f899a98 Fixed conflict in stan/math/prim/fun/log_softmax.hpp
- a5d71b2 Fixed conflict in stan/math/prim/fun/log_softmax.hpp
- e081b4e Release v3.1.1
- 5f28b8f Fixed merge conflicts
- ebbfeb8 Adds templates to vari and var as well as the var operators. Fixes st… …d::numeric_limits so that it conforms to the C++11 definition
- d7e1d84 Add a template to grad so that init_dependent can be non-virtual, mak… …e vari_base a pure virtual base class
- ba930da add final to classes inheriting op_varis so flto can more aggressivly… … devirtualize
- 244620a cleanup templates in var and vari, clang-format, tests for is_var, is… …_vari, and avoiding copies for integral type vari_value
- 92630dd add ref to require_convertible check in vari constructors
- d5139c6 Merge remote-tracking branch 'origin/develop' into feature/vari-base-… …templates
- 2dfa329 [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2 (tag… …s/RELEASE_600/final)
- f045272 fix header check
- 4b5fc67 Fix templates for hmm
- 1fe6e2b [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2 (tag… …s/RELEASE_600/final)
- b63acdc add template to lp value in in positive_constrain
- df22a0b Merge branch 'feature/vari-base-templates' of github.com:stan-dev/mat… …h into feature/vari-base-templates
- f20d98a replace vector func value types in apply_scalar_binary with return ty… …pe of function
- 16db81b [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2 (tag… …s/RELEASE_600/final)
- e290f12 merge to develop

bbbales2 reviewed Aug 5, 2020

SteveBronder and others added 3 commits August 5, 2020 18:34

- bf096b9 Add grad() method without an input vari and add template paramter to … …var_value's grad so that it is only available when the value_type of the var_value is not a container
- d2edb26 [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0… …4.1 (tags/RELEASE_600/final)
- 614ae97 fix docs in vari for matrix_cl

bbbales2 reviewed Aug 6, 2020

SteveBronder added 2 commits August 5, 2020 22:00

- bf38a76 use grad() in grad(Vari*)
- c6af0b0 Move grad() in front of grad(Vari*)

bbbales2 previously approved these changes Aug 6, 2020

t4c1 requested changes Aug 6, 2020

- 7945f66 merge to develop and use const S& instead of S&& in var_value and var… …i_value

SteveBronder dismissed bbbales2’s stale review via 7945f66 August 6, 2020 17:49

stan-buildbot and others added 3 commits August 6, 2020 17:55

- 4e70ba6 [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0… …4.1 (tags/RELEASE_600/final)
- 8cec8bb Use pf in var_value for the sparse matrix constructor
- 32045cc [Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0… …4.1 (tags/RELEASE_600/final)

t4c1 approved these changes Aug 7, 2020

SteveBronder merged commit 988a44a into develop Aug 7, 2020

t4c1 mentioned this pull request Aug 10, 2020

Add support for var_value in operands_and_partials #1970

Merged

5 tasks

rok-cesnovar mentioned this pull request Oct 13, 2020

Release of Stan Math 3.4 #2128

Closed


