
Vectorised log_mix with gradients #664

Merged (23 commits) on Jan 31, 2018

Conversation

andrjohns (Collaborator)

Submission Checklist

  • Run unit tests: ./runTests.py test/unit
  • Run cpplint: make cpplint
  • Declare copyright holder and open-source license: see below

Summary:

Introduced a prim/mat version of log_mix with analytic gradients that takes a vector of mixture probabilities and a vector of densities (of arbitrary, but equal, length).

So the signature would be:

real = log_mix(vector, vector);
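For context, here is a sketch of the quantity being computed and its analytic gradients, written in the same informal notation used elsewhere in this thread (inferred from the derivative expressions quoted in the review below):

log_mix(theta, lambda) = log( sum_k theta[k] * exp(lambda[k]) )
                       = log_sum_exp( log(theta) + lambda )

d/d theta[k]  : exp(lambda[k] - log_mix(theta, lambda))
d/d lambda[k] : theta[k] * exp(lambda[k] - log_mix(theta, lambda))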

Intended Effect:

Make specifying mixtures much simpler and more efficient.

This has been on my own wishlist for a while, figured I might as well try and tackle it myself!

How to Verify:

A unit test is included that checks equality against the equivalent log_sum_exp() parameterisation.
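As an illustration only (a minimal sketch, not the actual test file; the test name and values here are made up), the kind of equality being checked looks like this:

#include <stan/math/prim/mat.hpp>
#include <gtest/gtest.h>

TEST(MathMatrix, log_mix_matches_log_sum_exp_sketch) {
  Eigen::VectorXd theta(3), lambda(3);
  theta << 0.2, 0.5, 0.3;      // mixing proportions (a simplex)
  lambda << -1.3, -2.1, -0.4;  // component log densities

  // log_mix(theta, lambda) should equal log_sum_exp(log(theta) + lambda)
  Eigen::VectorXd lp = (theta.array().log() + lambda.array()).matrix();
  EXPECT_FLOAT_EQ(stan::math::log_sum_exp(lp),
                  stan::math::log_mix(theta, lambda));
}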

Side Effects:

Documentation:

Inline

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Andrew Johnson

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@stan-buildbot (Contributor)

Can one of the admins verify this patch?

@seantalts (Member)

Jenkins, retest this please.

@bbbales2 (Member) left a comment

This has been sitting here a while, so I figured I'd leave some comments. I'll leave the Official Review for one of the Columbia folks. Other than the inline comments, I think this function should have rev/fwd/mix tests as well.

This'll be cool to get in. Anything that makes mixture models easier to read is a plus :D.

We'll see if Sean's link on Discourse worked and I manage to communicate this stuff without sounding like a jerk :P.

const size_t N = theta.size();

for (size_t n = 0; n < N; ++n) {
check_not_nan(function, "lambda", lambda[n]);
Member:

I think all these checks have vectorized versions, so check_not_nan(function, "lambda", lambda) etc. should just work.

Contributor:

Yes, you want to use vectorized versions so that the error messages can report indexes.

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> theta_dbl(N, 1);
scalar_seq_view<T_theta> theta_vec(theta);
for (size_t n = 0; n < N; ++n)
theta_dbl[n] = value_of(theta_vec[n]);
Member:

Can you get rid of this extra indentation on the for loop and theta_dbl? And elsewhere in this pull?

Contributor:

Statements should left-align under previous statements unless nested in a block.

lam_deriv[n] = theta_deriv[n] * theta_dbl[n];

operands_and_partials<T_theta, T_lam> ops_partials(theta, lambda);
if (!(is_constant_struct<T_theta>::value
Member:

Does the outer if need to be here? Won't the inner ifs do the job?

Contributor:

@bbbales2 is right that the outer if doesn't need to be written.

I also think it's wrong as written because you won't match both is_constant_struct<T_lam> and !is_constant_struct<T_lam>. So this needs a test that will catch this error.

theta_dbl);

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> theta_deriv(N, 1);
theta_deriv = exp_lambda_dbl / dot_exp_lam_theta;
Member:

Can this not be computed like:

exp(lambda_dbl - log_sum_exp(logp_tmp))

To the same effect? In this way the log_sum_exp handles the overflow trick, and it simplifies the code a bit.
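For reference, a sketch of the identity behind this suggestion:

exp(lambda[n]) / sum_k( theta[k] * exp(lambda[k]) )
  = exp( lambda[n] - log_sum_exp(log(theta) + lambda) )

and log_sum_exp itself applies the max-shift (log_sum_exp(x) = max(x) + log(sum(exp(x - max(x))))), so the explicit max_val subtraction elsewhere in the function becomes unnecessary.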

typedef typename stan::partials_return_type<T_theta, T_lam>::type
T_partials_return;

const size_t N = theta.size();
Member:

I'd use https://github.com/stan-dev/math/blob/develop/stan/math/prim/scal/meta/length.hpp here. I think Eigen types use ints and std::vectors use size_ts (so this sorta thing gives compiler warnings sometimes).
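For example (a sketch; length here is the metafunction from that header, referred to later in this thread as stan::length):

const size_t N = length(theta);  // handles Eigen vectors (int sizes) and
                                 // std::vectors (size_t sizes) uniformly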

@andrjohns (Collaborator, Author)

Thanks for that, Ben, much appreciated. I'm just in the middle of extending this to an array version (so that you can pass an array of N density vectors rather than looping over the function N times), so I'll get your comments in after that.

@bob-carpenter (Contributor)

@bbbales2: It's not just Columbia reviewers. @syclik is in charge of this repo and he's at Generable. I hope you'll feel comfortable reviewing pull requests soon.

@andrjohns (Collaborator, Author)

The array version for this will take a bit more work, so I'll make a separate pull request when that's ready.

The only thing I haven't been able to get working is the vectorised error checking. It seems that when the error checking functions call stan::get to return a value from the vector to be checked, that get is returning the full vector instead. The correct template is being called, so the function recognises that it's a vector input, but only the prim/scal version of stan::get or stan::length is being called (from what I can see).

@bob-carpenter (Contributor)

@andrjohns Would you mind creating an issue for the vectorized error checking? I don't quite understand what you're saying is a problem or how you're trying to instantiate it. Or just reply here with a link to the file you're having trouble getting to compile.

@bbbales2 (Member)

The array version for this will take a bit more work

Cool beans. So this function gives the log density for one mixture. Is the array version for doing a bunch of mixtures at once?

With regards to the checks, is it that x can be a std::vector<std::vector<var>> or something (which isn't handled by the checks now)?

If that's the case, I think it should be fine to just use loops for the checks. Just expound more on the possible input types in the docs for the function.

@andrjohns (Collaborator, Author)

Cool beans. So this function gives the log density for one mixture. Is the array version for doing a bunch of mixtures at once?

Yep, it's to add a lot of flexibility to how the mixture is specified. The end goal is to allow all combinations of vector and vector[].

  • log_mix(vector, vector[]) (e.g. N density vectors, all with the same mixture proportions)
  • log_mix(vector[], vector) (e.g. single density vector applied to N different mixture proportions)
  • log_mix(vector[], vector[]) (e.g. N density vectors, each with different mixture proportions)

Should let me do a lot less looping in Stan!

With regards to the checks, is it that x can be a std::vector<std::vector<var>> or something (which isn't handled by the checks now)?

That was just for the regular NaN/bounded/finite checks. But I found where my mistake was, so the checks are working fine without loops now.

@bob-carpenter (Contributor)

There's a third argument for the mixture simplex (or simplexes if you really go crazy).

@andrjohns (Collaborator, Author)

For the mixture proportions? Where the scalar log_mix is log_mix(prob(dens_1), dens_1, dens_2), this version has the mixture simplex as the first argument, and then a vector of the densities as the second argument: log_mix(simplex, densities).

Having the function take a vector of densities was the easiest way of allowing an arbitrary number, but I could possibly recreate the structure of the scalar log_mix with a variadic template: log_mix(simplex, dens_1, dens_2, ..., dens_N) if that's preferred.

@andrjohns (Collaborator, Author)

Speaking of simplexes, when calculating the derivatives should I be assuming that the input simplex was constructed via stick-breaking, and not calculating the partial derivative for the Kth value? I was originally allowing for the fact that some users might be passing the softmax of a vector as the probability argument, but I hadn't considered whether this would affect use with a default (stick-breaking) simplex.

@bob-carpenter (Contributor)

bob-carpenter commented Nov 17, 2017 via email

@bbbales2 (Member) left a comment

This looks good to me! Someone else will have to do the final review and handle the actual accept (I don't have repo permissions). @syclik maybe?

The only thing I see that's out of place is that the higher-order autodiff tests are all in terms of Eigen types (not std::vectors). Not sure it matters; std::vectors are tested in prim.

One question, this is for the signature

log_mix(vector theta, vector lambda)

Right?

The fancier signatures you mentioned:

log_mix(vectors theta, vector lambda)
log_mix(vector theta, vectors lambda)
log_mix(vectors theta, vectors lambda)

are gonna wait?

T_partials_return logp = log_sum_exp(logp_tmp);

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> theta_deriv(N, 1);
theta_deriv.array() = (lam_dbl.array() - logp).exp();
Member:

Fix indent

theta_deriv.array() = (lam_dbl.array() - logp).exp();

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> lam_deriv(N, 1);
lam_deriv.array() = theta_deriv.array() * theta_dbl.array();
Member:

Fix indent


TEST(MatrixFunctions, LogMix_Values) {
stan::math::vector_d prob(5, 1);
prob << 0.1, 0.3, 0.25, 0.15, 0.2;
Member:

Fix indents in this file

using stan::math::vector_v;

vector_v prob(4), dens(4);
prob << 0.13, 0.22, 0.38, 0.27;
Member:

Indents

using stan::math::row_vector_v;

row_vector_v prob(4), dens(4);
prob << 0.03, 0.21, 0.63, 0.13;
Member:

Indents

@andrjohns (Collaborator, Author)

Thanks Ben, good catch! I've added in the tests for std::vectors as well.

The fancier signatures you mentioned... are gonna wait?

Yep, this is just for the log_mix(vector theta, vector lambda) signature. I'm still working out the kinks with the gradients for the different combinations so they'll take a while longer.

@andrjohns (Collaborator, Author)

Could someone start the testing for this on Jenkins? Thanks!

@bob-carpenter (Contributor)

Jenkins, OK to test.

@bob-carpenter (Contributor)

I'll review this one.

@bob-carpenter (Contributor) left a comment

Hope these changes are minor.


*/
Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> exp_lambda_dbl(N, 1);
double max_val = max(lambda_dbl);
for (size_t n = 0; n < N; ++n)
Contributor:

Align each statement directly under the last except in nested blocks.


Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> theta_dbl(N, 1);
scalar_seq_view<T_theta> theta_vec(theta);
for (size_t n = 0; n < N; ++n)
Contributor:

You should create theta_dbl to be the same size as theta to cut down on memory allocation and assignments. The scalar sequence view will automatically broadcast when you use it.

for (size_t n = 0; n < N; ++n)
exp_lambda_dbl[n] = exp((lambda_dbl[n] - max_val));

T_partials_return dot_exp_lam_theta = dot_product(exp_lambda_dbl,
Contributor:

This doesn't need to be a dot product unless both are vectors; otherwise, it's more efficient as a scalar-vector operation. I'm not sure how to deal with this efficiently.

You should also be using auto as the type on the left here. That'll avoid over-promotion and is fair game now that we've opened up C++11 constructs.

lam_dbl[n] = value_of(lam_vec[n]);

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> logp_tmp(N, 1);
logp_tmp = log(theta_dbl) + lam_dbl;
Contributor:

I think this one can be a declare-define

Eigen::Matrix<T_partials_return, -1, 1> logp_tmp = log(theta_dbl) + lam_dbl;

You might want to typedef the type here as this is the second usage.

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> logp_tmp(N, 1);
logp_tmp = log(theta_dbl) + lam_dbl;

T_partials_return logp = log_sum_exp(logp_tmp);
Contributor:

I don't see another usage of logp_tmp, so I think this can all be replaced with a one-liner:

T_partials_return logp = log_sum_exp(log(theta_dbl) + lam_dbl);

The suffix _tmp should be avoided. Here, the variable could just be called logp. All variables are "temp" in some sense and shorter is better all else being equal.

Collaborator Author:

I had to split this into two steps because the return type of log(theta_dbl) + lam_dbl is a 'CwiseBinaryOp', so log_sum_exp can't deal with it until Eigen assigns the result to a vector.

I'm not sure if there's a way of using the matrix/array wrappers to avoid this, but I haven't had any luck with figuring it out.
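One possible workaround, as an untested sketch (assuming log_sum_exp is templated on Eigen::Matrix, so forcing evaluation of the expression is enough for deduction to succeed):

// .eval() materialises the CwiseBinaryOp into an Eigen vector, so the
// whole thing can stay a one-liner
T_partials_return logp = log_sum_exp((log(theta_dbl) + lam_dbl).eval());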

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> theta_deriv(N, 1);
theta_deriv.array() = (lam_dbl.array() - logp).exp();

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> lam_deriv(N, 1);
Contributor:

You don't need to convert to arrays for this. I think this can just be

Eigen::Matrix<T_partials_return, -1, 1> lam_deriv = theta_deriv + theta_dbl;

Collaborator Author:

I used the array wrappers here because I needed an elementwise multiplication, not addition. But you were right in the end, I didn't need to convert to arrays because Eigen has the .cwiseProduct() operator
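i.e., roughly (a sketch of the declare-define form using cwiseProduct):

Eigen::Matrix<T_partials_return, Eigen::Dynamic, 1> lam_deriv
    = theta_deriv.cwiseProduct(theta_dbl);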


operands_and_partials<T_theta, T_lam> ops_partials(theta, lambda);
if (!is_constant_struct<T_theta>::value) {
for (int n = 0; n < N; ++n)
Contributor:

Can't this just be

ops_partials.edge1_ = theta_deriv;


if (!is_constant_struct<T_lam>::value) {
for (int n = 0; n < N; ++n)
ops_partials.edge2_.partials_[n] = lam_deriv[n];
Contributor:

same comment

@bob-carpenter (Contributor)

@andrjohns Even though I approved this, I had a bunch of suggestions on how to make it better. Would you mind if I made the changes then pushed?

@andrjohns (Collaborator, Author)

andrjohns commented Dec 30, 2017 via email

@syclik (Member)

syclik commented Jan 20, 2018

@bob-carpenter, want to make the fixes you suggested so we can get this merged?

@bob-carpenter (Contributor)

bob-carpenter commented Jan 20, 2018

Yes, I can make the fixes.

@andrjohns (Collaborator, Author)

Sorry about the delay with this one, I'm catching up on my backlog now. I can make these changes and push them tonight (which would be tomorrow morning Columbia-time)

@bob-carpenter (Contributor)

bob-carpenter commented Jan 30, 2018 via email

@bob-carpenter (Contributor)

bob-carpenter commented Jan 30, 2018 via email

@andrjohns (Collaborator, Author)

andrjohns commented Jan 30, 2018 via email

@bob-carpenter (Contributor)

Sorry for all the false alarms and thanks for clarifying. I'll merge when the tests pass. Thanks.

@seantalts (Member)

Merging!

re: operands_and_partials generalization - if this was waiting for multivariate / containers of containers, Sebastian got that in a couple of weeks ago I think.

@seantalts merged commit 4f382fb into stan-dev:develop on Jan 31, 2018
@bob-carpenter (Contributor)

bob-carpenter commented Jan 31, 2018 via email
