
Fixing negative binomial phi cutoff #1497

Merged

Conversation


@martinmodrak martinmodrak commented Dec 9, 2019

Summary

This introduces more tests of the behavior of the negative binomial 2 distribution when phi is large and fixes issues discovered along the way. Currently:

  • improved numerical stability of the density by using binomial_coefficient_log and rearranging some operations
  • improved numerical stability of the derivative w.r.t. phi
  • this removed the need to delegate to the Poisson density for large phi, so that whole code branch was deleted
  • addressed a potential size mismatch noticed in Cleanup code in neg_binomial_2_lpmf and neg_binomial_2_log_lpmf #1531
  • renamed local variables that had double underscores in their names

This PR requires the improvements from PR #1614 to be merged before it is merged.
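For reference, the NB2 density being stabilized is C(n + phi - 1, n) · (mu/(mu+phi))^n · (phi/(mu+phi))^phi, with the binomial coefficient evaluated on the log scale. A minimal Python sketch of that value via lgamma (illustrative only — the actual Stan math code works on autodiff types and avoids recomputation via operand partials):

```python
from math import lgamma, log, exp

def neg_binomial_2_lpmf(n, mu, phi):
    """Log NB2 pmf: C(n + phi - 1, n) * (mu/(mu+phi))^n * (phi/(mu+phi))^phi.
    The log binomial coefficient is computed via lgamma, mirroring the role of
    binomial_coefficient_log in Stan math."""
    log_binom = lgamma(n + phi) - lgamma(n + 1) - lgamma(phi)
    return (log_binom
            + n * (log(mu) - log(mu + phi))
            + phi * (log(phi) - log(mu + phi)))

# sanity check: probabilities sum to 1 and the mean is mu
mu, phi = 1.5, 2.0
probs = [exp(neg_binomial_2_lpmf(n, mu, phi)) for n in range(2000)]
assert abs(sum(probs) - 1.0) < 1e-12
assert abs(sum(n * p for n, p in enumerate(probs)) - mu) < 1e-10
```

The side-effect noted below comes from this form: mu and phi meet inside log(mu + phi) in every term, so fewer pieces can be dropped when phi is constant and propto=true.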

Tests

New tests:

  • Created a set of test values and derivatives in Mathematica (Wolfram Cloud) and test against those
  • Changed the finite differences test for derivatives to cover more of the parameter space and to use complex-step differentiation instead of finite diffs. This test requires a somewhat looser tolerance for the derivative w.r.t. phi for n > 100000 and large phi. Importantly, the sign of the derivative does not always match (e.g. expected = -3.24e-10, actual = 9e-12); I am not sure whether the complex step or the function is to blame, so I assume it is OK for now.
  • Extended the test for extreme phi values to cover more of the parameter space
  • Tests of values and derivatives against analytical solutions for n = 0 and n = 1 (this allows me to test even mu and phi values where Mathematica will not calculate exact solutions).
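The n = 0 and n = 1 closed forms fall straight out of the density (the binomial coefficient is 1 and phi, respectively). A Python sketch of that comparison with a mixed relative/absolute tolerance — the function names and parameter grid are illustrative, not the actual test code:

```python
from math import lgamma, log

def lpmf(n, mu, phi):
    # general NB2 log density via lgamma
    return (lgamma(n + phi) - lgamma(n + 1) - lgamma(phi)
            + n * (log(mu) - log(mu + phi))
            + phi * (log(phi) - log(mu + phi)))

def lpmf_n0(mu, phi):
    # closed form for n = 0: the binomial coefficient is 1
    return phi * (log(phi) - log(mu + phi))

def lpmf_n1(mu, phi):
    # closed form for n = 1: the binomial coefficient reduces to phi
    return log(phi) + log(mu) - log(mu + phi) + phi * (log(phi) - log(mu + phi))

for mu in (1e-8, 1.5, 1e6):
    for phi in (1e-4, 10.0, 1e5):
        tol0 = 1e-8 * max(1.0, abs(lpmf_n0(mu, phi)))
        tol1 = 1e-8 * max(1.0, abs(lpmf_n1(mu, phi)))
        assert abs(lpmf(0, mu, phi) - lpmf_n0(mu, phi)) < tol0
        assert abs(lpmf(1, mu, phi) - lpmf_n1(mu, phi)) < tol1
```

Even in this sketch, a purely relative tolerance would fail for phi = 1e5 at n = 1, because lgamma(1 + phi) - lgamma(phi) subtracts two values around 1e6; that is the same class of cancellation this PR is chasing.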

Side Effects

  • The way the density for neg_binomial_2_lpmf is computed has changed to improve stability, at the cost of tangling mu and phi together earlier, so less computation can be avoided when propto=true and phi is fixed.
  • Generally I would expect a very minor speed drop for some parameter values (where we previously delegated to Poisson, which is probably cheaper).

Checklist

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@martinmodrak (Contributor Author)

So, fixing the phi cutoff triggered some failures in the finite differences test. I tried to fix those and wrote more tests to make sure I am on the right track. Those tests revealed other issues. I failed to fix all of them so far.

In particular, there are numerical differences between the values we compute and what I get from Mathematica and the finite differences test does not pass.

@bob-carpenter (Contributor) commented Dec 10, 2019 via email

@martinmodrak commented Dec 11, 2019

Will continue the discussion here and not at the issue (hope that's OK).

There are currently two failing tests: a precomputed test using numbers from Mathematica, and the original finiteDiffs test, which I expanded to cover a wider range of values (but it fails even for the original values tested).

The numbers I get from Mathematica for the precomputed test are computed in this notebook: https://www.wolframcloud.com/obj/martin.modrak/Published/NegBinomial2_Tests.nb - I am still a Mathematica beginner, so I hope I made no silly mistake, but since I am using what they call "Exact numbers", the outputs should allow arbitrary precision (I instruct it to work with 40 digits, then convert to what they call "CForm", which hopefully stores as much precision as C would accept; number-to-string formatting in Mathematica is complicated). The precomputed test fails for some values, with relative errors up to 1e-5, all happening for either small mu or small phi. Here is a sample of the test output:

test/unit/math/rev/scal/prob/neg_binomial_2_test.cpp:84: Failure
The difference between gradients[1] and t.grad_phi is 1.7763568394002505e-015, which exceeds fabs(t.grad_phi * 1e-8), where
gradients[1] evaluates to -2.6984512402350447e-009,
t.grad_phi evaluates to -2.6984530165918841e-009, and
fabs(t.grad_phi * 1e-8) evaluates to 2.6984530165918841e-017.
grad_phi n = 14, mu = 1.5, phi = 162345

test/unit/math/rev/scal/prob/neg_binomial_2_test.cpp:82: Failure
The difference between gradients[0] and t.grad_mu is 4.0061531662412956e-017, which exceeds fabs(t.grad_mu * 1e-8), where
gradients[0] evaluates to -1.2600009924312872e-009,
t.grad_mu evaluates to -1.2600010324928188e-009, and
fabs(t.grad_mu * 1e-8) evaluates to 1.2600010324928188e-017.
grad_mu n = 10233, mu = 10586, phi = 0.00040000000000000002

The finiteDiffs test, on the other hand, fails with many values being off by 1 or more. I added one of the failing values (n = 7, mu = 8, phi = 150000) to the precomputed test; there it passes for mu but not for phi, though the error for phi is in the 6th digit, so I am starting to think this might be a problem with the finite diff.

Here is the relevant error output from the precomputed test:

gradients[1] evaluates to 1.333262389380252e-010,
t.grad_phi evaluates to   1.333280152948646e-010

And here is from the finiteDiffs test:

gradients[0] evaluates to -0.12499333368886989,
finite_diffs[0] evaluates to -1.4533085845869209, and

gradients[1] evaluates to 1.333262389380252e-010,
finite_diffs[1] evaluates to 1.1641532182693481, and

The test passed before my code changes, but all the failures are for phi values that used to delegate to Poisson, but are now below the cutoff.

The full test output is seen in the failed Jenkins build: https://jenkins.mc-stan.org/blue/organizations/jenkins/Math%20Pipeline/detail/PR-1497/9/pipeline/.
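For context on why the PR description mentions replacing finite differences with a complex step: a forward difference loses roughly half its significant digits to subtractive cancellation, while the complex step imag(f(x + ih))/h involves no subtraction at all, so h can be taken absurdly small. A Python sketch on the n = 0 density (parameter values illustrative, not the test's):

```python
import cmath
import math

def lpmf_n0(mu, phi):
    # NB2 log density at n = 0; holomorphic in phi, so a complex step applies
    return phi * (cmath.log(phi) - cmath.log(mu + phi))

mu, phi = 8.0, 10.0
analytic = math.log(phi / (mu + phi)) + mu / (mu + phi)  # exact d/dphi

# complex step: imag(f(phi + i*h)) / h, free of subtractive cancellation
h = 1e-20
cs = lpmf_n0(mu, phi + 1j * h).imag / h

# forward finite difference for comparison
eps = 1e-8
fd = (lpmf_n0(mu, phi + eps).real - lpmf_n0(mu, phi).real) / eps

err_cs = abs(cs - analytic)
err_fd = abs(fd - analytic)
assert err_cs < err_fd   # complex step is markedly more accurate
assert err_cs < 1e-12
```

When the true derivative itself is tiny (as in the failing large-phi cases above), even the complex step loses relative accuracy to rounding in the real parts, which could plausibly explain the reported sign mismatches.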

@martinmodrak (Contributor Author)

Also, I've run out of both ideas and time to improve the actual computation (if the test failures above are convincing enough to show that it needs improvement). I will be able to work on this more after New Year (unless I need very hard to procrastinate on the stuff I should be doing :-) ).

@martinmodrak (Contributor Author)

OK, obviously, I can't stop procrastinating :-) For the differences in the derivative w.r.t. phi, at least part of the problem is differing digamma results between Boost, R, and Mathematica. R code to show the differences:

n <- 7
mu <- 0.0000256
phi <- 15324.0

digamma_phi_boost <-        9.6371428769091523 
digamma_phi_mathematica <-  9.6371428769091527082 
digamma_n_phi_boost <-        9.6375995872973039
digamma_n_phi_mathematica <-  9.6375995872973037537

base <- 1.0 - (n + phi) / (mu + phi) + log(phi) - log(mu + phi)

#Expected value (by mathematica)
-8.9402263370175206e-008
#Computed by Stan math
-8.9402261593818366e-008

base - digamma_phi_boost + digamma_n_phi_boost #Matches Stan
base - digamma(phi) + digamma(n + phi) #Matches mathematica 
base - digamma_phi_mathematica + digamma_n_phi_mathematica #Matches mathematica 
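For integer n, one way to sidestep this digamma sensitivity entirely is the recurrence psi(x + 1) = psi(x) + 1/x, which turns the difference into an exact finite sum with no cancellation. A Python sketch checking that sum against the Mathematica digamma values quoted above (the identity is standard; whether Stan math should use it here is a separate question):

```python
# For integer n, psi(phi + n) - psi(phi) = sum_{k=0}^{n-1} 1/(phi + k),
# which avoids subtracting two nearly equal digamma values.
n, phi = 7, 15324.0
stable_diff = sum(1.0 / (phi + k) for k in range(n))

# difference of the Mathematica digamma values quoted in the R snippet above
mathematica_diff = 9.6375995872973037537 - 9.6371428769091527082
assert abs(stable_diff - mathematica_diff) < 1e-13
```

The direct subtraction loses about four digits here (both digammas are near 9.637, the difference is near 4.6e-4), so ULP-level disagreements between Boost and Mathematica get amplified exactly as observed.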

@martinmodrak commented Dec 11, 2019

So it turns out I was chasing very small errors. Taking absolute tolerance into account (i.e. making the tolerance max(true_value * 1e-8, 1e-14)) made the tests pass for almost all values. Slightly rearranging the computation of the density removed the remaining test failures, at the cost of saving less computation when propto=true and phi is constant.
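The mixed tolerance described here is just a relative check with an absolute floor, so near-zero expected values are not held to an impossible relative standard. A tiny Python sketch (names hypothetical):

```python
def tolerance(true_value, rel=1e-8, abs_floor=1e-14):
    # relative tolerance with an absolute floor: max(|true| * rel, abs_floor)
    return max(abs(true_value) * rel, abs_floor)

assert tolerance(0.0) == 1e-14      # floor kicks in at zero
assert tolerance(2.0) == 2e-8       # relative part dominates otherwise
assert tolerance(-3e-10) == 1e-14   # 3e-18 would be unattainable
```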

So the only remaining failing test is the finite diff. test. I am not sure what to do about this one.

The PR description has been updated to reflect the current situation.

martinmodrak and others added 3 commits December 11, 2019 13:22
Implemented gradient wrt. phi for Poisson approximation to make the test pass
@martinmodrak martinmodrak changed the title [WIP] Fixing negative binomial phi cutoff Fixing negative binomial phi cutoff Dec 11, 2019
@mcol mcol linked an issue Mar 13, 2020 that may be closed by this pull request
@martinmodrak (Contributor Author)

@bob-carpenter So that code evolved quite a bit and the required changes to binomial_coefficient_log have been merged. I believe this is now in good shape for a re-review.

@bbbales2 (Member)

I can take the review over if that's convenient, since I did the last binomial one.

@mcol (Contributor) left a comment

I had a quick look and noted a few duplicate variables and wrongly removed headers.

(Review comments on stan/math/prim/prob/neg_binomial_2_lpmf.hpp — outdated, resolved)
@martinmodrak (Contributor Author)

@mcol thanks for catching those - I was obviously sloppy with the merge, sorry for that. I resolved those.

@stan-buildbot (Contributor)


| Name | Old Result | New Result | Ratio | Performance change (1 - new/old) |
|---|---|---|---|---|
| gp_pois_regr/gp_pois_regr.stan | 4.89 | 4.88 | 1.0 | 0.17% faster |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.97 | -2.99% slower |
| eight_schools/eight_schools.stan | 0.09 | 0.09 | 0.99 | -0.6% slower |
| gp_regr/gp_regr.stan | 0.22 | 0.22 | 1.0 | 0.04% faster |
| irt_2pl/irt_2pl.stan | 6.44 | 6.46 | 1.0 | -0.31% slower |
| performance.compilation | 87.82 | 86.25 | 1.02 | 1.79% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 7.58 | 7.58 | 1.0 | 0.08% faster |
| pkpd/one_comp_mm_elim_abs.stan | 20.75 | 21.07 | 0.98 | -1.57% slower |
| sir/sir.stan | 93.74 | 93.14 | 1.01 | 0.64% faster |
| gp_regr/gen_gp_data.stan | 0.05 | 0.05 | 0.98 | -2.03% slower |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.95 | 2.95 | 1.0 | -0.13% slower |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.34 | 0.31 | 1.06 | 6.04% faster |
| arK/arK.stan | 1.74 | 1.73 | 1.0 | 0.43% faster |
| arma/arma.stan | 0.65 | 0.66 | 1.0 | -0.18% slower |
| garch/garch.stan | 0.51 | 0.51 | 1.0 | 0.09% faster |

Mean result: 1.00136891454

Commit hash: 6cfff52


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@bbbales2 (Member)

@bob-carpenter can I take this over?

@bob-carpenter (Contributor) commented Mar 17, 2020 via email

@bbbales2 (Member) left a comment

Both the comments are optional. I'll merge if you say so. I think it's super adequately tested, but it looked like there were a couple of things you meant to have there that aren't.

@martinmodrak (Contributor Author)

Thanks @bbbales2, will try to resolve shortly. Note to self: #1679 was fixed and merged, can also expand the lbeta tests.

@bbbales2 (Member)

Quick ping.

can also expand the lbeta tests.

I don't think there's a need to add many new tests here if that's what you're getting at! Just those couple things, and really I'd be fine with the pull as-is.

@martinmodrak (Contributor Author)

OK, just added the test for d/dphi at n = 1. Also, the lbeta thing was literally that I had a line of code saying "Delete this once #1679 is merged", so I deleted it :-) Tests pass on my computer; hopefully they pass on Jenkins as well.
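The d/dphi check at n = 1 can be cross-validated independently: the closed form of the log density and its hand-derived phi-derivative should agree with a complex step. A Python sketch (my own derivation from the standard NB2 density, not the code from this PR):

```python
import cmath
import math

def lpmf_n1(mu, phi):
    # NB2 log density at n = 1, written complex-safe for complex-step checking
    return (cmath.log(mu) + cmath.log(phi) + phi * cmath.log(phi)
            - (1 + phi) * cmath.log(mu + phi))

def dphi_n1(mu, phi):
    # hand-derived d/dphi of the n = 1 closed form
    return (1 / phi + math.log(phi) + 1
            - math.log(mu + phi) - (1 + phi) / (mu + phi))

mu, phi = 3.0, 7.0
h = 1e-20
cs = lpmf_n1(mu, phi + 1j * h).imag / h
assert abs(cs - dphi_n1(mu, phi)) < 1e-10 * max(1.0, abs(dphi_n1(mu, phi)))
```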

@stan-buildbot (Contributor)


| Name | Old Result | New Result | Ratio | Performance change (1 - new/old) |
|---|---|---|---|---|
| gp_pois_regr/gp_pois_regr.stan | 4.81 | 4.93 | 0.98 | -2.45% slower |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 1.0 | -0.4% slower |
| eight_schools/eight_schools.stan | 0.09 | 0.09 | 1.04 | 3.57% faster |
| gp_regr/gp_regr.stan | 0.22 | 0.22 | 0.99 | -1.03% slower |
| irt_2pl/irt_2pl.stan | 6.5 | 6.47 | 1.0 | 0.47% faster |
| performance.compilation | 89.41 | 86.72 | 1.03 | 3.01% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 7.53 | 7.52 | 1.0 | 0.11% faster |
| pkpd/one_comp_mm_elim_abs.stan | 20.89 | 20.69 | 1.01 | 0.96% faster |
| sir/sir.stan | 94.21 | 91.13 | 1.03 | 3.27% faster |
| gp_regr/gen_gp_data.stan | 0.05 | 0.05 | 1.01 | 1.21% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.95 | 2.99 | 0.99 | -1.24% slower |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.33 | 0.33 | 0.98 | -2.06% slower |
| arK/arK.stan | 1.74 | 1.75 | 0.99 | -0.84% slower |
| arma/arma.stan | 0.66 | 0.66 | 1.01 | 0.85% faster |
| garch/garch.stan | 0.51 | 0.51 | 0.99 | -0.87% slower |

Mean result: 1.00337330892

Commit hash: 4b2f032



@bob-carpenter (Contributor)

Sorry, I missed that this was blocked on me.


Successfully merging this pull request may close these issues.

The way neg_binomial_2_lpmf delegates to Poisson is broken