Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

neg_binomial_2_log_lpmf is not stable for large phi #1495

Closed
martinmodrak opened this issue Dec 9, 2019 · 8 comments · Fixed by #1830
Closed

neg_binomial_2_log_lpmf is not stable for large phi #1495

martinmodrak opened this issue Dec 9, 2019 · 8 comments · Fixed by #1830
Milestone

Comments

@martinmodrak
Copy link
Contributor

martinmodrak commented Dec 9, 2019

Description

For some values, neg_binomial_2_log_lpmf would return values larger than 0 (e.g. neg_binomial_2_log_lpmf(39, 4, 6e16) == 256) for other, it just returns noticeably different values than the corresponding neg_binomial_2_lpmf call. In all cases this is related to large phi values (noticeable discrepancies manifest already at around phi = 2e5).

This seems to be related to/a regression of #428

Example

neg_binomial_2_log_lpmf(39, 4, 6e16) == 256.
I've created a failing test on the branch: develop...martinmodrak:bugfix/1495-neg_binomial_2_log_large_phi

Expected Output

The same as neg_binomial_2_lpmf(39, exp(4), 6e16), which is -5.22991

Current Version:

v3.0.0

@martinmodrak
Copy link
Contributor Author

#1496 seems related as the phi cutoff for when to delegate to Poisson is probably not correctly chosen.

@jtimonen
Copy link

jtimonen commented Feb 3, 2020

Is it possible to pin down the part of math/stan/math/prim/prob/neg_binomial_2_lpmf.hpp where the numerical problems occur with large phi? My guess is that it's in computing the binomial coefficient, which you seem to be doing by computing the logarithms of the factorials using lgamma. Can the problem be just numerical inaccuracy of lgamma with large input? Are you using Stirling's approximation or something to compute the logarithm of the gamma function?

@jtimonen
Copy link

jtimonen commented Feb 4, 2020

Here is my idea:

nb_stirling

@martinmodrak
Copy link
Contributor Author

martinmodrak commented Feb 4, 2020

@jtimonen You are correct, my work with the non-log version in #1497 shows that the Poisson branch can be eliminated with a better implementation of binomial_coefficient_log - this better implementation is almost ready in #1614 and so all this should bubble up eventually also here. Note that my experience with reimplementing lbeta is that Stirling approximation is not enough in and of itself and corrections are needed (see #1612) to maintain precision when phi is large and x is small.

Sorry that I wasted your time as this connection was not apparent in the issue here.

@wds15
Copy link
Contributor

wds15 commented Feb 4, 2020

If I read this right, then you are suggesting to improve the lgamma implementation... I do doubt that there are issues of these sort of. We are using the lgamma_r implementation as a default and fall back to boost::math::lgamma on Windows. I have stared a lot at the boost thing and I would be relatively surprised if there is any issue with large arguments or whatever. This thing is coded with great care.

So before you embark on this, please test the lgamma if it is really not precise enough.

@martinmodrak
Copy link
Contributor Author

martinmodrak commented Feb 4, 2020

@wds15 I do not suggest to improve lgamma. The trick I mention concerns differences of lgamma calls for large arguments where precision might be lost to catastrophic cancellation (e.g. lbeta, binomial_coefficient_log). See #1612 for more details.

@wds15
Copy link
Contributor

wds15 commented Feb 4, 2020

I see. Being smart about sums of lgamma sounds like a good approach. Sorry.. I am out of context.

This is super tedious work you are looking into here. So much appreciated.

Some people quoted me for knowing these things... not really, I just have dealt with a few of these hiccups whenever I had to.

@jtimonen
Copy link

jtimonen commented Feb 4, 2020

My previous post had an mistake already in the definition of NB pmf. I am just leaving a (hopefully) correct version here If anyone is wondering. I had to derive this anyway so it wasn't waste of time

Screenshot 2020-02-05 at 1 39 33

martinmodrak added a commit to martinmodrak/math that referenced this issue Apr 10, 2020
@mcol mcol linked a pull request Apr 11, 2020 that will close this issue
5 tasks
@mcol mcol added this to the 3.1.0++ milestone Apr 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants