Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compound Gamma-Poisson Distribution #2775

Closed
wants to merge 9 commits into from
Closed

Compound Gamma-Poisson Distribution #2775

wants to merge 9 commits into from

Conversation

andrjohns
Copy link
Collaborator

@andrjohns andrjohns commented Jun 30, 2022

Summary

This PR adds a new compound distribution: poisson_gamma(), as the likelihood for a Poisson random variate with a Gamma prior for the rate parameter, after marginalising out the rate parameter.

All distribution functions are implemented using the existing neg_binomial_2 distribution, and reference values for testing calculated using the GammaPoiss family in the extraDistr R package: https://rdrr.io/cran/extraDistr/man/GammaPoiss.html

Tests

prim tests added only, as the gradients are handled by the neg_binomial_2_ functions

Side Effects

N/A

Release notes

Introduced new gamma_poisson() compund distribution

Checklist

  • Math issue New compound distribution - poisson_gamma #2772

  • Copyright holder: Andrew Johnson

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@andrjohns andrjohns changed the title Compound -Gamma Distribution Compound Gamma-Poisson Distribution Jun 30, 2022
Copy link
Member

@rok-cesnovar rok-cesnovar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. The only thing is removing the _log suffix files. Those are not required anymore in stanc3 so there is no reason to add them in Math.

stan/math/prim/prob/gamma_poisson_ccdf_log.hpp Outdated Show resolved Hide resolved
stan/math/prim/prob/gamma_poisson_cdf_log.hpp Outdated Show resolved Hide resolved
stan/math/prim/prob/gamma_poisson_log.hpp Outdated Show resolved Hide resolved
@bob-carpenter
Copy link
Contributor

This is just another parameterization of the negative binomial and I think it should be named as such. If we have gamma-Poisson and negative-binomial, it'll be confusing, because they're effectively synonyms.

I don't know how to resolve the naming---I'd name after parameters if possible rather than with numbering as we've done so far.

@andrjohns
Copy link
Collaborator Author

This is just another parameterization of the negative binomial and I think it should be named as such. If we have gamma-Poisson and negative-binomial, it'll be confusing, because they're effectively synonyms.

I don't know how to resolve the naming---I'd name after parameters if possible rather than with numbering as we've done so far.

I might disagree about the naming here actually. My intention was for the distribution to be viewed analogously to the beta_binomial - explicitly as a means of estimating a specific distribution with a latent parameter marginalised out. While the negative-binomial and Gamma+Poisson distributions are essentially synonymous, the parameters of the negative-binomial still need to be transformed to specify the desired Gamma prior. This gamma_poisson() was intended to provide a simpler approach to specifying the model by allowing the user to provide the Gamma shape & rate directly.

It's also likely that some less-statistical users might not be aware of the relationship with the negative-binomial, so I think this also provides a clearer implementation for unfamiliar users.

@bob-carpenter
Copy link
Contributor

I'd rather mention in the doc that we have a negative binomial with standard gamma prior parameterization. Then someone could find it. I very much do not want to introduce gamma-poisson as a new distribution name given that it really is just another way of saying "negative binomial". I don't think we'll have many statistically unsophisticated users thinking they need a marginalized gamma-Poisson but walking away because they've never heard of the negative binomial.

But the real question is why are we adding this distribution? I would think the mean + over dispersion parameterization given by negative-binomial-2 would be the most natural to parameterize. Having the mean be alpha / beta is a mess in similar situations like the beta distribution (where it's alpha / (alpha + beta)), for which a reparameterization in terms of mean and dispersion is much more effective.

So I think we disagree here and need a way to resolve.

@bob-carpenter
Copy link
Contributor

prim tests added only, as the gradients are handled by the neg_binomial_2_ functions

We need to include autodiff tests even if we think we know they're going to work for some external reason (like delegating to neg-binomial-2). That way, if someone adds analytic gradients or otherwise monkeys with the function, the tests will be in place. Also, it's surprisingly easy to write things that autodiff fails for even when we think they should work (like square(sqrt(x)) type problems, which for x = 0 returns NaN (0 / 0) rather than returning a derivative of 1.)

@andrjohns
Copy link
Collaborator Author

I'm not in any way determined on having this included. If there's a preference against it, I'm happy to close this PR and instead add a section to the docs or similar explaining the option

@andrjohns andrjohns closed this Jul 7, 2022
@bob-carpenter
Copy link
Contributor

Hi, @andrjohns. I was just expressing my own preference---I don't claim to speak for everyone. So you might want to ask others if they have a use for it and what they'd like it to be called.

@betanalpha
Copy link
Contributor

betanalpha commented Aug 23, 2022 via email

@bob-carpenter
Copy link
Contributor

I'd like to deal with these on a case by case basis. I think having too many distributions of the same kind (like multiple gamma-Poisson, i.e., neg binomial) is confusing, but on the other hand, it's error prone to reparameterize yourself, especially around distributions with multiple standard parameterizations like normal and gamma and negative binomial.

I like that we have the beta proportion distribution in addition to the standard one, because the new one's more natural to formulate priors for. I like having the Cholesky factor multi-normals because they're more efficient than the covariance version even though we could feed L * L' into the regular one and get the same result (up to numerical issues and speed). I'm not convinced we need the compound gamma-Poisson because the gamma distribution doesn't lead to natural priors (it's like the beta that way in how it tangles the parameters), so I don't see the point. I absolutely don't think it should be our goal to implement as many known distributions as we can.

@betanalpha
Copy link
Contributor

betanalpha commented Sep 23, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants