Compound Gamma-Poisson Distribution #2775
Conversation
Looks great. The only thing is removing the _log suffix files. Those are not required anymore in stanc3 so there is no reason to add them in Math.
This is just another parameterization of the negative binomial and I think it should be named as such. If we have gamma-Poisson and negative-binomial, it'll be confusing, because they're effectively synonyms. I don't know how to resolve the naming---I'd name after parameters if possible rather than with numbering as we've done so far.
I might disagree about the naming here actually. My intention was for the distribution to be viewed analogously to the

It's also likely that some less-statistical users might not be aware of the relationship with the negative binomial, so I think this also provides a clearer implementation for unfamiliar users.
I'd rather mention in the doc that we have a negative binomial with standard gamma prior parameterization. Then someone could find it. I very much do not want to introduce gamma-poisson as a new distribution name given that it really is just another way of saying "negative binomial". I don't think we'll have many statistically unsophisticated users thinking they need a marginalized gamma-Poisson but walking away because they've never heard of the negative binomial.

But the real question is why are we adding this distribution? I would think the mean + overdispersion parameterization given by negative-binomial-2 would be the most natural to parameterize. Having the mean be

So I think we disagree here and need a way to resolve.
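For reference, the relationship at issue can be written out explicitly. This is a sketch of the marginalization, assuming the Gamma prior uses the usual shape/rate convention (the PR does not spell out its parameter convention here):

```latex
% Marginalizing a Gamma(alpha, beta) prior (shape alpha, rate beta; assumed convention)
% over the rate lambda of a Poisson observation y:
\begin{aligned}
p(y \mid \alpha, \beta)
  &= \int_0^\infty \operatorname{Poisson}(y \mid \lambda)\,
     \operatorname{Gamma}(\lambda \mid \alpha, \beta)\, \mathrm{d}\lambda \\
  &= \frac{\Gamma(y + \alpha)}{y!\,\Gamma(\alpha)}
     \left(\frac{\beta}{\beta + 1}\right)^{\alpha}
     \left(\frac{1}{\beta + 1}\right)^{y},
\end{aligned}
```

which is a negative binomial with mean α/β and variance α/β + α/β². Matching moments against the mean/dispersion parameterization gives `neg_binomial_2(mu, phi)` with μ = α/β and φ = α, which is the reparameterization being debated above.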
We need to include autodiff tests even if we think we know they're going to work for some external reason (like delegating to neg-binomial-2). That way, if someone adds analytic gradients or otherwise monkeys with the function, the tests will be in place. Also, it's surprisingly easy to write things that autodiff fails for even when we think they should work (like
I'm not in any way determined on having this included. If there's a preference against it, I'm happy to close this PR and instead add a section to the docs or similar explaining the option.
Hi, @andrjohns. I was just expressing my own preference---I don't claim to speak for everyone. So you might want to ask others if they have a use for it and what they'd like it to be called. |
I’ll add a vote against including this because it’s right on the precipice of a very slippery slope — many density functions can be written as mixtures of two other density functions. There’s the compound Poisson-gamma giving a negative binomial but also a compound normal-exponential giving a Laplace, a compound normal-gamma giving a Student t, etc, etc.
The beta-binomial is a little different in that the mixture doesn’t result in a common probability mass function, so it has to be implemented as its own family.
Personally I think that focusing on implementing families of density functions that are not yet implemented, and avoiding redundant families that are just reparameterizations of each other, will help keep the language from being overburdened. Relationships like the compound Poisson-gamma and the existing negative binomial (either of them) would be excellent contributions to the documentation so that users could learn about the relationships and how to implement the compound distributions using the existing families. But there is definitely a trade-off in user effort and language compactness.
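A minimal sketch of what such copy-pasteable documentation could look like, as a hypothetical Stan program rather than anything in this PR; the names `N`, `y`, `alpha`, and `beta` are illustrative, and the shape/rate Gamma convention from the derivation above is assumed:

```stan
// Gamma-Poisson compound model with the rate marginalized out:
//   lambda ~ gamma(alpha, beta);  y ~ poisson(lambda);
// is equivalent to
//   y ~ neg_binomial_2(alpha / beta, alpha);
data {
  int<lower=0> N;
  array[N] int<lower=0> y;
}
parameters {
  real<lower=0> alpha;  // Gamma shape
  real<lower=0> beta;   // Gamma rate
}
model {
  // priors on alpha and beta omitted for brevity
  y ~ neg_binomial_2(alpha / beta, alpha);  // mean = alpha / beta, dispersion = alpha
}
```

If I have the definitions right, Stan's existing `neg_binomial(alpha, beta)` is this same shape/rate parameterization, which is presumably the "negative binomial with standard gamma prior parameterization" mentioned earlier in the thread, so the documentation could also simply point readers at that.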
I'd like to deal with these on a case by case basis. I think having too many distributions of the same kind (like multiple gamma-Poisson, i.e., neg binomial) is confusing, but on the other hand, it's error prone to reparameterize yourself, especially around distributions with multiple standard parameterizations like normal and gamma and negative binomial. I like that we have the beta proportion distribution in addition to the standard one, because the new one's more natural to formulate priors for. I like having the Cholesky factor multi-normals because they're more efficient than the covariance version even though we could feed L * L' into the regular one and get the same result (up to numerical issues and speed).
> I'd like to deal with these on a case by case basis. I think having too many distributions of the same kind (like multiple gamma-Poisson, i.e., neg binomial) is confusing, but on the other hand, it's error prone to reparameterize yourself, especially around distributions with multiple standard parameterizations like normal and gamma and negative binomial.

No disagreement on the error proneness of implementing reparameterizations by hand. I think the best trade-off is implementing reparameterized density families or documenting the reparameterizations with code that can be copy-pasted.
> I like having the Cholesky factor multi-normals because they're more efficient than the covariance version even though we could feed L * L' into the regular one and get the same result (up to numerical issues and speed).
Isn’t this just an implementation problem? Technically, if the multi_normal calls the efficient Cholesky code it should be just as efficient as Cholesky’ing the covariance matrix and then passing that to multi_normal_cholesky. The real difference is when modeling an unstructured covariance matrix itself, in which case modeling the Cholesky factor directly, and plugging it into multi_normal_cholesky, will be more efficient.
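For concreteness, a small sketch of that second case, with hypothetical names throughout: the Cholesky factor of an unstructured correlation matrix is modeled directly and passed to `multi_normal_cholesky`, so no covariance decomposition happens inside the density evaluation.

```stan
data {
  int<lower=1> N;
  int<lower=1> K;
  array[N] vector[K] y;
}
parameters {
  vector[K] mu;
  vector<lower=0>[K] sigma;    // marginal scales
  cholesky_factor_corr[K] L;   // Cholesky factor of the correlation matrix
}
model {
  mu ~ normal(0, 1);
  sigma ~ normal(0, 1);        // half-normal via the lower bound
  L ~ lkj_corr_cholesky(2);
  // the covariance matrix is never formed or decomposed explicitly
  y ~ multi_normal_cholesky(mu, diag_pre_multiply(sigma, L));
}
```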
> I like that we have the beta proportion distribution in addition to the standard one, because the new one's more natural to formulate priors for.
> I'm not convinced we need the compound gamma-Poisson because the gamma distribution doesn't lead to natural priors (it's like the beta that way in how it tangles the parameters), so I don't see the point. I absolutely don't think it should be our goal to implement as many known distributions as we can.
I would argue that beta_proportion is more commonly applied to constrained regression models, i.e.
y ~ beta_proportion( logistic(alpha + beta * x) , psi)
than non-uniform prior modeling for unit-interval constrained parameters, but that’s a judgment call.
The complication is that this is only one parameterization of the beta family that isolates the mean — for example in https://betanalpha.github.io/assets/case_studies/probability_densities.html#24_the_beta_family I argue that another parameterization (using the inverse of the beta_proportion scale parameter) is more useful in practice. A similar argument can be made for the second argument of the neg_binomial_2 family. Transforming between these two is relatively straightforward because the reparameterization is limited to a single parameter, but it’s still annoying.
On the prior modeling side I hesitate because I think that there is too much heterogeneity in the community. For example, when using a beta prior density I use quantile constraints, and then transform them into alpha and beta parameters, rather than mean +/- std_dev heuristics. This avoids the need for beta_proportion entirely. Using prior modeling to motivate which families to implement can also lead to detritus in the language as strategies change, for example the rarely used neg_binomial in Stan (which, frustratingly, is not equal to the standard negative binomial process definition, which makes implementing negative binomial models a pain).
All of that is to say that I think the only “fair” restriction to make is structural rather than interpretational. For example, allowing the implementation of a “standard/conventional” and a “mean” parameterization of families (the latter covering regression and some prior modeling strategies) but requiring users to reparameterize for any others themselves. Exceptions could be made for convolutions that can’t be implemented with a simple reparameterization of an existing family.
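For reference, the single-parameter transform being discussed, written with what I take to be Stan's mean/concentration convention for beta_proportion (μ in (0, 1), κ > 0):

```latex
% beta_proportion in terms of the standard shape/shape beta family
\mathrm{beta\_proportion}(\theta \mid \mu, \kappa)
  = \mathrm{Beta}\!\bigl(\theta \mid \alpha = \mu\kappa,\; \beta = (1 - \mu)\kappa\bigr)
```

So moving between this form, the standard shape/shape form, or a variant that uses 1/κ only requires transforming that single second parameter, which is the point made above about the reparameterization being straightforward but annoying.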
Summary
This PR adds a new compound distribution: `poisson_gamma()`, as the likelihood for a Poisson random variate with a Gamma prior for the rate parameter, after marginalising out the rate parameter. All distribution functions are implemented using the existing `neg_binomial_2` distribution, and reference values for testing calculated using the `GammaPoiss` family in the `extraDistr` R package: https://rdrr.io/cran/extraDistr/man/GammaPoiss.html

Tests
`prim` tests added only, as the gradients are handled by the `neg_binomial_2_` functions

Side Effects
N/A
Release notes
Introduced new `gamma_poisson()` compound distribution

Checklist
Math issue: New compound distribution - `poisson_gamma` #2772

Copyright holder: Andrew Johnson
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing (`./runTests.py test/unit`, `make test-headers`, `make test-math-dependencies`, `make doxygen`, `make cpplint`)
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested