You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
When the total count is set to zero in the RNG for the multinomial distribution, an exception is thrown:
Exception: multinomial_rng: number of trials variables is 0, but must be positive!
However, the multinomial_lpmf function accepts zero integer arrays (and the log-prob is always 0.0, as expected). I think it would make more sense if the RNG also accepts a zero total count and just returns a zero count vector: If you don't sample anything, then all the counts per category should also be zero. Allowing this can be useful for e.g. encoding missing data.
If you agree, I can submit a simple PR
Example
Stan model. In the generated quantities block, sum(n) can't be zero
data {
int<lower=1> k; // number of categories
array[k] int<lower=0> n; // number of samples per category
}
parameters {
simplex[k] p; // sampling probability per category
}
model {
n ~ multinomial(p);
}
generated quantities {
array[k] int n_sim = multinomial_rng(p, sum(n)); // simulated samples
}
Python code to run the model. Running the model with good_data works fine. If we use bad_data, Exceptions are thrown and the generated quantities are all NaN.
import cmdstanpy
cmdstanpy.show_versions()
sm = cmdstanpy.CmdStanModel(stan_file="multinomial_zero_error.stan")
good_data = {
"k" : 3,
"n" : [3,2,5]
}
sam = sm.sample(data=good_data, iter_sampling=100)
print(sam.summary())
# set all counts to zero. Not an issue for multinomial_lpmf
# but multinomial_rng throws an exception
bad_data = {
"k" : 3,
"n" : [0,0,0]
}
sam = sm.sample(data=bad_data, iter_sampling=100)
print(sam.summary())
Exception: multinomial_rng: number of trials variables is 0, but must be positive! (in 'multinomial_zero_error.stan', line 15, column 4 to column 52)
Exception: multinomial_rng: number of trials variables is 0, but must be positive! (in 'multinomial_zero_error.stan', line 15, column 4 to column 52)
...
Mean MCSE StdDev ... N_Eff N_Eff/s R_hat
lp__ -4.470830 0.091875 1.141650 ... 154.409 5514.62 1.013780
p[1] 0.347875 0.012957 0.233269 ... 324.102 11575.10 0.996473
p[2] 0.335542 0.012238 0.229005 ... 350.137 12504.90 0.999920
p[3] 0.316583 0.011180 0.234400 ... 439.615 15700.50 0.996259
n_sim[1] NaN NaN NaN ... NaN NaN NaN
n_sim[2] NaN NaN NaN ... NaN NaN NaN
n_sim[3] NaN NaN NaN ... NaN NaN NaN
Description
When the total count is set to zero in the RNG for the multinomial distribution, an exception is thrown:
However, the
multinomial_lpmf
function accepts zero integer arrays (and the log-prob is always 0.0, as expected). I think it would make more sense if the RNG also accepts a zero total count and just returns a zero count vector: If you don't sample anything, then all the counts per category should also be zero. Allowing this can be useful for e.g. encoding missing data.If you agree, I can submit a simple PR
Example
Stan model. In the
generated quantities
block,sum(n)
can't be zeroPython code to run the model. Running the model with
good_data
works fine. If we usebad_data
, Exceptions are thrown and thegenerated quantities
are all NaN.Expected output
Result with
good_data
:Result with
bad_data
Current Version:
python: 3.10.6 (v3.10.6:9c7b4bd164, Aug 1 2022, 17:13:48) [Clang 13.0.0 (clang-1300.0.29.30)]
cmdstan: (2, 34)
cmdstanpy: 1.2.1
The text was updated successfully, but these errors were encountered: