Description
Currently the von_mises_lpdf function can overflow for large values of kappa. Because of this, using it in Stan usually requires a conditional statement that switches to a normal approximation for kappa > 100. For example, following the recommendation in the Stan Functions Reference, brms redefines the von_mises distribution as:
```stan
real von_mises2_lpdf(vector y, vector mu, real kappa) {
  if (kappa < 100) {
    return von_mises_lpdf(y | mu, kappa);
  } else {
    return normal_lpdf(y | mu, sqrt(1 / kappa));
  }
}
```
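As a quick sanity check of the kappa > 100 switch-over, the two densities can be compared at the threshold. This is a Python sketch using SciPy's exponentially scaled Bessel function i0e, not the Stan implementation; the evaluation point is illustrative:

```python
import numpy as np
from scipy.special import i0e  # i0e(x) = I_0(x) * exp(-x), never overflows

def von_mises_lpdf(y, mu, kappa):
    # log density written on the stable scale:
    # kappa*cos(y-mu) - log(2*pi) - log(I_0(kappa))
    # with log(I_0(kappa)) = kappa + log(i0e(kappa))
    return kappa * (np.cos(y - mu) - 1.0) - np.log(2 * np.pi) - np.log(i0e(kappa))

def normal_lpdf(y, mu, sd):
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((y - mu) / sd) ** 2

# at the kappa = 100 threshold the two densities nearly coincide
y, mu, kappa = 0.1, 0.0, 100.0
print(von_mises_lpdf(y, mu, kappa) - normal_lpdf(y, mu, np.sqrt(1 / kappa)))
```

The near-agreement at kappa = 100 is what makes the switch-over usable at all; the point of this issue is that it should not be needed in the first place.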
Because of this, the likelihood cannot be vectorized, and brms needs to loop over the observations when kappa varies across conditions. The workaround also still leads to issues, because sqrt(1 / kappa) becomes 0 when kappa overflows, which produces scale = 0 rejections in normal_lpdf.
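Both failure modes are easy to reproduce outside Stan. A Python sketch with NumPy/SciPy; the kappa values are illustrative:

```python
import numpy as np
from scipy.special import i0  # modified Bessel function of the first kind, order 0

with np.errstate(over="ignore"):
    # the Bessel function itself overflows well before kappa reaches
    # the values a log-link linear predictor can produce
    print(i0(1000.0))  # inf

    # if exp(linear predictor) overflows to inf, the fallback's scale is 0
    kappa = np.exp(800.0)        # inf
    print(np.sqrt(1.0 / kappa))  # 0.0 -> "scale=0" rejection in normal_lpdf
```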
A simple change can substantially improve the stability of the function and make that conditional approximation unnecessary.
There exists a log_modified_bessel_first_kind() function, but it is currently only used for the log-likelihood calculation, not for the partials. On lines 81-86, the partials use the ratio $I_1(k)/I_0(k)$, whose value is always between 0 and 1, yet both the numerator and the denominator overflow for large kappa. Computing the ratio via log_modified_bessel_first_kind() avoids the overflow.
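The effect of working on the log scale can be sketched in Python with SciPy's exponentially scaled Bessel functions (i0e(x) = I_0(x)·exp(-x), i1e likewise), which play the same role here as log_modified_bessel_first_kind; this is an illustration, not the Stan math code:

```python
import numpy as np
from scipy.special import i0, i1, i0e, i1e

def bessel_ratio(kappa):
    """I_1(kappa) / I_0(kappa), computed stably: the exp(-kappa)
    scaling factors of i1e and i0e cancel in the ratio."""
    return i1e(kappa) / i0e(kappa)

with np.errstate(over="ignore", invalid="ignore"):
    kappa = 5000.0
    naive = i1(kappa) / i0(kappa)  # inf / inf -> nan
    stable = bessel_ratio(kappa)   # stays in (0, 1)
```

For moderate kappa the stable version agrees with the asymptotic expansion 1 - 1/(2k) - 1/(8k^2), and for large kappa it keeps returning a value in (0, 1) where the naive ratio is already NaN.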
Consequences

I ran the following two models, with the current von_mises_lpdf code on the develop branch and with my proposed changes.

Model 1: original model generated by brms (truncated output below; requires a conditional statement for kappa, making vectorization impossible):
```stan
functions {
  // more stuff here
  real von_mises2_lpdf(real y, real mu, real kappa) {
    if (kappa < 100) {
      return von_mises_lpdf(y | mu, kappa);
    } else {
      return normal_lpdf(y | mu, sqrt(1 / kappa));
    }
  }
}
// truncated other blocks ...
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = rep_vector(0.0, N);
    // initialize linear predictor term
    vector[N] kappa = rep_vector(0.0, N);
    mu += Intercept;
    kappa += Intercept_kappa;
    for (n in 1:N) {
      // add more terms to the linear predictor
      kappa[n] += r_1_kappa_1[J_1[n]] * Z_1_kappa_1[n];
    }
    mu = inv_tan_half(mu);
    kappa = exp(kappa);
    for (n in 1:N) {
      target += von_mises2_lpdf(Y[n] | mu[n], kappa[n]);
    }
  }
  // priors including constants
  target += lprior;
  target += std_normal_lpdf(z_1[1]);
}
```
Model 2: Vectorized version without conditional approximation:
Truncated model code:
```stan
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = rep_vector(0.0, N);
    // initialize linear predictor term
    vector[N] kappa = rep_vector(0.0, N);
    mu += Intercept;
    kappa += Intercept_kappa;
    for (n in 1:N) {
      // add more terms to the linear predictor
      kappa[n] += r_1_kappa_1[J_1[n]] * Z_1_kappa_1[n];
    }
    mu = inv_tan_half(mu);
    kappa = exp(kappa);
    target += von_mises_lpdf(Y | mu, kappa);
  }
  // priors including constants
  target += lprior;
  target += std_normal_lpdf(z_1[1]);
}
```
Results
I recorded the sampling time (on a Mac M3 max) and the warnings about rejected proposals and overflow.
| model | stanmath version | sampling time | warnings |
|---|---|---|---|
| 1 (original) | current | 130 s | 3 rejected proposals due to scale=0 for normal_lpdf |
| 2 (vectorized) | current | 84 s | 8 rejected proposals due to numeric overflow |
| 1 (original) | changed as above | 184 s | 2 rejected proposals due to scale=0 for normal_lpdf |
| 2 (vectorized) | changed as above | 107 s | none |
Despite the warnings, all models converged to similar posterior estimates.
Using the log_modified_bessel_first_kind() function adds some overhead because it is slower than the non-log version, but the end result is still faster because it allows vectorization. It also produced no warnings about overflow in the 10 test runs I did.
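The agreement between the log-scale and direct computations (where the direct one is still representable) can be sanity-checked the same way. A Python sketch, again using SciPy's i0e as a stand-in for log_modified_bessel_first_kind:

```python
import numpy as np
from scipy.special import i0, i0e

def log_bessel_i0(kappa):
    # log(I_0(kappa)) via the scaled function: log(i0e) + kappa never overflows
    return np.log(i0e(kappa)) + kappa

# matches the direct computation wherever i0 itself is representable
for k in [0.5, 5.0, 50.0, 500.0]:
    assert np.isclose(np.log(i0(k)), log_bessel_i0(k))

# and stays finite far beyond the direct version's overflow point
print(log_bessel_i0(1e8))
```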
Current Version:
v4.8.1