JAGSとStanによる負の二項分布のパラメータ推定 [統計]
熊谷さんが負の二項分布のパラメータ推定をWinBUGSでやっておられたので、JAGSとStanでもやってみました。
Rのコード
## ## Negative Binomial ## ## JAGS library(rjags) set.seed(1234) n <- 100 mu <- exp(2) r <- 1.5 Y <- rnbinom(n, size = r, mu = mu) params <- c("mu", "r") inits <- lapply(1:3, function(i) list(p = runif(1, 0, 1), r = rpois(1, 20) + 1)) model <- jags.model("negbin.bug", data = list(N = n, Y = Y), inits = inits, n.chains = 3, n.adapt = 500) samples <- coda.samples(model, variable.names = params, n.iter = 2000, thin = 2) gelman.diag(samples) summary(samples) ## Stan library(rstan) inits <- lapply(1:3, function(i) list(alpha = runif(1, 0, 100), beta = runif(1, 0, 100))) model <- stan_model("negbin.stan") fit <- sampling(model, data = list(N = n, Y = Y), pars = params, chain = 3, iter = 1500, warmup = 500, thin = 1, init = inits) print(fit, digits = 2)
JAGSのコード
## JAGS var N, p, r, mu, Y[N]; model { for (i in 1:N) { Y[i] ~ dnegbin(p, r); } ## priors p ~ dunif(0, 1); r ~ dgamma(1.0E-3, 1.0E-3); ## mu mu <- r * (1 - p) / p; }
Stanのコード
なお、Stanでの負の二項分布のパラメータは以下のとおり。
// Stan data { int<lower=0> N; int<lower=0> Y[N]; } parameters { real<lower=0> alpha; real<lower=0> beta; } model { for (i in 1:N) { Y ~ neg_binomial(alpha, beta); } // priors alpha ~ uniform(0, 1.0e+4); beta ~ uniform(0, 1.0e+4); } generated quantities { real p; real mu; real r; p <- beta / (beta + 1); mu <- alpha * (1 - p) / p; r <- alpha; }
結果
WinBUGSでは、sizeパラメータは自然数でなければならないようだが、JAGSでは正の実数でよいようだ。
> gelman.diag(samples) Potential scale reduction factors: Point est. Upper C.I. mu 1.00 1.00 r 1.02 1.05 Multivariate psrf 1.01 > summary(samples) Iterations = 502:2500 Thinning interval = 2 Number of chains = 3 Sample size per chain = 1000 1. Empirical mean and standard deviation for each variable, plus standard error of the mean: Mean SD Naive SE Time-series SE mu 6.559 0.5719 0.010442 0.01030 r 1.762 0.3396 0.006199 0.01685 2. Quantiles for each variable: 2.5% 25% 50% 75% 97.5% mu 5.490 6.171 6.530 6.932 7.748 r 1.181 1.534 1.726 1.946 2.546
Stanの結果
> print(fit, digits = 2) Inference for Stan model: negbin. 3 chains, each with iter=1500; warmup=500; thin=1; post-warmup draws per chain=1000, total post-warmup draws=3000. mean se_mean sd 2.5% 25% 50% 75% 97.5% mu 6.55 0.00 0.06 6.44 6.51 6.55 6.58 6.66 r 1.71 0.00 0.03 1.65 1.69 1.71 1.73 1.77 lp__ -29113.62 0.04 1.02 -29116.39 -29114.01 -29113.30 -29112.90 -29112.63 n_eff Rhat mu 1999 1 r 565 1 lp__ 732 1 Samples were drawn using NUTS(diag_e) at Tue Feb 11 14:16:06 2014. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1).
計算時間は、rjagsの方が、RStanよりも圧倒的にはやかった。
コメント 0