StanでZero-infrated negative binomialモデル：Taglibro de H：SSブログ

	ブログをはじめるログイン

StanでZero-inflated P..｜Stan 2.1.0リリースブログトップ

StanでZero-infrated negative binomialモデル　[統計] [編集]

つづいて、Zero-infrated negative binomial (ZINB)モデル。

## Zero-Inflated Negative Binomial
set.seed(1234)
theta <- 0.6
size <- 1
mu <- 4
n <- 100  # number of data
y <- rbinom(n, 1, theta)
y <- y * rnbinom(n, size = size, mu = mu)

## histogram
h <- hist(y, breaks = min(y):(max(y) + 1),
          right = FALSE, plot = FALSE)
barplot(h$counts, names = h$breaks[-(length(h$breaks))],
        las = 1, xlab = "y", ylab = "count")

Stanコード。generated quantitiesブロックで、負の二項分布部分の平均値muを計算する。

// Zero-Inflated Negative Binomial model
data {
  int<lower=1>  N;
  int<lower=0>  y[N];
}
parameters {
  real<lower=0, upper=1> theta;
  real<lower=0>          alpha;
  real<lower=0>          beta;
}
model {
  // priors
  theta ~ uniform(0, 1);
  alpha ~ uniform(0, 100);
  beta ~ uniform(0, 100);
  for (i in 1:N) {
    if (y[i] == 0) {
      // not present
      increment_log_prob(log((1 - theta)
                           + theta * pow(beta / (beta + 1), alpha)));
    } else {
      // present
      increment_log_prob(bernoulli_log(1, theta)
                           + neg_binomial_log(y[i], alpha, beta));
    }
  }
}
generated quantities {
  real mu;
  mu <- alpha / beta;
}

RStanで実行。

## Stan
library(rstan)
model <- stan_model("zinb.stan")
fit <- sampling(model, data = list(N = n, y = y),
                chains = 4,
                iter = 2000, warmup = 500, thin = 1)

結果

> traceplot(fit)

> print(fit, digits = 2)
Inference for Stan model: zinb.
4 chains, each with iter=2000; warmup=500; thin=1; 
post-warmup draws per chain=1500, total post-warmup draws=6000.

         mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
theta    0.68    0.00 0.09    0.54    0.62    0.67    0.73    0.88  1392    1
alpha    1.25    0.02 0.51    0.51    0.89    1.17    1.55    2.43  1141    1
beta     0.27    0.00 0.10    0.13    0.20    0.26    0.33    0.50  1467    1
mu       4.54    0.01 0.74    3.11    4.03    4.54    5.02    6.01  3372    1
lp__  -218.71    0.03 1.24 -221.89 -219.31 -218.40 -217.78 -217.26  1576    1

Samples were drawn using NUTS(diag_e) at Thu Dec 26 16:35:13 2013.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

タグ：STAn