StanでBinomial Mixture Model：Taglibro de H：SSブログ

	ブログをはじめるログイン

StanでBinomial Mixture Model　[統計] [編集]

Bayesian Population Analysis using WinBUGS 12章にあるBinomial Mixture Model (N-mixutre Model)をStanで実装してみました。

設定は以下のとおりです。

Rか所の調査地のそれぞれである生物を観測しますが、調査地iに生息する個体数n_iは期待値λに従うポアソン分布にしたがうとします。また、実際に観測される数は、試行回数をn_i、成功確率をpとする二項分布にしたがうとします。1調査地あたりt回の測定をおこなうとし、各回の測定値をy_i,tとします。

このとき尤度は以下のようになります。

Stanでは潜在離散パラメータをあつかえませんので、n_iを周辺化消去します。これは、Royle (2004)でもつかわれている方法ですが、無限大まで加算することは実際にはできませんので、じゅうぶんにおおきな数Kまで加算します。Dennis et al. (2015)では、“the Poisson upper tail probability is < 10^-10”をみたすようにとしています。

Stanでは以下のようにモデリングできます。

// Binomial Mixture Model
data {
  int<lower=1> R;       // Number of sites
  int<lower=1> T;       // Number of temporal replications
  int<lower=0> K;       // Upper bound of population size
  int<lower=0> y[R, T]; // Observations
}

transformed data {
  int<lower=0> max_y[R];

  for (i in 1:R)
    max_y[i] <- max(y[i]);
}

parameters {
  real<lower=0> lambda;     // Mean population size
  real<lower=0,upper=1> p;  // Detection probability
}

model {
  for (i in 1:R) {
    vector[K+1] lp;   // log prob. for n in 0:K

    for (n in max_y[i]:K) {
      lp[n + 1] <- poisson_log(n, lambda) +
                   binomial_log(y[i], n, p);
    }
    increment_log_prob(log_sum_exp(lp[(max_y[i] + 1):(K + 1)]));
  }
}

以下のようなRコードを使用しました。

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())
set.seed(123)

R <- 200
T <- 4
lambda <- 5.0
p <- 0.8

y <- array(dim = c(R, T))
N <- rpois(R, lambda)
for (j in 1:T)
  y[, j] <- rbinom(R, N, p)

n.chains <- 4
n.iter <- 2000
n.warmup <- 1000
n.thin <- 1

stan_data <- list(R = R, T = T, K = 80, y = y)
inits <- lapply(1:n.chains,
                function(i) {
                    list(lambda = runif(1, 0, 5),
                         p = runif(1, 0, 1))})

fit <- stan("BinomialMixture.stan",
            data = stan_data, init = inits,
            chains = n.chains, iter = n.iter,
            warmup = n.warmup, thin = n.thin,
            seed = 1,
            open_progress = FALSE)
print(fit)

結果です。

Inference for Stan model: BinomialMixture.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

           mean se_mean   sd     2.5%      25%      50%      75%    97.5% n_eff
lambda     4.99    0.00 0.17     4.67     4.87     4.99     5.11     5.34  2262
p          0.81    0.00 0.01     0.79     0.81     0.81     0.82     0.83  2229
lp__   -1265.37    0.03 1.02 -1268.10 -1265.75 -1265.04 -1264.65 -1264.38  1407
       Rhat
lambda    1
p         1
lp__      1

Samples were drawn using NUTS(diag_e) at Wed Feb 24 19:23:27 2016.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

なお、Kを使用しない方法としては、Multivariate Poisson Modelとして実装する方法(Dennis et al. 2015)があるということで、Stan users mailing list(n-mixture model in Stan? [marginalizing)でも取り上げられていました。

タグ：STAn