ZIPモデルのWAICを計算してみた：Taglibro de H：SSブログ

	ブログをはじめるログイン

ZIPモデルのWAICを計算してみた　[統計] [編集]

ゼロ過剰ポアソン（Zero-Infrated Poisson; ZIP）モデルでWAICを計算させてみました。ついでに、ポアソン分布と、負の二項分布とも比較してみました。

[2017-12-07 コードが間違っていたのを修正。]

ZIPのStanファイル: zip_waic.stan

data {
  int<lower=0> N;
  int<lower=0> Y[N];
}

parameters {
  real<lower=0, upper=1> omega;
  real<lower=0> lambda;
}

model {
  for (i in 1:N)
    if (Y[i] > 0) {
      1 ~ bernoulli(omega);
      Y[i] ~ poisson(lambda);
    } else {
      target += log_sum_exp(bernoulli_lpmf(0 | omega),
                            bernoulli_lpmf(1 | omega)
                            + poisson_lpmf(0 | lambda));
    }
}

generated quantities {
  real log_lik[N];
  
  for (i in 1:N)
    log_lik[i] = (Y[i] > 0) ?
                 bernoulli_lpmf(1 | omega) + poisson_lpmf(Y[i] | lambda) :
                 log_sum_exp(bernoulli_lpmf(0 | omega),
                             bernoulli_lpmf(1 | omega)
                             + poisson_lpmf(0 | lambda));
}

ポアソン分布のStanファイル: poisson_waic.stan

data {
  int<lower=0> N;
  int<lower=0> Y[N];
}

parameters {
  real<lower=0> lambda;
}

model {
  Y ~ poisson(lambda);
}

generated quantities {
  real log_lik[N];
  
  for (i in 1:N)
    log_lik[i] = poisson_lpmf(Y[i] | lambda);
}

負の二項分布のStanファイル: negbin_waic.stan

data {
  int<lower=0> N;
  int<lower=0> Y[N];
}

parameters {
  real<lower=0> mu;
  real<lower=0> phi;
}

model {
  mu ~ cauchy(0, 5);
  phi ~ cauchy(0, 5);
  Y ~ neg_binomial_2(mu, phi);
}

generated quantities {
  real log_lik[N];
  
  for (i in 1:N)
    log_lik[i] = neg_binomial_2_lpmf(Y[i] | mu, phi);
}

phiに弱情報事前分布をつけないと、MCMC計算が収束しませんでした。

Rスクリプト

##
## Zero-inflated Poisson
##

set.seed(123)

n <- 40
omega <- 0.4
lambda <- 3

y <- rbinom(n, 1, omega) * rpois(n, lambda)

##
library(loo)
library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

stan_data = list(N = n, Y = y)
fit1 <- stan("zip_waic.stan", data = stan_data,
             iter = 2000, warmup = 1000, thin = 1)
waic1 <- waic(extract_log_lik(fit1))

fit2 <- stan("poisson_waic.stan", data = stan_data,
             iter = 2000, warmup = 1000, thin = 1)
waic2 <- waic(extract_log_lik(fit2))

fit3 <- stan("negbin_waic.stan", data = stan_data,
             iter = 2000, warmup = 1000, thin = 1)
waic3 <- waic(extract_log_lik(fit3))

結果（追記: というとWAICは予測最適化なので語弊があります）ですが、当然ながら、データを生成したモデルであるZIPでWAICがもっとも小さくなりました。

> print(fit1, pars = c("omega", "lambda", "lp__"))
Inference for Stan model: zip_waic.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
omega    0.47    0.00 0.08   0.31   0.42   0.47   0.53   0.64  3205    1
lambda   2.47    0.01 0.42   1.71   2.17   2.45   2.74   3.32  2692    1
lp__   -28.56    0.02 1.00 -31.32 -28.93 -28.26 -27.86 -27.59  1814    1

Samples were drawn using NUTS(diag_e) at Thu Dec  7 21:13:35 2017.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
> print(fit2, pars = c("lambda", "lp__"))
Inference for Stan model: poisson_waic.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
lambda   1.15    0.00 0.17   0.84   1.03   1.14   1.25   1.51  2979    1
lp__   -40.09    0.02 0.74 -42.18 -40.25 -39.80 -39.62 -39.57  2120    1

Samples were drawn using NUTS(diag_e) at Sat Oct  8 07:55:33 2016.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
> print(fit3, pars = c("mu", "phi", "lp__"))
Inference for Stan model: negbin_waic.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

       mean se_mean   sd   2.5%    25%    50%    75%  97.5% n_eff Rhat
mu     1.23    0.01 0.31   0.77   1.02   1.18   1.37   1.95  1664    1
phi    1.05    0.04 1.42   0.29   0.54   0.77   1.13   3.35  1221    1
lp__ -33.49    0.02 1.03 -36.31 -33.88 -33.16 -32.77 -32.52  1693    1

Samples were drawn using NUTS(diag_e) at Sat Oct  8 07:55:37 2016.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
> 
> print(waic1)
Computed from 4000 by 40 log-likelihood matrix

          Estimate   SE
elpd_waic    -52.9  6.6
p_waic         1.7  0.2
waic         105.8 13.3
> print(waic2)
Computed from 4000 by 40 log-likelihood matrix

          Estimate   SE
elpd_waic    -67.3  6.0
p_waic         2.0  0.3
waic         134.7 11.9
> print(waic3)
Computed from 4000 by 40 log-likelihood matrix

          Estimate   SE
elpd_waic    -60.3  6.3
p_waic         1.6  0.1
waic         120.6 12.5