WAICを計算させてみた：Taglibro de H：SSブログ

	ブログをはじめるログイン

WAICを計算させてみた　[統計] [編集]

WAICの計算をやってみたので、そのメモです。

WAICについてはこのあたりを参考に。

コードです。久保みどり本10章の例題をつかってみました。WAICの計算には、Vehtari & Gelman (2014)のコードを使用します。

library(rstan)

## Vehtari and Gelman (2014)のwaic関数を"waic.R"として保存しておく
## http://www.stat.columbia.edu/~gelman/research/unpublished/waic_stan.pdf
source("waic.R")

## データ
d <- read.csv(url("http://hosho.ees.hokudai.ac.jp/~kubo/stat/iwanamibook/fig/hbm/data7a.csv"))
data <- list(N = nrow(d), Y = d$y)

n_models <- 3
model <- list(n_models)
fit <- list(n_models)

## 切片のみのモデル
model[[1]] <- '
  data {
    int<lower=0> N;
    int<lower=0> Y[N];
  }
  parameters {
    real beta;
  }
  transformed parameters {
    real q[N];

    for (i in 1:N) {
      q[i] <- inv_logit(beta);
    }
  }
  model {
    Y ~ binomial(8, q);
    beta ~ normal(0, 1.0e+2);
  }
  generated quantities {
    vector[N] log_lik;
    for (i in 1:N) {
      log_lik[i] <- binomial_log(Y[i], 8, q[i]);
    }
  }'

## 切片+ランダム効果のモデル
model[[2]] <- '
  data {
    int<lower=0> N;
    int<lower=0> Y[N];
  }
  parameters {
    real beta;
    real r[N];
    real<lower=0> s;
  }
  transformed parameters {
    real q[N];

    for (i in 1:N) {
      q[i] <- inv_logit(beta + r[i]);
    }
  }
  model {
    Y ~ binomial(8, q);
    beta ~ normal(0, 1.0e+2);
    r ~ normal(0, s);
    s ~ uniform(0, 1.0e+4);
  }
  generated quantities {
    vector[N] log_lik;	
    for (i in 1:N) {
      log_lik[i] <- binomial_log(Y[i], 8, q[i]);
    }
  }'

## model2の事前分布をかえたモデル
model[[3]] <- '
  data {
    int<lower=0> N;
    int<lower=0> Y[N];
  }
  parameters {
    real beta;
    real r[N];
    real<lower=0> s;
  }
  transformed parameters {
    real q[N];

    for (i in 1:N) {
      q[i] <- inv_logit(beta + r[i]);
    }
  }
  model {
    Y ~ binomial(8, q);
    beta ~ normal(0, 1.0e+0);
    r ~ normal(0, s);
    s ~ uniform(0, 1.0e+4);
  }
  generated quantities {
    vector[N] log_lik;	
    for (i in 1:N) {
      log_lik[i] <- binomial_log(Y[i], 8, q[i]);
    }
  }'


## あてはめ
fit[[1]] <- stan(model_code = model[[1]], data = data,
                 iter = 3000, warmup = 1000, thin = 2, chains = 4)
fit[[2]] <- stan(model_code = model[[2]], data = data,
                 iter = 3000, warmup = 1000, thin = 2, chains = 4)
fit[[3]] <- stan(model_code = model[[3]], data = data,
                 iter = 3000, warmup = 1000, thin = 2, chains = 4)

## 結果
for (i in 1:n_models) {
  print(fit[[i]], pars = "beta")
  print(waic(fit[[i]])$total)
}

あとでwaic関数で使用するため、Stanコード中のgenerated quantitiesブロックで、log_likに対数尤度を格納するようにしています。

結果です。

> for (i in 1:n_models) {
+   print(fit[[i]], pars = "beta")
+   print(waic(fit[[i]])$total)
+ }
Inference for Stan model: model[[1]].
4 chains, each with iter=3000; warmup=1000; thin=2; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

     mean se_mean   sd  2.5%   25%  50%  75% 97.5% n_eff Rhat
beta 0.02       0 0.07 -0.12 -0.03 0.02 0.06  0.15  2458    1

Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:54:21 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
       waic         lpd      p_waic   elpd_waic       p_loo    elpd_loo 
 767.993922 -379.006593    4.990368 -383.996961    4.989404 -383.995997 
Inference for Stan model: model[[2]].
4 chains, each with iter=3000; warmup=1000; thin=2; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

     mean se_mean   sd  2.5%   25%  50%  75% 97.5% n_eff Rhat
beta 0.04    0.01 0.34 -0.64 -0.19 0.04 0.27  0.71  1302    1

Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:54:42 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
      waic        lpd     p_waic  elpd_waic      p_loo   elpd_loo 
 279.42141  -98.19609   41.51462 -139.71070   57.45762 -155.65371 
Inference for Stan model: model[[3]].
4 chains, each with iter=3000; warmup=1000; thin=2; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

     mean se_mean   sd  2.5%   25%  50%  75% 97.5% n_eff Rhat
beta 0.04    0.01 0.34 -0.63 -0.18 0.04 0.27  0.71  1489    1

Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:55:02 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).
      waic        lpd     p_waic  elpd_waic      p_loo   elpd_loo 
 278.95844  -98.23228   41.24694 -139.47922   56.70686 -154.93915

モデル3のWAICがもっともちいさくなりました。

モデル1を、GLMであてはめた結果と比較してみます。

> fit.glm <- glm(cbind(y, 8 - y) ~ 1, family = "binomial", data = d)
> fit.glm

Call:  glm(formula = cbind(y, 8 - y) ~ 1, family = "binomial", data = d)

Coefficients:
(Intercept)  
      0.015  

Degrees of Freedom: 99 Total (i.e. Null);  99 Residual
Null Deviance:	    631.5 
Residual Deviance: 631.5 	AIC: 763.9
> summary(fit.glm)

Call:
glm(formula = cbind(y, 8 - y) ~ 1, family = "binomial", data = d)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.3483  -2.2699  -0.0212   2.2299   3.3122  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.01500    0.07071   0.212    0.832

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 631.49  on 99  degrees of freedom
Residual deviance: 631.49  on 99  degrees of freedom
AIC: 763.94

Number of Fisher Scoring iterations: 3

Vehtari & Gelman (2014)のWAICの計算では、AICとおなじスケールになるようにしてあるということです。