WAICを計算させてみた [統計]
WAICの計算をやってみたので、そのメモです。
WAICについてはこのあたりを参考に。
コードです。久保みどり本10章の例題をつかってみました。WAICの計算には、Vehtari & Gelman (2014)のコードを使用します。
library(rstan) ## Vehtari and Gelman (2014)のwaic関数を"waic.R"として保存しておく ## http://www.stat.columbia.edu/~gelman/research/unpublished/waic_stan.pdf source("waic.R") ## データ d <- read.csv(url("http://hosho.ees.hokudai.ac.jp/~kubo/stat/iwanamibook/fig/hbm/data7a.csv")) data <- list(N = nrow(d), Y = d$y) n_models <- 3 model <- list(n_models) fit <- list(n_models) ## 切片のみのモデル model[[1]] <- ' data { int<lower=0> N; int<lower=0> Y[N]; } parameters { real beta; } transformed parameters { real q[N]; for (i in 1:N) { q[i] <- inv_logit(beta); } } model { Y ~ binomial(8, q); beta ~ normal(0, 1.0e+2); } generated quantities { vector[N] log_lik; for (i in 1:N) { log_lik[i] <- binomial_log(Y[i], 8, q[i]); } }' ## 切片+ランダム効果のモデル model[[2]] <- ' data { int<lower=0> N; int<lower=0> Y[N]; } parameters { real beta; real r[N]; real<lower=0> s; } transformed parameters { real q[N]; for (i in 1:N) { q[i] <- inv_logit(beta + r[i]); } } model { Y ~ binomial(8, q); beta ~ normal(0, 1.0e+2); r ~ normal(0, s); s ~ uniform(0, 1.0e+4); } generated quantities { vector[N] log_lik; for (i in 1:N) { log_lik[i] <- binomial_log(Y[i], 8, q[i]); } }' ## model2の事前分布をかえたモデル model[[3]] <- ' data { int<lower=0> N; int<lower=0> Y[N]; } parameters { real beta; real r[N]; real<lower=0> s; } transformed parameters { real q[N]; for (i in 1:N) { q[i] <- inv_logit(beta + r[i]); } } model { Y ~ binomial(8, q); beta ~ normal(0, 1.0e+0); r ~ normal(0, s); s ~ uniform(0, 1.0e+4); } generated quantities { vector[N] log_lik; for (i in 1:N) { log_lik[i] <- binomial_log(Y[i], 8, q[i]); } }' ## あてはめ fit[[1]] <- stan(model_code = model[[1]], data = data, iter = 3000, warmup = 1000, thin = 2, chains = 4) fit[[2]] <- stan(model_code = model[[2]], data = data, iter = 3000, warmup = 1000, thin = 2, chains = 4) fit[[3]] <- stan(model_code = model[[3]], data = data, iter = 3000, warmup = 1000, thin = 2, chains = 4) ## 結果 for (i in 1:n_models) { print(fit[[i]], pars = "beta") print(waic(fit[[i]])$total) }
あとでwaic関数で使用するため、Stanコード中のgenerated quantitiesブロックで、log_likに対数尤度を格納するようにしています。
結果です。
> for (i in 1:n_models) { + print(fit[[i]], pars = "beta") + print(waic(fit[[i]])$total) + } Inference for Stan model: model[[1]]. 4 chains, each with iter=3000; warmup=1000; thin=2; post-warmup draws per chain=1000, total post-warmup draws=4000. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta 0.02 0 0.07 -0.12 -0.03 0.02 0.06 0.15 2458 1 Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:54:21 2015. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1). waic lpd p_waic elpd_waic p_loo elpd_loo 767.993922 -379.006593 4.990368 -383.996961 4.989404 -383.995997 Inference for Stan model: model[[2]]. 4 chains, each with iter=3000; warmup=1000; thin=2; post-warmup draws per chain=1000, total post-warmup draws=4000. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta 0.04 0.01 0.34 -0.64 -0.19 0.04 0.27 0.71 1302 1 Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:54:42 2015. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1). waic lpd p_waic elpd_waic p_loo elpd_loo 279.42141 -98.19609 41.51462 -139.71070 57.45762 -155.65371 Inference for Stan model: model[[3]]. 4 chains, each with iter=3000; warmup=1000; thin=2; post-warmup draws per chain=1000, total post-warmup draws=4000. mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat beta 0.04 0.01 0.34 -0.63 -0.18 0.04 0.27 0.71 1489 1 Samples were drawn using NUTS(diag_e) at Sun Jun 21 06:55:02 2015. For each parameter, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat=1). waic lpd p_waic elpd_waic p_loo elpd_loo 278.95844 -98.23228 41.24694 -139.47922 56.70686 -154.93915
モデル3のWAICがもっともちいさくなりました。
モデル1を、GLMであてはめた結果と比較してみます。
> fit.glm <- glm(cbind(y, 8 - y) ~ 1, family = "binomial", data = d) > fit.glm Call: glm(formula = cbind(y, 8 - y) ~ 1, family = "binomial", data = d) Coefficients: (Intercept) 0.015 Degrees of Freedom: 99 Total (i.e. Null); 99 Residual Null Deviance: 631.5 Residual Deviance: 631.5 AIC: 763.9 > summary(fit.glm) Call: glm(formula = cbind(y, 8 - y) ~ 1, family = "binomial", data = d) Deviance Residuals: Min 1Q Median 3Q Max -3.3483 -2.2699 -0.0212 2.2299 3.3122 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.01500 0.07071 0.212 0.832 (Dispersion parameter for binomial family taken to be 1) Null deviance: 631.49 on 99 degrees of freedom Residual deviance: 631.49 on 99 degrees of freedom AIC: 763.94 Number of Fisher Scoring iterations: 3
Vehtari & Gelman (2014)のWAICの計算では、AICとおなじスケールになるようにしてあるということです。
コメント 0