glmmAK：Taglibro de H：SSブログ

	ブログをはじめるログイン

auto.close.parens｜メモ: マイナスイオンウォーターの人がお.. ブログトップ

glmmAK　[統計] [編集]

CRANにglmmAKというパッケージが登録されていた。AKというのは作者のArnošt Komárek氏の名前からとったものか。

This package implements several generalized linear (mixed) models (GLMM) where for random effects either a conventional normal distribution is assumed or a flexible model based on penalized smoothing is used for the random effect distribution. We call the second class of models as penalized Gaussian mixture (PGM) or shortly G-spline.

通常のGLMMではrandom effectが正規分布することを仮定するが、このパッケージではpenalized Gaussian mixture (PGM)とすることができるということのようだ。で、PGMというのはG-スプラインということか。このあたり勉強不足なのでよく分からないのだが。

通常のGLMMのrandam effect項

$b_i\sim\mathcal{N}(0,d)$

PGMのrandom effect項

$b_i\sim\sum_{j=-K}^{K}w_j(\mathbf{a})\mathcal{N}(\tau\mu_j,(\tau\sigma)^2)$

MCMCで推定をおこなうようになっている。

付属の例題(ex-Toenail)をやってみる。1) PGM GLMM, not hierarchically centered, 2) PGM GLMM, hierarchically centered, 3) Normal GLMM, not hierarchically centered, 4) Normal GLMM, hierarchically centeredの4つを比較しているが、とりあえず1)だけをやってみる。

まずはいろいろ準備。

library(glmmAK)
root <- paste(Sys.getenv("HOME"), "Documents/RData/", sep="/")
setwd(root)
data(toenail)

# 事前分布
prior.fixed <- list(mean = 0, var = 10000)

prior.gspline.Slice <- list(K = 15, delta = 0.3,
    sigma = 0.2, CARorder = 3, 
    Ldistrib = "gamma", Lequal = FALSE, Lshape = 1,
    LinvScale = 0.005, 
    AtypeUpdate = "slice") 

prior.gspline.Block <- list(K = 15, delta = 0.3,
    sigma = 0.2, CARorder = 3, 
    Ldistrib = "gamma", Lequal = FALSE, Lshape = 1,
    LinvScale = 0.005, 
    AtypeUpdate = "block") 

prior.random.gspl.nhc <- list(Ddistrib = "gamma",
    Dshape = 1, DinvScale = 0.005)

# 結果のファイルを収めるpathの準備
if (!("chToenail" %in% dir(root))) dir.create(paste(root, "chToenail", 
    sep = "")) 
dirNames <- c("PGM_nhc") 
dirPaths <- paste(root, "chToenail/", dirNames, "/", sep = "") 
for (i in 1:length(dirPaths)) { 
    if (!(dirNames[i] %in% dir(paste(root, "chToenail", sep = "")))) 
        dir.create(dirPaths[i]) 
} 
names(dirPaths) <- c("PGM_nhc") 

# 共変量
iXmat <- data.frame(trt = toenail$trt, 
    time = toenail$time, 
    trt.time = toenail$trt * toenail$time) 

# MCMCの回数
# niter × nthinが繰り返し回数となる
# 例題ではnthin = 100, nwrite = 100となっているが、それぞれ10に
nsimul <- list(niter = 4000, nthin = 10, nburn = 2000, nwrite = 10)

MCMCを実行。

fit.PGM.nhc <- cumlogitRE(y = toenail$infect, x = iXmat, 
    cluster = toenail$idnr, intcpt.random = TRUE,
    hierar.center = FALSE, 
    C = 1, drandom = "gspline", prior.fixed = prior.fixed, 
    prior.random = prior.random.gspl.nhc,
    prior.gspline = prior.gspline.Slice, 
    nsimul = nsimul, store = list(prob = FALSE, b = TRUE), 
    dir = dirPaths["PGM_nhc"])

事後分布を読み込む。

chPGM.nhc <- glmmAK.files2coda(dir = dirPaths["PGM_nhc"],
    drandom = "gspline", 
    skip = 0) 

iters <- scanFH(paste(dirPaths["PGM_nhc"], "iteration.sim", sep = "")) 
betaF <- scanFH(paste(dirPaths["PGM_nhc"], "betaF.sim", sep = "")) 
betaRadj <- scanFH(paste(dirPaths["PGM_nhc"], "betaRadj.sim",
    sep = "")) 
varRadj <- scanFH(paste(dirPaths["PGM_nhc"], "varRadj.sim", sep = "")) 
chPGM.nhc <- mcmc(data.frame(Trt = betaF[, "trt"], 
    Time = betaF[, "time"], 
    Trt.Time = betaF[, "trt.time"], 
    Meanb = betaRadj[, "(Intercept)"], 
    SDb = sqrt(varRadj[, "varR.1"])), 
    start = iters[1, 1]) 
rm(list = c("iters", "betaF", "betaRadj", "varRadj"))

結果。

summary(chPGM.nhc)


Iterations = 2001:4000
Thinning interval = 1 
Number of chains = 1 
Sample size per chain = 2000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

            Mean      SD  Naive SE Time-series SE
Trt      -0.1707 0.55305 0.0123667       0.038592
Time     -0.3888 0.04393 0.0009823       0.001471
Trt.Time -0.1364 0.06750 0.0015093       0.002118
Meanb    -1.6728 0.42032 0.0093987       0.048364
SDb       4.0355 0.40676 0.0090954       0.032822

2. Quantiles for each variable:

            2.5%     25%     50%      75%     97.5%
Trt      -1.2368 -0.5494 -0.1756  0.20957  0.914094
Time     -0.4759 -0.4183 -0.3876 -0.35809 -0.308134
Trt.Time -0.2688 -0.1810 -0.1373 -0.08957 -0.005997
Meanb    -2.5440 -1.9513 -1.6625 -1.38506 -0.897988
SDb       3.2792  3.7500  4.0270  4.29516  4.873613

plot(chPGM.nhc)

lmerと比較してみる。

library(lme4)
glmm.lmer <- lmer(infect ~ trt + time + trt:time + (1|idnr),
    family=binomial, data=toenail, method="Laplace")

Generalized linear mixed model fit using Laplace 
Formula: infect ~ trt + time + trt:time + (1 | idnr) 
   Data: toenail 
 Family: binomial(logit link)
  AIC  BIC logLik deviance
 1266 1293 -627.8     1256
Random effects:
 Groups Name        Variance Std.Dev.
 idnr   (Intercept) 20.867   4.568   
number of obs: 1908, groups: idnr, 294

Estimated scale (compare to  1 )  1.623704 

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.51388    0.46275  -5.432 5.56e-08 ***
trt         -0.30664    0.66293  -0.463   0.6437    
time        -0.40044    0.04520  -8.859  < 2e-16 ***
trt:time    -0.13629    0.07361  -1.851   0.0641 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Correlation of Fixed Effects:
         (Intr) trt    time  
trt      -0.698              
time     -0.277  0.193       
trt:time  0.170 -0.278 -0.614