SSブログ

R: Stanをためす (3) 欠測値のあるデータ [統計]

Stanで欠測値のあるデータの解析。

missing.stan

data {
  int<lower=0> N_obs;
  int<lower=0> N_miss;
  real x_obs[N_obs];
}
parameters {
  real mu;
  real<lower=0> sigma;
  real x_miss[N_miss];
}
model {
  for (i in 1:N_obs) {
    x_obs[i] ~ normal(mu, sigma);
  }
  for (i in 1:N_miss) {
    x_miss[i] ~ normal(mu, sigma);
  }
  mu ~ normal(0, 1.0e+2);
  sigma ~ uniform(0, 1.0e+4);
}

missing.R

## seed of random number
set.seed(1234)

## data
N_obs <- 36
N_miss <- 4
mu <- 50
sigma <- 10

x_obs <- rnorm(N_obs, mu, sigma)

library(coda)
library(parallel)

## Stan
program <- "missing"

# dump data
data.file <- paste(program, "_dump.R", sep = "")
dump(c("N_obs", "N_miss", "x_obs"), data.file)

# initial values
init.file <- paste(program, "_init.R", sep = "")
mu <- 0
sigma <- 1
x_miss <- rep(0, N_miss)
dump(c("mu", "sigma", "x_miss"), init.file)

# compile
stan.home <- ("~/Documents/src/stan-src-1.0.0")
cur.dir <- getwd()
setwd(stan.home)
system(paste("make ", cur.dir, "/", program, sep = ""))
setwd(cur.dir)

warmup <- 1000
iter <- 20000 + warmup
thin <- 10
run <- function(chain) {
  cmd <- paste("./", program,
               " --data=", data.file,
               " --init=", init.file,
               " --iter=", iter,
               " --warmup=", warmup,
               " --thin=", thin,
               " --chain_id=", chain,
               " --samples=", program, ".chain", chain, ".csv", sep = "")
  system(cmd)
  s <- read.csv(paste(program, ".chain", chain, ".csv", sep = ""), comment.char = "#")
  mcmc(s, start = warmup + 1, thin = thin)
}

samples <- as.mcmc.list(mclapply(1:3, run))

結果

> summary(samples[, c("mu", "sigma",
+                    "x_miss.1", "x_miss.2", "x_miss.3", "x_miss.4")])

Iterations = 1001:20991
Thinning interval = 10 
Number of chains = 3 
Sample size per chain = 2000 

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

           Mean    SD Naive SE Time-series SE
mu       46.570 1.575  0.02034        0.01888
sigma     9.334 1.177  0.01520        0.01550
x_miss.1 46.726 9.658  0.12469        0.10932
x_miss.2 46.610 9.579  0.12366        0.11869
x_miss.3 46.566 9.591  0.12382        0.13662
x_miss.4 46.445 9.620  0.12420        0.12369

2. Quantiles for each variable:

           2.5%    25%    50%   75% 97.5%
mu       43.542 45.537 46.565 47.58 49.73
sigma     7.403  8.513  9.204 10.03 11.97
x_miss.1 27.253 40.572 46.922 53.05 65.82
x_miss.2 27.645 40.324 46.619 52.86 65.42
x_miss.3 27.944 40.281 46.573 53.00 65.63
x_miss.4 27.444 40.026 46.583 52.75 65.22

データにNAを入れるとエラーになる

Exception: variable x_obs not found.
Diagnostic information: 
Dynamic exception type: std::runtime_error
std::exception::what: variable x_obs not found.

タグ:MCMC R STAn
nice!(0)  コメント(0)  トラックバック(0) 
共通テーマ:学問

nice! 0

コメント 0

コメントを書く

お名前:
URL:
コメント:
画像認証:
下の画像に表示されている文字を入力してください。

Facebook コメント

トラックバック 0