2010-12-02 (2010 12 02 10 :51 ) 1/ 71 GCOE 2010-12-02 WinBUGS kubo@ees.hokudai.ac.jp http://goo.gl/bukrb
12/1 ( ) GLM, R MCMC, WinBUGS 12/2 ( ) WinBUGS WinBUGS 12/2 ( ) : 12/3 ( ) :? 2010-12-02 (2010 12 02 10 :51 ) 2/ 71
2010-12-02 (2010 12 02 10 :51 ) 3/ 71 ( ) random effects BUGS
2010-12-02 (2010 12 02 10 :51 ) 4/ 71
2010-12-02 (2010 12 02 10 :51 ) 5/ 71 GLM (1)? y = 0, 1, 2, 3, (y ) (family = poisson) y = {0, 1}, y = {0, 1, 2,, N} (family = binomial) (family = Gamma) (family = gaussian)
R : glm() ( ) rbinom() glm(family = binomial) rbinom() glm(family = binomial) rpois() glm(family = poisson) rnbinom() glm.nb() ( ) rgamma() glm(family = gamma) rnorm() glm(family = gaussian) glm() glm.nb() MASS library GLM 2010-12-02 (2010 12 02 10 :51 ) 6/ 71
2010-12-02 (2010 12 02 10 :51 ) 7/ 71 GLM (2)!! GLMM! GLMM/ random effects GLM
2010-12-02 (2010 12 02 10 :51 ) 8/ 71 (Poisson distribution) (Binomial distribution) ( ) (Normal distribution, Gaussian ) (Gamma distribution)
2010-12-02 (2010 12 02 10 :51 ) 9/ 71
2010-12-02 (2010 12 02 10 :51 ) 10/ 71
2010-12-02 (2010 12 02 10 :51 ) 11/ 71 : + WinBUGS 1. : GLMM 2.
1. : GLMM
: 8 ( ) ( ) : N i = 8 i y i = 3 : 100 800 :? 2010-12-02 (2010 12 02 10 :51 ) 13/ 71
2010-12-02 (2010 12 02 10 :51 ) 14/ 71 :?! 100 800 403 0.50 0 5 10 15 20 25! 0 2 4 6 8 y i
2010-12-02 (2010 12 02 10 :51 ) 15/ 71 (overdispersion) 0 5 10 15 20 25 0 2 4 6 8 y i 0.5 : overdispersion :?
? : 1. fixed effects 2. random effects 2010-12-02 (2010 12 02 10 :51 ) 16/ 71
2010-12-02 (2010 12 02 10 :51 ) 17/ 71 fixed random? fixed/random effects : 1. fixed effects : ( ) fixed effects 2. random effects : fixed effects ( ) random effects
2010-12-02 (2010 12 02 10 :51 ) 18/ 71 : i N i y i ( ) Ni p(y i q i ) = q y i i (1 q i) N i y i, y i q i
2010-12-02 (2010 12 02 10 :51 ) 19/ 71 q i = q(z i ) (logistic) q(z) = 1/{1 + exp( z)} 0.5 1.0 q(z) 4 2 0 2 4 z z i = a + b i a: b i : i ( )
b i 100 101 (a {b 1, b 2,, b 100 }) /! ( )?? ( ) 2010-12-02 (2010 12 02 10 :51 ) 20/ 71
2010-12-02 (2010 12 02 10 :51 ) 21/ 71 : b i s p(b i s) = 1 2πs 2 exp b2 i 2s 2, = 1 = 1.5 = 3 b i 6 4 2 0 2 4 6 {b 1, b 2,, b 100 }
2010-12-02 (2010 12 02 10 :51 ) 22/ 71 b i s = 1 = 1.5 = 3 b i 6 4 2 0 2 4 6 s b i s b i y i
(C) : b i s s 2010-12-02 (2010 12 02 10 :51 ) 23/ 71 b i? (A) (B) (C)!? s -4-2 0 2 4-4 -2 0 2 4-4 -2 0 2 4 (A) : b i (B) : b i ( -5 5 )
y i = 2 b i b i p(b i y i = 2, s) s p(y i = 2 b i ) b i p(b i s) 6 4 2 0 2 4 6 6 4 2 0 2 4 6 6 4 2 0 2 4 6 s b i 2010-12-02 (2010 12 02 10 :51 ) 24/ 71
y i {2, 3, 5} b i s s 6 4 2 0 2 4 6 6 4 2 0 2 4 6 6 4 2 0 2 4 6 s b i 2010-12-02 (2010 12 02 10 :51 ) 25/ 71
2010-12-02 (2010 12 02 10 :51 ) 26/ 71? データ N[i] Y[i] 胚珠数中の生存 tau は hyper parameter 二項分布生存確率 q[i] 全体の平均 a 無情報事前分布 植物の個体差 b[i] 事前分布 tau 個体差のばらつき 無情報事前分布 ( 超事前分布 )
2010-12-02 (2010 12 02 10 :51 ) 27/ 71 データ N[i] Y[i] 胚珠数中の生存 tau は hyper parameter 二項分布生存確率 q[i] 全体の平均 a 無情報事前分布 植物の個体差 b[i] 事前分布 tau 個体差のばらつき 無情報事前分布 ( 超事前分布 )
? b i s = 0.1? s = 0.1 s individual 1 2 3 posterior prior small large ( ) s 2010-12-02 (2010 12 02 10 :51 ) 28/ 71
2010-12-02 (2010 12 02 10 :51 ) 29/ 71 τ = 1/s 2 s τ (non-informative prior) p(τ ) = τ α 1 e τ β Γ(α)β α, α = β = 10 4
2010-12-02 (2010 12 02 10 :51 ) 30/ 71 (1) τ 0.0 0.2 0.4 0.6 0.8 1.0 ( 1; 1) ( 1; 100) 0 2 4 6 8 10 τ τ
(2) a 0.0 0.1 0.2 0.3 0.4 ( 0; 1) ( 0; 100) -10-5 0 5 10 a (logit) a 2010-12-02 (2010 12 02 10 :51 ) 31/ 71
2010-12-02 (2010 12 02 10 :51 ) 32/ 71 p(a, {b i }, τ ) = 100 i=1 p(y i q(a + b i )) p(a) p(b i τ ) h(τ ) ( ) db i dτ da p(a, {b i }, τ ) 100 i=1 p(y i q(a + b i )) p(a) p(b i τ ) h(τ ) : : p(a, {b i }, τ ) 100 i=1 p(y i q(a + b i )) : p(a) p(b i τ ) h(τ )
2010-12-02 (2010 12 02 10 :51 ) 33/ 71 b i s individual 1 2 3 small hyperprior posterior prior (posterior) large hyperparameter s MCMC
? p(a, {b i }, τ ) 100 i=1 p(y i q(a + b i )) p(a) p(b i τ ) h(τ ) p(a, {b i }, τ ) Markov chain Monte Carlo (MCMC)! WinBUGS 2010-12-02 (2010 12 02 10 :51 ) 34/ 71
Gibbs sampling p(a ) 100 p(y i q(a + b i )) p(a) i=1 p(τ ) 100 i=1 p(b i τ ) h(τ ) p(b 1 ) p(y 1 q(a + b 1 )) p(b 1 τ ) p(b 2 ) p(y 2 q(a + b 2 )) p(b 2 τ ). p(b 100 ) p(y 100 q(a + b 100 )) p(b 100 τ ) 2010-12-02 (2010 12 02 10 :51 ) 35/ 71
2010-12-02 (2010 12 02 10 :51 ) 36/ 71 0 5 10 15 20 25 0 2 4 6 8
2010-12-02 (2010 12 02 10 :51 ) 37/ 71 : 0 5 10 15 20 25 0 2 4 6 8
2010-12-02 (2010 12 02 10 :51 ) 38/ 71 ( ) ( ) ( ) ( ) データ N[i] Y[i] 胚珠数中の生存 tau は hyper parameter 二項分布生存確率 q[i] 全体の平均 a 無情報事前分布 植物の個体差 b[i] 事前分布 tau 個体差のばらつき 無情報事前分布 ( 超事前分布 ) : Markov Chain Monte Carlo (MCMC)
2010-12-02 (2010 12 02 10 :51 ) 39/ 71
2010-12-02 (2010 12 02 10 :51 ) 40/ 71 WinBUGS
2010-12-02 (2010 12 02 10 :51 ) 41/ 71 WinBUGS 1. 2. BUGS (model.bug.txt) 3. R2WBwrapper R (runbugs.r) 4. R runbugs.r (source(runbugs.r) ) 5. bugs
2010-12-02 (2010 12 02 10 :51 ) 42/ 71? データ N[i] Y[i] 胚珠数中の生存 tau は hyper parameter 二項分布生存確率 q[i] 全体の平均 無情報事前分布 p(a, {b i }, τ ) 100 i=1 a 植物の個体差 b[i] 事前分布 tau 個体差のばらつき 無情報事前分布 ( 超事前分布 ) p( q(a + b i )) p(a) p(b i τ ) h(τ )
BUGS model.bug.txt ( ) model{ for (i in 1:N.sample) { Y[i] ~ dbin(q[i], N[i]) # logit(q[i]) <- a + b[i] # q[i] } a ~ dnorm(0, 1.0E-4) # for (i in 1:N.sample) { b[i] ~ dnorm(0, tau) # } tau ~ dgamma(1.0e-4, 1.0E-4) # sigma <- sqrt(1 / tau) # tau SD } 2010-12-02 (2010 12 02 10 :51 ) 43/ 71
2010-12-02 (2010 12 02 10 :51 ) 44/ 71 (hierarchical) random effects (non-informative) fixed effects (subjective) ( )
BUGS BUGS ( ) node ( ) 1. ~ sthochastic node 2. <- deterministic node 2010-12-02 (2010 12 02 10 :51 ) 45/ 71
R2WBwrapper R runbugs.r ( ) source("r2wbwrapper.r") # R2WBwrapper d <- read.csv("data.csv") # clear.data.param() # ( ) set.data("n.sample", nrow(d)) # set.data("n", d$n) # set.data("y", d$y) # 2010-12-02 (2010 12 02 10 :51 ) 46/ 71
R2WBwrapper R runbugs.r ( ) set.param("a", 0) # set.param("sigma", NA) # set.param("b", rep(0, N.sample)) # set.param("tau", 1, save = FALSE) # set.param("p", NA) # post.bugs <- call.bugs( # WinBUGS file = "model.bug.txt", n.iter = 2000, n.burnin = 1000, n.thin = 5 ) 2010-12-02 (2010 12 02 10 :51 ) 47/ 71
WinBUGS post.bugs <- call.bugs( file = "model.bug.txt", # WinBUGS n.iter = 2000, n.burnin = 1000, n.thin = 5 ) default ( ) 3 (n.chains = 3) MCMC sampling ( ) cf. PC MCMC chain 2000 step (n.iter = 2000) 1000 step (n.burnin = 1000) 1001 2000 step 5 step (n.thin = 5) 2010-12-02 (2010 12 02 10 :51 ) 48/ 71
? R source("runbugs.r") WinBUGS MCMC sampling (WinBUGS ) WinBUGS WinBUGS R post.bugs 2010-12-02 (2010 12 02 10 :51 ) 49/ 71
2010-12-02 (2010 12 02 10 :51 ) 50/ 71 R a a?
2010-12-02 (2010 12 02 10 :51 ) 51/ 71 bugs post.bugs (1) plot(post.bugs), R-hat Gelman-Rubin var ˆ ˆR + (ψ y) = W var ˆ + (ψ y) = n 1 n W + 1 n B W : chain variance B : chain variance Gelman et al. 2004. Bayesian Data Analysis. Chapman & Hall/CRC
2010-12-02 (2010 12 02 10 :51 ) 52/ 71 ubothinkpad/public_html/stat/2009/ism/winbugs/model.bug.txt", fit using WinBUGS, 3 chains, each with 1300 i 80% interval for each chain R-hat -10-5 0 5 10 1 1.5 2+ a sigma * b[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] tau * q[1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29] [30] [31] [32] [33] [34] [35] [36] [37] -10-5 0 5 10 1 1.5 2+ * array truncated for lack of space a sigma * b tau * q 0.5 0-0.5 3.5 3 2.5 10 5 0-5 -10 0.2 0.15 0.1 0.05 1 0.5 0 260 240 deviance 220 200 medians and 80% intervals 1 2 3 4 5 6 7 8 910 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 1 2 3 4 5 6 7 8 910 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40
mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff a 0.018 0.322-0.621-0.202 0.025 0.233 0.628 1.030 75 sigma 2.980 0.361 2.346 2.738 2.948 3.205 3.752 1.003 590 b[1] -3.800 1.711-7.652-4.776-3.503-2.554-1.193 1.002 1100 b[2] -1.142 0.874-3.003-1.688-1.111-0.530 0.464 1.010 200 b[3] 1.992 1.047 0.169 1.251 1.889 2.665 4.346 1.005 390 b[4] 3.745 1.781 0.975 2.503 3.408 4.751 7.926 1.008 520 b[5] -2.005 1.066-4.257-2.719-1.909-1.257-0.131 1.005 370 b[6] 2.047 1.077 0.147 1.310 1.933 2.716 4.456 1.002 1100 b[7] 3.765 1.763 1.023 2.482 3.593 4.811 7.515 1.000 1200 b[8] 3.782 1.661 1.133 2.591 3.570 4.703 7.621 1.003 640 b[9] -2.049 1.106-4.439-2.745-1.948-1.255-0.218 1.004 470 b[10] -2.028 1.066-4.340-2.655-1.902-1.314-0.175 1.002 750... 2010-12-02 (2010 12 02 10 :51 ) 53/ 71 bugs post.bugs (2) print(post.bugs, digits.summary = 3) 95%
2010-12-02 (2010 12 02 10 :51 ) 54/ 71 mcmc.list post.list <- to.list(post.bugs) plot(post.list[,1:4,], smooth = F),
2010-12-02 (2010 12 02 10 :51 ) 55/ 71
2010-12-02 (2010 12 02 10 :51 ) 56/ 71 mcmc post.mcmc <- to.mcmc(post.bugs) matrix : 0 5 10 15 20 25 0 2 4 6 8
2.
2010-12-02 (2010 12 02 10 :51 ) 58/ 71 : abundance 0 5 10 15 20 25 ( ) 0 10 20 30 40 50 location :
2010-12-02 (2010 12 02 10 :51 ) 59/ 71 : abundance 0 5 10 15 20 25 0 10 20 30 40 50 location ( 95% )
β : β Normal(0, 10 2 ), 2010-12-02 (2010 12 02 10 :51 ) 60/ 71 1 2 3 4 i λ i : y i Poisson(λ i ) λ i ( ) + ( ) : log λ i = β + r i
2010-12-02 (2010 12 02 10 :51 ) 61/ 71 ( ) 1 2 3 4 Conditional Autoregressive (CAR) r i (N i i, J i i ): j J r i Normal( i r j, σ ) σ N i N i σ : τ = 1/σ Gamma(1.0 2, 1.0 2 ) p(β, {r i }, τ {y i }) = p({y i} β, {r i }, τ ) ( ( )dβdr1 dr 50 dτ
τ - τ (σ ) - τ (σ ) - τ ( ) abundance 0 5 10 15 20 25 tau = 1000 0 10 20 30 40 50 abundance 0 5 10 15 20 25 tau = 20 0 5 10 15 20 25 tau = 0.01 0 10 20 30 40 50 0 10 20 30 40 50 location location 2010-12-02 (2010 12 02 10 :51 ) 62/ 71
2010-12-02 (2010 12 02 10 :51 ) 63/ 71 BUGS model { # BUGS for (i in 1:N.site) { Y[i] ~ dpois(mean[i]) # log(mean[i]) <- beta + re[i] # ( ) + ( ) } # re[i] CAR model re[1:n.site] ~ car.normal(adj[], Weights[], Num[], tau) beta ~ dnorm(0, 1.0E-2) # tau ~ dgamma(1.0e-2, 1.0E-2) # } 1 2 3 4
2010-12-02 (2010 12 02 10 :51 ) 64/ 71 abundance 0 5 10 15 20 25 β 0.0 0.5 1.0 1.5 2.0 2.5 3.0 beta τ 0 10 20 30 40 50 location 0 10 20 30 40 50 tau ( 95% )
2010-12-02 (2010 12 02 10 :51 ) 65/ 71 abundance 0 5 10 15 20 25 0 10 20 30 40 50 location GLMM OK?
2010-12-02 (2010 12 02 10 :51 ) 66/ 71 vs abundance 0 5 10 15 20 25 abundance 0 5 10 15 20 25 0 10 20 30 40 50 location 0 10 20 30 40 50 location?
2010-12-02 (2010 12 02 10 :51 ) 67/ 71 ( ):?!! abundance 0 5 10 15 20 25 0 10 20 30 40 50 location (
2010-12-02 (2010 12 02 10 :51 ) 68/ 71 abundance 0 5 10 15 20 25 abundance 0 5 10 15 20 25 0 10 20 30 40 50 location 0 10 20 30 40 50 location!
2010-12-02 (2010 12 02 10 :51 ) 69/ 71 abundance 0 5 10 15 20 25 abundance 0 5 10 15 20 25 0 10 20 30 40 50 0 10 20 30 40 50 location location CAR
2010-12-02 (2010 12 02 10 :51 ) 70/ 71 : abundance 0 5 10 15 20 25 abundance 0 5 10 15 20 25 0 10 20 30 40 50 location 0 10 20 30 40 50 location
2010-12-02 (2010 12 02 10 :51 ) 71/ 71 : 1. : GLMM 2.