
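Missing data analysis

1/16/2011
e-mail: murakou@orion.ocn.ne.jp

Topics: missing data (missing values); list-wise deletion; pair-wise deletion; the full information maximum likelihood method (FIML); the multiple imputation method.

Section 1 introduces the three missing-data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). It also covers FIML and the auxiliary variable, following Enders (2010).

1 MCAR, MAR, and MNAR

Missing data are classified into three types, MCAR, MAR, and MNAR (Rubin, 1976) (note 1).

1.1 Missing Completely At Random (MCAR)

In the MCAR panel of Figure 1, Y is the variable with missing values, X is another variable, and R is the missingness indicator, coded 0 or 1. Under MCAR, R is related to neither X nor Y.

Note 1: SEM, HLM.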

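[Figure 1: MCAR, MAR, MNAR. Three path diagrams, one per mechanism, each relating X, Y, and the missingness indicator R.]

1.2 Missing At Random (MAR)

MAR is a weaker condition than MCAR. MCAR requires that missingness in Y be unrelated to both X and Y (note 2). Under MAR (Figure 1), R may be related to X, but not to Y once X is taken into account. Table 1 shows an example with three variables (note 3), in which IQ plays the role of X. The data are not MCAR, because Y is missing for the cases with low IQ. If R has no remaining relation with Y after IQ is taken into account, the data are MAR. MAR is therefore defined relative to the variables X in the analysis: with IQ in the model, the missingness in Table 1 is MAR.

Note 2: FIML.
Note 3: R.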

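Because IQ explains the missingness in Y, the data in Table 1 are MAR when IQ is included.

Table 1: Example of MAR data

id     (unlabeled)   IQ      Y (observed)   Y (complete)
1      3             83      n/a            93
2      4             85      n/a            99
3      5             95      n/a            98
4      2             96      n/a            103
5      5             103     128            128
6      3             104     102            102
7      2             109     111            111
8      6             112     113            113
9      3             115     117            117
10     3             116     133            133
Mean   3.6           101.8   117.3          111.7

1.3 Missing Not At Random (MNAR)

Under MNAR (Figure 1), R is related to Y itself, even after X is taken into account.

1.4 Auxiliary variables

Whether missingness is MAR or MNAR depends on which variables are included: missingness that is MNAR without a variable can become MAR once that variable is added. A variable added to make MAR more plausible is called an auxiliary variable, and including such variables is called the inclusive analysis strategy (Enders, 2010; Rubin, 1996; Schafer & Graham, 2002). Auxiliary variables can be used with FIML under MAR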

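(Enders, 2008). In FIML, adding variables related to R or Y makes MAR more plausible.

[Figure 2: The inclusive analysis strategy. A denotes auxiliary variables; adding A turns an MNAR situation into MAR.]

1.5 MCAR and MAR

In Table 1, the mean of the observed values is 117.3, whereas the complete-data mean is 111.7. Following Rubin, the full information maximum likelihood method (FIML) is appropriate under both MCAR and MAR, so FIML does not require MCAR.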
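The pattern in Table 1 can be checked with a few lines of Python. This is a minimal sketch that assumes the values are exactly as printed in Table 1. The variable names (`iq`, `y_observed`, `missing_iq`, `observed_iq`) are illustrative and not part of the original handout.

```python
from statistics import mean

# IQ and the observed values of Y, copied from Table 1 (None marks n/a).
iq = [83, 85, 95, 96, 103, 104, 109, 112, 115, 116]
y_observed = [None, None, None, None, 128, 102, 111, 113, 117, 133]

# Split IQ by whether Y is missing (R = 1) or observed (R = 0).
missing_iq = [x for x, y in zip(iq, y_observed) if y is None]
observed_iq = [x for x, y in zip(iq, y_observed) if y is not None]

# Mean of Y using only the observed cases.
mean_y_observed = mean(y for y in y_observed if y is not None)

print("IQ of cases with missing Y:", missing_iq)
print("IQ of cases with observed Y:", observed_iq)
print("Observed-case mean of Y:", round(mean_y_observed, 1))
```

Every case with a missing Y has a lower IQ than every case with an observed Y. Missingness therefore depends on IQ, which is why the observed-case mean differs from the complete-data mean.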

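Under MNAR, FIML alone is not enough. Models for MNAR data include the selection model of Heckman (1979) and the pattern mixture model of Glynn, Laird & Rubin (1986). MNAR and MAR are hard to tell apart in practice (e.g., Schafer & Graham, 2002), so MAR-based FIML is commonly used (Schafer, 2003, p. 30). MCAR, unlike MAR, can be tested (note 4). In short, FIML is recommended whether the data are MCAR or MAR.

2 Full information maximum likelihood method (FIML)

This section describes FIML.

2.1 Maximum likelihood estimation

Let x be a p × 1 vector of variables following a multivariate normal distribution with mean vector µ and covariance matrix Σ. Its density is

$$ f(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)' \Sigma^{-1} (x - \mu) \right) \tag{1} $$

For the first observation x_1, given µ and Σ (note 5), the density is

$$ f(x_1 \mid \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x_1 - \mu)' \Sigma^{-1} (x_1 - \mu) \right) \tag{2} $$

Note 4: Little (1988) proposes a test of MCAR.
Note 5: (text lost)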

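The same holds for observations 2, 3, and so on. For the i-th observation x_i,

$$ f(x_i \mid \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) \right) \tag{3} $$

If the observations x_1 to x_N (i = 1, 2, ..., N) are independent, their joint density is

$$ f(x_1, x_2, \ldots, x_N \mid \mu, \Sigma) = \prod_{i=1}^{N} \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) \right) \tag{4} $$

Viewed as a function of µ and Σ with the data fixed, (4) is the likelihood function L(µ, Σ). Writing the contribution (3) of observation x_i as L_i(µ, Σ), the log-likelihood is

$$ \log L(\mu, \Sigma) = \log \prod_{i=1}^{N} L_i(\mu, \Sigma) = \sum_{i=1}^{N} \log L_i(\mu, \Sigma) \tag{5} $$

Maximizing (5) is equivalent to maximizing (4), and the values of µ and Σ that do so are the maximum likelihood estimates. In structural equation modeling (SEM), Σ is the covariance matrix implied by the model.

2.2 FIML with missing data

The likelihood of case i is

$$ L_i(\mu, \Sigma) = f(x_i \mid \mu, \Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x_i - \mu)' \Sigma^{-1} (x_i - \mu) \right) \tag{6} $$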

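When x_i has three variables, x_i is 3 × 1, µ is 3 × 1, and Σ is 3 × 3. For the case with id = 5 in Table 1,

$$ x_5 = \begin{pmatrix} 5 \\ 103 \\ 148 \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} \\ \sigma_{21} & \sigma_2^2 & \sigma_{23} \\ \sigma_{31} & \sigma_{32} & \sigma_3^2 \end{pmatrix} \tag{7} $$

All elements of µ and Σ are used. For the case with id = 1, the third variable is missing and only the first variable and IQ are observed. For this case, (6) is evaluated with two variables only: x_i is 2 × 1, µ is 2 × 1, and Σ is 2 × 2.

$$ x_1 = \begin{pmatrix} 3 \\ 83 \end{pmatrix}, \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} \tag{8} $$

Here the first variable is 3 and IQ is 83, and only the corresponding elements of µ and Σ are used. FIML combines these casewise likelihoods as in (4) and (5) (note 6). Applied to the MAR data in Table 1, FIML gives estimated means of 3.6 for the first variable, 101.8 for IQ, and 110.9 for the third variable (note 7). The last value is close to the complete-data mean (111.7) (note 8). The cases with missing values have low IQ. FIML uses this IQ information, so its estimate is not pulled toward the observed-case mean of 117.3.

Note 6: FIML.
Note 7: Estimated with Mplus.
Note 8: IQ, 117.2.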
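The casewise likelihood in (6) to (8) can be sketched in plain Python. For each case, only the observed entries of x_i and the matching elements of µ and Σ enter the multivariate normal log-density, and the log-likelihoods are summed as in (5). This is an illustrative sketch, not the algorithm of any particular SEM program. The function names (`case_loglik`, `fiml_loglik`), the use of `None` for a missing value, and the parameter values in the usage lines are assumptions made for the example.

```python
import math

def _solve_and_logdet(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting.

    Returns (z, log|det a|). Intended for small covariance submatrices.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        logdet += math.log(abs(m[col][col]))
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = s / m[i][i]
    return z, logdet

def case_loglik(x, mu, sigma):
    """Log of L_i in (6), using only the observed (non-None) entries of x."""
    obs = [j for j, v in enumerate(x) if v is not None]
    dev = [x[j] - mu[j] for j in obs]
    sub = [[sigma[j][k] for k in obs] for j in obs]  # observed block of Sigma
    z, logdet = _solve_and_logdet(sub, dev)          # z = Sigma_obs^{-1} dev
    quad = sum(d * zi for d, zi in zip(dev, z))
    return -0.5 * (len(obs) * math.log(2 * math.pi) + logdet + quad)

def fiml_loglik(data, mu, sigma):
    """Sum of casewise log-likelihoods, as in (5)."""
    return sum(case_loglik(x, mu, sigma) for x in data)

# Usage with two rows of Table 1 (id = 1 has a missing third variable).
# The parameter values below are hypothetical, chosen only for illustration.
rows = [[3, 83, None], [3, 104, 102]]
mu = [3.6, 101.8, 110.0]
sigma = [[2.0, 0.0, 0.0], [0.0, 140.0, 60.0], [0.0, 60.0, 150.0]]
print(fiml_loglik(rows, mu, sigma))
```

An optimizer would maximize `fiml_loglik` over µ and Σ. The sketch only evaluates the function, which is the part that changes when data are missing.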

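Because the missingness is MAR given IQ, FIML borrows the information in IQ to estimate the parameters of the variable with missing values. For details on estimation under MAR, see Enders (2010). The parameters are µ and Σ; in SEM, Σ is the model-implied covariance matrix (note 9).

2.3 Software

FIML is available in SEM software (e.g., AMOS, LISREL, EQS, Mplus, and Mx). FIML is also available for the mixed model in SAS.

2.4 Inclusive analysis strategy

FIML assumes MAR. In the example, the missingness is MAR only when IQ is included, so IQ has to be in the model even if it is not of substantive interest. Such variables are auxiliary variables, and they can be included in SEM (Enders, 2008), as shown in Figure 3.

Note 9: In FIML the likelihood is computed case by case, as if the model were fitted separately to each case (N = 1), so the χ² statistic

$$ \chi^2 = (N - 1) \log L(\hat{\theta}) \tag{9} $$

is not used. In FIML, the χ² statistic is obtained from the log-likelihoods of two models, L_0 and L_1:

$$ \chi^2 = 2 (\log L_1 - \log L_0) \tag{10} $$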

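[Figure 3: Model with auxiliary variables A1 and A2 added to the regression of Y on X1 and X2, after Enders (2008).]

[Figure 4: Latent variable model (F1, F2) with auxiliary variables A1 and A2, after Enders (2008).]

The inclusive analysis strategy can be used with FIML in SEM. In Mplus, auxiliary variables are specified with auxiliary = in the VARIABLE command.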

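CFI and TLI are incremental fit indices: they compare the fitted model with the independent model (note 10). With auxiliary variables in the model, CFI and TLI need care; see Enders (2010).

3 Multiple imputation method

Unlike FIML, the multiple imputation method is an imputation method. It builds on stochastic regression imputation. Stochastic regression imputation gives appropriate parameter estimates under MAR, but it treats the imputed values as if they were observed, so the standard errors are too small. Rubin (1987) proposed repeating stochastic regression imputation several times to solve this problem.

Note 10: On CFI and TLI, see Wu, West, & Taylor (2009).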

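As Figure 5 shows, the multiple imputation method has three steps. In the imputation step, several data sets are created in which the missing values are filled in. In the analysis step, the target analysis (regression, ANOVA, SEM, etc.) is run on each data set; this step is straightforward. In the integration step (posterior step), the results are combined.

[Figure 5: Multiple imputation method. Imputation step: N pseudo-complete data sets are created from the original data set with missing values. Analysis step: the target statistical analysis is run on each data set, giving N sets of estimates and standard errors (estimate 1 and standard error 1, estimate 2 and standard error 2, ..., estimate N and standard error N). Integration step: these are combined into a single estimate and standard error.]

3.1 Imputation step

The imputation step commonly uses the data augmentation method, which is implemented in SAS proc MI and in NORM (note 11). The SPSS multiple imputation module uses the sequential regression approach (or chained equations approach); see van Buuren (2007). In both cases the imputed values are drawn from the posterior predictive distribution.

Note 11: Schafer (1997). http://www.stat.psu.edu/ jls/misoftwa.html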

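Data augmentation is a Markov chain Monte Carlo (MCMC) method. It proceeds in six numbered steps: step 3 imputes the missing values with a stochastic regression model, and step 6 repeats steps 2-5. Steps 3 to 5 can be written as follows.

In step 3, the imputed values Y_t at iteration t are drawn from

$$ p(Y_t \mid \mu_{t-1}, \sigma_{t-1}, Y_{obs}) \tag{11} $$

where Y_obs is the observed data and µ_{t-1} and σ_{t-1} are the parameter values from the previous iteration. The initial values µ_0 and σ_0 are set in step 1.

In steps 4 and 5, the parameters are drawn given the imputed values Y_t from step 3:

$$ p(\mu_t \mid Y_t, \sigma_{t-1}, Y_{obs}) \tag{12} $$

gives µ_t, the value of µ at iteration t, and

$$ p(\sigma_t \mid Y_t, \mu_t, Y_{obs}) \tag{13} $$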
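The cycle in (11) to (13) can be sketched for the simplest possible case: a single normally distributed variable with some values missing and no predictors. This is an assumption-laden illustration, not the algorithm used by SAS proc MI or NORM. It assumes a flat prior on µ and a prior proportional to 1/σ² on σ². The function name `data_augmentation`, the default chain length, and the `burn_in` argument (the number of initial iterations discarded) are hypothetical choices.

```python
import random
import statistics

def data_augmentation(y, n_iter=1200, burn_in=200, seed=0):
    """Data augmentation for one normal variable; None marks a missing value.

    Returns a list of (mu_t, sigma2_t, completed_data_t) after burn-in.
    """
    rng = random.Random(seed)
    observed = [v for v in y if v is not None]
    n = len(y)
    # Starting values mu_0 and sigma_0^2 taken from the observed cases.
    mu = statistics.mean(observed)
    sigma2 = statistics.variance(observed)
    draws = []
    for t in range(n_iter):
        # (11): draw the missing values given the previous mu and sigma.
        completed = [v if v is not None else rng.gauss(mu, sigma2 ** 0.5)
                     for v in y]
        # (12): draw mu given the completed data and the previous sigma.
        mu = rng.gauss(statistics.mean(completed), (sigma2 / n) ** 0.5)
        # (13): draw sigma^2 given the completed data and the new mu.
        # Under the assumed prior, sum of squares / sigma^2 ~ chi-square(n).
        ss = sum((v - mu) ** 2 for v in completed)
        sigma2 = ss / rng.gammavariate(n / 2, 2.0)
        if t >= burn_in:
            draws.append((mu, sigma2, completed))
    return draws

# Usage with the observed Y column of Table 1.
y = [None, None, None, None, 128, 102, 111, 113, 117, 133]
draws = data_augmentation(y)
print(len(draws), "retained iterations")
```

Without a predictor such as IQ, this univariate sketch cannot correct for the MAR mechanism in Table 1. It only shows how the imputation and parameter draws alternate.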

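gives σ_t, the value of σ at iteration t (note 12). Once µ_t and σ_t are drawn, the procedure returns to (11). Repeating these steps produces the sequence

$$ \mu_0, \sigma_0, Y_1, \mu_1, \sigma_1, Y_2, \mu_2, \sigma_2, Y_3, \ldots \tag{14} $$

of Y, µ, and σ. The Y early in the sequence depend on the initial values and are discarded (burn-in). For example, Y up to Y_200 are discarded and Y from Y_201 onward are used (note 13). This sequential procedure is a variant of the Gibbs sampler. Imputed data sets should be taken from iterations far enough apart to be treated as independent (note 14); an autocorrelation function plot can be used to check this. For MCMC, see (2008).

3.2 Integration step (posterior/integration step)

The results of the analysis (ANOVA, SEM, etc.) from each imputed data set are combined.

Note 12: (text lost)
Note 13: (text lost)
Note 14: (text lost)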

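3.2.1 Parameter estimates

Let \hat{\theta}_t be the estimate from the t-th imputed data set and m the number of imputed data sets. The combined estimate is the mean

$$ \bar{\theta} = \frac{1}{m} \sum_{t=1}^{m} \hat{\theta}_t \tag{15} $$

3.2.2 Standard errors

$$ V_W = \frac{1}{m} \sum_{t=1}^{m} SE_t^2 \tag{16} $$

where SE_t is the standard error from the t-th data set. V_W is the within-imputation variance.

$$ V_B = \frac{1}{m - 1} \sum_{t=1}^{m} (\hat{\theta}_t - \bar{\theta})^2 \tag{17} $$

V_B is the between-imputation variance. V_W and V_B are combined in a way similar to the within and between variances in ANOVA:

$$ V_T = V_W + V_B + \frac{V_B}{m} \tag{18} $$

$$ SE = \sqrt{V_T} \tag{19} $$

V_W is the sampling variance that would arise with complete data. V_B reflects the additional uncertainty due to the missing values, which a single stochastic regression imputation ignores. The term V_B / m in (18) corrects for the finite number of imputations.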
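Equations (15) to (19) translate directly into a short function. The sketch below assumes the m estimates and their standard errors have already been obtained in the analysis step. The function name `pool_rubin` and the numbers in the usage line are made up for illustration.

```python
import math

def pool_rubin(estimates, standard_errors):
    """Combine m estimates and standard errors following (15) to (19)."""
    m = len(estimates)
    theta_bar = sum(estimates) / m                                # (15)
    v_w = sum(se ** 2 for se in standard_errors) / m              # (16) within
    v_b = sum((e - theta_bar) ** 2 for e in estimates) / (m - 1)  # (17) between
    v_t = v_w + v_b + v_b / m                                     # (18) total
    return theta_bar, math.sqrt(v_t)                              # (19)

# Usage with three hypothetical imputed-data results.
theta, se = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(theta, se)
```

In this toy input V_W = 1 and V_B = 1, so V_T = 1 + 1 + 1/3 and the pooled standard error is larger than any single-data-set standard error. This is the intended effect of accounting for the missing values.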

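3.3 Practical issues

3.3.1 Number of imputations

Traditionally, 3-5 imputations were considered sufficient (e.g., Rubin, 1987). Graham, Olchowski, & Gilreath (2007) show that more are needed, and Enders (2010) recommends 20.

3.3.2 Nested data

The imputation model must make MAR plausible for the analysis that follows. A single-level imputation model is not appropriate for nested data (hierarchical data) that will be analyzed with a hierarchical linear model (HLM) (note 15).

3.3.3 Rounding

An imputed value can be a non-integer such as 2.33 even when the observed variable takes only integer values. On rounding, see Allison (2002).

Note 15: A variant of Norm (http://www.stat.psu.edu/ jls/misoftwa.html) and Mplus version 6.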

3.3.4 Imputation with many items

When a scale consists of many items, imputing every item is difficult. Enders (2010) describes duplicate-scale imputation, in which the items of a scale are split into two subscales X_1 and X_2 and the imputation is run with these subscales. Little et al. (2008) (note 16) propose a three-step approach as an alternative to the duplicate-scale method.

3.4 Software

Multiple imputation is available in SAS and SPSS; SPSS uses the sequential regression model. Norm by Schafer (1997) (note 17) is also available. When the target analysis is SEM or HLM, data sets imputed in SAS or Norm can be passed to the analysis software (note 18).

4 Summary

Under MAR, FIML is the natural choice when the analysis is SEM, and FIML is available in Mplus.

Note 16: http://www.crmda.ku.edu/pdf/11. Imputation with Large Data Sets.pdf
Note 17: http://www.stat.psu.edu/ jls/misoftwa.html
Note 18: Mplus version 6.

5 References

Allison, P. D. (2002). Missing data. Newbury Park, CA: Sage.

Enders, C. K. (2008). A note on the use of missing auxiliary variables in FIML-based structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 15, 434-448.

Enders, C. K. (2010). Applied missing data analysis. New York: Guilford.

Glynn, R. J., Laird, N. M., & Rubin, D. B. (1986). Selection modeling versus mixture modeling with nonignorable nonresponse. In H. Wainer (Ed.), Drawing inferences from self-selected samples (pp. 116-142). Berlin: Springer-Verlag.

Graham, J. W., Olchowski, A. E., & Gilreath, T. D. (2007). How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science, 8, 206-213.

Heckman, J. J. (1976). The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement, 5, 475-492.

Little, R. J. (1988). A test of missing completely at random for multivariate data with missing values. Journal of the American Statistical Association, 83, 1198-1202.

Little, R. J. A., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

Rubin, D. B. (1996). Multiple imputation after 18+ years (with discussion). Journal of the American Statistical Association, 91, 473-489.

Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman & Hall.

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7, 147-177.

(2008).

van Buuren, S. (2007). Multiple imputation of discrete and continuous data by fully conditional specification. Statistical Methods in Medical Research, 16, 219-242.

Wu, W., West, S. G., & Taylor, A. B. (2009). Evaluating model fit for growth curve models: Integration of fit indices from SEM and MLM frameworks. Psychological Methods, 14, 183-201.