Quantitative Analysis with R: Data Analysis and Visualization - Session 3: R Basics and Data Manipulation and Management


Quantitative Analysis with R, Session 3 (2017)
Email: gito@eco.u-toyama.ac.jp
October 23, 2017 (Toyama/NIHU)

Agenda 1 2 3 4 R 5 RStudio

10/30 (Mon.) 12/11 (Mon.) New! 1/9 (Tue.) New!

Regression analysis (regression analysis): OLS, GLM. Inferential statistics (inferential statistics).

50% 30% Google (http://toyokeizai.net/articles/-/171160?display=b) 2017 R


Data (singular: datum): definitions (1), (2) (https://kotobank.jp). Example: GDP.

Key terms: unit of observation (unit of observation), unit of analysis (unit of analysis), variable (variable; e.g., GDP), constant (constant).

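Four levels of measurement:
1. Nominal scale (nominal scale): values are labels that only distinguish categories; they have no order.
2. Ordinal scale (ordinal scale): the order of values is meaningful (1st before 2nd), but the distance between values is not.
3. Interval scale (interval scale): differences are meaningful (the difference between 5 and 10 is 5), but ratios are not (10 is not "twice" 5), because the zero point is arbitrary.
4. Ratio scale (ratio scale): there is a true zero (0), so ratios are meaningful: 100kg is 2 times 50kg.
Information content: ratio > interval > ordinal > nominal.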
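The four scales above can be sketched in R as follows. The variables and values are hypothetical illustrations, not data from the lecture; the point is only that `factor()` suits nominal data, an ordered factor suits ordinal data, and plain numeric vectors suit interval and ratio data.

```r
# Hypothetical example values (assumptions, not from the slides)
blood  <- factor(c("A", "B", "O", "A"))                 # nominal: labels only
rating <- factor(c("low", "high", "mid", "low"),
                 levels = c("low", "mid", "high"),
                 ordered = TRUE)                        # ordinal: order matters
temp_c    <- c(5, 10)      # interval: differences are meaningful
weight_kg <- c(50, 100)    # ratio: true zero, ratios are meaningful

is.ordered(blood)             # FALSE: no order among categories
rating[1] < rating[2]         # TRUE: "low" ranks below "high"
diff(temp_c)                  # 5: a meaningful difference
weight_kg[2] / weight_kg[1]   # 2: 100 kg is twice 50 kg
```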

Statistic (statistic)

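Mean (mean, average): for $n$ observations $x = (x_1, x_2, \ldots, x_n)$, the mean $\bar{x}$ is

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \qquad (1)$$

Median (median): for $n$ observations $x$, the median $m$ is the value satisfying

$$\int_{-\infty}^{m} dF(x) \geq \frac{1}{2} \quad \text{and} \quad \int_{m}^{\infty} dF(x) \geq \frac{1}{2} \qquad (2)$$

That is, $m$ splits the $n$ sorted observations into two halves.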
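As a quick check of the mean in (1) and the median in (2), the following sketch uses small made-up vectors (assumptions, not lecture data). Note that for an even number of observations, R's `median()` averages the two middle values.

```r
# Made-up data for illustration
x <- c(2, 4, 4, 5, 10)

x_bar <- sum(x) / length(x)   # equation (1) written out
x_bar                         # 5
mean(x)                       # 5, the built-in equivalent

median(x)                     # 4: middle value of the sorted data (odd n)

y <- c(1, 3, 5, 7)
median(y)                     # 4: average of the two middle values (even n)
```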

Mode (mode): the most frequent value. Example: for x = (1, 1, 1, 1, 1, 2, 3, 4, 5, 6), the mode is 1.
Outlier (outlier): an extreme value far from the rest of the data.
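Base R has no built-in function for the statistical mode (`mode()` returns an object's storage mode instead), so a minimal sketch is to tabulate the values and pick the most frequent one. The vector is the one from the slide.

```r
x <- c(1, 1, 1, 1, 1, 2, 3, 4, 5, 6)

counts <- table(x)                        # frequency of each distinct value
mode_x <- as.numeric(names(counts)[counts == max(counts)])

mode_x      # 1
mean(x)     # 2.5
median(x)   # 1.5
```

If several values tie for the highest frequency, `mode_x` contains all of them.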

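Example: n = 10,000 observations generated with parameters 500 and 100 (mean $\bar{x}$ = 500, median m = 500).
[Figure: histogram (x-axis 200 to 800, y-axis Frequency) with the median and mean marked.]
After adding 100 observations with the value 10^4: mean $\bar{x}$ = 594, median m = 501.
[Figure: histogram (x-axis 0 to 10000, y-axis Frequency).]
Outliers making up only about 100/10,000 = 1/100 of the data shift the mean substantially, while the median barely moves: the median is robust to outliers.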
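The example above can be reproduced approximately with a short simulation. This is a sketch under assumptions: the slide gives only the numbers 500 and 100, so the normal distribution with mean 500 and standard deviation 100, and the seed, are choices made here.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Assumption: normal data with mean 500 and sd 100
x <- rnorm(10000, mean = 500, sd = 100)

# Add 100 outliers with value 10^4
x_out <- c(x, rep(10^4, 100))

c(mean = mean(x),     median = median(x))      # both close to 500
c(mean = mean(x_out), median = median(x_out))  # mean near 594, median near 501
```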

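Dispersion (variance, IQR): unbiased variance (unbiased variance). For $n$ observations $x = (x_1, x_2, \ldots, x_n)$ with mean $\bar{x}$, the unbiased variance $\sigma_x^2$ is

$$\sigma_x^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \qquad (3)$$

The square root of $\sigma_x^2$, $\sigma_x$, is the standard deviation (standard deviation, sd). Note that the denominator in (3) is $n - 1$, not $n$.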
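A short sketch comparing equation (3) with R's built-ins, on made-up data (an assumption). R's `var()` and `sd()` use the $n - 1$ denominator.

```r
x <- c(2, 4, 4, 5, 10)   # made-up data
n <- length(x)

ss <- sum((x - mean(x))^2)     # sum of squared deviations: 36
s2_unbiased <- ss / (n - 1)    # equation (3): 9
s2_n        <- ss / n          # dividing by n instead: 7.2

all.equal(s2_unbiased, var(x))       # TRUE: var() divides by n - 1
all.equal(sqrt(s2_unbiased), sd(x))  # TRUE: sd() is its square root
```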

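Dispersion (IQR): inter-quartile range (inter-quartile range, IQR). Sort the $n$ observations $x = (x_1, x_2, \ldots, x_n)$ and split them into 4 equal parts. The IQR is the third quartile $Q_{3/4}$ (upper quartile) minus the first quartile $Q_{1/4}$ (lower quartile):

$$\mathrm{IQR} = Q_{3/4} - Q_{1/4}$$

The median is $m = Q_{2/4} = Q_{1/2}$; $Q_{0/4}$ is the minimum and $Q_{4/4}$ the maximum. The IQR covers the middle 50% of the data. Values outside $[Q_{1/4} - 1.5\,\mathrm{IQR},\; Q_{3/4} + 1.5\,\mathrm{IQR}]$ are treated as outliers (outlier) in a box-and-whisker plot (box-and-whisker plot). Deciles $q/10$ are defined analogously for $q = 0, 1, \ldots, 10$.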
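The quartiles, the IQR, and the 1.5 IQR outlier rule can be sketched as below. The data are made up, and the quartiles follow R's default `quantile()` interpolation rule (type 7), which may differ slightly from a textbook definition.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)   # made-up data with one extreme value

q <- quantile(x, probs = c(0.25, 0.5, 0.75))
q1 <- q[["25%"]]    # lower quartile: 3.25
q3 <- q[["75%"]]    # upper quartile: 7.75

iqr <- q3 - q1      # 4.5, same as IQR(x)

lower <- q1 - 1.5 * iqr   # -3.5
upper <- q3 + 1.5 * iqr   # 14.5

outliers <- x[x < lower | x > upper]
outliers            # 50
```

`boxplot(x)` draws the corresponding box-and-whisker plot.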

Key terms: population (population), data generating process (data generating process), sample (sample), sampling (sampling), statistical inference (statistical inference), error (error).

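Sample size (sample size): the number of observations in a sample, N.
Number of samples (number of samples): how many samples were drawn.
Example: two surveys with 1,500 and 2,000 respondents give 2 samples, with sample sizes 1,500 and 2,000.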

Parameter (parameter) (e.g., ) 1 2000 91.0% 2 5 2011 (http://www.asahi.com/edu/hiraku/hiraku2011/article01.html) Standard error (standard error) 1 2

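Central Limit Theorem (Central Limit Theorem, CLT): let $X_1, X_2, \ldots, X_n$ be $n$ draws of a random variable $X$ with sample mean $\bar{X}_n$, variance $\sigma_X^2$, and expectation $E[X]$. As $n$ grows, $Z_n$, the standardized version of $\bar{X}_n - E[X]$ (mean 0, variance 1), converges in distribution to the standard normal distribution $N(0, 1)$:

$$Z_n = \frac{\sqrt{n}\,(\bar{X}_n - E[X])}{\sqrt{\sigma_X^2}} = \frac{\sqrt{n}\,(\bar{X}_n - E[X])}{\sigma_X} \qquad (4)$$

Equivalently, for large $n$, $\bar{X}_n - E[X]$ is approximately distributed as $N(0, \sigma_X^2 / n)$. Here $N(\mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and standard deviation $\sigma$ (variance $\sigma^2$).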
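The theorem can be illustrated with a small simulation. This is a sketch under assumptions chosen here, not the lecture's own example: draws come from an exponential distribution with rate 1 (so $E[X] = 1$ and $\sigma_X = 1$), with n = 200 and 5,000 replications.

```r
set.seed(42)   # arbitrary seed
n    <- 200    # sample size per replication
reps <- 5000   # number of replications

# Exponential(rate = 1) is skewed, with E[X] = 1 and sigma_X = 1
z <- replicate(reps, {
  x <- rexp(n, rate = 1)
  sqrt(n) * (mean(x) - 1) / 1   # Z_n as in equation (4)
})

c(mean = mean(z), sd = sd(z))   # close to 0 and 1
mean(abs(z) <= 1.96)            # close to 0.95, as for N(0, 1)
```

`hist(z)` shows a roughly bell-shaped histogram even though the underlying data are skewed.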

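Let $X_1, X_2, \ldots, X_n$ be $n$ draws of $X$ with sample mean $\bar{X}_n$, variance $\sigma_X^2$, and expectation $E[X]$. For sufficiently large $n$, the following holds with probability 95%:

$$\bar{X}_n - 1.96\sqrt{\sigma_X^2 / n} \;\leq\; E[X] \;\leq\; \bar{X}_n + 1.96\sqrt{\sigma_X^2 / n} \qquad (5)$$

This follows because $\bar{X}_n$ is approximately $N(E[X], \sigma_X^2 / n)$, and 95% of a normal distribution lies within 1.96 standard deviations of its mean.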

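The interval in (5) is the 95% confidence interval (confidence interval, CI) for $E[X]$.
Standard error (standard error, SE): $\sqrt{\sigma_X^2 / n} = \sigma_X / \sqrt{n}$.
95% CI: $[\bar{X}_n - 1.96\,\mathrm{SE},\; \bar{X}_n + 1.96\,\mathrm{SE}]$.
When $n$ is small, use the $t$ distribution: the multiplier 1.96 is replaced by the corresponding $t$ value.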
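A minimal sketch of the 95% CI computation on made-up data. One assumption to note: the true $\sigma_X$ is unknown in practice, so the sample standard deviation `sd(x)` stands in for it here, which is also why the $t$-based interval is preferred for a small sample like this one.

```r
x <- c(4.8, 5.1, 5.5, 4.9, 5.3, 5.0, 5.2, 4.7, 5.4, 5.1)   # made-up data
n <- length(x)

se <- sd(x) / sqrt(n)   # standard error, with sd(x) in place of sigma_X

ci_normal <- mean(x) + c(-1, 1) * 1.96 * se                  # normal-based
ci_t      <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se # t-based

ci_normal
ci_t                    # wider than ci_normal, since qt(0.975, 9) > 1.96
t.test(x)$conf.int      # matches ci_t
```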

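Confidence level α% (α: confidence coefficient (confidence coefficient)): 95%, 90%, and 99% are the conventional choices (not 94%, 96%, etc.). They correspond to significance levels of 5%, 10%, and 1%, i.e., p < 0.05, p < 0.1, p < 0.01: the accepted probability (5%, 10%, 1%) of a Type I error (Type I/α error).
Type I error (Type I/α error): rejecting the null hypothesis H_0 when H_0 is in fact true.
Interval estimation (interval estimation): if 100 samples were drawn and a 95% CI computed from each, about 95 of the intervals would contain the true value.
Point estimation (point estimation): estimating the parameter by a single value.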

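SD vs. SE: $\mathrm{SD} = \sigma_X$ and $\mathrm{SE} = \sigma_X / \sqrt{n}$, so SD > SE whenever $n \geq 2$. As $n$ increases, $\mathrm{SE} = \sigma_X / \sqrt{n}$ shrinks, and with it the 95% CI $[\bar{X}_n - 1.96\,\mathrm{SE},\; \bar{X}_n + 1.96\,\mathrm{SE}]$ narrows.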
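A tiny numerical illustration of how SE and the CI width shrink with $n$. The value $\sigma_X = 10$ and the sample sizes are arbitrary assumptions.

```r
sigma_x <- 10                  # assumed population standard deviation
n <- c(4, 25, 100, 400)        # increasing sample sizes

se <- sigma_x / sqrt(n)        # 5, 2, 1, 0.5
ci_width <- 2 * 1.96 * se      # full width of the 95% CI

data.frame(n = n, sd = sigma_x, se = se, ci_width = ci_width)
```

Quadrupling $n$ halves the SE, while the SD stays at 10.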

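Interpreting an α% confidence interval:
Incorrect: "the true value lies in this particular interval with probability α%." Once computed from one sample, an interval either contains the true value or it does not (probability 0 or 1).
Correct: if sampling were repeated 100 times and a 95% CI constructed each time, about 95 of the 100 intervals would contain the true value and about 5 would not.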
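The repeated-sampling interpretation can be checked by simulation. This sketch assumes a normal population with known mean 500 and standard deviation 100 and samples of size 50; all of these values, and the seed, are illustrative choices.

```r
set.seed(123)   # arbitrary seed
mu    <- 500    # true mean (known only because we simulate)
sigma <- 100    # true standard deviation
n     <- 50     # sample size
reps  <- 100    # number of repeated samples

covered <- replicate(reps, {
  x  <- rnorm(n, mean = mu, sd = sigma)
  se <- sigma / sqrt(n)
  # Does this sample's 95% CI contain the true mean?
  (mean(x) - 1.96 * se <= mu) && (mu <= mean(x) + 1.96 * se)
})

sum(covered)    # number of the 100 intervals containing mu; typically near 95
```

Each element of `covered` is either TRUE or FALSE: a given interval contains the true mean or it does not.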

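Files (file) and paths (path): a path is the address of a file or folder on a PC, just as a URL is an address on the web.
URL example: http://cfes-project.eco.u-toyama.ac.jp/education/education_2017/r_2017/
A folder named sample on the desktop has the path /Users/Gaku/Desktop/sample.
A file named sample.csv on the desktop has the path /Users/Gaku/Desktop/sample.csv.
How to look up a path depends on the OS (Win 10, etc.): Google it!
File extensions: sample.csv has the extension .csv, sample.xls has .xls. How to display extensions depends on the OS (Win 10, etc.): Google it!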
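Paths and extensions like the ones above can be built and taken apart in R itself. The sketch below only manipulates the path string from the slide; it does not require the file to exist.

```r
# Build a path from its components; file.path() joins them with "/"
path <- file.path("/Users", "Gaku", "Desktop", "sample.csv")
path                    # "/Users/Gaku/Desktop/sample.csv"

dirname(path)           # "/Users/Gaku/Desktop": the containing folder
basename(path)          # "sample.csv": the file name
tools::file_ext(path)   # "csv": the extension, without the dot
```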

Files in R: to read a file, R needs to be told its path (and encoding). R uses / as the path separator. (1) (2)
The top of the path is the root: on Mac, Macintosh HD ( / ); on Windows, the C drive.
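A minimal sketch of reading a file by its path. To stay self-contained it first writes a small made-up CSV into R's temporary directory; the file name, column names, and the UTF-8 encoding are assumptions for illustration.

```r
# Write a small made-up CSV to a temporary location
csv_path <- file.path(tempdir(), "sample.csv")
write.csv(data.frame(id = 1:3, gdp = c(10, 20, 30)),
          csv_path, row.names = FALSE)

file.exists(csv_path)   # TRUE: the path points at an existing file

# Read it back by path, stating the encoding explicitly
dat <- read.csv(csv_path, fileEncoding = "UTF-8")
dat
```

Japanese-language files created on Windows are often encoded differently, in which case `fileEncoding` would need to be changed accordingly.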

Cautions in R: R is case-sensitive. Lowercase a and uppercase A are different objects in R.

Errors in R: when R returns an error, search for the error message on Google, e.g., Error: object 'x' not found. (1) (2)
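Case sensitivity and the "object not found" error can be seen together in a short sketch. The object names `my_value` and `My_Value` are hypothetical, chosen so that they are unlikely to exist already in a session.

```r
my_value <- 1            # define an object with a lowercase name

exists("my_value")       # TRUE
exists("My_Value")       # FALSE: different capitalization, different name

# Referring to the undefined name raises an error; capture its message
msg <- tryCatch(My_Value, error = function(e) conditionMessage(e))
msg                      # the "object ... not found" message, as text
```

Copying the text in `msg` into a search engine is usually the fastest way to find the cause.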

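Objects in R (object): R stores values in objects. To assign a value to an object x, use <- ( = also works):

    > x <- 1 + 1

Common data structures: vector, matrix, data.frame (tibble), list.
An object can be used in later computations (e.g., to create another object):

    > x2 <- x/2
    > x2
    [1] 1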
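The four data structures named above can be created as in the following sketch; all names and values are made up for illustration. A tibble would require the tibble package, so a plain `data.frame` is used here.

```r
v  <- c(1, 2, 3)                                       # vector
m  <- matrix(1:6, nrow = 2)                            # matrix: 2 rows, 3 columns
df <- data.frame(id = 1:3, score = c(70, 85, 90))      # data.frame
l  <- list(v = v, m = m, df = df)                      # list: can hold anything

length(v)    # 3
dim(m)       # 2 3
nrow(df)     # 3
names(l)     # "v" "m" "df"
```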

Data types in R: double (real numbers), integer, logical, character (strings), factor (categories).

    > x_num <- 1 + 1
    > x_num
    [1] 2
    > x_chr <- "2"
    > x_chr
    [1] "2"
    > class(x_num)
    [1] "numeric"
    > class(x_chr)
    [1] "character"
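A sketch that creates one object of each type listed above and inspects it. The object names and values are made up; note that `class()` reports doubles as "numeric", while `typeof()` shows the underlying storage type.

```r
x_dbl <- 1.5                        # double
x_int <- 2L                         # integer (the L suffix marks an integer)
x_lgl <- TRUE                       # logical
x_chr <- "2"                        # character
x_fct <- factor(c("M", "F", "M"))   # factor

sapply(list(x_dbl, x_int, x_lgl, x_chr, x_fct), class)
# "numeric" "integer" "logical" "character" "factor"

typeof(x_dbl)    # "double"
typeof(x_fct)    # "integer": factors are stored as integer codes
levels(x_fct)    # "F" "M"
```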

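Why data types matter in R: numeric functions do not work on character data. x_chr holds the string "2", not the number 2. In the same way, mean(chr_vec) returns NA with a warning because its elements are strings; converting them with as.numeric() first gives the expected result.

    > num_vec <- c(1, 2, 3, 4, 5, 6)
    > mean(num_vec)
    [1] 3.5
    > chr_vec <- c("1", "2", "3", "4", "5", "6")
    > mean(chr_vec)
    [1] NA
    Warning message:
    In mean.default(chr_vec) : argument is not numeric or logical: returning NA
    > mean(as.numeric(chr_vec))
    [1] 3.5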
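Two related conversion pitfalls are worth a short sketch (the values are made up, and this goes slightly beyond the slide): `as.numeric()` on a factor returns the internal integer codes rather than the labels, and non-numeric strings become NA.

```r
f <- factor(c("10", "20", "10"))

as.numeric(f)                 # 1 2 1: the internal codes, not the labels
as.numeric(as.character(f))   # 10 20 10: convert via character first

# Strings that are not numbers become NA (with a warning, suppressed here)
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)                    # TRUE
```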

R 1 (URL: http://cfes-project.eco.u-toyama.ac.jp/education/education_2017/r_2017/rcode_fall2017/) 2 R 2. R R R

( ) (1 ) R R 1 3, 6 ( ) Gelman & Hill. Data analysis. Chap. 1 2 ( ) Stata 5 ( ) 1 2 ( ) R R