Least Squares Method


2 2012 4 ( ) 2 2012 4 1 / 42

Relationship between X and Y: $Y = f(X; Z)$
linear regression model: relates X to Y
slope: the change in Y when X changes by 1 unit
Data: observations $(X, Y)$

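Effect of a 1-unit change in the student-teacher ratio (STR) on the test score:
$$\beta = \frac{\Delta TestScore}{\Delta STR}$$
$$\Delta TestScore = \beta \times \Delta STR \tag{4.2}$$
$$TestScore = \beta_0 + \beta \times STR \tag{4.3}$$

$$TestScore = \beta_0 + \beta \times STR + \text{other factors} \tag{4.4}$$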

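Linear regression model with one regressor:
$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, 2, \dots, n \tag{4.5}$$
$Y_i$: dependent variable
$X_i$: explanatory variable, regressor, independent variable
$\beta_0 + \beta_1 X$: population regression line (function), the average relationship between X and Y
$\beta_0$: intercept, the value of the line at $X = 0$
$\beta_1$: slope
$\beta_0$, $\beta_1$: parameters (coefficients)
$u_i$: error term, the difference between $Y_i$ and $\beta_0 + \beta_1 X_i$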

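$$Y_i = \beta_0 + \beta_1 X_i + u_i, \quad i = 1, 2, \dots, n \tag{4.5}$$
$\beta_0$, $\beta_1$, $X_i$ and $u_i$ determine $Y_i$ (Figure 4.1).
Only $(X_i, Y_i)$ are observed; $\beta_0$, $\beta_1$ and $u_i$ are not.
$(X_i, Y_i, u_i)$ are draws from the distribution of $(X, Y, u)$.
$\beta_0$ and $\beta_1$ must be estimated.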

Figure 4.1 Scatter plot of test score vs. student-teacher ratio (hypothetical data)

Estimating $\beta_0$ and $\beta_1$: see Table 4.1 and Figure 4.2. $u_i$; $-0.23$

Table 4.1 Summary statistics of California school district data (Eviews output)

Statistic       TESTSCR     STR
Mean            654.1565    19.64043
Median          654.45      19.72321
Maximum         706.75      25.8
Minimum         605.55      14
Std. Dev.       19.05335    1.891812
Skewness        0.091615    -0.02537
Kurtosis        2.745712    3.609597
Jarque-Bera     1.719129    6.548185
Probability     0.423346    0.037851
Sum             274745.8    8248.979
Sum Sq. Dev.    152109.6    1499.581
Observations    420         420

Figure 4.2 Scatter plot of Test score vs. student-teacher ratio (California school district data)

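1. The relationship between X and Y
2. The sample mean $\bar{Y}$ is the least squares estimator of $E[Y]$:
$$\bar{Y} = \arg\min_m \sum_{i=1}^{n} (Y_i - m)^2$$
The same idea is applied to $(\beta_0, \beta_1)$.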
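As a small illustration of the claim that the sample mean minimizes the sum of squared deviations, the sketch below scans a grid of candidate values `m`. The data and the grid are made up for illustration and are not from the slides.

```python
def sum_sq(ys, m):
    """Sum of squared deviations of ys from a candidate value m."""
    return sum((y - m) ** 2 for y in ys)

ys = [2.0, 4.0, 4.0, 5.0, 10.0]      # hypothetical data
y_bar = sum(ys) / len(ys)            # sample mean

# Scan candidates 0.00, 0.01, ..., 12.00 and keep the minimizer.
grid = [i / 100 for i in range(0, 1201)]
m_best = min(grid, key=lambda m: sum_sq(ys, m))

print(y_bar, m_best)
```

The grid search is only a check; the exact minimizer follows from setting the derivative with respect to `m` to zero.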

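Let $(b_0, b_1)$ be candidate estimators of $(\beta_0, \beta_1)$, with regression line $b_0 + b_1 X$.
At $X = X_i$ the predicted value of Y is $b_0 + b_1 X_i$, so the prediction error is $Y_i - (b_0 + b_1 X_i)$.
Sum of squared prediction errors:
$$\sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \tag{4.6}$$
Choose $(b_0, b_1)$ to minimize (4.6).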

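First-order conditions with respect to $b_0$ and $b_1$:
$$\frac{\partial}{\partial b_0} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 = 0, \qquad \frac{\partial}{\partial b_1} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 = 0$$
Solving for $(b_0, b_1)$:
$$\hat\beta_1 = b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{s_{XY}}{s_X^2}, \qquad \hat\beta_0 = b_0 = \bar{Y} - b_1 \bar{X}$$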
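The closed-form solution can be sketched in a few lines of plain Python. The function name `ols_fit` and the data are illustrative assumptions; the last two lines check numerically that the two first-order conditions hold at the solution.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) minimizing sum((y - b0 - b1*x)**2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx              # slope: sample covariance / sample variance
    b0 = y_bar - b1 * x_bar       # intercept: line passes through the means
    return b0, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(xs, ys)

resid = [y - b0 - b1 * x for x, y in zip(xs, ys)]
foc_b0 = sum(resid)                               # should be ~0
foc_b1 = sum(x * u for x, u in zip(xs, resid))    # should be ~0
```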

OLS (Ordinary Least Squares) estimators: $(\hat\beta_0, \hat\beta_1)$
OLS regression line: $\hat{Y} = \hat\beta_0 + \hat\beta_1 X$
fitted value: the value of Y predicted at $X = X_i$,
$$\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i \tag{4.9}$$

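residual: the difference between $Y_i$ and $\hat{Y}_i$,
$$\hat{u}_i = Y_i - \hat{Y}_i \tag{4.10}$$
The residual $\hat{u}_i$ is computed from the OLS estimates and is distinct from the error $u_i$.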

Figure 4.3 The estimated regression line for the California data

$(\beta_0, \beta_1)$: OLS

OLS, MS-Excel

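The three OLS assumptions:
1. $E[u_i \mid X_i] = 0$
2. $(X_i, Y_i)$ are i.i.d.
3. $X_i$ and $u_i$ have finite fourth moments
Under these assumptions the properties of the OLS estimators can be derived.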

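Assumption 1: $E[u_i \mid X_i] = 0$
Given $X_i$, the mean of $u_i$ is zero, so $u_i$ carries no systematic information about $Y_i$ beyond $X_i$ (Figure 4.4).
Randomized experiment: $E[u \mid X = 0] = 0$, $E[u \mid X = 1] = 0$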

Figure 4.4 The conditional probability distributions and the population regression line

Assumption 1: $E[u_i \mid X_i] = 0$
$E[u_i \mid X_i] = 0 \implies corr(X_i, u_i) = 0$ (orthogonality condition)
Related terms: exogenous, endogenous

Assumption 2: $(X_i, Y_i)$ are i.i.d.
Cases to watch: the values of $X_i$ are chosen by the researcher; oversampling

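Assumption 3: $0 < E[X_i^4] < \infty$, $0 < E[u_i^4] < \infty$
$X_i$ and $u_i$ have finite fourth moments, so large outliers are unlikely. OLS is sensitive to large outliers (Figure 4.5).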

Figure 4.5 The sensitivity of OLS to large outliers

OLS

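Large-n distribution of the OLS estimators
Sample mean: $E[\bar{Y}] = \mu_Y$, $\bar{Y} \xrightarrow{d} N(\mu_Y, \sigma^2_{\bar{Y}})$
OLS estimators:
$$E[\hat\beta_0] = \beta_0, \qquad E[\hat\beta_1] = \beta_1 \tag{4.20}$$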

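Unbiasedness of the OLS estimator
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad \hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X}$$
From $Y_i = \beta_0 + \beta_1 X_i + u_i$ we get $Y_i - \bar{Y} = \beta_1 (X_i - \bar{X}) + u_i - \bar{u}$. Substituting into $\hat\beta_1$ and using $\sum_{i=1}^{n} (X_i - \bar{X}) \bar{u} = 0$:
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})\left(\beta_1 (X_i - \bar{X}) + u_i - \bar{u}\right)}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \beta_1 + \frac{\sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
Taking expectations with the law of iterated expectations:
$$E[\hat\beta_1] = \beta_1 + E\left[ E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \,\middle|\, X_1, \dots, X_n \right] \right] = \beta_1 + E\left[ \frac{\sum_{i=1}^{n} (X_i - \bar{X}) E[u_i \mid X_i]}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] = \beta_1$$
Q.E.D.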
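The unbiasedness result can be checked informally with a small Monte Carlo experiment. The sketch below is an illustration under assumed settings (true coefficients 1 and 2, uniform X, standard normal errors independent of X, 2000 replications); none of these numbers come from the slides.

```python
import random

def ols_slope(xs, ys):
    """OLS slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)**2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

rng = random.Random(0)            # fixed seed for reproducibility
beta0, beta1 = 1.0, 2.0           # assumed true parameters
n, reps = 50, 2000

estimates = []
for _ in range(reps):
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    # Errors are drawn independently of X, so E[u | X] = 0 holds.
    ys = [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in xs]
    estimates.append(ols_slope(xs, ys))

mean_estimate = sum(estimates) / reps
print(mean_estimate)              # should be close to beta1
```

Individual estimates scatter around the true slope; it is their average across replications that should sit near `beta1`.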

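Asymptotic normality of the OLS estimator (large n)
$$\hat\beta_1 = \beta_1 + \frac{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}) u_i}{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
For large n, $\bar{X} \approx \mu_X$. Define $v_i = (X_i - \mu_X) u_i$.
By $E[u_i \mid X_i] = 0$, $E[v_i] = 0$; the $v_i$ are i.i.d. with $var(v_i) = var[(X_i - \mu_X) u_i] < \infty$.
By the central limit theorem, $\bar{v} \xrightarrow{d} N(0, \sigma_v^2 / n)$.
The denominator converges to $var(X)$, so $\hat\beta_1 - \beta_1 \approx \bar{v} / var(X)$ and
$$\hat\beta_1 \xrightarrow{d} N\left( \beta_1, \frac{var((X - \mu_X) u)}{n \, (var(X))^2} \right)$$
Q.E.D.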

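For $n > 100$ the normal approximation to the distribution of the OLS estimator is generally reliable.
The larger the variance of $X_i$, the smaller the variance of the OLS estimator $\hat\beta_1$ (Fig 4.6).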
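The link between the spread of X and the precision of the slope estimate can be illustrated by simulation. In this sketch the data-generating process (uniform X on a symmetric interval, standard normal errors, true slope 2) is an assumption chosen for illustration only.

```python
import random

def ols_slope(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_xy / s_xx

def slope_variance(x_spread, reps=1000, n=50, seed=0):
    """Simulated variance of the OLS slope when X ~ Uniform(-x_spread, x_spread)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        xs = [rng.uniform(-x_spread, x_spread) for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        slopes.append(ols_slope(xs, ys))
    mean = sum(slopes) / reps
    return sum((s - mean) ** 2 for s in slopes) / (reps - 1)

var_narrow = slope_variance(1.0)   # X has small variance
var_wide = slope_variance(5.0)     # X has 25 times the variance
print(var_narrow, var_wide)
```

With the error variance held fixed, the variance formula predicts roughly a 25-fold reduction when the variance of X is 25 times larger.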

Figure 4.6 The variance of beta and the variance of X

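Testing $\beta = 0$. Review: testing a hypothesis about the population mean
1. State the hypotheses: $H_0: E[Y] = \mu_{Y,0}$ vs. $H_1: E[Y] \neq \mu_{Y,0}$
2. Compute $\bar{Y}$ and $SE(\bar{Y})$
3. Compute the t-statistic: $t = (\bar{Y} - \mu_{Y,0}) / SE(\bar{Y})$
4. Compute the p-value and reject $H_0$ if it is small. For large n, under $H_0$, $t \xrightarrow{d} N(0, 1)$, so $p = 2\Phi(-|t^{act}|)$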

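Testing the OLS coefficient (using $\hat\beta_1 \xrightarrow{d} N$)
[Step 1] State the hypotheses:
$$H_0: \beta_1 = \beta_{1,0} \quad \text{vs.} \quad H_1: \beta_1 \neq \beta_{1,0} \tag{5.2}$$
[Step 2] Compute the OLS estimate $\hat\beta_1$ and its standard error $SE(\hat\beta_1)$:
$$SE(\hat\beta_1) = \sqrt{\hat\sigma^2_{\hat\beta_1}}, \qquad \hat\sigma^2_{\hat\beta_1} = \frac{1}{n} \times \frac{\frac{1}{n-2} \sum_{i=1}^{n} (X_i - \bar{X})^2 \hat{u}_i^2}{\left[ \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2 \right]^2} \tag{5.3}$$
which estimates
$$\sigma^2_{\hat\beta_1} = \frac{1}{n} \frac{var[(X_i - \mu_X) u_i]}{(var[X_i])^2} \tag{5.4}$$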

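Testing the OLS coefficient
[Step 3] Compute the t-statistic:
$$t = \frac{\hat\beta_1 - \beta_{1,0}}{SE(\hat\beta_1)} \tag{5.5}$$
[Step 4] Compute the p-value under $H_0$:
$$p = \Pr_{H_0}\left[ |\hat\beta_1 - \beta_{1,0}| > |\hat\beta_1^{act} - \beta_{1,0}| \right] = \Pr_{H_0}(|t| > |t^{act}|) \tag{5.6}$$
Because $\hat\beta_1 \xrightarrow{d} N$,
$$p = \Pr_{H_0}(|Z| > |t^{act}|) = 2\Phi(-|t^{act}|) \tag{5.7}$$
Reject $H_0$ if p is small.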
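Steps 2 to 4 can be sketched in plain Python as follows. The function names and the data are illustrative assumptions; the standard error follows the variance estimator labelled (5.3), and the normal CDF is built from `math.erf` in the standard library.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ols_with_robust_se(xs, ys):
    """Return (b0, b1, se_b1) with se_b1 from the estimator in (5.3)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    numer = sum((x - x_bar) ** 2 * u ** 2 for x, u in zip(xs, resid)) / (n - 2)
    denom = (s_xx / n) ** 2
    var_b1 = (1.0 / n) * numer / denom
    return b0, b1, math.sqrt(var_b1)

def two_sided_p(b1, se, b1_null=0.0):
    """t-statistic (5.5) and large-sample two-sided p-value (5.7)."""
    t = (b1 - b1_null) / se
    return t, 2.0 * normal_cdf(-abs(t))

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # hypothetical data
ys = [1.2, 2.9, 3.1, 4.8, 5.2, 6.9, 7.1, 8.8]
b0, b1, se = ols_with_robust_se(xs, ys)
t_act, p_value = two_sided_p(b1, se, b1_null=0.0)
print(b1, se, t_act, p_value)
```

The p-value relies on the large-n normal approximation, so with only eight observations this is a demonstration of the arithmetic rather than a valid test.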

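Testing the OLS coefficient: one-sided alternative
The t-statistic is the same; only the p-value changes.
$$H_0: \beta_1 = \beta_{1,0} \quad \text{vs.} \quad H_1: \beta_1 < \beta_{1,0} \tag{5.9}$$
$$p = \Pr_{H_0}(Z < t^{act}) = \Phi(t^{act}) \tag{5.10}$$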

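Confidence intervals for $(\beta_0, \beta_1)$
The 95% confidence interval for $\beta_1$ is the set of values $\beta_{1,0}$ for which a two-sided test of $H_0: \beta_1 = \beta_{1,0}$ is not rejected at the 5% level. It contains the true $\beta_1$ in 95% of samples.
$$\hat\beta_1 \pm 1.96 \, SE(\hat\beta_1), \quad \text{i.e.} \quad \left( \hat\beta_1 - 1.96 \, SE(\hat\beta_1), \; \hat\beta_1 + 1.96 \, SE(\hat\beta_1) \right)$$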
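The equivalence between the 95% interval and the 5% two-sided test can be sketched as below. The estimate and standard error are made-up numbers, and the helper names are assumptions for illustration.

```python
def conf_int_95(b1, se):
    """Large-sample 95% confidence interval for the slope."""
    return (b1 - 1.96 * se, b1 + 1.96 * se)

def rejects_at_5pct(b1, se, b1_null):
    """Two-sided test of H0: beta1 = b1_null at the 5% level."""
    return abs((b1 - b1_null) / se) > 1.96

b1_hat, se_b1 = 2.0, 0.5          # hypothetical estimate and standard error
lo, hi = conf_int_95(b1_hat, se_b1)

# A null value is rejected exactly when it lies outside the interval.
for null in (0.0, 1.5, 2.9, 3.5):
    inside = lo <= null <= hi
    print(null, inside, rejects_at_5pct(b1_hat, se_b1, null))
```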

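When X changes by $\Delta x$, the predicted change in Y is $\Delta y = \beta_1 \Delta x$.
The 95% confidence interval for $\beta_1 \Delta x$ is obtained by scaling the interval for $\hat\beta_1$:
$$\left( (\hat\beta_1 - 1.96 \, SE(\hat\beta_1)) \Delta x, \; (\hat\beta_1 + 1.96 \, SE(\hat\beta_1)) \Delta x \right) \tag{5.13}$$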

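Regression when the regressor is binary
A regressor taking two values, $D_i = 0, 1$, is called an indicator variable or dummy variable. Here $\beta_1$ is not a slope; OLS is applied as usual.
$D_i = 0$: $Y_i = \beta_0 + u_i$
$D_i = 1$: $Y_i = \beta_0 + \beta_1 + u_i$
$$E[Y_i \mid D_i = 0] = \beta_0, \qquad E[Y_i \mid D_i = 1] = \beta_0 + \beta_1 \tag{5.16, 5.17}$$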
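With a dummy regressor, the OLS intercept equals the sample mean of the $D_i = 0$ group and the OLS coefficient on $D_i$ equals the difference between the two group means. A minimal sketch with made-up data (the function name `ols_fit` is an assumption):

```python
def ols_fit(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

d = [0, 0, 0, 1, 1, 1, 1]                         # hypothetical dummy
y = [10.0, 12.0, 11.0, 15.0, 14.0, 16.0, 17.0]    # hypothetical outcomes

b0, b1 = ols_fit(d, y)

group0 = [yi for di, yi in zip(d, y) if di == 0]
group1 = [yi for di, yi in zip(d, y) if di == 1]
mean0 = sum(group0) / len(group0)
mean1 = sum(group1) / len(group1)

print(b0, mean0)            # intercept vs. mean of the D = 0 group
print(b1, mean1 - mean0)    # coefficient vs. difference in group means
```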

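$\beta_1$ is the difference between the population means of the two groups.
Testing whether the two group means are equal amounts to testing $H_0: \beta_1 = 0$.
The OLS estimate $\hat\beta_1$ is the difference between the two sample means.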

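Measures of fit of the OLS regression
$R^2$: the fraction of the variance of $Y_i$ explained by $X_i$; it lies between 0 and 1, and values near 1 mean a good fit to $Y_i$.
Standard Error of the Regression (SER): the typical size of the residual, in the units of $Y_i$.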

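$R^2$: the fraction of the variance of $Y_i$ explained by $X_i$. Write $Y_i = \hat{Y}_i + \hat{u}_i$.
$$R^2 = \frac{\text{sample variance of } \hat{Y}_i}{\text{sample variance of } Y_i} = \frac{ESS}{TSS} = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \tag{4.16}$$
ESS: explained sum of squares; TSS: total sum of squares; SSR: sum of squared residuals
$$R^2 = \frac{ESS}{TSS} = \frac{TSS - SSR}{TSS} = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum_{i=1}^{n} \hat{u}_i^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \tag{4.18}$$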

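$R^2 \in [0, 1]$
If $\hat\beta_1 = 0$, $X_i$ explains none of the variation in $Y_i$: $\hat{Y}_i = \bar{Y}$ for all i, so $ESS = 0$ and $R^2 = 0$.
If $\hat{Y}_i = Y_i$ for all i, then $\hat{u}_i = 0$ for all i, so $ESS = TSS$ and $R^2 = 1$.
An $R^2$ near 1 means the regression predicts $Y_i$ well.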
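The decomposition behind $R^2$ can be checked numerically. The sketch below uses made-up data and computes ESS, SSR and TSS directly, then evaluates $R^2$ both as ESS/TSS and as 1 - SSR/TSS.

```python
def ols_fit(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(xs, ys)

y_bar = sum(ys) / len(ys)
fitted = [b0 + b1 * x for x in xs]                 # fitted values
resid = [y - f for y, f in zip(ys, fitted)]        # residuals

ess = sum((f - y_bar) ** 2 for f in fitted)        # explained sum of squares
ssr = sum(u ** 2 for u in resid)                   # sum of squared residuals
tss = sum((y - y_bar) ** 2 for y in ys)            # total sum of squares

r2_from_ess = ess / tss
r2_from_ssr = 1.0 - ssr / tss
print(r2_from_ess, r2_from_ssr)
```

The two expressions agree because TSS = ESS + SSR holds for OLS with an intercept.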

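SER: an estimator of the standard deviation of $u_i$. The errors $\{u_1, u_2, \dots, u_n\}$ are unobserved, so the residuals $\{\hat{u}_1, \hat{u}_2, \dots, \hat{u}_n\}$ are used instead.
$$SER = s_{\hat{u}}, \qquad s_{\hat{u}}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^2 = \frac{SSR}{n-2} \tag{4.19}$$
The divisor is $n - 2$ because two coefficients are estimated; for large n the difference from dividing by n is negligible.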
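A minimal sketch of the SER computation, on made-up data, following the $n - 2$ divisor in (4.19):

```python
import math

def ols_fit(xs, ys):
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = ols_fit(xs, ys)

n = len(xs)
ssr = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
ser = math.sqrt(ssr / (n - 2))    # standard error of the regression
print(ser)
```

The SER is in the same units as Y, which makes it easier to interpret than SSR.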

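Beyond $E[u_i \mid X_i] = 0$: the conditional variance of the error given $X_i$.
If $E[u_i^2 \mid X_i]$ is the same for all i and does not depend on $X_i$: homoskedasticity.
If $E[u_i^2 \mid X_i]$ depends on $X_i$: heteroskedasticity.
$(X_i, Y_i)$ are still i.i.d. See Fig 4.4 and Fig 5.2.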

Figure 5.2 An example of homoskedasticity

Figure 5.2 An example of heteroskedasticity

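$$Earnings_i = \beta_0 + \beta_1 Male_i + u_i \tag{5.19}$$
$Male_i$ is a binary regressor ($D_i = 0, 1$) and $\beta_1$ is the difference in mean earnings between the two groups.
Homoskedasticity here asks whether $var(u_i \mid Male_i)$ depends on $Male_i$, that is, whether the variance of $u_i$ is the same in both groups.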

Figure 5.3 Scatter plot of hourly earnings and years of education Heteroskedastic or homoskedastic?

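Gauss-Markov theorem
If the three OLS assumptions hold and the errors are homoskedastic, the OLS estimator is efficient among estimators that are linear in $\{Y_1, Y_2, \dots, Y_n\}$ and unbiased.
OLS is BLUE: Best Linear Unbiased Estimator.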

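Variance of the OLS estimator under homoskedasticity
Homoskedasticity-only $var(\hat\beta_1)$:
$$var(\hat\beta_1) = \frac{var[(X - \mu_X) u]}{n \, (var(X))^2} = \frac{var(u_i)}{n \, var(X_i)} \tag{5.22}$$
The homoskedasticity-only $var(\hat\beta_1)$ is valid only under homoskedasticity; if the errors are heteroskedastic, the t-statistic based on it is unreliable.
Heteroskedasticity-robust standard errors (Eicker-Huber-White) are valid in both cases, hence "robust".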
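The two standard errors can be compared side by side. In this sketch the homoskedasticity-only version uses the sample analogue of (5.22), $s_{\hat{u}}^2 / \sum_i (X_i - \bar{X})^2$, and the robust version uses the estimator in (5.3); the function name and the data are illustrative assumptions.

```python
import math

def ols_standard_errors(xs, ys):
    """Return (b1, se_homoskedasticity_only, se_robust)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    b0 = y_bar - b1 * x_bar
    resid = [y - b0 - b1 * x for x, y in zip(xs, ys)]

    # Homoskedasticity-only: s_u^2 / sum((x - x_bar)^2)
    s2_u = sum(u ** 2 for u in resid) / (n - 2)
    se_homo = math.sqrt(s2_u / s_xx)

    # Heteroskedasticity-robust, as in (5.3)
    numer = sum((x - x_bar) ** 2 * u ** 2 for x, u in zip(xs, resid)) / (n - 2)
    se_robust = math.sqrt(numer / (s_xx / n) ** 2 / n)
    return b1, se_homo, se_robust

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_homo, se_robust = ols_standard_errors(xs, ys)
print(b1, se_homo, se_robust)
```

The two values generally differ in any given sample; only the robust one remains valid when the error variance depends on X.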

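WLS: Weighted Least Squares
Under heteroskedasticity, OLS is no longer BLUE, and WLS is an alternative to OLS.
WLS weights each observation using $var(u_i \mid X_i)$, so it requires the form of $var(u_i \mid X_i)$ to be known. OLS with robust standard errors does not need $var(u_i \mid X_i)$.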
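A minimal WLS sketch, assuming the conditional variance is known up to a constant. The variance function used here, $var(u_i \mid X_i) \propto X_i^2$, is a hypothetical choice for illustration, as are the function name `wls_fit` and the data. Each observation is weighted by the inverse of its assumed variance.

```python
def wls_fit(xs, ys, ws):
    """Minimize sum(w * (y - b0 - b1*x)**2); returns (b0, b1)."""
    sw = sum(ws)
    x_w = sum(w * x for w, x in zip(ws, xs)) / sw     # weighted mean of x
    y_w = sum(w * y for w, y in zip(ws, ys)) / sw     # weighted mean of y
    s_xy = sum(w * (x - x_w) * (y - y_w) for w, x, y in zip(ws, xs, ys))
    s_xx = sum(w * (x - x_w) ** 2 for w, x in zip(ws, xs))
    b1 = s_xy / s_xx
    return y_w - b1 * x_w, b1

xs = [1.0, 2.0, 3.0, 4.0, 5.0]    # hypothetical data
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Assumed variance function: var(u | X) proportional to X**2 (hypothetical).
ws = [1.0 / x ** 2 for x in xs]

b0_wls, b1_wls = wls_fit(xs, ys, ws)
b0_ols, b1_ols = wls_fit(xs, ys, [1.0] * len(xs))     # equal weights give OLS
print(b0_ols, b1_ols)
print(b0_wls, b1_wls)
```

With equal weights the function reduces to OLS, which gives a quick sanity check on the implementation.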