Lecture 2: Linear Regression with One Regressor (April 2012)
We want to know how a change in X affects Y. In general Y = f(X; Z), where Z stands for other factors that also influence Y. The linear regression model restricts attention to a linear relation between X and Y, whose slope measures the change in Y caused by a one-unit change in X. We estimate that slope from a sample of pairs (X, Y).
The slope of interest is the ratio of changes,
  β1 = ΔY / ΔX.   (4.2)
If the relation were exactly linear,
  Y = β0 + β1 X.   (4.3)
Allowing for the other factors that also move Y,
  Y = β0 + β1 X + (other factors).   (4.4)
The linear regression model with a single regressor:
  Y_i = β0 + β1 X_i + u_i,  i = 1, 2, ..., n.   (4.5)
Terminology:
- Y_i is the dependent variable; X_i is the explanatory variable (also called the regressor or independent variable).
- β0 + β1 X is the population regression line (population regression function), giving the relation between X and Y in the population.
- β0 is the intercept, the value of the line at X = 0; β1 is the slope. Together they are the parameters (coefficients) of the line.
- u_i is the error term: the gap between the line β0 + β1 X_i and Y_i at X_i.
  Y_i = β0 + β1 X_i + u_i,  i = 1, 2, ..., n.   (4.5)
Given β0, β1, X_i, and u_i, the model determines Y_i. Figure 4.1 plots hypothetical pairs (X_i, Y_i): the points scatter around a single population line with intercept β0 and slope β1, and the vertical distance from each point to the line is u_i. The (X_i, Y_i, u_i) are draws of the random variables (X, Y, u). In practice β0 and β1 are unknown, so we must estimate them from data.
Figure 4.1 Scatter plot of test score vs. student-teacher ratio (hypothetical data)
To estimate β0 and β1 we use the California school district data, summarized in Table 4.1 and plotted in Figure 4.2. The sample correlation between the two variables is -0.23.
Table 4.1 Summary statistics of California school district data (Eviews output)

               TESTSCR       STR
Mean           654.1565    19.64043
Median         654.45      19.72321
Maximum        706.75      25.8
Minimum        605.55      14
Std. Dev.       19.05335    1.891812
Skewness         0.091615  -0.02537
Kurtosis         2.745712   3.609597
Jarque-Bera      1.719129   6.548185
Probability      0.423346   0.037851
Sum           274745.8    8248.979
Sum Sq. Dev.  152109.6    1499.581
Observations     420        420
Figure 4.2 Scatter plot of Test score vs. student-teacher ratio (California school district data)
How should we fit a line to the (X, Y) data? Recall the analogous problem of estimating E[Y]: the sample mean Ȳ is the least squares estimator of E[Y], since
  Ȳ = argmin_m Σ_{i=1}^n (Y_i - m)².
We estimate (β0, β1) by the same least squares idea.
Let (b0, b1) be candidate values for (β0, β1). The line b0 + b1 X predicts Y at X = X_i as b0 + b1 X_i, so the prediction mistake for observation i is Y_i - (b0 + b1 X_i). The least squares estimators are the values of (b0, b1) that minimize the sum of squared mistakes
  Σ_{i=1}^n (Y_i - b0 - b1 X_i)².   (4.6)
Setting the derivatives with respect to b0 and b1 to zero,
  ∂/∂b0 Σ_{i=1}^n (Y_i - b0 - b1 X_i)² = 0,
  ∂/∂b1 Σ_{i=1}^n (Y_i - b0 - b1 X_i)² = 0,
and solving for (b0, b1) gives
  β̂1 = b1 = Σ_{i=1}^n (X_i - X̄)(Y_i - Ȳ) / Σ_{i=1}^n (X_i - X̄)² = s_XY / s_X²,
  β̂0 = b0 = Ȳ - b1 X̄.
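The formulas above can be computed directly. A minimal sketch (the data points are made up for illustration): the slope is the sample covariance of X and Y over the sample variance of X, and the intercept makes the line pass through the point of means.

```python
def ols(x, y):
    """OLS estimates (beta0_hat, beta1_hat) for y on x with an intercept."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    s_xx = sum((xi - xbar) ** 2 for xi in x)
    b1 = s_xy / s_xx          # slope: beta1_hat = s_XY / s_X^2
    b0 = ybar - b1 * xbar     # intercept: beta0_hat = Ybar - beta1_hat * Xbar
    return b0, b1

# On points that lie exactly on y = 2 + 3x, OLS recovers the line.
b0, b1 = ols([1, 2, 3, 4], [5, 8, 11, 14])
```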
(β̂0, β̂1) are the OLS (Ordinary Least Squares) estimators, and Y = β̂0 + β̂1 X is the OLS regression line. The fitted value (predicted value) of Y at X = X_i is
  Ŷ_i = β̂0 + β̂1 X_i.   (4.9)
The residual is the gap between Y_i and its fitted value:
  û_i = Y_i - Ŷ_i.   (4.10)
The residual û_i, which is computed from the OLS estimates, is not the same as the error u_i, which is defined from the unknown population line.
Figure 4.3 The estimated regression line for the California data
Why use OLS to estimate (β0, β1)? OLS is the standard method in practice: it is easy to compute, it has desirable statistical properties under the assumptions introduced below, and OLS results are readily understood and compared by other researchers.
OLS is built into virtually every statistical package, and even MS-Excel can compute it.
The least squares assumptions. The OLS estimators have good properties under three assumptions:
1. The conditional mean of u_i given X_i is zero: E[u_i | X_i] = 0.
2. (X_i, Y_i), i = 1, ..., n, are i.i.d.
3. Large outliers are unlikely (finite fourth moments).
Under these assumptions the OLS estimators are unbiased, consistent, and approximately normally distributed in large samples.
Assumption 1: E[u_i | X_i] = 0. Whatever the value of X_i, the other factors collected in u_i average out to zero, so they do not systematically push Y_i up or down at any value of X_i (Figure 4.4). In a randomized experiment where X is assigned at random (say X = 0 or 1), this holds by construction: E[u | X = 0] = 0 and E[u | X = 1] = 0.
Figure 4.4 The conditional probability distributions and the population regression line
Assumption 1, continued. E[u_i | X_i] = 0 implies corr(X_i, u_i) = 0, the orthogonality condition. A regressor satisfying this condition is called exogenous; a regressor correlated with the error is called endogenous.
Assumption 2: (X_i, Y_i), i = 1, ..., n, are i.i.d. This holds when the observations are drawn by simple random sampling from a single population. It fails, for example, under oversampling schemes that select observations based on the value of X_i.
Assumption 3: large outliers are unlikely, formalized as finite nonzero fourth moments: 0 < E[X_i⁴] < ∞ and 0 < E[u_i⁴] < ∞. OLS can be very sensitive to even a single large outlier (Figure 4.5).
Figure 4.5 The sensitivity of OLS to large outliers
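The sensitivity in Figure 4.5 is easy to reproduce numerically. A sketch with made-up numbers: corrupting one observation into a large outlier flips the sign of the OLS slope, which is why assumption 3 rules out heavy-tailed X and u.

```python
def ols_slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
            / sum((a - xbar) ** 2 for a in x))

x = list(range(10))
y = [2.0 * xi for xi in x]            # clean data: slope exactly 2
slope_clean = ols_slope(x, y)
y_out = y[:-1] + [-100.0]             # replace one point with a large outlier
slope_outlier = ols_slope(x, y_out)   # the slope turns negative
```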
Sampling distribution of the OLS estimators. The OLS estimators are computed from a random sample, so they are themselves random variables: a different sample would produce different estimates.

To judge how informative the estimates are, we study how β̂0 and β̂1 vary across repeated samples, that is, their sampling distribution. As with the sample mean, two questions matter: what are the mean and variance of that distribution (unbiasedness and precision), and what is its shape in large samples (asymptotic normality)?
Unbiasedness. Recall that for the sample mean, E[Ȳ] = μ_Y, and Ȳ is approximately N(μ_Y, σ²_Ȳ) in large samples. The OLS estimators are likewise unbiased under the least squares assumptions:
  E[β̂0] = β0,  E[β̂1] = β1.   (4.20)
Proof that β̂1 is unbiased. Start from
  β̂1 = Σ_{i=1}^n (X_i - X̄)(Y_i - Ȳ) / Σ_{i=1}^n (X_i - X̄)²,  β̂0 = Ȳ - β̂1 X̄.
From Y_i = β0 + β1 X_i + u_i, subtracting the sample means gives Y_i - Ȳ = β1 (X_i - X̄) + (u_i - ū). Substituting,
  β̂1 = Σ_{i=1}^n (X_i - X̄)[β1 (X_i - X̄) + (u_i - ū)] / Σ_{i=1}^n (X_i - X̄)²
     = β1 + Σ_{i=1}^n (X_i - X̄) u_i / Σ_{i=1}^n (X_i - X̄)²
(the ū term drops out because Σ(X_i - X̄) = 0). Taking expectations and conditioning on X_1, ..., X_n,
  E[β̂1] = β1 + E[ E[ Σ_{i=1}^n (X_i - X̄) u_i / Σ_{i=1}^n (X_i - X̄)² | X_1, ..., X_n ] ]
        = β1 + E[ Σ_{i=1}^n (X_i - X̄) E[u_i | X_i] / Σ_{i=1}^n (X_i - X̄)² ] = β1.  Q.E.D.
Large-sample distribution of the OLS estimators. For large n, β̂1 is approximately normal. Write
  β̂1 = β1 + Σ_{i=1}^n (X_i - X̄) u_i / Σ_{i=1}^n (X_i - X̄)².
Replacing X̄ by μ_X (justified in large samples), let v_i = (X_i - μ_X) u_i. Assumption 1 gives E[v_i] = 0; assumptions 2 and 3 make the v_i i.i.d. with var(v_i) = var[(X_i - μ_X) u_i] < ∞. By the central limit theorem, v̄ converges in distribution to N(0, σ_v²/n), while the denominator divided by n converges to var(X). Hence β̂1 - β1 ≈ v̄ / var(X), so
  β̂1 approx. ~ N( β1, var[(X - μ_X) u] / ( n [var(X)]² ) ).  Q.E.D.
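Both results, unbiasedness and the small sampling variance, can be seen in a small Monte Carlo. A sketch with illustrative parameters (true line y = 2 + 3x, standard normal X and u): the average of β̂1 across repeated samples is close to β1 = 3, and the squared deviations are close to the theoretical variance var(u)/(n var(X)) = 1/100.

```python
import random

random.seed(0)
beta0, beta1, n, reps = 2.0, 3.0, 100, 1000
estimates = []
for _ in range(reps):
    x = [random.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + random.gauss(0.0, 1.0) for xi in x]
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    estimates.append(b1)

mean_b1 = sum(estimates) / reps                      # should be near beta1
mse_b1 = sum((e - beta1) ** 2 for e in estimates) / reps   # near 1/(n*var(X)) = 0.01
```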
In practice the normal approximation is good for n > 100 or so. The variance formula also shows that β̂1 is more precise when X_i varies more: greater spread in the regressor pins down the slope more tightly (Fig 4.6).
Figure 4.6 The variance of β̂1 and the variance of X
Hypothesis testing: review. We often want to test, say, β1 = 0. First recall the procedure for testing a population mean:
1. State H0: E[Y] = μ_{Y,0} vs. H1: E[Y] ≠ μ_{Y,0}.
2. Compute Ȳ and its standard error SE(Ȳ).
3. Compute the t-statistic t = (Ȳ - μ_{Y,0}) / SE(Ȳ).
4. Compute the p-value; reject H0 if it is below the significance level.
For large n, t converges in distribution to N(0, 1) under H0, so p = 2Φ(-|t^act|).
Testing hypotheses about β1. Since β̂1 is approximately normal in large samples, the same four steps apply.
[Step 1] State
  H0: β1 = β1,0 vs. H1: β1 ≠ β1,0.   (5.2)
[Step 2] Compute the OLS estimate β̂1 and its standard error
  SE(β̂1) = sqrt(σ̂²_β̂1),  σ̂²_β̂1 = (1/n) × [ (1/(n-2)) Σ_{i=1}^n (X_i - X̄)² û_i² ] / [ (1/n) Σ_{i=1}^n (X_i - X̄)² ]²,   (5.3)
which estimates
  σ²_β̂1 = (1/n) var[(X_i - μ_X) u_i] / (var[X_i])².   (5.4)
[Step 3] Compute the t-statistic
  t = (β̂1 - β1,0) / SE(β̂1).   (5.5)
[Step 4] Compute the p-value, the probability under H0 of drawing an estimate at least as far from β1,0 as the one actually observed:
  p = Pr_{H0}( |β̂1 - β1,0| > |β̂1^act - β1,0| ) = Pr_{H0}( |t| > |t^act| ).   (5.6)
Since β̂1 is approximately normal,
  p = Pr_{H0}( |Z| > |t^act| ) = 2Φ(-|t^act|).   (5.7)
Reject H0 if p is below the chosen significance level.
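Steps 3 and 4 can be sketched in a few lines. The estimate and standard error below are hypothetical inputs; Φ is computed from math.erf, using the large-sample normal approximation of (5.7).

```python
import math

def normal_cdf(z):
    """Standard normal CDF Phi(z) via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def t_test(beta1_hat, beta1_null, se):
    """Two-sided t-test of H0: beta1 = beta1_null."""
    t = (beta1_hat - beta1_null) / se     # eq. (5.5)
    p = 2.0 * normal_cdf(-abs(t))         # eq. (5.7)
    return t, p

# Hypothetical estimate -2.28 with SE 0.52, testing H0: beta1 = 0.
t, p = t_test(beta1_hat=-2.28, beta1_null=0.0, se=0.52)
```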
One-sided alternatives use the same OLS t-statistic but a one-tailed p-value. For
  H0: β1 = β1,0 vs. H1: β1 < β1,0,   (5.9)
the p-value is
  p = Pr_{H0}( Z < t^act ) = Φ(t^act).   (5.10)
Confidence intervals for β1. A 95% confidence interval is the set of values β1,0 that cannot be rejected by a 5%-level test of H0: β1 = β1,0; equivalently, an interval constructed so that it contains the true β1 in 95% of repeated samples. Using the normal approximation,
  β̂1 ± 1.96 SE(β̂1), i.e. ( β̂1 - 1.96 SE(β̂1), β̂1 + 1.96 SE(β̂1) ).
Confidence interval for a predicted effect. If X changes by Δx, the predicted change in Y is Δy = β1 Δx, estimated by β̂1 Δx. A 95% confidence interval for the effect scales the interval for β1 by Δx:
  ( (β̂1 - 1.96 SE(β̂1)) Δx, (β̂1 + 1.96 SE(β̂1)) Δx ).   (5.13)
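Both intervals are a one-liner. The numbers below are hypothetical (slope estimate -2.28 with SE 0.52, and a two-unit decrease in X); note that with a negative Δx the endpoints of (5.13) swap, which scaling the SE by |Δx| handles automatically.

```python
def ci95(est, se):
    """95% confidence interval: estimate +/- 1.96 standard errors."""
    return est - 1.96 * se, est + 1.96 * se

# Interval for beta1 itself.
lo, hi = ci95(-2.28, 0.52)
# Interval for the effect of delta_x = -2: scale the point estimate by
# delta_x and the standard error by |delta_x|, per eq. (5.13).
eff_lo, eff_hi = ci95(-2.28 * (-2), 0.52 * 2)
```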
Regression with a binary regressor. Suppose the regressor D_i takes only the values 0 and 1 (an indicator variable, or dummy variable). Then β1 is no longer a slope along a line but the difference between two group means:
  D_i = 0: Y_i = β0 + u_i;  D_i = 1: Y_i = β0 + β1 + u_i,
so
  E[Y_i | D_i = 0] = β0,  E[Y_i | D_i = 1] = β0 + β1.   (5.16, 5.17)
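The two-group interpretation can be verified numerically. A sketch with made-up data: regressing Y on a 0/1 dummy gives an intercept equal to the mean of the D = 0 group and a slope equal to the difference between the two group means.

```python
def ols_fit(d, y):
    """OLS intercept and slope of y on a regressor d."""
    n = len(d)
    dbar, ybar = sum(d) / n, sum(y) / n
    b1 = (sum((a - dbar) * (b - ybar) for a, b in zip(d, y))
          / sum((a - dbar) ** 2 for a in d))
    return ybar - b1 * dbar, b1

d = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
b0, b1 = ols_fit(d, y)
mean0 = sum(yi for di, yi in zip(d, y) if di == 0) / 3  # group mean, D = 0
mean1 = sum(yi for di, yi in zip(d, y) if di == 1) / 3  # group mean, D = 1
```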
Thus β1 is the population difference in means between the two groups, and testing H0: β1 = 0 with the OLS t-statistic is exactly a test of equality of the two group means.
Measures of fit. How well does the OLS regression line describe the data? Two standard measures:
- R²: the fraction of the sample variance of Y_i explained by X_i; it lies between 0 and 1.
- The Standard Error of the Regression (SER): the typical size of the residuals, measured in the units of Y_i.
R². Decompose Y_i = Ŷ_i + û_i. Then
  R² = ESS / TSS = Σ_{i=1}^n (Ŷ_i - Ȳ)² / Σ_{i=1}^n (Y_i - Ȳ)²,   (4.16)
where ESS is the explained sum of squares and TSS the total sum of squares. With SSR = Σ_{i=1}^n û_i² (the sum of squared residuals), TSS = ESS + SSR, so equivalently
  R² = ESS/TSS = (TSS - SSR)/TSS = 1 - SSR/TSS = 1 - Σ_{i=1}^n û_i² / Σ_{i=1}^n (Y_i - Ȳ)².   (4.18)
R² lies in [0, 1]. If β̂1 = 0, so that X_i explains none of the variation in Y_i, then Ŷ_i = Ȳ for all i, ESS = 0, and R² = 0. If Ŷ_i = Y_i for all i, so that û_i = 0 for all i, then ESS = TSS and R² = 1. The closer R² is to 1, the more of the variation in Y_i the regressor explains.
SER. The SER estimates the standard deviation of the error u_i, using the residuals {û_1, û_2, ..., û_n} in place of the unobserved errors {u_1, u_2, ..., u_n}:
  SER = s_û,  s_û² = (1/(n-2)) Σ_{i=1}^n û_i² = SSR / (n - 2).   (4.19)
The divisor is n - 2 rather than n because two coefficients were estimated (a degrees-of-freedom correction).
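The two fit measures can be computed and cross-checked together. A sketch on made-up data: after an OLS fit with an intercept, the decomposition TSS = ESS + SSR holds exactly, so (4.16) and (4.18) give the same R², and the SER uses the n - 2 correction of (4.19).

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]   # illustrative, roughly linear data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
      / sum((a - xbar) ** 2 for a in x))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * a for a in x]          # fitted values, eq. (4.9)
resid = [b - f for b, f in zip(y, yhat)] # residuals, eq. (4.10)

tss = sum((b - ybar) ** 2 for b in y)     # total sum of squares
ess = sum((f - ybar) ** 2 for f in yhat)  # explained sum of squares
ssr = sum(e ** 2 for e in resid)          # sum of squared residuals
r2 = ess / tss                            # eq. (4.16)
ser = (ssr / (n - 2)) ** 0.5              # eq. (4.19)
```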
Homoskedasticity and heteroskedasticity. Assumption 1 fixes the conditional mean of u_i given X_i at zero, but says nothing about the conditional variance E[u_i² | X_i]. If this variance does not depend on X_i, the error is homoskedastic; if E[u_i² | X_i] varies with X_i, the error is heteroskedastic. (Since (X_i, Y_i) are i.i.d. by assumption 2, the same variance function applies to every i.) Compare Fig 4.4 with Fig 5.2.
Figure 5.2 An example of homoskedasticity
Figure 5.2 An example of heteroskedasticity
Example. In
  Earnings_i = β0 + β1 Male_i + u_i,   (5.19)
Male_i is a binary regressor (D_i = 0, 1), so β1 is the male-female difference in mean earnings. Does var(u_i | Male_i) depend on Male_i? If the spread of earnings differs between men and women, the error is heteroskedastic.
Figure 5.3 Scatter plot of hourly earnings and years of education: heteroskedastic or homoskedastic?
Efficiency of OLS: the Gauss-Markov theorem. If the three least squares assumptions hold and the errors are homoskedastic, then among all estimators that are linear in {Y_1, Y_2, ..., Y_n} and unbiased, OLS has the smallest variance, i.e. it is efficient: OLS is BLUE (Best Linear Unbiased Estimator).
Under homoskedasticity the variance of the OLS estimator simplifies to the homoskedasticity-only formula
  var(β̂1) = var[(X - μ_X) u] / ( n [var(X)]² ) = var(u_i) / ( n var(X_i) ).   (5.22)
Standard errors computed from this homoskedasticity-only formula are invalid if the errors are actually heteroskedastic, and t-statistics based on them are misleading. Heteroskedasticity-robust standard errors (Eicker-Huber-White robust standard errors), based on (5.4), are valid whether or not the errors are homoskedastic.
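The two standard-error formulas can be computed side by side. A sketch on simulated data (all parameters illustrative): the robust formula is the sample analogue of (5.3)-(5.4), the homoskedasticity-only formula is (5.22) in sample form, and on homoskedastic data the two should roughly agree.

```python
import random

def both_ses(x, y):
    """Return (robust SE, homoskedasticity-only SE) for the OLS slope."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    u = [b - (b0 + b1 * a) for a, b in zip(x, y)]  # residuals
    # Heteroskedasticity-robust variance: sample analogue of eq. (5.3)-(5.4).
    var_robust = (n / (n - 2)) * sum(((a - xbar) ** 2) * e * e
                                     for a, e in zip(x, u)) / sxx ** 2
    # Homoskedasticity-only variance: eq. (5.22) with s_u^2 = SSR/(n-2).
    var_homo = (sum(e * e for e in u) / (n - 2)) / sxx
    return var_robust ** 0.5, var_homo ** 0.5

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(500)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x]  # homoskedastic errors
se_r, se_h = both_ses(x, y)
```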
If the form of the heteroskedasticity is known, WLS (Weighted Least Squares), which applies OLS to data reweighted so that the transformed errors are homoskedastic, is BLUE and hence more efficient than OLS. But WLS requires knowing var(u_i | X_i), which is rarely known in practice, so the usual choice is OLS with heteroskedasticity-robust standard errors.
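A sketch of the WLS idea under one hypothetical skedastic function, var(u_i | X_i) = σ² X_i² (an assumption chosen purely for illustration): dividing the regression through by X_i gives Y_i/X_i = β1 + β0 (1/X_i) + u_i/X_i, whose error u_i/X_i is homoskedastic, so OLS on the transformed data is the WLS estimator.

```python
def wls_slope_intercept(x, y):
    """WLS for Y = b0 + b1*X + u assuming var(u|X) = sigma^2 * X^2:
    regress Y/X on 1/X by OLS; the transformed intercept is b1 and the
    transformed slope is b0. Returns (b0_hat, b1_hat)."""
    ystar = [b / a for a, b in zip(x, y)]   # Y/X
    xstar = [1.0 / a for a in x]            # 1/X
    n = len(x)
    xbar, ybar = sum(xstar) / n, sum(ystar) / n
    slope = (sum((a - xbar) * (b - ybar) for a, b in zip(xstar, ystar))
             / sum((a - xbar) ** 2 for a in xstar))
    return slope, ybar - slope * xbar

# On noiseless data from y = 2 + 3x, WLS recovers (b0, b1) = (2, 3).
b0, b1 = wls_slope_intercept([1, 2, 4, 5], [5, 8, 14, 17])
```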