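11 Count Data Analysis

1 Introduction

Standard references on count data models are Cameron and Trivedi (1998, 2005) and Winkelmann (1997). The classic early application of the Poisson distribution to count data is Bortkiewicz (1898).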
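2 Count Data Models (footnote 2)

The Poisson distribution is

\[ P(y = j) = \frac{e^{-\lambda}\lambda^{j}}{j!}, \qquad \lambda > 0,\quad j = 0, 1, 2, \ldots \]

Its mean and variance are equal:

\[ E(y) = \mathrm{Var}(y) = \lambda . \]

To let y depend on regressors x, the parameter is written as a function of a linear index, \(\lambda = \lambda(x_i'\beta)\). With the exponential form the conditional density is

\[ f(y_i \mid x_i'\beta) = \frac{\exp(-\exp(x_i'\beta))\,\exp(y_i x_i'\beta)}{y_i!}, \qquad y_i = 0, 1, 2, \ldots \]

so that

\[ E(y_i \mid x_i) = \exp(x_i'\beta), \qquad \mathrm{Var}(y_i \mid x_i) = \exp(x_i'\beta). \]

The log-likelihood function is

\[ \log L(\beta; y, x) = \sum_{i=1}^{n}\left\{ y_i x_i'\beta - \exp(x_i'\beta) - \ln y_i! \right\} \]

and the first-order condition for \(\beta\) is

\[ \sum_{i=1}^{n}\left( y_i - \exp(x_i'\beta) \right) x_i = 0 . \]

The coefficient \(\beta\) is interpreted through the marginal mean effect:

\[ \frac{\partial E(y_i \mid x_i)}{\partial x_{il}} = \exp(x_i'\beta)\,\beta_l = E(y_i \mid x_i)\,\beta_l . \]

Footnote 2. See Cameron and Trivedi (1998; 2005, Chapter 20) and Winkelmann and Boes (2006, pp.279-294).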
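The log-likelihood and first-order condition above can be checked numerically. The following is a minimal sketch, not the estimation routine used in these notes: it assumes a model with an intercept and one regressor, uses a small made-up data set (`X`, `y` are hypothetical), and maximizes the Poisson log-likelihood by Newton-Raphson using only the Python standard library.

```python
import math

def poisson_loglik(beta, X, y):
    """log L = sum_i { y_i x_i'b - exp(x_i'b) - ln y_i! }"""
    total = 0.0
    for xi, yi in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, xi))
        total += yi * xb - math.exp(xb) - math.lgamma(yi + 1)
    return total

def poisson_score(beta, X, y):
    """First-order condition: sum_i (y_i - exp(x_i'b)) x_i."""
    score = [0.0] * len(beta)
    for xi, yi in zip(X, y):
        lam = math.exp(sum(b * v for b, v in zip(beta, xi)))
        for l, v in enumerate(xi):
            score[l] += (yi - lam) * v
    return score

def fit_poisson_2param(X, y, iters=25):
    """Newton-Raphson for an intercept plus one regressor (2 parameters)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only MLE
    for _ in range(iters):
        s0, s1 = poisson_score([b0, b1], X, y)
        # Negative Hessian: sum_i lambda_i x_i x_i'
        h00 = h01 = h11 = 0.0
        for xi in X:
            lam = math.exp(b0 * xi[0] + b1 * xi[1])
            h00 += lam * xi[0] * xi[0]
            h01 += lam * xi[0] * xi[1]
            h11 += lam * xi[1] * xi[1]
        det = h00 * h11 - h01 * h01
        b0 += (h11 * s0 - h01 * s1) / det
        b1 += (-h01 * s0 + h00 * s1) / det
    return [b0, b1]

# Hypothetical illustration data: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 2.0], [1.0, 2.0]]
y = [0, 1, 1, 3, 2, 6]

beta_hat = fit_poisson_2param(X, y)
print("beta_hat:", beta_hat)
print("score at beta_hat:", poisson_score(beta_hat, X, y))
```

Because the model contains a constant, the first element of the score implies that the fitted means sum to the observed total count at the maximum.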
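Dividing the marginal mean effect by the conditional mean gives the relative effect (footnote 3):

\[ \frac{\partial E(y_i \mid x_i)/\partial x_{il}}{E(y_i \mid x_i)} = \beta_l . \]

That is, \(\beta_l\) is the proportional change in the expected value of \(y_i\) for a one-unit change in \(x_{il}\). The effect of \(x_l\) also depends on the level of any other regressor \(x_m\), because the cross derivative is not zero:

\[ \frac{\partial^{2} E(y_i \mid x_i)}{\partial x_{il}\,\partial x_{im}} = \exp(x_i'\beta)\,\beta_l\beta_m = E(y_i \mid x_i)\,\beta_l\beta_m \neq 0 . \]

The marginal probability effect is

\[ \frac{\partial P(y_i = j \mid x_i)}{\partial x_{il}} = P(y_i = j \mid x_i)\left[\, j - \exp(x_i'\beta) \,\right]\beta_l , \]

whose sign changes once, depending on whether \(j\) is smaller or larger than \(\exp(x_i'\beta)\) (footnote 4).

Count data often show overdispersion. One way to allow for it is to introduce unobserved heterogeneity \(u_i\) that follows a gamma distribution:

\[ f(y_i \mid x_i) = \int_{0}^{\infty} \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{y_i}}{y_i!}\; \frac{\gamma^{\theta}}{\Gamma(\theta)}\, u_i^{\theta-1} e^{-\gamma u_i}\, du_i , \qquad \lambda_i = \exp(x_i'\beta). \]

The normalization \(E(u_i \mid x_i) = 1\) requires \(\gamma = \theta\).

Footnote 3. The median is the 50% point of the distribution; the mode is its most frequent value.

Footnote 4. This is the single crossing property.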
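The marginal mean effect and the marginal probability effect of the Poisson model can be illustrated with a few lines of code. This is a sketch under assumed values: the coefficient vector `beta` and the regressor vector `x` are hypothetical, chosen so that exp(x'beta) = e (about 2.72). The probability effect is negative for counts below the conditional mean and positive above it, which is the single crossing pattern.

```python
import math

def cond_mean(beta, x):
    """E(y | x) = exp(x'beta)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

def poisson_pmf(j, lam):
    """P(y = j) for a Poisson distribution with mean lam."""
    return math.exp(-lam + j * math.log(lam) - math.lgamma(j + 1))

def mean_effect(beta, x, l):
    """dE(y|x)/dx_l = exp(x'beta) * beta_l."""
    return cond_mean(beta, x) * beta[l]

def prob_effect(beta, x, l, j):
    """dP(y=j|x)/dx_l = P(y=j|x) * (j - exp(x'beta)) * beta_l."""
    lam = cond_mean(beta, x)
    return poisson_pmf(j, lam) * (j - lam) * beta[l]

# Hypothetical values: constant plus one regressor, exp(0.2 + 0.5 * 1.6) = e.
beta = [0.2, 0.5]
x = [1.0, 1.6]

print("marginal mean effect:", mean_effect(beta, x, 1))
for j in range(7):
    print(j, round(prob_effect(beta, x, 1, j), 5))
```

Since the probabilities sum to one for every x, the marginal probability effects summed over all j are zero.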
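Substituting \(\gamma = \theta\) and integrating out \(u_i\):

\[
\begin{aligned}
f(y_i \mid x_i)
&= \int_{0}^{\infty} \frac{e^{-\lambda_i u_i}(\lambda_i u_i)^{y_i}}{y_i!}\; \frac{\theta^{\theta}}{\Gamma(\theta)}\, u_i^{\theta-1} e^{-\theta u_i}\, du_i \\
&= \frac{\lambda_i^{y_i}\,\theta^{\theta}}{y_i!\,\Gamma(\theta)} \int_{0}^{\infty} e^{-(\lambda_i+\theta)u_i}\, u_i^{y_i+\theta-1}\, du_i \\
&= \frac{\lambda_i^{y_i}\,\theta^{\theta}}{\Gamma(y_i+1)\,\Gamma(\theta)}\; \frac{\Gamma(y_i+\theta)}{(\lambda_i+\theta)^{y_i+\theta}} \\
&= \frac{\Gamma(y_i+\theta)}{\Gamma(y_i+1)\,\Gamma(\theta)} \left(\frac{\lambda_i}{\lambda_i+\theta}\right)^{y_i} \left(\frac{\theta}{\lambda_i+\theta}\right)^{\theta} .
\end{aligned}
\]

This is the negative binomial distribution (footnote 5). Its mean and variance are

\[ E(y_i \mid x_i) = \lambda_i , \qquad \mathrm{Var}(y_i \mid x_i) = \lambda_i\left(1 + \theta^{-1}\lambda_i\right). \]

The variance of the negative binomial (NB) model can be written as \(\mathrm{Var} = (1+\delta)\lambda_i\) with \(\delta = \theta_i^{-1}\lambda_i\). If \(\theta_i\) varies with \(i\) as \(\theta_i = \theta\lambda_i\), then \(\delta = \theta^{-1}\) is constant and we obtain the NB1 model:

\[ \mathrm{Var}(y_i \mid x_i) = (1+\sigma^{2})\exp(x_i'\beta), \qquad \delta = \theta^{-1} = \sigma^{2}. \]

The NB1 log-likelihood function is

\[ \ln L(\theta, \beta) = \sum_{i=1}^{n}\left\{ \left(\sum_{j=0}^{y_i-1}\ln\left(j + \theta\exp(x_i'\beta)\right)\right) - \ln y_i! - \left(y_i + \theta\exp(x_i'\beta)\right)\ln\left(1+\theta^{-1}\right) - y_i\ln\theta \right\} . \]

Differentiating with respect to \(\beta\) and \(\sigma^{2} = \theta^{-1}\) gives the first-order conditions

\[ \sum_{i=1}^{n}\left\{ \sum_{j=0}^{y_i-1}\frac{\theta\lambda_i}{j+\theta\lambda_i} - \theta\lambda_i\ln\left(1+\theta^{-1}\right) \right\} x_i = 0 , \]

\[ \sum_{i=1}^{n}\left\{ \theta^{2}\lambda_i\left[ \ln\left(1+\theta^{-1}\right) - \sum_{j=0}^{y_i-1}\frac{1}{j+\theta\lambda_i} \right] + \frac{\theta^{2}(y_i-\lambda_i)}{1+\theta} \right\} = 0 . \]

The case \(\delta = 0\) corresponds to the Poisson model.

If instead \(\theta_i = \theta\) is constant across \(i\), with \(\theta^{-1} = \sigma^{2}\), we obtain the NB2 model:

\[ \mathrm{Var}(y_i \mid x_i) = \exp(x_i'\beta) + \sigma^{2}\left[\exp(x_i'\beta)\right]^{2} . \]

Footnote 5. Greenwood and Yule (1920).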
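The gamma-mixture derivation of the negative binomial density can be verified numerically. The sketch below is illustrative only: the values `lam = 2.5` and `theta = 1.5` are hypothetical, and the integral is approximated with a plain midpoint rule rather than a library quadrature routine. It compares the closed-form density with the numerically integrated Poisson-gamma mixture, and checks the mean and the NB2 variance lambda * (1 + lambda / theta).

```python
import math

def nb2_pmf(y, lam, theta):
    """Closed form: Gamma(y+theta) / (Gamma(y+1) Gamma(theta))
    * (lam/(lam+theta))**y * (theta/(lam+theta))**theta."""
    log_p = (math.lgamma(y + theta) - math.lgamma(y + 1) - math.lgamma(theta)
             + y * math.log(lam / (lam + theta))
             + theta * math.log(theta / (lam + theta)))
    return math.exp(log_p)

def mixture_pmf(y, lam, theta, upper=30.0, steps=100_000):
    """Midpoint-rule approximation of the Poisson-gamma mixture integral
    with gamma shape theta and rate theta (so that E(u) = 1)."""
    h = upper / steps
    log_const = theta * math.log(theta) - math.lgamma(theta) - math.lgamma(y + 1)
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += math.exp(log_const - (lam + theta) * u
                          + y * math.log(lam * u) + (theta - 1) * math.log(u))
    return total * h

lam, theta = 2.5, 1.5  # hypothetical values for illustration

for y in (0, 3):
    print(y, nb2_pmf(y, lam, theta), mixture_pmf(y, lam, theta))

# Moments from the closed-form density (the tail beyond 500 is negligible here).
probs = [nb2_pmf(y, lam, theta) for y in range(500)]
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))
var = sum(y * y * p for y, p in enumerate(probs)) - mean ** 2
print("sum:", total, "mean:", mean, "variance:", var)
```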
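The NB2 log-likelihood function is

\[ \ln L(\theta, \beta) = \sum_{i=1}^{n}\left\{ \left(\sum_{j=0}^{y_i-1}\ln(j+\theta)\right) - \ln y_i! - (y_i+\theta)\ln\left(1+\theta^{-1}\exp(x_i'\beta)\right) - y_i\ln\theta + y_i x_i'\beta \right\} \]

with first-order conditions

\[ \sum_{i=1}^{n} \frac{y_i-\lambda_i}{1+\lambda_i/\theta}\, x_i = 0 , \]

\[ \sum_{i=1}^{n}\left\{ \theta^{2}\left[ \ln\left(1+\lambda_i/\theta\right) - \sum_{j=0}^{y_i-1}\frac{1}{j+\theta} \right] + \frac{y_i-\lambda_i}{\left(1+\lambda_i/\theta\right)/\theta} \right\} = 0 . \]

As \(\sigma^{2} \to 0\), both NB1 and NB2 reduce to the Poisson model.

Overdispersion can also be tested without estimating an NB model. After Poisson estimation, run the auxiliary OLS regression (without a constant)

\[ \frac{(y_i-\hat\lambda_i)^{2} - y_i}{\hat\lambda_i} = \alpha\,\frac{g(\hat\lambda_i)}{\hat\lambda_i} + u_i , \qquad g(\hat\lambda) = \hat\lambda^{2} \ \text{or}\ g(\hat\lambda) = \hat\lambda , \qquad \hat\lambda_i = \exp(x_i'\hat\beta), \]

and test the significance of \(\alpha\). Under \(\alpha = 0\) we have \(\mathrm{Var}(y_i \mid x_i) = \lambda_i\), that is, no overdispersion. The choice \(g(\lambda) = \lambda\) corresponds to the NB1 variance function and \(g(\lambda) = \lambda^{2}\) to the NB2 variance function. In Section 3 the hypotheses \(\delta = 0\) and \(\alpha = 0\) are tested for NB1 and NB2.

3 The RAND Health Insurance Experiment (footnote 6)

Footnote 6. See Manning, Newhouse, Duan, Keeler and Leibowitz (1987), Newhouse and the Insurance Experiment Group (1993), and Deb and Trivedi (2002).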
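The auxiliary-regression test for overdispersion can be sketched as follows. Everything here is an illustrative assumption rather than part of the original analysis: the data are simulated (true mean 3, gamma heterogeneity with theta = 2, so the NB2 parameter is 1/theta = 0.5), the model is intercept-only so the fitted Poisson mean is simply the sample mean, and the Poisson draws use Knuth's multiplication method from the standard library's uniform generator.

```python
import math
import random

def draw_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def overdispersion_test(y, lam_hat, g):
    """OLS without a constant of ((y - lam)^2 - y)/lam on g(lam)/lam.
    Returns the estimate of alpha and its t-ratio."""
    z = [((yi - li) ** 2 - yi) / li for yi, li in zip(y, lam_hat)]
    w = [g(li) / li for li in lam_hat]
    sww = sum(wi * wi for wi in w)
    alpha = sum(wi * zi for wi, zi in zip(w, z)) / sww
    resid = [zi - alpha * wi for wi, zi in zip(w, z)]
    s2 = sum(e * e for e in resid) / (len(y) - 1)
    return alpha, alpha / math.sqrt(s2 / sww)

rng = random.Random(12345)
n, lam, theta = 5000, 3.0, 2.0  # hypothetical simulation design

# Overdispersed counts: Poisson with gamma heterogeneity, E(u) = 1.
y_nb = [draw_poisson(lam * rng.gammavariate(theta, 1.0 / theta), rng) for _ in range(n)]
# Equidispersed counts for comparison.
y_po = [draw_poisson(lam, rng) for _ in range(n)]

def nb2_variance_term(m):
    return m * m  # g(lambda) = lambda^2

mean_nb = sum(y_nb) / n
mean_po = sum(y_po) / n
alpha_nb, t_nb = overdispersion_test(y_nb, [mean_nb] * n, nb2_variance_term)
alpha_po, t_po = overdispersion_test(y_po, [mean_po] * n, nb2_variance_term)
print("mixture data: alpha =", alpha_nb, "t =", t_nb)
print("Poisson data: alpha =", alpha_po, "t =", t_po)
```

In the simulated mixture data the estimate of alpha should lie near 0.5 with a large t-ratio, while in the pure Poisson data it should be close to zero.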
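The RAND Health Insurance Experiment was conducted from 1974 to 1982 at six sites (footnote 7). About 8,000 people in 2,823 families were randomly assigned to one of 14 insurance plans. The plans differ in the coinsurance rate (0%, 25%, 50%, 95%, 100%), include an individual deductible plan (footnote 9), and cap annual out-of-pocket expenditure at 5%, 10% or 15% of income up to a maximum of 1,000 dollars.

The variables used in the analysis and their descriptive statistics are as follows.

Variable   Definition                                                       Mean     Std. dev.
MDU        number of face-to-face visits to a physician                     2.861    4.505
LC         ln(1 + coinsurance rate), coinsurance rate in %                  1.710    1.962
IDP        1 if individual deductible plan, 0 otherwise                     0.220    0.414
LPI        ln(max(1, annual participation incentive payment))               4.709    2.697
FMDE       0 if IDP=1; ln(max(1, MDE/(0.01*coinsurance))) otherwise         3.153    3.641
PHYSLIM    1 if the person has a physical limitation                        0.124    0.322
NDISEASE   number of chronic diseases                                      11.244    6.742
HLTHG      1 if self-rated health is good                                   0.362    0.481
HLTHF      1 if self-rated health is fair                                   0.077    0.267
HLTHP      1 if self-rated health is poor                                   0.015    0.121
LINC       ln(family income)                                                8.708    1.228
LFAM       ln(family size)                                                  1.248    0.539
EDUCDEC    years of education of the household head                        11.967    2.806
AGE        age in years                                                    25.718   16.768
FEMALE     1 if female                                                      0.517    0.500
CHILD      1 if age is less than 18                                         0.402    0.490
FEMCHILD   FEMALE*CHILD                                                     0.194    0.395
BLACK      1 if the household head is black                                 0.182    0.383

Footnote 7. Dayton (Ohio), Seattle, Fitchburg, Franklin County, Charleston and Georgetown County, six sites in total.

Footnote 9. In the individual deductible plan the coinsurance rate is 95% for outpatient care, subject to an annual limit of 150 dollars per person (450 dollars per family).

Table 1 and Figure 1 show the distribution of MDU. 31% of the observations record no visit at all, and only about 5% record more than 10 visits. Table 2 reports the estimates of three models: Poisson, NB1 and NB2. The coefficients on LC and IDP are negative in all three models. In Table 2 the LR tests of \(\alpha = 0\) and \(\delta = 0\) reject the Poisson restriction for both NB1 and NB2. Figures 2-4 show histograms of the predicted number of visits from the three models, which can be compared with the 31% share of zeros in the actual data.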
Models that treat zero counts separately, such as the Hurdle and Zero-Inflated models, are discussed in Cameron and Trivedi (1998).
5 STATA Code

The data are those used in P. Deb and P.K. Trivedi (2002), The Structure of Demand for Health Care: Latent Class versus Two-Part Models, Journal of Health Economics, 21, 601-625. The file randdata.dta is available from www.econ.ucdavis.edu/faculty/cameron/ and the analysis follows Cameron and Trivedi (2005, Chapter 20).

    set more off
    use randdata.dta, clear

    /* educdec is missing for some observations */
    drop if educdec==.

    /* rename variables */
    rename mdvis MDU
    rename meddol MED
    rename binexp DMED
    rename lnmeddol LNMED
    rename linc LINC
    rename lfam LFAM
    rename educdec EDUCDEC
    rename xage AGE
    rename female FEMALE
    rename child CHILD
    rename fchild FEMCHILD
    rename black BLACK
    rename disea NDISEASE
    rename physlm PHYSLIM
    rename hlthg HLTHG
    rename hlthf HLTHF
    rename hlthp HLTHP
    rename idp IDP
    rename logc LC
    rename lpi LPI
    rename fmde FMDE

    /* Define the regressor list, which later commands can refer to as $XLIST */
    global XLIST LC IDP LPI FMDE PHYSLIM NDISEASE HLTHG HLTHF HLTHP LINC LFAM EDUCDEC AGE FEMALE CHILD FEMCHILD BLACK

    /* descriptive statistics */
    sum MDU $XLIST
    /* frequency table of MDU (Table 1) */
    tabulate MDU

    /* histogram of MDU (Figure 1) */
    hist MDU

    /* Poisson regression */
    poisson MDU $XLIST
    estimates store poisml
    poisson MDU $XLIST, robust            /* Poisson with robust z */
    estimates store poisrobust
    predict MDUhat                        /* predicted number of visits */
    poisson MDU $XLIST, cluster(zper)     /* Poisson with cluster-robust z */
    estimates store poiscluster

    /* NB regression */
    nbreg MDU $XLIST, dispersion(mean)              /* NB2 */
    nbreg MDU $XLIST, dispersion(mean) robust       /* NB2 with robust z */
    estimates store nb2
    predict MDUhat2
    nbreg MDU $XLIST, cluster(zper)                 /* NB2 with cluster-robust z */
    estimates store nbcluster
    nbreg MDU $XLIST, dispersion(constant)          /* NB1 */
    nbreg MDU $XLIST, dispersion(constant) robust   /* NB1 with robust z */
    estimates store nb1
    predict MDUhat1

    /* histograms of the predicted number of visits */
    hist MDUhat                           /* Poisson (Figure 2) */
    graph save MDUhat.gph, replace
    hist MDUhat1                          /* NB1 (Figure 3) */
    graph save MDUhat1.gph, replace
    hist MDUhat2                          /* NB2 (Figure 4) */
    graph save MDUhat2.gph, replace
References

[1] (2005).
[2] (2007) No.57 (July 2007), pp.23-52.
[3] Amemiya, T. (1985) Advanced Econometrics, Harvard University Press.
[4] Bortkiewicz, L. von (1898) Das Gesetz der kleinen Zahlen, Leipzig, Teubner.
[5] Cameron, A.C. and Trivedi, P.K. (1998) Regression Analysis of Count Data, Cambridge University Press.
[6] Cameron, A.C. and Trivedi, P.K. (2005) Microeconometrics: Methods and Applications, Cambridge University Press.
[7] Deb, Partha and Trivedi, Pravin K. (2002) The Structure of Demand for Health Care: Latent Class versus Two-Part Models, Journal of Health Economics, 21, pp.601-625.
[8] Gourieroux, Christian and Jasiak, Joann (2007) The Econometrics of Individual Risk, Princeton University Press.
[9] Greenwood, M. and Yule, G.U. (1920) An Inquiry into the Nature of Frequency Distributions of Multiple Happenings with Particular Reference to the Occurrence of Multiple Attacks of Disease or Repeated Accidents, Journal of the Royal Statistical Society, Series A, 83, pp.255-279.
[10] Maddala, G.S. (1983) Limited-Dependent and Qualitative Variables in Econometrics, Cambridge University Press.
[11] Manning, Willard G., Newhouse, Joseph P., Duan, Naihua, Keeler, Emmett B., and Leibowitz, Arleen (1987) Health Insurance and the Demand for Medical Care: Evidence from a Randomized Experiment, American Economic Review, 77(3), pp.251-277.
[12] Newhouse, Joseph P. and the Insurance Experiment Group (1993) Free for All? Lessons from the RAND Health Insurance Experiment, Harvard University Press.
[13] Winkelmann, R. (1997) Count Data Models: Econometric Theory and Application to Labor Mobility, Springer-Verlag.
[14] Winkelmann, Rainer and Boes, Stefan (2006) Analysis of Microdata, Springer.
[15] Wooldridge, Jeffrey M. (2003a) Introductory Econometrics, Thomson.
[16] Wooldridge, Jeffrey M. (2003b) Econometric Analysis of Cross Section and Panel Data, The MIT Press.
Table 1  Frequency distribution of the number of doctor visits

Visits   Frequency   Percent   Cumulative
0            6,308     31.25        31.25
1            3,815     18.90        50.15
2            2,795     13.85        63.99
3            1,884      9.33        73.33
4            1,345      6.66        79.99
5              968      4.80        84.79
6              689      3.41        88.20
7              531      2.63        90.83
8              408      2.02        92.85
9              287      1.42        94.27
10             206      1.02        95.29
11             190      0.94        96.24
12             118      0.58        96.82
13             109      0.54        97.36
14              82      0.41        97.77
15              59      0.29        98.06
16              56      0.28        98.34
17              33      0.16        98.50
18              37      0.18        98.68
19              35      0.17        98.86
20              26      0.13        98.98
21              22      0.11        99.09
22              19      0.09        99.19
23              19      0.09        99.28
24              13      0.06        99.35
25               8      0.04        99.39
26              10      0.05        99.44
27               6      0.03        99.46
28              12      0.06        99.52
29               6      0.03        99.55
30               8      0.04        99.59
31               8      0.04        99.63
32               4      0.02        99.65
33               5      0.02        99.68
34               9      0.04        99.72
35               5      0.02        99.75
37               5      0.02        99.77
38               9      0.04        99.82
39               1      0.00        99.82
40               3      0.01        99.84
41               5      0.02        99.86
44               6      0.03        99.89
45               2      0.01        99.90
46               2      0.01        99.91
48               2      0.01        99.92
51               1      0.00        99.93
52               3      0.01        99.94
55               1      0.00        99.95
56               1      0.00        99.95
57               1      0.00        99.96
58               1      0.00        99.96
62               1      0.00        99.97
63               1      0.00        99.97
65               1      0.00        99.98
69               1      0.00        99.98
72               1      0.00        99.99
74               1      0.00        99.99
76               1      0.00       100.00
77               1      0.00       100.00
Total       20,186    100
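The frequency distribution in Table 1 is enough to see why a Poisson model fits these data poorly. The sketch below re-enters the table as (visits, frequency) pairs and computes the sample mean, the standard deviation, and the observed share of zeros, and compares that share with the zero probability exp(-mean) implied by a Poisson distribution with the same mean. The comparison is a rough illustration without regressors, not a formal test.

```python
import math

# (number of visits, frequency) pairs copied from Table 1
freq = [
    (0, 6308), (1, 3815), (2, 2795), (3, 1884), (4, 1345), (5, 968),
    (6, 689), (7, 531), (8, 408), (9, 287), (10, 206), (11, 190),
    (12, 118), (13, 109), (14, 82), (15, 59), (16, 56), (17, 33),
    (18, 37), (19, 35), (20, 26), (21, 22), (22, 19), (23, 19),
    (24, 13), (25, 8), (26, 10), (27, 6), (28, 12), (29, 6),
    (30, 8), (31, 8), (32, 4), (33, 5), (34, 9), (35, 5),
    (37, 5), (38, 9), (39, 1), (40, 3), (41, 5), (44, 6),
    (45, 2), (46, 2), (48, 2), (51, 1), (52, 3), (55, 1),
    (56, 1), (57, 1), (58, 1), (62, 1), (63, 1), (65, 1),
    (69, 1), (72, 1), (74, 1), (76, 1), (77, 1),
]

n = sum(f for _, f in freq)
mean = sum(j * f for j, f in freq) / n
var = sum((j - mean) ** 2 * f for j, f in freq) / (n - 1)
sd = math.sqrt(var)
zero_share = freq[0][1] / n
poisson_zero = math.exp(-mean)  # P(y = 0) under a Poisson with the same mean

print("n =", n)
print("mean =", round(mean, 3), " sd =", round(sd, 3), " variance/mean =", round(var / mean, 2))
print("observed share of zeros =", round(zero_share, 4))
print("Poisson-implied share of zeros =", round(poisson_zero, 4))
```

The variance is several times the mean and the observed share of zeros is far above the Poisson-implied share, which is the overdispersion that motivates the NB models.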
Table 2  Count data analysis: doctor visits (1)

Dependent variable: MDU

                         Poisson                   NB2                       NB1
Variable        Coefficient  Robust z     Coefficient  Robust z     Coefficient  Robust z
LC                 -0.043      -2.84         -0.050      -3.23         -0.057      -5.27
IDP                -0.161      -5.77         -0.148      -4.86         -0.179      -8.66
LPI                 0.013       2.91          0.016       3.57          0.014       4.43
FMDE               -0.021      -2.32         -0.021      -2.35         -0.013      -2.14
PHYSLIM             0.268       8.24          0.275       8.07          0.201       8.34
NDISEASE            0.023      13.49          0.026      15.32          0.020      16.13
HLTHG               0.039       1.70          0.007       0.27          0.038       2.32
HLTHF               0.253       5.89          0.237       5.43          0.207       6.43
HLTHP               0.522       6.97          0.426       6.20          0.520       8.34
LINC                0.083       5.99          0.085       7.42          0.075       7.27
LFAM               -0.130      -5.72         -0.123      -5.30         -0.097      -5.98
EDUCDEC             0.018       4.36          0.016       4.03          0.022       7.48
AGE                 0.002       2.12          0.003       2.33          0.002       2.22
FEMALE              0.349      12.30          0.367      12.85          0.371      18.17
CHILD               0.336       8.32          0.306       7.13          0.323      10.71
FEMCHILD           -0.363      -8.21         -0.376      -8.40         -0.385     -12.58
BLACK              -0.680     -18.44         -0.710     -19.76         -0.721     -25.69
_cons              -0.190      -1.49         -0.207      -1.83         -0.129      -1.36

Number of observations    20186                   20186                     20186
LR chi2(17)               13106.07                2828.01                   3404.09
Pseudo R2                 0.098                   0.032                     0.039
Log likelihood            -60087.622              -42777.611                -42489.57
alpha                                             1.182
delta                                                                       3.460
LR test of alpha=0                                chi2(01)=3.5e+04
                                                  Prob>chi2=0.000
LR test of delta=0                                                          chi2(01)=3.5e+04
                                                                            Prob>chi2=0.000

Note: NB2 is estimated with nbreg, dispersion(mean), which reports alpha; NB1 is estimated with nbreg, dispersion(constant), which reports delta.
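Two quantities in Table 2 can be reproduced by hand. First, the LR statistic for the Poisson restriction is twice the difference between the log likelihood of a negative binomial model and that of the Poisson model. Second, because E(y|x) = exp(x'beta), the coefficient on a dummy variable translates into a percentage change in the expected number of visits of 100 * (exp(beta) - 1). The sketch below uses only figures printed in Table 2; the variable names and the chi-squared(1) 5% critical value of 3.84 used for comparison are choices made here for illustration.

```python
import math

# Log likelihoods as printed in Table 2
ll_poisson = -60087.622
ll_negbin = [-42777.611, -42489.57]  # the two negative binomial columns

# LR statistic for the Poisson restriction: 2 * (lnL_NB - lnL_Poisson)
lr_stats = [2.0 * (ll - ll_poisson) for ll in ll_negbin]
for lr in lr_stats:
    print("LR statistic: %.3f (about %.1e), exceeds 3.84: %s" % (lr, lr, lr > 3.84))

# Poisson coefficients on three dummy variables, from Table 2
poisson_coef = {"IDP": -0.161, "FEMALE": 0.349, "BLACK": -0.680}

# Percentage change in expected visits when the dummy switches from 0 to 1
pct_effect = {name: 100.0 * (math.exp(b) - 1.0) for name, b in poisson_coef.items()}
for name, pct in pct_effect.items():
    print("%-7s %+.1f%%" % (name, pct))
```

Both LR statistics round to 3.5e+04, the value reported in the table.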
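Figure 1  Histogram of the actual number of doctor visits (horizontal axis: number of face-to-face MD visits, 0 to 80; vertical axis: density)

Figure 2  Histogram of the predicted number of doctor visits based on the Poisson estimates (horizontal axis: predicted number of events, 0 to 25; vertical axis: density)

Figure 3  Histogram of the predicted number of doctor visits based on the NB1 estimates (horizontal axis: predicted number of events, 0 to 20; vertical axis: density)

Figure 4  Histogram of the predicted number of doctor visits based on the NB2 estimates (horizontal axis: predicted number of events, 0 to 25; vertical axis: density)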