Vol. 43, No. 1, September 2013, pp. 41-58

Joint Random Effect Modeling for Repeated Durations and Discrete Choices with Selection Bias Correction: Application to Promotion Policy Planning for Potential Clients Using Web Access-log Data

Takahiro Hoshino

We point out that current research in big data analytics neglects two important issues in constructing prediction models: consumer heterogeneity and selection bias. We provide a real example in which both issues must be handled properly: the joint modeling of repeated durations and purchase behavior. More concretely, we apply the joint model to a Web access-log dataset from a very large panel study. To plan a promotion policy for potential clients of an online shopping company, we propose a propensity score weighted generalized EM algorithm for the proposed model, which adjusts for covariate differences between potential clients and current clients. The proposed model incorporates random effects expressing unmeasured heterogeneity, which inevitably requires numerical integration. In large datasets, however, it is not practical to employ Markov chain Monte Carlo methods for random effect modeling. We apply the fully exponential Laplace approximation to the estimation algorithm of the proposed model and find that the algorithm is less computationally expensive while still providing accurate estimates.

464-8601 (E-mail: hoshino@soec.nagoyau.ac.jp)
1. (2012) (2013) 2013 1 2 ( (2009) Angrist and Pischke (2009))
1 3000 1.5 Markov Chain Monte Carlo MCMC MCMC MCMC MCMC (Jordan et al. (1999)) ( Wang and Titterington (2004) Braun and McAuliffe (2010)) 18 Tierney and Kadane (1986) 20 MCMC MCMC Rue et al. (2009) latent Gaussian model (integrated nested Laplace approximation: INLA) Fong et al. (2010) INLA Rizopoulos et al. (2009) Bianconcini and Cagnone (2012) (Tierney et al. (1989)) (2012)
1 * EM Web 1 z = 1 1 z = 0 (1) (2) z = 1
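(z = 0) (z = 0) EM 2 3 E EM 4 5

2.

2.1

(recurrent event duration analysis, Seetharaman and Chintagunta (2003) Bijwaard et al. (2006)) (recurrent event survival analysis) Web Key Performance Indicator; KPI KPI KPI

Let $y^D_{ij}$ denote the duration between the $j$-th and $(j+1)$-th events of consumer $i$, and let $y^B_{ij}$ be a binary variable taking the value 1 or 0 for the choice observed at the $(j+1)$-th event. $J_i$ is the number of intervals observed for consumer $i$ ($j = 1, \ldots, J_i$), $x_i$ is the covariate vector of consumer $i$, and $w_{ij}$ is the covariate vector for consumer $i$ between the $j$-th and $(j+1)$-th events. $f_i$ is the random effect of consumer $i$, with $f_i \sim N(0, \phi)$. The log-transformed duration is written $y^{LD}_{ij} = \log y^D_{ij}$.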
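\[
f\left(y^{LD}_{ij} \mid f_i, w_{ij}\right) = \frac{1}{\sigma} \exp\left[ \frac{y^{LD}_{ij} - (\beta_0 + f_i + w_{ij}^t \beta_w)}{\sigma} - \exp\left( \frac{y^{LD}_{ij} - (\beta_0 + f_i + w_{ij}^t \beta_w)}{\sigma} \right) \right] \tag{2.1}
\]

(Klein and Moeschberger (2003)) This is the extreme value density for the log-duration, with location $\beta_0 + f_i + w_{ij}^t \beta_w$ and scale $\sigma$. The binary choice $y^B_{ij}$ at the $(j+1)$-th event also depends on $f_i$:

\[
\operatorname{logit}\left[ p\left(y^B_{ij} = 1 \mid f_i, w_{ij}\right) \right] = \alpha_0 + \alpha_f f_i + w_{ij}^t \alpha_w \tag{2.2}
\]

\[
p(y_{i1}, \ldots, y_{iJ_i} \mid w_{i1}, \ldots, w_{iJ_i}) = \int \prod_{j=1}^{J_i} p\left(y^{LD}_{ij} \mid f_i, w_{ij}\right) p\left(y^B_{ij} \mid f_i, w_{ij}\right) p(f_i) \, df_i \tag{2.3}
\]

where $y_{ij} = (y^{LD}_{ij}, y^B_{ij})^t$. $\alpha_f$

2.2

1 $z_i = 0$. $z_i$ is an indicator for consumer $i$ taking the value 1 or 0; $y_{i1}, \ldots, y_{iJ_i}$ are observed when $z_i = 1$ and are not observed when $z_i = 0$. x y w (Missing at random)

\[
\sum_{i=1}^{N} \frac{z_i \left(1 - w(x_i)\right)}{w(x_i)} \log p(y_{i1}, \ldots, y_{iJ_i} \mid w_{i1}, \ldots, w_{iJ_i}) \tag{2.4}
\]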
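The density (2.1), the logit model (2.2), and the marginal likelihood (2.3) can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: it evaluates the integral in (2.3) by Gauss-Hermite quadrature rather than by the Laplace approximation used later in the paper, it assumes that $\phi$ is the variance of $f_i$, and the function names and the dictionary `theta` are hypothetical.

```python
import numpy as np

def log_duration_logpdf(y_ld, f, w, beta0, beta_w, sigma):
    """Log of the extreme value density (2.1) for log-durations."""
    e = (y_ld - (beta0 + f + w @ beta_w)) / sigma
    return -np.log(sigma) + e - np.exp(e)

def choice_logpmf(y_b, f, w, alpha0, alpha_f, alpha_w):
    """Log Bernoulli probability implied by the logit model (2.2)."""
    eta = alpha0 + alpha_f * f + w @ alpha_w
    # y * eta - log(1 + exp(eta)), written stably with logaddexp
    return y_b * eta - np.logaddexp(0.0, eta)

def marginal_loglik(y_ld, y_b, w, theta, n_nodes=30):
    """Approximate the log of (2.3) for one consumer.

    Assumption: f_i ~ N(0, phi) with phi a variance, integrated out by
    Gauss-Hermite quadrature (a swapped-in technique for illustration).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    f_grid = np.sqrt(2.0 * theta["phi"]) * nodes  # change of variable
    log_terms = np.empty(n_nodes)
    for k, f in enumerate(f_grid):
        ll = log_duration_logpdf(
            y_ld, f, w, theta["beta0"], theta["beta_w"], theta["sigma"]
        ).sum()
        ll += choice_logpmf(
            y_b, f, w, theta["alpha0"], theta["alpha_f"], theta["alpha_w"]
        ).sum()
        log_terms[k] = ll + np.log(weights[k]) - 0.5 * np.log(np.pi)
    m = log_terms.max()
    return m + np.log(np.exp(log_terms - m).sum())  # log-sum-exp
```

Here `w` is a $J_i \times q$ array of interval covariates and `y_ld`, `y_b` are length-$J_i$ arrays. Quadrature is convenient for checking a single consumer, but it is exactly the per-consumer numerical integration that becomes costly for large $N$, which motivates the Laplace approach in Section 3.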
z = 0 p(y | w, z = 0) ( (2005) Hoshino et al. (2006) Wooldridge (2007) Pan and Schaubel (2009)) w(x_i) x_i z_i = 1 y (z = 0) w y x y y w x 4 Web 1 z = 0 y w x (z = 1) p(x | z = 0)

\[
w(x_i) = \frac{p(x_i \mid z_i = 1)\, p(z_i = 1)}{p(x_i \mid z_i = 1)\, p(z_i = 1) + p(x_i \mid z_i = 0)\, p(z_i = 0)} \tag{2.5}
\]

By Bayes' rule, $w(x_i)$ is the probability that $z_i = 1$ given $x_i$.

2.3

Vaida and Xu (2000) clustered data EM Rizopoulos et al. (2009) 1 Web 2

3. EM

The parameters are collected as $\alpha = (\alpha_0, \alpha_f, \alpha_w^t)^t$, $\beta = (\sigma, \beta_0, \beta_w^t)^t$, and $\theta = (\alpha^t, \beta^t, \phi)^t$.
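The weighting in (2.4) and (2.5) can be sketched as follows. This is an illustrative sketch under assumptions: the function names are hypothetical, the clipping constant `eps` is a safeguard added here and not part of the source, and in practice $w(x_i)$ would come from a fitted propensity model rather than from known densities.

```python
import numpy as np

def propensity_from_densities(px_z1, px_z0, p_z1):
    """Bayes-rule form of (2.5): probability of z = 1 given x."""
    num = px_z1 * p_z1
    return num / (num + px_z0 * (1.0 - p_z1))

def selection_weights(z, w_x, eps=1e-6):
    """Weights z_i (1 - w(x_i)) / w(x_i) appearing in (2.4)."""
    # eps is an added safeguard against division by zero (assumption)
    w_x = np.clip(w_x, eps, 1.0 - eps)
    return z * (1.0 - w_x) / w_x

def weighted_loglik(z, w_x, loglik_i):
    """Weighted log-likelihood (2.4), given per-consumer log-likelihoods."""
    return float(np.sum(selection_weights(z, w_x) * loglik_i))
```

Only consumers with $z_i = 1$ receive a nonzero weight, and those whose covariates resemble the $z = 0$ group (small $w(x_i)$) are weighted up, so that weighted averages over the $z = 1$ group estimate the corresponding quantities in the $z = 0$ group.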
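\[
S(\theta) = \sum_{i=1}^{N} \frac{z_i (1 - w(x_i))}{w(x_i)} S_i(\theta)
= \sum_{i=1}^{N} \frac{z_i (1 - w(x_i))}{w(x_i)} \int \frac{\partial}{\partial \theta} \log g(f_i, y_i, w_i \mid \theta)\, p(f_i \mid y_i, w_i, \theta)\, df_i \tag{3.1}
\]

\[
S_i(\theta) = \frac{\partial}{\partial \theta} \log p(y_{i1}, \ldots, y_{iJ_i} \mid w_{i1}, \ldots, w_{iJ_i}) \tag{3.2}
\]

\[
g(f_i, y_i, w_i \mid \theta) = \left\{ \prod_{j=1}^{J_i} p\left(y^{LD}_{ij} \mid f_i, w_{ij}\right) p\left(y^B_{ij} \mid f_i, w_{ij}\right) \right\} p(f_i \mid \phi) \tag{3.3}
\]

\[
p(f_i \mid y_i, w_i, \theta) = \frac{g(f_i, y_i, w_i \mid \theta)}{\int g(f_i, y_i, w_i \mid \theta)\, df_i}
\]

\[
\frac{\partial}{\partial \theta} \log g(f_i, y_i, w_i \mid \theta) = \frac{\partial}{\partial \alpha} \sum_{j=1}^{J_i} \log p\left(y^B_{ij} \mid f_i, w_{ij}\right) + \frac{\partial}{\partial \beta} \sum_{j=1}^{J_i} \log p\left(y^{LD}_{ij} \mid f_i, w_{ij}\right) + \frac{\partial}{\partial \phi} \log p(f_i \mid \phi)
\]

(Tierney et al. (1989)) $S_i(\theta)$ $r$

\[
\hat{S}_{ir}(\theta) = \frac{\partial \log g(\hat{f}_i, y_i, w_i \mid \theta)}{\partial \theta_r} - \frac{1}{2} \gamma_{ir} \tag{3.4}
\]

$O(J_i^{-2})$ (Rizopoulos et al. (2009)) $\theta_r$ $\theta_r$

\[
\gamma_{ir} = \Sigma_i^{-1} \left\{ \frac{\partial \Sigma_i}{\partial \theta_r} + \frac{\partial \Sigma_i}{\partial f_i} \Sigma_i^{-1} \left. \frac{\partial^2}{\partial \theta_r\, \partial f_i} \log g(f_i, y_i, w_i \mid \theta) \right|_{f_i = \hat{f}_i} \right\} \tag{3.5}
\]

\[
\Sigma_i = - \left. \frac{\partial^2}{\partial f_i^2} \log g(f_i, y_i, w_i \mid \theta) \right|_{f_i = \hat{f}_i} \tag{3.6}
\]

EM (1)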
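(2) Given the values from step (1), compute from (3.3) the mode $\hat{f}_i$ of $f_i$ and the first, second, and third derivatives of $\log g(f_i, y_i, w_i \mid \theta)$.
(3) Evaluate (3.1).
(4) Update $\theta$ to obtain $\hat{\theta}$.
(5) Repeat steps (2) to (4).

EM MCMC S(θ) w(x_i) 1 w_ij 2 1 0.7 1 1 N = 5,000 100,000 2 J_i J_i (1) 10 3 17 (2) 50 5 95 2 4 1 MCMC MCMC iteration 3000 1000 Burn-in phase Geweke 1 MCMC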
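The mode-finding in step (2) and the resulting approximation to the integral of $g$ in (3.3) can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's SAS/IML code: it implements the ordinary first-order Laplace approximation of $\log \int g \, df_i$, not the fully exponential score correction (3.4); $\phi$ is taken to be the variance of $f_i$; the step-halving rule is a safeguard added here; and all function names and the dictionary `th` are hypothetical.

```python
import numpy as np

def _pieces(f, y_ld, w, th):
    """Standardized residual of (2.1) and linear predictor of (2.2)."""
    e = (y_ld - (th["beta0"] + f + w @ th["beta_w"])) / th["sigma"]
    eta = th["alpha0"] + th["alpha_f"] * f + w @ th["alpha_w"]
    return e, eta

def log_g(f, y_ld, y_b, w, th):
    """log g(f_i, y_i, w_i | theta) from (3.3)."""
    e, eta = _pieces(f, y_ld, w, th)
    dur = np.sum(-np.log(th["sigma"]) + e - np.exp(e))
    cho = np.sum(y_b * eta - np.logaddexp(0.0, eta))
    prior = -0.5 * np.log(2.0 * np.pi * th["phi"]) - 0.5 * f ** 2 / th["phi"]
    return dur + cho + prior

def grad_and_curvature(f, y_ld, y_b, w, th):
    """First derivative of log g in f, and Sigma_i = minus the second derivative."""
    e, eta = _pieces(f, y_ld, w, th)
    p = 1.0 / (1.0 + np.exp(-eta))
    grad = -f / th["phi"] + np.sum(
        (np.exp(e) - 1.0) / th["sigma"] + th["alpha_f"] * (y_b - p)
    )
    curv = 1.0 / th["phi"] + np.sum(
        np.exp(e) / th["sigma"] ** 2 + th["alpha_f"] ** 2 * p * (1.0 - p)
    )
    return grad, curv

def find_mode(y_ld, y_b, w, th, f0=0.0, tol=1e-10, max_iter=100):
    """Newton iterations for the mode of log g in f."""
    f = f0
    for _ in range(max_iter):
        grad, curv = grad_and_curvature(f, y_ld, y_b, w, th)
        step = grad / curv
        current = log_g(f, y_ld, y_b, w, th)
        # Halve the step while log g would decrease (added safeguard).
        while log_g(f + step, y_ld, y_b, w, th) < current and abs(step) > tol:
            step *= 0.5
        f += step
        if abs(step) < tol:
            break
    return f

def laplace_log_marginal(y_ld, y_b, w, th):
    """First-order Laplace approximation to log of the integral of g over f."""
    f_hat = find_mode(y_ld, y_b, w, th)
    _, curv = grad_and_curvature(f_hat, y_ld, y_b, w, th)
    return log_g(f_hat, y_ld, y_b, w, th) + 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(curv)
```

Because the random effect is scalar, each consumer needs only a one-dimensional Newton search and a few sums over $j$, which is why this family of approximations scales to large $N$ where per-consumer sampling or quadrature would not.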
J_i 10 50 MCMC MCMC SAS/IML Windows 7 64bit Intel Core i7-3930k (6 12way/3.20GHz/3.80GHz/12MB) 32GB N = 100,000 J_i 50 10.2 MCMC iteration 3000 233.1 (4 18 23 ) MCMC MCMC 3000 iteration MCMC

4.

13000 URL URL URL web 2011 9708 7116 Random Digit Dialing; RDD 1 12 1 A URL
URL 30 2010 4 2012 2 A 3 2 (z = 1) B 2 A 3 2 (z = 0) 196 2258 x (5 ) (9 ) (6 ) ( 3 ) (13 ) w ij 24 4 Web 3 (yahoo google ) blog SNS (twitter facebook mixi) ( ) 8 URL A URL A (z = 1) B )
2 Web Web (z = 1) (z = 0) A (z = 1) 4483 (z = 0) 1397 8.3 2 31 31 4 (1) (2.4) EM (2) EM (3)
(4) EM 4 2 (4) (4) (3) α f φ 2 (1) (2) (3) (1) (4) 5% (1) (3) (4) (2) (3) α w Yahoo! (4) A A (1) (4) google (2) (3) Yahoo! Yahoo! A Yahoo! google Web 10% ROI (Return on investment) (4) 2 3
Table 2  Web  4

                  (1)               (2)               (3)               (4)
Parameter         Est.   SE         Est.   SE         Est.   SE         Est.   SE
σ                 1.387  0.01039    1.389  0.00723    1.476  0.00695    1.364  0.00971
β0                0.229  0.00873    0.234  0.00721    0.261  0.00583    0.233  0.01028
βw  12-18         0.165  0.03624    0.084  0.02140    0.053  0.01889    0.178  0.04865
βw  18-22         0.089  0.04009    0.102  0.02509    0.179  0.02489    0.092  0.04803
βw  22-6          0.137  0.04093    0.088  0.03305    0.050  0.03107    0.152  0.04848
βw  Yahoo!        0.028  0.04510    0.141  0.03984    0.111  0.03791    0.049  0.06824
βw  google        0.299  0.07304    0.093  0.06985    0.101  0.06208    0.367  0.08607
βw                0.109  0.10028    0.034  0.07709    0.060  0.08094    0.130  0.11276
βw  blog          0.038  0.06904    0.110  0.07090    0.381  0.05033    0.093  0.06687
βw  SNS           0.091  0.06002    0.009  0.05809    0.109  0.05034    0.088  0.07328
βw                0.093  0.08709    0.034  0.04095    0.051  0.03097    0.121  0.08797
βw                0.039  0.05001    0.019  0.04098    0.054  0.03887    0.071  0.07321
α0                3.869  0.29503    3.739  0.15600    3.593  0.13978    4.361  0.40943
αf                0.321  0.11010    0.208  0.06982                      0.409  0.14132
αw  12-18         0.110  0.07093    0.050  0.04499    0.069  0.04109    0.108  0.09834
αw  18-22         0.051  0.04094    0.065  0.02499    0.041  0.02098    0.039  0.07169
αw  22-6          0.210  0.05097    0.049  0.06097    0.110  0.05570    0.261  0.06095
αw  Yahoo!        0.019  0.03069    0.109  0.01610    0.098  0.01609    0.048  0.02950
αw  google        0.081  0.02483    0.051  0.03329    0.053  0.03192    0.083  0.03082
αw                0.060  0.06094    0.107  0.04508    0.050  0.04604    0.101  0.09989
αw  blog          0.101  0.05098    0.030  0.03097    0.060  0.02806    0.101  0.08426
αw  SNS           0.179  0.06093    0.061  0.05093    0.110  0.04330    0.210  0.08272
αw                0.041  0.07083    0.149  0.03780    0.101  0.03512    0.098  0.04133
αw                0.080  0.06095    0.008  0.04397    0.076  0.03799    0.051  0.07075
φ                 0.832  0.05938    0.790  0.03921                      0.889  0.08983
(4) 2             0.28059           0.71366           1.01456
5%
3 * z ( (2009)) c 0.7331 z

5.

Web MCMC (transfer learning) ( (2010)) (Shimodaira (2000) Sugiyama et al. (2007)) (Rosenbaum and Rubin (1983))
2 z z f (Follmann and Wu (1995) (2009)) MCMC ( Hjort et al. (2010)) (Hoshino, in press) MCMC URL A A (4) 2 URL
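( ) (A)23680026

A.

θ 1 2 3

\[
\frac{\partial}{\partial f_i} \log g(f_i, y_i, w_i \mid \theta) = -\frac{f_i}{\phi} + \sum_{j=1}^{J_i} \left[ -\frac{1}{\sigma} + \frac{1}{\sigma} \exp\left( \frac{y^{LD}_{ij} - \beta^t \tilde{w}_{ij}}{\sigma} \right) + \alpha_f \left( y^B_{ij} - p_{ij} \right) \right] \tag{A.1}
\]

\[
\Sigma_i = \frac{1}{\phi} + \sum_{j=1}^{J_i} \left[ \frac{1}{\sigma^2} \exp\left( \frac{y^{LD}_{ij} - \beta^t \tilde{w}_{ij}}{\sigma} \right) + \alpha_f^2\, p_{ij} (1 - p_{ij}) \right] \tag{A.2}
\]

\[
\frac{\partial \Sigma_i}{\partial f_i} = \sum_{j=1}^{J_i} \left[ -\frac{1}{\sigma^3} \exp\left( \frac{y^{LD}_{ij} - \beta^t \tilde{w}_{ij}}{\sigma} \right) + \alpha_f^3\, p_{ij} (1 - p_{ij})(1 - 2 p_{ij}) \right] \tag{A.3}
\]

where $\tilde{w}_{ij} = (1, f_i, w_{ij}^t)^t$, so that $\beta^t \tilde{w}_{ij}$ stands for the linear predictor $\beta_0 + f_i + w_{ij}^t \beta_w$ of (2.1), and $p_{ij} = p(y^B_{ij} = 1 \mid f_i, x_i, w_{ij})$. Among the derivatives $\partial \Sigma_i / \partial \theta$, the one with respect to $\beta$ is

\[
\frac{\partial \Sigma_i}{\partial \beta} = \sum_{j=1}^{J_i} \left[ -\frac{1}{\sigma^3}\, \tilde{w}_{ij} \exp\left( \frac{y^{LD}_{ij} - \beta^t \tilde{w}_{ij}}{\sigma} \right) \right] \tag{A.4}
\]

For $\alpha$, $\partial \Sigma_i / \partial \alpha$ is obtained by differentiating $p_{ij} (1 - p_{ij})$ with respect to $\alpha$.

References

Angrist, J. and Pischke, J. S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion, Princeton University Press, London.
Bianconcini, S. and Cagnone, S. (2012). Estimation of generalized linear latent variable models via fully exponential Laplace approximation, J. Multivar. Anal., 112, 183-193.
Bijwaard, G. E., Franses, P. H. and Paap, R. (2006). Modeling purchases as repeated events, J. Bus. Econ. Stat., 24, 487-502.
Braun, M. and McAuliffe, J. (2010). Variational inference for large-scale models of discrete choice, J. Am. Stat. Assoc., 105, 324-335.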
Follmann, D. and Wu, M. (1995). An approximate generalized linear model with random effects for informative missing data, Biometrics, 51, 151-168.
Fong, Y., Rue, H. and Wakefield, J. (2010). Bayesian inference for generalized linear mixed models, Biostatistics, 11, 397-412.
Hjort, N. L., Holmes, C., Müller, P. and Walker, S. G. (2010). Bayesian Nonparametrics, Cambridge University Press, Cambridge.
(2005). M 32 2, 121-132.
(2009)..
Hoshino, T. (in press). Semiparametric Bayesian estimation for marginal parametric potential outcome modeling: Application to causal inference, J. Am. Stat. Assoc.
Hoshino, T., Kurata, H. and Shigemasu, K. (2006). A propensity score adjustment for multiple group structural equation modeling, Psychometrika, 71, 691-712.
(2013). AI 28 1.
Jordan, M. I., Ghahramani, Z., Jaakkola, T. S. and Saul, L. (1999). An introduction to variational methods for graphical models, Mach. Learn., 37, 183-233.
(2010). 25 4.
Klein, J. P. and Moeschberger, M. L. (2003). Survival Analysis: Techniques for Censored and Truncated Data, 2nd ed., Springer, New York.
(2012). 60 1, 173-188.
Pan, Q. and Schaubel, D. E. (2009). Evaluating bias correction in weighted proportional hazard regression, Lifetime Data Analysis, 15, 120-146.
Rizopoulos, D., Verbeke, G. and Lesaffre, E. (2009). Fully exponential Laplace approximations for the joint modeling of survival and longitudinal data, J. R. Stat. Soc., Ser. B, 71, 637-654.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects, Biometrika, 70, 41-55.
Rue, H., Martino, S. and Chopin, N. (2009). Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations (with discussion), J. R. Stat. Soc., Ser. B, 71, 319-392.
Seetharaman, P. B. and Chintagunta, P. K. (2003). The proportional hazard model for purchase timing: A comparison of alternative specifications, J. Bus. Econ. Stat., 21, 368-382.
Shimodaira, H. (2000). Improving predictive inference under covariate shift by weighting the log-likelihood function, J. Stat. Plan. Inf., 90, 227-244.
(2012)..
Sugiyama, M., Krauledat, M. and Müller, K.-R. (2007). Covariate shift adaptation by importance weighted cross validation, J. Mach. Learn. Res., 8, 985-1005.
Tierney, L. and Kadane, J. B. (1986). Accurate approximations for posterior moments and marginal densities, J. Am. Stat. Assoc., 81, 82-86.
Tierney, L., Kass, R. and Kadane, J. B. (1989). Fully exponential Laplace approximations to expectations and variances of nonpositive functions, J. Am. Stat. Assoc., 84, 710-716.
Vaida, F. and Xu, R. (2000). Proportional hazards models with random effects, Stat. Med., 19, 3309-3324.
Wang, B. and Titterington, D. M. (2004). Lack of consistency of mean field and variational Bayes approximations for state space models, Neural Process. Lett., 20, 151-170.
Wooldridge, J. M. (2007). Inverse probability weighted M-estimation for general missing data problems, J. Econom., 141, 1281-1301.