Vol. 45, No. 2, March 2016, pp. 217-230

The Role of Statisticians: Past, Present, and Future

Manabu Iwasaki

180-8633 3-3-1 (E-mail: iwasaki@st.seikei.ac.jp).

In this big data era, the roles of statisticians in academia and industry, and of academic societies such as the Japan Statistical Society, are gradually changing. The scope of statistics appears to be becoming much broader than before. In this article, drawing on the author's own experience, the activities of statisticians over the past several decades are described, in the hope that they will be useful for rethinking the role to be played by statisticians now and in the coming years.

Keywords: Autocorrelation, MAR, Mid-P value, Projection Pursuit, Propensity Score, Response Surface, Spectral Analysis

1.
2015

2.

2.1

1971 (1970) 1975 4 1912 45 Wilks (1962) (1956) (1986) 1953 1 Tumura (1965) James (1954) Constantine (1963) zonal Fisher (1949, 1964) ( (1956)) (1947) 1 (1951) (1962) 42 35 39 (1988) 2006
Zacks (1971) 2 PowerPoint OHP 5 YSG (Young Statisticians Group)
1970 80 (Siotani et al. (1985)) 30 ( (1963)) 50 Herman Chernoff Harvard Donald B. Rubin

2.2

1970 Anderson (1958) Rao (1973) (1971) (1972) Wishart Wishart 1970 zonal Eaton (1983) coordinate-free Muirhead (1982) (1971) (1974) 1950
1973 III ( (1989)) (horse-shoe effect) Rao (1973) 1 1 (1974) ( (2006)) (1983) (1972) (2006)

3.

jackknife bootstrap LASSO AIC (Akaike Information Criterion) FPE (Final Prediction Error)

3.1 Response Surface Methodology (RSM)

Response surface 1985 New Zealand New Zealand Taguchi method ( (1976, 1977)) Box et al. (1978) Box and Draper (1987) George E. P. Box Box (1994, 2006) (1994) (2008)

3.2 Autocorrelation

2 Autocorrelation Anderson (1971) 1985 New Zealand Iwasaki (1985, 1988) Iwasaki and Wang (1990) Iwasaki (1985)

3.3 Projection Pursuit

computer intensive 2 3 Friedman and Tukey (1974) Friedman (1987) Diaconis and Freedman (1984) Huber (1985) Jones and Sibson (1987) Huber (1981) P. J. Huber Huber (1985) Huber (1981) (1991, 1992b) (1989)

3.4 Spectral Analysis

2.2 Muirhead (1982) Eaton (1983) Persi Diaconis Diaconis (1988, 1989) Bloomfield (1974) Cox (1972) Iwasaki (1992) (1992a) 1991 k 2 k Iwasaki (1992)
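1 Iwasaki (1991) Iwasaki (1991) Huber (1981) 1

3.5 mid-p value

Let $T$ be the test statistic and $t$ its observed value. Under the null hypothesis $H_0$, the P value is $p = P(T \ge t)$, and the mid-p value is defined as

$$\text{mid-}p = P(T > t) + 0.5\,P(T = t).$$

When $T$ is continuous, $p$ is uniformly distributed on $(0, 1)$ under $H_0$, so that $E[p \mid H_0] = 0.5$. When $T$ is discrete, $E[p \mid H_0] > 0.5$, whereas the mid-p value brings the expectation back to 0.5 ($E[\text{mid-}p \mid H_0] = 0.5$).

Department of Statistics Department of Biostatistics P P mid-p $E[\text{mid-}p \mid H_0] = 0.5$ mid-p P ( (1993), Iwasaki and Tanida (1994)) mid-p Agresti and Coull (1998) Iwasaki and Hidaka (2001) ZIP (Zero-Inflated Poisson) ( (2007), (2009)) (2010)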
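
The expectation property of the mid-p value in Section 3.5 can be checked numerically. The following is a minimal sketch, not taken from the article: it assumes an upper-tailed test whose statistic $T$ follows a Binomial(10, 0.5) distribution under $H_0$, and the function names (`binom_pmf`, `upper_tail_p_values`, `expectation`) are hypothetical helpers introduced only for illustration.

```python
from math import comb


def binom_pmf(n, theta):
    """Probabilities P(T = k), k = 0..n, for T ~ Binomial(n, theta)."""
    return [comb(n, k) * theta**k * (1 - theta) ** (n - k) for k in range(n + 1)]


def upper_tail_p_values(pmf):
    """For each possible observed value t, return the ordinary P value
    P(T >= t) and the mid-p value P(T > t) + 0.5 * P(T = t)."""
    p_values, mid_p_values = [], []
    for t in range(len(pmf)):
        strictly_above = sum(pmf[t + 1:])            # P(T > t)
        p_values.append(strictly_above + pmf[t])     # P(T >= t)
        mid_p_values.append(strictly_above + 0.5 * pmf[t])
    return p_values, mid_p_values


def expectation(values, pmf):
    """Expected value of a function of T under the distribution pmf."""
    return sum(v * w for v, w in zip(values, pmf))


# Assumed null distribution (illustrative choice, not from the article).
pmf = binom_pmf(10, 0.5)
p_vals, mid_vals = upper_tail_p_values(pmf)

expected_p = expectation(p_vals, pmf)        # exceeds 0.5 for discrete T
expected_mid_p = expectation(mid_vals, pmf)  # equals 0.5 up to rounding

print(f"E[p | H0]     = {expected_p:.4f}")
print(f"E[mid-p | H0] = {expected_mid_p:.4f}")
```

The same check works for any discrete null distribution, because $E[\text{mid-}p \mid H_0] = \sum_{s<t} P(T=s)P(T=t) + 0.5\sum_t P(T=t)^2 = 0.5$, while the ordinary P value exceeds this by $0.5\sum_t P(T=t)^2$.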

3.6 MAR Propensity Score

MAR (Missing At Random) propensity score Donald B. Rubin (Rubin (1976), Rosenbaum and Rubin (1983)) Rubin ignorability EM algorithm (Dempster et al. (1977)) SUTVA (Stable Unit Treatment Value Assumption) Little and Rubin (1987) 10 Schafer (1997) 10 Schafer (1997) (2002) 2 Little and Rubin (2002) nonignorable (2002) SAS PROC MI PROC MIANALYZE D. B. Rubin Propensity score Rubin potential outcomes Rosenbaum (2002) Rosenbaum Propensity score ( (2011, 2014)) Rubin (2006) (2015) 2010 Imbens and Rubin (2015) Imbens and Rubin (2015) An Introduction introduction Muirhead (1982)
(2004)

4.

Google Hal Varian "I keep saying the sexy job in the next ten years will be statisticians." sexy job Hal Varian "The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it, that's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids." 2011 (2012)
(2012)

(A) No. 25240005

References

Agresti, A. and Coull, B. A. (1998). Approximate is better than exact for interval estimation of binomial proportions, Am. Stat., 52, 119-126.
Anderson, T. W. (1958). An Introduction to Multivariate Statistical Analysis, John Wiley & Sons.
Anderson, T. W. (1971). The Statistical Analysis of Time Series, John Wiley & Sons.
Bloomfield, P. (1974). Linear transformations for multivariate binary data, Biometrics, 30, 609-617.
Box, G. E. P. and Draper, N. R. (1987). Empirical Model-Building and Response Surfaces, John Wiley & Sons.
Box, G. E. P., Hunter, W. G. and Hunter, J. S. (1978). Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building, John Wiley & Sons.
Constantine, A. G. (1963). Some non-central problems in multivariate analysis, Ann. Math. Stat., 34, 1270-1285.
Cox, D. R. (1972). The analysis of multivariate binary data, Applied Statistics, 21, 113-120.
Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion), J. R. Stat. Soc. Ser. B, 39, 1-38.
Diaconis, P. (1988). Group Representations in Probability and Statistics, Institute of Mathematical Statistics.
Diaconis, P. (1989). A generalization of spectral analysis with application to ranked data, Ann. Stat., 17, 949-979.
Diaconis, P. and Freedman, D. (1984). Asymptotics of graphical projection pursuit, Ann. Stat., 12, 793-815.
Eaton, M. L. (1983). Multivariate Analysis: A Vector Space Approach, John Wiley & Sons.
Friedman, J. H. (1987). Exploratory projection pursuit, J. Am. Stat. Assoc., 82, 249-266.
Friedman, J. H. and Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data analysis, IEEE Trans. Computing, 23, 881-890.
(1974).
(2008). (LD) Response Surface Methodology (RSM) 37, 292-299.
Huber, P. J. (1981). Robust Statistics, John Wiley & Sons.
Huber, P. J. (1985). Projection pursuit (with discussion), Ann. Stat., 13, 436-525.
Imbens, G. W. and Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction, Cambridge University Press.
Iwasaki, M. (1985). Mean efficiency of least squares estimator of regression coefficients, J. Japan Statist. Soc., 15, 139-149.
Iwasaki, M. (1988). Efficiency of least squares in a linear model with autocorrelated disturbances, in Statistical Theory and Data Analysis II (ed. K. Matusita), North-Holland, 511-523.
(1989). III 16, 13-21.
(1991). 4, 41-56.
Iwasaki, M. (1991). Construction of M-estimators by robustifying orthogonal polynomials associated with the density function, J. Japan Statist. Soc., 21, 155-171.
Iwasaki, M. (1992). Spectral analysis of multivariate binary data, J. Japan Statist. Soc., 22, 45-65.
(1992a). 19, 24-33.
(1992b). 19, 37-49.
(1993). mid-p value 22, 67-80.
(1994).
(2002).
(2004).
(2006).
(2010).
(2011).
(2014).
(2015).
(2009). 36, 25-34.
(1989). 18, 103-128.
Iwasaki, M. and Hidaka, N. (2001). Notes on the central and shortest confidence intervals for a binomial parameter, Japanese Journal of Biometrics, 22, 1-13.
Iwasaki, M. and Tanida, T. (1994). Sample size determination based on mid-p value for use with the testing in 2 × 2 comparative trials, Journal of the Japanese Society of Computational Statistics, 7, 57-64.
Iwasaki, M. and Wang, S. (1990). On coordinate-free measures of efficiency of least squares in a linear model, J. Eng. Math., 7(1), 1-8.
(2007). 34, 91-100.
(2006).
James, A. T. (1954). Normal multivariate analysis and the orthogonal group, Ann. Math. Stat., 25, 40-75.
Jones, M. C. and Sibson, R. (1987). What is projection pursuit? (with discussion), J. R. Stat. Soc. Ser. A, 150, 1-36.
Little, R. J. A. and Rubin, D. B. (1987). Statistical Analysis with Missing Data, John Wiley & Sons.
Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data, Second Edition, John Wiley & Sons.
(1949).
(1956).
(1964). I II
Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory, John Wiley & Sons.
(1971).
Rao, C. R. (1973). Linear Statistical Inference and Its Applications, Second Edition, John Wiley & Sons.
Rosenbaum, P. R. (2002). Observational Studies, Second Edition, Springer.
Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects, Biometrika, 70, 41-55.
Rubin, D. B. (1976). Inference and missing data (with comments by R. J. A. Little), Biometrika, 63, 581-592.
Rubin, D. B. (2006). Matched Sampling for Causal Effects, Cambridge University Press.
Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data, Chapman & Hall.
Siotani, M., Hayakawa, T. and Fujikoshi, Y. (1985). Modern Multivariate Statistical Analysis: A Graduate Course and Handbook, American Science Press.
(1994). 24, 193-201.
(1976, 1977).
(1963).
(1974).
(1972).
(2012). 41, 251-264.
(1956).
Tumura, Y. (1965). The distribution of latent roots and vectors, TRU Mathematics, 1, 1-16.
(1986).
Wilks, S. S. (1962). Mathematical Statistics, John Wiley & Sons.
(1983).
Zacks, S. (1971). The Theory of Statistical Inference, John Wiley & Sons.