


7 I 1. IC GPS 1 1: 1) p ),3),4),5),6),7) 8),9),10),11) R 1

8 Web 2

2. How to read this report

Basic R usage is shown in Listing 1 (see also ref. 2)):

    getwd()   # print the current working directory
    q()       # quit R
    # a line beginning with '#' is a comment
    if (0) {
      # statements wrapped in 'if (0) { }' are skipped
    }

Listing 1: basic R commands.

In this report, lines beginning with '$$' reproduce R console output. For example:

    data <- iris       # copy R's built-in iris data set into 'data'
    head(data, 3)      # display the first 3 rows
    $$   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    $$ 1          5.1         3.5          1.4         0.2  setosa
    $$ 2          4.9         3.0          1.4         0.2  setosa
    $$ 3          4.7         3.2          1.3         0.2  setosa

10 II 4 (a) (b) (c) (d) R R TB R RODBC *2 R R *3 Web RjpWiki R-Tips seekr 2 1GB 3 12) 13) 4

3. Reading data into R

(1) Fixed-width files: read.fwf(). On Windows, Mac, and Linux alike, Shift-JIS data is read by setting fileEncoding = "cp932" *4; the field widths are given with the widths argument (see help(read.fwf)). Listing 2 reads input.txt with fields of 1, 3, and 5 characters *5:

    # Listing 2: reading a fixed-width file with read.fwf()
    read.fwf(file = "input.txt", fileEncoding = "cp932",
             widths = c(1, 3, 5))

read.fwf() is slow on large inputs (e.g. the 108.4 MB file used in refs. 2),5) *7); the readr package's read_fwf() is considerably faster. read_fwf() also behaves differently in several respects; notably, strings are returned as character rather than converted to factor *8 *9. readr is installed from CRAN, and Listing 3 repeats the example with read_fwf():

    # Listing 3: reading a fixed-width file with readr's read_fwf()
    install.packages("readr")  # install the readr package
    library(readr)
    read_fwf(file = "input.txt", col_positions = fwf_widths(c(1, 3, 5)))

(2) CSV files (comma-separated values). Shift-JIS (CP932) CSV can be read with read.table() or read.csv(). On Mac and Linux it is often easier to convert the file to UTF-8 first, e.g. with nkf, VIM (Linux/Mac/Windows), Notepad++ (Windows), or CotEditor (Mac).

R reads CSV with read.csv() (base) or read_csv() (readr). The following loads the jyoukou CSV shown in fig. 1 *10:

    # read the CSV file (UTF-8 encoded)
    sagamihara <- read.csv(file = "jyoukou_ csv", header = TRUE,
                           fileEncoding = "UTF-8")
    # assign the column names (the survey years)
    colnames(sagamihara) <- c("", "", "1975 ", "1980 ", "1985 ", "1989 ",
                              "1993 ", "1998 ", "2003 ", "2008 ", "2009 ",
                              "2010 ", "2011 ", "2012 ", "2013 ")
    # display the data frame
    sagamihara
    $$ (25 rows of output; rows 5, 14-17, and 23-25 contain NA values)

The resulting data frame sagamihara is used throughout the rest of this report; basic operations on it are collected in section V-10, and the treatment of its NA entries is covered in section 5.

Fig. 1: the sagamihara CSV data.

4. Reading Excel files

Excel workbooks can be read into R with the xlsx package's read.xlsx() or the XLConnect package's readWorksheetFromFile(). The example below reads zkntrf05.xls, the H22 survey workbook shown in fig. 2.

Fig. 2: the zkntrf05.xls worksheet (H22 data).

readWorksheetFromFile() is used as follows:

    # install and load XLConnect (uncomment the install line on first use)
    #install.packages("XLConnect")
    library(XLConnect)
    # read sheet 1, taking the header from row 7 and columns 1 to 33
    AkitaPT <- readWorksheetFromFile(file = "zkntrf05.xls",
                                     sheet = 1,
                                     header = TRUE,
                                     startCol = 1,
                                     startRow = 7,
                                     endCol = 33)
    # check the automatically assigned column names
    colnames(AkitaPT)
    $$  [1] "Col1"  "Col2"  "Col3"  "Col4"  "Col5"  "Col6"  "Col7"
    $$  [8] "Col8"  "Col9"  "Col10" "Col11" "Col12" "Col13" "Col14"
    $$ [15] "Col15" "Col16" "Col17" "Col18" "Col19" "Col20" "Col21"
    $$ [22] "Col22" "Col23" "Col24" "Col25" "Col26" "Col27" "Col28"
    $$ [29] "Col29" "Col30" "Col31" "X.."   "X...1"
    # assign the column names (attribute columns plus the hourly columns)
    colnames(AkitaPT) <- c("", "", "", "1224 ", "", "", "", "7 ", "8 ",
                           "9 ", "10 ", "11 ", "12 ", "13 ", "14 ", "15 ",
                           "16 ", "17 ", "18 ", "19 ", "20 ", "21 ", "22 ",
                           "23 ", "0 ", "1 ", "2 ", "3 ", "4 ", "5 ", "6 ",
                           " 12 ", "24 ")
    # examine the structure with str()
    str(AkitaPT)
    $$ 'data.frame': 2272 obs. of 33 variables:
    $$ (all 33 columns were read as type num)

Every column was read as numeric (num), but the first seven columns are categorical codes, so they are converted with as.factor() (for ordered categories, as.ordered()):

    # convert columns 1 to 7 to factors
    for (i in 1:7) {
      AkitaPT[, i] <- as.factor(AkitaPT[, i])
    }
    # re-examine the structure
    str(AkitaPT)
    $$ 'data.frame': 2272 obs. of 33 variables:
    $$  $      : Factor w/ 568 levels "10","20","30",..:
    $$  $      : Factor w/ 4 levels "1","3","4","6":
    $$  $      : Factor w/ 200 levels "2","3","4","7",..:
    $$  $ 1224 : Factor w/ 2 levels "1","2":
    $$  $      : Factor w/ 4 levels "1","2","3","6":
    $$  $      : Factor w/ 2 levels "1","2":
    $$  $      : Factor w/ 2 levels "1","2":
    $$ (the remaining 26 columns stay num)

Like sagamihara, the AkitaPT data frame is used throughout the rest of this report:

    # number of rows in AkitaPT
    nrow(AkitaPT)
    $$ [1] 2272
    # first 10 rows
    head(AkitaPT, 10)
    $$ (output omitted)

III. Data preprocessing

Data preprocessing (also called data cleaning or data cleansing) is carried out in R in the sections that follow.

5. Missing values

(1) Locating missing values. complete.cases() returns one logical per row: TRUE if the row has no missing value, FALSE otherwise. Applied to sagamihara:

    # which rows are complete?
    complete.cases(sagamihara)
    $$  [1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    $$ [12]  TRUE  TRUE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
    $$ [23] FALSE FALSE FALSE

Rows 5, 14-17, and 23-25 of sagamihara are incomplete. is.na() instead tests every cell:

    # which cells are NA?
    is.na(sagamihara)
    $$ (25 x 16 logical matrix; TRUE appears in the early-year columns of
    $$  rows 5, 14-17, and 23-25, FALSE everywhere else)

The mice package's md.pattern() summarizes the missing-data pattern:

    # install.packages('mice')
    library(mice)
    md.pattern(sagamihara)
    $$ (pattern table omitted)

In total, sagamihara contains 37 missing values.

(2) Treatments of missing data are surveyed in refs. 1) (p. 28) and 15) (p. 45). Among the approaches discussed there are deletion methods, (d) stochastic regression imputation (SRI), (e) full information maximum likelihood (FIML), and multiple imputation (MI). This report demonstrates SRI, FIML, and listwise deletion.

a) Stochastic regression imputation. The mice package's mice() performs SRI when method = "norm.nob". Here the 1998 column of sagamihara is imputed using the 2003 column:

    library(mice)
    # single stochastic regression imputation (m = 1) on columns 9 and 8
    imp <- mice(sagamihara[, c(9, 8)], method = "norm.nob", m = 1,
                maxit = 100, printFlag = FALSE)
    # copy the '1998 ' column into a new column '1998.sri'
    sagamihara$"1998.sri" <- sagamihara$"1998 "
    # overwrite the NA entries of '1998.sri' with the imputed values
    sagamihara$"1998.sri"[is.na(sagamihara$"1998 ")] <- unlist(imp$imp$"1998 ")
    # compare the original, imputed, and reference columns
    subset(sagamihara, select = c("1998 ", "1998.sri", "2003 "))
    $$ (output: the rows that were NA in '1998 ' now carry imputed
    $$  values in '1998.sri')

b) Full information maximum likelihood (FIML). The lavaan package's model-fitting functions accept missing data directly: cfa() for confirmatory factor analysis (CFA), growth() for growth curve models *12, lavaan() for general latent variable models, lavCor() for polychoric and polyserial correlations *13 (as opposed to the Pearson product-moment correlation), and sem() for structural equation modeling (SEM). Each takes a missing argument: "fiml" for FIML estimation, or "listwise" / "pairwise" deletion 16).

(3) Listwise deletion. na.omit() removes every row containing a missing value:

    # drop all rows with NA
    na.omit(sagamihara)
    $$ (output: the 17 complete rows)

    # keep the result as sagamihara2 for later sections
    sagamihara2 <- na.omit(sagamihara)

6. Discretization

The hourly columns of AkitaPT are continuous counts; this section discretizes the 7-o'clock column, AkitaPT$"7 ":

    # five-number summary of the 7-o'clock counts
    summary(AkitaPT$"7 ")
    $$ Min. 1st Qu. Median Mean 3rd Qu. Max.
    $$ (values omitted)
    # scatter plot of the raw values (fig. 2)
    plot(AkitaPT$"7 ")

Fig. 2: the raw 7-o'clock counts.

Three discretization methods are compared: (a) equal width discretization (EWD), (b) equal frequency discretization (EFD), and (c) k-means clustering with k bins.

(1) EWD. The infotheo package's discretize() performs EWD when disc = "equalwidth". The number of bins defaults to the cube-root rule, n = N^(1/3); an alternative is Sturges' rule,

    n = 1 + log2(N)    ... (1)

    #install.packages("infotheo")
    library(infotheo)
    # number of observations
    length(AkitaPT$"7 ")
    $$ [1] 2272
    # equal-width discretization with the default (cube-root) bin count;
    # the commented lines show the explicit alternatives
    ewd.AkitaPT7 <- discretize(AkitaPT$"7 ", disc = "equalwidth"
    #  , nbins = trunc(length(AkitaPT$"7 ")^(1/3))
    #  , nbins = trunc(1 + log2(length(AkitaPT$"7 ")))
    )
    # (range of the data)/(number of observations)
    (max(AkitaPT$"7 ") - min(AkitaPT$"7 ")) / length(AkitaPT$"7 ")
    $$ [1] (value omitted)
    # plot the discretized 7-o'clock values (fig. 3)
    plot(c(t(ewd.AkitaPT7)))
    # frequency of each bin
    table(ewd.AkitaPT7)
    $$ ewd.AkitaPT7
    $$ (counts omitted)

(2) EFD. The same function with disc = "equalfreq" performs EFD; the bin-count choices are the same as for EWD:

    #install.packages("infotheo")
    library(infotheo)
    efd.AkitaPT7 <- discretize(AkitaPT$"7 ", disc = "equalfreq"
    #  , nbins = trunc(length(AkitaPT$"7 ")^(1/3))
    #  , nbins = trunc(1 + log2(length(AkitaPT$"7 ")))
    )

Fig. 4 shows the EFD result for the 7-o'clock column of AkitaPT.
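As a quick side check of the two bin-count rules above (a hand calculation, not part of the report's script; only the observation count N = 2272 is taken from the data), the cube-root rule gives 13 bins and Sturges' rule gives 12:

```r
# Number of observations in the example data
N <- 2272
# Cube-root rule: n = N^(1/3), truncated
n_cube <- trunc(N^(1/3))
# Sturges' rule, eq. (1): n = 1 + log2(N), truncated
n_sturges <- trunc(1 + log2(N))
n_cube     # 13
n_sturges  # 12
```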

Fig. 3: the 7-o'clock counts after EWD.

Fig. 4: the 7-o'clock counts after EFD.

As with EWD (figs. 2-3), the EFD result is plotted and tabulated:

    # plot the EFD-discretized values (fig. 4)
    plot(c(t(efd.AkitaPT7)))
    # bin frequencies: by construction nearly equal, unlike EWD
    table(efd.AkitaPT7)
    $$ efd.AkitaPT7
    $$ (counts omitted)

The EWD and EFD results can be cross-tabulated:

    # combine the EWD and EFD bins into one data frame
    ewdefd.AkitaPT7 <- data.frame(EWD = ewd.AkitaPT7, EFD = efd.AkitaPT7)
    # name the columns
    colnames(ewdefd.AkitaPT7) <- c("EWD", "EFD")
    # cross-tabulate EWD bins against EFD bins
    table(ewdefd.AkitaPT7)
    $$ (EWD x EFD contingency table omitted)

(3) k-means. k-means clustering *14 partitions the values into k clusters, which can then serve as bins. It requires choosing (a) the number of clusters k, (b) the initial values, and (c) the distance measure. Fig. 5 shows an illustrative two-dimensional data set (variables x and y, n = 100) and its partition into two clusters.

*14 k-means is also written (hard) c-means.

Fig. 5: example data for k-means clustering (x-y scatter and the resulting two clusters).

Write G_i (i = 1, ..., k) for the k clusters of the n data points. Membership is encoded by an indicator u_il, with u_il = 1 if x_l belongs to G_i and u_jl = 0 for every other cluster G_j. Let v_i denote the centroid of G_i, V = (v_1, ..., v_k), and let N_i = sum_{l=1}^{n} u_il be the size of G_i *15. Taking the squared Euclidean distance as d(x, y), the distance from point x_l to centroid v_i is

    D_il = || v_i - x_l ||^2    ... (2)

and k-means seeks the memberships U and centroids V that minimize

    J(U, V) = sum_{i=1}^{k} sum_{l=1}^{n} u_il D_il    ... (3)

subject to each point belonging to exactly one cluster,

    sum_{i=1}^{k} u_il = 1.    ... (4)

The standard alternating algorithm is:

1. choose initial centroids V;
2. with V fixed, update U to minimize J(U, V) (assign each point to its nearest centroid);
3. with U fixed, update V to minimize J(U, V) (recompute each centroid as its cluster mean); repeat steps 2-3 until U and V stop changing.

In R, k-means is provided by kmeans(). Here the 7-o'clock counts are split into 12 clusters, for comparability with the EWD and EFD results:

    # k-means discretization of the 7-o'clock counts into 12 bins
    km.AkitaPT7 <- kmeans(AkitaPT$"7 ", centers = 12)
    # kmeans() stores the assignments in $cluster
    km.AkitaPT7$cluster
    # plot the assignments (fig. 6)
    plot(km.AkitaPT7$cluster)
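The alternating updates in steps 1-3 can be sketched in a few lines of R for one-dimensional data (an illustrative toy implementation on made-up values; in practice kmeans() is used):

```r
# Toy one-dimensional Lloyd iteration: alternate the U update (nearest
# centroid) and the V update (cluster means), as in steps 2-3 above.
lloyd_1d <- function(x, centers, iters = 100) {
  v <- centers
  for (s in 1:iters) {
    # step 2: fix V, assign each x_l to its nearest centroid (update U)
    u <- sapply(x, function(xl) which.min((v - xl)^2))
    # step 3: fix U, recompute each centroid as its cluster mean (update V)
    v <- sapply(seq_along(v), function(i) mean(x[u == i]))
  }
  list(cluster = u, centers = v)
}

res <- lloyd_1d(c(1, 2, 3, 10, 11, 12), centers = c(0, 5))
res$cluster  # 1 1 1 2 2 2
res$centers  # 2 11
```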

Fig. 6: the 7-o'clock counts after k-means discretization.

    # cluster sizes
    table(km.AkitaPT7$cluster)
    $$ (counts omitted)

The EWD, EFD, and k-means results are compared in fig. 7.

(a) raw data, (b) EWD, (c) EFD, (d) k-means

Fig. 7: the 7-o'clock counts under the three discretization methods.

(4) Dummy variables. Categorical columns can be expanded into 0/1 dummy variables *16 (fig. 8).

Fig. 8: dummy-variable coding.

The makedummies package provides makedummies() for this *17. It is distributed on GitHub and installed with devtools' install_github():

    # install.packages('devtools')
    library(devtools)
    install_github("toshi-ara/makedummies")

makedummies() operates on factor columns, so the EWD result is first converted with as.factor() (or as.ordered(); cf. section 4, p. 9):

    # current structure: a single integer column X
    str(ewd.AkitaPT7)
    $$ 'data.frame': 2272 obs. of 1 variable:
    $$  $ X: int ...
    # add a factor copy of X
    ewd.AkitaPT7$factor <- as.factor(ewd.AkitaPT7$X)
    # first 30 factor values
    head(ewd.AkitaPT7$factor, 30)
    # summary: most observations fall in the lowest bin
    summary(ewd.AkitaPT7)
    $$ (factor level counts: 1664, 241, 148, 85, 60, 30, Other: 44)

makedummies() is then applied:

    library(makedummies)
    # expand the factor column into dummies, keeping the base level
    fac.ewd.AkitaPT7 <- makedummies(dat = ewd.AkitaPT7, basal_level = TRUE)
    # inspect rows 110-119
    fac.ewd.AkitaPT7[110:119, ]
    $$ (output: X plus dummy columns factor_1 ... factor_12)
    # with basal_level = FALSE the base level's dummy column is dropped
    makedummies(dat = ewd.AkitaPT7, basal_level = FALSE)[110:119, ]
    $$ (output: X plus dummy columns factor_2 ... factor_12)
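Base R can produce a similar expansion with model.matrix(); this is an alternative to the makedummies package, shown here on a made-up three-level factor:

```r
# Toy factor with three levels
f <- factor(c("a", "b", "a", "c"))
# model.matrix() expands a factor into dummy columns;
# '~ f - 1' keeps one column per level (like basal_level = TRUE)
m <- model.matrix(~ f - 1)
m
```

Each row of m contains a single 1, in the column of that observation's level; dropping the `- 1` reproduces the base-level-omitted coding.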

7. Dimensionality reduction

(1) Principal component analysis (PCA) is performed in R with prcomp():

    # drop the rows with missing values, keeping the result as AkitaPT2
    AkitaPT2 <- na.omit(AkitaPT)
    # PCA on the 24 hourly columns (8-31), standardized (scale = TRUE)
    pca.AkitaPT <- prcomp(AkitaPT2[, 8:31], scale = TRUE)
    # examine the result object
    str(pca.AkitaPT)
    $$ List of 5
    $$  $ sdev    : num [1:24]
    $$  $ rotation: num [1:24, 1:24]
    $$   ..- attr(*, "dimnames")=List of 2
    $$   .. ..$ : chr [1:24] "7 " "8 " "9 " "10 " ...
    $$   .. ..$ : chr [1:24] "PC1" "PC2" "PC3" "PC4" ...
    $$  $ center  : Named num [1:24]
    $$   ..- attr(*, "names")= chr [1:24] "7 " "8 " "9 " "10 " ...
    $$  $ scale   : Named num [1:24]
    $$   ..- attr(*, "names")= chr [1:24] "7 " "8 " "9 " "10 " ...
    $$  $ x       : num [1:420, 1:24]
    $$   ..- attr(*, "dimnames")=List of 2
    $$   .. ..$ : chr [1:420] "1" "2" "3" "4" ...
    $$   .. ..$ : chr [1:24] "PC1" "PC2" "PC3" "PC4" ...
    $$  - attr(*, "class")= chr "prcomp"

The five elements of the prcomp object are sdev (the standard deviations of the components), rotation (the loadings), center and scale (the centering/scaling applied, or FALSE when not used; here TRUE), and x (the component scores). summary() reports the importance of the components:

    summary(pca.AkitaPT)
    $$ Importance of components:

    $$ (Standard deviation, Proportion of Variance, and Cumulative
    $$  Proportion for PC1-PC24; numeric values omitted)

The summary has three rows: Standard deviation, Proportion of Variance, and Cumulative Proportion. Here the first component accounts for about 95% of the variance and the second for about 1.8%. Because the variables were standardized, the component variances (the squared sdev values) sum to the number of variables, 24:

    # variances of the components
    (pca.AkitaPT$sdev)^2
    $$ (24 values omitted)
    # total variance
    sum((pca.AkitaPT$sdev)^2)
    $$ [1] 24
    # cumulative variances
    cumsum((pca.AkitaPT$sdev)^2)
    $$ (24 values omitted)

Fig. 9: scree plot of the component variances.

    # scree plot (fig. 9)
    screeplot(pca.AkitaPT)
    # first 3 rows of the component scores
    head(pca.AkitaPT$x, 3)
    $$ (3 x 24 score matrix, PC1-PC24; values omitted)
    # component loadings: the rotation columns scaled by the sdev values
    t(t(pca.AkitaPT$rotation) * pca.AkitaPT$sdev)
    $$ (24 x 24 loading matrix; values omitted)
    # biplot of the first two components (fig. 10)
    biplot(pca.AkitaPT, choices = c(1, 2)  # components 1 and 2
           , cex = 0.5                     # shrink the labels to 50%
    )

Fig. 10: biplot of the first two principal components.

(2) Clustering. As in section 6.(3), kmeans() groups the observations; here whole daily profiles (the 24 hourly columns) are clustered:

    # complete cases of the hourly columns
    km.AkitaPT <- na.omit(AkitaPT[, 8:31])
    # k-means with 12 clusters
    cluster <- kmeans(km.AkitaPT, centers = 12)
    # cluster sizes
    table(cluster$cluster)
    $$ (sizes of the 12 clusters omitted)

Each cluster's hourly profiles are then drawn (figs. 11-12):

    # attach the cluster labels
    km.AkitaPT$cluster <- cluster$cluster
    for (i in 1:12) {   # one panel per cluster
      plot(x = 0, y = 0, xlim = c(1, 24)   # x axis: the 24 hours
           , ylim = c(0, max(km.AkitaPT[, 1:24], na.rm = T))  # y from 0
           , type = "n"                    # empty frame
           , main = paste("cluster ", i, ", N=",
                          nrow(subset(km.AkitaPT, cluster == i)), sep = "")
           , xlab = ""                     # x label
           , ylab = ""                     # y label (trips/hour)
           , xaxt = "n")                   # suppress the default x axis
      # draw the x axis, labelled with the hourly column names
      axis(side = 1, at = 1:24, labels = colnames(km.AkitaPT[, 1:24]))
      # one line per observation in cluster i
      for (j in 1:nrow(subset(km.AkitaPT, cluster == i))) {
        lines(x = 1:24, subset(km.AkitaPT, cluster == i)[j, 1:24], col = j)
      }
    }

(a) cluster 1, (b) cluster 2, (c) cluster 3, (d) cluster 4, (e) cluster 5, (f) cluster 6

Fig. 11: hourly profiles, clusters 1-6.

(a) cluster 7, (b) cluster 8, (c) cluster 9, (d) cluster 10, (e) cluster 11, (f) cluster 12

Fig. 12: hourly profiles, clusters 7-12.

8. Normalization

Two common normalizations *18 are: (1) standardization to mean 0 and variance 1 *19, and (2) scaling to the interval [0, 1]. In R both are done with scale(), applied here to sagamihara2 *20.

(1) Mean 0, variance 1:

    # standardize one column to mean 0, sd 1
    scale(sagamihara2$"1975 ")
    $$ (17 standardized values; the attributes "scaled:center" and
    $$  "scaled:scale" record the column mean and standard deviation)

    # the same call with the defaults made explicit
    scale(sagamihara2$"1975 ", center = TRUE, scale = TRUE)
    $$ (same output)

    # standardize every numeric column (dropping the first two columns)
    scaled1.sagamihara2 <- scale(sagamihara2[, -c(1:2)])
    scaled1.sagamihara2
    $$ (standardized matrix, 17 rows; output omitted)

*18 Also called standardization.
*20 The sagamihara data after listwise deletion (section 5.(3)).

The matrixStats package's colVars(), together with colMeans(), confirms the result: every column variance is 1 and every column mean is numerically 0:

    # install.packages('matrixStats')
    library(matrixStats)
    # column variances: all 1
    colVars(scaled1.sagamihara2)
    $$ (all values are 1)
    # column means: zero up to rounding error
    colMeans(scaled1.sagamihara2)
    $$ (values on the order of 1e-17)

(2) Scaling to [0, 1]. scale() can also be given explicit center and scale values, here the column minimum and maximum:

    # scale one column to the unit interval
    scale(sagamihara2$"1975 ", center = min(sagamihara2$"1975 "),
          scale = max(sagamihara2$"1975 "))
    $$ (17 values; "scaled:center" is 700, the column minimum)

For all columns at once, matrixStats' colMins() and colMaxs() supply the per-column values (they require a matrix, hence as.matrix()):

    # scale every numeric column to [0, 1]
    scaled2.sagamihara2 <- scale(sagamihara2[, -c(1:2)],
        center = colMins(as.matrix(sagamihara2[, -c(1:2)])),
        scale  = colMaxs(as.matrix(sagamihara2[, -c(1:2)])))
    scaled2.sagamihara2
    $$ (scaled matrix omitted)
    # check the resulting column minima and maxima
    colMins(scaled2.sagamihara2)
    $$ (all 0)
    colMaxs(scaled2.sagamihara2)
    $$ (values omitted)
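The two transformations can also be written out by hand to make explicit what scale() computes. The vector x below is made up, not a column of the report's data; note that this textbook 0-1 form divides by the range max - min, whereas the scale() call above uses the column maximum as the divisor:

```r
# Illustrative values (not from the report's data)
x <- c(700, 2930, 10000)
# (1) standardization: z = (x - mean(x)) / sd(x)
z <- (x - mean(x)) / sd(x)
# (2) 0-1 scaling by the range: (x - min(x)) / (max(x) - min(x))
m01 <- (x - min(x)) / (max(x) - min(x))
mean(z)     # numerically 0
sd(z)       # 1
range(m01)  # 0 1
```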

9. Sampling

(1) Random sampling of rows is done in R with sample():

    # number of rows available
    nrow(AkitaPT)
    $$ [1] 2272
    # draw 5 row indices from AkitaPT without replacement
    sampling.target <- sample(nrow(AkitaPT), 5, replace = FALSE)
    # extract the sampled rows
    AkitaPT[sampling.target, ]
    $$ (5 sampled rows omitted)

    # draw 5 row indices WITH replacement: the same row can appear twice
    sampling.target.rep <- sample(nrow(AkitaPT), 5, replace = TRUE)
    AkitaPT[sampling.target.rep, ]
    $$ (5 sampled rows omitted; duplicates possible)
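One practical note: sample() draws different rows on every run. Fixing the random seed with set.seed() makes a draw reproducible (a general R idiom, not something the report's script does):

```r
# Same seed, same sample: useful when an analysis must be reproducible
set.seed(1234)
a <- sample(2272, 5, replace = FALSE)
set.seed(1234)
b <- sample(2272, 5, replace = FALSE)
identical(a, b)   # TRUE
```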

(2) Extracting subsets

Rows of AkitaPT satisfying a condition can be extracted in two ways: with a logical index vector (one TRUE/FALSE per row), or with subset(). Table 3 lists R's comparison and logical operators and predicate functions:

Table 3: comparison and logical operators in R

    ==       equal                      is.null          test for NULL
    !=       not equal                  is.na            test for NA
    <, <=    less than (or equal)       is.nan           test for NaN
    >, >=    greater than (or equal)    is.finite        test for finite
    !        logical negation           is.infinite      test for infinite
    &, &&    logical AND                complete.cases   rows without NA
    |, ||    logical OR                 xor()            exclusive OR

Table 4 gives the code definitions used in the H22 data.

Table 4: code definitions in the H22 data (details omitted).

a) Logical index vectors

    # drop the rows with missing values, keeping the result as AkitaPT2
    AkitaPT2 <- na.omit(AkitaPT)
    # convert the weather-code column to a factor
    AkitaPT2$'' <- as.factor(AkitaPT2$'')
    # frequency of each code
    table(AkitaPT2$'')
    $$ (counts per code omitted)
    # TRUE where the code equals 3, FALSE elsewhere
    list.rain <- AkitaPT2$'' == 3
    # inspect the logical vector
    list.rain
    $$ (420 logical values; TRUE only at positions 229-232)
    # keep only the rows where the code equals 3
    AkitaPT2rain <- AkitaPT2[list.rain, ]
    # equivalently, inline:
    # AkitaPT2rain <- AkitaPT2[AkitaPT2$'' == 3, ]
    # inspect the result
    AkitaPT2rain
    $$ (the 4 matching rows omitted; see also section V-10)

b) subset()

subset() expresses the same selection without an explicit index vector:

    # rows where the weather code equals 3
    subset(AkitaPT2, '' == 3)
    $$ (the 4 matching rows omitted)
    # rows where the code is NOT 3, kept as AkitaPT2norain
    AkitaPT2norain <- subset(AkitaPT2, '' != 3)
    # number of remaining rows
    nrow(AkitaPT2norain)
    $$ [1] 416
    # summary of the first 7 (factor) columns
    summary(AkitaPT2norain)[, 1:7]
    $$ (summary counts omitted)

66 IV (a) (b) (c) (d) R Web 60

V 10. Basic data-frame operations

The indexing operations below, shown on sagamihara, are used throughout the report (outputs omitted):

    # row 2
    sagamihara[2, ]
    # rows 1 and 3
    sagamihara[c(1, 3), ]
    # rows 4 through 6
    sagamihara[4:6, ]
    # the last row
    sagamihara[nrow(sagamihara), ]
    # column 8
    sagamihara[, 8]
    # columns 3 through 5
    sagamihara[, 3:5]
    # the '1998 ' column (column 8) by name
    sagamihara$"1998 "
    # the single element at row 3, column 5
    sagamihara[3, 5]
    # rows 1 and 3, columns 5 through 8
    sagamihara[c(1, 3), 5:8]
    # the last column
    sagamihara[, ncol(sagamihara)]
    # first 6 rows
    head(sagamihara)
    # first 3 rows
    head(sagamihara, 3)
    # last 6 rows
    tail(sagamihara)
    # last 3 rows
    tail(sagamihara, 3)

(1) Basic checks with nrow(), ncol(), and summary() *21:

    # number of rows
    nrow(sagamihara)
    $$ [1] 25
    # number of columns
    ncol(sagamihara)
    $$ [1] 16
    # per-column summary
    summary(sagamihara)
    $$ (for each numeric column: Min., 1st Qu., Median, Mean, 3rd Qu.,
    $$  Max., and the NA count; output omitted)

summary() treats character columns differently *22; for numeric columns it reports Min., 1st Qu., Median, Mean, 3rd Qu., Max., and NA's *23 17).

(2) Frequency tables. table() counts the occurrences of each value:

    # one-way frequency table of a code column
    table(AkitaPT$'')
    $$ (counts omitted)
    # two-way (cross) table of two code columns
    table(AkitaPT$'', AkitaPT$'')
    $$ (cross table omitted)

The hourly profiles can also be drawn with base R graphics:

    # empty frame: x = hours 1-24, y from 0 to the overall maximum
    plot(x = 0, y = 0, xlim = c(1, 24)
         , ylim = c(0, max(AkitaPT[, 8:31], na.rm = T))
         , type = "n"
         , xlab = ""            # x label
         , ylab = ""            # y label (trips/hour)
         , xaxt = "n")          # suppress the default x axis
    # x axis labelled with the hourly column names
    axis(side = 1, at = 1:24, labels = colnames(AkitaPT[, 8:31]))
    # of the 2272 rows, keep those whose '24 ' total is at least 10000
    AkitaPTover10000 <- na.omit(AkitaPT)   # drop NA rows first
    AkitaPTover10000 <- AkitaPTover10000[AkitaPTover10000$"24 " >= 10000, ]
    # one line per remaining row (fig. 13)
    for (i in 1:nrow(AkitaPTover10000)) {
      lines(x = 1:24, AkitaPTover10000[i, 8:31], col = i)
    }

For further graphics see ggplot2 18), Murrell's book 19), and the R Graphical Manual on the Web.

Fig. 13: hourly trip profiles (trips/hour) of the rows with a 24-hour total of at least 10000.

12. Execution environment and supplied files

The scripts in this report were checked on Mac and Windows:

    R version (details omitted)
    Platform: x86_64-apple-darwin (64-bit)
    Running under: OS X (Yosemite)

    R version (details omitted)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1

(1) RscriptAndData.zip. The archive (fig. 14) unpacks to a JACIC_Report_Rscript folder containing the script JACIC_Report_Rscript.R (UTF-8 encoded), the data files jyoukou csv and fixed jyoukou csv (the latter a UTF-8 copy repaired with LibreOffice Calc), and zkntrf05.xls, the H22 workbook of section 4 *24.

Fig. 14: contents of RscriptAndData.zip.

(2) Running JACIC_Report_Rscript.R. On Windows, R assumes the CP932 encoding while the script and jyoukou csv are UTF-8, so running the script in plain Windows R (fig. 15) needs care *25; RStudio, an IDE (Integrated Development Environment) for R, handles UTF-8 scripts on Windows as well (fig. 16).

Fig. 15: running the script in R on Windows.

Fig. 16: running the script in RStudio on Windows.

(3) Setting the working directory. For example, for user hasegawa on a Windows 7 desktop:

    setwd("C:\\Users\\hasegawa\\Desktop\\JACIC_Report_Rscript")

(4) Reading jyoukou csv on Mac may fail with

    incomplete final line found by readTableHeader on 'jyoukou csv'

because the file was written on Windows. Use fixed jyoukou csv, the copy re-saved on Windows with LibreOffice Calc:

    sagamihara <- read.csv(file = "fixed_jyoukou_ csv", header = TRUE,
                           fileEncoding = "UTF-8")

(5) Loading packages. An add-on package must be loaded in every R session with library(); installation with install.packages() is needed only once. For mice:

    #install.packages("mice")
    library(mice)

On Windows, library(XLConnect) pulls in the XLConnectJars and rJava packages and can fail when the installed Java does not match the architecture (a 64-bit OS with 32-bit Java, or a 32-bit OS with 64-bit Java):

    Error : .onLoad failed in loadNamespace() for 'rJava' :
      call: fun(libname, pkgname)
      error: JAVA_HOME cannot be determined from the Registry
    Error: package 'XLConnectJars' could not be loaded

Installing a Java build matching the OS and R (32-bit vs 64-bit) resolves this.

References

1) (authors and title in Japanese), IT Text series.
2) (authors and title in Japanese), D3, Vol. 68, No. 5, 2012.
3) MURAI, Y., ARIMURA, M., HASEGAWA, H., TAMURA, T. and KAJIYA, Y.: Text mining analysis on methods of information provision that influence tourists' travel behavior, Journal of the Eastern Asia Society for Transportation Studies, Vol. 8, 2009.
4) HASEGAWA, H., FUJII, M., ARIMURA, M. and TAMURA, T.: A Basic Study on Traffic Accident Data Analysis Using Support Vector Machine, Journal of the Eastern Asia Society of Transportation Studies, Vol. 7, 2007.
5) Hasegawa, H., Arimura, M. and Tamura, T.: Hybrid Model of Random Forests and Genetic Algorithms for Commute Mode Choice Analysis, Proceedings of The Eastern Asia Society for Transportation Studies, Vol. 9, 2013.
6) ARIMURA, M., NAITO, T., HASEGAWA, H. and TAMURA, T.: Application of Data Mining Techniques to Congestion Data Analysis: The Case of Sapporo Urban Area, Selected Proceedings of the World Conference on Transport Research, Vol. 12, 2010.
7) HASEGAWA, H., FUJII, M., ARIMURA, M. and TAMURA, T.: A Study on Traffic Accident Analysis Using Support Vector Machines, Proceedings of The 11th World Conference on Transportation Research, Vol. 11, World Conference on Transport Research Society, 2007.
8) (authors and title in Japanese), Vol. 40.
9) Li, M., Zhang, Y. and Wang, W.: Analysis of congestion points based on probe car data, 8th International IEEE Conference on Intelligent Transportation Systems, pp. 1-5, 2005.
10) Lu, Y. and Kawamura, K.: Data-Mining Approach to Work Trip Mode Choice Analysis in Chicago, Illinois, Area, Transportation Research Record: Journal of the Transportation Research Board, Vol. 2156, 2010.
11) Hossain, M. and Muromachi, Y.: Understanding Crash Mechanisms and Selecting Interventions to Mitigate Real-Time Hazards on Urban Expressways, Transportation Research Record: Journal of the Transportation Research Board, Vol. 2213, 2011.
12) (author and title in Japanese): R, 2nd ed.
13) (author in Japanese): The R Tips, 2nd ed.
14) (authors and title in Japanese): H22 survey report.
15) (authors in Japanese): Useful R series.
16) (author in Japanese): missing data analysis.
17) (authors and title in Japanese).
18) (authors and title in Japanese).
19) Murrell, P.: R Graphics (Japanese translation).


DATA PREPROCESSING AND CLEANING METHODS FOR UTILIZING BIG DATA IN TRANSPORTATION RESEARCH

Hasegawa, H. (1)
(1) National Institute of Technology, Akita College

In recent years, major improvements in information and communication technology have aided data collection in transportation research. The captured data needs to be converted into information and knowledge to become useful for decision-making. However, as data have grown in size and complexity, traditional statistical methods alone are no longer sufficient. Searching for useful nuggets of information among huge amounts of data has become known as the field of data mining: the entire process of applying computer-based methodology, including new techniques for knowledge discovery, to data. Under these circumstances, the purposes of this study are the following: 1. surveying data preprocessing and cleaning methods; 2. applying data preprocessing and cleaning methods to transportation-related data sets; 3. in particular, producing a well-organized reference on data preprocessing and cleaning methods for transportation researchers. The following two results were obtained. 1. GNU R, an open-source language and development environment for statistical analysis, was adopted for applying data preprocessing and cleaning methods to transportation-related data sets. 2. A reference report about data preprocessing and cleaning methods for transportation researchers is available via JACIC's web site.

KEYWORDS: data preprocessing, data cleaning, big data, GNU R, transportation.

Form 3-2  Summary of Research Results

Grant number: 第 号
Funded research title: Research on Data Preprocessing and Cleaning for Utilizing Transportation-Related Big Data
Researcher / affiliation: HASEGAWA Hironobu, National Institute of Technology, Akita College

1. Background and purpose
In recent years, driven by advances in sensor technology and falling data-storage costs, the data automatically recorded and accumulated by systems — big data — has been growing rapidly in both quantity and quality. Mining knowledge useful for decision-making in marketing, policy planning, and the like from this ever-growing transportation-related big data requires a knowledge-discovery process based on data mining (Figure 1).

Figure 1: The KDD process

Within this process, pattern discovery — finding frequently occurring patterns and rules in a data set — is the central step, also called data mining in the narrow sense, and has a substantial body of research in the transportation field. In contrast, the data preprocessing and cleaning that precedes pattern discovery — removing outliers, transforming variables, splitting data by segment, and so on — is in practice performed exploratorily as one step of each analysis, and no systematic treatment of it exists in the transportation field. Furthermore, detailed descriptions of the preprocessing and cleaning performed in individual studies are often omitted for reasons of space, which is also a problem from the standpoint of scientific reproducibility. Against this background, this study aims to organize the methodology of preprocessing and cleaning transportation-related data, with application to knowledge discovery from transportation big data in mind.

2. Research procedure
Concretely, materials on data preprocessing and cleaning methods in the fields of data mining, machine learning, information processing, and statistics were collected, and, based on them, the methods were applied to transportation-related data using the open-source data-analysis environment R and its extension packages.

3. Novelty and use of the results
The novelty of this study is that, by applying the hitherto insufficiently organized topic of data preprocessing and cleaning to transportation-related data published as open data or easily obtainable via the Web, the material has been made approachable for practitioners and researchers in the transportation field. Converting transportation-related data into a tractable form while limiting the loss of the information it carries requires domain knowledge of the transportation phenomenon in question along with knowledge and techniques from statistics and machine learning; the author hopes to improve the material further with feedback from readers. Since data preprocessing and cleaning not only strongly affects analysis accuracy but is also essential for handling ever-growing data within realistic computation times, it is hoped that the results of this study will be put to use and contribute to the future development of practice and research in the transportation field.
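The summary names outlier removal, variable transformation, and per-segment data splitting as the preprocessing steps typically performed exploratorily. As a minimal sketch only — the 3-sigma cutoff and the built-in iris sample data are assumptions for illustration, not taken from the report — these steps look as follows in base R:

```r
# Exploratory preprocessing sketch using only base R and the iris data set.

data <- iris

# 1. Outlier removal: drop rows whose Sepal.Width lies more than
#    3 standard deviations from the column mean.
z <- as.vector(scale(data$Sepal.Width))  # standardize: mean 0, sd 1
data <- data[abs(z) <= 3, ]

# 2. Variable transformation: standardize the four numeric columns.
data[, 1:4] <- scale(data[, 1:4])

# 3. Segmentation: split the cleaned data by species.
segments <- split(data, data$Species)
sapply(segments, nrow)  # rows remaining per segment
```

In practice the cutoff, the transformed variables, and the segmentation key would all be chosen from domain knowledge of the transportation phenomenon, which is exactly the point the summary makes.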
