Hitomi's English Tests

      1  2  3  4  5  6  7  8  9 10 11 12 13 14
 1    1  0  1  1  0  1  0  0  0  1  0  0  1  0
 2    0  0  1  1  0  0  0  0  0  1  1  1  1  0
 3    1  1  0  0  0  0  1  0  1  0  1  0  1  1
 4    1  1  0  1  0  1  1  1  1  0  0  0  1  1
 5    1  1  0  1  1  1  1  0  0  1  0  1  0  0
 6    0  0  0  0  0  1  1  0  0  1  1  0  1  0
 7    1  1  0  1  0  1  1  0  0  1  0  0  0  1
 8    1  0  1  0  0  0  1  0  0  0  1  0  0  1
 9    0  0  1  1  1  1  1  0  0  1  1  0  1  1
10    1  0  0  0  0  0  1  0  0  0  0  0  1  1
[Figure: "True Scores of 1000 Tests". Horizontal axis: scores of randomly chosen tests (-3 to 3). Vertical axis: density (0.0 to 0.4).]
Example 1 (height): 15 individuals.

No.      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
Height 178  165  168  152  175  175  165  162  164  170  169  155  153  162  168
Example 2: 15 individuals, numbered 1 to 15.
Definition 3 (one-dimensional data): data in which a single variable is observed for each individual are called one-dimensional data.
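Definition 4 (two-dimensional data): data in which two variables are observed for each individual are called two-dimensional data. Data in which many variables are observed for each individual are called high-dimensional data.

15 individuals:

No.      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
Height 178  165  168  152  175  175  165  162  164  170  169  155  153  162  168
Weight  63   62   69   41   71   61   62   48   52   55   69   48   44   49   69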
Example 5 (population): time series data. Population is given in units of 1000.

Year            15      20      25      30      35      40       45       50       55       60
Population  71,933  72,147  83,200  89,276  93,419  98,275  103,720  111,940  117,060  121,049
[Figure: population (units of 1000, 80000 to 120000) plotted against year (20 to 60).]
[Figure: plot with horizontal axis 0 to 10000 and vertical axis -1.5 to 1.5.]
Frequency table (n = 373)

Class    Class mark  Frequency  Relative   Cumulative  Cumulative
                                frequency  frequency   relative frequency
 0-10        5          12       0.032        12         0.032
10-20       15          10       0.027        22         0.059
20-30       25          19       0.051        41         0.110
30-40       35          42       0.113        83         0.223
40-50       45          72       0.193       155         0.416
50-60       55          82       0.220       237         0.635
60-70       65          54       0.145       291         0.780
70-80       75          38       0.102       329         0.882
80-90       85          25       0.067       354         0.949
90-100      95          19       0.051       373         1.000
Total                  373       1.00
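Terms used in the table:
frequency: the number of observations in a class;
relative frequency: the frequency divided by n;
cumulative frequency: the sum of the frequencies up to and including the class;
cumulative relative frequency: the cumulative frequency divided by n.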
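The relationship between the four quantities can be checked with a short Python sketch. The class frequencies are copied from the table (n = 373); the class width of 10 and the variable names are illustrative assumptions, not part of the original slides.

```python
# Class frequencies copied from the frequency table (n = 373).
freqs = [12, 10, 19, 42, 72, 82, 54, 38, 25, 19]
n = sum(freqs)

rows = []
cum = 0
for k, f in enumerate(freqs):
    cum += f                      # cumulative frequency
    lower = 10 * k                # classes assumed to be 0-10, 10-20, ...
    rows.append((lower, lower + 10, f, f / n, cum, cum / n))

for lower, upper, f, rel, c, crel in rows:
    print(f"{lower:3d}-{upper:<3d} {f:4d} {rel:6.3f} {c:4d} {crel:6.3f}")
```

The last row should show a cumulative frequency equal to n and a cumulative relative frequency of 1.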
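Definition 6 (frequency distribution): the range of the data is divided into intervals called classes; the number of observations falling in each class is its frequency, and the table of classes with their frequencies is the frequency distribution.
Definition 7 (histogram): a graph of the frequency distribution in which each class is drawn as a bar whose height shows its frequency.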
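Choosing the number of classes: for n observations, a guideline for the number of classes k is
\[ k \approx 1 + \frac{\log n}{\log 2}. \]
For n = 373 this gives k = 9.543.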
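As a quick check of the arithmetic, the guideline (commonly known as Sturges' rule) can be evaluated in Python. The function name is an illustrative choice.

```python
import math

def sturges_classes(n: int) -> float:
    """Suggested number of classes, k = 1 + log(n) / log(2)."""
    return 1 + math.log(n) / math.log(2)

k = sturges_classes(373)
print(round(k, 3))  # about 9.543, so roughly 10 classes
```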
[Figure: histogram with 10 classes. Horizontal axis: Score (0 to 100). Vertical axis: Frequency (0 to 80).]
[Figure (continued): histogram with 5 classes. Horizontal axis: Score (0 to 100). Vertical axis: Frequency (0 to 150).]
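Definition 8 (mean): for n observations $y_1, \dots, y_n$,
\[ \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\,(y_1 + y_2 + \cdots + y_{n-1} + y_n) \]
is called the sample mean.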
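(continued)
Definition 9 (order statistics): the observations $y_1, y_2, \dots, y_{n-1}, y_n$ arranged in ascending order,
\[ y_{(1)} \le y_{(2)} \le \cdots \le y_{(n-1)} \le y_{(n)}. \]
Definition 10 (median):
\[ y_{\mathrm{med}} = \begin{cases} y_{(m+1)} & (n = 2m + 1) \\[4pt] \dfrac{y_{(m)} + y_{(m+1)}}{2} & (n = 2m) \end{cases} \]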
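The two cases of the median can be written out directly. The following is a minimal Python sketch; the function name is illustrative, and the data are the 15 values listed in Example 1, read here as heights.

```python
def median(values):
    """Median from the order statistics y_(1) <= ... <= y_(n)."""
    y = sorted(values)
    n = len(y)
    m = n // 2
    if n % 2 == 1:
        # n = 2m + 1: the median is y_(m+1), i.e. zero-based index m
        return y[m]
    # n = 2m: average of y_(m) and y_(m+1), i.e. zero-based indices m-1 and m
    return (y[m - 1] + y[m]) / 2

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]
print(median(heights))  # 165
```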
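(continued)
Definition 11 (percentile): for $0 \le p \le 1$, the value at or below which 100p% of the ordered observations $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n-1)} \le y_{(n)}$ lie is called the 100p-th percentile (the 100p% point).
Definition 12 (quantile): the values that divide the ordered observations $y_{(1)} \le y_{(2)} \le \cdots \le y_{(n-1)} \le y_{(n)}$ into 4 equal parts. The 25% point is the first quartile, the 50% point is the second quartile, and the 75% point is the third quartile.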
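Quartiles can be computed with Python's standard library, as sketched below. Several interpolation conventions exist and the slides do not say which one is intended, so the choice of `method="inclusive"` here is an assumption. The data are the 15 values from Example 1.

```python
import statistics

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]

# n=4 asks for the three cut points that split the data into quarters.
# "inclusive" treats the data as covering the whole population range;
# other conventions can give slightly different values.
q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")
print(q1, q2, q3)
```

The second quartile coincides with the median of the same data.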
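(continued)
Definition 13 (mode): the value that occurs most frequently in the data.
Definition 14 (mid-range): the average of the smallest and largest observations,
\[ y_{\mathrm{mid}} = \frac{y_{(1)} + y_{(n)}}{2}. \]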
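A small Python sketch of both measures, using the 15 values from Example 1. The mode need not be unique, so `statistics.multimode` is used here to return every value that attains the highest count; that choice is an illustration, not something the slides prescribe.

```python
import statistics

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]

# All values tied for the highest frequency, sorted for readability.
modes = sorted(statistics.multimode(heights))

# Mid-range: average of the smallest and largest observations.
mid_range = (min(heights) + max(heights)) / 2

print(modes)      # several values occur twice in this data set
print(mid_range)  # (152 + 178) / 2
```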
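Definition 15 (variance):
\[ S_n^2 = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n}\left\{ (y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \cdots + (y_n - \bar{y})^2 \right\} \]
Equivalently,
\[ S_n^2 = \frac{1}{n}\left( \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 \right). \]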
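The two expressions for the variance give the same number, which can be confirmed numerically. The sketch below uses the 15 values from Example 1; the variable names are illustrative.

```python
import math

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]
n = len(heights)
mean = sum(heights) / n

# Definition: average squared deviation from the mean (divisor n).
var_def = sum((y - mean) ** 2 for y in heights) / n

# Computational form: (sum of squares - n * mean^2) / n.
var_short = (sum(y * y for y in heights) - n * mean ** 2) / n

print(mean, var_def, var_short)
print(math.isclose(var_def, var_short))  # True
```

Note the divisor is n, as in the definition, not n - 1.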
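Definition 16 (standard deviation):
\[ S_n = \sqrt{S_n^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2} \]
Definition 17 (coefficient of variation): the standard deviation divided by the mean,
\[ CV = \frac{S_n}{\bar{y}}. \]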
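For observations $y_1, \dots, y_n$, the transformation
\[ z_1 = \frac{y_1 - \bar{y}}{S_n},\quad z_2 = \frac{y_2 - \bar{y}}{S_n},\quad \dots,\quad z_n = \frac{y_n - \bar{y}}{S_n} \]
is called standardization, and $z_1, \dots, z_n$ are called standard scores (Z scores). The standard scores have mean 0 and standard deviation 1. To obtain scores with mean 50 and standard deviation 10, set
\[ z_1^{*} = 10 z_1 + 50,\quad \dots,\quad z_n^{*} = 10 z_n + 50. \]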
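Standardization and the rescaling to mean 50 and standard deviation 10 can be sketched as follows. The function name is illustrative, and the data are the 15 values from Example 1.

```python
import math

def standardize(values):
    """Return z_i = (y_i - mean) / S_n, using the divisor-n standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((y - mean) ** 2 for y in values) / n)
    return [(y - mean) / sd for y in values]

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]

z = standardize(heights)
z_star = [10 * zi + 50 for zi in z]   # rescaled scores

mean_z = sum(z) / len(z)
var_z = sum(zi ** 2 for zi in z) / len(z) - mean_z ** 2
print(round(mean_z, 10), round(var_z, 10))  # close to 0 and 1
```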
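Definition 18 (two-dimensional data): data in which two variables are observed for each individual are called two-dimensional data.

Example 1 (height and weight): 15 individuals.

No.      1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
Height 178  165  168  152  175  175  165  162  164  170  169  155  153  162  168
Weight  63   62   69   41   71   61   62   48   52   55   69   48   44   49   69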
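Two-dimensional data (continued): $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$.
Definition 19 (scattergram): a plot of the two-dimensional data $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ as n points in the (x, y) plane.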
[Figure: scattergram of the two-dimensional data for the 15 individuals. Horizontal axis: Height (155 to 175). Vertical axis: Weight (45 to 70).]
[Figure: scattergram of C2 (vertical axis, -4 to 3) against C1 (horizontal axis, -4 to 2).]
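Variance of x:
\[ s_{xx} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2 \]
Variance of y:
\[ s_{yy} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n}\sum_{i=1}^{n} y_i^2 - \bar{y}^2 \]
Covariance of (x, y):
\[ s_{xy} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \frac{1}{n}\sum_{i=1}^{n} x_i y_i - \bar{x}\bar{y} \]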
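(continued)
Definition 20 (correlation coefficient): for two-dimensional data $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$, the correlation coefficient between x and y is
\[ r = \frac{s_{xy}}{\sqrt{s_{xx}}\sqrt{s_{yy}}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}. \]
Property 1: $-1 \le r \le 1$.
r > 0: positive correlation; r < 0: negative correlation; r = 0: no correlation.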
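The definitions of the variances, the covariance and the correlation coefficient can be applied to the height and weight data of the 15 individuals. The sketch below is a direct transcription with divisor n; the function and variable names are illustrative.

```python
import math

heights = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
           169, 155, 153, 162, 168]
weights = [63, 62, 69, 41, 71, 61, 62, 48, 52, 55,
           69, 48, 44, 49, 69]

def cov(x, y):
    """Covariance s_xy with divisor n; cov(x, x) is the variance s_xx."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

s_xx = cov(heights, heights)
s_yy = cov(weights, weights)
s_xy = cov(heights, weights)
r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

print(round(s_xy, 2), round(r, 3))  # r is positive: taller goes with heavier
```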
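Proof that $-1 \le r \le 1$:
\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}} \]
Put $a_i = x_i - \bar{x}$ and $b_i = y_i - \bar{y}$ for $i = 1, \dots, n$. The claim follows from the Schwarz inequality:
\[ \left( \sum_{i=1}^{n} a_i b_i \right)^2 \le \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2. \]
The Schwarz inequality itself follows from the fact that, for every real t,
\[ \sum_{i=1}^{n} (a_i + b_i t)^2 \ge 0. \]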
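The step from the non-negative quadratic in t to the Schwarz inequality is the usual discriminant argument. A sketch of it, with Q(t) introduced here only as a shorthand:

```latex
\[
Q(t) = \sum_{i=1}^{n} (a_i + b_i t)^2
     = \Bigl(\sum_{i=1}^{n} b_i^2\Bigr) t^2
     + 2 \Bigl(\sum_{i=1}^{n} a_i b_i\Bigr) t
     + \sum_{i=1}^{n} a_i^2 \;\ge\; 0
     \quad \text{for all real } t .
\]
% If \sum b_i^2 > 0, a quadratic that is never negative has a
% non-positive discriminant:
\[
\Bigl(\sum_{i=1}^{n} a_i b_i\Bigr)^2
  - \sum_{i=1}^{n} a_i^2 \sum_{i=1}^{n} b_i^2 \;\le\; 0 .
\]
% Dividing by \sum a_i^2 \sum b_i^2 > 0 gives r^2 \le 1,
% that is, -1 \le r \le 1.
```

If all $b_i$ are zero the inequality holds trivially, but then $s_{yy} = 0$ and r is not defined.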
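Invariance under linear transformation: if $u_i = a x_i + b$ and $v_i = c y_i + d$ $(i = 1, 2, \dots, n)$, then
\[ \frac{s_{xy}}{\sqrt{s_{xx}}\sqrt{s_{yy}}} = \frac{s_{uv}}{\sqrt{s_{uu}}\sqrt{s_{vv}}} \qquad (ac > 0). \]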
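The invariance can be checked numerically on the height and weight data. In the sketch below the constants a = 0.01, b = 0, c = 2, d = 5 are arbitrary illustrative choices with ac > 0; the last lines show that when ac < 0 the sign of r flips.

```python
import math

def corr(x, y):
    """Correlation coefficient r = s_xy / (sqrt(s_xx) * sqrt(s_yy))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    s_xy = sum((p - mx) * (q - my) for p, q in zip(x, y)) / n
    s_xx = sum((p - mx) ** 2 for p in x) / n
    s_yy = sum((q - my) ** 2 for q in y) / n
    return s_xy / math.sqrt(s_xx * s_yy)

x = [178, 165, 168, 152, 175, 175, 165, 162, 164, 170,
     169, 155, 153, 162, 168]
y = [63, 62, 69, 41, 71, 61, 62, 48, 52, 55,
     69, 48, 44, 49, 69]

u = [0.01 * xi for xi in x]        # a = 0.01, b = 0 (illustrative)
v = [2.0 * yi + 5.0 for yi in y]   # c = 2, d = 5 (illustrative)

r_xy = corr(x, y)
r_uv = corr(u, v)
print(math.isclose(r_xy, r_uv))    # True: ac > 0 leaves r unchanged

w = [-yi for yi in y]              # c = -1, so ac < 0
r_xw = corr(x, w)
print(math.isclose(r_xw, -r_xy))   # True: the sign flips
```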
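Definition 21 (confounding): the situation in which a third variable affects both of the variables being compared, so that the relationship between them cannot be separated from its influence.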
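Definition 22 (spurious correlation): a correlation between x and y that arises because both are related to a third variable.
Definition 23 (partial correlation coefficient): let
$r_{xy}$: the correlation coefficient between x and y,
$r_{xz}$: the correlation coefficient between x and z,
$r_{yz}$: the correlation coefficient between y and z.
The correlation between x and y after removing the influence of z is
\[ r_{xy \cdot z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{1 - r_{xz}^2}\,\sqrt{1 - r_{yz}^2}}. \]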
Example 5: K. Pearson (1898), n = 50: stature (S), femur (F), humerus (H), tibia (T) and radius (R) (Krzanowski and Marriott, 1994, p.23). Correlation coefficients:

        F       H       T       R       S
F       1  0.8421  0.8058  0.7439  0.8105
H               1  0.8601  0.8451  0.8091
T                       1  0.7804  0.7769
R                               1  0.6956
S                                       1
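The correlation between stature and radius is $r_{SR} = 0.6956$. Partial correlation coefficients after removing the influence of F:

        H       T       R       S
H       1  0.5682  0.6068  0.4007
T               1  0.4574  0.3569
R                       1  0.2367
S                               1

\[ r_{SR \cdot F} = \frac{r_{SR} - r_{SF}\, r_{RF}}{\sqrt{1 - r_{SF}^2}\,\sqrt{1 - r_{RF}^2}} = \frac{0.6956 - 0.8105 \times 0.7439}{\sqrt{1 - 0.8105^2}\,\sqrt{1 - 0.7439^2}} = 0.2367 \]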
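Partial correlation coefficients after removing the influence of both H and F:

        T       R        S
T       1  0.1772   0.1714
R               1  -0.0088
S                        1

\[ r_{SR \cdot HF} = \frac{r_{SR \cdot F} - r_{SH \cdot F}\, r_{RH \cdot F}}{\sqrt{1 - r_{SH \cdot F}^2}\,\sqrt{1 - r_{RH \cdot F}^2}} = \frac{0.2367 - 0.4007 \times 0.6068}{\sqrt{1 - 0.4007^2}\,\sqrt{1 - 0.6068^2}} = -0.0088 \]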
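The two worked values can be reproduced by applying the partial correlation formula twice: first to the simple correlations, then to the first-order partial correlations. The sketch below uses the rounded coefficients quoted in the example, so the last digit may differ slightly from a calculation on unrounded values; the function name is illustrative.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the influence of z."""
    return (r_xy - r_xz * r_yz) / (
        math.sqrt(1 - r_xz ** 2) * math.sqrt(1 - r_yz ** 2)
    )

# First order: stature (S) and radius (R), removing femur (F).
r_SR, r_SF, r_RF = 0.6956, 0.8105, 0.7439
r_SR_F = partial_corr(r_SR, r_SF, r_RF)

# Second order: also removing humerus (H), starting from the
# first-order values quoted in the example.
r_SR_HF = partial_corr(0.2367, 0.4007, 0.6068)

print(round(r_SR_F, 4))   # 0.2367
print(round(r_SR_HF, 4))  # -0.0088
```

The numerator of the second calculation, 0.2367 - 0.4007 × 0.6068, is slightly negative, so the resulting coefficient is negative and essentially zero: once femur and humerus are accounted for, almost no correlation between stature and radius remains.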