November 2015
R R-console R R Rscript R-console GUI 1
          X      Y
     1  11.04  21.03
     2  15.76  24.75
     3  17.72  31.28
     4   9.15  11.16
     5  10.10  18.89
     6  12.33  24.25
     7   4.20  10.57
     8  17.04  33.99
     9  10.50  21.01
    10   8.36   9.68
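The data are treated as n-dimensional vectors with components $x_1, x_2, \dots$:

\[ x = [x_1, x_2, \dots, x_n], \qquad y = [y_1, y_2, \dots, y_n] \]

As column vectors, using the transpose $T$:

\[ x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = [x_1, x_2, \dots, x_n]^T, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \]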
Means of the 2 variables x, y:

\[ \bar{x} = \frac{1}{n}(x_1 + x_2 + \dots + x_n), \qquad \bar{y} = \frac{1}{n}(y_1 + y_2 + \dots + y_n) \]
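Why the mean? Given 3 measurements $x_1, x_2, x_3$, look for the value $\mu$ that best represents all 3, measured by the sum of squared deviations $S$:

\[ S = (x_1 - \mu)^2 + (x_2 - \mu)^2 + (x_3 - \mu)^2 \]

Choose $\mu$ so that $S$ is minimized. Differentiating $S$ with respect to $\mu$:

\[ \frac{dS}{d\mu} = 2(\mu - x_1) + 2(\mu - x_2) + 2(\mu - x_3) = 0 \quad\Longrightarrow\quad \mu = \frac{1}{3}(x_1 + x_2 + x_3) \]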
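The minimization argument can be checked numerically. The following is a minimal sketch, assuming three illustrative measurements (here the first three X values of the sample data, chosen only for illustration); it minimizes S with R's `optimize()` and compares the result with `mean()`.

```r
# Three illustrative measurements (an assumption: the first three X values)
x <- c(11.04, 15.76, 17.72)

# Sum of squared deviations from a candidate value mu
S <- function(mu) sum((x - mu)^2)

# Numerically minimise S over the range of the data
opt <- optimize(S, interval = range(x), tol = 1e-10)

opt$minimum   # numerical minimiser of S
mean(x)       # arithmetic mean; the two should agree closely
```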
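Variances:

\[ s_x^2 = \frac{1}{n}\left((x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2\right) \]
\[ s_y^2 = \frac{1}{n}\left((y_1 - \bar{y})^2 + (y_2 - \bar{y})^2 + \dots + (y_n - \bar{y})^2\right) \]

Here $x_i - \bar{x}$ is the deviation of $x_i$ from the mean.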
[Figures: plots over n = 0 to 20. Panel annotations: "n = 21"; "n = 21, p = 0.4"; "n = 1, p = 0.4" with p = 0, 1/2, 1.]
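Covariance:

\[ s_{xy} = \frac{1}{n}\left((x_1 - \bar{x})(y_1 - \bar{y}) + \dots + (x_n - \bar{x})(y_n - \bar{y})\right) \]

It is the mean of the products of the deviations $x_i - \bar{x}$ and $y_i - \bar{y}$ of x and y.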
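As a small sketch of these definitions, the code below computes the variance and covariance with the divisor n for the 10-point sample data. Note that R's built-in `var()` and `cov()` divide by n - 1, so they are rescaled before comparing; the variable names are illustrative.

```r
x <- c(11.04, 15.76, 17.72, 9.15, 10.10, 12.33, 4.20, 17.04, 10.50, 8.36)
y <- c(21.03, 24.75, 31.28, 11.16, 18.89, 24.25, 10.57, 33.99, 21.01, 9.68)
n <- length(x)

# Variance and covariance with divisor n, as defined in the text
s2_x <- sum((x - mean(x))^2) / n
s2_y <- sum((y - mean(y))^2) / n
s_xy <- sum((x - mean(x)) * (y - mean(y))) / n

# R's var() and cov() use divisor n - 1, so rescale by (n - 1) / n
c(by_hand = s2_x, from_var = var(x) * (n - 1) / n)
c(by_hand = s_xy, from_cov = cov(x, y) * (n - 1) / n)
```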
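[Figure: scatter plot with axes X and Y, divided into quadrants I, II, III, IV; points in quadrants I and III versus points in quadrants II and IV.]

[Figure: two scatter plots with quadrants I to IV, panel titles r_xy = 0.769 and r_xy = 0.737.]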
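Correlation coefficient ($-1 \le r_{xy} \le 1$):

\[ r_{xy} = \frac{s_{xy}}{s_x s_y} \]

$r_{xy} = 1$: all points lie on a line $y = ax + b$ $(a > 0)$
$r_{xy} = -1$: all points lie on a line $y = ax + b$ $(a < 0)$
$r_{xy} \approx 0$: no linear relationship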
The X, Y data table above (2 variables, 10 observations) is treated as 2 vectors x, y:

\[ x = [x_1, x_2, \dots, x_{10}], \qquad y = [y_1, y_2, \dots, y_{10}] \]

With p variables $x_1, x_2, \dots$, each variable is a vector:

\[ x_1 = [x_{1,1}, x_{1,2}, \dots, x_{10,p}] \]
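Centering: from the raw data $x_{R1}, x_{R2}, \dots$ with mean $\bar{x}$, define

\[ x = x_R - \bar{x} = [x_{R1} - \bar{x},\ x_{R2} - \bar{x},\ \dots] \]

and likewise for y. Then

\[ s_x^2 = \frac{1}{n}(x_1^2 + x_2^2 + \dots + x_n^2), \qquad s_{xy} = \frac{1}{n}(x_1 y_1 + x_2 y_2 + \dots + x_n y_n) \]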
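Norm and inner product:

\[ \|x\|^2 = (x, x) = x_1 x_1 + x_2 x_2 + \dots + x_n x_n \]
\[ s_x^2 = \frac{1}{n}\|x\|^2, \qquad s_x = \frac{1}{\sqrt{n}}\|x\| \]
\[ (x, y) = x_1 y_1 + x_2 y_2 + \dots + x_n y_n, \qquad s_{xy} = \frac{1}{n}(x, y) \]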
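Inner product of two plane vectors $\vec{a} = (x_1, y_1)$ and $\vec{b} = (x_2, y_2)$ with angle $\theta$ between them:

\[ (\vec{a}, \vec{b}) = |\vec{a}|\,|\vec{b}| \cos\theta = x_1 x_2 + y_1 y_2 \]

[Figure: vectors a and b drawn from the origin O to (x1, y1) and (x2, y2), with angle θ between them.]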
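\[ r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{(x, y)}{\|x\|\,\|y\|} = \cos\theta \]

The correlation coefficient is $\cos\theta$, where $\theta$ is the angle between the 2 (centered) data vectors.

[Figure: three pairs of vectors illustrating r_xy < 0, r_xy = 0 and r_xy > 0.]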
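The identity between the correlation coefficient and the cosine of the angle between centered vectors can be checked on the 10-point sample data. This is a minimal sketch; the helper names are illustrative.

```r
x <- c(11.04, 15.76, 17.72, 9.15, 10.10, 12.33, 4.20, 17.04, 10.50, 8.36)
y <- c(21.03, 24.75, 31.28, 11.16, 18.89, 24.25, 10.57, 33.99, 21.01, 9.68)

# Centre both vectors (subtract the means)
xc <- x - mean(x)
yc <- y - mean(y)

# Cosine of the angle between the centred vectors: (x, y) / (|x| |y|)
cos_theta <- sum(xc * yc) / (sqrt(sum(xc^2)) * sqrt(sum(yc^2)))

cos_theta   # cosine computed from inner product and norms
cor(x, y)   # R's built-in correlation coefficient
```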
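Identity matrix:

\[ I \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]

[Figure: points (x, y) and (x', y') with origin O.]

Inverse matrix: $A A^{-1} = I$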
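Linear transformation:

\[ y_1 = a_{11} x_1 + a_{12} x_2 + b_1 \]
\[ y_2 = a_{21} x_1 + a_{22} x_2 + b_2 \]

In matrix form:

\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \qquad y = Ax + b \]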
Inverse transformation:

\[ x = A^{-1}(y - b) \]

Compare the scalar case: $y = ax + b \Rightarrow x = a^{-1}(y - b)$
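A short sketch of the transformation and its inverse in R. The matrix `A`, the vector `b` and the point `x` are hypothetical values chosen only for illustration; `solve(A, v)` computes A^{-1} v without forming the inverse explicitly.

```r
# Hypothetical coefficients, chosen only for illustration
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(1, -1)
x <- c(0.5, 2)

# Forward transformation y = A x + b
y <- as.vector(A %*% x + b)

# Inverse transformation x = A^{-1} (y - b)
x_back <- solve(A, y - b)

y        # transformed point
x_back   # should reproduce x
```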
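For random variables X, Y:

\[ E[X + Y] = E[X] + E[Y] \quad \text{(always holds)} \]
\[ V[X \pm Y] = V[X] + V[Y] \quad \text{(when X and Y are uncorrelated, in particular independent)} \]

That is, $\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2$.

Check with three variables X, Y, Z, comparing the sums X + Y and X + Z.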
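    mean X  σ_X    mean Y  σ_Y    mean X+Y  σ_{X+Y}  σ_XY   ρ_XY
    10.08   0.513  11.50   0.449  21.58     0.962    0.230  0.998

    mean X  σ_X    mean Z  σ_Z    mean X+Z  σ_{X+Z}  σ_XZ   ρ_XZ
    10.08   0.513  11.60   0.369  21.68     0.659    0.018  0.095

X and Y (ρ_XY close to 1): $\sigma_{X+Y} \approx \sigma_X + \sigma_Y$
X and Z (ρ_XZ close to 0): $\sigma_{X+Z}^2 \approx \sigma_X^2 + \sigma_Z^2$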
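Both cases in the table follow from the general identity V[X + Y] = V[X] + V[Y] + 2 Cov(X, Y), which is a standard result rather than something stated on the slide. A minimal sketch that plugs the tabulated standard deviations and covariances into it (the function name `sd_sum` is illustrative):

```r
# Values copied from the table above
sd_x <- 0.513
sd_y <- 0.449; cov_xy <- 0.230
sd_z <- 0.369; cov_xz <- 0.018

# Standard deviation of a sum from V[A + B] = V[A] + V[B] + 2 Cov(A, B)
sd_sum <- function(sd_a, sd_b, cov_ab) sqrt(sd_a^2 + sd_b^2 + 2 * cov_ab)

sd_xy <- sd_sum(sd_x, sd_y, cov_xy)   # compare with the tabulated 0.962
sd_xz <- sd_sum(sd_x, sd_z, cov_xz)   # compare with the tabulated 0.659

round(c(sd_xy = sd_xy, sd_xz = sd_xz), 3)
```

The small differences from the table come from the rounding of the tabulated inputs.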
[Figures: (1) scatter plot of Y against X; (2) scatter plot of Z against X. X axis from 9.5 to 11.0, vertical axes from 11.0 to 12.2.]
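Regression with 1 explanatory variable: fit the line $y = a + bx$.

[Figure: data points (x_i, y_i), the line y = a + bx, and the vertical distances h_1, h_2, ..., h_i from each point (x_i, y_i) to the point (x_i, a + b x_i) on the line.]

Choose a, b so that the sum of the squared distances $h_i$ is minimized (least squares):

\[ b = \frac{s_{xy}}{s_x^2}, \qquad a = \bar{y} - b\bar{x} \]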
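As a sketch, the two formulas can be applied directly to the 10-point sample data and compared with the coefficients from `lm()`. Variable names are illustrative.

```r
x <- c(11.04, 15.76, 17.72, 9.15, 10.10, 12.33, 4.20, 17.04, 10.50, 8.36)
y <- c(21.03, 24.75, 31.28, 11.16, 18.89, 24.25, 10.57, 33.99, 21.01, 9.68)
n <- length(x)

# Covariance and variance with divisor n (the divisor cancels in b)
s_xy <- sum((x - mean(x)) * (y - mean(y))) / n
s2_x <- sum((x - mean(x))^2) / n

b <- s_xy / s2_x              # slope
a <- mean(y) - b * mean(x)    # intercept

fit <- lm(y ~ x)              # least-squares fit for comparison

c(a = a, b = b)
coef(fit)
```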
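Regression with 2 explanatory variables: $y = a + b_1 x_1 + b_2 x_2$. The coefficients $b_1, b_2$ are given by

\[ \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} s_{x_1}^2 & s_{x_1 x_2} \\ s_{x_1 x_2} & s_{x_2}^2 \end{pmatrix}^{-1} \begin{pmatrix} s_{x_1 y} \\ s_{x_2 y} \end{pmatrix} \]

where $s_{x_1 x_2}$ is the covariance of $x_1$ and $x_2$.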
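General case (p explanatory variables): $y = a + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$. The coefficients $b_1, b_2, \dots$ are given by

\[ \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{pmatrix} = \begin{pmatrix} s_{x_1 x_1} & s_{x_1 x_2} & \cdots & s_{x_1 x_p} \\ s_{x_2 x_1} & s_{x_2 x_2} & \cdots & s_{x_2 x_p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{x_p x_1} & s_{x_p x_2} & \cdots & s_{x_p x_p} \end{pmatrix}^{-1} \begin{pmatrix} s_{x_1 y} \\ s_{x_2 y} \\ \vdots \\ s_{x_p y} \end{pmatrix} \]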
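The matrix formula can be tried out in R. The sketch below assumes only the ten BodyScore rows that are printed in these notes (not the full file), solves the covariance equations with `solve()`, and compares the result with `lm()`. The intercept formula a = ȳ - Σ b_j x̄_j is the standard extension of the one-variable case and is an addition here.

```r
# The ten BodyScore rows shown in these notes (the full file has more)
body10 <- data.frame(
  Bust   = c(84, 84, 86, 87, 83, 83, 84, 82, 82, 85),
  West   = c(58, 59, 59, 63, 60, 60, 60, 60, 60, 63),
  Hip    = c(87, 89, 90, 94, 88, 88, 90, 86, 88, 90),
  Weight = c(47, 54, 50, 55, 51, 50, 54, 50, 52, 53)
)

X <- as.matrix(body10[, c("Bust", "West", "Hip")])
y <- body10$Weight

S_xx <- cov(X)       # covariance matrix of the explanatory variables
s_xy <- cov(X, y)    # covariances with the response

# b = S_xx^{-1} s_xy (the common divisor of the covariances cancels)
b <- solve(S_xx, s_xy)
a <- mean(y) - sum(colMeans(X) * b)

fit <- lm(Weight ~ Bust + West + Hip, data = body10)

cbind(normal_eq = c(a, b), lm = coef(fit))
```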
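R script. The file xy10.dat holds the 2-variable, 10-observation data:

    X      Y
    11.04  21.03
    15.76  24.75
    17.72  31.28
    9.15   11.16
    10.1   18.89
    12.33  24.25
    4.2    10.57
    17.04  33.99
    10.5   21.01
    8.36   9.68

    DT <- read.table("xy10.dat", header=TRUE)      # read the data
    postscript("images/lmxy10.ps", horizontal=FALSE,
               height=5.2, width=6, onefile=TRUE)  # graphics output file
    result = lm(Y ~ X, data = DT)                  # regression analysis
    summary(result)                                # show the results
    plot(Y ~ X, data = DT)                         # scatter plot
    abline(result)                                 # draw the regression line

Explanation of each line:

    DT <- read.table("xy10.dat", header=TRUE)
Reads the file xy10.dat into DT. DT has the columns X and Y (column 1 is X, column 2 is Y).

    postscript("lmxy10.ps", horizontal=FALSE, height=5.2, width=6, onefile=TRUE)
Sends the graphics output to a PostScript file. Sizes are in inches (2.54 cm).

    result = lm(Y ~ X, data = DT)
Fits a linear model (lm) and stores the fit in result. In the lm call, Y ~ X, data = DT means that the X and Y columns of DT are used.

    summary(result)
Prints a summary of result.

    plot(Y ~ X, data = DT)
Plots the X and Y columns of DT.

    abline(result)
Adds the fitted regression line to the plot.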
R output:

    Call:
    lm(formula = Y ~ X, data = DT)

    Residuals:
       Min     1Q Median     3Q    Max
    -5.014 -2.754  1.221  2.372  3.491

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  -0.6092     3.4405  -0.177 0.863859
    X             1.8305     0.2800   6.538 0.000181 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 3.54 on 8 degrees of freedom
    Multiple R-squared:  0.8424,    Adjusted R-squared:  0.8227
    F-statistic: 42.75 on 1 and 8 DF,  p-value: 0.000180
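Reading the output:

Residuals: summary of the residuals
Coefficients: the coefficients for the Intercept and X
    Estimate: estimated value
    Std. Error: standard error
    t value: t statistic
    Pr(>|t|): p-value of the t test (significance probability p)
Residual standard error: standard error of the residuals
Multiple R-squared: coefficient of determination (the square of the multiple correlation coefficient)
Adjusted R-squared: coefficient of determination adjusted for the degrees of freedom
F-statistic: F statistic and its p-value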
Multiple regression in R. Data file BodyScore.txt:

    Bust  West  Hip  Weight
    84    58    87   47
    84    59    89   54
    86    59    90   50
    87    63    94   55
    83    60    88   51
    83    60    88   50
    84    60    90   54
    82    60    86   50
    82    60    88   52
    85    63    90   53
    ...   (30 observations)

Weight is to be explained by Bust, West and Hip.

    BS <- read.table("BodyScore.txt", header=TRUE)
    cor(BS)   ## correlation matrix

            Bust      West      Hip       Weight
    Bust    1.0000000 0.3000223 0.6240064 0.4580098
    West    0.3000223 1.0000000 0.5753726 0.6212204
    Hip     0.6240064 0.5753726 1.0000000 0.6888909
    Weight  0.4580098 0.6212204 0.6888909 1.0000000

The correlations with Weight are around 0.6.

cor(BS) gives the correlation matrix and pairs(BS) draws the scatter-plot matrix (cf. [2], 7).
    BS.fit <- lm(Weight ~ Bust + West + Hip, data = BS)   # explanatory variables joined with +
    summary(BS.fit)

    Residuals:
        Min      1Q  Median      3Q     Max
    -4.5320 -1.1779 -0.3508  1.4179  6.4489

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -51.8052    19.2769  -2.687   0.0124 *
    Bust          0.1165     0.2475   0.471   0.6418
    West          0.6130     0.2873   2.133   0.0425 *
    Hip           0.6383     0.2835   2.251   0.0330 *
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 2.385 on 26 degrees of freedom
    Multiple R-squared:  0.554,     Adjusted R-squared:  0.5025
    F-statistic: 10.76 on 3 and 26 DF,  p-value: 8.859e-05

West and Hip are significant at the 5% level; Bust is not.
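Principal component analysis: from p variables $x_1, x_2, \dots, x_p$, construct m composite variables $z_1, z_2, \dots, z_m$ $(m \le p)$:

\[ z_1 = c_{11} x_1 + c_{12} x_2 + \dots + c_{1p} x_p \]
\[ z_2 = c_{21} x_1 + c_{22} x_2 + \dots + c_{2p} x_p \]
\[ \vdots \]
\[ z_m = c_{m1} x_1 + c_{m2} x_2 + \dots + c_{mp} x_p \]

The m variables $z_i$ $(1 \le i \le m)$ summarize the p original variables.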
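Correlation matrix of $x_1, x_2, \dots, x_p$:

\[ R = \begin{pmatrix} r_{x_1 x_1} & r_{x_1 x_2} & \cdots & r_{x_1 x_p} \\ r_{x_2 x_1} & r_{x_2 x_2} & \cdots & r_{x_2 x_p} \\ \vdots & \vdots & \ddots & \vdots \\ r_{x_p x_1} & r_{x_p x_2} & \cdots & r_{x_p x_p} \end{pmatrix} \]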
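In the usual formulation (stated here as background, not taken from the slide), the coefficient vectors of the principal components are the eigenvectors of R and the component variances are its eigenvalues. A minimal sketch, assuming R's built-in `USArrests` dataset as a stand-in for the lecture's data files:

```r
# USArrests ships with R; it is used here only as a stand-in dataset
X <- USArrests

R <- cor(X)      # correlation matrix
e <- eigen(R)    # eigenvalues and eigenvectors of R

pc <- princomp(X, cor = TRUE)   # PCA based on the correlation matrix

# Standard deviations of the components = square roots of the eigenvalues
sqrt(e$values)
unname(pc$sdev)

# Loadings equal the eigenvectors up to the sign of each column
round(abs(e$vectors), 3)
round(abs(unclass(pc$loadings)), 3)
```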
Example data (air pollution in US cities):

                   SO2  Neg.Temp  Manuf  Pop  Wind  Precip  Days
    Phoenix         10   -70.3     213   582   6     7.05    36
    Little Rock     13   -61        91   132   8.2  48.52   100
    San Francisco   12   -56.7     453   716   8.7  20.66    67
    Denver          17   -51.9     454   515   9    12.95    86
    Hartford        56   -49.1     412   158   9    43.37   127
    Wilmington      36   -54        80    80   9    40.25   114
    Washington      29   -57.3     434   757   9.3  38.89   111
    Jacksonville    14   -68.4     136   529   8.8  54.47   116
    Miami           10   -75.5     207   335   9    59.8    128
    Atlanta         24   -61.5     368   497   9.1  48.34   115
    ...
    Charleston      31   -55.2      35    71   6.5  40.75   148
    Milwaukee       16   -45.7     569   717  11.8  29.07   123

SO2: SO2 content of the air; Neg.Temp: average annual temperature with the sign reversed; Manuf: number of manufacturing enterprises; Pop: population; Wind: average wind speed (MPH); Precip: annual precipitation (inch); Days: number of days with precipitation per year.

    D <- read.table("usair.txt", header=TRUE)
    VD <- D[,-1]                         # drop the SO2 column
    cor(VD)                              # correlation matrix
    VD.pc <- princomp(VD, cor=TRUE)      # cor=TRUE: use the correlation matrix
    summary(VD.pc, loadings=TRUE)        # results including the loadings

             Neg.Temp    Manuf      Pop     Wind   Precip     Days
    Neg.Temp  1.00000  0.19004  0.06267  0.34973 -0.38625  0.43024
    Manuf     0.19004  1.00000  0.95526  0.23794 -0.03241  0.13182
    Pop       0.06267  0.95526  1.00000  0.21264 -0.02611  0.04208
    Wind      0.34973  0.23794  0.21264  1.00000 -0.01299  0.16410
    Precip   -0.38625 -0.03241 -0.02611 -0.01299  1.00000  0.49609
    Days      0.43024  0.13182  0.04208  0.16410  0.49609  1.00000
    Importance of components:
                              Comp.1    Comp.2    Comp.3    Comp.4     Comp.5
    Standard deviation     1.4819456 1.2247218 1.1809526 0.8719099 0.33848287
    Proportion of Variance 0.3660271 0.2499906 0.2324415 0.1267045 0.01909511
    Cumulative Proportion  0.3660271 0.6160177 0.8484592 0.9751637 0.99425879
                                Comp.6
    Standard deviation     0.185599752
    Proportion of Variance 0.005741211
    Cumulative Proportion  1.000000000

There are 6 components. Components 1 to 3 account for about 85% of the variance, so 3 components are used.
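The "Proportion of Variance" row can be recomputed from the "Standard deviation" row: each component's variance is the square of its standard deviation, and with 6 standardized variables the variances sum to 6. A minimal sketch using the printed values:

```r
# Standard deviations copied from the princomp summary above
sdev <- c(1.4819456, 1.2247218, 1.1809526, 0.8719099, 0.33848287, 0.185599752)

variance   <- sdev^2                     # variance of each component
proportion <- variance / sum(variance)   # share of the total variance
cumulative <- cumsum(proportion)         # cumulative share

sum(variance)   # close to 6, the number of standardised variables
round(rbind(proportion, cumulative), 4)
```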
Loadings (coefficients of each component; small values are left blank):

    Loadings:
             Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
    Neg.Temp -0.330  0.128  0.672  0.306  0.558  0.136
    Manuf    -0.612 -0.168 -0.273  0.137  0.102 -0.703
    Pop      -0.578 -0.222 -0.350                0.695
    Wind     -0.354  0.131  0.297 -0.869 -0.113
    Precip           0.623 -0.505 -0.171  0.568
    Days     -0.238  0.708         0.311 -0.580
Plot the cities on pairs of the 3 principal components (PC1, PC2, PC3):

    postscript("pc1vspc2.ps", horizontal=FALSE, onefile=TRUE)
    par(pty = "s")                                    # square plotting region
    plot(VD.pc$scores[,1], VD.pc$scores[,2],          # components 1 and 2
         ylim = range(VD.pc$scores[,1]),              # y range matched to PC1
         xlab = "PC1", ylab = "PC2", type = "n", lwd = 2)   # type="n": draw no points
    text(VD.pc$scores[,1], VD.pc$scores[,2],          # write city labels instead
         labels = abbreviate(row.names(D)), cex = 0.7, lwd = 2)
    dev.off()
    ## x axis: PC1, y axis: PC2
[Figure: PC1 vs. PC2 (label "Chcg" stands out)]
[Figure: PC1 vs. PC3 (label "Chcg" stands out)]
[Figure: PC2 vs. PC3 (label "Phnx" stands out)]
Contents of VD.pc:

    str(VD.pc)             # structure of VD.pc
    VD.pc$scores           # scores on all 6 components
    VD.pc$scores[,1:3]     # scores on the first 3 components

                        Comp.1      Comp.2      Comp.3
    Phoenix        2.440096802 -4.19114925 -0.94155229
    Little Rock    1.611599761  0.34248684 -0.83970812
    San Francisco  0.502073845 -2.25528717  0.22663991
    Denver         0.207434109 -1.96320936  1.26621359
    Hartford       0.219106349  0.97630584  0.59461329
    Wilmington     0.996140738  0.50074082  0.43332129
    Washington     0.022928417 -0.05456742 -0.35387289
    Jacksonville   1.227849872  0.84912801 -1.87611109
    Miami          1.533160553  1.40469861 -2.60660585
    Atlanta        0.598994755  0.58723563 -0.99541128
    ...
    Charleston     1.429753320  1.21058365 -0.07944350
    Milwaukee     -1.391024518  0.15761872  1.69127813
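Relation between the 3 principal components and SO2:

    postscript("3pcvsso2.ps", horizontal=FALSE,
               height=5.2, width=6, onefile=TRUE)
    par(mfrow = c(1,3))    # 3 plots side by side
    plot(VD.pc$scores[,1], D$SO2, xlab = "PC1", ylab = "SO2")
    plot(VD.pc$scores[,2], D$SO2, xlab = "PC2", ylab = "SO2")
    plot(VD.pc$scores[,3], D$SO2, xlab = "PC3", ylab = "SO2")
    ## x axis: principal component scores from VD.pc
    ## y axis: SO2, taken from D because VD <- D[,-1] no longer contains
    ##         the SO2 column (VD[,1] would be Neg.Temp)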
Result: of the three components, PC1 is the one that appears related to SO2.
    ## regression of SO2 on PC1, PC2, PC3
    pclm <- lm(D$SO2 ~ VD.pc$scores[,1] + VD.pc$scores[,2] + VD.pc$scores[,3])
    summary(pclm)

    Residuals:
        Min      1Q  Median      3Q     Max
    -36.420 -10.981  -3.184  12.087  61.273

    Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
    (Intercept)          30.049      2.907  10.336 1.85e-12 ***
    VD.pc$scores[, 1]    -9.942      1.962  -5.068 1.14e-05 ***
    VD.pc$scores[, 2]     2.240      2.374   0.943    0.352
    VD.pc$scores[, 3]    -0.375      2.462  -0.152    0.880
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 18.62 on 37 degrees of freedom
    Multiple R-squared:  0.4182,    Adjusted R-squared:  0.371
    F-statistic: 8.866 on 3 and 37 DF,  p-value: 0.0001473
Example 1: correlation matrix of 6 variables

    1.00  0.83  0.78  0.70  0.66  0.63
    0.83  1.00  0.67  0.67  0.65  0.57
    0.78  0.67  1.00  0.64  0.54  0.51
    0.70  0.67  0.64  1.00  0.45  0.51
    0.66  0.65  0.54  0.45  1.00  0.40
    0.63  0.57  0.51  0.51  0.40  1.00
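Factor analysis model: the p observed variables $z_1, z_2, \dots, z_p$ (p standardized variables) are written in terms of r common factors:

\[ z_1 = a_{11} f_1 + a_{12} f_2 + \dots + a_{1r} f_r + u_1 v_1 \]
\[ z_2 = a_{21} f_1 + a_{22} f_2 + \dots + a_{2r} f_r + u_2 v_2 \]
\[ \vdots \]
\[ z_p = a_{p1} f_1 + a_{p2} f_2 + \dots + a_{pr} f_r + u_p v_p \]

$f_1, f_2, \dots$: common factors
$v_1, v_2, \dots$: unique factors
$a_{ij}$: factor loadings
$h_j^2 = a_{j1}^2 + \dots + a_{jr}^2$: communality of variable j
$u_j$: uniqueness of variable j
$h_j^2 + u_j^2 = 1$ (because the variables are standardized)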
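Assumptions:

The common factors $f_1, f_2, \dots$ and the unique factors $v_i$ $(i = 1, 2, \dots)$ are standardized.
$v_i, v_j$ $(i \ne j)$ are uncorrelated.
$f_1, f_2, \dots$ and $v_1, v_2, \dots$ are uncorrelated.
With more than 1 common factor f: $f_i, f_j$ $(i \ne j)$ are uncorrelated (orthogonal factors).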
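Example: 10 subjects, 3 variables x, y, z.

Correlation matrix:

           x      y      z
    x  1.000  0.866  0.787
    y  0.866  1.000  0.753
    z  0.787  0.753  1.000

Data:

          x   y   z
     1   28  29  28
     2   18  23  18
     3   11  22  16
     4   21  23  22
     5   26  29  26
     6   20  23  22
     7   16  22  22
     8   14  23  24
     9   24  29  24
    10   22  27  24

Model with 1 common factor f:

\[ z_x = a f + u_x v_x, \qquad z_y = b f + u_y v_y, \qquad z_z = c f + u_z v_z \]

where $z_x, z_y, z_z$ are the standardized variables.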
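Solving for a, b, c: the model implies

\[ r_{xy} = ab, \qquad r_{yz} = bc, \qquad r_{xz} = ac \]

so that

\[ a = \sqrt{\frac{r_{xy}\, r_{xz}}{r_{yz}}} = 0.952, \qquad b = 0.910, \qquad c = 0.828 \]

The communalities are $a^2, b^2, c^2$, and the uniquenesses are

\[ u_x^2 = 0.094, \qquad u_y^2 = 0.172, \qquad u_z^2 = 0.314 \]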
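The one-factor solution can be reproduced in a few lines. This sketch uses the correlations rounded to three decimals as printed in the matrix above, so the third decimal place can differ slightly from the values quoted in the text (which were presumably computed from unrounded correlations). The name `c_load` is used because `c` is a built-in R function.

```r
# Correlations as printed (rounded to 3 decimals)
r_xy <- 0.866
r_xz <- 0.787
r_yz <- 0.753

# From r_xy = a b, r_xz = a c, r_yz = b c:
#   a^2 = r_xy * r_xz / r_yz, and similarly for b and c
a      <- sqrt(r_xy * r_xz / r_yz)
b      <- sqrt(r_xy * r_yz / r_xz)
c_load <- sqrt(r_xz * r_yz / r_xy)

loadings   <- c(x = a, y = b, z = c_load)
uniqueness <- 1 - loadings^2     # u^2 = 1 - h^2 with h^2 = loading^2

round(loadings, 3)
round(uniqueness, 3)
```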
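A second example:

           x     y     z
    x   1.00  0.83  0.78
    y   0.83  1.00  0.67
    z   0.78  0.67  1.00

\[ a = 1.2, \qquad b = 0.7, \qquad c = 0.5 \]
\[ u_x^2 = -0.44, \qquad u_y^2 = 0.51, \qquad u_z^2 = 0.75 \]

With a loading larger than 1, the uniqueness $u_x^2 = 1 - a^2$ becomes negative.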
For the 6-variable correlation matrix of Example 1:

    1.00  0.83  0.78  0.70  0.66  0.63
    0.83  1.00  0.67  0.67  0.65  0.57
    0.78  0.67  1.00  0.64  0.54  0.51
    0.70  0.67  0.64  1.00  0.45  0.51
    0.66  0.65  0.54  0.45  1.00  0.40
    0.63  0.57  0.51  0.51  0.40  1.00

the model is

\[ z_1 = a_{11} f_1 + a_{12} f_2 + \dots + a_{1r} f_r + u_1 v_1 \]
\[ z_2 = a_{21} f_1 + a_{22} f_2 + \dots + a_{2r} f_r + u_2 v_2 \]
\[ \vdots \]
\[ z_p = a_{p1} f_1 + a_{p2} f_2 + \dots + a_{pr} f_r + u_p v_p \]
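In matrix form:

\[ Z = AF + UV \]

For any rotation matrix $R$ acting on $F$:

\[ Z = (AR)(R^{-1}F) + UV \]

so the loadings are determined only up to a rotation by $R^{-1}$.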
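A small numerical illustration of this indeterminacy. The loading matrix `A`, the factor scores `Fs` and the rotation angle are hypothetical values chosen for the sketch; the point is only that `A %*% Fs` and `(A R)(R^{-1} Fs)` coincide, and that an orthogonal rotation leaves the communalities unchanged.

```r
set.seed(1)

# Hypothetical loadings (3 variables, 2 factors) and factor scores (5 cases)
A  <- matrix(c(0.9, 0.1,
               0.8, 0.3,
               0.2, 0.7), nrow = 3, byrow = TRUE)
Fs <- matrix(rnorm(2 * 5), nrow = 2)

# A 2 x 2 rotation matrix (angle chosen arbitrarily)
theta <- pi / 6
R <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)

common_original <- A %*% Fs                       # A F
common_rotated  <- (A %*% R) %*% (solve(R) %*% Fs)  # (A R)(R^{-1} F)

all.equal(common_original, common_rotated)   # same common part
rowSums(A^2)                                 # communalities before rotation
rowSums((A %*% R)^2)                         # and after rotation
```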
Example data (life expectancy by country):

                           m0  m25  m50  m75  w0  w25  w50  w75
    Algeria                63   51   30   13  67   54   34   15
    Cameroon               34   29   13    5  38   32   17    6
    Madagascar             38   30   17    7  38   34   20    7
    Mauritius              59   42   20    6  64   46   25    8
    Reunion                56   38   18    7  62   46   25   10
    Seychelles             62   44   24    7  69   50   28   14
    South Africa(C)        50   39   20    7  55   43   23    8
    South Africa(W)        65   44   22    7  72   50   27    9
    Tunisia                56   46   24   11  63   54   33   19
    ...
    United States (66)     67   45   23    8  74   51   28   10
    United States (NW66)   61   40   21   10  67   46   25   11
    United States (W66)    68   46   23    8  75   52   29   10
    United States (67)     67   45   23    8  74   51   28   10
    Argentina              65   46   24    9  71   51   28   10
    Chile                  59   43   23   10  66   49   27   12
    Columbia               58   44   24    9  62   47   25   10
    Ecuador                57   46   28    9  60   49   28   11
Model with 1 factor:

    life <- read.table("../data/chap4lifeexp.txt", header = T)   # read the data
    life.fa1 <- factanal(life, factors = 1, method = "mle")      # factor analysis with 1 factor
    life.fa1                                                     # show the result
Result with 1 factor:

    Call:
    factanal(x = life, factors = 1, method = "mle")

    Uniquenesses:                          # uniquenesses
       m0   m25   m50   m75    w0   w25   w50   w75
    0.238 0.470 0.399 0.696 0.217 0.005 0.117 0.532

    Loadings:                              # factor loadings
        Factor1                            # factor 1
    m0  0.873
    m25 0.728
    m50 0.776
    m75 0.552
    w0  0.885
    w25 0.998
    w50 0.940
    w75 0.684

                   Factor1
    SS loadings      5.329
    Proportion Var   0.666

    Test of the hypothesis that 1 factor is sufficient.   # note this test
    The chi square statistic is 163.11 on 20 degrees of freedom.
    The p-value is 1.88e-24                # p is tiny: 1 factor is not sufficient
    life.fa2 <- factanal(life, factors = 2, method = "mle")
    life.fa2
    life.fa3 <- factanal(life, factors = 3, method = "mle")
    life.fa3

    # 2 factors
    Test of the hypothesis that 2 factors are sufficient.
    The chi square statistic is 45.24 on 13 degrees of freedom.
    The p-value is 1.91e-05

    # 3 factors
    Test of the hypothesis that 3 factors are sufficient.
    The chi square statistic is 6.73 on 7 degrees of freedom.
    The p-value is 0.458      # p is large: 3 factors are not rejected
Result with 3 factors:

    Uniquenesses:
       m0   m25   m50   m75    w0   w25   w50   w75
    0.005 0.362 0.066 0.288 0.005 0.011 0.020 0.146

    Loadings:
        Factor1 Factor2 Factor3
    m0  0.964   0.122   0.226
    m25 0.646   0.169   0.438
    m50 0.430   0.354   0.790
    m75         0.525   0.656
    w0  0.970   0.217
    w25 0.764   0.556   0.310
    w50 0.536   0.729   0.401
    w75 0.156   0.867   0.280

                   Factor1 Factor2 Factor3
    SS loadings      3.375   2.082   1.640
    Proportion Var   0.422   0.260   0.205
    Cumulative Var   0.422   0.682   0.887

    Test of the hypothesis that 3 factors are sufficient.
    The chi square statistic is 6.73 on 7 degrees of freedom.
    The p-value is 0.458
Factor scores:

    scores <- factanal(life, factors = 3, method = "mle",
                       scores = "regression")$scores
    scores

                         Factor1     Factor2     Factor3
    Algeria         -0.258062561  1.90095771  1.91581631
    Cameroon        -2.782495791 -0.72340014 -1.84772224
    Madagascar      -2.806428187 -0.81158820 -0.01210318
    Mauritius        0.141004934 -0.29028454 -0.85862443
    Reunion         -0.196352142  0.47429917 -1.55046466
    Seychelles       0.367371307  0.82902375 -0.55214085
    South Africa(C) -1.028567629 -0.08065792 -0.65421971
    South Africa(W)  0.946193522  0.06400408 -0.91995289
    Tunisia         -0.862493550  3.59177195 -0.36442148
    Canada           1.245304248  0.29564122 -0.27342781
    ...
    plot(scores[,1], scores[,2], type = "n",
         xlab = "Factor 1", ylab = "Factor 2")
    text(scores[,1], scores[,2], labels = row.names(life),
         cex = 1.1, lwd = 2)
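Quantification method I

    X1   X2   Y
    Med  G2    9.58
    Med  G3    6.29
    Med  G4   12.35
    Hi   G1    9.12
    Hi   G2   13.84
    Lo   G3    1.63
    Hi   G4   15.37
    Hi   G2   13.45

8 observations with 2 explanatory variables X1, X2. X1 takes the 3 values Lo, Med, Hi; X2 takes the 4 values G1 to G4. Y is a numeric value.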
Med, G2 9.58 I X1, X2, I R I
X = Med G2 < G3 < G4 Lo < Med < Hi Lo G4 Y I
R script:

    D <- read.table("numerizei.txt", header=TRUE)   # read the data
    ## explain Y by X1 and X2 with a linear model (lm)
    result <- lm(Y ~ X1 + X2, data = D)
    summary(result)     # show the results
    predict(result)     # predicted values
Output of summary:

    Residuals:
             1          2          3          4          5
    -2.986e-01  2.776e-17  2.986e-01  1.665e-16  3.443e-01
             6          7          8
     2.776e-17 -2.986e-01 -4.571e-02
    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   9.1200     0.4405  20.704  0.00232 **
    X1Lo         -8.2771     0.7446 -11.117  0.00799 **
    X1Med        -3.6171     0.4078  -8.870  0.01247 *
    X2G2          4.3757     0.5265   8.311  0.01417 *
    X2G3          0.7871     0.7446   1.057  0.40126
    X2G4          6.5486     0.5767  11.355  0.00767 **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.4405 on 2 degrees of freedom
    Multiple R-squared:  0.9973,    Adjusted R-squared:  0.9907
    F-statistic: 149.5 on 5 and 2 DF,  p-value: 0.006657
Output of predict:

            1         2         3         4         5
     9.878571  6.290000 12.051429  9.120000 13.495714
            6         7         8
     1.630000 15.668571 13.495714

For example, the predicted value for (Med, G2) is 9.88.
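The analysis can be reproduced without the data file by typing in the 8 rows of the table. The sketch below also prints the 0/1 dummy variables that `lm()` builds from the categorical variables (with R's default treatment coding, Hi and G1 act as baseline levels, which is why the intercept equals the value for (Hi, G1)).

```r
# The 8 observations from the table, entered by hand
D <- data.frame(
  X1 = factor(c("Med", "Med", "Med", "Hi", "Hi", "Lo", "Hi", "Hi")),
  X2 = factor(c("G2", "G3", "G4", "G1", "G2", "G3", "G4", "G2")),
  Y  = c(9.58, 6.29, 12.35, 9.12, 13.84, 1.63, 15.37, 13.45)
)

# Dummy (0/1) variables generated for the categories
model.matrix(~ X1 + X2, data = D)

result <- lm(Y ~ X1 + X2, data = D)
coef(result)      # category scores relative to the baseline (Hi, G1)
predict(result)   # fitted value for each of the 8 observations
```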
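Distance between 2 points $(x_1, y_1), (x_2, y_2)$ in the plane:

\[ d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \]

Distance between 2 points $(x_1, y_1, z_1), (x_2, y_2, z_2)$ in space:

\[ d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} \]

In n dimensions, given the vectors $x_1, x_2, \dots, x_p$, the distance between $x_1$ and $x_2$ is

\[ d = \sqrt{(x_{1,1} - x_{1,2})^2 + (x_{2,1} - x_{2,2})^2 + \dots + (x_{n,1} - x_{n,2})^2} \]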
\[ d = \|x_1 - x_2\| \]

(the norm of the difference of the 2 vectors)
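As a sketch of how these distances feed into cluster analysis, the code below takes four rows of the life expectancy table (Algeria, Cameroon, Madagascar, Tunisia; the selection is only for illustration), computes one distance by hand, and compares it with R's `dist()`. The small `hclust()` call at the end is likewise illustrative.

```r
# Four rows copied from the life expectancy table
life4 <- rbind(
  Algeria    = c(63, 51, 30, 13, 67, 54, 34, 15),
  Cameroon   = c(34, 29, 13,  5, 38, 32, 17,  6),
  Madagascar = c(38, 30, 17,  7, 38, 34, 20,  7),
  Tunisia    = c(56, 46, 24, 11, 63, 54, 33, 19)
)
colnames(life4) <- c("m0", "m25", "m50", "m75", "w0", "w25", "w50", "w75")

# Euclidean distance by hand: norm of the difference vector
d_manual <- sqrt(sum((life4["Algeria", ] - life4["Cameroon", ])^2))

# dist() returns all pairwise Euclidean distances
d_all <- dist(life4)
d_manual
as.matrix(d_all)["Algeria", "Cameroon"]

# Complete-linkage clustering on the distances, cut into 2 groups
groups <- cutree(hclust(d_all, method = "complete"), k = 2)
groups
```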
Example data for cluster analysis (life expectancy):

                      m0  m25  m50  m75  w0  w25  w50  w75
    Algeria           63   51   30   13  67   54   34   15
    Cameroon          34   29   13    5  38   32   17    6
    Madagascar        38   30   17    7  38   34   20    7
    Mauritius         59   42   20    6  64   46   25    8
    Reunion           56   38   18    7  62   46   25   10
    Seychelles        62   44   24    7  69   50   28   14
    South Africa(B)   50   39   20    7  55   43   23    8
    South Africa(W)   65   44   22    7  72   50   27    9
    Tunisia           56   46   24   11  63   54   33   19
    Canada            69   47   24    8  75   53   29   10
    Costa Rica        65   48   26    9  68   50   27   10
    Dominican Rep     64   50   28   11  66   51   29   11
    ...
    Ecuador           57   46   28    9  60   49   28   11

31 countries. mxx, wxx: life expectancy (years) of men and women at age xx.
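R script:

    ## read the data into life
    life <- read.table("lifeexp.txt", header=TRUE)
    ## country names go into country
    country <- row.names(life)
    ## distance matrix
    dist <- dist(life)
    postscript("lifeexp.ps", horizontal=FALSE,
               width=7, height=7, onefile=TRUE)
    ## dendrogram from hierarchical clustering
    plot(hclust(dist, method = "complete"),
         labels = country,      # label the leaves with the country names
         xlab = " ", ylab = " ", main = " ")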
Discriminant analysis. Data file TibetScull.txt (skull measurements and Type):

        Length  Breadth  Height  Fheight  Fbreadth  Type
     1  190.5   152.5    145     73.5     136.5     1
     2  172.5   132      125.5   63       121       1
     3  167     130      125.5   69.5     119.5     1
     4  169.5   150.5    133.5   64.5     128       1
     5  175     138.5    126     77.5     135.5     1
    ...
    19  179.5   135      128.5   74       132       2
    20  191     140.5    140.5   72.5     131.5     2
    21  184.5   141.5    134.5   76.5     141.5     2
    ...
    31  197     131.5    135     80.5     139       2
    32  182.5   131      135     68.5     136       2
New data to classify, NewScull.txt:

       Length  Breadth  Height  Fheight  Fbreadth
    A  171.0   140.5    127.0   69.5     137.0
    B  179.0   132.0    140.0   72.0     138.5

    library(MASS)     # lda() is in the MASS package
    DT <- read.table("TibetScull.txt", header=TRUE)
    dis <- lda(Type ~ Length + Breadth + Height + Fheight + Fbreadth,
               data = DT, prior = c(0.5,0.5))     # linear discriminant analysis
    ## classify the new skulls, i.e. predict their Type
    newscull <- read.table("NewScull.txt", header=TRUE)
    predict(dis, newdata = newscull)              # predicted class and posterior probabilities
Result: the probabilities that A and B belong to Type 1 are 0.755 and 0.174.

    $class
    [1] 1 2
    Levels: 1 2

    $posterior
              1         2
    A 0.7545066 0.2454934
    B 0.1741016 0.8258984
Test of independence (contingency table). Data file Musicchoice.txt:

    39  45  21
    83  68  47
    53  51  65
    41  32  55

χ² test:

    ## read the table into MData
    MData <- read.table("Musicchoice.txt", header=TRUE)
    ## chi-squared test
    chisq.test(MData)
            Pearson's Chi-squared test

    data:  MData
    X-squared = 25.8888, df = 6, p-value = 0.0002335

p = 0.00024. The χ² value of 25.9 with 6 degrees of freedom (df) exceeds the upper 0.5% point of the distribution.

Upper percentage points of the χ² distribution:

    α        0.995  0.975  0.950  0.900  0.500  0.05   0.025  0.01   0.005
    ν = 6    0.676  1.24   1.64   2.20   5.35   12.59  14.45  16.81  18.55
    ν = 9    1.73   2.70   3.33   4.17   8.34   16.92  19.02  21.67  23.59
    ν = 10   2.16   3.25   3.94   4.87   9.34   18.31  20.48  23.21  25.19
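The test can be reproduced by entering the 12 counts directly. The sketch assumes the counts form a table of 4 rows and 3 columns read row by row; the row and column labels of the original file are not available, so none are used. This layout is consistent with df = 6, and the statistic computed from the expected counts agrees with the reported X-squared.

```r
# The 12 counts, assumed to form a 4 x 3 table read row by row
music <- matrix(c(39, 45, 21,
                  83, 68, 47,
                  53, 51, 65,
                  41, 32, 55), nrow = 4, byrow = TRUE)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(music), colSums(music)) / sum(music)
chi2     <- sum((music - expected)^2 / expected)
df       <- (nrow(music) - 1) * (ncol(music) - 1)

chi2                                  # statistic computed by hand
pchisq(chi2, df, lower.tail = FALSE)  # p-value
qchisq(1 - 0.005, df)                 # upper 0.5% point for this df

res <- chisq.test(music)              # the same test with the built-in function
res
```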
References

[1] CRAN (http://cran.r-project.org), R
[2] R ABC, A. (2010), R
[3] R, S-PLUS, B. (2007)
[4] R (2009), R
[5] (1985)
[6] (2012, 2015) http://ruby.kyoto-wu.ac.jp/~konami/text/