
Vol. 29, No. 2, 125-139 (2008)

Introduction of FDR and Comparisons of Multiple Testing Procedures that Control It

Shin-ichi Matsuda
Department of Information Systems and Mathematical Sciences, Faculty of Mathematical Sciences and Information Engineering, Nanzan University
e-mail: matsu@nanzan-u.ac.jp

In this paper, we introduce the definition of the FDR (false discovery rate), which has attracted considerable attention as a new concept for handling multiplicity, and explain its properties. We then list multiple testing procedures that control the FDR: the Benjamini-Hochberg procedure (linear step-up procedure), the adaptive Benjamini-Hochberg procedure, the Benjamini-Yekutieli procedure, Storey's procedure, the two-stage linear step-up procedure, and the Student-Newman-Keuls procedure. In addition, we review comparisons among them in order to consider which procedure is best to use. As a result, we show by Monte Carlo simulation that, in the case of dependent test statistics, the Benjamini-Hochberg procedure is conservative and the two-stage linear step-up procedure is useful.

Key words: false discovery rate; FWER (familywise error rate); multiple comparisons; linear step-up procedure; two-stage procedure; Monte-Carlo simulation.

Received August 2008. Revised September 2008. Accepted September 2008.

1. [...]

[...] FDR (false discovery rate) [...] (1997) [...]

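2. [...]

[...] R. A. Fisher [...] With $n$ observations in each group, group means $\bar{x}_i$ and an estimate $\hat{\sigma}^2$ of the error variance, the difference between two group means, $\bar{x}_i - \bar{x}_j$, has standard error $\sqrt{2\hat{\sigma}^2/n}$ [...]

3. [...]

[...] Fisher [...] $t$-test [...] protected LSD (least significant difference) [...] J. W. Tukey [...] FWE (familywise error) [...] (Hochberg and Tamhane (1987); (1997)) [...] FWE (note 3) [...] FWER (familywise error rate) (note 4) [...]

Consider four groups. The overall null hypothesis is

\[ H_0 : \mu_1 = \mu_2 = \mu_3 = \mu_4 , \]

and the null hypotheses for pairs of groups are

\[ H_{01} : \mu_1 = \mu_2, \quad H_{02} : \mu_1 = \mu_3, \quad H_{03} : \mu_1 = \mu_4, \quad H_{04} : \mu_2 = \mu_3, \quad H_{05} : \mu_2 = \mu_4, \quad H_{06} : \mu_3 = \mu_4 , \]

six in all, $H_{01}$ to $H_{06}$. [...] $H_0$ [...] Fisher's protected LSD [...] FWER [...] $H_{01}$ [...] $H_{06}$ [...] $H_0$ [...] $H_{01}$ [...] $H_{06}$ [...]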

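[...] protected LSD [...] $1 - (1 - \alpha)^2$, which for $\alpha = 0.05$ equals 0.0975 [...] FWER [...]

4. [...]

[...] FWE [...] FWER [...] FDR [...] FWER [...] (Benjamini and Hochberg (1995); Benjamini et al. (2005)) [...]

5. FDR

[...] Benjamini and Hochberg (1995) [...] FDR [...]

5.1 FDR [...]

Suppose that $m$ null hypotheses are tested, of which $m_0$ are true and $m_1$ are false, so that $m = m_0 + m_1$. Let $R$ be the number of rejected hypotheses. The outcomes are classified as in Table 1. Only $R$ is observable; $U$, $V$, $S$ and $T$ are unobservable random variables. In this notation the FWER is $P(V \ge 1)$.

Table 1. [...] $m$ [...]

                           Not rejected   Rejected   Total
True null hypotheses       U              V          m_0
False null hypotheses      T              S          m - m_0
Total                      m - R          R          m

The proportion of falsely rejected hypotheses among the rejected ones is

\[ Q = \frac{V}{R} , \]

with $Q = 0$ when $R = 0$. Benjamini and Hochberg (1995) consider the expectation $Q_e$ of $Q$:

\[ Q_e = E(Q) = E\left( \frac{V}{R} \right) . \]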

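This $Q_e$ is the FDR [...] (note 5).

5.2 FDR [...] FWER

[...] FWER [...] FDR [...] Benjamini and Hochberg (1995) [...]

1. [When all null hypotheses are true,] the FDR equals the FWER.
2. When $m_0 < m$, the FDR is less than or equal to the FWER.

Hence a procedure that controls the FWER also controls the FDR [...] FDR [...] FWER [...] FDR [...] FDR [...]

5.3 FDR [...]

[...] FWER [...] FDR [...] Benjamini and Hochberg (1995) [...] $q$ [...] FDR [...] FWER [...] 0.05 [...] $q$ [...] 0.01 [...] 0.1 [...] $q$ [...]

6. FDR [...]

[...] FDR [...] three [...] (Benjamini et al. (2005)):

1. [...] p [...]
2. [...] p [...]
3. [...]

[...] 1 [...] the BH procedure of Benjamini and Hochberg (1995) and the BY procedure of Benjamini and Yekutieli (2001) [...] 2 [...] BH [...] BH [...] 3 [...] BH [...] 2 [...]

[...] 1: BH and BY; 2: Adaptive BH and Storey; 3: [...] 2 [...] and SNK.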

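6.1 BH procedure

Let $P_1, P_2, \ldots, P_m$ be the p-values for the null hypotheses $H_1, H_2, \ldots, H_m$, let $P_{(1)} \le P_{(2)} \le \cdots \le P_{(m)}$ be the ordered p-values, and let $H_{(i)}$ be the null hypothesis corresponding to $P_{(i)}$. The BH procedure (Benjamini and Hochberg (1995); (2006)) is as follows.

Step 1. Set $i = m$.
Step 2. If $P_{(i)} \le \frac{i}{m} q$, set $k = i$ and go to Step 3. Otherwise decrease $i$ by 1 and repeat Step 2. If the inequality fails even for $i = 1$, no hypothesis is rejected.
Step 3. Reject $H_{(i)}$; $i = 1, 2, \ldots, k$.

The BH procedure controls the FDR [...] BH [...] p [...] It is also called the linear step-up procedure.

6.2 BY procedure

[...] BH [...] The BY procedure (Benjamini and Yekutieli (2001); (2006)) is as follows.

Step 1. Compute

\[ q' = q \Big/ \sum_{j=1}^{m} \frac{1}{j} \]

and use $q'$ in place of $q$ in the BH procedure.
Step 2. Set $i = m$.
Step 3. If $P_{(i)} \le \frac{i}{m} q'$, set $k = i$ and go to Step 4. Otherwise decrease $i$ by 1 and repeat Step 3. If the inequality fails even for $i = 1$, no hypothesis is rejected.
Step 4. Reject $H_{(i)}$; $i = 1, 2, \ldots, k$.

The BY procedure controls the FDR [...]

6.3 Adaptive BH procedure

[...] BH [...] The Adaptive BH procedure (ABH; Benjamini and Hochberg (2000); (2006)) is as follows.

Step 1. Apply the BH procedure at level $q$. If no hypothesis is rejected, stop; otherwise go to Step 2.
Step 2. Compute

\[ S_i = \frac{1 - P_{(i)}}{m + 1 - i} \quad (i = 1, 2, \ldots, m) . \]

Step 3. Set $i = 2$.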

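Step 4. If $S_i \ge S_{i-1}$, replace $i$ by $i + 1$ and repeat Step 4.
Step 5. If $S_i < S_{i-1}$, set $S^* = S_i$. If $S_i \ge S_{i-1}$ [...] $S_m$ [...] $S^*$.
Step 6. Compute

\[ \hat{m}_0 = \min\left( \left[ \frac{1}{S^*} + 1 \right], m \right) , \]

where $[\,\cdot\,]$ denotes the integer part.
Step 7. Set $i = m$.
Step 8. If $P_{(i)} \le \frac{i}{\hat{m}_0} q$, set $k = i$ and go to Step 9. Otherwise decrease $i$ by 1 and repeat Step 8. Since $\hat{m}_0 \le m$, such a $k$ exists.
Step 9. Reject $H_{(i)}$; $i = 1, 2, \ldots, k$.

The ABH procedure controls the FDR [...]

6.4 Storey's procedure

[...] Storey et al. (2004) [...] Storey (2002) [...] ST [...] With a tuning parameter $\lambda$, [...] $t_q(\widehat{FDR}_\lambda)$ [...] p [...] two [...]:

1. Rejecting the hypotheses whose p-values are at most $t_q(\widehat{FDR}_{\lambda=0})$ is equivalent to the BH procedure.
2. Rejecting the hypotheses whose p-values are at most $t_q(\widehat{FDR}_\lambda)$ is equivalent to the BH procedure with $m$ replaced by $\hat{\pi}_0(\lambda) m$ (note 6).

[...] ST [...] BH [...] Storey et al. (2004) [...] MST. Storey et al. (2004) [...] ST with $\lambda = 0.5$ [...] BH [...] $q$ [...] $\lambda$ [...] ST [...] $q$ [...] Storey et al. (2004) [...] $\lambda$ [...] $q$ [...] Storey [...] Storey (2008) (note 7).

6.4.1 Storey [...]

Let $P_1, P_2, \ldots, P_m$ be the p-values. For a threshold $t$ ($0 \le t \le 1$), define

\[ V(t) = \#\{\text{true null } P_i : P_i \le t\}, \quad S(t) = \#\{\text{false null } P_i : P_i \le t\}, \quad R(t) = V(t) + S(t) . \]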

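The FDR at threshold $t$ is

\[ FDR(t) = E\left[ \frac{V(t)}{\max(R(t), 1)} \right] . \]

Storey (2002) [...] with a tuning parameter $\lambda$, the estimator $\widehat{FDR}_\lambda(t)$ of $FDR(t)$:

\[ \widehat{FDR}_\lambda(t) = \frac{\hat{\pi}_0(\lambda)\, t}{\max(R(t), 1)/m} , \]

where $\hat{\pi}_0(\lambda)$ is an estimator of $\pi_0 = m_0/m$:

\[ \hat{\pi}_0(\lambda) = \frac{m - R(\lambda)}{(1 - \lambda) m} . \]

For a function $F$ on $[0, 1]$, the threshold $t_q(F)$ is defined by

\[ t_q(F) = \sup\{ 0 \le t \le 1 : F(t) \le q \} . \]

Rejecting the hypotheses whose p-values are at most $t_q(\widehat{FDR}_\lambda)$ is the ST procedure. MST [...] two [...]:

1. [...] $\hat{\pi}_0(\lambda)$ [...] 1 [...]:

\[ \hat{\pi}_0^*(\lambda) = \frac{m + 1 - R(\lambda)}{(1 - \lambda) m} . \]

2. [...] $\widehat{FDR}_\lambda(t)$ [...]:

\[ \widehat{FDR}_\lambda^*(t) = \begin{cases} \dfrac{\hat{\pi}_0^*(\lambda)\, t}{\max(R(t), 1)/m} & (t \le \lambda) \\ 1 & (t > \lambda) \end{cases} \]

6.5 [...] BH [...] 2 [...]

The two-stage linear step-up procedure (TST; Benjamini et al. (2005)) is as follows.

Step 1. Apply the BH procedure at level $q' = q/(1 + q)$ and let $r_1$ be the number of rejected hypotheses. If $r_1 = 0$, reject no hypothesis and stop; if $r_1 = m$, reject all hypotheses and stop; otherwise go to Step 2.
Step 2. Set $\hat{m}_0 = m - r_1$.
Step 3. Apply the BH procedure at level $q' m/\hat{m}_0$.

[...] FDR [...]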
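The procedures of Sections 6.1, 6.2, 6.4 and 6.5 all reduce to one linear step-up rule with a different level or a different effective number of hypotheses, which the following Python sketch tries to make explicit. It is an illustration under stated assumptions, not the author's implementation: the function names (`step_up_count`, `bh`, `by`, `storey_pi0`, `st`, `tst`) are hypothetical, only NumPy is assumed, ST is coded through the equivalence stated in Section 6.4 (BH with $m$ replaced by $\hat{\pi}_0(\lambda) m$), and the p-values in the usage lines are made up.

```python
import numpy as np

def step_up_count(pvals, q, m_eff=None):
    """Number k of rejections: the largest i with P_(i) <= (i / m_eff) * q, else 0."""
    p = np.sort(np.asarray(pvals, dtype=float))
    m = p.size
    if m_eff is None:
        m_eff = m                      # plain BH uses the number of hypotheses
    i = np.arange(1, m + 1)
    ok = np.nonzero(p <= i / m_eff * q)[0]
    return int(ok[-1] + 1) if ok.size else 0

def bh(pvals, q):
    """BH (linear step-up) procedure at level q."""
    return step_up_count(pvals, q)

def by(pvals, q):
    """BY procedure: BH at level q' = q / sum_{j=1}^{m} 1/j."""
    m = len(pvals)
    q_prime = q / np.sum(1.0 / np.arange(1, m + 1))
    return step_up_count(pvals, q_prime)

def storey_pi0(pvals, lam=0.5):
    """Estimate of pi_0: (m - R(lam)) / ((1 - lam) * m)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    return (m - np.count_nonzero(p <= lam)) / ((1.0 - lam) * m)

def st(pvals, q, lam=0.5):
    """ST via its BH form: m is replaced by pi0_hat(lam) * m."""
    m = len(pvals)
    m_eff = storey_pi0(pvals, lam) * m
    if m_eff == 0:                     # FDR estimate is 0 for every t: reject all
        return m
    return step_up_count(pvals, q, m_eff)

def tst(pvals, q):
    """Two-stage linear step-up procedure."""
    m = len(pvals)
    q_prime = q / (1.0 + q)
    r1 = step_up_count(pvals, q_prime)            # Step 1
    if r1 == 0 or r1 == m:
        return r1
    m0_hat = m - r1                               # Step 2
    return step_up_count(pvals, q_prime * m / m0_hat)  # Step 3

# Made-up p-values, for illustration only.
example = [0.001, 0.008, 0.039, 0.041, 0.6]
print(bh(example, 0.05), by(example, 0.05), st(example, 0.05), tst(example, 0.05))
```

Each function returns the number $k$ of rejected hypotheses; the rejected hypotheses are then $H_{(1)}, \ldots, H_{(k)}$. The ABH procedure of Section 6.3 is left out to keep the sketch short.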

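6.6 SNK procedure

[...] The Student-Newman-Keuls (SNK) procedure [...] FDR [...] Oehlert (2000) [...] BH [...] Tukey-Welsch [...] FWER [...] FDR [...] SNK [...] Tukey-Welsch [...] SNK [...] $\alpha$ [...] FWER [...] $F$ [...] (1997).

7. [...]

[...] Table 2 [...] 18 [...] 20 p-values (note 8).

Table 2. [...] 18 [...] p-values

6.18 × 10^-6    4.02 × 10^-5    0.0966          0.9030          0.0151
4.17 × 10^-6    0.0129          0.0047          0.4400          1.19 × 10^-13
0.1171          0.0356          0.0018          0.0013          0.0092
1.09 × 10^-7    6.02 × 10^-8    0.3284          0.0608          0.1202

[...] 18 [...] 20 p [...] 4 [...] Table 2 [...] p-values [...] FDR [...] BH, ABH, ST with $\lambda = 0.5$, and TST are applied with $q = 0.05$.

7.1 BH procedure

The p-values in ascending order are:

P_(1) = 1.19 × 10^-13    P_(2) = 6.02 × 10^-8    P_(3) = 1.09 × 10^-7    P_(4) = 4.17 × 10^-6
P_(5) = 6.18 × 10^-6     P_(6) = 4.02 × 10^-5    P_(7) = 0.0013          P_(8) = 0.0018
P_(9) = 0.0047           P_(10) = 0.0092         P_(11) = 0.0129         P_(12) = 0.0151
P_(13) = 0.0356          P_(14) = 0.0608         P_(15) = 0.0966         P_(16) = 0.1171
P_(17) = 0.1202          P_(18) = 0.3284         P_(19) = 0.4400         P_(20) = 0.9030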
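The linear step-up rule of Section 6.1 can be run on the sorted p-values above with a short loop that follows Steps 1 to 3 literally. This is a minimal Python sketch; the names `p_sorted` and `bh_k` are hypothetical, and $q = 0.05$ is the level stated for this section.

```python
# Sorted p-values P_(1), ..., P_(20) as listed above.
p_sorted = [1.19e-13, 6.02e-8, 1.09e-7, 4.17e-6, 6.18e-6, 4.02e-5,
            0.0013, 0.0018, 0.0047, 0.0092, 0.0129, 0.0151, 0.0356,
            0.0608, 0.0966, 0.1171, 0.1202, 0.3284, 0.4400, 0.9030]
q = 0.05

def bh_k(p_sorted, q):
    """Return k of the BH procedure (0 if no hypothesis is rejected)."""
    m = len(p_sorted)
    i = m                                   # Step 1: start from i = m
    while i >= 1:                           # Step 2: compare P_(i) with (i/m) q
        if p_sorted[i - 1] <= i / m * q:
            return i                        # k = i, go to Step 3
        i -= 1                              # otherwise decrease i by 1
    return 0

k = bh_k(p_sorted, q)
print(k)  # 12: H_(1), ..., H_(12) are rejected (Step 3)
```

The loop stops at the first $i$, counted down from $m$, whose p-value lies under the line $(i/m)q$; smaller p-values above the line at lower ranks do not matter, which is what makes the rule a step-up procedure.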

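Step 1. $i = 20$.
Step 2. $P_{(20)} = 0.9030 > \frac{20}{20} q = 0.05$, so $i = 19$ and we return to Step 2. $P_{(19)} = 0.4400 > \frac{19}{20} q = 0.0475$, so $i = 18$ and we return to Step 2. [...] $i = 13$ [...] $P_{(12)} = 0.0151 \le \frac{12}{20} q = 0.0300$, so $k = 12$ and we go to Step 3.
Step 3. Reject $H_{(i)}$; $i = 1, 2, \ldots, 12$. Thus 12 hypotheses are rejected.

7.2 ABH procedure

Step 1. [...] BH [...] go to Step 2.
Step 2.

\[ S_1 = \frac{1 - P_{(1)}}{20 + 1 - 1} = 0.05, \quad S_2 = \frac{1 - P_{(2)}}{20 + 1 - 2} = 0.0526, \]

$S_3 = 0.0556$, $S_4 = 0.0588$, $S_5 = 0.0625$, $S_6 = 0.0667$, $S_7 = 0.0713$, $S_8 = 0.0767$, $S_9 = 0.0829$, $S_{10} = 0.0901$, $S_{11} = 0.0987$, $S_{12} = 0.1094$, $S_{13} = 0.1206$, $S_{14} = 0.1342$, $S_{15} = 0.1506$, $S_{16} = 0.1766$, $S_{17} = 0.2200$, $S_{18} = 0.2239$, $S_{19} = 0.2800$, $S_{20} = 0.0970$.
Step 3. $i = 2$.
Step 4. $S_2 \ge S_1$, so $i = 3$ and we return to Step 4. [...] $i = 19$ [...]
Step 5. $S_{20} < S_{19}$, so $S^* = S_{20}$.
Step 6. $\hat{m}_0 = \min\left( \left[ \frac{1}{S^*} + 1 \right], 20 \right) = 11$.
Step 7. $i = 20$.
Step 8. $P_{(20)} = 0.9030 > \frac{20}{\hat{m}_0} q = 0.0909$, so $i = 19$ and we return to Step 8. [...] $i = 15$ [...] For $i = 14$, $P_{(14)} = 0.0608 \le 0.0636$, so $k = 14$ and we go to Step 9.
Step 9. Reject $H_{(i)}$; $i = 1, 2, \ldots, 14$. [...] BH [...] 12 [...]

7.3 ST with $\lambda = 0.5$

\[ \hat{\pi}_0(0.5) = \frac{20 - R(0.5)}{(1 - 0.5) \times 20} = \frac{1}{10} = 0.1 , \]

so $m$ is replaced by $\hat{\pi}_0(0.5) \times 20 = 2$ in the BH procedure.

Step 1. $i = 20$.
Step 2. $P_{(20)} = 0.9030 > \frac{20}{2} q = 0.5$, so $i = 19$ and we return to Step 2.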

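$P_{(19)} = 0.4400 \le \frac{19}{2} q = 0.475$, so $k = 19$ and we go to Step 3.
Step 3. Reject $H_{(i)}$; $i = 1, 2, \ldots, 19$.

[...] Storey [...] ST [...] $\hat{\pi}_0 = 0.091$ [...] ST [...] $\hat{\pi}_0 = 0.318$ [...] 17 [...]

7.4 TST

Step 1. $q' = q/(1 + q) = 0.04762$. Applying the BH procedure at this level, $P_{(13)} = 0.0356 > \frac{13}{20} q' = 0.03095$ and $P_{(12)} = 0.0151 \le \frac{12}{20} q' = 0.02857$, so $r_1 = 12$ and we go to Step 2.
Step 2. $\hat{m}_0 = 20 - r_1 = 8$.
Step 3. $q' \times \frac{20}{\hat{m}_0} = 0.11905$. Applying the BH procedure at this level, $P_{(15)} = 0.0966 > \frac{15}{20} \times 0.11905 = 0.08929$ and $P_{(14)} = 0.0608 \le \frac{14}{20} \times 0.11905 = 0.08333$, so $H_{(i)}$; $i = 1, 2, \ldots, 14$ are rejected. [...] ABH [...] 14 [...]

7.5 [...]

[...] (note 9) [...] BH [...] p [...] $q$ [...] ABH [...] TST [...] 0.0608 [...] p [...] FDR [...] FDR [...] p [...] The Bonferroni procedure compares every p-value with $0.05/20 = 0.0025$ and the Holm procedure compares $P_{(i)}$ with $0.05/(m - i + 1)$; each rejects 8 hypotheses. [...] FDR [...] ST with $\lambda = 0.5$ [...] $\lambda$ [...] ST [...]

8. [...]

[...] Benjamini et al. (2005) [...] $\lambda$ [...] MST [...] $\lambda$ [...] MST [...] ABH [...] FDR [...] TST [...] (2006) [...] BH, ABH, BY [...] FDR [...] $q = 0.05$ [...]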

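[...] 116 [...] 10000 [...] (2006) [...]

1. [...]
2. [...]
3. [...] 1 [...]
4. [...] 1 [...]
5. [...] 1 [...]
6. [...] 1 [...]
7. [...] p [...]
8. [...] 3 [...] 5 [...]
9. [...] 4 [...] 5 [...]
10. [...] 6 [...] 5 [...]

[...] Table 3 [...] BH [...] FDR [...] Benjamini and Hochberg (1995) [...] FDR [...] ABH [...] 2 [...] 0.5 [...] 5 [...] FDR [...] BY [...] FDR [...] FWER [...]

Table 3. [...] FDR [...]

          BH         ABH        BY
[...]     0.00495    0.00542    0.00137
[...]     0.04353    0.11152    0.01088

8.1 TST

[...] TST [...] (2006) [...] Table 4 [...] ABH [...] FDR [...] 3 [...] TST [...] FDR [...]

8.2 [...] FDR [...]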

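Table 4. [...] TST [...] FDR [...]

                               BH         ABH        BY         TST
2: 0.5, [...], 20 [...] 15     0.03692    0.05463    0.01001    0.04393
2: 0.9, [...], 20 [...] 15     0.02669    0.07553    0.00780    0.03500
2: 0.5, [...], 20 [...] 15     0.03424    0.09708    0.00899    0.04611
2: 0.9, [...], 20 [...] 15     0.02811    0.11152    0.00819    0.03552
3: [...] 20 [...] 10           0.01473    0.01455    0.00416    0.01570
5: [...] 20 [...] 10           0.02284    0.08829    0.00629    0.04770

1. BH [...]
2. ABH [...]
3. ST with $\lambda = 0.5$ [...]
4. ST with $\lambda$ [...]
5. TST [...]

[...] BH [...] ABH [...] BH [...] $q$ [...] ST with $\lambda = 0.5$ [...] 2 [...] $q$ [...] $\lambda$ [...] ST [...] BH [...] TST [...] Benjamini et al. (2005) [...] BH [...] $\lambda$ [...] ST [...] TST [...] FDR [...]

9. [...]

[...] FDR [...] FDR [...] FDR [...] FDR [...] FDR [...]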

[...] 2008 [...]

References

Benjamini, Y. and Hochberg, Y. (1995). Controlling the False Discovery Rate: a Practical and Powerful Approach to Multiple Testing. J. R. Statist. Soc. Ser. B, 57(1), 289-300.

Benjamini, Y. and Hochberg, Y. (2000). On the Adaptive Control of the False Discovery Rate in Multiple Testing With Independent Statistics. J. Edu. Behavioral Statistics, 25(1), 60-83.

Benjamini, Y., Krieger, A. M. and Yekutieli, D. (2005). Adaptive Linear Step-up Procedures that control the False Discovery Rate. Unpublished paper, http://www.math.tau.ac.il/~ybenja/mypapers/bkymarch9.pdf

Benjamini, Y. and Yekutieli, D. (2001). The Control of the False Discovery Rate in Multiple Testing under Dependency. Annals of Statistics, 29(4), 1165-1188.

Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures, John Wiley and Sons.

[...] (2006). [...] FDR [...], 6, 17-30. (URL: http://www.seto.nanzan-u.ac.jp/msie/nas/academia/vol_006pdf/06-017-030.pdf)

[...] (1997). [...]

Oehlert, G. W. (2000). Student-Newman-Kuels controls the false discovery rate. Statistics and Probability Letters, 46, 381-383.

Storey, J. D. (2002). A direct approach to false discovery rates. J. R. Statist. Soc. Ser. B, 64(3), 479-498.

Storey, J. D. (2008). Q-Value. http://genomics.princeton.edu/storeylab/qvalue/

Storey, J. D., Taylor, J. E., and Siegmund, D. (2004). Strong Control, Conservative Point Estimation, and Simultaneous Conservative Consistency of False Discovery Rates: a Unified Approach. J. R. Statist. Soc. Ser. B, 66(1), 187-205.

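Notes

1. [...] $F$ [...] Tukey-Welsch [...] (1997) [...]
2. [...] 2 [...] $t$ [...] 2 [...] Welch [...]
3. [...] FWE [...] FWE [...] 1 [...] FWE [...] type I FWE [...] (1997) [...] 1. [...]
4. [...] FWE [...] (1997) [...] FWE [...] FWER [...] FDR [...] FWER [...] (1997) [...] I [...] FWE [...] generalized type I FWE [...] Hochberg and Tamhane (1987) [...] FWE [...]
5. [...] FDR [...] FDR [...] Benjamini and Hochberg (1995) [...] 5.1 [...] FDR [...] FDR.
6. [...] BH [...] $q$ [...] $q/\hat{\pi}_0(\lambda)$ [...]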

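7. [...] ST [...] MST [...] 100 [...] Storey [...] $\lambda$ [...]
8. [...] 47 [...] p [...] 20 [...]
9. [...] BH [...] 12 [...] 7.5 [...] BH [...] p. [...] 20 [...] 47 [...] p [...]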