D IEEJ Transactions on Industry Applications Vol.136 No.10 pp.686 691 DOI: 10.1541/ieejias.136.686 NW Accelerating Techniques for Sequence Alignment based on an Extended NW Algorithm Jin Okaze, Non-member, Chikatoshi Yamada, Member, Kei Miyagi, Non-member, Shuichi Ichikawa, Non-member 2016 2 15 2016 6 30 The NW (Needleman-Wunsch) algorithm is a method of sequence alignment in bioinformatics. The NW algorithm can be applied for global sequence alignment, which is a way of arranging the sequences of DNA to identify regions of similarity. However, the NW algorithm requires a huge number of calculations compared with the SW (Smith- Waterman) algorithm. Many studies have focused on analyzing the output of multiple sequences quickly in three dimensions. However, such methods cannot obtain similarities for whole sequences. In this article, we extend the NW algorithm to three dimensions. The proposed method is expected to provide a fast analysis of high precision data sequences. Needleman-Wunsch Keywords: sequence alignmentneedleman-wunsch algorithm 1. 2 Fig. 1 FASTA BLASTBasic Local Alignment Search Tool (1) GPUGraphic Processing Unit GPGPU 905-2192 905 National Institute of Technology, Okinawa College 905, Henoko, Nago, Okinawa 905-2192, Japan 441-8580 1-1 Toyohashi University of Technology 1-1, Hibarigaoka, Tempaku-cho, Toyohashi, Aichi 441-8580, Japan Fig. 1. A sequence alignment. General-purpose computing on GPUGPGPU (2)(4) SWSmith- Waterman GPU (5) SW NWNeedleman-Wunsch GPU 3 2. NWNeedleman-Wunsch 21 2 DNA 1 c 2016 The Institute of Electrical Engineers of Japan. 686
2 3 22 NW NW 2 1 2 Fig. 2 1 1 2 1 SPSum of Pair 3 2 1 2 1 2 (1) (1) A B A B s(a, A) = match = 2 s(a, B) = mismatch = 1 (1) s(a, ) = s(, A) = gap = 2 n m X = x 1, x 2,...,x i Y = y 1,y 2,...,y j NW (2) SP 1 (3) NW O(mn) NW(0, 0) = 0 NW(i, 0) = gapi (2) NW(0, j) = gapj NW(i, j) NW(i 1, j 1) + s(x i,y j ) = max NW(i 1, j) + gap (3) NW(i, j 1) + gap 23 NW X = ABCD Y = ACD 2 NW (3) (i, j) = (0, 0) (i, j) = (4, 3) 1 Fig. 3 Fig. 4 (i, j) = (4, 3) (i, j) = (3, 2)(i, j) = (2, 1)(i, j) = (1, 1)(i, j) = (0, 0) (i, j) = (2, 1) (i, j) = (1, 1) X 1 Y Y X = ABCD Y = ACD Fig. 3. Results of score calculation. Fig. 2. Score table. Fig. 4. Traceback. 687 IEEJ Trans. IA, Vol.136, No.10, 2016
X = ABCD Y = A CD Score = 4 3. ClustalW (6) MAFFT (7) T-COFFEE (8) GPU CUDASW++2.0 (9) DB 4. NW 3 Fig. 5. Fig. 6. 3D score table. Location of gap. 41 2 NW 3 3 Fig. 5 3 2 NW 3 NW SP 2 NW(i, j, k) Fig. 6 Fig. 7 3 2 3 1 3 (4) (4) ABC A B C s(a, A, A) = match = 2 s(a, A, B) = s(a, B, C) = mismatch = 1 (4) s(a, A, ) = s(a,, ) = gap = 2 NW(0, 0, 0) = 0 NW(i, j, 0) = gap(i + j) (5) NW(0, j, k) = gap( j + k) NW(i, 0, k) = gap(k + i) NW(i, j, k) Fig. 7. Location of Match/Mismatch. NW(i 1, j 1, k 1) + s(x i,y j, z k ) NW(i 1, j 1, k) + gap NW(i 1, j, k 1) + gap = max NW(i, j 1, k 1) + gap (6) NW(i, j, k 1) + gap NW(i, j 1, k) + gap NW(i 1, j, k) + gap nml X = x 1, x 2,...,x i Y = y 1,y 2,...,y j Z = z 1, z 2,...,z k (5) Fig. 6 Fig. 7 (6) NW O(mnl) 42 X = BD Y = ABCD Z = ABD 3 3 NW (6) (i, j, k) = (0, 0, 0) (i, j, k) = (2, 4, 3) 1 Fig. 8 Fig. 8 (i, j, k) = (2, 4, 3)(i, j, k) = (2, 3, 3)(i, j, k) = (1, 2, 2) (i, j, k) = (1, 1, 1)(i, j, k) = (0, 0, 0) (i, j, k) = (2, 4, 3) (i, j, k) = (2, 3, 3) X Z (i, j, k) = (1, 2, 2) (i, j, k) = (1, 1, 1) 688 IEEJ Trans. IA, Vol.136, No.10, 2016
Fig. 8. Score table and traceback. X X = BD Y = ABCD Z = ABD X = B D Y = ABCD Z = AB D Score= 2 5. GPU Fig. 9. Configuration of GPU architecture. NW GPU 51 GPGPU 1 GPUGraphics Processing Unit 1 CPU CPU GPGPU GPU 52 CUDA CUDA GPGPU NVIDIA C CPU GPU Fig. 9 CUDA 16 KB Fig. 10. Dependencies of elements. 4 (10) (11) 53 CUDA NW CUDA 3 NW(i, j, k) Fig. 10 Fig. 10 Fig. 11 STEP 54 CPU GPU Table 1 689 IEEJ Trans. IA, Vol.136, No.10, 2016
Fig. 13. Execution time of CPU/GPU(32-96). Fig. 11. Table 1. OS CPU GPU Memory bandwidth floating-point performance (single-precision) Processing flow. floating-point performance (double-precision) developmental environment Fig. 12. Environment. Windows7 Enterprise 64bit Intel Core i7-4770k 3.50 GHz NVIIA GeForce GTX TITAN 288 [GB/s] 4.5TFLOPS 1.3TFLOPS CUDA7.5 Execution time of CPU/GPU. CPU GPU Fig. 12 Fig. 12 32 96 Fig. 13 Fig. 12 GPU CPU Fig. 13 CPU GPU NW GPU CPU 5.25 No.40412902 1 NCBI: BLAST: Basic Local Alignment Search Tool, http://blast.ncbi.nlm. nih.gov/blast.cgi 2 CUDA GPU,, pp.17 18 (2008) 3 GPU,, p.74 (2007) 4 GPU,, p.1 (2012) 5 GPU 3 Smith-Waterman,, pp.35 39 (2011) 6 DNA Data Bank of Japan (DDBJ): ClustalW, http://clustalw.ddbj.nig.ac. jp/ 7 debian: Multiple alignment program for amino acid or nucleotide sequences, https://packages.debian.org/ja-/squeeze/mips/mafft 8 Swiss Institute of Bioinformatics (SIB): T-Coffee, http://tcoffee.vitalit.ch/apps/tcoffee 9 Y. Liu, B. Schmidt, and D.L. Maskell: CUDASW++2.0: enhanced Smith- Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions, BMC Research Notes, pp.1 12 (2010) 10 J. Sanders + E. Kandrot, CUDA BY EXAMPLE An Introduction to General-Purpose GPU Programming,, pp.6 9 (2011) 11 CUDA GPU,, pp.34 41 (2010) 2011 6. 690 IEEJ Trans. IA, Vol.136, No.10, 2016
2000 2004 2007 2009 2014 2015 IEEE 2008 2010 2014 VLSI 1985 1987 1987 ERATO 1991 LSI LSI 1994 1997 2001 2007 2010 2011 2012 IEEEsenior member ACM 691 IEEJ Trans. IA, Vol.136, No.10, 2016