
Vol. 43 No. SIG 6(HPS 5) Sep. 2002

An Architecture for Parallelized High-speed Probabilistic-Neural-Network Calculation and Its Application to Image Recognition Systems

Noriyuki Aibe and Moritoshi Yasunaga

Graduate School, Doctoral Program of Systems and Information Engineering, University of Tsukuba
Institute of Information Sciences and Electronics, University of Tsukuba

Recently, probabilistic neural networks (PNN: Probabilistic Neural Network) have been applied to practical pattern recognition tasks such as face recognition and satellite-image recognition. To obtain high recognition accuracy, however, a PNN must process a large number of sample patterns. Furthermore, the PNN calculation must be repeated many times in the learning stage to determine a network parameter precisely. Specialized hardware is thus indispensable for high-speed PNN calculation. In this paper, we propose a specialized processor and a parallel architecture using these processors for high-speed PNN calculation, and we develop an image-recognition prototype system on a small single board. Measurements of the system show high-speed recognition within a video-rate period of 33.3 ms, and a learning time of less than 10 seconds is estimated from the measured result. These results demonstrate the effectiveness of the proposed hardware for practical pattern recognition problems.

1.

PNN: Probabilistic Neural Network (Specht) 3),4); Mao 5); Tian 6),7).

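2.

2.1 PNN

The PNN consists of three layers: the Kernel Layer, the Summation Layer and the Decision Layer (Fig. 1). Let X = (x_1, x_2, ..., x_n) be the input pattern and S_j^i the j-th sample pattern of class C_i. Each neuron in the Kernel Layer evaluates the kernel function K(X - S_j^i):

$$K\left(X - S_j^i\right) = \begin{cases} 1 & : \ \left| x_k - (s_j^i)_k \right| \le \dfrac{\sigma}{2} \quad (\text{for all } k,\ k = 1, 2, \ldots, n) \\ 0 & : \ \text{otherwise} \end{cases} \qquad (1)$$

Here X = (x_1, x_2, ..., x_n) and S_j^i = ((s_j^i)_1, (s_j^i)_2, ..., (s_j^i)_n); x_k and (s_j^i)_k are their k-th elements, n is the number of elements, and σ in Eq. (1) is the kernel size.

Fig. 1 Network configuration of PNN.

In the Summation Layer, the probability density p(X|C_i) of X in class C_i is estimated as

$$\hat{p}(X|C_i) = \frac{1}{N_p} \sum_{j=0}^{N_p - 1} K\left(X - S_j^i\right) \qquad (2)$$

where N_p is the number of sample patterns per class and 1/N_p is the normalization factor. The Decision Layer operates on these estimates.

2.2

spnn (serial-pnn) Processor; PNN 8),9).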
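As an illustration of Eqs. (1) and (2), the following Python sketch evaluates the uniform kernel, the per-class estimate and a decision step in software. The function names, the toy sample data and the dictionary keyed by class label are assumptions made for this example. The decision rule (choose the class with the largest estimate) is the usual PNN rule and is assumed here rather than taken from the text.

```python
def kernel(x, s, sigma):
    """Eq. (1): 1 if |x_k - s_k| <= sigma/2 for every element k, else 0."""
    return 1 if all(abs(xk - sk) <= sigma / 2 for xk, sk in zip(x, s)) else 0


def estimate(x, class_samples, sigma):
    """Eq. (2): fraction of a class's sample patterns whose kernel fires."""
    return sum(kernel(x, s, sigma) for s in class_samples) / len(class_samples)


def decide(x, samples, sigma):
    """Decision step (assumed rule): the class with the largest estimate."""
    estimates = {label: estimate(x, cs, sigma) for label, cs in samples.items()}
    return max(estimates, key=estimates.get), estimates


# Toy data (hypothetical): two classes, three 2-element sample patterns each.
samples = {
    "C1": [(10, 10), (12, 11), (11, 9)],
    "C2": [(40, 42), (41, 40), (39, 41)],
}
label, estimates = decide((11, 10), samples, sigma=4)
print(label, estimates)
```

Because every class holds the same number N_p of sample patterns, comparing the raw kernel counts would give the same decision as comparing the normalized estimates.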

Look-up Table 8),10)-12); spnn Processor; 33.3 ms; Section 4.

3.

PNN and σ; 6),7) GOES-8; 5); Sigma-Parallel Architecture (SPA); VLSI; spnn Processor; Hybrid-SPA.

3.1 Sigma-Parallel Architecture (SPA)

Each sample pattern S_j^i of class C_i consists of n elements of q bits, i.e. qn bits. With N_p sample patterns in each of N_c classes, the sample patterns occupy q n N_p N_c bits in total. For q = 8, n = 256, N_p = 512 and N_c = 8, this amounts to 1 MByte. Fig. 2(a) shows the conventional PNN organization, in which each Neuron in Kernel Layer receives X and σ.

Fig. 2 Neuron Parallel Architecture vs. Sigma Parallel Architecture.

Let σ_m denote the m-th kernel size (indexed from m = 0) and N_σ the number of σ values. For the Neuron-Parallel Architecture of Fig. 2(a), with q n N_p N_c bits of sample patterns, the required width is q (N_p N_c + 1); for q = 8, N_p = 512 and N_c = 8, q (N_p N_c + 1) = 32,776.

Fig. 2(b) shows the Sigma-Parallel Architecture (SPA), in which spnn Processors with different σ operate in parallel on the same PNN data (log_2 N_σ). Each spnn Processor handles a q-bit and a q-bit input, 2q in total. VLSI 8),12)-16).

3.2 Hybrid-SPA

The SPA of Fig. 2(b) is placed in a VLSI Chip as an SPA-Module, and SPA-Modules are combined as shown in Fig. 3.
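The two figures quoted above for q = 8, n = 256, N_p = 512 and N_c = 8 can be checked with a few lines of arithmetic. In this sketch the function names are hypothetical. Reading q (N_p N_c + 1) as a data width for the architecture of Fig. 2(a), and 2q as the width needed by one spnn Processor, is an interpretation of the surviving fragments, not a statement from the text.

```python
def sample_memory_bits(q, n, n_p, n_c):
    """Total sample-pattern storage: q bits x n elements x N_p x N_c."""
    return q * n * n_p * n_c


def neuron_parallel_width(q, n_p, n_c):
    """The expression q (N_p N_c + 1) evaluated for Fig. 2(a)."""
    return q * (n_p * n_c + 1)


def sigma_parallel_width(q):
    """The expression 2q given for the spnn Processor (assumed meaning)."""
    return 2 * q


q, n, n_p, n_c = 8, 256, 512, 8
memory_mbyte = sample_memory_bits(q, n, n_p, n_c) / 8 / 2**20
print(memory_mbyte, neuron_parallel_width(q, n_p, n_c), sigma_parallel_width(q))
```

The first expression grows with N_p N_c, while 2q depends only on the element width.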

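Fig. 3 Neuron parallelism based on the SPA.

Hybrid-SPA.

4.

PNN; SPA; spnn Processor; Hybrid-SPA.

4.1 spnn Processor

Fig. 4 shows the spnn Processor, whose inputs are X, S_j^i and σ_m. The q-bit elements x_1, x_2, ..., x_n of X and (s_j^i)_1, (s_j^i)_2, ..., (s_j^i)_n of S_j^i are processed according to Eq. (1) with the kernel size σ_m, the m-th σ of the SPA.

On the estimate p̂(X|C_i), see 17). Minchin: PNN 18). Recognition rates of 97.0% and 94.2%, and of 95.5% and 94.0%, are reported, i.e. differences of 1.5 to 2.8% 19).

In Fig. 4, the Kernel Size Adjuster computes (s_j^i)_k - σ_m/2 and (s_j^i)_k + σ_m/2 from each element of S_j^i. The Kernel Size Comparator compares these bounds with the element x_k of X, which is equivalent to the condition |x_k - (s_j^i)_k| ≤ σ/2 of Eq. (1):

$$\left( (s_j^i)_k - \frac{\sigma_m}{2} \right) \le x_k \le \left( (s_j^i)_k + \frac{\sigma_m}{2} \right)$$

If the condition fails for any element, the output for S_j^i is 0.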
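The Kernel Size Adjuster and Kernel Size Comparator can be mimicked in integer arithmetic as below. This is only a behavioural sketch: the function names, the integer halving of σ_m, the clamping of the bounds to the q-bit range and the early exit are assumptions of this example, not details confirmed by the text. For integer data, testing lo ≤ x_k ≤ hi with lo and hi built from σ_m // 2 gives the same result as |x_k - s_k| ≤ σ_m/2.

```python
def kernel_size_adjuster(s_k, sigma_m, q=8):
    """Lower and upper bounds s_k -/+ sigma_m/2, clamped to q bits (assumed)."""
    half = sigma_m // 2
    lo = max(s_k - half, 0)
    hi = min(s_k + half, (1 << q) - 1)
    return lo, hi


def kernel_size_comparator(x_k, lo, hi):
    """True when x_k lies inside the adjusted kernel window."""
    return lo <= x_k <= hi


def spnn_kernel(x, s, sigma_m, q=8):
    """Serial evaluation of Eq. (1), one element per step."""
    for x_k, s_k in zip(x, s):
        lo, hi = kernel_size_adjuster(s_k, sigma_m, q)
        if not kernel_size_comparator(x_k, lo, hi):
            return 0  # one failing element forces the kernel output to 0
    return 1


print(spnn_kernel([100, 200], [102, 198], sigma_m=4))
print(spnn_kernel([100, 200], [103, 198], sigma_m=4))
```

Precomputing the two bounds replaces the subtraction and absolute value of Eq. (1) with two magnitude comparisons per element.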

Fig. 4 Circuit configuration of spnn Processor.

The outputs of Eq. (1) are passed to the Summation Layer, and the Decision Layer uses a Register and a Max Detector.

The spnn Processor was implemented on a Xilinx FPGA, XCV1000E-6HQ240C. One spnn Processor occupies about 100 to 120 SLICEs, and the XCV1000E provides 12,288 SLICEs 20), so 64 spnn Processors can be placed in one FPGA. ASIC; spnn Processor.

4.2 SPA

Fig. 5 shows a PNN system using the Hybrid-SPA. It is built from two FPGAs (Xilinx XCV1000E-6HQ240C and XCV300E-6PQ240C) and memory LSIs (4M-bit SRAM, HM62W8511HJP).

Fig. 5 System configuration using the Hybrid-SPA.

The FPGA XCV1000E implements the Hybrid-SPA of Fig. 3 with 64 spnn Processors, organized as 4 SPA-Modules.

Fig. 6 Configuration of image recognition system.

Each SPA-Module contains 16 spnn Processors (16 σ), 64 spnn Processors in total. The Processors are connected through the Common Data Bus of Fig. 5 to the FPGA XCV300E, which implements the Best σ Detector. The Best σ Detector receives the 8-bit outputs of the spnn Processors.

5.

5.1

Figs. 5 and 6; Hybrid-SPA Module of Fig. 5; Fig. 7. In the prototype, the spnn Processors are implemented in the XCS30XL-4PQ240C (30,000 gates) instead of the FPGA XCV1000E; XCS30XL: 3 spnn Processor.

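Table 1 Specification of prototype system.

  Clock              27 MHz (spnn Processor: 13.5 MHz)
  Processors         3 spnn Processors (Max)
  Memory             48 Mbit (4 Mbit × 12) (Max)
  Memory data width  96 bit (8 bit × 12) (Max)
  Board              240 × 80 mm (4 layers)
  Interfaces         NTSC, PS/2, ATA, Analog-RGB, RS-232C

Fig. 7 Photograph of prototype system.

The NTSC input is digitized by an OKI MSM7674 (A/D) and preprocessed into 16 × 16 segments in an FPGA (Xilinx XCV800-4HQ240C). The XCV800-4HQ240C also drives the NTSC output through an Analog Devices ADV7176AKS. The spnn Processors use 4Mbit-SRAMs (HM62W8511HJP-12) and XCS30XL-4PQ240C devices; the memory data width is 8 bit × 12 = 96 bit. The Best σ Detector is implemented in a Xilinx XCV300-4PQ240C.

Fig. 8 shows an extension based on the SPA, in which Sub Boards carrying spnn Processors (Hybrid-SPA Modules) are attached to the Main Board. The Main Board sends the Pre-Processed Unknown Pattern Input to the Sub Boards, and the Hybrid-SPA Modules are connected by the Common Data Bus. A Hybrid-SPA Module in an XCV1000E holds 64 spnn Processors, so 2 XCV1000E devices hold 128 Processors; with 3, 5 and 7 XCV1000E devices the system has 192, 320 and 448 spnn Processors for the PNN.

5.2

5.2.1

The NTSC image is converted into 16 × 16 segments of 8 bit each, so n = 256 and q = 8. The spnn Processor processes one sample pattern in 18.96 µs, i.e. about 1,750 sample patterns within 33.3 ms. With N_p = 256, one spnn Processor therefore handles N_c = 6 to 7 classes. For 64 spnn Processors in an XCV1000E (16 σ × 4, Hybrid-SPA), the learning time T_learn is estimated as follows.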
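The per-pattern time and the processor counts quoted above follow from simple arithmetic, sketched here in Python. The sketch assumes that the spnn Processor consumes one element per 13.5 MHz clock cycle, which is consistent with the 18.96 µs figure for n = 256 but is not stated explicitly in the surviving text. The variable names are hypothetical.

```python
CLK_HZ = 13.5e6      # spnn Processor clock (Table 1)
N_ELEMENTS = 256     # n: 16 x 16 segments per pattern
FRAME_S = 33.3e-3    # video-rate period
N_P = 256            # sample patterns per class

# Assumption: one element is processed per clock cycle.
t_pattern = N_ELEMENTS / CLK_HZ
patterns_per_frame = int(FRAME_S / t_pattern)
classes_per_processor = patterns_per_frame // N_P

PROCESSORS_PER_FPGA = 64  # spnn Processors per XCV1000E
totals = {k: k * PROCESSORS_PER_FPGA for k in (3, 5, 7)}

print(round(t_pattern * 1e6, 2), patterns_per_frame, classes_per_processor, totals)
```

Under this assumption a single processor completes roughly 1,750 kernel evaluations per video frame, which is why N_p = 256 leaves room for only a handful of classes.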

Fig. 8 Configuration of the extended system using Sub Boards.

$$T_{learn} = \left( T_{recog.} + \beta N_{spNN} N_{Mod.} T_{CLK} \right) \frac{N_{tp} N_{\sigma}}{N_{spNN}} \qquad (3)$$

$$T_{recog.} = \left( 1 + \frac{q\, n\, N_p N_c}{r\, N_{Mod.}} + \alpha N_c + \beta N_{Mod.} \right) T_{CLK} \qquad (4)$$

Symbols: n, q, r, s, N_c, N_p, N_tp, N_σ, N_spNN, N_Mod., α, γ, T_CLK (spnn Processor; σ; SPA-Module; Best σ Detector). β in T_recog. is given by

$$\beta = \frac{\log_2 N_p + \log_2 N_c}{s} + \gamma \qquad (5)$$

β relates to the transfer from the spnn Processors to the Best σ Detector over the Common Data Bus of Fig. 5 (SPA-Module, Hybrid-SPA Module). The second term of T_recog., q n N_p N_c / (r N_Mod.), is dominant. For n = 256, q = r = s = 8 bit, N_c = 1, N_p = 256, N_Mod. = 1, α = γ = 0 and T_CLK = 1/13.5 MHz, T_recog. = 4.85 ms.
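The value T_recog. = 4.85 ms can be reproduced numerically from Eqs. (4) and (5). The sketch below assumes the fraction layout q n N_p N_c / (r N_Mod.) for the second term of Eq. (4) and (log_2 N_p + log_2 N_c) / s for Eq. (5), and reads the parameter list as N_c = 1, N_Mod. = 1 and α = γ = 0. Those readings are reconstructions from the damaged text, and the function names are hypothetical.

```python
import math

T_CLK = 1 / 13.5e6  # clock period of the spnn Processor in seconds


def beta(n_p, n_c, s, gamma):
    """Eq. (5) as reconstructed: (log2 N_p + log2 N_c) / s + gamma."""
    return (math.log2(n_p) + math.log2(n_c)) / s + gamma


def t_recog(n, q, r, n_p, n_c, n_mod, alpha, b, t_clk=T_CLK):
    """Eq. (4) as reconstructed: recognition time in seconds."""
    clocks = 1 + (q * n * n_p * n_c) / (r * n_mod) + alpha * n_c + b * n_mod
    return clocks * t_clk


b = beta(n_p=256, n_c=1, s=8, gamma=0)
t = t_recog(n=256, q=8, r=8, n_p=256, n_c=1, n_mod=1, alpha=0, b=b)
print(f"beta = {b}, T_recog = {t * 1e3:.2f} ms")
```

With these values the second term contributes 65,536 of the 65,538 clock cycles, which illustrates why the other terms have almost no effect on the result.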

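Evaluating Eq. (3), where β covers the transfer between the spnn Processors and the Best σ Detector, gives a learning time of 2.5 sec for N_c = 8 and 10.0 sec for N_c = 32. The same calculation on a PC (Intel Pentium III 1 GHz, gcc, ANSI C) takes 170 sec for N_c = 8 and 682 sec for N_c = 32. The SPA system of Fig. 5 (Hybrid-SPA Module with 64 spnn Processors) is thus 68 times faster than the PC.

The XCV1000E has 1,569,178 System Gates 20), whereas the XCV3200E has 4,074,387 System Gates 20) and could hold 128 to 160 spnn Processors. By Eqs. (3) and (4), with β for the spnn Processors, replacing the FPGA XCV1000E with the XCV3200E would make the system 136 to 170 times faster than the PC. With Sub Boards carrying 2 XCV1000E devices each, 3 Sub Boards provide 7 Hybrid-SPA Modules of 64 spnn Processors, 448 spnn Processors in total; by Eqs. (3) and (4) this is 480 times faster than the PC. FPGA; Sub Board.

5.2.2

250W; 3 classes C_1, C_2 and C_3; FPGA (Field Programmable Gate Array). σ was examined over 256 values (N_σ = 256); σ = 80 to 90. Fig. 9 shows the recognition test system (Unknown Pattern Field, Video Camera SONY DCR-PC0, Prototype Board, Video Monitors) and Fig. 10 the sample patterns. The parameters are 16 × 16 segments, q = 8, n = 256, N_p = N_tp = 256, N_c = 3, N_spNN = N_Mod. = 1 and r = s = 8. The recognition rate was 80%.

6.

spnn Processor; SPA.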
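The learning-time estimates can be approximated with Eq. (3) in a short Python sketch. The parameter values are assumptions: N_σ = N_spNN = 16 and N_Mod. = 4 are read from the "16 σ × 4" Hybrid-SPA configuration, N_tp = N_p = 256 is taken from the experimental settings, and α = γ = 0. The fraction layouts of Eqs. (3) to (5) are likewise reconstructions, and the function names are hypothetical.

```python
import math

T_CLK = 1 / 13.5e6  # clock period of the spnn Processor in seconds


def beta(n_p, n_c, s, gamma):
    """Eq. (5) as reconstructed."""
    return (math.log2(n_p) + math.log2(n_c)) / s + gamma


def t_recog(n, q, r, n_p, n_c, n_mod, alpha, b, t_clk=T_CLK):
    """Eq. (4) as reconstructed: recognition time in seconds."""
    clocks = 1 + (q * n * n_p * n_c) / (r * n_mod) + alpha * n_c + b * n_mod
    return clocks * t_clk


def t_learn(n_c, n, q, r, s, n_p, n_tp, n_sigma, n_spnn, n_mod,
            alpha=0, gamma=0, t_clk=T_CLK):
    """Eq. (3) as reconstructed: learning time in seconds."""
    b = beta(n_p, n_c, s, gamma)
    per_pass = t_recog(n, q, r, n_p, n_c, n_mod, alpha, b, t_clk)
    per_pass += b * n_spnn * n_mod * t_clk
    return per_pass * n_tp * n_sigma / n_spnn


# Assumed configuration: 4 SPA-Modules of 16 spnn Processors (16 sigma values).
common = dict(n=256, q=8, r=8, s=8, n_p=256, n_tp=256,
              n_sigma=16, n_spnn=16, n_mod=4)
t8 = t_learn(n_c=8, **common)
t32 = t_learn(n_c=32, **common)
print(f"N_c=8: {t8:.2f} s, N_c=32: {t32:.2f} s, PC/SPA at N_c=8: {170 / t8:.0f}")
```

Under these assumptions the sketch gives about 2.5 s for N_c = 8 and just under 10 s for N_c = 32, in line with the quoted estimates and with the 68-fold ratio to the 170 s PC time.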

Fig. 9 Recognition test system.

Fig. 10 Sample patterns for recognition test.

Recognition rate of 80% within 33.3 ms; learning time of 10.0 sec for 256 / 256 / 32.

No.345063.

1) Specht, D.: Probabilistic Neural Networks, J. Neural Networks, Vol.3, pp.109-118 (1990).
2) Specht, D.: Probabilistic Neural Networks and the Polynomial Adaline as Complementary Techniques for Classification, IEEE Trans. Neural Networks, Vol.1, No.1, pp.0 (1990).
3) Müller, K.-R., Mika, S., Rätsch, G., Tsuda, K. and Schölkopf, B.: An Introduction to Kernel-Based Learning Algorithms, IEEE Trans. Neural Networks, Vol.12, No.2, pp.181-201 (2001).
4) Ruiz, A. and López-de-Teruel, P.E.: Nonlinear Kernel-Based Statistical Pattern Analysis, IEEE Trans. Neural Networks, Vol.12, No.1, pp.16-32 (2001).
5) Mao, K.Z., Tan, K.C. and Ser, W.: Probabilistic Neural-Network Structure Determination for Pattern Classification, IEEE Trans. Neural Networks, Vol.11, No.4, pp.1009-1016 (2000).
6) Tian, B., Azimi-Sadjadi, M.R., Vonder Harr, T.H. and Reinke, D.: Temporal Updating Scheme for Probabilistic Neural Network with Application on Satellite Cloud Classification, IEEE Trans. Neural Networks, Vol.11, No.4, pp.903-920 (2000).
7) Tian, B. and Azimi-Sadjadi, M.R.: Comparison of Two Different PNN Training Approaches for Satellite Cloud Data Classification, IEEE Trans. Neural Networks, Vol.12, No.1, pp.164-168 (2001).
8) LSI (1995).
9) Gotarredona, T.S., Barranco, B.L. and Andreou, A.G.: Programmable Kernel Analog VLSI Convolution Chip for Real Time Vision Processing, Proc. IEEE Int'l Joint Conf. Neural Networks, CD-ROM (2000).
10) Szabo, T., Antoni, L., Horvath, G. and Feher, B.: A full-parallel digital implementation for pre-trained NNs, Proc. IEEE Int'l Joint Conf. Neural Networks, CD-ROM (2000).
11) Girau, B. and Lorraine, L.: Digital hardware implementation of 2D compatible neural networks, Proc. IEEE Int'l Joint Conf. Neural Networks, CD-ROM (2000).
12) Yasunaga, M., Masuda, N., Yagyu, M., Asai, M., Yamada, M. and Masaki, A.: Design, Fabrication and Evaluation of A 5-inch Wafer Scale Neural Network LSI Composed of 576 Digital Neurons, IJCNN'90, II, pp.527-535 (1990).
13) J84-D-II, No.6, pp.85 93 (2001).
14) Sinencico, E.S. and Lau, C. (Eds.): Artificial Neural Networks: Paradigms, Applications, and Hardware Implementations, IEEE Press (1992).
15) Kohonen, T.: Chapter 8 (Hardware for SOM) in Self-Organizing Maps, 2nd edition, Springer (1997).
16) Wawrzynek, J., Asanovic, K., Kingsbury, B., Johnson, D., Beck, J. and Morgan, N.: Spert-II: A Vector Micro Processor System, Special Issue of Neural Computing in IEEE COMPUTER, Vol.29, No.3, pp.79-86 (1996).
17) Bishop, C.: Neural Network for Pattern Recognition, Oxford Press (1995).
18) Minchin, G. and Zaknich, A.: A Design for FPGA Implementation of the Probabilistic Neural Network.
19) PRMU2000-7, pp.7 78 (2000).
20) Xilinx, Inc.: Virtex-E 1.8V Field Programmable Gate Arrays, DS022-1 (v2.2) (2001). http://www.xilinx.com/partinfo/ds022.pdf

Fig. 11 Experimental result of cross-points approximation between the probabilistic density distributions. Panels (a) and (b) plot P(A) p(x|A) and P(B) p(x|B); (a) marks the cross points c_1 and c_2 at X_1 and X_2, and (b) marks their approximations c̃_1 and c̃_2.

The experiment was run on a PC with 2 classes: p(x|A) with parameters 0.4 and 0.03, and p(x|B) with parameters 0.55 and 0.5. In (a), the cross points c_1 and c_2 lie at X_1 = 0.33 and X_2 = 0.457. In (b), with 20 sample patterns S per class and the kernel of Eq. (1), the cross points c̃_1 and c̃_2 lie at X_1 = 0.337 and X_2 = 0.488. σ = 0.5, 0.05, 0.005.
