Vol. 42 No. 4 Apr VC 2 VC 4 VC VC 4 Recover-x Performance Evaluation of Adaptive Routers Based on the Number of Virtual Channels and Operating F

Vol. 42 No. 4, Apr. 2001

Performance Evaluation of Adaptive Routers Based on the Number of Virtual Channels and Operating Frequencies

Maki Horita, Tsutomu Yoshinaga, Kanemitsu Ootsu and Takanobu Baba

Department of Information Science, Faculty of Engineering, Utsunomiya University
Graduate School of Information Systems, University of Electro-Communications

To improve the communication performance of parallel computer networks, various routing algorithms must be evaluated. Adaptive routing or virtual channels (VCs) can improve communication performance by increasing routing flexibility. However, they degrade the operating frequency of the router, because they require complex logic and a large amount of hardware resources. It is therefore important to consider the trade-off between routing flexibility and operating frequency. We clarify this trade-off by evaluating the communication performance of typical routers in a 2D torus network, taking into account the operating frequency of the routing circuits. Our experimental results show that routers with four VCs per physical channel attain a good trade-off between routing flexibility and operating frequency. Adaptive routers outperform non-adaptive routers because of their higher routing flexibility, especially under a non-uniform communication pattern. The Recover-x router with four VCs per physical channel shows robust performance under both uniform and non-uniform traffic.

1. 4),6),8) VC 4 VC 9) VC 3) HDL

3 Dimension-order *-channel 2) Recover-x 11) Recover-x DISHA 1) Recover-x DISHA VC 100 VC HDL 4 VC 2 3 4 5 RTL 6

2.
2.1 2
2.2 3
2.2.1 Dimension-order Dimension-order 2 X Y X Y
2.2.2 *-channel *-channel / VC *-channel Duato 5)
2.2.3 Recover-x Recover-x Y X *-channel VC DISHA Recover-x

3.
3.1 1 1 (a) 4 ± X ± Y Port PE I/F 1 (b)
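As an illustration of the Dimension-order routing named in Section 2.2.1, the following minimal sketch routes a packet on a 2D torus by first correcting the X coordinate and then the Y coordinate. The function names, the 10-node ring size default, the choice of the shorter way around each ring, and the tie-breaking rule are assumptions made for this sketch and are not taken from the paper's router design.

```python
def torus_step(src, dst, size):
    """Return the one-hop direction (-1, 0, +1) along a ring of `size` nodes.

    Assumption: the packet takes the shorter way around the ring, and a tie
    is broken in the positive direction.
    """
    if src == dst:
        return 0
    forward = (dst - src) % size  # hops needed in the positive direction
    return 1 if forward <= size - forward else -1


def dimension_order_route(src, dst, size=10):
    """List the nodes visited from src to dst, routing X first and then Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:  # finish all X hops before any Y hop
        x = (x + torus_step(x, dst[0], size)) % size
        path.append((x, y))
    while y != dst[1]:  # then correct the Y coordinate
        y = (y + torus_step(y, dst[1], size)) % size
        path.append((x, y))
    return path
```

Because the order of dimensions is fixed, each source and destination pair has exactly one path in this sketch, which is the lack of routing flexibility that the adaptive routers are meant to relax.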

Table 1 The number of VCs for each VC configuration.

Configuration   X   Y   PE I/F   Total
VC3             3   3   2        14
VC4             4   4   2        18
VC5             5   5   2        22
Total = 2 × X + 2 × Y + PE I/F

VCC
3.2 1

Fig. 1 Hardware organization of the routers.

VC BC Buffer Controller AD Address Decoder BC 8 FIFO
(1) VC BC AD
(2) AD Recover-x
(3) OCA Output Channel Arbiter VC VC
(4) OCA AD BC VCC Virtual Channel output Controller VC VC
VC0 3 4 5 VC VC3 VC4 VC5 PE I/F 2 VC 2 3 4 VC VC VC
3.2.2 VC VC VC node#a node#b node#b node#c node#a node#c node#a 2 X Y
3.2.1 2 Dimension-order X Y 3 VC VC Dimension-order VC 3 *-channel VC3 VC / *-channel 1 2 / VC 4 Recover-x X VC 1 2 Y
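The totals in Table 1 follow the rule printed under it, 2 × X + 2 × Y + PE I/F. The short sketch below recomputes them as a consistency check; the function and variable names are hypothetical.

```python
def total_vcs(x_vcs, y_vcs, pe_if_vcs=2):
    """Total VCs per router: two X ports, two Y ports, plus the PE interface."""
    return 2 * x_vcs + 2 * y_vcs + pe_if_vcs


# VCs per X port and per Y port for each configuration, as listed in Table 1.
configs = {"VC3": (3, 3), "VC4": (4, 4), "VC5": (5, 5)}
totals = {name: total_vcs(x, y) for name, (x, y) in configs.items()}
```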

VC 3.2.2 VC VC VC VC 7) Dimension-order VC *-channel Recover-x

Fig. 2 VC assignment for a Dimension-order router.
Fig. 3 VC assignment for a *-channel router.
Fig. 4 VC assignment for a Recover-x router.

VC AD OCA VC VC VC3 4 4 5 VC *-channel Recover-x VC
3.3 1 32 1 2 VC 1 10)

4.
4.1 3 Verilog-HDL *-channel Synopsys HDL Compiler version 1999.05 Medium effort LSI Logic 0.6 µm Array-Based Gate Array Verilog-HDL

Table 2 Synthesis results.

                  Dimension-order         *-channel               Recover-x
VCs/port          3       4       5       3       4       5       3       4       5
MHz               161.2   156.2   147.0   120.4   114.9   107.5   142.8   133.3   117.6
Kgates            70.9    90.2    109.1   72.2    96.5    120.9   75.6    94.5    118.2
Kgates            40.0    51.6    63.0    43.1    60.2    78.7    43.1    58.2    76.1
Kgates (total)    110.9   141.8   172.1   115.3   156.7   199.6   118.7   152.7   194.3

4.2 2 2 2
4.2.1 VC Recover-x *-channel VC Dimension-order VC VC VC VC3 4 VC4 5 VC VC3 VC5 OCA VC3 VC OCA
4.2.2 VC Recover-x 3 Recover-x

Table 3 Each block area for Recover-x routers (Kgates).

          VC3             VC4             VC5
          ±X      ±Y      ±X      ±Y      ±X      ±Y
AD        0.43    0.82    0.67    1.03    0.92    1.32
BC        13.11   13.44   17.57   18.29   21.29   21.37
OCA       2.15    1.01    3.17    2.00    4.81    2.86
VCC       0.21    0.22    0.40    0.36    0.44    0.47
Total     15.89   15.47   21.80   21.68   27.44   26.01

*-channel Dimension-order 1 Recover-x *-channel VC3 Recover-x *-channel VC4 /5 VC *-channel VC VC Dimension-order VC VC 3 Recover-x BC AD OCA VC X AD Y OCA AD BC X Y OCA VC X
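The operating frequencies in Table 2 matter because a router that needs fewer cycles can still be slower in wall-clock time if its clock is slower. The sketch below converts a cycle count into nanoseconds using the synthesized frequencies from Table 2. It assumes that latency is counted in router clock cycles and scales linearly with the clock period; the dictionary layout, key labels, and function names are chosen for this sketch only.

```python
# Maximum operating frequency in MHz per router and VCs per port (Table 2).
FREQ_MHZ = {
    "Dimension-order": {3: 161.2, 4: 156.2, 5: 147.0},
    "*-channel": {3: 120.4, 4: 114.9, 5: 107.5},
    "Recover-x": {3: 142.8, 4: 133.3, 5: 117.6},
}


def cycle_time_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def cycles_to_ns(cycles, router, vcs):
    """Wall-clock time of `cycles` clock cycles at the synthesized frequency."""
    return cycles * cycle_time_ns(FREQ_MHZ[router][vcs])
```

For example, `cycles_to_ns(100, "Recover-x", 4)` gives about 750 ns, whereas the same 100 cycles at a uniform 100 MHz clock would take 1000 ns.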

5.
5.1 Cadence Verilog-XL 10 × 10 = 100 Hot-spot 100 25% (4,j) 0 ≤ j ≤ 9 10 Random 100 1 1 3 1 4 PE PE PE I/F Recover-x VC 4
5.2 100 MHz 2 2000 5000
5.3 Hot-spot
5.3.1 5 (a) 100 MHz Hot-spot 4 256 Recover-x *-channel Dimension-order Hot-spot 2 2 32 1 Dimension-order *-channel VC *-channel Dimension-order VC3 4 Recover-x VC3 4 5 VCC VC Recover-x *-channel VC5 VC3 4 5 (b) Dimension-order VC4 256 3 *-channel 64 VC4 64 128
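The surviving text of Section 5.1 mentions a 10 × 10 = 100 node network, a Hot-spot pattern with the figure 25% and the nodes (4, j) for 0 ≤ j ≤ 9, and a Random pattern. The sketch below is one possible reading of those fragments, not the paper's confirmed workload: it assumes that 25% of packets are sent to a node in column 4 and that the rest, like all Random traffic, go to a uniformly chosen node other than the source. All names and the sampling scheme are hypothetical.

```python
import random

SIZE = 10            # 10 x 10 torus, 100 nodes
HOT_COLUMN = 4       # hot-spot nodes are (4, j) for 0 <= j <= 9
HOT_FRACTION = 0.25  # assumed share of packets aimed at the hot-spot nodes


def random_destination(src, rng):
    """Uniformly pick any node other than the source."""
    while True:
        dst = (rng.randrange(SIZE), rng.randrange(SIZE))
        if dst != src:
            return dst


def hot_spot_destination(src, rng):
    """Pick a hot-spot node with probability HOT_FRACTION, else a random node."""
    if rng.random() < HOT_FRACTION:
        candidates = [(HOT_COLUMN, j) for j in range(SIZE) if (HOT_COLUMN, j) != src]
        return rng.choice(candidates)
    return random_destination(src, rng)
```

Passing an explicit `random.Random` instance keeps a simulation run reproducible for a given seed.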

Fig. 5 Bandwidth for the Hot-spot traffic.
Fig. 6 Latency for the Hot-spot traffic.

VC5 128 VC5 VC4 VC4 VC5 Recover-x VC3 Hot-spot VC VC5 4
5.3.2 6 Hot-spot

64 PE PE 6(a) *-channel Recover-x Recover-x VC 3 *-channel 1 VC3 6(b) Recover-x *-channel Dimension-order *-channel Dimension-order *-channel Recover-x VC5 Dimension-order *-channel VC3 VC4 5 VC4 5 Recover-x *-channel Dimension-order Recover-x VC3 4 5
5.4 Random
5.4.1 7 Random Random 7 (a) 100 MHz Hot-spot 32 Hot-spot Dimension-order *-channel VC VC 7 (b) *-channel Dimension-order Recover-x VC4
5.4.2 8 Random 8 (a) 100 MHz *-channel VC *-channel Recover-x Dimension-order 6(a) Random VC VC 8 (b) Dimension-order *-channel Recover-x VC VC5 VC4 Recover-x 3 VC4

6. Dimension-order *-channel Recover-x 3 VC VC3 4

Fig. 7 Bandwidth for the Random traffic.
Fig. 8 Latency for the Random traffic.

4 5 Recover-x VC4 /

CAD B 10558039 A 11780190

1) Anjan, K.V. and Pinkston, T.M.: An Efficient, Fully Adaptive Deadlock Recovery Scheme: DISHA, Proc. 22nd ISCA, pp.201-210 (1995).
2) Berman, P.E., Gravano, L., Pifarré, G.D. and Sanz, J.L.C.: Adaptive Deadlock and Livelock Free Routing with all Minimal Paths in Torus Networks, Proc. SPAA (1992).
3) Chien, A.A.: A Cost and Speed Model for k-ary n-cube Wormhole Routers, IEEE Trans. Parallel and Distributed Systems, Vol.9, No.2, pp.150-162 (1998).
4) Dai, D. and Panda, D.K.: How Much Does Network Contention Affect Distributed Shared Memory Performance?, Proc. ICPP '97, pp.454-461 (1997).
5) Duato, J.: A New Theory of Deadlock-Free Adaptive Routing in Wormhole Network, IEEE Trans. Parallel and Distributed Systems, Vol.4, No.12, pp.1320-1331 (1993).
6) Flich, J., Malumbres, M.P., López, P. and Duato, J.: Performance Evaluation of Networks of Workstations with Hardware Shared Memory Model Using Execution-Driven Simulation, Proc. ICPP '99 (1999).
7) JSPP2000 pp.189-196 (2000).
8) adaptive routing Proc. HOKKE2000, 2000-ARC-137-9, pp.47-52 (2000).
9) Vaidya, A.S., Sivasubramaniam, A. and Das, C.R.: LAPSES: A Recipe for High Performance Adaptive Router Design, Proc. HPCA-5 '99 (1999).
10) Vol.40, No.5, pp.1958-1967 (1999).
11) Recover-x Vol.41, No.5, pp.1360-1369 (2000).

( 12 8 31 ) ( 13 3 9 )

1999 1986 1988 1997 2000 8 IEEE 1993 1995 1997 1970 1975 1982 1 IEEE 1992 Best Author Microprogrammable Parallel Computer MIT Press