1 2 4 5 9 10 12 3 6 11 13 14 0 8 7 15 Iteration 0 Iteration 1 1 Iteration 2 Iteration 3 N N N! N 1 MOPT(Merge Optimization) 3) MOPT 8192 2 16384 5 MOP



Similar documents
/ / / [4][3][5] / FX10 TAHB RMATT / / C.D. Sudheer [12] T. Hoefler [11] T. Hatazaki[7] J. L. Traff[8] F. Ercal [6] T. Agarwal [9] G. Bh

1 Fig. 1 Extraction of motion,.,,, 4,,, 3., 1, 2. 2.,. CHLAC,. 2.1,. (256 ).,., CHLAC. CHLAC, HLAC. 2.3 (HLAC ) r,.,. HLAC. N. 2 HLAC Fig. 2

IPSJ SIG Technical Report Vol.2011-EC-19 No /3/ ,.,., Peg-Scope Viewer,,.,,,,. Utilization of Watching Logs for Support of Multi-

Vol.11-HCI-15 No. 11//1 Xangle 5 Xangle 7. 5 Ubi-WA Finger-Mount 9 Digitrack 11 1 Fig. 1 Pointing operations with our method Xangle Xa

IPSJ SIG Technical Report Vol.2011-MUS-91 No /7/ , 3 1 Design and Implementation on a System for Learning Songs by Presenting Musical St

17 Proposal of an Algorithm of Image Extraction and Research on Improvement of a Man-machine Interface of Food Intake Measuring System

DEIM Forum 2009 B4-6, Str

IPSJ SIG Technical Report Vol.2012-CG-148 No /8/29 3DCG 1,a) On rigid body animation taking into account the 3D computer graphics came

1 Web [2] Web [3] [4] [5], [6] [7] [8] S.W. [9] 3. MeetingShelf Web MeetingShelf MeetingShelf (1) (2) (3) (4) (5) Web MeetingShelf

& Vol.5 No (Oct. 2015) TV 1,2,a) , Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Ro

A Feasibility Study of Direct-Mapping-Type Parallel Processing Method to Solve Linear Equations in Load Flow Calculations Hiroaki Inayoshi, Non-member

2. CABAC CABAC CABAC 1 1 CABAC Figure 1 Overview of CABAC 2 DCT 2 0/ /1 CABAC [3] 3. 2 値化部 コンテキスト計算部 2 値算術符号化部 CABAC CABAC

3D UbiCode (Ubiquitous+Code) RFID ResBe (Remote entertainment space Behavior evaluation) 2 UbiCode Fig. 2 UbiCode 2. UbiCode 2. 1 UbiCode UbiCode 2. 2

1., 1 COOKPAD 2, Web.,,,,,,.,, [1]., 5.,, [2].,,.,.,, 5, [3].,,,.,, [4], 33,.,,.,,.. 2.,, 3.., 4., 5., ,. 1.,,., 2.,. 1,,

Web Web Web Web i

IPSJ SIG Technical Report Secret Tap Secret Tap Secret Flick 1 An Examination of Icon-based User Authentication Method Using Flick Input for

IPSJ SIG Technical Report Vol.2010-GN-74 No /1/ , 3 Disaster Training Supporting System Based on Electronic Triage HIROAKI KOJIMA, 1 KU

IPSJ SIG Technical Report Vol.2016-CE-137 No /12/ e β /α α β β / α A judgment method of difficulty of task for a learner using simple

ID 3) 9 4) 5) ID 2 ID 2 ID 2 Bluetooth ID 2 SRCid1 DSTid2 2 id1 id2 ID SRC DST SRC 2 2 ID 2 2 QR 6) 8) 6) QR QR QR QR

1 1 CodeDrummer CodeMusician CodeDrummer Fig. 1 Overview of proposal system c

DEIM Forum 2009 E

2). 3) 4) 1.2 NICTNICT DCRA Dihedral Corner Reflector micro-arraysdcra DCRA DCRA DCRA 3D DCRA PC USB PC PC ON / OFF Velleman K8055 K8055 K8055

Core1 FabScalar VerilogHDL Cache Cache FabScalar 1 CoreConnect[2] Wishbone[3] AMBA[4] AMBA 1 AMBA ARM L2 AMBA2.0 AMBA2.0 FabScalar AHB APB AHB AMBA2.0

LAN LAN LAN LAN LAN LAN,, i

HP cafe HP of A A B of C C Map on N th Floor coupon A cafe coupon B Poster A Poster A Poster B Poster B Case 1 Show HP of each company on a user scree

e-learning e e e e e-learning 2 Web e-leaning e 4 GP 4 e-learning e-learning e-learning e LMS LMS Internet Navigware

Vol. 48 No. 3 Mar PM PM PMBOK PM PM PM PM PM A Proposal and Its Demonstration of Developing System for Project Managers through University-Indus

29 jjencode JavaScript

Vol.55 No (Jan. 2014) saccess 6 saccess 7 saccess 2. [3] p.33 * B (A) (B) (C) (D) (E) (F) *1 [3], [4] Web PDF a m

1: A/B/C/D Fig. 1 Modeling Based on Difference in Agitation Method artisoc[7] A D 2017 Information Processing

12) NP 2 MCI MCI 1 START Simple Triage And Rapid Treatment 3) START MCI c 2010 Information Processing Society of Japan

IPSJ SIG Technical Report Vol.2009-HCI-134 No /7/17 1. RDB Wiki Wiki RDB SQL Wiki Wiki RDB Wiki RDB Wiki A Wiki System Enhanced by Visibl

1 Table 1: Identification by color of voxel Voxel Mode of expression Nothing Other 1 Orange 2 Blue 3 Yellow 4 SSL Humanoid SSL-Vision 3 3 [, 21] 8 325

IPSJ SIG Technical Report Vol.2014-IOT-27 No.14 Vol.2014-SPT-11 No /10/10 1,a) 2 zabbix Consideration of a system to support understanding of f

Q [4] 2. [3] [5] ϵ- Q Q CO CO [4] Q Q [1] i = X ln n i + C (1) n i i n n i i i n i = n X i i C exploration exploitation [4] Q Q Q ϵ 1 ϵ 3. [3] [5] [4]

IPSJ SIG Technical Report Vol.2014-GN-90 No.16 Vol.2014-CDS-9 No.16 Vol.2014-DCC-6 No /1/24 1,a) 2,b) 2,c) 1,d) QUMARION QUMARION Kinect Kinect

DPA,, ShareLog 3) 4) 2.2 Strino Strino STRain-based user Interface with tacticle of elastic Natural ObjectsStrino 1 Strino ) PC Log-Log (2007 6)

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation

7,, i

258 5) GPS 1 GPS 6) GPS DP 7) 8) 10) GPS GPS ) GPS Global Positioning System

IPSJ SIG Technical Report Vol.2009-DPS-141 No.20 Vol.2009-GN-73 No.20 Vol.2009-EIP-46 No /11/27 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Spe

1 DHT Fig. 1 Example of DHT 2 Successor Fig. 2 Example of Successor 2.1 Distributed Hash Table key key value O(1) DHT DHT 1 DHT 1 ID key ID IP value D

PeerPool IP NAT IP UPnP 2) Bonjour 3) PeerPool CPU 4) 2 UPnP Bonjour PeerPool CPU PeerPool PeerPool PPv2 PPv2 2. PeerPool 2.1 PeerPool PeerPool PoolGW

_先端融合開発専攻_観音0314PDF用

IPSJ SIG Technical Report Vol.2014-CG-155 No /6/28 1,a) 1,2,3 1 3,4 CG An Interpolation Method of Different Flow Fields using Polar Inter

24 LED A visual programming environment for art work using a LED matrix

知能と情報, Vol.30, No.5, pp

Web Basic Web SAS-2 Web SAS-2 i

3_23.dvi

TF-IDF TDF-IDF TDF-IDF Extracting Impression of Sightseeing Spots from Blogs for Supporting Selection of Spots to Visit in Travel Sat

58 10

AP AP AP AP AP AP AP( AP) AP AP( AP) AP AP Air Patrol[1] Air Patrol Cirond AP AP Air Patrol Senser Air Patrol Senser AP AP Air Patrol Senser AP

[4] ACP (Advanced Communication Primitives) [1] ACP ACP [2] ACP Tofu UDP [3] HPC InfiniBand InfiniBand ACP 2 ACP, 3 InfiniBand ACP 4 5 ACP 2. ACP ACP

2reN-A14.dvi

日本感性工学会論文誌

2 122

2. Eades 1) Kamada-Kawai 7) Fruchterman 2) 6) ACE 8) HDE 9) Kruskal MDS 13) 11) Kruskal AGI Active Graph Interface 3) Kruskal 5) Kruskal 4) 3. Kruskal

SOM SOM(Self-Organizing Maps) SOM SOM SOM SOM SOM SOM i

P.3 P.4 P.9 P.11

MDD PBL ET 9) 2) ET ET 2.2 2), 1 2 5) MDD PBL PBL MDD MDD MDD 10) MDD Executable UML 11) Executable UML MDD Executable UML

IPSJ SIG Technical Report Vol.2012-HCI-149 No /7/20 1 1,2 1 (HMD: Head Mounted Display) HMD HMD,,,, An Information Presentation Method for Weara

28 Horizontal angle correction using straight line detection in an equirectangular image

IPSJ SIG Technical Report Vol.2014-HCI-158 No /5/22 1,a) 2 2 3,b) Development of visualization technique expressing rainfall changing conditions

WikiWeb Wiki Web Wiki 2. Wiki 1 STAR WARS [3] Wiki Wiki Wiki 2 3 Wiki 5W1H Wiki Web 2.2 5W1H 5W1H 5W1H 5W1H 5W1H 5W1H 5W1H 2.3 Wiki 2015 Informa

4.1 % 7.5 %

6_27.dvi

Input image Initialize variables Loop for period of oscillation Update height map Make shade image Change property of image Output image Change time L

IPSJ SIG Technical Report Pitman-Yor 1 1 Pitman-Yor n-gram A proposal of the melody generation method using hierarchical pitman-yor language model Aki

fiš„v8.dvi

JOURNAL OF THE JAPANESE ASSOCIATION FOR PETROLEUM TECHNOLOGY VOL. 66, NO. 6 (Nov., 2001) (Received August 10, 2001; accepted November 9, 2001) Alterna

When creating an interactive case scenario of a problem that may occur in the educational field, it becomes especially difficult to assume a clear obj

企業内システムにおけるA j a x 技術の利用

The Journal of the Japan Academy of Nursing Administration and Policies Vol 7, No 2, pp 19 _ 30, 2004 Survey on Counseling Services Performed by Nursi

2reA-A08.dvi

Vol. 28 No. 2 Apr Web Twitter/Facebook UI Twitter Web Twitter/Facebook e.g., Web Web UI 1 2 SNS 1, 2 2

[2] , [3] 2. 2 [4] 2. 3 BABOK BABOK(Business Analysis Body of Knowledge) BABOK IIBA(International Institute of Business Analysis) BABOK 7

(a) (b) 1 JavaScript Web Web Web CGI Web Web JavaScript Web mixi facebook SNS Web URL ID Web 1 JavaScript Web 1(a) 1(b) JavaScript & Web Web Web Webji

6 2. AUTOSAR 2.1 AUTOSAR AUTOSAR ECU OSEK/VDX 3) OSEK/VDX OS AUTOSAR AUTOSAR ECU AUTOSAR 1 AUTOSAR BSW (Basic Software) (Runtime Environment) Applicat

Web Stamps 96 KJ Stamps Web Vol 8, No 1, 2004

[2] OCR [3], [4] [5] [6] [4], [7] [8], [9] 1 [10] Fig. 1 Current arrangement and size of ruby. 2 Fig. 2 Typography combined with printing

Vol. 48 No. 4 Apr LAN TCP/IP LAN TCP/IP 1 PC TCP/IP 1 PC User-mode Linux 12 Development of a System to Visualize Computer Network Behavior for L

The copyright of this material is retained by the Information Processing Society of Japan (IPSJ). The material has been made available on the website

Vol.54 No (July 2013) [9] [10] [11] [12], [13] 1 Fig. 1 Flowchart of the proposed system. c 2013 Information

,,,,., C Java,,.,,.,., ,,.,, i

i

28 TCG SURF Card recognition using SURF in TCG play video

IPSJ SIG Technical Report Vol.2014-CE-127 No /12/7 1,a) 2,3 2,3 3 Development of the ethological recording application for the understanding of

1_26.dvi

04-“²†XŒØ‘�“_-6.01

22SPC4報告書

中小企業の発展と政策支援

1 Fig. 2 2 Fig. 1 Sample of tab UI 1 Fig. 1 that changes by clicking tab 5 2. Web HTML Adobe Flash Web ( 1 ) ( 2 ) ( 3 ) ( 4 ) ( 5 ) 3 Web 2.1 Web Goo

IPSJ SIG Technical Report Vol.2009-BIO-17 No /5/26 DNA 1 1 DNA DNA DNA DNA Correcting read errors on DNA sequences determined by Pyrosequencing

08-特集04.indd

Vol. 42 No. SIG 8(TOD 10) July HTML 100 Development of Authoring and Delivery System for Synchronized Contents and Experiment on High Spe

, IT.,.,..,.. i

16.16%

Windows7 OS Focus Follows Click, FFC FFC focus follows mouse, FFM Windows Macintosh FFC n n n n ms n n 4.2 2

IPSJ SIG Technical Report Vol.2009-DPS-141 No.23 Vol.2009-GN-73 No.23 Vol.2009-EIP-46 No /11/27 t-room t-room 2 Development of

Table 1. Reluctance equalization design. Fig. 2. Voltage vector of LSynRM. Fig. 4. Analytical model. Table 2. Specifications of analytical models. Fig

Transcription:

10000 SFMOPT / / MOPT(Merge OPTimization) MOPT FMOPT(Fast MOPT) FMOPT SFMOPT(Subgrouping FMOPT) SFMOPT 2 8192 31 The Proposal and Evaluation of SFMOPT, a Task Mapping Method for 10000 Tasks Haruka Asano and Kenji Kise In recent years, computing systems which adopt mesh/torus network have appeared. Their communication time differs in where the positions of calculation tasks are located in such a direct network. Especially the positions of communicated tasks are important. The problem to determine the positions of all tasks is called a task mapping problem. The task mapping problem is complex because there are a lot of combinations. Therefore, it is difficult for the programmers to seek the best task mapping. We proposed a fast task mapping method MOPT (Merge OPTimization) before. In this paper, we propose FMOPT (Fast MOPT) which shortened the optimization time of MOPT using duplicate exclusions and parallelization. We also propose SFMOPT (Subgrouping FMOPT) which further accelerated FMOPT by introducing subgrouping. By using SFMOPT, we can optimize 8192 nodes in 31 seconds which had taken 2 hours before. 1. () / / Graduate School of Information Science and Engineering, Tokyo Institute of Technology 1), Jaguar 2) / () 1 ID MPI(Message Passing Interface) ID 2 3 / MPI c 2013 Information Processing Society of Japan 38

1 2 4 5 9 10 12 3 6 11 13 14 0 8 7 15 Iteration 0 Iteration 1 1 Iteration 2 Iteration 3 N N N! N 1 MOPT(Merge Optimization) 3) MOPT 8192 2 16384 5 MOPT FMOPT FMOPT SFMOPT 2 MOPT 3 MOPT 4 5 6 2. MOPT 2.1 / 1 (d) 2 (a) (b) MOPT (c) (f) 2.2 MOPT MOPT 16 4 4 ( 2) MOPT 2 MOPT 2(a) 2(b) 0 2(c) 2 2 16 4 2 2 2 c 2013 Information Processing Society of Japan 39

2 (1) G 1,G 2 s,t T raffic(s, t) s t Comm(G 1, G 2 ) = T raffic(s, t) (1) t G 2 s G 1 Eval Cost Eval Link 2 Eval Cost ( ) Eval Link XYZ Eval Cost MOPT MOPT mincost Eval Link MOPT MOPT minlink 2 2 X Y 3 X,Y,Z 3. MOPT 3.1 MOPT MOPT MOPT 8192 MOPT minlink 99.8% 98.5% 3.2 FMOPT MOPT FMOPT(Fast MOPT) FMOPT MOPT MOPT FMOPT MOPT 3.2.1 MOPT 2 X,Y (3 X,Y,Z ) X Y (3 XY YZ ZX ) YZ X ZX Y XY Z 2 2 4 8 3 6 8 48 MOPT 8 48 MOPT MOPT mincost Eval Cost 3 MOPT 3 3 ABC 3 abc A a X (1) Eval Cost (2) (8) (1) 90 (2) (4) (1) (4) Z (5) (8) (1) (2) (8) MOPT minlink MOPT mincost MOPT minlink XYZ XYZ c 2013 Information Processing Society of Japan 40

3 Eval Cost 4 Eval Link Eval Cost MOPT mincost 4 3 (1) Eval Link 3 (1) (8) (1) (6) (1)(6) Z (5)(2)Y (7)(4)Y Z (3)(8) A A A A A ( 1 ) A 2 2 (MOPT minlink ) ( 2 ) A 4 MOPT mincost 1/8MOPT minlink 1/4 3 MOPT 2 (G 1,G 2 ) X Y Z 1 MOPT mincost 2, MOPT minlink 3 MOPT mincost 6 1 G 1 G 2 X 6 8 6 8 2304 Y 2 8 2 8 256 Z 2 8 2 8 256 2 (FMOPT mincost) G 1 G 2 X 3 2 6 8 288 1/8 Y 2 2 2 8 64 1/4 Z 1 2 2 8 32 1/8 3 (FMOPT minlink) G 1 G 2 X 6 2 6 8 576 1/4 Y 2 2 2 8 64 1/4 Z 2 2 2 8 64 1/4 7 MOPT minlink 4 2 MOPT Z Z 3.2.2 FMOPT 3.3 SFMOPT FMOPT MOPT FMOPT SFMOPT(Subgrouping MOPT) SFMOPT FMOPT SFMOPT 2 SFMOPT 4 4 2 2 5 5 2 2 (1) c 2013 Information Processing Society of Japan 41

5 2 2 5 4 4 2 2 2 2 2 2 SFMOPT 2 3 SF- MOPT FMOPT SFMOPT FMOPT SFMOPT FMOPT SFMOPT 6 2 2 2 9 4 4 4 D SFMOPT(D ) N 2 D O(1) O(1) O(N) O(N) 4. 4.1 FMOPT MOPT Intel Core i7 870 Linux gettimeofday() OpenMP MOPT MOPT MOPT( FMOPT) 6 6 FMOPT MOPT 2 3 FMOPT minlink 4 FMOPT mincost 5 8 FMOPT mincost 128512 8192 2 MOPT FMOPT 7 7 4 23 mincost FMOPT 8 8 1024, 8192 512 1024 4096 8192 c 2013 Information Processing Society of Japan 42

6 7 8 FMOPT 10000 8192 Eval Link MOPT 7348 4 FMOPT 547 4.2 SFMOPT SFMOPT 2 1 FMOPT 1 3 / XYZ XYZ None FMOPT FMOPT SFMOPT SFMOPT N 2 2 2 FMOPT-half N FMOPT None FMOPT-half 2 1 FMOPT-half N N SFMOPT FMOPT-half 2 SFMOPT N SFMOPT FMOPT N FMOPT-half 4.2.1 FMOPT 9 9 Eval Link FMOPT, SFMOPT, FMOPT-half FMOPT SFMOPT 9 9(512 ) 6 6 10000 8192 FMOPT 547 SFMOPT 31 SFMOPT FMOPT-half FMOPT-half c 2013 Information Processing Society of Japan 43

9 10 (minlink) (mincost) SFMOPT 8192 SFMOPT FMOPT-half 1.1 FMOPT-half SFMOPT 10 Eval Cost FMOPT, SFMOPT, FMOPT-half Eval Link SF- MOPT FMOPT FMOPT-half Eval Cost Eval Link Eval Link Eval Link 4.2.2 NAS Parallel Benchmark CGBT MGSP 512 6 OpenNSIM 4) 512 11 Eval Link 12 Eval Cost 10000 Eval Link 11 Eval Cost 12 XYZ 1 2 13 Eval Link MG SFMOPT FMOPT XYZ SP FMOPT SFMOPT Eval Link 1 c 2013 Information Processing Society of Japan 44

MG FMOPTSFMOPT MG 3 XYZ XYZ MOPT 14 Eval Cost MG Eval Link FMOPT SFMOPT Eval Cost MG Eval Link XYZ 1 Eval Link 13 Eval Cost 14 MOPT MG 13 Eval Link MG FMOPTSFMOPTFMOPThalfNoneXYZ SFMOPT FMOPT-half SF- MOPT FMOPT 5% SFMOPT 14 Eval Cost SFMOPT FMOPT-halfNoneXYZ Eval Link CG SP SFMOPT FMOPT CG SP Eval Cost Eval Link Eval Link FMOPT SFMOPT 5. 13 (minlink) 14 (mincost) RMATT 5) RMATT BISEM Simulated annealing OPTIM BISEM SFMOPT BISEM SFMOPT BISEM SFMOPT 1 1 BISEM SFMOPT RMATT Simulated annealing RMATT RMATT 6) Simulated annealing 2 SFMOPT c 2013 Information Processing Society of Japan 45

RMATT BISEM SFMOPT 3 BlueGene/L 7) SFMOPT SFMOPT TAC3 8) RMATT Simulated annealing TAC3 SFMOPT 6. MOPT 4 4 3 5% SFMOPT MOPT 1% 16384 SFMOPT minlink 32%SF- MOPT mincost 67% MOPT FMOPT 512 SFMOPT minlink 0.4 6 BluGene/Q 5 3 SFMOPT 2 3 MOPT (CREST) 1) Yokokawa, M., Shoji, F., Uno, A., Kurokawa, M. and Watanabe, T.: The K computer: Japanese next-generation supercomputer development project, International Symposium on Low Power Electronics and Design (ISLPED), pp. 371 372 (2011). 2) Bland, A. S., Kendall, R. A., Kothe, D. B., Rogers, J. H. and Shipman, G. M.: Jaguar: The world s most powerful computer, Cray User Group (2009). 3),, : /, D, Vol. J96-D, No. 2, pp. 269 279 (2013). 4),,,,,,,, : OpenNSIM, 2010-ARC-192(15), pp. 1 9 (2010). 5),,, : RMATT, SACSIS2011, pp. 340 347 (2011). 6),,,,,, : RMATT, 2012, pp. 93 100 (2012). 7) Bhanot, G., Gara, A., Heidelberger, P., Lawless, E., Sexton, J. and Walkup, R.: Optimizing task layout on the Blue Gene/L supercomputer, IBM Journal of Research and Development, Vol. 49, No. 2.3, pp. 489 500 (2005). 8), : /, 2013, pp. 95 103 (2013). c 2013 Information Processing Society of Japan 46