NUMA. Moore's law; 1Hz; CPU.

2. Register; RAM; Level 1 (L1), L2, L3, L4; TLB (translation look-aside buffer); OS.

3. NUMA

NUMA (Non-uniform memory access)

Copyright © by ORSJ. Unauthorized reproduction of this article is prohibited.

Intel Xeon X5460 (Harpertown): 2 CPUs, 4 cores per CPU, 1 thread per core, 8 (= 2 × 4 × 1) threads in total.

Figure 2: 2-way Intel Xeon X5460. NUMA; UMA (Uniform memory access).

NUMA and UMA: 2 CPUs; Intel Xeon X; CPU; RAM; NUMA.

Figure 3: NUMA; CPU; Intel Xeon E.

4. STREAM

Footnote 1: STREAM: Sustainable Memory Bandwidth in High Performance Computers, stream/

Intel Xeon E (SandyBridge-EP): 4 CPUs (= 4 × 8 × 2).

Figure 3: 4-way Intel Xeon E.

STREAM Triad: for $n$-dimensional vectors $a, b, c \in \mathbb{R}^n$ and a scalar $r$, compute $a \leftarrow b + rc$.

Figure 4: OpenMP Triad in C/C++.

Figure 5: Triad bandwidth (GB/s) on the 4-way Intel Xeon E for $n = \{2^{10}, \ldots, 2^{30}\}$. $2^{20}$; STREAM; 16, 32, 64: 95, 98, 92 GB/s.
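The Triad kernel named above (a ← b + rc, parallelized with OpenMP in C/C++) can be sketched as follows. This is a minimal illustration, not the article's Figure 4 listing, which did not survive extraction. The function names `triad` and `triad_bytes` are hypothetical, and the byte count follows the usual STREAM convention of two loads and one store of a double per element.

```c
#include <assert.h>
#include <stddef.h>

/* STREAM-style Triad: a[i] = b[i] + r * c[i] for every i in [0, n).
 * When the compiler is not run with OpenMP enabled, the pragma is
 * ignored and the loop simply runs serially. */
void triad(double *a, const double *b, const double *c, double r, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        a[i] = b[i] + r * c[i];
    }
}

/* Bytes counted per Triad call under the STREAM convention:
 * two loads (b, c) and one store (a) of a double per element. */
double triad_bytes(size_t n)
{
    return 3.0 * (double)sizeof(double) * (double)n;
}
```

Dividing `triad_bytes(n)` by the measured wall-clock time of one `triad` call gives a bandwidth figure in bytes per second, which is the quantity the GB/s plots report.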

Figure 5: STREAM TRIAD. Hyper-threading.

On Linux, numactl controls how a program is placed on NUMA nodes. The --physcpubind option selects the processors to run on and --membind selects the NUMA nodes to allocate memory from, each given by ID. On Linux, the processor ID and physical id fields of /proc/cpuinfo show how processor IDs correspond to NUMA IDs; Portable Hardware Locality (HWLOC) [1] can also be used.

Figure 6: Triad bandwidth (GB/s) for $n = \{2^{10}, 2^{11}, \ldots, 2^{30}\}$; NUMA 0. NUMA: 12 GB/s; NUMA: 3 GB/s (1/4).

4.2 numactl --localalloc: 32; NUMA 0; KBytes. numactl --interleave: 32; NUMA 0.

NUMA 0, 1; Local allocation.

4.4 Figures 7(a) and 7(b): STREAM TRIAD bandwidth (GB/s) with 1, 2, and 4 NUMA nodes for $n = \{2^{10}, \ldots, 2^{30}\}$, under local allocation and interleaving. $2^{20}$. Local allocation: 13 GB/s, 21 GB/s, 24 GB/s. Interleaving: 13 GB/s, 6 GB/s, 8 GB/s.

Figure 7: STREAM TRIAD: (GB/s)

5. NUMA

numactl; Linux sched_setaffinity(), sched_getaffinity(), mbind().

5.1 STREAM TRIAD

TRIAD: a, b, c; NUMA.
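The Linux calls named above, sched_setaffinity() and sched_getaffinity(), can be used from inside a program to do what numactl --physcpubind does from the command line. The sketch below is an illustration under stated assumptions: the helper names `pin_to_cpu`, `pin_to_first_allowed_cpu`, and `allowed_cpu_count` are hypothetical, and memory binding with mbind() is left out because it needs the separate numaif.h header from libnuma.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to a single CPU. Returns 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Number of CPUs in the calling thread's affinity mask, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Pin the calling thread to the lowest-numbered CPU it may currently run on.
 * Returns that CPU ID, or -1 on error. */
int pin_to_first_allowed_cpu(void)
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed))
            return pin_to_cpu(cpu) == 0 ? cpu : -1;
    }
    return -1;
}
```

With the default first-touch policy on Linux, pages that a pinned thread initializes are normally placed on that thread's NUMA node, so pinning each thread before it first writes its share of a, b, and c is one common way to obtain local allocation without mbind().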

Figure 8: TRIAD. Figure 9: NUMA. In Figure 8, TRIAD with 1, 2, and 4 NUMA nodes: 24, 48, 96 GB/s.

Breadth-first search (BFS). For a graph $G = (V, E)$ with $n = |V|$ and $m = |E|$, BFS runs in $O(n + m)$ time. HPC; Graph500 (footnote 2).

The Graph500 BFS benchmark takes two parameters, SCALE and edgefactor $= m/n$ (16), and has three steps:
(a) Generate a Kronecker graph with $n = 2^{\mathrm{SCALE}}$ and $m = n \cdot \mathrm{edgefactor}$.
(b) Construct the graph.
(c) Run BFS 64 times and measure traversed edges per second (TEPS) for each run; the score is taken from the 64 TEPS values.

Green Graph500 (footnote 3): Graph500 TEPS; TEPS/W.

Figure 9: BFS (Level). Level-synchronized BFS. Beamer [3]: Top-down and Bottom-up; Small-world. Top-down; Bottom-up; Beamer; Kronecker graph; 4-way Intel Xeon E; GTEPS ($10^9$ TEPS). NUMA; GTEPS [4]. Bottom-up; Small-world; 2.68 [5]. [4, 5]: CSR (Compressed Sparse Row).

Footnote 2: Graph500:
Footnote 3: Green Graph500:
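The level-synchronized BFS and the CSR (Compressed Sparse Row) layout mentioned above can be illustrated with a serial top-down sketch. This is not the implementation of [3], [4], or [5]: it shows only the top-down direction, has no NUMA-aware placement, and the names `csr_graph` and `bfs_top_down` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Graph in CSR form: the neighbors of vertex v are
 * adj[row_ptr[v]] .. adj[row_ptr[v + 1] - 1]. */
typedef struct {
    int n;               /* number of vertices */
    const int *row_ptr;  /* n + 1 offsets into adj */
    const int *adj;      /* concatenated adjacency lists */
} csr_graph;

/* Level-synchronized top-down BFS from root.
 * On return, level[v] is the BFS level of v, or -1 if v is unreachable.
 * Returns 0 on success, -1 if memory allocation fails. */
int bfs_top_down(const csr_graph *g, int root, int *level)
{
    assert(g != NULL && level != NULL && root >= 0 && root < g->n);
    int *frontier = malloc((size_t)g->n * sizeof(int));
    int *next = malloc((size_t)g->n * sizeof(int));
    if (frontier == NULL || next == NULL) {
        free(frontier);
        free(next);
        return -1;
    }

    for (int v = 0; v < g->n; v++)
        level[v] = -1;
    level[root] = 0;
    frontier[0] = root;
    int frontier_size = 1;

    /* One iteration of the outer loop processes one whole level. */
    for (int depth = 1; frontier_size > 0; depth++) {
        int next_size = 0;
        for (int i = 0; i < frontier_size; i++) {
            int v = frontier[i];
            for (int e = g->row_ptr[v]; e < g->row_ptr[v + 1]; e++) {
                int w = g->adj[e];
                if (level[w] < 0) {      /* first visit: w joins the next level */
                    level[w] = depth;
                    next[next_size++] = w;
                }
            }
        }
        int *tmp = frontier;             /* next level becomes the frontier */
        frontier = next;
        next = tmp;
        frontier_size = next_size;
    }

    free(frontier);
    free(next);
    return 0;
}
```

A bottom-up step would instead loop over the unvisited vertices and look for any neighbor already in the frontier; the direction-optimizing approach of Beamer [3] switches between the two per level.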

Table 1: (n, m) and TEPS

Reference      System                   (n, m)            TEPS
Madduri        Cray MTA-2 (40 procs)    (2^21, 2^30)      0.5 G
Agarwal [2]    Intel Xeon X             (2^20, 2^26)      1.3 G
Beamer [3]     Intel Xeon E             (2^28, 2^32)      5.1 G
Yasui [4]      Intel Xeon E             (2^26, 2^30)      11.1 G
Yasui [5]      Intel Xeon E             (2^27, 2^31)      29.0 G

$$V_k = \left\{ v_j \in V \;\middle|\; j \in \left[ \frac{kn}{\ell}, \frac{(k+1)n}{\ell} \right) \right\}$$

With $\ell$ partitions and adjacency list $A$: Top-down uses $A^F_k(v)$ for $v \in V$; Bottom-up uses $A^B_k(w)$ for $w \in V_k$.

$$A^F_k(v) = \{ w \mid w \in \{ V_k \cap A(v) \} \}, \quad v \in V,$$
$$A^B_k(w) = \{ v \mid v \in A(w) \}, \quad w \in V_k.$$

NUMA; Graph; NUMA; BFS; Graph500.

Figure 10(a): NUMA. Figure 10(b): NUMA. With $\ell$ NUMA nodes, $G$ is divided into $\ell$ parts $\{G_k\}$ ($k = \{0, 1, \ldots, \ell - 1\}$); NUMA node $k$ holds $V_k$ and $A_k$; $V_k$.

SGI UV; Kronecker; GTEPS; Green Graph500 Big Data category; 4-way Intel Xeon E; GTEPS; 59.1 MTEPS/W; 1; UV.

SDPARA (SemiDefinite Programming Algorithm PARAllel version) [6]; SDPA (Semidefinite Programming Algorithms); ZDD (Zero-suppressed decision diagram) [7]; [8]; NUMA; ULIBC (Ubiquity Library for Intelligently Binding Cores) (footnote 4).

Footnote 4: jun isc.php
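The vertex partition V_k defined above assigns vertex v_j to partition k when j falls in [kn/l, (k+1)n/l). A small sketch of that index arithmetic follows. The function names `part_begin` and `part_of` are hypothetical, and the use of integer (floor) division for the range boundaries is an assumption; it agrees with the real-valued interval whenever l divides n, as it does for power-of-two n and l.

```c
#include <assert.h>

/* First vertex index of partition k when n vertices are split into
 * l contiguous ranges [k*n/l, (k+1)*n/l), using integer division.
 * part_begin(n, l, l) equals n, the end of the last range. */
long part_begin(long n, int l, int k)
{
    assert(n >= 0 && l > 0 && k >= 0 && k <= l);
    return (long)(((long long)k * n) / l);
}

/* Index k of the partition V_k that contains vertex j.
 * A linear scan is enough because l (the number of NUMA nodes) is small. */
int part_of(long n, int l, long j)
{
    assert(l > 0 && j >= 0 && j < n);
    int k = 0;
    while (part_begin(n, l, k + 1) <= j)
        k++;
    return k;
}
```

For example, with n = 8 and l = 4 the ranges are [0, 2), [2, 4), [4, 6), [6, 8), so vertex 5 belongs to partition 2.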

6. NUMA

NUMA. (JST) CREST. SGI: Silicon Graphics International Corp.

[1] F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin, G. Mercier, S. Thibault and R. Namyst, hwloc: A generic framework for managing hardware affinities in HPC applications, Proc. IEEE Int. Conf. PDP2010.
[2] V. Agarwal, F. Petrini, D. Pasetto and D. A. Bader, Scalable graph exploration on multicore processors, Proc. ACM/IEEE Int. Conf. SC10.
[3] S. Beamer, K. Asanović and D. A. Patterson, Direction-optimizing breadth-first search, Proc. ACM/IEEE Int. Conf. SC12.
[4] Y. Yasui, K. Fujisawa and K. Goto, NUMA-optimized parallel breadth-first search on multicore single-node system, Proc. IEEE Int. Conf. BigData 2013.
[5] Y. Yasui, K. Fujisawa and Y. Sato, Fast and energy-efficient breadth-first search on a single NUMA system, Proc. IEEE Int. Conf. ISC 14.
[6] K. Fujisawa, T. Endo, Y. Yasui, H. Sato, N. Matsuzawa, S. Matsuoka and H. Waki, Peta-scale general solver for semidefinite programming problems with over two million constraints, Proc. IEEE Int. Conf. IPDPS 2014.
[7] ULIBC 2014 (HPCS2014) HPCS
[8] Y. Yasui, K. Fujisawa, K. Goto, N. Kamiyama and M. Takamatsu, NETAL: High-performance implementation of network analysis library considering computer memory hierarchy, J. Oper. Res. Soc. Jpn., 54.

メモリ階層構造を考慮した大規模グラフ処理の高速化

メモリ階層構造を考慮した大規模グラフ処理の高速化 , CREST ERATO 0.. (, CREST) ERATO / 8 Outline NETAL (NETwork Analysis Library) NUMA BFS raph500, reenraph500 Kronecker raph Level Synchronized parallel BFS Hybrid Algorithm for Parallel BFS NUMA Hybrid

More information

untitled

untitled c 816 Web 1. 30 [1] [2] [3, 4] [5] 10 [6] 185 8540 2 8 38 [5] [5, 7] [5] 3 (1) 608 18 Copyright c by ORSJ. Unauthorized reproduction of this article is prohibited. (2) (3) 2. 2.1 Web 2014 1 2013 12 2.2

More information

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2

CPU Levels in the memory hierarchy Level 1 Level 2... Increasing distance from the CPU in access time Level n Size of the memory at each level 1: 2.2 FFT 1 Fourier fast Fourier transform FFT FFT FFT 1 FFT FFT 2 Fourier 2.1 Fourier FFT Fourier discrete Fourier transform DFT DFT n 1 y k = j=0 x j ω jk n, 0 k n 1 (1) x j y k ω n = e 2πi/n i = 1 (1) n DFT

More information

untitled

untitled c Society5.0 Society5.0 Society5.0 Society5.0 2017 Society5.0 SDGs SIP PRISM Society5.0 2017 SIP ImPACT PRISM SDGs 1. Society5.0 2016 9 Society5.0 OR [1] Society5.0 2. Society5.0 2.1 Society5.0 Society5.0

More information

untitled

untitled A = QΛQ T A n n Λ Q A = XΛX 1 A n n Λ X GPGPU A 3 T Q T AQ = T (Q: ) T u i = λ i u i T {λ i } {u i } QR MR 3 v i = Q u i A {v i } A n = 9000 Quad Core Xeon 2 LAPACK (4/3) n 3 O(n 2 ) O(n 3 ) A {v i }

More information

Microsoft PowerPoint - stream.ppt [互換モード]

Microsoft PowerPoint - stream.ppt [互換モード] STREAM 1 Quad Opteron: ccnuma Arch. AMD Quad Opteron 2.3GHz Quad のソケット 4 1 ノード (16コア ) 各ソケットがローカルにメモリを持っている NUMA:Non-Uniform Access ローカルのメモリをアクセスして計算するようなプログラミング, データ配置, 実行時制御 (numactl) が必要 cc: cache-coherent

More information

untitled

untitled c Twitter 1. Twitter 140 SNS 1,392 Facebook 2 14 [4]. 2011 Twitter 58 1 [1]. Twitter Twitter [4] Twitter SNS [5]. [1]. 432 8561 3 5 1 13.5.22 14.2.10 [6] Web [13] SIR [10] SIR SIR 2 2014 4 Copyright c

More information

Vol.214-HPC-145 No /7/3 C #pragma acc directive-name [clause [[,] clause] ] new-line structured block Fortran!$acc directive-name [clause [[,] c

Vol.214-HPC-145 No /7/3 C #pragma acc directive-name [clause [[,] clause] ] new-line structured block Fortran!$acc directive-name [clause [[,] c Vol.214-HPC-145 No.45 214/7/3 OpenACC 1 3,1,2 1,2 GPU CUDA OpenCL OpenACC OpenACC High-level OpenACC CPU Intex Xeon Phi K2X GPU Intel Xeon Phi 27% K2X GPU 24% 1. TSUBAME2.5 CPU GPU CUDA OpenCL CPU OpenMP

More information

or58_8_455.dvi

or58_8_455.dvi c Voice of CustomerVOC CS VOC Facebook Twitter SNS VOC SNS 1 VOC 1. WEB 2. 151 8583 2 2 1 VOC SNS 3. 3.1 4 (1) FAX (2) HP Twitter (3) (4) (1) (3) (4) 1 WEB 2013 8 Copyright c by ORSJ. Unauthorized reproduction

More information

or57_12_673.dvi

or57_12_673.dvi c ID ID ID 1 POS ID-POS ID-POS ID ID RFM LTV 8 7 ID KPI ID ID-POS ID KPI 1. ID 1.1 ID ID ID-POS ID IC ID IC ID nanaco Edy Suica PASMO IC 1 ID ID 100 0005 1 6 5 ID ID POS Web IC 1 ID 1 Twitter Facebook

More information

2 HI LO ZDD 2 ZDD 2 HI LO 2 ( ) HI (Zero-suppress ) Zero-suppress ZDD ZDD Zero-suppress 1 ZDD abc a HI b c b Zero-suppress b ZDD ZDD 5) ZDD F 1 F = a

2 HI LO ZDD 2 ZDD 2 HI LO 2 ( ) HI (Zero-suppress ) Zero-suppress ZDD ZDD Zero-suppress 1 ZDD abc a HI b c b Zero-suppress b ZDD ZDD 5) ZDD F 1 F = a ZDD 1, 2 1, 2 1, 2 2 2, 1 #P- Knuth ZDD (Zero-suppressed Binary Decision Diagram) 2 ZDD ZDD ZDD Knuth Knuth ZDD ZDD Path Enumeration Algorithms Using ZDD and Their Performance Evaluations Toshiki Saitoh,

More information

IPSJ SIG Technical Report Vol.2013-ARC-203 No /2/1 SMYLE OpenCL (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 1

IPSJ SIG Technical Report Vol.2013-ARC-203 No /2/1 SMYLE OpenCL (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 1 SMYLE OpenCL 128 1 1 1 1 1 2 2 3 3 3 (NEDO) IT FPGA SMYLEref SMYLE OpenCL SMYLE OpenCL FPGA 128 SMYLEref SMYLE OpenCL SMYLE OpenCL Implementation and Evaluations on 128 Cores Takuji Hieda 1 Noriko Etani

More information

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation

,4) 1 P% P%P=2.5 5%!%! (1) = (2) l l Figure 1 A compilation flow of the proposing sampling based architecture simulation 1 1 1 1 SPEC CPU 2000 EQUAKE 1.6 50 500 A Parallelizing Compiler Cooperative Multicore Architecture Simulator with Changeover Mechanism of Simulation Modes GAKUHO TAGUCHI 1 YOUICHI ABE 1 KEIJI KIMURA 1

More information

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N

1 OpenCL OpenCL 1 OpenCL GPU ( ) 1 OpenCL Compute Units Elements OpenCL OpenCL SPMD (Single-Program, Multiple-Data) SPMD OpenCL work-item work-group N GPU 1 1 2 1, 3 2, 3 (Graphics Unit: GPU) GPU GPU GPU Evaluation of GPU Computing Based on An Automatic Program Generation Technology Makoto Sugawara, 1 Katsuto Sato, 1 Kazuhiko Komatsu, 2 Hiroyuki Takizawa

More information

soturon.dvi

soturon.dvi 12 Exploration Method of Various Routes with Genetic Algorithm 1010369 2001 2 5 ( Genetic Algorithm: GA ) GA 2 3 Dijkstra Dijkstra i Abstract Exploration Method of Various Routes with Genetic Algorithm

More information

07-二村幸孝・出口大輔.indd

07-二村幸孝・出口大輔.indd GPU Graphics Processing Units HPC High Performance Computing GPU GPGPU General-Purpose computation on GPU CPU GPU GPU *1 Intel Quad-Core Xeon E5472 3.0 GHz 2 6 MB L2 cache 1600 MHz FSB 80 GFlops 1 nvidia

More information

スライド 1

スライド 1 swk(at)ic.is.tohoku.ac.jp 2 Outline 3 ? 4 S/N CCD 5 Q Q V 6 CMOS 1 7 1 2 N 1 2 N 8 CCD: CMOS: 9 : / 10 A-D A D C A D C A D C A D C A D C A D C ADC 11 A-D ADC ADC ADC ADC ADC ADC ADC ADC ADC A-D 12 ADC

More information

or58_11_651.dvi

or58_11_651.dvi c 1. 2. 480 1195 1 1 OECD 2010 [1] 33 OECD 2009 3,265 3,035 8,233 913 1 1 4 OECD 2 3 OECD 1,000 2.2 OECD 3.1 34 5 OECD 14 [2] 1 2013 11 Copyright c by ORSJ. Unauthorized reproduction of this article is

More information

\\ \Data_in4\TeX\OR\63-7\07\or63_7_401.dvi

\\ \Data_in4\TeX\OR\63-7\07\or63_7_401.dvi c CO 2 2 CO 2 CO 2 CO 2 IPCC 1. CO 2 2015 400 ppm CO 2 CO 2 2 2.5 16.2 8.2 [1] CO 2 305 0005 1 1 1 3F 1134 mamoru@sk.tsukuba.ac.jp 206 000 626 2 2 507 brother.hide10@gmail.com 305 005 1 1 1 IIIS 4F kojima.kazunori.ga@un.tsukuba.ac.jp

More information

GPGPU

GPGPU GPGPU 2013 1008 2015 1 23 Abstract In recent years, with the advance of microscope technology, the alive cells have been able to observe. On the other hand, from the standpoint of image processing, the

More information

Linux @ S9 @ CPU #0 CPU #1 FIB Table Neighbor Table 198.51.100.0/24 fe540072d56f 203.0.113.0/24 fe54003c1fb2 TX Ring TX Ring TX Buf. Dsc. RX Buf. Dsc. TX Buf. Dsc. RX Buf. Dsc. Packet NIC #0 NIC #1 CPU

More information

untitled

untitled PC murakami@cc.kyushu-u.ac.jp muscle server blade server PC PC + EHPC/Eric (Embedded HPC with Eric) 1216 Compact PCI Compact PCIPC Compact PCISH-4 Compact PCISH-4 Eric Eric EHPC/Eric EHPC/Eric Gigabit

More information

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G

211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS /1/18 a a 1 a 2 a 3 a a GPU Graphics Processing Unit GPU CPU GPU GPGPU G 211 年ハイパフォーマンスコンピューティングと計算科学シンポジウム Computing Symposium 211 HPCS211 211/1/18 GPU 4 8 BLAS 4 8 BLAS Basic Linear Algebra Subprograms GPU Graphics Processing Unit 4 8 double 2 4 double-double DD 4 4 8 quad-double

More information

HP High Performance Computing(HPC)

HP High Performance Computing(HPC) ACCELERATE HP High Performance Computing HPC HPC HPC HPC HPC 1000 HPHPC HPC HP HPC HPC HPC HP HPCHP HP HPC 1 HPC HP 2 HPC HPC HP ITIDC HP HPC 1HPC HPC No.1 HPC TOP500 2010 11 HP 159 32% HP HPCHP 2010 Q1-Q4

More information

workshop Eclipse TAU AICS.key

workshop Eclipse TAU AICS.key 11 AICS 2016/02/10 1 Bryzgalov Peter @ HPC Usability Research Team RIKEN AICS Copyright 2016 RIKEN AICS 2 3 OS X, Linux www.eclipse.org/downloads/packages/eclipse-parallel-application-developers/lunasr2

More information

単位、情報量、デジタルデータ、CPUと高速化 ~ICT用語集~

単位、情報量、デジタルデータ、CPUと高速化  ~ICT用語集~ CPU ICT mizutani@ic.daito.ac.jp 2014 SI: Systèm International d Unités SI SI 10 1 da 10 1 d 10 2 h 10 2 c 10 3 k 10 3 m 10 6 M 10 6 µ 10 9 G 10 9 n 10 12 T 10 12 p 10 15 P 10 15 f 10 18 E 10 18 a 10 21

More information

untitled

untitled c OR&SA OR&SA 2 OR&SA (Polarity) OR&SA 1. 1) 2) OR&SA 2 3) 2 OR&SA 2014 7 [1] 1) 2) 3) 153 8648 2 2 1 4) 5) 6) 1980 1990 2000 2. 234 36 Copyright c by ORSJ. Unauthorized reproduction of this article is

More information

FINAL PROGRAM 25th Annual Workshop SWoPP / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012

FINAL PROGRAM 25th Annual Workshop SWoPP / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012 FINAL PROGRAM 25th Annual Workshop SWoPP 2012 2012 / / 2012 Tottori Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2012 8 1 ( ) 8 3 ( ) 680-0017 101-5 http://www.torikenmin.jp/kenbun/

More information

09中西

09中西 PC NEC Linux (1) (2) (1) (2) 1 Linux Linux 2002.11.22) LLNL Linux Intel Xeon 2300 ASCIWhite1/7 / HPC (IDC) 2002 800 2005 2004 HPC 80%Linux) Linux ASCI Purple (ASCI 100TFlops Blue Gene/L 1PFlops (2005)

More information

untitled

untitled OS 2007/4/27 1 Uni-processor system revisited Memory disk controller frame buffer network interface various devices bus 2 1 Uni-processor system today Intel i850 chipset block diagram Source: intel web

More information

Publish/Subscribe KiZUNA P2P 2 Publish/Subscribe KiZUNA 2. KiZUNA 1 Skip Graph BF Skip Graph BF Skip Graph Skip Graph Skip Graph DDLL 2.1 Skip Graph S

Publish/Subscribe KiZUNA P2P 2 Publish/Subscribe KiZUNA 2. KiZUNA 1 Skip Graph BF Skip Graph BF Skip Graph Skip Graph Skip Graph DDLL 2.1 Skip Graph S KiZUNA: P2P 1,a) 1 1 1 P2P KiZUNA KiZUNA Pure P2P P2P 1 Skip Graph ALM(Application Level Multicast) Pub/Sub, P2P Skip Graph, Bloom Filter KiZUNA: An Implementation of Distributed Microblogging Service

More information

untitled

untitled c 645 2 1. GM 1959 Lindsey [1] 1960 Howard [2] Howard 1 25 (Markov Decision Process) 3 3 2 3 +1=25 9 Bellman [3] 1 Bellman 1 k 980 8576 27 1 015 0055 84 4 1977 D Esopo and Lefkowitz [4] 1 (SI) Cover and

More information

i

i 24 19 19115096 i 1 1 2 2 2.1..................................... 2 2.2....................... 3 2.3................................... 3 2.3.1.................. 4 2.4............................... 4

More information

12 DCT A Data-Driven Implementation of Shape Adaptive DCT

12 DCT A Data-Driven Implementation of Shape Adaptive DCT 12 DCT A Data-Driven Implementation of Shape Adaptive DCT 1010431 2001 2 5 DCT MPEG H261,H263 LSI DDMP [1]DDMP MPEG4 DDMP MPEG4 SA-DCT SA-DCT DCT SA-DCT DDMP SA-DCT MPEG4, DDMP,, SA-DCT,, ο i Abstract

More information

01_OpenMP_osx.indd

01_OpenMP_osx.indd OpenMP* / 1 1... 2 2... 3 3... 5 4... 7 5... 9 5.1... 9 5.2 OpenMP* API... 13 6... 17 7... 19 / 4 1 2 C/C++ OpenMP* 3 Fortran OpenMP* 4 PC 1 1 9.0 Linux* Windows* Xeon Itanium OS 1 2 2 WEB OS OS OS 1 OS

More information

GPU n Graphics Processing Unit CG CAD

GPU n Graphics Processing Unit CG CAD GPU 2016/06/27 第 20 回 GPU コンピューティング講習会 ( 東京工業大学 ) 1 GPU n Graphics Processing Unit CG CAD www.nvidia.co.jp www.autodesk.co.jp www.pixar.com GPU n GPU ü n NVIDIA CUDA ü NVIDIA GPU ü OS Linux, Windows, Mac

More information

B 2 Thin Q=3 0 0 P= N ( )P Q = 2 3 ( )6 N N TSUB- Hub PCI-Express (PCIe) Gen 2 x8 AME1 5) 3 GPU Socket 0 High-performance Linpack 1

B 2 Thin Q=3 0 0 P= N ( )P Q = 2 3 ( )6 N N TSUB- Hub PCI-Express (PCIe) Gen 2 x8 AME1 5) 3 GPU Socket 0 High-performance Linpack 1 TSUBAME 2.0 Linpack 1,,,, Intel NVIDIA GPU 2010 11 TSUBAME 2.0 Linpack 2CPU 3GPU 1400 Dual-Rail QDR InfiniBand TSUBAME 1.0 30 2.4PFlops TSUBAME 1.0 Linpack GPU 1.192PFlops PFlops Top500 4 Achievement of

More information

[4] ACP (Advanced Communication Primitives) [1] ACP ACP [2] ACP Tofu UDP [3] HPC InfiniBand InfiniBand ACP 2 ACP, 3 InfiniBand ACP 4 5 ACP 2. ACP ACP

[4] ACP (Advanced Communication Primitives) [1] ACP ACP [2] ACP Tofu UDP [3] HPC InfiniBand InfiniBand ACP 2 ACP, 3 InfiniBand ACP 4 5 ACP 2. ACP ACP InfiniBand ACP 1,5,a) 1,5,b) 2,5 1,5 4,5 3,5 2,5 ACE (Advanced Communication for Exa) ACP (Advanced Communication Primitives) HPC InfiniBand ACP InfiniBand ACP ACP InfiniBand Open MPI 20% InfiniBand Implementation

More information

橡3_2石川.PDF

橡3_2石川.PDF PC RWC 01/10/31 2 1 SCore 1,024 PC SCore III PC 01/10/31 3 SCore SCore Aug. 1995 Feb. 1996 Oct. 1996 1997-1998 Oct. 1999 Oct. 2000 April. 2001 01/10/31 4 2 SCore University of Bonn, Germany University

More information

最新Linuxデバイスドライバ開発応用-修正版-PDF.PDF

最新Linuxデバイスドライバ開発応用-修正版-PDF.PDF Linux Kernel Conference 2004 Linux - / - info@devdrv.co.jp 2004/10/14 Device Drivers Limited 1 Device Drivers Limited 2 IF Device Drivers Limited 3 Linux Device Drivers Limited 4 2.6 2.6 2.6 Device Drivers

More information

Cisco 1711/1712セキュリティ アクセス ルータの概要

Cisco 1711/1712セキュリティ アクセス ルータの概要 CHAPTER 1 Cisco 1711/1712 Cisco 1711/1712 Cisco 1711/1712 1-1 1 Cisco 1711/1712 Cisco 1711/1712 LAN Cisco 1711 1 WIC-1-AM WAN Interface Card WIC;WAN 1 Cisco 1712 1 ISDN-BRI S/T WIC-1B-S/T 1 Cisco 1711/1712

More information

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU

1 GPU GPGPU GPU CPU 2 GPU 2007 NVIDIA GPGPU CUDA[3] GPGPU CUDA GPGPU CUDA GPGPU GPU GPU GPU Graphics Processing Unit LSI LSI CPU ( ) DRAM GPU LSI GPU GPGPU (I) GPU GPGPU 1 GPU(Graphics Processing Unit) GPU GPGPU(General-Purpose computing on GPUs) GPU GPGPU GPU ( PC ) PC PC GPU PC PC GPU GPU 2008 TSUBAME NVIDIA GPU(Tesla S1070) TOP500 29 [1] 2009 AMD

More information

untitled

untitled c 2020 70 800 1. 1 1 1,600 1 [1, 2] 112 8551 1 13 27 taguchi@ise.chuo-u.ac.jp 1 1.1 17 [1] 14 30 1,600 1.2 [2] 54 37 2017 1 Copyright c by ORSJ. Unauthorized reproduction of this article is prohibited.

More information

develop

develop SCore SCore 02/03/20 2 1 HA (High Availability) HPC (High Performance Computing) 02/03/20 3 HA (High Availability) Mail/Web/News/File Server HPC (High Performance Computing) Job Dispatching( ) Parallel

More information

untitled

untitled 1 4 4 6 8 10 30 13 14 16 16 17 18 19 19 96 21 23 24 3 27 27 4 27 128 24 4 1 50 by ( 30 30 200 30 30 24 4 TOP 10 2012 8 22 3 1 7 1,000 100 30 26 3 140 21 60 98 88,000 96 3 5 29 300 21 21 11 21

More information

bit bit bit VAST N d i d 1 <d 2 <...<d k <...<d N d k VAST d k 3 d k 3 d k 2 d k 1 d k 4 w w=4 ) HW HW 32bit γ δ [4] PForDelta [3] HW CPU VAST VAST VA

bit bit bit VAST N d i d 1 <d 2 <...<d k <...<d N d k VAST d k 3 d k 3 d k 2 d k 1 d k 4 w w=4 ) HW HW 32bit γ δ [4] PForDelta [3] HW CPU VAST VAST VA DEIM Forum 2013 F10-6 VAST CPU NTT, 180-0012 3-9-11 E-mail: {yamamuro.takeshi,onizuka.makoto,konishi.fumikazu}@lab.ntt.co.jp CPU HW HW HW VAST VAST SIMD CPU TLB bit VAST VAST VAST VAST CPU SIMD VAST-Tree

More information

FabHetero FabHetero FabHetero FabCache FabCache SPEC2000INT IPC FabCache 0.076%

FabHetero FabHetero FabHetero FabCache FabCache SPEC2000INT IPC FabCache 0.076% 2013 (409812) FabHetero FabHetero FabHetero FabCache FabCache SPEC2000INT 6 1000 IPC FabCache 0.076% Abstract Single-ISA heterogeneous multi-core processors are increasing importance in the processor architecture.

More information

Shonan Institute of Technology MEMOIRS OF SHONAN INSTITUTE OF TECHNOLOGY Vol. 41, No. 1, 2007 Ships1 * ** ** ** Development of a Small-Mid Range Paral

Shonan Institute of Technology MEMOIRS OF SHONAN INSTITUTE OF TECHNOLOGY Vol. 41, No. 1, 2007 Ships1 * ** ** ** Development of a Small-Mid Range Paral MEMOIRS OF SHONAN INSTITUTE OF TECHNOLOGY Vol. 41, No. 1, 2007 Ships1 * ** ** ** Development of a Small-Mid Range Parallel Computer Ships1 Makoto OYA*, Hiroto MATSUBARA**, Kazuyoshi SAKURAI** and Yu KATO**

More information

ポストペタスケール高性能計算に資するシステムソフトウェア技術の創出 平成 23 年度採択研究代表者 H27 年度 実績報告書 藤澤克樹 九州大学マス フォア インダストリ研究所 教授 ポストペタスケールシステムにおける超大規模グラフ最適化基盤 1. 研究実施体制 (1) 大規模最適化 グループ( 九

ポストペタスケール高性能計算に資するシステムソフトウェア技術の創出 平成 23 年度採択研究代表者 H27 年度 実績報告書 藤澤克樹 九州大学マス フォア インダストリ研究所 教授 ポストペタスケールシステムにおける超大規模グラフ最適化基盤 1. 研究実施体制 (1) 大規模最適化 グループ( 九 ポストペタスケール高性能計算に資するシステムソフトウェア技術の創出 平成 23 年度採択研究代表者 H27 年度 実績報告書 藤澤克樹 九州大学マス フォア インダストリ研究所 教授 ポストペタスケールシステムにおける超大規模グラフ最適化基盤 1. 研究実施体制 (1) 大規模最適化 グループ( 九州大学 ) 1 研究代表者 : 藤澤克樹 ( 九州大学マス フォア インダストリ研究所 教授 ) 2

More information

HPC (pay-as-you-go) HPC Web 2

HPC (pay-as-you-go) HPC Web 2 ,, 1 HPC (pay-as-you-go) HPC Web 2 HPC Amazon EC2 OpenFOAM GPU EC2 3 HPC MPI MPI Courant 1 GPGPU MPI 4 AMAZON EC2 GPU CLUSTER COMPUTE INSTANCE EC2 GPU (cg1.4xlarge) ( N. Virgina ) Quadcore Intel Xeon 5570

More information

ICDE2013study.ppt

ICDE2013study.ppt ICDE2013 勉強会 R10: Main Memory Query Processing 担当 : 山室健 1 概要 } このセクションの特徴 } in-memory を前提としたクエリ最適化 (Hash Join の高速化や MV による資源の利活用 ) に関する話題 } 紹介する論文リスト } 1. Efficient Many-Core Query Execution in Main Memory

More information

2004 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 2

2004 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 2 Living with Mac OS X in Lambda 21 2004 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 1 2004 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 2 2004

More information

MacOSXLambdaJava.aw

MacOSXLambdaJava.aw Living with Mac OS X in Lambda 21 2005 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 1 2005 Copyright by Tatsuo Minohara Programming with Mac OS X in Lambda 21 - page 2 2005

More information

untitled

untitled Power Wall HPL1 10 B/F EXTREMETECH Supercomputing director bets $2,000 that we won t have exascale computing by 2020 One of the biggest problems standing in our way is power. [] http://www.extremetech.com/computing/155941

More information

1重谷.PDF

1重谷.PDF RSCC RSCC RSCC BMT 1 6 3 3000 3000 200310 1994 19942 VPP500/32PE 19992 VPP700E/128PE 160PE 20043 2 2 PC Linux 2048 CPU Intel Xeon 3.06GHzDual) 12.5 TFLOPS SX-7 32CPU/256GB 282.5 GFLOPS Linux 3 PC 1999

More information

1, 4,a) 1, 4 1, 4 1, , 4 3, 4 HPC HPC HPC Slurm 1. HPC Tianhe MW MW [1] MW CREST a)

1, 4,a) 1, 4 1, 4 1, , 4 3, 4 HPC HPC HPC Slurm 1. HPC Tianhe MW MW [1] MW CREST a) Title 電力制約を考慮した資源管理を行うリソースマネージャの実装と評価 Author(s) 坂本, 龍一 ; タン, カオ ; 和, 遠 ; 近藤, 正章 ; 深沢, 圭田, 将嗣 ; 稲富, 雄一 ; 井上, 弘士 Citation 情報処理学会研究報告 = IPSJ SIG Technical Rep 2015-HPC-151(1): 1-8 Issue Date 2015-09-23 URL

More information

[2] 2. [3 5] 3D [6 8] Morishima [9] N n 24 24FPS k k = 1, 2,..., N i i = 1, 2,..., n Algorithm 1 N io user-specified number of inbetween omis

[2] 2. [3 5] 3D [6 8] Morishima [9] N n 24 24FPS k k = 1, 2,..., N i i = 1, 2,..., n Algorithm 1 N io user-specified number of inbetween omis 1,a) 2 2 2 1 2 3 24 Motion Frame Omission for Cartoon-like Effects Abstract: Limited animation is a hand-drawn animation style that holds each drawing for two or three successive frames to make up 24 frames

More information

21 20 20413525 22 2 4 i 1 1 2 4 2.1.................................. 4 2.1.1 LinuxOS....................... 7 2.1.2....................... 10 2.2........................ 15 3 17 3.1.................................

More information

先進的計算基盤システムシンポジウム SACSIS2012 Symposium on Advanced Computing Systems and Infrastructures SACSIS /5/18 CPU, CPU., Memory-bound CPU,., Memory-bo

先進的計算基盤システムシンポジウム SACSIS2012 Symposium on Advanced Computing Systems and Infrastructures SACSIS /5/18 CPU, CPU., Memory-bound CPU,., Memory-bo CPU, CPU, Memory-bound CPU,, Memory-bound ( ) Performance Monitoring Counter(PMC), PMC (nmi watchdog), PMC CPU., PMC, CPU, Memory-bound, CPU-bound,, CPU,, PMC,,,, CPU, NPB 8, 5% CPU, CPU, 3%, 5% CPU, IS

More information

HPEハイパフォーマンスコンピューティング ソリューション

HPEハイパフォーマンスコンピューティング ソリューション HPE HPC / AI Page 2 No.1 * 24.8% No.1 * HPE HPC / AI HPC AI SGIHPE HPC / AI GPU TOP500 50th edition Nov. 2017 HPE No.1 124 www.top500.org HPE HPC / AI TSUBAME 3.0 2017 7 AI TSUBAME 3.0 HPE SGI 8600 System

More information

FINAL PROGRAM 22th Annual Workshop SWoPP / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing

FINAL PROGRAM 22th Annual Workshop SWoPP / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing FINAL PROGRAM 22th Annual Workshop SWoPP 2009 2009 / / 2009 Sendai Summer United Workshops on Parallel, Distributed, and Cooperative Processing 2009 8 4 ( ) 8 6 ( ) 981-0933 1-2-45 http://www.forestsendai.jp

More information

HTM RaR HTM 2. 2) 3) HTM 2 3 Yoo 4) HTM Adaptive Transaction Scheduling Akpinar 5) HTM Gaona 6) HTM 3. Read-after-Read HTM 3.1 Read-after-Read Read Wr

HTM RaR HTM 2. 2) 3) HTM 2 3 Yoo 4) HTM Adaptive Transaction Scheduling Akpinar 5) HTM Gaona 6) HTM 3. Read-after-Read HTM 3.1 Read-after-Read Read Wr 1 1, 1 1 1 1 Readafter-Read Read-after-Read 66.9% A Speed-Up Technique for Hardware Transactional Memories by Reducing Concurrency Considering Conflicting Addresses Koshiro Hashimoto, 1 Masamichi Eto,

More information

or58_10_599.dvi

or58_10_599.dvi c 1. 450 m 14 26 1 =1.852 km/h 300 m 1 2 34 m 1933 2 30 m 135 8533 2 1 6 1 2009 12 29 m 1 (Weather Routing) [1] 2013 10 Copyright c by ORSJ. Unauthorized reproduction of this article is prohibited. 27

More information

プロセッサ・アーキテクチャ

プロセッサ・アーキテクチャ 2. NII51002-8.0.0 Nios II Nios II Nios II 2-3 2-4 2-4 2-6 2-7 2-9 I/O 2-18 JTAG Nios II ISA ISA Nios II Nios II Nios II 2 1 Nios II Altera Corporation 2 1 2 1. Nios II Nios II Processor Core JTAG interface

More information

卒業論文

卒業論文 PC OpenMP SCore PC OpenMP PC PC PC Myrinet PC PC 1 OpenMP 2 1 3 3 PC 8 OpenMP 11 15 15 16 16 18 19 19 19 20 20 21 21 23 26 29 30 31 32 33 4 5 6 7 SCore 9 PC 10 OpenMP 14 16 17 10 17 11 19 12 19 13 20 1421

More information

untitled

untitled c 1. 2 2011 2012 0.248 0.252 1 Data Envelopment Analysis DEA 4 2 180 8633 3 3 1 IT DHARMA Ltd. 272 0122 1 14 12 13.10.7 14.5.27 DEA-AR (Assurance Region) 1 DEA 1 1 [1] 2011 2012 220 446 [2] 2. [2] 1 1

More information

DEIM Forum 2010 D Development of a La

DEIM Forum 2010 D Development of a La DEIM Forum 2010 D5-3 432-8011 3-5-1 E-mail: {cs06062,cs06015}@s.inf.shizuoka.ac.jp, {yokoyama,fukuta,ishikawa}@.inf.shizuoka.ac.jp Development of a Large-scale Visualization System Based on Sensor Network

More information

untitled

untitled c OR 21 OR 1. 21 21 IoT OR OR OR 260 8672 1 8 1 OR 2. 2.1 public health [1] communicable (infectious) diseases vehicle burden HIV/AIDS (SARS) 258 60 Copyright c by ORSJ. Unauthorized reproduction of this

More information

倍々精度RgemmのnVidia C2050上への実装と応用

倍々精度RgemmのnVidia C2050上への実装と応用 .. maho@riken.jp http://accc.riken.jp/maho/,,, 2011/2/16 1 - : GPU : SDPA-DD 10 1 - Rgemm : 4 (32 ) nvidia C2050, GPU CPU 150, 24GFlops 25 20 GFLOPS 15 10 QuadAdd Cray, QuadMul Sloppy Kernel QuadAdd Cray,

More information

PassMark PerformanceTest ™

PassMark PerformanceTest ™ KRONOS S ライン 性能ベンチマーク オーバークロックモニター OCCT OverClock Checking Tool i7z (A better i7 (and now i3, i5) reporting tool for Linux) KRONOS S800 CATIA Benchmark Aerospace - 8/17 passengers Jet - Mid Fuse DELL Precision

More information

or58_8_462.dvi

or58_8_462.dvi c Twitter2013 30 2013 Twitter Twitter Twitter API 1. Twitter 2006 140 SNS Facebook mixi [1] No.345 2012 12 2013 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ALBERT 151 0053 2 22 17 15 16 17 18 EV 19 20 ALBERT 2013

More information

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments

Slides: TimeGraph: GPU Scheduling for Real-Time Multi-Tasking Environments 計算機アーキテクチャ第 11 回 マルチプロセッサ 本資料は授業用です 無断で転載することを禁じます 名古屋大学 大学院情報科学研究科 准教授加藤真平 デスクトップ ジョブレベル並列性 スーパーコンピュータ 並列処理プログラム プログラムの並列化 for (i = 0; i < N; i++) { x[i] = a[i] + b[i]; } プログラムの並列化 x[0] = a[0] + b[0];

More information

情報処理学会研究報告 IPSJ SIG Technical Report Vol.2013-HPC-139 No /5/29 Gfarm/Pwrake NICT NICT 10TB 100TB CPU I/O HPC I/O NICT Gf

情報処理学会研究報告 IPSJ SIG Technical Report Vol.2013-HPC-139 No /5/29 Gfarm/Pwrake NICT NICT 10TB 100TB CPU I/O HPC I/O NICT Gf Gfarm/Pwrake NICT 1 1 1 1 2 2 3 4 5 5 5 6 NICT 10TB 100TB CPU I/O HPC I/O NICT Gfarm Gfarm Pwrake A Parallel Processing Technique on the NICT Science Cloud via Gfarm/Pwrake KEN T. MURATA 1 HIDENOBU WATANABE

More information

<95DB8C9288E397C389C88A E696E6462>

<95DB8C9288E397C389C88A E696E6462> 2011 Vol.60 No.2 p.138 147 Performance of the Japanese long-term care benefit: An International comparison based on OECD health data Mie MORIKAWA[1] Takako TSUTSUI[2] [1]National Institute of Public Health,

More information

IPSJ SIG Technical Report Vol.2017-ARC-225 No.12 Vol.2017-SLDM-179 No.12 Vol.2017-EMB-44 No /3/9 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS

IPSJ SIG Technical Report Vol.2017-ARC-225 No.12 Vol.2017-SLDM-179 No.12 Vol.2017-EMB-44 No /3/9 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS 1 1 RTOS DefensiveZone DefensiveZone MPU RTOS RTOS OS Lightweight partitioning architecture for automotive systems Suzuki Takehito 1 Honda Shinya 1 Abstract: Partitioning using protection RTOS has high

More information

東京大学情報基盤センターFX10スパコンシステム(Oakleaf-FX)活用事例

東京大学情報基盤センターFX10スパコンシステム(Oakleaf-FX)活用事例 FX10 Oakleaf-FX Practical use of FX10 Supercomputer System (Oakleaf-FX) of Information Technology Center, The University of Tokyo 坂口吉生 小倉崇浩 あらまし FUJITSU Supercomputer PRIMEHPC FX10 Oakleaf-FX 2012 4 Oakleaf-FX

More information

IPSJ SIG Technical Report Vol.2013-HPC-138 No /2/21 GPU CRS 1,a) 2,b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla

IPSJ SIG Technical Report Vol.2013-HPC-138 No /2/21 GPU CRS 1,a) 2,b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla GPU CRS 1,a),b) SpMV GPU CRS SpMV GPU NVIDIA Kepler CUDA5.0 Fermi GPU Kepler Kepler Tesla K0 CUDA5.0 cusparse CRS SpMV 00 1.86 177 1. SpMV SpMV CRS Compressed Row Storage *1 SpMV GPU GPU NVIDIA Kepler

More information

Cloud[2] (48 ) Xeon Phi (50+ ) IBM Cyclops[9] (64 ) Cavium Octeon II (32 ) Tilera Tile-GX (100 ) PE [11][7] 2 Nsim[10] 8080[1] SH-2[5] SH [8

Cloud[2] (48 ) Xeon Phi (50+ ) IBM Cyclops[9] (64 ) Cavium Octeon II (32 ) Tilera Tile-GX (100 ) PE [11][7] 2 Nsim[10] 8080[1] SH-2[5] SH [8 1600 1,a) 1,b) 8080 SH-2 8080 SH-2 Simulation of a Many-Core Architecture with 16 Million Processing Cores Hisanobu Tomari 1,a) Kei Hiraki 1,b) Abstract: 8080 and SH-2 processors are evaluated as building

More information

IPSJ SIG Technical Report Vol.2012-ARC-202 No.13 Vol.2012-HPC-137 No /12/13 Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) GPU HA-PACS

IPSJ SIG Technical Report Vol.2012-ARC-202 No.13 Vol.2012-HPC-137 No /12/13 Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) GPU HA-PACS Tightly Coupled Accelerators 1,a) 1,b) 1,c) 1,d) HA-PACS 2012 2 HA-PACS TCA (Tightly Coupled Accelerators) TCA PEACH2 1. (Graphics Processing Unit) HPC GP(General Purpose ) TOP500 [1] CPU PCI Express (PCIe)

or57_4_175.dvi c Excel Excel Excel Excel Microsoft Excel 1. OR Microsoft Excel Excel 1 Excel Excel Excel or 2007 Excel OR Excel Excel LP Excel LP Excel 112 8551 1 13 27 1 Excel Excel Excel 2010 Excel OpenOffice Calc

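VXPRO R1400® Proposal Document Intel Core i7 Processor 920 Preliminary Performance Report Node performance evaluation: performance evaluation with the OpenMP version of NAS Parallel Benchmark Class B. The number of execution threads is fixed at 4 (on dual-socket systems, 2 threads per processor). Because all cores run at 2.66 GHz, the peak performance per core is the same. Evaluation system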

AV 1000 BASE-T LAN 90 IEEE ac USB (3 ) LAN (IEEE 802.1X ) LAN AWS (Amazon Web Services) AP 3 USB wget iperf3 wget 40 MBytes 2 wget 40 MByt 1 BYOD LAN 1 2 3 4 1 BYOD 1 Gb/s LAN BYOD LAN LAN Access Point (AP) IEEE 802.11n BYOD LAN AP wget iperf3 1 AP [2] 2 IEEE 802.11ac [3] AP 4 AV (207 m 2 ) ( 1 2 )[4, 5] AP Wave2 Aruba AP-335 Aruba LAN 7210

untitled AMD HPC GP-GPU Opteron HPC 2 1 AMD Opteron 85 FLOPS 10,480 TOP500 16 T2K 95 FLOPS 10,800 140 FLOPS 15,200 61 FLOPS 7,200 3 Barcelona 4 2 AMD Opteron CPU!! ( ) L1 5 2003 2004 2005 2006 2007 2008 2009 2010

23 Fig. 2: hwmodulev2 3. Reconfigurable HPC 3.1 hw/sw hw/sw hw/sw FPGA PC FPGA PC FPGA HPC FPGA FPGA hw/sw hw/sw hw- Module FPGA hwmodule hw/sw FPGA h 23 FPGA CUDA Performance Comparison of FPGA Array with CUDA on Poisson Equation (lijiang@sekine-lab.ei.tuat.ac.jp), (kazuki@sekine-lab.ei.tuat.ac.jp), (takahashi@sekine-lab.ei.tuat.ac.jp), (tamukoh@cc.tuat.ac.jp),

IPSJ SIG Technical Report Vol.2015-HPC-150 No /8/6 I/O Jianwei Liao 1 Gerofi Balazs 1 1 Guo-Yuan Lien Prototyping F I/O Jianwei Liao 1 Gerofi Balazs 1 1 Guo-Yuan Lien 1 1 1 1 1 30 30 100 30 30 2 Prototyping File I/O Arbitrator Middleware for Real-Time Severe Weather Prediction System Jianwei Liao 1 Gerofi Balazs 1 Yutaka

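ScaleGraph Development and evaluation of a high-performance general-purpose solver for extremely large-scale semidefinite programming problems. Mathematical programming problems (optimization problems) and the 2015 forecast (targets): very broad range of applications (business, society, public policy); building a high-performance solver is itself an optimization problem; sensor data is making optimization problems more complex and much larger; semidefinite programming (SDP) and mixed-integer programming (MIP) are the two most prominent mathematical programming problems; need for general-purpose solvers (assumptions and tuning specific to individual problems have little effect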

C++ TPDPL(Template Parallel Distributed Processing Library) C X10 1) Place Activity X10 Place 2) 2.2 C++ C/C++OpenMP MPI C/C++ OpenMP C++ 1 2 2 CPU S.C. () PC C++ TPDPL(Template Parallel Distributed Processing Library) PE(Processing Element ) S.C.(T2K ) An Implementation of C++ Task Mapping Library and Evaluation on Heterogeneous Environments

HPC可視化_小野2.pptx Rank Site Processors RMax Processor System Model 1 DOE/NNSA/LANL 122400 1026000 PowerXCell 8i BladeCenter QS22 Cluster 2 DOE/NNSA/LLNL 212992 478200 PowerPC 440 BlueGene/L 3 Argonne

Second-semi.PDF PC 2000 2 18 2 HPC Agenda PC Linux OS UNIX OS Linux Linux OS HPC 1 1CPU CPU Beowulf PC (PC) PC CPU(Pentium ) Beowulf: NASA Thomas Sterling Donald Becker 2 (PC ) Beowulf PC!! Linux Cluster (1) Level 1:

Dual Stack Virtual Network Dual Stack Network RS DC Real Network general terminal GN NTM terminal C NTM terminal B IPv4 Private Network IPv4 Global Network NTM terminal A NTM terminal B root Android IPv4/ 1 1 2 1 NAT Network Address Translation IPv4 NTMobile Network Traversal with Mobility NTMobile Android 4.0 VPN API VpnService root VpnService IPv4 IPv4 VpnService NTMobile root IPv4/

untitled taisuke@cs.tsukuba.ac.jp http://www.hpcs.is.tsukuba.ac.jp/~taisuke/ CP-PACS HPC PC post CP-PACS CP-PACS II 1990 HPC RWCP, HPC Even what was once the world's fastest computer: ranked first on the November 1996 TOP500, peak performance 614 GFLOPS, Linpack performance 368 GFLOPS (before the Earth Simulator

Estimation of Photovoltaic Module Temperature Rise Motonobu Yukawa, Member, Masahisa Asaoka, Non-member (Mitsubishi Electric Corp.) Keigi Takahara, Me Estimation of Photovoltaic Module Temperature Rise Motonobu Yukawa, Member, Masahisa Asaoka, Non-member (Mitsubishi Electric Corp.) Keigi Takahara, Member (Okinawa Electric Power Co.,Inc.) Toshimitsu Ohshiro,

4.1 % 7.5 % 2018 (412837) 4.1 % 7.5 % Abstract Recently, various methods for improving computational performance have been proposed. One of these methods is multi-core. Multi-core processors can execute processes in parallel

28 NTMobile Java Proposal and Implementation of Java Wrapper for NTMobile ( : ) : 28 NTMobile Java Proposal and Implementation of Java Wrapper for NTMobile ( : 130441077) : 29 2 10 NTMobile Network Traversal with Mobility NTMobile Linux NTMobile C Java NTMobile Java Java JNA Java Native

IPSJ SIG Technical Report Vol.2015-ARC-215 No.13 Vol.2015-OS-133 No /5/ ,a) % 13.9% 1. Transactional Memory: TM [1] TM TM 1 Nag 1 1 1 1,a) 16 67.2% 13.9% 1. Transactional Memory: TM [1] TM TM 1 Nagoya Institute of Technology, Nagoya, Aichi, 466-8555, Japan a) tsumura@computer.org Hardware Transactional Memory: HTM HTM Read Write

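Microsoft PowerPoint - CCS学際共同boku-08b.ppt Impact of memory performance on multi-core/multi-socket nodes. Principal investigator: 朴泰祐 (Taisuke Boku), Graduate School of Systems and Information Engineering, University of Tsukuba, taisuke@cs.tsukuba.ac.jp. Outline: trends and problems in recent high-performance PC clusters; multi-core/multi-socket nodes and memory performance; performance measurement focusing on memory bandwidth; multi-link network performance evaluation; summary. Recent high-performance PC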
