Second-semi.PDF




PC 2000 2 18 2 HPC

Agenda PC

Linux OS UNIX OS Linux Linux OS

HPC 1 1CPU CPU

Beowulf PC (PC) PC CPU (Pentium) Beowulf: NASA, Thomas Sterling, Donald Becker 2 (PC) Beowulf

PC!!

Linux Cluster (1) Level 1: OS Level 2:

Linux Cluster (2) Level 3: ( )

SofTek PC Cluster 1350-324 (24 node)
Spec
CPU: Pentium III 500MHz * 24
RAM: 12GB
Network: Fast Ethernet (100BaseTX)
Peak Performance: 13.6GFlops
OS: LASER5 Linux 6.0 (kernel 2.2.5)
Compiler: PGI CDK
Programming Model: C/C++, F77, F90, HPF
Parallel Programming: MPI, PVM, HPF

Beowulf type 1. 2.

Bottleneck: Processor, Memory
Processor: On-chip Cache 16KB-32KB
Cache: 128KB-4MB
Memory

Pentium II 300MHz


1CPU, 1CPU/SMP
Linpack, LAPACK, ScaLAPACK
ATLAS, ASCI-Red Opt. BLAS
PBLAS: Parallel BLAS
BLAS: Basic Linear Algebra Subprograms
BLACS: Basic Linear Algebra Communication Subprograms
PVM/MPI

BLAS Level
Level 1 BLAS: Vector-Vector Operations, V <- V + S * V
Level 2 BLAS: Matrix-Vector Operations, V <- M * V
Level 3 BLAS: Matrix-Matrix Operations, M <- M + M * M

BLAS
Level 1 BLAS: y = y + s * x Operation
Level 2 BLAS: y = y + A * x Operation
Level 3 BLAS: C = C + A * B Operation
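The three operations above can be sketched in plain Python to show why the levels behave differently. This is an illustrative sketch only, not the BLAS API: the function names and the simple operand count in `flops_per_operand` are assumptions made for this example.

```python
def axpy(s, x, y):
    """Level 1 style update: y = y + s * x (vectors)."""
    return [yi + s * xi for xi, yi in zip(x, y)]


def gemv(A, x, y):
    """Level 2 style update: y = y + A * x (matrix times vector)."""
    return [yi + sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]


def gemm(A, B, C):
    """Level 3 style update: C = C + A * B (matrix times matrix)."""
    inner = len(B)
    cols = len(B[0])
    return [
        [C[i][j] + sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(A))
    ]


def flops_per_operand(level, n):
    """Rough ratio of floating-point operations to input operands for order n.

    Assumed counts (hypothetical simplification for this sketch):
      level 1: 2n flops on 2n operands
      level 2: 2n^2 flops on n^2 + 2n operands
      level 3: 2n^3 flops on 3n^2 operands
    """
    flops = {1: 2 * n, 2: 2 * n * n, 3: 2 * n ** 3}[level]
    operands = {1: 2 * n, 2: n * n + 2 * n, 3: 3 * n * n}[level]
    return flops / operands
```

Under these assumed counts, only the Level 3 ratio grows with n, which is why matrix-matrix operations can reuse data held in cache while Level 1 and Level 2 remain limited by memory traffic.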

[Figure: BLAS performance (MFLOPS) versus order of vector/matrix, for Level 1, Level 2, and Level 3 BLAS]

LU (Linpack & LAPACK)
[Figure legend: ATLAS BLAS LAPACK (BLAS 3); ASCI-Red BLAS; Normal BLAS; PGI compiler Linpack (BLAS 1)]

Linpack, LAPACK
Linpack: Level 1 BLAS; Coding Style; Cache
LAPACK: Level 3 BLAS; Block algorithm; Cache
BLAS, Cache: ATLAS (Automatically Tuned Linear Algebra Software), ASCI-Red BLAS
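The idea behind a block algorithm can be sketched as a tiled matrix multiply, in which each small tile of the operands is reused many times before moving on. This is a minimal illustration and not the LAPACK or ATLAS implementation. The function name `blocked_gemm` and the default tile size of 64 are assumptions for this example; a real library tunes the tile size to the cache.

```python
def blocked_gemm(A, B, C, block=64):
    """Compute C = C + A * B in place, one block x block tile at a time.

    A is n x m, B is m x p, C is n x p (lists of lists).
    The tile size `block` is a hypothetical parameter, not a tuned value.
    """
    n, m, p = len(A), len(B), len(B[0])
    for ii in range(0, n, block):              # tile rows of A and C
        for kk in range(0, m, block):          # tile along the inner dimension
            for jj in range(0, p, block):      # tile columns of B and C
                for i in range(ii, min(ii + block, n)):
                    row_a, row_c = A[i], C[i]
                    for k in range(kk, min(kk + block, m)):
                        a, row_b = row_a[k], B[k]
                        for j in range(jj, min(jj + block, p)):
                            row_c[j] += a * row_b[j]
    return C
```

The arithmetic is identical to the unblocked triple loop; only the order of the updates changes, so that the working set of the inner three loops is three tiles rather than three full matrices.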

TCP/IP (1)
Socket I/F, TCP/IP: window, ack, checksum, CPU
TCP/IP: mbuf, TCP, CPU

TCP/IP (2)
MTU (Ethernet: 1500 bytes), large packet, OS interrupt
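As a rough illustration of the MTU point, the sketch below counts how many Ethernet frames a message of a given size needs. It assumes 40 bytes of TCP/IP headers per frame (20-byte IPv4 header plus 20-byte TCP header, no options); the function name `frames_needed` and its parameters are hypothetical, introduced only for this example.

```python
import math


def frames_needed(message_bytes, mtu=1500, header_bytes=40):
    """Number of frames needed to carry message_bytes of payload.

    Assumes each frame carries (mtu - header_bytes) bytes of payload,
    i.e. 1460 bytes with the defaults used here.
    """
    if message_bytes <= 0:
        return 0
    payload_per_frame = mtu - header_bytes
    return math.ceil(message_bytes / payload_per_frame)
```

For example, `frames_needed(8192)` shows how a single large message is split into several frames, each of which can cost the OS an interrupt.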

IP USER Space Kernel Space NIC TCP/UDP TCP/IP

(NIC) API
M-VIA (Linux VI Architecture)
GAMMA (Linux Active Message)
M-VIA, GAMMA MPI

M-VIA (A High Performance Modular VIA for Linux) [3]
VI Architecture API
NIC: DEC Tulip (DC21*4*, 21143) chip, Intel i8255x (for x=7, 8 or 9) chip, Packet Engines GNIC-I, GNIC-II Gigabit Ethernet
M-VIA

MVICH [4]
VI Architecture, MPICH 1.1.2, MPI (0.0.3: bsend, pack/unpack)
M-VIA

Pentium III 500MHz x 2, Memory 384MB, Intel EtherExpress Pro/100 NIC, 100Base Switching Hub, Linux 2.2.13
128byte: MPICH, MVICH, 1.9

MPICH socket(tcp) MVICH(M-VIA) M-VIA 4Kbyte MPICH MVICH 34% 139%(32byte)

GAMMA (Genoa Active Message Machine) [5]
communication handlers, Active Messages [7] API
NIC: DEC Tulip (DC21*4*, 21143) chipsets, Intel i8255x (for x=7, 8 or 9) chipsets

MPI/GAMMA [6] GAMMA MPI MPICH 1.1.2 Fast Ethernet MPI

Pentium III 500MHz x 2, Memory 384MB, DEC DC21143 NIC, 100Base Switching Hub, Linux 2.2.13
128byte: MPICH, MPI/GAMMA, 3.1

MPICH socket(tcp) MPI/GAMMA GAMMA 8Kbyte MPICH MPI/GAMMA 49% 404% (32byte)

IP

MPICH socket(tcp) MVICH(M-VIA) M-VIA MPI/GAMMA GAMMA IP

( ) Fast Ethernet API MPI GAMMA MPI/GAMMA Fast Ethernet Gigabit Fast Ethernet

ScaLAPACK
ScaLAPACK (Scalable Linear Algebra PACKage), LAPACK
PGI CDK, ScaLAPACK LU: xdlutime [8]
Pentium III: ATLAS BLAS, ASCI-Red BLAS
ScaLAPACK MPI: MPICH p4, MPI/GAMMA
BLAS, MPI

Pentium III 500MHz x 4, Memory 256MB, DEC DC21143 NIC, 100Base Switching Hub, Linux 2.2.13
CPU grid: 2 x 2; 1 block: 64 x 64; matrix: N x N
cpu1 cpu2
cpu3 cpu4
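ScaLAPACK distributes a matrix over the process grid in a two-dimensional block-cyclic layout. The sketch below shows which CPU would own a given matrix element under the 2 x 2 grid and 64 x 64 block size listed above. It assumes a row-major naming of the grid (cpu1 cpu2 on the first row, cpu3 cpu4 on the second) and zero-based matrix indices; the function names are hypothetical and are not ScaLAPACK routines.

```python
def owner(i, j, block=64, grid_rows=2, grid_cols=2):
    """Return (grid_row, grid_col) of the process owning element (i, j).

    Block-cyclic rule: block number modulo the grid dimension, applied
    independently to rows and columns. Indices i and j are zero-based.
    """
    return (i // block) % grid_rows, (j // block) % grid_cols


def cpu_name(i, j, block=64, grid_rows=2, grid_cols=2):
    """Map the owner to a name cpu1..cpuN, assuming row-major numbering."""
    grid_row, grid_col = owner(i, j, block, grid_rows, grid_cols)
    return "cpu%d" % (grid_row * grid_cols + grid_col + 1)
```

Because blocks are dealt out cyclically, every CPU holds blocks from all parts of the N x N matrix, which keeps the load balanced as the LU factorization shrinks the active submatrix.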

ScaLAPACK Matrix size 2000 ASCI-red, MPI/GAMMA

ScaLAPACK ASCI-MPICH ASCI-MPI/GAMMA 4% 79% (size 100)

1 VAMPIR VAMPIRtrace MPICH MPI/GAMMA

2 MPICH(p4)

3 MPI/GAMMA

4 850Mflop/s
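A rate such as 850Mflop/s can be related to run time through the operation count of LU factorization. The sketch below uses the leading-order count of 2n^3/3 floating-point operations, ignoring lower-order terms. The function names are hypothetical, and the source does not state the matrix size at which 850Mflop/s was measured, so any size passed in is an assumption.

```python
def lu_flops(n):
    """Leading-order operation count for LU factorization of an n x n matrix."""
    return 2.0 * n ** 3 / 3.0


def lu_seconds(n, mflops):
    """Estimated run time in seconds at a sustained rate of `mflops` Mflop/s."""
    return lu_flops(n) / (mflops * 1.0e6)


def lu_mflops(n, seconds):
    """Sustained rate in Mflop/s implied by a measured run time."""
    return lu_flops(n) / seconds / 1.0e6
```

For instance, `lu_seconds(2000, 850.0)` estimates the time for a matrix of size 2000 at that rate, and `lu_mflops` converts a measured time back to a rate for comparison between MPI libraries.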

[1] http://www.netlib.org/atlas/
[2] http://www.cs.utk.edu/~ghenry/distrib/archive.htm
[3] http://www.nersc.gov/research/ftg/via/
[4] http://www.nersc.gov/research/ftg/mvich/index.html
[5] http://www.disi.unige.it/project/gamma/
[6] http://www.disi.unige.it/project/gamma/mpigamma/
[7] http://now.cs.berkeley.edu/am/active_messages.html
[8] http://ie.korea.ac.kr/~supercom/software/