Second-semi.PDF




PC 2000/2/18 2 HPC

Agenda PC

Linux OS UNIX OS Linux Linux OS

HPC 1 1CPU CPU

Beowulf PC (PC) PC CPU (Pentium) Beowulf: NASA Thomas Sterling Donald Becker 2 (PC) Beowulf

PC!!

Linux Cluster (1) Level 1: OS Level 2:

Linux Cluster (2) Level 3: ( )

SofTek PC Cluster 1350-324 324 (24 node)
Spec
CPU: Pentium III 500MHz * 24
RAM: 12GB
Network: Fast Ethernet (100BaseTX)
Peak Performance: 13.6GFlops
OS: LASER5 Linux 6.0 (kernel 2.2.5)
Compiler: PGI CDK
Programming Model: C/C++, F77, F90, HPF
Parallel Programming: MPI, PVM, HPF

Beowulf type 1. 2.

Bottleneck: Processor / Memory. Processor, on-chip cache (16K-32KB), cache (128K-4MB), memory

Pentium II 300MHz

Pentium II 300MHz

1CPU 1CPU/SMP Linpack LAPACK ScaLAPACK ATLAS ASCI-Red Opt. BLAS PBLAS (Parallel BLAS) BLAS (Basic Linear Algebra Subprograms) BLACS (Basic Linear Algebra Communication Subprograms) PVM/MPI

BLAS Level
Level 1 BLAS: Vector-Vector Operations, V <- V + S * V
Level 2 BLAS: Matrix-Vector Operations, V <- M * V
Level 3 BLAS: Matrix-Matrix Operations, M <- M + M * M

BLAS Operation
Level 1 BLAS: y = y + s * x
Level 2 BLAS: y = y + A * x
Level 3 BLAS: C = C + A * B
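The three operations above can be sketched in plain Python. This is an illustrative reference only, not the tuned BLAS routines discussed in these slides; the function names borrow the usual BLAS naming (axpy, gemv, gemm), and `flops_per_word` is a hypothetical helper that uses a rough count of data words touched to suggest why Level 3 operations can reuse cached data far more than Level 1.

```python
# Pure-Python reference versions of one operation from each BLAS level.
# Illustrative sketches only; real BLAS libraries are heavily tuned.

def axpy(s, x, y):
    """Level 1: y = y + s * x for vectors x and y."""
    return [yi + s * xi for xi, yi in zip(x, y)]

def gemv(A, x, y):
    """Level 2: y = y + A * x for a matrix A and vectors x, y."""
    return [yi + sum(a * xj for a, xj in zip(row, x)) for row, yi in zip(A, y)]

def gemm(A, B, C):
    """Level 3: C = C + A * B for n x n matrices."""
    n = len(A)
    return [[C[i][j] + sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def flops_per_word(level, n):
    """Rough ratio of floating-point operations to data words touched
    for order n (hypothetical helper; the counts are approximations)."""
    flops = {1: 2 * n, 2: 2 * n * n, 3: 2 * n ** 3}[level]
    words = {1: 3 * n, 2: n * n + 3 * n, 3: 4 * n * n}[level]
    return flops / words
```

Under these counts the ratio stays below 1 for Level 1, approaches 2 for Level 2, and grows as n/2 for Level 3, so only Level 3 leaves room for each loaded word to be reused many times.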

[Figure: BLAS performance in MFLOPS versus order of vector/matrix, for Level 1 BLAS, Level 2 BLAS, and Level 3 BLAS]

LU (Linpack & LAPACK): ATLAS BLAS, LAPACK (BLAS 3), ASCI-Red BLAS, Normal BLAS, PGI compiler, Linpack (BLAS 1)

Linpack / LAPACK
Linpack: Level 1 BLAS; Coding Style; Cache
LAPACK: Level 3 BLAS; Block algorithm; Cache
BLAS, Cache: ATLAS (Automatically Tuned Linear Algebra Software), ASCI-Red BLAS
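The block algorithm idea mentioned above can be sketched as a tiled matrix multiply. This is a minimal illustration assuming square matrices stored as lists of lists; the name `matmul_blocked` and the `block` parameter are hypothetical, and the code is not taken from LAPACK or ATLAS. The point is only the loop structure: each tile of A and B is reused for a whole tile of C, which is what lets a cache-sized working set stay resident.

```python
def matmul_blocked(A, B, block=2):
    """Compute C = A * B for n x n matrices, one block x block tile at a time."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):          # tile row of C
        for jj in range(0, n, block):      # tile column of C
            for kk in range(0, n, block):  # tile along the inner dimension
                for i in range(ii, min(ii + block, n)):
                    for j in range(jj, min(jj + block, n)):
                        acc = C[i][j]
                        for k in range(kk, min(kk + block, n)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C
```

In a real library the tile size would be chosen to fit the cache of the target processor; ATLAS, as its name says, tunes such parameters automatically.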

TCP/IP (1) Socket I/F TCP/IP window, ack, checksum CPU TCP/IP mbuf TCP CPU
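As one concrete example of the per-byte CPU work listed above, the checksum used by TCP and IP can be sketched as follows. This follows the one's-complement sum described in RFC 1071; the function name `internet_checksum` is chosen here for illustration, and a real TCP checksum also covers a pseudo-header that this sketch omits.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # add each 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back into 16 bits
    return ~total & 0xFFFF
```

Every byte of every packet passes through a loop like this unless the NIC offloads it, which is part of why the protocol stack costs CPU time.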

TCP/IP (2) MTU (Ethernet: 1500 bytes) large packet OS interrupt
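The effect of the 1500-byte Ethernet MTU can be made concrete with a little arithmetic. The sketch below assumes 20-byte IP and 20-byte TCP headers with no options, and `packets_for_message` is a hypothetical helper; whether each packet raises its own OS interrupt depends on the NIC and driver, so the packet count is only an upper-bound proxy for interrupt load.

```python
import math

def packets_for_message(nbytes, mtu=1500, ip_header=20, tcp_header=20):
    """Number of TCP segments needed to carry nbytes of payload."""
    payload = mtu - ip_header - tcp_header   # 1460 bytes for Ethernet defaults
    return math.ceil(nbytes / payload)

# A 1 MB message needs several hundred packets at the Ethernet MTU.
count = packets_for_message(1_000_000)
```

Here `count` is 685, so a single large send is split into hundreds of small frames, each of which the receiving host must process.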

IP USER Space Kernel Space NIC TCP/UDP TCP/IP

(NIC) API: M-VIA (Linux VI Architecture), GAMMA (Linux Active Message); M-VIA, GAMMA MPI

M-VIA (A High Performance Modular VIA for Linux) [3] VI Architecture API NIC: DEC Tulip (DC21*4*, 21143) chip, Intel i8255x (for x=7, 8 or 9) chip, Packet Engines GNIC-I, GNIC-II Gigabit Ethernet M-VIA

MVICH [4] VI Architecture MPICH 1.1.2 MPI (0.0.3 bsend, pack/unpack M-VIA

Pentium III 500MHz 2, Memory 384MB, Intel EtherExpress Pro/100 NIC, 100Base Switching Hub, Linux 2.2.13 128byte MPICH MVICH 1.9

MPICH socket(tcp) MVICH(M-VIA) M-VIA 4Kbyte MPICH MVICH 34% 139%(32byte)
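Comparisons by message size like the one above are typically obtained with a ping-pong test. The sketch below shows the measurement idea only: it assumes a local socket pair from the Python standard library rather than MPICH or MVICH, and the names `recv_exact` and `ping_pong` are hypothetical. It says nothing about the numbers reported in these slides.

```python
import socket
import threading
import time

def recv_exact(sock, n):
    """Read exactly n bytes from sock."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)

def ping_pong(size, rounds=100):
    """Return (one-way latency in seconds, bytes per second) for a message size."""
    a, b = socket.socketpair()

    def echo():
        # The peer sends every message straight back.
        for _ in range(rounds):
            b.sendall(recv_exact(b, size))

    peer = threading.Thread(target=echo)
    peer.start()
    msg = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(msg)
        recv_exact(a, size)
    elapsed = time.perf_counter() - start
    peer.join()
    a.close()
    b.close()
    one_way = elapsed / (2 * rounds)   # half of the average round-trip time
    return one_way, size / one_way
```

Running `ping_pong` over a range of sizes (for example 32 bytes up to several kilobytes) gives one latency or throughput point per size for a given transport.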

GAMMA (Genoa Active Message Machine) [5] communication handlers Active Messages [7] API NIC: DEC Tulip (DC21*4*, 21143) chipsets, Intel i8255x (for x=7, 8 or 9) chipsets

MPI/GAMMA [6] GAMMA MPI MPICH 1.1.2 Fast Ethernet MPI

Pentium III 500MHz 2, Memory 384MB, DEC DC21143 NIC, 100Base Switching Hub, Linux 2.2.13 128byte MPICH MPI/GAMMA 3.1

MPICH socket(tcp) MPI/GAMMA GAMMA 8Kbyte MPICH MPI/GAMMA 49% 404% (32byte)

IP

MPICH socket(tcp) MVICH(M-VIA) M-VIA MPI/GAMMA GAMMA IP

( ) Fast Ethernet API MPI GAMMA MPI/GAMMA Fast Ethernet Gigabit Fast Ethernet

ScaLAPACK ScaLAPACK (Scalable Linear Algebra PACKage) LAPACK PGI CDK ScaLAPACK LU xdlutime [8] Pentium III ATLAS BLAS ASCI-Red BLAS ScaLAPACK MPI MPICH p4 MPI/GAMMA BLAS MPI

Pentium III 500MHz 4, Memory 256MB, DEC DC21143 NIC, 100Base Switching Hub, Linux 2.2.13
CPU: 2 x 2; 1 block: 64 x 64; matrix: N x N
Process layout: cpu1 cpu2 / cpu3 cpu4
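ScaLAPACK distributes a matrix over a process grid in a two-dimensional block-cyclic layout. The sketch below shows that mapping for the configuration on this slide, assuming a 2 x 2 grid, 64 x 64 blocks, and row-major numbering of cpu1 to cpu4; the function names `owner` and `cpu_name` are hypothetical and are not part of the ScaLAPACK API.

```python
def owner(i, j, block=64, prow=2, pcol=2):
    """Grid coordinates (row, col) of the process holding matrix element (i, j)."""
    return ((i // block) % prow, (j // block) % pcol)

def cpu_name(i, j, block=64, prow=2, pcol=2):
    """Label of the owning process, numbering the grid row by row from cpu1."""
    r, c = owner(i, j, block, prow, pcol)
    return "cpu" + str(r * pcol + c + 1)
```

For example, element (0, 0) lies on cpu1, element (64, 64) on cpu4, and element (128, 0) wraps around to cpu1 again, so each CPU holds blocks scattered across the whole N x N matrix.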

ScaLAPACK Matrix size 2000 ASCI-red, MPI/GAMMA

ScaLAPACK ASCI-MPICH ASCI-MPI/GAMMA 4% 79% (size 100)

1 VAMPIR VAMPIRtrace MPICH MPI/GAMMA

2 MPICH(p4)

3 MPI/GAMMA

4 850Mflop/s

[1] http://www.netlib.org/atlas/
[2] http://www.cs.utk.edu/~ghenry/distrib/archive.htm
[3] http://www.nersc.gov/research/ftg/via/
[4] http://www.nersc.gov/research/ftg/mvich/index.html
[5] http://www.disi.unige.it/project/gamma/
[6] http://www.disi.unige.it/project/gamma/mpigamma/
[7] http://now.cs.berkeley.edu/am/active_messages.html
[8] http://ie.korea.ac.kr/~supercom/software/