Parallel Studio XE 2013 Cluster Studio XE 2013
Intel's Terms and Conditions of Sale
Sandy Bridge; SYSmark, MobileMark: http://www.intel.com/performance/
Intel, Intel Atom, Intel Core, Intel Xeon Phi, Xeon, Cilk, VTune / Intel Corporation
Hyper-Threading (HT) Technology (HT, PC, Core): http://www.intel.co.jp/content/www/jp/ja/architecture-and-technology/hyper-threading/hyper-threading-technology.html
Intel 64 architecture (64-bit; BIOS, PC): http://www.intel.com/content/www/jp/ja/architecture-and-technology/microarchitecture/intel-64-architecture-general.html
Turbo Boost Technology (PC): http://www.intel.co.jp/content/www/jp/ja/architecture-and-technology/turbo-boost/turbo-boost-technology.html
Parallel Studio XE 2013 / Cluster Studio XE 2013
50+ / 128, 256, 512
Parallel Studio XE 2013 / Cluster Studio XE 2013
Advisor XE (Studio)
Composer XE: C/C++, Fortran, Cilk Plus
MPI
VTune Amplifier XE & Inspector XE
Trace Analyzer & Collector: MPI, hotspot
C++11; Linux; Fortran; C#; Ivy Bridge; Java; Fortran 2008; (Windows); C/C++; Haswell; CPU; MPI; MPI 2.2; Xeon Phi; Cluster Studio XE; Windows; Linux
Parallel Studio XE 2013 / Cluster Studio XE 2013
Intel Parallel Studio XE, Intel Cluster Studio XE
Processors: 3rd-generation Core (code name: Ivy Bridge); (code name: Haswell); Xeon Phi
Compilers: C++, Fortran
OS: Windows 8 Desktop, Linux
IDE: Visual Studio 2008, 2010, 2012; GNU
Standards: C99, C++11, Fortran 2003, Fortran 2008, MPI 2.2
Ivy Bridge, Haswell, Xeon Phi (code names)
C++/Fortran: AVX; AVX2, FMA3; IMCI
MPI
MKL: AVX; AVX2, FMA3
MPI
VTune Amplifier XE, Inspector XE
Composer XE 2013 (Windows, Linux, Mac OS X)
C++ Composer XE 2013: Cilk Plus, C++ Compiler XE 13.0, TBB, MKL, IPP; Xeon Phi (Linux)
Fortran Composer XE 2013: Fortran Compiler XE 13.0, MKL; Compaq Visual Fortran; Fortran 2003/2008; Xeon Phi (Linux)
Composer XE 2013: C++ Composer XE and Fortran Composer XE; C++, Fortran; Windows (Visual Studio), Linux
Windows: C++ / Visual C++ & Microsoft Visual Studio
Linux: C++ / gcc & Eclipse CDT
Mac OS X: C++ / gcc & XCode
Fortran: Compaq Visual Fortran
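C++: AVX, AVX2, Xeon Phi
Linux; Cilk Plus
SYSmark, MobileMark
Optimization notice: Streaming SIMD Extensions 2 (SSE2), Streaming SIMD Extensions 3 (SSE3), Supplemental Streaming SIMD Extensions 3 (SSSE3). Notice revision #20110804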
Fortran: Xeon Phi (Linux)
AVX, AVX2 (-xa /Qxa)
SIMD & Co-Array
Directives: VECTOR, PARALLEL, SIMD (align arraynbyte)
C++ Windows 5
Math Kernel Library (MKL)
Windows, Linux, Mac OS; gcc, MSFT, PGI
Parallel Studio XE, Cluster Studio XE
Source: North American Development Survey 2011, Volume II, Evans Data Corp: 33% MKL
MKL LAPACK (Compilers & Libraries)
Integrated Performance Primitives (IPP)
SSE, AVX; Windows, Linux, Mac OS X; Atom, Core, Xeon
intel.com/software/products/eval
IPP: AVX
VTune Amplifier XE
Intel VTune Amplifier XE?
Focus tuning on the functions that take the most time
Inspect the call stack
View time at the source level
View cache misses at the source level
Sort functions by cache-miss count
View locks by wait time
CPU utilization while waiting is shown in red or green
Windows, Linux
Amplifier XE 3 VTune
SAS Institute Inc., Claire Cates
VTune Amplifier XE 2013 (12 items)
1) &
2) +
3) hotspot
4) Ivy Bridge
5) Haswell
6) Xeon Phi
7) (GCC)
8) Java
9) API
10)
11)
12)
Ivy Bridge, Haswell: code names
Java: VTune Amplifier XE 2013
JVM
Java / C++ / Fortran
CPU: VTune Amplifier XE 2013
CPU H/W (CPU)
Note 1: Linux
Intel Advisor XE
Linux, Windows
C, C++, Fortran, C#
Advisor XE 2013
1) 2) 3) 4) 5)
Xeon Phi (Compilers & Libraries)
Xeon: Cilk Plus, Threading Building Blocks (TBB); C/C++, C++
Xeon Phi
Standards support: OpenMP, Co-Array Fortran, MPI
Cilk Plus and Threading Building Blocks (TBB) (Compilers & Libraries)
Cilk Plus / TBB
What: 3 & / C++
Why: Windows/Linux, C/C++ / C++; Windows, Linux, Mac OS X, OS
Threading Building Blocks (TBB)
C++; OS
Golaem, CTO Michaël Rouillé
Xeon Phi: Cilk Plus
Cilk Plus (C/C++)
3 keywords: cilk_for, cilk_spawn, cilk_sync
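As a rough illustration of the three keywords named on this slide, the sketch below uses cilk_spawn and cilk_sync for a fork-join recursion and cilk_for for an independent loop. The function names fib and squares are hypothetical. The fallback macros, which compile the same source serially when the compiler does not define __cilk, are an assumption added here so that the example builds without Cilk Plus support; they are not part of the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: a Cilk Plus-enabled compiler defines __cilk. Without it, the
// keywords are mapped to their serial equivalents so the code still builds.
#ifdef __cilk
#include <cilk/cilk.h>
#else
#define cilk_spawn
#define cilk_sync
#define cilk_for for
#endif

// Hypothetical example: fork-join recursion.
long fib(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    long a = cilk_spawn fib(n - 1);  // may run in parallel with the next line
    long b = fib(n - 2);             // continues in the calling strand
    cilk_sync;                       // wait for the spawned call before using a
    return a + b;
}

// Hypothetical example: each iteration writes a distinct element,
// so the iterations are independent and can be divided among workers.
std::vector<long> squares(int n) {
    assert(n >= 0);
    std::vector<long> out(static_cast<std::size_t>(n));
    cilk_for (int i = 0; i < n; ++i) {
        out[i] = static_cast<long>(i) * i;
    }
    return out;
}
```

Because the serial fallback produces the same results as the parallel version, a call such as fib(20) can be checked on any C++ compiler before enabling Cilk Plus.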
(Microsoft, gdb)

    {
        char *p, *q;
        p = malloc(10);
        q = p;
        free(p);
        *q = 0;   /* dangling pointer: q still refers to the freed block */
    }

    {
        char *my_chp = "abc";
        char *an_chp = (char *) malloc(strlen((char *)my_chp));
        /* overflow: sizeof(my_chp) is the size of a pointer, not the 3 bytes allocated */
        memset(an_chp, '@', sizeof(my_chp));
    }

CHKP: traceback: ./a.out(main+0x1b2) [0x402d7a] in file mems.c at line 13
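For contrast with the two defective fragments on this slide, here is a minimal sketch of how each could be written without the defect. The function names write_then_release and filled_copy are hypothetical and do not come from the slide. The sketch assumes the intent was to write through the pointer while the block is still allocated, and to fill a buffer the same length as the source string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical fix for the dangling-pointer fragment: write through the alias
// before the block is freed, then clear both pointers so neither is reused.
bool write_then_release() {
    char* p = static_cast<char*>(std::malloc(10));
    if (p == nullptr) return false;
    char* q = p;
    *q = 0;          // the allocation is still live here
    std::free(p);
    p = nullptr;
    q = nullptr;
    return p == nullptr && q == nullptr;
}

// Hypothetical fix for the overflow fragment: size the buffer from the string
// length plus a terminator, and fill only the bytes that were allocated.
char* filled_copy(const char* src) {
    assert(src != nullptr);
    const std::size_t len = std::strlen(src);
    char* dst = static_cast<char*>(std::malloc(len + 1));
    if (dst == nullptr) return nullptr;
    std::memset(dst, '@', len);  // len bytes, not sizeof(pointer)
    dst[len] = '\0';
    return dst;                  // caller releases with std::free
}
```

The key change in the second function is that the fill length comes from strlen rather than from sizeof applied to a pointer, which yields the pointer size regardless of the string.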
Compilers & Libraries
Math Kernel Library (MKL): C++, Fortran
OpenMP: MKL, Threading Building Blocks (TBB)
MSTC Modern Software Technology, CEO Franz Bernasek
C++11 (Compilers & Libraries)
noexcept; for
Windows, Linux: ++11
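Two of the C++11 items that survive on this slide can be shown in a few lines. The sketch below assumes the `for` entry refers to the range-based for statement. The function names clamp_to_byte and mean_clamped are hypothetical.

```cpp
#include <cassert>
#include <vector>

// The noexcept specifier declares that this function does not throw.
int clamp_to_byte(int v) noexcept {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// Hypothetical example: integer mean of the clamped values.
int mean_clamped(const std::vector<int>& values) {
    assert(!values.empty());
    // The noexcept operator is evaluated at compile time.
    static_assert(noexcept(clamp_to_byte(0)), "clamp_to_byte must not throw");
    int total = 0;
    for (int v : values) {  // range-based for over the container
        total += clamp_to_byte(v);
    }
    return total / static_cast<int>(values.size());
}
```

For the input {-5, 10, 300} the clamped values are 0, 10 and 255, so the integer mean is 88.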
Fortran 2008 (Compilers & Libraries)
31 (Fortran 2008 15)
Co-Array: ALLOCATABLE Co-Array, CODIMENSION, SYNC ALL, SYNC IMAGES, SYNC MEMORY, CRITICAL, END CRITICAL, LOCK, UNLOCK, ERROR STOP, ALLOCATE, DEALLOCATE
Co-Array intrinsics: IMAGE_INDEX, LCOBOUND, NUM_IMAGES, THIS_IMAGE, UCOBOUND
CONTIGUOUS; ALLOCATE MOLD; G0, G0.d; CONTAINS
Intrinsics: BESSEL_J0, BESSEL_J1, BESSEL_JN, BESSEL_YN, BGE, BGT, BLE, BLT, DSHIFTL, DSHIFTR, ERF, ERFC, ERFC_SCALED, GAMMA, HYPOT, IALL, IANY, IPARITY, IS_CONTIGUOUS, LEADZ, LOG_GAMMA, MASKL, MASKR, MERGE_BITS, NORM2, PARITY, POPCNT, POPPAR, SHIFTA, SHIFTL, SHIFTR, STORAGE_SIZE, TRAILZ
ISO_FORTRAN_ENV: ATOMIC_INT_KIND, ATOMIC_LOGICAL_KIND, CHARACTER_KINDS, INTEGER_KINDS, INT8, INT16, INT32, INT64, LOCK_TYPE, LOGICAL_KINDS, REAL_KINDS, REAL32, REAL64, REAL128, STAT_LOCKED, STAT_LOCKED_OTHER_IMAGE, STAT_UNLOCKED
DO CONCURRENT; OPEN NEWUNIT
Atomic subroutines: ATOMIC_DEFINE, ATOMIC_REF
INTENT(OUT); G 0; Co-Array
Linux, Windows, OSX; F2008
Inspector XE 2013
MPI
Inspector XE 2013
Intel Inspector XE?
API
Parallel Studio XE 2013
250
Parallel Studio XE
Intel Cluster Studio XE
MPI: 6.5
C/C++, Fortran; MPI; 12
Inspector XE: MPI
VTune Amplifier XE: hotspot, MPI
MPI: Cluster Studio XE 2013, Intel MPI Library
Ivy Bridge, Haswell, Xeon Phi; MPI 2.2
[Figure: process counts by year, 2010 to 2018. Series: Intel MPI Library, K processes; Doubling, K processes; Exascale, K processes (estimated). Data labels: 60K, 90K, 120K.]
MPI: Intel MPI Library
Berkeley Lab Checkpoint/Restart (BLCR)
Xeon; MPI
MPI 2.2: Intel MPI Library
MPI 2.1; MPI 2.2
Trace Analyzer/Collector (Cluster Studio XE 2013)
Intel ITAC: Hotspot, MPI; 6
[Figure: Intel Trace Analyzer and Collector (processes) by year, 2010 to 2012. Y-axis: Processes, 0 to 7000.]