OpenCL GPU AMD JEL

- OpenCL
- GPU
- AMD/ATI GPU

OpenCL

OpenCL
- Purpose: GPGPU
- API: targets both CPU and GPU
- Participating companies: AMD, Apple, IBM, Intel, Nvidia, Sony
- Released: December 2008
- AMD's GPGPU and OpenCL material is on developer.amd.com; OpenCL Zone: http://developer.amd.com/openclzone

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

OpenCL components

Language Specification
- C-based cross-platform programming interface
- Subset of ISO C99 with language extensions, familiar to developers
- Well-defined numerical accuracy: IEEE 754 rounding behavior with defined maximum error
- Online or offline compilation and build of compute kernel executables
- Includes a rich set of built-in functions

Platform Layer API
- A hardware abstraction layer over diverse computational resources
- Query, select and initialize compute devices
- Create compute contexts and work-queues

Runtime API
- Execute compute kernels
- Manage scheduling, compute, and memory resources

OpenCL program

main.cpp (host):

    int main() {
        // ICD based init
        ICDinit(&platform);
        // Setup
        setupcl(CL_DEVICE_TYPE_GPU, ...);
        // Compile, Load Kernel
        compileloadcl(..., "vec_mult.cl");
        // Create and Initialize buffers
        input0 = clCreateBuffer(context, ...);
        input1 = clCreateBuffer(context, ...);
        output = clCreateBuffer(context, ...);
        // Execute kernel
        runclkernel(..., &input0, &input1, &output, ...);
        // Wait for kernel completion
        status = clFinish(cmd_queue);
        // Read data from CL device to CPU memory
        clEnqueueReadBuffer(cmd_queue, ...);
        // Wait for read completion
        status = clFinish(cmd_queue);
    }

vec_mult.cl (kernel):

    kernel void vec_mult(global const float *a,
                         global const float *b,
                         global float *c)
    {
        int gid = get_global_id(0);
        c[gid] = a[gid] * b[gid];
    }

[Figure: CPU memory holds inputbuffer0, inputbuffer1 and outputbuffer; GPU memory holds the corresponding input0, input1 and output.]

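OpenCL C

    kernel void vec_mult(global const float *a,
                         global const float *b,
                         global float *c)
    {
        int gid = get_global_id(0);
        c[gid] = a[gid] * b[gid];
    }

1. A kernel function is declared with the kernel qualifier.
2. Pointer arguments carry an address space qualifier: global, constant, local.
3. Each work-item obtains its own ID, here with get_global_id(0).

[Figure source: The OpenCL Specification, Version: 1.1]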
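To make the per-work-item view concrete, the following is a minimal plain-C sketch of what the vec_mult kernel computes. It is not OpenCL host code: run_vec_mult is a hypothetical stand-in for launching a 1-D index space, and it runs the work-items one after another on the CPU, whereas a real device may run them in parallel and in any order.

```c
#include <assert.h>
#include <stddef.h>

/* Body of the vec_mult kernel for a single work-item.
   The gid parameter stands in for the value of get_global_id(0). */
static void vec_mult_work_item(size_t gid, const float *a, const float *b, float *c)
{
    c[gid] = a[gid] * b[gid];
}

/* Hypothetical stand-in for launching a 1-D index space of global_size
   work-items. Work-items are executed serially here for illustration only. */
void run_vec_mult(size_t global_size, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < global_size; ++gid) {
        vec_mult_work_item(gid, a, b, c);
    }
}
```

Because each work-item writes only c[gid], no work-item depends on another, which is why the loop order does not matter.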

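OpenCL execution and memory model
- Command execution: in-order, out-of-order
- Targets named on the slide: GPU, CPU
- Index space: N-dimensional, made up of work-items grouped into work-groups
- Memory regions: private, local, constant, global

[Figure source: The OpenCL Specification, Version: 1.1]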

Conventional PC programming environment (CPU only)
- OS: Microsoft Windows, Linux
- Compiler: Microsoft Visual Studio, gcc
- Data in CPU memory: inputbuffer0, inputbuffer1, outputbuffer

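GPGPU programming environment (CPU + GPU)
- CPU side: OS (Microsoft Windows, Linux) and compiler (Microsoft Visual Studio, gcc)
- GPU side: the GPU itself and a GPGPU API (called from the CPU side)
- OpenCL SDK: AMD ATI Stream SDK
- Data is copied between CPU memory and GPU memory: inputbuffer0 -> input0, inputbuffer1 -> input1, output -> outputbuffer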

OpenCL: specification, extensions and wrapper libraries

Wrapper Libraries (7+)
- Khronos OpenCL C++ Bindings
- libstdcl
- Simplified OpenCL (SOCL)
- JavaCL
- OpenCL.NET
- PyOpenCL
- ScalaCL

Optional Extensions (15)
- cl_khr_gl_event (1.1)
- cl_khr_d3d10 (2/26/2010)
- cl_khr_icd (1/29/2010)
- cl_khr_gl_sharing (8/28/2009)
- cl_khr_fp64
- cl_khr_select_fprounding_mode
- cl_khr_fp16
- cl_khr_int64_base_atomics
- cl_khr_int64_extended_atomics
- cl_khr_3d_image_writes
- cl_khr_global_int32_base_atomics
- cl_khr_global_int32_extended_atomics
- cl_khr_local_int32_base_atomics
- cl_khr_local_int32_extended_atomics
- cl_khr_byte_addressable_stores
(Legend on the slide: "Core features in 1.1")

Multi-Vendor Extensions (2)
- cl_ext_device_fission (6/9/2010)
- cl_ext_migrate_memobject (6/8/2010)

Vendor-Specific Extensions (8)
- cl_amd_printf (7/2010)
- cl_amd_event_callback (7/2010)
- cl_amd_device_attribute_query (3/26/2010)
- cl_amd_fp64 (3/26/2010)
- cl_amd_media_ops (3/26/2010)
- + 3 extensions from other vendors

OpenCL Core Specification
- 1.0 (12/9/2008)
- 1.1 (6/14/2010)
- ...

GPU (AMD/ATI GPU)

AMD/ATI GPU and Stream Computing

[Chart: peak GigaFLOPS* (axis 0 to 3000) by GPU generation; time axis Sep 05, Mar 06, Oct 06, Apr 07, Nov 07, Jun 08, Dec 08, Jul 09]

GPU generations shown:
- R520: ATI RADEON X1800; ATI FireGL V7200, V7300, V7350
- R580(+): ATI RADEON X19xx; ATI FireStream
- R600: ATI RADEON HD 2900; ATI FireGL V7600, V8600, V8650
- RV670: ATI RADEON HD 3800; ATI FireGL V7700; AMD FireStream 9170
- RV770: ATI RADEON HD 4800; ATI FirePro V8700; AMD FireStream 9250, 9270
- Cypress: ATI RADEON HD 5870

Milestones marked on the chart: CTM, GPGPU, Stream SDK (CAL+IL/Brook+), OpenCL 1.0, DirectX 11
Callouts for the HD 5870 (labels partly lost): ALU 2.5, 2.25, <+18% TDP

* Peak single-precision performance; for RV670, RV770 and Cypress, divide by 5 for peak double-precision performance.

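AMD/ATI GPU: 5-way VLIW Shader Processing Unit
- The AMD/ATI GPU shader unit is a 5-way VLIW design: one VLIW instruction drives 5 ALUs.
- This unit is called an SPU (Stream Processing Unit); each ALU is a Stream core.
- Of the 5 Stream cores, 4 are ordinary Stream cores and 1 is the T-Stream core.

Example VLIW instruction (Radeon HD5870):

    x: MULADD_e      , R5.y, R2.x, T0.x   VEC_120
    y: MULADD_e  R0.y, R5.y, R2.w, T1.w   VEC_120
    z: MULADD_e  T1.z, R6.y, R2.x, T1.z   VEC_210
    w: MULADD_e      , R7.y, R2.x, T0.w   VEC_021
    t: MULADD_e  T0.z, R6.y, R2.z, T2.z   VEC_120

Slots x, y, z and w run on the Stream cores; slot t runs on the T-Stream core.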
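As a rough illustration of the 5-slot bundle shown above, here is a plain-C sketch. It assumes that each MULADD computes src0 * src1 + src2 and that the five slots are independent of one another; the type and function names (muladd_operands, issue_muladd_bundle, bundle_utilization) are hypothetical and model only the arithmetic, not registers, swizzles or the VEC_xxx bank settings.

```c
#include <assert.h>

enum { VLIW_SLOTS = 5 };  /* slots x, y, z, w (Stream cores) and t (T-Stream core) */

/* Operands of one MULADD slot; assumed semantics: dst = src0 * src1 + src2. */
typedef struct {
    float src0;
    float src1;
    float src2;
} muladd_operands;

/* Executes one bundle of five independent MULADD operations. The hardware
   issues all slots together; this sketch simply loops over them. */
void issue_muladd_bundle(const muladd_operands ops[VLIW_SLOTS], float dst[VLIW_SLOTS])
{
    for (int slot = 0; slot < VLIW_SLOTS; ++slot) {
        dst[slot] = ops[slot].src0 * ops[slot].src1 + ops[slot].src2;
    }
}

/* Fraction of the five slots doing useful work when only independent_ops
   operations can be packed into a bundle. */
double bundle_utilization(int independent_ops)
{
    assert(independent_ops >= 0 && independent_ops <= VLIW_SLOTS);
    return (double)independent_ops / VLIW_SLOTS;
}
```

The utilization helper reflects the usual trade-off of a VLIW design: peak throughput is reached only when the compiler finds five independent operations per bundle.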

AMD/ATI GPU: SIMD engines
- SPUs are grouped into SIMD engines; one SIMD engine contains 16 SPUs.
- RV770 has 10 SIMD engines; Cypress has 20.

[Figure: Cypress]
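The counts above combine into a total ALU count and a peak-performance estimate. The sketch below uses the slide's figures (16 SPUs per SIMD engine, 5 ALUs per SPU, divide by 5 for double precision). Two things are assumptions not stated on the slides: that a multiply-add counts as 2 floating-point operations per ALU per clock, and the 850 MHz engine clock used in the usage note, which is the commonly published figure for the Radeon HD 5870. The function names are hypothetical.

```c
#include <assert.h>

enum {
    SPUS_PER_SIMD = 16,  /* from the slide: 16 SPUs per SIMD engine */
    ALUS_PER_SPU = 5     /* from the slide: 5-way VLIW, 5 ALUs per SPU */
};

/* Total number of ALUs (Stream cores) for a given number of SIMD engines. */
int total_alus(int simd_engines)
{
    return simd_engines * SPUS_PER_SIMD * ALUS_PER_SPU;
}

/* Peak single-precision GFLOPS, assuming one multiply-add (2 flops)
   per ALU per clock. clock_mhz is an input supplied by the caller. */
double peak_sp_gflops(int simd_engines, int clock_mhz)
{
    assert(simd_engines > 0 && clock_mhz > 0);
    return total_alus(simd_engines) * 2.0 * clock_mhz / 1000.0;
}

/* Peak double-precision GFLOPS using the chart's divide-by-5 rule
   (stated there for RV670, RV770 and Cypress). */
double peak_dp_gflops(double sp_gflops)
{
    return sp_gflops / 5.0;
}
```

Under these assumptions, total_alus(10) gives 800 for RV770 and total_alus(20) gives 1600 for Cypress, and peak_sp_gflops(20, 850) gives 2720 GFLOPS single precision, hence 544 GFLOPS double precision.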

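Mapping the GPU onto the OpenCL model
- A GPU SIMD engine corresponds to an OpenCL compute unit (CU); a Stream core corresponds to a processing element (PE).
- IDs, shown for the 2-dimensional case:
  - global ID: (gx, gy)
  - work-group ID: (wx, wy)
  - local ID: (sx, sy)
- (gx, gy) = (wx * Sx + sx, wy * Sy + sy), where (Sx, Sy) is the work-group size.

[Figure source: The OpenCL Specification, Version: 1.1]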
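The ID relation above can be checked with a few lines of plain C. This is a sketch of the formula only, with no global offset; the id2 type and the function name global_id_2d are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A 2-D ID or size, as in (gx, gy), (wx, wy), (sx, sy) or (Sx, Sy). */
typedef struct {
    size_t x;
    size_t y;
} id2;

/* Computes (gx, gy) = (wx * Sx + sx, wy * Sy + sy). */
id2 global_id_2d(id2 group_id, id2 local_id, id2 group_size)
{
    /* A local ID must lie inside its work-group. */
    assert(local_id.x < group_size.x && local_id.y < group_size.y);

    id2 global_id = {
        group_id.x * group_size.x + local_id.x,
        group_id.y * group_size.y + local_id.y
    };
    return global_id;
}
```

For example, with a work-group size of (8, 8), the work-item with local ID (3, 5) in work-group (2, 1) has global ID (2 * 8 + 3, 1 * 8 + 5) = (19, 13).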

CPU and GPU: CAE, GPGPU

[Figure: Graphics Workloads; Serial/Task-Parallel Workloads; Other Highly Parallel Workloads]

Disclaimer & Attribution

DISCLAIMER
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2010 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, ATI, the ATI Logo, Radeon, FirePro, FireStream and combinations thereof are trademarks of Advanced Micro Devices, Inc. Microsoft, Windows, Windows Vista, and DirectX are registered trademarks of Microsoft Corporation in the United States and/or other jurisdictions. Other names are for informational purposes only and may be trademarks of their respective owners. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.