
Improving MapReduce Task Scheduling for CPU-GPU Heterogeneous Environments

Koichi Shirahata,1 Hitoshi Sato1 and Satoshi Matsuoka1,2,3

MapReduce is a programming model that enables efficient processing of massive data in large-scale computing environments such as supercomputers and clouds. Recent large-scale machines increasingly employ GPUs for their high peak performance and memory bandwidth. However, scheduling MapReduce tasks onto CPUs and GPUs for efficient execution is difficult, because the best assignment depends on the characteristics of the running application and on the underlying computing environment. To address this problem, we propose a hybrid online scheduling technique for GPU-based computing clusters that minimizes the execution time of a submitted job using dynamic profiles of Map tasks running on CPUs and GPUs. Experimental results with a K-Means application show that the proposed technique runs 1.02-1.93 times faster than simple techniques such as CPU-only or GPU-only scheduling.

1. Introduction

MapReduce,1) proposed by Google, is widely used for large-scale data processing. GPGPU 2) has spread at the same time, supported by programming environments such as CUDA,3) and machines such as TSUBAME2.0 carry 3 GPUs per node. A MapReduce runtime on such a machine has both CPUs and GPUs available, but whether a task runs faster on a CPU or a GPU depends on the application and on factors such as I/O, so using only CPUs or only GPUs is not always the best choice. This work schedules Map tasks onto both CPUs and GPUs (Fig. 1) and evaluates the approach with K-Means.4),5)

© 2010 Information Processing Society of Japan

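Hadoop has a broad ecosystem that includes HBase. Phoenix provides a MapReduce API for multi-core systems, and Mars implements MapReduce on GPUs. This work builds on Hadoop MapReduce.

Fig. 1 The overview of CPU-GPU hybrid processing on Hadoop

In the evaluation, the proposed scheduling was 1.0-1.25 times faster with one GPU, and 1.02-1.93 times faster with CPUs plus two GPUs than with CPUs only.

2. MapReduce and GPGPU

This section outlines MapReduce and GPGPU.

2.1 MapReduce

MapReduce is a programming model proposed by Google. It has three phases: Map, Shuffle, and Reduce. Map emits key-value pairs from the input, Shuffle groups the values by key, and Reduce processes each key with its values and emits key-value pairs. The programmer writes only the Map and Reduce functions. MapReduce implementations include Hadoop,6) Phoenix 7) and Mars.8)

Hadoop is an open-source implementation of GFS (Google File System) and MapReduce, written in Java; its file system is HDFS. A Hadoop job involves three kinds of process: the client that submits the MapReduce job, the JobTracker, and the TaskTrackers. The JobTracker divides the job into Map and Reduce tasks and assigns them to TaskTrackers. Each Map task processes one input split (64 MB by default).

2.2 GPGPU

GPGPU (General-purpose computing on GPU) 2) applies GPUs to general-purpose computation. A GPU executes many threads in SIMD fashion and offers higher peak performance and memory bandwidth than a CPU. A GPU is driven by a CPU, and data must be transferred between CPU memory and GPU memory, so not every computation runs faster on a GPU than on a CPU.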
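As a minimal illustration of the Map, Shuffle, and Reduce phases described in Section 2.1, the following Python sketch runs a word count in a single process. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the word-count example are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(records, map_fn):
    """Apply map_fn to every input record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(map_fn(record))
    return pairs


def shuffle_phase(pairs):
    """Group the values of all pairs by key, as the Shuffle phase does."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups, reduce_fn):
    """Apply reduce_fn to each key and its list of values; keys are visited in sorted order."""
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


def run_job(records, map_fn, reduce_fn):
    """Run Map, Shuffle, and Reduce in sequence (hypothetical single-process driver)."""
    return reduce_phase(shuffle_phase(map_phase(records, map_fn)), reduce_fn)


# Hypothetical user-supplied functions: a word count.
def word_count_map(line):
    return [(word, 1) for word in line.split()]


def word_count_reduce(word, counts):
    return (word, sum(counts))


if __name__ == "__main__":
    lines = ["cpu gpu cpu", "gpu map"]
    print(run_job(lines, word_count_map, word_count_reduce))
```

In a real framework the Map calls run in parallel on many nodes, which is what makes it possible to place individual Map tasks on different processors.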

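A representative GPGPU programming environment is NVIDIA's CUDA, which extends C for GPU programming.

3. Scheduling Map tasks onto CPUs and GPUs

We run Map tasks on both CPUs and GPUs in Hadoop. This requires a way to invoke GPU code from Hadoop and a policy for dividing Map tasks between CPUs and GPUs.

3.1 Invoking CUDA from Hadoop

Hadoop is implemented in Java, so GPU code cannot be called from it directly. The candidate mechanisms are Hadoop Streaming, Hadoop Pipes, JNI, and jcuda.

Hadoop Streaming: Hadoop Streaming lets the Mapper and Reducer be arbitrary executables that exchange data with Hadoop through Unix standard streams (Fig. 2).

Hadoop Pipes: Hadoop Pipes is the C++ interface to Hadoop MapReduce; the Mapper and Reducer are written in C++ (Fig. 2). Unlike Streaming, Pipes uses sockets for the communication between the TaskTracker and the process running the C++ Map or Reduce function, and it does not use JNI.

JNI: JNI (Java Native Interface) lets Java code running in a JVM call native code written in C or C++.

jcuda: jcuda (Java for CUDA) 9) exposes the CUDA API to Java, so a Java program can drive CUDA on the GPU. jcuda supports the CUDA 2.1 API together with CUFFT, OpenGL, and CUBLAS.

Fig. 2 Hadoop Streaming and Hadoop Pipes

These mechanisms differ in how keys and values pass between Hadoop's Java code and the native CUDA code.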

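jcuda covers CUDA 2.1, not CUDA 2.2. This work invokes CUDA through Hadoop Pipes.

3.2 Running MapReduce on CPUs and GPUs

The Map tasks of a MapReduce job can each run on either a CPU or a GPU. This work divides the Map tasks of a job between CPUs and GPUs so that both are used at the same time; scheduling targets the Map phase.

3.3 Dividing work between CPUs and GPUs

Prior work 10) has addressed the division of work between CPUs and GPUs. Here the division of Map tasks between CPUs and GPUs is decided from profiles of Map tasks collected while the job runs.

3.4 Map task scheduling model

Let N be the number of Map tasks, n the number of CPU cores, and m the number of GPUs. Let t be the time one GPU takes to process one Map task, and let a be the acceleration factor of a GPU over a CPU core:

\[ a = \frac{\text{mean Map task time on CPU}}{\text{mean Map task time on GPU}} \]

One CPU core then takes at to process one Map task. With x Map tasks assigned to CPUs and y Map tasks assigned to GPUs, the scheduler solves

\[
\begin{aligned}
\text{minimize}\quad & f(x, y) \\
\text{subject to}\quad & f(x, y) = \max\left\{ \frac{x}{n}\,a t,\ \frac{y}{m}\,t \right\}, \\
& x + y = N, \\
& x, y \ge 0 .
\end{aligned}
\]

The first term of the maximum is the time the n CPU cores need for their x Map tasks and the second is the time the m GPUs need for their y Map tasks, so f(x, y) estimates the time to finish all N Map tasks. When x = 0 every Map task runs on GPUs, and when y = 0 every Map task runs on CPUs.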
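The model in Section 3.4 can be explored numerically. The Python sketch below is an illustration under stated assumptions, not the scheduler implemented in this work: the function names (`acceleration_factor`, `job_time`, `split_map_tasks`) are hypothetical, and the example values (640 Map tasks, 14 CPU cores, 2 GPUs, a = 7) are invented for the example rather than taken from the measurements. It uses the same symbols N, n, m, a, t, x, and y as the model.

```python
import math


def acceleration_factor(cpu_times, gpu_times):
    """a = mean Map task time on CPU / mean Map task time on GPU."""
    mean_cpu = sum(cpu_times) / len(cpu_times)
    mean_gpu = sum(gpu_times) / len(gpu_times)
    return mean_cpu / mean_gpu


def job_time(x, y, n, m, a, t):
    """f(x, y) = max{(x / n) * a * t, (y / m) * t} from the model."""
    return max(x / n * a * t, y / m * t)


def split_map_tasks(N, n, m, a, t=1.0):
    """Return integers (x, y) with x + y = N that minimize f(x, y).

    Assumption of this sketch: because the CPU term grows with x and the
    GPU term shrinks with x, the best integer x is the floor or the ceiling
    of the balance point x = N * n / (n + a * m).
    """
    if N < 0 or a <= 0:
        raise ValueError("N must be non-negative and a must be positive")
    if m == 0:  # no GPU available: every task goes to CPUs
        return N, 0
    if n == 0:  # no CPU core available: every task goes to GPUs
        return 0, N
    x_real = N * n / (n + a * m)
    candidates = {min(N, max(0, c)) for c in (math.floor(x_real), math.ceil(x_real))}
    x = min(candidates, key=lambda c: (job_time(c, N - c, n, m, a, t), c))
    return x, N - x


if __name__ == "__main__":
    # Hypothetical profile: a CPU core needs 14 s per Map task, a GPU 2 s.
    a = acceleration_factor([14.0, 14.0], [2.0, 2.0])
    x, y = split_map_tasks(640, n=14, m=2, a=a, t=2.0)
    print(a, x, y, job_time(x, y, 14, 2, a, 2.0))
```

The balance point follows from setting the two terms of the maximum equal, (x / n) a t = (y / m) t, together with x + y = N, which gives x = N n / (n + a m) and y = N a m / (n + a m). The value of t scales both terms equally, so it changes the estimated time but not the split.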

Fig. 3 The structure of task scheduling on Hadoop

4. Implementation

We implemented CUDA invocation from Hadoop and the scheduling of Map tasks onto CPUs and GPUs. The JobTracker and the TaskTracker were modified so that Map tasks can run on GPUs (Fig. 3).

4.1 Running Map tasks on GPUs from Hadoop

CUDA code is written in C or C++, so the C++ Map code is invoked through Hadoop Pipes. With Pipes, the Java side passes key-value pairs to the C++ Map and Reduce functions and receives the resulting key-value pairs back, while the TaskTracker manages the Map and Reduce tasks.

A Hadoop job runs as follows: (1) the MapReduce program starts a JobClient; (2) the JobClient submits the job to the JobTracker; (3) the JobTracker assigns Map and Reduce tasks to TaskTrackers; (4) each TaskTracker launches a Child JVM to run its task.

Two C++ Map binaries are prepared, one for CPUs and one for GPUs, and for each Map task the Child JVM starts the CPU binary or the GPU binary through Pipes (Fig. 3). To support this, the JobTracker decides whether a Map task runs on a CPU or on a GPU when it assigns the task to a TaskTracker, and the TaskTracker manages its CPU and GPU resources separately. On a node with several GPUs, the TaskTracker also determines which GPU a GPU Map task uses.

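4.2 Scheduling GPU Map tasks in Hadoop

The JobTracker decides how many Map tasks each TaskTracker runs on CPUs and on GPUs. Each TaskTracker reports its Map tasks to the JobTracker, including how long they took on CPUs and on GPUs. From these profiles the JobTracker obtains the acceleration factor and applies the model of Section 3.4 when it assigns the remaining Map tasks to CPUs and GPUs.

5. Evaluation

We evaluated the scheduling of Map tasks onto CPUs and GPUs.

5.1 Experimental setup

The benchmark is K-Means, with the K-Means computation performed in the Map tasks on both CPUs and GPUs. K-Means proceeds as follows: (1) choose k initial cluster centers; (2) assign each data point to its nearest center; (3) recompute the centers of the k clusters; (4) repeat from (2). The run uses k = 128; the other workload figures given are 2, 262144, and 4000, for a total input of 20 GB.

Table 1 CPU and GPU used in the experiments
CPU: AMD Opteron (Dual Core), 2.4 GHz, 1.0 GB
GPU: Tesla S1070, 1.296-1.44 GHz, 16 GB

The experiments ran on TSUBAME nodes equipped with GPUs, using 1 to 64 nodes, with Lustre as the file system. Lustre I/O measured with a 32 MB size was 180 MB/s for write and 610 MB/s for read. Each node has 16 CPU cores and 2 GPUs. Because a Map task running on a GPU also occupies a CPU core, the 1-GPU configuration runs Map tasks on 15 CPU cores and 1 GPU, and the 2-GPU configuration runs them on 14 CPU cores and 2 GPUs. The input split size is 32 MB.

5.2 Results

Fig. 4 shows the results. Scheduling Map tasks onto CPUs and GPUs was 1.0-1.25 times faster in the 15 CPU + 1 GPU configuration and 1.02-1.93 times faster in the 14 CPU + 2 GPU configuration than running on CPUs only.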

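1.29 1.02 64 32 Map 20GB 32MB 619 64 1 16 Map I/O GPU MapReduce GPU 1GPU CPU GPU Map CPU GPU

Fig. 4 Total Job Time of K-Means on TSUBAME

6. Related work

Several studies address the joint use of CPUs and GPUs. Reference 10) provides compiler and runtime support for generalized reduction computations on heterogeneous CPU-GPU configurations. Qilin 11) maps computations adaptively onto heterogeneous multiprocessors. Reference 12) improves MapReduce performance in heterogeneous environments. Reference 13) is also related to the use of CPUs and GPUs.

7. Conclusion

We proposed a technique that schedules the Map tasks of a MapReduce job onto both CPUs and GPUs, and implemented it on Hadoop so that Map tasks run on GPUs as well as CPUs. We evaluated the technique with K-Means running in the Map tasks on CPUs and GPUs.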

In the K-Means evaluation, the proposed scheduling of Map tasks was 1.0-1.25 times faster with one GPU, and 1.02-1.93 times faster with CPUs plus two GPUs than with CPUs only.

Acknowledgments  This work was supported in part by grant 18049028 and by the JST CREST project "ULP-HPC".

References

1) Dean, J. and Ghemawat, S.: MapReduce: Simplified Data Processing on Large Clusters, OSDI '04: Sixth Symposium on Operating System Design and Implementation, pp.137-150 (2004).
2) Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E. and Phillips, J. C.: GPU Computing, Proc. IEEE, Vol.96, No.5, pp.879-899 (2008).
3) Nickolls, J., Buck, I., Garland, M. and Skadron, K.: Scalable Parallel Programming with CUDA, Queue, Vol.6, No.2, pp.40-53 (2008).
4) Jain, A. K. and Dubes, R. C.: Algorithms for Clustering Data, Prentice-Hall, Inc., Upper Saddle River, NJ, USA (1988).
5) Bai, H., He, L., Ouyang, D., Li, Z. and Li, H.: K-Means on Commodity GPUs with CUDA, Computer Science and Information Engineering, 2009 WRI World Congress, pp.651-655 (2009).
6) Bialecki, A., Cafarella, M., Cutting, D. and O'Malley, O.: Hadoop: a framework for running applications on large clusters built of commodity hardware (2005).
7) Ranger, C., Raghuraman, R., Penmetsa, A., Bradski, G. and Kozyrakis, C.: Evaluating MapReduce for Multi-core and Multiprocessor Systems, Proceedings of the 13th Intl. Symposium on High-Performance Computer Architecture (HPCA) (2007).
8) He, B., Fang, W., Luo, Q., Govindaraju, N. K. and Wang, T.: Mars: A MapReduce Framework on Graphics Processors, Parallel Architectures and Compilation Techniques, pp.260-269 (2008).
9) Company for Advanced Supercomputing Solutions Ltd.: jcuda, http://hoopoecloud.com/solutions/jcuda/default.aspx.
10) Ravi, V. T., Ma, W., Chiu, D. and Agrawal, G.: Compiler and runtime support for enabling generalized reduction computations on heterogeneous parallel configurations, ICS '10: Proceedings of the 24th ACM International Conference on Supercomputing, New York, NY, USA, ACM, pp.137-146 (2010).
11) Luk, C.-K., Hong, S. and Kim, H.: Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping, MICRO '09, pp.45-55 (2009).
12) Zaharia, M., Konwinski, A., Joseph, A. D., Katz, R. and Stoica, I.: Improving MapReduce Performance in Heterogeneous Environments, Technical report, EECS Department, University of California, Berkeley (2008).
13) Vol.47, No.SIG 18 (ACS 16), pp.92-114 (2006).