



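Parallelization of the Android 2D Rendering Library Skia with the OSCAR Compiler

1 Waseda University. a) tgoto@kasahara.cs.waseda.ac.jp

Abstract: This paper parallelizes Skia, the 2D rendering library of Android, with the OSCAR automatic parallelizing compiler. The OSCAR compiler takes Parallelizable C as input, so the hot spots of Skia are located with Oprofile and rewritten in Parallelizable C before they are handed to the compiler. The result is evaluated with 0xbench on a Nexus7, which carries an NVIDIA Tegra3 (four ARM Cortex-A9 cores). Because core0 is used by Android, Skia is parallelized on three cores. With three cores, DrawRect runs 1.91 times faster at 43.57 [fps], DrawArc 1.32 times faster at 50.98 [fps], and DrawCircle2 1.5 times faster at 50.77 [fps].

1. Introduction

Multicore processors [1] such as NVIDIA Tegra3 [2], Qualcomm Snapdragon [3], and Samsung Exynos [4] are now common in mobile devices. Parallel programs for multicores are written with APIs such as OpenMP and MPI [5], or generated automatically by a parallelizing compiler such as the OSCAR compiler [6]. 2D rendering libraries include skia [7], Quartz [8], and cairo [9]. This paper profiles Skia, the 2D rendering library of Android, with Oprofile, parallelizes it with the OSCAR compiler, and evaluates it on a Google Nexus7. Section 2 introduces Skia, Section 3 describes the parallelization method, Section 4 applies the method to Skia, and Section 5 covers the implementation and the evaluation.

2. Skia

2.1 Skia

Skia is a 2D graphics library used in Google Chrome, Mozilla Firefox, Android, and Chrome OS. On Android it is the 2D rendering engine [7]. Android offers 2D drawing through the Java API (Application Programming Interface) android.graphics.Canvas [10]. Canvas methods such as drawRect and drawImage call Skia through JNI (Java Native Interface) [11].

2.1.1 Skia rendering flow

Fig. 1 shows the rendering flow of Skia [12]. It consists of Path Generation, Rasterization, Shading, and BitBlit (Bit-Level Block Transfer) [12]. Path Generation and Rasterization turn paths into a mask, Shading produces the source image, and BitBlit combines the outputs of Rasterization and Shading with the destination image.

Fig. 1 Skia rendering flow (labels: Paths, Path Generation, Rasterization, mask, Shading, source image, BitBlit, destination image, modified destination image)

2.2 0xbench

Android performance is evaluated with 0xbench, an Android benchmark suite developed by 0xlab [13]. It covers the C library and system calls, OpenGL-ES, 2D canvas, garbage collection in Dalvik, and the JavaScript engine. Because the target here is Skia, the 2D canvas benchmarks are used. They draw through android.graphics.Canvas and report FPS. The 2D canvas suite has three benchmarks, shown in Fig. 2:

DrawRect: Canvas drawRect, 300
DrawArc: 17 drawArc, 500
DrawCircle2: drawRect, 6 drawCircle, 300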

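Fig. 2 2D canvas benchmarks

3. Parallelization method using Oprofile and the OSCAR compiler

3.1 OSCAR compiler and OSCAR API

The OSCAR compiler is an automatic parallelizing compiler [14], [15], [16] that performs multigrain parallel processing, combining three kinds of parallelism [6], [17]. It accepts Parallelizable C and Fortran as input and emits C or Fortran annotated with the OSCAR API. The OSCAR API is based on OpenMP and adds directives for features such as DMA, so the output can be built by an OpenMP compiler or translated through the OSCAR API. Two forms of output are used: (1) parallel sections, and (2) pthread, where the API functions oscar_thread_create and oscar_thread_join are mapped to pthread_create and pthread_join.

3.2 OProfile

Oprofile is a profiler for Linux systems [18], [19]. This work uses Oprofile for Tegra (version 0.9.6) [20] [... 20 ... 50000 ...].

3.3 Parallelizing Skia with Oprofile and the OSCAR compiler

Fig. 3 shows the procedure. Hot spots are located with Oprofile. Because the OSCAR compiler accepts Parallelizable C, each hot spot is rewritten in Parallelizable C and then parallelized by the OSCAR compiler.

4. Parallelization of Skia

Skia is parallelized with the method of Section 3.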

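Fig. 3 Parallelization procedure (labels: Original Source Files, Tuning, Binary File, Profile Result, select, Hotspot Source File, Hotspot Analyzer, Analyzed Result, OSCAR Parallelizable Compiler, Parallelizing Hotspot Information, Parallelized Source File)

4.1 Profiling Skia with Oprofile

Each benchmark of Section 2.2 is profiled with Oprofile (application profiling). For DrawRect (Fig. 5(a)), the dominant function is SkRGB16_Blitter::blitRect. It belongs to the BitBlit stage of Section 2.1 and blits a rectangle at position (x, y) into the destination. For DrawArc (Fig. 5(b)), SkRGB16_Blitter::blitH takes 82%, followed by SkRGB16_Blitter::blitRect. For DrawCircle2 (Fig. 5(c)), SkRGB16_Blitter::blitAntiH takes 78% and SkRGB16_Blitter::blitRect 9%. As with DrawRect, the blit functions dominate, so the blit functions are the parallelization targets.

4.2 Parallelizing Skia

Following Section 3.3, the hot spots are rewritten in Parallelizable C. Fig. 4 shows the change for DrawRect. The body of SkRGB16_Blitter::blitRect is separated from the C++ code into a C function, and its while loop is turned into a for loop that the OSCAR compiler can analyze. In the original loop every iteration updates device from its previous value, which is a loop-carried dependence. The tuned loop computes the row address from the loop index, so the iterations are independent and the OSCAR compiler can parallelize the BitBlit, which fills height rows of width pixels each (Section 2.1).

Original source code:

    void SkRGB16_Blitter::blitRect(int x, int y, int width, int height) {
        SkASSERT(x + width <= fDevice.width() && y + height <= fDevice.height());
        uint16_t* SK_RESTRICT device = fDevice.getAddr16(x, y);
        unsigned deviceRB = fDevice.rowBytes();
        SkPMColor src32 = fSrcColor32;
        while (--height >= 0) {
            blend32_16_row(src32, device, width);
            // device depends on its value in the previous iteration
            device = (uint16_t*)((char*)device + deviceRB);
        }
    }

After tuning (C++ code separated):

    void SkRGB16_Blitter::blitRect(int x, int y, int width, int height) {
        SkASSERT(x + width <= fDevice.width() && y + height <= fDevice.height());
        uint16_t* SK_RESTRICT device = fDevice.getAddr16(x, y);
        unsigned deviceRB = fDevice.rowBytes();
        SkPMColor src32 = fSrcColor32;
        // The loop now lives in a C function that the OSCAR compiler can process.
        SkRGB16_Blitter_blitRect_oscar(width, height, device, deviceRB, src32);
    }

After tuning (dependence on the variable device removed):

    void SkRGB16_Blitter_blitRect_oscar(int width, int height, uint16_t* device,
                                        unsigned deviceRB, SkPMColor src32) {
        int i;
        uint16_t* devicetmp;
        for (i = height; i > 0; i--) {
            /* Row address computed from the index, not from the previous row. */
            devicetmp = (uint16_t*)((char*)device + (deviceRB * (height - i)));
            blend32_16_row(src32, devicetmp, width);
        }
    }

Fig. 4 Parallelization of Skia for DrawRect: original source code and code after tuning

5. Skia parallelized with OSCAR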
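To illustrate why the rewrite in Section 4.2 exposes parallelism, the following is a minimal sketch, not the code generated by the OSCAR compiler. It assumes a hypothetical `fill_row` in place of Skia's `blend32_16_row` (it stores a value instead of blending) and uses `std::thread` in place of the OSCAR API. The names `blit_rows` and `blit_rect_parallel` are illustrative only. Because each row address is computed from the row index, the rows can be divided into bands and given to separate threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for blend32_16_row: stores a 16-bit value per pixel.
static void fill_row(std::uint16_t value, std::uint16_t* row, int width) {
    for (int x = 0; x < width; ++x) row[x] = value;
}

// Processes rows [first, last). The address of row r depends only on r,
// so no iteration needs a pointer updated by the previous one.
static void blit_rows(std::uint16_t* device, std::size_t rowBytes, int width,
                      int first, int last, std::uint16_t value) {
    for (int r = first; r < last; ++r) {
        auto* row = reinterpret_cast<std::uint16_t*>(
            reinterpret_cast<char*>(device) +
            rowBytes * static_cast<std::size_t>(r));
        fill_row(value, row, width);
    }
}

// Splits [0, height) into contiguous bands, one band per thread.
void blit_rect_parallel(std::uint16_t* device, std::size_t rowBytes, int width,
                        int height, std::uint16_t value, int numThreads) {
    assert(numThreads > 0 && width >= 0 && height >= 0);
    const int band = (height + numThreads - 1) / numThreads;
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        const int first = std::min(height, t * band);
        const int last = std::min(height, first + band);
        if (first < last) {
            workers.emplace_back(blit_rows, device, rowBytes, width,
                                 first, last, value);
        }
    }
    for (auto& w : workers) w.join();
}
```

With the original pointer-increment loop, the same split would require each band to know where the previous band ended. The index-based address removes that requirement.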

Fig. 5 Oprofile results for (a) DrawRect, (b) DrawArc, and (c) DrawCircle2

Fig. 6 Thread implementation of the OSCAR API (labels: MainThread, Additional Thread 1, Additional Thread 2, oscar_thread_create(1), oscar_thread_create(2), Create Thread (only first time), Transfer FunctionPointer, Wait for next, OSCAR Parallelized Section, Check FunctionPointer, oscar_thread_join(1), oscar_thread_join(2), FunctionPointer=null)

Table 1 Nexus7 specifications
  CPU            ARM Cortex-A9 (NVIDIA Tegra 3)
  CPU frequency  1.2GHz (1.3GHz in single-core mode)
  CPU cores      quad-core
  GPU            NVIDIA GeForce ULP
  GPU frequency  416MHz
  GPU cores      twelve-core
  RAM            1GB
  Display        1280x800 WXGA pixels

5.1 Implementation of parallel Skia

5.1.1 Evaluation environment

The evaluation platform is the Nexus7 released in 2012. It carries an NVIDIA Tegra3 with four ARM Cortex-A9 cores running at 1.2 [GHz]. Table 1 lists the specifications of the Nexus7 [21].

5.1.2 Core allocation

Core0 is used by the Android OS, starting with init, so Skia is parallelized on the other three cores.

5.1.3 Thread implementation

The BitBlit code parallelized by the OSCAR compiler uses the OSCAR API functions oscar_thread_create and oscar_thread_join. Fig. 6 shows how they are implemented. oscar_thread_create creates a pthread only the first time it is called. After that it hands a function pointer to the thread that already exists. The thread runs the function and sets the function pointer back to NULL. oscar_thread_join checks whether the function pointer is NULL and treats NULL as the completion of the join.

5.2 Measurement with the ARM performance monitor

Execution cost is measured with the Performance Monitor Unit (PMU) of the ARM Cortex-A9 [22], using the cycle counter (CCNT) of the PMU. To read CCNT from user mode, the user enable register (USEREN) must be set to 1. USEREN is set so that Skia can read the counter.
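The hand-off in Section 5.1.3 can be sketched as follows. This is an illustration under stated assumptions, not the OSCAR runtime: the names `WorkerSlot`, `worker_loop`, `slot_start`, and `slot_wait` are hypothetical, `std::thread` and `std::atomic` replace pthread, and busy-waiting with `yield` is a simplification. It shows only the idea from Fig. 6: a long-lived thread polls a function pointer, runs the function when the pointer is non-null, and resets it to null, which the main thread observes as the join.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

using Task = void (*)(void*);

// Hypothetical slot shared between the main thread and one additional thread.
struct WorkerSlot {
    std::atomic<Task> fn{nullptr};   // null means "idle / finished"
    void* arg = nullptr;             // written before fn is published
    std::atomic<bool> quit{false};   // lets the sketch shut the thread down
};

// Runs in the additional thread: check the pointer, run the task, reset it.
void worker_loop(WorkerSlot* slot) {
    while (!slot->quit.load(std::memory_order_acquire)) {
        Task task = slot->fn.load(std::memory_order_acquire);
        if (task == nullptr) {
            std::this_thread::yield();  // wait for the next task
            continue;
        }
        task(slot->arg);
        slot->fn.store(nullptr, std::memory_order_release);  // signal completion
    }
}

// Plays the role of "create" after the first call: transfer a function pointer.
void slot_start(WorkerSlot& slot, Task task, void* arg) {
    assert(task != nullptr);
    assert(slot.fn.load(std::memory_order_acquire) == nullptr);
    slot.arg = arg;
    slot.fn.store(task, std::memory_order_release);
}

// Plays the role of "join": wait until the pointer is null again.
void slot_wait(WorkerSlot& slot) {
    while (slot.fn.load(std::memory_order_acquire) != nullptr) {
        std::this_thread::yield();
    }
}
```

Reusing one thread across many drawing calls avoids paying thread creation cost on every blit, which matters when a single blit is short.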

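Table 2 Cycle counts of the blitter functions
  Benchmark     Sequential  Parallelized
  DrawRect      742634      267821
  DrawArc       2182        1140
  DrawCircle2   8013        2764

Table 3 FPS
  Benchmark     Sequential  Parallelized
  DrawRect      22.82       43.57
  DrawArc       38.58       50.98
  DrawCircle2   33.86       50.77

Fig. 7 Speedup of the blitter functions (Sequential vs. Parallelized, for DrawRect, DrawArc, and DrawCircle2)

Fig. 8 Speedup in FPS (Sequential vs. Parallelized, for DrawRect, DrawArc, and DrawCircle2)

5.3 Evaluation of the blitter functions on Nexus7

For the benchmarks of Section 2.2 (DrawRect, DrawArc, and DrawCircle2), the functions SkRGB16_Blitter::blitRect, SkRGB16_Blitter::blitH, and SkRGB16_Blitter::blitAntiH are measured. Table 2 and Fig. 7 show the results. With three cores, DrawRect drops from 742634 to 267821, DrawArc from 2182 to 1140, and DrawCircle2 from 8013 to 2764. The speedups are 2.77 times for DrawRect, 1.91 times for DrawArc, and 2.90 times for DrawCircle2.

5.4 Evaluation of FPS on Nexus7

FPS is the value reported by 0xbench. Unlike the measurement in Section 5.3, FPS also includes the Java side and other processing outside Skia. Table 3 and Fig. 8 show the results with three cores. DrawRect improves from 22.82 [fps] to 43.57 [fps], DrawArc from 38.58 [fps] to 50.98 [fps], and DrawCircle2 from 33.86 [fps] to 50.77 [fps]. The speedups are 1.91 times for DrawRect, 1.32 times for DrawArc, and 1.50 times for DrawCircle2. Android limits rendering to 60 [fps].

Fig. 9 Systrace of DrawRect, panels (a) and (b)

CPU usage by Skia is examined with Systrace [10]. Fig. 9 shows the Systrace output for DrawRect. In (a), Skia DrawRect without parallelization, the trace shows Skia on CPU1, CPU2, and CPU0. In (b), the parallelized Skia DrawRect, Skia runs on CPU1, CPU2, and CPU3, and CPU0 handles processing other than Skia.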
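The speedups quoted in Sections 5.3 and 5.4 follow from Tables 2 and 3. As a small worked check, the sketch below recomputes them. The function names are illustrative. Cycle counts are a "lower is better" metric, so the ratio is sequential over parallelized. FPS is "higher is better", so the ratio is inverted.

```cpp
#include <cassert>
#include <cmath>

// Speedup for metrics where lower is better (for example cycle counts).
double speedup_from_cycles(double sequential, double parallelized) {
    assert(sequential > 0.0 && parallelized > 0.0);
    return sequential / parallelized;
}

// Speedup for metrics where higher is better (for example FPS).
double speedup_from_fps(double sequential, double parallelized) {
    assert(sequential > 0.0 && parallelized > 0.0);
    return parallelized / sequential;
}

// Rounds to two decimal places, matching the precision used in the text.
double round2(double v) {
    return std::round(v * 100.0) / 100.0;
}
```

For example, `round2(speedup_from_cycles(742634, 267821))` gives 2.77 for the DrawRect blitter, and `round2(speedup_from_fps(22.82, 43.57))` gives 1.91 for DrawRect FPS.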

Fig. 10 FPS of parallelized Skia and of the GPU (for DrawRect, DrawArc, and DrawCircle2)

5.5 Comparison with Hardware Acceleration (GPU)

Android has supported Hardware Acceleration since version 3.0. With Hardware Acceleration, the Android Canvas API described in Section 2.1 is drawn by the GPU through OpenGL ES [10], [12]. It is enabled in the application manifest with <application android:hardwareAccelerated="true">. Fig. 10 compares the results with Hardware Acceleration (GPU) against parallelized Skia. DrawRect reaches 43.57 [fps] on three cores and 53.31 [fps] on the GPU. DrawArc reaches 50.98 [fps] on three cores and 39.98 [fps] on the GPU. DrawCircle2 reaches 50.77 [fps] on three cores and 10.1 [fps] on the GPU. The GPU is faster for DrawRect, while the three-core version is faster for DrawArc and DrawCircle2. Relative to the GPU, three cores give 1.28 times the FPS on DrawArc and 5.10 times on DrawCircle2.

5.6 Summary

This paper parallelized Skia, the 2D rendering library of Android, using Oprofile and the OSCAR compiler [... 20 ...]. With three cores, the blitter functions ran 2.77 times faster for DrawRect, 1.91 times for DrawArc, and 2.90 times for DrawCircle2. FPS improved 1.91 times for DrawRect, 1.32 times for DrawArc, and 1.50 times for DrawCircle2. Compared with the GPU, three cores gave 1.28 times the FPS on DrawArc and 5.1 times on DrawCircle2.

References

[1] Blake, G., Dreslinski, R. and Mudge, T.: A survey of multicore processors, IEEE Signal Processing Magazine, No. November, pp. 26-37 (2009).
[2] NVIDIA Corporation: Whitepaper NVIDIA Tegra Multi-processor Architecture, pp. 1-12.
[3] QUALCOMM Inc.: Snapdragon S4 Processors: System on Chip Solutions for a New Mobile Age (2012).
[4] Samsung Electronics Co., Ltd.: White Paper of Exynos 5, pp. 1-8 (2011).
[5] Mallón, D., Taboada, G. and Teijeiro, C.: Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures, Recent Advances in Parallel Virtual Machine and Message Passing Interface, Springer Berlin Heidelberg, pp. 174-184 (2009).
[6] Kasahara, H., Obata, M. and Ishizaka, K.: Automatic coarse grain task parallel processing on SMP using OpenMP, Workshop on Languages and Compilers for Parallel Computing, pp. 1-15 (2001).
[7] Google: skia 2D Graphics Library.
[8] Apple Inc.: Quartz 2D Programming Guide, Technical report (2012).
[9] Worth, C. and Packard, K.: Xr: Cross-device rendering for vector graphics, Ottawa Linux Symposium (2003).
[10] Google: Android Developers.
[11] Kim, Y.-J., Cho, S.-J., Kim, K.-J., Hwang, E.-H., Yoon, S.-H. and Jeon, J.-W.: Benchmarking Java application using JNI and native C application on Android (2012).
[12] Jim Huang: Hardware Accelerated 2D Rendering for Android, Android Builders Summit 2013 (2013).
[13] 0xlab: 0xbench.
[14] Ishizaka, K., Obata, M. and Kasahara, H.: Coarse Grain Task Parallel Processing with Cache Optimization on Shared Memory Multiprocessor, Proc. of 14th International Workshop on Languages and Compilers for Parallel Computing (LCPC2001) (2001).
[15] Obata, M., Shirako, J., Kaminaga, H., Ishizaka, K. and Kasahara, H.: Hierarchical Parallelism Control for Multigrain Parallel Processing, Lecture Notes in Computer Science, Vol. 2481, pp. 31-44 (2005).
[16] Shirako, J., Oshiyama, N., Wada, Y., Shikano, H., Kimura, K. and Kasahara, H.: Compiler Control Power Saving Scheme for Multi Core Processors, Lecture Notes in Computer Science, Vol. 4339, pp. 362-376 (2007).
[17] Kimura, K., Wada, Y., Nakano, H., Kodaka, T., Shirako, J., Ishizaka, K. and Kasahara, H.: Multigrain Parallel Processing on Compiler Cooperative Chip Multiprocessor, Proc. of 9th Workshop on Interaction between Compilers and Computer Architectures (INTERACT-9) (2005).
[18] Cohen, W.: Tuning Programs with OProfile, Wide Open Magazine, pp. 53-62 (2004).
[19] Lee, N. and Lim, S.-S.: A whole layer performance analysis method for Android platforms, 2011 9th IEEE Symposium on Embedded Systems for Real-Time Multimedia, pp. 1-1 (online), DOI: 10.1109/ESTIMedia.2011.6088515 (2011).
[20] NVIDIA: NVIDIA Developer Zone.
[21] ASUSTeK Computer Inc.: Nexus7 Specifications.
[22] ARM Corporation: Cortex-A9 Technical Reference Manual.