
LLVM Intro Syoyo Fujita syoyo@lucillerender.org

Agenda Intro & history LLVM overview Demo Pros & Cons LLVM Intermediate Language LLVM tools

LLVM: a Lightweight Language?

No! No! No!

LLVM: a Virtual Machine?

No! No! No!

LLVM

,!

No! No! No!

LLVM = Low Level Virtual Machine

!

LLVM

Low Level Virtual Machine

2000 Chris Lattner

!!!

LLVM (C++ )

LLVM overview

Frontend (C/C++, Java, Python, ...) -> LLVM IR (LLVM) -> Backend (x86, Sparc, PPC, ...)

Frontend -> LLVM IR -> Backend, using add_func as an example

C source (frontend input):
    int add_func(int a, int b) { return a + b; }

LLVM IR:
    define i32 @add_func(i32 %a, i32 %b) {
    entry:
      %tmp3 = add i32 %b, %a
      ret i32 %tmp3
    }

x86 output (backend):
    _add_func:
      movl 8(%esp), %eax
      addl 4(%esp), %eax
      ret

Frontend
    C/C++: clang, llvm-gcc
    Java
    Python: pypy
    Anything else: build LLVM IR through the LLVM IR API (LLVM C++ API)
Backend: x86, Sparc, PPC, ...

LLVM IR stage
    Passes: Alias analysis, DCE, user passes, ...
    Bitcode writer -> file
    Bitcode reader <- file

Backend
    Codegen, JIT facility
    Native CodeGen: Register Allocation, Instruction Scheduling
    Targets: x86, Sparc, PPC, ...

History
2000: Chris Lattner, LLVM
2005: ver 1.0; Apple, Chris hired, LLVM
2007: Leopard OpenGL, LLVM, iPhone
20XX: LLVM?

Demo

Pros & Cons

LLVM 1/2 llvm-gcc full C gcc (SIMD). OpenGL on Leopard, iphone, PhysX?, etc...

LLVM 2/2 ( x86 ) Illinois OSL(BSD license )

LLVM 1/3

LLVM 2/3, JIT SIMD

LLVM 3/3

LLVM 1/2 (VM runtime, JIT) VM runtime. LLVM. JIT AOT GC optional(pypy )

JIT

JIT references

Dynamic Languages Strike Back
    http://steve-yegge.blogspot.com/2008/05/dynamic-languages-strike-back.html
    http://www.stanford.edu/class/ee380/abstracts/080507-dynamiclanguages.pdf

Trace tree
    HotpathVM: An Effective JIT Compiler for Resource-constrained Devices
    http://www.usenix.org/events/vee06/full_papers/p144-gal.pdf
    Andreas Gal, http://andreasgal.com/

Double-dispatch specialization VM
    Efficient Just-In-Time Execution of Dynamically Typed Languages Via Code Specialization Using Precise Runtime Type Inference
    http://www.ics.uci.edu/~franz/site/pubs-pdf/ics-tr-07-10.pdf

Parrotcode: Parrot Virtual Machine
    http://www.parrotcode.org/

LLVM 2/2 (web, mobile, etc...) (C++ + STL ). (C++ + STL ) LLVM (LowLevel ).

Debug : 120 MB!!!

LLVM Illinois OSL(BSD ) STL (APFloat:, ) C++ gcc

LLVM IR. C++ assert bitcode

LLVM Intermediate Language

LLVM 1/2 LLVM IR Java SSA =>.
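The slide only names SSA (static single assignment), so here is a minimal sketch in C of what that property means. The function names `scale_plain` and `scale_ssa` are made up for illustration and do not come from the talk; the conditional expression merely stands in for the phi node that real SSA form would use where control flow merges.

```c
/* Ordinary C: the variable x is assigned more than once. */
int scale_plain(int a, int flag) {
    int x = a + 1;
    if (flag)
        x = x * 2;
    return x;
}

/* The same computation written SSA-style: every name is assigned
 * exactly once. Where the two control-flow paths merge, SSA would
 * use a phi node; the conditional expression plays that role here. */
int scale_ssa(int a, int flag) {
    const int x1 = a + 1;
    const int x2 = x1 * 2;          /* value of x on the "flag" path */
    const int x3 = flag ? x2 : x1;  /* stands in for phi(x2, x1) */
    return x3;
}
```

Because each name has exactly one definition, an optimizer can follow a value from its definition to its uses without tracking reassignments, which is the usual argument for SSA-based IRs.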

LLVM 2/2 LLVM LLVM IR LLVM LLVM IR( )

LLVM IR API: building IR from C++

C++:
    // create fib(x-1)
    Value *Sub = BinaryOperator::CreateSub(ArgX, One, "arg", RecurseBB);
    CallInst *CallFibX1 = CallInst::Create(FibF, Sub, "fibx1", RecurseBB);
    CallFibX1->setTailCall();

LLVM IR:
    %tmp2 = sub i32 %tmp1, 1 ; <i32> [#uses=1]
    %tmp3 = call i32 (...)* bitcast (i32 (i32)* @fib to i32 (...)*)( i32 %tmp2 ) nounwind ; <i32> [#uses=1]

float add_func(float a, float b) { return a + b; } LLVM

C -> LLVM

C:
    float add_func(float a, float b) { return a + b; }

LLVM IR:
    define float @add_func(float %a, float %b) {
    entry:
      %a_addr = alloca float                   ; <float*> [#uses=2]
      %b_addr = alloca float                   ; <float*> [#uses=2]
      %retval = alloca float                   ; <float*> [#uses=2]
      %tmp = alloca float                      ; <float*> [#uses=2]
      %"alloca point" = bitcast i32 0 to i32   ; <i32> [#uses=0]
      store float %a, float* %a_addr
      store float %b, float* %b_addr
      %tmp1 = load float* %a_addr, align 4     ; <float> [#uses=1]
      %tmp2 = load float* %b_addr, align 4     ; <float> [#uses=1]
      %tmp3 = add float %tmp1, %tmp2           ; <float> [#uses=1]
      store float %tmp3, float* %tmp, align 4
      %tmp4 = load float* %tmp, align 4        ; <float> [#uses=1]
      store float %tmp4, float* %retval, align 4
      br label %return
    return:                                    ; preds = %entry
      %retval5 = load float* %retval           ; <float> [#uses=1]
      ret float %retval5
    }

Reading the IR for add_func:
    @ : global identifier (functions, global variables), e.g. @add_func
    % : local identifier (registers, named values), e.g. %a, %tmp1


Stepping through the unoptimized add_func IR. The listing is the same on every slide; only the stack and reg contents change:

    on entry                   stack: (empty)                        reg: %a %b
    after the four alloca      stack: %a_addr %b_addr %retval %tmp   reg: %a %b
    after storing %a, %b       stack: %a_addr %b_addr %retval %tmp   reg: %a %b
    after the two loads        stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2
    after add                  stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2 %tmp3
    after storing %tmp3        stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2 %tmp3
    after loading %tmp         stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2 %tmp3 %tmp4
    after storing %tmp4        stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2 %tmp3 %tmp4
    after loading %retval      stack: %a_addr %b_addr %retval %tmp   reg: %a %b %tmp1 %tmp2 %tmp3 %tmp4 %retval5
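To make the stack/reg walk-through concrete, the unoptimized IR can be transliterated back into C, one statement per IR instruction. This is only a sketch: the function names `add_func_slots` and `add_func_direct` are hypothetical, and the C locals merely mimic the alloca slots and SSA values; a C compiler is free to treat them differently.

```c
/* Hypothetical C rendering of the unoptimized IR: each alloca becomes
 * a local stack slot, and every use goes through an explicit load or
 * store, mirroring the IR line by line. */
float add_func_slots(float a, float b) {
    float a_addr, b_addr, retval, tmp;  /* the four alloca slots */

    a_addr = a;                  /* store float %a, float* %a_addr */
    b_addr = b;                  /* store float %b, float* %b_addr */
    float tmp1 = a_addr;         /* %tmp1 = load float* %a_addr */
    float tmp2 = b_addr;         /* %tmp2 = load float* %b_addr */
    float tmp3 = tmp1 + tmp2;    /* %tmp3 = add float %tmp1, %tmp2 */
    tmp = tmp3;                  /* store float %tmp3, float* %tmp */
    float tmp4 = tmp;            /* %tmp4 = load float* %tmp */
    retval = tmp4;               /* store float %tmp4, float* %retval */
    float retval5 = retval;      /* %retval5 = load float* %retval */
    return retval5;              /* ret float %retval5 */
}

/* What remains once the stack slots are no longer needed: the values
 * stay in registers and only the add survives. */
float add_func_direct(float a, float b) {
    return a + b;
}
```

Both functions return the same value for the same inputs; the first just spells out the memory traffic that the unoptimized IR performs.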

?...

$ llvm-gcc -emit-llvm -S -O2 muda.c
Or: LLVM optimizer, bc -> bc

    define float @add_func(float %a, float %b) nounwind {
    entry:
      %tmp3 = add float %a, %b   ; <float> [#uses=1]
      ret float %tmp3
    }

LLVM tools

llvm-gcc

gcc:      C/C++ Parser -> GIMPLE  -> Backend      -> a.out
llvm-gcc: C/C++ Parser -> LLVM IR -> LLVM Backend -> a.out

$ llvm-gcc muda.c
C/C++ Parser -> LLVM IR -> LLVM Backend -> a.out

$ llvm-gcc -emit-llvm -c muda.c
C/C++ Parser -> LLVM IR -> muda.bc (LLVM BitCode); the LLVM Backend is not run

$ llvm-gcc -emit-llvm -S muda.c
C/C++ Parser -> LLVM IR -> muda.s (LLVM assembly); the LLVM Backend is not run

Other tools
opt: optimizes LLVM IR, bc -> bc
    $ opt -std-compile-opts <input.bc>
llc: LLVM static compiler, bc -> native
    $ llc -march=... -mcpu=... -mattr=...
lli

lli LLVM bc JIT (AOT ) -force-interpreter

fib.c

    #include <stdio.h>

    int fib(int a)
    {
        if (a < 2) return 1;
        return fib(a-2) + fib(a-1);
    }

    int main()
    {
        printf("fib(30) = %d\n", fib(30));
    }

JIT:
    $ llvm-gcc -emit-llvm -c fib.c
    $ time lli fib.o
    fib(30) = 1346269
    real 0m0.050s
    user 0m0.044s
    sys  0m0.006s

Interpreter:
    $ time lli -force-interpreter fib.o
    fib(30) = 1346269
    real 0m32.424s
    user 0m30.889s
    sys  0m0.207s
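The gap between the two timings is easier to interpret once you know how much work fib(30) actually is. The sketch below reuses the recursion from fib.c and adds a call counter; the counter and the names `fib_counted` and `count_fib_calls` are additions for illustration, not part of the talk.

```c
/* Same recursion as fib.c, plus a call counter (the counter is an
 * addition for illustration). */
static unsigned long fib_calls = 0;

int fib_counted(int a) {
    fib_calls++;                      /* one more function call */
    if (a < 2) return 1;
    return fib_counted(a - 2) + fib_counted(a - 1);
}

/* Returns how many calls a single fib_counted(n) evaluation makes. */
unsigned long count_fib_calls(int n) {
    fib_calls = 0;
    (void)fib_counted(n);
    return fib_calls;
}
```

For this definition the number of calls is 2 * fib(n) - 1, so fib(30) performs 2,692,537 calls. Per-call and per-instruction overhead is therefore what the benchmark mostly measures, which is where an interpreter loses to JIT-compiled native code.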

llc LLVM LLVM bc -> native obj experimental

$ llc --march=x86 -mcpu=help

-mcpu=
    athlon       - Select the athlon processor.
    athlon-4     - Select the athlon-4 processor.
    athlon-fx    - Select the athlon-fx processor.
    athlon-mp    - Select the athlon-mp processor.
    athlon-tbird - Select the athlon-tbird processor.
    athlon-xp    - Select the athlon-xp processor.
    athlon64     - Select the athlon64 processor.
    c3           - Select the c3 processor.
    c3-2         - Select the c3-2 processor.
    core2        - Select the core2 processor.
    generic      - Select the generic processor.
    i386         - Select the i386 processor.
    i486         - Select the i486 processor.
    i686         - Select the i686 processor.
    k6           - Select the k6 processor.
    k6-2         - Select the k6-2 processor.
    k6-3         - Select the k6-3 processor.
    k8           - Select the k8 processor.
    nocona       - Select the nocona processor.
    opteron      - Select the opteron processor.
    penryn       - Select the penryn processor.
    pentium      - Select the pentium processor.
    pentium-m    - Select the pentium-m processor.
    pentium-mmx  - Select the pentium-mmx processor.
    pentium2     - Select the pentium2 processor.
    pentium3     - Select the pentium3 processor.
    pentium4     - Select the pentium4 processor.
    pentiumpro   - Select the pentiumpro processor.
    prescott     - Select the prescott processor.
    winchip-c6   - Select the winchip-c6 processor.
    winchip2     - Select the winchip2 processor.
    x86-64       - Select the x86-64 processor.
    yonah        - Select the yonah processor.

-mattr=
    3dnow  - Enable 3DNow! instructions.
    3dnowa - Enable 3DNow! Athlon instructions.
    64bit  - Support 64-bit instructions.
    mmx    - Enable MMX instructions.
    sse    - Enable SSE instructions.
    sse2   - Enable SSE2 instructions.
    sse3   - Enable SSE3 instructions.
    sse41  - Enable SSE 4.1 instructions.
    sse42  - Enable SSE 4.2 instructions.
    ssse3  - Enable SSSE3 instructions.

input.ll:

    define void @t1(float* %R, <4 x float>* %P1) {
      %X = load <4 x float>* %P1
      %tmp = extractelement <4 x float> %X, i32 3
      store float %tmp, float* %R
      ret void
    }

    $ llvm-as < input.ll | llc -march=x86 -mattr=+sse41
    ...
    _t1:
    Leh_func_begin1:
    Llabel1:
      movl 8(%esp), %eax
      movaps (%eax), %xmm0
      movl 4(%esp), %eax
      extractps $3, %xmm0, (%eax)
      ret
    Leh_func_end1:
    ...
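For readers who think in C rather than IR, roughly the same function can be written with the GCC/Clang vector extension. This is a sketch under that assumption: `vector_size` and vector subscripting are compiler extensions, not standard C, and the type name `v4sf` is just a conventional choice. Whether a given compiler emits `extractps` for it depends on the target flags.

```c
/* GCC/Clang vector extension (an assumption; not standard C):
 * four floats packed into one 16-byte value, like <4 x float>. */
typedef float v4sf __attribute__((vector_size(16)));

/* C counterpart of @t1: load the vector, take lane 3, store it to *R. */
void t1(float *R, const v4sf *P1) {
    v4sf X = *P1;   /* %X = load <4 x float>* %P1 */
    *R = X[3];      /* extractelement <4 x float> %X, i32 3, then store */
}
```

Lane indices are zero-based, so index 3 selects the last of the four floats, matching the `$3` immediate in the generated `extractps`.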

?