Mastering the Game of Go without Human Knowledge ( ) AI 3 1 AI 1 rev.1 (2017/11/26) 1 6 2


1 ?

2 Mastering the Game of Go without Human Knowledge ( ) AI 3 1 AI 1 rev.1 (2017/11/26) 1 6 2

3 6.1 ( ) AI (MEMO ) 2 3 AI MEMO: ( ) %, % 3

4 6.1: 3 4

5 (MEMO ) dual
MEMO: David Silver, et al., "Mastering the Game of Go without Human Knowledge", Nature, 2017; David Silver, et al., "Mastering the game of Go with deep neural networks and tree search", Nature, 2016.
: 17 1 : ReLU 2 39 : ReLU 2 :

6 1 : ReLU 2 : 362 : 362 (361 ) : 1 : ReLU 2 : 256 ReLU 3 : 1 tanh : 1 ( ) 6.2: 6

7 48 ( 6.3(a)) 17 ( 6.3(b)) ( 6.3(a)) n(=1 7) ( 6.3(b)) n 6.3: (a) 48 (b) 17 7
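The 17-channel input of Figure 6.3(b) can be sketched as follows. This is a minimal illustration, assuming the layout described in the 2017 paper: stone planes for the player to move and for the opponent at the current position and the 7 preceding positions, plus one plane for the colour to move. The names `build_input_planes` and `stone_plane`, the board encoding (0 empty, 1 black, 2 white) and the exact plane ordering are assumptions made for this example.

```python
EMPTY, BLACK, WHITE = 0, 1, 2  # hypothetical board encoding for this sketch


def stone_plane(board, color):
    """Return a 0/1 plane marking the points occupied by `color`."""
    return [[1 if cell == color else 0 for cell in row] for row in board]


def build_input_planes(history, to_play, size=19, steps=8):
    """Build 2 * steps + 1 planes (17 for steps=8).

    history: list of boards, oldest first, most recent last.
    to_play: BLACK or WHITE, the player about to move.
    Positions before the start of the game are treated as empty boards.
    """
    opponent = WHITE if to_play == BLACK else BLACK
    empty_board = [[EMPTY] * size for _ in range(size)]
    planes = []
    for t in range(steps):
        # t = 0 is the current position, t = 1 is one move earlier, and so on.
        board = history[-1 - t] if t < len(history) else empty_board
        planes.append(stone_plane(board, to_play))
        planes.append(stone_plane(board, opponent))
    # Final plane: all 1 if black is to play, all 0 otherwise.
    colour = 1 if to_play == BLACK else 0
    planes.append([[colour] * size for _ in range(size)])
    return planes
```

For a 19 x 19 board this yields 17 planes of 361 values each, in contrast to the 48 hand-designed planes of Figure 6.3(a).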

8 (MEMO ) 2017 p v 2 2 MEMO: (ResNet) (MEMO ) 6.4(a) 19 ( 6.4(b)) (3x3 Conv 256) (Bn) ReLU (ReLU) 2 3 MEMO: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Deep residual learning for image recognition. Computer Vision and Pattern Recognition (CVPR), (a) ReLU 8

9 3x3 39 : ( :39) = 83 : ( :39) = SL 5 6.4: s k π k, z k M {(s k, π k, z k )} M k=1 9

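10 Notation: for a position $s$ with $A$ candidate moves $a$, $\pi = \{\pi_a\}_{a=1}^{A}$ is the target over moves (with entries 1 and 0), $z$ ($+1$, $-1$) is the target outcome, $\theta$ are the parameters, and $f_\theta(s)$ outputs $p(s, a)$ and $v(s)$. The outputs $(p, v)$ are fitted to $(\pi, z)$ by minimizing $L_\theta$.

$(p, v) = f_\theta(s)$ (6.1)

$L_\theta = \sum_{k=1}^{M} \left\{ (z^k - v^k)^2 - \sum_{a=1}^{A} \pi_a^k \log p_a^k \right\} + c \sum_i \theta_i^2$ (6.2)

Terms of (6.2): $(z^k - v^k)^2$ compares $z$ and $v$; $-\sum_{a=1}^{A} \pi_a^k \log p_a^k$ compares $\pi = \{\pi_a\}_{a=1}^{A}$ and $p = \{p_a\}_{a=1}^{A}$; $\sum_i \theta_i^2$ is the weight decay on $\theta$ with coefficient $c$ (2017: $c = 10^{-4}$).

$\theta \leftarrow \theta - \alpha \Delta\theta$ (6.3)

$\Delta\theta = \frac{\partial L_\theta}{\partial \theta}$ (6.4)

Here $\alpha$ is the learning rate. SGD % 57.0% Chainer Chainer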
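The loss (6.2) and the update (6.3) can be checked numerically with a few lines of plain Python. This is only a sketch: the function names `dual_network_loss` and `sgd_step` and the flat-list representation of $\theta$ are assumptions for illustration, and the gradient of (6.4) is passed in rather than computed, since a real implementation (for example in Chainer) would obtain it by backpropagation.

```python
import math


def dual_network_loss(samples, theta, c=1e-4):
    """Evaluate (6.2) for a list of samples.

    samples: list of (pi, z, p, v), where pi and p are lists of length A,
             z is the target outcome (+1 or -1) and v the predicted value.
    theta:   flat list of parameters, used only for the weight-decay term.
    """
    total = 0.0
    for pi, z, p, v in samples:
        value_term = (z - v) ** 2
        # Cross-entropy term; entries with pi_a = 0 contribute nothing.
        policy_term = -sum(pi_a * math.log(p_a)
                           for pi_a, p_a in zip(pi, p) if pi_a > 0)
        total += value_term + policy_term
    return total + c * sum(w * w for w in theta)


def sgd_step(theta, grad, alpha):
    """Apply (6.3): theta <- theta - alpha * grad, with grad as in (6.4)."""
    return [w - alpha * g for w, g in zip(theta, grad)]
```

For a single sample with $\pi = (1, 0)$, $z = 1$, $p = (0.5, 0.5)$ and $v = 0$, the value term is 1 and the policy term is $\log 2$, so the loss without weight decay is $1 + \log 2$.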

11 6.5: Chainer

12 12

13 6.3 (MCTS) MCTS (MEMO ) MEMO: MCTS UCB ( + ) (Selection) ( ) (Evaluation) (Backup) (Expansion) MCTS MCTS MCTS

14 6.6: (a) (Q(s, a)+ u(s, a)) a (b) p, v (c) Step

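15 Step 1: from position $s$, select the move $a$ that maximizes $Q(s, a) + u(s, a)$, where

$Q(s, a) = \frac{W(s, a)}{N(s, a)}$ (6.5)

$u(s, a) = c_{puct} \, p(s, a) \, \frac{\sqrt{\sum_b N(s, b)}}{1 + N(s, a)}$ (6.6)

$u(s, a)$ combines the prior $p(s, a)$ of move $a$ with the factor $\frac{\sqrt{\sum_b N(s, b)}}{1 + N(s, a)}$, and $c_{puct}$ weights $u(s, a)$ against $Q(s, a)$.

Step 2: the position $s$ is evaluated with $f_\theta$, which gives $p(s, a)$ and $v(s)$. n (= 40)

Step 3: $W(s, a)$ and $N(s, a)$ are updated for the position $s$ and move $a$:

$N(s, a) \leftarrow N(s, a) + 1$ (6.7)

$W(s, a) \leftarrow W(s, a) + v(s)$ (6.8)

The ratio $W(s, a)/N(s, a)$, that is $Q(s, a)$ in (6.5), is the mean of the values backed up through move $a$ at $s$ in MCTS.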

16 Step 4( ) Step 1 3 N ( 1600 ) (Step 4) 6.7: 1 16
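Steps 1 to 4 can be sketched for a single root position as follows. This is a deliberately reduced illustration, not the full tree search: only the moves at the root are tracked, the call to $f_\theta$ in Step 2 is replaced by a hypothetical `evaluate` callback that returns a value from the root player's point of view, $Q(s, a)$ is taken to be 0 while $N(s, a) = 0$, the final move is taken to be the most visited one, and `c_puct=1.0` is an arbitrary choice. All names are assumptions for this example.

```python
import math


class Edge:
    """Statistics of one move a at the root: prior p, count N, total value W."""

    def __init__(self, prior):
        self.p = prior
        self.n = 0
        self.w = 0.0

    def q(self):
        # (6.5); assumed to be 0 for unvisited moves.
        return self.w / self.n if self.n > 0 else 0.0


def select_action(edges, c_puct=1.0):
    """Step 1: pick the move maximizing Q(s, a) + u(s, a), with u as in (6.6)."""
    total = sum(e.n for e in edges.values())

    def score(a):
        e = edges[a]
        u = c_puct * e.p * math.sqrt(total) / (1 + e.n)
        return e.q() + u

    return max(edges, key=score)


def backup(edge, v):
    """Step 3: apply (6.7) and (6.8)."""
    edge.n += 1
    edge.w += v


def run_simulations(priors, evaluate, num_simulations=1600, c_puct=1.0):
    """Repeat Steps 1 to 3 num_simulations times and return the statistics."""
    edges = {a: Edge(p) for a, p in priors.items()}
    for _ in range(num_simulations):
        a = select_action(edges, c_puct)
        v = evaluate(a)  # Step 2 stand-in for the value output of f_theta
        backup(edges[a], v)
    return edges


def most_visited(edges):
    """Step 4 (as assumed here): play the move with the largest N(s, a)."""
    return max(edges, key=lambda a: edges[a].n)
```

With equal priors and an evaluator that always returns -1 for one move and +1 for the other, nearly all simulations end up on the second move, while the first still receives an occasional visit because $u(s, a)$ grows with $\sqrt{\sum_b N(s, b)}$.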

17 TPU 4000 / MCTS 1 6 MCTS 6 B* 17

18 ( )(MEMO ) θ MEMO: ( ) AI AI 1 1 AI 1 AI AI AI AI? SL RL (MEMO ) MEMO: Q ( )

19 6.8: ,, 3 ( )θ f θ ( z π) θ f θ θ f θ f θ θ θ θ Step 1 θ Step 3 θ (Step 4) θ θ 19

20

21 6.9: f θ 21

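22 MCTS ( 6.7 Step 4) % 9 1 s a N(s, a) z 50 A = π z π % MCTS p 0-1 π 0-1 π a MCTS N(s, a)

$\pi_a = \frac{N(s, a)^{1/\tau}}{\sum_b N(s, b)^{1/\tau}}$ (6.9)

Here $\tau$ is the temperature parameter. With $\tau = 0$, the move $a$ with the largest $N(s, a)$ is chosen with probability 100%. With $\tau = 1$, move $a$ is chosen in proportion to $N(s, a)$.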
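Equation (6.9) can be sketched in a few lines. The function name `visit_counts_to_policy` is an assumption for this example, $\tau = 0$ is handled as the limiting case described above, and ties at $\tau = 0$ are split evenly, which is a choice made here rather than something stated in the source. Counts are divided by their maximum before exponentiation, which leaves (6.9) unchanged but avoids overflow for small $\tau$.

```python
def visit_counts_to_policy(counts, tau):
    """Turn visit counts N(s, a) into probabilities pi_a following (6.9)."""
    best = max(counts)
    if best <= 0:
        raise ValueError("at least one move must have been visited")
    if tau == 0:
        # Limit tau -> 0: all probability on the most visited move(s).
        winners = [i for i, n in enumerate(counts) if n == best]
        return [1.0 / len(winners) if i in winners else 0.0
                for i in range(len(counts))]
    # Scaling by the maximum cancels in the ratio of (6.9).
    weights = [(n / best) ** (1.0 / tau) for n in counts]
    total = sum(weights)
    return [w / total for w in weights]
```

For counts (10, 30, 60), $\tau = 1$ gives (0.1, 0.3, 0.6), and $\tau = 0$ gives (0, 0, 1).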

23 θ θ 1,000 f θ f θ 400 ( ) f θ f θ 220 θ θ ( / ) 150( / ) 490 ( ) 2.9 (6.10) ( ) ( / ) ( ) = 40 (6.11) ? TPU 4 TPU GPU ( ) 98% 23

24 TPU CPU 1? GPU CPU 20 TPU GPU ( ) 20( ) 30( ) 4( ) 1000( ) = 720 (6.12) ? f θ θ? (MEMO ) MEMO: 2 ( ) 1 100% 6.10(a) s f θ v z θ ( 6.10(a-1)) f θ s v v ( 6.10(a-2)) z new z new ( 6.10(a-3)) z new 24

25 θ 6.10(b) s f θ p N(s, a) π θ ( 6.10(b-1)) p = f θ ( 6.10(b-2)) N(s, a) π new π new ( 6.10(b-3)) θ? 25

26 6.10: (MEMO )

27 ( AlphaGo Lee) MEMO: % : AI AI AI ( )

28 AI AI (MEMO ) AI MEMO:, Webpage: (Last access: 2017/11/4) 28

29 (AlphaGoFan) (AlphaGoLee) (AlphaGoMaster) (AlphaGoZero) : (a) ( 2017 ) (b)

30 6.12(a) (b) AlphaGoFan, AlphaGoLee AlphaGoMaster, AlphaGoZero 4TPU 1 AlphaGoZero

31 6.6 AI 3 1 3? 3 4TPU 1000 CPU 1 PC 20,000 3? MCTS π, v 1? 1 4TPU CPU2400 GPU120 AI AI 31
