Mastering the Game of Go without Human Knowledge, rev.1 (2017/11/26)
- さや ことじ
6.1
Figure 6.1
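Dual network

MEMO: The papers referred to are:
- David Silver, et al., "Mastering the Game of Go without Human Knowledge", Nature, 2017
- David Silver, et al., "Mastering the game of Go with deep neural networks and tree search", Nature, 2016

Network structure (Figure 6.2)
- Input: 17 channels
- Layer 1: convolution, ReLU
- Layers 2 to 39: convolution, ReLU
- Output: 2 heads

Policy head
- Layer 1: convolution, ReLU
- Layer 2: 362 outputs
- Output: 362 probabilities (361 board points plus pass)

Value head
- Layer 1: convolution, ReLU
- Layer 2: 256 units, ReLU
- Layer 3: 1 unit, tanh
- Output: 1 value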
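Input features: the 2016 network takes 48 input channels (Figure 6.3(a)), whereas the 2017 network takes only 17 (Figure 6.3(b)). Instead of the hand-designed features of Figure 6.3(a), the 17 channels of Figure 6.3(b) hold only the stone positions of the current board and of the boards n (= 1 to 7) moves earlier.
Figure 6.3: (a) 48-channel input (b) 17-channel input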
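As a rough illustration of the 17-channel input, the sketch below stacks the current board and the seven preceding boards into 8 planes for the player to move and 8 for the opponent, plus one plane for the side to move, as described in the 2017 Nature paper. The board encoding (1 = black, -1 = white, 0 = empty), the function name `encode_planes`, the plane ordering, and the zero-padding of missing history are all assumptions made for this example.

```python
BLACK, WHITE, EMPTY = 1, -1, 0  # hypothetical board encoding


def encode_planes(history, to_play, size=19, steps=8):
    """Build 17 binary planes from a list of boards (oldest first).

    history: list of size x size boards, each a list of lists.
    to_play: BLACK or WHITE.
    Missing history (early in the game) is padded with all-zero planes.
    """
    recent = history[-steps:][::-1]  # newest first: t, t-1, ..., t-7
    own, opp = [], []
    for i in range(steps):
        if i < len(recent):
            board = recent[i]
            own.append([[1 if v == to_play else 0 for v in row] for row in board])
            opp.append([[1 if v == -to_play else 0 for v in row] for row in board])
        else:
            own.append([[0] * size for _ in range(size)])
            opp.append([[0] * size for _ in range(size)])
    # Last plane: all ones if black is to play, all zeros otherwise.
    colour = [[1 if to_play == BLACK else 0] * size for _ in range(size)]
    return own + opp + [colour]
```

With fewer than eight boards in `history`, the older planes are simply empty, so the function can be called from the first move of a game.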
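In the 2017 version a single network outputs both the move probabilities p and the value v.

The network is a residual network (ResNet). As shown in Figure 6.4(a), it stacks 19 residual blocks (Figure 6.4(b)), each built from 3x3 convolutions with 256 filters (3x3 Conv 256), batch normalization (Bn), and ReLU.

MEMO: Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, "Deep residual learning for image recognition", Computer Vision and Pattern Recognition (CVPR).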
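Figure 6.4

The training data are M samples $\{(s_k, \pi_k, z_k)\}_{k=1}^{M}$, each consisting of a position $s_k$, a move distribution $\pi_k$, and a game result $z_k$. $\pi = \{\pi_a\}_{a=1}^{A}$ assigns a probability to each of the A candidate moves a (1 for the move a that was played and 0 for the others), and z is the result of the game (+1 or -1).

The network $f_\theta$ with parameters θ takes a position s and outputs the probability p(s, a) of each move a and the value v(s):

$$(p, v) = f_\theta(s) \tag{6.1}$$

θ is trained so that (p, v) approaches (π, z), using the loss function

$$L_\theta = \sum_{k=1}^{M} \left\{ (z_k - v_k)^2 - \sum_{a=1}^{A} \pi_a^k \log p_a^k \right\} + c \sum_i \theta_i^2 \tag{6.2}$$

Here $(z_k - v_k)^2$ is the squared error between z and v, $-\sum_{a=1}^{A} \pi_a^k \log p_a^k$ is the cross-entropy between $\pi = \{\pi_a\}_{a=1}^{A}$ and $p = \{p_a\}_{a=1}^{A}$, and $c \sum_i \theta_i^2$ is a regularization term on θ (weight decay). The 2017 paper uses $c = 10^{-4}$.

θ is updated in the direction that decreases $L_\theta$:

$$\theta \leftarrow \theta - \alpha \, \Delta\theta \tag{6.3}$$

$$\Delta\theta = \frac{\partial L_\theta}{\partial \theta} \tag{6.4}$$

α is the learning rate, and the update is carried out with SGD.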
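The loss of Eq. (6.2) and the update of Eqs. (6.3) and (6.4) can be sketched in plain Python as follows. This is only a minimal illustration, not the Chainer code of the original: the function names `loss` and `sgd_step`, the flat-list representation of θ, and the assumption that the gradients are supplied by the caller are all choices made for this example.

```python
import math


def loss(samples, params, c=1e-4):
    """Eq. (6.2): sum_k {(z - v)^2 - sum_a pi_a log p_a} + c * sum_i theta_i^2.

    samples: list of (pi, z, p, v); p and v are the network outputs for a
             position, pi and z are the training targets.
    params:  flat list of the parameters theta_i.
    """
    total = 0.0
    for pi, z, p, v in samples:
        value_err = (z - v) ** 2
        # Terms with pi_a = 0 contribute nothing and are skipped.
        cross_entropy = -sum(t * math.log(q) for t, q in zip(pi, p) if t > 0)
        total += value_err + cross_entropy
    return total + c * sum(w * w for w in params)


def sgd_step(params, grads, alpha):
    """Eqs. (6.3)-(6.4): theta <- theta - alpha * dL/dtheta."""
    return [w - alpha * g for w, g in zip(params, grads)]
```

For a single sample with target π = (1, 0), z = 1 and outputs p = (0.5, 0.5), v = 0, the loss without parameters is 1 + log 2: the value error contributes 1 and the cross-entropy contributes log 2.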
Figure 6.5: Chainer
6.3 Monte Carlo tree search (MCTS)

MEMO: Conventional MCTS picks moves with a UCB score and repeats four operations: Selection, Evaluation, Backup, and Expansion.
Figure 6.6: (a) Select the move a that maximizes Q(s, a) + u(s, a) (b) Expand the node and evaluate p, v (c) Back up the result
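Step 1 (selection)
From position s, the search descends the tree by choosing the move a that maximizes Q(s, a) + u(s, a), where

$$Q(s, a) = \frac{W(s, a)}{N(s, a)} \tag{6.5}$$

$$u(s, a) = c_{\mathrm{puct}} \, p(s, a) \, \frac{\sqrt{\sum_b N(s, b)}}{1 + N(s, a)} \tag{6.6}$$

Q(s, a) is the average value of move a. u(s, a) is a bonus for move a that is proportional to the prior p(s, a), and the factor $\sqrt{\sum_b N(s, b)} / (1 + N(s, a))$ shrinks as a itself is visited more often. $c_{\mathrm{puct}}$ is the constant that balances Q(s, a) against u(s, a).

Step 2 (expansion and evaluation)
At the position s reached in Step 1, $f_\theta$ computes p(s, a) and v(s). A node is expanded on its first visit, not after n (= 40) visits.

Step 3 (backup)
For every pair (s, a) on the path, W(s, a) and N(s, a) are updated:

$$N(s, a) = N(s, a) + 1 \tag{6.7}$$

$$W(s, a) = W(s, a) + v(s) \tag{6.8}$$

With N(s, a) and W(s, a) maintained this way, W(s, a)/N(s, a) is the average value of move a at position s as estimated by MCTS.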
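Steps 1 and 3 can be sketched as follows, using Eqs. (6.5) to (6.8) as written. The `Edge` class, the function names, the default `c_puct=1.0`, and the convention that Q is 0 for an unvisited move are assumptions for illustration. The value v is added without flipping its sign between the two players, which matches Eq. (6.8) as stated but simplifies what a full Go implementation would need.

```python
import math


class Edge:
    """Statistics kept for one move a at a position s."""

    def __init__(self, prior):
        self.p = prior  # p(s, a), prior from the network
        self.n = 0      # N(s, a), visit count
        self.w = 0.0    # W(s, a), accumulated value

    def q(self):
        # Eq. (6.5); taken to be 0 for an unvisited move (assumption).
        return self.w / self.n if self.n else 0.0


def select(edges, c_puct=1.0):
    """Step 1: return the move maximizing Q(s, a) + u(s, a)."""
    sqrt_total = math.sqrt(sum(e.n for e in edges.values()))

    def score(move):
        e = edges[move]
        u = c_puct * e.p * sqrt_total / (1 + e.n)  # Eq. (6.6)
        return e.q() + u

    return max(edges, key=score)


def backup(path, v):
    """Step 3: apply Eqs. (6.7) and (6.8) to every edge on the path."""
    for e in path:
        e.n += 1
        e.w += v
```

After one losing visit to a move, its Q drops and the exploration bonus of the unvisited sibling takes over, so `select` switches to the other move on the next simulation.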
Step 4 (play)
Steps 1 to 3 are repeated N times (1600 times in the 2017 paper), after which the move to play is decided (Step 4).
Figure 6.7
17 TPU 4000 / MCTS 1 6 MCTS 6 B* 17
18 ( )(MEMO ) θ MEMO: ( ) AI AI 1 1 AI 1 AI AI AI AI? SL RL (MEMO ) MEMO: Q ( )
19 6.8: ,, 3 ( )θ f θ ( z π) θ f θ θ f θ f θ θ θ θ Step 1 θ Step 3 θ (Step 4) θ θ 19
Figure 6.9: $f_\theta$
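In self-play, each move is chosen with the MCTS of Figure 6.7 (Step 4). For every position s, the visit counts N(s, a) and the final result z are recorded as training data. Unlike the 0-1 targets used when learning from game records, the target π is not a 0-1 vector: each $\pi_a$ is computed from the MCTS visit counts N(s, a):

$$\pi_a = \frac{N(s, a)^{1/\tau}}{\sum_b N(s, b)^{1/\tau}} \tag{6.9}$$

τ is a temperature parameter. With τ = 0, the move a with the largest N(s, a) is chosen with probability 100%, so play follows the MCTS result exactly. With τ = 1, each move a is chosen in proportion to N(s, a).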
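Eq. (6.9) can be sketched in a few lines. The function name `visit_policy` and the dictionary representation of the visit counts are assumptions for illustration. Since 1/τ is undefined at τ = 0, that case is handled as the limit in which all probability goes to the most visited move, with ties broken in favor of the first such move.

```python
def visit_policy(counts, tau=1.0):
    """Eq. (6.9): pi_a proportional to N(s, a)^(1/tau).

    counts: dict mapping each move a to its visit count N(s, a).
    tau:    temperature; tau = 0 is treated as the greedy limit.
    """
    if tau == 0:
        best = max(counts, key=counts.get)
        return {a: 1.0 if a == best else 0.0 for a in counts}
    powered = {a: n ** (1.0 / tau) for a, n in counts.items()}
    total = sum(powered.values())
    return {a: x / total for a, x in powered.items()}
```

For visit counts of 1200 and 400, τ = 1 gives probabilities 0.75 and 0.25, while τ = 0 gives 1 and 0.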
23 θ θ 1,000 f θ f θ 400 ( ) f θ f θ 220 θ θ ( / ) 150( / ) 490 ( ) 2.9 (6.10) ( ) ( / ) ( ) = 40 (6.11) ? TPU 4 TPU GPU ( ) 98% 23
24 TPU CPU 1? GPU CPU 20 TPU GPU ( ) 20( ) 30( ) 4( ) 1000( ) = 720 (6.12) ? f θ θ? (MEMO ) MEMO: 2 ( ) 1 100% 6.10(a) s f θ v z θ ( 6.10(a-1)) f θ s v v ( 6.10(a-2)) z new z new ( 6.10(a-3)) z new 24
25 θ 6.10(b) s f θ p N(s, a) π θ ( 6.10(b-1)) p = f θ ( 6.10(b-2)) N(s, a) π new π new ( 6.10(b-3)) θ? 25
26 6.10: (MEMO )
27 ( AlphaGo Lee) MEMO: % : AI AI AI ( )
28 AI AI (MEMO ) AI MEMO:, Webpage: (Last access: 2017/11/4) 28
29 (AlphaGoFan) (AlphaGoLee) (AlphaGoMaster) (AlphaGoZero) : (a) ( 2017 ) (b)
30 6.12(a) (b) AlphaGoFan, AlphaGoLee AlphaGoMaster, AlphaGoZero 4TPU 1 AlphaGoZero
31 6.6 AI 3 1 3? 3 4TPU 1000 CPU 1 PC 20,000 3? MCTS π, v 1? 1 4TPU CPU2400 GPU120 AI AI 31
