Future University Hakodate 2016 System Information Science Practice Group Report
Project Name: AI love Deep Learning
Group Name: TORCS Deep Learning
Project No.: 14-B
Project Leader: 1014041 Daichi Fukuda
Group Leader: 1014053 Masafumi Takahashi
Group Members: 1014018 Kaede Noto, 1014023 Sora Ito, 1014053 Masafumi Takahashi, 1014066 Masataka Kato, 1014094 Reina Saito, 1014126 Tomoya Minamoto
Advisors: Takashi Takenouchi, Kiyohito Nagano, Kengo Terasawa, Yasuhiro Katagiri
Date of Submission: January 18, 2017


Abstract
Recently, AI has attracted attention because it can imitate humans in a variety of tasks. Machine learning is one of the technologies behind AI; it is used to implement capabilities comparable to humans' natural learning abilities. Deep Learning in particular has achieved good results in the field of image processing. The goal of this project is to imitate and surpass human thinking by using machine learning. After discussing this goal, we split into two groups, group A and group B. Group A aimed to develop a system that predicts combinations of pitches. Group B aimed to develop a car agent that can drive faster than humans. We belong to group B and use Deep Reinforcement Learning to develop such a car agent. Deep Reinforcement Learning is a technique that applies Deep Learning to function approximation in Reinforcement Learning. For learning to proceed well, the network, the rewards, and the environment must be set appropriately. In the first semester, we implemented Long Short Term Memory (LSTM) in Python and the rewards and environment of a racing game in Unity, and we trained the car agent with them. In the second semester, we trained the car agent on TORCS, an open-source car simulator, using the Asynchronous method, Deep Q-Network (DQN), and LSTM. In the end, the car agent trained with the Asynchronous method could drive an oval track faster than humans.
Keywords: Artificial Intelligence, Machine Learning, Deep Learning, Reinforcement Learning, Unity, TORCS

1 [...]
1.1 [...]
1.2 [...]
1.3 [...]
1.4 [...]
1.5 [...]
2 [...]
2.1 [...]
2.2 [...]
2.3 [...]
2.3.1 [...]
2.3.2 [...]
2.3.3 [...]
2.3.4 [...]
2.3.5 [...]
2.4 [...]
2.4.1 [...]
2.4.2 [...]
3 [...]
3.1 [...]
3.2 [...]
4 [...]
4.1 [...]
4.1.1 Python
4.1.2 Unity
4.2 [...]
4.2.1 Asynchronous method
4.2.2 Deep Q-Network
4.2.3 Long Short Term Memory
4.3 [...]
4.3.1 ([...], Python, Asynchronous method [...])
4.3.2 Python, DQN
4.3.3 (Python, LSTM [...])
4.3.4 Unity, DQN
4.3.5 Unity, LSTM
4.3.6 Unity, Asynchronous method
5 [...]
5.1 Asynchronous method
5.2 Deep Q-Network
5.3 Long Short Term Memory
6 [...]
7 [...]
8 [...]
8.1 [...]
8.2 [...]
8.2.1 Unity
8.2.2 Python
8.3 [...]
8.3.1 Asynchronous method
8.3.2 Deep Q-Network
8.3.3 Long Short Term Memory
8.4 [...]
8.4.1 [...]
8.4.2 [...]
8.5 [...]
8.5.1 [...]
8.5.2 [...]
8.6 [...]
8.6.1 [...]
8.6.2 [...]
9 [...]
9.1 [...]
9.1.1 [...]
9.1.2 [...]
9.2 [...]
9.2.1 [...]
9.2.2 [...]
9.2.3 [...]

1
1.1 [...] ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) [...] 10% [1]. [...] Google Google Brain YouTube [...] [2]. [...] DeepMind AlphaGo [...] 4 [3].
1.2 [...]
1.3 [...] DeepMind Deep Q-Network (DQN). DQN [...] DQN [...] Atari2600 49 [...] 43 [...] 29 [...] DQN 1.1 [...]
Group Report of 2016 SISP - 1 - Group Number 14-B

1.1: DQN
1.4 DQN 2 [...] 1 [...] DQN [...] DQN [...] 2 [...] 1 [...] DQN 4 [...] 4 [...]
1.5 [...] 2 [...] 1 [...] 2 [...] 2 [...]

2
2.1 [...] 1 [...]
2.2 [...] DQN [...] 1.4 [...] (1) [...] (2) [...]
2.3 2 [...] 2 [...]
2.3.1 [...] Unity Python, Experience Replay, Fixed Target Q-Network, Long Short Term Memory (LSTM).

2.3.2 [...] Unity (1) [...] (2) [...] Experience Replay, Fixed Target Q-Network (1) [...] (2) Deep Q-Network [...] (1) [...] Long Short Term Memory (LSTM) (2) [...]
2.3.3 [...] OpenAI OpenAI gym [...]
2.3.4 [...] TORCS Python, Asynchronous Method, Deep Q-Network, Long Short Term Memory (LSTM). OpenAI gym [...]

2.3.5 [...] TORCS (1) [...] (2) [...] Asynchronous Method, Deep Q-Network, Long Short Term Memory (1) [...] (2) Asynchronous Method, Deep Q-Network, Long Short Term Memory [...] (1) [...] OpenAI gym OpenAI gym [...]
2.4
2.4.1 Python, Unity Python, Unity 2 [...] Python Python-Unity LSTM [...] Experience Replay [...] Fixed Target Q-Network [...] Unity [...] Python [...]

2.4.2 Python, TORCS [...] Python Unity [...] 2 3 [...] Python Unity 1 [...] Unity [...] Python [...] Asynchronous method : Asynchronous method [...] Deep Q-Network : [...] : Deep Q-Network [...] Long Short Term Memory : Long Short Term Memory [...] : Long Short Term Memory [...]

3 [...] 1 [...] 2 [...] Python [...] 3 Python [...] TORCS TORCS [...] 1 2 [...] 3 [...]
3.1 [...] Python [...] Unity 2 [...] Python [...] Unity [...] Python [...] Unity [...] Python ILSVRC2012 AlexNet [...] Unity [...] 6 [...] MessagePack WebSocket [...] AlexNet Unity [...] Python [...] Experience Replay LSTM Fixed Target Q-Network 3 [...] Unity [...] 3.1 [...]
3.1:

3.2 [...] 2.3.3 [...] gym-torcs TORCS [...] Unity TORCS [...] 3 [...] 1 2 [...] 3 [...] 2 1 3 [...]

4 [...] 3 [...] 5 [...]
4.1 [...]
4.1.1 Python
Experience Replay, Fixed Target Q-Network, Long Short Term Memory [...] Python Unity [...] Life in Silico [...]
4.1.2 Unity
Unity [...] Unity Unity [...] C# [...] Unity [...] Blender 3D [...] Python Unity Python [...] Life in Silico [...]

4.2 [...]
4.2.1 Asynchronous method
Asynchronous method [...] TORCS [...] TORCS [...] Asynchronous method [...] Asynchronous method [...] Asynchronous method [...] 5.1 [...] OpenAI gym CartPole [...] CartPole [...] Asynchronous method [...] Asynchronous method [...] 2 [...] 2 [...] Asynchronous method [...] CartPole [...]
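The update cycle of the asynchronous method (Section 5.1 lists three steps involving θ and dθ) can be illustrated with a toy sketch. It assumes, following reference [6], that several worker threads share one parameter vector θ, each worker copies it into a local θ', computes a gradient dθ from that copy, and applies dθ back to the shared θ. The quadratic loss, `TARGET`, `LEARNING_RATE`, and all function names are hypothetical, and plain gradient descent stands in for the RMSprop variants mentioned in the report.

```python
import threading

TARGET = 3.0          # hypothetical optimum of the toy loss
LEARNING_RATE = 0.01  # hypothetical step size (plain SGD, not RMSprop)

shared_theta = [0.0]           # shared parameters (theta)
theta_lock = threading.Lock()  # protects reads and writes of shared_theta


def gradient(theta_local):
    """Gradient of the toy loss 0.5 * (theta - TARGET) ** 2."""
    return [t - TARGET for t in theta_local]


def worker(num_steps):
    for _ in range(num_steps):
        # (1) copy the shared parameters theta into the thread-local theta'
        with theta_lock:
            theta_local = list(shared_theta)
        # (2) compute the gradient d_theta from the local copy
        d_theta = gradient(theta_local)
        # (3) apply d_theta to the shared parameters theta
        with theta_lock:
            for i, g in enumerate(d_theta):
                shared_theta[i] -= LEARNING_RATE * g


def train(num_workers=4, num_steps=2000):
    """Run the workers concurrently and return the final shared parameter."""
    threads = [threading.Thread(target=worker, args=(num_steps,))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_theta[0]
```

Because other workers may update θ between steps (1) and (3), a worker can apply a slightly stale gradient; with a small step size the shared parameter still settles near the optimum.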

4.2.2 Deep Q-Network
Deep Q-Network [...] TORCS [...] Deep Q-Network [...] 1 ATARI [...] Deep Q-Network [...] OpenAI gym CartPole [...] TORCS [...] CartPole [...] CartPole [...] TORCS [...] TORCS [...] Deep Q-Network [...] TORCS [...]
4.2.3 Long Short Term Memory
Long Short Term Memory [...] CartPole [...] TORCS [...] LSTM [...] CartPole [...] (Cart) (Pole) [...] CartPole [...] TORCS [...] TORCS [...] TORCS [...] TORCS [...]

4.3 [...]
4.3.1 ([...], Python, Asynchronous method [...])
(1) [...]
(2) Python Unity MessagePack, WebSocket [...]
(3) [...] Python [...]
(4) [...] OpenAI gym CartPole [...] 2 [...] 3 [...]
(5) Asynchronous method [...]
(6) [...]

4.3.2 Python, DQN
(1) [...] Python [...] Unity [...] Python Unity [...] Life in Silico (LIS) [...]
(2) LIS [...]
(3) Python 1 1 [...] Fixed Target Q-Network [...]
(4) [...] CartPole Deep Q-Network Experience Replay Fixed Target Q-Network [...]
(5) DQN [...]
(6) CartPole [...]
4.3.3 (Python, LSTM [...])
(1) [...]
(2) Chainer [...]
(3) Experience Replay [...]
(4) gym-torcs Ubuntu [...]
(5) RNN, LSTM [...]
(6) RNN, LSTM [...] CartPole [...] TORCS [...]
(7) [...]

4.3.4 Unity, DQN
(1) Unity [...] Unity [5] [...]
(2) [...]
(3) [...] Slack [...]
(4) Blender [...]
(5) [...]
(6) [...]
(7) gym-torcs Ubuntu [...]
(8) [...] DQN [...]
(9) [...]
(10) [...] DQN [...]
4.3.5 Unity, LSTM
(1) [...]
(2) [...] 4 [...]
(3) RNN, LSTM [...]
(4) [...]

4.3.6 Unity, Asynchronous method
(1) Unity [...]
(2) [...] WebSocket Python [...] 6 [...]
(3) [...]
(4) Python Unity Unity WebSocket [...]
(5) [...]
(6) [...]

5 [...]
5.1 Asynchronous method
[...] θ θ [...] (1) θ θ [...] (2) θ θ dθ [...] (3) dθ θ [...] RMSprop 2 [...]
5.2 Deep Q-Network
Deep Q-Network Experience Replay, Fixed Target Q-Network, Clipping 3 [...]
Experience Replay [...]
Fixed Target Q-Network [...] target [...] TD target θ [...] θ [...] target θ [...] θ [...]
Clipping [...] 1 [...] -1 [...]
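The three Deep Q-Network techniques named in Section 5.2 (Experience Replay, Fixed Target Q-Network, and Clipping) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: rewards are assumed to be clipped into the range [-1, 1], the target network is represented by a hypothetical callable `target_q` that returns one Q-value per action, and parameters are held in plain dictionaries. All class and function names are hypothetical.

```python
import random
from collections import deque


def clip_reward(reward):
    """Clipping: limit a raw reward to the range [-1, 1] (assumed range)."""
    return max(-1.0, min(1.0, float(reward)))


class ReplayBuffer:
    """Experience Replay: store past transitions and sample them at random."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, clip_reward(reward), next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_state, done, target_q, gamma=0.99):
    """TD target r + gamma * max_a Q(s', a), evaluated with the frozen target network."""
    if done:
        return reward
    return reward + gamma * max(target_q(next_state))


def sync_target(online_params, target_params):
    """Fixed Target Q-Network: copy the online parameters into the target, only every N steps."""
    target_params.clear()
    target_params.update(online_params)
```

Keeping `target_q` frozen between calls to `sync_target` means the TD target does not shift with every gradient step on the online network.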

5.3 Long Short Term Memory
Long Short Term Memory [...] 3 [...] (1) [...] (2) [...] (3) [...] LSTM [...]
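Section 5.3 lists three components of the LSTM whose names did not survive extraction. Assuming they are the input, forget, and output gates of the standard LSTM cell [13], one time step of a scalar cell can be sketched as follows. The weight layout and the names `lstm_step` and `weights` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a scalar LSTM cell.

    weights maps each of "input", "forget", "output", "candidate"
    to a tuple (w_x, w_h, b).
    """
    def pre_activation(name):
        w_x, w_h, b = weights[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))       # input gate: how much new content to write
    f = sigmoid(pre_activation("forget"))      # forget gate: how much old cell state to keep
    o = sigmoid(pre_activation("output"))      # output gate: how much cell state to expose
    g = math.tanh(pre_activation("candidate")) # candidate content

    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c
```

With all weights set to zero, every gate evaluates to 0.5 and the candidate to 0, so the cell state is simply halved at each step; this makes the role of the forget gate easy to see.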

6 B [...] 2.2 [...] 6.1 [...]
6.1: [...] Unity 3D Blender 3D GitHub [...] Python C# Unity Chainer

7 [...] Unity [...] Unity C# [...] C# II [...] II [...] Java [...] C# [...] Python [...] I II [...]

8
8.1 3 TORCS [...] Asynchronous method [...] 1 1 37.36 1 37.27 [...]
8.2 Unity, Python [...]
8.2.1 Unity
Unity [...] Unity [...] Blender Unity (400m) [...] 8.1 Unity [...] 8.2 Blender [...] Python [...] Unity [...] Unity Python [...] Python [...]

8.1: Unity
8.2:
8.2.2 Python
Python [...] Unity [...] LSTM [...]

8.3 [...] 3 [...]
8.3.1 Asynchronous method
Asynchronous one step Q-Learning [...] [6]. TORCS [...] Asynchronous method [...] TORCS [...] CartPole Asynchronous method [...] OU process [...]
8.3.2 Deep Q-Network
TORCS [...] GPU CPU [...] Excel [...] TORCS Experience Replay [...] TORCS Fixed Target Q-Network [...]
8.3.3 Long Short Term Memory
TORCS [...] LSTM [...] TORCS LSTM [...]
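Section 8.3.1 mentions an OU process. How the group used it is not recoverable from the text; a common use in driving agents is as temporally correlated exploration noise, and the sketch below assumes that use. It discretizes the Ornstein-Uhlenbeck process as x ← x + θ(μ − x)Δt + σ√Δt·N(0, 1). The class name `OUNoise` and the default values of `mu`, `theta`, `sigma`, and `dt` are hypothetical.

```python
import math
import random


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process (mean-reverting random walk)."""

    def __init__(self, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, x0=None, seed=None):
        self.mu = mu          # long-run mean the process is pulled toward
        self.theta = theta    # strength of the pull toward mu
        self.sigma = sigma    # scale of the random perturbation
        self.dt = dt          # time step
        self.x = mu if x0 is None else x0
        self.rng = random.Random(seed)

    def sample(self):
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x
```

Unlike independent Gaussian noise, consecutive samples are correlated, so a perturbed control input such as steering would change smoothly rather than jitter from step to step.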

8.4 [...]
8.4.1 69 [...] 8.1 [...]
8.1: [...]
Score       [...]    [...]
1           0        0
2           0        0
3           2        1
4           2        2
5           5        3
6           12       7
7           16       16
8           15       20
9           7        9
10          3        5
Unanswered  7        6
Mean        7.032    7.476
[...]
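The two means in Table 8.1 (7.032 and 7.476) can be reproduced from the per-score counts as weighted averages over the respondents who gave a score. The sketch below shows the arithmetic; the function and variable names are hypothetical, and the column labels are not recoverable from the text, so the columns are simply called first and second.

```python
def mean_score(counts):
    """Weighted mean of scores 1..len(counts); counts[i] is the number of answers with score i + 1."""
    answered = sum(counts)
    total = sum(score * n for score, n in enumerate(counts, start=1))
    return total / answered


# Counts for scores 1 to 10, read from the two columns of Table 8.1.
first_column = [0, 0, 2, 2, 5, 12, 16, 15, 7, 3]    # 62 answers, 7 unanswered
second_column = [0, 0, 1, 2, 3, 7, 16, 20, 9, 5]    # 63 answers, 6 unanswered

print(round(mean_score(first_column), 3))   # 7.032
print(round(mean_score(second_column), 3))  # 7.476
```

In both columns the answered and unanswered counts add up to the 69 responses stated in Section 8.4.1.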

8.4.2 76 [...] 8.2 [...]
8.2: [...]
Score                      [...]    [...]
1                          0        0
2                          0        0
3                          0        0
4                          1        0
5                          1        5
6                          12       7
7                          19       11
8                          26       25
9                          12       19
10                         5        9
Unanswered                 0        0
Mean                       7.631    7.960
Difference from Table 8.1  +0.599   +0.484
[...]

[...]
8.5 B [...]
8.5.1 [...] 8.3 [...] 1 5 5 [...]
8.3: 4 [...] 3 [...] 3 [...] 4 [...] 3 [...]

8.5.2 [...] 8.4 [...] 1 5 5 [...]
8.4: 4 [...] 4 [...] 5 3 [...] 5 [...] 5 AI TORCS [...] 5 [...] 5 [...] 5 [...] 4 [...] 4 [...]

8.6 [...]
8.6.1 8.5 [...]
8.5: Python Unity [...] Unity [...] Unity [...] Unity [...]

8.6.2 8.6 [...]
8.6: [...] Deep Q-Network [...] TORCS [...] TORCS [...] AI [...] CartPole DQN TORCS [...]

9
9.1 [...]
9.1.1 Python, Unity [...]
9.1.2 [...]
(Python [...]) Python-Unity [...] Python LSTM DQN Clipping [...] LSTM [...]
(Python [...]) LIS [...] Python DQN Fixed Target Q-Network [...] LIS [...] Fixed Target Q-Network [...]
(Python [...]) Python DQN Experience Replay [...]
(Unity [...]) [...] 4 [...]

4 [...]
(Unity [...]) Unity [...] Python [...]
(Unity [...]) [...] Unity [...]
9.2 [...]
9.2.1 3 [...] TORCS [...] Asynchronous method [...] TORCS [...] 2.1 [...]

9.2.2 [...]
(Asynchronous method [...]) PC GPU [...] Asynchronous method [...]
(Deep Q-Network [...]) CartPole Deep Q-Network Experience Replay, Fixed Target Q-Network Clipping [...] TORCS Experience Replay Fixed Target Q-Network [...] CartPole [...] TORCS [...]
(Long Short Term Memory [...]) RNN, LSTM [...] TORCS [...]
(Long Short Term Memory [...]) [...] LSTM [...] LSTM [...]
(Asynchronous method [...]) [...]
(Deep Q-Network [...]) Ubuntu TORCS [...]

9.2.3 [...] 3 [...] TORCS [...] TORCS [...] TORCS [...]

[1] All results, http://imagenet.org/challenges/lsvrc/2012/results.html (accessed 2016/07/15).
[2] Le Q, Ranzato M, Monga R, Devin M, Chen K, Corrado G, Dean J, and Ng A, Building high-level features using large scale unsupervised learning, In ICML, 2012.
[3] nikkei BP net, March 31, 2016, AI [...], http://www.nikkeibp.co.jp/atcl/matome/15/325410/032800202/ (accessed 2016/07/15).
[4] [...], 2015.
[5] [...], Unity5 3D/2D [...], 2015.
[6] Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, Asynchronous Methods for Deep Reinforcement Learning, In ICML, 2016.
[7] Simon O Haykin, Neural Networks and Learning Machines, Pearson, 2008.
[8] Yann LeCun, Leon Bottou, Genevieve B Orr, Klaus-Robert Müller, Efficient BackProp, Springer Berlin Heidelberg, 2002.
[9] Richard S Sutton, Andrew G Barto, [...], 2000.
[10] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller, Playing Atari with Deep Reinforcement Learning, NIPS Deep Learning Workshop 2013, 2013.
[11] Daniele Loiacono, Luigi Cardamone, Pier Luca Lanzi, Simulated Car Racing Championship: Competition Software Manual, 2013.
[12] Ćirović Velimir, Braking torque control using recurrent neural networks, In Proceedings of the Institution of Mechanical Engineers Part D Journal of Automobile Engineering 226(6), May 2012.
[13] Sepp Hochreiter, Jürgen Schmidhuber, Long short-term memory, In Neural Computation, 1997.