



(EC2013), October 2013

Evaluating Human-like Video-Game Agents Autonomously Acquired with Biological Constraints

Fujii Nobuto 1,2,a), Sato Yuichi 1, Wakama Hironori 1, Kazai Koji 1,b), Katayose Haruhiro 1,c)

1 Graduate School of Science and Technology, Kwansei Gakuin University
2 DC2 Research Fellow of Japan Society for the Promotion of Science
a) nobuto@kwansei.ac.jp
b) kazai@kwansei.ac.jp
c) katayose@kwansei.ac.jp

Abstract: Various systems have been proposed for automatically acquiring behavioral patterns, and some have obtained patterns stronger than those of human players, but the resulting behavior has looked mechanical. We propose the autonomous acquisition of human-like behaviors for NPCs, which emulate the behaviors of human players. In our previous study, the behaviors were acquired using reinforcement learning and pathfinding techniques under biological constraints. In this paper, we evaluate the human-like behavioral patterns through subjective assessments and discuss the possibility of implementing the proposed system.

1. =COM COM COM COM COM [1], [2], [3] COM COM COM

[4], [5] COM [4] COM [5] COM COM [6] Infinite Mario Bros. COM 2 3 4 Infinite Mario Bros. 5

2.

2.1 COM [7] [1], [2] Bonanza [7] Bonanza 6 Bonanza [3], [8] Robin 2009 Mario AI Competition A* COM [1] Mario AI Competition Infinite Mario Bros. COM [9] Robin COM A* Hearts Q COM [2] 4 3 Hearts COM AI COM COM

2.2 COM COM Jacob 2012 The 2K BotPrize COM [4] The 2K BotPrize FPS COM COM COM

[5]

3.

3.1 COM COM COM Cabrera [10] [11] Maslow 5 [12] 1) 2) 3) 4) 5) 5) COM COM

3.2 Cabrera [10] [11] Maslow [12] ( 1 ) COM ( 2 ) COM ( 3 ) COM ( 4 )

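4.

4.1 Q-learning [13]

Q Q

$$\operatorname*{argmax}_{a_t} Q(s_t, a_t) \tag{1}$$

In Eq. (1), $t$ is the time step, $s_t$ is the state at time $t$, and $a_t$ is the action the COM takes at time $t$. $Q(s_t, a_t)$ is the Q value of action $a_t$ in state $s_t$. Q Q s t Q COM Q

$$Q(s_t, a_t) = (1 - \alpha)\, Q(s_t, a_t) + \alpha \left( r + \gamma \max_{p} Q(s_{t+1}, p) \right) \tag{2}$$

In Eq. (2), $\alpha$ is the learning rate of the Q value, $r$ is the reward, and $\gamma$ is the discount rate, which lies between 0 and 1. $r$ is the reward the COM receives for taking $a_t$ in $s_t$. ϵ greedy ϵ greedy 1 ϵ Q ϵ t s t a t r COM [2], [14] 2 Q(s t, a t ) s t 2 Q r r 4.4 ϵ s t s t

4.2 A*

2.1 A* A*

$$f(n) = g(n) + h(n) \tag{3}$$

In Eq. (3), $f(n)$ is the estimated cost of the path through node $n$, $g(n)$ is the cost from the start to $n$, and $h(n)$ is the estimated cost from $n$ to the goal. A*

4.3 Infinite Mario Bros.

COM 1) 2) 3) Infinite Mario Bros. Infinite Mario Bros. 1 Infinite Mario Bros.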
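The update rule of Eq. (2) and the ϵ-greedy selection mentioned in Section 4.1 can be sketched as follows. This is a minimal tabular illustration, not the authors' implementation: the dictionary-based Q table, the state labels, the action names, and the parameter values are all assumptions made for the example.

```python
import random
from collections import defaultdict


def select_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else take argmax_a Q(s, a) (Eq. 1)."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha, gamma):
    """Apply Eq. (2): Q <- (1 - alpha) * Q + alpha * (r + gamma * max_p Q(s', p))."""
    best_next = max(q[(next_state, p)] for p in actions)
    q[(state, action)] = (1 - alpha) * q[(state, action)] + alpha * (reward + gamma * best_next)
    return q[(state, action)]


# Hypothetical states and actions, used only to exercise the functions.
q = defaultdict(float)                      # unseen (state, action) pairs start at 0.0
actions = ["right", "right_jump", "none"]
q_update(q, "s0", "right", 2.0, "s1", actions, alpha=0.5, gamma=0.9)
```

With ϵ = 0.0 the selection is purely greedy, and larger values of ϵ trade exploitation for exploration.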

Fig. 1 Infinite Mario Bros.

COM COM 1 COM LEFT, RIGHT, DOWN, SPEED, JUMP 24 COM Mario AI Competition [9] COM COM COM 22 22 COM

4.4 Infinite Mario Bros. Q

Infinite Mario Bros. COM s COM s 7 7 COM 22 22 1

Table 1 (LEFT, RIGHT, DOWN, JUMP, SPEED)

(OFF, ON, OFF, OFF, OFF)
(OFF, ON, OFF, OFF, ON)
(OFF, ON, OFF, ON, OFF)
(OFF, ON, OFF, ON, ON)
(ON, OFF, OFF, OFF, OFF)
(ON, OFF, OFF, OFF, ON)
(ON, OFF, OFF, ON, OFF)
(ON, OFF, OFF, ON, ON)
(OFF, OFF, OFF, ON, OFF)
(OFF, OFF, ON, OFF, OFF)
(OFF, OFF, OFF, OFF, OFF)

7 7 s COM 8 9 COM COM Q a 11 a 1 Q r r

$$r = \mathit{distance} + \mathit{damaged} + \mathit{death} + \mathit{keypress} \tag{4}$$
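The reward of Eq. (4) can be sketched as a sum of four terms, using the values the paper lists for them (distance 2.0, damaged -50.0, death -100.0, keypress -5.0). How each term is triggered is not recoverable from the text, so the sketch assumes that each term takes its listed value when the corresponding event occurs in a step and 0 otherwise; the function and argument names are hypothetical.

```python
# Values listed in the paper for the terms of Eq. (4).
TERM_VALUES = {
    "distance": 2.0,
    "damaged": -50.0,
    "death": -100.0,
    "keypress": -5.0,
}


def reward(advanced, damaged, died, key_pressed):
    """Eq. (4): r = distance + damaged + death + keypress.

    Assumption: each argument is a flag for one step, and a term
    contributes its listed value only when its flag is set.
    """
    flags = {
        "distance": advanced,
        "damaged": damaged,
        "death": died,
        "keypress": key_pressed,
    }
    return sum(TERM_VALUES[name] for name, on in flags.items() if on)


# A step in which Mario advances while a key is pressed.
step_reward = reward(advanced=True, damaged=False, died=False, key_pressed=True)
```

Under this assumption the negative keypress term penalizes every step in which a key is operated, so the learner is only rewarded for key operations that pay off in progress.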

2 Q COMQ A* COMA*

Table 2 Values of the terms in Eq. (4)

distance     2.0
damaged    -50.0
death     -100.0
keypress    -5.0

A* 2.1 A* COM [1] g (n) h (n) [1]

5.

5.1 COM 2 2 5 2

5.2 20 24 20 13 7 20 µ = 34 σ = 29 5 µ σ 4 3 0 63 µ + σ 2 5 63 14 Infinite Mario Bros. 10 1 25 2 7 1 2 Q 3 A* 2 3 Q 2 Q ϵ 0.0 ϵ 0.2 5
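The evaluation function of Eq. (3), which the A* COM uses together with the g(n) and h(n) of [1], can be illustrated with a generic A* search. The sketch below runs on a small hypothetical grid with a Manhattan-distance heuristic; it does not reproduce the state space or the cost functions of the Mario agent in [1].

```python
import heapq


def a_star(start, goal, neighbors, h):
    """Expand nodes in increasing order of f(n) = g(n) + h(n) (Eq. 3)."""
    open_heap = [(h(start), 0.0, start)]    # entries are (f, g, node)
    best_g = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:         # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        if g > best_g[node]:
            continue                        # stale heap entry, a cheaper route was found
        for nxt, cost in neighbors(node):
            g_next = g + cost
            if g_next < best_g.get(nxt, float("inf")):
                best_g[nxt] = g_next
                parent[nxt] = node
                heapq.heappush(open_heap, (g_next + h(nxt), g_next, nxt))
    return None


# Hypothetical 3x3 grid with two wall cells, used only for illustration.
WALLS = {(1, 0), (1, 1)}


def grid_neighbors(node):
    x, y = node
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < 3 and 0 <= ny < 3 and (nx, ny) not in WALLS:
            yield (nx, ny), 1.0


def manhattan(node, goal=(2, 0)):
    return abs(node[0] - goal[0]) + abs(node[1] - goal[1])


path = a_star((0, 0), (2, 0), grid_neighbors, manhattan)
```

Because the Manhattan distance never overestimates the remaining cost on this grid, the returned path is a shortest one.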

2

[, ] (COM)        10.62   5448
[, ] (COM)        14.25   4069
[,, ] (COM) ()    15.57   3458
[, ] (COM)         7.29   7926
[, ] (COM)         9.34   3118
[ ] ( )           10.08   6031
[ ] ( )           14.25   3644
[ ] ( )            7.68   7371

50 200

5.3 2 7 COM COM Q A* 3 Q A* Q [, ]

[Fig. 3: scale running from "not human-like" to "human-like"; plotted items: no search, search introduced, no reinforcement, reinforcement introduced, reinforcement introduced (challenge only), expert, intermediate, beginner]

0.66 [, ] 0.29 0.66 − 0.29 = 0.37 95% 0.48 5% :0.37 < 95% :0.48 A* [, ] [, ] 1% :1.35 > 99%:0.72 COM Q [, ] [ ][ ][ ] A* [, ] [ ][ ] [, ] [ ] :1.12 > 99%:0.58 [, ] [ ] :1.33 > 99%:0.58 [, ] [ ] :0.71 > 95%:0.59

6.

COM 3 [, ] [, ] [, ] 1% [ ] [, ] [ ] 5% 2.1 COM [, ] [, ] [, ] [ ] [, ] 2

Q 4.4 s [,, ]

7.

COM COM COM COM 1) 2) 3) 3 COM

References

[1] Togelius, J., Karakovskiy, S. and Baumgarten, R.: The 2009 Mario AI Competition, Evolutionary Computation (CEC) 2010 IEEE, pp. 1–8 (2010).
[2] Fujita, H. and Ishii, S.: Model-based reinforcement learning for partially observable games with sampling-based state estimation, Neural Computation, Vol. 19, pp. 3051–3087 (2007).
[3] Hoki, K. and Kaneko, T.: The Global Landscape of Objective Functions for the Optimization of Shogi Piece Values with a Game-Tree Search, Advances in Computer Games 2012, Lecture Notes in Computer Science, Vol. 7168, pp. 184–195 (2012).
[4] Schrum, J., Karpov, I. V. and Miikkulainen, R.: Human-like Behavior via Neuroevolution of Combat Behavior and Replay of Human Traces, 2011 IEEE Conference CIG'11, pp. 329–336 (2011).
[5] Viennot, S. AI GPW2012, pp. 47–54 (2012).
[6] COM Vol. 2013-EC-27, No. 16, pp. 1–6 (2013).
[7] GPW2006, pp. 78–83 (2006).
[8] Sugiyama, T., Obata, T., Hoki, K. and Ito, T.: Optimistic Selection Rule Better Than Majority Voting System, Computers and Games, Lecture Notes in Computer Science, Vol. 6515, pp. 166–175 (2011).
[9] Togelius, J., Karakovskiy, S., Koutnik, J. and Schmidhuber, J.: Super Mario Evolution, 2009 IEEE Conference CIG'09, pp. 156–161 (2009).
[10] Cabrera, J. L. and Milton, J. G.: On-Off Intermittency in a Human Balancing Task, Physical Review Letters, Vol. 89, No. 15 (2002).
[11] pp. 19–22 (2004).
[12] Maslow, A. H.: A Theory of Human Motivation, Psychological Review, Vol. 50, pp. 370–396 (1943).
[13] Watkins, C.: Learning from Delayed Rewards, PhD thesis, Cambridge University, Cambridge, England (1989).
[14] Patel, P. G., Carver, N. and Rahimi, S.: Tuning Computer Gaming Agents using Q-Learning, pp. 581–588 (2011).