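JAIST Repository. Title: 少数の記録からプレイヤの価値観を機械学習するチームプレイ AI の構成 (Design of a Teammate AI by Learning Human-player Utility from a few Records of Actions). Author(s): Wada, Takayuki; Sato, Naoyuki; Ikeda, Kokolo. Citation: 研究報告ゲーム情報学 (GI) [IPSJ SIG Technical Report, Game Informatics], 2015-GI-33(5): 1-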


Transcription:

JAIST Repository
https://dspace.j

Title: 少数の記録からプレイヤの価値観を機械学習するチームプレイ AI の構成 (Design of a Teammate AI by Learning Human-player Utility from a few Records of Actions)
Author(s): Wada, Takayuki; Sato, Naoyuki; Ikeda, Kokolo
Citation: 研究報告ゲーム情報学 (GI) [IPSJ SIG Technical Report, Game Informatics], 2015-GI-33(5): 1-8
Issue Date: 2015-02-26
Type: Journal Article
Text version: publisher
URL: http://hdl.handle.net/10119/13464
Rights: Information Processing Society of Japan (IPSJ); Wada Takayuki, Sato Naoyuki, Ikeda Kokolo, 研究報告ゲーム情報学 (GI), 2015-GI-33(5), 2015, 1-8.

Notice for the use of this material: The copyright of this material is retained by the Information Processing Society of Japan (IPSJ). This material is published on this web site with the agreement of the author(s) and the IPSJ. Please be complied with Copyright Law of Japan and the Code of Ethics of the IPSJ if any users wish to reproduce, make derivative work, distribute or make available to the public any part or whole thereof. All Rights Reserved, Copyright (C) Information Processing Society of Japan.

Description:

Japan Advanced Institute of Science and Technology

Design of a Teammate AI by Learning Human-player Utility from a few Records of Actions

Wada Takayuki 1,a)  Sato Naoyuki 1,b)  Ikeda Kokolo 1,c)

Abstract: Some genres of commercial video games, especially RPGs, let the player fight alongside AI-controlled teammates. These teammates often take actions that the human player does not expect, and such mismatches between the player's expectations and the AI's actions are a common source of dissatisfaction. One reason is that these games have several kinds of sub-goals, and the AI teammates act without understanding which sub-goals matter to each human player. This study proposes a method for building teammate AI players that estimate the human player's sub-goal preferences and act in ways that cause less dissatisfaction. In an evaluation experiment, we prepared artificial players with various sub-goal preferences and estimated those preferences with the proposed method. In one setting, the actions selected from the estimated preferences matched the actions selected by the original artificial players 67.1% of the time. The upper bound in that setting is about 70.6%, which is the match rate obtained when the sub-goal preferences are identical. The proposed method is therefore only 3.5 percentage points below an ideal estimation, even in the worst case.

Keywords: Game AI, RPG, Utility, Machine Learning, Team-mate, Cooperation game

1 JAIST, Asahidai 1-1, Nomi, Ishikawa, Japan
a) s1310082@jaist.ac.jp
b) satonao@jaist.ac.jp
c) kokolo@jaist.ac.jp

1.
AI

AI AI AI Sander, B. [1] QuakeIII AI AI AI RPG AI AI AI AI AI

2.
AI AI [2] AI Infinite Mario Bros. Matteo [3] AI Believability AI Sander AI QuakeIII AI AI [1] RPG QuakeIII RPG 1 Sander AI [9] [7] [4][8] [10], [11] [5] (w 1 + w 2 ) w 1 0 w 2 0 RPG AI [6] 100

1

3.
AI AI AI (a) AI (b) AI (c) (d) [6] (a) AI 1
( 1 ) RPG RPG
( 2 ) AI
( 3 )
( 4 )
( 5 )
( 6 ) (5)
( 7 )

4.
1 1

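4.1
$S$, $A$; $j$: $s_j \in S$, $A_{s_j} \subseteq A$, $a_j \in A_{s_j}$. The records are $\{(s_j, a_j)\}_j$.

4.2
$s$, $a \in A_s$, $\pi$ (π : S R A). The $i$-th resulting state is $s_i(s, a, \pi) \in S$, with feature vector $x_i(s, a, \pi) \in \mathbb{R}^n$. The average over $m$ of them is

$$x(s, a, \pi) = \frac{1}{m} \sum_{i=1}^{m} x_i(s, a, \pi) \tag{1}$$

π [7] $x(s, a, \pi)$ $a_2$ $a_1$ $\pi$ $x(s, a_1, \pi)$ $x(s, a_2, \pi)$

4.3
Table 1
- $s \in S$
- $A_s \subseteq A$
- $a \in A_s$
- π : S R A, $\pi \in \Pi$
- $s_i(s, a, \pi)$
- $x_i(s, a, \pi) \in \mathbb{R}^n$
- $x(s, a, \pi) \in \mathbb{R}^n$
- $w$
- $u(x, w) \in \mathbb{R}$

The utility $u$ is linear in the features, with weight vector $w$:

$$u(x(s, a, \pi), w) = x(s, a, \pi) \cdot w \tag{2}$$

A weight vector $w$ is consistent with the action $a^*$ chosen in state $s$ when

$$\max_{\pi \in \Pi} u(x(s, a^*, \pi), w) \ge \max_{\pi \in \Pi,\, a \in A_s} u(x(s, a, \pi), w) \tag{3}$$

Given a candidate set $W$, Algorithm 1 returns the $w \in W$ that violates (3) least often.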
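The quantities of Eqs. (1) to (3), together with the violation count that Algorithm 1 minimizes and the arg max selection of Section 4.4, can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: all function and variable names are hypothetical, the averaged feature vectors are assumed to be precomputed for every action and policy, larger feature values are assumed to be better, and the feature numbers are made up. The two candidate weight vectors are the example vectors that appear in Section 7.2.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]
# One record: (chosen action a*, {action a: [x(s, a, pi) for each policy pi]}).
Record = Tuple[str, Dict[str, List[Vector]]]


def mean_features(samples: Sequence[Vector]) -> Vector:
    """Eq. (1): component-wise average of m sampled feature vectors."""
    m = len(samples)
    return tuple(sum(column) / m for column in zip(*samples))


def utility(x: Vector, w: Vector) -> float:
    """Eq. (2): linear utility, the dot product of features and weights."""
    return sum(xi * wi for xi, wi in zip(x, w))


def best_utility(features_per_policy: Sequence[Vector], w: Vector) -> float:
    """max over policies pi of u(x(s, a, pi), w) for one fixed action a."""
    return max(utility(x, w) for x in features_per_policy)


def estimate_weights(records: Sequence[Record],
                     candidates: Sequence[Vector]) -> Vector:
    """Return the candidate w with the fewest violations of Eq. (3)."""
    penalty = {w: 0 for w in candidates}
    for chosen, table in records:
        for w in candidates:
            u_star = best_utility(table[chosen], w)
            for action, features in table.items():
                # A violation: some other action looks strictly better under w.
                if action != chosen and u_star < best_utility(features, w):
                    penalty[w] += 1
    return min(candidates, key=lambda w: penalty[w])


def select_action(table: Dict[str, List[Vector]], w: Vector) -> str:
    """Section 4.4: pick the action whose best policy maximizes u."""
    return max(table, key=lambda a: best_utility(table[a], w))


# Hypothetical data: features are (x_HP, x_MP, x_Turn), values are made up.
table = {
    "attack": [(0.5, 1.0, 0.5), (0.4, 1.0, 0.6)],  # keeps MP, loses more HP
    "spell": [(0.8, 0.6, 0.5)],                    # spends MP, keeps HP
}
records = [("attack", table)]                      # the player chose "attack"
candidates = [(1.0, 0.1, 0.1), (1.0, 10.0, 0.1)]   # example vectors from 7.2

estimated = estimate_weights(records, candidates)
print(estimated)                        # (1.0, 10.0, 0.1)
print(select_action(table, estimated))  # attack
```

In this toy record the player keeps MP by attacking, so only the MP-heavy vector leaves Eq. (3) unviolated and is returned. Ties between candidates are broken by list order here, which is a choice of this sketch rather than something the source specifies.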

Algorithm 1
    for each $w \in W$ do
        $p_w = 0$
    end for
    for each $(s, a^*) \in \{(s_j, a_j)\}_j$ do
        for each $w \in W$ do
            $u^* = \max_{\pi \in \Pi} u(x(s, a^*, \pi), w)$
            for each $a \in A_s \setminus a^*$ do
                if $u^* < \max_{\pi \in \Pi} u(x(s, a, \pi), w)$ then
                    $p_w \mathrel{+}= 1$
                end if
            end for
        end for
    end for
    return $\arg\min_{w \in W} p_w$

4.4
With $x(s, a, \pi)$ and $u(x, w)$, the action $a$ and policy $\pi$ are chosen as

$$\arg\max_{a \in A_s,\, \pi \in \Pi} u(x(s, a, \pi), w)$$

5.
5.1 5 HP 0 0 MP 3
5.2-1 1 1 6 10
5.3 3 1 RPG

6.

Table 2
        HP    MP
1      134    30    60    28
2      108    60    44    34
1       52     0    40    26
2       82    32    38    32
3       70     0    50    30

6.1
5 2 2 MP MP HP

6.2
7
( 1 )
( 2 ) HP HP
( 3 ) 5
( 4 ) 5 MP
( 5 ) 5
( 6 ) (2) (4)
( 7 ) (2) (5)

6.3
1000 1 2 HP MP Turn AI 3

Table 3
HP         99%   99%   96%   96%
MP Turn    98%   83%   93%   69%

AI 8 10%

7. w

7.1
3 x 4

$$x = \{x_{HP}, x_{MP}, x_{Turn}\} \tag{4}$$

4 5 7 a, b,
x HP HP = HP (5)
x MP MP = MP (6)
x Turn = b a (7)

7.2
w x 3 $x_{HP}$, $x_{MP}$, $x_{Turn}$ 1 1 [1, 10, 0.1] MP [1, 0.1, 0.1] HP W $x_{MP}$, $x_{Turn}$ 2 31 31 32 1 32

7.3 w

4 [1, 4, 8]
5 [1, 1/8, 1/16]
2 AI
w = [1, 4, 8] MP Turn 4
w = [1, 1/8, 1/16] HP 5
[1, 4, 8] 10% [1, 1/8, 1/16] [1, 4, 8] 1 2 2 8 1 8 8 1 20 5 [1, 1/8, 1/16] [1, 1/16, 1/32] 6 1 1 3 5 8 5 [1, 12, 0.167] MP 4 15% MP

8. AI

8.1
4 4 5 20 RPG 6 7 5 4

8.2

Table 4
            HP    MP       Turn
HP           1    0.071    0.071
Turn         1    0.143    18
MP Turn      1    10       10
MP           1    12       0.167

Table 5
AI 4.1 MP Turn AI 3.1 MP AI 3.4 4.3 Turn Turn AI 4.0 MP AI 2.6 4.0 MP Turn Turn AI 3.0 MP AI 2.7

AI AI 7 2 8 1 3 MP Turn MP Turn 1 4 2 1 4 1 AI 2 1 5 AI 3 AI 2 [1, 0.3, 3] Turn AI [1, 4, 0.25] MP AI AI 7 5 Turn Turn AI (4.0) MP AI (2.6) (4.3) [1, 0.5, 16] [1, 0.3, 3]

9.

References
[1] Sander Bakkes, Pieter Spronck and Eric Postma: TEAM: The Team-Oriented Evolutionary Adaptability Mechanism, Entertainment Computing - ICEC 2004, pp. 273-282, 2004.
[2] 55(7), pp. 1655-1664, 2014.
[3] Matteo Bernacchia, Hoshino Jun'ichi: AI platform for supporting believable combat in role-playing games, 2014, pp. 139-144, 2014.
[4] 48 6 (2), pp. 123-124, 1994.
[5] pp. 1-28, 1997.
[6] AI 29, pp. 1-8, 2013.
[7] Remi Coulom: Computing Elo ratings of move patterns in the game of Go, International Computer Games Association Journal 30 (2007), pp. 198-208, 2007.
[8] 2011, pp. 46-53, 2011.
[9] 2001, pp. 17-24, 2001.
[10] 2006, pp. 78-83, 2006.
[11] AI, Vol. 2010-GI-24, No. 3, pp. 1-7, 2010.