IPSJ SIG Technical Report Vol.2015-GI-34 No /7/ Selections of Discarding Mahjong Piece Using Neural Network Matsui

Transcription:

Selections of Discarding Mahjong Piece Using Neural Network

Matsui Kazuaki, Matoba Ryuichi

Abstract: Mahjong is a game of imperfect information, and its complicated rules make it hard to build a mahjong AI. In this study, we use a three-layer neural network to compute an evaluation value for each tile that is a candidate for discarding. The inputs to the network are the current game state and the tiles the player holds; the outputs are evaluation values used to decide which tile to discard. The parameters of the evaluation function are adjusted by backpropagation. As training data, we used game records of players rated over 2000 on the Internet mahjong server Tonpuso. As a result, the discards selected by our network agree with the training data at a rate of 3.3%.

Keywords: Neural Network, Backpropagation, Reinforcement Learning

AI 950 AI [] mini-max [2] 2 AI AI[3] 2. [4] [5]

© 2015 Information Processing Society of Japan

Table 1 Representation of each Mahjong Piece.
Characters 1 to 9: 1m to 9m
Circles 1 to 9: 1p to 9p
Bamboo 1 to 9: 1s to 9s

Table 2 Base point.

AI [6] AI [7] 56% AI [8] 38 AI 2.2 AI 2. 2. m 5 5p 4 4 34 6 3-2 2 2 4-6 4 36 3 3 70 4 3 m,2m,3m,, 2 5p,5p, () 4 ( ) ( ) ( ) ( ) () 4 4 30,000 30,000 27,000 30,000 2.2 500 500 400 500 2 6 2 2 3-2 4-6 () R ( ) : R R 300 ()
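The tile notation of Table 1 can be illustrated with a small indexing sketch. It maps the 34 tile types to labels of the form 1m to 9m, 1p to 9p and 1s to 9s. The honor-tile labels, the index order, and the names `tile_label` and `tile_index` are assumptions made for illustration only; the encoding actually used in the paper cannot be recovered from this text.

```python
# Hypothetical tile indexing. The suit labels follow Table 1; the honor
# labels and the ordering of the 34 indices are assumptions.
SUITS = ("m", "p", "s")
HONORS = ("East", "South", "West", "North", "White", "Green", "Red")


def tile_label(index):
    """Return a label for a tile-type index in 0..33."""
    if not 0 <= index < 34:
        raise ValueError("tile index must be in 0..33")
    if index < 27:
        # 9 ranks per suit: indices 0..8 are 1m..9m, 9..17 are 1p..9p, ...
        return f"{index % 9 + 1}{SUITS[index // 9]}"
    return HONORS[index - 27]


def tile_index(label):
    """Inverse of tile_label."""
    if label in HONORS:
        return 27 + HONORS.index(label)
    rank, suit = int(label[0]), label[1]
    return SUITS.index(suit) * 9 + rank - 1


# All 34 tile types in index order.
ALL_TILES = [tile_label(i) for i in range(34)]
```

With four copies of each of the 34 types, a full set has 136 tiles, so an index of this kind is enough to address both the 34 network outputs and per-type counts in a hand.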

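Fig. 1 Three-layer neural network.

400 (2) () : + 400 00 (2)

2.3

The network has three layers: 820 BOOL-valued inputs and 34 outputs. The activation function is a sigmoid with gain $a$ (Fig. 2):

$$\mathrm{Sigmoid}_a(x) = \frac{1}{1 + e^{-ax}}$$

Fig. 2 Sigmoid function.

Table 3 Input Data.
4 36 544 36 820

Let $X_i$ be the $i$-th input, $h_m$ the $m$-th hidden unit, $w_{mi}$ the weight between $X_i$ and $h_m$, $V_o$ the $o$-th output, and $v_{om}$ the weight between $h_m$ and $V_o$. The hidden units $h_m$ are given by Eq. (3) and the outputs $V_o$ by Eq. (4), where $M$ is the number of hidden units.

M 280

$$h_m = \mathrm{Sigmoid}_a\Big(\sum_{i=1}^{820} X_i w_{mi}\Big) \tag{3}$$

$$V_o = \mathrm{Sigmoid}_a\Big(\sum_{m=1}^{M} h_m v_{om}\Big) \tag{4}$$

2.4

The error $E$ between the outputs $V_o$ and the teacher signals $T_o$ is defined by Eq. (5):

$$E = \sum_{o=1}^{34} \frac{1}{2}\,(V_o - T_o)^2 \tag{5}$$

The weights $v_{om}$ between $h_m$ and $V_o$ and the weights $w_{mi}$ between $X_i$ and $h_m$ are updated by Eqs. (6) and (7), where $\eta$ is the learning rate:

$$v_{om} = v_{om} - \eta\,\Delta v_{om} \tag{6}$$

$$w_{mi} = w_{mi} - \eta\,\Delta w_{mi} \tag{7}$$

The increments $\Delta v_{om}$ and $\Delta w_{mi}$ are given by Eqs. (8) and (9):

$$\Delta v_{om} = \frac{\partial E}{\partial v_{om}} = (V_o - T_o)\,V_o\,(1 - V_o)\,h_m \tag{8}$$

$$\Delta w_{mi} = \frac{\partial E}{\partial w_{mi}} = \Big\{\sum_{o=1}^{34} (V_o - T_o)\,V_o\,(1 - V_o)\,v_{om}\Big\}\,h_m\,(1 - h_m)\,X_i \tag{9}$$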
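The forward pass of Eqs. (3) and (4) and the update of Eqs. (6) to (9) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the class name `ThreeLayerNet`, the weight initialization range, and the layer sizes used when calling it are assumptions. The code multiplies each sigmoid derivative by the gain `a`, so it matches Eqs. (8) and (9) exactly when `a = 1`.

```python
import math
import random


def sigmoid(x, a=1.0):
    """Sigmoid with gain a: 1 / (1 + exp(-a * x))."""
    return 1.0 / (1.0 + math.exp(-a * x))


class ThreeLayerNet:
    """Hypothetical three-layer network following Eqs. (3) to (9)."""

    def __init__(self, n_in, n_hidden, n_out, a=1.0, seed=0):
        rng = random.Random(seed)
        self.a = a
        # w[m][i]: weight from input X_i to hidden unit h_m.
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        # v[o][m]: weight from hidden unit h_m to output V_o.
        self.v = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_out)]

    def forward(self, x):
        """Return (h, V) as in Eqs. (3) and (4)."""
        h = [sigmoid(sum(wi * xi for wi, xi in zip(w_m, x)), self.a)
             for w_m in self.w]
        out = [sigmoid(sum(vm * hm for vm, hm in zip(v_o, h)), self.a)
               for v_o in self.v]
        return h, out

    def train_step(self, x, t, eta):
        """One gradient step; returns the error E of Eq. (5) before the update."""
        h, out = self.forward(x)
        # Output-side factor (V_o - T_o) V_o (1 - V_o) of Eq. (8), times gain a.
        d_out = [self.a * (vo - to) * vo * (1.0 - vo) for vo, to in zip(out, t)]
        # Hidden-side factor of Eq. (9), computed with the weights v before
        # they are updated.
        d_hid = [self.a * sum(d * self.v[o][m] for o, d in enumerate(d_out))
                 * hm * (1.0 - hm) for m, hm in enumerate(h)]
        for o, d in enumerate(d_out):          # Eq. (6)
            for m, hm in enumerate(h):
                self.v[o][m] -= eta * d * hm
        for m, d in enumerate(d_hid):          # Eq. (7)
            for i, xi in enumerate(x):
                self.w[m][i] -= eta * d * xi
        return sum(0.5 * (vo - to) ** 2 for vo, to in zip(out, t))
```

In the setting of the paper the network would be built with 820 inputs and 34 outputs; the number of hidden units is not recoverable from this text, so any value passed for `n_hidden` is a guess.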

3.

3.1

Intel Core i7 3.20GHz 32.0GB [4] 2000 340,000 Ruby 2000 0 C++ 340,000 340,000 820 η = 0.0 24 AI AI (v0.92)

3.2

3 340,000 820 η = 0.0 3.3% 300 5.95% 5000 4.98% h_m V_o v_om X_i h_m w_mi 4 5 v_om w_mi

Fig. 3 Rate of concordance with learning frequency.

Fig. 4 Transition of v_om.

w_mi X_i X_i 0 6 0,000 30 820 340,000 820 η = 0.0 6

4.
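The rate of concordance plotted in Fig. 3 is the fraction of positions in which the network's chosen discard equals the discard recorded in the training data. A minimal sketch of that measurement follows. The record layout, the function names, and the rule that only tile types present in the hand are eligible are assumptions for illustration; the text does not say how the authors mapped the 34 evaluation values to a discard.

```python
def choose_discard(values, hand_counts):
    """Pick the tile type with the highest evaluation value among those held.

    values: 34 evaluation values (the network outputs V_o).
    hand_counts: 34 integers, how many copies of each tile type are held.
    Restricting the choice to held tiles is an assumption of this sketch.
    """
    candidates = [t for t in range(len(values)) if hand_counts[t] > 0]
    return max(candidates, key=lambda t: values[t])


def concordance_rate(records, evaluate):
    """Fraction of records where the chosen discard equals the expert's.

    records: list of (hand_counts, expert_discard) pairs (hypothetical layout).
    evaluate: function mapping hand_counts to 34 evaluation values.
    """
    hits = sum(1 for hand_counts, expert in records
               if choose_discard(evaluate(hand_counts), hand_counts) == expert)
    return hits / len(records)
```

Counting, over the same records, how often each tile type is chosen would give the per-tile selection counts of the kind shown in Fig. 6.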

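Fig. 5 Transition of the weights w_mi.

Fig. 6 Number of times each tile was selected.

[...] did not reach a level at which it could play against opponents.

One reason the concordance rate with the training data did not improve for the neural network built in this study is bias in the input data. With 340,000 training iterations, the number of times each tile type was chosen as the discard in the training data ranged from 5,707 for the least frequent tile, 4m, to 6,7 for the most frequent tile, West, a difference of more than a factor of three. As a result, the network learned to assign high evaluation values to the tile types that were discarded most often in the training data. To counter this, we also tried input data in which every tile type is discarded the same number of times, but the results were not good. One possible explanation is that the parameters of the neural network fell into a local optimum.

Another cause is that the input data carried too little information. Previous studies used information about melds (mentsu) and the pair (jantou) directly as input data, whereas in this study we expected the neural network to learn melds and the pair by itself from the hand information. With the present method, however, the network turned out to be unable to learn them.

As future work, we need to solve the problem that skewed parameters raise the evaluation values of particular tile types. In addition, to make the neural network learn melds and the pair, we need to add meld and pair information to the input data and to consider network structures that learn them more easily. This is expected to improve the discard concordance rate between the trained neural network and advanced mahjong players.

References
[1] 松原仁: 完全情報ゲームから不完全情報ゲームへ [From perfect-information games to imperfect-information games] (2012).
[2] 作田誠: 不完全情報ゲームの研究 [Research on imperfect-information games], 公益社団法人日本オペレーションズ・リサーチ学会 (2007).
[3] Martin Zinkevich, Michael Johanson, Michael Bowling, Carmelo Piccione: Regret Minimization in Games with Incomplete Information (2007).
[4] とつげき東北: システマティック麻雀研究所, http://totutohoku.b23.coreserver.jp/hp/
[5] とつげき東北: 科学する麻雀, 講談社現代新書 (2004).
[6] まったり麻雀, http://homepage2.nifty.com/kmo2/
[7] 北川竜平, 三輪誠, 近山隆: 麻雀の牌譜からの打ち手評価関数の学習 [Learning a move evaluation function from mahjong game records] (2004).
[8] インターネット雀荘 東風荘, http://mj.giganet.net/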