IPSJ SIG Technical Report Vol.2011-DBS-153 No /11/3: Extracting Difference Information from Multilingual Wikipedia


Extracting Difference Information from Multilingual Wikipedia

Yuya Fujiwara,1 Yu Suzuki2 and Akiyo Nadamoto3

Wikipedia has articles in many languages, and the information in articles on the same topic differs between languages. The difference is especially large for articles about culture. In this paper, we propose a system that extracts the information that differs between Japan and other countries on Wikipedia. Specifically, the system compares a Japanese Wikipedia article about a foreign topic with the English Wikipedia and extracts the information that is written only in the English version. The granularity of information differs between the two languages, so a single Japanese article does not necessarily correspond to a single English article. We therefore also propose a method, based on the link graph, for extracting the multiple English articles that cover the same topic as a Japanese article.

1. Wikipedia 1 Wikipedia 250 2 ( 1 ) Gallery ( 1 ) Wikipedia 8 4 3

1 Konan University
2 Nagoya University
3 Konan University

1 Wikipedia http://www.wikipedia.org/
2 Wikipedia: http://ja.wikipedia.org/wiki/wikipedia:
3 asahi.com http://www.asahi.com/national/update/0303/tky201003030157.html

© 2011 Information Processing Society of Japan

Fish and chips Wikipedia Wikipedia 1 Wikipedia Wikipedia 2 ( 1 ) ( 2 ) ( 3 )

Fig. 1 Example of multilingual Wikipedia pages

( 4 ) (2) ( 5 ) ( 6 ) (5) ( 7 ) ( 8 ) (4) (2) ( 9 ) (8) 2 3 Wikipedia 4 5 6 7

2.

Fig. 2 System Flow

1)2) Wikipedia Wikipedia Wikipedia Wikipedia 3) Wikipedia Wikipedia Wikipedia 4) Wikipedia 5) Wikipedia Wikipedia Wikipedia 6) pfibf pfibf Wikipedia

3.

3.1 Wikipedia 1 Cricket ( 1 ) ( 2 )

Fig. 3 Analysis of the link structure

( 5) ( 3 ) Cricket Batting ( 4 ) 5 (1) W kl ( 3) ( 3 ) ( 4 )

3.2

7) 3.1 Cricket Batting Batting Wikipedia ( 1 ) ( 4) ( 2 )

W_{kl} = af \cdot \cos(k, l) + \sum_{i=1}^{af} \frac{1}{d_i} \cdot \frac{n_i - o_i + 1}{n_i}    (1)

af d_i i n_i i o_i i cos(k, l) k l

In the Cricket example, Batting is linked from two sections of the Cricket article, "Bat and ball" and "Pitch, wickets and creases", so af = 2. For "Bat and ball", d_1 = 3, n_1 = 25, and o_1 = 11. For "Pitch, wickets and creases", d_2 = 3, n_2 = 25, and o_2 = 3. Equation (1) then gives Batting a value of 0.71, which exceeds 0.6.
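The weighting of Eq. (1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Eq. (1) reads W_kl = af * cos(k, l) + sum over i = 1..af of (1 / d_i) * (n_i - o_i + 1) / n_i, which is one reading of the formula as printed. The names `Occurrence` and `link_weight` are hypothetical, and the cosine value 0.1 is made up for the example. Only the d_i, n_i, and o_i values come from the Cricket/Batting example.

```python
from typing import NamedTuple, Sequence


class Occurrence(NamedTuple):
    """One occurrence of the link (hypothetical container, not from the paper)."""
    d: int  # d_i in Eq. (1)
    n: int  # n_i in Eq. (1)
    o: int  # o_i in Eq. (1)


def link_weight(cos_kl: float, occurrences: Sequence[Occurrence]) -> float:
    """Assumed form of Eq. (1): af * cos(k, l) + sum((1 / d_i) * (n_i - o_i + 1) / n_i)."""
    af = len(occurrences)  # assumption: af equals the number of summed occurrences
    position_term = sum(
        (1.0 / occ.d) * (occ.n - occ.o + 1) / occ.n for occ in occurrences
    )
    return af * cos_kl + position_term


# Values of the Cricket/Batting example; the cosine value is illustrative only.
example = [Occurrence(d=3, n=25, o=11), Occurrence(d=3, n=25, o=3)]
weight = link_weight(cos_kl=0.1, occurrences=example)
print(f"W = {weight:.2f}")
```

Under this reading, occurrences in shallower and earlier sections contribute more, because both 1 / d_i and (n_i - o_i + 1) / n_i shrink as d_i and o_i grow.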

Fig. 4 Segment contents structure of Wikipedia

4.

Fig. 5 Tree posture Creator of the article of Wikipedia

4.1 Wikipedia Wikipedia Wikipedia Wikipedia ( 6 ) GENE95 1 GENE95 Google Ajax api 2 Microsoft Translator api 3 Bowls World Cricket League 8 Wikipedia

(2) 0.3

\cos(x, y) = \frac{\sum_i x_i y_i}{\sqrt{\sum_i x_i^2} \, \sqrt{\sum_i y_i^2}}    (2)

x 1 y 1 x_i 1 i y_i 1 i

1 GENE95 http://www.namazu.org/ tsuchiya/sdic/data/gene.html
2 Google Ajax api http://code.google.com/apis/language/
3 Microsoft Translator api http://www.microsofttranslator.com/dev/
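The cosine similarity of Eq. (2) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: it assumes x and y are raw term-frequency vectors built from token lists (the paper may weight terms differently), and the function name `cosine_similarity` and the sample tokens are hypothetical. The value 0.3 appears next to Eq. (2) in the text; treating it as a lower bound for "similar" is also an assumption.

```python
import math
from collections import Counter
from typing import Iterable

THRESHOLD = 0.3  # value given next to Eq. (2); its exact role is assumed here


def cosine_similarity(x_terms: Iterable[str], y_terms: Iterable[str]) -> float:
    """Eq. (2): sum(x_i * y_i) / (sqrt(sum(x_i^2)) * sqrt(sum(y_i^2)))."""
    x, y = Counter(x_terms), Counter(y_terms)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(x[t] * y[t] for t in x.keys() & y.keys())
    norm = math.sqrt(sum(v * v for v in x.values())) * math.sqrt(
        sum(v * v for v in y.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical tokens, e.g. a translated Japanese section and an English section.
similarity = cosine_similarity(["bat", "ball", "ball"], ["ball", "pitch"])
is_similar = similarity >= THRESHOLD
```

Returning 0.0 for an empty vector avoids a division by zero when a section has no extractable terms.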

Fig. 6 Table of contents structure and contents

Fig. 7 Output of prototype system

5. Ruby 1 MeCab 2 TreeTagger 3 MySQL 4 7 7 Wikipedia

1 Ruby http://www.ruby-lang.org/ja/
2 MeCab http://mecab.sourceforge.net/
3 TreeTagger http://www.ims.uni-stuttgart.de/projekte/corplex/treetagger/
4 MySQL http://www-jp.mysql.com/

6. 1 2

6.1 1 8) Cricket, Warwick Castle, Snooker, Fish and chips, Goodwood Festival of Speed, Bowls, Polo, Association football 8 0.6 0.7 0.6

Cricket 56, Warwick Castle 2, Snooker 15, Fish and chips 5, Goodwood Festival of Speed 2, Bowls 1, Polo 4, Association football 9

Table 1 Result of experiment 1 (%)

Article                      Precision  Recall  F     Precision  Recall  F
Cricket                      100        40      50    97         55      70
Warwick Castle               0          0       0     50         33      40
Snooker                      100        7       13    63         33      43
Fish and chips               67         20      31    50         60      55
Goodwood Festival of Speed   0          0       0     100        100     100
Bowls                        0          0       0     0          0       0
Polo                         0          0       0     100        25      40
Association football         100        33      50    100        63      77
Average                      46         13      19    70         46      53

1 Cricket Cricket Cricket Warwick Castle Warwick Castle 1 46% 70% 13% 46% F 19% 53% Cricket List of international Cricket Council members Cricket List of international Cricket Council members Cricket F Cricket Association football F Snooker Snooker n_i (1) 2

Table 2 Result of experiment 2 (%)

Precision  Recall  F
56         83      69
79         88      83
100        75      86
83         63      71
88         74      80
81         80      80

Bowls World Bowls Events World Bowls Events

6.2 2 5 1 F 6 25 8 8 19 2 2 F 80% History International structure
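The scores in Tables 1 and 2 can be related to each other with a short Python sketch. It is an illustration only and assumes that the three values reported per row are precision, recall, and the F-measure defined as their harmonic mean, which agrees with most rows of Table 1. The function names `f_measure` and `evaluate` and the sample article sets are hypothetical.

```python
from typing import Set, Tuple


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def evaluate(extracted: Set[str], correct: Set[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F) for a set of extracted article titles."""
    hits = len(extracted & correct)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(correct) if correct else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical sets of extracted and correct related articles.
p, r, f = evaluate({"Batting", "Bowling", "Tennis"}, {"Batting", "Umpire"})

# Check against one row of Table 1 (second group, Cricket: 97, 55, 70).
cricket_f = f_measure(97, 55)
```

Because the F-measure is a harmonic mean, a method with high precision but very low recall, such as the first group's Snooker row (100, 7), still receives a low F value.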

strike head grass

7., 2 Wikipedia 1 72% 49% F 57% 81% 80% F 80% Wikipedia

1) Wikipedia 50 ( 72 ) No.5, pp.181-182 (2010).
2) Wikipedia 21 (2009).
3) 71 No.2, pp.269-270 (2009).
4) DEIM Forum 2009.
5) Wikipedia DEIM Forum 2009, pp.1-8.
6) K. Nakayama, T. Hara and S. Nishio: Wikipedia Mining for An Association Web Thesaurus Construction, WISE 2007, pp.1-11.
7) Wikipedia 73 No.1, pp.1.575-1.576.
8) Wikipedia No.2011.