DEIM Forum 2015 A3-1

565-0871 [...] 1-5
E-mail: {nakamura.tatsuya,shirakawa.masumi,hara,nishio}@ist.osaka-u.ac.jp

[...] Wikipedia [...]

1.
[...] Twitter [...] Twitter [...] 44 [...] (footnote 1) [...] [13] [...] Wikipedia [...]

Footnote 1: [...] 2015 [...] 1 [...] (Beta) [...]

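2. Related work

Mihalcea et al. proposed Wikify! [8], which splits the task into 1) keyword extraction and 2) keyword disambiguation. For keyword extraction, Wikify! [8] uses a measure computed from Wikipedia (keyphraseness); keyphraseness is compared with TF-IDF [11] [...]. For disambiguation, three methods are considered, including one based on the Lesk algorithm [6] and one based on Naive Bayes. Cucerzan [2] [...] Wikipedia [...].

Milne et al. [10] developed Wikipedia Miner (footnote 2), which [...] the relatedness between Wikipedia articles [9] [...]. Compared with Wikify!, [...]. Kulkarni et al. [5] [...] Cucerzan [...] Milne [...]. Hoffart et al. [4] proposed AIDA (footnote 3). AIDA builds a Mention-Entity Graph over keywords (Mention) and articles (Entity) [...]. Ferragina et al. [3] proposed TAGME (footnote 4). TAGME [...] Wikipedia [...]. Meij et al. [7] [...] keyphraseness [...] 33 [...] Twitter [...].

Wikipedia Miner, TAGME [...] [1] [...] AIDA [...] 15 [...] 2 [...] [4] [...] TAGME [...] 2 [...] [3] [...] TAGME [...] Wikipedia [...].

Footnote 2: http://wikipedia-miner.cms.waikato.ac.nz/
Footnote 3: http://www.mpi-inf.mpg.de/yago-naga/aida/
Footnote 4: http://tagme.di.unipi.it/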

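3.
[...] TAGME [3] [...] TAGME [...] Wikipedia [...].

3.1 TAGME
TAGME [3] [...] Wikipedia [...]. TAGME consists of three steps, 1), 2), and 3), described in Sections 3.1.1 to 3.1.3.

3.1.1
TAGME [...] Wikipedia [...]. When an anchor text a_1 is contained in another anchor text a_2, the link probabilities lp(a_1) and lp(a_2) are compared: if lp(a_1) < lp(a_2), only a_2 is kept; if lp(a_1) >= lp(a_2), both a_1 and a_2 are kept. The link probability lp(a) of an anchor text a is

$$ lp(a) = \frac{link(a)}{freq(a)} \qquad (1) $$

where link(a) is the number of times a occurs in Wikipedia as an anchor text and freq(a) is the number of times a occurs in Wikipedia at all.

A related measure is keyphraseness [8]. Whereas lp(a) counts occurrences of a (link(a) and freq(a)), keyphraseness counts articles: with lf(a) the number of articles in which a appears as a link and df(a) the number of articles in which a appears,

$$ keyphraseness(a) = \frac{lf(a)}{df(a)} \qquad (2) $$

3.1.2
For each anchor text a in the set A of extracted anchor texts, let Pg(a) be the set of candidate pages and p_a a page in Pg(a). Each p_a is scored by a voting scheme:

$$ rel_a(p_a) = \sum_{b \in A \setminus \{a\}} \frac{\sum_{p_b \in Pg(b)} rel(p_b, p_a) \, \Pr(p_b \mid b)}{|Pg(b)|} \qquad (3) $$

Here rel(p_b, p_a) is the relatedness between pages [9], and Pr(p_b | b) is the probability that b refers to p_b, called commonness. In Equation (3), only pages with Pr(p_b | b) > τ are used. Among the candidates p_a of a whose score in Equation (3) is in the top ε%, the page with the highest commonness is assigned to a. The values of τ and ε follow TAGME [3]: τ = 0.02 and ε = 30%.

3.1.3
Section 3.1.2 [...] Wikipedia [...]. For each pair (a, p_a) obtained in Section 3.1.2, the following score is computed:

$$ \rho(a, p_a) = \frac{1}{2} \bigl( lp(a) + coherence(a, p_a) \bigr) \qquad (4) $$

The annotation of a with p_a is kept if ρ(a, p_a) > ρ_NA. coherence(a, p_a) is

$$ coherence(a, p_a) = \frac{1}{|S| - 1} \sum_{p_b \in S \setminus \{p_a\}} rel(p_b, p_a) \qquad (5) $$

where S is the set of pages assigned in the previous step. [...] Equations (4) and (5) [...] Section 4 [...] ρ_NA [...].

3.2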
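As a rough illustration of Equations (1) and (2) and of the containment rule in Section 3.1.1, the following Python sketch may help. The `AnchorStats` container, the function names, and the toy counts are assumptions made for this example; they do not come from the paper or from the TAGME implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchorStats:
    """Hypothetical container for the four Wikipedia counts of one anchor text."""
    link: int  # occurrences of the text as a link anchor
    freq: int  # occurrences of the text, linked or not
    lf: int    # articles in which the text appears as a link
    df: int    # articles in which the text appears at all


def lp(s: AnchorStats) -> float:
    """Link probability, Equation (1)."""
    return s.link / s.freq if s.freq else 0.0


def keyphraseness(s: AnchorStats) -> float:
    """Keyphraseness, Equation (2)."""
    return s.lf / s.df if s.df else 0.0


def prune_contained(stats: dict[str, AnchorStats], score=lp) -> list[str]:
    """Drop a1 when it is contained in some a2 with a strictly higher score.

    If score(a1) >= score(a2), both anchor texts are kept.
    """
    kept = []
    for a1, s1 in stats.items():
        dominated = any(
            a1 != a2 and a1 in a2 and score(s1) < score(s2)
            for a2, s2 in stats.items()
        )
        if not dominated:
            kept.append(a1)
    return kept


# Toy counts, invented for illustration only.
toy = {
    "cup": AnchorStats(link=5, freq=500, lf=4, df=200),
    "world": AnchorStats(link=50, freq=100, lf=30, df=60),
    "world cup": AnchorStats(link=40, freq=100, lf=20, df=50),
}
kept = prune_contained(toy)
```

With these toy counts, "cup" is dropped because its link probability is lower than that of "world cup", while "world" survives because its link probability is not lower. Passing `score=keyphraseness` swaps in the measure of Equation (2) without changing the rule.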

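The disambiguation and pruning steps of Sections 3.1.2 and 3.1.3 can be sketched in the same way. This is a minimal sketch under several assumptions: the relatedness function `rel` is passed in as a plain callable (here a toy lookup table rather than the measure of [9]), the commonness threshold τ is applied by filtering candidate lists before voting, and all page names and numbers are invented.

```python
import math


def prune_by_commonness(candidates, commonness, tau=0.02):
    """Keep only candidate pages whose commonness Pr(p | a) exceeds tau."""
    return {
        a: [p for p in pages if commonness[(a, p)] > tau]
        for a, pages in candidates.items()
    }


def vote_score(a, p_a, candidates, commonness, rel):
    """Voting score rel_a(p_a), following Equation (3)."""
    total = 0.0
    for b, pages_b in candidates.items():
        if b == a or not pages_b:
            continue
        votes = sum(rel(p_b, p_a) * commonness[(b, p_b)] for p_b in pages_b)
        total += votes / len(pages_b)
    return total


def disambiguate(a, candidates, commonness, rel, epsilon=0.30):
    """Among the top-epsilon candidates by vote, pick the highest commonness."""
    pages = candidates[a]
    if not pages:
        return None
    ranked = sorted(
        pages,
        key=lambda p: vote_score(a, p, candidates, commonness, rel),
        reverse=True,
    )
    top = ranked[: max(1, math.ceil(epsilon * len(ranked)))]
    return max(top, key=lambda p: commonness[(a, p)])


def coherence(p_a, assigned, rel):
    """Average relatedness of p_a to the other assigned pages, Equation (5)."""
    others = [p for p in assigned if p != p_a]
    if not others:
        return 0.0
    return sum(rel(p_b, p_a) for p_b in others) / len(others)


def rho(lp_a, coherence_a):
    """Pruning score, Equation (4)."""
    return 0.5 * (lp_a + coherence_a)


# Toy data, invented for illustration only.
REL = {
    frozenset({"Jaguar_Cars", "Engine"}): 0.8,
    frozenset({"Jaguar_(animal)", "Engine"}): 0.1,
}
rel = lambda p, q: REL.get(frozenset({p, q}), 0.0)
commonness = {
    ("jaguar", "Jaguar_Cars"): 0.4,
    ("jaguar", "Jaguar_(animal)"): 0.6,
    ("engine", "Engine"): 1.0,
}
candidates = prune_by_commonness(
    {"jaguar": ["Jaguar_Cars", "Jaguar_(animal)"], "engine": ["Engine"]},
    commonness,
)
assigned = {a: disambiguate(a, candidates, commonness, rel) for a in candidates}
score = rho(0.5, coherence(assigned["jaguar"], set(assigned.values()), rel))
```

In this toy case the context anchor "engine" votes for the car sense even though the animal sense has the higher commonness, which is the behaviour the voting scheme is meant to produce. An annotation would then be kept only if `score` exceeds the threshold ρ_NA.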
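[Figure 1: overview [...]. Input: a set of texts, from which context information around anchor texts, P(w) and P(w | a), is derived. Within the TAGME framework: (1) keyword extraction, (2) keyword disambiguation, (3) article aggregation. Example text (bold underlined parts are keywords): "W杯得点王のハメスがレアルに移籍" ("World Cup top scorer James moves to Real"). Extracted phrases (anchor texts): メス, レアル, W杯. Then: assigned articles, articles after aggregation.]

[...] Wikipedia [...] TAGME [...] Wikipedia [...]. [...] FIFA [...] FIFA World Cup, 2014 FIFA World Cup [...] FIFA World Cup [...] 2014 FIFA World Cup [...].

3.3
[...] Section 3.1 [...] TAGME [...] Wikipedia [...] Figure 1 [...] Wikipedia [...] TAGME [...] Wikipedia [...] Section 3.2 [...].

3.4
[...] w (footnote 5) [...]

Footnote 5: [...]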

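For an anchor text a [...], con(w, a) [...] w [...] a [...] is defined from the probability Pr(w | a) of w [...] given a and the probability Pr(w) of w [...] (Equation (6); the operator joining Pr(w | a) and Pr(w) did not survive in this transcription). [...] w [...], con(w, a) = 1. Pr(w | a) and Pr(w) are estimated from counts:

$$ \Pr(w \mid a) = \frac{count(w, a)}{\sum_{w} count(w, a)} \qquad (7) $$

$$ \Pr(w) = \frac{\sum_{a} count(w, a)}{\sum_{w, a} count(w, a)} \qquad (8) $$

where count(w, a) is the number of times w [...] a [...]. [...] Equation (6) [...] w [...] a [...].

Equation (4) is extended with Equation (6), so that for an anchor text a [...] w:

$$ \rho(w, a, p_a) = \frac{lp(a) + coherence(a, p_a) + (1 - con(w, a))}{3} \qquad (9) $$

The annotation of a with p_a is kept if ρ(w, a, p_a) > ρ_NA. [...] con(w, a) [...] w [...] con(w, a) [...] Equation (6) [...] 20 [...] Equations (6) and (9) [...]; [...] 20 [...] Equation (4) of TAGME [...].

3.5
[...] FIFA World Cup, 2014 FIFA World Cup [...] Microsoft Windows, iPhone [...].

[Figure 2: article aggregation. (a) Keyword (anchor text) ワールドカップ ("World Cup"): among the candidate articles, "(number)_topic name" is aggregated into "topic name". (b) Category iPhone: articles belonging to the category iPhone are aggregated into the article with the same name as the category.]

[...] Wikipedia [...] (Figure 2) [...] Section 3.4 [...].

3.5.1
[...] Figure 2(a) [...] 2014 FIFA World Cup [...]. Let p_x be an article with title title_x, re_withyear a regular expression [...], and extract(title_x) [...].

(1) For the article p_x, re_withyear [...] title_x [...].
(2) [...] title_x [...] extract(title_x) = title_y [...] p_y [...].

By (1) and (2), p_x is aggregated into p_y. [...] 35,847 [...] 8,240 [...] (footnote 6).

3.5.2
[...] Wikipedia [...] (Figure 2(b)) [...].

Footnote 6: 2014-11-06 [...] Wikipedia [...].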
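To make Equations (7) to (9) concrete, the sketch below estimates Pr(w | a) and Pr(w) from observed (w, a) pairs and then combines the three terms of Equation (9). Since Equation (6) is not recoverable here, con(w, a) is taken as an input value rather than computed. The function names and the observations are assumptions made for illustration.

```python
from collections import Counter


def context_probabilities(pairs):
    """Estimate Pr(w | a) and Pr(w) from observed (w, a) pairs.

    Implements the count ratios of Equations (7) and (8).
    """
    count = Counter(pairs)
    per_anchor = Counter()
    per_word = Counter()
    for (w, a), n in count.items():
        per_anchor[a] += n
        per_word[w] += n
    total = sum(count.values())
    pr_w_given_a = {(w, a): n / per_anchor[a] for (w, a), n in count.items()}
    pr_w = {w: n / total for w, n in per_word.items()}
    return pr_w_given_a, pr_w


def rho_with_context(lp_a, coherence_a, con_wa):
    """Pruning score of Equation (9); con_wa is supplied by the caller."""
    return (lp_a + coherence_a + (1.0 - con_wa)) / 3.0


# Invented observations: w is a neighbouring token of anchor text a.
pairs = [("ha", "mesu")] * 9 + [("no", "mesu")] + [("no", "real")] * 10
pr_w_given_a, pr_w = context_probabilities(pairs)
score = rho_with_context(lp_a=0.6, coherence_a=0.9, con_wa=0.3)
```

In the invented data, the token "ha" precedes "mesu" nine times out of ten, so Pr(w | a) is 0.9 while Pr(w) is only 0.45. That gap is the kind of signal the context term is built from.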

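The year-based aggregation of Section 3.5.1 could look roughly like the following. The actual pattern behind re_withyear is not given in the surviving text, so the regular expression below (a four-digit number, then a space or underscore, then the topic name) is a hypothetical stand-in, as are the function names.

```python
import re

# Hypothetical stand-in for re_withyear: "<4 digits><space or underscore><topic>".
RE_WITHYEAR = re.compile(r"^(\d{4})[ _](.+)$")


def extract(title):
    """Return the topic name without the leading number, or None if no match."""
    m = RE_WITHYEAR.match(title)
    return m.group(2) if m else None


def aggregate_by_year(title_x, titles):
    """Map title_x to title_y = extract(title_x) when such an article exists."""
    title_y = extract(title_x)
    if title_y is not None and title_y in titles:
        return title_y
    return title_x


titles = {"FIFA World Cup", "2014 FIFA World Cup", "2001 A Space Odyssey", "iPhone"}
merged = {t: aggregate_by_year(t, titles) for t in sorted(titles)}
```

Checking that the stripped title exists as an article keeps titles such as "2001 A Space Odyssey" from being rewritten to a name that has no article of its own.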
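Table 1: [...]

k    [...]        [...]
1    439,955      73,478
2    815,458      83,518
3    1,076,751    84,587

Let categories(p_x) be the categories of an article p_x, c_x a category with title cattitle_x, and c_parent(c_x) the parent category of c_x. The article p_x is aggregated as follows.

(1) For the article p_x, categories(p_x) [...] title_x [...] c_x in categories(p_x) [...].
(2) [...] cattitle_x = title_y [...] p_y [...], p_x is aggregated into p_y.
(3) [...] k [...] k [...] c_parent(c_x) [...] k [...] p_x [...].

[...] Wikipedia [...] ID [...] Table 1 [...] k [...].

4.

4.1
[...] Twitter [...] Wikipedia [...] Shirakawa et al. [12] [...] from 2014-11-01 to 2015-01-15 [...] 2 [...] 300 [...] 2 [...] 1 [...] 5 [...] 647,937 [...] 13.0 [...] 91,219 [...] 55.1 [...] 100 [...] 5 [...]. Example: [...] 2014 FIFA [...] FIFA World Cup, 2014 FIFA World Cup [...].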
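The category-based aggregation steps (1) to (3) of Section 3.5.2 can be sketched as below. Much of the step wording is lost, so this is one plausible reading rather than the authors' procedure: an article is mapped to an article whose title equals one of its categories, and if none is found, parent categories are followed for at most k hops. The dictionaries, the function name, and the toy category graph are assumptions.

```python
def aggregate_by_category(title_x, categories, parent, titles, k=1):
    """Aggregate an article into the article named like one of its categories.

    categories: article title -> list of category titles
    parent:     category title -> list of parent category titles
    titles:     set of existing article titles
    k:          number of parent hops allowed (k=0 uses direct categories only)
    """
    frontier = list(categories.get(title_x, []))
    for _ in range(k + 1):
        for cattitle in frontier:
            if cattitle in titles and cattitle != title_x:
                return cattitle
        # Move one level up the category hierarchy.
        frontier = [p for c in frontier for p in parent.get(c, [])]
    return title_x


# Toy category graph, invented for illustration only.
categories = {"iPhone 6": ["iPhone"], "Xperia Z3": ["Sony Xperia smartphones"]}
parent = {"Sony Xperia smartphones": ["Sony Xperia"]}
titles = {"iPhone 6", "iPhone", "Xperia Z3", "Sony Xperia"}

hop0 = aggregate_by_category("Xperia Z3", categories, parent, titles, k=0)
hop1 = aggregate_by_category("Xperia Z3", categories, parent, titles, k=1)
```

A larger k merges more articles, which is consistent with the counts in Table 1 growing with k, but it also risks merging articles into overly general ones.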

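[...] (hop0) [...] k [...] k = 1, k = 2 (hop1, hop2) [...] (Equation (6)) [...]. Two comparison methods are used: TAGME [3] (TAGME(LP)) and a variant that uses keyphraseness (TAGME(KP)). The evaluation measures are micro-averaged precision (Precision) and recall (Recall). The threshold ρ_NA in Equations (9) and (4) is varied from 0 to 1.

4.2
[...] Figure 3 [...] Figure 4 [...] Figure 3 [...] hop0, hop1, hop2 [...] Figure 3 [...] hop1 [...] Figure 4 [...] TAGME [...] TAGME [...] Figure 5 [...] (hop1) [...] TAGME(KP) [...] ρ_NA [...] TAGME [...] ρ_NA = 0.2 [...] ρ_NA = 0.3 [...] 2 [...] free, mobile, pissed, descend [...] Web [...] Wikipedia [...] free mobile [...] free mobile app [...].

[Figure 3: precision (vertical axis, 0.6 to 0.95) versus recall (horizontal axis, 0 to 1) for TAGME(LP), TAGME(KP), and the proposed method (hop1).]

[Figure 4: precision (vertical axis, 0.3 to 0.8) versus recall (horizontal axis, 0 to 0.7) for TAGME(LP), TAGME(KP), and the proposed method (hop0), (hop1), and (hop2).]

[...] TAGME(KP) [...] TAGME(LP) [...] keyphraseness [...] TAGME [...] keyphraseness [...] (hop0, hop1) [...] TAGME(KP) [...] TAGME [...] Figure 5 [...] Equation (9) [...] ρ [...] Equation (6) [...] con [...] TAGME(KP) [...] TAGME [...] hop0, hop1, hop2 [...] hop0 [...] hop1 [...] hop2 [...] hop1 [...].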
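The precision-recall curves in Figures 3 and 4 come from sweeping the threshold ρ_NA. A minimal sketch of that kind of evaluation follows, assuming each text comes with a dictionary from (anchor text, article) annotations to their ρ scores and a gold set of annotations. The data layout, function names, and example values are all assumptions.

```python
def micro_precision_recall(predicted, gold):
    """Micro-averaged precision and recall over a list of texts.

    predicted, gold: lists of sets of (anchor text, article) pairs, one per text.
    """
    tp = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    return precision, recall


def sweep_threshold(scored, gold, thresholds):
    """Return (rho_NA, precision, recall) for each threshold value."""
    curve = []
    for rho_na in thresholds:
        predicted = [
            {ann for ann, rho in per_text.items() if rho > rho_na}
            for per_text in scored
        ]
        curve.append((rho_na, *micro_precision_recall(predicted, gold)))
    return curve


# Invented scores and gold annotations for two short texts.
scored = [
    {("world cup", "FIFA World Cup"): 0.8, ("mesu", "Scalpel"): 0.25},
    {("real", "Real Madrid"): 0.5},
]
gold = [
    {("world cup", "FIFA World Cup")},
    {("real", "Real Madrid")},
]
curve = sweep_threshold(scored, gold, [0.2, 0.3])
```

Raising the threshold from 0.2 to 0.3 removes the spurious low-scoring annotation in this toy example, so precision rises while recall is unchanged. On real data the two usually trade off.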

[Figure 5: [...] ρ_NA [...]. (a) Proposed method (hop1). (b) TAGME(KP).]

[...] ice rink [...] Ice Hockey [...] Wikipedia [...] ID [...] Xperia Z3, Sony Xperia [...] hop2 [...].

Table 2: [...]

hop0    163,564
hop1    152,288
hop2    139,680

5.
[...] Wikipedia [...] Twitter [...].

[...] (A)(26240013) [...] IT [...] IT [...] (2012-2016) [...].

[1] M. Cornolti, P. Ferragina, and M. Ciaramita, A Framework for Benchmarking Entity-annotation Systems, In WWW, pp.249-260, 2013.
[2] S. Cucerzan, Large-Scale Named Entity Disambiguation Based on Wikipedia Data, In EMNLP-CoNLL, pp.708-716, 2007.
[3] P. Ferragina, and U. Scaiella, Fast and Accurate Annotation of Short Texts with Wikipedia Pages, IEEE Software, vol.29, no.1, pp.70-75, 2011.
[4] J. Hoffart, M.A. Yosef, I. Bordino, H. Fürstenau, M. Pinkal, M. Spaniol, B. Taneva, S. Thater, and G. Weikum, Robust Disambiguation of Named Entities in Text, In EMNLP, pp.782-792, 2011.
[5] S. Kulkarni, A. Singh, G. Ramakrishnan, and S. Chakrabarti, Collective Annotation of Wikipedia Entities in Web Text, In KDD, pp.457-466, 2009.
[6] M. Lesk, Automatic Sense Disambiguation Using Machine Readable Dictionaries: How to Tell a Pine Cone from an Ice Cream Cone, In SIGDOC, pp.24-26, 1986.
[7] E. Meij, W. Weerkamp, and M. de Rijke, Adding Semantics to Microblog Posts, In WSDM, pp.563-572, Feb. 2012.
[8] R. Mihalcea, and A. Csomai, Wikify!: Linking Documents to Encyclopedic Knowledge, In CIKM, pp.233-242, 2007.
[9] D. Milne, and I.H. Witten, An Effective, Low-cost Measure of Semantic Relatedness Obtained from Wikipedia Links, In AAAI Workshop on Wikipedia and Artificial Intelligence, pp.25-30, July 2008.
[10] D. Milne, and I.H. Witten, Learning to Link with Wikipedia, In CIKM, pp.509-518, 2008.
[11] G. Salton, and C. Buckley, Term-weighting Approaches in Automatic Text Retrieval, Information Processing & Management, vol.24, no.5, pp.513-523, 1988.
[12] M. Shirakawa, T. Hara, and S. Nishio, MLJ: Language-Independent Real-Time Search of Tweets Reported by Media Outlets and Journalists, In VLDB, vol.7, no.13, pp.1605-1608, 2014.
[13] [...] Wikipedia [...], vol.2014-DBS-160, no.11, pp.1-9, 2014.