
DEIM Forum 2015 B1-5

606 8501 606 8501
E-mail: komurasaki@dl.kuis.kyoto-u.ac.jp, tajima@i.kyoto-u.ac.jp

Web Web AND AND Web

1. Twitter Facebook SNS Web Web Web Web [5] Bollegala [2] Web Web 1 Google Microsoft Bing Cimiano [3] Web Web Web Web Web Web Web 1 4,730,000 660,000 0.993 0.830 Web Satoh [7]

1 AND TFIDF Web DFIWF Wikipedia Web Web 2. 3. 4. AND 5. Wikipedia AND 6. Wikipedia Web 7. 8.

2. Ma [4] Ma URL AND Tian [8] Tian Tian Web Cimiano [3] [5] AND Bollegala [2] AND SVM Satoh [7] Uyar [9] Satoh Web Uyar Google Yahoo Microsoft Satoh Uyar

3.

3. 1 500 1000 1 Microsoft Bing Search API 2 2 0.7 0.9 0.5

1 2 https://datamarket.azure.com/dataset/bing/search

[Figure 1: "Histogram of Famousness". x-axis: Famousness (0.0 to 1.0); y-axis: Frequency (0 to 200). Bar labels: 2, 8, 5, 8, 13, 7, 20, 12, 11, 12, 15, 31, 37, 61, 54, 97, 115, 169, 184, 139.]

0.0246 log( ) 0.220 4.

[Figure 2: Famousness (0.0 to 1.0) plotted against Hitcount.logarithm (0 to 10).]

3 93

3. 2 1000 Bing Search API 20 20000 MeCab [1]

3. 3 3 5 7 10 1000 1 Web 0.220 3. Web 20 AND

4. 1 TFIDF TFIDF (term frequency / inverse document frequency) [6] AND

4. 1. 1 TFIDF TFIDF C c_i D_i w TF_{w,i} D_i w D C i d i DF_{w,i} 1000 20 1 1000 w DF_{w,i} IDF_{w,i}

$$\mathrm{IDF}_{w,i} = \log\left(\frac{|D_i|}{\mathrm{DF}_{w,i}}\right) \tag{1}$$

TF_{w,i} IDF_{w,i} TFIDF_{w,i}

$$\mathrm{TFIDF}_{w,i} = \mathrm{TF}_{w,i} \cdot \mathrm{IDF}_{w,i} \tag{2}$$

TFIDF_{w,i} c_i C c_i

4. 1. 2 3. 2 TFIDF
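Equations (1) and (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `tfidf_for_person`, the tokenized input format, and the choice to count DF over the page set D_i of a single person are all hypothetical, and the logarithm base (natural log here) is not recoverable from the text.

```python
import math
from collections import Counter


def tfidf_for_person(docs):
    """Return {word: TFIDF_{w,i}} for one person c_i.

    docs is the page set D_i, given as a list of token lists
    (assumed to be produced by a tokenizer such as MeCab).
    """
    n_docs = len(docs)
    # TF_{w,i}: total occurrences of w across all pages in D_i
    tf = Counter(tok for doc in docs for tok in doc)
    # DF_{w,i}: number of pages in D_i that contain w (assumption)
    df = Counter(tok for doc in docs for tok in set(doc))
    # Equation (1) gives IDF, equation (2) multiplies it by TF
    return {w: tf[w] * math.log(n_docs / df[w]) for w in tf}


# Hypothetical toy data: three pages retrieved for one person
docs = [
    ["actor", "film", "award"],
    ["actor", "film"],
    ["actor", "music"],
]
scores = tfidf_for_person(docs)
```

A word that appears on every page, such as `actor` in the toy data, gets an IDF of zero and therefore a TFIDF of zero, whatever its term frequency.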

2 0.0255 log( ) 0.212 AND 2 TFIDF AND AND 13,300,000 21 Web

4. 2 DFIWF 4. 1 TFIDF AND DFIWF (document frequency / inverse web frequency)

4. 2. 1 DFIWF DFIWF DFIWF Web C c_i d_i D = {d_1, d_2, ..., d_n} w D DF_w w WF_w IWF_w

$$\mathrm{IWF}_w = \frac{1}{\log(\mathrm{WF}_w)} \tag{3}$$

DFIWF_w

$$\mathrm{DFIWF}_w = \mathrm{DF}_w \cdot \mathrm{IWF}_w \tag{4}$$

DFIWF_w D Web w w Web

4. 2. 2 3. 2 DFIWF 3 1000 AND 3 1000 AND

3 DFIWF

     DF      WF          DFIWF
1    1000    5940000     64.1     0.489
2    1000    11300000    61.6     0.303
3    1000    12300000    61.3     0.304

5 DF 1000 4. 1 DFIWF TFIDF 2 0.489 DFIWF 3 3 AND

5. AND 4. TFIDF DFIWF TFIDF DFIWF

5. 1 15
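As a quick check of equations (3) and (4), the sketch below recomputes the DFIWF column of Table 3 from its DF and WF columns. The function names are illustrative, and the natural logarithm is an assumption; it is chosen because it reproduces the tabulated values 64.1, 61.6 and 61.3.

```python
import math


def iwf(web_frequency):
    """Equation (3): inverse web frequency, 1 / log(WF_w)."""
    return 1.0 / math.log(web_frequency)


def dfiwf(document_frequency, web_frequency):
    """Equation (4): DFIWF_w = DF_w * IWF_w."""
    return document_frequency * iwf(web_frequency)


# (DF, WF) pairs taken from the three rows of Table 3
table3_rows = [(1000, 5_940_000), (1000, 11_300_000), (1000, 12_300_000)]
table3_dfiwf = [round(dfiwf(df, wf), 1) for df, wf in table3_rows]
```

Because DF is fixed at 1000 for all three rows, the ranking is decided entirely by WF: the rarer a word is on the Web as a whole, the higher its DFIWF.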

5. 2 5. 1 AND Web 4. TFIDF Web Web AND AND Web AND AND T c_i t_j T AND h_{i,t_j} famousness_i W = {w_1, w_2, ..., w_n}

$$\mathrm{famousness}_i = w_1 h_{i,t_1} + w_2 h_{i,t_2} + \cdots + w_n h_{i,t_n} = \sum_{j=1}^{n} \left( w_j h_{i,t_j} \right) \tag{5}$$

(5) AND W W AND

5. 3 1000 500 5. 2 Leave-one-out 499 AND W W 1 AND 500 5. 3 AND 5 w_i 0 4 0.422 0.0357 0.259 0.449 5. 3 0.0357 0.449 AND 0 4. Satoh [7]

6. 5. Web Web

6. 1 Web 3 3

6. 2 Web Web Web Web
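The weighted sum in equation (5) and the leave-one-out evaluation mentioned in section 5. 3 can be sketched as below. How the weights W are actually fitted is not recoverable from the text, so ordinary least squares is used here purely as an assumed stand-in; all function names and the toy data are hypothetical.

```python
def estimate(weights, hits):
    """Equation (5): weighted sum of AND-search hit counts h_{i,t_j}."""
    return sum(w * h for w, h in zip(weights, hits))


def fit_least_squares(rows, targets):
    """Fit weights by solving the normal equations (X^T X) w = X^T y.

    Assumed fitting method; uses Gauss-Jordan elimination with pivoting.
    """
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def leave_one_out(rows, targets):
    """Predict each sample using weights fitted on all other samples."""
    predictions = []
    for k in range(len(rows)):
        train_x = rows[:k] + rows[k + 1:]
        train_y = targets[:k] + targets[k + 1:]
        weights = fit_least_squares(train_x, train_y)
        predictions.append(estimate(weights, rows[k]))
    return predictions


# Hypothetical toy data generated from the weights (0.5, 0.25)
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [1.0, 3.0]]
targets = [0.5 * a + 0.25 * b for a, b in rows]
predictions = leave_one_out(rows, targets)
```

With 500 samples, as in section 5. 3, the same loop would fit on 499 samples and predict the remaining one, 500 times.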

Web Web 5. 3

6. 2. 1 Category_c W = {w_0, w_1, ..., w_n} c C w_i f_{c,i}

$$\mathrm{Category}_c = \{ f_{c,0}, f_{c,1}, \ldots, f_{c,n} \}$$

6. 2. 2 AND t X c Y t c AND X Y b_{t,c}

$$b_{t,c} = \frac{X \cap Y}{X + Y} \tag{6}$$

6. 2. 3 6. 2. 1 6. 2. 2 t c Category_c t c b_{t,c} V_t 6. 2. 1 w_i f_{t,i}

$$V_t = \{ f_{t,0}, f_{t,1}, \ldots, f_{t,n} \}$$

c Category_c t c b_{t,c} V_t t Celebrity_t t C_t

$$\mathrm{Celebrity}_t = \sum_{c \in C_t} \left( b_{t,c} \cdot \mathrm{Category}_c \right) + V_t \tag{7}$$

7 Celebrity_t Celebrity_t

6. 2. 4 Web 7 Web Web Web Web Web 7 Web Web Web Web Web p Page_p V_t w_i Web p f_{p,i}

$$\mathrm{Page}_p = \{ f_{p,0}, f_{p,1}, \ldots, f_{p,n} \} \tag{8}$$

8 Web p Page_p 7 t Celebrity_t cos sim_{t,p}

$$\mathrm{sim}_{t,p} = \frac{\mathrm{Celebrity}_t \cdot \mathrm{Page}_p}{|\mathrm{Celebrity}_t| \, |\mathrm{Page}_p|} \tag{9}$$

Web p sim_{t,p} Web Occurrence_t Occurrence_t

$$\mathrm{Occurrence}_t = \frac{\sum_{p \in P} \mathrm{sim}_{t,p}}{|P|} \tag{10}$$

Occurrence_t t Web Web Occurrence_t t Web WebAffinity_t

$$\mathrm{WebAffinity}_t = p_1 \, \mathrm{Occurrence}_t + p_2 \tag{11}$$

p_1 p_2 Web

6. 3 6. 1 t NewsHook_t Web Wikipedia Wikipedia Wikipedia t Wikipedia 1 WikiAccess_t Wikimedia WikiEdit_t Wikipedia WikiAccess_t WikiEdit_t t NewsHook_t

$$\mathrm{NewsHook}_t = p_3 \, \mathrm{WikiAccess}_t + p_4 \, \mathrm{WikiEdit}_t + p_5 \tag{12}$$

11 p_1, p_2 p_3, p_4, p_5
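The chain from equation (6) to equation (11) can be sketched as follows. This is an illustration under assumptions rather than the authors' code: in equation (6) the numerator is taken to be the AND-search hit count for t and c, with X and Y the individual hit counts, and every function name, parameter value and vector below is hypothetical.

```python
import math


def cooccurrence(and_hits, x_hits, y_hits):
    """Equation (6): b_{t,c}, assuming and_hits is the AND-search hit count."""
    return and_hits / (x_hits + y_hits)


def celebrity_vector(v_t, weighted_categories):
    """Equation (7): V_t plus the b_{t,c}-weighted sum of category vectors.

    weighted_categories is a list of (b_tc, category_vector) pairs.
    """
    result = list(v_t)
    for b_tc, category in weighted_categories:
        result = [r + b_tc * f for r, f in zip(result, category)]
    return result


def cosine(u, v):
    """Equation (9): cosine similarity; returns 0.0 for a zero vector."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def occurrence(celebrity, pages):
    """Equation (10): mean similarity between Celebrity_t and each Page_p."""
    return sum(cosine(celebrity, page) for page in pages) / len(pages)


def web_affinity(occurrence_t, p1, p2):
    """Equation (11): linear transform with fitted parameters p1, p2."""
    return p1 * occurrence_t + p2


# Hypothetical two-dimensional word-frequency vectors
celebrity = celebrity_vector([1.0, 0.0], [(0.5, [2.0, 2.0])])
mean_similarity = occurrence([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

The vectors are plain frequency lists indexed by the word set W, so the same `cosine` function works for Category_c, V_t, Celebrity_t and Page_p alike.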

Web 4 t Famousness_t HitCount_t WebAffinity_t NewsHook_t AccumulateDuration_t 4 Infobox

$$\mathrm{HitCount}_t = \mathrm{Famousness}_t \cdot \mathrm{WebAffinity}_t \cdot \mathrm{NewsHook}_t \cdot \mathrm{AccumulateDuration}_t \tag{14}$$

6. 4 Web Web Wikipedia 4 Wikipedia infobox infobox t days_t AccumulateDuration_t

$$\mathrm{AccumulateDuration}_t = p_6 \, \mathrm{days}_t + p_7 \tag{13}$$

p_6 p_7

6. 5 6. 2 6. 3 6. 4 t Web WebAffinity_t NewsHook_t AccumulateDuration_t Web Web Web Web Web 14 Famousness_t

$$\mathrm{Famousness}_t = \frac{\mathrm{HitCount}_t}{\mathrm{WebAffinity}_t \cdot \mathrm{NewsHook}_t \cdot \mathrm{AccumulateDuration}_t} \tag{15}$$

15 WebAffinity_t NewsHook_t AccumulateDuration_t 11 12 13 p_1 p_7 4. 2 DFIWF 15 HitCount_t DFIWF 15

7. 15

7. 1 3. 1 3. 2 4. 2 DFIWF AND 2 ( ) (DFIWF) 12 Wikipedia 2013 1 1 2013 12 31 1
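Equations (12) to (15) combine into a small model that can be sketched as below. The sketch reads equation (14) as a product and equation (15) as its inversion, which is an interpretation of the damaged source; the function names and all numeric parameter values are hypothetical placeholders, not fitted values from the paper.

```python
import math


def news_hook(wiki_access, wiki_edit, p3, p4, p5):
    """Equation (12): linear combination of Wikipedia access and edit counts."""
    return p3 * wiki_access + p4 * wiki_edit + p5


def accumulate_duration(days, p6, p7):
    """Equation (13): linear function of the number of days, days_t."""
    return p6 * days + p7


def hit_count_model(famousness_t, web_affinity_t, news_hook_t, duration_t):
    """Equation (14): hit count modeled as a product of four factors."""
    return famousness_t * web_affinity_t * news_hook_t * duration_t


def famousness(hit_count_t, web_affinity_t, news_hook_t, duration_t):
    """Equation (15): equation (14) solved for Famousness_t."""
    return hit_count_t / (web_affinity_t * news_hook_t * duration_t)


# Hypothetical values, for illustration only
hook = news_hook(wiki_access=10.0, wiki_edit=2.0, p3=0.1, p4=0.5, p5=1.0)
duration = accumulate_duration(days=365.0, p6=0.01, p7=0.5)
estimated = famousness(1200.0, 1.5, 2.0, 4.0)
```

Dividing the hit count by the three correction factors leaves only the part of the hit count attributed to famousness itself, which is why equation (15) is the inverse of equation (14).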

5 Result of estimation

( )       (DFIWF)
0.0300    0.0243
0.470     0.592

[Figures 5 ( ) and 6 (DFIWF): x-axis Estimate (0.0 to 1.5); y-axis Correct (0.0 to 1.0).]

13 days_t t 2013 12 31 15 p_1 p_7 Leave-one-Out

7. 2 5 5 DFIWF 6 5 DFIWF 0.0243 5. 3 0.0357 0.592 5. 3 0.499 5 6 3. 1 Web

JSPS 26280112

[1] MeCab. http://mecab.googlecode.com/svn/trunk/mecab/doc/index.html.
[2] Danushka Bollegala, Yutaka Matsuo, and Mitsuru Ishizuka. Measuring semantic similarity between words using web search engines. WWW, 7:757-766, 2007.
[3] Philipp Cimiano, Siegfried Handschuh, and Steffen Staab. Towards the self-annotating web. In Proceedings of the 13th International Conference on World Wide Web, pages 462-471. ACM, 2004.
[4] Qiang Ma and Masatoshi Yoshikawa. Ranking people based on metadata analysis of search results. In Sven Hartmann, Xiaofang Zhou, and Markus Kirchberg, editors, Web Information Systems Engineering - WISE 2008 Workshops, volume 5176 of Lecture Notes in Computer Science, pages 48-60. Springer Berlin Heidelberg, 2008.
[5] Yutaka Matsuo, Hironori Tomobe, and Takuichi Nishimura. Robust estimation of Google counts for social network extraction. In AAAI, volume 7, pages 1395-1401, 2007.
[6] Gerard Salton. Developments in automatic text retrieval. Science, 253(5023):974-980, 1991.
[7] Koh Satoh and Hayato Yamana. Hit count reliability: how much can we trust hit counts? Web Technologies and Applications, pages 751-758, 2012.
[8] Tian Tian, Soon Ae Chun, and James Geller. A prediction model for web search hit counts using word frequencies. Journal of Information Science, page 0165551511415183, 2011.
[9] Ahmet Uyar. Investigation of the accuracy of search engine hit counts. Journal of Information Science, 35(4):469-480, 2009.

8. Web