RD-003
Building a Database of Purpose for Action from Word-of-mouth on the Web
Hiromi Wakaki†, Hiroko Fujii†, Michiaki Ariga†, Kazuo Sumita†, Kouta Nakata†, Masaru Suzuki†

1 ().com 1 Amazon 2 3 [10] 2007 Web [15][4] " • • • Google N-gram[11]

2 [18] [14] ( ) ( ) [15] Query Classification [8][3] 4 ` ' WordNet[1] [5] [2] [9] [17]

† ( )
1 http://kakaku.com/
2 http://www.amazon.co.jp/
3 http://www.jalan.net/
4 2005 KDD cup 2005 80 67
[12] Qiu [6][7] Google N-gram[11]

3
3.1 Web ( 4travel 5, 6 ) ( 7 ) ( All About 8 ) 3 47 40645
3.2 (1) (2) (3) 3 3 (1) (2) (3) (a) ( (b) (c) (d) (e) (1) (2) (3) ( ), ( ) (*) ( ) 3 ` ' ` ' 3 (= )
3.3 60 60 30 / 0.79 9 0.57 10 /

5 http://4travel.jp/domestic/
6 http://tabisuke.arukikata.co.jp/domestic/
7 http://www.naha-navi.or.jp/
8 http://allabout.co.jp/domestic/
9
10
60 (307 ) 107 (99 ) 2 107 (d) (c)

4 (c) (d) (d) (d)
4.1 3 3
4.2 5 5 (a) ( (b) (c) (d) (e) (a) (a-i) (a-ii) 2 (a-i) (a-ii) NAIST 11 [13] [16] 8 5 2 (d) (c) (d) 80% (e) ()

5
5.1 (b) 2

11 http://cl.naist.jp/~inui/research/em/sentiment-lexicon.html NAIST Web
1: Cabocha Cabocha
(i)
(ii) [ ] + ( )
(iii) ( / / / )
(iv) ( / )

2: ( / )
7.5% (9/120)
8.3% (10/120)
25% (30/120)
56% (69/120)
1.7% (2/120)

3: X+( )+ Y
(I) ( ) [ ] +( ) + ( ) [ ]
(II) ( ) [ ] +( )+ ( ) [ ]
(III) ( ) [ ] +( ) + ( ) [ ]
(IV) ( ) [ ] + / + ( ) [ ] /
(V) ( ) [ ] + / +

4: 47 40645 252692 98776 ( )
(a-i) 174
(a-ii) 2695 1
(b) 359
(d) 63931 30804
(e) 813

5: ( / )
(a) 1% (1/93) 4.3% (4/93)
(b) 2.2% (2/93)
(d) (+(b) +(c) ) 74% (69/93) 17% (16/93)
(e) 1% (1/93)
5.2 Google Google N-gram[11] Google N-gram Google Web 200 2550 n-gram (1-7 gram) 2007 11 12 N-gram N-gram

5.3
Step1: Cabocha 13 V+ V+ V+ V+ N- N- Cabocha [ ] + ( ) Cabocha 1
Step2: Step1 ( X ) X ( Y ) 3 X Y Y Step3 Google N-gram Y X Y Step1 X Step3 X
Step3: Google N-gram Step2 X Y 3 (I)-(V) N-gram Google N-gram X_b ( X X_b )

R_{positive}(x_b) = \frac{N_p(x_b)}{N_p(x_b) + N_n(x_b)}    (1)

Y Y_p Y_n 14 X Y S Google N-gram

G(x, y, S) = \begin{cases} \text{true}, \\ \text{false}, \end{cases}    (2)

N_p(x_b) = \left| \{\, y \in Y_p \mid G(x, y, S) \wedge x \in \{ X \mid base(X) = X_b \} \,\} \right|    (3)

base(x) X R_{positive} Google N-gram 2- to 6-gram 15 (I)-(V) X Y X_b X Y (1)

5.4 (1) F F 1 F 0.7 921 0.3 565 6

12 http://googlejapan.blogspot.com/2007/11/n-gram.html
13 http://chasen.org/~taku/software/cabocha/
14
15 20 N-gram
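The ratio in Equation (1) can be illustrated with a minimal sketch. It assumes the co-occurrence counts N_p and N_n have already been looked up in an N-gram table. The function names and sample counts are hypothetical, and reading 0.7 and 0.3 in Section 5.4 as upper and lower cut-offs on the ratio is an assumption, not something the surviving text confirms.

```python
from typing import Optional


def positive_ratio(n_pos: int, n_neg: int) -> Optional[float]:
    """R_positive = N_p / (N_p + N_n), following Eq. (1).

    Returns None when the expression co-occurs with neither seed set,
    because the ratio is undefined in that case.
    """
    total = n_pos + n_neg
    if total == 0:
        return None
    return n_pos / total


def classify(n_pos: int, n_neg: int,
             pos_threshold: float = 0.7,
             neg_threshold: float = 0.3) -> str:
    """Label an expression from its positive ratio.

    The default thresholds are an assumed reading of the values
    0.7 and 0.3 that appear in Section 5.4.
    """
    ratio = positive_ratio(n_pos, n_neg)
    if ratio is None:
        return "unknown"
    if ratio >= pos_threshold:
        return "positive"
    if ratio <= neg_threshold:
        return "negative"
    return "neutral"


# Hypothetical counts for illustration only; not taken from the paper.
examples = {"expr_a": (80, 20), "expr_b": (10, 90), "expr_c": (50, 50)}
labels = {name: classify(p, n) for name, (p, n) in examples.items()}
print(labels)
```

Keeping the undefined case separate from the middle band avoids labeling an expression that has no N-gram evidence at all as neutral.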
5.5 50 100 16 4 5 20 50 3 2 5 1-1 0 5 0 0 0 (1) 100 (2) (3) 7 50 72% 68% 1 86% 85%

1: X,, F.

6: 1286 921 818 565

7: () P N
100 0.72 (36/50) 0.68 (34/50)
1 0.86 (36/42) 0.85 (34/40)

3 (1) (2) (3) (1) Cabocha (2) 2 (3) 2 1 N-gram ( ) 2 ( ) 6 47 40645 252692 (a-ii) 104 (e) 131 (d) (a-i) (b)

(a-ii) ( 104 ) 100
(e) ( 131 ) 30

5.6 72% 16 17 1 98776 17 3.3 1 3
4 (a) (a-i) (a-ii) 174 2695 (b) 359 (d) 63931 30804 (e) 813 6 3 1 99 120 93 5 2 (b) (c) (d) (d) (a) (e) 74% 17% 7 Google N-gram 921 70% 4 10 6 3

[1] Christiane Fellbaum, editor. WordNet: An Electronic Lexical Database. MIT Press, 1998.
[2] Vasileios Hatzivassiloglou and Kathleen R. McKeown. Predicting the semantic orientation of adjectives. In Proc. of ACL, pp. 174-181, 1997.
[3] Jian Hu, Gang Wang, Fred Lochovsky, Jian T. Sun, and Zheng Chen. Understanding user's query intent with Wikipedia. In Proc. of WWW, pp. 471-480, 2009.
[4] Kentaro Inui, Shuya Abe, Hiraku Morita, Megumi Eguchi, Asuka Sumida, Chitose Sao, Kazuo Hara, Koji Murakami, and Suguru Matsuyoshi. Experience mining: Building a large-scale database of personal experiences and opinions from web documents. In Proc. of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence, pp. 314-321, 2008.
[5] Jaap Kamps, Robert J. Mokken, Maarten Marx, and Maarten de Rijke. Using WordNet to measure semantic orientation of adjectives. In Proc. of LREC 2004, pp. 1115-1118, 2004.
[6] Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Expanding domain sentiment lexicon through double propagation. In Proc. of IJCAI-09, pp. 1199-1204, 2009.
[7] Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Opinion word expansion and target extraction through double propagation. Computational Linguistics, Vol. 37, No. 1, 2011.
[8] Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen. Building bridges for web query classification. In Proc. of SIGIR, pp. 131-138, 2006.
[9] Peter Turney. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proc. of ACL, pp. 417-424, 2002.
[10],.., Vol. 13, No. 3, pp. 201-241, 2006.
[11],. Web N..
[12],,. web., Vol. 24, No. 3, pp. 322-332, 2009.
[13],,,,.., 12.
[14],.., Vol. 49, No. 7, 2008.
[15],,.. D, Vol. J92-D, No. 3, pp. 301-310, 2008.
[16],,.. 14, pp. 584-587, 2008.
[17],.. 2004-NL-168, pp. 109-116, 2004.
[18],,.. (NL-144-11), 2001.

18
100 18 100 50
(P) (N) 5 (P 1 N -1 (E) 0 5 )
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 5
P P 4
N P 4
1 P P 4
P P 4
1 P P 4
P P 4
P P 4
P P 3
P P 3
P P 3
P P 3
P P 3
P P 2
P P 2
N P 2
P P 2
P P 1
P P 1
P P 1
N P 1
P P 1
2 P P 1
P P 1
P E 0
2 N E 0
N E 0
N E 0
4 P E 0
5 N E 0
3 N E 0
3 N E 0
3 P E 0
5 N E 0
N E 0
2 N E 0
P E 0
2 P E 0
2 N E 0
N E 0
P E 0
N E 0
1 N E 0
2 N N -1
3 N N -1
N N -1
1 P N -1
4 N N -1
N N -1
N N -1
N N -1
P N -1
N N -1
N N -2
N N -2
N N -2
N N -2
P N -2
N N -2
N N -3
P N -3
N N -3
P N -4
N N -4
P N -4
N N -4
N N -4
N N -4
P N -5
N N -5
N N -5
N N -5
N N -5
N N -5
N N -5
N N -5
N N -5
N N -5
P N -5
N N -5
N N -5
N N -5
N N -5
N N -5
N N -5
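
The agreement rates in the first row of Table 7 (36/50 and 34/50) can be re-derived from the rows of this appendix table. The sketch below is one reading of the table, not the authors' evaluation script. It assumes that the first label in each row is the automatic judgment and the second is the human judgment (P, N, or E), and the counts per label pair were tallied by hand from the rows above.

```python
from collections import Counter

# (automatic label, human label) -> number of rows, tallied by hand
# from the appendix table under the assumed column order.
pair_counts = Counter({
    ("P", "P"): 36, ("N", "P"): 3,
    ("P", "E"): 6, ("N", "E"): 13,
    ("P", "N"): 8, ("N", "N"): 34,
})


def precision(label: str) -> float:
    """Share of rows given `label` automatically that humans labeled the same."""
    total = sum(n for (auto, _), n in pair_counts.items() if auto == label)
    return pair_counts[(label, label)] / total


print(f"rows: {sum(pair_counts.values())}")
print(f"P: {precision('P'):.2f}")  # 36 of 50 rows labeled P automatically
print(f"N: {precision('N'):.2f}")  # 34 of 50 rows labeled N automatically
```

Under this reading the tallies sum to 100 rows, with 50 automatic P labels and 50 automatic N labels, which is consistent with the denominators reported in Table 7.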