2011 2012 3 26 ( : A8TB2163)
( A B [1] A B A B B i
1 Web Web ( ) ( ) ( ) 3 1. 2. 2 3. 4. 3 4 2 ( A B A B A B [2] A B A B [1] A B A B B 1
2 3 4 A B 5 6 4 A B (B) 7 2
2
2.1
[3, 4, 5, 6, 7, 8] [9, 10, 11, 1]
2.1.1
WordNet Kamps [4] WordNet (synonymy) w

\[ \text{SO score}(w) = \frac{d(w, \text{bad}) - d(w, \text{good})}{d(\text{good}, \text{bad})} \tag{2.1} \]

d(t_a, t_b) w good, bad good bad Hu [5] WordNet Kamps (antonymy) 30 WordNet WordNet Hatzivassiloglou [3] and but simple and well-received and
simplistic and well-received but and but 2 Takamura [6] ( ) ( ) WordNet healthy and delicious and SL but but DL

\[ w_{ij} = \begin{cases} \dfrac{1}{\sqrt{d(i)\,d(j)}} & (l_{ij} \in SL) \\[4pt] -\dfrac{1}{\sqrt{d(i)\,d(j)}} & (l_{ij} \in DL) \\[4pt] 0 & \text{otherwise} \end{cases} \tag{2.2} \]

l_{ij} ij d(i) i Hassan [7]
2.1.2
( good,excellent) ( bad,poor) Turney [9] SO-PMI (Semantic Orientation Pointwise Mutual Information)

\[ \text{SO-PMI}(w) = \mathrm{PMI}(w, \text{excellent}) - \mathrm{PMI}(w, \text{poor}) \tag{2.3} \]

PMI word1 word2
\[ \mathrm{PMI}(word_1, word_2) = \log \frac{P(word_1, word_2)}{P(word_1)\,P(word_2)} \tag{2.4} \]

10 excellent SO-PMI poor SO-PMI WordNet Kaji [10] ( ) SO-PMI [1] Turney De Saeger [12]
2.2, Turney WordNet [13] [14] 3 Wilson [15] [16]
Web Lu [17] and but
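The WordNet distance score of equation (2.1) in Section 2.1.1 can be illustrated with a short sketch. It assumes that d(a, b) is the shortest-path length between two words in an undirected synonym graph. The graph `TOY_GRAPH` and the function names are hypothetical and stand in for real WordNet data, so this is an illustration of the formula rather than the implementation used by Kamps [4].

```python
from collections import deque


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the hop count or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def so_score(graph, word, positive="good", negative="bad"):
    """Equation (2.1): (d(w, bad) - d(w, good)) / d(good, bad)."""
    d_neg = shortest_path_length(graph, word, negative)
    d_pos = shortest_path_length(graph, word, positive)
    d_ref = shortest_path_length(graph, positive, negative)
    if d_neg is None or d_pos is None or not d_ref:
        return None  # word not connected to both reference words
    return (d_neg - d_pos) / d_ref


# Hypothetical toy synonym graph (assumption, not WordNet data).
TOY_GRAPH = {
    "good": ["fine", "sound"],
    "sound": ["good"],
    "fine": ["good", "okay"],
    "okay": ["fine", "mediocre"],
    "mediocre": ["okay", "poor"],
    "poor": ["mediocre", "bad"],
    "bad": ["poor"],
}

if __name__ == "__main__":
    for w in ("fine", "okay", "poor"):
        print(w, so_score(TOY_GRAPH, w))
```

A word closer to "good" than to "bad" receives a positive score, and the division by d(good, bad) keeps the score within [-1, 1] on a connected graph.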
3 [1] A B A B 3.1 3.2 3 1. ( ) 2. ( ) 3. ( ) ( ) 1. 2. 3. plus minus 3.3 A B A B (B) (A) A B B 7
4 4.1: A B [1] A B A B 4.1 A B [1] 8
4.1.1
4.1
4.1.2
3 1. 2. 2. ( ) ( )
4.1.3
4 3. (0) 4. (100 ) 5. (50) 6. (2) 3. (PMI)

\[ \mathrm{PMI}(word_1, word_2) = \log \frac{P(word_1, word_2)}{P(word_1)\,P(word_2)} \tag{4.1} \]
4. ( ) 5. 6. 4.1.4 (SVM) 3 2 4.2 A B A B A B 4.2 A B A B A B 10
4.2: A B 11
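The PMI feature of equation (4.1) can be estimated from occurrence counts. The sketch below assumes that probabilities are estimated as relative frequencies over `total` units (for example documents or sentences) and that the logarithm is natural. The function name, its parameters and the example counts are hypothetical and do not come from the experiments.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Equation (4.1): log( P(x, y) / (P(x) P(y)) ).

    With P(.) = count / total, the ratio simplifies to
    count_xy * total / (count_x * count_y).
    Returns None when any count is zero, since PMI is undefined there.
    """
    if min(count_xy, count_x, count_y, total) <= 0:
        return None
    return math.log(count_xy * total / (count_x * count_y))


if __name__ == "__main__":
    # Hypothetical counts: the pair co-occurs twice as often as chance predicts.
    print(pmi(count_xy=20, count_x=100, count_y=100, total=1000))
```

A value above zero means the two words co-occur more often than independence would predict, and a value below zero means less often.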
5
5.1
( ) 6330 15636 3459 TSUBAKI [18] 1 1 2 1 A B 10 247 86 492 A B 10 200 224 92 621 10 117 174 838
5.2
Precision( ) Recall( ) F1

Precision = (5.1)
Recall = (5.2)

\[ F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5.3} \]

( )
5.1: A B 10

A B     A B
3.16    4.91
2.81    4.75
2.60    4.17
2.38    3.81
2.34    3.64
2.19    3.57
2.15    3.43
2.14    2.95
2.11    2.94
2.06    2.40

5.3
5.3.1
A B 5.2 A B A B 5.1 ( ) A B 10 A B A B A B A B Recall Precision Li [19]
5.3.2
5.3 A B A B 0.2
5.2: A B

             (P     R     F1)     (P     R     F1)     (P     R     F1)
A B ( )      .375   .621  .468    .415   .591  .487    .305   .709  .427
(A B)        .201   .327  .249    .290   .329  .306    .109   .337  .165
             .386   .804  .522    .315   .768  .447    .530   .851  .653
( )          .146   .363  .209    .164   .362  .225    .129   .366  .190

5.3: A B

             (P     R     F1)     (P     R     F1)     (P     R     F1)
( )          .375   .621  .468    .415   .591  .487    .305   .709  .427
( )          .201   .327  .249    .290   .329  .306    .109   .337  .165
( )          .241   .516  .328    .289   .473  .359    .183   .516  .285
( )          .163   .322  .217    .220   .316  .260    .102   .337  .157

A B B
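The F1 values in Tables 5.2 and 5.3 follow from equation (5.3). The sketch below recomputes F1 from a precision and recall pair. The function name is hypothetical, and returning 0.0 when both inputs are zero is an assumed convention that the text does not state.

```python
def f1_score(precision, recall):
    """Equation (5.3): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (P, R) pairs taken from the first row of Table 5.2.
    for p, r in [(0.375, 0.621), (0.415, 0.591), (0.305, 0.709)]:
        print(f"P={p:.3f} R={r:.3f} F1={f1_score(p, r):.3f}")
```

Because the tabulated precision and recall are already rounded to three digits, a recomputed F1 can differ from the reported value in the last digit.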
6 [B] 6.1: B A B 200 B A B 6.1 A Cpp A B B 3 1. B 2. B 3. B A 15
6.1 B B A A A B 6.1 Cpp Cnn

\[ \frac{C_{pp}}{C_{p}} + \frac{C_{nn}}{C_{n}} \tag{6.1} \]

Cpp, Cnn 3 15 6.1

6.1: B

6.2 B A B B 6.1 C*p C*n C p C or C n C (6.2) C*p, C*n 5 15 6.3 A A
B ( )

6.2: B B ( )

6.3 B A A / /, B 6.1 Cpn Cnp

\[ \frac{C_{pn}}{C_{p}} + \frac{C_{np}}{C_{n}} \tag{6.3} \]

Cpn, Cnp 3 15 6.3

6.3: A
7 A B B A B A B A B A B 18
[1].. 2008.
[2]. a b. 2005.
[3] V. Hatzivassiloglou and K.R. McKeown. Predicting the semantic orientation of adjectives. In Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics, pp. 174–181. Association for Computational Linguistics, 1997.
[4] J. Kamps, M.J. Marx, R.J. Mokken, and M. De Rijke. Using WordNet to measure semantic orientations of adjectives. 2004.
[5] M. Hu and B. Liu. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 168–177. ACM, 2004.
[6] H. Takamura, T. Inui, and M. Okumura. Extracting semantic orientations of words using spin model. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pp. 133–140. Association for Computational Linguistics, 2005.
[7] A. Hassan and D. Radev. Identifying text polarity using random walks. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pp. 395–403. Association for Computational Linguistics, 2010.
[8] L. Velikovich, S. Blair-Goldensohn, K. Hannan, and R. McDonald. The viability of web-derived polarity lexicons. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 777–785. Association for Computational Linguistics, 2010.
[9] P.D. Turney. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 417–424. Association for Computational Linguistics, 2002.
[10] N. Kaji and M. Kitsuregawa. Building lexicon for sentiment analysis from massive collection of HTML documents. In Proceedings of the joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL), pp. 1075–1083, 2007.
[11] H. Kanayama and T. Nasukawa. Fully automatic lexicon expansion for domain-oriented sentiment analysis. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, pp. 355–363. Association for Computational Linguistics, 2006.
[12] S. De Saeger, K. Torisawa, and J. Kazama. Looking for trouble. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1, pp. 185–192. Association for Computational Linguistics, 2008.
[13],. ( ).. D,, Vol. 93, No. 9, pp. 1778–1789, 2010-09-01.
[14],,,,.. = Journal of natural language processing, Vol. 12, No. 3, pp. 203–222, 2005-07-10.
[15] T. Wilson, J. Wiebe, and P. Hoffmann. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pp. 347–354. Association for Computational Linguistics, 2005.
[16],,.., 2005.
[17] Y. Lu, M. Castellanos, U. Dayal, and C.X. Zhai. Automatic construction of a context-aware sentiment lexicon: an optimization approach. In Proceedings of the 20th international conference on World wide web, pp. 347–356. ACM, 2011.
[18] K. Shinzato, T. Shibata, D. Kawahara, C. Hashimoto, and S. Kurohashi. TSUBAKI: An open search engine infrastructure for developing new information access methodology. In Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP2008), pp. 189–196, 2008.
[19] S. Li, Z. Wang, G. Zhou, and S.Y.M. Lee. Semi-supervised learning for imbalanced sentiment classification. In Proceedings of IJCAI-2011, 2011.