Web Web Web Web Personal Name Disambiguation in Web Search Results Using a Semi-Supervised Clustering Approach Kazunari Sugiyama and Manabu Okumura Personal names are often submitted to search engines as query keywords. However, in response to a personal name query, search engines return a long list of search results that contains Web pages about several namesakes. In order to address this problem, most of the previous works that disambiguate personal names in Web search results often employ agglomerative clustering approaches. In contrast, we have adopted a semi-supervised clustering approach to integrate similar documents into a seed document. Our proposed semi-supervised clustering approach is novel in that it controls the fluctuation of the centroid of a cluster. Key Words: Web information retrieval, Semi-supervised clustering, Personal name disambiguation 1 ALLTheWeb 1 1 2 Web, Department of Computer Science, National University of Singapore, Precision and Intelligence Laboratory, Tokyo Institute of Technology 1 http://www.alltheweb.com/ 2 http://tap.stanford.edu/peoplesearch.pdf
Vol. 16 No. 5 October 2009 Google 3 William Cohen Web Web (Mann and Yarowsky 2003) (Pedersen, Purandare, and Kulkarni 2005) (Bekkerman, El-Yaniv, and McCallum 2005) (Bollegala, Matsuo, and Ishizuka 2006) Web seed seed Web (1) (2) Wagstaff (Wagstaff and Cardie 2000) (Wagstaff, Rogers, and Schroedl 2001) K-means must-link 2 cannot-link 2 2 Basu (Basu, Banerjee, and Mooney 2002) K-means Klein (Klein, Kamvar, and Manning 2002) 2 (x i,x j ) 0 2 (max i,j D ij )+1 Xing (Xing, Ng, Jordan, and Russell 2003) Bar-Hillel (Bar-Hillel, Hertz, and Shental 2003) RCA (Relevant Component Analysis) (Shental, Hertz, Weinshall, and Pavel 2002) seed 2 3 3 http://www.google.com/ 24
Web 4 2 1 K K-means (MacQueen 1967) Web (Wagstaff and Cardie 2000) (Wagstaff et al. 2001) (Basu et al. 2002) (Klein et al. 2002) (Xing et al. 2003) (Bar-Hillel et al. 2003) seed seed (1) seed seed (2) Web seed W p Web p i w pi (i =1,,n) (1) w pi =(w pi t 1,w pi t 2,,w pi t m ) (1) m W p t k (k =1, 2,,m) (a) Term Frequency (TF) (b) Inverse Document Frequency (IDF) (c) residual IDF (RIDF) (d) TF-IDF (e) x I -measure (f) gain 6 25
Vol. 16 No. 5 October 2009 (a) Term Frequency (TF) TF tf(t k,p i ) Web p i t k w pi w pi t k (2) tf(t w pi k,p i ) t k = m s=1 tf(t s,p i ) (b) Inverse Document Frequency (IDF) (Jones 1973) IDF w pi w pi t k (3) N w pi t k =log df (t k ) N Web df (t k ) t k Web (c) Residual Inverse Document Frequency (RIDF) Church and Gale (Church and Gale 1995a, 1995b) IDF residual IDF IDF IDF cf k t k N Web 1 Web t k λ k = cf k N wpi w pi t k (4) 1 w pi t k = IDF log 1 p(0; λ i ) N =log df (t k ) +log(1 p(0; λ k)) (4) p λ k RIDF (d) TF-IDF TF-IDF (Salton and McGill 1983) TF-IDF (a) TF (b) IDF (5) w pi t k = tf(t k,p i ) m s=1 tf(t s,p i ) log N df (t k ) (2) (3) (5) 26
Web tf(t k,p i ) df (t k ) Web p i t k t k Web N Web (e) x I -measure Bookstein and Swanson (Bookstein and Swanson 1974) t k x I -measure tf(t k,p i ) Web p i t k df (t k ) t k Web w pi w pi t k (6) w pi t k = tf(t k,p i ) df (t k ) (6) 2 (f) gain IDF Papineni (Papineni 2001) IDF gain w pi w pi t k (7) w pi t k = df (t ( k) df (tk ) N N 1 log df (t ) k) (7) N df (t k ) t k Web N Web (a) (f) (f) gain C G C (8) G C =(g C t 1,g C t 2,,g C t m ) (8) gt C k G C t k (k =1, 2,,m) 2 C i C j sim(c i,c j ) (9) G sim(c i,c j )= GCi Cj G Ci G Cj (9) G Ci G Cj C i C j 2.1 Web 27
Vol. 16 No. 5 October 2009 1 C i ( n i ) C j ( n j ) C new G new (10) w G new = p C i w p + w p C j w p (10) n i + n j 2.2 seed C sj seed C i C i ( G Ci ) seed C sj ( G C ) D(G Ci, G C ) Web p w p C i k j seed C s (kj) j ( n sj ) C i ( n i ) k j +1 1 28
Web C (0) (1) C (kj) seed C i C s (kj) j j G C(k ) C i G Ci D(G Ci, G C(k j ) ) C i Web w p l C i (l =1,,n i ) C i ( n i ) C i Web w p l C i (11) w p l C = w pl C i i D(G Ci, G C(k j ) )+c (11) D(G Ci, G C(k j ) ) (i) (ii) (iii) c D(G Ci, G C(k j ) ) 0 w p c 3.3.1 (2) seed C s (kj) j ( n sj ) C i ( n i ) C s (kj+1) j ( n sj + n i ) C (kj+1) = {w p1 C (k j ),, w pnsj C (k j ), w p1 C i,, wpn i C i } (3) k j +1 C (kj+1) G C(k j +1) (12) (11) w p l C i n i G C(k j +1) = w p C (k j +1) n sj + n i 1 w p D(G C i,g C(k j ) )+c (12) seed 2 G new (13) w G new = p C i w p + w p C j w p (13) n i + n j seed seed Web seed 29
Vol. 16 No. 5 October 2009 seed (Rocchio 1971) seed Clusty 4 Web seed seed 2 Web seed seed 2 7 else if (11) (i) (ii) (iii) 3 (i) (11) seed G Cs C G C D(G Cs, G C ) (8) (14) D(G Cs, G C )= m (gt Cs k gt C k ) 2 (14) k=1 (ii) (11) seed C s G Cs C G C D(G Cs, G C ) (15) D(G C(s), G C )= (G Cs G C ) T Σ 1 (G Cs G C ) (15) 4 http://clusty.com 30
Web 2 31
Vol. 16 No. 5 October 2009 Σ seed C s C s C s = {w p1 C s, w p2 C s,, w pm C s } G Cs G Cs = 1 m m i=1 w pi C s Σ ij (16) Σ ij = 1 m m i=1 (w pi C s G Cs )(w pj C s G Cs ) T (16) Σ Σ 11 Σ 12 Σ 1m Σ 21 Σ 22 Σ 2m Σ =...... Σ m1 Σ m2 Σ mm (iii) (ii) seed C sj Web C sj G C C l G C l (Diday and Govaert 1977) (1) C sj Web w pi v (w pi v) d sj (w pi, v) (17) d sj (w pi, v) =(w pi v) T M 1 (w pi v) (17) M sj C sj C sj C sj = {w p1 C sj, w p2 C sj,, w pm C sj } G C 32
Web G C = 1 m m i=1 w pi C sj M ij (18) M ij = 1 m m i=1 (w pi C sj G Cs p j )(w j C sj G C ) T (18) M sj M 11 M 12 M 1m M 21 M 22 M 2m M sj =........ M m1 M m2 M mm (2) Δ sj (v, M sj )= w p i C sj d sj (w pi, v) = (w pi v) T M 1 (w pi v) w p i C sj C sj L sj S sj (i) C sj M sj Δ sj L sj L sj =argmin v w p i C sj (w pi v) T M 1 (w pi v) (19) (19) C sj G G v G L sj = v G (ii) (i) L sj = v G det(m sj )=1 Δ sj S sj S sj =argmin M sj w p i C sj (w pi v G ) T M 1 (w pi v G ) (20) S sj C sj M sj (21) (Diday and Govaert 1977) 33
Vol. 16 No. 5 October 2009 S sj =(det(m sj )) 1/m M 1 (21) det(m sj ) 0 m seed C sj Web S sj S sj C sj G C Cl G C l (22) D(G C, G C l )= (G C G C l ) T S 1 (G C G C l ) (22) (22) (1) (2) C sj Web (21) S sj (15) 3 3.1 Web People Search Task (Artiles, Gonzalo, and Sekine 2007) WePS WePS 49 30 79 Yahoo! 5 API 100 7,900 Web 1 Web 6 Porter Stemmer (Porter 1980) 7 WePS WePS 3.2 purity inverse purity F (Hotho, Nürnberger, and Paaß 2005) Web People Search Task purity C 5 http://www.yahoo.com/ 6 ftp://ftp.cs.cornell.edu/pub/smart/english.stop 7 http://www.tartarus.org/ martin/porterstemmer/ 34
Web 1 WePS (A) (A) 1 Wikipedia 7 23.1 ECDL 06 10 15.3 32 5.90 49 10.8 Wikipedia 10 56.5 ACL 06 10 31.0 10 50.3 30 45.9 *ECDL: European Conference on Digital Libraries, ACL: Association for Computational Linguistics L n purity (23) Purity = i C i n max P recision(c i,l j ) (23) L j C i P recision(c i,l j ) (24) P recision(c i,l j )= C i Lj (24) C i inverse purity inverse purity (25) InversePurity = j L j n max Recall(C i,l j ) (25) L j C i Recall(C i,l j ) (26) Recall(C i,l j )= C i Lj (26) L j purity inverse purity F (27) 1 F = α 1 Purity +(1 α) 1 InverseP urity (27) 35
Vol. 16 No. 5 October 2009 α =0.5 0.2 α =0.5 0.2 F F 0.5 F 0.2 3.3 2 seed (a) Wikipedia (Remy 2002) (b) Web Web 3.3.1 c seed C sj C i 2 (11) C i Web w p l C i (l =1,,n i ) (12) (11) c D(G Ci, G C ) 0 w p WePS 2 seed (a) (b) 7 seed 0.1 c 50 seed 7 seed 2 4 c F 0.5 F 0.2 3.3.3 seed 2 4 c WePS 2 c Seed page c F 0.5 F 0.2 Seed page c F 0.5 F 0.2 1 Wikipedia article 0.94 0.63 0.61 1 Web page 0.95 0.61 0.58 2 Wikipedia articles 0.93 0.65 0.64 2 Web pages 0.94 0.63 0.61 3 Wikipedia articles 0.95 0.66 0.66 3 Web pages 0.96 0.64 0.65 4 Wikipedia articles 0.97 0.67 0.68 4 Web pages 0.93 0.64 0.66 5 Wikipedia articles 0.98 0.68 0.69 5 Web pages 0.97 0.66 0.67 6 Wikipedia articles 0.96 0.67 0.70 6 Web pages 0.94 0.64 0.67 7 Wikipedia articles 0.97 0.66 0.68 7 Web pages 0.96 0.62 0.65 36
Web 3 c Seed page c F 0.5 F 0.2 Seed page c F 0.5 F 0.2 1 Wikipedia article 0.95 0.64 0.61 1 Web page 0.93 0.62 0.59 2 Wikipedia articles 0.94 0.66 0.63 2 Web pages 0.96 0.63 0.61 3 Wikipedia articles 0.96 0.68 0.66 3 Web pages 0.95 0.66 0.64 4 Wikipedia articles 0.98 0.70 0.68 4 Web pages 0.95 0.67 0.68 5 Wikipedia articles 0.97 0.71 0.70 5 Web pages 0.94 0.69 0.70 6 Wikipedia articles 0.95 0.69 0.71 6 Web pages 0.93 0.68 0.69 7 Wikipedia articles 0.96 0.67 0.70 7 Web pages 0.95 0.66 0.68 4 c Seed page c F 0.5 F 0.2 Seed page c F 0.5 F 0.2 1 Wikipedia article 0.97 0.66 0.67 1 Web page 0.96 0.65 0.63 2 Wikipedia articles 0.95 0.68 0.68 2 Web pages 0.98 0.66 0.65 3 Wikipedia articles 0.94 0.70 0.71 3 Web pages 0.93 0.68 0.69 4 Wikipedia articles 0.96 0.72 0.73 4 Web pages 0.95 0.70 0.72 5 Wikipedia articles 0.98 0.75 0.74 5 Web pages 0.98 0.73 0.73 6 Wikipedia articles 0.97 0.73 0.72 6 Web pages 0.96 0.71 0.72 7 Wikipedia articles 0.95 0.71 0.70 7 Web pages 0.97 0.69 0.71 5 Purity Inverse purity F 0.5 F 0.2 0.67 0.48 0.52 0.49 3.3.2 (1) 5 (2) seed seed 3.3 2 seed (a) Wikipedia (b) 1 Web 3.1 WePS Wikipedia Wikipedia seed Web 1 Web WePS 30 16 Wikipedia 14 1 Web seed 37
Vol. 16 No. 5 October 2009 Wikipedia Bunescu (Bunescu and Pasca 2006) Wikipedia 6 seed seed F (F 0.5 =0.68 F 0.2 =0.66) seed 3.3.1 seed seed 7 2 seed cannot-link Web 3 4, Wikipedia 7 Web (F ) 2 seed 1 Klein (Klein et al. 2002) Xing (Xing et al. 2003) Bar-Hillel (Bar-Hillel et al. 2003) seed 3.3.3 3.3.2 Web seed Web seed 3 5 6 1 seed Distance measure Seed page* Purity Inverse purity F 0.5 F 0.2 (i) Euclidean distance (a) 0.47 0.82 0.60 0.62 (b) 0.49 0.77 0.58 0.61 (ii) Mahalanobis distance (a) 0.50 0.85 0.62 0.64 (b) 0.53 0.75 0.64 0.65 (iii) Adapative Mahalanobis distance (a) 0.57 0.88 0.68 0.66 (b) 0.55 0.76 0.65 0.64 *(a) and (b) in Seed page denote Wikipedia article and top-ranked Web page, respectively. 38
Web 3 seed (7 Wikipedia ) 4 seed ( 7 Web ) Wikipedia (F 0.5 =0.76,F 0.2 =0.74) (i) seed Web (ii) (i) WePS F seed Web 5 (ii) 39
Vol. 16 No. 5 October 2009 5 3 5 seed (Wikipedia ), seed Web ( w s, ) 6 3 5 seed (Wikipedia ) seed ( w s, ) WePS F seed 6 (i) (ii) 7 3.3.4 Web People Search Task 3 (F ) 7 7 40
Web 7 Web People Search Task 3 Team-ID Purity Inverse purity F 0.5 F 0.2 CU COMSEM (Chen and Martin 2007) 0.72 0.88 0.78 0.83 IRST-BP (Popescu and Magnini 2007) 0.75 0.80 0.75 0.77 PSNUS (Elmacioglu, Tan, Yan, Kan, and Lee 2007) 0.73 0.82 0.75 0.78 Our proposed method (with adaptive Mahalanobis distance) Using full text (Sec. 3.3.2) 1 Wikipedia article 0.58 0.84 0.68 0.66 1 Web page 0.60 0.76 0.65 0.64 5 Wikipedia articles 0.75 0.78 0.76 0.74 5 Web pages 0.77 0.69 0.72 0.73 Using fragments (Sec. 3.3.3) (i) 2 and 3 sentences in 5 Wikipedia seed pages and a search result Web page, respectively 0.80 0.83 0.81 0.82 (ii) Snippet and 3 sentences in 5 Wikipedia seed pages 0.70 0.62 0.66 0.68 3.4 3.3.2 (11) seed Web 7 Wikipedia 7 Web seed 3.3.2 PC (CPU: Intel Pentium M 2.0 GHz Memory: 2 GByte OS: Windows XP) Perl 7 3.5 (11) c 2 4 c =0.95 5 c 50 (11) c Web 41
Vol. 16 No. 5 October 2009 7 seed 5 purity (0.67) inverse purity (0.48) purity F F 0.5 =0.52 F 0.2 =0.49 2 6 purity (0.47 0.57) 5 purity (0.67) inverse purity (0.75 0.88) (0.48) inverse purity F seed seed 6 seed Wikipedia F (F 0.5 =0.68 F 0.2 =0.66) seed 3 4 seed (F ) seed 7 seed 5 Web Web purity 42
Web Bar-Hillel (Bar-Hillel et al. 2003) Xing (Xing et al. 2003) Klein (Klein et al. 2002) 1 Klein 2 (x i,x j ) 0 2 (max i,j D ij )+1 Xing Bar-Hillel seed seed seed Wikipedia Web Web seed Wikipedia WePS 3.3.3 (i) seed Web 5 Web 5 seed Web 2 3 F (F 0.5 =0.79 F 0.2 =0.80) WePS [purity:0.80 inverse purity:0.83, F 0.5 =0.81 F 0.2 =0.82] F α =0.5 7 Web People Search Task (Artiles et al. 2007) 1 (CU COMSEM) 0.03 3.3.2 (2) Wikipedia 16 Wikipedia 10 ( 1 (A) ) ACL 06 Wikipedia 6 1 (B) Wikipedia seed 8 (A) (B) 0.02 0.04 seed Wikipedia (B) Web Wikipedia 43
Vol. 16 No. 5 October 2009 seed seed Web 7 inverse purity Web People Search Task 3 seed 5 2 gain Web People Search Task 3 3 9 7 seed Wikipedia 5 CU COMSEM 7 F (F 0.5 =0.78 F 0.2 =0.83) F (F 0.5 =0.81 F 0.2 =0.84) F 0.5 0.03 F 0.2 0.01 URL IRST-BP 7 F (F 0.5 =0.75 F 0.2 =0.77) (F 0.5 =0.76 F 0.2 =0.81) F 0.5 0.01 F 0.2 0.04 8 Wikipedia seed (A) Wikipedia 10 (B) Wikipedia ACL 06 6 F 0.5 F 0.2 F 0.5 F 0.2 (A) 0.65 0.75 0.72 0.77 (B) 0.63 0.71 0.69 0.74 9 Web People Search Task 3 Team-ID Purity Inverse purity F 0.5 F 0.2 CU COMSEM 0.70 0.92 0.81 0.84 IRST-BP 0.69 0.82 0.76 0.81 PSNUS 0.67 0.84 0.78 0.82 44
Web PSNUS NE TF-IDF 7 F (F 0.5 =0.75 F 0.2 =0.78) F F 0.5 =0.78 F 0.2 =0.82 F 0.5 0.03 F 0.2 0.04 gain 7 F 0.5 =0.81 F 0.2 =0.82 F CU COMSEM F gain Web 5 F (F 0.5 =0.52 F 0.2 =0.49) F 0.5 0.29 F 0.2 0.33 WePS 3.3.3 (ii) seed 6 seed 6 seed 3 F (F 0.5 =0.64 F 0.2 =0.67) WePS [purity:0.70 inverse purity:0.62 F 0.5 =0.66 F 0.2 =0.68] Web People Search Task 3 seed 3 Web seed Wikipedia seed 2 Web 3 (22) 2 1 100 Web 5 seed seed Web 0.8 7 45
Vol. 16 No. 5 October 2009 4 Web seed [purity:0.80 inverse purity:0.83 F 0.5 :0.81 F 0.2 :0.82] seed cannot-link seed seed Web seed Web Web seed seed Artiles, J., Gonzalo, J., and Sekine, S. (2007). The SemEval-2007 WePS Evaluation: Establishing a Benchmark for the Web People Search Task. In Proc. of the Semeval 2007, Association for Computational Linguistics (ACL), pp. 64 69. Bar-Hillel, A., Hertz, T., and Shental, N. (2003). Learning Distance Functions Using Equivalence Relations. In Proc. of the 20th International Conference on Machine Learning (ICML 2003), pp. 577 584. Basu, S., Banerjee, A., and Mooney, R. (2002). Semi-supervised Clustering by Seeding. In Proc. of the 19th International Conference on Machine Learning (ICML 2002), pp. 27 34. Bekkerman, R., El-Yaniv, R., and McCallum, A. (2005). Multi-way Distributional Clustering via Pairwise Interactions. In Proc. of the 22nd International Conference on Machine Learning (ICML2005), pp. 41 48. Bollegala, D., Matsuo, Y., and Ishizuka, M. (2006). Extracting Key Phrases to Disambiguate Personal Names on the Web. In Proc. of the 7th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2006), pp. 223 234. 46
Web Bookstein, A. and Swanson, D. R. (1974). Probabilistic Models for Automatic Indexing. Journal of the American Society for Information Science, 25 (5), pp. 312 318. Bunescu, R. and Pasca, M. (2006). Using Encyclopedic Knowledge for Named Entity Disambiguation. In Proc. of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2006), pp. 9 16. Chen, Y. and Martin, J. (2007). CU-COMSEM: Exploring Rich Features for Unsupervised Web Personal Name Disambiguation. In Proc. of the Semeval 2007, Association for Computational Linguistics (ACL), pp. 125 128. Church, K. W. and Gale, W. A. (1995a). Inverse Document Frequency (IDF): A Measure of Deviation from Poisson. In Proc. of the 3rd Workshop on Very Large Corpora, pp. 121 130. Church, K. W. and Gale, W. A. (1995b). Poisson Mixtures. Journal of Natural Language Engineering, 1 (2), pp. 163 190. Diday, E. and Govaert, G. (1977). Classification Automatique Avec Distances Adaptatives. R.A.I.R.O. Informatique Computer Science, 11 (4), pp. 329 349. Elmacioglu, E., Tan, Y. F., Yan, S., Kan, M.-Y., and Lee, D. (2007). PSNUS: Web People Name Disambiguation by Simple Clustering with Rich Features. In Proc.oftheSemeval 2007, Association for Computational Linguistics (ACL), pp. 268 271. Hotho, A., Nürnberger, A., and Paaß, G. (2005). A Brief Survey of Text Mining. GLDV-Journal for Computational Linguistics and Language Technology, 20 (1), pp. 19 62. Jones, K. S. (1973). Index Term Weighting. Information Strage and Retrieval, 9 (11), pp. 619 633. Klein, D., Kamvar, S. D., and Manning, C. D. (2002). From Instance-level Constraints to Spacelevel Constraints: Making the Most of Prior Knowledge in Data Clustering. In Proc. of the 19th International Conference on Machine Learning (ICML 2002), pp. 307 314. MacQueen, J. (1967). Some Methods for Classification and Analysis of Multivariate Observations. In Proc. of the 5th Berkeley Symposium on Mathmatical Statistics and Probability, pp. 281 297. Mann, G. S. and Yarowsky, D. (2003). Unsupervised Personal Name Disambiguation. In Proc. of the 7th Conference on Natural Language Learning (CoNLL-2003), pp. 33 40. Papineni, K. (2001). Why Inverse Document Frequency? In Proc. of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2001), pp. 25 32. Pedersen, T., Purandare, A., and Kulkarni, A. (2005). Name Discrimination by Clustering Similar Contexts. In Proc. of the 6th International Conference on Computational Linguistics 47
Vol. 16 No. 5 October 2009 and Intelligent Text Processing (CICLing 2005), pp. 226 237. Popescu, O. and Magnini, B. (2007). IRST-BP: Web People Search Using Name Entities. In Proc. of the Semeval 2007, Association for Computational Linguistics (ACL), pp. 195 198. Porter, M. F. (1980). An Algorithm for Suffix Stripping. Program, 14 (3), pp. 130 137. Remy, M. (2002). Wikipedia: The Free Encyclopedia. Online Information Review, 26 (6), p. 434. Rocchio, J. (1971). Relevance Feedback in Information Retrieval. In Salton, G. (Ed.), The Smart Retrieval System: Experiments in Automatic Document Processing, pp. 313 323. Prentice-Hall, Englewood Cliffs, NJ. Salton, G. and McGill, M. J. (1983). Introduction to Modern Information Retrieval. McGraw-Hill. Shental, N., Hertz, T., Weinshall, D., and Pavel, M. (2002). Adjustment Learning and Relevant Component Analysis. In Proc. of the 7th European Conference on Computer Vision (ECCV 2002), pp. 776 792. Wagstaff, K. and Cardie, C. (2000). Clustering with Instance-level Constraints. In Proc.ofthe 17th International Conference on Machine Learning (ICML 2000), pp. 1103 1110. Wagstaff, K., Rogers, S., and Schroedl, S. (2001). Constrained K-means Clustering with Background Knowledge. In Proc. of the 18th International Conference on Machine Learning (ICML 2001), pp. 577 584. Xing, E. P., Ng, A. Y., Jordan, M. I., and Russell, S. J. (2003). Distance Metric Learning with Application to Clustering with Side-Information. Advances in Neural Information Processing Systems, 15, pp. 521 528. 1998 2000 KDDI 2004 2006 2009 IEEE ACM AAAI 1984 1989 1992 2000 2009 48
Web AAAI ACL 2009 2 6 2009 7 16 2009 8 19 49