HotMiner: Discovering Hot Topics from Dirty Text
Malu Castellanos
Overview
Introduction
Introduction Hewlett-Packard 2 Search Case case
Introduction Search Log Postfiltering
Introduction Case Log
Part I: Mining Search Logs SearchView
Preprocessing HP 3 Usage) DoSearch ReviewResultsDoc DoSearch Usage,Query Web search view
Preprocessing V) Wij i 2 k-means
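The slide names k-means as the clustering step applied to the weighted document vectors. The following is a minimal sketch of Lloyd's k-means on small dense vectors, not the implementation used in HotMiner. The function name `kmeans`, the Euclidean distance, and the deterministic initialization from the first k points are assumptions made for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20):
    """Cluster dense vectors with Lloyd's algorithm.

    Assumption: the first k points seed the centroids, which keeps the
    sketch deterministic. Returns (assignments, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
    return assignments, centroids
```

Calling `kmeans(vectors, k)` on a list of equal-length weight vectors returns one cluster index per vector. A production run would more likely use random restarts and sparse vectors.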
Postfiltering SearchView: HTML
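Postfiltering Probability Distribution
\[ w_{ij} = \frac{tf_{ij}\,\log(n/df_j)}{\sqrt{\sum_{k}\left[\,tf_{ik}\,\log(n/df_k)\,\right]^{2}}} \]
w_ij: TF-IDF weight of term j in document i; n: number of documents; tf_ij: frequency of term j in document i; df_j: number of documents containing term j; k: index running over the terms of document i
Flexibility to augment K
di, dj, AVG_SIM(di), AVG_SIM(Ck)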
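As a rough illustration of the weighting on this slide, the sketch below computes length-normalized TF-IDF vectors, with raw weight tf_ij * log(n / df_j) divided by the vector's Euclidean norm, and the cosine similarity between two such vectors. The function names `tfidf_vectors` and `cosine` and the tokenized-list input format are assumptions for the example, not names from HotMiner.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Each raw weight is tf * log(n / df); the vector is then divided by
    its Euclidean norm so that every non-zero vector has length 1.
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        raw = {t: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in raw.values()))
        # A document made only of terms found in every document has
        # norm 0; its weights stay at 0.0 instead of dividing by zero.
        vectors.append({t: (w / norm if norm else 0.0) for t, w in raw.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two already length-normalized sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())
```

With queries such as `["shared", "memory"]` and `["sendmail", "unknown"]` in the same collection, the two vectors share no term, so their cosine is zero, while queries that share terms score above zero.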
Postfiltering AVG_SIM(di), S(Ck), Z-score of AVG_SIM(di):
\[ Z(\mathrm{AVG\_SIM}(d_i)) = \frac{\mathrm{AVG\_SIM}(d_i) - \mathrm{AVG\_SIM}(C_k)}{S(C_k)} \]
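The standardization Z(AVG_SIM(di)) = [AVG_SIM(di) - AVG_SIM(Ck)] / S(Ck) can be sketched as follows. Several details here are assumptions, since the slide gives only the formula: AVG_SIM(di) is taken as the mean similarity of di to the other members of its cluster, AVG_SIM(Ck) as the mean of those values, S(Ck) as their population standard deviation, and members whose Z-score falls below a hypothetical threshold of -1.0 are dropped.

```python
import math


def postfilter(sim, threshold=-1.0):
    """Filter cluster members by the Z-score of their average similarity.

    sim is a symmetric matrix of pairwise similarities among the members
    of one cluster. Returns (kept_indices, z_scores). The threshold is an
    illustrative assumption, not a value taken from the slides.
    """
    m = len(sim)
    if m < 2:
        return list(range(m)), [0.0] * m
    # AVG_SIM(di): mean similarity of member i to every other member.
    avg_sim = [
        sum(sim[i][j] for j in range(m) if j != i) / (m - 1) for i in range(m)
    ]
    # AVG_SIM(Ck) and S(Ck): mean and population standard deviation.
    cluster_avg = sum(avg_sim) / m
    spread = math.sqrt(sum((a - cluster_avg) ** 2 for a in avg_sim) / m)
    if spread == 0:
        return list(range(m)), [0.0] * m
    z = [(a - cluster_avg) / spread for a in avg_sim]
    kept = [i for i, score in enumerate(z) if score >= threshold]
    return kept, z
```

A member that is only weakly similar to the rest of its cluster gets a strongly negative Z-score and is removed, while the cohesive members stay.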
Postfiltering L Postfiltering
Experimental Result HotMiner HP 100 case A. B. A B:
Experimental Result HotMiner WEB
Experimental Result SOM_PAK Fig. 6.3 SOM SOM
Experimental Result TF/IDF a e a,c,e
Experimental Result A. B. B
Experimental Result Shared Memory memory shared memory Shared Memory Memory shared Shared Memory Memory shared
Experimental Result Fig. 6.4 patch HP-UX for Y2K compliance Fig. 6.5 sendmail mail unknown sendmail SOM Fig. 6.4 Fig. 6.6 Y2K
Experimental Result SearchView ContentView AVG_SIM(Ck) S(Ck)
Part II: Mining Case Log Search Log Case Log e-mail Search Log Case Case
Technical Description case
Thesaurus Assistant
Thesaurus Assistant Smith-Waterman Thesaurus Creation
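The slide associates the Thesaurus Assistant with the Smith-Waterman algorithm, which scores the best local alignment between two strings and so can relate misspelled or abbreviated variants of a term. The sketch below is a textbook version with a linear gap penalty. The scoring values (match 2, mismatch -1, gap -1) and the normalized `similarity` helper are assumptions for illustration, not the settings used in HotMiner.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of an alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(
                0,                    # start a new local alignment
                h[i - 1][j - 1] + step,  # align the two characters
                h[i - 1][j] + gap,    # gap in b
                h[i][j - 1] + gap,    # gap in a
            )
            best = max(best, h[i][j])
    return best


def similarity(a, b, match=2):
    """Scale the score to [0, 1] by the best score the shorter string allows."""
    if not a or not b:
        return 0.0
    return smith_waterman(a, b, match=match) / (match * min(len(a), len(b)))
```

A thesaurus builder could, for example, propose pairs whose `similarity` exceeds a chosen cutoff as candidate variants of the same term and leave the final decision to a person.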
Sentence Identifier elsif
Sentence Identifier
Sentence Extractor IR P149
TF/IDF Sentence Extractor P149
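One plausible reading of a TF/IDF-based sentence extractor is to score every sentence of a case by the average TF-IDF weight of its words and keep the highest-scoring ones. The sketch below follows that reading only. The function name `rank_sentences`, the tokenizer, the per-sentence document frequency, and the `top_k` cutoff are all assumptions, and the slides do not confirm that HotMiner scores sentences this way.

```python
import math
import re
from collections import Counter


def rank_sentences(sentences, top_k=2):
    """Return the top_k sentences by average TF-IDF weight, in original order.

    Assumption: each sentence is treated as its own document when the
    document frequency df is computed.
    """
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(tokenized)
    df = Counter(t for tokens in tokenized for t in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = sum(tf[t] * math.log(n / df[t]) for t in tf)
        # Divide by length so long sentences are not favored automatically.
        scores.append(total / len(tokens) if tokens else 0.0)
    best = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

Under this scoring, boilerplate that recurs across sentences (greetings, thanks) receives low weights, and sentences made of rarer, more technical words rise to the top.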
Sentence Extractor case
Sentence Extractor IR 2 2 P151
Experimental Results 5000 case 15 case
Mining Case Excerpts for Hot Topics 5 (Fig. 6.9) case case case Search Log SOM Search Log case F-measure
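The slide names the F-measure as the evaluation metric. For reference, the standard definition combines precision P and recall R as F = (1 + beta^2) * P * R / (beta^2 * P + R). The sketch below computes it from counts of true positives, false positives, and false negatives. The function name and the count-based interface are assumptions, and the slides do not say which beta was used.

```python
def f_measure(true_pos, false_pos, false_neg, beta=1.0):
    """Return the F-measure for the given counts.

    beta = 1.0 weights precision and recall equally (the F1 score).
    Returns 0.0 when precision and recall are both undefined or zero.
    """
    predicted = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / relevant if relevant else 0.0
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator
```

For a cluster in which 8 of 10 assigned cases belong to the topic and 2 relevant cases were missed, `f_measure(8, 2, 2)` gives an F1 of about 0.8.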
Conclusions HotMiner Search Case HP Search Log Case Log: