DEWS2008 E4-4

Supporting the Selection of Images Based on Referential Semantics from Surrounding Information of the Image in Presentation Files

Hiroyuki SATO, Satoshi OYAMA, and Katsumi TANAKA

Department of Social Informatics, Graduate School of Informatics, Kyoto University
Yoshida-Honmachi, Sakyo, Kyoto 606-8501 Japan
E-mail: {hsato,oyama,tanaka}@dl.kuis.kyoto-u.ac.jp

Abstract  We often use images, especially clip art, to convey the content of a presentation more effectively, but selecting suitable clip art is difficult. One reason is that clip art rarely carries appropriate annotations; another is that a single piece of clip art has various meanings and can be read in several ways depending on the surrounding information. In this paper, we extract the text surrounding each piece of clip art from presentation files and support image selection by taking term frequency into account.

Key words  presentation, clip art, referential semantics

1.

Microsoft Office PowerPoint [1] Keynote [2] SlideShare [3] Web
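2 : Open Clip Art Library [4]

2.

2.1

Microsoft Office PowerPoint Web PowerPoint Yahoo! Web [5] SlideShare

2.1.1 Yahoo! Web

Yahoo! Web PowerPoint

(1) Set the query $q$, and let $Q = \emptyset$, $Q' = \emptyset$.
(2) Search for ppt files with $q$.
(3) Analyze 20 Web results with MeCab [6] to obtain a set $N$.
(4) $Q = Q \cup \{q\}$
(5) $Q' = Q' \cup N - Q$
(6) Take one $q'$ from $Q'$ and set $q = q'$.
(7) Return to (2).

2.1.2 SlideShare

SlideShare API PowerPoint

(1) Set $q$ = ppt, and let $Q = \emptyset$, $Q' = \emptyset$.
(2) Search SlideShare with $q$.
(3) Obtain a set $T$.
(4) $Q = Q \cup \{q\}$
(5) $Q' = Q' \cup T - Q$
(6) Take one $q'$ from $Q'$ and set $q = q'$.
(7) Return to (2).

2.1.3

PowerPoint (URL) PowerPoint PowerPoint

2.1.4

Yahoo! Web: 7346 PowerPoint files. SlideShare: 7043 PowerPoint files. Yahoo! (1)

(1) Microsoft Office PowerPoint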
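The collection loop of Sections 2.1.1 and 2.1.2 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `collect_presentations`, `search_files`, `extract_terms`, and `max_rounds` are hypothetical. The search service and the term extraction are passed in as plain callables and do not call Yahoo! Web search, SlideShare, or MeCab. Step 5 is read here as (Q' ∪ N) \ Q, which is only one possible reading of the update. The stopping rule and the choice of the next query are also assumptions, since the surviving text does not state them.

```python
from typing import Callable, Iterable, Set


def collect_presentations(
    seed_query: str,
    search_files: Callable[[str], Iterable[str]],   # hypothetical: query -> file URLs
    extract_terms: Callable[[str], Iterable[str]],  # hypothetical: query -> terms (N or T)
    max_rounds: int = 100,                          # assumption: the source gives no stop rule
) -> Set[str]:
    """Query-expansion crawl; returns the set of file URLs found."""
    q = seed_query            # step 1: initial query
    used: Set[str] = set()    # Q: queries already issued
    pending: Set[str] = set() # Q': candidate queries not yet issued
    found: Set[str] = set()

    for _ in range(max_rounds):
        found.update(search_files(q))        # step 2: search with q
        terms = set(extract_terms(q))        # step 3: obtain N (or T)
        used.add(q)                          # step 4: Q = Q ∪ {q}
        pending = (pending | terms) - used   # step 5, read as (Q' ∪ N) \ Q
        if not pending:                      # assumption: stop when no candidates remain
            break
        q = min(pending)                     # step 6: assumption, smallest term for determinism
        # step 7: loop back to step 2
    return found
```

Because the callables are injected, the loop can be exercised with small in-memory dictionaries before it is connected to a real search service.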
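(2) PowerPoint Web ()

2.2

Microsoft Office PowerPoint 2007 uses the pptx format. A pptx file follows Microsoft Office Open XML: XML files packaged in a ZIP archive. PowerPoint files were processed with Microsoft .NET Framework 2.0 and Visual C#:

(1) Convert PowerPoint files (ppt) to pptx with the Microsoft PowerPoint 12.0 Object Library.
(2) Extract the pptx ZIP archive with SharpZipLib [7].
(3) Process the extracted XML.

2.3

2.3.1

(1) Extract the pptx ZIP archive. The path of the extracted pptx (presentationpath) serves as its ID.
(2) Under /ppt/slides/, the n-th slide is stored as an XML file (e.g., slide n.xml). Record the path of the XML of the n-th slide (slidepath).
(3) From the slide XML, obtain the title (title) and the body (body). The XML also contains the Microsoft Office clip art ID (e.g., j0123456, BD012345); obtain this ID from the XML (clipartid).

2.3.2

4 PowerPoint

4 DB

3.

Yahoo! Web: 6807, clipartid: 2521. SlideShare: 4210, clipartid: 2278. Yahoo!

For a clip art whose clipartid is id, the following SQL is issued:

    SELECT slide.title, slide.body
    FROM slide, clipart
    WHERE clipart.clipartid = id
    AND slide.slidepath = clipart.slidepath;

Each text obtained by this SQL is denoted $d_{id}$, and the set of these texts $D_{id}$. The texts are analyzed with MeCab [6].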
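The slide-level extraction of Section 2.3.1 and the lookup query of Section 3 can be sketched together in Python. This is a hedged illustration, not the authors' code: the paper used Visual C# with SharpZipLib, whereas this sketch substitutes Python's standard `zipfile`, `xml.etree.ElementTree`, and `sqlite3` modules. The function names, the table layout (`slide(slidepath, title, body)`, `clipart(clipartid, slidepath)`), and the rule that placeholder types `title` and `ctrTitle` mark the title are assumptions. How the clip art ID is located inside the slide XML is not recoverable from the text, so the sketch does not attempt it.

```python
import io
import sqlite3
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by PresentationML slide parts.
NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
TITLE_TYPES = {"title", "ctrTitle"}  # assumption: these placeholders hold the title


def slide_texts(pptx_bytes):
    """Yield (slidepath, title, body) for every slide part in a pptx archive."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as zf:
        for name in sorted(zf.namelist()):
            if not (name.startswith("ppt/slides/slide") and name.endswith(".xml")):
                continue
            root = ET.fromstring(zf.read(name))
            title, body = [], []
            for shape in root.iter("{%s}sp" % NS["p"]):
                ph = shape.find("p:nvSpPr/p:nvPr/p:ph", NS)
                is_title = ph is not None and ph.get("type") in TITLE_TYPES
                runs = [t.text or "" for t in shape.iter("{%s}t" % NS["a"])]
                (title if is_title else body).extend(runs)
            yield name, " ".join(title), " ".join(body)


def create_tables(conn):
    """Hypothetical schema; only the column names come from the paper."""
    conn.execute("CREATE TABLE slide (slidepath TEXT, title TEXT, body TEXT)")
    conn.execute("CREATE TABLE clipart (clipartid TEXT, slidepath TEXT)")


def texts_for_clipart(conn, clipart_id):
    """Run the Section 3 query with the clip art ID bound as a parameter."""
    return conn.execute(
        "SELECT slide.title, slide.body FROM slide, clipart "
        "WHERE clipart.clipartid = ? AND slide.slidepath = clipart.slidepath",
        (clipart_id,),
    ).fetchall()
```

In a multi-file collection the archive member name alone is not unique, so a real pipeline would presumably combine it with presentationpath before storing it as slidepath.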
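The set of terms obtained for id is denoted $T_{id}$. For each term $t \in T_{id}$, the term frequency (TF) [8] $\mathrm{tf}(t, d_{id})$ in each $d_{id} \in D_{id}$ is used, and the weight of term $t$ for id, $\mathrm{weight}(t, id)$, is defined as:

$$\mathrm{weight}(t, id) = \sum_{d_{id} \in D_{id}} \mathrm{tf}(t, d_{id}) \qquad (1)$$

4.

4.1

1 3 5 5 2 3 4 5

4.2

4.2.1 Microsoft Office

Microsoft Office Web Microsoft Office 96448 2521 139 Microsoft Office 2382 96448 Microsoft Office 31688 16531 3995 2 44224 30 30 20 20 100 2 5

4.2.2

6 2 Microsoft 2 Microsoft Office MS 1 Microsoft Office MS +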
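As a concrete reading of the weight in equation (1) of Section 3, the following Python sketch sums term counts over all slide texts associated with one clip art ID. It is an illustration under assumptions: `term_weights` and `tokenize` are hypothetical names, tf is taken to be the raw count of a term in a text, and whitespace splitting stands in for the MeCab analysis used in the paper.

```python
from collections import Counter
from typing import Callable, Iterable, List


def term_weights(
    documents: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,  # assumption: stand-in for MeCab
) -> Counter:
    """Return weight(t, id) for every term t, given the texts D_id of one clip art.

    weight(t, id) is the sum over d in D_id of tf(t, d), with tf taken as a raw count.
    """
    weights: Counter = Counter()
    for d in documents:
        # Counter.update adds the per-document counts, i.e. tf(t, d), to the running sum.
        weights.update(tokenize(d))
    return weights


# Usage: texts of the slides that contain one clip art (made-up strings).
d_id = ["budget plan budget", "plan cost"]
weights = term_weights(d_id)
top_terms = weights.most_common(2)  # terms with the largest weight first
```

Terms that never occur simply have weight 0, because a `Counter` returns 0 for missing keys.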
Microsoft Office 6 Office Microsoft Office

4.2.3

Microsoft Office 1 7 7 Microsoft Office 1 8 8 2 Microsoft Office

5.

Web [9] [10] [11] Web HTML Web
[12] Web Web

6.

Microsoft Office COE : 19 23 IT ( A01-00-02, 18049041 IT Y00-01 18049073

[1] Microsoft Office. http://office.microsoft.com/.
[2] Apple iWork Keynote. http://www.apple.com/jp/iwork/keynote/.
[3] SlideShare. http://www.slideshare.net/.
[4] Open Clip Art Library. http://www.openclipart.org/.
[5] Yahoo! Web. http://developer.yahoo.co.jp/search/.
[6] MeCab. http://mecab.sourceforge.net/.
[7] SharpZipLib. http://www.sharpziplib.com/.
[8] G. Salton: Automatic Information Organization and Retrieval, McGraw Hill Text (1968).
[9] ,,,, Letters Vol. 3 No. 1.
[10] ,,, 15 (DEWS2004).
[11] , Web, 14 (DEWS2003) (2003).
[12] , Web, 14 (DEWS2003) (2003).