Structure Extraction from Presentation Slide Information

Tessai HAYAMA a), Hidetsugu NANBA, and Susumu KUNIFUJI

Graduate School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1, Nomi-shi, 923-1218 Japan
Faculty of Information Sciences, Hiroshima City University, 3-4-1, Ozukahigashi, Asaminami-ku, Hiroshima-shi, 731-3194 Japan
a) E-mail: t-hayama@jaist.ac.jp

D, Vol. J92-D, No. 9, pp. 1483-1494, (c) 2009

1. [...] [1], [7], [8] [...] Rosenfeld et al. [6] [...] Zhai and Liu [9] [...] PDF [...] Web [5] [...]
[...] HTML [3] [...] Microsoft PowerPoint, Apple Keynote, OpenOffice Impress [...] XML [...]

Fig. 1 An example of slide information and its structure.
Table 1 Score sheet of attributes based on the likelihood of each attribute.
  Ti1) [...] > Threshold(fontsize1): +1
  Ti2) [...] > Threshold(y-axis position): +1
  Ti3)-Ti4) [...]
  Ti5) [...] > Threshold(number of characters): +1
  S1)-S3) [...]
  S4) [...] > Threshold(fontsize2): +1
  S5) [...] > Threshold(number of characters): +1
  F1)-F6) [...]
  F7) [...] < Threshold(number of characters): +1
  Ta1)-Ta6) [...]
Thresholds used: Threshold(fontsize1), Threshold(fontsize2), Threshold(y-axis position), Threshold(number of characters).

Fig. 2 Flow chart of the organizing process.

(1) [...] Ti1, Ti2, Ti3, Ti4, Ti5 [...] S1, S2, S3, S4, S5 [...]
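The score sheet in Table 1 can be read as a set of rules that each add points to one attribute when a condition on a slide object holds. The following is only a minimal sketch of that idea. The `SlideObject` fields, the rule conditions, and the threshold values are all hypothetical stand-ins, because the original conditions did not survive in this copy; only the four threshold names come from Table 1.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Placeholder threshold values (assumptions, not the paper's settings).
THRESHOLDS = {
    "fontsize1": 24.0,            # points
    "fontsize2": 32.0,            # points
    "y_axis_position": 0.25,      # fraction of slide height, measured from the top
    "number_of_characters": 8,
}


@dataclass
class SlideObject:
    """Hypothetical record of the features a rule may inspect."""
    font_size: float      # points
    y_position: float     # 0.0 = top edge, 1.0 = bottom edge
    num_chars: int
    is_picture: bool = False


# A rule is (condition, points). The conditions are illustrative stand-ins,
# not the Ti/S/F/Ta conditions of Table 1.
Rule = Tuple[Callable[[SlideObject], bool], int]

SCORE_SHEET: Dict[str, List[Rule]] = {
    "title": [
        (lambda o: o.font_size > THRESHOLDS["fontsize1"], 1),
        (lambda o: o.y_position < THRESHOLDS["y_axis_position"], 1),
    ],
    "body text": [
        (lambda o: o.num_chars > THRESHOLDS["number_of_characters"], 1),
    ],
    "figure": [
        (lambda o: o.is_picture, 1),
        (lambda o: o.num_chars < THRESHOLDS["number_of_characters"], 1),
    ],
    "table": [],  # no table rules are sketched here
}


def attribute_scores(obj: SlideObject) -> Dict[str, int]:
    """Sum the points of every satisfied rule, separately for each attribute."""
    return {
        attribute: sum(points for condition, points in rules if condition(obj))
        for attribute, rules in SCORE_SHEET.items()
    }


heading = SlideObject(font_size=40.0, y_position=0.1, num_chars=20)
print(attribute_scores(heading))
```

Keeping the rules in a table rather than in nested `if` statements makes it straightforward to add or reweight conditions without touching the scoring function.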
[...] F1, F2, F3, F4 [...] F5 [...] S6 [...] Ta1, Ta2, Ta3, Ta4, Ta5 [...] T6 [...]

Fig. 3 An example of a slide including attribute scores.

[...] Object(b) [...] [3,5,0,0] [...] Object(b) [...] Object(c) [...] Li_Attri [...]
\[
\mathrm{Ev}(attri) =
\begin{cases}
\mathrm{Attri\_Val}(attri) & (\text{if } attri\_cand = attri) \\
\mathrm{MaxScore}(attri) - \mathrm{Attri\_Val}(attri) & (\text{otherwise})
\end{cases}
\tag{1}
\]

\[
Li\_Attri = \mathrm{Ev}(\text{title}) \times \mathrm{Ev}(\text{body text}) \times \mathrm{Ev}(\text{figure}) \times \mathrm{Ev}(\text{table}). \tag{2}
\]

[...] attri [...] Attri_Val(attri) [...] attri_cand [...] MaxScore(attri) [...] Table 1 [...] Eq. (1) [...] Ev(attri) [...] Eq. (2) [...] Li_Attri [...]

[...] Object(b) [...] (g) [...] 375 [...] 300 [...] object(b) [...] object(g) [...] object(d) [...] object(f) [...] object(a) [...]
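As a rough illustration of Eqs. (1) and (2), the sketch below computes the likelihood of each candidate attribute for one object. It assumes that the operators lost from the equations in this copy are a subtraction in Eq. (1) and a product in Eq. (2). The score vector [3,5,0,0] is the one shown for Object(b) in Fig. 3, read here in the order title, body text, figure, table, which is also an assumption. The `max_score` values are hypothetical, since the real maxima depend on Table 1.

```python
from math import prod
from typing import Dict

ATTRIBUTES = ("title", "body text", "figure", "table")


def ev(attri: str, attri_cand: str,
       attri_val: Dict[str, int], max_score: Dict[str, int]) -> int:
    """Eq. (1): the score itself for the candidate attribute,
    the remaining headroom (MaxScore - score) for every other attribute."""
    if attri == attri_cand:
        return attri_val[attri]
    return max_score[attri] - attri_val[attri]


def li_attri(attri_cand: str,
             attri_val: Dict[str, int], max_score: Dict[str, int]) -> int:
    """Eq. (2), read as a product of Ev over the four attributes."""
    return prod(ev(a, attri_cand, attri_val, max_score) for a in ATTRIBUTES)


# Scores of Object(b) in Fig. 3; the attribute order is assumed.
scores = dict(zip(ATTRIBUTES, [3, 5, 0, 0]))
# Hypothetical maximum scores, chosen only for illustration.
max_score = {a: 5 for a in ATTRIBUTES}

likelihoods = {cand: li_attri(cand, scores, max_score) for cand in ATTRIBUTES}
best = max(likelihoods, key=likelihoods.get)
print(likelihoods, best)
```

Under these assumed maxima the body-text candidate wins, because it combines a high body-text score with low scores for every competing attribute.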
Fig. 4 Unit attribute sequence in a block and its dividing point.
[...] Precision, Recall, and F-measure [...] Eqs. (3)-(5) [...]

\[
Recall = \frac{Matched\_CorrectData}{Total\_CorrectData} \tag{3}
\]

\[
Precision = \frac{Matched\_CorrectData}{Total\_DetectedData} \tag{4}
\]

\[
F\text{-}measure = \frac{2 \times Recall \times Precision}{Recall + Precision} \tag{5}
\]

[...] Matched_CorrectData [...] Total_CorrectData [...] Total_DetectedData [...]

[...] Web [4] [...] 98 [...] 24.14 [...] 2366 [...]

[...] XML [...] Microsoft Visual Studio C# [...] Microsoft PowerPoint [...] PPT [...] Unit, Object, attribute, NodeList, Unit ID [...]
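The evaluation measures in Eqs. (3)-(5) can be sketched in a few lines. The function and parameter names below are illustrative, and the zero-denominator guard in `f_measure` is an added assumption that the equations do not state.

```python
def recall(matched_correct: int, total_correct: int) -> float:
    """Eq. (3): share of the correct data that the method matched."""
    return matched_correct / total_correct


def precision(matched_correct: int, total_detected: int) -> float:
    """Eq. (4): share of the detected data that is correct."""
    return matched_correct / total_detected


def f_measure(r: float, p: float) -> float:
    """Eq. (5): harmonic mean of recall and precision.
    Returns 0.0 when both are zero (a guard added for this sketch)."""
    if r + p == 0:
        return 0.0
    return 2 * r * p / (r + p)


# Hypothetical counts: 90 matches, 100 correct items, 120 detected items.
r = recall(90, 100)
p = precision(90, 120)
print(round(r, 2), round(p, 2), round(f_measure(r, p), 2))
```

For instance, a recall of 0.97 and a precision of 0.99 give an F-measure of about 0.98, which agrees with the first column of Table 3.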
Fig. 5 An example of XML data output by an experimental system developed based on the proposed method.

Table 2 Parameters of the proposed method used in this experiment.
  Parameters: Threshold(fontsize1), Threshold(fontsize2), Threshold(y-axis position), Threshold(number of characters)
  Values (in extracted order): 1, 24 pt, 32 pt, 1/4, 8, 24 pt

[...] Table 3 [...] F-measure [...] 0.89 [...] 0.69 [...] Table 4 [...] 0.95 [...] 0.90 [...]
Table 3 Accuracy for each attribute in the results of the organizing process.

             (2333)   (9285)   (1905)   (46)     (2201)
  Recall      0.97     0.89     0.93     0.96     0.96
  Precision   0.99     0.85     0.85     0.98     0.81
  F-measure   0.98     0.85     0.89     0.97     0.87

  Recall      0.87     0.69     0.64     0.93     0.91
  Precision   0.96     0.88     0.63     0.93     0.63
  F-measure   0.92     0.77     0.64     0.93     0.74

Table 4 Ratio of pages for each correct ratio of results in the structuring process.

  1.00     0.99-0.80   0.79-0.60   0.59-0.00   N/A
  0.95     0.03        0.04        0.05        0.12
  0.90     0.05        0.06        0.07        0.12
  0.76     0.07        0.08        0.15        0.12

Fig. 6 Slide samples ([I], [II]) whose structure data extracted by the proposed method match or mismatch the correct data.

[...] 0.95 [...] 0.76 [...]
[...] 95% [...] [I] [...] [II] [...] Fig. 6 [...]

5. [...] 95% [...] [2] [...]

[...] (B) 20300046 [...]

[1] A. Anjewierden, "AIDAS: Incremental logical structure discovery in PDF documents," Proc. 6th International Conference on Document Analysis and Recognition, pp.374-378, 2001.
[2] T. Hayama, H. Nanba, and S. Kunifuji, "Alignment between a technical paper and presentation sheets using a hidden Markov model," Proc. Active Media Technology 2005, pp.102-106, 2005.
[3] T. Ishihara, H. Takagi, T. Itoh, and C. Asakawa, "Analyzing visual layout for a non-visual presentation-document interface," Proc. 8th International ACM SIGACCESS Conference on Computers and Accessibility, pp.165-172, 2006.
[4] H. Nanba, T. Abekawa, M. Okumura, and S. Saito, "Bilingual PRESRI: Integration of multiple research paper databases," Proc. 7th RIAO Conference: Coupling Approaches, Coupling Media and Coupling Languages for Information Retrieval, pp.195-211, 2004.
[5] [...] Web [...], vol.45, no.9, pp.2157-2167, 2004.
[6] B. Rosenfeld, R. Feldman, and Y. Aumann, "Structural extraction from visual layout of documents," Proc. 11th International Conference on Information and Knowledge Management, pp.203-210, 2002.
[7] T. Watanabe, Q. Luo, and N. Sugie, "Layout recognition of multi-kinds of table-form documents," IEEE Trans. Pattern Anal. Mach. Intell., vol.17, no.4, pp.432-445, 1995.
[8] Y. Yang and H. Zhang, "HTML page analysis based on visual cues," Proc. 6th International Conference on Document Analysis and Recognition, pp.859-864, 2001.
[9] Y. Zhai and B. Liu, "Structured data extraction from the Web based on partial tree alignment," IEEE Trans. Knowl. Data Eng., vol.18, no.12, pp.1614-1628, 2006.