WebTiling Web 17 2 10
WebTiling: Displaying and Manipulating Multiple Web Contents with Structural Similarity

Makoto Noudomi

Abstract

In recent years the Internet has become indispensable as an information medium, society's dependence on it has grown, and the quantity of content keeps increasing. One reason is that information can be published on the Internet easily. However, Web pages are managed by people with widely differing values, and the lifetime of a page depends on how it is managed: a site that is useful today may move, stop being updated, or close by tomorrow. Because websites appear and disappear frequently, a user has to keep learning the contents of new websites in order to always have the newest information.

A user typically starts with a search engine such as Yahoo or Google. These accumulate information and display the search results for the user's keywords in order of importance. However, Web content grows every day, and checking all the search results takes a long time. Researchers have therefore worked on improving how accurately search engines narrow down results, and on prefetching Web pages so that a user can view them sooner.

In this research we propose the Tile Arrangement Web browser, which treats each Web page as a tile. With this browser a user can display two or more Web pages efficiently, which reduces the burden of viewing them. We believe that a user can acquire information quickly by looking at many Web pages simultaneously.

We also propose structural link navigation and structurally similar part extraction. These functions process two or more Web pages as a batch. Conventional Web browsers have no function that processes several open pages together. By comparing the differences among two or more Web pages, we discover things that are not visible when looking at only one page. Structural link navigation and structurally similar part extraction are fundamentally based on the same idea: both extract parts of the pages through the DOM (Document Object Model), compute similarity using the cosine similarity measure, and select and display the structurally similar portions from a set of Web pages. A user can run a comparison across two or more Web pages in one operation and obtain the result. The Tile Arrangement Web browser is also useful for displaying these results.

Finally, we build a prototype and evaluate its convenience. The Tile Arrangement Web browser can arrange and display Web pages without breaking their layout, and is suited to grasping images and to skimming. However, its drawbacks include the low readability of text and delays when enlarging or shrinking the tile plane. Structurally similar part extraction and link navigation achieved a certain amount of success, but the display method needs improvement.
Contents

1
2
3
3.1 Infotube (Fig. 1)
3.2 InfoLead (Fig. 2)
3.3 Comparative Web Browser (Fig. 3)
3.4 ListLeaf (Fig. 4)
4 WebBrowser
4.1
4.2
4.3
5
5.1
5.1.1 DOM
5.2
5.2.1 Text
5.2.2
6
6.1
6.2
7
7.1
7.2
7.3
7.3.1 Web
7.3.2
7.3.3
7.4
7.4.1
7.4.2
7.4.3
8
1

Web ( URL ) ( ) Web Web Web Yahoo Google Web PageRank Web 1 Web Web 1 Web Opera [1] Sleipnir Web Web Netscape Navigator FireFox [2] Web 1 Web Web

Web Web Web 1 Web Web Web

2

Web Web Web Web Web Web Web Web Web Web Web Web
3

3.1 Infotube (Fig. 1)

Infotube [4] Web 1 1 Web Infotube Web Web

Fig. 1: Infotube

3.2 InfoLead (Fig. 2)

InfoLead [3] Web 3D Web Web

Fig. 2: InfoLead

3.3 Comparative Web Browser (Fig. 3)

Comparative Web Browser [5] 2 Web Web CWB Web 2

Fig. 3: Comparative Web Browser

3.4 ListLeaf (Fig. 4)

ListLeaf [6] Web Web Web Web CSS Web Web

Fig. 4: ListLeaf
4 WebBrowser

Web Web

1. Web Web
2. Web
3. Web 2
4. Web
5.
6.

3 4 4 5

4.1

Web Web Web Web Web

4.2

Web (Tile Arrangement Algorithm, TAA) TAA

    foreach (Web ...) {
        if (Width ... Height) {
            ...
        }
        else if (Width ... Height) {
            ... Height ...
        }
    }

(Fig. 5) Web Web

Fig. 5: Web ( )
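The TAA pseudocode above survives only as a per-page loop that branches on the relation between Width and Height; the comparison operators and the branch bodies are lost. The following is therefore only a minimal sketch of one plausible tile layout, not the author's algorithm. It assumes that a landscape page is scaled by its width, that any other page is scaled by its height, and that tiles fill a fixed grid row by row. The names `Tile`, `arrange_tiles`, `cell` and `columns` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    x: int       # left edge on the tile plane
    y: int       # top edge on the tile plane
    width: int   # scaled page width
    height: int  # scaled page height


def arrange_tiles(page_sizes, cell=200, columns=3):
    """Scale each (width, height) page into a square cell, then fill a grid row by row."""
    tiles = []
    for index, (width, height) in enumerate(page_sizes):
        if width > height:
            # ASSUMPTION: landscape page, so the width is the limiting side.
            scale = cell / width
        else:
            # ASSUMPTION: portrait or square page, so the height is the limiting side.
            scale = cell / height
        col, row = index % columns, index // columns
        tiles.append(Tile(col * cell, row * cell,
                          round(width * scale), round(height * scale)))
    return tiles


if __name__ == "__main__":
    for tile in arrange_tiles([(800, 600), (600, 1200), (400, 400), (1000, 500)]):
        print(tile)
```

Scaling by the longer side keeps the aspect ratio of each page, which matches the stated goal of arranging pages without breaking their layout.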
4.3

Web Web Web 2 1 Web
5

Web A a B Web b A B a b 1 Web

5.1

DOM

5.1.1 DOM ( )

(DOM) HTML XML = = (API) [7] DOM HTML XML DOM HTML Document HTML Element DocumentFragment Comment DocumentFragment HTML Element Text Comment Element Element ( ) Text Comment Attr Element Text ( ) Comment ! Text
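As an illustration of the DOM node types named above (Document, Element, Text, Comment), the following sketch parses a small well-formed fragment and prints its node tree. It uses Python's standard `xml.dom.minidom` rather than the mshtml interface used by the prototype. The sample markup and the helper name `outline` are assumptions made for this example only.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

# Human-readable names for the node types discussed in this section.
NODE_NAMES = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
}


def outline(node, depth=0):
    """Return one indented line per node, visiting the tree depth first."""
    label = NODE_NAMES.get(node.nodeType, "Other")
    if node.nodeType == Node.ELEMENT_NODE:
        label += " " + node.tagName
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical sample page; it must be well-formed XML for minidom.
SAMPLE = '<html><body><!-- note --><a href="url">string</a></body></html>'
document = minidom.parseString(SAMPLE)
tree_lines = outline(document)

if __name__ == "__main__":
    print("\n".join(tree_lines))
```

Attributes such as `href` do not appear in `childNodes`; they are reached through the element itself, for example with `getAttribute("href")`.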
DOM DOM HTML (Fig. 6) Element parentNode childNodes firstChild lastChild previousSibling null nextSibling null API Web Microsoft Visual C# .NET 2003 mshtml.ihtmldocument DOM

    foreach (Web ...) {
        Web ... mshtml.ihtmldocument (htmldocument)
        IHTMLElementCollection ... htmldocument.links
        foreach (htmldocument.links) {
            ...
        }
    }

5.2

Text Web Web
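The link-collection loop of Section 5.1.1 (visit every open page, then every entry of its `links` collection) can be sketched as follows. This is not the prototype's C# and mshtml code: it is a stand-in built on Python's standard `html.parser`, and the names `LinkCollector` and `collect_links` as well as the sample pages are hypothetical.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(pages):
    """Map each page name to the list of links found in its HTML source."""
    result = {}
    for name, html in pages.items():
        parser = LinkCollector()
        parser.feed(html)
        parser.close()
        result[name] = parser.links
    return result


if __name__ == "__main__":
    pages = {
        "A": '<p><a href="./research/">Research</a></p>',
        "B": '<p><a href="./news/">News</a></p>',
    }
    print(collect_links(pages))
```

Anchors without an `href` attribute are skipped, since they cannot be followed during link navigation.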
Fig. 6: DOM HTML

m n O(mn) Text

5.2.1 Text

"<a href="url">string</a>" Text String Text 2 x, y

$$\mathrm{Sim}_{Text}(x, y) = \frac{(\,\cdots\,)^{2}}{\mathrm{length}(x)\,\mathrm{length}(y)} \tag{1}$$

x String y Text 0.5

5.2.2

<a href="url">string</a> DOM Element a Attr ( ) href Text URL n Text URL
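Equation (1) in Section 5.2.1 has lost its numerator, so the following is only a sketch under a stated assumption: that the squared term counts the characters the two anchor texts have in common, which gives a squared cosine-style measure consistent with the cosine similarity mentioned in the abstract. The function name `sim_text` and the character-overlap numerator are assumptions, not the author's confirmed definition.

```python
from collections import Counter


def sim_text(x, y):
    """Return common**2 / (len(x) * len(y)) for two anchor texts."""
    if not x or not y:
        return 0.0
    # ASSUMPTION: the lost numerator is the number of characters shared by
    # x and y, counted as a multiset overlap.
    common = sum((Counter(x) & Counter(y)).values())
    return common ** 2 / (len(x) * len(y))


if __name__ == "__main__":
    print(sim_text("news", "new"))
```

The value 0.5 that survives in the text may be the cut-off applied to this score, but that reading cannot be confirmed from what remains.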
Web ( ) Web P {.. ( ), . ( ), childnode ( )} 1 0 Web $P_1$ $URL_1$ ./research/ Web $P_2$ $URL_2$ ./research/ $\mathrm{Sim}_{Relative}(URL_1, URL_2)$ 1 Text 3 β α Web

$$\beta_{Text}\,\mathrm{Sim}_{Text} + \beta_{Relative}\,\mathrm{Sim}_{Relative} + \beta_{File}\,\mathrm{Sim}_{File} \ge \alpha \tag{2}$$
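Inequality (2) combines three similarity scores with weights β and compares the sum with a threshold α. The sketch below shows that decision rule. The definition of Sim_File is lost, so it is passed in as a precomputed number. `sim_relative` assumes that two relative URLs score 1 when they are identical and 0 otherwise, which agrees with the ./research/ example but is not confirmed by the surviving text. All weights and thresholds are hypothetical values chosen for illustration.

```python
def sim_relative(url_1, url_2):
    """Return 1 when two relative URLs are identical, otherwise 0."""
    # ASSUMPTION: exact string equality stands in for the lost definition.
    return 1 if url_1 == url_2 else 0


def is_structurally_similar(sim_text, sim_rel, sim_file, betas, alpha):
    """Evaluate inequality (2): the weighted sum of the three scores >= alpha."""
    beta_text, beta_relative, beta_file = betas
    score = beta_text * sim_text + beta_relative * sim_rel + beta_file * sim_file
    return score >= alpha


if __name__ == "__main__":
    rel = sim_relative("./research/", "./research/")
    # Hypothetical weights (0.5, 0.3, 0.2) and threshold 0.6.
    print(is_structurally_similar(0.75, rel, 0.0, (0.5, 0.3, 0.2), 0.6))
```

Keeping the weights and the threshold as parameters makes it easy to try different balances between the text, relative-path and file measures.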
6

Web Web Web

6.1

Web Flash 4.1.1 DOM DOM Text ( ) Element Element (Fig. 7) HTML HTML Document Text

6.2

HTML HTML Web Web (Fig. 8) Web Web

Fig. 7: DOM HTML

Fig. 8:
7

7.1

Web (Fig. 9) (Fig. 10) Web Microsoft Visual Studio C# .NET 2003

7.2

1. Web Web
2. Ctrl+ Web
3.
4.
5. Ctrl+
6. Web
7. Web

7.3

Fig. 9: Web

3

7.3.1 Web

Web

Fig. 10: Web

7.3.2

Web Web Web URL
7.3.3

Web Text Element Text Element Text

7.4

7.4.1

Web

7.4.2

Web Web Web 10 20 Web InfoLead 20 Web Web

Web Web Web Web Web Web Web

7.4.3

Web Web (Tile Getting Algorithm, TGA) Web Web Web [8] Web Web Web Web WebWatcher Web Web [9]

Web TGA Web Web 2002 6 6500 (etforecasts ) 6480 Yahoo! SOHO Web Web Web Web Web Web Web Web

8

Web Web Web Web Web 2 Web Web Web Web Web
[1] OPERA Software, http://www.opera.com/
[2] FireFox, http://www.mozilla-japan.org/products/firefox/
[3] InfoLead, http://goo.ntt-infolead.net/
[4] Infotube, http://www.plannet-arch.com/information/tube-jp.htm
[5] Akiyo Nadamoto, Katsumi Tanaka, "A Comparative Web Browser (CWB) for Browsing and Comparing Web Pages", WWW2003, Budapest, Hungary, May 2003
[6] ListLeaf, http://www.listbrowser.com/
[7] DOM1, http://www.doraneko.org/misc/dom10/19981001/cover.html
[8] Web Letters Vol.2, No.1, pp.139-142
[9] WEB Vol.2003, No.71, 2003-DBS-131(I)-45, pp.343-349