Vol. 44 No. SIG 18(TOD 20) Dec. 2003

A Study for Analysis of Web Access Logs with Web Communities

Shingo Otsuka, Masashi Toyoda and Masaru Kitsuregawa
Institute of Industrial Science, The University of Tokyo

Extracting models of Web user behavior is of decisive importance, and a lot of work has been done in this area. As far as we know, most of this work uses server-side logs. Such logs give an understanding of behavior inside the server, but they make it hard to analyze complete user behavior (inside and outside the server). Recently, similar to surveys of TV audience ratings, a new kind of business has appeared that collects the URL histories of users (called panels) who are selected without statistical bias. By analyzing panel logs, which are merged from the panels, it becomes possible to collect all the Web pages (URLs) accessed by the users. In contrast to Web server logs, which have a limited page space, panel logs have an extremely broad page space. For this reason, it is difficult to grasp behavior over the global page space just by checking reference histories. In this paper, we propose a prototype system that extracts user access patterns from panel logs, and we use it to show global user behavior patterns that are hard to grasp with URL-based analysis.

2.
4) e 1) 15) 6) 19) 16) 17) 21) OLAP lycos 2) 20) Microsoft Encarta 12) 8) 14) 22) URL IP 11) 13) 22)
3.
3.1
1 RDD Random Digit Dialing URL 1 ID URL ID

Fig. 1 A method of collecting panel logs.
Table 1 A part of the panel logs.
ID (1) URL URL 30 3)
3.2
(1) 5) (2) 9) 2 10) 10) (1) (2)
Fig. 2 Typical graph of authorities and hubs.
HITS 7) 2 HITS (2) HITS 18) 2002 2 4,500 100 17
3.3
URL URL URL X A Y B C D 5 2 2 2 Y
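The text above names HITS (reference 7) and shows a graph of authorities and hubs (Fig. 2). As a reminder of how those two scores relate, here is a minimal sketch of the standard HITS iteration. It is not the community-extraction procedure of reference 18), and the function name, the `links` input format and the iteration count are assumptions made for illustration.

```python
def hits(links, iterations=50):
    """Return (authority, hub) scores for a small link graph.

    links: dict mapping a page to the list of pages it links to
    (hypothetical input format chosen for this sketch).
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub score: sum of the authority scores of pages it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return auth, hub
```

Pages linked from many good hubs end up with high authority scores, and pages that link to many good authorities end up with high hub scores.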

Table 2 The details of the panel logs we used.
10 Giga byte 45 55,415,473 1,148,104
1 RDD Random Digit Dialing 30 3 URL URL
Table 3 The adaptation ratio between the URLs belonging to Web communities and the URLs included in panel logs.
18.8% 36.3% 7.7% 37.2%
4.
4.1
URL 2 URL URL URL URL = URL URL = URL URL = URL 3 18.8% 36.3% 1 7.7% URL
1 http://xxx.yyy.com/ xxx http://yyy.com/ .com co.jp
Table 4 The search (portal) sites from which search words were extracted.
yahoo.co.jp nifty.com biglobe.ne.jp infoseek.co.jp
msn.co.jp ocn.ne.jp so-net.ne.jp dion.ne.jp
lycos.co.jp goo.ne.jp hi-ho.ne.jp odn.ne.jp
excite.co.jp google.co.jp fresheye.co.jp altavista.com
63% URL
4.2
2 google Yahoo! nifty biglobe Yahoo! Yahoo! auctions 4 URL 3 3.1 URL Yahoo! shopping Yahoo! auctions 4
2 http://www.vrnetcom.co.jp/
3 yahoo http://shopping.yahoo.co.jp/ http://auctions.yahoo.co.jp/ nifty
4 http://shopping.yahoo.co.jp/ http://www.rakuten.co.jp/ http://auctions.yahoo.co.jp/ http://www.rakuten.co.jp/auction/
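Two operations are visible in this section: shortening a host name such as http://xxx.yyy.com/ to http://yyy.com/, and extracting search words from accesses to the search sites of Table 4. The sketch below illustrates both under stated assumptions. The suffix list, the query parameter names (`p`, `q`) and the function names are hypothetical and do not come from the paper.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical mapping from search-site domain to its query parameter.
# The parameter names are assumptions for illustration only.
QUERY_PARAM = {
    "yahoo.co.jp": "p",
    "google.co.jp": "q",
}

# Suffixes treated as one unit when shortening a host (assumed list).
SUFFIXES = ("co.jp", "ne.jp", "com")


def shorten_host(url):
    """Drop the leftmost host label, e.g. xxx.yyy.com -> yyy.com."""
    host = urlsplit(url).hostname or ""
    for suffix in SUFFIXES:
        if host.endswith("." + suffix):
            stem = host[: -(len(suffix) + 1)]  # part before the suffix
            labels = stem.split(".")
            if len(labels) > 1:                # keep at least one label
                labels = labels[1:]
            return ".".join(labels + [suffix])
    return host


def search_words(url):
    """Return the search words in a search-site URL, or [] otherwise."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    for domain, param in QUERY_PARAM.items():
        if host == domain or host.endswith("." + domain):
            values = parse_qs(parts.query).get(param, [])
            return values[0].split() if values else []
    return []
```

A table-driven mapping keeps the per-site differences in one place, so further sites from Table 4 could be added without touching the parsing logic.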

Table 5 The ratio of the group of the search sites, shopping sites and auction sites in the URLs included in panel logs.
4.1% 19.4% 1.5% 10.9% 64.1%
Table 6 The ratio of the group of the search sites in the URLs included in panel logs.
(1) * 4.1%
(2) ** 19.4%
(3) * URL 12.3%
(4) ** URL 43.4%
(5) URL 7.7%
(6) URL 13.1%
Table 7 The ratio of the sessions included in the group of the search sites, shopping sites and auction sites.
23.3% 69.6% 5.7% 12.4%
URL 5 4.1% 1.5% 20% 10% 6 (3) (4) URL (1) (4) URL 80% URL URL 16.4% (1) (3) 7 23% 70% 5 5 1 5 Yahoo! shopping Yahoo! auctions 3 5 Yahoo! shopping Yahoo! auctions
5.
5.1
3.3 URL
5.2
ID URL ID (1) (2) (3)
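Table 7 reports ratios over sessions, but the rule used to cut panel logs into sessions is not legible in the surviving text. The sketch below assumes a 30-minute idle timeout, a common convention in the line of work of reference 3); the threshold, the record layout `(panel_id, timestamp, url)` and the function name are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def split_sessions(records, timeout=timedelta(minutes=30)):
    """Group accesses into sessions per panel.

    records: iterable of (panel_id, timestamp, url) tuples in any order
    (hypothetical layout). A new session starts when the same panel has
    been idle for longer than `timeout` (assumed to be 30 minutes).
    """
    sessions = []
    last_seen = {}  # panel_id -> (time of last access, current session)
    for panel_id, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        prev = last_seen.get(panel_id)
        if prev is None or ts - prev[0] > timeout:
            current = []
            sessions.append((panel_id, current))
        else:
            current = prev[1]
        current.append(url)
        last_seen[panel_id] = (ts, current)
    return sessions
```

Each returned item is a `(panel_id, [urls...])` pair, which is a convenient shape for counting how many sessions touch a given group of sites.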

Fig. 3 The architecture of our proposed system.
(4) (1) (2) (3) (4)
5.3
3 (a) (b) (c) (d) 4 HTML
Fig. 4 Starting page of our system.
2 ID ID ID ID ID 2 URL ID ID 4 4 (1) 3 ID (1) ID (2) (3)(4)

Fig. 5 Expression of Web communities with input "child car seat".
ID (5)
6.
6.1 ID
6.1.1
5 (a) 4 (1) 2 ID: 43606 1 ID: 36955 X (1) 5 (b) 4.1

Fig. 6 The list of search words used for view of the community related to "baby".
Fig. 7 The list of inflow and outflow Web community.
URL 37% (2)
6.1.2
6 (1) URL (2) 5 (a)
6.1.3
7 ID

Fig. 8 The list of co-occurrence of Web community in the session with search words "child car seat" and community "child car seat vendors" (X).
6.1.4
X ID: 36955 8 (a) 3 ID: 83551 4 ID: 92480 Y 2 8 (b) 1 X 3 X Y Y ID: 43606 9 X 2 3 JAF 8 (a)
6.1.5
5 (a) (3) (4) (5)
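Fig. 8 lists Web communities that co-occur in sessions with a given community. One simple way to compute such a list is to count, over all sessions that touch the target community, which other communities appear in the same session. This is a sketch under assumptions, not the system's actual implementation: the URL-to-community mapping, the session format and the function name are hypothetical.

```python
from collections import Counter


def cooccurring_communities(sessions, target, url_to_community):
    """Count communities that share a session with `target`.

    sessions: iterable of URL lists, one list per session (assumed format).
    url_to_community: dict mapping a URL to a community ID (hypothetical).
    """
    counts = Counter()
    for urls in sessions:
        # Communities touched in this session; unmapped URLs are ignored.
        communities = {url_to_community[u] for u in urls if u in url_to_community}
        if target in communities:
            counts.update(communities - {target})
    return counts
```

Each community is counted at most once per session, so the result reads as "number of sessions in which both communities were visited".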

Fig. 9 The list of co-occurrence of Web community in the session with search words "child car seat" and community "administrative organs".
ID 6 (6) (7) 7 8 9 (8)
6.2
6.2.1
5 (a) 6.1.2 6.1.3 5 (a) 8 9 10 JAF
6.2.2
11 11 (a) 11 (b) 11 (c)
6.2.3
5 (a) 2 10 12% 10%

Fig. 10 The users' behaviors with input "child car seat".
Fig. 11 The other examples of users' behaviors.
5 (b) URL URL
6.3
6

7.
URL URL C13224014 SI

1) Batista, P. and Silva, M.J.: Mining on-line newspaper web access logs, 12th International Meeting of the Euro Working Group on Decision Support Systems (EWG-DSS 2001) (May 2001).
2) Beeferman, D. and Berger, A.: Agglomerative clustering of a search engine query log, The 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD2000) (Aug. 2000).
3) Catledge, L. and Pitkow, J.E.: Characterizing browsing behaviors on the world-wide web, Computer Networks and ISDN Systems, Vol.27, No.6 (1995).
4) Cooley, R., Mobasher, B. and Srivastava, J.: Web mining: Information and pattern discovery on the world wide web, Proc. 9th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97) (Nov. 1997).
5) Flake, G.W., Lawrence, S., Lee Giles, C. and Coetzee, F.M.: Self-organization and identification of web communities, IEEE Computer, Vol.35, No.3, pp.66-71 (2002).
6) Fu, Y., Sandhu, K. and Shih, M.: Clustering of web users based on access patterns, Proc. 1999 KDD Workshop on Web Mining (WEBKDD'99) (Aug. 1999).
7) Kleinberg, J.M.: Authoritative sources in a hyperlinked environment, Proc. ACM-SIAM Symposium on Discrete Algorithms (1998).
8) Koutsoupias, N.: Exploring web access logs with correspondence analysis, Methods and Applications of Artificial Intelligence, 2nd Hellenic (Apr. 2002).
9) Kumar, R., Raghavan, P., Rajagopalan, S. and Tomkins, A.: Trawling the web for emerging cyber-communities, Proc. 8th WWW Conference, pp.403-416 (1999).
10) Web Vol.44, No.7, pp.702-706 (2003).
11) Nanopoulos, A., Manolopoulos, Y., Zakrzewicz, M. and Morzy, T.: Indexing web access-logs for pattern queries, 4th ACM CIKM International Workshop on Web Information and Data Management (WIDM2002), pp.63-68 (Nov. 2002).
12) Ohura, Y., Takahashi, K., Pramudiono, I. and Kitsuregawa, M.: Experiments on query expansion for Internet yellow page services using web log mining, The 28th International Conference on Very Large Data Bases (VLDB2002) (Aug. 2002).
13) Pramudiono, I., Shintani, T., Takahashi, K. and Kitsuregawa, M.: User behavior analysis of location aware search engine, Proc. International Conference On Mobile Data Management (MDM'02), pp.139-145 (Jan. 2002).
14) Prasetyo, B., Pramudiono, I., Takahashi, K. and Kitsuregawa, M.: Naviz: Website navigational behavior visualizer, Advances in Knowledge Discovery and Data Mining 6th Pacific-Asia Conference (PAKDD2002) (May 2002).
15) Shahabi, C., Zarkesh, A.M., Adibi, J. and Shah, V.: Knowledge discovery from users webpage navigation, Proc. IEEE RIDE97 Workshop (Apr. 1997).
16) Su, Z., Yang, Q., Zhang, H., Xu, X. and Hu, Y.: Correlation-based document clustering using web logs, 34th Hawaii International Conference on System Sciences (HICSS-34) (Jan. 2001).
17) Tan, P. and Kumar, V.: Mining association patterns in web usage data, International Conference on Advances in Infrastructure for e-business, e-education, e-science, and e-medicine on the Internet (Jan. 2002).
18) Toyoda, M. and Kitsuregawa, M.: Creating a web community chart for navigating related communities, Conference Proceedings of Hypertext 2001, pp.103-112 (2001).
19) Ungar, L.H. and Foster, D.P.: Clustering methods for collaborative filtering, AAAI Workshop on Recommendation Systems (July 1998).
20) Wen, J., Nie, J. and Zhang, H.: Query clustering using user logs, ACM Trans. Info. Syst. (ACM TOIS), Vol.20, No.1, pp.59-81 (2002).
21) Zaiane, O.R., Xin, M. and Han, J.: Discovering web access patterns and trends by applying OLAP and data mining technology on web logs, Proc. Advances in Digital Libraries (ADL'98) (Apr. 1998).
22) Zeng, H., Chen, Z. and Ma, W.: A unified framework for clustering heterogeneous web objects, 3rd International Conference on Web Information Systems Engineering (WISE2002) (Dec. 2002).