Vol No 959

Characterization and Anomaly Detection for Network Log Using Attribute Oriented Induction

Akira Yamada, Yutaka Miyake, Keisuke Takemori and Toshiaki Tanaka
KDDI R&D Laboratories Inc.

In network management, extracting characteristic events and detecting anomalies from daily network logs are important routine tasks. In this paper, we propose a scheme for the characterization and anomaly detection of network logs using attribute-oriented induction (AOI). The proposed scheme adaptively constructs the concept hierarchy that the AOI algorithm requires. Therefore, our system does not need a concept hierarchy prepared in advance for each network configuration or network service. Using the periodic results of AOI, the proposed system detects anomalies hidden in a large volume of network logs. We evaluated our system on logs collected from an actual network and confirmed its effectiveness.

Julisch 9),10) applied attribute-oriented induction (AOI: Attribute Oriented Induction) 6)-8), proposed by Han et al., to the alerts of intrusion detection systems (IDS: Intrusion Detection System). For IP addresses, aggregation methods have been proposed in 4),11).
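Fig. 1 An example of concept hierarchy: ANY-IP is the root; below it are Internal-IP (with the children DMZ and NAT) and External-IP (with the children Domain-a and Domain-b); the leaves are the individual addresses IP1, IP2, IP3, IP4, IPa1, IPa2, IPb1 and IPb2.

Let T be a table with the attributes {A_1, A_2, ..., A_n}. A concept hierarchy H of an attribute A_i of T is a sequence of levels H_l, ordered from the most specific level to the most general level H_0 = ANY:

\[ H : H_l \prec H_{l-1} \prec \cdots \prec H_0, \qquad H_0 = \{\mathrm{ANY}\} \tag{1} \]

For an IP address, the hierarchy of Fig. 1 is written as Eq. (2):

\[ H_3 : \{\mathrm{IP1}, \mathrm{IP2}, \ldots\} \prec H_2 : \{\mathrm{DMZ}, \mathrm{NAT}, \ldots\} \prec H_1 : \{\mathrm{Internal}, \mathrm{External}\} \prec H_0 : \{\mathrm{ANY}\} \tag{2} \]

2.3 AOI takes as input a table T with the attributes {A_1, A_2, ..., A_n, C}, a concept hierarchy H_i for each attribute A_i, and a threshold O. C is the count attribute. For a tuple r of T, r[C] and r[A_i] denote the values of C and A_i in r. The algorithm consists of the following steps.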
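(1) Set the count r[C] of every tuple r in T to 1.
(2) If the number of tuples in T is smaller than O, stop.
(3) Select an attribute A_i.
(4) Replace the value r[A_i] of every tuple with its parent concept in the hierarchy H_i.
(5) Merge the tuples r and r' of T that have become identical, setting r[C] to r[C] + r'[C], and return to step (2).

Fig. 2 Summarization using AOI: (a) Network Log, (b) Concept Hierarchy, (c) Summarized Network Log.

In Fig. 2, the three alerts in the network log (a) come from IP addresses that the concept hierarchy (b) places under NAT, so AOI summarizes them into the single entry "NAT alert x3" in (c).

2.4 Julisch 9),10) applied AOI to IDS alerts.

Table 1 An example of sequential, non-sequential and tree-structural value

Table 2 Assignment of attributes to parameters of xferlog

Values in network logs are classified as sequential, non-sequential or tree-structural (Table 1). Table 2 gives this classification for the parameters of the ftpd xferlog 3), such as transfer-time, remote-host and file-size.

Fig. 3 Structure of IP address: a binary tree of prefixes (0/0, 128/1, 192/2, 192/3, ...) that ends at the /32 host address.

3.2 Tree-structural values include IP addresses, URLs, URIs and XPath expressions. An IP address forms a binary tree of 32 levels (Fig. 3). A URL consists of a hostname and a path, each of which forms a tree (Fig. 4).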
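The AOI steps (1) to (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `aoi`, the dictionary-based hierarchy and the rule "generalize the first attribute that can still be generalized" are assumptions made for illustration, and the hierarchy is assumed to be acyclic. The sample log is hypothetical and only mimics the situation of Fig. 2.

```python
from collections import Counter


def aoi(rows, hierarchies, threshold):
    """Summarize rows (tuples of attribute values) by attribute-oriented induction.

    hierarchies[i] maps a value of attribute i to its parent concept;
    a value without an entry is treated as already fully generalized.
    """
    # Step (1): every tuple starts with count 1; Counter also merges
    # tuples that are identical from the start.
    table = Counter(rows)
    # Step (2): stop once the number of distinct tuples is below the threshold O.
    while len(table) >= threshold:
        # Step (3): pick the first attribute that can still be generalized
        # (a placeholder selection rule, not the paper's).
        candidates = [i for i, h in enumerate(hierarchies)
                      if any(row[i] in h for row in table)]
        if not candidates:
            break  # nothing left to generalize
        i = candidates[0]
        merged = Counter()
        for row, count in table.items():
            # Step (4): replace the value by its parent concept.
            parent = hierarchies[i].get(row[i], row[i])
            # Step (5): identical generalized tuples add up their counts.
            merged[row[:i] + (parent,) + row[i + 1:]] += count
        table = merged
    return dict(table)


# Hypothetical data in the spirit of Fig. 2: three alerts from hosts behind NAT.
ip_hierarchy = {"IP3": "NAT", "IP4": "NAT",
                "NAT": "Internal-IP", "Internal-IP": "ANY-IP"}
log = [("IP3", "alert"), ("IP4", "alert"), ("IP4", "alert")]
summary = aoi(log, [ip_hierarchy, {}], threshold=2)
print(summary)  # {('NAT', 'alert'): 3}
```

With a threshold of 2, one generalization of the address attribute is enough: the two distinct tuples collapse into a single tuple whose count is 3.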
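Fig. 4 Structure of URL: the hostname www.example.com forms a tree with the levels com, example and www; the path forms a tree with elements such as /, hoge/, hogehoge/ and index.html.

Fig. 5 Hierarchical clustering: (a) Clustering, (b) Hierarchy. The elements a, e, c, b and f are merged in the steps (1) to (4), and the order of merging defines the hierarchy.

4.2 Let a sequential attribute A have n_A values, each represented as a range, A = {{a_{1,1}, a_{1,2}}, {a_{2,1}, a_{2,2}}, ..., {a_{n_A,1}, a_{n_A,2}}}, with the frequencies {p_1, p_2, ..., p_{n_A}}, ordered so that a_{1,1} < a_{2,1} < ... < a_{n_A,1}. A range {a_i, a_j} covers the values from a_i to a_j (a_i < a_j). The distance between two adjacent ranges is defined by Eq. (3):

\[ d(\{a_{j,1}, a_{j,2}\}, \{a_{j+1,1}, a_{j+1,2}\}) = \begin{cases} \dfrac{p_j + p_{j+1}}{a_{j,2} - a_{j,1}} & (a_{j,2} > a_{j+1,2}) \\[2ex] \dfrac{p_j + p_{j+1}}{a_{j+1,2} - a_{j,1}} & (a_{j+1,2} > a_{j,2}) \end{cases} \tag{3} \]

When the ranges {a_{j,1}, a_{j,2}} and {a_{j+1,1}, a_{j+1,2}} are merged, as in the hierarchical clustering of Fig. 5, the merged range a' and its frequency p' are given by Eq. (4):

\[ a' = \begin{cases} \{a_{j,1}, a_{j,2}\} & (a_{j,2} > a_{j+1,2}) \\ \{a_{j,1}, a_{j+1,2}\} & (a_{j+1,2} > a_{j,2}) \end{cases}, \qquad p' = p_j + p_{j+1} \tag{4} \]

The amount of merging applied to A is controlled by the ratio R (0 < R < 1).

4.3 Let a non-sequential attribute A have n_A values {a_1, a_2, ..., a_{n_A}} with the frequencies {p_1, p_2, ..., p_{n_A}}, ordered so that p_1 < p_2 < ... < p_{n_A}. Values are generalized to ANY in this order, and the amount of generalization applied to A is controlled by the ratio R (0 < R < 1).

4.4 Let a tree-structural attribute A have n_A values {a_1, a_2, ..., a_{n_A}} with the frequencies {p_1, p_2, ..., p_{n_A}} (p_1 < p_2 < ... < p_{n_A}), and let {n_1, n_2, ..., n_m} (m < n_A) be the nodes of the tree built from them. Each node n has the fields n[p], n[o] and n[c]. The total frequency is

\[ S = \sum_{j=1}^{n_A} p_j \]

and the amount of generalization applied to A is controlled by the ratio R (0 < R < 1). The hierarchy is constructed as follows (Fig. 6).

(1) Build the tree from the values of A_i.
(2) Set the threshold Th = S/O.
(3) Select the node n_min.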
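(4) to (6) Nodes n_min with n_min[o] < Th are merged using n_min[p] and n_min[c], and the steps (4), (5) and (6) are repeated.

Fig. 6 Construction of hierarchy from a tree structure. In the example, the paths /usr/src/a, /usr/src/b, /usr/local/c, /usr/local/d, /home/yamada/e, /home/yamada/f, /home/miyake/g and /home/miyake/h are summarized, with the threshold Th = S/O, into /, /usr/src/, /usr/local/c and /home/yamada/e.

4.5 The proposed AOI takes as input the table T with the attributes {A_1, A_2, ..., A_n}, the ratio R used to generalize each attribute A_i, and the threshold O. It proceeds as follows (Fig. 7).

(1) Initialize the count attribute C as in AOI.
(2) Compute I(A_i) for every attribute A_i (i = 1, 2, ..., n).
(3) Select the attribute A_min for which I(A_i) is smallest.
(4) Construct the concept hierarchy of A_min according to the ratio R.
(5) Generalize A_min.
(6) Merge the tuples that have become identical.
(7) Repeat the steps (2), (3), (4), (5) and (6) until the number of tuples is smaller than O.

Fig. 7 Proposed algorithm (flowchart of the steps (1) to (7); the loop ends when the test "< O?" is answered with yes).

For an attribute A with the values {a_1, a_2, ..., a_{n_A}} and the frequencies {p_1, p_2, ..., p_{n_A}}, I(A) is defined by Eq. (5):

\[ I(A) = -\sum_{j=1}^{n_A} p_j \log_{n_A} p_j \tag{5} \]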
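Eq. (5) can be checked numerically with a short Python sketch. It rests on two assumptions that the text does not settle: the frequencies p_j are normalized to relative frequencies before the logarithm is taken, and I(A) is taken as 0 when an attribute has a single value (the base-n_A logarithm is undefined for n_A = 1). The function names are hypothetical.

```python
import math
from collections import Counter


def normalized_entropy(values):
    """I(A) = -sum_j p_j * log_{n_A}(p_j), with p_j as relative frequencies."""
    counts = Counter(values)
    n_a = len(counts)              # number of distinct values n_A
    if n_a <= 1:
        return 0.0                 # assumption: a single value gives I(A) = 0
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, n_a)
                for c in counts.values())


def rank_attributes(rows):
    """Return (attribute index, I(A_i)) pairs sorted by increasing I(A_i)."""
    n_attrs = len(rows[0])
    scores = [(i, normalized_entropy([r[i] for r in rows]))
              for i in range(n_attrs)]
    return sorted(scores, key=lambda s: s[1])


# Hypothetical rows: four different addresses, one alert type.
rows = [("IP1", "alert"), ("IP2", "alert"), ("IP3", "alert"), ("IP4", "alert")]
ranking = rank_attributes(rows)
print(ranking)
```

Because the logarithm uses base n_A, I(A) lies between 0 (one dominant value) and 1 (all values equally frequent), which makes attributes with different numbers of values comparable. If A_min denotes the attribute with the smallest I(A_i), it is the first entry of the ranking.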
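Fig. 8 Iterative process of AOI: the logs for 10 min are summarized by AOI into summaries for 10 min; these are summarized into summaries for 1 hour, and the summaries for 1 hour into a summary for 1 day.

Fig. 9 Grouping into some sub groups: the total log is divided into sub groups (Group 1, Group 2, ...).

6.2 Let x_i (i = 0, 1, ..., m-1) be a time series obtained from the log, and let n (< m) be the window size. The moving average \bar{x}_k and the standard deviation s_k (n < k < m) are given by Eq. (6):

\[ \bar{x}_k = \frac{1}{n} \sum_{j=k-n+1}^{k} x_j, \qquad s_k = \sqrt{\frac{1}{n} \sum_{j=k-n+1}^{k} (x_j - \bar{x}_k)^2} \tag{6} \]

The maximum amplitude A_max and the change r_l (0 < l < m-2) are given by Eq. (7):

\[ A_{max} = \max_{j} (x_j), \qquad r_l = x_{l+1} - x_l \tag{7} \]

7 The proposed system is implemented in Java with a Web interface and analyzes IDS and FTP logs.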
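The moving statistics of Eq. (6) can be sketched in Python as follows. The function name `moving_stats` is hypothetical, the population form (division by n) follows the reconstruction of Eq. (6), and starting at the first index with a full window (k = n - 1) is an assumption about the index range.

```python
import math


def moving_stats(x, n):
    """Return (k, moving average, standard deviation) over windows of n samples."""
    if not 0 < n <= len(x):
        raise ValueError("window size n must satisfy 0 < n <= len(x)")
    result = []
    for k in range(n - 1, len(x)):
        window = x[k - n + 1:k + 1]      # samples x_{k-n+1} .. x_k
        mean = sum(window) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in window) / n)
        result.append((k, mean, std))
    return result


# Hypothetical series of line counts per unit time.
stats = moving_stats([1, 2, 3, 4], n=2)
print(stats)  # [(1, 1.5, 0.5), (2, 2.5, 0.5), (3, 3.5, 0.5)]
```

A sample that lies several s_k away from the moving average is a natural candidate for an anomaly, but the concrete decision rule and its thresholds are not recoverable from the text and are left out of the sketch.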
Table 3 Implementation environment of software
Java | J2SE 1.4.2_05
Library | Jdom b, JavaMail 3, JAF 2
Other Tool | Gnuplot 4
Web Server | Apache 2.0.49, Tomcat 4

Table 4 Implementation environment of hardware
CPU | Intel Xeon 2.66GHz (Dual)
Memory | 4G byte
OS | Fedora Core 2

Table 5 Evaluation Data
type | term | Mbyte | line
IDS | 24/8/5-9/3 | 986 | 6,473,996
FTP server | 23/9/-/ | 678 | 4,39,93

Tables 3 and 4 show the implementation environment.

7.2 The evaluation uses the log of an IDS 2) placed on a LAN and the log of an FTP server 1) (Table 5).

7.3 Tables 6 and 7 show the summaries that AOI produced from the IDS log and the FTP log.

Fig. 10 Grouping Analysis for IDS's Log: (a) Total Lines, (b) Grouped Lines. Axes: Time [hour], Group No, The Number of Lines [line/min]. A change point is marked in (b); the groups concerned include FTP command overflow attempt alerts.

7.4 The shape of the distribution of a series x_i (i = 0, 1, ..., m-1) with the mean µ and the standard deviation σ is evaluated by skew and kurt - 3:

\[ \mathrm{skew} = \frac{1}{m} \sum_{j=0}^{m-1} \left( \frac{x_j - \mu}{\sigma} \right)^3, \qquad \mathrm{kurt} - 3 = \frac{1}{m} \sum_{j=0}^{m-1} \left( \frac{x_j - \mu}{\sigma} \right)^4 - 3 \]

For a normal distribution, both skew and kurt - 3 are 0. Figs. 12 and 13 plot these two values for the total log and for the grouped logs.
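The skew and kurt - 3 statistics can be computed with a few lines of Python. This is a sketch, not the authors' Java code: the function name `skew_kurt` is hypothetical, and σ is taken as the population standard deviation (division by m), which is an assumption.

```python
import math


def skew_kurt(x):
    """Return (skew, kurt - 3) of the samples in x."""
    m = len(x)
    mu = sum(x) / m
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / m)
    if sigma == 0:
        raise ValueError("skew and kurt are undefined for a constant series")
    z = [(v - mu) / sigma for v in x]          # standardized samples
    skew = sum(t ** 3 for t in z) / m
    kurt_minus_3 = sum(t ** 4 for t in z) / m - 3
    return skew, kurt_minus_3


# Hypothetical symmetric series: skew is 0, and the flat shape gives kurt - 3 < 0.
skew, kurt_minus_3 = skew_kurt([1, 2, 3, 4, 5])
print(skew, kurt_minus_3)
```

Points close to the origin of the (skew, kurt - 3) plane correspond to series that resemble a normal distribution; points far from it correspond to series that do not.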
Table 6 A Summary of Snort IDS Alerts
group no | alert | src IP | src port | dst IP | dst port | lines
1 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyzz/32 | 6 | 3292
2 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyxz/32 | 6 | 3263
3 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyxx/32 | 6 | 32377
4 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyzy/32 | 6 | 3225
5 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyxy/32 | 6 | 3676
6 | SNMP request udp | yxzxxyzxyyz/32 | Any | xzzzxxyxz/32 | 6 | 99
7 | ICMP PING NMAP | / | | / | | 7789
8 | MS-SQL version overflow attempt | / | Any | / | 434 | 4434
9 | MS-SQL Worm propagation attempt OUTBOUND | / | Any | / | 434 | 4434
10 | MS-SQL Worm propagation attempt | / | Any | / | 434 | 4434
11 | (snort decoder): Experimental Tcp Options found | / | | / | | 4367
12 | FTP command overflow attempt | / | Any | xzyyzzxxzz/32 | 2 | 482
13 | SNMP request udp | yxzxxyzxyx/32 | Any | xzzzxxyxy/3 | 6 | 46
14 | ICMP Destination Unreachable Communication with Destination Host is Administratively Prohibited | yxzyyyyxxxy/32 | | yzyyyyxyy/32 | | 3638
15 | Any | xzyyzzxxzz/32 | Any | / | 2 | 382
16 | SNMP request udp | yxzxxyzxyx/32 | Any | / | 6 | 33
17 | ICMP PING CyberKit 2.2 Windows | / | | / | | 2574
18 | FTP command overflow attempt | yxzxyzxxzyzy/32 | 3997 | xzyyzzxxzz/32 | 2 | 2537
19 | FTP command overflow attempt | xzyyzzxxzz/32 | 3698 | xzyxxxxxzz/32 | 2 | 258
20 | FTP command overflow attempt | yxzxyzxxzyzy/32 | 365 | xzyyzzxxzz/32 | 2 | 2473

Table 7 A Summary of xferlog
no | A | B | C | D | E | F | G | H | I | J | K | L | M | lines
1 | | xxxxxxxxyx/32 | -85 | //NetBSD | b | | o | a | xxxxx@xxxxx.ac.jp | ftp | | * | c | 7773
2 | | xxyxxxxyyx/32 | 2-228 | //NetBSD | b | | o | a | xxxxx@xxxxx.ac.jp | ftp | | * | c | 266
3 | | / | 5986-55286 | / | b | | o | a | Any | ftp | | * | c | 648
4 | | / | 4463-39255 | / | b | | o | a | Any | ftp | | * | c | 495
5 | | xxzxzxxxzyz/32 | -228 | //Linux/packages/Caldera | b | | o | a | guest@null | ftp | | * | c | 493
6 | | / | 688-98 | / | b | | o | a | Any | ftp | | * | c | 44
7 | | / | -228 | / | b | | o | a | Any | ftp | | * | c | 392
8 | | / | 8286-8586 | / | b | | o | a | Any | ftp | | * | c | 355
9 | | xxyxxyxxzx/32 | 54-5874 | //Linux/packages/RedHat/redhat/linux/9/en/doc/RH-DOCS | Any | | o | a | xxxx@xx.com | ftp | | * | c | 3
10 | | / | 362-77766 | / | b | | o | a | Any | ftp | | * | c | 29
11 | | xxyxxyzxzx/32 | 223-879 | //Linux/packages/RedHat/redhat/linux/9/en/doc/RH-DOCS | Any | | o | a | xxxx@xx.com | ftp | | * | c | 234
12 | | xxzxzxxxzyz/32 | 54-562 | //Linux/packages/Caldera | b | | o | a | guest@null | ftp | | * | c | 27
13 | | xxyxyyxzzxyx/32 | 4463-39255 | //Linux/packages/Mandrake/Mandrake-devel | b | | o | a | xxxxxx@xxxxxxxxxx.or.jp | ftp | | * | c | 22
14 | | y/ | -85 | //Linux/packages | b | | o | a | null | ftp | | * | c | 98
15 | | xxyxxyzxzx/32 | 922-4994 | //Linux/packages/RedHat/redhat/linux/9/en/doc/RH-DOCS | b | | o | a | xxxx@xx.com | ftp | | * | c | 88
16 | | xxyxxyzxzx/32 | 36283-9946 | //Linux/packages/RedHat/redhat/linux/9/en/doc/RH-DOCS | b | | o | a | xxxx@xx.com | ftp | | * | c | 44
17 | | / | -4543 | / | b | | o | a | Any | ftp | | * | c | 43
18 | | xxzxzzxyxzx/32 | 223-879 | //Linux/packages | b | | o | a | xxxxxx@xxxxx.co.jp | ftp | | * | c | 3
19 | | xyx/7 | 4463-39255 | / | b | | o | a | Any | ftp | | * | c | 88
20 | | xxzxzzxyxzx/32 | 54-562 | //Linux/packages | b | | o | a | xxxxxx@xxxxx.co.jp | ftp | | * | c | 85

A: transfer-time, B: remote-host, C: file-size, D: filename, E: transfer-type, F: special-action-flag, G: direction, H: access-mode, I: username, J: service-name, K: authentication-method, L: authenticated-user-id, M: completion-status
Fig. 11 Grouping Analysis for Ftp Server's Log: (a) Total Lines, (b) Grouped Lines. Axes: Time [hour], Group No, The Number of Lines [line/min]. A change point is marked in (b).

Fig. 12 Skew and Kurt of total log: (a) Similar to Normal Distribution, (b) Dissimilar to Normal Distribution. Axes: Skew, Kurt - 3.

Fig. 13 Skew and Kurt of grouped logs: (a) Similar to Normal Distribution, (b) Dissimilar to Normal Distribution. Axes: Skew, Kurt - 3.

Table 8 Statistical values of the logs
Criterion | Total | Group | Group 8
Standard | 2962 | 2889 | 857
Amplitude | | 6 | 285

Fig. 14 Process time. Axes: Log [line], Process Time [sec]. Series: (i) 1 Time, (ii) 6*1+1 Times, (iii) 6*24+24*1+1 Times.

The processing time is compared for three ways of applying AOI to the same log: (i) once, (ii) 6*1+1 times and (iii) 6*24+24*1+1 times, following the iterative process of Fig. 8.
Fig. 15 Edit Distance. Axes: Log [line], Edit Distance / Maximum. Series: 6*1+1 vs 1, 6*24+24*1+1 vs 1.

Fig. 15 compares the summary obtained by (i) with the summaries obtained by (ii) and (iii) in terms of edit distance.

8.2 For IDS alerts, the distinction between true positives and false positives is addressed in 13), and the aggregation and correlation of alerts in 5),14). The hierarchy construction of Sections 4.2, 4.3 and 4.4 is related to the outlier detection of 12).

Acknowledgments (NICT)

References
1) http://ftpnejp/
2) Snort - the de facto standard for intrusion detection/prevention, http://www.snort.org/
3) xferlog, http://www.wu-ftpd.org/man/xferlog.html
4) Estan, C., Savage, S. and Varghese, G.: Automatically Inferring Patterns of Resource Consumption in Network Traffic, Applications, technologies, architectures, and protocols for computer communications (2003).
5) Debar, H. and Wespi, A.: Aggregation and correlation of intrusion detection alerts, Recent Advances in Intrusion Detection (2001).
6) Han, J. and Fu, Y.: Dynamic Generation and Refinement of Concept Hierarchies for Knowledge Discovery in Databases, KDD Workshop (1994).
7) Han, J. and Fu, Y.: Exploration of the Power of Attribute-Oriented Induction in Data Mining, Advances in Knowledge Discovery and Data Mining (1996).
8) Han, J., Cai, Y. and Cercone, N.: Data-driven discovery of quantitative rules in relational databases, IEEE Transactions on Knowledge and Data Engineering, Vol. 5, No. 1, pp. 29-40 (1993).
9) Julisch, K.: Clustering Intrusion Detection Alarms to Support Root Cause Analysis, ACM Transactions on Information and System Security, Vol. 6, No. 4, pp. 443-471 (2003).
10) Julisch, K. and Dacier, M.: Mining Intrusion Detection Alarms for Actionable Knowledge, 8th ACM International Conference on Knowledge Discovery and Data Mining (2002).
11) Cho, K., Kaizaki, R. and Kato, A.: Aguri: An Aggregation-based Traffic Profiler, Quality of Future Internet Services (2001).
12) Yamanishi, K., Takeuchi, J., Williams, G. and Milne, P.: On-line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning Algorithms, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2000).
13) Pietraszek, T.: Using Adaptive Alert Classification to Reduce False Positives in Intrusion Detection, Recent Advances in Intrusion Detection (2004).
14) Valdes, A. and Skinner, K.: Probabilistic Alert Correlation, Recent Advances in Intrusion Detection (2001).