Kochi University of Technology Aca
Title: Detection of JavaScript Obfuscated with jjencode (jjencode で難読化された JavaScript の検知)
Author(s): NAKAMURA, Koryo (中村, 弘亮)
Date of issue: 2018-03
URL: http://hdl.handle.net/10173/1975
Text version: author
Kochi, JAPAN http://kutarr.lib.kochi-tech.ac.jp/dspa
29 jjencode JavaScript 1180357 2018 2 28
jjencode JavaScript Web JavaScript Web JavaScript Web jjencode JavaScript jjencode i
Abstract

Koryo NAKAMURA

In recent years, drive-by download attacks, in which malware infects a user who merely browses a tampered legitimate website, have been reported. Most malicious JavaScript used in these attacks is obfuscated, but obfuscation is also applied to JavaScript on ordinary websites. Research on obfuscation is therefore important, because the same techniques serve both attacks and ordinary, non-malicious websites. Among the various obfuscation methods, a relatively new one converts code into a form that consists only of symbols. However, few studies have targeted obfuscation methods whose output consists only of symbols. Nishida's method, an existing study, is effective against this kind of obfuscation, but the features it acquires through machine learning were not sufficiently verified. In this paper, we verify whether obfuscation can be detected when the obfuscated code consists only of symbols. In addition, we verify what can be distinguished by using character appearance frequency. The experiment showed that code obfuscated by jjencode can be discriminated from general code, and that the existing method can learn to detect this obfuscation.

Key words: JavaScript, obfuscation, jjencode
1 1.1 Web [1] JavaScript JavaScript Web Web jjencode [2] JavaScript [3] 1
1.2 1.2 6 2 3 JavaScript 4 5 6 2
2 JavaScript 2.1 4 2.1 2.1 3
2.1 2.1.1 2.1.2 2.1.3 Web Web HTML id 2.1.4 4
2.2 2.2 jjencode 2.1 JavaScript Obfuscator Tool[4] 2.2 2.1 Hello World! 2.2 2.1 2.2 2.1 2.2 5
2.2 2.2.1 2.2.2 2.2.3 Unicode 2.2.4 JavaScript 2.3 2.4 2.5 JavaScript 2.3 1 2.4 6
2.3 2.3-1 2.4 false false 2.5 f 2.3 2.4 2.5 2.6 2.6 jjencode [5] 2.3 JavaScript [6] JavaScript 2.3.1 JavaScript 2.5 f 7
2.3 2.6 jjencode 2.3.2 JavaScript script JavaScript
3 JavaScript 3.1 JavaScript JavaScript JavaScript 3.2 MWS Dataset 2013[7] D3M Web Alexa[8] 500 Web JavaScript 9
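3.3
1KB
ASCII 0x21-0x7e 96 96 3.1

Table 3.1
  Characters     Code (hexadecimal)
  [A-Z] [a-z]    0x41-0x5a, 0x61-0x7a
  [0-9]          0x30-0x39
  symbols        0x21-0x2f, 0x3a-0x40, 0x5b-0x60, 0x7b-0x7e

Let m_i be the number of occurrences of character i and N the total number of characters. The appearance frequency F(i) is defined by Equations (3.1) and (3.2).

N = \sum_{i=1}^{n} m_i \qquad (3.1)

F(i) = \frac{m_i}{N} \qquad (3.2)

F(i) takes a value between 0 and 1.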
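Equations (3.1) and (3.2) can be illustrated with a minimal Python sketch. It assumes that only characters in the range 0x21 to 0x7e are counted and that everything else (whitespace, non-ASCII) is ignored; the function name `char_frequencies` is hypothetical and does not come from the thesis. Note that this range yields 94 values per script.

```python
from collections import Counter
from typing import List

# Assumption: the feature alphabet is the ASCII range 0x21-0x7e named in the
# text; characters outside it are ignored.
FIRST, LAST = 0x21, 0x7E


def char_frequencies(code: str) -> List[float]:
    """Return F(i) = m_i / N for each character code i in 0x21..0x7e."""
    counts = Counter(ch for ch in code if FIRST <= ord(ch) <= LAST)
    total = sum(counts.values())  # N in Equation (3.1)
    if total == 0:
        # No countable characters: return an all-zero vector.
        return [0.0] * (LAST - FIRST + 1)
    # m_i / N for every code point, in ascending order (Equation (3.2)).
    return [counts[chr(i)] / total for i in range(FIRST, LAST + 1)]


features = char_frequencies('alert("Hello World!");')
```

Each script maps to a fixed-length vector whose entries sum to 1 whenever at least one countable character is present, so scripts of different lengths become directly comparable.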
3.4 SVM
SVM (Support Vector Machine) 2 [9] 96 96 SVM
3.5
SVM RBF( ) C = 25.22, γ = 55.72, 98.84% (accuracy)
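The section names an SVM with an RBF kernel, a cost parameter C, and a second parameter whose symbol did not survive. The sketch below assumes that second parameter is the usual kernel width gamma and shows only the standard RBF kernel function, not the SVM training of the existing method. The function name `rbf_kernel` and the sample vectors are hypothetical.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], gamma: float) -> float:
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical 3-dimensional frequency vectors (illustrative values only).
plain = [0.6, 0.3, 0.1]
symbolic = [0.1, 0.2, 0.7]

# Assumption: 55.72 is the gamma value reported alongside C = 25.22.
similarity = rbf_kernel(plain, symbolic, gamma=55.72)
```

With a large gamma the kernel value drops quickly as two frequency vectors move apart, so only scripts with very similar character distributions count as close to each other.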
4 jjencode jjencode 4.1 1. Web 2. jjencode 3. 4. 5. SVM 12
4.1.1 Web
Web Alexa Top 500 Global Sites Web Alexa 10 script JavaScript script JavaScript JavaScript 438 4.1 Web JavaScript

Table 4.1 JavaScript
  Number of files    438
  Total size         14,754,808 bytes
  Mean size          33,687 bytes
  Maximum size       1,682,739 bytes
  Minimum size       23 bytes

1KB JavaScript JavaScript jjencode 1KB Web
4.1.2
jjencode MWS Dataset 2013 D3M
4.1 Web 2 4.1.3 ASCII 0x21 0x7e 94 0 1 4.1 4.1 jjencode 0x21 0x7e 14
Table 4.2 Top 5 characters by appearance frequency
                  1          2          3          4          5
  General code    . (0x2e)   e (0x65)   t (0x74)   a (0x61)   n (0x6e)
                  0.055343   0.055101   0.051603   0.051199   0.046320
  jjencode        $ (0x24)   _ (0x5f)   + (0x2b)   . (0x2e)   " (0x22)
                  0.309287   0.196983   0.167870   0.129734   0.088579

4.2 4.1 5 jjencode
4.1.4
Web jjencode 438
4.1.5 SVM
SVM 96 10
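A ranking like the one in Table 4.2 can be produced with a short Python sketch. It assumes the same 0x21 to 0x7e character range as the frequency features. The function name `top_characters` and the sample string are hypothetical: the sample is a hand-written symbol-only fragment, not actual jjencode output.

```python
from collections import Counter
from typing import List, Tuple


def top_characters(code: str, k: int = 5) -> List[Tuple[str, float]]:
    """Return the k most frequent characters in 0x21..0x7e with their frequencies."""
    counts = Counter(ch for ch in code if 0x21 <= ord(ch) <= 0x7E)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(ch, n / total) for ch, n in counts.most_common(k)]


# Hypothetical symbol-only string standing in for jjencode-style output.
sample = '$=~[];$={___:++$,$$$$:(![]+"")[$]};'
for ch, freq in top_characters(sample):
    # Print each character with its hex code and frequency, as in Table 4.2.
    print(f"{ch} (0x{ord(ch):02x}) {freq:.6f}")
```

Running the same function over ordinary scripts and over obfuscated ones gives a quick check of whether the top-ranked characters differ as sharply as the table reports.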
4.1.6
RBF C = 1, γ = 1, 100%
5 jjencode 100% jjencode 100% 1KB jjencode aaencode Web 17
Web Web Web Web 18
6 jjencode jjencode jjencode 19
[1] Information-technology Promotion Agency, Japan IPA, 2016 1 IPA, https://www.ipa.go.jp/security/txt/2016/01outline.html, 2018.
[2] Yosuke HASEGAWA, JavaScript, https://www.slideshare.net/hasegawayosuke/javascript-51570525, 2018.
[3] JavaScript, Vol.2014-CSEC-64, No.21, pp.1-7, 2014.
[4] Tiago Serafim, JavaScript Obfuscator Tool, https://javascriptobfuscator.herokuapp.com/, 2018.
[5] Yosuke HASEGAWA, jjencode - Encode any JavaScript program using only symbols, http://utf-8.jp/public/jjencode.html, 2018.
[6] JavaScript, Vol.2014-DPS-161, No.17, pp.1-7, 2014.
[7] MWS Datasets 2013, Computer Security Symposium 2013, 2013.
[8] Alexa Internet, Inc., Alexa Top 500 Global Sites, https://www.alexa.com/topsites, 2018.
[9] 2012.