28 1211072 2017 1 31
1,.,., Unicode,.,,.,. 2010,,,.,,.
6 1 1.1, Web...[1] [2],, Unicode,. 1.1.1,. 1. 1 2,,,. Unicode CodePoint. 1.1:
1 7 1.2,.. e U+0065 LATIN SMALL LETTER E),. 1.2: e, Unicode.. 1.3,.,,.,..,. 1.1: :-) ( ) :-( ( ) :-/ (; ;) x x :-O o /
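The relation between a character, its code point, and its Unicode name can be checked with Python's standard `unicodedata` module. The sketch below is only an illustration: the helper name `describe` is hypothetical, and the Cyrillic letter is our own example of a character that looks like "e" but has a different code point.

```python
import unicodedata


def describe(ch):
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


# The Latin letter mentioned in the text.
print(describe(u"e"))
# A visually similar character with a different code point (illustrative choice).
print(describe(u"\u0435"))
```

Two characters that render almost identically can therefore still be distinct entries when compared by code point.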
1 8 1.4.......
2 2.1 CAO: A Fully Automatic Emoticon Analysis System[1] 2.1.1 Ptaszynski,, CAO. CAO Web, 10,137. 1. 2. 1 2.1.2, 400 3 2.,, 400. ameba, 97.6%,, 2010, 2017 400.,,.
2 10 2.2 [2] 2.2.1 Web Twitter,, Positive Negative PN. 5,. 2.2.2 Perl Compatible Regular Expression.. ((?!C{3,}).){2,} C,,, 1.,. C 3 2,,., UTF-8. [0-9A-Za-z - - ] [ - ].
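The pattern `((?!C{3,}).){2,}` can be tried out with Python's `re` module, which supports the same negative lookahead syntax. This is a minimal sketch, not the implementation of [2]: the character class used for `C` below (ASCII alphanumerics plus hiragana, katakana, and CJK ideograph ranges) is an assumption, and the function name `extract_candidates` is hypothetical.

```python
import re

# Assumed definition of C: "ordinary" text characters. The exact ranges
# used in [2] are not reproduced here; these are illustrative.
C = u"[0-9A-Za-z\u3041-\u3096\u30a1-\u30fa\u4e00-\u9fff]"

# Two or more characters, none of which starts a run of three or more C characters.
CANDIDATE = re.compile(u"((?!" + C + u"{3,}).){2,}")


def extract_candidates(text):
    """Return every substring matched by the candidate pattern."""
    return [m.group(0) for m in CANDIDATE.finditer(text)]


print(extract_candidates(u"hello (^_^) world"))
```

For the sample string the emoticon is captured, but together with neighbouring characters ("lo " before it and a space after it), and the tail "ld" of "world" is matched as a separate candidate. Under these assumptions, some post-filtering of the raw matches would be needed.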
2 11 2.1: 1 (- -)zzz (14) 2 ( )?( ) 3 ( ) ( ) 4 ( O ) ( ) 2.2: 6 1 * 2 3 4 5 6
12 3 3.1,,. 1. 2. 3. 4. 3.2, Cafe 1, Web.,,., CAO. 3.3... 3.1: 1 http://kaomoji-cafe.jp/
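3.4 3.3, 3.2.,. 3.4.1,.,,,. (OpenCV ). 2

$$R(x, y) = \sum_{x', y'} \bigl( T(x', y') \cdot I(x + x', y + y') \bigr)$$

I(x, y): pixel value of the image at (x, y). T(x', y'): pixel value of the template at (x', y'). Pixel values range from 0 to 255. R(x, y) ranges from -1 to 1.

$$R(x, y) = \frac{\sum_{x', y'} T(x', y') \cdot I(x + x', y + y')}{\sqrt{\sum_{x', y'} T(x', y')^2 \cdot \sum_{x', y'} I(x + x', y + y')^2}}$$

2 http://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html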
3 14 3.5, 3,.. 3.2:
4 4.1 Java Standard Edition 7 Graphics2D, Python 2.7. 4.1: 4.2 Java Graphics2D drawString(). MSP, 16pt. Google Chrome,.,. 4.2.1 19 1. 1. 4.2: (2 )
4 16 4.2.2 1.. 4.3: (4 ),. 4.3 CAO, 325,.
4.4 Intel OpenCV 2.4.13 1 Python 2.7, 3. 2. cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED) 1, 2, 3 R(x, y). R(x,y), x,. R(x,y) 4.4.2. 4.4.1 4.4:,. 4.5: R(x, y) 0.98,,,. 1 http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_template_matching/py_template_matching.html
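To make the score concrete, the following sketch computes in pure Python what the `cv2.TM_CCOEFF_NORMED` method is documented to compute: the correlation between the mean-subtracted template and the mean-subtracted image window, divided by the product of their norms. It is an illustration on small nested lists, not OpenCV's implementation. The function names `ccoeff_normed` and `match_positions` are hypothetical, and the default threshold 0.98 is the value quoted in the caption of Figure 4.5.

```python
import math


def ccoeff_normed(image, template, x, y):
    """Score of the template placed with its top-left corner at (x, y)."""
    t_vals = [t for row in template for t in row]
    i_vals = [image[y + dy][x + dx]
              for dy, row in enumerate(template)
              for dx in range(len(row))]
    n = float(len(t_vals))
    t_mean = sum(t_vals) / n
    i_mean = sum(i_vals) / n
    num = sum((t - t_mean) * (i - i_mean) for t, i in zip(t_vals, i_vals))
    t_norm = sum((t - t_mean) ** 2 for t in t_vals)
    i_norm = sum((i - i_mean) ** 2 for i in i_vals)
    denom = math.sqrt(t_norm * i_norm)
    # A flat window or flat template has no defined correlation; report 0.
    return num / denom if denom else 0.0


def match_positions(image, template, threshold=0.98):
    """Return all (x, y) where the score reaches the threshold."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    return [(x, y)
            for y in range(ih - th + 1)
            for x in range(iw - tw + 1)
            if ccoeff_normed(image, template, x, y) >= threshold]


template = [[0, 255, 0],
            [255, 0, 255]]
image = [[0, 0, 255, 0, 0, 0, 0, 255, 0, 0],
         [0, 255, 0, 255, 0, 0, 255, 0, 255, 0]]
print(match_positions(image, template))
```

In the toy image the template appears twice, and only those two positions reach the threshold; partially overlapping windows score well below 0.98.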
4.4.2. template, x (pixel), (pixel).

5,563,6
5,589,6
23,554,11
32,455,8
32,481,8
32,507,8
33,598,8
34,548,8
50,355,5
72,467,11
72,493,11
72,519,11
75,574,15
117,18,15
274,352,11

, 117, x=18.

4.5 3.5.
1., x,
2. (x + +24 ),
3. 2 True 2,
4. 2 False, 3 2,
5. 2 False 3, x, (x + ),,,,.

2. 24 16, 1. 5., 2.
4 20, 5, 0.
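One way to turn the match list of 4.4.2 into candidate ranges is sketched below. It rests on two assumptions that should be read as ours rather than as the procedure of 4.5: the third value of each match is taken to be the matched width in pixels, and two matches are put in the same range when the next one starts no more than 24 pixels after the right edge reached so far. The function name `group_matches` is hypothetical.

```python
def group_matches(matches, max_gap=24):
    """Group (template_id, x, width) matches into horizontal ranges.

    Returns a list of (start_x, end_x, [template_ids]) tuples.
    Assumption: width is the third value; max_gap is the 24 px from the text.
    """
    groups = []
    for template_id, x, width in sorted(matches, key=lambda m: m[1]):
        if groups and x <= groups[-1][1] + max_gap:
            start, end, ids = groups[-1]
            groups[-1] = (start, max(end, x + width), ids + [template_id])
        else:
            groups.append((x, x + width, [template_id]))
    return groups


# Match list from 4.4.2.
matches = [
    (5, 563, 6), (5, 589, 6), (23, 554, 11), (32, 455, 8), (32, 481, 8),
    (32, 507, 8), (33, 598, 8), (34, 548, 8), (50, 355, 5), (72, 467, 11),
    (72, 493, 11), (72, 519, 11), (75, 574, 15), (117, 18, 15), (274, 352, 11),
]
for start, end, ids in group_matches(matches):
    print(start, end, len(ids))
```

With these assumptions the fifteen matches fall into three ranges; the isolated match of template 117 at x=18 forms a range of its own.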
21 5 5.1 CAO, CAO,. 5.1.1 CAO 10,131, 1000., CAO 1000 100%. 2010 CAO, 2010 CAO.,. 5.1.2 5.1: 2010 997/1000 99.7% 885/1000 88.5%
5 22 5.1.3 324 CAO 400 8, 99%.., 3,. 5.1: 1. 2, 3,. +,, 1,,. 5.2: A 2 ( ) ( B 2 ( ) ( C 3 ( ) (
5 23 5.2 2010,.,,., CAO,, 3. 5.2.1 Cafe 1. 5.2.2 5.2: 418/500 83.6% CAO 390/500 78.0% 402/500 80.2% 1 http://kaomoji-cafe.jp/author/kaomoji/
5 24 5.2.3, CAO 2010, CAO.. CAO. 5.3: CodePoint U+275B HEAVY SINGLE TURNED COMMA QUOTATION MARK ORNAMENT,2010, 5 U+30FB KATAKANA MIDDLE DOT,.
5 25 5.4:, CodePoint U+141B CANADIAN SYLLABICS NASKAPI WAA. CAO 10,131, 50 U+FF0E FULLWIDTH FULL STOP. 5.2.4 80% 1 2010, 2,. 5.3 False positive. CAO 0%. Twitter., RT,,. 5.3.1 5.3: 44/500 8.8% 17/500 3.4%
5 26 5.3.2 False positive 3. 1., 23/44 2. 12/44 3., 9/44 2, 1 3. 5.3.3 5.2. 1, 2, 3, 2 2. 5.4: F 90.5 (418/462) 83.6 (418/500) 87.2 (1, 3 ) 97.2 (418/430) 83.6 (418/500) 90.0 CAO 100 (390/390) 78.0 (390/500) 87.6 95.9 (402/419) 81.6 (408/500) 88.2
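The relation between the counts and the percentages in Table 5.4 can be reproduced with a few lines of Python. The sketch assumes the standard definitions (precision = correct / extracted, recall = correct / total, F = harmonic mean of the two); the function name `precision_recall_f` is hypothetical. The CAO row (390 correct extractions, no false positives, 500 emoticons in total) is used as the example.

```python
def precision_recall_f(true_positives, false_positives, total_relevant):
    """Return (precision, recall, F) as fractions between 0 and 1."""
    extracted = true_positives + false_positives
    precision = true_positives / float(extracted) if extracted else 0.0
    recall = true_positives / float(total_relevant)
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# CAO row of Table 5.4: 390/390 and 390/500.
p, r, f = precision_recall_f(390, 0, 500)
print(round(100 * p, 1), round(100 * r, 1), round(100 * f, 1))
```

The same function reproduces the count-based columns of the other rows, for example 418 correct extractions with 44 false positives gives 418/462 for precision.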
27 6 6.1,,..,,. 6.2. 5.2,,.,.,,. 6.3 6.3.1,,, 2. 4.3 27 28.,.., 275? 79., 1,.
6 28, 5.3.2,.,,,. 6.3.2,, [2],. 6.1: [2] 0.0009 0.0055 0.0000 0.0048 0.0007 o 0.0000 0.0004 0.0000 0.0000 0.0000 0.0009 0.0000 0.0000 0.0000 0.0007 0.0000 0.0000 0.0000 0.0000 0.0007 T T 0.0000 0.0000 0.0010 0.0000 0.0000,,. CAO.
,,..., New BSD License Ptaszynski.
[1] Michal Ptaszynski, "CAO: A Fully Automatic Emoticon Analysis System Based on Theory of Kinesics," IEEE Transactions on Affective Computing, vol. 1, no. 1, pp. 46-59, 2010. 1.
[2],, The 5th Forum on Data Engineering and Information Management. 2013. 3.