Vol.54 No.7 1937-1950 (July 2013)

Similar Sounds Sentences Generator Based on Morphological Analysis Manner and Low Class Words

Masaaki Kanakubo 1,a)

Received: November 1, 2012, Accepted: April 5, 2013

Abstract: Two different sentences that resemble each other in pronunciation are humorous. Japanese morphological analysis systems can produce two different sentences whose pronunciations match each other. This paper proposes a system that uses morphological analysis to generate sentences that sound like an entered sentence. To make the generated sentences funny, the proposed system includes a database of low class words. This paper reports experiments with human subjects conducted to confirm the validity of the proposed system.

Keywords: language sense, wordplay, pun, idiomatic phrase, morphological analysis

1 Shizuoka Institute of Science and Technology, Fukuroi, Shizuoka 437-8555, Japan
a) kanakubo@cs.sist.ac.jp

© 2013 Information Processing Society of Japan

1.
[1], [2], [3] 2 1 [4], [5], [6] [7] [8] CM
[9] [10] 3 2 1 2 3 4

2.

2.1
1 1 5

2.2
[11] [12], [13]

Fig. 1 Flowchart of the proposed system.
Table 1 Criterion of selecting nouns.
Wikipedia 142 127 66 589 83 63 228 175 141 304 175 167 261 150 4

2.2.1
[14] 6,104 Wikipedia [15] Wikipedia Wikipedia 1 100 [16]

2.2.2
Table 2 Nouns used for selecting declinable words.
[17] 2 [18] 2 20 721 140 268

2.2.3
100 100 1
445 21 [19] 3 1 2

Table 3 Selected particles.
2 117 71

2.3
2 1

2.3.1
20

2.3.2
4 5

Fig. 2 Juncture rules of the proposed system.

Table 4 Juncture rules from nouns to fundamental nouns.
Table 5 Juncture rules from fundamental nouns to the other nouns.
be

2.4
[20]

Table 6 Assigned degree of similarity of each sound-alike type.
A B C A C B A B C C C B C C C B C
1 2 [21] 6 3 A B C 2 2
4 5 de doe 1

2.5
(1) 1 5 (2) 4 (2) 1 1 2 3 1 5 3 2 1 (1) (3) (2) (1) (1) (4) (1) (3) (5)

2.6
3

2.6.1
1 1 3

2.6.2
4 3 2 3 1 3

Fig. 3 Transformation position for similar sounds.

Fig. 4 The difference of compartmental locations of words.
Table 7 Emphatic adverbs stored in the proposed system.

Table 8 Other emphatic words.
* *

2.6.3
7 8

3.
20

3.1
2.5 3 4.55 4.55 1 0.25 1 2 2 1 2 6 A 0 B 1 C 2 2 1

3.2
1 10 [22] 1,563
Table 9 Best 20 similar sounds sentences chosen by the proposed system.
10
19.34 18.56 18.01 17.97 17.84 17.7 17.2 16.9 16.62 16.06 15.9 15.9 15.82 15.82 15.65 15.25 15.23 15.23 15.2 15.09

3.3
1,563 2 1 20 305 etc. 251 101 40.2%

3.4
1,563 476 30.5% 40.2% 3,119 1 6.55 9 20 20 10

3.5
2 4 30 40 20 1 30 4 32 1 4 3 1 11 [8]
Table 10 Best 20 similar sounds sentences chosen by humans.
10
15.4 15.23 13.75 12.2 12.2 11.9 11.84 11.48 10.95 9.64 8.86 7.91 7.55 6.36 5.95 5.53 5.53 5.05 5.0 5.0

Table 11 Human-made similar sounds sentences.

3.6
56.5% 65% 50% 65%

3.7
40 4 4 3 2 1 4 1 2 Z 549 242 Z 10.9 Z 1%
Table 12 Comparison of the kansei estimations.
                  Z                Z                Z                Z
1   222  140   4.31   199  170   1.51   204  168   1.87   248  136   5.72
2   262  112   7.76   238  130   5.63   246  134   5.75   247  130   6.03
3   296  117   8.80   271  116   7.88   279  119   8.02   243  155   4.41

Table 13 Similar sounds sentences with high Z score.
            Z
1   20  1  4.15
1   22  0  4.69   21  0  4.58   20  1  4.15
2   20  1  4.15   21  1  4.26
2   18  1  3.9    21  0  4.58
3   23  1  4.49   18  1  3.9    22  1  4.38
3   20  0  4.47
3   23  0  4.8    23  0  4.8    22  0  4.69
3   23  0  4.8    20  1  4.15   22  0  4.69
3   21  2  3.96
3   21  1  4.26   19  0  4.36   19  1  4.02
3   24  0  4.9    24  0  4.9    23  0  4.8
3   18  2  3.58   18  2  3.58
3   20  3  3.54   22  1  4.38

3.8
1 2 3 5 4 Z 12 Z 1% 1% 1 13 Z 3.5 3 1 2
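The Z values printed in Table 12 can be reproduced from the two counts that precede each of them. The sketch below assumes that Z is the normal-approximation sign-test statistic, Z = (a - b) / sqrt(a + b); this formula is inferred from the numbers themselves and is not stated in the surviving text. The function name `z_score`, the argument names, and the use of the two-sided 1% critical value 2.576 are likewise illustrative assumptions.

```python
import math


def z_score(count_a: int, count_b: int) -> float:
    """Normal-approximation sign-test statistic for two opposing counts.

    Assumption: the formula is inferred from the values in Table 12,
    not taken from the paper's text.
    """
    total = count_a + count_b
    if total == 0:
        raise ValueError("at least one response is required")
    return (count_a - count_b) / math.sqrt(total)


# Two-sided 1% critical value of the standard normal distribution.
# Whether the paper used a one- or two-sided test is an assumption here.
Z_CRITICAL_1_PERCENT = 2.576

# The first four count pairs of Table 12.
pairs = [(222, 140), (199, 170), (204, 168), (248, 136)]
for a, b in pairs:
    z = z_score(a, b)
    significant = z > Z_CRITICAL_1_PERCENT
    print(f"{a} vs {b}: Z = {z:.2f}, exceeds 1% critical value: {significant}")
```

Under this assumption the pair 222 and 140 gives Z of about 4.31 and the pair 199 and 170 gives about 1.51, matching the table, and the totals 549 and 242 quoted in Section 3.7 give about 10.9.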
Table 14 Similar sounds sentences estimated by SD method.
A B C D E 1

3.9
SD A B C D E 14 8 13 6.5 8.5 10 SD 7 6 0 8 10 80 5 10 5 2 t 1% A B C E D A B C D E A C D E B 5% C A B A B
Fig. 5 Comparison of the SD method estimations.

Table 15 Increase in the number of generated sentences by similar sounds.
% 5 0.0016
1,010   32.4
1,654   53.0
1,681   53.9
1,697   54.4
2,685   86.1
3,042   97.5
3,119  100.0

C A D 5% A D E D B E 5% C D C E E C E B E 0.8 1 0.6 2 69.7%

3.10
1,563 3,119 15 5
3

3.11
CM CM CM

4.
1,563 30% 20 20 SD

References
[1] Vol.15, No.3, pp.446-455 (2000).
[2] NLC96-31 (1996).
[3] TL97-2 (1997).
[4] B Vol.12, pp.685-686 (1998).
[5] BOKE Vol.13, No.6, pp.920-927 (1998).
[6] 2007-NL-178, pp.91-95 (2007).
[7] http://www.tv-asahi.co.jp/tamoriclub/index.html.
[8] http://www.sutv.zaq.ne.jp/shirokuma/gocho.html.
[9] http://ameblo.jp/totherma/entry-10344318836.html.
[10] SIG-SLUD-9202-5, pp.37-46 (1992).
[11] (1976).
[12] Vol.2, No.1, pp.100-104 (1990).
[13] Vol.15, No.5, pp.577-583 (2003).
[14] (2001).
[15] Wikipedia http://ja.wikipedia.org/wiki/.
[16] 100 http://prw.kyodonews.jp/opn/release/201201201897/.
[17] Web 171 12, pp.67-73 (2006).
[18] http://reed.kuee.kyoto-u.ac.jp/cf-search/.
[19] High School http://www.hello-school.net/harojapa000top.htm.
[20] Vol.26, No.1, pp.65-74 (2005).
[21] http://www006.upp.so-net.ne.jp/ okosoken/magirawashii.html.
[22] (2005).