Vol. 43 No. 7, July 2002

Development and Evaluation of ATR-MATRIX Speech Translation System

Fumiaki Sugaya, Toshiyuki Takezawa, Eiichiro Sumita, Yoshinori Sagisaka, and Seiichi Yamamoto

ATR Spoken Language Translation Research Laboratories
Presently with KDDI R&D Laboratories, Inc.
Presently with Graduate School of Kobe University
Presently with Graduate School of Waseda University

The ATR-MATRIX speech translation system was developed at ATR Interpreting Telecommunications Research Laboratories (ATR-ITL). In this paper we explain the system's outline and its development process, including the initial objective, corpus collection, and overall evaluation. Each of the three major components of the system (speech recognition, language translation, and speech synthesis) introduced an innovative corpus-based technology. This paper, however, focuses on the major topics of the overall system and gives references to detailed explanations of the specific technologies. We also explain some experimental results: additional sessions improve performance on the same task.

1.
1993; ATR Interpreting Telecommunications Research Laboratories (ITL) 1); ATR-MATRIX 2); 3); VERBMOBIL 21); 5); TOEIC.
2.
2.1
ITL; spontaneous speech 15),16); read speech.
2.2
18); SLDB 17); 9); 12); 98.
3. ATR-MATRIX
3.1
19); SPREC 7); TDMT 13); CHATR 20).
Fig. 1 Configuration of ATR-MATRIX speech translation system.
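The configuration in Fig. 1 chains the three components named in the abstract: speech recognition, language translation, and speech synthesis. The following is only a minimal sketch of such a chain, under the assumption that the stages run strictly in sequence. The class name, the callables, and the toy stand-in functions are all hypothetical and do not reproduce the interfaces of SPREC, TDMT, or CHATR.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslationPipeline:
    """Hypothetical three-stage chain: recognition -> translation -> synthesis."""
    recognize: Callable[[bytes], str]    # source-language audio -> source text
    translate: Callable[[str], str]      # source text -> target text
    synthesize: Callable[[str], bytes]   # target text -> target-language audio

    def run(self, audio: bytes) -> bytes:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


# Toy stand-ins (assumptions for illustration only): "audio" is just
# encoded text, and translation is a one-entry dictionary lookup.
def toy_recognize(audio: bytes) -> str:
    return audio.decode("ascii")


def toy_translate(text: str) -> str:
    return {"konnichiwa": "hello"}.get(text, text)


def toy_synthesize(text: str) -> bytes:
    return text.encode("ascii")


pipeline = SpeechTranslationPipeline(toy_recognize, toy_translate, toy_synthesize)
result = pipeline.run(b"konnichiwa")
```

Passing the stages as plain callables keeps the sketch independent of how each subsystem is actually implemented.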
3.2
Lisp; 100 msec; PC (Pentium III, 450 MHz).
Table 1 Task/domain in data collection.
Table 2 Rule for conversation proceeding.
4.
4.1
4.2
4.3
4.4
Table 3 Feature comparison between monolingual and bilingual DB.
5. ATR-MATRIX
5.1
HMM; ML-SSS 6); N-gram 10).
5.2
ATR-MATRIX; TDMT 12); X, Y; Y to X (( ), ( )...), Y at X (( ),...),...; 13).
JE, JK, JG, EJ; A, B, C; 98%; 85%; TDMT.
Table 4 Rank criteria for translation evaluation.
Table 5 Data size used for language translation subsystem.
Fig. 2 Relationship between translation rate and pattern extraction rate.
Table 6 Evaluation results for several language pairs.
1/2; 85.0%; 95.3%.
5.3
TDMT; SPREC; N; 19).
5.4
TDMT
Table 7 System's specification and host performance.
Fig. 3 Configuration for end-to-end dialogue experiment.
14); TDMT.
Table 8 Performances of subsystems.
6.
6.1
3),4); SPREC; TDMT; CHATR; barge-in; ATR-MATRIX; LAN; TV; SPREC 7); TOEIC; MAP-VFS.
6.2
GUI.
6.3
6.3.1
Perplexity.
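Section 6.3.1 tracks perplexity. As a reminder of the conventional definition, the sketch below computes perplexity from per-word probabilities, PP = exp(-(1/N) Σ log p_i). It assumes those probabilities have already been assigned by a language model such as an N-gram model. The function name and the input format are hypothetical, and nothing here is taken from the evaluation scripts actually used.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity of a word sequence from its per-word model probabilities.

    word_probs[i] is the probability the language model assigned to the
    i-th word given its history (assumed to be supplied by the caller).
    """
    if not word_probs:
        raise ValueError("at least one word probability is required")
    if any(p <= 0.0 or p > 1.0 for p in word_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    # Average negative log-probability, then exponentiate.
    avg_neg_log = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log)


uniform = perplexity([0.25, 0.25, 0.25, 0.25])
mixed = perplexity([0.5, 0.125])
```

A model that spreads its probability uniformly over four choices at every step has perplexity 4, which is why perplexity is often read as an average branching factor.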
Table 9 Data size of dialogue tests.
Fig. 4 Perplexity along dialogues.
Fig. 5 Session time along dialogues.
Fig. 6 Word accuracy along dialogues.
Perplexity; 18.3%; 23.8%; 18.0%; 20%; 8).
6.3.2
90%.
6.3.3
ATR SLDB 17); 6.8; SLDB 10.3; SLDB; 23; 330; SLTA1; A, A+B, A+B+C; SLDB; 10.3; 6.8; 3.5; 2%; 10%.
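Fig. 6 reports word accuracy. The scoring procedure used for that figure is not reproduced here. The sketch below uses the conventional definition, accuracy = (N - S - D - I) / N, where N is the number of reference words and S + D + I is the minimum number of substitutions, deletions, and insertions needed to turn the reference into the recognizer output. The function name and the example sentences are hypothetical.

```python
from typing import Sequence


def word_accuracy(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Word accuracy (N - S - D - I) / N via word-level edit distance."""
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    errors = prev[m]  # S + D + I
    return (n - errors) / n


ref = "i would like to reserve a room".split()
hyp = "i would like reserve a rooms".split()
acc = word_accuracy(ref, hyp)  # one deletion and one substitution
```

Because insertions are counted as errors, this measure can fall below zero when the hypothesis is much longer than the reference.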
Table 10 Data size of dialogue tests without attention to machine.
Fig. 7 Word accuracy vs. sentence length.
Fig. 8 Translation rate vs. sentence length.
7.
ATR-MATRIX 22).
7.1
PC; TV; SPREC; 18.5; 10.3.
Fig. 9 Effects of speaking style.
7.2
8); 11).
(1) (2) (3) (4) (5) (6) (7)
7.4%; 8.2%; 1.2%; SPREC; 82.5%; 83.04%; 82.46%; 83%.
8.
8.1
85%; 98%.
8.2
9.
9.1
5); TOEIC; 700; 550; 150; 13; TOEIC 575; ATR-MATRIX.
PC; 3.8; 88.1%; 85%.
9.2

References
1) ASURA, Vol.37, No.9, pp.1726-1735 (1996).
2) Takezawa, T., Morimoto, T., Sagisaka, Y., Campbell, N., Iida, H., Sugaya, F., Yokoo, A. and Yamamoto, S.: A Japanese-to-English speech translation system: ATR-MATRIX, Proc. ICSLP 1998, pp.2779-2782 (1998).
3) Sugaya, F., Takezawa, T., Yokoo, A. and Yamamoto, S.: End-to-end evaluation in ATR-MATRIX: Speech translation system between English and Japanese, Proc. Eurospeech99, pp.2431-2434 (1999).
4) ATR-MATRIX, SP2000-21, pp.39-45 (June 2000).
5) D-II, Vol.J84-D-II, No.11, pp.2362-2370 (2001).
6) Ostendorf, M. and Singer, H.: HMM topology design using maximum likelihood successive state splitting, Computer Speech and Language, Vol.11, No.1, pp.17-41 (1997).
7) ATR-MATRIX, 1998, 2-Q-20 (Mar. 1998).
8) D-II, Vol.J84-D-II, No.1, pp.31-40 (2001).
9) 1999, pp.169-170 (1999).
10) N-gram, D-II, Vol.J81-D-II, No.9, pp.1929-1936 (1998).
11) N-gram, D-II, Vol.J83-D-II, No.11, pp.2146-2151 (2000).
12) Vol.6, No.5, pp.63-91 (1999).
13) Sumita, E., Yamada, S., Yamamoto, K., Paul, M., Kashioka, H., Ishikawa, K. and Shirai, S.: Solutions to Problems Inherent in Spoken-language Translation: The ATR-MATRIX Approach, Proc. MT Summit 99, pp.229-235 (Sep. 1999).
14) Vol.5, No.4, pp.111-125 (1998).
15) SP2000-95, pp.1-5 (Dec. 2000).
16) 99, SLP-31-2 (2000).
17) Morimoto, T., Uratani, N., Takezawa, T., Furuse, O., Sobashima, Y., Iida, H., Nakamura, A., Sagisaka, Y., Higuchi, N. and Yamazaki, Y.: A speech and language database for speech translation research, Proc. ICSLP 94, pp.1791-1794 (1994).
18) Vol.83, No.8, pp.604-611 (2000).
19) Vol.6, No.2, pp.83-95 (1999).
20) Campbell, N.: CHATR: A high-definition speech re-sequencing system, Proc. ASA/ASJ Joint Meeting, pp.1223-1228 (1996).
21) Wahlster, W.: Verbmobil: Foundations of Speech-to-Speech Translation, Springer (2000).
22) pp.117-124 (Feb. 2001).

(Received November 16, 2001)
(Accepted April 16, 2002)