
464  Vol. 29, No. 5 (September 2014)

Frontiers of AI Research in Industry

Communication Science and Artificial Intelligence Research: Activities at NTT Communication Science Laboratories

Kunio Kashino (柏野邦夫)
NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation.
kashino.kunio@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/kunio/

Kaoru Hiramatsu (平松薫)
hiramatsu.kaoru@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/csl/sirg/people/hiramatu/

Junji Yamato (大和淳司)
yamato.junji@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/yamato/

Takeshi Yamada (山田武士)
yamada.tak@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/as/members/yamada/index-j.html

Keywords: speech recognition, scene analysis, machine learning, big data analysis, machine translation.

1. Introduction

[...] NTT [...]

2. Recognition of Objects and Events in the External World

2.1 From Speech Recognition to Communication Understanding

[...]

2.2 Capturing Conversation Without Attention to the Microphone: Dereverberation and Reverberation Control

[...]

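[Figure 1]

[...] 1980 [...] [Hori 12]. [...] [Kinoshita 09]. [...] REVERB Challenge [...] 30 [...] [Delcorix 14]. [...] [ 13].

2.3 Capturing Diverse Conversations: An Integrated Speech-Language Model

[...]

2.4 Capturing Conversation Scenes: Real-Time Conversation Analysis

[...] 2 [...] 360 [...]

[Figure 2]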

2.5 Capturing the Nuances of Speech: Prosody Models

[...] [Fujisaki 88] [...] [Kameoka 10]. [...]

2.6 Looking Up a Dictionary of Media: Fast Search for Media Identity

[...] 1990 [...] RMS (Robust Media Search) [...] near-duplicate content detection [...] 2000 [...] RMS [...] NIST (National Institute of Standards and Technology) TRECVID [...] 1 [...] [Mukai 10]. RMS [...] RMS. [...] sameness [...] 2013 [...] RMS [Murata 14] [...] 2013 TRECVID [...]

3. Knowledge Acquisition

3.1 Statistical Machine Learning from Big Data

[...] 3. [...] ID [...] ID [...] SNS [...]

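[Figure 3]

[Figure 4]

[...] NMF (Nonnegative Matrix Factorization) [...] IRM (Infinite Relational Model).

3.2 Pattern Discovery from Data Matrices with NMF

NMF [...] [ 12]. [...] 4. [...] NMF [...] Twitter [...] 5. Twitter [...] NMF. Twitter [...] NMF [...] 5 A. [...] NMF. [...] 2010 2 [...] 2011 4 [...] 10 [...] 1 000 [...] 10 [...] 5 B. [...] 1, 2, 3, 4, 5 [...] NMF. NMF [...] NMF [Kameoka 09] [Sawada 13] [...]

[Figure 5: Twitter [...]]

3.3 Extracting Relational Clusters from Data Matrices with IRM

IRM [...] [Kemp 06]. [...] 6 [...] 7 [...] 1 [...] 0 [...] 2. [...] 7 [...] IRM [...] 1 [...] 0 [...]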
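As a rough illustration of what a "relational cluster" in a 0/1 data matrix means, the sketch below computes the fraction of 1s in each block once rows and columns have been assigned to clusters. It is not the IRM inference procedure of [Kemp 06]: the cluster assignments are simply given by hand, and the function name `block_densities`, the toy matrix, and the assignments are all assumptions made for this example.

```python
from collections import defaultdict


def block_densities(relation, row_clusters, col_clusters):
    """Return {(row_cluster, col_cluster): fraction of 1s} for a 0/1 matrix."""
    ones = defaultdict(int)    # number of 1 entries per block
    cells = defaultdict(int)   # number of entries per block
    for i, row in enumerate(relation):
        for j, value in enumerate(row):
            key = (row_clusters[i], col_clusters[j])
            ones[key] += value
            cells[key] += 1
    return {key: ones[key] / cells[key] for key in cells}


# Hypothetical toy relation (for example, rows = users, columns = items,
# 1 = a link was observed). None of these values come from the article.
relation = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]
row_clusters = [0, 0, 1, 1]  # assumed assignment, not inferred here
col_clusters = [0, 0, 1, 1]  # assumed assignment, not inferred here

densities = block_densities(relation, row_clusters, col_clusters)
for key in sorted(densities):
    print(key, densities[key])
```

A good clustering makes each block nearly all 1s or nearly all 0s. A model such as IRM searches over the assignments themselves, and over the number of clusters, rather than taking them as input as this sketch does.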

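[...] IRM [...] 8 [...] 3. [...] IRM [Nakano 14] [...] [Ishiguro 10], [Ishiguro 12]. [...] 0. [...] IRM [...] 8 [...] SIRM (Subset IRM). [...]

[Figure 6: [...] 1]

[Figure 7: [...] 2]

[Figure 8: [...] 3]

4. Machine Translation

4.1 From Rule-Based Translation to Statistical Translation

[...] 1950 [...] 9. [...]

[Figure 9: [...] 1990 [...] 2000 [...] 2005 [...]]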

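4.2 Pre-ordering Translation

[...] 2000 [...] 2010 [...] [Isozaki 12]. 2011 NTCIR-9 [...] NTT [...] Enju [...] [Goto 12, Sudoh 12].

4.3 Pre-ordering Based on the Head-Final Property of Japanese

10. [...] 10. [...]

4.4 Multilingual Translation of Technical Documents

[...] 2004 [...] 300 [...] 800 [...] 200 [...] NTT [...] 4 [...] 1 5 [...] 2 [...]

[Figure 10]

[Figure 11]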

[...] 2009 NTT [...] [Suzuki 09]. [...] 11 [...] 12 [...]

[Figure 12]

4.5 Automatic Evaluation of Translation Accuracy

[...] 1990 [...] BLEU (BiLingual Evaluation Understudy) [...] BLEU [...] 2008 NTCIR-7 [...] BLEU [...] 13. [...] RIBES (Rank-based Intuitive Bilingual Evaluation Score) [...] [RIBES 11]. RIBES [...] NTCIR-9 [...] BLEU [Isozaki 10].

[Figure 13: NTCIR-7 (2008) [...] BLEU [...]]

4.6 Toward Practical Machine Translation

[...] 100 [...]

5. Conclusion

[...] NTT [...]

References

[Delcorix 14] Delcroix, M., Yoshioka, T., Ogawa, A., Kubo, Y., Fujimoto, M., Ito, N., Kinoshita, K., Espi, M., Hori, T., Nakatani, T. and Nakamura, A.: Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge, REVERB Challenge (2014)

[Fujisaki 88] Fujisaki, H.: Vocal Physiology: Voice Production, Mechanisms and Functions, Raven Press (1988)

[Goto 12] Goto, I., Lu, B., Chow, K. P., Sumita, E. and Tsou, B. K.: Overview of the patent machine translation task at the NTCIR-9 Workshop, NTCIR-9, pp. 559-578 (2012)

[Hori 12] Hori, T., Araki, S., Yoshioka, T., Fujimoto, M., Watanabe, S., Ogawa, A., Otsuka, K., Mikami, D., Kinoshita, K., Nakatani, T., Nakamura, A. and Yamato, J.: Low-latency realtime meeting recognition and understanding using distant microphones and omni-directional camera, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 20, No. 2, pp. 499-513 (2012)

[Ishiguro 10] Ishiguro, K., Iwata, T., Ueda, N. and Tenenbaum, J. B.: Dynamic infinite relational model for time-varying relational data analysis, Advances in Neural Information Processing Systems, Vol. 23, pp. 919-927 (2010)

[Ishiguro 12] Ishiguro, K., Ueda, N. and Sawada, H.: Subset infinite relational models, Proc. 15th Int. Conf. on Artificial Intelligence and Statistics (AISTATS 2012), pp. 547-555 (2012)

[Isozaki 10] Isozaki, H., Hirao, T., Duh, K., Sudoh, K. and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs, Proc. 2010 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 944-952 (2010)

[Isozaki 12] Isozaki, H., Sudoh, K., Tsukada, H. and Duh, K.: HPSG-based preprocessing for English-to-Japanese translation, ACM Trans. on Asian Language Information Processing (TALIP), Vol. 11, No. 3, pp. 8:1-8:16 (2012)

[Kameoka 09] Kameoka, H., Ono, N., Kashino, K. and Sagayama, S.: Complex NMF: A new sparse representation for acoustic signals, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp. 3437-3440 (2009)

[Kameoka 10] Kameoka, H., Roux, J. L. and Ohishi, Y.: A statistical model of speech F0 contours, Proc. SAPA, pp. 43-48 (2010)

[Kemp 06] Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T. and Ueda, N.: Learning systems of concepts with an infinite relational model, AAAI 06: Proc. 21st National Conf. on Artificial Intelligence, pp. 381-388 (2006)

[Kinoshita 09] Kinoshita, K., Delcroix, M., Nakatani, T. and Miyoshi, M.: Suppression of late reverberation effect on speech signal using long-term multiple step linear prediction, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 17, No. 4, pp. 534-545 (2009)

[ 13] [...] NTT [...], Vol. 25, No. 9, pp. 22-25 (2013)

[Mukai 10] Mukai, R., Kurozumi, T., Hiramatsu, K., Kawanishi, T., Nagano, H. and Kashino, K.: NTT communication science laboratories at TRECVID 2010 content-based copy detection, TRECVID Workshop (2010)

[Murata 14] Murata, M., Nagano, H., Mukai, R., Kashino, K. and Satoh, S.: BM25 with exponential IDF for instance search, IEEE Trans. on Multimedia, Vol. 16, No. 6, to appear (2014)

[Nakano 14] Nakano, M., Ishiguro, K., Kimura, A., Yamada, T. and Ueda, N.: Rectangular tiling process, Proc. Int. Conf. on Machine Learning (ICML), pp. 361-369 (2014)

[RIBES 11] RIBES (2011), http://www.kecl.ntt.co.jp/icl/lirg/ribes/index-j.html

[ 12] [...] NMF [...], Vol. 95, No. 9, pp. 829-833 (2012)

[Sawada 13] Sawada, H., Kameoka, H., Araki, S. and Ueda, N.: Multichannel extensions of non-negative matrix factorization with complex-valued data, IEEE Trans. on Audio, Speech and Language Processing, Vol. 21, No. 5, pp. 971-982 (2013)

[Sudoh 12] Sudoh, K., Duh, K., Tsukada, H., Nagata, M., Wu, X., Matsuzaki, T. and Tsujii, J.: NTT-UT statistical machine translation in NTCIR-9 PatentMT, NTCIR-9, pp. 585-592 (2012)

[Suzuki 09] Suzuki, J., Isozaki, H., Carreras, X. and Collins, M.: An empirical study of semi-supervised structured conditional models for dependency parsing, Proc. 2009 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 551-560 (2009)

(August 11, 2014)

About the Authors

Kunio Kashino (柏野邦夫): 1990 [...] 1995 [...] NTT [...] 2002 [...]

Kaoru Hiramatsu (平松薫): 1994 [...] 1996 [...] NTT [...] 2003 04 [...]

Junji Yamato (大和淳司): 1988 [...] 1990 [...] NTT [...] 1996 98 [...] MIT. MIT Electrical Engineering and Computer Science [...]

Takeshi Yamada (山田武士): 1988 [...] NTT [...] 1996 97 [...]

