Vol. 29, No. 5 (September 2014), p. 464

Frontiers of AI Research in Industry

Communication Science and Artificial Intelligence Research: Activities at NTT Communication Science Laboratories

Kunio Kashino (柏野邦夫), NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation. kashino.kunio@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/kunio/
Kaoru Hiramatsu (平松薫), hiramatsu.kaoru@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/csl/sirg/people/hiramatu/
Junji Yamato (大和淳司), yamato.junji@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/yamato/
Takeshi Yamada (山田武士), yamada.tak@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/as/members/yamada/index-j.html

Keywords: speech recognition, scene analysis, machine learning, big data analysis, machine translation.

1. Introduction
… NTT …

2. Recognizing Things in the Outside World

2.1 From Speech Recognition to Communication Understanding
… IC …

2.2 Capturing Conversation without Awareness of the Microphone: Removing and Controlling Reverberation
… 1 … 4 …
[Figure 1]

… 1980 … [Hori 12] … [Kinoshita 09] … REVERB Challenge … 30 … [Delcroix 14] … [ 13] …

2.3 Capturing Diverse Conversations: An Integrated Speech-Language Model
…

2.4 Capturing Conversation Scenes: Real-Time Conversation Analysis
… 2 … 360 …

[Figure 2]
2.5 Capturing the Nuances of Speech: A Prosody Model
… [Fujisaki 88] … [Kameoka 10] …

2.6 Looking Things Up in a Media Dictionary: Fast Search for Media Identity
… 1990 … RMS (Robust Media Search) … near duplicate content detection … 2000 … RMS … NIST (National Institute of Standards and Technology) TRECVID … 1 … [Mukai 10] … RMS … RMS … sameness … 2013 … RMS [Murata 14] … 2013 TRECVID …

3. Knowledge Acquisition

3.1 Statistical Machine Learning from Big Data
… 3 … ID … ID … SNS …
[Figure 3]
[Figure 4]

… NMF (Nonnegative Matrix Factorization) … IRM (Infinite Relational Model) …

3.2 Pattern Discovery from Data Matrices with NMF
NMF … [ 12] … 4 … NMF … Twitter … 5 … Twitter … NMF … Twitter … NMF … 5 A … NMF … 2010 2 2011 4 10 1 000 … 10 … 5 B … 1, 2, 3, 4, 5 …

[Figure 5: Twitter …]

… NMF … NMF … NMF [Kameoka 09] [Sawada 13] …

3.3 Extracting Relational Clusters from Data Matrices with IRM
IRM … [Kemp 06] … 6 … 7 … 1 0 2 … 7 … IRM … 1 0 …
… IRM … 8 …

[Figure 6: … 1]
[Figure 7: … 2]
[Figure 8: … 3]

… IRM [Nakano 14] … [Ishiguro 10], [Ishiguro 12] … 0 … IRM … 8 … SIRM (Subset IRM) …

4. Machine Translation

4.1 From Rule-Based Translation to Statistical Translation
… 1950 … 9 …

[Figure 9]

… 1990 … 2000 … 2005 …
4.2 Pre-Ordering Translation
… 2000 … 2010 … [Isozaki 12] … 2011 NTCIR-9 … NTT … Enju … [Goto 12, Sudoh 12] …

4.3 Pre-Ordering Based on the Head-Final Property of Japanese
… 10 … 10 …

4.4 Multilingual Translation of Technical Documents
… 2004 … 300 … 800 … 200 … NTT … 4 … 1 5 … 2 …

[Figure 10]
[Figure 11]
… 2009 … NTT … [Suzuki 09] … 11, 12 …

[Figure 12]

4.5 Automatic Evaluation of Translation Accuracy
… 1990 … BLEU (BiLingual Evaluation Understudy) … BLEU … 2008 NTCIR-7 … BLEU … 13 … RIBES (Rank-based Intuitive Bilingual Evaluation Score) [RIBES 11] … RIBES … NTCIR-9 … BLEU [Isozaki 10] …

[Figure 13: NTCIR-7 2008 … BLEU]

4.6 Toward Practical Use of Machine Translation
… 100 …

5. Conclusion
… NTT …

References

[Delcroix 14] Delcroix, M., Yoshioka, T., Ogawa, A., Kubo, Y., Fujimoto, M., Ito, N., Kinoshita, K., Espi, M., Hori, T., Nakatani, T. and Nakamura, A.: Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge, REVERB Challenge (2014)
[Fujisaki 88] Fujisaki, H.: Vocal Physiology: Voice Production, Mechanisms and Functions, Raven Press (1988)
[Goto 12] Goto, I., Lu, B., Chow, K. P., Sumita, E. and Tsou, B. K.: Overview of the patent machine translation task at the
NTCIR-9 Workshop, NTCIR-9, pp. 559-578 (2012)
[Hori 12] Hori, T., Araki, S., Yoshioka, T., Fujimoto, M., Watanabe, S., Ogawa, A., Otsuka, K., Mikami, D., Kinoshita, K., Nakatani, T., Nakamura, A. and Yamato, J.: Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 20, No. 2, pp. 499-513 (2012)
[Ishiguro 10] Ishiguro, K., Iwata, T., Ueda, N. and Tenenbaum, J. B.: Dynamic infinite relational model for time-varying relational data analysis, Advances in Neural Information Processing Systems, Vol. 23, pp. 919-927 (2010)
[Ishiguro 12] Ishiguro, K., Ueda, N. and Sawada, H.: Subset infinite relational models, Proc. 15th Int. Conf. on Artificial Intelligence and Statistics (AISTATS 2012), pp. 547-555 (2012)
[Isozaki 10] Isozaki, H., Hirao, T., Duh, K., Sudoh, K. and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs, Proc. 2010 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 944-952 (2010)
[Isozaki 12] Isozaki, H., Sudoh, K., Tsukada, H. and Duh, K.: HPSG-based preprocessing for English-to-Japanese translation, ACM Trans. on Asian Language Information Processing (TALIP), Vol. 11, No. 3, pp. 8:1-8:16 (2012)
[Kameoka 09] Kameoka, H., Ono, N., Kashino, K. and Sagayama, S.: Complex NMF: A new sparse representation for acoustic signals, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp. 3437-3440 (2009)
[Kameoka 10] Kameoka, H., Le Roux, J. and Ohishi, Y.: A statistical model of speech F0 contours, Proc. SAPA, pp. 43-48 (2010)
[Kemp 06] Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T. and Ueda, N.: Learning systems of concepts with an infinite relational model, AAAI'06: Proc. 21st National Conf. on Artificial Intelligence, pp. 381-388 (2006)
[Kinoshita 09] Kinoshita, K., Delcroix, M., Nakatani, T.
and Miyoshi, M.: Suppression of late reverberation effect on speech signal using long-term multiple step linear prediction, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 17, No. 4, pp. 534-545 (2009)
[ 13] … NTT …, Vol. 25, No. 9, pp. 22-25 (2013)
[Mukai 10] Mukai, R., Kurozumi, T., Hiramatsu, K., Kawanishi, T., Nagano, H. and Kashino, K.: NTT Communication Science Laboratories at TRECVID 2010 content-based copy detection, TRECVID Workshop (2010)
[Murata 14] Murata, M., Nagano, H., Mukai, R., Kashino, K. and Satoh, S.: BM25 with exponential IDF for instance search, IEEE Trans. on Multimedia, Vol. 16, No. 6, to appear (2014)
[Nakano 14] Nakano, M., Ishiguro, K., Kimura, A., Yamada, T. and Ueda, N.: Rectangular tiling process, Proc. Int. Conf. on Machine Learning (ICML), pp. 361-369 (2014)
[RIBES 11] RIBES (2011), http://www.kecl.ntt.co.jp/icl/lirg/ribes/index-j.html
[ 12] … NMF …, Vol. 95, No. 9, pp. 829-833 (2012)
[Sawada 13] Sawada, H., Kameoka, H., Araki, S. and Ueda, N.: Multichannel extensions of non-negative matrix factorization with complex-valued data, IEEE Trans. on Audio, Speech and Language Processing, Vol. 21, No. 5, pp. 971-982 (2013)
[Sudoh 12] Sudoh, K., Duh, K., Tsukada, H., Nagata, M., Wu, X., Matsuzaki, T. and Tsujii, J.: NTT-UT statistical machine translation in NTCIR-9 PatentMT, NTCIR-9, pp. 585-592 (2012)
[Suzuki 09] Suzuki, J., Isozaki, H., Carreras, X. and Collins, M.: An empirical study of semi-supervised structured conditional models for dependency parsing, Proc. 2009 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 551-560 (2009)

August 11, 2014

About the Authors

Kunio Kashino (柏野邦夫)
… 1990 … 1995 … NTT … 2002 …

Kaoru Hiramatsu (平松薫)
… 1994 … 1996 … NTT … 2003-04 …

Junji Yamato (大和淳司)
… 1988 … 1990 … NTT … 1996-98 … MIT … MIT Electrical Engineering and Computer Science …

Takeshi Yamada (山田武士)
… 1988 … NTT … 1996-97 …