
Vol. 29, No. 5 (September 2014), p. 464

Frontiers of AI Research in Industry

Communication Science and Artificial Intelligence Research: Activities at NTT Communication Science Laboratories

Kunio Kashino (柏野邦夫)
NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
kashino.kunio@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/kunio/

Kaoru Hiramatsu (平松薫)
hiramatsu.kaoru@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/csl/sirg/people/hiramatu/

Junji Yamato (大和淳司)
yamato.junji@lab.ntt.co.jp, http://www.brl.ntt.co.jp/people/yamato/

Takeshi Yamada (山田武士)
yamada.tak@lab.ntt.co.jp, http://www.kecl.ntt.co.jp/as/members/yamada/index-j.html

Keywords: speech recognition, scene analysis, machine learning, big data analysis, machine translation.

1. Introduction

… NTT …

2. Recognizing things in the external world

2.1 From speech recognition to communication understanding

… IC …

2.2 Capturing conversation without attention to microphones: removing and controlling reverberation

… 1 … 4 …

[Figure 1]

… 1980 … [Hori 12] … [Kinoshita 09] … REVERB Challenge … 30 … [Delcorix 14] … [ 13] …

2.3 Capturing diverse conversations: an integrated speech-language model

…

2.4 Capturing conversational scenes: real-time conversation analysis

… 2 … 360 …

[Figure 2]

2.5 Capturing the nuances of speech: a prosody model

… [Fujisaki 88] … [Kameoka 10] …

2.6 Looking things up in a dictionary of media: fast search for media identity

… 1990 … RMS (Robust Media Search) … near duplicate content detection … 2000 … RMS … NIST (National Institute of Standards and Technology) TRECVID … 1 … [Mukai 10] … RMS … sameness … 2013 … RMS [Murata 14] … 2013 … TRECVID …

3. Knowledge acquisition

3.1 Statistical machine learning from big data

… 3 … ID … ID … SNS …

[Figure 3]

[Figure 4]

… NMF (Nonnegative Matrix Factorization) … IRM (Infinite Relational Model) …

3.2 Pattern discovery from data matrices with NMF

NMF … [ 12] … 4 … NMF … twitter … 5 … twitter … NMF … twitter … NMF … 5 A … NMF … 2010 … 2 … 2011 … 4 … 10 … 1 000 … 10 … 5 B … 1, 2, 3, 4, 5 …

[Figure 5: … twitter …]

… NMF … NMF … NMF [Kameoka 09] [Sawada 13] …

3.3 Extracting relational clusters from data matrices with IRM

IRM … [Kemp 06] … 6 … 7 … 1 … 0 … 2 … 7 … IRM … 1 … 0 …

… IRM … 8 … IRM [Nakano 14] … [Ishiguro 10] … [Ishiguro 12] … 0 … IRM … 8 … SIRM (Subset IRM) …

[Figure 6: … 1]

[Figure 7: … 2]

[Figure 8: … 3]

4. Machine translation

4.1 From rule-based translation to statistical translation

… 1950 … 9 …

[Figure 9]

… 1990 … 2000 … 2005 …

4.2 Pre-ordering translation

… 2000 … 2010 … [Isozaki 12] … 2011 … NTCIR-9 … NTT … Enju … [Goto 12, Sudoh 12] …

4.3 Pre-ordering based on the head-final property of Japanese

… 10 … 10 …

4.4 Multilingual translation of technical documents

… 2004 … 300 … 800 … 200 … NTT … 4 … 1 … 5 … 2 …

[Figure 10]

[Figure 11]

… 2009 … NTT … [Suzuki 09] … 11, 12 …

[Figure 12]

4.5 Automatic evaluation of translation accuracy

… 1990 … BLEU (BiLingual Evaluation Understudy) … BLEU … 2008 … NTCIR-7 … BLEU … 13 … RIBES (Rank-based Intuitive Bilingual Evaluation Score) … [RIBES 11] … RIBES … NTCIR-9 … BLEU … [Isozaki 10] …

[Figure 13: … NTCIR-7 … 2008 … BLEU …]

4.6 Toward practical machine translation

… 100 …

5. Conclusion

… NTT …

References

[Delcorix 14] Delcroix, M., Yoshioka, T., Ogawa, A., Kubo, Y., Fujimoto, M., Ito, N., Kinoshita, K., Espi, M., Hori, T., Nakatani, T. and Nakamura, A.: Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge, REVERB Challenge (2014)

[Fujisaki 88] Fujisaki, H.: Vocal Physiology: Voice Production, Mechanisms and Functions, Raven Press (1988)

[Goto 12] Goto, I., Lu, B., Chow, K. P., Sumita, E. and Tsou, B. K.: Overview of the patent machine translation task at the NTCIR-9 Workshop, NTCIR-9, pp. 559-578 (2012)

[Hori 12] Hori, T., Araki, S., Yoshioka, T., Fujimoto, M., Watanabe, S., Ogawa, A., Otsuka, K., Mikami, D., Kinoshita, K., Nakatani, T., Nakamura, A. and Yamato, J.: Low-latency real-time meeting recognition and understanding using distant microphones and omni-directional camera, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 20, No. 2, pp. 499-513 (2012)

[Ishiguro 10] Ishiguro, K., Iwata, T., Ueda, N. and Tenenbaum, J. B.: Dynamic infinite relational model for time-varying relational data analysis, Advances in Neural Information Processing Systems, Vol. 23, pp. 919-927 (2010)

[Ishiguro 12] Ishiguro, K., Ueda, N. and Sawada, H.: Subset infinite relational models, Proc. 15th Int. Conf. on Artificial Intelligence and Statistics (AISTATS2012), pp. 547-555 (2012)

[Isozaki 10] Isozaki, H., Hirao, T., Duh, K., Sudoh, K. and Tsukada, H.: Automatic evaluation of translation quality for distant language pairs, Proc. 2010 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 944-952 (2010)

[Isozaki 12] Isozaki, H., Sudoh, K., Tsukada, H. and Duh, K.: HPSG-based preprocessing for English-to-Japanese translation, ACM Trans. on Asian Language Information Processing (TALIP), Vol. 11, No. 3, pp. 8:1-8:16 (2012)

[Kameoka 09] Kameoka, H., Ono, N., Kashino, K. and Sagayama, S.: Complex NMF: A new sparse representation for acoustic signals, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Proc. (ICASSP), pp. 3437-3440 (2009)

[Kameoka 10] Kameoka, H., Roux, J. L. and Ohishi, Y.: A statistical model of speech F0 contours, Proc. SAPA, pp. 43-48 (2010)

[Kemp 06] Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T. and Ueda, N.: Learning systems of concepts with an infinite relational model, AAAI 06 Proc. 21st National Conf. on Artificial Intelligence, pp. 381-388 (2006)

[Kinoshita 09] Kinoshita, K., Delcroix, M., Nakatani, T. and Miyoshi, M.: Suppression of late reverberation effect on speech signal using long-term multiple step linear prediction, IEEE Trans. on Audio, Speech, and Language Proc., Vol. 17, No. 4, pp. 534-545 (2009)

[ 13] …: …, NTT …, Vol. 25, No. 9, pp. 22-25 (2013)

[Mukai 10] Mukai, R., Kurozumi, T., Hiramatsu, K., Kawanishi, T., Nagano, H. and Kashino, K.: NTT Communication Science Laboratories at TRECVID 2010 content-based copy detection, TRECVID Workshop (2010)

[Murata 14] Murata, M., Nagano, H., Mukai, R., Kashino, K. and Satoh, S.: BM25 with exponential IDF for instance search, IEEE Trans. on Multimedia, Vol. 16, No. 6, to appear (2014)

[Nakano 14] Nakano, M., Ishiguro, K., Kimura, A., Yamada, T. and Ueda, N.: Rectangular tiling process, Proc. Int. Conf. on Machine Learning (ICML), pp. 361-369 (2014)

[RIBES 11] RIBES … (2011), http://www.kecl.ntt.co.jp/icl/lirg/ribes/index-j.html

[ 12] …: NMF …, …, Vol. 95, No. 9, pp. 829-833 (2012)

[Sawada 13] Sawada, H., Kameoka, H., Araki, S. and Ueda, N.: Multichannel extensions of non-negative matrix factorization with complex-valued data, IEEE Trans. on Audio, Speech and Language Processing, Vol. 21, No. 5, pp. 971-982 (2013)

[Sudoh 12] Sudoh, K., Duh, K., Tsukada, H., Nagata, M., Wu, X., Matsuzaki, T. and Tsujii, J.: NTT-UT statistical machine translation in NTCIR-9 PatentMT, NTCIR-9, pp. 585-592 (2012)

[Suzuki 09] Suzuki, J., Isozaki, H., Carreras, X. and Collins, M.: An empirical study of semi-supervised structured conditional models for dependency parsing, Proc. 2009 Conf. on Empirical Methods in Natural Language Processing (EMNLP), pp. 551-560 (2009)

(August 11, 2014)

About the authors

Kunio Kashino (柏野邦夫): … 1990 … 1995 … NTT … 2002 …

Kaoru Hiramatsu (平松薫): … 1994 … 1996 … NTT … 2003 to 04 …

Junji Yamato (大和淳司): … 1988 … 1990 … NTT … 1996 to 98 … MIT … MIT Electrical Engineering and Computer Science …

Takeshi Yamada (山田武士): … 1988 … NTT … 1996 to 97 …