

Subjective and objective quality evaluation of noise-reduced speech.
Takeshi Yamada, Shoji Makino and Nobuhiko Kitawaki (University of Tsukuba, Tsukuba, 305-8573)
Vol. 67, No. 10 (2011), pp. 476-481. PACS number: 43.72.+q

1.
MOS (Mean Opinion Score); ITU-T P.835 [1]; [2], [3]; [4-10]; [5, 6, 9].

2.
2.1
ITU-T P.835 [1]: three rating scales, each scored from 1 to 5.

Table 1. Five-point rating scales (P.835).

Score  Speech quality       Noise quality                  Overall quality
5      Not distorted        Not noticeable                 Excellent
4      Slightly distorted   Slightly noticeable            Good
3      Somewhat distorted   Noticeable but not intrusive   Fair
2      Fairly distorted     Somewhat intrusive             Poor
1      Very distorted       Very intrusive                 Bad

[11]. P.835. [12]. SNR: Clean, 20, 15, 10, 5 and 0 dB (6 conditions). EVRC (Enhanced Variable Rate Codec) [13]; [14]; SVD [15]; GMM [15]. 8 kHz. SNR, MOS.

2.2
\[ \text{Overall quality} = 0.6303 \times \text{Speech quality} + 0.6125 \times \text{Noise quality} - 1.3917 \qquad (1) \]
RMSE (Root Mean Square Error): 0.26 [6].
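Equation (1) and the RMSE figure quoted with it can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the sample scores are hypothetical, and the minus sign on the constant 1.3917 is an assumption (the sign did not survive in the extracted text; a plus sign would push the estimate above 5 for top-rated input).

```python
import math

# Coefficients as printed in Eq. (1). The negative constant is an assumption:
# the sign was lost in the extracted text.
W_SPEECH = 0.6303
W_NOISE = 0.6125
OFFSET = -1.3917


def overall_quality(speech_quality, noise_quality):
    """Estimate the overall-quality score from the two component scores."""
    return W_SPEECH * speech_quality + W_NOISE * noise_quality + OFFSET


def rmse(estimated, reference):
    """Root mean square error between two equal-length score sequences."""
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((e - r) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(squared / len(estimated))


# Hypothetical (speech, noise, subjective overall) score triples,
# made up for illustration only.
samples = [(4.2, 3.8, 3.6), (3.1, 2.9, 2.4), (2.0, 4.5, 2.5)]
estimates = [overall_quality(s, n) for s, n, _ in samples]
subjective = [o for _, _, o in samples]
print([round(v, 2) for v in estimates], round(rmse(estimates, subjective), 2))
```

Because the model is a plain weighted sum, the same function works whether the speech and noise scores come from listeners or from an objective estimator.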

[7]. P.835. FR: Full-Reference. NR: Non-Reference.

2.3 FR
ITU-T P.862 [16] (PESQ). Eq. (1). SNR. 2.1. RMSE: 0.33 (PESQ).

PESQ. NR. RMSE: 0.94 (PESQ).

2.4 NR
ITU-T P.563 [17]. P.563 (NR): Basic speech descriptors, Unnatural speech, 27; Noise analysis, Interruptions/Mutes, 24. Eq. (1). 2.3. RMSE: 0.37 (FR, P.563). RMSE: 0.58 (P.563).

3.
3.1
[2]. Overall range: 7.0 to 1.0.

Rank  Range
F4    7.0 to 5.5
F3    5.5 to 4.0
F2    4.0 to 2.5
F1    2.5 to 1.0

NTT [18].

F4, F1. AURORA-2J [19]. SNR: Clean, 20, 15, 10, 5 and 0 dB (6 conditions). Four conditions: (S) SS-SMT [20], (T) SVD [15], (G) GMM [15], (N). 8 kHz. F4, F1: SNR. F1, Clean: 80%.

3.2
PESQ MOS.
\[ y = \frac{a}{1 + e^{-b(x - c)}} \qquad (2) \]
x: PESQ MOS. a, b, c: constants. Eq. (2) [21]. F4, F1: PESQ MOS, SNR.
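The mapping in Eq. (2) is a logistic curve, and fitting its constants a, b and c to measured points can be sketched as below. This is an illustrative sketch only: the minus signs in the exponent are assumed (they were lost in the extracted text), y is treated here as a percentage score, the data are synthetic, and `fit_logistic` is a hypothetical helper that uses a simple grid search rather than whatever procedure the authors used.

```python
import math


def logistic(x, a, b, c):
    """Eq. (2), with assumed signs: y = a / (1 + exp(-b * (x - c)))."""
    return a / (1.0 + math.exp(-b * (x - c)))


def fit_logistic(xs, ys, b_grid, c_grid):
    """Least-squares fit of (a, b, c) by grid search over b and c.

    For fixed b and c the model is linear in a, so the best a has the
    closed form sum(y * g) / sum(g * g), where g = logistic(x, 1, b, c).
    """
    best = None
    for b in b_grid:
        for c in c_grid:
            g = [logistic(x, 1.0, b, c) for x in xs]
            a = sum(y * gi for y, gi in zip(ys, g)) / sum(gi * gi for gi in g)
            err = sum((y - a * gi) ** 2 for y, gi in zip(ys, g))
            if best is None or err < best[0]:
                best = (err, a, b, c)
    _, a, b, c = best
    return a, b, c


# Synthetic, noise-free points: x is a PESQ MOS value, y a score in percent.
xs = [1.2, 1.6, 2.0, 2.4, 2.8, 3.2, 3.6]
ys = [logistic(x, 95.0, 3.0, 2.0) for x in xs]

b_grid = [0.5 * k for k in range(1, 13)]   # 0.5 to 6.0
c_grid = [0.1 * k for k in range(10, 41)]  # 1.0 to 4.0
a, b, c = fit_logistic(xs, ys, b_grid, c_grid)
print(round(a, 2), round(b, 2), round(c, 2))
```

In this form a is the upper asymptote of the curve, c is the PESQ MOS at which y reaches a/2, and b controls how steeply y rises around c.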

3.1. RMSE: 4.2. 7.0. PESQ.

4.

References
[1] ITU-T Rec. P.835, Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm (2003).
[2] ..., 54, 842-849 (1998).
[3] ..., 63, 196-205 (2007).
[4] N. Egi, H. Aoki and A. Takahashi, Objective quality evaluation method for noise-reduced speech, IEICE Trans. Commun., E91-B, 1279-1286 (2008).
[5] ..., 7 QoS, pp. 40-41 (2009).
[6] T. Yamada, Y. Kasuya, Y. Shinohara and N. Kitawaki, Non-reference objective quality evaluation for noise-reduced speech using overall quality estimation model, IEICE Trans. Commun., E93-B, 1367-1372 (2010).
[7] ..., B-11-18, p. 447 (2011.3).
[8] ETSI EG 202 396-3 V1.3.1, Speech and multimedia Transmission Quality (STQ); Speech quality performance in the presence of background noise; Part 3: Background noise transmission - Objective test methods (2011).
[9] T. Yamada, M. Kumakura and N. Kitawaki, Objective estimation of word intelligibility for noise-reduced speech, IEICE Trans. Commun., E91-B, 4075-4077 (2008).
[10] K. Kondo and Y. Takano, Estimation of two-to-one forced selection intelligibility scores by speech recognizers using noise-adapted models, Proc. Interspeech 2010, pp. 302-305 (2010).
[11] Z. Cai, N. Kitawaki, T. Yamada and S. Makino, Comparison of MOS evaluation characteristics for Chinese, Japanese, and English in IP telephony, Proc. Int. Universal Communication Symp., IUCS2010, pp. 111-114 (2010).
[12] ..., http://research.nii.ac.jp/src/list/detail.html#jeida-noise.
[13] 3GPP2 C.S0014-A Version 1.0, Enhanced variable rate codec, speech service option 3 for wideband spread spectrum digital systems (2004).
[14] ..., J87-D-II, 464-474 (2004).
[15] M. Fujimoto and Y. Ariki, Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - Evaluation on the AURORA2 task, Proc. Eurospeech 2003, pp. 1781-1784 (2003).
[16] ITU-T Rec. P.862, Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs (2001).
[17] ITU-T Rec. P.563, Single ended method for objective speech quality assessment in narrow-band telephony applications (2004).
[18] NTT, http://research.nii.ac.jp/src/list/detail.html#FW03.
[19] S. Nakamura, K. Takeda, K. Yamamoto, T. Yamada, S. Kuroiwa, N. Kitaoka, T. Nishiura, A. Sasou, M. Mizumachi, C. Miyajima, M. Fujimoto and T. Endo, AURORA-2J: An evaluation framework for Japanese noisy speech recognition, IEICE Trans. Inf. Syst., E88-D, 535-544 (2005).
[20] ..., J83-D-II, 500-509 (2000).
[21] T. Yamada, M. Kumakura and N. Kitawaki, Performance estimation of speech recognition system under noise conditions using objective quality measures and artificial voice, IEEE Trans. Audio Speech Lang. Process., 14, 2006-2013 (2006).