Instrument Separation in Reverberant Environments Using Crystal Microphone Arrays

Nobutaka ITO (1, 2), Yu KITANO (1), Nobutaka ONO (1) and Shigeki SAGAYAMA (1)

(1) Graduate School of Information Science and Technology, The University of Tokyo
(2) METISS group, IRISA-INRIA, France

Abstract: This paper deals with instrument separation from a music signal recorded with a microphone array in a reverberant environment. To this end, we apply our signal processing method for suppressing diffuse noise, which utilizes symmetrical microphone arrangements called crystal microphone arrays. We report experimental results in which the proposed method is applied to instrument separation in a reverberant environment.

© 2009 Information Processing Society of Japan

1. Introduction

Works cited in this section: microphone array signal processing 1), MIREX 2), Wiener post-filtering 3),4),5),6), and the crystal-array approach to diffuse noise suppression 7). Section 2 describes the method and Section 3 the experiments.
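Section 4 concludes the paper.

2. Diffuse noise suppression with crystal microphone arrays

2.1 Notation

The superscripts $^T$ and $^H$ denote transposition and Hermitian transposition, and $E[\cdot]$ denotes expectation. Signals are represented in the time-frequency domain with frame index $\tau \in \mathbb{Z}$ and angular frequency $\omega \in \mathbb{R}$. For scalar signals $\alpha(\tau,\omega)$ and $\beta(\tau,\omega)$, and for a vector signal $\boldsymbol{\alpha}(\tau,\omega)$, we define

$$\phi_{\alpha\beta}(\tau,\omega) \triangleq E[\alpha(\tau,\omega)\,\beta^{*}(\tau,\omega)], \tag{1}$$

$$\Phi_{\boldsymbol{\alpha}\boldsymbol{\alpha}}(\tau,\omega) \triangleq E[\boldsymbol{\alpha}(\tau,\omega)\,\boldsymbol{\alpha}^{H}(\tau,\omega)]. \tag{2}$$

2.2 Observation model

Let $s(\tau,\omega)$ be the target signal, and let $x_m(\tau,\omega)$ and $v_m(\tau,\omega)$ be the observed signal and the noise at the $m$-th microphone. The transfer function $d_m(\omega)$ from $s(\tau,\omega)$ to $x_m(\tau,\omega)$ is modeled as a pure delay,

$$d_m(\omega) = e^{-j\omega\delta_m}, \tag{3}$$

where $\delta_m$ is the delay at the $m$-th microphone. The observation is then

$$x_m(\tau,\omega) = s(\tau,\omega)\,d_m(\omega) + v_m(\tau,\omega), \tag{4}$$

or, in vector form,

$$\boldsymbol{x}(\tau,\omega) = s(\tau,\omega)\,\boldsymbol{d}(\omega) + \boldsymbol{v}(\tau,\omega), \tag{5}$$

with

$$\boldsymbol{x}(\tau,\omega) \triangleq [x_1(\tau,\omega)\ x_2(\tau,\omega)\ \dots\ x_M(\tau,\omega)]^{T}, \tag{6}$$

$$\boldsymbol{d}(\omega) \triangleq [d_1(\omega)\ d_2(\omega)\ \dots\ d_M(\omega)]^{T}, \tag{7}$$

$$\boldsymbol{v}(\tau,\omega) \triangleq [v_1(\tau,\omega)\ v_2(\tau,\omega)\ \dots\ v_M(\tau,\omega)]^{T}, \tag{8}$$

where $M$ is the number of microphones. The signals $s(\tau,\omega)$ and $\boldsymbol{v}(\tau,\omega)$ are assumed to be uncorrelated.

2.3 Multichannel Wiener filter

The multichannel Wiener filter 4),8) is the linear estimator

$$\hat{s}(\tau,\omega) \triangleq \boldsymbol{w}^{H}(\tau,\omega)\,\boldsymbol{x}(\tau,\omega) \tag{9}$$

that minimizes the mean square error

$$E\left[\,|\hat{s}(\tau,\omega) - s(\tau,\omega)|^{2}\,\right]. \tag{10}$$

The optimal estimate is

$$\hat{s}_o(\tau,\omega) \triangleq E[s(\tau,\omega)\,\boldsymbol{x}^{H}(\tau,\omega)]\;E[\boldsymbol{x}(\tau,\omega)\,\boldsymbol{x}^{H}(\tau,\omega)]^{-1}\,\boldsymbol{x}(\tau,\omega) \tag{11}$$

$$= \phi_{ss}(\tau,\omega)\,\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{x}(\tau,\omega), \tag{12}$$

where, following (1) and (2), $\phi_{ss}(\tau,\omega)$ and $\Phi_{xx}(\tau,\omega)$ are

$$\phi_{ss}(\tau,\omega) \triangleq E\left[\,|s(\tau,\omega)|^{2}\,\right], \tag{13}$$

$$\Phi_{xx}(\tau,\omega) \triangleq E[\boldsymbol{x}(\tau,\omega)\,\boldsymbol{x}^{H}(\tau,\omega)]. \tag{14}$$

To rewrite (12), define the beamformer output

$$y(\tau,\omega) \triangleq \frac{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{x}(\tau,\omega)}{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{d}(\omega)}. \tag{15}$$

The power spectrum of $y(\tau,\omega)$ is

$$\phi_{yy}(\tau,\omega) = \frac{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\Phi_{xx}(\tau,\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{d}(\omega)}{[\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{d}(\omega)]^{2}} \tag{16}$$

$$= \frac{1}{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{d}(\omega)}. \tag{17}$$

From (12) and (17), $\hat{s}_o(\tau,\omega)$ can be written as 4),8)

$$\hat{s}_o(\tau,\omega) = \underbrace{\frac{\phi_{ss}(\tau,\omega)}{\phi_{yy}(\tau,\omega)}}_{\triangleq\, p(\tau,\omega)}\;\underbrace{\frac{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{x}(\tau,\omega)}{\boldsymbol{d}^{H}(\omega)\,\Phi_{xx}^{-1}(\tau,\omega)\,\boldsymbol{d}(\omega)}}_{=\, y(\tau,\omega)}. \tag{18}$$

That is, $\hat{s}_o(\tau,\omega)$ is obtained by applying the Wiener post-filter $p(\tau,\omega)$ to the beamformer output $y(\tau,\omega)$.

2.4 Estimation of $\phi_{ss}(\tau,\omega)$

The Wiener post-filter $p(\tau,\omega)$ requires $\phi_{ss}(\tau,\omega)$, which has to be estimated from the observation $\boldsymbol{x}(\tau,\omega)$. We use the method of 7).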
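The factorization in (15)-(18) is an algebraic identity, so it can be checked numerically. The following is a minimal sketch in Python with NumPy, not the authors' implementation: the delays, the matrix standing in for $\Phi_{xx}$, the value of $\phi_{ss}$ and the observation vector are arbitrary illustrative values, and the negative sign in the phase of the steering vector is an assumed convention.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 4  # number of microphones (illustrative)

# Hypothetical steering vector with unit-modulus entries, cf. Eq. (3).
# The sign of the phase is an assumed convention.
omega = 2 * np.pi * 1000.0
delays = rng.uniform(0.0, 1e-3, M)
d = np.exp(-1j * omega * delays)

# Arbitrary Hermitian positive-definite stand-in for Phi_xx,
# an arbitrary phi_ss, and an arbitrary observation vector x.
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
Phi_xx = A @ A.conj().T + M * np.eye(M)
phi_ss = 0.7
x = rng.standard_normal(M) + 1j * rng.standard_normal(M)

Phi_inv_d = np.linalg.solve(Phi_xx, d)       # Phi_xx^{-1} d
denom = np.real(d.conj() @ Phi_inv_d)        # d^H Phi_xx^{-1} d (real, positive)

# Multichannel Wiener estimate, Eq. (12).
s_wiener = phi_ss * (Phi_inv_d.conj() @ x)

# Beamformer output, Eq. (15), and its power, Eqs. (16)-(17).
w = Phi_inv_d / denom                        # beamformer weights
y = w.conj() @ x
phi_yy = 1.0 / denom                         # Eq. (17)
phi_yy_direct = np.real(w.conj() @ Phi_xx @ w)  # Eq. (16)

# Post-filter gain and the factorization of Eq. (18).
p = phi_ss / phi_yy
print(np.isclose(phi_yy, phi_yy_direct), np.isclose(s_wiener, p * y))
```

Because the stand-in for $\Phi_{xx}$ is arbitrary here, the gain `p` is not guaranteed to lie in $[0, 1]$; the sketch only checks the algebra.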
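Since $s(\tau,\omega)$ and $\boldsymbol{v}(\tau,\omega)$ are uncorrelated, (5) gives

$$\Phi_{xx}(\tau,\omega) = \phi_{ss}(\tau,\omega)\,\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega) + \Phi_{vv}(\tau,\omega). \tag{19}$$

Here $\Phi_{xx}(\tau,\omega)$ can be estimated from the observation and $\boldsymbol{d}(\omega)\boldsymbol{d}^{H}(\omega)$ is determined by $\boldsymbol{d}(\omega)$, whereas $\Phi_{vv}(\tau,\omega)$ is unknown. To estimate $\phi_{ss}(\tau,\omega)$ from $\Phi_{xx}(\tau,\omega)$, we make the following assumptions on $\Phi_{vv}(\tau,\omega)$.

(a) The power spectrum $\phi_{v_m v_m}(\tau,\omega)$ of the noise $v_m(\tau,\omega)$ is the same at all microphones:

$$\phi_{v_1 v_1}(\tau,\omega) = \phi_{v_2 v_2}(\tau,\omega) = \dots = \phi_{v_M v_M}(\tau,\omega). \tag{20}$$

(b) The cross-spectrum $\phi_{v_m v_n}(\tau,\omega)$ of the noise depends only on the distance $r_{mn}$ between microphones $m$ and $n$:

$$r_{mn} = r_{kl} \;\Rightarrow\; \phi_{v_m v_n}(\tau,\omega) = \phi_{v_k v_l}(\tau,\omega). \tag{21}$$

By (1) and (2), $\phi_{v_m v_n}(\tau,\omega)$ is the $(m, n)$ entry of $\Phi_{vv}(\tau,\omega)$, so (a) constrains the diagonal entries of $\Phi_{vv}(\tau,\omega)$ and (b) constrains its off-diagonal entries.

As an example, consider the four-microphone array of Fig. 1 9),10). Assumption (a) gives

$$\phi_{v_1 v_1}(\tau,\omega) = \phi_{v_2 v_2}(\tau,\omega) = \phi_{v_3 v_3}(\tau,\omega) = \phi_{v_4 v_4}(\tau,\omega) =: \alpha(\tau,\omega). \tag{22}$$

The inter-microphone distances of the array in Fig. 1 satisfy

$$r_{12} = r_{21} = r_{34} = r_{43}, \tag{23}$$

$$r_{13} = r_{31} = r_{24} = r_{42}, \tag{24}$$

$$r_{14} = r_{41} = r_{23} = r_{32}, \tag{25}$$

so assumption (b) gives

$$\phi_{v_1 v_2}(\tau,\omega) = \phi_{v_2 v_1}(\tau,\omega) = \phi_{v_3 v_4}(\tau,\omega) = \phi_{v_4 v_3}(\tau,\omega) =: \beta(\tau,\omega), \tag{26}$$

$$\phi_{v_1 v_3}(\tau,\omega) = \phi_{v_3 v_1}(\tau,\omega) = \phi_{v_2 v_4}(\tau,\omega) = \phi_{v_4 v_2}(\tau,\omega) =: \gamma(\tau,\omega), \tag{27}$$

$$\phi_{v_1 v_4}(\tau,\omega) = \phi_{v_4 v_1}(\tau,\omega) = \phi_{v_2 v_3}(\tau,\omega) = \phi_{v_3 v_2}(\tau,\omega) =: \delta(\tau,\omega). \tag{28}$$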
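The distance equalities (23)-(25) and the resulting pattern (22), (26)-(28) can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration: the geometry of Fig. 1 is not recoverable from the text, so a hypothetical rectangular array is used, and the noise is modeled as spherically isotropic with coherence $\sin(\omega r/c)/(\omega r/c)$, which is one noise field satisfying (a) and (b) and is not taken from the paper.

```python
import numpy as np

# Hypothetical rectangular array (metres); with this microphone ordering
# the pairwise distances satisfy Eqs. (23)-(25).
a, b = 0.04, 0.03
pos = np.array([[0.0, 0.0], [a, 0.0], [0.0, b], [a, b]])

# Pairwise distances r_mn (0-based indices here, 1-based in the text).
r = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)

# Assumed noise model (not from the paper): spherically isotropic noise,
# equal power at every microphone, coherence sin(omega r / c) / (omega r / c).
c = 340.0                    # speed of sound in m/s (assumed)
omega = 2 * np.pi * 2000.0   # angular frequency in rad/s (illustrative)
noise_power = 1.0
# np.sinc(t) = sin(pi t) / (pi t), hence the division by pi.
Phi_vv = noise_power * np.sinc(omega * r / (np.pi * c))

# Read alpha, beta, gamma, delta off the first row and rebuild the matrix
# from the four values alone; the index pattern is m XOR n (0-based).
alpha, beta, gamma, delta = Phi_vv[0]
idx = np.arange(4)
pattern = np.array([alpha, beta, gamma, delta])[idx[:, None] ^ idx[None, :]]

print(np.allclose(Phi_vv, pattern))
```

Under these assumptions the full 4 x 4 noise covariance is determined by the four values in its first row, which is the structure that (22) and (26)-(28) describe.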
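From (22), (26), (27) and (28), $\Phi_{vv}(\tau,\omega)$ has the form

$$\Phi_{vv}(\tau,\omega) = \begin{bmatrix} \alpha(\tau,\omega) & \beta(\tau,\omega) & \gamma(\tau,\omega) & \delta(\tau,\omega) \\ \beta(\tau,\omega) & \alpha(\tau,\omega) & \delta(\tau,\omega) & \gamma(\tau,\omega) \\ \gamma(\tau,\omega) & \delta(\tau,\omega) & \alpha(\tau,\omega) & \beta(\tau,\omega) \\ \delta(\tau,\omega) & \gamma(\tau,\omega) & \beta(\tau,\omega) & \alpha(\tau,\omega) \end{bmatrix}. \tag{29}$$

Regardless of the values of $\alpha(\tau,\omega)$, $\beta(\tau,\omega)$, $\gamma(\tau,\omega)$ and $\delta(\tau,\omega)$, this matrix is diagonalized by the unitary matrix 10)

$$U = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}. \tag{30}$$

Multiplying (19) by $U^{H}$ from the left and by $U$ from the right gives

$$U^{H}\Phi_{xx}(\tau,\omega)\,U = \phi_{ss}(\tau,\omega)\,U^{H}\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega)\,U + U^{H}\Phi_{vv}(\tau,\omega)\,U. \tag{31}$$

Since $U^{H}\Phi_{vv}(\tau,\omega)\,U$ is diagonal, the off-diagonal entries of $U^{H}\Phi_{xx}(\tau,\omega)\,U$ satisfy

$$\boldsymbol{u}_m^{H}\Phi_{xx}(\tau,\omega)\,\boldsymbol{u}_n = \phi_{ss}(\tau,\omega)\,\boldsymbol{u}_m^{H}\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega)\,\boldsymbol{u}_n \quad (m \neq n), \tag{32}$$

where $\boldsymbol{u}_m$ is the $m$-th column of $U$. The left-hand side $\boldsymbol{u}_m^{H}\Phi_{xx}(\tau,\omega)\,\boldsymbol{u}_n$ can be computed from the observation and $\boldsymbol{u}_m^{H}\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega)\,\boldsymbol{u}_n$ is known, so (32) is a set of linear equations in the single unknown $\phi_{ss}(\tau,\omega)$. Its least-squares solution gives the estimate

$$\hat{\phi}_{ss}(\tau,\omega) \triangleq \frac{\displaystyle\sum_{m \neq n} \left[\boldsymbol{u}_m^{H}\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega)\,\boldsymbol{u}_n\right]^{*}\,\boldsymbol{u}_m^{H}\Phi_{xx}(\tau,\omega)\,\boldsymbol{u}_n}{\displaystyle\sum_{m \neq n} \left|\boldsymbol{u}_m^{H}\boldsymbol{d}(\omega)\,\boldsymbol{d}^{H}(\omega)\,\boldsymbol{u}_n\right|^{2}}. \tag{33}$$

2.5 Design of the Wiener post-filter

Substituting the estimate (33) for $\phi_{ss}(\tau,\omega)$ gives the Wiener post-filter estimate

$$\hat{p}(\tau,\omega) \triangleq \frac{\hat{\phi}_{ss}(\tau,\omega)}{\hat{\phi}_{yy}(\tau,\omega)}, \tag{34}$$

where, following Zelinski 3), $\phi_{yy}(\tau,\omega)$ is estimated by

$$\hat{\phi}_{yy}(\tau,\omega) \triangleq \frac{1}{M}\sum_{m=1}^{M} \phi_{x_m x_m}(\tau,\omega). \tag{35}$$

The true post-filter satisfies $0 \le p(\tau,\omega) \le 1$, so $\hat{p}(\tau,\omega)$ is clipped to this range:

$$\hat{p}(\tau,\omega) \leftarrow \begin{cases} 0, & \text{if } \hat{p}(\tau,\omega) < 0, \\ 1, & \text{if } \hat{p}(\tau,\omega) > 1. \end{cases} \tag{36}$$

3. Experiments

The experimental setup is shown in Fig. 2. The room acoustics were simulated with the image method 11), with a reverberation time RT60 of 270 ms. The source signals were taken from SiSEC 12); they were 6 s long and sampled at 16 kHz. The beamformer (15) and the post-filter (34) were applied in the short-time Fourier domain with a Hamming window; the analysis parameters were 512 and 16. For (15), $\Phi_{xx}$ was estimated from $\boldsymbol{x}(\tau,\omega)\,\boldsymbol{x}^{H}(\tau,\omega)$. For (34), $\Phi_{xx}(\tau,\omega)$ in (33) and $\phi_{x_m x_m}(\tau,\omega)$ in (35) were estimated from $\boldsymbol{x}(\tau,\omega)\,\boldsymbol{x}^{H}(\tau,\omega)$, that is, from the products $x_m(\tau,\omega)\,x_n^{*}(\tau,\omega)$, with an averaging parameter of 32. The steering vector $\boldsymbol{d}(\omega)$ is required in (15) and (33). Performance was evaluated with the SN ratio 4) and the log-spectral distance (LSD) 13). The results are given in Tables 1 and 2.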
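The estimation procedure of Sections 2.4 and 2.5, Eqs. (29)-(36), can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the values of $\alpha, \beta, \gamma, \delta$, the delays and the true $\phi_{ss}$ are arbitrary, the exact covariance matrix from (19) is used in place of an estimate from data, and the sign pattern of `U` is one valid ordering of the columns (any ordering diagonalizes the matrix in (29)).

```python
import numpy as np

# Arbitrary illustrative noise parameters; Phi_vv has the pattern of Eq. (29),
# whose (m, n) entry is selected by m XOR n for 0-based indices.
alpha, beta, gamma, delta = 1.0, 0.6, 0.5, 0.3
idx = np.arange(4)
Phi_vv = np.array([alpha, beta, gamma, delta])[idx[:, None] ^ idx[None, :]]

# Unitary matrix as in Eq. (30) (real and symmetric, so U^H = U^T).
U = 0.5 * np.array([[1, 1, 1, 1],
                    [1, 1, -1, -1],
                    [1, -1, 1, -1],
                    [1, -1, -1, 1]], dtype=float)

off = ~np.eye(4, dtype=bool)                 # mask of off-diagonal entries
D = U.T @ Phi_vv @ U
noise_is_diagonalized = np.allclose(D[off], 0.0)

# Hypothetical steering vector and true target power (illustrative values).
omega = 2 * np.pi * 1000.0
delays = np.array([0.0, 1.0e-4, 2.5e-4, 4.0e-4])
d = np.exp(-1j * omega * delays)             # phase sign is an assumed convention
phi_ss = 2.0

# Exact observation covariance, Eq. (19).
Phi_xx = phi_ss * np.outer(d, d.conj()) + Phi_vv

# Least-squares estimate of phi_ss from off-diagonal entries, Eqs. (31)-(33).
g = U.T @ d                                  # U^H d
A = np.outer(g, g.conj())                    # entries u_m^H d d^H u_n
B = U.T @ Phi_xx @ U                         # entries u_m^H Phi_xx u_n
phi_ss_hat = np.real(np.sum(A[off].conj() * B[off]) / np.sum(np.abs(A[off]) ** 2))

# Post-filter, Eqs. (34)-(36): Zelinski-style denominator, then clipping.
phi_yy_hat = np.mean(np.real(np.diag(Phi_xx)))
p_hat = np.clip(phi_ss_hat / phi_yy_hat, 0.0, 1.0)

print(noise_is_diagonalized, phi_ss_hat, p_hat)
```

With an exact $\Phi_{xx}$ the estimate recovers $\phi_{ss}$ regardless of the noise parameters. With a covariance estimated from a finite number of frames, the off-diagonal entries of $U^{H}\Phi_{vv}U$ are only approximately zero and the estimate becomes approximate as well. The denominator of (33) vanishes when $U^{H}\boldsymbol{d}$ has a single nonzero entry, for example when all delays are equal, so the sketch uses unequal delays.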
Table 1  SN ratio (dB). The first row is the value at mic 1.

                  1       2       3
    (@mic 1)     4.4     8.9     1.2
    -            4.1     1.1     4.6
    -           10.8     3.9    12.5

Table 2  LSD (dB). The first row is the value at mic 1.

                  1       2       3
    (@mic 1)    10.1     9.2     7.8
    -            4.8     4.8     6.2
    -            4.6     6.1     6.5

Fig. 3  Four panels (a)-(d); panel (b) is the signal at mic 1.
4. Conclusion

The SN ratio improvement obtained was 3 to 8 dB.

References

1) M. Brandstein and D. Ward, Eds., Microphone Arrays: Signal Processing Techniques and Applications. Berlin: Springer-Verlag, 2001.
2) http://www.music-ir.org/mirex/2009/index.php
3) R. Zelinski, "A microphone array with adaptive post-filtering for noise reduction in reverberant rooms," in Proc. ICASSP 88, New York, Apr. 1988, pp. 2578-2581.
4) K. U. Simmer, J. Bitzer, and C. Marro, "Post-filtering techniques," in Microphone Arrays: Signal Processing Techniques and Applications, M. Brandstein and D. Ward, Eds. Berlin: Springer-Verlag, 2001, ch. 3, pp. 39-60.
5) I. A. McCowan and H. Bourlard, "Microphone array post-filter based on noise field coherence," IEEE Trans. Speech Audio Process., vol. 11, no. 6, pp. 709-716, Nov. 2003.
6) S. Lefkimmiatis and P. Maragos, "A generalized estimation approach for linear and nonlinear microphone array post-filters," Speech Commun., vol. 49, no. 7-8, pp. 657-666, July-Aug. 2007.
7) N. Ito, N. Ono, and S. Sagayama, "A blind noise decorrelation approach with crystal arrays on designing post-filters for diffuse noise suppression," in Proc. ICASSP 2008, Las Vegas, Apr. 2008, pp. 317-320.
8) H. L. Van Trees, Optimum Array Processing. New York: John Wiley & Sons, 2002.
9) H. Shimizu, N. Ono, K. Matsumoto, and S. Sagayama, "Isotropic noise suppression in the power spectrum domain by symmetric microphone arrays," in Proc. WASPAA, New Paltz, NY, Oct. 2007, pp. 54-57.
10) N. Ono, N. Ito, and S. Sagayama, "Five classes of crystal arrays for blind decorrelation of diffuse noise," in Proc. SAM, Darmstadt, Germany, Jul. 2008, pp. 151-154.
11) J. B. Allen and D. A. Berkley, "Image method for efficiently simulating small-room acoustics," J. Acoust. Soc. Am., vol. 65, no. 4, pp. 943-950, Apr. 1979.
12) http://sisec.wiki.irisa.fr/tiki-index.php
13) I. Cohen, "Multichannel post-filtering in nonstationary noise environments," IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1149-1160, May 2004.

Acknowledgment: CREST.