19 Estimation of Sound Source Zone using The Arrival Time Interval 1080351 2008 3 7
Abstract

Estimation of Sound Source Zone using The Arrival Time Interval

Koichiro Kanai

A microphone array can control its directivity and estimate the direction from which a sound arrives. It can therefore separate a desired sound from unwanted sounds according to their directions of arrival, which improves the S/N of the received sound. However, when the target sound source moves, the directivity of the microphone array must always be steered to follow the movement. Information on the changing sound source position is therefore necessary in order to automate directivity control of the microphone array. In this paper, estimation of the sound source zone using two microphones is proposed. The proposed system uses the cross-correlation of the two input signals. Once the arrival time interval is estimated, the microphone nearer to the sound source can be determined. The proposed system can also perform the estimation when two sound sources exist at the same time.

key words: cross-correlation, omnidirectional microphone, directional microphone, sound source
1 1.1 S/N [1] DFT [2] 2 2 1
1.2 1.2 2 5 2 3 4 5 2
2 2.1 10ms 1 2
2.2 340m/s 200mm 44.1kHz
$$\frac{0.2\,(\mathrm{m}) \times 44100\,(\mathrm{Hz})}{340\,(\mathrm{m/s})} = 25.94 \tag{2.1}$$
26
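The arithmetic of Eq. (2.1) can be checked with a short sketch. It converts the microphone spacing into the largest possible arrival-time difference, expressed in samples. The function name `max_lag_samples` and its parameter names are hypothetical; the values 0.2 m, 44100 Hz and 340 m/s are the ones appearing in Eq. (2.1).

```python
import math


def max_lag_samples(mic_spacing_m, sample_rate_hz, speed_of_sound_m_s=340.0):
    """Largest arrival-time difference between two microphones, in samples.

    The sound travels at most `mic_spacing_m` further to one microphone
    than to the other, so the delay is spacing / speed, times the sample rate.
    """
    return mic_spacing_m * sample_rate_hz / speed_of_sound_m_s


lag = max_lag_samples(0.2, 44100)
print(round(lag, 2))   # 25.94
print(math.ceil(lag))  # 26, the lag range rounded up to whole samples
```

Rounding up rather than down keeps the true maximum delay inside the searched lag range.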
2.3 speaker microphone 2.1 2.3 2.4 2 2.5 4
2.6
2.2 44.1kHz 20 10 441,000 10 180 2.3 10
[Figure 2.2: waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
[Figure 2.3: waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
3 3.1 3.2 2 3.1 1 0 B 0 A 7
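3.3 0 2
[Figure 3.1: flowchart; labels: mic A, mic B, lag, cross correlation, shift length > 0 (YES / NO), minus shift, plus shift, A, B]
3.3 2 N
$$f = \{f_0, f_1, \ldots, f_{N-1}\}, \qquad g = \{g_0, g_1, \ldots, g_{N-1}\}$$
$$R^{(fg)} = \frac{\langle f, g \rangle}{\|f\|\,\|g\|} \tag{3.1}$$
3.1 2 2
2 2 2
$$f = \{f_0, f_1, \ldots, f_{N-1}\}, \qquad g = \{g_0, g_1, \ldots, g_{N-1}\}$$
$$R^{(fg)}_n = \frac{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} f_i\, g_{i+n}}{\sqrt{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} f_i^2}\;\sqrt{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} g_{i+n}^2}} \tag{3.2}$$
f g n
$$g_n = \{g_{0+n}, g_{1+n}, \ldots, g_{N-1+n}\} \tag{3.3}$$
2 n
$$R^{(fg)}_n = \frac{\displaystyle\sum_{i=0}^{N-1} f_i\, g_{i+n}}{\sqrt{\displaystyle\sum_{i=0}^{N-1} f_i^2}\;\sqrt{\displaystyle\sum_{i=0}^{N-1} g_i^2}} \tag{3.4}$$
-1 +1
$$\hat{R}^{(fg)}_n = \frac{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} \hat{f}_i\, \hat{g}_{i+n}}{\sqrt{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} \hat{f}_i^2}\;\sqrt{\dfrac{1}{N}\displaystyle\sum_{i=0}^{N-1} \hat{g}_i^2}} = \frac{\displaystyle\sum_{i=0}^{N-1} \hat{f}_i\, \hat{g}_{i+n}}{\sqrt{\displaystyle\sum_{i=0}^{N-1} \hat{f}_i^2}\;\sqrt{\displaystyle\sum_{i=0}^{N-1} \hat{g}_i^2}} \tag{3.5}$$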
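A minimal sketch of the normalized cross-correlation at shift n, in the form of Eq. (3.2), followed by a search for the shift with the largest value. Several points are assumptions made for illustration and are not taken from the text: the sums run only over the samples where both f_i and g_{i+n} exist inside the frame, the function names `normalized_xcorr` and `estimate_shift` are hypothetical, and the test signal is synthetic noise rather than recorded data.

```python
import math
import random


def normalized_xcorr(f, g, n):
    """Correlation of f_i with g_{i+n}, normalized to the range -1..+1.

    Assumption: only indices with 0 <= i < N and 0 <= i + n < N are summed.
    """
    N = len(f)
    i0, i1 = max(0, -n), min(N, N - n)
    num = sum(f[i] * g[i + n] for i in range(i0, i1))
    energy_f = sum(f[i] ** 2 for i in range(i0, i1))
    energy_g = sum(g[i + n] ** 2 for i in range(i0, i1))
    denom = math.sqrt(energy_f * energy_g)
    return num / denom if denom > 0.0 else 0.0


def estimate_shift(f, g, max_lag):
    """Return (shift, value) for the shift in -max_lag..max_lag with the largest correlation."""
    lags = range(-max_lag, max_lag + 1)
    values = [normalized_xcorr(f, g, n) for n in lags]
    best = max(range(len(values)), key=values.__getitem__)
    return lags[best], values[best]


# Synthetic check: g is f delayed by 5 samples, so g_{i+5} equals f_i.
random.seed(0)
s = [random.gauss(0.0, 1.0) for _ in range(2100)]
f = s[50:2050]
g = s[45:2045]
shift, value = estimate_shift(f, g, max_lag=26)
print(shift)  # 5
```

The search range of 26 samples in the usage lines matches the bound of Eq. (2.1) for a 200mm spacing; for other spacings the bound would change accordingly.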
3.4 2 3.1 0
Table 3.1
0.0 ≤ r ≤ 0.2
0.2 < r ≤ 0.4
0.4 < r ≤ 0.7
0.7 < r ≤ 1.0
3.4
3.2 100mm 200mm 300mm 400mm 4 1,200mm 2 3.3 A B
3.5 (A260) 3.2 5
[Figure 3.2: labels: speaker, 1200mm, microphones A, B, C, D, E; A-B 100mm, A-C 200mm, A-D 300mm, A-E 400mm]
3.6
3.5 44.1kHz 4,410 8,820 13,230 17,640 4 2
$$h(n) = \begin{cases} 1 & (0 \le n < N-1) \\ 0 & (\text{otherwise}) \end{cases} \tag{3.6}$$
1/2
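The window lengths listed above can be related to time, and to the rectangular window of Eq. (3.6), with a small sketch. Cutting a frame out of the signal by slicing is equivalent to multiplying by a window that is 1 inside the frame and 0 elsewhere. The function name `frames` and the hop of half a frame used in the example are assumptions for illustration, not values confirmed by the text.

```python
def frames(signal, frame_len, hop):
    """Yield (start_index, frame) pairs cut out with a rectangular window."""
    for start in range(0, len(signal) - frame_len + 1, hop):
        yield start, signal[start:start + frame_len]


SAMPLE_RATE_HZ = 44100

# Duration of each window length at 44.1 kHz.
for frame_len in (4410, 8820, 13230, 17640):
    print(frame_len, frame_len / SAMPLE_RATE_HZ)

# Toy example: 10 samples, frames of 4 samples, hop of half a frame.
starts = [start for start, _ in frames(list(range(10)), frame_len=4, hop=2)]
print(starts)  # [0, 2, 4, 6]
```

A longer frame averages over more samples, at the cost of following a moving source more slowly.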
[Figure 3.3: labels: A, microphone A, microphone B, B]
A 10 B 2 2 1.0 4,098 2 A B
[Figure 3.4]
[Figure 3.5: waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
3.7
3.7 x(t) y(t) S
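[Figure 3.6: block diagram; labels: sound source, target zone, A, B, H(z), y(k), e(k), +, -, output]
[Figure 3.7: block diagram; labels: sound source, target zone, A, B, H(z), e(k), y(k), +, -, output]
A 10 B
$$S = 10 \log_{10} \frac{y(t)}{x(t)} \ [\mathrm{dB}] \tag{3.7}$$
0dB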
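As a worked reading of Eq. (3.7), the sketch below evaluates S for scalar inputs. It assumes that x and y are positive, power-like quantities already measured from x(t) and y(t); how they are measured is not recoverable from the text. The function name `level_ratio_db` is hypothetical.

```python
import math


def level_ratio_db(y, x):
    """S = 10 * log10(y / x) in dB, for positive scalar values y and x."""
    if x <= 0.0 or y <= 0.0:
        raise ValueError("x and y must be positive")
    return 10.0 * math.log10(y / x)


print(level_ratio_db(1.0, 1.0))    # equal levels give 0 dB
print(level_ratio_db(100.0, 1.0))  # a ratio of 100 gives 20 dB
```

Under this reading, S = 0 dB means that y and x are equal, and each factor of 10 in the ratio adds 10 dB.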
3.8 3.8 3.8 3.11 y A B 3.8 100mm 3.9 200mm 100mm 2 3.2 3.4 100mm 4,410 S 0dB 3.9 45 3.12 200mm 3.13 2 15
Table 3.2 200mm
Sample Number            4,410   8,820   13,230  17,640
445,410          19.66   20.51   18.91   19.36   18.16
533,610          19.62   17.91   19.90   19.81   19.16
621,810          19.08   16.27   17.11   16.11   17.13
710,010          18.26   25.26   24.70   25.00   25.60
798,210          20.75    1.17   24.50   24.86   25.04

Table 3.3 300mm
Sample Number            4,410   8,820   13,230  17,640
445,410          19.66   18.30   19.49   16.96   18.61
533,610          19.62   20.79   20.69   18.52   19.00
621,810          19.08   17.46   17.11   16.98   16.82
710,010          18.26   24.23   24.63   24.98   24.39
798,210          20.75    4.51   26.68   26.22   26.59

Table 3.4 400mm
Sample Number            4,410   8,820   13,230  17,640
445,410          19.66   16.19   15.39   18.71   16.05
533,610          19.62   17.80   16.83   16.63   16.43
621,810          19.08   16.43   16.27   15.47   15.62
710,010          18.26   31.47   31.09   29.82   30.50
798,210          20.75   27.15   24.96   26.61   27.05
[Figure 3.8: 100mm; vertical axis Shift Length, horizontal axis Time [sec]; series 4,410, 8,820, 13,230, 17,640]
[Figure 3.9: 200mm; vertical axis Shift Length, horizontal axis Time [sec]; series 4,410, 8,820, 13,230, 17,640]
100 200mm 300mm
[Figure 3.10: 300mm; vertical axis Shift Length, horizontal axis Time [sec]; series 4,410, 8,820, 13,230, 17,640]
[Figure 3.11: 400mm; vertical axis Shift Length, horizontal axis Time [sec]; series 4,410, 8,820, 13,230, 17,640]
[Figure 3.12: (a) mic A, mic B; (b) mic A, mic B, 45]
[Figure 3.13: 200mm45deg; vertical axis Shift Length, horizontal axis Time [sec]; series 4,410, 8,820, 13,230, 17,640]
4 4.1 4.2 4.1 B A A B 0 2 20
[Figure 4.1: flowchart; labels: mic A, mic B, lag, cross correlation, shift length > 0, plus shift, YES, shift length < 0, minus shift, YES, A, B]
4.3
44.1kHz A503 4.4 4.5 13,230 2 4,410 70 10
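The decision step of the flowchart in Figure 4.1 can be sketched as a small function. The mapping used here (plus shift to zone B, minus shift to zone A) is an assumption read off the placement of the flowchart labels, and the handling of a shift of exactly 0 is likewise assumed; the function name `zone_from_shift` is hypothetical.

```python
def zone_from_shift(shift_length):
    """Map the estimated shift length to a zone label.

    Assumed mapping: shift > 0 (plus shift) -> "B",
    shift < 0 (minus shift) -> "A", shift == 0 -> no decision.
    """
    if shift_length > 0:
        return "B"
    if shift_length < 0:
        return "A"
    return None


print(zone_from_shift(12))   # B
print(zone_from_shift(-7))   # A
print(zone_from_shift(0))    # None
```

If the opposite sign convention is used when computing the shift, the two labels simply swap.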
A 1 10 B 1 10 2 A B 10 A 10
[Figure 4.2: (1); waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
[Figure 4.3: (2); waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
[Figure 4.4: A; waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
[Figure 4.5: B; waveform plot; vertical axis Amplitude, horizontal axis Time [sec]]
4.4 10
4.5 4.5 4.1 4.3 4.6 4.12 4,410 shift B B shift A A 10 10 A B 0 0 200 300 400mm 45 300 400mm 4.12 200mm 70 24
Table 4.1 100mm45deg
                      A      B
4,410 (0.10sec)     0.54   0.61
8,820 (0.20sec)     0.45   0.66
13,230 (0.30sec)    0.42   0.68
17,640 (0.40sec)    0.42   0.67

Table 4.2 100mm
                      A      B
4,410 (0.10sec)     0.56   0.58
8,820 (0.20sec)     0.45   0.64
13,230 (0.30sec)    0.41   0.67
17,640 (0.40sec)    0.42   0.67

Table 4.3 200mm45deg
                      A      B
4,410 (0.10sec)     0.65   0.69
8,820 (0.20sec)     0.59   0.74
13,230 (0.30sec)    0.54   0.77
17,640 (0.40sec)    0.52   0.75

10 10 2 2
[Figure 4.6: 100mm45deg (4,410); vertical axis Shift Length, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.7: 100mm45deg (4,410); vertical axis Correlation Value, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.8: 100mm (4,410); vertical axis Shift Length, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.9: 100mm (4,410); vertical axis Correlation Value, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.10: 200mm45deg (4,410); vertical axis Shift Length, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.11: 200mm45deg (4,410); vertical axis Correlation Value, horizontal axis Time [sec]; series shift B, shift A]
[Figure 4.12: 200mm (4,410); vertical axis Shift Length, horizontal axis Time [sec]; series shift B, shift A]
5 5.1 2 5.2 2 30
2 4 31
[1] 2 (A) vol.J82-A no.6 pp.860-866 Jun 1999
[2] (A) vol.J83-A no.12 pp.1445-1454 Dec 2000
[3] 1995
[4] CQ 2005
A A.1 A.2 LMS 2 1967 LMS 1960 Kalman 2 RLS RLS N 1 N 2 LMS N 33
RLS
A.3
NLMS LMS t y(t) d(t)
$$d(t) = h_N^{T} x_N(t) \tag{A.1}$$
h_N = w_N  x(t)  A.1  A.1  h_N(t)  A.1  x_N(t)  w_N  h_N(t)  x_N(t)  w_N  h_N(t)  w_N (t)  w_N  h_N(t+1)
$$\begin{aligned} h_N(t+1) &= h_N(t) + \{h_N(t+1) - h_N(t)\} \\ &= h_N(t) + \frac{\{w_N - h_N(t)\}^{T}\{h_N(t+1) - h_N(t)\}}{\|h_N(t+1) - h_N(t)\|} \cdot \frac{h_N(t+1) - h_N(t)}{\|h_N(t+1) - h_N(t)\|} \end{aligned} \tag{A.2}$$
2
$$\frac{h_N(t+1) - h_N(t)}{\|h_N(t+1) - h_N(t)\|} = \frac{x_N(t)}{\|x_N(t)\|} \tag{A.3}$$
$$\{w_N - h_N(t)\}^{T} x_N(t) = d(t) - y(t) = e(t) \tag{A.4}$$
A.2
$$h_N(t+1) = h_N(t) + \frac{x_N(t)}{\|x_N(t)\|^{2}}\, e(t) \tag{A.5}$$
A.5
$$h_N(t+1) = h_N(t) + \alpha\, \frac{x_N(t)}{\|x_N(t)\|^{2}}\, e(t) \tag{A.6}$$
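The update of Eq. (A.6) can be sketched as a short system-identification loop. This is an illustration under stated assumptions, not the implementation used in the experiments: the function name `nlms_identify`, the step size alpha = 0.5, the small constant `eps` added to the norm to avoid division by zero, and the synthetic 4-tap system w are all chosen for the example.

```python
import random


def nlms_identify(x, d, num_taps, alpha=0.5, eps=1e-12):
    """Adapt h_N with h(t+1) = h(t) + alpha * x_N(t) * e(t) / ||x_N(t)||^2.

    `eps` is an assumed regularizer that is not part of Eq. (A.6).
    Returns the final coefficients and the error sequence e(t).
    """
    h = [0.0] * num_taps
    x_vec = [0.0] * num_taps  # x_N(t) = [x(t), x(t-1), ..., x(t-N+1)]
    errors = []
    for x_t, d_t in zip(x, d):
        x_vec = [x_t] + x_vec[:-1]
        y_t = sum(h_i * x_i for h_i, x_i in zip(h, x_vec))  # filter output y(t)
        e_t = d_t - y_t                                     # e(t) = d(t) - y(t)
        norm2 = sum(x_i * x_i for x_i in x_vec) + eps
        h = [h_i + alpha * x_i * e_t / norm2 for h_i, x_i in zip(h, x_vec)]
        errors.append(e_t)
    return h, errors


# Synthetic unknown system w_N and white-noise input (assumed test data).
random.seed(1)
w = [0.5, -0.3, 0.2, 0.1]
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
d = [sum(w[k] * x[t - k] for k in range(len(w)) if t - k >= 0)
     for t in range(len(x))]
h, errors = nlms_identify(x, d, num_taps=len(w))
print([round(h_i, 6) for h_i in h])
```

With alpha = 1 the update reduces to Eq. (A.5); values between 0 and 2 are the usual stable range for this kind of normalized update, and smaller values trade convergence speed for lower sensitivity to noise.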