2013 M0110453
2013 : M0110453 20
1 1.1 2 CD [1][2] Crypton Future Media MUTANT [3] MUTANT
1.1 MUTANT
Figure 1.1: MUTANT
[4][5] SoundHound midomi [6] midomi 10 [7] [8]
1.2 2 3 4
2 2.1 2.2 2.3 2.4 PCM 2.1
Table 2.1: PCM
Sampling frequency       44.1 kHz
Quantization bit depth   16 bit
Bit rate                 705.6 kbps

PCM 1 1 2.1 16bit PCM -32768 32767 -1 1 1 1 2.1 2.1 0.1
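As a quick arithmetic check of the PCM figures in Table 2.1, the sketch below recomputes the bit rate and maps a 16-bit sample into the range from -1 to 1. The single (mono) channel and the division by 32768 are assumptions; the surviving text gives only the numbers 44.1 kHz, 16 bit, -32768, 32767, -1 and 1.

```python
SAMPLE_RATE_HZ = 44_100   # from Table 2.1
BITS_PER_SAMPLE = 16      # from Table 2.1
CHANNELS = 1              # assumption: mono, which reproduces the listed bit rate


def bit_rate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def normalize_sample(sample):
    """Map a signed 16-bit sample (-32768..32767) into [-1, 1).

    Dividing by 32768 is an assumed convention, not taken from the source.
    """
    return sample / 32768.0


print(bit_rate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS))
```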
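Figure 2.1:

From $a$, $b$, and the sampling frequency $f_s$, $T$ is given by (2.1).

$$T = \frac{a - b}{f_s} \tag{2.1}$$

With the threshold $\eta$, $\alpha$ is given by (2.2).

$$\alpha = \begin{cases} 1 - \dfrac{|T|}{\eta} & (|T| < \eta) \\ 0 & (\text{otherwise}) \end{cases} \tag{2.2}$$

2.4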
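The following is a minimal sketch of (2.1) and (2.2), not the author's implementation. It assumes that a and b are the lengths in samples of the two sounds being compared and that the difference is taken in absolute value; both readings are assumptions, since the defining text did not survive. The function name is hypothetical.

```python
def duration_similarity(a, b, fs, eta):
    """Return alpha from (2.2).

    a, b : lengths of the two sounds in samples (assumed meaning)
    fs   : sampling frequency in Hz
    eta  : threshold, in the same unit as T (seconds here)
    """
    t = (a - b) / fs           # (2.1): length difference in seconds
    if abs(t) < eta:
        return 1.0 - abs(t) / eta
    return 0.0                 # difference reaches or exceeds the threshold


# Two sounds that differ by one second, with a two-second threshold
print(duration_similarity(88_200, 44_100, 44_100, 2.0))
```

Under these assumptions identical lengths give 1, and the value falls linearly to 0 as the length difference approaches the threshold.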
η 3

2.2

[9][10][11][12] [13][14][15] [16][17] [9] 2 [18] 2 (2.3) (2.4)

$$H(p) = \sum_{q=0}^{N-1} h(q)\, e^{-i\frac{2\pi pq}{N}} \quad (p = 0, 1, 2, \ldots, N-1) \tag{2.3}$$

$$h(q) = \frac{1}{N} \sum_{p=0}^{N-1} H(p)\, e^{i\frac{2\pi pq}{N}} \quad (q = 0, 1, 2, \ldots, N-1) \tag{2.4}$$

H(p) h(q) N N 2048
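t (2.5)

$$t = \frac{N}{f_s} \tag{2.5}$$

With $N = 2048$ and $f_s = 44.1$ kHz, $t \approx 0.04644$ s.

The window function $W(q)$ [19][20] is given by (2.6).

$$W(q) = 0.54 - 0.46 \cos\frac{2\pi q}{N} \tag{2.6}$$

Applying the window to $h(q)$ gives $h'(q)$ as in (2.7).

$$h'(q) = h(q)\,W(q) \tag{2.7}$$

$h'(q)$ (2.3) $h(q)$ Figure 2.2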
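Equations (2.5) to (2.7) can be checked numerically with a short sketch. The function names are illustrative only. The window uses the denominator N exactly as written in (2.6), rather than the N - 1 found in some textbook definitions of the Hamming window.

```python
import math


def frame_duration(n, fs):
    """(2.5): duration in seconds of one analysis frame of n samples."""
    return n / fs


def hamming_window(n):
    """(2.6): W(q) = 0.54 - 0.46 cos(2*pi*q / N) for q = 0 .. N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * q / n) for q in range(n)]


def apply_window(frame, window):
    """(2.7): h'(q) = h(q) W(q), sample by sample."""
    return [h * w for h, w in zip(frame, window)]


print(round(frame_duration(2048, 44_100), 5))
```

The window equals 0.08 at q = 0 and reaches 1.0 at q = N/2, so samples near the frame edges are attenuated before the transform.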
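Figure 2.2:

(2.4) h(q) q q = 0 h(q) q 0 1 h(q) 0 q h(q) q

From $q$ and the sampling frequency $f_s$, the fundamental frequency $f_0$ is given by (2.8). Figure 2.3

$$f_0 = \frac{f_s}{q} \tag{2.8}$$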
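The sketch below shows one way the steps around (2.3), (2.4) and (2.8) could fit together. It rests on several assumptions, because the explanatory text is lost: the frame is Hamming-windowed as in (2.6) and (2.7); the autocorrelation is obtained by transforming the frame, taking the power spectrum, and applying the inverse transform, which is the Wiener-Khinchin relation that the citation of [18] suggests; and the lag q is taken as the largest autocorrelation value after the first lag at which it becomes negative. The small recursive FFT and all function names are illustrative stand-ins, not the author's code.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 transform; len(x) must be a power of two.

    The inverse direction is left unscaled; the caller applies 1/N.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def autocorrelation(frame):
    """Circular autocorrelation of the Hamming-windowed frame."""
    n = len(frame)
    window = [0.54 - 0.46 * math.cos(2 * math.pi * q / n) for q in range(n)]
    spectrum = fft([h * w for h, w in zip(frame, window)])   # (2.3)
    power = [abs(c) ** 2 for c in spectrum]
    return [c.real / n for c in fft(power, inverse=True)]    # (2.4)


def estimate_f0(frame, fs):
    """Return f0 = fs / q from (2.8), or None if no peak lag is found."""
    acf = autocorrelation(frame)
    half = len(acf) // 2
    # Assumed rule: skip lags until the autocorrelation first turns negative
    start = next((q for q in range(1, half) if acf[q] < 0), None)
    if start is None:
        return None
    q_peak = max(range(start, half), key=lambda q: acf[q])
    return fs / q_peak


# A sine with a period of exactly 128 samples in a 2048-sample frame
frame = [math.sin(2 * math.pi * n / 128) for n in range(2048)]
print(estimate_f0(frame, 44_100))
```

Because the lag is an integer number of samples, the estimate is quantized to fs / q; the sketch makes no attempt at sub-sample interpolation.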
Figure 2.3:

2 440Hz (A4) 1 12 2.2 A0 G#10 A0
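Table 2.2:
No.   Note   Frequency (Hz)
1     A0     27.50
2     A#0    29.14
3     B0     30.87
...
48    G#4    415.31
49    A4     440.00
50    A#4    466.16
...
118   F#10   23679.64
119   G10    25087.71
120   G#10   26579.50

(2.5) t m A (2.9)

$$A = \frac{1}{m} \sum_{j=1}^{m} |u_j - v_j| \tag{2.9}$$

u_j j v_j j

With the threshold $\tau$, $\beta$ is given by (2.10).

$$\beta = \begin{cases} 1 - \dfrac{A}{\tau} & (A < \tau) \\ 0 & (\text{otherwise}) \end{cases} \tag{2.10}$$

(2.2) A τ 50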
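The values in Table 2.2 are consistent with equal temperament referenced to A4 = 440 Hz at number 49, so each number n corresponds to 440 * 2^((n - 49) / 12) Hz. The sketch below uses that relation together with (2.9) and (2.10). It assumes that u_j and v_j are the note numbers of the j-th frame of the two sounds and that the absolute difference is averaged; neither point is confirmed by the surviving text. The function names are hypothetical.

```python
import math


def note_frequency(number):
    """Frequency in Hz of note `number` in Table 2.2 (49 = A4 = 440 Hz)."""
    return 440.0 * 2 ** ((number - 49) / 12)


def note_number(f0_hz):
    """Nearest Table 2.2 note number for a fundamental frequency in Hz."""
    return round(49 + 12 * math.log2(f0_hz / 440.0))


def pitch_similarity(u, v, tau):
    """Return beta from (2.10) for two per-frame note-number sequences.

    Sequences of unequal length are compared over the shorter one,
    which is an assumption of this sketch.
    """
    m = min(len(u), len(v))
    if m == 0:
        return 0.0
    a = sum(abs(u[j] - v[j]) for j in range(m)) / m   # (2.9)
    return 1.0 - a / tau if a < tau else 0.0


print(note_number(440.0), round(note_frequency(1), 2))
```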
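2.3

2-1 1 1 2-1 1 t

For the pairs $\{(x_k, y_k)\}$ $(k = 1, 2, 3, \ldots, m)$, the correlation coefficient $r$ is given by (2.11).

$$r = \frac{\sum_{k=1}^{m} (x_k - \bar{x})(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{m} (x_k - \bar{x})^2}\,\sqrt{\sum_{k=1}^{m} (y_k - \bar{y})^2}} \tag{2.11}$$

Here $\bar{x}$ and $\bar{y}$ are the means of $x = \{x_k\}$ and $y = \{y_k\}$. $\gamma$ is given by (2.12).

$$\gamma = \frac{r + 1}{2} \tag{2.12}$$

$r$ ranges from -1 to 1; $\gamma = 0$ when $r = -1$, and $\gamma = 1$ when $r = 1$.

2.4

$Z$ is computed from $\alpha$ of (2.2), $\beta$ of (2.10), and $\gamma$ of (2.12) by (2.13).

$$Z = \frac{1}{3}(\alpha + \beta + \gamma) \tag{2.13}$$

Z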
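Equations (2.11) to (2.13) translate directly into a few lines of code. The sketch below assumes that x and y are equal-length sequences of per-frame volume values; how those values are measured is not recoverable from the text. Raising an error for a constant sequence is a choice made for this sketch, since (2.11) is undefined in that case.

```python
import math


def pearson_r(x, y):
    """(2.11): correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    num = sum((xk - mean_x) * (yk - mean_y) for xk, yk in zip(x, y))
    den = math.sqrt(sum((xk - mean_x) ** 2 for xk in x)) * math.sqrt(
        sum((yk - mean_y) ** 2 for yk in y)
    )
    if den == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return num / den


def volume_similarity(x, y):
    """(2.12): map r from [-1, 1] onto gamma in [0, 1]."""
    return (pearson_r(x, y) + 1) / 2


def overall_similarity(alpha, beta, gamma):
    """(2.13): Z is the unweighted mean of the three similarities."""
    return (alpha + beta + gamma) / 3


print(overall_similarity(1.0, 0.5, 0.0))
```

Since each of the three terms lies between 0 and 1, Z does as well, and each term contributes equally.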
3 2 3.1 2 1. 2. 2 2 A. B. 2
(http://maoudamashii.jokersounds.com/) On-jin (http://on-jin.com/) 2062 - [21] A B 30 5 3.1
Figure 3.1: (2.4) 3.1
Table 3.1:
OS           Windows 7 Professional 64bit
CPU          Intel(R) Core(TM) i7 CPU M 640 @ 2.80GHz
Memory       6.00GB
Microphone   ECM-PCV80U

3.2

20 14 6 A 2 I 3.2 II 3.3 3.2 II

Figure 3.2: I
Figure 3.3: II
Table 3.2: I 2 16 1 59 II 7 57 0 29 5 7 1 14
B 3.4 3.3
Figure 3.4:
Table 3.3:
2.85 3.7 3.3 3.2 t [22] 5% p t p I 0.63619 0.05 II 0.000006 0.05 0.000075 0.05 II
10 I t t p 0.043286 0.05 A 3 70 192 II
B 12 5 3 30 1 2
4
(http://maoudamashii.jokersounds.com/) On-jin (http://on-jin.com/)
[1] Apple Inc. iTunes. http://www.apple.com/jp/itunes/.
[2] Soundminer Inc. Soundminer. http://store.soundminer.com/.
[3] Crypton Future Media Inc. MUTANT. http://sonicwire.com/mutant.
[4] Shazam Entertainment Ltd. Shazam. http://www.shazam.com/.
[5] Sony Mobile Communications Inc. TrackID. http://appnavi.sonymobile.co.jp/pc/ag/index.php?page=cate&cid=26&id=925.
[6] SoundHound Inc. midomi. http://www.midomi.co.jp/.
[7],,.., 1994.
[8],.., 1996.
[9] Philip McLeod, Geoff Wyvill. A Smarter Way to Find Pitch. Proc. International Computer Music Conference, Barcelona, Spain, pp. 138-141, September 2005.
[10] Alain De Cheveigné, Hideki Kawahara. YIN, A Fundamental Frequency Estimator for Speech and Music. The Journal of the Acoustical Society of America, Vol. 111, p. 1917, April 2002.
[11] Lawrence R. Rabiner. On the Use of Autocorrelation Analysis for Pitch Detection. IEEE Trans. Acoust., Speech & Signal Process., Vol. ASSP-25, pp. 24-33, February 1977.
[12] M. J. Ross, H. L. Shaffer, A. Cohen, R. Freudberg, H. J. Manley. Average Magnitude Difference Function Pitch Extractor. IEEE Trans. Acoust., Speech & Signal Process., Vol. ASSP-22, No. 5, pp. 353-362, October 1974.
[13] Adriano Mitre, Marcelo Queiroz, Regis R. A. Faria. Accurate and Efficient Fundamental Frequency Determination from Precise Partial Estimates. Proc. 4th AES Brazil Conference, pp. 113-118, 2006.
[14] M. S. Andrews, J. Picone, R. D. Degroat. Robust Pitch Determination via SVD Based Cepstral Methods. IEEE Int. Conf. Acoust., Speech & Signal Process., Albuquerque, U.S.A., Vol. 1, pp. 253-256, April 1990.
[15] C. Nadeu, J. Pascual, J. Hernando. Pitch Determination using the Cepstrum of the One-sided Autocorrelation Sequence. IEEE Int. Conf. Acoust., Speech & Signal Process., Toronto, Canada, Vol. 5, pp. 3677-3680, April 1991.
[16].. PhD thesis,, March 2004.
[17] Stephen A. Zahorian, Hongbing Hu. A Spectral/temporal Method for Robust Fundamental Frequency Tracking. The Journal of the Acoustical Society of America, Vol. 123, pp. 4559-4571, April 2008.
[18] D. G. Lampard. Generalization of the Wiener-Khintchine Theorem to Nonstationary Processes. Journal of Applied Physics, Vol. 25, No. 6, p. 802, June 1954.
[19],. (Window Function). http://laputa.cs.shinshu-u.ac.jp/~yizawa/infsys1/basic/chap9/index.htm.
[20] Andrew Greensted. FIR Filters by Windowing. http://www.labbookpages.co.uk/audio/firwindowing.html.
[21] PINO.TO. -. http://pino.to/choroku/index.htm.
[22]. 2 t. http://www.spc.tmu.ac.jp/lit/2013/1a/stat3/index.html.