IPSJ SIG Technical Report Vol.2014-MUS-104 No /8/27


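a) ikemiya@kuis.kyoto-u.ac.jp
b) itoyama@kuis.kyoto-u.ac.jp
c) yoshii@kuis.kyoto-u.ac.jp
d) okuno@aoni.waseda.jp

[Abstract: Japanese text lost in extraction. Surviving keywords: fundamental frequency (F0), Graphical User Interface (GUI).]

1. [Introduction: Japanese text lost in extraction. Surviving terms and citations, in order: [1], CD, MIDI, [2], [3, 4], [5], TANDEM-STRAIGHT [6], F0, [7], F0, [8], [9], F0, [] (reference number lost), F0.]

© 2014 Information Processing Society of Japan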

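[Figure 1: axis label "Time"; surviving labels F0 and GUI; caption lost in extraction.]

2. [Section title and overview lost in extraction. Surviving terms: F0, (Fig. 2), constant-Q transform [12].]

2.1 Constant-Q transform

The constant-Q transform [12] of an input signal x(n) is

$$X(n, k) = \frac{1}{N_k} \sum_{j = n - \lfloor N_k/2 \rfloor}^{n + \lfloor N_k/2 \rfloor} x(j)\, a_k^{*}\!\left(j - n + N_k/2\right) \tag{1}$$

$$a_k(n) = w\!\left(\frac{n}{N_k}\right) \exp\!\left(-i 2\pi n \frac{f_k}{f_s}\right), \qquad N_k = \frac{Q f_s}{f_k}$$

where k is the frequency-bin index, f_k is the center frequency of bin k [Hz], f_s is the sampling frequency, and w(t) is a window function defined on [0, 1]. The quality factor Q is set from the parameters fratio and qrate; its closed form is garbled in the source ("Q = (2 /fratio ) qrate"). The frame index n advances in fixed steps of [msec] (value lost). Below, the constant-Q spectrogram at time t and frequency f is written X(t, f).

2.2 Robust PCA

Robust PCA (RPCA) [13] decomposes an input matrix M into a low-rank matrix L and a sparse matrix S by solving

$$\text{minimize } \|L\|_{*} + \lambda \|S\|_{1} \quad \text{subject to } L + S = M \tag{2}$$

where $\|\cdot\|_{*}$ is the nuclear norm, $\|\cdot\|_{1}$ is the element-wise L1 norm, and λ is a positive constant that balances the low rank of L against the sparsity of S. [Remaining text lost in extraction. Surviving terms: RPCA [14], [14], constant-Q.]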
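The decomposition in Eq. (2) can be tried out with a few lines of NumPy. The sketch below is not taken from the report: it assumes the inexact augmented Lagrange multiplier iteration, a common solver for this problem, and the default λ = 1/sqrt(max(rows, cols)) that is usual for RPCA. The names `rpca_ialm` and `binary_mask` and the `gain` parameter are hypothetical; the mask rule (keep a bin when the sparse part dominates the low-rank part) is an assumption modeled on RPCA-based vocal separation, not a setting confirmed by the source.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage: the proximal operator of tau * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Approximately solve min ||L||_* + lam * ||S||_1  s.t.  L + S = M.

    Assumes M is a non-zero 2-D array. Uses the inexact ALM iteration
    (an assumed solver choice, not one stated in the report).
    """
    M = np.asarray(M, dtype=float)
    rows, cols = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(rows, cols))
    norm_fro = np.linalg.norm(M, "fro")
    norm_two = np.linalg.norm(M, 2)  # largest singular value
    mu = 1.25 / norm_two
    mu_max = mu * 1e7
    rho = 1.5
    Y = M / max(norm_two, np.abs(M).max() / lam)  # dual variable
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low-rank update: shrink the singular values.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: shrink the entries.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(residual, "fro") <= tol * norm_fro:
            break
    return L, S


def binary_mask(L, S, gain=1.0):
    """Hypothetical mask: 1 where the sparse part dominates the low-rank part."""
    return (np.abs(S) > gain * np.abs(L)).astype(float)
```

For a magnitude spectrogram, M would be the time-frequency matrix; L then tends to capture repetitive accompaniment and S the more variable vocal part, which is the usual motivation for using RPCA in this setting.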


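[Overview list, items 3 and 4: text lost in extraction. Surviving terms: constant-Q (item 3); F0, GUI (item 4).]

2.3 [Title lost in extraction; topic: F0]

2.3.1 [Title lost in extraction; topic: F0]

[Text lost in extraction. Surviving terms: F0, (Fig. 3), GUI.] Let L_t(c) denote the F0 likelihood at log-frequency c [cent] in frame t. With the F0 search range bounded by c_l and c_h [cent], the F0 estimate [cent] is

$$\mathrm{F0}(t) = \mathop{\arg\max}_{c_l \le c \le c_h} L_t(c) \tag{3}$$

Methods for computing L_t(c) include [15, 16].

[Figure 4: symbols X(t, f), M_b(t, f), M_h(t, f), X_s(t, f); panel titles "Original spectrum", "RPCA mask", "Harmonic mask", "Masked spectrum"; axis in cents.]

After RPCA, F0 is estimated with Subharmonic Summation (SHS) [17]. The SHS likelihood L_t(c) is

$$L_t(c) = \sum_{n=1}^{N} \lambda^{n-1} S_t(c + 1200 \log_2 n) \tag{4}$$

where S_t(c) is the spectrum at time t on the cent axis, N is the number of harmonics summed, and λ is the harmonic weight (the values of N and λ are lost in extraction).

A harmonic mask is then built from the estimated F0 (Fig. 4):

$$M_h(t, f) = \begin{cases} 1 & \text{if } H_t^h - \frac{w}{2} < C(f) < H_t^h + \frac{w}{2} \text{ for some } 1 \le h \le H \\ 0 & \text{otherwise} \end{cases}, \qquad H_t^h = F_t + 1200 \log_2 h \tag{5}$$

where F_t is the F0 at time t [cent], C(f) is the log-frequency of bin f [cent], H is the number of harmonics, and w is the mask width [cent]. With the RPCA mask M_b(t, f), the spectra X_s(t, f) and X_m(t, f) are

$$X_s(t, f) = M_b(t, f)\, M_h(t, f)\, X(t, f), \qquad X_m(t, f) = \bigl(1 - M_b(t, f)\, M_h(t, f)\bigr)\, X(t, f) \tag{6}$$

Footnote: a factor of n in frequency [Hz] corresponds to an offset of 1200 log_2 n on the cent axis.

2.4 [Title lost in extraction]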
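Equations (3) to (5) operate on a spectrum sampled on a log-frequency (cent) axis, so the n-th harmonic sits a fixed number of bins above the fundamental. The following is a minimal sketch under stated assumptions, not the report's implementation: bins are assumed evenly spaced in cents, the harmonic weight is assumed to be a scalar decay raised to the power n - 1, and the default values (`n_harmonics=10`, `decay=0.86`, `width_cents=100.0`) are placeholders, since the report's own settings did not survive extraction. All function names are hypothetical.

```python
import numpy as np


def shs_likelihood(S_t, cents_per_bin, n_harmonics=10, decay=0.86):
    """SHS-style likelihood: sum_n decay**(n-1) * S_t(c + 1200*log2(n)).

    S_t is one frame on an evenly spaced cent axis; harmonics that fall
    beyond the last bin contribute nothing.
    """
    S_t = np.asarray(S_t, dtype=float)
    n_bins = S_t.shape[0]
    like = np.zeros(n_bins)
    for n in range(1, n_harmonics + 1):
        shift = int(round(1200.0 * np.log2(n) / cents_per_bin))
        if shift >= n_bins:
            break
        like[: n_bins - shift] += decay ** (n - 1) * S_t[shift:]
    return like


def estimate_f0_bin(S_t, cents_per_bin, lo_bin, hi_bin, **kwargs):
    """Arg-max of the likelihood inside the search range [lo_bin, hi_bin]."""
    like = shs_likelihood(S_t, cents_per_bin, **kwargs)
    return lo_bin + int(np.argmax(like[lo_bin : hi_bin + 1]))


def harmonic_mask(f0_bin, n_bins, cents_per_bin, n_harmonics=10,
                  width_cents=100.0):
    """Binary mask: 1 within width_cents/2 of any of the first harmonics."""
    bin_cents = np.arange(n_bins) * cents_per_bin
    f0_cents = f0_bin * cents_per_bin
    mask = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        center = f0_cents + 1200.0 * np.log2(h)
        mask[np.abs(bin_cents - center) < width_cents / 2.0] = 1.0
    return mask
```

Applying `harmonic_mask` frame by frame and multiplying it with an RPCA mask and the mixture spectrogram would give the masked spectrum of Eq. (6); one minus that product gives the remainder.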

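[Figure 5: symbols X_s(t, f), E(t, f), X_shift(t, f); panel titles "Original spectrum (vocal)", "Estimated spectrum envelope", "Simply-shifted spectrum", "Corrected spectrum"; axis in cents.]

[Text lost in extraction. Surviving terms: X_s(t, f), X_m(t, f), [6], (Fig. 5), DAP [18], F0.] With E(t, f) the spectral envelope and m the shift along the frequency axis, the shifted spectrum is corrected by the envelope ratio:

$$X_{\mathrm{shift}}(t, f) = A_t\, \frac{E(t, f)}{E(t, f - m)}\, X_s(t, f - m) \tag{7}$$

where A_t is a per-frame coefficient (its definition is lost in extraction). The new spectrum is

$$X_{\mathrm{new}}(t, f) = X_m(t, f - m) + X_{\mathrm{shift}}(t, f) \tag{8}$$

2.5 [Title and text lost in extraction. Surviving terms: constant-Q, constant-Q [12], Section 2.4, constant-Q [19].]

3. [Title and text lost in extraction. Surviving terms: [] (reference number lost), F0.]

4. [Title lost in extraction; topic: F0]

4.1 [Title and text lost in extraction. Surviving settings, reproduced as extracted because digits were dropped: "6kHz", "6bit"; constant-Q parameters fratio ".5" ("2 bins per octave") and qrate ".2"; "[msec]"; RPCA parameter k [14]; width w of Section 2.3, "2 [cent]".]

4.2 [Title and text lost in extraction. Surviving terms: F0, Section 2.3.1.]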
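The envelope-ratio correction of Eq. (7) can be illustrated on arrays indexed by (frame, bin). This is a sketch under stated assumptions rather than the report's procedure: the shift m is taken to be an integer number of bins, the spectra are treated as non-negative magnitudes, and the coefficient A_t is assumed to rescale each frame so that its energy matches the unshifted frame, because the source does not preserve its definition. The function name `shift_with_envelope` is hypothetical.

```python
import numpy as np


def shift_with_envelope(X_s, E, m, eps=1e-12):
    """Shift X_s by m bins and re-impose the envelope E, as in Eq. (7).

    X_s, E: arrays of shape (frames, bins). m > 0 shifts upward.
    Bins shifted in from outside the array are set to zero.
    """
    X_s = np.asarray(X_s, dtype=float)
    E = np.asarray(E, dtype=float)
    n_bins = X_s.shape[1]
    shifted_X = np.zeros_like(X_s)
    shifted_E = np.ones_like(E)  # neutral value where nothing was shifted in
    if m >= 0:
        shifted_X[:, m:] = X_s[:, : n_bins - m]
        shifted_E[:, m:] = E[:, : n_bins - m]
    else:
        shifted_X[:, :m] = X_s[:, -m:]
        shifted_E[:, :m] = E[:, -m:]
    # E(t, f) / E(t, f - m) * X_s(t, f - m)
    corrected = shifted_X * E / np.maximum(shifted_E, eps)
    # Assumed definition of A_t: keep each frame's energy unchanged.
    energy_in = np.sum(X_s ** 2, axis=1)
    energy_out = np.sum(corrected ** 2, axis=1)
    A = np.sqrt(energy_in / np.maximum(energy_out, eps))
    return A[:, None] * corrected
```

With a flat envelope the function reduces to a plain shift; with a non-flat envelope the harmonics move while the overall spectral shape stays in place, which is the point of dividing by the shifted envelope.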


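[Figure 6: (a) F0 panel, vertical axis [%], horizontal axis c [cent]; (b) F0 panel, vertical axis [cent], horizontal axis [msec] (label as extracted: "c = 4", "RWC-MDB-P-2001: No.7").]

[Text lost in extraction. Surviving content: F0 results in Fig. 6 (b) are discussed with a "±4 [cent]" band (digits may be missing); the evaluation uses RWC Music Database: Popular Music (RWC-MDB-P-2001) [20]; accuracy is reported for a tolerance of ±c [cent] (Fig. 6 (a)); numeric results are garbled ("94", "5 [cent]", "c = [cent]", "c = 4 [cent]", "9").]

[Figure 7: legend "No correction", "With correction", "TANDEM-STRAIGHT"; axes [cent] and [Hz].]

[Text lost in extraction. Surviving terms: TANDEM-STRAIGHT [6], "2 [cent]" (digits may be missing), DAP.]

4.4 [Title and text lost in extraction. Surviving terms: F0, (Fig. 8), constant-Q.]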
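The surviving axis labels ([%] against a tolerance c in cents) suggest an accuracy measure of the form "share of frames whose estimated F0 lies within ±c cents of the reference". The sketch below assumes exactly that reading; the report's precise definition is not recoverable. The helper names are hypothetical, and the default reference frequency in `hz_to_cent` (440 × 2^(3/12 − 5) Hz) is an assumed convention, not a value from the source.

```python
import numpy as np


def hz_to_cent(f_hz, ref_hz=440.0 * 2.0 ** (3.0 / 12.0 - 5.0)):
    """Convert frequency in Hz to cents above an assumed reference."""
    return 1200.0 * np.log2(np.asarray(f_hz, dtype=float) / ref_hz)


def accuracy_within(est_cent, ref_cent, c):
    """Percentage of frames with |estimate - reference| <= c cents."""
    est = np.asarray(est_cent, dtype=float)
    ref = np.asarray(ref_cent, dtype=float)
    return 100.0 * float(np.mean(np.abs(est - ref) <= c))
```

Sweeping c over a range of tolerances and plotting `accuracy_within` against c would produce a curve of the kind the axis labels of Fig. 6 (a) describe.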


[Figure 8: panel titles "Original spectrogram", "Vocal expression", "Modified spectrogram"; axes [cent] and [msec].]

5. [Conclusion: text lost in extraction. Surviving terms: F0, GUI.]

[Acknowledgments: text lost in extraction. Surviving names: JSPS, JST CREST, OngaCREST.]

References
[1] Goto, M.: Active Music Listening Interfaces Based on Signal Processing, Proc. ICASSP (2007).
[2] Yoshii, K., Goto, M., Komatani, K., Ogata, T. and Okuno, H. G.: Drumix: An Audio Player with Real-time Drum-part Rearrangement Functions for Active Music Listening, IPSJ Journal (2007).
[3] Itoyama, K., Goto, M., Komatani, K., Ogata, T. and Okuno, H. G.: Instrument Equalizer for Query-by-Example Retrieval: Improving Sound Source Separation based on Integrated Harmonic and Inharmonic Models, Proc. ISMIR (2008).
[4] Fritsch, J. and Plumbley, M. D.: Score Informed Audio Source Separation using Constrained Nonnegative Matrix Factorization and Score Synthesis, Proc. ICASSP (2013).
[5] Rafii, Z., Germain, F. G., Sun, D. L. and Mysore, G. J.: Combining Modeling of Singing Voice and Background Music for Automatic Separation of Musical Mixtures, Proc. ISMIR (2013).
[6] Kawahara, H., Morise, M., Takahashi, T., Nisimura, R., Irino, T. and Banno, H.: Tandem-STRAIGHT: A Temporally Stable Power Spectral Representation for Periodic Signals and Applications to Interference-free Spectrum, F0, and Aperiodicity Estimation, Proc. ICASSP (2008).
[7] Ohishi, Y., Mochihashi, D., Kameoka, H. and Kashino, K.: Mixture of Gaussian Process Experts for Predicting Sung Melodic Contour with Expressive Dynamic Fluctuations, Proc. ICASSP (2014).
[8] (23).
[9] Fujihara, H. and Goto, M.: Concurrent Estimation of Singing Voice F0 and Phonemes by Using Spectral Envelopes Estimated from Polyphonic Music, Proc. ICASSP (2011).
[10] Saito, T. and Goto, M.: Acoustic and Perceptual Effects of Vocal Training in Amateur Male Singing, Proc. INTERSPEECH (2009).
[11] Ikemiya, Y., Itoyama, K. and Okuno, H. G.: Transcribing Vocal Expression from Polyphonic Music, Proc. ICASSP (2014).
[12] Schorkhuber, C. and Klapuri, A.: Constant-Q Transform Toolbox for Music Processing, SMC Conference (2010).
[13] Candes, E. J., Li, X., Ma, Y. and Wright, J.: Robust Principal Component Analysis?, J. ACM (2011).
[14] Huang, P.-S., Chen, S. D., Smaragdis, P. and Hasegawa-Johnson, M.: Singing-Voice Separation from Monaural Recordings Using Robust Principal Component Analysis, Proc. ICASSP (2012).
[15] Goto, M.: PreFEst: A Predominant-F0 Estimation Method for Polyphonic Musical Audio Signals, Proc. MIREX (2005).
[16] Saito, S., Kameoka, H., Takahashi, K., Nishimoto, T. and Sagayama, S.: Specmurt Analysis of Polyphonic Music Signals, IEEE Trans. on Audio, Speech, and Language Process (2008).
[17] Hermes, D. J.: Measurement of pitch by subharmonic summation, J. Acoust. Soc. Am., Vol. 83 (1988).
[18] El-Jaroudi, A. and Makhoul, J.: Discrete All-Pole Modeling, IEEE Trans. on Signal Proc. (1991).
[19] Irino, T. and Kawahara, H.: Signal Reconstruction from Modified Auditory Wavelet Transform, IEEE Trans. on Signal Proc. (1993).
[20] Goto, M., Hashiguchi, H., Nishimura, T. and Oka, R.: RWC Music Database: Popular, Classical, and Jazz Music Databases, Proc. ISMIR (2002).
