A Method of Note Counting and Pitch Extraction by Using Melody Rhythm Taps for Voice-to-MIDI System
Naoki ITOU and Kazushi NISHIMOTO




JAIST Repository
https://dspace.j

Title: A Method of Note Counting and Pitch Extraction by Using Melody Rhythm Taps for Voice-to-MIDI System
Author(s): Itou, Naoki; Nishimoto, Kazushi
Citation: IEICE Transactions on Information and Systems (Japanese Edition), J96-D(4): 965-977
Issue Date: 2013-04-01
Type: Journal Article
Text version: publisher
URL: http://hdl.handle.net/10119/11576
Rights: Copyright (C)2013 IEICE. Naoki Itou, Kazushi Nishimoto, IEICE Transactions on Information and Systems (Japanese Edition), J96-D(4), 2013, 965-977. http://www.ieice.org/jpn/trans_onlin
Description: Japan Advanced Institute of Science and

School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi-shi, 923-1292 Japan
Research Center for Innovative Lifestyle Design, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Nomi-shi, 923-1292 Japan

IEICE Transactions D, Vol. J96-D, No. 4, pp. 965-977, (c) 2013

1.
MIDI (Musical Instrument Digital Interface) [1]-[3]; Voice-to-MIDI (VtoM); F0

F0; VtoM [4], [5], [6]; Voice-to-MIDI; TVM
Sections 2-6

2.
VtoM [7]-[10]; Query By Humming (QBH) [11]-[14]; F0 [15], [16]; VtoM [17]; Step Entry [18]

3.
Voice-to-MIDI

Fig. 1 Score of "Aka tombo": composition by Kosaku Yamada, lyrics by Rofu Miki.
Fig. 2 Samples of segmentation mistakes with note binding and splitting.
Fig. 3 Samples of segmentation mistakes with extra notes.

3.1
VtoM [19]

3.2
VtoM; TVM; F0; PC

3.3
TVM; D2-F5; A4 = 440 Hz; MIDI; 22050 Hz, 16 bit; PC
F0: STFT = 2048 samples (100 ms); 128 samples (6 ms); D2-F5; IFFT [20]; cent
PC Keypress: 1024 samples (50 ms)

3.4
F0 [21], [22]; D2-F5; 6 ms; 200 ms; FFT; 90%

Fig. 4 Two types of tapping manner.
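The quantities that survive in Section 3.3 (A4 = 440 Hz, the D2-F5 range, cents, and the 2048-, 128- and 1024-sample figures at 22050 Hz) can be related with a short Python sketch. The function names, the equal-temperament mapping with A4 as MIDI note 69, and the note numbers 38 and 77 for D2 and F5 are assumptions made here for illustration; they are not taken from the paper.

```python
import math

A4_HZ = 440.0        # tuning reference stated in the paper
SAMPLE_RATE = 22050  # sampling rate in Hz stated in the paper

# Assumed MIDI numbering (A4 = 69, C4 = 60), which gives D2 = 38 and F5 = 77.
D2_NOTE, F5_NOTE = 38, 77


def hz_to_midi(f0_hz):
    """Return (nearest MIDI note number, deviation from that note in cents)."""
    exact = 69 + 12 * math.log2(f0_hz / A4_HZ)
    note = round(exact)
    return note, 100 * (exact - note)


def in_vocal_range(note):
    """True if the note lies inside the D2-F5 range."""
    return D2_NOTE <= note <= F5_NOTE


def samples_to_ms(n_samples):
    """Duration in milliseconds of n_samples at SAMPLE_RATE."""
    return 1000 * n_samples / SAMPLE_RATE


if __name__ == "__main__":
    print(hz_to_midi(440.0))               # (69, 0.0)
    print(round(samples_to_ms(2048), 1))   # 92.9, quoted as 100 ms
    print(round(samples_to_ms(128), 1))    # 5.8, quoted as 6 ms
    print(round(samples_to_ms(1024), 1))   # 46.4, quoted as 50 ms
```

Under these assumptions the rounded durations in the text (100 ms, 6 ms, 50 ms) agree with the sample counts at 22050 Hz.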

3.5
A4 = 440 Hz; D2-F5 [23]; FFT; 6 ms; 16; 6 ms; BPM = 2500; BPM = 250

4.

4.1
TVM (Section 3.5; Sections 5.1-5.3; Section 5.4; Section 5.5)

4.2
BPM = 120

4.3
VtoM: (1) CMP, (2) RYN [10], (3) BP2 [25]
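Section 3.5 pairs a 6 ms figure with BPM = 2500. One reading, assumed here, is that the surviving "16" stands for a sixteenth note, so that a sixteenth note lasting one 6 ms analysis step corresponds to a tempo of 2500 BPM. The sketch below only checks that arithmetic; the function name is hypothetical.

```python
def bpm_from_sixteenth_ms(duration_ms):
    """Tempo in quarter-note beats per minute at which a sixteenth note
    lasts duration_ms milliseconds (a quarter note is four sixteenths)."""
    return 60000 / (4 * duration_ms)


if __name__ == "__main__":
    print(bpm_from_sixteenth_ms(6))   # 2500.0
    print(bpm_from_sixteenth_ms(60))  # 250.0
```

By the same arithmetic, BPM = 250 corresponds to a sixteenth note of 60 ms, that is, ten steps of 6 ms.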

CMP: F0; TVM; [24]; 50 cent; 70 ms; 16; BPM = 213
RYN [10]: Linux; MIDI; [9]; Ryynänen; Accent Signal; FFT
BP2: KAWAI: Band Producer 2

4.4
TVM; HP: 2710p; PC; Shure: SM87A; 2 PCs (PC1, PC2); BP2; MIDI; Wave; CMP, RYN, BP2; Adobe: Audition 1.0

4.5
TVM; VtoM

4.6
VtoM

Table 1 Results of pre-test and experiences of musical performing for each subject.
A 6 0 1 5
B 3 0 0 2
C 6 1 0 5
D 3 1 0 6
E 0 1 0 6 1
F 5 0 0 5 2 3
G 6 0 0 6 2
H 6 0 4 6 3 5
I 6 5 1 6 10 1
A D 22

Table 2 Singing conditions for each song.
BPM = 120

Table 3 List of subjects' own-selected songs.
A: Mr. Children, Over
B:
C: 11 3
D:
E: Acid Black Cherry
F:
G: 1
H: SMAP
I:

4.7
BP2; Adobe: Audition1.0; Ensoniq: MR-76

F0

Table 4 Categories for melody extracts.

... = ... + ... + ...
Recall [%] = ... / ... × 100 (1)
Precision [%] = ... / (... + ... + ...) × 100 (2)
F-value = 2 × Recall × Precision / (Recall + Precision) (3)

5.

5.1
TVM; F; 100%; F0; CMP, RYN, BP2; TVM; CMP: 95, 58; RYN: 42, 23
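Equations (1) to (3) define recall, precision and the F-value, but their operands did not survive extraction. The sketch below assumes the conventional definitions (recall = correct notes / reference notes, precision = correct notes / output notes); the paper's own denominators may be built from the categories of Table 4, so the function and argument names here are hypothetical. The F-value formula itself is consistent with the figures reported in Table 8, for example CMP with 85.6% and 84.0% giving 84.8%.

```python
def f_value(recall, precision):
    """Harmonic mean of recall and precision (both in %)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def note_scores(n_correct, n_reference, n_output):
    """Return (recall %, precision %, F-value) under the assumed definitions.

    n_correct   -- notes transcribed correctly
    n_reference -- notes in the reference melody
    n_output    -- notes output by the system
    """
    recall = 100 * n_correct / n_reference
    precision = 100 * n_correct / n_output
    return recall, precision, f_value(recall, precision)


if __name__ == "__main__":
    print(round(f_value(85.6, 84.0), 1))  # 84.8
    print(note_scores(9, 10, 12))
```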

Table 5 Results of "Aka tombo": [sung with own tempo, lyrics and taps].
F0

Table 6 Results of "Aka tombo": [sung with BPM=120, lyrics and taps].

CMP, RYN, BP2; TVM

5.2
BPM = 120; RYN; E; TVM; E

TVM; F; TVM; 100%

5.3
TVM; F; TVM; A, E, F; A, F; TVM; F; TVM; A, E, F

5.4
TVM; A, D, F, I; TVM; E; 98.7%, 98.7%, 99.7%, 99.7%; t; 100%

Table 7 Results of self-selected songs: [sung with own tempo, lyrics and taps].
F0

Table 8 Differences of the addition of tapping in the total values of recall, precision and F-value of "Aka tombo" (sung with own tempo). Values in %; two values per system, sub-column labels not recovered.

             CMP            RYN            BP2
Recall       85.6   85.4    87.2   88.9    92.7   94.7
Precision    84.0   83.3    79.5   75.4    88.7   86.4
F-value      84.8   84.3    83.2   81.6    90.6   90.4

Table 9 Differences of the addition of tapping in the total values of recall, precision and F-value of "Aka tombo" (sung with BPM=120). Values in %; two values per system, sub-column labels not recovered.

             CMP            RYN            BP2
Recall       83.8   86.1    87.5   81.7    78.5   79.0
Precision    86.9   86.7    84.7   77.1    92.1   92.1
F-value      85.3   86.4    86.1   79.3    84.8   85.0

BPM = 120: 97.8%, 96.6%, 98.1%, 98.1%; t; 100%; TVM; BPM = 120

5.5
TVM; Table 8: 836, 830; CMP, RYN, BP2; Table 9 (BPM = 120): 837, 835; BP2; CMP; RYN: 87.5%, 84.7%, 81.7%, 77.1%; E: 35, 11; BPM = 120

5.6
TVM

[22]; F0; 65.9%, 70.2%; 24.0%, 36.8%; TVM

6.
Voice-to-MIDI; VtoM; MIDI Note No.; Voice-to-MIDI (Voice-to-MusicalExpression)

[10]; Matti Ryynänen; Anssi Klapuri

[1] YAMAHA XGworks ST2003.
[2] INTERNET SingerSongWriter Lite6.0, 2008.
[3] MakeMusic Inc., Finale2010, USA, 2009.
[4] pp.109-118, 2005.
[5] pp.20-21, 2003.
[6] C. Oshima, N. Itou, K. Nishimoto, N. Hosoi, K. Yasuda, and K. Nakayama, "An accompaniment system for healing emotions of patients with dementia who repeat stereotypical utterances," Proc. 9th Int'l. Conf. Smart Homes and Health Telematics, 2011.
[7] vol.20, no.10, pp.68-73, 1984.
[8] C.C. Toh, B. Zhang, and Y. Wang, "Multiple-feature fusion based onset detection for solo singing voice," Proc. ISMIR 2008, 2008.
[9] M. Ryynänen and A. Klapuri, "Modelling of note events for singing transcription," Proc. ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio, 2004.
[10] M. Ryynänen and A. Klapuri, "Automatic transcription of melody, bass line, and chords in polyphonic music," Computer Music Journal, vol.32, no.3, pp.73-86, 2008.
[11] T. Kageyama, K. Mochizuki, and Y. Takashima, "Melody retrieval with humming," Proc. ICMC 1993, pp.349-351, 1993.
[12] A. Ghias, J. Logan, D. Chamberlin, and B.C. Smith, "Query by humming: Musical information retrieval in an audio database," Proc. ACM Multimedia 95, San Francisco, California, Nov. 1995.
[13] L. Prechelt and R. Typke, "An interface for melody input," ACM Trans. Computer-Human Interaction (TOCHI), vol.8, no.2, pp.133-149, 2001.
[14] N. Kosugi, Y. Nishihara, T. Sakata, M. Yamamuro, and K. Kushima, "A practical query-by-humming system for a large music database," Proc. 8th ACM Intl. Conf. Multimedia, pp.333-342, Marina del Rey, California, 2000.
[15] SLP-47, pp.71-76, 2003.
[16] 2006 1-2-23, 2006.
[17] Wildcat Canyon Software Inc., Autoscore 2.0, 1999.
[18] MUS-34, pp.21-26, 1999.
[19] p.68, 1994.
[20] pp.718-723, 1983.
[21] MIDI 2step 2006-EC-5, vol.2006, pp.43-48, 2006.
[22] N. Itou and K. Nishimoto, "A voice-to-midi system for singing melodies with lyrics," Proc. Intl. Conf. ACE 07, pp.183-189, Salzburg, Austria, 2007.
[23] p.439, 2004.
[24] vol.23, no.5, pp.95-100, 2004.
[25] Band Producer 2, 2008.

24 7 14; 10 25

2011; ICOST2011 Best Multi-Disciplinary Paper Award; GLOBAL HEALTH 2012 Best Paper Award.

1987; 1992 ATR; 1995 ATR; 1999; 2007; 2000; 2003; 21; 1999; 1999 ACM Multimedia; 2004 Best Paper Award; ICOST2011 Best Multi-Disciplinary Paper Award; GLOBAL HEALTH 2012 Best Paper Award; IEEE Computer Society; ACM.