1,a) 2,b) 1,c)

LPMCC MFCC Fluctuation Pattern (LDA) Songle Pitman-Yor (VPYLM) 3278

1. (MIR: Music Information Retrieval) [1-5] [6-8]

1 National Institute of Advanced Industrial Science and Technology (AIST)
2 Kyoto University
a) t.nakano [at] aist.go.jp
b) yoshii [at] i.kyoto-u.ac.jp
c) m.goto [at] aist.go.jp

*1 N *2 [9]

*1 http://www.nicovideo.jp/
*2
© 2014 Information Processing Society of Japan

OngaCREST [10] 4

2.

1 1 A

3.

Latent Dirichlet Allocation: LDA [11] Songle [12] Pitman-Yor (VPYLM) [13] [14,15] n n n

3.1

[16-18]

3.1.1

PreFEst [19] F0 20 32 ms LPMCC 12 ΔF0 1 10ms GMM [16] 15% 16kHz LPMCC LPC MFCC

LPC 25 15 ΔF0 50ms GMM GMM RWC [20] 100 80 20 32 1.0 GMM 12 GMM 27

3.1.2

LDA [17,18] k-means RWC 100 k = 100 LDA 100 Gibbs [21] 1 0.1

3.1.3

LDA [11,21] 1

3.2

3.2.1

25 ms MFCC 12 ΔMFCC 12 Δ 1 10ms 15% 16kHz 0.97 MFCC 15 22 Δ 50ms

3.2.2

k = 64 k-means RWC 100 LDA (3.1.3)

3.2.3

3.1.3 LDA

3.3

[22,23]

3.3.1

6 Fluctuation Pattern (FP) [22,23] 1200 3 FP 2 RWC [20] 100 95% 79 FP 23.2 ms FFT 11.6 ms Bark 20 6 FFT 0 10Hz 60 1200 = 20 × 60 [22,23] 11.025kHz MATLAB MA (Music Analysis) toolbox [23]

3.3.2

k = 64 k-means RWC 100 LDA (3.1.3)

3.3.3

3.1.3 LDA
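The analysis settings listed in Section 3.2.1 (16 kHz audio, a 25 ms window, a 10 ms shift, and the value 0.97) can be illustrated with a short sketch. This is an illustration under stated assumptions, not the authors' implementation: reading 0.97 as a pre-emphasis coefficient is an assumption, the function names `preemphasize` and `frame_signal` are hypothetical, and the MFCC computation itself is not shown.

```python
def preemphasize(signal, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    Treating 0.97 as the pre-emphasis coefficient is an assumption.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


def frame_signal(signal, sample_rate=16000, window_ms=25, shift_ms=10):
    """Cut a signal into overlapping frames.

    At 16 kHz, a 25 ms window is 400 samples and a 10 ms shift is 160 samples.
    Trailing samples that do not fill a whole window are dropped.
    """
    window = sample_rate * window_ms // 1000
    hop = sample_rate * shift_ms // 1000
    frames = []
    start = 0
    while start + window <= len(signal):
        frames.append(signal[start:start + window])
        start += hop
    return frames
```

With these settings, one second of audio (16000 samples) yields 98 frames of 400 samples each; each frame would then be passed to whatever MFCC routine is in use.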

2 Fluctuation patterns Fluctuation Pattern (FP) FP WSOLA FP

3.4

[12]

3.4.1

9 major, major 6th, major 7th, dominant 7th, minor, minor 7th, half-diminished, diminished, augmented major 5 /2, /3, /5, /b7, /7 14 (= 9 + 5) 12 168 (= 14 × 12) [12] HMM 3 major, natural minor, harmonic minor HMM [24] HMM HMM Viterbi

3.4.2

C 8 major, major 6th, major 7th, dominant 7th, minor, minor 7th, diminished, augmented 12 97 (= 8 × 12 + 1)

3.4.3

VPYLM tri-gram n = 3 VPYLM tri-gram 1.0 10 5 VPYLM 10 5

4.

*3 2000 2008 20 3278 A B 2 A 20 1 463 B RWC [20]

4.1 A:

3 6 10% 46

*3 http://www.oricon.co.jp/
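The chord vocabulary size given in Section 3.4.2, 97 (= 8 × 12 + 1), and the tri-gram (n = 3) context mentioned in Section 3.4.3 can be checked with a small sketch. Several details here are assumptions made for illustration: the short type labels other than `maj` and `min`, the sharp-based root spelling, and the reading of the extra "+ 1" symbol as a no-chord label `N`. The helper names are hypothetical, and the code only enumerates symbols and extracts tri-grams; it does not implement VPYLM.

```python
# 12 chromatic roots (the sharp-based spelling is an assumption).
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# The 8 chord types listed in Section 3.4.2: major, major 6th, major 7th,
# dominant 7th, minor, minor 7th, diminished, augmented.
CHORD_TYPES = ["maj", "maj6", "maj7", "7", "min", "min7", "dim", "aug"]

# Assumed meaning of the "+ 1" in 97 = 8 * 12 + 1.
NO_CHORD = "N"


def build_vocabulary():
    """Return all root:type symbols plus the no-chord symbol."""
    vocab = [f"{root}:{ctype}" for root in ROOTS for ctype in CHORD_TYPES]
    vocab.append(NO_CHORD)
    return vocab


def trigrams(progression):
    """Return every run of three consecutive chords in a progression."""
    return list(zip(progression, progression[1:], progression[2:]))
```

For example, `trigrams(["F:maj", "C:maj", "G:maj", "F:maj"])` returns the two tri-grams `("F:maj", "C:maj", "G:maj")` and `("C:maj", "G:maj", "F:maj")`, which are the kind of contexts an n = 3 language model would count.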

Table 1 A 20

A                          33
B   B'z                    28
C                          28
D                          27
E                          25
F   BoA                    24
G   EXILE                  24
H   L'Arc~en~Ciel          24
I                          24
J   w-inds.                23
K   SOPHIA                 22
L                          22
M   CHEMISTRY              21
N   Gackt                  21
O   GARNET CROW            20
P   TOKIO                  20
Q                          20
R                          20
S   Every Little Thing     19
T   GLAY                   19

11 9 463

5 6 10% 10% 3 10% 4 10% 5 6

4.2 B:

7 RWC 100 8 9 10 2 No.60, 70, 20 11 RWC 100 3 5 3

7 0.02 0.02 0.02 3728 No.45-3.82 No.20 No.42-4.66 8-3.98-4.33 9 No.60 No.70 RWC 100 3728 No.15 No.55 No.90 No.73 No.99 RWC 100 3728 C FGCAm F G C 5. -3.86 No.6 No.8 No.29 No.60 No.81-7.51 10 RWC 100 3728-1 -5 No.56 No.41 No.54 No.82 No.84 11 RWC 100 3728 Songle [12] JST CREST OngaCREST RWC

Table 2 B 5 () No.
(1) 60 (2) 70 (3) 45 (4) 20 (5) 42
(1) 15 (2) 90 (3) 99 (4) 55 (5) 73
(1) 6 (2) 81 (3) 29 (4) 8 (2 ) (5) 60 M&Y (2 )
(1) 56 (2) 82 (3) 41 (4) 84 (5) 54

Table 3 5 B
No.
56   ... F:maj C:maj G:maj F:maj C:maj G:maj ...
82   ... G:maj C:maj F:maj G:maj C:maj F:maj ...
     ... E:maj A:min F:maj G:maj C:maj F:maj ...
41   F:maj C:maj F:maj C:maj F:maj ...
84   ... G:maj C:maj F:maj G:maj C:maj F:maj ...
54   G:maj F:maj G:maj F:maj G:maj ...

[1] Vol. 60, No. 11, pp. 675-681 (2004).
[2] Pardo, B. (ed.): Special issue: Music information retrieval, Communications of the ACM, Vol. 49, No. 8, pp. 28-58 (2006).
[3] Casey, M., Veltkamp, R., Goto, M., Leman, M., Rhodes, C. and Slaney, M.: Content-Based Music Information Retrieval: Current Directions and Future Challenges, Proceedings of the IEEE, Vol. 96, No. 4, pp. 668-696 (2008).
[4] Downie, J. S.: The music information retrieval evaluation exchange (2005-2007): A window into music information retrieval research, Acoust. Sci. & Tech., Vol. 29, pp. 247-255 (2008).
[5] Downie, J. S., Byrd, D. and Crawford, T.: Ten Years of ISMIR: Reflections on Challenges and Opportunities, Proc. ISMIR 2009 (2009).
[6] pp. 751-755 (2009).
[7] Song, Y., Dixon, S. and Pearce, M.: Survey of Music Recommendation Systems and Future Perspectives, Proc. CMMR 2012, pp. 395-410 (2012).
[8] Knees, P. and Schedl, M.: A Survey of Music Similarity and Recommendation from Music Context Data, ACM Trans. on Multimedia Computing, Communications and Applications, Vol. 10, No. 1, pp. 1-21 (2013).
[9] Hamasaki, M., Goto, M. and Nakano, T.: Songrium: A Music Browsing Assistance Service with Interactive Visualization and Exploration of a Web of Music, Proc. WWW 2014 (2014).
[10] 2013-MUS-99, No. 33, pp. 1-9 (2013).
[11] Blei, D. M., Ng, A. Y. and Jordan, M. I.: Latent Dirichlet Allocation, Journal of Machine Learning Research, Vol. 3, pp. 993-1022 (2003).
[12] Mauch, M. Songle: Vol. 54, pp. 1363-1372 (2013).
[13] Pitman-Yor n-gram Vol. 48, pp. 4023-4032 (2007).
[14] 2011-MUS-91, pp. 1-10 (2013).
[15] Yoshii, K. and Goto, M.: A Vocabulary-Free Infinity-Gram Model for Nonparametric Bayesian Chord Progression Analysis, Proc. ISMIR 2011, pp. 645-650 (2011).
[16] Fujihara, H., Goto, M., Kitahara, T. and Okuno, H. G.: A Modeling of Singing Voice Robust to Accompaniment Sounds and Its Application to Singer Identification and Vocal-Timbre-Similarity-Based Music Information Retrieval, IEEE Trans. on ASLP, Vol. 18, No. 3, pp. 638-648 (2010).
[17] 2013-MUS-100, pp. 1-7 (2013).
[18] Nakano, T., Yoshii, K. and Goto, M.: Vocal Timbre Analysis Using Latent Dirichlet Allocation and Cross-Gender Vocal Timbre Similarity, Proc. ICASSP 2014 (2014).
[19] Goto, M.: A Real-time Music Scene Description System: Predominant-F0 Estimation for Detecting Melody and Bass Lines in Real-world Audio Signals, Speech Communication, Vol. 43, No. 4, pp. 311-329 (2004).
[20] RWC : Vol. 45, No. 3, pp. 728-738 (2004).
[21] Griffiths, T. L. and Steyvers, M.: Finding scientific topics, Proc. of the National Academy of Sciences of the United States of America, Vol. 1, pp. 5228-5235 (2004).
[22] Pampalk, E., Rauber, A. and Merkl, D.: Content-based Organization and Visualization of Music Archives, Proc. ACMMM '02, pp. 570-579 (2002).
[23] Pampalk, E.: Computational Models of Music Similarity and Their Application to Music Information Retrieval, Ph.D. Dissertation, Vienna Inst. of Tech. (2006).
[24] Mauch, M. and Dixon, S.: Simultaneous Estimation of Chords and Musical Context from Audio, IEEE Trans. on ASLP, Vol. 18, pp. 1280-1289 (2010).