IPSJ SIG Technical Report Vol.2013-CVIM-186 No /3/15

A method of similar image retrieval system using EMD and SIFT

Hoshiga Fumito 1,a)  Higuchi Tatsuya 1  Nakajima Yuma 1  Shishibori Masami 1

a) hoshiga-fumito@iss.tokushima-u.ac.jp

Abstract: Content-based image retrieval methods that use SIFT features, which are local features of an image, have been studied actively in recent years. Bag-of-keypoints is a well-known retrieval technique based on SIFT features. However, because it quantizes all SIFT features extracted from an image into a single fixed-length feature vector, it cannot take the position of each SIFT feature in the image into consideration. The proposed method applies a color segmentation module to separate the image into regions of same-colored pixels, and then builds a fixed-length feature vector from the SIFT features in each region. The Euclidean distance cannot be used with this representation, because the number of color-segmented regions is not fixed and the length of the resulting description therefore changes from image to image. To solve this problem, the method uses the Earth Mover's Distance (EMD) as the distance measure instead of the Euclidean distance.

Keywords: Bag-of-keypoints, SIFT, EMD, Content-based image retrieval methods

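2. Bag-of-keypoints

Bag-of-keypoints represents an image by quantized local features. The local feature used here is SIFT (Scale Invariant Feature Transform).

2.1 SIFT

SIFT was proposed by Lowe [1]. Each keypoint is described by a 128-dimensional feature vector (Fig. 1).

2.2 Bag-of-keypoints

In Bag-of-keypoints, the 128-dimensional SIFT features are clustered into representative vectors called visual words (Fig. 2). An image is then represented by a histogram of the visual words assigned to its SIFT features.

3. Proposed method

Bag-of-keypoints quantizes all of the 128-dimensional SIFT features of an image into one vector, so the position of each feature is lost. The proposed method instead builds one feature vector per image region. Because the number of regions differs from image to image, EMD (Earth Mover's Distance) is used as the distance measure.

3.1 EMD

The Earth Mover's Distance (EMD) is a distance between two distributions, each given as a set of weighted features. Let P and Q be distributions with m and n features:

\[ P = \{(p_1, w_{p_1}), \ldots, (p_m, w_{p_m})\} \tag{1} \]

\[ Q = \{(q_1, w_{q_1}), \ldots, (q_n, w_{q_n})\} \tag{2} \]

Here p_i is the i-th feature of P and w_{p_i} is its weight. Likewise, q_j is the j-th feature of Q and w_{q_j} is its weight. The distance between feature i of P and feature j of Q is written d_{ij}.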

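For features p_i and q_j, the distance d_{ij} is

\[ d_{ij} = \lVert p_i - q_j \rVert \tag{3} \]

Let f_{ij} be the amount transported from i to j, and let F = \{f_{ij}\} be the flow. The total cost (WORK) of a flow is

\[ \mathrm{WORK}(P, Q, F) = \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} f_{ij} \tag{4} \]

The flow must satisfy the constraints (5) to (8).

Transport goes only from i to j:

\[ f_{ij} \ge 0 \quad (1 \le i \le m,\ 1 \le j \n) \tag{5} \]

The amount sent from i does not exceed w_{p_i}:

\[ \sum_{j=1}^{n} f_{ij} \le w_{p_i} \quad (1 \le i \le m) \tag{6} \]

The amount received by j does not exceed w_{q_j}:

\[ \sum_{i=1}^{m} f_{ij} \le w_{q_j} \quad (1 \le j \le n) \tag{7} \]

The total flow equals the smaller of the two total weights:

\[ \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij} = \min\left( \sum_{i=1}^{m} w_{p_i},\ \sum_{j=1}^{n} w_{q_j} \right) \tag{8} \]

EMD(P, Q) is the minimum of WORK(P, Q, F) over all flows that satisfy these constraints, normalized by the total flow:

\[ \mathrm{EMD}(P, Q) = \frac{\min_{F} \mathrm{WORK}(P, Q, F)}{\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}} \tag{9} \]

[Fig. 3: EMD]
[Fig. 4]

3.2 Bag-of-keypoints + EMD

The proposed method combines EMD with Bag-of-keypoints. The procedure is as follows.

1. Extract SIFT features with cv::SiftFeatureDetector and cv::SiftDescriptorExtractor of OpenCV 2.4.2.
2. Create the visual words by k-means clustering.
3. Segment the image into regions with ImageMagick (Fig. 5).
4. For each region obtained in step 3, build a visual-word feature vector from the SIFT features in that region (Fig. 6).
5. Compute the EMD between images from the region features of step 4 (Fig. 7).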
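Equations (4) to (9) describe a transportation problem, so EMD can be computed with any min-cost flow solver. The following is a minimal sketch in Python rather than the implementation used in this report. The function name `emd`, the input format (a list of `(feature_vector, weight)` pairs), the Euclidean ground distance, and the choice of successive shortest paths with Bellman-Ford are all assumptions made for illustration. It is meant for small inputs with positive weights.

```python
import math

def emd(P, Q):
    """Earth Mover's Distance between P and Q, each a list of (vector, weight)."""
    m, n = len(P), len(Q)
    # Node ids: 0 = source, 1..m = features of P, m+1..m+n = features of Q, last = sink.
    s, t = 0, m + n + 1
    graph = [[] for _ in range(m + n + 2)]  # per-node list of edge indices
    edges = []  # [to, residual capacity, cost]; edge k ^ 1 is the reverse of edge k

    def add_edge(u, v, cap, cost):
        graph[u].append(len(edges))
        edges.append([v, cap, cost])
        graph[v].append(len(edges))
        edges.append([u, 0.0, -cost])

    for i, (p, w) in enumerate(P):
        add_edge(s, 1 + i, w, 0.0)                      # supply limit, Eq. (6)
        for j, (q, _) in enumerate(Q):
            add_edge(1 + i, 1 + m + j, math.inf, math.dist(p, q))  # d_ij, Eq. (3)
    for j, (_, w) in enumerate(Q):
        add_edge(1 + m + j, t, w, 0.0)                  # demand limit, Eq. (7)

    eps = 1e-12
    total_flow = 0.0
    total_cost = 0.0
    while True:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [math.inf] * len(graph)
        prev = [-1] * len(graph)
        dist[s] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == math.inf:
                    continue
                for k in graph[u]:
                    v, cap, cost = edges[k]
                    if cap > eps and dist[u] + cost < dist[v] - eps:
                        dist[v] = dist[u] + cost
                        prev[v] = k
                        changed = True
            if not changed:
                break
        if dist[t] == math.inf:
            break  # no more flow can be sent: total flow satisfies Eq. (8)
        # Find the bottleneck capacity along the path, then push that much flow.
        push = math.inf
        v = t
        while v != s:
            k = prev[v]
            push = min(push, edges[k][1])
            v = edges[k ^ 1][0]
        v = t
        while v != s:
            k = prev[v]
            edges[k][1] -= push
            edges[k ^ 1][1] += push
            v = edges[k ^ 1][0]
        total_flow += push
        total_cost += push * dist[t]
    return total_cost / total_flow  # Eq. (9)

# Two images with different numbers of regions can still be compared.
image_a = [((0.0, 0.0), 0.5), ((1.0, 0.0), 0.5)]
image_b = [((0.0, 0.0), 1.0)]
print(emd(image_a, image_b))
```

Because the two inputs may contain different numbers of weighted features, this distance remains defined when the number of segmented regions differs between images, which is the case the Euclidean distance cannot handle.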

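4. Evaluation experiment

To evaluate Bag-of-keypoints+EMD, it was compared with Bag-of-keypoints. Ten categories of Caltech256 were used (Table 1), with images 0001 to 0090 of each category, 90 images per category and 900 images in total.

Table 1 Caltech256 categories used (10 categories)
015.bonsai-101
016.boom-box
023.bulldozer
036.chandelier
072.fire-truck
073.fireworks
092.grapes
132.light-house
213.teddy-bear
251.airplanes-101

For Bag-of-keypoints, the number of visual words was varied from 2 to 24 (23 settings). For Bag-of-keypoints+EMD, a second parameter was varied from 1 to 24 (24 settings) and the number of visual words from 2 to 24, giving 24 × 23 = 552 settings.

[Fig. 7: EMD]

4.1 Results

The 900 images were divided into three groups (Table 2). Table 3 gives the same breakdown for each category of 90 images.

Table 2 Bag-of-keypoints and EMD (900 images)
Bag-of-keypoints    EMD    (unlabeled)
428                 389    83

Table 3 Bag-of-keypoints and EMD
Bag-of-keypoints    EMD    (unlabeled)
20                  58     12
45                  44     1
55                  28     7
16                  62     12
46                  40     4
76                  9      5
16                  45     29
59                  26     5
33                  49     8
62                  28     0

5. Discussion

Tables 2 and 3 compare the proposed method with Bag-of-keypoints.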
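The counts reported in Tables 2 and 3 can be cross-checked against each other and against the 24 × 23 = 552 settings. The short sketch below does only that arithmetic. The variable names are illustrative, and the reading of the 30 values as ten rows of three columns is an assumption drawn from the way the row and column sums work out.

```python
# Table 3 values in the order they appear, read as 10 rows of 3 columns (assumed layout).
table3 = [
    (20, 58, 12), (45, 44, 1), (55, 28, 7), (16, 62, 12), (46, 40, 4),
    (76, 9, 5), (16, 45, 29), (59, 26, 5), (33, 49, 8), (62, 28, 0),
]
table2 = (428, 389, 83)  # totals over all 900 images

row_sums = [sum(row) for row in table3]               # one sum per category
col_sums = tuple(sum(col) for col in zip(*table3))    # one sum per column

print(row_sums)              # each category contributes 90 images
print(col_sums == table2)    # per-category counts add up to the Table 2 totals
print(sum(table2), 24 * 23)  # 900 images, 552 parameter settings
```

Every row sums to 90 and the column sums reproduce Table 2, so the two tables describe the same 900 images.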

[Fig. 8]

6. Conclusion

This report proposed a similar image retrieval method that combines Bag-of-keypoints with EMD, and compared it with Bag-of-keypoints.

References

[1] Lowe, D.G.: Object recognition from local scale-invariant features, Proc. of IEEE International Conference on Computer Vision, pp. 1150-1157 (1999).