Duplicate Near Duplicate Intact Partial Copy Original Image Near Partial Copy Near Partial Copy with a background (a) (b) 2 1 [6] SIFT SIFT SIF

Partial Copy Detection of Line Drawings from a Large-Scale Database Weihan Sun, Koichi Kise Graduate School of Engineering, Osaka Prefecture University E-mail: sunweihan@m.cs.osakafu-u.ac.jp, kise@cs.osakafu-u.ac.jp Abstract MSER HOG 11,603 1 2 Bas [1] SIFT(Scale-Invariant Feature Transform) [2] Ke SIFT PCA- SIFT(Principal Component Analysis SIFT)[4] [3] Kise PCA-SIFT [5] [6] MSER (Maximally Stable Extremal Region) [7] HOG (Histogram of Oriented Gradients) [8] 2 [6]

Duplicate Near Duplicate Intact Partial Copy Original Image Near Partial Copy Near Partial Copy with a background (a) (b) 2 1 [6] 2916 1 SIFT SIFT SIFT 2 1 5 (duplicate) (near duplicate) partial copy( ) intact partial copy near partial copy duplicate 1 near partial copy 2 3 1 3.1 3 2 (region detector) (feature descriptor) ID

Illegal Images with Copy Copyright Region Region Detector Detector Feature Feature Descriptor Descriptor Feature vectors Query Processing Matching Voting Copied Feature vector Database Database Processing (a) MSER Images 3 MSER HOG ANN(Approximate Nearest Neighbor search)[9] (Matching) (Voting) 3.2 : MSER Matas wide-baseline MSER [7] (Extremal region) MSER MSER 4 MSER MSER (b) MSER 4 MSER MSER MSER MSER ( 6 ) y 5 MSER 3.3 HOG HOG (Histogram of Oriented Gradients) Dalal [8]

(a) MSER (b) 5 one cell One block (a) (b) (c) 6 HOG MSER HOG 6(b) 9 9 6(c) 3 3 9 1 9 6 6 9 3 3 6 6 = 2916 3.4 [6] 6 3.5 ANN(Approximate Nearest Neighbor)[9] ANN k-d q k-d p q p r q r ANN ε r/(1 + ε) q p 1, 2 p 2 d(q, p 1 ) d(q, p 2 ) < T (1) 2 T R 4 4.1 10 11603

101 1/5 near partial copy 2 ( ) ( ) 3/4 7 2, 4, 10 3 4 ANN ε 5 CPU Opteron 2.8GHz, 64GB 4.2 1 SIFT [2] SIFT T 0.85 8 9 R R ( ) 8 SIFT 2 9 SIFT SIFT 5 89%, 10 75% MSER+HOG 10 (a) (b) 11 10 2 11(a) 11(b)

(a) (b) 2x (c) 4x (d) 10x 7 1. size of background 1x 2x 4x 10x method using SIFT 2, 376ms 4, 386ms 9, 476ms 18, 361ms proposed method 382ms 1, 021ms 2, 730ms 5, 615ms 2. (a) (b) rotation cumulative detection rate (R = 5) degree printed handwritten 0 99% 89% 30 99% 90% 45 99% % 12 12(a) SIFT 17GB 8.2GB 1 SIFT 1/6 1/3 4.3 2 30,45 2 ( 3/2 ) 5 2 3 2 0, 3 3/4 1 4.4 3 4

95 90 85 95 90 85 (a) (b) 2x 95 90 85 95 90 85 (c) 4x (d) 10x 8 3. scale cumulative detection rate (R = 5) printed handwritten 3/4 99% 89% 3/2 99% 98% ε 5 MSER HOG 4 ε R = 5. ε cumulative detection detection time rate (R = 5) 1 93% 64, 995ms 5 89% 336ms 10 84% 84ms SIFT 1. SIFT 2. SIFT 5 75 89%

60 40 20 60 40 20 0 0 (a) (b) 2x 60 40 20 60 40 20 0 0 (c) 4x (d) 10x 9 3. 4. SIFT [1] P. Bas, J-M Chassery, and B. Macq, Geometrically invariant watermarking using feature points, IEEE Trans. Antennas Propag., Vol.11, No.9, pp.1014-1028, 2002. [2] D. G. Lowe, Distinctive image features from scale-invariant key-points, Int. J. Comput. Vis. 60(2), pp.91 110, 2004. [3] Y. Ke, R.Sukthankar, and L. Hustion, Efficient near-duplicate detection and sub-image retrieval, MM, pp. 869 876, 2004. [4] Y. Ke and R. Sukthankar, PCA-SIFT: A more distinctive representation for local image descriptors, Proc. CVPR, Vol. 2, pp.506 513, 2004. [5] K. Kise, K. Noguchi and M. Iwamura, Simple Representation and Approximate Search of Feature Vectors for Large-Scale Object Recognition, Proc. BMVC, pp.182 191, 2007. [6],,,, PRMU2008-228, pp.121-126, 2009. [7] J. Matas, O. Chum, M. Urban and T. Pajdla, Robust Wide Baseline Stereo from Maximally Stable Extremal Regions, Proc. BMVC, pp.384 393, 2002. [8] N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, Proc. IEEE CVPR, vol.1, pp.88 893, 2005. [9] S. Arya, D. Mount, R. Silverman and A. Y. Wu, An optimal algorithm for approximate nearest neighbor searching, Journal of the ACM, 45, 6, pp.891 923, 1998.