i ( ) (RF: Relevance Feedback) RF 1 Regularized Nearest Points(RNP) RF 2 RF RNP


27 1 30


ii RF RNP 2 RF (+1) (-1) 2 RNP 2 RF RF 33.6 2 SVM RF 2.7

Specific Person Image Retrieval by Relevance Feedback Using an Outlier-Robust Set-to-Set Distance

Kazuhisa TAKAGI

Abstract

This study addresses specific person image retrieval, in which a system presents a user with all images of a target person from a gallery database. We propose a method that applies an outlier-robust set-to-set distance to track-based specific person image retrieval with relevance feedback.

In track-based specific person image retrieval, person images, each of which is the bounding box of a person's body in one frame, are first registered in a gallery database by tracklet. Here, a tracklet is the set of person images captured while one person passes through the field of view of one camera. The user inputs a tracklet as the query, and the system calculates the similarity between the query tracklet and each tracklet in the database from the tracklet features, where a tracklet feature is the set of features of the person images in the tracklet. The system shows the user several candidate tracklets in decreasing order of similarity to the query tracklet. Relevance Feedback (RF) is then applied. In RF, the user tells the system whether each presented tracklet shows the same person as the query. The system optimizes the query feature or the similarity measure using training samples that consist of the presented data and the feedback on them, and then shows the user the next candidate tracklets from the data not yet presented.

When a gallery database is built, person images are usually extracted automatically. In this case, erroneous person images are often obtained, such as misaligned bounding boxes, false detections, and images in which the body is hidden by obstacles. We call such erroneous person images outlier images. Outlier images can make the similarity between tracklets of the same person smaller than the similarity between tracklets of different persons.

To solve this problem, we introduce an outlier-robust set-to-set distance into the similarity calculation. Specifically, we introduce the Regularized Nearest Points (RNP) distance and propose ways to incorporate it into RF with query optimization and into RF with a two-class classifier. The RNP distance defines the distance between two tracklets as the distance between their representative points. The representative points are determined using the Regularized Affine Hull of each tracklet feature so that each point is close to the mean feature of its own tracklet and close to the other representative point. Because the mean feature is taken into account when the representative points are determined, the calculation is robust to outliers while still reflecting the distribution of the data.

In RF with query optimization, the query tracklet feature is moved closer to the mean tracklet feature of the same person and farther from that of other persons. The similarity between the query and each tracklet is calculated from the set-to-set distance between them: the smaller the distance, the larger the similarity. In this method, we use the RNP distance as the distance between tracklet features.

In RF with a two-class classifier, by contrast, the similarity is calculated not from the distance between tracklets but from the separating hyperplane that classifies person images into the class of the same person as the target (+1) and the class of other persons (-1). In particular, the similarity between the target person and each person image in a tracklet is defined as the value of the evaluation function of the two-class classifier, and the similarity of a tracklet is defined as the maximum of the similarities of the person images in that tracklet. This can be regarded as defining the similarity by the distance between the tracklet and the most relevant hyperplane, which is obtained by translating the separating hyperplane until it lies farther from the different-person side than every person image feature in the gallery database. The distance between a tracklet and a hyperplane is the length of the perpendicular from the representative point of the tracklet to the hyperplane. In our method, we introduce RNP to calculate the distance between the most relevant hyperplane and a tracklet.

In the experiments, we use a gallery database automatically extracted from surveillance videos in a shopping mall. With RF with query optimization and RF with a two-class classifier, the proposed method improved the performance by up to 33.6 points and 2.7 points, respectively. The idea could also be applied to video retrieval, which is left for future work.


1 2010 1000 1 ( ) ( ) ( ) 2 [1,2] [3 6] ( ) ( ) 1

(i) N (ii) (iii) N (iv) (ii) (iii) / Metternich [1] ( ) Metternich 2 2 2

2 3 4 5 2 2.1 [7] Gray [8] RGB YCbCr HSV Schmid Gabor Adaboost (Ensemble of Localized Features: ELF) [9] 1 1 [1] Bak [2] (Mean Riemannian Covariance Grid: MRCG) 3

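[3-6] 1 /

2.2

Tracklet $i$ is a set of person images $T_i = \{I_{i1}, \ldots, I_{in_i}\}$, where $n_i$ is the number of person images in tracklet $i$. With $f_{ij}$ denoting the feature of person image $I_{ij}$, the feature of the tracklet is the set $F_i = \{f_{i1}, \ldots, f_{in_i}\}$. The query tracklet is $T_Q$ with feature $F_Q$, and $D(F_Q, F_i)$ denotes a distance $D$ between $F_Q$ and $F_i$.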

(S(F Q, F i ) = D(F Q, F i )) 1. T i S(F Q, F i ) T i N 2. N 3. 4. N 5. 2 5 N 2.2.1 Metternich [1] 3 Average Description(AD) Largest Detection(LD) 5

1: 6

Minimum Pointwise Distance(MPD) 2 [10] 2 2 D D 2 2 2.2.2 Metternich [1] 5 (a) (b) (c) (d) (e)2 (a) 2 1 (b) 0 2 7

(c) (d) 2 (e)2 2 2 2 1 (a) (d) (e)2 2 Metternich [1] 2 SVM SVM 2 K(x, y) = t ϕ(x)ϕ(y) ϕ w, b x (+1) (-1) 8

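$$\operatorname{sgn}\bigl({}^{t}w\,\phi(x) + b\bigr) \tag{1}$$

Following Zhang et al. [5], the similarity of a single sample $x$ is the value of the SVM evaluation function:

$$ {}^{t}w\,\phi(x) + b \tag{2}$$

Metternich et al. [1] define the similarity of tracklet $T_i$ as the maximum of this value over the person images $I_{ij}$ in $T_i$:

$$ S(F_Q, F_i) = \max_{j}\bigl({}^{t}w\,\phi(f_{ij}) + b\bigr) \tag{3}$$

2.3 2 2 2.2.1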
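The tracklet similarity of Eq. (3) can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes a linear kernel, so that $\phi(x) = x$, and the function names and the toy values of `w`, `b`, and the features are hypothetical.

```python
def decision_value(w, b, x):
    """Eq. (2) under the assumed linear kernel phi(x) = x: t(w) x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def tracklet_similarity(w, b, features):
    """Eq. (3): maximum decision value over the person images of one tracklet."""
    return max(decision_value(w, b, f) for f in features)


if __name__ == "__main__":
    # Hypothetical SVM parameters and a tracklet with three 2-D image features.
    w, b = [1.0, -1.0], 0.5
    tracklet = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
    print(tracklet_similarity(w, b, tracklet))
```

Because only the best-scoring image counts, a single outlier image that happens to score high can dominate the tracklet similarity, which is the weakness the RNP-based variant is meant to address.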

LD MPD ( 2) AD 2 ( 2) AD 2 LD 2 MPD 2 SVM MPD AD LD AD LD 3 3.1 10

2: AD( ),LD( ),MPD( ) 11

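Yang et al. [11] proposed the Regularized Nearest Points (RNP) distance. RNP models each of the two sets by a Regularized Affine Hull (RAH). The RAH of $F_i = \{f_{i1}, \ldots, f_{in_i}\}$ is given by Eq. (4):

$$\mathrm{RAH} = \Bigl\{ f = \sum_k f_{ik} a_k \Bigm| f_{ik} \in F_i,\ \sum_k a_k = 1,\ \|a\|_2 \le \sigma \Bigr\} \tag{4}$$

Here $\sigma$ bounds the norm of the coefficient vector $a$ and thereby regularizes the extent of the RAH. Writing the features as matrices $F_Q = (f_{Q1}, \ldots, f_{Qn_Q})$ and $F_i = (f_{i1}, \ldots, f_{in_i})$, the nearest points between the RAHs of $F_Q$ and $F_i$ are given by the coefficient vectors $\alpha, \beta_i$ that solve

$$\min_{\alpha, \beta_i} \|F_Q \alpha - F_i \beta_i\|_2^2, \tag{5}$$

$$\text{subject to } \sum_k \alpha_k = 1,\quad \sum_k \beta_{ik} = 1, \tag{6}$$

$$\|\alpha\|_2 \le \sigma_1,\quad \|\beta_i\|_2 \le \sigma_2 \tag{7}$$

where $\|\cdot\|_2$ is the L2 norm and $\sigma_1, \sigma_2$ are the RAH parameters. Relaxing (6) to $\sum_k \alpha_k \approx 1$, $\sum_k \beta_{ik} \approx 1$, the problem (5), (6), (7) can be rewritten as

$$\min_{\alpha, \beta_i} \Bigl\{ \|z - \hat{F}_Q \alpha - \hat{F}_i \beta_i\|_2^2 + \lambda_1 \|\alpha\|_2^2 + \lambda_2 \|\beta_i\|_2^2 \Bigr\} \tag{8}$$

$$z = {}^{t}(\mathbf{0}, \gamma_1, \gamma_2),\quad \hat{F}_Q = {}^{t}({}^{t}F_Q, \mathbf{0}, \gamma_2 \mathbf{1}),\quad \hat{F}_i = {}^{t}(-{}^{t}F_i, \gamma_1 \mathbf{1}, \mathbf{0})$$

where $\lambda_1, \lambda_2, \gamma_1, \gamma_2$ are parameters. Setting the partial derivatives of (8) with respect to $\alpha$ and $\beta_i$ to 0 gives

$$\alpha = ({}^{t}\hat{F}_Q \hat{F}_Q + \lambda_1 I)^{-1}\,{}^{t}\hat{F}_Q (z - \hat{F}_i \beta_i) \tag{9}$$

$$\beta_i = ({}^{t}\hat{F}_i \hat{F}_i + \lambda_2 I)^{-1}\,{}^{t}\hat{F}_i (z - \hat{F}_Q \alpha) \tag{10}$$

Starting from $\beta_i = \frac{1}{n_i}\mathbf{1}$, the two updates are applied alternately until $\alpha, \beta_i$ converge. With the converged $\alpha, \beta_i$, the RNP distance is

$$D_{RNP}(F_Q, F_i) = (\|F_Q\|_* + \|F_i\|_*)\,\|F_Q \alpha - F_i \beta_i\|_2^2 \tag{11}$$

where $\|F_Q\|_*, \|F_i\|_*$ are the nuclear norms of $F_Q, F_i$ and reflect the size of each RAH (Fig. 3). $\gamma_1, \gamma_2, \lambda_1, \lambda_2$ are the parameters of RNP, and $F_Q \alpha$ and $F_i \beta_i$ are the representative points.

3.2 2.2.2 (a) (d) RNP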
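The alternating solution of Eqs. (8) to (11) can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the thesis code: features are stored as columns, the top block of $\hat{F}_i$ is taken as $-F_i$ so that the residual in (8) matches (5), and the parameter defaults and the fixed iteration count `n_iter` are hypothetical choices.

```python
import numpy as np


def rnp_distance(FQ, Fi, lam1=1.0, lam2=1.0, gam1=1.0, gam2=1.0, n_iter=50):
    """RNP distance between two feature sets given as (d, n) column matrices."""
    d, nQ = FQ.shape
    ni = Fi.shape[1]
    # Augmented target and matrices of Eq. (8).
    z = np.concatenate([np.zeros(d), [gam1, gam2]])
    FQ_hat = np.vstack([FQ, np.zeros((1, nQ)), gam2 * np.ones((1, nQ))])
    Fi_hat = np.vstack([-Fi, gam1 * np.ones((1, ni)), np.zeros((1, ni))])
    # Ridge-regression operators of Eqs. (9) and (10), computed once.
    PQ = np.linalg.solve(FQ_hat.T @ FQ_hat + lam1 * np.eye(nQ), FQ_hat.T)
    Pi = np.linalg.solve(Fi_hat.T @ Fi_hat + lam2 * np.eye(ni), Fi_hat.T)
    # Start from uniform coefficients and alternate the two updates.
    beta = np.full(ni, 1.0 / ni)
    for _ in range(n_iter):
        alpha = PQ @ (z - Fi_hat @ beta)
        beta = Pi @ (z - FQ_hat @ alpha)
    # Eq. (11): nuclear-norm weight times squared distance of the nearest points.
    residual = FQ @ alpha - Fi @ beta
    scale = np.linalg.norm(FQ, "nuc") + np.linalg.norm(Fi, "nuc")
    return scale * float(residual @ residual)


if __name__ == "__main__":
    A = np.eye(2)      # toy tracklet with two 2-D features
    B = A + 5.0        # a shifted copy of the same tracklet
    print(rnp_distance(A, A), rnp_distance(A, B))
```

A fixed number of iterations keeps the sketch short; a practical implementation would more likely stop when the change in $\alpha$ and $\beta_i$ falls below a tolerance.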

3: RNP (c) F Q i F i D D(F Q, F i ) D RNP (F Q, F i ) (e)2 2 SVM Metternich SVM RNP SVM RNP 2 14

1 1 SVM SVM 2 SVM RNP RNP (3) SVM RNP Metternich [1] (3) 4 (3) MPD Metternich MPD RNP 3.3 SVM RNP K(x, y) = t ϕ(x)ϕ(y) SVM 15

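4:

The separating hyperplane of the SVM is

$$ {}^{t}w\,\phi(x) + b = 0 \tag{12}$$

Every feature $f$ satisfies $K(f, f) \le m^2$ for some $m\ (\ge 0)$, so that in the feature space all features lie within distance $m$ of the origin $(f_0 = 0)$ (Fig. 4). The signed distance $d$ from a point $x$ to the hyperplane is

$$ d = \frac{{}^{t}w\,\phi(x) + b}{\|w\|_2} \tag{13}$$

so $d > 0$ on the same-person side. The distance $d_0$ of the origin is

$$ d_0 = \frac{b}{\|w\|_2} \tag{14}$$

Translating the hyperplane so that $d_0 = -m$, that is, $b = -m\|w\|_2$, gives the most relevant hyperplane:

$$ {}^{t}w\,\phi(x) - m\|w\|_2 = 0 \tag{15}$$

With the kernel $K(x, y) = {}^{t}\phi(x)\phi(y)$, the representative point of $F_i$ obtained by RNP is $\psi(F_i)\beta_i$ in the feature space, where $\psi(F_i) = (\phi(f_{i1}), \ldots, \phi(f_{in_i}))$. The distance from $\psi(F_i)\beta_i$ to the hyperplane (15) is

$$ \frac{{}^{t}w\,\psi(F_i)\beta_i - m\|w\|_2}{\|w\|_2} \tag{16}$$

Replacing the $F_Q\alpha$ side of the RNP problem (5) by the hyperplane and using (16), the problem becomes

$$\begin{aligned}
& \min_{\beta_i} \frac{\bigl({}^{t}w\,\psi(F_i)\beta_i - m\|w\|_2\bigr)^2}{\|w\|_2^2}, && (17)\\
& \text{subject to} && (18)\\
& \sum_k \beta_{ik} = 1, && (19)\\
& \|\beta_i\|_2 \le \sigma_2 && (20)
\end{aligned}$$

As in RNP, relaxing (19) gives

$$\min_{\beta_i} \Bigl\{ \frac{\bigl({}^{t}w\,\psi(F_i)\beta_i - m\|w\|_2\bigr)^2}{\|w\|_2^2} + \lambda \|\beta_i\|_2^2 + \gamma^2 \bigl(1 - {}^{t}\mathbf{1}\beta_i\bigr)^2 \Bigr\} \tag{21}$$

where $\mathbf{1}$ is the vector whose elements are all 1. Setting the partial derivative with respect to $\beta_i$ to 0 gives

$$\beta_i = \Bigl( \frac{{}^{t}\bigl({}^{t}w\,\psi(F_i)\bigr)\,{}^{t}w\,\psi(F_i)}{\|w\|_2^2} + \lambda I + \gamma^2 M_{n_i} \Bigr)^{-1} \Bigl( \gamma^2 \mathbf{1} + \frac{m}{\|w\|_2}\,{}^{t}\bigl({}^{t}w\,\psi(F_i)\bigr) \Bigr) \tag{22}$$

For the $\beta_i$ that minimizes (21), $I$ is the $n_i \times n_i$ identity matrix and $M_{n_i}$ is the $n_i \times n_i$ matrix whose elements are all 1. Evaluating the similarity (2) at the representative point $\psi(F_i)\beta_i$ of $F_i$ gives

$$ {}^{t}w\,\psi(F_i)\beta_i - m\|w\|_2 \tag{23}$$

4 4.1 4.2 4.2.1 16 89 16039 4337 1 1 69 OKAO Vision [12] 5,6,7 5 6 7 Wu [13] 160 f c 160 Schmid f s 80 Gabor f g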
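The closed-form coefficient vector of Eq. (22) and the similarity of Eq. (23) can be sketched as follows. This is only a sketch under explicit assumptions: a linear kernel is used so that $\psi(F_i) = F_i$, the regularization weights enter as `lam` and `gam**2`, and the bound `m` is taken as the largest feature norm in the toy data; none of these choices is confirmed by the source.

```python
import numpy as np


def svm_rnp_similarity(w, Fi, m, lam=1.0, gam=1.0):
    """Similarity of a tracklet to the most relevant hyperplane (assumed linear kernel).

    w  : (d,) normal vector of the separating hyperplane
    Fi : (d, n_i) matrix whose columns are the features of one tracklet
    m  : radius such that every feature f satisfies K(f, f) <= m**2
    """
    ni = Fi.shape[1]
    w_norm = np.linalg.norm(w)
    a = Fi.T @ w  # t(t(w) psi(F_i)) as a vector of length n_i
    # Left-hand matrix and right-hand vector of Eq. (22).
    lhs = np.outer(a, a) / w_norm**2 + lam * np.eye(ni) + gam**2 * np.ones((ni, ni))
    rhs = gam**2 * np.ones(ni) + (m / w_norm) * a
    beta = np.linalg.solve(lhs, rhs)
    # Eq. (23): evaluation function at the representative point psi(F_i) beta.
    return float(a @ beta - m * w_norm), beta


if __name__ == "__main__":
    w = np.array([1.0, 0.0])
    Fi = np.array([[0.5, 1.0, -2.0],
                   [0.0, 0.5, 0.0]])
    m = float(np.max(np.linalg.norm(Fi, axis=0)))
    similarity, beta = svm_rnp_similarity(w, Fi, m)
    print(similarity, beta)
```

With a nonlinear kernel, the vector `a` would instead be computed from kernel evaluations between the support vectors and the tracklet features, since $\phi$ is not available explicitly.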

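Fig. 5: A tracklet that contains changes in the person's orientation.

Fig. 6: A tracklet that contains a small number of outlier images caused by errors in the person region.

Fig. 7: A tracklet that consists only of person images for which extraction of the person region failed.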

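The resulting feature is the 400-dimensional vector

$$ f = {}^{t}({}^{t}f_c,\ {}^{t}f_s,\ {}^{t}f_g) $$

f c RGB HSV dense sampling 160

4.2.2

2.2.2 Metternich [1] SVM Metternich SVM MATLAB SVM 5% 1 MPD SVM 2.2.2 (c) Rocchio [3]

Rocchio's method [3] updates the query feature by Eq. (24):

$$ F_Q^{l} = a F_Q^{l-1} + b\,\bar{F}_r^{l} - c\,\bar{F}_{nr}^{l} \tag{24}$$

Here $F_Q^{l}$ is the query feature at the $l$-th feedback round, and $\bar{F}_r^{l}$, $\bar{F}_{nr}^{l}$ are the means of $F_r^{l}$, $F_{nr}^{l}$ at round $l$. $a, b, c$ ($a + b - c = 1$) are the weights of Rocchio's method; we use $(a, b, c) = (0.7, 0.8, 0.5)$ and $N = 20$.

N 3.1 RNP Metternich [1] MPD

The MPD distance $D_{MPD}(F_Q, F_i)$ between $F_Q$ and $F_i$ is taken over $f_{Qj} \in F_Q$, $f_{ik} \in F_i$ as in Eq. (25):

$$ D_{MPD}(F_Q, F_i) = \min_{j,k} d(f_{Qj}, f_{ik}) \tag{25}$$

4.2.3

16 12 12

4.2.4

Cumulative Matching Curve (CMC: ) CMC l n CMC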
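The Rocchio update of Eq. (24) and the MPD distance of Eq. (25) can be sketched in plain Python as below. The sketch makes two assumptions that the source does not confirm: the query is represented by a single feature vector that is shifted by the mean relevant and mean non-relevant features, and the pointwise distance $d$ is Euclidean. Function names and toy values are hypothetical.

```python
import math


def rocchio_update(query, rel_mean, nonrel_mean, a=0.7, b=0.8, c=0.5):
    """Eq. (24): move the query toward the relevant mean and away from the non-relevant mean."""
    return [a * q + b * r - c * n for q, r, n in zip(query, rel_mean, nonrel_mean)]


def mpd_distance(FQ, Fi):
    """Eq. (25): minimum pointwise distance between two feature sets (Euclidean d assumed)."""
    return min(math.dist(fq, fi) for fq in FQ for fi in Fi)


if __name__ == "__main__":
    # One feedback round on a 2-D toy query.
    print(rocchio_update([1.0, 0.0], [1.0, 1.0], [0.0, 1.0]))
    # Two tracklets with two features each.
    print(mpd_distance([(0.0, 0.0), (3.0, 4.0)], [(6.0, 8.0), (3.0, 5.0)]))
```

Because MPD keeps only the closest pair of features, one outlier image that lies near the other set determines the whole distance, whereas the weights with $a + b - c = 1$ keep the scale of the updated query comparable across rounds.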

CMC(l) = 1 n n k=1 l (26) 4.3 CMC 8,9 8 SVM (Relevance Feedback:RF) 9 Rocchio [3] RF 2 1500 CMC 8: SVM RF 9: Rocchio RF 4.4 4.4.1 SVM 8 300 84.09% 86.80% 2.71 Metternich SVM 22
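A CMC value in the spirit of Eq. (26) can be computed as sketched below. The summand of Eq. (26) is not legible in this copy, so the sketch assumes the common definition: CMC($l$) is the fraction of the $n$ targets whose correct match has appeared within the first $l$ presented items. The input format (one first-hit rank per target) is a hypothetical choice.

```python
def cmc(first_hit_ranks, l):
    """Fraction of targets whose correct match appears at rank <= l (assumed definition)."""
    n = len(first_hit_ranks)
    return sum(1 for rank in first_hit_ranks if rank <= l) / n


if __name__ == "__main__":
    # Ranks at which the correct match was first presented for four hypothetical targets.
    ranks = [1, 3, 5, 10]
    print([cmc(ranks, l) for l in (1, 3, 5, 10)])
```

Evaluating `cmc` over increasing `l` yields a non-decreasing curve, which is what the CMC plots report.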

4.4.2 Rocchio 9 1,100 38.22% 71.86% 33.64 4.4.3 2 SVM Rocchio Rocchio [3] Rocchio RNP SVM 3.2 RNP SVM Rocchio SVM Rocchio SVM Rocchio SVM / 2 Rocchio SVM Rocchio 23

5 2 2 SVM 2.7 Rocchio 33.6 Metternich [1] SVM 24

[1] Michael J Metternich, Marcel Worring, Track based relevance feedback for tracing persons in surveillance videos, Computer Vision and Image Understanding, Vol. 117, No. 3, pp. 229-237, 2013.
[2] Slawomir Bak, Etienne Corvee, François Bremond, Monique Thonnat, Multiple-shot human re-identification by mean riemannian covariance grid, IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS 2011), pp. 179-184, 2011.
[3] Joseph John Rocchio, Relevance feedback in information retrieval, The SMART Retrieval System: Experiments in Automatic Document Processing, pp. 313-323, 1971.
[4] Xiang Sean Zhou, Thomas S Huang, Relevance feedback in image retrieval: A comprehensive review, Multimedia Systems, Vol. 8, No. 6, pp. 536-544, 2003.
[5] Lei Zhang, Fuzong Lin, Bo Zhang, Support vector machine learning for image retrieval, IEEE International Conference on Image Processing (ICIP 2001), Vol. 2, pp. 721-724, 2001.
[6] Yoshiharu Ishikawa, Ravishankar Subramanya, Christos Faloutsos, Mindreader: Querying databases through multiple examples, Computer Science Department, p. 551, 1998.
[7] ,,,, :,, PRMU2011-21, pp. 117-124, 2011.
[8] Douglas Gray, Hai Tao, Viewpoint invariant pedestrian recognition with an ensemble of localized features, European Conference on Computer Vision (ECCV 2008), pp. 262-275, 2008.
[9] ,,,, PRMU2011-25, Vol. 111, No. 49, pp. 139-146, 2011.
[10] Daniel P. Huttenlocher, Gregory A. Klanderman, William J Rucklidge, Comparing images using the Hausdorff distance, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 9, pp. 850-863, 1993.
[11] Meng Yang, Pengfei Zhu, Luc Van Gool, Lei Zhang, Face recognition based on regularized nearest points between image sets, IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG 2013), pp. 1-7, 2013.
[12] OKAO Vision, http://www.omron.co.jp/ecb/products/mobile/
[13] Yang Wu, Michihiko Minoh, Masayuki Mukunoki, Collaboratively regularized nearest points for set based recognition, British Machine Vision Conference (BMVC 2013), pp. 134.1-134.10, 2013.