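A Multimodal Constellation Model for Generic Object Recognition

Yasunori KAMIYA, Tomokazu TAKAHASHI, Ichiro IDE, and Hiroshi MURASE

Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya-shi, 464-8601 Japan
Faculty of Economics and Information, Gifu Shotoku Gakuen University, 1-38 Nakauzura, Gifu-shi, 500-8288 Japan

D Vol. J92-D No. 8 pp. 1104-1114 (c) 2009

This paper proposes the Multimodal Constellation Model for generic object recognition. The model is compared with Bag of Features (BoF), and its parameters are estimated with the EM algorithm.

1.
Generic object recognition [1] has been approached with part-based methods such as Bag of Features (BoF) [2] and the constellation model of Fergus et al. [3]. BoF applies the Bag of Words idea to images. It has been combined with SVM [4]-[6] and with probabilistic Latent Semantic Analysis (pLSA), Latent Dirichlet Allocation (LDA), and Hierarchical Dirichlet Processes (HDP) [7]-[9].

The proposed model has the following characteristics:
(a) Multimodalization of the model [10].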

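(b) Features are represented as continuous values, whereas BoF quantizes them into codewords.
(c) Position and scale information is used, which BoF discards.

Section 2 describes the models, Section 3 the classification method, and Section 4 the experiments. Section 5 concludes the paper.

1. 1
Fergus's constellation model builds on the work of Weber et al. [11], [12]. Fergus et al. also proposed a sparse object category model [13].

2.
2. 1 Fergus's constellation model
Fergus's constellation model [3] is defined as

\[
p(I \mid \Theta) = \sum_{h \in H} p(A, X, S, h \mid \Theta)
= \sum_{h \in H} p(A \mid h, \theta_A)\, p(X \mid h, \theta_X)\, p(S \mid h, \theta_S)\, p(h \mid \theta_{other}).
\]

Here $I$ is the input image and $\Theta = \{\theta_A, \theta_X, \theta_S, \theta_{other}\}$ are the model parameters. $A$, $X$, and $S$ are the appearance, position, and scale features extracted from $I$, and $R$ is the number of regions. $h$ is a hypothesis that assigns features of $I$ to the regions, and $H$ is the set of all hypotheses $h$. $p(A \mid h, \theta_A)$ is modeled with $R$ Gaussians, and $p(X \mid h, \theta_X)$ with a Gaussian over the $2R$ dimensions formed by the $x$, $y$ coordinates.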

2009/8 Vol. J92 D No. 8 p(s h,θ S) R [3] ( h H) (p(a, X, S, h Θ)) Fergus 2. 2 { K L } p m(i Θ) = G(x l θ k,ˆrk,l ) π k = k k l { K L G(A l θ (A) k,ˆr k,l ) ˆr k,l =argmaxg(x l θ k,r ). r l } G(X l θ (X) k,ˆr k,l )G(S l θ (S) k,ˆr k,l ) π k K 2 k L I G() µ Σ Θ ={θ k,r,π k } θ = {µ, Σ} I = {x l } x =(A, X, S) θ k,r k r x l l A, X, S x π k k 0 π k 1 K k π k =1 ˆr k,l k l R 2. 3 Fergus 2 1. Σ (x µ) t Σ 1 (x µ) Σ Σ D D O(D 3 ) O(D) Σ σd 2 D (x µ) t Σ 1 1 (x µ) = (x d μ d ) 2 Σ = D d σ 2 d Σ Σ Σ Σ 0 Σ 2. Σ h H L l arg max r Fergus h H L R p(a, X, S, h Θ) O(L R ) A* L l arg max r O(LR) [14] Fergus 2 d σ 2 d 1106

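Fig. 1 Model parameter estimation algorithm for the Multimodal Constellation Model.

Whereas Fergus's model sums over all hypotheses ($\sum_{h \in H}$), the proposed model uses $\prod_{l}^{L}$ with $\arg\max_{r}$, so it does not need to enumerate the hypotheses $h$ of Fergus's model.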
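To make the gap between the $O(L^R)$ and $O(LR)$ costs concrete, the following worked example plugs in two settings that appear in the experiments: $L = 20$, $R = 3$ and $L = 20$, $R = 21$. It counts only the dominant terms, so constant factors and the pruning effect of A* search in Fergus's model are ignored; the numbers are an illustration, not measurements from the paper.

```latex
\[
L = 20,\ R = 3:\qquad L^{R} = 20^{3} = 8000, \qquad LR = 60
\]
\[
L = 20,\ R = 21:\qquad L^{R} = 20^{21} \approx 2.1 \times 10^{27}, \qquad LR = 420
\]
```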

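2. 4
The model parameters are estimated with the EM algorithm [15] (Fig. 1). Let $N$ be the number of training images, $n$ the image index, $x_{n,l}$ the $l$-th feature of the $n$-th image, and $\hat{r}_{k,n,l}$ the value of $\hat{r}_{k,l}$ for the $n$-th image.

(1) $\mu$ and $\Sigma$ ($\sigma^2$) are initialized, and $\pi$ is set to $1/K$.

In the EM iterations, $q_{k,n}$ is computed for each image $n$ and component $k$ from the current $\mu$, $\Sigma$. Then $\mu$, $\Sigma$ of region $r$ are updated from the features $l$ that satisfy $\hat{r}_{k,n,l} = r$.

3.
The class $\hat{c}$ of an input image is estimated as

\[
\hat{c} = \arg\max_{c}\; p_m(I \mid \Theta_c)\, p(c),
\]

where $c$ is a class, $\Theta_c$ are the model parameters of class $c$, and $p(c)$ is the prior probability of class $c$.

4.
The proposed model (Multi-CM) is evaluated together with its unimodal version (Uni-CM), which is Multi-CM with $K = 1$. Both are compared with two BoF-based methods, LDA+BoF and SVM+BoF, which apply LDA and SVM to BoF. The experiments also examine the influence of $K$ and $R$, compare the model with Fergus's model, and validate characteristics (b) and (c) of Section 1 for Multi-CM.

4. 1
Two datasets are used: the Caltech Database [3] (Caltech) and the PASCAL Visual Object Classes Challenge 2006 [16] (Pascal).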
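The estimation procedure of Section 2.4 and the decision rule of Section 3 can be sketched together as follows. This is one plausible reading of the surviving description, not the authors' algorithm from Fig. 1: the responsibility-weighted updates, the variance floor `var_floor`, the rule of keeping old parameters for an empty region, and all function names are assumptions introduced for the example.

```python
import math


def log_gauss_diag(x, mu, var):
    """Log density of a diagonal-covariance Gaussian."""
    quad = sum((a - m) ** 2 / v for a, m, v in zip(x, mu, var))
    log_det = sum(math.log(v) for v in var)
    return -0.5 * (len(x) * math.log(2.0 * math.pi) + log_det + quad)


def best_region(x, mus_k, vars_k):
    """Hard assignment: (r_hat, log G(x | theta_{k, r_hat})) for one component."""
    scores = [log_gauss_diag(x, m, v) for m, v in zip(mus_k, vars_k)]
    r_hat = max(range(len(scores)), key=scores.__getitem__)
    return r_hat, scores[r_hat]


def component_scores(feats, mus, variances, pis):
    """Per component: log pi_k + sum_l log G(x_l | theta_{k, r_hat}), and the r_hat lists."""
    logs, regions = [], []
    for k, pi_k in enumerate(pis):
        pairs = [best_region(x, mus[k], variances[k]) for x in feats]
        logs.append(math.log(max(pi_k, 1e-300)) + sum(s for _, s in pairs))
        regions.append([r for r, _ in pairs])
    return logs, regions


def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def em_step(images, mus, variances, pis, var_floor=1e-6):
    """One EM iteration; returns updated (mus, variances, pis) and q[n][k]."""
    n_images, n_comp = len(images), len(pis)
    q, r_hat = [], []
    for feats in images:  # E-step: responsibility of each component per image
        logs, regions = component_scores(feats, mus, variances, pis)
        norm = log_sum_exp(logs)
        q.append([math.exp(v - norm) for v in logs])
        r_hat.append(regions)
    new_pis = [sum(q[n][k] for n in range(n_images)) / n_images
               for k in range(n_comp)]
    new_mus = [[list(m) for m in mus_k] for mus_k in mus]
    new_vars = [[list(v) for v in vars_k] for vars_k in variances]
    for k in range(n_comp):  # M-step: update each region from its assigned features
        for r in range(len(mus[k])):
            members = [(q[n][k], x)
                       for n, feats in enumerate(images)
                       for l, x in enumerate(feats) if r_hat[n][k][l] == r]
            weight = sum(w for w, _ in members)
            if weight <= 0.0:
                continue  # empty region: keep the old parameters (assumption)
            for d in range(len(mus[k][r])):
                mean = sum(w * x[d] for w, x in members) / weight
                var = sum(w * (x[d] - mean) ** 2 for w, x in members) / weight
                new_mus[k][r][d] = mean
                new_vars[k][r][d] = max(var, var_floor)
    return new_mus, new_vars, new_pis, q


def classify(feats, models, priors):
    """c_hat = argmax_c p_m(I | Theta_c) p(c), evaluated in the log domain."""
    def score(c):
        logs, _ = component_scores(feats, *models[c])
        return log_sum_exp(logs) + math.log(priors[c])
    return max(models, key=score)
```

In use, `em_step` would be called repeatedly on the training images of one class until the parameters stop changing, giving one `(mus, variances, pis)` triple per class; `classify` then takes a dictionary of these triples keyed by class name together with the class priors.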

Table 1 Number of object areas in Caltech [3].
  Airplanes    1,074
  Cars Rear    1,155
  Faces          450
  Motorbikes     826

Table 2 Number of object areas in Pascal [16].
  Bicycle        649
  Bus            469
  Car          1,708
  Cat            858
  Cow            628
  Dog            845
  Horse          650
  Motorbike      549
  Person       2,309
  Sheep          843

Fig. 2 Target images in Caltech [3].

Fig. 3 Target images in Pascal [16].

Caltech has 4 classes (Table 1, Fig. 2) and Pascal has 10 classes (Table 2, Fig. 3). Pascal contains classes such as Cat, Dog, and Person and is a more difficult dataset than Caltech. Unless stated otherwise, $K = 5$ and $R = 21$; these values are varied in Sections 4.3 and 4.4.

Local features are detected with the Kadir-Brady saliency detector (KB detector) [17] and described with the DCT (Discrete Cosine Transform). The feature $x$ has 23 dimensions: 20 for $A$, 2 for $X$, and 1 for $S$.

4. 2
Uni-CM and Multi-CM are compared with the BoF-based methods LDA+BoF and SVM+BoF. (Footnote 1: Caltech101/256.)

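Table 3 Effectiveness of multimodalization and comparison to related works, by average classification rates (%).
            LDA+BoF   SVM+BoF   Uni-CM   Multi-CM
  Caltech     94.7      96.4     98.7      99.5
  Pascal      29.6      27.9     37.0      38.8

Fig. 4 Influence of K (number of components) on average classification rate.

For BoF, the codewords are built by k-means clustering. Table 3 shows the results. On both Caltech and Pascal, Multi-CM achieves a higher rate than Uni-CM, and both achieve higher rates than LDA+BoF and SVM+BoF. Faces (Caltech) and Bicycle (Pascal) are singled out in this comparison; the grouping learned by each component is examined in Section 4.5.

4. 3
The influence of $K$ is examined by varying $K$ from 1 to 9 in steps of 2. $K = 1$ corresponds to Uni-CM and $K \geq 2$ to Multi-CM. $R$ is fixed at 21. Fig. 4 shows the results. The notable values are $K = 5$ for Caltech and $K = 7$ for Pascal, so Pascal calls for a larger $K$ than Caltech. With $K \geq 2$ the rates are higher than with $K = 1$.

4. 4
The influence of $R$ is examined by varying $R$ from 3 to 21 in steps of 3. For Multi-CM, $K$ is fixed at 5.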

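Fig. 5 Influence of R (number of regions) on average classification rate.

Fig. 6 Example of groupings for each component of the model (Caltech). Each row shows one component.

Fig. 5 shows the results for Uni-CM and Multi-CM. The notable values are $R = 9$ for Caltech and $R = 21$ for Pascal, so Pascal calls for a larger $R$ than Caltech. Multi-CM is compared with Uni-CM across the values of $R$.

4. 5
To examine what each component captures, the images are grouped by component according to the value of $\big\{ \prod_{l}^{L} G(x_l \mid \theta_{k,\hat{r}_{k,l}}) \big\} \pi_k$. Figs. 6 and 7 show examples. For Caltech, the examples are Cars Rear and Motorbikes.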

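Fig. 7 Example of groupings for each component of the model (Pascal). Each row shows one component.

For Pascal, the examples are Car, Motorbike, Cow, and Cat; the appearance is described with DCT.

4. 6
The proposed model is compared with Fergus's model. The number of local features is $L = 20$ and the number of regions is $R = 3$; the proposed model uses $K = 1$. Table 4 shows the results on Caltech and Pascal.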

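Table 4 Comparison with Fergus's constellation model, by average classification rate (%). L = 20, R = 3 (K = 1, proposed model only).
            Proposed model   Fergus's model
  Caltech        93.0             71.1
  Pascal         31.3             19.5

Table 5 Validation of the effectiveness of continuous value expression and position-scale information, by average classification rate (%).
            LDA+BoF   Multi-CM no-X,S   Multi-CM
  Caltech     94.7         96.5           99.5
  Pascal      29.6         33.5           38.8

On both datasets the proposed model achieves a higher rate than Fergus's model.

4. 7
According to [3], Fergus's model with $R = 6$-$7$ and $L = 20$-$30$ requires 24-36 hours to learn from 400 images. The cost of the proposed model ($K \geq 2$) depends on $R$ and $L$. The two models are compared under the settings of Section 4.6 ($R = 3$, $L = 20$, $K = 1$).

4. 8
This section validates characteristics (b) and (c) of Section 1 against BoF. For (b), LDA+BoF is compared with Multi-CM no-X,S, which is Multi-CM without position and scale information. For (c), Multi-CM no-X,S is compared with Multi-CM. Table 5, which reuses the LDA+BoF and Multi-CM results of Table 3, shows that Multi-CM no-X,S outperforms LDA+BoF and that Multi-CM outperforms Multi-CM no-X,S.

5.
This paper proposed the Multimodal Constellation Model. In the experiments it achieved higher average classification rates than the BoF-based methods and Fergus's constellation model, and the influence of $K$ and $R$ was examined.

[1] vol.48, no.SIG 16 (CVIM 19), pp.1-24, 2007.
[2] G. Csurka, C.R. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," Proc. ECCV International Workshop on Statistical Learning in Computer Vision, pp.1-22, 2004.
[3] R. Fergus, P. Perona, and A. Zisserman, "Object class recognition by unsupervised scale-invariant learning," Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol.2, pp.264-271, 2003.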

[4] K. Grauman and T. Darrell, "The pyramid match kernel: Discriminative classification with sets of image features," Proc. IEEE Int. Conf. on Computer Vision, vol.2, pp.1458-1465, 2005.
[5] M. Varma and D. Ray, "Learning the discriminative power-invariance trade-off," Proc. IEEE Int. Conf. on Computer Vision, 2007.
[6] J. Zhang, M. Marszalek, S. Lazebnik, and C. Schmid, "Local features and kernels for classification of texture and object categories: A comprehensive study," Int. J. Comput. Vis., vol.73, no.2, pp.213-238, 2007.
[7] A. Bosch, A. Zisserman, and X. Munoz, "Scene classification via pLSA," Proc. European Conf. on Computer Vision, vol.4, pp.517-530, 2006.
[8] L. Fei-Fei and P. Perona, "A Bayesian hierarchical model for learning natural scene categories," Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol.2, pp.524-531, 2005.
[9] G. Wang, Y. Zhang, and L. Fei-Fei, "Using dependent regions for object categorization in a generative framework," Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol.2, pp.1597-1604, 2006.
[10] C.M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
[11] M. Weber, M. Welling, and P. Perona, "Unsupervised learning of models for recognition," Proc. European Conf. on Computer Vision, vol.1, pp.18-32, 2000.
[12] M. Weber, M. Welling, and P. Perona, "Towards automatic discovery of object categories," Proc. European Conf. on Computer Vision, vol.2, pp.101-108, 2000.
[13] R. Fergus, P. Perona, and A. Zisserman, "A sparse object category model for efficient learning and exhaustive recognition," Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol.1, pp.380-387, 2005.
[14] X. Ma and W.E.L. Grimson, "Edge-based rich representation for vehicle classification," Proc. IEEE Int. Conf. on Computer Vision, vol.2, pp.1185-1192, 2005.
[15] A.P. Dempster, N.M. Laird, and D.B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Statistical Society, Series B, vol.39, no.1, pp.1-38, 1977.
[16] M. Everingham, A. Zisserman, C.K.I. Williams, and L. Van Gool, "The PASCAL Visual Object Classes Challenge 2006 (VOC2006) results," http://www.pascal-network.org/challenges/voc/voc2006/results.pdf.
[17] T. Kadir and M. Brady, "Saliency, scale and image description," Int. J. Comput. Vis., vol.45, no.2, pp.83-105, 2001.