Microsoft PowerPoint - cvim_harada pptx

3 Flickr reaches 6 billion photos on 1 Aug,

7 [Figure: LSVRC results (placings lost in extraction); examples of localization and categorization for the class "Car"]

8 Top-5 label lists for eight example images:
Image 1: 1. neck brace 2. bullet train 3. potter's wheel 4. seat belt 5. barbell
Image 2: 1. mountain bike 2. hartebeest 3. yurt 4. bighorn 5. coho
Image 3: 1. brown bear 2. otter 3. hippopotamus 4. raccoon 5. deerhound
Image 4: 1. volleyball 2. bittern 3. shower curtain 4. crane 5. suspension bridge
Image 5: 1. mask 2. ski mask 3. jack-o'-lantern 4. jellyfish 5. teddy bear
Image 6: 1. toilet seat 2. scanner 3. hard disc 4. scale 5. backpack
Image 7: 1. baseball player 2. racket, racquet 3. solar dish 4. trimaran 5. paddle
Image 8: 1. aircraft carrier 2. paddle 3. bullfrog 4. water ouzel 5. mantis

10 The state of the world (w) → The gathered data (d) → The processed data (r)
I(W; D) ≥ I(W; R)
The data processing theorem states that data processing can only destroy information.
David J.C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
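The inequality on this slide can be checked numerically. The following is a minimal sketch, assuming a binary world state w and two made-up noisy channels (w to d, then d to r); the probability tables are illustrative values, not taken from the slides.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits, computed from a joint table p(x, y)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)   # marginal p(x), shape (X, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal p(y), shape (1, Y)
    mask = joint > 0                        # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))

# Hypothetical distributions for the Markov chain w -> d -> r.
p_w = np.array([0.5, 0.5])                  # state of the world
p_d_given_w = np.array([[0.9, 0.1],
                        [0.2, 0.8]])        # gathering the data
p_r_given_d = np.array([[0.7, 0.3],
                        [0.4, 0.6]])        # processing the data

joint_wd = p_w[:, None] * p_d_given_w       # p(w, d)
joint_wr = joint_wd @ p_r_given_d           # p(w, r) = sum_d p(w, d) p(r | d)

i_wd = mutual_information(joint_wd)
i_wr = mutual_information(joint_wr)
print(f"I(W;D) = {i_wd:.4f} bits, I(W;R) = {i_wr:.4f} bits")
```

Because r depends on w only through d, the second value cannot exceed the first, whatever tables are plugged in.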

14 S. Vijayanarasimhan and K. Grauman. Large Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. In CVPR, 2011.

15 S. Vijayanarasimhan and K. Grauman. Large Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. In CVPR,
[Slide labels: HOG; deformation; LLC + max pooling; no deformation; NIPS2010]

18 S. J. Hwang, F. Sha, and K. Grauman. Sharing Features Between Objects and Their Attributes. CVPR,
V. Ferrari and A. Zisserman. Learning visual attributes. In NIPS,

19 Attributes and Classification

21 S. Dhar, V. Ordonez, and T. L Berg. High Level Describable Attributes for Predicting Aesthetics and Interestingness. CVPR,

24 M. Douze, A. Ramisa, and C. Schmid. Combining attributes and Fisher vectors for efficient image retrieval. CVPR,

26 D. Parikh and K. Grauman. Relative Attributes. In ICCV,

31 Deng et al., CVPR

34 From an input image to a feature vector:
1) Input image
2) Detection
3) Description: local descriptors d_1, d_2, ..., d_N
4) Local descriptors in feature space
5) PDF estimation: p(d; θ)
6) Feature vector: x = f(θ)
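Steps 4 to 6 can be sketched in a few lines. This is only an illustration under stated assumptions: random vectors stand in for the local descriptors of steps 1 to 3, p(d; θ) is taken to be a single Gaussian, and f simply flattens the estimated mean and covariance. The slide does not commit to any of these choices, and the function name `gaussian_feature` is hypothetical.

```python
import numpy as np

def gaussian_feature(descriptors):
    """Fit a single Gaussian to the descriptors and return x = f(theta)."""
    d = np.asarray(descriptors, dtype=float)        # shape (N, dim)
    mean = d.mean(axis=0)                           # theta, part 1: mean
    cov = np.cov(d, rowvar=False)                   # theta, part 2: covariance
    upper = cov[np.triu_indices(d.shape[1])]        # covariance is symmetric
    return np.concatenate([mean, upper])            # feature vector x

# Stand-in for steps 1 to 3: 500 synthetic 8-dimensional local descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 8))

x = gaussian_feature(descriptors)
print(x.shape)  # 8 mean entries plus 36 upper-triangular covariance entries
```

The resulting vector has a fixed length regardless of how many descriptors the image yields, which is what lets a standard classifier consume it.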

35 Local descriptors in feature space: three families of representation
Descriptor matching (# of anchor points: large; computational complexity: large): SVM-KNN, Naïve Bayes Nearest Neighbor, Graph Matching Kernel
Codebook: Bag of Visual Words, Gaussian Mixture Model, ScSPM, Super Vector, LLC, Fisher Vector
Global feature (# of anchor points: small; computational complexity: small): HLAC, GLC, Global Gaussian

37 H. Zhang, A. C. Berg, M. Maire, and J. Malik. SVM KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition. In CVPR,

38 T. Tuytelaars, M. Fritz, K. Saenko, and T. Darrell. The NBNN kernel. In ICCV,

40 O. Duchenne, A. Joulin and J. Ponce. A Graph Matching Kernel for Object Categorization. ICCV,

43 [Figure: codewords w_1, w_2, w_3, w_4 in the descriptor space R^d]

44 Bag of Visual Words and Kernel codebook
[Figure: descriptors d_1, ..., d_4 voting into histograms over the codewords w_1, ..., w_4]
Kernel codebook weight of descriptor d_i for codeword w_k (the scale symbol did not survive extraction; it is written γ here):
\[ [f_{kc}(x_i)]_k = \frac{\exp\left(-\gamma \lVert d_i - w_k \rVert^2\right)}{\sum_{j=1}^{K} \exp\left(-\gamma \lVert d_i - w_j \rVert^2\right)} \]
Image-level features, averaged over the N descriptors:
\[ f_{BoW} = \frac{1}{N} \sum_{i=1}^{N} f_{bow}(x_i), \qquad f_{kc} = \frac{1}{N} \sum_{i=1}^{N} f_{kc}(x_i) \]
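The two encodings on this slide can be compared in a short sketch. It assumes a random 4-word codebook and random descriptors, and it assumes that f_bow is the usual one-hot vote for the nearest codeword; `gamma` is a stand-in name for the kernel scale, whose symbol is not legible in the slide text.

```python
import numpy as np

def squared_distances(descriptors, codebook):
    """Squared Euclidean distance from every descriptor to every codeword."""
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return np.sum(diff ** 2, axis=2)                    # shape (N, K)

def bow_feature(descriptors, codebook):
    """Hard assignment: each descriptor votes for its nearest codeword."""
    nearest = np.argmin(squared_distances(descriptors, codebook), axis=1)
    counts = np.bincount(nearest, minlength=codebook.shape[0])
    return counts / len(descriptors)                    # average of one-hot votes

def kernel_codebook_feature(descriptors, codebook, gamma=1.0):
    """Soft assignment: normalized Gaussian-kernel weights over all codewords."""
    d2 = squared_distances(descriptors, codebook)
    # Shifting by the row minimum leaves the ratio unchanged and avoids 0/0.
    weights = np.exp(-gamma * (d2 - d2.min(axis=1, keepdims=True)))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights.mean(axis=0)                         # average over descriptors

rng = np.random.default_rng(0)
codebook = rng.normal(size=(4, 2))                      # w_1 ... w_4
descriptors = rng.normal(size=(100, 2))                 # d_1 ... d_N

f_bow = bow_feature(descriptors, codebook)
f_kc = kernel_codebook_feature(descriptors, codebook, gamma=2.0)
print(f_bow, f_kc)
```

Both vectors sum to one; as `gamma` grows, the soft weights concentrate on the nearest codeword and the kernel codebook approaches the plain histogram.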

45 Image → Local descriptors in feature space → PDF estimation

46 Generative approach: Image → Local descriptors in feature space → PDF estimation
Fisher Kernel: Feature vector (Fisher Vector) → Discriminative classifier
Discriminative approach: Classifier (e.g., SVMs) → Category
F. Perronnin and C. Dance. Fisher kernels on visual vocabularies for image categorization. CVPR,

51 net.org/challenges/lsvrc/2010/ilsvrc2010_xrce.pdf

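52 GMM supervectors
Image → Local descriptors in feature space → GMM (superscript U marks parameters of the background model)
\[ p(x; \theta) = \sum_{k=1}^{K} w_k \, \mathcal{N}(x; \mu_k, \Sigma_k) \]
Means of components: \(\hat{\mu}_1, \hat{\mu}_2, \ldots, \hat{\mu}_K\)
\[ \tilde{\mu}_k = \sqrt{w_k^{(U)}} \, \left(\Sigma_k^{(U)}\right)^{-1/2} \hat{\mu}_k \]
GMM supervector: the stacked normalized means \(\tilde{\mu}_1, \tilde{\mu}_2, \ldots, \tilde{\mu}_K\) for the descriptor set X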

53 [Equations: the adapted means \(\hat{\mu}_k\) computed from the descriptors x_i and the background-model parameters (superscript U), the normalized means \(\tilde{\mu}_k\), and a per-component sum g_k over terms in (x_i − μ_k); the individual terms are not recoverable from the extracted text]
N. Inoue and K. Shinoda. A Fast MAP Adaptation Technique for GMM-supervector-based Video Semantic Indexing. ACM Multimedia, 2011.
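A GMM supervector of the kind these two slides describe can be sketched as follows. The sketch makes several assumptions that the extracted text does not confirm: the background model has diagonal covariances, the means are adapted with the textbook relevance-MAP update, and `tau` is a hypothetical relevance factor. It is not the fast adaptation technique of the cited paper.

```python
import numpy as np

def gmm_supervector(x, weights, means, variances, tau=10.0):
    """x: (N, D) descriptors; weights: (K,); means, variances: (K, D)."""
    # Log of w_k * N(x_i; mu_k, diag(var_k)) for every descriptor and component.
    diff = x[:, None, :] - means[None, :, :]
    log_prob = -0.5 * np.sum(diff ** 2 / variances
                             + np.log(2.0 * np.pi * variances), axis=2)
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)     # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)             # responsibilities (N, K)

    # Relevance-MAP update of the means (assumed form, see lead-in).
    counts = post.sum(axis=0)                           # soft counts, (K,)
    weighted_sum = post.T @ x                           # (K, D)
    adapted = (tau * means + weighted_sum) / (tau + counts)[:, None]

    # Normalize each adapted mean: sqrt(w_k) * Sigma_k^(-1/2) * mu_hat_k.
    normalized = np.sqrt(weights)[:, None] * adapted / np.sqrt(variances)
    return normalized.ravel()                           # stacked, length K * D

rng = np.random.default_rng(0)
ubm_weights = np.array([0.5, 0.3, 0.2])                 # hypothetical background model
ubm_means = rng.normal(size=(3, 4))
ubm_variances = np.ones((3, 4))
descriptors = rng.normal(size=(200, 4))

sv = gmm_supervector(descriptors, ubm_weights, ubm_means, ubm_variances)
print(sv.shape)
```

With a very large `tau` the adapted means stay at the background-model means, so the supervector then carries no image-specific information; smaller values let the image's descriptors pull the means.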

54 Asymmetric Distance Computation

55 H. Jegou, M. Douze, C. Schmid, and P. Perez. Aggregating local descriptors into a compact image representation. CVPR,
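The representation introduced in the paper cited on this slide (VLAD) accumulates, for each codeword, the residuals of the descriptors assigned to it. The following is a minimal sketch of that idea on toy data; the function name `vlad` and the two-centroid example are illustrative, and refinements such as dimensionality reduction are left out.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Sum residuals to the nearest centroid, concatenate, L2-normalize."""
    descriptors = np.asarray(descriptors, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    d2 = np.sum((descriptors[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
    nearest = np.argmin(d2, axis=1)                     # index of nearest centroid
    v = np.zeros_like(centroids)
    for k in range(centroids.shape[0]):
        assigned = descriptors[nearest == k]
        if len(assigned):
            v[k] = np.sum(assigned - centroids[k], axis=0)
    v = v.ravel()                                       # length K * D
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Toy example: two centroids on a line, one descriptor near each.
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])
descriptors = np.array([[1.0, 0.0], [3.0, 0.0]])
toy = vlad(descriptors, centroids)
print(toy)
```

Unlike a histogram of counts, the vector keeps the direction in which the descriptors deviate from each codeword, which is why a small codebook can still give a discriminative signature.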

64 net.org/challenges/lsvrc/2010/ilsvrc2010_nec UIUC.pdf

66 J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. CVPR,

68 H. Nakayama, T. Harada, and Y. Kuniyoshi. Dense Sampling Low Level Statistics of Local Features. In CIVR,
[Slide labels: GMM; Single Gaussian]

70 H. Nakayama, T. Harada, and Y. Kuniyoshi. Global Gaussian Approach for Scene Categorization Using Information Geometry. In CVPR,
[Figure: Image 1 and Image 2 are each mapped from the local descriptor space to feature vectors x^(1) and x^(2); the question "Similarity?" is posed between the two feature vectors, which are drawn as points on a manifold]

73 [Figure labels: Super Vector Coding; VLAD; GMM + Bag of Visual Words; Fisher Vector; Sparse Coding; Global Gaussian; Local Coordinate Coding; Bag of Visual Words; Locality-constrained Linear Coding]

76 J. Sanchez, and F. Perronnin. High Dimensional Signature Compression for Large Scale Image Classification. In CVPR, 2011.

78 [Figure: several CPUs, each holding a group of classifiers; data distributed across several HDDs]

79 [Figure: a D-dim vector handled as sub-vectors of D/N dim each; 2^K codewords per sub-vector (w_1 to w_4 drawn); code rate NK/D [bit/dim]]
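The quantities on this slide (D dimensions, sub-vectors of D/N dimensions, 2^K codewords, NK/D bits per dimension) match product quantization, so the sketch below reads the slide that way. That reading is an assumption, as are all function names. For brevity the codebooks are sampled training sub-vectors rather than k-means centroids. The last function is an asymmetric distance: the query stays unquantized while the database vector is represented only by its code.

```python
import numpy as np

def train_codebooks(train, n_sub, bits, rng):
    """One codebook of 2**bits codewords per sub-vector block.
    Assumption: codewords are sampled training points, not k-means centroids."""
    idx = rng.choice(len(train), size=2 ** bits, replace=False)
    return [block[idx] for block in np.split(train, n_sub, axis=1)]

def encode(v, codebooks):
    """Index of the nearest codeword in each block (N indices of K bits each)."""
    blocks = np.split(v, len(codebooks))
    return np.array([np.argmin(np.sum((cb - b) ** 2, axis=1))
                     for cb, b in zip(codebooks, blocks)])

def decode(code, codebooks):
    """Approximate reconstruction by concatenating the selected codewords."""
    return np.concatenate([cb[c] for cb, c in zip(codebooks, code)])

def adc_distance(query, code, codebooks):
    """Squared distance between a raw query and an encoded vector,
    computed from per-block lookup tables."""
    blocks = np.split(query, len(codebooks))
    tables = [np.sum((cb - b) ** 2, axis=1) for cb, b in zip(codebooks, blocks)]
    return float(sum(t[c] for t, c in zip(tables, code)))

D, N, K = 16, 4, 3                                  # dims, sub-vectors, bits per sub-vector
rng = np.random.default_rng(0)
train = rng.normal(size=(200, D))
codebooks = train_codebooks(train, N, K, rng)

database_vector = rng.normal(size=D)
query = rng.normal(size=D)
code = encode(database_vector, codebooks)
dist = adc_distance(query, code, codebooks)
print(code, dist, N * K / D)                        # code rate in bits per dimension
```

The lookup tables are built once per query and reused for every database code, so the cost per comparison is N table reads instead of a D-dimensional distance.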

81 net.org/challenges/lsvrc/2011/ilsvrc11.pdf
