23 Study on character extraction from a picture using a gradient-based feature 1120227 2012 3 1


Abstract

Study on character extraction from a picture using a gradient-based feature

Kenji OGAWA

In an image retrieval system, a user has to use keywords to retrieve images from an image database, so every image in the database should be tagged with metadata such as keywords. Images that are not tagged with metadata cannot be retrieved. For example, images in Google Street View do not have metadata, and it is difficult to search for a particular image in Google Street View. In this thesis, we study character extraction from images. Characters in Google Street View are usually not photographed from directly in front; they are usually photographed obliquely. We use the SIFT feature, which is robust to image rotation and invariant to changes in scale and brightness, and study whether it is also robust to three-dimensional rotation. Experiments are performed in which characters are extracted from the image using SIFT while the distance used to determine similarity among keypoints is varied, under three-dimensional rotation of the image. The images are rotated from -80° to 80° around the y axis. The results show that characters can be extracted under spatial rotation between -50° and 30°.

key words: gradient-based feature, SIFT, character extraction


1 ( ) Google Street View Panoramio [1] [2] [1] SIFT(Scale- Invariant Feature Transform) y -80 80 3-50 30 1

3 Google Street View Google Street View Google Street View 2 3 SIFT 4 SIFT 5

2

HOG, Haar-like, SIFT

2.1 Histograms of Oriented Gradients (HOG)

HOG [3]

2.2 Haar-like

Haar-like [5]

2.3 Scale-Invariant Feature Transform (SIFT)

SIFT [2] [4]

Table 2.1: HOG, Haar-like, SIFT

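3 SIFT

3.1

3.1.1 Difference-of-Gaussian (DoG)

DoG (Difference-of-Gaussian) processing uses the Gaussian function G(x,y,σ) of Eq. (3.1). Convolving it with the input image I(a,b) gives the smoothed image L(a,b,σ) of Eq. (3.2), from which the DoG image is obtained.

\[
G(x,y,\sigma) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right) \tag{3.1}
\]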

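\[
L(a,b,\sigma) = G(x,y,\sigma) * I(a,b) \tag{3.2}
\]

The DoG image D(a,b,σ) is the difference between two smoothed images:

\[
D(a,b,\sigma) = \bigl(G(x,y,k\sigma) - G(x,y,\sigma)\bigr) * I(a,b) = L(a,b,k\sigma) - L(a,b,\sigma) \tag{3.3}
\]

Starting from an initial value σ_0, σ is increased by a factor of k at each step.

Figure 3.1: DoG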
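As a rough illustration of Eqs. (3.1) to (3.3), the following sketch smooths a small image at two scales and subtracts the results. The kernel radius (3σ), the border handling (edge pixels replicated), the kernel normalisation, and the example values σ = 1.0 and k = 1.6 are assumptions made for this sketch; none of them is taken from the text.

```python
import math

def gaussian_kernel(sigma):
    """Sampled G(x, y, sigma) of Eq. (3.1), normalised to sum to 1 (assumption)."""
    radius = max(1, int(math.ceil(3 * sigma)))  # assumed truncation radius
    kernel = [[math.exp(-(x * x + y * y) / (2 * sigma * sigma))
               / (2 * math.pi * sigma * sigma)
               for x in range(-radius, radius + 1)]
              for y in range(-radius, radius + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def smooth(image, sigma):
    """L = G * I of Eq. (3.2); border pixels are replicated (assumption)."""
    kernel = gaussian_kernel(sigma)
    r = len(kernel) // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for b in range(h):
        for a in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(b + dy, 0), h - 1)
                    xx = min(max(a + dx, 0), w - 1)
                    acc += kernel[dy + r][dx + r] * image[yy][xx]
            out[b][a] = acc
    return out

def dog(image, sigma, k):
    """D = L(k * sigma) - L(sigma) of Eq. (3.3)."""
    wide = smooth(image, k * sigma)
    narrow = smooth(image, sigma)
    return [[p - q for p, q in zip(rw, rn)] for rw, rn in zip(wide, narrow)]
```

On a constant image both smoothed images are identical, so the DoG response is zero everywhere; an isolated bright pixel gives a negative response at its own position, because the wider Gaussian spreads it further.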

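3.1.2

Keypoint candidates are detected from the DoG images [3]. Each pixel of a DoG image is compared with its 26 neighbours, that is, the 8 surrounding pixels in the same DoG image and the 9 pixels in each of the two DoG images at the adjacent scales σ, and it becomes a keypoint candidate when it is an extremum [3].

Figure 3.2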
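The 26-neighbour comparison can be sketched as follows. The layout `dogs[s][y][x]` (a list of DoG images ordered by scale) and the function names are hypothetical, chosen only for this illustration, and the sketch assumes the pixel is not on the border of the stack.

```python
def neighbours_26(dogs, s, y, x):
    """Values of the 26 neighbours of dogs[s][y][x] across three adjacent scales."""
    return [dogs[s + ds][y + dy][x + dx]
            for ds in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
            if (ds, dy, dx) != (0, 0, 0)]

def is_extremum(dogs, s, y, x):
    """True if the pixel is strictly larger or strictly smaller than all 26 neighbours."""
    centre = dogs[s][y][x]
    values = neighbours_26(dogs, s, y, x)
    return centre > max(values) or centre < min(values)
```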

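3.2

3.2.1

The Hessian matrix H of the DoG image at a keypoint candidate is

\[
H = \begin{bmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{bmatrix} \tag{3.4}
\]

Let α be the first eigenvalue and β the second eigenvalue of H (α > β). The trace Tr(H) and the determinant Det(H) are then

\[
\mathrm{Tr}(H) = D_{xx} + D_{yy} = \alpha + \beta \tag{3.5}
\]

\[
\mathrm{Det}(H) = D_{xx} D_{yy} - (D_{xy})^{2} = \alpha\beta \tag{3.6}
\]

With γ denoting the ratio of the first eigenvalue α to the second eigenvalue β, so that α = γβ,

\[
\frac{\mathrm{Tr}(H)^{2}}{\mathrm{Det}(H)} = \frac{(\alpha+\beta)^{2}}{\alpha\beta} = \frac{(\gamma\beta+\beta)^{2}}{\gamma\beta^{2}} = \frac{(\gamma+1)^{2}}{\gamma} \tag{3.7}
\]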
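Because the ratio in Eq. (3.7) depends only on γ, it can be compared with a threshold without computing the eigenvalues themselves. The sketch below assumes a threshold ratio `gamma_th` of 10, a commonly used value that is not stated in this text, and it rejects points with a non-positive determinant, which is a further assumption.

```python
def passes_edge_test(dxx, dyy, dxy, gamma_th=10.0):
    """Keep a keypoint only if Tr(H)^2 / Det(H) < (gamma_th + 1)^2 / gamma_th."""
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures of opposite sign (or a flat direction): rejected here (assumption).
        return False
    return trace * trace / det < (gamma_th + 1) ** 2 / gamma_th
```

For example, `passes_edge_test(1.0, 1.0, 0.0)` accepts a blob-like point whose two curvatures are equal, while `passes_edge_test(100.0, 1.0, 0.0)` rejects an edge-like point whose curvatures differ by a factor of 100.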

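The ratio Tr(H)²/Det(H) is determined solely by the ratio between the first eigenvalue α and the second eigenvalue β. A keypoint candidate is kept when

\[
\frac{\mathrm{Tr}(H)^{2}}{\mathrm{Det}(H)} < \frac{(\gamma_{th}+1)^{2}}{\gamma_{th}} \tag{3.8}
\]

where γ_th is a threshold. Candidates whose Tr(H)²/Det(H) does not satisfy this condition are removed [6].

3.2.2

The sub-pixel position of a keypoint is estimated next [6]. With a = (x,y,σ)^T, the DoG function D(a) is expanded as a Taylor series [3]:

\[
D(\mathbf{a}) = D + \frac{\partial D}{\partial \mathbf{a}}^{T}\mathbf{a} + \frac{1}{2}\,\mathbf{a}^{T}\frac{\partial^{2} D}{\partial \mathbf{a}^{2}}\,\mathbf{a} \tag{3.9}
\]

Setting the derivative with respect to a to 0 gives [3]

\[
\frac{\partial D}{\partial \mathbf{a}} + \frac{\partial^{2} D}{\partial \mathbf{a}^{2}}\,\hat{\mathbf{a}} = 0 \tag{3.10}
\]

where â is the sub-pixel position (x,y,σ)^T of the keypoint. Rearranging,

\[
\frac{\partial^{2} D}{\partial \mathbf{a}^{2}}\,\hat{\mathbf{a}} = -\frac{\partial D}{\partial \mathbf{a}} \tag{3.11}
\]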
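Equation (3.11) is a 3 × 3 linear system in â. A minimal sketch of solving it is shown below, using Cramer's rule so that no external library is needed. The function names, and the convention that `grad` and `hess` hold the first and second derivatives of D with respect to (x, y, σ), are assumptions made for this illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(matrix, rhs):
    """Solve matrix * x = rhs for a 3x3 system by Cramer's rule."""
    d = det3(matrix)
    if abs(d) < 1e-12:
        raise ValueError("singular matrix")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in matrix]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(det3(replaced) / d)
    return solution

def subpixel_offset(grad, hess):
    """Return a_hat satisfying hess * a_hat = -grad, as in Eq. (3.11)."""
    return solve3(hess, [-g for g in grad])
```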

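Solving for â gives

\[
\hat{\mathbf{a}} = \begin{bmatrix} x \\ y \\ \sigma \end{bmatrix}
= -\begin{bmatrix}
\dfrac{\partial^{2} D}{\partial x^{2}} & \dfrac{\partial^{2} D}{\partial x \partial y} & \dfrac{\partial^{2} D}{\partial x \partial \sigma} \\
\dfrac{\partial^{2} D}{\partial x \partial y} & \dfrac{\partial^{2} D}{\partial y^{2}} & \dfrac{\partial^{2} D}{\partial y \partial \sigma} \\
\dfrac{\partial^{2} D}{\partial x \partial \sigma} & \dfrac{\partial^{2} D}{\partial y \partial \sigma} & \dfrac{\partial^{2} D}{\partial \sigma^{2}}
\end{bmatrix}^{-1}
\begin{bmatrix} \dfrac{\partial D}{\partial x} \\ \dfrac{\partial D}{\partial y} \\ \dfrac{\partial D}{\partial \sigma} \end{bmatrix} \tag{3.12}
\]

which yields the sub-pixel position â = (x,y,σ).

3.2.3

The DoG output at the sub-pixel position is

\[
D(\hat{\mathbf{a}}) = D + \frac{1}{2}\,\frac{\partial D}{\partial \mathbf{a}}^{T}\hat{\mathbf{a}} \tag{3.13}
\]

where D is the DoG value at the sampled point and â is the sub-pixel position. Keypoints are then screened by the value of D(â): those with a small |D(â)| have low contrast and are removed.

3.3

The gradient magnitude m(x,y) and the gradient orientation θ(x,y) are computed from the smoothed image L(x,y).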

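\[
m(x,y) = \sqrt{f_{x}(x,y)^{2} + f_{y}(x,y)^{2}} \tag{3.14}
\]

\[
\theta(x,y) = \tan^{-1}\frac{f_{y}(x,y)}{f_{x}(x,y)} \tag{3.15}
\]

\[
\begin{cases}
f_{x}(x,y) = L(x+1,y) - L(x-1,y) \\
f_{y}(x,y) = L(x,y+1) - L(x,y-1)
\end{cases} \tag{3.16}
\]

A weighted orientation histogram is built from m(x,y) and θ(x,y):

\[
h_{\theta} = \sum_{x}\sum_{y} \omega(x,y)\,\delta[\theta, \theta(x,y)] \tag{3.17}
\]

\[
\omega(x,y) = G(x,y,\sigma)\,m(x,y) \tag{3.18}
\]

Here h_θ is a histogram over 36 quantised directions, ω(x,y) is the weight at pixel (x,y), and δ is the Kronecker delta, which returns 1 when the gradient orientation θ(x,y) falls in the quantised direction θ [3]. The weight ω(x,y) is the product of the Gaussian G(x,y,σ) and the gradient magnitude m(x,y). In the 36-direction histogram h_θ, directions whose value is at least 80% of the maximum are taken as the orientation of the keypoint.

Figure 3.3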
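The orientation assignment of Eqs. (3.14) to (3.18) can be sketched as below. The image is indexed as `L[y][x]`, the window radius is a free parameter, and `atan2` is used so that the orientation covers the full 360° range. These choices and all function names are assumptions for this sketch.

```python
import math

def gradient(L, x, y):
    """Gradient magnitude m and orientation theta at (x, y), Eqs. (3.14)-(3.16)."""
    fx = L[y][x + 1] - L[y][x - 1]
    fy = L[y + 1][x] - L[y - 1][x]
    return math.hypot(fx, fy), math.atan2(fy, fx)

def orientation_histogram(L, cx, cy, radius, sigma, bins=36):
    """Gaussian-weighted orientation histogram h_theta around (cx, cy), Eqs. (3.17)-(3.18)."""
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Skip pixels whose central differences would leave the image.
            if not (1 <= x < len(L[0]) - 1 and 1 <= y < len(L) - 1):
                continue
            m, theta = gradient(L, x, y)
            dx, dy = x - cx, y - cy
            g = (math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
                 / (2 * math.pi * sigma * sigma))
            b = int((theta % (2 * math.pi)) / (2 * math.pi) * bins) % bins
            hist[b] += g * m
    return hist

def dominant_orientations(hist, ratio=0.8):
    """Indices of bins whose value is at least `ratio` of the histogram maximum."""
    peak = max(hist)
    if peak == 0:
        return []
    return [i for i, v in enumerate(hist) if v >= ratio * peak]
```

For an image whose intensity increases only along x, every gradient points in the 0° direction, so all weight falls into bin 0 and that bin alone is returned as the keypoint orientation.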

3.4 3.3 3.3 3.4 3.3 128 3.4 12

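Figure 3.4

The region around the keypoint is divided into 4 blocks along each side, giving 4 × 4 = 16 blocks. An 8-direction orientation histogram is built for each block, so the feature vector has 16 × 8 = 128 dimensions.

Figure 3.5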
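The 4 × 4 × 8 = 128 layout can be illustrated with a simplified descriptor. This sketch assumes that gradient magnitudes and orientations are already available for a square patch whose side is a multiple of 4, and it omits the Gaussian weighting and the rotation of the patch to the keypoint orientation. The unit-length normalisation at the end is also an assumption.

```python
import math

def descriptor(mags, thetas, blocks=4, bins=8):
    """Concatenate one `bins`-direction histogram per block into a single vector."""
    n = len(mags)          # patch side length; assumed to be a multiple of `blocks`
    cell = n // blocks
    vec = [0.0] * (blocks * blocks * bins)
    for y in range(n):
        for x in range(n):
            by, bx = y // cell, x // cell
            b = int((thetas[y][x] % (2 * math.pi)) / (2 * math.pi) * bins) % bins
            vec[(by * blocks + bx) * bins + b] += mags[y][x]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec
```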

3.4 3.5 3.5 DoG [4] 2 2 14

4 SIFT SIFT SIFT 0.3 0.2 0.2 SIFT 0.3 SIFT 0.3 0.2 1. SIFT 2. SIFT 3. 4. 2 4.1 SIFT 15
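The abstract states that the distance used to determine similarity among keypoints is varied, and the values 0.3 and 0.2 appear here. The exact matching rule is not recoverable from this text, so the following is only a sketch under the assumption that each descriptor is matched to its nearest neighbour by Euclidean distance and accepted when that distance is below d. The function and variable names are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def match_keypoints(template_desc, scene_desc, d):
    """Match each template descriptor to its nearest scene descriptor if closer than d."""
    matches = []
    for i, t in enumerate(template_desc):
        best_j, best_dist = None, float("inf")
        for j, s in enumerate(scene_desc):
            dist = euclidean(t, s)
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j is not None and best_dist < d:
            matches.append((i, best_j, best_dist))
    return matches
```

Under this rule a smaller d accepts fewer matches, which trades the number of matches against how strict each match is.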

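Figure 4.1

The image is rotated around the y axis in steps of 5°, from -80° to 80°.

Figure 4.2: 0°
Figure 4.3: -40°
Figure 4.4: -80°
Figure 4.5: 40°
Figure 4.6: 80°

Figures 4.2, 4.3, 4.4, 4.5, and 4.6 show the image rotated around the y axis by 0°, -40°, -80°, 40°, and 80°, respectively. The accuracy rate r is defined as

\[
r = \frac{n_{correct}}{n_{all}} \tag{4.1}
\]

where n_all is the number of keypoints matched with Figure 4.1 by SIFT, and n_correct is the number of those n_all matches that are correct.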
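Equation (4.1) can be checked against the values reported in Tables 5.1 and 5.2 with a small helper. Reporting r as a whole-number percentage, and returning 0 when n_all is 0, are assumptions inferred from how the tables present the values.

```python
def accuracy_percent(n_correct, n_all):
    """Accuracy rate r = n_correct / n_all of Eq. (4.1), as a rounded percentage."""
    if n_all == 0:
        return 0  # no matches at all: reported as 0 in the tables (assumption)
    return round(100 * n_correct / n_all)
```

For instance, 7 correct matches out of 9 give 78%, the value listed for -50° with d=0.3.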

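5

5.1

The results with the distance (d) set to 0.3 are as follows.

Figure 5.1: d=0.3, 0°
Figure 5.2: d=0.3, -55°
Figure 5.3: d=0.3, 70°

Figure 5.1 shows the result at 0°, and Figures 5.2 and 5.3 show the results at -55° and 70°. The results for the other angles are given in Appendix A. Table 5.1 lists n_all, n_correct, and r for distance (d) 0.3.

Table 5.1: d=0.3

Angle (°)   n_all   n_correct   r (%)
-80         2       0           0
-70         0       0           0
-60         5       0           0
-50         9       7           78
-40         18      13          72
-30         20      14          70
-20         23      16          70
-10         19      16          84
0           23      15          65
10          17      13          76
20          18      12          67
30          18      11          61
40          18      7           39
50          5       0           0
60          3       1           33
70          4       0           0
80          3       0           0

According to Table 5.1, characters are extracted between -50° and 30°.

Figure 5.4: d=0.3

The results with the distance (d) set to 0.2 are as follows.

Figure 5.5: d=0.2, 0°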

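Figure 5.6: d=0.2, -45°
Figure 5.7: d=0.2, 50°

Figure 5.5 shows the result at 0°, and Figures 5.6 and 5.7 show the results at -45° and 50°. The results for the other angles are given in Appendix B. Table 5.2 lists n_all, n_correct, and r for distance (d) 0.2.

Table 5.2: d=0.2

Angle (°)   n_all   n_correct   r (%)
-80         0       0           0
-70         0       0           0
-60         1       0           0
-50         1       1           100
-40         8       4           50
-30         7       7           100
-20         6       6           100
-10         7       5           71
0           7       6           86
10          11      9           82
20          9       7           78
30          5       5           100
40          3       1           33
50          0       0           0
60          1       1           100
70          0       0           0
80          0       0           0

According to Table 5.2, characters are extracted between -30° and 30°.

Figure 5.8: d=0.2

5.2

With d=0.3, characters are extracted between -50° and 30°.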

5.2 0 100% n all 0.2-30 30 100% SIFT SIFT 0.3 n all 25

6 SIFT y 0.3-50 45 0.2-40 30 SIFT 70 SIFT 3 Google Street View Google Street View 26

Free BSD PC LATEX 3 4 27


[1] Y. Kusachi, A. Suzuki, N. Ito, and K. Arakawa, Kanji Recognition in scene images without detection of text fields robust against variation of viewpoint, contrast, and background texture, Proc. ICPR2004, 2004.
[2],,2011.
[3] Gradient -SIFT HOG-
[4],, SIFT Mean-Shift
[5],
[6] SIFT, http://www.scribd.com/doc/33063124/14/sift%e3%82%a2%e3%83%ab%e3%82%B4%E3%83%AA%E3%82%BA%E3%83%A0

A d=0.3

Figures A.1 to A.33: results for d=0.3 at rotation angles from -80° to 80° in steps of 5° (A.1: -80°, A.17: 0°, A.33: 80°).

B d=0.2

Figures B.1 to B.33: results for d=0.2 at rotation angles from -80° to 80° in steps of 5° (B.1: -80°, B.17: 0°, B.33: 80°).