Scene Character Extraction by an Optimal Two-Dimensional Segmentation
Hiroaki TAKEBE a) and Seiichi UCHIDA
FUJITSU LABORATORIES LTD., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, 211-8588 Japan
Kyushu University, Fukuoka-shi, 819-0395 Japan
a) E-mail: takebe.hiroaki@jp.fujitsu.com
D Vol. J97-D No. 3 pp. 667-675 (c) 2014

1.
(a) (b) (c) / (d) (c) (d) OCR (c) OCR [1]-[3] (d) OCR [4]-[6]

Recognition-based segmentation [7] 1 DP 1 Conditional random field; CRF [8]-[10] CRF / OCR 2 2 2 2 OCR 2 2 2
2.
2 2 2 [11], [12] 1 2 1
Fig. 1 Component tree.
2 OCR 2
2.1
2 1 1 2

Fig. 2 Selection of combinations of components.
1 a f g
2.2
2 OCR 2 a e OCR {a, d, e, c} {a, b, c} {a, b, c} [13], [14] [15]
Fig. 3 Construction of a component graph from a component tree.
3 OCR OCR S T

Fig. 4 Character extraction by graph cut.
Fig. 5 Stability of components.

(1) u v u v c(u, v) (2) u_g v_g d_v h_{uv}

\[ C(S, T) = \sum_{u \in S,\, v \in T} c(u, v) \tag{1} \]

\[ c(u, v) = \begin{cases} d_v & (u_g \neq v_g) \\ h_{uv} & (u_g = v_g) \end{cases} \tag{2} \]

4 1 2
3.
3.1
2 2 OCR 5 [16] xy z 3 z σ

\[ \sigma = \frac{\sum_{z=z_1}^{z_2} S(z) / S(z_2)}{z_2 - z_1 + 1} \tag{3} \]

σ 6
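Reading Eq. (1) as a sum of edge costs c(u, v) over all pairs with u in S and v in T, the cut cost can be sketched as below. This is only a minimal illustration: the node names and cost values are hypothetical, and the per-edge costs are passed in as a plain dictionary rather than derived from d_v and h_{uv} as in Eq. (2).

```python
from typing import Dict, Hashable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def cut_cost(costs: Dict[Edge, float], S: Set[Hashable], T: Set[Hashable]) -> float:
    """C(S, T): total cost of directed edges (u, v) with u in S and v in T."""
    return sum(c for (u, v), c in costs.items() if u in S and v in T)


# Hypothetical toy graph (not taken from the paper): source "s", sink "t".
costs = {
    ("s", "a"): 3.0,
    ("s", "b"): 2.0,
    ("a", "b"): 1.0,
    ("a", "t"): 2.0,
    ("b", "t"): 4.0,
}

cut_source_only = cut_cost(costs, {"s"}, {"a", "b", "t"})   # edges s->a, s->b
cut_sink_only = cut_cost(costs, {"s", "a", "b"}, {"t"})     # edges a->t, b->t
cut_mixed = cut_cost(costs, {"s", "b"}, {"a", "t"})         # edges s->a, b->t
```

A graph cut solver searches over all partitions (S, T) for the one that minimizes this quantity; the function above only evaluates a given partition.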

2 τ OCR κ κα α [f_1, f_2] [g_1, g_2]

\[ \alpha = \min(f_2, g_2) - \max(f_1, g_1) \tag{4} \]

Fig. 6 Contraction of a component tree.
Fig. 7 Neighbor edges of a contracted component tree.

2.2 7 C [t_3, t_4] C B OCR
3.2
2 4 { } { } 8
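Eq. (4) has the form of the overlap length of two intervals [f_1, f_2] and [g_1, g_2], taking the minus sign lost in extraction as given. A minimal sketch follows; the function name and the sample intervals are illustrative assumptions, not values from the paper.

```python
from typing import Tuple

Interval = Tuple[float, float]


def interval_overlap(f: Interval, g: Interval) -> float:
    """alpha = min(f2, g2) - max(f1, g1).

    Positive when the two intervals overlap (the value is the overlap length),
    zero when they only touch, negative when they are disjoint (the value is
    minus the gap between them).
    """
    f1, f2 = f
    g1, g2 = g
    return min(f2, g2) - max(f1, g1)


# Hypothetical intervals for illustration only.
overlapping = interval_overlap((10, 50), (30, 80))   # shared range is [30, 50]
disjoint = interval_overlap((0, 10), (20, 30))       # gap of 10 between them
```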

Table 1 Experimental results.
Fig. 8 Integration of character extraction results. The parenthesized number is the character recognition cost.
8 AE
4.
ICDAR2003 Robust Reading Datasets [17] TrialTest 251 [17] Precision Recall F F-measure [18] 2 τ κ
Fig. 9 Examples of the proposed method.
(5) (6) RGB (r, g, b)

\[ I = 0.299r + 0.587g + 0.114b \tag{5} \]

\[ I = 0.5r - 0.5b + 128 \tag{6} \]

1 1
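The two single-channel conversions of Eqs. (5) and (6) can be sketched as follows. Eq. (6) is read here with a minus sign between 0.5r and 0.5b, which is an assumption about the operator lost in extraction (a plus sign would push the result beyond 255 for 8-bit inputs). The function names are hypothetical.

```python
def luminance(r: float, g: float, b: float) -> float:
    """Eq. (5): weighted sum of the RGB channels."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def red_blue_difference(r: float, g: float, b: float) -> float:
    """Eq. (6), assuming a minus sign: half the red-blue difference, offset by 128.

    The green channel is accepted for a uniform signature but is not used.
    """
    return 0.5 * r - 0.5 * b + 128


# For 8-bit inputs, pure red and pure blue give the two extremes of Eq. (6).
pure_red = red_blue_difference(255, 0, 0)
pure_blue = red_blue_difference(0, 0, 255)
mid_gray = red_blue_difference(128, 128, 128)
```

Under this reading, gray pixels map to 128 regardless of brightness, so the second channel separates text from background by hue where the luminance of Eq. (5) cannot.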

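Fig. 10 Examples of character extraction results.

consider using the extraction accuracy as the representative measure. Comparing the result with conventional methods on that basis (Note 1) gives an indication of how much the accuracy could be improved.

Fig. 9 (a)-(c) show an example of the processing results of the method. The image is taken from the TrialTrain set of [17]. (a) is the component graph of the image; the neighbor edges of the component graph are omitted to keep the figure readable. Black circles in the graph denote stable components and white circles denote intermediate components. The regions of the stable components in the image are shown as rectangles in (b). In addition, in the component graph

Note 1: Starting from the character extraction results of the proposed method, we merged the extracted characters that fall inside a ground-truth word region into a single word region, treated those that fall outside every ground-truth word region as incorrect word regions as they are, and measured the resulting word extraction accuracy. The result was a precision of 0.88, a recall of 0.78, and an F-measure of 0.82. These values are the upper bound on the word extraction accuracy attainable with the proposed method.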
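As a quick arithmetic check of the values in Note 1, the sketch below combines precision and recall with the standard harmonic-mean F-measure. This definition is an assumption: the paper takes its F-measure from [18], whose exact formula is not recoverable here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in Note 1.
upper_bound_f = f_measure(0.88, 0.78)
```

With the rounded inputs this gives roughly 0.827, which is compatible with the reported 0.82 if the unrounded precision and recall were slightly below 0.88 and 0.78.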

Table 2 Number of nodes and processing time.
(a) A L (c) 10 (a) (c) (a) (b) (d) (f) (d) (e) 1 (f) 2 (a) (f)
#nodes of CT, #nodes of CG
2 4 CPU Xeon 3.80GHz 2
5.
/ 2 OCR

[1] J. Ohya, A. Shio, and S. Akamatsu, "Recognizing characters in scene images," IEEE Trans. Pattern Anal. Mach. Intell., vol.16, no.2, pp.214-220, 1994.
[2] Y. Kusachi, A. Suzuki, N. Ito, and K. Arakawa, "Kanji recognition in scene images without detection of text fields robust against variation of viewpoint, contrast, and background texture," International Conference on Pattern Recognition (ICPR 2004), vol.1, pp.457-460, 2004.
[3] D. Chen, J.M. Odobez, and H. Bourlard, "Text detection and recognition in images and video frames," Pattern Recognit., vol.37, pp.595-608, 2004.
[4] C. Li, X. Ding, and Y. Wu, "Automatic text location in natural scene images," International Conference on Document Analysis and Recognition (ICDAR 2001), pp.1069-1073, 2001.
[5] R. Huang, S. Oba, S. Palaiahnakote, and S. Uchida, "Scene character detection and recognition based on multiple hypotheses framework," International Conference on Pattern Recognition (ICPR 2012), pp.717-720, 2012.
[6] R. Huang, S. Palaiahnakote, Y. Feng, and S. Uchida, "Scene character detection and recognition with cooperative multiple-hypothesis framework," IEICE Trans. Inf. & Syst., vol.E96-D, no.10, pp.2235-2245, Oct. 2012.
[7] H. Fujisawa, Y. Nakano, and K. Kurino, "Segmentation methods for character recognition," Proc. IEEE, vol.80, no.7, pp.1079-1092, 1992.
[8] M.S. Cho, J. Seok, S. Lee, and J. Kim, "Scene text extraction by superpixel CRFs combining multiple character features," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.1034-1038, 2011.
[9] Y. Pan, Y. Zhu, J. Sun, and S. Naoi, "Improving scene text detection by scale-adaptive segmentation and weighted CRF verification," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.759-763, 2011.
[10] Y. Pan, X. Hou, and C. Liu, "A hybrid approach to detect and localize texts in natural scene images," IEEE Trans. Image Process., vol.20, no.3, pp.800-813, 2011.
[11] M. Couprie and G. Bertrand, "Topological grayscale watershed transform," SPIE Vision Geometry V Proceedings, vol.3168, pp.136-146, 1997.
[12] L. Najman and M. Couprie, "Building the component tree in quasi-linear time," IEEE Trans. Image Process., vol.15, no.11, pp.3531-3539, 2006.
[13] Y. Boykov and M-P. Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images," International Conference on Computer Vision (ICCV 2001), vol.1, pp.105-112, 2001.
[14] H. Ishikawa, "Exact optimization for Markov random fields with convex priors," IEEE Trans. Pattern Anal. Mach. Intell., vol.25, no.10, pp.1333-1336, 2003.
[15] 2010.
[16] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image Vis. Comput., vol.22, no.10, pp.761-767, 2004.
[17] S.M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young, "ICDAR 2003 robust reading competitions," International Conference on Document Analysis and Recognition (ICDAR 2003), pp.682-687, 2003.
[18] D-II, vol.J78-D-II, no.11, pp.1627-1638, Nov. 1995.
[19] L. Neumann and J. Matas, "Text localization in real-world images using efficiently pruned exhaustive search," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.687-691, 2011.
[20] J. Lee, P. Lee, S. Lee, A. Yuille, and C. Koch, "AdaBoost for text detection in natural scene," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.429-434, 2011.
[21] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), pp.2963-2970, 2010.