




Scene Character Extraction by an Optimal Two-Dimensional Segmentation

Hiroaki TAKEBE a) and Seiichi UCHIDA

FUJITSU LABORATORIES LTD., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, 211-8588 Japan
Kyusyu University, Fukuoka-shi, 819-0395 Japan
a) E-mail: takebe.hiroaki@jp.fujitsu.com

D, Vol. J97-D, No. 3, pp. 667-675, 2014

1.
(c) OCR [1]-[3]; (d) OCR [4]-[6]

Recognition-based segmentation [7]; DP
Conditional random field (CRF) [8]-[10]; CRF / OCR

2.
[11], [12]

Fig. 1 Component tree.

2. 1

Fig. 2 Selection of combinations of components.

2. 2
OCR: {a, d, e, c}, {a, b, c}
[13], [14], [15]

Fig. 3 Construction of a component graph from a component tree.

OCR; S, T

Fig. 4 Character extraction by graph cut.

Fig. 5 Stability of components.

C(S, T) = \sum_{u \in S,\, v \in T} c(u, v)    (1)

c(u, v) = \begin{cases} d_v & (u_g \neq v_g) \\ h_{uv} & (u_g = v_g) \end{cases}    (2)

3.
3. 1
OCR; [16]; xy, z
σ = S(z) S(z_2) z_2 z_1 + 1, z = z_1    (3)
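The cut cost in Equations (1) and (2) can be illustrated with a small sketch. It rests on several assumptions that are not confirmed by the surviving text: the first case of Equation (2) is read as u_g != v_g, `group` stands for the subscript g, `d` holds the per-node costs d_v, `h` holds the pairwise costs h_uv, and pairs with no entry in `h` contribute zero. All function and variable names are hypothetical.

```python
from itertools import product


def edge_cost(u, v, group, d, h):
    """Equation (2): d_v when the g-labels differ (assumed reading), else h_uv."""
    if group[u] != group[v]:
        return d[v]
    # Assumption: a missing pairwise cost is treated as zero.
    return h.get((u, v), 0.0)


def cut_cost(S, T, group, d, h):
    """Equation (1): sum of c(u, v) over every u in S and every v in T."""
    return sum(edge_cost(u, v, group, d, h) for u, v in product(S, T))


# Toy data: "a" and "b" share a g-label, "c" has a different one.
group = {"a": 0, "b": 0, "c": 1}
d = {"a": 2.0, "b": 3.0, "c": 5.0}
h = {("a", "b"): 1.5}

# Pair (a, b) uses h_ab; pair (a, c) uses d_c.
total = cut_cost(["a"], ["b", "c"], group, d, h)
```

A graph-cut solver would search for the partition (S, T) that minimizes this value; the sketch only evaluates the cost of one given partition.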

τ; OCR; κ; κα

For intervals [f_1, f_2] and [g_1, g_2]:

\alpha = \min(f_2, g_2) - \max(f_1, g_1)    (4)

Fig. 6 Contraction of a component tree.

Fig. 7 Neighbor edges of a contracted component tree.

2. 2; Fig. 7: C, [t3, t4], C, B; OCR

3. 2
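Equation (4) measures how far two intervals [f_1, f_2] and [g_1, g_2] overlap. A minimal sketch, with a hypothetical function name and tuples standing in for the intervals:

```python
def interval_overlap(f, g):
    """Equation (4): alpha = min(f2, g2) - max(f1, g1).

    f and g are (lower, upper) pairs. The result is positive when the
    intervals overlap and negative when they are disjoint.
    """
    f1, f2 = f
    g1, g2 = g
    return min(f2, g2) - max(f1, g1)


overlapping = interval_overlap((10, 50), (30, 80))  # shared range is 30..50
disjoint = interval_overlap((0, 10), (20, 30))      # gap between 10 and 20
```

The measure is symmetric in its two arguments, so the order in which the intervals are passed does not matter.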

Table 1 Experimental results.

Fig. 8 Integration of character extraction results. The parenthesized number is the character recognition cost.

4.
ICDAR2003 Robust Reading Datasets [17]; TrialTest, 251 [17]
Precision, Recall, F (F-measure) [18]
τ, κ

Fig. 9 Examples of the proposed method.

For an RGB pixel (r, g, b):

I = 0.299r + 0.587g + 0.114b    (5)

I = 0.5r - 0.5b + 128    (6)
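Equations (5) and (6) each turn an RGB pixel (r, g, b) into a single intensity value I. The sketch below evaluates both for one pixel. The sign between the two terms of Equation (6) did not survive extraction; a minus sign is assumed here because it keeps the result near the 0 to 255 range for 8-bit inputs. The function names are hypothetical.

```python
def intensity_eq5(r, g, b):
    """Equation (5): weighted sum of the three channels."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def intensity_eq6(r, g, b):
    """Equation (6), assuming a minus sign: red-blue difference offset by 128."""
    return 0.5 * r - 0.5 * b + 128


# A neutral gray pixel keeps its value under (5) and maps to the
# midpoint 128 under (6), since r and b cancel.
gray5 = intensity_eq5(100, 100, 100)
gray6 = intensity_eq6(100, 100, 100)

# Pure red and pure blue land at opposite ends under (6).
red6 = intensity_eq6(255, 0, 0)
blue6 = intensity_eq6(0, 0, 255)
```

The two channels respond to different contrasts: text that differs from its background mainly in hue can be invisible in (5) but clear in (6).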

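Fig. 10 Examples of character extraction results.

... we consider letting the extraction accuracy stand in for it. On that basis, a comparison with the conventional methods (Note 1) suggests that there is room for improved accuracy.

Fig. 9 (a)-(c) show examples of the method's processing results. The target image is taken from the TrialTrain set of [17]. (a) is the component graph for the target image; the neighbor edges of the component graph are omitted to keep the figure readable. Black circles in the graph denote stable components, and white circles denote intermediate components. The regions of the stable components in the image are shown as rectangles in (b). In addition, the component graph ...

(Note 1) Starting from the character extraction results of the proposed method, we merged the results that fall inside a ground-truth word region into a word region, left those that fall outside every ground-truth word region as incorrect word regions, and measured the resulting word extraction accuracy. This gave a precision of 0.88, a recall of 0.78, and an F-measure of 0.82. These values represent the upper bound on the word extraction accuracy of the proposed method.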
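The figures in Note 1 can be cross-checked with a short calculation. The sketch assumes that the F-measure is the harmonic mean of precision and recall; the surviving text does not spell out the definition, so this is an assumption, and the function name is hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in Note 1, already rounded to two decimals.
f_upper_bound = f_measure(0.88, 0.78)
```

From the rounded inputs the result is about 0.827, which is compatible with the reported 0.82 once the rounding of precision and recall is taken into account.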

Table 2 Number of nodes and processing time. Column headers: #nodes of CT, #nodes of CG; rows (a)-(f).

Fig. 10: (a)-(c), (d)-(f)

CPU: Xeon 3.80GHz

5.
OCR

[1] J. Ohya, A. Shio, and S. Akamatsu, "Recognizing characters in scene images," IEEE Trans. Pattern Anal. Mach. Intell., vol.16, no.2, pp.214-220, 1994.
[2] Y. Kusachi, A. Suzuki, N. Ito, and K. Arakawa, "Kanji recognition in scene images without detection of text fields robust against variation of viewpoint, contrast, and background texture," International Conference on Pattern Recognition (ICPR 2004), vol.1, pp.457-460, 2004.
[3] D. Chen, J.M. Odobez, and H. Bourlard, "Text detection and recognition in images and video frames," Pattern Recognit., vol.37, pp.595-608, 2004.
[4] C. Li, X. Ding, and Y. Wu, "Automatic text location in natural scene images," International Conference on Document Analysis and Recognition (ICDAR 2001), pp.1069-1073, 2001.
[5] R. Huang, S. Oba, S. Palaiahnakote, and S. Uchida, "Scene character detection and recognition based on multiple hypotheses framework," International Conference on Pattern Recognition (ICPR 2012), pp.717-720, 2012.
[6] R. Huang, S. Palaiahnakote, Y. Feng, and S. Uchida, "Scene character detection and recognition with cooperative multiple-hypothesis framework," IEICE Trans. Inf. & Syst., vol.E96-D, no.10, pp.2235-2245, Oct. 2012.
[7] H. Fujisawa, Y. Nakano, and K. Kurino, "Segmentation methods for character recognition," Proc. IEEE, vol.80, no.7, pp.1079-1092, 1992.
[8] M.S. Cho, J. Seok, S. Lee, and J. Kim, "Scene text extraction by superpixel CRFs combining multiple character features," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.1034-1038, 2011.
[9] Y. Pan, Y. Zhu, J. Sun, and S. Naoi, "Improving scene text detection by scale-adaptive segmentation and weighted CRF verification," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.759-763, 2011.
[10] Y. Pan, X. Hou, and C. Liu, "A hybrid approach to detect and localize texts in natural scene images," IEEE Trans. Image Process., vol.20, no.3, pp.800-813, 2011.
[11] M. Couprie and G. Bertrand, "Topological grayscale watershed transform," SPIE Vision Geometry V Proceedings, vol.3168, pp.136-146, 1997.
[12] L. Najman and M. Couprie, "Building the component tree in quasi-linear time," IEEE Trans. Image Process., vol.15, no.11, pp.3531-3539, 2006.
[13] Y. Boykov and M-P. Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images," International Conference on Computer Vision (ICCV 2001), vol.1, pp.105-112, 2001.
[14] H. Ishikawa, "Exact optimization for Markov random fields with convex priors," IEEE Trans. Pattern Anal. Mach. Intell., vol.25, no.10, pp.1333-1336, 2003.
[15] 2010.
[16] J. Matas, O. Chum, M. Urban, and T. Pajdla, "Robust wide-baseline stereo from maximally stable extremal regions," Image Vis. Comput., vol.22, no.10, pp.761-767, 2004.
[17] S.M. Lucas, A. Panaretos, L. Sosa, A. Tang, S. Wong, and R. Young, "ICDAR 2003 robust reading competitions," International Conference on Document Analysis and Recognition (ICDAR 2003), pp.682-687, 2003.
[18] D-II, vol.J78-D-II, no.11, pp.1627-1638, Nov. 1995.
[19] L. Neumann and J. Matas, "Text localization in real-world images using efficiently pruned exhaustive search," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.687-691, 2011.
[20] J. Lee, P. Lee, S. Lee, A. Yuille, and C. Koch, "AdaBoost for text detection in natural scene," International Conference on Document Analysis and Recognition (ICDAR 2011), pp.429-434, 2011.
[21] B. Epshtein, E. Ofek, and Y. Wexler, "Detecting text in natural scenes with stroke width transform," IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2010), pp.2963-2970, 2010.