自然言語処理23_175

Similar documents
A Japanese Word Dependency Corpus ÆüËܸì¤Îñ¸ì·¸¤ê¼õ¤±¥³¡¼¥Ñ¥¹

21 Pitman-Yor Pitman- Yor [7] n -gram W w n-gram G Pitman-Yor P Y (d, θ, G 0 ) (1) G P Y (d, θ, G 0 ) (1) Pitman-Yor d, θ, G 0 d 0 d 1 θ Pitman-Yor G

No. 3 Oct The person to the left of the stool carried the traffic-cone towards the trash-can. α α β α α β α α β α Track2 Track3 Track1 Track0 1

DEIM Forum 2009 C8-4 QA NTT QA QA QA 2 QA Abstract Questions Recomme

gengo.dvi

1 Fig. 1 Extraction of motion,.,,, 4,,, 3., 1, 2. 2.,. CHLAC,. 2.1,. (256 ).,., CHLAC. CHLAC, HLAC. 2.3 (HLAC ) r,.,. HLAC. N. 2 HLAC Fig. 2

3807 (3)(2) ,267 1 Fig. 1 Advertisement to the author of a blog. 3 (1) (2) (3) (2) (1) TV 2-0 Adsense (2) Web ) 6) 3

1 1 tf-idf tf-idf i

Haiku Generation Based on Motif Images Using Deep Learning Koki Yoneda 1 Soichiro Yokoyama 2 Tomohisa Yamashita 2 Hidenori Kawamura Scho

n 2 n (Dynamic Programming : DP) (Genetic Algorithm : GA) 2 i

Computational Semantics 1 category specificity Warrington (1975); Warrington & Shallice (1979, 1984) 2 basic level superiority 3 super-ordinate catego

1 7.35% 74.0% linefeed point c 200 Information Processing Society of Japan

JFE.dvi

1 Web [2] Web [3] [4] [5], [6] [7] [8] S.W. [9] 3. MeetingShelf Web MeetingShelf MeetingShelf (1) (2) (3) (4) (5) Web MeetingShelf

main.dvi

kut-paper-template.dvi

IPSJ-TOD

: ( 1) () 1. ( 1) 2. ( 1) 3. ( 2)

1 [1, 2, 3, 4, 5, 8, 9, 10, 12, 15] The Boston Public Schools system, BPS (Deferred Acceptance system, DA) (Top Trading Cycles system, TTC) cf. [13] [

2

A Feasibility Study of Direct-Mapping-Type Parallel Processing Method to Solve Linear Equations in Load Flow Calculations Hiroaki Inayoshi, Non-member

fiš„v5.dvi

IPSJ SIG Technical Report Vol.2011-EC-19 No /3/ ,.,., Peg-Scope Viewer,,.,,,,. Utilization of Watching Logs for Support of Multi-

DEIM Forum 2009 E

DEIM Forum 2010 A Web Abstract Classification Method for Revie

独立行政法人情報通信研究機構 Development of the Information Analysis System WISDOM KIDAWARA Yutaka NICT Knowledge Clustered Group researched and developed the infor

IPSJ SIG Technical Report Vol.2009-BIO-17 No /5/26 DNA 1 1 DNA DNA DNA DNA Correcting read errors on DNA sequences determined by Pyrosequencing

TF-IDF TDF-IDF TDF-IDF Extracting Impression of Sightseeing Spots from Blogs for Supporting Selection of Spots to Visit in Travel Sat

2007/2 Vol. J90 D No Web 2. 1 [3] [2], [11] [18] [14] YELLOW [16] [8] tfidf [19] 2. 2 / 30% 90% [24] 2. 3 [4], [21] 428

16_.....E...._.I.v2006

HP cafe HP of A A B of C C Map on N th Floor coupon A cafe coupon B Poster A Poster A Poster B Poster B Case 1 Show HP of each company on a user scree

_314I01BM浅谷2.indd

ID 3) 9 4) 5) ID 2 ID 2 ID 2 Bluetooth ID 2 SRCid1 DSTid2 2 id1 id2 ID SRC DST SRC 2 2 ID 2 2 QR 6) 8) 6) QR QR QR QR

¥ì¥·¥Ô¤Î¸À¸ì½èÍý¤Î¸½¾õ

28 Horizontal angle correction using straight line detection in an equirectangular image

Vol. 48 No. 4 Apr LAN TCP/IP LAN TCP/IP 1 PC TCP/IP 1 PC User-mode Linux 12 Development of a System to Visualize Computer Network Behavior for L

Vol.55 No (Jan. 2014) saccess 6 saccess 7 saccess 2. [3] p.33 * B (A) (B) (C) (D) (E) (F) *1 [3], [4] Web PDF a m

( )

Vol. 42 No MUC-6 6) 90% 2) MUC-6 MET-1 7),8) 7 90% 1 MUC IREX-NE 9) 10),11) 1) MUCMET 12) IREX-NE 13) ARPA 1987 MUC 1992 TREC IREX-N

4. C i k = 2 k-means C 1 i, C 2 i 5. C i x i p [ f(θ i ; x) = (2π) p 2 Vi 1 2 exp (x µ ] i) t V 1 i (x µ i ) 2 BIC BIC = 2 log L( ˆθ i ; x i C i ) + q

IT,, i

1., 1 COOKPAD 2, Web.,,,,,,.,, [1]., 5.,, [2].,,.,.,, 5, [3].,,,.,, [4], 33,.,,.,,.. 2.,, 3.., 4., 5., ,. 1.,,., 2.,. 1,,

05_fuke.indd

untitled

,,,,., C Java,,.,,.,., ,,.,, i

IPSJ SIG Technical Report Vol.2011-MUS-91 No /7/ , 3 1 Design and Implementation on a System for Learning Songs by Presenting Musical St

IPSJ SIG Technical Report Vol.2009-HCI-134 No /7/17 1. RDB Wiki Wiki RDB SQL Wiki Wiki RDB Wiki RDB Wiki A Wiki System Enhanced by Visibl

第62巻 第1号 平成24年4月/石こうを用いた木材ペレット

149 (Newell [5]) Newell [5], [1], [1], [11] Li,Ryu, and Song [2], [11] Li,Ryu, and Song [2], [1] 1) 2) ( ) ( ) 3) T : 2 a : 3 a 1 :

IBM-Mode1 Q: A: cash money It is fine today 2

IPSJ SIG Technical Report Vol.2010-NL-199 No /11/ treebank ( ) KWIC /MeCab / Morphological and Dependency Structure Annotated Corp

fiš„v8.dvi

1: A/B/C/D Fig. 1 Modeling Based on Difference in Agitation Method artisoc[7] A D 2017 Information Processing

A Study of Effective Application of CG Multimedia Contents for Help of Understandings of the Working Principles of the Internal Combustion Engine (The

IPSJ SIG Technical Report Pitman-Yor 1 1 Pitman-Yor n-gram A proposal of the melody generation method using hierarchical pitman-yor language model Aki

Iteration 0 Iteration 1 1 Iteration 2 Iteration 3 N N N! N 1 MOPT(Merge Optimization) 3) MOPT MOP

Wikipedia YahooQA MAD 4)5) MAD Web 6) 3. YAMAHA 7) 8) Vocaloid PV YouTube 1 minato minato ussy 3D MAD F EDis ussy

The 18th Game Programming Workshop ,a) 1,b) 1,c) 2,d) 1,e) 1,f) Adapting One-Player Mahjong Players to Four-Player Mahjong

自然言語処理16_2_45

e-learning e e e e e-learning 2 Web e-leaning e 4 GP 4 e-learning e-learning e-learning e LMS LMS Internet Navigware

浜松医科大学紀要

日本感性工学会論文誌

2 ( ) i

<> <name> </name> <body> <></> <> <title> </title> <item> </item> <item> 11 </item> </>... </body> </> 1 XML Web XML HTML 1 name item 2 item item HTML

2) TA Hercules CAA 5 [6], [7] CAA BOSS [8] 2. C II C. ( 1 ) C. ( 2 ). ( 3 ) 100. ( 4 ) () HTML NFS Hercules ( )

1 4 4 [3] SNS 5 SNS , ,000 [2] c 2013 Information Processing Society of Japan

soturon.dvi

1 1 CodeDrummer CodeMusician CodeDrummer Fig. 1 Overview of proposal system c

& Vol.5 No (Oct. 2015) TV 1,2,a) , Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Ro

DEIM Forum 2009 B4-6, Str

Izard 10 [1]Plutchik 8 [2] [3] Izard Neviarouskaya [4][5] 2.2 Hao [6] 1 Twitter[a] a) Shook Wikipedia

258 5) GPS 1 GPS 6) GPS DP 7) 8) 10) GPS GPS ) GPS Global Positioning System


202

(a) 1 (b) 3. Gilbert Pernicka[2] Treibitz Schechner[3] Narasimhan [4] Kim [5] Nayar [6] [7][8][9] 2. X X X [10] [11] L L t L s L = L t + L s

Modal Phrase MP because but 2 IP Inflection Phrase IP as long as if IP 3 VP Verb Phrase VP while before [ MP MP [ IP IP [ VP VP ]]] [ MP [ IP [ VP ]]]

Optical Flow t t + δt 1 Motion Field 3 3 1) 2) 3) Lucas-Kanade 4) 1 t (x, y) I(x, y, t)

Fig. 3 3 Types considered when detecting pattern violations 9)12) 8)9) 2 5 methodx close C Java C Java 3 Java 1 JDT Core 7) ) S P S

2. Twitter Twitter 2.1 Twitter Twitter( ) Twitter Twitter ( 1 ) RT ReTweet RT ReTweet RT ( 2 ) URL Twitter Twitter 140 URL URL URL 140 URL URL

GPGPU

IPSJ SIG Technical Report Vol.2014-HCI-158 No /5/22 1,a) 2 2 3,b) Development of visualization technique expressing rainfall changing conditions

12) NP 2 MCI MCI 1 START Simple Triage And Rapid Treatment 3) START MCI c 2010 Information Processing Society of Japan

tikeya[at]shoin.ac.jp The Function of Quotation Form -tte as Sentence-final Particle Tomoko IKEYA Kobe Shoin Women s University Institute of Linguisti

IPSJ SIG Technical Report Vol.2009-DPS-141 No.23 Vol.2009-GN-73 No.23 Vol.2009-EIP-46 No /11/27 t-room t-room 2 Development of

自然言語処理21_249

3_39.dvi

IPSJ SIG Technical Report Vol.2009-DPS-141 No.20 Vol.2009-GN-73 No.20 Vol.2009-EIP-46 No /11/27 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Spe

2006 [3] Scratch Squeak PEN [4] PenFlowchart 2 3 PenFlowchart 4 PenFlowchart PEN xdncl PEN [5] PEN xdncl DNCL 1 1 [6] 1 PEN Fig. 1 The PEN

untitled

A B C B C ICT ICT ITC ICT

23 Study on Generation of Sudoku Problems with Fewer Clues

3. ( 1 ) Linear Congruential Generator:LCG 6) (Mersenne Twister:MT ), L 1 ( 2 ) 4 4 G (i,j) < G > < G 2 > < G > 2 g (ij) i= L j= N

IPSJ SIG Technical Report Vol.2016-CE-137 No /12/ e β /α α β β / α A judgment method of difficulty of task for a learner using simple

授受補助動詞の使用制限に与える敬語化の影響について : 「くださる」「いただく」を用いた感謝表現を中心に

,.,. NP,., ,.,,.,.,,, (PCA)...,,. Tipping and Bishop (1999) PCA. (PPCA)., (Ilin and Raiko, 2010). PPCA EM., , tatsukaw

( : A8TB2163)

Stepwise Chow Test * Chow Test Chow Test Stepwise Chow Test Stepwise Chow Test Stepwise Chow Test Riddell Riddell first step second step sub-step Step

01-._..

Web Hashtag Hashtag Twitter Hashtag Twitter Hashtag Hashtag Hashtag Twitter Hashtag Twitter Hashtag contexthashtag contexthashtag Hashtag contexthasht

01ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六七八九零壱弐02ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六七八九零壱弐03ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六七八九零壱弐04ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六七八九零壱弐05ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六七八九零壱弐06ⅢⅣⅤⅥⅦⅧⅨⅩ一二三四五六

Transcription:

2 Sequence Alignment as a Set Partitioning Problem Masaaki Nishino,JunSuzuki, Shunji Umetani, Tsutomu Hirao and Masaaki Nagata Sequence alignment, which involves aligning elements of two given sequences, occurs in many natural language processing (NLP) tasks such as sentence alignment. Previous approaches for solving sequence alignment problems in NLP can be categorized into two groups. The first group assumes monotonicity of alignments; the second group does not assume monotonicity or consider the continuity of alignments. However, for example, in aligning sentences of parallel legal documents, it is desirable to use a sentence alignment method that does not assume monotonicity but can consider continuity. Herein, we present a method to align sequences where block-wise changes in the order of sequence elements exist. Our method formalizes a sequence alignment problem as a set partitioning problem, a type of combinatorial optimization problem, and solves the problem to obtain an alignment. We also propose an efficient algorithm to solve the optimization problem by applying column generation. Key Words: Sequence Alignment, Combinatorial Optimization, Column Generation NTT, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Graduate School of Information Science and Technology, Osaka University

Vol. 23 No. 2 March 2016 1 2 DNA RNA (Moore 2002; Braune and Fraser 2010; Quan, Kit, and Song 2013) (Nie, Simard, Isabelle, and Durand 1999) (Qu and Liu 2012; 2015; 2012) F E F i f i E j e j F i +1 E j +1 (Qu and Liu 2012; 2015; 2012) 1 2 176

1 (Quan et al. 2013) Wikipedia f i e j f i+1 e j 1 F f i f i+1 E 2 (Quan et al. 2013) Bilingual Laws Information System (BLIS) 2 1 4 f i e j f i+1 e j f i e j f i+1 e j 2 http://www.legistlation.gov.hk 177

Vol. 23 No. 2 March 2016 2 BLIS F E (Korte and Vygen 2008) F E S S 1,...,S N S D {S 1,...,S N } D S S = Si DS i i j S i,s j D S i S j = 2 F, E S 1,...,S N S 1,...,S N F, E F, E 178

F, E 3 O( F 2 E 2 ) NP (Lübbecke and Desrosiers 2005) 2 (Gale and Church 1993) (Moore 2002; Braune and Fraser 2010) Deng (Deng, Kumar, and Byrne 2007) Deng Quan (Quan et al. 2013) 3 F, E 4 179

Vol. 23 No. 2 March 2016 Quan Quan Quan (Qu and Liu 2012) ( 2015) ( 2012) Brown (Brown, Petra, Pietra, and Mercer 1993) Wu (Wu 1997) (Brown et al. 1993) 3 F, E s F, t E S(s, t) 4 4 180

S(s, t) Moore (Moore 2002) s F s t E t S(s, t) ( S(s, t) = P (m mt )( s,m t ) m s ms ) (m s +1) m tr(t t j s i ) u(s i ) (1) j=1 i=1 m s, m t s, t s i, t j s i t j tr(t j s i ) s i t j u(s i ) s i P (m s,m t ) i=1 P (m s,m t )= exp ( m sr)(m s r) mt m t! (2) r (Brown et al. 1993) 4 (1) (2) (1) (2) 3 3 1 181

Vol. 23 No. 2 March 2016 3 3 2 F, E F E f i F i e k E k F i j f ij F 1 i j F e kl E E k l 1 k l E a ijkl (f ij,e kl ) f(a ijkl )=f ij, e(a ijkl )=e kl f ij e kl seqmatch(f ij,e kl ) f ij e kl X, A ijkl seqmatch(f ij,e kl ) seqmatch(f ij,e kl )= max X A ijkl (s,t) X S(s, t) (3) 4.1 seqmatch(f ij,e kl ) f ij e kl seqmatch(f ij,e kl ) a ijkl M M A M A a, a A f(a) f(a )= e(a) e(a )= a A f(a) =F a A e(a) =E 182

A A Â =argmax{score(a)} (4) A A Â score(a) F E A score(a) =λ K a A seqmatch(f(a),e(a)) (5) K A λ 0 <λ 1 λ =1 λ λ 1 (5) score(a) a (ILP) ILP maximize ijkl +logλ)y ijkl (6) ijkl(w subject to y ijkl =1 x :1 x F (7) i,j:i x j kl ij k,l:k x l y ijkl =1 x :1 x E (8) y ijkl {0, 1} i, j, k, l (9) w ijkl log seqmatch(f ij,e kl ) y ijkl a ijkl y ijkl =1 a ijkl (7) F x a ijkl 1 (8) E 2 F E 1 (f ij,e kl ) 183

Vol. 23 No. 2 March 2016 5 5 (6) (9) F E ( F +1)( E +1)/4 (Lübbecke and Desrosiers 2005) (6) (9) a ijkl {0, 1} 0 a ijkl 1 (Master problem: MP) M MP (Restricted master problem: RMP) RMP M M RMP RMP f n u n, e m v m RMP RMP RMP MP a ijkl M\M a ijkl 5 f 1 -e 4, f 2 -e 3, f 3 -e 2, f 4 -e 1 (3) seqmatch(f ij,e kl ) 184

j l w ijkl = w ijkl +logλ û n ˆv m (10) n=i m=k û n RMP u n ˆv m v m w ijkl RMP RMP MP MP RMP MP F E ( F +1)( E +1)/4 w ijkl 3 (10) û n,ˆv m Smith- Waterman (Smith and Waterman 1981) Smith-Waterman N, M O(NM) 6 q[j, l] f j,e l w ijkl (1 i j, 1 k l) q[j, l] log λ q[j 1,l 1] + S(f j,e l ) û j ˆv l q[j, l] =max (11) q[j 1,l]+S(f j ) û j q[j, l 1] + S(e l ) ˆv l q[0, 0] = log λ S(f j,e l ), S(f j ), S(e l ) (Moore 2002) f j e l f j E 6 185

Vol. 23 No. 2 March 2016 e l F log λ f j+1, e l+1 f j, e l 1 j F, 1 l E q[j, l] q[j, l] (11) q[j, l] x ijkl (11) 4 3 q[j 1,l 1], q[j 1,l], q[j, l 1] log λ q[j,l ] i = j +1 k = l +1 i k q[j, l] O( F E ) O( F + E ) Smith-Waterman 4 RMP M = {x 1 F,1 E } (line 1) x 1 F,1 E RMP RMP 7 (line 3) Smith-Waterman (line 4) (line 5) MP RMP (line 8) RMP MP 4 7 RMP RMP 186

6 6.1 25,000 2,500 K 60 K =1, 3, 6, 12, 20 K =1 60 40 20 K =3, 6, 12 2/3 60 40 20 K =3, 6, 12 1/3 Moore (Moore) (BM) (1) Moore(Moore 2002) (recall) (precision) F (F-measure) K 5 GIZA++(Och and Ney 2003) ILOG CPLEX λ λ =0.1 λ =0.01 2 187

Vol. 23 No. 2 March 2016 6.2 1, 2 3 SP + ILP SP + CG BM Moore Moore (Moore 2002) K =1 K λ =0.1 2 F K =1 Moore F 0.003 λ 1 F K =1 K =3 K =6 F F F SP+ILP (λ =0.1) 0.961 0.866 0.911 0.986 0.917 0.950 1.000 0.822 0.902 SP+CG (λ =0.1) 0.961 0.872 0.914 0.986 0.923 0.954 1.000 0.815 0.898 SP+ILP (λ =0.01) 0.961 0.872 0.914 0.986 0.923 0.954 0.804 0.831 0.817 SP+CG (λ =0.01) 0.961 0.872 0.914 0.928 0.908 0.918 0.833 0.843 0.838 BM 0.955 0.765 0.849 0.986 0.853 0.915 1.000 0.730 0.844 Moore 0.961 0.872 0.914 0.585 0.873 0.701 0.524 0.743 0.614 K =12 K =20 F F SP+ILP (λ =0.1) 0.990 0.764 0.862 0.986 0.731 0.840 SP+CG (λ =0.1) 0.991 0.769 0.866 0.986 0.743 0.847 SP+ILP (λ =0.01) 0.609 0.776 0.682 0.504 0.697 0.585 SP+CG (λ =0.01) 0.536 0.776 0.634 0.441 0.659 0.528 BM 0.990 0.694 0.816 0.986 0.623 0.763 Moore 0.506 0.776 0.613 0.400 0.694 0.508 2 F K =3 K =6 K =12 F F F SP+ILP (λ =0.1) 0.989 0.876 0.929 0.990 0.869 0.926 0.988 0.758 0.858 SP+CG (λ =0.1) 0.989 0.876 0.929 0.990 0.843 0.911 0.988 0.761 0.859 SP+ILP (λ =0.01) 0.989 0.883 0.933 0.879 0.896 0.887 0.721 0.790 0.754 SP+CG (λ =0.01) 0.989 0.899 0.942 0.801 0.872 0.835 0.703 0.806 0.751 BM 0.989 0.772 0.867 0.990 0.794 0.881 0.988 0.674 0.801 Moore 0.638 0.907 0.749 0.707 0.827 0.762 0.486 0.719 0.580 188

3 F K =3 K =6 K =12 F F F SP+ILP (λ =0.1) 1.000 0.810 0.895 0.990 0.760 0.860 0.990 0.799 0.884 SP+CG (λ =0.1) 1.000 0.792 0.884 0.990 0.760 0.860 0.990 0.791 0.879 SP+ILP (λ =0.01) 1.000 0.826 0.905 0.910 0.805 0.854 0.749 0.783 0.766 SP+CG (λ =0.01) 0.987 0.870 0.925 0.748 0.805 0.775 0.762 0.798 0.779 BM 1.000 0.703 0.825 0.978 0.673 0.797 0.990 0.678 0.805 Moore 0.684 0.885 0.772 0.655 0.796 0.719 0.484 0.685 0.567 K =1, 3 λ =0.01 λ =0.1 F K =6, 12, 20 λ =0.1 λ λ K λ =0.01 K λ =0.1 Moore F F 0.08 λ =0.1 0.03 1 F 4 CPLEX 30 400 100 5 10 6 10 3 189

Vol. 23 No. 2 March 2016 4 (a) (λ =0.1) k =1 K =3 K =6 K =12 K =20 SP+ILP 3.60 10 2 5.57 10 2 6.08 10 2 6.41 10 2 4.46 10 2 SP+CG 1.48 2.35 1.32 1.19 0.79 (b) (λ =0.01) k =1 K =3 K =6 K =12 K =20 SP+ILP 3.55 10 2 5.51 10 2 5.91 10 2 5.83 10 2 4.07 10 2 SP+CG 1.63 7.88 3.92 3.49 3.72 (c) (λ =0.1) K =3 K =6 K =12 SP+ILP 8.80 10 9.04 10 1.10 10 3 SP+CG 2.03 1.09 2.31 (e) (λ =0.1) K =3 K =6 K =12 SP+ILP 4.11 10 2 4.26 10 2 4.61 10 2 SP+CG 1.58 1.38 1.09 (d) (λ =0.01) K =3 K =6 K =12 SP+ILP 6.12 10 8.11 10 7.05 10 SP+CG 1.96 1.30 2.24 (f) (λ =0.01) K =3 K =6 K =12 SP+ILP 4.04 10 2 4.04 10 2 4.01 10 2 SP+CG 3.83 3.98 3.99 NP 5 3,600 190

5 (a) (λ =0.1) k =1 K =3 K =6 K =12 K =20 SP+ILP 3.35 10 6 3.35 10 6 3.35 10 6 3.35 10 6 3.35 10 6 SP+CG 9.39 10 2 1.02 10 3 8.31 10 2 7.38 10 2 7.00 10 2 (b) (λ =0.01) k =1 K =3 K =6 K =12 K =20 SP+ILP 3.35 10 6 3.35 10 6 3.35 10 6 3.35 10 6 3.35 10 6 SP+CG 1.02 10 3 1.50 10 3 1.10 10 3 1.12 10 3 1.21 10 3 (c) (λ =0.1) K =3 K =6 K =12 SP+ILP 1.50 10 6 1.50 10 6 1.50 10 6 SP+CG 7.14 10 2 7.18 10 2 5.90 10 2 (e) (λ =0.1) K =3 K =6 K =12 SP+ILP 3.35 10 6 3.35 10 6 3.35 10 6 SP+CG 9.30 10 2 8.89 10 2 7.65 10 2 (d) (λ =0.01) K =3 K =6 K =12 SP+ILP 1.50 10 6 1.50 10 6 1.50 10 6 SP+CG 9.24 10 2 8.35 10 2 9.09 10 2 (f) (λ =0.01) K =3 K =6 K =12 SP+ILP 3.35 10 6 3.35 10 6 3.35 10 6 SP+CG 1.27 10 3 1.17 10 3 1.10 10 3 10,000 1,000 100 10 1 0.1 0.01 0 100 200 300 400 500 600 5 191

Vol. 23 No. 2 March 2016 7 Braune, F. and Fraser, A. (2010). Improved Unsupervised Sentence Alignment for Symmetrical and Asymmetrical Parallel Corpora. In Proceedings of COLING 2010, pp. 81 89. Brown, P. F., Petra, S. A. D., Pietra, V. J. D., and Mercer, R. L. (1993). The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19 (2), pp. 263 311. Deng, Y., Kumar, S., and Byrne, W. (2007). Segmentation and Alignment of Parallel Text for Statistical Machine Translation. Natural Language Engineering, 13 (3), pp. 235 260. Gale, W. A. and Church, K. W. (1993). A Program for Aligning Sentences in Bilingual Corpora. Computational Linguistics, 19 (1), pp. 75 102. Korte, B. H. and Vygen, J. (2008). Combinatorial Optimization: Theory and Algorithms. Springer Verlag. Lübbecke, M. E. and Desrosiers, J. (2005). Selected Topics in Column Generation. Operations Research, 53 (6), pp. 1007 1023. Moore, R. C. (2002). Fast and Accurate Sentence Alignment of Bilingual Corpora. In Proceedings of AMTA 02, pp. 135 144. Nie, J.-Y., Simard, M., Isabelle, P., and Durand, R. (1999). Cross-language Information Retrieval Based on Parallel Texts and Automatic Mining of Parallel Texts from the Web. In Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 74 81. ACM. Och, F. J. and Ney, H. (2003). A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics, 29 (1), pp. 19 51. 192

Qu, Z. and Liu, Y. (2012). Sentence Dependency Tagging in Online Question Answering Forums. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 554 562, Jeju Island, Korea. Association for Computational Linguistics. Quan, X., Kit, C., and Song, Y. (2013). Non-Monotonic Sentence Alignment via Semisupervised Learning. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 622 630, Sofia, Bulgaria. Association for Computational Linguistics. Smith, T. F. and Waterman, M. S. (1981). Identification of Common Molecular Subsequences. Journal of Molecular Biology, 147, pp. 195 197. (2012).., 19 (3), pp. 193 212. (2015).., 22 (1), pp. 27 58. Wu, D. (1997). Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora. Computational Linguistics, 23 (3), pp. 377 403. 2008 ACL 1999 2001 2005 2008 2009 MIT CSAIL NTT ACL 1998 2002 INFORMS MOS AAAI 193

Vol. 23 No. 2 March 2016 1995 1997 NTT 2000 NTT ACL 1987 ACL 2015 9 2 2015 11 5 2015 12 3 194