[1] B =b 1 b n P (S B) S S O = {o 1,2, o 1,3,, o 1,n, o 2,3,, o i,j,, o n 1,n } D = {d 1, d 2,, d n 1 } S = O, D o i,j 1 i

Similar documents
21 Pitman-Yor Pitman- Yor [7] n -gram W w n-gram G Pitman-Yor P Y (d, θ, G 0 ) (1) G P Y (d, θ, G 0 ) (1) Pitman-Yor d, θ, G 0 d 0 d 1 θ Pitman-Yor G

A Japanese Word Dependency Corpus ÆüËܸì¤Îñ¸ì·¸¤ê¼õ¤±¥³¡¼¥Ñ¥¹

Modal Phrase MP because but 2 IP Inflection Phrase IP as long as if IP 3 VP Verb Phrase VP while before [ MP MP [ IP IP [ VP VP ]]] [ MP [ IP [ VP ]]]

[4], [5] [6] [7] [7], [8] [9] 70 [3] 85 40% [10] Snowdon 50 [5] Kemper [3] 2.2 [11], [12], [13] [14] [15] [16]

IPSJ SIG Technical Report Vol.2010-NL-199 No /11/ treebank ( ) KWIC /MeCab / Morphological and Dependency Structure Annotated Corp

1 Fig. 1 Extraction of motion,.,,, 4,,, 3., 1, 2. 2.,. CHLAC,. 2.1,. (256 ).,., CHLAC. CHLAC, HLAC. 2.3 (HLAC ) r,.,. HLAC. N. 2 HLAC Fig. 2

IPSJ SIG Technical Report Vol.2011-EC-19 No /3/ ,.,., Peg-Scope Viewer,,.,,,,. Utilization of Watching Logs for Support of Multi-

Vol.54 No (July 2013) [9] [10] [11] [12], [13] 1 Fig. 1 Flowchart of the proposed system. c 2013 Information

1: A/B/C/D Fig. 1 Modeling Based on Difference in Agitation Method artisoc[7] A D 2017 Information Processing

IPSJ SIG Technical Report Vol.2009-DPS-141 No.20 Vol.2009-GN-73 No.20 Vol.2009-EIP-46 No /11/27 1. MIERUKEN 1 2 MIERUKEN MIERUKEN MIERUKEN: Spe

¥ì¥·¥Ô¤Î¸À¸ì½èÍý¤Î¸½¾õ

[2] OCR [3], [4] [5] [6] [4], [7] [8], [9] 1 [10] Fig. 1 Current arrangement and size of ruby. 2 Fig. 2 Typography combined with printing

Input image Initialize variables Loop for period of oscillation Update height map Make shade image Change property of image Output image Change time L

258 5) GPS 1 GPS 6) GPS DP 7) 8) 10) GPS GPS ) GPS Global Positioning System


自然言語処理16_2_45

The copyright of this material is retained by the Information Processing Society of Japan (IPSJ). The material has been made available on the website

ID 3) 9 4) 5) ID 2 ID 2 ID 2 Bluetooth ID 2 SRCid1 DSTid2 2 id1 id2 ID SRC DST SRC 2 2 ID 2 2 QR 6) 8) 6) QR QR QR QR

IPSJ SIG Technical Report Pitman-Yor 1 1 Pitman-Yor n-gram A proposal of the melody generation method using hierarchical pitman-yor language model Aki

main.dvi

1 UD Fig. 1 Concept of UD tourist information system. 1 ()KDDI UD 7) ) UD c 2010 Information Processing S

9_18.dvi

Vol. 43 No. 7 July 2002 ATR-MATRIX,,, ATR ITL ATR-MATRIX ATR-MATRIX 90% ATR-MATRIX Development and Evaluation of ATR-MATRIX Speech Translation System

第62巻 第1号 平成24年4月/石こうを用いた木材ペレット

& Vol.5 No (Oct. 2015) TV 1,2,a) , Augmented TV TV AR Augmented Reality 3DCG TV Estimation of TV Screen Position and Ro

The Plasma Boundary of Magnetic Fusion Devices

IPSJ SIG Technical Report Vol.2014-DBS-159 No.6 Vol.2014-IFAT-115 No /8/1 1,a) 1 1 1,, 1. ([1]) ([2], [3]) A B 1 ([4]) 1 Graduate School of Info

(a) (b) (c) Canny (d) 1 ( x α, y α ) 3 (x α, y α ) (a) A 2 + B 2 + C 2 + D 2 + E 2 + F 2 = 1 (3) u ξ α u (A, B, C, D, E, F ) (4) ξ α (x 2 α, 2x α y α,

Vol. 48 No. 4 Apr LAN TCP/IP LAN TCP/IP 1 PC TCP/IP 1 PC User-mode Linux 12 Development of a System to Visualize Computer Network Behavior for L

JFE.dvi

(2008) JUMAN *1 (, 2000) google MeCab *2 KH coder TinyTextMiner KNP(, 2000) google cabocha(, 2001) JUMAN MeCab *1 *2 h

untitled

Studies of Foot Form for Footwear Design (Part 9) : Characteristics of the Foot Form of Young and Elder Women Based on their Sizes of Ball Joint Girth

Vol.55 No (Jan. 2014) saccess 6 saccess 7 saccess 2. [3] p.33 * B (A) (B) (C) (D) (E) (F) *1 [3], [4] Web PDF a m

1 Web [2] Web [3] [4] [5], [6] [7] [8] S.W. [9] 3. MeetingShelf Web MeetingShelf MeetingShelf (1) (2) (3) (4) (5) Web MeetingShelf

untitled

xx/xx Vol. Jxx A No. xx 1 Fig. 1 PAL(Panoramic Annular Lens) PAL(Panoramic Annular Lens) PAL (2) PAL PAL 2 PAL 3 2 PAL 1 PAL 3 PAL PAL 2. 1 PAL

17 Proposal of an Algorithm of Image Extraction and Research on Improvement of a Man-machine Interface of Food Intake Measuring System


<> <name> </name> <body> <></> <> <title> </title> <item> </item> <item> 11 </item> </>... </body> </> 1 XML Web XML HTML 1 name item 2 item item HTML

A Feasibility Study of Direct-Mapping-Type Parallel Processing Method to Solve Linear Equations in Load Flow Calculations Hiroaki Inayoshi, Non-member

Table 1. Reluctance equalization design. Fig. 2. Voltage vector of LSynRM. Fig. 4. Analytical model. Table 2. Specifications of analytical models. Fig

Study on Throw Accuracy for Baseball Pitching Machine with Roller (Study of Seam of Ball and Roller) Shinobu SAKAI*5, Juhachi ODA, Kengo KAWATA and Yu

2). 3) 4) 1.2 NICTNICT DCRA Dihedral Corner Reflector micro-arraysdcra DCRA DCRA DCRA 3D DCRA PC USB PC PC ON / OFF Velleman K8055 K8055 K8055

(MIRU2008) HOG Histograms of Oriented Gradients (HOG)

( ) [1] [4] ( ) 2. [5] [6] Piano Tutor[7] [1], [2], [8], [9] Radiobaton[10] Two Finger Piano[11] Coloring-in Piano[12] ism[13] MIDI MIDI 1 Fig. 1 Syst

Table 1. Assumed performance of a water electrol ysis plant. Fig. 1. Structure of a proposed power generation system utilizing waste heat from factori

BOK body of knowledge, BOK BOK BOK 1 CC2001 computing curricula 2001 [1] BOK IT BOK 2008 ITBOK [2] social infomatics SI BOK BOK BOK WikiBOK BO

【資料4】CEFR準拠の英語到達基準JS「ジャパン・スタンダード」の策定

HASC2012corpus HASC Challenge 2010,2011 HASC2011corpus( 116, 4898), HASC2012corpus( 136, 7668) HASC2012corpus HASC2012corpus

( ) ( ) Modified on 2009/05/24, 2008/09/17, 15, 12, 11, 10, 09 Created on 2008/07/02 1 1) ( ) ( ) (exgen Excel VBA ) 2)3) 1.1 ( ) ( ) : : (1) ( ) ( )

Transcription:

1,a) 2,b) 3,c) 1,d) CYK 552 1. 2 ( 1 ) ( 2 ) 1 [1] 2 [2] [3] 1 Graduate School of Information Science, Nagoya University, Japan 2 Information Technology Center, Nagoya University, Japan 3 Information & Communications, Nagoya University, Japan a) yoshida@db.ss.is.nagoya-u.ac.jp b) ohno@nagoya-u.jp c) yoshihide@icts.nagoya-u.ac.jp d) matubara@nagoya-u.jp [4], [5], [6], [7], [8] [9], [10], [11], [12], [13] 552 2 c 2015 Information Processing Society of Japan 1

2 3 4 5 2. [1] 1 1 2 1 2 3. 3.1 B =b 1 b n P (S B) S S O = {o 1,2, o 1,3,, o 1,n, o 2,3,, o i,j,, o n 1,n } D = {d 1, d 2,, d n 1 } S = O, D o i,j 1 i < j n 2 b i b j b i o i,j = 1 o i,j = 0 d i b i S = O, D P (S B) (1) 式 (2) 低 入力 語順整序 係り受け解析 出力 1 入力文の語順の適切さ 入力 式 (1) 出力 (1) (3) 入力 係り受け解析 語順整序 出力 Fig. 1 Relationships among formulas (1) to (3). P (S B) = P (O, D B) 高 式 (3) = {P (O B) P (D O, B)} α (1) {P (D B) P (O D, B)} 1 α (0 α 1) (1) (2) (3) *1 α (0 α 1) P (O, D B) = P (O B) P (D O, B) (2) P (O, D B) = P (D B) P (O D, B) (3) (2) (3) P (O, D B) 1 (1) (3) S = O, D (2) P (O, D B) (3) (2) (3) (1) P (O, D B) α (1) (3) y = a x 0 a 1 0 x 1 (1) α (2) (1) 2 *1 c 2015 Information Processing Society of Japan 2

o i,j 2 d i *2 P (O B) = n 1 n i=1 j=i+1 P (o i,j B) (4) n 1 P (D O, B) = P (d i O, B) (5) i=1 n 1 P (D B) = P (d i B) (6) i=1 n 1 P (O D, B) = n i=1 j=i+1 P (o i,j D, B) (7) P (o i,j B) B b i b j o i,j P (d i O, B) B O b i d i P (d i B) B b i d i P (o i,j D, B) D B b i b j o i,j P (o i,j D, B) [2] P (o i,j B) P (o i,j D, B) P (d i O, B) [14] P (d i B) P (d i O, B) 3.2 B O D S O D CYK P (O, D B) O D ( 1 ) *2 o i,j d i O D [14] ( 2 ) ( 3 ) *3 (1) (3) D (2) (3) O CYK CYK 3.2.1 2 n n n M i,j (1 i j n) 4 i j M i,j B i,j = b i b j, O i,j D i,j argmax Si,j P (S i,j B i,j ) S i,j d x (i x j) S 1,3 = d 3 2 d3 1 d0 3 b 2 1 b 1 2 b 3 {d 3 1, d3 2, d0 3 } d y x b x b y d 0 j b j 4 6 M i,i d 0 i 7 15 M i,i M i,j M i,j 10 12 concatreorder M i,k M k+1,j C i,j 13 b i b j argmax Si,j C i,j P (S i,j B i,j ) M i,j *3 2 2 c 2015 Information Processing Society of Japan 3

1: input B 1,n := b 1 b n // 2: set M i,j := null (1 i j n) // 3: set C i,j := null (1 i j n) // 4: for i := 1 to n do 5: M i,i := d 0 i 6: end for 7: for d := 1 to n 1 do 8: for i := 1 to n d do 9: j := i + d 10: for k := i to j 1 do 11: C i,j := C i,j concatreorder(m i,k, M k+1,j ) 12: end for 13: M i,j := argmax Si,j C i,j P (S i,j B i,j ) 14: end for 15: end for 16: return M 1,n Fig. 2 2 Word reordering algorithm. 1: function concatreorder(s 1, S 2 ) 2: begin 3: set C := null // 4: set z := last(s 2 ) // last(s 2 ) S 2 5: 6: // 7: // 8: // S 1 b z S 1 9: S 1 := changelastdep(z, S 1) 10: // 11: C := { S 1 S 2 } 12: 13: // 14: for each d {d y x y = z, d y x S 2 } do 15: // S 2 d S2 l Sr 2 16: (S2 l, Sr 2 ) := divide(d, S 2) 17: C := C { S2 l S 1 Sr 2 } 18: end for 19: 20: return C 21: end 3 Fig. 3 concatreorder Function: concatreorder. M 1,n 16 M 1,n 3 concatreorder concatreorder 2 S 1 (= M i,k ) S 2 (= M k+1,j ) M i,j C 2 9 S 1 S 2 changelastdep S 1 S 2 b z S 1 S 1 = d i d i+1 d k 1 d 0 k changelastdep(s 1, z) d i d i+1 d k 1 d z k 11 2 i j : 1 2 3 2 1 3 2 2 3 : 文節 i から j までの部分文節列に対する語順と係り受けの最尤構造 文節 i が j より前に位置し,i が j に係ることを表す. 例えば, 2 1 3 は, 意味する. Fig. 4 4 3 3 4 4 を 下記の構造候補の中でが最大となる構造をに書き込む とから生成される候補 連結プロセスにおいて生成 候補 1: 2 3 4 語順変更プロセスにおいて生成 候補 2: 3 2 4 とから生成される候補 連結プロセスにおいて生成 候補 3: 2 3 4 語順変更プロセスにおいて生成 候補は生成されない Execution example of our search algorithm. S 1 S 2 S 1 S 2 S 1 S 2 M i,j C 0 ( 1 ) ( 2 ) S 1 S 2 14 18 14 S 2 S 2 16 17 S 1 17 S2 l S 1 Sr 2 C S 2 S 2 2 concatreorder concat CYK concat concatreorder 3 13 18 3.2.2 4 n = 4 4 4 4 M 1,1, M 2,2, M 3,3, M 4,4, M 1,2, M 2,3, M 3,4, M 1,3 M 2,4 4 M 2,4 2 10 12 concatreorder 3 concatreorder(m 2,2, M 3,4 ) c 2015 Information Processing Society of Japan 4

M 2,2 d 0 2 d4 2 M 2,2 M 3,4 1 京大テキストコーパス concatreorder(m 2,2, M 3,4 ) M 2,2 b4 3 2 concatreorder(m 2,3, M 4,4 ) (1) 1 文の取り出し 新聞記事文 3 concatreorder(m 2,3, M 4,4 ) 3 3 P (S 2,4 B) = P (O 2,4, D 2,4 B 2,4 ) M 2,4 No Yes (4) 他に異なる語順があるか? No (2) ランダムな語順変更 (3) 母語話者が書きそうか? Yes 語順変更後の読みにくい文 作業者が判断 4. 採用 Fig. 5 新聞記事文 : 評価用データ 5 Construction procedure of evaluation data. 4.1 [15] Version 2 5 ( 1 ) 1 * 4 3.2 (1) (3) *4 消費税増税が国会を通った後でも 世論調査では 増税反対が まず, 大勢だ に係る 3 つの部分構造 ( 青太線枠 ) の順序を変更. 次に, 通った に係る 2 つの部分構造 ( 赤点線枠 ) の順序を変更. 語順変更後の読みにくい文 : 大勢だ 増税反対が世論調査では国会を消費税増税が通った後でも大勢だ Fig. 6 6 Example of random reordering. ( 2 ) 3.2 (1) (3) 6 ( 3 ) (2) 1 Yes (1) No (4) [1] ( 4 ) (2) Yes (2) c 2015 Information Processing Society of Japan 5

1 Table 1 Size of evaluation data. 2 Table 2 Experimental results for each set. 552 4,906 No (1) 1995 1 9 1 4.2 (1) α 552 5 552 5 4 α α 1 5 α 2 α 0.1 α α 0.1 0.01 α α (4) (7) 7 1 1 3 8 7,976 [16] [2] 2 2 2 1 [14] [2] 2 CaboCha[17] [2] 552 1 P (o i,j D, B) α 2 1 0.66 83.02% (3,790/4,565) 32.73% (36/110) 2 0.75 83.83% (4,308/5,139) 29.09% (32/110) 3 0.66 83.30% (3,862/4,636) 32.73% (36/110) 4 0.66 81.89% (3,496/4,269) 28.83% (32/111) 5 0.75 86.91% (4,018/4,623) 31.53% (35/111) 3 Table 3 Experimental results (word reordering). 2 83.82% (19,474/23,232) 30.98% (171/552) 1 82.39% (19,140/23,232) 26.99% (149/552) 2 83.35% (19,365/23,232) 26.63% (147/552) 76.78% (17,838/23,232) 0% (0/552) 1 P (d i O, B) 2 CaboCha 4.3 α 2 1 α 2 4 4 2 α α 0.66 0.75 3 5 2 *5 (p < 0.05) 2 2 (p > 0.05) 1 (p < 0.05) 1 7 *5 c 2015 Information Processing Society of Japan 6

例 1 入力文 麻薬売買にも手を越境前の生活費を得るため出す 語順整序結果 越境前の生活費を得るため麻薬売買にも手を出す 例 2 入力文 生産量は少なくなるが高品質のブドウをブドウの樹液を濃縮し収穫するためだ 語順整序結果 生産量は少なくなるがブドウの樹液を濃縮し高品質のブドウを収穫するためだ 例 3 入力文 コペンハーゲンからの報道によるとチェチェン紛争への軍事介入を理由に七日デンマーク政府はロシアとの昨年九月に結んだ軍事協力協定を一時的に凍結した 語順整序結果 コペンハーゲンからの報道によるとデンマーク政府は七日チェチェン紛争への軍事介入を理由に昨年九月に結んだロシアとの軍事協力協定を一時的に凍結した 7 Fig. 7 Examples of sentences correctly reordered by our 4 Table 4 method. 2 Agreement rates calculated separately for the case that the word order between two bunsetsus in the input sentence is appropriate and the case that the order is inappropriate. R F T R T T 66.30% (3,576/5,394) 88.77% (15,834/17,838) 1 52.65% (2,840/5,394) 91.38% (16,300/17,838) 2 51.67% (2,787/5,394) 92.94% (16,578/17,838) 4.4 2 2 4 552 2 2 R F T 2 R T T 2 R F T 2 7 2 入力文 亡くなる前に演出プランが出来ていたため演出が飯沢プラン通り行われた 本手法の出力結果 演出プランが出来ていたため飯沢プラン通り亡くなる前に演出が行われた 正解文 亡くなる前に演出プランが出来ていたため飯沢プラン通り演出が行われた 8 Fig. 8 Example of sentences incorrectly reordered by our method. 2 2 2 8 2 8 2 8 2 2 CaboCha 8 α 2 (3) α α 0.66 0.83 *6 *6 α 0.01 α c 2015 Information Processing Society of Japan 7

Table 5 5 Experimental results (dependency parsing). 83.39% (3,631/4,354) 40.04% (221/552) 1 84.75% (3,690/4,354) 36.78% (203/552) 2 86.08% (3,748/4,354) 37.50% (207/552) α 5. 5.1 4 [14] 5 5 p > 0.05 p < 0.05 5.2 α 4 (1) α α α 552 2 α 4 552 α 4.2 6 α = 0.66 2 2 Table 6 6 α Word reordering results by our method using the optimal α. 2 (α = 0.66) 84.12% (19,542/23,232) 31.16% (172/552) 3 p < 0.05 α 2 2 α 1 2 α α 5.3 α (1) α 3.1 (2) (3) (2) (1) α 4 α 4 ( 1 ) 4.1 552 ( 2 ) 4.1 552 ( 3 ) 4.1 (2) (1) 552 ( 4 ) (3) 5.2 2 α c 2015 Information Processing Society of Japan 8

7 4 Table 7 Experimental results on four kinds of data which have different word order readability. 2 α 2 100% (23,232/23,232) 100% (552/552) 0.92 86.51% (20,098/23,232) 40.40% (223/552) 76.78% (17,838/23,232) 0% (0/552) 0.66 84.12% (19,542/23,232) 31.16% (172/552) 74.46% (17,298/23,232) 18.12% (100/552) 0.52 82.58% (19,184/23,232) 32.07% (177/552) 60.40% (14,032/23,232) 0% (0/552) 0.41 80.66% (18,671/23,232) 24.46% (135/552) 7 2 3 4 2 α 5 6 7 α 3 6. CYK No.24650066 (B) No.25730134 [1] 7 (2009). [2] Vol. 7, No. 4, pp. 163 180 (2000). [3] Vol. 45, No. 5, pp. 1451 1459 (2004). [4] Filippova, K. and Strube., M.: Generating constituent order in German clauses, In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (ACL2007), pp. 320 327 (2007). [5] Harbusch, K., Kempen, G., van Breugel, C. and Koch, U.: A generation-oriented workbench for performance grammar: Capturing linear order variability in German and Dutch, In Proceedings of the 4th International Natural Language Generation Conference (INLG2006), pp. 9 11 (2006). [6] Kruijff, G. M., Kruijff-Korbayová, I., Bateman, J. and Teich, E.: Linear order as higher-level decision: Information structure in strategic and tactical generation, In Proceedings of the 8th European Workshop on Natural Language Generation (ENLG2001), pp. 74 83 (2001). [7] Ringger, E., Gamon, M., Moore, R. C., Rojas, D., Smets, M. and Corston-Oliver, S.: Linguistically informed statistical models of constituent structure for ordering in sentence realization, In Proceedings of the 20th International Conference on Computational Linguistics (COL- ING2004), pp. 673 679 (2004). [8] Shaw, J. and Hatzivassiloglou, V.: Ordering among premodifiers, In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL 99), pp. 135 143 (1999). [9] Goto, I., Utiyama, M. and Sumita, E.: Post-ordering by parsing for Japanese-English statistical machine translation, In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (ACL2012), pp. 311 316 (2012). [10] Elming, J.: Syntactic reordering integrated with phrasebased SMT, In Proceedings of the 22nd International Conference on Computational Linguistics (COL- ING2008), pp. 209 216 (2008). [11] Ge, N.: A direct syntax-driven reordering model for phrase-based machine translation, In Proceedings of Human Language Technologies: The 11th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT2010), pp. 849 857 (2010). [12] Christoph, T. and Hermann, N.: Word reordering and a dynamic programming beam search algorithm for statistical machine translation, Computational Linguistics, Vol. 29, No. 1, pp. 97 133 (2003). [13] Nizar, H.: Syntactic preprocessing for statistical machine translation, In Proceedings of the 11th Machine Translation Summit (MT SUMMIT XI), pp. 215 222 (2007). [14] Vol. 40, No. 9, pp. 3397 3407 (1999). [15] 3 pp. 115 118 (1997). [16] Zhang, L.: Maximum entropy modeling toolkit for c 2015 Information Processing Society of Japan 9

Python and C++, http://homepages.inf.ed.ac.uk/ s0450736/maxent_toolkit.html (2008). [Online; accessed 1-March-2008]. [17] Vol. 43, No. 6, pp. 1834 1842 (2002). c 2015 Information Processing Society of Japan 10