© 2012 NTT Corporation. All rights reserved.
Noisy Channel Model
f: input (source), e: output (target)
ê = argmax_e p(e|f) = argmax_e p(f|e) p(e)
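The noisy-channel decision rule above can be sketched in a few lines. The candidate translations and the probabilities p(e) ("lm") and p(f|e) ("tm") below are invented purely for illustration:

```python
# Toy illustration of the noisy-channel decision rule:
# pick the target sentence e maximizing p(f|e) * p(e).
# All candidates and probabilities here are made up.

candidates = {
    "he is a student": {"lm": 0.020, "tm": 0.30},    # p(e), p(f|e)
    "he is student":   {"lm": 0.001, "tm": 0.40},
    "a student he is": {"lm": 0.0005, "tm": 0.35},
}

def decode(cands):
    # argmax_e p(f|e) * p(e)
    return max(cands, key=lambda e: cands[e]["tm"] * cands[e]["lm"])

best = decode(candidates)
# the fluent candidate wins: 0.30 * 0.020 = 0.006 beats 0.40 * 0.001
```

Note how the language model p(e) vetoes the higher-p(f|e) but disfluent candidate, which is exactly the division of labor the factorization is after.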
Translation Model p(f|e) (Brown+ 1990)
f = f1 f2 f3 f4 f5 f6 f7, e = "He is a high school student." (e0 e1 e2 e3 e4 e5, with e0 = NULL)
The alignment a_j links each f_j to a target word, e.g. a2 = 4, a3 = 0 (NULL); φ denotes fertility.
Component distributions: lexical translation p(fj|ei) and alignment p(aj|f, e).
Alignment p(aj|f, e): the IBM Models (Brown+ 1990)
Model 1: lexical translation only (alignment uniform)
Model 2: alignment depends on positions j, |f|, |e|
Model 3: Model 2 + fertility
Model 4: + relative distortion
Model 5: Model 4 + deficiency removed
HMM Model (Vogel+ 1996): Model 1 + first-order (HMM) alignment
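The EM training behind these models can be shown concretely for Model 1, whose E-step and M-step have closed forms. A minimal sketch on a made-up three-sentence parallel corpus (real training uses GIZA++ etc.):

```python
# Minimal IBM Model 1 EM training sketch: learns lexical translation
# probabilities t(f|e) from sentence pairs alone. Toy corpus invented.
from collections import defaultdict

corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]

# initialize t(f|e) uniformly over the source vocabulary
f_vocab = {f for fs, _ in corpus for f in fs}
t = defaultdict(lambda: 1.0 / len(f_vocab))

for _ in range(30):                      # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for fs, es in corpus:
        for f in fs:
            # E-step: distribute each f fractionally over the e words
            z = sum(t[(f, e)] for e in es)
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    # M-step: renormalize the expected counts
    for (f, e), c in count.items():
        t[(f, e)] = c / total[e]

# co-occurrence statistics resolve the ambiguity: t("haus"|"house") -> 1
```

Although every word in the first pair initially looks equally likely, "das" also co-occurs with "the" elsewhere, so EM gradually credits "house" with "haus".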
Word Alignment Tools
Train IBM/HMM models and extract the Viterbi alignment:
GIZA++ / MGIZA++ (multi-threaded GIZA++) / Chaski (Hadoop wrapper) / Berkeley Aligner
Phrase-Based Model (Koehn+ 2003)
"He is a high school student." is segmented into phrases (e.g. "He" / "is a" / "high school student"), and each phrase is translated as a unit with phrase translation probability p(f|e).
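The phrase pairs behind such a model are extracted from word alignments using a consistency check: a source span and a target span form a phrase pair only if no alignment link leaves the box they define. A minimal sketch with an invented Japanese-English toy pair (the full algorithm of Koehn+ 2003 also extends spans over unaligned boundary words, which this sketch skips):

```python
# Sketch of consistent phrase-pair extraction from a word alignment.
# Toy sentence pair and alignment links are invented for illustration.
f = ["kare", "wa", "gakusei", "da"]
e = ["he", "is", "a", "student"]
align = {(0, 0), (1, 1), (3, 1), (2, 3)}   # (f index, e index) links

def extract_phrases(f, e, align, max_len=4):
    phrases = set()
    for fs in range(len(f)):
        for fe in range(fs, min(fs + max_len, len(f))):
            # target positions linked to the source span [fs, fe]
            es_set = {j for (i, j) in align if fs <= i <= fe}
            if not es_set:
                continue
            es, ee = min(es_set), max(es_set)
            # consistency: no link from inside the e span to outside the f span
            if all(fs <= i <= fe for (i, j) in align if es <= j <= ee):
                phrases.add((" ".join(f[fs:fe + 1]), " ".join(e[es:ee + 1])))
    return phrases

pairs = extract_phrases(f, e, align)
# yields ("kare", "he"), ("gakusei", "student"),
# ("wa gakusei da", "is a student"), ("kare wa gakusei da", "he is a student")
```

Note that ("wa", "is") is *not* extracted: "is" is also linked to "da", so the box would cut an alignment link.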
Log-linear Model!
ê = argmax_e p(e|f) = argmax_e exp Σ_k w_k h_k(f, e)
Feature functions h_k include p(f|e), p(e|f), p(e), ...
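Since exp is monotone, maximizing exp Σ_k w_k h_k(f, e) is the same as maximizing the weighted feature sum itself. A sketch with invented feature values (log-probabilities) and weights:

```python
# Log-linear scoring sketch: score(e) = sum_k w_k * h_k(f, e).
# Feature values and weights below are invented for illustration.

weights = {"tm": 1.0, "lm": 0.8, "word_penalty": -0.2}

def score(features, w=weights):
    # argmax over exp(sum) equals argmax over the sum itself
    return sum(w[k] * v for k, v in features.items())

hyps = {
    "he is a student": {"tm": -1.2, "lm": -2.0, "word_penalty": 4},
    "he student":      {"tm": -0.9, "lm": -3.5, "word_penalty": 2},
}

best = max(hyps, key=lambda e: score(hyps[e]))
```

Unlike the noisy-channel factorization, arbitrary features can be mixed in, and the weights w_k are free parameters to be tuned (see the tuning slide below).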
Hierarchical Phrase-Based Model (Chiang 2007)
Synchronous CFG rules with gaps, e.g. S → X1 and X → ⟨ X1 is a X2 . , … ⟩,
applied to "He is a high school student." with X1 = "He", X2 = "high school student".
Syntax-Based Model (Galley+ 2004, GHKM)
Translation rules extracted from a parse of "He is a high school student.", e.g.:
S → NP VP P, VP → is NP, NP → a NP, NP → high school student
Tuning the Weights w_k!
ê = argmax_e p(e|f) = argmax_e exp Σ_k w_k h_k(f, e)
Minimum Error Rate Training [Och 2003]
Margin Infused Relaxed Algorithm (MIRA) [Watanabe+ 2007]
Pairwise Ranking Optimization [Hopkins+ 2011]
Decoding: Finding ê!
ê = argmax_e p(e|f) = argmax_e exp Σ_k w_k h_k(f, e)
Exhaustive search is intractable: for a source sentence of n words with m translation options each, the space grows on the order of m^n translation choices times n! reorderings.
Phrase-Based Decoding: Stack (Beam) Search [Koehn 2003]
Hypotheses are expanded left-to-right on the target side ("He" → "He is" → ...; also "He was", "His", ...) and grouped into stacks by the number of source words covered (=1, =2, =3, =4, =5); each stack is pruned to a beam.
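A heavily simplified sketch of this stack decoding, restricted to monotone order (no reordering) so the stacks index prefixes of the source. The source sentence, phrase table, and scores are all invented; the empty string stands in for untranslated particles:

```python
# Simplified monotone stack decoding sketch for phrase-based SMT.
# stacks[i] holds (score, partial translation) hypotheses covering the
# first i source words; each stack is pruned to a beam. Toy data invented.

source = ["kare", "wa", "gakusei", "da"]       # hypothetical source tokens
phrase_table = {                               # f-phrase -> [(e-phrase, log p)]
    ("kare",): [("he", -0.2)],
    ("wa",): [("is", -0.5), ("", -0.7)],       # "" = drop the particle
    ("kare", "wa"): [("he is", -0.4)],
    ("gakusei",): [("a student", -0.3)],
    ("da",): [("", -0.1)],
}

BEAM = 5
stacks = [[] for _ in range(len(source) + 1)]
stacks[0] = [(0.0, "")]

for i in range(len(source)):
    for score, out in stacks[i]:
        # extend the hypothesis with every phrase starting at position i
        for j in range(i + 1, len(source) + 1):
            f = tuple(source[i:j])
            for e, logp in phrase_table.get(f, []):
                stacks[j].append((score + logp, (out + " " + e).strip()))
    # prune every stack to the beam width
    for k in range(len(stacks)):
        stacks[k] = sorted(stacks[k], reverse=True)[:BEAM]

best_score, best = max(stacks[-1])
# best path uses the phrase pair ("kare wa", "he is"), total log score -0.8
```

The multi-word phrase ("kare wa" → "he is", -0.4) beats translating the two words separately (-0.2 + -0.5 = -0.7), which is precisely why phrase-based models memorize such chunks.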
Hierarchical/Syntax-Based Decoding: Chart Parsing [Chiang 2007, Zollmann 2006]
"He is a high school student." is parsed bottom-up with the synchronous grammar (rules such as S → NP VP P, VP → is NP, NP → a NP, NP → high school student), building target-side translations as the chart is filled.
Reordering alone is O(n!): the 8 words of "He lost his wallet in the airport yesterday." already allow 8! = 40,320 orderings.
Moses
Philipp Koehn (U. Edinburgh)
http://www.statmt.org/moses/
BLEU (Papineni+ 2002)
Proposed at IBM; the de-facto standard automatic metric, based on n-gram precision.
Reference: We are delighted to inform you that your paper has been accepted.
Output: We are sorry to inform you that your paper was not accepted.
1-gram: 10/13, 2-gram: 7/12, 3-gram: 4/11, 4-gram: 3/10
BLEU = (Π_n p_n)^(1/N) × min(1, length(output)/length(reference))
(the second factor is the brevity penalty)
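The formula above can be implemented directly for a single reference. This sketch uses the slide's simplified brevity penalty min(1, |output|/|reference|); standard corpus-level BLEU uses exp(1 - r/c) and sums counts over the whole test set:

```python
# Single-reference BLEU sketch, following the slide's formula:
# BLEU = (prod_n p_n)^(1/N) * min(1, len(output)/len(reference)).
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(output, reference, N=4):
    out, ref = output.split(), reference.split()
    precisions = []
    for n in range(1, N + 1):
        o, r = ngrams(out, n), ngrams(ref, n)
        match = sum(min(c, r[g]) for g, c in o.items())   # clipped counts
        precisions.append(match / max(1, sum(o.values())))
    prod = 1.0
    for p in precisions:
        prod *= p
    bp = min(1.0, len(out) / len(ref))   # simplified brevity penalty
    return (prod ** (1.0 / N)) * bp, precisions

ref = "We are delighted to inform you that your paper has been accepted ."
out = "We are sorry to inform you that your paper was not accepted ."
score, ps = bleu(out, ref)
# ps reproduces the slide: [10/13, 7/12, 4/11, 3/10]
```

On the slide's pair the precisions come out exactly as shown, and the overall score is about 0.47 even though the two sentences say opposite things, which is the standard criticism of pure n-gram matching.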
Other Metrics
WER (Word Error Rate), PER (Position-independent WER), TER (Translation Edit Rate), METEOR, ...
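Of these, WER is the simplest to make concrete: word-level Levenshtein distance divided by the reference length. A short sketch (example sentences invented):

```python
# WER sketch: word-level edit distance / reference length.
def wer(output, reference):
    o, r = output.split(), reference.split()
    # d[i][j] = edit distance between r[:i] and o[:j]
    d = [[0] * (len(o) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(o) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(o) + 1):
            sub = 0 if r[i - 1] == o[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + sub)   # substitution
    return d[-1][-1] / len(r)

print(wer("he is student", "he is a student"))  # 1 deletion / 4 words = 0.25
```

PER drops the ordering constraint (bag-of-words differences only), and TER additionally allows block shifts at unit cost, which suits translation better than strict WER.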
RIBES (Isozaki+ 2010, 2011)
Developed at NTT for distant language pairs.
Reference: My paper was rejected because I drunk so much today.
Output: I drunk so much today because my paper was rejected.
BLEU: 0.74, RIBES: 0.47
Based on Kendall's τ over word order:
RIBES = ((τ + 1)/2) × p^α × BP^β   (p: unigram precision, BP: brevity penalty; α = 0.25, β = 0.10)
GPLv2: http://www.kecl.ntt.co.jp/icl/lirg/ribes/index-j.html (search "RIBES NTT")
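A heavily simplified sketch of the idea: align words that occur exactly once in both sentences, take Kendall's τ of their reference positions, and combine with unigram precision and the brevity penalty. The real RIBES also disambiguates repeated words using surrounding context, which this sketch skips:

```python
# Simplified RIBES sketch: Kendall's tau of aligned word positions,
# combined as ((tau+1)/2) * p1^alpha * BP^beta. Only words occurring
# exactly once in both sentences are aligned (no context disambiguation).
from itertools import combinations

def ribes(output, reference, alpha=0.25, beta=0.10):
    o, r = output.lower().split(), reference.lower().split()
    # reference positions of unambiguously alignable hypothesis words
    pos = [r.index(w) for w in o if o.count(w) == 1 and r.count(w) == 1]
    pairs = list(combinations(pos, 2))
    if not pairs:
        return 0.0
    concordant = sum(1 for a, b in pairs if a < b)
    tau = 2.0 * concordant / len(pairs) - 1.0     # Kendall's tau
    nkt = (tau + 1.0) / 2.0                       # normalized to [0, 1]
    p1 = len(pos) / len(o)                        # unigram precision
    bp = min(1.0, len(o) / len(r))                # brevity penalty
    return nkt * (p1 ** alpha) * (bp ** beta)

s = ribes("I drunk so much today because my paper was rejected .",
          "My paper was rejected because I drunk so much today .")
print(round(s, 2))  # 0.47 -- the clause swap is punished, matching the slide
```

On the slide's example every word aligns (p1 = BP = 1), but the swapped clauses leave only 26 of 55 word pairs in reference order, so the score drops to 26/55 ≈ 0.47 where BLEU still reports 0.74.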
BLEU vs. RIBES
Correlation with human judgment (Spearman's ρ) on the three NTCIR-9 PatentMT subtasks (Goto+ 2011):

         (subtask 1)  (subtask 2)  (subtask 3)
BLEU     0.931        0.511        -0.029
RIBES    0.949        0.929        0.716

RIBES tracks human evaluation better than BLEU, especially for distant language pairs.
Further Reading...
Philipp Koehn, Statistical Machine Translation, Cambridge University Press, 2010 (IBM models: p.100–)
Conferences: ACL, NAACL, EACL, EMNLP, IJCNLP, AMTA, EAMT, MT Summit, ...
Journals: Computational Linguistics, Machine Translation, ACM TALIP, ...
References
P. F. Brown et al., A Statistical Approach to Machine Translation, Computational Linguistics, vol. 16, no. 2 (1990) S. Vogel et al., HMM-Based Word Alignment in Statistical Translation, Proc. COLING (1996) 33
P. Koehn et al., Statistical Phrase-Based Translation, Proc. NAACL (2003) M. Galley et al., What's in a translation rule?, Proc. NAACL (2004) D. Chiang, Hierarchical Phrase-Based Translation, Computational Linguistics, vol. 33, no. 2 (2007)
F. J. Och, Minimum Error Rate Training in Statistical Machine Translation, Proc. ACL (2003) T. Watanabe et al., Online Large Margin Training for Statistical Machine Translation, Proc. EMNLP (2007) M. Hopkins and J. May, Tuning as Ranking, Proc. EMNLP (2011) 35
K. Papineni et al., BLEU: a Method for Automatic Evaluation of Machine Translation, Proc. ACL (2002) H. Isozaki et al., Automatic Evaluation of Translation Quality for Distant Language Pairs, Proc. EMNLP (2010) … et al., RIBES: … (in Japanese, 2011) I. Goto et al., Overview of the Patent Machine Translation Task at the NTCIR-9 Workshop, Proc. NTCIR-9 (2011)