Transcription:

NTT 2012 NTT Corporation. All rights reserved.


Noisy Channel
f: source sentence, e: target sentence
$$\hat{e} = \operatorname*{argmax}_{e} p(e \mid f) = \operatorname*{argmax}_{e} p(f \mid e)\, p(e)$$
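The second equality in the noisy-channel formula follows from Bayes' rule. The short derivation below is a reading aid rather than part of the original slide; it only uses the symbols already defined there (f, e, p).

```latex
\begin{align}
\hat{e} &= \operatorname*{argmax}_{e} \; p(e \mid f) \\
        &= \operatorname*{argmax}_{e} \; \frac{p(f \mid e)\, p(e)}{p(f)} \\
        &= \operatorname*{argmax}_{e} \; p(f \mid e)\, p(e)
\end{align}
```

The last step holds because p(f) does not depend on e, so dropping it does not change which e attains the maximum.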

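Translation model p(f | e) (Brown+ 1990)
[Figure: word alignment between the source words f1 ... f7 and the target words e0 ... e5 of "He is a high school student.", where e0 = φ is the empty word; example links: a3 = 0, a2 = 4]
Word translation probability: p(f_j | e_i)
Alignment probability: p(a_j | f, e)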

Alignment probability p(a_j | f, e)
IBM Model (Brown+ 1990)
Model 1:
Model 2: j, f, e
Model 3: Model 2 +
Model 4:
Model 5: Model 4 +
HMM Model (Vogel+ 1996): Model 1 +
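As an illustration of how the simplest of these models is trained, the following is a minimal sketch of EM for IBM Model 1, which estimates word translation probabilities p(f | e) from sentence pairs while treating every alignment position as equally likely. The function name `train_ibm_model1`, the `<null>` token and the three-sentence German-English toy corpus are assumptions made for this example; real systems use tools such as GIZA++.

```python
from collections import defaultdict

NULL = "<null>"  # hypothetical token standing for the empty word e0

def train_ibm_model1(bitext, iterations=10):
    """EM for IBM Model 1.

    bitext: list of (source_words, target_words) pairs.
    Returns a dict t with t[(f, e)] approximating p(f | e).
    """
    src_vocab = {f for fs, _ in bitext for f in fs}
    uniform = 1.0 / len(src_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e
        for fs, es in bitext:
            es = [NULL] + list(es)
            for f in fs:
                # E-step: posterior over which e generated f
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise expected counts for each target word e
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return dict(t)

# Toy data, invented for this sketch.
toy_bitext = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]
t = train_ibm_model1(toy_bitext)
print(round(t[("Haus", "house")], 3), round(t[("das", "house")], 3))
```

Because "das" also co-occurs with "the" in a second sentence, EM gradually shifts the probability mass of "house" towards "Haus", even though the first sentence alone is ambiguous.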

Word alignment
IBM/HMM Models, Viterbi alignment
GIZA++
MGIZA++ (multi-threaded GIZA++)
Chaski (Hadoop wrapper)
Berkeley Aligner

Phrase-based translation (Koehn+ 2003)
[Figure: phrases taken from "He is a high school student.", e.g. "He", "high school student", "is a high school student"]
Phrase translation probability: p(f | e)

Log-linear model
$$\hat{e} = \operatorname*{argmax}_{e} p(e \mid f) = \operatorname*{argmax}_{e} \exp \sum_{k} w_k h_k(f, e)$$
Feature functions h_k: p(f | e), p(e | f), p(e)
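To make the log-linear formula concrete, here is a minimal sketch that scores a few candidate translations with a weighted sum of feature values and returns the best one. The feature names, weights and probabilities are invented for illustration; a real decoder searches a far larger candidate space instead of a fixed list.

```python
import math

def loglinear_score(features, weights):
    """Weighted feature sum: sum_k w_k * h_k(f, e)."""
    return sum(weights[name] * value for name, value in features.items())

def best_translation(candidates, weights):
    """argmax over candidates. exp() is monotone, so it can be skipped."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))

# Hypothetical weights and feature values (log-probabilities).
weights = {"log_p_f_given_e": 1.0, "log_p_e_given_f": 1.0, "log_p_e": 0.5}
candidates = [
    {"text": "He is a high school student .",
     "features": {"log_p_f_given_e": math.log(0.20),
                  "log_p_e_given_f": math.log(0.30),
                  "log_p_e": math.log(0.01)}},
    {"text": "He high school student is .",
     "features": {"log_p_f_given_e": math.log(0.25),
                  "log_p_e_given_f": math.log(0.30),
                  "log_p_e": math.log(0.0001)}},
]
best = best_translation(candidates, weights)
print(best["text"])
```

With weights of 1 on log p(f | e) and log p(e) and 0 elsewhere, this reduces to the noisy-channel model; the log-linear form simply lets each component be weighted separately.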

Hierarchical phrase-based translation (Chiang 2007)
[Figure: synchronous rules with the nonterminals S and X applied to "He is a high school student.", e.g. rules of the form "X1 is X2 ." and "X1 is a X2 ." with X covering "He" and "high school student"]

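Translation rule extraction (Galley+ 2004, GHKM)
[Figure: parse tree of "He is a high school student." with the nodes S, NP, VP and P, and tree fragments used as translation rules, e.g. VP covering "is NP" and NP covering "a NP" and "high school student"]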

$$\hat{e} = \operatorname*{argmax}_{e} p(e \mid f) = \operatorname*{argmax}_{e} \exp \sum_{k} w_k h_k(f, e)$$
Methods for tuning the weights w_k:
Minimum Error Rate Training [Och 2003]
Margin Infused Relaxed Algorithm [Watanabe+ 2007]
Pairwise Ranking Optimization [Hopkins+ 2011]


$$\hat{e} = \operatorname*{argmax}_{e} p(e \mid f) = \operatorname*{argmax}_{e} \exp \sum_{k} w_k h_k(f, e)$$
O(n m n!)

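Phrase-based decoding [Koehn 2003]
[Figure: partial hypotheses such as "He", "He is", "He was" and "His" are extended step by step and grouped into stacks numbered 1 to 5]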

Tree-based decoding [Chiang 2007, Zollmann 2006]
[Figure: derivation tree for "He is a high school student." built from rules over the nodes S, NP, VP and P]

... O(n!) ...
He lost his wallet in the airport yesterday.
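To give a feeling for the O(n!) term, the sketch below counts the possible word orders of the example sentence. It is only an illustration of the growth rate under the assumption that every permutation of the n words is allowed; it is not how a decoder enumerates hypotheses.

```python
import math
from itertools import permutations

sentence = "He lost his wallet in the airport yesterday".split()
n = len(sentence)                  # 8 words
num_orders = math.factorial(n)     # number of distinct word orders

# Sanity check on a small case: explicit enumeration agrees with n!.
small_check = len(list(permutations(range(4))))

print(n, num_orders)
for length in (5, 10, 20):
    print(length, math.factorial(length))
```

Even for this eight-word sentence there are 40320 orderings, which is why practical decoders restrict reordering and prune hypotheses.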

Moses
Philipp Koehn (U. Edinburgh)
http://www.statmt.org/moses/



BLEU (IBM; Papineni+ 2002): the de-facto standard, based on n-gram precision
Example:
We are delighted to inform you that your paper has been accepted.
We are sorry to inform you that your paper was not accepted.
1-gram: 10/13, 2-gram: 7/12, 3-gram: 4/11, 4-gram: 3/10
$$\mathrm{BLEU} = \sqrt[4]{\prod_{n=1}^{4} p_n} \cdot \min\left(1, \frac{\mathrm{length(output)}}{\mathrm{length(reference)}}\right)$$
p_n: n-gram precision; the min(...) factor is the brevity penalty
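The n-gram counts in the BLEU example can be checked with a few lines of code. This is a minimal sentence-level sketch with a single reference; the tokenization (splitting off the final period) and the function names are assumptions, and the brevity penalty uses the simple length ratio that appears on the slide, whereas Papineni et al. define it as exp(1 - r/c) for outputs shorter than the reference and compute BLEU over a whole corpus.

```python
from collections import Counter

def tokenize(sentence):
    """Split on whitespace, treating the period as its own token."""
    return sentence.replace(".", " .").split()

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_precision(output, reference, n):
    """Return (matched, total) clipped n-gram counts for the output."""
    out, ref = ngrams(output, n), ngrams(reference, n)
    matched = sum((out & ref).values())  # clip by the reference counts
    return matched, sum(out.values())

def bleu(output, reference, max_n=4):
    product = 1.0
    for n in range(1, max_n + 1):
        matched, total = ngram_precision(output, reference, n)
        if matched == 0:
            return 0.0
        product *= matched / total
    brevity_penalty = min(1.0, len(output) / len(reference))
    return product ** (1.0 / max_n) * brevity_penalty

a = tokenize("We are delighted to inform you that your paper has been accepted.")
b = tokenize("We are sorry to inform you that your paper was not accepted.")
precisions = [ngram_precision(a, b, n) for n in range(1, 5)]
score = bleu(a, b)
print(precisions, round(score, 3))
```

For this pair both sentences have 13 tokens, so the brevity penalty is 1 and the score is the geometric mean of the four precisions.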

WER (Word Error Rate)
PER (Position-independent WER)
TER (Translation Edit Rate)
METEOR
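Of these metrics, WER is the easiest to sketch: it is the word-level edit distance between output and reference divided by the reference length. The implementation below is a minimal illustration with assumed function and variable names; it reuses the example sentence pair from the BLEU slide with the period written as a separate token.

```python
def word_error_rate(output, reference):
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    m, n = len(reference), len(output)
    prev = list(range(n + 1))  # distances for the empty reference prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            substitution = prev[j - 1] + (reference[i - 1] != output[j - 1])
            cur[j] = min(prev[j] + 1,      # deletion
                         cur[j - 1] + 1,   # insertion
                         substitution)     # substitution or match
        prev = cur
    return prev[n] / m

reference = "We are delighted to inform you that your paper has been accepted .".split()
output = "We are sorry to inform you that your paper was not accepted .".split()
wer = word_error_rate(output, reference)
print(round(wer, 3))
```

Three substitutions separate the two sentences, so the rate is 3/13. Unlike BLEU, a lower WER is better.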

RIBES (NTT; Isozaki+ 2010, 2011)
Example:
My paper was rejected because I drunk so much today.
I drunk so much today because my paper was rejected.
BLEU: 0.74, RIBES: 0.47
Based on Kendall's τ:
$$\mathrm{RIBES} = \frac{1 + \tau}{2} \cdot p_1^{\alpha} \cdot \mathrm{BP}^{\beta}$$
BP: brevity penalty
* α = 0.25, β = 0.1
GPLv2: http://www.kecl.ntt.co.jp/icl/lirg/ribes/index-j.html
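RIBES rewards correct word order through a rank correlation term. The following is a simplified sketch, not the released implementation: it assumes every word occurs at most once in each sentence (the real tool disambiguates repeated words by context), lowercases the text, treats the period as a token, and takes the brevity penalty to be min(1, exp(1 - r/c)). The function names are invented for this example.

```python
import math

def tokenize(sentence):
    return sentence.lower().replace(".", " .").split()

def normalized_kendall_tau(ranks):
    """(1 + tau) / 2 for a rank list without ties: the share of
    word pairs that appear in the same order as in the reference."""
    n = len(ranks)
    pairs = n * (n - 1) // 2
    concordant = sum(1 for i in range(n) for j in range(i + 1, n)
                     if ranks[i] < ranks[j])
    return concordant / pairs

def ribes(output, reference, alpha=0.25, beta=0.1):
    position = {word: i for i, word in enumerate(reference)}
    # Reference positions of the output words that also occur in the reference.
    ranks = [position[word] for word in output if word in position]
    if len(ranks) < 2:
        return 0.0
    p1 = len(ranks) / len(output)  # unigram precision
    bp = min(1.0, math.exp(1.0 - len(reference) / len(output)))
    return normalized_kendall_tau(ranks) * p1 ** alpha * bp ** beta

ref = tokenize("My paper was rejected because I drunk so much today.")
out = tokenize("I drunk so much today because my paper was rejected.")
score = ribes(out, ref)
print(round(score, 2))
```

For this pair every word is matched and the lengths are equal, so the score is determined by word order alone: 26 of the 55 word pairs keep their relative order, which gives about 0.47.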

BLEU vs. RIBES (Spearman's ρ)

BLEU    0.931   0.511   -0.029
RIBES   0.949   0.929    0.716

NTCIR-9 PatentMT (Goto+ 2011)


Further Reading
Philipp Koehn, Statistical Machine Translation, Cambridge University Press, 2010
2000, 10, 2003 IBM (p.100 )
Conferences: ACL, NAACL, EACL, EMNLP, IJCNLP, AMTA, EAMT, MT Summit, ...
Journals: Computational Linguistics, Machine Translation, ACM TALIP, ...


P. F. Brown et al., A Statistical Approach to Machine Translation, Computational Linguistics, vol. 16, no. 2 (1990)
S. Vogel et al., HMM-Based Word Alignment in Statistical Translation, Proc. COLING (1996)
P. Koehn et al., Statistical Phrase-Based Translation, Proc. NAACL (2003)
M. Galley et al., What's in a translation rule?, Proc. NAACL (2004)
D. Chiang, Hierarchical Phrase-Based Translation, Computational Linguistics, vol. 33, no. 2 (2007)
F. J. Och, Minimum Error Rate Training in Statistical Machine Translation, Proc. ACL (2003)
T. Watanabe et al., Online Large Margin Training for Statistical Machine Translation, Proc. EMNLP (2007)
M. Hopkins and J. May, Tuning as Ranking, Proc. EMNLP (2011)
K. Papineni et al., BLEU: a Method for Automatic Evaluation of Machine Translation, Proc. ACL (2002)
H. Isozaki et al., Automatic Evaluation of Translation Quality for Distant Language Pairs, Proc. EMNLP (2010)
et al., RIBES:, (2011)
I. Goto et al., Overview of the Patent Machine Translation Task at the NTCIR-9 Workshop, Proc. NTCIR-9 (2011)