Microsoft PowerPoint - Kyoto-U WSLT2006.ppt

Transcription:

Example-based Machine Translation based on Deeper NLP
Toshiaki Nakazawa (1), Kun Yu (1), Sadao Kurohashi (2)
(1) Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan, 113-8656
(2) Graduate School of Informatics, Kyoto University, Kyoto, Japan, 606-8501

Outline
- Why EBMT?
- Description of Kyoto-U EBMT System
- Japanese Particular Processing
  - Pronoun Estimation
  - Japanese Flexible Matching
- Result and Discussion
- Conclusion and Future Work


Why EBMT?
Pursuing deep NLP
- Improvement of fundamental analyses leads to improvement of MT
- Feedback from MT can be expected
EBMT setting is suitable in many cases
- Not a large corpus, but similar translation examples in a relatively close domain
- e.g. manual translation, patent translation


Kyoto-U System Overview
[Figure: the input sentence is parsed into a dependency tree, parts of the tree are matched against the translation examples, and the matched English parts are combined, with a language model, into the output.]
Input: 交差点に入る時私の信号は青でした
Input tree nodes: 交差 (cross), 点に (point), 入る (enter), 時 (when), 私の (my), 信号は (signal), 青 (blue), でした (was)
Translation examples (English sides shown in the figure): "came at me from the side at the intersection", "to remove when entering a house", "my signature", "traffic The light was green"
Combined: my traffic The light was green when entering the intersection
Output: My traffic light was green when entering the intersection.

Structure-based Alignment
- Step1: Dependency structure transformation
- Step2: Word/phrase correspondences detection
- Step3: Correspondences disambiguation
- Step4: Handling remaining words
- Step5: Registration to database

Step1 Dependency Structure Transformation
Parsers: J: JUMAN/KNP, E: Charniak's nlparser. Both sentences are transformed into dependency trees.
J: 交差点で 突然あの車が飛び出して来たのです
E: The car came at me from the side at the intersection.
[Figure: dependency trees of the Japanese and English sentences]

Step2 Word Correspondence Detection
- KENKYUSYA J-E, E-J dictionaries (300K entries)
- Transliteration (person/place names, Katakana words)
  Ex) 新宿 → candidates shinjuku, sinjuku, synjucu, ...; matched with "shinjuku" (similarity: 1.0)
[Figure: word correspondences between the nodes 交差, 点で, 突然, あの, 車が, 飛び出して来たのです and the English tree "the car came at me from the side at the intersection"]
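The transliteration matching above can be illustrated with a small sketch. The slide does not say how the similarity is computed, so the normalized edit distance used here is an assumption, and the function names (`edit_distance`, `similarity`, `best_transliteration`) are hypothetical. The candidate romanizations are the ones listed on the slide.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed similarity: 1.0 for identical strings, lower as edits increase."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def best_transliteration(candidates, english_word):
    """Return the (candidate, similarity) pair closest to the English word."""
    target = english_word.lower()
    return max(((c, similarity(c, target)) for c in candidates),
               key=lambda pair: pair[1])


# Romanization candidates for 新宿, as listed on the slide.
candidates = ["shinjuku", "sinjuku", "synjucu"]
print(best_transliteration(candidates, "Shinjuku"))
```

With this measure, an exact romanization scores 1.0, which matches the "similarity: 1.0" shown in the example; the real system may use a different measure.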

Step3 Correspondence Disambiguation
- Calculate a correspondence score based on the unambiguous alignments
- Select the correspondence with the higher score

$$\mathrm{Score} = \sum_{\text{unamb. matches}} \left( \frac{1}{dist_J} + \frac{1}{dist_E} \right)$$

dist_J / dist_E = distance to the unambiguous correspondence in the Japanese / English tree

Step3 Correspondence Disambiguation (cont.)
J: 日本で保険会社に対して保険請求の申し立てが可能ですよ
E: you will have to file an insurance claim with the insurance office in Japan
[Figure: candidate correspondences between the two trees, annotated with the scores 1.5, 1.0 and 0.8]
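The disambiguation step can be sketched as follows. This assumes the score of a candidate is the sum, over all unambiguous matches, of 1/dist_J + 1/dist_E, which is one reading of the formula on the slide. The function names, the candidate labels and the tree distances in the usage example are hypothetical and are not taken from the slides.

```python
def correspondence_score(distances):
    """Sum 1/dist_J + 1/dist_E over the unambiguous matches.

    `distances` holds one (dist_J, dist_E) pair per unambiguous match:
    the distance from the candidate to that match in the Japanese tree
    and in the English tree. Distances are assumed to be >= 1.
    """
    return sum(1.0 / dist_j + 1.0 / dist_e for dist_j, dist_e in distances)


def disambiguate(candidates):
    """Pick the candidate correspondence with the highest score.

    `candidates` maps a candidate label to its list of distance pairs.
    """
    return max(candidates, key=lambda name: correspondence_score(candidates[name]))


# Hypothetical distances for two competing correspondences of one word.
candidates = {
    "candidate_a": [(1, 2)],  # close to an unambiguous match in both trees
    "candidate_b": [(2, 2)],  # farther away in the Japanese tree
}
for name, dists in candidates.items():
    print(name, correspondence_score(dists))
print("selected:", disambiguate(candidates))
```

Candidates that sit close to already-fixed correspondences in both trees receive higher scores, so the alignment stays structurally consistent.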

Step4 Handling Remaining Words
- Align the root nodes if they remain unaligned
- Merge base NP nodes
- Merge remaining nodes into their ancestor nodes
J: 交差点で 突然あの車が飛び出して来たのです
E: the car came at me from the side at the intersection

Step5 Registration to Database
- Register each correspondence
- Register a couple of correspondences
J: 交差点で 突然あの車が飛び出して来たのです
E: the car came at me from the side at the intersection

Translation
Translation example (TE) retrieval
- for all the sub-trees in the input
TE selection
- prefer larger examples
TE combination
- greedily from the root node
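The selection and combination steps above can be sketched as a greedy cover of the input tree. This is a minimal illustration, not the system's implementation: each retrieved example is reduced to the set of input nodes it matches, ties and scoring details are ignored, and the names `top_down_order` and `combine_examples` are hypothetical. The dependency tree in the usage example is an assumed parse of the input sentence from the overview slide.

```python
from collections import deque


def top_down_order(parent):
    """Return the nodes in breadth-first order, starting from the root.

    `parent` maps each node to its head; the root maps to None.
    """
    children = {}
    root = None
    for node, head in parent.items():
        if head is None:
            root = node
        else:
            children.setdefault(head, []).append(node)
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children.get(node, []))
    return order


def combine_examples(parent, examples):
    """Greedily cover the input tree from the root, preferring larger examples.

    `examples` is a list of frozensets of input nodes, one per retrieved
    translation example. Nodes that no example matches stay uncovered.
    """
    covered, chosen = set(), []
    for node in top_down_order(parent):
        if node in covered:
            continue
        usable = [ex for ex in examples if node in ex and not (ex & covered)]
        if usable:
            best = max(usable, key=len)  # prefer the largest example
            chosen.append(best)
            covered |= best
    return chosen


# Assumed dependency tree for 交差点に入る時私の信号は青でした.
parent = {"でした": None, "時": "でした", "信号は": "でした", "青": "でした",
          "入る": "時", "点に": "入る", "交差": "点に", "私の": "信号は"}
examples = [frozenset({"信号は", "青", "でした"}), frozenset({"青", "でした"}),
            frozenset({"入る", "時"}), frozenset({"私の"}),
            frozenset({"交差", "点に"})]
for ex in combine_examples(parent, examples):
    print(sorted(ex))
```

Starting at the root means the main clause is fixed first, and smaller examples are only used to fill the parts of the tree that the larger ones leave open.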

Combination Example
[Figure, shown over two slides: the dependency tree of the input 交差点に入る時私の信号は青でした is covered by matching parts of the translation examples, and the corresponding English parts are combined into "my traffic The light was green when entering the intersection".]


Pronoun Estimation
Pronouns are often omitted in Japanese sentences
Omitted in TE:
- TE: 胃が痛いのです → I've a stomachache
- Input: 私は胃が痛いのです → I I've a stomachache
Omitted in Input:
- TE: これを日本に送ってください → Will you mail this to Japan?
- Input: 日本へ送ってください → Will you mail to Japan?

Pronoun Estimation (cont.)
Estimate the omitted pronoun by modality and subject case
Omitted in TE:
- TE: (私は) 胃が痛いのです → I've a stomachache
- Input: 私は胃が痛いのです → I've a stomachache
Omitted in Input:
- TE: これを日本に送ってください → Will you mail this to Japan?
- Input: (これを) 日本へ送ってください → Will you mail this to Japan?

Various Expressions in Japanese
Synonymous Relation
- Hiragana/Katakana/Kanji variations: りんご = リンゴ = 林檎 (apple)
- Variations of Katakana expressions: コンピュータ = コンピューター (computer)
- Synonymous words: 登山 = 山登り (climbing mountain vs mountain climbing)
- Synonymous phrases: 最寄りの = 一番近い (nearest; 一番 = most, 近い = near)
Hypernym-Hyponym Relation
- 災難, 災害 (disaster) → 地震 (earthquake), 台風 (typhoon)
Sources (figure labels): Morphological Analyzer; Automatically Acquired from Japanese Dictionaries
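A toy sketch of how such variations could be treated as equal during matching is given below. It is not the system's method: the slides attribute these relations to the morphological analyzer and to dictionaries, whereas this example only folds hiragana into katakana, drops a trailing long-vowel mark, and consults a small hand-written table. The names `matching_key`, `flexible_match` and `SYNONYMS` are hypothetical, and the table entries are taken from the examples on the slide.

```python
# Hypothetical stand-in for dictionary knowledge: variant -> representative form.
SYNONYMS = {"林檎": "りんご", "山登り": "登山"}


def hiragana_to_katakana(text: str) -> str:
    """Shift hiragana (U+3041..U+3096) to the matching katakana code points."""
    return "".join(chr(ord(ch) + 0x60) if "ぁ" <= ch <= "ゖ" else ch
                   for ch in text)


def matching_key(word: str) -> str:
    """Normalize a word so that spelling variants share one key."""
    key = hiragana_to_katakana(SYNONYMS.get(word, word))
    stripped = key.rstrip("ー")  # コンピューター and コンピュータ get the same key
    return stripped or key


def flexible_match(a: str, b: str) -> bool:
    """True if the two words are equal after normalization."""
    return matching_key(a) == matching_key(b)


print(flexible_match("りんご", "リンゴ"))
print(flexible_match("コンピュータ", "コンピューター"))
print(flexible_match("林檎", "リンゴ"))
print(flexible_match("りんご", "地震"))
```

Stripping the trailing long-vowel mark is only safe because both sides are normalized the same way; the key is used for comparison and is not a valid spelling on its own.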

Japanese Flexible Matching

IWSLT06 Evaluation Results
Open data track (JE): correct recognition translation and ASR output translation

Condition            Set    BLEU                NIST
Correct recognition  Dev1   0.5087              9.6803
Correct recognition  Dev2   0.4881              9.4918
Correct recognition  Dev3   0.4468              9.1883
Correct recognition  Dev4   0.1921              5.7880
Correct recognition  Test   0.1655 (8th/14)     5.4325 (8th/14)
ASR output           Dev4   0.1590              5.0107
ASR output           Test   0.1418 (9th/14)     4.8804 (10th/14)

Results Discussion
- Failures in punctuation insertion caused parsing errors
- Dictionary robustness affected alignment accuracy
- The TE selection criterion failed when choosing among almost equal examples
  e.g. Input: 買います (buy a ticket), TE: 買いません (not buy a ticket)

Conclusion and Future Work
Conclusion
- We aim not only at the development of MT, but also tackle this task from the viewpoint of structural NLP.
Future work
- Apply statistical methods to alignment
- Improve parsing accuracy (both J and E)
- Improve the Japanese flexible matching method
- J-C and C-J MT project with NICT