

Automatic Detection of Edited Parts in Inexact Transcribed Corpora Using Alignment between Edited Transcription and Corresponding Utterance

Kengo Ohta, 1 Masatoshi Tsuchiya 2 and Seiichi Nakagawa 1

1 Department of Computer Sciences and Engineering, Toyohashi University of Technology
2 Information Media Center, Toyohashi University of Technology

The availability of large-scale spontaneous speech corpora is crucially important for various domains of spoken language processing. However, the available corpora are usually limited because they are costly to prepare. On the other hand, inexact transcribed corpora have been widely produced in the form of shorthand notes, meeting records, or closed captions. Although these inexact transcribed corpora are more freely available than faithful (exact) ones, they are not faithfully transcribed and contain edited transcriptions. Against this background, we are considering building an efficient semi-automatic framework for converting inexact transcripts into faithful (exact) transcriptions. This framework consists of three steps: the first step automatically detects the positions of edited parts, the second step manually transcribes the edited parts, and the third step extracts transformation rules from the parallel corpus of written style and spoken style. This paper proposes an automatic method for detecting edited parts in edited transcribed corpora for this framework. In the proposed method, an automatic alignment between the edited transcription and its corresponding utterance is performed, and then a support vector machine based detector is applied to detect edited parts using features obtained from the automatic alignment. In an evaluation on the Japanese National Diet Record, a reasonable result was obtained in the speaker-closed condition. As a by-product, we obtain reliable transcripts for unsupervised learning of acoustic models.

© 2011 Information Processing Society of Japan

1.

Proceedings of the 5th Spoken Document Processing Workshop (March 7, 2011)

(i) 1 (ii) 1947 1 1 1,, 1) 1 http://kokkai.ndl.go.jp/ 2) Lamel 2) Roy 3) 4) 2 3 4 5

2. $w_1 w_2 \cdots w_n$ 2 2-gram $w_i$ $i$ $sp_i$ $w_i$ 3 3

2 bigram 3 /si/ /te/ /ne/ /sp/ /te de su/ /

3. 3 / 2 SVM SVM TinySVM ver 0.09 5)
3.1 7 2 2 Huang 6)
3.1.1
3.1.2 6 3 3 $\mathrm{Local}_d$

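$\mathrm{Global}_d$

$$\mathrm{Local}_d = \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{dur}(s_i)}{\frac{1}{6} \sum_{j=i-3,\, j \neq i}^{i+3} \mathrm{dur}(s_j)} \tag{1}$$

$N$ $\mathrm{dur}(s)$ $s$

$$\mathrm{Global}_d = \frac{1}{N} \sum_{i=1}^{N} \frac{\mathrm{dur}(s_i)}{\frac{1}{|U|} \sum_{s \in U} \mathrm{dur}(s)} \tag{2}$$

$U$

3.1.3 $\mathrm{Var}_d$

$$\mathrm{Var}_d = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\mathrm{dur}(s_i)}{\frac{1}{|U|} \sum_{s \in U} \mathrm{dur}(s)} - \mathrm{Global}_d \right)^2 \tag{3}$$

3.1.4 Lo 7)

$$\mathrm{Score}_d = \frac{1}{|W|} \sum_{s \in W} \log \frac{P(\mathrm{dur}(s) \mid s)}{P_{\text{anti-model}}(\mathrm{dur}(s))} \tag{4}$$

$W$ $P(\mathrm{dur}(s) \mid s)$ $P_{\text{anti-model}}(\mathrm{dur}(s))$

3.1.5
3.1.6
3.1.7
3.2 33% 50% 100 10 15 5 30 10 100 3 100 90% 85 80 94% 2)

4.
4.1 1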
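The duration features of Section 3.1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. It reads Local_d as the mean ratio of each segment's duration to the average duration of its neighbours (three on each side), Global_d as the mean ratio to the utterance average, Var_d as the variance of those ratios around Global_d, and Score_d as the mean log-likelihood ratio between a segment duration model and an anti-model. The function names, the clipping of the neighbour window at utterance boundaries, and the caller-supplied log-probability functions `logp` and `logp_anti` are hypothetical and introduced only for illustration.

```python
from statistics import mean
from typing import Callable, Sequence


def _neighbour_mean(utt: Sequence[float], i: int, k: int = 3) -> float:
    """Average duration of up to k segments on each side of segment i.

    Assumption: the window is clipped at utterance boundaries, so fewer
    than 2*k neighbours are averaged near the edges.
    """
    lo, hi = max(0, i - k), min(len(utt), i + k + 1)
    return mean(utt[j] for j in range(lo, hi) if j != i)


def local_d(utt: Sequence[float], start: int, end: int) -> float:
    """Mean ratio of each segment duration to its local neighbourhood mean."""
    return mean(utt[i] / _neighbour_mean(utt, i) for i in range(start, end))


def global_d(utt: Sequence[float], start: int, end: int) -> float:
    """Mean ratio of each segment duration to the utterance-wide mean."""
    utt_mean = mean(utt)
    return mean(utt[i] / utt_mean for i in range(start, end))


def var_d(utt: Sequence[float], start: int, end: int) -> float:
    """Variance of the utterance-normalised durations around global_d."""
    utt_mean = mean(utt)
    g = global_d(utt, start, end)
    return mean((utt[i] / utt_mean - g) ** 2 for i in range(start, end))


def score_d(
    labels: Sequence[str],
    utt: Sequence[float],
    start: int,
    end: int,
    logp: Callable[[str, float], float],   # hypothetical: log P(dur | segment)
    logp_anti: Callable[[float], float],   # hypothetical: log P_anti(dur)
) -> float:
    """Mean log-likelihood ratio of the duration model against the anti-model."""
    return mean(
        logp(labels[i], utt[i]) - logp_anti(utt[i]) for i in range(start, end)
    )


# Segment durations (seconds) of one utterance; the word under test is the
# single segment at index 3, which is four times longer than its neighbours.
durations = [0.1, 0.1, 0.1, 0.4, 0.1, 0.1, 0.1]
print(local_d(durations, 3, 4), global_d(durations, 3, 4), var_d(durations, 3, 4))
```

A word is addressed here by the half-open index range `[start, end)` of its segments within the utterance, so the same duration list serves both the local window and the utterance-level normalisation.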

semi-closed open 2 semi-closed 2 2 open 2 1 SPOJUS++ 8) CSJ 10) left-to-right HMM 5 4 / 116 2 4 5

1
semi-closed open
(min) 22 20 42 60
5 4 7 11
3.6k 3.6k 7.2k 10.8k
347 257 604 426
% 9.6 7.1 8.4 3.9

4 2 16kHz 0.98 Hamming 25ms 10ms MFCC + ΔMFCC + ΔΔMFCC + ΔPow + ΔΔPow (38 dimensions)

4.2
4.2.1 30msec 97.4%
4.2.2 Semi-closed 4 closed 2 open 2-6 6 closed open closed 60% 30% open 0% 5%
4.2.3 open open - 7 7 semi-closed 7 6
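The percentage row in the data-set figures above appears to follow from the two rows before it. The sketch below checks that reading. It assumes, since the row labels do not survive in the text, that the row 3.6k / 3.6k / 7.2k / 10.8k gives the total number of units per data set, that 347 / 257 / 604 / 426 gives the number of edited units, and that the last row is their ratio in percent. The names `set_3` and `set_4` are placeholders, because only the labels semi-closed and open survive.

```python
# (total units, edited units) per column, copied from the table.
# Totals such as "3.6k" are rounded in the source, so they are approximate.
data_sets = {
    "semi-closed": (3600, 347),
    "open": (3600, 257),
    "set_3": (7200, 604),    # placeholder name; label lost in the source
    "set_4": (10800, 426),   # placeholder name; label lost in the source
}


def edited_rate(total: int, edited: int) -> float:
    """Share of edited units in percent, rounded to one decimal place."""
    return round(100.0 * edited / total, 1)


rates = {name: edited_rate(total, edited) for name, (total, edited) in data_sets.items()}
print(rates)
```

Under this reading the computed rates reproduce the 9.6, 7.1, 8.4 and 3.9 percent given in the table, and the third column equals the sum of the first two (22 + 20 = 42 minutes, 347 + 257 = 604 edited units).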

open 5 7 open - 6 semi-closed -

4.2.4 semi-closed 3 feature set
feature set 1:
feature set 2:
feature set 3:
feature set 8 9 closed 8 9 feature set 2 feature set 3

4.2.5 10 semi-closed 11 open 10 100% 93% 60% 98% 80% 97% 88% 97%

11 60% 96% 98% 80% 97.5% 95.5% 98.5% 11 2) 8 feature set - semi-closed 10 - semi-closed 9 feature set - semi-closed closed

5. 2 closed

11 - open 1) 11)

COE

1) G. Neubig, Y. Akita, S. Mori, and T. Kawahara, Improved Statistical Models for SMT-based Speaking Style Transformation, in Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.5206-5209, 2010.
2) L. Lamel, J.L. Gauvain, G. Adda, Investigating lightly supervised acoustic model training, in Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.477-480, 2001.
3) B.C. Roy, S. Vosoughi, D. Roy, Automatic Estimation of Transcription Accuracy and Difficulty, in Proc. of Interspeech, pp.1902-1905, 2010.
4) 3-Q-30 pp.177-178 1999
5) TinySVM, http://chasen.org/~taku/software/tinysvm/
6) X. Huang, A. Acero, H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001.
7) W. Lo, A.M. Harrison and H. Meng, Statistical Phone Duration Modeling to Filter for Intact Utterances in a Computer-Assisted Pronunciation Training System, in Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.5238-5241, 2010.
8) ++ 4 2010
9) S. Nakagawa, K. Hanai, K. Yamamoto, and N. Minematsu, Comparison of Syllable-Based HMMs and Triphone-Based HMMs in Japanese Speech Recognition, in Proc. of International Workshop on Automatic Speech Recognition and Understanding, pp.393-396, 1999.
10) K. Maekawa, Corpus of Spontaneous Japanese: Its Design and Evaluation, in Proc. of the ISCA & IEEE Workshop on Spontaneous Speech Processing and Recognition (SSPR2003), pp.7-12, 2003.
11) T. Kawahara, M. Mimura, and Y. Akita, Language Model Transformation Applied to Lightly Supervised Training of Acoustic Model for Congress Meetings, in Proc. of International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp.3853-3856, 2009.