Journal of Chinese Language and Computing, 13 (2) 121-158
Shiwen Yu, Huiming Duan, Xuefeng Zhu, Bin Swen, Baobao Chang
Specification for Corpus Processing at Peking University
bu4 de2dao4 duo1 fa1zhan3 gan3shan4 guo2jia1 jing1jia jiu4 ke3neng2 min2zu2 ren2min2 ru2guo3 sheng1huo2 shui3ping2 ti2gao1 tuan2jie2 ye3 yi1ge4 zan2men5 zhe4me5 zhong1guo2
<w pos="c" pinyin="he2">和</w> <w pos="wj">。</w>
Specification for Corpus Processing at Peking University: Word Segmentation, POS Tagging and Phonetic Notation
Shiwen Yu, Huiming Duan, Xuefeng Zhu, Bin Swen, Baobao Chang
Institute of Computational Linguistics, Peking University, Beijing, 100871, China
yusw@pku.edu.cn; duenhm@pku.edu.cn; bswen@pku.edu.cn; chbb@pku.edu.cn
Abstract: The Institute of Computational Linguistics, Peking University made a specification for the word segmentation and POS tagging of its People's Daily corpus (over 26 million Chinese characters) [hereinafter: Specification 2001], which was published in the Journal of Chinese Information Processing (Issue 5 & Issue 6, 2002), entitled The Basic Processing of Contemporary Chinese Corpus at Peking University - Specification. In addition, another specification was made for building the phonetically annotated corpus (1 million Chinese characters). Based on these two specifications, we hereby present the latest Specification for Corpus Processing at Peking University: Word Segmentation, POS Tagging and Phonetic Notation [hereinafter: Specification 2003]. With the newly added ones, the tagset now includes more than 100 tags. Following Specification 2003, the Institute of Computational Linguistics will go on with more corpora of high quality and in-depth processing.
Keywords: Contemporary Chinese; Corpus; Word Segmentation; POS Tagging; Phonetic Notation; Specification