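Mathematical Linguistics (計量国語学) Archive
ID: KK300601
Type: Special Issue, Invited Paper (A)
Title: The Concept, Types and Utility of Web Corpora: Web Corpora as a Source of Information for Etymological Studies (Japanese title: Webコーパスの概念と種類, 利用価値 語史研究の情報源としてのWebコーパス)
Author: TANOMURA Tadaharu (田野村忠温)
Issue: Vol. 30, No. 6
Publication date: 20 September 2016
First page: 326
Last page: 343
Copyright holder: The Mathematical Linguistic Society of Japan (計量国語学会)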
30 6 2016 9 pp.326-343. Web Web Web Google Web Web Web 2015a 2015b 2016a 2016b Web 326 2016
BCCWJ 2011 2012 2014 BCCWJ BCCWJ 1,000 BCCWJ Web BCCWJ BCCWJ Web Web Web Web Web Web Web Web 327
Web Web Web Web Web Web Web Web Web Google https://books.google.co.jp/ Google OCR Google Google Web Google Google Google Google Google Web 3.2 Google Google Google Google Google Google Web 328
Google Web http://dl.ndl.go.jp/ http://kokkai.ndl.go.jp/ Google Google Web Web Web Web Web Web Web Web Google 1980 Google Web 1979 Internet Archive https://archive.org/ Wayback Machine Wayback Machine 329
Google 13 1975-2001 - Google 330
Google 1940 15 1954 29 3 1964 39 11 1974 49 2008 20 2014 26 331
1980 SVP CPU central processing unit SVP CPU J63-D 3 1980 55 OS 55 1981 56 Word 1979 1979 54 Web 332
Web Google television Web 20 2008 1920 40 1986 Wikipedia television Wikipedia 1934 television television television Wikipedia 1934 333
Web television 1922 11 8 2 1924 13 television 596 1924 13 10 1 1927 2 4 1 3 2 4 1927 2 2 1 4 1 (2008).NET http://www.nikkoku.net/tomonokai/ 2006 10 25 1928 3 television 5.2 1927 2 12 17 334
4 4 6 1927 2 10 1 12 1 2 12 17 1927 2 12 17 television 1917 6 3 4 1917 6 4 25 1915 4 19 20 television 10 7 1925 14 10 30 335
Television 1927 16 5 8 Bell Laboratory Herbert E. Ives 12 6 1927 16 6 20 Herbert Hoover Bell 24 17 1927 16 9 10 television television Visagraph 15 12 1928 17 1932 21 12 8 television 2008 336
tele- telegraph telephone 4 4 4 4 Tele Vision Vision Tele 4 4 4 337
1929 4 11 3 television tele- television télautographie téléphotographie 21 13 1924 13 tele- 1929 18 27 television television television telegraph telephone television (2007) 1879 12 7 9 1883 3 10 9 2 2 (2007) 338
television television 1932 7 2 24 1933 8 2 24 1934 9 4 23 10 4 1928 3 television 1931 20 15 3 telephone television television telephone (1950) 339
Web Google 4 4 4 Google Google Google 340
Google Google Google 4 Google Google Google Google Web Web Web Web Web 341
5.2 television 1917 1925 1927 2007 19 263-282.. 2007. 2012 BCCWJ 46 59-82.. 2014 BCCWJ 119-151.. 2015a 55 81-137. 2015b Amsterdam 49 9-34.. 2016a 56 123-181. 2016b 50... 2008 95-110.. 1950. 2016 3 13 342
Mathematical Linguistics, Vol.30 No.6 (September 2016) pp.326-343.
Invited Paper (A) to the Special Issue

The Concept, Types and Utility of Web Corpora: Web Corpora as a Source of Information for Etymological Studies

TANOMURA Tadaharu (Osaka University)

Abstract: The defining condition of a Web corpus will be that it is a huge amount of text data collected from the Internet. Although Websites such as Google Books, National Diet Library Digital Collections and newspaper archives do not satisfy the condition, they nevertheless cannot be clearly distinguished from typical Web corpora, and thus it may not be groundless to regard them as a type of Web corpus. This article, drawing upon two case studies, will demonstrate that we can easily enhance the level of the description of the history of Japanese as well as Chinese terms of the modern era with the help of information obtainable from those Websites.

Keywords: Web corpus, diachronic corpus, modern Japanese and Chinese, etymology, tatiageru (transitivized form of the verb tatiagaru), densi/dianshi (Japanese/Chinese term for television)