
AWK and HTML
August 14, 2006

1 Introduction

The example used throughout is a news article page from Yahoo! (http://www.yahoo.co.jp/). The Yahoo! news pages are at http://headlines.yahoo.co.jp/hl. The browser mentioned is Netscape 3.04. The tools used are csh and AWK.

2 HTML

HTML (HyperText Markup Language) is the language in which WWW pages are written. HTML files usually have names ending in .html.

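An HTML file is plain text annotated with tags. A tag is written either as a pair, <name> ... </name>, or as a single <name>. The basic structure of an HTML file is:

    <html>
    <head>
    <title> ... </title>
    </head>
    <body>
    ...
    </body>
    </html>

A tag may also carry attributes, written as <name attribute="value" ...>.

The main tags are:

    <html> </html>: the whole HTML document
    <head> </head>: the header of the HTML document
    <title> </title>: the title, shown by the WWW browser
    <body> </body>: the body of the HTML document
    <br>: line break
    <hr>: horizontal rule + line break
    <b> </b>: bold
    <font> </font>: font size, color and so on
    <small> </small>: smaller text
    <center> </center>: centering
    <h[n]> </h[n]>: heading ([n] is 1 to 6; 1 is the largest)
    <a> </a>: link
    <div> </div>: block of content
    <table> </table>: table
    <tr> </tr>: one row of a <table>
    <td> </td>: one cell of a <tr>

Note: in an actual HTML file, an xml declaration or a DOCTYPE declaration may precede <html>.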

    <script> </script>: JavaScript code
    <noscript> </noscript>: content used when JavaScript is not available
    <form> </form>: input form
    <img>: image
    <!-- -->: comment (not displayed)

For more on HTML, see books on HTML ([7], [8]) or HTML reference sites ([9]).

3 Yahoo! news HTML

Yahoo! news pages are laid out with <table> (and <tr>, <td>), as many WWW pages are. (Layout with <table> is the older style; CSS is the alternative.) With most of the <table> markup left out, the HTML of a Yahoo! article looks like this:

    <html>
    <head>
    <!-- ... -->
    <title>Yahoo! - XXXX - (headline)</title>
    ...
    </head>
    <body marginheight=0 topmargin=0>
    <center>
    ...
    <!--- CONTENTS_TITLE_TABLE --->
    <table border=0 cellpadding=2 cellspacing=0 width=100%>
    <tr bgcolor="#9999cc">
    <td nowrap>
    <b><font size=+1> (headline) </font></b>
    <small> - (date and time) </small>
    </td>
    ...
    <!--- /CONTENTS_TITLE_TABLE --->
    <!--br-->
    <!--- OUTLINE_TABLE --->
    <font size=5 class="s130"><b> (headline) </b></font>
    <br><br>
    (article text)
    ...
    <div align=right> XXXX - (date and time) </div><br>
    </td></tr>
    ...
    ...
    <!--- /YBB module --->
    <hr width=100% size=0>
    <small>
    <a href="http://help.yahoo.co.jp/help/jp/news/"> (help) </a><br>
    Copyright (C) 2006 XXXX <br>
    Copyright (C) 2006 Yahoo Japan Corporation. All Rights Reserved. <br>
    </small>
    </center>
    </body>
    </html>

From this file, the following four parts are extracted:

1. the title between <title> and </title>
2. the headline and date after <!--- CONTENTS_TITLE_TABLE --->, from <b><font size=+1>... to </small>
3. the article body after <!--- OUTLINE_TABLE --->
4. the copyright lines after <!--- /YBB module --->, from Copyright (C) 2006...

These parts are then written out as a new, simpler HTML file.
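As a rough illustration of the four starting points listed above, the sketch below runs awk over a miniature page and prints the line number at which each part begins. The sample text and the labels (title, headline, body, copyright) are made up for the illustration; only the marker strings come from the page shown above.

```shell
# Hypothetical miniature of the article page; only the marker lines matter here.
sample='<html>
<head>
<title>Yahoo! - XXXX - headline</title>
</head>
<body>
<!--- CONTENTS_TITLE_TABLE --->
<b><font size=+1> headline </font></b>
<small> - date </small>
<!--- /CONTENTS_TITLE_TABLE --->
<!--- OUTLINE_TABLE --->
<font size=5 class="s130"><b> headline </b></font>
article text
<div align=right> XXXX - date </div><br>
<!--- /YBB module --->
Copyright (C) 2006 XXXX <br>
</body>
</html>'

# Print the line number and a label for each of the four starting points.
# The leading space in / CONTENTS_TITLE_TABLE/ keeps the closing marker
# "/CONTENTS_TITLE_TABLE" from matching.
markers=$(printf '%s\n' "$sample" | awk '
/<title>/                { print NR ": title" }
/ CONTENTS_TITLE_TABLE/  { print NR ": headline" }
/ OUTLINE_TABLE/         { print NR ": body" }
/\/YBB module/           { print NR ": copyright" }
')
printf '%s\n' "$markers"
```

Each pattern fires once on this sample, which is what the later rules rely on when they start reading from these lines.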

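The extraction is done by AWK pattern-action rules, and the new HTML is written out in the END block.

4 Processing across multiple lines

Consider item 1 of Section 3, the text between <title> and </title>. In the sample in Section 3, <title> and </title> are on the same line, but in general the text between <title> and </title> may extend over several lines. AWK offers two ways to handle this:

1. getline
2. a flag variable

getline: reads the next input line into $0. It returns 1 on success, 0 at end of input, and -1 on error.

An example using getline:

    ($0 ~ /<title>/){
        N=0
        h[++N]=$0
        while($0 !~ /<\/title>/){
            getline
            h[++N]=$0
        }
        # the lines are now in h[1] ~ h[N]
    }

An example using a flag variable:

    ($0 ~ /<title>/){ flag=1; N=0 }
    (flag==1){
        h[++N]=$0
        if($0 ~ /<\/title>/){
            flag=0
            # the lines are now in h[1] ~ h[N]
        }
    }

The operators ~ and !~ applied to $0 test a regular expression:

    ~ /regex/ : matches the regular expression
    !~ /regex/ : does not match the regular expression

Inside / /, a literal / must be escaped with \ and written \/. To collect the text between <title> and </title> in a single string rather than an array, replace h[++N]=$0 with str = str $0.

Of the two, this document uses getline. If </title> never appears (for example, in a broken file), getline fails, so its return value should be checked:

    if(getline<=0){ errorexit=1; exit }

exit transfers control to the END block, where errorexit can be tested (a non-zero value means an error), as in the following END block.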
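The two approaches can be compared side by side. The sketch below is only an illustration under assumed input: the three-line title is invented, the lines are joined with str = str $0 (no separator), and getline is wrapped in parentheses purely to make the comparison with 0 unambiguous to read.

```shell
# Hypothetical input in which the title is split across three lines.
input='<head>
<title>First
second
third</title>
</head>'

# Approach 1: getline, with the return value checked.
by_getline=$(printf '%s\n' "$input" | awk '
($0 ~ /<title>/){
    str=$0
    while(str !~ /<\/title>/){
        if((getline) <= 0){ errorexit=1; exit }
        str = str $0
    }
}
END{ if(!errorexit) print str }')

# Approach 2: a flag variable; every line is appended while the flag is set.
by_flag=$(printf '%s\n' "$input" | awk '
($0 ~ /<title>/){ flag=1; str="" }
(flag==1){
    str = str $0
    if($0 ~ /<\/title>/) flag=0
}
END{ print str }')

printf '%s\n%s\n' "$by_getline" "$by_flag"
```

Both variables end up holding the same joined string. The getline version keeps the whole operation inside one rule, while the flag version lets AWK's normal line-by-line loop do the reading.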

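    END{
        if(errorexit){
            printf "error (code = %d)\n",errorexit > "/dev/stderr"
            exit
        }
        ...
    }

In END, printf with > "/dev/stderr" writes the message to standard error (on Unix).

5 String processing

The string obtained in Section 4 for item 1 still contains the tags <title> and </title>, which have to be removed. AWK offers two ways to do this:

1. sub(), gsub()
2. substr() (together with match() or index())

The functions are:

    sub(r,s,c): in the string c (default $0), replace the first match of the regular expression r with s
    gsub(r,s,c): in the string c (default $0), replace every match of the regular expression r with s
    substr(s,n,len): the substring of s that starts at position n and has length len (to the end of s if len is omitted)
    index(s,c): the position at which the string c first occurs in s (the first character of s is position 1; 0 if c does not occur)
    match(s,r): the position at which the regular expression r first matches in s (the first character of s is position 1; 0 if there is no match)

match() also sets the variables RSTART and RLENGTH:

    RSTART = position at which r matched
    RLENGTH = length of the string matched by r

For example, with s="54321", match(s,/[2-4]+/) sets RSTART=2, RLENGTH=3 (the matched string is 432). Here [2-4] is one of 2,3,4, and [2-4]+ is one or more repetitions of 2,3,4. (With * in place of +, meaning zero or more repetitions, the pattern would also match the empty string, and the match would be reported at position 1 with length 0.) After this match,

    c=substr(s,RSTART,RLENGTH)

sets c to 432.

Now suppose s="<title> (title text) </title>".

1. sub(), gsub()

    sub(/<title>/,"",s)
    sub(/<\/title>/,"",s)

leaves only the title text in s. The two sub() calls can be replaced by one gsub():

    gsub(/<\/?title>/,"",s)

\/? means zero or one \/, so the expression matches both <title> and </title>, and gsub() removes both in one call.

2. substr()

If s begins with <title> and ends with </title>:

    n1=length("<title>")
    n2=length("</title>")
    s1=substr(s,n1+1,length(s)-n1-n2)

where length(s) is the length of s. If there may be other text before <title> or after </title>, substr() is combined with match() or index():

    if(match(s,/<title>[ \t]*/))        # [ \t] = space or tab
        s=substr(s,RSTART+RLENGTH)      # keep what follows the match
    if(match(s,/[ \t]*<\/title>/))
        s=substr(s,1,RSTART-1)          # keep what precedes the match

The same thing with sub():

    sub(/.*<title>[ \t]*/,"",s)         # .* = any string of 0 or more characters
    sub(/[ \t]*<\/title>.*/,"",s)

sub(), gsub() are shorter than substr() + match(), so sub(), gsub() are used in what follows.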
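The functions above can be tried directly in a BEGIN block. The following is a small sketch, not part of the original program: the string "  <title> Sample title </title>  " is invented, and the pattern /[2-4]+/ uses + (one or more) so that the match cannot be empty; with * the same pattern can match the empty string at position 1.

```shell
result=$(awk 'BEGIN{
    # match() sets RSTART and RLENGTH; substr() then extracts the match.
    s="54321"
    if(match(s,/[2-4]+/))
        print RSTART, RLENGTH, substr(s,RSTART,RLENGTH)

    # Method 1: sub() removes everything up to and after the title text.
    t="  <title> Sample title </title>  "
    sub(/.*<title>[ \t]*/,"",t)
    sub(/[ \t]*<\/title>.*/,"",t)
    print "[" t "]"

    # Method 2: match() + substr() does the same in two steps.
    u="  <title> Sample title </title>  "
    if(match(u,/<title>[ \t]*/))   u=substr(u,RSTART+RLENGTH)
    if(match(u,/[ \t]*<\/title>/)) u=substr(u,1,RSTART-1)
    print "[" u "]"
}')
printf '%s\n' "$result"
```

The brackets in the output only make leading and trailing blanks visible; both methods leave the same text between them.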

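6 The program

6.1 Title

Using getline from Section 4 and sub(), gsub() from Section 5, item 1 of Section 3 (the text between <title> and </title>) is obtained as follows:

    ##### title #####
    ($0 ~ /<title>/){
        titlestr=$0
        while(titlestr !~ /<\/title>/){
            if(getline<=0){ errorexit=1; exit }
            titlestr = titlestr $0
        }
        sub(/.*<title>[ \t]*/,"",titlestr)
        sub(/[ \t]*<\/title>.*/,"",titlestr)
        next
    }

This simply combines what was described in Sections 4 and 5.

6.2 Headline

Item 2 of Section 3:

    ##### headline #####
    ($0 ~ / CONTENTS_TITLE_TABLE/){
        # skip to the line containing <font size
        while($0 !~ /<font size/)
            if(getline<=0){ errorexit=2; exit }
        # collect lines in headline up to </small>
        headline=$0
        while(headline !~ /<\/small>/){
            if(getline<=0){ errorexit=3; exit }
            headline = headline $0
        }
        # remove the tags (font,small)
        gsub(/<\/?font[^>]*>/,"",headline)
        sub(/[ \t]*<small>[ \t]*/,"",headline)
        sub(/[ \t]*<\/small>.*/,"",headline)
        while($0 !~ /\/CONTENTS_TITLE_TABLE/)
            if(getline<=0){ errorexit=4; exit }
        next
    }

Once a line containing CONTENTS_TITLE_TABLE is found, the rule does the following:

1. the while() loop calls getline until a line containing <font size appears
2. lines are appended to headline with getline until headline contains </small>
3. the tags (font, small) are removed from headline
4. input is skipped up to /CONTENTS_TITLE_TABLE

<font>, </font> are removed with gsub(), and <small>, </small> with sub(). Only font and small are removed; <b>, </b> are kept. Because <font> carries attributes, it is matched with <\/?font[^>]*>, where

    [^>] : one character other than >
    [^>]* : zero or more characters other than >
    [^>]*> : zero or more characters other than >, followed by >

so the expression matches not only <font> but also <font size=+1>, that is, everything from <font up to the next >.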
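To see what the three substitutions leave behind, they can be applied to a single headline line. This is a minimal sketch; the words Headline and date stand in for the real article text and are not from the original page.

```shell
# One line in the shape of the headline markup (placeholder text).
line='<b><font size=+1> Headline </font></b> <small> - date </small>'

cleaned=$(printf '%s\n' "$line" | awk '{
    gsub(/<\/?font[^>]*>/,"")        # drops <font size=+1> and </font>
    sub(/[ \t]*<small>[ \t]*/,"")    # drops <small> and the blanks around it
    sub(/[ \t]*<\/small>.*/,"")      # drops </small> and everything after it
    print
}')
printf '%s\n' "$cleaned"
```

The <b> and </b> tags survive, as intended, and the blanks on both sides of <small> are consumed by the second substitution.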

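6.3 Body

Item 3 of Section 3 is the article body after OUTLINE_TABLE. In the actual HTML, OUTLINE_TABLE may appear more than once, and only the first occurrence is processed:

    ##### body #####
    # N_ot = number of OUTLINE_TABLE lines seen so far
    ($0 ~ / OUTLINE_TABLE/){ N_ot++ }
    # only the first OUTLINE_TABLE is processed
    (N_ot==1 && $0 ~ / OUTLINE_TABLE/){
        # skip to the line containing <font size
        while($0 !~ /<font size/)
            if(getline<=0){ errorexit=5; exit }
        # store the body (body[1] ~ body[N_body])
        N_body=0
        do{
            if($0 ~ /<\/?font/) gsub(/<\/?font[^>]*>/,"")
            body[++N_body]=$0
            if(getline<=0){ errorexit=6; exit }
        }while($0 !~ /<div/)
        # replace <div> with a line break (<br>)
        sub(/<div[^>]*>/,"<br>")
        body[++N_body]=$0
        # read up to </div>
        while($0 !~ /<\/div/){
            if(getline<=0){ errorexit=7; exit }
            body[++N_body]=$0
        }
        # remove </div>
        sub(/<\/div>/,"",body[N_body])
        next
    }

N_ot counts the OUTLINE_TABLE lines, so that only the first OUTLINE_TABLE is processed. The font tags are removed; for div, <div...> is replaced by a line break (<br>) and </div> is removed.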
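The counting idiom used for OUTLINE_TABLE can be isolated in a few lines. In this sketch the three input lines and the words "process" and "seen" are invented for the illustration; only the two patterns follow the rule above.

```shell
# Two marker lines; only the first should be processed.
out=$(printf '%s\n' 'x OUTLINE_TABLE a' 'text' 'y OUTLINE_TABLE b' | awk '
($0 ~ / OUTLINE_TABLE/){ N_ot++ }                        # count every marker line
(N_ot==1 && $0 ~ / OUTLINE_TABLE/){ print "process: " $0 }  # act on the first only
END{ print "seen: " N_ot }')
printf '%s\n' "$out"
```

Because the counter is incremented by a separate rule that runs first, the second rule sees N_ot==1 only on the first marker line.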

The result is stored in the array body.

6.4 Copyright

Item 4 of Section 3, the Copyright part:

    ##### Copyright #####
    ($0 ~ /\/YBB module/){
        while($0 !~ /Copyright/)
            if(getline<=0){ errorexit=8; exit }
        N_tail=0
        do{
            gsub(/<\/?small>/,"")
            gsub(/<\/?center>/,"")
            tail[++N_tail]=$0
        }while($0 !~ /<\/html>/ && getline>0)
        exit
    }

After /YBB module, the lines from the one containing Copyright up to </html> are stored in the array tail, with the small and center tags removed.

6.5 Output in END

The output is written by three functions:

    ##### functions #####
    # output the header part
    function putheader(titlestr,headline)
    {
        print "<html>"
        print "<head>"
        printf "<title>%s</title>\n",titlestr
        print "</head>"
        print "<body>"
        printf "<h1>%s</h1>\n",titlestr
        printf "%s<hr>\n",headline
    }
    # output the body
    function putbody(body,n,  j)
    {
        for(j=1;j<=n;j++) print body[j]
    }
    # output the tail part
    function puttail(tail,n,  j)
    {
        print "<hr>"
        for(j=1;j<=n;j++) print tail[j]
    }

putheader(): outputs the title (titlestr) both as the title (<title> </title>) and as a heading (<h1> </h1>), then the headline (headline) followed by a rule (<hr>)
putbody(): outputs the body
puttail(): outputs a rule (<hr>) and the Copyright part

The END block calls these functions:

    ##### END #####
    END{
        if(errorexit){
            printf "error (code = %d)\n",errorexit > "/dev/stderr"
            exit
        }
        putheader(titlestr,headline)
        putbody(body,N_body)
        puttail(tail,N_tail)
    }
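The error path through END can be exercised with deliberately broken input. The sketch below assumes a page whose <title> is never closed (invented input) and the message text "error"; standard error is redirected into the variable only so that the message can be inspected.

```shell
# Input that opens <title> but never closes it (hypothetical broken page).
msg=$(printf '%s\n' '<title>broken' 'no closing tag' | awk '
($0 ~ /<title>/){
    titlestr=$0
    while(titlestr !~ /<\/title>/){
        # getline returns 0 at end of input: record the error and jump to END
        if((getline) <= 0){ errorexit=1; exit }
        titlestr = titlestr $0
    }
}
END{
    if(errorexit){
        printf "error (code = %d)\n", errorexit > "/dev/stderr"
        exit
    }
    print titlestr
}' 2>&1)
printf '%s\n' "$msg"
```

Nothing is printed on standard output in this case, because the exit inside END stops the block before the normal output is reached.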

7 Complete program

Putting together the pieces of Section 6 gives the following program:

    ##### title #####
    ($0 ~ /<title>/){
        titlestr=$0
        while(titlestr !~ /<\/title>/){
            if(getline<=0){ errorexit=1; exit }
            titlestr = titlestr $0
        }
        sub(/.*<title>[ \t]*/,"",titlestr)
        sub(/[ \t]*<\/title>.*/,"",titlestr)
        next
    }
    ##### headline #####
    ($0 ~ / CONTENTS_TITLE_TABLE/){
        # skip to the line containing <font size
        while($0 !~ /<font size/)
            if(getline<=0){ errorexit=2; exit }
        # collect lines in headline up to </small>
        headline=$0
        while(headline !~ /<\/small>/){
            if(getline<=0){ errorexit=3; exit }
            headline = headline $0
        }
        # remove the tags (font,small)
        gsub(/<\/?font[^>]*>/,"",headline)
        sub(/[ \t]*<small>[ \t]*/,"",headline)
        sub(/[ \t]*<\/small>.*/,"",headline)
        while($0 !~ /\/CONTENTS_TITLE_TABLE/)
            if(getline<=0){ errorexit=4; exit }
        next
    }
    ##### body #####
    # N_ot = number of OUTLINE_TABLE lines seen so far
    ($0 ~ / OUTLINE_TABLE/){ N_ot++ }
    # only the first OUTLINE_TABLE is processed
    (N_ot==1 && $0 ~ / OUTLINE_TABLE/){
        # skip to the line containing <font size
        while($0 !~ /<font size/)
            if(getline<=0){ errorexit=5; exit }
        # store the body (body[1] ~ body[N_body])
        N_body=0
        do{
            if($0 ~ /<\/?font/) gsub(/<\/?font[^>]*>/,"")
            body[++N_body]=$0
            if(getline<=0){ errorexit=6; exit }
        }while($0 !~ /<div/)
        # replace <div> with a line break (<br>)
        sub(/<div[^>]*>/,"<br>")
        body[++N_body]=$0
        # read up to </div>
        while($0 !~ /<\/div/){
            if(getline<=0){ errorexit=7; exit }
            body[++N_body]=$0
        }
        # remove </div>
        sub(/<\/div>/,"",body[N_body])
        next
    }
    ##### Copyright #####
    ($0 ~ /\/YBB module/){
        while($0 !~ /Copyright/)
            if(getline<=0){ errorexit=8; exit }
        N_tail=0
        do{
            gsub(/<\/?small>/,"")
            gsub(/<\/?center>/,"")
            tail[++N_tail]=$0
        }while($0 !~ /<\/html>/ && getline>0)
        exit
    }
    ##### END #####
    END{
        if(errorexit){
            printf "error (code = %d)\n",errorexit > "/dev/stderr"
            exit
        }
        putheader(titlestr,headline)
        putbody(body,N_body)
        puttail(tail,N_tail)
    }

    ##### functions #####
    # output the header part
    function putheader(titlestr,headline)
    {
        print "<html>"
        print "<head>"
        printf "<title>%s</title>\n",titlestr
        print "</head>"
        print "<body>"
        printf "<h1>%s</h1>\n",titlestr
        printf "%s<hr>\n",headline
    }
    # output the body
    function putbody(body,n,  j)
    {
        for(j=1;j<=n;j++) print body[j]
    }
    # output the tail part
    function puttail(tail,n,  j)
    {
        print "<hr>"
        for(j=1;j<=n;j++) print tail[j]
    }

8 Conclusion

This document showed how AWK can be used to process an HTML file. For AWK itself, see [1], [2], [3] and the books [4], [5], [6].

References

[1] AWK (2006)
[2] AWK (2006)
[3] AWK (2006)
[4] A.V. Aho, B.W. Kernighan, P.J. Weinberger (Japanese translation), The AWK Programming Language (2004) (first published 1989)
[5] D. Dougherty, A. Robbins (Japanese translation), sed & awk (1997)
[6] AWK 256 (1993)
[7] HTML (1996)
[8] HTML & XHTML & CSS (2002)
[9] WWW http://www.tohoho-web.com/www.htm