
2011 08H046

1
2
3
  3.1 RSS Dripper [1]
  3.2 Whazzup [2]
  3.3 Summify [3]
  3.4 Paper.li [4]
  3.5
4
  4.1
  4.2
  4.3
  4.4
  4.5
  4.6 Python
  4.7
  4.8
  4.9
  4.10
  4.11
  4.12
5
6
  6.1
A
  A.1 main.py

1
*1 RSS *2 Atom *3 [1, 2] PC *4 *5

*1
*2 RDF Site Summary: XML
*3 RSS 2.0
*4 GNU/Linux, BSD, OS X, MS Windows OS
*5 Android, iOS OS

2 Google Reader RSS RSS Google Reader Google Reader 2

3
3.1 RSS Dripper [1]
Web
3.2 Whazzup [2]
Python/web.py
3.3 Summify [3]
Summify Twitter, Facebook, Google Reader Web iOS RSS 1 1
3.4 Paper.li [4]
Paper.li Twitter Facebook Web
3.5
Google Reader 1, 2

1 Google Reader 2 Google Reader 4

4
4.1
(MacBook Air Late 2010)
Processor: 1.6 GHz Intel Core 2 Duo
Memory: 4 GB 1067 MHz DDR3
OS: Mac OS X Lion 10.7.2 (11C74)
Language: Python 2.7.1
4.2
Google Reader 3

3 6

4.3
2 Google Reader
4.3.1
RSS Dripper Whazzup
4.3.2
A B X B Y A Y Summify Paper.li

4.4
4.4.1
MeCab [5]: ChaSen Juman KAKASI ChaSen 3 4 OS X Spotlight, iPhone OS 2.1
ChaSen [6]: Juman
JUMAN [7]: ChaSen
KAKASI [8]: kanji kana simple inverter
MeCab

4.4.2
UniDic [9]: ChaSen MeCab
mecab-ipadic [10]: IPA IPA CRF MeCab
mecab-jumandic [10]: Juman CRF 30000
mecab-naist-jdic [10, 11]: IPA / IPADIC (ICOT)
4 4 6

4 1. MeCab 4 Xbox360 UniDic Xbox naist-jdic 360 naist-jdic ipadic, jumandic 10

5 2. MeCab 4 jumandic UniDic ipadic, naist-jdic 11

6 3. MeCab 4 ipadic, naist-jdic 2 jumandic 3 ipadic, naist-jdic naist-jdic UniDic jumandic ipadic ipadic MeCab ipadic

4.5 [12]
P(B): probability of event B (prior probability)
P(B|A): probability of event B given that event A has occurred (posterior probability, conditional probability)
P(A) > 0

\[
P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} \qquad (1)
\]
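As a small numerical illustration of equation (1), the sketch below evaluates the posterior for one word. It is written for Python 3, and every number and name in it (`bayes_posterior`, `p_star`, and the word probabilities) is a hypothetical value chosen for the example, not a measurement from this work. P(B) is obtained from the law of total probability and must be positive for the division to be defined.

```python
def bayes_posterior(p_a, p_b_given_a, p_b):
    """Return P(A|B) = P(A) * P(B|A) / P(B), following equation (1)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a * p_b_given_a / p_b


# Hypothetical inputs: A = "entry is starred", B = "entry contains a given word".
p_star = 0.2               # P(A)
p_word_given_star = 0.5    # P(B|A)
p_word_given_other = 0.05  # P(B|not A)

# P(B) by the law of total probability.
p_word = p_star * p_word_given_star + (1 - p_star) * p_word_given_other

posterior = bayes_posterior(p_star, p_word_given_star, p_word)
print(round(posterior, 3))
```

With these assumed inputs, seeing the word raises the probability that an entry is starred from the prior of 0.2 to a considerably higher posterior.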

4.6 Python
Python
gdata-python-client (2.0.15) [13]: Google Google Data API Python Google Google Reader API
MeCab (0.98) [5]
Reverend (0.3) [14]
4.7
Google Reader
4.8
Google API *6 [15, 16] Google Reader SID *7 Google

*6 Application Programming Interface
*7 Session ID

4.9
sqlite3 feeds results
4.9.1 feeds
crawltime : Google Reader
feedurl : URL
itemurl : URL
itemid : ID
title :
body :
status :
4.9.2 results
feedurl : URL
itemid : ID
title :
result_sub :
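The two tables can be sketched with the standard sqlite3 module as follows. This is a minimal Python 3 sketch, not the code of Listing 1: the helper name `open_database`, the in-memory database, and the sample row are assumptions made for illustration. The column order of feeds follows the insert statement in Listing 1 (status second), and `isolation_level=None` selects autocommit as the listing does.

```python
import sqlite3


def open_database(path=":memory:"):
    """Open the database and create the feeds and results tables if missing."""
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    conn.execute(
        "create table if not exists feeds("
        "crawltime, status, feedurl, itemurl, itemid, title, body)"
    )
    conn.execute(
        "create table if not exists results("
        "feedurl, itemid, title, result_sub)"
    )
    return conn


conn = open_database()

# Hypothetical sample row; real rows come from the Google Reader feed.
conn.execute(
    "insert into feeds values (?,?,?,?,?,?,?)",
    ["1325376000000", "star", "feed/http://example.com/rss",
     "http://example.com/item/1", "item-1", "Sample title",
     "Sample title and body text"],
)

rows = conn.execute(
    "select title, status from feeds where status = ?", ["star"]
).fetchall()
print(rows)
```

Using `create table if not exists` avoids the separate file-existence check, at the cost of differing slightly from the listing's control flow.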

4.10
API Google Reader Google Reader API 1000 XML *8 *9
4.10.1
XML XML XML parser *10 API DOM *11
4.10.2
7 8 URL ExtractContent *12 ExtractContent XML

*8 Extensible Markup Language: XML
*9 feeds
*10 XML
*11 Document Object Model: XML
*12 Web
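The DOM-based extraction of entry fields can be sketched as below, in Python 3 with the standard `xml.dom.minidom` parser. The attribute and element names (`gr:crawl-timestamp-msec`, `gr:stream-id`, `link`, `id`, `title`) are those read in Listing 1. The sample document, the namespace URI bound to the `gr` prefix, and the helper names `text_of` and `parse_entries` are assumptions made so that the example is self-contained.

```python
from xml.dom import minidom

# Hypothetical sample in the shape Listing 1 expects; not real API output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:gr="http://www.google.com/schemas/reader/atom/">
  <entry gr:crawl-timestamp-msec="1325376000000">
    <id>tag:example.com,2012:item/1</id>
    <title>Sample entry</title>
    <link href="http://example.com/item/1"/>
    <source gr:stream-id="feed/http://example.com/rss"/>
    <summary>Short summary text</summary>
  </entry>
</feed>"""


def text_of(element):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


def parse_entries(feed_xml):
    """Return one dict per <entry> with the fields stored in the feeds table."""
    records = []
    for entry in minidom.parseString(feed_xml).getElementsByTagName("entry"):
        source = entry.getElementsByTagName("source")[0]
        link = entry.getElementsByTagName("link")[0]
        records.append({
            "crawltime": entry.getAttribute("gr:crawl-timestamp-msec"),
            "feedurl": source.getAttribute("gr:stream-id"),
            "itemurl": link.getAttribute("href"),
            "itemid": text_of(entry.getElementsByTagName("id")[0]),
            "title": text_of(entry.getElementsByTagName("title")[0]),
        })
    return records


records = parse_entries(SAMPLE_FEED)
print(records[0]["title"], records[0]["itemurl"])
```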

7 Web Google Reader 8 Web Google Reader 17

4.10.3
XML HTML 9 10
9 HTML
10 HTML
4.10.4
status
status
starred : star
read : read
unread : unread
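Removing HTML tags from an entry body can be sketched with a non-greedy regular expression, which is the approach `get_subbody` takes in Listing 1. The sketch below is Python 3. The names `TAG_PATTERN` and `strip_tags` and the sample markup are illustrative, and the `re.DOTALL` flag is an addition here so that tags spanning a line break are also removed.

```python
import re

# Non-greedy match of anything between "<" and the next ">".
TAG_PATTERN = re.compile(r"<.*?>", re.DOTALL)


def strip_tags(markup):
    """Return the markup with every <...> tag removed."""
    return TAG_PATTERN.sub("", markup)


sample = '<p>Hello, <a href="http://example.com/">world</a>!</p>'
print(strip_tags(sample))
```

A pattern of this kind only deletes tags. It does not decode character entities or drop script content, so it suits short feed summaries rather than arbitrary web pages.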

4.11
feeds
4.11.1
4.11.2
results feeds XML
result_sub = (2)
4.12
results result_sub Google Reader
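The scoring and selection steps can be sketched as follows, based on how Listing 1 computes result_sub: when the classifier returns two labels, the score is the top probability minus the second, otherwise it is the top probability alone. The sketch is Python 3 and assumes the classifier output is a list of (label, probability) pairs sorted by descending probability. The function names, the sample data, and the ranking by descending result_sub are assumptions, since the corresponding query in the listing is truncated.

```python
def result_sub(guesses):
    """Score an entry from classifier output [(label, probability), ...]."""
    if len(guesses) == 2:
        return guesses[0][1] - guesses[1][1]
    return guesses[0][1]


def pick_candidates(scored, limit=20):
    """Keep entries whose top label is 'star' and rank them by result_sub.

    scored: list of (title, guesses) pairs.
    limit:  number of entries to keep (20 mirrors EXTRACT_FEED_NUM).
    """
    picked = [
        (title, result_sub(guesses))
        for title, guesses in scored
        if guesses and guesses[0][0] == "star"
    ]
    picked.sort(key=lambda item: item[1], reverse=True)
    return picked[:limit]


# Hypothetical classifier output for three unread entries.
scored = [
    ("entry a", [("star", 0.9), ("read", 0.2)]),
    ("entry b", [("read", 0.8), ("star", 0.1)]),
    ("entry c", [("star", 0.6)]),
]
picked = pick_candidates(scored, limit=2)
print([title for title, _ in picked])
```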

5 Google Reader 11 20 12 13 1 Web Google Reader 12,14 PC Google Reader 15,16 11 Google Reader 20

12 Google Reader 13 21

14 Web Google Reader iPhone 4 Safari
15 iOS Sylfeed Version 2.1.1

16 iOS Reeder Version 2.5.4

6
Google Reader PC Google Reader
6.1
6.1.1
Web Python Web
6.1.2
OAuth SID OAuth Google Reader API
6.1.3

[1] RSS Dripper. http://ns.oblique-project.com/rssdripper/.
[2] Whazzup. http://code.google.com/p/whazzup/.
[3] Summify. http://summify.com/.
[4] Paper.li. http://paper.li/.
[5] MeCab: Yet another part-of-speech and morphological analyzer. http://mecab.sourceforge.net/.
[6] ChaSen. http://chasen-legacy.sourceforge.jp/.
[7] JUMAN - Kurohashi-Kawahara Lab. http://nlp.ist.i.kyoto-u.ac.jp/index.php?juman.
[8] KAKASI. http://kakasi.namazu.org.
[9] UniDic. http://www.tokuteicorpus.jp/dist/.
[10] mecab - downloads. http://code.google.com/p/mecab/downloads/list.
[11] NAIST-jdic wiki. http://sourceforge.jp/projects/naist-jdic/wiki/frontpage.
[12] Bayes' theorem. http://ja.wikipedia.org/wiki/%e3%83%99%e3%82%a4%e3%82%ba%e3%81%ae%e5%ae%9A%E7%90%86.
[13] gdata-python-client. http://code.google.com/p/gdata-python-client/.
[14] Reverend. https://github.com/arnaudsj/reverend.
[15] Koji Yamashita. google reader api api. http://colo-ri.jp/develop/2009/12/google-reader-apiapi.html, 2009.
[16] MOIMOI. Google python google reader api. http://moimoitei.blogspot.com/2011/03/google-python-google-reader-api.html, 2011.

A
A.1 main.py

Listing 1 main.py

    #!/usr/bin/env python
    # coding: utf-8

    import gdata.service
    import sqlite3
    import os
    import MeCab
    import urllib
    import re
    from xml.dom import minidom
    from reverend.thomas import Bayes

    USER_NAME = '@gmail.com'
    USER_PASSWD = ''
    EXTRACT_FEED_NUM = 20
    LABEL_NAME = 'NiceFeed'
    GET_FEED_NUM = 1000

    class Reader():
        def __init__(self):
            self.auth()
            self.load_database('feeddata.db')

        def auth(self):
            self.service = gdata.service.GDataService(account_type='GOOGLE', service='reader', serve  # [truncated in source]
            self.service.ClientLogin(USER_NAME, USER_PASSWD)
            self.token = self.service.Get('/reader/api/0/token', converter=lambda x: x)

        def load_database(self, filename):
            if os.path.isfile(filename):
                self.database = sqlite3.connect(filename, isolation_level=None)
            else:
                self.database = sqlite3.connect(filename, isolation_level=None)
                self.database.execute('create table feeds(crawltime, status, feedurl, itemurl, item  # [truncated in source]
                self.database.execute('create table results(feedurl, itemid, title, result_sub)')
            table = self.database.execute("select * from sqlite_master where type='table' and name  # [truncated in source]
            if table.fetchone() != None:
                self.add_label()
                self.database.execute('delete from results')

        def query_selecter(self, status):
            try:
                crawltime = int(self.database.execute('select max(crawltime) from feeds where status  # [truncated in source]
                crawltime = str(crawltime + 1)
            except:
                crawltime = ''
            if status == 'star':
                return gdata.service.Query(feed='/reader/atom/user/-/state/com.google/starred', param  # [truncated in source]
            elif status == 'read':
                return gdata.service.Query(feed='/reader/atom/user/-/state/com.google/read', params={  # [truncated in source]
            elif status == 'unread':
                return gdata.service.Query(feed='/reader/atom/user/-/state/com.google/reading-list',  # [truncated in source]

        def add_entry_data(self, status):
            query = self.query_selecter(status)
            feedxml = self.service.Get(query.ToUri(), converter=lambda x: x)
            entries = minidom.parseString(feedxml).getElementsByTagName('entry')
            for entry in entries:
                crawltime = entry.attributes['gr:crawl-timestamp-msec'].value
                feedurl = entry.getElementsByTagName('source')[0].attributes['gr:stream-id'].value
                itemurl = entry.getElementsByTagName('link')[0].attributes['href'].value
                itemid = entry.getElementsByTagName('id')[0].childNodes[0].data
                title = entry.getElementsByTagName('title')[0].childNodes[0].data
                body = title + self.get_subbody(entry, 'content') + self.get_subbody(entry, 'summary')
                values = [crawltime, status, feedurl, itemurl, itemid, title, body]
                self.database.execute('insert into feeds values(?,?,?,?,?,?,?)', values)

        def get_subbody(self, entry, tag):
            try:
                data = entry.getElementsByTagName(tag)[0].childNodes[0].data
                return re.sub('<.*?>', '', data)
            except:
                return ''

        def train(self, status):
            for body in self.database.execute('select body from feeds where status=?', [status]):
                wakati_body = MeCab.Tagger('-Owakati').parse(body[0].encode('utf-8'))
                self.guesser.train(status, wakati_body)

        def extract_feed(self):
            self.guesser = Bayes()
            self.train('star')
            self.train('read')
            for body, feedurl, itemid, title in self.database.execute('select body, feedurl, itemid, tit  # [truncated in source]
                wakati_body = MeCab.Tagger('-Owakati').parse(title.encode('utf-8'))
                results = self.guesser.guess(wakati_body)
                if len(results) > 0 and results[0][0] == 'star' and not title.startswith(('PR:', 'AD:  # [truncated in source]
                    if len(results) == 2:
                        result_sub = results[0][1] - results[1][1]
                    else:
                        result_sub = results[0][1]
                    values = [feedurl, itemid, title, result_sub]
                    self.database.execute('insert into results values(?,?,?,?)', values)

        def add_label(self):
            for feedurl, itemid, title, result_sub in self.database.execute('select feedurl, itemid, ti  # [truncated in source]
                params = urllib.urlencode({'s': feedurl, 'i': itemid, 'a': 'user/-/label/' + LABEL_NAME,  # [truncated in source]
                self.service.Post(params, '/reader/api/0/edit-tag', converter=lambda x: x, extra_headers  # [truncated in source]
                self.database.execute('delete from feeds where itemid=?', [itemid])
                print result_sub, title
            self.database.execute('delete from results')

        def main(self):
            self.add_entry_data('star')
            self.add_entry_data('read')
            self.add_entry_data('unread')
            self.extract_feed()
            self.add_label()

    if __name__ == '__main__':
        reader = Reader()
        reader.main()