Internet Measurement and Data Analysis, Class 2


Class 2, April 20, 2015

Class 1 (4/13)
  exercise: ruby

Class 2
  exercise: gnuplot

1 1 2014 6 IIJ / 4 / 49


summary statistics
  measures of location: mean, median, mode
  measures of spread: range, variance, standard deviation

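mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
median: the middle value of the sorted data; for $m$ sorted values,
$$x_{median} = \begin{cases} x_{r+1} & m = 2r + 1 \\ (x_r + x_{r+1})/2 & m = 2r \end{cases}$$
mode: the most frequent value
[Figure: a distribution f(x) with the mode, median, and mean marked along x]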
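The three definitions above can be sketched in a few lines of Ruby. This is a minimal illustration, not part of the original slides: the function names and the data values are made up, and `Array#sum` assumes Ruby 2.4 or later.

```ruby
# Summary statistics for a small, made-up data set (hypothetical values).
def mean(data)
  data.sum.to_f / data.length
end

def median(data)
  sorted = data.sort
  m = sorted.length
  r = m / 2
  # odd m = 2r + 1: the middle element (x_{r+1}, which is sorted[r] with 0-based indexing)
  # even m = 2r: the average of the two middle elements
  m.odd? ? sorted[r].to_f : (sorted[r - 1] + sorted[r]) / 2.0
end

def mode(data)
  # the most frequent value; if several values tie, one of them is returned
  data.group_by { |v| v }.max_by { |_value, group| group.length }.first
end

data = [3, 1, 4, 1, 5, 9, 2, 6]
printf "mean:%.3f median:%.1f mode:%d\n", mean(data), median(data), mode(data)
```

For this data the sorted values are 1, 1, 2, 3, 4, 5, 6, 9, so the median averages 3 and 4, and the mode is 1 because it is the only value that occurs twice.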

percentiles
  pth-percentile: the value below which p% of the sorted observations fall
  median = 50th-percentile
[Figure: total observations (%) versus sorted variable x]
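There are several conventions for computing a percentile from a finite data set. The sketch below uses the nearest-rank rule, which is an assumption of this example and not necessarily the rule used in the lecture; the function name and data are hypothetical.

```ruby
# Nearest-rank percentile: the smallest value such that at least p% of the
# observations are less than or equal to it (one of several common definitions).
def percentile(data, p)
  raise ArgumentError, "p must be in (0, 100]" unless p > 0 && p <= 100
  sorted = data.sort
  rank = (p * sorted.length / 100.0).ceil   # 1-based rank into the sorted data
  sorted[rank - 1]
end

data = [15, 20, 35, 40, 50]   # hypothetical values
[25, 50, 90].each do |p|
  puts "#{p}th-percentile: #{percentile(data, p)}"
end
```

With the nearest-rank rule the result is always one of the observed values. For an even number of observations the 50th-percentile is therefore the lower of the two middle values, not their average as in the median definition.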

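range: difference between the maximum and the minimum
variance: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$
standard deviation: $\sigma$
  for a normal distribution, about 68% of the values fall within (mean ± stddev) and about 95% within (mean ± 2stddev)
[Figure: f(x) = exp(-x**2/2) with the mean/median, σ, and the 68% and 95% ranges marked]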

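variance:
$$\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 \\
&= \frac{1}{n}\sum_{i=1}^{n}(x_i^2 - 2x_i\bar{x} + \bar{x}^2) \\
&= \frac{1}{n}\Bigl(\sum_{i=1}^{n}x_i^2 - 2\bar{x}\sum_{i=1}^{n}x_i + n\bar{x}^2\Bigr) \\
&= \frac{1}{n}\sum_{i=1}^{n}x_i^2 - 2\bar{x}^2 + \bar{x}^2 \\
&= \frac{1}{n}\sum_{i=1}^{n}x_i^2 - \bar{x}^2
\end{aligned}$$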
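The first and last lines of the derivation can be checked numerically. The sketch below computes the variance both ways on a small made-up data set; the function names are hypothetical.

```ruby
# Two ways to compute the (population) variance of the same data.
def variance_two_pass(data)
  n = data.length
  mean = data.sum.to_f / n
  data.sum { |x| (x - mean)**2 } / n    # (1/n) * sum of squared deviations
end

def variance_one_pass(data)
  n = data.length
  sum = 0
  sqsum = 0
  data.each do |x|
    sum += x
    sqsum += x * x
  end
  mean = sum.to_f / n
  sqsum.to_f / n - mean**2              # (1/n) * sum of squares - mean^2
end

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values
printf "two-pass:%.3f one-pass:%.3f\n", variance_two_pass(data), variance_one_pass(data)
```

The one-pass form needs no stored copy of the data, which matters for large inputs. With floating-point data whose mean is large compared to its spread, it can lose precision because two nearly equal numbers are subtracted.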


: 1/N ( ) 1/N : 1 13 / 49

population and sample
[Figure: samples are drawn from the population, and estimates about the population are made from the samples]

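central limit theorem: the distribution of the sample mean approaches the normal distribution $N(\mu, \sigma/\sqrt{n})$ as the sample size $n$ grows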
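A small simulation can make the $\sigma/\sqrt{n}$ part of this statement concrete. The sketch below is an illustration under stated assumptions: the data is drawn from Uniform(0, 1), an arbitrary choice whose standard deviation is $\sqrt{1/12}$, and the helper names, seed, and trial count are hypothetical. It only checks the spread of the sample means, not the shape of their distribution.

```ruby
# Simulation sketch: the spread of the sample mean shrinks as sigma / sqrt(n).
def stddev(values)
  mean = values.sum / values.length
  Math.sqrt(values.sum { |v| (v - mean)**2 } / values.length)
end

# Draw `trials` samples of size n from Uniform(0, 1) and return their means.
def sample_means(rng, n, trials)
  Array.new(trials) { Array.new(n) { rng.rand }.sum / n }
end

rng = Random.new(42)           # fixed seed so that runs are repeatable
sigma = Math.sqrt(1.0 / 12)    # standard deviation of Uniform(0, 1)
[4, 16, 64].each do |n|
  observed = stddev(sample_means(rng, n, 2000))
  printf "n:%d observed:%.4f sigma/sqrt(n):%.4f\n", n, observed, sigma / Math.sqrt(n)
end
```

The two printed columns should agree to within a few percent, and each fourfold increase of n should roughly halve both.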

normal distribution $N(\mu, \sigma)$
  2 parameters: mean $\mu$ and standard deviation $\sigma$
[Figure: f(x) = exp(-x**2/2) with the mean/median, σ, and the 68% and 95% ranges marked]

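sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
sample standard deviation: $s$
  note: the sum is divided by $(n-1)$, not $n$
  degrees of freedom: 1 is lost because $\bar{x}$ is computed from the same sample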
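The sample statistics differ from the population formulas only in the divisor. A minimal sketch, with hypothetical function names and made-up data:

```ruby
def sample_mean(data)
  data.sum.to_f / data.length
end

def sample_variance(data)
  raise ArgumentError, "need at least 2 values" if data.length < 2
  mean = sample_mean(data)
  data.sum { |x| (x - mean)**2 } / (data.length - 1)   # divide by n - 1, not n
end

def sample_stddev(data)
  Math.sqrt(sample_variance(data))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values
printf "mean:%.2f s^2:%.3f s:%.3f\n",
       sample_mean(data), sample_variance(data), sample_stddev(data)
```

For these 8 values the squared deviations sum to 32, so the sample variance is 32/7, slightly larger than the 32/8 obtained when dividing by n.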

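standard error (SE): the standard deviation of the sample mean
  $SE = \sigma/\sqrt{n}$
  SE shrinks in proportion to $1/\sqrt{n}$ as $n$ grows
  for samples from $N(\mu, \sigma)$, the sample mean estimates $\mu$ with $SE = \sigma/\sqrt{n}$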
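The $1/\sqrt{n}$ behavior is easy to see with numbers. The sketch below simply evaluates the formula; the function name is hypothetical, and the value 14.1 is borrowed from the marathon example later in these slides purely as an illustrative $\sigma$.

```ruby
# Standard error of the sample mean for a known standard deviation sigma.
def standard_error(sigma, n)
  sigma / Math.sqrt(n)
end

sigma = 14.1   # illustrative value only
[100, 400, 1600].each do |n|
  printf "n:%d SE:%.4f\n", n, standard_error(sigma, n)
end
```

Each fourfold increase of the sample size halves the standard error, so gaining one more decimal digit of accuracy in the mean takes about 100 times more samples.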

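sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
why $(n-1)$: the deviations are measured from the sample mean $\bar{x}$, not from the population mean $\mu$, so the variance $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$ underestimates $\sigma^2$. Because $\bar{x}$ is itself distributed around $\mu$ as $N(\mu, \sigma/\sqrt{n})$, $S^2$ is smaller by a factor of $(n-1)/n$ on average:
$$E(S^2) = \frac{n-1}{n}\sigma^2$$
so $\sigma^2$ is estimated by
$$\sigma^2 = \frac{n}{n-1}S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$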

[Figure: three example plots: normalized traffic volume versus time (sec), the cdf of normalized traffic volume, and a scatter plot]

example: marathon finish times
sample data from a book: P. K. Janert, "Gnuplot in Action"

    # Minutes Count
    133 1
    134 7
    135 1
    136 4
    137 3
    138 3
    141 7
    142 24
    ...

  n: 2,355
  mean: 171.3
  standard deviation: 14.1
  median: 176

example: marathon finish times (2)
[Figure: histogram of count versus finish time (minutes)]

example: marathon finish times (3)
[Figure: rank versus finish time (minutes)]

XY XY : 0 ( ) XY 3D ( : ) 24 / 49


time-series plot
  X axis: time
  Y axis: observed value
[Figure: normalized traffic volume versus time (sec)]

histogram (1/2)
  X axis: value of the variable, divided into bins
  Y axis: frequency (number of observations in each bin)
[Figure: frequency versus normalized traffic volume]
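Building a histogram amounts to assigning each value to a bin and counting. A minimal sketch, assuming fixed-width bins labeled by their left edge; the function name, the bin width, and the data are hypothetical.

```ruby
# Count how many values fall into each bin of a fixed width.
# Returns a hash that maps the left edge of each non-empty bin to its count.
def histogram(data, bin_width)
  counts = Hash.new(0)
  data.each do |x|
    bin_start = (x / bin_width.to_f).floor * bin_width   # left edge of the bin
    counts[bin_start] += 1
  end
  counts.sort.to_h
end

data = [-1.2, -0.4, -0.3, 0.1, 0.2, 0.2, 0.9, 1.5]   # hypothetical values
histogram(data, 0.5).each do |bin_start, count|
  printf "%5.1f %s\n", bin_start, "*" * count
end
```

The choice of bin width changes the picture: bins that are too wide hide structure, and bins that are too narrow leave most bins with only a few observations.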


probability density function (pdf)
  the frequencies are normalized so that they sum to 1
  probability that X takes the value x: $f(x) = P[X = x]$
[Figure: pdf versus normalized traffic volume]

cumulative distribution function (cdf)
  density function: probability that X takes the value x: $f(x) = P[X = x]$
  cumulative distribution function: probability that X is x or less: $F(x) = P[X \le x]$
[Figure: cdf versus normalized traffic volume]
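For observed data, $F(x)$ can be estimated as the fraction of observations that are x or less. A minimal sketch of this empirical CDF, with a hypothetical function name and made-up data:

```ruby
# Empirical CDF: F(x) = (number of observations <= x) / (total number of observations).
# Returns [value, F(value)] pairs in ascending order of value.
def empirical_cdf(data)
  n = data.length.to_f
  cum = 0
  data.group_by { |x| x }.sort.map do |value, group|
    cum += group.length          # running count of observations <= value
    [value, cum / n]
  end
end

data = [3, 1, 2, 2, 5, 3, 2, 4]   # hypothetical values
empirical_cdf(data).each do |x, f|
  printf "%d\t%.3f\n", x, f
end
```

The result never decreases and ends at exactly 1, and unlike a histogram it needs no choice of bin width.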

histogram vs. CDF
[Figure: ping rtt, response time (msec): (left) histogram of the original data, 8241 samples; (middle) histogram of 100 samples; (right) CDFs of both]

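box plot and interquartile range
interquartile range (IQR): upper quartile minus lower quartile (the range that covers the middle 50% of the data)
box plot:
  box: 25/50/75-percentiles
  whiskers: min/max, or the inner fences $(Q_1 - 1.5\,IQR,\ Q_3 + 1.5\,IQR)$
[Figure: box plot annotated with max, upper quartile, mean, median, lower quartile, and min]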
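The numbers behind a box plot can be computed directly. The sketch below assumes quartiles obtained by linear interpolation between sorted values, which is one common convention (other tools use slightly different rules); the function names and the response times are hypothetical.

```ruby
# Quantile by linear interpolation between neighboring sorted values.
def quantile(sorted, q)
  pos = q * (sorted.length - 1)   # fractional 0-based position
  lo = pos.floor
  hi = pos.ceil
  sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo)
end

def box_stats(data)
  sorted = data.sort
  q1 = quantile(sorted, 0.25)
  q3 = quantile(sorted, 0.75)
  iqr = q3 - q1
  { min: sorted.first, q1: q1, median: quantile(sorted, 0.5), q3: q3,
    max: sorted.last, iqr: iqr,
    lower_fence: q1 - 1.5 * iqr, upper_fence: q3 + 1.5 * iqr }
end

data = [310, 320, 330, 340, 350, 360, 370, 380, 990]   # hypothetical response times (msec)
stats = box_stats(data)
stats.each { |name, value| puts "#{name}: #{value}" }
outliers = data.select { |x| x < stats[:lower_fence] || x > stats[:upper_fence] }
puts "outside the inner fences: #{outliers.inspect}"
```

Here the single large value 990 falls outside the upper inner fence, so whiskers drawn to the fences would exclude it while whiskers drawn to min/max would be stretched by it.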

box plot example (original vs 100 samples)
  whiskers: min and max
[Figure: box plots of the original data and of 100 samples, next to the CDFs of the 8241 samples and of the 100 samples, response time (msec)]

scatter plots
  show the relationship between 2 variables
  X axis: variable X
  Y axis: variable Y
[Figure: three scatter plots, labeled (left) 0.7, (middle) 0.0, (right) -0.5]

plotting tools
  gnuplot: http://gnuplot.info/
  grace (GUI): http://plasma-gate.weizmann.ac.il/grace/
installing gnuplot
  Mac: gnuplot package from Homebrew/MacPorts (with XQuartz)
  Windows: windows package

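example: counting the number of lines in a file (count.rb)

    filename = ARGV[0]
    count = 0
    file = open(filename)
    # read one line at a time until the end of the file
    while text = file.gets
      count += 1
    end
    file.close
    puts count

    $ ruby count.rb foo.txt

a more Ruby-like version

    #!/usr/bin/env ruby
    count = 0
    # ARGF reads from the files named on the command line (or from stdin)
    ARGF.each_line do |line|
      count += 1
    end
    puts count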

exercise data: marathon finish times
  source: P. K. Janert, "Gnuplot in Action"
  http://web.sfc.keio.ac.jp/~kjc/classes/sfc2015s-measurement/marathon.txt

exercise: computing the mean

    # regular expression to read minutes and count
    re = /^(\d+)\s+(\d+)/
    sum = 0 # sum of data
    n = 0 # the number of data
    ARGF.each_line do |line|
      if re.match(line)
        min = $1.to_i
        cnt = $2.to_i
        sum += min * cnt
        n += cnt
      end
    end
    mean = Float(sum) / n
    printf "n:%d mean:%.1f\n", n, mean

    % ruby mean.rb marathon.txt
    n:2355 mean:171.3

exercise: computing the standard deviation
  variance: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$

    # regular expression to read minutes and count
    re = /^(\d+)\s+(\d+)/
    data = Array.new
    sum = 0 # sum of data
    n = 0 # the number of data
    ARGF.each_line do |line|
      if re.match(line)
        min = $1.to_i
        cnt = $2.to_i
        sum += min * cnt
        n += cnt
        # keep one array element per observation
        for i in 1..cnt
          data.push min
        end
      end
    end
    mean = Float(sum) / n
    sqsum = 0.0
    data.each do |i|
      sqsum += (i - mean)**2
    end
    var = sqsum / n
    stddev = Math.sqrt(var)
    printf "n:%d mean:%.1f variance:%.1f stddev:%.1f\n", n, mean, var, stddev

    % ruby stddev.rb marathon.txt
    n:2355 mean:171.3 variance:199.9 stddev:14.1

exercise: computing the standard deviation in a single pass
  variance: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}x_i^2 - \bar{x}^2$

    # regular expression to read minutes and count
    re = /^(\d+)\s+(\d+)/
    sum = 0 # sum of data
    n = 0 # the number of data
    sqsum = 0 # sum of squares
    ARGF.each_line do |line|
      if re.match(line)
        min = $1.to_i
        cnt = $2.to_i
        sum += min * cnt
        n += cnt
        sqsum += min**2 * cnt
      end
    end
    mean = Float(sum) / n
    var = Float(sqsum) / n - mean**2
    stddev = Math.sqrt(var)
    printf "n:%d mean:%.1f variance:%.1f stddev:%.1f\n", n, mean, var, stddev

    % ruby stddev2.rb marathon.txt
    n:2355 mean:171.3 variance:199.9 stddev:14.1

exercise: computing the median

    # regular expression to read minutes and count
    re = /^(\d+)\s+(\d+)/
    data = Array.new
    ARGF.each_line do |line|
      if re.match(line)
        min = $1.to_i
        cnt = $2.to_i
        for i in 1..cnt
          data.push min
        end
      end
    end
    data.sort! # just in case data is not sorted
    n = data.length # number of array elements
    r = n / 2 # when n is odd, n/2 is rounded down
    if n % 2 != 0
      median = data[r]
    else
      # integer division: the average is truncated to an integer
      median = (data[r - 1] + data[r])/2
    end
    printf "r:%d median:%d\n", r, median

    % ruby median.rb marathon.txt
    r:1177 median:176

exercise: plotting with gnuplot

plotting the histogram

    plot "marathon.txt" using 1:2 with boxes

a more readable version (box width, axis labels, y range, and grid set)

    set boxwidth 1
    set xlabel "finish time (minutes)"
    set ylabel "count"
    set yrange [0:180]
    set grid y
    plot "marathon.txt" using 1:2 with boxes notitle

[Figure: the two resulting histograms of count versus finish time (minutes)]

exercise: plotting the CDF of the finish times
original data:

    # Minutes Count
    133 1
    134 7
    135 1
    136 4
    137 3
    138 3
    141 7
    142 24
    ...

with the cumulative count added:

    # Minutes Count CumulativeCount
    133 1 1
    134 7 8
    135 1 9
    136 4 13
    137 3 16
    138 3 19
    141 7 26
    142 24 50
    ...

exercise: plotting the CDF of the finish times (2)
ruby code:

    re = /^(\d+)\s+(\d+)/
    cum = 0
    ARGF.each_line do |line|
      if re.match(line)
        # matched
        time, cnt = $~.captures
        cum += cnt.to_i
        puts "#{time}\t#{cnt}\t#{cum}"
      end
    end

gnuplot command:

    set xlabel "finish time (minutes)"
    set ylabel "CDF"
    set grid y
    plot "marathon-cdf.txt" using 1:($3 / 2355) with lines notitle
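The gnuplot command above hard-codes the total of 2355. As a variation, and only as a sketch that is not part of the original slides, the script could compute the total itself and emit the cumulative fraction as a fourth column. The function name `cdf_rows` is hypothetical, and the sample lines are the first rows of the marathon data.

```ruby
# Turn "minutes count" lines into [minutes, count, cumulative count, cumulative fraction] rows.
def cdf_rows(lines)
  re = /^(\d+)\s+(\d+)/
  # keep only lines that match; comment lines such as "# Minutes Count" are skipped
  rows = lines.map { |line| re.match(line) }.compact.map { |m| [m[1].to_i, m[2].to_i] }
  total = rows.sum { |_min, cnt| cnt }
  cum = 0
  rows.map do |min, cnt|
    cum += cnt
    [min, cnt, cum, cum.to_f / total]
  end
end

sample = ["# Minutes Count", "133 1", "134 7", "135 1", "136 4"]
cdf_rows(sample).each do |min, cnt, cum, frac|
  printf "%d\t%d\t%d\t%.4f\n", min, cnt, cum, frac
end
```

With the whole file passed in (for example `cdf_rows(ARGF.readlines)`), the fourth column can be plotted directly with `using 1:4`, at the cost of holding all rows in memory to compute the total first.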

[Figure: CDF of the finish time (minutes)]

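other gnuplot commands
write the current plot to a PNG file:

    gnuplot> set terminal png
    gnuplot> set output "plotfile.png"
    gnuplot> replot

run the commands in a script file:

    gnuplot> load "scriptfile"

exit gnuplot:

    gnuplot> quit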

Class 2
  exercise: gnuplot

next class: Class 3 (4/27)