Ch.10 Summarizing Data 10.2 Methods based on the C.D.F


Upload: gwenifer-allen

Post on 01-Jan-2016


TRANSCRIPT

Page 1: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

Defn: Suppose that $x_1, x_2, \ldots, x_n$ is a batch of numbers. The empirical cumulative distribution function (ecdf) is defined as

$$F_n(x) = \frac{1}{n}\,\#(x_i \le x).$$

When $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ is the ordered batch of numbers,

$$F_n(x) = \begin{cases} 0, & \text{if } x < x_{(1)} \\ \dfrac{k}{n}, & \text{if } x_{(k)} \le x < x_{(k+1)} \\ 1, & \text{if } x \ge x_{(n)} \end{cases}$$

Remark: $F(x)$ gives the probability that $X \le x$. $F_n(x)$ gives the proportion of the collection of numbers less than or equal to $x$.
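The ecdf definition translates almost directly into code. The following is a minimal sketch, not taken from the slides: the function name `ecdf` and the small batch of numbers are assumptions chosen for illustration.

```python
from bisect import bisect_right

def ecdf(batch):
    """Return F_n for a batch of numbers: F_n(x) = #(x_i <= x) / n."""
    ordered = sorted(batch)          # order statistics x_(1) <= ... <= x_(n)
    n = len(ordered)

    def F_n(x):
        # bisect_right counts how many ordered values are <= x
        return bisect_right(ordered, x) / n

    return F_n

F_n = ecdf([3.0, 1.0, 2.0, 2.0])     # hypothetical batch, not from the slides
print(F_n(0.5), F_n(2.0), F_n(3.0))  # 0.0 0.75 1.0
```

Sorting once and then using a binary search keeps each evaluation of `F_n` at O(log n), which matters when the step function is evaluated on a fine grid for plotting.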

Page 2: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

E.g. Beeswax: 59 sources, 59 melting points.

Purpose: To investigate chemical methods for detecting the presence of synthetic waxes that have been added to beeswax.

Known: The addition of microcrystalline wax raises the melting point of beeswax. The melting point of beeswax varies from one beehive to another.

Method: Collect the melting points of beeswax from 59 sources and use $F_n$ to analyze the distribution of the melting point of pure beeswax.

Figure 10.1

Page 3: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

If $X_1, \ldots, X_n$ is a random sample from a continuous distribution function $F$, then

$$F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I_{(-\infty,x]}(X_i),$$

where

$$I_{(-\infty,x]}(X_i) = \begin{cases} 1, & \text{if } X_i \le x \quad (\text{with prob. } F(x)) \\ 0, & \text{if } X_i > x \quad (\text{with prob. } 1 - F(x)) \end{cases}$$

$nF_n(x)$ is a $B(n, F(x))$ r.v., so

$$E[F_n(x)] = F(x), \qquad \mathrm{Var}(F_n(x)) = \frac{1}{n}F(x)[1 - F(x)].$$

$F_n(x)$ has a maximum variance at that value of $x$ such that $F(x) = 0.5$.

Defn: The survival function is defined as

$$S(t) = P(T > t) = 1 - F(t),$$

where $T$ is a r.v. with c.d.f. $F$.

The empirical survival function is $S_n(t) = 1 - F_n(t)$.
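Because $nF_n(x)$ is binomial with parameters $n$ and $F(x)$, the mean $F(x)$ and variance $F(x)[1 - F(x)]/n$ of $F_n(x)$ can be checked exactly from the binomial probabilities. The sketch below does that; the helper name `ecdf_moments` and the sample size 20 are assumptions made for illustration.

```python
from math import comb

def ecdf_moments(n, p):
    """Exact mean and variance of F_n(x) when n*F_n(x) ~ B(n, p), p = F(x)."""
    # Binomial probabilities P(n*F_n(x) = k) for k = 0, ..., n
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var

n = 20  # hypothetical sample size
for p in (0.1, 0.5, 0.9):
    mean, var = ecdf_moments(n, p)
    # Last column is the closed form F(x)[1 - F(x)] / n for comparison
    print(p, round(mean, 6), round(var, 6), round(p * (1 - p) / n, 6))
```

The variance column should be largest in the row with p = 0.5, in line with the statement that the variance peaks where $F(x) = 0.5$.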

Page 4: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

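E.g. Guinea pigs inoculated with different doses of tubercle bacilli

Control group (107 animals)

Inoculated groups I, II, III, IV, V: 72 animals each, ordered by increasing dose. Table 10.2.1

Table 10.2.2

In the control group and the lower-dose groups, not all of the guinea pigs died. As a result the control group has only 65 observations, and Dose I and Dose II have only 60 and 67 observations, respectively.

Purpose: To compare how the effect of increasing the dose differs between guinea pigs with different levels of resistance. E.g. for Groups III and IV:

(1) For the guinea pigs with weak resistance (the 10% weakest: draw a horizontal line at 0.9 on the y-axis of the plot; the corresponding value on the x-axis is the longest lifetime among the 10% weakest), the difference between Groups III and IV is about 50 days.

(2) For the guinea pigs with stronger resistance, however, the difference between Groups III and IV is about 100 days.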

Page 5: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

Hazard function: the instantaneous death rate for individuals who have survived up to a given time.

$$P(t \le T \le t+s \mid T \ge t) = \frac{P(t \le T \le t+s)}{P(T \ge t)} = \frac{F(t+s) - F(t)}{1 - F(t)} \approx \frac{s\,f(t)}{1 - F(t)}$$

The hazard function is defined as

$$h(t) = \frac{f(t)}{1 - F(t)}$$

(the instantaneous rate of mortality for an individual alive at time $t$)

&

$$h(t) = -\frac{d}{dt}\log[1 - F(t)] = -\frac{d}{dt}\log S(t)$$

(the negative of the slope of the log of the survival function)
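The link between the hazard function $h(t) = f(t)/[1 - F(t)]$ and the log survival function is a single application of the chain rule. A short worked version, assuming $F$ is differentiable with $F'(t) = f(t)$ and $F(t) < 1$:

```latex
\begin{aligned}
-\frac{d}{dt}\log[1 - F(t)]
  &= -\frac{1}{1 - F(t)} \cdot \frac{d}{dt}[1 - F(t)] \\
  &= -\frac{1}{1 - F(t)} \cdot \bigl(-f(t)\bigr) \\
  &= \frac{f(t)}{1 - F(t)} = h(t).
\end{aligned}
```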

Page 6: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

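E.g. The exponential distribution

$$F(t) = 1 - e^{-\lambda t}, \qquad S(t) = e^{-\lambda t}, \qquad f(t) = \lambda e^{-\lambda t}, \qquad h(t) = \lambda$$

(the "memoryless" property of the exponential distribution)

Hazard function & empirical survival function.

$$S_n(t) = 1 - F_n(t) = 1 - \frac{i}{n}, \quad \text{if } T_{(i)} \le t < T_{(i+1)}$$

To compute $\log S_n(t)$, a modification is needed, because with this definition $\log S_n(t)$ is not defined for $t \ge T_{(n)}$. Modify it to

$$S_n(t) = 1 - \frac{i}{n+1}, \quad \text{for } T_{(i)} \le t < T_{(i+1)}$$

Then $h(t)$ is the negative of the slope of the graph of $\log S_n(t)$.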
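The modified empirical survival function $S_n(t) = 1 - i/(n+1)$ keeps the logarithm finite at the largest observation. The sketch below computes the points of the empirical log survival plot and the negative slopes between them; the function names and the four lifetimes are hypothetical, chosen only to illustrate the computation.

```python
from math import log

def log_survival_points(lifetimes):
    """Return (T_(i), log S_n(T_(i))) using the modified S_n = 1 - i/(n+1)."""
    ordered = sorted(lifetimes)
    n = len(ordered)
    return [(t, log(1 - i / (n + 1))) for i, t in enumerate(ordered, start=1)]

def hazard_slopes(points):
    """Negative slopes between successive points of the log survival plot."""
    return [-(y1 - y0) / (t1 - t0)
            for (t0, y0), (t1, y1) in zip(points, points[1:])
            if t1 > t0]                          # skip ties to avoid dividing by zero

pts = log_survival_points([5.0, 1.0, 3.0, 8.0])  # hypothetical lifetimes
for t, y in pts:
    print(t, round(y, 4))
print([round(s, 4) for s in hazard_slopes(pts)])
```

With the unmodified $1 - i/n$, the last point would require $\log 0$; dividing by $n + 1$ avoids that. The raw slopes between neighbouring points are very noisy, so in practice the slope is usually read off the plot over a longer stretch of the curve.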

Page 7: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

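E.g. The guinea pig example

(1) Initially, the hazard function is small in every group.

(2) As the dose increases, the instantaneous mortality rate not only increases, but the rate at which it increases also grows with the dose.

E.g. Comparing Groups III and V: the instantaneous mortality rate of Group V increases faster than that of Group III, and the final slope of its curve is also larger than that of Group III.

Figure 10.2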

Page 8: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

The variability of the empirical log survival function.

$$\mathrm{Var}\bigl(\log(1 - F_n(t))\bigr) \approx \frac{\mathrm{Var}(1 - F_n(t))}{[1 - F(t)]^2} = \frac{1}{n}\,\frac{F(t)[1 - F(t)]}{[1 - F(t)]^2} = \frac{1}{n}\,\frac{F(t)}{1 - F(t)}$$

For large values of $t$, $1 - F(t) \approx 0$, and therefore $\mathrm{Var}(\log(1 - F_n(t)))$ is large.

For large values of $t$, the empirical log survival function is extremely unreliable.

E.g. Note the large fluctuations of the log survival functions for large times.

Figure 10.3
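To see how quickly the approximate variance $F(t)/\{n[1 - F(t)]\}$ of the empirical log survival function grows in the right tail, it helps to tabulate it for a few values of $F(t)$. This is an illustrative sketch; the function name is an assumption, and n = 72 is simply the size of one inoculated group in the guinea pig example.

```python
def approx_var_log_survival(F_t, n):
    """Approximate Var(log(1 - F_n(t))) = F(t) / (n * (1 - F(t)))."""
    if not 0.0 <= F_t < 1.0:
        raise ValueError("F(t) must lie in [0, 1)")
    return F_t / (n * (1.0 - F_t))

n = 72  # size of one inoculated group; any n works
for F_t in (0.5, 0.9, 0.99):
    print(F_t, round(approx_var_log_survival(F_t, n), 4))
```

The printed values increase sharply as $F(t)$ approaches 1, which is the numerical counterpart of the large fluctuations seen at large times.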

Page 9: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

§ Quantile-Quantile Plots (for comparing distribution functions)

Defn: If $X$ is a continuous random variable with a continuous distribution function $F$, the $p$th quantile of the distribution is the value $x_p$ such that

$$F(x_p) = p \quad \text{or} \quad x_p = F^{-1}(p).$$

Assume
$X \sim$ cdf $F$ (e.g. control group)
$Y \sim$ cdf $G$ (e.g. treatment group)

Page 10: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

(1) If $y_p = x_p + h$ (e.g. the weakest and the strongest individuals would have their responses changed by $h$), then

$$G(y) = F(y - h).$$

(2) If $y_p = c\,x_p$ (e.g. $c = 1.25$: the treatment increases lifetime by 25%), then

$$G(y) = F\!\left(\frac{y}{c}\right).$$

(3) Neither (1) nor (2).
E.g. A treatment would benefit weaker individuals and be to the detriment of stronger individuals.

Figure 10.4

Figure 10.5
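The additive and multiplicative cases can be checked numerically for a concrete distribution. The sketch below assumes, purely for illustration, an exponential control distribution with a made-up rate `lam` and a made-up shift `h`; the value `c = 1.25` is the one used in case (2).

```python
from math import exp, log

lam = 0.01   # hypothetical rate of the control distribution
h = 30.0     # hypothetical additive effect, case (1)
c = 1.25     # multiplicative effect, case (2)

def F(x):
    """Control cdf; exponential, chosen only for illustration."""
    return 1.0 - exp(-lam * x) if x > 0 else 0.0

def x_quantile(p):
    """x_p = F^{-1}(p) for the exponential cdf above."""
    return -log(1.0 - p) / lam

def G_add(y):
    """Treatment cdf implied by y_p = x_p + h."""
    return F(y - h)

def G_mult(y):
    """Treatment cdf implied by y_p = c * x_p."""
    return F(y / c)

for p in (0.1, 0.5, 0.9):
    x_p = x_quantile(p)
    # Both G values should reproduce p at the shifted / scaled quantile
    print(p, round(G_add(x_p + h), 6), round(G_mult(c * x_p), 6))
```

In a Q-Q plot, case (1) shows up as a line of slope 1 with intercept $h$, and case (2) as a line through the origin with slope $c$.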

Page 11: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

Defn: Given $n$ observations and the order statistics $X_{(1)}, \ldots, X_{(n)}$, the $\dfrac{k}{n+1}$ quantile of the data is assigned to $X_{(k)}$.

The Q-Q plot of two batches of numbers with order statistics $X_{(1)}, \ldots, X_{(n)}$ and $Y_{(1)}, \ldots, Y_{(n)}$ is the plot of the points $(X_{(i)}, Y_{(i)})$.
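For two batches of the same size, the points of the Q-Q plot are the paired order statistics. The sketch below builds those pairs along with the quantile levels $k/(n+1)$; the function names and both batches are hypothetical, and the second batch is constructed as 1.25 times the first so that the multiplicative pattern is visible.

```python
def qq_points(xs, ys):
    """Pair the order statistics of two equal-sized batches: (X_(i), Y_(i))."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes batches of equal size n")
    return list(zip(sorted(xs), sorted(ys)))

def quantile_levels(n):
    """Quantile level k/(n+1) assigned to the k-th order statistic."""
    return [k / (n + 1) for k in range(1, n + 1)]

control = [12.0, 7.0, 30.0, 18.0]      # hypothetical batch
treated = [15.0, 37.5, 8.75, 22.5]     # hypothetical batch, 1.25 x control
print(qq_points(control, treated))
print(quantile_levels(len(control)))
```

Batches of unequal size need interpolated quantiles, which this sketch deliberately leaves out.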

Page 12: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

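Ex: Cleveland et al. (1974), an air pollution study comparing the distribution of pollutants on Sundays with that on weekdays.

Note that the x and y axes are on different scales.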

Page 13: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

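E.g. ($x$ = group(V), $y$ = group(III))

(1) When $T$ … 100 days: the lifetime of the guinea pigs in group(III) is approximately 2 times that in group(V), $y \approx 2x$.

(2) When $T$ … 100 days: the lifetime in group(III) is approximately the lifetime in group(V) plus a constant, $y \approx x + \text{constant}$.

(3) Quantiles: the difference between the values of $x_p$ and $y_p$ is larger when $p$ is large than when $p$ is small. This agrees with the conclusion drawn from $t_0 - t_0'$ and $t_1 - t_1'$ in Figure 10.2.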

Page 14: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

10.3 Histograms, Density Curves, and Stem-and-Leaf Plots

(1) Histograms

If the bin width is too small → the histogram is too ragged.

If the bin width is too wide → the shape is oversmoothed and obscured.

E.g. Figure 10.8

(2) Kernel probability density estimate

If $X_1, \ldots, X_n$ is a sample from a probability density function $f$, a kernel probability density estimate of $f$ is

$$f_h(x) = \frac{1}{n}\sum_{i=1}^{n} w_h(x - X_i),$$

where $h$ is called the bandwidth and

$$w_h(x) = \frac{1}{h}\,w\!\left(\frac{x}{h}\right),$$

with $w(x) \ge 0$ a symmetric weight function, centered at 0, and $\int w(x)\,dx = 1$.

Page 15: Ch.10  Summarizing Data 10.2  Methods based on the C.D.F

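e.g. $w(x) = \dfrac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$, the standard normal density.

Then $w_h(x)$ is the $N(0, h^2)$ density (transformation: if $y = hx$, then $f_Y(y) = \dfrac{1}{h}\,f_X\!\left(\dfrac{y}{h}\right)$),

& $w_h(x - X_i)$ is the $N(X_i, h^2)$ density, so that

$$f_h(x) = \frac{1}{n}\sum_{i=1}^{n} w_h(x - X_i).$$

If $h$ is too small, the estimate is too rough.

If $h$ is too large, the shape of $f$ is smeared out too much.

E.g. Figure 10.9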
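The kernel estimate with a standard normal weight function can be written in a few lines. The following is a minimal sketch under stated assumptions: the names `w` and `kernel_density` mirror the notation $w$ and $f_h$, and the sample values and bandwidths are made up for illustration.

```python
from math import exp, pi, sqrt

def w(x):
    """Standard normal density, used as the weight function."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def kernel_density(sample, h):
    """Return f_h with f_h(x) = (1/n) * sum_i w_h(x - X_i), w_h(u) = w(u/h)/h."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    n = len(sample)

    def f_h(x):
        return sum(w((x - xi) / h) / h for xi in sample) / n

    return f_h

sample = [63.0, 63.4, 63.5, 64.1, 64.4]   # hypothetical data
for h in (0.1, 0.5, 2.0):
    f_h = kernel_density(sample, h)
    print(h, [round(f_h(x), 4) for x in (63.0, 63.75, 64.5)])
```

With the smallest bandwidth the estimate is a set of narrow bumps at the data points, and with the largest it is close to a single wide normal curve, matching the two failure modes described for $h$.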