WFM 5201: Data Management and Statistical Analysis

Lecture-8: Probabilistic Analysis
Akm Saiful Islam
June, 2008
Institute of Water and Flood Management (IWFM), Bangladesh University of Engineering and Technology (BUET)
© Dr. Akm Saiful Islam



Frequency Analysis

Continuous Distributions:
  Normal distribution
  Lognormal distribution
  Pearson Type III distribution
  Gumbel’s Extremal distribution

Confidence Interval


Log-Normal Distribution

The lognormal distribution (sometimes spelled out as the logarithmic normal distribution) of a random variable $X$ is one for which the logarithm of $X$ follows a normal or Gaussian distribution. Denote $Y = \ln X$; then $Y$ has a normal or Gaussian distribution given by:

$$f(y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left[-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right], \quad -\infty < y < \infty \tag{1}$$


Derived distribution: Since $Y = \ln X$ and $dy/dx = 1/x$, the distribution of $X$ can be found as:

$$f(x) = f(y)\left|\frac{dy}{dx}\right| = \frac{1}{x\sqrt{2\pi\sigma_y^2}} \exp\left[-\frac{(\ln x-\mu_y)^2}{2\sigma_y^2}\right], \quad x > 0 \tag{2}$$

Note that equation (1) gives the distribution of $Y$ as a normal distribution with mean $\mu_y$ and variance $\sigma_y^2$. Equation (2) gives the distribution of $X$ as the lognormal distribution with parameters $\mu_y$ and $\sigma_y^2$.
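The change of variables behind equation (2) can be checked numerically. The sketch below is illustrative only: the function names and the parameter values ($\mu_y = 11.0$, $\sigma_y = 0.3$) are assumptions chosen for the demonstration, not values from the lecture. It builds the lognormal density from the normal density of $Y = \ln X$ and confirms that the area under $f(x)$ is close to 1.

```python
import math

def normal_pdf(y, mu_y, sigma_y):
    """Equation (1): normal density of Y = ln X."""
    return math.exp(-0.5 * ((y - mu_y) / sigma_y) ** 2) / (sigma_y * math.sqrt(2 * math.pi))

def lognormal_pdf(x, mu_y, sigma_y):
    """Equation (2): f(x) = f(y) * |dy/dx| with y = ln x and dy/dx = 1/x."""
    if x <= 0:
        return 0.0
    return normal_pdf(math.log(x), mu_y, sigma_y) / x

def trapezoid_area(f, a, b, steps=20_000):
    """Plain trapezoid rule, used here only as a sanity check."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

# Illustrative parameters (assumed, not from the lecture)
mu_y, sigma_y = 11.0, 0.3

# Integrate over +/- 8 standard deviations of Y, mapped back to X
lower = math.exp(mu_y - 8 * sigma_y)
upper = math.exp(mu_y + 8 * sigma_y)
area = trapezoid_area(lambda x: lognormal_pdf(x, mu_y, sigma_y), lower, upper)
print(f"area under f(x): {area:.6f}")
```

The factor $1/x$ is what distinguishes the lognormal density from simply evaluating the normal density at $\ln x$; without it the area would not integrate to 1.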


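Estimation of parameters ($\mu_y$, $\sigma_y^2$) of the lognormal distribution:

Note: $Y = \ln X$, $\bar{y} = \frac{1}{n}\sum y_i$, $S_y^2 = \dfrac{\sum y_i^2 - n\bar{y}^2}{n-1}$

Chow (1954) Method:

$$C_v = S_x / \bar{X} \tag{1}$$

$$\bar{Y} = \frac{1}{2}\ln\left(\frac{\bar{X}^2}{C_v^2 + 1}\right) \tag{2}$$

$$S_y^2 = \ln(C_v^2 + 1) \tag{3}$$

The mean and variance of the lognormal distribution are:

$$E(X) = \exp(\mu_y + \sigma_y^2/2) \tag{4}$$

$$\mathrm{Var}(X) = \mu_x^2\left(e^{\sigma_y^2} - 1\right) \tag{5}$$

The coefficient of variation of the Xs is:

$$C_v = \sqrt{e^{\sigma_y^2} - 1} \tag{6}$$

The coefficient of skew of the Xs is:

$$\gamma = 3C_v + C_v^3 \tag{7}$$

Thus the lognormal distribution is skewed to the right, the skewness increasing with increasing values of $C_v$.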
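As a rough illustration of how the Chow (1954) relations fit together, the sketch below converts a sample mean and standard deviation of X into log-space parameters and then recovers the moments of X from them. The function names and the input values (mean 100, standard deviation 30) are hypothetical and serve only to show that the two sets of formulas are mutually consistent.

```python
import math

def chow_lognormal_params(x_mean, x_std):
    """Chow (1954) method: log-space mean and variance from moments of X."""
    cv = x_std / x_mean                                  # C_v = S_x / X-bar
    y_mean = 0.5 * math.log(x_mean ** 2 / (cv ** 2 + 1))
    y_var = math.log(cv ** 2 + 1)
    return cv, y_mean, y_var

def lognormal_moments(mu_y, var_y):
    """Mean, variance, coefficient of variation and skew of X."""
    mean_x = math.exp(mu_y + var_y / 2)
    var_x = mean_x ** 2 * (math.exp(var_y) - 1)
    cv = math.sqrt(math.exp(var_y) - 1)
    skew = 3 * cv + cv ** 3
    return mean_x, var_x, cv, skew

# Hypothetical sample moments, for illustration only
cv_in, y_mean, y_var = chow_lognormal_params(100.0, 30.0)
mean_x, var_x, cv_out, skew = lognormal_moments(y_mean, y_var)

print(f"y_mean = {y_mean:.4f}, y_var = {y_var:.4f}")
print(f"recovered mean = {mean_x:.2f}, std = {math.sqrt(var_x):.2f}")
print(f"C_v = {cv_out:.3f}, skew = {skew:.3f}")
```

Because the skew depends only on $C_v$ and is always positive, any lognormal fit obtained this way is right-skewed, as the slide notes.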


Example-1:

Use the lognormal distribution to calculate the expected relative frequency for the third class interval of the discharge data in the table below.


Frequency of the discharge of a River

Class mark (cfs)   Number   Observed relative frequency
 25,000             2       0.030
 35,000             3       0.045
 45,000            10       0.152
 55,000             9       0.136
 65,000            11       0.167
 75,000            10       0.152
 85,000            12       0.182
 95,000             6       0.091
105,000             0       0.000
115,000             3       0.045


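Solution

With a sample mean $\bar{x} = 67{,}500$ and standard deviation $S_x = 21{,}000$, the parameters of the lognormal distribution are estimated as:

$$C_v = S_x/\bar{x} = 21{,}000/67{,}500 = 0.311$$

$$\bar{y} = \frac{1}{2}\ln\left[\bar{x}^2/(C_v^2+1)\right] = \frac{1}{2}\ln\left[67{,}500^2/(0.311^2+1)\right] = 11.0737$$

$$s_y = \sqrt{\ln(C_v^2+1)} = \sqrt{\ln(0.311^2+1)} = 0.30395$$

For the third class interval (class mark $x = 45{,}000$), the standardized variate is:

$$z = (\ln x - \bar{y})/s_y = (\ln 45{,}000 - 11.0737)/0.30395 = -1.182$$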


So from the standard normal table we get $p_Z(z) = 0.198$, and

$$p_X(x) = \frac{p_Z(z)}{x\,S_y} = \frac{0.198}{45{,}000\,(0.30395)} = 1.4476 \times 10^{-5}$$

The expected relative frequency according to the lognormal distribution, for a class width of 10,000, is

$$f_{45{,}000} = 10{,}000\,(1.4476 \times 10^{-5}) = 0.145$$
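The arithmetic of Example-1 can be reproduced with a few lines of Python. This is a sketch under the assumption that the standard normal density from `statistics.NormalDist` stands in for the table lookup; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Values used in the solution of Example-1
x_mean, x_std = 67_500.0, 21_000.0
class_mark, class_width = 45_000.0, 10_000.0

# Chow (1954) estimates of the log-space parameters
cv = x_std / x_mean
y_mean = 0.5 * math.log(x_mean ** 2 / (cv ** 2 + 1))
s_y = math.sqrt(math.log(cv ** 2 + 1))

# Standardized variate of ln(x) at the class mark
z = (math.log(class_mark) - y_mean) / s_y

# Standard normal density at z (replaces the table lookup)
p_z = NormalDist().pdf(z)

# Lognormal density at x, then expected relative frequency of the class
p_x = p_z / (class_mark * s_y)
expected_rel_freq = class_width * p_x

print(f"y_mean = {y_mean:.4f}, s_y = {s_y:.5f}, z = {z:.3f}")
print(f"p_x = {p_x:.4e}, expected relative frequency = {expected_rel_freq:.3f}")
```

Multiplying the density by the class width approximates the probability mass in the class, which is why the result can be compared directly with the observed relative frequency of 0.152 for the 45,000 class.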


Example-2:

Assume the data in the previous table follow the lognormal distribution. Calculate the magnitude of the 100-year peak flood.


Solution: The 100-year peak flow corresponds to a prob(X > x) of 0.01. The value x must be evaluated such that $P_X(x) = 0.99$. This can be accomplished by evaluating z such that $P_Z(z) = 0.99$ and then transforming to x. From the standard normal tables, the value of z corresponding to $P_Z(z) = 0.99$ is 2.326.

The values of $s_y$ and $\bar{y}$ are given in Example-1 ($s_y = 0.30395$, $\bar{y} = 11.0737$), so

$$y = \bar{y} + z\,s_y = 11.0737 + 0.30395\,(2.326) = 11.781$$

$$x = \exp(y) = 130{,}700 \text{ cfs}$$

The 100-year peak flow according to the lognormal distribution is about 130,700 cfs.
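The same quantile calculation can be sketched in Python, assuming `statistics.NormalDist().inv_cdf` as a substitute for the standard normal table. The parameter values are those stated in the solution; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Log-space parameters from Example-1
y_mean, s_y = 11.0737, 0.30395

# Return period in years and the matching non-exceedance probability
T = 100
p_nonexceed = 1 - 1 / T          # P(X <= x) = 0.99

# Standard normal quantile, then transform back to flow units
z = NormalDist().inv_cdf(p_nonexceed)
y_T = y_mean + z * s_y
x_T = math.exp(y_T)

print(f"z = {z:.3f}, y_T = {y_T:.3f}, x_T = {x_T:,.0f} cfs")
```

Changing `T` gives the flood magnitude for any other return period under the same lognormal assumption.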


Extreme Value Distributions

Many times interest exists in extreme events such as the maximum peak discharge of a stream or minimum daily flows.

The extreme value (the largest or smallest member) of a set of random variables is also a random variable.

The probability distribution of this extreme value random variable will in general depend on the sample size and the parent distribution from which the sample was obtained.


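Extreme value type-I: Gumbel distribution

For the Extreme Value Type I distribution, Chow (1953) derived the expression

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\} \tag{3}$$

where $K_T$ is the frequency factor for return period $T$. To express $T$ in terms of $K_T$, the above equation can be written as

$$T = \frac{1}{1 - \exp\left\{-\exp\left[-\left(0.5772 + \dfrac{\pi K_T}{\sqrt{6}}\right)\right]\right\}}$$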
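The two forms of the Gumbel frequency factor relation are inverses of each other, which a short sketch can confirm. The function names below are hypothetical; the formulas are the ones given on this slide.

```python
import math

def gumbel_frequency_factor(T):
    """Frequency factor K_T for the Extreme Value Type I distribution."""
    return -(math.sqrt(6) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1))))

def return_period_from_factor(K_T):
    """Inverse relation: return period T for a given frequency factor K_T."""
    inner = math.exp(-(0.5772 + math.pi * K_T / math.sqrt(6)))
    return 1.0 / (1.0 - math.exp(-inner))

for T in (2, 5, 50, 100):
    K = gumbel_frequency_factor(T)
    T_back = return_period_from_factor(K)
    print(f"T = {T:>3}: K_T = {K:7.4f}, recovered T = {T_back:.3f}")
```

Note that $T$ must be greater than 1, since $\ln[T/(T-1)]$ is undefined at $T = 1$.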


Example-3: Gumbel

Determine the 5-year return period rainfall for Chicago using the frequency factor method and the annual maximum rainfall data given below. (Chow et al., 1988, p. 391)

Year   Rainfall (inch)    Year   Rainfall (inch)    Year   Rainfall (inch)
1913   0.49               1926   0.68               1938   0.52
1914   0.66               1927   0.61               1939   0.64
1915   0.36               1928   0.88               1940   0.34
1916   0.58               1929   0.49               1941   0.70
1917   0.41               1930   0.33               1942   0.57
1918   0.47               1931   0.96               1943   0.92
1920   0.74               1932   0.94               1944   0.66
1921   0.53               1933   0.80               1945   0.65
1922   0.76               1934   0.62               1946   0.63
1923   0.57               1935   0.71               1947   0.60
1924   0.80               1936   1.11
1925   0.66               1937   0.64


Solution

The mean and standard deviation of annual maximum rainfalls at Chicago are $\bar{x} = 0.649$ inch and $s = 0.177$ inch, respectively. For $T = 5$, equation (3) gives

$$K_T = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{5}{5-1}\right)\right]\right\} = 0.719$$

$$x_T = \bar{x} + K_T\,s = 0.649 + (0.719)(0.177) = 0.78 \text{ in}$$
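The frequency factor method of Example-3 can be sketched directly from the tabulated rainfall values. The list below is copied from the table as it appears in this transcript (1919 is not listed), so the computed mean and standard deviation may differ slightly from the 0.649 and 0.177 quoted in the solution. Variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Annual maximum rainfall (inch), in year order as listed in the table
rainfall = [
    0.49, 0.66, 0.36, 0.58, 0.41, 0.47,              # 1913-1918
    0.74, 0.53, 0.76, 0.57, 0.80, 0.66,              # 1920-1925
    0.68, 0.61, 0.88, 0.49, 0.33, 0.96,              # 1926-1931
    0.94, 0.80, 0.62, 0.71, 1.11, 0.64,              # 1932-1937
    0.52, 0.64, 0.34, 0.70, 0.57, 0.92,              # 1938-1943
    0.66, 0.65, 0.63, 0.60,                          # 1944-1947
]

x_mean = mean(rainfall)
s = stdev(rainfall)          # sample standard deviation (n - 1 divisor)

# Gumbel frequency factor for a 5-year return period
T = 5
K_T = -(math.sqrt(6) / math.pi) * (0.5772 + math.log(math.log(T / (T - 1))))

# Frequency factor equation: x_T = mean + K_T * s
x_T = x_mean + K_T * s

print(f"n = {len(rainfall)}, mean = {x_mean:.3f} in, s = {s:.3f} in")
print(f"K_T = {K_T:.3f}, x_T = {x_T:.2f} in")
```

Only the mean, the standard deviation and the return period enter the estimate; the individual observations matter solely through those two sample statistics.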


Log Pearson Type III

For this distribution, the first step is to take the logarithms of the hydrologic data, $y = \log x$. Usually logarithms to base 10 are used. The mean $\bar{y}$, standard deviation $s_y$, and coefficient of skewness $C_s$ are calculated for the logarithms of the data. The frequency factor $K_T$ depends on the return period $T$ and the coefficient of skewness $C_s$.

When $C_s = 0$, the frequency factor is equal to the standard normal variable $z$.

When $C_s \neq 0$, $K_T$ is approximated by Kite (1977) as

$$K_T = z + (z^2 - 1)k + \frac{1}{3}(z^3 - 6z)k^2 - (z^2 - 1)k^3 + zk^4 + \frac{1}{3}k^5$$

where $k = C_s/6$.
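Kite's approximation is straightforward to evaluate in code. The sketch below assumes $k = C_s/6$ and uses `statistics.NormalDist().inv_cdf` to obtain the standard normal variable $z$ for a given return period; the function name is hypothetical.

```python
from statistics import NormalDist

def lp3_frequency_factor(T, Cs):
    """Kite (1977) approximation of the Pearson Type III frequency factor."""
    z = NormalDist().inv_cdf(1 - 1 / T)   # standard normal variable for return period T
    k = Cs / 6
    return (z
            + (z ** 2 - 1) * k
            + (z ** 3 - 6 * z) * k ** 2 / 3
            - (z ** 2 - 1) * k ** 3
            + z * k ** 4
            + k ** 5 / 3)

# With zero skew the factor reduces to z itself
print(f"T = 50, Cs = 0.0:  K_T = {lp3_frequency_factor(50, 0.0):.4f}")
# A small negative skew lowers the factor at long return periods
print(f"T = 50, Cs = -0.1: K_T = {lp3_frequency_factor(50, -0.1):.4f}")
```

Every term after the first carries a power of $k$, so the expression collapses to $z$ when $C_s = 0$, which matches the statement above.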


Example-4: Calculate the 5- and 50-year return period annual maximum discharges of the Guadalupe River near Victoria, Texas, using the lognormal and log-Pearson Type III distributions. The data in cfs from 1935 to 1978 are given below. (Chow et al., 1988, p. 393)

Rows give the final digit of the year; columns give the decade.

Year     1930     1940     1950     1960     1970
0                55900    13300    23700     9190
1                58000    12300    55800     9740
2                56000    28400    10800    58500
3                 7710    11600     4100    33100
4                12300     8560     5720    25200
5       38500    22000     4950    15000    30200
6      179000    17900     1730     9790    14100
7       17200    46000    25300    70000    54500
8       25400     6970    58300    44300    12700
9        4940    20600    10100    15200


Solution
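The worked solution is not reproduced in this transcript, so the following is only a sketch of how the calculation could be carried out from the tabulated flows. It assumes base-10 logarithms, the usual sample skewness estimator $C_s = n\sum(y_i-\bar{y})^3 / [(n-1)(n-2)s_y^3]$, and Kite's approximation with $k = C_s/6$; the names are illustrative and the printed values should be treated as approximate.

```python
import math
from statistics import NormalDist, mean, stdev

# Annual maximum discharges (cfs), 1935-1978, read from the table by year
flows = [
    38500, 179000, 17200, 25400, 4940,                                   # 1935-1939
    55900, 58000, 56000, 7710, 12300, 22000, 17900, 46000, 6970, 20600,  # 1940-1949
    13300, 12300, 28400, 11600, 8560, 4950, 1730, 25300, 58300, 10100,   # 1950-1959
    23700, 55800, 10800, 4100, 5720, 15000, 9790, 70000, 44300, 15200,   # 1960-1969
    9190, 9740, 58500, 33100, 25200, 30200, 14100, 54500, 12700,         # 1970-1978
]

# Statistics of the base-10 logarithms
logs = [math.log10(q) for q in flows]
n = len(logs)
y_mean = mean(logs)
s_y = stdev(logs)
cs = n * sum((y - y_mean) ** 3 for y in logs) / ((n - 1) * (n - 2) * s_y ** 3)

def kite_factor(z, cs):
    """Kite (1977) frequency factor with k = Cs / 6 (assumed definition)."""
    k = cs / 6
    return (z + (z ** 2 - 1) * k + (z ** 3 - 6 * z) * k ** 2 / 3
            - (z ** 2 - 1) * k ** 3 + z * k ** 4 + k ** 5 / 3)

estimates = {}
for T in (5, 50):
    z = NormalDist().inv_cdf(1 - 1 / T)
    x_lognormal = 10 ** (y_mean + z * s_y)                 # K_T = z when skew is ignored
    x_lp3 = 10 ** (y_mean + kite_factor(z, cs) * s_y)      # skew included
    estimates[T] = (x_lognormal, x_lp3)
    print(f"T = {T:>2} yr: lognormal = {x_lognormal:,.0f} cfs, "
          f"log-Pearson III = {x_lp3:,.0f} cfs")

print(f"y_mean = {y_mean:.4f}, s_y = {s_y:.4f}, Cs = {cs:.4f}")
```

The two distributions share the same mean and standard deviation of the logarithms; they differ only in whether the frequency factor is $z$ or the skew-adjusted $K_T$.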


It can be seen that the effect of including the small negative coefficient of skewness in the calculations is to alter the estimated flow slightly, with that effect being more pronounced at T = 50 years than at T = 5 years. Another feature of the results is that the 50-year return period estimates are about three times as large as the 5-year return period estimates; for this example, the increase in the estimated flood discharges is less than proportional to the increase in return period.


Confidence Interval
