QUANTITATIVE METHODS I
Ashok K Mittal Department of IME
IIT Kanpur
Statistics
Q1: I want to invest Rs 1000; in which company should I invest?
Growth – Returns – Risk
Statistics
• Q2: How do I know which company will give me high return or growth, while keeping risk low?
Statistics
• Collect information from the past: qualitative and quantitative (data)
• Analyze the information (data) to provide patterns of past performance (descriptive statistics)
• Project these patterns to answer your questions (inference)
Types of data
Primary: You do a survey to find out the percentage of people living below the poverty line in Allahabad
Secondary: You are interested in studying the performance of banks and for that you study the RBI published documents
Descriptive Statistics
Presentation of data: non-frequency data and frequency data
Non-frequency data: time series representation of data
[Chart: BSE(30) daily closing values from 3-Jan-94 to 31-Mar-94; y-axis 0 to 5000]
Non-frequency data: spatial series representation of data
[Chart: Fertiliser consumption for a few Indian states for 1999-2000 (in tonnes); y-axis 0 to 3,500,000; states: Andhra Pradesh, Karnataka, Kerala, Tamil Nadu, Gujarat, Madhya Pradesh, Maharashtra, Rajasthan, Haryana, Punjab, Uttar Pradesh, Bihar, Orissa, West Bengal, Assam]
Frequency data: tabular representation

India at a glance (% of GDP)

                  1983   1993   2002   2003
Agriculture       36.6   31.0   22.7   22.2
Industry          25.8   26.3   26.6   26.6
  Mfg             16.3   16.1   15.6   15.8
Services          37.6   42.8   50.7   51.2
Pvt consump       71.8   37.4   65.0   64.9
GOI consump       10.6   11.4   12.5   12.8
Import             8.1   10.0   15.6   16.0
Domes save        17.6   22.5   24.2   22.2
Interests paid     0.4    1.3    0.7   18.3

Note: 2003 refers to 2003-2004; data are preliminary. Gross domestic savings figures are taken directly from India′s central statistical organization.
Frequency data: line diagram representation
[Chart: Fertiliser consumption for a few Indian states for 1999-2000 (in tonnes), drawn as a line diagram over the same states]
Frequency data: bar diagram (histogram) representation
[Chart: World population (projected mid-2004) for 1950-2000; y-axis 0 to 7,000,000,000]
Frequency data: bar diagram (histogram) representation
[Chart: Height (in cm) and weight (in kg) of individuals Ram, Shyam, Rahim, Praveen, Saikat, Govind and Alan]
Frequency data: pie diagram/chart representation
[Chart: Median marks in JMET (2003) by section: Verbal, Quantitative, Analytical, Data Interpretation]
Frequency data: box plot representation
The box plot is also called the box-and-whisker plot. A box plot is a set of five summary measures of the distribution of the data: the median, the lower quartile, the upper quartile, the smallest observation and the largest observation.
[Diagram: box plot showing the median, lower quartile (LQ), upper quartile (UQ) and whiskers]
Frequency data: box plot representation
Here: UQ − LQ = interquartile range (IQR); X = smallest observation within a certain percentage of LQ; Y = largest observation within a certain percentage of UQ
Important note
A cumulative frequency diagram is called an ogive. The abscissa of the point of intersection of the "less than" and "more than" ogives represents the median of the data.
Example: frequency distribution
[Histogram: frequencies (0 to 10) over class intervals of width 50, from 500-550 up to 1250-1300]
Analysis
Measurements of uncertainty
Concept of uncertainty and different measures of uncertainty, Probability as a measure of uncertainty, Description of qualitative as well as quantitative probability, Assessment of probability, Concepts of Decision trees, Random Variables, Distributions, Expectations, Probability plots, etc.
Definitions
Quantitative variable: can be described by a number for which arithmetic operations such as averaging make sense.
Qualitative (or categorical) variable: simply records a quality, e.g., good, bad, right, wrong, etc.
We know statistics deals with measurements, some qualitative and others quantitative. The measurements are the actual numerical values of a variable. Qualitative variables can be described by numbers, although such a description may be arbitrary, e.g., good = 1, bad = 0, right = 1, wrong = 0, etc.
Scales of measurement
Nominal scale: In this scale numbers are used simply as labels for groups or classes. If we are dealing with a data set which consists of the colours blue, red, green and yellow, then we can designate blue = 3, red = 4, green = 5 and yellow = 6. The numbers stand only for the category to which a data point belongs; nothing is sacrosanct about the particular number assigned to each category. This scale is used for qualitative rather than quantitative data.
Scales of measurement
Ordinal scale: In this scale of measurement, data elements may be ordered according to relative size or quality. For example, a customer or a buyer can rank a particular characteristic of a car as good, average or bad, and while doing so he/she can assign numeric values, e.g., good = 10, average = 5 and bad = 0.
Scales of measurement
Interval scale: For the interval scale we specify intervals so as to note a particular characteristic being measured, and assign each data point to an interval. Consider measuring the age of school-going students between classes 5 and 12 in the city of Kanpur. We may form the intervals 10-12 years, 12-14 years, …, 18-20 years. Now when we have one data point, i.e., the age of a student, we put it under one particular interval; e.g., if the student's age is 11 years, we immediately put it under the interval 10-12 years.
Scales of measurement
Ratio Scale: If two measurements are in ratio scale, then we can take ratios of measurements. The ratio scale represents the reading for each recorded data in a way which enables us to take a ratio of the readings in order to depict it either pictorially or in figures. Examples of ratio scale are measurements of weight, height, area, length etc.
Definitions: different measures
1) Measures of central tendency: mean (arithmetic mean (AM), geometric mean (GM), harmonic mean (HM)), median, mode
2) Measures of dispersion: variance (or standard deviation), skewness, kurtosis
Definition: Mean
Given N observations x1, x2, …, xN we define the following:
AM = (1/N)[x1 + x2 + … + xN]
GM = [x1 · x2 · … · xN]^(1/N)
HM = [(1/N)(1/x1 + 1/x2 + … + 1/xN)]^(−1)
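As a quick sanity check of the three means, they can be computed directly (a minimal sketch using only the standard library; the sample values 2, 4, 8 are illustrative):

```python
import math

def arithmetic_mean(xs):
    # AM = (1/N) * (x1 + x2 + ... + xN)
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # GM = (x1 * x2 * ... * xN)^(1/N), computed via logs for numerical stability
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # HM = [ (1/N) * (1/x1 + ... + 1/xN) ]^(-1)
    return len(xs) / sum(1.0 / x for x in xs)

data = [2.0, 4.0, 8.0]
am = arithmetic_mean(data)
gm = geometric_mean(data)
hm = harmonic_mean(data)
```

For positive data the inequality AM ≥ GM ≥ HM always holds, which makes a convenient self-check.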
Definition: Median and Mode
Median (µe): The median of a data set is the value below which half of the data points lie. To find the median we use F(µe) = 0.5.
Mode (µo): The mode of a data set is the value that occurs most frequently. Hence f(µo) ≥ f(x) ∀ x.
Definition: Variance, Standard deviation, Skewness, Kurtosis
Variance: V[X] = σ² = E[(X − E[X])²]
Standard deviation: SD = σ
Skewness: γ1 = √β1 = µ3 / µ2^(3/2) = µ3 / σ³
Kurtosis: γ2 = β2 − 3 = µ4 / σ⁴ − 3
Example
Consider we have the following data points:
5, 7, 10, 7, 10, 11, 3, 5, 5
For these data points we have
µ = 7; µe = 7; µo = 5; σ² = 6.89
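These summary values are easy to verify directly (a small sketch; note the variance here is the population variance, dividing by n, which is what gives 6.89):

```python
from collections import Counter

data = [5, 7, 10, 7, 10, 11, 3, 5, 5]

n = len(data)
mean = sum(data) / n                          # mu = 63/9 = 7
s = sorted(data)                              # 3, 5, 5, 5, 7, 7, 10, 10, 11
median = s[n // 2]                            # n is odd: middle order statistic
mode = Counter(data).most_common(1)[0][0]     # most frequent value
var = sum((x - mean) ** 2 for x in data) / n  # population variance, 62/9
```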
Descriptive statistics
Suppose the data are available in the form of a frequency distribution. Assume there are k classes, the mid-points of the corresponding class intervals being x1, x2, …, xk, while the corresponding frequencies are f1, f2, …, fk such that n = f1 + f2 + … + fk.
Then:
µ = (1/n) Σ_{i=1}^{k} xi fi
σ² = (1/n) Σ_{i=1}^{k} (xi − µ)² fi
Consider m groups of observations with respective means µ1, µ2, …, µm and standard deviations σ1, σ2, …, σm. Let the group sizes be n1, n2, …, nm such that n = n1 + n2 + … + nm.
Then:
µ_OVERALL = (1/n) Σ_{i=1}^{m} ni µi
σ²_OVERALL = (1/n)[ Σ_{i=1}^{m} ni σi² + Σ_{i=1}^{m} ni (µi − µ_OVERALL)² ]
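The pooled formulas can be sketched and cross-checked against computing the same statistics on the combined raw data (the two small groups are illustrative; population variances, dividing by n, are used throughout so the identity is exact):

```python
def mean_var(xs):
    # population mean and variance of a raw data list
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v

def pooled_stats(groups):
    # groups: list of (n_i, mu_i, var_i) triples
    n = sum(g[0] for g in groups)
    mu = sum(ni * mi for ni, mi, _ in groups) / n
    var = sum(ni * vi + ni * (mi - mu) ** 2 for ni, mi, vi in groups) / n
    return mu, var

g1 = [1.0, 3.0, 5.0]
g2 = [2.0, 4.0]

m1, v1 = mean_var(g1)
m2, v2 = mean_var(g2)
mu_all, var_all = pooled_stats([(len(g1), m1, v1), (len(g2), m2, v2)])
mu_direct, var_direct = mean_var(g1 + g2)   # must agree with the pooled values
```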
Probability
Random event
Random experiment: an experiment whose outcome cannot be predicted with certainty
Sample space (Ω): the set of all possible outcomes of a random experiment
Sample point (ωi): an element of the sample space
Event (A): a subset of the sample space, i.e., a collection of sample point(s)
Probability
Probability (P(A)): the probability of an event is defined as a quantitative measure of the uncertainty of the occurrence of the event
Objective probability: based on games of chance and can be mathematically proved or verified. If the experiment is the same for two different persons, then the value of the objective probability remains the same. It is the limiting definition of relative frequency. Example: the probability of getting the number 5 when we roll a fair die.
Subjective probability: based on personal judgment, intuition and subjective criteria. Its value will change from person to person. Example: one person sees the chance of India winning the test series against Australia as high, while another person sees it as low.
Random event
For a random experiment, we denote:
P(ωi) = pi = probability of occurrence of the sample point ωi
P(A) = probability of occurrence of the event A
P(A) = Σ_{ωi ∈ A} pi
P(Ω) = Σ_{ωi ∈ Ω} pi = 1
Example 1
Suppose there are two dice, each with faces 1, 2, …, 6, and they are rolled simultaneously. This rolling of the two dice constitutes our random experiment. Then we have:
Ω = {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), …, (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}
ωi = (1,1), (1,2), …, (6,5), (6,6)
We define the event A such that the outcomes for the two dice are equal in one simultaneous throw; then A = {(1,1), (2,2), …, (6,6)}
P(ωi): p1 = p2 = … = p36 = 1/36
P(A) = p1 + p8 + p15 + p22 + p29 + p36 = 6/36
Example 2
Suppose a coin is tossed repeatedly till the first head is obtained. Then we have:
Ω = {(H), (T,H), (T,T,H), …}
ωi = (H), (T,H), (T,T,H), …
We define the event A such that at most 3 tosses are needed to obtain the first head; then A = {(H), (T,H), (T,T,H)}
P(ωi): p1 = ½, p2 = (½)², p3 = (½)³, p4 = (½)⁴, …
P(A) = p1 + p2 + p3 = 7/8
Example 3
In a club there are 10 members, of whom 5 are Asians and the rest are Americans. A committee of 3 members has to be formed and these members are to be chosen randomly. Find the probability that there will be at least 1 Asian and at least 1 American in the committee. Total number of cases = 10C3 = 120, and the number of cases favouring the formation of the committee is 5C2·5C1 + 5C1·5C2 = 100.
Hence P(A) = 100/120 = 5/6
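The count can be verified by brute-force enumeration of all committees (a sketch; members are labeled 'A' for Asian and 'M' for American):

```python
from itertools import combinations
from math import comb

members = ['A'] * 5 + ['M'] * 5                 # 5 Asians, 5 Americans
committees = list(combinations(range(10), 3))
total = len(committees)                          # 10C3 = 120

# Favourable: at least one Asian AND at least one American in the committee
favourable = sum(
    1 for c in committees
    if any(members[i] == 'A' for i in c) and any(members[i] == 'M' for i in c)
)

p = favourable / total    # should match 100/120 = 5/6
```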
Example 4
Suppose we continue with Example 2, which we have just discussed, and we define the event B that at least 5 tosses are needed to produce the first head.
Ω = {(H), (T,H), (T,T,H), …}
ωi = (H), (T,H), (T,T,H), …
P(ωi): p1 = ½, p2 = (½)², p3 = (½)³, p4 = (½)⁴, …
P(B) = p5 + p6 + p7 + … = 1 − (p1 + p2 + p3 + p4) = 1/16
Theorem in probability
For any events A, B ⊆ Ω:
0 ≤ P(A) ≤ 1
If A ⊂ B, then P(A) ≤ P(B)
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A^c) = 1 − P(A)
P(Ω) = 1
P(φ) = 0
Definitions
Mutually exclusive: Consider n events A1, A2, …, An. They are mutually exclusive if no two of them can occur together, i.e., P(Ai ∩ Aj) = 0 ∀ i ≠ j.
Mutually exhaustive: Consider n events A1, A2, …, An. They are mutually exhaustive if at least one of them must occur, i.e., P(A1 ∪ A2 ∪ … ∪ An) = 1.
Example 5
Suppose a fair die with faces 1, 2, …, 6 is rolled. Then Ω = {1, 2, 3, 4, 5, 6}. Let us define the events A1 = {1, 2}, A2 = {3, 4, 5, 6} and A3 = {3, 5}.
The events A2 and A3 are neither mutually exclusive nor exhaustive
A1 and A3 are mutually exclusive but not exhaustive
A1, A2 and A3 are not mutually exclusive but are exhaustive
A1 and A2 are mutually exclusive and exhaustive
Conditional probability
Let A and B be two events such that P(B) > 0. Then the conditional probability of A given B is
P(A|B) = P(A ∩ B) / P(B)
Assume Ω = {1, 2, 3, 4, 5, 6}, A = {2}, B = {2, 4, 6}. Then A ∩ B = {2} and
P(A|B) = (1/6) / (3/6) = 1/3        P(B|A) = (1/6) / (1/6) = 1
Bayes′ Theorem
Let B1, B2, …, Bn be mutually exclusive and exhaustive events such that P(Bi) > 0 for every i = 1, 2, …, n, and let A be any event. Then we have
P(A) = Σ_{i=1}^{n} P(A|Bi) P(Bi)
P(Bj|A) = P(A|Bj) P(Bj) / Σ_{i=1}^{n} P(A|Bi) P(Bi)
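A small numerical sketch of the theorem (the scenario and all numbers are made up for illustration): suppose three machines B1, B2, B3 produce 50%, 30% and 20% of the output, with defect rates 1%, 2% and 3% respectively; given that an item is defective (event A), we ask which machine it likely came from.

```python
priors = [0.50, 0.30, 0.20]        # P(B_i): hypothetical share of output per machine
likelihoods = [0.01, 0.02, 0.03]   # P(A|B_i): hypothetical defect rate per machine

# Total probability: P(A) = sum_i P(A|B_i) P(B_i)
p_a = sum(l * p for l, p in zip(likelihoods, priors))

# Bayes' theorem: P(B_j|A) = P(A|B_j) P(B_j) / P(A)
posteriors = [l * p / p_a for l, p in zip(likelihoods, priors)]
```

Here P(A) = 0.005 + 0.006 + 0.006 = 0.017, so the posteriors are 5/17, 6/17 and 6/17, and they necessarily sum to 1.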
Independence of events
Two events A and B are called independent if P(A∩B) = P(A)*P(B)
Distribution
Depending on what the outcomes of an experiment are, a random variable (r.v.) is used to denote the outcome of the experiment. We usually denote the r.v. by X, Y or Z, and the corresponding probability distribution by f(x), f(y) or f(z).
Discrete: probability mass function (pmf)
Continuous: probability density function (pdf)
Discrete distribution
1) Uniform discrete distribution
2) Binomial distribution
3) Negative binomial distribution
4) Geometric distribution
5) Hypergeometric distribution
6) Poisson distribution
7) Log distribution
Bernoulli Trials
1) Each trial has two possible outcomes, say a success and a failure.
2) The trials are independent
3) The probability of success remains the same and so does the probability of failure from one trial to another
Uniform discrete distribution [X ~ UD(a, b)]
f(x) = 1/n, x = a, a+k, a+2k, …, b
a and b are the parameters, where a, b ∈ R
E[X] = a + k(n − 1)/2
V[X] = k²(n² − 1)/12
Example: generating the random numbers 1, 2, 3, …, 10. Hence X ~ UD(1, 10) where a = 1, k = 1, b = 10; hence n = 10.
Uniform discrete distribution
[Chart: pmf of the uniform discrete distribution]
Binomial distribution [X ~ B(p, n)]
f(x) = nCx p^x q^(n−x), x = 0, 1, 2, …, n
n and p are the parameters, where p ∈ [0, 1] and n ∈ Z+
E[X] = np, V[X] = npq
Example: Consider you are checking the quality of the products coming out of the shop floor. A product can either pass (with probability p = 0.8) or fail (with probability q = 0.2), and for checking you take 50 such products (n = 50). Then if X is the random variable denoting the number of successes in these 50 inspections, we have
f(x) = 50Cx (0.8)^x (0.2)^(50−x)
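The pmf and its moments for this inspection example (p = 0.8, n = 50) can be checked numerically — a sketch; summing x·f(x) over the full support should reproduce E[X] = np = 40 and V[X] = npq = 8:

```python
from math import comb

def binom_pmf(x, n, p):
    # f(x) = nCx * p^x * (1-p)^(n-x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 50, 0.8
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                            # probabilities sum to 1
mean = sum(x * f for x, f in enumerate(pmf))                # E[X] = np
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))   # V[X] = npq
```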
Binomial distribution
[Chart: pmf of the binomial distribution B(0.8, 50)]
Negative binomial distribution [X ~ NB(p, r)]
f(x) = (r+x−1)C(r−1) p^r q^x, x = 0, 1, 2, …
p and r are the parameters, where p ∈ [0, 1] and r ∈ Z+
E[X] = rq/p, V[X] = rq/p²
Example: Consider the example above, where you are still inspecting items from the production line. But now you are interested in the probability distribution of the number of failures preceding the 5th success of getting the right product. Then, considering p = 0.8, q = 0.2, we have
f(x) = (5+x−1)C(5−1) (0.8)^5 (0.2)^x
Negative binomial distribution
[Chart: pmf of the negative binomial distribution NB(0.8, 5)]
Geometric distribution [X ~ G(p)]
f(x) = p q^x, x = 0, 1, 2, …
p is the parameter, where p ∈ [0, 1]
E[X] = q/p (r = 1 in the negative binomial case)
V[X] = q/p² (r = 1 in the negative binomial case)
Example: Consider the example above. But now you are interested in the probability distribution of the number of failures preceding the 1st success of getting the right product. Then, considering p = 0.8, q = 0.2, we have
f(x) = (0.8)(0.2)^x
Geometric distribution
[Chart: pmf of the geometric distribution G(0.8)]
Hypergeometric distribution [X ~ HG(N, n, p)]
f(x) = NpCx · NqC(n−x) / NCn, 0 ≤ x ≤ Np and 0 ≤ (n − x) ≤ Nq
N, n and p are the parameters
E[X] = np, V[X] = npq(N − n)/(N − 1)
Example: Consider the example above. But now you are interested in the probability distribution of the number of failures (successes) of getting the wrong (right) product when we choose n products for inspection out of the total population N. If the population is 100 and we choose 10 out of those, then the probability distribution of getting the right product, denoted by X, is given by
f(x) = 85Cx · 15C(10−x) / 100C10
Remember p (0.85) and q (0.15) are the proportions of good and bad items respectively. In this distribution the choosing is done without replacement.
Hypergeometric distribution
[Chart: pmf of the hypergeometric distribution HG(100, 10, 0.85)]
Poisson distribution [X ~ P(λ)]
f(x) = e^(−λ) λ^x / x!, x = 0, 1, 2, …
λ is the parameter, where λ > 0
E[X] = λ, V[X] = λ
Example: Consider the arrival of customers at the bank teller counter. If we are interested in the probability distribution of the number of customers arriving at the counter in a specific interval of time, and we know that the average number of customers arriving is 5, then we have
f(x) = e^(−5) 5^x / x!
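For the teller-counter example (λ = 5) the pmf is easy to tabulate; truncating the infinite support far enough out makes the total mass, mean and variance all come out as λ predicts (a sketch):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # f(x) = e^(-lambda) * lambda^x / x!
    return exp(-lam) * lam ** x / factorial(x)

lam = 5.0
xs = range(60)    # truncation point; for lambda = 5 the tail mass beyond 60 is negligible
pmf = [poisson_pmf(x, lam) for x in xs]

total = sum(pmf)
mean = sum(x * f for x, f in zip(xs, pmf))                # E[X] = lambda
var = sum((x - mean) ** 2 * f for x, f in zip(xs, pmf))   # V[X] = lambda
```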
Poisson distribution
[Chart: pmf of the Poisson distribution P(5)]
Log distribution [X ~ L(p)]
f(x) = −(log_e p)^(−1) x^(−1) (1 − p)^x, x = 1, 2, 3, …
p is the parameter, where p ∈ (0, 1)
E[X] = −(1 − p)/(p log_e p)
V[X] = −(1 − p)[1 + (1 − p)/log_e p]/(p² log_e p)
Examples:
1) Emission of gases from engines against fuel type
2) Used to represent the distribution of the number of items of a product purchased by a buyer in a specified period of time
Log distribution
[Chart: pmf of the log distribution]
Continuous distribution
1) Uniform distribution
2) Normal distribution
3) Exponential distribution
4) Chi-Square distribution
5) Gamma distribution
6) Beta distribution
7) Cauchy distribution
Continuous distribution
8) t-distribution
9) F-distribution
10)Log-normal distribution
11)Weibull distribution
12)Double exponential distribution
13)Pareto distribution
14)Logistic distribution
Uniform distribution [X ~ U(a, b)]
f(x) = 1/(b − a), a ≤ x ≤ b
a and b are the parameters, where a, b ∈ R and a < b
E[X] = (a + b)/2, V[X] = (b − a)²/12
Example: choosing any number between 1 and 10, both inclusive, from the real line
Uniform distribution
[Chart: pdf of the uniform distribution U(1, 10)]
Normal distribution [X ~ N(µ, σ²)]
f(x) = [1/(σX √(2π))] e^(−(x − µX)²/(2σX²)), −∞ < x < ∞
µX and σX² are the parameters, where µX ∈ R and σX² > 0
E[X] = µX, V[X] = σX²
Example: Consider the average age of a student between class VII and VIII selected at random from all the schools in the city of Kanpur.
Normal distribution
[Chart: pdf of the normal distribution]
Log-normal distribution [X ~ LN(µ, σ²)]
f(x) = [1/(x σX √(2π))] e^(−(log_e x − µX)²/(2σX²)), 0 < x < ∞
µX and σX² are the parameters, where µX ∈ R and σX² > 0
E[X] = exp(µX + σX²/2)
V[X] = exp(2µX + σX²)[exp(σX²) − 1]
Example: stock price return distributions
Log-normal distribution
[Chart: pdf of the log-normal distribution]
Relationship between Poisson and Exponential distribution
If a process has intervals between successive events that are independent, identically distributed and exponential, then the number of events in a specified time interval follows a Poisson distribution.
Normal distribution results
[Chart: normal density curve with the area between points a and b shaded]

Standard normal probabilities P(0 ≤ Z ≤ z):

z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
[Chart: standard normal density f(z) with z = 1.56 marked]
Finding probabilities using the standard normal distribution: P(0 < Z < 1.56). Look in the row labeled 1.5 and the column labeled .06 to find P(0 ≤ Z ≤ 1.56) = .4406.
Standard Normal distribution
Z ~ N(0, 1) is given by the equation
f(z) = [1/√(2π)] e^(−z²/2)
The area within an interval (a, b) is given by
F(a ≤ Z ≤ b) = ∫_a^b [1/√(2π)] e^(−z²/2) dz,
which is not integrable algebraically. The Taylor expansion of the above assists in speeding up the calculation:
F(Z ≤ z) = 1/2 + [1/√(2π)] Σ_{k=0}^{∞} (−1)^k z^(2k+1) / [(2k + 1) 2^k k!]
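The series converges quickly for moderate z; a sketch implementation, cross-checked against the table value P(0 ≤ Z ≤ 1.56) = .4406 and against the closed form via math.erf:

```python
from math import pi, sqrt, erf

def std_normal_cdf(z, terms=60):
    # F(z) = 1/2 + (1/sqrt(2*pi)) * sum_{k>=0} (-1)^k z^(2k+1) / ((2k+1) 2^k k!)
    s = 0.0
    term_num = z      # holds z^(2k+1), built up incrementally
    fact = 1.0        # holds k!
    pow2 = 1.0        # holds 2^k
    for k in range(terms):
        s += (-1) ** k * term_num / ((2 * k + 1) * pow2 * fact)
        term_num *= z * z
        pow2 *= 2.0
        fact *= k + 1
    return 0.5 + s / sqrt(2 * pi)

# Table lookup: row 1.5, column .06 gives P(0 <= Z <= 1.56) = .4406
p = std_normal_cdf(1.56) - 0.5
```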
Cumulative distribution function (cdf) or the distribution function
We denote the distribution function by F(x).
Discrete: F(x) = P(X ≤ x) = Σ_{xi ≤ x} f(xi)
Continuous: F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} dF(t)
Properties of distribution function
1) F(x) is non-decreasing in x, i.e., if x1 ≤ x2, then F(x1) ≤ F(x2)
2) F(x) → 0 as x → −∞
3) F(x) → 1 as x → +∞
4) F(x) is right continuous
Standard normal distribution
Putting Z = (X − µX)/σX in the normal distribution, we obtain the standard normal distribution, where µZ = 0 and σZ = 1.
Remember:
• F(x) = P(X ≤ x) = F(z) = P(Z ≤ z)
• f(z) = φ(z)
• f(z) = [1/√(2π)] e^(−z²/2)
• F(z) = P(Z ≤ z) = ∫_{−∞}^{z} f(t) dt = ∫_{−∞}^{z} dΦ(t) = Φ(z)
Standard normal distribution
[Chart: pdf of the standard normal distribution]
Exponential distribution [X ~ E(a, θ)]
f(x) = (1/θ) e^(−(x − a)/θ), a < x < ∞
a and θ are the parameters, where a ∈ R and θ > 0
E[X] = a + θ, V[X] = θ²
Example: the distribution of the number of hours an electric bulb survives.
Exponential distribution
[Chart: pdf of the exponential distribution]
Normal CDF Plot
[Chart: normal F(x) plotted against x]
Exponential CDF Plot
[Chart: exponential F(x) plotted against x]
Arrival time problem # 1
In a factory shop floor, for a certain CNC machine (machine # 1), the number of jobs arriving per unit time is given below:

# of arrivals:  0  1  2  3  4  5  6  7  8  9
Frequency:      2  4  4  1  2  1  4  6  4  1
Arrival time problem # 1
[Chart: frequency distribution of arrivals]
Arrival time problem # 1
[Chart: relative frequency of number of arrivals]
Arrival time problem # 1
[Chart: cumulative relative frequency of number of arrivals]
Arrival time problem # 1
1) The probability of the number of arriving jobs being equal to or more than 7 is 11/25 = 0.44.
2) The average number of arriving jobs is 5.4.
3) The probability of the number of arriving jobs being less than or equal to 4 is 13/25 = 0.52.
Covariance
SAT problem
This data set [ ] includes eight variables:
1) STATE: name of state
2) COST: current expenditure per pupil (measured in thousands of dollars per average daily attendance in public elementary and secondary schools)
3) RATIO: average pupil/teacher ratio in public elementary and secondary schools during Fall 1994
4) SALARY: estimated average annual salary of teachers in public elementary and secondary schools during 1994-95 (in thousands of dollars)
5) PERCENT: percentage of all eligible students taking the SAT in 1994-95
6) VERBAL: average verbal SAT score in 1994-95
7) MATH: average math SAT score in 1994-95
8) TOTAL: average total score on the SAT in 1994-95
SAT problem
[Chart: histogram of COST by state]
SAT problem
[Chart: histograms of COST and RATIO by state]
SAT problem
[Chart: histograms of COST, RATIO and SALARY by state]
SAT problem
Average value is given by: E(X) = (1/n) Σ_{i=1}^{n} Xi
Variance is given by: V(X) = (1/n) Σ_{i=1}^{n} [Xi − E(X)]²
Covariance is given by: Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = ρ_{X,Y} √(V(X) V(Y))
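These three definitions can be sketched directly (population forms, dividing by n; the small x and y vectors are illustrative, with y an exact multiple of x so the correlation must come out as +1):

```python
def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    # rho = Cov(X, Y) / sqrt(V(X) * V(Y))
    return covariance(xs, ys) / (variance(xs) * variance(ys)) ** 0.5

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # y = 2x exactly, a perfect positive linear relation
r = correlation(x, y)
```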
SAT problem
COST RATIO SALARY PERCENT VERBAL MATH TOTAL
Mean 5.90526 16.858 34.82892 35.24 457.14 508.78 965.92
Median 5.7675 16.6 33.2875 28 448 497.5 945.5
Maximum 9.774 24.3 50.045 81 516 592 1107
Minimum 3.656 13.8 25.994 4 401 443 844
Standard Deviation(1) 1.362807 2.266355 5.941265 26.76242 35.17595 40.20473 74.82056
Standard Deviation(2) 1.34911 2.243577 5.881552 26.49344 34.82241 39.80065 74.06857
SAT problem
COST RATIO SALARY PERCENT VERBAL MATH TOTAL
COST 1.820097 -1.12303 6.901753 21.18202 -19.2638 -18.7619 -38.0258
RATIO -1.12303 5.033636 -0.01512 -12.6639 4.98188 8.52076 13.50264
SALARY 6.901753 -0.01512 34.59266 96.10822 -97.6868 -93.9432 -191.63
PERCENT 21.18202 -12.6639 96.10822 701.9024 -824.094 -916.727 -1740.82
VERBAL -19.2638 4.98188 -97.6868 -824.094 1212.6 1344.731 2557.331
MATHS -18.7619 8.52076 -93.9432 -916.727 1344.731 1584.092 2928.822
TOTAL -38.0258 13.50264 -191.63 -1740.82 2557.331 2928.822 5486.154
SAT problem
COST RATIO SALARY PERCENT VERBAL MATH TOTAL
COST 1 -0.37103 0.869802 0.592627 -0.41005 -0.34941 -0.38054
RATIO -0.37103 1 -0.00115 -0.21305 0.063767 0.095422 0.081254
SALARY 0.869802 -0.00115 1 0.61678 -0.47696 -0.40131 -0.43988
PERCENT 0.592627 -0.21305 0.61678 1 -0.89326 -0.86938 -0.88712
VERBAL -0.41005 0.063767 -0.47696 -0.89326 1 0.970256 0.991503
MATHS -0.34941 0.095422 -0.40131 -0.86938 0.970256 1 0.993502
TOTAL -0.38054 0.081254 -0.43988 -0.88712 0.991503 0.993502 1
Inference
1) Point estimation
2) Interval estimation
3) Hypothesis testing
Sampling
• Population: size N – population distribution – parameter (θ)
• Sample: size n – sampling distribution – statistic (tn)
Types of sampling
• Probability sampling: simple random sampling, stratified random sampling, cluster sampling, multistage sampling, systematic sampling
• Judgement sampling: quota sampling, purposive sampling
Simple Random Sampling
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. This would be akin to SRSWOR (simple random sampling without replacement).
Simple Random Sampling
A simple random sample of size n from an infinite population is a sample selected such that the following conditions hold:
• Each element selected comes from the same population.
• Each element is selected independently.
This would be akin to SRSWR (simple random sampling with replacement).
Some Special Distributions Used in Inference
Chi-square distribution
Suppose Z1, Z2, …, Zn are n independent observations from N(0, 1); then
Z1² + Z2² + Z3² + … + Zn² ~ χ²n
Chi-square distribution
[X ~ χ²n]
f(x) = [1 / (2^(n/2) Γ(n/2))] x^(n/2 − 1) e^(−x/2), 0 ≤ x < ∞
n is the parameter (degrees of freedom), where n ∈ Z+
E[X] = n, V[X] = 2n
Chi-square distribution
t-distribution
Suppose Z ~ N(0, 1), Y ~ χ²n and they are independent; then
Z / √(Y/n) ~ tn
t-distribution [X ~ tn]
f(x) = Γ((n + 1)/2) / [√(nπ) Γ(n/2)] · [1 + x²/n]^(−(n+1)/2)
• n is the parameter, where n ∈ Z+
• E[X] = 0 (n > 1)
• V[X] = n/(n − 2) (n > 2)
t-distribution
F-distribution
Suppose X ~ χ²n, Y ~ χ²m and they are independent; then
(X/n) / (Y/m) ~ F(n, m)
F-distribution [X ~ F(n, m)]
f(x) = [Γ((n + m)/2) / (Γ(n/2) Γ(m/2))] (n/m)^(n/2) x^(n/2 − 1) [1 + (n/m)x]^(−(n+m)/2), 0 < x < ∞
n, m are the parameters (degrees of freedom), where n, m ∈ Z+
E[X] = m/(m − 2) (m > 2)
V[X] = 2m²(n + m − 2) / [n(m − 2)²(m − 4)] (m > 4)
F-distribution
Some results
If X1, X2, …, Xn are n observations from X ~ N(µX, σ²X) and
X̄n = (X1 + X2 + … + Xn)/n,
then
√n (X̄n − µX) / σX ~ N(0, 1)
Some results
If S²n,X = [1/(n − 1)] Σ_{j=1}^{n} (Xj − X̄n)², then
(n − 1) S²n,X / σ²X ~ χ²(n−1)
If S²n,µ = (1/n) Σ_{j=1}^{n} (Xj − µX)², then
n S²n,µ / σ²X ~ χ²n
Also,
√n (X̄n − µX) / Sn,X ~ t(n−1)
Some results
If X1, X2, …, XmX are mX observations from X ~ N(µX, σ²X), and Y1, Y2, …, YmY are mY observations from Y ~ N(µY, σ²Y), and moreover these samples are independent, then
{[(mX − 1) S²X / σ²X] / (mX − 1)} / {[(mY − 1) S²Y / σ²Y] / (mY − 1)} = (S²X / σ²X) / (S²Y / σ²Y) ~ F(mX − 1, mY − 1)
Estimation
Estimators and their properties
Estimator: any statistic (a random function) used to estimate the population parameter θ
Unbiasedness: Eθ(tn) = θ
Consistency: P[|tn − θ| < ε] → 1 as n → ∞
Estimators (Discrete distribution)
1) X ~ UD(a, b): â = min(X1, …, Xn) and b̂ = max(X1, …, Xn)
2) X ~ P(λ): λ̂ = X̄n
3) X ~ B(n, p): p̂ = (# favouring) / n
Estimators (Continuous distribution)
1) X ~ N(µ, σ²): µ̂ = X̄n
2) X ~ N(µ, σ²), µ known: σ̂² = (1/n) Σ_{i=1}^{n} (Xi − µ)²
3) X ~ N(µ, σ²), µ unknown: σ̂² = [1/(n − 1)] Σ_{i=1}^{n} (Xi − X̄n)²
4) X ~ E(θ): θ̂ = X̄n
Examples (Estimation)
Number of jobs arriving in a unit time for a CNC machine: consider we choose from a discrete distribution whose population distribution [X ~ UD(20, 35)] is not known. We sample from the distribution and the numbers obtained are 22, 34, 33, 21, 29, 29, 30. Then the best estimate for a is â = 21, and the best estimate for b is b̂ = 34.
Examples (Estimation)
You are testing the components coming out of the shop floor and find that 9 out of 30 components fail. Then the estimated value of p (proportion of bad items in the population) = 9/30.
Examples (Estimation)
At a particular teller machine in one of the bank branches it was found that the number of customers arriving in an unit time span were 4, 6, 7, 4, 3, 5, 6, and 5. Then for this Poisson process the estimated value of λ is 5.
Examples (Estimation)
Suppose it is known that the survival time of a particular type of bulb has the exponential distribution. You test 10 such bulbs and find their survival times to be 150, 225, 275, 300, 95, 155, 325, 75, 20 and 400 hours respectively. Then the estimated value of θ = sample mean = 202 hours.
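The four estimation examples above can be checked in a few lines; this is a sketch in Python (the variable names are ours, not from the lecture):

```python
from statistics import mean

# Uniform discrete UD(a, b): a-hat = sample minimum, b-hat = sample maximum
jobs = [22, 34, 33, 21, 29, 29, 30]
a_hat, b_hat = min(jobs), max(jobs)          # 21 and 34

# Binomial proportion: p-hat = (# of bad items) / (sample size)
p_hat = 9 / 30                               # 0.3

# Poisson P(lambda): lambda-hat = sample mean of arrivals per unit time
arrivals = [4, 6, 7, 4, 3, 5, 6, 5]
lam_hat = mean(arrivals)                     # 5

# Exponential E(theta): theta-hat = sample mean survival time
lifetimes = [150, 225, 275, 300, 95, 155, 325, 75, 20, 400]
theta_hat = mean(lifetimes)                  # 202
```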
Prediction
Multiple Linear Regression
Given ′k′ independent variables X1, X2, ….., Xk and one dependent variable Y, we predict the value of Y (denoted Ŷ or y) using the values of the Xi′s. We need ′n′ (n ≥ k+1) data points, and the multiple linear regression (MLR) equation is as follows:
Yj = β1x1,j + β2x2,j + ….. + βkxk,j + εj ∀ j = 1, 2, ….., n
Multiple Linear Regression
Note
There is no randomness in measuring the Xi′s.
The relationship is linear and not non-linear. By non-linear we mean that at least one derivative of Y wrt the βi′s is a function of at least one of the parameters. By parameters we mean the βi′s.
Multiple Linear Regression
Assumptions for the MLR:
Xi, Y are normally distributed
Xi are all non-stochastic
εj ~ N(0, σ²I)
rank(X) = k, n ≥ k, i.e., there is no linear dependence between the Xj′s
E(εjεl) = 0 ∀ j ≠ l, j, l = 1, 2, ….., n
Cov(Xi, εj) = 0 ∀ i, j
Multiple Linear Regression
Find β1, β2, ….., βk using the concept of minimizing the sum of squares of errors. This is also known as the least squares method, or the method of ordinary least squares. The estimates found, β̂1, β̂2, ….., β̂k, are the estimates of β1, β2, ….., βk respectively.
Utilize these estimates to find the forecasted value of Y (i.e., Ŷ or y) and compare it with the actual values of Y obtained in the future.
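As an illustration of the least squares computation, here is a minimal pure-Python sketch: β̂ solves the normal equations (XᵀX)β = XᵀY. The data are illustrative, not from the lecture, and an intercept (which the slides' model omits) is included here as a column of ones:

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Two regressors plus an intercept column; y = 1 + 2*x1 + 3*x2 exactly,
# so the recovered estimates should be close to [1, 2, 3].
X = [[1, x1, x2] for x1, x2 in [(0, 1), (1, 0), (1, 1), (2, 1), (2, 3)]]
y = [1 + 2 * x1 + 3 * x2 for _, x1, x2 in X]
Xt = transpose(X)
beta = solve(matmul(Xt, X), [sum(r * v for r, v in zip(row, y)) for row in Xt])
```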
To check for normality of data
We need to check for the normality of the Xi′s and Y.
1) List the observation number in column # 1, call it i.
2) List the data in column # 2.
3) Sort the data from the smallest to the largest and place in column # 3.
4) For each ith of the n observations, calculate the corresponding tail area of the standard normal distribution (Z) as follows: A = (i – 0.375)/(n + 0.25). Put the values in column # 4.
5) Use the NORMSINV(A) function in MS-EXCEL to produce a column of normal scores. Put these values in column # 5.
6) Make a copy of the sorted data (be sure to use paste special and paste only the values) in column # 6.
7) Make a scatter plot of the data in columns # 5 and # 6.
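Steps 3)–5) can also be reproduced outside EXCEL. A sketch in Python, where `NormalDist().inv_cdf` plays the role of NORMSINV (the helper name `normal_scores` is ours):

```python
from statistics import NormalDist

def normal_scores(data):
    """Sort the data and, for each rank i (1-based) out of n, compute
    A = (i - 0.375)/(n + 0.25) and the normal score NORMSINV(A),
    here via NormalDist().inv_cdf. Returns (score, value) pairs."""
    xs = sorted(data)
    n = len(xs)
    scores = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(scores, xs))   # plot these pairs as a scatter (step 7)

pairs = normal_scores([150, 225, 275, 300, 95, 155, 325, 75, 20, 400])
# A roughly straight-line scatter indicates approximate normality.
```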
To check for normality of data
[Scatter plot: normal score on the x-axis vs sorted data on the y-axis; an approximately straight line indicates normality]
Basic transformation for MLR
Some transformations to convert to MLR:
X → X(p) = (Xᵖ – 1)/p; as p → 0, X(p) → logeX
If the variability in Y increases with increasing values of Y, then we use
logeY = β1logeX1 + β2logeX2 + ….. + βklogeXk + ε
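The power transformation can be sketched directly (plain Python; the function name is ours), which also makes the log limit at p = 0 visible:

```python
import math

def power_transform(x, p):
    """Box-Cox-type transform X(p) = (x**p - 1)/p, with the log limit at p = 0."""
    if p == 0:
        return math.log(x)
    return (x ** p - 1) / p

# p = 1 leaves the data shifted by 1: (x - 1)/1.
# As p -> 0 the transform approaches log_e(x):
# power_transform(5.0, 1e-8) is close to math.log(5.0).
```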
Simple linear regression
In simple linear regression we have
Yj = α + βXj + εj ∀ j = 1, 2, ….., n
The question is how do we find α and β, provided we have ′n′ observations which constitute the sample.
We minimize the sum of squares of the errors wrt α and β:
Δ = Σj=1..n (Yj – α̂ – β̂Xj)²
Finally, the normal equations give:
E(Y) = α̂ + β̂E(X)
cov(X, Y) + E(X)E(Y) = α̂E(X) + β̂[V(X) + E(X)²]
Simple linear regression
After we have found the estimators of α and β, we use these values to predict/forecast the subsequent future values of Y, i.e., we find
ŷk = α̂ + β̂Xk
and compare them with the corresponding values of Yk, for k = n+1, n+2, …….
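Solving the two normal equations gives the familiar closed forms β̂ = cov(X, Y)/V(X) and α̂ = Ȳ − β̂X̄. A minimal sketch with illustrative data (not from the lecture):

```python
from statistics import mean

def fit_simple_lr(xs, ys):
    """Least-squares estimates for Y = alpha + beta*X + error:
    beta-hat = cov(X, Y)/V(X), alpha-hat = mean(Y) - beta-hat * mean(X)."""
    xbar, ybar = mean(xs), mean(ys)
    beta = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
           sum((x - xbar) ** 2 for x in xs)
    alpha = ybar - beta * xbar
    return alpha, beta

# Noise-free data on the line y = 2 + 0.5*x, so the fit recovers it exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2 + 0.5 * x for x in xs]
alpha, beta = fit_simple_lr(xs, ys)
y_next = alpha + beta * 6.0          # forecast for a future X = 6
```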
Simple linear regression
[Plot of Y, Ŷ (Y-hat) and the error against X]
Non-linear regression
• y = (β + γX)/(1 + αX)
• y = α(X − β)^γ
• y = α − βloge(X + γ)
• y = α[1 – exp(−βX)]^γ
NOTE: For all these and other models we minimize the sum of squares and find the parameters α, β and γ.
Simple Moving Average
3MA = 3 month Moving Average
5MA = 5 month Moving Average
Month Actual 3MA 5MA
Jan 266.00
Feb 145.90 198.33
Mar 183.10 149.43 178.92
Apr 119.30 160.90 159.42
May 180.30 156.03 176.60
Jun 168.50 193.53 184.88
Jul 231.80 208.27 199.58
Aug 224.50 216.37 188.10
Sep 192.80 180.07 221.70
Oct 122.90 217.40 212.52
Nov 336.50 215.10 206.48
Dec 185.90 238.90 197.82
Jan 194.30 176.57 215.26
Feb 149.50 184.63 202.62
Mar 210.10 210.97 203.72
Apr 273.30 224.93 222.26
May 191.40 250.57 237.56
Jun 287.00 234.80 256.26
Month Actual 3MA 5MA
Jul 226.00 272.20 259.58
Aug 303.60 273.17 305.62
Sep 289.90 338.37 301.12
Oct 421.60 325.33 324.38
Nov 264.50 342.80 331.60
Dec 342.30 315.50 361.70
Jan 339.7 374.13 340.56
Feb 440.4 365.33 375.52
Mar 315.9 398.53 387.32
Apr 439.3 385.50 406.86
May 401.3 426.00 433.88
Jun 437.4 471.40 452.22
Jul 575.5 473.50 500.76
Aug 407.6 555.03 515.56
Sep 682 521.63 544.34
Oct 475.3 579.53 558.62
Nov 581.3 567.83
Dec 646.9
Simple Moving Averages
[Line chart of Actual (yellow), 3MA (blue) and 5MA (red) values by month]
Centred Moving Average
4MA(1) = 4 month Moving Average, 4MA(2) = 4 month Moving Average shifted by one period
2X4MA = average of 4MA(1) and 4MA(2)
Mth Actual 4MA(1) 4MA(2) 2X4MA
Jan 266.00
Feb 145.90
Mar 183.10 178.58 157.15 167.86
Apr 119.30 157.15 162.80 159.98
May 180.30 162.80 174.98 168.89
Jun 168.50 174.98 201.28 188.13
Jul 231.80 201.28 204.40 202.84
Aug 224.50 204.40 193.00 198.70
Sep 192.80 193.00 219.18 206.09
Oct 122.90 219.18 209.53 214.35
Nov 336.50 209.53 209.90 209.71
Dec 185.90 209.90 216.55 213.23
Jan 194.30 216.55 184.95 200.75
Feb 149.50 184.95 206.80 195.88
Mar 210.10 206.80 206.08 206.44
Apr 273.30 206.08 240.45 223.26
May 191.40 240.45 244.43 242.44
Jun 287.00 244.43 252.00 248.21
Mth Actual 4MA(1) 4MA(2) 2X4MA
Jul 226.00 252.00 276.63 264.31
Aug 303.60 276.63 310.28 293.45
Sep 289.90 310.28 319.90 315.09
Oct 421.60 319.90 329.58 324.74
Nov 264.50 329.58 342.03 335.80
Dec 342.30 342.03 346.73 344.38
Jan 339.7 346.73 359.58 353.15
Feb 440.4 359.58 383.83 371.70
Mar 315.9 383.83 399.23 391.53
Apr 439.3 399.23 398.48 398.85
May 401.3 398.48 463.38 430.93
Jun 437.4 463.38 455.45 459.41
Jul 575.5 455.45 525.63 490.54
Aug 407.6 525.63 535.10 530.36
Sep 682 535.10 536.55 535.83
Oct 475.3 536.55 596.38 566.46
Nov 581.3 596.38
Dec 646.9
Centred Moving Average
[Line chart of Actual (yellow), 4MA(1) (red), 4MA(2) (blue) and 2X4MA (green) values by month]
Centred Moving Average
Note:
4MA(1)=(Y1+Y2+Y3+Y4)/4
4MA(2)=(Y2+Y3+Y4+Y5)/4
2X4MA=(Y1+2*Y2+2*Y3+2*Y4+Y5)/8
Similarly we can have 2X6MA, 2X8MA, 2X12MA etc.
Centred Moving Average
Note:
3MA(1)=(Y1+Y2+Y3)/3
3MA(2)=(Y2+Y3+Y4)/3
3MA(3)=(Y3+Y4+Y5)/3
3X3MA=(Y1+2*Y2+3*Y3+2*Y4+Y5)/9
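The simple and centred moving averages in the notes above can be sketched as follows (a minimal Python illustration, checked against the first 2X4MA value in the earlier table):

```python
def moving_average(ys, k):
    """k-point simple moving averages: (Y1+...+Yk)/k, (Y2+...+Yk+1)/k, ..."""
    return [sum(ys[i:i + k]) / k for i in range(len(ys) - k + 1)]

def centred_2xk_ma(ys, k):
    """2XkMA: average successive pairs of the k-point averages, which centres
    an even-order moving average (e.g. the 2X4MA above)."""
    ma = moving_average(ys, k)
    return [(a + b) / 2 for a, b in zip(ma, ma[1:])]

# First five actual values from the table (Jan-May of year 1).
ys = [266.0, 145.9, 183.1, 119.3, 180.3]
# 4MA(1) = 178.575, 4MA(2) = 157.15, so 2X4MA = 167.8625 (167.86 in the table).
```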
Weighted Moving Averages
In general a weighted k-point moving average can be written as
Tt = Σj=−m..m aj Yt+j
Note: The total of the weights is equal to 1
Weights are symmetric, i.e., aj = a−j
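The general weighted form can be coded directly. A sketch (the helper name is ours), using the 2X4MA weights (1, 2, 2, 2, 1)/8 from the earlier slides as a check:

```python
def weighted_ma(ys, weights):
    """Weighted (2m+1)-point moving average T_t = sum_{j=-m..m} a_j * Y_{t+j}.
    'weights' lists a_{-m}, ..., a_0, ..., a_m; they should sum to 1 and be
    symmetric (a_j = a_{-j})."""
    m = len(weights) // 2
    return [sum(w * ys[t + j] for j, w in zip(range(-m, m + 1), weights))
            for t in range(m, len(ys) - m)]

# The 2X4MA of the previous slides expressed as a 5-point weighted average:
ys = [266.0, 145.9, 183.1, 119.3, 180.3]
t3 = weighted_ma(ys, [1/8, 2/8, 2/8, 2/8, 1/8])   # one centred value
```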
Weighted Moving Averages
Steps are:
1) 4MA(1)=(Y1+Y2+Y3+Y4)/4
2) 4MA(2)=(Y2+Y3+Y4+Y5)/4
3) 4MA(3)=(Y3+Y4+Y5+Y6)/4
4) 4MA(4)=(Y4+Y5+Y6+Y7)/4
5) 4X4MA=(Y1+2*Y2+3*Y3+4*Y4+3*Y5+2*Y6+Y7)/16
6) 5X4X4MA = a-2*4X4MA(1) + a-1*4X4MA(2) + a0*4X4MA(3) + a1*4X4MA(4) + a2*4X4MA(5)
where a-2 = -3/4, a-1 = 3/4, a0 = 1, a1 = 3/4, a2 = -3/4
Exponential Smoothing Methods
1) Single Exponential Smoothing (one parameter, adaptive parameter)
2) Holt′s linear method (suitable for trends)
3) Holt-Winter′s method (suitable for trends and seasonality)
Single Exponential Smoothing
The general equation is:
Ft+1 = Ft + α(Yt – Ft) = αYt + (1 - α)Ft,
Note: Error term: Et = Yt - Ft
Forecast value: Ft
Actual value: Yt
Weight: α ∈ (0,1) α is such that sum of square of errors is minimized
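The recursion Ft+1 = αYt + (1 − α)Ft can be sketched as follows (with F1 = Y1, which the table on the next slide appears to assume):

```python
def ses(ys, alpha, f1=None):
    """Single exponential smoothing: F_{t+1} = alpha*Y_t + (1 - alpha)*F_t.
    Returns the forecasts F_1, ..., F_{n+1}; F_1 defaults to Y_1."""
    f = [f1 if f1 is not None else ys[0]]
    for y in ys:
        f.append(alpha * y + (1 - alpha) * f[-1])
    return f

ys = [200.0, 135.0, 195.0, 197.5, 310.0]
f = ses(ys, 0.1)
# f[2] = F_3 = 0.1*135 + 0.9*200 = 193.5, matching the Mar row
# of the F(t, 0.1) column in the table.
```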
Single Exponential Smoothing
Month  Y(t)    F(t,0.1)  F(t,0.5)  F(t,0.9)
Jan    200.0   200.0     200.0     200.0
Feb    135.0   200.0     200.0     200.0
Mar    195.0   193.5     167.5     141.5
Apr    197.5   193.7     181.3     189.7
May    310.0   194.0     189.4     196.7
Jun    175.0   205.6     249.7     298.7
Jul    155.0   202.6     212.3     187.4
Aug    130.0   197.8     183.7     158.2
Sep    220.0   191.0     156.8     132.8
Oct    277.5   193.9     188.4     211.3
Nov    235.0   202.3     233.0     270.9
Dec    ------  205.6     234.0     238.6
Single Exponential Smoothing
[Line chart of Y(t), F(t,0.1), F(t,0.5) and F(t,0.9) by month]
Extension of Exponential Smoothing
The general equation is:
Ft+1 = α1Yt + α2Ft + α3Ft-1
Note: Error term: Et = Yt – Ft
Forecast value: Ft
Actual value: Yt
Weights: αi ∈ (0,1) ∀ i = 1, 2 and 3; α1 + α2 + α3 = 1; the αi′s are such that the sum of squares of errors is minimized
Extension of Exponential Smoothing
[Line chart of Y(t) and F(t) by month over several years]
Adaptive Exponential Smoothing
The general equation is:
Ft+1 = αtYt + (1 - αt)Ft
Note: Error term: Et = Yt – Ft
Forecast value: Ft
Actual value: Yt
Smoothed Error: At = βEt + (1 - β)At-1
Absolute Smoothed Error: Mt = β|Et| + (1 - β)Mt-1
Weight: αt+1 = |At/Mt|
α and β are such that the sum of squares of errors is minimized
Adaptive Exponential Smoothing
Starting values:
F2 = Y1
α2 = β = 0.2
A1 = M1 = 0
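The recursions with these starting values can be sketched as below; the early entries match the table on the next slide, though later entries can differ slightly depending on exactly when a spreadsheet first applies the updated weight:

```python
def adaptive_ses(ys, beta=0.2):
    """Adaptive exponential smoothing as defined above:
    E_t = Y_t - F_t,  A_t = beta*E_t + (1-beta)*A_{t-1},
    M_t = beta*|E_t| + (1-beta)*M_{t-1},  alpha_{t+1} = |A_t/M_t|,
    F_{t+1} = alpha_t*Y_t + (1-alpha_t)*F_t.
    Starting values: F_2 = Y_1, alpha_2 = beta, A_1 = M_1 = 0.
    Returns the forecasts as a dict {t: F_t}."""
    F = {2: ys[0]}
    a = beta
    A = M = 0.0
    for t in range(2, len(ys) + 1):
        E = ys[t - 1] - F[t]
        A = beta * E + (1 - beta) * A
        M = beta * abs(E) + (1 - beta) * M
        F[t + 1] = a * ys[t - 1] + (1 - a) * F[t]
        a = abs(A / M) if M else beta
    return F

F = adaptive_ses([200.0, 135.0, 195.0])
# F[3] = 0.2*135 + 0.8*200 = 187.0, as in the Mar row of the table.
```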
Adaptive Exponential Smoothing
Month  Y(t)    F(t)    E(t)    A(t)    M(t)   α     β
Jan    200.0                    0.0     0.0         0.2
Feb    135.0   200.0   -65.0  -13.0    13.0   0.2
Mar    195.0   187.0     8.0   -8.8    12.0   1.0
Apr    197.5   188.6     8.9   -5.3    11.4   0.7
May    310.0   190.4   119.6   19.7    33.0   0.5
Jun    175.0   214.3   -39.3    7.9    34.3   0.6
Jul    155.0   206.4   -51.4   -4.0    37.7   0.2
Aug    130.0   196.2   -66.2  -16.4    43.4   0.1
Sep    220.0   182.9    37.1   -5.7    42.1   0.4
Oct    277.5   190.3    87.2   12.9    51.1   0.1
Nov    235.0   207.8    27.2   15.7    46.4   0.3
Dec            213.2                          0.3
Adaptive Exponential Smoothing
[Line chart of Y(t) and F(t) by month]
Adaptive Exponential Smoothing
[Line chart of A(t), M(t), E(t), F(t) and Y(t) by month]
Holt-Winter′s Method
The general equations are:
1) Lt = αYt/St−s + (1 − α)(Lt−1 + bt−1)
2) bt = β(Lt – Lt−1) + (1 − β)bt−1
3) St = γYt/Lt + (1 − γ)St−s for t > s
4) Ft+m = (Lt + btm)St−s+m
5) Si = Yi/Ls where Ls = (Y1 + … + Ys)/s, for i ≤ s
Note: Forecast value: Ft
Actual value: Yt
Trend: bt
Seasonal component: St
Length of seasonality: s
α, β and γ are chosen such that the sum of squares of errors is minimized
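Equations 1)–5) can be sketched as below. Note one assumption of ours: the slides give no starting trend, so this sketch initialises bs = 0; a production implementation would estimate the initial trend from the data.

```python
def holt_winters(ys, s, alpha, beta, gamma, m=1):
    """Multiplicative Holt-Winters per equations 1)-5) above.
    Initialisation: L_s = (Y_1+...+Y_s)/s, S_i = Y_i/L_s for i <= s,
    and b_s = 0 (an assumption -- the slides give no starting trend).
    Returns the m-step-ahead forecast F_{n+m}."""
    L = sum(ys[:s]) / s
    b = 0.0
    S = [y / L for y in ys[:s]]              # S_1 .. S_s
    for t in range(s, len(ys)):              # 0-based index t is period t+1
        y = ys[t]
        L_prev = L
        L = alpha * y / S[t - s] + (1 - alpha) * (L + b)     # eq. 1
        b = beta * (L - L_prev) + (1 - beta) * b             # eq. 2
        S.append(gamma * y / L + (1 - gamma) * S[t - s])     # eq. 3
    return (L + b * m) * S[len(ys) - s + m - 1]              # eq. 4

# A purely seasonal series with period 2 and no trend: the forecast
# for the next period should reproduce the pattern (10).
f = holt_winters([10.0, 20.0, 10.0, 20.0], s=2, alpha=0.5, beta=0.3, gamma=0.4)
```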