Quantitative Methods for Decision Making
Lecture 1
Dr. Akhter
5th edition
Marking Scheme
Midterm 30%
Final Exam 40%
Quizzes 15% (mean of best five quizzes, each of 15 points)
Assignments 15% (mean of best 7 assignments, each of 15 points)
Book
Introductory Statistics, 9th Edition
Neil A. Weiss
Addison-Wesley
ISBN-13: 978-0-321-69122-4
ISBN-10: 0-321-69122-9
Topics
Gathering information and its Presentation
Measures of central tendency
Measures of Dispersion
Probability Concepts
Random & Non Random Variables
Some Special Distributions
The Normal distribution
Fitting of a distribution
Sampling distributions
Topics
Estimation Theory
Mathematical Models
Regression & Correlation
Decision Theory (p-value approach)
Decision based on risk
Experimental Designs
Case studies related to the CRD and RBD using some
industrial and financial data sets
Setting up ANOVA tables and Decision Making
Computer Support producing group research
Statistics
Statistics (as subject): The science of collecting and analyzing data for the purpose
of drawing conclusions and making decisions. It provides data collection methods to reduce biases, and
analysis methods to identify patterns and draw inferences from noisy data.
Statistics (facts and figures): An aggregate of numerical facts, e.g. statistics of scores,
statistics of marks, statistics of wages, etc.
Statistic (constant): A characteristic of a sample
Important terms
Population: homogeneous, heterogeneous, finite, infinite, hypothetical, existent
Census: Complete enumeration
Sampling frame (or frame): A complete list of all elements in the population
Sampling, Sample, Random Sample
Parameter: Characteristic of a population
Statistic: Characteristic of a sample
Statistical Methods
Descriptive Statistics
Inferential Statistics
• Descriptive statistics consists of methods for organizing,
displaying, and describing data by using tables, graphs, and
summary measures.
• Descriptive statistics is concerned with exploring, visualising, and
summarizing data but without fitting the data to any models.
• This kind of analysis is used to explore the data in the initial stages
of data analysis.
• Since no models are involved, it cannot be used to test hypotheses
or to make testable predictions.
• Nevertheless, it is a very important part of analysis that can reveal
many interesting features in the data.
Descriptive statistics
Inferential statistics
Involves the identification of a suitable model. The data is then fit to the model to obtain an optimal estimation of the model's parameters.
The model then undergoes validation by testing either predictions or hypotheses of the model.
Models based on a unique sample of data can be used to infer generalities about features of the whole population.
Using Statistics (Two Categories)
Inferential Statistics
• Predict and forecast values of population parameters
• Test hypotheses about values of population parameters
• Make decisions
Descriptive Statistics
• Collect
• Organize
• Summarize
• Display
• Analyze
Types of Data - Two Types
Qualitative (Categorical or Nominal): Color, Gender, Nationality
Quantitative (Measurable or Countable): Temperatures, Salaries, Number of points scored on a 100 point exam
Data
Collection of facts and figures
May be qualitative or quantitative
Quantitative data may be discrete or continuous
May be in ungrouped or grouped form
A population consists of the set of all
measurements in which the investigator
is interested.
A sample is a subset of the measurements
selected from the population.
A census is a complete enumeration of
every item in a population.
Samples and Populations
Sampling from the population is often
done randomly, such that every possible
sample of equal size (n) will have an
equal chance of being selected.
A sample selected in this way is called a
simple random sample or just a random
sample.
A random sample allows chance to
determine its elements.
Simple Random Sample
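Drawing a simple random sample from a sampling frame can be sketched with Python's standard `random` module, whose `sample` function selects without replacement so that every subset of size n is equally likely. The frame of 20 numbered elements and the function name below are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, all subsets equally likely."""
    rng = random.Random(seed)   # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: elements of the population numbered 1 to 20
frame = list(range(1, 21))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

Because chance alone determines which elements are chosen, repeating the draw with a different seed generally gives a different sample.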
Random Sampling
Stratified Sampling
Cluster Sampling
Systematic Sampling
Judgment Sampling
Quota Sampling
Sampling Techniques
Parameter: A population constant
Statistic: A sample constant
Parameter and Statistic
Population (N): $\mu, \sigma^2, \rho, \pi$
Sample (n): $\bar{x}, s^2, r, p$
Samples and Populations
Census of a population may be:
Impossible
Impractical
Too costly
Why Sample?
Subscript Notation
$X_i$: X is the list name, i is the subscript
$X_{ij}$: double subscript
\[
\begin{matrix}
X_{11} & X_{12} & X_{13} \\
X_{21} & X_{22} & X_{23} \\
X_{31} & X_{32} & X_{33}
\end{matrix}
\]
Summation Notations
\[
\sum_{i=1}^{N} X_i
\]
Here $\sum$ denotes summation, i is the index, 1 is the start value and N is the stop value.
Sigma Notation
Suppose our list has just 5 numbers, and
they are 1, 3, 2, 5, 6.
\[
\sum_{i=1}^{5} X_i^2 = 1^2 + 3^2 + 2^2 + 5^2 + 6^2 = 75
\]
\[
\left( \sum_{i=1}^{5} X_i \right)^2 = (1 + 3 + 2 + 5 + 6)^2 = 17^2 = 289
\]
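The difference between the sum of squares and the square of the sum for the list 1, 3, 2, 5, 6 can be checked numerically. The following is a minimal sketch; the variable names are illustrative and not part of the lecture.

```python
# The list from the slide: X_1..X_5
X = [1, 3, 2, 5, 6]

# Sum of squares: square each value first, then add
sum_of_squares = sum(x ** 2 for x in X)

# Square of the sum: add the values first, then square the total
square_of_sum = sum(X) ** 2

print(sum_of_squares)  # 75
print(square_of_sum)   # 289
```

The two quantities differ whenever the list has more than one non-zero value, which is why the position of the exponent relative to the sigma sign matters.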
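Properties of Sigma
\[
\sum_{i=1}^{N} a = Na
\]
\[
\sum_{i=1}^{N} aX_i = a \sum_{i=1}^{N} X_i
\]
\[
\sum_{i=1}^{N} (X_i + Y_i) = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} Y_i
\]
\[
\sum_{i=1}^{N} (X_i - Y_i) = \sum_{i=1}^{N} X_i - \sum_{i=1}^{N} Y_i
\]
\[
\sum_{i=x}^{y} a = (y - x + 1)a
\]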
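Properties of Sigma
Show that
\[
\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\]
where $\bar{x}$ is the arithmetic mean of the data, which is
\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{or} \quad \sum_{i=1}^{n} x_i = n\bar{x}
\]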
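One way to show the identity $\sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$ is to expand the square and then use $\sum x_i = n\bar{x}$. The steps below are a sketch of that argument, using only the properties of sigma listed earlier.

```latex
\begin{align*}
\sum_{i=1}^{n} (x_i - \bar{x})^2
  &= \sum_{i=1}^{n} \left( x_i^2 - 2\bar{x}x_i + \bar{x}^2 \right) \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x} \sum_{i=1}^{n} x_i + \sum_{i=1}^{n} \bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - 2\bar{x}(n\bar{x}) + n\bar{x}^2 \\
  &= \sum_{i=1}^{n} x_i^2 - n\bar{x}^2
\end{align*}
```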
Sigma Notation
=
Commonly used Greek Letters
2
1
2 5N
j
i
X
Expand
Exercise
In a survey it was found that 64 families bought milk in the
following quantities (liters) in a particular month:
19 22 09 22 12 39 19 14 23 06 24 16 18
7 17 20 25 28 18 10 24 20 21 10 07 18
28 24 20 14 24 25 34 22 05 33 23 26 29
13 36 11 26 11 37 30 13 08 15 22 21 32
21 31 17 16 23 12 09 15 27 17 21 16
(a) Construct a frequency distribution using 5 intervals
(b) Construct histogram, polygon, and frequency curve
(c) Construct c.f. distributions and draw Ogives
(d) Construct relative, cumulative relative, and percentage relative frequency distributions.
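Part (a) of the exercise could be approached as in the sketch below. It assumes five equal-width classes starting at the smallest observation, with the width rounded up so that the largest observation still falls in the last class. That choice of class limits is an assumption; other reasonable limits give different counts.

```python
import math

# Milk purchases (liters) of the 64 families, as listed in the exercise
milk = [
    19, 22, 9, 22, 12, 39, 19, 14, 23, 6, 24, 16, 18,
    7, 17, 20, 25, 28, 18, 10, 24, 20, 21, 10, 7, 18,
    28, 24, 20, 14, 24, 25, 34, 22, 5, 33, 23, 26, 29,
    13, 36, 11, 26, 11, 37, 30, 13, 8, 15, 22, 21, 32,
    21, 31, 17, 16, 23, 12, 9, 15, 27, 17, 21, 16,
]

k = 5                                            # number of intervals asked for
low = min(milk)                                  # 5
width = math.ceil((max(milk) - low + 1) / k)     # (39 - 5 + 1) / 5 = 7

# Count the observations falling in each class
freq = [0] * k
for x in milk:
    freq[(x - low) // width] += 1

# Cumulative and relative frequencies for parts (c) and (d)
cum_freq = [sum(freq[: i + 1]) for i in range(k)]
rel_freq = [f / len(milk) for f in freq]

for i in range(k):
    lo_limit = low + i * width
    hi_limit = lo_limit + width - 1
    print(f"{lo_limit:2d}-{hi_limit:2d}  f={freq[i]:2d}  cf={cum_freq[i]:2d}  rf={rel_freq[i]:.3f}")
```

With these limits the classes are 5-11, 12-18, 19-25, 26-32 and 33-39. The frequencies feed directly into the histogram, polygon and ogive asked for in parts (b) and (c).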
Arithmetic Mean: The central value
Grouped data, ungrouped data
Unweighted, weighted
Combined arithmetic mean
Assumed mean, trimmed mean
Median: The middle observation in arranged data
Ungrouped data (even, odd number of observations)
Grouped data
Graphical method of finding median
Mode: The most frequent observation
Ungrouped data
Grouped data
Graphical method of finding mode
Relationship between mean, median, and mode
Quartiles are the percentage points that break down
the ordered data set into quarters.
The first quartile is the 25th percentile. It is the point
below which lie 1/4 of the data.
The second quartile is the 50th percentile. It is the
point below which lie 1/2 of the data. This is also
called the median.
The third quartile is the 75th percentile. It is the
point below which lie 3/4 of the data.
Quartiles
The first quartile, Q1, (25th percentile) is
often called the lower quartile.
The second quartile, Q2, (50th
percentile) is often called median or the
middle quartile.
The third quartile, Q3, (75th percentile)
is often called the upper quartile.
The interquartile range is the difference
between the third and the first quartiles: IQR = Q3 - Q1.
Quartiles and Interquartile Range
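Example: Finding Quartiles
Position of the Pth percentile in the sorted data: (n+1)P/100
Sales: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17
Sorted Sales: 6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24
Quartiles to locate: First Quartile (P = 25), Median (P = 50), Third Quartile (P = 75)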
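The (n+1)P/100 position rule can be sketched in code as follows, applied to the 20 sales values of this example. The function name `percentile` and the linear interpolation between neighbouring sorted values are assumptions for illustration; statistical packages offer several slightly different quartile rules.

```python
def percentile(sorted_data, p):
    """Pth percentile by the (n + 1)P/100 position rule with linear interpolation."""
    n = len(sorted_data)
    pos = (n + 1) * p / 100      # 1-based position in the sorted list
    k = int(pos)                 # whole part: position of the lower neighbour
    frac = pos - k               # fractional part used for interpolation
    if k < 1:
        return sorted_data[0]
    if k >= n:
        return sorted_data[-1]
    lower, upper = sorted_data[k - 1], sorted_data[k]
    return lower + frac * (upper - lower)

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
sorted_sales = sorted(sales)

q1 = percentile(sorted_sales, 25)      # position 5.25
median = percentile(sorted_sales, 50)  # position 10.5
q3 = percentile(sorted_sales, 75)      # position 15.75
iqr = q3 - q1

print(q1, median, q3, iqr)  # 13.25 16.0 18.75 5.5
```

For the first quartile the position is 5.25, so the result lies a quarter of the way from the 5th sorted value (13) to the 6th (14).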
Measures of Variability
Range
Interquartile range
Variance
Standard Deviation
Measures of Central Tendency
Median
Mode
Mean
Other summary
measures:
Skewness
Kurtosis
Summary Measures: Population Parameters Sample Statistics
Measures of Central Tendency or Location
Median: Middle value when sorted in order of magnitude; the 50th percentile
Mode: Most frequently occurring value
Mean: Average
Sales Sorted Sales
9 6
6 9
12 10
10 12
13 13
15 14
16 14
14 15
14 16
16 16
17 16
16 17
24 17
21 18
22 18
18 19
19 20
18 21
20 22
17 24
Median (50th percentile)
Position: (20+1)(50)/100 = 10.5
Median = 16 + (0.5)(0) = 16, since the 10th and 11th sorted values are both 16
The median is the middle
value of data sorted in
order of magnitude. It is
the 50th percentile.
Example – Median (Data is used from previous example )
Dot plot of the sales data, given as the number of dots above each value:
Value:     6  9 10 12 13 14 15 16 17 18 19 20 21 22 24
Frequency: 1  1  1  1  1  2  1  3  2  2  1  1  1  1  1
Mode = 16
The mode is the most frequently occurring value. It
is the value with the highest frequency.
Example - Mode (Data is used from Example 1-2)
The mean of a set of observations is their average -
the sum of the observed values divided by the
number of observations.
Population Mean
\[
\mu = \frac{\sum_{i=1}^{N} x_i}{N}
\]
Sample Mean
\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\]
Arithmetic Mean or Average
\[
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{317}{20} = 15.85
\]
Sales: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17
Total: 317
Example – Mean
Dot plot of the sales data, given as the number of dots above each value:
Value:     6  9 10 12 13 14 15 16 17 18 19 20 21 22 24
Frequency: 1  1  1  1  1  2  1  3  2  2  1  1  1  1  1
Median and Mode = 16
Mean = 15.85
Example - Mode
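The three measures of central tendency for the sales data can be reproduced with Python's standard `statistics` module. This is a small check of the slide values, not a prescribed method.

```python
from statistics import mean, median, mode

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

sales_mean = mean(sales)      # sum of values / number of values = 317 / 20
sales_median = median(sales)  # average of the 10th and 11th sorted values
sales_mode = mode(sales)      # most frequently occurring value

print(sales_mean, sales_median, sales_mode)  # 15.85 16.0 16
```

The mean (15.85) falls slightly below the median and mode (both 16) because the two smallest values, 6 and 9, pull it down.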
Dividing data into groups or classes or intervals
Groups should be:
Mutually exclusive
• Not overlapping - every observation is assigned to only one group
Exhaustive
• Every observation is assigned to a group
Equal-width (if possible)
• First or last group may be open-ended
Group Data and the Histogram
Table with two columns listing:
Each and every group or class or interval of values
Associated frequency of each group
• Number of observations assigned to each group
• Sum of frequencies is number of observations
– N for population
– n for sample
Class midpoint is the middle value of a group or class or interval
Relative frequency is the proportion of total observations in each class
Sum of relative frequencies = 1
Frequency Distribution
Spending Class ($) | Frequency (number of customers) | Relative Frequency
x | f(x) | f(x)/n
0 to less than 100 | 30 | 0.163
100 to less than 200 | 38 | 0.207
200 to less than 300 | 50 | 0.272
300 to less than 400 | 31 | 0.168
400 to less than 500 | 22 | 0.120
500 to less than 600 | 13 | 0.070
Total | 184 | 1.000
• Example of relative frequency: 30/184 = 0.163
• Sum of relative frequencies = 1
Example : Frequency Distribution
Spending Class ($) | Cumulative Frequency | Cumulative Relative Frequency
x | F(x) | F(x)/n
0 to less than 100 | 30 | 0.163
100 to less than 200 | 68 | 0.370
200 to less than 300 | 118 | 0.641
300 to less than 400 | 149 | 0.810
400 to less than 500 | 171 | 0.929
500 to less than 600 | 184 | 1.000
The cumulative frequency of each group is the sum of the
frequencies of that and all preceding groups.
Cumulative Frequency Distribution
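The relative and cumulative columns of the spending example follow mechanically from the class frequencies. The sketch below recomputes them, with illustrative variable names. When each class is rounded on its own, the last relative frequency comes out as 0.071 rather than the 0.070 shown in the table; the table value was presumably adjusted so that the column adds to exactly 1.000.

```python
from itertools import accumulate

# Class frequencies for spending classes 0-100, 100-200, ..., 500-600
freq = [30, 38, 50, 31, 22, 13]
n = sum(freq)                             # 184 customers

rel_freq = [f / n for f in freq]          # f(x)/n
cum_freq = list(accumulate(freq))         # F(x): running total of frequencies
cum_rel_freq = [c / n for c in cum_freq]  # F(x)/n

for f, r, c, cr in zip(freq, rel_freq, cum_freq, cum_rel_freq):
    print(f"{f:3d}  {r:.3f}  {c:3d}  {cr:.3f}")
```

The last cumulative frequency always equals n and the last cumulative relative frequency always equals 1, which gives a quick check on the arithmetic.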
A histogram is a chart made of bars of
different heights.
Widths and locations of bars correspond to
widths and locations of data groupings
Heights of bars correspond to frequencies or
relative frequencies of data groupings
Histogram
Frequency Histogram
Histogram Example
Relative Frequency Histogram
Histogram Example
Skewness – Measure of asymmetry of a frequency distribution
• Skewed to left
• Symmetric or unskewed
• Skewed to right
Kurtosis – Measure of flatness or peakedness of a frequency
distribution
• Platykurtic (relatively flat)
• Mesokurtic (normal)
• Leptokurtic (relatively peaked)
Skewness and Kurtosis
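The slides describe skewness and kurtosis qualitatively. One common way to quantify them, assumed here because the lecture gives no formula, uses central moments: skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where a normal (mesokurtic) curve has kurtosis 3. A minimal sketch:

```python
def central_moment(data, k):
    """k-th central moment: average of (x - mean)^k over the data."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    # Negative: skewed to left; zero: symmetric; positive: skewed to right
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    # Below 3: platykurtic; about 3: mesokurtic; above 3: leptokurtic
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 10]

print(skewness(symmetric))        # 0.0 for a symmetric list
print(skewness(right_skewed) > 0) # True: long tail on the right
print(kurtosis(symmetric))        # about 1.7, flatter than normal
```

Other definitions exist (for example sample-adjusted versions, or "excess" kurtosis that subtracts 3), so values from software may differ slightly from these.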
Skewness (figures): Skewed to left, Symmetric, Skewed to right
Kurtosis (figures):
Platykurtic - flat distribution
Mesokurtic - not too flat and not too peaked
Leptokurtic - peaked distribution
Pie Charts
Categories represented as percentages of total
Bar Graphs
Heights of rectangles represent group frequencies
Frequency Polygons
Height of line represents frequency
Ogives
Height of line represents cumulative frequency
Time Plots
Represents values over time
Methods of Displaying Data
Pie Chart
Bar Chart
Fig. 1-11 Airline Operating Expenses and Revenues (bars: Average Revenues and Average Expenses; airlines: American, Continental, Delta, Northwest, Southwest, United, USAir)
Frequency Polygon and Ogive
Relative Frequency Polygon (horizontal axis: Sales, 0 to 50; vertical axis: 0.0 to 0.3)
Ogive (horizontal axis: Sales, 0 to 50; vertical axis: 0.0 to 1.0)
Time Plot
Monthly Steel Production (Problem 1-46): millions of tons, plotted by month
Stem-and-Leaf Displays
Quick-and-dirty listing of all observations
Conveys some of the same information as a histogram
Box Plots
Median
Lower and upper quartiles
Maximum and minimum
Techniques to determine relationships and trends,
identify outliers and influential observations, and
quickly describe or summarize data sets.
1-9 Exploratory Data Analysis - EDA
Example: Stem-and-Leaf Display
Construct a stem & leaf graph of the following data
11, 12, 12, 13, 15, 15, 15, 16, 17, 20, 21, 21,
21, 22, 22, 22, 23, 24, 26, 27, 27, 27, 28, 29, 29,
30, 31, 32, 34, 35, 37, 41, 41, 42, 45, 47, 50, 52, 53, 56, 62
Stem | Leaves
1 | 122355567
2 | 0111222346777899
3 | 012457
4 | 11257
5 | 0236
6 | 02
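Elements of a Box Plot
Median, Q1, Q3
Interquartile Range (IQR): Q3 - Q1, the length of the box
Inner fences: Q1-1.5(IQR) and Q3+1.5(IQR)
Outer fences: Q1-3(IQR) and Q3+3(IQR)
Whiskers (X): extend to the smallest data point not below the inner fence and the largest data point not exceeding the inner fence
Suspected outlier (*): point between the inner and outer fences
Outlier (o): point beyond the outer fence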
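The fence formulas Q1-1.5(IQR), Q3+1.5(IQR), Q1-3(IQR) and Q3+3(IQR) can be sketched as a small helper. The function names and the labels returned by `classify` are illustrative, and the quartiles used (13.25 and 18.75) are the ones the (n+1)P/100 rule gives for the 20 sales values in the earlier example.

```python
def box_plot_fences(q1, q3):
    """Return (inner_low, inner_high, outer_low, outer_high) for a box plot."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr, q1 - 3 * iqr, q3 + 3 * iqr)

def classify(x, q1, q3):
    """Label a point relative to the fences."""
    inner_low, inner_high, outer_low, outer_high = box_plot_fences(q1, q3)
    if x < outer_low or x > outer_high:
        return "outlier"
    if x < inner_low or x > inner_high:
        return "suspected outlier"
    return "inside fences"

q1, q3 = 13.25, 18.75                     # quartiles of the sales data
fences = box_plot_fences(q1, q3)
print(fences)                             # (5.0, 27.0, -3.25, 35.25)
print(classify(24, q1, q3))               # inside fences
print(classify(30, q1, q3))               # suspected outlier
print(classify(40, q1, q3))               # outlier
```

Since the smallest and largest sales values (6 and 24) lie inside the inner fences, the whiskers of that box plot would run all the way to the data extremes.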
Box Plot
Example: Box and Whisker Plots
Order numbers
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
• First, order your numbers from least to
greatest:
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Median
• Then find the median (from the ordered list):
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
• Cross off one number from each side until you reach
the middle number (or numbers).
• If there are two numbers in the middle,
add those 2 middle numbers together:
6 + 7 = 13
• Then divide by 2:
13 ÷ 2 = 6.5
• The median is 6.5.
Quartiles
• Then split the numbers on left and right sides
of the median:
1, 2, 3, 4, 5, 6, 6 │ 7, 8, 9, 10, 11, 13, 14
• Find the median for each half:
Left median = 4, Right median = 10
• The left median is called the LOWER
QUARTILE.
• The right median is called the UPPER
QUARTILE.
Number line
• Draw a number line from the smallest to the
largest number without skipping any numbers.
1 2 3 4 5 6 7 8 9 10 11 12 13 14
• Put circles at the LOWER and UPPER
Quartiles (4 and 10).
• Draw a box connecting the circles at the
LOWER and UPPER Quartiles.
• Put a circle at the median (6.5).
• Draw a line connecting the median to the box.
• Put circles at the high and low points (1 and 14).
• Draw lines that connect the high and low
points to the box.
Box and Whisker Plot
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
Here is the completed Box and Whisker Plot!
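The five values needed for this box and whisker plot (low point, lower quartile, median, upper quartile, high point) can be computed with the same "median of each half" procedure. The sketch below is illustrative; leaving the middle value out of both halves when the count is odd is an assumption, since the example has an even count.

```python
from statistics import median

def five_number_summary(values):
    """Low, lower quartile, median, upper quartile, high (median-of-halves rule)."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower_half = xs[:half]
    # For an odd count, skip the middle value so neither half contains it
    upper_half = xs[half:] if n % 2 == 0 else xs[half + 1:]
    return (xs[0], median(lower_half), median(xs), median(upper_half), xs[-1])

numbers = [3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7]
summary = five_number_summary(numbers)
print(summary)  # (1, 4, 6.5, 10, 14)
```

This rule can give slightly different quartiles from the (n+1)P/100 position rule used earlier in the lecture, so it helps to state which rule a plot is based on.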
Example: Box Plot
Histogram
Histograms
Frequency Polygons & the Ogive
Two Frequency Polygons
Pie Chart
Bar Chart
Box Plot
Box Plot Compare Two Data Sets
Time Plot
Time Plot
Testing Normality
Check the normality of the following data
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
Table of normal scores
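A normality check of this kind pairs the sorted data with normal scores and looks for a roughly straight-line relationship. The sketch below approximates the normal scores with the plotting positions (i - 0.375)/(n + 0.25) instead of reading them from a table; that formula, the helper names, and the use of a correlation coefficient as a summary are assumptions for illustration, and the scores may differ slightly from the textbook table.

```python
from statistics import NormalDist

def normal_scores(n):
    """Approximate normal scores for a sample of size n (plotting-position formula)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

data = [3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7]
ordered = sorted(data)
scores = normal_scores(len(ordered))

# Pairs (observation, normal score) are the points of a normal probability plot
for x, z in zip(ordered, scores):
    print(f"{x:3d}  {z:6.2f}")

r = pearson_r(ordered, scores)
print(round(r, 3))  # close to 1 when the plot is roughly linear
```

A correlation near 1 is consistent with normality but does not prove it; the plot itself should still be inspected for curvature and outliers.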
Questions?