eng.mosab i. tabash applied statistics. eng.mosab i. tabash session 1 : lesson 1...

28
Eng.Mosab I. Tabash Applied Statistics Applied Statistics

Upload: jade-allen

Post on 12-Jan-2016

220 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Applied StatisticsApplied StatisticsApplied StatisticsApplied Statistics

Page 2: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Session 1 : Lesson 1Session 1 : Lesson 1

Introduction Introduction

to to

StatisticsStatistics

Introduction Introduction

to to

StatisticsStatistics

Page 3: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

ObjectivesObjectives

To define statisticsTo define statistics To discuss the wide range of To discuss the wide range of

applications of statistics in businessapplications of statistics in business To understand the branches of To understand the branches of

statisticsstatistics To describe the levels of To describe the levels of

measurement of datameasurement of data

Page 4: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

What is Statistics? Science of collecting, organizing,

presenting, analyzing, and interpreting data for the purpose of assisting in making more effective decision

Branch of mathematics Facts and figures A subject or discipline Collections of data A way to get information from data

Page 5: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

What is Statistics?

“Statistics is a way to get information from data”

Data

Statistics

Information

Data: Facts, especially numerical facts, collected together for reference or information.

Definitions: Oxford English Dictionary

Information: Knowledge communicated concerning some particular fact.

Statistics is a tool for creating new understanding from a set of numbers.

Page 6: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Example - Stats Anxiety…A business school student is anxious about their statistics course, since they’ve heard the course is difficult. The professor provides last term’s final exam marks to the student. What can be discerned from this list of numbers?

Data

Statistics

Information

List of last term’s marks.

958970657857:

New information about the statistics class.

E.g. Class average,Proportion of class receiving A’sMost frequent mark,Marks distribution, etc.

Page 7: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Applications of Statistics in Business

Accounting – auditing and cost estimation Finance – investments and portfolio

management Human resource – compensation, job

satisfaction, performance measure Operation – quality management, forecasting,

MIS, capacity planning, materials control Marketing - market analysis, consumer

research, pricing Economics – regional, national, and

international economic performance International Business- market and

demographic analysis.

Page 8: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Key Statistical Concepts… Population

— a population is the group of all items of interest to a statistics practitioner.

— frequently very large; sometimes infinite.

E.g. All Blue collar workers in Malaysia

Sample

— A sample is a set of data drawn from the population.

— Potentially very large, but less than the population.

E.g. a sample of 765 blue collar workers

Page 9: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Key Statistical Concepts…

Parameter— A descriptive measure of a population.

Statistic— A descriptive measure of a sample.

Page 10: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Key Statistical Concepts…

Populations have Parameters,Populations have Parameters, Samples have Statistics.Samples have Statistics.

Parameter

Population Sample

Statistic

Subset

Page 11: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Branches of Statistics

Statistics

Descriptive Statistics Inferential Statistics

Non-Parametric StatisticsParametric Statistics

Page 12: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Descriptive Statistics… …are methods of organizing, summarizing, and

presenting data in a convenient and informative way. These methods include: Graphical Techniques Numerical Techniques

The actual method used depends on what information we would like to extract. Are we interested in… measure(s) of central location? and/or measure(s) of variability (dispersion)?

Descriptive Statistics helps to answer these questions…

Page 13: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Inferential Statistics… Descriptive Statistics describe the data set that’s

being analyzed, but doesn’t allow us to draw any conclusions or make any interferences about the data. Hence we need another branch of statistics: inferential statistics.

Inferential statistics is also a set of methods, but it is used to draw conclusions or inferences about characteristics of populations based on data from a sample.

Page 14: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Statistical Inference…Statistical inference is the process of making an estimate, prediction, or decision about a population based on a sample.

Parameter

Population

Sample

Statistic

Inference

What can we infer about a Population’s Parametersbased on a Sample’s Statistics?

Page 15: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Statistical Inference…We use statistics to make inferences about parameters.

Therefore, we can make an estimate, prediction, or decision about a population based on sample data.

Thus, we can apply what we know about a sample to the larger population from which it was drawn!

Page 16: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Statistical Inference… Inference… Rationale:

•Large populations make investigating each member impractical and expensive.

•Easier and cheaper to take a sample and make estimates about the population from the sample.

However:Such conclusions and estimates are not always going to be correct.

For this reason, we build into the statistical inference “measures of reliability”, namely confidence level and significance level.

Page 17: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Confidence & Significance Levels…

The confidence level is the proportion of times that an estimating procedure will be correct.

E.g. a confidence level of 95% means that, estimates based on this form of statistical inference will be correct 95% of the time.

When the purpose of the statistical inference is to draw a conclusion about a population, the significance level measures how frequently the conclusion will be wrong in the long run.

E.g. a 5% significance level means that, in the long run, this type of conclusion will be wrong 5% of the time.

Page 18: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Process of Inferential Process of Inferential StatisticsStatistics

Process of Inferential Process of Inferential StatisticsStatistics

Population

(parameter)

Sample

x

(statistic)

Calculate x

to estimate

Select a

random sample

Page 19: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Types of Data and InformationTypes of Data and Information

Definitions…A variable is some characteristic of a population or sample.E.g. student grades; workers salaryTypically denoted with a capital letter:A,A-, B+, B, B-…The values of the variable are the range of possible values for a variable.E.g. student marks (0..100)Data are the observed values of a variable.E.g. student marks: {67, 74, 71, 83, 93, 55, 48}

Page 20: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Types of Data & Information

Data (at least for purposes of Statistics) fall into three main groups:

Interval Data Nominal Data

Ordinal Data

Page 21: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Interval Data…Interval data

• Real numbers, i.e. heights, weights, prices, etc.

• Also referred to as quantitative or numerical.

Arithmetic operations can be performed on Interval Data, thus its meaningful to talk about 2*Height, or Price + $1, and so on.

Page 22: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Nominal Data…Nominal Data…Nominal Data

• The values of nominal data are categories.

E.g. responses to questions about marital status, coded as:

Single = 1, Married = 2, Divorced = 3, Widowed = 4

Because the numbers are arbitrary, arithmetic operations don’t make any sense (e.g. does Widowed ÷ 2 = Married?!)

Nominal data are also called qualitative or categorical.

Page 23: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Ordinal Data…Ordinal Data…Ordinal Data appear to be categorical in nature, Ordinal Data appear to be categorical in nature, but their values have an but their values have an orderorder; a ranking to them:; a ranking to them:

E.g. College course rating system:E.g. College course rating system:

poor = 1, fair = 2, good = 3, very good = 4, poor = 1, fair = 2, good = 3, very good = 4, excellent = 5excellent = 5

While its still not meaningful to do arithmetic on While its still not meaningful to do arithmetic on this data (e.g. does 2*fair = very good?!), we can this data (e.g. does 2*fair = very good?!), we can say things like:say things like:

excellent > poorexcellent > poor oror fair < very goodfair < very good

That is, order is maintained no matter what That is, order is maintained no matter what numeric values are assigned to each category.numeric values are assigned to each category.

Page 24: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Types of Data & Information…

Categorical?DataInterval

Data

Nominal Data

Ordinal Data

N

Ordered?

Y

Y

N

Categorical Data

Page 25: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

E.g. Representing Student Grades…

Categorical?DataInterval Datae.g. {0..100}

Nominal Datae.g. {Pass | Fail}

Ordinal Datae.g. {F, D, C, B,

A}

N

Ordered?

Y

Y

N

Categorical Data

Rank order to data

NO rank order to data

Page 26: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Calculations for Types of DataAs mentioned above,

• All calculations are permitted on interval data.

• Only calculations involving a ranking process are allowed for ordinal data.

• No calculations are allowed for nominal data, only counting the number of observations in each category is possible.

This lends itself to the following “hierarchy of data”…

Page 27: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Hierarchy of Data…IntervalInterval

Values are real numbers.Values are real numbers.

All calculations are valid.All calculations are valid.

Data may be treated as ordinal or nominal.Data may be treated as ordinal or nominal.

OrdinalOrdinal

Values must represent the ranked order of the data.Values must represent the ranked order of the data.

Calculations based on an ordering process are valid.Calculations based on an ordering process are valid.

Data may be treated as nominal but not as interval.Data may be treated as nominal but not as interval.

Nominal Nominal

Values are the arbitrary numbers that represent Values are the arbitrary numbers that represent categories.categories.

Only calculations based on the frequencies of Only calculations based on the frequencies of occurrence are valid.occurrence are valid.

Data may not be treated as ordinal or interval.Data may not be treated as ordinal or interval.

Page 28: Eng.Mosab I. Tabash Applied Statistics. Eng.Mosab I. Tabash Session 1 : Lesson 1 IntroductiontoStatisticsIntroductiontoStatistics

Eng.Mosab I. Tabash

Data Level, Operations, and Statistical MethodsData Level, Operations, and Statistical Methods

Data Level

Nominal

Ordinal

Interval

Meaningful Operations

Classifying and Counting

All of the above plus Ranking

All of the above plus Addition, Subtraction, Multiplication, and Division

StatisticalMethods

Nonparametric

Nonparametric

Parametric