Post on 19-Jan-2022

Page 1: Quantitative Statistical Methods

• Gazdaságtudományi Kar (Faculty of Economics) • Gazdaságelméleti és Módszertani Intézet (Institute of Economic Theory and Methodology)

Quantitative Statistical Methods

Page 2: Quantitative Statistical Methods

Required Readings:

• Petra Petrovics: SPSS Tutorial and Exercise Book
• Quantitative Information Forming Methods, 08.modul (TAMOP – 4.1.2-08/1/A-2009-0049 Virtuális vállalatok)
  http://elearning.infotec.hu/ilias.php?baseClass=ilSAHSPresentationGUI&ref_id=2774

Proposed Readings:

• Chris Brooks: Introductory Econometrics for Finance, Cambridge; Second Edition
• Richard A. DeFusco, CFA – Dennis W. McLeavey, CFA – Jerald E. Pinto, CFA – David E. Runkle, CFA: Quantitative Investment Analysis, CFA Series; Second Edition

Page 3: Quantitative Statistical Methods

Petra Petrovics

Introduction to Statistics

Page 4: Quantitative Statistical Methods

Statistics

Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data.

• Practical activity – to analyze data

• Set of data – as a result of statistical activity

• Method
  – Analyzing data
  – Drawing conclusions

Page 5: Quantitative Statistical Methods

Data Gathering

• Trends and reports overview

• Observations

• Interview

• Focus group

• Survey

• Photo interview

Page 6: Quantitative Statistical Methods

Statistics

Descriptive Statistics

• Study of how data can be summarized effectively to describe the important aspects of large data sets

• It turns data into information

• Data collection & analysis

Statistical Inference

• It is used when tentative conclusions about a population are drawn on the basis of a sample

Page 7: Quantitative Statistical Methods

Statistical Population

• All members of a specified group (N)

• It is a set of entities concerning which statistical inferences are to be drawn, often based on a random sample taken from the population.

– Discrete population

– Continuous population (interval)

Page 8: Quantitative Statistical Methods

Statistical Variables

= Characteristic of a unit.

(1)
• Quantitative
• Qualitative
• Temporal
• Geographical

(2)
• Common
• Differential

Page 9: Quantitative Statistical Methods

Quantitative vs. Qualitative

• Quantitative data measure either how much or how many of something, i.e. a set of observations where any single observation is a number that represents an amount or a count.

• Qualitative data provide labels, or names, for categories of like items, i.e. a set of observations where any single observation is a word or code that represents a class or category.

~ categorical variable

Page 10: Quantitative Statistical Methods

Types of Quantitative Variables

• Continuous variables are those variables that theoretically have an infinite number of gradations between two measurements.
For example, body weight of individuals, milk yield of cows or buffaloes etc. Most of the variables in biology are of continuous type.

• Discrete variables do not have continuous gradations; there is a definite gap between two measurements, i.e. they cannot be measured in fractions.
For example, number of eggs laid by hens, number of children in a family etc.

Page 11: Quantitative Statistical Methods

Scales of measurement

from weakest to strongest

- nominal scale

- ordinal scale

- interval scale

- ratio scale

Page 12: Quantitative Statistical Methods

1. Nominal scale

• Numbers are labels of groups or classes
• Simple codes assigned to objects as labels
• For qualitative data, e.g. professional classification, geographic classification
• e.g.
  - blonde: 1, brown: 2, red: 3, black: 4 (a person with red hair does not possess more "hairness" than a person with blonde hair)
  - female: 1, male: 2

Page 13: Quantitative Statistical Methods

2. Ordinal scale

• Data elements may be ordered according to their relative size or quality; the numbers assigned to objects or events represent the rank order (1st, 2nd, 3rd etc.)

• e.g. top lists of companies

Page 14: Quantitative Statistical Methods

3. Interval scale

• Distances between any two observations have a meaning

• The "zero point" is arbitrary

• Negative values can be used

• Ratios between numbers on the scale are not meaningful, so operations such as multiplication and division cannot be carried out directly

• e.g. temperature with the Celsius scale

Page 15: Quantitative Statistical Methods

4. Ratio scale

• Strongest scale of measurement

• Distances between observations and also the ratios of distances have a meaning

• Contains a meaningful zero

• e.g. mass, length, time

a salary of $50,000 is twice as large as a salary of $25,000
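The difference between the interval and the ratio scale can be checked with a few lines of arithmetic. The sketch below is only an illustration: the two temperatures are made-up values, and the conversion to Kelvin is used as an example of moving the arbitrary zero point.

```python
# Interval vs. ratio scale: ratios only make sense with a meaningful zero.
celsius_a, celsius_b = 10.0, 20.0  # hypothetical temperatures, not from the slides

# Differences are meaningful on an interval scale ...
difference = celsius_b - celsius_a

# ... but the ratio changes when the arbitrary zero point is moved.
ratio_celsius = celsius_b / celsius_a
kelvin_a, kelvin_b = celsius_a + 273.15, celsius_b + 273.15
ratio_kelvin = kelvin_b / kelvin_a

# Ratio scale (meaningful zero): the salary example from the slide.
salary_ratio = 50_000 / 25_000

print(difference, ratio_celsius, round(ratio_kelvin, 3), salary_ratio)
```

The Celsius ratio suggests "twice as warm", while the same two temperatures in Kelvin differ by only a few percent; the salary ratio does not depend on any such choice.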

Page 16: Quantitative Statistical Methods

SPSS (Statistical Package for the Social Sciences)

• computer program used for statistical analysis

• 2 files: XY.sav - Data View
           XY.spo - Output

[Screenshot annotations:]
• Just with upper case!!!
• Short name; don't use space!!
• It can be a longer name
• Number of the characters in the Data View
• Width of a column

Page 17: Quantitative Statistical Methods

Review of Bivariate Correlation and Regression

Page 18: Quantitative Statistical Methods

Types of dependence

• association – between two nominal variables

• mixed – between a nominal and a ratio-scale variable

• correlation – between ratio-scale variables

Page 19: Quantitative Statistical Methods

• X (or X1, X2, … , Xp): known variable(s) / independent variable(s) / predictor(s)

• Y: unknown variable / dependent variable

• causal relationship: X "causes" Y to change

Correlation: describes the strength of a relationship, the degree to which one variable is linearly related to another

Regression: shows us how to determine the nature of a relationship between two or more variables

Page 20: Quantitative Statistical Methods

Correlation Measures

1. Covariance

2. Coefficient of correlation

3. Coefficient of determination

4. Coefficient of rank correlation

Page 21: Quantitative Statistical Methods

Correlation Measures

1. Covariance

The covariance between two variables is a measure of the joint variation of the two variables

– ranges from −∞ to +∞;

– Cov = 0, when X and Y are uncorrelated;

– its sign shows the direction of correlation;

– it doesn't measure the degree of relationship!!!

$\mathrm{Cov}(x, y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
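Why the covariance shows direction but not strength can be seen in a small sketch. The data below are hypothetical (not taken from the slides), and `covariance` is just an illustrative helper implementing the sample formula with the n − 1 divisor.

```python
def covariance(xs, ys):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_xy = covariance(x, y)
# Expressing y in a different unit rescales the covariance as well,
# although the relationship itself is no stronger.
cov_rescaled = covariance(x, [100 * v for v in y])
print(cov_xy, cov_rescaled)
```

The sign stays positive in both cases, but the magnitude depends on the units of measurement, which is why a standardized measure is needed to judge strength.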

Page 22: Quantitative Statistical Methods

2. Coefficient of correlation (Pearson)

• its sign shows the direction of correlation

• it measures the strength of correlation

• 0 < |r| < 1 → statistical dependence
  r = 0 → X and Y are uncorrelated
  r = -1 → perfect negative linear relationship
  r = 1 → perfect positive linear relationship

• It can be used only in case of a linear relationship!

$r = \frac{\mathrm{Cov}(x, y)}{s_x s_y}$

Page 23: Quantitative Statistical Methods

3. Coefficient of determination

• r²

• The square of the sample correlation coefficient between the outcomes and their predicted values.

• Measures the degree of correlation in percentage (%)

• It provides a measure of how well future outcomes are likely to be predicted by the model.

• Varies from 0 to 1.

$r^2 = \frac{S_{\hat{y}}}{S_y} = 1 - \frac{S_e}{S_y}$

Page 24: Quantitative Statistical Methods

Example

• A firm administers a test to sales trainees before they go into the field. The management of the firm is interested in determining the relationship between the test scores and the sales made by the trainees at the end of one year in the field. The following data were collected for 45 sales personnel who have been in the field one year.

• Calculate different correlation measures!

Page 25: Quantitative Statistical Methods

X → Y (independent → dependent variable)

Salesperson | Test score (x_i) | Number of units sold (y_i) | d_x = x_i − x̄ | d_y = y_i − ȳ | d_x d_y = (x_i − x̄)(y_i − ȳ)
K. A. | 25 | 188 | +9 | +22 | +198
L. Z. | 16 | 157 | 0 | -9 | 0
B. E. | 30 | 165 | +14 | -1 | -14
G. P. | 5 | 124 | -11 | -42 | +462
… | … | … | … | … | …
… | … | … | … | … | …
S. G. | 10 | 158 | -6 | -8 | +48
J. T. | 24 | 224 | +8 | +58 | +464
V. P. | 17 | 169 | +1 | +3 | +3
T. L. | 6 | 114 | -10 | -52 | +520
Total | 716 | 7 464 | 0 | 0 | ∑dxdy = 8 894.5

Page 26: Quantitative Statistical Methods

Number of observed pairs: n = 45

$\bar{x} = 16, \quad s_x = 8.26$

$\bar{y} = 166, \quad s_y = 30.99$

$C = \frac{\sum d_x d_y}{n - 1} = \frac{8\,894.5}{45 - 1} = 202.15$

Positive correlation

Page 27: Quantitative Statistical Methods

$r = \frac{C}{s_x s_y} = \frac{202.15}{8.26 \cdot 30.99} = 0.7897$

$r^2 = 62.36\,\%$

There is a strong & positive relation between test scores and number of units sold.

The variation of test scores explains 62.36 percent of the variation of number of units sold.
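The arithmetic of the sales example can be re-traced from the summary figures quoted on the slides (n, ∑dxdy, s_x, s_y). This is only a check of the published numbers, since the full data set of 45 salespeople is not listed.

```python
# Summary figures quoted in the worked example.
n = 45
sum_dxdy = 8894.5
s_x, s_y = 8.26, 30.99

cov = sum_dxdy / (n - 1)   # covariance C
r = cov / (s_x * s_y)      # Pearson correlation coefficient
r_squared = r ** 2         # coefficient of determination

print(round(cov, 2), round(r, 4), round(100 * r_squared, 2))
```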

Page 28: Quantitative Statistical Methods

4. Coefficient of rank correlation (Spearman)

• Measure of the relationship between two ordinal variables

• n = number of paired observations,
  d = difference between the ranks for each pair of observations.

• perfect correlation → rs = 1
  perfect inverse correlation → rs = -1
  in case of independence → rs = 0

$r_s = 1 - \frac{6 \sum d_i^2}{n (n^2 - 1)}$

$0 \le |r_s| \le 1$

Page 29: Quantitative Statistical Methods

Example

Ten students were ranked by their mathematical and musical ability:

Student | A | B | C | D | E | F | G | H | I | J | Total
Mathematics | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | -
Music | 3 | 4 | 1 | 2 | 5 | 7 | 10 | 6 | 8 | 9 | -
d_i = x_i - y_i | -2 | -2 | 2 | 2 | 0 | -1 | -3 | 2 | 1 | 1 | 0
d_i² | 4 | 4 | 4 | 4 | 0 | 1 | 9 | 4 | 1 | 1 | 32

$r_s = 1 - \frac{6 \sum d_i^2}{n (n^2 - 1)} = 1 - \frac{6 \cdot 32}{10 (10^2 - 1)} = 0.806$

strong relationship
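The rank correlation of the ten students can be recomputed directly from the two rank rows. The helper name `spearman_from_ranks` is an illustrative choice; the formula it uses assumes there are no tied ranks, which holds for this example.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), valid without ties."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks of students A..J from the example.
mathematics = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
music = [3, 4, 1, 2, 5, 7, 10, 6, 8, 9]

r_s = spearman_from_ranks(mathematics, music)
print(round(r_s, 3))
```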

Page 30: Quantitative Statistical Methods

Simple Linear Regression Model

• We model the relationship between two variables, X and Y, as a straight line.

• The model contains two parameters:
  ▪ an intercept parameter,
  ▪ a slope parameter.

Y = β0 + β1x + ε

Y = deterministic component + random error

where:
Y – dependent or response variable (the variable we wish to explain or predict)
x – independent or predictor variable
ε – random error component
β0 – y-intercept of the line, i.e. the point at which the line intercepts the y-axis
β1 – slope of the line

[Figure: the line E(y) plotted against x; β0 = y-intercept, β1 = slope]

Page 31: Quantitative Statistical Methods

[Figure: scatter of y against x around a straight line, marking the deterministic component and the random error]

• y = deterministic component + random error

• We always assume that the mean value of the random error equals 0 → the mean value of y equals the deterministic component.

• It is possible to find many lines for which the sum of the errors is equal to 0, but there is one (and only one) line for which the SSE (sum of squares of the errors) is a minimum:

→ least squares line / regression line.

$\hat{y}_i = b_0 + b_1 x_i$

Page 32: Quantitative Statistical Methods

• The method of least squares gives us the best linear unbiased estimators (BLUE) of the regression parameters β0, β1 (under the classical Gauss–Markov assumptions).

• The least-squares estimators:

b0 estimates β0

b1 estimates β1

• The (empirical) regression line, where ŷ is read as y caret ("hat"):

$\hat{y} = b_0 + b_1 x$

• Calculation of the estimators:

$f(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \to \min!$

Page 33: Quantitative Statistical Methods

Least Squares Method

• There is an extreme value (minimum) if the partial derivatives are equal to 0:

$\frac{\partial f}{\partial b_0} = -2 \sum (y_i - b_0 - b_1 x_i) = 0$

$\frac{\partial f}{\partial b_1} = -2 \sum x_i (y_i - b_0 - b_1 x_i) = 0$

• After transformation…

• The normal equations (with one x):

Σy = nb0 + b1Σx
Σxy = b0Σx + b1Σx²

• The estimated regression line:

ŷ = b0 + b1x
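The two normal equations form a 2×2 linear system, so b0 and b1 can be obtained by elimination. The sketch below assumes a small hypothetical data set (not from the slides); `least_squares_line` is an illustrative helper name.

```python
def least_squares_line(xs, ys):
    """Solve the two normal equations for b0 and b1 (one predictor)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating b0 from the normal equations gives b1;
    # b0 then follows from sum_y = n*b0 + b1*sum_x.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n
    return b0, b1

# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = least_squares_line(x, y)
print(b0, b1)
```

Substituting the returned b0 and b1 back into both normal equations is a quick way to confirm the solution.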

Page 34: Quantitative Statistical Methods

Interpretation

• b0: when x = 0, ŷ = b0. It shows the estimated value of Y if the X variable is 0.

• b1: for every 1 unit increase in x we expect y to change by b1 units on average. It shows the average difference in Y if X is higher by 1 unit.

Page 35: Quantitative Statistical Methods

No relationship

[Scatter plot: number of births (0 to 4000) against number of storks (0 to 40)]

Page 36: Quantitative Statistical Methods

Independence

[Scatter plot titled "Nincs korreláció" (No correlation); fitted line Y = -7.4E-02 + 0.208348 X, R-Sq = 3.4 %]

Page 37: Quantitative Statistical Methods

Positive correlation

[Scatter plot titled "Pozitív korreláció" (Positive correlation); fitted line Y = -8.6E-02 + 0.690286 X, R-Sq = 62.5 %]

Page 38: Quantitative Statistical Methods

Negative correlation

[Scatter plot titled "Negatív korreláció" (Negative correlation); fitted line Y = 5.07E-02 - 0.647872 X, R-Sq = 70.9 %]

Page 39: Quantitative Statistical Methods

Curvilinear relation

[Scatter plot titled "Nem lineáris korreláció" (Nonlinear correlation); fitted curve Y = 12.0958 + 6.07684 X + 1.16686 X**2, R-Sq = 88.4 %]

Page 40: Quantitative Statistical Methods

Scatter diagrams

[Figure: four scatter diagrams, grouped by direction (direct relationship, positive slope / inverse relationship, negative slope) and by shape (linear / curvilinear). Panels: wastage against production (number of products per day); sales in $ against advertising in $; selling price against age of a house (year); selling price against age of a car (year)]

Page 41: Quantitative Statistical Methods

Power regression

Y = a X^b

log Y = log a + b log X

↓ ↓ ↓

V = b0 + b1 ∙ x (with V = lg Y, and x standing for lg X)

b1 = b

b0 = lg a

Normal equations:

$\sum \lg y = n b_0 + b_1 \sum \lg x$

$\sum \lg x \lg y = b_0 \sum \lg x + b_1 \sum (\lg x)^2$
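The linearization can be tried out on synthetic data. In the sketch below the points are generated from Y = 2·X^1.5 (an assumed example, not from the slides), both variables are transformed with lg, and an ordinary least squares line is fitted; `fit_line` and `fit_power` are illustrative helper names.

```python
import math

def fit_line(us, vs):
    """Least squares intercept and slope of v on u."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suv = sum(u * v for u, v in zip(us, vs))
    suu = sum(u * u for u in us)
    b1 = (n * suv - su * sv) / (n * suu - su ** 2)
    b0 = (sv - b1 * su) / n
    return b0, b1

def fit_power(xs, ys):
    """Fit Y = a * X**b by regressing lg Y on lg X (requires X, Y > 0)."""
    b0, b1 = fit_line([math.log10(x) for x in xs],
                      [math.log10(y) for y in ys])
    return 10 ** b0, b1  # a = 10**b0, b = b1

# Synthetic data generated from Y = 2 * X**1.5 (assumed for illustration).
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * v ** 1.5 for v in x]
a, b = fit_power(x, y)
print(round(a, 6), round(b, 6))
```

Note that least squares is applied to the logarithms here, so with noisy data the result minimizes squared errors in lg Y rather than in Y itself.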

Page 42: Quantitative Statistical Methods

Compound regression

Y = a b^x

log Y = log a + (log b) x

↓ ↓ ↓

V = b0 + b1 ∙ x (with V = lg Y)

b1 = lg b

b0 = lg a

Normal equations:

$\sum \lg y = n b_0 + b_1 \sum x$

$\sum x \lg y = b_0 \sum x + b_1 \sum x^2$
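For the compound model only Y is transformed. The sketch below generates points from Y = 3·1.2^x (an assumed example, not from the slides) and recovers a and b from the fitted intercept and slope; the helper names are illustrative.

```python
import math

def fit_line(us, vs):
    """Least squares intercept and slope of v on u."""
    n = len(us)
    su, sv = sum(us), sum(vs)
    suv = sum(u * v for u, v in zip(us, vs))
    suu = sum(u * u for u in us)
    b1 = (n * suv - su * sv) / (n * suu - su ** 2)
    b0 = (sv - b1 * su) / n
    return b0, b1

def fit_compound(xs, ys):
    """Fit Y = a * b**x by regressing lg Y on x (requires Y > 0)."""
    b0, b1 = fit_line(xs, [math.log10(y) for y in ys])
    return 10 ** b0, 10 ** b1  # a = 10**b0, b = 10**b1

# Synthetic data generated from Y = 3 * 1.2**x (assumed for illustration).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [3.0 * 1.2 ** v for v in x]
a, b = fit_compound(x, y)
print(round(a, 6), round(b, 6))
```

The recovered b can be read as a growth factor: each unit increase in x multiplies the fitted Y by b.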

Page 43: Quantitative Statistical Methods

Estimation in Regression

• Regression estimation is a technique used to replace missing values in data.

• If we know:

1. The estimated parameter value;

2. The hypothesized value of the parameter;

3. Confidence interval around the estimated parameter.

• The number of degrees of freedom equals the number of observations minus the number of parameters estimated.

• ν = n − 2

Page 44: Quantitative Statistical Methods

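Estimation in Regression

Parameter | Estimated value | Standard error | Confidence interval
β0 | b0 | $s_{b_0} = s_e \sqrt{\frac{\sum x_i^2}{n \sum (x_i - \bar{x})^2}}$ | $b_0 \pm t \cdot s_{b_0}$
β1 | b1 | $s_{b_1} = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}$ | $b_1 \pm t \cdot s_{b_1}$
Y0 (in case of average Y values) | ŷ0 | $s_{\hat{y}_0} = s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$ | $\hat{y}_0 \pm t \cdot s_{\hat{y}_0}$
Y0 (in case of discrete, i.e. individual, Y values) | ŷ0 | $s_{\hat{y}_0} = s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$ | $\hat{y}_0 \pm t \cdot s_{\hat{y}_0}$

t is the critical value of Student's t distribution with ν = n − 2 degrees of freedom.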
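The standard errors in the table can be computed step by step. The sketch below uses a small hypothetical data set (not from the slides); the critical value `t_crit` is an assumption read from a t-table for ν = 3 and a two-sided 95 % interval, and `regression_standard_errors` is an illustrative helper name.

```python
import math

def regression_standard_errors(xs, ys, x0):
    """Standard errors of b0, b1 and of the prediction at x0."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))  # residual standard error, df = n - 2
    return {
        "b0": b0, "b1": b1, "s_e": s_e,
        "s_b0": s_e * math.sqrt(sum(x * x for x in xs) / (n * sxx)),
        "s_b1": s_e / math.sqrt(sxx),
        "s_mean_y0": s_e * math.sqrt(1 / n + (x0 - mean_x) ** 2 / sxx),
        "s_single_y0": s_e * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx),
    }

# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
res = regression_standard_errors(x, y, x0=3.0)

t_crit = 3.182  # assumed: two-sided 95 % critical value of t with 3 df (t-table)
ci_b1 = (res["b1"] - t_crit * res["s_b1"], res["b1"] + t_crit * res["s_b1"])
print(res["s_b0"], res["s_b1"], ci_b1)
```

The interval for an individual Y value is always wider than the one for the average Y at the same x0, because of the extra 1 under the square root.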

Page 45: Quantitative Statistical Methods

Elasticity

$E(y, x) = \frac{b_1 x}{b_0 + b_1 x}$

Elasticity at the mean:

$E(y, \bar{x}) = b_1 \frac{\bar{x}}{\bar{y}}$

Elasticity relates the % change in y to the % change in x.
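A short numeric sketch of the two elasticity formulas. The coefficients and the mean of x are hypothetical values chosen for illustration; the mean of y is derived from them, using the fact that the least squares line passes through the point of means.

```python
def elasticity(b0, b1, x):
    """Point elasticity of the fitted line y = b0 + b1*x at x."""
    return b1 * x / (b0 + b1 * x)

# Hypothetical coefficients and mean, not taken from the slides.
b0, b1 = 2.2, 0.6
mean_x = 3.0
mean_y = b0 + b1 * mean_x  # the least squares line passes through the means

e_at_mean = elasticity(b0, b1, mean_x)
print(round(e_at_mean, 4), round(b1 * mean_x / mean_y, 4))
```

Both expressions give the same value at the mean; here a 1 % increase in x around its mean goes with roughly a 0.45 % increase in the fitted y.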

Page 46: Quantitative Statistical Methods

Residual variable

$e_i = y_i - \hat{y}_i$

$y_i = \hat{y}_i + e_i$

$y_i - \bar{y} = (\hat{y}_i - \bar{y}) + e_i$

$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

$S_y = S_{\hat{y}} + S_e$

S_y: sum of squares of Y
S_ŷ: sum of squares explained by regression
S_e: sum of squares of the errors
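The decomposition can be verified numerically. The sketch below fits the least squares line to a hypothetical data set (not from the slides) and checks that the total sum of squares splits into the explained and the residual part; variable names are illustrative.

```python
# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

s_total = sum((yi - mean_y) ** 2 for yi in y)              # S_y
s_regression = sum((yh - mean_y) ** 2 for yh in y_hat)     # S_y-hat
s_error = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # S_e

r_squared = s_regression / s_total
print(s_total, s_regression + s_error, r_squared)
```

The identity holds for the least squares line specifically; for an arbitrary line the cross term does not vanish and the two parts need not add up to S_y.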

Page 47: Quantitative Statistical Methods

Analysis of Variance in Regression Analysis

Source of variation | Sum of Squares | Df | Mean Sum of Squares | F
Regression | $S_{\hat{y}} = \sum (\hat{y}_i - \bar{y})^2$ | 1 | $S_{\hat{y}}$ | $F = \frac{S_{\hat{y}}}{S_e / (n-2)}$
Residual | $S_e = \sum (y_i - \hat{y}_i)^2$ | n-2 | $s_e^2 = S_e / (n-2)$ |
Total | $S_y = \sum (y_i - \bar{y})^2$ | n-1 | $S_y / (n-1)$ |

$S_y = S_{\hat{y}} + S_e$

$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
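The rows of the ANOVA table can be filled in with a few lines of code. The sketch assumes a small hypothetical data set (not from the slides); `anova_simple_regression` is an illustrative helper that returns the sums of squares, degrees of freedom, mean squares and the F statistic.

```python
def anova_simple_regression(xs, ys):
    """ANOVA quantities for a regression with one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    b0 = mean_y - b1 * mean_x
    y_hat = [b0 + b1 * x for x in xs]
    s_reg = sum((yh - mean_y) ** 2 for yh in y_hat)
    s_err = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))
    s_tot = sum((y - mean_y) ** 2 for y in ys)
    ms_err = s_err / (n - 2)
    return {
        "regression": {"ss": s_reg, "df": 1, "ms": s_reg},
        "residual": {"ss": s_err, "df": n - 2, "ms": ms_err},
        "total": {"ss": s_tot, "df": n - 1},
        "F": s_reg / ms_err,
    }

# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
table = anova_simple_regression(x, y)
print(table["F"])
```

The resulting F value would then be compared with a critical value of the F distribution with (1; n − 2) degrees of freedom.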

Page 48: Quantitative Statistical Methods

Model testing

H0: β1 = 0

H1: β1 ≠ 0 (linear model)

Test statistic:

$F = \frac{S_{\hat{y}}}{S_e / (n-2)} = \frac{S_{\hat{y}}}{s_e^2}$

• F-statistic tests whether all the slope coefficients in a linear regression are equal to 0.

• Measures how well the regression equation explains the variation in the dependent variable.

[Figure: three density curves of the F distribution with (ν1; ν2) degrees of freedom, each marking the acceptance region of H0 and the rejection region for a different form of H1]

Page 49: Quantitative Statistical Methods

Parameter testing

H0: β1 = 0

H1: β1 ≠ 0

Test statistic:

$t = \frac{b_1}{s(b_1)}$

where: b1 is the least squares estimate of the regression slope
s(b1) is the standard error of b1

[Figure: three density curves of the t distribution, each marking the acceptance region of H0 and the rejection region for a left-tailed, two-tailed or right-tailed H1, with critical values $-t_{1-\alpha}$, $\pm t_{1-\alpha/2}$ and $t_{1-\alpha}$]
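The t-test of the slope can be sketched as follows. The data are hypothetical (not from the slides), and `t_crit` is an assumed two-sided 5 % critical value for n − 2 = 3 degrees of freedom taken from a t-table. With a single predictor, the square of the t statistic equals the F statistic of the model test, which the last line illustrates.

```python
import math

# Hypothetical data, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
b0 = mean_y - b1 * mean_x

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))
s_b1 = s_e / math.sqrt(sxx)   # standard error of the slope

t_stat = b1 / s_b1            # test statistic for H0: beta1 = 0
t_crit = 3.182                # assumed: two-sided 5 % critical value, 3 df
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 4), reject_h0, round(t_stat ** 2, 4))
```

In this small sample the slope is not significantly different from 0 at the 5 % level, even though the fitted line has a clearly positive slope.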

Page 50: Quantitative Statistical Methods

Thanks for your attention!