EC339: Applied Econometrics

Introduction

What is Econometrics?

Scope of application is large
Literal definition: measurement in economics
Working definition: application of statistical methods to problems that are of concern to economists
Econometrics has wide applications, beyond the scope of economics

What is Econometrics?

Econometrics is primarily interested in:
Quantifying economic relationships
Testing competing hypotheses
Forecasting

Quantifying Economic Relationships

Outcomes of many policies are tied to the magnitude of the slopes of supply and demand curves
Often need to know elasticities before we can begin practical analysis
For example, if the minimum wage is raised, unemployment may rise as more workers enter the labor force
However, this depends on the slopes of the labor supply and labor demand curves
Econometric analysis attempts to determine this answer
Allows us to quantify causal relationships when the luxury of a formal experiment is not available

Testing Competing Hypotheses

Econometrics helps fill the gap between the theoretical world and the real world
For instance, will a tax cut impact consumer spending?
Keynesian models relate consumer spending to annual disposable income, suggesting that a cut in taxes will change consumer spending
Other theories relate consumer spending to lifetime income, suggesting a tax cut (especially a “one-shot deal”) will have little impact on consumer spending

Forecasting

Econometrics attempts to provide the information needed to forecast future values, such as inflation, unemployment, stock market levels, etc.

The Use of Models

Economists use models to describe real-world processes
Models are simplified depictions of reality, usually an equation or set of equations
Economic theories are usually deterministic, while the world is characterized by randomness
Empirical models include a random component known as the error term, or $\varepsilon_i$
Typically assume that the mean of the error term is zero
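To make the role of the error term concrete, here is a minimal Python sketch. It assumes a hypothetical linear model $y_i = a + b x_i + \varepsilon_i$ with a = 3, b = 2, and normally distributed errors; the function name `simulate` and all numeric values are illustrative assumptions, not part of the course materials.

```python
import random

def simulate(a, b, xs, sigma, seed=0):
    """Generate y_i = a + b*x_i + e_i, where e_i is a mean-zero normal error."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = [rng.gauss(0.0, sigma) for _ in xs]
    ys = [a + b * x + e for x, e in zip(xs, errors)]
    return ys, errors

# Hypothetical parameter values, chosen only for illustration.
xs = list(range(1000))
ys, errors = simulate(a=3.0, b=2.0, xs=xs, sigma=1.0)

# The sample mean of the errors should be close to (not exactly) zero.
mean_error = sum(errors) / len(errors)
print(round(mean_error, 3))
```

The deterministic part, a + b*x, is what the theory supplies; the error term captures everything the theory leaves out.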

Types of Data

Data provide the raw material needed to:
Quantify economic relationships
Test competing theories
Construct forecasts

Data can be described as a set of observations on variables such as income, age, and grade
Each occurrence is called an observation

Data come in different formats:
Cross-sectional
Time series
Panel data

Cross-Sectional Data

Provide information on a variety of entities at the same point in time

Time Series Data

Provide information for the same entity at different points in time

Panel (or Longitudinal) Data

Represent a combination of cross-sectional and time series data
Provide information on a variety of entities at different periods in time
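The three data formats can be told apart by counting entities and time periods. The sketch below is one way to express that; the function `classify`, the record layout, and the unemployment figures are all hypothetical and invented for illustration.

```python
def classify(records, entity_key="entity", time_key="year"):
    """Label a data set by how many entities and time periods it covers."""
    entities = {r[entity_key] for r in records}
    periods = {r[time_key] for r in records}
    if len(entities) > 1 and len(periods) > 1:
        return "panel"
    if len(entities) > 1:
        return "cross-sectional"
    if len(periods) > 1:
        return "time series"
    return "single observation"

# Hypothetical unemployment rates (percent); not real data.
cross_section = [
    {"entity": "A", "year": 2005, "rate": 5.1},
    {"entity": "B", "year": 2005, "rate": 4.7},
    {"entity": "C", "year": 2005, "rate": 6.0},
]
time_series = [
    {"entity": "A", "year": 2003, "rate": 5.9},
    {"entity": "A", "year": 2004, "rate": 5.4},
    {"entity": "A", "year": 2005, "rate": 5.1},
]
panel = [
    {"entity": "A", "year": 2004, "rate": 5.4},
    {"entity": "A", "year": 2005, "rate": 5.1},
    {"entity": "B", "year": 2004, "rate": 4.9},
    {"entity": "B", "year": 2005, "rate": 4.7},
]

for name, data in [("cross_section", cross_section),
                   ("time_series", time_series),
                   ("panel", panel)]:
    print(name, "->", classify(data))
```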

Conducting an Empirical Project: How to Write an Empirical Paper

Select a topic
Sources of ideas: textbooks, JSTOR, news sources, “pop-econ”
Learn what others have learned about this topic
Spend time researching what others have done
Conduct an extensive literature review
Lay out the theoretical foundation
Have an empirical strategy
Existing literature may help
Apply the methods you learn in this book
Gather data and apply appropriate econometric techniques
Interpret your results
Write it up… build it like a court case or newspaper article

Where to obtain data

How to use DataFerrett: CPS.doc

Files for the course will be stored on datastor: \\datastor\courses\economic\ec339

You can download all files from the book: http://caleb.wabash.edu/econometrics/index.htm

Web Links

Resources for Economists on the Internet are available at www.rfe.org

www.freelunch.com

www.bea.gov, www.census.gov, www.bls.gov

Math Review

There is much more to it… but these are the basics you must know

Math Review

Differentiation expresses the rate at which a quantity, y, changes with respect to the change in another quantity, x, on which it has a functional relationship. The symbol Δ refers to the change in a quantity.

A linear relationship (i.e., a straight line) has a specific equation. As x changes, how does y change?
Directly related (x increases, y increases)
Inversely related (x increases, y decreases)

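$$\text{slope} = \frac{\Delta y}{\Delta x} = b, \qquad y = f(x) = a + bx$$

Example: $f(x) = 3 + 2x$
x=0, y=3 or (0,3).
x=2, y=3+2(2) or (2,7)

$$\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{7 - 3}{2 - 0} = \frac{4}{2} = 2$$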
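The slope calculation for the example line f(x) = 3 + 2x can be reproduced in a few lines of Python. This is only a sketch; the helper name `slope` is an assumption made for illustration.

```python
def f(x):
    """The example line from the slide: intercept 3, slope 2."""
    return 3 + 2 * x

def slope(func, x0, x1):
    """Change in y divided by change in x between two points."""
    return (func(x1) - func(x0)) / (x1 - x0)

print(f(0), f(2))       # the two points (0, 3) and (2, 7)
print(slope(f, 0, 2))   # (7 - 3) / (2 - 0) = 2.0
```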

Math Review

Derivatives are essentially the same thing.

Instead of looking at the difference in y as x goes from 0 to 2, look at very small intervals, say changing x from 0 to 0.0001. For a straight line, the slope does not change.

The basic idea behind derivatives is that the distance between the initial x and the new x approaches zero (in what is called the limit)

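$$\text{slope} = \frac{\Delta y}{\Delta x} = b, \qquad y = f(x) = a + bx$$

Example: $f(x) = 3 + 2x$
x=0, y=3 or (0,3).
x=.0001, y=3+2(.0001) or (x,y)=(.0001,3.0002)

$$\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{3.0002 - 3}{.0001 - 0} = \frac{.0002}{.0001} = 2$$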
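A quick numerical experiment shows what shrinking the interval does. The sketch below evaluates the difference quotient for the slide's line, and, as an added assumption for contrast, for the curve x² at x = 1, which is not part of the original example.

```python
def difference_quotient(func, x, h):
    """Slope between x and x + h; the derivative is its limit as h -> 0."""
    return (func(x + h) - func(x)) / h

def line(x):
    return 3 + 2 * x   # the slide's straight line

def curve(x):
    return x ** 2      # assumed example of a non-linear function

for h in (2, 0.1, 0.0001):
    print(h,
          difference_quotient(line, 0, h),    # stays at 2 (up to rounding)
          difference_quotient(curve, 1, h))   # approaches 2 as h shrinks
```

For the straight line the interval width is irrelevant; for the curve the slope settles down only as the interval becomes very small.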

Math Review

Derivatives have a slightly different notation than delta-y/delta-x, namely dy/dx or f’(x). Constants, such as the y-intercept, do not change as x changes, and thus are dropped when taking derivatives.

Derivatives represent the general formula for the slope of a function evaluated at a particular point. For straight lines, this value is fixed.

$$y = f(x) = a + bx^{c}$$

Example: $f(x) = 3 + 2x$
x=0, y=3 or (0,3).
x=.0001, y=3+2(.0001) or (x,y)=(.0001,3.0002)

$$\frac{dy}{dx} = f'(x) = c \cdot b \cdot x^{c-1}$$

$$f'(x) = (1)2x^{1-1} = 2x^{0} = 2$$
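The power rule for a term of the form a + b·x^c can be checked against a numerical derivative. In this sketch the function names and the cubic test function are assumptions introduced for illustration; only the line 3 + 2x comes from the slides.

```python
def power_rule(b, c, x):
    """Derivative of a + b*x**c with respect to x; the constant a drops out."""
    return c * b * x ** (c - 1)

def central_difference(func, x, h=1e-6):
    """Numerical approximation to the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# The slide's line: b = 2, c = 1, so the slope is 2 at every x.
line_slope = power_rule(b=2, c=1, x=5.0)

# Assumed extra example: f(x) = 3 + 2x^3, whose derivative is 6x^2.
def cubic(x):
    return 3 + 2 * x ** 3

analytic = power_rule(b=2, c=3, x=1.5)
numeric = central_difference(cubic, 1.5)
print(line_slope, analytic, round(numeric, 4))
```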

Math Review

Integration (or reverse differentiation) is just the opposite of taking a derivative. You have to remember to add back in C (for constant), since you may not know the “primitive” equation.

There are indefinite integrals (over no specified region) and definite integrals (where the region of integration is specified).

Also, the result of integration should be the function whose derivative gives back the initial function.

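$$F(x) = \int y\,dx = \int f(x)\,dx = \int (a + bx^{c})\,dx$$

$$F(x) = \int (a + bx^{c})\,dx = ax + \frac{b}{c+1}x^{c+1} + C$$

$$F(x) = \int (3 + 2x)\,dx = 3x + \frac{2}{1+1}x^{1+1} + C = 3x + x^{2} + C$$

$$\int_{0}^{10} (3 + 2x)\,dx = \left[3x + x^{2} + C\right]_{0}^{10} = [3(10) + (10)^{2}] - [3(0) + (0)^{2}] = 130$$

Area=[3*(10-0)]+[1/2*(10-0)*(2(10))]=130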
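The definite integral of 3 + 2x from 0 to 10 can be cross-checked three ways: with the antiderivative, with a numerical rule, and with plain geometry (a rectangle of height 3 plus a triangle of height 2·10). The sketch below does all three; the trapezoid helper is an illustrative assumption, not a method named in the slides.

```python
def f(x):
    return 3 + 2 * x

def F(x):
    """Antiderivative 3x + x^2; the constant C cancels in a definite integral."""
    return 3 * x + x ** 2

def trapezoid(func, a, b, n=1000):
    """Approximate the integral of func over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

exact = F(10) - F(0)
numeric = trapezoid(f, 0, 10)
# Rectangle (base 10, height 3) plus triangle (base 10, height f(10) - 3 = 20).
geometric = 3 * (10 - 0) + 0.5 * (10 - 0) * (f(10) - 3)
print(exact, round(numeric, 6), geometric)
```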

Basic Definitions

Random variable
A function or rule that assigns a real number to each basic outcome in the sample space
The domain of random variable X is the sample space
The range of X is the real number line
Value changes from trial to trial
Uncertainty prevails in advance of the trial as to the outcome

Case Study: Weight Data

Introductory Statistics class, Spring, 1997

Virginia Commonwealth University

Weight Data: Frequency Table

Weight Group    Count
100 - <120      7
120 - <140      12
140 - <160      7
160 - <180      9
180 - <200      12
200 - <220      4
220 - <240      1
240 - <260      0
260 - <280      1

sqrt(53) = 7.3, or 8 intervals; range (260 - 100 = 160) / 8 = 20 = class width
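The class-width rule and the tally can be sketched in Python using the 53 sorted weights listed later in these slides. The variable names are illustrative assumptions; the grouping follows the stated convention that the left endpoint is included and the right endpoint is not, which is why the maximum value starts a group of its own.

```python
import math

# The 53 sorted weights from the "Weight Data: Sorted" slide.
weights = [
    100, 101, 106, 106, 110, 110, 119, 120, 120, 123,
    124, 125, 127, 128, 130, 130, 133, 135, 139, 140,
    148, 150, 150, 152, 155, 157, 165, 165, 165, 170,
    170, 170, 172, 175, 175, 180, 180, 180, 180, 185,
    185, 185, 186, 187, 192, 194, 195, 203, 210, 212,
    215, 220, 260,
]

n = len(weights)
k = math.ceil(math.sqrt(n))                 # sqrt(53) rounded up: 8 intervals
width = (max(weights) - min(weights)) / k   # (260 - 100) / 8 = 20.0
start = min(weights)

# One extra group is needed so the maximum falls in a [left, right) class.
n_groups = int((max(weights) - start) // width) + 1
counts = [0] * n_groups
for w in weights:
    counts[int((w - start) // width)] += 1

for i, c in enumerate(counts):
    left = start + i * width
    print(f"{left:.0f} - <{left + width:.0f}  {c}")
```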

Weight Data: Histogram

[Figure: histogram of the weight data. Horizontal axis: Weight, from 100 to 280 in classes of width 20. Vertical axis: Frequency.]

* Left endpoint is included in the group, right endpoint is not.

Numerical Summaries

Center of the data:
mean
median

Variation:
range
quartiles (interquartile range)
variance
standard deviation

Mean or Average

Traditional measure of center
Sum the values and divide by the number of values

Median (M)

A resistant measure of the data’s center
At least half of the ordered values are less than or equal to the median value
At least half of the ordered values are greater than or equal to the median value
If n is odd, the median is the middle ordered value
If n is even, the median is the average of the two middle ordered values

Median (M)

Location of the median: L(M) = (n+1)/2, where n = sample size.

Example: If 25 data values are recorded, the median would be the (25+1)/2 = 13th ordered value.

Median

Example 1 data: 2 4 6
Median (M) = 4

Example 2 data: 2 4 6 8
Median = 5 (ave. of 4 and 6)

Example 3 data: 6 2 4
Median ≠ 2 (order the values: 2 4 6, so Median = 4)
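The median rules and the three examples above can be written as a short Python function. This is a minimal sketch; the function names are assumptions, and Python's standard `statistics.median` would give the same results.

```python
def median_location(n):
    """L(M) = (n + 1) / 2, the position of the median in the ordered data."""
    return (n + 1) / 2

def median(values):
    """Middle ordered value (n odd) or average of the two middle values (n even)."""
    ordered = sorted(values)   # ordering first is essential (see Example 3)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_location(25))    # 13.0, the 13th ordered value
print(median([2, 4, 6]))      # 4
print(median([2, 4, 6, 8]))   # 5.0
print(median([6, 2, 4]))      # 4, not 2
```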

Comparing the Mean & Median

The mean and median of data from a symmetric distribution should be close together. The actual (true) mean and median of a symmetric distribution are exactly the same.

In a skewed distribution, the mean is farther out in the long tail than is the median [the mean is ‘pulled’ in the direction of the possible outlier(s)].
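A small example makes the "pull" of an outlier visible. Both data sets below are hypothetical, invented for illustration; the second replaces the largest value of the first with an outlier.

```python
from statistics import mean, median

symmetric = [2, 4, 6, 8, 10]    # hypothetical symmetric data
skewed = [2, 4, 6, 8, 100]      # same data with one large outlier

print(mean(symmetric), median(symmetric))   # mean and median agree
print(mean(skewed), median(skewed))         # mean is pulled toward the outlier
```

The median does not move when the largest value changes, which is what "resistant" means.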

Quartiles

Three numbers which divide the ordered data into four equal sized groups.

Q1 has 25% of the data below it.

Q2 has 50% of the data below it. (Median)

Q3 has 75% of the data below it.

Weight Data: Sorted

100 101 106 106 110 110 119 120 120 123
124 125 127 128 130 130 133 135 139 140
148 150 150 152 155 157 165 165 165 170
170 170 172 175 175 180 180 180 180 185
185 185 186 187 192 194 195 203 210 212
215 220 260

L(M)=(53+1)/2=27

L(Q1)=(26+1)/2=13.5
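The locations above can be turned into values with a short Python sketch. It assumes the convention implied by L(Q1) = (26+1)/2: the quartiles are the medians of the 26 values below and the 26 values above the overall median. Other software may use slightly different quartile rules, and the function names here are illustrative.

```python
weights = [
    100, 101, 106, 106, 110, 110, 119, 120, 120, 123,
    124, 125, 127, 128, 130, 130, 133, 135, 139, 140,
    148, 150, 150, 152, 155, 157, 165, 165, 165, 170,
    170, 170, 172, 175, 175, 180, 180, 180, 180, 185,
    185, 185, 186, 187, 192, 194, 195, 203, 210, 212,
    215, 220, 260,
]

def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, median, Q3), excluding the median from each half when n is odd."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

q1, m, q3 = quartiles(weights)
print(q1, m, q3)
print("interquartile range:", q3 - q1)
```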

Variance and Standard Deviation

Recall that variability exists when some values are different from (above or below) the mean.

Each data value has an associated deviation from the mean: $x_i - \bar{x}$

Deviations

What is a typical deviation from the mean? (standard deviation)
Small values of this typical deviation indicate small variability in the data
Large values of this typical deviation indicate large variability in the data

Variance

Find the mean
Find the deviation of each value from the mean
Square the deviations
Sum the squared deviations
Divide the sum by n-1 (gives typical squared deviation from mean)

Variance Formula

$$s^{2} = \frac{1}{(n-1)} \sum_{i=1}^{n} (x_i - \bar{x})^{2}$$

Remember that you must find the deviations of EACH x, square the deviations, THEN add them up!

Standard Deviation Formula
typical deviation from the mean

$$s = \sqrt{\frac{1}{(n-1)} \sum_{i=1}^{n} (x_i - \bar{x})^{2}}$$

[ standard deviation = square root of the variance ]

Variance and Standard Deviation
Example from Text

Metabolic rates of 7 men (cal./24hr.):

1792 1666 1362 1614 1460 1867 1439

$$\bar{x} = \frac{1792 + 1666 + 1362 + 1614 + 1460 + 1867 + 1439}{7} = \frac{11{,}200}{7} = 1600$$

Variance and Standard Deviation
Example

Observations    Deviations              Squared deviations
$x_i$           $x_i - \bar{x}$         $(x_i - \bar{x})^{2}$
1792            1792 - 1600 = 192       (192)^2 = 36,864
1666            1666 - 1600 = 66        (66)^2 = 4,356
1362            1362 - 1600 = -238      (-238)^2 = 56,644
1614            1614 - 1600 = 14        (14)^2 = 196
1460            1460 - 1600 = -140      (-140)^2 = 19,600
1867            1867 - 1600 = 267       (267)^2 = 71,289
1439            1439 - 1600 = -161      (-161)^2 = 25,921
                sum = 0                 sum = 214,870

Notice the deviations add to zero, so each deviation must be squared before summing

Observation    Value
1              1,792
2              1,666
3              1,362
4              1,614
5              1,460
6              1,867
7              1,439

=sum(B1:B7)       11,200
=stdevp(B1:B7)    175
=stdev(B1:B7)     189
=var(B1:B7)       35,812

Variance versus Standard Deviation

$$s^{2} = \frac{1}{7-1}(214{,}870) = \frac{1}{6}(214{,}870) = 35{,}811.67$$

$$s = \sqrt{35{,}811.67} = 189.24$$

Note: Standard deviation is in the same units as the original data (cal/24 hours) while variance is in those units squared, (cal/24 hours)². Thus variance is not easily comparable to the original data.
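The metabolic-rate example can be reproduced step by step in Python. The sketch follows the listed steps (mean, deviations, squares, sum, divide by n-1) and then compares the result with the standard library's `statistics` module; the variable names are illustrative.

```python
import math
import statistics

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]   # cal./24hr., from the slide

n = len(rates)
mean = sum(rates) / n                           # 11,200 / 7 = 1600
deviations = [x - mean for x in rates]          # these sum to zero
squared = [d ** 2 for d in deviations]          # these sum to 214,870

variance = sum(squared) / (n - 1)               # sample variance, divides by n-1
std_dev = math.sqrt(variance)                   # sample standard deviation
pop_std_dev = math.sqrt(sum(squared) / n)       # divides by n, like stdevp

print(mean, sum(deviations), sum(squared))
print(round(variance, 2), round(std_dev, 2), round(pop_std_dev, 2))

# The library versions use the same n-1 and n conventions.
print(statistics.variance(rates), statistics.stdev(rates), statistics.pstdev(rates))
```

Dividing by n rather than n-1 gives the smaller value of about 175, which is the difference between the two spreadsheet standard deviation functions.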

Density Curves

Example: here is a histogram of vocabulary scores of 947 seventh graders.

The smooth curve drawn over the histogram is a mathematical model for the distribution. This is typically written as f(x), also known as the PROBABILITY DENSITY FUNCTION (PDF)

Density Curves

Example: the areas of the shaded bars in this histogram represent the proportion of scores in the observed data that are less than or equal to 6.0. This proportion is equal to 0.303. The area underneath the curve to the left of a value x is given by the CUMULATIVE DISTRIBUTION FUNCTION (CDF), denoted F(x)

Density Curves

Example: now the area under the smooth curve to the left of 6.0 is shaded. If the scale is adjusted so the total area under the curve is exactly 1, then this curve is called a density curve. The proportion of the area to the left of 6.0 is now equal to 0.293.

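$$F(6.0) = \int_{-\infty}^{6.0} \frac{1}{1.55\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{1.55}\right)^{2}} dx = .293$$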

Density Curves

Always on or above the horizontal axis

Have area exactly 1 underneath curve

Area under the curve and above any range of values is the proportion of all observations that fall in that range
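The "area exactly 1" property can be illustrated with a histogram rescaled to density units: each bar's height becomes count / (n × class width), so bar areas are proportions. The counts and class width below are hypothetical, chosen only to show the arithmetic.

```python
def density_heights(counts, width):
    """Rescale histogram counts so that the total bar area equals 1."""
    n = sum(counts)
    return [c / (n * width) for c in counts]

counts = [2, 5, 8, 4, 1]   # hypothetical class counts (n = 20)
width = 2.0                # hypothetical class width

heights = density_heights(counts, width)
total_area = sum(h * width for h in heights)

# Area over the first two classes = proportion of observations in them (7 of 20).
area_first_two = sum(h * width for h in heights[:2])
print(round(total_area, 6), round(area_first_two, 6))
```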

Density Curves

The median of a density curve is the equal-areas point, the point that divides the area under the curve in half

The mean of a density curve is the balance point, at which the curve would balance if made of solid material

Density Curves

The mean and standard deviation computed from actual observations (data) are denoted by $\bar{x}$ and s, respectively.

The mean and standard deviation of the actual distribution represented by the density curve are denoted by µ (“mu”) and σ (“sigma”), respectively.

Question

Data sets consisting of physical measurements (heights, weights, lengths of bones, and so on) for adults of the same species and sex tend to follow a similar pattern. The pattern is that most individuals are clumped around the average, with numbers decreasing the farther values are from the average in either direction. Describe what shape a histogram (or density curve) of such measurements would have.

Bell-Shaped Curve: The Normal Distribution

[Figure: bell-shaped Normal curve with the standard deviation labeled.]

The Normal Distribution

Knowing the mean (µ) and standard deviation (σ) allows us to make various conclusions about Normal distributions. Notation: N(µ,σ).

68-95-99.7 Rule for Any Normal Curve

68% of the observations fall within (meaning above and below) one standard deviation of the mean
95% of the observations fall within two standard deviations (actually 1.96) of the mean
99.7% of the observations fall within three standard deviations of the mean
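The three percentages can be checked with the error function in Python's standard library: for any Normal distribution, the probability of falling within k standard deviations of the mean is erf(k/√2). The helper name `prob_within` is an assumption made for this sketch.

```python
import math

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any Normal distribution."""
    return math.erf(k / math.sqrt(2))

for k in (1, 1.96, 2, 3):
    print(k, round(prob_within(k), 4))
```

The output shows why the rule is an approximation: exactly 95% corresponds to 1.96 standard deviations, while two full standard deviations cover slightly more.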

