Quantitative Methods 2: “Decision Making Under Uncertainty”
Lecture 1
IRCO 454
Professor Edmund Malesky, UCSD

Copyrighted by Edmund Malesky. Do not distribute without permission.



Outline of Today’s Lecture

1) Introduction to QM2
2) Flip to the last page of the novel: What is linear modeling and how is it used?
3) A brief review of critical concepts that you learned in QM1.

Goals of the Course

• Learn to do quantitative empirical work for use in economic analysis, public policy and social sciences.
• Learn the basic properties of the regression estimator.
• Learn to diagnose and address problems with fit between data and estimator.
• Learn to present results in a meaningful way.
• Learn STATA.

Topics We Will Address

• ONE basic equation: $Y = \beta_0 + \beta_1 X + u$
• This is a VERY flexible model for understanding social, political, economic behavior.
• First part of course will be about HOW to estimate $\beta_0$ and $\beta_1$.
• Also about what ASSUMPTIONS are needed to make those estimates.

Topics We Will Address

$Y = \beta_0 + \beta_1 X + u$

• The rest of the course will be about what to do if those assumptions are not reasonable.
• How do we make sure that our estimates of $\beta_1$ are unbiased, or at least consistent?

Problems with u (the error term / residual)

• Omitted Variable Bias
• Heteroskedasticity
• Dichotomous Dependent Variables
• Autocorrelation

Problems with X

• Measurement Error
• Multicollinearity

Problems with $\beta_0$ & $\beta_1$

• Dummy variables for new intercepts
• Non-linear effects
• Interaction Effects

Problems with Y

• Endogeneity Bias
• Selection Bias
• The use (abuse) of R-squared and “curve fitting”

Course Structure (Two Components)

• Mondays: A theory-based lecture on the mathematical properties of the linear regression technique and problems with its application. No laptops!
• Wednesdays: A practical hands-on lab, where we will learn how to program statistical code in STATA. Bring your laptops!!

Course Requirements

• 50% - Weekly Problem Sets
   • You will write your own computer code (a.k.a. “the .do file”).
      • File sent in one hour before class on the Wednesday following distribution.
      • Send to “QM2 Homework” Folder.
      • Name the file “LastName_ProblemSet#”.
      • Whether the .do file runs is worth 20% of the HW grade.
   • Word write-up handed in before class lecture.
• 50% - Final Take-Home Exam
   • Will test cumulative knowledge.
   • Will involve a .do file.
   • Grade will be determined based on your answers to the questions and whether I can successfully run your .do files without error.

Required Readings

• Wooldridge, Jeffrey M. 2008. Introductory Econometrics: A Modern Approach, 4th ed.
• Other brief reading assignments sent out by professor.

One Administrative Issue

• What do we do to make up for missing classes due to Martin Luther King Day and Presidents’ Day?
   • Continue with class as normal?
   • Schedule make-up class?
   • Webcast missing class?
   • Malesky lectures at TA review sessions?

Optional Readings

• Baum, Christopher. 2008. An Introduction to Modern Econometrics Using Stata. Stata Press. http://www.stata.com/bookstore/statabooks.html
• Chen, Xiao, Philip B. Ender, Michael Mitchell & Christine Wells. 2006. Stata Web Books: Regression with Stata. http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm
• Zorn, Christopher. Stata for Dummies. http://www.buec.udel.edu/yatawarr/Stata4Dummies.pdf
• Acock, Alan. 2005. A Gentle Introduction to Stata. http://www.stata.com/bookstore/statabooks.html

A New Way of Studying

• Excelling in QM2 requires a different approach to learning than in many other classes at IRPS.
• This course is about skill acquisition. The speed-reading, participation, and writing skills that you have fine-tuned in other courses are important, but they will serve you less well here.
• At the end of the day, you will be evaluated on your ability to understand and employ OLS regression in policy analysis. Period.

READING

• Reading assignments are short, but they are difficult, technical, and essential to understanding the material in the course.
• You cannot read Wooldridge like you are reading for the policy or political science courses at IRPS.
• Find a quiet spot, read with a pencil in hand, highlight and take notes vigorously.
• Do not skip over the equations or boxes.
   • Read the equations as if you are reading a sentence.
   • The boxes are examples of the material covered. Refer back to the preceding lesson as you read each box.

Lectures

• I will provide you with my lecture notes at the end of every class.
   • Do not feel obligated to type everything I say.
   • You will do much better if you simply listen, ask questions, and record a few important issues.
• Stop me if you don’t understand.
   • Probably half or more of the class is also confused.
   • If you didn’t understand slide 10, you will not understand slide 11, and …ZZZZZZZZZZZZZ

STATA

• Learning to write a .do file is like learning to speak another language. All of you have mastered this ability before.
• Keep your own glossary of commands and syntax that I introduce to you in the course. The process will help you remember the code and give you a personalized reference guide.
• Feel free to make use of the help function, pull-down menus, and the UCLA statistics website.
• When using help, look most carefully at examples.

Homework

• Feel free to work in groups, but make sure you write up your own final .do file. Do not cut and paste the work of others. After the group session, do the write-up and .do file on your own. That will help you know if you really understand it.

Prof. and TA Availability

• Prof. Malesky Office Hours: MW 3 to 4 and by appointment.
• Kevin OH/Breakout: Thursday, 3:30 pm
• Matt OH/Breakout:

Any Questions?

The Linear Regression Model Approach to Research

Otherwise known as…

Advanced Line Drawing

General Linear Model

• The “General Linear Model” refers to a class of statistical models which are “generalizations” of simple linear regression analysis.
• Regression is the predominant statistical tool used in the social sciences due to its simplicity and versatility.
• Also called Linear Regression Analysis.

Notations for Regression Line

Alternate mathematical notation for the straight line:

• th Grade Geometry: $y = mx + b$
• Statistics Literature: $Y_i = a + bX_i + e_i$
• Econometrics Literature: $Y = \beta_0 + \beta_1 X + u$ (Wooldridge uses this specification, so we will too!)

Translating Math into English

• The linear model states that the dependent variable is directly proportional to the value of the independent variable.
• Thus, if a theory implies that Y increases in direct proportion to an increase in X, it implies a specific mathematical model of behavior: the linear model.
• E.g. “It’s the economy, stupid!”

Simple Linear Regression: The Basic Mathematical Model

• Regression is based on the concept of the simple proportional relationship.
   • A.K.A. . . . the straight line.
• We can express this idea mathematically!
   • $Y = \beta_0 + \beta_1 X + u$

The Theory Implies the Math

• ALL statements of relationships between variables imply a mathematical structure.
• Even if we don’t like to phrase our theories in these terms, they DO imply mathematical relationships.
• Much of this course is about elaborating the basic model to fit our more nuanced theories.

Implications of a Linear Model

• The linear aspect means that the same increase in inflation will always produce the same reduction in presidential approval.
• This is perhaps the most restrictive of all the assumptions of OLS.
• We will work to loosen this assumption through the quarter.

The Regression Parameters

• $\beta_0$ = the intercept
   • the point where the line crosses the Y-axis
   • (the value of the dependent variable when all of the independent variables = 0)
• $\beta_1$ = the slope
   • the increase in the dependent variable per unit change in the independent variable (also known as the 'rise over the run')
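To make the two parameters concrete, here is a minimal sketch in Python (used purely for illustration; the course itself works in STATA). The function name `line` and the values 2.0 and 0.5 are made up for the example.

```python
def line(x, beta0, beta1):
    """Systematic part of the model: beta0 + beta1 * x (the error u is left out)."""
    return beta0 + beta1 * x

# Hypothetical parameter values, chosen only for illustration.
beta0, beta1 = 2.0, 0.5

# Intercept: the value of Y when X = 0.
y_at_zero = line(0, beta0, beta1)

# Slope: the change in Y for a one-unit change in X ("rise over run").
# Because the model is linear, the change is the same wherever we start.
change_low = line(4, beta0, beta1) - line(3, beta0, beta1)
change_high = line(101, beta0, beta1) - line(100, beta0, beta1)

print(y_at_zero, change_low, change_high)  # 2.0 0.5 0.5
```

The identical change at X = 3 and at X = 100 is exactly what a linear model assumes about the effect of X.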

Regression in a Perfect World…

$Y = 1X$

[Figure: “The Straight Line”, plotting Y against X.]

…but life is full of errors…

$Y = 1X + u$

[Figure: “Simple Linear Regression”, plotting Y against X.]

The Error Term

• Our models do not predict behavior perfectly.
   • Life is not a “Just So” Story.
• So we add a term to adjust or compensate for the errors in prediction ($u$).
• Much of our ability to estimate $\beta_1$ depends upon the assumptions we make about the errors ($u$).
• Sometimes $u$ is called the “Disturbance”.
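One way to see what the error term does is to simulate it. The sketch below is in Python purely for illustration (the course uses STATA). It assumes, as an illustration only, that the disturbances are drawn from a normal distribution with mean 0 and standard deviation 1; the seed and the range of X are arbitrary.

```python
import random

random.seed(454)  # arbitrary seed so the sketch is reproducible

xs = list(range(1, 11))                    # X = 1, ..., 10
us = [random.gauss(0, 1) for _ in xs]      # disturbances u (assumed normal, mean 0, sd 1)
ys_perfect = [1.0 * x for x in xs]         # "perfect world": Y = 1X
ys_observed = [1.0 * x + u for x, u in zip(xs, us)]  # Y = 1X + u

# Each observed Y misses the straight line by exactly its own disturbance.
for x, y_line, y_obs in zip(xs, ys_perfect, ys_observed):
    print(f"X={x:2d}  line={y_line:5.2f}  observed={y_obs:5.2f}  u={y_obs - y_line:+.2f}")
```

In real data only X and the observed Y are available; the disturbances are never seen directly, which is why the assumptions made about them matter.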

The 'Goal' of Ordinary Least Squares

• Ordinary Least Squares (OLS) is a method of finding the linear model which minimizes the sum of the squared errors.
• Such a model provides the best explanation/prediction of the data, in the least-squares sense.
• Under the Gauss-Markov assumptions, it is the “Best Linear Unbiased Estimator”.
   • It’s BLUE.

Other Goals are Possible

• Minimize total errors
• Minimize Absolute Value of Errors
• Maximum Likelihood Models
   • OLS is a special case of MLE (when the errors are normally distributed)
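The link between OLS and maximum likelihood can be sketched in one line of math. The normality assumption below is not stated on the slide; it is the standard condition under which the two coincide. If the errors $u_i$ are independent draws from $N(0, \sigma^2)$, the log-likelihood of the sample is:

```latex
% Assumption (not on the slide): u_i are independent N(0, \sigma^2).
\ln L(\beta_0, \beta_1, \sigma^2)
  = -\frac{n}{2}\ln\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - \beta_0 - \beta_1 x_i\right)^2
```

For any fixed $\sigma^2$, the first term does not involve $\beta_0$ or $\beta_1$, so maximizing the likelihood over the two parameters amounts to minimizing the sum of squared errors, which is the OLS criterion.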

Why Least Squared error?

• Why not simply minimum error?
   • The errors about the line sum to 0.0!
• Minimum absolute deviation (error) models now exist, but they are mathematically cumbersome.
   • Try algebra with | Absolute Value | signs!
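A small numerical sketch shows why the raw sum of errors is a poor target. It is written in Python purely for illustration (the course uses STATA), and the five data points are made up. The helper `errors` is hypothetical. Both lines below pass through the point of means, so their errors cancel to zero, yet one of them ignores X entirely.

```python
xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]  # made-up data

def errors(beta0, beta1):
    """Errors of each observation about the line beta0 + beta1 * x."""
    return [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]

good_line = errors(0.13, 0.97)  # the least-squares line for these data
flat_line = errors(3.04, 0.0)   # a flat line at the mean of Y, ignoring X

for name, e in [("least-squares line", good_line), ("flat line", flat_line)]:
    total = sum(e)                          # positive and negative misses cancel
    total_squared = sum(v ** 2 for v in e)  # squaring stops the cancellation
    print(f"{name}: sum of errors = {total:.6f}, sum of squared errors = {total_squared:.3f}")
```

Only the squared errors tell the two lines apart: roughly 0.123 for the least-squares line against 9.532 for the flat one.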

Implications of Squared Errors

• This model seeks to avoid BIG misses.
   • A big $u$ for one case leads to a REALLY big $u^2$.
• This means regression results can be heavily influenced by outlier cases.
   • Some feel this is theoretically appropriate.
• Always look at your data.
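The pull of a single outlier can be sketched with made-up numbers. The example is in Python purely for illustration (the course uses STATA). The helper `fit_ols` is a hypothetical name; it applies the standard closed-form OLS formulas for one regressor.

```python
def fit_ols(xs, ys):
    """Closed-form OLS intercept and slope for a single regressor."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data lying exactly on the line Y = X.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 5]

_, slope_clean = fit_ols(xs, ys)
# Add one outlying case: X = 6 with Y = 20 instead of 6.
_, slope_outlier = fit_ols(xs + [6], ys + [20])

print(f"slope without outlier: {slope_clean:.2f}")
print(f"slope with outlier:    {slope_outlier:.2f}")

# Squaring is what gives the outlier its weight:
# a miss 5 times as large counts 25 times as much.
small_miss, big_miss = 2, 10
print(small_miss ** 2, big_miss ** 2)  # 4 100
```

One case out of six is enough to move the estimated slope from 1 to 3, which is why looking at the data before trusting the output matters.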

Minimizing the Sum of Squared Errors

• How to put the Least in OLS?
• In mathematical jargon we seek to minimize the residual sum of squares (SSR), where:

$$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \hat{u}_i^2$$
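The SSR formula can be tried out directly on a few candidate lines. This sketch is in Python purely for illustration (the course uses STATA). The data and the three candidate lines are made up. The pair (0.13, 0.97) happens to be the least-squares solution for these particular points.

```python
def ssr(beta0, beta1, xs, ys):
    """Sum of squared residuals for the candidate line y_hat = beta0 + beta1 * x."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]  # made-up data

# Three candidate (intercept, slope) pairs to compare.
candidates = [(1.0, 0.5), (0.0, 1.0), (0.13, 0.97)]
for b0, b1 in candidates:
    print(f"beta0={b0:.2f}, beta1={b1:.2f} -> SSR={ssr(b0, b1, xs, ys):.3f}")

# The candidate with the smallest SSR is the "least squares" choice among these three.
best = min(candidates, key=lambda c: ssr(c[0], c[1], xs, ys))
print("smallest SSR:", best)
```

OLS does the same comparison over every possible line at once, which is why it needs calculus and not a list of guesses.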

Picking the Parameters

• To minimize SSR, we need parameter estimates.
• In calculus, if you wish to know when a function is at its minimum, you take the first derivative and set it equal to zero.
• In this case we must take partial derivatives since we have two parameters ($\beta_0$ & $\beta_1$) to worry about.
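For reference, the two partial derivatives can be written out as follows. This is the standard textbook derivation for one regressor, sketched here and not copied from the slides; $\bar{x}$ and $\bar{y}$ denote the sample means.

```latex
% First-order conditions: set each partial derivative of SSR to zero.
\frac{\partial SSR}{\partial \hat{\beta}_0}
  = -2\sum_{i=1}^{n}\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0
\qquad
\frac{\partial SSR}{\partial \hat{\beta}_1}
  = -2\sum_{i=1}^{n} x_i\left(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\right) = 0

% Solving the two equations together gives the OLS estimates.
\hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\,\bar{x}
```

The first condition says the residuals sum to zero. The second says they are uncorrelated with X in the sample.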

How “good” does it fit?

• To measure “reduction in errors” we need a benchmark for comparison.
• The mean of the dependent variable is a relevant and tractable benchmark for comparing predictions.
• The mean of Y represents our “best guess” at the value of $Y_i$ absent other information.
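A short derivation shows in what sense the mean is the “best guess”. Suppose we must predict every $y_i$ with a single constant $c$ and judge the guess by squared error (the symbol $c$ is introduced here only for illustration):

```latex
% Choose the constant c that minimizes the sum of squared misses.
\frac{d}{dc}\sum_{i=1}^{n}(y_i - c)^2 = -2\sum_{i=1}^{n}(y_i - c) = 0
\quad\Longrightarrow\quad
c = \frac{1}{n}\sum_{i=1}^{n} y_i = \bar{y}
```

So with no information on X, the mean of Y is the least-squares prediction, and a regression is judged by how much it improves on that benchmark.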

Sums of Squares

• This gives us the following 'sum-of-squares' measures:
   • SST = Total Sum of Squares
   • SSE = Explained Sum of Squares
   • SSR = Residual (Unexplained) Sum of Squares
• Total Variation (SST) = Explained Variation (SSE) + Unexplained Variation (SSR)

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$$
$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$$
$$SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
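The decomposition can be checked numerically. The sketch below is in Python purely for illustration (the course uses STATA). The data are made up, and `fit_ols` is a hypothetical helper that applies the standard closed-form OLS formulas. The last line computes R-squared as SSE / SST, the share of total variation that is explained.

```python
def fit_ols(xs, ys):
    """Closed-form OLS intercept and slope for a single regressor."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]  # made-up data

b0, b1 = fit_ols(xs, ys)
y_bar = sum(ys) / len(ys)
y_hat = [b0 + b1 * x for x in xs]  # fitted values

sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
sse = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained variation
ssr = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # unexplained variation
r_squared = sse / sst

print(f"SST={sst:.3f}  SSE={sse:.3f}  SSR={ssr:.3f}  R-squared={r_squared:.3f}")
# SST=9.532  SSE=9.409  SSR=0.123  R-squared=0.987
```

The identity SST = SSE + SSR holds here because the line was fitted by OLS with an intercept. It is not guaranteed for an arbitrary line.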

“Explained” and “Unexplained” Variation

[Figure: Y plotted against X for a single observation $(X_i, y_i)$, showing the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, the fitted value $\hat{y}_i$, the residual $\hat{u}_i$, and the deviation $y_i - \bar{y}$.]

Page 42

“Explained” and “Unexplained” Variation

[Figure: the same plot of Y against X, with three distances marked for the observation $(X_i, y_i)$.]

Total deviation $y_i - \bar{y}$: square this quantity and sum across all observations and we have our SST (Total Sum of Squares).

Explained deviation $\hat{y}_i - \bar{y}$: square this quantity and sum across all observations and we have our SSE (Explained Sum of Squares).

Residual $\hat{u}_i = y_i - \hat{y}_i$: square this quantity and sum across all observations and we have our SSR (Residual Sum of Squares).

Page 43

Some Confusing Terminology

Occasionally you may see people refer instead to USS (Unexplained) and ESS (Error).

These terms are interchangeable with SSR, but…

ESS can be confused with the explained sum of squares.

USS is not confused with any mathematical jargon, but does pose issues for statistical work on the US Navy.

Page 44

Let’s Test Some “Theories”

Presidential approval depends upon the performance of the US economy.

The development of US military power was a response to America’s threatening environment.

Page 45

Plotting Approval and Inflation

[Figure: scatterplot of presidential approval against inflation.]

Page 46

Regressing Approval on Inflation

. reg approve inflat

  Source |       SS       df       MS                  Number of obs =      46
---------+------------------------------               F(  1,    44) =   17.20
   Model |  1960.60398     1  1960.60398               Prob > F      =  0.0002
Residual |  5015.26094    44  113.983203               R-squared     =  0.2811
---------+------------------------------               Adj R-squared =  0.2647
   Total |  6975.86492    45   155.01922               Root MSE      =  10.676

------------------------------------------------------------------------------
 approve |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  inflat |  -2.213684   .5337539     -4.147   0.000      -3.289394   -1.137973
   _cons |   63.80565   2.711964     23.527   0.000       58.34004    69.27125
------------------------------------------------------------------------------
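The summary statistics in the upper-right corner of the Stata output can be rebuilt from the sums of squares in the upper-left corner. The sketch below does that arithmetic in Python; the variable names are illustrative, and the only inputs are numbers copied from the output above.

```python
import math

# Values copied from the Stata output for `reg approve inflat`.
model_ss = 1960.60398      # explained sum of squares (SSE in the lecture's notation)
residual_ss = 5015.26094   # residual sum of squares (SSR)
n_obs = 46
k = 1                      # number of regressors, not counting the constant

total_ss = model_ss + residual_ss                # SST
df_resid = n_obs - k - 1                         # residual degrees of freedom
r_squared = model_ss / total_ss                  # share of variation explained
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid
mean_sq_resid = residual_ss / df_resid           # the "MS" entry in the Residual row
f_stat = (model_ss / k) / mean_sq_resid
root_mse = math.sqrt(mean_sq_resid)

print(f"SST           = {total_ss:.5f}")
print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F(1, {df_resid})      = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.3f}")
```

With a single regressor the F statistic is also the square of the slope's t statistic, which is why 17.20 is close to (-4.147) squared.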

Page 47

Fitting Inflation to Approval

[Figure: (mean) approve and the fitted values plotted against (mean) inflat. The horizontal axis runs from -.263876 to 12.8595 and the vertical axis from 28.3333 to 76.2143.]

Page 48

Plotting US Power & Disputes

[Figure: scatterplot of uscapbl (vertical axis, .03 to .38) against numtargt (horizontal axis, 0 to 7).]

Page 49

Regress US Power on Disputes

. reg uscapbl numtargt

  Source |       SS       df       MS                  Number of obs =     177
---------+------------------------------               F(  1,   175) =   18.61
   Model |  .110444241     1  .110444241               Prob > F      =  0.0000
Residual |  1.03834672   175   .00593341               R-squared     =  0.0961
---------+------------------------------               Adj R-squared =  0.0910
   Total |  1.14879096   176  .006527221               Root MSE      =  .07703

------------------------------------------------------------------------------
 uscapbl |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
numtargt |   .0201142   .0046621      4.314   0.000        .010913    .0293155
   _cons |   .1455665   .0067132     21.684   0.000       .1323172    .1588157
------------------------------------------------------------------------------
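To read the coefficient table, it can help to redo two pieces of arithmetic by hand: the t statistic is the coefficient divided by its standard error, and the fitted line gives a predicted uscapbl for any number of disputes. The sketch below uses only numbers copied from the output above; the function names are illustrative.

```python
# Values copied from the Stata output for `reg uscapbl numtargt`.
coef_numtargt, se_numtargt = 0.0201142, 0.0046621
coef_cons, se_cons = 0.1455665, 0.0067132

def t_statistic(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / std_err

def fitted_uscapbl(numtargt):
    """Predicted uscapbl from the estimated line: _cons + slope * numtargt."""
    return coef_cons + coef_numtargt * numtargt

t_slope = t_statistic(coef_numtargt, se_numtargt)
t_cons = t_statistic(coef_cons, se_cons)
print(f"t(numtargt) = {t_slope:.3f}")
print(f"t(_cons)    = {t_cons:.3f}")

# 0 and 7 are the ends of the numtargt axis in the plots of these data.
for x in (0, 7):
    print(f"numtargt = {x}: fitted uscapbl = {fitted_uscapbl(x):.4f}")
```

The slope says that each additional dispute is associated with a rise of about .02 in uscapbl, which is the marginal effect of numtargt in this model.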

Page 50

Fitting Disputes to US Power

[Figure: uscapbl and the fitted values plotted against numtargt (horizontal axis, 0 to 7; vertical axis, .03 to .38).]

Page 51

A Brief Review of Critical Concepts

Measures of Central Tendency (Mean, Median, Mode): $\bar{x} = (1/n) \sum_{i=1}^{n} x_i$

Population Variance: $\sigma^2 = Var(X) = E[(X - E(X))^2] = (1/n) \sum_{i=1}^{n} (x_i - \bar{x})^2$

Standard Deviation: $sd(X) = \sqrt{\sigma^2}$

Covariance: $Cov(X, Y) = E[(X - E(X))(Y - E(Y))]$

Correlation: $Corr(X, Y) = \frac{Cov(X, Y)}{sd(X) \cdot sd(Y)} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$

Marginal Effect: $\Delta y = \beta_1 \Delta x$
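These definitions translate directly into a few lines of code. The sketch below is one way to compute them, assuming made-up data and the 1/n (population) form of the formulas; the helper names are illustrative. The slope line uses the standard one-regressor least-squares result that the slope equals Cov(X, Y) / Var(X).

```python
import math
from statistics import mean

def pop_variance(xs):
    """(1/n) * sum of squared deviations from the mean."""
    x_bar = mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / len(xs)

def pop_covariance(xs, ys):
    """(1/n) * sum of products of deviations from the two means."""
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(pop_variance(xs))
    sd_y = math.sqrt(pop_variance(ys))
    return pop_covariance(xs, ys) / (sd_x * sd_y)

# Hypothetical data, not taken from the lecture.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

var_x = pop_variance(xs)
cov_xy = pop_covariance(xs, ys)
corr_xy = correlation(xs, ys)
beta_1 = cov_xy / var_x            # least-squares slope of y on x
delta_y = beta_1 * 2.0             # marginal effect of a 2-unit change in x

print(f"mean(x) = {mean(xs):.2f}, Var(x) = {var_x:.2f}, sd(x) = {math.sqrt(var_x):.4f}")
print(f"Cov(x, y) = {cov_xy:.2f}, Corr(x, y) = {corr_xy:.4f}")
print(f"slope = {beta_1:.2f}, change in y for a 2-unit change in x = {delta_y:.2f}")
```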

Page 52

Distributions – The Usual Suspects

Normal Distribution
Standard Normal
Chi-Square
t
F

Page 53

The Normal Distribution (Probability Density Function)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-(x - \mu)^2 / 2\sigma^2\right]$$

$$X \sim Normal(\mu, \sigma^2)$$

[Figure: bell-shaped density f(x) plotted against x, centered at µ.]

Page 54

The Standard Normal Distribution (PDF)

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left[-z^2 / 2\right]$$

$$Z \sim Normal(0, 1)$$

[Figure: bell-shaped density plotted against Z, centered at 0.]
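The normal and standard normal density formulas can be written out and checked against Python's built-in `statistics.NormalDist`. This is only a sketch: the mean of 50 and standard deviation of 10 are arbitrary values chosen for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x, written out from the formula."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def standard_normal_pdf(z):
    """Density of Normal(0, 1) at z."""
    return math.exp(-(z ** 2) / 2) / math.sqrt(2 * math.pi)

mu, sigma = 50.0, 10.0   # hypothetical parameters
x = 65.0
z = (x - mu) / sigma     # standardized value of x

by_formula = normal_pdf(x, mu, sigma)
by_library = NormalDist(mu, sigma).pdf(x)
by_standardizing = standard_normal_pdf(z) / sigma

print(f"f({x}) from the formula:       {by_formula:.6f}")
print(f"f({x}) from NormalDist:        {by_library:.6f}")
print(f"phi({z}) / sigma:              {by_standardizing:.6f}")
```

The third line shows the link between the two densities: subtracting the mean and dividing by the standard deviation turns any normal variable into a standard normal one.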

Page 55

Chi-Square Distribution

Let $Z_i$, $i = 1, 2, \ldots, n$, be independent random variables, each distributed standard normal.

$$\chi^2_n = \sum_{i=1}^{n} Z_i^2$$

$\chi^2_n \sim (n, 2n)$: mean $n$ and variance $2n$.

[Figure: chi-square densities f(x) plotted against x for df=2, df=4, and df=6.]
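The construction of the chi-square variable can be checked by simulation: draw n standard normals, square and sum them, and repeat many times. The sketch below does this with Python's `random` module; the seed, the choice of 4 degrees of freedom, and the number of draws are arbitrary.

```python
import random

def chi_square_draw(n, rng):
    """One chi-square draw with n df: the sum of n squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))

rng = random.Random(454)   # fixed seed so the sketch is reproducible
df = 4
draws = [chi_square_draw(df, rng) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
sim_var = sum((d - sim_mean) ** 2 for d in draws) / len(draws)

print(f"simulated mean     = {sim_mean:.3f} (theory: {df})")
print(f"simulated variance = {sim_var:.3f} (theory: {2 * df})")
```

The simulated moments should land near n and 2n, up to sampling noise.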

Page 56

t-distribution: The Statistical Workhorse

Let $Z$ be standard normal and let $\chi^2_n$ have a chi-square distribution with $n$ degrees of freedom, independent of $Z$.

$$T = \frac{Z}{\sqrt{\chi^2_n / n}}$$

$t_n \sim (0, n/(n-2))$: mean 0 and variance $n/(n-2)$ for $n > 2$.

[Figure: t densities for df=2, df=4, and df=6, plotted from -3 to 3 and centered at 0.]

As the degrees of freedom increase, the t-distribution approaches the standard normal distribution.
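The same simulation approach illustrates the t construction and its convergence toward the normal. The sketch below builds t draws as a standard normal divided by the square root of an independent chi-square over its degrees of freedom, then compares the simulated variance with n/(n-2). The seed, sample size, and the two df values are arbitrary choices.

```python
import math
import random

def t_draw(n, rng):
    """One t draw with n df: Z / sqrt(chi-square_n / n), with Z independent of the chi-square."""
    z = rng.gauss(0.0, 1.0)
    chi_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
    return z / math.sqrt(chi_sq / n)

def variance(values):
    """Population-style variance (divide by the number of draws)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

rng = random.Random(454)   # fixed seed so the sketch is reproducible
simulated_var = {}
for df in (10, 50):
    draws = [t_draw(df, rng) for _ in range(20000)]
    simulated_var[df] = variance(draws)
    print(f"df = {df}: simulated variance = {simulated_var[df]:.3f}, "
          f"theory n/(n-2) = {df / (df - 2):.3f}")
```

As df grows, n/(n-2) falls toward 1, the variance of the standard normal, which is the convergence the slide describes.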

Page 57

Quick Review: Hypothesis Testing

In STATA, the null hypothesis for a two-tailed t-test is:

$H_0: \beta_j = 0$

Page 58

Quick Review: Hypothesis Testing

To test the hypothesis, I need to have a rejection rule. That is, I will reject the null hypothesis if the absolute value of t is greater than some critical value (c):

$$|t| > c$$

c is up to me to some extent: I must determine what level of significance I am willing to accept. For instance, if my t-value is 1.85 with 40 df and I was willing to reject only at the 5% level, my c would equal 2.021 and I would not reject the null. On the other hand, if I was willing to reject at the 10% level, my c would be 1.684, and I would reject the null hypothesis.
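The rejection rule is a one-line comparison. The sketch below applies it to the example in the text, using the two critical values quoted there for 40 degrees of freedom; the function name `reject_null` is illustrative.

```python
def reject_null(t_value, critical_value):
    """Two-tailed rejection rule: reject H0 when |t| > c."""
    return abs(t_value) > critical_value

# Critical values quoted in the lecture for 40 degrees of freedom.
critical_values_40df = {"5%": 2.021, "10%": 1.684}
t_value = 1.85

decisions = {}
for level, c in critical_values_40df.items():
    decisions[level] = reject_null(t_value, c)
    verdict = "reject H0" if decisions[level] else "do not reject H0"
    print(f"level {level}: |t| = {abs(t_value)} vs c = {c}: {verdict}")
```

Because the rule uses the absolute value, a t of -1.85 leads to the same two decisions.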

Page 59

t-distribution: 5% rejection rule for the null hypothesis $H_0: \beta_j = 0$ with 25 degrees of freedom

[Figure: t density centered at 0, with a rejection region of area .025 in each tail, to the left of -2.06 and to the right of 2.06.]

Looking at table G-2, I find the critical value for a two-tailed test is 2.06.

Page 60

Quick Review:

But this operation hides some very useful information.

STATA has decided that it is more useful to report the smallest level of significance at which the null hypothesis would be rejected. This is known as the p-value.

In the previous example (t = 1.85 with 40 df), we know that .05<p<.10.

To calculate p, STATA computes the area under the probability density function in the tails beyond the observed t.

Page 61

T-distribution: Obtaining the p-value against a two-sided alternative, when t=1.85 and df=40.

P-value = P(|T| > t)

In this case, P(|T| > 1.85) = 2P(T > 1.85) = 2(.0359) = .0718

[Figure: t density centered at 0, with a central area of .9282 and a rejection region of area .0359 in each tail.]
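The tail area can be approximated without a statistics package by integrating the t density numerically. The sketch below is not how STATA computes it; it uses the standard formula for the Student's t density and Simpson's rule, and the function names and the step count are illustrative choices.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_value, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule for the area between 0 and |t|.

    steps must be even for Simpson's rule.
    """
    upper = abs(t_value)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric around 0, so each half has area 0.5.
    return 2 * (0.5 - area_zero_to_t)

p = two_sided_p_value(1.85, 40)
print(f"two-sided p-value for t = 1.85 with 40 df: {p:.4f}")
print(f"area between -1.85 and 1.85: {1 - p:.4f}")
```

The result should agree with the slide's .0718 up to rounding, and it falls between .05 and .10 as the critical-value comparison implied.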

Page 62

F Distribution

$$F = \frac{\chi^2_{k_1} / k_1}{\chi^2_{k_2} / k_2} \sim F_{k_1, k_2}$$

where the two chi-square variables are independent.

[Figure: F densities f(x) plotted against x for df=2,8, df=6,8, and df=6,20.]

F and chi-square tests involve only a one-tailed test, using the area underneath the right portion of the curve.
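The ratio construction and the one-tailed logic can both be seen in a short simulation. The sketch below builds F draws from two independent chi-square draws and estimates a right-tail area. The seed, the sample size, the df pair (6, 20), and the cutoff of 2.60 are assumptions chosen for illustration; the theoretical mean k2/(k2-2) is a standard result for k2 > 2.

```python
import random

def chi_square_draw(k, rng):
    """One chi-square draw with k df: the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

def f_draw(k1, k2, rng):
    """One F draw: ratio of two independent chi-squares, each divided by its df."""
    return (chi_square_draw(k1, rng) / k1) / (chi_square_draw(k2, rng) / k2)

rng = random.Random(454)   # fixed seed so the sketch is reproducible
k1, k2 = 6, 20
draws = [f_draw(k1, k2, rng) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
cutoff = 2.60              # hypothetical observed F statistic
right_tail = sum(d > cutoff for d in draws) / len(draws)

print(f"simulated mean = {sim_mean:.3f} (theory: {k2 / (k2 - 2):.3f})")
print(f"share of draws above {cutoff}: {right_tail:.4f}")
```

The right-tail share plays the role of the p-value: only large values of F count as evidence against the null, so only the area to the right of the observed statistic is used.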