Gov 2000: 9. Multiple Regression in Matrix Form

Matthew Blackwell

November 19, 2015

1. Matrix algebra review

2. Matrix Operations

3. Writing the linear model more compactly

4. A bit more about matrices

5. OLS in matrix form

6. OLS inference in matrix form

Where are we? Where are we going?

• Last few weeks: regression estimation and inference with one and two independent variables, varying effects

• This week: the general regression model with arbitrary covariates

• Next week: what happens when assumptions are wrong

Nunn & Wantchekon

• Are there long-term, persistent effects of the slave trade on Africans today?

• Basic idea: compare levels of interpersonal trust ($Y_i$) across different levels of historical slave exports for a respondent's ethnic group

• Problem: ethnic groups and respondents might differ in their interpersonal trust in ways that correlate with the severity of slave exports

• One solution: try to control for relevant differences between groups via multiple regression

Nunn & Wantchekon

• Whaaaaa? Bold letters, quotation marks, what is this?
• Today's goal is to decipher this type of writing

Multiple Regression in R

nunn <- foreign::read.dta("Nunn_Wantchekon_AER_2011.dta")
mod <- lm(trust_neighbors ~ exports + age + male + urban_dum
          + malaria_ecology, data = nunn)
summary(mod)

## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      1.5030370  0.0218325   68.84   <2e-16 ***
## exports         -0.0010208  0.0000409  -24.94   <2e-16 ***
## age              0.0050447  0.0004724   10.68   <2e-16 ***
## male             0.0278369  0.0138163    2.01    0.044 *  
## urban_dum       -0.2738719  0.0143549  -19.08   <2e-16 ***
## malaria_ecology  0.0194106  0.0008712   22.28   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.978 on 20319 degrees of freedom
##   (1497 observations deleted due to missingness)
## Multiple R-squared: 0.0604, Adjusted R-squared: 0.0602
## F-statistic: 261 on 5 and 20319 DF, p-value: <2e-16

Why matrices and vectors?

• Here's one way to write the full multiple regression model:

$$y_i = \beta_0 + x_{i1}\beta_1 + x_{i2}\beta_2 + \cdots + x_{ik}\beta_k + u_i$$

• Notation is going to get needlessly messy as we add variables
• Matrices are clean, but they are like a foreign language
• You need to build intuitions over a long period of time

Quick note about interpretation

๐‘ฆ๐‘– = ๐›ฝ0 + ๐‘ฅ๐‘–1๐›ฝ1 + ๐‘ฅ๐‘–2๐›ฝ2 + โ‹ฏ + ๐‘ฅ๐‘–๐‘˜๐›ฝ๐‘˜ + ๐‘ข๐‘–

โ€ข In this model, ๐›ฝ1 is the effect of a one-unit change in ๐‘ฅ๐‘–1conditional on all other ๐‘ฅ๐‘–๐‘—.

โ€ข Jargon โ€œpartial effect,โ€ โ€œceteris paribus,โ€ โ€œall else equal,โ€โ€œconditional on the covariates,โ€ etc

1/ Matrix algebra review

Matrices and vectors

• A matrix is just a rectangular array of numbers.
• We say that a matrix is $n \times k$ ("$n$ by $k$") if it has $n$ rows and $k$ columns.
• Uppercase bold denotes a matrix:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \end{bmatrix}$$

• Generic entry: $a_{ij}$ is the entry in row $i$ and column $j$
• If you've used Excel, you've seen matrices.

Examples of matrices

• One example of a matrix that we'll use a lot is the design matrix, which has a column of ones and then one column for each independent variable in the regression:

$$\mathbf{X} = \begin{bmatrix} 1 & \text{exports}_1 & \text{age}_1 & \text{male}_1 \\ 1 & \text{exports}_2 & \text{age}_2 & \text{male}_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & \text{exports}_n & \text{age}_n & \text{male}_n \end{bmatrix}$$

Design matrix in R

head(model.matrix(mod), 8)

##   (Intercept) exports age male urban_dum malaria_ecology
## 1           1     855  40    0         0           28.15
## 2           1     855  25    1         0           28.15
## 3           1     855  38    1         1           28.15
## 4           1     855  37    0         1           28.15
## 5           1     855  31    1         0           28.15
## 6           1     855  45    0         0           28.15
## 7           1     855  20    1         0           28.15
## 8           1     855  31    0         0           28.15

dim(model.matrix(mod))

## [1] 20325 6

Vectors

• A vector is just a matrix with only one row or one column.
• A row vector is a vector with only one row, sometimes called a $1 \times k$ vector:

$$\boldsymbol{\alpha} = \begin{bmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \cdots & \alpha_k \end{bmatrix}$$

• A column vector is a vector with one column and more than one row. Here is an $n \times 1$ vector:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

• Convention: we'll assume that a vector is a column vector, and vectors will be written with lowercase bold lettering ($\mathbf{b}$)

Vector examples

• One really common kind of vector that we will work with is an individual variable, such as the dependent variable, which we will represent as $\mathbf{y}$:

$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$

Vectors in R

• Vectors can come from subsets of matrices:

model.matrix(mod)[1, ]

##     (Intercept)         exports             age            male 
##            1.00          854.96           40.00            0.00 
##       urban_dum malaria_ecology 
##            0.00           28.15

• Note, though, that R always prints a vector in row form, even if it is a column in the original data:

head(nunn$trust_neighbors)

## [1] 3 3 0 0 1 1

2/ Matrix Operations

Transpose

• The transpose of a matrix $\mathbf{A}$ is the matrix created by switching the rows and columns of the data and is denoted $\mathbf{A}'$.
• The $k$th column of $\mathbf{A}$ becomes the $k$th row of $\mathbf{A}'$:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \qquad \mathbf{A}' = \begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{bmatrix}$$

• If $\mathbf{A}$ is $n \times k$, then $\mathbf{A}'$ will be $k \times n$.
• Also written $\mathbf{A}^{\mathsf{T}}$

Transposing vectors

• Transposing will turn a $k \times 1$ column vector into a $1 \times k$ row vector and vice versa:

$$\boldsymbol{\omega} = \begin{bmatrix} 1 \\ 3 \\ 2 \\ -5 \end{bmatrix} \qquad \boldsymbol{\omega}' = \begin{bmatrix} 1 & 3 & 2 & -5 \end{bmatrix}$$

Transposing in R

a <- matrix(1:6, ncol = 3, nrow = 2)
a

##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

t(a)

##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6

Write matrices as vectors

• A matrix is just a collection of vectors (row or column)
• As row vectors:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} \mathbf{a}'_1 \\ \mathbf{a}'_2 \end{bmatrix}$$

with row vectors

$$\mathbf{a}'_1 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \qquad \mathbf{a}'_2 = \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix}$$

• Or we can define it in terms of column vectors:

$$\mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 \end{bmatrix}$$

where $\mathbf{b}_1$ and $\mathbf{b}_2$ represent the columns of $\mathbf{B}$.
• $j$ subscripts the columns of a matrix: $\mathbf{x}_j$
• $i$ and $t$ will be used for the rows of a matrix: $\mathbf{x}'_i$

Addition and subtraction

• How do we add or subtract matrices and vectors?
• First, the matrices/vectors need to be conformable, meaning that the dimensions have to be the same
• Let $\mathbf{A}$ and $\mathbf{B}$ both be $2 \times 2$ matrices. Then let $\mathbf{C} = \mathbf{A} + \mathbf{B}$, where we add each cell together (a quick R check follows the equation):

$$\mathbf{A} + \mathbf{B} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} = \mathbf{C}$$

Scalar multiplication

• A scalar is just a single number: you can think of it sort of like a $1 \times 1$ matrix.
• When we multiply a scalar by a matrix, we just multiply each element/cell by that scalar (see the R one-liner below):

$$\alpha \mathbf{A} = \alpha \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \alpha \times a_{11} & \alpha \times a_{12} \\ \alpha \times a_{21} & \alpha \times a_{22} \end{bmatrix}$$
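In R, the ordinary * operator does exactly this element-wise scaling (again with illustrative values):

A <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)
3 * A  # every entry of A multiplied by the scalar 3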

3/ Writing the linear model more compactly

The linear model with new notation

• Remember that we wrote the linear model as the following for all $i \in [1, \ldots, n]$:

$$y_i = \beta_0 + x_i\beta_1 + z_i\beta_2 + u_i$$

• Imagine we had an $n$ of 4. We could write out each formula:

$$y_1 = \beta_0 + x_1\beta_1 + z_1\beta_2 + u_1 \quad \text{(unit 1)}$$
$$y_2 = \beta_0 + x_2\beta_1 + z_2\beta_2 + u_2 \quad \text{(unit 2)}$$
$$y_3 = \beta_0 + x_3\beta_1 + z_3\beta_2 + u_3 \quad \text{(unit 3)}$$
$$y_4 = \beta_0 + x_4\beta_1 + z_4\beta_2 + u_4 \quad \text{(unit 4)}$$

The linear model with new notation

๐‘ฆ1 = ๐›ฝ0 + ๐‘ฅ1๐›ฝ1 + ๐‘ง1๐›ฝ2 + ๐‘ข1 (unit 1)๐‘ฆ2 = ๐›ฝ0 + ๐‘ฅ2๐›ฝ1 + ๐‘ง2๐›ฝ2 + ๐‘ข2 (unit 2)๐‘ฆ3 = ๐›ฝ0 + ๐‘ฅ3๐›ฝ1 + ๐‘ง3๐›ฝ2 + ๐‘ข3 (unit 3)๐‘ฆ4 = ๐›ฝ0 + ๐‘ฅ4๐›ฝ1 + ๐‘ง4๐›ฝ2 + ๐‘ข4 (unit 4)

โ€ข We can write this as:

โŽกโŽขโŽขโŽขโŽฃ

๐‘ฆ1๐‘ฆ2๐‘ฆ3๐‘ฆ4

โŽคโŽฅโŽฅโŽฅโŽฆ

=โŽกโŽขโŽขโŽขโŽฃ

1111

โŽคโŽฅโŽฅโŽฅโŽฆ

๐›ฝ0 +โŽกโŽขโŽขโŽขโŽฃ

๐‘ฅ1๐‘ฅ2๐‘ฅ3๐‘ฅ4

โŽคโŽฅโŽฅโŽฅโŽฆ

๐›ฝ1 +โŽกโŽขโŽขโŽขโŽฃ

๐‘ง1๐‘ง2๐‘ง3๐‘ง4

โŽคโŽฅโŽฅโŽฅโŽฆ

๐›ฝ2 +โŽกโŽขโŽขโŽขโŽฃ

๐‘ข1๐‘ข2๐‘ข3๐‘ข4

โŽคโŽฅโŽฅโŽฅโŽฆ

โ€ข Outcome is a linear combination of the the ๐ฑ, ๐ณ, and ๐ฎ vectors

Grouping things into matrices

• Can we write this in a more compact form? Yes! Let $\mathbf{X}$ and $\boldsymbol{\beta}$ be the following:

$$\underset{(4 \times 3)}{\mathbf{X}} = \begin{bmatrix} 1 & x_1 & z_1 \\ 1 & x_2 & z_2 \\ 1 & x_3 & z_3 \\ 1 & x_4 & z_4 \end{bmatrix} \qquad \underset{(3 \times 1)}{\boldsymbol{\beta}} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}$$

Matrix multiplication by a vector

• We can write this more compactly as a matrix (post-)multiplied by a vector:

$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \beta_0 + \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \beta_1 + \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} \beta_2 = \mathbf{X}\boldsymbol{\beta}$$

• Multiplication of a matrix by a vector is just the linear combination of the columns of the matrix with the vector elements as weights/coefficients (see the R sketch below).
• And the left-hand side here only uses scalars times vectors, which is easy!
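A small R sketch of this idea, using an illustrative $4 \times 3$ matrix and coefficient vector (the values are made up): multiplying by the vector gives the same answer as weighting the columns.

X <- cbind(1, c(2, 4, 6, 8), c(1, 0, 1, 0))             # columns: ones, x, z
beta <- c(0.5, 2, -1)
X %*% beta                                               # matrix-by-vector multiplication
beta[1] * X[, 1] + beta[2] * X[, 2] + beta[3] * X[, 3]   # same linear combination of the columns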

General matrix by vector multiplication

โ€ข ๐€ is a ๐‘› ร— ๐‘˜ matrixโ€ข ๐› is a ๐‘˜ ร— 1 column vectorโ€ข Columns of ๐€ have to match rows of ๐›โ€ข Let ๐š๐‘— be the ๐‘—th column of ๐ด. Then we can write:

๐œ(๐‘›ร—1)

= ๐€๐› = ๐‘1๐š1 + ๐‘2๐š2 + โ‹ฏ + ๐‘๐‘˜๐š๐‘˜

โ€ข ๐œ is linear combination of the columns of ๐€

Back to regression

• $\mathbf{X}$ is the $n \times (k+1)$ design matrix of independent variables
• $\boldsymbol{\beta}$ is the $(k+1) \times 1$ column vector of coefficients
• $\mathbf{X}\boldsymbol{\beta}$ will be $n \times 1$:

$$\mathbf{X}\boldsymbol{\beta} = \beta_0 + \beta_1\mathbf{x}_1 + \beta_2\mathbf{x}_2 + \cdots + \beta_k\mathbf{x}_k$$

• Thus, we can compactly write the linear model as the following:

$$\underset{(n \times 1)}{\mathbf{y}} = \underset{(n \times 1)}{\mathbf{X}\boldsymbol{\beta}} + \underset{(n \times 1)}{\mathbf{u}}$$

• We can also write this at the individual level, where $\mathbf{x}'_i$ is the $i$th row of $\mathbf{X}$ (see the R check below):

$$y_i = \mathbf{x}'_i\boldsymbol{\beta} + u_i$$
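As a concrete check in R (a sketch assuming the nunn data and the mod object from the earlier slides are loaded), the first row of the design matrix times the estimated coefficient vector reproduces the first fitted value from lm():

X <- model.matrix(mod)   # the n x (k+1) design matrix
b <- coef(mod)           # the (k+1) x 1 vector of estimated coefficients
X[1, ] %*% b             # x_1' times the coefficient vector
fitted(mod)[1]           # the same number, straight from lm()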

4/ A bit more about matrices

Matrix multiplication

• What if, instead of a column vector $\mathbf{b}$, we have a matrix $\mathbf{B}$ with dimensions $k \times m$?
• How do we do multiplication like $\mathbf{C} = \mathbf{A}\mathbf{B}$?
• Each column of the new matrix is just matrix-by-vector multiplication:

$$\mathbf{C} = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_m \end{bmatrix} \qquad \mathbf{c}_j = \mathbf{A}\mathbf{b}_j$$

• Thus, each column of $\mathbf{C}$ is a linear combination of the columns of $\mathbf{A}$, as the R sketch below shows.
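A short R sketch of this column-by-column view, with illustrative conformable matrices ($2 \times 3$ times $3 \times 2$):

A <- matrix(1:6, nrow = 2, ncol = 3)
B <- matrix(7:12, nrow = 3, ncol = 2)
C <- A %*% B    # the 2 x 2 product
C[, 1]          # the first column of C ...
A %*% B[, 1]    # ... is A times the first column of B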

Special multiplications

• The inner product of two column vectors $\mathbf{a}$ and $\mathbf{b}$ (of equal dimension, $k \times 1$):

$$\mathbf{a}'\mathbf{b} = a_1b_1 + a_2b_2 + \cdots + a_kb_k$$

• Special case of the above: $\mathbf{a}'$ is a matrix with $k$ columns and just 1 row, so the "columns" of $\mathbf{a}'$ are just scalars.

Sum of the squared residuals

• Example: let's say that we have a vector of residuals, $\widehat{\mathbf{u}}$. Then the inner product of the residuals is:

$$\widehat{\mathbf{u}}'\widehat{\mathbf{u}} = \begin{bmatrix} \widehat{u}_1 & \widehat{u}_2 & \cdots & \widehat{u}_n \end{bmatrix} \begin{bmatrix} \widehat{u}_1 \\ \widehat{u}_2 \\ \vdots \\ \widehat{u}_n \end{bmatrix}$$

$$\widehat{\mathbf{u}}'\widehat{\mathbf{u}} = \widehat{u}_1\widehat{u}_1 + \widehat{u}_2\widehat{u}_2 + \cdots + \widehat{u}_n\widehat{u}_n = \sum_{i=1}^{n} \widehat{u}_i^2$$

• It's just the sum of the squared residuals! (See the R check below.)
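A quick numerical check in R (assuming mod from the earlier slides is in memory): the inner product of the residual vector equals the sum of squared residuals.

u_hat <- residuals(mod)   # residual vector from the fitted model
t(u_hat) %*% u_hat        # the inner product u'u (a 1 x 1 matrix)
sum(u_hat^2)              # the same sum of squared residuals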

Square matrices and the diagonal

• A square matrix has equal numbers of rows and columns.
• The diagonal of a square matrix consists of the values $a_{jj}$:

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

• The identity matrix, $\mathbf{I}$, is a square matrix with 1s along the diagonal and 0s everywhere else:

$$\mathbf{I} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

• The identity matrix multiplied by any matrix returns that matrix: $\mathbf{A}\mathbf{I} = \mathbf{A}$.

Identity matrix

• To get the diagonal of a matrix in R, use the diag() function:

b <- matrix(1:4, nrow = 2, ncol = 2)
b

##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4

diag(b)

## [1] 1 4

• diag() also creates identity matrices in R:

diag(3)

##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    1    0
## [3,]    0    0    1

5/ OLS in matrix form

Multiple linear regression in matrix form

โ€ข Let ๐œท be the matrix of estimated regression coefficients and ๏ฟฝ๏ฟฝbe the vector of fitted values:

๐œท =โŽกโŽขโŽขโŽขโŽฃ

๐›ฝ0๐›ฝ1โ‹ฎ

๐›ฝ๐‘˜

โŽคโŽฅโŽฅโŽฅโŽฆ

๏ฟฝ๏ฟฝ = ๐—๐œท

โ€ข It might be helpful to see this again more written out:

๏ฟฝ๏ฟฝ =โŽกโŽขโŽขโŽขโŽฃ

๐‘ฆ1๐‘ฆ2โ‹ฎ๐‘ฆ๐‘›

โŽคโŽฅโŽฅโŽฅโŽฆ

= ๐—๐œท =โŽกโŽขโŽขโŽขโŽฃ

1๐›ฝ0 + ๐‘ฅ11๐›ฝ1 + ๐‘ฅ12๐›ฝ2 + โ‹ฏ + ๐‘ฅ1๐‘˜๐›ฝ๐‘˜1๐›ฝ0 + ๐‘ฅ21๐›ฝ1 + ๐‘ฅ22๐›ฝ2 + โ‹ฏ + ๐‘ฅ2๐‘˜๐›ฝ๐‘˜

โ‹ฎ1๐›ฝ0 + ๐‘ฅ๐‘›1๐›ฝ1 + ๐‘ฅ๐‘›2๐›ฝ2 + โ‹ฏ + ๐‘ฅ๐‘›๐‘˜๐›ฝ๐‘˜

โŽคโŽฅโŽฅโŽฅโŽฆ

Residuals

• We can easily write the residuals in matrix form:

$$\widehat{\mathbf{u}} = \mathbf{y} - \mathbf{X}\widehat{\boldsymbol{\beta}}$$

• Our goal, as usual, is to minimize the sum of the squared residuals, which we saw earlier we can write as:

$$\widehat{\mathbf{u}}'\widehat{\mathbf{u}} = (\mathbf{y} - \mathbf{X}\widehat{\boldsymbol{\beta}})'(\mathbf{y} - \mathbf{X}\widehat{\boldsymbol{\beta}})$$

OLS estimator in matrix form

• Goal: minimize the sum of the squared residuals
• Take (matrix) derivatives, set equal to 0 (see Wooldridge for details)
• Resulting first-order conditions:

$$\mathbf{X}'(\mathbf{y} - \mathbf{X}\widehat{\boldsymbol{\beta}}) = 0$$

• Rearranging:

$$\mathbf{X}'\mathbf{X}\widehat{\boldsymbol{\beta}} = \mathbf{X}'\mathbf{y}$$

• In order to isolate $\widehat{\boldsymbol{\beta}}$, we need to move the $\mathbf{X}'\mathbf{X}$ term to the other side of the equals sign.
• We've learned about matrix multiplication, but what about matrix "division"?

Scalar inverses

• What is division in its simplest form? $\frac{1}{a}$ is the value such that $a\frac{1}{a} = 1$.
• For some algebraic expression $au = b$, let's solve for $u$:

$$\frac{1}{a}au = \frac{1}{a}b$$
$$u = \frac{b}{a}$$

• Need a matrix version of this: $\frac{1}{a}$.

Matrix inverses

• Definition: If it exists, the inverse of a square matrix $\mathbf{A}$, denoted $\mathbf{A}^{-1}$, is the matrix such that $\mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$.
• We can use the inverse to solve (systems of) equations (see the R sketch below):

$$\mathbf{A}\mathbf{u} = \mathbf{b}$$
$$\mathbf{A}^{-1}\mathbf{A}\mathbf{u} = \mathbf{A}^{-1}\mathbf{b}$$
$$\mathbf{I}\mathbf{u} = \mathbf{A}^{-1}\mathbf{b}$$
$$\mathbf{u} = \mathbf{A}^{-1}\mathbf{b}$$

• If the inverse exists, we say that $\mathbf{A}$ is invertible or nonsingular.
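In R, solve(A) computes the inverse and solve(A, b) solves the system directly. A minimal sketch with illustrative values:

A <- matrix(c(2, 1, 1, 3), nrow = 2, ncol = 2)
b <- c(5, 10)
solve(A)         # the inverse of A
solve(A) %*% A   # recovers the 2 x 2 identity matrix (up to rounding)
solve(A) %*% b   # u = A^{-1} b
solve(A, b)      # the same solution, computed more directly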

Back to OLS

• Let's assume, for now, that the inverse of $\mathbf{X}'\mathbf{X}$ exists (we'll come back to this)
• Then we can write the OLS estimator as the following:

$$\widehat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$

• Memorize this: "ex prime ex inverse ex prime y". Sear it into your soul.

OLS by hand in R

๐œท = (๐—โ€ฒ๐—)โˆ’1๐—โ€ฒ๐ฒ

โ€ข First we need to get the design matrix and the response:

X <- model.matrix(trust_neighbors ~ exports + age + male+ urban_dum + malaria_ecology, data = nunn)

dim(X)

## [1] 20325 6

## model.frame always puts the response in the first columny <- model.frame(trust_neighbors ~ exports + age + male

+ urban_dum + malaria_ecology, data = nunn)[,1]length(y)

## [1] 20325

OLS by hand in R

๐œท = (๐—โ€ฒ๐—)โˆ’1๐—โ€ฒ๐ฒ

โ€ข Use the solve() for inverses and %*% for matrixmultiplication:

solve(t(X) %*% X) %*% t(X) %*% y

## (Intercept) exports age male urban_dum## [1,] 1.503 -0.001021 0.005045 0.02784 -0.2739## malaria_ecology## [1,] 0.01941

coef(mod)

## (Intercept) exports age male## 1.503037 -0.001021 0.005045 0.027837## urban_dum malaria_ecology## -0.273872 0.019411

Intuition for the OLS in matrix form

๐œท = (๐—โ€ฒ๐—)โˆ’1๐—โ€ฒ๐ฒ

โ€ข Whatโ€™s the intuition here?โ€ข โ€œNumeratorโ€ ๐—โ€ฒ๐ฒ: is roughly composed of the covariances

between the columns of ๐— and ๐ฒโ€ข โ€œDenominatorโ€ ๐—โ€ฒ๐— is roughly composed of the sample

variances and covariances of variables within ๐—โ€ข Thus, we have something like:

๐œท โ‰ˆ (variance of ๐—)โˆ’1(covariance of ๐— & ๐ฒ)

โ€ข This is a rough sketch and isnโ€™t strictly true, but it canprovide intuition.

6/ OLS inference in matrix form

Random vectors

• A random vector is a vector of random variables:

$$\mathbf{x}_i = \begin{bmatrix} x_{i1} \\ x_{i2} \end{bmatrix}$$

• Here, $\mathbf{x}_i$ is a random vector and $x_{i1}$ and $x_{i2}$ are random variables.
• When we talk about the distribution of $\mathbf{x}_i$, we are talking about the joint distribution of $x_{i1}$ and $x_{i2}$.

Distribution of random vectors

๐ฑ๐‘– โˆผ (๐”ผ[๐ฑ๐‘–], ๐•[๐ฑ๐‘–])

โ€ข Expectation of random vectors:

๐”ผ[๐ฑ๐‘–] = [ ๐”ผ[๐‘ฅ๐‘–1]๐”ผ[๐‘ฅ๐‘–2] ]

โ€ข Variance of random vectors:

๐•[๐ฑ๐‘–] = [ ๐•[๐‘ฅ๐‘–1] Cov[๐‘ฅ๐‘–1, ๐‘ฅ๐‘–2]Cov[๐‘ฅ๐‘–1, ๐‘ฅ๐‘–2] ๐•[๐‘ฅ๐‘–2] ]

โ€ข Variance of the random vector also called thevariance-covariance matrix.

โ€ข These describe the joint distribution of ๐ฑ๐‘–

Most general OLS assumptions

1. Linearity: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{u}$
2. Random/iid sample: $(y_i, \mathbf{x}'_i)$ are an iid sample from the population.
3. No perfect collinearity: $\mathbf{X}$ is an $n \times (k+1)$ matrix with rank $k+1$
4. Zero conditional mean: $\mathbb{E}[\mathbf{u}|\mathbf{X}] = \mathbf{0}$
5. Homoskedasticity: $\text{var}(\mathbf{u}|\mathbf{X}) = \sigma^2_u \mathbf{I}_n$
6. Normality: $\mathbf{u}|\mathbf{X} \sim N(\mathbf{0}, \sigma^2_u \mathbf{I}_n)$

No perfect collinearity

• In matrix form: $\mathbf{X}$ is an $n \times (k+1)$ matrix with rank $k+1$
• Definition: The rank of a matrix is the maximum number of linearly independent columns.
• If $\mathbf{X}$ has rank $k+1$, then all of its columns are linearly independent
• ...and none of its columns are linearly dependent $\implies$ no perfect collinearity
• $\mathbf{X}$ has rank $k+1$ $\implies$ $(\mathbf{X}'\mathbf{X})$ is invertible (the R illustration below shows what goes wrong otherwise)
• Just like variation in $X_i$ let us divide by the variance in simple OLS
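A small R illustration of why rank matters (hypothetical data; the bad column is deliberately an exact linear combination of another): when a column is perfectly collinear, $\mathbf{X}'\mathbf{X}$ is singular and cannot be inverted.

x1 <- c(1, 2, 3, 4, 5)
x2 <- 2 * x1                             # perfectly collinear with x1
X_bad <- cbind(1, x1, x2)                # design matrix with rank 2, not 3
qr(X_bad)$rank                           # rank is 2
## solve(t(X_bad) %*% X_bad)             # would error: the matrix is singular
X_ok <- cbind(1, x1, c(2, 1, 4, 3, 6))   # no exact linear dependence
solve(t(X_ok) %*% X_ok)                  # invertible, so OLS is computable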

Zero conditional mean error

• Using the zero conditional mean error assumption:

$$\mathbb{E}[\mathbf{u}|\mathbf{X}] = \begin{bmatrix} \mathbb{E}[u_1|\mathbf{X}] \\ \mathbb{E}[u_2|\mathbf{X}] \\ \vdots \\ \mathbb{E}[u_n|\mathbf{X}] \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0}$$

OLS is unbiased

• Under matrix assumptions 1-4, OLS is unbiased for $\boldsymbol{\beta}$ (a short derivation sketch follows):

$$\mathbb{E}[\widehat{\boldsymbol{\beta}}] = \boldsymbol{\beta}$$

Variance-covariance matrix

• The homoskedasticity assumption is different:

$$\text{var}(\mathbf{u}|\mathbf{X}) = \sigma^2_u \mathbf{I}_n$$

• In order to investigate this, we need to know what the variance of a vector is.
• The variance of a vector is actually a matrix:

$$\text{var}[\mathbf{u}] = \boldsymbol{\Sigma}_u = \begin{bmatrix} \text{var}(u_1) & \text{cov}(u_1, u_2) & \ldots & \text{cov}(u_1, u_n) \\ \text{cov}(u_2, u_1) & \text{var}(u_2) & \ldots & \text{cov}(u_2, u_n) \\ \vdots & & \ddots & \vdots \\ \text{cov}(u_n, u_1) & \text{cov}(u_n, u_2) & \ldots & \text{var}(u_n) \end{bmatrix}$$

• This matrix is symmetric since $\text{cov}(u_i, u_j) = \text{cov}(u_j, u_i)$

Matrix version of homoskedasticity

• Once again: $\text{var}(\mathbf{u}|\mathbf{X}) = \sigma^2_u \mathbf{I}_n$
• $\mathbf{I}_n$ is the $n \times n$ identity matrix
• Visually:

$$\text{var}[\mathbf{u}] = \sigma^2_u \mathbf{I}_n = \begin{bmatrix} \sigma^2_u & 0 & 0 & \ldots & 0 \\ 0 & \sigma^2_u & 0 & \ldots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \ldots & \sigma^2_u \end{bmatrix}$$

• In less matrix notation:
  ▶ $\text{var}(u_i) = \sigma^2_u$ for all $i$ (constant variance)
  ▶ $\text{cov}(u_i, u_j) = 0$ for all $i \neq j$ (implied by iid)

Sampling variance for OLS estimates

• Under assumptions 1-5, the sampling variance of the OLS estimator can be written in matrix form as the following:

$$\text{var}[\widehat{\boldsymbol{\beta}}] = \sigma^2_u (\mathbf{X}'\mathbf{X})^{-1}$$

• This symmetric matrix looks like this, with rows and columns indexed by $\widehat{\beta}_0, \widehat{\beta}_1, \ldots, \widehat{\beta}_k$:

$$\text{var}[\widehat{\boldsymbol{\beta}}] = \begin{bmatrix} \text{var}[\widehat{\beta}_0] & \text{cov}[\widehat{\beta}_0, \widehat{\beta}_1] & \text{cov}[\widehat{\beta}_0, \widehat{\beta}_2] & \cdots & \text{cov}[\widehat{\beta}_0, \widehat{\beta}_k] \\ \text{cov}[\widehat{\beta}_1, \widehat{\beta}_0] & \text{var}[\widehat{\beta}_1] & \text{cov}[\widehat{\beta}_1, \widehat{\beta}_2] & \cdots & \text{cov}[\widehat{\beta}_1, \widehat{\beta}_k] \\ \text{cov}[\widehat{\beta}_2, \widehat{\beta}_0] & \text{cov}[\widehat{\beta}_2, \widehat{\beta}_1] & \text{var}[\widehat{\beta}_2] & \cdots & \text{cov}[\widehat{\beta}_2, \widehat{\beta}_k] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \text{cov}[\widehat{\beta}_k, \widehat{\beta}_0] & \text{cov}[\widehat{\beta}_k, \widehat{\beta}_1] & \text{cov}[\widehat{\beta}_k, \widehat{\beta}_2] & \cdots & \text{var}[\widehat{\beta}_k] \end{bmatrix}$$

Inference in the general setting

• Under assumptions 1-5, in large samples:

$$\frac{\widehat{\beta}_j - \beta_j}{\widehat{SE}[\widehat{\beta}_j]} \sim N(0, 1)$$

• In small samples, under assumptions 1-6:

$$\frac{\widehat{\beta}_j - \beta_j}{\widehat{SE}[\widehat{\beta}_j]} \sim t_{n-(k+1)}$$

• Thus, under the null of $H_0: \beta_j = 0$, we know that

$$\frac{\widehat{\beta}_j}{\widehat{SE}[\widehat{\beta}_j]} \sim t_{n-(k+1)}$$

• Here, the estimated SEs come from (an R version follows below):

$$\widehat{\text{var}}[\widehat{\boldsymbol{\beta}}] = \widehat{\sigma}^2_u (\mathbf{X}'\mathbf{X})^{-1} \qquad \widehat{\sigma}^2_u = \frac{\widehat{\mathbf{u}}'\widehat{\mathbf{u}}}{n - (k+1)}$$

Covariance matrix in R

• We can access this estimated covariance matrix, $\widehat{\sigma}^2_u(\mathbf{X}'\mathbf{X})^{-1}$, in R:

vcov(mod)

##                   (Intercept)    exports        age       male
## (Intercept)      0.0004766593  1.164e-07 -7.956e-06 -6.676e-05
## exports          0.0000001164  1.676e-09 -3.659e-10  7.283e-09
## age             -0.0000079562 -3.659e-10  2.231e-07 -7.765e-07
## male            -0.0000667572  7.283e-09 -7.765e-07  1.909e-04
## urban_dum       -0.0000965843 -4.861e-08  7.108e-07 -1.711e-06
## malaria_ecology -0.0000069094 -2.124e-08  2.324e-10 -1.017e-07
##                   urban_dum malaria_ecology
## (Intercept)      -9.658e-05      -6.909e-06
## exports          -4.861e-08      -2.124e-08
## age               7.108e-07       2.324e-10
## male             -1.711e-06      -1.017e-07
## urban_dum         2.061e-04       2.724e-09
## malaria_ecology   2.724e-09       7.590e-07

Standard errors from the covariance matrix

• Note that the diagonal entries are the variances, so the square roots of the diagonal are the standard errors:

sqrt(diag(vcov(mod)))

##     (Intercept)         exports             age            male 
##      0.02183253      0.00004094      0.00047237      0.01381627 
##       urban_dum malaria_ecology 
##      0.01435491      0.00087123

coef(summary(mod))[, "Std. Error"]

##     (Intercept)         exports             age            male 
##      0.02183253      0.00004094      0.00047237      0.01381627 
##       urban_dum malaria_ecology 
##      0.01435491      0.00087123

Nunn & Wantchekon

Wrapping up

• You have the full power of matrices.
• Key to writing the OLS estimator and discussing higher-level concepts in regression and beyond.
• Next week: diagnosing and fixing problems with the linear model.
