Stat 451 Lecture Notes 01

Introduction

Ryan Martin, UIC

www.math.uic.edu/~rgmartin

Based on parts of: Dalgaard’s ISwR book, Chapter 1 in Givens & Hoeting, and Chapter 7 of Lange.

Updated: January 13, 2016


What to compute?

Stat 451 is a course about computational statistics.

Therefore, it is important first to discuss what we want to compute in a statistics problem.

Here, we are basically concerned with two kinds of things:

- maximizing the likelihood function
- integrating a “posterior distribution”

The former notion should be familiar from your experience with maximum likelihood in Stat 411.

The latter may be new to you; it’s “Bayesian”.

Next is a brief introduction to these concepts, along with a non-trivial illustration.


Maximum likelihood

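Suppose we have $n$ independent observations, $Y_1, \ldots, Y_n$, and the density/mass function $p_\theta$ for these observations depends on an unknown parameter $\theta$.

The likelihood and log-likelihood functions are

$$L(\theta) = \prod_{i=1}^n p_\theta(Y_i) \quad \text{and} \quad \ell(\theta) = \sum_{i=1}^n \log p_\theta(Y_i).$$

The maximum likelihood estimator (MLE) $\hat\theta$ of $\theta$, based on data, maximizes the likelihood, i.e.,

$$\hat\theta = \arg\max_\theta L(\theta) \iff \dot\ell(\hat\theta) = 0,$$

where $\dot\ell$ denotes the derivative (gradient) of $\ell$ with respect to $\theta$.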

Need to be able to optimize and/or find roots of functions.
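A minimal sketch of the optimization step, assuming (purely for illustration) an exponential model with rate θ, so the closed-form MLE 1/mean(y) is available as a check. The simulated data, the seed, and the names y, loglik and theta.hat are made up for this example.

```r
set.seed(451)
y <- rexp(100, rate = 2)                 # simulated data; true rate is 2

# log-likelihood of the exponential model: n * log(theta) - theta * sum(y)
loglik <- function(theta) sum(dexp(y, rate = theta, log = TRUE))

# one-dimensional maximization over a bracketing interval
fit <- optimize(loglik, interval = c(0.01, 20), maximum = TRUE, tol = 1e-8)
theta.hat <- fit$maximum

# the numerical maximizer should agree with the closed form
c(numerical = theta.hat, closed.form = 1 / mean(y))
```

For a vector parameter, optim() plays the same role as optimize() does here.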


Maximum likelihood (cont)

Besides producing an estimate of the unknown parameter, we might also like to assess its uncertainty.

In Stat 411 you learn that, under some conditions, when the sample size $n$ is large, the distribution of $\hat\theta$ is approximately normal with mean $\theta$ and variance $I(\theta)^{-1}$, where $I(\theta)$ is the Fisher information matrix:

$$I(\theta) = E_\theta\{\dot\ell(\theta)\,\dot\ell(\theta)^\top\} = -E_\theta\{\ddot\ell(\theta)\}.$$

Then an approximate 95% confidence interval for $\theta_j$ is

$$\hat\theta_j \pm 1.96 \cdot \sqrt{[I(\hat\theta)^{-1}]_{jj}}, \quad j = 1, \ldots, d.$$

So, computing derivatives and inverting matrices is important.
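A rough sketch of how this looks in R, assuming a normal model with parameter (mu, log sigma) chosen only for illustration. Note the swap: the code uses the numerically differentiated Hessian returned by optim(), i.e., the observed information at the MLE, as a stand-in for the expected Fisher information. All object names are made up.

```r
set.seed(451)
y <- rnorm(200, mean = 1, sd = 2)        # simulated data
n <- length(y)

# negative log-likelihood in theta = (mu, log sigma)
negloglik <- function(theta) {
  -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

fit <- optim(c(0, 0), negloglik, method = "BFGS", hessian = TRUE)
theta.hat <- fit$par

# Hessian of the negative log-likelihood = observed information at the MLE
se <- sqrt(diag(solve(fit$hessian)))
ci <- cbind(lower = theta.hat - 1.96 * se, upper = theta.hat + 1.96 * se)
rownames(ci) <- c("mu", "log.sigma")
ci
```

Both ingredients from the slide appear: derivatives (the Hessian) and a matrix inverse (solve).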


Bayesian approach

The Bayesian approach is based on using the rules of probability for inference.

Start with a prior distribution for $\theta$, with density/mass function $\pi(\theta)$, basically just a weight function.

Combining the prior with the likelihood yields a conditional (posterior) distribution for $\theta$, given $Y$, as

$$\pi(\theta \mid Y) = \frac{L(\theta)\,\pi(\theta)}{\int L(u)\,\pi(u)\,du} \propto L(\theta)\,\pi(\theta).$$

Now we treat $\pi(\theta \mid Y)$ as the object of interest and the goal is to produce various summaries, such as mean, variance, quantiles, probabilities, etc.

So, integrating functions will be important.
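A small sketch of such integrals, assuming a binomial likelihood with a Beta(a, b) prior. This conjugate pair is picked only because the exact posterior, Beta(a + y, b + n - y), is available for checking. The numbers and names are made up.

```r
n <- 20; y <- 14                         # hypothetical data: 14 successes in 20
a <- 2;  b <- 2                          # hypothetical prior parameters

# unnormalized posterior: likelihood times prior
unnorm <- function(u) dbinom(y, size = n, prob = u) * dbeta(u, a, b)

# normalizing constant, posterior mean, and a posterior probability
norm.const <- integrate(unnorm, lower = 0, upper = 1)$value
post.mean  <- integrate(function(u) u * unnorm(u), 0, 1)$value / norm.const
post.prob  <- integrate(unnorm, 0.5, 1)$value / norm.const   # P(theta > 0.5 | Y)

c(post.mean = post.mean, exact = (a + y) / (a + b + n))
```

In one dimension integrate() is enough; higher-dimensional posteriors are where more specialized methods become necessary.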


Example: probit regression

$Y_1, \ldots, Y_n$ are independent (not iid) binary observations.

Specifically, $Y_i \overset{\text{ind}}{\sim} \mathrm{Ber}(\Phi(x_i^\top \theta))$, $i = 1, \ldots, n$, where:

- “Ber” denotes a Bernoulli distribution;
- $x_1, \ldots, x_n$ are fixed $d$-dimensional covariates;
- $\theta$ is a $d$-dimensional parameter vector; and
- $\Phi$ is the standard normal distribution function. (Other cdfs can be used, but then the model isn’t called “probit”.)

Exercise:

- write out log-likelihood function
- calculate Fisher information matrix
- ...
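One way to check an answer to the first part of the exercise numerically is sketched below, on simulated data. The design matrix, the "true" parameter, and all object names are assumptions of this example; glm() is used only as a check.

```r
set.seed(451)
n <- 200
X <- cbind(1, rnorm(n))                  # n x d covariate matrix, d = 2
theta.true <- c(-0.5, 1)                 # hypothetical true parameter
Y <- rbinom(n, size = 1, prob = pnorm(drop(X %*% theta.true)))

# Bernoulli log-likelihood with success probability Phi(x_i' theta)
probit.loglik <- function(theta, X, Y) {
  eta <- drop(X %*% theta)
  sum(Y * pnorm(eta, log.p = TRUE) +
      (1 - Y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
}

# maximize by minimizing the negative log-likelihood
fit <- optim(c(0, 0), function(th) -probit.loglik(th, X, Y), method = "BFGS")
theta.hat <- fit$par

# built-in answer, for checking only
check <- unname(coef(glm(Y ~ X[, 2], family = binomial(link = "probit"))))
rbind(optim = theta.hat, glm = check)
```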


Remarks

This course will mainly study how to solve certain optimization and integration problems that arise in statistics applications.

We’ll need some background on general numerical methods.

Software will also be important — we will use R.

Some of what we discuss in the class will be simple, other things more difficult.

My goal is that students completing the course will have sufficient background to read current papers on computational statistics and implement their methods.


Outline

1 Review of statistical inference

2 Introduction to R
- Basics
- R session
- R graphics
- R programming
- Data entry

3 Math and stat tools


General facts about R

R is a free implementation of the S language, the same language behind the commercial S-PLUS software.

Can be downloaded for free (http://cran.r-project.org) for Windows, Mac, and Unix computers.

The environment is interactive by default, like a calculator, but users can create files of R code (called scripts) on the side, which can be run all at once within R.

It is possible to write code that works together with lower-level programming languages like C and FORTRAN (for speed).

R is powerful because of its flexibility: users can easily define their own functions or modify existing functions to suit their needs.


Arithmetic

Among other things, R can do arithmetic like a calculator.

Basic binary (arithmetic) operations are:

+   Addition          ^ or **   Exponentiation
-   Subtraction       %/%       Integer division
*   Multiplication    %%        Modulus (remainder)
/   Division
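A quick sketch of the less familiar operators. The numbers are arbitrary.

```r
x <- 17; y <- 5

q <- x %/% y          # integer division: how many times y fits into x
r <- x %% y           # remainder
q * y + r == x        # the two always fit together this way

2^10                  # exponentiation
2**10                 # ** is accepted as a synonym for ^

# for negative numbers, %/% rounds toward minus infinity
(-17) %/% 5
(-17) %% 5
```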


Variables and assignments

Even with fairly routine calculations it is helpful to be able to store some intermediate values.

R allows users to assign a value to a particular variable.

Syntax: x <- 7

This means that the value 7 is assigned to the variable x.

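Note: The assignment symbol <- is to be treated as a single character, an arrow pointing to the left.

An equal sign can be used in place of the assignment arrow, but this is not recommended. (Very old versions of R and S also accepted the underscore as an assignment symbol; current versions of R do not.)

In current versions of R the underscore is allowed in variable names, e.g., pred_value; a period, as in pred.value, is the older convention.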


Expressions and objects

In R, the user enters an expression and the system evaluates it and produces output.

These expressions need not be formulas; they can generate graphs, output data sets, etc.

Expressions work on objects, basically anything that can be assigned to a variable.

But the syntax used is expression/object specific.

In what follows we will discuss several important types of expressions and objects.

Use str(X) to view the “structure” of the object X.


Functions and arguments

Functions in R can take many forms:

- There’s the kind that look like mathematical functions, say log(x),
- and the kind that don’t, say plot(x, y, pch=2).

The common feature is that there is a set of parentheses containing those arguments that the function applies to.

Two “types” of arguments:

- Positional: variable recognized by position in the list.
- Named: variable recognized by name.

Some functions don’t have arguments, some have default arguments, and some allow “arbitrary” arguments.

R has an extensive list of built-in functions that can do all sorts of things, and it’s easy to write your own functions since the function syntax in R is the same as ordinary R syntax.
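The argument types above can be seen in a small user-defined function. The function power.mean is made up for this illustration; it is not a built-in.

```r
# x: positional or named; p: has a default; ...: "arbitrary" extra
# arguments, passed along unchanged to mean()
power.mean <- function(x, p = 1, ...) {
  mean(x^p, ...)^(1 / p)
}

x <- c(1, 2, 3, NA)

a  <- power.mean(x, 2, na.rm = TRUE)          # x and p matched by position
b  <- power.mean(p = 2, x = x, na.rm = TRUE)  # matched by name; order is free
d0 <- power.mean(x, na.rm = TRUE)             # p falls back to its default, 1

c(a = a, b = b, d0 = d0)
```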


Vectors

Numeric vectors are fairly straightforward.

There are basically two other kinds of vectors (complex vectors also exist):

- Character
- Logical

Character vectors have elements made up of character strings; e.g., names <- c('Small', 'Medium', 'Large')

Logical vectors have elements TRUE or FALSE, and are very useful for indexing data sets.

An example of how to get a logical vector:

> gpa <- c(3.0, 2.8, 3.4, 3.7, 3.9, 3.3)
> gpa > 3.5
[1] FALSE FALSE FALSE TRUE TRUE FALSE


Vectors (cont.)

Three functions to create vectors:

- c(): concatenate
- seq(): patterned sequence
- rep(): repeat something

A vector must contain elements of the same “type”, so what happens if two variables x and y of different types are concatenated?

The general answer is that they are coerced to a common type, the most general one present: logical values become numeric, and anything combined with a character string becomes character.

For example:

> c(FALSE, 7)
[1] 0 7
> c(11.7, 'abc')
[1] "11.7" "abc"


Vectors (cont.)

An interesting feature of R is that it does “vectorized arithmetic.”

That is, R will apply arithmetic operations (and some other functions) to vectors element by element.

For example:

> x <- c(7, 10, 11)

> y <- seq(5, 3, by=-1)

> x + y

[1] 12 14 14

If the two vectors are not of the same length, the shorter one gets “recycled”; R issues a warning (not an error) if the length of the longer vector is not a multiple of the length of the shorter vector.

When defining your own functions, remember to be careful about assuming they will vectorize how you want them to!
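A short sketch of recycling and of a function that does not vectorize by itself. The function pos.part is made up for this example.

```r
x <- 1:6
x + c(10, 20)                  # c(10, 20) is recycled three times

# lengths 6 and 4: a result is still returned, but with a warning
msg <- tryCatch(1:6 + 1:4, warning = function(w) conditionMessage(w))
msg

# if() looks at a single value, so this function is not vectorized
pos.part <- function(t) if (t > 0) t else 0

# Vectorize() wraps it in an element-by-element loop
pos.part.v <- Vectorize(pos.part)
v1 <- pos.part.v(c(-1, 2, -3, 4))

# often a built-in vectorized function already does the job
v2 <- pmax(c(-1, 2, -3, 4), 0)
```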


Matrices and arrays

A natural extension of a vector is a matrix, which is just a vector with a double index.

Example: M <- matrix(1:6, nrow=3, ncol=2).

In R, matrices are almost always treated just like vectors. (The only time R treats matrices in a linear algebra sort of way is when the user asks R to do something “linear algebra like” such as matrix multiplication.)

rbind() and cbind() functions can be used to append two or more matrices by rows and columns, respectively.

Can name the rows and columns with rownames() and colnames() functions.

More generally, R can work with an array (a vector with n indices), but these are a bit less common, perhaps because they’re hard to visualize.
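A brief sketch of these functions in use. The row and column names are arbitrary.

```r
M <- matrix(1:6, nrow = 3, ncol = 2)     # filled column by column
rownames(M) <- c("a", "b", "c")
colnames(M) <- c("u", "v")
M

M2 <- cbind(M, w = 7:9)                  # append a column: 3 x 3
M3 <- rbind(M, d = c(0, 0))              # append a row: 4 x 2

M * 2                                    # arithmetic acts on M as on a vector

A <- array(1:24, dim = c(2, 3, 4))       # a vector with three indices
A[2, 3, 4]
```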


Data frames

A data frame is R’s version of what we think of as a data matrix or data set.

The columns represent variables and the rows represent cases.

This idea is similar to a matrix, but matrices must be entirely of the same type, while data frames can have a mixture of numeric, character, and logical variables.

To create a data frame: D <- data.frame(list-of-variables)

We’ll talk about reading files into a data frame later.

Many statistical routines in R (e.g., linear regression) are designed to operate on data frames.
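A minimal sketch of a data frame with mixed column types. The variables and values are made up.

```r
D <- data.frame(
  name    = c("Ann", "Bob", "Cy"),       # character
  age     = c(23, 17, 41),               # numeric
  student = c(TRUE, TRUE, FALSE),        # logical
  stringsAsFactors = FALSE               # keep name as character in any R version
)

str(D)          # one line per variable, with its type
D$age           # a single variable, as a vector
nrow(D); ncol(D)
```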


Lists

The list structure is quite different and (as far as I know) unique to R.

A list in R is exactly what its name suggests: a list of objects.

The distinguishing feature is that a list can contain (almost?) any kind of object in R.

For example, objects in the list can be vectors, matrices, and even functions.

Syntax: mylist <- list(list-of-objects).

For example:

> M <- matrix(c(2, 5, 7, 7), nrow=2)

> f <- function(x) log(x)+x^2

> mylist <- list(mymat=M, myfun=f)

> mylist$myfun(mylist$mymat)


Indexing

Given a vector/matrix/array/data frame/list, we would like to be able to pick off certain values.

We need to understand how these objects are indexed.

For example, if M is a matrix, then M[i,j] refers to the value in the i-th row and j-th column of M.

Also, M[,j] returns the j-th column of M as a vector.

Data frames are indexed similarly, and vectors take a single index, as in x[i].

We’ve seen that objects of a list are indexed by their name and a $ sign.

Example: What is mylist$mymat[2,2]?

Example: What is mylist$mymat[-1,]?


Subsetting

By thinking of indexing in terms of logical variables, we can extend this idea to allow for subsetting of our object.

For example, consider the code M[1,2].

This is equivalent to defining two logical vectors:

> row.log <- (1:nrow(M)) == 1

> col.log <- (1:ncol(M)) == 2

Then M[1,2] is equivalent to M[row.log, col.log].

To generalize, we can define any sort of logical variable we like and apply it as above.

For example, suppose a data frame D has a variable age. To get only those rows for adults, use the code D[D$age > 19, ]; the logical vector goes in the row position, before the comma.

There’s another generalization of the row/column indexing:

> x <- seq(5, 25, by=5)

> x[c(2,3)]

[1] 10 15
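A small sketch of logical subsetting on a data frame. The data and the cutoff are hypothetical.

```r
D <- data.frame(id = 1:5, age = c(12, 25, 19, 40, 33))

keep <- D$age > 19            # logical vector, one entry per row
adults <- D[keep, ]           # rows where keep is TRUE, all columns
adults

# conditions can be combined, and columns picked by name
D[D$age > 19 & D$id != 4, "id"]

# which() turns the logical vector into row numbers
which(keep)
```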


Implicit loops

In many cases, we want to obtain some kind of summary of the rows and/or columns of a matrix or data frame (or list).

Such a process requires scanning the particular dimension of the object and applying the function each time.

R functions apply() and its variations do this directly.

For lists, use lapply() or sapply(); for matrices or data frames use apply().

Syntax: Suppose x and y are two numeric vectors.

> mylist <- list(var1=x, var2=y)

> lapply(mylist, mean)

> sapply(mylist, mean)

> mymat <- cbind(var1=x, var2=y)

> apply(mymat, 2, mean, na.rm=TRUE)


Sorting

Sorting a single vector is easy — use sort(x).

But what if you want to sort the rows of a matrix by a particular column?

The goal is to sort the rows of a data frame D by column 1.

Use the order() function:

> o <- order(D[,1])

> D.sorted <- D[o,]

Using o <- order(D[,1], D[,2]) would sort D by the first column, breaking ties using the second.
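A tiny sketch with ties in the first column. The data are made up.

```r
D <- data.frame(g = c("b", "a", "b", "a"), v = c(3, 9, 1, 4))

o <- order(D$g, D$v)          # by g first, then by v within equal g
o
D.sorted <- D[o, ]
D.sorted

# decreasing order on a single column
D.desc <- D[order(D$v, decreasing = TRUE), ]
```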


Workspace and directories

When you fire up R, you’ll have a working directory.

To view this directory, type getwd().

Change the directory by typing setwd('mydir').

To view the objects in the workspace, type ls().

To remove an object X from workspace, type rm(X).


Workspace and directories (cont.)

Save workspace to the working directory: save.image().

This command saves all the objects in the current workspace to a file .RData in the working directory; you can specify a different filename if you like.

To save objects x, y and z in file myfile.Rd, type

save(x, y, z, file='myfile.Rd')

Load the saved file: load(file='myfile.Rd')

Note: Saving the workspace does not save the output!


Saving input and output

The save command saves objects in the workspace, but does not save either the R input or output.

R input (commands) can be stored in an external file, called a script, say myscript.R.

Run commands in script: source('myscript.R').

To begin a session where the output is stored in a file myfile, type sink('myfile').

From then on, when an expression is evaluated, nothing is printed to the output terminal; instead, everything is printed to the file myfile until the user types sink().
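A minimal sketch of the sink() mechanism, writing to a temporary file so nothing is left behind in the working directory. The name out.file is made up.

```r
out.file <- tempfile(fileext = ".txt")

sink(out.file)                     # start diverting output to the file
print(summary(c(1, 2, 3, 10)))     # goes to the file, not the terminal
cat("done\n")
sink()                             # back to the terminal

captured <- readLines(out.file)    # look at what was stored
captured
```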


Getting help

In R, type help(mean) to see some documentation for the function mean. You can also type ?mean for short.

Typing help.start() will open the HTML help system in a browser, with searching capabilities, etc.; help.search('mean'), or ??mean for short, searches the installed documentation for a topic.

Google searches are very helpful too.

Extensive documentation is available online or in your installation; see “An Introduction to R” and “Writing R Extensions” (probably more than you’ll ever need to know about R).


Packages

Thousands of extra packages are available that contain specialized functions and data sets.

These functions often contain compiled (C or Fortran) code.

Look at the CRAN repository for a list (with descriptions) of available packages.

To install a package pkg, type install.packages('pkg') and follow the instructions.

Once the package is installed, to access its contents type library(pkg).

The objects within this package are now available for use.


Introduction

One of the main advantages of R is its graphical capabilities.

This includes having a number of built-in graphical procedures, as well as giving the user the flexibility to produce his/her own plots.

Here we’ll see a few examples of the available graphical tools, with some focus on how to annotate graphs for presentation.

Note: It is possible to directly produce PDF or Postscript graphics for inclusion in LaTeX files.
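A minimal sketch of writing a plot straight to a PDF file. A temporary file is used here; in practice one would give a name such as 'myplot.pdf' (a made-up name) in the working directory.

```r
fig.file <- tempfile(fileext = ".pdf")

pdf(fig.file, width = 6, height = 4)   # open a PDF graphics device
plot(1:10, (1:10)^2, type = "b", xlab = "x", ylab = "x squared")
dev.off()                              # close the device; the file is now complete

file.exists(fig.file)
```

The postscript() device is used in the same open, plot, close pattern.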

31 / 56

Page 32: Stat 451 Lecture Notes 0112 Introductionrmartin/OldCourses/Stat451/Notes/451notes01.pdfStat 451 Lecture Notes 0112 ... 1Based on parts of: Dalgaard’s ISwR book, Chapter 1 in Givens

Scatterplots

The following code contains lots of new ideas.

Take notice of where the labels are printed!

x <- runif(50, 0, 2)
y <- runif(50, 0, 2)
plot(x, y, xlab='x-label', ylab='y-label',
     main='Main Title', sub='subtitle')
text(0.6, 0.6, 'text at (0.6,0.6)')
abline(h=0.6, v=0.6, lty=2)
for(s in 1:4) mtext(-1:4, side=s, at=0.7, line=-1:4)
mtext(paste('side', 1:4), side=1:4, line=-1, font=2)


Histograms

Histograms are very easy in R.

The basic command is hist(X), where X is the variable you want to draw the histogram of.

There are a number of options to customize this plot.

You can add lines to the plot with curve().

Add a legend to label the various curves.

See the mean.med.hist function in the code online.
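The course function mentioned above is not reproduced here. As a separate, hypothetical sketch of the same ingredients (histogram, overlaid curve, legend), using simulated gamma data:

```r
set.seed(1)
X <- rgamma(200, shape = 2, rate = 1)    # simulated, right-skewed data

# freq = FALSE puts the histogram on the density scale
h <- hist(X, freq = FALSE, breaks = 20, col = "gray",
          main = "Histogram of X", xlab = "x")

# overlay the true density and mark the sample mean and median
curve(dgamma(x, shape = 2, rate = 1), add = TRUE, lwd = 2)
abline(v = c(mean(X), median(X)), lty = c(2, 3))

legend("topright", legend = c("true density", "mean", "median"),
       lty = c(1, 2, 3), lwd = c(2, 1, 1))
```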


Boxplots

Nice way to visualize the location and spread of a distribution.

Especially useful for comparing two or more distributions.

Basic syntax is boxplot(X) where X is a numeric vector or a list that contains multiple numeric vectors.

Can do some similar sorts of customization.

See the function mean.med.comp in the code online.


Flow control 1: If-then-else

An important part of programming is conditional execution of commands, and this is accomplished through the if-then-else structure.

Basic syntax:

if(condition1) {

## Do something

} else if(condition2) {

## Do something else

} else {

## Do another thing

}

Inside the if() is a single logical value, TRUE or FALSE.

Logical variables can be “combined” with the usual Boolean operators: & (and), | (or), ! (not). Inside if(), where one value is needed, the scalar forms && and || are the usual choice.

To compare two variables, type A == B or A != B.
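A short sketch of the structure in use. The function sign.label is made up for this example.

```r
sign.label <- function(x) {
  if (x > 0) {
    "positive"
  } else if (x < 0) {
    "negative"
  } else {
    "zero"
  }
}

sign.label(-3)
sign.label(0)

# if() handles one value at a time; for a whole vector use ifelse()
x <- c(-1, 0, 2)
labs <- ifelse(x > 0, "positive", "not positive")

# & works element by element on logical vectors
inside <- (x > -1) & (x < 2)
```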


Flow control 2: Loops

The three major players are for(), while(), and repeat.

Illustrate while() and repeat with an example of computing the square root of a non-negative number.

y <- 12345

x <- y / 2

while(abs(x*x - y) > 1e-10) x <- (x + y / x) / 2

print(x) # based on while()

x <- y / 2

repeat {

x <- (x + y / x) / 2

if(abs(x*x - y) < 1e-10) break

}

print(x) # based on repeat


Flow control 2: Loops (cont.)

The for() loop is by far the most common looping structure.

It is typical to run the loop with a counter, stopping once the counter reaches its maximum value:

for(i in 1:n) { ## Do something }

In particular:

x <- seq(0, 1, by=0.05)

plot(x, x, type='l')

for(j in 2:5) lines(x, x^j)

But there’s a bunch of other things one can do too; e.g.,

for(i in (1:10)^4)

for(j in c(2,5,7))

for(var in names(data))

for(f in c(sin, cos, tan))


Avoiding loops

Knowing when (and how) to avoid loops is as important as knowing how to use them.

Loops are very easy to program in R, but can run very slow depending on the application.

Often a version of the apply function is better.

Example: find the maximum value in each column of X

Do this

max.X <- apply(X, 2, max)

don’t do this

max.X <- rep(NA, ncol(X))

for(j in 1:ncol(X)) max.X[j] <- max(X[,j])

apply is sometimes much faster, other times not much faster; but it’s always much cleaner!
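For some common summaries there is a third option, a dedicated built-in such as colMeans(). The sketch below only checks that the three approaches agree on a simulated matrix; no timings are claimed, since those depend on the problem.

```r
set.seed(3)
X <- matrix(rnorm(2000), nrow = 200)     # 200 x 10 simulated matrix

m.apply <- apply(X, 2, mean)             # implicit loop over columns
m.built <- colMeans(X)                   # dedicated built-in

m.loop <- numeric(ncol(X))               # explicit loop, for comparison
for (j in seq_len(ncol(X))) m.loop[j] <- mean(X[, j])

all.equal(m.apply, m.built)
all.equal(m.apply, m.loop)
```

To compare speed on a given problem, wrap each version in system.time().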


Reading data 1: scan

Aside from typing in data directly with c(...), the simplest way to read in data is with the scan command.

Can read in vector or list objects.

If file.dat is a text file containing numeric data, typing X <- scan(file='file.dat') will read the data from this file and store it in the vector X; for character data, add the argument what=''.

Lists can also be done but the syntax is weird; go to help(scan) for details.
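A self-contained sketch: the files are created on the spot as temporary files, so the names num.file and chr.file are stand-ins for a real data file.

```r
# a small numeric file, values separated by white space or line breaks
num.file <- tempfile(fileext = ".dat")
writeLines(c("1.5 2.5", "3.5"), num.file)
X <- scan(file = num.file, quiet = TRUE)
X

# character data: tell scan() what to expect with what = ""
chr.file <- tempfile(fileext = ".dat")
writeLines("red green blue", chr.file)
W <- scan(file = chr.file, what = "", quiet = TRUE)
W
```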


Reading data 2: read.table

In statistical applications, we usually have several variables measured on a number of cases.

In such cases, a data frame is the most convenient data type.

By default, read.table looks for data points arranged in columns, with white space (one or more spaces or tabs) separating two values. (It’s easy to change this default with the sep argument.)

With an explicit separator such as a comma, two delimiters next to one another mean the value is missing; for a numeric variable R enters NA.

Suppose we have a file data.dat that contains data points separated by commas, with a header row containing the variable names. Then the syntax is

read.table(file='data.dat', header=TRUE, sep=',')


Reading data 2: read.table (cont.)

Things to consider when reading a file:

- header line
- separator/delimiter
- quotes
- missing values
- unfilled lines
- white space in character fields
- comments
- ...

Check out help(read.table) for details.

There are also some special shortcut functions, such as read.csv and read.delim, that read comma and tab delimited data files by default.


Writing data to a file

In some cases we will have an output data set that we would like to write to a file, perhaps for someone else to analyze using different software.

For a “rectangular” data object X in R, we can write this to a text file with the write.table command.

The syntax is basically the same as that of read.table.

Note that R will first coerce X to a data frame, so that it’s possible to include headers.
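A round-trip sketch: write a small data frame to a comma-separated file, then read it back. The data and the temporary file are made up for the example.

```r
D <- data.frame(id = 1:3, score = c(90.5, NA, 72))

dat.file <- tempfile(fileext = ".dat")
write.table(D, file = dat.file, sep = ",", row.names = FALSE, quote = FALSE)

readLines(dat.file)      # header line, then one line per case

D2 <- read.table(file = dat.file, header = TRUE, sep = ",")
D2
all.equal(D, D2)         # the missing value survives the round trip as NA
```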


Outline

1 Review of statistical inference

2 Introduction to R

3 Math and stat tools
- Probability stuff
- Statistical methods
- Linear algebra


Combinatorics

For “counting problems,” we’d like to have built-in functions to calculate “combinations” and factorial.

factorial(x) returns x! for integer x.

choose(n,k) returns the binomial coefficient $\binom{n}{k}$.

Related functions are gamma, lgamma, digamma, etc.; these are the gamma function, the log-gamma function, and the derivative of the log-gamma function, respectively.
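A few quick checks of how these functions relate, plus one reason lgamma is useful: large factorials overflow, while their logarithms do not.

```r
factorial(5)
choose(5, 2)
choose(5, 2) == factorial(5) / (factorial(2) * factorial(3))

gamma(6)                 # gamma(n + 1) equals n! for integer n

big <- factorial(171)    # too large for double precision: Inf

# log of choose(200, 100), computed on the log scale
lc <- lgamma(201) - lgamma(101) - lgamma(101)
c(lc, lchoose(200, 100))
```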


Random sampling

The sample function can be used to take random samples from a finite set.

If X is a vector, then sample(X) will generate a random permutation of the elements in X.

For integer n, sample(n) and sample(1:n) are equivalent.

Options: size=k or replace=TRUE or...

If X is a matrix/data frame with 10 columns, then X[,sample(10,size=7)] will create a new matrix containing 7 of the original columns in random order.
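A short sketch of these uses. The seed is fixed only so the example is reproducible, and the objects are made up.

```r
set.seed(451)

perm <- sample(10)                       # random permutation of 1:10
perm

# 20 draws with replacement from a two-element set
flips <- sample(c("H", "T"), size = 20, replace = TRUE)
table(flips)

# 7 of 10 columns, in random order
X <- matrix(1:30, nrow = 3)
cols <- sample(10, size = 7)
X7 <- X[, cols]
dim(X7)
```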


Probability distributions

R has a number of built-in functions to do probability calculations for random variables.

Built in stuff for normal, binomial, Poisson, exponential, gamma, uniform, hypergeometric,...

Let dist be the abbreviation for a generic distribution; for example norm for normal.

ddist = compute pdf of dist

pdist = compute cdf of dist

qdist = compute inverse cdf of dist

rdist = generate random variables from dist

dist can be norm, binom, pois, exp, gamma, unif,...

Look at the help files for the parametrizations.
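The four prefixes in action, for the normal and exponential distributions. The particular arguments are arbitrary.

```r
d0 <- dnorm(0)                    # density of N(0,1) at 0
p1 <- pnorm(1.96)                 # P(Z <= 1.96)
q1 <- qnorm(0.975)                # the 0.975 quantile; inverse of pnorm

set.seed(1)
z <- rnorm(5, mean = 10, sd = 2)  # five draws from N(10, 2^2)

# p and q functions undo each other
back <- qnorm(pnorm(1.3))

# parametrization matters: pexp uses the rate, not the mean
pe <- pexp(1, rate = 2)
c(pe, 1 - exp(-2))
```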


Probability distribution example

Draw plots of a binomial pdf (technically a pmf) and cdf.

n <- 25
p <- 0.4
plot(x=0, y=0, type='n', xlim=c(0,n), ylim=c(0,1),
     xlab='x', ylab='PDF and CDF')
lines(x=0:n, y=pbinom(0:n, n, p), type='s', lwd=2,
      col='gray')
lines(x=0:n, y=dbinom(0:n, n, p), type='h', lwd=2)
legend('right', inset=0.05, lwd=2, col=c('black','gray'),
       c('PDF', 'CDF'))


Quick summary

R is designed for statistical analysis so, naturally, it has built-in functions that do many of the standard statistical methods.

For example:

- t.test (obviously) does t-tests
- lm does linear models (e.g., ANOVA, regression, etc.)
- glm for generalized linear models (e.g., logistic regression)

Some examples in the code online.

Here, in Stat 451, the goal is to learn how to carry out these computations, so we will avoid using the built-in functions, except for checking our answers. (Of course, outside Stat 451, it is best to use built-in functions to do these standard things.)
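In that spirit, a small sketch: a one-sample t-test computed by hand on simulated data, with t.test used only to check the answer. The data and names are made up.

```r
set.seed(7)
y <- rnorm(25, mean = 0.3)               # simulated sample
n <- length(y)

# by hand: test H0: mu = 0 against a two-sided alternative
t.stat <- (mean(y) - 0) / (sd(y) / sqrt(n))
p.val  <- 2 * pt(-abs(t.stat), df = n - 1)
ci     <- mean(y) + c(-1, 1) * qt(0.975, df = n - 1) * sd(y) / sqrt(n)

# built-in, for checking
builtin <- t.test(y, mu = 0)
c(by.hand = t.stat, builtin = unname(builtin$statistic))
```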


Matrix arithmetic

Let A and B be two matrices of suitable dimension. (You need to be careful about making sure the matrix dimensions are correct, since some of the arithmetic operations will “vectorize” and can give unexpected results.)

Adding and subtracting matrices is obvious.

What about A * B or A / B? These operate element by element, like vector arithmetic.

Matrix multiplication requires a different symbol: A %*% B.

We’ll talk about matrix inversion below.
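A two-by-two sketch of the difference. The matrices are arbitrary.

```r
A <- matrix(1:4, nrow = 2)               # columns (1,2) and (3,4)
B <- matrix(c(0, 1, 1, 0), nrow = 2)     # swaps two coordinates

elem <- A * B        # element by element
prod <- A %*% B      # matrix product: here, A with its columns swapped

elem
prod

t(A)                 # transpose
AtA <- crossprod(A)  # same as t(A) %*% A
```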


More matrix things

det(M) will return the determinant of M.

diag(M) will do one of two things:

- If M is a matrix, then diag(M) is a vector filled with the diagonal entries of M;
- if M is a vector, then diag(M) will be a diagonal matrix with vector M on the diagonal.

Solving a linear system, Ax = b for x: solve(A, b).

Matrix inversion:

- If M is invertible, then solve(M) is the inverse;
- if M is not invertible, then ginv(M) returns a generalized inverse. (Requires the MASS library.)
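A small sketch of these functions. The matrices are arbitrary, and the ginv part runs only if the MASS package is installed.

```r
A <- matrix(c(2, 1, 1, 3), nrow = 2)
b <- c(1, 2)

det(A)
diag(A)                      # diagonal entries of a matrix
diag(c(1, 2))                # diagonal matrix from a vector

x <- solve(A, b)             # solves A x = b without forming the inverse
A.inv <- solve(A)            # the inverse itself
A %*% A.inv                  # identity, up to rounding

# a singular matrix has no inverse, but it has a generalized inverse G
# satisfying S G S = S
S <- matrix(1, nrow = 2, ncol = 2)
if (requireNamespace("MASS", quietly = TRUE)) {
  G <- MASS::ginv(S)
  print(all.equal(S %*% G %*% S, S))
}
```

When only the solution of a system is needed, solve(A, b) is generally preferable to solve(A) %*% b.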


Matrix decompositions

Spectral theorem says that if $M$ is a symmetric $d \times d$ matrix, then there exists a diagonal matrix $\Lambda$ and an orthogonal matrix $U$ such that $M = U \Lambda U^\top$; if $M$ is also positive definite, the diagonal entries of $\Lambda$ are all positive.

Diagonal entries of $\Lambda$ are eigenvalues of $M$ and the columns of $U$ are the corresponding eigenvectors.

R gives this decomposition of M with the function eigen(M).

There are other matrix decompositions of interest:

Cholesky decomposition: chol(M)

singular value decomposition: svd(M)

...
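A sketch that checks each decomposition by rebuilding the original matrix. The matrix is an arbitrary symmetric positive definite example.

```r
M <- matrix(c(4, 2, 2, 3), nrow = 2)

# spectral decomposition: M = U Lambda U'
e <- eigen(M, symmetric = TRUE)
U <- e$vectors
Lambda <- diag(e$values)
all.equal(U %*% Lambda %*% t(U), M)

# Cholesky: chol() returns the upper triangular factor R with R'R = M
R <- chol(M)
all.equal(t(R) %*% R, M)

# singular value decomposition: M = u diag(d) v'
s <- svd(M)
all.equal(s$u %*% diag(s$d) %*% t(s$v), M)
```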


Neat example: sweep operator

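Let $M = (m_{ij})$ be a symmetric positive definite matrix.

Sweeping on the $k$th diagonal entry returns a new matrix $\tilde M = (\tilde m_{ij})$ defined by

$$\tilde m_{kk} = -\frac{1}{m_{kk}}, \quad \tilde m_{ik} = \frac{m_{ik}}{m_{kk}}, \quad \tilde m_{kj} = \frac{m_{kj}}{m_{kk}}, \quad \tilde m_{ij} = m_{ij} - \frac{m_{ik}\, m_{kj}}{m_{kk}},$$

for $i \ne k$ and $j \ne k$.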

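Sweeping gives lots of nice properties of the matrix; see Chapter 7.5 in Lange.

In particular, with the sign convention above, sweeping $M$ successively along each diagonal entry (in any order) returns $-M^{-1}$, the negative of the inverse; for a $1 \times 1$ matrix this is just $-1/m_{11}$.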

See the function sweep in the online code.
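The course code is not reproduced here. As an independent sketch, the definition above can be coded directly; the name sweep.op is made up, chosen to avoid masking base R's unrelated sweep() function. With this sign convention, sweeping on every diagonal entry gives the negative of the inverse, so the check compares against -solve(M).

```r
# sweep the square matrix M on its k-th diagonal entry
sweep.op <- function(M, k) {
  d <- M[k, k]
  out <- M - outer(M[, k], M[k, ]) / d   # m_ij - m_ik * m_kj / m_kk
  out[, k] <- M[, k] / d                 # k-th column: m_ik / m_kk
  out[k, ] <- M[k, ] / d                 # k-th row:    m_kj / m_kk
  out[k, k] <- -1 / d                    # pivot entry: -1 / m_kk
  out
}

# an arbitrary symmetric positive definite matrix
M <- matrix(c(4, 2, 1,
              2, 3, 0.5,
              1, 0.5, 2), nrow = 3)

S1 <- M
for (k in 1:3) S1 <- sweep.op(S1, k)          # sweep in order 1, 2, 3

S2 <- M
for (k in c(3, 1, 2)) S2 <- sweep.op(S2, k)   # a different order

all.equal(S1, -solve(M))
all.equal(S1, S2)
```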
