Apply Functions - The UK's Premier R User Group · 2018. 4. 10.

Richard Pugh, Commercial Director [email protected] Apply Functions


Agenda

• Why Apply?

• The “apply” family of functions

• Primary Functions

• Other Functions

• The “plyr” package

• Summary

Why Apply?

• The functions we use are simple things …

• We need a framework that lets us apply simple functions in a more structured way

• For example, mean(x, trim = 0, na.rm = TRUE) has no “by” argument
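The point about the missing “by” argument can be sketched as follows. The data are hypothetical, and tapply is used here only as one member of the apply family that supplies the grouping.

```r
# Hypothetical data: six measurements in two groups.
x <- c(1, 2, 3, 4, 5, 6)
g <- factor(c("a", "a", "a", "b", "b", "b"))

# mean() summarises the whole vector; it has no "by" argument.
overall <- mean(x, trim = 0, na.rm = TRUE)

# An apply-style function supplies the grouping that mean() lacks.
by_group <- tapply(x, g, mean, na.rm = TRUE)
```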

The “Apply” family of Functions

Which Functions?

• Getting a full list is tricky

• Most challenging aspect: remembering which function to use
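One rough way to look for candidates, offered as a sketch rather than a complete answer, is to search attached packages by name with apropos. It only finds names ending in "apply", so functions such as by, aggregate and replicate are missed, which illustrates why a full list is hard to get.

```r
# apropos() returns the names of objects on the search path that match a pattern.
apply_like <- apropos("apply$")
print(apply_like)
```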

Primary Apply Functions

The apply Function

• Applies a simple function over dimensions of a data structure

• So the input is anything that has rows and columns (matrix, data frame, array, but NOT lists or plain vectors)

• Arguments, as in apply(X, MARGIN, FUN, ...):

  • X: structure with a dimension

  • MARGIN: dimension number (1 = rows, 2 = columns)

  • FUN: function to apply

  • ...: additional arguments to the function

Basic apply Example
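The original slide content is not available here; a minimal sketch of a basic call, using a small made-up matrix, might look like this.

```r
# Hypothetical 2 x 3 matrix, filled by column: row 1 is 1 3 5, row 2 is 2 4 6.
m <- matrix(1:6, nrow = 2, ncol = 3)

row_sums <- apply(m, 1, sum)  # MARGIN = 1: one result per row    -> 9 12
col_sums <- apply(m, 2, sum)  # MARGIN = 2: one result per column -> 3 7 11
```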

Mix it with graphics …

Apply with multiple dimensions

Apply over arrays
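As a sketch with a made-up 2 x 3 x 4 array: MARGIN can name any single dimension, or a vector of dimensions to keep.

```r
# Hypothetical array holding the numbers 1 to 24 in four 2 x 3 slices.
arr <- array(1:24, dim = c(2, 3, 4))

# Keep dimension 3: one total per slice (length 4).
slice_totals <- apply(arr, 3, sum)

# Keep dimensions 1 and 2: a 2 x 3 matrix, summing across the slices.
cell_totals <- apply(arr, c(1, 2), sum)
```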

Passing Additional Arguments
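Anything after the function in the call is handed on to that function. A small sketch, assuming a made-up matrix with one missing value:

```r
# Hypothetical matrix: row 1 is 1 3 5, row 2 is NA 4 6.
m <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 2)

plain <- apply(m, 1, mean)                    # second row mean is NA
with_narm <- apply(m, 1, mean, na.rm = TRUE)  # na.rm is passed through to mean()
```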

Using Custom Functions
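The function need not be built in. In this sketch, value_range is a hypothetical helper written for the example; the same logic is also shown as an anonymous function.

```r
# Hypothetical matrix: row 1 is 1 2 3, row 2 is 4 8 12.
m <- matrix(c(1, 4, 2, 8, 3, 12), nrow = 2)

# A named custom function ...
value_range <- function(v) max(v) - min(v)
col_ranges <- apply(m, 2, value_range)

# ... or an anonymous function written inline.
row_ranges <- apply(m, 1, function(v) max(v) - min(v))
```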

A Quick Example

The lapply Function

• Applies a function over elements of a list

• Remember: a data frame is a list of named vectors!

• It always returns a list

• Arguments, as in lapply(X, FUN, ...):

  • X: list structure

  • FUN: function to apply

  • ...: additional arguments to the function

Simple lapply Example
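A minimal sketch with made-up data, showing both a plain list and a data frame treated as a list of columns:

```r
# Hypothetical list of numeric vectors.
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)
means <- lapply(scores, mean)   # list(a = 2, b = 15, c = 5)

# A data frame is a list of columns, so lapply works column by column.
df <- data.frame(x = 1:3, y = c(2, 4, 6))
col_max <- lapply(df, max)
```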

Using split to generate a list

• Will split an object (vector, data frame) based on 1 or more factors

• Great input to lapply!

Using split and lapply
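A sketch of the split-then-lapply pattern, assuming a small made-up data frame with a grouping column:

```r
# Hypothetical data: a value measured in two groups.
df <- data.frame(
  group = c("a", "b", "a", "b", "a"),
  value = c(1, 10, 3, 20, 5)
)

# Split a vector by group, then summarise each piece.
pieces <- split(df$value, df$group)   # a: 1 3 5, b: 10 20
group_means <- lapply(pieces, mean)   # a: 3, b: 15

# Splitting the whole data frame gives a list of smaller data frames.
rows_per_group <- lapply(split(df, df$group), nrow)
```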

Using lapply with vectors
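A vector can be passed directly; each element is handled in turn and the result is still a list. A minimal sketch:

```r
# Each element of 1:3 is passed to the function as i.
squares <- lapply(1:3, function(i) i^2)

# unlist() flattens the result when a plain vector is wanted.
squares_vec <- unlist(squares)
```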

Split processing using lapply

The sapply Function

• Calls lapply and simplifies the output

sapply vs lapply
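The difference can be sketched with the same made-up input passed to both functions. How far sapply simplifies depends on what the function returns.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))

as_list <- lapply(scores, mean)    # always a list
as_vector <- sapply(scores, mean)  # simplified to a named numeric vector

# When every result has length 2, sapply simplifies to a matrix,
# with one column per input element.
ranges <- sapply(scores, range)
```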

Using sapply and split

Bootstrap style sapply example
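The original example is not available here. The following is a sketch of one common bootstrap pattern, assuming a made-up sample x, an arbitrary seed and 1000 resamples; it is not necessarily what the slide showed.

```r
set.seed(42)                                    # arbitrary seed, for reproducibility only
x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9)  # hypothetical sample
n_boot <- 1000                                  # number of resamples (assumed)

# For each iteration, resample x with replacement and record the mean.
boot_means <- sapply(seq_len(n_boot), function(i) {
  resample <- sample(x, size = length(x), replace = TRUE)
  mean(resample)
})

boot_se <- sd(boot_means)                         # bootstrap standard error
boot_ci <- quantile(boot_means, c(0.025, 0.975))  # percentile interval
```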

Using sapply with data frames
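Because a data frame is a list of columns, sapply gives a quick per-column summary. A sketch with a made-up data frame:

```r
df <- data.frame(
  height = c(150, 160, 170),
  weight = c(50, 60, 70),
  name = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

col_classes <- sapply(df, class)             # one class name per column
numeric_cols <- sapply(df, is.numeric)       # logical flag per column
col_means <- sapply(df[numeric_cols], mean)  # means of the numeric columns only
```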

The tapply Function

• Applies a function to a vector BY levels of other factor(s)

• A wrapper around lapply and split

• Arguments, as in tapply(X, INDEX, FUN, ...):

  • X: vector to summarise

  • INDEX: factor(s) by which to summarise

  • FUN: function to apply

  • ...: other arguments to the function

Simple examples of tapply
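A sketch with made-up data, first with one factor and then with two:

```r
value <- c(1, 10, 3, 20, 5, 30)
group <- factor(c("a", "b", "a", "b", "a", "b"))
sex <- factor(c("f", "f", "m", "m", "f", "m"))

# One factor: one summary per level.
group_means <- tapply(value, group, mean)

# Two factors (given as a list): a table with one cell per combination.
two_way <- tapply(value, list(group, sex), sum)
```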

tapply vs sapply + split
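The two routes can be compared directly. In this sketch (made-up data) the values and names agree; the visible difference is that tapply returns a one-dimensional array while sapply returns a plain named vector.

```r
value <- c(1, 10, 3, 20, 5, 30)
group <- factor(c("a", "b", "a", "b", "a", "b"))

via_tapply <- tapply(value, group, mean)
via_split <- sapply(split(value, group), mean)
```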

When tapply goes wrong

The by Function

• The tapply function is restricted to vector inputs

• The by function applies a function to subsets of a data frame, defined by the levels of one or more factors

• Arguments, as in by(data, INDICES, FUN, ...):

  • data: data frame to summarise

  • INDICES: factor(s) by which to summarise

  • FUN: function to apply

  • ...: other arguments to the function

Simple by example
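A sketch with a made-up data frame. The function receives one sub data frame per group, so it can use several columns at once.

```r
df <- data.frame(
  group = c("a", "b", "a", "b"),
  x = c(1, 2, 3, 4),
  y = c(10, 20, 30, 40)
)

# d is the subset of df for one level of group.
res <- by(df, df$group, function(d) colMeans(d[, c("x", "y")]))

first_group <- res[[1]]   # column means for group "a"
second_group <- res[[2]]  # column means for group "b"
```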

An object of class by (example 1)

An object of class by (example 2)

The aggregate Function

• The aggregate function allows us to apply functions to one or more variables by one or more factors

• It always returns a data frame

• Arguments, as in aggregate(x, by, FUN, ...):

  • x: list of variables to summarise

  • by: list of factor(s) by which to summarise

  • FUN: function to apply

  • ...: other arguments to the function

Simple aggregate Example
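A sketch of the list-based interface with a made-up data frame. Naming the element of the by list (here group) names the grouping column in the result.

```r
df <- data.frame(
  group = c("a", "b", "a", "b"),
  x = c(1, 2, 3, 4),
  y = c(10, 20, 30, 40)
)

# Summarise columns x and y by group; the result is a data frame.
agg <- aggregate(df[, c("x", "y")], by = list(group = df$group), FUN = mean)
```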

Alternative aggregate Usage

• The formula interface to aggregate expresses the same summary: variables to summarise on the left of ~, factors on the right

• It always returns a data frame

• Arguments, as in aggregate(formula, data, FUN, ...):

  • formula: formula defining summary structure

  • data: data frame to use

  • FUN: function to apply

  • ...: other arguments to the function

Simple aggregate Example
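A sketch of the formula interface with a made-up data frame; cbind on the left-hand side is one way to summarise several variables at once.

```r
df <- data.frame(
  group = c("a", "b", "a", "b"),
  x = c(1, 2, 3, 4),
  y = c(10, 20, 30, 40)
)

# One variable summarised by one factor.
agg_one <- aggregate(x ~ group, data = df, FUN = mean)

# Several variables summarised by the same factor.
agg_both <- aggregate(cbind(x, y) ~ group, data = df, FUN = mean)
```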

Other “apply” Functions

Function    Usage
eapply      Apply a Function Over Values in an Environment
mapply      Apply a Function to Multiple List or Vector Arguments
rapply      Recursively Apply a Function to a List
vapply      Similar to sapply with a pre-specified “template” return value
replicate   Wrapper for sapply, for repeated evaluation of an expression

The eapply Function
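A minimal sketch, assuming a small environment created just for the example. The order of the returned list is not guaranteed, so elements are best accessed by name.

```r
# Hypothetical environment holding two objects.
env <- new.env()
assign("a", 1:3, envir = env)
assign("b", c(10, 20), envir = env)

# Apply length() to every object in the environment; returns a named list.
sizes <- eapply(env, length)
```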

The mapply Function
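A sketch of mapply stepping through two vectors in parallel; the inputs are made up.

```r
# The function is called with (1, 2), then (2, 2), then (3, 3).
powers <- mapply(function(base, exponent) base^exponent, 1:3, c(2, 2, 3))

# When results differ in length, mapply returns a list instead of a vector.
reps <- mapply(rep, 1:3, 3:1)
```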

The rapply Function
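A sketch using a made-up nested list. The classes argument restricts which leaves the function touches, and how controls the shape of the result.

```r
nested <- list(a = 1:3, b = list(c = 4:6, d = "text"))

# how = "replace": keep the nested structure, changing only integer leaves.
doubled <- rapply(nested, function(v) v * 2, classes = "integer", how = "replace")

# how = "unlist": flatten the results for the matching leaves into a vector.
totals <- rapply(nested, function(v) sum(v), classes = "integer", how = "unlist")
```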

The vapply Function
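A sketch of the “template” idea with made-up data: FUN.VALUE states what each result must look like, and a mismatch is an error rather than a silently different shape.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))

# Each result must be a single number.
means <- vapply(scores, mean, FUN.VALUE = numeric(1))

# range() returns two numbers, which does not fit the template, so vapply stops.
outcome <- tryCatch(
  vapply(scores, range, FUN.VALUE = numeric(1)),
  error = function(e) "error"
)
```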

The replicate Function
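A sketch of repeated evaluation, assuming an arbitrary seed and 100 repetitions: the expression is re-evaluated each time, so every repetition draws fresh random numbers.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Evaluate the expression 100 times and collect the results in a vector.
sample_means <- replicate(100, mean(rnorm(10)))
```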

The “plyr” package: Overview

• Written and maintained by Hadley Wickham

• Widely used in the R community:

  • 158 other packages depend on, import or suggest plyr

  • Most downloaded package from the RStudio CRAN mirror last month (19,546 downloads)

• Consists of tools for splitting data, applying a function to each part and then combining the results

Common plyr functions

• Named according to the data structure they split up and the data structure they return

• Available data structures are:

  • d = data frame, a = array, l = list, m = multiple inputs, r = repeat multiple times, _ = nothing

• For example, daply: takes a data frame (d), returns an array (a)

Some plyr alternatives to base R

• Provides an alternative to using the apply family of functions (covered so far today):

Base function   plyr function
apply           aaply / alply
lapply          llply
sapply          laply
tapply          n/a
by              dlply
aggregate       ddply + colwise

Simple lapply vs sapply example with plyr

plyr base
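The side-by-side code from the slide is not available here. A rough sketch of the comparison, with made-up data, is below; the plyr calls only run if the plyr package is installed, which is an assumption about the reader's setup.

```r
scores <- list(a = c(1, 2, 3), b = c(10, 20))

# Base R: lapply returns a list, sapply simplifies to a vector.
base_list <- lapply(scores, mean)
base_vec <- sapply(scores, mean)

# plyr equivalents: llply is list in, list out; laply is list in, array out.
if (requireNamespace("plyr", quietly = TRUE)) {
  plyr_list <- plyr::llply(scores, mean)
  plyr_vec <- plyr::laply(scores, mean)
}
```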

Why use plyr?

• Consistent function names make it easier to know which apply-type function is required

• Fast and memory efficient

• Uses parallelisation through the foreach package

• Additional “helper” functions included:

• arrange

• mutate

• summarise

• join

• match_df

• colwise

• rename

• round_any

• count

Summary

• This was a quick overview of the apply family of functions

• The key functions are apply, sapply, lapply and aggregate

• The “plyr” package functions can be used as an alternative to these functions
