Sweave: Reproducible Research using R and LaTeX

Sandra D. Griffith
Department of Biostatistics and Epidemiology
University of Pennsylvania
sgrif@upenn.edu

Biostatistics Computing Workshop Series
March 15, 2012

S. Griffith (sgrif@upenn.edu) Sweave 15 March 2012 1 / 20

Non-reproducible Research

• Characteristics
  - Prepare or manipulate data in a spreadsheet
  - Cut and paste output to create tables
  - Multiple versions of data and analysis scripts
  - Create many versions of graphics, selecting only one for final presentation of results

• Problems
  - Data, code, and results not linked
  - Any changes in analysis or data require manual regeneration of results
  - Workflow or organization scheme may change over time
  - Can be difficult to replicate in the future
  - Less forensic evidence if results are questioned

Response to Duke University Scandal

“We now require most of our reports to be written using Sweave, a literate programming combination of LaTeX source and R code (SASweave and odfWeave are also available) so that we can rerun the reports as needed and get the same results.”

Sweave: Conceptual Overview

• Link data, code, and results with a single .Rnw file
  - Similar to .tex file, but includes interspersed “chunks” of R code
  - Uses noweb syntax for literate programming

• Weave .Rnw file to produce .tex file which includes output from R code

• Compile TeX file to PDF or PS files as usual

• Tangle .Rnw file to extract R code into separate file

• In addition to including figures in the output, creates individual files for each figure

• Can refer to the values of R expressions in regular document text using \Sexpr{}
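
The weave and tangle steps above can be tried on a throwaway file. The following is a minimal sketch, not part of the original slides: the file name demo.Rnw, the chunk label calc, and the document contents are made up for illustration, and the work happens in a temporary directory.

```r
# Minimal sketch: write a tiny .Rnw file, weave it to .tex, tangle it to .R.
# "demo.Rnw" and its contents are hypothetical, chosen only for illustration.
workdir <- file.path(tempdir(), "sweave-demo")
dir.create(workdir, showWarnings = FALSE)
oldwd <- setwd(workdir)

rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "<<calc, echo=TRUE>>=",
  "x <- exp(2.3)",
  "x",
  "@",
  "The value of x is \\Sexpr{round(x, 2)}.",
  "\\end{document}"
)
writeLines(rnw, "demo.Rnw")

Sweave("demo.Rnw", quiet = TRUE)   # weave: creates demo.tex containing R output
Stangle("demo.Rnw", quiet = TRUE)  # tangle: creates demo.R with the R code only

tex <- readLines("demo.tex")       # \Sexpr{} has been replaced by its value here
setwd(oldwd)
```

Inspecting tex shows the chunk wrapped in Sweave's input and output environments and the \Sexpr{} call replaced by the rounded value of x.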

Getting Started with Sweave

• Assume R and LaTeX already installed

• Sweave.sty is already included with base R installation
  - Preferred method: include R folder containing Sweave.sty in your TeX path
    * Will automatically update style file when you update R
  - Copy Sweave.sty to a centralized location with other style files, also in your TeX path
    * Requires manual updates, but can be located in a central location shared among computers (e.g. Dropbox)
  - Hard path: include \usepackage{...\Sweave} in preamble
  - Copy Sweave.sty into same folder as each .Rnw file
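
To find the folder that holds Sweave.sty, R itself can be asked. This is a small sketch that assumes the directory layout used by standard R distributions (share/texmf/tex/latex under the R home directory); other installations may place the file elsewhere.

```r
# Sketch: locate the Sweave.sty that ships with R.
# The sub-path below is an assumption based on the standard R layout.
sty_path <- file.path(R.home("share"), "texmf", "tex", "latex", "Sweave.sty")

sty_path               # full path to the style file
file.exists(sty_path)  # TRUE when the file is where this sketch expects it
dirname(sty_path)      # the folder to add to the TeX path
```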

Anatomy of a Code Chunk

<< label (optional), options >>=

insert R code here

@

Commonly-used options (see manual for full list)

• echo = F

Suppress R input from appearing in document (default = T)

• eval = F

R code not evaluated (default = T)

• results = hide

Suppress R output from appearing in document (default = verbatim)

• results = tex

R output will be read as TeX (default = verbatim)

• fig = T

Code chunk includes a figure (default = F)

Global Options

Default options can be set in preamble and updated throughout document

• Set R chunk options
  \SweaveOpts{eval=T, echo=F}

• Preserve comments and spacing of echoed R code
  \SweaveOpts{keep.source=TRUE}

• Figure options for height, width, and file type
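
As an illustration of global figure options, the sketch below weaves a throwaway document whose preamble sets echo, width, height, and the PDF device once for all chunks. The file name opts.Rnw, the chunk label scatter, and the plotted data are made up; width, height, pdf, and eps are standard Sweave options.

```r
# Sketch: global options set once with \SweaveOpts apply to every later chunk.
# "opts.Rnw" and the chunk label "scatter" are hypothetical names.
workdir <- file.path(tempdir(), "sweave-opts")
dir.create(workdir, showWarnings = FALSE)
oldwd <- setwd(workdir)

writeLines(c(
  "\\documentclass{article}",
  "\\SweaveOpts{echo=FALSE, width=5, height=3.5, pdf=TRUE, eps=FALSE}",
  "\\begin{document}",
  "<<scatter, fig=TRUE>>=",
  "plot(1:10, (1:10)^2)",
  "@",
  "\\end{document}"
), "opts.Rnw")

Sweave("opts.Rnw", quiet = TRUE)

# Figure files are named <document>-<chunk label>.<type>
figs <- list.files(pattern = "\\.pdf$")
setwd(oldwd)
figs
```

Individual chunks can still override any of these settings in their own header.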

Example

<<echo=T>>=

x <- exp(2.3)

x

@

> x <- exp(2.3)

> x

[1] 9.974182

<<echo=F>>=

x <- exp(2.3)

x

@

[1] 9.974182

<<echo=T, results=hide>>=

x <- exp(2.3)

x

@

> x <- exp(2.3)

> x

Compiling an Sweave Document

• Manually (Windows or Mac)

1. Run Sweave("foo.Rnw") in R console
2. Open foo.tex in a TeX editor
3. Compile PDF using TeX editor
4. Stangle("foo.Rnw") to extract R code if desired

• Manually (Linux/Unix)

1. Run R CMD Sweave foo.Rnw

2. Run pdflatex foo or latex foo

• Integrated Development Environment (IDE)

  - RStudio, Emacs (ESS), Eclipse (StatET), etc.
  - If supported, usually one click/command for all steps (Sweave, compile TeX, view PDF)
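
The manual steps can also be chained from within R. The helper below is a sketch only: compile_rnw and its arguments are made-up names, and the LaTeX step relies on tools::texi2pdf(), which is attempted only when pdflatex is found on the PATH.

```r
# Sketch of the manual workflow as one helper; "compile_rnw" is a made-up name.
compile_rnw <- function(rnw, tangle = FALSE,
                        latex = nzchar(Sys.which("pdflatex"))) {
  stopifnot(file.exists(rnw))
  oldwd <- setwd(dirname(rnw))
  on.exit(setwd(oldwd))

  rnw_file <- basename(rnw)
  tex_file <- paste0(tools::file_path_sans_ext(rnw_file), ".tex")

  Sweave(rnw_file, quiet = TRUE)               # step 1: .Rnw -> .tex
  if (tangle) Stangle(rnw_file, quiet = TRUE)  # optional: extract the R code

  if (latex) {
    tools::texi2pdf(tex_file, clean = TRUE)    # step 2: .tex -> .pdf
  } else {
    message("LaTeX step skipped; woven file is ", tex_file)
  }
  invisible(file.path(getwd(), tex_file))      # path of the woven .tex file
}
```

A call such as compile_rnw("foo.Rnw", tangle = TRUE) would then weave, tangle, and, where LaTeX is available, produce the PDF in the same folder as foo.Rnw.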

RStudio

The xtable Package: Basic Table Code

R package to convert many R objects to LaTeX or HTML tables

<<label=tab:GenderRace, results=tex>>=

library(xtable)

data(tli)

xtable(table(tli$ethnicty, tli$sex),

caption="Distribution of gender and ethnicity")

@

<<label=tab:LM1, results=tex>>=

lm1 <- lm(tlimth ~ sex + ethnicty, data=tli)

xtable(lm1, caption="Linear Model Results")

@

The xtable package: Basic Table Output

           F   M
BLACK     11  12
HISPANIC   8  12
OTHER      2   0
WHITE     30  25

Table: Distribution of gender and ethnicity

                  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)        71.0226      3.2894    21.59    0.0000
sexM                3.3734      2.8594     1.18    0.2410
ethnictyHISPANIC   -3.7466      4.3044    -0.87    0.3863
ethnictyOTHER      18.4774     10.4716     1.76    0.0809
ethnictyWHITE       7.4622      3.4964     2.13    0.0354

Table: Linear Model Results

The xtable package: Customized Tables

> mat <- round(matrix(c(0.9, 0.89, 200, 0.045, 2.0),

+ nrow = 1, ncol = 5), 4)

> rownames(mat) <- "$y_{t-1}$"

> colnames(mat) <- c("$R^2$", "$\\bar{R}^2$",

+ "F-stat", "S.E.E", "DW")

> mat <- xtable(mat)

> print(mat, sanitize.text.function = function(x){x})

            $R^2$  $\bar{R}^2$  F-stat  S.E.E    DW
$y_{t-1}$    0.90         0.89  200.00   0.04  2.00

Almost all functionality available for LaTeX tables can be included directly in R code using xtable

Aside: Using xtable for MS Word Tables

Non-statistical collaborators often prefer tabular results in MS Word

library(xtable)
data(tli)
# file and type are arguments of print(), not of xtable()
print(xtable(table(tli$ethnicty, tli$sex)),
      file="TabGenderRace.htm",
      type="html")

1. Save results in HTML file using xtable() in R

2. Open “TabGenderRace.htm” in a browser

3. Copy and paste into Word document as a fully-formatted table

Basic Figure Example

<<fig=T, echo=F, width=5, height=3.5>>=

plot(1:10, rnorm(10))

@

[Figure: scatterplot of rnorm(10) against 1:10; x-axis labeled 1:10, y-axis labeled rnorm(10)]

NB: Embed figure chunk within a LaTeX figure environment for more precise control

Large or Computationally Intensive Projects

• Use input statements or make files

• save() and load() intermediate results

• Conditional evaluation
  if (file exists) {load file} else {run; save file}

• Change R chunk evaluation options as necessary

• R package: cacheSweave to cache intermediate results
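
The conditional-evaluation pattern with save() and load() might look like the sketch below. The file name slow_result.RData, the function get_result, and the stand-in computation are all made up for illustration; in a real document the else branch would hold the expensive analysis.

```r
# Sketch of conditional evaluation: compute once, then reuse the saved result.
# "slow_result.RData" and "get_result" are hypothetical names.
cache_file <- file.path(tempdir(), "slow_result.RData")

get_result <- function() {
  if (file.exists(cache_file)) {
    load(cache_file)                  # restores the object `result`
    message("loaded cached result")
  } else {
    result <- mean(rnorm(1e5))        # stand-in for a long-running computation
    save(result, file = cache_file)
    message("computed and saved result")
  }
  result
}

unlink(cache_file)       # start from a clean state for the demonstration
first  <- get_result()   # computes and saves
second <- get_result()   # loads from disk, no recomputation
identical(first, second)
```

Deleting the cache file forces the computation to run again, which is the manual equivalent of invalidating a cache.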

Including R code as an Appendix

• Useful for homework, solution sets, etc.

• Include \usepackage{listings} in the preamble

• Include the following R chunk and TeX code in foo.Rnw where you would like to place appendix

<<echo=FALSE, results=hide, split=TRUE>>=

Stangle(file="foo.Rnw",output="foo.R",

annotate=FALSE)

@

\pagebreak

\section{R Code}

\texttt{\lstinputlisting[emptylines=0]{foo.R}}

Miscellaneous Sweave Tricks

• Load all libraries in one chunk with results = hide option to suppress unwanted output (e.g. package dependencies)

• Beamer presentations
  - Include [fragile] option for every frame with R code to handle verbatim output
  - For frames with TeX and verbatim output, must include [containsverbatim] option instead

• R graphics package ggplot2
  - Must use print() wrapper for ggplot objects

• R session information

> toLatex(sessionInfo(), locale=F)

  - R version 2.14.1 (2011-12-22), x86_64-pc-mingw32
  - Base packages: base, datasets, graphics, grDevices, methods, stats, utils
  - Other packages: xtable 1.7-0
  - Loaded via a namespace (and not attached): tools 2.14.1

Alternatives for Reproducible Research

• R for other document formats
  - HTML: R2HTML
  - Open Office: odfWeave
  - MS Word: Sword
  - MS PowerPoint: R2PPT

• Other statistical packages
  - Statweave for SAS, Stata, or MATLAB and LaTeX or Open Office
  - Various other software-specific report generators

Resources

• Sweave user manual (Friedrich Leisch): http://www.stat.uni-muenchen.de/~leisch/Sweave/Sweave-manual.pdf

• Stack Overflow questions tagged Sweave: http://stackoverflow.com/questions/tagged/sweave

• Keith Baggerly’s introduction to Sweave: http://bioinformatics.mdanderson.org/SweaveTalk/sweaveTalkb.pdf

• QuickR summary of alternatives to Sweave: http://www.statmethods.net/interface/output.html

• Citing R with Sweave: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/SweaveLatex/RCitation.pdf

• xtable gallery with examples: http://cran.r-project.org/web/packages/xtable/vignettes/xtableGallery.pdf
