a short overview thomas girke december 8,...

21
Introduction into R A Short Overview Thomas Girke December 8, 2012 Introduction into R Slide 1/21

Upload: others

Post on 19-Mar-2020

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Introduction into RA Short Overview

Thomas Girke

December 8, 2012

Introduction into R Slide 1/21

Page 2: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

IntroductionLook and Feel of the R EnvironmentR Library DepositoriesInstallationGetting AroundBasic SyntaxData Types and SubsettingBasic CalculationsReading and Writing External DataSome Great R FunctionsGraphics Utilities

Online Tutorial

Introduction into R Slide 2/21

Page 3: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Outline

IntroductionLook and Feel of the R EnvironmentR Library DepositoriesInstallationGetting AroundBasic SyntaxData Types and SubsettingBasic CalculationsReading and Writing External DataSome Great R FunctionsGraphics Utilities

Online Tutorial

Introduction into R Introduction Slide 3/21

Page 4: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

What You’ll Get?

R Gui: OS X

Command-line R: Linux/OS X R Gui: Windows

Introduction into R Introduction Look and Feel of the R Environment Slide 4/21

Page 5: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

RStudio: Alternative Working Environment for RNew integrated development environment (IDE) for R that workswell for beginners and developers.

Introduction into R Introduction Look and Feel of the R Environment Slide 5/21

Page 6: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Why Using R

Complete statistical package and programming language

Efficient functions and data structures for data analysis

Powerful graphics

Access to fast growing number of analysis packages

Most widely used language in bioinformatics

Is standard for data mining and biostatistical analysis

Technical advantages: free, open-source, available for all OSs

Books & Documentation

simpleR - Using R for Introductory Statistics (Gentleman etal., 2005)

Bioinformatics and Computational Biology Solutions Using Rand Bioconductor (John Verzani, 2004)

UCR Manual (Thomas Girke)

Introduction into R Introduction Look and Feel of the R Environment Slide 6/21

Page 7: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Package Depositories

CRAN (>3000 packages) general data analysis

BioConductor (>500 packages) bioscience data analysis

Omegahat (>30 packages) programming interfaces

Introduction into R Introduction R Library Depositories Slide 7/21

Page 8: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Installation

Install R binary for your operating system from:

http://cran.at.r-project.org

Installation of CRAN Packages> install.packages(c("pkg1", "pkg2"))

> install.packages("pkg.zip", repos=NULL)

Installation of BioConductor Packages> source("http://www.bioconductor.org/biocLite.R")

> biocLite()

> biocLite(c("pkg1", "pkg2"))

Introduction into R Introduction Installation Slide 8/21

Page 9: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Startup/Closing Behavior

Starting R

The R GUI versions under Windows and Mac OS X can be opened bydouble-clicking their icons. Alternatively, one can start it by typing ’R’ ina terminal (default under Linux).

Startup/Closing Behavior

The R environment is controlled by hidden files in the startup directory:.RData, .Rhistory and .Rprofile (optional).

## Closing R

> q()

Save workspace image? [y/n/c]:

Note

When responding with ’y’, then the entire R workspace will be written tothe .RData file which can become very large. Often it is sufficient to justsave an analysis protocol in an R source file. This way one can quicklyregenerate all data sets and objects.

Introduction into R Introduction Getting Around Slide 9/21

Page 10: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Getting Around

## Create an object with the assignment operator ’<-’ (or ’=’)

> object <- ...

## List objects in current R session

> ls()

## Return content of current working directory

> dir()

## Return path of current working directory

> getwd()

## Change current working directory

> setwd("/home/user")

Introduction into R Introduction Getting Around Slide 10/21

Page 11: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Basic R Syntax

## General R command syntax

> object <- function(arguments)

> object <- object[arguments]

## Execute an R script

> source("my script.R")

## Execute an R script from command-line

$ R CMD BATCH my script.R

$ R --slave < my script.R

## Finding help

> ?function

## Load a library

> library("my library")

## Summary of all functions within a library

> library(help="my library")

## Load library manual (PDF file)

> vignette()

Introduction into R Introduction Basic Syntax Slide 11/21

Page 12: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Data Types

## Numeric data: 1, 2, 3

> x <- c(1, 2, 3); x; is.numeric(x); as.character(x)

## Character data: "a", "b", "c"

> x <- c("1", "2", "3"); x; is.character(x); as.numeric(x)

## Complex data: 1, b, 3

> c(1, "b", 3)

## Logical data: TRUE, FALSE, TRUE

> x <- 1:10 < 5; x

> !x

## Return indices for the ’TRUEs’ in logical vector

> which(x)

Introduction into R Introduction Data Types and Subsetting Slide 12/21

Page 13: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Data Objects

## Vectors (1D)

> myVec <- 1:10; names(myVec) <- letters[1:10]

> myVec[1:5]; myVec[c(2,4,6,8)]; myVec[c("b", "d", "f")]

## Factors (1D): vectors with grouping information

> factor(c("dog", "cat", "mouse", "dog", "dog", "cat"))

## Matrices (2D), Data Frames (2D) and Arrays (≥2D)> myMA <- matrix(1:30, 3, 10, byrow = T)

> myDF <- data.frame(Col1=1:10, Col2=10:1)

> myDF[1:4, ]; myDF[ ,c("Col2", "Col1", "Col1")]

## Lists: containers for any object type

> myL <- list(name="Fred", wife="Mary", no.children=3,

child.ages=c(4,7,9))

> myL[[4]][1:2]

## Functions: piece of code

> myfct <- function(arg1, arg2, ...) { function body }

Introduction into R Introduction Data Types and Subsetting Slide 13/21

Page 14: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

General Subsetting Rules

## Subsetting by indices

> myVec <- 1:26; names(myVec) <- LETTERS

> myVec[1:4]

## Subsetting by same length logical vectors

> myLog <- myVec > 10

> myVec[myLog]

## Subsetting by field names

> myVec[c("B", "K", "M")]

## Special case

> iris$Species

Introduction into R Introduction Data Types and Subsetting Slide 14/21

Page 15: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Basic Operators and Calculations

Comparison operators: ==, ! =, <, >, <=, >=## Example:

> 1==1

Logical operators: AND: &, OR: |, NOT: !## Example:

> x <- 1:10; y <- 10:1

> x > y & x > 5

Calculations:## Example:

> x + y; sum(x); mean(x), sd(x); sqrt(x)

> apply(iris[,1:3], 1, mean)

Introduction into R Introduction Basic Calculations Slide 15/21

Page 16: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Reading and Writing External Data

## Import Data into R

> read.delim("myData.xls", sep="\t")

## Export Data from R to File

> write.table(myframe, file="myfile.xls", sep="\t", quote=F)

Introduction into R Introduction Reading and Writing External Data Slide 16/21

Page 17: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Some Great R Functions

## The unique() function to make vector entries unique

> unique(iris$Sepal.Length); length(unique(iris$Sepal.Length))

## The table() function counts the occurrences of entries

> table(iris$Species)

## The aggregate() function computes statistics of data

aggregates

> aggregate(iris[,1:4], by=list(iris$Species), FUN=mean, na.rm=T)

## The %in% function returns the intersect between two vectors

> month.name[month.name %in% c("May", "July")]

## The merge() function joins data frames based on a common key

column

> merge(frame1, frame2, by.x=1, by.y=1, all = TRUE)

Introduction into R Introduction Some Great R Functions Slide 17/21

Page 18: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Graphics Utilities

t4 t5 t3 t1 t2

g10g69g59g50g86g12g77g19g62g5g46g22g88g25g4g89g13g54g79g95g3g85g87g38g81g23g91g15g75g30g58g7g44g56g60g11g71g82g21g67g66g80g8g100g37g55g14g26g74g92g78g83g29g98g97g28g73g49g76g40g61g32g70g53g65g33g24g90g18g63g6g36g9g20g1g84g68g48g57g41g45g47g2g39g17g35g31g16g52g27g64g34g93g51g94g43g42g72g96g99

-1 0 1Row Z-Score

Color Key

Venn Diagram

Unique objects: All = 25; S1 = 18; S2 = 16; S3 = 20; S4 = 22; S5 = 18

0

0

00

0

00

0

0

0

0

0

0

00

0 1

0

2

1

1

2

10

3

21

3

2

15

A

B

C

D

E

Introduction into R Introduction Graphics Utilities Slide 18/21

Page 19: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Some Graphics Commands

## Dot plots

> plot(1:10)

> plot(iris[,1:4])

## Barplot

> barplot(1:10)

## Help with plots

> ?plot; ?par

Demo Graphics Utilities

Introduction into R Introduction Graphics Utilities Slide 19/21

Page 20: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

Outline

IntroductionLook and Feel of the R EnvironmentR Library DepositoriesInstallationGetting AroundBasic SyntaxData Types and SubsettingBasic CalculationsReading and Writing External DataSome Great R FunctionsGraphics Utilities

Online Tutorial

Introduction into R Online Tutorial Slide 20/21

Page 21: A Short Overview Thomas Girke December 8, 2012faculty.ucr.edu/~tgirke/HTML_Presentations/Manuals/Rbasics/Introduction_into_R.pdfIntroduction into R A Short Overview Thomas Girke December

More Details and Exercises

Continue in R Manual

Introduction into R Online Tutorial Slide 21/21