Introduction to R
Venkat Reddy
Data Analysis Course
Contents
• What is R
• R basics
• Working with R
• R packages
• R Functions
• Importing & Exporting data
• Creating calculated fields
• R Help & Tutorials
R
• Programming “environment”
• Runs on a variety of platforms including Windows, Unix and MacOS
• Provides an unparalleled platform for programming new statistical methods in an easy and straightforward manner
• Object-oriented
• Open source
• Excellent graphics capabilities
• Supported by a large user network
Downloading
• Search the web for “R” or “CRAN” (Comprehensive R Archive Network)
• http://www.r-project.org
R
R Studio
R-Demo
• 2+2
• log(10)
• help(log)
• summary(airquality)
• demo(graphics) # pretty pictures...
R-Basics: Naming convention
• must start with a letter (A-Z or a-z), or with a period that is not followed by a digit
• can contain letters, digits (0-9), periods “.” and underscores “_”
• R is a case sensitive language
• mydata is different from MyData
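A minimal sketch of the naming rules above. The object names are illustrative only; make.names() is the base R function that repairs invalid names, so comparing its output with the input shows whether a name was already valid.

```r
# Names are case sensitive: these are two distinct objects.
mydata <- 1
MyData <- 2
mydata == MyData                 # FALSE

# make.names() returns a syntactically valid version of a name.
valid   <- "my.data2"            # letters, a period and a digit
invalid <- "2mydata"             # starts with a digit
make.names(valid) == valid       # TRUE: the name was already valid
make.names(invalid) == invalid   # FALSE: R had to alter the name
```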
R-Basics: Assignment
• “<-” is used to indicate assignment
• x<-c(1,2,3,4,5,6,7)
• x<-c(1:7)
• x<-1:4
• Assignment to an object is denoted by “<-”, “->” or “=”
• If you see the notation “==”, you are looking at a comparison operator
– Many other notations can be found in the documentation for the base package of R
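The three assignment forms and the comparison operator can be tried side by side. This is a small sketch; the names x, y and z are arbitrary.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)   # leftward assignment
y = 1:7                       # "=" also assigns at the top level
1:7 -> z                      # rightward assignment

# "==" compares element by element; all() collapses the result.
all(x == y)                   # TRUE
identical(y, z)               # TRUE
```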
Workspace
• During an R session, all objects are stored in temporary working memory
• Commands are entered interactively at the R user prompt. The up and down arrow keys scroll through your command history
• List objects: ls()
• Remove objects: rm()
• List available datasets: data()
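A short sketch of the workspace commands, assuming it is run at the top level of a session. The object names tmp_a and tmp_b are made up for the illustration.

```r
tmp_a <- 1:3
tmp_b <- "text"
"tmp_a" %in% ls()     # TRUE: ls() lists the objects in the workspace
rm(tmp_a)             # remove one object
"tmp_a" %in% ls()     # FALSE after rm()

# data() returns the available datasets; the "Item" column holds their names.
datasets_items <- data(package = "datasets")$results[, "Item"]
"airquality" %in% datasets_items   # TRUE
```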
Demo: Working with R
• x <- round(rnorm(10,mean=20,sd=5)) # simulate data
• x
• mean(x)
• m <- mean(x)
• m
• x - m # notice recycling
• (x - m)^2
• sum((x - m)^2)
• data()
• UKgas
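The recycling step in this demo is easier to follow with fixed numbers. The sketch below swaps the random draw for four hand-picked values (an assumption made only so the results are reproducible).

```r
x <- c(18, 22, 25, 15)   # fixed values instead of rnorm()
m <- mean(x)             # 20
x - m                    # m (length 1) is recycled: -2  2  5 -5
(x - m)^2                # 4  4 25 25
sum((x - m)^2)           # 58
```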
R Packages
• R consists of a core and packages. Packages contain functions that are not available in the core
• Packages are collections of R functions, data, and compiled code
• They have a well-defined format that ensures easy installation and a basic standard of documentation, and enhances portability and reliability
• When you download R, a number of packages (around 30) are downloaded as well
• Use the function search to see a list of the packages that are currently attached to the system; this list is also called the search path
• search()
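A quick look at the search path. The variable name attached is arbitrary; the exact list depends on which packages your session has loaded.

```r
attached <- search()            # the search path as a character vector
attached[1]                     # ".GlobalEnv" always comes first
"package:base" %in% attached    # TRUE: the base package is always attached
```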
R packages
• Via the menu: select the `Packages' menu and then `Load package...'; a list of the packages available on your system will be displayed
• Select one and click `OK'; the package is now attached to your current R session
• Via the library function: library(name) attaches the package called name
• The library function can also list all the packages available on your system with a short description: run it without any arguments
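From the command line, attaching and listing packages might look like the sketch below. It uses the stats package only because it ships with every R installation; the install.packages() call is left commented out since it needs internet access.

```r
# install.packages("cluster")     # one-time download from CRAN (needs internet)
library(stats)                    # attach an installed package
"package:stats" %in% search()     # TRUE once the package is attached

# Names of all packages installed on this system.
installed <- rownames(installed.packages())
"stats" %in% installed            # TRUE
```

Calling library() with no arguments shows the same information as a formatted listing with descriptions.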
Demo: R packages
• Install cluster package
• Install plyr package (for splitting, applying and combining data)
Importing Data
• Reading a CSV file
• X <- read.csv("file.csv")
• read.table() reads in data from an external file
• d <- read.table("myfile", header=TRUE)
• R has ODBC for connecting to other programs
• R gets confused if you use a path in your code like c:\mydocuments\myfile.txt
• This is because R sees "\" as an escape character. Instead, use c:\\mydocuments\\myfile.txt or c:/mydocuments/myfile.txt
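A self-contained sketch of both import functions. It writes a tiny file to a temporary location first, so the file contents and column names (id, sales) are invented for the illustration.

```r
# Create a small CSV in a temporary file so the example runs anywhere.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,sales", "1,100", "2,250"), csv_path)

d <- read.csv(csv_path)          # header = TRUE is the default for read.csv
d$sales                          # 100 250

# read.table needs the header and the separator spelled out.
d2 <- read.table(csv_path, header = TRUE, sep = ",")
d2$sales                         # 100 250

# Escaped backslashes produce single backslashes in the actual string.
nchar("c:\\mydocuments")         # 14, not 15
```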
Demo: Importing data
• Reading CSV file
• petrol<-read.csv("C:\\Users\\VENKAT\\Google Drive\\Training\\R\\Data\\Petrol_Consuption.csv")
• sales_data<-read.table("C:\\Users\\VENKAT\\Google Drive\\Training\\R\\Data\\sales.txt")
• sales_data<-read.table("C:\\Users\\VENKAT\\Google Drive\\Training\\R\\Data\\sales.txt",header=TRUE)
Exporting data
• To a tab-delimited text file
• write.table(mydata, "c:/mydata.txt", sep="\t")
• To an Excel spreadsheet
• library(xlsReadWrite)
• write.xls(mydata, "c:/mydata.xls")
• To SAS
• library(foreign)
• write.foreign(mydata, "c:/mydata.txt", "c:/mydata.sas", package="SAS")
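A round-trip sketch for the tab-delimited case, using a temporary file instead of c:/ so it runs on any system. The data frame and the row.names = FALSE choice are assumptions made for the illustration.

```r
mydata <- data.frame(x1 = c(1, 2), x2 = c(3, 4))
out <- tempfile(fileext = ".txt")

# row.names = FALSE keeps the row labels out of the file.
write.table(mydata, out, sep = "\t", row.names = FALSE)

# Read the file back to check what was written.
back <- read.table(out, header = TRUE, sep = "\t")
back$x2                          # 3 4
```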
Demo: Exporting data
• write.table(sales_data, "C:\\Users\\VENKAT\\Google Drive\\Training\\R\\Data\\sales_export.txt", sep="\t")
• write.table(sales_data, "C:\\Users\\VENKAT\\Google Drive\\Training\\R\\Data\\sales_export.csv", sep=",")
R- Functions
Numeric Functions
Function                  Description
abs(x)                    absolute value
sqrt(x)                   square root
ceiling(x)                ceiling(3.475) is 4
floor(x)                  floor(3.475) is 3
trunc(x)                  trunc(5.99) is 5
round(x, digits=n)        round(3.475, digits=2) is 3.48
signif(x, digits=n)       signif(3.475, digits=2) is 3.5
cos(x), sin(x), tan(x)    also acos(x), cosh(x), acosh(x), etc.
log(x)                    natural logarithm
log10(x)                  common logarithm
exp(x)                    e^x
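Several of the numeric functions can be checked directly at the prompt. The inputs below are chosen so the results are easy to verify by hand.

```r
abs(-20)            # 20
sqrt(16)            # 4
ceiling(3.475)      # 4
floor(3.475)        # 3
trunc(5.99)         # 5
signif(3.475, 2)    # 3.5
log(exp(1))         # 1 (log is the natural logarithm)
log10(1000)         # 3
```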
Demo: Numeric Functions
• y<-abs(-20)
• x<-sum(y+5)
• Z<-log(x)
• round(x,1)
Character Functions
Function / Description
substr(x, start=n1, stop=n2)
    Extract or replace substrings in a character vector.
    x <- "abcdef"
    substr(x, 2, 4) is "bcd"
    substr(x, 2, 4) <- "22222" changes x to "a222ef"
grep(pattern, x, ignore.case=FALSE, fixed=FALSE)
    Search for pattern in x. If fixed=FALSE then pattern is a regular expression. If fixed=TRUE then pattern is a text string. Returns matching indices.
    grep("A", c("b","A","c"), fixed=TRUE) returns 2
sub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE)
    Find pattern in x and replace it with the replacement text. If fixed=FALSE then pattern is a regular expression. If fixed=TRUE then pattern is a text string.
    sub("\\s",".","Hello There") returns "Hello.There"
strsplit(x, split)
    Split the elements of character vector x at split. The result is a list with one vector per element of x.
    strsplit("abc", "") returns a list holding the 3 element vector "a","b","c"
paste(..., sep=" ")
    Concatenate strings, using the sep string to separate them.
    paste("x",1:3,sep="") returns c("x1","x2","x3")
    paste("x",1:3,sep="M") returns c("xM1","xM2","xM3")
    paste("Today is", date())
toupper(x)
    Uppercase
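The character functions listed here can be tried with the same inputs the descriptions use. This sketch adds nothing beyond base R.

```r
x <- "abcdef"
substr(x, 2, 4)                             # "bcd"
substr(x, 2, 4) <- "22222"                  # only 3 characters are replaced
x                                           # "a222ef"

grep("A", c("b", "A", "c"), fixed = TRUE)   # 2
sub("\\s", ".", "Hello There")              # "Hello.There"
strsplit("abc", "")[[1]]                    # "a" "b" "c"
paste("x", 1:3, sep = "")                   # "x1" "x2" "x3"
toupper("Cust1233416")                      # "CUST1233416"
```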
Demo: Character Functions
• cust_id<-"Cust1233416"
• id<-substr(cust_id, 5,10)
• Up=toupper(cust_id)
Calculated Fields in R
• Use the assignment operator <- to create new variables. A wide array of operators and functions are available here.
• mydata$sum <- mydata$x1 + mydata$x2
• mydata$mean <- (mydata$x1 + mydata$x2)/2
• attach(mydata)
• mydata$sum <- x1 + x2
• mydata$mean <- (x1 + x2)/2
• detach(mydata)
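A runnable sketch of the calculated fields above, on a made-up data frame. The second half uses transform(), a base R function not mentioned in the slides, as one possible alternative to attach()/detach().

```r
mydata <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))

# New columns created with the assignment operator.
mydata$sum  <- mydata$x1 + mydata$x2        # 11 22 33
mydata$mean <- (mydata$x1 + mydata$x2) / 2  # 5.5 11.0 16.5

# Alternative (not from the slides): transform() evaluates the
# expressions inside the data frame, so no attach() is needed.
mydata2 <- transform(mydata[, c("x1", "x2")],
                     sum  = x1 + x2,
                     mean = (x1 + x2) / 2)
```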
Demo: Calculated Fields in R
• sales_data$reduce<-(sales_data$Sales)*0.2
• View(sales_data)
• sales_data$new_sales<-sales_data$Sales- sales_data$reduce
• attach(petrol)
• ratio=Income/consum_mill_gallons
• View(petrol)
• petrol$ratio=Income/Consum_mill_gallons
• View(petrol)
R-Help
• If you encounter a new command during the exercises and you’d like to know what it does, please consult the documentation. There is no single list of all R commands, and the only way to get to know new commands is to read the documentation files, so we’d like you to practise this yourself.
• Tutorials: each of the following tutorials is in PDF format.
• P. Kuhnert & B. Venables, An Introduction to R: Software for Statistical Modeling & Computing
• J.H. Maindonald, Using R for Data Analysis and Graphics
• B. Muenchen, R for SAS and SPSS Users
• W.J. Owen, The R Guide
• D. Rossiter, Introduction to the R Project for Statistical Computing for Use at the ITC
• W.N. Venables & D.M. Smith, An Introduction to R
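For consulting the documentation from inside R, a few base functions are usually enough. In the sketch below, the calls that open help pages are commented out because they are meant for interactive use.

```r
# help(log)                    # opens the help page for log (same as ?log)
# help.search("logarithm")     # searches the help pages by keyword

# apropos() finds object names matching a pattern on the search path.
log_names <- apropos("^log")
"log10" %in% log_names         # TRUE

# args() shows the arguments a function accepts.
args(round)
```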
R-Tutorials
• Paul Geissler's excellent R tutorial
• Dave Roberts's excellent labs on ecological analysis
• Excellent tutorials by David Rossiter
• Excellent tutorial on nearly every aspect of R (c/o Rob Kabacoff); MOST of these notes follow this web page format
• Introduction to R by Vincent Zoonekynd
• R Cookbook
• Data Manipulation Reference
Thank you