r- introduction

27
Introduction to R Venkat Reddy

Upload: venkata-reddy-konasani

Post on 15-May-2015

1.532 views

Category:

Education


2 download

DESCRIPTION

R Basics

TRANSCRIPT

Page 1: R- Introduction

Introduction to RVenkat Reddy

Page 2: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Contents• What is R• R basics• Working with R• R packages• R Functions• Importing & Exporting data• Creating calculated fields• R Help & Tutorials

2

Page 3: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R• Programming “environment”• Runs on a variety of platforms including Windows, Unix and

MacOS.• Provides an unparalleled platform for programming new

statistical methods in an easy and straightforward manner. • Object-oriented• Open source• Excellent graphics capabilities• Supported by a large user network

3

Page 4: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Downloading• Google it using R or CRAN (Comprehensive R Archive

Network)• http://www.r-project.org

4

Page 5: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R

5

Page 6: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R Studio

6

Page 7: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R-Demo• 2+2• log(10)• help(log)• summary(airquality)• demo(graphics) # pretty pictures...

7

Page 8: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R-Basics :Naming convention• must start with a letter (A-Z or a-z)• can contain letters, digits (0-9), and/or periods “.”

• R is a case sensitive language. • mydata different from MyData

8

Page 9: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R-Basics :Assignment• “<-” used to indicate assignment• x<-c(1,2,3,4,5,6,7)• x<-c(1:7)• x<-1:4

• Assignment to an object is denoted by ”<-” or ”->” or ”=”. • If you see a notation ”= =”, you’ll looking at a comparison

operator.– Many other notations can be found from the

documentation for the Base package or R.

9

Page 10: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Workspace• during an R session, all objects are stored in a temporary,

working memory• Commands are entered interactively at the R user prompt. Up

and down arrow keys scroll through your command history. • list objects ls()• remove objects rm()• data()

10

Page 11: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo: Working with R• x <- round(rnorm(10,mean=20,sd=5)) # simulate data• x• mean(x)• m <- mean(x)• m• x- m # notice recycling• (x - m)^2• sum((x - m)^2)• data()• UKgas

11

Page 12: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R Packages• R consists of a core and packages. Packages contain functions

that are not available in the core.• Collections of R functions, data, and compiled code• Well-defined format that ensures easy installation, a basic

standard of documentation, and enhances portability and reliability

• When you download R, already a number (around 30) of packages are downloaded as well.

• You can use the function search to see a list of packages that are currently attached to the system, this list is also called the search path.• search() 12

Page 13: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R packages• Select the `Packages' menu and select `Install package...', a list

of available packages on your system will be displayed. • Select one and click `OK', the package is now attached to your

current R session. Via the library function• The library can also be used to list all the available libraries on

your system with a short description. Run the function without any arguments

13

Page 14: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo: R packages• Install cluster package• Install plyr package (for string operations)

14

Page 15: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Importing Data• Reading CSV file• X <-read.csv(“file.csv")

• read.table()• reads in data from an external file• d <- read.table("myfile", header=TRUE)

• R has ODBC for connecting to other programs• R gets confused if you use a path in your code like c:\

mydocuments\myfile.txt• This is because R sees "\" as an escape character. Instead, use

c:\\my documents\\myfile.txt or c:/mydocuments/myfile.txt

15

Page 16: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo: Importing data• Reading CSV file• petrol<-read.csv("C:\\Users\\VENKAT\\Google Drive\\Training\\

R\\Data\\Petrol_Consuption.csv")• sales_data<-read.table("C:\\Users\\VENKAT\\Google Drive\\

Training\\R\\Data\\sales.txt")• sales_data<-read.table("C:\\Users\\VENKAT\\Google Drive\\

Training\\R\\Data\\sales.txt",header=TRUE)

16

Page 17: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Exporting data• To A Tab Delimited Text File• write.table(mydata, "c:/mydata.txt", sep="\t")

• To an Excel Spreadsheet • library(xlsReadWrite)• write.xls(mydata, "c:/mydata.xls")

• To SAS• library(foreign)• write.foreign(mydata, "c:/mydata.txt", "c:/mydata.sas", package="SAS")

17

Page 18: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo: Exporting data• write.table(sales_data, "C:\\Users\\VENKAT\\Google Drive\\

Training\\R\\Data\\sales_export.txt", sep="\t") • write.table(sales_data, "C:\\Users\\VENKAT\\Google Drive\\

Training\\R\\Data\\sales_export.csv", sep=",")

18

Page 19: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R- Functions

19

Function Descriptionabs(x) absolute value

sqrt(x) square root

ceiling(x) ceiling(3.475) is 4

floor(x) floor(3.475) is 3

trunc(x) trunc(5.99) is 5

round(x, digits=n) round(3.475, digits=2) is 3.48

signif(x, digits=n) signif(3.475, digits=2) is 3.5

cos(x), sin(x), tan(x) also acos(x), cosh(x), acosh(x), etc.

log(x) natural logarithm

log10(x) common logarithm

exp(x) e^x

Numeric Functions

Page 20: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo: Numeric Functions• y<-abs(-20)• x<-Sum(y+5)• Z<-Log(x)• round(x,1)

20

Page 21: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Character Functions

21

Function Description

substr(x, start=n1, stop=n2) Extract or replace substrings in a character vector.x <- "abcdef" substr(x, 2, 4) is "bcd" substr(x, 2, 4) <- "22222" is "a222ef"

grep(pattern, x , ignore.case=FALSE, fixed=FALSE)

Search for pattern in x. If fixed =FALSE then pattern is a regular expression. If fixed=TRUE then pattern is a text string. Returns matching indices.grep("A", c("b","A","c"), fixed=TRUE) returns 2

sub(pattern, replacement, x, ignore.case =FALSE, fixed=FALSE)

Find pattern in x and replace with replacement text. If fixed=FALSE then pattern is a regular expression.If fixed = T then pattern is a text string. sub("\\s",".","Hello There") returns "Hello.There"

strsplit(x, split) Split the elements of character vector x at split. strsplit("abc", "") returns 3 element vector "a","b","c"

paste(..., sep="") Concatenate strings after using sep string to seperate them.paste("x",1:3,sep="") returns c("x1","x2" "x3")paste("x",1:3,sep="M") returns c("xM1","xM2" "xM3")paste("Today is", date())

toupper(x) Uppercase

Page 22: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo :Character Functions• cust_id<-"Cust1233416"• id<-substr(cust_id, 5,10)• Up=toupper(cust_id)

22

Page 23: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Calculated Fields in R• Use the assignment operator <- to create new variables. A

wide array of operators and functions are available here.• mydata$sum <- mydata$x1 + mydata$x2• mydata$mean <- (mydata$x1 + mydata$x2)/2• attach(mydata)• mydata$sum <- x1 + x2• mydata$mean <- (x1 + x2)/2• detach(mydata)

23

Page 24: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Demo Calculated Fields in R• sales_data$reduce<-(sales_data$Sales)*0.2• View(sales_data)• sales_data$new_sales<-sales_data$Sales- sales_data$reduce• attach(petrol)• ratio=Income/consum_mill_gallons• View(petrol)• petrol$ratio=Income/Consum_mill_gallons• View(petrol)

24

Page 25: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R-Help• If you encounter a new command during the exercises, and you’d

like to know what it does, please consult the documentation. All R commands are listed nowhere, and the only way to get to know new commands is to read the documentation files, so we’d like you to practise this youself.

• TutorialsEach of the following tutorials are in PDF format.• P. Kuhnert & B. Venables,

An Introduction to R: Software for Statistical Modeling & Computing • J.H. Maindonald, Using R for Data Analysis and Graphics • B. Muenchen, R for SAS and SPSS Users • W.J. Owen, The R Guide • D. Rossiter,

Introduction to the R Project for Statistical Computing for Use at the ITC • W.N. Venebles & D. M. Smith, An Introduction to R

25

Page 26: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

R-Tutorials• Paul Geissler's excellent R tutorial • Dave Robert's Excellent Labs on Ecological Analysis • Excellent Tutorials by David Rossitier • Excellent tutorial an nearly every aspect of R (c/o Rob

Kabacoff) MOST of these notes follow this web page format• Introduction to R by Vincent Zoonekynd • R Cookbook • Data Manipulation Reference

26

Page 27: R- Introduction

Dat

a An

alys

is C

ours

e

V

enka

t Red

dy

Thank you

27