rintro

42
Introduction To R

Upload: alokjain1987

Post on 26-Sep-2015

214 views

Category:

Documents


0 download

DESCRIPTION

Introduction to R

TRANSCRIPT

  • Introduction To R

  • What is R?The R statistical programming language is a free open source package based on the S language developed by Bell Labs. The language is very powerful for writing programs.Many statistical functions are already built in.Contributed packages expand the functionality to cutting edge research.Since it is a programming language, generating computer code to complete tasks is required.

  • Getting StartedWhere to get R?Go to www.r-project.orgDownloads: CRANSet your Mirror: Anyone in the USA is fine.Select Windows 95 or later.Select base.Select R-2.4.1-win32.exe The others are if you are a developer and wish to change the source code.UNT course website for R:http://www.unt.edu/rss/SPLUSclasslinks.html

  • Getting StartedThe R GUI?

  • Getting StartedOpening a script.This gives you a script window.

  • Getting StartedBasic assignment and operations.Arithmetic Operations:+, -, *, /, ^ are the standard arithmetic operators.Matrix Arithmetic.* is element wise multiplication%*% is matrix multiplicationAssignmentTo assign a value to a variable use
  • Getting StartedHow to use help in R?R has a very good help system built in.If you know which function you want help with simply use ?_______ with the function in the blank.Ex: ?hist.If you dont know which function to use, then use help.search(_______).Ex: help.search(histogram).

  • Importing DataHow do we get data into R?Remember we have no point and clickFirst make sure your data is in an easy to read format such as CSV (Comma Separated Values).Use code:D
  • Working with data.Accessing columns.D has our data in it. But you cant see it directly.To select a column use D$column.

  • Working with data.Subsetting data.Use a logical operator to do this.==, >,
  • Basic GraphicsHistogramhist(D$wg)

  • Basic GraphicsAdd a titleThe main statement will give the plot an overall heading.hist(D$wg , main=Weight Gain)

  • Basic GraphicsAdding axis labelsUse xlab and ylab to label the X and Y axes, respectively.hist(D$wg , main=Weight Gain,xlab=Weight Gain, ylab =Frequency)

  • Basic GraphicsChanging colorsUse the col statement.?colors will give you help on the colors.Common colors may simply put in using the name.hist(D$wg, main=Weight Gain,xlab=Weight Gain, ylab =Frequency, col=blue)

  • Basic Graphics Colors

  • Basic PlotsBox Plotsboxplot(D$wg)

  • BoxplotsChange it!boxplot(D$wg,main='Weight Gain',ylab='Weight Gain (lbs)')

  • Box-Plots - GroupingsWhat if we want several box plots side by side to be able to compare them.First Subset the Data into separate variables.wg.m
  • Boxplots Groupings

  • Boxplots - Groupingsboxplot(wg.m$wg, wg.f$wg, main='Weight Gain (lbs)', ylab='Weight Gain', names = c('Male','Female'))

  • Boxplot GroupingsDo it by shiftwg.7a
  • Boxplots Groupings

  • Scatter PlotsSuppose we have two variables and we wish to see the relationship between them.A scatter plot works very well.R code: plot(x,y)Exampleplot(D$metmin,D$wg)

  • Scatterplots

  • Scatterplotsplot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain', xlab='Mets (min)',ylab='Weight Gain (lbs)')

  • Scatterplotsplot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain', xlab='Mets (min)',ylab='Weight Gain (lbs)',pch=2)

  • Line PlotsOften data comes through time.Consider Dell stockD2
  • Line Plots

  • Line Plotsplot(t1,D2$DELL,type="l")

  • Line Plotsplot(t1,D2$DELL,type="l",main='Dell Closing Stock Price',xlab='Time',ylab='Price $'))

  • Overlaying PlotsOften we have more than one variable measured against the same predictor (X).plot(t1,D2$DELL,type="l",main='Dell Closing Stock Price',xlab='Time',ylab='Price $'))lines(t1,D2$Intel)

  • Overlaying Graphs

  • Overlaying Graphslines(t1,D2$Intel,lty=2)

  • Overlaying Graphs

  • Adding a LegendAdding a legend is a bit tricky in R. Syntaxlegend( x, y, names, line types)X coordinateYcoordinateNames of series in column formatCorresponding line types

  • Adding a Legendlegend(60,45,c('Intel','Dell'),lty=c(1,2))

  • Paneling GraphicsSuppose we want more than one graphic on a panel.We can partition the graphics panel to give us a framework in which to panel our plots.par(mfrow = c( nrow, ncol))

    Number of rowsNumber of columns

  • Paneling GraphicsConsider the followingpar(mfrow=c(2,2))hist(D$wg, main='Histogram',xlab='Weight Gain', ylab ='Frequency', col=heat.colors(14))boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg, main='Weight Gain', ylab='Weight Gain (lbs)',xlab='Shift', names = c('7am','8am','9am','10am','11am','12pm'))plot(D$metmin,D$wg,main='Met Minutes vs. Weight Gain', xlab='Mets (min)',ylab='Weight Gain (lbs)',pch=2)plot(t1,D2$Intel,type="l",main='Closing Stock Prices',xlab='Time',ylab='Price $')lines(t1,D2$DELL,lty=2)

  • Paneling Graphics

  • Quality - Excel

    Chart1

    29.545

    19.2538.06

    17.4430.06

    26.1237

    21.8728.56

    25.6926.31

    26.2430.91

    24.3627.01

    26.1529.25

    26.9329.81

    21.3827.96

    18.5320.44

    23.9824.42

    27.9332.66

    27.1831.45

    27.4935.04

    24.6928.55

    26.1130.41

    26.3428.61

    26.8527.62

    26.1418.27

    24.9318.79

    26.6216.67

    23.5113.89

    28.6117.3

    28.620.88

    26.7415.57

    23.8615.66

    26.9617.25

    27.3116.28

    28.9818.4

    31.3720.82

    31.8420.81

    33.6824.89

    32.6228.59

    33.4227.52

    3632.95

    34.5733.54

    33.9832.05

    33.4430.52

    32.6529.2

    33.6227.2

    34.7825.73

    35.2428.55

    35.8227.6

    35.4724.38

    34.8421.29

    35.620.06

    35.0622.26

    40.5222.38

    42.1423.39

    41.7622.45

    40.0923.99

    38.4223.23

    34.8323.52

    39.9326.96

    39.4626.02

    40.4727.14

    35.625.72

    34.224.65

    31.8823.5

    30.1526.68

    29.9524.96

    29.3121.26

    2920.6

    29.7619.46

    26.219.98

    25.3818.02

    24.4619

    21.6818

    22.5519.57

    22.8420.57

    24.3321.34

    27.2421.4

    25.0920.25

    26.9321.92

    DELL

    Intel

    Time

    Price $

    Closing Stock Prices

    Dell

    DateDELLIntel

    2-Oct-0029.545

    1-Nov-0019.2538.06

    1-Dec-0017.4430.06

    2-Jan-0126.1237

    1-Feb-0121.8728.56

    1-Mar-0125.6926.31

    2-Apr-0126.2430.91

    1-May-0124.3627.01

    1-Jun-0126.1529.25

    2-Jul-0126.9329.81

    1-Aug-0121.3827.96

    4-Sep-0118.5320.44

    1-Oct-0123.9824.42

    1-Nov-0127.9332.66

    3-Dec-0127.1831.45

    2-Jan-0227.4935.04

    1-Feb-0224.6928.55

    1-Mar-0226.1130.41

    1-Apr-0226.3428.61

    1-May-0226.8527.62

    3-Jun-0226.1418.27

    1-Jul-0224.9318.79

    1-Aug-0226.6216.67

    3-Sep-0223.5113.89

    1-Oct-0228.6117.3

    1-Nov-0228.620.88

    2-Dec-0226.7415.57

    2-Jan-0323.8615.66

    3-Feb-0326.9617.25

    3-Mar-0327.3116.28

    1-Apr-0328.9818.4

    1-May-0331.3720.82

    2-Jun-0331.8420.81

    1-Jul-0333.6824.89

    1-Aug-0332.6228.59

    2-Sep-0333.4227.52

    1-Oct-033632.95

    3-Nov-0334.5733.54

    1-Dec-0333.9832.05

    2-Jan-0433.4430.52

    2-Feb-0432.6529.2

    1-Mar-0433.6227.2

    1-Apr-0434.7825.73

    3-May-0435.2428.55

    1-Jun-0435.8227.6

    1-Jul-0435.4724.38

    2-Aug-0434.8421.29

    1-Sep-0435.620.06

    1-Oct-0435.0622.26

    1-Nov-0440.5222.38

    1-Dec-0442.1423.39

    3-Jan-0541.7622.45

    1-Feb-0540.0923.99

    1-Mar-0538.4223.23

    1-Apr-0534.8323.52

    2-May-0539.9326.96

    1-Jun-0539.4626.02

    1-Jul-0540.4727.14

    1-Aug-0535.625.72

    1-Sep-0534.224.65

    3-Oct-0531.8823.5

    1-Nov-0530.1526.68

    1-Dec-0529.9524.96

    3-Jan-0629.3121.26

    1-Feb-062920.6

    1-Mar-0629.7619.46

    3-Apr-0626.219.98

    1-May-0625.3818.02

    1-Jun-0624.4619

    3-Jul-0621.6818

    1-Aug-0622.5519.57

    1-Sep-0622.8420.57

    2-Oct-0624.3321.34

    1-Nov-0627.2421.4

    1-Dec-0625.0920.25

    3-Jan-0726.9321.92

  • Quality - R

  • SummaryAll of the R code and files can be found at:http://www.cran.r-project.org/