The R Project for statistical computing
Eric Fouh, Christopher Poirel
CS 5604
Fall 2010
What is R?
Uses of R
• statistics system
• data handling and storage facility
• calculations on arrays, in particular matrices
• integrated collection of tools for data analysis
• graphical tool for data analysis
• programming language (a dialect of ‘S’)
Structure of R
• R functions and datasets are stored in packages
• R is provided with 25 “standard” packages, some of which are listed below
• Hundreds of contributed packages (written by different authors) are available

| Package Name | Description |
| --- | --- |
| base | Base R functions |
| datasets | Base R datasets |
| graphics | R functions for base graphics |
| stats | R statistical functions |
| utils | R utility functions |
| Matrix | Matrix package |
| class | Functions for classification |
| cluster | Functions for cluster analysis |
R and Information Retrieval

| IR Concept | R package |
| --- | --- |
| Text preprocessing; term weighting, scoring | tm package: constructs a term-document matrix using one of the following weighting functions: TF (weightTf) or TF-IDF (weightTfIdf), e.g. tdm <- TermDocumentMatrix(crude, control = list(weighting = weightTfIdf, stopwords = TRUE)) |
| Vector space model for scoring | clv package: the dot.product function returns a cosine similarity measure of two vectors |
| Vector space classification | class package: performs k-nearest neighbour classification on a dataset |
| Hierarchical clustering | cluster package: computes agglomerative hierarchical clusters on a dataset |
| Latent Semantic Indexing | base package: performs singular value decomposition on a matrix |
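
To make the vector space scoring row concrete, here is a minimal sketch of cosine similarity written in base R only. It does not use the clv package; the function name cosine_similarity and the three vectors are hypothetical and chosen purely for illustration.

```r
# Cosine similarity of two numeric vectors, using base R only.
# cosine_similarity is a hypothetical helper, not a function from clv.
cosine_similarity <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Made-up term-weight vectors for a query and two documents.
query <- c(1, 1, 0, 0)
doc1  <- c(2, 2, 0, 0)   # points in the same direction as the query
doc2  <- c(0, 0, 3, 1)   # shares no terms with the query

sim1 <- cosine_similarity(query, doc1)
sim2 <- cosine_similarity(query, doc2)
```

Because the measure depends only on the angle between the vectors, doc1 scores as high as possible even though it is twice as long as the query, while doc2 scores zero.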
Getting started with R
• To start R
> R
• To quit R
> q()
• To see installed packages
> library()
• To load a package
> library(class)
• To start help
> help.start()
• To create a vector
> x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
• To create a matrix
> x <- array(1:20, dim=c(4,5)) # Generate a 4 by 5 array filled with the numbers from 1 to 20
• To display an object
> x
• To delete an object
> rm(x)
• To load data from a file
> HousePrice <- read.table("houses.data")
Examples (1)
• Term-Document Matrix
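
The original slide content for this example is not available, so the following is only a sketch of how a term-document matrix could be built by hand in base R, without the tm package. The three-document corpus is made up, and the TF-IDF variant shown (raw term frequency times log2(N / df)) is an assumption that may differ from what weightTfIdf computes.

```r
# Toy corpus, invented for illustration.
docs <- c(d1 = "oil prices rise",
          d2 = "oil supply falls",
          d3 = "prices fall as supply rises")

# Lower-case and split on whitespace; no stemming or stopword removal.
tokens <- strsplit(tolower(docs), "\\s+")
terms  <- sort(unique(unlist(tokens)))

# Raw term frequencies: one row per term, one column per document.
tdm <- vapply(tokens,
              function(tk) tabulate(match(tk, terms), nbins = length(terms)),
              integer(length(terms)))
rownames(tdm) <- terms

# Document frequency and inverse document frequency of each term.
df  <- rowSums(tdm > 0)
idf <- log2(ncol(tdm) / df)

# Multiplying a matrix by a vector of length nrow scales each row by its idf.
tfidf <- tdm * idf
```

Terms that occur in fewer documents (such as "as") receive a larger weight than terms shared by several documents (such as "oil").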
Examples (2)
• Eigenvalues and eigenvectors
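
The original slide content for this example is not available. As a small sketch, the base R function eigen can be applied to a symmetric matrix; the matrix A below is an arbitrary example, not the one from the slides.

```r
# An arbitrary symmetric 2 x 2 matrix.
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# eigen() returns the eigenvalues in $values and the eigenvectors
# as the columns of $vectors.
e <- eigen(A)

# Check the defining property A v = lambda v for the first pair.
v1       <- e$vectors[, 1]
residual <- A %*% v1 - e$values[1] * v1
```

For a symmetric matrix the eigenvalues are returned in decreasing order, here 3 and 1, and the residual is zero up to rounding error.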
Examples (3)
• Low-rank approximation
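
The original slide content for this example is not available. The sketch below shows one way a rank-k approximation, as used in Latent Semantic Indexing, could be computed with the base R function svd. The matrix C and the choice k = 2 are hypothetical.

```r
# A small made-up term-document matrix (5 terms x 4 documents).
C <- matrix(c(1, 0, 1, 0,
              0, 1, 0, 1,
              1, 1, 0, 0,
              0, 0, 1, 1,
              1, 0, 0, 1), nrow = 5, byrow = TRUE)

# Singular value decomposition: C = U diag(d) V^T.
s <- svd(C)

# Keep only the k largest singular values and their vectors.
k  <- 2
Ck <- s$u[, 1:k, drop = FALSE] %*%
      diag(s$d[1:k], nrow = k) %*%
      t(s$v[, 1:k, drop = FALSE])

# Frobenius norm of the approximation error.
err <- sqrt(sum((C - Ck)^2))
```

The error equals the square root of the sum of the squared discarded singular values, which is the smallest error any rank-k matrix can achieve in the Frobenius norm.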