source code tons of code

Post on 26-Feb-2016


Source Code: Tons of Code

Package: More Code, Statistical Functions, Datasets

Workspace: Fewer Lines of Code, Capability

http://www.statmethods.net/management/functions.html

Currently, how many R Packages?

At the command line, enter:

dim(available.packages())

available.packages()

The first value returned by dim() is the number of packages currently available.
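A minimal sketch of the package count described above. Because available.packages() needs network access to a repository, the count is wrapped in a helper function; count_available() and the mirror URL are illustrative assumptions, not part of the original slides. The locally installed count is shown as an offline comparison.

```r
# Count packages installed locally (works without a network connection).
installed <- installed.packages()
n_installed <- nrow(installed)
cat("Installed packages:", n_installed, "\n")

# Hypothetical helper: count packages offered by a CRAN mirror.
# The mirror URL is an assumption; any configured repository would do.
count_available <- function(repos = "https://cloud.r-project.org") {
  pkgs <- available.packages(repos = repos)
  nrow(pkgs)  # one row per package, same as dim(pkgs)[1]
}

# Call this only when online:
# count_available()
```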

Is there an R App Store?

Two heavyweights in the statistical software market are SAS and SPSS/IBM

R packages have been created that provide functionality equivalent to that of SAS and SPSS.

Packages for reading and writing various file formats

XLConnect

XML

rhbase

sas7bdat

Rcpp

RJSONIO

Hmisc

RODBC / ROracle

foreign

RMySQL

RWeka

Comma-Separated Values (CSV)
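For the comma-separated case, base R is enough. The following is a small sketch of a write and read round trip; the data frame contents and the temporary file are made up for illustration.

```r
# Illustrative data; the values are invented for this example.
scores <- data.frame(id = 1:3, score = c(90.5, 82, 77.25))

# Write to a temporary CSV file, without the row-name column.
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# Read the file back into a new data frame.
scores_back <- read.csv(csv_path)
print(scores_back)
```

The other packages listed follow the same pattern of one function to read a format and one to write it, but their function names differ and should be checked in each package's help pages.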

R Being Integrated Into Other Data-Related Products

Oracle R Enterprise (ORE)

http://help.sap.com/hana/hana_dev_r_emb_en.pdf

https://blogs.oracle.com/R/

http://www-142.ibm.com/software/products/us/en/spss-stats-developer/

“Both R and SAS are here to stay, and finding ways to make them work better with each other is in the best interests of our customers.”

http://support.sas.com/rnd/app/studio/Rinterface2.html

R “Machine Learning” Libraries

Analytic Technique | R Package/Library | Author | Organization
Support Vector Machines | libsvm (ksvm) | Chih-Chung Chang, Chih-Jen Lin | National Taiwan Univ. + EBay Research Labs
Neural Networks | neuralnet | Frauke Gunther, Stefan Fritsch | Epidemiology and Prevention Research
Neural Networks | nnet | Brian Ripley | University of Oxford
Neural Networks | monmlp | Alex J. Cannon | Atmospheric Science
Randomized Forests | randomForest | Fortran original by Leo Breiman & Adele Cutler, R port by Andy Liaw and Matthew Wiener | Merck
Decision Trees | rpart | Terry M Therneau and Beth Atkinson. R port by Brian Ripley. | Mayo Clinic; University of Oxford
Boosting Model | ada | Mark Culp | West Virginia University
Maximum Entropy | maxent | Yoshimasa Tsuruoka, Timothy Jurka | University of Tokyo, UC-Davis
Bagging, bootstrap | adabag | Esteban Alfaro-Cortes | La Universidad de Castilla-La Mancha
Latent Dirichlet | slda | Jonathan Chang | Facebook
Naïve Bayes | e1071 | David Meyer, Evgenia Dimitriadou | Vienna University
Bayesian Network | bnlearn | Marco Scutari | UCL Genetics Institute
Hidden Markov | HiddenMarkov | David Harte | Statistics Research
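As one illustration of the libraries above, here is a minimal decision-tree sketch using rpart. It assumes rpart is installed (it ships with standard R distributions as a recommended package) and uses the built-in iris data, which is not mentioned in the slides.

```r
library(rpart)

# Fit a classification tree predicting species from the four measurements.
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict on the training data and compute the share classified correctly.
pred <- predict(fit, newdata = iris, type = "class")
train_accuracy <- mean(pred == iris$Species)

print(fit)
cat("Training accuracy:", train_accuracy, "\n")
```

Training accuracy overstates how well the tree generalizes; a held-out test set would give a fairer estimate.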

Industries / Organizations Creating and Using R

Industry | Pct.
Research | 24%
Higher Education | 7%
Information Technology | 9%
Computer Software | 7%
Financial Services | 6%
Banking | 2%
Pharmaceuticals | 4%
Biotechnology | 4%
Market Research | 3%
Management Consulting | 3%
Total | 69%

Hadley Wickham

Asst. Professor of Statistics at Rice University

ggplot2, plyr, reshape, rggobi, profr

Top 100 R packages for 2013 (Jan-May)

Rank | Package | Title | Downloads
1 | plyr | Tools for splitting, applying and combining data | 84049
2 | digest | Create cryptographic hash digests of R objects | 83192
3 | ggplot2 | An implementation of the Grammar of Graphics | 82768
4 | colorspace | Color Space Manipulation | 81901
5 | stringr | Make it easier to work with strings | 77658
6 | RColorBrewer | ColorBrewer palettes | 66783
7 | reshape2 | Flexibly reshape data: a reboot of the reshape package | 64911
8 | zoo | S3 Infrastructure for Regular and Irregular Time Series | 60844
9 | proto | Prototype object-based programming | 59043
10 | scales | Scale functions for graphics | 58369
11 | car | Companion to Applied Regression | 57453
12 | dichromat | Color Schemes for Dichromats | 56624
13 | gtable | Arrange grobs in tables | 54431
14 | munsell | Munsell colour system | 53183
15 | labeling | Axis Labeling | 51877
16 | Hmisc | Harrell Miscellaneous | 47836
17 | rJava | Low-level R to Java interface | 47731
18 | mvtnorm | Multivariate Normal and t Distributions | 46884
19 | bitops | Bitwise Operations | 45689
20 | rgl | 3D visualization device system (OpenGL) | 41001

http://www.r-statistics.com/2013/06/top-100-r-packages-for-2013-jan-may/

Specialized "Domain"

Beginner: Some Coverage

stats, graphics (both built-in)

Data Management: plyr, reshape

Graphics: ggplot2

Bayesian, Differential Equations, Econometrics, Environmetrics, Experimental Design, Finance, Genetics, High Performance Computing, Machine Learning, Medical Imaging, Natural Language Processing, Pharmacokinetics, Phylogenetics, Psychometrics, Social Sciences, Spatial, Time Series

Easy to Use

Interactive Standard Visualizations

Steep Learning Curve

Visualization and Reporting

The R Graphics Package

Graphing Parameters

Titles, X-Axis Title, Y-Axis Title, Legend, Scales, Color, Gridlines

library(help="graphics")
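The graphing parameters listed above can be sketched with the built-in graphics package. The example uses the mtcars dataset and writes to a temporary PDF so it also runs without a display; the dataset, colors, and axis limits are illustrative choices, not taken from the slides.

```r
# Map each cylinder count to a color (illustrative palette).
cyl_colors <- c("4" = "steelblue", "6" = "darkorange", "8" = "firebrick")
point_col <- cyl_colors[as.character(mtcars$cyl)]

# Draw to a temporary PDF file so no screen device is required.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)

plot(mtcars$wt, mtcars$mpg,
     main = "Fuel Economy by Weight",      # title
     xlab = "Weight (1000 lbs)",           # x-axis title
     ylab = "Miles per Gallon",            # y-axis title
     xlim = c(1, 6), ylim = c(10, 35),     # scales
     col = point_col, pch = 19)            # color and point symbol
grid()                                     # gridlines
legend("topright",
       legend = paste(names(cyl_colors), "cylinders"),
       col = cyl_colors, pch = 19, title = "Engine")

dev.off()
```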

Basic Chart Types

ggplot2

Grammar of Graphics

In ggplot2, a plot is made up of layers.

Plot

Layer

- Data

- Mapping

- Geom

- Stat

- Position

Scale

Coord

Facet
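The components above map onto ggplot2 calls roughly as follows. This is a sketch under the assumption that ggplot2 is installed (the code is skipped otherwise); the mtcars dataset and the chosen variables are illustrative.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  p <- ggplot(data = mtcars,                      # Data
              mapping = aes(x = wt, y = mpg)) +   # Mapping
    geom_point(stat = "identity",                 # Geom and Stat
               position = "identity") +           # Position
    scale_x_continuous(name = "Weight (1000 lbs)") +  # Scale
    scale_y_continuous(name = "Miles per Gallon") +
    coord_cartesian() +                           # Coord
    facet_wrap(~ cyl)                             # Facet

  # print(p) would render the plot on the active graphics device.
}
```

geom_point() already defaults to the identity stat and position; they are spelled out here only to show where each layer component goes.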

Correlations Matrix

library(car)

scatterplotMatrix(h)

The scatterplotMatrix() function in the car package is built on top of the base pairs() function.
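A base-R sketch of the same idea, using pairs() directly so it runs without the car package. The slides do not define h; here it is assumed to be a data frame of numeric columns, built from mtcars for illustration.

```r
# Assumed stand-in for the undefined `h` in the slides: numeric columns only.
h <- mtcars[, c("mpg", "wt", "hp", "disp")]

# Numeric correlation matrix for the same variables.
cor_mat <- cor(h)
print(round(cor_mat, 2))

# Scatterplot matrix, drawn to a temporary PDF so no display is needed.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
pairs(h, main = "Scatterplot Matrix")
dev.off()
```

With car installed, scatterplotMatrix(h) can replace the pairs() call.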

The next data visual was produced with about 150 lines of R code

http://rcharts.io/gallery/

https://plot.ly/r/

Additional Resources

• http://statmethods.net/ : good documentation and sample code

• http://stackoverflow.com/ : helpful for troubleshooting code

• http://www.r-bloggers.com/ : helpful for hearing about new things
