Module 4 (R Training) - Basic Stats & Modeling

Upload: rohitgahlan

Post on 09-Jan-2016



IVY Professional School

Program: KPO Training
Module: Basic Statistics and Predictive Modeling
Session: 7 and 8

# Copyright Ivy Professional School - 2009-10 (All Rights Reserved)

Outline

- Descriptive Statistics
- Frequencies and Crosstabs
- Correlations
- Multiple Linear Regression
- Logistic Regression
- Time Series
- Principal Component Factor Analysis
- Cluster Analysis

Descriptive Statistics

R provides a wide range of functions for obtaining summary statistics. One method of obtaining descriptive statistics is to use the sapply() function with a specified summary statistic.

    # get means for variables in data frame mydata
    # excluding missing values
    sapply(mydata, mean, na.rm=TRUE)

Possible functions used in sapply include mean, sd, var, min, max, median, range, and quantile.

    # mean, median, 25th and 75th quartiles, min, max
    summary(mydata)

    library(Hmisc)
    describe(mydata)
    # n, nmiss, unique, mean, 5,10,25,50,75,90,95th percentiles
    # 5 lowest and 5 highest scores
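The sapply() approach can be tried on a small data frame. The sketch below is only illustrative: the contents of mydata are not shown in the slides, so the columns height, weight, and group are invented here, and restricting the call to numeric columns is an added precaution rather than something the slide does.

```r
# Hypothetical example data; the real mydata is not shown in the slides
mydata <- data.frame(
  height = c(150, 160, NA, 170, 180),
  weight = c(50, 60, 70, NA, 90),
  group  = factor(c("a", "b", "a", "b", "a"))
)

# Keep only numeric columns: mean() returns NA with a warning on factors
numeric_cols <- mydata[sapply(mydata, is.numeric)]

# One summary statistic per column, ignoring missing values
col_means  <- sapply(numeric_cols, mean, na.rm = TRUE)
col_sds    <- sapply(numeric_cols, sd, na.rm = TRUE)

# Functions that return several values give a matrix, one column per variable
col_ranges <- sapply(numeric_cols, range, na.rm = TRUE)
col_quarts <- sapply(numeric_cols, quantile, probs = c(0.25, 0.75), na.rm = TRUE)

print(col_means)
print(col_ranges)

# summary() handles numeric and factor columns in one call
print(summary(mydata))
```

Extra arguments such as na.rm = TRUE or probs are passed through sapply() to the summary function, which is why the same pattern works for mean, sd, range, and quantile.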

Frequencies

R provides many methods for creating frequency and contingency tables. Three are described below. In the following examples, assume that A, B, and C represent categorical variables.

    # 2-Way Frequency Table
    attach(mydata)
    mytable
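The transcript breaks off after mytable, so the slide's own code is not recoverable here. As a rough sketch of one common way to build a 2-way frequency table in base R, the example below uses table(), margin.table(), and prop.table(). The data values are invented, and the use of with() instead of attach() is a substitution made for this sketch, not the slide's method.

```r
# Hypothetical categorical data standing in for the slide's A and B
mydata <- data.frame(
  A = c("x", "x", "y", "y", "y", "x"),
  B = c("m", "n", "m", "m", "n", "m")
)

# with() evaluates table() inside mydata without attaching it,
# which avoids masking other objects in the workspace
mytable <- with(mydata, table(A, B))  # A gives the rows, B the columns
print(mytable)

# Marginal counts
row_totals <- margin.table(mytable, 1)  # totals over A
col_totals <- margin.table(mytable, 2)  # totals over B

# Proportions: of the whole table, and within each row
cell_props <- prop.table(mytable)
row_props  <- prop.table(mytable, 1)

print(row_totals)
print(col_totals)
print(row_props)
```

Passing 1 or 2 as the second argument selects rows or columns, so prop.table(mytable, 2) would give column percentages in the same way.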