14755847-spss.ppt

Management Development Program Data Analysis using SPSS PRESENTER MR VENKAT

Upload: mefromnepal

Post on 24-Oct-2014





TRANSCRIPT

Page 1: 14755847-spss.ppt

Management Development Program

Data Analysis using SPSS

PRESENTER: MR VENKAT

Page 2: 14755847-spss.ppt

SPSS

• Statistical

• Package for

• Social

• Sciences

Page 3: 14755847-spss.ppt

VERSIONS OF SPSS

• SPSS Ver-1 to Ver-5: DOS versions

• SPSS Ver-6 to Ver-15: Windows versions

• SPSS-X : For MAIN FRAMES (on various operating system platforms)

• SPSS-LAN: For LANs

• Web site: http://www.spss.com

Page 4: 14755847-spss.ppt

BASIC APPLICATIONS

• Creating data as Spread-sheet

• Generating Reports as Tables

• Statistical Analysis of Data

• Graphic Presentations

Page 5: 14755847-spss.ppt

MAIN STEPS IN USING SPSS

• Creating data or Getting data

• Defining data

• Modifying data

• Processing data

  – generating tables

  – statistical analysis

  – generating graphs

Page 6: 14755847-spss.ppt

Structure of SPSS data file

• Variables (Fields) in columns

• Cases (Respondents) in rows

• A case contains several variables
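The row-and-column layout above can be pictured outside SPSS as well. The following is a minimal Python sketch, not SPSS itself; the variable names (`id`, `age`, `marital`) and all values are hypothetical and chosen only for illustration.

```python
# Hypothetical survey data: each dict is one case (a row),
# and each key is one variable (a column).
cases = [
    {"id": 1, "age": 34, "marital": 2},
    {"id": 2, "age": 27, "marital": 1},
    {"id": 3, "age": 51, "marital": 4},
]

# Reading down a column: every value of one variable across all cases.
ages = [case["age"] for case in cases]

# Reading across a row: every variable recorded for one case.
first_case = cases[0]
variable_names = sorted(first_case)

print(ages)
print(variable_names)
```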

Page 7: 14755847-spss.ppt

Data Definition

• Variable Name

• Variable Type

• Field Width

• Decimal Positions

• Variable Label

• Value Labels

• Missing Values

• Column Width

• Alignment

• Scale

Page 8: 14755847-spss.ppt

Variable Name

• Maximum 8 characters (up to Ver 10)

• First character must be a letter

• Arithmetic operators, special symbols and blank spaces are not permitted

• Two variables cannot have the same name in one data file
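The naming rules on this slide can be checked mechanically. The sketch below is an illustration in Python, not an SPSS function: `name_problems` is a hypothetical helper, the allowed character set (letters, digits, underscore) is a simplifying assumption, and treating names as case-insensitive when checking duplicates is also an assumption.

```python
import string

MAX_LEN = 8  # limit quoted on the slide for older SPSS versions
# Assumption for this sketch: only letters, digits and underscore are accepted.
ALLOWED = set(string.ascii_letters + string.digits + "_")


def name_problems(name, existing=()):
    """Return the list of slide rules that a proposed variable name breaks."""
    problems = []
    if len(name) > MAX_LEN:
        problems.append("longer than 8 characters")
    if not name or name[0] not in string.ascii_letters:
        problems.append("does not start with a letter")
    if any(ch not in ALLOWED for ch in name):
        problems.append("contains an operator, symbol, or blank")
    # Assumption: duplicate check ignores upper/lower case.
    if name.lower() in {n.lower() for n in existing}:
        problems.append("duplicates an existing name")
    return problems


for candidate in ["age", "2age", "a+b", "marital status"]:
    print(candidate, "->", name_problems(candidate, existing=["id"]))
```

An empty list means the name passes every rule listed on the slide.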

Page 9: 14755847-spss.ppt

Variable Label

• It helps in reading outputs.

• No restriction on characters.

Page 10: 14755847-spss.ppt

Variable Type

• Numeric (Floating point)

• String (Character / Text)

• Date

• Currency

Page 11: 14755847-spss.ppt

Value Labels

• They help in reading tables and other outputs.

• For example, the variable “Marital Status” has five values (codes):

  – value 1 means “Never Married”

  – value 2 means “Currently Married”

  – value 3 means “Widow/Widower”

  – value 4 means “Divorced”

  – value 5 means “Separated”
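To see why value labels make output easier to read, here is a small Python sketch (an illustration outside SPSS). The code-to-label mapping comes from the slide; the list of responses in `codes` is hypothetical.

```python
from collections import Counter

# Code-to-label mapping taken from the "Marital Status" example on the slide.
MARITAL_LABELS = {
    1: "Never Married",
    2: "Currently Married",
    3: "Widow/Widower",
    4: "Divorced",
    5: "Separated",
}

# Hypothetical coded responses, stored as numbers in the data file.
codes = [2, 1, 2, 4, 2, 5, 1]

counts = Counter(codes)

# A frequency table that shows labels instead of bare codes.
table = [(MARITAL_LABELS[code], counts[code]) for code in sorted(MARITAL_LABELS)]
for label, count in table:
    print(f"{label:<18} {count}")
```

The data stay numeric; only the displayed table uses the labels.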

Page 12: 14755847-spss.ppt

Missing Values

• These are values indicating “No Response” or “Not Applicable” in any variable.

• Declaring missing values tells the SPSS package to ignore the cases containing these values during analysis.

• A blank in an Excel or dBase/FoxPro file is treated as a missing value.

• In an SPSS data file, blanks appear as dots (.) denoting that these are missing values.
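The effect of declaring missing values can be sketched in Python (an illustration outside SPSS). The codes 98 and 99 and the age values are hypothetical; `None` stands in for a blank cell.

```python
from statistics import mean

# Hypothetical declared missing codes: 98 = "Not Applicable", 99 = "No Response".
MISSING_CODES = {98, 99}

# Hypothetical ages; None stands for a blank cell in the data file.
ages = [34, 27, 99, 51, None, 98, 40]


def valid_values(values, missing_codes):
    """Keep only values that are neither blank nor a declared missing code."""
    return [v for v in values if v is not None and v not in missing_codes]


valid = valid_values(ages, MISSING_CODES)
valid_mean = mean(valid)

# Without the declaration, the codes 98 and 99 would be averaged in as real ages.
naive_mean = mean(v for v in ages if v is not None)

print(len(valid), valid_mean, round(naive_mean, 2))
```

The gap between `valid_mean` and `naive_mean` shows how undeclared codes distort a statistic.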

Page 13: 14755847-spss.ppt

Creating Data directly in SPSS

• After opening SPSS click on “file - new - data” on the menu bar.

• On getting “SPSS data editor” window, click on “variable view” (right bottom) and start defining data file i.e. variable name, variable type, variable label, value labels, missing values etc.

Page 14: 14755847-spss.ppt

MANIPULATING FILES

• Insert variable

• Sort cases

• Transpose - Interchange rows and columns

• Merge Files - Add cases, Add variables

• Aggregate

• Select cases - Select with “if” condition

• Weight cases - for estimation / projection
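Several of the operations listed above (sort cases, select cases with an "if" condition, aggregate, weight cases) can be sketched in plain Python. This is an illustration of the ideas, not SPSS syntax; the regions, incomes and weights are hypothetical.

```python
from collections import defaultdict

# Hypothetical cases with a sampling weight for estimation.
cases = [
    {"region": "North", "income": 40, "weight": 1.5},
    {"region": "South", "income": 55, "weight": 0.5},
    {"region": "North", "income": 60, "weight": 1.0},
    {"region": "South", "income": 35, "weight": 2.0},
]

# Sort cases: order the rows by one variable.
by_income = sorted(cases, key=lambda c: c["income"])

# Select cases: keep only rows that satisfy an "if" condition.
north = [c for c in cases if c["region"] == "North"]

# Aggregate: one summary row (mean income) per group.
groups = defaultdict(list)
for c in cases:
    groups[c["region"]].append(c["income"])
mean_by_region = {region: sum(v) / len(v) for region, v in groups.items()}

# Weight cases: each row counts in proportion to its weight.
total_weight = sum(c["weight"] for c in cases)
weighted_mean = sum(c["income"] * c["weight"] for c in cases) / total_weight

print(mean_by_region, weighted_mean)
```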

Page 15: 14755847-spss.ppt

VIEW

• Status Bar - process, selection, weight, n of cases

• Tool Bar - for data, syntax, chart, navigator (output)

• Fonts - type, size

• Grid Lines

• Value Labels

Page 16: 14755847-spss.ppt

Data Modifications

• Compute - create a new variable in an existing data file through an arithmetic expression.

• Recode - reorganize values of a variable.

• Rank cases

• Auto recode

• Create Time Series

• Replace missing values
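Compute and Recode are the two modifications described in most detail above. A minimal Python sketch of both follows (an illustration outside SPSS); the variables, the BMI expression and the age-group cut points are hypothetical choices.

```python
# Hypothetical cases.
cases = [
    {"height_cm": 170, "weight_kg": 65, "age": 23},
    {"height_cm": 160, "weight_kg": 72, "age": 41},
    {"height_cm": 182, "weight_kg": 90, "age": 58},
]


def recode_age(age):
    """Recode: collapse age in years into three groups (hypothetical cut points)."""
    if age < 30:
        return 1  # under 30
    if age < 50:
        return 2  # 30 to 49
    return 3      # 50 and over


for case in cases:
    # Compute: a new variable built from an arithmetic expression.
    case["bmi"] = case["weight_kg"] / (case["height_cm"] / 100) ** 2
    # Recode into a new variable, leaving the original age untouched.
    case["age_group"] = recode_age(case["age"])

for case in cases:
    print(round(case["bmi"], 1), case["age_group"])
```

Writing the recoded values to a new variable keeps the original data available for later checks.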

Page 17: 14755847-spss.ppt

STATISTICAL PROCEDURES

OLAP Cubes

• On Line Analytical Processing Cubes

• Calculates univariate summary statistics within categories of one or more categorical variables

Page 18: 14755847-spss.ppt

DESCRIPTIVE STATISTICS

– Frequencies - one variable at a time with various uni-variate statistics.

– Descriptives - uni-variate statistics.

– Explore - studying behaviour of variables.

– Crosstabs - Two-way, Three-way

– Ratio Statistics
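The difference between Frequencies (one variable at a time) and a two-way Crosstabs table can be sketched in Python. This is an illustration outside SPSS, and the gender and purchase responses are hypothetical.

```python
from collections import Counter

# Hypothetical cases: (gender, purchased) for each respondent.
cases = [
    ("M", "Yes"), ("F", "No"), ("F", "Yes"),
    ("M", "Yes"), ("F", "Yes"), ("M", "No"),
]

# Frequencies: counts for one variable at a time.
gender_freq = Counter(gender for gender, _ in cases)
purchase_freq = Counter(purchased for _, purchased in cases)

# Crosstabs (two-way): counts for each combination of the two variables.
crosstab = Counter(cases)

print("        Yes  No")
for gender in sorted(gender_freq):
    print(f"{gender:<7} {crosstab[(gender, 'Yes')]:>3} {crosstab[(gender, 'No')]:>3}")
```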

Page 19: 14755847-spss.ppt

MEANS

• Display mean & S.D. by groups.

• One sample t-test.

• Two independent sample t-test.

• Two related samples or paired samples t-test.

• One-way ANalysis Of VAriance (ANOVA) with post-hoc tests.
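As a rough sketch of what the t-test procedures compute, the Python below calculates the one-sample t statistic and shows that a paired-samples t-test is a one-sample test on the differences. The data are hypothetical, and the p-value (which requires the t distribution) is deliberately left out.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 divisor
    return (mean(sample) - mu0) / standard_error


def paired_t(before, after):
    """Paired-samples t statistic: a one-sample test on the differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0)


# Hypothetical measurements tested against a mean of 5.0.
scores = [5.1, 4.9, 5.6, 5.0, 5.4]
t_one = one_sample_t(scores, 5.0)

# Hypothetical before/after values for the same four cases.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_paired = paired_t(before, after)

print(round(t_one, 3), round(t_paired, 3))
```

Each statistic has n - 1 degrees of freedom, where n is the number of cases (or pairs).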

Page 20: 14755847-spss.ppt

LINEAR REGRESSION

• Methods: Enter, Stepwise, Remove, Backward, Forward.

• Regression Coefficients: Estimate, Standard Error, Standardized coefficients, Significance.

• Residuals: Durbin-Watson test (for autocorrelation)

• Save: Predicted values, Residuals etc.

• Plot: Histogram, Normal Probability plot.

• Others: Multicollinearity diagnosis, partial correlation, R-square change etc.

Page 21: 14755847-spss.ppt

CORRELATIONS

• Bivariate Correlations.

• Partial Correlations.

• Distances - Similarities and Dissimilarities

Page 22: 14755847-spss.ppt

CLASSIFY

• K-means Cluster

• Hierarchical Cluster

• Discriminant Analysis

Page 23: 14755847-spss.ppt

DATA REDUCTION

• Factor Analysis.

• Correspondence Analysis.

• Optimal Scaling - Homals, Princals, Overals.

Page 24: 14755847-spss.ppt

FACTOR ANALYSIS

• Methods: Principal Components, Principal Axis factoring, Maximum Likelihood etc.

• Criteria: Minimum eigenvalue, N of factors, Number of Iterations.

• Rotation: Varimax, Quartimax, Equamax, Promax, Oblimin.

• Display: Initial factor matrix, Rotated factor matrix.

• Plot: Scree plot.

Page 25: 14755847-spss.ppt

SCALES

• Reliability Analysis - Alpha, Split-half, Guttman, Parallel.

• Multi Dimensional Scaling (MDS)

Page 26: 14755847-spss.ppt

NON-PARAMETRIC TESTS

• Chi-square

• Binomial

• Runs test

• One sample K-S test

• Two independent samples tests

• Several independent samples tests

• Two related samples tests

• Several related samples tests

Page 27: 14755847-spss.ppt

TIME SERIES ANALYSIS

• Exponential Smoothing.

• Autoregression.

• Auto Regressive Integrated Moving Averages (ARIMA).

• X11ARIMA.

• Seasonal Decomposition.

Page 28: 14755847-spss.ppt

MULTIPLE RESPONSE ANALYSIS

• Defining sets.

• Frequencies

• Crosstabulation.

Page 29: 14755847-spss.ppt

CHARTS

• Bar, Line, Area, Pie, Hi-Low

• Pareto Charts, Control Charts (X-bar, R, p, c)

• Box Plot, Error Bar

• Scatter Plot, Histogram, P-P Plot, Q-Q Plot, Sequence Charts

• ROC Curve (Receiver Operating Characteristic)

• Time Series: Autocorrelations, Spectral Plots, Cross-correlations

Page 30: 14755847-spss.ppt

Types of Data

• Nominal: A variable can be treated as nominal when its values represent categories with no intrinsic ranking; for example, the department of the company in which an employee works.

• Examples of nominal variables include:

  – region

  – zip code

  – religious affiliation etc.

Page 31: 14755847-spss.ppt

Ordinal Data

• A variable can be treated as ordinal when its values represent categories with some intrinsic ranking; for example, levels of service satisfaction from highly dissatisfied to highly satisfied. Examples of ordinal variables include attitude scores representing degree of satisfaction or confidence and preference rating scores.

Page 32: 14755847-spss.ppt

Scale Data

• A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars.
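One practical consequence of the three measurement levels is the choice of summary statistic. The Python sketch below is illustrative only: it pairs the mode with nominal data, the median with ordinal data and the mean with scale data, which is a common convention rather than something stated on the slides, and all values are hypothetical.

```python
from statistics import mean, median, mode

# Nominal: categories with no ranking, so only the most frequent value is meaningful.
department = ["Sales", "HR", "Sales", "IT", "Sales"]

# Ordinal: ranked categories (1 = highly dissatisfied ... 5 = highly satisfied),
# so the middle value is meaningful but distances between codes are not.
satisfaction = [1, 2, 2, 4, 5]

# Scale: a meaningful metric, so arithmetic such as the mean is appropriate.
age_years = [25, 30, 35, 40]

typical_department = mode(department)
middle_satisfaction = median(satisfaction)
average_age = mean(age_years)

print(typical_department, middle_satisfaction, average_age)
```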

Page 33: 14755847-spss.ppt

Data Analysis

• Simple Tabulation and Cross Tabulation

• Univariate and Bivariate Analysis

• Dependent and Independent variables

• First Stage Analysis- Simple Tabulation

• Second Stage Analysis- Cross Tabulation

• The Chi-square test for cross tabulation
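The chi-square test for a cross tabulation compares observed counts with the counts expected if the two variables were independent. A minimal Python sketch of the statistic follows; the 2 x 2 table is hypothetical, and the p-value lookup is omitted.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical crosstab: rows = two customer groups, columns = bought / did not buy.
observed = [
    [30, 10],
    [20, 40],
]

statistic, df = chi_square(observed)
print(round(statistic, 3), df)
```

With 1 degree of freedom, a statistic above 3.841 is significant at the 5% level.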

Page 34: 14755847-spss.ppt

Anova and the design of Experiments

• The analysis of variance technique is used when the independent variables are of nominal scale (categorical) and the dependent variable is metric.

• The independent variable could be different price levels, different pack sizes, or different product colors, and the dependent variable could be sales of the product.
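For the price-level example, a one-way ANOVA compares the variation between group means with the variation inside the groups. The Python sketch below computes the F statistic; the sales figures for the three price levels are hypothetical, and the p-value is omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic with its degrees of freedom for a one-way ANOVA."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical sales observed at three price levels (the categorical factor).
low_price = [20, 22, 24]
mid_price = [15, 17, 19]
high_price = [10, 12, 14]

f_stat, df_between, df_within = one_way_anova_f([low_price, mid_price, high_price])
print(f_stat, df_between, df_within)
```

A large F means the price levels differ by more than the spread within each level would suggest.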

Page 35: 14755847-spss.ppt

Experimental Designs

• Completely Randomized design in a one way ANOVA (single Factor)

• Randomized Block Design (single blocking factor)

• Latin Square Design (two blocking factors)

• Factorial design with two or more factors.

Page 36: 14755847-spss.ppt

Correlation and Regression

• Correlation Analysis- to measure the degree of association between two sets of quantitative data e.g. how are sales of product A correlated with sales of product B etc.

• Regression Analysis- to explain the variation in one variable based on the variation in one or more variables.
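The degree of association mentioned above is usually measured by the Pearson correlation coefficient. A minimal Python sketch follows; the sales figures for products A and B are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical sales of product A and product B over five periods.
sales_a = [1, 2, 3, 4, 5]
sales_b = [2, 4, 5, 4, 5]

r = pearson_r(sales_a, sales_b)
print(round(r, 4))
```

Values near +1 or -1 indicate a strong linear association; values near 0 indicate a weak one.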

Page 37: 14755847-spss.ppt

Regression

• Basically two approaches:

  1. Hit and trial approach (stepwise regression): exploratory research

  2. A preconceived approach

• The output consists of the beta coefficients for all the independent variables in the model. The output also gives the result of a t-test for the significance of each variable in the model, and the result of an F-test for the model as a whole.

• The coefficient of determination R² is the proportion of the total variance in y explained by all independent variables in the regression equation.
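To make R² and the overall F-test concrete, here is a Python sketch for the simplest case of one independent variable (the slides concern multiple regression, which needs matrix algebra and is not shown). The data are hypothetical and the function names are illustrative.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y, intercept, slope):
    """Share of the total variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total


# Hypothetical data: x = number of dealers, y = sales.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
r2 = r_squared(x, y, intercept, slope)

# Overall F statistic with k predictors and n cases.
k, n = 1, len(x)
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))

print(round(intercept, 2), round(slope, 2), round(r2, 2), round(f_stat, 2))
```

Here an R² of 0.6 means the line accounts for 60% of the variation in y around its mean.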

Page 38: 14755847-spss.ppt

Problem

• A manufacturer and marketer of electric motors would like to build a regression model consisting of 5 or 6 independent variables, to predict sales. Past data has been collected for 15 sales territories, on sales and 6 independent variables. Build a regression model and recommend whether or not it should be used by the company.

Page 39: 14755847-spss.ppt

Dependent variable

Y = Sales in Rs. Lakh in the territory

Independent variables

X1 = Market potential in the territory

X2 = No. of dealers of the company in the territory

X3 = No. of sales people in the territory

X4 = Index of competitor activity on a 5 point scale (1 = low, 5 = high)

X5 = No. of service people in the territory

X6 = No. of existing customers in the territory

Page 40: 14755847-spss.ppt

Factor Analysis

• For Data reduction

• There are two stages in Factor analysis

• Factor Extraction process

• Rotation of principal components

Page 41: 14755847-spss.ppt

ANY QUESTIONS?

Page 42: 14755847-spss.ppt

THANK YOU