How to Analyze Data? Aravinda Guntupalli

Upload: gervais-griffith

Post on 25-Dec-2015


Page 1: How to Analyze Data? Aravinda Guntupalli. SPSS windows process Data window Variable view window Output window Chart editor window

How to Analyze Data?

Aravinda Guntupalli

SPSS windows process

Data window
Variable view window
Output window
Chart editor window

How to use different file types?

Excel file
CSV file
SPSS file

Types of variables

You can select the type of variable: String or Numeric.

You can also select the level of measurement of the variable: Categorical, Ordinal or Interval.

Why does it matter?

Statistical computations and analyses assume that the variables have specific levels of measurement.

Can you compute the average of hair color?

Does it make sense to compute the average of educational experience?

An average requires a variable to be interval.
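A minimal Python sketch of the point above, using invented sample values (the variable names and data are assumptions, not taken from the slides). The mean is meaningful for an interval variable, while for a categorical variable only a count-based summary such as the mode makes sense.

```python
from statistics import mean, mode

# Hypothetical sample values, invented for illustration only.
hair_color = ["brown", "black", "brown", "blond"]  # categorical
age_years = [23, 35, 41, 29]                       # interval

average_age = mean(age_years)    # an average is meaningful here
typical_hair = mode(hair_color)  # most frequent category

# Asking for the average of category labels fails: there is nothing to add up.
try:
    mean(hair_color)
    hair_mean_defined = True
except TypeError:
    hair_mean_defined = False
```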

Stock and flow variables

In data analysis it is useful to distinguish between stock and flow variables.

Stock variables are measured at a point in time and flow variables are measured over a period in time.

Cross-section data make comparisons at a given point or in a given period in time, while time-series data depict evolution over time.

Manipulate existing data

Compute new variable

You can calculate different variables from the existing variables.

For this you need to know the way to compute your target variable from the existing variables.

You can perform operations like addition, subtraction, division and multiplication of variables to create a new variable.

Example

Total output of food grains (addition of rice, wheat, maize and other grain output)

Income difference between males and females (male income – female income)

Age square variable (age*age)

GDP per capita (Total GDP/Population)
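The four examples can be sketched in Python as simple arithmetic on existing variables. The field names and numbers below are hypothetical and only illustrate the computations; in SPSS the same expressions would be entered in the Compute dialog.

```python
# Hypothetical record; field names and values are invented for illustration.
rows = [
    {"rice": 120.0, "wheat": 80.0, "maize": 30.0, "other": 20.0,
     "male_income": 2500.0, "female_income": 2100.0,
     "age": 40, "gdp": 5.0e9, "population": 2.0e6},
]

for row in rows:
    # Addition: total output of food grains.
    row["total_grain"] = row["rice"] + row["wheat"] + row["maize"] + row["other"]
    # Subtraction: income difference between males and females.
    row["income_diff"] = row["male_income"] - row["female_income"]
    # Multiplication: age squared.
    row["age_sq"] = row["age"] * row["age"]
    # Division: GDP per capita.
    row["gdp_per_capita"] = row["gdp"] / row["population"]
```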

Recode variable

Using SPSS you can recode a variable into the same variable. How?

Suppose we have data on mothers' years of education, from 0 to 22 years, and we need to do the analysis using only 3 categories: mothers who did not complete high school, mothers who completed high school, and mothers who completed college. How will you do this?

How to perform this?

Go to the Transform pull-down menu, then to Recode, then to Recode into same variable (if you want to replace the existing information).

Select education and move it into the numeric variable list.

Define values by clicking Old and new values.

Enter the range 0-11 as 1, 12-15 as 2 and 16-22 as 3.
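The same recoding rule can be written as a small Python function. This is only a sketch of the mapping described above; the function name is hypothetical, and returning None for values outside 0-22 is an assumption about how out-of-range values should be handled.

```python
def recode_education(years):
    """Map years of education (0-22) to the three categories used above."""
    if 0 <= years <= 11:
        return 1  # did not complete high school
    if 12 <= years <= 15:
        return 2  # completed high school
    if 16 <= years <= 22:
        return 3  # completed college
    return None   # assumption: values outside 0-22 are treated as missing

education = [0, 11, 12, 15, 16, 22]
education_recoded = [recode_education(y) for y in education]
```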

How to make a new data set?

We will now create a data set on our own.

Cross-sectional, Panel, Time series

Types of variables: String, Numeric

Replace missing values

Missing observations can be problematic in analysis, and some time series measures cannot be computed if there are missing values in the series.

Replace Missing Values creates new time series variables from existing ones, replacing missing values with estimates computed with one of several methods.

Also…

Default new variable names are the first six characters of the existing variable used to create them, followed by an underscore and a sequential number.

For example, for the variable PRICE, the new variable name would be PRICE_1. The new variables retain any defined value labels from the original variables.

Optionally, you can enter variable names to override the default new variable names.

To Replace Missing Values for Time Series Variables

From the pull down menu choose: Transform and then Replace Missing Values

You can then select the estimation method you want to use to replace missing values.

Select the variable for which you want to replace missing values.

Also you can enter variable names to override the default new variable names.
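As an illustration of what replacing missing values does, here is a minimal Python sketch that assumes the series mean is chosen as the estimation method (one of the methods SPSS offers). The data are invented, and the result is stored in a new list named after the PRICE_1 example, leaving the original series unchanged.

```python
from statistics import mean

def replace_missing_with_series_mean(series):
    """Return a new list where None entries are replaced by the mean
    of the observed values; the input list is left unchanged."""
    observed = [v for v in series if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in series]

# Hypothetical time series with two missing observations (None).
price = [10.0, None, 12.0, 14.0, None]
price_1 = replace_missing_with_series_mean(price)
```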

Graphs

Boxplot

A boxplot consists of a box and two tails.

The horizontal line inside the box marks the position of the median, and the upper and lower boundaries of the box are the upper and lower quartiles.

The tails run to the most extreme values.

In sum, a boxplot shows the structure of the data along with its skewness and spread.

Drawing a boxplot

Question: We have recorded the heights in cm of boys in a class as shown below. We will draw a boxplot for this data.

137, 148, 155, 158, 165, 166, 166, 171, 171, 173, 175, 180, 184, 186, 186

Lower Quartile (QL) = 158

Median (Q2) = 171

Upper Quartile (QU) = 180

[Figure: boxplot of the heights drawn on a scale from 130 to 190 cm]
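The three values marked on this boxplot can be checked with a short Python sketch. It assumes the "median of each half" convention for quartiles (the median itself is left out of both halves when the number of cases is odd); other conventions, including the one SPSS applies, can give slightly different quartiles on other data.

```python
from statistics import median

heights = [137, 148, 155, 158, 165, 166, 166, 171,
           171, 173, 175, 180, 184, 186, 186]

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using the
    median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Skip the middle value when n is odd.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

q_lower, q_median, q_upper = quartiles(heights)
```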

Boxplot

[Figure: boxplots of reading score (vertical axis, 20 to 80) by SES: low (N = 47), middle (N = 95), high (N = 58)]

How to make a boxplot?

From the menus, choose: Graphs and Boxplot.

Select the icon for Simple and select Summaries for groups of cases.

Select Define.

Select the variable for which you want boxplots, and move it into the Variable box.

Select a variable for the category axis and move it into the Category Axis box. This variable may be numeric, string, or long string.

Histogram

A Histogram is a graphical representation of a frequency distribution for continuous data.

The height of each bar is proportional to the frequency of that class.
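The frequencies behind a histogram can be sketched in Python by counting how many values fall into each class. The scores, the starting point and the class width below are invented for illustration; each key of the result is the lower bound of a class and each value is the bar height.

```python
from collections import Counter

def histogram_counts(values, start, width):
    """Count values per class [lower, lower + width), keyed by lower bound."""
    counts = Counter()
    for v in values:
        k = int((v - start) // width)  # index of the class containing v
        counts[start + k * width] += 1
    return dict(sorted(counts.items()))

# Hypothetical scores, grouped into classes of width 10 starting at 40.
scores = [41, 44, 47, 52, 53, 55, 58, 61, 63, 72]
counts = histogram_counts(scores, start=40, width=10)
```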

Histogram (2)

[Figure: histogram of math score; horizontal axis from 32.5 to 75.0, frequency axis from 0 to 30. Std. Dev = 9.37, Mean = 52.6, N = 200]

How to make a histogram?

From the menus, choose: Graphs and Histogram

Select a numeric variable for Variable in the Histogram dialog.

Select Display normal curve to display a normal curve on the histogram.

Scatter plot (1)

To examine the relationship between two quantitative variables, we can use scatter plots.

A scatter diagram plots the value of one economic variable against the value of another variable.

It can be used to reveal whether a relationship exists and the type of relationship that exists.

A scatter plot can describe the relation between reading and writing scores.

Scatter plot (2)

[Figure: scatter plot of reading score (vertical axis, 20 to 80) against writing score (horizontal axis, 30 to 70)]

Typical Patterns

Positive linear relationship
Negative linear relationship
No relationship
Negative nonlinear relationship
Nonlinear (concave) relationship

How to make scatter plots?

From the menus, choose: Graphs and Scatter.

Select the icon for Simple.

Select Define.

You must select a variable for the Y-axis and a variable for the X-axis. These variables must be numeric, but should not be in date format.

You can select a variable and move it into the Set Markers by box. This variable may be numeric or string.

Descriptive statistics

Descriptive statistics

It tells you how many valid cases you have for the data, along with the mean and standard deviation.

You can learn about the distribution of a variable using this command in SPSS.

How to do this?

Choose Analyse, then Descriptive statistics, then Frequencies/Descriptives/Explore/Crosstabs.

Select the variables. Using the shift or ctrl key you can select multiple variables.
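The quantities reported by this command can be reproduced in a few lines of Python. This is a sketch with invented data in which None marks a missing case; it reports the number of valid cases, the mean, and the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev

def describe(values):
    """Summarise a variable, ignoring missing cases (None)."""
    valid = [v for v in values if v is not None]
    return {
        "n_valid": len(valid),
        "n_missing": len(values) - len(valid),
        "mean": mean(valid),
        "std_dev": stdev(valid),  # sample standard deviation
    }

# Hypothetical scores with one missing case.
scores = [50, 60, None, 70, 80]
summary = describe(scores)
```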

Correlation and regression

What is Correlation?

Research question: What is the relation between two variables?

Correlation is a measure of the direction and degree of linear association between 2 variables
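A minimal sketch of how the Pearson correlation coefficient r is computed, written in plain Python. The two short series are invented for illustration; the sign of r gives the direction of the linear association and its size gives the degree.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)                             # positive association
r_reversed = pearson_r(x, [5, 4, 3, 2, 1])      # perfect negative association
```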

Interpreting Correlation

Strength       r
very weak      0 - .19
weak           .20 - .39
moderate       .40 - .59
strong         .60 - .79
very strong    .80 - 1.00
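The table can be turned into a small lookup function. The sketch below assumes the bands apply to the absolute value of r, so that negative correlations are graded the same way, and it treats each band as running up to the start of the next one; the function name is hypothetical.

```python
def correlation_strength(r):
    """Label the strength of a correlation using the bands in the table."""
    size = abs(r)  # assumption: the sign only indicates direction
    if size > 1:
        raise ValueError("a correlation must lie between -1 and 1")
    if size < 0.20:
        return "very weak"
    if size < 0.40:
        return "weak"
    if size < 0.60:
        return "moderate"
    if size < 0.80:
        return "strong"
    return "very strong"
```

For example, correlation_strength(0.397) returns "weak".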

Relation between hourly pay and age

R Square values indicate the proportion of variance in the dependent variable (y) accounted for by variation in the independent variable (x).

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .397a   .158       .158                3.59608

a. Predictors: (Constant), Age last birthday
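With a single predictor, R Square is simply the square of R. The short check below uses the R value from the Model Summary and shows that age accounts for roughly 16% of the variance in hourly pay.

```python
r = 0.397                  # R from the Model Summary table
r_square = r ** 2          # proportion of variance explained
share_explained_pct = round(100 * r_square, 1)
```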

Regression coefficients

Coefficients (a)

Model 1              B       Std. Error   Beta    t        Sig.
(Constant)           1.336   .130                 10.314   .000
Age last birthday    .231    .004         .397    53.500   .000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

a. Dependent Variable: Gross hourly pay (£)

hourly pay = 1.336 + .231 x age + error
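The fitted equation can be used to predict hourly pay for a given age. The sketch below plugs the unstandardized coefficients from the table into a small Python function; the function name and the example age are illustrative only, and the error term is left out because a prediction uses the fitted line.

```python
INTERCEPT = 1.336   # (Constant), B column
SLOPE_AGE = 0.231   # Age last birthday, B column

def predicted_hourly_pay(age):
    """Predicted gross hourly pay (£) for a given age."""
    return INTERCEPT + SLOPE_AGE * age

pay_at_30 = predicted_hourly_pay(30)   # about 8.27
```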

Multivariate Regression Analysis

When do we use Multivariate Regression Analysis?

To find the relationship between more than two variables:

y = b0 + b1*x1 + b2*x2 + e

hours worked (y), education (x1), income (x2)

Simultaneous regression

hourly pay (£) = -7.827 + .540*education + .217*age

Coefficients (a)

Model 1                                         B        Std. Error   Beta    t         Sig.
(Constant)                                      -7.827   .253                 -30.988   .000
Age last birthday                               .217     .005         .343    46.457    .000
Age completed continuous full-time education    .540     .011         .355    48.123    .000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

a. Dependent Variable: Gross hourly pay (£)
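A prediction from the two-predictor model works the same way as in the simple case. The sketch below uses the B values from the coefficients table; note that the education variable is the age at which full-time education was completed, not the number of years of schooling. The function name and the example values are illustrative only.

```python
INTERCEPT = -7.827      # (Constant), B column
SLOPE_AGE = 0.217       # Age last birthday, B column
SLOPE_EDUCATION = 0.540 # Age completed continuous full-time education, B column

def predicted_hourly_pay(age, education_age):
    """Predicted gross hourly pay (£) from age and the age at which
    full-time education was completed."""
    return INTERCEPT + SLOPE_AGE * age + SLOPE_EDUCATION * education_age

pay = predicted_hourly_pay(age=30, education_age=18)   # about 8.40
```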

What if… we have a dichotomous dependent variable?

Use a dummy dependent variable regression model:

Logistic regression model

Unlike simple linear regression and multiple regression, in logistic regression the dependent variable is dichotomous (i.e. 0 or 1).

In logistic regression more than one independent variable can be used.
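A minimal sketch of how a logistic regression model turns predictors into a probability that the dependent variable equals 1. The coefficients and predictor values below are hypothetical and were not estimated from any data; the point is only that the linear combination is passed through the logistic function, so the result always lies between 0 and 1.

```python
from math import exp

def logistic_probability(intercept, coefficients, values):
    """P(y = 1) for a logistic model with one or more predictors."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + exp(-z))

# Hypothetical coefficients for two predictors (not estimated from data).
p = logistic_probability(-4.0, [0.1, 0.2], [30, 5])   # z = 0, so p = 0.5
p_high = logistic_probability(-4.0, [0.1, 0.2], [60, 5])
```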

Thank You