Introduction to Quantitative Data Analysis

Upload: juliana-clarke

Post on 04-Jan-2016


Page 1: Introduction to Quantitative Data Analysis. Quantitative Data Analysis n Types of Statistics u Descriptive u Inferential—probabilistic sampling techniques,

Introduction to Quantitative Data Analysis

Page 2:

Quantitative Data Analysis

Types of Statistics
- Descriptive
- Inferential: probabilistic sampling techniques, notion of random

Data Preparation (Coding & Cleaning Data)

Common Ways of Presenting Statistics
- Tables
- Charts
- Graphs

Page 3:

Presenting Data (Raw Data)

Regan, T. (1985). In search of sobriety: Identifying factors contributing to the recovery from alcoholism. Kentville, NS.

Page 4:

Simple Univariate Tables of Frequency Distributions and Percentages

- Univariate: one variable
- “Raw count” (frequencies, percentages)

Neuman (2000: 318)

Page 5:

Revision of Example: Collapsing Categories and Treatment of Missing Data in Tables

Johnson, A. G. (1977). Social Statistics Without Tears. Toronto: McGraw Hill.

Example: Raw Data Frequencies

Page 6:

Types of Missing Data

Examples: non-response, don’t know, refusal, etc.

Categories of missing data
- Missing data completely at random (MCAR): equipment malfunction, illness, etc.
- Missing data at random: can be explained by controlling for another variable
- Missing data that is not random

Page 7:

Some techniques for dealing with missing data

- Omission (may involve using statistical techniques or logic to decide whom to omit, e.g., add all like cases based on other responses)
- Imputation (guess at what the likely responses would be by comparing with other response patterns)
  - Match other characteristics
  - Distribute equally or use weighted responses

Page 8:

Treatment of Missing Data (Omission vs. Inclusion)

Comparison of % distributions with and without non-respondents

Table 5-1 Alienation of Workers

Level of Alienation      F     %
High                    30    14
Medium                 100    48
Low                     20    10
No Response             60    29
(Total)                210   100

Table 5-1 Alienation of Workers

Level of Alienation      F     %
High                    30    20
Medium                 100    67
Low                     20    13
(Total)                150   100
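The two versions of Table 5-1 differ only in the base used for the percentages. A minimal Python sketch of that calculation follows; the function name `percent_distribution` and the `exclude` parameter are illustrative choices, not part of the original slides.

```python
def percent_distribution(freqs, exclude=()):
    """Return {category: rounded percent}, using only non-excluded categories as the base."""
    kept = {cat: f for cat, f in freqs.items() if cat not in exclude}
    total = sum(kept.values())
    return {cat: round(100 * f / total) for cat, f in kept.items()}


# Frequencies taken from Table 5-1 on the slide.
alienation = {"High": 30, "Medium": 100, "Low": 20, "No Response": 60}

with_nr = percent_distribution(alienation)                               # base = 210
without_nr = percent_distribution(alienation, exclude=("No Response",))  # base = 150

print(with_nr)
print(without_nr)
```

Dropping the 60 non-respondents shrinks the base from 210 to 150, so every remaining category's share rises even though no frequency changed.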

Page 9:

Treatment of Missing Data & collapsing categories (creating new variables after data collection)

Comparison with high & medium alienation collapsed

Table 5-1 Alienation of Workers (non-respondents included)

Level of Alienation      F     %
High & Medium          130    62
Low                     20    10
No Response             60    29
(Total)                210   100

Table 5-1 Alienation of Workers (non-respondents eliminated)

Level of Alienation      F     %
High & Medium          130    87
Low                     20    13
(Total)                150   100

Page 10:

Treatment of Missing Data

Comparison with medium & low collapsed

Table 5-1 Alienation of Workers (non-respondents included)

Level of Alienation      F     %
High                    30    14
Medium & Low           120    57
No Response             60    29
(Total)                210   100

Table 5-1 Alienation of Workers (non-respondents eliminated)

Level of Alienation      F     %
High                    30    20
Medium & Low           120    80
(Total)                150   100

Page 11:

Effects of Collapsing Response Categories

Comparison of two different ways of collapsing response categories

Table 5-1 Alienation of Workers

Level of Alienation      F     %
High & Medium          130    87
Low                     20    13
(Total)                150   100

Table 5-1 Alienation of Workers

Level of Alienation      F     %
High                    30    20
Medium & Low           120    80
(Total)                150   100

Page 12:

Collapsing categories (U.N. example)

Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Page 13:

Collapsing Categories & omitting missing data

Babbie, E. (1995). The practice of social research. Belmont, CA: Wadsworth.

Page 14:

Grouping Response Categories

- To make new categories
- Facilitate analysis of trends
- But decisions have effects on the interpretation of patterns
- Importance of understanding logic, conceptual and operational definitions
- Same data can produce totally different-looking results
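The point that the same data can look very different after grouping can be checked with the alienation frequencies from Table 5-1 (non-respondents already omitted). The sketch below is only illustrative; the helper names `collapse` and `percents` are assumptions, not something the slides define.

```python
def collapse(freqs, mapping):
    """Sum frequencies into new categories; unmapped categories keep their name."""
    out = {}
    for cat, f in freqs.items():
        new_cat = mapping.get(cat, cat)
        out[new_cat] = out.get(new_cat, 0) + f
    return out


def percents(freqs):
    """Rounded percentage distribution of a frequency table."""
    total = sum(freqs.values())
    return {cat: round(100 * f / total) for cat, f in freqs.items()}


# Respondents only (Table 5-1 with the 60 non-responses omitted).
respondents = {"High": 30, "Medium": 100, "Low": 20}

high_medium = collapse(respondents, {"High": "High & Medium", "Medium": "High & Medium"})
medium_low = collapse(respondents, {"Medium": "Medium & Low", "Low": "Medium & Low"})

print(percents(high_medium))  # grouping Medium with High
print(percents(medium_low))   # grouping Medium with Low
```

Grouping Medium with High suggests that most workers are alienated (87%), while grouping it with Low suggests that most are not (80%). The analyst's grouping decision drives the impression.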

Page 15:

Bivariate Tables (Cross Tabulations): Tables Presenting Relationship between Two Variables

Singleton, R., Straits, B. & Straits, M. (1993). Approaches to social research. Toronto: Oxford.

Page 16:

Expected outcomes (Null Hypothesis)

Singleton, R., Straits, B. & Straits, M. (1993). Approaches to social research. Toronto: Oxford.

Page 17:

Interpretation issues (Bivariate Tables)

- Percentages within categories of attributes of independent variable
- In example:
  - Independent variable: gender
  - Dependent variable: fear of walking alone at night
  - Women more afraid than men

Page 18:

Styles of Presentation of Percentaged Tables (Bivariate)

Table 1. Percentage in support of strike by type of school

Type of School    Percent supporting strike
Secondary         60% (800)
Elementary        30% (1000)
__________________________________________________________
ε = .30    N = 1800

Labels on the slide:
- Serial number ("Table 1.")
- Descriptive caption
- Dependent variable (percent supporting strike)
- Independent variable (type of school)
- Variable categories (Secondary, Elementary)
- One category of dichotomous dependent variable
- Marginals for independent variable (800, 1000)
- Percentage difference (epsilon)
- Total sample (N)
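The percentage difference (epsilon) in Table 1 is the gap between the two category percentages. A minimal sketch follows; the supporter counts (480 and 300) are not on the slide and are derived here from the reported percentages and bases, so treat them as an assumption.

```python
# Supporter counts are derived (60% of 800, 30% of 1000), not stated on the slide.
table = {
    "Secondary": {"supporters": 480, "n": 800},
    "Elementary": {"supporters": 300, "n": 1000},
}

# Proportion supporting the strike within each category of the independent variable.
support = {school: row["supporters"] / row["n"] for school, row in table.items()}

# Epsilon: difference between the two proportions.
epsilon = support["Secondary"] - support["Elementary"]

# Total sample is the sum of the marginals.
total_n = sum(row["n"] for row in table.values())

print(f"epsilon = {epsilon:.2f}, N = {total_n}")
```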

Page 19:

Factors to consider when reading a table

- Sampling technique? Or total population?
- Conceptual & operational definitions (validity & reliability issues)
- What measure was used? How was it used?
- Data preparation and cleaning issues (treatment of inconsistencies, non-responses, etc.)
- Data analysis issues

Page 20:

Other Ways of Presenting Same Data & Interpretation Issues

- Deciding on direction of calculation of percentages?
- Depends on objectives (research questions), for example:
  - Are we interested in the patterns within each school type?
  - Are we interested in overall support of strike?
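The choice of percentaging direction can be made concrete with the strike data. In this sketch the cell counts are derived from the percentages and bases reported in Table 1 (60% of 800, 30% of 1000), and the label `not_support` is an assumed name for everyone who did not support the strike.

```python
# Cell counts derived from Table 1; "not_support" is an assumed label.
counts = {
    "Secondary": {"support": 480, "not_support": 320},
    "Elementary": {"support": 300, "not_support": 700},
}

# Direction 1: percentage within each school type (pattern within categories).
within_school = {
    school: round(100 * cells["support"] / sum(cells.values()))
    for school, cells in counts.items()
}

# Direction 2: overall support across the whole sample.
total_support = sum(cells["support"] for cells in counts.values())
grand_total = sum(sum(cells.values()) for cells in counts.values())
overall = round(100 * total_support / grand_total, 1)

# Direction 3: composition of the supporters by school type.
among_supporters = {
    school: round(100 * cells["support"] / total_support, 1)
    for school, cells in counts.items()
}

print(within_school, overall, among_supporters)
```

Each direction answers a different research question: how supportive each school type is, how supportive teachers are overall, or which school type most supporters come from.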

Page 21:

Other Ways of Presenting Bivariate Relationships in tabular form (e.g., ratios)

Page 22:

Control variables: Trivariate Tables
Men/Women Drivers

In Say it with Figures, Hans Zeisel presents the following data:

Automobile Accidents by Sex
------------------------------------------
           Per Cent Accident Free
Women      68% (6,950)
Men        56% (7,080)
------------------------------------------

Automobile Accidents by Sex and Distance Driven
----------------------------------------------------------------------------
                              Distance
           Under 10,000 km             Over 10,000 km
           Per Cent Accident Free      Per Cent Accident Free
Women      75% (5,035)                 48% (1,915)
Men        75% (2,070)                 48% (5,010)
----------------------------------------------------------------------------

Women have fewer accidents than men because women tend to drive less frequently than men do, and people who drive less frequently tend to have fewer accidents.
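The bivariate percentages (68% and 56%) can be recovered from the trivariate table as weighted averages of the within-distance rates. A small sketch, using only the figures on the slide; the variable and function names are illustrative.

```python
# (accident-free rate, number of drivers) for each sex and distance category.
drivers = {
    "Women": {"under_10000_km": (0.75, 5035), "over_10000_km": (0.48, 1915)},
    "Men": {"under_10000_km": (0.75, 2070), "over_10000_km": (0.48, 5010)},
}


def overall_rate(groups):
    """Weighted average of the group rates, weighted by group size."""
    total = sum(n for _, n in groups.values())
    accident_free = sum(rate * n for rate, n in groups.values())
    return accident_free / total


overall = {sex: round(100 * overall_rate(groups)) for sex, groups in drivers.items()}
print(overall)
```

Within each distance category the rates for women and men are identical. The overall gap arises only because most women fall in the under-10,000 km group and most men in the over-10,000 km group.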

Page 23:

Another Way to Present Percentaged Tables (Trivariate)

Table 2. Percentage who support strike by type of school and sex

                                    Sex
                  Female                        Male
Type of School    Per cent supporting strike    Per cent supporting strike
Secondary         60% (400)                     60% (400)
Elementary        30% (900)                     30% (100)
__________________________________________________________
ε Female = .30 : ε Male = .30    N = 1800

Labels on the slide:
- Dependent variable
- Independent variable
- Control variable
- Categories of control variable

Page 24:

Common Types of Charts & Graphs

- Bar charts
- Histograms
- Pie charts
- Line graphs/polygons
- Scattergrams

Page 25:

Bar Chart

- Parallel bars or rectangles with lengths proportional to the frequency with which specified quantities occur in a set of data
- Graphic representation of a frequency distribution, generally used for discrete data

Page 26:

A Bar Chart (flat: best for 2-dimensional data)

Page 27:

Bar Chart with break

World Population Growth Showing Projections (Time to add billions)

Page 28:

Histograms

- Graphically represent grouped data of a frequency distribution
- Baseline typically depicts the classes, and the vertical scale represents the frequencies or percentages
- For continuous data

Example

A survey of people between the ages of 18 and 74 to determine the number of bike users, categorized by age group.

Q. Which age group do you belong to?
- 18 to 24
- 25 to 34
- 35 to 44
- 45 to 54
- 55 to 64
- 65 to 74
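Grouping raw ages into the survey's classes is the step that produces the frequencies a histogram plots. The sketch below assumes hypothetical respondent ages (the slides give no raw data) and prints a rough text histogram.

```python
from bisect import bisect_right
from collections import Counter

# Class lower bounds and labels follow the survey question on the slide.
LOWER_BOUNDS = [18, 25, 35, 45, 55, 65]
LABELS = ["18 to 24", "25 to 34", "35 to 44", "45 to 54", "55 to 64", "65 to 74"]


def age_group(age):
    """Return the class label for an age between 18 and 74 inclusive."""
    if not 18 <= age <= 74:
        raise ValueError(f"age {age} is outside the surveyed range 18-74")
    return LABELS[bisect_right(LOWER_BOUNDS, age) - 1]


# Hypothetical ages, for illustration only.
ages = [18, 24, 25, 31, 40, 44, 52, 67, 74]
freq = Counter(age_group(a) for a in ages)

# One row per class, one '#' per respondent.
for label in LABELS:
    print(f"{label} | {'#' * freq[label]}")
```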

Page 29:

Histogram

Page 30:

Pie Chart

- Circular chart divided into sectors, illustrating relative magnitudes or frequencies
- Arc length of each sector (and consequently its central angle and area) is proportional to the quantity it represents
- Sectors create a full disk

Example: 2004 Election Results of EU (link to source & data)
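Because each sector's central angle is proportional to the quantity it represents, the angle is that quantity's share of the total multiplied by 360 degrees. A minimal sketch with hypothetical values (the election data behind the slide's chart is not in the transcript):

```python
def sector_angles(values):
    """Central angle in degrees for each category, proportional to its value."""
    total = sum(values.values())
    return {cat: 360 * v / total for cat, v in values.items()}


# Hypothetical seat counts, for illustration only.
seats = {"Party A": 50, "Party B": 30, "Party C": 20}
angles = sector_angles(seats)

print(angles)
print(sum(angles.values()))  # the sectors together make a full disk
```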

Page 31:

Exploded pie chart

- One or more sectors separated from the rest of the disk

Example: 2004 Election Results of EU

Page 32:

Presentation of identical data in pie and bar charts

Problem with pie charts: bar charts are easier to compare visually, and differences in proportions are easier to see

Page 33:

Line and Scatter Charts (Graph)

- Starts with mapping quantitative data points
- Usually a dot or a small circle represents a single data point
- One mark (point) for every data point
- Visual distribution of the data
- When both variables are quantitative, the line segment that connects two points on the chart expresses a slope
- Slope can be visually interpreted relative to the slope of other lines
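The slope of a line segment is the change in y divided by the change in x between its two endpoints. A short sketch with hypothetical points; the function name is illustrative.

```python
def slope(p1, p2):
    """Slope of the segment joining two (x, y) points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical segment: slope is undefined")
    return (y2 - y1) / (x2 - x1)


# Hypothetical data points, for illustration only.
steep = slope((1, 2), (3, 6))    # rises 4 units over a run of 2
shallow = slope((1, 2), (3, 3))  # rises 1 unit over a run of 2

print(steep, shallow)
```

Comparing the two values mirrors the visual comparison on a chart: the first segment climbs four times as fast as the second.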

Page 34:

Example of Frequency Distribution Table from Textbook

Page 35:

Frequency Polygon Showing Same Data (Graph Plotting Frequency Distribution)

Page 36:

Common types of Distributions

- Normal Distribution (bell-shaped curve)
- Skewed Distributions
- Bi-Modal Distributions

Page 37:

Normal Distribution

Neuman (2000: 319)

Page 38:

Skewed Distributions

Neuman (2000: 319)

Page 39:

Multiple Line charts

Page 40:

Multi-symbol Line chart

Page 41:

Combining Quantitative & Qualitative Info. in Graphs: Temperatures during Napoleon’s March (E. Tufte)

Page 42:

Line Chart (Poor example)

Example of a bad choice of graphic representation
- Data are discrete
- Connecting dots does not make sense because measures of colours are nominal here

Page 43:

Scattergrams

Page 44:

Design & Interpretation Issues: Choice of Scales

Same data presented using different scales for the x and y axes

Page 45:

Core Notions in Basic Univariate Statistics

Ways of describing data about one variable (“uni” = one)
- Measures of central tendency: summarize information about one variable (“averages”)
- Measures of dispersion: variations or “spread”

Page 46:

Measures of Central Tendency

- Summarize information about one variable in a single number
  - Mode
  - Median
  - Mean
- Use of measures of central tendency
  - To summarize common “overall”, “centralized” trends
  - Doesn’t show variability, spread, dispersion

Page 47:

Mode

Most common or frequently occurring case (for all types of data)

Babbie (1995: 378)

Page 48:

Median

Middle point (only for ordinal, interval or ratio data)

Babbie (1995: 378)

Page 49:

Mean (arithmetic mean)

“Average” = sum of values divided by number of cases (only for ratio and interval data)

Babbie (1995: 378)
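The three measures can be computed with Python's standard `statistics` module. The data below are hypothetical and serve only to illustrate the definitions on the preceding slides.

```python
from statistics import mean, median, mode

# Hypothetical ratio-level scores, for illustration only.
scores = [2, 3, 3, 4, 5, 7, 11]

score_mode = mode(scores)      # most frequently occurring value
score_median = median(scores)  # middle value of the sorted cases
score_mean = mean(scores)      # sum of values divided by number of cases

# The mode is the only one of the three that works for nominal data.
colours = ["red", "blue", "red", "green"]
colour_mode = mode(colours)

print(score_mode, score_median, score_mean, colour_mode)
```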

Page 50:

Normal Distribution & Measures of Central Tendency

Neuman (2000: 319)

Page 51:

Skewed Distributions & Measures of Central Tendency

Neuman (2000: 319)
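The figures for these two slides are not reproduced in the transcript. As a rough illustration of the usual pattern, the sketch below uses hypothetical data: in a symmetric, single-peaked distribution the mode, median and mean coincide, while in a skewed one the mean is typically pulled toward the long tail.

```python
from statistics import mean, median, mode

# Hypothetical symmetric data: the three measures coincide.
symmetric = [1, 2, 3, 3, 3, 4, 5]
sym_measures = (mode(symmetric), median(symmetric), mean(symmetric))

# Hypothetical right-skewed data: one extreme case drags the mean upward.
skewed = [20, 22, 25, 27, 30, 32, 200]
skew_median = median(skewed)
skew_mean = mean(skewed)

print(sym_measures)
print(skew_median, round(skew_mean, 1))
```

In the skewed example the median stays near the bulk of the cases while the mean sits well above it, which is why the median is often preferred for summarizing skewed variables.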