)chi-square( parametric versus nonparametric · 2019-11-18 · advantages of parametric techniques...

Post on 21-May-2020

8 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Parametric versus Nonparametric(Chi-square)

Dr. Hamza Alduraidi

Parametric AssumptionsThe observations must be independent.Dependent variable should be continuous (I/R)The observations must be drawn from normallydistributed populationsThese populations must have the same variances.Equal variance (homogeneity of variance)The groups should be randomly drawn from normallydistributed and independent populations

e.g. Male X Female Pharmacist X Physician Manager X StaffNO OVER LAP

Parametric Assumptions❑ The independent variable is categorical with

two or more levels.❑ Distribution for the two or more independent

variables is normal.

Advantages of ParametricTechniques

They are more powerful and more flexiblethan nonparametric techniques.They not only allow the researcher tostudy the effect of many independentvariables on the dependent variable, butthey also make possible the study of theirinteraction.

Nonparametric methods are often the onlyway to analyze nominal or ordinal data anddraw statistical conclusions.Nonparametric methods require noassumptions about the population probabilitydistributions.Nonparametric methods are often calleddistribution-free methods.Nonparametric methods can be used withsmall samples

Nonparametric Methods

Nonparametric Methods

In general, for a statistical method to beclassified as nonparametric, it mustsatisfy at least one of the followingconditions.● The method can be used with nominal data.● The method can be used with ordinal data.● The method can be used with interval or ratio

data when no assumption can be made aboutthe population probability distribution (in smallsamples).

Non Parametric TestsDo not make as many assumptions aboutthe distribution of the data as theparametric (such as t test)● Do not require data to be Normal● Good for data with outliersNon-parametric tests based on ranks ofthe data● Work well for ordinal data (data that have a

defined order, but for which averages maynot make sense).

Nonparametric Methods

There is at least one nonparametric testequivalent to each parametric testThese tests fall into several categories1� Tests of differences between groups

(independent samples)2� Tests of differences between variables

(dependent samples)3� Tests of relationships between variables

Summary Table of Statistical TestsLevel of

MeasurementSample Characteristics Correlation

1Sampl

e

2 Sample K Sample (i.e., >2)

Independent Dependent Independent Dependent

Categoricalor Nominal

Χ2 Χ2 Macnarmar’sΧ2

Χ2 Cochran’s Q

Rank orOrdinal

MannWhitney U

WilcoxinMatched

PairsSignedRanks

Kruskal WallisH

Friendman’s ANOVA

Spearman’srho

Parametric(Interval &

Ratio)

z testor t test

t testbetweengroups

t test withingroups

1 wayANOVAbetweengroups

1 wayANOVA

(within orrepeatedmeasure)

Pearson’s r

Factorial (2 way) ANOVA

Summary: Parametric vs.Nonparametric Statistics

Parametric Statistics are statistical techniquesbased on assumptions about the populationfrom which the sample data are collected.● Assumption that data being analyzed are

randomly selected from a normallydistributed population.

● Requires quantitative measurement thatyield interval or ratio level data.

Nonparametric Statistics are based on fewerassumptions about the population and theparameters.● Sometimes called “distribution-free” statistics.● A variety of nonparametric statistics are available

for use with nominal or ordinal data.

Chi-Square

Types of Statistical TestsWhen running a t test and ANOVA

We compare:● Mean differences between groupsWe assume● random sampling● the groups are homogeneous● distribution is normal● samples are large enough to represent population

(>30)● DV Data: represented on an interval or ratio scaleThese are Parametric tests!

Types of TestsWhen the assumptions are violated:

Subjects were not randomly sampledDV Data:● Ordinal (ranked)● Nominal (categorized: types of car, levels of

education, learning styles)● The scores are greatly skewed or we have no

knowledge of the distributionWe use tests that are equivalent to t test and

ANOVANon-Parametric Test!

Chi-Square testMust be a random sample from populationData must be in raw frequenciesVariables must be independentA sufficiently large sample size is required(at least 20)Actual count data (not percentages)Observations must be independent.Does not prove causality.

Different Scales, Different Measuresof Association

Scale of BothVariables

Measures ofAssociation

Nominal Scale Pearson Chi-Square: χ2

Ordinal Scale Spearman’s rho

Interval or RatioScale

Pearson r

ImportantThe chi square test can only be used ondata that has the following characteristics:

The data must be in the formof frequencies

The frequency data must have aprecise numerical value and must beorganised into categories or groups.

The total number of observations must begreater than 20.

The expected frequency in any one cellof the table must be greater than 5.

Formula

χ 2 = ∑ (O – E)2

E

χ2 = The value of chi squareO = The observed valueE = The expected value∑ (O – E)2 = all the values of (O – E) squared thenadded together

Chi Square Test of IndependencePurpose● To determine if two variables of interest independent

(not related) or are related (dependent)?● When the variables are independent, we are saying that

knowledge of one gives us no information about the othervariable. When they are dependent, we are saying thatknowledge of one variable is helpful in predicting the valueof the other variable.

● Some examples where one might use the chi-squared testof independence are:• Is level of education related to level of income?• Is the level of price related to the level of quality in

production?Hypotheses● The null hypothesis is that the two variables are

independent. This will be true if the observed counts in thesample are similar to the expected counts.• H0: X and Y are independent• H1: X and Y are dependent

Chi Square Test of Goodness of FitPurpose● To determine whether an observed

frequency distribution departs significantlyfrom a hypothesized frequency distribution.

● This test is sometimes called a One-sampleChi Square Test.

Hypotheses● The null hypothesis is that the two variables are

independent. This will be true if the observedcounts in the sample are similar to theexpected counts.• H0: X follows the hypothesized distribution• H1: X deviates from the hypothesized distribution

Steps in Test of Hypothesis1� Determine the appropriate test2� Establish the level of significance:α3� Formulate the statistical hypothesis4� Calculate the test statistic5� Determine the degree of freedom6� Compare computed test statistic against a

tabled/critical value

1. Determine Appropriate TestChi Square is used when both variables aremeasured on a nominal scale.It can be applied to interval or ratio data thathave been categorized into a small numberof groups.It assumes that the observations arerandomly sampled from the population.All observations are independent (anindividual can appear only once in a tableand there are no overlapping categories).It does not make any assumptions about theshape of the distribution nor about thehomogeneity of variances.

2. Establish Level ofSignificance

α is a predetermined valueThe convention

• α = .05• α = .01• α = .001

3. Determine The Hypothesis:Whether There is anAssociation or Not

Ho : The two variables are independentHa : The two variables are associated

4. Calculating Test StatisticsContrasts observed frequencies in each cell of acontingency table with expected frequencies.The expected frequencies represent the numberof cases that would be found in each cell if thenull hypothesis were true ( i.e. the nominalvariables are unrelated).Expected frequency of two unrelated events isproduct of the row and column frequencydivided by number of cases.

Fe= Fr Fc / N

Expected frequency = row total x columntotal Grand

total

4. Calculating Test Statistics

4. Calculating Test StatisticsObserved

frequencies

Expe

cted

freq

uenc

y

Expected

frequency

5. Determine Degreesof Freedom

df = (R-1)(C-1)

Num

ber

of le

vels

in c

olum

n va

riab

le

Num

ber

of le

vels

in r

ow v

aria

ble

6. Compare computed test statisticagainst a tabled/critical valueThe computed value of the Pearson chi-square statistic is compared with thecritical value to determine if thecomputed value is improbableThe critical tabled values are based onsampling distributions of the Pearsonchi-square statisticIf calculated χ2 is greater than χ2 tablevalue, reject Ho

χ2

Decision and InterpretationIf the probability of the test statistic is less thanor equal to the probability of the alpha error rate,we reject the null hypothesis and conclude thatour data supports the research hypothesis. Weconclude that there is a relationship betweenthe variables.If the probability of the test statistic is greaterthan the probability of the alpha error rate, wefail to reject the null hypothesis. We concludethat there is no relationship between thevariables, i.e. they are independent.

Example

Suppose a researcher is interested invoting preferences on gun control issues.A questionnaire was developed and sentto a random sample of 90 voters.The researcher also collects informationabout the political party membership ofthe sample of 90 respondents.

Bivariate Frequency Table orContingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Bivariate Frequency Table orContingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Observ

ed

frequencie

s

Bivariate Frequency Table orContingency TableFavor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Row

frequency

Bivariate Frequency Table orContingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90Column frequency

1. Determine Appropriate Test

1� Party Membership ( 2 levels) andNominal

2� Voting Preference ( 3 levels) andNominal

2. Establish Level ofSignificance

Alpha of .05

3. Determine The Hypothesis

• Ho : There is no difference between D &R in their opinion on gun control issue.

• Ha : There is an association betweenresponses to the gun control survey andthe party membership in the population.

4. Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican

fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 50*25/90

4. Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican

fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 40* 25/90

4. Calculating Test Statistics

= 11.03

5. Determine Degreesof Freedom

df = (R-1)(C-1) =(2-1)(3-1) = 2

6. Compare computed test statisticagainst a tabled/critical value

α = 0.05df = 2Critical tabled value = 5.991Test statistic, 11.03, exceeds critical valueNull hypothesis is rejectedDemocrats & Republicans differsignificantly in their opinions on guncontrol issues

Example 1: Testing for Proportions

χ2α=0.05 = 5.991

SPSS Output for Gun ControlExample

Interpreting Cell Differences ina Chi-square Test - 1

A chi-square test ofindependence of therelationship between sexand marital status finds astatistically significantrelationship between thevariables.

Chi-Square Test ofIndependence: post hoc test

in SPSS (1)

You can conduct a chi-square test ofindependence in crosstabulation ofSPSS by selecting:

Analyze > Descriptive Statistics> Crosstabs…

Chi-Square Test ofIndependence: post hoc test

in SPSS (2)

click on “Statistics…”button to request thetest statistic.

Chi-Square Test ofIndependence: post hoc test

in SPSS (3)

Second, click on “Continue”button to close the Statisticsdialog box.

First, click on “Chi-square” torequest the chi-square test ofindependence.

Chi-Square Test ofIndependence: post hoc test

in SPSS (6)

In the table Chi-Square Tests result,SPSS also tells us that “0 cells haveexpected count less than 5 and theminimum expected count is 70.63”.

The sample size requirement for thechi-square test of independence issatisfied.

Chi-Square Test ofIndependence: post hoc test

in SPSS (7)The probability of the chi-square teststatistic (chi-square=2.821) wasp=0.244, greater than the alpha levelof significance of 0.05. The nullhypothesis that differences in "degreeof religious fundamentalism" areindependent of differences in "sex" isnot rejected.

The research hypothesis thatdifferences in "degree of religiousfundamentalism" are related todifferences in "sex" is not supportedby this analysis.

Thus, the answer for this question isFalse. We do not interpret celldifferences unless the chi-square teststatistic supports the researchhypothesis.

top related