
Presented by :

NORAZLIYATI YAHYA 2009905123

NURHARANI SELAMAT 2009324059

NUR HAFIZA NGADENIN 2009720649

QUANTITATIVE DATA ANALYSIS

04/20/23

DATA ANALYSIS

DESCRIPTIVE STATISTICS

INFERENTIAL STATISTICS

STATISTICS IN PERSPECTIVE

QUANTITATIVE DATA

Techniques for summarizing quantitative data

Frequency polygons

Skewed polygons

Histogram & Stem-leaf Plots

Normal Curve

Averages

Spreads

Standard scores & Normal Curve

Correlation

FREQUENCY POLYGONS

Constructing a frequency polygon

1. List all scores in order of size and group them into intervals.
2. Label the horizontal axis by placing all the possible scores at equal intervals.
3. Label the vertical axis by indicating frequencies at equal intervals.
4. For each score, find the point where it intersects with its frequency and place a dot at that point.
5. Connect all the dots with straight lines.
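The construction steps for a frequency polygon can be sketched in code. This is a minimal illustration only: the scores, the interval width of 10 and the helper name `polygon_points` are assumptions made for the example and do not come from the slides. Each point is placed at the midpoint of its interval, and a text bar stands in for the dot that would be plotted.

```python
from collections import Counter

# Hypothetical scores, not taken from the slides.
scores = [52, 57, 61, 64, 66, 68, 71, 73, 73, 75, 78, 82, 85, 88, 93]
interval_width = 10  # assumed width of each score interval


def polygon_points(values, width):
    """Return (interval midpoint, frequency) pairs in order of score size."""
    # Group each score into the interval that starts at a multiple of `width`.
    counts = Counter((v // width) * width for v in values)
    first, last = min(counts), max(counts)
    points = []
    # Walk the horizontal axis at equal intervals, keeping empty intervals.
    for start in range(first, last + width, width):
        midpoint = start + width / 2
        points.append((midpoint, counts.get(start, 0)))
    return points


points = polygon_points(scores, interval_width)
for midpoint, frequency in points:
    # One row per dot; joining the dots with straight lines gives the polygon.
    print(f"{midpoint:5.1f}  {'*' * frequency}  ({frequency})")
```

Passing the resulting points to any plotting tool and joining them with straight lines gives the polygon itself.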

SKEWED POLYGONS

Positively Skewed Polygon

The tail of the distribution trails off to the right, in the direction of the higher score values.

Negatively Skewed Polygon

The longer tail of the distribution goes off to the left, in the direction of the lower score values.

HISTOGRAM

Histogram facts

Bars are arranged from left to right on the horizontal axis.

The width of each bar indicates the range of values in that bar.

Frequencies are shown on the vertical axis; the point where the axes intersect is always zero.

The bars in a histogram touch, indicating that they illustrate quantitative rather than categorical data.

STEM-LEAF PLOTS

Constructing a Stem-Leaf Plot

Mathematics Quiz Score

1. Separate each number into a stem and a leaf.
2. Group numbers with the same stem in numerical order.
3. Reorder the leaf values in sequence.

Before reordering the leaves:

STEM    LEAF
2       9
3       72
4       655
5       41555
6       0

After reordering the leaves:

STEM    LEAF
2       9
3       27
4       556
5       14555
6       0
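The three construction steps can be sketched in a few lines. The scores below are read off the stem-leaf plot on the slide (stem = tens digit, leaf = units digit); the function name `stem_leaf` is only an illustrative choice.

```python
from collections import defaultdict

# Mathematics quiz scores as they appear in the unordered plot on the slide.
scores = [29, 37, 32, 46, 45, 45, 54, 51, 55, 55, 55, 60]


def stem_leaf(values):
    """Group values by stem (tens digit) and sort the leaves (units digit)."""
    plot = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # step 1: separate stem and leaf
        plot[stem].append(leaf)         # step 2: group by stem
    # step 3: reorder the leaves in sequence, with stems in numerical order
    return {stem: sorted(leaves) for stem, leaves in sorted(plot.items())}


plot = stem_leaf(scores)
for stem, leaves in plot.items():
    print(stem, "|", "".join(str(leaf) for leaf in leaves))
```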

NORMAL CURVE

Normal Distribution

The majority of the scores are concentrated in the middle of the distribution, and scores decrease in frequency the farther they are from the middle.

The smooth curve (distribution curve) shows a generalized distribution of scores that is not limited to one specific set of data.

The normal curve is symmetrical and bell-shaped. It is commonly used to describe the distribution of characteristics such as height and weight, spatial ability and creativity.

AVERAGES

Measure of Central Tendency

Mode

The most frequent score in a distribution

Median

The midpoint: the middlemost score, or the point halfway between the two middlemost scores

Mean

The average of all the scores in a distribution
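As a quick illustration, the three averages can be computed with Python's standard `statistics` module. Reusing the mathematics quiz scores from the stem-leaf plot is a choice made for this example; the slides do not compute these values.

```python
import statistics

# Mathematics quiz scores from the stem-leaf plot, in order of size.
scores = [29, 32, 37, 45, 45, 46, 51, 54, 55, 55, 55, 60]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # halfway between the two middlemost scores
mean = statistics.mean(scores)      # sum of all scores divided by their count

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

With an even number of scores there is no single middlemost score, so the median falls halfway between the 6th and 7th scores (46 and 51).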

SPREADS

Variability

Standard Deviation Facts

The standard deviation represents the spread of a distribution: the larger the standard deviation, the greater the variability.

[Figure: normal curve marked at the mean, ±1 SD and ±2 SD, with areas of 34% and 13.5% on each side of the mean and 2.15% between 2 SD and 3 SD; 68% of the area lies within 1 SD, 95% within 2 SD and 99.7% within 3 SD]

50% of all observations fall on each side of the mean

68% of the scores fall within one standard deviation of the mean

27% of the observations fall between one and two standard deviations away from the mean

99.7% fall within three standard deviations of the mean
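The percentages above can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` and the reuse of the quiz scores to show a standard deviation are assumptions made for the example.

```python
from statistics import NormalDist, pstdev

# Quiz scores from the stem-leaf plot, used only to show a standard deviation.
scores = [29, 32, 37, 45, 45, 46, 51, 54, 55, 55, 55, 60]
sd = pstdev(scores)  # population standard deviation of these scores
print(f"standard deviation of the quiz scores: {sd:.2f}")

standard_normal = NormalDist()  # mean 0, standard deviation 1


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")

# Between one and two standard deviations, both sides together (about 27%).
between_1_and_2 = share_within(2) - share_within(1)
print(f"between 1 and 2 SD: {between_1_and_2:.1%}")
```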

STANDARD SCORE & NORMAL CURVE

Standard score & Normal Curve

z-score

How far a raw score is from the mean, in standard deviation units

Probability

Percentage associated with areas under a normal curve, stated in decimal form

[Figure: normal curve with areas on each side of the mean: .3413 between the mean and 1 SD, .1359 between 1 SD and 2 SD, .0215 between 2 SD and 3 SD]
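A z-score and its associated probability can be sketched as follows. The mean of 100, standard deviation of 15 and raw score of 115 are hypothetical values chosen for the example, and the helper names are illustrative. The areas printed at the end correspond to the decimal values shown on the curve; the third comes out near .0214 where the slide rounds to .0215.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd


def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    curve = NormalDist()
    return curve.cdf(z_high) - curve.cdf(z_low)


# Hypothetical distribution: mean 100, standard deviation 15.
z = z_score(115, mean=100, sd=15)
print("z-score of a raw score of 115:", z)

# Probability of falling between the mean and that raw score.
print(f"area between mean and z: {area_between(0, z):.4f}")

# Areas for each band of one standard deviation above the mean.
for low in (0, 1, 2):
    print(f"{low} SD to {low + 1} SD: {area_between(low, low + 1):.4f}")
```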

CORRELATION

Correlation Coefficient and Scatterplots

Correlation Coefficient

Expresses the degree of relationship between two sets of scores.

A positive relationship is indicated when high scores on one variable are accompanied by high scores on the other, and low scores on one are accompanied by low scores on the other.

Scatterplots

Used to illustrate different degrees of correlation
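One common correlation coefficient is Pearson's r. The slides do not name a specific coefficient, so treating it as Pearson's r is an assumption here, as are the two small sets of scores. The sketch computes r directly from its definition.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sets of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores on two variables for five students.
first = [1, 2, 3, 4, 5]
second = [2, 4, 5, 4, 5]

r_positive = pearson_r(first, second)
# Reversing the direction of one variable flips the sign of the relationship.
r_negative = pearson_r(first, [-y for y in second])

print(f"positive relationship: r = {r_positive:.2f}")
print(f"negative relationship: r = {r_negative:.2f}")
```

Plotting `first` against `second` as points gives the corresponding scatterplot.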

CATEGORICAL DATA

Techniques for summarizing categorical data

Frequency Table

Bar Graphs and Pie Charts

Crossbreak Table

CROSSBREAK TABLE

Grade Level and Gender of Teachers (Hypothetical Data)

                              Male    Female    Total
Junior High School Teacher    40      60        100
High School Teacher           60      40        100
Total                         100     100       200

Reports a relationship between two categorical variables of interest.

A junior high school teacher is more likely to be female. A high school teacher is more likely to be male. Exactly one-half of the total group of teachers are female. If gender were unrelated to grade level, the same proportion of junior high school and high school teachers would be expected to be female.
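The last point can be made concrete by computing the frequencies that would be expected if gender were unrelated to grade level: each expected cell is its row total times its column total, divided by the grand total. The sketch uses the hypothetical teacher data from the table; the function name `expected_frequencies` is illustrative.

```python
# Observed frequencies from the crossbreak table (hypothetical data).
observed = {
    "Junior High School Teacher": {"Male": 40, "Female": 60},
    "High School Teacher": {"Male": 60, "Female": 40},
}


def expected_frequencies(table):
    """Expected cell counts if the row and column variables were unrelated."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    n = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / n for col in col_totals}
        for row in table
    }


expected = expected_frequencies(observed)
for row, cells in expected.items():
    print(row, cells)
```

Every expected cell is 50, so the observed counts of 40 and 60 show the departure from "no relationship" described above.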

A researcher conducted a study on the average IQ of primary school students in Shah Alam district and found that their average IQ score is 85.

Is the average IQ score of the entire population also equal to 85, or does this sample of students differ from other students in Shah Alam district?

If they differ, how do they differ? Are their IQ scores higher or lower?

I don't want to obtain data for the entire population, so how am I going to estimate how closely the average sample IQ score matches the population IQ score?

INFERENTIAL STATISTICS

o What is inferential statistics?

It is the statistical technique or method that uses data obtained from a sample to make inferences about the corresponding population.

SAMPLE: mean = 10.14

POPULATION: μ = ?

o Types of inferential statistics

1. Estimation
Using a sample mean to estimate a population mean
Example: Interval Estimation: Confidence Intervals

2. Hypothesis testing
Comparing 2 means
Comparing 2 proportions
Association between one variable and another variable

1. INTERVAL ESTIMATION

RESEARCH OBJECTIVE:

To identify the average IQ of primary school students in Shah Alam district.

o Population: 1,000 students of Shah Alam primary schools
o Sample: 65 primary school students
o Sample Mean: 85
o Standard Error of Mean: 2.0
o Interval Estimation: 95% Confidence Interval = 85 ± 1.96(2) = 85 ± 3.92 = 81.08 to 88.92
o Interpretation: The researcher has 95% confidence that the average IQ of primary school students in Shah Alam district is between 81.08 and 88.92
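The interval arithmetic can be reproduced in a few lines. The sample mean of 85, the standard error of 2.0 and the multiplier 1.96 all come from the slide; the function name `confidence_interval` is only illustrative.

```python
def confidence_interval(sample_mean, standard_error, z_critical=1.96):
    """Interval estimate: sample mean plus or minus z times the standard error."""
    margin = z_critical * standard_error
    return sample_mean - margin, sample_mean + margin


# Values from the Shah Alam example.
lower, upper = confidence_interval(sample_mean=85, standard_error=2.0)
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```

A wider multiplier, such as 2.58 for a 99% interval, gives a wider interval around the same sample mean.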

SAMPLING ERROR

What is sampling error?
The difference between the population mean and the sample mean.

Why does sampling error occur?
Different samples drawn from the same population can have different properties.

How can we quantify sampling error?
Using the standard error of the mean. It is useful because it allows us to represent the amount of sampling error associated with our sampling process: how much error we can expect on average.

[Figure: three samples (S) drawn from one population (P)]
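A small simulation can show why different samples give different means and how the standard error of the mean summarizes that variation. Everything in this sketch is an assumption made for illustration: the population is simulated (1,000 IQ-like values with mean 100 and SD 15), samples of 65 are drawn with replacement, and 2,000 samples are taken.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the illustration is repeatable

# Simulated population of 1,000 scores (hypothetical, not real data).
population = [random.gauss(100, 15) for _ in range(1000)]
population_mean = mean(population)

sample_size = 65
# Draw many samples (with replacement) and record each sample mean.
sample_means = [
    mean(random.choices(population, k=sample_size)) for _ in range(2000)
]

# Sampling error of each sample: sample mean minus population mean.
sampling_errors = [m - population_mean for m in sample_means]

# The spread of the sample means is what the standard error describes.
empirical_se = pstdev(sample_means)
theoretical_se = pstdev(population) / sqrt(sample_size)

print(f"largest sampling error seen: {max(abs(e) for e in sampling_errors):.2f}")
print(f"spread of sample means:      {empirical_se:.2f}")
print(f"SD / sqrt(n):                {theoretical_se:.2f}")
```

The last two figures should be close to each other, which is why the standard error can stand in for the amount of sampling error expected on average.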

2. HYPOTHESIS TESTING

What is hypothesis testing?
o A hypothesis is an assumption about the population parameter.
o A parameter is a characteristic of the population, such as a mean or a relationship.
o The parameter must be identified before analysis.

Steps in conducting hypothesis testing
1. State the null hypothesis and research hypothesis.
2. Identify the appropriate test.
3. State the decision rule for rejecting the null hypothesis.

NULL HYPOTHESIS

o There is NO difference between the population mean of students using method A and the population mean of students using method B
o Treatment X has NO EFFECT on outcome Y
o The grade point average of juniors is LESS than 3.0
o The average IQ score of primary school students in Shah Alam district is EQUAL to 85

RESEARCH HYPOTHESIS

o The population mean of students using method A is GREATER than the population mean of students using method B
o Treatment X has AN EFFECT on outcome Y
o The grade point average of juniors is AT LEAST 3.0
o The average IQ score of primary school students in Shah Alam district is GREATER than 85

NULL HYPOTHESIS: The average IQ score of primary school students in Shah Alam district is EQUAL to 85

This test is called a one-sample t test. At the end of the hypothesis testing, we get a P value. If the P value is less than 0.05, we reject the null hypothesis and accept the research hypothesis. If the P value is more than or equal to 0.05, we cannot reject the null hypothesis.

In the above example, if we get P = 0.01, we reject the null hypothesis and conclude in favor of the research hypothesis: "the average IQ score of primary school students in Shah Alam district is GREATER than 85".
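The test statistic and the decision rule can be sketched as follows. The eight IQ scores are hypothetical, and the function names are illustrative. The sketch stops at the t statistic and does not compute a P value, which would normally be read from a t table or obtained from statistical software; the decision rule is then applied to a P value supplied by hand.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)  # sample SD divided by sqrt(n)
    return (mean(values) - mu0) / standard_error


def decide(p_value, alpha=0.05):
    """Decision rule from the slide: reject the null hypothesis if P < 0.05."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "cannot reject the null hypothesis"


# Hypothetical IQ scores for a small sample; null hypothesis: mean equals 85.
iq_scores = [88, 92, 85, 90, 87, 91, 86, 89]
t_statistic = one_sample_t(iq_scores, mu0=85)
print(f"t = {t_statistic:.3f} with {len(iq_scores) - 1} degrees of freedom")

# Applying the decision rule to the P value used in the slide's example.
print("P = 0.01:", decide(0.01))
print("P = 0.05:", decide(0.05))
```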

ONE AND TWO-TAILED TEST

A HYPOTHETICAL EXAMPLE OF TYPE I AND TYPE II ERRORS

Doctor says that symptoms like Susie's occur only 5 percent of the time in healthy people. To be safe, however, he decides to treat Susie for pneumonia.

o Susie has pneumonia: Doctor is correct. Susie does have pneumonia and the treatment cures her.
o Susie does not have pneumonia: Doctor is wrong. Susie's treatment was unnecessary and possibly unpleasant and expensive. Type I error.

Doctor says that symptoms like Susie's occur 95 percent of the time in healthy people. In his judgement, therefore, her symptoms are a false alarm and do not warrant treatment, and he decides not to treat Susie for pneumonia.

o Susie has pneumonia: Doctor is wrong. Susie is not treated and may suffer serious consequences. Type II error.
o Susie does not have pneumonia: Doctor is correct. Unnecessary treatment is avoided.

TYPES OF TESTS

PARAMETRIC TEST

Quantitative data
o t-test for means
o ANOVA
o ANCOVA
o MANOVA
o MANCOVA
o t-test for r

Categorical data
o t-test for difference in proportions

NON PARAMETRIC TEST

Quantitative data
o Mann-Whitney U test
o Kruskal-Wallis one-way analysis of variance
o Sign test
o Friedman two-way analysis of variance

Categorical data
o Chi-square test

Comparing Groups

Quantitative Data
• Frequency polygons → central tendency

Interpretation

1. Information of known groups
2. Effect size, ES = (mean experimental gain - mean comparison gain) / (standard deviation of gain of comparison group)
3. Inferential statistics
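The effect size formula is a single division, shown here as a short function. The gain values passed in are hypothetical and the function name is illustrative.

```python
def effect_size(mean_experimental_gain, mean_comparison_gain, sd_comparison_gain):
    """Effect size: difference in mean gains divided by the comparison group's SD of gain."""
    return (mean_experimental_gain - mean_comparison_gain) / sd_comparison_gain


# Hypothetical gains: experimental group 12 points, comparison group 8 points,
# with a standard deviation of 8 points for the comparison group's gain.
es = effect_size(12.0, 8.0, 8.0)
print("effect size:", es)
```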

Comparing Groups

Categorical Data
• Crossbreak tables

Table 1 Felony Sentences for Fraud by Gender

              Type of Sentence
Gender    Probation    Prison    Totals
Male      24           11        35
Female    13           22        35
Totals    37           33        70

Table 1 Felony Sentences for Fraud by Gender (frequencies added)

Gender    Probation     Prison        Totals
Male      24 (3.178)    11 (2.398)    35
Female    13 (2.565)    22 (3.091)    35
Totals    37            33            70

Interpretation

• Place data in tables
• Calculate contingency coefficient

C = √( χ² / (χ² + n) )
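The contingency coefficient can be worked through with the felony sentence counts from Table 1. The slides give only the final formula, so the chi-square step is an assumption here: the sketch uses the usual Pearson chi-square statistic, with expected counts taken from the row and column totals. Function names are illustrative.

```python
from math import sqrt

# Observed counts from Table 1: rows are Male, Female; columns are Probation, Prison.
observed = [[24, 11], [13, 22]]


def chi_square(table):
    """Pearson chi-square statistic for a crossbreak table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (count - expected) ** 2 / expected
    return total


def contingency_coefficient(table):
    """C = sqrt(chi-square / (chi-square + n))."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    return sqrt(chi2 / (chi2 + n))


chi2 = chi_square(observed)
c = contingency_coefficient(observed)
print(f"chi-square = {chi2:.3f}")
print(f"contingency coefficient = {c:.2f}")
```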

THANK YOU
