Data Analysis Using SPSS
EDU5950SEM1 2014-15
• Assoc. Prof. Dr. Rohani Ahmad Tarmizi
• Institute for Mathematical Research / Faculty of Educational Studies
• UPM
LEARNING OUTCOMES
First - students will be able to conceptualize the importance of choosing appropriate statistical analyses
Second - students will be able to conduct DATA ENTRY procedures
Third - students will be able to conduct descriptive statistical analysis and interpret the findings
Fourth - students will be able to conduct tests of hypotheses of differences and interpret the findings
Fifth - students will be able to conduct tests of correlation or relationship and interpret the findings
STATISTICAL ANALYSES: Some background
► As we all know, human beings are complex entities, complete with knowledge, beliefs, feelings, opinions, attitudes, etc.
► Studying human subjects by examining a single independent variable (IV) and a single dependent variable (DV) is rarely adequate, since these variables do not exist in isolation within the human mind or a set of behaviors.
► These two variables (an IV and the examined DV) may affect or be affected by several other variables.
► To draw conclusions and offer accurate explanations of the phenomenon of interest, the researcher should be willing to examine many variables simultaneously.
Variables
Independent Variable: EXTERNAL REWARDS
Dependent Variable: INTRINSIC MOTIVATION
Variables
Independent Variables: EXTERNAL REWARDS, TASK INTEREST, TASK STRUCTURE
Dependent Variable: INTRINSIC MOTIVATION
Variables
Spiritual well-being Experience Training
Demography:► Gender► Educational levels
Counseling competency► skills► knowledge► awareness
Level of integration of religious perspectives
Independent Variables
Dependent Variable
Variables: characteristics under study that take different values for different elements
Demography:► Gender► Job tenure► Occupational status
Job characteristic:► Work condition► Job demand► Job control
Perceived quality of ICT facilities
Career commitment
Quality of work life
Independent Variables
Intervening Variable
Dependent Variable
BASIC CONCEPT: STATISTICAL ANALYSIS
MAJOR GROUPS OF HYPOTHESIS TESTINGS
• GROUP DIFFERENCES
• RELATIONSHIP BETWEEN VARIABLES
• PREDICTION OF GROUP MEMBERSHIP
• STRUCTURAL ANALYSES
Group Differences
1. t Test (independent t-test)
Compares the means of an interval/ratio DV between the two groups of a qualitative IV.
Example: There is a significant difference in mean literacy performance between male and female preschoolers.
2. t Test (dependent t-test)
Compares the means of an interval/ratio DV based on paired or matched scores from the same group.
Example: There is a significant difference in mean literacy performance from pre to post remedial program among preschoolers who undergo the remedial program.
Group Differences
3. One-Way Analysis of Variance (ANOVA) and t Test
Compares the means of an interval/ratio DV among the groups of a qualitative IV by analyzing the variation between and within groups. Because ANOVA shows that group differences exist but does not identify which groups differ significantly, post hoc tests are usually conducted.
Example: There are significant differences in mean literacy performance between preschoolers from the low, middle and high SES groups.
Group Differences
4. One-Way Analysis of Covariance (ANCOVA)
Assesses group differences on a single metric DV after the effects of one or more covariates are statistically removed. Covariates are chosen because of their known relationship with the DV.
Do preschoolers of low, middle and high SES have different literacy test scores after adjusting for family type?
There are significant differences in mean literacy performance between preschoolers from the low, middle and high SES groups after adjusting for family type.
Group Differences
5. Factorial Analysis of Variance (factorial ANOVA)
Compares differences in one metric DV among the groups of several nonmetric IVs, including interactions among the IVs.
Do ethnicity and learning preference (IVs) significantly affect reading achievement (DV) among primary school students?
Relationship and Prediction between Variables
6. Bivariate Correlation and Regression
Bivariate correlation assesses the degree of relationship between two metric variables.
What is the relationship between motivation achievement and CGPA of UPM freshman students?
7. In contrast, bivariate regression uses the relationship between the IV and DV to predict the DV score from the IV.
To what extent do motivation achievement scores (IV) predict CGPA of UPM freshman students?
8. Multiple correlation: the degree of relationship between one metric DV and a set of metric IVs.
What is the relationship of motivation achievement, learning preference and locus of control (IVs) with CGPA of UPM freshman students?
9. Multiple Regression
- Objective: to predict changes in the DV in response to changes in several IVs
- One metric DV
- One or more metric IVs
To what extent do motivation achievement scores, learning preference and locus of control (IVs) predict CGPA of UPM freshman students?
Relationship and Prediction between Variables
Decision-making Tree – Test of Group Differences
No. & Type of DV   No. & Type of IV        Test              Purpose of Analysis
1 DV               1 IV (2 categories)     t-test            Determine significance of mean group differences
1 DV               1 IV (>2 categories)    One-way ANOVA     Determine significance of mean group differences
1 DV               ≥ 2 IVs                 Factorial ANOVA   Determine significance of mean group differences
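The decision table for tests of group differences can be read as a simple lookup. The Python sketch below is only an illustration of that logic; the function name choose_group_difference_test and its parameters are hypothetical and are not part of SPSS.

```python
def choose_group_difference_test(n_ivs: int, n_categories: int = 2) -> str:
    """Suggest a test for ONE metric DV, following the decision table.

    n_ivs        -- number of categorical IVs (hypothetical parameter name)
    n_categories -- number of groups in the IV when there is a single IV
    """
    if n_ivs < 1:
        raise ValueError("at least one IV is required")
    if n_ivs >= 2:
        return "Factorial ANOVA"
    if n_categories < 2:
        raise ValueError("a grouping IV needs at least two categories")
    return "t-test" if n_categories == 2 else "One-way ANOVA"


print(choose_group_difference_test(1, 2))  # e.g. gender (two groups)
print(choose_group_difference_test(1, 3))  # e.g. SES (low, middle, high)
print(choose_group_difference_test(2))     # e.g. ethnicity and learning preference
```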
Null Hypothesis Significance Testing
• This addresses how likely it is to obtain an observed (i.e., sample) result given a specific assumption about the population.
• The assumption about the population is called the null hypothesis (e.g., there is no difference, there is no relationship, there is no predictive model) and the observed result is what the sample produces (e.g., there is a difference, there is a relationship, there is a predictive model).
Null Hypothesis Significance Testing
• Statistical tests such as z, t, F (ANOVA), etc., determine how likely the sample result or any result more distant from the null hypothesis would be if the null hypothesis were true.
• This probability is then compared with a preset criterion, the alpha value (also called the Type I error rate or alpha error rate).
• POWER ANALYSIS focuses on situations for which the expectation is that the null hypothesis is false.
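The comparison of the probability (the Sig. value in SPSS output) with alpha can be sketched in a few lines of Python. This is a minimal illustration, assuming the common convention of rejecting the null hypothesis when p is less than or equal to alpha; the function name nhst_decision is hypothetical.

```python
def nhst_decision(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with alpha and return the decision about H0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # Assumed convention: reject H0 when p <= alpha.
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(nhst_decision(0.926))  # a large Sig. value
print(nhst_decision(0.018))  # a small Sig. value
```

Failing to reject the null hypothesis is not the same as showing that it is true; it only means the sample does not provide enough evidence against it.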
Levels of Measurement
• Which statistics you can use to analyze your data is determined by the level of measurement of each variable.
• Four levels of measurement:
• Nominal - you group a variable into classes with no particular order (race, favorite color, etc.)
• Ordinal - categories that can be ranked, but you don’t know how much higher or lower one category is than another (e.g., weight categories: underweight, normal, overweight, obese)
• Interval - scores with an inherent order and equal distances between values, but no true zero point
• Ratio - scores with an inherent order, equal distances between values, and a true zero point
• For purposes of choosing statistical analyses, the distinction between interval and ratio is unimportant!
Levels of Measurement - Quiz
1. IQ scores
2. Gender
3. Income (as a dollar amount)
4. Income (in 6 categories)
5. Agreement scores (1=strongly disagree, 2=slightly disagree, 3=neutral, 4=slightly agree, 5=strongly agree)
6. Cancer status (has cancer, does not have cancer)
7. Practice location (rural, urban)
8. Cigarette smoking (no. of cig/day)
9. Cigarette smoking (none, up to ½ ppd, ½ ppd-<1 ppd, 1 ppd+)
Statistical Tools For Descriptive Analyses
• Frequency/percentage table
• Pie or bar charts
• Histogram
• Frequency polygon
• Cross-tabulation
• Scatter diagram
• Mean, Median, Mode, Maximum, Minimum
• Range, Variance, Standard Deviation, Coefficient of variation, Standard Scores
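To make the numerical tools in this list concrete, the following Python sketch computes them with the standard library. The scores are made-up responses on a 1 to 6 scale, used only for illustration; they are not taken from the course data set.

```python
import statistics

# Hypothetical responses on a 1-6 agreement scale (illustration only).
scores = [3, 4, 4, 5, 6, 2, 4, 5]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
value_range = max(scores) - min(scores)
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)       # sample standard deviation
coeff_var = std_dev / mean * 100         # coefficient of variation, in percent
z_scores = [(x - mean) / std_dev for x in scores]  # standard scores

print(f"mean={mean}, median={median}, mode={mode}, range={value_range}")
print(f"variance={variance:.3f}, sd={std_dev:.3f}, cv={coeff_var:.1f}%")
```

Standard scores always have a mean of zero, which is a quick check that they were computed correctly.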
Statistical Tools For Inferential Statistics
• PARAMETRIC TESTS:
– Test of hypothesis of differences between means - Z-test, t-test, F-test, MANOVA
– Test of hypothesis of relationship – Pearson r, Point-biserial, Regression
• NON-PARAMETRIC TESTS:
– Mann-Whitney
– Kruskal Wallis
– Spearman rho
– Chi-Square, Cramer’s V, Lambda, etc.
In most research projects, it is likely that you will use quite a variety of different types of statistics, depending on the question you are addressing and the nature (level of measurement) of the data that you have.
It is therefore important that you have a basic understanding of the different statistical tools; the types of objectives, research questions and hypotheses they address; and their underlying assumptions and requirements.
THREE MAJOR STATISTICAL TECHNIQUES
• TO DESCRIBE MEASURED VARIABLES
• TO COMPARE MEANS or MEDIANS or FREQUENCIES – test of differences
• TO CORRELATE OR DETERMINE RELATIONSHIP OR ASSOCIATION – test of association or relationship
ACTIVITY 1- DATA ENTRY
INITIAL DATA FILE – VARIABLE VIEW
INITIAL DATA FILE – DATA VIEW
Go to Variable View to define the IVs and DVs. Use a separate row for each variable and give it a sensible name. Decide the specification/format of the data: NAME, TYPE, WIDTH, DECIMALS, LABEL, VALUES, MISSING, COLUMNS, ALIGN, MEASURE. For example, String = text and Numeric = numbers; numeric is generally the best format.
Variable View – use it to define or give specifications for the IVs and DVs
Go to Data View to insert data – the measured and collected responses for the variables.
Data are entered in columns under the appropriate variable names.
Each row represents one respondent of the study.
DATA VIEW – use to input data (respondents by rows and variables by columns)
EXAMPLE OF DATA SET IN SPSS DATA EDITOR – Variable view
EXAMPLE OF DATA SET IN SPSS DATA EDITOR – Data View
DATA TRANSFORMATION
• Used when variables need to be transformed as intended by the researcher or as stated in the objectives.
• TRANSFORM – COMPUTE: to compute or sum the scores
• TRANSFORM – RECODE:
– Recoding negatively worded scale items
– Collapsing continuous variables
– Replacing missing values
TO RECODE:
• Click Transform => Recode
• You will get the Recode dialog box
• Move the variable into the empty right-hand box
• Name the new variable and give it a label
• Click Change
• Click the Old and New Values button
To COMPUTE a score (TEACHER_EFFICACY)
• Click Transform => Compute
• You will get a Compute Variable dialog box
• Name your Target Variable
• Type in the required Numeric Expression
• Click OK
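The logic behind Recode and Compute can be sketched in Python. This is an illustration only, assuming a 1 to 6 response scale (the range shown in the descriptive output) and a score computed as the mean of the answered items; the item names and the choice of which item is negatively worded are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 6  # assumed response scale


def reverse_code(score: int) -> int:
    """Recode a negatively worded item: 1 <-> 6, 2 <-> 5, 3 <-> 4."""
    return SCALE_MIN + SCALE_MAX - score


def compute_mean_score(responses: dict, negative_items=()):
    """Mean of the answered items; None marks a missing response."""
    values = []
    for item, score in responses.items():
        if score is None:
            continue  # skip missing values
        values.append(reverse_code(score) if item in negative_items else score)
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical respondent: item2 is negatively worded, item3 is missing.
respondent = {"item1": 5, "item2": 2, "item3": None, "item4": 6}
teacher_efficacy = compute_mean_score(respondent, negative_items={"item2"})
print(round(teacher_efficacy, 2))
```

Recoding into a new variable, as the Recode steps do, keeps the original responses available for checking.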
DATA SET WITH NEW VARIABLES - TEACHER_FACTOR
You should be able to calculate descriptive statistics such as frequencies, descriptives, and crosstabs, bar charts, scattergram, box plot, histogram, etc.
Remember: output appears in a separate window.
ACTIVITY 2- DESCRIPTIVE ANALYSIS
Use the following menus:
– DESCRIPTIVE STATISTICS > FREQUENCIES
– DESCRIPTIVE STATISTICS > DESCRIPTIVES
– CUSTOM TABLES
– DISPLAY DATA: HISTOGRAM, BOXPLOT, STEM AND LEAF
TO DESCRIBE MEASURED VARIABLES
Gender
                            Frequency  Percent  Valid Percent  Cumulative Percent
Valid  lelaki (male)           22       34.4       34.4            34.4
       perempuan (female)      42       65.6       65.6           100.0
       Total                   64      100.0      100.0

Race
                            Frequency  Percent  Valid Percent  Cumulative Percent
Valid  MELAYU (Malay)          15       23.4       23.4            23.4
       CINA (Chinese)          41       64.1       64.1            87.5
       INDIA (Indian)           8       12.5       12.5           100.0
       Total                   64      100.0      100.0
TO OBTAIN FREQUENCY DISTRIBUTION
Religion
                              Frequency  Percent  Valid Percent  Cumulative Percent
Valid    ISLAM                   25       39.1       39.7            39.7
         BUDDHA (Buddhist)       24       37.5       38.1            77.8
         HINDU                    1        1.6        1.6            79.4
         KRISTIAN (Christian)    13       20.3       20.6           100.0
         Total                   63       98.4      100.0
Missing  System                   1        1.6
Total                            64      100.0
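The difference between Percent and Valid Percent in the Religion table comes from the one missing case: Percent divides by all 64 respondents, while Valid Percent and Cumulative Percent divide by the 63 valid responses. The Python sketch below reproduces the columns from the counts; the function name frequency_table is hypothetical.

```python
def frequency_table(counts: dict, n_missing: int = 0):
    """Return rows of (label, frequency, percent, valid %, cumulative %)."""
    n_valid = sum(counts.values())
    n_total = n_valid + n_missing
    rows, cumulative = [], 0
    for label, freq in counts.items():
        cumulative += freq
        rows.append((
            label,
            freq,
            round(100 * freq / n_total, 1),        # Percent: all cases
            round(100 * freq / n_valid, 1),        # Valid Percent: valid cases only
            round(100 * cumulative / n_valid, 1),  # Cumulative Percent
        ))
    return rows


religion = {"ISLAM": 25, "BUDDHA": 24, "HINDU": 1, "KRISTIAN": 13}
for row in frequency_table(religion, n_missing=1):
    print(row)
```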
TO OBTAIN DESCRIPTIVE STATISTICS OF DATA
Descriptive Statistics
Item                                                                                  N   Minimum  Maximum  Mean  Std. Deviation
My teacher wants us to enjoy learning maths                                           60     1        6     3.75     1.580
My teacher understand our problems in learning maths                                  36     1        6     3.89     1.833
My teacher try to make mathematics lessons interesting                                64     1        6     4.00     1.533
My teacher appreciates it when we try hard, even when our results are not so good     64     1        6     4.16     1.514
My teacher show us step by step and how to solve maths problems                       63     2        6     4.25     1.534
My teacher listen carefully to what we say                                            64     1        6     4.16     1.185
My teacher is friendly to us                                                          64     1        6     3.52     1.491
My teacher gives us time to explore new maths problems                                63     1        6     3.81     1.216
Valid N (listwise)                                                                    34
Report
Items: (1) My teacher wants us to enjoy learning maths; (2) My teacher understand our problems in learning maths; (3) My teacher try to make mathematics lessons interesting
Gender               Statistic         (1)     (2)     (3)
lelaki (male)        Mean              3.74    3.75    3.91
                     N                 19      12      22
                     Std. Deviation    1.821   1.913   1.716
perempuan (female)   Mean              3.76    3.96    4.05
                     N                 41      24      42
                     Std. Deviation    1.480   1.829   1.447
Total                Mean              3.75    3.89    4.00
                     N                 60      36      64
                     Std. Deviation    1.580   1.833   1.533
TO OBTAIN DESCRIPTIVE -COMPARE MEANS OF DIFFERENT GROUPS
Plot graphs – you should be able to plot bar charts for sets of scores & plot scattergrams of relationships between the two sets of scores.
Remember: Select Graphs then explore the alternatives.
TO OBTAIN DESCRIPTIVE -COMPARE MEANS OF DIFFERENT GROUPS
Report
Items: (1) My instructor wants us to enjoy learning maths; (2) My instructor understand our problems in learning maths; (3) My instructor try to make mathematics lessons interesting
Gender               Statistic         (1)     (2)     (3)
lelaki (male)        Mean              3.71    3.86    3.91
                     N                 21      22      22
                     Std. Deviation    1.736   1.521   1.716
perempuan (female)   Mean              3.74    4.07    4.05
                     N                 42      42      42
                     Std. Deviation    1.466   1.504   1.447
Total                Mean              3.73    4.00    4.00
                     N                 63      64      64
                     Std. Deviation    1.547   1.501   1.533
Summary of Statistical Tools For Descriptive Analyses
• Frequency/percentage table
• Pie or bar charts
• Histogram
• Frequency polygon
• Cross-tabulation
• Scatter diagram
• Mean, Median, Mode, Maximum, Minimum
• Range, Variance, Standard Deviation, Coefficient of variation, Standard Scores
ACTIVITY 3- COMPARISON OF MEANS OF TWO GROUPS
EXPLORING DIFFERENCES BETWEEN TWO GROUPS
1. t-test
t-tests are used when you have two groups (e.g., males and females) or two sets of data (before and after), and you wish to compare the mean score on some continuous variable.
There are two main types of t-tests.
Paired samples t-tests (also called repeated measures) are used when you are interested in changes in scores for subjects tested at Time 1 and then at Time 2 (often after some intervention or event). The samples are ‘related’ because they are the same people tested each time.
Independent samples t-tests are used when you have two different (independent) groups of people (e.g., males and females) and you are interested in comparing their scores. In this case, you collect information on only one occasion, but from two different sets of people.
• TO MAKE COMPARISONS BETWEEN GROUPS ON ANY MEASURED VARIABLES AT INTERVAL AND RATIO LEVEL
• CLICK ANALYZE => COMPARE MEANS
• You will get the following sub-menus:
– MEANS
– ONE-SAMPLE T-TEST
– INDEPENDENT SAMPLES T-TEST
– PAIRED SAMPLES T-TEST
– ONE-WAY ANOVA
PURPOSE: Comparing means of two groups
EXAMPLE OF RESEARCH QUESTION: Is there a difference in instructors’ efficacy in teaching and learning mathematics as perceived by students of different gender?
PARAMETRIC STATISTIC: Independent t-test
INDEPENDENT VARIABLE: One categorical independent variable, gender, with two levels (males and females)
DEPENDENT VARIABLE: One continuous dependent variable, students’ perception of instructors’ efficacy in teaching and learning
To Compare Means of Two Groups
• Click: Analyze > Compare Means > Independent-Samples T Test
• You will get the Independent-Samples T Test dialog box
• Select your variables – Test Variable(s) and Grouping Variable
• Click Define Groups and enter the codes of the two groups
• Click OK
Independent Samples Test: INSTRUCTORS’ EFFICACY
                              Levene's Test for
                              Equality of Variances   t-test for Equality of Means
                              F       Sig.            t       df       Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed       .883    .351            -.094   60       .926             -.02315          .24740                 -.51803       .47173
Equal variances not assumed                           -.095   42.237   .925             -.02315          .24347                 -.51440       .46811

Group Statistics: INSTRUCTORS’ EFFICACY
Gender               N    Mean     Std. Deviation  Std. Error Mean
lelaki (male)        21   3.9490   .89190          .19463
perempuan (female)   41   3.9721   .93662          .14628
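The t values in the Independent Samples Test output can be recovered from the Group Statistics alone. The Python sketch below is an illustration, not SPSS's own code: it computes the pooled-variance t (the "equal variances assumed" row) and the Welch t with Welch-Satterthwaite degrees of freedom (the "equal variances not assumed" row). Because it starts from the rounded means and standard deviations printed in the table, the results agree with the output only to about three decimal places.

```python
import math


def pooled_t(n1, m1, s1, n2, m2, s2):
    """Student's t assuming equal variances; returns (t, df, std_error)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df, se


def welch_t(n1, m1, s1, n2, m2, s2):
    """Welch's t without the equal-variance assumption; returns (t, df, std_error)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df, se


# Group Statistics from the output: (N, mean, SD) for lelaki and perempuan.
male = (21, 3.9490, 0.89190)
female = (41, 3.9721, 0.93662)

t_eq, df_eq, se_eq = pooled_t(*male, *female)
t_w, df_w, se_w = welch_t(*male, *female)
print(f"equal variances assumed:     t({df_eq}) = {t_eq:.3f}, SE = {se_eq:.5f}")
print(f"equal variances not assumed: t({df_w:.3f}) = {t_w:.3f}, SE = {se_w:.5f}")
```

The Sig. (2-tailed) values require the t distribution, which is not in the Python standard library, so they are left to SPSS here.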
DECISION MATRIX (Levene's test for equality of variances)
NULL HYPOTHESIS: There is no significant difference in the variance of students’ perception of instructors’ efficacy in T&L by gender.
ALTERNATIVE HYPOTHESIS: There is a significant difference in the variance of students’ perception of instructors’ efficacy in T&L by gender.
ALPHA VALUE: 0.05
SIGNIFICANCE VALUE (FROM THE SPSS OUTPUT): .351
EVALUATION: Sig. value > α
DECISION: Fail to reject the null hypothesis. Therefore, we read t from the "Equal variances assumed" row.
NULL HYPOTHESIS: There is no significant difference in students’ mean perception of instructors’ efficacy in T&L by gender.
ALTERNATIVE HYPOTHESIS: There is a significant difference in students’ mean perception of instructors’ efficacy in T&L by gender.
ALPHA VALUE: 0.05
SIGNIFICANCE VALUE (FROM THE SPSS OUTPUT): .926
EVALUATION: The Sig. value is greater than α, so the sample provides little evidence against the null hypothesis.
DECISION: Fail to reject the null hypothesis.
CONCLUSION: There is no significant difference in students’ mean perception of instructors’ efficacy in T&L by gender, t(60) = -.094, p > .05 (p = .926).
PURPOSE: Comparing means of two groups
EXAMPLE OF RESEARCH QUESTION: Is there a difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting?
PARAMETRIC STATISTIC: Dependent t-test
DEPENDENT VARIABLE: Two continuous dependent variables: students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting (Item 1 vs Item 3)
To Compare Means of Two Dependent Groups
• Click: Analyze -> Compare Means -> Paired-Samples T Test
• You will get a Paired-Samples T Test dialog box
• Select your variables – Paired Variables
• Click OK
Paired Samples Correlations
Pair 1: My instructor wants us to enjoy learning maths with My teacher try to make mathematics lessons interesting
N    Correlation  Sig.
63   .708         .000

Paired Samples Test
Pair 1: My instructors wants us to enjoy learning maths with My teacher try to make mathematics lessons interesting
Paired Differences
Mean    Std. Deviation  Std. Error Mean  95% CI of the Difference (Lower, Upper)  t        df   Sig. (2-tailed)
-.238   1.174           .148             -.534, .058                              -1.610   62   .112
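The paired t statistic is simply the mean of the paired differences divided by its standard error. The Python sketch below is an illustration that starts from the rounded values printed in the Paired Samples Test table, so it matches the SPSS figures only approximately; the function name is hypothetical.

```python
import math


def paired_t_from_summary(mean_diff: float, sd_diff: float, n: int):
    """Return (t, df, std_error) for a paired-samples t-test."""
    if n < 2:
        raise ValueError("at least two pairs are required")
    se = sd_diff / math.sqrt(n)   # standard error of the mean difference
    return mean_diff / se, n - 1, se


# Values from the Paired Samples Test output (63 pairs).
t, df, se = paired_t_from_summary(mean_diff=-0.238, sd_diff=1.174, n=63)
print(f"t({df}) = {t:.2f}, SE = {se:.3f}")
```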
DECISION MATRIX
NULL HYPOTHESIS: There is no significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting.
ALTERNATIVE HYPOTHESIS: There is a significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting.
ALPHA VALUE: 0.05
SIGNIFICANCE VALUE (FROM THE SPSS OUTPUT): .112
EVALUATION: The Sig. value is greater than α, so the sample provides little evidence against the null hypothesis.
DECISION: Fail to reject the null hypothesis.
CONCLUSION: There is no significant difference between students’ perception of mathematics instructors’ role in making the students enjoy learning maths and in making maths lessons interesting, t(62) = -1.610, p > .05 (p = .112).
EXPLORING DIFFERENCES BETWEEN GROUPS
One-way analysis of variance
One-way analysis of variance is similar to a t-test, but is used when you have two or more groups and you wish to compare their mean scores on a continuous variable.
It is called one-way because you are looking at the impact of only one independent variable on your dependent variable.
A one-way analysis of variance (ANOVA) will let you know whether your groups differ, but it won’t tell you where the significant difference is (gp1/gp2, gp3/gp4, etc.).
You can conduct post-hoc comparisons to find out which groups are significantly different from one another.
You could also choose to test differences between specific groups, rather than comparing all the groups, by using planned comparisons.
Similar to t-tests, there are two types of one-way ANOVA: repeated measures ANOVA (the same people on more than two occasions) and between-groups (or independent samples) ANOVA, where you are comparing the mean scores of two or more different groups of people.
PURPOSE: Comparing means of three groups
EXAMPLE OF RESEARCH QUESTION: Is there a difference in students’ perception of instructors’ efficacy in T&L mathematics by race?
PARAMETRIC STATISTIC: One-way between-groups ANOVA
INDEPENDENT VARIABLE: One categorical independent variable (three levels of race)
DEPENDENT VARIABLE: One continuous dependent variable, students’ perception of instructors’ efficacy in T&L mathematics
To Compare Means of Three or More Groups
• Click: Analyze -> Compare Means -> One-Way ANOVA
• You will get a One-Way ANOVA dialog box
• Select your variables: Dependent variables and the Factor (group) variable
• Click: Options
• Click OK
Descriptives: INSTRUCTORS’_EFFICACY
                   N    Mean     Std. Deviation  Std. Error  95% CI for Mean (Lower Bound, Upper Bound)  Minimum  Maximum
MELAYU (Malay)     14   4.2704   .73282          .19586      3.8473, 4.6935                              3.07     5.36
CINA (Chinese)     40   3.7339   .96118          .15198      3.4265, 4.0413                              2.21     5.71
INDIA (Indian)      8   4.5804   .46673          .16501      4.1902, 4.9706                              3.86     5.07
Total              62   3.9643   .91443          .11613      3.7321, 4.1965                              2.21     5.71

ANOVA: INSTRUCTORS’ EFFICACY
                 Sum of Squares  df   Mean Square  F       Sig.
Between Groups    6.471           2   3.235        4.286   .018
Within Groups    44.537          59    .755
Total            51.008          61
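The ANOVA table can be rebuilt from the Descriptives table, which shows where the between-groups and within-groups sums of squares come from. The Python sketch below is an illustration using the rounded group sizes, means and standard deviations from the output; the function name one_way_anova_from_summary is hypothetical.

```python
def one_way_anova_from_summary(groups):
    """groups: list of (n, mean, sd) per group. Returns a dict of ANOVA terms."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within groups: variation of scores around their own group mean.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between, "df_between": df_between,
        "ss_within": ss_within, "df_within": df_within,
        "F": f_ratio,
    }


# (N, mean, SD) for MELAYU, CINA and INDIA from the Descriptives table.
race_groups = [(14, 4.2704, 0.73282), (40, 3.7339, 0.96118), (8, 4.5804, 0.46673)]
result = one_way_anova_from_summary(race_groups)
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
```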
TEST OF DIFFERENCES BETWEEN GROUPS – BY RACE
DECISION MATRIX
NULL HYPOTHESIS: There is no significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race.
ALTERNATIVE HYPOTHESIS: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race.
ALPHA VALUE: 0.05
SIGNIFICANCE VALUE (FROM THE SPSS OUTPUT): .018
EVALUATION: The Sig. value is smaller than α, so a result like this would be unlikely if the null hypothesis were true.
DECISION: Reject the null hypothesis; accept the alternative hypothesis.
CONCLUSION: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by race, F(2, 59) = 4.29, p < .05.
TEST OF DIFFERENCES BETWEEN GROUPS – BY RELIGION
ANOVA: TEACHER_FACTOR
                 Sum of Squares  df   Mean Square  F        Sig.
Between Groups   14.849           2   7.424        11.982   .000
Within Groups    35.940          58    .620
Total            50.789          60

Descriptives: TEACHER_FACTOR
                       N    Mean     Std. Deviation  Std. Error  95% CI for Mean (Lower Bound, Upper Bound)  Minimum  Maximum
ISLAM                  24   4.3929   .98705          .20148      3.9761, 4.8097                              2.21     5.71
BUDDHA (Buddhist)      24   3.3601   .39376          .08038      3.1938, 3.5264                              2.71     4.14
KRISTIAN (Christian)   13   4.3242   .91129          .25275      3.7735, 4.8749                              2.93     5.71
Total                  61   3.9719   .92004          .11780      3.7363, 4.2075                              2.21     5.71
DECISION MATRIX
NULL HYPOTHESIS: There is no significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion.
ALTERNATIVE HYPOTHESIS: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion.
ALPHA VALUE: 0.05
SIGNIFICANCE VALUE (FROM THE SPSS OUTPUT): .000
EVALUATION: The Sig. value is smaller than α, so a result like this would be unlikely if the null hypothesis were true.
DECISION: Reject the null hypothesis; accept the alternative hypothesis.
CONCLUSION: There is a significant difference in mean students’ perception of instructors’ efficacy in T&L mathematics by religion, F(2, 58) = 11.98, p < .05.
Pearson Product-Moment Correlation
• A measure of the linear relationship between two variables.
• Correlation analysis produces the Pearson correlation coefficient (r).
• It indicates the strength and the direction (+ve / -ve) of the relationship between the variables.
Significance of Relationship
• The significance of the relationship is expressed as a probability level p (e.g., significant at p = .05).
• The smaller the p-level, the stronger the evidence against the null hypothesis of no relationship.
• The larger the absolute value of the correlation (r), the stronger the relationship.
Example 1: Correlations
                                               Total life satisfaction   Total Self esteem
Total life satisfaction   Pearson Correlation  1                         .488**
                          Sig. (2-tailed)                                .000
                          N                    436                       434
Total Self esteem         Pearson Correlation  .488**                    1
                          Sig. (2-tailed)      .000
                          N                    434                       436
**. Correlation is significant at the 0.01 level (2-tailed).
Example 2: Correlations
                                             Intimate Relationship  Friends  Common Sense  Academic Intelligence  General
Intimate Relationship   Pearson Correlation  1                      .552**   .351**        .218                   .393**
                        Sig. (2-tailed)                             .000     .001          .052                   .000
                        N                    80                     80       80            80                     80
Friends                 Pearson Correlation  .552**                 1        .462**        .244*                  .546**
                        Sig. (2-tailed)      .000                            .000          .029                   .000
                        N                    80                     80       80            80                     80
Common Sense            Pearson Correlation  .351**                 .462**   1             .400**                 .525**
                        Sig. (2-tailed)      .001                   .000                   .000                   .000
                        N                    80                     80       80            80                     80
Academic Intelligence   Pearson Correlation  .218                   .244*    .400**        1                      .261*
                        Sig. (2-tailed)      .052                   .029     .000                                 .019
                        N                    80                     80       80            80                     80
General                 Pearson Correlation  .393**                 .546**   .525**        .261*                  1
                        Sig. (2-tailed)      .000                   .000     .000          .019
                        N                    80                     80       80            80                     80
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
Report the Output of a Pearson Product-Moment Correlation
• Report the value of the correlation coefficient, r, as well as the degrees of freedom (df)
• The degrees of freedom (df) is the number of data points minus 2 (N - 2).
Coefficient of Determination, r²
• How much of the variation in the DV (Y) is associated with change in the IV (X).
• It is the proportion of variance explained by the correlation and is sometimes expressed as a percentage.
• Example: r² = 0.36. Hence, 36% of the variation in Y is associated with the change in X; 64% of the variation in Y is due to other factors.
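The quantities to report for a correlation (r, the degrees of freedom N - 2, and r²) can be computed directly from paired scores. The Python sketch below is an illustration using a small made-up data set, not data from the course file.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two lists of equal length with at least 3 pairs")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
df = len(x) - 2          # degrees of freedom reported with r
r_squared = r ** 2       # coefficient of determination
print(f"r({df}) = {r:.3f}, r squared = {r_squared:.2f}")
```

In this made-up example r² is 0.60, so 60% of the variation in y is associated with x and 40% with other factors.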
Regression Analysis
• Regression analysis procedures have as their primary purpose the development of an equation that can be used for predicting values on some DV for all members of a population.
• A secondary purpose is to use regression analysis as a means of explaining causal relationships among variables.
Regression Analysis
• The most basic application of regression analysis is the bivariate situation, which is referred to as simple linear regression, or just simple regression.
• Simple regression involves a single IV and a single DV.
• Goal: to obtain a linear equation so that we can predict the value of the DV if we have the value of the IV.
• Simple regression capitalizes on the correlation between the DV and IV in order to make specific predictions about the DV.
• The correlation tells us how much information about the DV is contained in the IV.
• If the correlation is perfect (i.e., r = ±1.00), the IV contains everything we need to know about the DV, and we will be able to perfectly predict one from the other.
• Regression analysis is the means by which we determine the best-fitting line, called the regression line.
• The regression line is the straight line that lies closest to all points in a given scatterplot.
• This line always passes through the centroid of the scatterplot, the point (mean of X, mean of Y).
• Three important facts about the regression line must be known:
– The extent to which points are scattered around the line
– The slope of the regression line
– The point at which the line crosses the Y-axis
• The extent to which the points are scattered around the line is typically indicated by the degree of relationship between the IV (X) and DV (Y).
• This relationship is measured by a correlation coefficient – the stronger the relationship, the higher the degree of predictability between X and Y.
• The degree of slope is determined by the amount of change in Y that accompanies a unit change in X.
• It is the slope that largely determines the predicted values of Y from known values for X.
• It is important to determine exactly where the regression line crosses the Y-axis (this value is known as the Y-intercept).
What you will calculate
1. A linear regression equation.
2. The statistical significance of β1 (null hypothesis significance testing).
3. A measure of effect size.
4. Confidence and prediction intervals.
EXAMPLE USED
• A researcher decided to determine if cholesterol concentration was related to time spent watching TV in otherwise healthy 45 to 65 year old men (an at-risk category of people). They believed that there would be a positive relationship: the more time people spent watching TV, the greater their cholesterol concentration.
• The researcher also wished to be able to predict cholesterol concentration and to know the proportion of cholesterol concentration that time spent watching TV could explain.
Daily time spent watching TV was recorded in the variable time_tv. Cholesterol concentration was recorded in the variable cholesterol.
The following instructions will show you how to produce a scatterplot in SPSS to establish if a linear relationship exists:
• Click Graphs > Chart Builder... on the main menu.
• Select "Scatter/Dot" from the Choose from: box in the bottom-left-hand corner of the Chart Builder dialogue box.
• Selecting "Scatter/Dot" presents eight different scatter/dot options in the lower-middle section of the Chart Builder dialogue box.
• Drag and drop the top-left-hand option (labelled "Simple Scatter" when you hover your mouse over it) into the main chart preview pane.
• The main chart preview pane now shows a simple scatterplot with boxes for the y-axis ("Y-Axis?") and x-axis ("X-Axis?") for you to populate with the appropriate variables.
• Drag and drop the independent variable, time_tv, from the Variables: box into the "X-Axis?" box in the main chart preview pane, and do the same for the dependent variable, cholesterol, but into the "Y-Axis?" box.
• Click "Y-Axis1 (Point1)" in the Element Properties dialogue box (the box on the right-hand side).
• Uncheck the Minimum option in the Scale Range area so that the Custom value is highlighted and has a value of 0 (zero).
• Click the Apply button to confirm these changes.
• Click the OK button in the Chart Builder dialogue box to generate the scatterplot.
For this example, you can conclude from visual inspection of the scatterplot that there is a linear relationship between cholesterol concentration and time spent watching TV.
• Click Analyze > Regression > Linear... on the main menu.
• Click the Continue button, then click the OK button. This will generate the output.
Determining how well the model fits
The Model Summary table provides the information needed to determine how well the regression model fits the data:
R is the multiple correlation coefficient ("R" column).
As there is only one independent variable, R is simply the absolute value of the Pearson correlation between the dependent variable and the independent variable. It simply indicates the strength of the association between the two variables
• In this example, R = 0.389, which indicates a moderate correlation. However, you will not normally have to report this value.
• The R² value ("R Square" column) represents the proportion of variance in the dependent variable that can be explained by our independent variable (technically, it is the proportion of variation accounted for by the regression model above and beyond the mean model).
• In this example, R² = 0.151, which means that the independent variable, time_tv, explains 15.1% of the variability of the dependent variable, cholesterol. However, R² is based on the sample and is a positively biased estimate of the proportion of the variance of the dependent variable accounted for by the regression model (i.e., it is too large).
• SPSS also prints out an adjusted R² value ("Adjusted R Square" column), which corrects this positive bias to provide a value that would be expected in the population. Adjusted R² is also an estimate of the effect size, which at 0.143 (14.3%) is indicative of a medium effect size, according to Cohen's (1988) classification.
The ANOVA table informs you whether the regression model results in a statistically significantly better prediction of the dependent variable, cholesterol, than if you just used the mean value.
The general form of the line to predict cholesterol concentration from time spent watching TV, expressed in SPSS variable form (i.e., cholesterol and time_tv), is:
cholesterol = b0 + (b1 x time_tv)
where b0 is the intercept and b1 is the slope coefficient. You can ascertain these values by inspecting the Coefficients table.
A linear regression established that daily time spent watching TV could statistically significantly predict cholesterol concentration, F(1, 97) = 14.395, p < .0001, and time spent watching TV accounted for 14.3% of the variability in cholesterol concentration (adjusted R²). The regression equation was: predicted cholesterol concentration = -0.944 + 0.037 x (time spent watching TV).
Y' = -0.944 + 0.037 X
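As a small illustration of how the fitted line is used for prediction, the sketch below plugs values into the reported equation. The coefficients come from the slide; the function name and the viewing times fed into it are hypothetical, and time_tv is taken to be in whatever unit it was recorded in.

```python
B0 = -0.944   # intercept (b0) as reported on the slide
B1 = 0.037    # slope (b1) for time_tv as reported on the slide


def predict_cholesterol(time_tv: float) -> float:
    """Predicted cholesterol concentration for a given time_tv value."""
    return B0 + B1 * time_tv


# Hypothetical viewing times, for illustration only
for time_tv in (100, 150, 200):
    print(time_tv, round(predict_cholesterol(time_tv), 3))
```

Each extra unit of time_tv raises the predicted value by b1 = 0.037. Predictions are only meaningful within the range of time_tv values observed in the sample.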
Descriptive Statistics
                     Mean     Std. Deviation   N
Grade - PMR MATH     2.53     1.468            62
TEACHER_FACTOR       3.9643   .91443           62

Correlations
                                          Grade - PMR MATH   TEACHER_FACTOR
Pearson Correlation   Grade - PMR MATH    1.000              .571
                      TEACHER_EFF         .571               1.000
Sig. (1-tailed)       Grade - PMR MATH    .                  .000
                      TEACHER_EFF         .000               .
N                     Grade - PMR MATH    62                 62
                      TEACHER_EFF         62                 62

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .571 (a)   .326       .315                1.215
a. Predictors: (Constant), TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

ANOVA (b)
Model           Sum of Squares   df   Mean Square   F        Sig.
1 Regression    42.848           1    42.848        29.021   .000 (a)
  Residual      88.588           60   1.476
  Total         131.435          61
a. Predictors: (Constant), TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

Coefficients (a)
                    Unstandardized Coefficients   Standardized Coefficients
Model               B         Std. Error          Beta                        t        Sig.
1 (Constant)        -1.101    .692                                            -1.591   .117
  TEACHER_FACTOR    .917      .170                .571                        5.387    .000
a. Dependent Variable: Grade - PMR MATH
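The figures in the Model Summary, ANOVA and Coefficients tables are linked by simple arithmetic. The sketch below recomputes them from the sums of squares printed above. It is a check on the output, not a replacement for running the regression, and the variable names are chosen for this illustration.

```python
import math

# Values copied from the ANOVA table above
ss_regression, df_regression = 42.848, 1
ss_residual, df_residual = 88.588, 60

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total           # "R Square"
ms_regression = ss_regression / df_regression  # "Mean Square", Regression row
ms_residual = ss_residual / df_residual        # "Mean Square", Residual row
f_ratio = ms_regression / ms_residual          # "F"
std_error_estimate = math.sqrt(ms_residual)    # "Std. Error of the Estimate"

# With a single predictor, F equals the square of the slope's t value
t_slope = 5.387                                # from the Coefficients table

print("R squared:", round(r_squared, 3))
print("F:", round(f_ratio, 3))
print("Std. error of the estimate:", round(std_error_estimate, 3))
print("t squared:", round(t_slope ** 2, 3))
```

Small differences in the last decimal place are expected because the printed table values are already rounded.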
Descriptive Statistics
                     Mean     Std. Deviation   N
Grade - PMR MATH     2.53     1.468            62
TEACHER_EFF          3.9643   .91443           62
Race                 1.90     .593             62

Correlations
                                          Grade - PMR MATH   TEACHER_FACTOR   Race
Pearson Correlation   Grade - PMR MATH    1.000              .571             -.015
                      TEACHER_EFF         .571               1.000            .019
                      Race                -.015              .019             1.000
Sig. (1-tailed)       Grade - PMR MATH    .                  .000             .453
                      TEACHER_EFF         .000               .                .440
                      Race                .453               .440             .
N                     Grade - PMR MATH    62                 62               62
                      TEACHER_EFF         62                 62               62
                      Race                62                 62               62

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .572 (a)   .327       .304                1.225
a. Predictors: (Constant), Race, TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

ANOVA (b)
Model           Sum of Squares   df   Mean Square   F        Sig.
1 Regression    42.939           2    21.469        14.313   .000 (a)
  Residual      88.497           59   1.500
  Total         131.435          61
a. Predictors: (Constant), Race, TEACHER_FACTOR
b. Dependent Variable: Grade - PMR MATH

Coefficients (a)
                    Unstandardized Coefficients   Standardized Coefficients
Model               B         Std. Error          Beta                        t        Sig.
1 (Constant)        -.980     .853                                            -1.150   .255
  TEACHER_FACTOR    .917      .172                .571                        5.349    .000
  Race              -.065     .265                -.026                       -.246    .806
a. Dependent Variable: Grade - PMR MATH
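One way to see how little Race adds is to compare the regression sums of squares of the two models. The sketch below computes the F statistic for the change when Race is added to the model with TEACHER_FACTOR alone. It uses 42.848 from the one-predictor ANOVA table shown earlier and the two-predictor values above; the variable names are illustrative.

```python
# Regression sums of squares from the two ANOVA tables
ss_reg_reduced = 42.848    # model with TEACHER_FACTOR only
ss_reg_full = 42.939       # model with TEACHER_FACTOR and Race

# Residual row of the two-predictor model
ss_res_full, df_res_full = 88.497, 59
added_predictors = 1       # only Race was added

ms_res_full = ss_res_full / df_res_full
f_change = ((ss_reg_full - ss_reg_reduced) / added_predictors) / ms_res_full

# For a single added predictor, F change equals that predictor's t squared
t_race = -0.246            # from the Coefficients table

print("F change:", round(f_change, 3))
print("t squared for Race:", round(t_race ** 2, 3))
```

Both values are close to 0.06, far below any conventional critical value, which matches the non-significant coefficient for Race (Sig. = .806) and the drop in adjusted R² from .315 to .304.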
Performing the paired t-test
Use: Analyze > Compare Means > Paired-Samples T Test
This opens up the dialogue box.
The paired samples t-test dialogue box
Transfer the two levels of the IV to the 'Paired Variables' box. Both need to be highlighted.
The variables are shown in the box as paired. Click OK.
Output (1)
Mean for each condition
Number of paired scores
SD for each condition
Means suggest difference, but need to look at output of t-test to see if significant
Output (2)
t-value
p value
df
Mean difference score
Reporting
There was a significant effect of statistics lecture on depression, t(18) = 5.86, p < .05. Findings indicated that depression scores recorded after the lecture were lower (mean = 13.0, SD = 2.33) than those recorded before the lecture (mean = 13.95, SD = 2.48).
Independent samples t-test
Used when different participants take part in each experimental condition.
Hypothesis: males can eat more chillies than females.
Eight males & eight females were tested on their chilli tolerance in a chilli eating competition.
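Use the arrow button to move the DV into the Test Variable(s) box.
Use the arrow button to move the IV into the Grouping Variable box, then define the levels (groups) of the IV.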
Examine descriptive statistics first.
Group Statistics
CHILLIES
GENDER    N    Mean     Std. Deviation   Std. Error Mean
male      8    5.6250   1.4079           .4978
female    8    4.1250   1.1260           .3981
[Chart: mean CHILLIES by GENDER (female, male); vertical axis from 3.5 to 6.0]
Results suggest that males could eat more chillies than females. But need to conduct t-test to determine if this difference is significant.
Independent Samples Test
CHILLIES                       Levene's Test for       t-test for Equality of Means
                               Equality of Variances
                               F       Sig.            t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed        .443    .517            2.353   14       .034              1.5000            .6374                   .1330    2.8670
Equal variances not assumed                            2.353   13.355   .035              1.5000            .6374                   .1267    2.8733
(Lower and Upper: 95% Confidence Interval of the Difference)
Levene’s test - scores must have equal variance to use standard t-test techniques. Variances equal if p > 0.05
t-value, df & p shown here. Difference is significant if p < 0.05.
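The rule for reading the table can be written down as a tiny helper: look at the Sig. value of Levene's test first, then read t, df and p from the matching row. This is a sketch of the rule as stated on the slide, with a hypothetical function name.

```python
def row_to_report(levene_sig: float, alpha: float = 0.05) -> str:
    """Pick the SPSS output row to read, following the slide's rule."""
    if levene_sig > alpha:
        return "Equal variances assumed"
    return "Equal variances not assumed"


# Levene's Sig. from the chilli table above
print(row_to_report(0.517))
# A significant Levene's test, such as Sig. = .013 in a later example
print(row_to_report(0.013))
```

For the chilli data, Levene's Sig. is .517, so the first row (t = 2.353, df = 14, p = .034) is the one to report.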
Results section
We examined chilli tolerance in males and females. Eight males and eight females were tested on their ability to consume chillies. Males ate a mean of 5.63 chillies (s = 1.41) and females a mean of 4.13 (s = 1.13). Males ate significantly more chillies than females, t(14) = 2.35, p < 0.05.
The results suggest that males have greater chilli tolerance than females (or that males are foolish enough to try to win chilli eating contests).
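The t value in the "Equal variances assumed" row can be recovered from the Group Statistics alone. The sketch below applies the usual pooled-variance formula to the means, standard deviations and group sizes printed by SPSS. The function name is made up for this illustration, and the p value is not computed because it needs the t distribution.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df, std_error


# Males first, then females, as in the Group Statistics table
t_value, df, std_error = pooled_t(5.6250, 1.4079, 8, 4.1250, 1.1260, 8)

print("t:", round(t_value, 3))
print("df:", df)
print("Std. error of the difference:", round(std_error, 4))
```

These agree with the SPSS output (t = 2.353, df = 14, Std. Error Difference = .6374).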
Paired samples t-test
Used when the same or matched pairs of participants take part in both experimental conditions.
Hypothesis: chilli tolerance is greater on cold days than on warm days.
Ten participants ate chillies on a warm day and then on a cold day.
Use arrow key to select variables that are to be compared.
Paired Samples Test
                       Paired Differences
                       Mean      Std. Deviation   Std. Error Mean   Lower     Upper    t        df   Sig. (2-tailed)
Pair 1  WARM - COLD    -2.3000   2.9078           .9195             -4.3801   -.2199   -2.501   9    .034
(Lower and Upper: 95% Confidence Interval of the Difference)
Mean difference between pairs of scores shown here.
T-value, df & p shown here. Difference is significant if p < 0.05.
Results section
We examined chilli tolerance on warm and cold days. Ten participants were tested on their ability to consume chillies. The mean difference (warm minus cold) was -2.30, with more chillies consumed on cold days than on warm days. Chilli tolerance was significantly greater on cold days than on warm days, t(9) = -2.501, p < 0.05.
The results suggest that individuals can consume more chillies on cold days than on warm days.
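The t value and confidence interval in the Paired Samples Test table follow directly from the mean and standard deviation of the difference scores. The sketch below redoes that arithmetic. The critical value 2.262 (two-tailed, 5%, df = 9) is taken from a standard t table rather than computed, and the function name is illustrative.

```python
import math


def paired_t(mean_diff, sd_diff, n, t_critical):
    """Paired t statistic and confidence interval from difference scores."""
    std_error = sd_diff / math.sqrt(n)
    t_value = mean_diff / std_error
    half_width = t_critical * std_error
    return t_value, n - 1, (mean_diff - half_width, mean_diff + half_width)


T_CRIT_DF9 = 2.262   # from a standard t table: two-tailed 5%, df = 9

t_value, df, (lower, upper) = paired_t(-2.3000, 2.9078, 10, T_CRIT_DF9)

print("t:", round(t_value, 3), "df:", df)
print("95% CI:", round(lower, 2), "to", round(upper, 2))
```

The interval does not include zero, which is another way of seeing that the difference is significant at the 5% level.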
Paired Samples Statistics
                   Mean     N    Std. Deviation   Std. Error Mean
Pair 1  Doppler    ,4714    21   ,24276           ,05297
        Cath       ,5019    21   ,25522           ,05569

Paired Samples Correlations
                          N    Correlation   Sig.
Pair 1  Doppler & Cath    21   ,888          ,000

Paired Samples Test
                          Paired Differences
                          Mean       Std. Deviation   Std. Error Mean   Lower      Upper     t        df   Sig. (2-tailed)
Pair 1  Doppler - Cath    -,03048    ,11864           ,02589            -,08448    ,02353    -1,177   20   ,253
(Lower and Upper: 95% Confidence Interval of the Difference)
Paired Sample T-test
Results section
We examined chilli tolerance for two types of chillies. Twenty-one participants were tested on their ability to consume both types. The mean difference was -0.030. There was no significant difference in chilli tolerance between the two types of chillies, t(20) = -1.18, p > 0.05.
The results suggest that there is no difference in chilli tolerance between the two types of chillies.
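The Paired Samples Correlations table matters because the correlation between the two measures determines how variable the difference scores are. The sketch below uses the standard identity var(X - Y) = var(X) + var(Y) - 2 r sd(X) sd(Y) with the Doppler and Cath values printed above. Results match the SPSS output only approximately, because the printed correlation is rounded to three decimals; the function name is illustrative.

```python
import math


def sd_of_differences(sd1: float, sd2: float, r: float) -> float:
    """SD of paired differences from the two SDs and their correlation."""
    return math.sqrt(sd1 ** 2 + sd2 ** 2 - 2 * r * sd1 * sd2)


n_pairs = 21
sd_diff = sd_of_differences(0.24276, 0.25522, 0.888)
std_error = sd_diff / math.sqrt(n_pairs)
t_value = -0.03048 / std_error      # mean difference from the table

print("SD of differences:", round(sd_diff, 4))
print("t:", round(t_value, 2))
```

A high positive correlation makes the difference scores much less variable than either measure on its own, which is what gives the paired design its sensitivity.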
Group Statistics
DV
group    N    Mean      Std. Deviation   Std. Error Mean
1.00     12   25.5673   5.04689          1.45691
2.00     12   31.1920   7.79554          2.25038

Independent Samples Test
DV                             Levene's Test for       t-test for Equality of Means
                               Equality of Variances
                               F       Sig.            t        df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower       Upper
Equal variances assumed        7.236   .013            -2.098   22       .048              -5.62476          2.68082                 -11.18443   -.06508
Equal variances not assumed                            -2.098   18.843   .050              -5.62476          2.68082                 -11.23894   -.01057
(Lower and Upper: 95% Confidence Interval of the Difference)
The group variances (squared standard deviations) are 25.5 and 60.8.
Results section
We examined chilli tolerance between two groups of participants. Twelve participants per group were tested on their ability to consume chillies. Group 1 had a mean of 25.57 (s = 5.05) and group 2 had a mean of 31.19 (s = 7.80). Levene's test was significant (F = 7.236, p = .013), so equal variances were not assumed. The two groups differed significantly in their chilli consumption, t(18.84) = -2.10, p = .050, 95% CI of the difference [-11.24, -0.01].
The results suggest that group 2 has greater chilli tolerance than group 1.
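Levene's test in the Independent Samples Test table above is significant (Sig. = .013), so by the rule given earlier the "Equal variances not assumed" row applies. That row uses Welch's version of the t-test, which adjusts the degrees of freedom. The sketch below recomputes t and df from the Group Statistics; the function name is illustrative and the p value is not computed.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    std_error = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / std_error, df


# Group 1 first, then group 2, as in the Group Statistics table
t_value, df = welch_t(25.5673, 5.04689, 12, 31.1920, 7.79554, 12)

print("t:", round(t_value, 3))
print("df:", round(df, 3))
```

With equal group sizes the t value is the same in both rows of the SPSS table; only the degrees of freedom, and therefore the p value and confidence interval, change.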
Statistical Tools For Inferential Statistics
• PARAMETRIC TESTS:
– Test of hypothesis of differences between means: Z-test, t-test, F-test, MANOVA
– Test of hypothesis of relationship: Pearson r, Point-biserial, Regression
• NON-PARAMETRIC TESTS:
– Mann-Whitney
– Kruskal-Wallis
– Spearman rho
– Chi-Square, Cramer's V, Lambda, etc.
STATISTICAL DECISION
                                Reality
Decision                        H0 : No difference         HA : Difference
Fail to reject H0               1 – α                      β error (Type II error)
Reject H0                       α error (Type I error)     1 – β (Power)
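The four cells of the decision table can be expressed as a small lookup. The sketch below is only a restatement of the table in code, with a hypothetical function name. In real research the true state of H0 is unknown, so this is a teaching aid rather than something that can be applied to actual data.

```python
def classify_decision(reject_h0: bool, h0_is_true: bool) -> str:
    """Name the cell of the statistical decision table."""
    if h0_is_true:
        if reject_h0:
            return "Type I error (alpha)"
        return "Correct decision (1 - alpha)"
    if reject_h0:
        return "Correct decision, power (1 - beta)"
    return "Type II error (beta)"


# Walk through all four combinations of decision and reality
for h0_is_true in (True, False):
    for reject_h0 in (False, True):
        print("H0 true:", h0_is_true, "| reject H0:", reject_h0,
              "->", classify_decision(reject_h0, h0_is_true))
```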