Stata 2, Bivariate
Hein Stigum
Aug-14
Presentation, data and programs at:
http://folk.uio.no/heins/
04/22/23 H.S.
Datatypes
• Categorical data
  – Nominal: married/ single/ divorced
  – Ordinal: small/ medium/ large
• Numerical data
  – Discrete: number of children
  – Continuous: weight
Data type
Data type dictates type of analysis
• Categorical: frequency table, crosstable, chi-square, logistic regression
• Numerical: normal data?
  – Yes: means, T-test, linear regression
  – No: medians, nonparametric tests
Continuous symmetric outcome
Example:
Birth weight
Distribution
[Figures: kernel density of weight for all observations (weight 0 to 6000), and again after dropping weight<2000 (weight 2000 to 6000)]
kdensity weight
drop if weight<2000
kdensity weight
Central tendency and dispersion
Mean and standard deviation:
Mean with confidence interval:
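The output for these two items did not survive in the transcript. A minimal sketch of commands that produce them, assuming the course birth data are in memory and birth weight is stored in weight (the name used on the other slides):

```stata
* Assumes the birth data are already loaded; weight = birth weight
summarize weight     // N, mean, standard deviation, min and max
mean weight          // mean with standard error and 95% confidence interval
```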
Compare groups, equal variance?
• Equal
• Not equal
[Figures: group distributions with equal and with unequal variance]
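The equal-variance question can also be checked with a formal test. A sketch, assuming weight and sex are the variables used elsewhere in these slides; the choice of tests is an illustration, not necessarily what the original slide showed:

```stata
* Assumes the birth data are already loaded
sdtest weight, by(sex)    // variance-ratio F test, relies on normal data
robvar weight, by(sex)    // Levene and Brown-Forsythe tests, robust to non-normality
```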
2 independent samples
Are birth weights the same for boys and girls?
[Figures: scatterplot of birth weight by sex (boys, girls); density plot of birth weight by sex]
2 independent samples test
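The test output is missing from the transcript. A sketch of the two T-test variants, using the command listed later under "Test situations"; which variant applies depends on the equal-variance check:

```stata
* Assumes the birth data are already loaded
ttest weight, by(sex)            // two-sample T-test assuming equal variances
ttest weight, by(sex) unequal    // same test without the equal-variance assumption
```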
K independent samples
• Is birth weight the same over parity?
[Figures: scatterplot of birth weight by parity (0, 1, 2-7); density plot of birth weight, g, by parity (0, 1, 2+)]
Equal means? Linear effect? Outliers?
Equal variances?
K independent samples test
equal means?
Equal variances?
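Both questions are addressed by one command. A sketch, assuming weight and parity as named on the "Test situations" slide; the tabulate option is an addition for illustration:

```stata
* Assumes the birth data are already loaded
* The ANOVA F test addresses equal means; Bartlett's test, printed below
* the ANOVA table, addresses equal variances
oneway weight parity, tabulate    // tabulate adds mean and SD per parity group
```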
Continuous by continuous
• Does birth weight depend on gestational age?
[Figures: scatterplot of birth weight by gestational age (200 to 700); the same scatterplot with the outlier dropped (gestational age 200 to 300)]
Continuous by continuous tests
• Cut gestational age into groups, then use T-test or ANOVA
or
• Use linear regression with 1 covariate
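The two approaches can be sketched as follows. The grouping into 3 groups and the variable name gestgr are assumptions made for illustration; weight and gestAge are the names used on the "Test situations" slide:

```stata
* Assumes the birth data are already loaded
* Option 1: cut gestational age into 3 groups of roughly equal size, then ANOVA
egen gestgr = cut(gestAge), group(3)    // gestgr is a hypothetical new variable
oneway weight gestgr, tabulate

* Option 2: linear regression with gestational age as the only covariate
regress weight gestAge
```

Grouping discards information within each group, so the regression is usually the more efficient choice when the relation is roughly linear.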
Test situations
• 2 independent samples
  – ttest weight, by(sex)
• K independent samples
  – oneway weight parity
• By continuous
  – regress weight gestAge
• 2 dependent samples (paired)
  – ttest weight_last_year == weight_today
Continuous skewed outcome
Example:
Number of sexual partners
Distribution
kdensity partners if partners<=50
[Figure: distribution of number of lifetime partners, N=394; percentile markers at 25%, 50%, 75% and 95%; Partners axis ticks at 1, 4, 9, 20, 50]
Central tendency and dispersion
Median and percentiles:
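The output is missing from the transcript. A sketch of two commands that report medians and percentiles, assuming the partner data are in memory with the variable partners; the choice of percentiles is illustrative:

```stata
* Assumes the partner data are already loaded
summarize partners, detail             // median (50%) together with other percentiles
centile partners, centile(25 50 75)    // selected percentiles with confidence intervals
```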
2 independent samples
Do males and females have the same number of partners?
[Figures: scatterplot of partners by gender (males, females); density plot of partners (0 to 50) by gender]
2 independent samples test
equal medians?
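A sketch of the nonparametric two-sample tests. The variable name gender is an assumption taken from the axis label on the previous slide; the actual name in the course data may differ:

```stata
* Assumes the partner data are already loaded; gender is a hypothetical name
ranksum partners, by(gender)    // Wilcoxon rank-sum test, same as Mann-Whitney U
median partners, by(gender)     // chi-square test of equal medians
```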
K independent samples
Do partners vary with age?
[Figures: scatterplot of partners by age group agegr3 (18-29, 30-44, 45-60); scatterplot and density plot restricted to partners<20, by the same age groups]
K independent samples test
equal medians?
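For more than two groups, the Kruskal-Wallis test is the usual nonparametric counterpart of ANOVA. A sketch, assuming the variables partners and agegr3 as labelled on the previous slide:

```stata
* Assumes the partner data are already loaded
kwallis partners, by(agegr3)    // Kruskal-Wallis equality-of-populations rank test
```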
Table of tests
                        Numerical data                                           Proportions
                        Normal                      Skewed
1 sample                One sample T-test           Kolmogorov-Smirnov           Binomial
2 independent samples   Independent sample T-test   Mann-Whitney U               Chi-square
K independent samples   ANOVA                       Kruskal-Wallis               Chi-square
2 dependent samples     Paired sample T-test        Wilcoxon signed rank test    McNemar (2x2)
Categorical ordered: use nonparametric tests
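A sketch of Stata commands for the 1 sample and 2 dependent samples rows of the table. The reference values 3500 and 0.2 and the variable names before, after, case_exposed and control_exposed are hypothetical; each group of commands assumes the matching dataset is in memory:

```stata
* 1 sample
ttest weight == 3500                     // one sample T-test; 3500 is a hypothetical reference
quietly summarize partners
ksmirnov partners = normal((partners - r(mean)) / r(sd))    // Kolmogorov-Smirnov against a normal
bitest bullied == 0.2                    // binomial test; bullied coded 0/1, 0.2 is hypothetical

* 2 dependent samples
ttest weight_last_year == weight_today   // paired sample T-test
signrank before = after                  // Wilcoxon signed rank test; hypothetical variables
mcc case_exposed control_exposed         // McNemar test for a paired 2x2 table; hypothetical variables
```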
Categorical data
Example:
Being bullied
Frequency and proportion
Frequency:
Proportion with CI:
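The output is missing from the transcript. A sketch of commands for both items, assuming the bullying data are in memory with the variable bullied (the name used in the later graph command):

```stata
* Assumes the bullying data are already loaded
tabulate bullied      // frequency table with percentages
proportion bullied    // proportions with standard errors and 95% confidence intervals
```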
Proportion, confidence interval
proportion: $p = x/n$, where x = number with "disease" and n = total number
standard error: $se(p) = \sqrt{p(1-p)/n}$
confidence interval: $CI(p) = p \pm 2 \cdot se(p)$
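A worked example of the three formulas, done by hand in Stata. The counts 20 and 100 are hypothetical and not taken from the course data; the scalar names are likewise made up for illustration:

```stata
* Hypothetical counts, not from the course data
scalar x_dis = 20                             // x: number with "disease"
scalar n_tot = 100                            // n: total number
scalar p_hat = x_dis/n_tot                    // proportion p = x/n
scalar se_p  = sqrt(p_hat*(1 - p_hat)/n_tot)  // standard error se(p)
display "p = " p_hat "   se(p) = " se_p
display "CI: " p_hat - 2*se_p " to " p_hat + 2*se_p
```

With these numbers p = 0.2 and se(p) = 0.04, so the interval runs from 0.12 to 0.28.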
Crosstables
equal proportions?
Are boys bullied as much as girls?
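The crosstable itself is missing from the transcript. A sketch of a command that produces it with a test of equal proportions, assuming the variables bullied and sex; the variable name sex in the bullying data is an assumption:

```stata
* Assumes the bullying data are already loaded; sex is an assumed variable name
* Column percentages give the proportion bullied within each sex
tabulate bullied sex, column chi2    // crosstable with Pearson chi-square test
```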
Ordered categories, trend
Does bullied vary with age?
[Figure: proportion bullied by age group (2-6 y, 7-12 y, 13-17 y)]
twoway (fpfitci bullied agegr) ///
       (lfit bullied agegr)
Ordered categories, trend
Trend?
equal proportions?
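One command that reports both a trend test and a test of equality across ordered groups is tabodds. This is a sketch of one possible approach, not necessarily the command behind the original slide; it assumes bullied is coded 0/1 and agegr holds the ordered age groups:

```stata
* Assumes the bullying data are already loaded; bullied must be coded 0/1
* Output includes a test of homogeneity (equal odds, hence equal proportions)
* and a score test for trend of odds over the ordered age groups
tabodds bullied agegr
```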