R Graphics
21 Sept 2011
Three popular graphics systems
• Graphics package (aka “default”)
• Lattice package (aka “trellis” graphics)
• ggplot2 package (“grammar of graphics”)
plot()
• The type of graph depends on the type of data
# a factor
plot(nels88$paredu)
# a data frame
plot(nels88)
[Figure: bar chart of nels88$paredu; bars for ba, college, hs, lesshs, ma, phd]
[Figure: scatterplot matrix of the nels88 data frame; panels for sex, race, ses, paredu, math]
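Because the nels88 data are not reproduced here, the following sketch uses a small made-up data frame (`toy`, with hypothetical values and column names borrowed from the slides) to show how the class of the argument determines what plot() draws.

```r
# Made-up stand-in for nels88 (hypothetical values; the workshop data are not included)
set.seed(1)
toy <- data.frame(
  paredu = factor(rep(c("lesshs", "hs", "college", "ba"), length.out = 50)),
  ses    = rnorm(50),
  math   = rnorm(50, mean = 50, sd = 10)
)

class(toy$paredu)   # "factor": plot() draws a bar chart
class(toy$math)     # "numeric": plot() draws the values against their index
class(toy)          # "data.frame": plot() draws a scatterplot matrix

plot(toy$paredu)
plot(toy$math)
plot(toy)
```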
plot() methods
methods(plot)
[1] plot.acf* plot.data.frame* plot.decomposed.ts*
[4] plot.default plot.dendrogram* plot.density
[7] plot.ecdf plot.factor* plot.formula*
[10] plot.hclust* plot.histogram* plot.HoltWinters*
[13] plot.isoreg* plot.lm plot.medpolish*
[16] plot.mlm plot.ppr* plot.prcomp*
[19] plot.princomp* plot.profile.nls* plot.spec
[22] plot.spec.coherency plot.spec.phase plot.stepfun
[25] plot.stl* plot.table* plot.ts
[28] plot.tskernel* plot.TukeyHSD
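In the listing above, an asterisk marks a method that is not exported from its namespace. As a small sketch of how one might still inspect such a method, getS3method() retrieves it by generic and class name:

```r
# plot.factor is asterisked above (not exported), but getS3method() can retrieve it
m <- getS3method("plot", "factor")
is.function(m)

# plot.default is the fallback used when no class-specific method matches
d <- getS3method("plot", "default")
names(formals(d))[1:2]   # its first two arguments
```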
Univariate plots
# for a factor, plot() returns a bar chart
plot(nels88$paredu, main="Factor plot()")
# ERROR: character data must first be converted to a factor
plot(as.character(nels88$paredu), main="Character")
# for a numeric vector, a "scatterplot"
plot(nels88$math, main="Numeric plot()")
# boolean values coerced to numeric
plot(nels88$math>mean(nels88$math),
main="Boolean plot()")
[Figure: "Factor plot()"; bar chart of nels88$paredu]
[Figure: "Numeric plot()"; nels88$math plotted against Index]
[Figure: "Boolean plot()"; nels88$math > mean(nels88$math) plotted against Index, values 0 and 1]
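The character-data error can be avoided by converting to a factor first. A minimal sketch with made-up values (the vector `chr` is hypothetical, not taken from nels88):

```r
chr <- c("hs", "ba", "hs", "ma", "ba", "hs")   # made-up character data

f <- factor(chr)   # convert before plotting
levels(f)          # levels are sorted alphabetically by default
table(f)           # the counts that plot(f) draws as bars

plot(f, main = "Character converted to factor")
```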
Using coercion to alter type
plot(factor(nels88$math > mean(nels88$math)),
main="Boolean plot()", ylab="Count",
xlab="Above average")
[Figure: "Boolean plot()"; bar chart of Count by Above average (FALSE, TRUE)]
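To see what the coercion changes, it can help to look at the values themselves rather than the plot. This sketch uses a made-up score vector `x` (hypothetical values):

```r
x <- c(35, 48, 52, 61, 44, 70)   # made-up scores
above <- x > mean(x)             # logical vector

as.numeric(above)      # 0/1 values: what plot() draws against the index
table(factor(above))   # counts per level: what the bar chart shows
```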
Reordering Factor Levels
parentedu <-
factor(nels88$paredu,
levels=c("lesshs", "hs",
"college", "ba", "ma",
"phd"),
ordered=TRUE)
plot(parentedu)
[Figure: bar chart of parentedu; bars ordered lesshs, hs, college, ba, ma, phd]
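The effect of the levels= and ordered= arguments can be checked without plotting. A short sketch on a made-up vector (`edu` is hypothetical):

```r
edu <- factor(c("hs", "ba", "lesshs", "phd", "hs"))   # made-up values
levels(edu)    # alphabetical by default

edu2 <- factor(edu, levels = c("lesshs", "hs", "ba", "phd"), ordered = TRUE)
levels(edu2)   # now in the order given, which is also the bar order in plot()

edu2 >= "ba"   # ordered factors also support comparisons
```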
Other Univariate Plots for Factors
plot(nels88$race,
main="plot()")
# note that pie() and barplot()
# start from tabular data
pie(table(nels88$race),
main="pie()")
barplot(table(nels88$race),
main="barplot()",
ylab="Count")
[Figure: "plot()"; bar chart of nels88$race (White, Asian, Black, Hispanic)]
[Figure: "pie()"; pie chart of nels88$race]
[Figure: "barplot()"; bar chart of Count by race]
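Since pie() and barplot() start from a table, the table can be transformed before plotting. As an illustration on made-up data (`race` here is a hypothetical vector), prop.table() turns counts into proportions:

```r
race <- factor(c("White", "White", "Asian", "Black", "White", "Hispanic"))  # made-up

counts <- table(race)          # tabulate first
props  <- prop.table(counts)   # convert counts to proportions

barplot(props, main = "barplot()", ylab = "Proportion")
```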
Histograms
hist(nels88$ses,
probability=TRUE)
# save the histogram object, then overlay a standard normal curve
a <- hist(nels88$ses, probability=TRUE)
# add=TRUE draws dnorm on the histogram's existing axes
curve(dnorm, add=TRUE)
[Figure: "Histogram of nels88$ses"; Density against nels88$ses]
[Figure: "Histogram of nels88$ses" with a normal density curve overlaid]
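hist() returns an object describing the bins, which is what the slide stores in `a`. A small sketch of what that object holds, using made-up standardized data `z` (hypothetical) and plot=FALSE to skip drawing:

```r
set.seed(42)
z <- rnorm(200)              # made-up standardized variable

h <- hist(z, plot = FALSE)   # compute the bins without drawing
names(h)                     # includes breaks, counts, density, mids

# On the density scale (probability=TRUE) the bar areas sum to 1
area <- sum(h$density * diff(h$breaks))
area
```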
Q-Q Plots
qqnorm(nels88$ses)
qqline(nels88$ses)
[Figure: "Normal Q-Q Plot"; Sample Quantiles against Theoretical Quantiles, with reference line]
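As a rough sketch of what the two calls compute: qqnorm() pairs sample quantiles with theoretical normal quantiles, and qqline() by default draws the line through the first and third quartiles. The data `z` below are made up.

```r
set.seed(7)
z <- rnorm(100)                    # made-up sample

qq <- qqnorm(z, plot.it = FALSE)   # list: x = theoretical, y = sample quantiles

# Reconstruct the default qqline() by hand: a line through the quartile pairs
y <- quantile(z, c(0.25, 0.75))
x <- qnorm(c(0.25, 0.75))
slope     <- unname(diff(y) / diff(x))
intercept <- unname(y[1] - slope * x[1])
c(intercept = intercept, slope = slope)
```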
Bivariate Graphs
• Numeric ~ Factor
• Factor ~ Numeric
• Numeric ~ Numeric
• Factor ~ Factor
Scatterplots (Numeric ~ Numeric)
# two parameters
plot(nels88$ses, nels88$math)
# formula
plot(nels88$math ~ nels88$ses)
[Figure: two identical scatterplots of nels88$math against nels88$ses, one per call]
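The formula form also accepts a data= argument, which shortens the call and the axis labels. A sketch on a made-up data frame (`toy` and its values are hypothetical), with a fitted line added as an optional extra:

```r
set.seed(3)
toy <- data.frame(ses = rnorm(40))                 # made-up predictor
toy$math <- 50 + 5 * toy$ses + rnorm(40, sd = 4)   # made-up response

plot(math ~ ses, data = toy)        # axis labels are now "ses" and "math"

fit <- lm(math ~ ses, data = toy)   # simple linear fit
abline(fit)                         # add the fitted line to the scatterplot
```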
Boxplots (Numeric ~ Factor)
# boxplot, two parameters
plot(nels88$paredu,
nels88$math)
# boxplot, a "formula" parameter
plot(nels88$math ~
nels88$paredu)
[Figure: two boxplots of nels88$math by nels88$paredu (ba, college, hs, lesshs, ma, phd), one per call]
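The same display is available from boxplot() directly, and with plot=FALSE it returns the numbers behind each box. A sketch on made-up data (`toy` is hypothetical):

```r
set.seed(5)
toy <- data.frame(
  paredu = factor(rep(c("hs", "college", "ba"), each = 20)),   # made-up groups
  math   = rnorm(60, mean = 50, sd = 10)                       # made-up scores
)

b <- boxplot(math ~ paredu, data = toy, plot = FALSE)
b$names   # one entry per factor level
b$stats   # per group: lower whisker, lower hinge, median, upper hinge, upper whisker
```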
Factor ~ Numeric
# As x=, y= parameters, both vectors are treated as numeric
# (the factor is plotted as its integer codes): a scatterplot
plot(nels88$math,
nels88$paredu)
# As a formula with a factor response, the numeric predictor
# is binned into intervals: a "mosaic", aka "spineplot"
plot(nels88$paredu ~
nels88$math)
[Figure: scatterplot of nels88$paredu (codes 1 to 6) against nels88$math]
[Figure: spineplot of nels88$paredu by binned nels88$math (30 to 70), proportions 0.0 to 1.0]
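The difference between the two calls comes down to how the factor is handled. This sketch, on made-up vectors `edu` and `math` (hypothetical values), shows the integer codes used by the x, y form and calls spineplot() directly, which is the function the formula form ends up using for a factor response:

```r
edu  <- factor(c("hs", "ba", "ma", "hs", "ba", "hs"))   # made-up values
math <- c(41, 58, 66, 47, 52, 39)                        # made-up scores

codes <- as.integer(edu)   # the numbers plot(math, edu) puts on the y axis
codes

# Bins math into intervals and shows the share of each edu level per bin
tab <- spineplot(edu ~ math)
tab
```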
Mosaic (Factor ~ Factor)
plot(nels88$race,
nels88$paredu)
plot(nels88$paredu ~
nels88$race)
[Figure: spineplot of paredu by race (White, Asian, Black, Hispanic), axes labeled x and y]
[Figure: spineplot of nels88$paredu by nels88$race, proportions 0.0 to 1.0]
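In a spine plot of two factors, the bar widths follow the relative frequencies of the x factor and the segment heights follow the conditional proportions of the y factor within each x level. Those numbers can be computed from a two-way table, as in this sketch on made-up vectors (`race` and `edu` are hypothetical):

```r
race <- factor(c("White", "White", "Asian", "Black", "White", "Asian"))   # made-up
edu  <- factor(c("hs", "ba", "ba", "hs", "hs", "ma"))                      # made-up

tab <- table(race, edu)                   # two-way counts
widths  <- prop.table(table(race))        # bar widths: share of each race level
heights <- prop.table(tab, margin = 1)    # segment heights: edu shares within race
heights

mosaicplot(tab, main = "race by edu")     # a related display of the same table
```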