![Page 1: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/1.jpg)
Visualizing data
Why it's important
How to do it well
![Page 2: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/2.jpg)
Comparison is the primary occupation of scientists
● Compare treatment groups to control
● Compare treatments to one another
● Correct conclusions only possible when the correct comparison is made
● Visualizing experimental data makes comparison easy
![Page 3: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/3.jpg)
Not just pretty pictures
● Visualization is an important part of correct data analysis
▬ Deriving knowledge from information
▬ Deriving meaning from data – especially large data sets
● Visualization is an important part of communicating results to others
▬ We're visual creatures
▬ A picture is worth a thousand words
![Page 4: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/4.jpg)
Example: what is the relationship between mass and wing length?
● Expect bigger birds to have bigger wing spans
● Flat line – mass is not related to wing length in this bird
● What's wrong with this picture?
![Page 5: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/5.jpg)
Grouped by sex
● Group by sex, and fit a line to each sex's data swarm
● Now the pattern is apparent
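The effect of grouping can be sketched with a small calculation. The numbers below are made up for illustration (they are not the bird data behind the slides), and `fit_line` is a hypothetical helper. A single line fitted through all the points comes out nearly flat, while a separate line for each sex shows a clear positive slope.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b + m*x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return y_bar - m * x_bar, m

# Made-up (mass, wing length) values, NOT the data shown on the slides.
birds = {
    "female": ([10, 11, 12, 13, 14], [50, 52, 54, 56, 58]),
    "male": ([20, 21, 22, 23, 24], [50, 52, 54, 56, 58]),
}

# Pooled fit: ignore sex and fit one line through all the points.
all_mass = [m for mass, _ in birds.values() for m in mass]
all_wing = [w for _, wing in birds.values() for w in wing]
_, pooled_slope = fit_line(all_mass, all_wing)

# Grouped fit: one line per sex.
group_slopes = {sex: fit_line(mass, wing)[1] for sex, (mass, wing) in birds.items()}

print(f"pooled slope: {pooled_slope:.2f}")
for sex, slope in group_slopes.items():
    print(f"{sex} slope: {slope:.2f}")
```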
![Page 6: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/6.jpg)
Using graphs to understand nature of relationship between variables
● You can use graphical methods to get an idea of the functional relationship between variables
● If we want to predict how a response variable changes when we make a change in a predictor, we need to know the correct functional form
● Different functions are straight lines on plots with logarithmic axes
▬ Log-log plots – both x and y are on log scales, power functions will be linear
▬ Semi-log plots – one linear axis, one log-scale axis
– Exponential relationships are linear when y-axis is log-scale
– Logarithmic relationships are linear when x-axis is log-scale
![Page 7: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/7.jpg)
Three power function relationships between x and y
[Figure: three panels plotting Y against X on linear axes (X from 0 to 12000), titled "X and Y linearly related", "Y = aX^2", and "Y = aX^3"]
$Y = aX^b$
$Y = aX^1$
$Y = aX^2$
$Y = aX^3$
![Page 8: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/8.jpg)
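Log Y scales linearly with log X
$Y = aX^b$
$\log(Y) = \log(a) + b\log(X)$
Compare with the straight line $y = b + mx$: the intercept is $\log(a)$ and the slope is the exponent $b$.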
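As a rough sketch of how the log transform is used in practice, the code below generates points from a power function with assumed values a = 3 and b = 2, fits a straight line to log(Y) against log(X), and reads the exponent off the slope. The helper `fit_line` and the example values are illustrative assumptions, not part of the slides.

```python
import math
from statistics import fmean

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b + m*x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return y_bar - m * x_bar, m

# Assumed example: Y = a * X**b with a = 3 and b = 2.
a_true, b_true = 3.0, 2.0
xs = [10, 100, 1000, 10000]
ys = [a_true * x ** b_true for x in xs]

# Take logs of both variables, then fit an ordinary straight line.
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]
intercept, slope = fit_line(log_x, log_y)

b_est = slope            # the slope on the log-log scale is the exponent
a_est = 10 ** intercept  # the intercept is log10(a)
print(f"a = {a_est:.3f}, b = {b_est:.3f}")
```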
![Page 9: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/9.jpg)
Log-log plots
[Figure: "Three functions on a log-log scale". Log (Y) against Log (X), both axes on a log scale; three straight lines labeled Linear (slope = 1), Squared (slope = 2), and Cubed (slope = 3)]
If data are a straight line on a log-log plot, then the relationship is a power function
![Page 10: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/10.jpg)
Exponential relationships
$Y = a \cdot 10^{bX}$
[Figure: two panels plotting Y against X on linear axes (X from 0 to 12), one for b = 2 and one for b = 3]
![Page 11: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/11.jpg)
Log of Y is linear with X
$Y = a \cdot 10^{bX}$
$\log(Y) = \log(a) + bX$
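The same idea can be sketched for an exponential relationship: only Y is log-transformed, and X stays on its original scale. The values a = 5 and b = 2 and the helper `fit_line` are assumptions chosen for illustration.

```python
import math
from statistics import fmean

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b + m*x."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx
    return y_bar - m * x_bar, m

# Assumed example: Y = a * 10**(b*X) with a = 5 and b = 2.
a_true, b_true = 5.0, 2.0
xs = list(range(0, 11))
ys = [a_true * 10 ** (b_true * x) for x in xs]

# Log-transform Y only; X is left untransformed (a semi-log fit).
log_y = [math.log10(y) for y in ys]
intercept, slope = fit_line(xs, log_y)

b_est = slope            # slope of log10(Y) against X
a_est = 10 ** intercept  # intercept is log10(a)
print(f"a = {a_est:.3f}, b = {b_est:.3f}")
```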
![Page 12: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/12.jpg)
On a semi-log plot – y-axis on log scale
[Figure: "All three on a semi-log plot". Y on a log scale against X on a linear scale (0 to 12); series labeled Linear, b = 2, and b = 3]
X is on a linear scale
![Page 13: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/13.jpg)
Logarithmic relationships
$10^Y = aX^b$
![Page 14: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/14.jpg)
Y is related to the log of X
$10^Y = aX^b$
$Y = \log(a) + b\log(X)$
Y is on a linear scale, X is logarithmic
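The step from the first equation to the second is a base-10 logarithm of both sides. Written out as a short derivation (base 10 is assumed throughout, matching the 10 on the left-hand side, with a and X positive):

```latex
\begin{align*}
10^{Y} &= aX^{b} \\
\log_{10}\!\left(10^{Y}\right) &= \log_{10}\!\left(aX^{b}\right) \\
Y &= \log_{10}(a) + b\,\log_{10}(X)
\end{align*}
```

So plotting Y against log(X) gives a straight line with slope b and intercept log(a).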
![Page 15: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/15.jpg)
Wrong axis scales for the data lead to curved lines
[Figure: "Power functions in a semi-log plot". Y on a log scale against X on a linear scale (10 to 12010); series labeled Linear, Squared, and Cubed]
[Figure: "Exponential functions in a log-log plot". Y against X with both axes on a log scale; series labeled Linear, b = 2, and b = 3]
Thus, changing the scale from linear to logarithmic can help you diagnose the functional relationship between variables
Once we know the relationship, we can have Excel give us the equation for the line
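Outside Excel, the same diagnosis can be sketched numerically: transform the axes each way and see which version is closest to a straight line. This is a minimal illustration in Python, not Excel's trendline feature. The function names are hypothetical, R² is used as a simple measure of straightness, and all values are assumed to be positive so the logs exist.

```python
import math
from statistics import fmean

def r_squared(xs, ys):
    """Squared Pearson correlation; 1.0 means the points lie on a straight line."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

def straightest_scale(xs, ys):
    """Return the axis scaling under which the data are closest to a line.

    Assumes every x and y is positive so that log10 is defined.
    """
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    scores = {
        "linear axes": r_squared(xs, ys),
        "log-log (power function)": r_squared(log_x, log_y),
        "semi-log, log y (exponential)": r_squared(xs, log_y),
        "semi-log, log x (logarithmic)": r_squared(log_x, ys),
    }
    return max(scores, key=scores.get), scores

# Assumed example data: a cubic power function.
xs = list(range(1, 11))
best, scores = straightest_scale(xs, [2 * x ** 3 for x in xs])
print("straightest on:", best)
for name, score in scores.items():
    print(f"  {name}: R^2 = {score:.4f}")
```

A proportional relationship (Y = aX) is a power function with b = 1, so it is straight on both linear and log-log axes. With real, noisy data the scores should be read alongside the plots rather than instead of them.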
![Page 16: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/16.jpg)
Common graph types
● The graph you should use depends on the type of data you will display
● Common graph types (i.e. those supported by Excel) cover most of the basic data display tasks
● Less common, task-specific graph types (not supported by Excel) are used in various fields of Biology
▬ Statistical packages▬ Dedicated graphing software
![Page 17: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/17.jpg)
Excel's graph types

| Graph type | Use |
| --- | --- |
| Column | A numeric variable plotted at levels of a categorical variable |
| Bar | A horizontal column chart |
| Histogram | Displaying distribution of data (not a distinct graph type in Excel) |
| Line | Values of a numeric variable displayed at the same levels of a categorical variable |
| Pie | Proportions, percentages |
| Area | Line graph with the area below the lines shaded |
| Scatter | Relationship between two numeric variables |
| Surface | A three dimensional surface, with categorical x,y and numeric z |
| Bubble | A scatter plot with symbol size set to display a third variable |
| Radar | Each numeric variable is a ray, each observation is plotted on each ray, with points connected |
![Page 18: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/18.jpg)
Column chart
One categorical variable (grouping)
One numeric variable (response)
Data in Excel
Pivot table for graphing
Graph of pivot table data
![Page 19: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/19.jpg)
Grouped column chart
Two categorical variables (treatment group, plant type)
One numeric variable (response)
Data in Excel
Pivot table for graphing
Graph of pivot table data
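What the pivot table does before graphing can be sketched in a few lines: collapse the raw rows to one summary value per combination of the two categorical variables. The records, the category names, and the choice of the mean as the summary are all assumptions for illustration. The slides do not say which summary the pivot table uses.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical long-format records: (treatment group, plant type, response).
records = [
    ("control", "fern", 4.0),
    ("control", "fern", 6.0),
    ("control", "moss", 2.0),
    ("control", "moss", 4.0),
    ("fertilized", "fern", 9.0),
    ("fertilized", "fern", 11.0),
    ("fertilized", "moss", 5.0),
    ("fertilized", "moss", 7.0),
]

def pivot_means(rows):
    """Mean response for every (treatment, plant type) combination."""
    cells = defaultdict(list)
    for treatment, plant, response in rows:
        cells[(treatment, plant)].append(response)
    return {key: fmean(values) for key, values in cells.items()}

# Each entry is the height of one column in the grouped column chart.
table = pivot_means(records)
for (treatment, plant), mean in sorted(table.items()):
    print(f"{treatment:<11} {plant:<5} {mean:.1f}")
```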
![Page 20: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/20.jpg)
Histogram
Data in Excel
Bins, midpoints, and frequencies
Bar chart of frequencies, no gap between bars
Excel is not a good choice for histograms – use MINITAB, or equivalent
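The bins, midpoints, and frequencies behind a histogram can be computed directly. The sketch below uses equal-width bins and made-up measurements. The function name, bin width, and starting value are illustrative choices, not taken from the slides.

```python
def histogram_table(values, bin_width, start):
    """Return (lower, upper, midpoint, frequency) for equal-width bins.

    A value v is counted in the bin with lower <= v < upper.
    Assumes start is no larger than the smallest value.
    """
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return [
        (
            start + i * bin_width,          # lower edge
            start + (i + 1) * bin_width,    # upper edge
            start + (i + 0.5) * bin_width,  # midpoint
            counts[i],                      # frequency
        )
        for i in range(n_bins)
    ]

# Made-up measurements for illustration.
data = [1.2, 1.9, 2.4, 2.5, 2.7, 3.1, 3.3, 3.8, 4.6, 5.0]
rows = histogram_table(data, bin_width=1.0, start=1.0)
for lower, upper, mid, freq in rows:
    print(f"[{lower:.1f}, {upper:.1f})  midpoint {mid:.1f}  frequency {freq}")
```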
![Page 21: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/21.jpg)
Line
Data in Excel
Graph
X axis can be numbers, dates, ordinal categories
BUT, regardless, axis is treated as a categorical axis
Order on the graph is the same as the order in the sheet
![Page 22: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/22.jpg)
Line graphs are easy to misuse
Data in Excel
Graph
X categories are not in order
Amount of spacing between them is not even, but they're plotted as though the spacing is the same
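The difference between a categorical and a numeric x-axis can be made concrete with a small sketch. The sampling days below are hypothetical. The code only compares where points would sit along the axis and does not draw a chart.

```python
# Hypothetical sampling days, in the order typed into the sheet:
# out of order and unevenly spaced.
days = [1, 30, 2, 10]

# Line graph: the x-axis is categorical, so each row gets the next slot
# (0, 1, 2, ...) in sheet order and the day is only a label.
line_positions = list(range(len(days)))

# Scatter plot: the x-axis is numeric, so each point sits at its day value.
scatter_positions = days

def gaps(positions):
    """Distances between neighbouring points once sorted along the axis."""
    ordered = sorted(positions)
    return [b - a for a, b in zip(ordered, ordered[1:])]

print("line graph gaps:  ", gaps(line_positions))     # every gap is the same
print("scatter plot gaps:", gaps(scatter_positions))  # gaps follow the data
```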
![Page 23: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/23.jpg)
Scatter
Without connecting lines With connecting lines
![Page 24: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/24.jpg)
Line graphs vs. scatter plots with connected points?
Line graph Scatter plot
![Page 25: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/25.jpg)
Pie chart
Data in Excel
Graph of percentages
Proportions of total
Percentages
Relative frequencies
Counts will be converted to proportions of the total (okay)
Any other numbers used will be converted to proportions (not okay)
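The conversion a pie chart applies is just each value divided by the total, which is why it only makes sense for counts. The sketch below uses made-up tree counts and made-up mean heights, and the function name is hypothetical. The arithmetic runs on both, but only the first result means anything.

```python
def pie_shares(values):
    """Fraction of the total for each slice, which is what a pie chart draws."""
    total = sum(values.values())
    return {label: v / total for label, v in values.items()}

# Counts: the shares are relative frequencies (okay).
tree_counts = {"oak": 30, "maple": 50, "pine": 20}
count_shares = pie_shares(tree_counts)

# Mean heights in metres: the same division runs, but a "share of the
# summed means" does not describe anything real (not okay).
mean_heights = {"oak": 12.0, "maple": 9.0, "pine": 15.0}
height_shares = pie_shares(mean_heights)

for label, share in count_shares.items():
    print(f"{label}: {share:.0%} of trees")
```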
![Page 26: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/26.jpg)
High order data sets
● Some data sets have many different variables
● Flat screens only have 2 dimensions – two variables easy to display
● Adding variables means adding dimensions we don't have
▬ 3-d graphs use depth cueing (perspective tricks)
▬ Plot “slices” through the data
▬ Use symbols/lines
![Page 27: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/27.jpg)
Surface plot – depth cueing
Data in Excel
Graph of values in body of matrix
Three dimensions
X and Y are row and column labels, treated as categorical
Z is numeric, in body of matrix
![Page 28: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/28.jpg)
Plotting slices across a third variable
● Can group data based on levels of a third variable
● You can see the effect of grouping by comparing the groups
● Example: plotting the length, width, area data as a set of lines
![Page 29: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/29.jpg)
Slices through the data – lines defined by lengths
Data in Excel
Each length is a series, width on x-axis
Line colors selected to indicate lengths
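Arranging the data as slices can be sketched as building one series per length, each holding the area at every width. The values are made up, and area = length × width is an assumption about how the three variables on the slide are related.

```python
# Made-up lengths and widths; area is ASSUMED to be length * width.
lengths = [1, 2, 3]
widths = [1, 2, 3, 4]

# One series (one line on the graph) per length, with width on the x-axis.
series = {length: [length * w for w in widths] for length in lengths}

for length, areas in series.items():
    print(f"length {length}: areas {areas}")
```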
![Page 30: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/30.jpg)
Symbol properties
● Can subset the data and display with:
▬ Symbol size
▬ Symbol color
▬ Symbol type
![Page 31: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/31.jpg)
Bubble chart
Data in Excel
Chart - symbol size is proportional to petal length
![Page 32: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/32.jpg)
Radar charts – multiple numeric axes
● Each ray is a different variable
● Each data point is plotted on each ray, with lines connecting
![Page 33: Visualizing data · y = b + mx. Log-log plots 10 100 1000 10000 1 100 10000 1000000 100000000 10000000000 1000000000000 100000000000000 Three functions on a log-log scale Linear Squared](https://reader031.vdocument.in/reader031/viewer/2022040604/5ea3ae2ff16ebe20864db32a/html5/thumbnails/33.jpg)
Radar plot with four axes
Data in Excel
Chart
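The geometry of a radar plot can be sketched as follows: the rays are spread evenly around a circle, and each variable's value sets how far along its ray the point sits. The observation, the per-variable maxima, and the function name are assumptions for illustration. Scaling each ray by its own maximum is a choice made here and may not match how Excel scales its radar axes.

```python
import math

def radar_points(values, maxima):
    """(x, y) coordinates of one observation, one point per ray.

    Ray i is drawn at angle 2*pi*i/n, counterclockwise from the positive
    x-axis, and the point sits at distance value/maximum along it.
    """
    n = len(values)
    points = []
    for i, (value, maximum) in enumerate(zip(values, maxima)):
        angle = 2 * math.pi * i / n
        radius = value / maximum
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points

# Made-up observation with four variables and an assumed maximum for each.
observation = [2.0, 5.0, 1.0, 4.0]
maxima = [4.0, 10.0, 2.0, 4.0]
polygon = radar_points(observation, maxima)

# Connecting these points in order (and back to the first) gives the shape.
for x, y in polygon:
    print(f"({x:+.2f}, {y:+.2f})")
```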