data visualization

Data Visualization Christian Stade-Schuldt Project-A Ventures BI Team Knowledge Transfer


Post on 08-Aug-2015


Page 1: Data visualization

Data Visualization

Christian Stade-Schuldt, Project-A Ventures

BI Team Knowledge Transfer

Page 2: Data visualization

Outline

Motivation

Principles of Data Visualization

Types of Visualization

Summary


Project-A, Data Visualization, 2015 2

Page 3: Data visualization

Why visualize data?

• Visualization lets you see things that would otherwise go unnoticed
• Visualization gives answers faster
• Color pictures are pretty and fun to look at
• Simple example: Anscombe’s Quartet


Page 4: Data visualization

Anscombe’s Quartet

         I              II              III             IV
     x       y       x       y       x       y       x       y
  10.0    8.04    10.0    9.14    10.0    7.46     8.0    6.58
   8.0    6.95     8.0    8.14     8.0    6.77     8.0    5.76
  13.0    7.58    13.0    8.74    13.0   12.74     8.0    7.71
   9.0    8.81     9.0    8.77     9.0    7.11     8.0    8.84
  11.0    8.33    11.0    9.26    11.0    7.81     8.0    8.47
  14.0    9.96    14.0    8.10    14.0    8.84     8.0    7.04
   6.0    7.24     6.0    6.13     6.0    6.08     8.0    5.25
   4.0    4.26     4.0    3.10     4.0    5.39    19.0   12.50
  12.0   10.84    12.0    9.13    12.0    8.15     8.0    5.56
   7.0    4.82     7.0    7.26     7.0    6.42     8.0    7.91
   5.0    5.68     5.0    4.74     5.0    5.73     8.0    6.89
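Why the quartet makes the point can be checked numerically: the four sets have nearly identical summary statistics, so only a plot reveals how different they are. The following is a minimal sketch using the published quartet values and the Python standard library; the names `QUARTET`, `pearson_r`, and `summarize` are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet; sets I-III share the same x values.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def summarize(xs, ys):
    """Means, sample variances, and correlation for one data set."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four rows of the printed summary agree to about two decimal places, which is exactly why the numbers alone are not enough.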


Page 5: Data visualization

Anscombe’s Quartet II


Page 6: Data visualization

Edward Tufte

• Professor emeritus at Yale University
• Pioneer in the field of data visualization
• Notable works: The Visual Display of Quantitative Information


Page 7: Data visualization

Principles of graphical excellence and integrity

1. Serve a purpose

2. Make large data sets coherent

3. Present many numbers in a small space

4. Don’t lie

5. Use clear labels to defeat ambiguity and graphical distortion

6. Show entire scales

7. Show in context


Page 8: Data visualization

Small multiples


Page 9: Data visualization

The Lie factor

\[
\text{Lie factor} = \frac{\text{Size of effect in graphic}}{\text{Size of effect in data}} \tag{1}
\]

\[
= \frac{(5.3 - 0.6)/0.6}{(27.5 - 18)/18} = 14.8 \tag{2}
\]
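The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch: the function and parameter names are illustrative, and each "size of effect" is taken to be the relative change between the first and second value, as in the slide's worked example.

```python
def relative_change(first, second):
    """Relative change from `first` to `second`, e.g. 0.5 for a 50% increase."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Ratio of the effect shown in the graphic to the effect in the data.

    A value close to 1 means the graphic represents the data faithfully;
    values far above 1 mean the graphic exaggerates the effect.
    """
    return (relative_change(graphic_first, graphic_second)
            / relative_change(data_first, data_second))


# Numbers from the slide: the graphic grows from 0.6 to 5.3,
# while the underlying data only grows from 18 to 27.5.
print(round(lie_factor(0.6, 5.3, 18, 27.5), 1))  # prints 14.8
```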


Page 10: Data visualization

Scale contortions


Page 11: Data visualization

Scale contortions


Page 12: Data visualization

Scale contortions and context


Page 13: Data visualization

Principles of data graphics

1. Above all else show the data

2. Maximize the data-ink ratio

3. Erase non-data-ink

4. Erase redundant data-ink

5. Revise and edit


Page 14: Data visualization

Data-Ink

• Data-ink: the non-erasable ink used to present the data
• If it were removed, the graphic would lose content
• Non-data-ink is, accordingly, the ink that does not carry information
• Data-ink ratio = (data-ink) / (total ink used to print the graphic)
• Chartjunk: ink that is unnecessary for comprehending the information shown, or that distracts from it
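To make the ratio concrete, here is a small sketch that applies the formula to a made-up "ink budget" for a single bar chart. The ink amounts are hypothetical values in arbitrary units, invented purely for illustration, and treating only the bars as data-ink is a simplifying assumption.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that presents data (between 0 and 1)."""
    if total_ink <= 0 or not 0 <= data_ink <= total_ink:
        raise ValueError("need 0 <= data_ink <= total_ink and total_ink > 0")
    return data_ink / total_ink


# Hypothetical ink budget of one bar chart (arbitrary units, illustration only).
ink = {
    "bars": 40.0,             # assumed to be the only data-ink
    "axis_labels": 10.0,
    "gridlines": 25.0,
    "background_fill": 20.0,
    "frame": 5.0,
}

before = data_ink_ratio(ink["bars"], sum(ink.values()))

# Erase non-data-ink that does not help the reader: gridlines, fill, frame.
kept = {name: amount for name, amount in ink.items()
        if name in ("bars", "axis_labels")}
after = data_ink_ratio(kept["bars"], sum(kept.values()))

print(before, after)
```

With these made-up numbers the ratio doubles once the gridlines, background fill, and frame are erased, without touching the data itself.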


Page 15: Data visualization

Data-Ink-Ratio Example

[Figure panels: Low Data-Ink-Ratio | High Data-Ink-Ratio]


Page 16: Data visualization

Chart types

The good
• Bar charts
• Line charts
• Scatter plots
• Boxplots

The bad
• Pie charts
• Area charts

The ugly
• All 3D charts


Page 17: Data visualization

Bar charts

• Go-to graph for comparing across categories, for discrete or continuous data
• Proximity: set the white space between adjacent bars to 50%-150% of the bar width
• Fills: avoid pattern lines, use soft but distinct colors
• Borders: avoid
• Tick marks: do not overdo
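The proximity rule above translates directly into a layout calculation. The sketch below is one possible reading of it, not a prescribed method: given the number of bars and the available axis width, it derives a bar width and gap so that the gap is a chosen fraction of the bar width, and it rejects fractions outside the 50%-150% range. The function name `bar_layout` and its parameters are illustrative.

```python
def bar_layout(n_bars, axis_width, gap_ratio=1.0):
    """Return (bar_width, gap_width) so that n bars and n-1 gaps fill the axis.

    gap_ratio is the gap width divided by the bar width; the rule of thumb
    from the slide keeps it between 0.5 and 1.5.
    """
    if n_bars < 1:
        raise ValueError("need at least one bar")
    if not 0.5 <= gap_ratio <= 1.5:
        raise ValueError("gap should be 50%-150% of the bar width")
    # n_bars * bar_width + (n_bars - 1) * gap_ratio * bar_width == axis_width
    bar_width = axis_width / (n_bars + (n_bars - 1) * gap_ratio)
    return bar_width, bar_width * gap_ratio


# Five bars on a 14-unit axis with gaps half as wide as the bars.
print(bar_layout(5, 14.0, gap_ratio=0.5))  # prints (2.0, 1.0)
```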


Page 18: Data visualization

Bar charts


Page 19: Data visualization

Line chart

• Used for continuous data
• Intervals should be equal in size
• Lines should directly connect only values in adjacent intervals
• Indicate missing data
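One way to honor the last two rules is to split a series into runs of adjacent intervals before plotting, so that each run is drawn as its own line and missing intervals show up as visible gaps. The following is a minimal sketch under the assumption that x values are integer interval indices (for example month numbers); the function name `split_at_gaps` is illustrative.

```python
def split_at_gaps(points, step=1):
    """Split (x, y) points, sorted by x, into runs of adjacent intervals.

    Two points are adjacent when their x values differ by exactly `step`.
    Each returned run can be drawn as one connected line; nothing is drawn
    across the boundary between two runs.
    """
    segments = []
    for x, y in points:
        if segments and x - segments[-1][-1][0] == step:
            segments[-1].append((x, y))   # continues the current run
        else:
            segments.append([(x, y)])     # first point, or a gap: new run
    return segments


# Monthly values with month 4 missing: plot two separate lines.
monthly = [(1, 10.0), (2, 12.5), (3, 11.0), (5, 14.0), (6, 13.5)]
for run in split_at_gaps(monthly):
    print(run)
```

A run that contains a single point cannot be drawn as a line, so it would need a marker instead.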


Page 20: Data visualization

Line chart


Page 21: Data visualization

Scatter plots

• Great for showing the correlation between two quantitative dimensions


Page 22: Data visualization

Pie charts

Pie charts are the Aquaman of data visualization


Page 23: Data visualization

Pie charts

The same data represented as a column chart


Page 24: Data visualization

3D pie charts

What is worse than a pie chart? Meet the 3D pie chart.


Page 25: Data visualization

Maps

• Provide specific information about particular locations
• Provide general information about spatial patterns
• Can be used to compare patterns on two or more maps


Page 26: Data visualization

Maps

Relevant xkcd


Page 27: Data visualization

Remember Color Blindness

Approximately 8% of males and 0.5% of females have some form of color vision deficiency

[Figure panels: Original colors | Perceived colors]


Page 28: Data visualization

Summary

• Visualize your data
• Choose the right type for your visualization
• Aim for a high Data-Ink-Ratio


Page 29: Data visualization

For Further Reading I

Tufte, Edward R. The Visual Display of Quantitative Information. Graphics Press, 2001.
