![Page 1: Scientific and Large Data Visualization 19 October 2018 ...vcg.isti.cnr.it/~cignoni/SciViz1819/SciViz_07_Intro_InfoVis.pdf · Data Processing Pipeline •Data does not come in the](https://reader034.vdocument.in/reader034/viewer/2022043009/5f9d962ea439912555002cf4/html5/thumbnails/1.jpg)
Scientific and Large Data Visualization
19 October 2018
Information Visualization Basics
Massimiliano Corsini
Visual Computing Lab, ISTI - CNR - Italy
InfoVis – Next Lessons
• Information Visualization – Introduction and Motivations
• Data Types, Graph Types and Visual Perception
• Visualization in Python
• Multidimensional Data
• Time Series
• Graph Drawing
• Practice
– D3.js
InfoVis – Basics
• Introduction and motivations
• Ingredients of effective visualization
• Data types
• Graph types
• Visual perception
Information Visualization
• Information visualization is the study of (interactive) visual representations of abstract data to reinforce human cognition. [Wikipedia]
• The use of computer-supported, interactive, visual representations of abstract data to amplify cognition. [Card et al. 1999]
• The use of computer graphics and interaction to assist humans in solving problems. [Purchase et al. 2008]
Information Visualization
• The purpose of information visualization is to amplify cognitive performance, not just to create interesting pictures. [Card 2007]
• Infographics are a visual tool for communication, understanding and analysis. [Alberto Cairo]
InfoGraphics – Charts
From the Wall Street Journal
InfoGraphics – Diagrams
Juan Colombata & Enzo Oliva – La Voz del Interior (Argentina)
Information Visualization
• Difference from Scientific Visualization:
– Information visualization also deals with abstract data (numerical and non-numerical).
– In scientific visualization, the spatial representation is given.
• Difference from Visual Analytics:
– In Visual Analytics, the emphasis is on the reasoning/interaction loop.
Data Visualization
A classic book on statistical graphics, charts, tables. Theory and practice in the design of graphics to visualize data.
Very practical, very classic.
Information Visualization
A very interesting aspect:
why we need to consider visualization as a SCIENCE
Functional Art
• Function does not dictate our choices, but it does restrict them.
• This is particularly true for infographics.
Motivations
Motivations
Which country is close to its historical maximum?
Motivations
Easier to answer… Why?
Rationale
• The human visual system (HVS) is very good at identifying and analyzing patterns.
• We can visualize data to make it easier for our brain to analyze, for example to make comparisons.
Effectiveness
• To be effective, a data visualization should take several factors into account:
– Data type
– Goal (function)
– Visual Perception System
Example – Comparisons
From “The functional art” by Alberto Cairo
Cleveland and McGill 1984
[Figure: visual encodings ordered by perceptual accuracy, ending at "Less accurate"]
Adapted from “The functional art” by Alberto Cairo
Key Ingredients
• In the following we focus on:
– Data types
• One dimension, two dimensions, N-dimensions
• Quantitative, Ordinal, Nominal
– Graph types
– Visual Perception
• We give some design guidelines along the way.
Data
• Information is obtained from data (!)
• Structured / Unstructured.
• Generated by sensors, by computers, by humans, etc.
Variable Types
• According to Stevens (1946):
– Nominal
• Labels (e.g. apples, oranges, bananas)
– Ordinal
• Used to order things (e.g. movie rankings)
– Interval
• Measurements on an interval scale (e.g. departure and arrival times)
– Ratio
• Measures defined on a ratio scale (e.g. the mass of an object)
S. S. Stevens, “On the theory of scales of measurement”, Science, 103, pp. 677-680, 1946.
Variable Types
• Category
– Stevens' nominal class (e.g. country names, type of disease)
• Ordinal
– Labels expressing degree (e.g. cold, hot, very hot)
– In general, encoded as integer data.
• Quantitative
– Intervals, measures, etc.
– In general, real-numbered data.
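The three classes above can be illustrated with a minimal Python sketch. The records, the field names and the `DEGREE_RANK` mapping are hypothetical and chosen only for illustration; the point is that nominal labels can only be counted, ordinal labels are mapped to integers so they can be sorted, and quantitative values support arithmetic.

```python
from collections import Counter

# Hypothetical sample: each record has a nominal, an ordinal and a quantitative field.
readings = [
    {"country": "Italy", "degree": "hot", "temperature_c": 31.5},
    {"country": "Norway", "degree": "cold", "temperature_c": 4.0},
    {"country": "Italy", "degree": "very hot", "temperature_c": 38.2},
    {"country": "Spain", "degree": "hot", "temperature_c": 33.1},
]

# Ordinal labels are encoded as integers so that their order is explicit.
DEGREE_RANK = {"cold": 0, "hot": 1, "very hot": 2}

# Nominal data: only counting and equality tests are meaningful.
country_counts = Counter(r["country"] for r in readings)

# Ordinal data: sorting by rank is meaningful, arithmetic on ranks is not.
sorted_degrees = sorted((r["degree"] for r in readings), key=DEGREE_RANK.get)

# Quantitative data: arithmetic such as an average is meaningful.
mean_temperature = sum(r["temperature_c"] for r in readings) / len(readings)
```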
Data Dimensions
• Common dimensions:
– Univariate, bivariate, trivariate
– Multi-variate (N>3 dimensions)
• Variables can be dependent or independent.
• Each case is a point in an N-dimensional space (a data point).
Data Dimensions
• A set of data can be represented by a table with N columns (one for each variable).
         Variable 1   Variable 2   Variable 3
Data 1
Data 2
Data 3
Data 4
…
Data Relationships
• A table can be used to represent a relationship between different data.
         User Id   Game Id
Data 1   Smith     023923
Data 2   James     238548
Data 3   Frank     385753
Data 4   Powell    357352
…
Data Relationships
• 1-to-1
• 1-to-many
• Many-to-many
• Relationships may also have attributes
What we want..
• A set of well-formed and interconnected tables of data.
What we have..
• Data does not come in the form we would like.
• Data may have inconsistencies:
– Corrupted data
– Missing data
– Several values may be equivalent (e.g. a text field containing “R&D”, “Research”, “Research and Development”, “r*d”)
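One possible way to deal with equivalent values is a lookup table of known spellings. The sketch below is only an illustration: the `CANONICAL` table, the function name `normalize_department` and the sample values are assumptions, and a real data set would need its own mapping.

```python
import re

# Hypothetical table of spellings known to mean the same department.
CANONICAL = {
    "r&d": "Research and Development",
    "r*d": "Research and Development",
    "research": "Research and Development",
    "research and development": "Research and Development",
}

def normalize_department(raw):
    """Return the canonical label, or None for missing or unknown values."""
    if raw is None:
        return None
    # Trim, lower-case and collapse repeated whitespace before the lookup.
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return CANONICAL.get(key)

raw_values = ["R&D", " research ", "Research  and Development", "r*d", None, "???"]
cleaned = [normalize_department(v) for v in raw_values]
```

Missing and unrecognized entries both come back as `None` here, so they can be filtered or inspected by hand later.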
Data Processing Pipeline
• Data does not come in the form we would like.
• Typical data processing pipeline (from raw data to clean structured data):
1. Collect data.
2. Data simplification (extract a subset of interest).
3. Clean and structure the data.
• The pipeline can also output metadata.
Data Collection
• Search and download data.
• Parse text.
• Convert between different formats.
– Examples: CSV to JSON, MySQL to HTML, etc.
• Merge heterogeneous sources.
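As an example of format conversion, the following sketch turns CSV text into JSON using only the Python standard library. The function name `csv_to_json`, the column names and the sample rows are hypothetical; in practice the text would come from a downloaded file.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Hypothetical input with a header row and two records.
sample_csv = "user,game_id\nSmith,023923\nJames,238548\n"
json_text = csv_to_json(sample_csv)
print(json_text)
```

All fields stay strings in this sketch, which preserves identifiers with leading zeros such as `023923`; converting numeric columns would be a separate cleaning step.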
Data Simplification
• Filter / Selection
– Remove unwanted data
– Remove invalid data (null values)
• Aggregation
– Collapse several data points into a single one
– Replace some values with minimum, maximum, average, total, etc.
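The two simplification steps above can be sketched in a few lines of Python. The records below are invented for illustration, and `None` is assumed to mark a missing value.

```python
from collections import defaultdict

# Hypothetical raw records: (country, year, value); None marks a missing value.
records = [
    ("Italy", 2016, 10.0),
    ("Italy", 2017, None),
    ("Italy", 2018, 14.0),
    ("Spain", 2017, 7.0),
    ("Spain", 2018, 9.0),
]

# Filter / selection: drop records with invalid (null) values.
valid = [r for r in records if r[2] is not None]

# Aggregation: collapse the data points of each country into summary values.
groups = defaultdict(list)
for country, _year, value in valid:
    groups[country].append(value)

summary = {
    country: {
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
        "total": sum(values),
    }
    for country, values in groups.items()
}
```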
Data Processing Pipeline
• Typically performed automatically using scripts.
• Human-guided data transformation is also possible (through macro-operations).
– Use appropriate tools (e.g. OpenRefine, http://openrefine.org)
Metadata
• Data which describes the data
– Role of variables
– Type of variables
– Constraints
– Dependencies
• Collection and processing operations can also be described.
How to Present Data?
• How to present data graphically?
– To allow visual analysis
– To highlight patterns
– To answer some specific questions
– Etc.
Quantitative Values
• We have already mentioned the work by Cleveland & McGill (1984), which ranks visual encodings by perceptual accuracy, from most to least accurate:
– Position
– Length
– Angle / slope
– Area
– Volume
– Color saturation / shading
From “Data Visualization” Course by John C. Hart, for Coursera, 2015.
QUANTITATIVE   ORDINAL        NOMINAL
Position       Position       Position
Length         Density        Hue
Angle          Saturation     Texture
Slope          Hue            Connection
Area           Texture        Containment
Volume         Connection     Density
Density        Containment    Saturation
Saturation     Length         Shape
Hue            Angle          Length
               Slope          Angle
               Area           Slope
               Volume         Area
                              Volume
Table vs Graphs
• Presenting the data directly in a table is preferable when:
– There are few data points
– Precise values are important
Univariate Data
• Few interesting solutions.
• Statistical description:
– Mean, median, standard deviation, quartiles.
• Warning: some data that appear to be univariate are actually bivariate.
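The statistical description listed above can be computed with Python's standard `statistics` module, as in the sketch below. The sample values are hypothetical. Note that `statistics.quantiles` uses one particular quartile convention (its default "exclusive" method), and other tools may return slightly different quartiles.

```python
import statistics

# Hypothetical univariate sample.
values = [2, 4, 4, 5, 7, 9, 11]

mean = statistics.mean(values)
median = statistics.median(values)
stdev = statistics.stdev(values)  # sample standard deviation
q1, q2, q3 = statistics.quantiles(values, n=4)  # the three quartile cut points

print(mean, median, round(stdev, 3), (q1, q2, q3))
```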
Stemplots
• Also called stem-and-leaf plots.
• Used to display quantitative data, generally from small data sets (50 or fewer observations).
• Easy to print.
• Easy to read.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
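Because a stemplot is plain text, it is easy to produce with a short script. The sketch below is one possible implementation for non-negative integers, assuming the stem is the value without its last digit and the leaf is the last digit. The function name `stemplot` and the sample data are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return the lines of a text stem-and-leaf plot for non-negative integers.

    The stem is the value without its last digit, the leaf is the last digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        # Empty stems are kept so that gaps in the data stay visible.
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {digits}")
    return lines

# Hypothetical small data set.
sample = [12, 15, 21, 23, 23, 28, 41, 47]
plot_lines = stemplot(sample)
print("\n".join(plot_lines))
```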
Stemplots
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Box and Whisker Plots
• A Box and Whisker Plot (or Box Plot) is a convenient way of visually displaying a data distribution through its quartiles.
• Advantages: it shows at a glance
– The key values (average, median, 25th percentile, etc.).
– Whether there are any outliers and what their values are.
– Whether the data is skewed and in what direction.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
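The values a box plot draws can be computed as in the sketch below. The slides do not say how whiskers and outliers are defined; the 1.5 × IQR rule used here is a common convention and is an assumption, as are the function name `box_plot_summary` and the sample data.

```python
import statistics

def box_plot_summary(values):
    """Compute the values a box plot draws, using the 1.5 * IQR whisker rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical sample with one extreme value.
sample = [2, 4, 4, 5, 7, 9, 11, 40]
summary = box_plot_summary(sample)
print(summary)
```

In this sample the upper whisker stops at 11 and the value 40 is reported separately as an outlier, which is the kind of information the box plot makes visible.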
Box and Whisker Plots
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Line Charts/Line Graphs
• Invented by the Scottish engineer and statistician William Playfair (1759-1823)
• Two quantitative variables, typically:
– X: time or intervals; Y: any quantity
• The line indicates that intermediate values also exist.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Bar Charts
• Bivariate Data
– One nominal variable (typically independent)
– One quantitative variable (typically dependent)
• Horizontal/Vertical bars
• Do not confuse with histograms.
Bar Charts
3D?! Not a very good idea..
Histograms
• Bivariate Data
– One independent and one dependent variable
– The independent variable is quantized into intervals (bins); the dependent variable is the number of observations in each bin
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
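The binning step can be sketched as follows. The function name `histogram`, the bin width and the sample ages are hypothetical. Bins are assumed to be half-open intervals of equal width, and only non-empty bins are reported.

```python
def histogram(values, bin_width, start=0):
    """Count the values falling into each half-open bin [lo, lo + bin_width)."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        lo = start + index * bin_width
        counts[lo] = counts.get(lo, 0) + 1
    return dict(sorted(counts.items()))

BIN_WIDTH = 10  # hypothetical bin size
ages = [3, 17, 21, 25, 29, 34, 38, 41, 45, 67]  # hypothetical sample
bins = histogram(ages, BIN_WIDTH)

# Text rendering: one row per non-empty bin, one '#' per observation.
for lo, count in bins.items():
    print(f"[{lo:>2}, {lo + BIN_WIDTH:>2})  {'#' * count}")
```

Unlike a bar chart, the horizontal axis here is a quantized quantitative variable, so the order and width of the bins carry meaning.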
Pie Charts
• Bivariate Data
– One independent and one dependent variable
• Good for a quick visual check.
• Not good for:
– Many values.
– Accurate comparisons.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
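A small calculation suggests why pie charts are weak for accurate comparisons: each value becomes a slice angle, and close values yield angles that are hard to tell apart. The shares below are invented for illustration and the function name `pie_angles` is hypothetical.

```python
def pie_angles(values):
    """Convert each value to the angle (in degrees) of its pie slice."""
    total = sum(values.values())
    return {label: 360.0 * v / total for label, v in values.items()}

# Hypothetical market shares in percent.
shares = {"A": 30, "B": 28, "C": 22, "D": 20}
angles = pie_angles(shares)

# A and B differ by 2 percentage points, i.e. about 7.2 degrees of arc.
difference = angles["A"] - angles["B"]
print(angles, round(difference, 1))
```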
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Pie Charts
3D?! Not more readable.
Figure by Luiz Salomão.
Donut Charts
• Essentially, a pie chart with the center area cut out.
• Lets the reader focus on arc length rather than on comparing the proportions between slices.
• More space-efficient.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Sunburst Charts
• Show a hierarchy through a series of rings. Each ring corresponds to a level in the hierarchy.
• The hierarchy moves outwards from the center.
• Colour can be used to highlight hierarchical groupings or specific categories.
Figure from Data Visualization Catalogue (http://datavizcatalogue.com)
Sunburst Charts
Produced by the Space Radar app (https://github.com/zz85/space-radar)
Scatter Plots
• Bivariate Data
– Two independent variables
• Good to identify relationships, outliers and clusters.
[Figure: scatter plot with axes Variable 1 and Variable 2]
Scatter Plots
[Figure: scatter plots of Variable 1 vs Variable 2, annotated with outliers, a high-density region and clusters]
Surface Graphs
• Trivariate Data
– Three continuous variables
– Two independent and one dependent
Surface Graphs
• Color may be mapped to the dependent variable.
Surface Graphs
• Level curves (contour lines) can also be used.
3D Scatter Plots
• Trivariate Data
– Three quantitative variables
• Same concept as the 2D Scatter Plot
Colored Scatter Plots
• Trivariate Data
• Color can encode a variable or a category
Bubble Charts
• Trivariate Data
– Three quantitative variables
• Do not allow for accurate comparison.
• Colors can be used to show different categories.
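The slides do not say how bubble sizes are computed. A common practice, assumed here, is to make the bubble area (not the radius) proportional to the value, which means the radius grows with the square root of the value. The function name `bubble_radius`, the 40-pixel maximum and the sample populations are hypothetical.

```python
import math

def bubble_radius(value, max_value, max_radius):
    """Scale a value to a radius so that the bubble AREA is proportional to it."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be non-negative and max_value positive")
    return max_radius * math.sqrt(value / max_value)

# Hypothetical populations (in millions) mapped to radii of at most 40 pixels.
populations = {"A": 100.0, "B": 25.0, "C": 4.0}
largest = max(populations.values())
radii = {name: bubble_radius(p, largest, 40.0) for name, p in populations.items()}
print(radii)
```

Here B has a quarter of A's value but half of A's radius. Readers tend to judge such area differences poorly, which is consistent with the warning above about accurate comparison.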
Bubble Charts
Figure from http://gapminder.com
Choropleth Map
• Bivariate Data
– A quantitative value over geographical areas/regions
• Data are coloured, shaded or patterned in different ways.
• Good for an overview, not for accurate comparison.
• Small areas can be underemphasized.
Word Cloud
• Word Clouds display how frequently words appear in a given body of text, by making the size of each word proportional to its frequency.
• Arrangement and color can vary a lot.
• Word Clouds can be used to compare two bodies of text or to give a quick idea of repeating keywords (e.g. used by researchers to summarize the content of their papers).
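The size computation behind a word cloud can be sketched as below. It only counts words and scales font sizes; layout and color are left out. The function name `word_sizes`, the 40-point maximum and the sample text are hypothetical, and no stop-word removal is attempted.

```python
import re
from collections import Counter

def word_sizes(text, max_size=40.0, top=5):
    """Map the most frequent words to font sizes proportional to their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top)
    if not counts:
        return {}
    highest = counts[0][1]
    # The most frequent word gets max_size; the others scale linearly with count.
    return {word: max_size * count / highest for word, count in counts}

# Hypothetical body of text.
text = "data data data visualization visualization graph"
sizes = word_sizes(text)
print(sizes)
```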
Word Clouds
• Disadvantages:
– Long words are emphasized over short words.
– Words whose letters contain many ascenders and descenders may receive more attention.
– Not suited to accurate comparison; mainly used for aesthetic reasons.
Word Clouds (example)
Produced by www.jasondavies.com/wordcloud.
Word Clouds (example)
Produced by www.jasondavies.com/wordcloud.
Summary
• Visualization plays a fundamental role in data analysis.
• Information comes from data. Data should be collected and processed properly.
• Depending on goal and data type, the use of certain graphs is more effective than others.
Questions?