91.541 data visualization spring 2006rosane/haim_lecture1_2006-08-08_2ppg.pdf · • pipeline •...
IVPR
Haim Levkowitz & Georges Grinstein
Olsen 301
{haim, grinstein}@cs.uml.edu
91.541
Data Visualization
Spring 2006
One Look is Worth a Thousand Words
• Fred R. Barnard, in Printers' Ink, 8 Dec., 1921, p. 96
• He changed it to "One picture is worth ten thousand words" in Printers' Ink, 10 March 1927, p. 114, and called it "a Chinese proverb, so that people would take it seriously."
• It was immediately credited to Confucius
• This establishes the link between the two ads, but many sources misquote the 1927 advertisement by copying "a thousand" from the 1921 advertisement instead of replacing it by "ten thousand"
Outline
• Part 1 – Visualization techniques (2 weeks)
– Introduction and goals
– History of visualization and techniques
– Computer Graphics
– Graphics and Visualization Pipelines
• Part 2 – The User (1-2 weeks)
– Perception (visual, aural, tactile, haptic, …)
– Illusions
Outline
• Part 3 – Reference Models (2 weeks)
– Visualization pipeline
– Data, metadata, operations, mappings
– Visualization taxonomies and reference models
– Visualization Theory
Outline
• Part 4 – Techniques and Tools (5 weeks)
– Spatial
– Non-spatial
– Graphs and Networks
– Special
– Very high-dimensional
and some of their interactions
and their computations (operators)
– Data manipulation and mining
– Custom domain systems example
Outline
• Part 5 – Interaction Theory (1 week)
– Operators
– Styles
– Techniques
• Part 6 – Utility, Usability and Effectiveness (1 week)
– Design process
– Evaluation
• Part 7 – Frameworks (2 weeks)
– Components, features, limitations, assumptions
– Application examples
– Futures
Introduction and Goals
• Look at history of Computer Graphics and Visualization (ScDV, InfoVis, mDV, EDA)
• Understand the issues in interactive data visualization
• Examine numerous visualization techniques, interactions, and systems
• Be able to implement visualizations within a variety of frameworks and systems
• Explore the future of visualization
Why Graphics or Visualization?
• To help the user
– See (understand)
– Remember
– Compute
– Analyze
– Discover
– Enjoy
– …
Vocabulary
• Data
• Information
• Knowledge
• Visualization
• Data exploration
• Databases
• Data analysis
• Knowledge discovery
• Data mining
• Computer vision
• Perception & cognition
• Graphics
• Display list
• Frame buffer
• Rendering
• Imaging
• Filtering
• Pipeline
• Input/output devices
• Human interface
• Multimedia
• Virtual reality
Goals of Visualization Techniques
• Exploratory Analysis
– Start: Have no hypotheses about the data
– Process: Interactive, usually undirected search for structures, trends, patterns or anomalies
– Result: Visualization of data to lead to hypotheses about the data
• Confirmatory Analysis
– Start: Have some hypotheses about the data
– Process: Goal-oriented examination of the hypotheses
– Result: Visualization of data to confirm, accept or reject the hypotheses
• Presentation
– Start: Facts to be presented are known (these may not represent the truth)
– Process: Choose and tune appropriate visualization technique
– Result: High-quality visualization of the data and analysis to present facts (often without the author's presence)
The Knowledge Discovery Process
[Figure labels: Decisions, Tools]
Data Exploration
• Data Exploration is the process of searching and analyzing databases to discover implicit but potentially useful information
Goals of Data Exploration
• Convey information
• Discover new knowledge
• Identify structure, patterns, anomalies, trends, relationships
Data → Information → Knowledge
For decision support!
A Confluence of Multiple Disciplines
Data Mining:
• Database Technology
• Statistics
• Machine Learning (AI)
• Visualization
• Information Science
• Other Disciplines
[Figure: process diagram; labels recovered from the slide follow]
• Steps: Problem Formulation, Data Engineering, Algorithm Selection, Algorithm Engineering, Pattern Evaluation, Model Validation, Model Testing, Model Enhancement
• Artifacts: Raw Data, Transformed Dataset, Selected Algorithm, Induced Model, Patterns and Statistics, Measure of Goodness, Final Model
• Visualizations: Data Visualization, Parameter Visualization, Pattern Visualization, Model Visualization
• User Requirements and User Interactions
Data Mining Tasks & Techniques
Major Data Mining Tasks
• Summarization
• Association
• Classification
• Prediction
• Clustering
• Time-Series Analysis
using
Major Techniques
• Linear Regression Trees
• Non-Linear Regression
• MARS
• Naïve Bayes
• K-Means and K-Median
• Neural Networks
• Association Rules
• Decision Trees
• Principal Curve Analysis
• Support Vector Machines
• Genetic Algorithms
based on
Statistical Tools
• Missing Value Imputation
• Normalizations
• Error & Variational Analysis
• Confidence Estimates
Why so many?
• Almost all tasks are NP-hard!
• KDD2001 CUP
– Thrombosis data set
– Over 200 submissions
– Over 100 different techniques
– Many combined techniques
• KDD2002 CUP
– Creativity
Visualization Techniques
Pure
• 2D and 3D Scatterplots
• Matrix of Scatterplots
• Statistical Charts
• Line and Multi-line Graphs
• Parallel Coordinates
• Circle Segment
• Polar Charts
• Survey Plots
• Heatmaps
• Height Maps
• Iconographic Displays
• RadViz
• PolyViz
Integrated with Analysis
• Projection Pursuit
• Dimensional Stacking
• Sammon Plots
• Multi-Dimensional Scaling
• PCA and Principal Curves
• Self Organizing Maps
Interactions
• Selection
• Probing, Querying
• Grand Tours
• Non-linear Zooms
The Visualization Problem
• Massive amounts of data from
– databases
– simulations
– sensors
– decision systems
• Limited screen space
• Little is known about the human perceptual system and information transfer
What is Visualization?
• Visualization is a method of computing. It transforms the symbolic into the geometric, enabling researchers to observe their simulations and computations. Visualization offers a method for seeing the unseen. (from McCormick87)
• Visualization now includes other data representations
– Aural (auditory), haptic and tactile, …
A Definition of Visualization
• It is the Visual Interface to the Data and the Mining tools
• It is a method of interacting with the data and algorithms
– supports the user through all the knowledge discovery steps
– uses selections, queries, probes, and view transformations
• It is completely separable from the analysis methods
– Data can be analyzed using many different algorithms
– Each result can be viewed in a different visualization
– Each visualization provides a different view of the results
Galileo
What are the Key Data Factors?
• Very large number of parameters
– more than 100
• Very large data sets
– more than 10^7
• Multiple data types
– discrete and continuous
• Noisy data
– often not uniform
• Missing values
– could be important
• Lots of different tasks
The Great Demand for Visualization
• Fueled by technological advancements
– Displays
– High performance computers
– Large storage systems
– Personal computers
– Sensor technology
• Fueled by user awareness
– Interfaces
– Programming tools
– Flexibility
Global Computing Applications
• 48-hour Weather Forecast
• 2D Airfoil
• Oil Reservoir Model
• Climate Monitoring
• Vehicle Signature
• Plasma Modeling
• Chemical Dynamics
• Stock Market Prediction
• WWW
• Drug Discovery
• Security (data and human)
1980s
1990s
2000s
What is High Dimensional?
Dimensionality    # of Variables
Very High         > 1000
High              1000
Medium            100
Low               10
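The dimensionality classes listed above can be turned into a small lookup. This is only a sketch: the slide gives representative sizes, not boundaries, so treating each value as an inclusive upper bound is an assumption, and the function name `dimensionality_class` is hypothetical.

```python
def dimensionality_class(num_variables: int) -> str:
    """Map a variable count to the slide's dimensionality labels.

    Assumption: each value on the slide (10, 100, 1000) is treated as an
    inclusive upper bound for its class; the slide does not state this.
    """
    if num_variables < 1:
        raise ValueError("a data set needs at least one variable")
    if num_variables <= 10:
        return "Low"
    if num_variables <= 100:
        return "Medium"
    if num_variables <= 1000:
        return "High"
    return "Very High"


if __name__ == "__main__":
    for n in (3, 50, 1000, 20000):
        print(n, dimensionality_class(n))
```

Under a different reading of the slide (for example, order-of-magnitude midpoints), the cutoffs would shift, so they are best kept as parameters in real use.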
A Complete Data View
[Figure labels: Low Dimensional, High Dimensional]
Visualization Architecture
[Figure: architecture diagram with callouts numbered 1-9; labels recovered from the slide follow]
• Visualization Subsystem
• Database Visualization Interface
• Database Management Subsystem
• Database MetaData
• Database View Table
• mapping and display functions
• query preprocessor
• Databases
• Retrieved database subsets
The Visualization Pipeline
• Data states
– simulated or sampled data
– derived or massaged data
– logical data representation
– Image
• Operations
– data transformations: interpolation, filtering, etc.
– representation mappings: geometry, color, sound, etc.
– rendering: viewing, shading, device transforms, etc.
Interactions with a DBMS View
• queries and probes between the USER and the DBMS
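The pipeline stages named on this slide can be sketched as a chain of small functions. This is an illustrative toy, not the course's implementation: the function names (`query`, `transform`, `map_to_geometry`, `render`), the moving-average filter, and the text-mode "image" are all assumptions chosen to keep the example self-contained.

```python
from typing import Callable, List, Sequence, Tuple


def query(samples: Sequence[float], keep: Callable[[float], bool]) -> List[float]:
    """User query or probe: select the subset of the data to visualize."""
    return [s for s in samples if keep(s)]


def transform(samples: Sequence[float], window: int = 3) -> List[float]:
    """Data transformation: smooth with a centered moving average (a simple filter)."""
    half = window // 2
    out = []
    for i in range(len(samples)):
        lo, hi = max(0, i - half), min(len(samples), i + half + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out


def map_to_geometry(values: Sequence[float]) -> List[Tuple[int, float]]:
    """Representation mapping: each value becomes a bar (x, height in [0, 1])."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant data
    return [(x, (v - lo) / span) for x, v in enumerate(values)]


def render(bars: Sequence[Tuple[int, float]], rows: int = 5) -> str:
    """Rendering: rasterize the bars into a text image, one column per bar."""
    heights = [round(h * rows) for _, h in bars]
    lines = []
    for level in range(rows, 0, -1):
        lines.append("".join("#" if h >= level else "." for h in heights))
    return "\n".join(lines)


if __name__ == "__main__":
    raw = [1.0, 4.0, 2.0, 8.0, 5.0, -1.0]      # simulated or sampled data
    subset = query(raw, lambda v: v >= 0)       # user query against the data
    derived = transform(subset)                 # derived or massaged data
    geometry = map_to_geometry(derived)         # logical data representation
    print(render(geometry))                     # image
```

Keeping each stage a separate function mirrors the slide's separation of data states from operations: any one stage (for example the mapping) can be swapped without touching the others.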
Visualization Interaction Styles
and the integration of database and visualization technologies
• Exploratory Visualization
– Dynamic, relatively unpredictable
– User searches for structure, trends, etc.
– Generating hypotheses
• Confirmatory Visualization
– More stable and predictable
– Predetermined system parameters
– Confirm or refute hypotheses
• Production Visualization
– Most stable and predictable
– Fine-tune system parameters
– Already validated hypotheses
[Figure labels: Focus, Visualization, DBMS]
History of Visualization and Techniques
A History of Visualization
• Pictures
– From hieroglyphics to spreadsheets
– From lines to surface and volumes
– From scatterplots to HDVs
– From static to dynamic images
– From simple to complex integrated analysis
• Slides
[Timeline: 5000 BC to 2000 AD]
1-10 Variables
Maps
• Valuable
– Save time, money, lives
• Anchoring image
– Experience base
– Reasoning base
• Understandable
Snow’s Map of Cholera Deaths in London
2 Dimensions
Visualization Fuels
• Military
• Aerospace and Automotive
• Entertainment
• Scientific Data Visualization
• GIS
• Floods of Data
NASA Movie
Classic Science
– Build Model
– Validate Model using Real Data
– Repeat
Aerospace
Aircraft Data
• Velocity = 165 knots
• Wing Area = 29 m²
• Wing Span = 16 m
• Mean Aerodynamic Chord = 2 m
• Weight = 8000 kg
• Chord Reynolds Number = 1.18 × 10^7
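As a rough consistency check on the aircraft data, the chord Reynolds number can be estimated from the listed velocity and mean aerodynamic chord as Re = V · c / ν. The slide does not give the air properties, so the kinematic viscosity below (about 1.46e-5 m²/s, standard sea-level air) is an assumption, and the function name is hypothetical.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one international knot in metres per second


def chord_reynolds_number(
    velocity_knots: float,
    chord_m: float,
    kinematic_viscosity_m2_s: float = 1.46e-5,  # assumed: standard sea-level air
) -> float:
    """Reynolds number based on chord length: Re = V * c / nu."""
    velocity_m_s = velocity_knots * KNOT_IN_M_PER_S
    return velocity_m_s * chord_m / kinematic_viscosity_m2_s


if __name__ == "__main__":
    re = chord_reynolds_number(velocity_knots=165.0, chord_m=2.0)
    print(f"Re = {re:.3g}")
```

With the assumed viscosity this gives roughly 1.16 × 10^7, within about two percent of the value on the slide; the small gap is what one would expect from a slightly different air temperature or altitude.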
Computer-Aided Design
Computational Support and Statistics
• Support tools for scientific visualization
• Support tools for CAD, CAM, CAE, …
• Statistics for social science data files
• Statistics for databases
• Modeling data
Computational Support
Statistics for Files and Databases
New Areas
• Entertainment
• Medicine
• Architecture
• Art
• Internet
• Public Demand
Film and Entertainment
Dan Raabe, Toolbox Films
Section of the Visible Human
• Head, including cerebellum
• Cerebral cortex, brainstem
• Nasal passages from Head subset
HIV-I Target
Binding of the drug TIBO-R86183 to specific pocket of HIV-I enzyme
IBM, Data Explorer
DNA Electron Microscopy
Bacterial RecA and eukaryotic Rad51 proteins form similar filaments on DNA
Electron density of C-60
HIV Reverse Transcriptase Inhibitor (electrostatic potential)
[Scale: ESP, from 0.25 down to -0.05 in steps of 0.05]
Gram of Fat
Homework links
the aesthetics + computation group
http://acg.media.mit.edu/
Processing language and environment
http://processing.org/