data visualization (cis 490/680)

26
Data Visualization (CIS 490/680) D3 Dr. David Koop D. Koop, CIS 680, Fall 2019

Upload: others

Post on 26-Jan-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Visualization (CIS 490/680)

Data Visualization (CIS 490/680)

D3

Dr. David Koop

D. Koop, CIS 680, Fall 2019

Page 2: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Tables

Attributes (columns)

Items (rows)

Cell containing value

Networks

Link

Node (item)

Trees

Fields (Continuous)

Attributes (columns)

Value in cell

Cell

Multidimensional Table

Value in cell

Grid of positions

Geometry (Spatial)

Position

Dataset TypesDataset Types

�2

[Munzner (ill. Maguire), 2014]

Page 3: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

song structure and singing (Joe Carmanica recently wrote about this trend for the New

York Times, arguing that it was led by Drake, who popularized the rapping-and-singing

formula over the past decade).

A better benchmark for Lil Uzi Vert’s word count (2,556) might be those of pop artists,

such as Beyonce (2,433 words), or even one his major influences: Marilyn Manson

(2,466 words).

There are also genre-bending artists. If Childish Gambino’s Awaken, My Love! is less hip

hop in the traditional ’90s boom-bap sense, is it fair to compare it to vocabulary-dense

Wu-Tang albums? Genre matters in vocabulary calculations—check out the chart below,

which takes 500 random samples of 35,000 words from rock, country, and hip hop.

# of Unique Words Used in 500 Random Samples of 35,000 Lyrics from Country, Rock, Hip Hop

Raw Lyrics Data via John W. Miller

In short, if artists depart from hip-hop song structure, we’d expect their vocabulary to

go down in the number of unique words.

That said, the results are still directionally interesting. Of the 150 artists in the dataset,

let’s take a look at who is on top.

2,800uniquewords

Rock

3,800uniquewords

4,800uniquewords

5,800uniquewords

Country

Hip Hop

0 samples

105 samples

211 samples

Sets & Lists

�3

[M. Daniels, 2019]

Page 4: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

241 = Quantitative2 = Nominal3 = Ordinal

quantitative ordinal categorical

Categorial, Ordinal, and Quantitative

�4

Page 5: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Attributes

Attribute Types

Ordering Direction

Categorical Ordered

Ordinal Quantitative

Sequential Diverging Cyclic

Ordering Direction

�5

[Munzner (ill. Maguire), 2014; Rogowitz & Treinish, 1998; Weber et al., 2001]lar structure of spirals allows for an easy detection of cyclesand for the comparison of periodic data sets. Furthermore,the continuity of the data is expressed by using a spiral in-stead of a circle.

3.1. Mathematical description and types of spirals

A spiral is easy to describe and understand in polar coor-dinates, i.e. in the form . The distinctive feature ofa spiral is that f is a monotone function. In this work we as-sume a spiral is described by

Several simple functions f lead to well-known types ofspirals: • Archimedes‘ spiral has the form . It has the spe-

cial property that a ray emanating from the origincrosses two consecutive arcs of the spiral in a constantdistance .

• The Hyperbolic spiral has the form . It is theinverse of Archimedes‘ spiral with respect to the origin.

• More generally, spirals of the form are calledArchimedean spirals.

• The logarithmic spiral has the form . It has thespecial property that all arcs cut a ray emanating fromthe origin under the same angle.For the visualization of time-dependent data Archimedes‘

spiral seems to be the most appropriate. In most applicationsdata from different periods are equally important. Thisshould be reflected visually in that the distance to other peri-ods is always the same.

3.2. Mapping data to the spiral

In general, markers, bars, and line elements can be usedto visualize time-series data similar to standard point, bar,and line graphs on Spiral Graphs. For instance, quantitative,discrete data can be presented as bars on the spiral or bymarks with a corresponding distance to the spiral. However,since the x and y coordinate are needed to achieve the generalform of the spiral their use is limited for the display of datavalues. One might consider to map data values to small ab-solute changes in the radius, i.e.

Yet, we have found this way of visualizing to be ineffec-tive. We conclude that the general shape of the spiral shouldbe untouched and other attributes should be used, such as• colour,• texture, including line styles and patterns,

Figure 1:Two visualizations of sunshine intensity using about the same screen real estate and thesame color coding scheme. In the spiral visualization it is much easier to compare days, to spotcloudy time periods, or to see events like sunrise and sunset.

r f ϕ( )=

r f ϕ( ) dfdϕ------ 0 ϕ R+.∈,>,=

r aϕ=

2πar a ϕ⁄=

r aϕk=

r aekϕ=

r aϕ bv ϕ( ) b πamax v ϕ( )( )---------------------------<,+=

Page 6: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Tasks

�6

[Munzner (ill. Maguire), 2014]

Trends

Actions

Analyze

Search

Query

Why?

All Data

Outliers Features

Attributes

One ManyDistribution Dependency Correlation Similarity

Network Data

Spatial DataShape

Topology

Paths

Extremes

ConsumePresent EnjoyDiscover

ProduceAnnotate Record Derive

Identify Compare Summarize

tag

Target known Target unknown

Location knownLocation unknown

Lookup

Locate

Browse

Explore

Targets

Why?

How?

What?

Page 7: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Analyze

Search

Query

ConsumePresent EnjoyDiscover

ProduceAnnotate Record Derive

Identify Compare Summarise

tag

Target known Target unknown

Location known

Location unknown

Lookup

Locate

Browse

Explore

Actions

Actions: Analyze

�7

[Munzner (ill. Maguire), 2014]

Page 8: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Visualization for Consumption

�8

[M. Stefaner, M. Wattenberg]

Discover

Present

Enjoy

Page 9: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

HIGH QUALITY DESCRIPTION

LOW QUALITY

DESCRIPTION

MEMORABLE

FORGETTABLE

Memorability

�9

[M. Borkin et al., InfoVis 2015]

Page 10: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Memorability of Visualizations• S. Few: "Visualizations don’t need to be designed for memorability – they

need to be designed for comprehension. For most visualizations, the comprehension that they provide need only last until the decision that it informs is made. Usually, that is only a matter of seconds."

• B. Jones (paraphrased): People make decisions using visualizations but this isn't instantaneous like robots or algorithms; they often chew on a decision for a while

• R. Kosara: there are cases where people benefit from remembering a visualization (e.g. health-related visualization)

• Are there tradeoffs between the characteristics?

�10

Page 11: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Visualization for Production• Generate new material • Annotate: - Add more to a visualization - Usually associated with text, but can be graphical

• Record: - Persist visualizations for historical record - Provenance (graphical histories): how did I get here?

• Derive (Transform): - Create new data - Create derived attributes (e.g. mathematical operations, aggregation)

�11

Page 12: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Original Data

exports

imports

Visualization for Production: Derived Data

�12

Derived Data

trade balance = exports − imports

trade balance

[Munzner (ill. Maguire), 2014]

Page 13: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Actions: Search• What does a user know? - Lookup: check bearings - Locate: find on a map - Browse: what’s nearby - Explore: where to go

- Patterns

�13

[Munzner (ill. Maguire), 2014]

Analyze

Search

Query

ConsumePresent EnjoyDiscover

ProduceAnnotate Record Derive

Identify Compare Summarize

tag

Target known Target unknown

Location known

Location unknown

Lookup

Locate

Browse

Explore

Actions

Page 14: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Actions: Query

• Number of targets: One, Some (Often 2), or All • Identify: characteristics or references • Compare: similarities and differences • Summarize: overview of everything

�14

[Munzner (ill. Maguire), 2014]

Analyze

Search

Query

ConsumePresent EnjoyDiscover

ProduceAnnotate Record Derive

Identify Compare Summarize

tag

Target known Target unknown

Location known

Location unknown

Lookup

Locate

Browse

Explore

Actions

Page 15: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Assignment 2• Link • Three parts: table, horizontal bar

chart, vertical bar chart - data processing - highlighting (CS 680)

• Vertical chart can be tricky • Start early! • Questions?

�15

Page 16: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Why?

How?

What?

Roadmap• What? → Data - Types - Semantics

• Why? → Tasks - Actions - Targets

• How → Vis Idioms/Techniques - Data Representation - Visual Encoding - Interaction Encoding

�16

Page 17: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Analysis Example: Different “Idioms”

�17

[SpaceTree, Grosjean et al.] [TreeJuxtaposer, Munzner et al.]

Page 18: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

“Idiom” Comparison

�18

[Munzner (ill. Maguire), 2014]

[SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation. Grosjean, Plaisant, and Bederson. Proc. InfoVis 2002, p 57–64.]

SpaceTree

[TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453– 462, 2003.]

TreeJuxtaposer

Present Locate Identify

Path between two nodes

Actions

Targets

SpaceTree

TreeJuxtaposer

Encode Navigate Select Filter AggregateTree

Arrange

Why? What? How?

Encode Navigate Select

Page 19: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Analysis Example: Derivation• Strahler number

– centrality metric for trees/networks – derived quantitative attribute – draw top 5K of 500K for good skeleton

�19

[Munzner (ill. Maguire), 2014]

[Using Strahler numbers for real time visual exploration of huge graphs. Auber. Proc. Intl. Conf. Computer Vision and Graphics, pp. 56–69, 2002.]

Task 1

.58

.54

.64

.84

.24

.74

.64.84

.84

.94

.74

OutQuantitative attribute on nodes

.58

.54

.64

.84

.24

.74

.64.84

.84

.94

.74

InQuantitative attribute on nodes

Task 2

Derive

Why?What?

In Tree ReduceSummarize

How?Why?What?

In Quantitative attribute on nodes TopologyIn Tree

Filter

InTree

OutFiltered TreeRemoved unimportant parts

InTree +

Out Quantitative attribute on nodes Out Filtered Tree

Page 20: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Data-Driven Documentsd3.js

�20

Page 21: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Data-Driven Documents (D3)• Open-Source JavaScript Library • http://d3js.org/ • Original Authors: Mike Bostock, Vadim Ogievestky, and Jeff Heer • Focus on Web standards, customization, and usability • Grew from work on Protovis: more standard, more interactive • By nature, a low-level library; you have control over all elements and styles • A top project on GitHub (over 85,000 stars as of Sept. 2019) • Lots of impressive examples - Bostock was a New York Times Graphics Editor - https://bost.ocks.org/mike/ and https://observablehq.com/@mbostock

�21

Page 22: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

D3 Key Features• Supports data as a core piece of Web elements - Loading data - Dealing with changing data (joins, enter/update/exit) - Correspondence between data and DOM elements

• Selections (similar to CSS) that allow greater manipulation • Method Chaining • Integrated layout algorithms, axes calculations, etc. • Focus on interaction support - Straightforward support for transitions - Event handling support for user-initiated changes

�22

Page 23: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

D3 Introduction• Ogievetsky has put together a nice set of interactive examples that show off

the major features of D3 • http://dakoop.github.io/IntroD3/ - (Updated from original for D3 v5 with new joins)

• https://beta.observablehq.com/@dakoop/d3-intro • Other references: - Murrary’s book on Interactive Data Visualization for the Web - The D3 website: d3js.org - Ros's Slides on v4: https://iros.github.io/d3-v4-whats-new/

�23

Page 24: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

D3 Data Joins• Two groups: data and visual elements • Three parts of the join between them: enter, update, and exit • enter: s.enter(), update: s, exit: s.exit()

�24

Data Visual Elements

Enter Update Exit

Page 25: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Merge vs. Join• Merge creates a new selection that includes the items from both selections - If you want to update all elements (including those just added via enter), use

merge! - Useful when enter+update have similar transitions

• Join allows you to modify different parts of the selection in a single statement - Also will create the final selection - Does enter+append and exit+remove automatically - Pass functions to modify the enter, update, and exit parts of the selection - Examples: https://beta.observablehq.com/@d3/selection-join

�25

Page 26: Data Visualization (CIS 490/680)

D. Koop, CIS 680, Fall 2019

Transitions• Nested transitions (those that "hang off" of a parent transition) follow

immediately after the parent transition

�26