interactive latency in big data visualization

Interactive Latency in Big Data Visualization Zhicheng “Leo” Liu Jan 22, 2014

Upload: bigdatavizbay

Post on 27-Jan-2015

DESCRIPTION

Interactive Latency in Big Data Visualization
Zhicheng "Leo" Liu, Research Scientist at the Creative Technologies Lab at Adobe Research
January 22nd, 2014

Reducing interactive latency is a central problem in visualizing large datasets. I discuss two inter-related projects in this problem space. First, I present the imMens system and show how we can achieve real-time interaction at 50 frames per second for billions of data points by combining techniques such as data tiling and parallel processing. Second, I discuss an ongoing user study that aims to understand the effect of interactive latency on human cognitive behavior in exploratory visual analysis.

Big Data Visualization Meetup - South Bay
http://www.meetup.com/Big-Data-Visualisation-South-Bay/

TRANSCRIPT

Page 1: Interactive Latency in Big Data Visualization

Interactive Latency in Big Data Visualization

Zhicheng “Leo” Liu Jan 22, 2014

Page 2: Interactive Latency in Big Data Visualization

Latency: a measure of time delay experienced in a system

rotational latency

network latency

query latency

interactive latency

Page 3: Interactive Latency in Big Data Visualization
Page 4: Interactive Latency in Big Data Visualization

Questions

How to reduce interactive latency in big data visualization?

How does interactive latency affect user behavior?

Page 6: Interactive Latency in Big Data Visualization

Reducing Latency

More memory: in-memory data store

Clever indexing: cube representation schemes

Parallel processing: multicore, GPGPU, distributed platforms

Page 7: Interactive Latency in Big Data Visualization

imMens: a holistic approach

Perceptual scalability
Binned aggregation as primary data reduction strategy

Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU

[Liu et al. 2013]

Page 9: Interactive Latency in Big Data Visualization

Guiding Principle

Perceptual & interactive scalability should be limited by the chosen resolution of the visualized data, not the number of records.
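
The principle can be illustrated with a minimal binned-aggregation sketch in Python. This is not the imMens implementation: the helper name `bin_2d`, the unit value ranges, and the 4 x 4 grid are assumptions chosen for illustration. The point is that the aggregate has as many cells as there are bins, whatever the number of input records.

```python
import random


def bin_2d(points, x_range, y_range, bins):
    """Count points per cell of a bins x bins grid (hypothetical helper)."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # Map each coordinate to a bin index; clamp the upper edge into the last bin.
        i = min(int((x - x_min) / (x_max - x_min) * bins), bins - 1)
        j = min(int((y - y_min) / (y_max - y_min) * bins), bins - 1)
        grid[j][i] += 1
    return grid


random.seed(0)
small = [(random.random(), random.random()) for _ in range(1_000)]
large = [(random.random(), random.random()) for _ in range(100_000)]

grid_small = bin_2d(small, (0.0, 1.0), (0.0, 1.0), bins=4)
grid_large = bin_2d(large, (0.0, 1.0), (0.0, 1.0), bins=4)

# Both aggregates have 16 cells, although the inputs differ by 100x.
cells_small = sum(len(row) for row in grid_small)
cells_large = sum(len(row) for row in grid_large)
```

Rendering either grid costs the same; only the one-time aggregation pass scales with the record count.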

Page 10: Interactive Latency in Big Data Visualization

Data

Page 11: Interactive Latency in Big Data Visualization

Data

Alpha-blending

Page 12: Interactive Latency in Big Data Visualization

Data

Page 13: Interactive Latency in Big Data Visualization

Data

Sampling

Page 14: Interactive Latency in Big Data Visualization

Data

Sampling

Modeling

Page 15: Interactive Latency in Big Data Visualization

Data

Sampling

Modeling

Binned Aggregation

Page 16: Interactive Latency in Big Data Visualization

Google Fusion Tables: Sampling

Page 17: Interactive Latency in Big Data Visualization

Aggregation

Page 18: Interactive Latency in Big Data Visualization

Binned Plots: Design Space

[Figure: grid of example binned plots. Columns: numeric, ordinal/categorical, temporal, geographic. Rows: 1D, 2D.]

Page 19: Interactive Latency in Big Data Visualization

imMens: a holistic approach

Perceptual scalability
Binned aggregation as primary data reduction strategy

Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU

[Liu et al. 2013]

Page 20: Interactive Latency in Big Data Visualization

Demo

Page 21: Interactive Latency in Big Data Visualization

Multivariate Data Tiles

Projections / Materialized database views

Provide data for dynamic visualization

Much faster than a traditional data cube

Page 22: Interactive Latency in Big Data Visualization

Page 23: Interactive Latency in Big Data Visualization

Brush & Link: A Naïve Approach

[Figure: a 5-D data cube over X (bins 512 … 1023), Y (bins 256 … 767), Month (0 … 11), Day (0 … 30), and Hour (0 … 23)]

12 x 31 x 24 x 512 x 512 = ~2.3 billion cells

Page 24: Interactive Latency in Big Data Visualization

Brushing Over January

[Figure: the 5-D data cube (X, Y, Month, Day, Hour), with the brush selecting January]

31 x 24 x 512 x 512 = ~195 million cells

Page 25: Interactive Latency in Big Data Visualization

Sum Along Day

[Figure: the brushed cube after summing along Day; the Day axis collapses to the single range [ 0 – 30 ]]

24 x 512 x 512 = ~6 million cells

Page 26: Interactive Latency in Big Data Visualization

Sum Along Hour

[Figure: the brushed cube after also summing along Hour; Day collapses to [ 0 – 30 ] and Hour to [ 0 – 23 ]]

512 x 512 cells
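
The cell counts on the last four slides, and the slice-then-sum operation behind them, can be reproduced with a short Python sketch. The sparse dictionary layout, the helper `brush_month`, and the toy values are assumptions made for illustration; the slides do not say how the naive cube would be stored.

```python
from collections import Counter
from math import prod

# Bin counts per dimension, taken from the slides.
DIMS = {"month": 12, "day": 31, "hour": 24, "x": 512, "y": 512}

full_cube = prod(DIMS.values())                 # 12 x 31 x 24 x 512 x 512
after_brush = full_cube // DIMS["month"]        # one month selected
after_day_sum = after_brush // DIMS["day"]      # day dimension summed out
after_hour_sum = after_day_sum // DIMS["hour"]  # hour dimension summed out


def brush_month(cube, month):
    """Slice a sparse cube on one month, then sum along day and hour.

    `cube` maps (month, day, hour, x, y) -> count; only non-zero cells are stored.
    Returns a Counter mapping (x, y) -> count, i.e. the linked 2D plot.
    """
    image = Counter()
    for (m, _day, _hour, x, y), count in cube.items():
        if m == month:
            image[(x, y)] += count
    return image


# A toy cube with four non-zero cells (made-up values).
toy_cube = {
    (0, 3, 10, 512, 256): 2,
    (0, 7, 22, 512, 256): 5,
    (0, 7, 22, 513, 300): 1,
    (1, 0, 0, 512, 256): 9,
}
january = brush_month(toy_cube, month=0)
```

Even in this toy form, every brush scans the whole stored cube and aggregates every cell under the selected month, which is why the approach is labeled naive.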

Page 27: Interactive Latency in Big Data Visualization

Decomposing a Data Cube

For any pair of 1D or 2D binned plots, the maximum number of dimensions needed to support brushing & linking is 4.

[Figure: the full 5-D cube is summed (Σ) into four 3-D cubes: Month x Day x Hour, X x Y x Hour, X x Y x Day, and X x Y x Month. X bins 512 … 1023, Y bins 256 … 767, Month 0 … 11, Day 0 … 30, Hour 0 … 23.]
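
A rough Python sketch of the saving from decomposition, under stated assumptions: the plot layout (one 2D X/Y plot plus three 1D histograms) is inferred from the dimensions on the slides, and the sketch builds one small cube per pair of plots rather than grouping Month, Day, and Hour into a single 3-D cube as the figure does.

```python
from itertools import combinations
from math import prod

# Bin counts per dimension, taken from the earlier slides.
BINS = {"x": 512, "y": 512, "month": 12, "day": 31, "hour": 24}

# The plots on screen: one 2D heatmap and three 1D histograms (assumed layout).
PLOTS = [("x", "y"), ("month",), ("day",), ("hour",)]


def cube_cells(dims):
    """Number of cells in a dense cube over the given dimensions."""
    return prod(BINS[d] for d in dims)


# Brushing in one plot and updating another only needs the dimensions
# of those two plots, so build one cube per pair of plots.
pair_cubes = [a + b for a, b in combinations(PLOTS, 2)]

max_dims = max(len(c) for c in pair_cubes)
decomposed_cells = sum(cube_cells(c) for c in pair_cubes)
full_cells = cube_cells(BINS.keys())
```

With this layout no cube needs more than 3 dimensions; the bound of 4 is only reached when both plots in a pair are 2D. The decomposed cubes hold about 17.6 million cells in total, against about 2.3 billion for the full cube.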

Page 28: Interactive Latency in Big Data Visualization

Page 29: Interactive Latency in Big Data Visualization

Tiles

[Figure: data tiles labeled X: 256-511, X: 512-767, Y: 512-767, Y: 768-1023, Day: 31 bins]

Page 30: Interactive Latency in Big Data Visualization

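From Datacube to Data Tiles

[Figure: a 3-D cube is split into four data tiles. Each tile covers a 256-bin range on each spatial axis (512 … 767 or 768 … 1023 on one axis, 256 … 511 or 512 … 767 on the other) and all day bins (0 … 30). Axis labels in the figure: "Y: 512 - 1023", "day: 0 - 31".]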
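
One way to address tiles like these is integer division of the bin index by the tile width. The Python sketch below assumes a tile width of 256 bins, matching the ranges in the figure; the helper names are hypothetical and the slides do not show how imMens actually keys its tiles.

```python
TILE_SIZE = 256  # bins per tile along each spatial axis, as in the ranges above


def tile_index(bin_index, tile_size=TILE_SIZE):
    """Index of the tile that contains a given bin."""
    return bin_index // tile_size


def tile_range(index, tile_size=TILE_SIZE):
    """Inclusive bin range covered by a tile."""
    start = index * tile_size
    return start, start + tile_size - 1


def tiles_for_viewport(x_bins, y_bins, tile_size=TILE_SIZE):
    """All (x_tile, y_tile) pairs overlapping an inclusive viewport in bin coordinates."""
    x_lo, x_hi = x_bins
    y_lo, y_hi = y_bins
    return [
        (tx, ty)
        for tx in range(tile_index(x_lo, tile_size), tile_index(x_hi, tile_size) + 1)
        for ty in range(tile_index(y_lo, tile_size), tile_index(y_hi, tile_size) + 1)
    ]


# Viewport covering bins 512 … 1023 on one axis and 256 … 767 on the other.
viewport_tiles = tiles_for_viewport((512, 1023), (256, 767))
```

For this viewport the sketch returns four tiles, so only those tiles need to be fetched and kept on the client while the user pans or zooms within it.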

Page 31: Interactive Latency in Big Data Visualization

Data Tiles

Page 32: Interactive Latency in Big Data Visualization

x1-y1-month

Page 33: Interactive Latency in Big Data Visualization

x1-y1-day

Page 34: Interactive Latency in Big Data Visualization

x1-y1-hour

Page 35: Interactive Latency in Big Data Visualization

x1-y2-month

Page 36: Interactive Latency in Big Data Visualization

x1-y2-day

Page 37: Interactive Latency in Big Data Visualization

x1-y2-hour

Page 38: Interactive Latency in Big Data Visualization

x2-y1-month

Page 39: Interactive Latency in Big Data Visualization

x2-y1-day

Page 40: Interactive Latency in Big Data Visualization

x2-y1-hour

Page 41: Interactive Latency in Big Data Visualization

x2-y2-month

Page 42: Interactive Latency in Big Data Visualization

x2-y2-day

Page 43: Interactive Latency in Big Data Visualization

x2-y2-hour

Page 44: Interactive Latency in Big Data Visualization

month-day-hour

Page 45: Interactive Latency in Big Data Visualization

Page 46: Interactive Latency in Big Data Visualization

imMens Architecture

[Figure: architecture diagram with labels: Server (SciDB, Postgres); Client (UI control, Visualization); interactions: specify, brush & link, zoom & pan]

Page 47: Interactive Latency in Big Data Visualization

Client-Side Processing

Pack data tiles as images (352KB for Brightkite)
Bind to WebGL context as textures

Pass 1: query fragment shader, projections written to an off-screen FBO
Pass 2: render fragment shader, output drawn to the canvas

[Figure: data tiles, e.g. X [512-767] by Y [768-1023] with bins 0 … 11, stored as RGBA texels]
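
The slide says tiles are packed as images but does not give the encoding. The Python sketch below shows one plausible layout, assumed here purely for illustration: each bin count is stored as a 32-bit unsigned integer spread over the four 8-bit channels of one RGBA texel. The helper names are hypothetical, and the GPU-side query that runs in the fragment shader is not reproduced.

```python
def pack_rgba(count):
    """Split a 32-bit unsigned count into four 8-bit channels (R = most significant)."""
    if not 0 <= count < 2 ** 32:
        raise ValueError("count must fit in 32 bits")
    return tuple(count.to_bytes(4, "big"))


def unpack_rgba(rgba):
    """Rebuild the count from four 8-bit channels."""
    return int.from_bytes(bytes(rgba), "big")


def tile_to_pixels(counts):
    """Flatten a list of bin counts into one RGBA byte string, one texel per bin."""
    return b"".join(bytes(pack_rgba(c)) for c in counts)


# Four made-up bin counts become four texels (16 bytes).
pixels = tile_to_pixels([0, 1, 256, 70000])
```

A shader reading such a texture would recombine the four channels into a count before summing over the brushed range; other encodings, such as fewer bits per bin, would trade precision for smaller tiles.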

Page 48: Interactive Latency in Big Data Visualization

Performance Benchmarks

Simulate brush & linking across plots in a scatter plot matrix
imMens vs. full data cube
60 synthesized datasets

Parameters
bin count per dimension (10, 20, 30, 40, 50)
number of records (10K, 100K, 1M, 10M, 100M, 1B)
number of dimensions (4, 5)

Page 51: Interactive Latency in Big Data Visualization

Google Chrome v.23.0.1271.95 on a quad-core 2.3 GHz MacBook Pro (OS X 10.8.2) with per-core 256K L2 caches, shared 6MB L3 cache and 8GB RAM. PCI Express NVIDIA GeForce GT 650M graphics card with 1024MB video RAM.

[Figure: benchmark chart with value labels 51.9, 52.3, 51.6, 52.0, 53.2, 52.1 and 5.5, 3.0, 2.2]

50fps querying and rendering of 1B data points

Page 52: Interactive Latency in Big Data Visualization
Page 53: Interactive Latency in Big Data Visualization

Speed of Thought?

Page 54: Interactive Latency in Big Data Visualization

Questions

How to reduce interactive latency in big data visualization?

How does interactive latency affect user behavior?

Page 55: Interactive Latency in Big Data Visualization

Newell (1994): Unified Theories of Cognition

Page 56: Interactive Latency in Big Data Visualization

Newell (1994) | Card et al. (1983) | Example | Time Range
deliberate act | perceptual fusion | recognize a pattern, track animation | ~100 milliseconds
cognitive operation | unprepared response | click a link, select an object | ~1 second
unit task | unit task | edit a line of text, make a chess move | ~10 seconds

Page 57: Interactive Latency in Big Data Visualization

~300ms: The Embodiment Level

Page 58: Interactive Latency in Big Data Visualization

Deictic Strategy

Pointing movements bind objects in the world

Page 59: Interactive Latency in Big Data Visualization

Small changes in cost of binding cause different cognitive behavior

Page 60: Interactive Latency in Big Data Visualization

Latency affects high-level/longitudinal strategies

Block-copying: Ballard et al. (1995, 1997)

8-puzzle solving: O’Hara and Payne (1998, 1999)

Search: Brutlag (2009)

Page 61: Interactive Latency in Big Data Visualization

Exploratory Visual Analysis?

Page 62: Interactive Latency in Big Data Visualization

Latency Conditions

Operation | Low | High
brush & link | ~20ms | ~20ms + 500ms
select | ~20ms | ~20ms + 500ms
pan | ~100ms | ~100ms + 500ms
zoom | ~1000ms | ~1000ms + 500ms
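
The two conditions amount to a fixed 500 ms penalty on top of each operation's baseline. A small Python sketch, with the baseline values read from the table and the names `latency_ms` and `slowdown` introduced here for illustration, makes the relative effect per operation explicit:

```python
# Baseline latencies in milliseconds, read from the "Low" column of the table.
BASE_LATENCY_MS = {"brush & link": 20, "select": 20, "pan": 100, "zoom": 1000}
ADDED_DELAY_MS = 500  # extra delay in the "High" condition


def latency_ms(operation, condition):
    """Approximate response time for an operation under a latency condition."""
    if condition not in ("low", "high"):
        raise ValueError(f"unknown condition: {condition}")
    extra = ADDED_DELAY_MS if condition == "high" else 0
    return BASE_LATENCY_MS[operation] + extra


# Relative slowdown per operation: the same +500 ms hits fast operations hardest.
slowdown = {op: latency_ms(op, "high") / latency_ms(op, "low") for op in BASE_LATENCY_MS}
```

Under these approximate figures, brushing and selecting become 26 times slower in the high-latency condition, while zooming becomes only 1.5 times slower.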

Page 63: Interactive Latency in Big Data Visualization

Datasets

Page 64: Interactive Latency in Big Data Visualization

Study Design

16 participants, 32 observations
2 X 2 between subject
interaction logs
audio transcripts

Page 65: Interactive Latency in Big Data Visualization

Log Events

System and Mouse Events
brush, select, zoom, pan, clear, color slider, log scale
tiles cached, mouse down, mouse up, mouse move

Trigger vs. Processed System Events
debouncing keeps system usable
timestamp, event type, parameters
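
The gap between triggered and processed events comes from debouncing. The slides do not say which variant the study system uses, so the Python sketch below assumes a trailing-edge debounce: a trigger is processed only if no later trigger follows within a waiting period. The function name, the 50 ms wait, and the timestamps are made up for illustration.

```python
def debounce(trigger_times_ms, wait_ms):
    """Return the trigger timestamps that would actually be processed.

    Trailing-edge debounce: a trigger is processed only if no later trigger
    arrives within `wait_ms` of it. Timestamps must be a list sorted ascending.
    """
    processed = []
    for current, following in zip(trigger_times_ms, trigger_times_ms[1:] + [None]):
        if following is None or following - current >= wait_ms:
            processed.append(current)
    return processed


# Made-up mouse-move burst: five triggers in quick succession, then a pause.
triggers = [0, 10, 20, 30, 40, 400, 410]
processed = debounce(triggers, wait_ms=50)
```

Here seven triggers yield two processed events, at 40 ms and 410 ms. Logging both streams with timestamp, event type, and parameters lets an analysis separate what the user attempted from what the system actually computed.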

Page 66: Interactive Latency in Big Data Visualization

Normalized Processed Events

Page 67: Interactive Latency in Big Data Visualization

How to Evaluate Performance?

The purpose of visualization is insight, not pictures.

Page 68: Interactive Latency in Big Data Visualization

Counting Insights

Page 69: Interactive Latency in Big Data Visualization

What is an insight?

“many new airlines emerged around year 2003”

“HP started in 2001, AS in 2003, PI in 2004, OH in 2003”

“OH started in 2003, and they are doing pretty well in terms of delays”

Page 70: Interactive Latency in Big Data Visualization

Questions

How to reduce interactive latency in big data visualization?

imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency

How does interactive latency affect user behavior?

Comparative study: quantitative & qualitative analysis

Page 72: Interactive Latency in Big Data Visualization

Questions

How to reduce interactive latency in big data visualization?

imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency

How does interactive latency affect user behavior?

User study: quantitative & qualitative analysis

Page 73: Interactive Latency in Big Data Visualization

Acknowledgment

Jeffrey Heer
Biye Jiang

Page 74: Interactive Latency in Big Data Visualization

Thank You