Post on 04-Jan-2016

Dynamic Visualization of Transient Data Streams

P. Wong, et al.
The Pacific Northwest National Laboratory

Presented by John Sharko
Visualization of Massive Datasets

Characteristics of Data Streams

• Arrives continuously

• Arrives unpredictably

• Arrives unboundedly

• Arrives without persistent patterns

Examples of Data Streams

• Newswires

• Internet click streams

• Network resource management

• Phone call records

• Remote sensing imagery

Visualization Problem

• Fusing a large amount of previously analyzed information with a small amount of new information

• Reprocessing the whole dataset in full detail

First Objective

• Achieve the best understanding of transient data when the influx rate exceeds the processing rate

Approach: Data stratification to reduce data size

Second Objective

• Incremental visualization technique

Approach: Project new information incrementally onto previous data

Primary Visualization Output

Multidimensional Scaling

[Figure: MDS scatterplot with labels: OJ Simpson trial, French elections, Oklahoma bombing]
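As a rough illustration of how an MDS scatterplot can be produced from document vectors, here is a minimal sketch of classical (Torgerson) MDS. The slides say only "MDS", so the choice of the classical variant, the function name `classical_mds`, and the use of Euclidean distances are all assumptions made for this example.

```python
import numpy as np

def classical_mds(vectors, n_components=2):
    """Embed row vectors in n_components dimensions (classical MDS sketch)."""
    x = np.asarray(vectors, dtype=float)
    # Squared Euclidean distances between every pair of rows.
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    # Double-center the squared distances to get a Gram matrix.
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    # The top eigenvectors, scaled by sqrt(eigenvalue), are the coordinates.
    evals, evecs = np.linalg.eigh(b)
    order = np.argsort(evals)[::-1][:n_components]
    scale = np.sqrt(np.clip(evals[order], 0.0, None))
    return evecs[:, order] * scale
```

Each row of the result is one 2-D point for the scatterplot. The cost grows quickly with the number of vectors, which is the motivation for the stratification and projection techniques in the rest of the talk.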

Adaptive Visualization Using Stratification

Methods for Adaptive Visualization

• Vector dimension reduction

• Vector sampling

Vector Dimension Reduction

Approach: dyadic wavelets (Haar)

200 terms

100 terms

50 terms
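The halving from 200 to 100 to 50 terms can be sketched with one level of Haar decomposition per step. This is a minimal sketch, assuming that only the low-pass (approximation) coefficients are kept at each level and that odd-length vectors are zero-padded; the slides do not state either detail, and `haar_approximation` is a hypothetical name.

```python
import math

def haar_approximation(vector, levels=1):
    """Keep only the Haar approximation coefficients, halving length per level."""
    coeffs = [float(v) for v in vector]
    for _ in range(levels):
        if len(coeffs) % 2:
            coeffs.append(0.0)  # assumption: zero-pad odd lengths
        # Orthonormal Haar low-pass filter: (a + b) / sqrt(2) for each pair.
        coeffs = [
            (coeffs[i] + coeffs[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(coeffs), 2)
        ]
    return coeffs
```

With a 200-term document vector, `levels=1` gives 100 terms and `levels=2` gives 50 terms.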

Results of Vector Dimension Reduction

[Figure: MDS scatterplots at 200, 100, and 50 dimensions]

Results of Vector Sampling

[Figure: MDS scatterplots for 3298, 1649, and 824 documents]
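The document counts 3298, 1649, and 824 are consistent with halving the set twice. The slides do not say how the vectors are chosen, so the regular every-other-vector stride below is an assumption, and `halve` is a hypothetical helper name.

```python
def halve(vectors):
    """Keep every second vector, so n vectors become n // 2."""
    return vectors[1::2]

def sample_levels(vectors, levels=2):
    """Return the full set followed by each successively halved set."""
    strata = [vectors]
    for _ in range(levels):
        strata.append(halve(strata[-1]))
    return strata
```

Calling `sample_levels` on 3298 vectors yields sets of 3298, 1649, and 824 vectors, from which the coarser sets can be visualized first when time is short.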

Scatterplot Similarity Matching

Procrustes Analysis Results

Documents    200 dims      100 dims    50 dims
All          0.0 (self)    0.022       0.084
1/2          0.016         0.051       0.111
1/4          0.033         0.062       0.141
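A Procrustes comparison scores how well one scatterplot can be matched to another after translation, uniform scaling, and rotation or reflection; 0 means a perfect match. The sketch below computes such a disparity with NumPy. It is an illustration of the general technique, not necessarily the exact normalization behind the values above, and `procrustes_disparity` is a hypothetical name. Both inputs must contain the same points in the same order.

```python
import numpy as np

def procrustes_disparity(a, b):
    """Residual sum of squares after optimally aligning b to a (0 = identical shape)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove translation by centering each configuration.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Remove scale by normalizing to unit Frobenius norm.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    # The singular values of a^T b give the best achievable alignment.
    s = np.linalg.svd(a.T @ b, compute_uv=False)
    return 1.0 - s.sum() ** 2
```

To compare a reduced scatterplot with the full one, the reduced plot has to be restricted to (or matched with) the points both plots share before calling the function.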

Incremental Visualization Using Fusion

• Reprocessing by projecting new items onto existing visualization

• Feature: reprocessing the entire dataset is often not required

Hyperspectral Image Processing

• Apply MDS to scale pixel vectors

• K-means clustering to assign unique colors

• Stratify the vectors progressively
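The color-assignment step can be pictured as clustering the scaled pixel vectors and giving each cluster its own color. Below is a minimal sketch of Lloyd's k-means in NumPy. The initialization, iteration count, and the `palette` argument are assumptions for illustration; the slides do not describe how the clustering was configured.

```python
import numpy as np

def kmeans_labels(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    x = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct points chosen at random (an assumption).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its members, if it has any.
        for c in range(k):
            members = x[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels

def colors_for(labels, palette):
    """Look up one RGB color per point from a k-row palette."""
    return np.asarray(palette)[labels]
```

Reshaping the output of `colors_for` back to the image's height and width would give a false-color image in which pixels with similar spectra share a color.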

Robust Eigenvectors

Generate three MDS scatterplots, one for each third of the image

Robust Eigenvectors (cont’d)

Generate MDS scatterplot for entire dataset

Robust Eigenvectors (cont’d)

Extract points from cropped areas

Using Multiple Sliding Windows

Eigenvectors determined by the long window

New vectors are projected using the Eigenvectors of the long window

[Figure: a data stream covered by a long window and a short window, with the sliding direction marked]
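The projection idea can be sketched as follows: compute the leading eigenvectors once from the vectors in the long window, then place each newly arrived vector with a single matrix product instead of rerunning MDS. This sketch uses the eigenvectors of the covariance matrix (PCA) as a stand-in for the MDS eigenvectors, which is an assumption; `fit_long_window` and `project` are hypothetical names.

```python
import numpy as np

def fit_long_window(long_window, n_components=2):
    """Return the mean and leading eigenvectors of the long window."""
    x = np.asarray(long_window, dtype=float)
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)  # features are columns
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1][:n_components]
    return mean, evecs[:, order]

def project(vectors, mean, axes):
    """Place vectors (e.g. the short window) using the long window's axes."""
    return (np.asarray(vectors, dtype=float) - mean) @ axes
```

Projecting a new vector costs one small matrix product, whereas recomputing the eigenvectors requires the whole window. The trade-off is that the stored axes drift out of date as the stream changes, which is why the accumulated error needs to be checked.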

Dynamic Visualization Steps

1. When influx rate < processing rate, use MDS

2. When influx rate > processing rate, halt MDS

3. Use multiple sliding windows for pre-defined number of steps

4. Use stratification approach for fast overview

5. Check for accumulated error using Procrustes analysis

6. If error threshold not reached, go to step 3; if error threshold reached, go to step 1
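The decision logic in these steps can be summarized in a few lines. This is a sketch under stated assumptions: "go to step 1" is read as "recompute with full MDS", equal influx and processing rates are treated as keeping up, and the function name and return strings are hypothetical.

```python
def next_action(influx_rate, processing_rate, accumulated_error, error_threshold):
    """Choose what the visualization should do for the next batch."""
    # Steps 1-2: full MDS is affordable only while the system keeps up.
    if influx_rate <= processing_rate:
        return "full_mds"
    # Steps 5-6: too much accumulated error forces a full recomputation.
    if accumulated_error >= error_threshold:
        return "full_mds"
    # Steps 3-4: otherwise project with sliding windows and stratify.
    return "project_and_stratify"
```

In a running system the accumulated error would come from a Procrustes comparison made after the pre-defined number of sliding-window steps, and it would be reset whenever full MDS is run.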

Conclusions

• The data stratification approach can substantially accelerate the visualization process

• The data fusion approach can provide instant updates
