TRANSCRIPT
![Page 1: SizeIntroDefinitionComplexityTuftsWrap-up 1/54 Big Data Visual Analytics: Challenges and Opportunities Remco Chang Tufts University](https://reader033.vdocument.in/reader033/viewer/2022051619/56649e215503460f94b0e3ac/html5/thumbnails/1.jpg)
SizeIntro Definition Complexity Tufts Wrap-up1/54
Big Data Visual Analytics: Challenges and Opportunities
Remco Chang
Tufts University
Human + Computer
• Human vs. Artificial Intelligence: Garry Kasparov vs. Deep Blue (1997)
  – Computer takes a “brute force” approach without analysis
  – “As for how many moves ahead a grandmaster sees,” Kasparov concludes: “Just one, the best one”
• Artificial vs. Augmented Intelligence: Hydra vs. Cyborgs (2005)
  – Grandmaster + 1 chess program > Hydra (equiv. of Deep Blue)
  – Amateur + 3 chess programs > Grandmaster + 1 chess program ¹
1. http://www.collisiondetection.net/mt/archives/2010/02/why_cyborgs_are.php
Visual Analytics = Human + Computer
• Visual analytics is “the science of analytical reasoning facilitated by interactive visual interfaces.” ¹
• By definition, it is a collaboration between human and computer to solve problems.
1. Thomas and Cook, “Illuminating the Path”, 2005.
Example: What Does (Wire) Fraud Look Like?
• Financial institutions like Bank of America have legal responsibilities to report all suspicious wire transaction activities (money laundering, supporting terrorist activities, etc.)
• Data size: approximately 200,000 transactions per day (73 million transactions per year)
• Problems:
  – Automated approaches can only detect known patterns
  – Bad guys are smart: patterns are constantly changing
  – Data is messy: lack of international standards results in ambiguous data
• Current methods:
  – 10 analysts monitoring and analyzing all transactions
  – Using SQL queries and spreadsheet-like interfaces
  – Limited time scale (2 weeks)
WireVis: Financial Fraud Analysis
• In collaboration with Bank of America
  – Develop a visual analytical tool (WireVis)
  – Visualizes 7 million transactions over 1 year
  – Beta-deployed at WireWatch
• A great problem for visual analytics:
  – Ill-defined problem (how does one define fraud?)
  – Limited or no training data (patterns keep changing)
  – Requires human judgment in the end (involves law enforcement agencies)
• Design philosophy: “combating human intelligence requires better (augmented) human intelligence”
R. Chang et al., Scalable and interactive visual analysis of financial wire transactions for fraud detection. Information Visualization, 2008.
R. Chang et al., WireVis: Visualization of categorical, time-varying data from financial transactions. IEEE VAST, 2007.
WireVis: A Visual Analytics Approach
• Heatmap View (Accounts to Keywords Relationship)
• Strings and Beads (Relationships over Time)
• Search by Example (Find Similar Accounts)
• Keyword Network (Keyword Relationships)
Applications of Visual Analytics
• Political Simulation
  – Agent-based analysis
  – With DARPA
• Global Terrorism Database
  – With DHS
• Bridge Maintenance
  – With US DOT
  – Exploring inspection reports
• Biomechanical Motion
  – Interactive motion comparison
R. Chang et al., Two Visualization Tools for Analysis of Agent-Based Simulations in Political Science. IEEE CG&A, 2012.
Applications of Visual Analytics
[Figure: global terrorism visualization with views labeled Where, When, Who, What, Original Data, and Evidence Box]
R. Chang et al., Investigative Visual Analysis of Global Terrorism, Journal of Computer Graphics Forum, 2008.
Applications of Visual Analytics
R. Chang et al., An Interactive Visual Analytics System for Bridge Management, Journal of Computer Graphics Forum, 2010. To Appear.
Applications of Visual Analytics
R. Chang et al., Interactive Coordinated Multiple-View Visualization of Biomechanical Motion Data, IEEE Vis (TVCG) 2009.
Talk Outline
• Visual Analytics + Big Data:
1. What is Big Data Visual Analytics? Definition and Problem Statement
2. How to Visualize High Dimensional Data?
3. How to Visualize Large Amounts of Data?
4. Research at Tufts
1. What is Big Data Visual Analytics? A Definition and Problem Statement
Recall Bank of America Project
• Financial Institutions like Bank of America have legal responsibilities to report all suspicious wire transaction activities (money laundering, supporting terrorist activities, etc)
• Data size: approximately 200,000 transactions per day (73 million transactions per year)
• Question: How many people think this is Big Data?
Defining Big Data for Visual Analytics
• Let’s say that I have a billion data items, is that Big Data?
• What if:
  – These data items only have two attributes (e.g., latitude, longitude)?
  – I transpose this dataset such that I have two rows of data, but with a billion attributes?
Defining Big Data for Visual Analytics
• Big Data is NOT just about the size of your data
• For the purpose of this talk, let’s talk about Big Data in the following way:
  – Complexity: The number of attributes (k)
    • Assume k > 2
  – Size: The number of rows (n)
    • Assume the amount of data cannot fit into a desktop computer’s memory
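A rough way to check a dataset against these two axes is sketched below. The function name, the 8-byte value size, and the 8 GiB memory budget are all illustrative assumptions, not figures from the talk.

```python
def describe_dataset(n_rows, k_attributes, bytes_per_value=8, memory_bytes=8 * 2**30):
    """Summarize a table by complexity (k) and size (n).

    bytes_per_value and memory_bytes are assumed defaults for illustration.
    """
    footprint = n_rows * k_attributes * bytes_per_value
    return {
        "complex": k_attributes > 2,        # complexity: more than two attributes
        "large": footprint > memory_bytes,  # size: does not fit in assumed memory
        "footprint_bytes": footprint,
    }


# A billion (latitude, longitude) rows versus the same values transposed.
points = describe_dataset(10**9, 2)
transposed = describe_dataset(2, 10**9)
print(points)
print(transposed)
```

Both tables hold the same number of values, so they are equally "large" under this budget, but only the transposed one counts as complex.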
Problem Statements
• Considering the two together is too difficult, so we’ll tackle the two issues independently for now
• Our goal is to visualize (complex | large) data sets while:
  – Maintaining interactivity: rendering at 10 fps
  – Allowing for operations on the data (zoom, pivot, etc.)
2. How to Visualize Complex (High-Dimensional) Data?
Why is This Problem Hard?
You can only see 2D because your monitor is 2D.
In other words: you can show at most 2-dimensional data.
Everything else is a hack.
Ways to Visualize k-Dimensional Data
• Two primary ways to do this “hack”
  – Divide up the 2D screen into multiple 2D regions
    • Showing no correlation between dimensions
    • Showing k-1 correlations
    • Showing all pair-wise correlations
  – Project k-Dimensional Data into 2D
    • 3D to 2D
    • k-D projection
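To make the three "divide the screen" options concrete, the sketch below counts how many 2D regions each one needs for k attributes. The function and attribute names are hypothetical, and treating "k-1 correlations" as pairs of neighbouring attributes is an assumption about how the regions are laid out.

```python
from itertools import combinations


def panel_layouts(attributes):
    """Return the panels needed by each 'divide the screen' strategy."""
    independent = [(a,) for a in attributes]           # k panels, no correlation shown
    adjacent = list(zip(attributes, attributes[1:]))   # k-1 neighbouring pairs
    all_pairs = list(combinations(attributes, 2))      # k(k-1)/2 unordered pairs
    return independent, adjacent, all_pairs


independent, adjacent, all_pairs = panel_layouts(["age", "income", "height", "weight"])
print(len(independent), len(adjacent), len(all_pairs))
```

The pair-wise count grows quadratically with k, which is why this family of layouts runs out of screen space quickly.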
Ways to Visualize k-Dimensional Data
• Divide up the 2D screen into multiple 2D regions
  – Showing k-1 correlations: Parallel Coordinates
Ways to Visualize k-Dimensional Data
• Divide up the 2D screen into multiple 2D regions
  – Showing all pair-wise correlations: Scatterplot Matrix
Ways to Visualize k-Dimensional Data
• Project k-Dimensional Data into 2D
  – k-D projection
• Example projection methods (dimension reduction):
  – PCA
  – MDS
  – LDA
  – LLE
  – Many others!
• Usually, these try to preserve distances in 2D as they exist in k-D
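As one example of a k-D projection, here is a minimal PCA sketch using NumPy's SVD. The helper name `pca_project` and the synthetic data are assumptions for illustration; a real system would more likely call a library implementation.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project rows of X onto the top principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T


# Synthetic 5-D data whose variance is concentrated in the first two attributes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 0.5, 0.2, 0.1])
Y = pca_project(X)
print(Y.shape)
```

Because this is an orthogonal projection, distances in 2D can only shrink relative to k-D; how well they are preserved depends on how much variance the dropped directions carried.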
What We Have Done (at Tufts)
• We like projection methods because they are more scalable than the “divide the screen” methods
• iPCA – does interaction help in understanding high-dimensional data?
  – Demo
• Dis-Function – are interactions in 2D meaningful (recoverable) in k-D?
Dis-Function: Direct Manipulation of Visualization
• The user directly moves points on the 2D plane that don’t “look right”…
• Until the expert is happy (or the visualization cannot be improved further)
• The system learns the weights (importance) of each of the original k dimensions
Dis-Function
• This iterative metric learning process finds the weights of the k-dimensions over a series of 2D interactions
R. Chang et al., Find Distance Function, Hide Model Inference. IEEE VAST Poster 2011.
R. Chang et al., Dis-function: Learning Distance Functions Interactively, IEEE VAST 2012. To Appear.
Dis-Function: Implementation
Linear distance function:
Optimization:
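One way to picture a linear distance function with one weight per dimension is sketched below. This is an assumption for illustration, not the formulation used in Dis-Function: it uses a weighted squared Euclidean distance and a one-shot least-squares fit in place of the iterative optimization, and the names `weighted_sq_distance` and `fit_weights` are hypothetical.

```python
from itertools import combinations

import numpy as np


def weighted_sq_distance(x, y, theta):
    """Assumed form: D(x, y | theta) = sum_j theta_j * (x_j - y_j)^2."""
    return float(np.dot(theta, (np.asarray(x) - np.asarray(y)) ** 2))


def fit_weights(X, pairs, target_distances):
    """Find non-negative weights whose distances best match the targets."""
    X = np.asarray(X, dtype=float)
    sq_diffs = np.array([(X[a] - X[b]) ** 2 for a, b in pairs])
    targets = np.asarray(target_distances, dtype=float)
    theta, *_ = np.linalg.lstsq(sq_diffs, targets, rcond=None)
    theta = np.clip(theta, 0.0, None)  # weights are importances, so keep them >= 0
    return theta / theta.sum()         # normalize so the weights sum to 1


# Simulated feedback: target distances come from hidden "true" weights.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
pairs = list(combinations(range(20), 2))
true_theta = np.array([0.8, 0.2, 0.0])
targets = [weighted_sq_distance(X[a], X[b], true_theta) for a, b in pairs]
learned = fit_weights(X, pairs, targets)
print(np.round(learned, 3))
```

In an interactive setting the targets would come from where the user drags points in 2D, and the fit would be repeated after each round of feedback.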
Open Questions in High-Dimensional Data Visualization
• When to use what?
  – Projection methods scale better, but are harder to understand
• What happens when the data attributes are not all numeric, but contain categorical or text data?
  – Use multiple coordinated views
• But what if k gets to be really large and the types are mixed?
  – Uh…
3. How to Visualize Large Amounts of Data?
Problem Statement
Visualization on Commodity Hardware
Large Data in a Data Warehouse
Problem Statement
• Constraint: Data is too big to fit into the memory or hard drive of a personal computer
  – Note: Ignoring various database technologies (OLAP, column-store, NoSQL, array-based, etc.)
• Classic Computer Science Problem…
• What are some previous techniques?
  – Truncate (sample, filter)
  – Resolution reduction (“blurring”, image zooming)
  – Stream (think Netflix, Hulu)
  – Pre-fetch (think open-world 3D video games)
Pros and Cons: Truncate
• Truncate (sample, filter)
  – Pros: Easy to implement; efficient; scalable
  – Cons: Sampling is often data- or task-dependent
[Figure: Sampling Algorithm]
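As one example of truncation by sampling, the sketch below uses reservoir sampling (Algorithm R) to keep a fixed-size uniform sample from a stream of unknown length. The talk does not name a specific sampling algorithm, so this choice is an assumption.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)       # fill the reservoir first
        else:
            j = rng.randint(0, i)     # inclusive on both ends
            if j < k:
                sample[j] = item      # replace with probability k / (i + 1)
    return sample


# Sample 100 transaction ids from a million without holding them all in memory.
subset = reservoir_sample(range(1_000_000), k=100, seed=42)
print(len(subset))
```

A uniform sample like this is easy and scalable, but it is exactly where the "data- or task-dependent" caveat bites: rare events such as fraud are likely to be missed.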
Pros and Cons: Resolution Reduction
• Resolution reduction (“blurring”)
  – Pros: Allows hierarchical navigation
  – Cons:
    • Fine details are often lost
    • Not all data types can be easily blurred (order-invariant data)
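A minimal sketch of resolution reduction for 2D point data: bin the points into a grid, then build coarser levels by summing 2x2 blocks. The function names, the 64x64 starting grid, and the synthetic points are assumptions for illustration.

```python
import numpy as np


def bin_points(x, y, bins, extent):
    """Count points per grid cell; extent is [[xmin, xmax], [ymin, ymax]]."""
    counts, _, _ = np.histogram2d(x, y, bins=bins, range=extent)
    return counts


def coarsen(counts):
    """Halve the resolution by summing each 2x2 block of cells."""
    h, w = counts.shape
    return counts.reshape(h // 2, 2, w // 2, 2).sum(axis=(1, 3))


rng = np.random.default_rng(0)
x, y = rng.uniform(0, 1, 100_000), rng.uniform(0, 1, 100_000)

# Build a pyramid from 64x64 cells down to a single cell.
pyramid = [bin_points(x, y, bins=64, extent=[[0, 1], [0, 1]])]
while pyramid[-1].shape[0] > 1:
    pyramid.append(coarsen(pyramid[-1]))
print([level.shape for level in pyramid])
```

Each level preserves the total count while discarding where inside a cell the points fell, which is the loss of fine detail noted above. The approach also relies on the data having a meaningful order along each axis.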
Pros and Cons: Streaming
• Stream [Fisher et al., CHI 2012]
  – Pros: Query can be terminated at any time
  – Cons: It is inefficient on the database end
[Figure: incremental results at t = 1 second and t = 5 minutes]
Fisher et al., Trust Me, I'm Partially Right: Incremental Visualization Lets Analysts Explore Large Datasets Faster. CHI 2012.
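The idea of a query that can be stopped at any time can be sketched as a generator that reports a running estimate after every chunk of rows. This is an illustrative stand-in, not the system from the cited paper; it assumes rows arrive in random order, and the 1.96 factor gives an approximate 95% interval under that assumption.

```python
import math


def incremental_mean(values, chunk_size):
    """Yield (rows_seen, running_mean, half_width) after each full chunk."""
    n, mean, m2 = 0, 0.0, 0.0
    for i, v in enumerate(values, start=1):
        # Welford's update keeps the mean and squared deviations in one pass.
        n += 1
        delta = v - mean
        mean += delta / n
        m2 += delta * (v - mean)
        if i % chunk_size == 0:
            stderr = math.sqrt(m2 / (n - 1) / n) if n > 1 else float("inf")
            yield n, mean, 1.96 * stderr


# The consumer can stop as soon as the estimate is good enough.
estimates = list(incremental_mean([1, 2, 3, 4], chunk_size=2))
for rows_seen, mean, half_width in estimates:
    print(rows_seen, round(mean, 3), round(half_width, 3))
```

Breaking out of the loop early is what "terminated at any time" looks like to the caller; the cost is that the database must serve rows in an order it was not optimized for.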
Pros and Cons: Pre-Fetch
• Pre-fetch
  – Pros: Seamless to the user
  – Cons: Predicting the future is kind of hard
    • Possible in 3D games because of limited degrees of freedom
    • http://www.youtube.com/watch?v=n27NLuc44Lk
Pros and Cons: Pre-Fetch
• Pre-fetch in Visual Analytics [Chan, Hanrahan, 2008 VAST]
  – Limit the types of operations a user can do
  – Allows interactive analysis of over a billion data points
Chan et al., Maintaining Interactivity While Exploring Massive Time Series. IEEE VAST 2008.
Quick Summary
• Most of the time, a combination of techniques is used in a given system. For example, streaming and sampling.
• Pre-fetching is very interesting because:
  – The success metric is quantitative (cache misses)
  – Multiple approaches for prediction
    • Feature-based (what data features is the user interested in?)
    • Momentum-based (has the user been panning to the right?)
    • Probabilistic models (what is the user likely going to do?)
    • Profile-based (what type of user is it?)
    • etc.
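The momentum-based idea and the cache-miss metric can be sketched together. Everything here is a simplifying assumption: tiles are numbered along a single axis, the cache is a small LRU, and `TileCache` and `browse` are hypothetical names.

```python
from collections import OrderedDict


class TileCache:
    """A tiny LRU cache that counts misses on user-visible requests."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tiles = OrderedDict()
        self.misses = 0

    def _load(self, tile):
        self.tiles[tile] = f"data for tile {tile}"  # stand-in for a slow fetch
        self.tiles.move_to_end(tile)
        while len(self.tiles) > self.capacity:
            self.tiles.popitem(last=False)          # evict least recently used

    def request(self, tile):
        if tile not in self.tiles:
            self.misses += 1
            self._load(tile)
        else:
            self.tiles.move_to_end(tile)
        return self.tiles[tile]

    def prefetch(self, tile):
        if tile not in self.tiles:
            self._load(tile)                        # background load, not a miss


def browse(path, capacity=4, use_momentum=True):
    """Replay a pan path and return the number of cache misses."""
    cache = TileCache(capacity)
    previous = None
    for tile in path:
        cache.request(tile)
        if use_momentum and previous is not None and tile != previous:
            cache.prefetch(tile + (tile - previous))  # keep going the same way
        previous = tile
    return cache.misses


path = list(range(10))  # the user pans steadily to the right
print(browse(path, use_momentum=False), browse(path, use_momentum=True))
```

On a steady pan the predictor only misses while it is still learning the direction. A user who changes direction often would gain much less, which is where the other prediction approaches come in.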
4. Research at Tufts: Visual Analytics of Large Amounts of Data
Joint work with Caroline Ziemkiewicz, Alvitta Ottley
Motivation
Individual Differences and Interaction Pattern
• Existing research shows that all the following factors affect how someone uses a visualization:
  – Spatial Ability
  – Cognitive Workload/Mental Demand
  – Personality
  – Experience (novice vs. expert)
  – Emotional State
  – Perceptual Speed
  – … and more
Preliminary Study – Novice v. Expert
• Novice vs. expert financial analysts’ use of the WireVis system when searching for fraud
  – Novices exhibited “breadth-first-search” behaviors
– Experts exhibited “depth-first-search” behaviors
• Our next step is to use Machine Learning methods to distinguish a user by analyzing their interactions in real-time
Preliminary Study – Locus of Control
• Identified the personality factor, Locus of Control (LOC), as a predictor for how a user interacts with the following visualizations:
Results
• When using the list view compared to the containment view, internal LOC users are:
  – faster (by 70%)
  – more accurate (by 34%)
• Only for complex (inferential) tasks
• The speed improvement is about 2 minutes (116 seconds)
R. Chang et al., How Locus of Control Influences Compatibility with Visualization Style, IEEE VAST 2011.
R. Chang et al., How Visualization Layout Relates to Locus of Control and Other Personality Factors. TVCG 2012. To Appear.
Preliminary Study – Cognitive Priming
Results: Averages Primed More Internal
[Figure: performance (poor to good) against visual form (list view vs. containment) for internal LOC, external LOC, and average LOC users, with average LOC users primed toward internal]
R. Chang et al., LOC it Down: Manipulating and Controlling for Personality Effects on Visualization Tasks. (In Submission to CHI)
Preliminary Study – Using Brain Sensing (fNIRS)
Functional Near-Infrared Spectroscopy
• a lightweight brain sensing technique
• measures mental demand (working memory)
R. Chang et al., Using fNIRS Brain Sensing to Evaluate Information Visualization Interfaces (In submission at CHI)
This is Your Brain on Bar graphs and Pie Charts
Make the Computer Aware of the User!
Summary
• Visual Analytics + Big Data is a critically important problem that isn’t going to go away
• Thinking of Big Data as problems of data complexity and size can lead to clearer research paths
• I propose that one research area that has largely been unexplored is in the understanding of the human user.
Summary
• Visual Analytics + Big Data:
1. What is Big Data Visual Analytics? Definition and Problem Statement
2. How to Visualize High Dimensional Data?
3. How to Visualize Large Amounts of Data?
4. Research at Tufts