data visualisation

Post on 16-Aug-2015

Category:

Data & Analytics

@dmarcous

“Data Is The New Oil”

“Data Is The New Soil”

David McCandless

“The 19th century culture was defined by the novel, the 20th century culture was defined by the cinema, and the 21st century culture will be defined by the interface”

Lev Manovich, Visual arts professor and media theorist

Steps of Data Analytics (Science)

●Defining question
●Data Exploration
●Data Gathering
●Data Preparation
●Data Analysis
●Data Visualisation

@dmarcous

●@IDF: DBA (DataBase Administrator), Big Data Professional
●@Google – Waze: Data Wizard - Magic with Data

Old Roots

Charles Joseph Minard
French civil engineer
Born: March 27, 1781
Died: October 24, 1870

[Figure: map plotted on longitude (Lon) and latitude (Lat) axes]

Edward Tufte

●The Visual Display of Quantitative Information
●Beautiful Evidence

Data-Ink Ratio

Grammar of graphics

A graphic is a mapping from data to the aesthetic attributes (colour, shape, size) of geometric objects (points, lines, bars).

Leland Wilkinson, 1999
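
The mapping idea can be sketched in a few lines of Python. This is only a toy illustration, not the ggplot2 API: the map_aesthetics helper, the column names and the sample rows are all hypothetical.

```python
# Toy sketch of "a graphic is a mapping" (hypothetical helper, not ggplot2).

def map_aesthetics(rows, mapping, geom="point"):
    """Turn data rows into geometric objects by mapping columns to aesthetics."""
    return [
        {"geom": geom, **{aes: row[col] for aes, col in mapping.items()}}
        for row in rows
    ]

# Made-up sample data, for illustration only.
rows = [
    {"city": "A", "lon": 1.0, "lat": 48.0, "population": 10},
    {"city": "B", "lon": 5.0, "lat": 44.0, "population": 30},
]

# Data columns -> aesthetic attributes (x, y, size) of a point.
points = map_aesthetics(rows, {"x": "lon", "y": "lat", "size": "population"})
print(points[0])
```

Swapping the mapping (for example size for colour) or the geom changes the picture without touching the data, which is the separation the grammar describes.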

ggplot2 package
Hadley Wickham
Assistant professor of statistics at Rice University, Houston

Technical Problems
●Large Result Set
●Real Time Results

Serving Database

Indexed Results
Cached Results

Slice Data in Memory
Build Graphs in Memory
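
A minimal sketch of the indexed / cached / in-memory idea, assuming the result set fits in memory. The EVENTS data, the INDEX layout and the total_for_day function are hypothetical stand-ins; a real serving layer would sit on one of the stores named in the next section.

```python
from functools import lru_cache

# Made-up "large result set", held in memory for slicing.
EVENTS = [{"day": d % 7, "value": d} for d in range(1000)]

# Index built once: day -> list of values, so a lookup avoids a full scan.
INDEX = {}
for event in EVENTS:
    INDEX.setdefault(event["day"], []).append(event["value"])

@lru_cache(maxsize=128)
def total_for_day(day):
    """Aggregate one slice; repeated requests are served from the cache."""
    return sum(INDEX.get(day, []))

first = total_for_day(3)   # computed from the index
second = total_for_day(3)  # served from the cache
print(first, total_for_day.cache_info().hits)
```

The index keeps each slice cheap to fetch, and the cache keeps repeated dashboard requests from recomputing the same aggregate.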

Combined Architecture

●Batch Processing - Hadoop / Spark

●Stream Processing - Storm / Spark Streaming

●Serving / Online Indexing - HBase / solr / elastic / mongo

●Visualise - R / Python + JavaScript (D3.js)

Dimensions of Data

Numbers
Distance
Shapes
Location
Opacity
Size
Words

Amount of data VS Human Readability

The visualisation should be understandable to more than just subject-matter experts.

Techniques

Case Study 1

●Bush VS Gore election results
●Divided by ethnicity (white / black / Hispanic)
●Divided according to size of population

Red - Bush
Black - Gore

Case Study 2

●Display specific area attack patterns
○ Time of attack
○ Intensity of response (we want to minimize casualties on our side)
●Take seasonality into account (day of week)
●Display level of confidence
●Bonus - Add “automatic pattern discovery” / Recommendation for attack time
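
The day-of-week seasonality and level-of-confidence points can be sketched as a grouped average with a rough interval. This assumes the input is a list of (date, event count) pairs; the weekday_profile function and the sample counts are hypothetical, and the 1.96 factor is a normal approximation that is crude for so few observations.

```python
from collections import defaultdict
from datetime import date
from math import sqrt
from statistics import mean, stdev

def weekday_profile(daily_counts):
    """Group daily counts by weekday; return (mean, rough 95% half-width)."""
    groups = defaultdict(list)
    for day, count in daily_counts:
        groups[day.weekday()].append(count)  # Monday == 0
    profile = {}
    for weekday, counts in groups.items():
        # Normal-approximation interval; needs at least two observations.
        half_width = None
        if len(counts) > 1:
            half_width = 1.96 * stdev(counts) / sqrt(len(counts))
        profile[weekday] = (mean(counts), half_width)
    return profile

# Made-up counts, for illustration only.
observations = [
    (date(2015, 8, 3), 4), (date(2015, 8, 10), 6),   # Mondays
    (date(2015, 8, 4), 1), (date(2015, 8, 11), 1),   # Tuesdays
]
profile = weekday_profile(observations)
print(profile)
```

Plotting the half-width as an error bar or a shaded band is one way to show the reader how much to trust each weekday's average.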

Case Study 3

●Most Photographed Places on Earth
●Personalise to a specific user's photos
●Compare a specific user's photo locations with the popularity of each location

Case Study 4

●What do customers do with my app?
●Distribution of entry points
●“Paths to success”
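
Both questions reduce to counting over session logs. A minimal sketch, assuming each session is an ordered list of screen names and that "success" means reaching a purchase screen; the sessions data and the SUCCESS label are hypothetical.

```python
from collections import Counter

# Made-up session logs: ordered screen names per session (hypothetical).
sessions = [
    ["home", "search", "purchase"],
    ["promo", "purchase"],
    ["home", "search"],
    ["home", "search", "purchase"],
]
SUCCESS = "purchase"  # assumed definition of "success"

# Distribution of entry points: the first screen of each session.
entry_points = Counter(s[0] for s in sessions if s)

# "Paths to success": the path up to SUCCESS for every session that reaches it.
success_paths = Counter(
    " > ".join(s[: s.index(SUCCESS) + 1]) for s in sessions if SUCCESS in s
)

print(entry_points.most_common())
print(success_paths.most_common(1))
```

The two counters are the raw material for the visual: a bar chart for entry points, and a flow or Sankey-style diagram for the paths.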

Case Study 5

●Detect and present anomalies in data (machine data / user patterns / irregular traffic etc.)
○ Anomaly = out of the ordinary
○ Learn “normal”
○ Detect global anomalies
○ Detect normal anomalies
○ Detect change of trends

Twitter Breakout Detection
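
As a much simpler stand-in for the package named above, learning "normal" and flagging global anomalies can be sketched with a robust z-score built from the median and the median absolute deviation (MAD). This is not Twitter's algorithm; the global_anomalies function, the 3.5 threshold and the sample series are assumptions for illustration.

```python
from statistics import median

def global_anomalies(values, threshold=3.5):
    """Return indices whose robust z-score (median/MAD) exceeds the threshold."""
    center = median(values)                        # the learned "normal" level
    mad = median(abs(v - center) for v in values)  # typical deviation from it
    if mad == 0:
        return []  # no spread: the score is undefined, so flag nothing
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]

# Made-up series with one obvious spike.
series = [10, 11, 9, 10, 12, 10, 48, 11, 9, 10]
print(global_anomalies(series))
```

Using the median rather than the mean keeps the spike itself from distorting the estimate of "normal". Seasonal (local) anomalies and trend changes need a model of the pattern over time, which this sketch does not attempt.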

Case Study 6

●Assess and display the effect of an event on a certain metric
○ Learn normal pattern
○ Predict according to normal pattern
○ Compute residuals (predicted - actual)
●Did the storm affect the number of accidents?
●Did a military action affect the number of missiles sent?

Google Causal Impact
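
The learn / predict / residuals steps can be sketched with the simplest possible "normal pattern": the mean of the pre-event period. This is not how CausalImpact works (that package fits a Bayesian structural time-series model); the residuals function and the accident counts below are hypothetical.

```python
from statistics import mean

def residuals(baseline, observed):
    """Predict each observed point from the baseline mean; return predicted - actual."""
    predicted = mean(baseline)  # "normal" learned from the pre-event period
    return [predicted - actual for actual in observed]

# Made-up daily accident counts, for illustration only.
before_event = [20, 22, 21, 19, 23]
during_event = [30, 28, 31]

res = residuals(before_event, during_event)
print(res)
```

With the predicted - actual convention used here, a negative residual means the metric was higher than the normal pattern predicted.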

Case Study 7

●Build a stock portfolio with minimised risk
●Pick stocks that are doing well (easy…)
●Pick stocks that act differently from each other (correlations)
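
The "act differently from each other" criterion can be sketched as a pairwise Pearson correlation over returns. The tickers and return values are made up, and picking the single least-correlated pair is only an illustration, not a portfolio optimisation method.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up daily returns for three hypothetical tickers.
returns = {
    "AAA": [0.01, 0.02, -0.01, 0.03],
    "BBB": [0.02, 0.03, -0.01, 0.05],   # moves closely with AAA
    "CCC": [-0.01, 0.01, 0.02, -0.02],  # tends to move against both
}

# Correlation for every pair of tickers.
pairs = {
    (a, b): pearson(returns[a], returns[b])
    for a, b in combinations(returns, 2)
}
least_correlated = min(pairs, key=pairs.get)
print(least_correlated, round(pairs[least_correlated], 2))
```

The same pairwise matrix is a natural input for a heatmap or for grouping stocks that behave alike.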

Clustering Analysis

Interactive Visualisation

Data Art

data-waze.com

Open source is good for me, I will fully embrace it.

Contact

●Daniel Marcous
●dmarcous@gmail.com
