data discovery through high-data-density visual analysis...


Page 1: Data Discovery through High-Data-Density Visual Analysis ...on-demand.gputechconf.com/gtc/2013/webinar/gtc-express-high-data...

© Synerscope 2013

Sept 2013

Page 2

HOW WE SOLVE THE PROBLEM

AGENDA
• SynerScope background
• Problems of classic data analytics with Big Data
• SynerScope's solution: mobilizing domain expertise
• How SynerScope works
• How SynerScope uses the GPU
• Live demo of SynerScope visual analytics

Page 3

SynerScope

SEE - EXPLORE - DISCOVER

The best way to turn Big Data into Insight, fast

Page 4

How we got here

SynerScope was started as a result of a problem encountered by its founder: analyzing complex interactions in networks proved impossible. In 2007 Jan-Kees Buenen set out to fix this problem; when he met Danny Holten in 2009, SynerScope was born.

Jan-Kees Buenen: Sales, Manufacturing and Quality Analytics (founder & CEO)
• Worked in multinational enterprises since 1993, managed $120 million in sales
• Deep experience with analytics: sales, CRM, operations research, demand forecasting, six-sigma, etc.

Danny Holten: Information Visualization (founder & CSO)
• Worked on the technology of Hierarchical Edge Bundling during his PhD and postdoc at the visualization department of the Eindhoven University of Technology

Jorik Blaas: Computer imaging technologies in medicine and photography (CTO)
• Worked on various medical imaging technologies during his PhD at the Delft University of Technology
• Successfully led a team that developed an HDR photography product and another that designed and built an X-ray imaging instrument for paintings

Page 5

HOW WE SOLVE THE PROBLEM

SynerScope allows everyone to obtain Insight and Knowledge directly from Big Data: intuitively, quickly, safely and at low cost, and thus shortens the “Time to Insight”.

20 years of academic discussions and collaboration on Knowledge Discovery, packed into an instrument that WORKS.

Page 6

Customers and eco-system growing rapidly

(to be completed)

CUSTOMERS
• Insurance SIU: we could identify organized ring-fraud in hours
• Insurance claims director: we found hidden information for underwriting in claims
• Insurance healthcare: complex billing fraud shows up in the patterns
• Banking: we could find deviant cash movements instantly
• Big Four accountant: we could find related risk within 3000 trade positions in hours
• Telecoms (governmental): cell-tower to cell-tower tracks revealed violent incidents

ECO-SYSTEM PARTNERS
• Nvidia: this brings GPU rendering into the broader market of Big Data analytics
• Yarcdata: SynerScope is a perfect scalable front-end for our ultra-scale graph engine
• SAP HANA labs: the complex nested queries of SynerScope add new sales options
• Dell: we look to support SynerScope for its private cloud appliance

Page 7

How business analytics works today

[Diagram labels: Business Manager; Business analyst; Data scientist; Report Request; Data Warehouse Queries; Advanced stats]

Page 8

SynerScope business analytics

[Diagram labels: Business Manager; Business analyst; Data scientist; Report Request; Direct Insight; Advanced stats; Events, Correlations, Free search, RCA; Data Warehouse]

Page 9

The problem with analytics and Big Data

EXPENSIVE (RESOURCE- AND ASSET-HEAVY)
• Heavy resource involvement in the “waterfall method”: business defines questions, analysts translate them into hypotheses, data scientists prepare the data models, an expensive data warehouse is loaded up, analysts perform queries and build reports, and business does the sense-making.
• Data models are core, but they require constant redesign for new data and new questions
• Aimed at reporting and repetitive BI; not built for ad-hoc questions or many new data types
• Relies on brute-force hardware for performance; rigidity is built in
• Multi-million dollar data warehouses are required to keep up system performance

SLOW/ACCESSIBILITY
• No access to data without query-writing skills
• Very slow to change for new data and/or new questions

AGILE BUSINESS INSIGHT DEMANDED

Page 10

Introduction to SynerScope

SynerScope is an instrument for access, search and analysis

o Notably for looking at network patterns: scalable, flexible and interactive

Key features

o Proprietary bundling view, for scalable node-link views

o Showing both geo-location and time dimension of relationships

o All data is shown and interactive thanks to GPU acceleration

o Two-way integration with other applications (R, Python, Elasticsearch)

o Rapid set-up and roll-out, stand-alone and/or virtualized

[Diagram labels: source data; Data integration (3rd party); NoSQL; MongoDB; elasticsearch; Synerscope Legato (machine based); Marcato (human based)]

Page 11

LEGATO        MARCATO
Volume        Variety
Velocity      Veracity
Computer      Human
Structured    Unstructured

Page 12

SynerScope – Analysis

[Diagram: analysis workflow in three phases (Phase 1, Phase 2, Phase 3)]

MACHINE (Synerscope Legato, Reporting/ETL): Extract data; Automated conversion; Statistics; Starting point for analysis; Suggestions and warnings

HUMAN (SynerScope Marcato, Visual Analysis): Business input; Input business questions

Workflow steps: Identify links over columns; Build logical entities; Add attribute details; Add external data; Analyze

Page 13

SynerScope Legato - example

Page 14

Information Quality (sample of 10,000 columns)

A technical data assessment is made of all data received:

• 2.5%: Columns with (nearly) unique values are prime candidates for joins of sources (Policy #, Social security #, ID)
• <40%: Columns with many empty fields are of limited use for joins or as a starting point for visual analysis (“”, NULL, NA, NaN)
• >20%: Columns with outliers, often also containing obvious mistakes (NLD * 1000, NL * 2, typo?)
• >60%: Tables with no direct primary key
• >50%: Tables with more than three types of data have high potential for information
• >40%: Tables with many columns are mostly joins between sources. Often the relations in the data were flattened as a result.
• 2.5%: Columns with a field containing a date/time value are important for patterned time views and link identification
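The column checks listed on this slide can be illustrated with a small profiling sketch. This is not SynerScope Legato's implementation; the function names, the date formats and all thresholds below are assumptions chosen for illustration. Only the empty-field markers (“”, NULL, NA, NaN) come from the slide.

```python
from datetime import datetime

# Markers treated as "empty", taken from the slide: "", NULL, NA, NaN.
EMPTY_MARKERS = {"", "NULL", "NA", "NaN"}


def is_empty(value):
    """Return True if a raw cell value counts as an empty field."""
    return value is None or str(value).strip() in EMPTY_MARKERS


def is_datetime(value, formats=("%Y-%m-%d", "%Y-%m-%d %H:%M:%S")):
    """Return True if the value parses with one of the assumed formats."""
    for fmt in formats:
        try:
            datetime.strptime(str(value).strip(), fmt)
            return True
        except ValueError:
            continue
    return False


def profile_column(values, unique_threshold=0.95, empty_threshold=0.5,
                   datetime_threshold=0.9):
    """Compute simple quality indicators for one column of raw values.

    All three thresholds are illustrative assumptions, not values
    taken from the slides.
    """
    total = len(values)
    filled = [v for v in values if not is_empty(v)]
    empty_ratio = (total - len(filled)) / total if total else 0.0
    unique_ratio = len(set(filled)) / len(filled) if filled else 0.0
    datetime_ratio = (
        sum(is_datetime(v) for v in filled) / len(filled) if filled else 0.0
    )
    return {
        "empty_ratio": empty_ratio,
        "unique_ratio": unique_ratio,
        # (Nearly) unique and mostly filled: candidate key for joining sources.
        "join_candidate": unique_ratio >= unique_threshold
        and empty_ratio < empty_threshold,
        # Many empty fields: limited use for joins or as a starting point.
        "mostly_empty": empty_ratio >= empty_threshold,
        # Date/time values: useful for time views and link identification.
        "datetime_column": datetime_ratio >= datetime_threshold,
    }


if __name__ == "__main__":
    print(profile_column(["P-001", "P-002", "P-003", "P-004"]))
```

Running the profile over every column of every received table would give the kind of per-column flags the slide summarizes as percentages.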

Page 15

Example – Phase 2

STEPS:
• Calculate candidates for nodes and links
• Calculate candidates for joins
• Visualize potential joins in SynerScope Marcato
• Build a “life-story” around an entity as input to business questions
• Determine “white spots” in the data

[Diagram labels: Realtime reporting (Synerscope Legato): Suggestions and warnings; Suggestions for joins. Construct logical entities; Select from available nodes and links; Expand through external sources; Input towards the business questions. Entity: New services; Client focus; Profit / savings]
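The “calculate candidates for joins” step could, for instance, be approximated by measuring how strongly the distinct values of two columns from different tables overlap. The sketch below is an assumption about one plausible approach, not a description of how SynerScope Legato computes its suggestions; the function names, the overlap measure and the 0.8 cut-off are hypothetical.

```python
from itertools import combinations


def distinct_values(values):
    """Return the set of non-empty values, compared as stripped strings."""
    return {str(v).strip() for v in values if v is not None and str(v).strip()}


def join_candidates(tables, min_overlap=0.8):
    """Rank column pairs from different tables by value overlap.

    `tables` maps a table name to a dict of column name -> list of values.
    Overlap is |A & B| / min(|A|, |B|); the 0.8 cut-off is an assumption.
    """
    columns = [
        (table, column, distinct_values(values))
        for table, cols in tables.items()
        for column, values in cols.items()
    ]
    candidates = []
    for (t1, c1, a), (t2, c2, b) in combinations(columns, 2):
        if t1 == t2 or not a or not b:
            continue  # only cross-table pairs that contain data are of interest
        overlap = len(a & b) / min(len(a), len(b))
        if overlap >= min_overlap:
            candidates.append((overlap, f"{t1}.{c1}", f"{t2}.{c2}"))
    return sorted(candidates, reverse=True)


if __name__ == "__main__":
    example = {
        "policies": {"policy_no": ["P1", "P2", "P3"], "holder": ["Ann", "Bob", "Cy"]},
        "claims": {"policy": ["P1", "P1", "P3"], "amount": [100, 250, 75]},
    }
    for overlap, left, right in join_candidates(example):
        print(f"{left} <-> {right}: {overlap:.2f}")
```

Pairs that pass the cut-off are only suggestions; in the workflow on this slide they would still be inspected visually before logical entities are built from them.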

Page 16

SynerScope Legato - example

Page 17

SynerScope Legato - example

Page 18

Data profile – table – Data Quality

Page 19

APPENDICES

SynerScope Marcato

Page 20

SEE

EXPLORE

DISCOVER

Page 26

APPENDICES

Live demo SynerScope Marcato

Page 27

SynerScope

SEE - EXPLORE - DISCOVER

The best way to turn Big Data into Insight, fast

Enabling domain experts to work side by side with data experts

Page 28

Upcoming GTC Express Webinars

Register at www.gputechconf.com/gtcexpress

September 10 - Virtualizing Tough 3D Workloads with VMware Horizon View and NVIDIA Technologies

September 12 - Guided Performance Analysis with NVIDIA Visual Profiler

September 17 - ArrayFire: A Productive GPU Software Library for Defense and Intelligence Applications

September 19 - Learn How to Debug OpenGL 4.2 with NVIDIA® Nsight™ Visual Studio Edition 3.1

September 25 - An Introduction to GPU Programming

Page 29

GTC 2014 Call for Submissions

Looking for submissions in the fields of

Science and research

Professional graphics

Mobile computing

Automotive applications

Game development

Cloud computing

Submit by September 27 at www.gputechconf.com