Webinar - Introducing Datameer 4.0: Visual, End-to-End

Post on 29-Jan-2015


Category:

Technology


DESCRIPTION

In Big Data projects, analysts often spend 80% of their time preparing data for analysis. In addition, users don’t have a good understanding of their data quality. Today there are multiple tools that assist with integration, data preparation, analysis and visualization. However, data quality continues to be one of the biggest challenges businesses face when deploying big data analytics. People need to profile their data and ensure data quality to get accurate insights and make informed business decisions. Join Datameer as we address these pain points with visualizations at every step.

This webinar will highlight and showcase:

- How visual data profiling reduces the guesswork in the data wrangling process
- How enhanced interactive data mining capabilities reduce time to insight
- A demonstration of the new 4.0 features and functions

TRANSCRIPT

Introducing Datameer 4.0

Visual, End-to-End

© 2014 Datameer, Inc. All rights reserved.

View Recording

You can view the recording of this webinar at:

http://info.datameer.com/Online-Slideshare-Datameer-4-0-Visual-End-to-End-OnDemand.html

About Our Speakers

Matt Schumpert, Senior Director, Solutions Engineering

Matt has been working in the enterprise infrastructure software space for over 14 years in various capacities, including sales engineering, strategic alliances and consulting. Matt currently runs the pre-sales engineering team at Datameer, supporting all technical aspects of customer engagement from initial contact through roll-out of customers into production. Matt holds a BS in Computer Science from the University of Virginia.

Matt McManus, Vice President, Engineering

Matt has been building enterprise software products for over 10 years with deep experience in architecture, software engineering and team management roles. Matt currently leads the engineering team at Datameer, managing all aspects of product development, releases and quality assurance. Matt attended Boston University where he earned a Bachelor’s degree in Computer Science.

The Lean Data Supply Chain

Classical Data Pipeline

Modern Data Pipeline

Informatica, Talend, Flume, Sqoop

Trifacta, Paxata

Pig, Hive

Impala, Tableau, Platfora

Integrate, Prepare, Analyze, Visualize

An End-to-End Solution

Data Integration, Analytics, Visualization

Any Distro

Smart Analytics

Clustering, Column Dependencies, Recommendation, Decision Trees

Enterprise Integration

Introducing Datameer 4.0

Visual Insights at Every Step

Introducing ‘Flip-Side’


Before

Integrate, Prepare, Analyze, Visualize

Now

Integrate, Prepare, Analyze, Visualize

Problems Solved

- Before: Multiple Tools. With Datameer 4.0: Single Platform
- Before: Not for business. With Datameer 4.0: Self-Service
- Before: Visualize at End. With Datameer 4.0: Visual Insights at Every Step

Use Cases and Impact

Industry: Banking
Challenge: Identify credit scores that were out of range based on zip code (credit scores in affluent areas tend to be higher than in others)
Impact: Identify loans that have highest risk and better quantify risk exposure (>$13M)

Industry: Retail
Challenge: Identify missing product id or inaccurate product descriptions
Impact: Inventory: slower turnover of stock. Fulfillment: out of stock at customers. Logistics: distribution errors and rework, extra shipping costs (>$1M)

Industry: Telco
Challenge: Identify incorrect subscriber data (e.g. invalid email addresses) that will skew results on usage in particular area
Impact: By correlating subscriber data with network performance data, meet existing and forecasted demand, but not excess capacity resulting in inflated capital expenditures (>$140M)

Industry: Telco
Challenge: Identify incorrect subscriber data (e.g. negative ages) that will skew segments used for churn analysis
Impact: Discount and retention campaigns are executed optimally and targeted to the right clusters, avoiding lost revenue

4.0 Technical Details

Matt McManus, VP, Engineering

Column Metrics Collection

Metric: Supported Column Types

- Cardinality*: All
- Histogram*: Numeric + Date
- Frequency* (Top K): All
- Summary (min/max/mean): Numeric + Date
- Null vs. Present: All

* indicates estimated value
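
The two exact metrics in the list above (Summary and Null vs. Present) can be collected in a single pass without storing the column. The following is a minimal sketch of that idea, not Datameer's implementation: the `ColumnSummary` class and `summarize` function are hypothetical names introduced here for illustration, and the slides do not describe how the product computes these values.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnSummary:
    """Running summary for one numeric column (hypothetical helper)."""
    present: int = 0
    nulls: int = 0
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    mean: float = 0.0

    def update(self, value: Optional[float]) -> None:
        # Null vs. Present: count missing values separately.
        if value is None:
            self.nulls += 1
            return
        self.present += 1
        self.minimum = value if self.minimum is None else min(self.minimum, value)
        self.maximum = value if self.maximum is None else max(self.maximum, value)
        # Incremental mean: no need to keep the values or a large running sum.
        self.mean += (value - self.mean) / self.present


def summarize(values: Iterable[Optional[float]]) -> ColumnSummary:
    """Consume the column once and return its summary."""
    summary = ColumnSummary()
    for value in values:
        summary.update(value)
    return summary
```

Calling `summarize(column_values)` on any iterable of numbers and `None` entries yields the counts, minimum, maximum and mean after one pass, which is why such metrics can be computed while the data is being processed for another purpose.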


Performance Implications

- Metrics are calculated using streaming techniques designed to minimize performance impacts
- Often an estimate is provided to achieve high performance
- Collection can be disabled on a per-job or cluster-wide basis
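
To illustrate the trade-off between estimates and performance described above, here is a small sketch of a streaming frequency (Top K) estimate. It uses the textbook Misra-Gries algorithm as a stand-in; the slides do not name the algorithms Datameer uses, so the choice of technique and the function name `misra_gries` are assumptions made for this example only.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Estimate frequent items in one pass using at most k - 1 counters.

    Each returned count is a lower bound on the true count and is
    off by at most n / k, where n is the number of items seen.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

Memory stays bounded by `k` regardless of how many distinct values the column holds, at the cost of counts that may be too low. That is the general shape of the bargain the slide refers to: a bounded, single-pass computation in exchange for an approximate answer.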


Visual Profiling of Full Results

- Column statistics available on full results of every worksheet (without leaving workbook)
- Column statistics fall back to “preview” in certain circumstances
- Visual cues guide users

Flip-Side with Smart Analytics

- Visualize model on full results
  - Decision trees
  - Column dependencies
- Visually explore cluster composition
  - Compare data shape across clusters
- Enhancements to recommendation visualizations

Demo: Customer Churn

For More Information

- http://www.datameer.com
- @datameer, #datameer
- mschumpert@datameer.com
- mmcmanus@datameer.com
