informatica big data management - meetup › 16208282 › big data management... · 2016-04-15 ·...

Informatica Big Data Management Joel LaPlount Informatica Product Management

Post on 29-May-2020

Page 1:

Informatica Big Data Management
Joel LaPlount
Informatica Product Management

Page 2:

Data Powers Businesses

Page 3:

Big Data = Big Opportunity

Sources: Informatica Big Data Survey, March 2012; Cisco, The Zettabyte Era - Trends and Analysis, May 2013

67% of respondents see big data as an opportunity for their organization.

By 2020, data is predicted to grow at least 75 times and more than 1/3 will pass through the Cloud.

Page 4:

Example Use Cases

Advanced Analytics

Fraud / Risk Management

Process / Asset Optimization

DATA LAKE

Page 5:

The Reality

By 2015, 85% of Fortune 500 organizations will fail to effectively exploit big data for competitive advantage.

Page 6:

Companies Taking on the Big Data Challenge

Page 7:

Their Early Journey

All this new data – let’s just spin up a Hadoop cluster.

Now all we have to do is ingest, blend and prep data…

STOP! How do we operationalize the results? Reuse?

The “sandbox” is up – experiments are so much fun!!!

No real business value – no ROI – we are STUCK!

Oops! So many issues with data – just hand-code!

Biz wants more insights – let’s put it in the data lake!

We need more Hadoop developers!!!

Page 8:

Why Do Big Data Projects Fail?

“rapid intake of new data sources” - Vishal, VP Data Architecture

“too many data silos making it impossible to know what data can be trusted” - Pete, Chief Data Officer

“simplify the work of ingesting and mapping data...so that we need fewer specialized development resources” - Ron, VP Global Information Systems

“need to ensure confidence in data integrity, accuracy, and timeliness” - Ron, VP Global Information Systems

“need code re-usability and code maintainability” - Ben, Director of Platform Architecture

“regulations have become very strict and very precise – lots of gaps in the quality of the data” - Christine, Manager Data Management

“prepping and cleaning the data used to take us 2-3 weeks” - Vishal, VP Data Architecture

“transforming data management from a labor intensive, qualitative approach to a systematic approach…to classify data and understand lineage” - Ned, Senior Vice President

Page 9:

What’s Required for Successful Big Data Projects?

Big data does not mean NO data integration.

Big data does not mean BAD quality information.

Big data does not mean PROLIFERATION of sensitive data.

How do you certify and govern big data?

How do you quickly integrate big data?

How do you secure big data?

Page 10:

Introducing Informatica Big Data Management

Industry’s Only Single Integrated Platform for Big Data Management

Informatica Big Data Management

Analytical Applications

Data Warehouses, Data Lakes, NoSQL

PILLAR 1: Big Data Integration

PILLAR 2: Big Data Governance & Quality

PILLAR 3: Big Data Security

Page 11:

PILLAR 1 – Big Data Integration

Page 12:

Big Data Cannot Be Tackled Manually

The Race to Business Value Will Not Be Won By Hand

More Volume

More Variety

More Velocity

More Data Consumers

More Data Silos

More Data Platforms

Page 13:

So Big Data Goes Unused or Is Delivered Late

Data Developer: Overwhelming Manual Efforts

IT Data Management: Complex Processes

Business Analyst: Analysis Too Late

Page 14:

Big Data Integration For Maximum Performance

Ingest Instantly
• 200+ Pre-Built Connectors
• Cloud Connectivity
• Real-Time Streaming

Process Everything
• 100+ Pre-Built Parsers and Transformations
• Graphical Development
• Dynamic Process and Mappings

Deploy Optimally
• Multiple Engines Supported (MapReduce, Spark, etc.)
• High-Speed Processing For Complex Workloads
• Access With Brokering & Federation

Page 15:

PILLAR 2 – Big Data Governance & Quality

Page 16:

Big Data Is Difficult To Trust

Changing Needs for Quality: Same data used for multiple purposes

Hidden Relationships: Everything and everyone is interconnected

Magnified Trust Issues: New sources of external data

Page 17:

And Regulations And Controls Are Harder To Meet

SOX, PCI, HIPAA, FISMA

ISO, GLBA, NIST

Page 18:

Big Data Governance for Agility and Trust

Collaborative Stewardship
• Business Context Provisioning: Role-specific interfaces, business glossary and rules
• Policy Driven Processes: Workflow, approvals, voting

360 Degree Insight
• Relationship Discovery and View: Big data matching and linking
• Catalog Of All Metadata: Smart knowledge graph

Complete Confidence
• Certification with Data Quality: Validation, enrichment, standardization
• Transparency In and Out of the Enterprise: Full data and metadata lineage

Page 19:

PILLAR 3 – Big Data Security

Page 20:

Perimeter Security Is Insufficient

Perimeter security: Outside in security

• Not if, but when
• Network focused
• Attacks will only grow

Page 21:

Big Data: Bigger Risk

Sensitive Data

Security Exposure

• An exponential attack surface
• With exponential risks

Page 22:

Big Data Security Foundation: The ‘Data Perimeter’

Risk Analytics
• Risk Identification: Proliferation, Cost, Protection, Use, Location
• Detection: Risky Users

360 Degree Visibility
• Discovery of Sensitive Data, with Context
• Visualizations: Who, Where, When, What

Policy-Based Protection
• Centralized Management of Rules
• De-Identification For Test, Reporting, Analytics

Page 23:

The 3 Pillars of Informatica Big Data Management

Big Data Integration
• Simple Visual Environment & Templates
• Optimized Execution & Flexible Deployment
• 100’s of Pre-built Transforms, Connectors & Parsers
• Broker-based Data Ingestion

Big Data Governance & Quality
• Collaboration Capabilities
• Business Glossary
• Profiling and Data Quality
• 360° Relationship Views
• End-to-end Data Lineage

Big Data Security
• Sensitive Data Discovery & Classification
• Proliferation Analysis
• Risk Assessment
• Persistent & Dynamic Data Masking
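The persistent masking bullet can be illustrated with a minimal sketch. It assumes deterministic, keyed pseudonymization with HMAC-SHA256, which is one common masking technique and not necessarily what the product does; `mask_value`, `mask_record`, the key, and the sample record are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical key for illustration only; a real key would come from a secret store.
SECRET_KEY = b"demo-key-not-for-production"


def mask_value(value: str, key: bytes = SECRET_KEY, length: int = 12) -> str:
    """Return a deterministic pseudonym: the same input and key give the same token."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def mask_record(record: dict, sensitive_fields: set) -> dict:
    """Copy a record, replacing only the fields flagged as sensitive."""
    return {
        name: mask_value(str(val)) if name in sensitive_fields else val
        for name, val in record.items()
    }


# Assumed sample data: only "ssn" is flagged as sensitive.
record = {"customer_id": 42, "ssn": "123-45-6789", "city": "Austin"}
masked = mask_record(record, {"ssn"})
```

Because the token is deterministic, masked copies of the same value still join across data sets, which is what makes this style of masking usable for test, reporting, and analytics.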

Page 24:

Big Data Management

Key New Features

Page 25:

A Big Data Fabric Enables Productivity, Repeatability, Collaboration

Page 26:

Automate For Maximum Productivity

100+ PRE-BUILT PARSERS AND TRANSFORMATIONS

200+ PRE-BUILT CONNECTORS

Dynamic PROCESSES AND MAPPINGS

Graphical DEVELOPMENT

Page 27:

Develop More Quickly And Staff More Quickly

Hadoop Developers

Informatica Developers

100,000+ TRAINED DEVELOPERS WORLDWIDE

500% MORE PRODUCTIVE THAN HAND-CODING

0% RISK OF REWRITING OUTDATED CODE

Page 28:

Develop Fit-for-Purpose Assets & Drive Collaborative Governance

Data Governance (IT and Business): Discover, Define, Apply, Measure and Monitor

Curation of Fit-for-Purpose Data Assets

Hadoop Data Lake: Raw, Prepared, Cleansed / Matched

Page 29:

Efficiency & Flexibility with Dynamic Mappings

• Mass Ingestion: Build a template once – automate mapping execution for 1000’s of sources with different schemas automatically
• Mapping self-adjusts dynamically to external schema changes and column characteristics

Design time

Run time

Available in PC V10.0!
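The idea of building a template once and resolving it against each source at run time can be sketched roughly as follows. This is not the product's mapping language: the rule set keyed by column type, the `run_mapping` function, and the two sample schemas are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Design time: the "template" names column TYPES, never concrete columns.
# The type names and rules are hypothetical.
Template = Dict[str, Callable[[object], object]]

TEMPLATE: Template = {
    "string": lambda v: v.strip().upper(),
    "decimal": lambda v: round(float(v), 2),
}


def run_mapping(schema: Dict[str, str], rows: List[dict],
                template: Template = TEMPLATE) -> List[dict]:
    """Run time: resolve the template against the actual schema, then apply it."""
    # Columns whose type has no rule pass through unchanged.
    resolved = {col: template.get(col_type, lambda v: v)
                for col, col_type in schema.items()}
    return [{col: fn(row[col]) for col, fn in resolved.items()} for row in rows]


# Two sources with different schemas reuse the same template.
orders_schema = {"order_id": "integer", "status": "string", "total": "decimal"}
customers_schema = {"name": "string", "country": "string"}

orders = run_mapping(orders_schema,
                     [{"order_id": 1, "status": " shipped ", "total": "19.999"}])
customers = run_mapping(customers_schema, [{"name": "ada ", "country": "uk"}])
```

If a source later gains or loses a column, only the schema passed in changes; the template itself stays untouched, which is the property the slide calls self-adjusting.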

Page 30:

Choice of Execution Engines

• For Hadoop execution: engines as native YARN apps
• Choice of execution on
  • Map-Reduce
  • Blaze or
  • INFA engines outside of Hadoop
• Future: Spark-based execution as well as a smart optimizer which decides based on workload

Diagram labels: HADOOP Cluster, HDFS, Map-Reduce, Hive Runtime, DIS, INFA Hive Executor, Data Engine Compiler, Blaze Executor, Blaze Runtime, DIS CAL, Hive Driver, Hive MetaStore, YARN, Blaze, Hadoop CAL
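A workload-based engine choice of the kind the last bullet anticipates might look like the toy selector below. The slide gives no selection rules, so the `Workload` fields, the 1 GB threshold, and the decision order are invented purely to show the shape of such a decision.

```python
from dataclasses import dataclass
from enum import Enum


class Engine(Enum):
    # Engine names are taken from the slide; everything else is hypothetical.
    NATIVE = "INFA engine outside Hadoop"
    MAP_REDUCE = "Map-Reduce"
    BLAZE = "Blaze"


@dataclass
class Workload:
    input_gb: float
    data_in_hdfs: bool
    complex_transformations: bool


def choose_engine(w: Workload, small_data_gb: float = 1.0) -> Engine:
    """Toy heuristic: the threshold and the rules are assumptions, not product behavior."""
    if not w.data_in_hdfs or w.input_gb < small_data_gb:
        return Engine.NATIVE      # small or off-cluster data: skip cluster start-up cost
    if w.complex_transformations:
        return Engine.BLAZE       # assumed preference for complex mappings
    return Engine.MAP_REDUCE


choice = choose_engine(Workload(input_gb=500, data_in_hdfs=True,
                                complex_transformations=True))
```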

Page 31:

Smart Optimizers

• In-built mapping optimizer automatically tunes and re-arranges the mapping for high performance
• Early selection, Early projection, Mapping pruning, Semi-join, Join re-ordering
• Automatic partitioning support based on statistics and other heuristics
• Advanced full pushdown optimization support

Example: the combined filter

Orderkey = L_ORDERKEY and L_EXTENDEDPRICE < 1000 and id1 + id2 > 47

is split into separate conditions:

Orderkey = L_ORDERKEY

L_EXTENDEDPRICE < 1000

id1 + id2 > 47
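Early selection on the slide's example can be sketched as splitting the predicate into conjuncts and pushing each one to the source that owns all of its columns. This is a simplified illustration, not the optimizer's actual algorithm: it handles only flat `and` chains, and the assignment of columns to an `orders` and a `lineitem` source is an assumption.

```python
import re
from typing import Dict, List, Set


def split_conjuncts(predicate: str) -> List[str]:
    """Split on AND; this sketch does not handle parentheses or OR."""
    return [p.strip() for p in re.split(r"\s+and\s+", predicate, flags=re.IGNORECASE)]


def columns_in(expr: str) -> Set[str]:
    """Collect identifier-like tokens, lowercased; numeric literals are ignored."""
    return {tok.lower() for tok in re.findall(r"[A-Za-z_][A-Za-z_0-9]*", expr)}


def place_conjuncts(predicate: str, sources: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Push each conjunct to the one source that has all its columns, else keep it at the join."""
    placement: Dict[str, List[str]] = {name: [] for name in sources}
    placement["join"] = []
    for conj in split_conjuncts(predicate):
        cols = columns_in(conj)
        owners = [name for name, src_cols in sources.items() if cols <= src_cols]
        placement[owners[0] if owners else "join"].append(conj)
    return placement


# Assumed column ownership; the slide does not say which source holds id1 and id2.
sources = {
    "orders": {"orderkey", "id1", "id2"},
    "lineitem": {"l_orderkey", "l_extendedprice"},
}
plan = place_conjuncts(
    "Orderkey = L_ORDERKEY and L_EXTENDEDPRICE < 1000 and id1 + id2 > 47", sources
)
```

Under these assumptions the two single-source filters run before the join, so fewer rows reach it, and only the key comparison remains as the join condition.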

Page 32:

Enterprise Information Catalog: Basis for Data Intelligence

EIC: Relationships, Catalog, Statistics

Live Data Map: Rules, Glossary, Ratings

Sources
• All Informatica Repositories
• Applications, Business glossary & context
• 3rd party – BI, Modeling, Big Data, RDBMS
• User Ratings, Feedback, Operational Stats

Data Discovery
• Exploration
• Semantic Search
• Relationship Discovery

Sensitive Data Tracking

Stewardship & Governance

Smart Suggestions
• Recommendations
• 360 degree views
• User Ratings

Live Data Map

Knowledge Graph of all enterprise data assets

Page 33:

Project Sonoma: Intelligent Data Lake

Diagram labels: Enterprise Information Catalog, BI & Analytics, Self-Service Data Discovery, IT Monitoring & Tracking, Prepare (Rev), Raw Data, Published Data Sets, DATA, METADATA

Self-Service for Analysts

• Search & Discover

• Prepare & Publish

Visibility for IT

• Usage tracking & monitoring

• Lineage & Security

• Operate at scale

Page 34:

Project Sonoma: Intelligent Data Lake

Data Analysts

• Enterprise data assets search and discovery

• Data acquisition from on-premise and cloud sources, batch and real-time

• Data set recommendations

• Excel-like Data preparation, enrichment for large data sets

• Data publishing and sharing

Page 35:

Why Informatica?

Page 36:

Data Is ALL We Do

Page 37:

Innovation and Leadership

Magic Quadrant for Data Integration Tools

Magic Quadrant for Enterprise Integration Platform as a Service

Magic Quadrant for Data Quality Tools

Magic Quadrant for Data Masking Technology

Magic Quadrant for Structured Data Archiving and Application Retirement

Magic Quadrant for Master Data Management of Customer Data Solutions

These graphics were published by Gartner, Inc. as part of larger research documents and should be evaluated in the context of the entire documents. The Gartner documents are available upon request from Informatica. Gartner does not endorse any vendor, product, or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Page 38:

Informatica Big Data Customers (Sample)

Page 39:

Informatica Big Data Ecosystem Partners

Page 40:

Thank You!

Big Data Management V10.1 Launch

May 12 Webinar – Check Informatica.com