smarter management for your data growth


Upload: rainstor

Post on 20-May-2015


DESCRIPTION

Matt Aslett (The451Group) and Deirdre Mahon (RainStor) examine the evolving data management landscape and how RainStor's Online Data Retention (OLDR) repository fits into the equation.

TRANSCRIPT

Page 1: Smarter Management for Your Data Growth

Smarter Management for Your Data Growth

Retain Critical Data Online At A Fraction of The Cost

April 2011

Page 2: Smarter Management for Your Data Growth

Agenda

Introductions
Changing Data Management Landscape & Trends
- From Operational to Analytical
- Cloud and Hadoop: Where do They Fit?
RainStor and How it Works
Analytics Data Retention Use-case
Economics
Q&A

Matt Aslett, The 451 Group

Deirdre Mahon, VP Marketing – RainStor

Ramon Chen, VP Product Management - RainStor

Page 3: Smarter Management for Your Data Growth

© 2011 by The 451 Group. All rights reserved

Matthew Aslett, The 451 Group

Total Data

The changing data management landscape

Page 4: Smarter Management for Your Data Growth

The 451 Group

451 Research is focused on the business of enterprise IT innovation. The company’s analysts provide critical and timely insight into the competitive dynamics of innovation in emerging technology segments.

Tier1 Research is a single-source research and advisory firm covering the multi-tenant datacenter, hosting, IT and cloud-computing sectors, blending the best of industry and financial research.

The Uptime Institute is ‘The Global Data Center Authority’ and a pioneer in the creation and facilitation of end-user knowledge communities to improve reliability and uninterruptible availability in datacenter facilities.

TheInfoPro is a leading IT advisory and research firm that provides real-world perspectives on the customer and market dynamics of the enterprise information technology landscape, harnessing the collective knowledge and insight of leading IT organizations worldwide.

ChangeWave Research is a research firm that identifies and quantifies ‘change’ in consumer spending behavior, corporate purchasing, and industry, company and technology trends.

Page 5: Smarter Management for Your Data Growth

Overview

The changing data management landscape

One overarching trend: Total Data

Impacting four technology areas:
- Operational database
- Analytic database
- Data archiving
- Machine-generated data

The trends driving data management

Page 6: Smarter Management for Your Data Growth

Trends driving data management

The volume, variety and velocity of data have never been greater and are growing

The value of data has never been better understood

The capabilities for processing data have never been better
- Higher processor performance and density are enabling advanced processing on commodity hardware
- Software enhancements designed to make best use of processing performance and scalable architecture
- Advanced and in-database analytics bring processing to the data, reducing latency and improving efficiency

The data deluge problem is also a big data opportunity

Page 7: Smarter Management for Your Data Growth

Introducing Total Data

A concept defined by The 451 Group to describe new approaches to data management – beyond restrictive silos

Reflects the changing data management landscape as pragmatic choices are being made about data storage and analysis techniques

Processing any data that might be applicable to analytics
- In the operational database, data warehouse, Hadoop, or archive
- Structured, semi-structured or unstructured
- Relational or non-relational, on-premise or in the cloud

Inspired by ‘Total Football’

Page 8: Smarter Management for Your Data Growth


Total Football meets Total Data

“You make space, you come into space. And if the ball doesn’t come, you leave this space and another player will come into it.”

Bernardus Hulshoff, Ajax 1966-77

Abandonment of restrictive (self-imposed) rules about individual roles and responsibility

Enabled and relied on fluidity and flexibility to respond to changing requirements

Reliant on, and exploited, improved performance levels

Page 9: Smarter Management for Your Data Growth

Data management – in theory

[Diagram: Enterprise app; Operational database; Data cleansing/sampling/MDM; EDW; Reporting/BI; Data archive; Infrastructure]

The application is the primary source of data

The relational database is sacrosanct

The enterprise data warehouse is the single source of the truth (or is supposed to be)

Offline data archiving

Infrastructure primarily exists to support the data/application layer

Page 10: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Reporting/BI; Data archive; Infrastructure]

The relational database is sacrosanct

Distributed data layer to meet the scalability and performance demands

New opportunities for real-time BI

Polyglot persistence – use the most appropriate data storage for the application

Page 11: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Reporting/BI; Data archive; Infrastructure]

The enterprise data warehouse is the single source of the truth

Data is copied into departmental or regional data marts

Data warehouse administrators are fighting a losing battle for control

Page 12: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Reporting/BI; Data archive; Infrastructure]

Higher processor performance and density are enabling advanced processing on commodity hardware

Advanced in-database analytics bring processing to the data, reducing latency and improving efficiency

Page 13: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data archive; Infrastructure]

Hadoop and associated analysis tools (Hive, Pig) for large-scale batch processing of large, complex data sets

Taking further advantage of hardware economics

Page 14: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data archive; Infrastructure]

Integrating Hadoop with the data warehouse for ETL and also two-step data analysis

Greater acceptance that the EDW is part of a broader data analytics architecture

Page 15: Smarter Management for Your Data Growth

Data location, data location, data location

Not the end of the EDW, but the EDW is one of many sources of BI, rather than the only source of BI

The issue of data location becomes paramount

Choose the right storage technology – software and hardware
- EDW, Hadoop or archive
- On-premise or on the cloud
- Memory, disk or SSD

Understand the requirements:
- Value and temperature of the data
- Ensure data can be queried using existing tools/skills
- Cost

Page 16: Smarter Management for Your Data Growth

EDW requirements/characteristics

High performance query/analysis response
Ability to support multiple users concurrently
Capacity for multi-terabyte storage and scale
Fast data load and staging for data transformation
Ability to operate with BI/analytics tools
Security and governance

Cost - $20k-$50k per TB

Alternatives
- Do nothing and suffer the consequences
- Deploy appliances and/or Hadoop for specific use-cases
- Offload to an online repository

Page 17: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data archive; Infrastructure]

Offline data archiving

Traditionally, data archived for legal requirements

Previously little need for querying/analytics

Page 18: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Infrastructure]

Regulations have increased the need to query archived data

Focus shifts on to how to enable querying easily and cost effectively

Becomes an online repository for historical data

Page 19: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Infrastructure]

Infrastructure primarily exists to support the data/application layer

“Machine generated data” an untapped source of data

Page 20: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Datastructure]

Infrastructure as a source of data for analysis and integration with application data: ‘datastructure’

Likely to transform into data-generating and data-processing infrastructure as analytics capabilities are applied directly to the data source

Page 21: Smarter Management for Your Data Growth

Data management – in practice

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Datastructure. Cloud Infrastructure: Hadoop/DW; Analytic database (x3); Reporting (x3); Analytic DB; Data archive; Reporting/BI]

Cloud as both a source of data and data storage and processing layer

Page 22: Smarter Management for Your Data Growth

Total Data

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Datastructure. Cloud Infrastructure: Hadoop/DW; Analytic database (x3); Reporting (x3); Analytic DB; Data archive; Reporting/BI]

More flexible approach to data management

Greater opportunities for business intelligence

Page 23: Smarter Management for Your Data Growth


Data location, data location, data location

Avoid data movement and duplication – retain governance

Virtual data marts and data clouds

Data virtualization to provide access to multiple data sources

Page 24: Smarter Management for Your Data Growth

Data virtualization

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Datastructure. Cloud Infrastructure: Hadoop/DW; Analytic database (x3); Reporting (x3); Analytic DB; Data archive; Reporting/BI]

Page 25: Smarter Management for Your Data Growth

Data virtualization

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic DB; Hadoop; Data repository; Datastructure; Cloud Infrastructure (Hadoop/DW, Data archive); Data virtualization layer; Virtual data mart (x6); Reporting (x8); Reporting/BI]
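As a rough illustration of the virtual data mart idea on this slide, the sketch below uses Python's built-in sqlite3 module. Two separate SQLite databases stand in for a warehouse and an archive, and a temporary view gives a reporting query one logical table without copying any rows. The database roles, the `sales` tables and the `all_sales` view are hypothetical names chosen for the example; the deck does not describe a specific product or API.

```python
import sqlite3

# Two independent in-memory databases stand in for separate systems.
conn = sqlite3.connect(":memory:")                     # "warehouse" (schema: main)
conn.execute("ATTACH DATABASE ':memory:' AS archive")  # "archive"

conn.execute("CREATE TABLE main.sales (quarter TEXT, amount INTEGER)")
conn.execute("CREATE TABLE archive.sales (quarter TEXT, amount INTEGER)")
conn.executemany("INSERT INTO main.sales VALUES (?, ?)",
                 [("2011Q1", 120), ("2010Q4", 100)])
conn.executemany("INSERT INTO archive.sales VALUES (?, ?)",
                 [("2005Q1", 40), ("2004Q4", 35)])

# The "virtual data mart": a TEMP view that spans both sources.
# Rows stay where they are; only the query is federated.
conn.execute("""
    CREATE TEMP VIEW all_sales AS
    SELECT quarter, amount, 'warehouse' AS source FROM main.sales
    UNION ALL
    SELECT quarter, amount, 'archive' AS source FROM archive.sales
""")

total = conn.execute("SELECT SUM(amount) FROM all_sales").fetchone()[0]
by_source = dict(conn.execute(
    "SELECT source, COUNT(*) FROM all_sales GROUP BY source"))
print(total, by_source)
```

A real data virtualization layer would also push filters down to each source and handle differing schemas; this sketch only shows the single logical view over data left in place.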

Page 26: Smarter Management for Your Data Growth

Who is RainStor?

Specialized database for cost effective reduction, retention & on-demand retrieval of historical structured data

At 10x Less Cost

OEM Partner Model

Cloud or On-premise

Page 27: Smarter Management for Your Data Growth

Partner Case Studies

Sector: Telco
Solution: Message (SMS/MMS) and traffic log management
Retaining 1000s of messages a second while keeping accessible for regulatory purposes

Sector: Horizontal
Solution: Teradata Data Retention Machine
Retain BI & Analytical data long term in RainStor powered Data Retention Machine for low cost per TB stored. Eliminating tape.

Sector: Various/Horizontal
Solution: Information Lifecycle Management
Retaining historical data from highly complex packaged applications while keeping accessible for business and regulatory purposes

HP
Sector: Telco
Solution: CDR/IPDR retention and lawful intercept (HP Dragon)
Retaining billions of CDRs per day in immutable form and enabling cost effective query for regulatory authorities

Page 28: Smarter Management for Your Data Growth

Data Retention Solution Requirements

[Diagram: Transactional (OLTP); Analytical (OLAP); Static Machine-Generated Data (MGD); Online Data Retention (OLDR); Database Archiving; Application Retirement; Data Warehouse Archiving; Data Warehouse Appliance; Compliance; Query]

Page 29: Smarter Management for Your Data Growth

Where RainStor Fits

[Diagram: Enterprise app; Distributed data; Operational database (x4); Data cleansing/sampling/MDM; EDW; Analytic database (x3); Reporting (x3); Hadoop; Reporting/BI; Data repository with Reporting; Datastructure; Application Archive / Retired. Cloud Infrastructure: Hadoop/DW; Analytic database (x3); Reporting (x3); Analytic DB; Data archive; Reporting/BI]

Page 30: Smarter Management for Your Data Growth

RainStor’s Focus

Communications - OSS - BSS - ISS

Multi-billions of records; Strict Compliance; RDBMS’s Break; Analytics Required

10’s of Petabytes Retained

Volumes are rising - Regulated - Infrastructure needs - Reaching Telco-scale

Security

Network Forensics; Cyber-security

Utilities - SmartGrid - e Meter

Big Data Volumes - Needs to be online & Query-able

Found the needle – where’s the haystack?

Data security will account for over 60% of new enterprise security spending in next 3 years

Global mobile data traffic will grow 26-fold between 2010 and 2015! (6.3 Exabytes p/mth)

SmartGrid to Generate 1 Exabyte of Data in US Alone, Next 2 Years
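To put the 26-fold mobile traffic figure on this slide in annual terms, a short calculation helps. It assumes the growth compounds evenly over the five years from 2010 to 2015, which the slide does not state; the variable names are illustrative.

```python
# 26-fold growth between 2010 and 2015, assumed to compound evenly per year.
growth_factor = 26
years = 2015 - 2010

annual_multiplier = growth_factor ** (1 / years)  # growth per year, as a factor
cagr = annual_multiplier - 1                      # compound annual growth rate

print(f"Traffic multiplies by about {annual_multiplier:.2f}x each year "
      f"({cagr:.0%} compound annual growth)")
```

Under that assumption traffic roughly doubles every year, which is the scale of growth the retention repository is being positioned against.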

Page 31: Smarter Management for Your Data Growth

How Does RainStor Do It?

Reduce
SIZE: Massive de-dupe, ~97% savings in storage
HARDWARE: On commodity server/disk infrastructure
RESOURCES: Without specialist DBA support

Retain
PRESERVED: Massive record volumes in original form
IMMUTABLE: Tamper proofed with audit trail
CONFIGURABLE: With retention & expiry policies

Retrieve
STANDARDS: SQL & BI tools via ODBC/JDBC
PERFORMANT: Fast queries for large complex data sets
FLEXIBLE: With schema evolution & point-in-time access
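The slide does not say how the "IMMUTABLE: Tamper proofed with audit trail" property is achieved. One common general technique for a tamper-evident audit trail is a hash chain, sketched below with Python's standard hashlib. This is an assumption for illustration only, not a description of RainStor's implementation; the function names and log entries are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry

def append(log, entry):
    """Append an entry whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    payload = json.dumps(entry, sort_keys=True)
    digest = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
    log.append({"entry": entry, "prev": prev, "hash": digest})

def verify(log):
    """Recompute every hash; any edited entry breaks the chain."""
    prev = GENESIS
    for item in log:
        payload = json.dumps(item["entry"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()
        if item["prev"] != prev or item["hash"] != expected:
            return False
        prev = item["hash"]
    return True

log = []
append(log, {"action": "load", "rows": 1000})
append(log, {"action": "query", "user": "auditor"})
ok_before = verify(log)

log[0]["entry"]["rows"] = 999  # simulate tampering with a stored entry
ok_after = verify(log)
print(ok_before, ok_after)
```

Because each hash depends on everything before it, changing an old entry invalidates its own hash and every later one, which is what makes the trail tamper-evident rather than merely append-only.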

Page 32: Smarter Management for Your Data Growth

RainStor’s Disruptive Technology

Patented – 4 layers of compression

Data Reduction through value and pattern de-duplication

Further Algorithmic-level and byte-level compression

Fast Queries in stored format without re-inflation.

[Diagram: de-duplication example using the repeated records "Peter Smith, Pharma, $40,000" and "Paul, Finance, $35,000" and the values "John" and "Brown"]
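The value and pattern de-duplication named on this slide can be illustrated with a simple dictionary encoding. The sketch below is not RainStor's patented method, which the deck does not detail. It only shows the general idea of storing each distinct value once and answering a query on the encoded form without rebuilding the records. The split of the sample records into three fields and the function names are assumptions.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str, str]

def encode(records: List[Record]) -> Tuple[List[str], List[Tuple[int, ...]]]:
    """Store each distinct field value once; rows become tuples of value IDs."""
    values: List[str] = []      # unique values, in first-seen order
    ids: Dict[str, int] = {}    # value -> position in `values`
    rows: List[Tuple[int, ...]] = []
    for record in records:
        row = []
        for field in record:
            if field not in ids:
                ids[field] = len(values)
                values.append(field)
            row.append(ids[field])
        rows.append(tuple(row))
    return values, rows

def count_where(values: List[str], rows: List[Tuple[int, ...]],
                column: int, wanted: str) -> int:
    """Count matching rows by comparing IDs, without re-inflating records."""
    if wanted not in values:
        return 0
    wanted_id = values.index(wanted)
    return sum(1 for row in rows if row[column] == wanted_id)

records = [
    ("Peter Smith", "Pharma", "$40,000"),
    ("Peter Smith", "Pharma", "$40,000"),
    ("Paul", "Finance", "$35,000"),
    ("Peter Smith", "Pharma", "$40,000"),
    ("Paul", "Finance", "$35,000"),
]
values, rows = encode(records)
distinct_patterns = len(set(rows))  # repeated rows collapse to one pattern
pharma_count = count_where(values, rows, 1, "Pharma")
print(len(values), distinct_patterns, pharma_count)
```

Five records with fifteen field values reduce to six stored values and two row patterns, and the count query runs entirely on integer IDs.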

Page 33: Smarter Management for Your Data Growth

Analytics/DW

Offload Warehouse Data to Online Archive

High Performance & Lower Cost

Augment existing warehouse & analytics systems by providing access to years of history

Run query on RainStor and import results to data warehouse

Re-instate data from data retention repository back to warehouse for deep analytics

Benefits:
- Lower TCO (Admin, Storage, CPU)
- Compliant data retention
- Unlimited scalability
- Add more data sources for broader analysis

[Diagram: Source DB (e.g. Oracle); 5 Quarters; 50 Quarters]

Page 34: Smarter Management for Your Data Growth

RainStor Cloud

[Diagram: Amazon (EC2, S3); VM Software Appliance; ODBC/JDBC; steps labelled Send, Store, Search]

1. Compressed de-duplicated data sent to the cloud resulting in quicker and cheaper uploads.

2. Encrypted data stored in private containers ensuring security and easy management.

3. Data accessed on demand using standard SQL tools leveraging elasticity of the cloud

Page 35: Smarter Management for Your Data Growth

How Do the Economics Stack Up?

Key Criteria                     Standard RDBMS / Warehouse    RainStor
Storage                          2 PB                          100 TB (20x compression)
Servers for Data Load & Query    100                           10
Admin                            Multiple DBA’s                No Design, Tuning or Maintenance
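The figures in this comparison can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 1000 TB) and, purely as an illustration, applies the $20k-$50k per TB EDW cost range quoted earlier in the deck to the 2 PB figure; the deck itself does not make that calculation, and the variable names are hypothetical.

```python
TB_PER_PB = 1000  # assumption: decimal units

standard_storage_tb = 2 * TB_PER_PB
compression_ratio = 20
rainstor_storage_tb = standard_storage_tb / compression_ratio
storage_saving = 1 - 1 / compression_ratio  # fraction of storage avoided

standard_servers, rainstor_servers = 100, 10
server_reduction = standard_servers / rainstor_servers

# Illustrative only: EDW cost range from the earlier slide applied to 2 PB.
edw_cost_low = standard_storage_tb * 20_000
edw_cost_high = standard_storage_tb * 50_000

print(f"{rainstor_storage_tb:.0f} TB after {compression_ratio}x compression "
      f"({storage_saving:.0%} saving), {server_reduction:.0f}x fewer servers")
print(f"2 PB at EDW rates: ${edw_cost_low:,} to ${edw_cost_high:,}")
```

Note that 20x compression corresponds to a 95% storage saving, slightly below the ~97% de-dupe figure quoted on the earlier slide, so the two numbers presumably describe different data sets or scenarios.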

Page 36: Smarter Management for Your Data Growth


Quick summary

The growing volume, variety and velocity of data is a problem, but it is also an opportunity

Requires a broader approach to data management

Deploy appliances and Hadoop for specific use-cases, and online repository for historical data

‘Datastructure’ will become increasingly valuable, not only as a source of data but also as a source of intelligence

Data location, and the role of data virtualization will come into greater focus

Page 37: Smarter Management for Your Data Growth

Q&A

Page 38: Smarter Management for Your Data Growth


FULL TIME

Thank you