Getting Started Using Database Archiving
Toronto DAMA Chapter Meeting, 16 September, 2009
Jack E. Olson
jack.olson@SvalTech.com
www.svaltech.com
SvalTech
“Database Archiving: How to Keep Lots of Data for a Long Time,” Jack E. Olson, Morgan Kaufmann, 2008
Copyright SvalTech, Inc., 2009



Why This Presentation

• A common position of many IT shops is

– We know we should be doing database archiving

– We know it will be valuable to us

– But we don’t know how to get started

• Database archiving is an enterprise technology: it can be used in many applications

• Not all database applications are suitable for database archiving

• Suitable applications have widely differing return-on-investment potential

The Database Archiving Survey

organize survey team

application enumeration

first-cut feasibility

data-life-cycle analysis

operational analysis

risk analysis

metric gathering

evaluate implementation options

business case development

prioritization

The Survey Organization

Mandate, People, Inputs

Mandate

A management directive that creates the database archiving survey task force and gives them the scope and objectives of the study.

Scope: business units to include, organizational units (divisions, companies, campuses)

Objectives: find best candidates for cost reduction, fixing operational problems, risk reduction

The Survey Team

Mandate, People, Inputs

Chair

Full-time members
– IT/enterprise architect
– storage administration
– records retention

Subject matter members
– database architect
– data management
– business unit data analyst
– database administration

Incidental members
– legal department
– IT compliance
– data governance
– security administration
– data analyst (BI type)

Starting Materials

Mandate, People, Inputs

– Enterprise data model
– Data classification results
– SLAs
– IT storage strategy
– Regulations/compliance rules
– Data governance mandates

Application Enumeration

Limit search to those within the mandate
– business unit
– location
– enterprise

Identify operational applications
– classify as transactional vs. static data
– include those already archiving to any extent

Identify retired applications still retaining data

Identify applications about to change
– consolidations planned
– planned or recent acquisitions
– replacements/conversions/reengineering
– identify any strategies for application retirement

Application Enumeration

For applications with potential:

– Capture application data model
– Identify business records within the data model
– Connect business records to records retention and legal categories
– Identify database information: system/DBMS/file/metadata
– Create a Database Topology chart
– Identify parallel applications within the corporation (even if out of scope)
– Identify operational replicates
– Identify backup/disaster recovery stores and strategies
– Identify recurring data extracts for BI, etc.
– Get a rough idea of database size and transaction rates

Database Topology Chart

[Chart: database topology. Node labels: create data, operational, operational replicate, BI stores, CRM, archive, backup (shown twice), disaster recovery, offline storage.]

First-Cut Feasibility

Factors for continuing to consider:
– important data
– lots of data
– lots of individual business records
– simple data structures
– relatively stable data structures (little change)
– long retention requirement
– long inactive period within retention requirement
– low-frequency access requirement in inactive period
– low performance requirement in inactive period
– simple access requirements in inactive period

Apply criteria after each subsequent step to further eliminate bad candidates
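
The factors above can be treated as a screening checklist that is re-applied after each survey step. The following is only a sketch of that idea: the field names, the equal weighting of factors, and the cut-off of 7 out of 10 are illustrative assumptions, not something the presentation specifies.

```python
from dataclasses import dataclass, fields


@dataclass
class FeasibilityFactors:
    """One yes/no answer per first-cut factor (field names are illustrative)."""
    important_data: bool
    lots_of_data: bool
    many_business_records: bool
    simple_structures: bool
    stable_structures: bool
    long_retention: bool
    long_inactive_period: bool
    low_access_frequency_when_inactive: bool
    low_performance_need_when_inactive: bool
    simple_access_when_inactive: bool


def feasibility_score(factors: FeasibilityFactors) -> int:
    """Count the factors that hold; equal weighting is an assumption."""
    return sum(bool(getattr(factors, f.name)) for f in fields(factors))


def keep_candidate(factors: FeasibilityFactors, threshold: int = 7) -> bool:
    """Keep the application in the survey if enough factors hold.

    The threshold is a hypothetical cut-off; a survey team would choose its own.
    """
    return feasibility_score(factors) >= threshold


# Example: a transaction store for which every factor holds.
card_transactions = FeasibilityFactors(*([True] * 10))
print(feasibility_score(card_transactions), keep_candidate(card_transactions))
```

A team could replace the equal weighting with weights of its own, or treat some factors (for example, long retention requirement) as mandatory.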

Examples

Good
– Bank deposits and withdrawals
– Stock trades
– Credit card transactions
– Ticketmaster transactions
– Medical claim data
– Casualty claim data (auto, home)
– Retail sales inventory transactions
– Package tracking
– Passenger flight data
– Driver license records
– Sales tax records
– Property tax records
– Telephone call transactions
– Nuclear reactor monitoring records
– Auto warranty records

Not Good
– Customer master data
– Airplane manufacturing records
– HR records
– Felony records
– Home sales

Data Life Cycle Analysis

Create a database archiving data life cycle analysis (DLCA) for each business record type:

– Data Retention Chart
– Business Record Process Chart to determine inactive period
– Business Record SLA Chart by age of record

Data Retention Chart

The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed until after the time for all such requirements applicable to it has passed.

Business Requirements

Regulatory Requirements

The Data Retention requirement is the longest of all requirement lines.
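
As a small illustration of the "longest of all requirement lines" rule, the sketch below takes the maximum over a set of requirement periods. The requirement names and year counts are made up for the example, and measuring retention in whole years from the creation date is an assumption.

```python
from datetime import date


def retention_years(requirements: dict[str, int]) -> int:
    """Return the governing retention period: the longest applicable requirement."""
    if not requirements:
        raise ValueError("at least one retention requirement is needed")
    return max(requirements.values())


def earliest_destroy_date(created: date, requirements: dict[str, int]) -> date:
    """Date on which the governing retention period ends.

    Adds whole years to the creation date; 29 February falls back to 28 February.
    """
    years = retention_years(requirements)
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        return created.replace(year=created.year + years, day=28)


# Hypothetical requirement lines for one business record type (in years).
requirements = {"business": 3, "tax regulation": 7, "industry regulation": 5}
print(retention_years(requirements))
print(earliest_destroy_date(date(2009, 9, 16), requirements))
```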

Business Record Process Chart

(for a single instance of a data object)

[Chart: activity against one business record over time, within its retention requirement. The time axis is divided into operational, reference, and inactive phases.]

Activities shown:
– Create PO, Update PO, Create Invoice, Backorder, Create Financial Record, Update on Ship, Update on Ack
– Weekly Sales Report, Quarterly Sales Report
– Extract for data warehouse, extract for business analysis, common customer queries, common business queries
– Ad hoc requests, lawsuit e-Discovery requests, investigation data gathering
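
Once the process chart has established how long each phase lasts, placing a record in a phase is a simple comparison. The sketch below assumes the phases can be expressed as fixed lengths in days from record creation; in practice the boundaries may depend on business events rather than age alone, and the durations used here are hypothetical.

```python
def life_cycle_phase(age_days: int, operational_days: int,
                     reference_days: int, retention_days: int) -> str:
    """Classify a record as operational, reference, inactive, or past retention.

    operational_days: length of the operational phase
    reference_days:   length of the reference phase that follows it
    retention_days:   total retention requirement, measured from creation
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < operational_days:
        return "operational"
    if age_days < operational_days + reference_days:
        return "reference"
    if age_days < retention_days:
        return "inactive"
    return "past retention"


# Hypothetical purchase order: 90 days operational, 1 year reference,
# 7 years total retention.
for age in (30, 200, 1000, 3000):
    print(age, life_cycle_phase(age, 90, 365, 7 * 365))
```

Records in the inactive phase are the ones the survey treats as candidates for the archive.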

Business Record SLA Chart by Age

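(for a single instance of a data object)

[Chart: service levels by age of record, within its retention requirement. The time axis is divided into operational, reference, and inactive phases. Curves shown: query response time; transaction volume (create/update and read); security (no. of users).]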

Operational Analysis

Don’t assume there are no problems.

Talk to DBAs and users.

Look for trends.

Look for escalating operational costs.

Get numbers.

Operational Analysis

• Performance Issues

– Not meeting response time SLA

– Longer time to run extracts

– Longer time to run backups

– Longer time to run database reorganizations

– Running reorganizations more frequently

– More difficult to tune

• Risk Issues

– Longer estimated time to run recovery

– Longer estimated time to run disaster recovery

• Cost Issues

– Higher annual hardware costs

– Higher annual MIP-based software cost

– Adding expensive DASD to support database and backups

Risk Analysis

• Data Loss Risk

– Isolation from internet hackers

– Prevent ANY updates or deletes

– Preserve data through multi-site backups and periodic pings

• Data Quality Risk

– Changing data structures and column semantics

– Changing reference data

• Unauthorized Access Risk

– Reduced (or different) user set

– Audit trail of access

• Legal Risk

– Preserve authenticity of data in archive

– Reduce cost and time to produce data for discovery requests

Metric Gathering

Data
– bytes stored per business object
– new transactions created per day
– bytes for backups, replicates
– growth in transaction rates
– any sudden expected additions
– past history plus future projections

Storage Costs
– cost per byte: operational
– cost per byte: backup
– cost per byte: archive
– archive compression ratios

System Costs
– MIPS required to process
– software license fees
– staff for operational
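
The storage metrics above combine into a rough before-and-after cost comparison. The sketch below is one way to do the arithmetic; all figures are invented, costs are expressed per GB rather than per byte for readability, and the model ignores backups of the archive itself, which is a simplifying assumption.

```python
def storage_cost(data_gb: float, backup_copies: int,
                 cost_op_per_gb: float, cost_backup_per_gb: float) -> float:
    """Cost of data held in the operational database plus its backup copies."""
    return data_gb * (cost_op_per_gb + backup_copies * cost_backup_per_gb)


def cost_with_archive(total_gb: float, inactive_fraction: float,
                      backup_copies: int, cost_op_per_gb: float,
                      cost_backup_per_gb: float, cost_archive_per_gb: float,
                      compression_ratio: float) -> float:
    """Cost after moving the inactive fraction into a compressed archive."""
    active_gb = total_gb * (1 - inactive_fraction)
    archived_gb = total_gb * inactive_fraction / compression_ratio
    return (storage_cost(active_gb, backup_copies,
                         cost_op_per_gb, cost_backup_per_gb)
            + archived_gb * cost_archive_per_gb)


# Hypothetical inputs: 1000 GB database, 75% inactive, two backup copies,
# per-GB costs of 10 (operational), 2 (backup), 1 (archive), 4:1 compression.
before = storage_cost(1000, 2, 10.0, 2.0)
after = cost_with_archive(1000, 0.75, 2, 10.0, 2.0, 1.0, 4.0)
print(before, after, before - after)
```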

Metric Gathering

For retired applications, concentrate on
– displaced system cost
– displaced software cost
– displaced staff cost

[Diagram: a legacy stack (IBM mainframe, IMS DBMS, CICS, DBA/SYSPROG) alongside an archive stack (Linux server, archive software, JDBC, archive admin). Labels: NOT shared, Shared.]

Evaluate Implementation Options

• Software

– Vendor provided software

– Custom built solution

• Access tools

– Original application

– Generic report generation/ query tools

– Custom built

• Storage for archive

– Storage subsystem

– Hosted storage

Architecture of Database Archiving

[Diagram: Operational System and Archive Server. Labels: OP DB, application program, archive extractor (shown twice), archive catalog, archive storage, Archive Administrator, Archive Designer, Archive Data Manager, Archive Access Manager.]

Estimate Implementation Time and Cost

• Archiving systems required

– Servers

– Storage systems (hosted storage?)

– Licensed software

• Application Design

• Implementation

• Test

• Deployment

• Ongoing operation and administration

Business Case Development

– Lower IT costs

– Improved operational efficiency

– Risk reduction

Lower IT Costs

• Systems

– Reduce size/cost of operational systems

– Put off or eliminate need for system upgrades

• Software

– Eliminate or reduce cost of expensive system software

• DBMS

• Transaction system

– Eliminate or reduce cost of application software

• Storage costs

– Switch to lower cost storage

– Impact on backups/ disaster recovery stores

– Reduction in byte count stored

• Staff

– Eliminate or reduce legacy system staff

Chart it

[Chart: size today of a single operational database, compared with an operational database plus an archive.]

All data in operational db: most expensive system, most expensive storage, most expensive software.

Inactive data in archive db: least expensive system, least expensive storage, least expensive software.

In a typical operational db, 60-80% of data is inactive. This percentage is growing.

Lower IT Costs

• First year impact

• Time to recover project costs

• Chart cost savings over time

– Plot data growth over time for operational

– Plot data growth over time of archive
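
First-year impact and time to recover project costs are straightforward to compute once annual savings have been estimated. The sketch below shows the arithmetic only; the cost and savings figures are invented, and it assumes savings are credited at the end of each year.

```python
from typing import Optional, Sequence


def first_year_impact(project_cost: float, annual_savings: Sequence[float]) -> float:
    """Net effect in year one: first-year savings minus the project cost."""
    return annual_savings[0] - project_cost


def payback_year(project_cost: float,
                 annual_savings: Sequence[float]) -> Optional[int]:
    """First year (1-based) in which cumulative savings cover the project cost.

    Returns None if the cost is not recovered within the projected years.
    """
    total = 0.0
    for year, saving in enumerate(annual_savings, start=1):
        total += saving
        if total >= project_cost:
            return year
    return None


# Hypothetical project: 500,000 to implement, with savings that grow as the
# archive absorbs more inactive data each year.
savings = [200_000, 250_000, 300_000]
print(first_year_impact(500_000, savings))
print(payback_year(500_000, savings))
```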

Operational Improvements

• Itemize improvements expected

– Performance of operations

– Reduction of utility times

– Reduction of recovery times

– Reduction of disaster recovery times

– Reduction of DBA workload

• Provide cost savings where appropriate

Risk Reduction

• Itemize improvements expected

– Less risk of failing e-Discovery request

– Enhanced data quality of older data

– Less exposure to loss of data authenticity

– Better access control

– Better compliance

– Better data governance

– Less dependence on legacy systems

• Provide cost savings where appropriate

Prioritization

– Determine Prioritization Criteria

• Cost is most common primary factor

– First archiving project may have other goals

• Lower risk of failure

• Faster implementation

• Faster return on investment

• Usually a retired application project

– Risk may override other factors

• Preserve data authenticity
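
One way to express these criteria is as a sort order: candidates flagged for a risk override come first, and the rest are ranked by cost savings. This is only an illustrative sketch; the `Candidate` fields, the application names, and the savings figures are hypothetical, and it does not model the separate goals a first archiving project may have.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    annual_savings: float        # cost is the most common primary factor
    risk_override: bool = False  # e.g. data authenticity must be preserved


def prioritize(candidates: list[Candidate]) -> list[Candidate]:
    """Risk-override candidates first, then by descending annual savings."""
    return sorted(candidates,
                  key=lambda c: (not c.risk_override, -c.annual_savings))


candidates = [
    Candidate("retired billing system", 400_000),
    Candidate("order entry", 900_000),
    Candidate("claims history", 150_000, risk_override=True),
]
for c in prioritize(candidates):
    print(c.name)
```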

Final Thought

• Always do a survey to find the best applications to start with

• Always do a survey to identify those that make sense to proceed with versus those that do not: don’t waste time on apps that are too hard to implement or that will have little value.

• A good database archive application can save millions of dollars per year, increase performance of operational systems and reduce risk all at the same time. The trick is identifying them and proving it.

• Repeat the Database Archiving Survey from time to time in the future.
