Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Hadoop Summit 2013 - June 26th, 2013
Sunilkumar Kakade – Director IT
Aashish Chandra – DVP, Legacy Modernization

Upload: hadoopsummit

Post on 22-Jan-2015


DESCRIPTION

In spite of recent advances in computing, many core business processes are still batch-oriented and run on mainframes. Annual mainframe costs run to six figures or more and can grow with capacity needs. To tackle the cost challenge, many organizations have considered or attempted multi-year mainframe migration or re-hosting strategies. Traditional approaches to mainframe elimination call for large initial investments and carry significant risks: it is hard to match mainframe performance and reliability. Using Hadoop, Sears/MetaScale developed an innovative alternative that enables batch processing migration to Hadoop without the risks, time, and costs of other methods. This solution has been adopted in multiple businesses with excellent results and associated cost savings, as mainframes are physically eliminated or downsized. Millions of dollars in savings based on MIPS reductions have been seen; a reduction of 200 MIPS can yield $1 million in annual savings. MetaScale eliminated over 900 MIPS and an entire mainframe system for one Fortune 500 client. This presentation illustrates the reference architecture and approach successfully used by MetaScale to move mainframe processing to the Hadoop platform without altering user-facing business applications.
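The savings figure quoted in the description can be turned into a rough per-MIPS rate. The sketch below is a back-of-the-envelope calculation in Python. It assumes that savings scale linearly with the MIPS removed, which the description does not claim; the only figure taken from the source is the 200 MIPS reference point. The helper name `estimated_annual_savings` is hypothetical.

```python
# Rough estimate only: assumes annual savings scale linearly with MIPS removed.
# The only figure taken from the description is "200 MIPS can yield $1 million".

REFERENCE_MIPS = 200
REFERENCE_SAVINGS_USD = 1_000_000


def estimated_annual_savings(mips_removed: float) -> float:
    """Scale the 200 MIPS -> $1M reference point linearly (hypothetical helper)."""
    return mips_removed * REFERENCE_SAVINGS_USD / REFERENCE_MIPS


if __name__ == "__main__":
    for mips in (200, 900):
        print(f"{mips} MIPS -> ${estimated_annual_savings(mips):,.0f} per year")
```

Under that linear assumption, 900 MIPS corresponds to roughly $4.5 million per year. Real savings depend on licensing and contract terms, so treat the result as an order-of-magnitude guide.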

TRANSCRIPT

Page 1: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Hadoop Summit 2013 - June 26th, 2013

Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Sunilkumar Kakade – Director IT
Aashish Chandra – DVP, Legacy Modernization

Page 2: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Legacy Rides The Elephant

Hadoop is disrupting enterprise IT processing.

Page 3: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Recognition - Contributors

• Our Leaders
  • Ted Rudman
  • Aashish Chandra

• Team
  • Simon Thomas
  • Sunil Kakade
  • Susan Hsu
  • Bob Pult
  • Kim Havens
  • Murali Nandula
  • Willa Tao
  • Arlene Pynadath
  • Nagamani Banda
  • Tushar Tanna
  • Kesavan Srinivasan

Page 4: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

The Enterprise Challenge

• Growing Data Volumes
• Shortened Processing Windows
• Escalating Costs
• Hitting Scalability Ceilings
• Demanding Business Requirements
• ETL Complexity
• Latency in Data
• Tight IT Budgets

Page 5: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Migration - Overview

• In spite of recent advances in computing, many core business processes are batch-oriented and run on mainframes.

• Annual mainframe costs run to six figures or more, potentially growing with capacity needs. To tackle the cost challenge, many organizations have considered or attempted multi-year mainframe migration/re-hosting strategies.

Page 6: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Batch Processing Characteristics

Characteristics*
• Large amounts of input data are processed and stored (perhaps terabytes or more).
• Large numbers of records are accessed, and a large volume of output is produced.
• Immediate response time is usually not a requirement; however, jobs must complete within a “batch window”.
• Batch jobs are often designed to run concurrently with online transactions with minimal resource contention.

*Ref.: IBM Redbook

Page 7: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Batch Processing Characteristics

Key infrastructure requirements:

• Sufficient data storage
• Available processor capacity, or cycles
• Job scheduling
• Programming utilities to process basic operations (Sort/Filter/Split/Copy/Unload etc.)
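To make the "basic operations" concrete, here is a minimal Python sketch of sort, filter, and split applied to fixed-width records. It is an illustration only: the function names are hypothetical, the records live in memory, and nothing here reflects how the mainframe utilities or MetaScale's tooling are implemented.

```python
# Illustrative only: in-memory stand-ins for mainframe-style batch utilities.
# Records are fixed-width strings; field positions are 1-based, as in SORT control cards.
from typing import Callable, Iterable, List, Tuple


def field(record: str, start: int, length: int) -> str:
    """Return the field beginning at 1-based position `start` with the given length."""
    return record[start - 1 : start - 1 + length]


def sort_records(records: Iterable[str], start: int, length: int) -> List[str]:
    """SORT: order records ascending by one character field."""
    return sorted(records, key=lambda r: field(r, start, length))


def filter_records(records: Iterable[str], keep: Callable[[str], bool]) -> List[str]:
    """FILTER: keep only records for which the predicate is true."""
    return [r for r in records if keep(r)]


def split_records(
    records: Iterable[str], to_first: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """SPLIT: route each record to one of two outputs."""
    first: List[str] = []
    second: List[str] = []
    for r in records:
        (first if to_first(r) else second).append(r)
    return first, second


if __name__ == "__main__":
    data = ["0003EAST", "0001WEST", "0002EAST"]
    ordered = sort_records(data, start=1, length=4)
    east, other = split_records(ordered, lambda r: field(r, 5, 4) == "EAST")
    print(ordered, east, other)
```

Each operation is a pass over a stream of records with no shared state between records, which is the property that lets such steps be spread across many machines.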

Page 8: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Why Hadoop and Why Now?

THE ADVANTAGES:
• Cost reduction
• Alleviate performance bottlenecks
• ETL too expensive and complex
• Mainframe and Data Warehouse processing → Hadoop

THE CHALLENGE:
• Traditional enterprises' lack of awareness

THE SOLUTION:
• Leverage the growing support system for Hadoop
• Make Hadoop the data hub in the Enterprise
• Use Hadoop for processing batch and analytic jobs

Page 9: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

The Architecture

• Enterprise solutions using Hadoop must be an eco-system

• Large companies have a complex environment:
  • Transactional system
  • Services
  • EDW and Data marts
  • Reporting tools and needs

• We needed to build an entire solution

Page 10: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

MetaScale’s Hadoop Ecosystem

Page 11: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Hadoop based Ecosystem for Legacy System Modernization

[Diagram: two technology stacks shown side by side.
Stack with MySQL: Enterprise Systems; JQUERY/AJAX; Quartz; JAXB; REST API; JDBC/IBATIS; J2EE/JBOSS/SPRING; Batch Processing with HIVE, RUBY/MAPREDUCE, HADOOP/PIG.
Stack with DB2: Enterprise Systems; JQUERY/AJAX; Quartz; JAXB; REST API; JDBC/IBATIS; J2EE/WebSphere; Mainframe Batch Processing with VSAM, COBOL/JCL.]

Page 12: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Batch Processing Architecture

[Diagram. Components: User Interface, Data Sources, Historical Data Sources, Batch Processing, Datawarehouse, Data Retention, External Systems. Flows between components are labeled "Input" and "Resultant Data".]

Page 13: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

MetaScale Batch Processing Architecture With Hadoop

[Diagram. Components: User Interface, Data Sources, Hadoop EcoSystem with Map Reduce based Batch Processing, External Systems/Datawarehouse. Flows are labeled "Input", "Move to Hadoop", "Resultant Data", and "Move to Non-Hadoop platform".]

Page 14: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Typical Batch Processing Units (JCL) on Mainframe

[Diagram: Mainframe Batch Processing Flow. User Interface and Data Sources supply Input to Batch Processing; Resultant Data goes to External Systems/Datawarehouse. The Batch Processing - JOB FLOW panel shows JCL1 - APPLICATION 1, JCL2 - APPLICATION 1, and JCL3 - APPLICATION 2, built from the steps SORT, SPLIT, SORT, COBOL, FILTER, FORMAT, COPY, COBOL, FORMAT, and LOAD TO DATABASE, with each step's output passed on as Input.]

Page 15: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Batch Processing Migration With Hadoop

Seamless migration of high MIPS processing jobs with no application alteration

Invention - Migration methodology for Legacy Applications to Commodity Hardware

[Diagram: Batch Processing - JOB FLOW on a commodity hardware based software framework. User Interface and Data Sources supply Input; Resultant Data goes to External Systems/Datawarehouse. In Batch Process - APPLICATION 1, the six steps are each labeled PIG/MR. The legacy platform job flow still shows JCL2 - APPLICATION 1 and JCL3 - APPLICATION 2, with the steps COPY, COBOL, FORMAT, and LOAD TO DATABASE.]

Page 16: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe to Hadoop-PIG conversion example

Mainframe JCL

//PZHDC110 EXEC PGM=SORT
//SORTIN   DD DSN=PZ.THDC100.PLMP.PRC,
//            DISP=(OLD,DELETE,KEEP)
//SORTOUT  DD DSN=PZ.THDC110.PLMP.PRC.SRT,LABEL=EXPDT=99000,
//            DISP=(,CATLG,DELETE),
//            UNIT=CART,
//            VOL=(,RETAIN),
//            RECFM=FB,LRECL=40
//SYSIN    DD DSN=KMC.PZ.PARMLIB(PZHDC11A),
//            DISP=SHR
//SYSOUT   DD SYSOUT=V
//SYSUDUMP DD SYSOUT=D
//*__________________________________________________
//* SORT FIELDS=(1,9,CH,A)

- 500 Million Records sort took 45 minutes of clock time on A168 mainframe

PIG

a = LOAD 'data' AS (f1:chararray);
b = ORDER a BY f1;

- 500 Million Records sort took less than 2 minutes

More benchmarking studies in progress
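For readers unfamiliar with the sort control card, `SORT FIELDS=(1,9,CH,A)` asks for an ascending character sort on the 9 bytes starting at position 1 of each record. The Python sketch below applies the same rule to 40-byte fixed-length records (matching `RECFM=FB,LRECL=40` in the JCL). It is a small in-memory illustration with hypothetical function names, not a replacement for either the mainframe sort or the Pig job, and it says nothing about the timings quoted above.

```python
# Sketch of what SORT FIELDS=(1,9,CH,A) asks for, applied to fixed-length records.
# Assumption: records are 40 bytes each (LRECL=40) with no line terminators.
# Note: this compares raw byte values; a mainframe CH sort uses the EBCDIC
# collating sequence, so the ordering of mixed letters and digits can differ.
from typing import List

LRECL = 40       # record length from the JCL
KEY_START = 1    # 1-based start position of the sort key
KEY_LENGTH = 9   # key length in bytes


def read_fixed_records(blob: bytes, lrecl: int = LRECL) -> List[bytes]:
    """Cut a byte string into fixed-length records."""
    if len(blob) % lrecl != 0:
        raise ValueError("input length is not a multiple of the record length")
    return [blob[i : i + lrecl] for i in range(0, len(blob), lrecl)]


def sort_by_key(records: List[bytes]) -> List[bytes]:
    """Sort ascending on bytes KEY_START .. KEY_START + KEY_LENGTH - 1."""
    lo = KEY_START - 1
    return sorted(records, key=lambda rec: rec[lo : lo + KEY_LENGTH])


if __name__ == "__main__":
    keys = [b"000000003", b"000000001", b"000000002"]
    blob = b"".join(k.ljust(LRECL, b".") for k in keys)
    for rec in sort_by_key(read_fixed_records(blob)):
        print(rec[:KEY_LENGTH].decode("ascii"))
```

The Pig script expresses the same intent declaratively: `ORDER ... BY` names the key, and the framework decides how to partition and parallelize the sort.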

Page 18: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Migration – Value Proposition

Drivers: High TCO; Resource Crunch; Inert Business Practices

Mainframe Migration paths:

• Optimize (Mainframe Optimization)
  - 5% ~ 10% MIPS reduction
  - Quick wins with low-hanging fruit

• Convert (Mainframe ONLINE)
  - Tool-based conversion
  - Convert COBOL & JCL to Java

• Pig / Hadoop Rewrites (Mainframe BATCH)
  - ETL modernization
  - Move batch processing to Hadoop

Outcomes: Cost Savings; Open Source Platform; Simpler & Easier Code; Business Agility; Business & IT Transformation; Modernized Systems; IT Efficiencies

Companies can SAVE 60% ~ 80% of their Mainframe Costs with Modernization

BATCH processing typically consumes 60% ~ 65% of mainframe MIPS

Estimated 45% of FUNCTIONALITY in mainframes is never used

Page 19: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Migration – Traditional Approach

• Traditional approaches to mainframe elimination call for large initial investments and carry significant risks: it is hard to match mainframe performance and reliability.

• Many organizations still use mainframes for batch processing applications. Several solutions have been proposed to move expensive mainframe computing to other distributed proprietary platforms; most of them rely on end-to-end migration of applications.

Page 20: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Batch Processing MetaScale Architecture

• Using Hadoop, Sears/MetaScale developed an innovative alternative that enables batch processing migration to the Hadoop ecosystem, without the risks, time, and costs of other methods.

• The solution has been adopted in multiple businesses with excellent results and associated cost savings, as mainframes are physically eliminated or downsized. Millions of dollars in savings based on MIPS reductions have been seen.

Page 21: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

MetaScale Mainframe Migration Methodology

1. Implement a Hadoop-centric reference architecture
2. Move enterprise batch processing to Hadoop
3. Make Hadoop the single point of truth
4. Massively reduce ETL by transforming within Hadoop
5. Move results and aggregates back to legacy systems for consumption
6. Retain, within Hadoop, source files at the finest granularity for re-use

Key to our approach:
1) allowing users to continue to use familiar consumption interfaces
2) providing inherent HA
3) enabling businesses to unlock previously unusable data
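The data flow behind steps 2 through 6 can be sketched as a toy pipeline: land source records unchanged, aggregate them in place, and hand only the aggregates back to a legacy consumer. The Python below is a minimal illustration under stated assumptions. The class `HadoopHub` and its methods are hypothetical names, and in-memory lists and dicts stand in for HDFS, Pig/MapReduce jobs, and legacy databases.

```python
# Toy illustration of the flow: land raw data, transform in place, export aggregates.
# All names here (HadoopHub, land, transform, export_aggregates) are hypothetical;
# in-memory structures stand in for HDFS, Pig/MapReduce, and legacy targets.
from collections import defaultdict
from typing import Dict, List, Tuple

Sale = Tuple[str, int]  # (store_id, amount)


class HadoopHub:
    def __init__(self) -> None:
        self.raw: List[Sale] = []         # step 6: source data kept at finest granularity
        self.totals: Dict[str, int] = {}  # output of the step 4 transformation

    def land(self, records: List[Sale]) -> None:
        """Copy source records into the hub unchanged (steps 2 and 3)."""
        self.raw.extend(records)

    def transform(self) -> None:
        """Aggregate inside the hub instead of in a separate ETL tier (step 4)."""
        totals: Dict[str, int] = defaultdict(int)
        for store_id, amount in self.raw:
            totals[store_id] += amount
        self.totals = dict(totals)

    def export_aggregates(self) -> List[Sale]:
        """Return only the aggregates, for loading into a legacy system (step 5)."""
        return sorted(self.totals.items())


if __name__ == "__main__":
    hub = HadoopHub()
    hub.land([("S1", 10), ("S2", 5), ("S1", 7)])
    hub.transform()
    print(hub.export_aggregates())
```

Because the raw records stay in the hub after the aggregates are exported, a new report can be computed later from the same data without going back to the source systems.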

Page 22: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Mainframe Migration - Benefits

“MetaScale is the market leader in moving mainframe batch processing to Hadoop”

• Significant reduction in ISV costs & mainframe software license fees

• Open Source platform

• Saved ~ $2MM annually within 13 weeks by MIPS Optimization efforts

• Reduced 1000+ MIPS by moving batch processing to Hadoop

• Modernized COBOL, JCL, DB2, VSAM, IMS & so on

• Reduced batch processing in COBOL/JCL from over 6 hrs to less than 10 min in Pig Latin on Hadoop

• Simpler and more easily maintainable code

• Massively Parallel Processing

• Readily available resources & commodity skills

• Access to latest technologies

• IT Operational Efficiencies

• Moved 7000 lines of COBOL code to under 50 lines in Pig

• Ancient systems are no longer a bottleneck for the business

• Faster time to market

• Mission-critical “Item Master” application in COBOL/JCL being converted to Java by our tool (JOBOL)

Benefit categories: Cost Savings; Transform I.T.; Skills & Resources; Business Agility

Page 23: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Summary

• Hadoop can revolutionize enterprise workloads and make the business agile
• Can reduce strain on legacy platforms
• Can reduce cost
• Can bring new business opportunities

• Must be an eco-system
• Must be part of an overall data strategy
• Not to be underestimated

Page 24: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

The Learning

HADOOP
• We can dramatically reduce batch processing times for mainframe and EDW
• We can retain and analyze data at a much more granular level, with longer history
• Hadoop must be part of an overall solution and eco-system

IMPLEMENTATION
• We can reliably meet our production deliverable time-windows by using Hadoop
• We can largely eliminate the use of traditional ETL tools
• New tools allow improved user experience on very large data sets

UNIQUE VALUE
• We developed tools and skills – the learning curve is not to be underestimated
• We developed experience in moving workload from expensive, proprietary mainframe and EDW platforms to Hadoop with spectacular results
• Over two years of experience using Hadoop for enterprise legacy workloads

Page 25: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

The Horizon – What do we need next?

• Automation tools and techniques that ease the Enterprise integration of Hadoop

• Educate traditional Enterprise IT organizations about the possibilities and reasons to deploy Hadoop

• Continue development of a reusable framework for legacy workload migration

Page 26: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

Legacy Modernization Service Offerings

• Leveraging our patent-pending and award-winning niche products, we reduce mainframe MIPS, modernize ETL processing, and transform business and IT organizations to open source, cloud-based, Big Data, and agile platforms

• MetaScale Legacy Modernization offers the following services:

Legacy Modernization Assessment Services

Mainframe Migration Services
• MIPS Reduction Services
• Mainframe Application Migration

Legacy Distributed Modernization
• ETL Modernization Services
• Modernize Proprietary Systems and Databases

Managed Applications Support

Support Transition Services

Page 27: Move to Hadoop, Go Faster and Save Millions - Mainframe Legacy Modernization

For more information, visit:

www.metascale.com

Follow us on Twitter @LegacyModernizationMadeEasy

Join us on LinkedIn: www.linkedin.com/company/metascale-llc

Legacy Modernization Made Easy!