
Post on 14-Dec-2015


High Availability @ KBC

Jan Tielemans

Agenda

  Back in time
    How the HA environment @ KBC looked in 2006
    Pros and cons
  Discuss the different steps (projects) to move to the HA environment of today
  Future directions/plans

Why a ‘High Availability’ environment?

Back in time (1996/7...)

Objective of the HA environment:
  Keep the 24*7 applications running during ‘technical maintenance’
  Be able to ‘redirect’ 24*7 applications to ‘one’ system in case of disaster (no DRP)
  Stable performance

How it was implemented technically:
  ‘Headoffice & Branches’ workload runs on one LPAR (M1)
  Retail workload (24*7) runs on the other LPAR (M2)
  Both ‘workloads’ share the same DB2 data
  Developed a ‘switch procedure’ to redirect the Retail workload to another machine

HA@KBC anno 2006

[Diagram: LPARs M1 and M2 with three subsystem groups: SIPC/SDPC/SMPC with IC1 and IC2 behind DNS A, IPC1/DPC1/MPC1 with IC5 and IC6 behind DNS B, and IPC2/DPC2/MPC2 with IC3 and IC4 behind DNS XXXX. WAS servers connect over TCP/IP through the AGF framework (KCF: connection cleanup after 20 min): ‘Headoffice & Branches’ to SIPC via DNS A, Retail to IPC1 via DNS B or to IPC2 via DNS XXXX. ‘Headoffice & Branches’ also connects via 3270.]

Technical environment:

DB2
  Data sharing, 2 active members and 1 sleeping member
IMS
  No shared queues
  Subsystems/regions for Retail are not the same as for ‘Headoffice & Branches’
MQ
  No shared queues
  Subsystems for Retail are not the same as for ‘Headoffice & Branches’
  For some queues we use the MQ clustering technique

Switch procedure
  Very complex and error prone
    Often resulted in unavailability of the 24*7 applications
  A lot of interaction with the WAS servers
    Redirecting the workload is done on each WAS server
    Controlled via a mainframe application
  Knowledge and maintenance rest with one person
    Mainframe application designed in NetView ($)

Static workload distribution
  Underutilization of CPU capacity at certain time periods

Availability was high, above 99%

Things changed after the initial setup in 1996
  More CPU/memory available per machine
  New (critical) applications for the ‘Headoffice & Branches’ and Retail workload
  Other workload implemented
  More and more a mix of concurrent online and batch workload
  Regulations of BASEL, CBFA, ...

Why change?
  Complex, error-prone switch procedure
  Better utilize CPU capacity
  Better use and exploit sysplex technology
  Prepare for scalability (cloned subsystems)

A 3 step approach
  1. Make Open Systems independent from Mainframe
       Dynamic Transaction routing to the Mainframe
  2. Workload Balancing Retail
  3. Workload Balancing ‘Headoffice & Branches’

Implementation DTM Switch Procedure (06/2007)

[Diagram: TCP/IP Sysplex Distributor instances labelled (DNS B M1) and (DNS B M2) on LPARs M1 and M2. Retail, BankSys and KBC Phone connect to GIPC via DNS B; ‘Headoffice & Branches’ connects to SIPC via DNS A and via 3270. Subsystem groups: SIPC/SDPC/SMPC with IC1+xit and IC2+xit (DNS A), IPC1/DPC1/MPC1 with IC5+xit and IC6+xit (DNS B), IPC2/DPC2/MPC2 with IC3+xit and IC4+xit (DNS B). AGF framework (KCF): connection time to live of 5 minutes.]

Implementation DTM Switch Procedure (06/2007)

Time-managed connections to IMS

DTM switch procedure:
  Written in automation
  Retail workload redirected in less than 10 minutes
  No interface/communication with the WAS servers
  The switch is a mainframe-ONLY operation
  Technical maintenance for M2 now starts at 13:00h instead of 22:00h
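The idea of time-managed connections (the KCF time to live of 5 minutes in the diagram) can be illustrated with a minimal Python sketch. It assumes a client-side wrapper that discards any connection older than a maximum lifetime, so that the next request reconnects through the distributor and reaches whichever system is the target at that moment. The class, its parameters and the injected clock are hypothetical; this is not the AGF/KCF code.

```python
import time


class TimeManagedConnection:
    """Hypothetical client-side wrapper: reconnect once a connection is too old."""

    def __init__(self, connect, max_age_seconds=300, clock=time.monotonic):
        self._connect = connect          # callable that opens a new connection
        self._max_age = max_age_seconds  # 300 s = the 5-minute time to live
        self._clock = clock              # injectable so the sketch can be tested
        self._conn = None
        self._opened_at = None

    def get(self):
        """Return a connection, replacing it if it has exceeded its lifetime."""
        now = self._clock()
        expired = self._conn is not None and now - self._opened_at >= self._max_age
        if self._conn is None or expired:
            self._conn = self._connect()
            self._opened_at = now
        return self._conn
```

With a lifetime of 5 minutes, every connection has been re-established at most 5 minutes after the distribution target changes, which fits the stated redirect time of less than 10 minutes without any action on the WAS side.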

Cloned IMS and MQ subsystems

Exploit the TCP/IP Sysplex Distributor

Implemented an IMS exit to:
  Map the IMS group name to an active IMS subsystem
  Check whether the IMS subsystem is active and, if not, redirect to another one

To be solved for the next step(s): workload balancing
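The two tasks of the IMS exit (map a group name to an active subsystem, redirect if the subsystem is not active) can be sketched in Python as follows. This is only an illustration of the logic, not the exit itself: the group table, the assumption that GIPC covers IPC1 and IPC2, and the `is_active` callback are all hypothetical.

```python
# Hypothetical group table. The names come from the diagrams; which
# subsystems belong to the group GIPC is an assumption.
GROUPS = {
    "GIPC": ["IPC1", "IPC2"],
}


def resolve(group, is_active, groups=GROUPS):
    """Return the first active IMS subsystem for a group name.

    `is_active` is a callable taking a subsystem name and returning a bool.
    """
    for subsystem in groups.get(group, []):
        if is_active(subsystem):
            return subsystem
    raise LookupError(f"no active IMS subsystem for group {group!r}")
```

A caller that only knows the group name keeps working when one subsystem is stopped for maintenance, because the lookup falls through to the next member of the group.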

Application changes:
  Eliminate system affinity in application logic
    Get IMSID – If substr(IMSID,3,1) eq ‘P’ then …..
  Identified applications which could suffer from DB2 data sharing
  Identified serial transactions
    How to serialize trx’s in a parallel environment?
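The IMSID test quoted above is an example of the system affinity that had to be removed: logic keyed on a character of the subsystem name changes behaviour depending on where the transaction happens to run. A minimal Python sketch of the contrast follows. The slides do not say what the ‘P’ stands for, and the `special_handling` setting is purely hypothetical.

```python
def affine_check(imsid):
    """Original style: behaviour depends on the subsystem's name.

    substr(IMSID,3,1) is 1-based, so it corresponds to index 2 in Python.
    """
    return imsid[2:3] == "P"


def affinity_free_check(settings):
    """Sketch: read an explicit setting instead of parsing the IMSID."""
    return bool(settings.get("special_handling", False))
```

With the subsystem names shown in the diagrams, `affine_check` is true for SIPC and false for IPC1, so the same transaction would take a different path on each system once the workload is balanced across both.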

Communication: presentations for different departments
Mind change

ATTENTION!

We are running in HIGH AVAILABILITY
Transactions can execute on both “online” production systems!

HA for Retail (11/2008)

[Diagram: Sysplex Distributor, WLM managed, distributes DNS B over M1 and M2. KBC Phone, BankSys and ONL (elb & ipa) / AUT / KID / IIP connect to GIPC via DNS B; ‘Headoffice & Branches’ connects to SIPC via DNS A and via 3270. Subsystem groups: SIPC/SDPC/SMPC with IC1+xit and IC2+xit (DNS A) plus IC3+xit and IC4+xit (DNS B); IPC1/DPC1/MPC1 with IC3+xit and IC4+xit (DNS B). The IPC2/DPC2/MPC2 group no longer appears. AGF framework (KCF): connection time to live of 5 minutes.]

HA for Retail (12/2008)

[Diagram: Sysplex Distributor, WEIGHTEDACTIVE, distributes DNS B over M1 and M2 with weights 10 and 90. KBC Phone, BankSys and ONL (elb & ipa) / AUT / KID / IIP connect to GIPC via DNS B; ‘Headoffice & Branches’ connects to SIPC via DNS A and via 3270. Subsystem groups: SIPC/SDPC/SMPC with IC1+xit and IC2+xit (DNS A) plus IC3+xit and IC4+xit (DNS B); IPC1/DPC1/MPC1 with IC3+xit and IC4+xit (DNS B). AGF framework (KCF): connection time to live of 5 minutes.]

IMS exit:
  Built in logic to not distribute ‘some’ serial transactions from ‘Headoffice & Branches’
  Run these serial transactions on only ONE IMS subsystem (SIPC)

Switch from the SERVERWLM to the WEIGHTEDACTIVE distribution method
  Too much important work defined in WLM
  Heterogeneous workload on M1 and M2
  What information does WLM send to the Sysplex Distributor?

WEIGHTEDACTIVE weights can easily be modified with ‘simple’ commands.
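To make the effect of fixed weights concrete, here is a simplified Python model of weighted distribution based on active connections. It assumes that each new connection goes to the target whose active-connection count is lowest relative to its weight; this is an illustration of the principle, not the actual Sysplex Distributor algorithm, and the function names are hypothetical.

```python
from fractions import Fraction


def pick_target(active, weights):
    """Pick the target with the fewest active connections relative to its weight."""
    candidates = [t for t, w in weights.items() if w > 0]
    if not candidates:
        raise ValueError("no target with a positive weight")
    # Fractions keep the comparison exact; ties are broken by target name.
    return min(candidates, key=lambda t: (Fraction(active.get(t, 0), weights[t]), t))


def distribute(n, weights):
    """Open n connections one by one and return the resulting counts per target."""
    active = {t: 0 for t in weights}
    for _ in range(n):
        active[pick_target(active, weights)] += 1
    return active
```

In this model, `distribute(100, {"M1": 10, "M2": 90})` leaves 10 connections on M1 and 90 on M2, and setting a weight to 0 takes a system out of the distribution, which is the kind of adjustment the ‘simple’ commands make possible.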

HA for Retail (12/2008)

[Diagram: Sysplex Distributor, WEIGHTEDACTIVE (DNS B over M1 and M2), with a second Sysplex Distributor label WEIGHTEDACTIVE (DNS B M1). Weights shown: 10, 90 and 100. KBC Phone, BankSys and ONL (elb & ipa) / AUT / KID / IIP connect to GIPC via DNS B; ‘Headoffice & Branches’ connects to SIPC via DNS A and via 3270. Subsystem groups: SIPC/SDPC/SMPC with IC1+xit and IC2+xit (DNS A) plus IC3+xit and IC4+xit (DNS B); IPC1/DPC1/MPC1 with IC3+xit and IC4+xit (DNS B). AGF framework (KCF): connection time to live of 5 minutes.]

HA for Retail (12/2008)

Benefits
  One environment (IPC2-DPC2-MPC2) less to manage/maintain
  ‘Retail’ workload balancing is dynamically adjustable
  Better utilization of resources (LPARs, CPU, memory, ...)
    Pre z10: 40% to M1, 60% to M2
    z10: 10% to M1, 90% to M2

(Cons)
  WLM management of the workload is not possible due to the heterogeneous workload on the system(s)

HA for ‘Headoffice & Branches’ (7/2009)

[Diagram: Sysplex Distributor, WEIGHTEDACTIVE, now distributes both DNS A and DNS B over M1 and M2, with a second distributor label (DNS A M1) (DNS B M1). Both subsystem groups, SIPC/SDPC/SMPC and IPC1/DPC1/MPC1, have IC1+xit and IC2+xit behind DNS A and IC3+xit and IC4+xit behind DNS B, and are linked by an MSC connection over an IP link. Weights shown: 10, 90, 90/10 and 100. KBC Phone, BankSys and ONL (elb & ipa) / AUT / KID / IIP connect to GIPC via DNS B; ‘Headoffice & Branches’ connects to SIPC via DNS A and via 3270. AGF framework (KCF): connection time to live of 5 minutes.]

IMS MSC (Multiple Systems Coupling)
  Defined SIPC as the base system for serial transactions
  Defined serial transactions as local / remote
  Still have (and will have) 21 serial transactions
  All serial transactions are now managed by MSC; the logic in the IMS exit has been removed
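The local/remote definition of serial transactions can be pictured with a small Python sketch. It assumes that a serial transaction always executes on the base system SIPC, locally when it arrives there and shipped over the MSC link when it arrives elsewhere, while every other transaction executes where it arrives. The transaction code `TRXSER01` is made up, and the sketch models only the routing decision, not MSC itself.

```python
BASE_SYSTEM = "SIPC"                 # base system for serial transactions (from the slide)
SERIAL_TRANSACTIONS = {"TRXSER01"}   # hypothetical transaction code


def execution_system(trancode, arrival_system):
    """Return the IMS system on which a transaction executes."""
    if trancode in SERIAL_TRANSACTIONS:
        return BASE_SYSTEM           # serial work always runs on the base system
    return arrival_system            # all other work runs where it arrives


def is_remote(trancode, arrival_system):
    """True when the transaction has to be shipped to another system."""
    return execution_system(trancode, arrival_system) != arrival_system
```

Because every instance of a serial transaction ends up on one system, serialization is preserved while the rest of the workload stays balanced across both LPARs.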

HA in Performance figures

[Diagram: two systems, SSPC (LP 18, 2094 z9, memory 40 GB) and SSQC (LP 10, 2097 z10, memory 40 GB), each running ‘Headoffice & Branches’, Retail, LDAP, housekeeping and batch workload. Labels between the systems: IBS I, IBS IIFF, < 20 µsec. Storage is mirrored (GDPS managed); I/O < 2 ms at 25K I/O per second. Other clients shown: Personeelsnet, Service center, Asset center, Webseal. Sysplex Distributor WEIGHTEDACTIVE with weights 90/10, 10/90 and 50/50; AGF framework (KCF).]

‘Headoffice & Branches’
  60-65 ms (2/3 DB2, 1/3 PGM)
  5.5 to 6 million trx/day, 200 trx/sec
  9:00 – 17:00

ONL (elb & ipa) / AUT / KID / IIP
  80-85 ms (2/3 DB2, 1/3 PGM)
  7.5 to 8 million trx/day, 250 trx/sec
  00:00 – 24:00
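The daily volumes and the per-second rates on this slide can be related with simple arithmetic. The Python sketch below computes the average rate over the stated time window; reading the time windows as the period over which the daily volume is spread is an assumption.

```python
def average_rate(trx_per_day, window_hours):
    """Average transactions per second if the daily volume is spread over the window."""
    return trx_per_day / (window_hours * 3600)


# 'Headoffice & Branches': 5.5 to 6 million trx/day between 9:00 and 17:00
hob_low = average_rate(5_500_000, 8)
hob_high = average_rate(6_000_000, 8)

# Retail: 7.5 to 8 million trx/day around the clock
retail_low = average_rate(7_500_000, 24)
retail_high = average_rate(8_000_000, 24)
```

For ‘Headoffice & Branches’ this gives roughly 191 to 208 trx/sec, in line with the 200 trx/sec on the slide. For the 24-hour workload the average is only about 87 to 93 trx/sec, so the 250 trx/sec quoted there is presumably a peak rate rather than an average.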

Future Plans

  Standardization of subsystem names
    Done
  Implement “subsystem” failure management
  Design for a SERVERWLM distribution
  Design for CBU implementation vs priorities
    Retail and ‘Headoffice & Branches’ cannot run on one LPAR (system) at peak times (normal business hours)

QUESTIONS ?
