Maintenance Challenges

Mike Hughes <[email protected]>


The Challenge

- Significant amounts of maintenance on both LANs
  - Complete upgrade of Extreme LAN hardware
  - Major upgrade of Foundry LAN (RX16s)
  - Reconfiguration of backbone rings in both LANs
    - Deploying MRP2 on Foundry, EAPSv2 on Extreme
- “Chains” of dependencies
  - Would be a total of between 14 and 16 slots

It would take ages!

- Maintenance is usually one, occasionally two slots per week, minimum 7 days notice
  - Plus, only one LAN at once to minimise risk
- Would take between 3 and 4 months
  - Not accounting for failures/backouts
  - Would cause delays of an additional week per backout
- Oh, and we needed to get it done sooner!
  - High demand for 10G ports!
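The 3 to 4 month estimate follows from simple arithmetic on the figures above. A minimal Python sketch, assuming the usual rate of one slot per week and an average month of 52/12 weeks; the function name and parameters are illustrative and do not come from the talk:

```python
import math

WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks


def weeks_needed(slots: int, slots_per_week: int = 1, backouts: int = 0,
                 slip_weeks_per_backout: int = 1) -> int:
    """Weeks needed to work through `slots` maintenance windows.

    Each backout is assumed to add one extra week, as stated on the slide.
    """
    return math.ceil(slots / slots_per_week) + backouts * slip_weeks_per_backout


for slots in (14, 16):
    weeks = weeks_needed(slots)
    months = weeks / WEEKS_PER_MONTH
    print(f"{slots} slots: {weeks} weeks, about {months:.1f} months")
# 14 slots: 14 weeks, about 3.2 months
# 16 slots: 16 weeks, about 3.7 months
```

Passing `backouts=2` to the 16-slot case gives 18 weeks, which shows how quickly a few failed windows push the work past four months.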

Intense Approach

- Block out whole weeks for maintenance
  - Allowing between 1 and 4 slots each week
- Alternate between Foundry and Extreme
  - Allowing changes to settle in
- Ability to rapidly reschedule (<24 hours) for maintenance which overran or had to be backed out
- Hire in external hands to assist
  - Reduce direct stress on LINX staff
  - Ensure “backup” crew available next day
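The blocked-week pattern can be sketched as a tiny planner. This is a minimal Python illustration, not LINX's actual tooling: the function name, the job labels (F1, E1, ...) and the 7/7 split between the two LANs are assumptions made up for the example. Only the limit of 4 slots per week and the alternation between Foundry and Extreme come from the slide.

```python
from itertools import cycle

MAX_SLOTS_PER_WEEK = 4  # from the slide: between 1 and 4 slots each week


def plan_blocked_weeks(jobs: dict[str, list[str]],
                       max_slots: int = MAX_SLOTS_PER_WEEK) -> list[tuple[str, list[str]]]:
    """Assign each LAN's jobs, in order, to alternating whole weeks.

    Returns a list of (lan, jobs_for_that_week) tuples, one per blocked week.
    Simplification: once a LAN has no jobs left, its turn is skipped, so the
    other LAN may end up with consecutive weeks.
    """
    queues = {lan: list(items) for lan, items in jobs.items()}
    plan = []
    for lan in cycle(queues):
        if not any(queues.values()):
            break  # every queue is empty, nothing left to schedule
        batch, queues[lan] = queues[lan][:max_slots], queues[lan][max_slots:]
        if batch:
            plan.append((lan, batch))
    return plan


# Hypothetical workload: 14 slots split evenly across the two LANs.
jobs = {
    "Foundry": [f"F{i}" for i in range(1, 8)],
    "Extreme": [f"E{i}" for i in range(1, 8)],
}
for week, (lan, batch) in enumerate(plan_blocked_weeks(jobs), start=1):
    print(f"Week {week}: {lan} {', '.join(batch)}")
```

Under these assumptions the 14 slots fit into four blocked weeks. That is a best case only: the sketch ignores dependency chains, slack weeks and external meetings.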

Breaking the News

- First get your colleagues on side
- Then, break it to your members!
- Surprisingly well received
  - Especially given the “otherwise we’ll still be finishing this in September” situation
- It seemed it would only cause significant problems for one member (using DNS-based load distribution)
  - We even managed to mitigate some of that

Scheduling

- 6 new switches
  - All complete box swaps
  - One hard cutover, 5 staggered migrations
- 2 different software upgrades
  - Involving 6 different switches
- New inter-site fibre delivery
- Backbone reconfiguration
- Little opportunity to build “greenfield”

Weekly Breakdown

[Schedule chart: entries listed in extraction order, not by week]

Weeks: 1st May, 8th May, 15th May, 22nd May, 29th May, 5th June, 12th June, 19th June, 26th June, 3rd July

Chart entries:
- LINX 53 Meeting
- Euro-IX Meeting
- Migrate 100M ports to RX @ Telehouse East
- Install RX16 @ Telehouse East
- Insert 3x BD8800s into EAPS ring
- UKNOF Meeting
- Migrate all member ports on 2x BD8800s (two entries)
- Migrate 1G ports to RX @ Telehouse East
- NANOG Meeting, San Jose
- Upgrade MG8 s/w
- Upgrade Jetcore s/w 4x BI8000s
- Retry Failed Jetcore s/w upgrade on one switch only
- Retry Failed Jetcore s/w upgrade
- Migrate entire MG8 to new RX16 @ THN - 7hr maint!
- EAPSv2 pre-deploy and build (Using new intersite fibre)
- Activate New EAPSv2 Backbone Topology
- Deploy MRP2 and upgrade interswitch trunks
- Emergency Maintenance on a BD8800 MSM
- SLACK WEEK (two entries)
- New Staff Member

Legend: Foundry Network, Extreme Network, External Event, LINX Meeting, Emergency Maintenance Window

[Weekly Breakdown chart repeated] Callout: We broke the “news” to the members here

[Weekly Breakdown chart repeated] Callout: This window had to be backed out

[Weekly Breakdown chart repeated] Callout: This window had to be backed out

[Weekly Breakdown chart repeated] Callout: We retried using different code next day

[Weekly Breakdown chart repeated] Callout: …and the rest two days later

Afterthoughts

- The blocking out of whole weeks for maintenance worked
  - Especially when we could quickly reschedule that maintenance work which got backed out
  - Otherwise, we’d have had at least 7 days “slip”
- The intensive approach worked
  - We haven’t had to raise a scheduled maintenance ticket until this week!
  - Prevented network operating for extended periods in “transitional state”
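The "at least 7 days slip" point can be made concrete by comparing the two notice periods. A minimal Python sketch, assuming the "<24 hours" reschedule is rounded up to one whole day; the constant and function names are illustrative, and the example date is an arbitrary placeholder rather than a date from the talk:

```python
from datetime import date, timedelta

# Notice periods taken from the slides; the names are illustrative.
NORMAL_NOTICE = timedelta(days=7)        # "minimum 7 days notice"
BLOCKED_WEEK_NOTICE = timedelta(days=1)  # "<24 hours", rounded up to a whole day


def earliest_retry(backed_out_on: date, notice: timedelta) -> date:
    """Earliest date a backed-out window can be attempted again."""
    return backed_out_on + notice


def total_slip(backouts: int, notice: timedelta) -> timedelta:
    """Minimum delay added when every backed-out window needs fresh notice."""
    return backouts * notice


backed_out_on = date(2000, 1, 3)  # arbitrary placeholder date, not from the talk
print(earliest_retry(backed_out_on, NORMAL_NOTICE))        # 2000-01-10
print(earliest_retry(backed_out_on, BLOCKED_WEEK_NOTICE))  # 2000-01-04
print(total_slip(2, NORMAL_NOTICE).days)                   # 14
```

With whole weeks blocked out, a backed-out window costs about a day instead of a week, which matches the next-day retry noted on the schedule.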

What we started with

What we ended up with - Foundry

What we ended up with - Extreme

Any questions?