The Resilient WAN
*Please visit polls in the ILTA Mobile App or http://ilta.cnf.io/sessions/289*
• Introductions
• Brief discussion of WAN terminology and acronyms
• What does it mean to be “Resilient”
• Two Case Studies: WAN Upgrades - Before and After
  • Small-midsized firm: Zuckerman Spaeder
  • Midsized to large firm: Lathrop & Gage
• Wrap-up & Questions
The Resilient WAN: Session Overview
Presenters
Philip Finnerty, CIO, Zuckerman Spaeder
Moderator: Charlie Wise, IT Director, Manion Gaynor & Manning
David Alberico, Network Manager, Lathrop & Gage
Tim Soto, IT Infrastructure Manager, Lathrop & Gage
• BGP – Border Gateway Protocol
  • Protocol used on exterior Internet gateways for routing
• EIGRP – Enhanced Interior Gateway Routing Protocol
  • Protocol used in internal systems for routing
• N+1 or N+2
  • The number you need plus the number of extras
• QoS – Quality of Service
  • Allows you to “tag” some traffic as a higher priority
• WAN – “Wide Area Network”
• Last Mile Carrier/Local Loop
  • Physical connection between the customer premises (demarcation point) and the edge of the carrier's network
• POP/Layer 2 Route Path
  • The path data traverses across the physical network; can include diverse carrier networks.
What acronyms will you use today? (layman’s explanations)
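As a rough illustration of what "tagging" traffic for QoS means on the wire, the sketch below converts DSCP code points into the ToS byte carried in the IP header. The DSCP field occupies the upper six bits of that byte, so the conversion is a two-bit shift. The choice of classes (EF for voice, AF41 for interactive video, CS0 for best effort) is an assumption made for this example, not something the presenters specified, and `tos_byte` is a hypothetical helper name.

```python
# Illustrative DSCP code points; the voice/video/default assignment is an
# assumption for this sketch, not a setting taken from the presentation.
DSCP = {
    "EF (voice)": 46,
    "AF41 (interactive video)": 34,
    "CS0 (best effort)": 0,
}


def tos_byte(dscp: int) -> int:
    """Return the IP ToS byte for a DSCP value (DSCP is the upper six bits)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2


for name, value in DSCP.items():
    print(f"{name}: DSCP {value} -> ToS byte 0x{tos_byte(value):02X}")
```

Tags only help if every hop honors them, which is why the markings usually have to be agreed with the carrier.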
• WAN – “Wide Area Network”
  • Highway that allows data to go from office to office
• Bandwidth
  • Number of lanes on the highway available for data
• Capacity or congestion
  • Amount of traffic at any given moment in time using the lanes
• QoS
  • HOV or Express lanes; prioritized traffic
• Latency
  • The time it takes to travel from point A to point B
• Speed
  • Better definition: a function of all the above
  • Speed limit: theoretically, the max is the speed of light
Elements of a WAN (Traffic analogy—oversimplified)
NOTE: WAN Acceleration/Optimization can change all of the above
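To make "speed is a function of all the above" concrete, here is a minimal sketch that estimates how long a file takes to cross a link as latency plus the time needed to push the bits through the available bandwidth. It deliberately ignores congestion, protocol overhead, and WAN optimization. The 10 MB file, the 30 ms latency, and the two link speeds are illustrative assumptions, and `transfer_seconds` is a hypothetical helper.

```python
def transfer_seconds(size_megabytes: float, bandwidth_mbps: float, latency_ms: float) -> float:
    """Estimate transfer time as one-way latency plus serialization time."""
    bits = size_megabytes * 8_000_000          # decimal megabytes to bits
    push_time = bits / (bandwidth_mbps * 1_000_000)
    return latency_ms / 1000 + push_time


# Illustrative values only: a 10 MB document over a slow and a faster circuit.
for mbps in (6, 30):
    seconds = transfer_seconds(size_megabytes=10, bandwidth_mbps=mbps, latency_ms=30)
    print(f"{mbps} Mbps link: about {seconds:.2f} s")
```

For a large file the bandwidth term dominates; for small, chatty exchanges the latency term dominates, which is one reason a bigger pipe does not always feel faster.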
Resilient
The capacity of a system to absorb disturbance and retain the same essential functions, structure, identity, and user experience
vs Redundant
The provisioning of additional or duplicate circuits, hardware, etc., that function in case a part of the system fails.
You can achieve resiliency by employing redundancy; however, redundant systems are not necessarily resilient.
What do you mean by Resilient?
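The difference between redundant and resilient can be sketched with simple availability arithmetic. Two independent circuits in parallel fail only when both fail, but if both share a single element (for example, one building entry), that element caps the availability of the whole design. The 99.9% figures and the shared-entry scenario are assumptions chosen for illustration, and the model assumes failures are independent.

```python
def parallel_availability(*availabilities: float) -> float:
    """Availability when any one of several independent components suffices."""
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a
    return 1.0 - all_down


def series_availability(*availabilities: float) -> float:
    """Availability when every component in the chain must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


circuit = 0.999        # assumed availability of each circuit
shared_entry = 0.999   # hypothetical: one conduit carrying both circuits

# Two circuits on fully diverse paths.
diverse = parallel_availability(circuit, circuit)

# Two circuits that both pass through the same building entry.
shared = series_availability(shared_entry, parallel_availability(circuit, circuit))

print(f"Diverse paths:   {diverse:.6f}")
print(f"Shared entry:    {shared:.6f}")
```

In the shared-entry case the second circuit adds cost but almost no protection, which is the sense in which a redundant system may still not be resilient.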
• What is your tolerance for downtime?
• What applications are running on your WAN (VoIP, video, VDI)?
• What is your budget?
• How much complexity can you manage?
• Any special security requirements?
• What are your anticipated future needs?
• What carriers are in your area?
• Speed costs money; how fast does it need to be (throughput, bandwidth, latency)?
• Where are your locations? Is data centralized or distributed?
WAN Resiliency is like Enlightenment: there are no right answers, only questions.
Can you give me an example of a smaller firm?
• 100 attorneys/200 users total
• 8 people in IT
• 4 offices
  • Washington, DC
  • Baltimore, MD
  • New York, NY
  • Tampa, FL
Vendor Shout Outs (Who helped)
• ARG – Design, carrier selection, contract/price negotiation, and project management of installation
• CDW/Cisco – Design and hardware selection
What were the issues Zuckerman faced?
• Frequent complaints of slowness from remote offices (confirmed by high ping times in SolarWinds)
• Project to move production environment to a new CoLo facility and move DR/offsite backup to home office
• Circuit Renewal
• Aging Firewalls that needed to be replaced/upgraded in near future
• Aging WAAS equipment
• Long term plans for VDI and video to the desktop
• Long term plan to improve security monitoring (possibly as managed service)
Connectivity Before
(Network diagram not reproduced. Labels: sites DC, DR, TA, BA, and NY; five Internet connections; circuits from Carrier 1 and Carrier 2. Legend: DR and offsite backup; production environment; WAN acceleration (4); firewall (6), including an HA pair. Link speeds shown: 3, 6, 20, 25, 100, and 1024 Mbps.)
Goals for new Resilient WAN?
• Improve user experience (perception of performance)
• faster vs. the absence of slowness
• Seamless failover
• Increase Bandwidth
• Big pipes beat better management
• Reduce total cost of ownership
• Monthly recurring fees for circuits
• Maintenance on equipment
• Reduce equipment footprint/retire aging equipment
• Increase security
• Reduce points of entry
• Reduce time spent managing firewalls
Connectivity After
(Network diagram not reproduced. Labels: sites DC, CoLo, TA, BA, and NY; Internet and Backup Internet connections; carriers TW Telecom and ZAYO; BGP. Legend: DR and offsite backup; production environment; firewall. Link speeds shown: 30, 40, 100, 1024, and 3072 Mbps.)
Any regrets or lessons learned?
• Cisco and GNS3 have great simulation tools that allow you to design, test and export
• Survey fiber runs from building’s point of entry to your suites prior to ordering.
• 1 gig and 10 gig capabilities are nice, but “spendy”
• Sometimes the carriers are smarter than you—BGP challenges with 1 carrier
• Installation of circuits will take longer than you think—negotiate billing delays until all circuits are up
Can you give me an example of a larger firm?
• 380 attorneys/680 users total
• 16 people in IT
• 10 offices
Vendor Shout Outs (Who helped)
• Strategic Telecom Partners – Circuit specs and consulting
• Riverbed – Path Selection Engineering
• Dell – Compellent / Live-Volume
• CDW / Cisco – Hardware selection and hardware sales
What were the issues Lathrop faced?
• DR Data Center was “Cold” and a plane flight away
• Expensive Point to Point replication circuit
• MPLS Circuit refresh
• “Primary” MPLS Circuits were maxed out while passive sat empty
• 180 second SLA for primary MPLS Circuit outage
• Egress QoS, phone quality, Video Conference Quality
• Long Term Plans for Proven DR/BC
• Long Term Plans for Centralized SIP trunking
• Centralized Data Governance
• Single Points of Failure
Issues Lathrop faced (continued): Failure Scenarios
• Environmental – Primary DataCenter Down
• Failover or bring primary DC back up?
• If you failover can you fail back?
• Host Primary MPLS down
• 180 second failover for all regional offices
• Secondary MPLS performance only for all regional offices
• MPLS Cloud / BGP issue exposed
• At mercy of Provider Routing
Goals for new Resilient WAN?
• Improve User experience
• Rock solid Voice and Video quality
• Seamless Failover for multiple failure scenarios
• Cost efficiency
• More Bandwidth
• Primary MPLS bandwidth
• DataCenter to DataCenter
• Present Secondary MPLS as aggregate with failover
• More WAN Optimization
• Disaster recovery / Business Continuity
• Active/Active DataCenter
• Seamless Production Migration
Compellent Live-Volume Technology
• Used to meet our Disaster recovery / Business Continuity Goals
• Active/Active DataCenter
• Seamless Production Migration
• Live-Volume technology is similar to EMC vPlex, NetApp MetroCluster, or HDS solutions.
• The main concept is that it allows for a LUN to be presented to hosts in both datacenters SIMULTANEOUSLY.
• This allows for vMotion between physical datacenters.
• This capability creates a paradigm shift from DR (Disaster Recovery) to DA (Disaster Avoidance)
• With our Diverse 10Gb WAN ring, this technology helps make our two datacenters look like a single datacenter to our users and upper layer applications.
Any regrets or lessons learned?
• GNS3 network simulator was a critical tool: real IOS, plus labs with YouTube how-to videos
• Verify your Provider’s bandwidth
• Link Serialization (Data to Wire speeds)
• Make sure your QoS settings are correct and negotiated with your provider
• Shoot for 100% WAN optimization (it makes a BIG difference)
• Your SNMP 5-minute average is full of LIES
• Research what is the best solution for YOUR environment, because it may or may not be what vendors/consultants will try to sell you (e.g., OTV functionality without OTV). Use a design framework (Zachman).
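The point about SNMP 5-minute averages can be illustrated with a small sketch. A link that sits nearly idle but saturates for a short burst will report a low average utilization, even though users on a call during the burst experienced a full pipe. The link speed, baseline load, and burst length below are invented values for illustration only.

```python
LINK_MBPS = 100      # assumed circuit speed
POLL_SECONDS = 300   # one 5-minute polling interval

# Hypothetical per-second throughput: 5 Mbps baseline with a 20-second
# burst at line rate in the middle of the interval.
samples = [5.0] * POLL_SECONDS
for second in range(100, 120):
    samples[second] = float(LINK_MBPS)

average = sum(samples) / len(samples)
peak = max(samples)
saturated = sum(1 for s in samples if s >= LINK_MBPS)

print(f"5-minute average: {average:.1f} Mbps ({average / LINK_MBPS:.0%} of the link)")
print(f"Peak second:      {peak:.0f} Mbps")
print(f"Seconds at 100%:  {saturated}")
```

The averaged graph shows a link with plenty of headroom, while voice and video traffic during those saturated seconds would have been queued or dropped unless QoS protected it.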
From the considerations you mentioned earlier, what were your top 5 priorities?
Zuckerman Spaeder
1. Security
2. User Experience
3. Cost of circuits (monthly recurring)
4. Cost of other hardware & maintenance
5. Tolerance for outages or interruptions
Lathrop & Gage
1. DR/BC N+1 construction
2. Bandwidth / QoS
3. Seamless Fail-over; Disaster Avoidance
4. Cost / Utilizing passive circuits/equipment
5. Better Reporting / Avoiding problems beyond our control