![Page 1: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/1.jpg)
Neutron L3 Agent HA
Or: How I Learned to Stop Worrying and Love the API
Kevin Bringard // OpenStack Juno Summit // May 2014
![Page 2: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/2.jpg)
• There is no “one right way”
• The goal is to move L3 resources to a new L2 resource as quickly and seamlessly as possible
• This is a really difficult, but important, problem to solve
![Page 3: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/3.jpg)
Layer 3
Internet Happens
![Page 4: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/4.jpg)
[Diagram: Core Router, three L3 agents, router1 through router6, VM1 through VM7]
![Page 5: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/5.jpg)
[Diagram: Core Router, two L3 agents, router1 through router6, VM1 through VM7]
![Page 6: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/6.jpg)
Layer 2
The ARPing is the hardest part
![Page 7: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/7.jpg)
• One L3 resource may only be tied to one L2 resource at a time
• Many technologies exist to sort of work around this
  • HSRP
  • VRRP
  • CARP
• Work is being done to implement VRRP-like functionality into Juno
  • https://blueprints.launchpad.net/neutron/+spec/l3-high-availability
• Nothing is currently integrated into OpenStack
![Page 8: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/8.jpg)
Pacemaker
http://docs.openstack.org/high-availability-guide/content/_highly_available_neutron_l3_agent.html
![Page 9: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/9.jpg)
• False positives caused more downtime than actual outages
• Split-brain possibilities
• Assumes control of L3 agent start/stop functions
• Limited horizontal scale
  • More difficult to run multiple Active L3 agents
  • Failover requires stopping and starting entire services
• Active/Passive model requires more hardware
• Works on a “per agent” level
  • Akin to RAID1
![Page 10: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/10.jpg)
[Diagram: Core Router, three L3 agents, router1 through router6, VM1 through VM7]
![Page 11: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/11.jpg)
[Diagram: Core Router, two L3 agents, router1 through router6, VM1 through VM7]
![Page 12: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/12.jpg)
[Diagram: Core Router, three L3 agents, router1 through router6, VM1 through VM7]
![Page 13: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/13.jpg)
Neutron HA Tool
https://raw.githubusercontent.com/stackforge/cookbook-openstack-network/master/files/default/neutron-ha-tool.py
![Page 14: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/14.jpg)
• API driven
  • Uses native API calls to perform all functions
  • Can be run externally from the infrastructure, or cross-site
  • Supports any operation the neutron client libraries support
• Easily extendable
  • Written in Python
  • Leverages standard OpenStack libraries
• Works on a “per resource” level
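
The “per resource” approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of neutron-ha-tool.py: the `migrate_routers` helper and the in-memory `FakeNeutron` stand-in are hypothetical, and the four client method names are assumed to mirror the ones python-neutronclient's v2.0 `Client` exposes. The target-selection rule (least-loaded live agent) is also just one possible choice.

```python
class FakeNeutron:
    """In-memory stand-in for a Neutron client (hypothetical, for illustration)."""

    def __init__(self, agents, hosting):
        self._agents = agents      # list of agent dicts
        self._hosting = hosting    # agent id -> list of router ids

    def list_agents(self):
        return {"agents": list(self._agents)}

    def list_routers_on_l3_agent(self, agent_id):
        return {"routers": [{"id": r} for r in self._hosting[agent_id]]}

    def remove_router_from_l3_agent(self, agent_id, router_id):
        self._hosting[agent_id].remove(router_id)

    def add_router_to_l3_agent(self, agent_id, body):
        self._hosting[agent_id].append(body["router_id"])


def migrate_routers(client):
    """Move each router off dead L3 agents onto the least-loaded live agent."""
    agents = client.list_agents()["agents"]
    l3 = [a for a in agents if a["agent_type"] == "L3 agent"]
    live = [a["id"] for a in l3 if a["alive"]]
    dead = [a["id"] for a in l3 if not a["alive"]]
    if not live:
        return []  # nowhere to move routers to
    # Current router count per live agent, used to spread the load.
    load = {a: len(client.list_routers_on_l3_agent(a)["routers"]) for a in live}
    moves = []
    for src in dead:
        for router in client.list_routers_on_l3_agent(src)["routers"]:
            dst = min(live, key=lambda a: load[a])
            client.remove_router_from_l3_agent(src, router["id"])
            client.add_router_to_l3_agent(dst, {"router_id": router["id"]})
            load[dst] += 1
            moves.append((router["id"], src, dst))
    return moves
```

Only routers on the failed agent are touched; routers on healthy agents stay where they are, which is the difference from the “per agent” Pacemaker model.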
![Page 15: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/15.jpg)
[Diagram: Core Router, three L3 agents, router1 through router6, VM1 through VM7]
![Page 16: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/16.jpg)
[Diagram: Core Router, two L3 agents, router1 through router6, VM1 through VM7]
![Page 17: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/17.jpg)
[Diagram: Core Router, two L3 agents, router1 through router6, VM1 through VM7]
![Page 18: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/18.jpg)
• Only routers/IPs on the affected L3 agent are impacted
• Recovery time depends on the number of routers which need to be migrated and the number of IPs on each router
• Migration happens quickly, but every IP on the routers must re-ARP to the upstream switch
• Metadata proxies migrate with the routers
![Page 19: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/19.jpg)
OK, so what’s the catch?
![Page 20: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/20.jpg)
• Not seamless
  • The ARP processes happen in parallel, but generally take 60-90 seconds for all IPs to complete
• Various *aaS offerings further complicate things
  • Currently only accounts for “l3-agent” controlled services
• No coordination between HA tools
  • How do you HA the HA?
• Currently not daemonized, runs from cron
  • Adds 60 seconds to total recovery time
  • Jitter protection adds further recovery time
• No mechanism by which to ensure resources actually come up/work
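
To make the timing concrete, here is a rough back-of-the-envelope estimate in Python. It is only a sketch: the function name is hypothetical, and modelling jitter protection as “wait for extra consecutive failed checks” is an assumption, not a description of how the tool actually implements it. The 60-second cron interval and the 60-90 second re-ARP window come from the bullets above.

```python
def worst_case_recovery_seconds(cron_interval=60, jitter_checks=0, arp_seconds=90):
    """Rough upper bound on downtime: detection delay plus re-ARP time.

    cron_interval -- seconds between runs of the tool (60 for a per-minute cron job)
    jitter_checks -- extra consecutive failed checks required before acting
                     (a hypothetical model of jitter protection)
    arp_seconds   -- time for all IPs to re-ARP (60-90 s per the slide)
    """
    detection = cron_interval * (1 + jitter_checks)
    return detection + arp_seconds


# No jitter protection, slowest re-ARP case.
print(worst_case_recovery_seconds())
# Two extra confirmation checks before migrating.
print(worst_case_recovery_seconds(jitter_checks=2))
```

Under these assumptions, the detection delay rather than the migration itself dominates the total, which is why daemonizing the tool would help.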
![Page 21: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/21.jpg)
What about DHCP?
![Page 22: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/22.jpg)
• Multiple DHCP agents may be run Active/Active
  • DHCP agents per subnet may be specified in your agent config file
  • Each agent requires an IP in the tenant’s subnet
• DHCP requests are broadcast
  • All agents have the same lease file
  • The first one to reply binds to the VM
• Any DHCP agent may reply to a DNS request and resolve all known leases
• By default, each DHCP agent hands out a list of every agent as available resolvers
• HA tool has an option to replicate DHCP to all agents
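
The replication option can be pictured as computing which agent/network pairings are missing. The following Python sketch is illustrative only: `plan_dhcp_replication` is a hypothetical helper, not a function from the HA tool. Each resulting pair could then be applied through the API, for example with python-neutronclient's `add_network_to_dhcp_agent` (assumed here to take the agent ID and a body containing `network_id`).

```python
def plan_dhcp_replication(live_agents, network_ids, hosting):
    """Return (agent_id, network_id) pairs still needed so that every
    live DHCP agent hosts every network.

    hosting maps agent id -> set of network ids the agent already serves.
    """
    plan = []
    for agent in live_agents:
        already = hosting.get(agent, set())
        for net in network_ids:
            if net not in already:
                plan.append((agent, net))
    return plan


# d1 already serves both networks; d2 is missing net2.
current = {"d1": {"net1", "net2"}, "d2": {"net1"}}
print(plan_dhcp_replication(["d1", "d2"], ["net1", "net2"], current))
```

Computing the plan separately from applying it keeps the step idempotent: running it again once every agent hosts every network yields an empty plan.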
![Page 23: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/23.jpg)
Moving Forward
• VRRP-like functionality
• Specify number of Active L3 agents per subnet
• Leverage conntrackd/keepalived
• Point of diminishing returns for HA tool?
• The beauty of open source:
  • There is no “one right way”
  • Think outside the box
  • Do cool things
![Page 24: L3 Agent HA - object-storage-ca-ymq-1.vexxhost.net · Neutron L3 Agent HA Or: How I Learned to Stop Worrying and Love the API Kevin Bringard // OpenStack Juno Summit // May 2014 •](https://reader034.vdocument.in/reader034/viewer/2022042310/5ed789a5ce1f7a43cc0be6d1/html5/thumbnails/24.jpg)
Questions?