![Page 1: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/1.jpg)
High Availability in Neutron
Sylvain Afchain, Software Engineer
Assaf Muller, Software Engineer
Getting the L3 Agent Right
November 2014
![Page 2: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/2.jpg)
SITUATION IN ICEHOUSE
![Page 3: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/3.jpg)
![Page 4: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/4.jpg)
Single Point of Failure
[Diagram labels: routers R1 and R2; L3 Agent 1 and L3 Agent 2; External Network; Tenant Network with two VMs]
![Page 5: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/5.jpg)
L3 Agent Healthcheck
● Developed by eNovance
● Open-Source
● Works and tested on Grizzly, Havana and Icehouse
[Diagram labels: Controller (L3 Agent State); L3 Agent 1 and L3 Agent 2, each with a Healthcheck; routers R1 and R2; RPC]
![Page 7: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/7.jpg)
L3 Agent Healthcheck
Pros
● Does not affect deployment
● Removes a node if isolated
● Distributed service
● Works since Grizzly
● Lightweight
Cons
● Still no full HA
● Not stateful
● Long downtime
● Out of tree
● Still not the right way
![Page 8: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/8.jpg)
SITUATION IN JUNO
![Page 9: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/9.jpg)
Controller Rescheduling
allow_automatic_l3agent_failover = True
Authored by Kevin Benton
[Diagram labels: Controller (Rescheduler Loop); L3 Agent 1 and L3 Agent 2; routers R1 and R2]
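The rescheduler loop on this slide can be pictured roughly as follows. This is a minimal Python sketch and not Neutron's actual code: the `Agent` class, the `reschedule_routers` function, the 75-second timeout and the least-loaded placement policy are all hypothetical, chosen only to illustrate the idea of a controller moving routers away from agents that stopped reporting.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an L3 agent record kept by the controller."""
    name: str
    last_heartbeat: float                      # seconds, as last reported
    routers: list = field(default_factory=list)


def reschedule_routers(agents, now, timeout=75.0):
    """Move routers off agents whose heartbeat is older than `timeout`.

    Returns a list of (router, old_agent, new_agent) moves. Routers stay
    where they are when no live agent is available to take them.
    """
    alive = [a for a in agents if now - a.last_heartbeat <= timeout]
    alive_names = {a.name for a in alive}
    moves = []
    for agent in agents:
        if agent.name in alive_names or not alive:
            continue
        for router in list(agent.routers):
            # Illustrative policy: pick the live agent hosting the fewest routers.
            target = min(alive, key=lambda a: len(a.routers))
            agent.routers.remove(router)
            target.routers.append(router)
            moves.append((router, agent.name, target.name))
    return moves
```

In the two-agent picture from the slide, if L3 Agent 1 stops reporting while hosting R1, one pass of the loop hands R1 to L3 Agent 2. Because the move is driven by the controller, downtime depends on how quickly the dead agent is detected.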
![Page 11: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/11.jpg)
Another Approach: Keepalived
Features
● Configuration determines default, admin can overrule
● Works within tenant networks
● Plays nice with: FWaaS, VPNaaS, LBaaS
● Failover independent from RPC layer
● Routers in active / passive
● Floating IP in active / passive
● All L3 agents are active
● Uses keepalived (VRRP)
● Rough failover time: c + 2/29 * n
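For context on where VRRP failover latency comes from: the VRRPv2 specification (RFC 3768) defines how long a backup waits without hearing advertisements before it takes over. The sketch below computes that interval. It is not a derivation of the rough estimate quoted on the slide, and the function name and the example values are illustrative assumptions.

```python
def master_down_interval(advert_int: float, priority: int) -> float:
    """Seconds a VRRPv2 backup waits before declaring the master down.

    Per RFC 3768:
        Skew_Time            = (256 - Priority) / 256
        Master_Down_Interval = 3 * Advertisement_Interval + Skew_Time
    """
    if not 1 <= priority <= 255:
        raise ValueError("VRRP priority must be between 1 and 255")
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time
```

With an advertisement interval of 2 seconds and a priority of 50 (example values, not taken from the slides), the interval is about 6.8 seconds. A shorter advertisement interval shortens detection time at the cost of more VRRP traffic on the HA network.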
![Page 12: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/12.jpg)
VRRP: Pre-emptive Router Scheduling
[Diagram: routers R1, R2 and R3 each appear twice across L3 Agent 1, L3 Agent 2 and L3 Agent 3; instances are marked MASTER or SLAVE]
![Page 13: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/13.jpg)
VRRP: HA Networks & VRID
[Diagram: L3 Agent 1, L3 Agent 2 and L3 Agent 3, with each router shown twice]
● R1 - VRID 1 (Tenant A)
● R2 - VRID 2 (Tenant A)
● R3 - VRID 1 (Tenant B)
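The VRID assignment on this slide (VRID 1 and 2 for Tenant A's routers, VRID 1 again for Tenant B's router) follows from VRIDs only needing to be unique within one HA network. A minimal Python sketch of that bookkeeping, assuming a hypothetical `HaNetwork` class; it is not Neutron's implementation:

```python
class HaNetwork:
    """Hypothetical per-tenant HA network that hands out VRIDs."""

    MAX_VRID = 255  # VRRP virtual router IDs range from 1 to 255

    def __init__(self, tenant):
        self.tenant = tenant
        self._vrids = {}  # router name -> VRID

    def allocate(self, router):
        """Return the router's VRID, assigning the lowest free one if needed."""
        if router in self._vrids:
            return self._vrids[router]
        used = set(self._vrids.values())
        for vrid in range(1, self.MAX_VRID + 1):
            if vrid not in used:
                self._vrids[router] = vrid
                return vrid
        raise RuntimeError(f"no free VRID left on HA network of {self.tenant}")


tenant_a = HaNetwork("Tenant A")
tenant_b = HaNetwork("Tenant B")
r1, r2 = tenant_a.allocate("R1"), tenant_a.allocate("R2")
r3 = tenant_b.allocate("R3")
```

Here `r1`, `r2` and `r3` come out as 1, 2 and 1. The `RuntimeError` branch corresponds to the 255-routers-per-HA-network ceiling that VRRP imposes.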
![Page 14: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/14.jpg)
VRRP: Implementation
Neutron Server
● Per-tenant network to accommodate VRRP traffic
● New virtual router ID attribute uniquely identifies router clusters
L3 Agent
● New keepalived manager
● IPs as VIPs, only present on the master instance
![Page 15: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/15.jpg)
VRRP: Implementation
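[Diagram: router R1 is instantiated on both L3 Agent 1 and L3 Agent 2. Each instance has QG, HA and QR interfaces and runs KEEPALIVED. The HA interfaces are addressed 169.254.192.1/18 (L3 Agent 1) and 169.254.192.2/18 (L3 Agent 2). Three VIP labels are shown.]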
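To make the keepalived side concrete, here is a hedged Python sketch that renders a `vrrp_instance` stanza of the kind keepalived reads from its configuration file. The directives used (`state`, `interface`, `virtual_router_id`, `priority`, `advert_int`, `virtual_ipaddress`) are standard keepalived keywords, but the function name, the instance name, the interface names and the priority and interval values are assumptions for illustration; the configuration Neutron actually generates may differ.

```python
def render_vrrp_instance(name, ha_interface, vrid, vips, priority=50, advert_int=2):
    """Render one keepalived vrrp_instance block as text.

    `vips` is a list of (cidr, device) pairs: the router addresses that
    keepalived configures only on the instance currently acting as master.
    """
    lines = [
        f"vrrp_instance {name} {{",
        "    state BACKUP",
        f"    interface {ha_interface}",       # VRRP advertisements use the HA port
        f"    virtual_router_id {vrid}",
        f"    priority {priority}",
        f"    advert_int {advert_int}",
        "    virtual_ipaddress {",
    ]
    lines += [f"        {cidr} dev {dev}" for cidr, dev in vips]
    lines += ["    }", "}"]
    return "\n".join(lines)


# Hypothetical names: "ha-example" and "qr-example" stand in for real device names.
conf = render_vrrp_instance("VR_1", "ha-example", 1, [("10.0.0.1/24", "qr-example")])
```

The point of the layout is that advertisements travel over the HA interface, while the addresses tenants actually use are listed as virtual IPs, so they exist only on whichever instance holds the master role.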
![Page 16: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/16.jpg)
VRRP: CLI/API
neutron net-list
As an admin:
+---------------------+-------------------+--------------------------------------+
| id                  | name              | subnets                              |
+---------------------+-------------------+--------------------------------------+
| 27b2e2b7-22c0-4f74- | private           | f472d8c8-1dd0-4272- 10.0.0.0/24      |
| be1a07de-9d7b-4823- | public            | 7f3d69a6-bd50-4cd5- 172.24.4.0/24    |
| 62917a72-0576-422e- | HA network tenant | 12f466f4-6c51-4726- 169.254.192.0/18 |
+---------------------+-------------------+--------------------------------------+
As a user:
+---------------------+---------+-----------------------------------+
| id                  | name    | subnets                           |
+---------------------+---------+-----------------------------------+
| 27b2e2b7-22c0-4f74- | private | f472d8c8-1dd0-4272- 10.0.0.0/24   |
| be1a07de-9d7b-4823- | public  | 7f3d69a6-bd50-4cd5- 172.24.4.0/24 |
+---------------------+---------+-----------------------------------+
![Page 17: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/17.jpg)
VRRP: CLI/API
neutron port-list
As an admin:
+-----------+---------+-----------------------------------------------------------+
| id        | name    | fixed_ips                                                 |
+-----------+---------+-----------------------------------------------------------+
| 3e7268c6- | HA port | {"subnet_id": "12f466f4-", "ip_address": "169.254.192.2"} |
| d87a6c9c- | HA port | {"subnet_id": "12f466f4-", "ip_address": "169.254.192.1"} |
| 5105bd78- |         | {"subnet_id": "f472d8c8-", "ip_address": "10.0.0.1"}      |
+-----------+---------+-----------------------------------------------------------+
As a user:
+-----------+------+------------------------------------------------------+
| id        | name | fixed_ips                                            |
+-----------+------+------------------------------------------------------+
| 5105bd78- |      | {"subnet_id": "f472d8c8-", "ip_address": "10.0.0.1"} |
+-----------+------+------------------------------------------------------+
![Page 18: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/18.jpg)
VRRP: CLI/API
Where are my HA router instances?
neutron l3-agent-list-hosting-router router1
+--------------------------------------+--------+----------------+-------+
| id                                   | host   | admin_state_up | alive |
+--------------------------------------+--------+----------------+-------+
| 0dad7203-cba8-4b79-bec3-16ddf55d6a5a | ops-1  | True           | :-)   |
| 7d7afb99-b522-442a-8acd-f1548a1dea19 | ops-2  | True           | :-)   |
+--------------------------------------+--------+----------------+-------+
![Page 20: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/20.jpg)
VRRP: Future Work
Improvements
● Administration:
  ● Where is the master? Can I move it?
  ● Log state transitions
● Conntrackd for stateful SNAT traffic
● Migrate legacy routers to HA
● In Juno, a router may be distributed or HA, but not both
![Page 21: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/21.jpg)
VRRP: Limitations
Limitations
● 255 virtual routers per HA network, thus per tenant
  ● Can be removed by allowing more than one HA network per tenant
Improvements
● East-west traffic could be improved, no need to go through an L3 node
● North-south traffic could be improved as well, especially for floating IPs
![Page 22: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/22.jpg)
Another Approach: DVR
Features
● Distributes virtual routers on all compute nodes
● No more east-west traffic through an L3 node
● No single point of failure
● Floating IPs hosted by the compute node hosting the VM
● Some services could be distributed as well, like FWaaS
● SNAT support through an L3 node
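The feature list above implies three distinct forwarding cases. The following Python sketch only restates those bullets as a decision function; `dvr_path` and its return strings are hypothetical names for illustration and do not correspond to any Neutron API.

```python
def dvr_path(destination_external: bool, has_floating_ip: bool = False) -> str:
    """Where a VM's traffic is routed under DVR, per the feature list."""
    if not destination_external:
        # East-west: routed by the distributed router on the compute node.
        return "compute node (DVR)"
    if has_floating_ip:
        # North-south with a floating IP: NAT on the VM's own compute node.
        return "compute node (DVR + NAT)"
    # North-south without a floating IP: SNAT still goes through an L3 node.
    return "L3 node (SNAT)"
```

Only the last case still depends on a central L3 node, which is why the SNAT path remains a candidate for pairing with an HA mechanism.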
![Page 23: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/23.jpg)
Without DVR
[Diagram labels: router R1; VM1, VM2, VM3, VM4; L3 Agent]
![Page 24: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/24.jpg)
East-West with DVR
[Diagram labels: VM1, VM2, VM3, VM4; two DVR instances]
![Page 25: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/25.jpg)
Floating IPs with DVR
[Diagram labels: VM1, VM2, VM3, VM4; two DVR + NAT instances, each with a FLOATING-IP]
![Page 26: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/26.jpg)
SNAT with DVR
[Diagram labels: VM1, VM2, VM3, VM4; two DVR + NAT instances; one FLOATING-IP; VRRP + SNAT]
![Page 27: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/27.jpg)
Summary

| | Healthcheck | Rescheduling | L3 HA | DVR |
|---|---|---|---|---|
| Advantages | Mature | In-tree, simple | Failover is quick and indifferent to management plane | Removes bottleneck from network node |
| Release | Grizzly | Juno | Juno | Juno |
| Segmentation Technology | All | All | All | Tunneling + L2pop |
| Topology | Extra agent to install on network nodes | Enable configuration option | Enable configuration option | Compute nodes connected to external network(s) |
![Page 28: High Availability in Neutron · High Availability in Neutron Sylvain Afchain Software Engineer Assaf Muller Software Engineer Getting the L3 Agent Right November 2014. SITUATION IN](https://reader033.vdocument.in/reader033/viewer/2022060421/5f18108aed0c0c14b26e926b/html5/thumbnails/28.jpg)
Sylvain Afchain, Software Engineer
[email protected] on Freenode
Enovance Tech Notes
Assaf Muller, Software Engineer
[email protected] on Freenode
assafmuller.com