neutron high availability open stack architecture openstack israel event 2015

@Livnat_Peer Sr. Engineering Manager, Red Hat @ArthurBerezin Sr. Technical Product Manager, Red Hat Neutron High Availability OpenStack Israel Tel-Aviv June 2015

Upload: arthur-berezin

Post on 08-Aug-2015


@Livnat_Peer, Sr. Engineering Manager, Red Hat
@ArthurBerezin, Sr. Technical Product Manager, Red Hat

Neutron High Availability

OpenStack Israel, Tel-Aviv, June 2015

Agenda

● HA Enabling Technologies
  ○ Pacemaker and HAProxy
● Neutron Built-in Mechanisms
  ○ DHCP Agent HA
  ○ L3 Agent with
    ■ Virtual Router Redundancy Protocol (VRRP)
    ■ Distributed Virtual Routing (DVR)

cc: Morio2015 Source: https://www.wikiwand.com/en/Scuderia_Ferrari

Losing Your Controller

https://www.youtube.com/watch?v=Kb43Nxuwc4I

High Availability

● Minimize Downtime By Avoiding Single Points Of Failure (SPOF)
● Service redundancy
  ○ Active-Active when possible
    ■ Stateless services
    ■ Built-in HA mechanisms
  ○ Active-Passive for others
● Scale out Architecture
  ○ Add nodes as you go
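As a rough illustration of why service redundancy reduces downtime, the sketch below computes the combined availability of several redundant nodes. It assumes node failures are independent and uses an illustrative per-node availability of 99%; neither the assumption nor the number comes from the talk.

```python
def combined_availability(node_availability: float, nodes: int) -> float:
    """Availability of a service that stays up while at least one node is up.

    Assumes node failures are independent, which real clusters only approximate.
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The service is down only when every node is down at the same time.
    return 1.0 - (1.0 - node_availability) ** nodes


if __name__ == "__main__":
    # Illustrative per-node availability; not a measured figure.
    for n in (1, 2, 3):
        print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions each added node multiplies the probability of a full outage by the per-node failure probability, which is the arithmetic behind "add nodes as you go".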

HA Enabling Technologies

Pacemaker, HAProxy

Pacemaker

● Cluster Resource Manager
● Uses Corosync for cluster communication
● Monitor and Control Resources:
  ○ Floating Virtual IP Address (VIP)
  ○ SystemD/LSB/OCF Services
  ○ Cloned Services (Active/Active)
● STONITH - Fencing with Power Management
  ○ Important for ensuring data consistency

Pacemaker OpenStack Service

● Virtual IP (VIP)
● SystemD Cloned Resource
● STONITH Fencing

[Diagram: Node 1 (192.168.1.1) and Node 2 (192.168.1.2), each running pcsd, STONITH and a cloned Service, with the Service Virtual IP 10.0.0.1]

HAProxy Load Balancer

Load Balancing and Proxy for HTTP/TCP
● Mature and popular with web applications
● Health Checking
● Load Distribution
  ○ Round Robin
  ○ Stick-Table
● API Isolation
● Failure Detection

[Diagram: HAProxy Load Balancer across Node 1, Node 2 and Node 3, in front of the Service instances]
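The combination of round-robin distribution and health checking can be sketched in a few lines. This is a toy model, not HAProxy's implementation: the `RoundRobinBalancer` class and the `is_healthy` callback are hypothetical names introduced here, and a real proxy probes backends over the network rather than calling a function.

```python
from typing import Callable, List


class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends failing a health check."""

    def __init__(self, backends: List[str], is_healthy: Callable[[str], bool]):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._is_healthy = is_healthy  # hypothetical health-check callback
        self._next = 0  # index of the backend to try first on the next request

    def pick(self) -> str:
        """Return the next healthy backend, or raise if none is healthy."""
        for offset in range(len(self._backends)):
            index = (self._next + offset) % len(self._backends)
            backend = self._backends[index]
            if self._is_healthy(backend):
                # Resume after the chosen backend on the following request.
                self._next = (index + 1) % len(self._backends)
                return backend
        raise RuntimeError("no healthy backend available")


if __name__ == "__main__":
    down = {"controller2"}  # pretend this backend failed its health check
    balancer = RoundRobinBalancer(
        ["controller1", "controller2", "controller3"],
        lambda backend: backend not in down,
    )
    print([balancer.pick() for _ in range(4)])
```

Requests keep flowing to the remaining backends while one is marked down, which is the failure-detection behaviour the slide lists.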

Avoiding SPOFs: A day in a Highly Available Service Life

[Diagram sequence; the client request in every step is "Give Me Horizon Web UI NOW!"]

1. A single Neutron-Server controller serves the request: Single Point Of Failure.
2. Neutron-Server runs on Controller 1, Controller 2 and Controller 3 (Pacemaker Cloned Horizon Service).
3. HAProxy on Controller 1 sits in front of the three controllers. Each controller could fail, and HAProxy is now the Single Point Of Failure.
4. HAProxy runs on Controller 1, Controller 2 and Controller 3 (Pacemaker Cloned HAProxy Service).
5. The client reaches Horizon through a VIP in front of the cloned HAProxy instances.

Neutron Built-in Mechanisms

● External mechanisms

● Neutron built-in mechanisms

● Reference implementation vs. vendor code

My HA Solution

Architecture - Assuming Centralized Network Node

[Diagram: three node types connected by the External Network (Internet), API Network, Management Network and Data Network]

● Controller Node: Neutron server, MySQL server, RabbitMQ server
● Compute Node: OVS agent, OVS
● Network Node: OVS agent, OVS, L3 agent, keepalived, DHCP agent, dnsmasq, Metadata agent, Metadata proxy

DHCP Agent

● IP address allocation is done by the Neutron server

● dnsmasq is used as a distribution mechanism of predefined allocations

● The DHCP protocol allows multiple DHCP servers to co-exist while serving the same pool

● Configuration in Neutron

neutron.conf:

dhcp_agents_per_network = X

[Diagram: Network Node components: OVS agent, OVS, DHCP agent, dnsmasq, Metadata agent, Metadata proxy, L3 agent, keepalived]
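The reason several DHCP agents can serve one network safely is that none of them decides addresses on its own: each only hands out allocations already made by the Neutron server. The sketch below models that idea. `schedule_dhcp_agents`, `DhcpServer` and `request_address` are hypothetical names for illustration; Neutron's real scheduler and dnsmasq behave in far more detail than this.

```python
from typing import Dict, List


def schedule_dhcp_agents(agents: List[str], agents_per_network: int) -> List[str]:
    """Pick which agents host a network (hypothetical: simply the first N).

    Only illustrates the effect of dhcp_agents_per_network.
    """
    if agents_per_network < 1:
        raise ValueError("agents_per_network must be at least 1")
    return agents[:agents_per_network]


class DhcpServer:
    """Stand-in for one dnsmasq instance serving predefined allocations."""

    def __init__(self, name: str, allocations: Dict[str, str]):
        self.name = name
        self._allocations = allocations  # MAC -> IP, decided centrally
        self.alive = True

    def offer(self, mac: str) -> str:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self._allocations[mac]


def request_address(servers: List[DhcpServer], mac: str) -> str:
    """Return the answer of the first live server; all would answer the same."""
    for server in servers:
        if server.alive:
            return server.offer(mac)
    raise RuntimeError("no DHCP server answered")


if __name__ == "__main__":
    allocations = {"fa:16:3e:00:00:01": "10.0.0.5"}  # illustrative values
    hosts = schedule_dhcp_agents(["net-node-1", "net-node-2", "net-node-3"], 2)
    servers = [DhcpServer(host, allocations) for host in hosts]
    servers[0].alive = False  # lose one agent
    print(request_address(servers, "fa:16:3e:00:00:01"))
```

Because every instance shares the same table, losing one agent does not change the address a VM receives.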

Process Monitoring

● Dynamic process creation: dnsmasq, keepalived, metadata proxy, etc.
● ProcessMonitor periodically checks that child processes are alive
● Optional actions:
  – Respawn process
  – Exit agent
  – Notify (not available yet)
● Default configuration

check_child_processes_action = respawn
check_child_processes_period = 0
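The respawn and exit actions can be sketched with a small monitor around `subprocess.Popen`. This is a minimal sketch, not Neutron's ProcessMonitor: the `ChildProcessMonitor` class and its methods are hypothetical, and a real agent would run the check on a timer rather than call it by hand.

```python
import subprocess
import sys
from typing import Callable, Dict, List


class ChildProcessMonitor:
    """Toy monitor: finds dead children and respawns them or exits the agent."""

    def __init__(self, action: str = "respawn"):
        if action not in ("respawn", "exit"):
            raise ValueError("action must be 'respawn' or 'exit'")
        self._action = action
        self._spawners: Dict[str, Callable[[], subprocess.Popen]] = {}
        self._children: Dict[str, subprocess.Popen] = {}

    def register(self, name: str, spawn: Callable[[], subprocess.Popen]) -> None:
        """Start a child via spawn() and remember how to start it again."""
        self._spawners[name] = spawn
        self._children[name] = spawn()

    def child(self, name: str) -> subprocess.Popen:
        """Return the current process object for a registered child."""
        return self._children[name]

    def check_once(self) -> List[str]:
        """Run one liveness check; return the names of children found dead."""
        # poll() returns None while the process is still running.
        dead = [n for n, p in self._children.items() if p.poll() is not None]
        for name in dead:
            if self._action == "exit":
                raise SystemExit(f"child process {name} died")
            self._children[name] = self._spawners[name]()
        return dead


if __name__ == "__main__":
    monitor = ChildProcessMonitor(action="respawn")
    # A child that exits immediately stands in for a crashed dnsmasq.
    monitor.register("short-lived", lambda: subprocess.Popen([sys.executable, "-c", "pass"]))
    monitor.child("short-lived").wait()
    print("respawned:", monitor.check_once())
    monitor.child("short-lived").wait()
```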

What Else?

[Diagram: Network Node components: OVS agent, OVS, DHCP agent, dnsmasq, Metadata agent, Metadata proxy, L3 agent, keepalived]

VRRP (Virtual Router Redundancy Protocol)

● Providing HA of the network’s default gateway

● Configuring default gateway as VIP + Virtual MAC

● Gratuitous ARP after failover

[Diagram label: Sync Net]
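The three points above can be illustrated with a small simulation: the highest-priority live router owns the gateway VIP and virtual MAC, and after a failover the new master announces itself. This is a sketch under simplifying assumptions, not keepalived's logic: `VrrpRouter`, `elect_master` and `announce` are hypothetical names, priority ties are broken by list order rather than by IP address, and the "gratuitous ARP" is just a formatted string. The virtual MAC format `00:00:5e:00:01:{VRID}` is the one VRRP defines for IPv4.

```python
from dataclasses import dataclass
from typing import List, Optional


def vrrp_virtual_mac(vrid: int) -> str:
    """IPv4 VRRP virtual MAC address for a virtual router ID (1..255)."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in 1..255")
    return f"00:00:5e:00:01:{vrid:02x}"


@dataclass
class VrrpRouter:
    name: str
    priority: int
    alive: bool = True


def elect_master(routers: List[VrrpRouter]) -> Optional[VrrpRouter]:
    """Highest-priority live router wins; None if every router is down."""
    live = [router for router in routers if router.alive]
    return max(live, key=lambda router: router.priority, default=None)


def announce(routers: List[VrrpRouter], vip: str, vrid: int) -> Optional[str]:
    """Describe the gratuitous ARP the current master would send."""
    master = elect_master(routers)
    if master is None:
        return None
    return f"{master.name}: gratuitous ARP {vip} is-at {vrrp_virtual_mac(vrid)}"


if __name__ == "__main__":
    # Illustrative names, priorities and addresses.
    routers = [VrrpRouter("net-node-1", 100), VrrpRouter("net-node-2", 50)]
    print(announce(routers, "10.0.0.1", vrid=1))
    routers[0].alive = False  # the master fails
    print(announce(routers, "10.0.0.1", vrid=1))
```

Because the VIP and virtual MAC stay the same across the failover, hosts keep their default gateway configuration; the gratuitous ARP only updates where the network delivers that MAC.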

L3 HA Implementing VRRP

● Using keepalived which internally implements VRRP

● Creating a per tenant HA network, used for VRRP sync messages

● When an HA router is created, it is scheduled on multiple network nodes (configurable)

● New in Kilo

– Report which network node is hosting the master instance

● In the works

– L3 HA + l2pop

– External interface tracking

– L3 HA+DVR

Traffic Flow 3-tier Application

[Diagram: Host 1 (WWW VM), Host 2 (App VM), Host 3 (DB VM), and the Network Node hosting the Virtual Router]

DVR – Distributed Virtual Router

● DVR moves most of the routing to the compute nodes

– Isolating the failure domain of the network node

– Optimizing the network flow

● Traffic types

– East – West (Within the tenant, different networks)

– North – South with floating IP (VM to/from external network)

– North – South without floating IP (Based on SNAT)

Direct between compute nodes

Through network node
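The traffic types above differ in where routing happens under DVR. The sketch below encodes that mapping as commonly described for DVR: east-west and floating IP traffic are routed on the compute node, while SNAT traffic still goes through the network node. The `Traffic` enum and `routing_location` function are hypothetical names used only for illustration.

```python
from enum import Enum


class Traffic(Enum):
    EAST_WEST = "east-west"
    NORTH_SOUTH_FLOATING_IP = "north-south with floating IP"
    NORTH_SOUTH_SNAT = "north-south without floating IP (SNAT)"


def routing_location(traffic: Traffic) -> str:
    """Return which node type routes the given traffic type under DVR."""
    if traffic is Traffic.NORTH_SOUTH_SNAT:
        # SNAT remains centralized, so the network node is still in this path.
        return "network node"
    # East-west and floating IP traffic are handled by the local router
    # on the compute node that hosts the VM.
    return "compute node"


if __name__ == "__main__":
    for traffic in Traffic:
        print(f"{traffic.value}: routed on the {routing_location(traffic)}")
```

This is why DVR narrows, but does not remove, the network node's failure domain: only VMs without a floating IP depend on it for external access.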

Architecture - Assuming DVR

[Diagram: node types connected by the External Network (Internet), API Network, Management Network and Data Network]

● Controller Node: Neutron server, MySQL server, RabbitMQ server
● Network Node: OVS agent, OVS, L3 agent, keepalived, DHCP agent, dnsmasq, Metadata agent, Metadata proxy
● Compute Node: OVS agent, OVS, L3 agent, Metadata agent, Metadata proxy

Summary

● No one-stop shop

● Maximize the use of built-in solutions

– They are vendor neutral

– Highly maintained

– Widely documented

● Understand what you need, use the appropriate tools

– DVR vs VRRP

– What size is your deployment? Maybe A/P is good enough...

● The more complicated the solution is, the more likely it is to have bugs

Thank You

Resources

● http://assafmuller.com

● http://specs.openstack.org/openstack/neutron-specs/specs/kilo/agent-child-processes-status.html

● https://github.com/beekhof/osp-ha-deploy/blob/master/ha-openstack.md

● https://docs.google.com/document/d/1jCmraZGirmXq5V1MtRqhjdZCbUfiwBhRkUjDXGt5QUQ/edit

● https://www.youtube.com/watch?v=00j1x-T1vhA