Can you trust Neutron?
DESCRIPTION
A tour of scalability improvements between Havana and Juno. The presentation discusses results from an experimental campaign and the various features that enable the scalability improvements. Presentation by Aaron Rosen and Salvatore Orlando.
TRANSCRIPT
Can you trust Neutron?
A tour of scalability and reliability improvements from Havana to Juno
Salvatore Orlando (@taturiello)
Aaron Rosen (@aaronorosen)
From Havana to Juno
● 12 months
● 1672 commits
● +147765 -70127 lines of code
(excluding changes in neutron/locale/*)
But... did it really get any better?
Measuring scalability - Process
● Goal: Validate agent scalability under varying load
o In this talk we’ll discuss the L2 agent only, sorry!
● Testbed: single server OpenStack installation
● Methodology: run several experiments, increasing the number of servers concurrently created
o Number of servers ranging from 1 to 20
o Every experiment is repeated 20 times
o For each metric, study mean, median, and variance
Measuring scalability - Metrics
Instance metrics (t_start = instance created):
● t_active - time until the instance reaches active state
● t_ping - time until the instance can be pinged
● t_allocate_net - time spent configuring networking for the instance
Port metrics (t_start = VIF plugged):
● t_proc: time until the agent starts processing the port
● t_up: time until the port is wired
● t_dhcp: time for adding DHCP info for the new port
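To make the methodology concrete, here is a minimal sketch of the kind of measurement harness described above (all function names are illustrative, not the actual code used for this campaign): boot batches of increasing size, repeat each batch 20 times, and summarize every metric.

    import statistics

    def summarize(samples):
        # Mean, median, and variance for one metric across repetitions
        return {
            "mean": statistics.mean(samples),
            "median": statistics.median(samples),
            "variance": statistics.variance(samples),
        }

    def run_campaign(boot_batch, batch_sizes=range(1, 21), repetitions=20):
        # boot_batch(n) is assumed to boot n servers concurrently and return
        # a dict mapping metric name (t_active, t_ping, ...) to a list of
        # per-instance timings in seconds.
        results = {}
        for n in batch_sizes:
            per_metric = {}
            for _ in range(repetitions):
                for metric, timings in boot_batch(n).items():
                    per_metric.setdefault(metric, []).extend(timings)
            results[n] = {m: summarize(s) for m, s in per_metric.items()}
        return results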
Measuring scalability - Results
t_up in Havana and Juno - a rather remarkable difference!
Measuring scalability - Results
t_allocate_net almost constant in Juno
Growth trend is only 15% of the one seen in Havana
Measuring scalability - Results
● VM failure rate analysis
o Failure == error while creating the VM, or inability to ping it within a 3 minute timeout
● Juno is, if not infallible, decently reliable (Havana not as much…)
Analysing progress
[Timeline graphic: Folsom, Grizzly, Havana, Icehouse, Juno]
How the software improved
● Boot VMs only once network is wired
● Remove choke points from L2 agents
● Streamline security group RPC
● Better router processing in L3 agents
● Reporting floating IP processing status
● many others… which unfortunately won’t fit into the time allocated to this talk
More results
● Virtually no improvement in time to ping an instance
- As the tests are executed on a single host, I/O contention between instances is the main bottleneck
- “Time to ping” is slowed down by longer instance boot times
● Instances are slower to go to “ACTIVE” than they were in Havana
- This is actually a desired feature
- Indeed, it is the reason why the failure rate in Juno is 0 even with 20 concurrent instances
Nova/Neutron Event reporting
Problem: Nova displays cached IPAM info about the instance from Neutron, and the cache is updated slowly…
1. Associate floating IP to port (request to neutron-api)
2. “Show me instance!” (request to nova-api)
Wat? No floating ip?
Nova/Neutron Event reporting
Solution: Neutron sends events to Nova on IPAM changes, causing Nova to update its cache.
1. Associate floating IP to port (request to neutron-api)
2. neutron-api sends a network-changed event for instance X to nova-api
3. nova-api dispatches the event to the compute host
4. nova-compute updates the network cache for instance X
5. “Show me instance!”
I haz floating ip
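On the wire, this notification goes through Nova’s os-server-external-events API. Below is a minimal sketch of such a call (the endpoint and event name are the ones this mechanism uses; the URL, token handling, and function name are illustrative):

    import requests

    def notify_nova_network_changed(nova_url, token, server_uuid):
        # Tell nova that neutron changed networking info for an instance,
        # so nova can refresh its cached network info.
        payload = {"events": [{"name": "network-changed",
                               "server_uuid": server_uuid}]}
        resp = requests.post(nova_url + "/os-server-external-events",
                             json=payload,
                             headers={"X-Auth-Token": token})
        resp.raise_for_status()
        return resp.json()

The network-vif-plugged event discussed next travels over the same API, with the port id carried in the event’s "tag" field.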
Nova/Neutron Event reporting
Problem: Instances would go active before the network was wired. Some DHCP clients (such as the one in CirrOS images) don’t keep retrying...
1. Boot instance (request to nova-api)
2. Instance goes ACTIVE (“W00T Active!”) - ready?!?
3. ssh to instance….. Timeout.. Hrm?!?
Nova/Neutron Event reporting
Solution: Neutron sends events to Nova when the network is ready.
1. Boot instance (nova-api, via nova-scheduler, to nova-compute)
2. nova-compute asks neutron-api to allocate the network for the instance
3. The VM is started in a paused state
1B. The Neutron backend reports port X active to neutron-api
2B. neutron-api sends the event network-vif-plugged: port X to nova-api
3B. The VM is unpaused
Enabling/disabling event reporting
Settings in nova.conf
vif_plugging_timeout = 300
vif_plugging_is_fatal = True
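A rough sketch of the semantics these two options control (illustrative pseudologic, not Nova’s actual implementation; the callables are placeholders):

    import threading

    class VifPlugTimeout(Exception):
        pass

    def boot_with_network_wait(start_vm_paused, unpause_vm,
                               vif_plugged: threading.Event,
                               timeout=300, is_fatal=True):
        # Start the VM paused, then wait for neutron's network-vif-plugged
        # event before letting it run.
        start_vm_paused()
        if not vif_plugged.wait(timeout):
            if is_fatal:
                # vif_plugging_is_fatal = True: the boot fails on timeout
                raise VifPlugTimeout("network-vif-plugged never arrived")
            # vif_plugging_is_fatal = False: proceed anyway and hope
        unpause_vm()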
Speeding up L2 interface processing
Problem - device processing delayed by:
- inefficient server/agent interface
- preemptive behaviour of security group callbacks
- pedantic polling of interfaces on the integration bridge
- superficial analysis of devices to process
Solution:
- ovsdb-monitor triggers interface processing only when changes are detected
- The Neutron server performs at most 2 RPC calls over AMQP for each API operation (only 1 call in most cases)
- The L2 agent queries the server only once to retrieve interface details
- Security group updates are processed in the same loop as interfaces, thus avoiding starvation
- The agent only processes interfaces which are ready to be used - and most importantly, processes them only once!
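A simplified sketch of the event-driven loop this describes (the structure is hypothetical, not the actual OVS agent code; the queue is assumed to be fed by ovsdb-monitor notifications):

    import queue

    def agent_loop(changes: queue.Queue, get_details, wire_port, apply_sg_update):
        # changes is fed by ovsdb-monitor, so the loop blocks until
        # something actually changed instead of polling the bridge.
        processed = set()
        while True:
            event = changes.get()
            if event["type"] == "sg_update":
                # Security group updates are handled in the same loop,
                # so neither kind of work can starve the other.
                apply_sg_update(event["ports"])
                continue
            port = event["port"]
            if port in processed or not event.get("ready", True):
                continue                   # process each interface only once
            details = get_details(port)    # a single RPC to the server
            wire_port(port, details)
            processed.add(port)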
Streamlining security group RPCs
Problem - exponential complexity
The payload of the RPC call to retrieve security group rules grows exponentially as the number of devices increases
Solution:
Restructure the format of the payload exchanged between agent and server, removing data redundancy.
With the new payload format, security group rules are no longer repeated (a toy illustration follows below).
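A toy illustration of the deduplication idea (the dictionary layout here is invented for the example, not the actual RPC schema):

    # Old-style payload: the full rule list is repeated for every device
    rules = [{"protocol": "tcp", "port_range_min": 22, "port_range_max": 22},
             {"protocol": "icmp"}]
    old_payload = {
        "device-1": {"security_group_rules": rules},
        "device-2": {"security_group_rules": rules},   # same rules again
    }

    # New-style payload: devices reference groups, rules are sent once
    new_payload = {
        "devices": {"device-1": ["sg-default"], "device-2": ["sg-default"]},
        "security_groups": {"sg-default": rules},      # no repetition
    }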
Streamlining security group RPCs
Credits: Miguel Angel Ajo Pelayohttp://www.ajo.es/post/95269040924/neutron-security-group-rules-for-devices-rpc-rewrite
[Charts: RPC message payload size vs # of ports; RPC execution time vs # of ports]
Reducing router processing times
Problems:
● Router synchronization starves RPC handling
● Not enough parallelism in router and floating IP processing
Solution:
● Router synchronization tasks and RPC messages are added to a priority queue. Items pulled from the queue are processed in separate threads (see the sketch below).
● Apply iptables commands in a non-blocking fashion
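A minimal sketch of that priority-queue pattern (simplified; the real agent’s queue, priorities, and threading model are more involved):

    import queue
    import threading

    work = queue.PriorityQueue()
    RPC, FULL_SYNC = 0, 1          # RPC updates jump ahead of full syncs

    def process_router(router_id):
        print("processing", router_id)   # stand-in for real router wiring

    def worker():
        while True:
            _priority, router_id = work.get()
            process_router(router_id)
            work.task_done()

    for _ in range(4):             # several threads drain the queue
        threading.Thread(target=worker, daemon=True).start()

    work.put((FULL_SYNC, "router-1"))   # background resync, lower priority
    work.put((RPC, "router-2"))         # RPC-triggered update runs first
    work.join()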
Know your floating IP status
Problem:
There was no way to know whether your floating IP is ready or not (beyond pinging it, obviously)
Solution:
- Introduce the concept of operational status for floating IPs
- The L3 agent calls back the server to confirm successful floating IP creation (ACTIVE) or an error (DOWN)
- The status defaults to DOWN; it goes ACTIVE upon floating IP association, and back to DOWN when the floating IP is disassociated
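The resulting lifecycle can be summed up as a tiny state machine (a hand-written summary of the transitions above, not Neutron code):

    # Floating IP operational status transitions (illustrative)
    TRANSITIONS = {
        ("DOWN", "associate"): "ACTIVE",       # L3 agent confirmed wiring
        ("DOWN", "associate_failed"): "DOWN",  # L3 agent reported an error
        ("ACTIVE", "disassociate"): "DOWN",
    }

    def next_status(status, event):
        return TRANSITIONS.get((status, event), status)

    assert next_status("DOWN", "associate") == "ACTIVE"
    assert next_status("ACTIVE", "disassociate") == "DOWN"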
Other enhancements (in brief)
● Multiple REST API workers
● Multiple RPC over AMQP workers
● Better IP address recycling
● Removal of several locking queries
o i.e.: SELECT ... FOR UPDATE statements
● Removal of conditions triggering LOCK WAIT timeout errors
o bug triggered by eventlet yielding within a transaction
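For reference, the worker counts mentioned above are controlled by settings in neutron.conf (option names as in the Juno release; the values below are just examples):

api_workers = 4
rpc_workers = 4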
Where we are...
● L2 agent scalability has considerably improved over the past 12 months
o Results were measured with OVS only, but the same considerations apply to Linux Bridge as well
● Security groups can now be used even in very large deployments
● Nova/Neutron interface much more reliable
o Boot a server only when the network for it is wired
o Faster, less chatty communication
● Some progress on resource status tracking
o Far from being optimal, but at least now you can know when your floating IP is ready to use...
… and where we want to be
● There is still a lot of room for improvement in the agents
o E.g.: the OVS agent still scans all ports on the integration bridge at each iteration
● The Nova/Neutron interface is better, but still far from ideal
o Enhanced caching on the nova side could avoid a lot of round trips to neutron
● Little to nothing has been done for tracking async operations and resource status. For example:
o there is no way to know whether DHCP info is ready for a port
o security group updates are processed asynchronously, but it is impossible to know when processing completes
Final thoughts
● “Much better” is different from “ideal”
o ≅ 3 seconds for wiring an interface might not be ideal for many applications
o scalability limits should be addressed even if they involve architectural changes
● What about data plane scalability?
● What about API usability?