Copyright © 2014 NTT DOCOMO, INC. All rights reserved.
Design and Operation of OpenStack Cloud on 100 Physical Servers
NTT DOCOMO Inc. Ken Igarashi
Virtualtech Japan Inc. Hiromichi Ito
NEC Akihiro Motoki
DOCOMO, INC All Rights Reserved
About Us
Ken Igarashi
○ Leading the OpenStack project at NTT DOCOMO
○ One of the first members proposing OpenStack Bare Metal Provisioning (currently called "Ironic") - bit.ly/1stuN2E
Hiromichi Ito
○ CTO of Virtualtech Japan Inc.
Akihiro Motoki
○ Senior Research Engineer, NEC
○ Core developer of Neutron and Horizon
Design OpenStack Cloud
○ Information required
  Ø Hardware resources/performance
    – Management resources
    – User resources
      ü Nova, Cinder – depends on individual use
  Ø Hardware/software configuration
    – High availability
    – Network configuration (e.g. Neutron)
  Ø Deployment tool
    – Juju/MAAS, Fuel, Helion, RDO, etc.
○ How we got it
  Ø Ran a simulation using 100 physical hosts
    – Total: 3200 vCPUs, 12.8TB memory
    – Collaboration with: National Institute of Information and Communications Technology, VirtualTech Japan Inc., NTT Advanced Technology Corporation, Japan Advanced Institute of Science and Technology, the University of Tokyo, and Dell Japan Inc.
Test Environment
○ National Institute of Information and Communications Technology
○ Ishikawa prefecture
○ About 1400 servers in a single site
StarBED – http://bit.ly/10gYttm
○ Research and Development: new locator protocol development, home network protocol development, virtual node migration algorithms, HEMS management protocol, new tunnel protocols, inter-AS traceback
○ Protocol / Product Evaluation: TCP behavior comparison, proxy server performance evaluation, evaluation of X-ray sharing, video conference protocol switching, firewall benchmarking
○ Education: security operation competition, cyber-range training, remote hands-on for Asian students, competition of cloud computing ideas
○ Simulation: testbed federation algorithms, supporting software for controlling testbeds, wireless link simulation on wired links, IPv6 support on network testbeds
○ Realistic and flexible experiments based on a bare-wire environment
○ Open to any companies and organizations
100 Physical Servers on StarBED
Compute Node x 36
Leaf Switch (S4810)
Leaf Switch (S4810)
Leaf Switch (S4810)
Spine Switch (S6000)
Spine Switch (S6000)
Compute Node x 37
40Gb x 2
10Gb x 4
LB (BIG-IP 5200V) x 2
Leaf Switch (S4810) 10Gb x 4
Management Servers x 21
10Gb x 1
10Gb x 1 10Gb x 1
10Gb x 1
Compute Node x 6
40Gb x 2 40Gb x 2 40Gb x 2
10Gb x 1 10Gb x 1
10Gb x 1 10Gb x 1 10Gb x 1 10Gb x 1
○ OpenStack Icehouse
DOCOMO, INC All Rights Reserved
Network Redundancy
○ Multi-Chassis Link Aggregation (MLAG)
  Ø The two switches run MLAG with VRRP; the host bonds eth1 and eth2 into bond0 (z.z.z.z)
  Ø Pro: maturity / Con: needs expensive switches
○ End-host Equal Cost Multi-Path (ECMP)
  Ø The host runs a routing protocol, holding its address on lo (z.z.z.z) behind eth1 (x.x.x.x) and eth2 (y.y.y.y); the switches do ECMP
  Ø Pro: removes network complexity / Con: maturity
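The MLAG side of the comparison above relies on standard Linux bonding toward the switch pair. A minimal sketch, assuming a Debian-style /etc/network/interfaces and LACP configured on the switches (interface names and the z.z.z.z address are placeholders from the slide, not our exact files):

```
# /etc/network/interfaces - illustrative bonding stanza for the MLAG design
auto bond0
iface bond0 inet static
    address z.z.z.z
    netmask 255.255.255.0
    bond-slaves eth1 eth2
    bond-mode 802.3ad      # LACP toward the MLAG switch pair
    bond-miimon 100
    bond-lacp-rate fast
```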
Neutron Configuration
○ Virtual network creation is essential to increase network security
  Ø ML2 with a tunnel network configuration
    – Type drivers: VXLAN, GRE
    – We chose VXLAN
      ü VXLAN uses MAC-in-UDP (MAC Address-in-User Datagram Protocol) encapsulation
      ü The load balancing algorithm works effectively by hashing the UDP port number
      ü Much network hardware supports VXLAN
  Ø Mechanism drivers: Open vSwitch (OVS), Linux Bridge
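The ML2/VXLAN choice above maps to a configuration along these lines (a sketch with illustrative values such as the VNI range, not our exact files):

```
# /etc/neutron/plugins/ml2/ml2_conf.ini (illustrative)
[ml2]
type_drivers = vxlan
tenant_network_types = vxlan
mechanism_drivers = openvswitch

[ml2_type_vxlan]
vni_ranges = 1:10000

[agent]
tunnel_types = vxlan
```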
Throughput for Different Network Configurations
○ Throughput between 1 VM and 1 VM on different physical hosts (1 TCP connection)
  Ø Not much difference between OVS and Linux Bridge
  Ø MLAG gets better performance than ECMP
[Chart: throughput in Gbps (3.4-4.6 range) for ovs_mlag, ovs_ecmp, bridge_mlag]
Throughput for Different Network Configurations
○ MLAG with OVS seems the best configuration today
  Ø Performance, potential, stability
[Chart: same throughput comparison (3.4-4.6 Gbps) for ovs_mlag, ovs_ecmp, bridge_mlag]
○ We increased the VM's MTU to 8950 to get this performance; the physical network bandwidth is 20Gbps
Throughput for Different Numbers of VMs
○ Each VM communicates with a random VM on a different physical host (1 connection per VM)
○ It consumes only 50% of the total bandwidth even though all physical resources are allocated to VMs
[Chart: per-VM throughput (0-3.5 Gbps) and physical-host throughput (0-12 Gbps) vs. number of servers (100-477), for VM and PHY at MTU 1500 and MTU 8950]
* PHY: the VMs' total throughput measured at a physical host
Slow Throughput
○ We could get 19Gbps (MTU 1500) between physical hosts
○ Enabling VXLAN
  Ø We could get only 10Gbps (MTU 8950)
  Ø The throughput is highly reduced by turning on VXLAN
    – The CPU is overloaded by VTEP software processing (packet encapsulation and de-capsulation)
  Ø VM's CPU load during the communication (top output on server and receiver): vhost threads at 89-98% CPU, qemu-system processes at 43-49% CPU
NIC with VXLAN Offload Support
○ A NIC with VXLAN offload should be able to reduce the CPU load
○ Available devices
  Ø Mellanox ConnectX-3 Pro – world's first VXLAN offload NIC
  Ø Intel X710/XL710 – released September 2014
  Ø Emulex XE102
  Ø QLogic 8300 series – supported since the October 21, 2013 software release
  Ø QLogic NetXtreme II 57800 series – Broadcom sold its NetXtreme II line of 10GbE controllers and adapters to QLogic
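Whether a given NIC/driver combination actually offloads VXLAN segmentation can be checked on the host; a sketch (eth0 is a placeholder; tx-udp_tnl-segmentation is the standard ethtool feature name for UDP tunnel offload):

```shell
# Check for VXLAN (UDP tunnel) segmentation offload on the NIC
ethtool -k eth0 | grep tx-udp_tnl-segmentation
# "tx-udp_tnl-segmentation: on" indicates the offload is active
```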
Throughput using a VXLAN Offload NIC
○ Throughput between VMs on 4 different physical hosts (2 servers, 2 receivers)
○ It can consume 98% of the total physical bandwidth
  Ø VXLAN offload with MTU 8950
[Chart: per-VM throughput (0-4 Gbps) and physical-host throughput (0-25 Gbps) vs. number of servers (10-38), with offload ON/OFF at MTU 1500 and MTU 8950; annotated gains: 3.5 ~ 5.6x and 1.3 ~ 1.4x]
* PHY: the VMs' total throughput measured at a physical host
CPU Load
[Charts: CPU per Gbps (%) vs. number of servers (1-16) on the server (Tx) and receiver (Rx), offload ON vs. OFF; annotations: 27.1% (Tx), 28.5% (Rx)]
VXLAN Offload NIC
○ We got 1.3 ~ 5.5 times the throughput compared to a NIC without offload capability
○ The CPU load on a physical host was reduced by 27 ~ 28%
○ MTU 8950 showed 1.5 ~ 1.6 times better throughput than MTU 1500
  Ø We decided to set MTU 9000 on the physical hosts but deliver MTU 1500 to VMs via the DHCP server
  Ø This lets users extend the MTU themselves
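Delivering MTU 1500 to VMs over DHCP while the physical hosts run MTU 9000 can be done through the DHCP agent's dnsmasq options; a sketch (file paths are the usual packaging defaults, not necessarily ours):

```
# /etc/neutron/dhcp_agent.ini
dnsmasq_config_file = /etc/neutron/dnsmasq-neutron.conf

# /etc/neutron/dnsmasq-neutron.conf
# DHCP option 26 = interface MTU advertised to the VM
dhcp-option-force=26,1500
```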
24/7 Support
○ You need 10-12 people
  Ø 4 groups + α people are required
○ If we can postpone fixing a problem, we only have to work on weekdays
  Ø High availability is the key to achieving this
○ Our design
  Ø Double redundancy for hardware
  Ø Triple redundancy for software ⇒ survives a double failure
High Availability
○ Load-balancer based: an LB pair (load balancing, SSL termination, health checks) fronting the OpenStack APIs, MySQL (Galera: DB1-DB4 + arbitrator), and RabbitMQ
○ Others: Nova (VMs), Neutron agents, PXE/DNS/DHCP, MAAS, Zabbix
MySQL HA
○ 4 nodes + 1 arbitrator
  Ø Read/write to a single node through the LB pair (DB1-DB4 ranked Priority 1-4)
  Ø Quorum-based voting with the arbitrator
  Ø Health check
    – Check TCP port 3306
    – Cluster status: show status like 'wsrep_ready' must return 'ON'
Galera Cluster State Transition
○ Open → Primary → Joiner → Joined [3] → Synced [4]; a node acts as Donor [2] while serving IST or SST
○ WSREP_STATUS = 2 and 4 can't cover all the states (the diagram marks where wsrep_ready = 'ON')
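The gap called out above (states 2 and 4 not covering everything) suggests a health check that combines wsrep_ready with wsrep_local_state. A minimal sketch of that decision logic (the function name and structure are ours, not DOCOMO's actual script):

```python
# Galera wsrep_local_state values:
# 1 = Joining, 2 = Donor/Desynced, 3 = Joined, 4 = Synced
SYNCED = 4

def lb_should_route(wsrep_ready: str, wsrep_local_state: int) -> bool:
    """True if the load balancer may route traffic to this Galera node.

    Checking wsrep_ready alone is not enough: a Donor (state 2) can
    still report wsrep_ready = ON while busy serving an IST/SST, and a
    Joined node (state 3) is not yet fully synced.
    """
    return wsrep_ready == "ON" and wsrep_local_state == SYNCED
```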
MySQL HA
○ Node recovery (step 1)
  Ø The health check (TCP port 3306 and the wsrep_ready cluster status) detects DB1's failure
MySQL HA
○ Node recovery (step 2)
  Ø The designated DB is changed from DB1 to DB2 (DB1's wsrep_ready status goes 'ON' -> 'OFF')
MySQL HA
○ Node recovery (step 3)
  Ø DB1 is resynchronized from DB4 (the lowest-priority node) using IST or SST
    – IST: Incremental State Transfer
    – SST: State Snapshot Transfer
MySQL HA
○ Node recovery (step 4)
  Ø DB1's priority is changed (to the lowest) before it rejoins the cluster
MySQL HA
○ Node recovery (step 5)
  Ø The cluster is back to the normal state (DB1 now has Priority 4; DB2-DB4 have Priority 1-3)
Recovery Time
○ Time for IST
[Chart: recovery time (0-350 s) split into JOINER->JOINED and JOINED->SYNCED phases, under background traffic of 120/240 TPS on a cluster with maximum performance of 340 TPS and 120/240 TPS at a maximum of 1356 TPS]
Recovery Time
○ Time for SST
[Chart: recovery time (0-2,000 s) split into JOINER->JOINED and JOINED->SYNCED phases, under background traffic of 120/240 TPS on a cluster with maximum performance of 340 TPS and 120/240 TPS at a maximum of 1356 TPS]
Disaster Recovery
○ Losing all databases (DB size: 3GB)
[Diagram: recovery sequence - fix the network, restore DB1 from backup, run MySQL with bin-log recovery, then bring DB2-DB4 up from stand-by one by one via SST, each with an already-running node as DONOR, until the cluster reaches a healthy state. Measured times: 11 seconds, 70 seconds, 70 seconds, 98.2 minutes; bin-log recovery took 97.5 minutes for 12 hours of logs]
MAAS-HA
○ MAAS includes DNS, DHCP, and tftp
○ DNS
  Ø Master – slave
○ DHCP (ISC DHCP)
  Ø Replication
    – (Delivering fixed IP addresses through DHCP)
○ MAAS and tftp
  Ø Backed up by a VM: the VM image is kept on storage and activated on failure
RabbitMQ-HA
○ Option 1: add multiple RabbitMQ addresses to the configuration files
  Ø Easy configuration and application-level health monitoring
  Ø At least 3 RabbitMQ hosts (ideally 5) are required to guard against split-brain
○ Option 2: read/write to a single node through the load balancer
  Ø No need to worry about split-brain – 3 RabbitMQ hosts
  Ø Network-level health monitoring
○ cluster_partition_handling = 'autoheal'
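A sketch of the two pieces of configuration implied above (host names are placeholders; the option names are the Icehouse-era oslo.messaging ones):

```
# nova.conf / neutron.conf - option 1: list every broker
rabbit_hosts = mq1:5672,mq2:5672,mq3:5672
rabbit_ha_queues = True

# /etc/rabbitmq/rabbitmq.config - partition handling on the brokers
[{rabbit, [{cluster_partition_handling, autoheal}]}].
```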
Network Setup
○ DHCP agent
  Ø Supports active-active: assign a virtual network to multiple agents
    ü dhcp_agents_per_network = 3 (should be <= 3)
○ L3 agent
  Ø Supports only active-standby
  Ø If it fails, we need to migrate its routers to another agent
○ Metadata agent
  Ø Has no state ⇒ just keep metadata-agent running on all nodes
[Diagram: 3 network nodes (each running l3-agent, dhcp-agent, metadata-agent) and compute nodes, connected by the control plane (Neutron server, message queue) and the VXLAN data plane to the external network]
Monitoring Points
[Diagram: the same topology - 3 network nodes with L3/DHCP agents, compute nodes, Neutron server and message queue on the control plane, VXLAN data plane, external network with GW router - with four checks marked]
[1] PING from the internal net
[2] PING from the external net
[3] Agent state check via REST API
[4] PING from the C-plane
Health Checks against Failures
○ Data plane connectivity – if it fails, users cannot communicate through routers
  Ø [1] Internal network for VXLAN (ping)
  Ø [2] External network (ping)
○ Network agent health check – L3 agent, DHCP agent
  Ø [3] Agent alive state from the Neutron server (REST API agent-list) – each Neutron agent reports its state via the message queue
  Ø [4] Control network connectivity (ping) – if it fails, we are no longer able to control the node
Recovery from Failures
(1) Disable the agents on the failed host
[Diagram: networks (N) and routers (R) scheduled across three network nodes; the agents on the failed node are disabled]
Recovery from Failures
(2) Migrate the networks and routers off the failed host
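Steps (1) and (2) can be sketched with the Icehouse-era neutron CLI (the IDs are placeholders; these commands run against a live deployment and are our reading of the procedure, not the exact tooling used):

```shell
# (1) Disable the agents on the failed host so the scheduler avoids them
neutron agent-update --admin-state-up False <failed-l3-agent-id>
neutron agent-update --admin-state-up False <failed-dhcp-agent-id>

# (2) Migrate each router to a healthy L3 agent...
neutron l3-agent-router-remove <failed-l3-agent-id> <router-id>
neutron l3-agent-router-add <healthy-l3-agent-id> <router-id>

# ...and each network to a healthy DHCP agent
neutron dhcp-agent-network-remove <failed-dhcp-agent-id> <network-id>
neutron dhcp-agent-network-add <healthy-dhcp-agent-id> <network-id>
```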
Recovery from Failures
[Diagram: the migrated networks and routers are now hosted by the two healthy network nodes]
Recovery from Failures
(3) Shut down the NICs (or the node)
Tips: Checking External Network Connectivity
○ Use a dedicated network namespace on the network node for the external-connectivity check
  Ø The network node is reachable from the external network
  Ø Using an IP address inside an isolated namespace avoids exposing the node host itself to the public network
[Diagram: the external bridge on the network node connects the router namespaces and a dedicated checking namespace with its own IP address; the PING check runs toward the GW router, with no access to the host]
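A sketch of how such a checking namespace could be wired to the external bridge (the namespace and device names, the br-ex bridge, and the addresses are all placeholders; the slides do not show the exact commands):

```shell
# Create an isolated namespace and attach it to the external OVS bridge
ip netns add extcheck
ip link add veth-check type veth peer name veth-check-br
ip link set veth-check netns extcheck
ovs-vsctl add-port br-ex veth-check-br
ip link set veth-check-br up
ip netns exec extcheck ip addr add <check-ip>/24 dev veth-check
ip netns exec extcheck ip link set veth-check up

# Periodic health check toward the gateway router, from inside
# the namespace only - the host itself stays unreachable
ip netns exec extcheck ping -c 1 -W 2 <gw-router-ip>
```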
Traffic During Router Migration
○ Throughput from an external node to a VM
○ We injected a control-plane failure and migrated the router to another L3 agent
[Chart: throughput (0-1000 Mbps) vs. elapsed time (1-88 s); traffic is interrupted for about 10 seconds during the migration]
Router Migration Progress
○ Migrated 88 routers from one L3 agent to two other L3 agents
[Chart: number of routers processed (0-100) vs. elapsed time (0:00:00-0:02:18): REST API requested, REST API processed, L3-agent processed, L3-agent processed (aggregated)]
Possible Improvements
○ Integration with the L3-agent HA feature
  Ø It would improve data-plane availability considerably (HA covers internal network failures)
  Ø Monitoring of external network connectivity needs to be improved in L3-HA (no monitoring for the external network today)
  Ø Router migration based on C-plane monitoring is still required
[Diagram: three network nodes, each running an L3 agent hosting HA router instances]
Possible Improvements
○ Integration with Juno Neutron features
  Ø Using the L3-agent HA feature (previous page)
  Ø Leveraging L3-agent auto rescheduling
    – Helps us reduce the number of REST API calls
    – Juno Neutron supports L3-agent rescheduling for routers on inactive agents
    – "admin_state" is not considered for rescheduling ← needs to be improved
○ Possible contributions to Neutron upstream
  Ø DHCP agent auto rescheduling
  Ø LBaaS agent scheduling
    – There is no way to reassign the LBaaS agent for the HAProxy driver
Management Resources
○ Controller (API): 3
○ Message Queue (RabbitMQ): 3 + 2
○ Database (MySQL – OpenStack): 4 + 0.5
○ Neutron Servers: 3
○ Monitoring (Zabbix servers + MySQL): 3
○ Storage (log, backup): xx TB
○ Deployment + etc (MAAS, MongoDB): 2
[Diagram: these management components - controller, RabbitMQ, MySQL, Neutron, Zabbix, log/backup storage, etc - sit alongside the Nova Compute pool]
Scalability Test
○ We measured VM boot time while booting 0-5000 instances
[Chart: average elapsed time to boot an instance (0-50 s) and error rate (0-60%) per 1000-instance bucket (0-1000 ... 4000-5000)]
Database Size - Zabbix

Record sizes and retention:
          size       duration
History   50 bytes   30 days
Trend     128 bytes  90 days
Event     130 bytes  90 days

Estimated size per monitoring target:
                 Servers                           Switch (per port)      Tempest
                 Health Check (30s)  Usage (180s)  Health Check  Usage    System Check
Item number      69                  557           1             24       500
Size (history)   15GB                40GB          687MB         5GB      108MB
Size (trend)     2GB                 15GB          88MB          2GB      138MB
Size (event)*    1GB                 1GB           1GB           1GB      1GB
Total size       18GB                57GB          2GB           8GB      1GB

* Assume 1 event/second. Grand total: 86 GB
Database Size - OpenStack

                       Sep 14 2014   Sep 25 2014
OpenStack related
  Keystone*            1.4GB         1.4GB
  Nova (28k -> 55k)    451MB         856MB
  Neutron (7k -> 9k)   78MB          235MB
  Glance               64MB          89MB
  Heat                 45MB          55MB
  Cinder               39MB          43MB
  Subtotal             2.1GB         2.7GB
MySQL related
  Transaction log      4.1GB         4.1GB
  ibdata1              268MB         268MB
Total size             6.4GB         7.0GB

* Ran "keystone-manage token_flush" every hour
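The hourly token flush noted under the table can be wired up as a cron job; a sketch (the path and user are typical packaging defaults, not necessarily ours):

```
# /etc/cron.d/keystone-token-flush (illustrative)
0 * * * * keystone /usr/bin/keystone-manage token_flush
```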
Deployment Tools
○ We can change the configuration easily (e.g. HA and Neutron)
○ We can use Ansible for deployment and operation

             DOCOMO (Ansible based)  Mirantis Fuel           HP Helion             Canonical Juju/MAAS
MySQL HA     LB + Percona            haproxy + corosync      haproxy + keepalived  haproxy + corosync
                                     + pacemaker + Galera    + Galera              + pacemaker + Percona
RabbitMQ HA  RabbitMQ cluster        Config-file based       Config-file based     Config-file based
             (autoheal) + LB         (pause_minority)        (pause_minority)      (ignore)
LB HA        Commercial products     haproxy (nameserver)    haproxy + keepalived  haproxy + corosync
                                     + corosync + pacemaker                        + pacemaker
Network      Neutron + own HA        Neutron                 Neutron DVR           Neutron
Tips Learned from Scalability Tests
○ Default security group
  Ø An iptables entry is added to / deleted from all VMs whenever you create/delete a VM
    ⇒ ovs-agent became busy when we created many VMs
○ Number of Neutron workers
  Ø neutron.conf
    – api_workers = 'number of cores'
    – rpc_workers = 'number of cores'
  Ø metadata_agent.ini
    – metadata_workers = 'number of cores'
○ Number of file descriptors
  Ø Default: 1024
  Ø RabbitMQ: more than 5,000 connections
  Ø metadata-ns-proxy (L3-agent, dhcp-agent): requests x 2
○ VM creation retry time
  Ø nova.conf
    – scheduler_max_attempts = 1 ⇒ no difference between 1 and 3