OpenStack HA @ PayPal
OpenStack Summit – Hong Kong – 2013
OPENSTACK HA @ PAYPAL
ABOUT PAYPAL
PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes.
• 137,000,000 users
• $300,000 payments processed each minute
• 193 markets / 26 currencies
• The World’s Most Widely Used Digital Wallet
AGENDA
Why is HA important for PayPal?
Our Learning
Our Solution
What is not solved?
Q&A
WHY IS HA IMPORTANT?
“No perceived downtime” for cloud users
Enterprise Class
Auto Scaling & Flex up/down can never break
API Integrations always succeed
Everyone is expected to use the cloud
AVAILABILITY REQUIREMENTS
No SPOF “Under the Cloud”
Scale Across the Data Center(s)
Scale Across Racks & Containers
Respect natural availability zones within the data centers
No ‘cloud’ can impact any other ‘cloud’
INFRASTRUCTURE RACK
[Diagram: compute racks and infrastructure / controller racks. Rack nodes are labeled with 10g Active, 10g Passive, and 1g Mgmt links; an Access tier is labeled LB Active / LB Passive.]
Layer 2 versus Layer 3
Cattle & Puppies
INFRASTRUCTURE RACK
OpenStack services all run as VMs on KVM
Every infra component resides on 2+ nodes
Redundant physical racks
Redundant power/switches in each rack
Layer-3 connectivity between racks (no Layer 2)
Enterprise Grade Physical LB (floating VIP)
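The redundancy rules on this slide (every infra component on 2+ nodes, spread over redundant racks) can be checked mechanically. The following is a minimal sketch in Python, not PayPal's tooling: the function name, the inventory layout, and the node and rack names are all hypothetical, chosen only for illustration.

```python
def find_single_points_of_failure(placement, min_nodes=2, min_racks=2):
    """Return {component: reason} for components that break the redundancy rules.

    placement maps a component name to a list of (node, rack) tuples.
    The thresholds mirror the slide: 2+ nodes, in redundant racks.
    """
    problems = {}
    for component, locations in placement.items():
        nodes = {node for node, _ in locations}
        racks = {rack for _, rack in locations}
        if len(nodes) < min_nodes:
            problems[component] = f"only {len(nodes)} node(s)"
        elif len(racks) < min_racks:
            problems[component] = f"only {len(racks)} rack(s)"
    return problems


# Hypothetical inventory: names are illustrative, not taken from the talk.
placement = {
    "keystone": [("infra-01", "rack-a"), ("infra-02", "rack-b")],
    "glance": [("infra-01", "rack-a"), ("infra-03", "rack-a")],
    "mysql": [("infra-02", "rack-b")],
}

print(find_single_points_of_failure(placement))
```

In this made-up inventory, glance is flagged because both of its nodes share one rack, and mysql because it runs on a single node.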
COMPUTE
[Diagram: four compute nodes, each labeled "Compute Node 96 Hyperscale, 16 Core, 256GB RAM, 1.1T Disk", with 10g Active, 10g Passive, and 1g Mgmt links; LB Active / LB Passive pairs at the Access tier; callouts 1 to 3.]
COMPUTE
[Diagram: two nodes labeled Hyperscale RAID-10, each with bond0 over an Active and a Passive 10g link plus a 1g Management link, and two Top Of Rack switches.]
OPENSTACK SERVICES
OPENSTACK CONSIDERATIONS
LB VIP for every service (unless the service can’t sit behind one)
Connect to LB VIP, not individual nodes
Script to close Server Connections
Pacemaker only works inside a single Layer-2 domain (not across a large enterprise network)
Auto Restart using Monit
MySQL
Swift Cluster
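The rule "connect to the LB VIP, not individual nodes" lends itself to an automated audit of configured service endpoints. The sketch below is one possible way to do that in Python; the function name and every hostname are hypothetical, and the endpoint table is illustrative rather than a real service catalog.

```python
from urllib.parse import urlparse


def endpoints_bypassing_vip(endpoints, vip_hosts):
    """Return the names of services whose endpoint host is not a known VIP."""
    return sorted(
        name
        for name, url in endpoints.items()
        if urlparse(url).hostname not in vip_hosts
    )


# Hypothetical VIP hostnames and endpoint table, for illustration only.
vip_hosts = {"keystone-vip.example.internal", "nova-vip.example.internal"}
endpoints = {
    "identity": "http://keystone-vip.example.internal:5000/v2.0",
    "compute": "http://nova-vip.example.internal:8774/v2",
    "image": "http://infra-03.example.internal:9292/v1",  # points at a node
}

print(endpoints_bypassing_vip(endpoints, vip_hosts))
```

Here only the "image" entry is reported, because its URL names an individual node instead of a VIP.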
OPENSTACK CONSIDERATIONS (CONTINUED)
Heat with Corosync/Pacemaker/keepalived (for now)
Keystone / Nova / Glance / Swift Proxy
RabbitMQ Cluster
Cinder Volume Service
CINDER SERVICES WORKFLOW
The figure shows a typical interaction between Cinder components to serve an end-user request (creating a new volume in this example).
[Diagram: a user request (create volume) flows in numbered steps 1 to 6 through Cinder API, Cinder Scheduler, and Cinder Volume, which communicate over AMQP, to Storage Back-end 1 and Storage Back-end 2.]
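The create-volume flow in the figure can be approximated with a toy simulation. This is a sketch under stated assumptions, not Cinder code: in-process queues stand in for AMQP, the function name and message fields are hypothetical, and the "most free capacity" placement rule is a simplification chosen for the example rather than the scheduler's actual policy.

```python
import queue


def create_volume(size_gb, backends):
    """Simulate API -> scheduler -> volume service for one create request.

    backends maps a back-end name to free capacity in GB (made-up data).
    Returns (chosen_backend, trace), where trace lists the steps taken.
    """
    scheduler_q = queue.Queue()  # stands in for the scheduler's AMQP queue
    volume_q = queue.Queue()     # stands in for the volume service's AMQP queue
    trace = []

    # API: accept the user request and hand it to the scheduler via the queue.
    trace.append("api: received create volume")
    scheduler_q.put({"size_gb": size_gb})

    # Scheduler: pick a back end that fits (assumed rule: most free capacity).
    request = scheduler_q.get()
    candidates = {
        name: free for name, free in backends.items()
        if free >= request["size_gb"]
    }
    if not candidates:
        raise RuntimeError("no back end has enough free capacity")
    chosen = max(candidates, key=candidates.get)
    trace.append(f"scheduler: chose {chosen}")
    volume_q.put({"size_gb": request["size_gb"], "backend": chosen})

    # Volume service: consume the message and allocate on the back end.
    job = volume_q.get()
    backends[job["backend"]] -= job["size_gb"]
    trace.append(f"volume: created {job['size_gb']} GB on {job['backend']}")
    return chosen, trace


backends = {"backend1": 100, "backend2": 500}
chosen, trace = create_volume(50, backends)
print(chosen)
print("\n".join(trace))
```

Because each stage only talks to a queue, any stage can be run on more than one node without the others knowing, which is what makes the queue itself usable as an HA mechanism.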
CINDER SERVICES WITH HA
How HA is implemented for Cinder components:
• API (stateless) – Load Balancer (A/A or A/P);
• Scheduler (stateless) – Pacemaker, Queue itself (A/A or A/P);
• Volume – Pacemaker, Queue itself (A/A or A/P).
[Diagram: a user request (create volume) flows in numbered steps 1 to 6 through a Load Balancer to Cinder API A and Cinder API B, then through Cinder Scheduler A and B and Cinder Volume A and B, which communicate over an AMQP cluster, to Storage Back-end 1 and Storage Back-end 2.]
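The difference between the A/A (active/active) and A/P (active/passive) modes listed above can be shown with two small member-selection policies. This is an illustrative Python sketch, not load balancer or Pacemaker configuration; the class names and the member names are hypothetical, and health is tracked in a plain set rather than by real health checks.

```python
import itertools


class ActiveActivePool:
    """A/A: rotate requests across all healthy members."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)
        self._cycle = itertools.cycle(self.members)

    def pick(self):
        # Try each member at most once per call, skipping unhealthy ones.
        for _ in range(len(self.members)):
            member = next(self._cycle)
            if member in self.healthy:
                return member
        raise RuntimeError("no healthy members")


class ActivePassivePool:
    """A/P: always use the first healthy member in priority order."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)

    def pick(self):
        for member in self.members:
            if member in self.healthy:
                return member
        raise RuntimeError("no healthy members")


members = ["cinder-api-a", "cinder-api-b"]  # hypothetical names

aa = ActiveActivePool(members)
print([aa.pick() for _ in range(4)])  # alternates between a and b

ap = ActivePassivePool(members)
print(ap.pick())                      # cinder-api-a while it is healthy
ap.healthy.discard("cinder-api-a")    # simulate a failure of the active node
print(ap.pick())                      # fails over to cinder-api-b
```

In A/A both nodes carry traffic all the time, so a failure only reduces capacity; in A/P the standby carries nothing until the active node is marked unhealthy.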
UNRESOLVED
VIP-friendly Cinder Volume service
Seamless Upgrade Flip
Failed DB TX Reconciliation
Consistent API Response Time
THANK YOU
HTTP://GITHUB.COM/PAYPAL/AURORA
SCOTT CARLSON - @RELAXED137
RAJ GEDA
ZHITENG HUANG - IRC: WINSTON-D