OpenStack Summit – Hong Kong – 2013: OpenStack HA @PayPal


OpenStack Summit – Hong Kong – 2013

OPENSTACK HA @PAYPAL

ABOUT PAYPAL

PayPal offers flexible and innovative payment solutions for consumers and merchants of all sizes.

• 137,000,000 users

• $300,000 in payments processed each minute

• 193 markets / 26 currencies

• The World’s Most Widely Used Digital Wallet

AGENDA

Why is HA important for PayPal?

Our Learning

Our Solution

What is not solved?

Q&A

WHY IS HA IMPORTANT?

“No perceived downtime” for cloud users

Enterprise Class

Auto Scaling & Flex up/down can never break

API Integrations always succeed

Everyone is expected to use the cloud

AVAILABILITY REQUIREMENTS

No SPOF “Under the Cloud”

Scale Across the Data Center(s)

Scale Across Racks & Containers

Respect natural availability zones within the data centers

No ‘cloud’ can impact any other ‘cloud’

INFRASTRUCTURE RACK

[Diagram labels: 10g Active, 10g Passive, and 1g Mgmt links per rack; LB Active and LB Passive at the Access layer]

Compute Racks … Infrastructure / Controller Racks

Layer 2 versus Layer 3

Cattle & Puppies

INFRASTRUCTURE RACK

OpenStack services all run as VMs on KVM

Every infra component resides on 2+ nodes

Redundant physical racks

Redundant power/switches in each rack

Layer-3 connectivity between racks (no Layer 2)

Enterprise-grade physical LB (floating VIP)
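The redundancy rules on this slide can be expressed as a simple placement check. The following is a minimal sketch only: the inventory format, the component and rack names, and the `find_spofs` helper are all hypothetical, and the rule that the 2+ nodes must sit in different racks is an assumption drawn from reading "2+ nodes" together with "redundant physical racks".

```python
from collections import defaultdict


def find_spofs(placements):
    """Return the components that violate the redundancy rules.

    placements: iterable of (component, rack, node) tuples
                (a hypothetical inventory format).
    A component is flagged when it runs on fewer than 2 nodes,
    or when all of its nodes sit in a single rack (assumed rule).
    """
    nodes = defaultdict(set)
    racks = defaultdict(set)
    for component, rack, node in placements:
        nodes[component].add((rack, node))
        racks[component].add(rack)
    return sorted(
        c for c in nodes if len(nodes[c]) < 2 or len(racks[c]) < 2
    )


# Illustrative inventory; names are made up for the example.
inventory = [
    ("keystone", "rack-a", "node-1"),
    ("keystone", "rack-b", "node-7"),
    ("glance", "rack-a", "node-2"),    # two nodes, but one rack
    ("glance", "rack-a", "node-3"),
    ("rabbitmq", "rack-b", "node-8"),  # single node
]

print(find_spofs(inventory))  # ['glance', 'rabbitmq']
```

A check like this could run against whatever inventory source is available before a change is rolled out; it says nothing about power or switch redundancy inside a rack, which would need its own data.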

COMPUTE

[Diagram: compute nodes behind LB Active and LB Passive at the Access layer. Each compute node is labeled: 96 Hyperscale, 16 Core, 256GB RAM, 1.1T Disk, with 10g Active, 10g Passive, and 1g Mgmt links]

COMPUTE

[Diagram labels: two Top Of Rack switches (Active and Passive); 10g links on bond0; 1g Management links; Hyperscale nodes with RAID-10]

OPENSTACK SERVICES

OPENSTACK CONSIDERATIONS

LB VIP for every service (unless it can’t)

Connect to LB VIP, not individual nodes

Script to close server connections

Pacemaker only works inside a single Layer-2 domain (not across a large enterprise network)

Auto restart using Monit

MySQL

Swift Cluster
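The "auto restart" item can be illustrated with a small supervision pass. This is a sketch of the general idea only, not Monit's configuration language or API: `supervise_once`, the injected `is_healthy` and `restart` callables, and the restart cap are all assumptions made for the example.

```python
def supervise_once(services, is_healthy, restart, max_restarts=3):
    """Run one supervision pass over the given services.

    For each unhealthy service, call restart() until it reports
    healthy or max_restarts attempts have been made.
    Returns (restarts, failed): a dict of attempts per service and
    a list of services that are still unhealthy afterwards.
    """
    restarts = {}
    failed = []
    for name in services:
        attempts = 0
        while not is_healthy(name) and attempts < max_restarts:
            restart(name)
            attempts += 1
        restarts[name] = attempts
        if not is_healthy(name):
            failed.append(name)
    return restarts, failed


# Fake service state for the demo; a real check would probe the process
# or its API endpoint instead.
state = {"nova-api": True, "glance-api": False, "cinder-api": False}
broken = {"cinder-api"}  # simulated service that never comes back


def is_healthy(name):
    return state[name]


def restart(name):
    if name not in broken:
        state[name] = True


restarts, failed = supervise_once(list(state), is_healthy, restart)
print(restarts)  # {'nova-api': 0, 'glance-api': 1, 'cinder-api': 3}
print(failed)    # ['cinder-api']
```

Capping the restart attempts keeps a persistently failing service from being restarted in a tight loop; services left in `failed` would be the ones to escalate to an operator.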

Palanisamy, Anand

CONTINUED…

Heat with Corosync/Pacemaker/keepalived (for now)

Keystone / Nova / Glance / Swift Proxy

RabbitMQ Cluster

Cinder Volume Service

CINDER SERVICES WORKFLOW

The figure shows a typical interaction between Cinder components serving an end-user request (creating a new volume in this example).

[Diagram labels: User request (create volume), Cinder API, Cinder Scheduler, Cinder Volume, AMQP, Storage Back-end 1, Storage Back-end 2; steps numbered 1 to 6]
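The create-volume flow between the three Cinder services can be sketched with in-memory queues standing in for the AMQP bus. This is a simplified model under stated assumptions, not Cinder's code: the function names, the message fields, the back-end capacities, and the "pick the back-end with the most free space" rule are all hypothetical.

```python
import queue

# Stand-ins for the AMQP queues the scheduler and volume services consume.
scheduler_q = queue.Queue()
volume_q = queue.Queue()

# Free capacity in GB per storage back-end (made-up numbers).
backends = {"backend1": 100, "backend2": 250}
volumes = {}


def api_create_volume(name, size_gb):
    """API service: record the request and hand it to the scheduler."""
    volumes[name] = {"size": size_gb, "status": "creating", "backend": None}
    scheduler_q.put({"name": name, "size": size_gb})


def scheduler_step():
    """Scheduler: choose a back-end and forward the request.

    Assumed policy: the back-end with the most free capacity wins.
    No capacity check is done in this sketch.
    """
    req = scheduler_q.get()
    backend = max(backends, key=backends.get)
    volume_q.put({**req, "backend": backend})


def volume_step():
    """Volume service: 'create' the volume on the chosen back-end."""
    req = volume_q.get()
    backends[req["backend"]] -= req["size"]
    volumes[req["name"]].update(status="available", backend=req["backend"])


api_create_volume("vol-1", 50)
scheduler_step()
volume_step()
print(volumes["vol-1"])
# {'size': 50, 'status': 'available', 'backend': 'backend2'}
```

Because each service only talks to a queue, any of the three steps can be run by a different process or host, which is what makes the HA arrangement on the next slide possible.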

CINDER SERVICES WITH HA

How HA is implemented for Cinder components:

• API (stateless) – Load Balancer (A/A or A/P)

• Scheduler (stateless) – Pacemaker, or the queue itself (A/A or A/P)

• Volume – Pacemaker, or the queue itself (A/A or A/P)

[Diagram labels: User request (create volume), Load Balancer, Cinder API A and B, Cinder Scheduler A and B, Cinder Volume A and B, AMQP Cluster, Storage Back-end 1, Storage Back-end 2; steps numbered 1 to 6]
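The "queue itself" option for active/active services can be illustrated with two consumers sharing one queue: when one consumer goes away, the survivor simply receives every remaining message. This is a sketch under assumptions, not Pacemaker or a real AMQP client; the `Worker` class, the round-robin delivery, and the worker names are hypothetical.

```python
from collections import deque


class Worker:
    """A consumer of the shared queue (for example, one service instance)."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.handled = []


def dispatch(messages, workers):
    """Deliver messages round-robin to whichever workers are alive.

    Mimics a broker sharing one queue among connected consumers.
    Raises if no consumer is alive, in which case the messages
    would simply stay queued.
    """
    pending = deque(messages)
    turn = 0
    while pending:
        live = [w for w in workers if w.alive]
        if not live:
            raise RuntimeError("no live consumers; messages stay queued")
        worker = live[turn % len(live)]
        worker.handled.append(pending.popleft())
        turn += 1


a = Worker("cinder-volume-A")
b = Worker("cinder-volume-B")

dispatch(["m1", "m2"], [a, b])  # both alive: work is shared
a.alive = False                 # simulate instance A failing
dispatch(["m3", "m4"], [a, b])  # B picks up everything that follows

print(a.handled)  # ['m1']
print(b.handled)  # ['m2', 'm3', 'm4']
```

This covers only message delivery. A request that instance A had already taken but not finished is a separate problem, which the sketch does not attempt to model.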

UNRESOLVED

VIP-friendly Cinder Volume service

Seamless Upgrade Flip

Failed DB TX Reconciliation

Consistent API Response Time

THANK YOU

cloud@paypal.com

HTTP://GITHUB.COM/PAYPAL/AURORA

SCOTT CARLSON - @RELAXED137

RAJ GEDA

ZHITENG HUANG - IRC: WINSTON-D