vCRIB: Virtual Cloud Rule Information Base
Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan
HotCloud 2012
2Introduction Motivation Architecture Evaluation Conclusion
Introduction
Datacenters use rules to implement management policiesDatacenters use rules to implement management policies
• Access control• Rate limiting• Traffic measurement • Traffic engineering
Datacenters use rules to implement management policies
R1
Src IP
Dst
IP R1=Accept SrcIP: 12.0.0.0/8 DstIP: 10.0.0.0/16
An action on a hypercube of flow spaceAn action on a hypercube of flow space
Examples:• Deny• Normal• Enqueue
Flow fields examples:• Src IP / Dst IP• Protocol• Src Port / Dst Port
3Introduction Motivation Architecture Evaluation Conclusion
Agg1 Agg2
ToR1 ToR2 ToR3
Agg1 Agg2
ToR1 ToR2 ToR3
Current Practice
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
R0R4
R3
Rules are saved on predefined fixed machines
Multiple policies may compete for resources
Machines have limited resources
Datacenters have different resource constraints
1
2
3
4Introduction Motivation Architecture Evaluation Conclusion
Agg1 Agg2
ToR1 ToR2 ToR3
vCRIB Goal: Flexible Rule Placement
Find the best feasible rule placement based on resource constraints
R0R4
R3
R0R4
R3
R0R4
R3
A
B C
5Introduction Architecture Evaluation ConclusionMotivation
Future Datacenters will have many fine-grained rules
Regulating VM pair communication• Access control (CloudPolice)• Bandwidth allocation (Seawall)
Per flow decision• Flow measurement for traffic engineering (MicroTE, Hedera)
VLAN per server• Traffic management (NetLord, Spain)
100K – 1M
10 – 100M
1M
6Introduction Architecture Evaluation ConclusionMotivation
Where to place rules? Hypervisors vs. Switches
Hypervisor Switch
Limited CPU budget
Software, Slow
Complex rules
# TCAM entries
Hardware, Fast
OpenFlow rules
Performance
Close to VMs External traffic, Aggregate traffic
Flexibility
Entry point
Resources
7Introduction Architecture Evaluation ConclusionMotivation
Agg1 Agg2
ToR1 ToR2 ToR3
Rule Location Trade-off (Resource vs. Bandwidth Usage)
Storing rules at hypervisor incurs CPU processing overhead
8Introduction Architecture Evaluation ConclusionMotivation
Agg1 Agg2
ToR1 ToR2 ToR3
Rule Location Trade-off (Resource vs. Bandwidth Usage)
Move the rule to ToR switch and forward trafficSaving the rules at hypervisor uses all CPU budget
9Introduction Architecture Evaluation ConclusionMotivation
Can we reduce Open vSwitch CPU usage?
400 500 600 700 800 900 1024100
1000
10000
100000
1000000
CPU=100% CPU=50%
Wildcard Pattern
Ru
les
CPU usage
# wildcard patterns
# rules
The set of ignore bits in the maskR1=Accept, DstIP: 10.0.0.0/16, SrcIP: 12.0.0.0/81111111111111111****************, 11111111************************
Handle same number of new flows with lower CPU budget
10Introduction Architecture Evaluation ConclusionMotivation
Agg1 Agg2
ToR1 ToR2 ToR3
Rule Location Trade-off (Resource vs. Bandwidth Usage)
If rule memory is limited in one switch
11Introduction Architecture Evaluation ConclusionMotivation
Agg1 Agg2
ToR1 ToR2 ToR3
Rule Location Trade-off (Resource vs. Bandwidth Usage)
just move the rule to another switchCan tradeoff bandwidth within the switch fabric,
in addition to trading-off bandwidth between hypervisors and switches
12Introduction Motivation Evaluation ConclusionArchitecture
Our Approach: vCRIB, a Virtual Cloud Rule Information Base
vCRIB
R1 R2
R3 R4
Network State
Allow operators to define fine-grained rules without worrying about placement
Proactive rule placement abstraction layer
Optimize performance given resource constraints
Flexible rule placement at hypervisors and switches
Agg1 Agg2
ToR1 ToR2 ToR3
Rules
13Introduction Motivation Evaluation ConclusionArchitecture
Challenges: Overlapping Rules
vCRIB
Network State Agg1 Agg2
ToR1 ToR2 ToR3
Rules
R1 R2
R3 R4R1 R2
R3 R4
14Introduction Motivation Evaluation ConclusionArchitecture
Challenges: Overlapping Rules
R1 R2
R3 R4
R0
R4R3
R2
R1
Src IPD
st IP R1
ToR1
15Introduction Motivation Evaluation ConclusionArchitecture
R0
R4R3
R2
R1 R5
Challenges: Overlapping Rules
R0R2
R6
R1 R3
R7
R4R8R6
R4R8
R5
R3
R7R0R2
R1
Partitions rules to reduce overlapping rules dependency
Splitting rules covering multiple partitions causes inflation
ToR1
16Introduction Motivation Evaluation ConclusionArchitecture
vCRIB: Partitioning
Recursively cut partitions to create a BSP tree
R6
R4
R8
R5
R3
R7
R0R2
R1
R0
R4
R3R2
R1
R5
R4
R3
R7
Select a cut that• balances two partitions• creates fewest number of new rules
Smaller partitions• are more flexible to place• match fewer communicating VMs
Stop whenever a resource at a node is exhausted
17Introduction Motivation Evaluation ConclusionArchitecture
Challenges: Placement Complexity
R6
R4
R8
R5
R3
R7
R0R2R1
ToR1
Constraints• Functionality• Machine resources
Goal• Minimize traffic overhead• Minimize delay• Minimize cost of bandwidth usage
vs. saved CPU
Generalized Assignment Problem
T11
T21 T22
T23
T32
T33
• Different partition sizes• Different machine capacities• Different traffic overhead for
each partition location
18Introduction Motivation Evaluation ConclusionArchitecture
vCRIB: Placement (Branch and Bound)
Select the largest unassigned partition
• Capable of handling its rules Functionality Resources
• Make minimum traffic overhead
Place it on a switch/hypervisor
19Introduction Motivation Evaluation ConclusionArchitecture
vCRIB Architecture
R6
R4
R8
R5R3
R7
R0R2
R1
R0
R4
R3
R2
R1
R5
R4
R3
R7
Partitioning Placement
R6
R4
R8
R5
R3
R7
R0R2R1
ToR1
PartitionsTraffic and Topology information
Rule Placement
vCRIB Manager
Rules
20Introduction Motivation Architecture ConclusionEvaluation
Evaluation: Goal
Can partitioning algorithm achieve small partitions?
Can placement algorithm leverage resource availability to decrease traffic overhead?
Configuration• 100 VMs per machine• 10K flows (10KB) per machine• ClassBench rules• 1K rule capacity per switches
Agg1 Agg2
ToR1 ToR2 ToR3
21Introduction Motivation Architecture ConclusionEvaluation
Evaluation: Partitioning
2.5 5 7.5 101
10
100
1000
4K 8K 16K 32K
Hypervisors Rule Capacity (K)
Maxim
um
Part
itio
n S
ize
Change rule capacity to show the effect of different CPU budgets
Maximum size of partitions goes down as resources increase
22Introduction Motivation Architecture ConclusionEvaluation
Evaluation: Placement
2.5 5 7.5 10500
550
600
650
700
750
800
Rnd-4K Agg-4K
Hypervisor Rule Capacity (K)
Netw
ork
Tra
ffic
(MB
)
For each machineselect VM addresses from
a contiguous IP range
Traffic decreases as resources increase
Aggregated addresses make lower traffic overhead
No traffic decrease
Agg1 Agg2
ToR1 ToR2 ToR3
R5
R3
R7
Replication
23Introduction Motivation Architecture Evaluation Conclusion
Conclusion
vCRIB provides an abstraction layer for placement of rules in datacenters
Places the rules on both hypervisors and switches to achieve the best performance given the resource
constraints
24Introduction Motivation Architecture Evaluation Conclusion
Future Work
Exploit performance model of hypervisors & switches
Online Algorithm adjusting to traffic changes
Replication in the partitioning and placement algorithm
vCRIB: Virtual Cloud Rule Information Base
Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan
HotCloud 2012