Declarative Networking:
Language, Execution and Optimization
Boon Thau Loo (1), Tyson Condie (1), Minos Garofalakis (2), David E. Gay (2), Joseph M. Hellerstein (1), Petros Maniatis (2), Raghu Ramakrishnan (3), Timothy Roscoe (2), Ion Stoica (1)
(1) UC Berkeley, (2) Intel Research Berkeley, (3) University of Wisconsin-Madison
Declarative Networking
Database query language, execution and optimization for the design and implementation of networks
Success of database research:
  70’s – today: Database research has revolutionized data management
  Today: Similar opportunity to revolutionize the Internet architecture
Why now?
Internet faces many challenges today:
  Unwanted, harmful traffic
  Complexity/fragility in Internet routing
  Proliferation of new applications
Efforts at improving the Internet:
  Evolutionary: App-level “Overlay” networks
  Revolutionary: Clean-slate designs
    NSF GENI initiative, FIND program
Opportunity: Software tools that can significantly accelerate network innovation
A Declarative Network
Traditional Networks    Declarative Networks
Network state           Distributed database
Network protocol        Recursive query execution
Network messages        Distributed dataflow
[Figure: a distributed recursive query, shown as dataflows on each node exchanging messages]
Previous Work and Use Cases
Declarative routing [SIGCOMM ’05]:
  Recursive queries as a compact, high-level representation of routing protocols
  Good balance between router extensibility and safety
Beyond routing: Declarative overlays [SOSP ’05]:
  P2 declarative networking system implementation
  Chord overlay network (47 lines)
System being used:
  Code pre-release (http://p2.cs.berkeley.edu)
  Cambridge, Harvard, MPI, Rice, UT-Austin
  Distributed consensus protocols, replication protocols, debugging networks, content-based routing
Focus of this paper:
Important unresolved issues:
  Query language and semantics
  Asynchronous, distributed query processing
  New challenges in query optimizations
  Semantics in dynamic networks
Outline
  Background
  Query language by example
  Query processing
  Optimizations
  Conclusion
Review of Datalog
Datalog rule syntax:
  <result> :- <condition1>, <condition2>, … , <conditionN>.
  Head: <result>
  Body: <condition1>, <condition2>, … , <conditionN>
Types of conditions in body:
  Input tables: link(src,dst) predicate
  Arithmetic and list operations
Head is an output table
  Recursive rules: result of head in rule body
Review: All-Pairs Reachability
Input: link(source, destination)
Output: reachable(source, destination)
  link(a,b): “there is a link from node a to node b”
  reachable(a,b): “node a can reach node b”
R1: reachable(S,D) :- link(S,D)
R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
R1: “For all nodes S, D, if there is a link from S to D, then S can reach D.”
R2: “For all nodes S, D and Z, if there is a link from S to Z, and Z can reach D, then S can reach D.”
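The two reachability rules can be read as a fixpoint computation. The following is a minimal sketch in Python, not the P2 implementation: the function name `reachable` and the set-of-pairs encoding are assumptions made for illustration, and the example links are the symmetric links of a four-node chain a, b, c, d.

```python
def reachable(links):
    """Return every (S, D) pair derivable from rules R1 and R2."""
    links = set(links)
    # R1: reachable(S,D) :- link(S,D)
    reach = set(links)
    while True:
        # R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
        derived = {(s, d) for (s, z) in links for (z2, d) in reach if z == z2}
        if derived <= reach:      # nothing new: fixpoint reached
            return reach
        reach |= derived


# Hypothetical example: a chain a - b - c - d with links in both directions.
links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")]
result = reachable(links)
print(("a", "d") in result)
```

With links in both directions, the rules as written also derive self-pairs such as reachable(a,a), because a can reach b and b can reach a.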
Network Datalog
All-Pairs Reachability
R1: reachable(@S,D) :- link(@S,D)
R2: reachable(@S,D) :- link(@S,Z), reachable(@Z,D)
Location specifier: “@S”
Queries: reachable(@M,N), reachable(@a,N)
Network: a - b - c - d
Input table link, partitioned by @S:
  Node a: (@a,b)
  Node b: (@b,a), (@b,c)
  Node c: (@c,b), (@c,d)
  Node d: (@d,c)
Output table reachable, partitioned by @S:
  Node a: (@a,b), (@a,c), (@a,d)
  Node b: (@b,a), (@b,c), (@b,d)
  Node c: (@c,a), (@c,b), (@c,d)
  Node d: (@d,a), (@d,b), (@d,c)
Implicit Communication
A networking language with no explicit communication:
  R2: reachable(@S,D) :- link(@S,Z), reachable(@Z,D)
  Data placement induces communication
All communication happens among neighbors:
  Link-restricted rules enforced via syntactic restrictions
[Figure: nodes a, b, c and a tuple(@b,…)]
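One way to see how data placement induces communication is to simulate per-node tables and record a message whenever a derived tuple belongs at a different node than the one that computed it. This is a rough sketch, not the P2 runtime. The function `run_reachability`, the message format, and the work queue are all hypothetical. It also assumes links are symmetric, so the node playing Z in R2 can find its neighbors S in its own link table.

```python
from collections import deque


def run_reachability(links):
    """Simulate per-node evaluation; return (reach_at, messages)."""
    nodes = sorted({n for edge in links for n in edge})
    # link(@S,D) tuples are stored at node S.
    link_at = {n: {d for (s, d) in links if s == n} for n in nodes}
    reach_at = {n: set() for n in nodes}
    messages = []   # (sender, receiver, tuple) for each non-local placement
    work = deque()
    # R1 is local: link(@S,D) and reachable(@S,D) both live at S.
    for s in nodes:
        for d in link_at[s]:
            work.append((s, s, d))          # (computed_at, S, D)
    while work:
        origin, s, d = work.popleft()
        if origin != s:
            # The tuple's location specifier names another node: send it.
            messages.append((origin, s, ("reachable", s, d)))
        if d in reach_at[s]:
            continue
        reach_at[s].add(d)
        # Node s now plays Z in R2. Assumption: links are symmetric, so
        # link(@s,n) at s implies link(@n,s) at neighbor n.
        for n in link_at[s]:
            work.append((s, n, d))
    return reach_at, messages


links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")]
reach_at, messages = run_reachability(links)
# Every message travels along an existing link, i.e. between neighbors.
print(all((src, dst) in set(links) for src, dst, _ in messages))
```

No send or receive call appears in the rule itself. In this sketch the messages follow only from where the head tuple is stored.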
Path Vector Protocol Example
Advertisement: entire path to a destination
Each node receives an advertisement, adds itself to the path, and forwards it to its neighbors
Network: a - b - c - d
  c advertises [c,d]
  b advertises [b,c,d]
  Paths to d: path=[c,d], path=[b,c,d], path=[a,b,c,d]
Path Vector in Network Datalog
Input: link(@source, destination)
Query output: path(@source, destination, pathVector)
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=SP2.
  P=SP2: add S to front of P2
Query: path(@S,D,P)
Query Execution
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=SP2.
Query: path(@a,d,P)
Network: a - b - c - d
Neighbor table (link), partitioned by @S:
  Node a: (@a,b)
  Node b: (@b,a), (@b,c)
  Node c: (@c,b), (@c,d)
  Node d: (@d,c)
Forwarding table (path), as execution proceeds:
  @S  D  P
  @c  d  [c,d]
  @b  d  [b,c,d]      path(@b,d,[b,c,d])
  @a  d  [a,b,c,d]    path(@a,d,[a,b,c,d])
Matching variable Z = “Join”
Communication patterns are identical to those in the actual path vector protocol
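The execution trace above can be reproduced with a small sketch of rules R1 and R2. This is illustrative Python, not P2 code. The function `path_vector` and the tuple encoding are assumptions. The loop check that drops any path revisiting a node is an addition that does not appear in the rules on the slides; without it the symmetric links would generate paths forever.

```python
from collections import deque


def path_vector(links):
    """Return the set of (S, D, P) tuples derived by R1 and R2."""
    link_at = {}
    for s, d in links:
        link_at.setdefault(s, set()).add(d)   # link(@S,D) stored at S
    paths = set()
    work = deque()
    # R1: path(@S,D,P) :- link(@S,D), P=(S,D).
    for s, d in links:
        work.append((s, d, (s, d)))
    while work:
        z, d, p2 = work.popleft()
        if (z, d, p2) in paths:
            continue
        paths.add((z, d, p2))
        # R2: path(@S,D,P) :- link(@Z,S), path(@Z,D,P2), P=SP2.
        # The join on Z happens at node z; the result is placed at S.
        for s in link_at.get(z, ()):
            if s in p2:        # assumption: loop check, not in the slide rules
                continue
            work.append((s, d, (s,) + p2))    # add S to front of P2
    return paths


links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")]
paths = path_vector(links)
print(("a", "d", ("a", "b", "c", "d")) in paths)
```

Each new path tuple is produced at Z and stored at its neighbor S, which matches the advertise-and-forward pattern of the path vector protocol.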
Outline
  Background
  Query language by example
  Query Processing
  Optimizations
  Conclusion
Recursive Query Evaluation
Semi-naïve evaluation:
  Iterations (rounds) of synchronous computation
  Results from the ith iteration are used in the (i+1)th
[Figure: Link Table, Path Table (tuples grouped as 1-hop, 2-hop, 3-hop), Network]
Problem: Unpredictable delays and failures
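Semi-naïve evaluation can be sketched as a loop that joins the link table only with the tuples that were new in the previous round. This is a minimal single-process sketch using the reachability rules from earlier slides. The function name and the returned round count are assumptions made for illustration.

```python
def semi_naive_reachable(links):
    """Return (reachable pairs, number of rounds run after R1)."""
    links = set(links)
    reach = set(links)      # round 0: R1 produces the 1-hop tuples
    delta = set(links)      # tuples that were new in the previous round
    rounds = 0
    while delta:
        rounds += 1
        # R2, restricted to last round's new tuples on the recursive side.
        new = {(s, d) for (s, z) in links for (z2, d) in delta if z == z2}
        delta = new - reach  # keep only tuples not seen before
        reach |= delta
    return reach, rounds


# Hypothetical directed chain a -> b -> c -> d.
reach, rounds = semi_naive_reachable([("a", "b"), ("b", "c"), ("c", "d")])
print(sorted(reach))
```

Each round must finish before the next one starts. That barrier is what unpredictable delays and failures make hard to enforce across a network.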
Pipelined Semi-naïve (PSN)
Fully-asynchronous evaluation:
  Computed tuples in any iteration pipelined to next iteration
  Natural for network protocols
Relaxation of semi-naïve
[Figure: Link Table, Path Table, Network]
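The pipelined variant can be sketched by replacing rounds with a queue that is drained one tuple at a time. This is an illustrative sketch only. The function `psn_reachable` and its `newest_first` flag are hypothetical, and duplicates are suppressed here with a plain set-membership test, which is an assumption of the sketch.

```python
from collections import deque


def psn_reachable(links, newest_first=False):
    """Evaluate the reachability rules one tuple at a time, with no rounds."""
    links = set(links)
    reach = set()
    queue = deque(links)     # R1 tuples enter the pipeline directly
    while queue:
        # Any processing order is allowed; the flag picks between two orders.
        z, d = queue.pop() if newest_first else queue.popleft()
        if (z, d) in reach:
            continue         # already derived: no repeated inference
        reach.add((z, d))
        # R2: reachable(S,D) :- link(S,Z), reachable(Z,D)
        for (s, z2) in links:
            if z2 == z:
                queue.append((s, d))   # pipelined immediately
    return reach


chain = [("a", "b"), ("b", "c"), ("c", "d")]
print(psn_reachable(chain) == psn_reachable(chain, newest_first=True))
```

In this sketch both processing orders reach the same fixpoint, which illustrates why dropping the round barrier need not change the answer.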
Pipelined Evaluation
Challenges:
  Does PSN produce the correct answer?
  Is PSN bandwidth efficient? I.e., does it make the minimum number of inferences?
In paper, proofs for:
  p(x,z) :- p1(x,y), p2(y,z), …, pn(y,z), q(z,w)
  recursive w.r.t. p
Basic technique: local timestamps
Execution Plan
Nodes in execution plan (“operators”):
  Network operators (send/recv, cc, retry, rate limitation)
  Relational operators (selects, projects, joins, aggregates)
  Flow operators (mux, demux, queues)
[Figure: execution plan on a single node, from Network In (messages) to Network Out (messages). Labeled elements: UDP Rx, CC Rx, Queue, Demux, lookup, Local Tables (link, path, ...), Round Robin, Queue, CC Tx, UDP Tx]
Localization Rewrite
Rules may have body predicates at different locations:
  R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=SP2.
  Matching variable Z = “Join”
Rewritten rules:
  R2a: linkD(S,@D) :- link(@S,D)
  R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=SP2.
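The effect of the rewrite can be sketched with per-node dictionaries: R2a ships each link tuple to its destination, after which R2b is a purely local join at Z. This is a hedged illustration, not the P2 planner. The helper names `rewrite_r2a` and `eval_r2b_at` and the dictionary layout are hypothetical.

```python
def rewrite_r2a(link_at):
    """R2a: linkD(S,@D) :- link(@S,D). Store each link tuple at its destination."""
    linkD_at = {}
    for s, dests in link_at.items():
        for d in dests:
            linkD_at.setdefault(d, set()).add(s)   # linkD(S,@D) lives at D
    return linkD_at


def eval_r2b_at(z, linkD_at, path_at):
    """R2b at node z: join local linkD(S,@Z) with local path(@Z,D,P2)."""
    out = []
    for s in sorted(linkD_at.get(z, ())):
        for d, p2 in sorted(path_at.get(z, ())):
            # Head path(@S,D,P) with P=SP2; the tuple must be sent to S.
            out.append((s, d, (s,) + p2))
    return out


# Hypothetical state: link(@a,b), link(@b,c), and path(@b,c,[b,c]) stored at b.
link_at = {"a": {"b"}, "b": {"c"}}
path_at = {"b": {("c", ("b", "c"))}}
linkD_at = rewrite_r2a(link_at)
derived = eval_r2b_at("b", linkD_at, path_at)
print(derived)
```

After the rewrite, every body predicate of R2b is stored at Z, so the only communication left is shipping linkD tuples and sending the derived path tuple to S.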
Localized Rule Compilation
R2b: path(@S,D,P) :- linkD(S,@Z), path(@Z,D,P2), P=SP2.
Execution Plan (Network In to Network Out):
  path  -> Join path.Z = linkD.Z -> Project path(S,D,P) -> Send to path.S
  linkD -> Join linkD.Z = path.Z -> Project path(S,D,P) -> Send to path.S
Outline
  Background
  Query language by example
  Query Processing
  Optimizations
  Conclusion
Role of Query Optimizations
Network protocols = query execution
Can query optimizations help implement efficient protocols?
Our First Steps
Traditional: evaluate in the network context
  Aggregate Selections
  Predicate Reordering
  Magic Sets rewrite
New: motivated in the network context
  Multi-query optimizations
  Cost-based optimizations based on network statistics
Predicate Reordering
Query: path(@S,D,P)
R1: path(@S,D,P) :- link(@S,D), P=(S,D).
R2: path(@S,D,P) :- link(@S,Z), path(@Z,D,P2), P=SP2.
R2 after reordering the body predicates:
R2: path(@S,D,P) :- path(@S,Z,P1), link(@Z,D), P=P1D.
Predicate reordering: path vector protocol -> dynamic source routing
Interesting variants:
  Predicate reordering + magic-sets rewrite
  Cost-based optimizations (work-in-progress)
    Network statistics (neighborhood density, rate of change of links)
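The two orderings of R2 can be compared with a small sketch. In the first, a path grows by adding S to the front of a known path; in the reordered form, it grows by appending D to the end. This is illustrative Python, not the optimizer in P2. The function names are hypothetical, and the loop check that discards paths revisiting a node is an assumption added so the sketch terminates.

```python
def right_recursive(links):
    """R2: path(S,D,P) :- link(S,Z), path(Z,D,P2), P=SP2."""
    links = set(links)
    paths = set(links)            # R1: each link is a two-node path
    frontier = set(paths)
    while frontier:
        new = {(s,) + p2 for (s, z) in links for p2 in frontier
               if p2[0] == z and s not in p2}      # add S to the front
        frontier = new - paths
        paths |= frontier
    return paths


def left_recursive(links):
    """Reordered R2: path(S,D,P) :- path(S,Z,P1), link(Z,D), P=P1D."""
    links = set(links)
    paths = set(links)
    frontier = set(paths)
    while frontier:
        new = {p1 + (d,) for p1 in frontier for (z, d) in links
               if p1[-1] == z and d not in p1}     # append D to the end
        frontier = new - paths
        paths |= frontier
    return paths


links = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"), ("d", "c")]
print(right_recursive(links) == left_recursive(links))
```

Both orderings derive the same loop-free paths here. What changes is the end from which a path is extended, and in a distributed setting that determines where each new tuple is computed and sent.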
Evaluation Overview
Setup:
  Routing protocols implemented using P2
  Emulab testbed
Metrics: Convergence latency, communication
Results in paper:
  Aggregate selections
  Magic sets & predicate reordering
  Multi-query optimizations
Conclusion
Declarative Networking
  Database techniques for network design and implementation
  Important role to play in the innovation of networks
Paper focuses on important unresolved issues:
  Query language, query processing, optimizations, semantics in dynamic networks
Raises several interesting research challenges:
  Language and semantics
  Runtime cost-based optimizations
  Interaction between query processing and networking
Thank You
http://p2.cs.berkeley.edu