Some Open Questions on the Borderline of Distributed Computing and Networking
Michael Schapira, School of Computer Science and Engineering, Hebrew University of Jerusalem. This talk: new questions in Internet protocol design; self-stabilizing Internet protocols.
![Page 1: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/1.jpg)
Michael Schapira
School of Computer Science and Engineering
Hebrew University of Jerusalem
Some Open Questions on the Borderline of Distributed Computing and Networking
![Page 2: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/2.jpg)
This Talk
1. New questions in Internet protocol design
2. Self-stabilizing Internet protocols
3. Incentive-compatible network protocols
• … illustrated via Internet routing examples
![Page 3: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/3.jpg)
The Internet
• Tremendous success
– from research experiment to global infrastructure
• Enables innovation in applications
– Web, P2P, VoIP, social networks, virtual worlds
• But the Internet infrastructure has been fairly stagnant for decades…
![Page 4: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/4.jpg)
Why Can’t We Innovate?
• “Closed” equipment
– software bundled with hardware
– vendor-specific interfaces
• Slow protocol standardization
• Few people can innovate
– equipment vendors write the code
– long delays to introduce new features
![Page 5: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/5.jpg)
Traditional Computer Networks
data plane: packet streaming
Handle packets in “real time”: forward, filter, buffer, mark, rate-limit, measure, …
![Page 6: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/6.jpg)
Traditional Computer Networks
control plane: distributed algorithms
Slower time scale: track topology changes, compute routes, install forwarding rules, …
![Page 7: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/7.jpg)
Software Defined Networking (SDN): a New Paradigm
API to the data plane (e.g., OpenFlow)
Controller: logically centralized control, smart, slow, implemented in software, …
Switch: dumb, fast, implemented in hardware
![Page 8: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/8.jpg)
Software Defined Networking (SDN): a New Paradigm
[Figure: Controller Application running on top of a Network OS]
• Events from switches: topology changes, traffic statistics, arriving packets, …
• Commands to switches: (un)install rules, query statistics, …
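The event/command loop on this slide can be sketched in a few lines of Python. This is a toy model only: the `Switch` and `Controller` classes, the `on_packet_in` handler, and the three-switch topology are hypothetical names invented for illustration, not the OpenFlow API or any real controller platform. The switch forwards by table lookup and raises an "arriving packet" event on a miss; the controller answers by installing a rule.

```python
from collections import deque


class Controller:
    """Hypothetical logically-centralized controller: knows the whole topology."""

    def __init__(self, topology):
        self.topology = topology  # adjacency dict: switch name -> list of neighbours

    def on_packet_in(self, switch, dst):
        """Event from a switch: a packet arrived that matches no installed rule."""
        next_hop = self._next_hop(switch.name, dst)
        if next_hop is not None:
            switch.rules[dst] = next_hop  # command to the switch: install a rule

    def _next_hop(self, src, dst):
        # Breadth-first search from dst; parent pointers give the next hop toward dst.
        parent = {dst: None}
        queue = deque([dst])
        while queue:
            u = queue.popleft()
            for v in self.topology[u]:
                if v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent.get(src)


class Switch:
    """Hypothetical switch: a plain lookup table, no routing logic of its own."""

    def __init__(self, name, controller):
        self.name = name
        self.controller = controller
        self.rules = {}  # destination -> next hop

    def handle_packet(self, dst):
        if dst not in self.rules:
            self.controller.on_packet_in(self, dst)  # table miss: ask the controller
        return self.rules.get(dst)


topology = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
controller = Controller(topology)
s1 = Switch("s1", controller)
print(s1.handle_packet("s3"))  # first packet triggers a rule install
print(s1.rules)
```

Only the first packet toward a destination reaches the controller; later packets hit the installed rule, which is the "smart but slow" versus "dumb but fast" split described above.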
![Page 9: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/9.jpg)
So…
• Change is finally on the horizon
• But many challenges remain…
– Realizing SDN (e.g., distribute the controller?)
– What are the “right” protocols (for routing, traffic engineering, etc.)?
• Distributed computing theory can play an important role here
![Page 10: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/10.jpg)
Distributed Controller?
[Figure: two controller instances, each a Controller Application running on a Network OS]
• For scalability and reliability
• Partition and replicate state
• Elect a leader?
• Distribute the computation?
• How to ensure consistency (across controllers / switches)?
• Where to place the controller(s)?
![Page 11: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/11.jpg)
Rethinking (Routing) Protocols
• Routing is a control plane operation
– slow (milliseconds – seconds)
• Packet forwarding is a data plane operation
– fast (microseconds)
• Today’s (intradomain) routing
– establishes connectivity
– optimizes routes (= shortest paths)
• failure ⇒ re-convergence ⇒ dropped packets!
![Page 12: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/12.jpg)
Pushing Connectivity (Only!) to the Data Plane
• … while retaining scalability
– implemented in hardware
– low overhead (end-to-end backup paths too costly…)
– static forwarding tables (no changes in packet rates)
– no change to packet header
• When a packet to a node d arrives at node i, i’s outgoing link is a function only of
– the incoming link
– the set of “live” outgoing edges
$f_i^d : E_i \times P(E_i) \to E_i$
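To make the signature of the forwarding function concrete, here is a minimal Python sketch. It assumes that $E_i$ is the set of links at node i and that $P(E_i)$ is its power set; the priority-order rule and the name `make_forwarding_function` are hypothetical choices for illustration, not the construction from the talk. The point is only that the decision depends on nothing but the incoming link and the set of live links.

```python
def make_forwarding_function(priority):
    """Build a static rule for one (node, destination) pair.

    priority: the node's links, ordered by preference for reaching the destination.
    The returned function maps (incoming link, set of live links) to an outgoing link.
    """

    def f(incoming, live):
        # Prefer the first live link that does not send the packet straight back.
        for link in priority:
            if link in live and link != incoming:
                return link
        # Nothing else is alive: bounce the packet back, or drop it (None).
        return incoming if incoming in live else None

    return f


f = make_forwarding_function(["a", "b", "c"])
print(f("a", {"a", "b", "c"}))  # skips the incoming link "a"
print(f("c", {"c"}))            # only the incoming link is alive
print(f(None, {"b", "c"}))      # packet originates at this node
```

Because the rule reads no per-packet state and writes nothing, it could be precomputed as a lookup table, in line with the "static forwarding tables" and "no change to packet header" constraints listed above.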
![Page 13: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/13.jpg)
Resilient Forwarding
• A “forwarding pattern” $\{f_i^d\}_i$ is t-resilient if, for any set of at most t edge failures, the existence of a path between a node i and the destination d implies loop-free forwarding from i to d.
• Perfect resilience ≡ t → ∞
[Figure: example network with nodes i, j, and d, and a mark “x”]
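The definition of t-resilience can be checked by brute force on small graphs. The sketch below is an illustration under stated assumptions: links are identified by the neighbour they lead to, "loop-free forwarding" is read as "the packet reaches d without ever repeating a (node, incoming link) state", and the helper names (`reachable`, `delivers`, `is_t_resilient`, `by_priority`) and the triangle example are hypothetical. It enumerates every failure set of size at most t, so it is exponential and only meant for toy instances.

```python
from itertools import combinations


def neighbours(edges):
    """Adjacency sets of an undirected graph given as a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def reachable(src, dest, edges):
    """Is there still a path from src to dest over the surviving edges?"""
    adj = neighbours(edges)
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        if u == dest:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False


def delivers(src, dest, edges, pattern):
    """Follow the forwarding pattern from src; False on a drop or a forwarding loop."""
    live = neighbours(edges)
    node, incoming, seen = src, None, set()
    while node != dest:
        if (node, incoming) in seen:
            return False  # forwarding is deterministic, so a repeated state loops forever
        seen.add((node, incoming))
        nxt = pattern[node](incoming, live.get(node, set()))
        if nxt is None or nxt not in live.get(node, set()):
            return False  # dropped, or sent onto a failed link
        node, incoming = nxt, node
    return True


def is_t_resilient(nodes, edges, dest, pattern, t):
    """Check every failure set of at most t edges and every source still connected to dest."""
    for k in range(t + 1):
        for failed in combinations(edges, k):
            surviving = [e for e in edges if e not in failed]
            for i in nodes:
                if i != dest and reachable(i, dest, surviving):
                    if not delivers(i, dest, surviving, pattern):
                        return False
    return True


def by_priority(order):
    """Hypothetical rule: first live neighbour in `order` other than the incoming one."""
    def f(incoming, live):
        for nb in order:
            if nb in live and nb != incoming:
                return nb
        return incoming if incoming in live else None
    return f


nodes = ["a", "b", "d"]
edges = [("a", "b"), ("a", "d"), ("b", "d")]
good = {"a": by_priority(["d", "b"]), "b": by_priority(["d", "a"])}
# A pattern that ignores the incoming link and ping-pongs between a and b.
bad = {
    "a": lambda incoming, live: "b" if "b" in live else "d",
    "b": lambda incoming, live: "a" if "a" in live else "d",
}
print(is_t_resilient(nodes, edges, "d", good, 1))
print(is_t_resilient(nodes, edges, "d", bad, 0))
```

On this triangle the priority pattern survives every single-edge failure, while the ping-pong pattern already loops with no failures at all.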
![Page 14: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/14.jpg)
Theoretical Perspective
• Thm [Feigenbaum-Godfrey-Panda-S-Shenker-Singla]: A 1-resilient forwarding pattern always exists
• Thm [Feigenbaum-Godfrey-Panda-S-Shenker-Singla]: Perfect resilience is not achievable
• Big gap!
– does a 2-resilient forwarding pattern always exist?
– specific families of graphs?
– relax restrictions (randomness, dynamic forwarding tables, …)?
![Page 15: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/15.jpg)
Practical Perspective
A perfectly-resilient mechanism for achieving connectivity in the data plane
– [“Data Driven Connectivity”, Liu-Panda-Singla-Godfrey-S-Shenker, NSDI 2013]
– utilizes existing mechanisms
– small (few bits) changes to forwarding tables at packet rate
![Page 16: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/16.jpg)
Directions for Future Research
• How to distribute the controller?
• Data-plane/control-plane perspective on other networking tasks (e.g., traffic engineering)
• Connectivity in the data plane
![Page 17: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/17.jpg)
(Self-)Stabilizing
Internet Routing
![Page 18: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/18.jpg)
Border Gateway Protocol
Verizon
Comcast
AT&T
The Border Gateway Protocol (BGP) establishes routes between the (over 42,000) networks that make up the Internet
![Page 19: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/19.jpg)
BGP ≠ Shortest-Path Routing!
Verizon
Comcast
AT&T
– “I want to avoid routes through Comcast if possible”
– “I won’t carry traffic between AT&T and Verizon”
– “I want a cheap route”
– “I want short routes”
![Page 20: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/20.jpg)
Illustration: BGP Dynamics
[Figure: nodes 1 and 2 and destination d]
• Node 1: prefer routes through 2
• Node 2: prefer routes through 1
• d: “2, I’m available”
• 2: “1, my route is 2d”
• d: “1, I’m available”
A stable state is reached
![Page 21: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/21.jpg)
Illustration: BGP Oscillation
[Figure: nodes 1 and 2 and destination d]
• Node 1: prefer routes through 2
• Node 2: prefer routes through 1
• d: “1, 2, I’m the destination”
• 2: “1, my route is 2d”
• 1: “2, my route is 1d”
BGP might oscillate indefinitely between 1d, 2d and 12d, 21d
Conjecture [Griffin-Wilfong, SIGCOMM 99]: 2+ stable states → BGP can oscillate
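The two-node example can be replayed in a few lines of Python. This is a sketch under simplifying assumptions, not an implementation of BGP: routes are strings such as "12d", each node always has its direct route to d available, and a node refuses a neighbour's route that already passes through itself. The function names `best_route` and `run` are hypothetical.

```python
def best_route(node, other_route):
    """A node prefers the route through the other node whenever that route is usable."""
    # The neighbour's route is usable only if it does not already pass through us.
    if node not in other_route:
        return node + other_route  # e.g. "1" + "2d" -> "12d"
    return node + "d"              # otherwise fall back to the direct route


def run(schedule, start=("1d", "2d")):
    """Apply a schedule (a list of sets of activated nodes) and record every state."""
    r1, r2 = start
    history = [(r1, r2)]
    for active in schedule:
        # Activated nodes react simultaneously to the routes announced so far.
        new1 = best_route("1", r2) if "1" in active else r1
        new2 = best_route("2", r1) if "2" in active else r2
        r1, r2 = new1, new2
        history.append((r1, r2))
    return history


# Both nodes react at every step: the system flips between the two unstable states.
print(run([{"1", "2"}] * 4))
# Nodes take turns: the system settles in the stable state (12d, 2d).
print(run([{"1"}, {"2"}, {"1"}, {"2"}]))
```

The same preferences admit two stable states, (12d, 2d) and (1d, 21d); which outcome occurs, or whether any is reached, depends entirely on the timing of the updates.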
![Page 22: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/22.jpg)
Why are Oscillations Bad?
• Make the network unpredictable and hard to debug.
• Might flood the network with BGP update messages.
• Deteriorate performance!
– almost 50% of VoIP disruptions are due to BGP route fluctuations
![Page 23: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/23.jpg)
Internet Protocols, Markets, and Beyond
• Often, in computational and economic environments
1. the prescribed behavior for each “node” (human, machine) is simple and natural
2. nodes’ interaction is not synchronized
• How can we reason about such environments?
– Internet protocols (BGP routing, TCP congestion control)
– large-scale markets
– social networks
– …
![Page 24: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/24.jpg)
Dynamics: Game Theory vs. Distributed Computing
• Game theory:
– establishes convergence to equilibrium for “natural dynamics” (best-/better-response, fictitious play, no-regret, …)
– … but typically assumes synchronization.
• Distributed computing theory:
– analyzes system behavior in asynchronous environments
– … but no general notions of natural behavior.
![Page 25: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/25.jpg)
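Simple Model
• n nodes 1,…,n
• Node i has action space $A_i$
– $A = A_1 \times \cdots \times A_n$
– $A_{-i} = A_1 \times \cdots \times A_{i-1} \times A_{i+1} \times \cdots \times A_n$
• Node i has reaction function $f_i : A_{-i} \to A_i$
– $f = (f_1, \ldots, f_n)$
– $f_i$ can capture node i’s “best-responses”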
![Page 26: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/26.jpg)
Simple Model (Cont.)
• Infinite sequence of discrete time steps t=1,…
• A schedule $s : \{1, \ldots\} \to 2^{[n]}$ maps each time step to the subset of nodes “activated” at that time step
– a fair schedule activates each node infinitely often
• An initial action-profile and schedule naturally induce a dynamics.
![Page 27: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/27.jpg)
Simple Model (Cont.)
• Defn: An action-profile $a^* = (a_1, \ldots, a_n)$ is a stable state if $f_i(a^*) = a_i$ for all i.
– that is, $a^*$ is a fixed point of f
– abusing notation…
• Defn: A system is convergent if for every choice of initial action-profile and fair schedule the induced dynamics converge to a stable state.
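The model is small enough to execute directly. The following Python sketch enumerates stable states and runs the dynamics induced by an initial profile and a (finite prefix of a) schedule. Two things are assumptions made for illustration: nodes are numbered from 0 rather than 1, and the two-node "copy the other node" system is a hypothetical example, not one taken from the talk.

```python
from itertools import product


def stable_states(action_spaces, reactions):
    """All profiles a with f_i(a_{-i}) = a_i for every node i (fixed points of f)."""
    result = []
    for a in product(*action_spaces):
        if all(reactions[i](a[:i] + a[i + 1:]) == a[i] for i in range(len(a))):
            result.append(a)
    return result


def run_dynamics(initial, reactions, schedule):
    """Dynamics induced by an initial profile and a schedule (list of sets of nodes)."""
    a = tuple(initial)
    trace = [a]
    for active in schedule:
        # Every activated node reacts to the others' actions from the previous step.
        a = tuple(
            reactions[i](a[:i] + a[i + 1:]) if i in active else a[i]
            for i in range(len(a))
        )
        trace.append(a)
    return trace


def copy_other(others):
    """Hypothetical reaction function: adopt the other node's current action."""
    return others[0]


reactions = [copy_other, copy_other]
print(stable_states([(0, 1), (0, 1)], reactions))         # two stable states
print(run_dynamics((0, 1), reactions, [{0, 1}] * 3))      # both activated: oscillation
print(run_dynamics((0, 1), reactions, [{0}, {1}]))        # taking turns: convergence
```

With two stable states, (0, 0) and (1, 1), the schedule that activates both nodes at every step is fair yet never converges from (0, 1).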
![Page 28: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/28.jpg)
Towards a Characterization of Convergent Systems
• Thm [Jaggard-S-Wright]: If there exist multiple stable states, then the system is not convergent.
– valency argument!
– no failures, just dumb nodes!
• So, a unique stable state is a necessary condition for guaranteed convergence.
• Can be generalized to bounded-recall, non-stationary reaction functions.
![Page 29: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/29.jpg)
Application: Internet Routing
• BGP establishes routes between the smaller networks that make up the Internet
• Question [Griffin-Shepherd-Wilfong, 2001]: Do multiple stable routing configurations imply the possibility of persistent route oscillations?
• Answer [Sami-S-Zohar, 2009]: Yes!
AT&T
Qwest
Comcast
Sprint
![Page 30: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/30.jpg)
Other Applications
• Our “two people in a corridor” example…
• Models of congestion control on the Internet
• Load balancing
• Diffusion of technologies in social networks
• Asynchronous circuits
• …
![Page 31: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/31.jpg)
Meanwhile, back in the corridor…
![Page 32: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/32.jpg)
Strengthening the Result: Convergence vs. Synchronism
• Defn: An r-fair schedule activates each node at least once in every r consecutive time steps.
• Defn: A system is r-convergent if for all choices of initial action-profile and r-fair schedule the induced dynamics converge to a stable state.
– convergent ⇒ r-convergent
– not r-convergent ⇒ not convergent
• Thm [Erdmann-S]: If there exist multiple stable states, then the system is not (n-1)-convergent.
– tight!
– much more delicate valency argument
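The r-fairness condition is easy to test on a finite prefix of a schedule. The helper below is a hypothetical sketch (the name `is_r_fair` and the sample schedules are invented here); it checks every window of r consecutive steps and can say nothing about the infinite tail of a schedule.

```python
def is_r_fair(schedule, n, r):
    """Check a finite schedule prefix over nodes 1..n.

    schedule: list of sets, the nodes activated at each time step.
    Returns True if every window of r consecutive steps activates every node.
    """
    nodes = set(range(1, n + 1))
    for start in range(len(schedule) - r + 1):
        activated = set().union(*schedule[start:start + r])
        if not nodes <= activated:
            return False
    return True


round_robin = [{1}, {2}, {3}] * 3
print(is_r_fair(round_robin, n=3, r=3))  # each node appears in every 3-step window
print(is_r_fair(round_robin, n=3, r=2))  # some 2-step window misses a node

synchronous = [{1, 2}] * 4
print(is_r_fair(synchronous, n=2, r=1))  # all nodes activated at every step
```

For n = 2, an (n-1)-fair schedule is 1-fair, so it must activate both nodes at every step.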
![Page 33: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/33.jpg)
Complexity of Convergent Systems
• Thm [Jaggard-S-Wright]: Determining if a system with n nodes is convergent requires exponential communication (in n).
• Thm [Engelberg-Fabrikant-S-Wajc]: Determining if a succinctly described system is convergent is PSPACE-complete.
• Both results extend also to “stochastic convergence”.
![Page 34: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/34.jpg)
Directions for Future Research
• Other protocols!
• Identify specific classes of (stochastically) convergent games and measure convergence rate (e.g., in terms of asynchronous rounds).
• Characterize guaranteed convergence, and design algorithms for determining such convergence, for
– other game dynamics (e.g., fictitious play, no-regret dynamics)
– other notions of equilibrium (e.g., mixed Nash, correlated)
– other notions of asynchrony
![Page 35: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/35.jpg)
Incentive-Compatible Network Protocols
![Page 36: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/36.jpg)
TCP Congestion Control is NOT Incentive Compatible
[Figure: a router with a queue between two links]
AIMD = Additive Increase Multiplicative Decrease
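A toy simulation shows both what AIMD does and why following it is not in a sender's self-interest. The model is a deliberate simplification and all of its parameters are assumptions: flows share one link of capacity 100, every flow adds `alpha` to its rate per step, and when the total exceeds capacity every flow multiplies its rate by its own factor. A compliant sender halves its rate; the hypothetical "greedy" sender backs off by only 10%.

```python
def simulate_aimd(betas, capacity=100.0, alpha=1.0, steps=2000):
    """Average rate of each flow on a shared link under a simplified AIMD model.

    betas: one multiplicative-decrease factor per flow (0.5 = halve on congestion).
    """
    rates = [1.0 for _ in betas]
    totals = [0.0 for _ in betas]
    for _ in range(steps):
        if sum(rates) > capacity:
            # Congestion signal: every flow backs off by its own factor.
            rates = [r * b for r, b in zip(rates, betas)]
        else:
            # No congestion: every flow probes for more bandwidth.
            rates = [r + alpha for r in rates]
        totals = [t + r for t, r in zip(totals, rates)]
    return [t / steps for t in totals]


fair = simulate_aimd([0.5, 0.5])
unfair = simulate_aimd([0.5, 0.9])
print("both compliant:   ", fair)
print("second one greedy:", unfair)
```

When both flows halve their rate they obtain identical averages; when the second flow backs off less, it ends up with the larger share, so a self-interested sender gains by deviating from the prescribed behaviour.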
![Page 37: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/37.jpg)
What About BGP?
• BGP was designed to guarantee connectivity between largely trusted and obedient parties.
• In today’s commercial Internet, ASes are owned by self-interested, often competing, entities
– might not follow the “prescribed behaviour”
• Simple examples show that BGP is, in fact, not incentive compatible
– a node can obtain a better route by “lying”
![Page 38: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/38.jpg)
How Can We Fix This?
• Economic Mechanism Design: “the reverse-engineering approach to game-theory”.
• Goal: Incentivize players to follow the prescribed behaviour
– if others run the protocol so should I!
– without money!
• Thm [Levin-S-Zohar]: Secure variants of BGP are incentive compatible.
![Page 39: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/39.jpg)
Conclusion
• An exciting time to be in networking
• Internet protocols motivate new research directions
• Distributed computing theory has much to contribute
![Page 40: Some Open Questions on the Borderline of Distributed Computing and Networking](https://reader036.vdocument.in/reader036/viewer/2022070501/5681692b550346895de06c20/html5/thumbnails/40.jpg)
Thank You