2016 04-04 engineer versus complexity
TRANSCRIPT
Engineer versus Complexity
Russ White/rule11.us
Defining Complexity
Networks are complex…• And getting more
complex all the time• But what is
complexity, really?• Can we put a network
on a scale that measures complexity?
Define Complexity• Anything I don’t
understand• But…
– There are many things I don’t understand that aren’t complex
– Often understanding is a matter of context rather than complexity
Define Complexity• Anything with a lot of parts• But…
– Mandelbrot sets have a lot of parts
– But they aren’t necessarily complex
Define Complexity• Anything that produces
unintended consequences• Anything that has more
parts than required to solve a specific problem
You can’t “solve” complexity• Why not just make
things simple?• Because complexity is
required to solve real world problems
In our view, however, complexity is most succinctly discussed in terms of functionality and its robustness. Specifically, we argue that complexity in highly organized systems arises primarily from design strategies intended to create robustness to uncertainty in their environments and component parts.
Solution Effectiveness
RobustnessComplexity
Alderson, D. and J. Doyle, “Contrasting Views of Complexity and Their Implications For Network- Centric Infrastructures”, IEEE ‐TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 40, NO. 4, JULY 2010
You can’t “solve” complexity• If you can’t “solve”
complexity…• Then—abandon all hope ye
who enter here?• No!• Instead, we need to learn
to manage complexity
Interacting with
Complexity
Presentation ID
11
• “Department of Defense” (DoD) model• Origin of TCP/IP
• Recursive Internet Architecture (RINA) model• http://csr.bu.edu/rina/
• Flow across a data center fabric• IP Fast Reroute
DoD Model• How can state be divided up
to produce clean interfaces between different software components?
• What information must be moved from component to component to transmit information from application to application?– Lateral across the network– Vertically among components Link
Internet
Transport
Application
Link
Internet
Transport
Application
DoD Model• What state is contained here?• How much?• How fast does it change?
• What state is shared here?• How much does the internal
state mix between these two components?
• What am I optimizing by splitting this state up?• What is the goal of breaking
this up into multiple subsystems?
Link
Internet
Transport
Application
Link
Internet
Transport
Application
DoD Model• What state is contained here?• How much?• How fast does it change?
• What state is shared here?• How much does the internal
state mix between these two components?
• What am I optimizing by splitting this state up?• What is the goal of breaking
this up into multiple subsystems?
Link
Internet
Transport
Application
Link
Internet
Transport
Application
Network Device
RINA Model• Recursive Internet
Architecture• What processing needs to
take place to transmit information from application to application?
Identifies four processes: Transport, Multiplex, Error handling, Flow controlGrouped into two complimentary pairs
• Who needs to do it?Interface, host, applicationNetwork device
App
to
App
TR/MULT
ERROR/FLOW
Host
to
Host
TR/MULT
ERR/FLOW
Inte
rface
to
Inte
rface
TR/MULT
ERROR/FLOW
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
Network Device
RINA Model
App
to
App
TR/MULT
ERROR/FLOW
Host
to
Host
TR/MULT
ERR/FLOW
Inte
rface
to
Inte
rface
TR/MULT
ERROR/FLOW
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
• What state is required here?• How much?• How fast does it change?
• What state is shared here?• How much does the internal
state mix between these two components?
• What am I optimizing by splitting this state up?
Network Device
RINA Model
App
to
App
TR/MULT
ERROR/FLOW
Host
to
Host
TR/MULT
ERR/FLOW
Inte
rface
to
Inte
rface
TR/MULT
ERROR/FLOW
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TR/MULT
TR/MULT
ERR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
TRANSPORT/MULTIPLEX
ERROR/FLOW
• What state is shared here?• How much does the internal
state mix between these two components?
• What am I optimizing by splitting this state up?
• What state is required here?• How much?• How fast does it change?
Data Flow Example• Mobile device queries
DNS server for service• Packet must pass through
virtual edge– OSS/BSS– Encryption– Other common edge
services• DNS server responds
DC Fabric
Virtual Edge DNS DPI
CDN
DC Fabric
Virtual Edge DNS DPI
CDN
Data Flow Example• Mobile device initiates
HTTP session with server• Flow must pass through
deep packet inspection (DPI) service
• Flow is redirected to a local CDN server for processing
DC Fabric
Virtual Edge DNS DPI
CDN
Data Flow Example
• How much state must be pushed to the edge to make this work?• How much state in the fabric
versus in the overlay?
• What is the balance between processing on dedicated processors and COTs, scale out and stretch, etc.
• What state is required here?• How much?• How fast does it change?
DC Fabric
Virtual Edge DNS DPI
CDN
Data Flow Example
• How much state must be pushed to the edge to make this work?• How much state in the fabric
versus in the overlay?
• What is the balance between processing on dedicated processors and COTs, scale out and stretch, etc.
• What state is required here?• How much?• How fast does it change?
Control Plane Example • A uses the path through B
as to 2001:db8:0:1::/64• Path through C is blocked
by the control plane• We would like to be able
to use C as an alternate path in the case of a link failure along A->B->E
2001:db8:0:1::/64
A
B
C
D
E
Control Plane Example• Loop Free Alternates
(LFAs)– A can compute the cost
from C to determine if traffic forwarded to 2001:db8:0:1::/64 will be looped back to A
– If not, then A can install the path through C as a backup path
2001:db8:0:1::/64
A
B
C
D
E
Control Plane Example2001:db8:0:1::/64
A
B
C
D
E
• Do other routers in the network need to react to new state in the control plane?
• How much faster will the network converge with LFAs?
• How much state is added to the control plane to calculate the LFA?• How often does this state change?
Control Plane Example2001:db8:0:1::/64
A
B
C
D
E
• Do other routers in the network need to react to new state in the control plane?
• How much faster will the network converge with LFAs?
• How much state is added to the control plane to calculate the LFA?• How often does this state change?
Corralling the Questions• Why am I doing this?• What am I doing to—
– State– Optimization– Surfaces
• What is this like?
Why?• Acts as an abstractor
– Only look at the stuff you need to understand the system
• Closes off rabbit trails– “Why does IS-IS elect a DIS?”– “That’s a good question, but it’s not related to fixing
these slow DNS queries…”• Connects the technical to the business
– Unless you can explain why this problem is important to solve…
– It’s probably not!
What?• Three subquestions—
StateOptimizationSurface
• Three way choices are common in the real world
• Network complexity is another form of this sort of three way choice
Quick
Cheap High Quality
State
Surface Optimization
The plane of the possible• Why not just make things
simple?• The “plane of the possible”
is just the way reality is built
Realm of the Impossible
Plane of the Possible
State
Surface
Opti
miza
tion
What is this like?Common Problem Common Solution Common Problems with the Solution
Find the shortest path SPFPath VectorBellman-FordDUAL
Microloops, state too fastSlow convergence, inconsistent stateSlow convergence, feedback loopsDrops during convergence, feedback loops
Distribute Data FloodMulticastUnicast
Possibly too dense, CAP, transmission errorsComplex control plane, CAP, transmission errorsComplex control plane, CAP, transmission errors
Reliable Data Transmission Forward CorrectionDetect/Retransmit
Carry unneeded state in many casesSlows down transmission
What is this like?
• This is an opportunity, rather than a problem• The more you understand about the past, the more you will
understand the future• Be a collector of…
– Models, protocols, theories, algorithms, ideas, etc.
Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.
-RFC1925, rule 11
Why?
What problem am I solving?
Narrows my view – don’t boil the
ocean
What & How? This is like?
Has this problem been solved before?
How was it solved?
What are the problems with this solution?
Models and Abstractions• Abstractions Leak• Aggregation
– If the A->B link fails, the cost of the aggregate changes
• TCP retransmissions• Embedded IP addresses
102001:db8:0:1::/64
2001:db8:0:2::/6420
A
B
C
D
2001:db8::/61 aggregate metric set to metric of the lowest component
Models and Abstractions• Abstractions hide things
– You might not understand the entire system as well as you might by studying it “in detail”
– But comprehending a complex system without abstractions often just leads to confusion
• More than one model will help expose different aspects of the system
Presentation ID
36
Why?
What/How?State?
Surface?Optimization?
What is this like? to find potential solutions and “know
where to look”