designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine...

52
Designing robust distributed systems with weakly interacting feedback structures EPITA April 24, 2013 Peter Van Roy ICTEAM Institute Université catholique de Louvain Louvain-la-Neuve, Belgium P. Van Roy, UCL, Louvain-la-Neuve 1 Apr. 2013

Upload: others

Post on 27-Apr-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Designing robust distributed systems

with weakly interacting feedback structures

EPITA

April 24, 2013

Peter Van Roy

ICTEAM Institute Université catholique de Louvain

Louvain-la-Neuve, Belgium

P. Van Roy, UCL, Louvain-la-Neuve

1

Apr. 2013

Page 2: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Overview l  As Internet services become larger, they become more complex and their

environment becomes more hostile l  Partial failures, software errors, communication problems, churn, attacks l  Problems due to global behavior (oscillations, traffic jams, multicast storms,

thundering herds, chaotic fluctuations, thrashing, cascading failures) l  How can we design Internet services to provide predictable behavior in such

conditions? l  Motivating examples from biology (and some well-designed computing systems) l  Proposed architecture for scalable services as a set of weakly interacting feedback

structures with dependencies l  Preliminary evaluation based on Scalaris key/value store (SELFMAN project)

l  Scalaris provides high-performance transactions on a structured overlay network l  Scalaris contains five feedback structures and their dependencies: connectivity,

routing, load balancing, replication, and transactions l  We are currently formalizing this approach and tying it to existing quantitative

approaches P. Van Roy, UCL, Louvain-la-Neuve 2 Apr. 2013

Page 3: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Motivating examples from biology and computing

P. Van Roy, UCL, Louvain-la-Neuve

3

Apr. 2013

Page 4: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Motivating examples l  Many systems exist that survive in hostile environments

l  Biological organisms l  Some computing systems l  Human organizations

l  It is a good idea to study these systems to derive general design principles

l  We give five examples to introduce the main ideas l  Hotel lobby example → Debugging of feedback structures l  Human respiratory system → Design rules, state diagram l  TCP protocol operation → Systems with many parts l  Human endocrine system → Concurrent component model l  Human organizations → Design patterns for feedback structures

P. Van Roy, UCL, Louvain-la-Neuve 4 Apr. 2013

Page 5: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

5

Simple example: hotel lobby (from [Wiener 1948])

l  This is unstable! l  The tribesman stokes

the fire but gets colder and colder because the airconditioning works harder and harder

l  Wiener leaves the fix as homework for the reader (!)

l  One possible solution: outer loop (tribesman) controls the other by simply adjusting the thermostat l  One loop controls the

other

l  Two loops interacting through a common subsystem (stigmergy)

Monitoring agents

Thermostat(run aircond. if too warm)

(stoke fire if too cold)Tribesman

Measuretemperature

near fire

Measuretemperature

in lobbyairconditioningRun

fireStoke

Subsystem

Hotel lobby

Calculate corrective action

Fire

Hotel lobby

Tribesman

Thermostat

Actuating agents

P. Van Roy, UCL, Louvain-la-Neuve

5

Apr. 2013

Page 6: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Hotel lobby solution

l  Instead of stoking a fire, the tribesman simply adjusts the thermostat. The resulting system is stable.

l  This uses management (one loop controls another) instead of stigmergy (two loops interact through the environment)

l  Design pattern: use the system, don’t try to bypass it

(adjust thermostat)

Thermostat(run aircond. if too warm)

airconditioningRun

Hotel lobby

Tribesman

Measuretemperature

Measuretemperature

at thermostat at tribesman

P. Van Roy, UCL, Louvain-la-Neuve

6

Apr. 2013

Page 7: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Human respiratory system

l  Default behavior: rhythmic breathing reflex l  Complex component: conscious control can override and plan lifesaving actions l  Abstraction: conscious control does not need to know details of breathing reflex l  Fail-safe: conscious control can itself be overridden (falling unconscious) l  Time scales: laryngospasm is a quick action that interrupts slower breathing reflex

Other inputs

when sufficient obstruction in airways

Laryngospasm

(seal air tube)

Breathing

reflex

MeasureO2

in blood

Monitor

breathing

MeasureCO2

in blood

Detectobstructionin airways

Trigger unconsciousness

when O2 falls to threshold

Conscious control

of body and breathing

Trigger breathing reflex

when CO2 increases to threshold

Trigger laryngospasm temporarily

Actuating agents Monitoring agents

in human body

Breathing apparatus

(maximum is breath!hold breakpoint)and change CO2 threshold

Increase or decrease breathing rate

(and reduce CO2 threshold to base level)Render unconscious

P. Van Roy, UCL, Louvain-la-Neuve

Some design rules:

The operation of the human respiratory system is given as one feedback structure, inferred from a precise medical description of its behavior

7

Apr. 2013

Page 8: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 8

Discussion of respiratory system l  Four feedback loops: two inner loops (breathing reflex and

laryngospasm), a loop controlling the breathing reflex (conscious control), and an outer loop controlling the conscious control (falling unconscious) l  This design is derived from a precise textual medical description

[Wikipedia 2006: Entry “Drowning”] l  Holding your breath can have two effects

l  Breath-hold threshold is reached first and breathing reflex happens l  O2 threshold is reached first and you fall unconscious, which reestablishes the

normal breathing reflex l  Some plausible design rules inferred from this system

l  Conscious control is sandwiched in between two simpler loops: the breathing reflex provides abstraction (consciousness does not have to understand details of breathing) and falling unconscious provides protection against instability

l  Conscious control is a powerful problem solver but it needs to be held in check

Page 9: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Respiratory system state diagram

l  The behavior of the human respiratory system modeled as a state diagram l  Dominant subset = active subset of feedback loops = state

l  At any time, one subset is active, depending on operating conditions l  Each subset corresponds to a state in the state diagram

P. Van Roy, UCL, Louvain-la-Neuve 9 Apr. 2013

Page 10: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

10

TCP as feedback structures l  This example shows a

reliable byte stream protocol with congestion control (a variant of TCP) l  This diagram is for the

sending side l  The congestion control

loop manages the reliable transfer loop l  By changing the sliding

window’s buffer size l  With n connections there

are n feedback structures interacting through a shared network (stigmergy) l  This is an example of a

system with many WIFS l  Each FS has its own

state

P. Van Roy, UCL, Louvain-la-Neuve

Send

Inner loop (reliable transfer)

Outer loop (congestion control)

Calculate policy modification

Actuator(send packet)

Monitor Monitorthroughput

Calculate bytes to send

(modify throughput)

(sliding window protocol)

destination and receives ack)(network that sends packet to

Subsystem

(receive ack)

Sendstream acknowledgement

10

Apr. 2013

Page 11: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 11

Human endocrine system l  The endocrine system regulates many quantities in

the human body l  It uses chemical messengers called hormones

which are secreted by specialized glands and which exercise their action at a distance, using the blood stream as a diffusion channel

l  By studying the endocrine system, we can obtain insights in how to build large-scale self-regulating distributed systems

Page 12: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 12

Feedback loops in the endocrine system l  There are many feedback loops and systems of interacting feedback

loops in the endocrine system l  Provides homeostasis (stability) and reaction to stresses

l  Much regulation is done by simple negative feedback loops l  Glucose level in blood is regulated by hormones glucagon & insulin. In the

pancreas, A cells secrete glucagon and B cells secrete insulin. Increase in glucose in blood causes decrease in glucagon and increase in insulin. These hormones act on the liver, which releases glucose in the bloodstream.

l  Calcium level in blood is regulated by parathyroid hormone (parathormone) and calcitonine (also in opposite directions), which act on the bone

l  More complex regulatory mechanisms exist, e.g., hypothalamus-pituitary-target organ axis

l  There is interaction between nervous transmission and hormonal transmission

Page 13: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

P. Van Roy, UCL, Louvain-la-Neuve

Hypothalamus-pituitary-target organ axis (endocrine system)

l  Two superimposed groups of negative feedback loops, a third short negative loop, a fourth loop from the central nervous system [Encyclopaedia Britannica 2005]

l  This diagram shows only the main components and their interactions; there are many more parts giving a much more complex full system

13 Apr. 2013

Page 14: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

14

Discussion of endocrine system l  This system is quite complex

l  Many interacting feedback loops, many “short circuits”, many special cases, much interaction with other systems (nervous, immune) l  Negative feedback for most, also saturation (logistic curve) l  Evolution is not always a parsimonious designer!

§  Only criterion: it has to work

l  Several feedback loops are channeled through a single point, the hypothalamus-pituitary complex in the brain l  So that the central nervous system can manage these loops l  Different time scales are used: the loops are slow; the

central nervous system is fast

Page 15: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

15

Computational architecture of human endocrine system l  Local and global components

l  Local: gland, organ, or clumps of cells l  Global (diffuse): large part of the body

l  Point-to-point and broadcast channels l  Fast point-to-point: nerve fiber, e.g., from spinal chord to muscle l  Slower broadcast: hormone diffused by blood circulation

l  With buffering (reducing variations): carrier proteins l  Regulatory mechanisms can be modeled by interactions

between components and channels l  There are often intermediate links l  Abstraction (encapsulation) is almost always approximate

Page 16: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Design patterns for feedback structures

l  We can arrange feedback structures in a tree according to their relationships and the problems they solve

P. Van Roy, UCL, Louvain-la-Neuve

Chapter

4.Feedback

systems

-patterns

64

I am most concerned about ...

Balancing LoopReinforcing Loop

vicious and virtuous spiral

Limits to GrowthSuccess to the

Successful

Accidental Adversaries

Fix that BackfireDrifting Goals Escalation IndecisionBalancing with

Delay

Tragedy of the Commons

The Attractiveness Principle

Growth & Underinvestment

Shifting the Burden

AddictionGrowth &

Underinvestment (drifting standards)

growth fixing problems

But my growth seems to lead

to your decline...

But nothinggrows forevever...

I have more thanone limit and can'taddress all of them

equally...

... so, if we're allup against the

same limit My capacity is mylimit. Therefore, mycapacity isn't large

enough...

... but there's a temptation to let my

standards slip instead ...

I form a partnershipfor growth, but end up

feeling betrayed ...

...by making my partner intoan adversary ...

While waiting for myfix to take hold, to relieve the tension,I become satisfied

with less ...

But my fix is yournightmare But I don't know

what I'm going to do

But sometimes, the reaction

is not immediate

But my fix comes backto haunt me

... because I'm gettingat the real underlying

cause ...

The drifting goalsundermine my long-term

growth ...

but once I becomeaddicted to the

symptomatic solution ...

Fig

ure

4.4

0:

The

Archetyp

eFam

ilyTree

Archetype Family Tree (from [Senge 1994])

16 Apr. 2013

Page 17: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Designing scalable systems

P. Van Roy, UCL, Louvain-la-Neuve

17

Apr. 2013

Page 18: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Designing scalable systems l  Essential ingredients

l  Design principles inferred from existing working systems and validated subsequently

l  The CAP theorem is an essential tool l  It holds at all scales and all levels of abstraction

l  First step l  The default is a set of independent parts l  We add coordination between these parts l  It is important to add as little coordination as possible

l  Next step: weakly interacting feedback structures Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

18

Page 19: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

The CAP theorem l  The CAP theorem is an essential tool for any scalable system

design l  The CAP theorem was conjectured by Eric Brewer at PODC in 2000 and

proved by Seth Gilbert and Nancy Lynch in 2002

l  For an asynchronous network, it is impossible to implement an object that guarantees the following properties in all fair executions: l  Consistency: all operations are atomic (totally ordered) l  Availability: every request eventually returns a result l  Partition tolerance: any messages may be lost

l  The CAP Theorem applies for all systems, at all levels of abstraction, and at all sizes l  It can be applied in many places in the same system l  The whole system is a rainbow of interacting instances of CAP

P. Van Roy, UCL, Louvain-la-Neuve

19

Apr. 2013

Page 20: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Designing with CAP l  C is hard to achieve → (P+A, no C) is the default

l  Consistency requires global coordination

l  Avoid needing C if possible l  We can achieve robustness (P) and performance (A)

l  DropBox and Web cache give P and A, but not C l  Wuala and BitTorrent are read-only, achieve C easily l  Mercurial is consistent if connected (C+A), but is still usable if disconnected (P+A)

l  But if we really need C l  Give up A → Waiting sometimes needed l  Give up P → Fragile system

l  Distributed database guarantees C but will block if there is a partition

l  Accept weaker C → Eventual consistency

l  We can have our cake and eat it too, if we pay the price l  Highly reliable communication channels and fault tolerance l  We get C and A, and we “seem” to get P as well (actually, we just have less partitions)

l  Scalaris, Beernet: peer-to-peer with majority consensus (Paxos) gives robustness l  Cassandra: run on cloud, not peer-to-peer (does not support loose coupling)

P. Van Roy, UCL, Louvain-la-Neuve

20

Apr. 2013

Page 21: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

The default is a set of independent parts

l  Every scalable design starts as a decentralized system (P+A, no C) l  A system of independent parts

l  Nodes occasionally interact (add some C) → collaboration, emergence l  Split protocol: what happens when a node leaves a group (may be abrupt) l  Merge protocol: what happens when a node joins a group

l  Merge is based on data coherence and may need input from highest level l  Many examples: biology, peer-to-peer, map-reduce, gas/liquid/solid, …

P. Van Roy, UCL, Louvain-la-Neuve

split/merge

l group

21

Apr. 2013

Page 22: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Mostly independent parts l  Large systems consist of independent parts with weak

interactions l  Gas in a box: molecules mostly independent, occasional interaction

when two molecules collide. l  Peer-to-peer network: peers mostly independent, occasional

interaction between neighbors only. Can provide efficient and robust communication and storage infrastructure (see later).

l  Gossip algorithm: nodes mostly independent, occasional interaction between random pairs. Can efficiently solve many global problems such as diffusion, search, aggregation, monitoring, and topology management.

l  This seems to be a general principle l  Systems with many parts that interact strongly are avoided by nature

P. Van Roy, UCL, Louvain-la-Neuve

22

Apr. 2013

Page 23: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 23

Types of systems l  This diagram is from

[Weinberg 1977] An Introduction to General Systems Thinking

l  The discipline of computing is pushing the boundaries of the two shaded areas inwards

l  Software development and computational science are the vanguards of system theory

l  However, there seems to be something inherently unpleasant about the white area in the middle l  It is extremely difficult to

analyze systems with many strongly interacting parts; science has barely touched it

l  Even biological organisms avoid it (they are mostly decomposable)

computing

computing

Page 24: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Adding consistency/coordination l  We start with a decentralized system (P+A, no C)

l  How much C do we need and how do we add it? l  General principle: as little as possible (weak interaction)

l  The rest of the talk explores how to add C l  Main design principle:

weakly interacting feedback structures l  We validate the approach on a real system

l  Scalaris, a transactional key/value store

P. Van Roy, UCL, Louvain-la-Neuve

24

Apr. 2013

Page 25: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

A scalable architecture in four steps l  Concurrent component

l  An active entity communicating with its neighbors through asynchronous messages

l  Intelligence is concentrated in complex components l  Feedback loop

l  Monitor, corrector, and actuator components connected to a subsystem and continuously maintaining one local goal

l  Feedback structure l  A set of feedback loops that work together to maintain

one global system property l  Weakly interacting feedback structures

l  The complete system is a conjunction of global properties, each maintained by one feedback structure

l  The feedback structures have dependencies based on the operating conditions

P. Van Roy, UCL, Louvain-la-Neuve 25 Apr. 2013

Page 26: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Scalaris with feedback structures

P. Van Roy, UCL, Louvain-la-Neuve

26

Apr. 2013

Page 27: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

A peer-to-peer key/value Store: Scalaris

l  Scalaris is a high-performance self-managing key/value store that provides transactions and is built on top of a structured overlay network l  A major result of the European SELFMAN project (www.ist-selfman.org) l  4000 read-modify-write transactions per second on two dual-core Intel Xeon at 2.66 GHz

l  Scalaris has five WIFS: connectivity management (Sconnect), routing (Sroute), load balancing (Sload), replica management (Sreplica), and transaction management (Strans)

Sscalaris= Skey-value ∧ Sconnect ∧ Sroute ∧

Sload ∧ Sreplica ∧ Strans The Scalaris specification is a conjunction of six properties. Each non-functional property is implemented by one feedback structure.

Sconnect → Sroute → Sreplica → Strans

P. Van Roy, UCL, Louvain-la-Neuve

Sload

27 Apr. 2013

! " "! " !

#!$%&'!()!)!*%&#"#!+&*,

-

.$)&#)*!"%&!/),+$

"- 0"-"!

)!%1"*"!,2!*%&#"#!+&*,2

"#%-)!"%&2!(3$)0"-"!,

4++$"!%"4++$ /),+$5)!) 5)!)

6+7-"*)!"%&!/),+$

#*)-)0"-"!,

)8)"-)0"-"!,

4++$ !% 4++$!/),+$

5)!)

9!%$+

:

5)!)

9!%$+

;

5)!)

9!%$+

<

5)!)

9!%$+

=

5)!)

9!%$+

>>>

5)!)

9!%$+

>>?

5)!)

9!%$+

>>@

5)!)

9!%$+

>>A

Page 28: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Scalaris scalability

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 28

Page 29: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Scalaris is based on a structured overlay network

Ring

Fingers

l  Structured overlay networks are often based on a ring l  By far the most popular structure,

it has many variants and has been extensively studied

l  Self organization is done at two levels: l  The ring ensures connectivity: it

must always exist despite node joins, leaves, and failures

l  The fingers provide efficient routing: they can be temporarily in an inconsistent state

P. Van Roy, UCL, Louvain-la-Neuve 29 Apr. 2013

Page 30: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Structured overlay networks: inspired by peer-to-peer

l  Hybrid (client/server) l  Napster

l  Unstructured overlay l  Gnutella, Kazaa,

Morpheus, Freenet, … l  Uses flooding

l  Structured overlay l  Exponential network l  DHT (Distributed Hash

Table), e.g., Chord, DKS, Scalaris, Beernet, etc.

R = N-1 (hub)

R = 1 (others)

H = 1

R = ? (variable)

H = 1…7

(but no guarantee)

R = log N

H = log N

(with guarantee)

P. Van Roy, UCL, Louvain-la-Neuve

30

Apr. 2013

Page 31: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

31

A “relaxed” structured overlay network

l  The relaxed ring is completely asynchronous l  Join and leave are completely

asynchronous (as opposed to Scalaris, where they are synchronous)

l  The bushes appear only if there are failure suspicions

l  Beernet implements the relaxed ring

l  There is a perfect ring (in red) as a subset of the relaxed ring

l  The relaxed ring is always converging to a perfect ring l  The bushiness depends on

churn (rate of change of the ring, leaves/joins) and failure suspicion rate (communication delays)

P. Van Roy, UCL, Louvain-la-Neuve

Perfect ring

Bushes

31

Apr. 2013

Page 32: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

32

More on the relaxed ring l  False failure suspicions are common on the Internet

l  We do not want to eject the node from the ring when this happens l  The relaxed ring solves this by doing ring maintenance in asynchronous

fashion [Mejias 2008] l  Nodes communicate through message passing l  For a join, instead of one step involving 3 peers (as in Scalaris, Chord, or

DKS), we have two steps each with 2 peers → we do not need locking or a periodic stabilization algorithm

l  Invariant: Every peer is in the same ring as its successor

P. Van Roy, UCL, Louvain-la-Neuve

32

Apr. 2013

Page 33: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Phases in the relaxed ring

l  The relaxed ring has (at least) three phases l  Uses ring merge algorithm developed in SELFMAN [Shafaat 2009] l  We are studying how the ring reacts to external stress (phase transitions)

l  Key questions: l  How do the phases show up at the application layer? (“qualitative changes”) l  How do we know when we are near a phase transition? (“early bubbling”)

P. Van Roy, UCL, Louvain-la-Neuve

33

Apr. 2013

Page 34: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Phases in large systems

l  A phase is a concise characterization of an aggregate behavior in a system consisting of many interacting components

l  Phases appear in many large systems l  Not just physical systems

(water) but also computing systems (like peer-to-peer)

P. Van Roy, UCL, Louvain-la-Neuve

l  Different parts of the system can be in different phases l  Depending on the local operating conditions (environment) l  Boundaries between phases can be sharp or diffuse l  Phase transitions and critical points can occur if operating conditions change

Water phase diagram (Copyright © Martin Chaplin)

34

Apr. 2013

Page 35: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Complex components

(supplement)

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

35

Page 36: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Some complex components l  Human intelligence

l  Main strength: adaptability (dynamic creation of new feedback loops)

l  Program intelligence l  Can easily go beyond human

intelligence in many areas! l  Turing test is irrelevant: complex

components are already replacing humans in more and more areas

l  Minesweeper digital assistant: uses constraints (easy to program!)

l  Chess: uses alpha-beta search with heuristics

l  Compiler: translates human-readable program into executable form

P. Van Roy, UCL, Louvain-la-Neuve

36

Apr. 2013

Page 37: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Properties of complex components l  Complex components are essential parts of many large systems

l  For example, conscious control in the human respiratory system l  Complex components completely solve a problem inside a specific

(small) part of the space of system operating conditions (from the viewpoint of the rest of the system) l  Conscious control, a chess program, and a compiler are extremely smart

within their operating space l  Outside of this space, they can be very stupid and should be inactive (on

their own accord or forced) l  Complex components are completely unpredictable when viewed

from the outside l  If it were not so, they would not be needed! l  They can be highly nonlinear and unstable; the rest of the system has to

trust them (typically, up to some hardwired fail-safe)

P. Van Roy, UCL, Louvain-la-Neuve

37

Apr. 2013

Page 38: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Power is built in, not added on

l  The power of a system depends on the strength of its complex components l  The human respiratory system uses conscious control (e.g., to avoid drowning!) l  Erlang OTP uses supervisor trees and a database to implement robustness l  Scalaris uses Paxos consensus and replication to implement fast transactions l  Google Search uses eigenvector calculation of the Web link matrix l  What does your system use?

P. Van Roy, UCL, Louvain-la-Neuve

3.6-Liter Biturbo Motor with 353 kW (480 HP)

Porsche Carrera GT

38

Apr. 2013

Page 39: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Why is conscious control so smart? l  Cognitive science and neuroscience try to understand why

l  The brain uses brute force, but in a very smart way

l  Conscious control is a bricklayer: it continuously builds and organizes new components on top of existing components l  This process is continuous from birth with compound interest effect,

which is why humans are so smart in common-sense tasks

l  It continuously brings the most useful concepts to the top (cache organization combined with “grandfather cell”) l  Manipulating common concepts is made easy

l  “Mirror neurons”: it can use its own components to simulate other humans, which is why humans can empathize so well with others

l  It can efficiently execute up to two complex programs at once (“walking and chewing gum”), because of the two-lobed structure of the brain

P. Van Roy, UCL, Louvain-la-Neuve

39

Apr. 2013

Page 40: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Clouds and elastic applications

(supplement)

P. Van Roy, UCL, Louvain-la-Neuve

40

Apr. 2013

Page 41: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Elastic computing l  Two main infrastructures for scalable computing

l  Peer-to-peer: use of client machines l  Cloud-based: use of datacenters

l  Cloud is elastic; peer-to-peer is not l  Elasticity: the ability to scale resource usage up and down

rapidly according to instantaneous demand l  Elasticity is a new property that did not exist before clouds

l  Elasticity makes possible a new kind of application l  Applications that use enormous computational and storage

resources for short times, but at constant (low) cost l  Applications that use data-intensive algorithms and machine

learning

P. Van Roy, UCL, Louvain-la-Neuve

41

Apr. 2013

Page 42: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Clouds are the first key: much more than meets the eye! l  Cloud computing is a form of

client-server where the “server” is a dynamically scalable network of loosely coupled heterogeneous nodes that are owned by a single institution

l  It allows enterprises to offload their computing infrastructure

l  It gives mobile devices an easy way to manage data

l  Is that all that cloud computing offers? l  No! This is just the tip of the iceberg! l  Cloud computing is the beginning of a much more profound change

P. Van Roy, UCL, Louvain-la-Neuve

42

Apr. 2013

Page 43: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Clouds are elastic!

l  Elasticity is the ability to ramp up resources quickly to meet demand l  Like electric power distribution

l  With elastic clouds the enormous dark blue area becomes available

l  Applications that need enormous resources for short times can get them for low cost! l  Like electric power distribution, pay

only for the volume (cost is product of time and number of machines)

l  This is exactly what intelligent applications need!

t (time interval)

r (resources)

t0

r0

Local resources

r · t ≤ c0

Elastic

43

resources

43

P. Van Roy, UCL, Louvain-la-Neuve Apr. 2013

Page 44: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Machine learning is the second key l  Machine learning is the discipline that studies how to program

computers to evolve behaviors based on example data or past experience

l  Machine learning can solve complex problems that we cannot solve in any other way l  It has many successes in practical applications both big and small, e.g.,

speech recognition, computer vision (face and handwriting, etc.), social prediction (epidemics, economics, retail, etc.), robot control (drones, cars, etc.), data mining, aiding natural sciences (biology, astronomy, neurology, etc.)

l  It is a major force on the Internet in big companies (Google, Amazon, Netflix, Facebook, etc.) as well as in startups (e.g., RecordedFuture)

l  Machine learning will (eventually) transform programming! l  Programmers will not work on raw data any more; instead they will build

machine learning systems l  “Programming, like all engineering, is a lot of work; machine learning is more

like farming, where we let nature do most of the work” – Pedro Domingos

P. Van Roy, UCL, Louvain-la-Neuve

44

Apr. 2013

Page 45: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

An elastic application (1): real-time voice translation l  The pieces of this application already exist; for example the IRCAM

research institute has implemented many of them l  It requires combining domain knowledge (in sound and language)

with an enormous sound fragment database, hosted on a cloud

Normalization to canonical voice

Decomposition into phoneme

sequences Lookup in

sound database

English/Chinese sound fragment

database

Concatenative synthesis

Denormalization to original voice

English voice

Chinese voice

(purely hypothetical design!)

l  Franz Och, head of translation services at Google, announced that they are working on something similar (Feb. 10, 2010)

l  Rick Rashid, head of research at Microsoft, has recently demonstrated a prototype of this application (Nov. 2012)

P. Van Roy, UCL, Louvain-la-Neuve 45 Apr. 2013

Page 46: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

An elastic application (2): ubiquitous augmentation l  Your sensory input will be “augmented” in real-time

l  Faces, objects, and names you see will be recognized l  Selected relevant information will be given spontaneously l  Foreign languages (text, audio, visual) will be translated l  When doing an activity, you will be guided to do it expertly l  When confronted with a problem, solutions will be suggested

l  The augmentation will be good enough that it can be always enabled (it doesn’t get in your way) l  It will learn to mesh with your thinking processes productively l  On the rare occasions that it is disabled, you will feel helpless

l  As if half of your brain just stopped working l  Like today’s Internet addictions, but much worse!

P. Van Roy, UCL, Louvain-la-Neuve

46

Apr. 2013

Page 47: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Query/use phase elastic resource requirements l (response time constraints)

Learning/setup phase elastic resource requirements l (learning time constraints)

l S  

l S  

l M  

l M  

l L  

l L  

Google  Search  Google  Translate  

l Recommenda3on  sys.  Speech  recogni3on  l Skype  connec3on  l Social  networks  

l Media  transla3on  

Weather  forecas3ng  

l Interactivity (learning + query)

l One-­‐shot  

l One-­‐way  stream  

Recorded  Future  

MMORPG  l Role-­‐playing  games  

l Chess  programs  

Championship  chess  (Deep  Fritz)  

IBM  Jeopardy(Watson)  

Wolfram  Alpha  Image  recogni3on  

Computer  algebra  Peer-­‐to-­‐peer  CDN  Google  Earth  l JIT  Compiler  

l XL  

l XL  

l Conversa3on  

BitTorrent  WIMP  GUI  

l MicrosoP  Office  

Tomorrow’s applications

Standard applications

Advanced applications

Real-­‐&me  audio  language  transla&on  

Real-­‐&me  expert  guidance  

 Crea&ve  problem  solving  

(controlled  search)  

 

Space of intelligent applications

l Intelligent  augmented  

reality  

P. Van Roy, UCL, Louvain-la-Neuve 47

Optimization is a form of learning!

Apr. 2013

Page 48: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Conclusions

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

48

Page 49: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

Conclusions l  Design of large distributed systems is difficult

l  Not just because of their own complexity, but because the environment becomes more hostile

l  How can we design them? l  Learn from existing systems that work l  Inspiration from the SELFMAN project on self-managing systems l  Inspiration from biological systems

l  Proposed architecture l  Weakly interacting feedback structures with dominant subsets l  Complex components to solve the problem in limited conditions l  Phases define behavior over all possible operating conditions

l  Validation l  First validation with the Scalaris and Beernet transactional key/value stores

l  Ongoing research l  Formalization and semantics l  Tie the approach to existing quantitative techniques (control theory, model checking, system

dynamics) l  Collaboration with system and application builders

Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve

49

Page 50: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

References

P. Van Roy, UCL, Louvain-la-Neuve

50

Apr. 2013

Page 51: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

References (1) l  Joe Armstrong. Making Reliable Distributed Systems in the Presence of Software Errors, Ph. D.

dissertation, Royal Institute of Technology (KTH), Kista, Sweden, Nov. 2003. l  Ken Birman, Gregory Chockler, and Robbert van Renesse. “Toward a Cloud Computing Research

Agenda”, 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware, ACM SIGACT News, 40(2): 68-80 (June 2009).

l  Alexandre Bultot. A Survey of Systems with Multiple Interacting Feedback Loops and Their Application to Programming, Master’s report, Dept. of Comp. Sci. and Eng., UCL, Aug. 2009.

l  Rick Cattell. “High Performance Scalable Data Stores”, Feb. 22, 2010. l  Raphaël Collet. The Limits of Network Transparency in a Distributed Programming Language, Ph. D.

dissertation, Dept. of Comp. Sci. and Eng., UCL, Dec. 2007. l  Michael Fischer, Nancy Lynch, and Michael Paterson. “Impossibility of Distributed Consensus with One

Faulty Process”, Journal of the ACM, 32(2): 374-382 (April 1985). l  Seth Gilbert and Nancy Lynch. “Brewer’s Conjecture and the Feasibility of Consistent, Available,

Partition-Tolerant Web Services”, ACM SIGACT News, 33(2): 51-59 (2002). l  Rachid Guerraoui and Luís Rodrigues. Introduction to Reliable Distributed Programming, Springer-

Verlag, 2006. l  Márk Jelasity and Özalp Babaoglu. “T-Man: Gossip-based Overlay Topology Management”, Proc. 3rd

Int. Workshop on Engineering Self-Organising Systems (ESOA 2005), Springer-Verlag LNCS volume 3910, 2006, pp. 1-15.

l  Boris Mejías. A Relaxed Ring for Self-Managing Decentralized Systems with Transactional Replicated Storage, Ph. D. dissertation, Dept. of Comp. Sci. and Eng., UCL, Oct. 2010.

l  Gerhard Michal and Dietmar Schomburg. Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, Wiley-Blackwell, 1999 (first edition), 2012 (second edition).

P. Van Roy, UCL, Louvain-la-Neuve

51

Apr. 2013

Page 52: Designing robust distributed systems with weakly ... · 4/24/2013  · organ axis (endocrine system) ! Two superimposed groups of negative feedback loops, a third short negative loop,

References (2) l  Florian Schintke, Alexander Reinefeld, Seif Haridi, and Thorsten Schütt. “Enhanced Paxos Commit for

Transactions on DHTs”, 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), May 17-20, 2010, Melbourne, Australia.

l  SELFMAN: Self Management for Large-Scale Distributed Systems Based on Structured Overlay Networks and Components. European 6th Framework Programme, www.ist-selfman.org (2009).

l  Peter M. Senge et al. The Fifth Discipline Fieldbook: Strategies and Tools for Building a Learning Organization, Nicholas Brealey Publishing, 1994.

l  Tallat M. Shafaat, Ali Ghodsi, and Seif Haridi. “Dealing with Network Partitions in Structured Overlay Networks”, Journal of Peer-to-Peer Networking and Applications, 2(4): 334-347 (2009).

l  Steven Strogatz. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering (Studies in Nonlinearity), Perseus Books, 1994.

l  Nassim Taleb. The Black Swan: The Impact of the Highly Improbable, Penguin Books, 2008. l  Peter Van Roy, Seif Haridi, and Alexander Reinefeld. “Software Design with Weakly Interacting

Feedback Structures and Its Application to Distributed Systems”, Research Report, Dept. of Comp. Sci. and Eng., UCL, 2011.

l  Peter Van Roy. “Programming Paradigms for Dummies: What Every Programmer Should Know”, chapter in New Computational Paradigms for Computer Music, G. Assayag and A. Gerzso (eds.), IRCAM/Delatour France, June 2009.

l  Gerald M. Weinberg. An Introduction to General Systems Thinking, Dorset House Publishing, 1975 (Silver Anniversary Edition 2001).

l  Norbert Wiener. Cybernetics, or Control and Communication in the Animal and the Machine, MIT Press, Cambridge, MA, 1948.

l  Ulf Wiger. “Four-fold Increase in Productivity and Quality – Industrial Strength Functional Programming in Telecom-Class Products”, Ericsson Telecom AB, 2001.

P. Van Roy, UCL, Louvain-la-Neuve

52

Apr. 2013