designing robust distributed systems with weakly ... · 4/24/2013 · organ axis (endocrine...
TRANSCRIPT
Designing robust distributed systems
with weakly interacting feedback structures
EPITA
April 24, 2013
Peter Van Roy
ICTEAM Institute Université catholique de Louvain
Louvain-la-Neuve, Belgium
P. Van Roy, UCL, Louvain-la-Neuve
1
Apr. 2013
Overview l As Internet services become larger, they become more complex and their
environment becomes more hostile l Partial failures, software errors, communication problems, churn, attacks l Problems due to global behavior (oscillations, traffic jams, multicast storms,
thundering herds, chaotic fluctuations, thrashing, cascading failures) l How can we design Internet services to provide predictable behavior in such
conditions? l Motivating examples from biology (and some well-designed computing systems) l Proposed architecture for scalable services as a set of weakly interacting feedback
structures with dependencies l Preliminary evaluation based on Scalaris key/value store (SELFMAN project)
l Scalaris provides high-performance transactions on a structured overlay network l Scalaris contains five feedback structures and their dependencies: connectivity,
routing, load balancing, replication, and transactions l We are currently formalizing this approach and tying it to existing quantitative
approaches P. Van Roy, UCL, Louvain-la-Neuve 2 Apr. 2013
Motivating examples from biology and computing
P. Van Roy, UCL, Louvain-la-Neuve
3
Apr. 2013
Motivating examples l Many systems exist that survive in hostile environments
l Biological organisms l Some computing systems l Human organizations
l It is a good idea to study these systems to derive general design principles
l We give five examples to introduce the main ideas l Hotel lobby example → Debugging of feedback structures l Human respiratory system → Design rules, state diagram l TCP protocol operation → Systems with many parts l Human endocrine system → Concurrent component model l Human organizations → Design patterns for feedback structures
P. Van Roy, UCL, Louvain-la-Neuve 4 Apr. 2013
5
Simple example: hotel lobby (from [Wiener 1948])
l This is unstable! l The tribesman stokes
the fire but gets colder and colder because the airconditioning works harder and harder
l Wiener leaves the fix as homework for the reader (!)
l One possible solution: outer loop (tribesman) controls the other by simply adjusting the thermostat l One loop controls the
other
l Two loops interacting through a common subsystem (stigmergy)
Monitoring agents
Thermostat(run aircond. if too warm)
(stoke fire if too cold)Tribesman
Measuretemperature
near fire
Measuretemperature
in lobbyairconditioningRun
fireStoke
Subsystem
Hotel lobby
Calculate corrective action
Fire
Hotel lobby
Tribesman
Thermostat
Actuating agents
P. Van Roy, UCL, Louvain-la-Neuve
5
Apr. 2013
Hotel lobby solution
l Instead of stoking a fire, the tribesman simply adjusts the thermostat. The resulting system is stable.
l This uses management (one loop controls another) instead of stigmergy (two loops interact through the environment)
l Design pattern: use the system, don’t try to bypass it
(adjust thermostat)
Thermostat(run aircond. if too warm)
airconditioningRun
Hotel lobby
Tribesman
Measuretemperature
Measuretemperature
at thermostat at tribesman
P. Van Roy, UCL, Louvain-la-Neuve
6
Apr. 2013
Human respiratory system
l Default behavior: rhythmic breathing reflex l Complex component: conscious control can override and plan lifesaving actions l Abstraction: conscious control does not need to know details of breathing reflex l Fail-safe: conscious control can itself be overridden (falling unconscious) l Time scales: laryngospasm is a quick action that interrupts slower breathing reflex
Other inputs
when sufficient obstruction in airways
Laryngospasm
(seal air tube)
Breathing
reflex
MeasureO2
in blood
Monitor
breathing
MeasureCO2
in blood
Detectobstructionin airways
Trigger unconsciousness
when O2 falls to threshold
Conscious control
of body and breathing
Trigger breathing reflex
when CO2 increases to threshold
Trigger laryngospasm temporarily
Actuating agents Monitoring agents
in human body
Breathing apparatus
(maximum is breath!hold breakpoint)and change CO2 threshold
Increase or decrease breathing rate
(and reduce CO2 threshold to base level)Render unconscious
P. Van Roy, UCL, Louvain-la-Neuve
Some design rules:
The operation of the human respiratory system is given as one feedback structure, inferred from a precise medical description of its behavior
7
Apr. 2013
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 8
Discussion of respiratory system l Four feedback loops: two inner loops (breathing reflex and
laryngospasm), a loop controlling the breathing reflex (conscious control), and an outer loop controlling the conscious control (falling unconscious) l This design is derived from a precise textual medical description
[Wikipedia 2006: Entry “Drowning”] l Holding your breath can have two effects
l Breath-hold threshold is reached first and breathing reflex happens l O2 threshold is reached first and you fall unconscious, which reestablishes the
normal breathing reflex l Some plausible design rules inferred from this system
l Conscious control is sandwiched in between two simpler loops: the breathing reflex provides abstraction (consciousness does not have to understand details of breathing) and falling unconscious provides protection against instability
l Conscious control is a powerful problem solver but it needs to be held in check
Respiratory system state diagram
l The behavior of the human respiratory system modeled as a state diagram l Dominant subset = active subset of feedback loops = state
l At any time, one subset is active, depending on operating conditions l Each subset corresponds to a state in the state diagram
P. Van Roy, UCL, Louvain-la-Neuve 9 Apr. 2013
10
TCP as feedback structures l This example shows a
reliable byte stream protocol with congestion control (a variant of TCP) l This diagram is for the
sending side l The congestion control
loop manages the reliable transfer loop l By changing the sliding
window’s buffer size l With n connections there
are n feedback structures interacting through a shared network (stigmergy) l This is an example of a
system with many WIFS l Each FS has its own
state
P. Van Roy, UCL, Louvain-la-Neuve
Send
Inner loop (reliable transfer)
Outer loop (congestion control)
Calculate policy modification
Actuator(send packet)
Monitor Monitorthroughput
Calculate bytes to send
(modify throughput)
(sliding window protocol)
destination and receives ack)(network that sends packet to
Subsystem
(receive ack)
Sendstream acknowledgement
10
Apr. 2013
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 11
Human endocrine system l The endocrine system regulates many quantities in
the human body l It uses chemical messengers called hormones
which are secreted by specialized glands and which exercise their action at a distance, using the blood stream as a diffusion channel
l By studying the endocrine system, we can obtain insights in how to build large-scale self-regulating distributed systems
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 12
Feedback loops in the endocrine system l There are many feedback loops and systems of interacting feedback
loops in the endocrine system l Provides homeostasis (stability) and reaction to stresses
l Much regulation is done by simple negative feedback loops l Glucose level in blood is regulated by hormones glucagon & insulin. In the
pancreas, A cells secrete glucagon and B cells secrete insulin. Increase in glucose in blood causes decrease in glucagon and increase in insulin. These hormones act on the liver, which releases glucose in the bloodstream.
l Calcium level in blood is regulated by parathyroid hormone (parathormone) and calcitonine (also in opposite directions), which act on the bone
l More complex regulatory mechanisms exist, e.g., hypothalamus-pituitary-target organ axis
l There is interaction between nervous transmission and hormonal transmission
P. Van Roy, UCL, Louvain-la-Neuve
Hypothalamus-pituitary-target organ axis (endocrine system)
l Two superimposed groups of negative feedback loops, a third short negative loop, a fourth loop from the central nervous system [Encyclopaedia Britannica 2005]
l This diagram shows only the main components and their interactions; there are many more parts giving a much more complex full system
13 Apr. 2013
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
14
Discussion of endocrine system l This system is quite complex
l Many interacting feedback loops, many “short circuits”, many special cases, much interaction with other systems (nervous, immune) l Negative feedback for most, also saturation (logistic curve) l Evolution is not always a parsimonious designer!
§ Only criterion: it has to work
l Several feedback loops are channeled through a single point, the hypothalamus-pituitary complex in the brain l So that the central nervous system can manage these loops l Different time scales are used: the loops are slow; the
central nervous system is fast
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
15
Computational architecture of human endocrine system l Local and global components
l Local: gland, organ, or clumps of cells l Global (diffuse): large part of the body
l Point-to-point and broadcast channels l Fast point-to-point: nerve fiber, e.g., from spinal chord to muscle l Slower broadcast: hormone diffused by blood circulation
l With buffering (reducing variations): carrier proteins l Regulatory mechanisms can be modeled by interactions
between components and channels l There are often intermediate links l Abstraction (encapsulation) is almost always approximate
Design patterns for feedback structures
l We can arrange feedback structures in a tree according to their relationships and the problems they solve
P. Van Roy, UCL, Louvain-la-Neuve
Chapter
4.Feedback
systems
-patterns
64
I am most concerned about ...
Balancing LoopReinforcing Loop
vicious and virtuous spiral
Limits to GrowthSuccess to the
Successful
Accidental Adversaries
Fix that BackfireDrifting Goals Escalation IndecisionBalancing with
Delay
Tragedy of the Commons
The Attractiveness Principle
Growth & Underinvestment
Shifting the Burden
AddictionGrowth &
Underinvestment (drifting standards)
growth fixing problems
But my growth seems to lead
to your decline...
But nothinggrows forevever...
I have more thanone limit and can'taddress all of them
equally...
... so, if we're allup against the
same limit My capacity is mylimit. Therefore, mycapacity isn't large
enough...
... but there's a temptation to let my
standards slip instead ...
I form a partnershipfor growth, but end up
feeling betrayed ...
...by making my partner intoan adversary ...
While waiting for myfix to take hold, to relieve the tension,I become satisfied
with less ...
But my fix is yournightmare But I don't know
what I'm going to do
But sometimes, the reaction
is not immediate
But my fix comes backto haunt me
... because I'm gettingat the real underlying
cause ...
The drifting goalsundermine my long-term
growth ...
but once I becomeaddicted to the
symptomatic solution ...
Fig
ure
4.4
0:
The
Archetyp
eFam
ilyTree
Archetype Family Tree (from [Senge 1994])
16 Apr. 2013
Designing scalable systems
P. Van Roy, UCL, Louvain-la-Neuve
17
Apr. 2013
Designing scalable systems l Essential ingredients
l Design principles inferred from existing working systems and validated subsequently
l The CAP theorem is an essential tool l It holds at all scales and all levels of abstraction
l First step l The default is a set of independent parts l We add coordination between these parts l It is important to add as little coordination as possible
l Next step: weakly interacting feedback structures Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
18
The CAP theorem l The CAP theorem is an essential tool for any scalable system
design l The CAP theorem was conjectured by Eric Brewer at PODC in 2000 and
proved by Seth Gilbert and Nancy Lynch in 2002
l For an asynchronous network, it is impossible to implement an object that guarantees the following properties in all fair executions: l Consistency: all operations are atomic (totally ordered) l Availability: every request eventually returns a result l Partition tolerance: any messages may be lost
l The CAP Theorem applies for all systems, at all levels of abstraction, and at all sizes l It can be applied in many places in the same system l The whole system is a rainbow of interacting instances of CAP
P. Van Roy, UCL, Louvain-la-Neuve
19
Apr. 2013
Designing with CAP l C is hard to achieve → (P+A, no C) is the default
l Consistency requires global coordination
l Avoid needing C if possible l We can achieve robustness (P) and performance (A)
l DropBox and Web cache give P and A, but not C l Wuala and BitTorrent are read-only, achieve C easily l Mercurial is consistent if connected (C+A), but is still usable if disconnected (P+A)
l But if we really need C l Give up A → Waiting sometimes needed l Give up P → Fragile system
l Distributed database guarantees C but will block if there is a partition
l Accept weaker C → Eventual consistency
l We can have our cake and eat it too, if we pay the price l Highly reliable communication channels and fault tolerance l We get C and A, and we “seem” to get P as well (actually, we just have less partitions)
l Scalaris, Beernet: peer-to-peer with majority consensus (Paxos) gives robustness l Cassandra: run on cloud, not peer-to-peer (does not support loose coupling)
P. Van Roy, UCL, Louvain-la-Neuve
20
Apr. 2013
The default is a set of independent parts
l Every scalable design starts as a decentralized system (P+A, no C) l A system of independent parts
l Nodes occasionally interact (add some C) → collaboration, emergence l Split protocol: what happens when a node leaves a group (may be abrupt) l Merge protocol: what happens when a node joins a group
l Merge is based on data coherence and may need input from highest level l Many examples: biology, peer-to-peer, map-reduce, gas/liquid/solid, …
P. Van Roy, UCL, Louvain-la-Neuve
split/merge
l group
21
Apr. 2013
Mostly independent parts l Large systems consist of independent parts with weak
interactions l Gas in a box: molecules mostly independent, occasional interaction
when two molecules collide. l Peer-to-peer network: peers mostly independent, occasional
interaction between neighbors only. Can provide efficient and robust communication and storage infrastructure (see later).
l Gossip algorithm: nodes mostly independent, occasional interaction between random pairs. Can efficiently solve many global problems such as diffusion, search, aggregation, monitoring, and topology management.
l This seems to be a general principle l Systems with many parts that interact strongly are avoided by nature
P. Van Roy, UCL, Louvain-la-Neuve
22
Apr. 2013
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 23
Types of systems l This diagram is from
[Weinberg 1977] An Introduction to General Systems Thinking
l The discipline of computing is pushing the boundaries of the two shaded areas inwards
l Software development and computational science are the vanguards of system theory
l However, there seems to be something inherently unpleasant about the white area in the middle l It is extremely difficult to
analyze systems with many strongly interacting parts; science has barely touched it
l Even biological organisms avoid it (they are mostly decomposable)
computing
computing
Adding consistency/coordination l We start with a decentralized system (P+A, no C)
l How much C do we need and how do we add it? l General principle: as little as possible (weak interaction)
l The rest of the talk explores how to add C l Main design principle:
weakly interacting feedback structures l We validate the approach on a real system
l Scalaris, a transactional key/value store
P. Van Roy, UCL, Louvain-la-Neuve
24
Apr. 2013
A scalable architecture in four steps l Concurrent component
l An active entity communicating with its neighbors through asynchronous messages
l Intelligence is concentrated in complex components l Feedback loop
l Monitor, corrector, and actuator components connected to a subsystem and continuously maintaining one local goal
l Feedback structure l A set of feedback loops that work together to maintain
one global system property l Weakly interacting feedback structures
l The complete system is a conjunction of global properties, each maintained by one feedback structure
l The feedback structures have dependencies based on the operating conditions
P. Van Roy, UCL, Louvain-la-Neuve 25 Apr. 2013
Scalaris with feedback structures
P. Van Roy, UCL, Louvain-la-Neuve
26
Apr. 2013
A peer-to-peer key/value Store: Scalaris
l Scalaris is a high-performance self-managing key/value store that provides transactions and is built on top of a structured overlay network l A major result of the European SELFMAN project (www.ist-selfman.org) l 4000 read-modify-write transactions per second on two dual-core Intel Xeon at 2.66 GHz
l Scalaris has five WIFS: connectivity management (Sconnect), routing (Sroute), load balancing (Sload), replica management (Sreplica), and transaction management (Strans)
Sscalaris= Skey-value ∧ Sconnect ∧ Sroute ∧
Sload ∧ Sreplica ∧ Strans The Scalaris specification is a conjunction of six properties. Each non-functional property is implemented by one feedback structure.
Sconnect → Sroute → Sreplica → Strans
P. Van Roy, UCL, Louvain-la-Neuve
Sload
27 Apr. 2013
! " "! " !
#!$%&'!()!)!*%&#"#!+&*,
-
.$)&#)*!"%&!/),+$
"- 0"-"!
)!%1"*"!,2!*%&#"#!+&*,2
"#%-)!"%&2!(3$)0"-"!,
4++$"!%"4++$ /),+$5)!) 5)!)
6+7-"*)!"%&!/),+$
#*)-)0"-"!,
)8)"-)0"-"!,
4++$ !% 4++$!/),+$
5)!)
9!%$+
:
5)!)
9!%$+
;
5)!)
9!%$+
<
5)!)
9!%$+
=
5)!)
9!%$+
>>>
5)!)
9!%$+
>>?
5)!)
9!%$+
>>@
5)!)
9!%$+
>>A
Scalaris scalability
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve 28
Scalaris is based on a structured overlay network
Ring
Fingers
l Structured overlay networks are often based on a ring l By far the most popular structure,
it has many variants and has been extensively studied
l Self organization is done at two levels: l The ring ensures connectivity: it
must always exist despite node joins, leaves, and failures
l The fingers provide efficient routing: they can be temporarily in an inconsistent state
P. Van Roy, UCL, Louvain-la-Neuve 29 Apr. 2013
Structured overlay networks: inspired by peer-to-peer
l Hybrid (client/server) l Napster
l Unstructured overlay l Gnutella, Kazaa,
Morpheus, Freenet, … l Uses flooding
l Structured overlay l Exponential network l DHT (Distributed Hash
Table), e.g., Chord, DKS, Scalaris, Beernet, etc.
R = N-1 (hub)
R = 1 (others)
H = 1
R = ? (variable)
H = 1…7
(but no guarantee)
R = log N
H = log N
(with guarantee)
P. Van Roy, UCL, Louvain-la-Neuve
30
Apr. 2013
31
A “relaxed” structured overlay network
l The relaxed ring is completely asynchronous l Join and leave are completely
asynchronous (as opposed to Scalaris, where they are synchronous)
l The bushes appear only if there are failure suspicions
l Beernet implements the relaxed ring
l There is a perfect ring (in red) as a subset of the relaxed ring
l The relaxed ring is always converging to a perfect ring l The bushiness depends on
churn (rate of change of the ring, leaves/joins) and failure suspicion rate (communication delays)
P. Van Roy, UCL, Louvain-la-Neuve
Perfect ring
Bushes
31
Apr. 2013
32
More on the relaxed ring l False failure suspicions are common on the Internet
l We do not want to eject the node from the ring when this happens l The relaxed ring solves this by doing ring maintenance in asynchronous
fashion [Mejias 2008] l Nodes communicate through message passing l For a join, instead of one step involving 3 peers (as in Scalaris, Chord, or
DKS), we have two steps each with 2 peers → we do not need locking or a periodic stabilization algorithm
l Invariant: Every peer is in the same ring as its successor
P. Van Roy, UCL, Louvain-la-Neuve
32
Apr. 2013
Phases in the relaxed ring
l The relaxed ring has (at least) three phases l Uses ring merge algorithm developed in SELFMAN [Shafaat 2009] l We are studying how the ring reacts to external stress (phase transitions)
l Key questions: l How do the phases show up at the application layer? (“qualitative changes”) l How do we know when we are near a phase transition? (“early bubbling”)
P. Van Roy, UCL, Louvain-la-Neuve
33
Apr. 2013
Phases in large systems
l A phase is a concise characterization of an aggregate behavior in a system consisting of many interacting components
l Phases appear in many large systems l Not just physical systems
(water) but also computing systems (like peer-to-peer)
P. Van Roy, UCL, Louvain-la-Neuve
l Different parts of the system can be in different phases l Depending on the local operating conditions (environment) l Boundaries between phases can be sharp or diffuse l Phase transitions and critical points can occur if operating conditions change
Water phase diagram (Copyright © Martin Chaplin)
34
Apr. 2013
Complex components
(supplement)
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
35
Some complex components l Human intelligence
l Main strength: adaptability (dynamic creation of new feedback loops)
l Program intelligence l Can easily go beyond human
intelligence in many areas! l Turing test is irrelevant: complex
components are already replacing humans in more and more areas
l Minesweeper digital assistant: uses constraints (easy to program!)
l Chess: uses alpha-beta search with heuristics
l Compiler: translates human-readable program into executable form
P. Van Roy, UCL, Louvain-la-Neuve
36
Apr. 2013
Properties of complex components l Complex components are essential parts of many large systems
l For example, conscious control in the human respiratory system l Complex components completely solve a problem inside a specific
(small) part of the space of system operating conditions (from the viewpoint of the rest of the system) l Conscious control, a chess program, and a compiler are extremely smart
within their operating space l Outside of this space, they can be very stupid and should be inactive (on
their own accord or forced) l Complex components are completely unpredictable when viewed
from the outside l If it were not so, they would not be needed! l They can be highly nonlinear and unstable; the rest of the system has to
trust them (typically, up to some hardwired fail-safe)
P. Van Roy, UCL, Louvain-la-Neuve
37
Apr. 2013
Power is built in, not added on
l The power of a system depends on the strength of its complex components l The human respiratory system uses conscious control (e.g., to avoid drowning!) l Erlang OTP uses supervisor trees and a database to implement robustness l Scalaris uses Paxos consensus and replication to implement fast transactions l Google Search uses eigenvector calculation of the Web link matrix l What does your system use?
P. Van Roy, UCL, Louvain-la-Neuve
3.6-Liter Biturbo Motor with 353 kW (480 HP)
Porsche Carrera GT
38
Apr. 2013
Why is conscious control so smart? l Cognitive science and neuroscience try to understand why
l The brain uses brute force, but in a very smart way
l Conscious control is a bricklayer: it continuously builds and organizes new components on top of existing components l This process is continuous from birth with compound interest effect,
which is why humans are so smart in common-sense tasks
l It continuously brings the most useful concepts to the top (cache organization combined with “grandfather cell”) l Manipulating common concepts is made easy
l “Mirror neurons”: it can use its own components to simulate other humans, which is why humans can empathize so well with others
l It can efficiently execute up to two complex programs at once (“walking and chewing gum”), because of the two-lobed structure of the brain
P. Van Roy, UCL, Louvain-la-Neuve
39
Apr. 2013
Clouds and elastic applications
(supplement)
P. Van Roy, UCL, Louvain-la-Neuve
40
Apr. 2013
Elastic computing l Two main infrastructures for scalable computing
l Peer-to-peer: use of client machines l Cloud-based: use of datacenters
l Cloud is elastic; peer-to-peer is not l Elasticity: the ability to scale resource usage up and down
rapidly according to instantaneous demand l Elasticity is a new property that did not exist before clouds
l Elasticity makes possible a new kind of application l Applications that use enormous computational and storage
resources for short times, but at constant (low) cost l Applications that use data-intensive algorithms and machine
learning
P. Van Roy, UCL, Louvain-la-Neuve
41
Apr. 2013
Clouds are the first key: much more than meets the eye! l Cloud computing is a form of
client-server where the “server” is a dynamically scalable network of loosely coupled heterogeneous nodes that are owned by a single institution
l It allows enterprises to offload their computing infrastructure
l It gives mobile devices an easy way to manage data
l Is that all that cloud computing offers? l No! This is just the tip of the iceberg! l Cloud computing is the beginning of a much more profound change
P. Van Roy, UCL, Louvain-la-Neuve
42
Apr. 2013
Clouds are elastic!
l Elasticity is the ability to ramp up resources quickly to meet demand l Like electric power distribution
l With elastic clouds the enormous dark blue area becomes available
l Applications that need enormous resources for short times can get them for low cost! l Like electric power distribution, pay
only for the volume (cost is product of time and number of machines)
l This is exactly what intelligent applications need!
t (time interval)
r (resources)
t0
r0
Local resources
r · t ≤ c0
Elastic
43
resources
43
P. Van Roy, UCL, Louvain-la-Neuve Apr. 2013
Machine learning is the second key l Machine learning is the discipline that studies how to program
computers to evolve behaviors based on example data or past experience
l Machine learning can solve complex problems that we cannot solve in any other way l It has many successes in practical applications both big and small, e.g.,
speech recognition, computer vision (face and handwriting, etc.), social prediction (epidemics, economics, retail, etc.), robot control (drones, cars, etc.), data mining, aiding natural sciences (biology, astronomy, neurology, etc.)
l It is a major force on the Internet in big companies (Google, Amazon, Netflix, Facebook, etc.) as well as in startups (e.g., RecordedFuture)
l Machine learning will (eventually) transform programming! l Programmers will not work on raw data any more; instead they will build
machine learning systems l “Programming, like all engineering, is a lot of work; machine learning is more
like farming, where we let nature do most of the work” – Pedro Domingos
P. Van Roy, UCL, Louvain-la-Neuve
44
Apr. 2013
An elastic application (1): real-time voice translation l The pieces of this application already exist; for example the IRCAM
research institute has implemented many of them l It requires combining domain knowledge (in sound and language)
with an enormous sound fragment database, hosted on a cloud
Normalization to canonical voice
Decomposition into phoneme
sequences Lookup in
sound database
English/Chinese sound fragment
database
Concatenative synthesis
Denormalization to original voice
English voice
Chinese voice
(purely hypothetical design!)
l Franz Och, head of translation services at Google, announced that they are working on something similar (Feb. 10, 2010)
l Rick Rashid, head of research at Microsoft, has recently demonstrated a prototype of this application (Nov. 2012)
P. Van Roy, UCL, Louvain-la-Neuve 45 Apr. 2013
An elastic application (2): ubiquitous augmentation l Your sensory input will be “augmented” in real-time
l Faces, objects, and names you see will be recognized l Selected relevant information will be given spontaneously l Foreign languages (text, audio, visual) will be translated l When doing an activity, you will be guided to do it expertly l When confronted with a problem, solutions will be suggested
l The augmentation will be good enough that it can be always enabled (it doesn’t get in your way) l It will learn to mesh with your thinking processes productively l On the rare occasions that it is disabled, you will feel helpless
l As if half of your brain just stopped working l Like today’s Internet addictions, but much worse!
P. Van Roy, UCL, Louvain-la-Neuve
46
Apr. 2013
Query/use phase elastic resource requirements l (response time constraints)
Learning/setup phase elastic resource requirements l (learning time constraints)
l S
l S
l M
l M
l L
l L
Google Search Google Translate
l Recommenda3on sys. Speech recogni3on l Skype connec3on l Social networks
l Media transla3on
Weather forecas3ng
l Interactivity (learning + query)
l One-‐shot
l One-‐way stream
Recorded Future
MMORPG l Role-‐playing games
l Chess programs
Championship chess (Deep Fritz)
IBM Jeopardy(Watson)
Wolfram Alpha Image recogni3on
Computer algebra Peer-‐to-‐peer CDN Google Earth l JIT Compiler
l XL
l XL
l Conversa3on
BitTorrent WIMP GUI
l MicrosoP Office
Tomorrow’s applications
Standard applications
Advanced applications
Real-‐&me audio language transla&on
Real-‐&me expert guidance
Crea&ve problem solving
(controlled search)
Space of intelligent applications
l Intelligent augmented
reality
P. Van Roy, UCL, Louvain-la-Neuve 47
Optimization is a form of learning!
Apr. 2013
Conclusions
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
48
Conclusions l Design of large distributed systems is difficult
l Not just because of their own complexity, but because the environment becomes more hostile
l How can we design them? l Learn from existing systems that work l Inspiration from the SELFMAN project on self-managing systems l Inspiration from biological systems
l Proposed architecture l Weakly interacting feedback structures with dominant subsets l Complex components to solve the problem in limited conditions l Phases define behavior over all possible operating conditions
l Validation l First validation with the Scalaris and Beernet transactional key/value stores
l Ongoing research l Formalization and semantics l Tie the approach to existing quantitative techniques (control theory, model checking, system
dynamics) l Collaboration with system and application builders
Apr. 2013 P. Van Roy, UCL, Louvain-la-Neuve
49
References
P. Van Roy, UCL, Louvain-la-Neuve
50
Apr. 2013
References (1) l Joe Armstrong. Making Reliable Distributed Systems in the Presence of Software Errors, Ph. D.
dissertation, Royal Institute of Technology (KTH), Kista, Sweden, Nov. 2003. l Ken Birman, Gregory Chockler, and Robbert van Renesse. “Toward a Cloud Computing Research
Agenda”, 3rd ACM SIGOPS International Workshop on Large Scale Distributed Systems and Middleware, ACM SIGACT News, 40(2): 68-80 (June 2009).
l Alexandre Bultot. A Survey of Systems with Multiple Interacting Feedback Loops and Their Application to Programming, Master’s report, Dept. of Comp. Sci. and Eng., UCL, Aug. 2009.
l Rick Cattell. “High Performance Scalable Data Stores”, Feb. 22, 2010. l Raphaël Collet. The Limits of Network Transparency in a Distributed Programming Language, Ph. D.
dissertation, Dept. of Comp. Sci. and Eng., UCL, Dec. 2007. l Michael Fischer, Nancy Lynch, and Michael Paterson. “Impossibility of Distributed Consensus with One
Faulty Process”, Journal of the ACM, 32(2): 374-382 (April 1985). l Seth Gilbert and Nancy Lynch. “Brewer’s Conjecture and the Feasibility of Consistent, Available,
Partition-Tolerant Web Services”, ACM SIGACT News, 33(2): 51-59 (2002). l Rachid Guerraoui and Luís Rodrigues. Introduction to Reliable Distributed Programming, Springer-
Verlag, 2006. l Márk Jelasity and Özalp Babaoglu. “T-Man: Gossip-based Overlay Topology Management”, Proc. 3rd
Int. Workshop on Engineering Self-Organising Systems (ESOA 2005), Springer-Verlag LNCS volume 3910, 2006, pp. 1-15.
l Boris Mejías. A Relaxed Ring for Self-Managing Decentralized Systems with Transactional Replicated Storage, Ph. D. dissertation, Dept. of Comp. Sci. and Eng., UCL, Oct. 2010.
l Gerhard Michal and Dietmar Schomburg. Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, Wiley-Blackwell, 1999 (first edition), 2012 (second edition).
P. Van Roy, UCL, Louvain-la-Neuve
51
Apr. 2013
References (2) l Florian Schintke, Alexander Reinefeld, Seif Haridi, and Thorsten Schütt. “Enhanced Paxos Commit for
Transactions on DHTs”, 10th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2010), May 17-20, 2010, Melbourne, Australia.
l SELFMAN: Self Management for Large-Scale Distributed Systems Based on Structured Overlay Networks and Components. European 6th Framework Programme, www.ist-selfman.org (2009).
l Peter M. Senge et al. The Fifth Discipline Fieldbook: Strategies and Tools for Building a Learning Organization, Nicholas Brealey Publishing, 1994.
l Tallat M. Shafaat, Ali Ghodsi, and Seif Haridi. “Dealing with Network Partitions in Structured Overlay Networks”, Journal of Peer-to-Peer Networking and Applications, 2(4): 334-347 (2009).
l Steven Strogatz. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering (Studies in Nonlinearity), Perseus Books, 1994.
l Nassim Taleb. The Black Swan: The Impact of the Highly Improbable, Penguin Books, 2008. l Peter Van Roy, Seif Haridi, and Alexander Reinefeld. “Software Design with Weakly Interacting
Feedback Structures and Its Application to Distributed Systems”, Research Report, Dept. of Comp. Sci. and Eng., UCL, 2011.
l Peter Van Roy. “Programming Paradigms for Dummies: What Every Programmer Should Know”, chapter in New Computational Paradigms for Computer Music, G. Assayag and A. Gerzso (eds.), IRCAM/Delatour France, June 2009.
l Gerald M. Weinberg. An Introduction to General Systems Thinking, Dorset House Publishing, 1975 (Silver Anniversary Edition 2001).
l Norbert Wiener. Cybernetics, or Control and Communication in the Animal and the Machine, MIT Press, Cambridge, MA, 1948.
l Ulf Wiger. “Four-fold Increase in Productivity and Quality – Industrial Strength Functional Programming in Telecom-Class Products”, Ericsson Telecom AB, 2001.
P. Van Roy, UCL, Louvain-la-Neuve
52
Apr. 2013