clojure's take on concurrency

Clojure’s take on Concurrency Yoav Rubin

Upload: yoavrubin

Post on 15-Jan-2015


TRANSCRIPT

Page 1: Clojure's take on concurrency

Clojure’s take on Concurrency

Yoav Rubin

Page 2: Clojure's take on concurrency

About me

• Software engineer at IBM Research, Haifa
  – Worked on
    • From large-scale products to small-scale research projects
  – Domains
    • Software tools
    • Development environments
    • Simplified programming
  – Technologies
    • Frontend engineering
    • Java, Clojure
• Teach the course “Functional programming on the JVM” at Haifa University

{:name “Yoav Rubin”, :email [email protected], :blog http://yoavrubin.blogspot.com, :twitter @yoavrubin}

Page 3: Clojure's take on concurrency

Agenda

• The problem of concurrency

• Reference types

• Pendings

Page 4: Clojure's take on concurrency

Why is concurrency a problem?

Page 5: Clojure's take on concurrency

Mutability

Page 6: Clojure's take on concurrency

What does it mean to mutate?

• What x = x+1 actually is:
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X

Page 7: Clojure's take on concurrency

What does it mean to mutate?

• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X

Page 8: Clojure's take on concurrency

What will happen?

Page 9: Clojure's take on concurrency

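What does it mean to mutate?

• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X

Both threads load X before either one stores, and Thread 1 stores last, so Thread 2’s update is lost.

x is increased by 1 !!!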

Page 10: Clojure's take on concurrency

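What does it mean to mutate?

• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X

Both threads load X before either one stores, and Thread 2 stores last, so Thread 1’s update is lost.

x is increased by 5 !!!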

Page 11: Clojure's take on concurrency

What does it mean to mutate?

• Thread 1: x = x+1
  – LOAD R10 X
  – ADDI R10 1
  – STORE R10 X
• Thread 2: x = x+5
  – LOAD R10 X
  – ADDI R10 5
  – STORE R10 X

One thread completes all three steps before the other one starts.

x is increased by 6 (the correct result)
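The three outcomes can be replayed deterministically. The following is only a sketch: it models the shared X and one register per thread as plain data and applies a fixed schedule of LOAD / ADD / STORE steps. The names `step`, `run-schedule` and the three schedules are hypothetical and exist only for this illustration.

```clojure
;; A toy model of the slides' machine: shared :x plus one register per thread.
(defn step
  "Applies one instruction [thread op arg] to the machine state."
  [state [thread op arg]]
  (case op
    :load  (assoc-in state [:regs thread] (:x state))      ; LOAD R10 X
    :add   (update-in state [:regs thread] + arg)          ; ADDI R10 n
    :store (assoc state :x (get-in state [:regs thread])))) ; STORE R10 X

(defn run-schedule
  "Runs the instructions in the given order and returns the final x."
  [x0 schedule]
  (:x (reduce step {:x x0 :regs {}} schedule)))

;; Thread 1 stores last: the +5 is lost.
(def thread-2-lost
  [[:t1 :load] [:t2 :load] [:t2 :add 5] [:t1 :add 1] [:t2 :store] [:t1 :store]])

;; Thread 2 stores last: the +1 is lost.
(def thread-1-lost
  [[:t1 :load] [:t2 :load] [:t1 :add 1] [:t1 :store] [:t2 :add 5] [:t2 :store]])

;; No interleaving: both updates survive.
(def serial
  [[:t1 :load] [:t1 :add 1] [:t1 :store] [:t2 :load] [:t2 :add 5] [:t2 :store]])

(run-schedule 10 thread-2-lost) ;=> 11
(run-schedule 10 thread-1-lost) ;=> 15
(run-schedule 10 serial)        ;=> 16
```

Real threads pick one of these schedules (or another one) non-deterministically, which is what makes the bug hard to reproduce.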

Page 12: Clojure's take on concurrency

Getting to the right result

• The first two cases are a race condition
  – Threads racing to write to the same place in memory
• Can be prevented with a critical section

Page 13: Clojure's take on concurrency

Critical section

• A marker that does not allow a thread to enter a code segment as long as another thread is there

Page 14: Clojure's take on concurrency

Critical section

• It is up to the developer to define it
  – Using locks
• Need to acquire the critical section’s lock before entering it
• Need to release the critical section’s lock after finishing with it
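As a rough illustration of a lock-based critical section on the JVM, the sketch below uses Clojure's `locking` macro, which acquires an object's monitor on entry and releases it on exit. The mutable cell `x`, the lock object `x-lock` and the helper names are assumptions made for this example; a `volatile!` is used only because it is a plain mutable box with no concurrency semantics of its own.

```clojure
(def x (volatile! 0))        ; plain mutable cell, unsafe without a lock
(def x-lock (Object.))       ; the lock that guards every access to x

(defn add-to-x!
  "Critical section: the read-modify-write of x happens under x-lock."
  [n]
  (locking x-lock
    (vswap! x + n)))

(defn run-both-threads
  "Thread 1 adds 1 a thousand times, thread 2 adds 5 a thousand times."
  []
  (locking x-lock (vreset! x 0))
  (let [t1 (future (dotimes [_ 1000] (add-to-x! 1)))
        t2 (future (dotimes [_ 1000] (add-to-x! 5)))]
    @t1
    @t2
    (locking x-lock @x)))

(run-both-threads) ;=> 6000
```

Note that correctness depends on every piece of code that touches `x` remembering to take `x-lock`; nothing in the language enforces it.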

Page 15: Clojure's take on concurrency

The trouble with locks

• Introduce a trade-off between improving performance and reducing complexity
• More complexity => more bugs
• Concurrency bugs are:
  – Harder to find
  – Harder to replicate
  – Harder to debug
  – Harder to solve

Page 16: Clojure's take on concurrency

The trouble with locks

• To use locks properly we need a complete understanding of everything that happens in the program
  – Rarely possible, and even then only for top individuals
  – Hardly scalable

Page 17: Clojure's take on concurrency

The trouble with locks

• If the entire program is locked, there’s no complexity related to lock management
  – But we suffer from poorer performance due to no concurrency
• If nothing is locked => it is up to Murphy

Page 18: Clojure's take on concurrency

Managing locks

• What to lock

• When to lock
  – What’s the right time for a specific lock
  – What’s the right order for a series of locks
• When to unlock
  – The right time for a specific lock
  – The right order for a series of locks

Page 19: Clojure's take on concurrency

What to lock?

• Pessimistic approach – any accessed value, both read and write

• Optimistic approach – any value we try to write to
  – What happens if a read value is used in future writes?
    • We cannot trust writes that are based on an unlocked read

Page 20: Clojure's take on concurrency

When to lock?

• Grab the lock as soon as possible
  – Prevents others from taking it
• Postpone the locking as much as possible
  – Less effect on the rest of the threads

Page 21: Clojure's take on concurrency

When to unlock?

• The first release defines the end of the critical section

• Release lock(x) after writing to X

• Release lock(x,y,z) after the writes to x, y, z

Page 22: Clojure's take on concurrency

Grabbing several locks

• In what order?
  – Ordered vs. unordered
• What to do if we can’t grab them all
  – Keep what we have and retry
  – Release what we have and restart

Page 23: Clojure's take on concurrency

Unordered + keeping the locks

Thread 1:
• Needs locks A and B
• (grab A)
• Waits until B is unlocked

Thread 2:
• Needs locks A and B
• (grab B)
• Waits until A is unlocked

Deadlock!!!

Page 24: Clojure's take on concurrency

Unordered + release the locks

Thread 1:
• Needs locks A and B
• (grab A)
• (can’t get B)
• (free A)

Thread 2:
• Needs locks A and B
• (grab B)
• (can’t get A)
• (free B)

Livelock!!!

Page 25: Clojure's take on concurrency

Ordered

• Need to decide on strict order

• Need to enforce it throughout the software

• Need to enforce it on components that interact with the software

• Need to adapt to the order that was used in other components

• Need to update all of the places when there’s a change that affects the order
  – e.g., in case of refactoring
    • Both code structure and elements’ names
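One possible way to enforce a strict order, sketched here under the assumption that every lock carries an explicit numeric rank: whatever order the caller lists the locks in, they are always acquired in ascending rank. `make-lock`, `with-locks` and the `:rank` / `:monitor` keys are hypothetical names, not part of Clojure.

```clojure
(defn make-lock
  "A lock is a monitor object plus its position in the global order."
  [rank]
  {:rank rank :monitor (Object.)})

(defn with-locks
  "Acquires all the given locks in ascending :rank order, then calls f."
  [locks f]
  (if-let [[lock & more] (seq (sort-by :rank locks))]
    (locking (:monitor lock)
      (with-locks more f))
    (f)))

(def lock-a (make-lock 1))
(def lock-b (make-lock 2))
(def events (atom []))

;; The two threads ask for the locks in opposite orders, as in the deadlock
;; slide, but both end up grabbing A before B.
(let [t1 (future (with-locks [lock-a lock-b] #(swap! events conj :t1)))
      t2 (future (with-locks [lock-b lock-a] #(swap! events conj :t2)))]
  @t1
  @t2
  (set @events)) ;=> #{:t1 :t2}
```

The cost listed in the bullets still applies: every component that takes these locks has to go through the same ordering helper, and the ranks must be kept consistent as the code changes.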

Page 26: Clojure's take on concurrency

Who grabs the lock

• Need to prioritize the locking order
  – Need to update the priority based on the application’s state
• Otherwise we may cause starvation
  – Thread A waits for a lock on X, while other threads keep grabbing that lock before thread A succeeds

Page 27: Clojure's take on concurrency

Debugging concurrent software may introduce heisenbugs

Page 28: Clojure's take on concurrency

Writing correct concurrent software is very complicated

Complexity causes bugs

Known unknowns

Page 29: Clojure's take on concurrency

Writing correct concurrent software is always harder than you think

The delta between how hard it is and how hard you think it is turns into bugs that are almost impossible to solve

Unknown unknowns

Page 30: Clojure's take on concurrency

Why does it happen

• Locks have the same level of abstraction that types have in assembly
  – They don’t have any
• Types are used to allow correct interpretation of areas in memory
  – Semantic aspect of the software
• Locks are used to allow correct access to areas in memory
  – Syntactic aspect of the software
• Lower-level constructs mixed with a higher-level language

Page 31: Clojure's take on concurrency

What’s the solution

• Types allow defining a semantic interpretation of memory areas
  – Each access to a memory area has to pass through the type information
• Need to find a mechanism that would define concurrency semantics for areas in memory
  – So each access to the memory area would pass through the concurrency semantics information

Page 32: Clojure's take on concurrency

What’s the solution

• Add another level of indirection

• Manage changes based on concurrency semantics

• Reference types

Page 33: Clojure's take on concurrency

Reference types

[Diagram: a symbol reaches the element through its concurrency semantics, then the type info, then the memory]

Page 34: Clojure's take on concurrency

(as opposed to)

[Diagram: symbol → type info → memory]

Page 35: Clojure's take on concurrency

What happens when changing?

[Diagram: symbol → type info → memory]

Page 36: Clojure's take on concurrency

What happens when changing?

[Diagram: the symbol, through its concurrency semantics, now leads to new type info and other memory; the previous type info and memory area may be reclaimed by the GC]

Page 37: Clojure's take on concurrency

Clojure epochal model

[Diagram: a symbol that has concurrency semantics moves through State 1 → State 2 → State 3; each transition is the application of a function]

Page 38: Clojure's take on concurrency

State: the value of an identity at a given time

State can be changed by applying a function to an identity
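A small sketch of the identity / state distinction, using an atom as the identity (any reference type would do). The names `account`, `state-1` and `state-2` are made up for the example.

```clojure
;; The identity: one stable reference whose state changes over time.
(def account (atom {:balance 100}))

;; A state: the immutable value the identity holds right now.
(def state-1 @account)

;; Moving to the next state by applying a function to the identity.
(swap! account update :balance + 50)

(def state-2 @account)

state-1 ;=> {:balance 100}   the old state is untouched
state-2 ;=> {:balance 150}
```

Anyone still holding `state-1` keeps a consistent snapshot; only the identity has moved on.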

Page 39: Clojure's take on concurrency

Reference types

• Providing concurrency semantics as part of the language
  – The developer needs to decide what the right concurrency semantics of the element are
    • Just like deciding what the type of the element is
• When combined with immutability, this almost eliminates the risk caused by concurrency

Page 40: Clojure's take on concurrency

Declaring the semantics

as opposed to

implementing it (using locks)

Page 41: Clojure's take on concurrency

Concurrency semantics

• The change is to be performed on:
  – The current thread (synchronous)
  – Another thread (asynchronous)
• A change in the element’s state can be:
  – Visible to other threads (shared)
  – Not visible to other threads (isolated)
• A change in the element’s state can be:
  – Coordinated with changes to other elements
  – Not coordinated with changes to other elements

Page 42: Clojure's take on concurrency

Concurrency semantics

Isolated | Coordinated | Synchronous

No meaning

Page 43: Clojure's take on concurrency

Concurrency semantics

Isolated | Coordinated | Synchronous

var

Page 44: Clojure's take on concurrency

Concurrency semantics

Isolated | Coordinated | Synchronous

ref

Page 45: Clojure's take on concurrency

Concurrency semantics

Isolated | Coordinated | Synchronous

atom

Page 46: Clojure's take on concurrency

Concurrency semantics

Isolated | Coordinated | Synchronous

agent

Page 47: Clojure's take on concurrency

Agent

• A value that can be shared between threads

• The change is not coordinated with other elements

• Execution is performed in an asynchronous manner
  – By a different thread

Page 48: Clojure's take on concurrency

Agent

• Creation:
  – (agent <value>)
  – (def a (agent <value>))
• Reading
  – (deref <the-agent>)
  – @<the-agent>

Page 49: Clojure's take on concurrency

Agent - activation

• Activation:
  – (send a-name func args)
    • To be executed on a predefined, fixed-size thread pool
  – (send-off a-name func args)
    • For blocking / heavy functions – uses a separate, growable thread pool, so it may start a new thread
• send and send-off return immediately
  – The return value is the agent
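A minimal sketch of creating an agent and activating it with both `send` and `send-off`. The agent name `visits` and the action functions are hypothetical; `await` is used only so that the final read is deterministic.

```clojure
(def visits (agent 0))

;; Both calls return the agent itself right away; the work happens elsewhere.
(def returned (send visits + 10))            ; CPU-bound action: (+ 0 10)

(send-off visits (fn [n]                     ; potentially blocking action
                   (Thread/sleep 50)
                   (inc n)))

(await visits)                               ; block until both actions ran

(identical? returned visits) ;=> true
@visits                      ;=> 11
```

Actions sent to one agent from the same thread run one at a time, in the order they were sent, which is why the result here is 11 rather than a race between the two actions.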

Page 50: Clojure's take on concurrency

Agent - activation

• Agents are aware of transactions

• An agent can be activated within a transaction
  – send or send-off within dosync
  – The actions are held until the transaction succeeds, and only then dispatched
    • To prevent multiple executions due to retries
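The interaction with transactions might look like the following sketch, in which a ref is updated and an agent is asked to record the change from inside the same `dosync`. The names `balance`, `audit-log` and `withdraw!` are assumptions for this example only.

```clojure
(def balance   (ref 100))
(def audit-log (agent []))

(defn withdraw!
  "Updates the ref and queues a log entry; the send is dispatched only
   if and when the transaction commits, so retries do not duplicate it."
  [amount]
  (dosync
    (alter balance - amount)
    (send audit-log conj [:withdraw amount])))

(withdraw! 30)
(await audit-log)

@balance   ;=> 70
@audit-log ;=> [[:withdraw 30]]
```

This makes an agent a convenient place for the side effects (logging, I/O) that must not live in the transaction body itself.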

Page 51: Clojure's take on concurrency

Agent - waiting

• Agents run in an asynchronous fashion
  – We may reach a point in the program where we need their updated value
• We need to wait for the actions to complete
  – (await a+)
    • Though it may block forever
    • Returns nil
  – (await-for millis a+)
    • Waits for at most the given number of milliseconds
    • Returns a logical false value if it returned because of the timeout, logical true otherwise
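The difference between the two waiting functions can be sketched as follows. The function `demo-await`, the agent `slow` and the chosen durations (a 200 ms action against a 10 ms timeout) are illustrative assumptions.

```clojure
(defn demo-await
  "Sends a slow action, waits too briefly with await-for, then waits fully."
  []
  (let [slow (agent 0)]
    (send-off slow (fn [n] (Thread/sleep 200) (inc n)))
    (let [finished-in-time? (await-for 10 slow)] ; gives up after 10 ms
      (await slow)                               ; blocks until the action is done
      {:finished-in-time? (boolean finished-in-time?)
       :value             @slow})))

(demo-await) ;=> {:finished-in-time? false, :value 1}
```

`await-for` signals a timeout with a logical false return value, so its result should be checked before the agent's value is trusted.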

Page 52: Clojure's take on concurrency

Error handling

• Agents are executed in a different thread than the one that created them

• In case of an error, the agent enters a failed state
• Any further send throws an exception until the agent is restarted
• Can be restarted by
  – (restart-agent <the-agent> new-state)

Page 53: Clojure's take on concurrency

Error handling

• It is possible to set an error handling function for an agent
• The function is activated in case of an error
• (set-error-handler! <the-agent> <er-fn>)
  – The error handling function receives two arguments
    • The agent
    • The exception
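The failure, handler and restart cycle could be exercised roughly like this. It is a sketch: `wait-for-failure` and `demo-agent-errors` are hypothetical helpers, and the polling loop is there only because `await` is not a reliable way to wait for an agent that is about to fail. The agent is created without options, so it uses the default mode in which an error leaves it failed.

```clojure
(defn wait-for-failure
  "Polls (for about a second at most) until the agent reports an error."
  [a]
  (loop [tries 100]
    (when (and (pos? tries) (nil? (agent-error a)))
      (Thread/sleep 10)
      (recur (dec tries)))))

(defn demo-agent-errors []
  (let [seen    (promise)
        failing (agent 10)]
    ;; The handler receives the agent and the exception.
    (set-error-handler! failing (fn [the-agent ex] (deliver seen ex)))
    (send failing / 0)                        ; (/ 10 0) throws in the action
    (wait-for-failure failing)
    (let [rejected? (try (send failing inc) false
                         (catch Exception _ true))] ; failed agents refuse sends
      (restart-agent failing 10)              ; back in business with state 10
      (send failing inc)
      (await failing)
      {:handler-saw (class (deref seen 1000 nil))
       :rejected?   rejected?
       :value       @failing})))

(demo-agent-errors)
;=> {:handler-saw java.lang.ArithmeticException, :rejected? true, :value 11}
```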

Page 54: Clojure's take on concurrency

Var

• A var’s value is visible in all threads

• We can change its value, but the change is visible only in the changing thread
• Use ‘def’ to create a var
• (var <the-var-name>) returns the var
  – Or use the reader macro #'<the-var-name>
    • #'a ;=> #'theNS/a

Page 55: Clojure's take on concurrency

Var

• (def ^:dynamic a 8) to create a var that is re-bindable
  – The ^:dynamic metadata goes on the var’s name, not on the value
• To rebind a var
  – The common way:
    • (binding [binding-pairs] <expression>)
  – Use set! within binding to re-bind the var to a new value
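A short sketch of rebinding a dynamic var. Note that the `^:dynamic` metadata is attached to the var's name in the `def`. The var `*level*` and the helper functions are hypothetical names chosen for the example.

```clojure
(def ^:dynamic *level* 8)          ; root value, seen by every thread

(defn read-level [] *level*)

(defn demo-binding
  "Returns the value seen after binding, after set!, and after leaving binding."
  []
  (let [inside (binding [*level* 9]
                 (let [after-binding (read-level)]
                   (set! *level* 10)          ; allowed only on a thread-bound var
                   [after-binding (read-level)]))]
    (conj inside (read-level))))

(demo-binding) ;=> [9 10 8]
```

The root value 8 is back once the `binding` form exits, and it was never changed for any other thread.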

Page 56: Clojure's take on concurrency

Var

• The much less used ways to rebind a var
  – (with-bindings {binding-map} <expression>)
    • The binding map pairs each var with its new value: var => newVal
    • That’s where the reader macro #' comes in handy
  – (with-bindings* {binding-map} f <args…>)
    • The function form: calls f with the bindings in place
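In clojure.core these two forms are named `with-bindings` (a macro that takes a body) and `with-bindings*` (a function that takes a function to call); both accept a map from var objects to values. A sketch with a hypothetical dynamic var `*mode*`:

```clojure
(def ^:dynamic *mode* :prod)

(defn current-mode [] *mode*)

;; Macro form: the map's keys are vars, hence the #' reader macro.
(with-bindings {#'*mode* :test}
  (current-mode))                                ;=> :test

;; Function form: the bindings are in place while current-mode is called.
(with-bindings* {#'*mode* :debug} current-mode)  ;=> :debug

(current-mode)                                   ;=> :prod
```

Because the bindings are an ordinary map, these forms are mainly useful when the set of vars to rebind is computed at run time.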

Page 57: Clojure's take on concurrency

Var

• It is also possible to change the root value of a var
  – The root value is the value exposed to all the threads
• (alter-var-root the-var f <args…>)
  – Note that the var’s value is the first argument to f
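For illustration, a root-value change on a hypothetical var named `retries`; the current root value is passed as the first argument to the function, followed by the extra arguments.

```clojure
(def retries 3)

;; Calls (+ 3 2) and installs the result as the new root value.
(alter-var-root #'retries + 2)

retries           ;=> 5
@(future retries) ;=> 5, the new root value is what other threads see too
```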

Page 58: Clojure's take on concurrency

Atom

• An atom’s value is shared between threads

• A change in an atom’s value is shared between threads

• The change is not coordinated with other Atoms

• The change is atomic – a single point in time

• Execution is synchronous

Page 59: Clojure's take on concurrency

Atom

• Creation
  – (atom <value>)
  – (def a (atom <value>))
• Reading an atom’s value
  – (deref <the-atom>)
  – @<the-atom>

Page 60: Clojure's take on concurrency

Atom

• (swap! atm func args)
  – The first argument of func is the pre-change value of the atom
    • A new value is created based on the function
• (reset! atm val)
  – Change the atom’s value to val
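A sketch of the same increment scenario from the beginning of the talk, this time with an atom: ten threads each apply `inc` a hundred times and no update is lost. The names `counter` and `count-concurrently` are hypothetical.

```clojure
(def counter (atom 0))

(defn count-concurrently
  "Resets the counter, then lets 10 threads each increment it 100 times."
  []
  (reset! counter 0)
  (let [workers (doall (repeatedly 10
                         #(future (dotimes [_ 100] (swap! counter inc)))))]
    (doseq [w workers] @w)       ; wait for all the threads to finish
    @counter))

(count-concurrently) ;=> 1000
```

There is no lock in the developer's code; the code only supplies the function (`inc`) that derives the new value from the old one.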

Page 61: Clojure's take on concurrency

Ref

• A ref’s value can be shared between threads

• The change can be coordinated with other refs
  – It is always performed within a transaction, which can be executed on several refs

• Execution is synchronous

Page 62: Clojure's take on concurrency

Ref

• Creation:
  – (ref <value>)
  – (def a (ref <value>))
• Reading
  – (deref <the-ref>)
  – @<the-ref>

Page 63: Clojure's take on concurrency

Ref

• The modification of the ref is done using
  – (alter <the-ref> func args)
    • The first argument passed to func is the ref’s current value
  – (ref-set <the-ref> v)
• Using only the above will not work: outside a transaction they throw an IllegalStateException

Page 64: Clojure's take on concurrency

Ref

• Need to execute the commands within a transaction

• Use (dosync <expr…>)
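A sketch of a coordinated change to two refs. The account names and `transfer!` are made up for the example; the last form shows what happens when `alter` is called with no transaction running.

```clojure
(def checking (ref 100))
(def savings  (ref 50))

(defn transfer!
  "Moves amount between two refs; both changes commit together or not at all."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

[@checking @savings] ;=> [70 80]

;; Outside dosync the same call is rejected.
(try
  (alter checking - 1)
  (catch IllegalStateException _ :no-transaction)) ;=> :no-transaction
```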

Page 65: Clojure's take on concurrency

Transaction

• Transactions maintain the ACID properties:
  – Atomic
    • The change happens at a single point in time, for all the participating values
      – Or it fails entirely
  – Consistent
    • At any given point the consistency rules are valid
      – It is possible to add such rules
  – Isolated
    • Any change done within a transaction is not visible to an outside viewer during the execution of the transaction
      – No side effects
  – Durable
    • Once the transaction succeeds, its effects are not susceptible to system failures

Page 66: Clojure's take on concurrency

Transactions and side-effects

• Transactions may be retried

• Do not perform side effects in the function passed to alter / swap!
  – Any I/O, DB call …

Page 67: Clojure's take on concurrency

Software Transactional Memory (STM)

• Clojure uses an STM to update refs
  – Atoms are updated outside the STM, with a compare-and-set retry loop
• STM maintains the ACI properties
  – As it runs in memory – no writing to a disk
• Clojure’s STM uses the MVCC algorithm
  – Multi-version concurrency control
  – Used within commercial DBs, such as Oracle’s

Page 68: Clojure's take on concurrency

How the update works

• No assignment in the developer’s code

• The developer provides a function
  – How to create a new value based on the old value
• The update is managed by the system
  – There are locks behind the scenes
• The update functionality is just one of the things that can be provided by the developer
  – More things can be added

Page 69: Clojure's take on concurrency

validation

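• It is possible to provide a validator for a ref / atom / var / agent
  – At creation, for a ref / atom / agent: (<elem> initial-val :validator fn)
  – At any time, including for a var: (set-validator! <elem> fn)
• The validation function accepts one argument, which is the new value
  – Returns either true or false
• If the validation function returns false or throws, the change is rejected with an IllegalStateException
  – No retry
  – For a ref, this means the transaction fails
  – Note that an atom’s update is not done within an STM transaction; the swap! / reset! call itself throws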
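Both ways of attaching a validator can be sketched as below: the `:validator` option at creation and `set-validator!` afterwards. The names `balance` and `stock`, and the non-negative rule, are assumptions for the example.

```clojure
;; Validator supplied at creation time.
(def balance (atom 100 :validator #(>= % 0)))

(try
  (swap! balance - 500)                       ; would make the value -400
  (catch IllegalStateException _ :rejected))  ;=> :rejected

@balance ;=> 100, the rejected change left no trace

;; Validator attached to an existing ref.
(def stock (ref 5))
(set-validator! stock #(>= % 0))

(try
  (dosync (alter stock - 10))                 ; the transaction fails, no retry
  (catch IllegalStateException _ :rejected))  ;=> :rejected

@stock ;=> 5
```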

Page 70: Clojure's take on concurrency

Observing changes

• It is possible to add a function that would be invoked upon a change in an element
  – Var / Atom / Ref / Agent
• (add-watch <elem> <key> <watch-fn>)
  – <elem>: the var / atom / ref / agent
  – <key>: a unique identifier of the watch-fn
  – <watch-fn>: a function that accepts 4 arguments
    • <key> - the key used when the fn was attached to the elem
    • <elem> - the changed element
    • <old-val> - the old value of the element
    • <new-val> - the new value of the element

Page 71: Clojure's take on concurrency

Observing changes

• Within the watch function:
  – Do not deref the element to get its value
    • It may be different from both the old and new value
  – Ignore the key
• Use the key when removing the watch
  – (remove-watch <elem> <key>)
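A sketch of attaching and removing a watch on an atom. The watch function uses only the old and new values it is handed, as recommended above. The names `temperature`, `changes` and the key `:recorder` are hypothetical.

```clojure
(def temperature (atom 20))
(def changes     (atom []))

(add-watch temperature :recorder
  (fn [key elem old-val new-val]
    ;; Rely on old-val / new-val; do not deref elem here.
    (swap! changes conj [old-val new-val])))

(swap! temperature + 5)                 ; observed by the watch

(remove-watch temperature :recorder)

(swap! temperature + 5)                 ; no longer observed

@changes     ;=> [[20 25]]
@temperature ;=> 30
```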

Page 72: Clojure's take on concurrency

Pendings

Page 73: Clojure's take on concurrency

What are pendings

• A result of a calculation

• To be used later

• Who provides the calculation

• When to start it

Page 74: Clojure's take on concurrency

What are pendings

• A box that contains a result of a computation
• Future
  – The computation is defined upon initialization
  – Starts when the future is defined
• Delay
  – The computation is defined upon initialization
  – Starts when somebody asks for the result of the computation
• Promise
  – The computation is NOT defined upon initialization
  – It is up to someone who can access the promise to provide it

Page 76: Clojure's take on concurrency

Future / delay

• Future: an asynchronous computation; delay: a computation deferred until it is first read
• Creation:
  – (future <form>)
  – (def ftr (future <form>))
  – (delay <form>)
• Reading
  – (deref ftr)
  – @ftr
• Reading a future / delay blocks until the result is available
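The difference in start time between a future and a delay can be observed with a sketch like this one. `demo-future-and-delay` and the `started` log are hypothetical; the 50 ms sleep merely stands in for a long computation.

```clojure
(defn demo-future-and-delay []
  (let [started (atom [])
        ;; Starts running on another thread as soon as it is created.
        ftr (future (swap! started conj :future)
                    (Thread/sleep 50)
                    :future-result)
        ;; Does not run until someone derefs it.
        dly (delay (swap! started conj :delay)
                   :delay-result)
        delay-ran-early? (realized? dly)
        future-result    @ftr          ; blocks until the future completes
        delay-results    [@dly @dly]]  ; first deref runs the body, second reuses it
    {:future           future-result
     :delay-ran-early? delay-ran-early?
     :delay            delay-results
     :started          @started}))

(demo-future-and-delay)
;=> {:future :future-result, :delay-ran-early? false,
;    :delay [:delay-result :delay-result], :started [:future :delay]}
```

The delay's body appears only once in `:started`, which shows that its result is cached after the first read.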

Page 77: Clojure's take on concurrency

Future / delay

• When to use
  – For starting long computations whose results will be needed later
    • DB call
    • Service over HTTP
    • …

Page 78: Clojure's take on concurrency

Promise

• A promise is a “box” that holds a data element
  – Not a computation
• The “box” can be filled once, and then its value can be read
  – Subsequent attempts to “fill” the box fail silently

Page 79: Clojure's take on concurrency

Promise

• Creation
  – (promise)
  – (def p (promise))
• Reading
  – (deref p)
  – @p
• Setting the value
  – (deliver p <the-val>)
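A sketch of one thread filling a promise while another waits on it. `demo-promise` is a hypothetical helper; the three-argument `deref` (timeout in milliseconds plus a fallback value) is shown as a way to avoid blocking forever on a promise that nobody delivers.

```clojure
(defn demo-promise []
  (let [p              (promise)
        producer       (future (Thread/sleep 50)
                               (deliver p 42))   ; fills the box from another thread
        first-read     @p                        ; blocks until the value is delivered
        second-deliver (deliver p 99)]           ; the box is already full
    @producer
    {:first-read     first-read
     :second-deliver second-deliver
     :final          @p
     :unfilled       (deref (promise) 10 :timeout)}))

(demo-promise)
;=> {:first-read 42, :second-deliver nil, :final 42, :unfilled :timeout}
```

The second `deliver` returns nil and changes nothing, which is the silent failure mentioned on the previous slide.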

Page 80: Clojure's take on concurrency

That’s all for today