Syncing in JavaScript: Multi-Client Collaboration Through Data Sharing (Steve Marx)
DESCRIPTION
Presentation taken from Future of Web Apps Boston (http://futureofwebapps.com/boston-2014). In this talk, Steve will build a system from scratch for cross-device data synchronization in JavaScript. Through demos, he will explore all the things you're probably not thinking about when rolling your own sync engine, like offline caching, change notification, and conflict resolution. Drawing on his experience from Dropbox, Steve will discuss the thorny challenges around sync and how to solve them.
TRANSCRIPT
Steve Marx
Developer advocate
Dropbox
“Data synchronization is the process of
establishing consistency among data from
a source to a target data storage and vice
versa and the continuous harmonization
of the data over time.”
– Wikipedia
Cloud storage
DEMO: Let’s build cloud storage.
if request.method == 'PUT' then
  storage.data = request.body
end
return storage.data
Simple cloud storage
Concurrency
Concurrency anti-pattern
This algorithm doesn’t work:
1. GET latest from the server
2. PUT update
What if things change between 1 and 2?
if (file.exists) {
  file.open();
}
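The lost update behind this anti-pattern can be reproduced without a network. The sketch below is illustrative only: createStore is a hypothetical in-memory stand-in for the server, and the two clients are simulated by interleaving their GET and PUT steps by hand.

```javascript
// Hypothetical in-memory stand-in for the server: unconditional GET and PUT.
function createStore(initial) {
  let data = initial;
  return {
    get: () => data,
    put: (value) => { data = value; },
  };
}

// Two clients each run "GET latest, then PUT update", interleaved.
function lostUpdateDemo() {
  const store = createStore(0);
  const seenByA = store.get(); // step 1, client A
  const seenByB = store.get(); // step 1, client B (before A writes)
  store.put(seenByA + 1);      // step 2, client A
  store.put(seenByB + 1);      // step 2, client B overwrites A's update
  return store.get();
}

console.log(lostUpdateDemo()); // prints 1: two increments attempted, one lost
```

Both clients believed they incremented the value, but the server ends up at 1 instead of 2.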
Optimistic concurrency
• Try update “optimistically”
• Conditionally update if nothing changed
• Else change is rejected
• HTTP has ETag and If-Match headers
DEMO: Improved cloud storage, now without data loss!
local hash = crypto.sha256(storage.data or '').hexdigest()
if request.method == 'PUT' then
  if request.headers['If-Match'] == hash then
    storage.data = request.body or ''
    return 200
  else
    return 409
  end
end
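For readers following along in JavaScript, the same check can be sketched in Node.js. This is a minimal in-memory model, not the talk's server: createVersionedStore and conditionalPut are hypothetical names, and returning the current body and ETag on rejection is an assumption made so a client can retry.

```javascript
const crypto = require('crypto');

// ETag derived from the content, mirroring the SHA-256 hash in the script above.
function etagOf(data) {
  return crypto.createHash('sha256').update(data).digest('hex');
}

// Hypothetical in-memory store; conditionalPut is an illustrative name.
function createVersionedStore(initial = '') {
  let data = initial;
  return {
    get() {
      return { status: 200, body: data, etag: etagOf(data) };
    },
    conditionalPut(body, ifMatch) {
      if (ifMatch !== etagOf(data)) {
        // Someone else wrote first: reject and report the current state.
        return { status: 409, body: data, etag: etagOf(data) };
      }
      data = body;
      return { status: 200, body: data, etag: etagOf(data) };
    },
  };
}

const store = createVersionedStore('a');
const first = store.get();
console.log(store.conditionalPut('b', first.etag).status); // 200
console.log(store.conditionalPut('c', first.etag).status); // 409, etag is stale
```

The second write fails because its ETag still describes 'a', while the store already holds 'b'.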
Conflict resolution
Transactions
Algorithm for applying an operation:
1. Read current value
2. Apply operation to value
3. Send to server
4. If rejected, GOTO 1
function increment() {
  superagent.put('https://sync.webscript.io')
    .set('If-Match', etag).send(String(counter + 1))
    .end(function (err, res) {
      // Rejected: adopt the server's value and ETag, then retry.
      if (res && res.status !== 200) {
        counter = parseInt(res.text, 10);
        etag = res.headers.etag;
        increment();
      }
    });
}
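The four-step loop can also be written as a generic helper. The sketch below assumes a hypothetical in-memory store that uses a version counter in place of an ETag; transact and maxAttempts are illustrative names, not part of any library mentioned in the talk.

```javascript
// Hypothetical in-memory store using a version counter as the ETag.
function createStore(initial) {
  let value = initial;
  let version = 0;
  return {
    read: () => ({ value, version }),
    // Returns true if the write was accepted, false if `expected` is stale.
    write(next, expected) {
      if (expected !== version) return false;
      value = next;
      version += 1;
      return true;
    },
  };
}

// Read, apply, send; if rejected, go back to step 1.
// Returns the number of attempts it took to commit.
function transact(store, operation, maxAttempts = 10) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const { value, version } = store.read();
    if (store.write(operation(value), version)) return attempt;
  }
  throw new Error('transaction did not commit');
}

const counter = createStore(0);
transact(counter, (n) => n + 1);
transact(counter, (n) => n + 1);
console.log(counter.read().value); // 2
```

Because the operation is re-applied to the freshly read value on every retry, it should be a pure function of that value.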
Transaction downsides
• Commits end up serialized
• Transactions lost on shutdown?
• Firebase and Dropbox Datastores (Python)
Working offline
Diff and merge
• Keep track of a common ancestor
• 3-way merge, git-like:
– Diff server data against ancestor
– Apply patch to client data
– (or something equivalent)
Merging counts
Ancestor: 3
Server: 4
Client: 5
Merged: 3 + (4-3) + (5-3) = 6
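The arithmetic above amounts to applying both sides' deltas to the common ancestor. A minimal sketch, with mergeCounts as a hypothetical helper name:

```javascript
// Three-way merge for a counter: apply the server's and the client's
// changes (each measured against the common ancestor) to the ancestor.
function mergeCounts(ancestor, server, client) {
  return ancestor + (server - ancestor) + (client - ancestor);
}

console.log(mergeCounts(3, 4, 5)); // 6
```

After a successful sync, the merged value would presumably serve as the ancestor for the next merge.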
DEMO: Merge all the things.
Merging downsides
• Requires tracking common ancestor
• Sometimes ambiguous
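Counters merge cleanly because their changes add up; many values do not. The sketch below shows one way the ambiguity can surface, using a hypothetical mergeValue helper that does a three-way merge on a single opaque value and reports a conflict when both sides changed it differently.

```javascript
// Three-way merge for a single opaque value (for example, a title string).
function mergeValue(ancestor, server, client) {
  if (server === client) return { value: server, conflict: false };   // same result on both sides
  if (server === ancestor) return { value: client, conflict: false }; // only the client changed it
  if (client === ancestor) return { value: server, conflict: false }; // only the server changed it
  // Both sides changed it in different ways: no obviously right answer.
  return { value: undefined, conflict: true };
}

console.log(mergeValue('Draft', 'Draft', 'Draft v2').value);    // Draft v2
console.log(mergeValue('Draft', 'Final', 'Draft v2').conflict); // true
```

What to do in the conflict case (pick a winner, keep both, ask the user) is an application-level decision.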
Summary
• Syncing is hard, so think about:
– Dealing with concurrency
– Working offline
– Resolving conflicts in a consistent way
• Or use someone else’s platform (Dropbox!)
Image credits
https://www.flickr.com/photos/kalyan02/5458325252
https://www.flickr.com/photos/gara/3560355335
https://www.flickr.com/photos/somoslarevolucion/3609196127
http://geekandpoke.typepad.com/geekandpoke/2012/09/simply-explained-concurrency.html
https://www.flickr.com/photos/onesevenone/6361400909
Thanks!
@smarx