USENIX LISA San Diego, CA December 13, 2012

Dude, Where's My Data?

Jeff Darcy
gluster.org


The Problem

● Compute cycles are everywhere

● Your data isn't

● It's easy to move computation, then find that there's no data for it to work on

OOPS


Big Data: V3

[Diagram labels: Variety, Velocity, Volume]


Volume, Velocity, Variety

● Volume = total data (TB)
    ● affects initial setup (or full-resync) time
    ● bandwidth problem
● Velocity = rate of change (TB/hour, files/hour)
    ● affects ongoing bandwidth need
    ● bandwidth and latency problem
● Variety = data "shape"
    ● file- and directory-size distribution, sparseness, extended attributes, even contents


Bigger Data: D3

[Diagram labels: Divergence, Domains, Distance]


Distance, Domains, Divergence

● Distance: how far?
    ● across the river vs. across the world
● Domains: how many?
    ● two sites vs. four sites vs. hundreds of sites
    ● also separate security perimeters and policies
● Divergence: how similar?
    ● sync vs. async, ordered vs. unordered, conflict resolution ...


Example: rsync

● Variety affects scanning rate
    ● delta comparison favors large files
● Sensitive to distance
    ● still need network round trips to compare checksums
● Hard to manage with many domains
    ● set up each connection separately, including parallel connections
● High divergence - scanning order, conflicts
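
To make the checksum comparison concrete, here is a minimal sketch of block-level delta detection. It is a simplification and not rsync's actual algorithm: it assumes fixed block boundaries and a single strong hash, whereas rsync also uses a rolling weak checksum so it can match blocks at any offset. The names `block_sums` and `changed_blocks` and the tiny block size are illustrative assumptions.

```python
import hashlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def block_sums(data: bytes, block: int = BLOCK) -> list[str]:
    """Checksums the receiver computes for its copy and sends over (one round trip)."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]


def changed_blocks(new: bytes, old_sums: list[str], block: int = BLOCK) -> list[int]:
    """Indexes of the blocks the sender still has to transmit."""
    new_sums = block_sums(new, block)
    return [i for i, s in enumerate(new_sums)
            if i >= len(old_sums) or s != old_sums[i]]


old = b"aaaabbbbccccdddd"
new = b"aaaaBBBBccccddddeeee"
print(changed_blocks(new, block_sums(old)))  # [1, 4]
```

Only two of five blocks travel, but the checksum exchange costs a round trip per file no matter how small the file is, which is one way to see why the comparison pays off mainly on large files.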


Initial Sync

● Transfer large files instead of small ones
    ● 10 MB/s ~= 1 TB/day

    ● 74ms RTT ~= 1M ops/day (fewer files/day)

    ● copy tarballs or disk images, pack/unpack locally
● Transfer in parallel
    ● GridFTP, PFTP, BitTorrent, Murder (@ Twitter)
● Aggressively pre-deploy replicas
    – ...or let a CDN do it for you
● Don't forget compression/deduplication
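
The back-of-envelope numbers on this slide can be checked with a few lines of arithmetic. This is a rough sketch assuming decimal units, a fully sustained transfer rate, and strictly serial operations that each wait one round trip; the function names and the RTT values passed in are illustrative assumptions, not figures from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def tb_per_day(mb_per_s: float) -> float:
    """Data moved per day at a sustained rate, in decimal terabytes."""
    return mb_per_s * SECONDS_PER_DAY / 1_000_000


def ops_per_day(rtt_ms: float) -> float:
    """Serial operations per day when each one waits a full round trip."""
    return SECONDS_PER_DAY / (rtt_ms / 1000)


print(round(tb_per_day(10), 3))  # 0.864, i.e. roughly 1 TB/day
print(round(ops_per_day(86.4)))  # 1000000
print(round(ops_per_day(50)))    # 1728000
```

A file usually costs more than one round trip, so the files/day figure sits below the ops/day figure. That is the argument for packing small files into tarballs or disk images before shipping them.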

OK, Now What?

[Diagram labels: Initial Sync; Instance+Data Lifetime]


Replication Semantics

[Diagram labels: sync, async; ordered, unordered; continuous logging; periodic scanning; latency sensitive]


Replication Topologies

[Diagrams: one two-site topology (Boston, San Diego) and three four-site topologies (Boston, Iowa City, Ann Arbor, San Diego)]

Other Distinctions

● Directionality
    ● static master, floating master, peer to peer
● Migration and caching are replication too
    ● expressed vs. assumed interest
    ● partial
    ● expendable (not dependable)


Replication Lite

● Consider using an overlay/union FS
    ● Unionfs, AUFS, overlay mounts
● Each client has their own overlay on top of same read-only base
● Ship overlay back home to apply (and resolve conflicts?) at leisure
● Free version history
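
The overlay idea can be modeled in a few lines. This is a toy in-memory sketch, not how Unionfs, AUFS, or overlay mounts are implemented: the `OverlayView` class, its method names, and the whiteout marker are hypothetical, and files are plain dictionary entries.

```python
from types import MappingProxyType


class OverlayView:
    """Per-client writable layer over a shared read-only base (hypothetical model)."""

    _DELETED = object()  # whiteout marker: hides a file that exists in the base

    def __init__(self, base):
        self.base = MappingProxyType(base)  # read-only view of the shared base
        self.upper = {}                     # only this client's changes live here

    def read(self, path):
        if path in self.upper:
            value = self.upper[path]
            if value is self._DELETED:
                raise FileNotFoundError(path)
            return value
        if path in self.base:
            return self.base[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self.upper[path] = data

    def delete(self, path):
        self.upper[path] = self._DELETED

    def ship_home(self):
        """Changes to send back home: (path, new_data), with None meaning delete."""
        return [(p, None if v is self._DELETED else v)
                for p, v in sorted(self.upper.items(), key=lambda kv: kv[0])]


base = {"/etc/motd": "hello", "/etc/hosts": "127.0.0.1 localhost"}
client = OverlayView(base)
client.write("/etc/motd", "hi from client")
client.delete("/etc/hosts")
print(client.read("/etc/motd"))  # hi from client
print(client.ship_home())        # [('/etc/hosts', None), ('/etc/motd', 'hi from client')]
```

The base is never modified, so each shipped overlay is a self-contained delta. Keeping those deltas around is what gives the version history for free.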


Sync vs. Async

● Synchronous replication
    ● divergence very small
        – still possible with errors
    ● performance limited by latency
● Asynchronous replication
    ● divergence can be quite large
        – conflict handling becomes most of the code
    ● performance limited by bandwidth


Scanning vs. Logging

● Scanning negatives
    ● naive versions are slow and resource intensive
    ● even smart versions have high divergence
        – "many siblings" problem
        – often missing info for proper conflict resolution
● Logging negatives
    ● requires local buffer space
        – one more thing to provision/manage (or have fail)
    ● network interruptions still create divergence
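
The buffer-space tradeoff for logging can be sketched as follows. This is a hypothetical model, not any particular product's changelog: the `ChangeLog` class, its capacity, and the choice to fall back to a full rescan after overflow are all assumptions made for illustration.

```python
from collections import deque


class ChangeLog:
    """Ordered change log with a bounded local buffer (hypothetical sketch)."""

    def __init__(self, capacity):
        self.entries = deque()
        self.capacity = capacity
        self.overflowed = False  # once set, the log alone cannot be trusted

    def record(self, op, path):
        if len(self.entries) >= self.capacity:
            self.overflowed = True  # buffer full: this change is not logged
            return
        self.entries.append((op, path))

    def drain(self):
        """Return ('replay', ops) in original order, or ('rescan', []) after overflow."""
        if self.overflowed:
            self.entries.clear()
            self.overflowed = False
            return ("rescan", [])
        ops = list(self.entries)
        self.entries.clear()
        return ("replay", ops)


log = ChangeLog(capacity=2)
log.record("write", "/foo/w")
log.record("unlink", "/bar/y")
print(log.drain())  # ('replay', [('write', '/foo/w'), ('unlink', '/bar/y')])

for i in range(3):  # e.g. changes piling up while the network is down
    log.record("write", f"/tmp/{i}")
print(log.drain())  # ('rescan', [])
```

While the link is up, changes replay in the order they happened. A long enough interruption fills the buffer and pushes the replica back to scanning, with the divergence that implies.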


Improving on rsync (1 of 3)

● Wrap a script around it
    ● connection setup and credential management
    ● parallel streams
    ● continuous iteration
● Optimize scanning
    ● mark changes up toward root
    ● don't scan unchanged subtrees
    ● next slide
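
The marking optimization can be sketched like this. It is a toy model in which the tree is a dictionary of child lists and the marks are a set of paths; the function names are hypothetical, and how a real filesystem records the marks is outside the sketch.

```python
import posixpath


def mark_changed(marked: set[str], path: str) -> None:
    """Mark a changed path and every ancestor up to the root."""
    while True:
        marked.add(path)
        if path == "/":
            break
        path = posixpath.dirname(path)


def scan(tree: dict[str, list[str]], marked: set[str], path: str = "/") -> list[str]:
    """Visit only marked entries; an unmarked entry's whole subtree is skipped."""
    if path not in marked:
        return []
    visited = [path]
    for child in tree.get(path, []):
        visited += scan(tree, marked, child)
    return visited


tree = {
    "/": ["/bar", "/foo"],
    "/bar": ["/bar/y", "/bar/z"],
    "/foo": ["/foo/w", "/foo/x"],
}
marked = set()
mark_changed(marked, "/foo/w")
print(scan(tree, marked))  # ['/', '/foo', '/foo/w']
```

The scan never descends into /bar. It still has to test every child of each marked directory, so a directory with a very large number of entries stays expensive.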


Improving on rsync (2 of 3)

[Diagram: directory tree with nodes /, /foo, /foo/w, /foo/x, /bar, /bar/z, /bar/y; legend: marked, not marked; callouts: "Stop Scanning", "Thousand Siblings Problem"]


Improving rsync (3 of 3)

● So it's better than before
    ● higher scanning rate
    ● more automated
● ...but...
    ● scanning is still inherently inefficient
        – still have to find changes within files and/or transfer more than necessary
    ● divergence is still high
        – changes appear in scanning order, might conflict

By The Way...

● That's pretty much GlusterFS geo-sync, but I'm not here to talk about that.
● Current project: ordered async replication
    ● "pony" replication as in "all that and..."
    ● full duplex, mesh, partition tolerant
    ● vector-clock conflict resolution
    ● maybe I'll be able to talk more about it next year
    ● meanwhile, see Resources (last link)
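
For readers new to vector clocks, here is a minimal sketch of the generic technique. It is not the project's implementation, which the slides do not show: the function names and site names are illustrative, and each clock is a plain dictionary mapping a site to its update counter.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Return 'equal', 'before', 'after', or 'conflict' for clock a relative to b."""
    sites = set(a) | set(b)
    a_behind = any(a.get(s, 0) < b.get(s, 0) for s in sites)
    b_behind = any(b.get(s, 0) < a.get(s, 0) for s in sites)
    if a_behind and b_behind:
        return "conflict"  # concurrent updates: neither version has seen the other
    if a_behind:
        return "before"
    if b_behind:
        return "after"
    return "equal"


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Clock for a version that has incorporated both inputs."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}


boston = {"boston": 2, "sandiego": 1}
sandiego = {"boston": 1, "sandiego": 2}
print(compare({"boston": 1}, {"boston": 2}))  # before
print(compare(boston, sandiego))              # conflict
print(merge(boston, sandiego) == {"boston": 2, "sandiego": 2})  # True
```

The clocks only detect that two versions were written concurrently. Deciding which one wins, or how to combine them, is a separate policy.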


What's Wrong?

● Isn't this all rather . . . manual?
    ● you manage scheduling
    ● you manage parallelism
    ● you manage credentials
    ● you manage conflicts
● Yes it is!
● Let's look at more transparent solutions.

AFS

● The grand-daddy of wide-area distributed filesystems
● Deployed successfully at hundreds of sites, tens of thousands of users
● Only one writable replica, others read-only
● Static file->server assignment
● Notoriously hard to administer and debug
    ● seven types of servers, "unique" communication/locking protocols


AFS Diagram

[Diagram labels: Client; VolumeServer; FileServer (five instances); namespace paths /, /users, /data, /users/staff, /users/students]


Coda

● AFS descendant
● Adds disconnected-client operation
● Adds multi-way write replication between servers
● Conflict resolution is automatic but type-specific
● Shares other drawbacks with AFS
● Not widely deployed


XtreemFS

● European XtreemOS/Contrail projects
● Servers: one DIR, one MRC, multiple OSD
    ● dynamic placement on OSD (better than AFS)
    ● DIR/MRC replication/failover still immature?
● Historically: read-only replication (pull model)
● More recently: read/write replication
    ● floating master, leases
● Snapshots
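
The floating-master idea can be illustrated with a lease that one replica holds for a limited time. This is a single-process toy and not XtreemFS's protocol: the `Lease` class and its methods are hypothetical, time is passed in explicitly, and a real system also needs agreement among replicas and a safety margin for clock skew.

```python
class Lease:
    """Time-limited mastership for one file (hypothetical sketch)."""

    def __init__(self, duration):
        self.duration = duration
        self.holder = None
        self.expires = 0.0

    def acquire(self, node, now):
        """Grant or renew the lease if it is free, expired, or already held by node."""
        if self.holder in (None, node) or now >= self.expires:
            self.holder = node
            self.expires = now + self.duration
            return True
        return False

    def master(self, now):
        """Current master, or None if the lease has lapsed."""
        return self.holder if now < self.expires else None


lease = Lease(duration=10)
print(lease.acquire("A", now=0))   # True: A becomes master
print(lease.acquire("B", now=5))   # False: A's lease is still valid
print(lease.acquire("B", now=10))  # True: the lease expired, mastership floats to B
print(lease.master(now=12))        # B
```

Because the lease expires on its own, a crashed master stops being master without anyone having to reach it.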


XtreemFS Diagram

[Diagram labels: Client (three instances); OSD (three instances); DIR; MRC; Data Path; Control Path]

Other Solutions

● dCache, iRODS: archival orientation, online information almost unreadable
● Sector: paired with Sphere (Hadoop alternative), claims WAN distribution
● DRBD: two-way async block replication
● FS-Cache: client caching add-on to NFS, AFS
● PeerDist/BranchCache: content-addressable caching from SMB/CIFS
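
Content-addressable caching, mentioned in the last bullet, keys cached data by a hash of its content rather than by file name. The sketch below shows only that general idea and is not the PeerDist/BranchCache protocol: the `ContentCache` class, the use of SHA-256, and the `fetch_from_origin` callback are assumptions.

```python
import hashlib


class ContentCache:
    """Local cache keyed by content digest (hypothetical sketch)."""

    def __init__(self):
        self.blocks = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, digest, fetch_from_origin):
        """Return the data for digest, going to the origin only on a miss."""
        if digest in self.blocks:
            self.hits += 1
            return self.blocks[digest]
        self.misses += 1
        data = fetch_from_origin()
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("content does not match digest")
        self.blocks[digest] = data
        return data


payload = b"quarterly report"
digest = hashlib.sha256(payload).hexdigest()
cache = ContentCache()
cache.fetch(digest, lambda: payload)  # first request crosses the WAN
cache.fetch(digest, lambda: payload)  # second request is served locally
print(cache.hits, cache.misses)       # 1 1
```

Because the key is derived from the content, identical data cached under different file names is stored once, and a corrupted or stale copy fails the digest check.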


Conclusions

● Initial sync is easy, staying in sync is hard
● Conflict resolution is a major issue
    ● potential for failure plus performance concern
    ● segregate data by consistency requirements
        – including read only
    ● try to choose "just enough" consistency
● Some assembly required


Resources

● Saito and Shapiro (essential!) http://www.ysaito.com/survey.pdf
● Academic Background
    ● Bayou http://www-users.cs.umn.edu/~he/iss/iss47344.ppt
    ● Ficus http://www.lasr.cs.ucla.edu/ficus/ficussummary.html
    ● OceanStore http://oceanstore.cs.berkeley.edu/
● Production Code
    ● http://rsync.samba.org/
    ● http://www.openafs.org/
    ● http://www.coda.cs.cmu.edu/
    ● http://www.xtreemfs.org/
    ● http://www.gluster.org/
    ● http://hekafs.org/index.php/2011/10/all-that-and-a-pony/