High-Availability LH* Schemes with Mirroring
W. Litwin, M.-A. Neimat
U. Paris 9 & HPL Palo Alto
[email protected]@cid5.etud.dauphine.fr

LH* with mirroring
A Scalable Distributed Data Structure
Data are in the distributed RAM of the server nodes of a multicomputer
Uses mirroring to survive:
– every single-node failure
– most multiple-node failures
Moderate performance deterioration with respect to basic LH*

Plan
Introduction
– multicomputers & SDDSs
– need for high availability
Principles of LH* with mirroring
Design issues
Performance
Conclusion

Multicomputers
A collection of loosely coupled computers
– common and/or preexisting hardware
– shared-nothing architecture
– message passing through a high-speed net
Network multicomputers
– use general-purpose nets
» LANs: Ethernet, Token Ring, Fast Ethernet, SCI, FDDI...
» WANs: ATM...
Switched multicomputers
– use a bus
» e.g., Transputer & Parsytec

[Figure: a network multicomputer with client and server nodes]

Why multicomputers?
Potentially unbeatable price-performance ratio
– much cheaper and more powerful than supercomputers
» 1500 WSs at HPL with 500+ GB of RAM & TBs of disks
Potential computing power
– file size
– access and processing time
– throughput
For more pros & cons:
– NOW project (UC Berkeley)
– Tanenbaum: "Distributed Operating Systems", Prentice Hall, 1995

Why SDDSs
Multicomputers need data structures and file systems
Trivial extensions of traditional structures are not best
– hot spots
– scalability
– parallel queries
– distributed and autonomous clients

What is an SDDS
A scalable data structure where:
Data are on servers
– always available for access
Queries come from autonomous clients
– available for access only on their own initiative
There is no centralized directory
Clients sometimes make addressing errors
» clients have a more or less adequate image of the actual file structure
Servers are able to forward the queries to the correct address
– perhaps in several messages
Servers send Image Adjustment Messages (IAMs)
» clients do not make the same error twice

An SDDS
[Figure sequence: the file resides on servers and grows through splits under inserts; clients send queries to the servers, and an IAM returns to a client after an addressing error]

Known SDDSs
[Figure: taxonomy of data structures (DS); a "You are here" marker shows the position of this talk]
DS
– Classics
– SDDS (1993)
» Hashing: LH* schemes, DDH, Breitbart & al.
» 1-d tree: RP* schemes, Kroll & Widmayer
» k-d tree: k-RP* schemes

LH* (A classic)
Allows for key-based hash files
– generalizes the LH addressing schema
Load factor 70 - 90 %
At most 2 forwarding messages
– regardless of the size of the file
In practice, 1 m/insert and 2 m/search on the average
4 messages in the worst case
Search time on the order of a ms (10 Mb/s net) and of µs (Gb/s net)
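
The addressing behaviour summarized on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: it follows the LH* client addressing, server forwarding and image adjustment rules as they are commonly described, assumes the hash family h_level(key) = key mod 2^level, and tracks only bucket levels (no records). All names (`LHStarFile`, `client_address`, `server_check`, `send`) are hypothetical.

```python
def h(level, key):
    """Assumed LH hash family: h_level(key) = key mod 2**level."""
    return key % (2 ** level)


class LHStarFile:
    """Toy LH* file: tracks only the bucket levels, not the records."""

    def __init__(self):
        self.i, self.n = 0, 0   # file level and split pointer
        self.levels = [0]       # levels[a] = level j of bucket a

    def split(self):
        """Split bucket n; the new bucket n + 2**i is appended."""
        self.levels[self.n] = self.i + 1
        self.levels.append(self.i + 1)
        self.n += 1
        if self.n == 2 ** self.i:
            self.n, self.i = 0, self.i + 1

    def correct_address(self, key):
        """Address computed from the true file state (i, n)."""
        a = h(self.i, key)
        return h(self.i + 1, key) if a < self.n else a


def client_address(image, key):
    """Address computed from the client's possibly outdated image (i', n')."""
    i, n = image
    a = h(i, key)
    return h(i + 1, key) if a < n else a


def server_check(a, j, key):
    """Run at bucket a of level j; returns a itself or the forwarding target."""
    target = h(j, key)
    if target != a:
        lower = h(j - 1, key)
        if a < lower < target:
            target = lower
    return target


def send(file, image, key):
    """Deliver key; return (final bucket, forwarding hops, adjusted image)."""
    first = a = client_address(image, key)
    hops = 0
    while (nxt := server_check(a, file.levels[a], key)) != a:
        a, hops = nxt, hops + 1
    if hops:  # IAM: adjust the image from the level of the first bucket
        i = file.levels[first] - 1
        n = first + 1
        if n >= 2 ** i:
            i, n = i + 1, 0
        image = (i, n)
    return a, hops, image


# A brand-new client (image (0, 0)) addressing a 37-bucket file.
lh_file = LHStarFile()
for _ in range(36):
    lh_file.split()
print(send(lh_file, (0, 0), 1234))
```

Starting every query from the empty image (0, 0) is the least favourable case for the client; the sketch lets one check that the number of forwarding hops stays within the bound stated on the slide.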

[Figure: cost of 10,000 inserts; curves: global cost and client's cost]

High-availability LH* schemes
In a large multicomputer, it is unlikely that all servers are up
Assume the probability that a bucket is up is 99 %
– bucket is unavailable about 3.65 days per year
One stores every key in 1 bucket
– case of typical SDDSs, LH* included
Probability that an n-bucket file is entirely up is
» 37 % for n = 100
» 0 % for n = 1000
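
The figures above follow from multiplying the per-bucket availabilities. A minimal sketch, assuming bucket failures are independent (an assumption made here for illustration; the function name is hypothetical):

```python
def file_availability(p_bucket, n_buckets):
    """Probability that all n buckets are up at once, assuming independence."""
    return p_bucket ** n_buckets


for n in (100, 1000):
    print(f"n = {n}: {file_availability(0.99, n):.4%}")
```

With one copy per key, the file is entirely available only when every bucket is, so availability decays exponentially with the number of buckets.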

High-availability LH* schemes
Using 2 buckets to store a key, one may expect:
– 99 % for n = 100
– 91 % for n = 1000
High-availability SDDSs
– make sense
– are the only way to reliable large SDDS files
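
A minimal sketch of the two-copy estimate, assuming independent bucket failures and that a key is unavailable only when both of its buckets are down (assumptions made here for illustration; the function name is hypothetical). Under this model the value for n = 1000 comes out at about 90.5 %, close to the slide's rounded figure.

```python
def mirrored_file_availability(p_bucket, n_buckets):
    """Probability that every one of n bucket pairs has at least one copy up."""
    pair_up = 1 - (1 - p_bucket) ** 2   # at least one of the two copies is up
    return pair_up ** n_buckets


for n in (100, 1000):
    print(f"n = {n}: {mirrored_file_availability(0.99, n):.2%}")
```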

High-availability LH* schemes
High-availability LH* schemes keep data available despite server failures
– any single server failure
– most two-server failures
– some catastrophic failures
Three types of schemes are currently known
– with mirroring
– with striping or grouping

LH* with Mirroring
There are two files called mirrors
Every insert propagates to both
– the propagation is done by the servers
Splits are autonomous
Every search is directed towards one of the mirrors
– the primary mirror for the corresponding client
If a bucket failure is detected, the spare is produced instantly at some site
– the storage for the failed bucket is reclaimed
– it can be allocated to another bucket
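
The insert and search paths described above can be sketched as follows. This is an illustration under simplifying assumptions, not the scheme's implementation: each mirror is a static hash file standing in for an LH* file (no splits, no forwarding), and the names `MirrorFile`, `Client` and `make_mirrors` are hypothetical. The `messages` counter records messages received by the servers of each file.

```python
class MirrorFile:
    """One mirror: a static hash file standing in for an LH* file."""

    def __init__(self, name, n_buckets):
        self.name = name
        self.buckets = [dict() for _ in range(n_buckets)]
        self.mirror = None    # the other file of the pair
        self.messages = 0     # messages received by this file's servers

    def _bucket(self, key):
        return self.buckets[key % len(self.buckets)]

    def insert(self, key, value, propagate=True):
        self.messages += 1
        self._bucket(key)[key] = value
        if propagate:
            # Server-side propagation: the client sends a single message.
            self.mirror.insert(key, value, propagate=False)

    def search(self, key):
        self.messages += 1
        return self._bucket(key).get(key)


class Client:
    """A client sends all its requests to its primary mirror."""

    def __init__(self, primary):
        self.primary = primary

    def insert(self, key, value):
        self.primary.insert(key, value)

    def search(self, key):
        return self.primary.search(key)


def make_mirrors(n_buckets=4):
    f1, f2 = MirrorFile("F1", n_buckets), MirrorFile("F2", n_buckets)
    f1.mirror, f2.mirror = f2, f1
    return f1, f2
```

For example, after `f1, f2 = make_mirrors()`, a client created as `Client(f1)` can insert a record that a second client created as `Client(f2)` then finds in its own primary, at the cost of one extra server-to-server message per insert.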

Basic configuration
[Figure: Site 1 with file F1 and Site 2 with file F2; the two files are mirrors]
Protection against a catastrophic failure

High-availability LH* schemes
Two types of LH* schemes with mirroring appear
Structurally-alike (SA) mirrors
– same file parameters
» keys are presumably at the same buckets
Structurally-dissimilar (SD) mirrors
» keys are presumably at different buckets
– loosely coupled = same LH-functions h_i
– minimally coupled = different LH-functions h_i

SA-Mirrors
[Figure: bucket contents of two SA-mirror files, with clients C1 and C2 and client image levels i' = 0 and i' = 3]

SA-Mirrors: new forwarding paths

Failure management
A bucket failure can be discovered
– by the client
– by the forwarding or mirroring server
– by the LH* split coordinator
The failure discovery triggers the instant creation of a spare bucket
– a copy of the failed bucket constructed from the mirror file
» from one or more buckets
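
Why a spare may need one or several mirror buckets can be sketched as follows. This is a brute-force illustration, not the spare-creation algorithm of the paper: it scans every mirror bucket, uses static hashing (key mod number of buckets) in place of LH*, and all names are hypothetical.

```python
def build_spare(failed_address, address_in_failed_file, mirror_buckets):
    """Rebuild a failed bucket by scanning the mirror file.

    Returns the spare's records and the set of mirror buckets that
    contributed at least one record.
    """
    spare, sources = {}, set()
    for m, bucket in enumerate(mirror_buckets):
        for key, value in bucket.items():
            if address_in_failed_file(key) == failed_address:
                spare[key] = value
                sources.add(m)
    return spare, sources


def make_file(keys, n_buckets):
    """Static hash file: key k is stored in bucket k mod n_buckets."""
    buckets = [dict() for _ in range(n_buckets)]
    for k in keys:
        buckets[k % n_buckets][k] = f"record-{k}"
    return buckets


keys = range(64)
f1 = make_file(keys, 4)
sa_mirror = make_file(keys, 4)   # same parameters as F1
sd_mirror = make_file(keys, 8)   # same hash family, different file size

# Bucket 2 of F1 fails; rebuild it from each kind of mirror.
spare_sa, sources_sa = build_spare(2, lambda k: k % 4, sa_mirror)
spare_sd, sources_sd = build_spare(2, lambda k: k % 4, sd_mirror)
print(sorted(sources_sa), sorted(sources_sd))
```

With a structurally-alike mirror the spare is a copy of a single mirror bucket, whereas a structurally-dissimilar mirror spreads the same records over several buckets.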

Spare creation
The spare creation process is managed by the coordinator
– choice of the node for the spare
– transfer of the records from the mirror file
» the algorithm is in the paper
– propagation of the spare node address to the node of the failed bucket
» when the node recovers, it contacts the coordinator

And the client?
The client can be unaware of the failure
– it then may send the message to the failed node
» that perhaps recovered and has another bucket n'
Problem
– bucket n' should recognize an addressing error
– should forward the query to the spare
» a case that did not exist for the basic LH*

Solution
Every client sends with the query Q the address n of the bucket Q should reach
If n <> n', then bucket n' resends the query to bucket n
– that must be the spare
Bucket n sends an IAM to the client to adjust its allocation table
– a new kind of IAM
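
The exchange above can be sketched as follows. This is a minimal illustration under assumptions: nodes are identified by strings, `server_table` stands for the bucket-to-node mapping that the servers learn from the coordinator, the IAM is modelled as a (bucket, node) pair, and all names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    bucket: int                                  # bucket currently hosted (n')
    records: dict = field(default_factory=dict)


def handle_query(node_id, n, key, nodes, server_table, forwarded=False):
    """Server side: the query carries the intended bucket address n."""
    node = nodes[node_id]
    if node.bucket != n:
        # n <> n': addressing error, resend the query to bucket n (the spare).
        return handle_query(server_table[n], n, key, nodes, server_table, True)
    # Bucket n answers; if the query was resent, it also issues the new IAM.
    iam = (n, node_id) if forwarded else None
    return node.records.get(key), iam


def client_search(key, n, client_table, nodes, server_table):
    """Client side: send to the node believed to hold bucket n, apply any IAM."""
    value, iam = handle_query(client_table[n], n, key, nodes, server_table)
    if iam is not None:
        bucket, node_id = iam
        client_table[bucket] = node_id           # adjust the allocation table
    return value


# Node "A" held bucket 3, failed, and recovered hosting bucket 7.
# The spare for bucket 3 was created on node "S".
nodes = {"A": Node(bucket=7), "S": Node(bucket=3, records={11: "x"})}
server_table = {3: "S", 7: "A"}
client_table = {3: "A"}                          # stale client image
print(client_search(11, 3, client_table, nodes, server_table))
```

After the first query the client's table points at the spare, so a repeated query for bucket 3 goes directly to node "S".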

SA / SD mirrors
[Figure: files F1 and F2 in panels (a), (b) and (c), contrasting SA-mirrors and SD-mirrors; bucket capacities shown: b = 8, b = 8, b = 4, b = 6]

Comparison
SA-mirrors
– most efficient for access and spare production
– but max. loss in the case of a two-bucket failure
Loosely-coupled SD-mirrors
– less efficient for access and spare production
– lesser loss of data for a two-bucket failure
Minimally-coupled SD-mirrors
– least efficient for access and spare production
– min. loss for a two-bucket failure
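
The difference in data loss can be illustrated with a small count. This is a sketch under assumptions, not a result from the paper: static hash functions stand in for the LH-functions, the three address functions for F2 are chosen only to mimic the three variants, and the failed bucket pair is the least favourable one for each variant.

```python
def lost_keys(keys, addr_f1, failed_f1, addr_f2, failed_f2):
    """Keys whose copies in F1 and in F2 both sit in failed buckets."""
    return [k for k in keys
            if addr_f1(k) == failed_f1 and addr_f2(k) == failed_f2]


keys = range(64)


def addr_f1(k):
    return k % 4                      # F1: 4 buckets of 16 keys each


# SA-mirrors: F2 uses the same parameters, so the same keys share a bucket.
loss_sa = lost_keys(keys, addr_f1, 2, lambda k: k % 4, 2)

# Loosely coupled SD-mirrors: same hash family, F2 has 8 smaller buckets.
loss_loose = lost_keys(keys, addr_f1, 2, lambda k: k % 8, 2)

# Minimally coupled SD-mirrors: F2 uses an unrelated (assumed) hash function.
loss_min = lost_keys(keys, addr_f1, 2, lambda k: (k // 4) % 4, 2)

print(len(loss_sa), len(loss_loose), len(loss_min))
```

In this toy setting the SA pair loses a whole bucket, the loosely coupled pair loses half of it, and the minimally coupled pair loses a quarter, which matches the ordering given on the slide.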

Conclusion
LH* with mirroring is the first SDDS for high availability
– for large multicomputer files
– for high-availability DBs
» avoids creating fragment replicas
Variants adapted to the importance of different kinds of failures
– How important is a multiple-bucket failure?

Price to pay
Moderate access performance deterioration as compared to basic LH*
– an additional message to the mirror per insert
– a few messages when failures occur
Double storage for the file
– can be a drawback

Future directions
Implementation
Performance analysis
– in presence of failures
Concurrency & transaction management
Other high-availability schemes
– RAID-like