analysis of replicated data with repair dependency
DESCRIPTION
Analysis of Replicated Data with Repair Dependency. ING-RAY CHEN AND DING-CHAU WANG. Presented By Weiping He and Li-Yuan Kuan.
TRANSCRIPT
![Page 1: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/1.jpg)
1
Analysis of Replicated Data with Repair Dependency
ING-RAY CHEN AND DING-CHAU WANG
Presented By Weiping He and Li-Yuan Kuan
![Page 2: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/2.jpg)
2
INTRODUCTION
Pessimistic control algorithms for replicated data permit only one partition to perform update operations at any given time so as to ensure mutual exclusion of the replicated data object.
![Page 3: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/3.jpg)
3
INTRODUCTION(contd)
Existing availability modelling and analyses of pessimistic control algorithms for replicated data management are constrained to either site-failure-only or link-failure-only models, but not both.
![Page 4: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/4.jpg)
4
INTRODUCTION(contd)
Using stochastic Petri nets, this paper investigates the effect of repair dependency, which occurs when many sites and links have to share the same repairman due to repair constraints.
![Page 5: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/5.jpg)
5
Four repairman models in the paper
(a) independent repairman with one repairman assigned to each link and each node;
(b) dependent repairman with FIFO servicing discipline;
(c) dependent repairman with linear-order servicing discipline;
(d) dependent repairman with best-first servicing discipline.
![Page 6: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/6.jpg)
6
SYSTEM MODEL AND ASSUMPTIONS
Background
Dynamic voting uses the current up-to-date copies at any time to derive its dynamic quorum set. The system can dynamically adjust its quorum set in response to state changes, which results in an increase in availability.
![Page 7: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/7.jpg)
7
Background (contd)
Suppose a file f is replicated to n copies, stored at sites S1, S2, …, Sn.
The essence of dynamic voting is that it must keep track of the number of up to date copies involved in the last update, and also which copies are up to date.
![Page 8: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/8.jpg)
8
Background (contd)
Each copy is associated with three variables:
version number (VN): tells whether the local copy is current
site cardinality (SC): the number of current copies that participated in the last update
distinguished site (DS): stores the ID of the highest linearly ordered site among all sites that presently store current copies.
![Page 9: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/9.jpg)
9
Background (contd)
Let (VNi, SCi, DSi) be the set associated with the copy stored at site Si.
For all Si, the initial states are VNi = 0, SCi = n and DSi = n.
DSi is initialized to n because Sn is the highest linearly ordered site among all sites initially.
![Page 10: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/10.jpg)
10
Background (contd)
Process
When updating, a site C (coordinator) requests all sites that have a copy of f to send their values of (VNi, SCi, DSi).
P: the set consisting of C and all subordinates that responded to the request; each site in P locks its copy of the data item f during the process.
![Page 11: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/11.jpg)
11
Background (contd)
Site C then inspects the largest version number VNp in set P.
I: the subset of P whose sites hold a copy with version number VNp.
SCp, DSp: the values of SC and DS held by the sites in set I.
![Page 12: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/12.jpg)
12
Background (contd)
Site C is in the major partition if either
(1) the cardinality of I is larger than one half of SCp; or
(2) the cardinality of I is exactly equal to one half of SCp and set I contains site DSp.
If site C is in the major partition, it can commit the update locally.
![Page 13: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/13.jpg)
13
Background (contd)
(VN, SC, DS) is set to (VNp+1, the cardinality of set P , the ID of the highest linearly ordered site found in P ).
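The update procedure on the preceding slides can be sketched in Python as follows. This is only an illustrative reading of the slides, not the authors' code: the names `Copy`, `in_major_partition` and `try_update` are hypothetical, and the sketch assumes that every site in P installs the new (VN, SC, DS) triple once the coordinator finds itself in the major partition.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    site: int  # site ID; a larger ID is higher in the linear order
    vn: int    # version number (VN)
    sc: int    # site cardinality (SC)
    ds: int    # distinguished site ID (DS)

def in_major_partition(p):
    """Return (ok, vn_p) for the set p of copies that responded (set P)."""
    vn_p = max(c.vn for c in p)              # largest version number in P
    i_set = [c for c in p if c.vn == vn_p]   # set I: the current copies in P
    sc_p, ds_p = i_set[0].sc, i_set[0].ds    # SCp and DSp, read from set I
    if 2 * len(i_set) > sc_p:                # condition (1): more than half
        return True, vn_p
    if 2 * len(i_set) == sc_p and any(c.site == ds_p for c in i_set):
        return True, vn_p                    # condition (2): half plus DSp
    return False, vn_p

def try_update(p):
    """Commit an update on P if it is the major partition (assumed behaviour)."""
    ok, vn_p = in_major_partition(p)
    if ok:
        new_ds = max(c.site for c in p)      # highest linearly ordered site in P
        for c in p:
            c.vn, c.sc, c.ds = vn_p + 1, len(p), new_ds
    return ok

# Five copies, all initialised to (VN, SC, DS) = (0, n, n) with n = 5.
copies = [Copy(site=i, vn=0, sc=5, ds=5) for i in range(1, 6)]
print(try_update(copies[:3]))   # sites 1, 2, 3 respond to the coordinator
```

Comparing `2 * len(i_set)` with `sc_p` keeps the "one half of SCp" test in integer arithmetic, which avoids rounding when SCp is odd.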
![Page 14: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/14.jpg)
14
Assumptions
(1) There are n sites connected by a topology to be specified. Each site is assigned a single vote and a unique site ID, numbered 1, 2, …, n, where n is the total number of sites in the system. Site n is the highest linearly ordered site.
![Page 15: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/15.jpg)
15
Assumptions (contd)
(2) Sites and links fail independently, with failure rates λs and λl respectively.
(3) A repairman repairs a failed site with rate μs and a failed link with rate μl.
(4) Times between these events are exponentially distributed.
![Page 16: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/16.jpg)
16
Independent repairman model
Site subnet
A site can be in one of four states:
up and current (upcci)
up and out-of-date (upoci)
down and current (downcci)
down and out-of-date (downoci)
![Page 17: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/17.jpg)
17
Independent repairman model
Suppose site i is initially in the state up and current: #(upcci) = i, #(upoci) = 0, #(downcci) = 0 and #(downoci) = 0.
An update occurs and site i is not in the major partition: upcci → upoci.
Now #(upcci) = 0, #(upoci) = i, #(downcci) = 0 and #(downoci) = 0.
![Page 18: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/18.jpg)
18
Independent repairman model
upcci → upoci: (t5i; 3; gi), where t5i is the name of the transition, 3 is the priority of the transition, and gi is the enabling function.
![Page 19: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/19.jpg)
19
Independent repairman model
A transition is enabled when
(1) each input place contains at least as many tokens as the multiplicity of its input arc, and
(2) its enabling function returns TRUE.
Among the enabled transitions, the one with the highest priority level fires first.
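The enabling rule above can be written down as a small Python sketch. The dictionary layout (`inputs`, `guard`, `priority`) and the function names are assumptions made for illustration; they are not SPNP syntax.

```python
def is_enabled(marking, transition):
    """Check both enabling conditions for one transition.

    marking: dict mapping place name -> token count.
    transition: dict with 'inputs' (place -> arc multiplicity),
                'guard' (enabling function of the marking) and 'priority'.
    """
    tokens_ok = all(
        marking.get(place, 0) >= mult
        for place, mult in transition["inputs"].items()
    )
    return tokens_ok and transition["guard"](marking)

def pick_transition(marking, transitions):
    """Return the enabled transition with the highest priority, or None."""
    enabled = [t for t in transitions if is_enabled(marking, t)]
    return max(enabled, key=lambda t: t["priority"], default=None)

# Hypothetical example for site i = 3: t5 needs 1 token in ready and 3 in upcc.
t5 = {"name": "t5", "priority": 3,
      "inputs": {"ready": 1, "upcc": 3},
      "guard": lambda m: True}   # stands in for the enabling function gi
print(is_enabled({"ready": 1, "upcc": 3}, t5))
```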
![Page 20: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/20.jpg)
20
Independent repairman model
When #(readyi) = 1, #(upcci) = i and the enabling function returns TRUE, transition t5i fires: the tokens in readyi and upcci are removed and i tokens are added to upoci. Site i is now in the state up and out of date.
![Page 21: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/21.jpg)
21
Independent repairman model
When an update arrives and a major partition exists, a token is put into readyi. Only one of the six transitions can fire; all have the same priority level.
![Page 22: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/22.jpg)
22
Independent repairman model
t0i will fire if site i is down and current: downcci → downoci. When an update occurs while site i is down and current, the new state is down and out of date.
Because site i is down, it cannot take part in the update.
![Page 23: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/23.jpg)
23
Independent repairman model
t1i will fire if site i is down and out of date.
Site i remains down and out of date.
![Page 24: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/24.jpg)
24
Independent repairman model
t2i will fire if site i is up and out of date.
upoci → upcci: the new state is up and current.
t2i’s enabling function gi returns TRUE if site i is in the major partition.
![Page 25: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/25.jpg)
25
Independent repairman model
t3i will fire if site i is up and current.
upcci → upcci: the new state is up and current.
enabling function gi
![Page 26: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/26.jpg)
26
Independent repairman model
t4i will fire if site i is up and out of date, and not in the major partition.
upoci → upoci: the new state is up and out of date.
enabling function gi
![Page 27: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/27.jpg)
27
Independent repairman model
t5i will fire if site i is up and current, and not in the major partition.
upcci → upoci: the new state is up and out of date.
enabling function gi
![Page 28: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/28.jpg)
28
Independent repairman model
All transitions in the site subnet are immediate transitions.
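The six immediate transitions t0i to t5i amount to a small state machine for site i that is evaluated once per update. The Python sketch below is one hedged reading of the slides: the function name `on_update` is hypothetical, and it assumes that t3i applies when an up and current site is in the major partition, which the slides imply but do not state outright.

```python
def on_update(state, in_major):
    """Return (transition, new_state) for one site when an update is processed.

    state: one of 'upcc', 'upoc', 'downcc', 'downoc'.
    in_major: True if the site is in the major partition (enabling function gi).
    """
    if state == "downcc":
        return "t0", "downoc"      # a down site misses the update
    if state == "downoc":
        return "t1", "downoc"      # stays down and out of date
    if state == "upoc":
        # t2 brings the copy up to date; t4 leaves it out of date
        return ("t2", "upcc") if in_major else ("t4", "upoc")
    if state == "upcc":
        # t3 keeps the copy current (assumed: in major partition); t5 makes it stale
        return ("t3", "upcc") if in_major else ("t5", "upoc")
    raise ValueError(f"unknown site state: {state}")

for s in ("upcc", "upoc", "downcc", "downoc"):
    print(s, on_update(s, in_major=False))
```

Exactly one branch applies to any (state, in_major) pair, which matches the statement that only one of the six transitions can fire for a site.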
![Page 29: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/29.jpg)
29
System subnet
At the bottom of Figure 2, each of the boxes labeled site i is the SPN subnet shown in Figure 1.
![Page 30: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/30.jpg)
30
System subnet
Transitions tf and tfbar are given the highest priority levels (5 and 4).
When an update event arrives, a token is put in place update_event.
![Page 31: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/31.jpg)
31
System subnet
tf is evaluated prior to tfbar since it has the highest priority.
If a major partition exists, tf will fire, remove the token from update_event and place one token in each readyi.
![Page 32: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/32.jpg)
32
System subnet
If a major partition does not exist, tfbar will fire and remove the token from update_event.
![Page 33: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/33.jpg)
33
System subnet
The next highest priority level (i.e. 3) is assigned to each of the boxes.
![Page 34: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/34.jpg)
34
System subnet
After all sites are evaluated and each site's status is updated, tds and tsc, which have lower priority levels, will fire.
![Page 35: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/35.jpg)
35
System subnet
tsc updates the site cardinality.
tsc → sc: the arc multiplicity is the number of sites with mark(upcc) > 0 in the major partition.
If update changes the system status, the site cardinality is stored as the number of tokens in place sc.
![Page 36: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/36.jpg)
36
System subnet
tds updates the distinguished site.
tds → ds: the arc multiplicity is the maximum #(upcc) value among all the sites in the major partition.
After update, the site ID of the new distinguished site will be stored as the number of tokens in place ds.
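Because an up and current site i holds i tokens in upcci, both arc multiplicities can be read directly off the marking. The following Python sketch illustrates that encoding; the function name `new_sc_ds` and the dictionary representation of the marking are assumptions made for illustration.

```python
def new_sc_ds(upcc, major):
    """Compute the tokens deposited in places sc and ds after an update.

    upcc: dict mapping site ID -> #(upcc_i), which is i if site i is up and
          current and 0 otherwise.
    major: set of site IDs in the major partition.
    """
    current = [s for s in major if upcc.get(s, 0) > 0]
    sc = len(current)                                 # multiplicity of tsc -> sc
    ds = max((upcc[s] for s in current), default=0)   # multiplicity of tds -> ds
    return sc, ds

# Hypothetical marking: sites 1, 3 and 5 are up and current; the major
# partition holds sites 1, 2 and 3.
print(new_sc_ds({1: 1, 2: 0, 3: 3, 4: 0, 5: 5}, {1, 2, 3}))
```

Storing the site ID as a token count is what lets a single `max` recover the distinguished site.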
![Page 37: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/37.jpg)
37
Site failure/repair subnets
This figure describes the effect of site i’s failure and repair on the system state, for the independent repairman model.
![Page 38: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/38.jpg)
38
Site failure/repair subnets
Site i can be in only one state at a time, so only one transition out of these two subnets is possible at a time.
![Page 39: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/39.jpg)
39
Site failure/repair subnets
Failure events:
upcci → downcci (transition tccfi, rate λs), or
upoci → downoci (transition tocfi, rate λs)
Repair events:
downcci → upcci (transition tccri, rate μs), or
downoci → upoci (transition tocri, rate μs)
![Page 40: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/40.jpg)
40
Site failure/repair subnets
The system updates its status upon every failure or repair event.
This ensures that the system state is updated before another failure or repair event occurs.
![Page 41: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/41.jpg)
41
Link failure/repair subnets
Subscript ij refers to the link between sites i and j.
Failure events: uplinkij → dwlinkij (transition tlinkfi, rate λl)
Repair events: dwlinkij → uplinkij (transition tlinkri, rate μl)
![Page 42: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/42.jpg)
42
Meanings of places.
![Page 43: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/43.jpg)
43
Arc multiplicity functions.
![Page 44: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/44.jpg)
44
Enabling functions.
![Page 45: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/45.jpg)
45
FIFO repairman model
We can reuse the independent repairman model and modify the repair rates to account for repair dependencies.
The repair rate now becomes a function of the total number of failed sites and links.
![Page 46: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/46.jpg)
46
FIFO repairman model
If a state has two failed sites and one failed link:
in the independent repairman model, the repair rates are μs, μs and μl;
in the FIFO repairman model, the repair rates are μs/3, μs/3 and μl/3.
![Page 47: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/47.jpg)
47
FIFO repairman model
In general, if a state has M failed sites and links, then the respective repair rates of these failed entities in that state are “deflated” by a factor of M to account for the effect of repair resource sharing.
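The rate deflation is simple enough to state as code. In this Python sketch the function name `fifo_repair_rates` and the list return format are illustrative choices, not part of the paper.

```python
def fifo_repair_rates(failed_sites, failed_links, mu_s, mu_l):
    """Return the per-entity repair rates under the FIFO repairman model.

    Every failed entity's independent rate is divided by M, the total number
    of failed sites and links in the state.
    """
    m = failed_sites + failed_links
    if m == 0:
        return []                      # nothing to repair in this state
    return [mu_s / m] * failed_sites + [mu_l / m] * failed_links

# Two failed sites and one failed link, as in the example above
# (the numeric rates are arbitrary).
print(fifo_repair_rates(2, 1, mu_s=3.0, mu_l=6.0))   # [1.0, 1.0, 2.0]
```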
![Page 48: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/48.jpg)
48
Linear-order repairman model
There is a prespecified order among failed entities. Because a linear order is already used to select the distinguished site (DS), it is natural to follow that order for repair.
A higher linearly ordered site will become the distinguished site in the event that the number of sites within the major partition is an even number. Therefore, giving a higher repair priority to a higher linearly ordered site increases the chance of finding the major partition.
![Page 49: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/49.jpg)
49
Linear-order repairman model
Example: consider a state in which only two sites, D and E, are left in the major partition, with E being the distinguished site. Suppose sites D and E subsequently fail. Then repairing D would not result in a major partition being found, while repairing E would.
![Page 50: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/50.jpg)
50
Linear-order repairman model
Reusing the independent repairman model, create a new enabling function associated with the transition of each site or link repair event, as in the table on the right.
![Page 51: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/51.jpg)
51
Linear-order repairman model
Only one enabling function at any state shall return TRUE based on the prespecified linear order and all others shall return FALSE.
![Page 52: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/52.jpg)
52
Linear-order repairman model
Example:
A 5-site ring topology whose linear order is (sites 5, 4, 3, 2, 1; links 54, 51, 43, 32, 21).
Sites 4 and 2 and link 51 are down.
![Page 53: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/53.jpg)
53
Linear-order repairman model
The enabling functions associated with sites 4 and 2 and link 51 will return TRUE, FALSE and FALSE respectively, meaning site 4 will be repaired next, ahead of site 2 and link 51.
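A hedged Python sketch of such an enabling function, using the linear order from the example. The entity labels (`"s4"`, `"l51"`) and the function name `repair_enabled` are naming assumptions made for illustration.

```python
# Linear order from the 5-site ring example: sites first, then links.
LINEAR_ORDER = ["s5", "s4", "s3", "s2", "s1",
                "l54", "l51", "l43", "l32", "l21"]

def repair_enabled(entity, failed, order=LINEAR_ORDER):
    """Enabling function for the repair transition of one entity.

    Returns True only for the failed entity that comes first in the linear
    order, so at most one repair transition is enabled in any state.
    """
    for candidate in order:
        if candidate in failed:
            return candidate == entity
    return False   # nothing has failed

failed = {"s4", "s2", "l51"}
print([repair_enabled(e, failed) for e in ("s4", "s2", "l51")])
# [True, False, False]
```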
![Page 54: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/54.jpg)
54
What is system availability?
What is site availability?
System availability: The steady-state probability that a major partition exists.
Site availability: The probability that an update arriving at an arbitrary site will succeed.
![Page 55: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/55.jpg)
55
Best-first repairman model
Linear-order repairman: the repair order is prespecified and stays fixed at run time.
Best-first repairman: the repair order is not fixed.
Preference is given to the site or link whose repair would most improve the site availability of the system with respect to the current state.
![Page 56: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/56.jpg)
56
Best-first repairman model- Repair Strategy
If the repair of a distinct failed site or link would lead to the existence of a major partition, then the distinct failed site or link will be selected to be repaired next.
If there is more than one failed site or link whose repair would lead to the existence of a major partition, then a tie-breaker rule will be applied to select one distinct member of the group to be repaired next.
When a major partition already exists: select the repair for which the number of current copies is the largest among all possible repair choices in that state.
![Page 57: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/57.jpg)
57
Repair Strategy: Tie-breaker rules
(Both failed sites and links) If μs/λs >= μl/λl, preference is given to the failed sites; otherwise it is given to the failed links.
(Failed sites) Preference is given to the highest linearly ordered site in the group.
Preference is given to the failed entity which can be repaired fastest.
A higher linearly ordered site is more likely to become the distinguished site.
![Page 58: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/58.jpg)
58
How to increase the probability of uniting more sites in the major partition?
Repair sites or links closer to the current distinguished site.
![Page 59: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/59.jpg)
59
Tie-breaker rules-Example
Example: a five-site ring system.
The major partition contains only the distinguished site, site 4.
All other sites and links have failed.
Repairing any of the failed sites or links yields the same availability improvement.
![Page 60: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/60.jpg)
60
[Figure: failed-site-only example. Annotations: "better to unite sites 4 & 5"; "No benefit".]
![Page 61: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/61.jpg)
61
Repair Strategy: Tie-breaker rules (contd)
(Failed links) Preference is given to the link that connects the highest linearly ordered sites in the group.
If μs/λs < μl/λl, repairing links takes precedence over repairing sites.
Repair links around the current distinguished site.
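The tie-breaker rules can be sketched in Python as below. This covers only the three rules stated on the slides, for a group of candidates whose repair yields the same improvement. The function name `break_tie`, the tuple return format, and in particular the way links are ranked (by comparing their endpoints from highest to lowest) are assumptions, since the slides do not spell out how "the link that connects the highest linearly ordered sites" is decided.

```python
def break_tie(sites, links, mu_s, lam_s, mu_l, lam_l):
    """Pick one entity to repair from a group of equally good candidates.

    sites: list of failed site IDs in the group.
    links: list of failed links in the group, each a pair (i, j) of site IDs.
    Returns ('site', id), ('link', (i, j)) or None if the group is empty.
    """
    # Rule for mixed groups: prefer sites when mu_s/lam_s >= mu_l/lam_l.
    prefer_sites = mu_s / lam_s >= mu_l / lam_l
    if sites and (prefer_sites or not links):
        # Rule for sites: highest linearly ordered site.
        return ("site", max(sites))
    if links:
        # Rule for links (assumed ranking): compare endpoints, highest first.
        return ("link", max(links, key=lambda l: sorted(l, reverse=True)))
    return None

# Hypothetical group: failed sites 2 and 5, failed links 4-5 and 1-2.
print(break_tie([2, 5], [(4, 5), (1, 2)], mu_s=2.0, lam_s=1.0, mu_l=1.0, lam_l=1.0))
print(break_tie([2, 5], [(4, 5), (1, 2)], mu_s=1.0, lam_s=1.0, mu_l=3.0, lam_l=1.0))
```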
![Page 62: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/62.jpg)
62
Higher linearly ordered site has a higher probability to become the distinguished site
Repair a link connecting higher linearly ordered sites
Put the system in a better position to unite more sites
![Page 63: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/63.jpg)
63
[Figure: failed-link-only example. Annotations: "better to unite sites 4 & 5"; "No benefit".]
![Page 64: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/64.jpg)
64
Best-first repairman model
Similar to linear-ordered repairman model.
Associate an enabling function with the repair transition of each site or link.
Only one enabling function can return TRUE at any state.
![Page 65: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/65.jpg)
65
Implementations
The modeling techniques were tested with a five-site ring topology.
Four repairman models:
Independent repairman model
Dependent model with FIFO
Dependent model with linear-order
Dependent model with best-first
Using SPNP on a SUN Ultra1
![Page 66: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/66.jpg)
66
Implementations - Models
| | Independent | Dependent FIFO | Dependent linear-order | Dependent best-first |
| --- | --- | --- | --- | --- |
| Figures | 1, 2, 3, 4 | 1, 2, 3, 4 | 1, 2, 3, 4 | 1, 2, 3, 4 |
| Tables | 1, 2, 3, 4 | 1, 2, 4, 5 | 1, 2, 3, 4, 6 | 1, 2, 3, 4, 7 |
| Markov chain states | 8674 | 8674 | 5429 | 3821 |
![Page 67: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/67.jpg)
67
Implementations - System and Site Availability metrics
| Metric | Definition | Reward assignment |
| --- | --- | --- |
| System availability | The steady-state probability that a major partition exists. | Reward rate = 1 for those states in which enabling function f() evaluates to TRUE; reward rate = 0 otherwise. |
| Site availability | The probability that an update arriving at an arbitrary site will succeed. | Reward rate = k/n for those states in which enabling function f() evaluates to TRUE; reward rate = 0 otherwise. |

k: number of ‘up’ sites in the major partition in a particular state; n: total number of sites in the system.
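Both metrics are expected reward rates over the steady-state distribution of the underlying Markov chain. The Python sketch below shows the arithmetic only; the function name `availabilities`, the triple format for states, and the three-state distribution are made up for illustration and are not results from the paper.

```python
def availabilities(states, n):
    """Compute (system availability, site availability) as expected rewards.

    states: iterable of (probability, major_exists, k), where k is the number
            of 'up' sites in the major partition in that state.
    n: total number of sites in the system.
    """
    # Reward 1 in states where a major partition exists, 0 otherwise.
    system = sum(p for p, major, _ in states if major)
    # Reward k/n in states where a major partition exists, 0 otherwise.
    site = sum(p * k / n for p, major, k in states if major)
    return system, site

# Made-up steady-state distribution over three aggregate states, n = 5.
toy = [(0.5, True, 5), (0.25, True, 3), (0.25, False, 0)]
print(availabilities(toy, n=5))
```

Since k/n never exceeds 1, the site availability computed this way cannot exceed the system availability for the same distribution.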
![Page 68: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/68.jpg)
68
Results (Indep.)
As μl:λl increases, site availability increases; the same holds as μs:λs increases.
The rate of increase of site availability slows down as μl:λl or μs:λs grows.
Assuming only fallible links, or only fallible sites, but not both, overestimates the site availability metric unrealistically.
MTTF:MTTR [2]
[2] Jajodia, S. and Mutchler, D. (1990) Dynamic voting algorithms for maintaining the consistency of a replicated database. ACM Trans. Database Systems, 15, 230-280.
![Page 69: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/69.jpg)
69
Results (dep.)
Site availability with independent repair is higher than that with dependent repair.
Only when μl:λl is relatively high can the availability of the best-first repairman model approximate that of the independent repairman model.
![Page 70: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/70.jpg)
70
Results (dep., contd)
System availability is higher than the corresponding site availability under the same μl:λl and μs:λs ratios.
Independent > Best-first > Linear-Order > FIFO for system and site availability under the same μl:λl and μs:λs ratios.
![Page 71: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/71.jpg)
71
Results - Three Dependent Repairman Models
Best-first repairman model can always provide a better site availability than the other two.
Avoid unfavorable states resulting from ineffective repair activities (when no major partition exists).
Unfavorable states occur when any of the following occurs:
(a) One or more up but out-of-date copies.
(b) One up and current copy (not the distinguished copy) plus (a).
Avoid these states by repairing the distinguished site first: a major partition always results (assumption: frequent updates).
![Page 72: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/72.jpg)
72
Results - Three Dependent Repairman Models (contd)
Best-first repairman model can always provide a better site availability than the other two.
Choose a failed site or link to repair such that the size of the major partition after repair will become the largest among all possible choices. (major partition exists).
Eliminate ineffective repair.
![Page 73: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/73.jpg)
73
Results - Three Dependent Repairman Models (contd)
Two cases:
Case 1: effective repair with best-first (size 4) versus ineffective repair with linear-order/FIFO (size 3).
Case 2: when repairing any member of a group results in the same availability improvement, best-first will select the one with the shortest repair time, so that most of the time the system stays in states with high availability.
![Page 74: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/74.jpg)
74
Results - Difference in Availability
Assumption: λs = λl = λ; μs = μl = μ
![Page 75: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/75.jpg)
75
Results - Difference in Availability
When the ratio μ/λ > 20: the difference among these three models is small.
Sites/links can be repaired much faster than they fail.
When the ratio μ/λ is 5~15: the difference becomes larger.
The effective repairs have a more prominent impact.
When the ratio μ/λ < 1: the difference among these three models is small.
Sites/links fail much faster than they can be repaired. The effective repairs become less significant.
![Page 76: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/76.jpg)
76
Conclusions
A significant difference in availability exists between two systems with independent and dependent repairman models, except when the repair rate is much higher than the failure rate (i.e. when μ/λ > 20).
When several sites and links have to share the same repairman, the best-first repairman model can always provide a better availability than the FIFO and linear-order repairman models.
![Page 77: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/77.jpg)
77
Conclusions (contd)
The difference in availability becomes more pronounced as the site/link failure rate increases relative to the site/link repair rate (i.e. as μ/λ decreases);
An availability model that considers only site or link failures/repairs, but not both, can give a very unrealistic, overestimated value of the availability metric.
Ignoring concurrent site/link failure modes or repair dependency can overestimate the availability of replicated data.
![Page 78: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/78.jpg)
78
Future Work
Extend and apply the modeling techniques to analyzing other pessimistic algorithms.
Develop modeling techniques to study the trade-off between the reduction in data processing overheads and the sacrifice in availability for database environments where a combined performance/availability design goal must be met.
![Page 79: Analysis of Replicated Data with Repair Dependency](https://reader036.vdocument.in/reader036/viewer/2022062301/568132b9550346895d99764d/html5/thumbnails/79.jpg)
79
Thank you!