![Page 1: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/1.jpg)
Presented by Rukmini and Diksha Chauhan
Virginia Tech, 2nd May, 2007
Movement-Based Checkpointing and Logging for Recovery in Mobile Computing Systems
Sapna E. George, Ing-Ray Chen & Ying Jin
![Page 2: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/2.jpg)
Agenda
- Related work
- Mobile Computing System
- Proposed Movement-based Checkpointing and Logging
- Recovery Schemes
- Performance Analysis
- Conclusion
![Page 3: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/3.jpg)
Properties of Mobile Computing
Inherent properties
- Host mobility
- Disconnections
- Wireless bandwidth limitation
- Battery life
- Storage
- Hardware failure
- Software failure
Motivation
- Propose an efficient failure recovery scheme
![Page 4: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/4.jpg)
Distributed Systems
Fault-tolerance schemes
- Logging
- Checkpointing
- Rollback recovery
Definition: domino effect
- Inter-process dependencies lead to cascading rollbacks
Asynchronous recovery
![Page 5: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/5.jpg)
Related Work
Acharya et al. [1] describe a distributed uncoordinated checkpointing scheme in which multiple MHs can arrive at a globally consistent checkpoint without coordination messages.
The paper does not describe how failure recovery is achieved, nor does it address the management of recovery information in the face of MH movement.
![Page 6: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/6.jpg)
Underlying Model
![Page 7: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/7.jpg)
Basic Definitions
- Mobile Host (MH)
- Mobile Support Station (MSS)
  - Infrastructure machines
  - High-speed static wired network
![Page 8: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/8.jpg)
Basic Definitions
- Cell
- Local MSS
- Communication between MH and MSS: constraints
- Process of communication between MHs
  - Two one-hop wireless transmissions
  - Arbitrary hops
![Page 9: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/9.jpg)
Basic Definitions
Handoff
- Treated as an instantaneous process
- Occurs when an MH crosses a cell boundary
- An MH may voluntarily disconnect from the network (at MSS1) to conserve power and reconnect (at MSS2) at a later time.
- The MH sends the ID of MSS1 to the new MSS2, initiating handoff procedures.
![Page 10: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/10.jpg)
Processes and States
Three states
- Normal execution
  - Application-related computation
  - Sending or receiving messages
  - Logging
- Save
- Recovery
Write event
- Message received from another MH or server
- User input or local computation
![Page 11: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/11.jpg)
Movement-Based Checkpointing & Message Logging
- Checkpoints are taken after a certain number of host migrations across cells rather than periodically.
Recovery Scheme
- Combines independent checkpointing and optimistic message logging, enabling asynchronous recovery of an MH upon failure.
- Application recovery mechanisms optimize:
  - recovery cost (failure-free operational cost)
  - recovery time
  - storage requirements for recovery-related information
![Page 12: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/12.jpg)
Movement-Based Checkpointing & Message Logging
- The scheme uses distance, or number of handoffs, as the parameter that triggers information consolidation.
- When an MH crosses a distance threshold from the location of the latest checkpoint, the recovery information is collected and transferred to the MH's local MSS.
- The recovery protocol proactively controls the number of checkpoints and logs through the movement-based checkpointing strategy, so the additional overhead of unnecessary checkpoints and log consolidation during failure-free operation is avoided.
![Page 13: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/13.jpg)
Checkpointing & Message Logging
The movement threshold m is a function of the user's mobility rate, the failure rate and the log arrival rate, which lets the scheme adapt to user and application behaviour.
![Page 14: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/14.jpg)
Movement-based Checkpointing and Logging
MH stored variables
- cp_seq: stores the sequence number of the latest checkpoint
- cp_loc: stores the ID of the MSS that has recorded the latest checkpoint (MSScp)
- Handoff_counter: set to 0 when a checkpoint is taken
- log_set: IDs of the MSSs that hold logs (MSSlogs)
Activity
- Checkpoints
- Logging
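The bookkeeping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `MobileHost`, the field `threshold_m`, the method names, and the choice to clear `log_set` whenever a checkpoint is taken are all assumptions made here; only `cp_seq`, `cp_loc`, `log_set` and the handoff counter come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class MobileHost:
    """Hypothetical MH-side state for movement-based checkpointing."""
    threshold_m: int                 # movement threshold M (assumed configurable)
    current_mss: str                 # ID of the MSS serving the current cell
    cp_seq: int = 0                  # sequence number of the latest checkpoint
    cp_loc: str = ""                 # ID of the MSS holding the latest checkpoint
    handoff_counter: int = 0         # handoffs since the latest checkpoint
    log_set: list = field(default_factory=list)  # IDs of MSSs holding logs

    def __post_init__(self):
        # Assumption: the initial checkpoint lives at the starting MSS.
        if not self.cp_loc:
            self.cp_loc = self.current_mss

    def log_write_event(self):
        """Note that the current MSS now stores at least one log entry."""
        if self.current_mss not in self.log_set:
            self.log_set.append(self.current_mss)

    def handoff(self, new_mss: str) -> bool:
        """Move to a new cell; return True if this handoff triggered a checkpoint."""
        self.current_mss = new_mss
        self.handoff_counter += 1
        if self.handoff_counter >= self.threshold_m:
            self.take_checkpoint()
            return True
        return False

    def take_checkpoint(self):
        """Record a new checkpoint at the current MSS and reset the counters."""
        self.cp_seq += 1
        self.cp_loc = self.current_mss
        self.handoff_counter = 0
        self.log_set.clear()
```

With `threshold_m=3`, the first two handoffs only increment the counter; the third one takes a checkpoint at the new MSS and resets `handoff_counter` to 0.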
![Page 15: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/15.jpg)
Independent Recovery
- Independent: without coordination with other hosts.
Recovery process
- The MH sends cp_seq, cp_loc and log_set to the MSS.
- The MSS initiates (requests) data collection.
- The MSS compiles:
  - logs, into a list ordered by time
  - checkpoints
- Once recovery is completed successfully, a checkpoint of the current state is taken and sent to the MSS, and the local variables are reset.
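The recovery steps above can be sketched as a merge of the collected logs followed by a replay on top of the latest checkpoint. The function name `recover`, the `(timestamp, entry)` log layout and the `apply_entry` callback are illustrative assumptions, not details from the paper.

```python
def recover(checkpoint_state, logs_by_mss, apply_entry):
    """Rebuild MH state from the latest checkpoint plus all later log entries.

    checkpoint_state: state saved in the latest checkpoint (fetched from cp_loc)
    logs_by_mss: dict mapping MSS ID -> list of (timestamp, entry) tuples,
                 one list per MSS named in log_set
    apply_entry: function (state, entry) -> new state
    """
    # Merge the logs from every MSS into a single list ordered by time.
    merged = sorted(
        (item for entries in logs_by_mss.values() for item in entries),
        key=lambda item: item[0],
    )
    # Roll back to the checkpoint, then replay the entries in order.
    state = checkpoint_state
    for _timestamp, entry in merged:
        state = apply_entry(state, entry)
    return state
```

For example, with logs spread over two MSSs, `recover([], {"MSS2": [(3, "c")], "MSS1": [(1, "a"), (2, "b")]}, lambda s, e: s + [e])` replays the entries in timestamp order regardless of which MSS stored them.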
![Page 16: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/16.jpg)
Storage Management at MSS
MH's disk
- Unstable
- Limited
MSS's disk
- Stable storage
- Considerably large
When storage at an MSS is depleted:
1. Temporarily halt and perform garbage collection
2. Use alternative storage
3. Delete outdated recovery information
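One possible reading of "deleting outdated recovery information" is that records belonging to checkpoint intervals older than the latest checkpoint are no longer needed. The sketch below assumes that reading; the record layout (a dict with a `cp_seq` key) and the function name `purge_outdated` are hypothetical.

```python
def purge_outdated(store, mh_id, latest_cp_seq):
    """Drop recovery records for mh_id that predate its latest checkpoint.

    store: dict mapping MH ID -> list of records; each record is a dict
           with a 'cp_seq' key naming the checkpoint interval it belongs to.
    Returns the number of records removed.
    """
    records = store.get(mh_id, [])
    kept = [r for r in records if r["cp_seq"] >= latest_cp_seq]
    store[mh_id] = kept
    return len(records) - len(kept)
```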
![Page 17: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/17.jpg)
SPN Model Parameters
![Page 18: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/18.jpg)
SPN Model
![Page 19: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/19.jpg)
Transition firing rate
Checkpoint Rate of MH
During checkpointing:
(a) MH takes a snapshot of its current state
(b) MH sends the checkpoint to the current MSS through the wireless channel.
(c) The MSS stores it in its stable storage.
where Tckp_w is the time required to transmit a checkpoint through the wireless link.
![Page 20: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/20.jpg)
Transition Firing Rate
Recovery rate of MH, i.e., the inverse of recovery time
Recovery time includes:
(a) time to send recovery information requests to the MSSs storing the latest checkpoint and all logs since the latest checkpoint
(b) time to transmit the latest checkpoint from the MSS where it is stored (MSScp) to the MSS in which the MH has recovered (MSSrec) through the wired network and through the wireless channel to the MH
![Page 21: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/21.jpg)
Transition Firing Rate
(c) time to transmit all the logs from the respective MSSs where they are located (MSSlogs) to the MSSrec through the wired network and through the wireless channel to the MH, and
(d) time to roll back to the last checkpoint and apply all the logs at the MH.
![Page 22: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/22.jpg)
Variables
- One variable represents the number of MSSs storing logs. Its value is at most the number of handoffs before failure, i.e. i.
- The other represents the average hop count between MSScp and MSSrec.
![Page 23: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/23.jpg)
Recovery Time
Time spent on recovery requests:
Time spent on transmitting the latest checkpoint to MH:
![Page 24: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/24.jpg)
Recovery Time contd.
Time spent to transmit the logs to MH:
where n is the number of log entries since the last checkpoint.
Time spent to roll back to the last checkpoint and apply the logs:
Total recovery time after i movements:
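As a rough illustration of how the four components add up (not the paper's exact expressions), the sketch below assumes that each wired hop costs one full transmission of the item, that the wireless link runs at r times the wired bandwidth, that every log entry travels the same number of hops as the checkpoint, and that applying one log entry takes a fixed time, assumed here to correspond to Telog. The function name, the parameter names and the externally supplied `t_request` are hypothetical; the default values are the constants listed on the results slide.

```python
def recovery_time(n_logs, hops, t_request,
                  ckp_bytes=2000, log_bytes=50,
                  wired_bps=2_000_000, r=0.1, t_apply_log=0.0001):
    """Sum the four recovery-time components, in seconds.

    n_logs: number of log entries since the last checkpoint
    hops: wired hops between the storing MSS and the recovery MSS (assumed
          equal for the checkpoint and for every log entry)
    t_request: time spent sending recovery requests (supplied by the caller)
    """
    wireless_bps = r * wired_bps
    # (b) checkpoint: wired hops, then one wireless transmission to the MH
    t_ckp = hops * ckp_bytes * 8 / wired_bps + ckp_bytes * 8 / wireless_bps
    # (c) logs: same path for each of the n_logs entries
    t_logs = n_logs * (hops * log_bytes * 8 / wired_bps
                       + log_bytes * 8 / wireless_bps)
    # (d) roll back and apply the logs at the MH
    t_replay = n_logs * t_apply_log
    # (a) request time is added as given
    return t_request + t_ckp + t_logs + t_replay
```

Under these assumptions, with no wired hops and no logs the result is just the wireless checkpoint transmission time of 0.08 s.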
![Page 25: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/25.jpg)
Recovery Cost per failure
The SPN model's underlying Markov model has 2M+1 states. The average recovery time per failure is given by:
The total failure-free operation cost (the time spent on checkpointing and logging before failure) is given by:
The two quantities in this expression are the number of checkpoints before failure and the number of log entries before failure.
![Page 26: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/26.jpg)
Recovery Cost per failure contd.
The total cost of recovery per failure is the weighted sum of the average recovery time and the total time spent on checkpointing and logging per failure, and is given by:
where w1 and w2 are the weights associated with recovery time and failure-free operation cost.
This paper uses w1 = w2 = 0.5 to account for the situation where the total cost is equally proportional to the recovery time and the failure-free operation cost.
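Written out, the weighted sum described on this slide takes the following form. The symbols $C_{total}$, $T_{rec}$ and $C_{ffo}$ are placeholder names chosen here for the total cost, the average recovery time and the failure-free operation cost; they are not necessarily the paper's notation.

```latex
C_{total} = w_1 \, T_{rec} + w_2 \, C_{ffo}, \qquad w_1 = w_2 = 0.5
```

With equal weights the total cost is simply half the sum of the two terms, so neither recovery speed nor failure-free overhead is favoured.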
![Page 27: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/27.jpg)
Recovery Probability
The recovery probability is defined as the probability that recovery time is less than or equal to T
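This definition can be evaluated numerically once a distribution over failure states is available. The sketch below assumes a discrete set of states, each with a probability of being the state at failure and a recovery time from that state; the function name and the input layout are illustrative assumptions.

```python
def recovery_probability(state_probs, recovery_times, deadline):
    """Return P(recovery time <= deadline) for a discrete set of states.

    state_probs[i]: probability that the failure occurs in state i
    recovery_times[i]: recovery time, in seconds, when failing in state i
    """
    return sum(p for p, t in zip(state_probs, recovery_times) if t <= deadline)
```

For instance, if three states have probabilities 0.5, 0.3 and 0.2 and recovery times 0.1 s, 0.2 s and 0.4 s, a deadline of T = 0.25 s covers the first two states.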
![Page 28: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/28.jpg)
Results and Analysis
The SPN model was implemented and analyzed using the SPNP software.
The following parameter values were kept constant in all runs:
- size of a log entry: 50 B
- size of a checkpoint: 2000 B
- bandwidth of the wired network: 2 Mbps
- ratio of wireless to wired bandwidth (r): 0.1
- Telog: 0.0001 s
- Tlog_w: 0.002 s
- Tckp_w: 0.08 s
Model parameters such as mobility rate, log arrival rate, failure rate, and movement threshold were varied across runs.
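The two wireless transmission times are consistent with the listed sizes and bandwidths. Assuming 1 Mbps = 10^6 bits/s, the wireless bandwidth is 0.1 × 2 Mbps = 0.2 Mbps, and dividing each size in bits by that bandwidth gives:

```latex
T_{ckp\_w} = \frac{2000 \times 8\ \text{bits}}{0.1 \times 2 \times 10^{6}\ \text{bits/s}} = 0.08\ \text{s},
\qquad
T_{log\_w} = \frac{50 \times 8\ \text{bits}}{0.1 \times 2 \times 10^{6}\ \text{bits/s}} = 0.002\ \text{s}
```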
![Page 29: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/29.jpg)
Results and Analysis contd…
Recovery Probability vs. Recovery Time.
![Page 30: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/30.jpg)
Results and Analysis contd…
Recovery Probability vs. Log Arrival Rate.
![Page 31: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/31.jpg)
Results and Analysis contd…
Recovery Probability vs. Failure Rate.
![Page 32: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/32.jpg)
Results and Analysis contd…
Recovery Probability vs. Movement Threshold.
![Page 33: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/33.jpg)
Results and Analysis contd…
Recovery Time vs. Movement Threshold.
![Page 34: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/34.jpg)
Results and Analysis contd…
Determining Optimal Movement Threshold that minimizes Recovery Cost Per Failure.
![Page 35: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/35.jpg)
Applicability
Results can be applied in the following manner:
- Build a table at static time (offline) covering the possible parameter values of the mobility rate and failure rate of the MH and the log arrival rate of the mobile applications.
- For each parameter set, list the optimal M value that minimizes the recovery cost per failure.
- At runtime, select the optimal M dynamically based on the measured rates to minimize the recovery cost per failure.
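The table-driven selection could look like the sketch below. The nearest-entry rule, the log-scale distance, the function name `select_threshold` and the table entries are all assumptions made for illustration; the slide does not say how measured rates are matched to table rows, and the M values shown are placeholders, not results from the paper.

```python
import math


def select_threshold(table, mobility_rate, failure_rate, log_arrival_rate):
    """Return the M stored for the table entry nearest to the measured rates.

    table maps (mobility_rate, failure_rate, log_arrival_rate) -> optimal M.
    Distance is measured on a log scale because rates can span several
    orders of magnitude; all rates must be positive.
    """
    measured = (mobility_rate, failure_rate, log_arrival_rate)

    def distance(key):
        return sum(math.log(k / m) ** 2 for k, m in zip(key, measured))

    return table[min(table, key=distance)]


# Placeholder entries for illustration only; not results from the paper.
example_table = {
    (0.01, 0.0001, 1.0): 4,
    (0.10, 0.0001, 1.0): 8,
}
```

A measured mobility rate of 0.08 with the other rates unchanged is closest to the second row, so `select_threshold(example_table, 0.08, 0.0001, 1.0)` picks that row's M.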
![Page 36: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/36.jpg)
Summary
- Implemented a movement-based checkpointing and logging scheme that checkpoints after M movements (mobility handoffs), in contrast to current approaches where checkpoints are taken periodically.
- Developed a performance model based on stochastic Petri nets to identify the optimal M, given the failure, mobility and log arrival rates, that minimizes the cost of recovery per failure.
- Showed the results of the performance analysis and the sensitivity of recoverability to the various parameters.
![Page 37: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/37.jpg)
Future Work
- To analyze and compare the proposed approach with existing approaches, in terms of the gain achieved over the use of constant periodic checkpointing.
- To extend the proposed work to MIPv6 environments.
![Page 38: Presented by Rukmini and Diksha Chauhan Virginia Tech 2 nd May, 2007](https://reader038.vdocument.in/reader038/viewer/2022110103/56814743550346895db4807e/html5/thumbnails/38.jpg)
QUESTIONS ??