Cloud Computing and Architecture
Architectural Tactics
(Tonight’s guest star: Availability)
Quality framework (Bass et al.)
• Central quality attributes
  – Availability
  – Interoperability
  – Modifiability
  – Performance
  – Security
  – Testability
  – Usability
• Other qualities
  – Portability
  – Scalability
  – Variability
  – Flexibility
  – Cost
  – Time to market
  – …
Strongly recommended reading!
A Writing Template
· Source of stimulus. This is some entity (a human, a computer system, or any other actuator) that generated the stimulus.
· Stimulus. The stimulus is a condition that needs to be considered when it arrives at a system.
· Environment. The stimulus occurs within certain conditions. The system may be in an overload condition or may be running when the stimulus occurs, or some other condition may be true.
· Artifact. Some artifact is stimulated. This may be the whole system or some pieces of it.
· Response. The response is the activity undertaken after the arrival of the stimulus.
· Response measure. When the response occurs, it should be measurable in some fashion so that the requirement can be tested.
Example: World of Warcraft
CS@AU Henrik Bærbak Christensen 4
Example: SkyCave
Quality attribute: Availability
Source: Internal to the system
Stimulus: A crash
Artifact: Database server
Environment: Normal operation
Response: Detects the event, records it in the log, continues in normal operation
Response measure: Within 3 seconds
Quality attribute: Performance
Source: 1000 independent clients
Stimulus: Generate on average 2 character events per second
Artifact: SkyCave App server
Environment: Normal operation
Response: Events are processed, cave state is updated
Response measure: With at most 5 seconds latency
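The six-part writing template maps naturally onto a small record type. The following is a minimal Python sketch and not part of the original slides: the class name `Scenario`, its field names, and the `describe` helper are assumptions chosen to mirror the template, and the values are taken from the SkyCave availability scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One quality attribute scenario, following the six-part template."""
    quality_attribute: str
    source: str
    stimulus: str
    artifact: str
    environment: str
    response: str
    response_measure: str

    def describe(self) -> str:
        # Render the scenario as a single readable sentence.
        return (
            f"[{self.quality_attribute}] When {self.source} causes "
            f"'{self.stimulus}' on {self.artifact} during {self.environment}, "
            f"the system responds: {self.response} ({self.response_measure})."
        )


# Values taken from the SkyCave availability scenario.
skycave_availability = Scenario(
    quality_attribute="Availability",
    source="Internal to the system",
    stimulus="A crash",
    artifact="Database server",
    environment="Normal operation",
    response="Detects the event, records it in the log, continues in normal operation",
    response_measure="Within 3 seconds",
)

print(skycave_availability.describe())
```

Keeping the response measure as an explicit field is a reminder that a scenario without a measurable response cannot be tested.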
Tactic
• Tactic
  – A design decision that influences the achievement of a quality attribute response
• Example of modifiability tactic:
  – Encapsulate: Introduce explicit interface to module
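As an illustration of the Encapsulate tactic, here is a minimal Python sketch. All names (`CaveStorage`, `InMemoryStorage`, `rename_room`) are hypothetical and do not come from the slides or from SkyCave; the point is only that client code depends on an explicit interface rather than on a concrete module.

```python
from typing import Protocol


class CaveStorage(Protocol):
    """Hypothetical explicit interface: the only thing clients may depend on."""

    def load(self, key: str) -> str: ...

    def save(self, key: str, value: str) -> None: ...


class InMemoryStorage:
    """One interchangeable implementation hidden behind the interface."""

    def __init__(self) -> None:
        self._data = {}

    def load(self, key: str) -> str:
        return self._data[key]

    def save(self, key: str, value: str) -> None:
        self._data[key] = value


def rename_room(storage: CaveStorage, key: str, new_name: str) -> str:
    # Client code is written against the interface only, so the storage
    # module can be replaced without modifying this function.
    storage.save(key, new_name)
    return storage.load(key)


print(rename_room(InMemoryStorage(), "room-0", "Entrance"))
```

Swapping `InMemoryStorage` for, say, a database-backed class would leave `rename_room` untouched, which is the modifiability gain the tactic is after.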
CloudArch Core Focus
Discussion
• If a system is not available, what is the point of all other QAs?
• Security?
  – Equals slowness
• System quality attributes
  – Availability
  – Modifiability
  – Performance
  – Security
  – Testability
  – Usability
  – Interoperability
  – Scalability
Availability
Definition(s)
• Availability (1): Property of software that it is there and ready to carry out its task when you need it to be
• Availability (2): Ability of a system to mask or repair faults such that the cumulative service outage period does not exceed a required value over a specified time interval
• Nygard: Stability (resilience, longevity): Ability to keep processing for a long time even when there are transient impulses, persistent stresses, or component failures
Measurements
• MTBF: Mean time between failures
• MTTR: Mean time to repair
• But often we talk in percentages!
  – 99%: 3d 15h downtime per year
  – 99.9%: 8h 46m
  – 99.99%: 52m
  – 99.9999%: 32 seconds (!)
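The downtime figures follow directly from the percentage. The sketch below is an illustration added to the slides, assuming a 365-day year; the function names are hypothetical. It also includes the standard steady-state relation availability = MTBF / (MTBF + MTTR), which the slide does not state explicitly.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def yearly_downtime(availability_percent: float) -> timedelta:
    """Allowed downtime per year for a given availability percentage."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)


def availability_from(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability in percent: MTBF / (MTBF + MTTR)."""
    return 100.0 * mtbf_hours / (mtbf_hours + mttr_hours)


for pct in (99.0, 99.9, 99.99, 99.9999):
    print(f"{pct}% -> {yearly_downtime(pct)}")

# Example: one hour of repair for every 999 hours of operation.
print(f"{availability_from(999, 1):.1f}%")
```

Each extra "nine" divides the yearly downtime budget by ten, which is why the last row leaves only about half a minute per year.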
Tactics
• Lots of techniques!
Tactics
• Categories
  – Fault detection
  – Recovery
    • Preparation + Repair
    • Reintroduction
  – Prevention
Detection
• Ping-echo
• Monitor: Nagios, Zabbix, …
• Exceptions
  – Time out
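A ping-echo check combined with a timeout can be sketched in a few lines. This is an in-process illustration only, using queues in place of a network; `echo_component` and `is_alive` are hypothetical names, and a real monitor such as Nagios or Zabbix works quite differently under the hood.

```python
import queue
import threading


def echo_component(pings: queue.Queue, echoes: queue.Queue) -> None:
    """Hypothetical healthy component: answers every ping with an echo."""
    while True:
        message = pings.get()
        if message is None:  # sentinel used to stop the component
            return
        echoes.put(message)


def is_alive(pings: queue.Queue, echoes: queue.Queue, timeout_s: float = 0.5) -> bool:
    """Send one ping and report whether the echo arrived before the timeout."""
    pings.put("ping")
    try:
        return echoes.get(timeout=timeout_s) == "ping"
    except queue.Empty:
        # No echo within the deadline: treat the component as failed.
        return False


# A running component answers the ping.
pings, echoes = queue.Queue(), queue.Queue()
worker = threading.Thread(target=echo_component, args=(pings, echoes), daemon=True)
worker.start()
healthy = is_alive(pings, echoes)
pings.put(None)  # stop the component
worker.join()

# A crashed component (nothing reads the ping queue) never echoes.
crashed = is_alive(queue.Queue(), queue.Queue(), timeout_s=0.1)
print(healthy, crashed)
```

The timeout value is the design knob: too short gives false alarms under load, too long delays detection and eats into the response measure.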
Recover: Prep and Repair
• Active redundancy (hot standby)
  – All receive and process all events
    • Millisecond failover
• Passive redundancy (warm standby)
  – Master-slave
    • Minute failover
• Spare (cold standby)
  – “I think we have an extra machine in the cellar”
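Active redundancy can be illustrated with a toy model in which every replica processes every event, so failover needs no state transfer. The classes `Replica` and `ActiveRedundantService` are hypothetical and heavily simplified; a real system would also need failure detection and agreement between replicas.

```python
class Replica:
    """Hypothetical replica that keeps a running event count as its state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True
        self.events_seen = 0

    def process(self, event: str) -> None:
        if self.alive:
            self.events_seen += 1


class ActiveRedundantService:
    """Active redundancy: every live replica receives and processes every event."""

    def __init__(self, replicas: list) -> None:
        self.replicas = replicas

    def handle(self, event: str) -> None:
        for replica in self.replicas:
            replica.process(event)

    def current(self) -> Replica:
        # Failover is just picking the next live replica; its state is
        # already up to date because it processed the same events.
        for replica in self.replicas:
            if replica.alive:
                return replica
        raise RuntimeError("no live replica left")


service = ActiveRedundantService([Replica("a"), Replica("b")])
service.handle("move north")
service.replicas[0].alive = False  # simulate a crash of the first replica
service.handle("move east")
survivor = service.current()
print(survivor.name, survivor.events_seen)
```

With passive redundancy the standby would instead receive periodic state updates from the master, and with a cold spare it would start from nothing, which is where the longer failover times come from.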
Recover: Prep and Repair
• Exceptions
• Rollback
  – Used in DB and [exercise: where else?]
  – Checkpointing
• Retry
• Degradation
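Retry and degradation often appear together: try again a few times, then fall back to reduced service. The following is a minimal sketch under stated assumptions; `call_with_retry` and `FlakyService` are hypothetical names, and the linear backoff and the cached fallback value are illustrative choices, not something prescribed by the slides.

```python
import time


def call_with_retry(operation, attempts: int = 3, delay_s: float = 0.01, fallback=None):
    """Retry a failing operation; degrade to a fallback value if it keeps failing."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt < attempts:
                time.sleep(delay_s * attempt)  # simple linear backoff
    # Degradation: answer with reduced service instead of failing outright.
    return fallback


class FlakyService:
    """Hypothetical remote service that fails a fixed number of times first."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success

    def fetch(self) -> str:
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("transient failure")
        return "fresh data"


recovered = call_with_retry(FlakyService(2).fetch, fallback="cached data")
degraded = call_with_retry(FlakyService(5).fetch, fallback="cached data")
print(recovered, "|", degraded)
```

Bounding the number of attempts matters: unbounded retries against a struggling service tend to make the outage worse.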
Which Nygard patterns?
Recover: Reintroduction
• Shadow
  – Run in shadow mode until ‘up-to-speed’
• State Resync
  – Typical DB behaviour
    • Cold slaves must catch up with primary
  – EcoSense DB war story: stale DB
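State resynchronisation can be pictured as replaying the part of an ordered log that a returning replica missed. This is a toy sketch, not how any particular database implements it; `Node`, `resync`, and the sample log entries are hypothetical.

```python
class Node:
    """Hypothetical key-value node that applies entries from an ordered log."""

    def __init__(self) -> None:
        self.state = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, entry: tuple) -> None:
        key, value = entry
        self.state[key] = value
        self.applied += 1


def resync(replica: Node, primary_log: list) -> int:
    """Replay the log entries the replica missed; return how many were applied."""
    missed = primary_log[replica.applied:]
    for entry in missed:
        replica.apply(entry)
    return len(missed)


primary, replica = Node(), Node()
log = [("room", "entrance"), ("player", "alice"), ("room", "hall")]
for entry in log:
    primary.apply(entry)
replica.apply(log[0])  # the replica went down after the first entry

caught_up = resync(replica, log)
print(caught_up, replica.state == primary.state)
```

Until the replay finishes the replica holds stale data, so it should not serve reads that require the current state.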
Preventing
• Removal from service
  – ‘Scrubbing’
  – It used to be that a Tomcat server would respawn every 12 hours
    • Easiest way to fix the numerous memory leaks!
• Transactions
  – ACID guarantees
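The atomicity part of ACID is easy to demonstrate with Python's built-in `sqlite3` module. The table, the account names, and the `transfer` function below are hypothetical; the sketch only shows that a failing transaction leaves no partial update behind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)
```

The credit to `bob` is executed first on purpose: when the debit then violates the check constraint, the rollback undoes the credit as well, so the fault never becomes visible as inconsistent state.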
Summary
• All things bad can and will happen to real systems having real users operating in the real world!
• Your systems should strive for high availability and graceful degradation
  – If you want to keep your customers!
• The architectural tool box is big!