
TechEd 2002 © 2002 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.

Measuring/Estimating System Reliability and Performance
Box Leangsuksun
Computer Science
Center for Entrepreneurship and Information Technology
Louisiana Tech University

Introduction
• Non-functional requirements are equally, if not more, important
• Why?
  • The world is impatient
  • Addressing them up front is more cost-effective than retrofitting
  • Efficiency, inconvenience, life-threatening failures
  • Loss of money and/or opportunities
  • Etc.


Why? Goals
• Compare alternatives
• Determine impacts (per feature)
• System tuning
• Quantify relative reliability/availability/performance
• Debugging
• Set expectations

How to measure or estimate
• Measurements
• Simulations
• Analytical modeling


Measurements
• Actual system construction
• Create a workload per the requirements
• Provides the best results
• Inherently difficult and inflexible
• Almost impossible to use for what-if analysis

Measurements (continued)
• Measure system or subsystem performance with tools
  • gprof
  • top, ps, etc.
  • Benchmark programs (e.g. LINPACK, SPECmark, WinMark)
• What about reliability measurement? Logs, traces, outages.
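As a rough illustration of measurement-based reliability, the sketch below turns a log of outage intervals into availability, MTTF, and MTTR estimates. The log format, the function name `summarize_outages`, and the sample numbers are hypothetical; MTTF is approximated here as total uptime divided by the number of failures.

```python
def summarize_outages(window_hours, outages):
    """Estimate dependability figures from (start, end) outage intervals in hours.

    Assumes outages do not overlap and all lie inside the observation window.
    """
    downtime = sum(end - start for start, end in outages)
    uptime = window_hours - downtime
    failures = len(outages)
    return {
        "availability": uptime / window_hours,
        "mttf_hours": uptime / failures if failures else float("inf"),
        "mttr_hours": downtime / failures if failures else 0.0,
    }

# Hypothetical 30-day window (720 h) with three logged outages.
log = [(100.0, 101.5), (350.0, 350.5), (600.0, 604.0)]
stats = summarize_outages(720.0, log)
print(stats)
```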


Simulation
• A program that simulates the important characteristics of the target system
• Flexible and easy to modify
• Good for what-if analysis
• Difficult to model every small detail
• Popular: cost-effective and flexible
• Suffers on fine detail
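A minimal sketch of what such a simulation can look like, assuming a toy system in which each memory access hits a cache with a fixed probability. The function name, the timings, and the hit probabilities are made-up values for illustration only.

```python
import random

def simulate_avg_access_time(hit_prob, t_cache, t_mem, accesses=100_000, seed=1):
    """Monte Carlo estimate of the mean access time for a simple cache model."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(accesses):
        # Each access hits the cache with probability hit_prob.
        total += t_cache if rng.random() < hit_prob else t_mem
    return total / accesses

# What-if: vary the hit probability and watch the mean access time change.
for h in (0.80, 0.90, 0.95):
    print(h, round(simulate_avg_access_time(h, t_cache=2.0, t_mem=100.0), 2))
```

Changing a parameter and rerunning is all a what-if study needs here, which is the flexibility the slide refers to.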

Analytical Modeling
• Mathematical description of the system
• Provides quick insight
  • Helps guide detailed simulation or measurement-based studies
• Results are much less believable or accurate
• Example
  • H = cache hit probability, Tm = memory access time, Tc = cache access time
  • T_avg = H * Tc + (1 - H) * Tm
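The cache formula above can be evaluated directly. This is a small sketch; the 2 ns and 100 ns timings are hypothetical values chosen only to show how strongly the hit probability drives the result.

```python
def average_access_time(hit_prob, t_cache, t_mem):
    """T_avg = H * Tc + (1 - H) * Tm."""
    if not 0.0 <= hit_prob <= 1.0:
        raise ValueError("hit_prob must be in [0, 1]")
    return hit_prob * t_cache + (1.0 - hit_prob) * t_mem

# Hypothetical timings: 2 ns cache, 100 ns memory.
for h in (0.80, 0.90, 0.95):
    print(f"H={h:.2f}  T_avg={average_access_time(h, 2.0, 100.0):.1f} ns")
```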


Comparison (Lilja's book)

Factor          Measurement   Simulation   Analytical Modeling
Accuracy        High          Medium       Low
Believability   High          Medium       Low
Cost            High          Medium       Low
Flexibility     Low           High         High

Dependability Estimation/Measurement
• Uses the same three techniques described above
• Two measures
  • Availability (ratio of uptime to total time)
  • Reliability (MTTF)
• Analytical modeling
  • Non-state-space
  • State-space


Why Dependability measures?
• Comparisons with cost and performance
• A proper focus for product-improvement efforts
• Consideration of safety and risk issues

Dependability Modeling
• Includes reliability modeling and availability modeling
• A designed system can be shown to meet its performance and dependability requirements
• Provides a good mechanism for examining the behavior of a system, from the design stage through implementation and final deployment


Dependability
• Two measures
  • Reliability (MTTF)
  • Availability (ratio of uptime to total time)

Reliability
• Definition: The reliability R(t) of a system at time t is the probability that system failure has not occurred in the interval [0, t). If X is a random variable that represents the time to occurrence of system failure, then R(t) = P(X > t).
• Unreliability = 1 - R(t)
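To make R(t) concrete, the sketch below assumes an exponentially distributed time to failure with a constant failure rate, which gives R(t) = exp(-rate * t). The exponential model and the failure rate are assumptions for illustration; the slides do not prescribe a distribution.

```python
import math

def reliability(t, failure_rate):
    """R(t) = P(X > t) for an exponentially distributed time to failure."""
    return math.exp(-failure_rate * t)

def unreliability(t, failure_rate):
    """1 - R(t): probability that failure has occurred by time t."""
    return 1.0 - reliability(t, failure_rate)

# Hypothetical component with one failure per 1,000 hours on average.
rate = 1.0 / 1000.0
for hours in (0, 100, 1000):
    print(hours, round(reliability(hours, rate), 4))
```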


Reliability
• Definition: The MTTF of a system is the expected time until the occurrence of the (first) system failure. If X is a random variable that represents the time to occurrence of system failure, then MTTF = E[X].
• Given the system reliability R(t), the MTTF can be computed as

  MTTF = ∫ R(t) dt, integrated over t from 0 to ∞
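The integral can be checked numerically. This sketch integrates R(t) with the trapezoid rule over a long but finite horizon, again assuming an exponential R(t) for which the exact MTTF is 1/rate. The function name, the horizon, and the rate are hypothetical choices.

```python
import math

def mttf_from_reliability(reliability_fn, horizon, steps=100_000):
    """Approximate MTTF = integral of R(t) dt over [0, horizon] (trapezoid rule)."""
    dt = horizon / steps
    total = 0.5 * (reliability_fn(0.0) + reliability_fn(horizon))
    for i in range(1, steps):
        total += reliability_fn(i * dt)
    return total * dt

# Assumed exponential model with failure rate 0.001 per hour; exact MTTF is 1/rate.
rate = 0.001
estimate = mttf_from_reliability(lambda t: math.exp(-rate * t), horizon=20_000.0)
print(round(estimate, 1))
```

The horizon has to be long enough that R(t) is negligible beyond it; otherwise the estimate comes out too low.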

Availability
• A measure expressed as the ratio of uptime to total time
• High availability: the ability of a system to perform its function continuously (without interruption) for a significantly longer period of time than the reliabilities of its individual components would suggest
• High availability is most often achieved through fault tolerance


Degree of Availability

System Type              Unavailability (minutes/year)   Availability (percent)   Availability Class
Unmanaged                50,000                          90                       1
Managed                  5,000                           99                       2
Well-managed             500                             99.9                     3
Fault-tolerant           50                              99.99                    4
High Availability        5                               99.999                   5
Very High Availability   0.5                             99.9999                  6
Ultra Availability       0.05                            99.99999                 7
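The table's columns are related by simple arithmetic, sketched below. A year has 525,600 minutes, so the unavailability figures in the table are order-of-magnitude roundings (99% works out to about 5,256 minutes). The function names are hypothetical, and the class is computed as the number of leading nines.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years

def downtime_minutes_per_year(availability_percent):
    """Expected unavailable minutes per year at a given availability."""
    return (1.0 - availability_percent / 100.0) * MINUTES_PER_YEAR

def availability_class(availability_percent):
    """Number of leading nines, e.g. 99.9 -> 3; input must be below 100."""
    unavailability = 1.0 - availability_percent / 100.0
    return round(-math.log10(unavailability))

for pct in (90, 99, 99.9, 99.99, 99.999):
    print(pct, round(downtime_minutes_per_year(pct), 1), availability_class(pct))
```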

Availability
• Definition: The availability A(t) of a system at time t is the probability that the system is functioning correctly at time t.
• As with reliability, in some applications it is more convenient to compute the system unavailability U(t) = 1 - A(t).
• Steady-state availability: A_steady = lim A(t) as t → ∞
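One way to see the limit is a two-state (up/down) Markov model with constant failure and repair rates, for which A(t) has a closed form and the steady-state value equals MTTF / (MTTF + MTTR). The constant-rate assumption and the numbers below are illustrative, not taken from the slides.

```python
import math

def instantaneous_availability(t, failure_rate, repair_rate):
    """A(t) for a two-state (up/down) Markov model that starts in the up state."""
    total = failure_rate + repair_rate
    return repair_rate / total + (failure_rate / total) * math.exp(-total * t)

def steady_state_availability(mttf, mttr):
    """Limit of A(t) as t grows: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)

# Hypothetical server: MTTF = 1,000 h, MTTR = 4 h.
lam, mu = 1.0 / 1000.0, 1.0 / 4.0
for t in (0.0, 1.0, 10.0, 100.0):
    print(t, round(instantaneous_availability(t, lam, mu), 6))
print("steady state:", round(steady_state_availability(1000.0, 4.0), 6))
```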


Modeling Techniques
• Non-state-space
  • Fault tree
  • Reliability block diagram
• State-space
  • Continuous-time Markov chain
  • Stochastic Petri net
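As a small non-state-space example, a reliability block diagram combines component availabilities with series and parallel rules, assuming components fail independently. The cluster layout and the component values below are hypothetical.

```python
from functools import reduce

def series(*availabilities):
    """All blocks must work: multiply the individual availabilities."""
    return reduce(lambda acc, a: acc * a, availabilities, 1.0)

def parallel(*availabilities):
    """At least one block must work: 1 minus the product of unavailabilities."""
    return 1.0 - reduce(lambda acc, a: acc * (1.0 - a), availabilities, 1.0)

# Hypothetical cluster: two redundant head servers in front of one switch.
server, switch = 0.99, 0.999
single_head = series(server, switch)
dual_head = series(parallel(server, server), switch)
print(round(single_head, 5), round(dual_head, 5))
```

State-space models become necessary when the independence assumption breaks down, for example when repair, failover, and failback interact.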

Example of system
[Figure: example system]


Fault Tree
[Figure: fault tree]

Availability Model
[Figure: availability model versus HA-OSCAR dual head model; timelines for S1, S2, and S1&S2 showing "server up" and "server down & repair" periods; axis: time]


HA-OSCAR SRN model
• Server sub-model
• Switches
• Compute nodes

Server Sub Model
• P server up
• P server down
• Failover
• P server repair
• Failback
• S is up and ready
• S takes control
• S server down
• S repair


Compute node sub model

Switch sub model

Instantaneous Availability
• HA-OSCAR steady-state availability A = 99.993% (about 36 minutes of downtime per year)
• Beowulf availability A = 99.65% (about 30 hours of downtime per year)


Stochastic Petri Net Package
• Research and development from Duke University
• Very popular
• Petri net based dependability analysis

Performance
• Computation
  • CPU
  • Memory
  • I/O, etc.
• Communication
  • Latency
  • Bandwidth
• Transaction
  • Possibly involves more than the DB


Some Criteria
• Throughput – number of completed requests per time unit
• Response time – amount of time from when a request is submitted until the first response is produced (not the complete output)
• CPU utilization – keep the CPU as busy as possible
• Turnaround time – amount of time to execute a particular request (finishing time – arrival time)
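Three of these criteria can be computed from a simple request trace, as sketched below. The trace format (arrival, first response, finish) and the sample times are hypothetical; CPU utilization is left out because it needs processor accounting data rather than request timestamps.

```python
def summarize(requests):
    """Compute the criteria above from (arrival, first_response, finish) times."""
    n = len(requests)
    start = min(arrival for arrival, _, _ in requests)
    end = max(finish for _, _, finish in requests)
    return {
        "throughput": n / (end - start),  # completed requests per time unit
        "mean_response": sum(first - arrival for arrival, first, _ in requests) / n,
        "mean_turnaround": sum(finish - arrival for arrival, _, finish in requests) / n,
    }

# Hypothetical trace, times in seconds.
trace = [(0.0, 0.5, 2.0), (1.0, 1.25, 3.0), (2.0, 3.0, 4.0)]
metrics = summarize(trace)
print(metrics)
```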

Performance issue discovery phase
[Figure: project timeline with phases Requirement, Architecture/design, Development/code, and Test, followed by Re-design, code, re-test]
• Telecom industry architecture reviews: one third of the issues are related to performance


Performance Measures
• Modeling
• Simulation
• Measurement

Analytical Modeling
• Example for memory
  • H = cache hit probability, Tm = memory access time, Tc = cache access time
  • T_avg = H * Tc + (1 - H) * Tm
• Example of operation/transaction modeling
  • Browsing an order (Tb) + submitting an order (Ts)
  • 90% vs. 10% (volume)
  • Weight 20% vs. 80% order
  • Order = 50 instructions + 10 mem


Performance Engineering
• Understand requirements and growth
• Should begin at the planning and architecture stage
• Resource needs and budget
• Use quantitative methods to gauge the goals (and eliminate root causes)
• Estimate
• Tracking and management
• Measurement
• Tuning

PE (continued)
• Poor performance reflects negatively on the product
• Costly to retrofit
  • Re-architecting
  • Adding more hardware
• Highly tuned code costs more in maintenance


Key PE activities
• Predict: requirement, architecture/analysis, budget
• Track
• Measure
• Correct

Key approach*
• Bound performance to an acceptable level (based on requirements)
• Targets are quantitative requirements that define the acceptance criteria
• Budgets are the performance goals allocated across all of the architecture components, all of which must be met in order to meet the overall targets
• Estimates are design component goals derived from experience or from previous measurements of existing components

* These definitions are excerpted from an AT&T performance engineering course and are used for educational purposes only.


Estimate -> How well can the system perform?

Budget -> How well must the system perform?
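The estimate-versus-budget comparison can be tracked with something as simple as the sketch below. Component names and millisecond figures are hypothetical, and treating the overall target as the sum of the component budgets assumes the components run one after another.

```python
def check_budgets(budgets_ms, estimates_ms):
    """Return components whose estimate exceeds the budget, with the overrun in ms."""
    return {
        name: estimates_ms[name] - budget
        for name, budget in budgets_ms.items()
        if estimates_ms[name] > budget
    }

# Hypothetical per-component response-time figures in milliseconds.
budgets = {"web": 50, "app": 120, "db": 80}
estimates = {"web": 35, "app": 150, "db": 80}

print("over budget:", check_budgets(budgets, estimates))
print("target met:", sum(estimates.values()) <= sum(budgets.values()))
```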

Performance Engineering Life Cycle
[Figure: life cycle diagram with labels Architecture, Design, M1 to M7, Spread Sheet, Budget, Initial Performance Model, Measurement, Test]


How to start (Target)
• Seek the boundary (requirement)
• Start with a back-of-the-envelope calculation
  • Ballpark figures (e.g. number of transactions, normal and at peak)
  • Don't get hung up on precision early
  • E.g. how much water flows out?
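A back-of-the-envelope target can be a few lines of arithmetic, as in this sketch. The daily volume, the number of busy hours, and the peak factor are invented ballpark inputs, and the function name is hypothetical.

```python
def peak_rate_per_second(daily_transactions, busy_hours, peak_factor):
    """Rough peak transactions/second from a daily volume.

    Spreads the volume over the busy hours, then scales by a peak factor.
    """
    average = daily_transactions / (busy_hours * 3600)
    return average * peak_factor

# Hypothetical ballpark: 1.2 million orders/day over 10 busy hours, 3x peak.
print(round(peak_rate_per_second(1_200_000, busy_hours=10, peak_factor=3)))
```

Precision is deliberately low at this stage; the result only needs to be good enough to bound the target.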

Budget
• Target or educated guess
• Iterative process
• Start from subsystems and then work down to modules
• Budgeted resource items for each process/module/subsystem
  • CPU, memory, disk I/O, network bandwidth


Budget types
• Concurrency: percentage of resource allocation
• Sequential: wall-clock time
• Example of a budgeted response time for a transaction
  • T_trans = T_cpu + (1 - Cmem) * T_disk + T_network
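The transaction formula above can be evaluated as follows. The slide does not define Cmem; this sketch assumes it is the fraction of requests satisfied from memory, so that only the remainder pays the disk time. All timings are hypothetical.

```python
def transaction_time(t_cpu, t_disk, t_network, c_mem):
    """T_trans = T_cpu + (1 - C_mem) * T_disk + T_network (all in ms)."""
    if not 0.0 <= c_mem <= 1.0:
        raise ValueError("c_mem must be in [0, 1]")
    return t_cpu + (1.0 - c_mem) * t_disk + t_network

# Hypothetical budget: 5 ms CPU, 10 ms disk, 20 ms network, 75% memory hits.
print(transaction_time(t_cpu=5.0, t_disk=10.0, t_network=20.0, c_mem=0.75))
```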

Exercises
• See the handouts in the class