sfeldman performance bb_worldemea07

Principles of Performance Engineering Steve Feldman, Director Performance Engineering and Architecture [email protected]

Upload: steve-feldman

Post on 27-Jan-2015


TRANSCRIPT

Page 1: Sfeldman performance bb_worldemea07

Principles of Performance Engineering

Steve Feldman, Director Performance Engineering and Architecture [email protected]

Page 2: Sfeldman performance bb_worldemea07

Agenda

• Questions of the Mind…
• PE @ Bb
• Process and Methodology
• What projects are coming out of the PerfEng Lab in ’07
• Links

Page 3: Sfeldman performance bb_worldemea07

Part 1: Questions of the Mind

Page 4: Sfeldman performance bb_worldemea07

Questions of the Mind…

• What is Performance?
• What is Scalability?
• What is Performance Engineering?
• Why does Blackboard invest in Performance Engineering?

Page 5: Sfeldman performance bb_worldemea07

What is Performance?

• Performance = Response Times
• Response Times affect the User Experience.
• If the User Experience is acceptable, Abandonment becomes less likely, which can positively affect Adoption.
• As Adoption increases, so does the need for Scalability.

Page 6: Sfeldman performance bb_worldemea07

What is Scalability?

• Scalability = Heavy Adoption/Usage
– If Heavy Adoption = Positive User Experience
• If Positive User Experience = Acceptable Response Times
• Patterns of predictable usage
– Present Population
– User Behavior
• Unpredictable Behavior and New Populations
– Make the unpredictable predictable.
• Changes in Adoption Growth Behavior

Page 7: Sfeldman performance bb_worldemea07

What is Performance Engineering?

• Engineering discipline with a primary focus on performance and scalability.

• Combination of Software and System activities with intent of improving Performance and Scalability.

• Performance Design, Performance Development, Performance Verification, Performance Benchmarking

Page 8: Sfeldman performance bb_worldemea07

Why does Blackboard invest in Performance Engineering?

• Some of the world’s largest application deployments outside of commercial portals and e-commerce sites are Blackboard.
• Desire to add more applications, sub-systems and features to drive adoption and increase growth.
• Differentiate Blackboard from other LMS/CMS ISVs and the Open Source Community.

Page 9: Sfeldman performance bb_worldemea07

Part 2: PE @ Bb

Page 10: Sfeldman performance bb_worldemea07

PerfEng at Bb…

• Triangle of Priorities
– Performance: Response Times (Apdex)
– Scalability: High Session Activity with Low Abandonment (PAR)
– User Experience: High Performance and Heavy Integration (Reference Architecture)
• Investment and Relationships
– Software Tool Set
– Lab Sponsors

Page 11: Sfeldman performance bb_worldemea07

Triangle of Priorities

Page 12: Sfeldman performance bb_worldemea07

PerfEng at Bb…

End to End Performance Integration in the Blackboard Software Development Lifecycle

[Figure: lifecycle diagram. Stage labels: Requirement Development, Design, Develop, Functional Testing, Integrated Testing, Regression Testing, Certification, Performance Benchmarking, General Availability.]

Performance activities listed on the diagram:

- Assess Performance Risk
- Mitigate Performance Risk
- Identify Critical Use Cases for Analysis

- Performance Workbooks
- Review Technical Design Document
- Reference acceptable design patterns
- Warn about unacceptable anti-patterns
- Model/Prototype

- Baseline as functionality can be tested
- Profile for inefficient calls/executions
- Identify scalability issues in time to refactor

- Performance Verification
- Focus Verification
- QA Cyclical Requirements

- High-Watermark Load Testing
- Common Scenario Load Testing
- Conditional Scenario Load Testing
- Java Performance Scenarios

- Platform Configurations
- Advanced Configurations
- Partner Benchmarking (Vendor Kits)

- Sizing and Capacity Guidance

Page 13: Sfeldman performance bb_worldemea07

Performance: Response Times (Apdex)

• Every transaction in the application is defined as a state, an action state or an action.

• Each transaction is assigned a ranking: critical, essential or trivial.

• Each transaction is assigned an abandonment policy: low, medium, high or very high.
– The abandonment policies represent the (4) dimensions of Apdex.

Page 14: Sfeldman performance bb_worldemea07

State/Action Modeling

• State: A condition or point of reference within a sub-system in which an actor has the option to move to a sub-state, perform an action or move to a super-state.
– States are often considered navigation items easily identified by a bread crumb.
– States can also be considered pages if and only if a sub-state or action is branched from the state.
• Example: Discussion Board Forum List Widget (State of the Discussion Board sub-system)

Page 15: Sfeldman performance bb_worldemea07

State/Action Modeling

• Action State: The navigation to an action.
– For example, an instructor wants to create a message for a topic. The instructor selects Add Message and is brought to a page that requires the user to input information, followed by a submit to actually create the message.

Page 16: Sfeldman performance bb_worldemea07

State/Action Modeling

• Action: An actor-driven process that occurs within a state.
– Actions occur when an actor cannot move into a sub-state.
– Most often associated with a use case.
• Example: Replying to a thread within a Discussion Board Message.
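The state / action state / action vocabulary, together with the ranking and abandonment policy assigned to each transaction, can be captured in a small catalog. The following is a minimal sketch only: the class names, the `CATALOG` entries and the rankings and policies assigned to them are illustrative assumptions, not Blackboard's actual transaction inventory.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    STATE = "state"
    ACTION_STATE = "action state"
    ACTION = "action"


class Ranking(Enum):
    CRITICAL = "critical"
    ESSENTIAL = "essential"
    TRIVIAL = "trivial"


class Abandonment(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    VERY_HIGH = "very high"


@dataclass(frozen=True)
class Transaction:
    name: str
    kind: Kind
    ranking: Ranking
    abandonment: Abandonment


# Hypothetical entries modeled on the Discussion Board examples;
# the ranking and abandonment values are assumptions for illustration.
CATALOG = [
    Transaction("Discussion Board Forum List", Kind.STATE,
                Ranking.ESSENTIAL, Abandonment.LOW),
    Transaction("Add Message", Kind.ACTION_STATE,
                Ranking.ESSENTIAL, Abandonment.MEDIUM),
    Transaction("Reply to Thread", Kind.ACTION,
                Ranking.CRITICAL, Abandonment.MEDIUM),
]


def by_ranking(catalog, ranking):
    """Return the names of all transactions with the given ranking."""
    return [t.name for t in catalog if t.ranking is ranking]
```

A catalog like this makes it straightforward to pull out, for example, only the critical transactions when choosing scenarios to measure.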

Page 17: Sfeldman performance bb_worldemea07

What is Apdex (Application Performance Index)?

• What is Apdex?
– Apdex is an open standard developed by an alliance of companies that defines a standardized method to report, benchmark, and track application performance.
• http://www.apdex.org

Page 18: Sfeldman performance bb_worldemea07

Apdex and User Abandonment

• Bb defines (4) Apdex Dimensions:
– Low (2s-8s)
– Medium (5s-20s)
– High (8s-32s)
– Very High (12s-48s)
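As a rough illustration, the standard Apdex formula counts samples at or below a target time T as satisfied, samples between T and a frustration time F as tolerating (half credit), and everything slower as frustrated. The sketch below assumes that each range on the slide gives T and F for that dimension (note that each upper bound is four times the lower, matching Apdex's usual F = 4T); that reading, and the names `DIMENSIONS` and `apdex`, are assumptions.

```python
# (T, F) thresholds in seconds per dimension, read from the slide's ranges.
# Interpreting each range as (T, F) is an assumption.
DIMENSIONS = {
    "low": (2.0, 8.0),
    "medium": (5.0, 20.0),
    "high": (8.0, 32.0),
    "very high": (12.0, 48.0),
}


def apdex(response_times, dimension):
    """Apdex score: (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one response time sample")
    t, f = DIMENSIONS[dimension]
    satisfied = sum(1 for r in response_times if r <= t)
    tolerating = sum(1 for r in response_times if t < r <= f)
    return (satisfied + tolerating / 2) / len(response_times)


# Hypothetical response times in seconds for one transaction.
samples = [1.0, 1.5, 3.0, 7.0, 9.0]
print(apdex(samples, "low"))     # 2 satisfied, 2 tolerating, 1 frustrated
print(apdex(samples, "medium"))  # 3 satisfied, 2 tolerating
```

The same samples score higher under a more lenient dimension, which is why the abandonment policy assigned to a transaction matters when interpreting its score.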

Page 19: Sfeldman performance bb_worldemea07

Apdex and Workload Variations

• Apdex scores are taken across each data model variation.
• Expect scores of 85% or higher for under-loaded systems.
• During Verification Testing and Benchmarks, expect scores of 75% or higher.
• If scores fall below the acceptable level, move to instrumentation and profiling (Method-R).
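The score targets can be applied as a simple gate over the per-variation scores. This is a sketch under assumptions: the phase keys, the variation names and the function name are hypothetical, and scores are expressed as fractions (0.85 for 85%).

```python
# Targets from the slide, as fractions. The phase keys are assumed names.
TARGETS = {"under-loaded": 0.85, "verification": 0.75}


def variations_to_profile(scores, phase):
    """Return the data model variations whose Apdex score misses the target.

    These are the candidates for instrumentation and profiling.
    """
    target = TARGETS[phase]
    return sorted(name for name, score in scores.items() if score < target)


# Hypothetical scores for three data model variations.
scores = {"small": 0.91, "medium": 0.80, "large": 0.71}
print(variations_to_profile(scores, "verification"))
print(variations_to_profile(scores, "under-loaded"))
```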

Page 20: Sfeldman performance bb_worldemea07

Scalability: High Session Activity with Low Abandonment (PAR)

• What is a PAR?– Performance Archetype Ratio– Scoring method to determine resource requirements of a deployment based on given system workload.

• Any component of the deployment can have a PAR score.

Page 21: Sfeldman performance bb_worldemea07

Scalability: High Session Activity with Low Abandonment (PAR)

[Figure: CPU Resource Utilization (y-axis) plotted against Iterations (x-axis), with the Resource Utilization Threshold Line and the Optimal Workload marked.]

Page 22: Sfeldman performance bb_worldemea07

PAR Process: Step 1 Calibration

• Calibrate Workloads with User Abandonment
– Peak of Concurrency (POC): The virtual user workload in which response times are acceptable and the highest volume of virtual users is part of the scenario.
– Level of Concurrency (LOC): The virtual user workload in which response times are acceptable and the steadiest volume of virtual users is participating in the scenario.
– Average Concurrency: The average of the POC and LOC workloads.
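A minimal sketch of the calibration arithmetic follows. It assumes that "steadiest volume" can be approximated by the most frequently observed virtual user count among samples with acceptable response times; that interpretation, the sample data and the function name are assumptions rather than the lab's documented procedure.

```python
from statistics import mode


def calibrate(samples, max_response_s):
    """Derive POC, LOC and Average Concurrency from (virtual_users, response_s) samples."""
    acceptable = [users for users, resp in samples if resp <= max_response_s]
    if not acceptable:
        raise ValueError("no samples with acceptable response times")
    poc = max(acceptable)   # highest acceptable virtual user volume
    loc = mode(acceptable)  # assumed proxy for the steadiest volume
    return {"poc": poc, "loc": loc, "average": (poc + loc) / 2}


# Hypothetical samples from an abandonment-enabled run.
run = [(100, 2.1), (150, 2.4), (150, 2.6), (150, 2.5), (200, 3.9), (250, 6.5)]
print(calibrate(run, max_response_s=5.0))
```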

Page 23: Sfeldman performance bb_worldemea07

PAR Process: Step 2 App. Saturation

• Take workloads from abandonment run and disable abandonment.
– Run based on Peak of Concurrency workload (Abandonment Disabled)
– Run based on Level of Concurrency workload (Abandonment Disabled)
– Run based on Average of Concurrency workload (Abandonment Disabled)
• Example Metrics
– Response times consistently lower than ~5 seconds
– Application CPU saturation close to X > 90% where X = CPU utilization + (1) Standard Deviation of the CPU Utilization
– Total Sessions
– Total Transactions
– Application Server Hits Per Second
– Database CPU saturation
• Strategies
– Clustering and Virtualization

Page 24: Sfeldman performance bb_worldemea07

PAR Process: Step 3 DB Saturation

• Multiply the workload from Step 2 across identical application servers.
– Typically want 90% CPU utilization and sub-5 second response times.
• Example Metrics
– Database CPU saturation close to X > 80% where X = CPU utilization + (1) Standard Deviation of the CPU Utilization
– Memory Utilization
– Database Shadow Processes
– I/O operations per second
• Strategies
– Increase CPU speed and count
– Optimize storage configuration
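Steps 2 and 3 both define saturation through X = CPU utilization + (1) standard deviation of the CPU utilization. The sketch below assumes "CPU utilization" means the mean of the sampled utilization and uses the sample standard deviation; the slides do not say which estimator is used, so both choices are assumptions.

```python
from statistics import mean, stdev


def saturation_metric(cpu_samples):
    """X = mean CPU utilization + one sample standard deviation (percent)."""
    return mean(cpu_samples) + stdev(cpu_samples)


def is_saturated(cpu_samples, threshold_pct):
    """True when X exceeds the threshold (90 for app servers, 80 for the database)."""
    return saturation_metric(cpu_samples) > threshold_pct


# Hypothetical CPU utilization samples (percent) from a saturation run.
app_cpu = [84, 88, 90, 86, 92]
print(round(saturation_metric(app_cpu), 2))
print(is_saturated(app_cpu, 90.0))
```

Adding one standard deviation to the mean makes the check sensitive to bursty utilization: a server averaging below the threshold can still count as saturated if its utilization swings widely.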

Page 25: Sfeldman performance bb_worldemea07

PAR Process: Additional Steps

• Hypothesis and Proof
– An essential part of this process is determining theoretical performance.
– Understand whether performance is linear, sub-linear or super-linear.
– Simulate to determine actual performance.
• PARs can be gathered for other peripherals such as Load-Balancers, Storage Sub-Systems, Memory, CPUs, etc…
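One way to compare the hypothesis with the simulation is to extrapolate a linear theoretical throughput from a baseline and classify the measured result against it. This is a sketch only: the 5% tolerance band, the parameter names and the sample numbers are assumptions.

```python
def classify_scaling(base_throughput, base_units, actual_throughput, units,
                     tolerance=0.05):
    """Classify measured scaling against a linear extrapolation of the baseline.

    'units' is whatever is being multiplied, e.g. application servers or CPUs.
    """
    theoretical = base_throughput * units / base_units
    ratio = actual_throughput / theoretical
    if ratio > 1 + tolerance:
        return "super-linear"
    if ratio < 1 - tolerance:
        return "sub-linear"
    return "linear"


# Hypothetical: one app server sustains 100 hits/s; two sustain 170 hits/s.
print(classify_scaling(100, 1, 170, 2))
```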

Page 26: Sfeldman performance bb_worldemea07

User Experience: High Performance and Heavy Integration (Reference Architecture)

• Insert Visio Here

Page 27: Sfeldman performance bb_worldemea07

Investment and Relationships

• PerfEng Team (10 Team Members)
– Combined both teams as part of the merger and increased head count.
• Software Tools
– Mercury LoadRunner
– Quest Product Suite
– Homegrown Tools: Simulation, Log Parsing, Modeling and Sampling
– Hotsos Oracle Profiler

Page 28: Sfeldman performance bb_worldemea07

Investment and Relationships

• Performance Lab Sponsors
– Dell: Servers and Remote Lab
– Sun: Servers, Storage and Remote Lab
– Intel: Servers
– Coradiant: TrueSight Device
– Quest: All Software Products
– NetApp: Storage

Page 29: Sfeldman performance bb_worldemea07

Part 3: Process and Methodology

Page 30: Sfeldman performance bb_worldemea07

Process and Methodology

Page 31: Sfeldman performance bb_worldemea07

SPE Overview…

• Assess Performance Risk: Assess the performance risk at the outset of the project (during Requirements). Identify, qualify and mitigate: Rapid Cognition. Factors affecting risk: http://lightwave.blackboard.com/Engineering/1899
• Identify Critical Use Cases: Identify use cases where the risk of not meeting performance goals causes the system to fail or be less than successful. Rank use cases based on workload variation, execution paths, processing considerations and utility.
• Select Key Performance Scenarios: The most frequently executed scenarios, or those that are critical to the perceived performance of the system. Each performance scenario corresponds to a workload characterization. Define execution models, behavior models, cognition models, data models and processing models.
• Establish Performance Objectives: Specify the quantitative criteria for evaluating the performance characteristics of the system under development. Objectives must be specified prior to any simulations or analysis.
• Construct Performance Models: Modeling techniques for representing the software processing steps for the performance model. Sequence Diagramming, Markovian Probability Models and Discrete Simulation Models.
• Software Execution Model: Determination of software resource utilization to appropriately measure the effect of software as it scales in usage. Identification of Performance Anti-Patterns targeted for refactoring. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.
• System Execution Model: Determination of system resource requirements utilized by the software under a given workload. Used for sizing and capacity models. Method-R Analytics and Problem Solving via Decision Tree and Pattern Recognition.

Page 32: Sfeldman performance bb_worldemea07

Method-R: Requirements of a Good Methodology (Millsap, Cary)

• Predictive Capacity: A method must enable the analyst to predict the impact of a proposed remedy.

• Reliability: A method must identify the correct root cause of the problem, no matter what the root cause may be.

• Determinism: A method must guide the analyst through an unambiguous sequence of steps that always rely upon documented axioms, not experience or intuition.

Page 33: Sfeldman performance bb_worldemea07

Method-R: Requirements of a Good Methodology (Millsap, Cary)

• Finiteness: A method must have a well-defined terminating condition, such as proof of optimality.

• Practicality: A method must be usable in any reasonable operating condition. It would be unacceptable for a performance improvement method to rely upon tools that exist in some operating environments but not others.

Page 34: Sfeldman performance bb_worldemea07

Method-R: Response Time Performance Improvement

• Practical way of thinking.
• Often asking someone to be practical is in itself impractical.
– Select the user actions for which the business needs improved performance.
– Collect properly scoped diagnostic data that will allow you to identify the causes of response time consumption while the action is performing sub-optimally.
– Execute the candidate optimization activity that will have the greatest net payoff.
– Suspend your improvement activities until something changes.
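The last two steps amount to a selection rule: pick the candidate with the greatest net payoff, and stop when no candidate pays off. The sketch below assumes benefit and cost have already been estimated in the same unit, and that a non-positive net payoff is the signal to suspend; the candidate names and numbers are hypothetical.

```python
def choose_activity(candidates):
    """Pick the candidate with the greatest net payoff (benefit - cost).

    candidates: list of (name, benefit, cost) tuples in a common unit.
    Returns None when no candidate has a positive net payoff, i.e. suspend
    improvement activities until something changes.
    """
    best = max(candidates, key=lambda c: c[1] - c[2], default=None)
    if best is None or best[1] - best[2] <= 0:
        return None
    return best[0]


# Hypothetical candidates with estimated benefit and cost.
candidates = [
    ("add index", 40.0, 5.0),
    ("buy faster CPUs", 50.0, 60.0),
    ("cache forum list", 30.0, 10.0),
]
print(choose_activity(candidates))
```

Note that the candidate with the largest raw benefit ("buy faster CPUs") is not chosen, because its cost exceeds its benefit.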

Page 35: Sfeldman performance bb_worldemea07

Part 4: Performance Lab ’07

Page 36: Sfeldman performance bb_worldemea07

What projects are coming out of the PerfEng Lab in ‘07

• Blackboard Performance Sizing and Certification Program.

• Special Projects
– Virtualization: Xen, LDOMs/Containers and VMware
– Scalent Management Suite
– Monitoring and Management
– User Experience and Incident Management: Coradiant
– Storage Protocols: NFS, IP-SAN and FC-SAN

Page 37: Sfeldman performance bb_worldemea07

What is the BPSC?

• The BPSC is a benchmarking program designed to showcase the enterprise architecture, performance and scalability of the Blackboard Application Suite.

• The BPSC will help Blackboard customers make the appropriate purchasing decisions (hardware and software) to support their Blackboard implementation.

• The BPSC is a joint effort by Blackboard and members of the Blackboard Technology Family. This includes ISVs such as Microsoft, Oracle and Quest, as well as OEMs such as Dell, Sun and Coradiant.

Page 38: Sfeldman performance bb_worldemea07

Special Projects

• Virtualization
– VMware, Xen and LDOMs
• Management
– Scalent and Quest
• Monitoring
– Coradiant and Quest
• Storage Protocols
– FC/SAN, IP/SAN and NFS
• Scale-Up and Scale-Out Databases
– Oracle RAC
– 64-bit SQL Server

Page 39: Sfeldman performance bb_worldemea07

Questions?

Page 40: Sfeldman performance bb_worldemea07

Links/References

• Blackboard Academic Suite Hardware Sizing Guide (Behind the Blackboard)
• Performance and Capacity Planning Guidelines for the Blackboard Academic Suite (Behind the Blackboard)
• http://www.perfeng.com
• http://www.spec.org/sfs97r1/results/sfs97r1.html
• http://www.storageperformance.org
• http://www.coradiant.com
• http://www.quest.com
• http://www.bmc.com
• Performance by Design: Computer Capacity Planning By Example; Menasce, Daniel
• Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning; Menasce, Daniel
• Linux Performance Tuning and Capacity Planning; Fink, Jason
• Network Administrators Survival Guide; Deveriya, Anand
• Capacity Planning for Internet Services; Cockcroft, Adrian
• http://www.blackboard.com/docs/r6/6_3/en_US/admin/bbas_performance_capacity.pdf
• http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/bdadotnetarch081.asp
• http://developers.sun.com/solaris/articles/systemslowdowns.html
• http://www.oracle.com/technology/deploy/performance/index.html
• http://tpc.org/tpc_app/default.asp (TPC-App)
• http://tpc.org/tpcw/default.asp (TPC-W)
• http://java.sun.com/docs/performance/
• http://support.microsoft.com/kb/224587
• http://www.javaperformancetuning.com
• http://www.oraperf.com
• http://www.ixora.com.au
• http://www.hotsos.com
• http://perl.apache.org/docs/1.0/guide/performance.html
• Sherlog, Webalizer, WebTrends, Analog
• http://dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
• http://www.serverwatch.com/tutorials/article.php/3518061
• http://www-106.ibm.com/developerworks/rational/library/4250.html
• http://www.keynote.com/downloads/articles/tradesecrets.pdf
• Whalen, Edward. Oracle Database 10G: Linux Administration. ISBN: 0-07-223053-3
• Millsap, Cary. Optimizing Oracle Performance. ISBN: 0-596-00527-X
• DeLuca, Steve. Microsoft SQL Server 2000 Performance Tuning Technical Reference. ISBN: 0735612706
• McGehee, B. “SQL-Server Configuration Performance Checklist” http://sql-server-performance.com/sql_server_performance_audit5.asp
• http://www.sql-server-performance.com/jc_sql_server_quantative_analysis1.asp

Page 41: Sfeldman performance bb_worldemea07

Thank you