1 CS 501 Spring 2008
CS 501: Software Engineering
Lectures 25 and 26
Performance of Computer Systems
2 CS 501 Spring 2008
Administration
Next week
Thursday, May 1: no class
3 CS 501 Spring 2008
Software Development as a Profession
Question: Is software development a branch of engineering?
Answer: It depends on how you define engineering.
Software development demands a high degree of
professionalism.
4 CS 501 Spring 2008
What is Engineering?
A definition of engineering
The profession of:
... creating cost-effective solutions ...
... to practical problems ...
... by applying scientific knowledge ...
... and established practices ...
... building things ...
and taking responsibility for them!
With this definition, software development is clearly engineering
5 CS 501 Spring 2008
What is Engineering?
A second definition of engineering
A professional who
… is licensed by a professional society
… based on a set educational program with a standard body of knowledge and specified experience
… who is the only person permitted to oversee certain tasks
If this is your definition of engineering, it is hard to see how it applies to software development.
6 CS 501 Spring 2008
From the National Society of Professional Engineers
• Only a licensed engineer may prepare, sign and seal, and submit engineering plans ... for public and private clients.
• Licensure for individuals ... is a legal requirement for those who are in responsible charge of work, ...
• Federal, state, and municipal agencies require that certain [positions] ... be filled only by licensed professional engineers.
• Many states have been increasingly requiring that those individuals teaching engineering must be licensed.
• State engineering boards are increasingly ... obtaining the authority to impose civil penalties against unlicensed individuals.
7 CS 501 Spring 2008
From Lecture 1: The Craft of Software Development
Software products are very varied
--> Client requirements are very different
--> There is no standard process for software engineering
--> There is no best language, operating system, platform, database system, development environment, etc.
A skilled software developer knows about a wide variety of approaches, methods, tools. The craft of software engineering is to select appropriate methods for each project and apply them effectively.
8 CS 501 Spring 2008
Crafts, Science, Engineering
[Figure: diagram relating Production, Craft, Commercial, Science, and Professional Engineering]
From: Shaw and Garlan
9 CS 501 Spring 2008
Crafts, Science, Engineering
[Figure: diagram relating Production, Craft, Commercial, Science, and Professional Engineering, annotated with: algorithms, data structures; compiler construction; software development methodologies]
From: Shaw and Garlan
10 CS 501 Spring 2008
From Lecture 1: Professional Responsibility
Organizations put trust in software developers:
• Competence: Software that does not work effectively can destroy an organization.
• Confidentiality: Software developers and systems administrators may have access to highly confidential information (e.g., trade secrets, personal data).
• Legal environment: Software exists in a complex legal environment (e.g., intellectual property, obscenity).
• Acceptable use and misuse: Computer abuse can paralyze an organization (e.g., the Internet worm).
11 CS 501 Spring 2008
An Old Question: Safety Critical Software
A software system fails and several lives are lost. An inquiry discovers that the test plan did not consider the case that caused the failure. Who is responsible:
(a) The testers for not noticing the missing cases?
(b) The test planners for not writing the complete test plan?
(c) The managers for not having checked the test plan?
(d) The client for not having done a thorough acceptance test?
12 CS 501 Spring 2008
Software Developers and Testers: Responsibilities
• Carrying out assigned tasks thoroughly and in a professional manner
• Being committed to the entire project -- not just tasks that have been assigned
• Resisting pressures to cut corners on vital tasks
• Alerting colleagues and management to potential problems early
13 CS 501 Spring 2008
Computing Management Responsibility
• Organization culture that expects quality
• Appointment of suitably qualified people to vital tasks (e.g., testing safety-critical software)
• Establishing and overseeing the software development process
• Providing time and incentives that encourage quality work
• Working closely with the client
Accepting responsibility for the work of the team
14 CS 501 Spring 2008
Client Responsibility
• Organization culture that expects quality
• Appointment of suitably qualified people to vital tasks (e.g., technical team that will build a critical system)
• Reviewing requirements and design carefully
• Establishing and overseeing the acceptance process
• Providing time and incentives that encourage quality work
• Working closely with the software team
Accepting responsibility for the resulting product
15 CS 501 Spring 2008
Performance of Computer Systems
In most computer systems
The cost of people is much greater than the cost of hardware
Yet performance is important
Future loads may be much greater than predicted
A single bottleneck can slow down an entire system
The choice of systems architecture may lead to a system that places great demands on the skills of the implementers.
16 CS 501 Spring 2008
Performance Challenges
Tasks
• Predict performance problems before a system is implemented
• Identify causes and fix problems after a system is implemented
Basic techniques
• Understand how the underlying hardware and networking components interact when executing the system
• For each component calculate the capacity and load
• Identify components that are near peak capacity
17 CS 501 Spring 2008
Understand the Interactions between Hardware and Software
Example: execution of http://www.cs.cornell.edu/
[Sequence diagram: Client and Servers exchange domain name service, TCP connection, HTTP get]
18 CS 501 Spring 2008
Understand the Interactions between Hardware and Software
[Sequence diagram: objects :Thread, :Toolkit, :ComponentPeer, target:HelloWorld; messages run, run, callbackLoop, handleExpose, paint]
19 CS 501 Spring 2008
Understand Interactions between Hardware and Software
[Activity diagram: start state, Decompress, fork, Stream audio and Stream video in parallel, join, stop state]
20 CS 501 Spring 2008
Look for Bottlenecks
CPU performance is important in certain domains, e.g.:
• large data analysis (e.g., searching)
• mathematical computation (e.g., weather models)
• multimedia (e.g., video compression)
• perception (e.g., image processing)
21 CS 501 Spring 2008
Look for Bottlenecks
In most domains CPU performance is not the limiting factor. Common bottlenecks:
• Reading data from disk
• Shortage of memory (including paging)
• Moving data from memory to CPU
• Network load
Inefficient software:
• Database access
• Parallel and sequential processing
22 CS 501 Spring 2008
Timescale
Operations per second
CPU      instruction:     1,000,000,000
Disk     latency:         100
         read:            25,000,000 bytes
Network  LAN:             10,000,000 bytes
         dial-up modem:   6,000 bytes
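To get a feel for these timescales, the following sketch estimates how long it takes to move data at each of the rates above, and how many CPU instructions fit into one disk access. It is an illustration only: the 1 GB file size and the function and constant names are assumptions, and the arithmetic ignores latency and contention.

```python
# Rates copied from the slide; the names below are illustrative.
CPU_INSTRUCTIONS_PER_SECOND = 1_000_000_000
DISK_ACCESSES_PER_SECOND = 100
RATES_BYTES_PER_SECOND = {
    "disk read": 25_000_000,
    "network LAN": 10_000_000,
    "dial-up modem": 6_000,
}


def transfer_seconds(size_bytes: int, rate_bytes_per_second: int) -> float:
    """Time to move size_bytes at a steady rate (no latency, no contention)."""
    return size_bytes / rate_bytes_per_second


def instructions_per_disk_access() -> int:
    """CPU instructions that could execute during one disk latency period."""
    return CPU_INSTRUCTIONS_PER_SECOND // DISK_ACCESSES_PER_SECOND


if __name__ == "__main__":
    size = 1_000_000_000  # assumed example: a 1 GB file
    for name, rate in RATES_BYTES_PER_SECOND.items():
        print(f"{name}: {transfer_seconds(size, rate):,.0f} seconds")
    print(f"instructions per disk access: {instructions_per_disk_access():,}")
```

At these rates a single disk access costs as much time as ten million instructions, which is why I/O rather than the CPU is so often the bottleneck.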
23 CS 501 Spring 2008
Predicting System Performance
• Direct measurement on subsystem (benchmark)
• Mathematical models
• Simulation
• Rules of thumb
All require detailed understanding of the interaction between software and hardware systems.
24 CS 501 Spring 2008
Look for Bottlenecks: Utilization
utilization = mean service time / mean inter-arrival time
When the utilization of any hardware component exceeds 30%, be prepared for congestion.
Peak loads and temporary increases in demand can be much greater than the average.
Utilization is the proportion of the capacity of a service that is used on average.
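As a worked example of the utilization formula, here is a minimal sketch. The function names and the sample figures (a disk that takes 10 ms per request and receives a request every 25 ms) are assumptions for illustration; the 30% threshold is the rule of thumb from this slide.

```python
def utilization(mean_service_time: float, mean_inter_arrival_time: float) -> float:
    """Proportion of a component's capacity that is used on average."""
    return mean_service_time / mean_inter_arrival_time


def expect_congestion(u: float, threshold: float = 0.30) -> bool:
    """Apply the slide's rule of thumb: above 30%, be prepared for congestion."""
    return u > threshold


if __name__ == "__main__":
    # Hypothetical disk: 10 ms per request, one request every 25 ms on average.
    u = utilization(0.010, 0.025)
    print(f"utilization = {u:.0%}, expect congestion: {expect_congestion(u)}")
```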
25 CS 501 Spring 2008
Mathematical Models
Queueing theory
Good estimates of congestion can be made for single-server queues with:
• arrivals that are independent, random events (Poisson process)
• service times that follow families of distributions (e.g., negative exponential, gamma)
Many of the results can be extended to multi-server queues.
26 CS 501 Spring 2008
Mathematical Models: Queues
arrive --> wait in line --> service --> depart
Single server queue
27 CS 501 Spring 2008
Queues
arrive --> wait in line --> service --> depart
Multi-server queue
28 CS 501 Spring 2008
Behavior of Queues: Utilization
[Graph: mean delay plotted against utilization, from 0 to 1]
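One standard result shows how delay depends on utilization. The sketch below assumes the simplest model, a single-server queue with Poisson arrivals and exponentially distributed service times (M/M/1), for which the mean time in the system is the mean service time divided by (1 - utilization). This is a textbook formula chosen for illustration, not necessarily the curve plotted in the lecture, and the function name is hypothetical.

```python
def mm1_mean_delay(mean_service_time: float, utilization: float) -> float:
    """Mean time in system (waiting plus service) for an M/M/1 queue."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return mean_service_time / (1.0 - utilization)


if __name__ == "__main__":
    # Delay, in units of the mean service time, at increasing utilization.
    for u in (0.1, 0.3, 0.5, 0.7, 0.9, 0.99):
        print(f"utilization {u:.2f}: mean delay {mm1_mean_delay(1.0, u):6.1f}")
```

Under this model the delay is only about 1.4 service times at 30% utilization, but it grows without bound as utilization approaches 1.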
29 CS 501 Spring 2008
Software development for high-performance systems
High-performance computing:
• Large data collections (e.g., Amazon)
• Internet services (e.g., Google)
• Large computations (e.g., weather forecasting)
Must balance cost of hardware against cost of software development
Some configurations are very difficult to program and debug
Sometimes it is possible to isolate applications programmers from the system complexities
CS 530, Architecture of Large-Scale Information Systems
30 CS 501 Spring 2008
Software development: very large databases
Databases
• Hardware is expensive.
• Software development uses commercial database systems, which do not need specialist knowledge.
• But specialist knowledge is needed to organize data on disks, connect disks to memory, configure database for backup and restore.
31 CS 501 Spring 2008
Software development: cluster computing
Computer clusters are built on large numbers of commodity computers. Hardware is cheap(ish).
Commercial clusters may have thousands of nodes and petabytes of disk.
• System must assume that hardware components are unreliable (e.g., Hadoop file system), bandwidth limited.
• New techniques (e.g., map/reduce) emphasize simple applications code, so that moderately skilled programmers do not need to understand complexities of parallel programming.
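To show why map/reduce keeps applications code simple, here is a single-process sketch of the programming model using word counting. It is not the Hadoop API: all names are hypothetical, and a real framework would run the map and reduce steps on many nodes and handle failures. The point is that the application programmer writes only the two small functions.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(document: str) -> Iterator[Tuple[str, int]]:
    """Application code: emit (word, 1) for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Application code: combine all the values emitted for one key."""
    return word, sum(counts)


def run_map_reduce(
    inputs: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Stand-in for the framework: map, group values by key, then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


if __name__ == "__main__":
    print(run_map_reduce(["a b a", "b c"], map_words, reduce_counts))
```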
32 CS 501 Spring 2008
Measurements on Operational Systems
• Benchmarks: Run system on standard problem sets, sample inputs, or a simulated load on the system.
• Instrumentation: Clock specific events.
If you have any doubt about the performance of part of a system, experiment with a simulated load.
33 CS 501 Spring 2008
Example: Web Laboratory
Benchmark: Throughput v. number of CPUs on SMP
[Graph: total MB/s and average MB/s per CPU, plotted against number of CPUs]
34 CS 501 Spring 2008
Techniques: Simulation
Model the system as a set of states and events:
    advance simulated time
    determine which events occurred
    update state and event list
    repeat
Discrete time simulation: Time is advanced in fixed steps (e.g., 1 millisecond)
Next event simulation: Time is advanced to next event
Events can be simulated by random variables (e.g., arrival of next customer, completion of disk latency)
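The loop above can be sketched as a next event simulation of a single-server queue. This is a minimal illustration under stated assumptions: inter-arrival and service times are drawn from exponential distributions, customers are served first come, first served, and the function name and parameters are hypothetical.

```python
import heapq
import random
from collections import deque


def simulate_queue(mean_inter_arrival: float, mean_service: float,
                   num_customers: int, seed: int = 1) -> float:
    """Return the mean time customers wait in line before service starts."""
    rng = random.Random(seed)
    # Event list: (time, kind), ordered by time.
    events = [(rng.expovariate(1.0 / mean_inter_arrival), "arrival")]
    waiting = deque()        # arrival times of customers waiting in line
    server_busy = False
    arrivals = 0
    served = 0
    total_wait = 0.0
    while events:
        now, kind = heapq.heappop(events)   # advance time to the next event
        if kind == "arrival":
            arrivals += 1
            if arrivals < num_customers:    # schedule the next arrival
                gap = rng.expovariate(1.0 / mean_inter_arrival)
                heapq.heappush(events, (now + gap, "arrival"))
            if server_busy:
                waiting.append(now)
            else:                           # start service with no wait
                server_busy = True
                served += 1
                done = now + rng.expovariate(1.0 / mean_service)
                heapq.heappush(events, (done, "departure"))
        elif waiting:                       # departure: serve next in line
            total_wait += now - waiting.popleft()
            served += 1
            done = now + rng.expovariate(1.0 / mean_service)
            heapq.heappush(events, (done, "departure"))
        else:                               # departure: server goes idle
            server_busy = False
    return total_wait / served


if __name__ == "__main__":
    for service in (0.2, 0.5, 0.9):         # utilization 20%, 50%, 90%
        print(service, round(simulate_queue(1.0, service, 5000), 3))
```

Using a seeded generator makes runs repeatable, which helps when comparing simulated results with queueing theory or direct measurement.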
35 CS 501 Spring 2008
Case Study: Performance of Disk Array
When many transactions use a disk array, each transaction must:
    wait for specific disk platter
    wait for I/O channel
    signal to move heads on disk platter
    wait for I/O channel
    pause for disk rotation
    read data
Close agreement between: results from queuing theory, simulation, and direct measurement (within 15%).
36 CS 501 Spring 2008
Fixing Bad Performance
If a system performs badly, begin by identifying the cause:
• Instrumentation. Add timers to the code. Often this will reveal that the delays are centered in one specific part of the system.
• Test loads. Run the system with varying loads, e.g., high transaction rates, large input files, many users, etc. This may reveal the characteristics of when the system runs badly.
• Design and code reviews. Have a team review the system design and suspect sections of code for performance problems. This may reveal an algorithm that is running very slowly, e.g., a sort, locking procedure, etc.
Fix the underlying cause or the problem will return!
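As an illustration of instrumentation, the sketch below adds timers around sections of code and accumulates the elapsed time per label. The names timed, timings, and report are hypothetical, and the two timed sections are stand-ins for real parts of a system.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Elapsed times in seconds, grouped by the label of the instrumented section.
timings = defaultdict(list)


@contextmanager
def timed(label: str):
    """Record how long the enclosed block takes, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label].append(time.perf_counter() - start)


def report() -> dict:
    """Total time per label, slowest first."""
    totals = {label: sum(values) for label, values in timings.items()}
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    data = list(range(200_000, 0, -1))
    with timed("sort"):
        data.sort()
    with timed("sum"):
        sum(data)
    for label, seconds in report().items():
        print(f"{label}: {seconds * 1000:.2f} ms")
```

Sorting the report by total time points directly at the part of the system where delays are centered.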
37 CS 501 Spring 2008
Predicting Performance Change: Moore's Law
Original version:
The density of transistors in an integrated circuit will double every year. (Gordon Moore, Intel, 1965)
Current version:
Cost/performance of silicon chips doubles every 18 months.
38 CS 501 Spring 2008
Moore's Law: Rules of Thumb
Planning assumptions:
Every year:
    cost/performance of silicon chips improves 25%
    cost/performance of magnetic media improves 30%
10 years = 100:1
20 years = 10,000:1
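A quick way to check planning ratios like these is compound-growth arithmetic. In this sketch (the function names are illustrative), doubling every 18 months works out to roughly 100:1 over 10 years and 10,000:1 over 20 years, while a steady 25% per year compounds to about 9:1 over 10 years. It is worth knowing which rate a given rule of thumb assumes.

```python
def doubling_factor(doubling_period_years: float, years: float) -> float:
    """Improvement ratio when a quantity doubles every doubling_period_years."""
    return 2.0 ** (years / doubling_period_years)


def compound_factor(annual_improvement: float, years: float) -> float:
    """Improvement ratio for a fixed fractional improvement each year."""
    return (1.0 + annual_improvement) ** years


if __name__ == "__main__":
    for years in (10, 20):
        print(f"{years} years, doubling every 18 months: "
              f"{doubling_factor(1.5, years):,.0f}:1")
        print(f"{years} years at 25% per year: "
              f"{compound_factor(0.25, years):,.1f}:1")
        print(f"{years} years at 30% per year: "
              f"{compound_factor(0.30, years):,.1f}:1")
```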
39 CS 501 Spring 2008
Moore's Law and System Design
                    Design system   Production use   Withdrawn from production
Year                2006            2009             2019
Processor speeds    1               1.9              28
Memory sizes        1               1.9              28
Disk capacity       1               2.2              51
System cost         1               0.4              0.01
40 CS 501 Spring 2008
Moore's Law Example
Will this be a typical personal computer?
             2008        2020
Processor    2.5 GHz     50 GHz
Memory       1 GB        30 GB
Disk         100 GB      4 TB
Network      100 Mb/s    1 Gb/s
Surely there will be some fundamental changes in how this power is packaged and used.
41 CS 501 Spring 2008
Parkinson's Law
Original: Work expands to fill the time available. (C. Northcote Parkinson)
Planning assumptions:
(a) Demand will expand to use all the hardware available.
(b) Low prices will create new demands.
(c) Your software will be used on equipment that you have not envisioned.
42 CS 501 Spring 2008
False Assumptions from the Past
Unix file system will never exceed 2 Gbytes (2^31 bytes).
AppleTalk networks will never have more than 256 hosts (2^8).
GPS software will not last 1024 weeks.
etc., etc., .....
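Each of these assumptions comes from a fixed-width field. The sketch below works through the arithmetic. Two details are not on the slide and are stated here as background: the 1024-week limit corresponds to a 10-bit week counter, and GPS time is counted from 6 January 1980. The constant and function names are illustrative.

```python
from datetime import date, timedelta

TWO_GBYTES = 2 ** 31           # largest offset range of a signed 32-bit integer
EIGHT_BIT_ADDRESSES = 2 ** 8   # 256 distinct values
GPS_WEEK_PERIOD = 2 ** 10      # a 10-bit week counter wraps after 1024 weeks
GPS_EPOCH = date(1980, 1, 6)   # background fact, not taken from the slide


def gps_week_field(weeks_since_epoch: int) -> int:
    """Value stored in a 10-bit week counter: it silently wraps to zero."""
    return weeks_since_epoch % GPS_WEEK_PERIOD


def first_rollover() -> date:
    """Date on which the 10-bit week counter first returns to zero."""
    return GPS_EPOCH + timedelta(weeks=GPS_WEEK_PERIOD)


if __name__ == "__main__":
    print(f"2 Gbytes = {TWO_GBYTES:,} bytes")
    print(f"8-bit addresses = {EIGHT_BIT_ADDRESSES}")
    print(f"1024 weeks = {GPS_WEEK_PERIOD * 7 / 365.25:.1f} years")
    print(f"week field after 1024 weeks: {gps_week_field(1024)}")
    print(f"first rollover: {first_rollover()}")
```

Software that outlives 1024 weeks sees the week field wrap to zero, a concrete case of software being used far longer than its designers envisioned.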
43 CS 501 Spring 2008
Moore's Law and the Long Term
[Graph: Moore's Law curve from 1965 to 2005]
What level?
44 CS 501 Spring 2008
Moore's Law and the Long Term
[Graph: Moore's Law curve starting in 1965]
When? 2006? Within your working life?
What level?