
Upload: clinton-robertson

Post on 26-Dec-2015


Page 1: USS Yorktown (1998) A crew member of the guided-missile cruiser USS Yorktown mistakenly entered a zero for a data value, which resulted in a division by

USS Yorktown (1998)

A crew member of the guided-missile cruiser USS Yorktown mistakenly entered a zero for a data value, which resulted in a division by zero.  The error cascaded and eventually shut down the ship's propulsion system.  The ship was dead in the water for several hours because a program didn't check for valid input.  (reported in Scientific American, November 1998)
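The kind of input check the slide says was missing can be sketched as follows. This is a minimal illustration, not the ship's actual software; the function names `read_nonzero` and `rate` are hypothetical.

```python
import math
from typing import Optional


def read_nonzero(raw: str) -> float:
    """Parse an operator-entered value and reject anything unusable as a divisor."""
    try:
        value = float(raw)
    except ValueError:
        raise ValueError(f"not a number: {raw!r}") from None
    # Reject zero as well as NaN and infinity before the value reaches a division.
    if not math.isfinite(value) or value == 0.0:
        raise ValueError("value must be a finite, non-zero number")
    return value


def rate(total: float, raw_divisor: str) -> Optional[float]:
    """Return total / divisor, or None when the entry is rejected."""
    try:
        return total / read_nonzero(raw_divisor)
    except ValueError:
        # The caller can re-prompt the operator instead of letting the error cascade.
        return None
```

Rejecting the value at the point of entry keeps the bad datum from propagating into later calculations.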

Famous Bugs

London Ambulance System (1992)

A succession of software engineering failures, especially in project management, caused two failures of the ambulance dispatch system in London, England.  The repair cost was estimated at £9m, but it is believed that people died who would have survived had ambulances reached them as promptly as they would have without the failures.

Denver baggage handling system (1992)

The Denver airport baggage handling system was so complex (involving 300 computers) that the development overrun prevented the airport from opening on time.  Fixing the incredibly buggy system required an additional 50% of the original budget - nearly $200m.

Taurus (1993)

Taurus, the planned automated transaction settlement system for the London Stock Exchange, was canceled after 5 years of failed development.  Losses are estimated at £75m for the project and £450m to customers. (Pooley & Stevens, 1999)

Ariane 5 (1996)

The Ariane 5 rocket exploded on its maiden flight on June 4, 1996 because the navigation package was inherited from the Ariane 4 without proper testing.  The new rocket flew faster, resulting in larger values of some variables in the navigation software.  Shortly after launch, an attempt to convert a 64-bit floating-point number into a 16-bit integer generated an overflow.  The error was caught, but the code that caught it elected to shut down the subsystem.  The rocket veered off course and exploded.  It was unfortunate that the code that failed generated inertial reference information useful only before lift-off; had it been turned off at the moment of launch, there would have been no trouble. (Kernighan, 1999)
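The conversion described above can be made explicit about its range. The sketch below is only an illustration in Python, not the flight code; the function names are hypothetical. It shows two common policies: raise an error the caller must handle, or saturate at the limits of the 16-bit range.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(x: float) -> int:
    """Truncate x toward zero, raising OverflowError if it does not fit in 16 bits."""
    if not math.isfinite(x):
        raise OverflowError(f"cannot convert {x!r} to a 16-bit integer")
    n = int(x)
    if n < INT16_MIN or n > INT16_MAX:
        raise OverflowError(f"{x!r} is outside the 16-bit signed range")
    return n


def to_int16_saturating(x: float) -> int:
    """Clamp x to the 16-bit signed range instead of failing."""
    if math.isnan(x):
        raise ValueError("cannot convert NaN")
    clamped = max(float(INT16_MIN), min(float(INT16_MAX), x))
    return int(clamped)
```

Which policy is appropriate depends on what the value feeds. The point is that the out-of-range case is decided deliberately, rather than left to a generic handler that shuts the subsystem down.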

E-mail buffer overflow (1998)

Several e-mail systems suffer from a "buffer overflow" error when extremely long e-mail addresses are received.  The code that receives the addresses does not check their length, so the internal buffers overflow and the applications crash.  Hostile hackers exploit this fault to trick the computer into running a malicious program instead.
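The missing length check can be sketched as below. Python is memory-safe, so this only mimics a fixed-size buffer to show where the check belongs. The name `store_address` and the buffer size used in the usage note are hypothetical.

```python
def store_address(buffer: bytearray, address: bytes) -> int:
    """Copy address into a fixed-size buffer, refusing input that would not fit.

    Returns the number of bytes written.
    """
    if len(address) > len(buffer):
        # Reject over-long input before any byte is copied.
        raise ValueError(
            f"address is {len(address)} bytes, buffer holds {len(buffer)}"
        )
    # Same-length slice assignment keeps the buffer at its fixed size.
    buffer[: len(address)] = address
    return len(address)
```

For example, with `buf = bytearray(16)`, a short address is copied in place, while a 100-byte address raises `ValueError` and leaves `buf` untouched.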

Mars Climate Orbiter (September 23rd, 1999)

The $125 million Mars Climate Orbiter is assumed lost by officials at NASA. The loss is attributed to a failure of NASA's systems engineering process, which did not specify the system of measurement to be used on the project. As a result, one development team used Imperial units while the other used metric units. When parameters were passed from one module to another during an orbit navigation correction, no conversion was performed, resulting in the loss of the craft. http://mars.jpl.nasa.gov/msp98/orbiter/
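One way to keep such a mismatch from passing silently between modules is to carry the unit in the type rather than in a bare number. The sketch below is a hypothetical illustration: the slide does not name the quantity involved, so impulse is assumed here as an example, and the `Impulse` class is not taken from any NASA code.

```python
from dataclasses import dataclass
from typing import Iterable

# 1 pound-force is defined as exactly 4.4482216152605 newtons.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds only."""

    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(value)

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # The conversion happens once, at the module boundary.
        return cls(value * LBF_S_TO_N_S)


def total_impulse(parts: Iterable[Impulse]) -> Impulse:
    """Sum impulses; every part is already in the same internal unit."""
    return Impulse(sum(p.newton_seconds for p in parts))
```

Because callers must pick a named constructor, a value in Imperial units cannot be handed over as if it were metric without someone writing the wrong constructor name explicitly.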

Timeline, 1992 to 1999:

1992  London Ambulance System
1992  Denver Baggage Handling System
1993  Taurus
1996  Ariane 5
1998  Email Buffer Overflow
1998  USS Yorktown
1999  Mars Orbiter

Page 2

Building Mission Critical Software

Eric Lee
Product Manager
Microsoft Corporation

Page 3

Building Mission Critical Software

Trying to improve software quality by increasing the amount of testing is like trying to lose weight by weighing yourself more...

If you want to lose weight, don't buy a new scale; go on a diet.

If you want to improve your software, don't test more; develop better.

[Steve McConnell, Code Complete]

Page 4

Expanding Visual Studio

Increased Reliability
Quality Early & Often
Predictability & Visibility
Design for Operations

Project Manager
Solution Architect
Developer
Tester
Infrastructure Architect

Page 5

Building Mission Critical Software with Visual Studio Team System

Page 6

Eric Lee
[email protected]

Page 7

Resources

Website:
http://lab.msdn.microsoft.com/teamsystem/

Blogs:
http://blogs.msdn.com/ricom/
http://blogs.msdn.com/scarroll/
http://blogs.msdn.com/robcaron/
http://blogs.msdn.com/ericlee/
http://blogs.msdn.com/ianhu/
http://blogs.msdn.com/matt_pietrek/

Page 8

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.