1
An Overview of MSFC Quantitative Risk Assessment (QRA) Practices
Fayssal Safie/MSFC
October 25, 2000
2
Agenda
• Quantitative Risk Assessment System (QRAS)
• Other PRA-Related Practices
• Reliability Prediction
• Probabilistic Structural Analysis
• Similarity Analysis
• Reliability Demonstration
3
MSFC Propulsion Elements QRAS
4
QRAS Background
• Space Shuttle probabilistic risk assessment (PRA) studies
• 1988 - Space Shuttle PRA for Galileo mission (PRC)
• 1993 - Galileo PRA update (SAIC)
• 1995 - Space Shuttle PRA (SAIC)
• 1997/2000 - Space Shuttle PRA (NASA/Code Q)
5
QRAS Background (cont’d)
• 1997/2000 NASA QRA study
• In July 1996, the NASA Administrator directed NASA Headquarters to develop a software system to quantitatively assess the overall shuttle risk and serve as a tool to estimate risk changes due to proposed shuttle upgrades.
• At the request of NASA Headquarters, MSFC and JSC, supported by their prime contractors, are modeling their respective elements.
• The software system, called QRAS (Quantitative Risk Assessment System), is designed and developed by NASA Headquarters Code Q.
6
QRAS Objectives
• Develop a quantitative risk model to:
  • Assess the reliability/risk of the overall shuttle vehicle, its major elements, and their components
  • Evaluate risk reduction due to proposed shuttle upgrades
  • Rank shuttle failure modes
  • Perform trade studies/sensitivity analyses
7
QRAS Model Requirements
• Model builds on the SAIC 1993-1995 Shuttle PRA model.
• Model is modular, reflecting shuttle modularity with its discrete elements, subsystems, and components (flexible to accommodate upgraded components and additional details).
• Model must be most detailed in high risk areas to allow sensitivity analysis and trade studies to be performed.
• Model/tool must be user-friendly and easily updateable.
• Model must be capable of identifying, quantifying, and prioritizing the major risk contributors.
• Model must support NASA decision-making process (evaluating shuttle upgrades and supporting flight issues).
8
QRAS Modeling Approach
[Figure: QRAS modeling approach, in linked panels.
System Hierarchy: Space Shuttle; elements Orbiter, ET, SSME, ISRB; subsystems MCC, HEX, HPFTP, LPFTP, ...; failure modes such as turbine blade porosity, turn-around duct failure, and housing retaining lug failure.
Functional Event Sequence Diagram (FESD): initiating event "Turbine Blade Porosity"; pivotal events "Inspection Not Effective", "Porosity Present in Critical Location", "Porosity in Critical Location Leads to Crack in <4300 sec", and "Blade Failure"; end states Mission Success (MS) or Loss of Vehicle (LOV).
Quantification of FESD initiating and pivotal events: an uncertainty distribution for each event probability, based on flight/test data, probabilistic structural models, similarity analysis, and engineering judgment.
Event Tree (risk aggregation of basic events): scenario 1 ends in LOV and scenarios 2-5 end in MS, giving an uncertainty distribution for LOV due to turbine blade porosity.
Products: 1. Space Shuttle risk; 2. Element risk; 3. Subsystem risk; 4. Risk ranking; 5. Sensitivity analysis, etc.]
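The event sequence in the figure can be read as a chain of conditional probabilities: the single LOV scenario requires every pivotal event to occur, and each MS scenario leaves the chain at one pivotal event. The Python sketch below illustrates that reading, assuming the four pivotal events shown. Every probability value is a hypothetical placeholder rather than a QRAS number, and the beta distributions (with pseudo-count `k`) used to propagate uncertainty are an assumption made only for illustration.

```python
import random

def scenario_probabilities(p_init, pivotal):
    """Return (p_lov, [p_ms, ...]) for a chain of pivotal events.

    p_init  -- probability of the initiating event
    pivotal -- conditional probability that each pivotal event occurs,
               in sequence order; LOV requires all of them to occur
    """
    ms = []
    p_path = p_init
    for p in pivotal:
        ms.append(p_path * (1.0 - p))  # sequence exits to mission success
        p_path *= p                    # sequence continues toward LOV
    return p_path, ms

def lov_uncertainty(p_init, pivotal, k=50.0, n=5000, seed=1):
    """Propagate event uncertainty; return the 5th/50th/95th percentiles of P(LOV).

    Each event probability is drawn from Beta(p*k, (1-p)*k), whose mean is p.
    The pseudo-count k is a hypothetical stand-in for the strength of the data.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        sampled = [rng.betavariate(p * k, (1.0 - p) * k)
                   for p in [p_init] + list(pivotal)]
        draws.append(scenario_probabilities(sampled[0], sampled[1:])[0])
    draws.sort()
    return draws[int(0.05 * n)], draws[n // 2], draws[int(0.95 * n)]

# Hypothetical point estimates: porosity, then the four pivotal events
P_INIT = 1e-2
PIVOTAL = [0.1, 0.2, 0.05, 0.5]

p_lov, p_ms = scenario_probabilities(P_INIT, PIVOTAL)
p05, p50, p95 = lov_uncertainty(P_INIT, PIVOTAL)
print(f"point estimate P(LOV) = {p_lov:.2e}")
print(f"5th/50th/95th percentiles = {p05:.2e} / {p50:.2e} / {p95:.2e}")
```

The scenario probabilities sum to the initiating-event probability, which is a quick consistency check on any event sequence quantified this way.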
9
QRAS MSFC Team Participants
• MSFC
• Safety & Mission Assurance (S&MA)
• Chief engineer & project offices
• Engineering
• Prime contractors
• Reliability engineering
• Design & manufacturing engineering
• Hernandez Engineering Inc. (HEI)
• Reliability engineering and simulation
10
QRAS Databases
• Problem Reporting and Corrective Action (PRACA)
• Automated Configuration Data Tracking System (ACTS)
• Logbooks
• Engineering data/analyses
• Generic data
• Lessons learned
• SAIC study
11
QRAS Propulsion Element Models
Significant Observations
Strengths:
• The QRAS modeling effort has helped draw management attention to the use of statistical and probabilistic information in the decision-making process.
• The Event Sequence Diagram (ESD) provides a better understanding of the failure mode risk and an excellent way to address risk mitigation.
• Data contained in the individual ESD packages are an excellent source of reference material and lessons learned.
• QRAS models constitute:
• The best source of failure rate data for the shuttle program to evaluate upgrades.
• The best source of information to understand the risk mitigation in place.
• The best source to understand the physics of failure for critical failure modes/events.
12
QRAS Propulsion Element Models
Significant Observations (cont’d)
Considerations:
• QRAS is a large-scale QRA study that is very complex and requires extensive knowledge of the system, a large amount of data, and extensive modeling.
• Use of engineering judgment introduces a significant amount of uncertainty.
• Quantification methods, in most cases, are not robust. Overlooking one piece of data may dramatically change the probability of loss of vehicle.
13
QRAS Propulsion Element Models
Significant Observations (cont’d)
Considerations (cont’d):
• Modeling of human error/process error is a big challenge.
• Human error/process error has been incorporated implicitly where flight and test data exist.
• For structural failures which are modeled using design information, the human error/process error has been incorporated explicitly using placeholders based on historical data.
• The QRAS modeling effort has shown that developing explicit models for the human error/process error is extremely difficult because of lack of adequate data.
14
QRAS Propulsion Element Models
Significant Observations (cont’d)
Considerations (cont’d):
• QRAS/PRA failure probabilities are imbalanced
  • Some failure probabilities are derived mainly from design information (P&W turbopumps), while others are derived mainly from test and flight data (RKDN SSME hardware). Generic data are also used in other cases.
  • Some failure probabilities are derived from limited data (solid propulsion elements), while others are based on extensive data (liquid propulsion elements).
• Difficult to model common cause failures
• Incomplete interface models
15
QRAS Conclusions
• Following a well-defined and documented systematic procedure, involving the appropriate disciplines (reliability, design, and manufacturing engineering), and using the appropriate data are the key elements of a successful QRA study.
• Information derived from QRA studies is most accurate and useful at lower levels (within components and failure modes).
• The QRAS tool is the best QRA tool available to support shuttle program management decisions.
16
Other PRA-Related Practices
Reliability Prediction
17
Reliability Prediction
• Reliability prediction techniques are dependent on the degree of the design definition and the availability of historical data. Two commonly used techniques are:
• Probabilistic design techniques: Reliability is predicted using engineering failure models.
• Similarity analysis techniques: Reliability of a new design is predicted using reliability of similar parts.
18
Reliability Prediction: Probabilistic Structural Analysis
• It is a tool to probabilistically characterize the design and analyze its reliability using engineering failure models.
• It is a tool to evaluate the expected reliability of a part given the structural capability and the expected operating environment.
• It is used when failure data is not available and the design is characterized by complex geometry or is sensitive to loads, material properties, and environments.
19
Reliability Prediction: Probabilistic Structural Analysis (cont’d)
Turbo-Pump Bearing Example
• During rig testing the AT/HPFTP Bearing experienced several cracked races.
• Summary of 440C race fractures / tests: 3 of 4 Fractured
[Figure: bearing race with the fracture location marked]
20
Reliability Prediction: Probabilistic Structural Analysis (cont’d)
Turbo-Pump Bearing Example
OBJECTIVE: Predict probability of inner race over-stress, under the conditions experienced in the test rig, and estimate the effect of manufacturing stresses on the fracture probability.
[Figure: distributions of Stress and Allowable Load; their overlap is the Failure Region]
21
Reliability Prediction: Probabilistic Structural Analysis (cont’d)
Turbo-Pump Bearing Example
Conditions
• Using rig fits and clearances
• Crack size data from actual cut-ups
• Stresses associated with manufacturing (ideal)
• Materials properties and their variations
• Failure mode being analyzed is over-stress
22
Reliability Prediction: Probabilistic Structural Analysis (cont’d)
Turbo-Pump Bearing Example
HPFTP Roller Bearing Inner Race - Model Flow
Stress:
• Tolerance fits of rig test bearing
• Randomly select values for inner race material properties
• Inner race hoop stress contribution at given conditions
• Randomly select values for shaft and sleeve material properties
• Shaft and sleeve hoop stress contribution at given conditions
• Stress due to manufacturing
• Total hoop stress
Allowable load:
• Variation in:
  o Fracture Toughness
  o Yield Strength
  o No. of Cracks
  o Crack Depth
  o Crack Length
• Compute Allowable Load for each crack
• Compute Allowable Load (worst crack)
Comparison:
• Stress > Allowable Load
• Iterate and compute Failure Probability
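The model flow on this slide is a Monte Carlo stress-versus-allowable comparison. The sketch below shows the shape of such a loop in Python under stated assumptions: all distributions and parameter values are hypothetical, the allowable stress uses a textbook linear-elastic fracture relation capped at yield, and crack length is omitted for brevity. It does not reproduce the actual MSFC model or its inputs.

```python
import math
import random

def allowable_stress(k_ic, yield_strength, crack_depth, geometry_factor=1.12):
    """Allowable stress (MPa) for one crack: K_IC / (Y * sqrt(pi * a)), capped at yield.

    k_ic in MPa*sqrt(m), crack_depth in m. The form and the geometry factor
    are assumptions for illustration only.
    """
    fracture_limit = k_ic / (geometry_factor * math.sqrt(math.pi * crack_depth))
    return min(fracture_limit, yield_strength)

def failure_probability(n_trials, manufacturing_stress, seed=0):
    """Estimate P(total hoop stress > allowable stress of the worst crack)."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(n_trials):
        # Randomly selected material properties (hypothetical values)
        k_ic = max(rng.gauss(22.0, 2.0), 1.0)
        yield_strength = max(rng.gauss(1800.0, 90.0), 1.0)
        # Hoop stress contributions at the given fits (hypothetical, MPa)
        race_stress = rng.gauss(400.0, 40.0)
        shaft_sleeve_stress = rng.gauss(250.0, 25.0)
        total_stress = race_stress + shaft_sleeve_stress + manufacturing_stress
        # Random number of cracks and crack depths (hypothetical)
        n_cracks = rng.randint(1, 5)
        depths = [rng.lognormvariate(math.log(1e-4), 0.5) for _ in range(n_cracks)]
        # The worst crack sets the allowable stress for this trial
        worst = min(allowable_stress(k_ic, yield_strength, d) for d in depths)
        if total_stress > worst:
            failures += 1
    return failures / n_trials

p_no_mfg = failure_probability(20_000, manufacturing_stress=0.0)
p_with_mfg = failure_probability(20_000, manufacturing_stress=300.0)
print(f"no manufacturing stress:   {p_no_mfg:.4f}")
print(f"with manufacturing stress: {p_with_mfg:.4f}")
```

Reusing the same seed for both runs means both cases see identical random draws, so the difference between the two estimates reflects the manufacturing stress alone rather than sampling noise.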
23
Reliability Prediction: Probabilistic Structural Analysis (cont’d)
Turbo-Pump Bearing Example
RESULTS - FAILURE RATES
Race Configuration | At Test | Probabilistic Structural Analysis
440C w/ actual manufacturing stresses (i.e., ideal + abusive grinding) | 3 of 4 failed | 68,000 fail/100k firings
440C w/ no manf. stresses | --- | 1,500 fail/100k firings
440C w/ ideal manf. stresses | --- | 27,000 fail/100k firings
9310 w/ ideal manf. stresses | In 15+ tests never had a through ring fracture | 10 fail/100k firings
It is estimated that 50% of the through ring fractures would result in an engine shutdown. The shutdown 9310 HPFTP Roller Bearing Inner Race Failure Rate is then: 0.50 × 10/100k = 5 fail/100k firings
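The shutdown-rate arithmetic on this slide can be checked with a few lines of Python. The rates and the 50% shutdown fraction are taken from the slide; the dictionary keys and function names are illustrative only, and applying the shutdown fraction to configurations other than the 9310 race is an assumption the slide does not make.

```python
FIRINGS = 100_000  # rates are quoted per 100k firings

# Through-ring fracture rates from the probabilistic structural analysis
FRACTURE_RATE = {
    "440C actual manufacturing stresses": 68_000,
    "440C no manufacturing stresses": 1_500,
    "440C ideal manufacturing stresses": 27_000,
    "9310 ideal manufacturing stresses": 10,
}
SHUTDOWN_FRACTION = 0.50  # share of through-ring fractures causing engine shutdown

def shutdown_rate_per_100k(config):
    """Engine-shutdown failures per 100k firings for a race configuration."""
    return SHUTDOWN_FRACTION * FRACTURE_RATE[config]

def per_firing_reliability(config):
    """Probability that a single firing does not end in a shutdown."""
    return 1.0 - shutdown_rate_per_100k(config) / FIRINGS

rate_9310 = shutdown_rate_per_100k("9310 ideal manufacturing stresses")
rel_9310 = per_firing_reliability("9310 ideal manufacturing stresses")
print(rate_9310, rel_9310)
```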
24
Reliability Prediction: Similarity Analysis
• Similarity Analysis is a technique for predicting the reliability of a new design based on historical data of similar designs (heritage hardware).
• Failure rates derived from historical data are modified to reflect the design and environment of the new hardware.
• Similarity Analysis is best performed at the lowest level possible, where more data are available and better-informed judgments can be made.
25
Reliability Prediction: Similarity Analysis (cont’d)
Fuel Turbo Pump Example
• Assume a Fuel Turbo Pump (FTP) has a historical failure rate of 50 per 100k firings
• Assume also the failure mode breakdown is:
Failure Mode | Percent
Cracked/Fractured Blades | 35%
Turbine Bearing Failure | 25%
Pump Bearing Failure | 20%
Impeller Failure | 10%
Turbine Seal Failure | 10%
Total | 100%
• Then the Cracked/Fractured failure rate is: 0.35 × 50 = 17.5/100k firings
26
Reliability Prediction: Similarity Analysis (cont’d)
Fuel Turbo Pump Example
• If the failure causes for Cracked/Fractured are determined to be:
[Table of failure causes not recovered from the slide; the causes total 100%, with Thermal Stress at 57%]
• Then the Thermal Stress failure rate is: 0.57 × 17.5 ≈ 10/100k firings
27
Reliability Prediction: Similarity Analysis (cont’d)
Fuel Turbo Pump Example
• Failure rate adjustments established through:
  • Test Results
  • Preliminary Analyses
  • Integrated Product Team (IPT) Input
• Address "high hitters", using the Thermal Stress failure rate of 10.0/100k firings
• Design changes to improve reliability:
Design Change (Basis) | Percent Improvement | Cum Failure Rate Reduction
Lower Operating Temperatures (Test) | 20% | 2.00
Hollow Blades (Analysis, Expert Opinion) | 30% (additional) | 4.40
Material Change (Analysis) | 20% (additional) | 5.52
28
Reliability Prediction: Similarity Analysis (cont’d)
Fuel Turbo Pump Example
If no other changes are made, the FTP predicted failure rate is then:
50 - 5.52 = 44.48 / 100k firings
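The Fuel Turbo Pump example can be traced end to end in a short Python sketch. The numbers come from the slides; the variable and function names are illustrative. The sketch assumes each "additional" improvement applies to the failure rate remaining after the previous change, which reproduces the cumulative reductions of 2.00, 4.40, and 5.52.

```python
HISTORICAL_RATE = 50.0    # FTP failures per 100k firings
BLADE_FRACTION = 0.35     # Cracked/Fractured Blades share of failures
THERMAL_FRACTION = 0.57   # Thermal Stress share of Cracked/Fractured failures

# (design change, fractional improvement applied to the remaining rate)
IMPROVEMENTS = [
    ("Lower Operating Temperatures", 0.20),
    ("Hollow Blades", 0.30),
    ("Material Change", 0.20),
]

def cumulative_reduction(start_rate, improvements):
    """Return [(name, cumulative failure rate reduction), ...]."""
    remaining = start_rate
    steps = []
    for name, fraction in improvements:
        remaining *= 1.0 - fraction
        steps.append((name, start_rate - remaining))
    return steps

blade_rate = BLADE_FRACTION * HISTORICAL_RATE    # 17.5 per 100k firings
thermal_rate = THERMAL_FRACTION * blade_rate     # 9.975, rounded to 10 on the slide

steps = cumulative_reduction(10.0, IMPROVEMENTS)  # start from the rounded value
predicted_rate = HISTORICAL_RATE - steps[-1][1]

for name, reduction in steps:
    print(f"{name}: cumulative reduction {reduction:.2f}")
print(f"predicted FTP failure rate: {predicted_rate:.2f} per 100k firings")
```

Starting from the unrounded 9.975 instead of 10.0 would shift the final figure slightly, so the rounding step is worth stating whenever the result is reported.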
29
Other PRA-Related Practices
Reliability Demonstration
30
Reliability Demonstration
• Reliability Demonstration is a reliability estimation method that primarily uses test data (objective data) to calculate demonstrated reliability with some statistical confidence.
• Some commonly used models and techniques for reliability demonstration include Binomial, Exponential, and Weibull models. Reliability growth techniques, such as the U.S. Army Materiel Systems Analysis Activity (AMSAA) and Duane models, can also be used to calculate demonstrated reliability.
31
Reliability Demonstration
Example: SSME Single Flight Reliability (SFR) Criteria
• SFR Criteria is an optimization tool based on the demonstrated reliability of SSME hardware.
• SFR is used by the SSME Program as a quantitative probabilistic risk management tool for SSME critical hardware.
• SFR Criteria:
  • Extensive fleet hot-fire experience
  • No failures or MR history
  • No periodic inspection
  • Use discrete optimization for life limit determination
  • Extend life limit up to 50% of the fleet leader, but not to exceed the minimum run time of the six leading samples
  • New life limit should not be less than 25% of the fleet leader
• Advantages include:
  • Maximize hardware usage
  • Use of all operational history
32
Reliability Demonstration Example
SSME Single Flight Reliability (SFR) Criteria – Powerhead Assembly Example
[Figure: bar chart of accumulated run time by serial number; axis 0 to 35000 seconds]
Powerhead Assembly LRU Code A050 (Partial Listing)
Serial Number | Seconds
4873937 | 33329
4876118 | 24716
4881840 | 21017
4101464 | 20327
4103704 | 20046
4887803 | 19908
4886959 | 16473
4107624 | 16444
4881353 | 15346
4891093 | 12194
4883915 | 11843
4882395 | 11125
4892855 | 9338
4887018 | 9230
4881159 | 9137
4101420 | 8821
4106454 | 8199
4889036 | 8070
4885515 | 7797
4873558 | 6583
4915695 | 5893
4889794 | 5577
4886294 | 5031
4878165 | 4989
4891738 | 4980
4876216 | 4643
4102590 | 4391
4881664 | 4376
4876351 | 3716
4877733 | 3636
Beta = 2.08
25% F/L – 8332; 50% F/L – 16665
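The 25% and 50% fleet leader (F/L) values on this slide follow directly from the run times. The Python sketch below computes the bounds that the SFR criteria place on a new life limit, using run times as parsed from the partial listing. The function name is illustrative, and the discrete optimization that selects the life limit within these bounds is not described on the slides, so it is not attempted here.

```python
# Accumulated run times in seconds, as parsed from the partial listing
RUN_TIMES = [
    33329, 24716, 21017, 20327, 20046, 19908, 16473, 16444, 15346, 12194,
    11843, 11125, 9338, 9230, 9137, 8821, 8199, 8070, 7797, 6583,
    5893, 5577, 5031, 4989, 4980, 4643, 4391, 4376, 3716, 3636,
]

def sfr_life_limit_bounds(run_times, n_leaders=6):
    """Return (lower, upper) bounds on a new life limit, in seconds.

    lower -- 25% of the fleet leader
    upper -- 50% of the fleet leader, but not more than the minimum
             run time of the n_leaders leading samples
    """
    ordered = sorted(run_times, reverse=True)
    fleet_leader = ordered[0]
    lower = 0.25 * fleet_leader
    upper = min(0.50 * fleet_leader, min(ordered[:n_leaders]))
    return lower, upper

lower, upper = sfr_life_limit_bounds(RUN_TIMES)
print(lower, upper)
```

For this listing the six leading samples all exceed 50% of the fleet leader, so the 50% rule sets the upper bound; the slide reports both bounds rounded to whole seconds.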