reliability & failure analysis
TRANSCRIPT
-
7/26/2019 RELIABILITY & FAILURE ANALYSIS
EMT 480/3: RELIABILITY &
FAILURE ANALYSIS
Original version: Noraini Othman
Edited: Hasnizah Aris
-
Lecture contents
1. Reliability Engineering
2. Design for Reliability (DFR)
3. Reliability Prediction Techniques
4. FMEA
5. FTA
6. Reliability Life Testing
-
Terms & Definitions
Reliability
the ability of a product to conform to its electrical and visual/mechanical specifications over a specified period of time under specified conditions at a specified confidence level
Reliability Engineering refers to the development of technology, processes and standards to ensure the reliability of semiconductors during applications.
Focuses on eliminating maintenance requirements
-
Reliability Monitoring
consists of getting finished product samples from the line and subjecting these to reliability testing. Valid reliability failures should undergo root cause analysis for reliability improvement
Wafer-level Reliability Testing
once an integrated circuit has been designed and the first silicon comes out, reliability tests at wafer-level are done to assess the reliability of the die
Package-level Reliability Testing
refers to the assessment of the overall reliability of the device in a packaged form.
-
New Product Qualification
operationally the same as package-level reliability testing, except that it is systemized with the objective of generating official reliability data that would justify the mass production of a new product.
-
Designing for Reliability (DFR)
The concept is to exert as much effort as possible to design a product to be inherently reliable
This consists of following all known design rules for making a product reliable, not only electrically but also visually and mechanically, as far as possible
Building reliability into a product as early as the design phase is a 'must'.
-
Reliability design begins with the specification of reliability goals consistent with cost and performance objectives
These goals must be translated into individual component, subcomponent and part specifications
Various design methods are then applied in order to meet the goals (such as stress-strength analysis, simplification etc.)
-
A failure analysis is then performed to determine whether specifications are being met and to provide a systematic approach for identifying, ranking and eliminating failure modes
If either the reliability or the safety goals are not met, the design process must continue
Often, it may require a complete redesign
-
Reliability Design
-
In summary, an excellent reliability engineering system would have all of the following components:
(a) design for reliability
(b) wafer-level reliability testing
(c) package-level reliability testing
(d) new product/process qualification
(e) reliability monitoring
-
What are Reliability Prediction Techniques?
Reliability prediction is a design-assist process by which the reliability characteristics of a system are obtained, by calculating the anticipated system RAMS (Reliability, Availability, Maintainability and Safety-Integrity) from assumed component failure rates.
The Importance of Reliability Prediction
(a) provides early indication of a system's potential to meet the design reliability requirements
(b) enables assessment of life-cycle costs to be carried out
(c) enables one to establish which components, or areas in a design, contribute to the major portion of unreliability
(d) enables trade-offs to be made, e.g. between reliability and maintainability in achieving a given availability
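As a rough illustration of calculating system figures from assumed component failure rates, the sketch below assumes a series system (any component failure fails the system) and constant failure rates (exponential model). The slides state neither assumption, and the component names and rates are hypothetical.

```python
import math

def series_failure_rate(component_rates):
    """Failure rate of a series system: the sum of its component failure rates."""
    return sum(component_rates)

def reliability(failure_rate, hours):
    """Probability of surviving `hours` under a constant failure rate."""
    return math.exp(-failure_rate * hours)

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical component failure rates, in failures per hour.
rates = {"regulator": 2e-6, "microcontroller": 5e-6, "connector": 1e-6}

system_rate = series_failure_rate(rates.values())
mtbf = 1.0 / system_rate

print(f"system failure rate: {system_rate:.2e} per hour")
print(f"MTBF: {mtbf:.0f} hours")
print(f"R(1 year): {reliability(system_rate, 8760):.4f}")
print(f"availability with 24 h MTTR: {availability(mtbf, 24):.6f}")

# Point (c): the component contributing most to system unreliability.
worst = max(rates, key=rates.get)
print(f"largest contributor: {worst}")
```

Changing the MTTR argument while holding the failure rates fixed shows the trade-off in point (d): the same availability can be reached with a less reliable but more quickly repaired system.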
-
Why are Reliability Prediction Techniques Needed?
Traditionally, reliability has been achieved through extensive testing and use of techniques such as probabilistic reliability modeling (these are techniques done in the late stages of development)
The challenge is to design in quality and reliability early in the development cycles
Reliability of a device could be known up-front, during the design phase and before the device is manufactured
This could avoid costly redesign cycles.
-
What Are The Factors That Affect The Reliability Performance of Electronic Components?
Material quality
Operating temperature
Vibration and miscellaneous mechanical factors
Electrical stress levels
-
Introduction to FMEA
(a) Introduction
FMEA stands for Failure Modes and Effects Analysis
It is a methodology designed
(i) to identify potential failure modes for a product or process;
(ii) to assess the risk associated with those failure modes;
(iii) to rank the issues in terms of importance; and
(iv) to identify and carry out corrective actions to address the most serious concerns
-
For easy understanding, just remember that FMEA is intended to document
(i) a Failure
(ii) its Mode
(iii) its Effects
(iv) by Analysis
-
(b) Types of FMEA
There are two standard categories of FMEA:
Design FMEA (DFMEA)
addresses potential failure modes arising during design of components and subsystems
Process FMEA (PFMEA)
addresses potential failure modes arising during manufacturing and assembly processes
-
(c) Process for Conducting FMEA
The process for conducting FMEA is summarized as follows:
(a) Describe product or process
(b) Define Functions
(c) Identify Potential Failure Modes
(d) Describe Effects of Failures
(e) Determine Causes
(f) Detection Methods or Current Controls
(g) Calculate Risks: use Risk Priority Number (RPN)
(h) Take Action
(i) Assess Results
-
A typical FMEA incorporates some method to evaluate the risk associated with the potential problems identified through the analysis. One such method is the Risk Priority Number (RPN).
-
To use the RPN method to assess risk, the analysis team must
(a) Rate the severity of each effect of failure
(b) Rate the likelihood of occurrence for each cause of failure
(c) Rate the likelihood of prior detection for each cause of failure
(d) Calculate the RPN by obtaining the product of the three ratings
RPN = Severity × Occurrence × Detection
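The RPN steps above can be sketched in a few lines. The failure modes and ratings below are hypothetical. The 1 to 10 rating scale, and the convention that a higher detection rating means the cause is harder to detect beforehand, are common practice but are assumptions here, not something the slides specify.

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    severity: int    # rating of the effect of the failure (assumed scale 1-10)
    occurrence: int  # rating of the likelihood that the cause occurs (assumed scale 1-10)
    detection: int   # rating of prior detection (assumed: higher = harder to detect)

    def rpn(self) -> int:
        """Risk Priority Number: the product of the three ratings."""
        return self.severity * self.occurrence * self.detection

# Hypothetical entries for illustration only.
modes = [
    FailureMode("solder joint crack", severity=7, occurrence=4, detection=6),
    FailureMode("wire bond lift", severity=8, occurrence=2, detection=3),
    FailureMode("package delamination", severity=5, occurrence=5, detection=7),
]

# Rank the issues so that the highest-risk failure mode is addressed first.
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN = {mode.rpn()}")
```

Sorting by RPN corresponds to the "rank the issues in terms of importance" step; corrective action would then start from the top of the list.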
-
An Example of FMEA Hazard Assessment
-
(d) Benefits of FMEA
Improve product/process reliability and quality
Increase customer satisfaction
Early identification and elimination of potential product/process failure modes
Prioritize product/process deficiencies
Capture engineering/organization knowledge
-
Introduction to Fault Tree Analysis (FTA)
(a) What is an FTA?
FTA stands for Fault Tree Analysis
It is a graphical representation of the major faults or critical failures associated with a product, the causes for the faults, and potential countermeasures
The tool helps identify areas of concern for new product design or for improvement of existing products. It also helps identify corrective actions to correct or mitigate problems
In a Fault Tree, one works in a "failure space", and looks at system failure combinations
-
(b) When to use it?
FTA is useful both in designing new products/services and in dealing with identified problems in existing products/services.
In the quality planning process, the analysis can be used to optimize process features and goals and to design for critical factors and human error.
As part of process improvement, it can be used to help identify root causes of trouble and to design remedies and
-
(c) Basic Constructs of FTA
The basic constructs in a Fault Tree Diagram are
(a) gates (represent conditions)
(b) events (represent the system failure mode)
The two most commonly used gates are
(a) AND gates
(b) OR gates
If occurrence of either event causes the top event to occur, then these events (blocks) are connected using an OR gate
Alternatively, if both events need to occur to cause the top event to occur, they are connected by an AND gate
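The effect of the two gates on the probability of the top event can be sketched as follows. This assumes the input events are statistically independent, which the slides do not state, and the probabilities used are hypothetical.

```python
from math import prod

def or_gate(probabilities):
    """Output occurs if at least one input occurs (independent inputs assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)

def and_gate(probabilities):
    """Output occurs only if every input occurs (independent inputs assumed)."""
    return prod(probabilities)

# Hypothetical probabilities of events A and B over some mission time.
p_a, p_b = 0.01, 0.02

p_top_or = or_gate([p_a, p_b])
p_top_and = and_gate([p_a, p_b])

print(f"top event via OR gate:  {p_top_or:.4f}")
print(f"top event via AND gate: {p_top_and:.4f}")
```

With the same inputs, the OR gate gives a top-event probability close to the sum of the inputs, while the AND gate gives their product, which is why redundancy (an AND of failures) lowers system failure probability.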
-
Example: For the "Top Event" to occur, either A or B must happen. In other words, failure of A or B causes the system to fail.
Figure: fault tree and its equivalent Reliability Block Diagram
-
Symbols used in FTA
-
(d) How to Perform FTA in 6 steps
1. Select a top level event for analysis. Try to be specific, for example, "Email server down for more than 4 hours." Sources of top level events include Problem/Known Error Records; potential failures from brainstorming; etc.
2. Identify faults that could lead to the top level event. Continuing the above example, some possible faults leading to an outage lasting more than four hours might be "loss of power"; another might be "hardware failure." List all the faults under the top level event in boxes and connect the fault boxes to the top level event box by drawing lines.
-
3. For each fault, list as many causes as possible in boxes below the related fault. Continuing the example above, in the case of "loss of power," some causes might be "electrical outage," "power supply failure," and so on. Connect the boxes to the appropriate fault box.
-
4. Draw a diagram of the "fault tree." Two logic operators, AND and OR, also known as logic gates, are used to represent the sequencing of faults and causes. For example, "Email server down for more than 4 hours" could be caused by "loss of power" or "hardware fault." Another might be "loss of building power" and "battery backup exhausted."
Update faults and causes by grouping logically related items using AND or OR between faults and events; and faults and causes. Re-draw the lines from top level event to logic gates to faults to logic gates to causes.
-
5. Continue identifying causes for each fault until you reach a root cause (reactive FTA), or one that you can do something about (proactive FTA). For example, the root cause of "power supply failure" might be "filter clogged;" the root cause of "battery backup exhausted" might be "battery backup too small."
6. Consider countermeasures. A root cause is one you can do something about; so now you need to think of the countermeasures you might apply to each root cause. List countermeasures for each root cause in a box under the root cause. For example, for "filter clogged" a countermeasure might be "clean filter monthly." Link the countermeasure to the root cause by drawing a line.
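The email server example can be written down as a small data structure to check which combinations of basic events trigger the top event. The tree shape below (an OR at the top, with an AND gate under the power branch) is one assumed reading of the example, and the tuple-based node format is a hypothetical representation chosen for brevity.

```python
# A node is either a basic event (a string) or a gate: (gate_type, [child nodes]).
# Top event: "Email server down for more than 4 hours".
FAULT_TREE = ("OR", [
    ("AND", ["loss of building power", "battery backup exhausted"]),
    "hardware failure",
])

def occurs(node, active):
    """Return True if the event at `node` occurs given the set of active basic events."""
    if isinstance(node, str):
        return node in active
    gate, children = node
    results = [occurs(child, active) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate type: {gate}")

def basic_events(node):
    """List the basic events (leaves) of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    return [leaf for child in children for leaf in basic_events(child)]

print(basic_events(FAULT_TREE))
print(occurs(FAULT_TREE, {"loss of building power"}))
print(occurs(FAULT_TREE, {"loss of building power", "battery backup exhausted"}))
print(occurs(FAULT_TREE, {"hardware failure"}))
```

Under this reading, losing building power alone does not bring the server down as long as the battery backup holds, which is the kind of insight step 4 is meant to expose.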
-
Example:
-
Solution:
-
Reliability Life Testing
(b) Burn-In and Screening
Burn-in:
A process of operating items at elevated stress levels (particularly temperature, humidity and voltage) in order to accelerate the processes leading to failure. The populations of defective items are thus reduced
Screening:
An enhancement to Quality Control whereby additional detailed visual and electrical/mechanical tests seek to reveal defective features which would otherwise increase the population of 'weak' items
-
Several types of ESS testing available are listed as follows:
(i) Temperature Cycling
(ii) Thermal Shock
(iii) Humidity Testing
(iv) Temperature, Humidity, Bias (THB) Testing
-
(i) Temperature Cycling
Refers to the process in which a product is subjected to multiple cycles of changing temperatures between pre-determined extremes at relatively high rates of change, fatiguing and causing inferior product to fail
Cycling will show at what temperature, both high and low, a product will cease to function properly
-
(ii) Thermal Shock
Refers to rapid temperature changes from an extremely cold to a hot environment that thermally shock and stress a product
This causes permanent changes in electrical performance and can cause sudden overloading of materials
Thermal shock failures are due to thermal mismatches or materials with differences in rates of thermal expansion and contraction
-
(iii) Humidity Testing
Humidity testing normally involves high heat to aid in forcing water vapor through weakly sealed components
Many electronic devices are susceptible to the damaging effects of moisture both by direct condensation and indirect effects
Direct condensation is where water comes out of the air and forms droplets on a device
-
These droplets may find their way into the device and attack sensitive components
Common effects include shorting of electrical components and initiation of corrosive effects
Indirect effects are numerous
An example is moisture breaching sealed devices, which results in failures over time
-
Many ALT of semiconductors involve temperature, as semiconductor properties usually have a strong temperature dependency
The most common accelerated test conditions are as follows:
(a) Mechanical Shock
(b) Drop Shock (Test)
(c) Voltage Extremes
(d) High Humidity
(e) Random Vibration Test; etc.
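The slides do not name a model for the temperature dependency. One widely used choice is the Arrhenius relation, sketched below as an assumption. The activation energy and the two temperatures are hypothetical values picked only to show the calculation.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K

def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of the failure-process rate at stress temperature to that at use temperature."""
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)

# Hypothetical values: 0.7 eV activation energy, 55 C in use, 125 C under test.
af = arrhenius_acceleration_factor(0.7, 55.0, 125.0)
print(f"acceleration factor: {af:.1f}")
```

Under this model, one hour at the stress temperature stands in for roughly `af` hours at the use temperature, which is what lets a short accelerated test say something about long-term life.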