statistisk vurdering av ndt-metoders pålitelighetndt.sitegen.no/customers/ndt/files/pod.pdf ·...

Probability of Detection (PoD)

Statistical assessment of the reliability of NDT methods

Håkon Stokka Hasting, Ph.D. 24 November 2008

© Det Norske Veritas AS. All rights reserved. Slide 2, 17 November 2008

The importance of NDT

For all applications that require inspection:

• Proper NDT is a requirement for safety and sustainability

• Higher risk can give higher rewards

• Strong requirements apply to NDT methods for critical and high-risk applications, for instance:

- Nuclear power plants
- Air transport
- Oil and gas

• Optimal utilization of the NDT method is needed to maximise safety, sustainability and profit, and to reduce risk


NDT Reliability

• A methodology is needed to assess the reliability of the NDT method and to document its performance: detection and sizing

• The idea: assess detectability by finding the size of the largest defect that is not detected/reported by the NDT method

• Hit/miss: the outcome of NDT on one defect is either detected or not detected

• Either use field finds or design an experiment to acquire data on defects detected or not detected

• Criteria are needed to decide what performance is acceptable


NDT Reliability

• Simple approach: grouping of defects

• Defects of similar size are grouped; their average hit/miss value is the probability of detection

• Drawback: a high number of defects is needed for sufficient confidence (standard deviation)
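As a rough worked illustration of this drawback, assume (an idealisation not stated on the slide) that the n defects in a group are independent and share one detection probability p. The standard deviation of the observed hit fraction is then

$$\sigma_{\hat{p}} = \sqrt{\frac{p\,(1-p)}{n}}$$

For p = 0.9 this gives about 0.083 with n = 13 and about 0.043 with n = 49, so halving the uncertainty requires roughly four times as many defects per group.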

Size   Obs   Det
0.20   3     0
0.37   3     0
0.43   3     0
0.50   3     0
0.60   3     0
0.63   3     0
0.67   6     0
0.70   3     0
0.73   3     0
0.87   3     3
1.10   3     3
1.17   3     0
1.23   3     0
1.27   3     3
1.33   6     3
1.43   1     1
1.50   3     3
1.53   3     0
1.57   3     3
1.63   3     3
1.65   3     0
1.67   3     3
1.70   3     0
1.73   3     3
1.83   6     6
2.00   3     3
2.17   6     6
2.20   3     3
2.30   3     3
2.33   3     3
2.37   3     3
2.40   3     3
2.43   3     0
2.60   3     3
2.63   3     3
2.70   6     6
2.73   3     3
2.97   3     3
3.10   3     3
3.13   3     3
3.37   3     3
3.40   3     0
3.60   3     3
3.77   3     3
4.07   3     3
4.27   3     3
4.30   3     3
4.37   3     3
4.47   3     3
4.60   3     3
4.63   3     3
4.67   3     3
4.77   3     3
5.10   3     3
5.20   3     3
5.63   3     3
5.73   3     3
5.80   3     3
6.00   3     3
6.23   3     3
6.40   3     3
7.73   3     3


Grouping of Defects

Size: x   Obs: n   Det: m   PoD: p   pc (lower 95% limit)
0.73      49       10       0.204    0.115
1.98      49       43       0.878    0.773
3.04      48       42       0.875    0.768
5.27      49       49       1.000    0.941
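The pc column can be reproduced, at least approximately, with an exact one-sided binomial lower bound. The sketch below is an assumption about how such a bound may be computed, not the method documented on the slide. It solves P(X ≥ m | n, p) = 0.05 for p by bisection, and the function names are illustrative.

```python
from math import comb

def upper_tail(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))

def lower_conf_limit(m, n, conf=0.95, tol=1e-10):
    """Exact one-sided lower confidence limit for m hits in n trials (assumed method)."""
    if m == 0:
        return 0.0
    alpha = 1.0 - conf
    lo, hi = 0.0, 1.0
    # upper_tail is increasing in p, so bisection finds the crossing point.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if upper_tail(m, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Groups from the table above: (mean size, observations n, detections m)
for size, n, m in [(0.73, 49, 10), (1.98, 49, 43), (3.04, 48, 42), (5.27, 49, 49)]:
    print(size, n, m, round(m / n, 3), round(lower_conf_limit(m, n), 3))
```

For a group with no misses the bound has the closed form 0.05^(1/n), which gives about 0.941 for n = 49 and about 0.794 for n = 13.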

[Figure: Probability of Detection vs. defect height (0 to 6). AUT qualification, girth welds 16" x 15.8 mm, API 5L X60; single V, GMAW-RMD + FCAW (GS), 40% FSH threshold. Groups of 13 observations with lower 95% confidence limit. Series: Detections, Observations. Håkon S. Hasting, Det Norske Veritas, 12-11-2008]


Size: x   Obs: n   Det: m   PoD: p   pc (lower 95% limit)
0.37      13       0        0.000    0.000
0.65      13       0        0.000    0.000
1.00      13       6        0.462    0.224
1.44      13       7        0.538    0.287
1.67      13       7        0.538    0.287
2.01      13       13       1.000    0.794
2.22      13       13       1.000    0.794
2.39      13       10       0.769    0.505
2.62      13       13       1.000    0.794
3.01      13       13       1.000    0.794
3.63      13       10       0.769    0.505
4.33      13       13       1.000    0.794
4.70      13       13       1.000    0.794
5.50      13       13       1.000    0.794
6.58      13       13       1.000    0.794


Standard PoD criteria

A statistical approach is beneficial: assume that PoD follows a statistical model

Confidence criteria in the statistical approach:

• Standard confidence criterion: 90% PoD at a 95% confidence level

• The difference between the average PoD value and the lower 95% confidence value shall be less than or equal to 0.1

• The hit/miss nature of the data suggests a binomial distribution:

• X = number of hits (successes), n = number of trials, p = probability of a hit (success), q = 1 - p

• The criteria P = B(r;n,p) ≥ 0.95 and p ≥ 0.9 are fulfilled when the NDT method reveals 29 out of 29 defects

• This means: if 29 out of 29 known defects are successfully detected, the system has proven detection reliability for the given defects with 90% PoD at a 95% confidence level

• Requires reasonable defects to be valid

• Useful approach when comparing two NDT techniques

• Ref: Nordtest Technical Report 394 (approved 1998-04); Nordtest Technical Report 300

Binomial distribution

$$B(r; n, p) = \sum_{x=0}^{r} b(x; n, p)$$

$$b(x; n, p) = \binom{n}{x} p^{x} q^{n-x}$$
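One common reading of the 29-out-of-29 rule: if the true PoD were only 0.9, the chance of n hits in n trials would be 0.9^n, and the demonstration counts at 95% confidence once that chance drops to 5% or below. The sketch below searches for the smallest such n. The function name and the zero-miss assumption are illustrative and not taken from the Nordtest report.

```python
def min_all_hit_trials(pod=0.90, conf=0.95):
    """Smallest n with pod**n <= 1 - conf, assuming every trial is a hit."""
    n = 1
    while pod**n > 1.0 - conf:
        n += 1
    return n

print(min_all_hit_trials())  # 29
```

With 28 trials the probability 0.9^28 is still about 0.052, just above 0.05, which is why 29 is the minimum.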


PoD vs. Flaw Size

• Model PoD as a function of flaw size

• Advantage: data from neighbouring flaw sizes contribute when determining the confidence level

• Binary regression is the appropriate statistical technique

• Requires an appropriate statistical distribution that succeeds in modelling the real detection performance:

- Logistic model (Nordtest Techn. Report 394)
- Cumulative normal distribution (MIL-HDBK-1823 (1999))
- Weibull
- Gompertz-Makeham


PoD-Curve


[Figure: PoD vs. defect height (0 to 6). Series: PoD curve fit based on 195 observations; lower 95% confidence limit of the PoD curve; fitted PoD curve minus 0.1; observations. 90% PoD at 50% confidence at defect height 2.42 mm. x0 = 1.314921, beta = 3.623611, A = 0.010421, B = 0.027061, D = 0.317211]



Model PoD

• Common PoD model (Nordtest Techn. Report 394):

$$\mathrm{PoD}(x; x_0, \beta) = 1 - \frac{1}{1 + \left(\frac{x}{x_0}\right)^{\beta}}$$

• x is the flaw size

• The available data are used to estimate the best-fit parameters x0 and β by Maximum Likelihood Estimation (MLE)
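A minimal sketch of evaluating this model and inverting it for the flaw size at a target PoD, using the fitted values x0 = 1.314921 and beta = 3.623611 quoted with the PoD curve. The function names are illustrative. The inversion x = x0 · (p / (1 − p))^(1/β) follows directly from the formula.

```python
def pod(x, x0, beta):
    """PoD(x; x0, beta) = 1 - 1 / (1 + (x / x0) ** beta)."""
    return 1.0 - 1.0 / (1.0 + (x / x0) ** beta)

def size_at_pod(p, x0, beta):
    """Flaw size at which the fitted curve reaches PoD = p (0 < p < 1)."""
    return x0 * (p / (1.0 - p)) ** (1.0 / beta)

x0, beta = 1.314921, 3.623611
print(pod(x0, x0, beta))                      # 0.5: x0 is the size with 50% PoD
print(round(size_at_pod(0.90, x0, beta), 2))  # close to the 2.42 mm read off the curve
```

In this parameterisation x0 is the flaw size detected half of the time, and β controls how steeply the curve rises around it.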


[Figure: fitted PoD curve with observations vs. defect height (0 to 6). x0 = 1.314921, beta = 3.623611, A = 0.010421, B = 0.027061, D = 0.317211]



Model PoD

• Parameters x0 and β can for instance be estimated by:

- Using a standard statistical software package, like Minitab or S+, for instance the "logistic regression" procedure in Minitab
- Writing a macro or procedure in a suitable program, for instance Excel or Matlab; the parameters can be estimated by Newton-Raphson iteration

• Once the parameters are estimated, PoD and confidence interval can be obtained by:

- Calculations and plotting in Excel; the confidence intervals require a macro to be written
- Writing a macro for the standard statistical software
- Using other appropriate and simple software (e.g. Matlab)

$$\mathrm{PoD}(x; x_0, \beta) = 1 - \frac{1}{1 + \left(\frac{x}{x_0}\right)^{\beta}}$$
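The Newton-Raphson route can be sketched in plain Python. The model is equivalent to a logistic regression on ln x, with logit(PoD) = a + b·ln x, β = b and x0 = exp(−a/b). As input the sketch uses the groups-of-13 values from the grouping slide (mean size, n, m). That is an assumption made for illustration, so the result will only be near the slide's fit, which used the individual observations. The step-halving safeguard is also an addition, not something the slides describe.

```python
import math

# (mean size, observations n, detections m) from the groups-of-13 table
GROUPS = [
    (0.37, 13, 0), (0.65, 13, 0), (1.00, 13, 6), (1.44, 13, 7), (1.67, 13, 7),
    (2.01, 13, 13), (2.22, 13, 13), (2.39, 13, 10), (2.62, 13, 13), (3.01, 13, 13),
    (3.63, 13, 10), (4.33, 13, 13), (4.70, 13, 13), (5.50, 13, 13), (6.58, 13, 13),
]

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def log_likelihood(a, b, groups):
    """Binomial log-likelihood (up to a constant) for logit(PoD) = a + b*ln(x)."""
    total = 0.0
    for x, n, m in groups:
        p = min(max(sigmoid(a + b * math.log(x)), 1e-12), 1.0 - 1e-12)
        total += m * math.log(p) + (n - m) * math.log(1.0 - p)
    return total

def fit_pod(groups, max_iter=100, tol=1e-10):
    """Newton-Raphson MLE; returns (x0, beta) of PoD = 1 - 1/(1 + (x/x0)**beta)."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, n, m in groups:
            z = math.log(x)
            p = sigmoid(a + b * z)
            w = n * p * (1.0 - p)
            g_a += m - n * p          # score with respect to a
            g_b += (m - n * p) * z    # score with respect to b
            h_aa += w                 # entries of the information matrix
            h_ab += w * z
            h_bb += w * z * z
        det = h_aa * h_bb - h_ab * h_ab
        d_a = (h_bb * g_a - h_ab * g_b) / det
        d_b = (h_aa * g_b - h_ab * g_a) / det
        # Halve the step while it would lower the log-likelihood.
        step, base = 1.0, log_likelihood(a, b, groups)
        while step > 1e-8 and log_likelihood(a + step * d_a, b + step * d_b, groups) < base:
            step /= 2.0
        a, b = a + step * d_a, b + step * d_b
        if abs(step * d_a) + abs(step * d_b) < tol:
            break
    return math.exp(-a / b), b

x0, beta = fit_pod(GROUPS)
print(f"x0 = {x0:.3f}, beta = {beta:.3f}")
```

The inverse of the information matrix at convergence gives approximate variances of a and b, which is the usual starting point for the confidence band on the curve.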


Requirements for PoD

• A reliable result from the model is fully dependent on the distribution of the data, e.g. defect sizes and types

• A reliable reference technique is needed to give the true defect size

• Information is needed about both detected and undetected defects

• Information is needed on the most relevant factors that influence the reliability of the NDT technique (e.g. operators, temperature, offset...)

• Proper planning in advance is beneficial for reliable and valid end results


PoD applied: Qualification of AUT

• NDT is applied during the manufacture of offshore pipeline girth welds

• In recent years, AUT has established itself as a replacement for radiography

• According to DNV OS-F101 for offshore pipeline girth weld inspection, use of AUT calls for project-specific system qualification

• The system shall show reliable detection, e.g. at least 90% PoD with 95% confidence, for defects with sizes given in project-specific acceptance criteria (e.g. from ECA)

• Designed experiments are usually necessary


AUT PoD Experiment

• Designed experiment, with conditions as close as possible to the intended usage

• Need a collection of representative flaws:

- Intentionally induced defects in welds
- Full qualification: ~50-70 weld defects
- Relevant distribution of defect types, i.e. LOF, LORP, etc.
- Sufficient spread in position in the weld: root, fill and cap
- Sufficient spread in defect sizes
- Some indications below the applied threshold value

• Conduct experiment on the specimens:

- Repeatability tests
- Reliability tests
- Temperature stability tests


AUT PoD Experiment

• Idea:

• Scan welds with AUT scanner

• Slice the weld afterwards and investigate the true size of the flaws

• Compare the defect size reported from the AUT scans with the true defect size

• Make valid assumptions about performance based on this experiment by applying PoD statistics


AUT Particulars

• Fixed or phased array

• Zonal Pulse-Echo (PE) channels

- Calibration against FBHs (usually 2 or 3 mm Ø, set to 80% FSH) and notches
- Sensitivity setting by specifying a detection/reporting threshold (% of FSH)

• TOFD (Time-of-Flight Diffraction)

• Volumetric / mapping channels

• Use of information varies among the different channel types

- Detection based on PE only, or on PE and TOFD/volumetric channels
- Height sizing

FBH: Flat Bottom Hole; FSH: Full Screen Height


AUT Scanner


Repeatability Tests

• Each weld scanned twice (CW and CCW) by two different teams of operators

• 10 sequential scans on the calibration block for different positions:

- 12 o'clock
- 6 o'clock
- 2G (vertical)
- 6G (45˚)

• Max amplitude of FBH and notches shall not vary by more than 2 dB between the scans
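A minimal sketch of the 2 dB check, assuming peak amplitudes are recorded in % FSH and that "vary" means the spread between the largest and smallest reading over the scans (one possible interpretation). The amplitude ratio is converted with the usual 20·log10 rule. The readings below are hypothetical.

```python
import math

def amplitude_spread_db(amplitudes_fsh):
    """Spread between the largest and smallest amplitude, in dB."""
    return 20.0 * math.log10(max(amplitudes_fsh) / min(amplitudes_fsh))

def is_repeatable(amplitudes_fsh, limit_db=2.0):
    """True if the spread over all scans stays within the limit."""
    return amplitude_spread_db(amplitudes_fsh) <= limit_db

# Hypothetical peak amplitudes (% FSH) of one FBH over 10 sequential scans
scans = [80, 78, 82, 79, 81, 80, 77, 83, 80, 79]
print(round(amplitude_spread_db(scans), 2), is_repeatable(scans))
```

For reference, 2 dB corresponds to an amplitude ratio of about 1.26, so an 80% FSH reflector may range over roughly 64% to 100% FSH before the limit is exceeded.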


Repeatability tests


Temperature Tests

• Initial scan on the calibration block, then one scan of the non-heated weld

• Weld heated up to 90˚C

• Then 15 cycles of:

- One scan of the heated test coupon
- One scan of the calibration block

• Each cycle shall not take more than 5 minutes


Reliability Tests

• One scan on the calibration block, followed by

• One scan on each test weld coupon, clockwise direction (CW)

• Repeated for counterclockwise direction (CCW)

• Optionally, CW and CCW scans repeated by a different team of operators

• Optionally, a series of CW scans done at elevated temperature


Reliability Tests

• Choose positions for macro sectioning

• Mark up welds for macro sectioning by use of the AUT scanner

• Salami-slicing of the weld at macro positions:

- Cut
- Grind/polish
- Photographs with scale
- Measure and document results


Reliability Tests

• AUT results and macro results compared:

• Calculate PoD based on true flaw size (from macro) to establish detectability

- Evaluate hit/miss as a function of detection threshold
- 40% FSH is a typically applied threshold
- Different sensitivity in different zones
- "Probability of Disregarding" defects

• Evaluate height sizing accuracy

• Threshold settings can be evaluated when used with ECA acceptance criteria
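Evaluating hit/miss as a function of the detection threshold can be sketched as below. An indication is assumed to count as a hit when its peak amplitude reaches the threshold (in % FSH). That rule, the function name and the sample values are illustrative assumptions, not taken from the qualification data.

```python
def hit_miss(peak_amplitudes_fsh, threshold_fsh=40.0):
    """1 if the peak amplitude reaches the reporting threshold, else 0."""
    return [1 if a >= threshold_fsh else 0 for a in peak_amplitudes_fsh]

# Hypothetical (true height in mm from macro, peak amplitude in % FSH) pairs
indications = [(0.4, 12.0), (0.9, 35.0), (1.3, 42.0), (2.1, 95.0), (3.0, 61.0)]
amplitudes = [amp for _, amp in indications]

for threshold in (20.0, 40.0, 80.0):
    print(threshold, hit_miss(amplitudes, threshold))
# 20.0 [0, 1, 1, 1, 1]
# 40.0 [0, 0, 1, 1, 1]
# 80.0 [0, 0, 0, 1, 0]
```

Each threshold yields its own hit/miss vector against the true heights, and therefore its own PoD curve, which is how a threshold setting can be weighed against the acceptance criteria.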


Example 1: AUT Qualification Results

• Girth welds 16" x 15.8 mm

• Single V, GMAW-RMD + FCAW (GS)

• Each trial weld scanned three times: CW, CCW and High T (CW)

• About 110 different defects (41 intentional, the rest supplementary and additional)


Example 1: PoD-curve

• General PoD curve, all defects merged in the analysis

• PoD curve (black) is the model fit to the data

• Confidence band (red) reflects the match between experiment data and model

[Figure: Probability of Detection vs. defect height (0 to 6). PoD example of AUT qualification data, 40% FSH threshold, with observations. x0 = 1.314921, beta = 3.623611, A = 0.010421, B = 0.027061, D = 0.317211]



Example 1: Specific PoD


[Figure: PoD vs. defect height (0 to 4). PoD example of Root AUT qualification data, 40% FSH threshold, with observations. x0 = 0.410011, beta = 37.029951, A = 15976576.967351, B = -1087979142.17, D = 247568541636.611001]

[Figure: PoD vs. defect height (0 to 6). PoD example of High T AUT qualification data, 40% FSH threshold, with observations. x0 = 1.217631, beta = 3.367571, A = 0.027811, B = 0.061551, D = 0.778011]

[Figure: PoD vs. defect height (0 to 6). PoD example of CW+CCW AUT qualification data, 40% FSH threshold, with observations. x0 = 1.205611, beta = 3.360031, A = 0.013561, B = 0.030591, D = 0.387441]


• The high-temperature scan seems to show somewhat poorer performance

• Good detectability in the root channel, but this is based on only a few defects


Example 1: Height Sizing

[Figure: AUT defect height vs. macro defect height (both axes 0 to 8), with the line AUT Height = Macro Height. Example AUT height sizing, AUT height measurements, preliminary results, 196 observations. Systematic error: -0.16; standard deviation: 1.48; 95% limit against undersizing (5% error fractile): -2.43 mm undersizing; 95% error fractile: 2.27 mm]

• Height sizing accuracy gives some help in interpreting the PoD

• Large under- and oversizing occur; the systematic error is close to 0
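The sizing statistics quoted with the plot can be sketched as follows, taking the error as AUT height minus macro height so that negative values mean undersizing. Whether the slide's fractiles are empirical or model-based is not stated. The sketch uses empirical quantiles, and the paired heights are made-up illustration values.

```python
import statistics

def sizing_stats(aut_heights, macro_heights):
    """Systematic error, standard deviation and 5%/95% error fractiles."""
    errors = [a - m for a, m in zip(aut_heights, macro_heights)]
    cuts = statistics.quantiles(errors, n=20)  # cut points at 5%, 10%, ..., 95%
    return {
        "systematic_error": statistics.mean(errors),
        "std_dev": statistics.stdev(errors),
        "fractile_05": cuts[0],    # limit against undersizing
        "fractile_95": cuts[-1],
    }

# Hypothetical paired heights in mm (macro = reference, aut = reported)
macro = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
aut = [1.4, 1.0, 2.6, 2.0, 3.5, 3.0, 4.6, 4.0]

for name, value in sizing_stats(aut, macro).items():
    print(f"{name}: {value:.2f}")
```

A systematic error near zero combined with a large standard deviation, as in the example data on the slide, means the undersizing fractile rather than the mean is what matters for acceptance.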


Example 1: Height Sizing

[Figure: AUT defect height vs. macro defect height (both axes 0 to 8), with the line AUT Height = Macro Height. Example Root AUT height sizing, AUT height measurements, preliminary results, 27 observations. Systematic error: -0.97; standard deviation: 0.98; 95% limit against undersizing (5% error fractile): -2.54 mm undersizing; 95% error fractile: 0.36 mm]

[Figure: AUT defect height vs. macro defect height (both axes 0 to 8), with the line AUT Height = Macro Height. Example Fill AUT height sizing, AUT height measurements, preliminary results, 86 observations. Systematic error: 0.97; standard deviation: 1.18; 95% limit against undersizing (5% error fractile): -1.17 mm undersizing; 95% error fractile: 2.89 mm]

Further break-down shows:

• Large undersizing in the root channel

• Mainly oversizing in the fill channel


Example 2: AUT Qualification Results

• Project: girth welds 16" x 27.4 mm (J-prep) and 6" x 12.7 mm (V-prep)

• Trials backed with historical qualification data: J-prep GMAW, 16 mm - 25 mm thickness; V-prep SMAW, 8 mm - 11 mm thickness

• More than 200 individual defects


Example 2: PoD-Curve


[Figure: PoD vs. defect height (0 to 4). PoD example 2 AUT qualification data, 40% FSH threshold, with observations. x0 = 0.016341, beta = 0.627891, A = 0.001781, B = 0.014641, D = 0.123731]

• Implausible PoD result, caused by a poor match between data and model


Example 2: PoD specific

[Figure: Probability of Detection vs. defect height (0 to 5). PoD Example 2, V-prep qualification data, 40% FSH threshold, with observations. x0 = 0.072271, beta = 0.826971, A = 0.013871, B = 0.051681, D = 0.211301]

[Figure: Probability of Detection vs. defect height (0 to 4). PoD Example 2, J-prep qualification data, 40% FSH threshold, with observations. x0 = 0.073931, beta = 1.210251, A = 0.017561, B = 0.094901, D = 0.542811]

• Further break-down shows poor PoD for V-prep welds

• The analysis lacks information from small, "non-detected" defects


Example 3

• Girth welds from two different pipelines: 15" x 24 mm and 14" x 19.7 mm

• Each defect scanned once

• 76 different defects included


Example 3: PoD

• Unreasonable PoD result: 90% PoD at 95% confidence is reached only at 2.5 mm, although no undetected defect is larger than 1 mm

[Figure: PoD vs. defect height (0 to 5). PoD Example 3, qualification data, with observations. x0 = 0.600781, beta = 2.164071, A = 0.007861, B = 0.010731, D = 0.318681]



Example 3: PoD Model Updated


[Figure: PoD vs. defect height (0 to 4). PoD Example 3, qualification data, 3-parameter model: Gompertz-Makeham, with observations. Gamma = 0.001001, Kappa = 10.856831, Lambda = 0.601481, A = 0.000041, B = -0.053971, C = -0.001591, D = 67.836501, E = 1.856871, F = 0.120261]

• Applied a more flexible 3-parameter distribution that models the actual data better


Remember...

• Statistics is applied in order to draw valid conclusions about NDT performance from a limited amount of qualification data

• Probability of Detection (PoD) refers to the detectability achieved by an NDT method

• The result of a PoD study depends to a large extent on the design of the PoD experiment

• The experiment should be performed under conditions as close as possible to normal operation; all relevant factors that influence the reliability of the method should be included

• To assess detectability, information from defects not detected or disregarded by the NDT technique has to be included in the PoD analysis
