Validation Metrics - NCSU

Kathryn Maupin

Laura Swiler

June 28, 2017

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND NO. 2017-XXX

Overview

Model Validation

Data Classification

Validation Metrics

Examples

Conclusions


Model Validation

The comparison of experimental observations with model output

Observed values contain uncertainties: experimental measurement error, limited/incomplete data, model form error, parameter uncertainty, and approximation/discretization error

Validation Metric: quantifies the difference between physical and simulation observations

Observations may be considered with or without uncertainties


Data Types

Type 1: Experimental and model values are treated without uncertainty

Type 2: Experimental values are treated as uncertain
  Nominal value, given by experts
  Standard deviation, calculated using multiple experiments

Type 3: Experimental and model values are treated as uncertain
  Uncertainty analysis


Data Types

Type 1 (no uncertainties)

Type 2 (experimental uncertainty)


Data Types

Type 3 (experimental and model uncertainty)


Classification of Validation Metrics

Metric  Type 1  Type 2  Type 3
Root Mean Square ✓
Minkowski Distance ✓
Simple Cross Correlation ✓
Normalized Cross Correlation ✓
Normalized Zero-Mean Sum of Squared Distances ✓
Moravec Correlation ✓
Index of Agreement ✓
Sprague-Geers Metric ✓
Normalized Euclidean Metric ✓
Mahalanobis Distance ✓
Hellinger Metric ✓ ✓
Kolmogorov-Smirnov Test ✓ ✓
Kullback-Leibler Divergence ✓
Symmetrized Divergence ✓
Jensen-Shannon Divergence ✓
Total Variation Distance ✓


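Example Validation Metrics

Type 1 Data: Minkowski Distance (l_p Distance)

\[ d = \left( \sum_i |P_i - D_i|^p \right)^{1/p} \]

Type 2 Data: Mahalanobis Distance

\[ d = \sqrt{(P - D)^T \Sigma_D^{-1} (P - D)} \]

Type 3 Data: Kullback-Leibler Divergence

\[ D_{KL}(N_D \,\|\, N_P) = \frac{1}{2} \left[ \operatorname{tr}\left(\Sigma_P^{-1} \Sigma_D\right) + (P - D)^T \Sigma_P^{-1} (P - D) - k + \ln\left( \frac{\det \Sigma_P}{\det \Sigma_D} \right) \right] \]

Type 3 Data: Kolmogorov-Smirnov Test

\[ D_{KS} = \sup_x |F_P(x) - F_D(x)| \]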
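As a minimal sketch of the Type 1 and Type 2 formulas above, the following uses only the Python standard library. The function names and sample values are hypothetical, and the Mahalanobis version assumes a diagonal covariance (independent measurement errors with standard deviations `sigma_D`), which is a simplification of the general matrix form.

```python
import math

def minkowski_distance(P, D, p=2):
    """l_p distance between model values P and data values D."""
    return sum(abs(pi - di) ** p for pi, di in zip(P, D)) ** (1.0 / p)

def mahalanobis_distance_diag(P, D, sigma_D):
    """Mahalanobis distance under the ASSUMPTION that Sigma_D is diagonal,
    i.e. Sigma_D = diag(sigma_D[i] ** 2)."""
    return math.sqrt(sum(((pi - di) / si) ** 2
                         for pi, di, si in zip(P, D, sigma_D)))

# Hypothetical values, for illustration only.
P = [1.0, 2.0, 3.0]        # model predictions
D = [1.0, 2.0, 5.0]        # experimental nominal values
sigma_D = [0.5, 0.5, 0.5]  # experimental standard deviations

d1 = minkowski_distance(P, D, p=1)
d2 = minkowski_distance(P, D, p=2)
dm = mahalanobis_distance_diag(P, D, sigma_D)
print(d1, d2, dm)
```

With correlated measurement errors the full covariance matrix would be needed, and a linear-algebra library would be the more natural choice.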


Example 1

“Experiment”: f(x) = 1.1 log(10x) + ε

Model: g(x) = log(10x)

with ε = measurement error and x_i = 1, 2, ..., 20

Type 1 Data

Metric                                 Value            Min  Max
Root Mean Square                       4.4706 × 10^-1   0    ∞
Average Relative Minkowski Distance
  p = 1                                8.9337 × 10^-2   0    ∞
  p = 2                                3.0483 × 10^-3   0    ∞
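The setup of Example 1 can be sketched as follows. Several details are assumptions not stated on the slide: `log` is taken as the natural logarithm, the measurement error is drawn as zero-mean Gaussian noise with a hypothetical standard deviation `noise_sd`, and the root mean square is taken as the square root of the mean squared difference. The sketch is not expected to reproduce the tabulated values exactly.

```python
import math
import random

def experiment(x, rng, noise_sd=0.05):
    # "Experiment": f(x) = 1.1 log(10x) + eps.
    # ASSUMPTION: natural log and Gaussian noise with hypothetical noise_sd.
    return 1.1 * math.log(10 * x) + rng.gauss(0.0, noise_sd)

def model(x):
    # Model: g(x) = log(10x)
    return math.log(10 * x)

def rms(P, D):
    """Root mean square difference between model values P and data values D."""
    return math.sqrt(sum((p - d) ** 2 for p, d in zip(P, D)) / len(P))

rng = random.Random(0)            # fixed seed so the run is repeatable
xs = range(1, 21)                 # x_i = 1, 2, ..., 20
D = [experiment(x, rng) for x in xs]
P = [model(x) for x in xs]
print(rms(P, D))
```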


Example 1

Type 2 Data



Metric                        Value (5%)  Value (10%)  Min  Max
Average Mahalanobis Distance  3.5891      1.7946       0    ∞



Metric                       Value (5%)      Value (10%)     Min  Max
Kullback-Leibler Divergence
  Total                      1.5608 × 10^2   3.9162 × 10^1   0    ∞
  Average per point          7.8041          1.9581          0    ∞
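A total and a per-point average of this kind can be sketched for the univariate case (k = 1), where the Gaussian Kullback-Leibler formula reduces to scalar variances. The slides do not state how the total and average are formed. The sketch assumes the total is the sum of per-point divergences and the average is that sum divided by the number of points. All sample values are hypothetical.

```python
import math

def kl_normal(mu_D, sd_D, mu_P, sd_P):
    """KL(N_D || N_P) for univariate normal distributions (k = 1)."""
    var_D, var_P = sd_D ** 2, sd_P ** 2
    return 0.5 * (var_D / var_P
                  + (mu_P - mu_D) ** 2 / var_P
                  - 1.0
                  + math.log(var_P / var_D))

# Hypothetical (mean, standard deviation) pairs at two points.
data = [(1.0, 0.1), (2.0, 0.1)]   # experimental distributions N_D
pred = [(1.1, 0.1), (2.0, 0.2)]   # model distributions N_P

per_point = [kl_normal(md, sd, mp, sp)
             for (md, sd), (mp, sp) in zip(data, pred)]
total = sum(per_point)              # ASSUMPTION: total = sum over points
average = total / len(per_point)    # ASSUMPTION: average = total / n
print(total, average)
```

The divergence is not symmetric: swapping the roles of N_D and N_P generally gives a different value.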



Metric                   Value (5%)       Value (10%)      Min  Max
Kolmogorov-Smirnov Test
  Maximum                9.7072 × 10^-1   7.2466 × 10^-1   0    1
  Average                9.3431 × 10^-1   6.4907 × 10^-1   0    1
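The Kolmogorov-Smirnov statistic D_KS = sup |F_P(x) − F_D(x)| at a single point can be approximated by evaluating both cumulative distribution functions on a grid. The sketch below assumes normal distributions for both model and experiment and a hypothetical evenly spaced grid. It illustrates the statistic itself, not the procedure used to produce the table.

```python
import math

def normal_cdf(x, mu, sd):
    """CDF of a normal distribution with mean mu and standard deviation sd."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sd * math.sqrt(2.0))))

def ks_distance(cdf_P, cdf_D, lo, hi, n=2001):
    """Approximate sup_x |F_P(x) - F_D(x)| on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(abs(cdf_P(lo + i * step) - cdf_D(lo + i * step))
               for i in range(n))

# Identical distributions: the statistic is 0.
d_same = ks_distance(lambda x: normal_cdf(x, 0.0, 1.0),
                     lambda x: normal_cdf(x, 0.0, 1.0), -5.0, 5.0)

# Hypothetical model shifted by one standard deviation from the data.
d_shift = ks_distance(lambda x: normal_cdf(x, 1.0, 1.0),
                      lambda x: normal_cdf(x, 0.0, 1.0), -5.0, 6.0)
print(d_same, d_shift)
```

Because the statistic is a difference of probabilities, it is bounded between 0 and 1, matching the Min and Max columns of the table.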


Example 2

“Experiment”: f(x) = sin^2(x) + 5 + ε

Model: g(x) = sin^2(x) + θ

with ε = measurement error

Type 1 Data

Metric                                 Value            Min  Max
Root Mean Square                       1.0242 × 10^-1   0    ∞
Average Relative Minkowski Distance
  p = 1                                1.5291 × 10^-2   0    ∞
  p = 2                                1.8371 × 10^-2   0    ∞



Metric                        Value (5%)       Value (10%)      Min  Max
Average Mahalanobis Distance  7.3463 × 10^-1   3.6817 × 10^-1   0    ∞



Metric                       Value (5%)       Value (10%)     Min  Max
Kullback-Leibler Divergence
  Total                      5.8205 × 10^1    2.4860 × 10^2   0    ∞
  Average per point          4.0703 × 10^-1   1.7384          0    ∞



Metric                   Value (5%)       Value (10%)      Min  Max
Kolmogorov-Smirnov Test
  Maximum                7.2276 × 10^-1   5.8143 × 10^-1   0    1
  Average                2.6065 × 10^-1   3.0401 × 10^-1   0    1


Conclusions

Computational models require validation before they can be reliably used in prediction scenarios

Metrics exist for deterministic data and probabilistic (uncertain) data

Choice of metric depends on the application and the quantity of interest

Future work: develop a guide to determine the choice of validation metric and the validation tolerance
