eiv regression - final - may 2007

Upload: rpcovert

Post on 30-May-2018


  • 8/14/2019 EIV Regression - Final - May 2007

    1/45

    08/10/09

    MCR, LLC

    Reprinted with permission of MCR, LLC

    Advances in CER Development:

    Errors-in-Variables Regression

    Raymond Covert
    Technical Director, MCR, LLC

    [email protected]

    Presented to the

    European Aerospace Working Group on Cost Engineering (EACE)

    Frascati, Italy

    24-25 April 2007


    Agenda

    Introduction

    Errors-In-Variables Regression

    Sources of Uncertainty

    Examples

    CER Regression with Normalization Uncertainty

    CER Regression with Fuzzy Cost Drivers

    Spacecraft EPS NR and REC CER Example

    EIV Modeling Benefits and Drawbacks

    Summary


    Introduction

    At the Joint EACE/SSCAG/SCAF Meeting we introduced Errors-in-Variables (EIV) regression [Ref. 1]
      - Used fictitious data to highlight effects of uncertainty and fuzzy variables in CER development

    Recently, we experimented with EIV regression using real cost data
      - Abandoned all assumptions and began treating the normalization and regression problem as a completely random process
      - Exposed additional sources of uncertainty that warrant use of EIV regression over traditional methods
      - Exposed strengths and weaknesses of the techniques

    This presentation provides highlights of the original presentation and recent advances


    Background

    Traditional regression techniques used in statistically derived cost and schedule relationships:
      - Ordinary least squares (OLS)
          - Minimizes sum of squares of errors
          - For linear relationships with additive error term: y = a + b*x + e
      - Log-OLS
          - Minimizes sum of squares of log of errors
          - For power relationships with multiplicative error term: y = a * x^b * e
      - Constrained optimization
          - Minimizes a penalty function (sum of squares of errors, percent errors, etc.) while constraining some other term (e.g., bias = 0)

    Traditional regression techniques used in cost analysis assume that independent variables are constant and known exactly
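As a rough illustration of the two closed-form techniques named on this slide, the sketch below fits a linear CER by OLS and a power CER by OLS on the logarithms. The data are hypothetical and noise-free, and the function names `ols_fit` and `log_ols_fit` are illustrative only; they are not taken from the presentation.

```python
import math

def ols_fit(xs, ys):
    """Closed-form simple OLS for y = a + b*x (additive error)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def log_ols_fit(xs, ys):
    """Fit y = a * x**b by OLS on (ln x, ln y) (multiplicative error)."""
    ln_a, b = ols_fit([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(ln_a), b

# Hypothetical, noise-free data, so each fit recovers its generating coefficients.
xs = [1.0, 2.0, 4.0, 8.0]
linear_ys = [3.0 + 2.0 * x for x in xs]   # generated from y = 3 + 2x
power_ys = [5.0 * x ** 0.7 for x in xs]   # generated from y = 5 * x^0.7

a_lin, b_lin = ols_fit(xs, linear_ys)
a_pow, b_pow = log_ols_fit(xs, power_ys)
print(a_lin, b_lin, a_pow, b_pow)
```

Both fits treat every x value as exact, which is the assumption the rest of the presentation relaxes.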


    Uncertainty in Dependent and Independent Variables

    However, even independent variables are not necessarily constant parameters

    They may be random variables with uncertainty due to the following:
      - Normalizing data (nonrecurring [NR] and recurring [REC] split, inflation assumptions, treatment of qualification and engineering units [EU])
      - Uncertain cost driver values (multiple versions of weight, power, etc.)
      - Fuzzy cost drivers such as percent new design, manufacturing complexity, design difficulty, etc.

    Is there a method of regression that accommodates this additional uncertainty?

    Both independent and dependent variables may be uncertain parameters (random variables)


    Errors-in-Variables Regression

    Errors-in-variables (EIV) is a robust modeling technique in statistics that assumes every variable can have error or noise
      - Also referred to as Total Least Squares (TLS)
      - Started with R. J. Adcock's one-page paper "A Problem in Least Squares" in The Analyst (Des Moines, Iowa) in 1878

    Simple linear regression (OLS) is a special case in which we assume no measurement errors in independent variables

    P. Foussier presented EIV regression from a different perspective in his 2006 ISPA Conference paper "Palliating the Bias Introduced by Linear Regression" [Ref. 2]
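To make the contrast with OLS concrete, here is a minimal sketch of total least squares for a straight line (orthogonal regression). It assumes equal error variance in x and y and a nonzero covariance; the data and the function names are hypothetical. It is the textbook closed form, not the simulation-based procedure used later in this presentation.

```python
import math

def moments(xs, ys):
    """Means and centered sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy

def ols_line(xs, ys):
    """y = a + b*x, minimizing vertical distances (x assumed exact)."""
    mx, my, sxx, _, sxy = moments(xs, ys)
    b = sxy / sxx
    return my - b * mx, b

def tls_line(xs, ys):
    """y = a + b*x, minimizing perpendicular distances (x and y both noisy)."""
    mx, my, sxx, syy, sxy = moments(xs, ys)
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    return my - b * mx, b

# Hypothetical data scattered around y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.0]

a_ols, b_ols = ols_line(xs, ys)
a_tls, b_tls = tls_line(xs, ys)
print(b_ols, b_tls)
```

For positively correlated data the TLS slope falls between the slope of y regressed on x and the inverse of the slope of x regressed on y, so it is never flatter than the OLS slope.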


    Advancing the State of the Art in Regression

    Constrained optimization allows unprecedented freedom over traditional methods

    OLS and Log-OLS are simple, analytical solutions, but
      - They restrict us to CERs of the form y = a + b*x + e or y = a * x^b * e, where [a, b] are coefficients and e is an additive or multiplicative error term, respectively
      - OLS produces constant error term, Log-OLS produces biased results (that require correction)

    Constrained optimization allows freedom to choose
      - Form of the CER (e.g., y = a + b * x^c * q^d * e)
      - How we wish to model the error term, multiplicative or additive
      - Whether to eliminate bias (which we can constrain to zero)

    EIV modeling allows
      - Same freedoms of constrained optimization
      - Ability to include effects of uncertainty in our data
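The zero-bias, minimum-percent-error idea can be sketched for the simplest multiplicative form, y = a * x^b * e. For a fixed exponent b, the zero-bias constraint fixes the scale at a = mean(y / x^b), so only b has to be searched. The sketch below uses a plain grid search rather than a commercial solver; the data, the function names, and the n - 1 divisor in the percent standard error are assumptions made for illustration.

```python
import math

def zpb_scale(xs, ys, b):
    """Scale a that makes the mean percent error of a * x**b exactly zero."""
    return sum(y / x ** b for x, y in zip(xs, ys)) / len(xs)

def percent_errors(xs, ys, a, b):
    """(actual - estimate) / estimate for each data point."""
    return [(y - a * x ** b) / (a * x ** b) for x, y in zip(xs, ys)]

def percent_se(pe):
    """Percent standard error; the n - 1 divisor is an assumption."""
    return math.sqrt(sum(e * e for e in pe) / (len(pe) - 1))

def fit_zpb_mpe(xs, ys, b_grid):
    """Grid search over b; a is set by the zero-bias constraint at each b."""
    best = None
    for b in b_grid:
        a = zpb_scale(xs, ys, b)
        se = percent_se(percent_errors(xs, ys, a, b))
        if best is None or se < best[2]:
            best = (a, b, se)
    return best

# Hypothetical data: y = 4 * x^0.8 with fixed multiplicative disturbances.
xs = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
noise = [1.10, 0.92, 1.05, 0.95, 1.03, 0.97]
ys = [4.0 * x ** 0.8 * e for x, e in zip(xs, noise)]

b_grid = [i / 1000 for i in range(0, 2001)]
a_fit, b_fit, se_fit = fit_zpb_mpe(xs, ys, b_grid)
bias_fit = sum(percent_errors(xs, ys, a_fit, b_fit)) / len(xs)
print(a_fit, b_fit, se_fit, bias_fit)
```

CER forms with an additive constant, such as those used in the examples later in this presentation, do not reduce to a one-dimensional search and need a general constrained optimizer.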


    Regressing Constant Variables

    OLS, Log-OLS and constrained optimization regression techniques assume constant values for independent (x) variables

    [Figure: cost, $ (y), versus cost driver (x), showing historical data points, the cost estimating relationship Cost = a + b*x^c, and standard percent error bounds]

    Traditional regression techniques used in cost analysis assume that independent variables are constant and known exactly


    Regressing Random Variables

    EIV regression assumes uncertain (random) values for both independent (x) and dependent (y) variables

    [Figure: cost, $ (y), versus cost driver (x), showing historical data distributions (random values for (x, y)), the cost estimating relationship Cost = a + b*x^c, and standard percent error bounds]

    EIV regression assumes variables are uncertain (random)


    EIV Regression Tools and Techniques

    We need two tools to perform EIV regression
      - Monte Carlo simulation to model uncertainty
      - Constrained optimization tool to solve for CER coefficients under constraint (e.g., zero bias)

    We tested two methods to perform EIV regression
      - Crystal Ball with OptQuest, which has these two tools built into one
          - Benefits: Simple spreadsheet application, search for global minimum
          - Drawbacks: Coefficient search takes a lot of time (hours)
      - Dump trials into a spreadsheet and perform regression using Premium Solver
          - Benefits: Finds minimum rather quickly (minutes)
          - Drawbacks: May not be global minimum, spreadsheet is large


    EIV Using Crystal Ball with OptQuest

    Uncertain variables can be modeled in a spreadsheet using a statistical simulation tool (Crystal Ball) with optimization capability (OptQuest)
      - Random variables are defined for the uncertain variables that constitute the x, y data points: cost drivers, normalization assumptions
      - Outputs (forecasts) from the statistical simulation are defined for bias and percent standard error
      - CER coefficients are defined as decision variables: find the optimum coefficients that give the minimum mean of standard error under a (near) zero bias constraint

    During optimization, random values are generated for the x and y variables; the examples that follow use 5000 trials
      - CER coefficients are tested for each set of (5000) trials
      - OptQuest determines optimum coefficients using scatter search and tabu search techniques (does not find minima via gradient approach)
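The simulate-then-evaluate loop described on this slide can be sketched in plain Python. The sketch draws random (x, y) trials around hypothetical nominal data, computes the mean percent standard error and mean bias of each candidate coefficient set over those trials, and keeps the best candidate whose mean bias is within ±0.5%. The distributions, the CER form y = a * x^b, the candidate list and all names are assumptions; a short candidate list stands in for the scatter and tabu search that OptQuest performs.

```python
import math
import random

def trial_stats(coeffs, xs, ys):
    """Percent SE and percent bias of y = a * x**b for one simulated data set."""
    a, b = coeffs
    pe = [(y - a * x ** b) / (a * x ** b) for x, y in zip(xs, ys)]
    n = len(pe)
    bias = sum(pe) / n
    se = math.sqrt(sum(e * e for e in pe) / (n - 1))  # n - 1 divisor is an assumption
    return se, bias

def make_trials(nominal_xs, nominal_ys, n_trials, rng):
    """Draw uncertain x (uniform, +/-10%) and y (normal, 5% sigma) per trial."""
    trials = []
    for _ in range(n_trials):
        xs = [x * rng.uniform(0.9, 1.1) for x in nominal_xs]
        ys = [y * rng.gauss(1.0, 0.05) for y in nominal_ys]
        trials.append((xs, ys))
    return trials

def mean_stats(coeffs, trials):
    """Average the per-trial SE and bias; these are the simulation forecasts."""
    stats = [trial_stats(coeffs, xs, ys) for xs, ys in trials]
    mean_se = sum(se for se, _ in stats) / len(stats)
    mean_bias = sum(bias for _, bias in stats) / len(stats)
    return mean_se, mean_bias

rng = random.Random(2007)
nominal_xs = [2.0, 4.0, 8.0, 16.0, 32.0]
nominal_ys = [10.0 * x ** 0.6 for x in nominal_xs]
trials = make_trials(nominal_xs, nominal_ys, 2000, rng)  # same trials for every candidate

candidates = [(10.0, 0.6), (12.0, 0.5), (8.0, 0.7)]
scored = [(c,) + mean_stats(c, trials) for c in candidates]
feasible = [s for s in scored if abs(s[2]) <= 0.005]      # near-zero mean bias
best = min(feasible, key=lambda s: s[1])                  # minimum mean percent SE
print(best)
```

Reusing the same trials for every candidate keeps the comparison between candidates free of sampling noise, at the cost of holding all trials in memory.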


    Optimizing Using Crystal Ball and Premium Solver

    Model uncertain variables in spreadsheet using statistical simulation tool such as Crystal Ball
      - Random variables defined for uncertain variables that constitute x, y data points
          - Cost driver uncertainty
          - NR/REC split
          - Quantities (EDUs, Qual and Protoqual units)
          - Inflation

    Outputs (forecasts) from statistical simulation are defined
      - Uncertain input variables (cost drivers, quantities)
      - Output variables (nonrecurring and recurring costs)
      - Trial values (1000 trials) for each are dumped into a spreadsheet

    Data are regressed using constrained optimization (Premium Solver)
      - Uses a combined scatter search and gradient approach to find global minimum for percent error under the constraint bias = 0
      - Produces coefficients for CER


    Sources of Uncertainty

    Uncertainty in x, y variables can originate from:
      - Assumptions of the normalization process
          - A posteriori values of cost drivers (e.g., weight) are typically chosen as best hardware cost drivers; however, we do not know the a priori values with certainty (weight is estimated at program start)
          - Treatment of qualification units and EUs (factor of T1 cost?)
          - How much of total cost is NR vs. REC? We typically rely on contractor inputs, guesses and assumptions
          - Applying inflation (particularly to older data points): should older cost data be treated with more uncertainty? (Yes)
      - Incomplete, inconsistent or otherwise fuzzy data
          - How should we model parameters such as new design percentage?
      - Combining data from multiple data sources/models
          - All vendors treat cost data differently

    We can use errors-in-variables constrained optimization to find coefficients for CERs with uncertain data by accounting for uncertainty in the normalization process


    Cost Model Development Flow

    [Figure: flow diagram with the stages Data Collection (cost reports, schedule reports, design reviews, cost drivers), Data Normalization (filter contract riders, scope, quantity, inflation, technology), Regression, Statistics, CER Development and CER Documentation (CER functions and coefficients, data points, WBS definition, normalization assumptions, fit statistics, data statistics). Sources of uncertainty are shown in red.]


    Uncertainty in Data Normalization

    Filter: Decide what to include in cost
      - Riders: things that are not relevant to the contract or program
      - Engineering & Cost Change Proposals

    Scoping: Consistent definitions and content
      - Re-allocation of data into WBS elements
      - Need to determine where qualification units, prototype units and protoflight units should be booked

    Quantity: Consistent units for regression
      - Data will be for 1 unit, 100s of units, 10th unit, etc.
      - Need to either use quantity as an independent variable (QAIV) or normalize to a base set of units using an assumed learning curve

    Inflation: Consistent economic year
      - Use a consistent set of inflation indices (e.g., DoD 3020 and 3600)

    Technology: Consistent technology maturity
      - Treat all data as if they were built in economic year of the model
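The quantity step is a good place to see how a normalization assumption turns into uncertainty. The sketch below backs a theoretical first unit (T1) cost out of a lot cost under unit learning curve theory, where unit n costs T1 * n^log2(LR). The slides do not say which learning curve theory is used, so unit theory is an assumption here, as are the lot cost, quantity and function names.

```python
import math

def unit_cost(t1, unit_number, learning_rate):
    """Cost of the n-th unit under unit learning curve theory (an assumption)."""
    b = math.log(learning_rate, 2)
    return t1 * unit_number ** b

def t1_from_lot(lot_cost, quantity, learning_rate):
    """Back out T1 from the total cost of units 1..quantity."""
    b = math.log(learning_rate, 2)
    return lot_cost / sum(n ** b for n in range(1, quantity + 1))

# Hypothetical lot: 10 units costing 1,000 in total.
lot_cost, quantity = 1000.0, 10
t1_at_90 = t1_from_lot(lot_cost, quantity, 0.90)
t1_at_95 = t1_from_lot(lot_cost, quantity, 0.95)
t1_at_100 = t1_from_lot(lot_cost, quantity, 1.00)
print(t1_at_90, t1_at_95, t1_at_100)
```

The same lot cost yields a different T1 for each assumed learning rate, so treating the learning rate as a random variable makes the normalized T1 data point a random variable as well.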


    The EIV Problem with Fuzzy Inputs

    New Design is defined by the following fuzzy variables, modeled as triangular probability distributions

    Category  New Design     New Design %  Low   Most Likely  High
    1         None           0.1           0     0.1          0.2
    2         Minor Mods     0.2           0.1   0.2          0.5
    3         Moderate Mods  0.5           0.4   0.5          0.8
    4         Major Mods     0.75          0.6   0.75         0.9
    5         New Design     1             0.9   1            1
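A minimal sketch of how the fuzzy new design categories could be sampled, using the low, most likely and high values from the table above with Python's standard `random.triangular`. The dictionary name, function name, seed and trial count are illustrative choices, not part of the presentation.

```python
import random

# (low, most likely, high) new design fraction per category, from the slide's table.
NEW_DESIGN = {
    1: (0.0, 0.1, 0.2),    # None
    2: (0.1, 0.2, 0.5),    # Minor Mods
    3: (0.4, 0.5, 0.8),    # Moderate Mods
    4: (0.6, 0.75, 0.9),   # Major Mods
    5: (0.9, 1.0, 1.0),    # New Design
}

def sample_new_design(category, rng):
    """Draw one new design fraction for a category from its triangular distribution."""
    low, mode, high = NEW_DESIGN[category]
    return rng.triangular(low, high, mode)  # note the argument order: low, high, mode

rng = random.Random(19)
samples = {cat: [sample_new_design(cat, rng) for _ in range(5000)] for cat in NEW_DESIGN}
mean_cat3 = sum(samples[3]) / len(samples[3])
print(mean_cat3)
```

The mean of a triangular distribution is (low + most likely + high) / 3, so the skewed categories have means that differ from their most likely values; category 3, for example, averages about 0.567 rather than 0.5.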


    CER Development Example

    Military Fixed and Mobile Terminal Antennas
      - Used for satellite communications
      - (Fictitious) cost and technical data used to derive theoretical first unit (T1) and nonrecurring (NR) cost relationships

    T1 Cost BY05$K = (a * Diam^b * Freq^c + d) * ε1   (ε1 = error)
    NR Cost BY05$K = (e * (x * T1)^f + g) * ε2   (ε2 = error)

    Program  Antenna Diameter, m  Frequency, GHz  Slew Rate, deg/sec  New Design  T1 Actual Cost  NR Actual Cost  Cost FY
    1        3                    2               0                   1           45.59           2.28            1982
    2        3                    2               1.05                5           57.32           171.95          1985
    3        5                    2               0                   2           50.73           25.37           1984
    4        15                   22              0                   1           9,136.89        456.84          1992
    5        20                   22              1                   3           10,458.80       10,458.80       1985
    6        4                    10              0                   3           1,123.54        1,123.54        2000
    7        3                    12              0.95                5           1,850.14        5,550.43        1999
    8        5                    10              0.5                 4           1,729.35        3,458.69        1988


    Determining CER Coefficient Values Using Constant (x, y) Values

    Using Excel Solver:
      - Solve for T1 CER coefficients using ZPB-MPE*
      - Minimize percent standard error (%SE)
      - Constraint: Bias = 0.00%

    T1 BY05$K = [(9.58 * Diam^0.350 * Freq^2.031) + 17.52] * ε

    Program  Antenna Diameter, m  Frequency, GHz  Slew Rate, deg/sec  Actual T1 Cost BY05$K  Est Cost BY05$K  (Act-Est)/Est
    1        3                    2               0                   72.55                  75.06            (0.03)
    2        3                    2               1.05                85.85                  75.06            0.14
    3        5                    2               0                   77.55                  86.33            (0.10)
    4        15                   22              0                   11,881.17              13,192.74        (0.10)
    5        20                   22              1                   15,666.08              14,588.92        0.07
    6        4                    10              0                   1,242.96               1,689.85         (0.26)
    7        3                    12              0.95                2,088.57               2,207.27         (0.05)
    8        5                    10              0.5                 2,438.03               1,825.74         0.34

    Solver coefficients: testa = 9.583852, testb = 0.350121, testc = 2.030987, testd = 17.52339

    Correlation with cost (diameter, frequency, slew rate, actual T1 cost): 0.9844, 0.9192, 0.2031, 1.0000
    Correlation with percent error (diameter, frequency, slew rate, actual T1 cost): 0.0318, -0.0454, 0.5389, 0.0457

    % Bias = 0.0000
    %SE = 18.24

    * Zero percent bias, minimum percentage error
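The fit statistics on this slide can be checked directly from its numbers. The sketch below applies the slide's coefficients (testa through testd) to the diameter and frequency columns, then computes the percent bias and percent standard error. Dividing by n - 1 in the standard error is an inference: it is the divisor that reproduces the 18.24% shown, and the slide does not state it. Variable names are illustrative.

```python
import math

# Coefficients testa, testb, testc, testd from the slide.
COEFFS = (9.583852, 0.350121, 2.030987, 17.52339)

# (antenna diameter in m, frequency in GHz, actual T1 cost in BY05$K) from the slide.
DATA = [
    (3, 2, 72.55), (3, 2, 85.85), (5, 2, 77.55), (15, 22, 11881.17),
    (20, 22, 15666.08), (4, 10, 1242.96), (3, 12, 2088.57), (5, 10, 2438.03),
]

def t1_estimate(diam, freq, coeffs=COEFFS):
    """T1 = a * Diam^b * Freq^c + d, the CER form on the slide."""
    a, b, c, d = coeffs
    return a * diam ** b * freq ** c + d

estimates = [t1_estimate(diam, freq) for diam, freq, _ in DATA]
pct_errors = [(act - est) / est for (_, _, act), est in zip(DATA, estimates)]

n = len(pct_errors)
bias = sum(pct_errors) / n
pct_se = math.sqrt(sum(e * e for e in pct_errors) / (n - 1))  # n - 1 is inferred, not stated

print(round(estimates[0], 2), round(bias, 4), round(pct_se, 4))
```

Slew rate does not enter the CER; it appears in the table only so that its correlation with cost and with percent error can be reported.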


    Determining CER Coefficient Values Using Uncertain (x, y) Values

    Model data as uncertain parameters in statistical simulation by defining random variables for:
      - Independent (x) variables:
          - Antenna diameter: ±0.5 m (uniform distribution)
          - Frequency: low and high cutoff frequencies (uniform distribution)
      - Dependent (y) variable (cost):
          - Inflation: base inflation rate with 1% standard error (normal distribution)
          - Learning rate: Low = 0.90, Most Likely = 0.95 and High = 1.00 (triangular distribution)
          - Cost fiscal year: ±1 year (discrete distribution)

    Solve for coefficients of the CER that provide:
      - Minimized mean of percent standard error (which is now a random variable)
      - Near-zero mean of bias (also a random variable), less than ±0.5%

    [Figure: distribution sketches labeled 0.9, 0.95, 1.0; -0.5, 0.49; f(low), f(high); and -1, 0, 1]
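The random variables listed on this slide map onto standard library samplers. The sketch below draws one set of uncertain inputs per trial for a single data point. The nominal values (3 m diameter, 1.8 to 2.2 GHz cutoffs, 3% base inflation, cost FY 1982) are hypothetical, the 1% standard error is read as one percentage point, and the three fiscal-year offsets are assumed equally likely; none of these details are given in the slides.

```python
import random

def sample_inputs(diam_m, f_low_ghz, f_high_ghz, base_inflation, cost_fy, rng):
    """Draw one trial of the uncertain normalization inputs for a data point."""
    return {
        "diameter_m": diam_m + rng.uniform(-0.5, 0.5),       # uniform, +/- 0.5 m
        "frequency_ghz": rng.uniform(f_low_ghz, f_high_ghz), # uniform between cutoffs
        "inflation_rate": rng.gauss(base_inflation, 0.01),   # normal, 1-point sigma (assumed)
        "learning_rate": rng.triangular(0.90, 1.00, 0.95),   # low, high, most likely
        "cost_fy": cost_fy + rng.choice([-1, 0, 1]),         # discrete, equal weights (assumed)
    }

rng = random.Random(24)
trials = [sample_inputs(3.0, 1.8, 2.2, 0.03, 1982, rng) for _ in range(5000)]
print(trials[0])
```

Each trial would then be pushed through the normalization steps (inflation, learning, quantity) to produce one random (x, y) pair for the regression.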


    EIV Solution for T1 CER

    Solution converges after 2395 simulations

    T1 BY05$K = [(8.64 * Diam^0.64 * Freq^1.74) + 2.87] * ε
      - Standard percent error = 27.9%
      - Bias = +0.36%

    [Figure: OptQuest performance chart by simulation; objective: minimize mean %SE; requirement on mean % bias]


    Actual vs. Estimated Plot of T1 CER


    Contributors to Variance in Dependent Variables

    Can use Crystal Ball sensitivity tool to see what variables contribute to variance of %SE and %Bias

    [Figure: sensitivity chart with contributions labeled frequency errors, learning errors, diameter errors and inflation errors]


    Regression With Uncertain Variables: Nonrecurring CER

    Some cost drivers are subjective values (particularly in nonrecurring CERs):
      - Amount of new design (0% to 100%)
      - Complexity of development, manufacturing or testing process

    Solve the ZPB-MPE problem under uncertainty using Crystal Ball with OptQuest
      - Use our estimated T1 cost, T1EST (from last regression), and percent new design (ND) as the cost drivers
      - ND is a fuzzy cost driver (it has a loose definition)

    NR Cost BY05$K = [e * (T1EST * ND)^f + g] * ε2   (ε2 = error)


    Solving the EIV Problem with Fuzzy Inputs

      - Define new design categories as assumption variables
      - Define coefficients e, f and g as decision variables
      - Define % SE and % bias as forecast values
      - Use OptQuest to minimize mean of % SE and constrain mean of % bias to +/- 0.5%

    NR Cost BY05$K = [e * (T1EST * ND)^f + g] * ε2


    EIV Solution for Nonrecurring CER

    Solution converges after 1225 simulations

    NR BY05$K = [2.34 * (T1EST * ND)^0.99 + 0.27] * ε2
      - ε2: 72.6% standard percent error
      - Bias = -0.27%

    [Figure: OptQuest performance chart by simulation; objective: minimize mean % NR SE; requirement on mean % NR bias]


    Spacecraft EPS NR and REC CER

    The problem: Find a set of NR and REC CERs for the Electrical Power System (EPS)

    Approach:
      - Use USCM data set (24 programs)
      - Apply uncertainty to a posteriori values of cost drivers: EPS weight, Beginning of Life Power (BOLP) and battery capacity (in amp-hours)
      - Apply probabilistic bounds to US Department of Defense (DoD) inflation indices using Consumer Price Index (CPI), select DRI indices and US aerospace contractors
      - Abandon assumed learning rate of 95% to derive T1 cost and use the quantity as an independent variable (QAIV) approach to develop REC CER [Ref. 3]
      - Use EIV to find CERs with minimum percent error with zero bias and select best cost driver from weight, BOLP and battery capacity


    Candidate Solutions

    Multiple solutions produced
      - Best driver is BOLP
      - Many local minima found
      - Coefficients seem to be correlated


    Relationship of Candidate Coefficients

    Produced many viable candidate solutions
      - Premium Solver could not find global minimum
      - How can we find the global minimum the first time?

    Coefficients produced by each candidate seem to be related to each other
      - This information may be helpful in finding re-sampling bounds

    [Figure: two scatter plots, each with x axis Coef a1 (-1200 to 200) and y axis labeled Coef b1; the y axes run from 0.000 to 1.000 and from 0.00 to 600.00]


    Minimum NR+REC Problem

    The last regression provided the CER coefficients giving the minimum percent error under the zero-bias constraint for the REC CER

    We can produce an NR CER using the same type of minimization criteria and constraints

    What happens when we try to find the coefficients for the NR and REC CERs at the same time?
      - This makes sense, since we are just going to add NR and REC results when we use the CER

    How to approach this problem:
      - Minimize the NR+REC percent error
      - Constrain the total (NR+REC) bias to zero

    We produce two CERs with comparatively small standard error, but each has a bias
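A small numeric sketch shows how a zero total bias can coexist with biased component CERs. The actuals and estimates below are hypothetical and chosen so that each program's NR + REC estimate equals its NR + REC actual; the function name is illustrative.

```python
def percent_bias(actuals, estimates):
    """Mean of (actual - estimate) / estimate."""
    pe = [(a - e) / e for a, e in zip(actuals, estimates)]
    return sum(pe) / len(pe)

# Hypothetical programs: NR is overestimated and REC is underestimated,
# but the NR + REC estimate matches the NR + REC actual for every program.
nr_actual = [100.0, 200.0, 300.0]
rec_actual = [300.0, 200.0, 100.0]
nr_est = [120.0, 240.0, 360.0]
rec_est = [280.0, 160.0, 40.0]

total_actual = [n + r for n, r in zip(nr_actual, rec_actual)]
total_est = [n + r for n, r in zip(nr_est, rec_est)]

nr_bias = percent_bias(nr_actual, nr_est)
rec_bias = percent_bias(rec_actual, rec_est)
total_bias = percent_bias(total_actual, total_est)
print(nr_bias, rec_bias, total_bias)
```

Constraining only the total leaves the optimizer free to trade error between the two components, which is why the component biases can end up large and of opposite sign.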


    EPS NR CER

    Total (NR+REC) percent SE = 43.9%, bias = 0.0%
    NR percent SE = 75.4%, bias = -181%

    USCM NR EPS CER: NR = (46.5 + 206.41 * BOLP^0.662) * err

    [Figure: Normalized Actuals vs. Estimates, log-log plot of EIV estimated cost (FY06$K) versus normalized actual cost (FY06$K), both axes from 1,000 to 1,000,000]


    EPS REC CER

    Total (NR+REC) percent SE = 43.9%, bias = 0.0%
    REC percent SE = 51.4%, bias = 408%

    USCM REC EPS CER: REC = (2567.6 + 277.45 * BOLP^0.507 * Q^1.231) * err

    [Figure: Normalized Actuals vs. Estimates, log-log plot of EIV estimated cost (FY06$K) versus normalized actual cost (FY06$K), both axes from 1,000 to 1,000,000]


    New Questions Arise

    When we develop CERs independently we may be producing biased total (NR+REC) results

    So, should we develop them in tandem in the future?

    Why not regress the entire data set at once and minimize the total NR+REC error for the spacecraft bus?

    I don't know the right answer, but we will certainly be looking into these issues in the future


    EIV Modeling Benefits and Drawbacks

    Many benefits to using EIV over traditional methods of treating data and regression
      - More realistic treatment of uncertainty
      - More realistic accounting of uncertainty in cost drivers
      - Provides more accurate picture of CER uncertainty

    There are a few drawbacks
      - Can be time consuming
          - Spreadsheet preparation
          - Choice between two search techniques
              - Scatter search (time consuming)
              - Local gradient search (does not find global minimum)
      - Tool set needs to be better established
          - Need to combine a Monte Carlo simulator with gradient search and a genetic algorithm (to re-seed the search and find the global minimum)


    Summary

    Errors-in-Variables (EIV) regression assumes that every variable can have error or noise
      - Also known as Total Least Squares (TLS)

    Uncertainty in dependent and independent variables is due to incomplete data and assumptions in the normalization process

    Need statistical simulation tool with optimization utility (Crystal Ball with OptQuest or Premium Solver)

    With EIV we can create more realistic CERs

    Need better tools: Monte Carlo simulator + gradient search + genetic algorithm


    References

    1. Covert, R., "Errors-In-Variables Regression," presented to the Joint SSCAG/EACE/SCAF Meeting, September 19-21, 2006.
    2. Foussier, P., "Palliating the Bias Introduced by Linear Regression," ISPA International Conference, Seattle, WA, 23-26 May 2006.
    3. Book, S. and Burgess, E., "A Way Out of the Learning-Rate Morass: Quantity as an Independent Variable," January 2003.

    Further Reading:
      - Quirino, P., "Robust Estimators of Errors-In-Variables Models Part 1" (August 1, 2004), Department of Agricultural & Resource Economics (ARE), University of California at Davis, ARE Working Papers, Paper 04-007.
      - van Huffel, S.; Lemmerling, P. (Eds.), Total Least Squares and Errors-in-Variables Modeling: Analysis, Algorithms and Applications, Springer Verlag, 2002, ISBN: 1-4020-0476-1.
      - Griliches, Z., "Errors in Variables and Other Unobservables," Econometrica, Econometric Society, vol. 42(6), pages 971-98, November 1974.
      - Pollock, D.S.G., Topics in Econometrics: the Errors in Variables Model and the Linear Regression Model, unpublished notes, p. 1-4, http://www.qmw.ac.uk/~ugte133/courses/mesomet/topics/ectopics.htm