paper eswa cost estimation 2

Upload: orlandoduran

Post on 03-Apr-2018


  • 7/28/2019 Paper Eswa Cost Estimation 2

    1/8

    Comparisons between two types of neural networks for manufacturing cost

    estimation of piping elements

    Orlando Duran a,*, Juan Maciel a, Nibaldo Rodriguez b

    a Escuela de Ingeniería Mecánica, Pontificia Universidad Católica de Valparaíso, Quilpué, Chile

    b Escuela de Ingeniería Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile

    Article info

    Keywords:

    Cost estimation

    Piping

    Neural networks

    Multi layer perceptron

    Radial basis function

    Abstract

    The objective of this paper is to develop and test a model for manufacturing cost estimation of piping elements during the early design phase through the application of artificial neural networks (ANN). The developed model can help designers to make decisions at the early phases of the design process. An ANN model would allow obtaining a fairly accurate prediction, even when enough and adequate information is not available in the early stages of the design process. The developed model is compared with traditional neural networks and conventional regression models. This model proved that neural networks are capable of reducing uncertainties related to the cost estimation of shell and tube heat exchangers.

    © 2012 Elsevier Ltd. All rights reserved.

    1. Introduction

    Cost estimation is a key factor during the development phases of manufactured products. Early cost approximations as a function of a set of general characteristics help designers in decisions such as material selection, production processes and, mainly, the morphological characteristics of the product. Studies have shown that the greatest potential for cost reduction is at the early design phases, where as much as 80% of the cost of a product is decided. As the design phase itself accounts for a relatively small percentage of the total development cost, devoting a greater effort to cost design is a reasonable and necessary step towards optimizing product costs. Making a wrong decision at this stage is extremely costly further down the development process. Product modifications and process alterations are more expensive the later they occur in the development cycle. Thus, cost estimators need to approximate the true cost of producing a product. In addition, since cost estimating is the start of the cost management process and influences the go/no-go decisions concerning a new product development, these decisions regarding new product development or product design changes should ideally be based on quantitative analysis instead of guesswork.

    Several techniques and methods for early cost estimation have been introduced in previous literature. Rush and Roy (2000) examine both traditional and more recent developments in cost estimating techniques in order to highlight their advantages and limitations. The analysis includes parametric estimation, feature-based costing, artificial intelligence, and cost management techniques. Niazi, Dai, Balabani, and Seneviratne (2006) provide a detailed review of the state of the art in product cost estimation covering various techniques and methodologies developed over the years. The overall work is categorized into qualitative and quantitative techniques. The qualitative techniques are further subdivided into intuitive and analogical techniques, and the quantitative ones into parametric and analytical techniques. Curran, Raghunathan, and Price (2004) provide a comprehensive literature review in engineering cost modeling as applied to aerospace.

    Three main quantitative approaches can be identified for cost estimation purposes. Analogy-based techniques: these techniques are based on the definition and analysis of the degree of similarity between the new product and another product that has already been produced by the firm. The parametric method: the cost is expressed as an analytical function of a set of variables that consist of or represent some features of the product that are supposed to be the main influences on the final cost of the product. These functions are called Cost Estimation Relationships (CERs) and they are built using statistical methodologies. Analytical models: in this case the estimation is based on the detailed analysis of the manufacturing process and the features of the product. The estimated cost of the product is calculated analytically, as the sum of its elementary components, constituted by the value of the resources used in each step of the production process (raw materials, components, labor, equipment, etc.). Therefore, the engineering approach can be used only when all the production process and product characteristics are well defined. Thus, the application of this approach is limited to situations where a great amount of input data is available.

    A review of the cost estimation literature shows that an incipient number of cases that use artificial intelligence (AI) techniques have been reported. These techniques constitute the last generation of tools for manufactured

    0957-4174/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2012.01.095

    Corresponding author. Tel.: +56 32 2274471.

    E-mail addresses: [email protected] (O. Duran), [email protected] (J. Maciel),

    [email protected] (N. Rodriguez).

    Expert Systems with Applications 39 (2012) 7788–7795

    Contents lists available at SciVerse ScienceDirect

    Expert Systems with Applications

    journal homepage: www.elsevier.com/locate/eswa

    product cost estimation. The basic concept behind the application of AI in cost estimating is to imitate the behavior of human experts when determining the main variables that rule the final cost of a manufactured product (and to what extent they do). Thus far, the most common AI technique in cost estimation is case-based reasoning. This technique is similar, in essence, to the analogy-based technique. AI techniques allow modeling, storing and re-using information and capturing the relative knowledge on products already produced for adapting it to new situations, i.e. new products under development. Graham and Smith (2004) proposed the case-based estimator (CBE), which is a small model, consisting of five input features, one output and a small case base. According to the authors, an experiment was performed to assess the ability of two retrieval mechanisms (one a simple mathematical formula, the other an adaptation of the ID3 decision tree generating algorithm) to measure similarity. The simple formula was found to be preferable, both in terms of consistency and development effort. Another case-based system for cost estimation is presented by Ficko, Drstvensek, Brezocnik, Balic, and Vaupotic (2005), who suggested an intelligent system for predicting the total cost of stamping tools. The work is limited to tools to manufacture sheet metal products by stamping. On the basis of target and source cases, the system prepares the prediction of costs. The results show that the quality of predictions made by the intelligent system is comparable to the quality assured by the experienced expert.

    Artificial neural networks (ANN) have been the most explored AI-based technique in research on cost estimation. Neural networks, as non-parametric approximators, attempt to fit curves through data without providing a predetermined function with free parameters. Neural networks are therefore able to detect hidden functional relationships between product attributes and cost, i.e. relationships unknown to the cost engineer. Bode (2000) investigates the potential of neural networks to support cost estimation at the early stage of product development. The cost estimation performance is compared to conventional methods, i.e. linear and non-linear parametric regression. Neural networks achieve lower deviations in their cost estimations. Cavalieri, Maccarrone, and Pinto (2004) reported the results of the comparison of the application of two different approaches, parametric and artificial neural network techniques, for the estimation of the unitary manufacturing costs of a new type of brake disks produced by an Italian manufacturing firm. Kim and Han (2003) proposed hybrid artificial intelligence techniques to resolve cost estimation problems. Genetic algorithms are used to identify optimal or near-optimal cost drivers. In addition, artificial neural networks are employed to allocate indirect costs with non-linear behavior to the products. Empirical results show that the proposed model outperforms the conventional model. Some applications of ANN, not properly in cost estimation but in cost driver estimation in Activity-Based Costing (ABC), are reported in the literature. Kim, Seo, and Kang (2005) applied hybrid models of neural networks and genetic algorithms (GA) to cost estimation of residential buildings to predict preliminary cost estimates. Kwang-Kyu, Seo, and Ahn (2006) presented a learning algorithm based estimation method for the maintenance cost as life cycle cost of product concepts. Seo, Park, Jang, and Wallace (2002) explore an approximate method for providing the preliminary life cycle cost. Learning algorithms trained to use the known characteristics of existing products allow the life cycle cost of new products to be promptly approximated during the conceptual design phase without the overhead of defining new life cycle costing (LCC) models. Artificial neural networks were trained to generalize product attributes and life cycle cost data from pre-existing LCC studies. Because of this approach, there is still considerable uncertainty within the estimate that can

    affect the final result.

    One approach involves applying fuzzy sets and fuzzy reasoning to modeling situations using linguistic variables. With this approach, called fuzzy logic, it is possible to handle the uncertainty in cost estimation problems that cannot be addressed by the traditional techniques. This uncertainty results from a sort of tolerance for human imperfection when transferring ideas or information as the casting of opinions. Shehab and Abdalla (2002) proposed an intelligent knowledge-based system for product cost modeling. The developed system has the capability of selecting a material, as well as machining processes and parameters, based on a set of design and production parameters, and also of estimating the product cost throughout the entire product development cycle, including the assembly cost. The proposed system is applied with no need for detailed design information, so that it can be used at an early design stage. The system has been validated through a case study.

    Other proposals include the use of non-traditional techniques for cost estimation. Koonce, Judd, Sormaz, and Masel (2003) present the design and implementation of a customizable cost integration tool to support design time optimization that considers cost as an objective function or constraint. Giachetti and Arango (2003) reported an activity-based printed circuit board (PCB) cost estimation model. The proposed model estimates PCB cost based on design parameters. The activities are defined so that the design decisions become the cost drivers and thus enable the cost estimation model to be utilized early in the design process, when sufficient time remains to make design changes. Ozbayrak, Akgun, and Turker (2004) discussed the implementation of the activity-based costing (ABC) approach along with a mathematical and simulation model to estimate the manufacturing and product cost in an automated manufacturing system. ABC was used to model the manufacturing and product costs. An extensive analysis has been carried out to calculate the product costs under these two strategies. The comparison between the two strategies, in terms of effects on the manufacturing and product costs, is carried out to highlight the difference between them. Hmidaa et al. (2006) introduce the new concept of cost entity and suggest two models, a product model and a Costgrammes model. The cost estimating reasoning procedure that takes into account alternative process plans of a product is modeled and solved as a constraint satisfaction problem (CSP). In Ou-Yang and Lin (1997) a framework for estimating the manufacturing cost in terms of a feature-based approach is proposed. This system estimates the manufacturing cost of a design according to the shapes and precision of its features. The approach integrates a feature-based CAD model and a database storing product and process cost. Murat Günaydın and Zeynep Dogan (2004) applied neural networks in early cost estimations for structural systems of buildings. These authors performed a sensitivity analysis to obtain feedback as to which input parameters are more significant in terms of the effect that each network input parameter has on the network output. Verlinden, Duflou, Collin, and Cattrysse (2008) compared two approaches for cost estimation in sheet metal part manufacturing. They used regression and artificial neural networks for defining cost estimation models. Although the differences between the regression model and the neural network are small, the ANN yields better results. Cos, Sanchez, Ortega, and Montequin (2008) compared the results of the application of a non-parametric regression model and an ANN approach for cost estimating of metallic components for the aerospace industry. Most recently, Che (2010) used a PSO-based approach in training an artificial neural network for cost estimating of plastic injection molding. Several comparisons were made, which showed that the so-called FAPSO-TBP estimation approach can be considered very competitive.

    2. Piping manufacturing

    A pipe is a tubular section or hollow cylinder used mainly to convey substances which can flow: fluids, slurries, powders and masses. Pipe manufacturing refers to how the individual pieces of pipe are made. Each piece of piping produced is called a joint or a length. Generally, pipe is shipped to the pipeline construction site as double joints, where two pieces of pipe are pre-welded together to save time. Most piping elements are seamless or longitudinally welded, although spirally welded pipe is common for larger diameters. The most typical piping elements are made of carbon steel, various alloy steels and stainless steel. The most used welding methods are TIG, MIG and SAW. Fig. 1 shows examples of spooling parts.

    For cost estimation, the most common methods rely on historical databases. Using this type of information, many engineers prepare detailed definitive cost estimates. Three methods are available for estimating the relationship between the cost and the characteristic parameters: engineering (subjective) estimates, account analysis and statistical (regression) analysis. These methods yield estimates with low confidence, which cannot be considered generally useful.

    3. Estimators of proposed costs

    In this section three cost estimating models are presented: the multi layer perceptron neural network (MLPNN), the radial basis function neural network (RBFNN) and linear multivariate regression (MR).

    3.1. MLP neural network

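    The neural estimator output is obtained as:

    $$y = \sum_{j=1}^{m} b_j \, \phi_j(w_{ji}, x_i) \qquad (1)$$

    where $b_j$ represents the linear weight and $w_{ji}$ is the non-linear weight of the MLPNN. The activation function $\phi_j$ is then defined by:

    $$\phi_j(x_i) = \phi\!\left( \sum_{i=1}^{p} w_{ji} x_i \right), \qquad \phi(x) = \frac{1}{1 + \exp(-x)} \qquad (2)$$

    where $p$ is the number of input variables of the MLPNN.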

    In order to estimate the linear and non-linear parameters $\theta = \{b_j, w_{ji}\}$ of the MLPNN, the Levenberg-Marquardt (LM) algorithm is used. This algorithm adapts the parameters using the following expressions:

    $$\theta(k+1) = \theta(k) + \Delta\theta(k), \qquad \Delta\theta(k) = -\left( J J^{T} + \mu I \right)^{-1} \nabla E(\theta)$$

    $$E(\theta) = \frac{1}{2} \sum_{i=1}^{N} (d_i - y_i)^2 \qquad (3)$$

    where $N$ denotes the number of samples in the learning process, $d_i$ is the expected value for the cost estimation, $J$ represents the Jacobian matrix of the error vector evaluated at $\theta$, $I$ is the identity matrix and the parameter $\mu$ is increased or reduced at each learning stage [Ref-LM].
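The forward pass of Eqs. (1)-(2) and a single Levenberg-Marquardt update can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, bias terms are omitted, the damping parameter mu is held fixed rather than adapted, and the Jacobian is stored with one row per sample, so the product written J J^T in the text appears here as J.T @ J.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def mlp_output(X, W, b):
    """Eq. (1): X is (N, p) inputs, W is (m, p) weights w_ji, b is (m,) weights b_j."""
    return sigmoid(X @ W.T) @ b

def sse(X, d, W, b):
    """Eq. (3): E = 0.5 * sum_i (d_i - y_i)^2."""
    return 0.5 * np.sum((d - mlp_output(X, W, b)) ** 2)

def lm_step(X, d, W, b, mu):
    """One damped Gauss-Newton (Levenberg-Marquardt) update of theta = {b_j, w_ji}."""
    N, (m, p) = len(X), W.shape
    H = sigmoid(X @ W.T)                      # hidden outputs, shape (N, m)
    e = d - H @ b                             # error vector, shape (N,)
    # Jacobian of the error vector, one row per sample: de/dtheta = -dy/dtheta
    J_b = -H                                  # derivative with respect to b_j
    slope = H * (1.0 - H) * b                 # b_j * phi'(z_j), shape (N, m)
    J_W = -(slope[:, :, None] * X[:, None, :]).reshape(N, m * p)
    J = np.hstack([J_b, J_W])                 # shape (N, m + m*p)
    grad = J.T @ e                            # gradient of E with respect to theta
    delta = -np.linalg.solve(J.T @ J + mu * np.eye(J.shape[1]), grad)
    return W + delta[m:].reshape(m, p), b + delta[:m]
```

In practice mu would be decreased after a step that lowers E and increased after one that does not, which is the adaptation the text refers to.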

    3.2. RBF neural network

    The RBF neural network is similar to the MLPNN; however, the activation function is replaced by a radial basis function and the output of the neural network is given by:

    $$y = \sum_{j=1}^{m} c_j \, \phi_j(v_{ji}, x_i) \qquad (4)$$

    $$\phi_j(x_i) = \phi\!\left( \lVert x_i - v_{ji} \rVert^{2} \right), \qquad \phi(u) = \frac{1}{\sqrt{1 + u}} \qquad (5)$$

    where $v_j$ represents the non-linear weights and $c_j$ represents the linear weights.

    The linear weights are obtained by the least squares method, i.e. by minimizing the residual sum of squares:

    $$\hat{c} = \left( \Phi^{T} \Phi \right)^{\dagger} X \qquad (6)$$

    where $(\cdot)^{\dagger}$ denotes the pseudo-inverse [Ref-MP] of the hidden-layer matrix of the RBFNN. Once the linear parameters are obtained, the non-linear weights $\{v_{ji}\}$ are estimated using the LM algorithm given in Section 3.1.
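The linear step of the RBF estimator can be sketched as below, assuming the centres v_j are held fixed while the weights c_j are solved for. The function names are illustrative, and the solve uses the standard least-squares form pinv(Phi) @ d, which is an assumption about how Eq. (6) is meant to be read rather than a transcription of it.

```python
import numpy as np

def rbf_design(X, V):
    """Hidden-layer matrix Phi[n, j] = 1 / sqrt(1 + ||x_n - v_j||^2), Eq. (5)."""
    sq_dist = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    return 1.0 / np.sqrt(1.0 + sq_dist)

def fit_linear_weights(Phi, d):
    """Least-squares linear weights c via the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(Phi) @ d

def rbf_output(X, V, c):
    """Eq. (4): weighted sum of the radial basis responses."""
    return rbf_design(X, V) @ c
```

Alternating this closed-form solve with an LM update of the centres is one way to realise the two-stage estimation described above.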

    3.3. Linear multivariate regression

    The linear cost estimator is obtained using the following linear regression equation:

    $$y = \sum_{i=1}^{m} a_i x_i \qquad (7)$$

    where the values $a_i$ are the parameters, obtained as:

    $$a = \left( A^{T} A \right)^{\dagger} X, \qquad A = [a_{ji}], \quad j = 1, \ldots, N \ \text{and} \ i = 1, \ldots, m$$
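A least-squares fit of the linear estimator in Eq. (7) can be sketched in the same way. Here A is assumed to be the N x m matrix of input samples and y the vector of observed costs; both names, and the absence of an intercept term, are assumptions of this illustration.

```python
import numpy as np

def fit_linear(A, y):
    """Coefficients a minimizing ||A a - y||^2, computed with the pseudo-inverse."""
    return np.linalg.pinv(A) @ y

def predict_linear(A, a):
    """Eq. (7): y = sum_i a_i x_i for every row of A."""
    return A @ a
```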

    4. Methodology

    4.1. Hypothesis

    The main hypothesis of this research is that, by using neural network based techniques, it is possible to develop cost estimation models that perform better (by being more accurate) than the traditional cost estimation techniques (regression based models). Through experimentation and using actual data, these models will be used to answer the following research questions:

    Fig. 1. Appearance of spooling products.


    Does incomplete information represent an impediment to effective product cost estimations?

    Can a very small set of product characteristics be used for cost

    estimation using soft computing based methods?

    4.2. Experiments

    An actual case of a manufacturer of piping elements was considered to test the developed models. In this case, it was assumed that a company wanted to estimate the manufacturing cost of a new piping element during its design stage. The experiment presupposed that the manufacturing costs of all similar elements follow the same single cost function, so that costs are completely determined by the product attributes known at the time of cost estimation.

    Three types of models were defined for the comparison:

    Models based on multiple linear regression.

    Models based on sigmoid neural networks (MLP, NLS: Levenberg-Marquardt).

    Models based on radial basis function neural networks with a learning algorithm based on SNLS.

    In the remainder of this paper, a structured methodology for obtaining a cost estimation model is applied to a specific situation: the manufacture of high-technology piping to be used in fluid transport projects by the mining industry. The suggested methodology involves a sequence of actions that may be summarized as follows.

    4.3. Phase I: PRE-PROCESS

    The database obtained from a manufacturer of piping elements for the transport of fluids for the large-scale mining industry includes the following fields:

    Item Name Identification Code

    Diameter (inches)

    Welding Class

    Difficulty (degree)

    Cavities (number of O-rings)

    Weight (kg)

    The input parameters (independent variables) are selected first, by identifying the parameters most closely correlated (Pearson coefficient) with the output variable, i.e. the cost.

    The resulting coefficients make it possible to single out the parameters that are least related to the sought variable. For the decision-making process, Table 1 was developed, which lists the parameters in descending order of their correlation coefficients.

    The elimination of the last two parameters, Cavities and Class, seems obvious, as their lack of connection with the total manufacturing cost is quite apparent.

    The next step is to analyse the multicollinearity among the parameters; the covariance matrix is shown in Table 2.

    The diagonal shows the correlation of each parameter with itself, so this value is 1 (100% correlation); the remaining values are the cross-correlations.

    Table 3 shows the accumulated correlations in decreasing order.

    The value identified as Σ Multicollinearity corresponds to the total of the four correlations of each parameter; on this basis, the parameter to be eliminated (Diameter) can be identified. The high accumulated correlation of Diameter may be explained by considering that its significance is probably contained in another parameter. This is logical when the Difficulty parameter is taken as a reference: the larger the diameter of the piece to be manufactured, the higher the degree of difficulty involved in its handling and construction.
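The accumulated multicollinearity can be reproduced directly from the matrix in Table 2 by summing each row. The sketch below copies the published four-decimal values, so the totals may differ from Table 3 in the last digit; the variable names are illustrative.

```python
import numpy as np

names = ["Diameter", "Welding type", "Difficulty", "Weight"]
# Values copied from Table 2 (symmetric, unit diagonal)
R = np.array([
    [1.0000, 0.7432, 0.9697, 0.6650],
    [0.7432, 1.0000, 0.7238, 0.7233],
    [0.9697, 0.7238, 1.0000, 0.6669],
    [0.6650, 0.7233, 0.6669, 1.0000],
])

totals = R.sum(axis=1)                    # accumulated correlation per parameter
order = np.argsort(-totals)               # indices in decreasing order of total
ranked = [(names[i], round(float(totals[i]), 4)) for i in order]
eliminated = ranked[0][0]                 # the parameter with the largest total
```

With these inputs the ranking is Diameter, Difficulty, Welding type, Weight, which matches the order of Table 3.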

    Then, the data set is divided into two parts: a training data set and a test data set. The training data is first used to choose the parameters, i.e. to train the developed models. The test data is used for validation purposes.

    Both groups are normalized so that all values lie in the range [0, 1]. This prevents input variables with larger values from dominating those with smaller values and avoids numerical difficulties during calculation, which in turn reduces prediction errors.
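A minimal sketch of the split and the [0, 1] scaling is given below. The 25% test share is taken from Phase III; the random shuffling, the function names, and the choice to compute the scaling bounds on the training part and reuse them for the test part are assumptions of this illustration. Columns with a constant value would need special handling, since their range is zero.

```python
import numpy as np

def split_train_test(X, y, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    idx = np.random.default_rng(seed).permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]

def minmax_fit(X):
    """Per-column minimum and maximum of the training inputs."""
    return X.min(axis=0), X.max(axis=0)

def minmax_apply(X, lo, hi):
    """Map each column linearly so that lo -> 0 and hi -> 1."""
    return (X - lo) / (hi - lo)
```

With bounds fitted on the training data only, a few test values may fall slightly outside [0, 1]; fitting on the full data set avoids that at the cost of letting test information influence preprocessing.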

    4.4. Phase II: calibrating or training the NNs

    A set of neural network configurations for both types of networks (MLP and RBF) was defined and tested for cost estimation during early design stages in a make-to-order production scenario. The performance of the different configurations and training strategies used in this research was compared. In a number of experiments, neural networks were trained and applied to the set of training data. The accuracy of the cost estimation results and other indicators of performance were explored. The metric used in this comparison was the Mean Absolute Percentage Error (MAPE), given by the following equation:

    $$M = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \qquad (8)$$

    where $A_t$ is the actual value and $F_t$ is the predicted value.
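As a sketch, the MAPE of Eq. (8) can be computed as follows. The optional multiplication by 100 is an assumption, added because the MAPE figures reported later look like percentages; the function name is illustrative, and actual values of zero are not handled.

```python
import numpy as np

def mape(actual, predicted, percent=True):
    """Mean absolute percentage error between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    m = np.mean(np.abs((actual - predicted) / actual))
    return 100.0 * m if percent else m
```

For example, predictions of 110 and 180 against actual costs of 100 and 200 are each off by 10%, so the MAPE is 10.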

    Since there is no rule to determine the best configuration of a NN, the number of nodes in the hidden layer was determined by a set of trials. Therefore, the neural network model was determined empirically, rather than derived theoretically. The proper structure was selected after testing 20 ANN configurations with one hidden layer and different numbers of nodes in the hidden layer. Each configuration was run 30 times. The results are shown in Table 4.

    Table 1
    Correlation coefficients in descending order.

    No.  Parameter      Pearson coefficient
    1    Weight         0.93
    2    Welding type   0.77
    3    Diameter       0.71
    4    Difficulty     0.68
    5    Cavities       0.35
    6    Class          0.10

    Table 2
    Multicollinearity among parameters.

    COV MATRIX     Diameter  Welding type  Difficulty  Weight
    Diameter       1.0000    0.7432        0.9697      0.6650
    Welding type   0.7432    1.0000        0.7238      0.7233
    Difficulty     0.9697    0.7238        1.0000      0.6669
    Weight         0.6650    0.7233        0.6669      1.0000

    Table 3
    Accumulated multicollinearity among parameters.

    Order  Parameter   Σ Multicollinearity  Decision
    1      Diameter    3.3780               Eliminate
    2      Difficulty  3.3605               Select
    3      Welding     3.1903               Select
    4      Weight      3.0552               Select

    Table 4 shows that the lowest average MAPE was obtained for configurations 13 and 33 for the MLP and the RBF, respectively; both networks include a hidden layer with 13 nodes. The two networks will be referred to as MLP(4,13,1) and RBF(4,13,1).
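The trial-based selection of the hidden-layer size can be sketched as a simple loop that averages the MAPE over repeated runs. The callable train_and_score is hypothetical: it stands for whatever routine trains a network with a given number of hidden nodes and returns its MAPE for one run. The defaults of 20 sizes and 30 runs follow the text.

```python
import numpy as np

def select_hidden_size(train_and_score, sizes=range(1, 21), runs=30):
    """Return the hidden-layer size with the lowest MAPE averaged over all runs."""
    averages = {
        m: float(np.mean([train_and_score(m, r) for r in range(runs)]))
        for m in sizes
    }
    best = min(averages, key=averages.get)
    return best, averages
```

Averaging over runs matters because each training run starts from different initial weights, so a single run can make a configuration look better or worse than it typically is.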

    A linear regression model was developed for comparison purposes, considering the same input variables used in the development of the neural network models. Fig. 2 shows the scatterplot of actual versus predicted costs for the regression model. Fig. 3 plots the comparison between simulated and real values of the testing set using the regression model. Fig. 4 presents the residuals versus the predicted costs obtained by the regression model.

    The linear regression model showed the following performance results: RMSE: 0.0785; R-square: 0.8812 and MAPE: 16.68.

    4.5. Phase III: testing the ANN

    In order to test the real capacity of the networks and the selected configurations, and also to determine their generalization capacity and statistical strength, the networks were applied to the test data group (corresponding to 25% of the initial sample).

    Fig. 5 shows the scatterplot of actual versus predicted costs for the MLP and the RBF. Most of the points lie very close to the line, indicating a strong prediction capability (for perfect prediction, all points should lie on this line). The chart also provides the linear equation of the regression line (in the form Y = Ax + B) between predicted and actual values. The closer the B factor is to 0 and the slope of the line (A factor) is to 1, the better the estimation can be considered. Note that the correlation coefficient (R) is equal to 0.997 in the case of the MLP ANN.

    Fig. 6 plots the comparisons between simulated and real values of the testing set using the MLP ANN (a) and the RBF ANN (b). Fig. 7 presents the residuals versus the predicted costs obtained by the selected MLP ANN configuration (a) and the RBF ANN (b). It can be observed that an important fraction (over 98%) of the 100 cases tested are acceptable, with residuals ranging from -20% to 20%.

    Table 4
    Results with the neural networks (MLP + LM) and (RBF + SNLS).

    MLP                                             RBF
    Test number  Nodes in hidden layer  MAPE        Test number  Nodes in hidden layer  MAPE
    1            1                      11.0268     21           1                      13.3283
    2            2                      9.9753      22           2                      13.3240
    3            3                      10.4452     23           3                      11.1037
    4            4                      7.6372      24           4                      10.9737
    5            5                      7.5514      25           5                      10.7229
    6            6                      7.0654      26           6                      10.5917
    7            7                      8.1934      27           7                      10.7837
    8            8                      6.2542      28           8                      9.7388
    9            9                      6.1097      29           9                      9.0044
    10           10                     6.0094      30           10                     8.9866
    11           11                     5.3607      31           11                     9.2657
    12           12                     5.3912      32           12                     8.7523
    13           13                     5.1142      33           13                     8.3152
    14           14                     6.6281      34           14                     8.8591
    15           15                     5.6738      35           15                     8.8484
    16           16                     5.7192      36           16                     8.4352
    17           17                     5.4238      37           17                     8.5351
    18           18                     5.1687      38           18                     8.7823
    19           19                     5.3993      39           19                     9.1130
    20           20                     5.2629      40           20                     8.3305

    Fig. 2. Scatterplot of actual versus predicted costs for the regression model. [Axes: Actual (A) versus Estimated (E); legend: Ideal A=E, Best Linear Fit, Data Points.]

    Fig. 3. Comparison between simulated and real values of the testing set using the regression model. [Axes: Validation samples versus Normalized Mfg. Costs; legend: Actual Cost, Estimate Cost.]

    Fig. 4. Residuals versus the predicted costs obtained by the regression model. [Axes: Predicted Values versus Relative Residual.]

    5. Result discussion

    A group of metrics was used to assess the characteristics of the results given by each network with the defined configurations. These metrics are (Zemouri, Gouriveau, & Zerhouni, 2010):

    Mean prediction error (E(i)) and standard deviation (std(i))

    Timeliness

    Precision (std)

    Repeatability

    Accuracy

    Let $M$ be the number of training/test runs. For every run $i$ of the training algorithm, new values of the mean prediction error $E(i)$ and the standard deviation $std(i)$ are obtained for the $n$ data of the test set as follows:

    $$E(i) = \frac{1}{n} \sum_{j=1}^{n} \left( x_i(j) - z(j) \right) \qquad (9)$$

    $$std(i) = \sqrt{ \frac{1}{n} \sum_{j=1}^{n} \left( E(i) - z(j) \right)^{2} } \qquad (10)$$

    where $x_i(j)$ is the $j$th output obtained by the $i$th neural model and $z(j)$ is the $j$th system output. The measures of the prognostic neural system performance are then computed from the variations of $E(i)$ and $std(i)$. On this basis, various performance metrics can be proposed.

    The timeliness is given by the global mean of all the Mvalues of

    E(i):

    E 1M

    XMi1

    Ei 11

    The perfect score is timeliness = 0. For a small value of the time-

    liness, the probability to have a prediction close to the real value

    can be significant. On the contrary, if the timeliness value is impor-

    tant, the probability to have a wrong prediction is very high.

    The Precision is given by the global mean of all the M values ofstd(i):

    std 1M

    XMi1

    stdi 12

    The perfect score is precision = 0. When the precision value is small, the predictions are likely to be grouped closely together. Conversely, when the precision value is large, the predictions are dispersed.
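The two aggregate metrics above are plain averages over the M runs, which can be sketched in a few lines. The helper name `timeliness_and_precision` and the three-run values are hypothetical and serve only to illustrate Eqs. (11) and (12).

```python
import numpy as np

def timeliness_and_precision(mean_errors, std_errors):
    """Average the per-run statistics over the M runs."""
    timeliness = float(np.mean(mean_errors))   # Eq. (11): mean of E(i)
    precision = float(np.mean(std_errors))     # Eq. (12): mean of std(i)
    return timeliness, precision

# Hypothetical per-run values for M = 3 runs
mean_errors = [0.010, 0.020, 0.015]   # E(1), E(2), E(3)
std_errors = [0.030, 0.028, 0.032]    # std(1), std(2), std(3)
timeliness, precision = timeliness_and_precision(mean_errors, std_errors)
print(f"timeliness = {timeliness:.4f}, precision = {precision:.4f}")
```

Because Eq. (11) averages signed errors, positive and negative biases from different runs can cancel. Timeliness is therefore best read together with precision.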

    The repeatability is given by the standard deviation of both E(i) and std(i). A simple way to calculate the repeatability parameter is:

    Fig. 5. Comparison between test data and output data as a scatterplot, using (a) MLP and (b) RBF. [Both panels plot estimated (E) against actual (A) values, with the ideal line A = E, the best linear fit and the data points.]

    Fig. 6. Comparison between test data and output data, using (a) MLP and (b) RBF. [Both panels plot normalized manufacturing costs against validation samples, showing the actual cost and the estimated cost.]


    $$\text{Repeatability} = \frac{\sigma_{\mathrm{std}} + \sigma_{E}}{2} \qquad (13)$$

    The perfect score is repeatability = 0. This parameter indicates how closely the different values of E(i) and std(i) are clustered together across runs, that is, it reveals their dispersion.

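    The accuracy is obtained from the three preceding parameters (repeatability, timeliness and precision) and gives a global appreciation of the prediction. A simple way to calculate accuracy is:

    $$\text{Accuracy} = \frac{1}{\text{Repeatability} + \text{Timeliness} + \text{Precision}} \qquad (14)$$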

    If a neural model has good timeliness and precision and is completely repeatable, its predictions are very close to the real data, and the prediction confidence is very high. A large value of the accuracy parameter therefore indicates high confidence in the prediction.
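The chain from per-run statistics to the accuracy score can be sketched as below. The function name `accuracy_metrics` and the input values are hypothetical. The sketch also assumes the population standard deviation (NumPy's default, `ddof=0`) for the two sigma terms, since the text does not say which estimator is used.

```python
import numpy as np

def accuracy_metrics(mean_errors, std_errors):
    """Combine per-run E(i) and std(i) into the four aggregate metrics."""
    E = np.asarray(mean_errors, dtype=float)
    S = np.asarray(std_errors, dtype=float)
    timeliness = E.mean()                        # Eq. (11)
    precision = S.mean()                         # Eq. (12)
    # Eq. (13): average of the two dispersions (population std assumed)
    repeatability = (S.std() + E.std()) / 2.0
    # Eq. (14): undefined when all three terms are zero (a perfect model)
    accuracy = 1.0 / (repeatability + timeliness + precision)
    return {
        "timeliness": timeliness,
        "precision": precision,
        "repeatability": repeatability,
        "accuracy": accuracy,
    }

# Hypothetical values for M = 2 runs
metrics = accuracy_metrics(mean_errors=[0.01, 0.02], std_errors=[0.03, 0.05])
print(metrics)
```

Since accuracy is the reciprocal of a sum, it is dominated by whichever of the three terms is largest. A model cannot score well by excelling on only one of them.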

    Table 5 shows the results of the previously mentioned metrics for both networks. The best results were achieved by the sigmoid MLP-type network. The network based on the multilayer perceptron structure performs better on all the metrics presented, with the largest gains in the accuracy of the model (20.8433 versus 15.2674) and in its precision (0.0291 versus 0.0378).
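As a quick consistency check, the accuracy values reported in Table 5 can be recomputed from the tabulated repeatability, timeliness and precision using Eq. (14). The dictionary layout and the helper name are illustrative only. The numbers are those of Table 5.

```python
# Values copied from Table 5 (timeliness is listed there as TIME-L)
table5 = {
    "MLP": {"timeliness": 0.0152, "precision": 0.0291,
            "repeatability": 0.0037, "accuracy": 20.8433},
    "RBF": {"timeliness": 0.0236, "precision": 0.0378,
            "repeatability": 0.0041, "accuracy": 15.2674},
}

def accuracy_from_components(row):
    """Eq. (14): reciprocal of the sum of the three component metrics."""
    return 1.0 / (row["repeatability"] + row["timeliness"] + row["precision"])

recomputed = {name: accuracy_from_components(row) for name, row in table5.items()}
for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.4f}, reported {table5[name]['accuracy']:.4f}")
```

The recomputed values agree with the reported ones to within about 0.01. That gap is consistent with the components having been rounded to four decimals in the table.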

    6. Conclusions

    Two neural network based models have been developed for early manufacturing cost estimation of piping elements by applying the principles of supervised learning. Both models proved capable of reducing the uncertainties related to the cost estimation of piping elements. The first model, a multilayer perceptron, showed better performance in terms of precision, repeatability and accuracy. The second model, based on radial basis functions, showed better convergence speed than the MLP. The approach solves the complex non-linear mapping needed to predict the manufacturing cost at any phase of the design process of piping elements.

    References

    Bode, J. (2000). Neural networks for cost estimation: Simulations and pilot application. International Journal of Production Research, 38(6), 1231–1254.

    Cavalieri, S., Maccarrone, P., & Pinto, R. (2004). Parametric vs. neural network models for the estimation of production costs: A case study in the automotive industry. International Journal of Production Economics, 91, 165–177.

    Che, Z. H. (2010). PSO-based back-propagation artificial neural network for product and mold cost estimation of plastic injection molding. Computers and Industrial Engineering, 58(4), 625–637.

    Cos, J. de., Sanchez, F., Ortega, F., & Montequin, V. (2008). Rapid cost estimation of metallic components for the aerospace industry. International Journal of Production Economics, 112(1), 470–482.

    Curran, R., Raghunathan, S., & Price, M. (2004). Review of aerospace engineering cost modelling: The genetic causal approach. Progress in Aerospace Sciences, 40, 487–534.

    Ficko, M., Drstvensek, I., Brezocnik, M., Balic, J., & Vaupotic, B. (2005). Prediction of total manufacturing costs for stamping tool on the basis of CAD-model of finished product. Journal of Materials Processing Technology, 164–165, 1327–1335.

    Giachetti, R., & Arango, J. (2003). A design-centric activity-based cost estimation model for PCB fabrication. Concurrent Engineering, 11, 139–149.

    Graham, C., & Smith, S. D. (2004). Estimating the productivity of cyclic construction operations using case-based reasoning. Advanced Engineering Informatics, 18, 17–28.

    Hmidaa, F., Martin, P., & Vernadat, F. (2006). Cost estimation in mechanical production: The cost entity approach applied to integrated product engineering. International Journal of Production Economics, 103, 17–35.

    Kim, K.-J., & Han, I. (2003). Application of a hybrid genetic algorithm and neural network approach in activity-based costing. Expert Systems with Applications, 24(1), 73–77.

    Kim, G. H., Seo, D. S., & Kang, K. I. (2005). Hybrid models of neural networks and genetic algorithms for predicting preliminary cost estimates. Journal of Computing in Civil Engineering, 19(2), 208–211.

    Koonce, D., Judd, R., Sormaz, D., & Masel, D. T. (2003). A hierarchical cost estimation tool. Computers in Industry Archive, 50(3), 50.

    Kwang-Kyu Seo & Ahn, B. (2006). A learning algorithm based estimation method for maintenance cost of product concepts. Computers and Industrial Engineering, 50, 66–75.

    Murat Günaydın, H., & Zeynep Doğan, S. (2004). A neural network approach for early cost estimation of structural systems of buildings. International Journal of Project Management, 22(7), 595–602.

    Niazi, A., Dai, J. S., Balabani, S., & Seneviratne, L. (2006). Product cost estimation: Technique classification and methodology review. Journal of Manufacturing Science and Engineering-Transactions of the ASME, 128(2), 563–575.

    Ou-Yang, C., & Lin, T. S. (1997). Developing an integrated framework for feature based early manufacturing cost estimation. International Journal of Advanced Manufacturing Technology, 13, 618–629.

    Ozbayrak, M. O., Akgun, M., & Turker, A. K. (2004). Activity-based cost estimation in a push/pull advanced manufacturing system. International Journal of Production Economics, 87(1), 49–65 [8 January].

    Rush, C., & Roy, R. (2000). Analysis of cost estimating processes used within a concurrent engineering environment throughout a product life cycle. In Seventh

    Fig. 7. Residuals versus the predicted costs obtained by (a) the selected MLP ANN configuration, (b) the selected RBF ANN. [Both panels plot the relative residual against the predicted values.]

    Table 5
    Metric results for both networks.

    Metric          MLP       RBF
    RMSE            0.0307    0.0381
    R-square        0.9751    0.9612
    MAPE            5.5089    8.3152
    Timeliness      0.0152    0.0236
    Precision       0.0291    0.0378
    Repeatability   0.0037    0.0041
    Accuracy       20.8433   15.2674


    ISPE international conference on concurrent engineering: research and applications, Lyon, France (pp. 58–67), July 17–20, PA, USA.

    Seo, K.-K., Park, J.-H., Jang, D.-S., & Wallace, D. (2002). Approximate estimation of the product life cycle cost using artificial neural networks in conceptual design. International Journal of Advanced Manufacturing Technology, 19, 461–471.

    Shehab, E., & Abdalla, H. (2002). An intelligent knowledge-based system for product cost modelling. International Journal of Advanced Manufacturing Technology, 19, 49–65.

    Verlinden, B., Duflou, J. R., Collin, P., & Cattrysse, D. (2008). Cost estimation for sheet metal parts using multiple regression and artificial neural networks. A case study. International Journal of Production Economics, 111(2), 484–492.

    Zemouri, R., Gouriveau, R., & Zerhouni, N. (2010). Defining and applying prediction performance metrics on a recurrent NARX time series model. Neurocomputing, 73, 2506–2521.
