
Artificial neural network as an incremental non-linear

constitutive model for a finite element code

M. Lefik a, B.A. Schrefler b,*

a Department of Mechanics of Materials, Technical University of Łódź, Al. Politechniki 6, Łódź 93-590, Poland

b Department of Structures and Transportation Engineering, University of Padua, Via Marzolo 9, Padova 35131, Italy

Received 12 May 2003; accepted 12 May 2003

Abstract

A back propagation artificial neural network (BP ANN) is proposed as a tool for numerical modelling of the

constitutive behaviour of a physically non-linear body. The training process of the ANN using experimental data is discussed

in detail and illustrated with an example. In particular, some difficulties in the constitutive description proposed

in consistent, incremental form are discovered and two solutions are proposed to overcome them. Two numerical

examples are presented. The first one deals with modelling of elasto-plastic hysteresis, the second shows the application

of ANN to approximation of biaxial non-linear behaviour.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Artificial neural network; Constitutive modelling; Finite elements method

1. Introduction

Classical application of artificial neural network (ANN) for constitutive modelling of concrete was

originally proposed by Ghaboussi et al. in [7]. An improved technique of ANN approximation for a

problem of mechanical behaviour of drained and undrained sand is presented in [6]. In state-of-the-art

reviews [10,23–25] the role of neural computing in constitutive modelling is clearly pointed out. A similar

approach is employed in [4,18,11,21] and many other papers. The interest of such an application of ANN in the case when the model is built directly from some available experimental data is obvious. In such a case an

unknown conventional analytical constitutive description can be directly replaced with a suitably trained

ANN. A source of knowledge for ANN is not a symbolic formula but the set of experimental data in this

case. The essence of the ANN technique is to construct the application that attributes a given set of output

vectors to a given set of input vectors. When applied to the constitutive description, the physical nature of

* Corresponding author. Tel.: +39-49-827-5602; fax: +39-49-827-5604.

E-mail address: [email protected] (B.A. Schrefler).

0045-7825/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved.

doi:10.1016/S0045-7825(03)00350-5

Comput. Methods Appl. Mech. Engrg. 192 (2003) 3265–3283

www.elsevier.com/locate/cma

these input–output data is clearly determined by measured quantities: strains–stresses or displacements–forces.

We mention that in the case of substitution of the conventional description with the non-symbolic one,

one usually constructs this new approximation by presenting the network with pairs "element of domain–element of image" of the considered, known constitutive operator. The neural "black box" operator, replacing

an existing symbolic description, can be simpler in numerical manipulations, even as an element of a

FE code, as it is shown in [5,15]. A hybrid FE-ANN code is described also in [19,20]. The authors show that

the insertion of the constitutive law presented in the form of neural operator leads to some qualitative

improvements in application of FE in engineering practice. Namely, the ANN representation can be

modified to reduce an error of FE numerical experiment with respect to the real experiment. Our representation

of the constitutive law with ANN is slightly different. It is incremental, while in [19,20] the $\varepsilon$–$\sigma$ functions

are directly approximated.

The construction of the non-symbolic description of non-linear constitutive behaviour known from

experiment and the use of this representation in a finite element code is the subject of the present paper.

Previously, in [5], we successfully incorporated into the HAMTRA FE code an ANN description of a one-dimensional

functional dependence of sorption on relative humidity, known from experiment. The present

article has been inspired by an engineering analysis of mechanics of a bundle of super-conducting fibres for

fusion devices. The stress–strain relation will be coded with ANN and then used as a part of a FE model.

In the paper we analyse results of a recent experimental investigation of mechanical properties of a super-conducting

cable, performed at The University of Twente and published in [16]. This research revealed a

very complex irreversible, non-linear behaviour of the cable. The description of these experimental results

in terms of classical rheology has been undertaken in [3] and in [13,14] by ourselves. Building the symbolic theoretical

model is composed of two equally difficult steps: first the rheological scheme must be proposed, then

its parameters should be identified to fit the measured data well. This procedure results in rather complicated

formulae, the use of which inside a FE model seems to be questionable. The method we propose in this

paper, which employs the ANN technique, does not require any arbitrary choice of the constitutive model.

The numerical parameters of the proposed description are easily and automatically defined. As will be shown, it can be incorporated in a very natural manner into any FE code.

The presented method is, in our opinion, the shortest way from experimental research to numerical

modelling. In subsequent sections we analyse in some detail the properties of the approximation of the

constitutive law by ANN from the point of view of the application in FE computations. We then explain

how the ANN is inserted into a FE code. The paper is completed with two examples of practical applications:

the first one is very simple, one-dimensional; its advantage is that it can be compared directly with

experimental data that have been used before to train the network. The second one is two-dimensional and

describes the non-linear mechanical behaviour of the same super-conducting coil under tension–compression

cycles.

2. Neural network for constitutive modelling

A neural network can be considered as a collection of simple processing units (nodes, artificial neurons)

that are mutually interconnected with variable weights. This system of units is organised to transform a set

of given input signals into a set of given output signals. This transformation is organised as follows: each

node of the network computes first its activation as a weighted sum of incoming signals. Then the node

transforms its activation by the non-linear, usually sigmoid transfer function (2) and sends it to every

connected node. Both input to the network and the output from the network are suitably defined to possess

a needed physical interpretation. In our case this is a sequence of corresponding values of stresses and

strains. A functional dependence between input and output (if it exists) can be approximated by the network

with an arbitrary precision. It is proven that ANN with sigmoid transfer function can be regarded as a

universal approximator of a continuous function of many variables. The proof can be found in [2] and in

many other papers quoted there.

The pair "given input–known output (target)" forms an input pattern. After each forward transmission of the input signal through the network the transformed signal is compared with the target value and the error

is computed. The weights of connections are modified to reduce the total error between the current network

response and the corresponding target. This process is called "training" or "learning". In this paper the

ANN is trained by means of the back propagation (BP) algorithm. This is in fact a method of computing

the gradient of the square norm of the output error with respect to the weights. According to the steepest

descent rule, the individual correction of the strength of connection $w_{ij}$ is proportional to the $ij$th component

of the negative gradient. The proportionality factor is called the learning rate. We mention that any other

effective iterative minimisation algorithm can also be used instead, for example the Newton method,

conjugate gradient method and many others.

The process of weights correction is continued until the difference between the neural network output

and the desired, known output is minimised for a whole set of pairs "given input–known output".
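The weight correction described above can be illustrated with a minimal sketch. It assumes a network with one hidden layer and a linear output, the error taken as half the squared norm of the output error, and the bipolar sigmoid with unit parameters; the layer sizes, the data, the learning rate and the names `f`, `df` and `bp_step` are assumptions of this example and not part of the original code.

```python
import numpy as np

def f(x):
    """Bipolar sigmoid with unit parameters, written as tanh(x/2)."""
    return np.tanh(0.5 * x)

def df(x):
    """Derivative of f with respect to its argument."""
    return 0.5 * (1.0 - np.tanh(0.5 * x) ** 2)

def bp_step(w1, w2, x, t, rate):
    """One correction of the weights along the negative gradient of
    E = 0.5 * ||o - t||^2. Returns the new weights and the error E
    evaluated before the correction."""
    a1 = x @ w1                                # activations of the hidden layer
    h = f(a1)                                  # signals sent by the hidden layer
    err = h @ w2 - t                           # output error (linear output layer)
    grad_w2 = h.T @ err                        # dE/dw2
    grad_w1 = x.T @ ((err @ w2.T) * df(a1))    # dE/dw1, error propagated backwards
    error = 0.5 * float(np.sum(err ** 2))
    return w1 - rate * grad_w1, w2 - rate * grad_w2, error

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(20, 2))       # 20 hypothetical input patterns
t = np.sin(x[:, :1]) * x[:, 1:]                # hypothetical targets
w1 = rng.normal(scale=0.5, size=(2, 4))        # input -> hidden weights
w2 = rng.normal(scale=0.5, size=(4, 1))        # hidden -> output weights

errors = []
for _ in range(200):                           # 200 training epochs
    w1, w2, e = bp_step(w1, w2, x, t, rate=0.005)
    errors.append(e)
```

The list `errors` records the error before each correction, so its decrease over the epochs shows the effect of the learning rate chosen here.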

The action of a classical ANN operator on a given input vector i is summarised by:

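$$\Delta\sigma_i = o_i = \sum_r w^3_{ri}\, f\!\left( \sum_q w^2_{qr}\, f\!\left( \sum_p i_p w^1_{pq} + b^1_q \right) + b^2_r \right) + b^3_i . \tag{1}$$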

In the above, parentheses $f(\cdot)$ enclose the argument of function $f$, $w^i$ denotes a matrix of adaptable weights of connections between nodes belonging to the neighbouring $i$th and $(i+1)$th layers of neurones, and $b^i$ is a vector of bias values attributed to the hidden layer number $i$. The bipolar sigmoid activation function $f_i$ is attributed to neurone number $i$ in a hidden or output layer:

$$f_i(x) = p_i\, \frac{1 - \exp(-k_i x)}{1 + \exp(-k_i x)} . \tag{2}$$

The two parameters $p$ and $k$ can also be treated as variables adaptable during the learning process for each

neurone or for some groups of neurones (as described in [12]) but in the present application they are

kept constant, as is usual in a typical BP ANN.

The presence of the transfer function $f$ assures that the transformation $i \to o$ is non-linear.

In this paper, neural networks with at most two hidden layers are used in simulations. The scheme of the

neural network associated with Eq. (1) is presented in Fig. 1.

The interested reader is referred to any of the textbooks [1,8,9,17,22] for details concerning the activity of

nets and nodal units.
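The chain of transformations in Eq. (1), with the transfer function (2), can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the layer sizes, the random weights, the sample input and the names `bipolar_sigmoid` and `forward` are hypothetical, and $p$ and $k$ are kept constant as in the text.

```python
import numpy as np

def bipolar_sigmoid(x, p=1.0, k=1.0):
    """Eq. (2): p (1 - exp(-k x)) / (1 + exp(-k x)), evaluated through the
    mathematically identical but overflow-safe form p * tanh(k x / 2)."""
    return p * np.tanh(0.5 * k * x)

def forward(i, w1, b1, w2, b2, w3, b3):
    """Eq. (1): two hidden layers with sigmoid transfer and a linear output sum."""
    h1 = bipolar_sigmoid(i @ w1 + b1)    # inner sum over p, plus bias b1_q
    h2 = bipolar_sigmoid(h1 @ w2 + b2)   # middle sum over q, plus bias b2_r
    return h2 @ w3 + b3                  # outer sum over r, plus bias b3_i

rng = np.random.default_rng(0)
n_in, m, n, n_out = 4, 5, 3, 1           # hypothetical layer sizes
w1, b1 = rng.normal(size=(n_in, m)), rng.normal(size=m)
w2, b2 = rng.normal(size=(m, n)), rng.normal(size=n)
w3, b3 = rng.normal(size=(n, n_out)), rng.normal(size=n_out)

i = np.array([0.01, 0.5, 0.0, 0.001])    # hypothetical input column matrix
o = forward(i, w1, b1, w2, b2, w3, b3)   # network output, shape (n_out,)
```

With untrained random weights the output has no physical meaning; the sketch only shows how the nested sums of Eq. (1) map onto matrix products.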

Fig. 1. Scheme of the neural network with hidden layers. The squares illustrate the action of the transfer function $f$ on its argument, the activation of the neurone. The leftmost circles, indicating the input layer, are mostly nested in Eq. (1). This chain of transformations interprets clearly the approximation formula (1).


The structure of the input vector i is crucial for the problem and it is discussed in Sections 2.1 and 2.2.

We note here that the input and output data, from a formal point of view, form an ordered set of scalars

but not necessarily a vector or tensor. Because of this we will use the term "column matrix" for both the input $i$

and the output $o$. A set of all inputs (or all outputs) forms a rectangular matrix of dimension $I \times P$ ($O \times P$), where $I$ and $O$ are the numbers of input and (respectively) output nodes and $P$ is the number of patterns.

The non-symbolic model is constructed as follows: the neural network is trained first to reflect correctly

the set of observed, experimental data. In current practice only a part of the available data is used as a

"training set" while the other part, a "test set", is hidden from the network in training. It is used for current

testing (during each epoch) of the correctness of the predictions of the network when presented with unknown

data. The network's generalisation capability (interpolation between some data sets is a particular

case of generalisation) enables us to predict the material behaviour, i.e. to produce the strain–stress

graph for an arbitrary sequence of strain values. The network's simulation can be checked against the

real experimental results at this step. Such a new set of experimental data is called a "verification set". If

the network's prediction is satisfactory, the model is ready; if not, the new experimental data can be

added to the existing training set and the network should be taught again in the larger experimental

context.

Mathematical models describing the relationship between stresses and strains consist of mathematical

rules and expressions that explain the observed material behaviour. Such a symbolic description usually becomes

very complex when it captures non-linear effects and accounts for different material behaviours in

various ranges of stresses or strains. Usually the form of this description is postulated and then checked

against a few (but carefully defined) experimental observations. Artificial neural networks provide an alternative,

non-symbolic approach to this problem. Since the neural operator is defined by learning from known

experimental data, this is a knowledge-based rather than a speculative approach.

2.1. Constitutive description by a set of experimental curves in stress space

A constitutive relation can be represented by a set of curves in stress space, obtained (experimentally or numerically) for some given strain paths. ANN can be trained to reproduce these curves and to interpolate

between them. In this sense ANN acts as the constitutive operator when presented with a vector containing

strains and possibly some other state variables at the input. It means that the ANN can be constructed to

approximate the family of functions $\sigma_{ij}(\varepsilon)$ of a tensorial argument $\varepsilon$.

Extrapolating the method used in our previous paper [5], the following description of the graphs of the

constitutive relationships is possible: at the input layer we present the points defining a reasonably long

segment on the graph and the independent variable of any other point on the curve. The value of the

function corresponding to this last point is presented at the output of the network. The scheme of such an

ANN with the shortest possible segment on the graph can be denoted (for 2D) 15–m–n–3, where 15 neurones

in the input layer take the values $\varepsilon_i, \varepsilon_j, \sigma(\varepsilon_i), \sigma(\varepsilon_j), \varepsilon_k$ while three output neurones take the values of $\sigma(\varepsilon_k)$,

corresponding to the input value $\varepsilon_k$. The two hidden layers contain respectively $m$ and $n$ neurones. The Latin subscript denotes the number of the experimental point on the curve. Taking into account the fact that the state

of stress in the current point is the function of all components of the stress tensor (treated as the state

variables) the approximation applicable for a functional dependence should be reformulated. The following

input pattern is thus well physically motivated:

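$$\{(\varepsilon_i,\ \varepsilon_j,\ \sigma(\varepsilon_i),\ \sigma(\varepsilon_j),\ \varepsilon_k),\ \sigma(\varepsilon_k)\} . \tag{3}$$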

The two elements in curly brackets represent the input and output column matrices. When we deal with hysteresis

loops, an auxiliary point must be added to the input set, to mark the segment of the curve. Any other

point on the curve (for example the $j$th in the preceding formula) can take the role of this supplementary element

of the input matrix, as well as a parameter representing the cumulative work or energy dissipated up

to the current point. This choice is discussed in [13]. The presented concept is very natural but, as we have

shown in [12], the alternative, incremental description is simpler and more efficient. In this paper we will use

exclusively the following scheme of the approximation of the constitutive relationships. The input column matrix needed for (1) is always of the form:

$$i = (\varepsilon_i,\ \sigma_i,\ \Delta\varepsilon_i) \quad \text{or} \quad i = (\varepsilon_i,\ \sigma_i,\ g_i,\ \Delta\varepsilon_i); \tag{4}$$

g is a scalar parameter, very important when we deal with irreversible processes; for soils it can take a role

of porosity for instance. The choice and physical interpretation of g is discussed in [13]. The output is

always the stress increment Dr. Pattern sequence for the network we propose in this paper will be of the

form:

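$$\{\varepsilon_i,\ \sigma_i,\ g_i,\ \Delta\varepsilon_i;\ (\Delta\sigma_{ab})_i\} . \tag{5}$$

The operator $\Delta$ acting on any measured entity $s$ can be defined as follows:

$$\forall j > i:\ \Delta_j s_i = s_j - s_i \quad \text{or} \quad \Delta s_i = s_{i+1} - s_i . \tag{6}$$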

Since in expression (5) we deal not only with increments but also with values of stress measured exactly at

the $i$th point of the constitutive curve, the choice between forward, central or backward increment definition

is not trivial. It can influence the process of numerical integration of stress. The proposed choice seems to be

very natural from the physical point of view.
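The construction of the pattern sequence (5) with the forward increment of Eq. (6) can be sketched as follows. The sketch is restricted, as an assumption, to a one-dimensional curve and to the three-component input of Eq. (4) without the parameter $g$; the function name `incremental_patterns` and the sample values are hypothetical.

```python
def incremental_patterns(strain, stress):
    """Build patterns ((eps_i, sig_i, d_eps_i), d_sig_i) from one sampled
    curve, using the forward increment d s_i = s_{i+1} - s_i of Eq. (6)."""
    patterns = []
    for i in range(len(strain) - 1):
        d_eps = strain[i + 1] - strain[i]   # strain increment (input)
        d_sig = stress[i + 1] - stress[i]   # stress increment (target)
        patterns.append(((strain[i], stress[i], d_eps), d_sig))
    return patterns

# Hypothetical samples of a one-dimensional stress-strain curve
strain = [0.0, 0.001, 0.002, 0.003]
stress = [0.0, 0.2, 0.35, 0.45]
patterns = incremental_patterns(strain, stress)
```

Only neighbouring points are paired here, so a curve with N sampled points yields N - 1 patterns.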

Note that we can always build three (for the 2D case) independent networks with a single node at the

output, instead of the one network with multi-nodal output, described by Eq. (3). Each of the single nodes

is interpreted as one component of the stress tensor, separately approximated. These three networks (5) are

equivalent to the one with a multi-nodal output layer when trained with the same set of patterns. This is

preferable especially in the case when one of the tensor's components is more difficult to approximate with

the same precision than the other two.

One can observe that in the proposed representation of the constitutive law neither yield surface nor plastic potential are explicitly defined. However, the stress response to any strain input will never fall

outside the admissible domain in stress space since the network was trained only with admissible graphs.

This approach is thus consistent with the traditional one, without defining any of the surfaces in stress

space, required by the classical elastic–plastic approach.

2.2. Influence of the length of the strain increment on the approximation quality

Expression (1) proposes the ANN as a tool for approximation of curves resulting from experiment

and defining a constitutive relationship. For the use of this approximation in the FE model it is important

to have the increments $\Delta s$ as small as possible since it will serve for approximation of a tangent stiffness matrix. Because of this, in the formula (5) for each $i$ only the neighbouring $j$ will be used in the data set

prepared for the training process. If the experimental data are not dense enough, they can always be

smoothed, taking into account each three neighbouring experimental points. The training data can be

extracted from such an artificially modified set of experimental data by interpolation. This procedure has

not been applied in the paper.

A much more important problem results from the fact that the strain increment takes the role of an independent

variable in the representation (1). From the practice of approximation with ANN it is known

that the interpolation is good only for those values of the independent variable that lie inside the segment of

this variable used in training. The network prediction will thus be unacceptable both for shorter and longer

strain increments. An example provided below shows numerical evidence of this feature. Two solutions

can be proposed for this problem. The first one assumes that only the "best" increments of strains can be

used as elements of the input of the trained ANN in recall mode. The choice of the increment used in

training as the "best" one is obvious (but not unique). The results for any shorter increment should thus be

obtained by an a posteriori interpolation between results obtained from ANN presented with an "optimal" increment.

According to the second solution an artificial subset of data, constructed for $\Delta\varepsilon = (0, 0, 0)$, can be added to the training set. Expression (7)$_1$ says that a zero value of the independent variable gives a zero increment of

the dependent variable. Also some interpolated points for increments shorter than those given from experiment

can be used in the teaching phase. In Eq. (7)$_2$ the linear interpolation with $a < 1$ is considered.

$$\{\varepsilon_i,\ \sigma_i,\ g_i,\ 0;\ 0\}, \qquad \{\varepsilon_i,\ \sigma_i,\ g_i,\ a\,\Delta_{\mathrm{given}}\varepsilon_i;\ a\,\Delta_{\mathrm{given}}\sigma_i\} . \tag{7}$$

The first method affects the use of the ANN in the recall mode, the second improves the learning phase. We

use both solutions throughout the paper.
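The second solution, Eq. (7), amounts to a simple augmentation of the pattern set. The sketch below assumes one-dimensional patterns of the form ((strain, stress, strain increment), stress increment), without the parameter $g$; the function name `augment_with_short_increments`, the default factors and the sample patterns are hypothetical.

```python
def augment_with_short_increments(patterns, factors=(0.25, 0.5, 0.75)):
    """Return the patterns extended by the artificial subset of Eq. (7):
    a zero-increment pattern and linearly interpolated patterns (a < 1)."""
    extra = []
    for (eps, sig, d_eps), d_sig in patterns:
        extra.append(((eps, sig, 0.0), 0.0))                  # Eq. (7), first set
        for a in factors:
            extra.append(((eps, sig, a * d_eps), a * d_sig))  # Eq. (7), second set
    return patterns + extra

# Hypothetical experimental patterns
base = [((0.0, 0.0, 0.001), 0.2), ((0.001, 0.2, 0.001), 0.15)]
augmented = augment_with_short_increments(base, factors=(0.5,))
```

Each experimental pattern contributes one zero-increment pattern and one pattern per interpolation factor, so the size of the training set grows accordingly.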

2.3. Construction of a pattern set for the ANN representation of the constitutive law

It is obvious that the neural-like approximation of the true constitutive law must preserve the objectivity

requirement. It means that the rigid rotation superposed with the deformation due to applied load cannot

influence the value of the approximate stress tensor related to the local, material co-ordinate system. It is

easy to satisfy this requirement: the network must be presented with true, measured or computed data,

obtained for rotated body. This is the unique way to force the neural-like representation to preserve the

objectivity: since the construction of the ANN is given, only the choice of weights (thus training with

correct or true examples) can assure the invariance with respect to rigid rotations. In the paper, we assume

that we deal with infinitesimal transformations thus the problem of objectivity of the stress increment is

trivial. We add, however, a set of artificial data simulating the response of the investigated material when

measured in a rotated co-ordinate system. We explicitly assume that we deal with isotropic response in the

cross section of the cable. This is a reasonable assumption for a composite with random spatial ordering of components, as is the case here.

We note that if the response were not isotropic, the information concerning its anisotropy should be

given as a part of the experimental results, since our approach is knowledge based!

Let us denote by @ the action on the input data of the neural-like operator N, defined by (1).

$$N \mathbin{@} \{\text{input}\} = \{\text{output}\} . \tag{8}$$

In the context of formula (4) we must verify the following condition, imposed by isotropy of the ap-

proximated constitutive law (T denotes transposition of the matrix):

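$$\forall i:\ \text{if}\ N \mathbin{@} \begin{Bmatrix} \varepsilon_i \\ \sigma_i \\ g \\ \Delta\varepsilon_i \end{Bmatrix} = \{\Delta\sigma_i\}\ \text{then}\ \forall H:\ H^T H = 1\ \text{we have}\ N \mathbin{@} \begin{Bmatrix} H^T \varepsilon_i H \\ H^T \sigma_i H \\ g \\ H^T \Delta\varepsilon_i H \end{Bmatrix} = H^T \Delta\sigma_i H . \tag{9}$$

To satisfy condition (9) we must train the network with some supplementary data: the new subset of

patterns is of the form:

$$\{H^T \varepsilon_i H,\ H^T \sigma(\varepsilon_i) H,\ g_i,\ H^T \Delta\varepsilon_i H;\ H^T (\Delta\sigma_{ab})_i H\}_k . \tag{10}$$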

The subscript $k$ refers to the pattern obtained from the $i$th experimental point by transformation by the $k$th rotation

matrix $H$. The total number $K$ of these additional terms in the matrix of patterns depends on the

number of trial rotations needed to train the network up to a satisfactory level of tolerance.

We recall that the set of data contains also a number of supplementary patterns prescribed by (7).
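The supplementary patterns of Eq. (10) can be generated as in the following sketch for the 2D case. It assumes that strains and stresses are stored as symmetric 2×2 arrays and that the trial rotations are given by a list of angles; the names `rotation`, `rotate` and `rotated_patterns`, the angles and the numerical values are hypothetical.

```python
import numpy as np

def rotation(theta):
    """Plane rotation matrix H, satisfying H^T H = 1."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rotate(A, H):
    """Components of the tensor A in the rotated system: H^T A H."""
    return H.T @ A @ H

def rotated_patterns(eps, sig, g, d_eps, d_sig, angles):
    """Eq. (10): one supplementary pattern per trial rotation. The scalar
    parameter g is not affected by the rotation."""
    out = []
    for theta in angles:
        H = rotation(theta)
        out.append((rotate(eps, H), rotate(sig, H), g,
                    rotate(d_eps, H), rotate(d_sig, H)))
    return out

# One hypothetical experimental point (2x2 symmetric tensors)
eps = np.array([[0.010, 0.002], [0.002, -0.003]])
sig = np.array([[1.0, 0.1], [0.1, -0.2]])
d_eps = np.array([[0.001, 0.0], [0.0, 0.0]])
d_sig = np.array([[0.08, 0.0], [0.0, 0.01]])
pats = rotated_patterns(eps, sig, 0.0, d_eps, d_sig, [0.0, np.pi / 6, np.pi / 2])
```

The number of angles plays the role of $K$ in the text; how many trial rotations are needed in practice is not determined by this sketch.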




2.4. Modified algorithm of training

We propose the following, modified algorithm of the supervised training of the ANN:

• Initiation of weights and activation functions to assure the best linear transformation of $i$ into $\Delta\sigma$. Initial weights minimise the value of the distance

$$\left| X_{bm} Y_{ma} - t_{br}\, i^T_{ra} \left( i_{ar}\, i^T_{ra} \right)^{-1} \right| , \tag{11}$$

where $r$ is the number of patterns, $a$, $b$, $m$ are the numbers of input, output and hidden nodes, $t$ and $i$ are respectively the target and input vectors, and $X$ and $Y$ are matrices of weights of synaptic links between input layer–hidden layer and hidden layer–output layer.

• Training of the network with only the experimental data (the part of data resulting from trial rotations for objectivity is not taken into account).

• Expanding the network with two supplementary layers: after the input layer and before the output layer.

• Training of the network with only the part of data resulting from trial rotations, needed to force the objectivity of the approximation. The weights of the previously trained main hidden layers are kept frozen during this step.

• Final correction of all weights using the BP (delta-bar-delta) algorithm.

The introduced method of weights initialisation is not very important, but for some classes of constitutive

relationships it would result in an important economy in the number of iterations. It can be explained as

follows. For the network without hidden layers the best linear approximation of the relation between the

input vector $i$ and the target $t$ is assured by a matrix $W_{ba}$:

$$\text{if}\ W = t\, i^T \left( i\, i^T \right)^{-1}\ \text{then}\ \| t - W i \| = \text{minimum} . \tag{12}$$

For the network with one hidden layer the same relation (the same $W$) can be decomposed using two

matrices of weights $X_{br}$, $Y_{ra}$ such that $XY = W$. For $r \le a$ and a random choice of $Y$, the best choice of $X$ is the following:

$$X = W Y^T \left( Y Y^T \right)^{-1} . \tag{13}$$

The same reasoning can be repeated for each supplementary hidden layer.

It is seen that the network is constructed starting with the best linear approximation of the relation

between the input and the target. For a large class of constitutive relationships this initial guess can be very

close to the final approximation.
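Relations (12) and (13) can be checked numerically with a short sketch. It assumes that the patterns are stored column-wise (inputs of size a × P, targets of size b × P) and treats the network as purely linear, as in the reasoning above; the function name `linear_initial_weights`, the matrix `A` and the random data are hypothetical.

```python
import numpy as np

def linear_initial_weights(i, t, r, rng):
    """Initial weights from the best linear map between inputs and targets.
    i: inputs (a x P), t: targets (b x P), r: number of hidden nodes (r <= a)."""
    W = t @ i.T @ np.linalg.inv(i @ i.T)      # Eq. (12): least-squares map
    Y = rng.normal(size=(r, i.shape[0]))      # random choice of Y (r x a)
    X = W @ Y.T @ np.linalg.inv(Y @ Y.T)      # Eq. (13): best X for this Y
    return W, X, Y

rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 3.0]])              # hypothetical linear law, b = 2, a = 3
i = rng.normal(size=(3, 10))                  # 10 hypothetical input patterns
t = A @ i                                     # targets generated by the linear law
W, X, Y = linear_initial_weights(i, t, r=3, rng=rng)
```

For r = a the product XY reproduces W; for r < a it is W restricted to the row space of Y, which is why the text speaks of the best choice of X rather than of an exact decomposition.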

2.5. Verification of the quality of the approximation

The following criteria are mostly used in practice to estimate the quality of the approximation of the set

of data with an ANN:

• Root mean square (RMS) of the error between the output generated by ANN and the target prescribed in the set of training patterns. It is computed after each epoch for both the training set and the testing set of patterns.

• Statistical correlation of the output data prescribed in the training patterns and the output generated by the ANN. This is also computed for both test and training sets.

The definition of the RMS error is given by:

$$\mathrm{RMS} = \sqrt{ \frac{1}{PK} \sum_{p,i} \left( T_{pi} - \mathrm{out}_{pi} \right)^2 } . \tag{14}$$




In (14) p is the current number of input pattern, i is the number of output node, Tpi is a component of the

target vector. P stands for the total number of patterns and K is the total number of output nodes.
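Definition (14) translates directly into code. In the sketch below the targets and the outputs are assumed to be stored as lists of P patterns, each with K output values; the function name `rms_error` and the sample numbers are hypothetical.

```python
import math

def rms_error(targets, outputs):
    """Eq. (14): root mean square error over P patterns and K output nodes."""
    P = len(targets)           # total number of patterns
    K = len(targets[0])        # total number of output nodes
    total = sum((T - out) ** 2
                for T_p, out_p in zip(targets, outputs)
                for T, out in zip(T_p, out_p))
    return math.sqrt(total / (P * K))

# Hypothetical targets and network outputs: 2 patterns, 2 output nodes
targets = [[1.0, 1.0], [1.0, 1.0]]
outputs = [[0.0, 0.0], [0.0, 0.0]]
value = rms_error(targets, outputs)   # every error equals 1, so RMS = 1.0
```

The same function can be applied separately to the training and the testing sets after each epoch.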

The set of test patterns is usually a subset of the whole data set. The patterns belonging to the test set are

never presented to the network in training. If the structure of the ANN is well designed, the mean square error is as small as possible simultaneously for the training and testing sets and similar for these two sets. It means that the

network is sufficiently rich to reproduce well the given set of patterns and the number of nodes (thus of

weights of the internodal connections) is not too great since the generalised results are close to the test ones.

Too great a number of neurones causes a pathological compliance of the network, so that the interpolation

of the testing patterns is very bad.

In some applications of ANN inside a FE code [5] we realised that even a well designed (according to the

above quoted criteria) ANN causes some errors. Analysis of this behaviour shows that another, stronger

criterion should be introduced. Let us consider the generation of the learned curve by the ANN via a

recurrent procedure in which the output values are used as elements of the input data for the next step:

• For a given initial point $(\varepsilon_0, \sigma_0, g_0, \Delta\varepsilon_0)$, the ANN generates the value of $(\Delta\sigma_{ab})_0$ on the output node.

• The input pattern for the step $i+1$ contains the value $\sigma_{ab}(\varepsilon_{i+1}) = \sigma_{ab}(\varepsilon_i) + (\Delta\sigma_{ab})_i$ ($(\Delta\sigma_{ab})_i$ computed by the ANN in the previous step); the new value of $g$ is $g_i = g(\varepsilon_i + \Delta\varepsilon_i)$.

• The output value $(\Delta\sigma_{ab})_{i+1}$ is used next for the generation of the subsequent point on the curve as well as the next input to the network.

We say that ANN verifies the "autonomous criterion" if the curve generated in a recurrent manner,

autonomously by ANN, lies sufficiently close to the curve described by a set of training and test patterns. All

graphs presented below show the autonomous behaviour of trained ANNs for some program of incrementation

of the independent variable. We use a step increment d different from that used in the learning process.
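The recurrent generation of a curve used in the autonomous criterion can be sketched as below. The sketch assumes a one-dimensional problem and the three-component input without the parameter $g$; the callable `predict_increment` stands in for a trained network, and the linear elastic stand-in `elastic` with modulus 2.0 is purely hypothetical.

```python
def autonomous_curve(predict_increment, eps0, sig0, strain_increments):
    """Generate a stress-strain curve recurrently: the stress increment
    returned at one step is accumulated and fed back as input to the next."""
    eps, sig = eps0, sig0
    curve = [(eps, sig)]
    for d_eps in strain_increments:
        d_sig = predict_increment(eps, sig, d_eps)   # output of the network
        eps, sig = eps + d_eps, sig + d_sig          # next input point
        curve.append((eps, sig))
    return curve

def elastic(eps, sig, d_eps):
    """Stand-in for a trained ANN: linear elastic response (assumption)."""
    return 2.0 * d_eps

curve = autonomous_curve(elastic, 0.0, 0.0, [0.25] * 4)
```

Comparing such a generated curve with the training and test curves, for strain increments different from those used in training, is the check described in the text.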

3. Simple examples of approximation of constitutive relationships with ANN

The purpose of this section is to illustrate the observations that:

• A network that is not correctly defined can show pathological behaviour.

• An ANN trained with input containing patterns constructed according to Eqs. (5), (7) and (10) behaves better than one trained with the training set defined as in (3).

• The network, for a given path of increments of strains, generates stress increments that allow one to trace curves very close to those used in training.

The above is illustrated with examples of one-dimensional hysteresis and a two-dimensional proportional

loading path.

The rheological type constitutive relationship we are going to approximate by ANN in these examples is

taken from [3]. It explains roughly the behaviour of the same super-conducting cable that will be finally

analysed in the last section of the present paper. The considered model was built using a number of Prandtl

elements i.e. elastic spring connected in series with a dry friction slider. All Prandtl elements are connected in

parallel. It is assumed that the common stiffness of all springs is constant and equal to $E$. The yield stress

$s_i = E d_s$ of the $i$th dry friction element varies from element to element. It is interpreted as a random variable with a hypothetical probability density function. Thanks to the parallel contribution of different sliders, the

very trivial elasto-plastic model represented by a single elastic spring–plastic slider transforms into a rich

model that can be used to simulate a complex material response. Our assumption concerning this probability

distribution of $s$ is different from those in [3] and is given by Eq. (15). A wide family of qualitatively different

behaviours can be obtained simply by changing a few parameters of the probability density

function defined below. This is the principal advantage of this model in numerical experimentation.

$$p(s) = \mathrm{Heaviside}(s - m)\, \frac{s - m}{v^2}\, \exp\!\left( \frac{m - s}{v} \right) . \tag{15}$$

Functions of loading, unloading and reloading are taken directly from [3] and are quoted below. Stresses

are calculated according to (16)–(18) for the proposed probability density function (15). In loading, the

strain driven numerical experiment is continued until $\varepsilon'$; in unloading the maximum strain value is $\varepsilon''$:

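$$\sigma_{lo} = E \left( \varepsilon - \int_0^{\varepsilon} (\varepsilon - s)\, p(s)\, ds \right) , \tag{16}$$

$$\sigma_{un} = E \left( \varepsilon - \int_0^{(\varepsilon' - \varepsilon)/2} (\varepsilon + s)\, p(s)\, ds - \int_{(\varepsilon' - \varepsilon)/2}^{\varepsilon'} (\varepsilon' - s)\, p(s)\, ds \right) , \tag{17}$$

$$\sigma_{re} = E \left( \varepsilon - \int_0^{(\varepsilon - \varepsilon'')/2} (\varepsilon - s)\, p(s)\, ds - \int_{(\varepsilon - \varepsilon'')/2}^{(\varepsilon' - \varepsilon'')/2} (\varepsilon'' + s)\, p(s)\, ds - \int_{(\varepsilon' - \varepsilon'')/2}^{\varepsilon'} (\varepsilon' - s)\, p(s)\, ds \right) . \tag{18}$$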

We underline that in the above equations the stress is expressed by total strain. The split of total strain into

the plastic and elastic parts is taken into account during the development of equations and is hidden from

the resulting formulae.
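The loading branch (16) with the probability density (15) can be evaluated numerically as in the following sketch. The trapezoidal quadrature, the number of integration points and the parameter values $E$, $m$, $v$ are assumptions of this example; the names `density`, `trapezoid` and `stress_loading` are hypothetical.

```python
import numpy as np

def density(s, m, v):
    """Eq. (15): p(s) = Heaviside(s - m) (s - m) / v^2 exp((m - s) / v)."""
    z = np.maximum(np.asarray(s, dtype=float) - m, 0.0)   # zero below s = m
    return z / v**2 * np.exp(-z / v)

def trapezoid(y, x):
    """Trapezoidal rule for samples y given on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def stress_loading(eps, E, m, v, n=4001):
    """Eq. (16): sigma_lo = E (eps - int_0^eps (eps - s) p(s) ds)."""
    s = np.linspace(0.0, eps, n)
    return E * (eps - trapezoid((eps - s) * density(s, m, v), s))

E, m, v = 1.0, 0.01, 0.005                     # hypothetical parameters
curve = [(eps, stress_loading(eps, E, m, v))
         for eps in np.linspace(0.0, 0.1, 11)]  # sampled loading curve
```

Below the threshold $m$ no slider is active and the response stays linear with stiffness $E$; for large strains the stress tends to $E$ times the mean of the density, which for (15) equals $m + 2v$.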

For the second example a superposition of the two one-dimensional stretchings is taken into account.

We consider only loading phase in this case:

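$$\sigma_{cd} = E_{cdab} \left( \varepsilon_{ab} - \int_0^{\varepsilon_{ab}} (\varepsilon_{ab} - s)\, p(s; m, n(\varepsilon_{ab}))\, ds \right) . \tag{19}$$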

The interaction between loading functions in two directions can be accounted for by assuming that the probability density function in one direction is influenced by strains in the second direction. It is, however,

neglected in the present paper. Parameter $n$ is a counterpart of $m$ in the second direction.

In Fig. 2a the stress responses are drawn for some proportional strain paths and for an intermediary

$m$ and $n$ while in Fig. 2b the limit case for $m$ tending to zero is presented. This last figure is similar to the

Saint-Venant yield criterion. We would like to stress that the model (19) used in the example is probably not

very realistic but it is very suitable for numerical tests since many different curves can be generated with it.

Fig. 2. 2D loading path: some possible stress responses for proportional, strain driven loading, created with Eq. (19). Panels (a) and (b); axes labelled 11 and 22.

3.1. Numerical experiments with hysteresis

The proposed technique of constitutive modelling with ANN will be illustrated with an example of

elastic–plastic hysteresis. This task is very specific: first of all the experimental data are presented in the form of

loops, i.e. not one-to-one mappings. Moreover, the whole family of hysteresis loops should be

approximated, not one single branch. The ANN must thus interpolate not only between points but also

between curves. Since the ANN will be incorporated into a FE code, the approximation must be relatively

independent of the step size $\|\Delta\varepsilon\|$.

During the numerical experiments we tested the capability of learning the one-dimensional hysteresis of

the $\sigma$–$\varepsilon$ graphs for several types of neural networks. The description of the network is always of the form

I–m–n–K (numbers of nodes in layers). The symbols that start with 5 (5 nodes in the input layer) denote the

ANN in non-incremental form (3). Those starting with 3 nodes correspond to the incremental form (5) but without the auxiliary variable $g$. The symbol I–m–n–K/$a$ means that the network with $m$ and $n$ nodes in hidden layers was tested with the number of increments $a$ times greater than the one in the training phase (the strain

incremental step is about $a$ times shorter). The description of the graphs in Fig. 3 includes three numbers

interpreted as values of turning point strains from formulae (16)–(18), multiplied by 100, according to the

scheme: $\varepsilon'$_$\varepsilon''$_final $\varepsilon$. With d we have marked the loops presented to the network in training. The

horizontal and vertical axes are strain and normalised stress respectively.

We perform the following numerical experiments:

Training of ANNs of the types 5mn1 and 3mn1 with 13 different hysteresis loops. Five of these loops are reproduced in Fig. 3. The other loops are similar but non-symmetric, in the sense that the loading-unloading turning point differs from the unloading-reloading one. For each loop the strain increment is fairly uniform and varies from 0.00035 to 0.0007.

[Figure 3, panels (a) and (b): strain on the horizontal axis, normalised stress on the vertical axis. Legend of (a): d1: 7_-7_7, d2: 6_-6_6, d3: 5_-5_5, d4: 4_-4_4, d5: 3_-3_3. Legend of (b): test1, test2, test3, each drawn by the network 3331.]

Fig. 3. Examples of training sets for the numerical test. (a) The sampling points are marked only for the two extreme training data sets. (b) Results of the autonomous criterion test: the loops are drawn by the ANN starting from (0, 0) with a given increment of strain.




Two kinds of autonomous criterion tests are then executed. First, the sufficiently trained ANNs are presented with three strain incremental programs that were never used in training. In the second kind of test, the strain increment is reduced by a factor of two, three, five and ten.

Results of these numerical experiments are illustrated below with some selected graphs. All figures collect the curves drawn by the ANN when the network starts from (0, 0) and then progresses by itself until the end of the loading-unloading-reloading loop. A similar autonomous action will be used inside the FE code to update stresses at the end of each step of a Newton iterative solution (as defined in Section 4). Fig. 3a and b show that the simple reproduction of the curves is very satisfactory even for a very small network (only three nodes in the hidden layers). However, when the step size decreases, the quality of the approximation deteriorates, as seen in Figs. 4 and 5. Fig. 4a shows that, when the test increment decreases, the graphs are attracted to the centre of the loops. This is quite natural: the network is taught with all the loops, so when presented with unknown data it tries to interpolate in the best-defined directions. Fig. 4 also shows the test results for networks that were trained with an incomplete set of learning data. Namely, the training set was dominated by functions shifted to the left, because the loading-unloading turning point was smaller than the unloading-reloading one (data d10-d13 were intentionally eliminated from the data set). We observe that the shape of the graphs is concave in unloading.
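The autonomous test and the step-reduction test can be sketched as follows. The function `increment` is a hypothetical analytic stand-in for a trained incremental network (an elastic-perfectly plastic law with assumed constants `E` and `sigma_y`); `refine` and `autonomous_path` are illustrative names, not part of the authors' code.

```python
def increment(sigma, d_eps, E=100.0, sigma_y=1.0):
    """Hypothetical stand-in for a trained incremental ANN: returns the stress
    increment for the current stress and a strain increment."""
    trial = sigma + E * d_eps
    clipped = max(-sigma_y, min(sigma_y, trial))
    return clipped - sigma


def refine(strain_program, a):
    """Split every strain increment into `a` equal parts (the ImnK/a test)."""
    return [d / a for d in strain_program for _ in range(a)]


def autonomous_path(step_fn, strain_program):
    """Start from (0, 0) and feed each predicted stress back as the next input."""
    eps, sigma = 0.0, 0.0
    path = [(eps, sigma)]
    for d_eps in strain_program:
        sigma += step_fn(sigma, d_eps)
        eps += d_eps
        path.append((eps, sigma))
    return path


# Loop of the 7_-7_7 kind: load to 0.07, unload to -0.07, reload to 0.07.
program = [0.0007] * 100 + [-0.0007] * 200 + [0.0007] * 200
coarse = autonomous_path(increment, program)
fine = autonomous_path(increment, refine(program, 5))

# Largest stress difference between the two runs at the coarse sampling points.
max_gap = max(abs(c[1] - f[1]) for c, f in zip(coarse, fine[::5]))
```

For this analytic stand-in `max_gap` stays at round-off level, because the law is independent of the step size. Replacing `increment` with a trained network would make `max_gap` a simple measure of the step-size sensitivity discussed above.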

In Fig. 4a, the reaction to this experiment is more distinct than in Fig. 4b. This means that the ANN with the input organised according to (3) (non-incremental) is more sensitive to a change of the training set than the incremental one.

The incremental networks trained with an extended data set (interpolated points on the curve and zero-increment data, according to (7)) are much less sensitive to the reduction of the step size. The agreement is still good for strain increments three and five times shorter than that used in training. Beyond that, however, the degradation is very quick: the graphs 3331/10 bifurcate suddenly to a new shape. The graphs reproduced autonomously by the ANN, shown in Fig. 5, were never presented to the network in learning.

The numerical experiments suggest that the ANNs constructed for the incremental representation of a constitutive law (described by Eq. (5)) behave better than those described by Eq. (3).

[Figure 4, panels (a) and (b). Legend of (a): d1, 5531, 5531/2, 5531/3. Legend of (b): test1, test3, 3331, 3331/3.]

Fig. 4. Results of the tests for the network with different input structures: direct and incremental, with step size shorter than that used in training.




3.2. Two-dimensional problem

All observations made for one-dimensional case can be confirmed also for the two-dimensional example.

In Fig. 6 the graphs illustrate the interpolation ability of the ANN of the type 5962. The network drew the

curves that have never been used in the learning process.

Note that the structure of such a network is analogous to that of 3331 and is described in Section 2.2. The input layer has five neurones, interpreted as follows: σ11, σ22, ‖Δε_cumulative‖, Δε11, Δε22. The two neurones at the output take the values Δσ11, Δσ22.
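As an illustration of this input and output layout, the sketch below turns one sampled two-dimensional path into training patterns. The helper `build_patterns` is hypothetical, and the cumulative term is interpreted here, as an assumption, as the length of the strain path accumulated before the current increment; the paper's Section 2.2 may define it differently.

```python
import math


def build_patterns(strains, stresses):
    """Turn a sampled 2D path into (input, target) pairs.

    strains, stresses: lists of (component_11, component_22) tuples.
    Input : (sigma11, sigma22, cumulative increment norm, d_eps11, d_eps22)
    Target: (d_sigma11, d_sigma22)
    """
    patterns = []
    path_length = 0.0  # assumed meaning of the cumulative-increment norm
    for i in range(len(strains) - 1):
        d_eps = (strains[i + 1][0] - strains[i][0],
                 strains[i + 1][1] - strains[i][1])
        d_sig = (stresses[i + 1][0] - stresses[i][0],
                 stresses[i + 1][1] - stresses[i][1])
        inputs = (stresses[i][0], stresses[i][1], path_length, d_eps[0], d_eps[1])
        patterns.append((inputs, d_sig))
        path_length += math.hypot(d_eps[0], d_eps[1])
    return patterns


# Made-up sample path, used only to show the shape of the patterns.
strains = [(0.0, 0.0), (0.003, 0.004), (0.006, 0.008)]
stresses = [(0.0, 0.0), (0.3, 0.4), (0.5, 0.7)]
patterns = build_patterns(strains, stresses)
```

Each pattern has five inputs and two targets, matching the five input neurones and two output neurones of the 5962 network.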

Fig. 6. Testing results for the network with input structure as described in Section 2.2. The ANN draws curves that have never been presented to the network (generalisation). Continuous lines denote the true curves, markers denote the approximation by the ANN.

[Figure 5. Legend: test2, test3, 3331/3, 3331/5, 3331/10.]

Fig. 5. Results of the tests for curves that have never been presented to the network. The degradation of the approximation starts for an increment ten times smaller than the one used in learning.




4. Implementation of constitutive law represented by ANN inside a FE code

Let us suppose that the basic equation of the standard displacement-based finite element method can be written in the form:

\[
\int_{V} B_N^{T} : s \,\mathrm{d}V_0 = \int_{S} N_N^{T}\, t \,\mathrm{d}S + \int_{V} N_N^{T}\, f \,\mathrm{d}V, \qquad \varepsilon(\delta u) = B_N\, \delta u, \qquad (20)
\]

where V is the volume of the body in the reference configuration and B_N is the matrix that, operating on the vector of admissible variations of the independent variables, gives the strain measure ε(δu) conjugate to the material stress s. In what follows we consider small transformations and thus the infinitesimal strain tensor. The index N denotes that B is constructed on the basis of the approximation of u by a set of interpolation functions N(x) on appropriate finite elements. On the right-hand side of (20), t is the stress vector given on the part S of the boundary, while f are the body forces acting on the elementary volume dV.

Since the considered material behaviour is non-linear, the Newton algorithm is applied to solve the system of equations (20). The Jacobian of the left-hand side of (20) can be written as follows:

\[
J = \int_{V} \left( \mathrm{d}s : \varepsilon(\delta u) + s : \mathrm{d}\varepsilon(\delta u) \right) \mathrm{d}V_0. \qquad (21)
\]

The first term under the integral (21) can be computed using the usual constitutive assumption:

\[
\mathrm{d}s = D : \mathrm{d}\varepsilon, \quad \text{where} \quad D_{ij} = \frac{\partial s_i}{\partial \varepsilon_j}. \qquad (22)
\]

We can rewrite the above equations, taking the variation with respect to the independent variables u of the problem and obtaining (by definition) the stiffness matrix K:

\[
K^{\mathrm{main}}_{MN} = \int_{V_0} B_M : D : B_N \,\mathrm{d}V_0. \qquad (23)
\]

The second term in the integral (21) represents the initial stress matrix.

Using the assumed representation of the constitutive law by an ANN, we have instead of (22):

\[
\mathrm{d}s = N_{d,\sigma} @ \mathrm{d}\varepsilon. \qquad (24)
\]

The index d denotes that the network quality is best for some given value d of the increment, and σ means that the stress increment is computed at the current value of the stress, s = σ. It is clear that we must replace the neural operator in (24) by the matrix D or, simply, construct this matrix using the given representation of the constitutive law. This is done by trial incrementing of ε. Let us suppose that both tensors ds and dε are represented by column vectors:

\[
\mathrm{d}\sigma = \left[ \mathrm{d}s_1 \;\; \mathrm{d}s_2 \;\; \mathrm{d}s_3 \right] = \left[ N_{d,\sigma} @ \mathrm{d}\varepsilon_1 \;\; N_{d,\sigma} @ \mathrm{d}\varepsilon_2 \;\; N_{d,\sigma} @ \mathrm{d}\varepsilon_3 \right], \qquad (25)
\]

\[
\mathrm{d}\varepsilon_t = \left[ \mathrm{d}\varepsilon_1 \;\; \mathrm{d}\varepsilon_2 \;\; \mathrm{d}\varepsilon_3 \right],
\]

\[
D = \mathrm{d}\sigma \, \mathrm{d}\varepsilon_t^{-1}. \qquad (26)
\]

The matrix of trial vectors dε_t is always proportional to the strains at the last equilibrated point (σ, ε) during the Newton iteration process (preceding step). Trial vectors cannot be arbitrary because N_{d,σ} @ dε ≠ −N_{d,σ} @ (−dε), and in fact two different tangent stiffness matrices can be defined at any point: one for loading and the other for unloading. It is thus supposed that the loading (unloading) is continued during




the current increment in the Newton iterations. The formulae (25), (26) are used here instead of computing

the derivatives of the neural network with respect to input values (the method proposed in [20]).
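The construction of D from trial increments, as in (25) and (26), can be sketched in a few lines. Here `step_fn(sigma, d_eps)` is a hypothetical callable standing for the trained operator in recall mode; a linear elastic law with an assumed matrix `D0` is used in its place so that the recovered tangent can be checked. The small Gauss-Jordan routine is only there to keep the example free of external libraries.

```python
def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("trial strain vectors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def tangent_from_trials(step_fn, sigma, trial_increments):
    """D = d_sigma * d_eps_t^{-1}, with trial increments and responses as columns."""
    n = len(trial_increments)
    responses = [step_fn(sigma, de) for de in trial_increments]
    d_eps = [[trial_increments[j][i] for j in range(n)] for i in range(n)]
    d_sig = [[responses[j][i] for j in range(n)] for i in range(n)]
    inv = invert(d_eps)
    return [[sum(d_sig[i][k] * inv[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Assumed linear stand-in for the network, used only to verify the recovery of D.
D0 = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 0.5]]


def linear_step(sigma, d_eps):
    return [sum(D0[i][j] * d_eps[j] for j in range(3)) for i in range(3)]


trials = [(1e-4, 0.0, 0.0), (5e-5, 1e-4, 0.0), (0.0, 2e-5, 1e-4)]
D = tangent_from_trials(linear_step, [0.0, 0.0, 0.0], trials)
```

For a linear operator any independent set of trial vectors returns the same D. For a trained network the result depends on the trial directions, which is why the text ties them to the current loading or unloading direction.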

The stress in the second term of the integral (21) is computed using the neural network in the recall mode for a given, constant step dε, until the strain ε of the trial solution at the current step is reached. The ANN acts here in the autonomous activity mode, as defined in Section 2.5 and tested in Section 3. This process corresponds to the classical integration of an incremental constitutive equation for updating σ. It always starts at the last equilibrated point, and the increment dε is proportional to the one defined for this step (loading or unloading). This is illustrated in Fig. 7.
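A minimal one-dimensional sketch of this procedure is given below for a force-driven bar of unit cross-section. The function `hardening_step` is a hypothetical incremental law used in place of a trained network, and the constants `E`, `sigma_max`, `step` and `tol` are assumptions for the example. The stress is updated by self-iteration from the last equilibrated point, and the tangent is obtained from one trial increment in the current loading direction, the scalar counterpart of (25) and (26).

```python
def hardening_step(sigma, d_eps, E=100.0, sigma_max=2.0):
    """Hypothetical incremental law standing in for the trained ANN."""
    if d_eps >= 0.0:
        # Loading: the stiffness decays as the stress approaches sigma_max.
        return E * (1.0 - sigma / sigma_max) * d_eps
    # Unloading: elastic response.
    return E * d_eps


def update_stress(step_fn, eps0, sigma0, eps_target, step):
    """Self-iterate with a constant step from (eps0, sigma0) up to eps_target."""
    gap = abs(eps_target - eps0)
    direction = 1.0 if eps_target >= eps0 else -1.0
    n_full = int(gap / step)
    sigma = sigma0
    for _ in range(n_full):
        sigma += step_fn(sigma, direction * step)
    remainder = gap - n_full * step
    if remainder > 0.0:
        # Last, shorter sub-step that lands exactly on the trial strain.
        sigma += step_fn(sigma, direction * remainder)
    return sigma


def newton_force_driven(step_fn, force, eps0, sigma0,
                        step=1e-4, tol=1e-10, max_iter=50):
    """Find the strain at which the bar carries `force`."""
    direction = 1.0 if force >= sigma0 else -1.0
    eps = eps0
    for _ in range(max_iter):
        sigma = update_stress(step_fn, eps0, sigma0, eps, step)
        residual = force - sigma
        if abs(residual) < tol:
            return eps, sigma
        d = direction * step
        tangent = step_fn(sigma, d) / d  # trial increment in the loading direction
        eps += residual / tangent
    raise RuntimeError("Newton iteration did not converge")


eps_eq, sigma_eq = newton_force_driven(hardening_step, force=1.0,
                                       eps0=0.0, sigma0=0.0)
```

Every stress update restarts from the last equilibrated point rather than from the previous Newton iterate, so the integrated path does not depend on the intermediate trial solutions.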

5. ANN-FE hybrid model of a bundle of super-conducting fibres

The use of the ANN representation in the numerical modelling of continua involves two main steps: first the ANN must be defined and successfully trained, then the FE code must be adapted to accept the material data in this form. The correct definition of the neural-like operator is discussed in Section 3. We skip, as too technical, the analysis of the process of optimisation of the ANN's topology for the application presented below for a super-conducting cable. The reader interested in this technique is referred to the textbooks [1,8,9,17,22]. The insertion of the ANN into a standard FE code is described in Section 4.

In the present section we describe the approximation of the given experimental data with an ANN and an example of the use of this description in a FE model.

Cable-in-conduit conductors in components of fusion devices, such as the toroidal coils for the International Thermonuclear Experimental Reactor (ITER), may be regarded from a structural point of view as hierarchical composites. They are in fact made up of a large number of small fibres (super-conductors) grouped in clusters and nested in a copper matrix strand. The strands in turn are grouped in petals and are bound together by an outer steel jacket. Owing to the large number of repetitive strands, the whole cable can be considered as a homogeneous body at the macro scale. The constitutive relation for such a homogenised material is very difficult to deduce from the knowledge of the internal structure because of the complex geometry, the unilateral contact between strands and the geometrical non-linearity of its finite displacements.

Recently an experimental analysis, performed at the University of Twente and published in [16], confirmed that the mechanical behaviour of such a structure is very complex. The cable was pressed in the direction of its diameter, and the displacement of the upper part of the steel jacket with respect to its bottom part was measured for 38 cycles of loading and unloading. For cyclic loading complicated hysteresis

[Figure 7. Labels in the drawing: axis x with the points x1 and x2; force levels f_t and f_{t+1}; points A, B, C, B' and C'; "loading branch"; "unloading from point B'"; the operators N@d and N@(-d); "stress updating using self-iteration of ANN".]

Fig. 7. Newton procedure with the use of the ANN representation of the constitutive law. The two branches of the curve, loading and unloading, correspond to the positive and negative sign of the strain increment.




loops in the displacement-force plane have been measured. We observe a large initial irreversible settlement of the virgin cable followed by non-linear elastic behaviour. This behaviour is reproduced as a background for the results of FE modelling in Figs. 8, 10 and 11.

Our numerical experiments show that a surprisingly small network learns well the constitutive relation

between the applied force and the displacement. The number of neurones in hidden layers was never higher

than 6. The results presented in Fig. 8 are obtained for the network with two hidden layers, five neurones

each. The correlation ratio for all tested networks was very good (of the order of 0.99), the RMS error was

very small (of the order of 0.02).
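For reference, the two quality measures quoted above can be computed as in the following sketch. It assumes that the correlation ratio is the ordinary Pearson correlation coefficient between targets and network outputs and that the RMS error is taken over normalised data; the paper does not spell out either definition, and the sample values are made up for illustration.

```python
import math


def rms_error(targets, outputs):
    """Root-mean-square difference between targets and network outputs."""
    n = len(targets)
    return math.sqrt(sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n)


def correlation(targets, outputs):
    """Pearson correlation coefficient between targets and network outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


# Made-up normalised targets and outputs, for illustration only.
targets = [0.0, 0.25, 0.5, 0.75, 1.0]
outputs = [0.02, 0.24, 0.5, 0.77, 0.98]
rms = rms_error(targets, outputs)
corr = correlation(targets, outputs)
```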

The graphs of reproduction for the test set are very close to the given target. The graph in Fig. 8 shows that the first two cycles are surprisingly well reproduced. These two cycles are qualitatively different from the others, and the network had only one similar set of data (the third cycle) at its disposal to learn this. The markers denoting the network's response to inputs never presented during the learning process are very close to

[Figure 8. Horizontal axis: pattern sequence. Legend: training set targets, training output, test output.]

Fig. 8. ANN's prediction of two initial loops that were not presented to the network during training (ANN of 4551 type, i.e. with the input layer of the form (ε_i, σ_i, g_i, Δε_i)).

Fig. 9. Illustration of the use of the data collected from the experiment with a single cable to model a quasi-continuous behaviour of a whole super-conducting beam.




the expected output marked with a dotted line. Cumulated network outputs (displacements f(x_i) = Σ df(x_i, dx_{i−1}), in [μm]) are reported along the vertical axis. When the last two cycles were hidden from the network during training, their reproduction by the ANN was even better than in Fig. 8, since these two cycles are very similar to most of the learned loops. Fig. 8 shows that the generalisation is correct: the network acts as a model of the constitutive relation rather than as a tool for storing the experimental data. The network learns the constitutive law, not the numerical data. The observations made for the one-dimensional case are also confirmed for the two-dimensional example that we use inside the FE model in the sequel.

5.1. One-dimensional FE-ANN model

The experimental data concern the mean stress and the corresponding mean strains in a single strand. This is because the measured displacement (co-linear with the force) is apparently a measure of the response of the whole structure. The same interpretation is proposed in [3]. The investigated strand is a part of a larger super-conducting structure that can be considered homogeneous owing to the huge number of strands in the typical cross section. Without entering into details of the shape of this cross section, we can say that we have a constitutive law which is true in a mean sense for a periodic cell constructing the global super-structure. This is illustrated in Fig. 9.

[Figure 10, panels (a) and (b). Panel (a): stress and strain against the load step, with the series loading program, experimental loading, experimental data and FE results. Panel (b): stress against strain, with the series experimental data and FE results.]

Fig. 10. Comparison of FE results with the experimental data. Solid dots and solid line are obtained from the displacement at the end of each finite element, re-scaled to strains to be comparable with the experimental graph. (b) Represents the same data but rearranged in the form of σ-ε loops.




We use the ANN representation of the constitutive equation defined above in our own research FE code. The true experiment was displacement driven (a kinematic load was applied to the sample), whereas the numerical experiment was force driven. We prescribed a few levels of the concentrated force acting in the vertical direction, as presented in Fig. 9. A kinematic loading program equivalent to the one carried out in the laboratory is difficult to define because of the rapid drop of forces in unloading with a nearly infinitesimal change of displacement. The values of the forces were chosen on the path known from the experiment, as illustrated in Fig. 10a. The response of the element should be identical with that observed in the laboratory. The differences observed in both Fig. 10a and b are probably due to the smeared character of the approximation with the ANN. The drop of force at the end of each loop causes some numerical trouble, since an infinitesimal strain increment corresponds to a large decrease of force.

5.2. Two-dimensional FE-ANN solution

An artificial construction of the experimental data was necessary to perform a 2D numerical experiment. The true data were completed by adding a hypothetical behaviour in the second, horizontal direction. The purpose of this example is to illustrate the performance of the hybrid FE-ANN code rather than to analyse the structure of the cable. We assumed that the horizontal displacements of the single strand are identical in character to the measured vertical ones; only the magnitude was scaled by a coefficient of −0.35. We thus assume that we deal with a plane state of stress and that the data we have can be interpreted as pairs:

[Figure 11, panels (a) and (b). Panel (a): stress and strain against the load step, with the series exp. force, FEM load program, exper. hor. displ., exper. vert. displ., FEM vert. displ. and FEM hor. displ. Panel (b): stress against strain, with the series experimental one-D data and 2D FE solution.]

Fig. 11. Comparison of FE results (2D) with the experimental data. Bold dots and bold line are obtained from the displacement at the end of the finite element, re-scaled to strains to be comparable with the experimental graph. (b) Represents the same data but rearranged in the form of σ-ε loops.




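\[
\left( \sigma = \begin{pmatrix} F_i & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} d_i & 0 \\ 0 & -0.35\, d_i \end{pmatrix} \cong \varepsilon \right). \qquad (27)
\]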

With this interpretation, we can complete the set of input patterns for the ANN according to (9).

The finite element model contains a simple square mesh of triangles, ten rows by ten columns. The boundary conditions are the following: the vertical displacement is constrained at the bottom edge of the square, and a uniform stress vector is applied at the upper edge. Horizontal displacements are free except for the one at the axis of symmetry.
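The completion of the one-dimensional records into two-dimensional pairs, in the spirit of (27), can be sketched as below. The helper `to_plane_pairs` is hypothetical, and the inputs are assumed to be already re-scaled to stress and strain; only the coefficient −0.35 comes from the text.

```python
LATERAL_COEFFICIENT = -0.35  # scaling of the hypothetical horizontal response


def to_plane_pairs(forces, displacements, coeff=LATERAL_COEFFICIENT):
    """Map 1D records (F_i, d_i) to 2D pairs (sigma, eps) of diagonal 2x2 tensors."""
    pairs = []
    for f, d in zip(forces, displacements):
        sigma = ((f, 0.0), (0.0, 0.0))       # only the measured direction is loaded
        eps = ((d, 0.0), (0.0, coeff * d))   # second direction is the scaled copy
        pairs.append((sigma, eps))
    return pairs


# Made-up records, used only to show the structure of the pairs.
pairs = to_plane_pairs([10.0, 20.0], [0.02, 0.04])
```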

The displacement referred to in Fig. 10 has been measured at the upper corner of the square domain. All observations made for the one-dimensional case can also be confirmed for the two-dimensional example. The displacements are, however, much smaller in this case.

6. Conclusions

The following conclusions can be drawn from this paper:

The presented examples show that the stress paths drawn for a given strain history can be approximated very well by a small neural network. A sufficiently trained ANN can interpolate between learned curves to draw one that was not presented in training.

The incremental ANN representation of any constitutive law is always (by construction) consistent (in the sense of the theory of plasticity). This observation concerns both sources of knowledge about the material: real and numerical experiments.

This representation is automatic in the sense that it does not require any a priori choice or adaptation of an existing constitutive theory for the description of the observed material behaviour.

Finally, we show that it is possible to incorporate the ANN constitutive description into a finite element code. A realistic FE model can thus be constructed for a material described by an ANN. The examples show that this is possible even in the case of complicated non-linear, inelastic behaviour.

Acknowledgements

This paper has been partly supported by CUTTER project GRD1/1999/10330 (Enhanced Design and

Production of Wear Resistant Rock Cutting Tools For Construction Machinery) and partly by FUSION

grant RFX FU0S-CT2000-00045 (EFDA/00-S21).

References

[1] H. Abdi, Les Réseaux de Neurones, Presses Universitaires de Grenoble, 1994.
[2] T. Chen, H. Chen, Universal approximation to non-linear operators by neural networks with arbitrary activation functions and its application to dynamical systems, IEEE Trans. Neural Networks 6 (4) (1995) 911–917.
[3] U. Galvanetto, V. Naumov, V. Palmov, B.A. Schrefler, Analysis of the mechanical behaviour of cable-in-conduit superconductors under transverse cyclic loading, Int. J. Computat. Civil Struct. Engrg. 1 (2) (2000) 1–10.
[4] S. Garcia, M.P. Romo, V. Taboada-Urtuzuastegui, Knowledge-based modelling of sand behaviour, Proceedings of ECCOMAS 2000, Barcelona, 2000, pp. 1114.
[5] D. Gawin, M. Lefik, B.A. Schrefler, ANN approach to sorption hysteresis within a coupled hygro-thermo-mechanical FE analysis, Int. J. Numer. Meth. Engrg. 50 (2001) 299–323.
[6] J. Ghaboussi, D.E. Sidarta, New nested adaptive neural networks (NANN) for constitutive modelling, Comput. Geotech. 22 (1) (1998) 29–52.
[7] J. Ghaboussi, J.H. Garrett, X. Wu, Knowledge-based modelling of material behaviour with neural networks, J. Engrg. Mech. 117 (1991) 132–151.
[8] J. Hertz, A. Krogh, G.R. Palmer, Introduction to the theory of neural computation, Lecture Notes, vol. I, Santa Fe Institute Studies in the Sciences of Complexity, Addison-Wesley, 1991.
[9] Y.H. Hu, J.-N. Hwang (Eds.), Handbook of Neural Network Signal Processing, CRC Press, 2002.
[10] S. Kortesis, P.D. Panagiotopoulos, Neural networks for computing in structural analysis: Methods and prospects of applications, Int. J. Numer. Meth. Engrg. 36 (1993) 2305–2318.
[11] M. Lefik, Use of artificial neural network to define a non-linear effective constitutive law for a composite, Proceedings of the 13th Polish Conference on Computer Methods in Mechanics PCCMM97, 1997, pp. 725–732.
[12] M. Lefik, Modified BP artificial neural network as an incremental non-linear constitutive model, Proceedings of European Conference on Computational Mechanics, ECCM-2001, 2001 on CD.
[13] M. Lefik, B.A. Schrefler, Artificial neural network for parameter identifications for an elasto-plastic model of super-conducting cable under cyclic loading, Comput. Struct. 80 (22) (2002) 1699–1713.
[14] M. Lefik, B.A. Schrefler, One-dimensional model of cable-in-conduit superconductors under cyclic loading using artificial neural networks, Fusion Engrg. Des. 60 (2) (2002) 105–117.
[15] G. Mucha, Z. Waszczyszyn, Hybrid neural-network/computational program for bending analysis of elastoplastic beams, Proceedings of the XIII Polish Conference on Computer Methods in Mechanics, 1997, pp. 949–956.
[16] N.H. Nijuhuis, W. Noordman, H.H.J. Ten Kate, Mechanical and electrical testing of an ITER CS1 model coil conductor under transverse loading in a cryogenic press, Preliminary Report, University of Twente, 1998.
[17] S. Osowski, Sieci Neuronowe w ujęciu algorytmicznym, Wydawnictwo Naukowo Techniczne, Warszawa, 1996.
[18] D. Penumadu, R. Zhao, Triaxial compression behaviour of sand and gravel using artificial neural networks (ANN), Comput. Geotech. 24 (1999) 207–230.
[19] H.S. Shin, G.N. Pande, Intelligent finite elements, in: S. Valliappan, N. Khalili (Eds.), Computational Mechanics-New Frontiers for New Millenium, Elsevier Science, 2001.
[20] H.S. Shin, G.N. Pande, On self-learning finite element codes based on monitored response of structures, Comput. Geotech. 27 (2000) 161–178.
[21] Z. Sikora, R. Ossowski, Y. Ichikawa, K. Tkacz, Neural networks as a tool for constitutive modelling, in: F. Oka, A. Yashima (Eds.), Localization and Bifurcation Theory for Soils and Rocks, Balkema, Rotterdam, 1998.
[22] R. Tadeusiewicz, Sieci Neuronowe, Akademicka Oficyna Wydawnicza, 1993.
[23] Z. Waszczyszyn, Neural networks in plasticity: some new results and prospects of applications, European Congress on Computational Methods in Applied Sciences and Engineering ECCOMAS 2000, 2000 on CD.
[24] Z. Waszczyszyn, Some new results in applications of backpropagation neural networks in structural and civil engineering, in: Advances in Engineering Computational Technology, Civil-Comp Press, Edinburgh, 1998, pp. 173–187.
[25] G. Yagawa, H. Okuda, Neural networks in computational mechanics, Arch. Computat. Meth. Engrg. 3 (4) (1996) 435–512.

