Fotiadis, D. I.
Karras, D. A.
Lagaris, I. E.
Likas, A.
Papageorgiou, D. G.
OPTIMIZATION SOFTWARE
as a Tool for Solving Differential Equations Using
NEURAL NETWORKS
DIFFERENTIAL EQUATIONS HANDLED
• ODEs
• Systems of ODEs
• PDEs (Boundary and Initial Value Problems)
• Eigenvalue PDE Problems
• IDEs
ARTIFICIAL NEURAL NETWORKS
• Closed Analytic Form
• Universal Approximators
• Linear and Non-Linear Parameters
• Highly Parallel Systems
• Specialized Hardware for ANN
OPTIMIZATION ENVIRONMENT
MERLIN / MCL 3.0 SOFTWARE
Features Include:
• A Host of Optimization Algorithms
• Special Merit for Sums of Squares
• Variable Bounds and Variable Fixing
• Command Driven User Interface
• Numerical Estimation of Derivatives
• Dynamic Programming of Strategies
ARTIFICIAL NEURAL NETWORKS
• Inspired by biological NNs
Input-Output mapping via the weights u, w, v and the activation functions.

[Figure: a feed-forward perceptron with an Input Layer (inputs x1, x2 and a bias +1), two Hidden Layers with weights w(1), w(2) and biases u, and a linear Output Layer with weights v.]

Analytically this is given by the formula:

N(x; u, w(1), w(2), v) = Σ_i v_i σ( Σ_j w(2)_ij σ( Σ_k w(1)_jk x_k + u_j ) )
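As a concrete sketch (not the authors' code; the layer sizes and weight values below are purely illustrative), the two-hidden-layer mapping above can be written in NumPy:

```python
import numpy as np

def sigmoid(z):
    # Sigmoidal activation: sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def N(x, u, w1, w2, v):
    """Two-hidden-layer perceptron N(x; u, w(1), w(2), v).
    u: first-layer biases, w1/w2: hidden-layer weights, v: output weights."""
    h1 = sigmoid(w1 @ x + u)   # first hidden layer
    h2 = sigmoid(w2 @ h1)      # second hidden layer
    return v @ h2              # linear output unit

# Illustrative shapes: 2 inputs, hidden layers of 3 and 4 nodes
rng = np.random.default_rng(0)
u, w1 = rng.normal(size=3), rng.normal(size=(3, 2))
w2, v = rng.normal(size=(4, 3)), rng.normal(size=4)
y = N(np.array([0.5, 0.5]), u, w1, w2, v)
```

Because every operation above is smooth, the output is differentiable with respect to both the input x and all the parameters, which is what the training procedure exploits.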
Activation Functions
Many different functions can be used.
Our current choice: The Sigmoidal
σ(x) = 1 / (1 + e^(-x))

A smooth function, infinitely differentiable, bounded in (0,1).

[Plot: σ(x) for x in [-10, 10], rising from 0 to 1.]
The Sigmoidal properties
dσ(x)/dx = σ(x) [1 - σ(x)]

d²σ(x)/dx² = σ(x) [1 - σ(x)] [1 - 2σ(x)]
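These identities are easy to check numerically; a minimal sketch (central finite differences at a few illustrative points, not part of the original slides):

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def dsigma(x):   # sigma'(x) = sigma(x) [1 - sigma(x)]
    s = sigma(x)
    return s * (1.0 - s)

def d2sigma(x):  # sigma''(x) = sigma(x) [1 - sigma(x)] [1 - 2 sigma(x)]
    s = sigma(x)
    return s * (1.0 - s) * (1.0 - 2.0 * s)

# Verify against central finite differences at a few sample points
h = 1e-5
for x in (-2.0, 0.0, 1.5):
    fd1 = (sigma(x + h) - sigma(x - h)) / (2 * h)
    fd2 = (sigma(x + h) - 2 * sigma(x) + sigma(x - h)) / h**2
    assert abs(fd1 - dsigma(x)) < 1e-8
    assert abs(fd2 - d2sigma(x)) < 1e-5
```

Having closed-form derivatives of the activation is what lets the whole model be differentiated analytically, as the method requires.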
FACTS
Kolmogorov, Cybenko, and Hornik proved theorems concerning the approximation capabilities of ANNs.
In fact it is shown that ANNs are
UNIVERSAL APPROXIMATORS
DESCRIPTION OF THE METHOD
SOLVE THE EQUATION
L Ψ(x) = f(x)   SUBJECT TO DIRICHLET B.C.

Where L is an Integrodifferential Operator, Linear or Non-Linear.

The model:
Ψ_M(x) = B(x) + Z(x) N(x)

Where:
• B(x) satisfies the BC
• Z(x) vanishes on the boundary
• N(x) is an Artificial Neural Net
MODEL PROPERTIES
The Model
Ψ_M(x) = B(x) + Z(x) N(x)
satisfies by construction the B.C.

The Model, thanks to the Network, is “trainable”. The Network parameters can be adjusted so that:
L Ψ_M(x) - f(x) ≈ 0

Pick a set of representative points x_1, x_2, ..., x_n in the unit Hypercube, x ∈ [0,1]^N.

The residual “Error”:
E = Σ_{i=1}^{n} [L Ψ_M(x_i) - f(x_i)]²
ILLUSTRATION
Simple 1-d example:

d²Ψ(x)/dx² = f(x, Ψ, dΨ/dx),  Ψ(0) = Ψ_0,  Ψ(1) = Ψ_1

Model:
Ψ_M(x) = Ψ_0 (1 - x) + Ψ_1 x + x(1 - x) N(x),  0 ≤ x ≤ 1
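The point of this trial form is that the boundary conditions hold for any network parameters; a minimal NumPy sketch (the boundary values Ψ_0 = 2, Ψ_1 = 3 and the network size are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def N(x, v, w, u):
    # Single-hidden-layer perceptron: N(x) = sum_j v_j sigma(w_j x + u_j)
    return np.dot(v, sigmoid(w * x + u))

def psi_M(x, psi0, psi1, params):
    # Trial solution: psi_M(x) = psi0 (1-x) + psi1 x + x (1-x) N(x)
    v, w, u = params
    return psi0 * (1 - x) + psi1 * x + x * (1 - x) * N(x, v, w, u)

# The BC hold exactly for ANY (here: random) network parameters:
rng = np.random.default_rng(1)
params = (rng.normal(size=5), rng.normal(size=5), rng.normal(size=5))
bc0 = psi_M(0.0, 2.0, 3.0, params)   # equals psi0 = 2.0 by construction
bc1 = psi_M(1.0, 2.0, 3.0, params)   # equals psi1 = 3.0 by construction
```

Training therefore only has to reduce the residual in the interior; the constraints never need to be enforced explicitly.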
ILLUSTRATION
For a second order, two-dimensional PDE:
Ψ_M(x,y) = B(x,y) + x(1 - x) y(1 - y) N(x,y)

where

B(x,y) = (1 - x) Ψ(0,y) + x Ψ(1,y)
       + (1 - y) {Ψ(x,0) - [(1 - x) Ψ(0,0) + x Ψ(1,0)]}
       + y {Ψ(x,1) - [(1 - x) Ψ(0,1) + x Ψ(1,1)]}
EXAMPLES
Problem: Solve the 2-d PDE:

∇²Ψ(x,y) = e^(-x) (x - 2 + y³ + 6y)

In the domain: x, y ∈ [0,1]

Subject to the BC:
Ψ(x,0) = x e^(-x)
Ψ(0,y) = y³
Ψ(x,1) = e^(-x) (x + 1)
Ψ(1,y) = (1 + y³) e^(-1)

A single hidden layer Perceptron was used:

Ψ_M(x,y) = B(x,y) + x(1 - x) y(1 - y) N(x,y)

B(x,y) = (1 - x) y³ + x (1 + y³) e^(-1)
       + (1 - y) x (e^(-x) - e^(-1))
       + y [(1 + x) e^(-x) - (1 - x + 2x e^(-1))]
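A quick numerical sanity check of this example (not from the slides): the analytic solution Ψ(x,y) = e^(-x)(x + y³) should reproduce the right-hand side under a 5-point finite-difference Laplacian at any interior point:

```python
import numpy as np

def psi(x, y):
    # Exact solution of the example: e^{-x} (x + y^3)
    return np.exp(-x) * (x + y**3)

def rhs(x, y):
    # Right-hand side: e^{-x} (x - 2 + y^3 + 6 y)
    return np.exp(-x) * (x - 2 + y**3 + 6 * y)

# 5-point finite-difference Laplacian at a few interior points
h = 1e-4
errs = []
for x, y in [(0.3, 0.7), (0.5, 0.5), (0.8, 0.2)]:
    lap = (psi(x + h, y) + psi(x - h, y) + psi(x, y + h) + psi(x, y - h)
           - 4 * psi(x, y)) / h**2
    errs.append(abs(lap - rhs(x, y)))
max_err = max(errs)   # small: the PDE and its stated solution are consistent
```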
GRAPHICAL REPRESENTATION
The exact analytic solution is:
Ψ(x,y) = e^(-x) (x + y³)
GRAPHS & COMPARISON
Neural Solution accuracy: Ψ(x,y) - Ψ_M(x,y)
Plot Points: Training Points

GRAPHS & COMPARISON
Neural Solution accuracy: Ψ(x,y) - Ψ_M(x,y)
Plot Points: Test Points

GRAPHS & COMPARISON
Finite Element Solution accuracy: Ψ(x,y) - Ψ_FE(x,y)
Plot Points: Training Points

GRAPHS & COMPARISON
Finite Element Solution accuracy: Ψ(x,y) - Ψ_FE(x,y)
Plot Points: Test Points
PERFORMANCE
• Highly Accurate Solution (even with few training points)
• Uniform “Error” Distribution
• Superior Interpolation Properties
The model solution is very flexible; it can easily be enhanced to offer even higher accuracy.
EIGEN VALUE PROBLEMS
Problem: L ψ(x) = λ ψ(x), with appropriate Dirichlet BC.

The model is the same as before. However the “Error” is defined as:

E = Σ_{i=1}^{n} [L Ψ_M(x_i) - λ Ψ_M(x_i)]² / Σ_{i=1}^{n} [Ψ_M(x_i)]²
EIGEN VALUE PROBLEMS
Where:

λ = Σ_{i=1}^{n} Ψ(x_i) L Ψ(x_i) / Σ_{i=1}^{n} [Ψ(x_i)]²

i.e. the value for which the “Error” is minimum.
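This λ is simply the discrete Rayleigh quotient over the training points. A minimal check with a known case (L = -d²/dx², ψ(x) = sin(πx) on [0,1], eigenvalue π²; Lψ is supplied analytically rather than by a trained network):

```python
import numpy as np

def rayleigh_lambda(psi_vals, L_psi_vals):
    # lambda = sum_i psi(x_i) L psi(x_i) / sum_i psi(x_i)^2
    return np.sum(psi_vals * L_psi_vals) / np.sum(psi_vals**2)

# Known case: L = -d^2/dx^2, psi(x) = sin(pi x)  =>  L psi = pi^2 psi
x = np.linspace(0.05, 0.95, 19)
psi_vals = np.sin(np.pi * x)
L_psi_vals = np.pi**2 * np.sin(np.pi * x)
lam = rayleigh_lambda(psi_vals, L_psi_vals)   # recovers pi^2
```

In the actual method, ψ is the trial model Ψ_M, so λ is re-estimated from this formula as the network parameters are adjusted.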
Problems of that kind are often encountered in Quantum Mechanics. (Schrödinger’s equation)
EXAMPLES
The non-local Schrödinger equation:

-(ħ²/2m) d²ψ(r)/dr² + V(r) ψ(r) + ∫_0^∞ K(r, r') ψ(r') dr' = E ψ(r)

Describes the bound “n+α” system in the framework of the Resonating Group Method.

BC: ψ(0) = 0, ψ(r) ~ e^(-kr) as r → ∞, k > 0

Model: ψ_M(r) = e^(-br) r N(r), b > 0

Where:
N(r) = Σ_{j=1}^{nodes} v_j σ(w_j r + u_j)
is a single hidden layer, sigmoidal Perceptron.
OBTAINING EIGENVALUES
Example: The Henon-Heiles potential

V(x,y) = (1/2)(x² + y²) + (1/(4√5)) (x y² - x³/3)

Asymptotic behavior:
ψ(x,y) ~ e^(-k(x² + y²))

Model used:
ψ_M(x,y) = e^(-b(x² + y²)) N(x,y)

Use the above model to obtain an eigen solution.

Obtain a different eigen solution by deflation, i.e.:

ψ̃_M(x,y) = ψ_M(x,y) - ψ(x,y) ∫∫ ψ(x',y') ψ_M(x',y') dx' dy'

This model is orthogonal to ψ(x,y) by construction. The procedure can be applied repeatedly.
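On a discrete grid the deflation step is a single projection; a sketch with stand-in arrays (random vectors in place of actual eigenfunctions, purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(2)
n, dx = 200, 1.0 / 200
psi_found = rng.normal(size=n)                      # stand-in: converged eigenfunction
psi_found /= np.sqrt(np.sum(psi_found**2) * dx)     # normalize so <psi, psi> = 1
psi_M = rng.normal(size=n)                          # stand-in: current trial model

overlap = np.sum(psi_found * psi_M) * dx            # <psi, psi_M>
psi_tilde = psi_M - psi_found * overlap             # deflated model

residual = np.sum(psi_found * psi_tilde) * dx       # orthogonal by construction
```

Because <ψ, ψ̃_M> = <ψ, ψ_M> (1 - <ψ, ψ>) = 0 for a normalized ψ, repeating this against every eigenfunction found so far steers the minimization toward a new eigenstate.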
ARBITRARILY SHAPED DOMAINS
For domains other than Hypercubes the BC cannot be embedded in the model.
Let R_i, i = 1, 2, ..., m be the set of points defining the arbitrarily shaped boundary. The BC are then:

Ψ(R_i) = b_i, i = 1, 2, ..., m

Let r_i, i = 1, 2, ..., n be the set of the training points inside the domain.

We describe two ways to proceed solving the problem L Ψ(x) = f(x).
OPTIMIZATION WITH CONSTRAINTS
Model: Ψ_M(x) = N(x)

“Error” to be minimized: Domain terms + Boundary terms

E = Σ_{i=1}^{n} [L Ψ_M(r_i) - f(r_i)]² + η Σ_{i=1}^{m} [Ψ_M(R_i) - b_i]²

With η a penalty parameter, to control the degree of satisfaction of the BC.
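As a sketch, the penalized “Error” is just two sums of squares; all names and the toy one-point data below are hypothetical:

```python
def penalty_error(L_psi, f, psi_M, r, R, b, eta):
    # E = sum_i [L psi_M(r_i) - f(r_i)]^2 + eta * sum_i [psi_M(R_i) - b_i]^2
    domain = sum((L_psi(x) - f(x))**2 for x in r)
    boundary = sum((psi_M(X) - bi)**2 for X, bi in zip(R, b))
    return domain + eta * boundary

# Toy check: one domain point with zero residual, one boundary point off by 0.2
E = penalty_error(L_psi=lambda x: 2 * x, f=lambda x: 1.0,
                  psi_M=lambda x: x, r=[0.5], R=[1.0], b=[0.8], eta=10.0)
# domain term = (2*0.5 - 1.0)^2 = 0; boundary term = 0.2^2; E = eta * 0.04
```

Raising η tightens the BC at the expense of the interior residual, which is exactly the trade-off the slide describes.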
PERCEPTRON-RBF SYNERGY
Model:

Ψ_M(x) = N(x) + Σ_{i=1}^{m} a_i e^(-λ |x - R_i|²)

Where the a_i are determined so that the model satisfies the BC exactly, i.e.:

Σ_{k=1}^{m} a_k e^(-λ |R_i - R_k|²) = b_i - N(R_i),  i = 1, 2, ..., m

The free parameter λ is chosen once, initially, so that the system above is easily solved.

“Error”:
E = Σ_{i=1}^{n} [L Ψ_M(r_i) - f(r_i)]²
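The linear system above can be sketched directly in NumPy. Everything here is illustrative: random boundary points and values, a stand-in tanh "perceptron", and λ = 5 picked so the Gaussian kernel matrix is well conditioned (the slide's "chosen once initially" parameter):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 8
R = rng.uniform(size=(m, 2))      # hypothetical boundary points R_i
b = rng.normal(size=m)            # hypothetical boundary values b_i
lam = 5.0                         # free RBF width, chosen once for a well-posed system

def N(x):
    # Stand-in for the trained perceptron part of the model
    return np.tanh(x @ np.array([0.3, -0.7]))

# Gaussian kernel matrix A_ik = exp(-lam |R_i - R_k|^2); solve A a = b - N(R)
d2 = np.sum((R[:, None, :] - R[None, :, :])**2, axis=-1)
a = np.linalg.solve(np.exp(-lam * d2), b - np.array([N(Ri) for Ri in R]))

def psi_M(x):
    # Model: perceptron plus RBF correction centered at the boundary points
    return N(x) + np.sum(a * np.exp(-lam * np.sum((x - R)**2, axis=-1)))

# The BC are satisfied exactly (to solver precision) at every boundary point:
bc_err = max(abs(psi_M(R[i]) - b[i]) for i in range(m))
```

The cost noted in the next slide is visible here: the a_i depend on N, so the system must be re-solved whenever the network parameters change.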
Pros & Cons . . .

The Penalty method is:
• Approximate in satisfying the BC.
• Computationally efficient.

The RBF-Synergy is:
• Exact in satisfying the BC.
• Computationally costly. A linear system is solved each time the model is evaluated.
IN PRACTICE . . .
• Initially proceed via the penalty method, till an approximate solution is found.
• Refine the solution, using the RBF-Synergy method, to satisfy the BC exactly.
Conclusions:
Experiments on several model problems show performance similar to that reported earlier.
GENERAL OBSERVATIONS
Enhanced generalization performance is achieved, when the exponential weights of the Neural Networks are kept small.
Hence box-constrained optimization methods should be applied.
Bigger Networks (greater number of nodes) can achieve higher accuracy.
This favors the use of:
• Existing Specialized Hardware
• Sophisticated Optimization Software
MERLIN 3.0
What is it?
A software package offering many optimization algorithms and a friendly user interface.

What problems does it solve?
Find a local minimum of the function:

f(x_1, x_2, ..., x_N),  x ∈ R^N

Under the conditions:

x_i ∈ [l_i, u_i],  i = 1, 2, ..., N
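A minimal sketch of this problem class (projected gradient descent on a toy quadratic; this is an illustration of box-constrained local minimization, not one of Merlin's own algorithms):

```python
import numpy as np

# Toy objective with unconstrained minimum at (2, -1), outside the box
def f(x):
    return (x[0] - 2.0)**2 + (x[1] + 1.0)**2

def grad_f(x):
    return np.array([2 * (x[0] - 2.0), 2 * (x[1] + 1.0)])

lo = np.array([0.0, -0.5])        # lower bounds l_i
hi = np.array([1.0,  0.5])        # upper bounds u_i
x = np.array([0.5, 0.0])          # start inside the box
for _ in range(200):
    # gradient step, then project back onto the box [l, u]
    x = np.clip(x - 0.1 * grad_f(x), lo, hi)
# the constrained minimum is the projection (1.0, -0.5) onto the box
```

Merlin's bound handling (and its variable-fixing commands) serve the same purpose as the projection step here.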
ALGORITHMS
Direct Methods:
• SIMPLEX
• ROLL

Gradient Methods:

Conjugate Gradient:
• Polak-Ribiere
• Fletcher-Reeves
• Generalized P&R

Quasi Newton:
• BFGS (3 versions)
• DFP

Levenberg-Marquardt:
• For Sum-Of-Squares
THE USER’S PART
What does the user have to do?
• Program the objective function
• Use Merlin to find an optimum

What might the user want to do?
• Program the gradient
• Program the Hessian
• Program the Jacobian
MERLIN FEATURES & TOOLS
• Intuitive free-format I/O
• Menu assisted Input
• On-line HELP
• Several gradient modes
• Confidence parameter intervals
• Box constraints
• Postscript graphs
• Programmability
• “Open” to user enhancements
MCL: Merlin Control Language

What is it?
A High-Level Programming Language that Drives Merlin Intelligently.
What are the benefits ?
• Removes the need for user intervention.
• Optimization Strategies.
• Handy Utilities.
• Global Optimum Seeking Methods.
MCL REPERTOIRE
MCL command types:
• Merlin Commands
• Conditionals (IF-THEN-ELSE-ENDIF)
• Loops (DO type of loops)
• Branching (GO TO type)
• I/O (READ/WRITE)
MCL intrinsic variables: All Merlin important variables, e.g.: Parameters, Value, Gradient, Bounds ...
SAMPLE MCL PROGRAM

program
var i; sml; bfgs_calls; nfix; max_calls
sml = 1.e-4          % Gradient threshold.
bfgs_calls = 1000    % Number of BFGS calls.
max_calls = 10000    % Max. calls to spend.
again:
  loosall
  nfix = 0
  loop i from 1 to dim
    if abs[grad[i]] <= sml then
      fix (x.i)
      nfix = nfix + 1
    end if
  end loop
  if nfix == dim then
    display 'Gradient below threshold...'
    loosall
    finish
  end if
  bfgs (noc=bfgs_calls)
  when pcount < max_calls just move to again
  display 'We probably failed...'
end
MERLIN-MCL Availability

The Merlin-MCL package is written in ANSI Fortran 77 and can be downloaded from the following URL:

http://nrt.cs.uoi.gr/merlin/

It is maintained, supported, and is FREELY available to the scientific community.
FUTURE DEVELOPMENTS
• Optimal Training Point Sets
• Optimal Network Architecture
• Expansion & Pruning Techniques
Hardware Implementation on
NEUROPROCESSORS