Fotiadis, D. I.
Karras, D. A.
Lagaris, I. E.
Likas, A.
Papageorgiou, D. G.
OPTIMIZATION SOFTWARE
as a Tool for Solving Differential Equations Using
NEURAL NETWORKS
DIFFERENTIAL EQUATIONS HANDLED
• ODEs
• Systems of ODEs
• PDEs (Boundary and Initial Value Problems)
• Eigenvalue PDE Problems
• IDEs
ARTIFICIAL NEURAL NETWORKS
• Closed Analytic Form
• Universal Approximators
• Linear and Non-Linear Parameters
• Highly Parallel Systems
• Specialized Hardware for ANN
OPTIMIZATION ENVIRONMENT
MERLIN / MCL 3.0 SOFTWARE
Features Include:
• A Host of Optimization Algorithms
• Special Merit for Sums of Squares
• Variable Bounds and Variable Fixing
• Command Driven User Interface
• Numerical Estimation of Derivatives
• Dynamic Programming of Strategies
ARTIFICIAL NEURAL NETWORKS
• Inspired by biological NNs
Input-Output mapping via the weights u, w, v and the activation functions.

[Figure: a feed-forward perceptron with an Input Layer (inputs x1, x2 and a bias +1), two Hidden Layers with weights w(1), w(2) and biases u, and a linear Output Layer with weights v.]

Analytically this is given by the formula:

N(x; u, w(1), w(2), v) = Σ_i v_i σ( Σ_j w(2)_ij σ( Σ_k w(1)_jk x_k + u_j ) )
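As a concrete sketch (not the authors' code; the layer sizes and weight values below are purely illustrative), the two-hidden-layer mapping above can be written in NumPy:

```python
import numpy as np

def sigmoid(z):
    # Sigmoidal activation: sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def N(x, u, w1, w2, v):
    """Two-hidden-layer perceptron N(x; u, w(1), w(2), v).
    u: first-layer biases, w1/w2: hidden-layer weights, v: output weights."""
    h1 = sigmoid(w1 @ x + u)   # first hidden layer
    h2 = sigmoid(w2 @ h1)      # second hidden layer
    return v @ h2              # linear output unit

# Illustrative shapes: 2 inputs, hidden layers of 3 and 4 nodes
rng = np.random.default_rng(0)
u, w1 = rng.normal(size=3), rng.normal(size=(3, 2))
w2, v = rng.normal(size=(4, 3)), rng.normal(size=4)
y = N(np.array([0.5, 0.5]), u, w1, w2, v)
```

Because every operation above is smooth, the output is differentiable with respect to both the input x and all the parameters, which is what the training procedure exploits.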
Activation Functions
Many different functions can be used.
Our current choice: The Sigmoidal
σ(x) = 1 / (1 + e^(-x))

A smooth function, infinitely differentiable, bounded in (0,1).

[Plot: σ(x) for x in [-10, 10], rising from 0 to 1.]
The Sigmoidal properties
dσ(x)/dx = σ(x) [1 - σ(x)]

d²σ(x)/dx² = σ(x) [1 - σ(x)] [1 - 2σ(x)]
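These identities are easy to check numerically; a minimal sketch (central finite differences at a few illustrative points, not part of the original slides):

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def dsigma(x):   # sigma'(x) = sigma(x) [1 - sigma(x)]
    s = sigma(x)
    return s * (1.0 - s)

def d2sigma(x):  # sigma''(x) = sigma(x) [1 - sigma(x)] [1 - 2 sigma(x)]
    s = sigma(x)
    return s * (1.0 - s) * (1.0 - 2.0 * s)

# Verify against central finite differences at a few sample points
h = 1e-5
for x in (-2.0, 0.0, 1.5):
    fd1 = (sigma(x + h) - sigma(x - h)) / (2 * h)
    fd2 = (sigma(x + h) - 2 * sigma(x) + sigma(x - h)) / h**2
    assert abs(fd1 - dsigma(x)) < 1e-8
    assert abs(fd2 - d2sigma(x)) < 1e-5
```

Having closed-form derivatives of the activation is what lets the whole model be differentiated analytically, as the method requires.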
FACTS
Kolmogorov, Cybenko, and Hornik proved theorems concerning the approximation capabilities of ANNs.
In fact it is shown that ANNs are
UNIVERSAL APPROXIMATORS
DESCRIPTION OF THE METHOD
SOLVE THE EQUATION
L Ψ(x) = f(x)   SUBJECT TO DIRICHLET B.C.

Where L is an Integrodifferential Operator, Linear or Non-Linear.

The model:
Ψ_M(x) = B(x) + Z(x) N(x)

Where:
• B(x) satisfies the BC
• Z(x) vanishes on the boundary
• N(x) is an Artificial Neural Net
MODEL PROPERTIES
The Model
Ψ_M(x) = B(x) + Z(x) N(x)
satisfies by construction the B.C.

The Model, thanks to the Network, is “trainable”. The Network parameters can be adjusted so that:
L Ψ_M(x) - f(x) ≈ 0

Pick a set of representative points x_1, x_2, ..., x_n in the unit Hypercube, x ∈ [0,1]^N.

The residual “Error”:
E = Σ_{i=1}^{n} [L Ψ_M(x_i) - f(x_i)]²
ILLUSTRATION
Simple 1-d example:

d²Ψ(x)/dx² = f(x, Ψ, dΨ/dx),  Ψ(0) = Ψ_0,  Ψ(1) = Ψ_1

Model:
Ψ_M(x) = Ψ_0 (1 - x) + Ψ_1 x + x(1 - x) N(x),  0 ≤ x ≤ 1
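The point of this trial form is that the boundary conditions hold for any network parameters; a minimal NumPy sketch (the boundary values Ψ_0 = 2, Ψ_1 = 3 and the network size are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def N(x, v, w, u):
    # Single-hidden-layer perceptron: N(x) = sum_j v_j sigma(w_j x + u_j)
    return np.dot(v, sigmoid(w * x + u))

def psi_M(x, psi0, psi1, params):
    # Trial solution: psi_M(x) = psi0 (1-x) + psi1 x + x (1-x) N(x)
    v, w, u = params
    return psi0 * (1 - x) + psi1 * x + x * (1 - x) * N(x, v, w, u)

# The BC hold exactly for ANY (here: random) network parameters:
rng = np.random.default_rng(1)
params = (rng.normal(size=5), rng.normal(size=5), rng.normal(size=5))
bc0 = psi_M(0.0, 2.0, 3.0, params)   # equals psi0 = 2.0 by construction
bc1 = psi_M(1.0, 2.0, 3.0, params)   # equals psi1 = 3.0 by construction
```

Training therefore only has to reduce the residual in the interior; the constraints never need to be enforced explicitly.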
ILLUSTRATION
For a second order, two-dimensional PDE:
Ψ_M(x,y) = B(x,y) + x(1 - x) y(1 - y) N(x,y)

where

B(x,y) = (1 - x) Ψ(0,y) + x Ψ(1,y)
       + (1 - y) {Ψ(x,0) - [(1 - x) Ψ(0,0) + x Ψ(1,0)]}
       + y {Ψ(x,1) - [(1 - x) Ψ(0,1) + x Ψ(1,1)]}
EXAMPLES
Problem: Solve the 2-d PDE:

∇²Ψ(x,y) = e^(-x) (x - 2 + y³ + 6y)

In the domain: x, y ∈ [0,1]

Subject to the BC:
Ψ(x,0) = x e^(-x)
Ψ(0,y) = y³
Ψ(x,1) = e^(-x) (x + 1)
Ψ(1,y) = (1 + y³) e^(-1)

A single hidden layer Perceptron was used:

Ψ_M(x,y) = B(x,y) + x(1 - x) y(1 - y) N(x,y)

B(x,y) = (1 - x) y³ + x (1 + y³) e^(-1)
       + (1 - y) x (e^(-x) - e^(-1))
       + y [(1 + x) e^(-x) - (1 - x + 2x e^(-1))]
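A quick numerical sanity check of this example (not from the slides): the analytic solution Ψ(x,y) = e^(-x)(x + y³) should reproduce the right-hand side under a 5-point finite-difference Laplacian at any interior point:

```python
import numpy as np

def psi(x, y):
    # Exact solution of the example: e^{-x} (x + y^3)
    return np.exp(-x) * (x + y**3)

def rhs(x, y):
    # Right-hand side: e^{-x} (x - 2 + y^3 + 6 y)
    return np.exp(-x) * (x - 2 + y**3 + 6 * y)

# 5-point finite-difference Laplacian at a few interior points
h = 1e-4
errs = []
for x, y in [(0.3, 0.7), (0.5, 0.5), (0.8, 0.2)]:
    lap = (psi(x + h, y) + psi(x - h, y) + psi(x, y + h) + psi(x, y - h)
           - 4 * psi(x, y)) / h**2
    errs.append(abs(lap - rhs(x, y)))
max_err = max(errs)   # small: the PDE and its stated solution are consistent
```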
GRAPHICAL REPRESENTATION
The exact analytic solution is:
Ψ(x,y) = e^(-x) (x + y³)
GRAPHS & COMPARISON
Neural Solution accuracy: Ψ(x,y) - Ψ_M(x,y)
Plot Points: Training Points

GRAPHS & COMPARISON
Neural Solution accuracy: Ψ(x,y) - Ψ_M(x,y)
Plot Points: Test Points

GRAPHS & COMPARISON
Finite Element Solution accuracy: Ψ(x,y) - Ψ_FE(x,y)
Plot Points: Training Points

GRAPHS & COMPARISON
Finite Element Solution accuracy: Ψ(x,y) - Ψ_FE(x,y)
Plot Points: Test Points
PERFORMANCE
• Highly Accurate Solution (even with few training points)
• Uniform “Error” Distribution
• Superior Interpolation Properties
The model solution is very flexible; it can easily be enhanced to offer even higher accuracy.
EIGEN VALUE PROBLEMS
Problem: L ψ(x) = λ ψ(x), with appropriate Dirichlet BC.

The model is the same as before. However the “Error” is defined as:

E = Σ_{i=1}^{n} [L Ψ_M(x_i) - λ Ψ_M(x_i)]² / Σ_{i=1}^{n} [Ψ_M(x_i)]²
EIGEN VALUE PROBLEMS
Where:

λ = Σ_{i=1}^{n} Ψ(x_i) L Ψ(x_i) / Σ_{i=1}^{n} [Ψ(x_i)]²

i.e. the value for which the “Error” is minimum.
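This λ is simply the discrete Rayleigh quotient over the training points. A minimal check with a known case (L = -d²/dx², ψ(x) = sin(πx) on [0,1], eigenvalue π²; Lψ is supplied analytically rather than by a trained network):

```python
import numpy as np

def rayleigh_lambda(psi_vals, L_psi_vals):
    # lambda = sum_i psi(x_i) L psi(x_i) / sum_i psi(x_i)^2
    return np.sum(psi_vals * L_psi_vals) / np.sum(psi_vals**2)

# Known case: L = -d^2/dx^2, psi(x) = sin(pi x)  =>  L psi = pi^2 psi
x = np.linspace(0.05, 0.95, 19)
psi_vals = np.sin(np.pi * x)
L_psi_vals = np.pi**2 * np.sin(np.pi * x)
lam = rayleigh_lambda(psi_vals, L_psi_vals)   # recovers pi^2
```

In the actual method, ψ is the trial model Ψ_M, so λ is re-estimated from this formula as the network parameters are adjusted.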
Problems of that kind are often encountered in Quantum Mechanics. (Schrödinger’s equation)
EXAMPLES
The non-local Schrödinger equation:

-(ħ²/2m) d²ψ(r)/dr² + V(r) ψ(r) + ∫_0^∞ K(r, r') ψ(r') dr' = E ψ(r)

Describes the bound “n+α” system in the framework of the Resonating Group Method.

BC: ψ(0) = 0, ψ(r) ~ e^(-kr) as r → ∞, k > 0

Model: ψ_M(r) = e^(-br) r N(r), b > 0

Where:
N(r) = Σ_{j=1}^{nodes} v_j σ(w_j r + u_j)
is a single hidden layer, sigmoidal Perceptron.
OBTAINING EIGENVALUES
Example: The Henon-Heiles potential

V(x,y) = (1/2)(x² + y²) + (1/(4√5)) (x y² - x³/3)

Asymptotic behavior:
ψ(x,y) ~ e^(-k(x² + y²))

Model used:
ψ_M(x,y) = e^(-b(x² + y²)) N(x,y)

Use the above model to obtain an eigen solution.

Obtain a different eigen solution by deflation, i.e.:

ψ̃_M(x,y) = ψ_M(x,y) - ψ(x,y) ∫∫ ψ(x',y') ψ_M(x',y') dx' dy'

This model is orthogonal to ψ(x,y) by construction. The procedure can be applied repeatedly.
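On a discrete grid the deflation step is a single projection; a sketch with stand-in arrays (random vectors in place of actual eigenfunctions, purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(2)
n, dx = 200, 1.0 / 200
psi_found = rng.normal(size=n)                      # stand-in: converged eigenfunction
psi_found /= np.sqrt(np.sum(psi_found**2) * dx)     # normalize so <psi, psi> = 1
psi_M = rng.normal(size=n)                          # stand-in: current trial model

overlap = np.sum(psi_found * psi_M) * dx            # <psi, psi_M>
psi_tilde = psi_M - psi_found * overlap             # deflated model

residual = np.sum(psi_found * psi_tilde) * dx       # orthogonal by construction
```

Because <ψ, ψ̃_M> = <ψ, ψ_M> (1 - <ψ, ψ>) = 0 for a normalized ψ, repeating this against every eigenfunction found so far steers the minimization toward a new eigenstate.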
ARBITRARILY SHAPED DOMAINS
For domains other than Hypercubes the BC cannot be embedded in the model.
Let R_i, i = 1, 2, ..., m be the set of points defining the arbitrarily shaped boundary. The BC are then:

Ψ(R_i) = b_i, i = 1, 2, ..., m

Let r_i, i = 1, 2, ..., n be the set of the training points inside the domain.

We describe two ways to proceed solving the problem L Ψ(x) = f(x).
OPTIMIZATION WITH CONSTRAINTS
Model: Ψ_M(x) = N(x)

“Error” to be minimized: Domain terms + Boundary terms

E = Σ_{i=1}^{n} [L Ψ_M(r_i) - f(r_i)]² + η Σ_{i=1}^{m} [Ψ_M(R_i) - b_i]²

With η a penalty parameter, to control the degree of satisfaction of the BC.
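As a sketch, the penalized “Error” is just two sums of squares; all names and the toy one-point data below are hypothetical:

```python
def penalty_error(L_psi, f, psi_M, r, R, b, eta):
    # E = sum_i [L psi_M(r_i) - f(r_i)]^2 + eta * sum_i [psi_M(R_i) - b_i]^2
    domain = sum((L_psi(x) - f(x))**2 for x in r)
    boundary = sum((psi_M(X) - bi)**2 for X, bi in zip(R, b))
    return domain + eta * boundary

# Toy check: one domain point with zero residual, one boundary point off by 0.2
E = penalty_error(L_psi=lambda x: 2 * x, f=lambda x: 1.0,
                  psi_M=lambda x: x, r=[0.5], R=[1.0], b=[0.8], eta=10.0)
# domain term = (2*0.5 - 1.0)^2 = 0; boundary term = 0.2^2; E = eta * 0.04
```

Raising η tightens the BC at the expense of the interior residual, which is exactly the trade-off the slide describes.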
PERCEPTRON-RBF SYNERGY
Model:

Ψ_M(x) = N(x) + Σ_{i=1}^{m} a_i e^(-λ |x - R_i|²)

Where the a_i are determined so that the model satisfies the BC exactly, i.e.:

Σ_{k=1}^{m} a_k e^(-λ |R_i - R_k|²) = b_i - N(R_i),  i = 1, 2, ..., m

The free parameter λ is chosen once, initially, so that the system above is easily solved.

“Error”:
E = Σ_{i=1}^{n} [L Ψ_M(r_i) - f(r_i)]²
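The linear system above can be sketched directly in NumPy. Everything here is illustrative: random boundary points and values, a stand-in tanh "perceptron", and λ = 5 picked so the Gaussian kernel matrix is well conditioned (the slide's "chosen once initially" parameter):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 8
R = rng.uniform(size=(m, 2))      # hypothetical boundary points R_i
b = rng.normal(size=m)            # hypothetical boundary values b_i
lam = 5.0                         # free RBF width, chosen once for a well-posed system

def N(x):
    # Stand-in for the trained perceptron part of the model
    return np.tanh(x @ np.array([0.3, -0.7]))

# Gaussian kernel matrix A_ik = exp(-lam |R_i - R_k|^2); solve A a = b - N(R)
d2 = np.sum((R[:, None, :] - R[None, :, :])**2, axis=-1)
a = np.linalg.solve(np.exp(-lam * d2), b - np.array([N(Ri) for Ri in R]))

def psi_M(x):
    # Model: perceptron plus RBF correction centered at the boundary points
    return N(x) + np.sum(a * np.exp(-lam * np.sum((x - R)**2, axis=-1)))

# The BC are satisfied exactly (to solver precision) at every boundary point:
bc_err = max(abs(psi_M(R[i]) - b[i]) for i in range(m))
```

The cost noted in the next slide is visible here: the a_i depend on N, so the system must be re-solved whenever the network parameters change.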
Pros & Cons . . .

The Penalty method is:
• Approximate in satisfying the BC.
• Computationally efficient.

The RBF-Synergy is:
• Exact in satisfying the BC.
• Computationally costly. A linear system is solved each time the model is evaluated.
IN PRACTICE . . .
• Initially proceed via the penalty method, till an approximate solution is found.
• Refine the solution, using the RBF-Synergy method, to satisfy the BC exactly.
Conclusions:
Experiments on several model problems show performance similar to that reported earlier.
GENERAL OBSERVATIONS
Enhanced generalization performance is achieved, when the exponential weights of the Neural Networks are kept small.
Hence box-constrained optimization methods should be applied.
Bigger Networks (greater number of nodes) can achieve higher accuracy.
This favors the use of:
• Existing Specialized Hardware
• Sophisticated Optimization Software
MERLIN 3.0
What is it?
A software package offering many optimization algorithms and a friendly user interface.

What problems does it solve?
Find a local minimum of the function:

f(x_1, x_2, ..., x_N),  x ∈ R^N

Under the conditions:

x_i ∈ [l_i, u_i],  i = 1, 2, ..., N
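A minimal sketch of this problem class (projected gradient descent on a toy quadratic; this is an illustration of box-constrained local minimization, not one of Merlin's own algorithms):

```python
import numpy as np

# Toy objective with unconstrained minimum at (2, -1), outside the box
def f(x):
    return (x[0] - 2.0)**2 + (x[1] + 1.0)**2

def grad_f(x):
    return np.array([2 * (x[0] - 2.0), 2 * (x[1] + 1.0)])

lo = np.array([0.0, -0.5])        # lower bounds l_i
hi = np.array([1.0,  0.5])        # upper bounds u_i
x = np.array([0.5, 0.0])          # start inside the box
for _ in range(200):
    # gradient step, then project back onto the box [l, u]
    x = np.clip(x - 0.1 * grad_f(x), lo, hi)
# the constrained minimum is the projection (1.0, -0.5) onto the box
```

Merlin's bound handling (and its variable-fixing commands) serve the same purpose as the projection step here.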
ALGORITHMS
Direct Methods:
• SIMPLEX
• ROLL

Gradient Methods:

Conjugate Gradient:
• Polak-Ribiere
• Fletcher-Reeves
• Generalized P&R

Quasi Newton:
• BFGS (3 versions)
• DFP

Levenberg-Marquardt:
• For Sum-Of-Squares
THE USER’S PART
What does the user have to do?
• Program the objective function
• Use Merlin to find an optimum

What might the user want to do?
• Program the gradient
• Program the Hessian
• Program the Jacobian
MERLIN FEATURES & TOOLS
• Intuitive free-format I/O
• Menu assisted Input
• On-line HELP
• Several gradient modes
• Confidence parameter intervals
• Box constraints
• Postscript graphs
• Programmability
• “Open” to user enhancements
MCL: Merlin Control Language

What is it?
A High-Level Programming Language that Drives Merlin Intelligently.
What are the benefits ?
• Removes the need for user intervention.
• Optimization Strategies.
• Handy Utilities.
• Global Optimum Seeking Methods.
MCL REPERTOIRE
MCL command types:
• Merlin Commands
• Conditionals (IF-THEN-ELSE-ENDIF)
• Loops (DO type of loops)
• Branching (GO TO type)
• I/O (READ/WRITE)
MCL intrinsic variables: All Merlin important variables, e.g.: Parameters, Value, Gradient, Bounds ...
SAMPLE MCL PROGRAM

program
var i; sml; bfgs_calls; nfix; max_calls
sml = 1.e-4          % Gradient threshold.
bfgs_calls = 1000    % Number of BFGS calls.
max_calls = 10000    % Max. calls to spend.
again:
  loosall
  nfix = 0
  loop i from 1 to dim
    if abs[grad[i]] <= sml then
      fix (x.i)
      nfix = nfix + 1
    end if
  end loop
  if nfix == dim then
    display 'Gradient below threshold...'
    loosall
    finish
  end if
  bfgs (noc=bfgs_calls)
  when pcount < max_calls just move to again
  display 'We probably failed...'
end
MERLIN-MCL Availability

The Merlin-MCL package is written in ANSI Fortran 77 and can be downloaded from the following URL:

http://nrt.cs.uoi.gr/merlin/

It is maintained, supported, and is FREELY available to the scientific community.
FUTURE DEVELOPMENTS
• Optimal Training Point Sets
• Optimal Network Architecture
• Expansion & Pruning Techniques
Hardware Implementation on
NEUROPROCESSORS