ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Post on 18-Dec-2015


Page 1: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

ICT619 Intelligent Systems

Topic 4: Artificial Neural Networks

Page 2: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

ICT619 2

Artificial Neural Networks

PART A
- Introduction
- An overview of the biological neuron
- The synthetic neuron
- Structure and operation of an ANN
- Problem solving by an ANN
- Learning in ANNs
- ANN models
- Applications

PART B
- Developing neural network applications
- Design of the network
- Training issues
- A comparison of ANN and ES
- Hybrid ANN systems
- Case Studies

Page 3: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Introduction

Artificial Neural Networks (ANN), also known as:
- Neural networks
- Neural computing (or neuro-computing) systems
- Connectionist models

ANNs simulate the biological brain for problem solving.

This represents a totally different approach to machine intelligence from the symbolic logic approach.

The biological brain is a massively parallel system of interconnected processing elements.

ANNs simulate a similar network of simple processing elements at a greatly reduced scale.

Page 4: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Introduction (cont'd)

ANNs adapt themselves using data to learn problem solutions.

ANNs can be particularly effective for problems that are hard to solve using conventional computing methods.

First developed in the 1950s, slumped in the 1970s; great upsurge in interest in the mid 1980s.

Both ANNs and expert systems are non-algorithmic tools for problem solving:
- ES rely on the solution being expressed as a set of heuristics by an expert
- ANNs learn solely from data.

Page 5: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Page 6: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

An overview of the biological neuron

An estimated 1000 billion neurons in the human brain, with each connected to up to 10,000 others.

Electrical impulses produced by a neuron travel along the axon.

The axon connects to dendrites through synaptic junctions.

Page 7: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

An overview of the biological neuron (cont'd)

Photo: Osaka University

Page 8: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

An overview of the biological neuron (cont'd)

A neuron collects the excitation of its inputs and "fires" (produces a burst of activity) when the sum of its inputs exceeds a certain threshold.

The strengths of a neuron's inputs are modified (enhanced or inhibited) by the synaptic junctions.

Learning in our brains occurs through a continuous process of new interconnections forming between neurons, and adjustments at the synaptic junctions.

Page 9: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The synthetic neuron

A simple model of the biological neuron, first proposed in 1943 by McCulloch and Pitts, consists of a summing function with an internal threshold, and "weighted" inputs, as shown below.

Page 10: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The synthetic neuron (cont'd)

For a neuron receiving n inputs, each input x_i (i ranging from 1 to n) is weighted by multiplying it with a weight w_i.

The sum of the products w_i x_i gives the net activation value of the neuron.

The activation value is subjected to a transfer function to produce the neuron's output.

The weight value of the connection carrying signals from a neuron i to a neuron j is termed w_ij.
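The weighted sum and internal threshold described above can be sketched in a few lines of Python. The function name and the AND-gate weights below are illustrative assumptions, not from the slides:

```python
def neuron_output(inputs, weights, threshold):
    """McCulloch-Pitts style neuron: a weighted sum of the inputs
    passed through a step transfer function at the given threshold."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# A two-input neuron acting as a logical AND gate:
print(neuron_output([1, 1], [0.5, 0.5], threshold=1.0))  # fires: 1
print(neuron_output([1, 0], [0.5, 0.5], threshold=1.0))  # does not fire: 0
```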

Page 11: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Transfer functions

These compute the output of a node from its net activation.

Among the popular transfer functions are:
- Step function
- Signum (or sign) function
- Sigmoid function
- Hyperbolic tangent function

In the step function, the neuron produces an output only when its net activation reaches a minimum value, known as the threshold.

For a binary neuron i, whose output is a 0 or 1 value, the step function can be summarised as:

output_i = 1 if activation_i >= T
output_i = 0 if activation_i < T

Page 12: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Transfer functions (cont'd)

The sign function returns a value of either -1 or +1. To avoid confusion with 'sine' it is often called signum.

output_i = +1 if activation_i >= 0
output_i = -1 if activation_i < 0

[Figure: signum function, output_i plotted against activation_i, stepping from -1 to +1 at zero activation]

Page 13: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Transfer functions (cont'd)

The sigmoid

The sigmoid transfer function produces a continuous value in the range 0 to 1.

The parameter gain affects the slope of the function around zero.
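A minimal sketch of the sigmoid with a gain parameter, using the standard form 1 / (1 + e^(-gain * activation)); the sample activation values are illustrative:

```python
import math

def sigmoid(activation, gain=1.0):
    """Sigmoid transfer function: 1 / (1 + e^(-gain * activation)).
    A larger gain gives a steeper slope around zero activation."""
    return 1.0 / (1.0 + math.exp(-gain * activation))

print(sigmoid(0.0))          # 0.5 at zero activation
print(sigmoid(2.0, gain=1))  # ~0.88
print(sigmoid(2.0, gain=5))  # ~0.99995; the steeper curve saturates sooner
```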

Page 14: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Transfer functions (cont'd)

The hyperbolic tangent

A variant of the sigmoid transfer function.

Has a shape similar to the sigmoid (like an S), with the difference being that the value of output_i ranges between -1 and 1:

output_i = (e^(activation_i) - e^(-activation_i)) / (e^(activation_i) + e^(-activation_i))
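The hyperbolic tangent can be written out directly from its exponential definition; the function name is an illustrative choice, and the result matches Python's built-in math.tanh:

```python
import math

def tanh_transfer(activation):
    """Hyperbolic tangent transfer function, computed from the
    exponential definition; output ranges between -1 and 1."""
    e_pos = math.exp(activation)
    e_neg = math.exp(-activation)
    return (e_pos - e_neg) / (e_pos + e_neg)

print(tanh_transfer(0.0))   # 0.0, the centre of the S-shape
print(tanh_transfer(3.0))   # ~0.995, saturating towards +1
print(tanh_transfer(-3.0))  # ~-0.995, saturating towards -1
```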

Page 15: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Structure and operation of an ANN

The building block of an ANN is the artificial neuron. It is characterised by:
- weighted inputs
- a summing and transfer function

The most common architecture of an ANN consists of two or more layers of artificial neurons or nodes, with each node in a layer connected to every node in the following layer.

Signals usually flow from the input layer, which is directly subjected to an input pattern, across one or more hidden layers towards the output layer.
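The layered, fully connected signal flow described above can be sketched as a forward pass. The weights and layer sizes below are made up purely for illustration:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights):
    """Compute one fully connected layer: each node applies the transfer
    function to its weighted sum of the inputs. weights[j] holds the
    weights from every input to node j."""
    return [sigmoid(sum(w * x for w, x in zip(node_weights, inputs)))
            for node_weights in weights]

# A 2-input, 2-hidden-node, 1-output network (illustrative weights):
hidden_weights = [[0.5, -0.3], [0.8, 0.2]]
output_weights = [[1.0, -1.0]]

pattern = [1.0, 0.5]                     # input pattern
hidden = layer_forward(pattern, hidden_weights)
output = layer_forward(hidden, output_weights)
print(output)                            # a single value in (0, 1)
```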

Page 16: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Structure and operation of an ANN (cont'd)

The most popular ANN architecture, known as the multilayer perceptron (shown in the diagram above), follows this model.

In some models of the ANN, such as the self-organising map (SOM) or Kohonen net, nodes in the same layer may have interconnections among them.

In recurrent networks, connections can even go backwards to nodes closer to the input.

Page 17: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Problem solving by an ANN

The inputs of an ANN are data values grouped together to form a pattern.

Each data value (component of the pattern vector) is applied to one neuron in the input layer.

The output value(s) of node(s) in the output layer represent some function of the input pattern.

Page 18: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Problem solving by an ANN (cont'd)

In the example above, the ANN maps the input pattern to either one of two classes.

The ANN produces an accurate prediction only if the functional relationships between the relevant variables, namely the components of the input pattern, and the corresponding output have been "learned" by the ANN.

Any three-layer ANN can (at least in theory) represent the functional relationship between an input pattern and its class.

It may be difficult in practice for the ANN to learn a given relationship.

Page 19: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning in ANNs

Common human learning behaviour: repeatedly going through the same material, making mistakes and learning until able to carry out a given task successfully.

Learning by most ANNs is modelled after this type of human learning.

Learned knowledge to solve a given problem is stored in the interconnection weights of an ANN.

The process by which an ANN arrives at the right values of these weights is known as learning or training.

Page 20: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning in ANNs (cont'd)

Learning in ANNs takes place through an iterative training process during which node interconnection weight values are adjusted.

Initial weights, usually small random values, are assigned to the interconnections between the ANN nodes.

Like knowledge acquisition in ES, learning in ANNs can be the most time consuming phase in development.

Page 21: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning in ANNs (cont'd)

ANN learning (or training) can be supervised or unsupervised.

In supervised training, data sets consisting of pairs, each one an input pattern and its expected correct output value, are used.

The weight adjustments during each iteration aim to reduce the "error" (the difference between the ANN's actual output and the expected correct output).

E.g., a node producing a small negative output when it is expected to produce a large positive one has its positive weight values increased and its negative weight values decreased.

Page 22: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning in ANNs (cont'd)

In supervised training:
- Pairs of sample input values and corresponding output values are used to train the net repeatedly until the output becomes satisfactorily accurate.

In unsupervised training:
- There is no known expected output used for guiding the weight adjustments.
- The function to be optimised can be any function of the inputs and outputs, usually set by the application.
- The net adapts itself to align its weight values with the training patterns.
- This results in groups of nodes responding strongly to specific groups of similar input patterns.

Page 23: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The two states of an ANN

A neural network can be in one of two states: training mode or operation mode.

Most ANNs learn off-line and do not change their weights once training is finished and they are in operation.

In an ANN capable of on-line learning, training and operation continue together.

ANN training can be time consuming, but once trained, the resulting network can be made to run very efficiently, providing fast responses.

Page 24: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

ANN models

ANNs are supposed to model the structure and operation of the biological brain.

But there are different types of neural networks depending on the architecture, learning strategy and operation.

Three of the most well known models are:
1. The multilayer perceptron
2. The Kohonen network (the Self-Organising Map)
3. The Hopfield net

The Multilayer Perceptron (MLP) is the most popular ANN architecture.

Page 25: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The Multilayer Perceptron

Nodes are arranged into an input layer, an output layer and one or more hidden layers.

Also known as the backpropagation network because of the use of error values from the output layer in the layers before it to calculate weight adjustments during training.

Another name for the MLP is the feedforward network.

Page 26: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

MLP learning algorithm

The learning rule for the multilayer perceptron is known as "the generalised delta rule" or the "backpropagation rule".

The generalised delta rule repeatedly calculates an error value for each input, which is a function of the squared difference between the expected correct output and the actual output.

The calculated error is backpropagated from one layer to the previous one, and is used to adjust the weights between connecting layers.

Page 27: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

MLP learning algorithm (cont'd)

New weight = old weight + change calculated from square of error

Error = difference between desired output and actual output

Training stops when the error becomes acceptable, or after a predetermined number of iterations.

After training, the modified interconnection weights form a sort of internal representation that enables the ANN to generate desired outputs when given the training inputs, or even new inputs that are similar to the training inputs.

This generalisation is a very important property.
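The update rule can be sketched for a single sigmoid neuron trained by the delta rule. The OR-gate task, learning rate and epoch count below are illustrative assumptions; a full MLP would also backpropagate the error through its hidden layers:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_neuron(samples, weights, rate=0.5, epochs=2000):
    """Delta-rule sketch for one sigmoid neuron: each weight change is
    proportional to the error times the input (via the gradient of the
    squared error, using the sigmoid derivative out * (1 - out))."""
    for _ in range(epochs):
        for inputs, target in samples:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
            error = target - out
            weights = [w + rate * error * out * (1 - out) * x
                       for w, x in zip(weights, inputs)]
    return weights

# Learn logical OR; the first input is a constant 1 acting as a bias.
samples = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 1)]
weights = train_neuron(samples, [0.1, 0.1, 0.1])
out = sigmoid(sum(w * x for w, x in zip(weights, [1, 1, 0])))
print(round(out))  # 1: the trained neuron reproduces the expected output
```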

Page 28: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The error landscape in a multilayer perceptron

For a given pattern p, the error E_p can be plotted against the weights to give the so-called error surface.

The error surface is a landscape of hills and valleys, with points of minimum error corresponding to wells and maximum error found on peaks.

The generalised delta rule aims to minimise E_p by adjusting weights so that they correspond to points of lowest error.

It follows the method of gradient descent, where the changes are made in the steepest downward direction.

All possible solutions are depressions in the error surface, known as basins of attraction.
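Gradient descent over a simple error surface can be sketched as follows. The bowl-shaped surface and step count are illustrative assumptions, chosen so the single basin of attraction is easy to check:

```python
def gradient_descent(grad, w, rate=0.1, steps=200):
    """Plain gradient descent: repeatedly step in the steepest downward
    direction of the error surface. `grad` returns the gradient at w."""
    for _ in range(steps):
        g = grad(w)
        w = [wi - rate * gi for wi, gi in zip(w, g)]
    return w

# Illustrative bowl-shaped error surface E(w) = (w0 - 1)^2 + (w1 + 2)^2,
# whose single minimum (basin of attraction) is at (1, -2).
grad = lambda w: [2 * (w[0] - 1), 2 * (w[1] + 2)]
w = gradient_descent(grad, [5.0, 5.0])
print([round(x, 3) for x in w])  # [1.0, -2.0]: settled into the minimum
```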

Page 29: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The error landscape in a multilayer perceptron (cont'd)

[Figure: error surface, with E_p plotted against two weights]

Page 30: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning difficulties in multilayer perceptrons - local minima

The MLP may fail to settle into the global minimum of the error surface and instead find itself in one of the local minima.

This is due to the gradient descent strategy followed.

A number of alternative approaches can be taken to reduce this possibility:

Lowering the gain term progressively
- Used to influence the rate at which weight changes are made during training.
- Its value by default is 1, but it may be gradually reduced to reduce the rate of change as training progresses.
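One possible way to lower the gain term progressively is a simple decay schedule. The slides do not prescribe any particular schedule, so the formula and decay constant below are assumptions for illustration only:

```python
def decayed_gain(initial_gain, iteration, decay=0.001):
    """Illustrative decay schedule: the gain shrinks smoothly from its
    initial value (default 1) as training iterations accumulate."""
    return initial_gain / (1.0 + decay * iteration)

print(decayed_gain(1.0, 0))     # 1.0 at the start of training
print(decayed_gain(1.0, 1000))  # 0.5 after 1000 iterations
```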

Page 31: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Learning difficulties in multilayer perceptrons (cont'd)

Addition of more nodes for better representation of patterns
- Too few nodes (and consequently not enough weights) can cause failure of the ANN to learn a pattern.

Introduction of a momentum term
- Determines the effect of past weight changes on the current direction of movement in weight space.
- The momentum term is also a small numerical value in the range 0-1.

Addition of random noise to perturb the ANN out of local minima
- Usually done by adding small random values to weights.
- Takes the net to a different point in the error space, hopefully out of a local minimum.
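The momentum term can be sketched as a single weight update. The learning rate, momentum value and gradient below are illustrative numbers, not from the slides:

```python
def momentum_update(weight, gradient, prev_change, rate=0.1, momentum=0.9):
    """Weight update with a momentum term: a fraction of the previous
    weight change carries over into the current direction of movement
    in weight space."""
    change = -rate * gradient + momentum * prev_change
    return weight + change, change

# Two consecutive updates with the same gradient:
w, change = momentum_update(1.0, gradient=0.5, prev_change=0.0)
print(w, change)    # 0.95 -0.05
w, change = momentum_update(w, gradient=0.5, prev_change=change)
print(round(w, 4))  # 0.855: the second step is larger thanks to momentum
```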

Page 32: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The Kohonen network (the self-organising map)

Biological systems display both supervised and unsupervised learning behaviour.

A neural network with unsupervised learning capability is said to be self-organising.

During training, the Kohonen net changes its weights to learn appropriate associations, without any right answers being provided.

Page 33: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The Kohonen network (cont'd)

The Kohonen net consists of an input layer that distributes the inputs to every node in a second layer, known as the competitive layer.

The competitive (output) layer is usually organised into some 2-D or 3-D surface (feature map).

Page 34: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Operation of the Kohonen Net

Each neuron in the competitive layer is connected to other neurons in its neighbourhood.

Neurons in the competitive layer have excitatory (positively weighted) connections to immediate neighbours and inhibitory (negatively weighted) connections to more distant neurons.

As an input pattern is presented, some of the neurons in the competitive layer are sufficiently activated to produce outputs, which are fed to other neurons in their neighbourhoods.

The node with the set of input weights closest to the input pattern component values produces the largest output. This node is termed the best matching (or winning) node.

Page 35: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

Operation of the Kohonen Net (cont'd)

During training, the input weights of the best matching node and its neighbours are adjusted to make them resemble the input pattern even more closely.

At the completion of training, the best matching node ends up with its input weight values aligned with the input pattern, and produces the strongest output whenever that particular pattern is presented.

The nodes in the winning node's neighbourhood also have their weights modified to settle down to an average representation of that pattern class.

As a result, the net is able to represent clusters of similar input patterns - a feature found useful for data mining applications, for example.

Page 36: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The Hopfield Model

The Hopfield net is the most widely known of all the autoassociative (pattern completing) ANNs.

In autoassociation, a noisy or partially incomplete input pattern causes the network to stabilise to a state corresponding to the original pattern.

It is also useful for optimisation tasks.

The Hopfield net is a recurrent ANN in which the output produced by each neuron is fed back as input to all other neurons.

Neurons compute a weighted sum with a step transfer function.

Page 37: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks

The Hopfield Model (cont'd)

The Hopfield net has no iterative learning algorithm as such. Patterns (or facts) are simply stored by adjusting the weights to lower a term called network energy.

During operation, an input pattern is applied to all neurons simultaneously and the network is left to stabilise.

Outputs from the neurons in the stable state form the output of the network.

When presented with an input pattern, the net outputs a stored pattern nearest to the presented pattern.

Page 38: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


When ANNs should be applied

Difficulties with some real-life problems:
Solutions are difficult, if not impossible, to define algorithmically, due mainly to the unstructured nature of the problem

Too many variables, and/or the interactions of relevant variables are not well understood

Input data may be partially corrupt or missing, making it difficult for a logical sequence of solution steps to function effectively

Page 39: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


When ANNs should be applied (cont’d)

The typical ANN attempts to arrive at an answer by learning to identify the right answer through an iterative process of self-adaptation or training

If there are many factors, with complex interactions among them, the usual "linear" statistical techniques may be inappropriate

If sufficient data is available, an ANN can find the relevant functional relationship from the data by means of an adaptive learning procedure

Page 40: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Current applications of ANNs

ANNs are good at recognition and classification tasks

Due to their ability to recognise complex patterns, ANNs have been widely applied in character, handwritten text and signature recognition, as well as more complex images such as faces

They have also been used successfully for speech recognition and synthesis

ANNs are being used in an increasing number of applications where high-speed computation of functions is important, eg, in industrial robotics

Page 41: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Current applications of ANNs (cont’d)

One of the more successful applications of ANNs has been as a decision support tool in the area of finance and banking

Some examples of commercial applications of ANN are:
Financial market analysis for investment decision making
Sales support - targeting customers for telemarketing
Bankruptcy prediction
Intelligent flexible manufacturing systems
Stock market prediction
Resource allocation - scheduling and management of personnel and equipment

Page 42: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


ANN applications - broad categories

According to a survey (Quaddus & Khan, 2002) covering the period 1988 up to mid 1998, the main business application areas of ANNs are:
Production (36%)
Information systems (20%)
Finance (18%)
Marketing & distribution (14.5%)
Accounting/Auditing (5%)
Others (6.5%)

Page 43: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


ANN applications - broad categories (cont’d)

The levelling off of publications on ANN applications may be attributed to the ANN moving from the research to the commercial application domain

The emergence of other intelligent system tools may be another factor

Table 1: Distribution of the Articles by Areas and Year

AREA                   1988   89   90   91    92    93    94    95    96    97   98  Total  % of Total
Accounting/Auditing       1    0    1    1     6     3     3     7     7     5    0     34        4.97
Finance                   0    0    4   11    19    28    27    18     5     9    2    123       17.98
Human resources           0    0    0    1     0     1     1     0     0     0    0      3        0.44
Information systems       4    6    9    7    15    24    21    18    13    18    3    138       20.18
Marketing/Distribution    2    2    2    3     8    10    12    17    29    14    0     99       14.47
Production                2    6    8   21    31    38    24    50    29    31    1    241       35.23
Others                    0    0    1    7     3     8     7     8     7     5    0     46        6.73
Yearly Total              9   14   25   51    82   112    95   118    90    82    6    684      100.00
% of Total             1.32 2.05 3.65 7.46 11.99 16.37 13.89 17.25 13.16 11.99 0.88 100.00

Page 44: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


Some advantages of ANNs

Able to take incomplete or corrupt data and provide approximate results.

Good at generalisation, that is, recognising patterns similar to those learned during training

Inherent parallelism makes them fault-tolerant - the loss of a few interconnections or nodes leaves the system relatively unaffected

Parallelism also makes ANNs fast and efficient for handling large amounts of data.

Page 45: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


ANN State-of-the-art overview

Currently neural network systems are available as:
Software simulation on conventional computers - prevalent
Special purpose hardware that models the parallelism of neurons

ANN-based systems are not likely to replace conventional computing systems, but they are an established alternative to the symbolic logic approach to information processing

A new computing paradigm in the form of hybrid intelligent systems has emerged - often involving ANNs with other intelligent system tools

Page 46: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


REFERENCES
AI Expert (special issue on ANN), June 1990.

BYTE (special issue on ANN), Aug. 1989.

Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.

Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.

Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.

Lippman, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.

Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.

Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.

Page 47: ICT619 Intelligent Systems Topic 4: Artificial Neural Networks


REFERENCES (cont’d)
Quaddus, M.A., and Khan, M.S., "Evolution of Artificial Neural Networks in Business Applications: An Empirical Investigation Using a Growth Model", International Journal of Management and Decision Making, Vol. 3, No. 1, March 2002, pp. 19-34. (See also ANN application publications EndNote library files, ICT619 ftp site.)

Wasserman, P.D., Neural Computing: Theory and Practice, Van Nostrand Reinhold, New York, 1989.

Wong, B.K., Bodnovich, T.A., and Selvi, Yakup, "Neural networks applications in business: A review and analysis of the literature (1988-95)", Decision Support Systems, 19, 1997, pp. 301-320.

Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.

http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html