
Page 1: Soft Computing Techniques - Unit 2

SOFT COMPUTING TECHNIQUES

WELCOME

Page 2: Soft Computing Techniques - Unit 2

Presented by Rm. Sumanth,

P. Ganga Bashkar &

Habeeb Khan Rahim Khan

Soft Computing Techniques

Page 3: Soft Computing Techniques - Unit 2

04/11/23

A mini classroom project review submitted to

MADINA ENGINEERING COLLEGE, KADAPA

in partial fulfillment of the requirements for the award of the degree of

BACHELOR OF TECHNOLOGY in

ELECTRICAL & ELECTRONICS ENGINEERING

by

RM. SUMANTH (11725A0201)

Mr. P. Ganga Bashkar, B.Tech, Senior Technical Student,

Dept. of EEE, Madina Engg. College

Page 4: Soft Computing Techniques - Unit 2

Machine Learning

Neural Networks

Page 5: Soft Computing Techniques - Unit 2

An Artificial Neural Network (ANN) is modeled on the biological nervous system, such as the brain.

It is composed of interconnected computing units called neurons.

Like humans, an ANN learns by example.

Introduction

Page 6: Soft Computing Techniques - Unit 2

There are two basic reasons why we are interested in building artificial neural networks (ANNs):

Technical viewpoint: Some problems such as character recognition or the prediction of future states of a system require massively parallel and adaptive processing.

Biological viewpoint: ANNs can be used to replicate and simulate components of the human (or animal) brain, thereby giving us insight into natural information processing.

6

Why Artificial Neural Networks?

Page 7: Soft Computing Techniques - Unit 2

Science: Model how biological neural systems, like the human brain, work.

How do we see? How is information stored in/retrieved from memory? How do you learn not to touch fire? How do your eyes adapt to the amount of light in the environment?

Related fields: Neuroscience, Computational Neuroscience, Psychology, Psychophysiology, Cognitive Science, Medicine, Math, Physics.

7

Page 8: Soft Computing Techniques - Unit 2

Old Ages:

Association (William James; 1890)

McCulloch-Pitts Neuron (1943, 1947)

Perceptrons (Rosenblatt; 1958, 1962)

Adaline/LMS (Widrow and Hoff; 1960)

Perceptrons book (Minsky and Papert; 1969)

Dark Ages:

Self-organization in visual cortex (von der Malsburg; 1973)

Backpropagation (Werbos; 1974)

Foundations of Adaptive Resonance Theory (Grossberg; 1976)

Neural Theory of Association (Amari; 1977)

8

Brief History

Page 9: Soft Computing Techniques - Unit 2

Modern Ages:

Adaptive Resonance Theory (Grossberg; 1980)

Hopfield model (Hopfield; 1982, 1984)

Self-organizing maps (Kohonen; 1982)

Reinforcement learning (Sutton and Barto; 1983)

Simulated Annealing (Kirkpatrick et al.; 1983)

Boltzmann machines (Ackley, Hinton, Sejnowski; 1985)

Backpropagation (Rumelhart, Hinton, Williams; 1986)

ART networks (Carpenter, Grossberg; 1992)

Support Vector Machines

9

History

Page 10: Soft Computing Techniques - Unit 2

In 1949, Donald Hebb formulated William James' principle of association into a mathematical form.

10

Hebb's Learning Law

• If the activations of two neurons, y1 and y2, are both on (+1), then the weight between the two neurons grows. (Off: 0)

• Otherwise, the weight between them remains the same.

• However, when the bipolar activation scheme {-1, +1} is used, the weights can also decrease when the activations of the two neurons do not match.
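As a small illustration (not from the slides), Hebb's law for a single weight can be sketched in Python; the learning constant c and the example activation values below are assumptions:

```python
def hebb_update(w, y1, y2, c=0.1):
    """Hebbian learning: strengthen the weight when both neurons are active.

    With binary activations {0, +1} the weight only grows when y1 = y2 = 1.
    With bipolar activations {-1, +1} the product y1 * y2 is negative when the
    activations disagree, so the weight can also decrease.
    """
    return w + c * y1 * y2

# Binary case: only the (1, 1) pair changes the weight
w = 0.0
for y1, y2 in [(1, 1), (1, 0), (0, 1)]:
    w = hebb_update(w, y1, y2)
print(w)   # 0.1

# Bipolar case: mismatched activations weaken the connection
w = 0.0
w = hebb_update(w, +1, -1)
print(w)   # -0.1
```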

Page 11: Soft Computing Techniques - Unit 2

Synapses change size and strength with experience.

Hebbian learning: When two connected neurons are firing at the same time, the strength of the synapse between them increases.

“Neurons that fire together, wire together.”

11

Real Neural Learning

Page 12: Soft Computing Techniques - Unit 2

Biological Neurons

The human brain contains billions of neurons. Each neuron is connected to thousands of other neurons. A neuron is made of:

The soma: body of the neuron

Dendrites: filaments that provide input to the neuron

The axon: sends an output signal

Synapses: connections with other neurons; they release certain quantities of chemicals called neurotransmitters to other neurons

12

Page 13: Soft Computing Techniques - Unit 2

13

Modeling of Brain Functions

Page 14: Soft Computing Techniques - Unit 2

The pulses generated by the neuron travel along the axon as an electrical wave.

Once these pulses reach the synapses at the end of the axon, chemical vesicles open up and excite the other neuron.

14

The biological neuron

Page 15: Soft Computing Techniques - Unit 2

Information is transmitted as a series of electric impulses, so-called spikes.

The frequency and phase of these spikes encode the information.

In biological systems, one neuron can be connected to as many as 10,000 other neurons.

Usually, a neuron receives its information from other neurons in a confined area.

15

How do NNs and ANNs work?

Page 16: Soft Computing Techniques - Unit 2

Done by Pomerleau. The network takes inputs from a 34x36 video image and a 7x36 range finder. Output units represent "drive straight", "turn left" or "turn right". After training about 40 times on 1200 road images, the car drove around the CMU campus at 5 km/h (using a small workstation on the car). This was almost twice the speed of any other non-NN algorithm at the time.

16

Navigation of a car

Page 17: Soft Computing Techniques - Unit 2

17

Automated driving at 70 mph on a public highway

Camera image: 30x32 pixels as inputs

30 outputs for steering

30x32 weights into one out of four hidden units

4 hidden units

Page 18: Soft Computing Techniques - Unit 2

"Standard" Computers      | Neural Networks

one CPU                   | highly parallel processing

fast processing units     | slow processing units

reliable units            | unreliable units

static infrastructure     | dynamic infrastructure

18

Computers vs. Neural Networks

Page 19: Soft Computing Techniques - Unit 2

Neural Network

Page 20: Soft Computing Techniques - Unit 2

Neural Network Application

• Pattern recognition can be implemented using a NN.

• The figure can be a T or an H character; the network should identify the class (T or H) of each figure.

Page 21-23: Soft Computing Techniques - Unit 2 (figures)

Page 24: Soft Computing Techniques - Unit 2

Simple Neuron

(Figure: inputs X1, X2, ..., Xn and a bias b feed a single neuron that produces the output.)

Page 25: Soft Computing Techniques - Unit 2

An Artificial Neuron

(Figure: inputs x1, x2, ..., xn reach neuron i through synapses with weights W_i,1, W_i,2, ..., W_i,n; the neuron computes its net input signal and emits the output x_i.)

Net input signal:

$\mathrm{net}_i(t) = \sum_{j=1}^{n} w_{i,j}(t)\, x_j(t)$

Output:

$x_i(t) = f_i(\mathrm{net}_i(t))$
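To make these two formulas concrete, here is a minimal Python sketch of one artificial neuron (not part of the slides; the weights, inputs, and the hard-limit choice of f are assumptions for the example):

```python
import numpy as np

def neuron_output(w_i, x, f):
    """Compute net_i = sum_j w_ij * x_j, then the output x_i = f(net_i)."""
    net_i = np.dot(w_i, x)
    return f(net_i)

# Example values (assumed): three inputs and a hard-limit activation
w_i = np.array([0.5, -1.0, 2.0])    # synaptic weights W_i,1 .. W_i,3
x = np.array([1.0, 0.0, 1.0])       # input signals x_1 .. x_3
hardlim = lambda n: 1.0 if n >= 0 else 0.0

print(neuron_output(w_i, x, hardlim))   # net = 2.5 -> output 1.0
```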

Page 26: Soft Computing Techniques - Unit 2

Neural Network

Input Layer Hidden 1 Hidden 2 Output Layer

Page 27: Soft Computing Techniques - Unit 2

The common type of ANN consists of three layers of neurons: a layer of input neurons connected to a layer of hidden neurons, which is connected to a layer of output neurons.

Network Layers

Page 28: Soft Computing Techniques - Unit 2

Feed-Forward networks: allow the signals to travel one way, from input to output.

Feed-Back networks: the signals travel in loops in the network; the output is connected back to the input of the network.

Architecture of ANN

Page 29-31: Soft Computing Techniques - Unit 2 (figures)

Page 32: Soft Computing Techniques - Unit 2

NNs are able to learn by adapting their connectivity patterns so that the organism improves its behavior in terms of reaching certain (evolutionary) goals.

The NN achieves learning by appropriately adapting the states of its synapses.

How do NNs and ANNs Learn?

Page 33: Soft Computing Techniques - Unit 2

The learning rule modifies the weights of the connections.

The learning process is divided into supervised and unsupervised learning.

Learning Rule

Page 34: Soft Computing Techniques - Unit 2

This means there exists an external teacher. The target is to minimize the error between the desired and the computed output.

Supervised Network

Page 35-37: Soft Computing Techniques - Unit 2 (figures)

Page 38: Soft Computing Techniques - Unit 2

Uses no external teacher and is based upon only local information.

Unsupervised Network

Page 39-41: Soft Computing Techniques - Unit 2 (figures)

Page 42: Soft Computing Techniques - Unit 2

It is a network of one neuron with a hard-limit transfer function.

Perceptron

(Figure: inputs X1, X2, ..., Xn with weights W1, W2, ..., Wn feed the hard-limit function f, which produces the output.)

Page 43: Soft Computing Techniques - Unit 2

The perceptron is first given a random weight vector.

The perceptron is given chosen data pairs (input and desired output).

The perceptron learning rule changes the weights according to the error in the output.

Perceptron

Page 44: Soft Computing Techniques - Unit 2 (figure)

Page 45: Soft Computing Techniques - Unit 2

W new = W old + (t-a) X

Where W new is the new weight

W old is the old value of weight

X is the input value

t is the desired value of output

a is the actual value of output

Perceptron Learning Rule

Page 46: Soft Computing Techniques - Unit 2

Let X1 = [0 0] with t = 0; X2 = [0 1] with t = 0; X3 = [1 0] with t = 0; X4 = [1 1] with t = 1.

W = [2 2] and b = -3

Example
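A minimal Python sketch (not from the slides) of the perceptron learning rule from the previous page applied to this AND data, starting from the given W = [2 2] and b = -3; the hard-limit activation and the bias update step are assumptions spelled out in the comments:

```python
import numpy as np

# AND training data from the slide
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([0, 0, 0, 1])

W = np.array([2.0, 2.0])   # initial weights from the slide
b = -3.0                   # initial bias from the slide

def hardlim(n):
    """Hard-limit transfer function: 1 if the net input >= 0, else 0 (assumed)."""
    return 1 if n >= 0 else 0

# Perceptron learning rule: W_new = W_old + (t - a) X
# (the bias is updated the same way, treating it as a weight on a constant input 1)
for epoch in range(10):
    errors = 0
    for x, target in zip(X, t):
        a = hardlim(W @ x + b)
        if a != target:
            W = W + (target - a) * x
            b = b + (target - a)
            errors += 1
    if errors == 0:
        break

print(W, b)   # with the given start values the AND data is already classified correctly
```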

Page 47: Soft Computing Techniques - Unit 2

This example means we construct a network for the AND operation. The network draws a line to separate the classes; this is called classification.

AND Network

Page 48: Soft Computing Techniques - Unit 2

The equation below describes a (hyper-)plane in the input space consisting of real-valued m-dimensional vectors. The plane splits the input space into two regions, each of them describing one class.

Perceptron Geometric View

$w_0 + \sum_{i=1}^{m} w_i x_i = 0$

(Figure: in the (x1, x2) plane, the decision boundary w1x1 + w2x2 + w0 = 0 separates class C1 from class C2; the decision region for C1 is w1x1 + w2x2 + w0 >= 0.)

Page 49: Soft Computing Techniques - Unit 2

Four one-dimensional data points belonging to two classes are:

X = [1  -0.5  3  -2]

T = [1  -1  1  -1]

W = [-2.5  1.75]

Problems

Page 50: Soft Computing Techniques - Unit 2

Take in two inputs (-1 or +1) and produce one output (-1 or +1). In other contexts, 0 and 1 are used.

Example: AND function. Produces +1 only if both inputs are +1.

Example: OR function. Produces +1 if either input is +1.

Related to the logical connectives from F.O.L.

Boolean Functions

Page 51: Soft Computing Techniques - Unit 2

The First Neural Networks

AND Function

(Figure: X1 and X2 each connect to Y with a weight of 1.)

AND

X1 X2 Y

1 1 1

1 0 0

0 1 0

0 0 0

Threshold(Y) = 2

Page 52: Soft Computing Techniques - Unit 2

Simple Networks

(Figure: a single unit with threshold t = 0.0 and output y, with inputs x and a constant -1 and weights W = 1.5 and W = 1.)

Page 53: Soft Computing Techniques - Unit 2

Design a neural network to recognize the problem of

X1 = [2 2], t1 = 0; X2 = [1 -2], t2 = 1; X3 = [-2 2], t3 = 0; X4 = [-1 1], t4 = 1.

Start with initial weights w = [0 0] and bias = 0.

Exercises

Page 54: Soft Computing Techniques - Unit 2

The perceptron can only model linearly separable classes, like (those described by) the following Boolean functions:

AND, OR, COMPLEMENT

It cannot model the XOR.

You can experiment with these functions in the Matlab practical lessons.

Perceptron: Limitations

Page 55: Soft Computing Techniques - Unit 2

Types of decision regions

A single node with inputs x1, x2 (plus a constant input 1) and weights w1, w2, w0 realizes the half-plane decision regions

$w_0 + w_1 x_1 + w_2 x_2 \ge 0$  and  $w_0 + w_1 x_1 + w_2 x_2 < 0$.

(Figure: "Network with a single node" shows x1, x2 and a constant 1 feeding one unit through weights w1, w2, w0. "One-hidden-layer network that realizes the convex region" shows four hidden units, one per line L1, L2, L3, L4, each connected to the output with weight 1 and an output threshold of -3.5, so that the intersection of the half-planes forms the convex region.)

Page 56: Soft Computing Techniques - Unit 2

Another type of neuron overcomes this problem by using a Gaussian activation function:

Gaussian Neurons

$f_i(\mathrm{net}_i(t)) = e^{-(\mathrm{net}_i(t) - 1)^2 / 2}$

(Figure: plot of f_i(net_i(t)) against net_i(t), rising towards its peak of 1 at net_i(t) = 1.)

Page 57: Soft Computing Techniques - Unit 2

Gaussian neurons are able to realize non-linear functions.

Therefore, networks of Gaussian units are in principle unrestricted with regard to the functions that they can realize.

The drawback of Gaussian neurons is that we have to make sure that their net input does not exceed 1.

This adds some difficulty to the learning in Gaussian networks.

57

Gaussian Neurons

Page 58: Soft Computing Techniques - Unit 2

Sigmoidal neurons accept any vectors of real numbers as input, and they output a real number between 0 and 1.

Sigmoidal neurons are the most common type of artificial neuron, especially in learning networks.

A network of sigmoidal units with m input neurons and n output neurons realizes a network function f: R^m → (0, 1)^n

58

Sigmoidal Neurons

Page 59: Soft Computing Techniques - Unit 2

The parameter τ controls the slope of the sigmoid function, while the parameter θ controls the horizontal offset of the function in a way similar to the threshold neurons.

59

Sigmoidal Neurons

$f_i(\mathrm{net}_i(t)) = \frac{1}{1 + e^{-(\mathrm{net}_i(t) - \theta)/\tau}}$

(Figure: plot of f_i(net_i(t)) against net_i(t) for particular values of θ and τ.)

Page 60: Soft Computing Techniques - Unit 2

This leads to a simplified form of the sigmoid function:

60

Sigmoidal Neurons

$S(\mathrm{net}) = \frac{1}{1 + e^{-\mathrm{net}}}$

We do not need a modifiable threshold θ, because we will use "dummy" inputs as we did for perceptrons.

The choice τ = 1 works well in most situations and results in a very simple derivative of S(net).

Page 61: Soft Computing Techniques - Unit 2

61

Sigmoidal Neurons

$S(x) = \frac{1}{1 + e^{-x}}$

$S'(x) = \frac{dS(x)}{dx} = \frac{e^{-x}}{(1 + e^{-x})^2} = \frac{1 + e^{-x} - 1}{(1 + e^{-x})^2} = \frac{1}{1 + e^{-x}} - \frac{1}{(1 + e^{-x})^2} = S(x)\,(1 - S(x))$

This result will be very useful when we develop the backpropagation algorithm.
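A short sketch (not from the slides) that checks the identity S'(x) = S(x)(1 - S(x)) numerically; the test points are arbitrary:

```python
import numpy as np

def S(x):
    """Sigmoid S(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def S_prime(x):
    """Derivative via the identity S'(x) = S(x) * (1 - S(x))."""
    return S(x) * (1.0 - S(x))

# Compare the identity with a finite-difference estimate of the derivative
x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
h = 1e-6
numeric = (S(x + h) - S(x - h)) / (2 * h)
print(np.allclose(S_prime(x), numeric))   # True
```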

Page 62: Soft Computing Techniques - Unit 2

Let the network have 3 layers: an input layer, a hidden layer, and an output layer.

Each layer can have a different number of neurons. The famous example that needs a multi-layer network is the XOR function.

Multi-layers Network

Page 63-66: Soft Computing Techniques - Unit 2 (figures)

Page 67: Soft Computing Techniques - Unit 2

The perceptron learning rule cannot be applied to a multi-layer network.

We use the BackPropagation algorithm in the learning process.

Learning rule

Page 68: Soft Computing Techniques - Unit 2

Feed-forward: input from the features is fed forward in the network, from the input layer towards the output layer.

Backpropagation: a method to assign the blame for errors to the weights; the error rate flows backwards from the output layer to the input layer (to adjust the weights in order to minimize the output error).

68

Feed-forward + Backpropagation

Page 69: Soft Computing Techniques - Unit 2

Back-propagation training algorithm illustrated:

Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error.

Backprop

Forward step: network activation and error computation

Backward step: error propagation

Page 70: Soft Computing Techniques - Unit 2

Hebbian Learning (1949):

"When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."

Weight modification rule:

Δwi,j = c·xi·xj

Eventually, the connection strength will reflect the correlation between the neurons’ outputs.

Correlation Learning

Page 71: Soft Computing Techniques - Unit 2

• Nodes compete for inputs

• Node with highest activation is the winner

• Winner neuron adapts its tuning (pattern of weights) even further towards the current input

• Individual nodes specialize to win competition for a set of similar inputs

• Process leads to most efficient neural representation of input space

• Typical for unsupervised learning

71

Competitive Learning
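The winner-take-all idea in the list above can be sketched as follows (not from the slides; the dot-product activation, the learning rate, and the example inputs are assumptions):

```python
import numpy as np

def competitive_step(W, x, lr=0.1):
    """One competitive-learning step.

    Each row of W is one node's weight vector. The node with the highest
    activation (dot product with the input) wins, and only the winner moves
    its weights further towards the current input.
    """
    activations = W @ x
    winner = np.argmax(activations)
    W[winner] += lr * (x - W[winner])
    return winner

rng = np.random.default_rng(0)
W = rng.random((3, 2))   # 3 competing nodes, 2-dimensional inputs
for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.9, 0.1])]:
    competitive_step(W, x)
print(W)
# Over many inputs, individual nodes specialize to clusters of similar inputs.
```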

Page 72: Soft Computing Techniques - Unit 2

Similar to the Adaline, the goal of the Backpropagation learning algorithm is to modify the network’s weights so that its output vector

op = (op,1, op,2, …, op,K)

is as close as possible to the desired output vector

dp = (dp,1, dp,2, …, dp,K)

for K output neurons and input patterns p = 1, …, P.

The set of input-output pairs (exemplars) {(xp, dp) | p = 1, …, P} constitutes the training set.

72

Backpropagation Learning

Page 73: Soft Computing Techniques - Unit 2

The weight change rule is

$w_{ij}^{\mathrm{new}} = w_{ij}^{\mathrm{old}} + \eta \cdot \mathrm{error} \cdot f' \cdot \mathrm{input}_i$

where η is the learning factor (< 1),

error is the error between the actual and the trained value, and

f' is the derivative of the sigmoid function = f(1 - f).

Bp Algorithm
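Read literally, the rule above amounts to a one-line update per weight. A sketch (not the slides' code) assuming a sigmoid unit so that f' = f(1 - f):

```python
def bp_weight_update(w_old, eta, error, f, input_i):
    """w_new = w_old + eta * error * f' * input_i, with f' = f * (1 - f) for a sigmoid."""
    f_prime = f * (1.0 - f)
    return w_old + eta * error * f_prime * input_i

# Example (assumed numbers): error 0.2 at a unit whose output was 0.7, input 1.0
print(bp_weight_update(w_old=0.5, eta=0.1, error=0.2, f=0.7, input_i=1.0))
# 0.5 + 0.1 * 0.2 * 0.21 * 1.0 = 0.5042
```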

Page 74: Soft Computing Techniques - Unit 2

Each observation contributes a variable amount to the output.

The scale of the contribution depends on the input.

Output errors can be blamed on the weights.

A least mean square (LMS) error function can be defined (ideally it should be zero):

E = ½ (t - y)²

Delta Rule
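For a single linear unit y = w x, the error E = ½(t - y)² has the gradient dE/dw = -(t - y) x, so gradient descent moves the weight by η(t - y) x each step. A minimal sketch (not from the slides; the training pair and learning rate are assumptions):

```python
# Delta rule for one linear unit: w <- w + eta * (t - y) * x
w = 0.0
eta = 0.1
x, t = 2.0, 1.0   # assumed training pair

for _ in range(50):
    y = w * x               # linear output
    w += eta * (t - y) * x  # step down the gradient of E

print(w, 0.5 * (t - w * x) ** 2)   # w -> 0.5, error E -> 0
```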

Page 75: Soft Computing Techniques - Unit 2

For a network with one neuron in the input layer and one neuron in the hidden layer, the following values are given:

X = 1, w1 = 1, b1 = -2, w2 = 1, b2 = 1, η = 1 and t = 1

where X is the input value,

w1 is the weight connecting input to hidden,

w2 is the weight connecting hidden to output,

b1 and b2 are the biases, and

t is the training value.

Example
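The slide does not state the transfer function; assuming a logistic sigmoid in both layers, one forward pass and one backpropagation update with the given numbers could be sketched as:

```python
import math

def sigmoid(n):
    return 1.0 / (1.0 + math.exp(-n))

# Given values from the slide
X, w1, b1, w2, b2, eta, t = 1.0, 1.0, -2.0, 1.0, 1.0, 1.0, 1.0

# Forward pass (sigmoid activation is an assumption)
a1 = sigmoid(w1 * X + b1)    # hidden output, sigmoid(-1) ~ 0.269
a2 = sigmoid(w2 * a1 + b2)   # network output, sigmoid(1.269) ~ 0.781

# Backward pass: delta = error * f'(net), with f' = f(1 - f)
delta2 = (t - a2) * a2 * (1.0 - a2)
delta1 = delta2 * w2 * a1 * (1.0 - a1)

# Weight and bias updates with learning rate eta
w2 += eta * delta2 * a1
b2 += eta * delta2
w1 += eta * delta1 * X
b1 += eta * delta1

print(round(a2, 3), round(w2, 3), round(w1, 3))
```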

Page 76: Soft Computing Techniques - Unit 2

Design a neural network to recognize the problem of

X1 = [2 2], t1 = 0; X2 = [1 -2], t2 = 1; X3 = [-2 2], t3 = 0; X4 = [-1 1], t4 = 1.

Start with initial weights w = [0 0] and bias = 0.

Exercises

Page 77: Soft Computing Techniques - Unit 2

Perform one iteration of backpropagation on a network of two layers. The first layer has one neuron with weight 1 and bias -2. The transfer function in the first layer is f = n².

The second layer has only one neuron with weight 1 and bias 1. The f in the second layer is 1/n.

The input to the network is x = 1 and t = 1.

Exercises

Page 78: Soft Computing Techniques - Unit 2

Neural Network

Construct a neural network to solve the problem:

X1 X2 Output

1.0 1.0 1

9.4 6.4 -1

2.5 2.1 1

8.0 7.7 -1

0.5 2.2 1

7.9 8.4 -1

7.0 7.0 -1

2.8 0.8 1

1.2 3.0 1

7.8 6.1 -1

Initialize the weights to 0.75, 0.5, and -0.6.

Page 79: Soft Computing Techniques - Unit 2

Neural Network

Construct a neural network to solve the XOR problem:

X1 X2 Output

1 1 0

0 0 0

1 0 1

0 1 1

Initialize the weights to -7.0, -7.0, -5.0, and -4.0.

Page 80: Soft Computing Techniques - Unit 2

(Figure: a small network for this exercise, labeled with the values -0.5, -0.5, -2, 3, -1, 1, 1, 1, -1, 1, and 0.5.)

The transfer function is a linear function.

Page 81: Soft Computing Techniques - Unit 2

Consider a transfer function f(n) = n². Perform one iteration of BackPropagation with a = 0.9 for a neural network with two neurons in the input layer and one neuron in the output layer. The input values are X = [1 -1] and t = 8. The weight values between the input and hidden layer are w11 = 1, w12 = -2, w21 = 0.2, and w22 = 0.1. The weights between the hidden and output layers are w1 = 2 and w2 = -2. The biases in the input layer are b1 = -1 and b2 = 3.

(Figure: inputs X1 and X2 connect to the hidden neurons through w11, w12, w21, w22, which connect to the output through w1 and w2.)

Page 82: Soft Computing Techniques - Unit 2

True gradient descent assumes an infinitesimally small learning rate (η). If η is too small, then learning is very slow. If it is too large, then the system's learning may never converge.

Some of the possible solutions to this problem are:

Add a momentum term to allow a large learning rate.

Use a different activation function.

Use a different error function.

Use an adaptive learning rate.

Use a good weight initialization procedure.

Use a different minimization procedure.

82

Some variations

Page 83: Soft Computing Techniques - Unit 2

Backpropagation is gradient descent search, where the height of the hills is determined by the error. But there are many dimensions to the space, one for each weight in the network.

Therefore backpropagation can find its way into local minima.

One partial solution is random re-starts: learn lots of networks starting with different random weight settings. One can then take the best network, or set up a "committee" of networks to categorise examples.

Another partial solution: momentum.

Problems with Local Minima

Page 84: Soft Computing Techniques - Unit 2

Imagine rolling a ball down a hill.

Adding Momentum

(Figure: two panels, "Without Momentum" and "With Momentum"; without momentum the ball gets stuck in a small dip before the bottom.)

Page 85: Soft Computing Techniques - Unit 2

For each weight, remember what was added in the previous epoch.

In the current epoch, add on a small amount of the previous Δ.

The amount is determined by the momentum parameter, denoted α, which is taken to be between 0 and 1.

Momentum in Backpropagation

Page 86: Soft Computing Techniques - Unit 2

If the direction of the weight change doesn't change, then the movement of the search gets bigger; the amount of additional extra is compounded in each epoch.

This may mean that narrow local minima are avoided, and may also mean that the convergence rate speeds up.

Caution: there may not be enough momentum to get out of local minima. Also, too much momentum might carry the search back out of the global minimum, into a local minimum.

How Momentum Works

Page 87: Soft Computing Techniques - Unit 2

The weight update becomes:

$\Delta w_{ij}(n+1) = \eta\,\delta_{pj}\,o_{pi} + \alpha\,\Delta w_{ij}(n)$

The momentum parameter α is chosen between 0 and 1, typically 0.9. This allows one to use higher learning rates. The momentum term filters out high-frequency oscillations on the error surface.

What would the learning rate be in a deep valley?

87

Momentum
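A sketch of this momentum update for a single weight (not from the slides); here grad_term stands for δpj·opi, and the learning rate, momentum value, and gradient sequence are made up to show the filtering effect:

```python
def momentum_step(prev_delta, grad_term, eta=0.5, alpha=0.9):
    """delta_w(n+1) = eta * grad_term + alpha * delta_w(n)."""
    return eta * grad_term + alpha * prev_delta

# An oscillating gradient signal: plain gradient steps would jump back and forth;
# the momentum term smooths (filters) the high-frequency oscillation.
delta_w = 0.0
for grad_term in [1.0, -0.8, 1.0, -0.8, 1.0]:
    delta_w = momentum_step(delta_w, grad_term)
    print(round(delta_w, 3))
```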

Page 88: Soft Computing Techniques - Unit 2

Plot the training example error versus the test example error:

If the test set error is increasing, the network is overfitting the data: it is learning idiosyncrasies in the data, not general principles. This is a big problem in Machine Learning (ANNs in particular).

Problems with Overfitting

Page 89: Soft Computing Techniques - Unit 2

It is a bad idea to use training set accuracy to terminate training.

One alternative: use a validation set. Hold back some of the training set during training, like a miniature test set (not used to train the weights at all). If the validation set error stops decreasing while the training set error continues decreasing, then it is likely that overfitting has started to occur, so stop.

Another alternative: use a weight decay factor. Take a small amount off every weight after each epoch. Networks with smaller weights aren't as highly fine-tuned (overfit).

Avoiding Overfitting
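The validation-set idea can be sketched as an early-stopping loop (a sketch only, not from the slides; the caller supplies the train_one_epoch and error_on callbacks, and the patience value is an assumption):

```python
def train_with_early_stopping(net, train_one_epoch, error_on, max_epochs=1000, patience=5):
    """Stop training when the validation error has not improved for `patience` epochs.

    `train_one_epoch(net)` runs one pass of backprop over the training set and
    `error_on(net)` returns the validation-set error; both are supplied by the caller.
    """
    best_error = float("inf")
    bad_epochs = 0
    for _ in range(max_epochs):
        train_one_epoch(net)
        valid_error = error_on(net)
        if valid_error < best_error:
            best_error, bad_epochs = valid_error, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:   # validation error stopped decreasing: stop
                break
    return net
```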

Page 90-91: Soft Computing Techniques - Unit 2 (figures)