Artificial Neural Networks, Part 2 (Asst. Prof. Dr. Asaad Sabah Hadi, 2018-2019)


Page 1:

Asst. Prof. Dr. Asaad Sabah Hadi

2018-2019

Artificial Neural Networks, Part 2

Page 2:

Neural Network Learning rules

c – learning constant
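The slide's equation was an image and did not survive the transcript. As a reconstruction, the standard general form of a weight-learning rule, with r the learning signal, is:

\[
\Delta \mathbf{w} = c \, r(\mathbf{w}, \mathbf{x}, d)\, \mathbf{x},
\qquad
\mathbf{w}^{\text{new}} = \mathbf{w}^{\text{old}} + c \, r \, \mathbf{x}
\]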

Page 3:

Hebbian Learning Rule

• The learning signal is equal to the neuron's output

FEED-FORWARD, UNSUPERVISED LEARNING
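In symbols (the slide's formula was an image; this is the standard Hebbian form, with c the learning constant from the previous slide):

\[
r = o = f(\mathbf{w}^{\top}\mathbf{x}),
\qquad
\Delta \mathbf{w} = c \, o \, \mathbf{x}
\]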

Page 4:

Features of Hebbian Learning

• Feedforward, unsupervised learning
• Hebb's postulate: “When an axon of cell A is near enough to excite cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased.”
• If the product o_i x_j (output times input) is positive, the weight increases; if it is negative, the weight decreases.

Page 5:


Page 6:

• For the same inputs, with a bipolar continuous activation function, the final updated weight is given by:
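The slide's result was an image. As a hedged reconstruction, a commonly used bipolar continuous activation function and the corresponding Hebbian update are:

\[
f(net) = \frac{2}{1 + e^{-\lambda\, net}} - 1,
\qquad
\Delta \mathbf{w} = c \, f(\mathbf{w}^{\top}\mathbf{x}) \, \mathbf{x}
\]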

Page 7:

Perceptron Learning Rule

• The learning signal is the difference between the desired and actual neuron's response
• Learning is supervised
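In symbols (the slide's formula was an image; this is the standard perceptron rule for a bipolar binary activation):

\[
r = d - o, \quad o = \operatorname{sgn}(\mathbf{w}^{\top}\mathbf{x}),
\qquad
\Delta \mathbf{w} = c\,(d - o)\,\mathbf{x}
\]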

Page 8:

Page 9:

Delta Learning Rule

• Valid only for continuous activation functions
• Used in supervised training mode
• The learning signal for this rule is called delta
• The aim of the delta rule is to minimize the error over all training patterns
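In symbols (reconstructed; this is the standard delta rule, which is why a continuous, differentiable activation f is required):

\[
r = (d - o)\, f'(net),
\qquad
\Delta \mathbf{w} = c\,(d - o)\, f'(net)\, \mathbf{x},
\quad net = \mathbf{w}^{\top}\mathbf{x}
\]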

Page 10:

Delta Learning Rule (contd.)

• The learning rule is derived from the condition of least squared error.
• Calculate the gradient vector of the error with respect to w_i.
• Minimizing the error requires the weight changes to be in the negative gradient direction.
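The derivation the slide refers to, in its standard least-squared-error form:

\[
E = \tfrac{1}{2}(d - o)^2, \quad o = f(\mathbf{w}^{\top}\mathbf{x}),
\quad
\nabla_{\mathbf{w}} E = -(d - o)\, f'(net)\, \mathbf{x},
\quad
\Delta \mathbf{w} = -c\, \nabla_{\mathbf{w}} E
\]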

Page 11:

Widrow-Hoff Learning Rule

• Also called the least-mean-square (LMS) learning rule
• Introduced by Widrow (1962); used in supervised learning
• Independent of the activation function
• A special case of the delta learning rule in which the activation function is the identity, i.e. f(net) = net
• Minimizes the squared error between the desired output value d_i and net_i
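In symbols (reconstructed from the bullets above: the delta rule with f(net) = net, so f'(net) = 1):

\[
r = d_i - \mathbf{w}^{\top}\mathbf{x},
\qquad
\Delta \mathbf{w} = c\,(d_i - \mathbf{w}^{\top}\mathbf{x})\,\mathbf{x}
\]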

Page 12:

Winner-Take-All learning rules

Page 13:

Winner-Take-All Learning Rule (contd.)

• Can be explained for a layer of neurons
• An example of competitive learning; used for unsupervised network training
• Learning is based on the premise that one of the neurons in the layer has the maximum response to the input x
• This neuron is declared the winner, and only its weight vector is updated
Page 14:

Page 15:

Summary of learning rules
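The summary table on this slide was an image; the following is reconstructed from the rules covered on the preceding slides:

Rule              Learning signal r                  Supervision    Activation
Hebbian           r = o = f(wᵀx)                     Unsupervised   Any
Perceptron        r = d - o,  o = sgn(wᵀx)           Supervised     Binary (sign)
Delta             r = (d - o) f'(net)                Supervised     Continuous only
Widrow-Hoff       r = d - wᵀx                        Supervised     Identity, f(net) = net
Winner-Take-All   competitive (winner moves to x)    Unsupervised   Any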

Page 16:

Linear Separability

• Separation of the input space into regions is based on whether the network response is positive or negative
• The line of separation is called the linearly separable line.
• Examples (a runnable sketch follows this list):
  – The AND and OR functions are linearly separable.
  – The XOR function is linearly inseparable.
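A small runnable check of the claim (the weights and search grid here are my own illustrative choices, not from the slides): one threshold neuron, i.e. one line in the input plane, realizes AND, while a brute-force search finds no line for XOR.

```python
# Illustrative check: one threshold neuron realizes AND, none realizes XOR.
import itertools

def neuron(x1, x2, w1, w2, b):
    # Threshold unit: fires iff the weighted sum exceeds zero.
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# The line x1 + x2 - 1.5 = 0 separates the AND outputs:
assert all(neuron(x1, x2, 1, 1, -1.5) == y for (x1, x2), y in AND.items())

# A coarse brute-force search over candidate lines finds none for XOR
# (and in fact none exists):
grid = [i / 2 for i in range(-6, 7)]  # weights and bias in -3.0 .. 3.0
found = any(
    all(neuron(x1, x2, w1, w2, b) == y for (x1, x2), y in XOR.items())
    for w1, w2, b in itertools.product(grid, repeat=3)
)
print("separating line found for XOR:", found)  # False
```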

Page 17:

Hebb Network

• The Hebb learning rule is the simplest one
• Learning in the brain is performed by changes in the synaptic gap
• When an axon of cell A is near enough to excite cell B and repeatedly takes part in firing it, some growth process takes place in one or both cells
• According to the Hebb rule, the weight vector increases proportionately to the product of the input and the learning signal:

wi(new) = wi(old) + xi · y

Page 18:

Flow chart of the Hebb training algorithm

1. Start: initialize the weights and bias to zero.
2. For each training pair s:t:
   a. Activate the input units: xi = si
   b. Activate the output unit: y = t
   c. Update the weights: wi(new) = wi(old) + xi · y
   d. Update the bias: b(new) = b(old) + y
3. When all pairs have been processed, stop.

Page 19:

• The Hebb rule can be used for pattern association, pattern categorization, pattern classification, and a range of other areas

• Problem to be solved: design a Hebb net to implement the OR function

Page 20:

How to solve

• Use bipolar data in place of binary data
• Initially the weights and bias are set to zero: w1 = w2 = b = 0

Truth table (bipolar OR):

x1   x2   b    y
 1    1   1    1
 1   -1   1    1
-1    1   1    1
-1   -1   1   -1

Page 21:

Inputs           y     Weight changes      Weights
x1  x2   b             Δw1  Δw2  Δb        w1  w2  b
                                            0   0  0
 1   1   1       1      1    1    1         1   1  1
 1  -1   1       1      1   -1    1         2   0  2
-1   1   1       1     -1    1    1         1   1  3
-1  -1   1      -1      1    1   -1         2   2  2
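A minimal sketch of the same computation (variable names are mine; the arithmetic reproduces the table row by row):

```python
# Hebb training of the bipolar OR function, following the flow chart above.
# Update rules from the slides: wi(new) = wi(old) + xi*y, b(new) = b(old) + y.

samples = [            # (x1, x2, target y): bipolar OR truth table
    ( 1,  1,  1),
    ( 1, -1,  1),
    (-1,  1,  1),
    (-1, -1, -1),
]

w1 = w2 = b = 0        # initial weights and bias are zero
for x1, x2, y in samples:
    w1 += x1 * y       # weight change dw1 = x1*y
    w2 += x2 * y       # weight change dw2 = x2*y
    b  += y            # bias change  db  = y
    print(f"x=({x1:2d},{x2:2d}) y={y:2d} -> w1={w1} w2={w2} b={b}")
# Prints the same trajectory as the table: (1,1,1), (2,0,2), (1,1,3), (2,2,2).
```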

Page 22:

Learning by trial-and-error

A continuous process of:

• Trial: process an input to produce an output (in ANN terms: compute the output function of a given input)
• Evaluate: evaluate this output by comparing the actual output with the expected output
• Adjust: adjust the weights.

Page 23:

Example: XOR

A network with an input layer of two neurons, a hidden layer of three neurons, and an output layer of one neuron.

Page 24:

How it works?

The same network as above: an input layer with two neurons, a hidden layer with three neurons, and an output layer with one neuron.

Page 25:

How it works?

Set initial values of the weights randomly. Input: the truth table of XOR.

Do:
  Read an input (e.g. 0 and 0)
  Compute an output (e.g. 0.60543)
  Compare it to the expected output (diff = 0.60543)
  Modify the weights accordingly
Loop until a condition is met:
  a certain number of iterations, or
  an error threshold

Page 26:

Who is concerned with NNs?

Page 27:

Design Issues

• Initial weights (small random values ∈ [-1, 1])
• Transfer function (how the inputs and the weights are combined to produce the output)
• Error estimation
• Weights adjusting
• Number of neurons
• Data representation
• Size of training set

Page 28:

Transfer Functions

• Linear: the output is proportional to the total weighted input.
• Threshold: the output is set at one of two values, depending on whether the total weighted input is greater than or less than some threshold value.
• Non-linear: the output varies continuously but not linearly as the input changes.
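Sketches of the three kinds of transfer function (the logistic sigmoid is one common non-linear choice; the function names and parameters are mine):

```python
import math

def linear(net, k=1.0):
    # Linear: output proportional to the total weighted input.
    return k * net

def threshold(net, theta=0.0):
    # Threshold: one of two values, depending on which side of theta net falls.
    return 1 if net > theta else 0

def sigmoid(net):
    # Non-linear: varies continuously but not linearly with the input.
    return 1.0 / (1.0 + math.exp(-net))
```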

Page 29:

Error Estimation

The root mean square error (RMSE) is a frequently used measure of the differences between values predicted by a model or an estimator and the values actually observed from the thing being modeled or estimated.
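For N training samples with desired outputs d_i and actual outputs o_i:

\[
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(d_i - o_i\right)^2}
\]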

Page 30:

Weights Adjusting

After each iteration, weights should be adjusted to minimize the error, either by:

– trying all possible weights (exhaustive search), or
– back-propagation.

Page 31:

Learning Paradigms

Supervised learning

Unsupervised learning

Reinforcement learning

Page 32:

Supervised learning

This is what we have seen so far! A network is fed with a set of training samples (inputs and corresponding outputs), and it uses these samples to learn the general relationship between the inputs and the outputs. This relationship is represented by the values of the weights of the trained network.

Page 33:

Unsupervised learning

• No desired output is associated with the training data!
• Faster than supervised learning
• Used to find structures within the data:
  – Clustering
  – Compression

Page 34:

Reinforcement learning

Like supervised learning, but:

• Weight adjustment is not directly related to the error value.
• The error value is used to randomly shuffle the weights!
• Learning is relatively slow due to this randomness.

Page 35:

Back Propagation

• Back-propagation is an example of supervised learning; it is used at each layer to minimize the error between the layer's response and the actual data
• The error at each hidden layer is an average of the evaluated error
• Hidden-layer networks are trained this way

Page 36:

Back Propagation

N is a neuron.
Nw is one of N's input weights.
Nout is N's output.

Nw = Nw + ΔNw
ΔNw = Nout * (1 - Nout) * NErrorFactor
NErrorFactor = NExpectedOutput - NActualOutput

This works only for the last layer, as only there do we know the actual output and the expected output.
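A minimal runnable sketch tying the XOR example (Pages 23-25) to this slide's update: a two-three-one sigmoid network trained by back-propagation. The learning rate, epoch count, seed, and variable names are my assumptions, not from the slides; with other seeds, plain gradient descent on XOR can occasionally stall in a poor local minimum.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(net):
    return 1.0 / (1.0 + np.exp(-net))

# XOR truth table: inputs and expected outputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

# 2 input neurons -> 3 hidden neurons -> 1 output neuron.
W1 = rng.uniform(-1, 1, (2, 3)); b1 = np.zeros(3)   # input -> hidden
W2 = rng.uniform(-1, 1, (3, 1)); b2 = np.zeros(1)   # hidden -> output

lr = 0.5
for epoch in range(20000):
    # Forward pass (trial).
    h = sigmoid(X @ W1 + b1)          # hidden-layer outputs
    o = sigmoid(h @ W2 + b2)          # network output

    # Evaluate: error factor = expected - actual (as on this slide).
    err = D - o

    # Adjust: propagate the error back through each layer.
    delta_o = err * o * (1 - o)               # Nout*(1-Nout)*NErrorFactor
    delta_h = (delta_o @ W2.T) * h * (1 - h)  # hidden layer's share of error

    W2 += lr * h.T @ delta_o; b2 += lr * delta_o.sum(axis=0)
    W1 += lr * X.T @ delta_h; b1 += lr * delta_h.sum(axis=0)

rmse = np.sqrt(np.mean((D - o) ** 2))
print("outputs:", o.ravel().round(3), "RMSE:", round(float(rmse), 4))
```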

Page 37:

Number of neurons

Many neurons:
• Higher accuracy
• Slower
• Risk of over-fitting: memorizing rather than understanding (the network will be useless on new problems)

Few neurons:
• Lower accuracy
• Possible inability to learn at all

Aim for an optimal number.

Page 38:

Data representation

Usually input/output data needs pre-processing:

• Pictures: pixel intensity
• Text: a pattern

Page 39:

Size of training set

• No one-fits-all formula
• Over-fitting can occur if a "good" training set is not chosen
• What constitutes a "good" training set?
  – Samples must represent the general population.
  – Samples must contain members of each class.
  – Samples in each class must contain a wide range of variations or noise effects.
• The size of the training set is related to the number of hidden neurons

Page 40:

Application Areas

• Function approximation, including time-series prediction and modeling.
• Classification, including pattern and sequence recognition, novelty detection, and sequential decision making (radar systems, face identification, handwritten-text recognition).
• Data processing, including filtering, clustering, blind source separation, and compression (data mining, e-mail spam filtering).

Page 41:

Advantages / Disadvantages

Advantages
• Adapts to unknown situations
• Powerful: it can model complex functions
• Easy to use: learns by example, and very little user domain-specific expertise is needed

Disadvantages
• Forgets
• Not exact
• Large complexity of the network structure

Page 42:

Q. How does each neuron work in ANNs? What is back-propagation?

A neuron:
• receives input from many other neurons;
• changes its internal state (activation) based on the current input;
• sends one output signal to many other neurons, possibly including its input neurons (the ANN is then a recurrent network).

Back-propagation is a type of supervised learning, used at each layer to minimize the error between the layer's response and the actual data.