Artificial Neural Network : Architectures

Debasis Samanta
IIT Kharagpur
[email protected]
27.03.2018

Soft Computing Applications




Neural network architectures

There are three fundamental classes of ANN architectures:

Single layer feed forward architecture

Multilayer feed forward architecture

Recurrent network architecture

Before discussing these architectures, we first work through the mathematical details of a single neuron. To do this, let us first consider the AND problem and its possible solution with a neural network.


The AND problem and its Neural network

The simple Boolean AND operation with two input variables x1 and x2 is shown in the truth table:

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

Here, we have four input patterns: 00, 01, 10 and 11.

For the first three patterns the output is 0, and for the last pattern the output is 1.


The AND problem and its neural network

Alternatively, the AND problem can be thought of as a perception problem, where we receive four different patterns as input and perceive the result as 0 or 1.

[Figure: a single neuron with inputs x1 and x2, weights w1 and w2, and output Y, receiving the four patterns 00, 01, 10 and 11 and producing an output of 0 or 1.]


The AND problem and its neural network

A possible neuron specification to solve the AND problem is given in the following. In this solution, when the input is 11, the weighted sum exceeds the threshold (θ = 0.9), leading to the output 1; otherwise it gives the output 0.

Here, y = ∑ wi·xi − θ, where the sum runs over i = 1, 2, and w1 = 0.5, w2 = 0.5, θ = 0.9.
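This single-neuron AND unit can be sketched in a few lines of Python. The slide gives y = ∑ wi·xi − θ with w1 = w2 = 0.5 and θ = 0.9; the sketch assumes the output fires (returns 1) exactly when the weighted sum exceeds the threshold, as the slide describes.

```python
# A minimal sketch of the single-neuron AND unit above:
# output 1 if w1*x1 + w2*x2 exceeds theta, else 0.

def and_neuron(x1, x2, w1=0.5, w2=0.5, theta=0.9):
    """Threshold unit: fires only when the weighted sum exceeds theta."""
    weighted_sum = w1 * x1 + w2 * x2
    return 1 if weighted_sum > theta else 0

for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pattern, "->", and_neuron(*pattern))
```

Only the pattern 11 produces a weighted sum (1.0) above the threshold 0.9, so only it yields output 1.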


Single layer feed forward neural network

The concept of the AND problem and its solution with a single neuron can be extended to multiple neurons.

[Figure: a single layer feed forward neural network — inputs x1, x2, x3, …, xm, each connected through weights w11, w12, w13, …, w1n (and similarly for the other inputs) to n neurons f1, f2, f3, …, fn with thresholds θ1, …, θn, producing outputs o1, o2, o3, …, on. The input side is labelled INPUT and the neuron side OUTPUT.]


Single layer feed forward neural network

We see that a layer of n neurons constitutes a single layer feed forward neural network.

It is so called because it contains a single layer of artificial neurons.

Note that the input layer and the output layer, which receive input signals and transmit output signals, are called layers, but they are really the boundary of the architecture and hence not truly layers.

The only true layer in the architecture is the layer of output neurons; the synaptic links carrying the weights connect every input to these output neurons.


Modeling SLFFNN

In a single layer neural network, the inputs x1, x2, · · · , xm are connected to the layer of neurons through the weight matrix W. The weight matrix Wm×n can be represented as follows.

W =
| w11  w12  w13  · · ·  w1n |
| w21  w22  w23  · · ·  w2n |
|  :    :    :           :  |
| wm1  wm2  wm3  · · ·  wmn |     (1)

The output of any k-th neuron can be determined as follows:

Ok = fk( ∑ wik·xi + θk ),  the sum running over i = 1, · · · , m

where k = 1, 2, 3, · · · , n and θk denotes the threshold value of the k-th neuron. Such a network is feed forward in type, or acyclic in nature, and hence the name.
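The matrix form of this forward pass can be sketched with NumPy. The shapes follow equation (1): x has length m, W has shape (m, n), and θ has length n. The slide leaves the transfer functions fk generic, so the step activation and the random weights here are illustrative assumptions.

```python
import numpy as np

# Sketch of the single layer forward pass: O_k = f_k(sum_i w_ik * x_i + theta_k).
# A step function stands in for the unspecified transfer functions f_k.

def slffnn_forward(x, W, theta, f=lambda z: (z > 0).astype(int)):
    # x @ W computes sum_i w_ik * x_i for every output neuron k at once.
    return f(x @ W + theta)

m, n = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(m, n))      # W[i, k] = w_ik
theta = rng.normal(size=n)       # theta[k] = threshold of the k-th neuron
x = np.array([1.0, 0.0, 1.0])
print(slffnn_forward(x, W, theta))   # a vector of n outputs
```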


Multilayer feed forward neural networks

This network, as its name indicates, is made up of multiple layers.

Architectures of this class, besides possessing an input and an output layer, also have one or more intermediary layers called hidden layers.

The hidden layer(s) aid in performing useful intermediary computation before directing the input to the output layer.

A multilayer feed forward network with l input neurons (the number of neurons at the first layer), mi neurons at the i-th hidden layer (i = 1, 2, · · · , p) and n neurons at the last (output) layer is written as an l − m1 − m2 − · · · − mp − n MLFFNN.


Multilayer feed forward neural networks

The figure shows a schematic diagram of a multilayer feed forward neural network with a configuration of l − m − n.

[Figure: an l − m − n multilayer feed forward neural network — inputs x1, x2, …, xp feed the first layer of neurons f11, f12, …, f1l (thresholds θ1, …, θl), whose outputs feed the hidden layer f21, f22, …, f2m (thresholds θ1, …, θm), whose outputs in turn feed the output layer f31, f32, …, f3n (thresholds θ1, …, θn), producing o1, o2, …, on. The three groups are labelled INPUT, HIDDEN and OUTPUT.]


Multilayer feed forward neural networks

In an l − m − n MLFFNN, the first (input) layer contains l neurons, the only hidden layer contains m neurons and the last (output) layer contains n neurons.

The inputs x1, x2, . . . , xp are fed to the first layer, and the weight matrices between the input and the first layer, between the first layer and the hidden layer, and between the hidden and the last (output) layer are denoted W1, W2 and W3, respectively.

Further, consider that f1, f2 and f3 are the transfer functions of the neurons lying on the first, hidden and last layers, respectively.

Likewise, the threshold value of any i-th neuron in the j-th layer is denoted by θ_i^j.

Moreover, the output of the i-th neuron in any l-th layer is represented by

O_i^l = f_i^l( X_l·W^l + θ_i^l )

where X_l is the input vector to the l-th layer.
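Applying the layer-wise rule O^l = f^l(X_l·W^l + θ^l) repeatedly gives the full forward pass. The sketch below assumes an l − m − n network with two weight matrices (input-to-hidden and hidden-to-output) and a sigmoid transfer function; the slides leave the transfer functions generic and also count the input layer as a layer, so treat these choices as illustrative assumptions.

```python
import numpy as np

# Sketch of an l-m-n MLFFNN forward pass: each layer applies
# its transfer function to (input vector @ weight matrix) + thresholds.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlffnn_forward(x, weights, thetas, fs):
    out = x
    for W, theta, f in zip(weights, thetas, fs):
        out = f(out @ W + theta)   # X_l W^l + theta^l, then the transfer function
    return out

l, m, n = 4, 3, 2                  # a 4-3-2 configuration
rng = np.random.default_rng(1)
weights = [rng.normal(size=(l, m)), rng.normal(size=(m, n))]
thetas = [rng.normal(size=m), rng.normal(size=n)]
x = rng.normal(size=l)
print(mlffnn_forward(x, weights, thetas, [sigmoid, sigmoid]))
```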


Recurrent neural network architecture

These networks differ from feed forward network architectures in the sense that there is at least one "feedback loop".

Thus, in these networks, there can exist a layer with a feedback connection.

There can also be neurons with self-feedback links, that is, the output of a neuron is fed back into itself as input.


Recurrent neural network architecture

Depending on the type of feedback loops, several recurrent neural networks are known, such as the Hopfield network and the Boltzmann machine network.


Why different types of neural network architectures?

To answer this question, let us first consider the case of a single neuron with two inputs, as shown below.

[Figure: (left) a neuron with inputs x1 and x2, weights w1 and w2, and a bias input with weight w0 and threshold ɵ, computing

f = w0·ɵ + w1·x1 + w2·x2 = b0 + w1·x1 + w2·x2;

(right) the straight line f = b0 + w1·x1 + w2·x2 drawn in the x1–x2 plane.]


Revisit of a single neural network

Note that f = b0 + w1·x1 + w2·x2 denotes a straight line in the x1–x2 plane (as shown in the figure (right) on the last slide).

Now, depending on the values of w1 and w2, we have a set of points for different values of x1 and x2.

We then say that these points are linearly separable if the straight line f separates them into two classes.

Linearly separable and non-separable points are further illustrated in the figure.

A straight line as a decision boundary for a 2-class classification problem.


AND and XOR problems

To illustrate the concept of linearly separable and non-separable tasks to be accomplished by a neural network, let us consider the AND problem and the XOR problem.

AND Problem

x1 x2 | Output (y)
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

XOR Problem

x1 x2 | Output (y)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0


AND problem is linearly separable

The AND logic:

x1 x2 | y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

[Figure: the points (0,0), (0,1) and (1,0) (y = 0) and the point (1,1) (y = 1) plotted in the x1–x2 plane, separated by the straight line f = 0.5·x1 + 0.5·x2 − 0.9.]

The AND-problem is linearly separable.
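This separation is easy to verify numerically. The sketch below checks that the sign of f = 0.5·x1 + 0.5·x2 − 0.9 places the three y = 0 patterns on one side of the line and the y = 1 pattern on the other.

```python
# The decision line for the AND problem: f > 0 on one side, f < 0 on the other.

def f(x1, x2):
    return 0.5 * x1 + 0.5 * x2 - 0.9

for (x1, x2), y in [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]:
    side = 1 if f(x1, x2) > 0 else 0
    print((x1, x2), "f =", round(f(x1, x2), 2), "class:", side, "target:", y)
```

Every pattern falls on the side of the line matching its target, which is exactly what linear separability means here.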


XOR problem is linearly non-separable

The XOR logic:

x1 x2 | y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Figure: the points (0,1) and (1,0) (y = 1) and the points (0,0) and (1,1) (y = 0) plotted in the x1–x2 plane; no single straight line can separate the two classes.]

The XOR-problem is not linearly separable.


Our observations

From the example discussed, we understand that in the AND-problem a straight line can separate the two classes, namely output 0 and output 1, for any input.

However, in the case of the XOR problem, such a line is not possible.

Note: a horizontal or a vertical line in the case of the XOR problem is not admissible, because then it would completely ignore one input.


Example

So, for a 2-class classification problem, if there is a straight line which acts as a decision boundary, then we say such a problem is linearly separable; otherwise, it is non-linearly separable.

The same concept can be extended to an n-class classification problem.

Such a problem can be represented in an n-dimensional space, and the boundary separating the given sets would have n − 1 dimensions.

In fact, any linearly separable problem can be solved with a single layer feed forward neural network, for example the AND problem.

On the other hand, if the problem is non-linearly separable, then a single layer neural network cannot solve it. To solve such a problem, a multilayer feed forward neural network is required.


Example: Solving XOR problem

[Figure: a neural network for the XOR-problem — inputs X1 and X2 feed two hidden units with thresholds 0.5 and 1.5, each receiving both inputs with weight 1; the hidden units feed the output unit f (threshold 0.5) with weights 1 and −1.]

Neural network for the XOR-problem.
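The figure's weights and thresholds can be read as a two-layer threshold network: one hidden unit acting as an OR gate (threshold 0.5), one acting as an AND gate (threshold 1.5), and an output computing "OR but not AND", which is XOR. This reading of the wiring is inferred from the figure, so treat it as an assumption.

```python
# Sketch of the two-layer XOR network from the figure:
# weights of 1 everywhere except the -1 from the AND unit to the output.

def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)       # fires for 01, 10, 11: an OR gate
    h2 = step(x1 + x2 - 1.5)       # fires only for 11: an AND gate
    return step(h1 - h2 - 0.5)     # fires when OR holds but AND does not: XOR

for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pattern, "->", xor_net(*pattern))
```

The hidden layer remaps the four patterns so that the previously inseparable classes become linearly separable for the output unit, which is why the multilayer network succeeds where the single layer failed.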


Dynamic neural network

In some cases, the output needs to be compared with its target value to determine the error, if any.

Based on this category of applications, a neural network can be a static neural network or a dynamic neural network.

In a static neural network, the error in prediction is neither calculated nor fed back to update the neural network.

On the other hand, in a dynamic neural network, the error is determined and then fed back to the network to modify its weights (or architecture, or both).


Dynamic neural network

Framework of a dynamic neural network:

[Figure: inputs feed the neural network architecture, which produces an output; the output is compared with the target in an error calculation block, and the error is fed back to adjust the weights / architecture.]


Dynamic neural network

From the above discussions, we conclude that:

Linearly separable problems are solved using a single layer feed forward neural network.

Non-linearly separable problems are solved using multilayer feed forward neural networks.

Problems requiring error calculation are solved using recurrent neural networks as well as dynamic neural networks.


Any questions??
