Artificial Neural Network : Architectures
Debasis Samanta
IIT Kharagpur
27.03.2018
Debasis Samanta (IIT Kharagpur) Soft Computing Applications 27.03.2018 1 / 27
Neural network architectures
There are three fundamental classes of ANN architectures:
Single layer feed forward architecture
Multilayer feed forward architecture
Recurrent networks architecture
Before discussing all these architectures, we first discuss the mathematical details of a neuron at a single level. To do this, let us first consider the AND problem and its possible solution with a neural network.
The AND problem and its Neural network
The simple Boolean AND operation with two input variables x1 and x2 is shown in the truth table.
Here, we have four input patterns: 00, 01, 10 and 11.
For the first three patterns the output is 0, and for the last pattern the output is 1.
The AND problem and its neural network
Alternatively, the AND problem can be thought of as a perception problem, where we have to receive four different patterns as input and perceive the result as 0 or 1.
[Figure: a single neuron with inputs x1 and x2, weights w1 and w2, and output Y; the four input patterns 00, 01, 10 and 11 map to the outputs 0, 0, 0 and 1.]
The AND problem and its neural network
A possible neuron specification to solve the AND problem is given in the following. In this solution, when the input is 11, the weighted sum exceeds the threshold (θ = 0.9), leading to the output 1; otherwise it gives the output 0.
Here, y = ∑ wi xi − θ (summing over i = 1, 2), where w1 = 0.5, w2 = 0.5 and θ = 0.9.
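This neuron can be sketched in a few lines of Python; the hard-threshold activation and the name `and_neuron` are our own illustrative choices, not prescribed by the slide:

```python
def and_neuron(x1, x2, w1=0.5, w2=0.5, theta=0.9):
    """Single neuron with a hard-threshold activation: output 1 only
    when the weighted sum of the inputs exceeds theta."""
    weighted_sum = w1 * x1 + w2 * x2
    return 1 if weighted_sum > theta else 0

# the four AND patterns: only 11 exceeds the threshold 0.9
for pattern in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pattern, "->", and_neuron(*pattern))
```

Only for the input 11 does the weighted sum (1.0) exceed θ = 0.9, so the neuron outputs 1 for that pattern alone.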
Single layer feed forward neural network
The concept of the AND problem and its solution with a single neuron can be extended to multiple neurons.
[Figure: a single layer feed forward neural network. Inputs x1, x2, x3, …, xm are connected through weights w11, w12, w13, …, w1n (and likewise for the other inputs) to n output neurons f1, f2, f3, …, fn with thresholds θ, producing outputs o1, o2, o3, …, on.]
Single layer feed forward neural network
We see that a layer of n neurons constitutes a single layer feed forward neural network.
It is so called because it contains a single layer of artificial neurons.
Note that although the input layer and the output layer, which receive input signals and transmit output signals, are called layers, they are actually the boundary of the architecture and hence not truly layers.
The only layer in the architecture consists of the synaptic links carrying the weights that connect every input to the output neurons.
Modeling SLFFNN
In a single layer neural network, the inputs x1, x2, · · · , xm are connected to the layer of neurons through the weight matrix W. The weight matrix W of order m × n can be represented as follows.

W =
| w11 w12 w13 · · · w1n |
| w21 w22 w23 · · · w2n |
| ...  ...  ...     ... |
| wm1 wm2 wm3 · · · wmn |    (1)

The output of any k-th neuron can be determined as follows.

Ok = fk( ∑ wik xi + θk ), the sum running over i = 1 to m,

where k = 1, 2, 3, · · · , n and θk denotes the threshold value of the k-th neuron. Such a network is feed forward in type, or acyclic in nature, and hence the name.
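The output equation above can be sketched as follows; the sigmoid transfer function and the numeric weight and threshold values are assumptions made purely for illustration:

```python
import math

def slffnn_output(x, W, theta, f=None):
    """O_k = f_k(sum_i w_ik * x_i + theta_k) for k = 1..n, where W is
    the m-by-n weight matrix and theta holds the n threshold values."""
    if f is None:
        f = lambda s: 1 / (1 + math.exp(-s))  # assumed sigmoid transfer
    m, n = len(W), len(W[0])
    return [f(sum(W[i][k] * x[i] for i in range(m)) + theta[k])
            for k in range(n)]

# m = 2 inputs, n = 3 output neurons; all values illustrative only
W = [[0.2, -0.5, 0.1],
     [0.4,  0.3, -0.2]]
outputs = slffnn_output([1.0, 0.5], W, [0.0, 0.1, -0.1])
print(outputs)
```

Each output neuron k sees the same input vector but its own column of W and its own threshold θk, which is exactly the acyclic structure described above.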
Multilayer feed forward neural networks
This network, as its name indicates, is made up of multiple layers.
Thus, architectures of this class, besides possessing an input and an output layer, also have one or more intermediary layers called hidden layers.
The hidden layer(s) aid in performing useful intermediary computation before directing the input to the output layer.
A multilayer feed forward network with l input neurons (the number of neurons at the first layer), mi neurons at the i-th hidden layer (i = 1, 2, · · · , p) and n neurons at the last (output) layer is written as an l − m1 − m2 − · · · − mp − n MLFFNN.
Multilayer feed forward neural networks
The figure shows a schematic diagram of a multilayer feed forward neural network with a configuration of l − m − n.
[Figure: an l − m − n multilayer feed forward neural network. Inputs x1, x2, …, xp feed the first layer of neurons f11, f12, …, f1l (thresholds θ1, θ2, …, θl); the first layer feeds the hidden layer f21, f22, …, f2m (thresholds θ1, …, θm), which in turn feeds the output layer f31, f32, …, f3n (thresholds θ1, …, θn), producing outputs o1, o2, …, on.]
Multilayer feed forward neural networks
In an l − m − n MLFFNN, the first (input) layer contains l neurons, the only hidden layer contains m neurons, and the last (output) layer contains n neurons.
The inputs x1, x2, · · · , xp are fed to the first layer; the weight matrices between the input and the first layer, between the first layer and the hidden layer, and between the hidden layer and the last (output) layer are denoted by W1, W2 and W3, respectively.
Further, consider that f1, f2 and f3 are the transfer functions of the neurons lying on the first, hidden and last layers, respectively.
Likewise, the threshold value of any i-th neuron in the j-th layer is denoted by θ^j_i.
Moreover, the output of any i-th neuron in the l-th layer is represented by O^l_i = f^l_i( ∑ X^l W^l + θ^l_i ), where X^l is the input vector to the l-th layer.
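The layer-by-layer computation just described can be sketched as below, treating each weight matrix/threshold pair as one layer whose output becomes the input vector to the next. The sigmoid transfer function, the 2-3-1 configuration and all weight values are assumptions for illustration only:

```python
import math

def layer(x, W, theta, f):
    """One layer: o_j = f(sum_i x_i * W[i][j] + theta_j)."""
    return [f(sum(xi * W[i][j] for i, xi in enumerate(x)) + theta[j])
            for j in range(len(theta))]

def mlffnn(x, weights, thetas, f=None):
    """Forward pass: feed the input through every layer in turn."""
    if f is None:
        f = lambda s: 1 / (1 + math.exp(-s))  # assumed sigmoid transfer
    for W, theta in zip(weights, thetas):
        x = layer(x, W, theta, f)
    return x

# a 2-3-1 configuration with arbitrary illustrative weights
W1 = [[0.5, -0.2, 0.1],
      [0.3,  0.8, -0.4]]
W2 = [[0.6], [-0.7], [0.2]]
out = mlffnn([1.0, 0.0], [W1, W2], [[0.0, 0.0, 0.0], [0.1]])
print(out)
```

The key point is the chaining: the output vector of each layer is the input vector X^l of the next, which is what distinguishes a multilayer network from a single layer one.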
Recurrent neural network architecture
These networks differ from feed forward network architectures in the sense that there is at least one "feedback loop".
Thus, in these networks, there could exist one layer with feedback connections.
There could also be neurons with self-feedback links, that is, the output of a neuron is fed back into itself as input.
Recurrent neural network architecture
Depending on the different types of feedback loops, several recurrent neural networks are known, such as the Hopfield network, the Boltzmann machine network, etc.
Why different types of neural network architectures?
To give the answer to this question, let us first consider the case of a single neural network with two inputs, as shown below.
[Figure: (left) a neuron with inputs x1, x2 and weights w0, w1, w2 computing f = w0 θ + w1x1 + w2x2 = b0 + w1x1 + w2x2; (right) the straight line f = b0 + w1x1 + w2x2 drawn in the x1-x2 plane.]
Revisit of a single neural network
Note that f = b0 + w1x1 + w2x2 denotes a straight line in the plane of x1-x2 (as shown in the right figure in the last slide).
Now, depending on the values of w1 and w2, we have a set of points for different values of x1 and x2.
We then say that these points are linearly separable if the straight line f separates these points into two classes.
Linearly separable and non-separable points are further illustrated in the figure.
A straight line as a decision boundary for a 2-classification problem.
AND and XOR problems
To illustrate the concept of linearly separable and non separable tasksto be accomplished by a neural network, let us consider the case ofAND problem and XOR problem.
AND Problem
x1 x2 | Output (y)
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

XOR Problem
x1 x2 | Output (y)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0
AND problem is linearly separable
The AND Logic
x1 x2 | Output (y)
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

[Figure: the four patterns plotted in the x1-x2 plane; (0,0), (0,1) and (1,0) have y = 0, while (1,1) has y = 1, and the straight line f = 0.5 x1 + 0.5 x2 - 0.9 separates the two classes.]

The AND-problem is linearly separable.
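The separating line f = 0.5 x1 + 0.5 x2 - 0.9 can be checked numerically; `decision` is a hypothetical helper name introduced here for illustration:

```python
def decision(x1, x2):
    # the separating line for the AND problem: f = 0.5*x1 + 0.5*x2 - 0.9
    return 0.5 * x1 + 0.5 * x2 - 0.9

# points with f > 0 fall on the y = 1 side of the line, the rest on y = 0
for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = 1 if decision(*p) > 0 else 0
    print(p, "->", y)
```

Only (1,1) lands on the positive side of the line, reproducing the AND truth table and confirming linear separability.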
XOR problem is linearly non-separable
XOR Problem
x1 x2 | Output (y)
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

[Figure: the four patterns plotted in the x1-x2 plane; (0,0) and (1,1) have y = 0, while (0,1) and (1,0) have y = 1, and no single straight line can separate the two classes.]

The XOR-problem is linearly non-separable.
Our observations
From the example discussed, we understand that in the AND-problem a straight line can separate the two classes, namely the output 0 or 1 for any input.
However, in the case of the XOR problem, such a line is not possible.
Note: a horizontal or a vertical line in the case of the XOR problem is not admissible, because in that case it completely ignores one input.
Example
So, for a 2-classification problem, if there is a straight line which acts as a decision boundary, then we can say such a problem is linearly separable; otherwise, it is non-linearly separable.
The same concept can be extended to an n-classification problem.
Such a problem can be represented in an n-dimensional space, and the boundary would have n − 1 dimensions, separating the given sets.
In fact, any linearly separable problem can be solved with a single layer feed forward neural network, for example, the AND problem.
On the other hand, if the problem is non-linearly separable, then a single layer neural network cannot solve it. To solve such a problem, a multilayer feed forward neural network is required.
Example: Solving XOR problem
[Figure: a two-layer network solving the XOR problem. Inputs X1 and X2 connect with weights of 1 to two hidden neurons with thresholds 0.5 and 1.5; the hidden neurons connect to the output neuron f (threshold 0.5) with weights 1 and -1.]

Neural network for the XOR-problem.
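One way to read the figure's weights (1, 1, 1, 1, 1, -1) and thresholds (0.5, 1.5, 0.5) is the classic OR/AND construction below; the assignment of each threshold to a particular neuron is our interpretation of the figure:

```python
def step(s):
    # hard-threshold activation
    return 1 if s > 0 else 0

def xor_net(x1, x2):
    # hidden neuron 1 (threshold 0.5) fires for x1 OR x2
    h1 = step(1 * x1 + 1 * x2 - 0.5)
    # hidden neuron 2 (threshold 1.5) fires only for x1 AND x2
    h2 = step(1 * x1 + 1 * x2 - 1.5)
    # output neuron (threshold 0.5, weights 1 and -1): h1 AND NOT h2, i.e. XOR
    return step(1 * h1 - 1 * h2 - 0.5)

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, "->", xor_net(*p))
```

The hidden layer first bends the input space (OR and AND of the inputs), after which a single line in the hidden space can separate the two classes, which is exactly why the multilayer network succeeds where a single layer fails.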
Dynamic neural network
In some cases, the output needs to be compared with its target value to determine the error, if any.
Based on this category of applications, a neural network can be a static neural network or a dynamic neural network.
In a static neural network, the error in prediction is neither calculated nor fed back for updating the neural network.
On the other hand, in a dynamic neural network, the error is determined and then fed back to the network to modify its weights (or architecture, or both).
Dynamic neural network
Framework of dynamic neural network
[Figure: inputs pass through the neural network architecture to produce an output; the output is compared against the target to calculate the error, which is fed back to adjust the weights / architecture.]
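As a minimal sketch of this feedback cycle, the perceptron-style rule below determines the error against the target and feeds it back to adjust the weights and threshold; it is only one of many possible update schemes, chosen here for illustration:

```python
def train_step(w, theta, x, target, lr=0.1):
    """One dynamic-network cycle: compute the output, determine the
    error against the target, and feed it back to adjust the weights
    and threshold (perceptron-style update, illustrative only)."""
    y = 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta > 0 else 0
    error = target - y
    w = [wi + lr * error * xi for wi, xi in zip(w, x)]
    theta = theta - lr * error
    return w, theta, error

# train on the AND patterns until the error feedback corrects the weights
w, theta = [0.0, 0.0], 0.0
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
for _ in range(100):
    for x, t in data:
        w, theta, _ = train_step(w, theta, list(x), t)
print(w, theta)
```

Because the AND problem is linearly separable, this error-feedback loop settles on weights that classify all four patterns correctly; a static network, by contrast, would keep whatever weights it started with.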
Dynamic neural network
From the above discussions, we conclude that:
Linearly separable problems are solved using a single layer feed forward neural network.
Non-linearly separable problems are solved using multilayer feed forward neural networks.
Problems involving error calculation are solved using recurrent neural networks as well as dynamic neural networks.
Any questions??