Neural Network Applications In Signal Processing Dr. Yogananda Isukapalli


Page 1: P11 Neural Network Applications - web.ece.ucsb.edu

Neural Network Applications In

Signal Processing

Dr. Yogananda Isukapalli

Page 2: P11 Neural Network Applications - web.ece.ucsb.edu

2

Neural Network Applications in Signal Processing

• Signal Detection by Noise Reduction
• Nonlinear Time Series Prediction
• Target Classification in RADAR, SONAR etc.
• Transient Signal Detection & Classification
• Frequency Estimation & Tracking
• Nonlinear System Modeling
• Detector for Different Signal Constellations in Digital Communication
• Speech/Speaker Recognition
• Data Compression (Speech/Image)
• Image / Object Recognition & Processing
• Principal Component Extraction & Subspace Estimation

Page 3: P11 Neural Network Applications - web.ece.ucsb.edu

3

Neural Network Architectures and Algorithms

• Multilayered Feedforward Networks

• Backpropagation Algorithms

• Radial Basis Function Networks

• Recurrent Neural Networks

• Real-time Recurrent Learning Algorithm

• Hopfield Networks

• Unsupervised Learning Algorithms

Page 4: P11 Neural Network Applications - web.ece.ucsb.edu

4

Neuron Representation and Typical Output Function

Page 5: P11 Neural Network Applications - web.ece.ucsb.edu

5

The Multilayer Perceptron

• Initialize weights and thresholds: set all weights and thresholds to small random values.

• Present input and desired output:

Present input Xp = x0, x1, x2, ..., xn-1

Target output Tp = t0, t1, t2, ..., tm-1

where 'n' is the number of input nodes and 'm' is the number of output nodes. Set w0 to be −θ, the bias, and x0 to be always 1.
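As a minimal illustration of this initialization and bias convention (not from the original slides; the sizes and names below are hypothetical), the step might look like this in Python:

```python
import numpy as np

rng = np.random.default_rng(0)

n, m = 4, 2                                   # hypothetical numbers of input and output nodes
theta = rng.uniform(0.0, 0.1, size=m)         # small random thresholds
W = rng.uniform(-0.1, 0.1, size=(m, n + 1))   # small random weights; column 0 is the bias weight
W[:, 0] = -theta                              # w0 = -theta plays the role of the bias

def present(x_p):
    """Augment the input pattern with x0 = 1 so the bias is absorbed into the weights."""
    x_aug = np.concatenate(([1.0], x_p))      # x0 is always 1
    return W @ x_aug                          # total activation of each output node
```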

Page 6: P11 Neural Network Applications - web.ece.ucsb.edu

6

Multilayer Feed-Forward Networks

Page 7: P11 Neural Network Applications - web.ece.ucsb.edu

7

Multilayer Feed-Forward Networks

A mapping from one multi-dimensional vector space into another multi-dimensional vector space:

Consider a pair of vectors $(R^k, S^k)$, $k = 1, \ldots, p$:

$R^k = (r_1^k, \ldots, r_m^k)^T$ : input to the network

$S^k = (s_1^k, \ldots, s_n^k)^T$ : desired output

For each input, the output of the network is $Y^k = Y(W, R^k) = (y_1^k, \ldots, y_n^k)^T$,

where $Y(W, \cdot)$ is the output of the network parameterized by its connection matrix W.

Page 8: P11 Neural Network Applications - web.ece.ucsb.edu

8

Multilayer Feed-Forward Networks

• If the error incurred by the network is defined as the mean square error E, then there exists a multi-layer feed-forward neural network W such that:

$$\min_W E[W] = \min_W \frac{1}{2} \sum_{k=1}^{p} \left[ Y(W, R^k) - S^k \right]^T \left[ Y(W, R^k) - S^k \right]$$
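As a small illustration of this error measure (a sketch assuming the network outputs and targets are stacked row-wise; the function name is hypothetical):

```python
import numpy as np

def mse_criterion(Y, S):
    """E[W] = 1/2 * sum_k [Y(W,R^k) - S^k]^T [Y(W,R^k) - S^k],
    with Y and S of shape (p, n): one row per training pair."""
    diff = Y - S
    return 0.5 * np.sum(diff * diff)
```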

Page 9: P11 Neural Network Applications - web.ece.ucsb.edu

9

Neural Networks: Learning Algorithms

• Hebbian Learning

• Widrow-Hoff Learning

• Perceptron Learning

• Backpropagation Learning

• Instar and Outstar (Grossberg) Learning

• Kohonen Learning

• Kosko / Klopf Learning

• Other Learning algorithms

Page 10: P11 Neural Network Applications - web.ece.ucsb.edu

10

Back-propagation Learning Algorithm

Assume: M layers
N_l neurons in layer l, l = 1, 2, ..., M
N_0 = m inputs to the first hidden layer
N_M = n neurons in the output layer

The total activation of the ith neuron in layer l, denoted by x_{l,i}, is computed as follows:

$$x_{l,i} = \sum_{j=1}^{N_{l-1}} W_{l,i,j} \, y_{l-1,j} + W_{l,i,0}$$

where
W_{l,i,j}: the connection weight from neuron j in layer l-1 to neuron i in layer l
W_{l,i,0}: the threshold of the neuron

Page 11: P11 Neural Network Applications - web.ece.ucsb.edu

11

Back-propagation Learning Algorithm

$$y_{l,i} = f(x_{l,i}) = \frac{1}{1 + \exp(-k \, x_{l,i})}, \qquad k > 0$$

f is a continuous, differentiable, monotonically increasing (sigmoid) function of its total activation.

• When pattern k is presented to the network, the inputs are $y_{0,i}^k = r_i^k$ for i = 1, ..., N_0

• The neurons in the first hidden layer compute their outputs and propagate them to the layer above

• The output of the network is given by the neurons at the very last layer, i.e. $y_{M,i}$, i = 1, 2, ..., N_M
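A minimal sketch of this layer-by-layer forward pass (assuming the weight layout below, with the threshold stored in column 0; the helper names are mine, not the course's):

```python
import numpy as np

def sigmoid(x, k=1.0):
    """y = 1 / (1 + exp(-k*x)), k > 0, as on the slide."""
    return 1.0 / (1.0 + np.exp(-k * x))

def forward(weights, r):
    """Propagate input pattern r through M layers.

    weights[l] has shape (N_{l+1}, N_l + 1); column 0 holds the threshold,
    so each layer input is augmented with a constant 1.
    Returns the list of layer outputs y_0, ..., y_M.
    """
    ys = [np.asarray(r, dtype=float)]               # y_{0,i} = r_i
    for W in weights:
        x = W @ np.concatenate(([1.0], ys[-1]))     # total activation x_{l,i}
        ys.append(sigmoid(x))                       # y_{l,i} = f(x_{l,i})
    return ys
```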

Page 12: P11 Neural Network Applications - web.ece.ucsb.edu

12

Back-propagation Learning Algorithm

• The goal is to find the best set of weights W such that the following error measure is minimized:

$$\min_W E = \min_W \sum_{k=1}^{p} E^k = \min_W \sum_{k=1}^{p} \sum_{i=1}^{n} \left( y_{M,i}^k - S_i^k \right)^2$$

The partial derivative of the error with respect to the weights of the neurons in the output layer is:

$$\frac{\partial E}{\partial W_{M,i,j}} = \sum_{k=1}^{p} \frac{\partial E^k}{\partial y_{M,i}} \cdot \frac{\partial y_{M,i}}{\partial x_{M,i}} \cdot \frac{\partial x_{M,i}}{\partial W_{M,i,j}} = \sum_{k=1}^{p} \left( y_{M,i}^k - S_i^k \right) \frac{d y_{M,i}^k}{d x_{M,i}} \, y_{M-1,j}^k$$

Page 13: P11 Neural Network Applications - web.ece.ucsb.edu

13

Back-propagation Learning Algorithm

• Note that

$$\frac{d y_{l,i}^k}{d x_{l,i}} = f'(x_{l,i}^k)$$

• Define $\delta_{M,i}^k$ as

$$\delta_{M,i}^k = \left( y_{M,i}^k - S_i^k \right) \cdot f'(x_{M,i}^k)$$

Page 14: P11 Neural Network Applications - web.ece.ucsb.edu

14

Back-propagation Learning Algorithm

The partial derivative of the error with respect to the weights of the neurons in the output layer is:

$$\frac{\partial E}{\partial W_{M,i,j}} = \sum_{k=1}^{p} \frac{\partial E^k}{\partial y_{M,i}} \cdot \frac{\partial y_{M,i}}{\partial x_{M,i}} \cdot \frac{\partial x_{M,i}}{\partial W_{M,i,j}} = \sum_{k=1}^{p} \left( y_{M,i}^k - S_i^k \right) \frac{d y_{M,i}^k}{d x_{M,i}} \, y_{M-1,j}^k$$

Page 15: P11 Neural Network Applications - web.ece.ucsb.edu

15

Back-propagation Learning Algorithm

• $\delta_{M,i}^k$ is generalized to the other layers by defining

$$\delta_{l,i}^k = f'(x_{l,i}^k) \sum_{j=1}^{N_{l+1}} \delta_{l+1,j}^k \, W_{l+1,j,i} \qquad (**)$$

for l = M-1, ..., 1 and i = 1, ..., N_l.

This makes it clear how the errors at one layer are propagated backwards into the previous layer.

• By combining (*) and (**), the partial derivative of the error with respect to any of the weights in the hidden layers can be written as:

$$\frac{\partial E}{\partial W_{l,i,j}} = \sum_{k=1}^{p} \delta_{l,i}^k \, y_{l-1,j}^k$$
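The δ recursion and the resulting gradients can be sketched as follows (an illustrative implementation assuming sigmoid units with gain k, so f'(x) = k·y·(1−y); the function and argument names are hypothetical):

```python
import numpy as np

def backprop_gradients(weights, ys, s, k=1.0):
    """Sketch of the delta recursion from the slides (illustrative, not the
    original course code).  weights[l] has shape (N_{l+1}, N_l + 1) with the
    threshold in column 0; ys = [y_0, ..., y_M] are the layer outputs for one
    pattern (e.g. from forward() above); s is the target vector."""
    grads = [np.zeros_like(W) for W in weights]
    # Output layer: delta_{M,i} = (y_{M,i} - s_i) * f'(x_{M,i})
    y = ys[-1]
    delta = (y - s) * k * y * (1.0 - y)
    for l in range(len(weights) - 1, -1, -1):
        y_prev = np.concatenate(([1.0], ys[l]))       # constant 1 for the threshold term
        grads[l] += np.outer(delta, y_prev)           # dE/dW_{l,i,j} = delta_{l,i} * y_{l-1,j}
        if l > 0:
            back = weights[l][:, 1:].T @ delta        # sum_j delta_{l+1,j} * W_{l+1,j,i}
            delta = back * k * ys[l] * (1.0 - ys[l])  # next delta via f'(x)
    return grads
```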

Page 16: P11 Neural Network Applications - web.ece.ucsb.edu

16

Back-propagation Learning Algorithm

• To "change" the weights by a small amount in the direction which causes the error to be reduced, the following update procedure is introduced:

$$W_{l,i,j}(t+1) = W_{l,i,j}(t) - \alpha_t \frac{\partial E(t)}{\partial W_{l,i,j}}$$

The procedure is shown to converge provided that the learning rates $\alpha_t$ are chosen such that

$$\sum_{t=0}^{\infty} \alpha_t = +\infty \quad \text{and} \quad \sum_{t=0}^{\infty} \alpha_t^2 < +\infty$$

These conditions are satisfied by the particular choice of $\alpha_t = \alpha_0 / t$ for some $\alpha_0 > 0$, but the convergence may become very slow.
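A one-line sketch of this decaying learning-rate choice (the value of alpha0 below is illustrative only):

```python
def alpha_schedule(t, alpha0=0.5):
    """alpha_t = alpha0 / t: satisfies sum alpha_t = inf and sum alpha_t^2 < inf."""
    return alpha0 / max(t, 1)
```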

Page 17: P11 Neural Network Applications - web.ece.ucsb.edu

17

Back-propagation Learning Algorithm

• Increasing the rates of convergence

A. If we use a constant learning rate $\alpha_t = \alpha$ such that $0.1 < \alpha < 0.9$, the algorithm still approximates the steepest descent path, and should produce a reasonably good solution.

B. To reduce the effects of a very shallow error surface, a momentum term $\beta \Delta W$ is added as follows:

$$\Delta W_{l,i,j}(t+1) = -\alpha (1 - \beta) \frac{\partial E(t)}{\partial W_{l,i,j}} + \beta \, \Delta W_{l,i,j}(t)$$

where $\beta$ is the amount of momentum that should be added.

Page 18: P11 Neural Network Applications - web.ece.ucsb.edu

18

Back-propagation Learning Algorithm

In practice $\beta$ should range from 0 to 0.4. Rewriting the above equation:

$$\Delta W_{l,i,j}(t+1) = -\alpha (1 - \beta) \sum_{k=0}^{t} \beta^{\,t-k} \frac{\partial E(k)}{\partial W_{l,i,j}}$$

The change to the weights becomes the exponentially weighted sum of the current and the past errors' partial derivatives with respect to the weight being changed.
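A minimal sketch of the momentum update above (the default values of alpha and beta are illustrative only, with beta in the 0-0.4 range suggested on the slide):

```python
import numpy as np

def momentum_step(W, dE_dW, prev_dW, alpha=0.5, beta=0.3):
    """One weight update with momentum, as sketched on the slides:
    dW(t+1) = -alpha*(1-beta)*dE/dW + beta*dW(t)."""
    dW = -alpha * (1.0 - beta) * dE_dW + beta * prev_dW
    return W + dW, dW
```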

Page 19: P11 Neural Network Applications - web.ece.ucsb.edu

19

Kolmogorov’s Mapping Neural Network

Existence Theorem: (How many hidden layers are needed?)

Given any continuous function F: I^d → R^c, F(x) = y, where I is the closed interval [0, 1] (and therefore I^d is the d-dimensional unit cube), F can be implemented exactly by a three-layer neural network having d processing elements in the input layer, (2d+1) processing elements in the single (hidden) layer, and c processing elements in the output layer. The input layer serves merely to 'hold' or freeze the input and distribute each input to the hidden layer.

Page 20: P11 Neural Network Applications - web.ece.ucsb.edu

20

RADIAL BASIS FUNCTION NETWORKS

• Examples of Radial Basis Functions

$$\phi(X) = \exp\!\left(-\frac{X^2}{2\sigma^2}\right) \qquad \text{Gaussian}$$

$$\phi(X) = X^2 \log(X) \qquad \text{Thin plate spline}$$

$$\phi(X) = \left( X^2 + \sigma^2 \right)^{1/2} \qquad \text{Multi-quadratic function}$$

$$\phi(X) = \left( X^2 + \sigma^2 \right)^{-1/2} \qquad \text{Inverse multi-quadratic function}$$

X is the Euclidean norm of the difference between the input vector and the RBF center for that node.
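Illustrative implementations of the four basis functions listed above (a sketch; the Gaussian width convention follows the form shown, and the thin-plate spline is set to 0 at r = 0 by convention):

```python
import numpy as np

def gaussian(r, sigma=1.0):
    return np.exp(-r**2 / (2.0 * sigma**2))

def thin_plate_spline(r):
    """r^2 * log(r); conventionally 0 at r = 0."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = np.zeros_like(r)
    nz = r > 0
    out[nz] = r[nz] ** 2 * np.log(r[nz])
    return out

def multiquadric(r, sigma=1.0):
    return np.sqrt(r**2 + sigma**2)

def inverse_multiquadric(r, sigma=1.0):
    return 1.0 / np.sqrt(r**2 + sigma**2)

# r is the Euclidean distance between an input vector x and an RBF center c,
# e.g. r = np.linalg.norm(x - c)
```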

Page 21: P11 Neural Network Applications - web.ece.ucsb.edu

21

RADIAL BASIS FUNCTION NETWORKS

Radial Basis Function Network
A Radial Basis Neuron

Page 22: P11 Neural Network Applications - web.ece.ucsb.edu

22

Approximation by Radial Basis Functions

Given N different input vectors $X_i \in R^n$, i = 1, 2, ..., N, and the corresponding N real outputs $y_i \in R$, i = 1, 2, ..., N, find a function F: R^n → R satisfying the interpolation conditions

F(X_i) = y_i,   i = 1, 2, ..., N

The solution to the above problem is a function which is a linear combination of N radial basis functions, i.e.,

$$F(x) = \sum_{i=1}^{N} C_i \, \Phi\!\left( \| x - X_i \| \right)$$

Page 23: P11 Neural Network Applications - web.ece.ucsb.edu

23

Neural Network Architecture Using Radial Basis Functions

Parallel, network-like structure

Points in the data space are chosen as basis functions for interpolation

The outputs are the weighted and summed quantities of the outputs of the RBF centers, i.e.,

$$O_j = \sum_{i=1}^{n_c} W_{ji} \, \phi_i$$

where
n_c: the number of RBF centers
O_j: the output of the jth output node

Page 24: P11 Neural Network Applications - web.ece.ucsb.edu

24

• Implementing RBF Networks

1. A Gaussian kernel is typically chosen

2. The spread σ (of the Gaussian kernel) is chosen by a K-nearest-neighbor rule

3. Fix the RBF centers by choosing a subset of the data vectors that represents the entire data set; the vectors are generally obtained by a K-means clustering algorithm

4. Compute the weights to the output layer by W = A⁺D, where A represents the output of the first-layer units for all the available training patterns, D represents the desired output patterns, and A⁺ is the Moore-Penrose pseudo-inverse of A (see the sketch after this list)

5. Generalize the strict interpolation method to a successive approximation radial basis function network (Xiangdong He and Lapedes)
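A sketch of steps 3-4 above, assuming the centers have already been obtained (e.g. from a K-means run); the helper names and array shapes are mine, not the original course code:

```python
import numpy as np

def train_rbf(X, D, centers, sigma=1.0):
    """Fit the output-layer weights of an RBF network by pseudo-inverse.

    X:       (N, n) training inputs
    D:       (N, c) desired outputs
    centers: (n_c, n) RBF centers
    Returns W of shape (n_c, c).
    """
    # First-layer outputs: A[p, i] = phi(||X_p - center_i||) with a Gaussian kernel
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    A = np.exp(-dists**2 / (2.0 * sigma**2))
    # Least-squares weights via the Moore-Penrose pseudo-inverse: W = A+ D
    return np.linalg.pinv(A) @ D

def predict_rbf(X, centers, W, sigma=1.0):
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    A = np.exp(-dists**2 / (2.0 * sigma**2))
    return A @ W
```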

Page 25: P11 Neural Network Applications - web.ece.ucsb.edu

25

Radial Basis Function Networks

The basis functions are the centers of the RBF neurons; all the available training patterns are prospective candidates. The centers can be fixed by k-means clustering or by the orthogonal least squares method. The function φ_i is a radial function, i.e., it is a function of the distance of a particular vector x from the RBF center c. The weights are obtained by a least squares fit over all the training patterns.

Page 26: P11 Neural Network Applications - web.ece.ucsb.edu

26

Two Layered RBF Networks

All training patterns are used as centers. The given patterns are divided into N groups with L centers each. The w's and λ's are obtained by a least squares fit over all the patterns.

A Two layered Radial Basis Function Network

Page 27: P11 Neural Network Applications - web.ece.ucsb.edu

27

Radial Basis Function Networks

A Two layered Radial Basis Function Network

Page 28: P11 Neural Network Applications - web.ece.ucsb.edu

28

Recurrent Neural Networks

• A fully connected recurrent network is one in which the output of each unit is connected to the inputs of all the units, including itself, but excluding the external input nodes.

• The recurrent network architecture allows the network to retain information from the infinite past.

Page 29: P11 Neural Network Applications - web.ece.ucsb.edu

29

Recurrent Neural Networks

Page 30: P11 Neural Network Applications - web.ece.ucsb.edu

30

Recurrent Neural Networks

• Real-Time Recurrent Learning (RTRL) Algorithm (Williams and Zipser)

Assume:
m: input nodes
n: output nodes
q: designated target units for which desired outputs are known
n−q: the remaining units, for which the desired outputs are set to zero
W_ij: the weight connecting the ith and jth units

Let: Z = [X, Y]
X: the input vector of size 1 × m
Y: the output vector of size 1 × n
Z: the augmented vector of size 1 × (m+n)

Page 31: P11 Neural Network Applications - web.ece.ucsb.edu

31

Recurrent Neural Networks

then

$$S_k(t) = \sum_{l=1}^{m+n} W_{kl} \, Z_l(t)$$

$$y_k(t+1) = \frac{1}{1 + \exp\!\left(-S_k(t)\right)}$$

$$e_k(t+1) = d_k(t+1) - y_k(t+1)$$

d_k: the desired output for the kth output unit

Page 32: P11 Neural Network Applications - web.ece.ucsb.edu

32

Recurrent Neural Networks

$$f'_k(S_k) = y_k(t+1)\left( 1 - y_k(t+1) \right)$$

The weights are updated by the gradient descent method.

Now, the minimum mean squared error is the optimization criterion, as follows:

$$E = \frac{1}{2} \sum_j \left( d_j - y_j \right)^2$$

$$\frac{\partial E}{\partial W_{ij}} = -\sum_{k=1}^{q} e_k(t+1) \frac{\partial y_k(t+1)}{\partial W_{ij}} = -\sum_{k=1}^{q} e_k(t+1) \, p^k_{ij}(t+1)$$

Page 33: P11 Neural Network Applications - web.ece.ucsb.edu

33

Recurrent Neural Networks

$$p^k_{ij}(t+1) = \frac{\partial y_k(t+1)}{\partial W_{ij}}$$

$$p^k_{ij}(t+1) = f'_k(S_k) \left[ \sum_{l=1}^{n} W_{kl} \, p^l_{ij}(t) + \delta_{ik} Z_j(t) \right]$$

$$\Delta W_{ij}(t) = \alpha \sum_{k=1}^{n} e_k(t+1) \, p^k_{ij}(t+1)$$

$$W_{ij}(t+1) = W_{ij}(t) + \Delta W_{ij}(t)$$
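A compact sketch of one RTRL step as described on the preceding slides (illustrative only; the indexing conventions, argument names, and the learning-rate value are my own assumptions):

```python
import numpy as np

def rtrl_step(W, z_prev, x_t, d_t, p, alpha=0.1):
    """One RTRL update.

    W:      (n, m+n) weights from the augmented vector Z = [x, y] to the n units
    z_prev: (m+n,)   previous augmented vector [x(t), y(t)]
    x_t:    (m,)     current external input
    d_t:    (n,)     desired outputs (zeros for non-target units)
    p:      (n, n, m+n) sensitivities p^k_ij = dy_k / dW_ij
    """
    n, mn = W.shape
    s = W @ z_prev                              # S_k(t) = sum_l W_kl Z_l(t)
    y_next = 1.0 / (1.0 + np.exp(-s))           # y_k(t+1)
    e = d_t - y_next                            # e_k(t+1) = d_k - y_k
    fprime = y_next * (1.0 - y_next)            # f'_k(S_k)

    # p^k_ij(t+1) = f'_k(S_k) [ sum_l W_kl p^l_ij(t) + delta_ik Z_j(t) ]
    p_next = np.einsum('kl,lij->kij', W[:, mn - n:], p)   # recurrent part (output units only)
    p_next[np.arange(n), np.arange(n), :] += z_prev       # delta_ik * Z_j(t) term
    p_next *= fprime[:, None, None]

    dW = alpha * np.einsum('k,kij->ij', e, p_next)        # alpha * sum_k e_k p^k_ij
    z_next = np.concatenate([x_t, y_next])
    return W + dW, z_next, p_next
```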

Page 34: P11 Neural Network Applications - web.ece.ucsb.edu

34

Recurrent Neural Networks

Comments on RTRL:

• Computationally intensive, with high storage requirements

• Computational complexity is O(n4)

• Storage requirements O(n3)

• Parallel implementation will reduce the computational time(Williams and Zipser, 1989)

Page 35: P11 Neural Network Applications - web.ece.ucsb.edu

35

The Little/Hopfield Neural Network Model

Page 36: P11 Neural Network Applications - web.ece.ucsb.edu

36

Hopfield Network

The outputs of the network are modified in such a way that the Lyapunov function E is minimized:

$$E = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} \, v_i v_j - \sum_{i=1}^{n} I_i v_i$$

where the $W_{ij}$ are the weights and the $I_i$ are the inputs. The outputs evolve as

$$v_i(t+1) = f\!\left( \sum_k W_{ik} \, v_k(t) + I_i \right)$$

Thus, if we can identify the above structure of E in an optimization problem, then the weights and inputs can be computed. The network is trained until the outputs $v_i$ converge.

A typical Hopfield Neural Network
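As an illustrative sketch of the energy function and the update rule above (with a hard-threshold f and asynchronous unit updates; the function names and the 0/1 output convention are assumptions, not from the slides):

```python
import numpy as np

def hopfield_energy(W, I, v):
    """E = -1/2 * sum_ij W_ij v_i v_j - sum_i I_i v_i, as on the slide."""
    return -0.5 * v @ W @ v - I @ v

def hopfield_run(W, I, v0, steps=100, rng=None):
    """Iterate v_i <- f(sum_k W_ik v_k + I_i) asynchronously until the outputs
    converge (illustrative sketch only; assumes symmetric W, zero diagonal)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    v = np.array(v0, dtype=float)
    for _ in range(steps):
        old = v.copy()
        for i in rng.permutation(len(v)):          # update one unit at a time
            v[i] = 1.0 if W[i] @ v + I[i] >= 0 else 0.0
        if np.array_equal(v, old):                 # no unit changed: converged
            break
    return v
```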