
Page 1:

The Threshold Logic Unit (TLU), McCulloch & Pitts, 1943, is the simplest model of an artificial neuron.

Chapter 3: Simple Supervised Learning

Page 2:

The TLU is a feedforward structure, only one of several available. A feedforward network is used to place an input pattern into one of several classes according to the resulting pattern of outputs.

Page 3:

The requirements of McCulloch-Pitts:

1. The activation is binary (1 means fire, 0 means not fire).

2. The neurons are connected by directed, weighted paths.

3. A connection path is excitatory if the weight on the path is positive; otherwise it is inhibitory. (All excitatory connections into a particular neuron have the same weight.)

Page 4:

The requirements of McCulloch-Pitts (continued):

4. Each neuron has a fixed threshold such that if the net input is greater than the threshold, the neuron fires.

5. The threshold is set so that inhibition is absolute. That is, any nonzero inhibitory input will prevent the neuron from firing.

6. It takes one time step for a signal to pass over one connection link.

Page 5:

The activation function:

f(y_in) = 1 if y_in >= θ
f(y_in) = 0 if y_in < θ

Simple McCulloch-Pitts Architecture

[Figure: a single output neuron Y with three inputs: X1 and X2 connect to Y with weight 2 (excitatory); X3 connects to Y with weight -1 (inhibitory).]
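As an illustration, a minimal sketch of this unit in Python (the weights follow the figure; the threshold is not legible in the slide, so θ = 4 is used here as a value that makes the inhibition absolute; the function name mcp_neuron is ours):

def mcp_neuron(inputs, weights, theta):
    """McCulloch-Pitts unit: binary output, fires (1) only when the net
    input reaches the threshold theta; weights are fixed, not learned."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0

# The pictured network: X1, X2 excitatory (weight 2), X3 inhibitory (weight -1).
# With theta = 4 the unit fires only when both excitatory inputs are active
# and the inhibitory input is silent, so any nonzero inhibitory input
# prevents firing (requirement 5: absolute inhibition).
for x in [(1, 1, 0), (1, 1, 1), (1, 0, 0)]:
    print(x, "->", mcp_neuron(x, [2, 2, -1], theta=4))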

Page 6:

Algorithm

The weights for a neuron are set, together with the threshold for the neuron's activation function, so that the neuron performs a simple logic function. We use these simple neurons as building blocks, and they can model any function that can be represented as a logic function. Rather than a training algorithm, analysis is used to determine the values of the weights and the threshold.

Page 7:

The binary forms of the AND, OR and AND NOT functions are defined with reference to the neuron's activation function. Here the threshold on the Y unit is defined to be 2.

Simple networks for logic functions

[Figure: a two-input network: X1 connects to Y with weight W1, X2 connects to Y with weight W2.]

Page 8:

AND function

gives the following four (training input, target output) pairs:

X1  X2  Y
0   0   0
0   1   0
1   0   0
1   1   1

What values can w1 and w2 be set to?

Page 9:

OR function

gives the following four (training input, target output) pairs:

X1  X2  Y
0   0   0
0   1   1
1   0   1
1   1   1

What values can w1 and w2 be set to?

Page 10:

AND NOT function

gives the following four (training input, target output) pairs:

X1  X2  Y
0   0   0
0   1   0
1   0   1
1   1   0

What value can w1 be set to, and what value can w2 be set to?

Page 11:

XOR function

x1 XOR x2 = (x1 AND NOT x2) OR (x2 AND NOT x1)

How can the network for the XOR function be modelled?
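One way, sketched in Python, is to wire two AND NOT units into an OR unit, reusing mcp_neuron from the page 5 sketch. The weights below (threshold 2 throughout, following page 7) are one possible assignment, not necessarily the slides' intended answer to the questions above:

# One consistent weight assignment with threshold 2 throughout.
def and_not(x1, x2):          # fires iff x1 = 1 and x2 = 0
    return mcp_neuron([x1, x2], [2, -1], theta=2)

def or_(x1, x2):              # fires iff at least one input is 1
    return mcp_neuron([x1, x2], [2, 2], theta=2)

def xor(x1, x2):
    # Layer 1: z1 = x1 AND NOT x2, z2 = x2 AND NOT x1 (one time step);
    # Layer 2: y = z1 OR z2 (a second time step, per requirement 6).
    return or_(and_not(x1, x2), and_not(x2, x1))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor(x1, x2))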

Page 12:

2.1 Pattern Classification

For the NN approach, we assume that there is a set of training patterns for which the correct classification is known. In the simplest case, the output unit represents membership in the class with a response of 1; a response of -1 (or 0) indicates the pattern is not a member of the class.

Page 13:

Simple Pattern Classification

The activation function:

y_in = w1x1 + w2x2 + ... + wnxn

The output (bipolar value):

f(y_in) = 1 if y_in >= threshold
f(y_in) = -1 if y_in < threshold

Page 14:

AND TLU (bipolar inputs): threshold = 3; w1, w2 = ?

x1  x2  activation  output
-1  -1      ?        -1
-1   1      ?        -1
 1  -1      ?        -1
 1   1      ?         1
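One assignment that satisfies this table is w1 = w2 = 2; the check below is a sketch (the slide leaves the weights as a question, so treat these as one possible answer):

# Bipolar TLU with w1 = w2 = 2, threshold 3: reproduces the table above.
def bipolar_tlu(x, w, threshold):
    act = sum(wi * xi for wi, xi in zip(w, x))
    return act, 1 if act >= threshold else -1

for x1, x2 in [(-1, -1), (-1, 1), (1, -1), (1, 1)]:
    act, y = bipolar_tlu([x1, x2], [2, 2], threshold=3)
    print(f"x=({x1:2d},{x2:2d})  activation={act:2d}  output={y:2d}")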

Page 15:

2.2 The linear separation of classes

Critical condition of classification: the activation equals the threshold,

w1x1 + w2x2 + ... + wnxn = θ

For the 2-D case:

w1x1 + w2x2 = θ

Page 16:

x2 = ax1 + b, with a = -1, b = 1.5

(the binary-input case with given w1 = w2 = 1 and threshold = 1.5)

Page 17:

2.3 Biases and Thresholds

A bias acts as a weight on a connection from a unit whose activation is always 1. Increasing the bias increases the net input to the unit.

net = b + Σ wixi (summed over i = 1, ..., n)

The output:

f(net) = 1 if net >= 0
f(net) = -1 if net < 0
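A small sketch (names ours) of the equivalence this sets up: a bias of b behaves exactly like a threshold of -b:

# Comparing net >= 0 with net = b + sum(w*x) is the same test as
# sum(w*x) >= -b, so a bias b is equivalent to a threshold of -b.
def tlu_with_bias(x, w, b):
    net = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= 0 else -1

def tlu_with_threshold(x, w, theta):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else -1

x, w = [1, -1], [0.5, 2.0]
assert tlu_with_bias(x, w, b=1.5) == tlu_with_threshold(x, w, theta=-1.5)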

Page 18:

Single Layer with a Binary Step Function

Consider a network with 2 inputs and 1 output node (2 classes). The net output of the network is a linear function of the weights and the inputs:

net = w · x = x1w1 + x2w2
y = f(net)

x1w1 + x2w2 = 0 defines a straight line through the input space:

x2 = -(w1/w2)x1

This is a line through the origin with slope -w1/w2.

Page 19:

Bias (threshold)

What if the line dividing the 2 classes does not go through the origin?

Page 20:

2.4 The inner products

v = (1,1) and w = (0,2)

Page 21:

The relationship between the angle and the value of the inner product.

Page 22:

Vector Projections

v · w = |v||w| cos θ

The projection of v onto w:

vw = (v · w) / |w|
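For example, with the vectors from page 20, v = (1,1) and w = (0,2): v · w = 1·0 + 1·2 = 2, |v| = √2, |w| = 2, so cos θ = 2/(2√2) = 1/√2, giving θ = 45°, and the projection of v onto w is vw = 2/2 = 1.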

Page 23:

2.5 Inner products and TLUs

The activation is the inner product of the weight and input vectors: a = w · x. Therefore:

Page 24:

This separates into 2 cases (2 classes):

Case 1: w · x >= θ, and the TLU outputs 1.

Page 25:

Case 2: w · x < θ, and the TLU outputs -1 (or 0).

Page 26:

Other interesting geometric points to note:

• The weight vector (w1, w2) is normal to the decision boundary.

Proof: Suppose z1 and z2 are points on the decision boundary. Then w · z1 = θ and w · z2 = θ, so w · (z1 - z2) = 0. The difference z1 - z2 lies along the boundary, so w is orthogonal to it.


Page 28:

2.6 Training TLUs

Training methods: three kinds of methods for training single-layer networks that do pattern classification:

• Hebb net - the earliest and simplest learning rule
• Perceptron - guaranteed to find the right weights if they exist
• ADALINE (uses the delta rule) - can easily be generalized to multi-layer nets (nonlinear problems)

Page 29:

Hebb Algorithm

Step 0. Initialize all weights: wi = 0 (i = 1 to n).
Step 1. For each input training vector and target output pair s : t, do Steps 2-4.
  Step 2. Set activations for input units: xi = si (i = 1 to n).
  Step 3. Set activation for the output unit: y = t.
  Step 4. Adjust the weights: wi(new) = wi(old) + xiy.
          Adjust the bias: b(new) = b(old) + y.
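A minimal sketch of these steps in Python, run here on the bipolar AND pairs as an illustration (function and variable names are ours):

# Hebb rule, exactly as in Steps 0-4: one pass over the (s, t) pairs,
# with w_i += x_i * y and b += y.
def hebb_train(pairs, n):
    w = [0.0] * n          # Step 0
    b = 0.0
    for s, t in pairs:     # Step 1
        x, y = s, t        # Steps 2-3
        for i in range(n): # Step 4
            w[i] += x[i] * y
        b += y
    return w, b

and_pairs = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
w, b = hebb_train(and_pairs, n=2)
print("w =", w, "b =", b)   # gives w = [2.0, 2.0], b = -2.0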

Page 30:

2.7 Perceptron

Rosenblatt introduced the perceptron in 1962. A perceptron consists of a TLU whose inputs come from a set of preprocessing association units.

Page 31:

Perceptron Training

During training, the weight vector and threshold are adjusted to obtain values that separate the classes appropriately. Adjusting the weights:

Case 1: the target is 1 but the unit does not fire; the weights are increased: w(new) = w(old) + α·p.

Page 32:

Case 2: the target is 0 but the unit fires; the weights are decreased: w(new) = w(old) - α·p,

where α is the learning rate and p is the training input.

Page 33:

The perceptron uses a simple form of learning called the simple training rule.

[Figure: p1 is a training input.]

Page 34:

[Figure: p2 is a training input.]

Page 35:

Perceptron Algorithm

Step 0. Initialize all weights: wi = 0 (i = 1 to n). Set the learning rate α (0 < α <= 1).
Step 1. While the stopping condition is false, do Steps 2-6.
  Step 2. For each training pair s : t, do Steps 3-5.
    Step 3. Set activations of input units: xi = si (i = 1 to n).
    Step 4. Compute the response of the output unit:
      y_in = b + Σ xiwi
      y = 1 if y_in > θ
      y = 0 if -θ <= y_in <= θ
      y = -1 if y_in < -θ

Page 36:

Perceptron Algorithm (continued)

    Step 5. Update the weights and bias if an error occurred for this pattern:
      if y ≠ t:
        wi(new) = wi(old) + α·xi·t
        b(new) = b(old) + α·t
      else:
        wi(new) = wi(old)
        b(new) = b(old)
  Step 6. Test the stopping condition: if no weights changed in Step 2, stop; else, continue.
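A minimal sketch of the whole algorithm in Python (the θ and α values and the AND training pairs are illustrative choices, not from the slides):

# Perceptron algorithm, Steps 0-6, with the three-valued response function.
def perceptron_train(pairs, n, alpha=1.0, theta=0.2, max_epochs=100):
    w, b = [0.0] * n, 0.0                      # Step 0
    for _ in range(max_epochs):                # Step 1
        changed = False
        for s, t in pairs:                     # Step 2
            x = s                              # Step 3
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))  # Step 4
            y = 1 if y_in > theta else (-1 if y_in < -theta else 0)
            if y != t:                         # Step 5
                w = [wi + alpha * xi * t for wi, xi in zip(w, x)]
                b += alpha * t
                changed = True
        if not changed:                        # Step 6
            return w, b
    return w, b

# Binary inputs with bipolar targets for AND, one common setup.
pairs = [((1, 1), 1), ((1, 0), -1), ((0, 1), -1), ((0, 0), -1)]
print(perceptron_train(pairs, n=2))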

Page 37:

Homework

Write a program to learn the OR function, with weights 0.1 and 0.2, threshold = 0.5, and learning rate = 0.3.

Page 38:

2.8 Perceptrons as classifiers

The perceptron training algorithm can be used to divide data into linearly separable classes.

Page 39:

If there are 4 classes to be divided by 2 planes, the pattern space can be drawn as in the figure.

[Figure]

Page 40:

Start by training two units, each dividing the patterns into 2 groups, giving the groupings (A B)(C D) and (A D)(B C), with outputs as follows:

[Table: the output of each unit for each class.]

Page 41:

From the two units' relationship to the two groupings, dividing into 4 classes gives the following (each class is identified by a unique pair of unit outputs):

[Table]

Page 42:

The structure of the network for classifying the data:

[Figure]

Page 43:

2.9 ADALINE

The ADAptive LInear NEuron uses the delta rule for training. An ADALINE is a special case in which there is only one output unit. The architecture of an ADALINE is a single neuron that receives input from several units.

[Figure: inputs x1, ..., xn connect to output Y with weights w1, ..., wn; a bias unit with constant activation 1 connects with weight b.]

Page 44:

The weight-update equation:

Δwi = α(t - y)vi

that is,

w(new) = w(old) + α(t - y)v

This is called a training rule or learning rule, and the parameter α is called the learning rate.

Page 45:

Supervised learning in neural networks is learning from a training set whose data have prescribed target outputs. The network takes in the data and operates under the learning rule to adjust the weights until weights suitable for later use are obtained. Suitable weights are reached when the network converges, the point at which the weights no longer change.

Page 46:

Training Algorithm

Step 0. Initialize weights (to small random values) and set the learning rate α.
Step 1. While the stopping condition is false, do Steps 2-6.
  Step 2. For each bipolar training pair s : t, do Steps 3-5.
    Step 3. Set activations of input units (i = 1, ..., n): xi = si.
    Step 4. Compute the net input to the output unit: y_in = b + Σ xiwi.
    Step 5. Update the bias and weights:
      b(new) = b(old) + α(t - y_in)
      wi(new) = wi(old) + α(t - y_in)xi
  Step 6. Test for the stopping condition.
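A minimal sketch of this loop in Python; the stopping test used for Step 6 here (largest weight change below a tolerance) is one common choice and an assumption on our part:

import random

# ADALINE / delta-rule training, Steps 0-6.
def adaline_train(pairs, n, alpha=0.1, tol=1e-4, max_epochs=1000):
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]   # Step 0
    b = random.uniform(-0.5, 0.5)
    for _ in range(max_epochs):                          # Step 1
        biggest = 0.0
        for s, t in pairs:                               # Steps 2-3
            y_in = b + sum(wi * xi for wi, xi in zip(w, s))  # Step 4
            err = t - y_in
            b += alpha * err                             # Step 5
            for i in range(n):
                delta = alpha * err * s[i]
                w[i] += delta
                biggest = max(biggest, abs(delta))
        if biggest < tol:                                # Step 6
            break
    return w, b

# Bipolar AND pairs; converges near w = (0.5, 0.5), b = -0.5.
pairs = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
print(adaline_train(pairs, n=2))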

Page 47:

2.10 Delta rule: minimizing an error

2.10.1 Hebb's learning law

In a linear associator network, the output vector y' is derived from the input vector x by means of this formula:

y' = Wx

where W = (wij) is the m x n weight matrix.

Page 48:

2.10.2 Gradient descent function

Page 49:

Given Δy = slope · Δx, let Δx = -α · slope, with α > 0. Then:

Δy = slope · Δx = -α(slope)^2

Since the squared slope is always positive, and it is multiplied by a parameter with a negative sign, the quantity Δy above is < 0. The result is that we have always "travelled down" the function.

Page 50:

Delta Rule: Training by Gradient Descent Revisited

The delta rule is also known as:
- the Widrow-Hoff rule
- the Least Mean Squares (LMS) rule

To train the network, we adjust the weights in the network so as to decrease the cost (this is where we require differentiability). This is called gradient descent.

Page 51:

2.10.3 Gradient descent on error

The early ADALINE (ADAptive LInear NEuron) model of Widrow and Hoff is discussed as a simple type of processing element. The Widrow learning law applies gradient descent to minimize the error; this is the delta rule.

Page 52:

The delta rule computes the error produced on each presentation of the training set, and treats it as a function of the weights, in the form of gradient descent on the error:

Ep = (1/2)(t - y)^2

From this equation, the error Ep is a function of the weights for a single input pattern. The total error E over all patterns is therefore:

E = (1/2) Σ (ti - yi)^2, summed over i = 1, ..., n

Page 53:

The learning algorithm terminates once we are at, or sufficiently near to, the minimum of the error function, where dE/dw = 0. We say then that the algorithm has converged.

E = (1/2) Σ (ti - yi)^2, summed over i = 1, ..., n

Page 54:

An important consideration is the learning rate µ, which determines by how much we change the weights w at each step. If µ is too small, the algorithm will take a long time to converge.

Page 55:

Conversely, if µ is too large, we may end up bouncing around the error surface out of control - the algorithm diverges. This usually ends with an overflow error in the computer's floating-point arithmetic.
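A small illustration of both regimes (our own example), using gradient descent on the one-weight error function E(w) = w^2, where dE/dw = 2w:

# Gradient descent on E(w) = w**2, starting from w = 1. A small mu shrinks
# w toward the minimum at 0; a too-large mu makes |w| grow: divergence.
def descend(mu, steps=8, w=1.0):
    trace = [w]
    for _ in range(steps):
        w -= mu * 2 * w      # w <- w - mu * dE/dw
        trace.append(w)
    return trace

print("mu=0.1:", [f"{w:.3f}" for w in descend(0.1)])   # converges toward 0
print("mu=1.1:", [f"{w:.1f}" for w in descend(1.1)])   # blows up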

Page 56:

[Figure: the error E for a range of values of w0 and w1.]

Page 57:

For the delta rule in TLUs, the activation a is used in place of the output y, giving:

Ep = (1/2)(t - a)^2

Adjusting the weight of each input according to the TLU's operation, with the output range adjusted to -1, 1, gives the weight-update equation:

Δwi = α(t - a)xi

Page 58:

If used with a semilinear unit, the equation is adjusted to suit the sigmoid function by including its derivative in the update:

dσ(a)/da = σ(a)(1 - σ(a))

Δwi = α σ'(a)(t - y)xi
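A minimal sketch (ours) of one such update step for a semilinear unit:

import math

# One delta-rule update for a semilinear (sigmoid) unit:
# delta_w_i = alpha * sigmoid'(a) * (t - y) * x_i, with y = sigmoid(a).
def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def semilinear_update(w, x, t, alpha):
    a = sum(wi * xi for wi, xi in zip(w, x))    # activation
    y = sigmoid(a)                              # output
    grad = y * (1.0 - y)                        # sigma'(a) = sigma(a)(1 - sigma(a))
    return [wi + alpha * grad * (t - y) * xi for wi, xi in zip(w, x)]

w = semilinear_update([0.2, -0.1], x=[1.0, 0.5], t=1.0, alpha=0.3)
print(w)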