ECLT 5810Classification – Neural Networks
Reference: Data Mining: Concepts and Techniques, by J. Han, M. Kamber, and J. Pei, Morgan Kaufmann
ECLT 5810 Classification - Neural Networks 2
Neural Networks
A neural network is a set of connected input/output units in which each connection has a weight associated with it. It is also referred to as connectionist learning.
Neural networks have been applied in various industries, including the financial sector, e.g., stock market trading.
Demonstrating Some Intelligence
“Mastering the game of Go with Deep Neural Networks and Tree Search”, Nature 529, Jan 28, 2016.
Application in Financial Industry
“Algorithms Take Control of Wall Street”, Felix Salmon and Jon Stokes, Wired Magazine, Dec 27, 2010.
Neural Networks
Advantages:
- prediction accuracy is generally high
- robust: works even when training examples contain errors
- fast evaluation of the learned target function
Criticism:
- long training time
- difficult to understand the learned function (weights)
- not easy to incorporate domain knowledge
Multi-Layer Feed-Forward Network

[Figure: a multi-layer feed-forward network. An input vector x_i feeds the input nodes; these connect through weights w_ij to hidden nodes, which in turn connect to output nodes producing the output vector. Each neuron (processing unit) j has a bias θ_j.]
Defining a Network Topology
- Normalize the input values for each attribute.
- Discrete-valued attributes may be encoded such that there is one input unit per domain value.
- One output unit can be used to represent two classes. If there are more than two classes, then one output unit per class is used.
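These encoding conventions can be sketched as follows (an illustrative sketch; the function and variable names are mine, not from the slides):

```python
# Sketch of input encoding for a network topology (illustrative names).

def one_hot(value, domain):
    """One input unit per domain value: 1.0 for the matching unit, else 0.0."""
    return [1.0 if value == v else 0.0 for v in domain]

def normalize(x, lo, hi):
    """Scale a continuous attribute into [0, 1] given its observed range."""
    return (x - lo) / (hi - lo)

# A record with one discrete and one continuous attribute:
color = one_hot("green", ["red", "green", "blue"])  # three input units
age = normalize(35.0, 0.0, 100.0)                   # one normalized input unit
input_vector = color + [age]                        # four input units in total
```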
A Neuron (I)
The n-dimensional input vector is mapped into the output variable by means of a scalar product and a nonlinear function mapping.

[Figure: inputs O_0, O_1, …, O_n (outputs from the previous layer) are multiplied by the weight vector w (weights w_0j, w_1j, …, w_nj) and summed, together with the bias θ_j, to give the net input I_j; the activation function f then produces the output O_j.]
A Neuron (II)

[Figure: the same neuron diagram as in the previous slide, annotated with the formulas below.]

Net input (weighted sum plus bias):
I_j = Σ_i w_ij O_i + θ_j

Output (activation):
O_j = 1 / (1 + e^(−I_j))

The second formula is the squashing function (it maps a large input domain onto [0, 1]).
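The two formulas above can be written directly in code (a minimal sketch, not library code; the example weights are made up):

```python
import math

def net_input(weights, outputs, theta):
    """I_j = sum_i w_ij * O_i + theta_j: the weighted sum plus bias."""
    return sum(w * o for w, o in zip(weights, outputs)) + theta

def sigmoid(i_j):
    """Squashing function O_j = 1 / (1 + e^(-I_j)); maps any input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-i_j))

# A neuron with two inputs from the previous layer:
o_j = sigmoid(net_input([0.5, -0.5], [1.0, 0.0], 0.1))
```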
An example of a neural network

[Figure: a feed-forward network with input units 1, 2, 3 (taking inputs X1, X2, X3), hidden units 4 and 5, and output unit 6. Weights W14, W15, W24, W25, W34, W35 connect the inputs to the hidden units; W46 and W56 connect the hidden units to the output unit.]

Assume all the weights and thresholds have been trained:

X1   X2   X3   W14  W15   W24  W25  W34   W35  W46   W56   θ4    θ5   θ6
1    0    1    0.2  -0.3  0.4  0.1  -0.5  0.2  -0.3  -0.2  -0.4  0.2  0.1
Using the Neural Network for Prediction
Prediction output = 0
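Propagating the inputs forward through the trained weights (a sketch using the sigmoid activation from the neuron slides; variable names are mine) reproduces this prediction:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Inputs and trained weights/biases from the slide's table.
x1, x2, x3 = 1, 0, 1
w14, w15, w24, w25 = 0.2, -0.3, 0.4, 0.1
w34, w35, w46, w56 = -0.5, 0.2, -0.3, -0.2
th4, th5, th6 = -0.4, 0.2, 0.1  # biases (theta)

# Hidden layer: net input = weighted sum of inputs plus bias.
o4 = sigmoid(w14 * x1 + w24 * x2 + w34 * x3 + th4)  # net input -0.7, output about 0.332
o5 = sigmoid(w15 * x1 + w25 * x2 + w35 * x3 + th5)  # net input 0.1, output about 0.525

# Output layer: takes the hidden-unit outputs as inputs.
o6 = sigmoid(w46 * o4 + w56 * o5 + th6)             # output about 0.474

prediction = 1 if o6 >= 0.5 else 0                  # 0.474 < 0.5, so class 0
```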
Network Training
The ultimate objective of training:
- obtain a set of weights that makes almost all the tuples in the training data classified correctly

Steps:
- Initialize the weights with random values
- While the terminating condition is not satisfied:
  - For each training sample (input tuple):
    - Compute the output value of each unit (including the output units)
    - // back-propagate the errors
    - For each output unit: compute the error
    - For each hidden unit: compute the error
    - // weight and bias updating (case updating)
    - For each output and hidden unit: update the weights and the bias
Network Training (Backpropagation) (I)

[Figure: the multi-layer network again; the errors are propagated backward from the output nodes through the hidden nodes along the weights w_jk.]

For an output unit j:
Err_j = O_j (1 − O_j)(T_j − O_j), where T_j is the true output.

For a hidden unit j:
Err_j = O_j (1 − O_j) Σ_k Err_k w_jk
Network Training (Backpropagation) (II)

[Figure: the same network; each weight w_ij and bias θ_j is updated using the computed errors.]

w_ij = w_ij + (l) Err_j O_i
θ_j = θ_j + (l) Err_j

where l is the learning rate.
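The error and update rules above can be expressed as small helper functions (an illustrative sketch; the function names are mine):

```python
def output_error(o_j, t_j):
    """Err_j = O_j (1 - O_j)(T_j - O_j) for an output unit."""
    return o_j * (1 - o_j) * (t_j - o_j)

def hidden_error(o_j, errs_k, w_jk):
    """Err_j = O_j (1 - O_j) * sum_k Err_k w_jk for a hidden unit,
    where errs_k and w_jk cover the units k in the next layer."""
    return o_j * (1 - o_j) * sum(e * w for e, w in zip(errs_k, w_jk))

def update_weight(w_ij, l, err_j, o_i):
    """w_ij = w_ij + l * Err_j * O_i, with learning rate l."""
    return w_ij + l * err_j * o_i

def update_bias(theta_j, l, err_j):
    """theta_j = theta_j + l * Err_j."""
    return theta_j + l * err_j
```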
An example of training a neural network

[Figure: the same network as before, with input units 1, 2, 3, hidden units 4 and 5, and output unit 6.]

Assume these are the initial values for training, with correct output = 1:

X1   X2   X3   W14  W15   W24  W25  W34   W35  W46   W56   θ4    θ5   θ6
1    0    1    0.2  -0.3  0.4  0.1  -0.5  0.2  -0.3  -0.2  -0.4  0.2  0.1
Learning Example
Learning rate = 0.9
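Applying the backpropagation formulas to the initial values above, with learning rate 0.9 and correct output T = 1, gives the first round of errors and updates (a sketch; values in comments are rounded):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

l = 0.9       # learning rate
target = 1.0  # correct (true) output T

# Forward pass with the initial values from the slide's table.
x1, x2, x3 = 1, 0, 1
w14, w15, w24, w25 = 0.2, -0.3, 0.4, 0.1
w34, w35, w46, w56 = -0.5, 0.2, -0.3, -0.2
th4, th5, th6 = -0.4, 0.2, 0.1
o4 = sigmoid(w14 * x1 + w24 * x2 + w34 * x3 + th4)  # about 0.332
o5 = sigmoid(w15 * x1 + w25 * x2 + w35 * x3 + th5)  # about 0.525
o6 = sigmoid(w46 * o4 + w56 * o5 + th6)             # about 0.474

# Back-propagate the errors.
err6 = o6 * (1 - o6) * (target - o6)  # output unit: about 0.1311
err5 = o5 * (1 - o5) * err6 * w56     # hidden unit: about -0.0065
err4 = o4 * (1 - o4) * err6 * w46     # hidden unit: about -0.0087

# Case updating of the weights and biases.
w46 += l * err6 * o4                  # about -0.261
w56 += l * err6 * o5                  # about -0.138
w14 += l * err4 * x1                  # about 0.192
w15 += l * err5 * x1
w24 += l * err4 * x2
w25 += l * err5 * x2
w34 += l * err4 * x3
w35 += l * err5 * x3
th6 += l * err6                       # about 0.218
th5 += l * err5
th4 += l * err4
```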
Weight Updating
- Case updating: the weights and biases are updated after the presentation of each sample.
- Epoch updating: the weight and bias increments are accumulated in variables, so that the weights and biases are updated only after all of the samples in the training set have been presented.
In practice, case updating is more common.
Terminating Condition
Training stops when:
- all changes in weights were so small as to be below some threshold, or
- the percentage of samples misclassified is below some threshold, or
- a pre-specified number of epochs has expired.
In practice, several hundred thousand epochs may be required before the weights converge.