CS621: Artificial Intelligence, Lecture 26: Backpropagation and its applications
Pushpak Bhattacharyya, Computer Science and Engineering Department, IIT Bombay


Page 1:

CS621: Artificial Intelligence
Lecture 26: Backpropagation and its applications

Pushpak Bhattacharyya
Computer Science and Engineering Department
IIT Bombay

Page 2:

Backpropagation algorithm

• Fully connected feed forward network
• Pure FF network (no jumping of connections over layers)

[Figure: a fully connected feed forward network with an input layer (n i/p neurons), hidden layers, and an output layer (m o/p neurons); w_ji is the weight on the connection from neuron i to neuron j in the next layer.]
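
To make the structure concrete, here is a minimal sketch of a forward pass through such a pure feed forward network; the sigmoid activation, the absence of bias terms and the layer sizes are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights):
    """Pure feed-forward pass: each layer feeds only the next layer,
    with no connections jumping over layers."""
    activations = [x]
    for W in weights:                       # one weight matrix per layer
        net = W @ activations[-1]           # net_j = sum_i w_ji * o_i
        activations.append(sigmoid(net))    # o_j = sigmoid(net_j)
    return activations

# Example: n = 3 input neurons, one hidden layer of 4 neurons, m = 2 output neurons
rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
outputs = forward(np.array([0.5, -1.0, 0.25]), weights)
print(outputs[-1])      # activations of the m output neurons
```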

Page 3:

Gradient Descent Equations

$$\Delta w_{ji} = -\eta \frac{\partial E}{\partial w_{ji}} \qquad (\text{learning rate } \eta,\; 0 \le \eta \le 1)$$

$$\frac{\partial E}{\partial w_{ji}} = \frac{\partial E}{\partial net_j} \cdot \frac{\partial net_j}{\partial w_{ji}} \qquad (net_j = \text{input at the } j\text{th layer})$$

Defining $\delta_j = -\dfrac{\partial E}{\partial net_j}$ and using $\dfrac{\partial net_j}{\partial w_{ji}} = o_i$,

$$\Delta w_{ji} = \eta\, \delta_j\, o_i$$
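
As a small illustration (hypothetical helper, not from the slides), the last equation maps directly to code: a layer's weight change is η times the outer product of that layer's δ vector with the incoming activations o_i.

```python
import numpy as np

def weight_update(delta_j, o_i, eta=0.1):
    """Gradient-descent step: delta_w[j, i] = eta * delta_j[j] * o_i[i]."""
    return eta * np.outer(delta_j, o_i)

# e.g. a layer of 2 neurons fed by 3 incoming activations
print(weight_update(np.array([0.2, -0.1]), np.array([1.0, 0.5, 0.0])))
```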

Page 4:

Backpropagation – for outermost layer

With squared error over the $m$ output neurons (and $net_j$ = input at the $j$th layer),

$$E = \frac{1}{2}\sum_{p=1}^{m}(t_p - o_p)^2$$

$$\delta_j = -\frac{\partial E}{\partial net_j} = -\frac{\partial E}{\partial o_j}\cdot\frac{\partial o_j}{\partial net_j} = (t_j - o_j)\, o_j (1 - o_j)$$

Hence,

$$\Delta w_{ji} = \eta\,(t_j - o_j)\, o_j (1 - o_j)\, o_i$$
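
A one-function sketch of this δ for sigmoid output neurons (the function name and example values are made up):

```python
import numpy as np

def delta_output(t, o):
    """delta_j for output neurons with sigmoid activation and
    E = 1/2 * sum_p (t_p - o_p)^2:  (t_j - o_j) * o_j * (1 - o_j)."""
    return (t - o) * o * (1.0 - o)

# target and actual outputs of an m = 3 neuron output layer
print(delta_output(np.array([1.0, 0.0, 0.0]), np.array([0.7, 0.2, 0.1])))
```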

Page 5:

Backpropagation for hidden layers

[Figure: the same feed forward network (input layer of n i/p neurons, hidden layers, output layer of m o/p neurons), highlighting a hidden-layer neuron j and a neuron k in the layer after it.]

$\delta_k$ is propagated backwards to find the value of $\delta_j$.

Page 6:

Backpropagation – for hidden layers

ijjk

kkj

jjk

kjkj

jjk jk

jjj

j

j

jj

iji

ooow

oow

ooo

netk

net

E

ooo

E

net

o

o

E

net

Ej

jow

)1()(

)1()( Hence,

)1()(

)1(

layernext

layernext

layernext
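
A matching sketch for a hidden layer's δ, assuming W_next holds the weights w_kj from this layer into the next one (row k, column j); names and example values are illustrative:

```python
import numpy as np

def delta_hidden(W_next, delta_next, o):
    """delta_j for a hidden layer with sigmoid activation:
    o_j * (1 - o_j) * sum_{k in next layer} w_kj * delta_k."""
    return (W_next.T @ delta_next) * o * (1.0 - o)

# 2 hidden neurons feeding 3 next-layer neurons
W_next = np.array([[0.1, -0.2], [0.4, 0.3], [-0.5, 0.6]])
print(delta_hidden(W_next, np.array([0.05, -0.02, 0.1]), np.array([0.6, 0.4])))
```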

Page 7:

General Backpropagation Rule

ijjk

kkj ooow )1()(layernext

)1()( jjjjj ooot

iji jow • General weight updating rule:

• Where for outermost layer

for hidden layers
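
Putting the pieces together, the following is a minimal sketch of one backpropagation step for a fully connected sigmoid network with squared error; it illustrates the general rule above and is not the code behind the slides (bias terms are omitted for brevity).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, t, weights, eta=0.5):
    """One weight update for all layers. weights[l] has shape
    (neurons in layer l+1, neurons in layer l); biases are omitted."""
    # Forward pass: store the output o of every layer
    outs = [x]
    for W in weights:
        outs.append(sigmoid(W @ outs[-1]))

    # Backward pass: delta for the outermost layer, then the hidden layers
    deltas = [None] * len(weights)
    o = outs[-1]
    deltas[-1] = (t - o) * o * (1.0 - o)
    for l in range(len(weights) - 2, -1, -1):
        o = outs[l + 1]
        deltas[l] = (weights[l + 1].T @ deltas[l + 1]) * o * (1.0 - o)

    # General weight updating rule: delta_w_ji = eta * delta_j * o_i
    for l in range(len(weights)):
        weights[l] += eta * np.outer(deltas[l], outs[l])

    return 0.5 * np.sum((t - outs[-1]) ** 2)   # error before this update

# Example: a 2-2-1 network
rng = np.random.default_rng(1)
weights = [rng.normal(scale=0.5, size=(2, 2)), rng.normal(scale=0.5, size=(1, 2))]
print(backprop_step(np.array([1.0, 0.0]), np.array([1.0]), weights))
```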

Page 8:

How does it work?

• Input propagation forward and error propagation backward (e.g. XOR)

[Figure: a two-layer threshold network computing XOR. The output neuron (w1 = w2 = 1, θ = 0.5) takes the OR of two hidden neurons computing x1·x̄2 and x̄1·x2; each hidden neuron receives weights 1.5 and -1 from the inputs x1 and x2.]
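
The figure's numbers are only partly recoverable from the transcript, so this sketch uses one consistent choice: the output unit's w1 = w2 = 1, θ = 0.5 from the slide, the 1.5 / -1 hidden weights, and an assumed hidden threshold of 1. It realises XOR as (x1 AND NOT x2) OR (NOT x1 AND x2) with threshold neurons.

```python
def step(net, theta):
    """Threshold neuron: fires (1) when the net input reaches theta."""
    return 1 if net >= theta else 0

def xor_net(x1, x2):
    # Hidden layer: x1 AND NOT x2, and NOT x1 AND x2
    # (weights 1.5 / -1 as in the figure; threshold 1 is an assumed value)
    h1 = step(1.5 * x1 - 1.0 * x2, theta=1.0)
    h2 = step(-1.0 * x1 + 1.5 * x2, theta=1.0)
    # Output layer: OR of the two hidden units (w1 = w2 = 1, theta = 0.5)
    return step(1.0 * h1 + 1.0 * h2, theta=0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))   # prints the XOR truth table
```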

Page 9:

Local Minima

Due to the greedy nature of BP, it can get stuck in a local minimum m and will never be able to reach the global minimum g, since each weight change can only decrease the error.

Page 10:

Momentum factor

1. Introduce a momentum factor.

• Accelerates the movement out of the trough.
• Dampens oscillation inside the trough.
• Choosing β: if β is large, we may jump over the minimum.

$$(\Delta w_{ji})_{n\text{th iteration}} = \eta\,\delta_j\,o_i + \beta\,(\Delta w_{ji})_{(n-1)\text{th iteration}}$$
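
A minimal sketch of the momentum update (the η and β values are placeholders):

```python
import numpy as np

def momentum_update(delta_j, o_i, prev_dw, eta=0.1, beta=0.9):
    """dw(n) = eta * delta_j * o_i + beta * dw(n-1): the previous weight change
    carries the update through flat regions and damps oscillation in a trough."""
    return eta * np.outer(delta_j, o_i) + beta * prev_dw

dw = np.zeros((2, 3))                     # no previous change at the start
dw = momentum_update(np.array([0.2, -0.1]), np.array([1.0, 0.5, 0.0]), dw)
print(dw)
```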

Page 11:

Symmetry breaking

• If the mapping demands different weights, but we start with the same weights everywhere, then BP will never converge.

[Figure: the same two-layer XOR network (output neuron with w1 = w2 = 1, θ = 0.5; hidden neurons computing x1·x̄2 and x̄1·x2).]

XOR n/w: if we started with identical weights everywhere, BP will not converge.
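
A small, hypothetical demonstration of the problem: a 2-2-1 sigmoid network (biases omitted) trained on XOR with every weight initialised to the same value. Both hidden neurons then always receive identical updates, so they remain copies of each other no matter how long training runs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 1, 1, 0], dtype=float)
W1 = np.full((2, 2), 0.5)     # identical weights into both hidden neurons
W2 = np.full((1, 2), 0.5)

for _ in range(5000):
    for x, t in zip(X, T):
        h = sigmoid(W1 @ x)
        o = sigmoid(W2 @ h)
        d_out = (t - o) * o * (1 - o)
        d_hid = (W2.T @ d_out) * h * (1 - h)
        W2 += 0.5 * np.outer(d_out, h)
        W1 += 0.5 * np.outer(d_hid, x)

print(W1)   # the two rows are still identical: the symmetry is never broken
```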

Page 12:

Backpropagation Applications

Page 13:

Feed Forward Network Architecture

[Figure: the sizes of the I/P layer and the O/P layer are defined by the problem; the size of the hidden layer is decided by trial and error.]

Page 14:

Digit Recognition Problem

• Digit recognition:
  – 7-segment display
  – Segment being on/off defines a digit

[Figure: a 7-segment display with the segments numbered 1 to 7.]

Page 15:

[Figure: the digit recognition network. Input layer: 7 neurons (Seg-1 to Seg-7), one per segment. Hidden layer. Output layer: one neuron per digit (O1, O2, ..., O9 shown). Adjacent layers are fully connected.]
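
A sketch of how such a digit-recognition network could be set up; the hidden-layer size, weight scales and the example segment pattern below are illustrative assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# 7 segment inputs, 10 digit outputs; the hidden size is a trial-and-error choice
n_in, n_hidden, n_out = 7, 10, 10

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))

# One training pattern: a 7-bit on/off segment code (made-up pattern here)
# and a target that is 1 only at that digit's output neuron.
x = np.array([1, 1, 1, 0, 1, 1, 1], dtype=float)   # illustrative segment pattern
t = np.zeros(n_out)
t[0] = 1.0                                         # say this pattern is the digit 0

o = sigmoid(W2 @ sigmoid(W1 @ x))
print(o.argmax(), 0.5 * np.sum((t - o) ** 2))      # current guess and its squared error
```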

Page 16:

Example - Character Recognition

• Output layer – 26 neurons (all capital letters)
• First output neuron has the responsibility of detecting all forms of 'A'
• Centralized representation of outputs
• In distributed representations, all output neurons participate in the output
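
A brief illustration of the two output conventions (the 5-bit distributed code below is a made-up example): in the centralized scheme one neuron stands for one letter, while in a distributed scheme the pattern across the output neurons encodes the letter.

```python
import numpy as np

letters = [chr(c) for c in range(ord('A'), ord('Z') + 1)]

# Centralized: 26 output neurons, one per letter ('A' -> neuron 0)
def centralized_target(letter):
    t = np.zeros(len(letters))
    t[letters.index(letter)] = 1.0
    return t

# Distributed: every output neuron participates; here, a 5-bit binary code
def distributed_target(letter):
    idx = letters.index(letter)
    return np.array([(idx >> b) & 1 for b in range(5)], dtype=float)

print(centralized_target('A'))   # 1 at position 0, 0 elsewhere
print(distributed_target('C'))   # [0, 1, 0, 0, 0]  (index 2 in binary)
```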

Page 17:

An application in the Medical Domain

Page 18:

Expert System for Skin Diseases Diagnosis

• Bumpiness and scaliness of skin
• Mostly for symptom gathering and for developing diagnosis skills
• Not replacing doctor's diagnosis

Page 19:

Architecture of the FF NN

• 96-20-10
• 96 input neurons, 20 hidden layer neurons, 10 output neurons (a sketch follows below)
• Inputs: skin disease symptoms and their parameters
  – Location, distribution, shape, arrangement, pattern, number of lesions, presence of an active border, amount of scale, elevation of papules, color, altered pigmentation, itching, pustules, lymphadenopathy, palmar thickening, results of microscopic examination, presence of herald patch, result of dermatology test called KOH
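
A minimal sketch of the 96-20-10 architecture with random weights; this is only an illustration of the layer sizes, not the DESKNET implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_in, n_hidden, n_out = 96, 20, 10    # 96 symptom inputs, 20 hidden, 10 disease outputs

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))

symptoms = rng.uniform(size=n_in)               # a stand-in symptom vector
diagnosis = sigmoid(W2 @ sigmoid(W1 @ symptoms))
print(diagnosis.argmax())                       # index of the most activated disease neuron
```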

Page 20:

Output

• 10 neurons, indicative of the diseases:
  – psoriasis, pityriasis rubra pilaris, lichen planus, pityriasis rosea, tinea versicolor, dermatophytosis, cutaneous T-cell lymphoma, secondary syphilis, chronic contact dermatitis, seborrheic dermatitis

Page 21:

Training data

• Input specs of 10 model diseases from 250 patients

• An input is set to 0.5 if the specific symptom value is not known (see the sketch below)

• Trained using standard error backpropagation algorithm
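
An illustrative helper (hypothetical name and encoding) for this convention: known symptom values are kept, and any unknown value becomes 0.5.

```python
def encode_symptoms(values, n_inputs=96):
    """values: dict mapping input index -> encoded symptom value.
    Any input whose value is not known is set to 0.5."""
    return [values.get(i, 0.5) for i in range(n_inputs)]

# e.g. only inputs 3 and 17 are known for this patient
print(encode_symptoms({3: 1.0, 17: 0.0})[:5])   # [0.5, 0.5, 0.5, 1.0, 0.5]
```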

Page 22:

Testing

• Previously unused symptom and disease data of 99 patients
• Result:
  – Correct diagnosis achieved for 70% of papulosquamous group skin diseases
  – Success rate above 80% for the remaining diseases, except for psoriasis
  – Psoriasis diagnosed correctly only in 30% of the cases
  – Psoriasis resembles other diseases within the papulosquamous group of diseases, and is somewhat difficult even for specialists to recognise

Page 23:

Explanation capability

• Rule-based systems reveal the explicit path of reasoning through the textual statements
• Connectionist expert systems reach conclusions through complex, non-linear and simultaneous interaction of many units
• Analysing the effect of a single input or a single group of inputs would be difficult and would yield incorrect results

Page 24:

Explanation contd.

• The hidden layer re-represents the data
• Outputs of hidden neurons are neither symptoms nor decisions

Page 25:

Figure: Explanation of dermatophytosis diagnosis using the DESKNET expert system.

[The figure shows the symptom and parameter inputs (duration of lesions in weeks, minimal itching, positive KOH test, lesions located on feet, minimal increase in pigmentation, positive test for pseudohyphae and spores, and a bias unit), the internal (hidden-layer) representation, and the disease diagnosis nodes for psoriasis (node 0), dermatophytosis (node 5) and seborrheic dermatitis (node 9), together with the connection weights along the paths.]

Page 26:

Discussion

• Symptoms and parameters contributing to the diagnosis are found from the n/w
• Standard deviation, mean and other tests of significance are used to arrive at the importance of the contributing parameters
• The n/w acts as an apprentice to the expert

Page 27:

Exercise

• Find the weakest condition for symmetry breaking. It is not the case that the network faces the symmetry problem only when ALL the weights are equal.