Intro. ANN & Fuzzy Systems
Lecture 7. Learning (IV): Error-Correcting Learning and LMS Algorithm
(C) 2001-2003 by Yu Hen Hu
Outline
• Error-correcting learning
  – LMS algorithm
  – Nonlinear activation
• Batch-mode least square learning
Supervised Learning with a Single Neuron
Input: x(n) = [1 x1(n) … xM(n)]T
Parameter: w(n) = [w0(n) w1(n) … wM(n)]T
Output: y(n) = f[u(n)] = f[wT(n) x(n)]
Feedback from environment: d(n), the desired output.
[Figure: a single neuron; the inputs 1, x1(n), …, xM(n) are weighted by w0, w1, …, wM and summed into u(n), the activation produces y(n), which is compared with the feedback d(n) from the environment.]
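In Matlab terms, the neuron's forward computation is an inner product followed by the activation; a minimal sketch (neuron_output is a hypothetical name, f a function handle for whatever activation is chosen):

% neuron_output.m (hypothetical): y(n) = f[u(n)] with u(n) = w'(n) x(n)
function y = neuron_output(w, x, f)
  u = w.' * x;  % weighted sum of inputs (x(1) = 1 supplies the bias w0)
  y = f(u);     % activation
end

For a linear activation, call it as y = neuron_output(w, x, @(u) u).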
Error-Correction Learning
• Error at time n = desired value − actual value:
e(n) = d(n) − y(n)
• Goal: modify w(n) to minimize the squared error
E(n) = e²(n)
• This leads to a steepest descent learning formulation:
w(n+1) = w(n) − η′ ∇w E(n)
where
∇w E(n) = [∂E(n)/∂w0(n) … ∂E(n)/∂wM(n)]T = −2 e(n) [∂y(n)/∂w0(n) … ∂y(n)/∂wM(n)]T
is the gradient of E(n) w.r.t. w(n); the factor −2 e(n) comes from the chain rule, since ∂E(n)/∂wi(n) = 2 e(n) ∂e(n)/∂wi(n) and ∂e(n)/∂wi(n) = −∂y(n)/∂wi(n). Here η′ is a learning rate constant.
Case 1. Linear Activation: LMS Learning
If f(u) = u, then y(n) = u(n) = wT(n) x(n). Hence
∇w E(n) = −2 e(n) [∂y(n)/∂w0(n) … ∂y(n)/∂wM(n)]T = −2 e(n) [1 x1(n) … xM(n)]T = −2 e(n) x(n)
Note that e(n) is a scalar and x(n) is an (M+1)-by-1 vector. Letting η = 2η′, we have the least mean square (LMS) learning formula as a special case of error-correcting learning:
w(n+1) = w(n) + η e(n) x(n)
Observation: the amount of correction made to w(n), namely w(n+1) − w(n), is proportional to the magnitude of the error e(n) and lies along the direction of the input vector x(n).
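As a concrete illustration, here is a minimal Matlab sketch of a single LMS update step (lms_step is a hypothetical name; this is a sketch, not the course's Learner.m):

% lms_step.m (hypothetical): one LMS update, w(n+1) = w(n) + eta*e(n)*x(n)
function w = lms_step(w, x, d, eta)
  % x is the (M+1)-by-1 input [1; x1; ...; xM], w the (M+1)-by-1 weights
  y = w.' * x;          % linear activation: y(n) = w'(n) x(n)
  e = d - y;            % error: e(n) = d(n) - y(n)
  w = w + eta * e * x;  % correction proportional to e(n), along x(n)
end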
Example
Let y(n) = w0(n)·1 + w1(n) x1(n) + w2(n) x2(n). Assume the inputs are:

n       1     2     3     4
x1(n)   0.5  -0.4   1.1   0.7
x2(n)   0.8   0.4  -0.3   1.2
d(n)    1     0     0     1

Assume w0(1) = w1(1) = w2(1) = 0 and η = 0.01. Then the first update is:
e(1) = d(1) − y(1) = 1 − [0·1 + 0·0.5 + 0·0.8] = 1
w0(2) = w0(1) + η e(1)·1 = 0 + 0.01·1·1 = 0.01
w1(2) = w1(1) + η e(1) x1(1) = 0 + 0.01·1·0.5 = 0.005
w2(2) = w2(1) + η e(1) x2(1) = 0 + 0.01·1·0.8 = 0.008
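The full iteration can be reproduced with a short script (a sketch under the same assumptions, not the actual Learner.m); it prints the updated weights after each sample, matching the Results table on the next slide:

% Sketch: run LMS over the four training samples of the example
X = [1    1    1    1;     % constant bias input
     0.5 -0.4  1.1  0.7;   % x1(n)
     0.8  0.4 -0.3  1.2];  % x2(n)
d = [1 0 0 1];             % desired outputs d(n)
eta = 0.01;
w = zeros(3, 1);           % w0(1) = w1(1) = w2(1) = 0
for n = 1:4
  e = d(n) - w.' * X(:, n);   % e(n) = d(n) - y(n)
  w = w + eta * e * X(:, n);  % w(n+1) = w(n) + eta e(n) x(n)
  fprintf('w(%d) = [%.4f %.4f %.4f]\n', n + 1, w);
end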
Results
Matlab source file: Learner.m
The table lists the weights after the update with sample n, i.e., w(n+1):

n        1       2       3       4
w0(n+1)  0.0100  0.0099  0.0098  0.0195
w1(n+1)  0.0050  0.0050  0.0049  0.0117
w2(n+1)  0.0080  0.0080  0.0080  0.0197
Case 2. Non-linear Activation
In general, for a differentiable activation f,
∇w E(n) = −2 e(n) (d f[u(n)]/du(n)) ∇w u(n) = −2 e(n) f′[u(n)] x(n)
so the weight update becomes
w(n+1) = w(n) + η e(n) f′[u(n)] x(n)
Observation:
The additional term is f′[u(n)]. When this term becomes small, learning will NOT take place. Otherwise, the update formula is similar to LMS.
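For instance, taking f to be the logistic sigmoid f(u) = 1/(1 + e^(-u)) (an assumed choice for illustration, with f′(u) = f(u)(1 − f(u))), one update step might look like this sketch (nonlinear_step is a hypothetical name):

% nonlinear_step.m (hypothetical): error-correcting update, nonlinear f
function w = nonlinear_step(w, x, d, eta)
  u  = w.' * x;
  y  = 1 / (1 + exp(-u));     % f(u): logistic sigmoid (assumed)
  fp = y * (1 - y);           % f'(u) for the sigmoid
  e  = d - y;
  w  = w + eta * e * fp * x;  % almost no learning when f'(u) is near 0
end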
LMS and Least Square Estimate
Assume that the parameters w remain unchanged for n = 1 to N (> M). Then
e²(n) = d²(n) − 2 d(n) wT x(n) + wT x(n) xT(n) w.
Define an expected error (mean square error)
E = (1/N) Σ_{n=1}^{N} e²(n) = (1/N) Σ_{n=1}^{N} d²(n) − 2 wT [(1/N) Σ_{n=1}^{N} x(n) d(n)] + wT [(1/N) Σ_{n=1}^{N} x(n) xT(n)] w
Denote
R = (1/N) Σ_{n=1}^{N} x(n) xT(n) and p = (1/N) Σ_{n=1}^{N} x(n) d(n)
Then
E = (1/N) Σ_{n=1}^{N} d²(n) − 2 pT w + wT R w
where R is the correlation matrix and p is the cross-correlation vector.
Least Square Solution
• Solving ∇w E = 0 for w gives
wLS = R⁻¹ p
• When {x(n)} is a wide-sense stationary random process, the LMS solution w(n) converges in probability to the least square solution wLS.
Matlab source file: LMSdemo.m
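A batch-mode computation of wLS might look like the following sketch (assuming the samples are stored as columns of X with d a row vector, as in the example script earlier; this is not the actual LMSdemo.m):

% Sketch: batch least square solution from N stored samples
N   = size(X, 2);      % number of samples
R   = (X * X.') / N;   % correlation matrix R
p   = (X * d.') / N;   % cross-correlation vector p
wLS = R \ p;           % solves R*w = p, i.e. wLS = inv(R)*p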
LMS Demonstration
[Figure: four panels showing the trajectories of w0, w1, w2 and the error over 6000 LMS iterations.]
LMS output comparison
[Figure: noiseless output (red), noisy output, filtered output, and error; a bottom panel compares the original noise with the filtered noise over 6000 samples.]