Extreme learning machine: Theory and applications

Extreme learning machine: Theory and applications. G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew. Neurocomputing, 2006. Presenter: James Chou, 2012/03/15.


TRANSCRIPT

Page 1: Extreme learning machine: Theory and applications

Extreme learning machine: Theory and applications
G.-B. Huang, Q.-Y. Zhu, and C.-K. Siew. Neurocomputing, 2006.

Presenter: James Chou, 2012/03/15

Page 2: Extreme learning machine:Theory and applications

Outline

Introduction
Single-hidden layer feed-forward neural networks
Neural Network Mathematical Model
Back Propagation algorithm
ELM Mathematical Model
Performance Evaluation
Conclusion

Page 3: Extreme learning machine:Theory and applications

Introduction

For the past decades, gradient-descent-based methods have mainly been used in the learning algorithms of feed-forward neural networks.

Traditionally, all the parameters of a feed-forward neural network need to be tuned iteratively, which requires a very long training time.

When the input weights and the hidden layer biases are randomly assigned, SLFNs (single-hidden layer feed-forward neural networks) can be simply considered as a linear system and the output weights (linking the hidden layer to the output layer) can be computed through simple generalized inverse operation.


Page 4: Extreme learning machine:Theory and applications

Introduction (Cont.)

Based on this idea, this paper proposes a simple learning algorithm for SLFNs called the extreme learning machine (ELM).

Unlike traditional learning algorithms, the extreme learning algorithm not only provides smaller training error but also better generalization performance.


Page 5: Extreme learning machine:Theory and applications

Single-hidden layer feed-forward neural networks

F(·) is the activation function.

Hard limiter function:

    f(x) = 1 when x ≥ θ
    f(x) = 0 when x < θ      (θ is the threshold)

Sigmoid function:

    f(x) = 1 / (1 + e^(−x))

    Output = F( Σ_{i=1}^{N} w_i · x_i )
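The two activation functions above can be sketched in Python (NumPy; function names are illustrative, not from the paper):

```python
import numpy as np

def hard_limiter(x, theta=0.0):
    # f(x) = 1 when x >= theta, 0 when x < theta
    return np.where(x >= theta, 1.0, 0.0)

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + np.exp(-x))
```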

Page 6: Extreme learning machine:Theory and applications

Single-hidden layer feed-forward neural networks (Cont.)


G(·) is the activation function; L is the number of hidden-layer nodes. The SLFN output is f_L(x) = Σ_{i=1}^{L} β_i · G(w_i · x + b_i).
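A minimal sketch of this SLFN output with a sigmoid G; the dimensions and random parameters are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 5, 3                       # hidden nodes and input dimension (illustrative)
W = rng.standard_normal((L, d))   # input weights w_i (one row per hidden node)
b = rng.standard_normal(L)        # hidden biases b_i
beta = rng.standard_normal(L)     # output weights beta_i

def G(x):
    # sigmoid activation
    return 1.0 / (1.0 + np.exp(-x))

def slfn_output(x):
    # f_L(x) = sum_{i=1}^{L} beta_i * G(w_i . x + b_i)
    return beta @ G(W @ x + b)

y = slfn_output(np.ones(d))
```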

Page 7: Extreme learning machine:Theory and applications

Neural Network Mathematical Model


Page 8: Extreme learning machine:Theory and applications

Neural Network Mathematical Model (Cont.)


If ε = 0, then f_L(x) = f(x) = T, where T is the known target, and the cost function equals 0.

Page 9: Extreme learning machine:Theory and applications

Neural Network Mathematical Model (Cont.)

The mathematical model is Hβ = T. From a linear-algebra viewpoint, if the hidden layer has 20 nodes and there are 1000 training samples, H is a 1000×20 matrix. Computing the inverse of such a large matrix is a long-standing issue; when we tried to compute the inverse of a 5000×50 matrix directly, the PC crashed.

Page 10: Extreme learning machine:Theory and applications

Back Propagation algorithm

The BP algorithm is the classic gradient-based algorithm for finding the best weight vectors by minimizing the cost function.


η is the learning rate.

Demo BP algorithm!
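One gradient-descent update of the kind BP performs can be sketched for a single sigmoid neuron with squared-error cost; this is a toy illustration, not the full multi-layer algorithm:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bp_step(w, x, t, eta=0.5):
    """One gradient-descent update for a single sigmoid neuron,
    squared-error cost E = 0.5 * (y - t)^2."""
    y = sigmoid(w @ x)
    grad = (y - t) * y * (1.0 - y) * x   # dE/dw via the chain rule
    return w - eta * grad                # w <- w - eta * dE/dw

# Drive the neuron's output toward the target t by repeated updates.
w = np.zeros(2)
x, t = np.array([1.0, 1.0]), 1.0
for _ in range(200):
    w = bp_step(w, x, t)
```

Note how many iterations even this tiny example needs; iterating such updates over every weight of a full network is what makes BP slow.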

Page 11: Extreme learning machine:Theory and applications

ELM Mathematical Model

H⁺ is the Moore–Penrose generalized inverse of the hidden-layer output matrix H.

H⁺ = (HᵀH)⁻¹Hᵀ   (valid when HᵀH is invertible, i.e., H has full column rank)


Page 12: Extreme learning machine:Theory and applications

ELM Mathematical Model (Cont.)

The Moore–Penrose generalized inverse matrix is an application of a linear-algebra theorem. For a general linear system Ax = y, with A ∈ R^(m×n) and y ∈ R^m, we say x̂ is a least-squares solution (l.s.s.) if ∥Ax̂ − y∥ = min over x of ∥Ax − y∥, where ∥·∥ denotes a norm in Euclidean space. The resolution of a general linear system Ax = y, where A may be singular and may even not be square, can be made very simple by the use of the Moore–Penrose generalized inverse.

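In NumPy, the Moore–Penrose inverse is available as `np.linalg.pinv`; a small sketch with an illustrative non-square A:

```python
import numpy as np

# Overdetermined system: A is 4x2, so it has no inverse in the usual sense.
A = np.array([[1.0,  0.0],
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
y = np.array([1.0, 2.0, 3.0, -1.0])

# x_ls minimizes ||Ax - y|| over all x
# (this particular system happens to be consistent, so the residual is 0).
x_ls = np.linalg.pinv(A) @ y
```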

Page 13: Extreme learning machine:Theory and applications

ELM Mathematical Model (Cont.)

The mathematical model is Hβ = T. We can rewrite the formula as

β = H⁺T = (HᵀH)⁻¹HᵀT

If the hidden layer has 20 nodes and there are 1000 training samples, HᵀH is only a 20×20 matrix, so the inversion is cheap.
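The whole ELM training procedure then fits in a few lines; a minimal sketch assuming a sigmoid activation (function names are illustrative):

```python
import numpy as np

def elm_train(X, T, L, seed=0):
    """Train a sigmoid SLFN the ELM way: random hidden layer,
    output weights via the Moore-Penrose pseudoinverse."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = rng.standard_normal((L, d))              # random input weights
    b = rng.standard_normal(L)                   # random hidden biases
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))     # N x L hidden output matrix
    beta = np.linalg.pinv(H) @ T                 # beta = H+ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W.T + b)))
    return H @ beta

# Toy usage: fit x^2 on [-1, 1] with 20 hidden nodes.
X = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
T = X.ravel() ** 2
W, b, beta = elm_train(X, T, L=20)
```

Only the pseudoinverse step involves any computation to speak of; there is no iteration at all, which is where the speed advantage over BP comes from.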


Page 14: Extreme learning machine:Theory and applications

Performance Evaluation

Page 15: Extreme learning machine:Theory and applications

Regression of SinC Function

Page 16: Extreme learning machine:Theory and applications

Regression of SinC Function (Cont.)

100,000 training samples with 5–20% noise; 100,000 noise-free testing samples. The results over 50 training runs are averaged in the following table.


Noise   TrainingTime_AVG (sec)   TrainingRMS_AVG   TestingRMS_AVG
5%      0.6462                   0.0113            2.201e-04
10%     0.6306                   0.0224            2.753e-04
15%     0.6427                   0.0334            8.336e-04
20%     0.6452                   0.0449            11.541e-04

Demo ELM!
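A scaled-down sketch of this SinC experiment (sample sizes and the noise scale of 0.05 are illustrative assumptions, far smaller than the slide's 100,000 samples):

```python
import numpy as np

rng = np.random.default_rng(1)

def sinc(x):
    # sin(x)/x with the x = 0 case handled (np.sinc(t) = sin(pi*t)/(pi*t))
    return np.sinc(x / np.pi)

x_tr = rng.uniform(-10.0, 10.0, 2000)
t_tr = sinc(x_tr) + rng.normal(0.0, 0.05, x_tr.size)  # noisy training targets
x_te = rng.uniform(-10.0, 10.0, 2000)
t_te = sinc(x_te)                                     # noise-free test targets

L = 20                                   # hidden nodes, as in the slides
W = rng.standard_normal((L, 1))
b = rng.standard_normal(L)

def hidden(x):
    # N x L sigmoid hidden-layer output matrix
    return 1.0 / (1.0 + np.exp(-(x[:, None] @ W.T + b)))

beta = np.linalg.pinv(hidden(x_tr)) @ t_tr            # ELM output weights
test_rmse = np.sqrt(np.mean((hidden(x_te) @ beta - t_te) ** 2))
```

Because the noise averages out in the least-squares fit, the test error on clean data can come out below the training error, matching the pattern in the table above.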

Page 17: Extreme learning machine:Theory and applications

Real-World Regression Problems


Page 18: Extreme learning machine:Theory and applications

Real-World Regression Problems (Cont.)


Page 19: Extreme learning machine:Theory and applications

Real-World Regression Problems (Cont.)


Page 20: Extreme learning machine:Theory and applications

Real-World Regression Problems (Cont.)


Page 21: Extreme learning machine:Theory and applications

Real-World Very Large Complex Applications


Page 22: Extreme learning machine:Theory and applications

Real Medical Diagnosis Application: Diabetes


Page 23: Extreme learning machine:Theory and applications

Protein Sequence Classification

Page 24: Extreme learning machine:Theory and applications

Conclusion

Advantages
ELM needs less training time compared to the popular BP and SVM/SVR.
The prediction performance of ELM is usually a little better than BP and close to SVM/SVR in many applications.
Only one parameter needs to be tuned: L, the number of hidden-layer nodes.
Nonlinear activation functions still work in ELM.

Disadvantages
How to find the optimal solution? Local minima issue.
Easy to overfit.
