„The perfect is not good enough!” (Carl Benz)

VISUALIZATION OF HIGH DIMENSIONAL DATA BY USE OF GENETIC PROGRAMMING – APPLICATION TO ON-LINE INFRARED SPECTROSCOPY BASED PROCESS MONITORING

TIBOR KULCSÁR, JÁNOS ABONYI
UNIVERSITY OF PANNONIA
DEPARTMENT OF PROCESS ENGINEERING


Post on 03-Jan-2016



2

Preconditions

Online analyzers are widely used in the oil industry to predict product properties such as density and cloud point.

These properties cannot be described using linear models.

Visualization of the high-dimensional spectral database is needed for model development and process monitoring.

A cost function and a tool for equation discovery are needed to obtain a compact and interpretable mapping of high-dimensional data.

3

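Task I: Estimation

[Figure: spectrum plot; x-axis: wavenumber, 4000 to 4800 cm-1; y-axis: 0 to 0.016]

$y_j = f(w_j)$, with $w \in \mathbb{R}^{n_w}$ and $y \in \mathbb{R}^{n_y}$, where $n_w = 195$ and $n_y = 10 \ldots 30$

$\sum_{k=1}^{n_w} w_{j,k} = 1, \quad j = 1, \ldots, N$

$y_j = [P_{j,1}, \ldots, P_{j,n_y}]^T, \quad j = 1, \ldots, N$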

4

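Similar spectra - Similar property

[Figure: plot of the spectral space with Dmax marked; Rsphere = 3, the percentage of Dmax corresponding to the radius of the sphere]

[Figure: plot of TotAro (x-axis, 15 to 35) against PolyCycl (y-axis, 1 to 5.5)]

Distance between the spectra $S_x$ and $S_j$: $d_i(S_x, S_j)$, a sum over the channels $k = 1, \ldots, n$ of the differences between $w_{x,k}$ and $w_{j,k}$

If $d_i(S_x, S_j) \le i_m$, then $P_{vj} \approx P_{vx}$, that is, $|P_{vx} - P_{vj}| \le E_v$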
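The "similar spectra, similar property" idea can be illustrated with a small sketch that collects every database spectrum lying inside a sphere around a query spectrum. The Euclidean metric, the definition of Dmax as the largest pairwise distance in the database, and the names `euclidean` and `neighbours_in_sphere` are assumptions made for illustration; the slide does not state them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long spectra (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighbours_in_sphere(query, database, r_sphere_percent):
    """Return the indices of database spectra inside the sphere around `query`.

    The radius is r_sphere_percent percent of Dmax. Dmax is taken here as the
    largest pairwise distance within the database (an assumption).
    """
    d_max = max(
        euclidean(a, b)
        for i, a in enumerate(database)
        for b in database[i + 1:]
    )
    radius = d_max * r_sphere_percent / 100.0
    return [j for j, s in enumerate(database) if euclidean(query, s) <= radius]

# Toy three-channel "spectra"; the spectra in the slides have far more channels.
database = [
    [0.20, 0.30, 0.50],
    [0.21, 0.29, 0.50],
    [0.60, 0.30, 0.10],
]
query = [0.205, 0.295, 0.50]
print(neighbours_in_sphere(query, database, r_sphere_percent=3))  # [0, 1]
```

Only the two spectra that nearly coincide with the query fall inside the 3 % sphere; the third one is far outside and would not contribute to a local model.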

5

Finding similar spectra

Prediction model: Nearest Neighbors algorithm; the neighborhood is the basis of the prediction

2D mapping: defines the range of validity for the local models; the mapped plane should follow the original spectral space

Quality measure: measures the quality of the mapping; measures how well neighborhoods are preserved

[Figure: sample X shown together with the labelled points S1 to S6 and N1 to N6]

Property X = f ( Prop[S1, S2, S3, S4, S5, S6] )

$\hat{P}_x = \frac{1}{n} \sum_{i=1}^{n} P_i$
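A minimal sketch of the nearest-neighbor prediction described on this slide: the property of an unknown sample is estimated as the mean of the property values of its n closest spectra. The Euclidean distance and the function name `knn_predict` are assumptions; the data are made up for illustration.

```python
import math

def knn_predict(query, spectra, properties, n_neighbours=3):
    """Estimate a property as the mean over the n nearest database spectra."""
    # Rank database entries by their distance to the query spectrum.
    order = sorted(range(len(spectra)), key=lambda i: math.dist(query, spectra[i]))
    nearest = order[:n_neighbours]
    return sum(properties[i] for i in nearest) / len(nearest)

# Toy two-channel spectra with a made-up property value for each.
spectra = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
properties = [10.0, 12.0, 14.0, 100.0]

print(knn_predict([0.1, 0.1], spectra, properties, n_neighbours=3))  # 12.0
```

The distant fourth sample does not enter the average, which is why the range of validity of such a local model depends on how the neighborhood is defined.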

6

Chemical information – interpretable?

[Figure: spectrum with absorbency (0 to 1.2) plotted against wavenumber (4000 to 4800 cm-1); bands labelled Aromatic, Ethylenic, Olefinic, Branched / cyclonic, Linear, Saturated and Branched]

$K_{ARO} \sim \frac{W_1 W_2}{W_3 W_4}$ (band labels: aromatic, linear, olefinic)

7

Aggregates – need for explicit mapping

[Figure: scatter-plot matrix of the aggregates Rsat, Karo, Kiso, Kene, Nola, Nolef, Naro, Kox, Parox, Karo3, Kcy, Ksatu, KeroH and AKaro, each plotted pairwise against the others]

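$\frac{W_1 + W_2}{W_3 + W_4 + W_5}$

$\left( \frac{W_1 \cdot W_2}{W_3 \cdot W_4} - C_1 \right) C_2 + C_3$

$\left( \frac{W_1 - W_2 + C_1}{C_2 W_1 + W_4} - C_3 \right) C_4 + C_5$

$\left( W_1 \, W_2 \, W_3 - C_1 \right) C_2 - C_3$

$\left( \frac{W_1 + W_2 + W_3}{W_4 + W_5} - C_1 \right) C_2 + C_3$

$\left( \frac{C_1 W_1 + W_2}{W_3 + W_4} - C_1 \right) C_2 + C_3$

$\left( C_1 W_1 + C_2 W_2 + C_3 W_3 + C_4 W_4 + C_5 W_5 + C_6 W_6 \right) - C_7$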

[Figure: two aggregates define a 2D mapping; axes: Aggregate 1 and Aggregate 2]
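As a sketch of how two aggregates turn a spectrum into a point on a 2D map, the snippet below evaluates two aggregate forms of the kind shown on this slide, as far as they can be read: a ratio of intensity sums and a scaled, shifted ratio of products. The choice of channels W1 to W5, the constants C1 to C3 and all function names are hypothetical.

```python
def aggregate_ratio(w):
    """(W1 + W2) / (W3 + W4 + W5) for the selected intensities w[0..4]."""
    return (w[0] + w[1]) / (w[2] + w[3] + w[4])

def aggregate_scaled(w, c1, c2, c3):
    """((W1 * W2) / (W3 * W4) - C1) * C2 + C3 with hypothetical constants."""
    return ((w[0] * w[1]) / (w[2] * w[3]) - c1) * c2 + c3

def map_to_2d(w, constants=(0.5, 2.0, 1.0)):
    """Map one spectrum (its selected intensities) to a point (x, y)."""
    return (aggregate_ratio(w), aggregate_scaled(w, *constants))

# Made-up intensities at five selected wavenumbers.
w = [2.0, 4.0, 1.0, 2.0, 3.0]
print(map_to_2d(w))  # (1.0, 8.0)
```

Because the mapping is an explicit formula, a new spectrum can be placed on the map without refitting anything, which is the point of asking for an explicit mapping.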

8

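Representation of Aggregates

One of the most popular methods for representing structures is the binary tree.

Terminal nodes:

Variables: x1, x2

Parameters: p1, p2

Non-terminal nodes:

Operators: +, -, *, /

Functions: exp(), cos()

[Figure: binary tree of an example expression y that combines x1 with (p2 + x2) / p1; the root node has the children x1 and "/", the "/" node has the children "+" and p1, and the "+" node has the children p2 and x2]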
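One way to hold such a tree in code is as nested tuples, with operators and functions as non-terminal nodes and variable or parameter names as terminal nodes. This is a sketch under that assumed encoding, not the authors' implementation; `evaluate` and `to_infix` are illustrative names, and the example uses only the (p2 + x2) / p1 subtree of the figure.

```python
import math
import operator

BINARY = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "/": operator.truediv}
UNARY = {"exp": math.exp, "cos": math.cos}

def evaluate(node, env):
    """Recursively evaluate a tree of nested tuples against variable values."""
    if isinstance(node, str):            # terminal: variable or parameter name
        return env[node]
    if isinstance(node, (int, float)):   # terminal: numeric constant
        return node
    op, *children = node
    if op in BINARY:
        left, right = children
        return BINARY[op](evaluate(left, env), evaluate(right, env))
    (child,) = children                  # unary function such as exp or cos
    return UNARY[op](evaluate(child, env))

def to_infix(node):
    """Render the tree as a fully parenthesised infix string."""
    if not isinstance(node, tuple):
        return str(node)
    op, *children = node
    if op in BINARY:
        return f"({to_infix(children[0])} {op} {to_infix(children[1])})"
    return f"{op}({to_infix(children[0])})"

tree = ("/", ("+", "p2", "x2"), "p1")
env = {"x1": 2.0, "x2": 1.5, "p1": 4.0, "p2": 0.5}
print(to_infix(tree))       # ((p2 + x2) / p1)
print(evaluate(tree, env))  # 0.5
```

Immutable tuples make it easy to build modified copies of a tree, which is what the genetic operators need.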

9

Genetic Operators: Mutation

[Figure: two trees that differ in a single node: x1 - (x2 * x1) / p1 and x1 - (x2 + x1) / p1]
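A point mutation of this kind can be sketched on trees stored as nested tuples `(operator, left, right)`. The encoding, the operator set and the helper names are assumptions for illustration; other mutation variants (for example replacing a whole subtree) are equally common.

```python
import random

OPERATORS = ["+", "-", "*", "/"]

def operator_paths(node, path=()):
    """Yield the path (tuple of child indices) to every operator node."""
    if isinstance(node, tuple):
        yield path
        for i, child in enumerate(node[1:], start=1):
            yield from operator_paths(child, path + (i,))

def replace_operator(node, path, new_op):
    """Return a copy of the tree with the operator at `path` replaced."""
    if not path:
        return (new_op,) + node[1:]
    i = path[0]
    return node[:i] + (replace_operator(node[i], path[1:], new_op),) + node[i + 1:]

def mutate(tree, rng):
    """Point mutation: pick one operator node and swap in a different operator."""
    path = rng.choice(list(operator_paths(tree)))
    target = tree
    for i in path:
        target = target[i]
    new_op = rng.choice([op for op in OPERATORS if op != target[0]])
    return replace_operator(tree, path, new_op)

# x1 - (x2 * x1) / p1
tree = ("-", "x1", ("/", ("*", "x2", "x1"), "p1"))

# Deterministic version of the change shown in the figure: "*" becomes "+".
print(replace_operator(tree, (2, 1), "+"))

# Random version: some operator node is changed to a different operator.
print(mutate(tree, random.Random(0)))
```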

10

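Genetic Operators: Crossover

[Figure: crossover exchanges subtrees between two parent trees. Parents: x1 - (x2 + x1) / p1 and x2 + (x1 + p1). Offspring: x1 - (x1 + p1) and x2 + (x2 + x1) / p1]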
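The subtree exchange can be sketched on trees stored as nested tuples `(operator, left, right)`. The encoding and helper names are assumptions; the crossover points are fixed here so that the result can be compared with the trees in the figure, whereas genetic programming would normally choose them at random.

```python
def get_subtree(node, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        node = node[i]
    return node

def set_subtree(node, path, new):
    """Return a copy of the tree with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return node[:i] + (set_subtree(node[i], path[1:], new),) + node[i + 1:]

def crossover(parent_a, path_a, parent_b, path_b):
    """Swap the subtrees found at the two given paths; return both offspring."""
    sub_a = get_subtree(parent_a, path_a)
    sub_b = get_subtree(parent_b, path_b)
    return (set_subtree(parent_a, path_a, sub_b),
            set_subtree(parent_b, path_b, sub_a))

parent_a = ("-", "x1", ("/", ("+", "x2", "x1"), "p1"))  # x1 - (x2 + x1) / p1
parent_b = ("+", "x2", ("+", "x1", "p1"))                # x2 + (x1 + p1)

# Exchange the right-hand subtrees of both roots.
child_a, child_b = crossover(parent_a, (2,), parent_b, (2,))
print(child_a)  # ('-', 'x1', ('+', 'x1', 'p1'))
print(child_b)  # ('+', 'x2', ('/', ('+', 'x2', 'x1'), 'p1'))
```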

11

Scheme of Genetic Programming

Creation of initial population

Evaluation (parameter optimization, fitness value)

Selection

Direct reproduction, Crossover, Mutation

New generation

End? (if not, continue with Evaluation)

End
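The scheme can be sketched as a compact loop. Everything specific here is an assumption for illustration: the toy regression target stands in for the mapping-quality cost function, the parameter optimization step is omitted, and the population size, rates and depth limit are arbitrary.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x1", "x2", 1.0]

def random_tree(rng, depth=2):
    """Grow a random tree of nested tuples (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(list(OPS)), random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(node, env):
    if isinstance(node, tuple):
        return OPS[node[0]](evaluate(node[1], env), evaluate(node[2], env))
    return env[node] if isinstance(node, str) else node

def tree_depth(node):
    return 1 + max(tree_depth(node[1]), tree_depth(node[2])) if isinstance(node, tuple) else 0

def paths(node, path=()):
    """Yield the index path to every node of the tree."""
    yield path
    if isinstance(node, tuple):
        for i in (1, 2):
            yield from paths(node[i], path + (i,))

def get(node, path):
    for i in path:
        node = node[i]
    return node

def put(node, path, new):
    """Return a copy of the tree with the subtree at `path` replaced."""
    if not path:
        return new
    i = path[0]
    return node[:i] + (put(node[i], path[1:], new),) + node[i + 1:]

def crossover(a, b, rng):
    """Graft a random subtree of b onto a random position in a."""
    return put(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def mutate(tree, rng):
    """Replace a random subtree with a freshly grown one."""
    return put(tree, rng.choice(list(paths(tree))), random_tree(rng, depth=1))

def fitness(tree, samples):
    """Negative squared error on the toy target; higher is better."""
    total = 0.0
    for env, y in samples:
        err = evaluate(tree, env) - y
        total += err * err
    return -total

def run_gp(samples, rng, pop_size=30, generations=20, max_depth=4):
    population = [random_tree(rng) for _ in range(pop_size)]   # initial population
    for _ in range(generations):                               # "End?" check
        ranked = sorted(population, key=lambda t: fitness(t, samples), reverse=True)
        parents = ranked[: pop_size // 2]                      # selection
        new_gen = ranked[:2]                                   # direct reproduction
        while len(new_gen) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            # Keep trees small: fall back to a parent if the child is too deep.
            new_gen.append(child if tree_depth(child) <= max_depth else rng.choice(parents))
        population = new_gen                                   # new generation
    return max(population, key=lambda t: fitness(t, samples))

# Toy target y = x1 * x2 + x1, sampled on a small grid.
samples = [({"x1": a, "x2": b}, a * b + a)
           for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.5, 1.0)]
best = run_gp(samples, random.Random(0))
print(best, fitness(best, samples))
```

Carrying the two best trees over unchanged means the best fitness never gets worse from one generation to the next; the depth limit is a simple guard against trees growing without bound.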

12

Process of model development

Measurement: online spectrum, laboratory data

MATLAB: preprocessing, data query

MATLAB: genetic algorithm

TOPNIR environment

Online system

13

Results

Best pair from the original set

Best equation and an optimised pair

Search for a better pair

14

Conclusion

The quality of the mapping is measurable: neighborhood preservation (forward and backward); discriminating operational regimes

Aggregate-based mapping: interpretable chemical information; building aggregates needs much experience (divination)

Genetic programming: a controlled method for creating new equations; needs a proper cost function (one that measures the quality of the mapping)

Visual representation of models: aggregate -> 2D plot -> dashboard graph; information about the model structure
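One simple way to put a number on neighborhood preservation is the average overlap between each sample's k nearest neighbors in the original space and in the 2D map. This is a simplified stand-in and not necessarily the measure used by the authors; in particular it does not separate the forward and backward directions. The function names and the data are illustrative assumptions.

```python
import math

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def neighbourhood_overlap(original, mapped, k):
    """Mean fraction of k-nearest neighbours shared by both spaces (0 to 1)."""
    total = 0.0
    for i in range(len(original)):
        shared = k_nearest(original, i, k) & k_nearest(mapped, i, k)
        total += len(shared) / k
    return total / len(original)

original = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [3.0, 0.0, 0.0], [7.0, 0.0, 0.0]]
good_map = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [7.0, 0.0]]
bad_map = [[0.0, 0.0], [7.0, 0.0], [1.0, 0.0], [3.0, 0.0]]

print(neighbourhood_overlap(original, good_map, k=1))  # 1.0
print(neighbourhood_overlap(original, bad_map, k=1))   # 0.25
```

A score like this could serve as the fitness value when searching for aggregate pairs, since it rewards maps on which nearby points are also nearby spectra.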

15

Questions? …

ACKNOWLEDGMENT

The financial support of the TAMOP-4.2.2/B-10/1-2010-0025 project is acknowledged.

In case of any questions or remarks, please contact us

[email protected]