
Negnevitsky, Pearson Education, 2011

Lecture 15

Knowledge engineering: Genetic algorithms and hybrid systems

- Will genetic algorithms work for my problem?
- The travelling salesman problem
- Will a hybrid system work for my problem?
- Neuro-fuzzy decision-support systems
- Time-series prediction
- Summary


Will genetic algorithms work for my problem?

- Optimisation is the process of finding a better solution to a problem.
- A genetic algorithm generates a population of competing candidate solutions and then causes them to evolve through the process of natural selection: poor solutions tend to die out, while better solutions survive and reproduce. By repeating this process over and over again, the genetic algorithm breeds an optimal solution.


Case study 8

The travelling salesman problem

Suppose we are going to travel by car in Western and Central Europe. We want to produce an optimal itinerary for visiting all major cities and returning home.

This problem is well known as the travelling salesman problem (TSP). Given a finite number of cities, N, and the cost of travel (or the distance) between each pair of cities, we need to find the cheapest way (or the shortest route) for visiting each city exactly once and returning to the starting point.


- The TSP arises in numerous transportation and logistics applications, such as
  - arranging routes for school buses to pick up children in a school district,
  - delivering meals to home-bound people,
  - scheduling stacker cranes in a warehouse,
  - planning truck routes to pick up parcel post, and many others.
- A classic example of the TSP is the scheduling of a machine to drill holes in a circuit board.


How does a genetic algorithm solve the travelling salesman problem?

- First, we need to decide how to represent a route of the salesman. The most natural way of representing a route is the path representation. Each city is given an alphabetic or numerical name, the route through the cities is represented as a chromosome, and appropriate genetic operators are used to create new routes.
- Suppose we have nine cities named from 1 to 9. In a chromosome, the order of the integers represents the order in which the cities will be visited by the salesman.
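The path representation can be sketched in a few lines of Python. This is a minimal illustration rather than the lecture's own code: the function names `random_route` and `is_legal_route` are hypothetical, and a chromosome is assumed to be a plain list of city labels.

```python
import random


def random_route(n_cities, rng=random):
    """Return a random chromosome: a permutation of the city labels 1..n_cities."""
    route = list(range(1, n_cities + 1))
    rng.shuffle(route)
    return route


def is_legal_route(route, n_cities):
    """A route is legal when every city 1..n_cities appears exactly once."""
    return sorted(route) == list(range(1, n_cities + 1))


# An illustrative nine-city chromosome: visit city 1, then 6, then 5, and so on.
example = [1, 6, 5, 3, 2, 8, 4, 9, 7]
print(is_legal_route(example, 9))
```

Keeping a legality check next to the representation is a design convenience: every genetic operator applied to such chromosomes should return routes that still pass it.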


An example of the salesman’s route

[Figure: nine cities, numbered 1 to 9, joined by the salesman’s route.]

Chromosome: 1 6 5 3 2 8 4 9 7


How does the crossover operator work?

The crossover operator in its classical form cannot be directly applied to the TSP. A simple exchange of parts between two parents would produce illegal routes containing duplicates and omissions: some cities would be visited twice while some others would not be visited at all.

Parent 1: 1 6 5 3 | 2 8 4 9 7
Parent 2: 3 7 6 1 | 9 4 8 2 5

Child 1:  1 6 5 3 | 9 4 8 2 5
Child 2:  3 7 6 1 | 2 8 4 9 7
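The failure of classical crossover can be reproduced directly. The sketch below assumes a single crossover point after the fourth city, which is what the parent and child chromosomes in the example imply; the helper names are hypothetical.

```python
def one_point_crossover(parent1, parent2, point):
    """Classical crossover: swap the tails of the two parents after `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def duplicates_and_omissions(route, n_cities):
    """Return (cities visited more than once, cities never visited)."""
    duplicates = sorted({c for c in route if route.count(c) > 1})
    omissions = sorted(set(range(1, n_cities + 1)) - set(route))
    return duplicates, omissions


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = one_point_crossover(parent1, parent2, 4)

for name, child in (("Child 1", child1), ("Child 2", child2)):
    dup, miss = duplicates_and_omissions(child, 9)
    print(name, child, "duplicates:", dup, "omissions:", miss)
```

Both children fail the test: one visits city 5 twice and never reaches city 7, and the other does the opposite.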


Crossover operators for the TSP

Step 1
Parent 1: 1 6 5 | 3 2 8 4 | 9 7
Parent 2: 3 7 6 | 1 9 4 8 | 2 5

Step 2
Child 1:  * * * | 1 9 4 8 | * *
Child 2:  * * * | 3 2 8 4 | * *

Step 3
Child 1:  6 5 3 | 1 9 4 8 | 2 7
Child 2:  7 6 1 | 3 2 8 4 | 9 5
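One reading of the three steps in this figure is: exchange the sections between two crossing points, then fill the remaining positions of each child with the cities of its own parent, in their original order, skipping cities already present in the received section. The sketch below implements that reading. It is an assumption inferred from the figure, not a definitive statement of the operator, and the function name `swap_section_crossover` is hypothetical.

```python
def swap_section_crossover(parent1, parent2, start, end):
    """Exchange the sections parent[start:end], then repair each child so it
    remains a legal route (every city exactly once)."""
    section1 = parent1[start:end]
    section2 = parent2[start:end]

    def build(received_section, own_parent):
        # Cities of the child's own parent that are not in the received
        # section, kept in their original order.
        rest = [c for c in own_parent if c not in received_section]
        return rest[:start] + received_section + rest[start:]

    child1 = build(section2, parent1)
    child2 = build(section1, parent2)
    return child1, child2


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = swap_section_crossover(parent1, parent2, 3, 7)
print(child1)
print(child2)
```

With crossing points after the third and seventh cities, the sketch yields children that are both legal routes, each partly determined by both parents.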


How does the mutation operator work?

- There are two types of mutation operators: reciprocal exchange and inversion.
- The reciprocal exchange operator simply swaps two randomly selected cities in the chromosome.
- The inversion operator selects two random points along the chromosome string and reverses the order of the cities between these points.


Mutation operators for the TSP

(a) original chromosomes
Reciprocal exchange: 1 6 5 3 2 8 4 9 7
Inversion:           1 6 5 3 2 8 4 9 7

(b) mutated chromosomes
Reciprocal exchange: 1 6 8 3 2 5 4 9 7
Inversion:           1 2 3 5 6 8 4 9 7
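Both mutation operators are short enough to sketch. The positions are passed explicitly so the behaviour is reproducible; the wrapper `mutate`, which picks positions at random and chooses between the two operators with equal probability, is an illustrative assumption and not something the lecture specifies.

```python
import random


def reciprocal_exchange(route, i, j):
    """Swap the cities at positions i and j."""
    mutated = list(route)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def inversion(route, i, j):
    """Reverse the order of the cities in positions i .. j-1."""
    mutated = list(route)
    mutated[i:j] = reversed(mutated[i:j])
    return mutated


def mutate(route, rng=random):
    """Apply one of the two operators at two randomly chosen positions."""
    i, j = sorted(rng.sample(range(len(route)), 2))
    if rng.random() < 0.5:
        return reciprocal_exchange(route, i, j)
    return inversion(route, i, j + 1)


original = [1, 6, 5, 3, 2, 8, 4, 9, 7]
print(reciprocal_exchange(original, 2, 5))  # swaps cities 5 and 8
print(inversion(original, 1, 5))            # reverses the segment 6 5 3 2
```

Both operators only rearrange cities, so a legal route always stays legal after mutation.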


How do we define a fitness function in the TSP?

The fitness of each individual chromosome is determined as the reciprocal of the route length. In other words, the shorter the route, the fitter the chromosome.
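A minimal sketch of this fitness function follows. It assumes that cities are points in the plane and that the cost of travel is the Euclidean distance; the `coords` dictionary and the four-city example are hypothetical.

```python
import math


def route_length(route, coords):
    """Length of the closed tour: visit the cities in order, then return home."""
    total = 0.0
    for k, city in enumerate(route):
        next_city = route[(k + 1) % len(route)]  # wraps back to the start
        total += math.dist(coords[city], coords[next_city])
    return total


def fitness(route, coords):
    """Reciprocal of the route length: the shorter the route, the fitter."""
    return 1.0 / route_length(route, coords)


# Hypothetical example: four cities on the corners of a unit square.
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0), 4: (0.0, 1.0)}
around_the_square = [1, 2, 3, 4]
crossing_diagonals = [1, 3, 2, 4]
print(fitness(around_the_square, coords) > fitness(crossing_diagonals, coords))
```

The route that follows the perimeter is shorter than the one that crosses both diagonals, so it receives the higher fitness.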


Performance graph and the best salesman’s route created in a population of 20 chromosomes after 100 generations

[Figure: the best route through 20 cities plotted on the unit square, total distance 5.124; and a performance graph of the best and average distance against generations (0 to 100) for N = 20, pc = 0.7, pm = 0.001.]


Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.001

[Figure (a): the best route through 20 cities plotted on the unit square; and a performance graph of the best and average distance against generations (0 to 100) for N = 200, pc = 0.7, pm = 0.001.]


Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.01

[Figure (b): the best route through 20 cities plotted on the unit square; and a performance graph of the best and average distance against generations (0 to 100) for N = 200, pc = 0.7, pm = 0.01.]


Will a hybrid system work for my problem?

Solving complex real-world problems requires an application of complex intelligent systems that combine the advantages of expert systems, fuzzy logic, neural networks and evolutionary computation. Such systems can integrate human-like expertise in a specific domain with abilities to learn and adapt to a rapidly changing environment.



Neuro-fuzzy decision-support systems

As an example, we will develop an intelligent system for diagnosing myocardial perfusion from cardiac images. Suppose we have a set of cardiac images as well as the clinical notes and the physician’s interpretation.


What are SPECT images?

- Diagnosis in modern cardiac medicine is based on the analysis of SPECT (Single Photon Emission Computed Tomography) images.
- After a patient is injected with a radioactive tracer, two sets of SPECT images are obtained: one is taken 10–15 minutes after the injection, when the stress is greatest (stress images), and the other is taken 2–5 hours after the injection (rest images). Distribution of the radioactive tracer in the cardiac muscle is proportional to the muscle’s perfusion.
- A cardiologist detects abnormalities in the heart function by comparing stress and rest images.


- The SPECT images are usually represented by high-resolution two-dimensional black-and-white pictures with up to 256 shades of grey. Brighter patches on the image correspond to well-perfused areas of the myocardium, while darker patches may indicate the presence of ischemia.
- Unfortunately, a visual inspection of the SPECT images is highly subjective; physicians’ interpretations are therefore often inconsistent and susceptible to errors.


- For this study, we use 267 cardiac diagnostic cases. Each case is accompanied by two SPECT images (the stress image and the rest image), and each image is divided into 22 regions.
- The region’s brightness, which in turn reflects perfusion inside this region, is expressed by an integer number between 0 and 100.
- Thus, each cardiac diagnostic case is represented by 44 continuous features and one binary feature that assigns an overall diagnosis: normal or abnormal.


The SPECT data set

- The entire SPECT data set consists of 55 cases classified as normal (positive examples) and 212 cases classified as abnormal (negative examples).
- The entire set is divided into training and test sets.
- The training set has 40 positive and 40 negative examples. The test set is represented by 15 positive and 172 negative examples.


Can we train a back-propagation neural network to classify the SPECT images into normal and abnormal?

- The number of neurons in the input layer is determined by the total number of regions in the stress and rest images. In this example, each image is divided into 22 regions, so we need 44 input neurons.
- Since SPECT images are to be classified as either normal or abnormal, we use two output neurons.
- A good generalisation is obtained with as few as 5 to 7 neurons in the hidden layer.
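The network shape described here (44 inputs, a small hidden layer, 2 outputs) can be sketched with the standard library alone. This is only a forward pass through an untrained network: the sigmoid activation, the choice of 6 hidden neurons, the random initial weights and the scaling of brightness values to [0, 1] are all assumptions made for illustration, and back-propagation training is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_layer(n_inputs, n_neurons, rng):
    """One weight row per neuron; index 0 holds the bias (hypothetical layout)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(layer, inputs):
    """Output of every neuron in the layer for the given input vector."""
    return [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], inputs)))
            for w in layer]


rng = random.Random(0)
hidden = make_layer(44, 6, rng)   # 44 inputs: 22 stress + 22 rest regions
output = make_layer(6, 2, rng)    # 2 outputs: normal, abnormal

# A made-up case: 44 brightness values in 0..100, scaled to [0, 1].
case = [rng.randint(0, 100) / 100.0 for _ in range(44)]
normal_out, abnormal_out = forward(output, forward(hidden, case))
print(round(normal_out, 3), round(abnormal_out, 3))
```

Because the weights are random, the two printed numbers carry no diagnostic meaning; the sketch only shows how the 44 region values flow through the layers.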


- However, when we test the network on the test set, we find that the network’s performance is rather poor: about 25 percent of normal cardiac diagnostic cases are misclassified as abnormal and over 35 percent of abnormal cases are misclassified as normal; the overall diagnostic error exceeds 33 percent.
- This indicates that the training set may lack some important examples (a neural network is only as good as the examples used to train it).
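The overall error can be checked against the two per-class error rates, using the test-set sizes given for the SPECT data set (15 normal and 172 abnormal cases). The rates below are the approximate figures quoted in the text, so the result is a rough consistency check rather than an exact reproduction.

```python
# Test-set sizes stated in the description of the SPECT data set.
n_normal, n_abnormal = 15, 172

# Approximate error rates quoted in the text ("about 25 percent" of normal
# cases, "over 35 percent" of abnormal cases).
normal_error = 0.25
abnormal_error = 0.35

# Weight each class error by the number of test cases in that class.
misclassified = normal_error * n_normal + abnormal_error * n_abnormal
overall_error = misclassified / (n_normal + n_abnormal)
print(f"overall error is roughly {overall_error:.1%}")
```

Because abnormal cases dominate the test set, the overall error sits close to the abnormal-class error, which is consistent with the stated figure of more than 33 percent.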


Can we improve the accuracy of the diagnosis?

First, we need to redefine the problem. To train the network, we used the same number of positive and negative examples, although in real clinical trials the ratio between normal and abnormal SPECT images is very different. More importantly, the misclassification of an abnormal cardiac case could lead to far more serious consequences than the misclassification of a normal case. Therefore, in order to achieve a small classification error for abnormal SPECT images, we might agree to have a relatively large error for normal images.


- The neural network produces two outputs. The first output corresponds to the possibility that the SPECT image belongs to the class normal, and the second to the possibility that the image belongs to the class abnormal. If, for example, the first (normal) output is 0.92 and the second (abnormal) is 0.16, the SPECT image is classified as normal, and we can conclude that the risk of a heart attack for this case is low.
- On the other hand, if the normal output is low, say 0.17, and the abnormal output is much higher, say 0.51, the SPECT image is classified as abnormal, and we can infer that the risk of a heart attack in this case is rather high. However, if the two outputs are close, say the normal output is 0.51 and the abnormal 0.49, we cannot confidently classify the image.


Can we use fuzzy logic for decision-making in medical diagnosis?

- To build a fuzzy system, we first need to determine input and output variables, define fuzzy sets and construct fuzzy rules.
- For our problem, there are two inputs (NN output 1 and NN output 2) and one output (the risk of a heart attack).
- The inputs are normalised to be within the range of [0, 1], and the output can vary between 0 and 100 percent.


Fuzzy sets of the neural network output normal

[Figure: membership functions of the fuzzy sets Low, Medium and High. Horizontal axis: NN output 1 (normalised), 0 to 1. Vertical axis: degree of membership, 0 to 1.]


Fuzzy sets of the neural network output abnormal

[Figure: membership functions of the fuzzy sets Low, Medium and High. Horizontal axis: NN output 2 (normalised), 0 to 1. Vertical axis: degree of membership, 0 to 1.]


Fuzzy sets of the linguistic variable Risk

[Figure: membership functions of the fuzzy sets Very low, Low, Moderate, High and Very high. Horizontal axis: Risk, %, 0 to 100. Vertical axis: degree of membership, 0 to 1.]


Fuzzy rules for assessing the risk of heart disease

1. If (NN_output1 is Low) and (NN_output2 is Low) then (Risk is Moderate)
2. If (NN_output1 is Low) and (NN_output2 is Medium) then (Risk is High)
3. If (NN_output1 is Low) and (NN_output2 is High) then (Risk is Very_high)
4. If (NN_output1 is Medium) and (NN_output2 is Low) then (Risk is Low)
5. If (NN_output1 is Medium) and (NN_output2 is Medium) then (Risk is Moderate)
6. If (NN_output1 is Medium) and (NN_output2 is High) then (Risk is High)
7. If (NN_output1 is High) and (NN_output2 is Low) then (Risk is Very_low)
8. If (NN_output1 is High) and (NN_output2 is Medium) then (Risk is Low)
9. If (NN_output1 is High) and (NN_output2 is High) then (Risk is Moderate)
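The nine rules form a complete 3 x 3 table, which can be written as a lookup keyed by the linguistic values of the two inputs. The sketch below covers only this symbolic layer; fuzzification and defuzzification are omitted because they depend on membership-function shapes that are not given numerically here. The names `RULES` and `risk_label` are hypothetical.

```python
# (NN_output1, NN_output2) -> Risk, transcribed from rules 1 to 9.
RULES = {
    ("Low", "Low"): "Moderate",
    ("Low", "Medium"): "High",
    ("Low", "High"): "Very_high",
    ("Medium", "Low"): "Low",
    ("Medium", "Medium"): "Moderate",
    ("Medium", "High"): "High",
    ("High", "Low"): "Very_low",
    ("High", "Medium"): "Low",
    ("High", "High"): "Moderate",
}


def risk_label(nn_output1, nn_output2):
    """Look up the risk class for a pair of linguistic input values."""
    return RULES[(nn_output1, nn_output2)]


print(risk_label("High", "Low"))   # strong "normal" evidence, weak "abnormal"
print(risk_label("Low", "High"))   # weak "normal" evidence, strong "abnormal"
```

Written this way, the structure of the rule base is easy to see: whenever the two outputs agree (both Low, both Medium or both High) the evidence is contradictory or inconclusive, and the risk is Moderate.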


Structure of the neuro-fuzzy system

[Figure: a neural network with 44 input neurons, 7 hidden neurons and 2 output neurons; its two outputs, NN output 1 and NN output 2, feed the fuzzy rule base, which produces Risk.]


Three-dimensional plot for the fuzzy rule base

[Figure: surface plot of Risk, %, against NN output 1 and NN output 2, each ranging from 0 to 1.]


- The system’s output is a crisp number that represents a patient’s risk of a heart attack.
- Based on this number, a cardiologist can now classify cardiac cases with greater certainty: when the risk is quantified, a decision-maker has a much better chance of making the right decision. For instance, if the risk is low, say, smaller than 30 percent, the cardiac case can be classified as normal, but if the risk is high, say, greater than 50 percent, the case is classified as abnormal.

However, cardiac cases with the risk between 30 and 50 percent cannot be classified as either normal or abnormal; rather, such cases are uncertain.


Can we classify some of the uncertain cases using the knowledge of a cardiologist?

A cardiologist knows that, in normal heart muscle, perfusion at maximum stress is usually higher than perfusion at rest:

- If perfusion inside region i at stress is higher than perfusion inside the same region at rest, then the risk of a heart attack should be decreased.
- If perfusion inside region i at stress is not higher than perfusion inside the same region at rest, then the risk of a heart attack should be increased.


These heuristics can be implemented in the diagnostic system as follows:

Step 1: Present the neuro-fuzzy system with a cardiac case.

Step 2: If the system’s output is less than 30, classify the presented case as normal and then stop. If the output is greater than 50, classify the case as abnormal and stop. Otherwise, go to Step 3.

Step 3: For region 1, subtract perfusion at rest from perfusion at stress. If the result is positive, decrease the current risk by multiplying its value by 0.99. Otherwise, increase the risk by multiplying its value by 1.01. Repeat this procedure for all 22 regions and then go to Step 4.

Step 4: If the new risk value is less than 30, classify the case as normal; if the risk is greater than 50, classify the case as abnormal; otherwise, classify the case as uncertain.
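Steps 2 to 4 translate almost line for line into code. The sketch below assumes that the risk produced by the neuro-fuzzy system in Step 1 is already available as a number between 0 and 100, and that the stress and rest perfusion values are given as two lists of 22 numbers; the function names are hypothetical.

```python
def classify(risk):
    """Thresholds used in Steps 2 and 4."""
    if risk < 30:
        return "normal"
    if risk > 50:
        return "abnormal"
    return "uncertain"


def diagnose(risk, stress, rest):
    """Return (label, final risk) for one cardiac case."""
    # Step 2: clear-cut cases are classified immediately.
    label = classify(risk)
    if label != "uncertain":
        return label, risk

    # Step 3: adjust the risk region by region.
    for s, r in zip(stress, rest):
        if s - r > 0:
            risk *= 0.99   # stress perfusion higher than rest: lower the risk
        else:
            risk *= 1.01   # otherwise: raise the risk

    # Step 4: classify again with the adjusted risk.
    return classify(risk), risk


# Hypothetical uncertain case in which every region is better perfused at stress.
label, new_risk = diagnose(31.0, [60] * 22, [50] * 22)
print(label, round(new_risk, 2))
```

Since each region changes the risk by only one percent, all 22 regions together can move it by roughly 20 to 25 percent of its value, so only cases that start close to one of the thresholds can leave the uncertain band.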


- The accuracy of diagnosis has dramatically improved: the overall diagnostic error does not exceed 5 percent, while only 3 percent of abnormal cardiac cases are misclassified as normal.
- Although we have not improved the system’s performance on normal cases (over 30 percent of normal cases are still misclassified as abnormal), and up to 20 percent of the total number of cases are classified as uncertain, the neuro-fuzzy system can actually achieve even better results in classifying SPECT images than a cardiologist can.


- In this example, the neuro-fuzzy system has a heterogeneous structure: the neural network and fuzzy system work as independent components (although they cooperate in solving the problem). When a new case is presented to the diagnostic system, the trained neural network determines inputs to the fuzzy system. Then the fuzzy system, using predefined fuzzy sets and fuzzy rules, maps the given inputs to an output, and thereby obtains the risk of a heart attack.
- The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a typical example of a neuro-fuzzy system with a homogeneous structure.



Time-series prediction

As an example, we will develop a tool to predict an aircraft’s trajectory during its landing aboard an aircraft carrier.

Suppose we have a database of landing trajectories of various aircraft flown by different pilots, and we can also use radar numerical data, which provide real-time trajectories of landing aircraft. Our goal is to predict an aircraft’s trajectory at least two seconds in advance, based on the aircraft’s current position.


Prediction of the aircraft’s position

- The landing of an aircraft, particularly aboard aircraft carriers, is an extremely complex process.
- It is affected by such variables as the flight deck’s space constraints and its motions (both pitch and roll), the aircraft’s ordnance and fuel load, continuous mechanical preparations and, most critical of all, time constraints.
- The ship may be heaving 10 feet up and 10 feet down, making a 20-foot displacement from a level deck. In addition, it is difficult to see approaching aircraft at night or during stormy conditions.


- Responsibility for the aircraft’s final approach and landing lies with the Landing Signal Officer (LSO).
- When an aircraft is within one nautical mile of the landing deck, which roughly corresponds to 60 seconds in real time, the aircraft’s flight is carefully observed and guided. During this critical time, the LSO needs to predict the aircraft’s position at least two seconds ahead.
- Such problems are known in mathematics as time-series prediction problems.


What is a time series?

- A time series can be defined as a set of observations, each one being recorded at a specific time. For instance, a time series can be obtained by recording the aircraft’s positions over a time interval of, say, 60 seconds before landing.
- Real-world time-series problems are non-linear and often exhibit chaotic behaviour, which makes them hard to model.


- Prediction of the aircraft’s landing trajectory is mainly based on the experience of an LSO (all LSOs are trained pilots).
- An automatic prediction system can use aircraft-position data given by the ship’s radar, and also data records of previous landings executed by pilots flying different types of aircraft.
- The system is trained off-line with the past data. Then it is presented on-line with the current motion profile, and required to predict the aircraft’s motion in the next few seconds.


On-line time-series prediction of an aircraft’s trajectory

[Figure: the current motion profile of the aircraft (input) is fed to the time-series prediction system, which produces the predicted position of the aircraft (output). Both plots show line-up, in feet (-10 to 20), against time, in seconds (0 to 8).]

To predict an aircraft’s position on-line we will use an ANFIS.


What do we use as ANFIS inputs?

To predict a future value for a time series, we use values that are already known. For example, if we want to predict an aircraft’s position two seconds ahead, we may use its current position data as well as data recorded, say, 2, 4 and 6 seconds before the current position. These four known values represent an input pattern – a four-dimensional vector of the following form:

$$\mathbf{x} = [\, x(t-6) \quad x(t-4) \quad x(t-2) \quad x(t) \,]$$
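As a minimal sketch, the input pattern could be picked out of a uniformly sampled series as follows. The function name `input_pattern` and the sampling period `dt` are illustrative assumptions, not part of the case study.

```python
def input_pattern(series, i, dt, lags_s=(6.0, 4.0, 2.0, 0.0)):
    """Return [x(t-6), x(t-4), x(t-2), x(t)] for the sample at index i.

    series : positions sampled every dt seconds (uniform sampling assumed)
    i      : index of the current time t
    lags_s : how far back, in seconds, each input looks
    """
    steps = [round(lag / dt) for lag in lags_s]  # convert seconds to sample counts
    if i - max(steps) < 0:
        raise ValueError("not enough history before index i")
    return [series[i - s] for s in steps]


# Toy series sampled once per second, x(t) = t, so the lags are easy to read off
toy = [float(t) for t in range(10)]
pattern = input_pattern(toy, i=8, dt=1.0)
print(pattern)  # [2.0, 4.0, 6.0, 8.0]
```

The inputs are listed oldest first, matching the order of the vector above.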


What is the ANFIS output?

The ANFIS output corresponds to the trajectory prediction: the aircraft’s position two seconds ahead, x(t + 2).

For this case study, we will use 10 landing trajectories – five for training and five for testing. Each trajectory is a time series of the aircraft’s position data points recorded every half a second over a time interval of 60 seconds before landing. Thus, a data set for each trajectory contains 121 values.
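The count of 121 follows if both end points of the 60-second interval are sampled at the 0.5-second period:

$$N = \frac{60\ \text{s}}{0.5\ \text{s}} + 1 = 121$$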


An aircraft trajectory and a data set built to train the ANFIS

[Figure: aircraft trajectory, line-up (feet) against time (s).]

Inputs                            Desired
x1        x2       x3       x4    output
−17.4     2.1      11.0     3.9   4.2
−12.9     7.5      10.1     2.1   4.9
−10.0     9.8      8.2      2.0   5.3

For a landing trajectory recorded over a time interval of 60 seconds, we obtain 105 training samples represented by a 105 × 5 matrix. Thus, the entire data set built from the five training trajectories, which we use for training the ANFIS, is represented by a 525 × 5 matrix.
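The matrix sizes can be reproduced with a short sketch. It assumes the 0.5-second sampling and the 2, 4 and 6 second lags described earlier. The function name `build_samples` and the synthetic trajectories are placeholders, not the real landing data.

```python
def build_samples(series, dt=0.5, input_lags_s=(6.0, 4.0, 2.0, 0.0), horizon_s=2.0):
    """Turn one trajectory into rows [x(t-6), x(t-4), x(t-2), x(t), x(t+2)]."""
    lags = [round(lag / dt) for lag in input_lags_s]  # lags in samples
    horizon = round(horizon_s / dt)                   # prediction horizon in samples
    rows = []
    # t must leave room for the oldest input behind it and the target ahead of it
    for i in range(max(lags), len(series) - horizon):
        inputs = [series[i - lag] for lag in lags]
        rows.append(inputs + [series[i + horizon]])
    return rows


# Five synthetic trajectories of 121 values each (placeholders only)
trajectories = [[0.1 * k * (n + 1) for k in range(121)] for n in range(5)]

per_trajectory = build_samples(trajectories[0])
training_set = [row for traj in trajectories for row in build_samples(traj)]

print(len(per_trajectory), len(per_trajectory[0]))  # 105 5
print(len(training_set), len(training_set[0]))      # 525 5
```

Of the 121 values, the first 12 have no complete 6-second history and the last 4 have no 2-second-ahead target, which leaves 121 − 16 = 105 usable rows per trajectory.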


How many membership functions should we assign to each input variable?

A practical approach is to choose the smallest number of membership functions. Thus, we may begin with two membership functions assigned to each input variable.
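One reason to start small is that the model grows quickly with the number of membership functions. The sketch below counts rules and tunable parameters under assumptions this section does not state: grid partitioning (one rule per combination of membership functions), first-order Sugeno consequents, and bell-shaped membership functions with three parameters each. The helper `anfis_size` is hypothetical.

```python
def anfis_size(n_inputs, n_mfs, mf_params=3):
    """Count rules and parameters of a grid-partitioned, first-order Sugeno ANFIS.

    mf_params=3 assumes generalised bell membership functions (an assumption).
    """
    rules = n_mfs ** n_inputs                # one rule per combination of MFs
    premise = n_inputs * n_mfs * mf_params   # membership-function parameters
    consequent = rules * (n_inputs + 1)      # linear coefficients plus a bias per rule
    return {
        "rules": rules,
        "premise": premise,
        "consequent": consequent,
        "total": premise + consequent,
    }


# Configurations considered in this case study: (inputs, MFs per input)
for n_inputs, n_mfs in [(4, 2), (4, 3), (6, 2)]:
    print(n_inputs, n_mfs, anfis_size(n_inputs, n_mfs))
```

Under these assumptions, four inputs with two membership functions each give 16 rules and 104 parameters, comfortably fewer than the 525 training samples.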


Performance of the ANFIS with four inputs and two membership functions assigned to each input: one epoch

[Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s).]



Performance of the ANFIS with four inputs and two membership functions assigned to each input: 100 epochs

[Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s).]


How can we improve the ANFIS’s performance?

The ANFIS’s performance can be significantly improved by assigning three membership functions to each input variable.



Performance of the ANFIS with four inputs and three membership functions assigned to each input after one epoch of training

[Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s).]


Another way of improving time-series prediction is to increase the number of input variables.

Let us, for example, examine an ANFIS with six inputs that correspond to the aircraft’s flight positions at (t – 5), (t – 4), (t – 3), (t – 2), (t – 1), and t, respectively. The ANFIS output still remains the two-second prediction.

The training data set is now represented by a 535 × 7 matrix.
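The matrix size can be checked against the sampling scheme, assuming the lags are in seconds and that the 0.5-second sampling and the five training trajectories are unchanged. The earliest usable t is 5 s (the oldest input needs five seconds of history) and the latest is 58 s (the target lies two seconds ahead), so:

$$N_{\text{traj}} = \frac{58 - 5}{0.5} + 1 = 107, \qquad 5 \times 107 = 535, \qquad 6\ \text{inputs} + 1\ \text{output} = 7\ \text{columns}$$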


Performance of the ANFIS with six inputs and two membership functions assigned to each input: prediction after one epoch

[Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s).]


Performance of the ANFIS with six inputs and two membership functions assigned to each input: prediction errors

[Figure: prediction error (feet) against time (s).]
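A prediction-error curve of this kind could be computed as sketched below. The sign convention (actual minus predicted) and the RMSE summary are assumptions added for illustration, and the sample values are hypothetical, not read from the plot.

```python
import math


def prediction_errors(actual, predicted):
    """Point-by-point error, here taken as actual minus predicted (assumed convention)."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


def rmse(errors):
    """Root-mean-square error: a single-number summary of the error curve."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical line-up values in feet
actual = [4.0, 5.0, 5.5, 6.0]
predicted = [3.5, 5.0, 6.0, 6.0]

errors = prediction_errors(actual, predicted)
print(errors)        # [0.5, 0.0, -0.5, 0.0]
print(rmse(errors))
```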
