-
Negnevitsky, Pearson Education, 2011
Lecture 15
Knowledge engineering: Genetic algorithms and hybrid systems
• Will genetic algorithms work for my problem?
• The travelling salesman problem
• Will a hybrid system work for my problem?
• Neuro-fuzzy decision-support systems
• Time-series prediction
• Summary
-
Will genetic algorithms work for my problem?
• Optimisation is the process of finding a better solution to a problem.
• A genetic algorithm generates a population of competing candidate solutions and then causes them to evolve through the process of natural selection: poor solutions tend to die out, while better solutions survive and reproduce. By repeating this process over and over again, the genetic algorithm breeds an optimal solution.
-
Case study 8
The travelling salesman problem
Suppose we are going to travel by car in Western and Central Europe. We want to produce an optimal itinerary for visiting all major cities and returning home.
This problem is well known as the travelling salesman problem (TSP). Given a finite number of cities, N, and the cost of travel (or the distance) between each pair of cities, we need to find the cheapest way (or the shortest route) for visiting each city exactly once and returning to the starting point.
-
• The TSP arises in numerous transportation and logistics applications, such as
  • arranging routes for school buses to pick up children in a school district,
  • delivering meals to home-bound people,
  • scheduling stacker cranes in a warehouse,
  • planning truck routes to pick up parcel post, and many others.
• A classic example of the TSP is the scheduling of a machine to drill holes in a circuit board.
-
How does a genetic algorithm solve the travelling salesman problem?
• First, we need to decide how to represent a route of the salesman. The most natural way of representing a route is the path representation. Each city is given an alphabetic or numerical name, the route through the cities is represented as a chromosome, and appropriate genetic operators are used to create new routes.
• Suppose we have nine cities named from 1 to 9. In a chromosome, the order of the integers represents the order in which the cities will be visited by the salesman.
-
An example of the salesman’s route
[Figure: a map of nine cities numbered 1 to 9 with the route drawn through them]
Chromosome: 1 6 5 3 2 8 4 9 7
-
How does the crossover operator work?
The crossover operator in its classical form cannot be directly applied to the TSP. A simple exchange of parts between two parents would produce illegal routes containing duplicates and omissions; some cities would be visited twice while others would not be visited at all:
Parent 1: 1 6 5 3 | 2 8 4 9 7
Parent 2: 3 7 6 1 | 9 4 8 2 5
Child 1:  1 6 5 3 | 9 4 8 2 5
Child 2:  3 7 6 1 | 2 8 4 9 7
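A minimal Python sketch of why the classical operator fails here. The parent routes are the ones read from the example above; the crossover point (after the fourth city) and the helper names are assumptions chosen for illustration.

```python
def one_point_crossover(p1, p2, point):
    """Classical crossover: swap the tails of two parents at a single point."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def is_legal_route(route, n_cities):
    """A legal route visits every city from 1 to n_cities exactly once."""
    return sorted(route) == list(range(1, n_cities + 1))


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = one_point_crossover(parent1, parent2, point=4)

# Child 1 visits city 5 twice and never visits city 7;
# child 2 visits city 7 twice and never visits city 5.
print(child1, is_legal_route(child1, 9))
print(child2, is_legal_route(child2, 9))
```

Both parents pass the legality check while both children fail it, which is why the TSP needs a crossover operator that preserves the permutation.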
-
Crossover operators for the TSP
Step 1
Parent 1: 1 6 5 | 3 2 8 4 | 9 7
Parent 2: 3 7 6 | 1 9 4 8 | 2 5
Step 2
Child 1: * * * | 1 9 4 8 | * *
Child 2: * * * | 3 2 8 4 | * *
Step 3
Child 1: 6 5 3 | 1 9 4 8 | 2 7
Child 2: 7 6 1 | 3 2 8 4 | 9 5
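The three steps in the figure can be sketched in Python as follows. The fill rule used here (keep the remaining cities of the other parent in their original order) is inferred from the figure rather than stated on the slide, and the function name and the two cut positions are assumptions for illustration.

```python
def swap_section_crossover(p1, p2, start, end):
    """Exchange the section [start:end) between two parents, keeping routes legal."""
    def make_child(keeper, donor):
        section = donor[start:end]  # Step 2: the swap section taken from the other parent
        # Step 3: the keeper's cities in their original order, minus the donated ones
        rest = [city for city in keeper if city not in section]
        return rest[:start] + section + rest[start:]

    return make_child(p1, p2), make_child(p2, p1)


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = swap_section_crossover(parent1, parent2, start=3, end=7)
print(child1)  # [6, 5, 3, 1, 9, 4, 8, 2, 7]
print(child2)  # [7, 6, 1, 3, 2, 8, 4, 9, 5]
```

Each child is again a permutation of the nine cities, so no repair step is needed afterwards.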
-
How does the mutation operator work?
• There are two types of mutation operators: reciprocal exchange and inversion.
• The reciprocal exchange operator simply swaps two randomly selected cities in the chromosome.
• The inversion operator selects two random points along the chromosome string and reverses the order of the cities between these points.
-
Mutation operators for the TSP
[Figure: (a) original chromosomes and (b) mutated chromosomes, shown for reciprocal exchange and for inversion]
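Both mutation operators are short enough to sketch directly. This is an illustrative Python version, not the lecture's own code; the function names and the optional `rng` parameter (used only to make runs repeatable) are assumptions.

```python
import random


def reciprocal_exchange(route, rng=random):
    """Swap two randomly selected cities."""
    mutated = list(route)
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def inversion(route, rng=random):
    """Reverse the order of the cities between two random cut points."""
    mutated = list(route)
    # Cut points lie between cities, so there are len + 1 possible positions.
    i, j = sorted(rng.sample(range(len(mutated) + 1), 2))
    mutated[i:j] = reversed(mutated[i:j])
    return mutated


route = [1, 6, 5, 3, 2, 8, 4, 9, 7]
rng = random.Random(42)  # fixed seed so the run is repeatable
print(reciprocal_exchange(route, rng))
print(inversion(route, rng))
```

Unlike classical bit-flip mutation, both operators only rearrange cities, so the mutated chromosome is always a legal route. If the two cut points enclose a single city, inversion leaves the route unchanged.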
-
How do we define a fitness function in the TSP?
The fitness of each individual chromosome is determined as the reciprocal of the route length. In other words, the shorter the route, the fitter the chromosome.
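A small Python sketch of this fitness function, assuming cities are points in the plane and distance is Euclidean. The four coordinates below are made up for illustration and are not the cities used in the lecture.

```python
import math


def route_length(route, coords):
    """Length of the closed tour, including the leg back to the starting city."""
    return sum(
        math.dist(coords[a], coords[b])
        for a, b in zip(route, route[1:] + route[:1])
    )


def fitness(route, coords):
    """Reciprocal of the route length: the shorter the route, the fitter."""
    return 1.0 / route_length(route, coords)


# Hypothetical cities on the corners of a unit square
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0), 4: (0.0, 1.0)}
print(route_length([1, 2, 3, 4], coords))  # 4.0
print(fitness([1, 3, 2, 4], coords) < fitness([1, 2, 3, 4], coords))  # True
```

The route 1-3-2-4 crosses the square along both diagonals, so it is longer and therefore less fit than the route around the perimeter.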
-
Performance graph and the best salesman’s route created in a population of 20 chromosomes after 100 generations
[Figure: the best route through 20 cities plotted on the unit square (total distance: 5.124), and a performance graph of best and average distance against generations (0 to 100) for N = 20, pc = 0.7, pm = 0.001]
-
Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.001
[Figure (a): the best route through 20 cities plotted on the unit square (total distance: 4.0938), and a performance graph of best and average distance against generations (0 to 100) for N = 200, pc = 0.7, pm = 0.001]
-
Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.01
[Figure (b): the best route through 20 cities plotted on the unit square (total distance: 4.0825), and a performance graph of best and average distance against generations (0 to 100) for N = 200, pc = 0.7, pm = 0.01]
-
Will a hybrid system work for my problem?
Solving complex real-world problems requires an application of complex intelligent systems that combine the advantages of expert systems, fuzzy logic, neural networks and evolutionary computation. Such systems can integrate human-like expertise in a specific domain with abilities to learn and adapt to a rapidly changing environment.
-
Case study 9
Neuro-fuzzy decision-support systems
As an example, we will develop an intelligent system for diagnosing myocardial perfusion from cardiac images. Suppose we have a set of cardiac images as well as the clinical notes and the physician’s interpretation.
-
What are SPECT images?
• Diagnosis in modern cardiac medicine is based on the analysis of SPECT (Single Photon Emission Computed Tomography) images.
• By injecting a patient with radioactive tracer, two sets of SPECT images are obtained: one is taken 10 to 15 minutes after the injection when the stress is greatest (stress images), and the other is taken 2 to 5 hours after the injection (rest images). Distribution of the radioactive tracer in the cardiac muscle is proportional to the muscle’s perfusion.
• A cardiologist detects abnormalities in the heart function by comparing stress and rest images.
-
• The SPECT images are usually represented by high-resolution two-dimensional black-and-white pictures with up to 256 shades of grey. Brighter patches on the image correspond to well-perfused areas of the myocardium, while darker patches may indicate the presence of ischemia.
• Unfortunately, a visual inspection of the SPECT images is highly subjective; physicians’ interpretations are therefore often inconsistent and susceptible to errors.
-
• For this study, we use 267 cardiac diagnostic cases. Each case is accompanied by two SPECT images (the stress image and the rest image), and each image is divided into 22 regions.
• The region’s brightness, which in turn reflects perfusion inside this region, is expressed by an integer number between 0 and 100.
• Thus, each cardiac diagnostic case is represented by 44 continuous features and one binary feature that assigns an overall diagnosis: normal or abnormal.
-
The SPECT data set
• The entire SPECT data set consists of 55 cases classified as normal (positive examples) and 212 cases classified as abnormal (negative examples).
• The entire set is divided into training and test sets.
• The training set has 40 positive and 40 negative examples. The test set is represented by 15 positive and 172 negative examples.
-
Can we train a back-propagation neural network to classify the SPECT images into normal and abnormal?
• The number of neurons in the input layer is determined by the total number of regions in the stress and rest images. In this example, each image is divided into 22 regions, so we need 44 input neurons.
• Since SPECT images are to be classified as either normal or abnormal, we use two output neurons.
• A good generalisation is obtained with as few as 5 to 7 neurons in the hidden layer.
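To make the layer sizes concrete, here is a minimal pure-Python sketch of a forward pass through a 44-6-2 network. It is an illustration only: the sigmoid activation, the random untrained weights, the choice of 6 hidden neurons (within the 5 to 7 range above) and the scaling of brightness values to [0, 1] are all assumptions, and back-propagation training is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    """One weight row per neuron; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def forward(layer, inputs):
    """Outputs of one fully connected layer for a list of input values."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs + [1.0])))
            for row in layer]


rng = random.Random(0)
hidden = make_layer(44, 6, rng)  # 22 stress regions + 22 rest regions -> 6 hidden neurons
output = make_layer(6, 2, rng)   # 6 hidden neurons -> 2 outputs (normal, abnormal)

# Made-up case: 44 brightness values between 0 and 100, scaled to [0, 1]
case = [rng.randint(0, 100) / 100.0 for _ in range(44)]
normal_out, abnormal_out = forward(output, forward(hidden, case))
print(normal_out, abnormal_out)
```

With untrained weights the two outputs carry no diagnostic meaning; the sketch only shows how 44 region values map to the two output neurons.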
-
• However, when we test the network on the test set, we find that the network’s performance is rather poor: about 25 percent of normal cardiac diagnostic cases are misclassified as abnormal and over 35 percent of abnormal cases are misclassified as normal; the overall diagnostic error exceeds 33 percent.
• This indicates that the training set may lack some important examples (a neural network is only as good as the examples used to train it).
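The overall figure follows from the per-class error rates weighted by the test-set sizes given earlier (15 normal and 172 abnormal cases). The short Python check below uses the rounded rates quoted above, so the result is approximate.

```python
normal_test, abnormal_test = 15, 172        # test-set sizes from the SPECT data set
normal_error, abnormal_error = 0.25, 0.35   # approximate per-class error rates

misclassified = normal_error * normal_test + abnormal_error * abnormal_test
overall_error = misclassified / (normal_test + abnormal_test)
print(round(overall_error, 3))  # 0.342
```

Because abnormal cases make up most of the test set, the overall error is dominated by the abnormal-class error rate.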
-
Can we improve the accuracy of the diagnosis?
First, we need to redefine the problem. To train the network, we used the same number of positive and negative examples. In real clinical trials the ratio between normal and abnormal SPECT images is very different; moreover, misclassifying an abnormal cardiac case could lead to far more serious consequences than misclassifying a normal case. Therefore, in order to achieve a small classification error for abnormal SPECT images, we might agree to have a relatively large error for normal images.
-
• The neural network produces two outputs. The first output corresponds to the possibility that the SPECT image belongs to the class normal, and the second to the possibility that the image belongs to the class abnormal. If, for example, the first (normal) output is 0.92 and the second (abnormal) is 0.16, the SPECT image is classified as normal, and we can conclude that the risk of a heart attack for this case is low.
-
• On the other hand, if the normal output is low, say 0.17, and the abnormal output is much higher, say 0.51, the SPECT image is classified as abnormal, and we can infer that the risk of a heart attack in this case is rather high. However, if the two outputs are close, say the normal output is 0.51 and the abnormal 0.49, we cannot confidently classify the image.
-
Can we use fuzzy logic for decision-making in medical diagnosis?
• To build a fuzzy system, we first need to determine input and output variables, define fuzzy sets and construct fuzzy rules.
• For our problem, there are two inputs (NN output 1 and NN output 2) and one output (the risk of a heart attack).
• The inputs are normalised to be within the range of [0, 1], and the output can vary between 0 and 100 percent.
-
Fuzzy sets of the neural network output normal
[Figure: membership functions of the fuzzy sets Low, Medium and High; horizontal axis: NN output 1 (normalised), 0 to 1; vertical axis: degree of membership, 0 to 1]
-
Fuzzy sets of the neural network output abnormal
[Figure: membership functions of the fuzzy sets Low, Medium and High; horizontal axis: NN output 2 (normalised), 0 to 1; vertical axis: degree of membership, 0 to 1]
-
Fuzzy sets of the linguistic variable Risk
[Figure: membership functions of the fuzzy sets Very low, Low, Moderate, High and Very high; horizontal axis: Risk, %, 0 to 100; vertical axis: degree of membership, 0 to 1]
-
Fuzzy rules for assessing the risk of a heart disease
1. If (NN_output1 is Low) and (NN_output2 is Low) then (Risk is Moderate)
2. If (NN_output1 is Low) and (NN_output2 is Medium) then (Risk is High)
3. If (NN_output1 is Low) and (NN_output2 is High) then (Risk is Very_high)
4. If (NN_output1 is Medium) and (NN_output2 is Low) then (Risk is Low)
5. If (NN_output1 is Medium) and (NN_output2 is Medium) then (Risk is Moderate)
6. If (NN_output1 is Medium) and (NN_output2 is High) then (Risk is High)
7. If (NN_output1 is High) and (NN_output2 is Low) then (Risk is Very_low)
8. If (NN_output1 is High) and (NN_output2 is Medium) then (Risk is Low)
9. If (NN_output1 is High) and (NN_output2 is High) then (Risk is Moderate)
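The nine rules can be stored as a lookup table and evaluated with a simple inference scheme. The Python sketch below is illustrative only: the triangular membership functions, the representative risk value for each output set and the weighted-average defuzzification are assumptions standing in for the fuzzy sets and inference method of the lecture, whose exact shapes are not reproduced here.

```python
# Rule table from the list above: (NN_output1, NN_output2) -> Risk
RULES = {
    ("Low", "Low"): "Moderate",
    ("Low", "Medium"): "High",
    ("Low", "High"): "Very_high",
    ("Medium", "Low"): "Low",
    ("Medium", "Medium"): "Moderate",
    ("Medium", "High"): "High",
    ("High", "Low"): "Very_low",
    ("High", "Medium"): "Low",
    ("High", "High"): "Moderate",
}

# ASSUMPTION: triangular input sets peaking at 0, 0.5 and 1 (half-width 0.5)
PEAKS = {"Low": 0.0, "Medium": 0.5, "High": 1.0}
# ASSUMPTION: one representative risk value (percent) per output set
RISK_CENTRES = {"Very_low": 10.0, "Low": 30.0, "Moderate": 50.0,
                "High": 70.0, "Very_high": 90.0}


def membership(level, x):
    """Degree to which x in [0, 1] belongs to the given input set."""
    return max(0.0, 1.0 - abs(x - PEAKS[level]) / 0.5)


def risk(nn_output1, nn_output2):
    """Weighted average of rule consequents; min implements the rule's 'and'."""
    num = den = 0.0
    for (level1, level2), consequent in RULES.items():
        strength = min(membership(level1, nn_output1),
                       membership(level2, nn_output2))
        num += strength * RISK_CENTRES[consequent]
        den += strength
    return num / den


print(risk(0.92, 0.16))  # confident "normal" outputs give a low risk
print(risk(0.17, 0.51))  # "abnormal" outputs give a higher risk
```

The inputs must lie in [0, 1], as the normalisation step guarantees; outside that range no rule fires under these assumed sets and the division would fail.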
-
Structure of the neuro-fuzzy system
[Figure: a neural network with 44 input neurons, 7 hidden neurons and 2 output neurons (NN output 1 and NN output 2) feeding a fuzzy rule base whose output is Risk]
-
Three-dimensional plot for the fuzzy rule base
[Figure: surface plot of Risk, % against NN output 1 and NN output 2, each ranging from 0 to 1]
-
• The system’s output is a crisp number that represents a patient’s risk of a heart attack.
• Based on this number, a cardiologist can now classify cardiac cases with greater certainty: when the risk is quantified, a decision-maker has a much better chance of making the right decision. For instance, if the risk is low, say, smaller than 30 percent, the cardiac case can be classified as normal, but if the risk is high, say, greater than 50 percent, the case is classified as abnormal.
-
However, cardiac cases with the risk between 30 and 50 percent cannot be classified as either normal or abnormal; rather, such cases are uncertain.
-
Can we classify some of the uncertain cases using the knowledge of a cardiologist?
A cardiologist knows that, in normal heart muscle, perfusion at maximum stress is usually higher than perfusion at rest:
• If perfusion inside region i at stress is higher than perfusion inside the same region at rest, then the risk of a heart attack should be decreased.
• If perfusion inside region i at stress is not higher than perfusion inside the same region at rest, then the risk of a heart attack should be increased.
-
These heuristics can be implemented in the diagnostic system as follows:
Step 1: Present the neuro-fuzzy system with a cardiac case.
Step 2: If the system’s output is less than 30, classify the presented case as normal and then stop. If the output is greater than 50, classify the case as abnormal and stop. Otherwise, go to Step 3.
-
Step 3: For region 1, subtract perfusion at rest from perfusion at stress. If the result is positive, decrease the current risk by multiplying its value by 0.99. Otherwise, increase the risk by multiplying its value by 1.01. Repeat this procedure for all 22 regions and then go to Step 4.
Step 4: If the new risk value is less than 30, classify the case as normal; if the risk is greater than 50, classify the case as abnormal; otherwise, classify the case as uncertain.
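Steps 2 to 4 can be sketched in Python as follows. The function names and the list-based inputs are assumptions for illustration; the thresholds (30 and 50) and the factors (0.99 and 1.01) come from the steps above. Step 1, obtaining the risk from the neuro-fuzzy system, is taken as given.

```python
def classify(risk):
    """Thresholds from Steps 2 and 4: below 30 is normal, above 50 is abnormal."""
    if risk < 30:
        return "normal"
    if risk > 50:
        return "abnormal"
    return "uncertain"


def diagnose(risk, stress, rest):
    """Return (label, risk) after applying the cardiologist's heuristics."""
    if classify(risk) != "uncertain":        # Step 2: confident cases stop here
        return classify(risk), risk
    for s, r in zip(stress, rest):           # Step 3: one adjustment per region
        risk *= 0.99 if s - r > 0 else 1.01
    return classify(risk), risk              # Step 4


# Made-up uncertain case in which stress perfusion is higher in all 22 regions
label, new_risk = diagnose(32.0, [60] * 22, [50] * 22)
print(label, round(new_risk, 2))
```

Because each region changes the risk by only 1 percent, a case moves out of the uncertain band only when most regions agree: 22 decreases scale the risk by about 0.80, and 22 increases scale it by about 1.24.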
-
• The accuracy of diagnosis has dramatically improved: the overall diagnostic error does not exceed 5 percent, while only 3 percent of abnormal cardiac cases are misclassified as normal.
• Although we have not improved the system’s performance on normal cases (over 30 percent of normal cases are still misclassified as abnormal), and up to 20 percent of the total number of cases are classified as uncertain, the neuro-fuzzy system can actually achieve even better results in classifying SPECT images than a cardiologist can.
-
• In this example, the neuro-fuzzy system has a heterogeneous structure: the neural network and fuzzy system work as independent components (although they cooperate in solving the problem). When a new case is presented to the diagnostic system, the trained neural network determines inputs to the fuzzy system. Then the fuzzy system, using predefined fuzzy sets and fuzzy rules, maps the given inputs to an output, and thereby obtains the risk of a heart attack.
• The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a typical example of a neuro-fuzzy system with a homogeneous structure.
-
Case study 10
Time-series prediction
As an example, we will develop a tool to predict an aircraft’s trajectory during its landing aboard an aircraft carrier.
Suppose we have a database of landing trajectories of various aircraft flown by different pilots, and we can also use RADAR numerical data, which provide real-time trajectories of landing aircraft. Our goal is to predict an aircraft’s trajectory at least two seconds in advance, based on the aircraft’s current position.
-
Prediction of the aircraft’s position
• The landing of an aircraft, particularly aboard aircraft carriers, is an extremely complex process.
• It is affected by such variables as the flight deck’s space constraints and its motions (both pitch and roll), the aircraft’s ordnance and fuel load, continuous mechanical preparations, and, most critical of all, time constraints.
• The ship may be heaving 10 feet up and 10 feet down, making a 20-foot displacement from a level deck. In addition, it is difficult to see approaching aircraft at night or during stormy conditions.
-
• Responsibility for the aircraft’s final approach and landing lies with the Landing Signal Officer (LSO).
• When an aircraft is within one nautical mile of the landing deck, which roughly corresponds to 60 seconds in real time, the aircraft’s flight is carefully observed and guided. During this critical time, the LSO needs to predict the aircraft’s position at least two seconds ahead.
• Such problems are known in mathematics as time-series prediction problems.
-
What is a time series?
• A time series can be defined as a set of observations, each one being
  recorded at a specific time. For instance, a time series can be
  obtained by recording the aircraft’s positions over a time interval
  of, say, 60 seconds before landing.
• Real-world time-series problems are non-linear and often exhibit
  chaotic behaviour, which makes them hard to model.
-
• Prediction of the aircraft’s landing trajectory is mainly based on the
  experience of an LSO (all LSOs are trained pilots).
• An automatic prediction system can use aircraft-position data given by
  the ship’s RADAR, and also data records of previous landings executed
  by pilots flying different types of aircraft.
• The system is trained off-line with the past data. Then it is
  presented on-line with the current motion profile, and required to
  predict the aircraft’s motion in the next few seconds.
-
On-line time-series prediction of an aircraft’s trajectory
[Figure: block diagram. Input: current motion profile of the aircraft →
Time-series prediction system → Output: predicted position of the
aircraft. The input and output panels both plot line-up (feet) against
time (s), from 0 to 8 s.]
To predict an aircraft’s position on-line we will use an ANFIS.
-
What do we use as ANFIS inputs?
To predict a future value for a time series, we use values that are
already known. For example, if we want to predict an aircraft’s position
two seconds ahead, we may use its current position data as well as data
recorded, say, 2, 4 and 6 seconds before the current position. These
four known values represent an input pattern, a four-dimensional vector
of the following form:
$\mathbf{x} = [\, x(t-6) \quad x(t-4) \quad x(t-2) \quad x(t) \,]$
-
What is the ANFIS output?
The ANFIS output corresponds to the trajectory prediction: the
aircraft’s position two seconds ahead, x(t + 2).
For this case study, we will use 10 landing trajectories: five for
training and five for testing. Each trajectory is a time series of the
aircraft’s position data points recorded every half a second over a time
interval of 60 seconds before landing. Thus, a data set for each
trajectory contains 121 values.
-
An aircraft trajectory and a data set built to train the ANFIS
[Figure: aircraft trajectory; line-up (feet, −20 to 20) against time
(s, 0 to 9).]

              Inputs                 Desired
   x1      x2      x3      x4        output
  −17.4    2.1    11.0     3.9         4.2
  −12.9    7.5    10.1     2.1         4.9
  −10.0    9.8     8.2     2.0         5.3

For a landing trajectory recorded over a time interval of 60 seconds, we
obtain 105 training samples represented by a 105 × 5 matrix. Thus, the
entire data set, which we use for training the ANFIS, is represented by
a 525 × 5 matrix.
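The 105 × 5 and 525 × 5 figures can be checked with a short sliding-window sketch. This is only an illustration: the function name `build_samples`, its parameters and the synthetic sine trajectory are assumptions made for the example, not part of the case study or its data.

```python
import math


def build_samples(trajectory, lags_s=(6.0, 4.0, 2.0, 0.0), horizon_s=2.0, dt=0.5):
    """Build rows [x(t-6), x(t-4), x(t-2), x(t), x(t+2)] from one trajectory.

    trajectory: positions sampled every dt seconds (hypothetical input).
    """
    lag_steps = [round(lag / dt) for lag in lags_s]   # lags in samples
    horizon_steps = round(horizon_s / dt)             # look-ahead in samples
    first = max(lag_steps)                            # earliest t with a full history
    last = len(trajectory) - 1 - horizon_steps        # latest t with a known target
    rows = []
    for i in range(first, last + 1):
        inputs = [trajectory[i - k] for k in lag_steps]
        rows.append(inputs + [trajectory[i + horizon_steps]])
    return rows


# Synthetic stand-in for a recorded trajectory: 121 points, 0.5 s apart.
demo = [math.sin(0.05 * n) for n in range(121)]
samples = build_samples(demo)

# Five training trajectories stacked row-wise (the same demo reused here).
training_set = samples * 5

print(len(samples), len(samples[0]))            # 105 5
print(len(training_set), len(training_set[0]))  # 525 5
```

Each row needs 6 seconds of history (12 samples) and 2 seconds of future (4 samples), so 121 − 12 − 4 = 105 rows remain per trajectory.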
-
How many membership functions should we assign to each input variable?
A practical approach is to choose the smallest number of membership
functions. Thus, we may begin with two membership functions assigned to
each input variable.
-
Performance of the ANFIS with four inputs and two membership functions
assigned to each input: one epoch
[Figure: aircraft trajectory and ANFIS prediction; line-up (feet)
against time (s), 0 to 60 s.]
-
Performance of the ANFIS with four inputs and two membership functions
assigned to each input: 100 epochs
[Figure: aircraft trajectory and ANFIS prediction; line-up (feet)
against time (s), 0 to 60 s.]
-
How can we improve the ANFIS’s performance?
The ANFIS’s performance can be significantly improved by assigning three
membership functions to each input variable.
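Adding membership functions also enlarges the model. The sketch below counts rules and adjustable parameters under assumptions that this section does not state: grid partitioning (one rule per combination of membership functions), first-order Sugeno consequents, and bell-shaped membership functions with three parameters each. The function `anfis_size` is a hypothetical helper written for this illustration.

```python
def anfis_size(n_inputs, n_mfs, params_per_mf=3):
    """Return (rules, premise_params, consequent_params).

    Assumes grid partitioning and first-order Sugeno rules; these are
    illustrative assumptions, not details taken from the case study.
    """
    rules = n_mfs ** n_inputs                     # one rule per MF combination
    premise = n_inputs * n_mfs * params_per_mf    # membership-function parameters
    consequent = rules * (n_inputs + 1)           # linear coefficients plus a constant
    return rules, premise, consequent


for n_inputs, n_mfs in [(4, 2), (4, 3), (6, 2)]:
    rules, premise, consequent = anfis_size(n_inputs, n_mfs)
    print(f"{n_inputs} inputs, {n_mfs} MFs: {rules} rules, "
          f"{premise + consequent} parameters")
# 4 inputs, 2 MFs: 16 rules, 104 parameters
# 4 inputs, 3 MFs: 81 rules, 441 parameters
# 6 inputs, 2 MFs: 64 rules, 484 parameters
```

Under these assumptions, going from two to three membership functions per input raises the rule count from 16 to 81, which is why starting with the smallest number of membership functions is a sensible default.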
-
Performance of the ANFIS with four inputs and three membership functions
assigned to each input after one epoch of training
[Figure: aircraft trajectory and ANFIS prediction; line-up (feet)
against time (s), 0 to 60 s.]
-
Another way of improving time-series prediction is to increase the
number of input variables.
Let us, for example, examine an ANFIS with six inputs that correspond to
the aircraft’s flight positions at (t − 5), (t − 4), (t − 3), (t − 2),
(t − 1) and t, respectively. The ANFIS output still remains the
two-second prediction.
The training data set is now represented by a 535 × 7 matrix.
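The 535 × 7 size follows from the sampling described earlier (121 values per trajectory, one every 0.5 s, five training trajectories), assuming the lags are measured in seconds:

```latex
\begin{align*}
\text{steps back} &= \frac{5\ \text{s}}{0.5\ \text{s}} = 10, &
\text{steps ahead} &= \frac{2\ \text{s}}{0.5\ \text{s}} = 4, \\
\text{samples per trajectory} &= 121 - 10 - 4 = 107, &
\text{rows} &= 5 \times 107 = 535, \\
\text{columns} &= 6\ \text{inputs} + 1\ \text{desired output} = 7.
\end{align*}
```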
-
Performance of the ANFIS with six inputs and two membership functions
assigned to each input: prediction after one epoch
[Figure: aircraft trajectory and ANFIS prediction; line-up (feet)
against time (s), 0 to 60 s.]
-
Performance of the ANFIS with six inputs and two membership functions
assigned to each input: prediction errors
[Figure: prediction error (feet, roughly −0.8 to 0.6) against time (s),
0 to 60 s.]
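A prediction-error curve of this kind can be computed as the pointwise difference between the recorded and predicted positions. The sketch below is a minimal illustration: the sign convention (actual minus predicted), the helper names and the predicted values are all assumptions made up for the example, not results from the case study.

```python
import math


def prediction_errors(actual, predicted):
    """Pointwise error in feet, taken here as actual minus predicted."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


def rmse(errors):
    """Root-mean-square error, a single-number summary of the error curve."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Illustrative values only: hypothetical desired outputs and predictions.
actual = [4.2, 4.9, 5.3]
predicted = [4.0, 5.0, 5.6]

errors = prediction_errors(actual, predicted)
print([round(e, 2) for e in errors])
print(round(rmse(errors), 3))
```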