Post on 26-Jan-2021


  • Slide 1 (Negnevitsky, Pearson Education, 2011)

    Lecture 15

    Knowledge engineering: Genetic algorithms and hybrid systems

    - Will genetic algorithms work for my problem?

    - The travelling salesman problem

    - Will a hybrid system work for my problem?

    - Neuro-fuzzy decision-support systems

    - Time-series prediction

    - Summary

  • Slide 2

    Will genetic algorithms work for my problem?

    - Optimisation is the process of finding a better solution to a problem.

    - A genetic algorithm generates a population of competing candidate solutions and then causes them to evolve through a process modelled on natural selection: poor solutions tend to die out, while better solutions survive and reproduce. By repeating this process over and over again, the genetic algorithm breeds progressively better solutions, ideally converging on an optimal one.

  • Slide 3

    Case study 8

    The travelling salesman problem

    Suppose we are going to travel by car in Western and Central Europe. We want to produce an optimal itinerary for visiting all major cities and returning home.

    This problem is well known as the travelling salesman problem (TSP). Given a finite number of cities, N, and the cost of travel (or the distance) between each pair of cities, we need to find the cheapest way (or the shortest route) for visiting each city exactly once and returning to the starting point.

  • Slide 4

    - The TSP arises in numerous transportation and logistics applications, such as:

        - arranging routes for school buses to pick up children in a school district,

        - delivering meals to home-bound people,

        - scheduling stacker cranes in a warehouse,

        - planning truck routes to pick up parcel post, and many others.

    - A classic example of the TSP is the scheduling of a machine to drill holes in a circuit board.

  • Slide 5

    How does a genetic algorithm solve the travelling salesman problem?

    - First, we need to decide how to represent a route of the salesman. The most natural way of representing a route is the path representation. Each city is given an alphabetic or numerical name, the route through the cities is represented as a chromosome, and appropriate genetic operators are used to create new routes.

    - Suppose we have nine cities named from 1 to 9. In a chromosome, the order of the integers represents the order in which the cities will be visited by the salesman.
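As a minimal sketch of the path representation, a chromosome can be held as a Python list of city labels. The helper name `is_valid_route` is hypothetical (it does not come from the lecture); it simply checks that every city appears exactly once.

```python
def is_valid_route(chromosome, n_cities):
    """Return True if the chromosome visits each city 1..n_cities exactly once."""
    return sorted(chromosome) == list(range(1, n_cities + 1))


# Illustrative nine-city chromosome: the list order is the visiting order.
route = [1, 6, 5, 3, 2, 8, 4, 9, 7]
print(is_valid_route(route, 9))
```

A check like this is useful later on, because genetic operators that ignore the permutation structure can easily produce invalid chromosomes.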

  • Slide 6

    An example of the salesman’s route

    [Figure: a map of nine cities, numbered 1 to 9, with the salesman’s route drawn between them.]

    Chromosome: 1 6 5 3 2 8 4 9 7

  • Slide 7

    How does the crossover operator work?

    The crossover operator in its classical form cannot be directly applied to the TSP. A simple exchange of parts between two parents would produce illegal routes containing duplicates and omissions: some cities would be visited twice while some others would not be visited at all.

    Parent 1: 1 6 5 3 2 8 4 9 7
    Parent 2: 3 7 6 1 9 4 8 2 5

    Child 1: 1 6 5 3 9 4 8 2 5
    Child 2: 3 7 6 1 2 8 4 9 7
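The failure of classical crossover can be reproduced in a few lines. This is a sketch only: the function names are hypothetical, and the cut position (after the fourth city) is an assumption chosen for illustration.

```python
def one_point_crossover(parent1, parent2, cut):
    """Classical one-point crossover: swap the tails after position `cut`."""
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def duplicates_and_omissions(route, n_cities):
    """List the cities visited more than once and the cities never visited."""
    duplicates = sorted({c for c in route if route.count(c) > 1})
    omissions = [c for c in range(1, n_cities + 1) if c not in route]
    return duplicates, omissions


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = one_point_crossover(parent1, parent2, cut=4)
print(child1, duplicates_and_omissions(child1, 9))
print(child2, duplicates_and_omissions(child2, 9))
```

With these parents, the first child visits city 5 twice and never visits city 7, and the second child has the opposite defect.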

  • Slide 8

    Crossover operators for the TSP

    Step 1

    Parent 1: 1 6 5 3 2 8 4 9 7
    Parent 2: 3 7 6 1 9 4 8 2 5

    Step 2

    Child 1: * * * 1 9 4 8 * *
    Child 2: * * * 3 2 8 4 * *

    Step 3

    Child 1: 6 5 3 1 9 4 8 2 7
    Child 2: 7 6 1 3 2 8 4 9 5
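One reading of the three steps is: exchange a section between the parents, then fill the remaining positions with each parent's own cities in their original order, skipping cities that already arrived in the swapped section. The sketch below follows that reading; the function name and the explicit `start` and `end` cut points are assumptions made for illustration.

```python
def swap_section_crossover(parent1, parent2, start, end):
    """Exchange parent sections [start:end], then fill the rest in original order."""
    def make_child(keeper, donor):
        section = donor[start:end]
        # Cities of the keeper parent that are not already in the swap section.
        remaining = [city for city in keeper if city not in section]
        return remaining[:start] + section + remaining[start:]

    return make_child(parent1, parent2), make_child(parent2, parent1)


parent1 = [1, 6, 5, 3, 2, 8, 4, 9, 7]
parent2 = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child1, child2 = swap_section_crossover(parent1, parent2, start=3, end=7)
print(child1)
print(child2)
```

Because each child is built from a section plus the cities missing from that section, both children are always valid permutations.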

  • Slide 9

    How does the mutation operator work?

    - There are two types of mutation operators: reciprocal exchange and inversion.

    - The reciprocal exchange operator simply swaps two randomly selected cities in the chromosome.

    - The inversion operator selects two random points along the chromosome string and reverses the order of the cities between these points.

  • Slide 10

    Mutation operators for the TSP

    [Figure: two panels, "Reciprocal exchange" and "Inversion", each showing (a) the original chromosome 1 6 5 3 2 8 4 9 7 and (b) the mutated chromosome.]
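Both mutation operators are short enough to sketch directly. The function names and the optional `rng` argument (used to make runs repeatable) are assumptions, not part of the lecture.

```python
import random


def reciprocal_exchange(route, rng=random):
    """Swap two randomly selected cities; returns a new list."""
    i, j = rng.sample(range(len(route)), 2)
    mutated = list(route)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def inversion(route, rng=random):
    """Reverse the cities between two randomly selected points."""
    i, j = sorted(rng.sample(range(len(route) + 1), 2))
    return route[:i] + route[i:j][::-1] + route[j:]


rng = random.Random(0)  # fixed seed so the example is repeatable
route = [1, 6, 5, 3, 2, 8, 4, 9, 7]
print(reciprocal_exchange(route, rng))
print(inversion(route, rng))
```

Both operators only reorder cities, so a valid route stays valid after mutation.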

  • Slide 11

    How do we define a fitness function in the TSP?

    The fitness of each individual chromosome is determined as the reciprocal of the route length.

    In other words, the shorter the route, the fitter the chromosome.
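The reciprocal-length fitness can be sketched as follows. The city coordinates and the use of Euclidean distance are assumptions for illustration; any cost table between city pairs would work the same way.

```python
import math


def route_length(route, coords):
    """Length of the closed tour, including the leg back to the start."""
    total = 0.0
    for a, b in zip(route, route[1:] + route[:1]):
        total += math.dist(coords[a], coords[b])
    return total


def fitness(route, coords):
    """Fitness as the reciprocal of the route length."""
    return 1.0 / route_length(route, coords)


# Hypothetical coordinates: four cities on the corners of a unit square.
coords = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0), 4: (0.0, 1.0)}
print(route_length([1, 2, 3, 4], coords), fitness([1, 2, 3, 4], coords))
print(route_length([1, 3, 2, 4], coords), fitness([1, 3, 2, 4], coords))
```

The tour around the perimeter is shorter than the tour that crosses both diagonals, so it receives the higher fitness.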

  • Slide 12

    Performance graph and the best salesman’s route created in a population of 20 chromosomes after 100 generations

    [Figure: map of the best route through 20 cities plotted on the unit square (total distance: 5.124), and a performance graph of best and average distance against generations (0 to 100). Parameters: N = 20, pc = 0.7, pm = 0.001.]
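To show how the pieces fit together, here is a compact genetic algorithm for the TSP using the parameter names from the figure (population size N, crossover probability pc, mutation probability pm). It is a sketch under several assumptions that the slides do not state: roulette-wheel selection, elitism (the best route is copied unchanged into the next generation), mutation applied once per chromosome, and randomly generated city coordinates. It will not reproduce the distances quoted in the figure.

```python
import math
import random


def tour_length(route, coords):
    """Length of the closed tour through the cities in `route`."""
    return sum(math.dist(coords[a], coords[b])
               for a, b in zip(route, route[1:] + route[:1]))


def crossover(p1, p2, rng):
    """Take a random section from p2 and fill the rest from p1 in order."""
    start, end = sorted(rng.sample(range(len(p1) + 1), 2))
    section = p2[start:end]
    rest = [c for c in p1 if c not in section]
    return rest[:start] + section + rest[start:]


def mutate(route, rng):
    """Reciprocal exchange, applied in place."""
    i, j = rng.sample(range(len(route)), 2)
    route[i], route[j] = route[j], route[i]


def run_ga(coords, pop_size=20, pc=0.7, pm=0.001, generations=100, seed=0):
    rng = random.Random(seed)
    cities = list(coords)
    population = [rng.sample(cities, len(cities)) for _ in range(pop_size)]
    best = min(population, key=lambda r: tour_length(r, coords))
    for _ in range(generations):
        # Fitness is the reciprocal of the route length.
        weights = [1.0 / tour_length(r, coords) for r in population]
        offspring = [list(best)]  # elitism: an assumption, not from the slides
        while len(offspring) < pop_size:
            p1, p2 = rng.choices(population, weights=weights, k=2)
            child = crossover(p1, p2, rng) if rng.random() < pc else list(p1)
            if rng.random() < pm:
                mutate(child, rng)
            offspring.append(child)
        population = offspring
        best = min(population, key=lambda r: tour_length(r, coords))
    return best, tour_length(best, coords)


# Hypothetical problem instance: 20 random cities on the unit square.
city_rng = random.Random(1)
coords = {i: (city_rng.random(), city_rng.random()) for i in range(1, 21)}
best_route, best_distance = run_ga(coords)
print(best_route, round(best_distance, 3))
```

Raising `pop_size` to 200 or `pm` to 0.01 gives a rough feel for the effect of population size and mutation rate on the result.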

  • Slide 13

    Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.001

    [Figure (a): map of the best route through 20 cities plotted on the unit square (total distance: 4.0938), and a performance graph of best and average distance against generations (0 to 100). Parameters: N = 200, pc = 0.7, pm = 0.001.]

  • Slide 14

    Performance graphs and the best routes created in a population of 200 chromosomes: mutation rate is 0.01

    [Figure (b): map of the best route through 20 cities plotted on the unit square (total distance: 4.0825), and a performance graph of best and average distance against generations (0 to 100). Parameters: N = 200, pc = 0.7, pm = 0.01.]

  • Slide 15

    Will a hybrid system work for my problem?

    Solving complex real-world problems requires an application of complex intelligent systems that combine the advantages of expert systems, fuzzy logic, neural networks and evolutionary computation. Such systems can integrate human-like expertise in a specific domain with abilities to learn and adapt to a rapidly changing environment.

  • Slide 16

    Case study 9

    Neuro-fuzzy decision-support systems

    As an example, we will develop an intelligent system for diagnosing myocardial perfusion from cardiac images. Suppose we have a set of cardiac images as well as the clinical notes and the physician’s interpretation.

  • Slide 17

    What are SPECT images?

    - Diagnosis in modern cardiac medicine is based on the analysis of SPECT (Single Photon Emission Computed Tomography) images.

    - By injecting a patient with a radioactive tracer, two sets of SPECT images are obtained: one is taken 10 to 15 minutes after the injection, when the stress is greatest (stress images), and the other is taken 2 to 5 hours after the injection (rest images). Distribution of the radioactive tracer in the cardiac muscle is proportional to the muscle’s perfusion.

    - A cardiologist detects abnormalities in the heart function by comparing stress and rest images.

  • Slide 18

    - The SPECT images are usually represented by high-resolution two-dimensional black-and-white pictures with up to 256 shades of grey. Brighter patches on the image correspond to well-perfused areas of the myocardium, while darker patches may indicate the presence of ischemia.

    - Unfortunately, a visual inspection of the SPECT images is highly subjective; physicians’ interpretations are therefore often inconsistent and susceptible to errors.

  • Slide 19

    - For this study, we use 267 cardiac diagnostic cases. Each case is accompanied by two SPECT images (the stress image and the rest image), and each image is divided into 22 regions.

    - The region’s brightness, which in turn reflects perfusion inside this region, is expressed by an integer number between 0 and 100.

    - Thus, each cardiac diagnostic case is represented by 44 continuous features and one binary feature that assigns an overall diagnosis: normal or abnormal.

  • Slide 20

    The SPECT data set

    - The entire SPECT data set consists of 55 cases classified as normal (positive examples) and 212 cases classified as abnormal (negative examples).

    - The entire set is divided into training and test sets.

    - The training set has 40 positive and 40 negative examples. The test set is represented by 15 positive and 172 negative examples.

  • Slide 21

    Can we train a back-propagation neural network to classify the SPECT images into normal and abnormal?

    - The number of neurons in the input layer is determined by the total number of regions in the stress and rest images. In this example, each image is divided into 22 regions, so we need 44 input neurons.

    - Since SPECT images are to be classified as either normal or abnormal, we use two output neurons.

    - A good generalisation is obtained with as few as 5 to 7 neurons in the hidden layer.

  • Slide 22

    - However, when we test the network on the test set, we find that the network’s performance is rather poor: about 25 percent of normal cardiac diagnostic cases are misclassified as abnormal and over 35 percent of abnormal cases are misclassified as normal; the overall diagnostic error exceeds 33 percent.

    - This indicates that the training set may lack some important examples (a neural network is only as good as the examples used to train it).
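The overall error figure can be checked against the test-set sizes given earlier (15 normal and 172 abnormal cases). The sketch below treats the quoted rates as exact values of 25 and 35 percent, which is an approximation, since the lecture only says "about" and "over".

```python
normal_cases, abnormal_cases = 15, 172     # test set sizes from the lecture
normal_error, abnormal_error = 0.25, 0.35  # approximate per-class error rates

# Expected number of misclassified cases, weighted by class size.
misclassified = normal_error * normal_cases + abnormal_error * abnormal_cases
overall_error = misclassified / (normal_cases + abnormal_cases)
print(f"{overall_error:.1%}")
```

Because abnormal cases dominate the test set, the overall error sits close to the abnormal-class error, at roughly 34 percent, which agrees with the statement that it exceeds 33 percent.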

  • Slide 23

    Can we improve the accuracy of the diagnosis?

    First, we need to redefine the problem. To train the network, we used the same number of positive and negative examples. Although the ratio between normal and abnormal SPECT images is very different in real clinical trials, the misclassification of an abnormal cardiac case could lead to far more serious consequences than the misclassification of a normal case. Therefore, in order to achieve a small classification error for abnormal SPECT images, we might agree to have a relatively large error for normal images.

  • Slide 24

    - The neural network produces two outputs. The first output corresponds to the possibility that the SPECT image belongs to the class normal, and the second to the possibility that the image belongs to the class abnormal. If, for example, the first (normal) output is 0.92 and the second (abnormal) is 0.16, the SPECT image is classified as normal, and we can conclude that the risk of a heart attack for this case is low.

  • Slide 25

    - On the other hand, if the normal output is low, say 0.17, and the abnormal output is much higher, say 0.51, the SPECT image is classified as abnormal, and we can infer that the risk of a heart attack in this case is rather high. However, if the two outputs are close (say the normal output is 0.51 and the abnormal 0.49), we cannot confidently classify the image.

  • Slide 26

    Can we use fuzzy logic for decision-making in medical diagnosis?

    - To build a fuzzy system, we first need to determine input and output variables, define fuzzy sets and construct fuzzy rules.

    - For our problem, there are two inputs (NN output 1 and NN output 2) and one output (the risk of a heart attack).

    - The inputs are normalised to be within the range of [0, 1], and the output can vary between 0 and 100 percent.

  • Slide 27

    Fuzzy sets of the neural network output normal

    [Figure: membership functions of the fuzzy sets Low, Medium and High. Horizontal axis: NN output 1 (normalised), 0 to 1. Vertical axis: degree of membership, 0.0 to 1.0.]

  • Slide 28

    Fuzzy sets of the neural network output abnormal

    [Figure: membership functions of the fuzzy sets Low, Medium and High. Horizontal axis: NN output 2 (normalised), 0 to 1. Vertical axis: degree of membership, 0.0 to 1.0.]

  • Slide 29

    Fuzzy sets of the linguistic variable Risk

    [Figure: membership functions of the fuzzy sets Very low, Low, Moderate, High and Very high. Horizontal axis: Risk, %, 0 to 100. Vertical axis: degree of membership, 0.0 to 1.0.]

  • Slide 30

    Fuzzy rules for assessing the risk of heart disease

    1. If (NN_output1 is Low) and (NN_output2 is Low) then (Risk is Moderate)
    2. If (NN_output1 is Low) and (NN_output2 is Medium) then (Risk is High)
    3. If (NN_output1 is Low) and (NN_output2 is High) then (Risk is Very_high)
    4. If (NN_output1 is Medium) and (NN_output2 is Low) then (Risk is Low)
    5. If (NN_output1 is Medium) and (NN_output2 is Medium) then (Risk is Moderate)
    6. If (NN_output1 is Medium) and (NN_output2 is High) then (Risk is High)
    7. If (NN_output1 is High) and (NN_output2 is Low) then (Risk is Very_low)
    8. If (NN_output1 is High) and (NN_output2 is Medium) then (Risk is Low)
    9. If (NN_output1 is High) and (NN_output2 is High) then (Risk is Moderate)
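The nine rules form a 3 x 3 lookup table, which makes them easy to encode. The sketch below is heavily simplified and rests on assumptions: the membership functions are plain triangles (the lecture gives them only as plots), each Risk set is replaced by a single representative value, and a weighted average stands in for whatever defuzzification method the lecture used. Only the rule table itself is taken from the slide.

```python
RULES = {  # (NN_output1, NN_output2) -> Risk, copied from the nine rules
    ("Low", "Low"): "Moderate",
    ("Low", "Medium"): "High",
    ("Low", "High"): "Very_high",
    ("Medium", "Low"): "Low",
    ("Medium", "Medium"): "Moderate",
    ("Medium", "High"): "High",
    ("High", "Low"): "Very_low",
    ("High", "Medium"): "Low",
    ("High", "High"): "Moderate",
}

# ASSUMED representative risk value (percent) for each output set.
RISK_CENTRE = {"Very_low": 10, "Low": 30, "Moderate": 50, "High": 70, "Very_high": 90}


def membership(x):
    """ASSUMED triangular sets: degrees of Low, Medium and High for x in [0, 1]."""
    return {
        "Low": max(0.0, 1.0 - 2.0 * x),
        "Medium": max(0.0, 1.0 - 2.0 * abs(x - 0.5)),
        "High": max(0.0, 2.0 * x - 1.0),
    }


def risk(nn_output1, nn_output2):
    """Weighted average of rule consequents, using min for the AND operator."""
    m1, m2 = membership(nn_output1), membership(nn_output2)
    numerator = denominator = 0.0
    for (set1, set2), label in RULES.items():
        strength = min(m1[set1], m2[set2])  # firing strength of this rule
        numerator += strength * RISK_CENTRE[label]
        denominator += strength
    return numerator / denominator


print(risk(0.92, 0.16))  # high "normal" output, low "abnormal" output
print(risk(0.17, 0.51))  # low "normal" output, higher "abnormal" output
```

Even with these crude assumptions, the first case lands in the low-risk range and the second in the high-risk range, which matches the qualitative reading of the two examples in the lecture.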

  • Slide 31

    Structure of the neuro-fuzzy system

    [Figure: a neural network with 44 inputs (numbered 1 to 44), a hidden layer (neurons numbered 1 to 7) and two outputs (NN output 1 and NN output 2), which feed the fuzzy rule base; the fuzzy rule base outputs Risk.]

  • Slide 32

    Three-dimensional plot for the fuzzy rule base

    [Figure: surface plot of Risk, % against NN output 1 and NN output 2, each ranging from 0 to 1.]

  • Slide 33

    - The system’s output is a crisp number that represents a patient’s risk of a heart attack.

    - Based on this number, a cardiologist can now classify cardiac cases with greater certainty: when the risk is quantified, a decision-maker has a much better chance of making the right decision. For instance, if the risk is low, say, smaller than 30 percent, the cardiac case can be classified as normal, but if the risk is high, say, greater than 50 percent, the case is classified as abnormal.

  • Slide 34

    However, cardiac cases with a risk between 30 and 50 percent cannot be classified as either normal or abnormal; rather, such cases are uncertain.

  • Slide 35

    Can we classify some of the uncertain cases using the knowledge of a cardiologist?

    A cardiologist knows that, in normal heart muscle, perfusion at maximum stress is usually higher than perfusion at rest:

    - If perfusion inside region i at stress is higher than perfusion inside the same region at rest, then the risk of a heart attack should be decreased.

    - If perfusion inside region i at stress is not higher than perfusion inside the same region at rest, then the risk of a heart attack should be increased.

  • Slide 36

    These heuristics can be implemented in the diagnostic system as follows:

    Step 1: Present the neuro-fuzzy system with a cardiac case.

    Step 2: If the system’s output is less than 30, classify the presented case as normal and then stop. If the output is greater than 50, classify the case as abnormal and stop. Otherwise, go to Step 3.

  • Slide 37

    Step 3: For region 1, subtract perfusion at rest from perfusion at stress. If the result is positive, decrease the current risk by multiplying its value by 0.99. Otherwise, increase the risk by multiplying its value by 1.01. Repeat this procedure for all 22 regions and then go to Step 4.

    Step 4: If the new risk value is less than 30, classify the case as normal; if the risk is greater than 50, classify the case as abnormal; otherwise, classify the case as uncertain.
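Steps 2 to 4 translate almost directly into code. In this sketch the function names are hypothetical, and the risk value from Step 1 is passed in as a number rather than computed, since the neuro-fuzzy system itself is outside the scope of the example.

```python
def classify(risk):
    """Thresholds from Steps 2 and 4: below 30 normal, above 50 abnormal."""
    if risk < 30:
        return "normal"
    if risk > 50:
        return "abnormal"
    return "uncertain"


def diagnose(risk, stress, rest):
    """Apply Steps 2 to 4 to a neuro-fuzzy risk value (percent).

    `stress` and `rest` hold the perfusion values of the 22 regions.
    """
    label = classify(risk)
    if label != "uncertain":
        return label, risk                    # Step 2: confident, so stop
    for s, r in zip(stress, rest):            # Step 3: adjust once per region
        risk *= 0.99 if s - r > 0 else 1.01
    return classify(risk), risk               # Step 4


# Hypothetical uncertain case: stress perfusion is higher in every region.
print(diagnose(32.0, [60] * 22, [50] * 22))
```

Since each region changes the risk by about one percent, 22 regions moving in the same direction shift it by roughly 20 percent of its value, which is enough to resolve cases near either threshold but not those in the middle of the uncertain band.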

  • Slide 38

    - The accuracy of diagnosis has dramatically improved: the overall diagnostic error does not exceed 5 percent, while only 3 percent of abnormal cardiac cases are misclassified as normal.

    - Although we have not improved the system’s performance on normal cases (over 30 percent of normal cases are still misclassified as abnormal), and up to 20 percent of the total number of cases are classified as uncertain, the neuro-fuzzy system can actually achieve even better results in classifying SPECT images than a cardiologist can.

  • Slide 39

    - In this example, the neuro-fuzzy system has a heterogeneous structure: the neural network and fuzzy system work as independent components (although they cooperate in solving the problem). When a new case is presented to the diagnostic system, the trained neural network determines inputs to the fuzzy system. Then the fuzzy system, using predefined fuzzy sets and fuzzy rules, maps the given inputs to an output, and thereby obtains the risk of a heart attack.

    - The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a typical example of a neuro-fuzzy system with a homogeneous structure.

  • Negnevitsky, Pearson Education, 2011 40

    Case study 10

    Time-series prediction

    As an example, we will develop a tool to predict an aircraft’s trajectory during its landing aboard an aircraft carrier.

    Suppose we have a database of landing trajectories of various aircraft flown by different pilots, and we can also use RADAR numerical data, which provide real-time trajectories of landing aircraft. Our goal is to predict an aircraft’s trajectory at least two seconds in advance, based on the aircraft’s current position.

  • Negnevitsky, Pearson Education, 2011 41

    Prediction of the aircraft’s position

    - The landing of an aircraft, particularly aboard aircraft carriers, is an extremely complex process.

    - It is affected by such variables as the flight deck’s space constraints and its motions (both pitch and roll), the aircraft’s ordnance and fuel load, continuous mechanical preparations, and, most critical of all, time constraints.

    - The ship may be heaving 10 feet up and 10 feet down, making a 20-foot displacement from a level deck. In addition, it is difficult to see approaching aircraft at night or during stormy conditions.

  • Negnevitsky, Pearson Education, 2011 42

    - Responsibility for the aircraft’s final approach and landing lies with the Landing Signal Officer (LSO).

    - When an aircraft is within one nautical mile of the landing deck, which roughly corresponds to 60 seconds in real time, the aircraft’s flight is carefully observed and guided. During this critical time, the LSO needs to predict the aircraft’s position at least two seconds ahead.

    - Such problems are known in mathematics as time-series prediction problems.

  • Negnevitsky, Pearson Education, 2011 43

    What is a time series?

    - A time series can be defined as a set of observations, each one being recorded at a specific time. For instance, a time series can be obtained by recording the aircraft’s positions over a time interval of, say, 60 seconds before landing.

    - Real-world time-series problems are non-linear and often exhibit chaotic behaviour, which makes them hard to model.

  • Negnevitsky, Pearson Education, 2011 44

    - Prediction of the aircraft’s landing trajectory is mainly based on the experience of an LSO (all LSOs are trained pilots).

    - An automatic prediction system can use aircraft-position data given by the ship’s RADAR, and also data records of previous landings executed by pilots flying different types of aircraft.

    - The system is trained off-line with the past data. Then it is presented on-line with the current motion profile, and required to predict the aircraft’s motion in the next few seconds.

  • Negnevitsky, Pearson Education, 2011 45

    On-line time-series prediction of an aircraft’s trajectory

    [Figure: block diagram. Input: current motion profile of the aircraft → Time-series prediction system → Output: predicted position of the aircraft. The input and output panels each plot line-up (feet) against time (s).]

    To predict an aircraft’s position on-line we will use an ANFIS.

  • Negnevitsky, Pearson Education, 2011 46

    What do we use as ANFIS inputs?

    To predict a future value for a time series, we use values that are already known. For example, if we want to predict an aircraft’s position two seconds ahead, we may use its current position data as well as data recorded, say, 2, 4 and 6 seconds before the current position. These four known values represent an input pattern, a four-dimensional vector of the following form:

    $$\mathbf{x} = [\, x(t-6) \quad x(t-4) \quad x(t-2) \quad x(t) \,]$$
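A minimal sketch of how such an input pattern could be assembled from a sampled trajectory. The function name `input_pattern`, the half-second sampling period, and the sample values are assumptions made for illustration only.

```python
SAMPLE_PERIOD_S = 0.5    # assumed sampling period of the position data, in seconds
LAGS_S = (6, 4, 2, 0)    # seconds before the current time t, in vector order


def input_pattern(positions, i, sample_period_s=SAMPLE_PERIOD_S, lags_s=LAGS_S):
    """Return [x(t-6), x(t-4), x(t-2), x(t)] for the sample at index i."""
    # Convert each lag from seconds to a whole number of samples.
    steps = [round(lag / sample_period_s) for lag in lags_s]
    if i - max(steps) < 0 or i >= len(positions):
        raise IndexError("not enough history for this index")
    return [positions[i - s] for s in steps]


# Hypothetical line-up positions (feet), one every 0.5 s; the values are made up.
positions = [float(k) for k in range(20)]
print(input_pattern(positions, 12))  # [0.0, 4.0, 8.0, 12.0]
```

With a half-second sampling period, lags of 6, 4 and 2 seconds correspond to 12, 8 and 4 samples, so the first usable index is 12.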

  • Negnevitsky, Pearson Education, 2011 47

    What is the ANFIS output?

    The ANFIS output corresponds to the trajectory prediction: the aircraft’s position two seconds ahead, x(t + 2).

    For this case study, we will use 10 landing trajectories: five for training and five for testing. Each trajectory is a time series of the aircraft’s position data points recorded every half second over a time interval of 60 seconds before landing. Thus, a data set for each trajectory contains 121 values.

  • Negnevitsky, Pearson Education, 2011 48

    An aircraft trajectory and a data set built to train the ANFIS

    [Figure: aircraft trajectory, line-up (feet) against time (s).]

    Inputs                             Desired output
    x1        x2       x3       x4
    −17.4     2.1      11.0     3.9    4.2
    −12.9     7.5      10.1     2.1    4.9
    −10.0     9.8      8.2      2.0    5.3

    For a landing trajectory recorded over a time interval of 60 seconds, we obtain 105 training samples represented by a 105 × 5 matrix. Thus, the entire data set, which we use for training the ANFIS, is represented by a 525 × 5 matrix.
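The matrix sizes follow from the sampling scheme: each trajectory has 121 points, the first 12 lack six seconds of history, and the last 4 have no known value two seconds ahead, leaving 121 − 12 − 4 = 105 samples. The sketch below checks this count; the function name `build_samples` and the trajectory values are hypothetical, and only the shapes are meaningful.

```python
def build_samples(positions, lags=(12, 8, 4, 0), horizon=4):
    """Turn one trajectory into rows [x1, x2, x3, x4, desired output].

    lags and horizon are counted in samples: with a 0.5 s sampling period,
    12, 8, 4 and 0 samples are 6, 4, 2 and 0 s back, and 4 samples is 2 s ahead.
    """
    first = max(lags)                  # earliest index with a full history
    last = len(positions) - horizon    # one past the last index with a known target
    return [
        [positions[i - lag] for lag in lags] + [positions[i + horizon]]
        for i in range(first, last)
    ]


# Five hypothetical training trajectories of 121 points each (made-up values).
trajectories = [[0.1 * k + n for k in range(121)] for n in range(5)]
per_trajectory = build_samples(trajectories[0])
training_set = [row for tr in trajectories for row in build_samples(tr)]

print(len(per_trajectory), len(per_trajectory[0]))  # 105 5
print(len(training_set), len(training_set[0]))      # 525 5
```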

  • Negnevitsky, Pearson Education, 2011 49

    How many membership functions should we assign to each input variable?

    A practical approach is to choose the smallest number of membership functions. Thus, we may begin with two membership functions assigned to each input variable.

  • Negnevitsky, Pearson Education, 2011 50

    Performance of the ANFIS with four inputs and two membership functions assigned to each input: one epoch

    [Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s), 0 to 60 s.]

  • Negnevitsky, Pearson Education, 2011 51

    Performance of the ANFIS with four inputs and two membership functions assigned to each input: 100 epochs

    [Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s), 0 to 60 s.]

  • Negnevitsky, Pearson Education, 2011 52

    How can we improve the ANFIS’s performance?

    The ANFIS’s performance can be significantly improved by assigning three membership functions to each input variable.
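Adding membership functions has a cost in model size. Assuming the rule base is built by grid partitioning, so that each rule combines one membership function from every input (an assumption; the slides do not state how the rules are generated), the number of rules grows as the number of membership functions raised to the number of inputs. A quick check for the configurations discussed in this case study, with `rule_count` as a hypothetical helper:

```python
def rule_count(num_inputs: int, mfs_per_input: int) -> int:
    """Number of rules when each rule picks one membership function per input."""
    return mfs_per_input ** num_inputs


for n, m in [(4, 2), (4, 3), (6, 2)]:
    print(f"{n} inputs x {m} membership functions -> {rule_count(n, m)} rules")
```

Under this assumption, moving from two to three membership functions on four inputs increases the rule base from 16 to 81 rules, while six inputs with two membership functions each give 64 rules.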

  • Negnevitsky, Pearson Education, 2011 53

    Performance of the ANFIS with four inputs and three membership functions assigned to each input after one epoch of training

    [Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s), 0 to 60 s.]

  • Negnevitsky, Pearson Education, 2011 54

    Another way of improving time-series prediction is to increase the number of input variables.

    Let us, for example, examine an ANFIS with six inputs that correspond to the aircraft’s flight positions at (t – 5), (t – 4), (t – 3), (t – 2), (t – 1), and t, respectively. The ANFIS output still remains the two-second prediction.

    The training data set is now represented by a 535 × 7 matrix.
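The 535 × 7 size can be checked with the same counting argument as for the four-input case: six inputs plus one desired output give seven columns, and the longest lag is now five seconds instead of six, so each 121-point trajectory yields two more samples. A short sketch, with `samples_per_trajectory` as a hypothetical helper:

```python
def samples_per_trajectory(num_points, sample_period_s, lags_s, horizon_s):
    """Count samples that have both a full input history and a known target."""
    history = round(max(lags_s) / sample_period_s)   # samples needed before t
    ahead = round(horizon_s / sample_period_s)       # samples needed after t
    return num_points - history - ahead


six_input = samples_per_trajectory(121, 0.5, lags_s=(5, 4, 3, 2, 1, 0), horizon_s=2)
four_input = samples_per_trajectory(121, 0.5, lags_s=(6, 4, 2, 0), horizon_s=2)

print(six_input, 5 * six_input)    # 107 535
print(four_input, 5 * four_input)  # 105 525
```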

  • Negnevitsky, Pearson Education, 2011 55

    Performance of the ANFIS with six inputs and two membership functions assigned to each input: prediction after one epoch

    [Figure: aircraft trajectory and ANFIS prediction, line-up (feet) against time (s), 0 to 60 s.]

  • Negnevitsky, Pearson Education, 2011 56

    Performance of the ANFIS with six inputs and two membership functions assigned to each input: prediction errors

    [Figure: prediction error (feet) against time (s), 0 to 60 s.]