Protein motif extraction with neuro-fuzzy optimization
Author : Bill C. H. Chang and Saman K. Halgamuge
Adviser : K. T. Sun
Presenter : Wei-Liang Liu
BIOINFORMATICS Vol. 18 no. 8 2002 Pages 1084–1090
Introduction (1/2)
We present a new algorithm for extracting the consensus pattern, or motif, from a group of related protein sequences.
This algorithm involves a statistical method to find short patterns with high frequency and then neural network training to optimize the final classification accuracies.
Fuzzy logic is used to increase the flexibility of protein motifs.
Introduction (2/2)
Sequence motif discovery algorithms can be generally categorized into three types:
(1) string alignment algorithms,
(2) exhaustive enumeration algorithms,
(3) heuristic methods.
String alignment algorithms
Find sequence motifs by minimizing a cost function which is related to the edit distances between sequences.
Multiple alignment of sequences is an NP-hard problem, and its computational time increases exponentially with the sequence size.
Exhaustive enumeration algorithms
Exhaustive enumeration algorithms are guaranteed to find the optimal motif, but run in exponential time with respect to the length of the motif.
Heuristic methods
Heuristic methods can have better performance but are usually less flexible.
Neuro-Fuzzy system
A neuro-fuzzy system is a neural network and a fuzzy system mapped to each other, thus providing the advantages of both systems (Halgamuge and Glesner, 1994).
When it is used as a classifier, the outputs are class labels and therefore no conventional defuzzification is applied.
Example of a sequence
One example of sequence data is the human zinc finger sequence ZNF117 [6]:
MKRHEMVAKHLVMFYYFAQHLWPEQNIRDSFQKVTLRRYRKCGYENLQLRKGCKSVVECKQHKGDYSGLNQCLKTTLSKIFQCNKYVEVFHKISNSNRHKMRHTENKHFKCKECRKTFCMLSHLTQHKRIHTRVNFYKCEAYGRAFNWSSTLNKHKRIHTGEKPYKCKECGKAFNQTSHLIRHKRIHTEEKPYKCEECGKAFNQSSTLTTHNIIHTGEIPYKCEKCVRAFNQASKLTEHKLIHTGEKRYECEECGKAFNRSSKLTEHKYIHTGEKLYKCEECDKAFNLSSTLTKHKVIHTGEKLYKCKECGKAFKQFSHLAIHNIIHTGEKLYKCEECGKAFNSSSNLTAHKKNRTGEKPYKCEECGKANLSSTLTPHKTIHI
Algorithm
The aim of this algorithm is to find a consensus pattern, or motif, from sequences belonging to the same family.
This motif can be either a rigid or a flexible pattern.
A rigid pattern may be A–x(5)–B, where there is a fixed number of gaps/wildcards (in this case, five) between two patterns A and B.
In a flexible pattern, the number of gaps is represented by a lower bound and an upper bound, such as x(2,4).
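The gap notation maps naturally onto regular expressions. The following is a minimal Python sketch, not the authors' implementation; the helper name `gap_to_regex` and the test strings are made up for illustration.

```python
import re

def gap_to_regex(lo, hi=None):
    """Translate x(lo) or x(lo,hi) into a regex run of wildcards."""
    if hi is None or hi == lo:
        return ".{%d}" % lo
    return ".{%d,%d}" % (lo, hi)

# Rigid pattern A-x(5)-B: exactly five wildcards between A and B.
rigid = re.compile("A" + gap_to_regex(5) + "B")
# Flexible pattern A-x(2,4)-B: between two and four wildcards.
flexible = re.compile("A" + gap_to_regex(2, 4) + "B")

print(bool(rigid.search("QQAKKKKKBQQ")))   # five residues between A and B
print(bool(flexible.search("QQAKKKBQQ")))  # three residues between A and B
print(bool(flexible.search("QQAKBQQ")))    # only one residue between A and B
```

A rigid gap is simply the special case of a flexible gap whose lower and upper bounds coincide.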
Algorithm has four main steps
The proposed motif extraction algorithm has four main steps: sequence preprocessing, motif generation, motif selection and motif optimization.
Overview of the algorithm
Sequence Preprocessing
The aim of the preprocessing step is to select the ‘more’ important ‘features’ within the sequences of a single family so that the actual motif extraction becomes faster.
Example (1/2)
ABC–x(1,3)–DEF, where x(1,3) represents wild cards of length 1 to 3. Any amino acid symbol can match a wild card.
Sequences ABCHHDEF and ABCAAADEF both satisfy the above consensus pattern.
The consensus pattern ABC–x(1,3)–DEF can also be written as A–x(0)–B–x(0)–C–x(1,3)–D–x(0)–E–x(0)–F.
Example (2/2)
As a general form, a sequence pattern can be represented as a series of events and intervals (Chang and Halgamuge, 2001):
$E_1 - I_{1,2} - E_2 - I_{2,3} - \ldots - I_{(N-1),N} - E_N$
where $E_1$ is the first event and $I_{1,2}$ is the interval (gap) between the first and second events.
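To make the events-and-intervals form concrete, here is a small Python sketch that splits a pattern string into its events and intervals. The function name `parse_pattern` and the choice to store each interval as a (lower, upper) pair are assumptions made for illustration, not part of the original algorithm description.

```python
import re
from typing import List, Tuple

Interval = Tuple[int, int]  # (lower bound, upper bound) on the gap length

def parse_pattern(pattern: str) -> Tuple[List[str], List[Interval]]:
    """Split e.g. 'ABC-x(1,3)-DEF' into events and the intervals between them."""
    tokens = pattern.replace("–", "-").split("-")  # accept en-dashes too
    events: List[str] = []
    intervals: List[Interval] = []
    for tok in tokens:
        m = re.fullmatch(r"x\((\d+)(?:,(\d+))?\)", tok)
        if m:
            lo = int(m.group(1))
            hi = int(m.group(2)) if m.group(2) else lo
            intervals.append((lo, hi))
        else:
            # Expand a run such as 'ABC' into A-x(0)-B-x(0)-C.
            for ch in tok:
                if len(events) > len(intervals):
                    intervals.append((0, 0))
                events.append(ch)
    return events, intervals

events, intervals = parse_pattern("ABC-x(1,3)-DEF")
print(events)     # the six events E_1 .. E_6
print(intervals)  # the five intervals I_{1,2} .. I_{5,6}
```

With this representation, the compact form ABC–x(1,3)–DEF and the fully expanded form A–x(0)–B–x(0)–C–x(1,3)–D–x(0)–E–x(0)–F parse to the same events and intervals.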
Vector generation
Each element of the vector represents a combination of two events, $E_i$ and $E_j$, and their gap $I_{i,j}$ (where $E_i$ occurs before $E_j$), and the value of each element of the vector is either 1 or 0.
A value of 1 translates to ‘in this sequence, there is an occurrence of character $E_i$ with interval $I_{i,j}$ before $E_j$’, and a value of 0 means otherwise (there is no such occurrence).
Example
Let us assume the first element of a vector represents ‘A–x(0)–A’.
The value of this element will be 1 for sequence ‘AABCD’ and 0 for sequence ‘ABACD’, as the short pattern A–x(0)–A occurs in the first sequence but not the second.
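A minimal Python sketch of this vector generation follows. Instead of a dense 0/1 vector it returns the set of (event, gap, event) triples whose element would be 1; the function name `pair_features` and the default `max_gap` are illustrative assumptions.

```python
from typing import Set, Tuple

Feature = Tuple[str, int, str]  # (E_i, gap I_ij, E_j)

def pair_features(seq: str, max_gap: int = 19) -> Set[Feature]:
    """Return every (E_i, gap, E_j) triple that occurs at least once in seq."""
    found: Set[Feature] = set()
    for i, first in enumerate(seq):
        # j - i - 1 residues lie between positions i and j.
        for j in range(i + 1, min(len(seq), i + max_gap + 2)):
            found.add((first, j - i - 1, seq[j]))
    return found

# A triple in the set corresponds to a vector element equal to 1.
print(("A", 0, "A") in pair_features("AABCD"))  # A-x(0)-A in the first sequence
print(("A", 0, "A") in pair_features("ABACD"))  # A-x(0)-A in the second sequence
```

Storing only the triples that occur keeps the representation sparse, which is convenient when most of the possible elements are 0.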
Size of Vector
For protein sequences, the number of possible events is 20 (there are 20 amino acids).
Considering that only nine patterns in PROSITE, out of around 1300 motif patterns, have interval gaps of more than 20 (Hart et al., 2000), a maximum gap of 20 between any two events should be satisfactory.
Therefore the size of the vector is 20 × 20 × 20 = 8000.
The vector can be implemented as 13-bit ($2^{13} = 8192$) binary data.
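The arithmetic can be checked with a short Python sketch. It assumes one possible layout, in which each (first event, gap, second event) triple is mapped to a single index and the 20 gap values are taken to be 0 to 19; the slides do not state the layout, so `feature_index` and its ordering are hypothetical. Under that reading, 13 bits are enough to address any element.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
N_EVENTS = len(AMINO_ACIDS)
N_GAPS = 20  # assumption: gap values 0..19

def feature_index(first: str, gap: int, second: str) -> int:
    """Map (E_i, gap, E_j) to a position in the 8000-element vector."""
    if not 0 <= gap < N_GAPS:
        raise ValueError("gap outside the assumed range 0..19")
    return (AMINO_ACIDS.index(first) * N_GAPS + gap) * N_EVENTS + AMINO_ACIDS.index(second)

vector_size = N_EVENTS * N_GAPS * N_EVENTS
largest = feature_index("Y", N_GAPS - 1, "Y")
print(vector_size)           # 8000
print(largest)               # 7999
print(largest.bit_length())  # 13
```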
Protein sequences
Feature selection
Important features are selected by keeping the elements above a certain threshold value (e.g. 0.90).
The value of each vector element represents the frequency of occurrence of a particular $E_i$–$I_{i,j}$–$E_j$ pattern.
For example, if an element which represents A–x(0)–A has a value of 0.99, then 99% of this group of sequences have ‘AA’ somewhere in their sequences.
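The selection step can be sketched in Python as follows. This is an illustrative reading rather than the authors' code: the function names, the toy family of sequences, and the use of "greater than or equal to" for the threshold are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Feature = Tuple[str, int, str]  # (E_i, gap, E_j)

def feature_frequencies(seqs: List[str], max_gap: int = 19) -> Dict[Feature, float]:
    """Fraction of sequences in which each (E_i, gap, E_j) feature occurs."""
    counts: Counter = Counter()
    for seq in seqs:
        present: Set[Feature] = set()
        for i, first in enumerate(seq):
            for j in range(i + 1, min(len(seq), i + max_gap + 2)):
                present.add((first, j - i - 1, seq[j]))
        counts.update(present)  # a sequence counts at most once per feature
    return {f: c / len(seqs) for f, c in counts.items()}

def select_features(freqs: Dict[Feature, float], threshold: float = 0.90) -> Set[Feature]:
    """Keep the features whose frequency reaches the threshold."""
    return {f for f, v in freqs.items() if v >= threshold}

family = ["CAACQ", "CKKCW", "CRRCA", "GCTTC"]  # made-up toy family
freqs = feature_frequencies(family)
print(select_features(freqs))  # only C-x(2)-C occurs in every sequence
```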
Motif generation (1/3)
For example, if a motif pattern C–x(2)–C–x(3)–F occurs in 90% of the sequences in the family, the short patterns (or important features):
(1) C–x(2)–C,
(2) C–x(3)–F, and
(3) C–x(6)–F
must all exist at a frequency of 90% or greater in the sequences. But the reverse is not always true.
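The three short patterns follow mechanically from the positions of the events in the motif. A small Python sketch (the function name `implied_features` is hypothetical) derives them for any rigid motif:

```python
from typing import List, Set, Tuple

Feature = Tuple[str, int, str]  # (E_i, gap, E_j)

def implied_features(events: List[str], gaps: List[int]) -> Set[Feature]:
    """All pairwise features that a rigid motif forces to be present."""
    # Position of each event when the motif is laid out from position 0.
    positions = [0]
    for g in gaps:
        positions.append(positions[-1] + g + 1)
    return {
        (events[i], positions[j] - positions[i] - 1, events[j])
        for i in range(len(events))
        for j in range(i + 1, len(events))
    }

# C-x(2)-C-x(3)-F
print(sorted(implied_features(["C", "C", "F"], [2, 3])))
# [('C', 2, 'C'), ('C', 3, 'F'), ('C', 6, 'F')]
```

The gap of 6 between the first C and F is the sum of the two inner gaps plus the one position occupied by the second C.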
Motif generation (2/3)
Fig. 2. Connect important features to form a motif candidate.
Motif generation (3/3)
In Figure 2, F–x(2)–S is not connected because for a motif C–x(2)–C–x(3)–F–x(2)–S to occur frequently, the short patterns C–x(9)–S and C–x(6)–S should have occurred frequently as well (which is not the case here).
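The connection rule can be expressed as a simple subset test. The Python sketch below is an illustration under assumptions: `can_extend` is a hypothetical helper, and the set of frequent features is a toy stand-in for the contents of Figure 2, which is not reproduced in this transcript.

```python
from typing import List, Set, Tuple

Feature = Tuple[str, int, str]  # (E_i, gap, E_j)

def can_extend(events: List[str], gaps: List[int], new_gap: int,
               new_event: str, frequent: Set[Feature]) -> bool:
    """True if appending x(new_gap)-new_event keeps every implied pair frequent."""
    positions = [0]
    for g in gaps:
        positions.append(positions[-1] + g + 1)
    new_pos = positions[-1] + new_gap + 1
    # Every existing event must form a frequent pair with the new event.
    needed = {(e, new_pos - p - 1, new_event) for e, p in zip(events, positions)}
    return needed <= frequent

# Assumed toy set of important features.
frequent = {("C", 2, "C"), ("C", 3, "F"), ("C", 6, "F"), ("F", 2, "S")}

print(can_extend(["C", "C"], [2], 3, "F", frequent))          # C-x(2)-C + x(3)-F
print(can_extend(["C", "C", "F"], [2, 3], 2, "S", frequent))  # ... + x(2)-S
```

The second extension fails because C–x(9)–S and C–x(6)–S are missing from the frequent set, even though F–x(2)–S itself is frequent.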
A good motif pattern
A good motif pattern can be simply described as one that:
(1) Correctly identifies protein sequences belonging to the family it represents, i.e. maximizes ‘true positives’.
(2) Does not identify protein sequences belonging to other families, i.e. minimizes ‘false positives’.
Motif optimization (1/2)
Motif optimization (2/2)
The inputs to the network are event intervals.
The simple rule (black node in the ‘rule base’ layer of Figure 3) in the neuro-fuzzy system is: ‘IF I1 is μ1 and I2 is μ1, THEN output is μclass’.
μclass is the output of the neuro-fuzzy network.
Fuzzy inference system
A fuzzy inference system embedded in a neural network has three main steps: fuzzification, fuzzy inference and defuzzification.
Sequence Preprocessing (1/3)
For example, let T = AGCCTGAT. The first and second level distribution matrices are shown in Table 1:
Sequence Preprocessing (2/3)
Sequence Preprocessing (3/3)
Sequence Fuzzification (1/2)
The value of the event interval is also fuzzified. For example, if pattern P = TφφG, the event interval fuzzy membership function can be defined as shown in Figure 4.
P = TφφG = T–x(2)–G
Sequence Fuzzification (2/2)
Sequence Inference
This step aims to find the subsequence in Text T that is most “similar” to Pattern P.
The inference rule used here is: IF event A1 occurs AND event A2 occurs AND event interval between A1 and A2 is I1 AND … event An-1 occurs AND event An occurs AND event interval between An-1 and An is In-1, THEN Pattern P exists in Text T with degree Yi.
Fuzzy Sequence Pattern Matching Algorithm (example)
The general structure of a C2H2 zinc finger protein motif (a motif is the signature of a particular group of sequences) is [2]:
CφφCφφφφφφφφφφφφHφφH
Sequence Preprocessing (example)
CφφCφφφφφφφφφφφφHφφH
Sequence Fuzzification (example)
We use the following fuzzy rules to describe the event intervals:
R1: If event interval is I1 between the first two C, then the membership value is μ1
R2: If event interval is I2 between C and H, then the membership value is μ2
R3: If event interval is I3 between the last two H, then the membership value is μ3
Sequence Inference (example)
The inference rule used here is:
IF event interval between the first two Cs is I1 AND event interval between C and H is I2 AND event interval between the last two Hs is I3, THEN Pattern P exists in Text T with degree Yi,
where Yi = μ1 × μ2 × μ3 and Y = Max(Y1, Y2, Y3, …, Ym).
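The product-then-maximum rule can be sketched in Python. The membership function is not given in this transcript, so the triangular shape, its `width`, and the interval centres used below are assumptions chosen purely for illustration; only the combination Yi = μ1 × μ2 × μ3 and Y = Max(Y1, …, Ym) comes from the slide.

```python
from math import prod
from typing import List, Sequence

def triangular(x: float, centre: float, width: float) -> float:
    """Assumed membership function: 1 at the centre, falling linearly to 0."""
    return max(0.0, 1.0 - abs(x - centre) / width)

def match_degree(candidates: List[Sequence[int]], centres: Sequence[float],
                 width: float = 3.0) -> float:
    """Y = max over candidates of Y_i, where Y_i is the product of memberships."""
    degrees = [
        prod(triangular(i, c, width) for i, c in zip(intervals, centres))
        for intervals in candidates
    ]
    return max(degrees, default=0.0)

centres = (2, 12, 2)  # hypothetical preferred values for I1, I2, I3
# Each candidate is the (I1, I2, I3) of one possible C..C..H..H occurrence.
print(match_degree([(3, 12, 2), (5, 12, 2)], centres))
```

Because the memberships are multiplied, a single interval with zero membership rules out that candidate entirely, while the maximum keeps only the best candidate in the text.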
Classify
Sum of square error
For example, sequence Z is ACCABBDACA, and the preliminary motif is A–x(2)–A–x(2)–A.
The possible matches are
(a) ACCABBDA (A–x(2)–A–x(3)–A) and
(b) ABBDACA (A–x(3)–A–x(1)–A).
The sum of square error is:
for (a): $(2 - 2)^2 + (3 - 2)^2 = 1$
for (b): $(3 - 2)^2 + (1 - 2)^2 = 2$
So (a) is the ‘most similar match’ and its event interval values (2, 3) are used as training input data.
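The search for the most similar match can be sketched in Python as a brute-force enumeration. This is an illustrative sketch, not the authors' procedure: `best_match` is a hypothetical name, and unlike the slide, which lists only matches between neighbouring occurrences of A, it considers every ordered choice of positions.

```python
from itertools import combinations
from typing import List, Optional, Tuple

def best_match(seq: str, events: str,
               target_gaps: List[int]) -> Optional[Tuple[int, Tuple[int, ...]]]:
    """Return (sum of squared errors, intervals) of the most similar match."""
    best = None
    for pos in combinations(range(len(seq)), len(events)):
        # The chosen positions must carry the motif's events, in order.
        if any(seq[p] != e for p, e in zip(pos, events)):
            continue
        gaps = tuple(b - a - 1 for a, b in zip(pos, pos[1:]))
        sse = sum((g - t) ** 2 for g, t in zip(gaps, target_gaps))
        if best is None or sse < best[0]:
            best = (sse, gaps)
    return best

# Preliminary motif A-x(2)-A-x(2)-A against sequence Z.
print(best_match("ACCABBDACA", "AAA", [2, 2]))  # (1, (2, 3))
```

Enumerating all position combinations is only practical for short sequences and motifs; it is used here to keep the example easy to verify by hand.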
Result of C2H2 zinc finger protein (1/3)
Result of C2H2 zinc finger protein (2/3)
Result of C2H2 zinc finger protein (3/3)
Result of EGF Protein (1/3)
Result of EGF Protein (2/3)
Result of EGF Protein (3/3)
Discussion
The optimization of motif patterns in both the EGF and zinc finger protein families increases the rate of true positives.
However, with an increase in the true-positive rate, the rate of false positives also increases.
An interesting observation is that in comparison to the motifs suggested in PROSITE, the motifs identified by our method are more flexible and broad.
Conclusion and future work
For future research, optimization of the neuro-fuzzy system will be further investigated in order to implement fuzzy membership functions for events.