![Page 1: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/1.jpg)
The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction
Program
Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand
California State University, Los Angeles
August 23, 2007
[Figure: chemical structure of cysteine, HS-CH2-CH(NH2)-COOH]
![Page 2: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/2.jpg)
Overview
• Cysteine Oxidation Prediction Program (COPP)
– Oxidation defined
• Biological Significance of Cysteine Oxidation
– Effects of oxidation on proteins
• Summer 2007 Goals
– Increase database size
– Add new parameters
• Methods and Results
![Page 3: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/3.jpg)
Cysteine Oxidation Prediction Program
• Goal: Create a program that uses physicochemical parameters to predict reactive surface cysteine thiols
• Methods:
– Gather examples of proteins susceptible to cysteine oxidation
– Extract parameters from the Protein Data Bank
– Use the C4.5 classifier to determine rules that predict whether a cysteine can become oxidized
![Page 4: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/4.jpg)
Oxidation of Cysteines
Sanchez (2007)
![Page 5: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/5.jpg)
Cysteine Prediction
Two types of cysteine oxidation:
1. Permanent structural oxidation: cysteines that form permanent disulfide bonds or bind to metals shortly after translation
– Sequence-based prediction programs already exist, with 88% accuracy (Martelli et al. 2002)
2. Reactive surface cysteine thiols: cysteines that become oxidized under certain conditions, most of them reversibly
– No prediction programs exist, which is the gap COPP aims to fill
![Page 6: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/6.jpg)
Biological Significance: Oxidation and Enzyme Function
• The active sites of glutaredoxin and thioredoxin cycle between reduced and oxidized states
http://www.cs.stedwards.edu/chem/Chemistry/CHEM43/CHEM43/Thioredox/RNA2.GIF
![Page 7: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/7.jpg)
Enzyme Inactivation via Oxidation
• H2O2 inactivates PTEN tumor suppressor protein by causing the formation of a disulfide bond
Lee et al. JBC (2002)
![Page 8: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/8.jpg)
Summer 2007 Goals
1. Increase the size of the COPP database
2. Test new parameters to determine if they affect the rules and accuracy of COPP
![Page 9: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/9.jpg)
Increase Database Size
• Previously:
– 85 proteins that undergo non-structural cysteine oxidation
– 135 cysteines that undergo oxidation
– 225 cysteines that remain reduced under oxidizing conditions
• Creating an accurate, general set of rules for cysteine oxidation requires a large, unbiased database
![Page 10: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/10.jpg)
Methods: Increase Database Size
1. Search Entrez for keywords
• e.g. cysteine and oxidation, sulfenic acid
2. Look for proteins in the Protein Data Bank
[Diagram label: Potential Proteins]
![Page 11: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/11.jpg)
Increase Database Size
3. Run BLASTALL and eliminate proteins with:
• Identity > 35%
• E value < 1
• Conserved cysteine
[Diagram labels: Potential Proteins, Cysteines Oxidize]
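The elimination step can be sketched as a simple filter over alignment results. This is only an illustration under stated assumptions: the `BlastHit` record, its field names, and the candidate IDs are hypothetical, and the slide does not say whether the three criteria are combined with AND or OR, so the AND used here is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One alignment of a candidate protein against the existing database.

    Hypothetical record for illustration; not BLAST's own output format.
    """
    candidate_id: str
    identity_pct: float   # percent identity of the alignment
    e_value: float        # BLAST expectation value
    cys_conserved: bool   # does the oxidizable cysteine align to a cysteine?


def is_redundant(hit, max_identity=35.0, max_e_value=1.0):
    """Apply the slide's thresholds; combining them with AND is an assumption."""
    return (
        hit.identity_pct > max_identity
        and hit.e_value < max_e_value
        and hit.cys_conserved
    )


def keep_nonredundant(hits):
    """Return candidate IDs that have no redundant hit, in first-seen order."""
    rejected = {h.candidate_id for h in hits if is_redundant(h)}
    kept = []
    for h in hits:
        if h.candidate_id not in rejected and h.candidate_id not in kept:
            kept.append(h.candidate_id)
    return kept


# Made-up example hits, not data from the study.
hits = [
    BlastHit("candA", 62.0, 1e-40, True),   # close homolog with conserved Cys
    BlastHit("candB", 28.0, 0.5, True),     # identity below the cutoff
    BlastHit("candC", 48.0, 1e-12, False),  # similar, but Cys not conserved
]
print(keep_nonredundant(hits))
```

With these made-up hits, only `candA` meets all three criteria and is dropped.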
![Page 12: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/12.jpg)
Increase Database Size
[Diagram: Original Proteins + New Proteins → C4.5/J48 → Rules to Classify Cysteines]
![Page 13: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/13.jpg)
Results: New Proteins
• Increased database size caused a reduction in the number of rules
• Accuracy decreased

Old Rules (81.8% accuracy):
S1 DISTANCE <= 6: 1 (88.19/17.0)
S1 DISTANCE > 6
|   ASA (Å²) <= 1: 0 (136.51/6.51)
|   ASA (Å²) > 1
|   |   N1 DISTANCE <= 5.2: 1 (32.54/9.0)
|   |   N1 DISTANCE > 5.2
|   |   |   O1 ASA <= 2: 1 (33.0/15.0)
|   |   |   O1 ASA > 2: 0 (71.76/15.76)

New Rules (79.1% accuracy):
S1 DISTANCE <= 6.1: 1 (115.44/28.0)
S1 DISTANCE > 6.1
|   ASA (Å²) <= 1.8: 0 (177.51/12.51)
|   ASA (Å²) > 1.8
|   |   N1 DISTANCE <= 5.4: 1 (46.54/17.0)
|   |   N1 DISTANCE > 5.4: 0 (133.51/39.51)

[Figure labels: 6 Å, +, 5.2 Å, -; Sanchez (2007)]
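Each leaf in the trees on this slide ends with a pair such as (88.19/17.0). In J48 output the first number is the weight of training instances that reach the leaf and the second is the weight misclassified there. A minimal sketch of the arithmetic this allows, with the leaf pairs copied from the slide; the function and variable names are illustrative only. The result is accuracy on the training data, and the slide's reported figures may come from a different evaluation such as cross-validation, so the numbers need not match exactly.

```python
# (instances reaching the leaf, instances misclassified) for each leaf,
# copied from the two trees on this slide.
OLD_RULES_LEAVES = [
    (88.19, 17.0), (136.51, 6.51), (32.54, 9.0), (33.0, 15.0), (71.76, 15.76),
]
NEW_RULES_LEAVES = [
    (115.44, 28.0), (177.51, 12.51), (46.54, 17.0), (133.51, 39.51),
]


def training_accuracy(leaves):
    """Percent of training instances classified correctly, from leaf counts."""
    total = sum(n for n, _ in leaves)
    errors = sum(e for _, e in leaves)
    return 100.0 * (total - errors) / total


for name, leaves in [("old", OLD_RULES_LEAVES), ("new", NEW_RULES_LEAVES)]:
    total = sum(n for n, _ in leaves)
    print(f"{name} rules: {len(leaves)} leaves, {total:.0f} cysteines, "
          f"{training_accuracy(leaves):.1f}% correct on training data")
```

The leaf counts give about 82.5% for the old rules (5 leaves, 362 cysteines) and about 79.5% for the new rules (4 leaves, 473 cysteines), which follows the same direction as the reported 81.8% and 79.1%.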
![Page 14: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/14.jpg)
Methods: Parameters already used by COPP
• S1 DISTANCE: distance to the nearest sulfur atom
• S1 ASA: area exposed to the surface
• N1 DISTANCE: distance to the nearest positively charged (+) nitrogen atom
• N1 DONOR: nitrogen’s parent side chain
• N1 ASA: area exposed to the surface
• O1 DISTANCE: distance to the nearest negatively charged (-) oxygen
• O1 DONOR: oxygen’s parent side chain
• O1 ASA: area exposed to the surface
• ASA: exposed surface of the S in question
• CLASS: 0 if reduced; 1 otherwise
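The distance parameters in this list (S1, N1, O1 DISTANCE) are all nearest-neighbor distances from the cysteine sulfur. A minimal sketch of that calculation, assuming atom positions are already available as (x, y, z) tuples in ångströms; the coordinates below are made up, and how COPP itself extracts atoms from Protein Data Bank files is not shown in the slides.

```python
import math


def nearest_distance(target, atoms):
    """Distance from `target` to the closest atom, or None if `atoms` is empty."""
    return min((math.dist(target, a) for a in atoms), default=None)


# Hypothetical coordinates: the cysteine sulfur and two other sulfur atoms.
cys_sulfur = (0.0, 0.0, 0.0)
other_sulfurs = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]

print(nearest_distance(cys_sulfur, other_sulfurs))  # 5.0
```

The same function would give N1 DISTANCE or O1 DISTANCE when passed the charged nitrogen or oxygen atoms instead of sulfurs.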
![Page 16: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/16.jpg)
New Parameters
• pKa: negative logarithm of the acid dissociation constant (Ka)
– How easily can the S lose a proton?
• Electrostatic Potential: potential energy per unit charge
– How well stabilized is the charged S after the proton is lost?
[Figure: deprotonated cysteine, -S-CH2-CH(NH2)-COOH, with the released H]
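The link between pKa and how easily the sulfur loses its proton can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a thiol present as the deprotonated thiolate at a given pH. This sketch is not part of COPP; the function name and the default pH of 7.4 are assumptions chosen for illustration.

```python
def thiolate_fraction(pka, ph=7.4):
    """Fraction of the thiol that is deprotonated (S-) at the given pH.

    From Henderson-Hasselbalch: [S-]/[SH] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# A lower pKa means a larger share of the reactive thiolate form.
for pka in (7.0, 8.0, 9.0):
    print(f"pKa {pka}: {100 * thiolate_fraction(pka):.1f}% thiolate at pH 7.4")
```

When pH equals pKa the fraction is exactly one half, and each unit of pKa below the pH shifts the ratio of thiolate to thiol tenfold.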
![Page 17: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/17.jpg)
Methods: New Parameters
• PCE: Protein Continuum Electrostatics
– Calculates Electrostatic Potential

Coordinates                   Electrostatic Potential
 7.854     0.668    -0.602    -10.725
 4.683     1.223     3.305    -25.413
 3.330     8.072     3.708    -19.335
 2.256   -11.243     9.879    -21.887
14.014     7.907     3.298    -13.670

Miteva et al. NAR (2005)
http://bioserv.rpbs.jussieu.fr/cgi-bin/PCE-Pot
![Page 18: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/18.jpg)
New Parameters
• PROPKA
– Calculates pKa
Li et al. Proteins (2005)
http://propka.ki.ku.dk/
![Page 19: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/19.jpg)
New Parameters
[Diagram: Original Data + Electrostatic Potential and pKa Data → C4.5/J48 → Rules to Classify Cysteines]
![Page 20: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/20.jpg)
Results: New Parameters
• New parameters altered the final rule
• The accuracy is similar

Old Parameters (79.0698% accuracy):
S1 DISTANCE <= 6.1: 1 (115.44/28.0)
S1 DISTANCE > 6.1
|   ASA (Å²) <= 1.8: 0 (177.51/12.51)
|   ASA (Å²) > 1.8
|   |   N1 DISTANCE <= 5.4: 1 (46.54/17.0)
|   |   N1 DISTANCE > 5.4: 0 (133.51/39.51)

New Parameters (78.6469% accuracy):
S1 DISTANCE <= 6.1: 1 (115.44/28.0)
S1 DISTANCE > 6.1
|   ASA (Å²) <= 1.8: 0 (177.51/12.51)
|   ASA (Å²) > 1.8
|   |   pKa of S0 <= 8.75: 1 (74.29/32.0)
|   |   pKa of S0 > 8.75: 0 (105.76/26.76)
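The two trees on this slide differ only in their last split, which is easiest to see when they are written as functions. This is a sketch transcribed from the thresholds shown here, not COPP's actual code; the function names and the example cysteine are hypothetical. Following the CLASS definition, 1 means predicted oxidized and 0 means predicted reduced.

```python
def classify_old_parameters(s1_distance, asa, n1_distance):
    """Tree built from the original parameters (final split on N1 DISTANCE)."""
    if s1_distance <= 6.1:
        return 1
    if asa <= 1.8:
        return 0
    return 1 if n1_distance <= 5.4 else 0


def classify_new_parameters(s1_distance, asa, pka):
    """Tree built with pKa available (final split on the pKa of the sulfur)."""
    if s1_distance <= 6.1:
        return 1
    if asa <= 1.8:
        return 0
    return 1 if pka <= 8.75 else 0


# Hypothetical cysteine: no sulfur nearby, exposed, no close nitrogen, pKa 8.1.
example = {"s1_distance": 9.0, "asa": 12.0}
print(classify_old_parameters(**example, n1_distance=7.3))  # 0
print(classify_new_parameters(**example, pka=8.1))          # 1
```

The trees disagree only for exposed cysteines with no nearby sulfur, where a distant nitrogen and a low pKa (or the reverse) point in opposite directions.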
![Page 21: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/21.jpg)
Conclusions
• New Proteins:
– A larger database results in a more general, but less accurate, set of rules
• New Parameters:
– A low pKa value correlates with oxidation, but does not improve the accuracy of COPP
• Future Goals:
– Make COPP publicly available
– Modify COPP to predict type of oxidation
![Page 22: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/22.jpg)
With many thanks to. . .
Dr. Jamil Momand, Ricardo Sanchez, and the rest of the Momand lab
SoCalBSI fellow students and mentors
California State University at Los Angeles
Funding from:
LA Orange County Biotechnology Center
![Page 23: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/23.jpg)
![Page 24: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/24.jpg)
[Figure: correctly classified instances (%, axis 66 to 84) and tree size (axis 0 to 45) plotted against M-value (0 to 60)]
![Page 25: The Effect of New Parameters and Increased Database Size on the Cysteine Oxidation Prediction Program Megan Riddle with Ricardo Sanchez and Dr. Jamil Momand](https://reader036.vdocument.in/reader036/viewer/2022062515/56649c785503460f9492e359/html5/thumbnails/25.jpg)
[Figure: ROC plot of true positive rate against false positive rate (both 0 to 1), with one series for each M-value from M2 to M50]