andras[1]

Upload: zchepe

Post on 13-Apr-2018

TRANSCRIPT

  • 7/27/2019 Andras[1]

    1/21

    Protein structure modeling

    Department of Biochemistry and

    Seaver Foundation Center for Bioinformatics

    Albert Einstein College of Medicine, New York, USA

    András Fiser

  • 7/27/2019 Andras[1]

    2/21

    Why is it useful to know the structure of a protein, not only its sequence?

    The 3D structure is more informative than sequence because patterns in space are frequently more recognizable than patterns in sequence.

    Evolution tends to conserve function, and function depends more directly on structure than on sequence, so structure is more conserved in evolution than sequence.

  • 7/27/2019 Andras[1]

    3/21

    Why Protein Structure Prediction?

    Year 2005

    29,000 structures

    2,300,000 sequences

    We know the experimental 3D structure for ~1% of the protein sequences
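
    The ~1% figure follows directly from the two counts on this slide. A minimal sketch of the arithmetic, using only those counts (the function name is a hypothetical helper, not from the talk):

```python
# Counts quoted on the slide for the year 2005.
known_structures = 29_000
known_sequences = 2_300_000


def structural_coverage(structures: int, sequences: int) -> float:
    """Return the percentage of sequences that have an experimental structure."""
    if sequences <= 0:
        raise ValueError("sequences must be positive")
    return 100.0 * structures / sequences


coverage = structural_coverage(known_structures, known_sequences)
print(f"{coverage:.2f}% of sequences have an experimental 3D structure")
```

    The result is a little over 1%, which the slide rounds to ~1%.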

  • 7/27/2019 Andras[1]

    4/21

    Principles of Protein Structure

    GFCHIKAYTRLIMVG

    Anabaena 7120, Anacystis nidulans, Chondrus crispus, Desulfovibrio vulgaris

    Ab initio prediction (folding)

    Fold Recognition, Comparative Modeling (evolution)

  • 7/27/2019 Andras[1]

    5/21

    Protein structure modeling

    Ab initio prediction Comparative Modeling

    Applicable to any sequence

    Not very accurate (>4 Å RMSD),

    Attempted for proteins of

  • 7/27/2019 Andras[1]

    6/21

    What makes comparative modeling possible

    A small difference in the sequence makes a small difference in the structure

    Protein structures are clustered into fold families

  • 7/27/2019 Andras[1]

    7/21

    Structural Genomics

    The number of families is much smaller than the number of proteins

    Characterize most protein sequences (red) based on related known structures (green).

  • 7/27/2019 Andras[1]

    8/21

    Structural Genomics

    Definition: The aim of structural genomics is to put every protein sequence within modeling distance of a known protein structure.

    Size of the problem:

    There are a few thousand domain fold families.

    There are ~20,000 sequence families (30% sequence identity).

    Solution:

    Determine protein structures for as many different families as possible.

    Model the rest of the family members using comparative modeling.
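
    The strategy of solving one structure per family and modeling the rest can be sketched as a greedy selection. This is only an illustration under stated assumptions: the pairwise identities are made-up numbers, the sequence names are placeholders, and `pick_targets` is a hypothetical helper, not a structural genomics pipeline. The 30% threshold is the one quoted on the slide.

```python
def pick_targets(names, identity, threshold=30.0):
    """Greedily choose representatives so that every sequence is within
    `threshold` percent identity of at least one chosen representative."""
    representatives = []
    for name in names:
        covered = any(
            identity[frozenset((name, rep))] >= threshold
            for rep in representatives
        )
        if not covered:
            # No solved structure is within modeling distance: solve this one.
            representatives.append(name)
    return representatives


# Hypothetical pairwise sequence identities (percent) for four made-up sequences.
identity = {
    frozenset(("seq1", "seq2")): 45.0,
    frozenset(("seq1", "seq3")): 12.0,
    frozenset(("seq1", "seq4")): 10.0,
    frozenset(("seq2", "seq3")): 15.0,
    frozenset(("seq2", "seq4")): 11.0,
    frozenset(("seq3", "seq4")): 38.0,
}

targets = pick_targets(["seq1", "seq2", "seq3", "seq4"], identity)
print(targets)
```

    In this toy input two structures would need to be determined experimentally; the other two sequences fall within the threshold of a chosen representative and could be modeled.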

  • 7/27/2019 Andras[1]

    9/21

    Comparative Protein Structure Modeling

    [Figure: comparative modeling within the flavodoxin family. Axis ticks: 20, 50, 100. Cα RMSD (% EQV) labels: 2 (50), 1 (80), 0 (100).]

    KIGIFFSTSTGNTTEVA

    Flavodoxin family: Anabaena 7120, Anacystis nidulans, Chondrus crispus, Desulfovibrio vulgaris, Clostridium mp.
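
    The two quantities on this slide, Cα RMSD and % EQV, can be computed as sketched below. Two assumptions are made that the slide does not state: the structures are taken to be already superposed (no fitting is done), and % EQV is taken to mean the percentage of Cα pairs within 3.5 Å of each other. The coordinates are toy values.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the two structures are already superposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def percent_equivalent(coords_a, coords_b, cutoff=3.5):
    """Percentage of C-alpha pairs no further apart than `cutoff` angstroms.

    The 3.5 cutoff is an assumption made for this sketch.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    close = sum(1 for a, b in zip(coords_a, coords_b) if math.dist(a, b) <= cutoff)
    return 100.0 * close / len(coords_a)


# Toy coordinates (angstroms): three positions agree well, one deviates by 5.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
xray = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0), (11.4, 5.0, 0.0)]

print(ca_rmsd(model, xray), percent_equivalent(model, xray))
```

    A single badly placed residue raises the RMSD noticeably while the % EQV drops by only one position, which is why the slide reports both numbers.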

  • 7/27/2019 Andras[1]

    10/21

    Steps in Comparative Protein Structure Modeling

    TARGET / TEMPLATE alignment:

    MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE
    ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE

    ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPERASFQWMNDK

    START -> Template Search -> Target-Template Alignment -> Model Building -> Model Evaluation -> OK?

    OK? Yes: END. No: repeat the cycle.
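
    The two aligned rows on this slide can be used to illustrate how sequence identity is measured over an alignment. A minimal sketch, assuming identity is counted over columns where neither row has a gap (other conventions exist, and the slide does not say which one it uses):

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# The two aligned rows shown on the slide.
row_1 = "MSVIPKRLYGNCEQTSEEAIRIEDSPIV---TADLVCLKIDEIPERLVGE"
row_2 = "ASILPKRLFGNCEQTSDEGLKIERTPLVPHISAQNVCLKIDDVPERLIPE"

print(f"{percent_identity(row_1, row_2):.1f}% identity over aligned columns")
```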

  • 7/27/2019 Andras[1]

    11/21

    Steps in Comparative Protein Structure Modeling

    Template Search:

    Pattern recognition, heuristic searches (e.g. BLAST, FastA)

    Profile and iterative alignment methods (e.g. HMMs, PSI-BLAST)

    Structure-based threading (e.g. THREADER, FUGUE, 3DPSSM)
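
    Heuristic searches of the kind named on this slide avoid full alignment by first looking for short shared words. The sketch below shows only that word-matching idea; it is not BLAST or FastA, has no scoring matrix or statistics, and the database names are hypothetical placeholders. The sequences are fragments that appear elsewhere in these slides.

```python
from collections import Counter


def kmers(seq: str, k: int = 3):
    """All overlapping words of length k in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def rank_templates(target: str, database: dict, k: int = 3):
    """Rank database entries by the number of k-mers shared with the target."""
    target_words = Counter(kmers(target, k))
    scores = {}
    for name, seq in database.items():
        shared = target_words & Counter(kmers(seq, k))  # multiset intersection
        scores[name] = sum(shared.values())
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database; the entry names are placeholders invented for this example.
database = {
    "template_a": "ASILPKRLFGNCEQTSDEGLK",
    "template_b": "KIGIFFSTSTGNTTEVA",
}
target = "MSVIPKRLYGNCEQTSEEAIR"

ranking = rank_templates(target, database)
print(ranking)
```

    The related fragment ranks first because it shares several three-letter words with the target, while the unrelated one shares none. Profile methods and threading exist because this kind of signal fades at low sequence identity.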

  • 7/27/2019 Andras[1]

    12/21

    Steps in Comparative Protein Structure Modeling

    Target-Template Alignment:

    Dynamic Programming, Pairwise Alignment

    Multiple Alignments, Profiles, HMMs

    Structure-based approaches (Threading)
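
    Dynamic programming for pairwise alignment can be sketched with the classic global alignment recurrence (Needleman-Wunsch). The scoring values here (match +1, mismatch -1, linear gap -2) are arbitrary choices for illustration; real alignments use substitution matrices and affine gap penalties, neither of which is shown.

```python
def needleman_wunsch(a: str, b: str, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming; returns (score, row_a, row_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


print(needleman_wunsch("ACGT", "AGT"))
```

    The table costs time and memory proportional to the product of the two sequence lengths, which is why database-wide template searches rely on faster heuristics before this step.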

  • 7/27/2019 Andras[1]

    13/21

    Steps in Comparative Protein Structure Modeling

    Model Building:

    Rigid Body Assembly (COMPOSER)

    Segment Matching (SEGMOD, 3DPSSM)

    Satisfaction of Spatial Restraints (MODELLER)

    Integrated (NEST)

    Loop modeling, side chain modeling

  • 7/27/2019 Andras[1]

    14/21

    Steps in Comparative Protein Structure Modeling

    Model Evaluation:

    Stereochemistry (PROCHECK, WHATCHECK)

    Environment (Profiles3D, Verify3d)

    Statistical potential based methods (PROSA)

    Is the model reliable?

    A model is reliable when it is based on a correct template and on an approximately correct alignment.
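
    As a flavor of what a stereochemistry check looks at, the sketch below flags chain positions where consecutive Cα atoms are not about 3.8 Å apart, the typical spacing for a trans peptide bond. It is far simpler than the programs named on this slide and does not reproduce any of them; the tolerance and the coordinates are arbitrary values chosen for illustration.

```python
import math

IDEAL_CA_CA = 3.8  # angstroms, typical consecutive C-alpha spacing (trans peptide)
TOLERANCE = 0.3    # angstroms, arbitrary threshold chosen for this sketch


def flag_ca_breaks(ca_coords, ideal=IDEAL_CA_CA, tol=TOLERANCE):
    """Return indices i where the CA(i)-CA(i+1) distance deviates from ideal."""
    flagged = []
    for i in range(len(ca_coords) - 1):
        d = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(d - ideal) > tol:
            flagged.append(i)
    return flagged


# Toy chain: the last pair is 5.4 angstroms apart, as after a bad loop closure.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (13.0, 0.0, 0.0)]

breaks = flag_ca_breaks(chain)
print(breaks)
```

    A check like this catches local geometric problems only. It cannot tell whether the template or the alignment was right, which is the concern raised on this slide.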

  • 7/27/2019 Andras[1]

    15/21

    Typical Errors in Comparative Models

    Distortion in correctly aligned regions

    Region without a template

    Side chain packing

    Incorrect template

    Misalignment

    Legend: MODEL, X-RAY, TEMPLATE

  • 7/27/2019 Andras[1]

    16/21

    Comparing accuracies of experimental and theoretical approaches

  • 7/27/2019 Andras[1]

    17/21

    Some Models Can Be Surprisingly Accurate (in Some Regions)

    24% sequence identity: YJL001W, 1rypH

    25% sequence identity: YGL203C, 1ac5

    Ser 176, His 488, Asp 383

  • 7/27/2019 Andras[1]

    18/21

    in Zebrafish forkhead transcription factor Foxi1

    RMSD

    Re-modelled wild type segments (6 and 7 aa) and NMR: 1.78 and 1.82

    Modelled mutated segments with each other (6 and 7 aa): 1.19

    Wild type and mutated segments (6 and 7 aa): 3.65 and 3.75

  • 7/27/2019 Andras[1]

    19/21

    phila m. H. sapiens

    li Eq. inf. virus

    Predicting features that are not present in the template

    1. The active form is usually a trimer; each active site is formed by all three monomers.

    2. Comparison of models and X-ray structures reveals two subclasses of dUTPases with different types of subunit interfaces.

    3. The altered character of the subunit interfaces correlates with the suggested different functional mechanism: a polar/charged surface is better adjusted for allosterism.

    Different subunit communication in subfamilies of dUTPases

  • 7/27/2019 Andras[1]

    20/21

    Designing new enzyme specificity with the aid of comparative models

    1. Sequences are identified from the Trichomonas genome project.

    2. Mutations were designed using the constructed 3D model to switch specificity.

    Convergent evolution of Trichomonas vaginalis lactate dehydrogenase from malate dehydrogenase.

  • 7/27/2019 Andras[1]

    21/21

    Confirming fold by energy evaluation of comparative model

    G. lamblia   1aoiC/G       1aoiD/H       1aoiA/E       1aoiB/F
    H2A          -4.74         -3.42         -0.64         -2.77
    H2B          -1.15         -4.34         -0.41         -1.70
    H3           -1.35         -0.61         -2.38         -0.41
    H4           -2.29         -2.82         -0.26         -4.79
    X-ray        -5.41/-5.09   -3.98/-4.05   -2.74/-2.39   -5.23/-4.42

    Core histones of the amitochondriate protist, Giardia lamblia