www.wwpdb.org

September 29, 2008

Worldwide Protein Data Bank

Agenda

10:00 am  Welcome and Introductions (KH)
10:15     Overview of recent wwPDB progress (HB)
10:35     Outreach (HN)
10:55     NMR Task Force (JM)
11:15     Improvements in Data Deposition and Processing (KH)
11:45     New Projects (HB)
Noon      Working Lunch
1:00 pm   Funding Update (All)
1:30      Matters Arising
            – Committee membership
            – Next meeting
2:00      Discussion
3:00      Executive Session
3:15      Feedback
3:30      Adjourn

Overview

Helen Berman

wwPDBAC 2007 (on wwPDB Intranet)

wwPDBAC 2007 Recommendations

• Structure factors and/or NMR restraints should be a prerequisite for receiving a PDB ID
  – Done
• Inform the relevant journals of this new policy
  – Done; adopted by some but not all
• Validation: establish additional X-ray crystallography and NMR validation procedures
  – In progress
• Results should be made available to depositors immediately after submission. Upon depositor request, the validation reports should be made available to designated scientific journal editors
  – Possible now; journal policies have not yet changed
• Work to establish recommendations for additional experimental data deposition and release requirements
  – In progress

wwPDB Achievements, October 2007 - September 2008

• Continued growth of archive: now more than 50,000 structures
• Website updates
• Download statistics available
• Publications and presentations
• Enhanced complex molecule annotation
• New Format document
• Initiation of Common Annotation Tool development

Depositions

             Deposited To            Processed By
Month     RCSB   PDBj   PDBe      RCSB   PDBj   PDBe     Total Depositions
Oct 07     456     69     74       361    164     74       599
Nov 07     408     69    113       265    212    113       590
Dec 07     447     53     80       324    176     80       580
Jan 08     460     57     87       340    177     87       604
Feb 08     362     82     81       313    131     81       525
Mar 08     427     60     81       333    154     81       568
Apr 08     407     96     73       323    180     73       576
May 08     458     35     73       353    140     73       566
Jun 08     450     18     79       308    160     79       547
Jul 08     554     28     84       408    174     84       666
Aug 08     459     63     75       362    160     75       597
TOTAL     4888    630    900      3690   1828    900      6418
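The monthly figures can be cross-checked with a few lines of Python. This is only a sketch: the numbers are copied from the table above, while the names DEPOSITIONS, inconsistent_months and column_totals are introduced here purely for illustration.

```python
# Monthly depositions, October 2007 - August 2008, copied from the table above.
# Tuple layout: deposited to (RCSB, PDBj, PDBe), processed by (RCSB, PDBj, PDBe), total.
DEPOSITIONS = {
    "Oct 07": (456, 69, 74, 361, 164, 74, 599),
    "Nov 07": (408, 69, 113, 265, 212, 113, 590),
    "Dec 07": (447, 53, 80, 324, 176, 80, 580),
    "Jan 08": (460, 57, 87, 340, 177, 87, 604),
    "Feb 08": (362, 82, 81, 313, 131, 81, 525),
    "Mar 08": (427, 60, 81, 333, 154, 81, 568),
    "Apr 08": (407, 96, 73, 323, 180, 73, 576),
    "May 08": (458, 35, 73, 353, 140, 73, 566),
    "Jun 08": (450, 18, 79, 308, 160, 79, 547),
    "Jul 08": (554, 28, 84, 408, 174, 84, 666),
    "Aug 08": (459, 63, 75, 362, 160, 75, 597),
}


def inconsistent_months(rows):
    """Return the months whose deposited or processed counts do not add up to the total."""
    bad = []
    for month, (d_rcsb, d_pdbj, d_pdbe, p_rcsb, p_pdbj, p_pdbe, total) in rows.items():
        if d_rcsb + d_pdbj + d_pdbe != total or p_rcsb + p_pdbj + p_pdbe != total:
            bad.append(month)
    return bad


def column_totals(rows):
    """Sum every column over all months (same column order as the row tuples)."""
    return tuple(sum(column) for column in zip(*rows.values()))


if __name__ == "__main__":
    print("Inconsistent months:", inconsistent_months(DEPOSITIONS))
    print("Column totals:", column_totals(DEPOSITIONS))
```

With the figures as transcribed, every month is internally consistent and the column sums reproduce the TOTAL row.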

[Figure: Number of released entries, by year]

[Figure: Depositions to the PDB by decade]

[Figure: Depositor locations]

[Figure: Download locations (RCSB PDB, PDBe, PDBj)]

PDB File Downloads

Last 12 months
FTP:   256,753,220
HTTP:   47,102,103
Total: 303,855,323

Outreach

Haruki Nakamura

Outreach

• wwPDB website
• Simultaneous updating of PDB archives
• Publications
• Professional society meetings
  – Presentations
  – Exhibit booth

wwPDB website

• Deposition and download statistics
• Deposition and Release Policies
• Format Description
• Meeting information and preliminary recommendations

Simultaneous weekly update of PDB archive

In the past, the PDBj site began copying the latest data and loading them into its local database system only after the RCSB PDB archive had been updated on Wednesday. Updates to the PDBj database were therefore delayed. This frustrated potential PDBj users, who preferred to access RCSB PDB instead.

Since September 2008, PDBj has copied the latest data directly from the internal database at RCSB PDB in order to pre-construct the PDBj database at midnight on Saturday.

RCSB PDB automatically sends an e-mail after updating its public FTP site every Wednesday. On receiving it, PDBj updates its own FTP site with little delay.

Joint publications

K. Henrick, Z. Feng, W. Bluhm, D. Dimitropoulos, J.F. Doreleijers, S. Dutta, J.L. Flippen-Anderson, J. Ionides, C. Kamada, E. Krissinel, C.L. Lawson, J.L. Markley, H. Nakamura, R. Newman, Y. Shimizu, J. Swaminathan, S. Velankar, J. Ory, E.L. Ulrich, W. Vranken, J. Westbrook, R. Yamashita, H. Yang, J. Young, M. Yousufuddin, and H. Berman (2008) Remediation of the Protein Data Bank Archive. Nucleic Acids Res. 36(Database issue): D426-D433.

J.L. Markley, E.L. Ulrich, H. Berman, K. Henrick, H. Nakamura, and H. Akutsu (2008) BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): New policies affecting biomolecular NMR depositions. J. Biomol. NMR 40: 153-155.

S. Dutta, K. Burkhardt, G.J. Swaminathan, T. Kosada, K. Henrick, H. Nakamura, and H.M. Berman (2008) Data deposition and annotation at the Worldwide Protein Data Bank, in Methods in Molecular Biology 426: Structural Proteomics: High-Throughput Methods, B. Kobe, M. Guss, and T. Huber, Editors. Humana Press: Totowa, NJ.

C.L. Lawson, S. Dutta, J.D. Westbrook, K. Henrick, and H.M. Berman (2008) Representation of viruses in the remediated PDB archive. Acta Cryst. D64: 874-882.

Interactions

• Exchange visits
  – PDBe/RCSB PDB
  – PDBj/RCSB PDB
  – PDBj/BMRB
  – BMRB/RCSB PDB
  – BMRB/PDBe
• Phone conference with site directors twice a year
• VTCs among staff
  – BMRB/RCSB PDB twice a month (ADIT-NMR)
  – MSD/RCSB PDB weekly
  – RCSB PDB/PDBj and BMRB/PDBj
  – BMRB/PDBe
• Daily emails among staff
  – PDBe/RCSB PDB
  – PDBj/RCSB PDB
  – BMRB/RCSB PDB, PDBj, PDBe
• wwPDB Retreat 2007

wwPDB Retreat

IUCr Osaka 2008

• Joint exhibition stand
• Presentations
  – Keynote lecture: What the Protein Data Bank tells us about the past, present and future of structural biology
  – Validation talk: Data Quality in the PDB Archive
• Q&A at the Commission on Biological Macromolecules
• Specialized participation
  – Small Angle Commission
  – Workshop on New Routes to Crystallographic Data Publication
  – COMCIFS

http://www.eccb08.org

A demonstration describing the wwPDB, highlighting the collaboration as well as the services offered by member sites

NMR Update

John Markley

NMR structure depositions

• Number of NMR structures deposited through ADIT-NMR (09/01/07-08/31/08)
  – BMRB -> RCSB PDB: 461
  – PDBj-BMRB -> PDBj: 112
• Restraints remediation
  – Processing is virtually complete
  – Will be released as soon as it can be made consistent with the remediated chemical components dictionary

wwPDB policies and rules on NMR entries

• Two types of NMR experiments will be distinguished in the PDB entries
  – Solution NMR
  – Solid-state NMR
• NMR entries will have new PDB records
  – MDLTYP to indicate MINIMIZED AVERAGE
  – NUMMDL to specify the number of models in the entry
• These changes are reflected in Format Guide 3.2

wwPDB policies and rules on NMR entries

• The numbering of models is sequential, beginning with 1
• All models in a deposition (ensemble members and minimized average, if provided) should be superimposed in an appropriate author-determined manner, and only one superposition method should be used
• All models in an NMR ensemble and the minimized average structure, if provided, should have the same sequence and covalent structure (exact same number and type of atoms: hydrogens and heavy atoms), and chemistry (e.g., protonation state)
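Two of these rules (sequential model numbering, and identical atom content in every model) lend themselves to an automated check, together with the NUMMDL count. The sketch below is a minimal illustration, not wwPDB software: it assumes the usual fixed-column PDB layout (MODEL and NUMMDL values in columns 11-14, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26), and the function names are hypothetical. It does not check superposition or protonation chemistry beyond atom identity.

```python
def atom_line(serial, name, res_name, chain, res_seq):
    """Build a simplified fixed-column ATOM record with zeroed coordinates."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def model_lines(serial, atoms):
    """Wrap (name, res_name, chain, res_seq) tuples in MODEL/ENDMDL records."""
    body = [atom_line(i + 1, *atom) for i, atom in enumerate(atoms)]
    return [f"MODEL     {serial:4d}"] + body + ["ENDMDL"]


def check_nmr_models(pdb_text):
    """Return a list of problems found; an empty list means none were detected."""
    declared = None   # value of the NUMMDL record, if present
    serials = []      # MODEL serial numbers in file order
    atoms = []        # one list of atom identities per MODEL
    in_model = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "NUMMDL":
            declared = int(line[10:14])
        elif record == "MODEL":
            serials.append(int(line[10:14]))
            atoms.append([])
            in_model = True
        elif record == "ENDMDL":
            in_model = False
        elif record in ("ATOM", "HETATM") and in_model:
            atoms[-1].append((line[12:16].strip(), line[17:20].strip(),
                              line[21], line[22:26].strip()))

    problems = []
    if serials != list(range(1, len(serials) + 1)):
        problems.append(f"model numbering is not sequential from 1: {serials}")
    if declared is not None and declared != len(serials):
        problems.append(f"NUMMDL says {declared} but {len(serials)} models found")
    for serial, model_atoms in zip(serials[1:], atoms[1:]):
        # Compare as sorted lists so atom order within a model does not matter.
        if sorted(model_atoms) != sorted(atoms[0]):
            problems.append(f"model {serial} differs in atom content from model {serials[0]}")
    return problems


if __name__ == "__main__":
    glycine = [("N", "GLY", "A", 1), ("CA", "GLY", "A", 1)]
    entry = ["NUMMDL    {:4d}".format(2)] + model_lines(1, glycine) + model_lines(2, glycine)
    print(check_nmr_models("\n".join(entry)))
```

A minimized average structure would pass through the same check, since it is subject to the same atom-content rule as the ensemble members.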

Policies clarified by NMR Task Force, August 26, 2008

• PDB will accept minimized average structures only if they meet the above criteria for alignment and covalent structure
• The number of models will not be limited in a PDB file
• Chemical shifts deposition will become mandatory
• Depositors are encouraged to avail themselves of third-party validation software prior to deposition of NMR structures

Improvements in Data Deposition and Annotation

Kim Henrick

A year of VTCs and discussions

PDB Contents Guide Version 3.2

The goal was to further clarify all formats and procedures so as to create a more uniform archive

Process

Every record was reviewed for scientific correctness and clarity by wwPDB annotators

Some records were added and others expanded

Task Force members were consulted where appropriate

Added PDB Format Records

SPLIT              For large structures, to indicate the number of PDB entries
NUMMDL             Number of MODELs in an entry
MDLTYP             Model types, and whether chains are C-alpha only
REMARK 0           Re-refinement notice
REMARK 475         Residues modeled with zero occupancy
REMARK 480         Polymer atoms modeled with zero occupancy
REMARK 620         Metal coordination
REMARK 630         Inhibitor description
DBREF1 / DBREF2    To match very long UniProt identifiers
DBREF              Standard format still used

Internal Documentation

Results

Complete new Format document produced and released to public September 15, 2008

Files will be processed according to this specification starting November 15, 2008

All files in archive will be brought up to this standard Q1 2009

X-ray Validation Task Force Workshop
April 14-16, 2008, EBI, Hinxton, UK

www.wwpdb.org/workshop/2008/index.html

Randy Read (Chair), Paul Adams, Axel Brunger, Paul Emsley, Robbie Joosten, Gerard Kleywegt, Eugene Krissinel, Thomas Luetteke, Zbyszek Otwinowski, Tassos Perrakis, Jane Richardson, Will Sheffler, Janet Smith, Ian Tickle, Gert Vriend

wwPDB Validation Task Force

This meeting of the X-ray Validation Task Force was held to collect recommendations and develop consensus on additional validation that should be performed on PDB entries, and to identify software applications to perform validation tasks.

Preliminary Outcomes:

• Workshop report to be published in Fall 2008
• Candidate global and local validation measures were identified
• These measures were reviewed in terms of the requirements of depositors, reviewers, and users

Remediation and Curation of Complex Chemistry in the PDB

SCOPE

• Inhibitor molecules: annotate the chem comp dictionary and migrate details to PDB entries
• Ribosomal (post-translationally modified) and non-ribosomal cyclic, modified and conjugated peptides: consistently given a SEQRES and SOURCE; annotate an entity look-up table and transfer to PDB entries

2VUM

AMANITIN

AMANITIN

Mapping to UniProt, e.g. AMATX_AMAPH (P85421); recently shown to be a gene product.

The SEQRES in 2VUM is cyclically permuted and needs to be corrected from

    SEQRES 1 M 8 ASN HYP ILX TRX GLY ILE GLY CSX

to

    SEQRES 1 M 8 ILX TRX GLY ILE GLY CSX ASN HYP

to align with the gene sequence for beta-amanitin from Amanita phalloides and alpha-amanitin from Amanita bisporigera. The encoded sequence would be Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro. MODRES records are needed to match the gene product.
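The 2VUM correction is a rotation of a cyclic sequence, which can be sketched in Python. The residue lists come from the slide; the function names are hypothetical, and the PARENT mapping (modified residue to encoded amino acid) is an assumption read off position by position from the corrected SEQRES and the encoded sequence quoted above, in the spirit of the MODRES records the slide calls for.

```python
def rotate_to_start(residues, first):
    """Rotate a cyclic residue list so that `first` becomes residue 1."""
    start = residues.index(first)
    return residues[start:] + residues[:start]


def is_cyclic_permutation(a, b):
    """True if list `b` is some rotation of list `a`."""
    if len(a) != len(b):
        return False
    return not a or any(a[i:] + a[:i] == b for i in range(len(a)))


# Assumed parent residues, inferred from aligning the corrected SEQRES
# with the encoded sequence Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro.
PARENT = {"ILX": "ILE", "TRX": "TRP", "CSX": "CYS", "HYP": "PRO"}

deposited = "ASN HYP ILX TRX GLY ILE GLY CSX".split()
corrected = rotate_to_start(deposited, "ILX")
encoded = [PARENT.get(residue, residue) for residue in corrected]

if __name__ == "__main__":
    print(" ".join(corrected))
    print(" ".join(encoded))
```

Rotating rather than re-entering the sequence keeps the residue content untouched, so only the numbering origin changes.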

Cyclic, Modified and Conjugated Peptides
May be Ribosomal or Non-Ribosomal

• Non-gene peptides, e.g. actinomycin D, i.e. peptides that require a gene cluster
• Nonribosomal peptides: http://bioinfo.lifl.fr/norine/ or the Novel Antibiotics DataBase, http://www.nih.go.jp/~jun/NADB/search.html

Value to users

• To understand unique and shared aspects of a particular occurrence
• To find a specific system: some components of a PDB file, such as inhibitors and antibiotic peptides, might otherwise not be found or even be apparent
• To study related ligands across different proteins

Challenges

• Inclusion of non-standard amino acids, nucleotides, or other chemical groups in the sequence
• Non-linear (cyclic or branched) sequences
• Microheterogeneity (some cases)
• Non-uniform annotation of the same molecule in different PDB entries
• Lack of annotation regarding the source and function of these molecules

Solutions

• Analysis and classification
  – Identify antibiotics and inhibitors and group them into polymeric molecules or single components
• Dictionary updates
  – Build single chemical components for appropriate cases
  – Update dictionary with source, function and other details
• Remediation and future processing
  – Edit/revise files to include compound name, sequence, source and function for all antibiotics and inhibitors
  – Establish rules and procedures to make new annotations consistent

Single component vs. Polymeric

• Single component antibiotics or inhibitors
  – Build component and retain subcomponent information; annotate dictionary with details about molecule
  – Migrate details from dictionary to entry files in specific remarks
  – e.g. D-Phenylalanyl-L-prolyl-L-arginine chloromethyl ketone (PPACK)
• Polymeric (peptide-like) antibiotics or inhibitors
  – Present sequence, compound name, and source information as for any regular polymer
  – Include details about functions in specific remarks
  – e.g. post-translationally modified ribosomal peptides, non-ribosomal cyclic, modified or conjugated peptides

How many?

• Antibiotics
  – Single component: ~1000
  – Polymeric: ~300
• Inhibitors
  – Natural and synthetic inhibitors of enzymes and other cellular processes
  – Single component: ~350
  – Polymeric: ~350
• Others
  – Toxins: ~120

~1300 identified PDB entries: Antibacterial, Antiviral, Antimicrobial, Antifungal, Antibiotic
Overlap with: Anticancer, Anti-inflammatory, Immunosuppressant, Herbicide

THIOSTREPTON

THIOSTREPTON

4 PDB entries with 4 different representations:

1e9w   SEQRES THR ILE ALA DHA ALA DHA PYT
2jq7   SEQRES ILE ALA DHA ALA
1oln   LINKed HETs; ROP incorrectly used
3cf5   Single molecule, TXX

SEQRES should be

    TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB TZO DHA PYT

Now matched in all 4 entries; TXX is obsolete.
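To make the inconsistency concrete, the old SEQRES fragments can be compared with the agreed 15-residue sequence. The following is an illustrative sketch only, not the remediation procedure itself; the sequences are taken from the slide and the helper names are hypothetical.

```python
from collections import Counter

FULL = "TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB TZO DHA PYT".split()
OLD_SEQRES = {
    "1e9w": "THR ILE ALA DHA ALA DHA PYT".split(),
    "2jq7": "ILE ALA DHA ALA".split(),
}


def is_ordered_subsequence(partial, full):
    """True if `partial` appears in `full` in the same order (gaps allowed)."""
    remaining = iter(full)
    # `residue in remaining` consumes the iterator up to the first match,
    # so each later residue must be found after the previous one.
    return all(residue in remaining for residue in partial)


def missing_count(partial, full):
    """Number of residues of `full` that are absent from `partial`."""
    return sum((Counter(full) - Counter(partial)).values())


if __name__ == "__main__":
    for entry, seq in OLD_SEQRES.items():
        print(entry, is_ordered_subsequence(seq, FULL), missing_count(seq, FULL))
```

Both old SEQRES records are ordered fragments of the full sequence, but each omits a different set of residues, which is why a single agreed representation was needed.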

THIOSTREPTON

_entity.pdbx_description
; Thiostrepton: complex bacterial natural product containing thiazole rings that's used as a topical veterinary antibiotic and also has promising antimalarial and anticancer activity. First isolated from bacteria in 1955, thiostrepton has an unusual type of antibiotic activity: it disables protein biosynthesis by binding to ribosomal RNA and one of its associated proteins, and interacts directly with 23S rRNA nucleotides 1067A and 1095A
;
_entity.type “Polypeptide, sulfur containing antibiotic”
_entity.details
; Thiostrepton is a macrocyclic antibiotic incorporating thiazoles and other atypical amino acids. Patented in 1961, thiostrepton has been used as an antibiotic and acts by binding to ribosomes to prevent the binding of the EF-G elongation factor and GTP to the 50S ribosomal subunit. Thiostrepton is an inducer of tipA, a gene that controls the bacterial transcription regulators TipAL and TipAS, members of the MerR proteins that are central regulators in multidrug resistance. Closely related to siomycin, a recently discovered inhibitor of the oncogenic transcription factor FoxM1. The thiostrepton-resistant gene is also commonly used as a selective marker for recombinant DNA/plasmid technologies.

THIOSTREPTON

1 “CAS” “1393-48-2” ?
1 “PUBCHEM” “16130278” ?
1 “Merck Index” “11:9295 ; 14:9364” ?
1 “RTECS” “XN6300100” ?
1 “MDL number” “MFCD00135828” http://www.mdli.com/
1 “EG/EC Number” “215-734-9” ?
1 “ChemSpider” 10469505 http://www.chemspider.com/
1 “URL” http://www.fermentek.co.il/Thiostrepton.htm ?
1 “URL” http://www.tebu-bio.com/file/product/170BIA-T1158-1/ ?
1 “URL” http://www.bioaustralis.com/pdfs/thiostrepton.pdf ?
1 “Sigma Aldrich” “T8902” http://www.sigmaaldrich.com/
1 “Chemical Class” “macrolide” ?
1 “MESH” “Peptides, Cyclic [D04.345.566]” ?
1 “Pharm. Action” “Anti-Bacterial Agent” ?
1 “Image” http://pubs.acs.org/cen/images/8239/8239notw4image.gif ?
1 “Image” http://en.wikipedia.org/wiki/Image:Thiostrepton.png ?

Alert - New Protein Modifications (Thu, September 25, 2008 1:17 pm)
John S. Garavelli, UniProt/RESID database

micrococcin P1, SCTTCVCTCSCCT, Bacillus cereus strain ATCC 14579, UniProt:Q812G9_BACCR; incorrectly annotated as a putative lantibiotic peptide.

It is now believed that all the pyridinyl polythiazole antibiotics, including micrococcin P1, thiostrepton, thiocillin, GE2270 A and sulfamycin B, are genetically encoded directly.

TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB TZO DHA PYT

THIOSTREPTON

QUA ILE ALA SER ALA SER CYS THR THR CYS ILE CYS THR CYS SER CYS SER SER NH2

SEQRES

Inhibitors

CHYMOSTATIN

5 PDB entries with 3 representations; in all cases bound to Serine-OG:

1ke2   SEQRES CSI LEU PHA
1bcs   SEQRES CSI LEU PHA
1m21   single HET group CHY
1wvm   single HET group CHY
1sgc   single HET group CST

CHY   C31 H41 N7 O6   (OG missing, aldehyde)
CST   C31 H41 N7 O7   (OG present, carboxylic acid)

Convert all to pseudo SEQRES with BIOLOGICAL SOURCE
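The two HET definitions differ by a single atom, which a small formula comparison makes explicit. This is a sketch under simple assumptions: formulas are written as space-separated element/count pairs as on the slide, and parse_formula and formula_difference are hypothetical helper names.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a formula such as 'C31 H41 N7 O6' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def formula_difference(first, second):
    """Return {element: count in second minus count in first} for elements that differ."""
    a, b = parse_formula(first), parse_formula(second)
    return {el: b[el] - a[el] for el in sorted(set(a) | set(b)) if a[el] != b[el]}


if __name__ == "__main__":
    # CHY versus CST, formulas as given on the slide.
    print(formula_difference("C31 H41 N7 O6", "C31 H41 N7 O7"))
```

The only difference is one oxygen, consistent with the slide's note that CST includes the serine OG while CHY does not.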

FR901277

Borderline?

PDB ID 1qr3

Inhibitor of human leukocyte elastase from Streptomyces resistomycificus. Should this be a single component or polymeric?

Sequence: AIB ORN THR AA3 AA4 PHE AA6 VAL

Miri Hirshberg, Hyunmi Sun, Shuchismita Dutta, John Westbrook, Jasmine Young, Kim Henrick, John S. Garavelli (UniProt)

New Projects

Helen Berman

Small Angle Scattering

Two-member annotator team reviewing possible SAXS and SANS templates

Attendance at SAS Commission to discuss deposition and publication requirements

Template recommendation expected in 2009

Common Deposition and Annotation Tool

• Selected as the most important project going forward by participants of the 2007 wwPDB Retreat
• Project timeline: concept in 2008, design and development 2009 - 2011, with delivery by 2012
• Progress
  – wwPDB Directors adopted the role of Steering Committee and initiated the project Concept Phase
  – Concept Team, representing the 4 partner sites, met to create the Scope Document (December 2008)
  – Steering Committee approved the Scope Document in May 2008
  – Core Team Kick Off meeting, July 2008

Scope

• wwPDB-wide project
• Will allow full sharing of data load worldwide and eliminate individual points of failure
• Will implement recommendations of NMR and X-ray Validation Task Forces
• Will allow for data acquisition of coordinate, experimental and meta data for all methods
• Will ensure quality, consistency and efficiency of data processing and annotation process

Core Team Kick Off

Assumptions

• The deposition tools must be able to handle all current, agreed-upon data entry formats from the user community
• The underlying system design will not be driven by existing formats
• The product must provide an extensible framework enabling support for new experimental methods over its ten-year life span
• The project technical level will be set at a “reasonable” standard. Technology should be neither bleeding edge nor declining.

Core Team Meeting Outcome

1. Establish Project Management Strategy for this project
2. Draft a conceptual design for the solution and identify critical components that need to be investigated
3. Identify the top three challenges and initiate study groups
   • Future system data model (John Westbrook and Tom Oldfield)
   • Technologies and strategies for data and “state” management (John Westbrook)
   • Technologies and strategies for automation of the validation and annotation pipeline (Sameer Velankar)

Path Forward

Adapt Agile Development to our environment as appropriate.

Final design and full requirements realized through incremental deliveries, using lessons learned along the way.

[Diagram labels: Initial requirements and conceptual design; Probe development and testing of critical component solutions; Develop and test incremental products (repeated); Acceptance and release]

Archiving of Raw Diffraction Data

• Discussion at Commission on Biological Macromolecules
• Outcome: appoint working group to study requirements for archiving raw experimental data (Chair: Judith L. Flippen-Anderson)

Funding Update

• RCSB has received approval from NSB for funding through 2013
• BMRB is currently funded through Aug 2009; it has submitted a competitive renewal application to the National Library of Medicine (U.S. National Institutes of Health); even if successful, the current budget will be reduced by 30%
• PDBj will be reviewed this November, at the midpoint of the current project, which runs until Mar 2011
• EMBL-EBI (PDBe) has 6 months of bridging funds from the Wellcome Trust to cover the transition of team leader; 6 staff funded until 1-Dec-09

Matters Arising

• Committee membership
• HPUB proposed revision
• Industrial structures
• Validation guidelines