communications

L. ARAVIND National Center for Biotechnology Information Apprehending Life’s complexity: Making and communicating biological discoveries

Upload: somasushma

Post on 11-May-2015


TRANSCRIPT

Page 1: Communications

L. ARAVIND, National Center for Biotechnology Information

Apprehending Life’s complexity: Making and communicating biological discoveries

Page 2: Communications

We are becoming meme and teme machines!

It is all about replicators: biological and otherwise

The good old genes Memes Temes

Page 3: Communications

Summary of issues

Discovery in biology

Different philosophies: Natural history versus “hypothesis driven science”

Evolutionary theory and computation as a bridge between the philosophical antipodes

•Example of the PAS domain

A rich breeding ground for memes

Levels of organization in the living world and its complexity

Microscopic, mesoscopic and macroscopic world views need integration

Seeking gold at the end of the maze

Following natural order: hierarchies and networks

•Examples of classifications and hierarchies

The meme machine: transmission of discoveries

Databases and search tools

Scientific collaboration and competition

Journal systems

Page 4: Communications

The two philosophies in biology

Natural history: discovery of new forms, cataloguing and classification

Hypothesis -> attempt at falsification -> paradigms: Popper’s world view

Largely a history of clash or neglect

Page 5: Communications

Building the bridge: Evolutionary theory and computation


•Sequence profile analysis

•Structure similarity comparisons

•Contextual analysis

Understanding and predicting protein (biomolecule) function

Systems biology: Ensembles of biomolecules in functional guilds

The “omics” (regular and meta): From sequence to organismal biology and ecology

Page 6: Communications

Early domain universe

The protein universe shows enormous diversity but an underlying unity

•These relationships are powerful predictors of protein evolution, function and behavior


The largest assemblage of homologous domains that can be unified by sequence features is formally a superfamily

Several superfamilies may share a common folding pattern and arrangement of secondary structure elements: unified to a fold

Page 7: Communications

ALL LIFE FORMS | BACTERIA | ARCHAEA | EUKARYA

•The ribosome and associated enzymes such as some RNases (including RNase HII), PseudoU synthases, RNA methylases, thioU synthases, Clamp loader ATPase, RecA, RNA polymerases, translation GTPases, AATRS, ABC, MinD ATPases, OSGP-like chaperone/protease, PCNA, DNA ligases, rRNA and tRNAs

DNA polymerases, Holliday junction resolvases, Primases, Replicative Helicases, Origin recognition complexes

Ribozymes are well-known: so an RNA world of sorts must have existed

There was a common ancestor of all life; the main functions of this life form revolved around RNA metabolism and translation; some cellular functions related to DNA had developed but modern DNA replication “crystallized” later

So there was an RNA-centered ancestral form with a possible DNA intermediate in replication

Unifying life and inferring the common ancestor

Page 8: Communications

Getting behind biological clocks, photodetectors and oxygen sensors

Regulation of circadian rhythms in animals

Periodic growth and sporulation in fungi

Light regulated expression of photosynthetic pigments

Oxygen-seeking behavior in aerobic bacteria

A master regulator of the clock: the period protein (PER)

WC-1 and WC-2: two light-sensory regulators of gene expression in Neurospora

BAT: a regulator of photosynthetic pigment expression

The aerotaxis receptor of E. coli and other bacteria

Page 9: Communications

The PAS domain

A ligand-binding domain that binds diverse ligands such as heme, tetrapyrroles and flavin nucleotides

Thus, it can sense diverse stimuli like light, redox or both

Transmits this stimulus to a diverse range of other “effector” domains

Curr Biol. 1997 Nov 1;7(11):R674-7. PAS: a multifunctional domain family comes to light. Ponting CP, Aravind L.


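[Figure: domain architectures of PAS-containing proteins, including WC-1, SIM and PER; PAS domains appear combined with GATA, bHLH, C6, AAA+ and HTH domains; function label: Transcription]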

Page 10: Communications

[Figure: domain architectures combining PAS with S/T-kinase, GAF, adenylyl cyclase and H-kinase domains]

ERG-channels: redox sensing in animal hearts

Phytochrome: Light sensing in plants and bacteria

Signaling intracellular redox states

Small-molecule based regulation of signaling enzymes

Birth of a meme…

•Detection of the PAS domain allows a definitive functional prediction

•The mechanisms of critical molecules across the entire diversity of life could be predicted

•It was a very successful meme indeed: 887 publications following up on the original characterization and function prediction of the PAS domain have emerged since – around 80-90 per year.

The predictions:

Page 11: Communications

Overview of biological complexity

Mesoscopic

Characterization of biological functional systems

Function prediction & classification

Microscopic: Discovery and classification of domains

Computational analysis of whole biological systems or networks

Macroscopic: Reconstructing organismal biology and whole ecosystems

Evolutionary trajectories: Genomes to Biology

Page 12: Communications

Eukaryotic signaling proteins show non-linear scaling with proteome size…

However, major superfamilies of signaling proteins show largely linear trends: invention of many lineage-specific systems independent of the large superfamilies

Deviations point to important functional adaptations: convergent evolution of LRR+kinase architectures

Page 13: Communications

Lineage-specific expansion of a domain family

Definition: The increase in the number of copies of a domain in a particular lineage relative to its number in a sister reference lineage

[Figure: lineage-specific expansion of prolyl hydroxylases; labels: Rs, Pbcv1, Dm, Arabidopsis, Drosophila, Homo]
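The definition of lineage-specific expansion can be illustrated with a minimal sketch. The slide gives no formula, so expressing the expansion as a simple ratio of copy numbers is an assumption made here, and the function name `expansion_ratio` and the copy numbers are hypothetical, chosen only for illustration.

```python
def expansion_ratio(count_in_lineage: int, count_in_sister: int) -> float:
    """Copies of a domain in a lineage divided by copies in its sister reference lineage.

    A ratio well above 1 suggests an expansion in the lineage of interest.
    Using a plain ratio is an assumption of this sketch, not the talk's method.
    """
    if count_in_sister <= 0:
        raise ValueError("the sister reference lineage must contain at least one copy")
    return count_in_lineage / count_in_sister


# Hypothetical copy numbers, for illustration only (not real genome counts).
copies = {"lineage_of_interest": 12, "sister_reference": 3}
ratio = expansion_ratio(copies["lineage_of_interest"], copies["sister_reference"])
print(f"expansion ratio: {ratio:.1f}")
```

In practice the counts would come from domain searches against each proteome, and a statistical test against the proteome-size trend would be needed before calling an expansion significant.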

Page 14: Communications

Section of the contextual network for the Ub pathway

[Figure: contextual network for the Ub pathway. Node labels include UB, UBX, UBA, UBCH, E2, OTU-DUB, PNGase, PAW, PUG, PUL domain, PPPDE, Thioredoxin, WLM (metallopeptidase), LF, C2H2, C2H2-U, A20 ZnF, An1 ZnF, ZZ finger, Asp protease, Calpain, WD40, RAB, RAB-GEF, Yif1 and TM segments. * Predicted DUB]

Page 15: Communications

Domain architectural “complexity” of eukaryotic signaling proteins

•Complexity can vary drastically even between sister lineages: parasitism causes a general fall in complexity

•The complexity in free-living forms is high in the chromalveolate+crown group clade.

•Multicellularity and cellular complexity resulted in increases in domain architectural complexity but clearly the increase was greatest in the animal lineage alone.

•Fungi as a whole show a reduction of complexity concomitant with their gene loss with respect to the ancestor of the crown group lineage.

Page 16: Communications

Biology of Networks

Nodes | Links | Interaction | Network

Proteins | Physical interaction | Protein-Protein | Protein interaction

Metabolites | Enzymatic conversion | Protein-Metabolite | Metabolic

Transcription factor, Target genes | Transcriptional interaction | Protein-DNA | Transcriptional

[Figure: each network type is drawn as a link between two nodes, A and B]
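The node-and-link view of biological networks maps directly onto a small data structure. The following is a minimal sketch, not the representation used in the talk: the `Link` type, the `build_adjacency` helper and the node names are hypothetical, and treating only protein-protein links as symmetric is an assumption of this example.

```python
from collections import defaultdict
from typing import NamedTuple


class Link(NamedTuple):
    source: str  # node A
    target: str  # node B
    kind: str    # "protein-protein", "protein-metabolite" or "protein-DNA"


def build_adjacency(links):
    """Map each node to the set of nodes it links to.

    Assumption: physical (protein-protein) interactions are stored in both
    directions; all other link kinds are kept directed from A to B.
    """
    adjacency = defaultdict(set)
    for link in links:
        adjacency[link.source].add(link.target)
        if link.kind == "protein-protein":
            adjacency[link.target].add(link.source)
    return dict(adjacency)


# Hypothetical nodes, for illustration only.
links = [
    Link("TF_A", "gene_B", "protein-DNA"),
    Link("protein_C", "protein_D", "protein-protein"),
]
adjacency = build_adjacency(links)
print(adjacency)
```

Keeping the link kind on every edge lets the three network types share one structure while still being filtered apart for separate analysis.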

Page 17: Communications

Datasets

E. coli transcriptional regulatory network: 112 TFs, 711 TGs, 1295 interactions

Yeast transcriptional regulatory network: 157 TFs, 4410 TGs, 12873 interactions

Data sources: small-scale biochemical experiments; large-scale ChIP-chip experiments and genetic deletion and over-expression data

Page 18: Communications

$N(k) \propto \frac{1}{k^{\gamma}}$

Scale-free structure

Presence of few nodes with many links and many nodes with few links

Transcriptional networks are scale-free

Scale free structure provides robustness to the system

Albert & Barabasi, Rev Mod Phys (2002)
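As a rough illustration of what a scale-free structure means in practice, the sketch below counts N(k), the number of regulators with k links, and estimates the slope of log N(k) against log k. Using the out-degree of transcription factors and a plain least-squares fit are assumptions of this example, the function names are hypothetical, and the edge list is toy data rather than any real network.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: N(k)}, where N(k) is the number of regulators with k targets."""
    out_degree = Counter(regulator for regulator, _ in edges)
    return dict(Counter(out_degree.values()))


def loglog_slope(distribution):
    """Least-squares slope of log N(k) versus log k.

    For a power law N(k) ~ k^(-gamma) the slope is approximately -gamma.
    """
    xs = [math.log(k) for k in distribution]
    ys = [math.log(n) for n in distribution.values()]
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct degrees to fit a slope")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Toy edge list (regulator, target): one hub with four targets, two regulators with one.
edges = [("hub", "g1"), ("hub", "g2"), ("hub", "g3"), ("hub", "g4"),
         ("tf2", "g1"), ("tf3", "g2")]
distribution = degree_distribution(edges)
print(distribution, loglog_slope(distribution))
```

A real analysis would need far more nodes and a more careful estimator (for example, maximum likelihood on binned data) before claiming a power law.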

Page 19: Communications

[Figure: the regulators Crp and NarL compared across E. coli, H. influenzae and B. pertussis]

Regulatory hubs which are condition specific can be either lost or replaced

The same protein may confer different adaptive value in organisms with different lifestyles. Hence it may emerge as a regulatory hub in the organism to which it confers high adaptive value and not in the others

Different proteins should emerge as hubs in organisms with different lifestyles
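The idea that different regulators emerge as hubs in different organisms can be checked by ranking regulators by their number of targets in each network. This is a minimal sketch under stated assumptions: the `top_hubs` helper is hypothetical, "hub" is taken to mean highest out-degree, and the two edge lists are toy data that do not describe any real organism.

```python
from collections import Counter


def top_hubs(edges, n=1):
    """Return the n regulators with the most targets in a (regulator, target) edge list."""
    out_degree = Counter(regulator for regulator, _ in edges)
    return [regulator for regulator, _ in out_degree.most_common(n)]


# Toy networks for two hypothetical organisms; gene names g1..g4 are placeholders.
organism_1 = [("Crp", "g1"), ("Crp", "g2"), ("Crp", "g3"), ("NarL", "g4")]
organism_2 = [("NarL", "g1"), ("NarL", "g2"), ("NarL", "g3"), ("Crp", "g4")]

for name, network in [("organism_1", organism_1), ("organism_2", organism_2)]:
    print(name, top_hubs(network))
```

Comparing the ranked lists across organisms shows at a glance whether a hub is conserved, lost or replaced.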

Page 20: Communications

Apprehending the diversity of eukaryotes

“crown group”: Most studied

“microbial eukaryotes”: Most diverse and prevalent

[Figure labels: animals, fungi, slime molds, plants, chlorophytes, rhodophytes, diatoms, heteroloboseans, parabasalids, diplomonads, Euglenozoa, ciliates, apicomplexans]

Page 21: Communications

Some notable associations that might favor inter-eukaryotic gene flow

Primary endosymbiosis with cyanobacterium

Secondary endosymbiosis with different plant lineages

Plant lineages

Karyoklepty (e.g. ciliates)

Endosymbiosis

Engulfment

Parasitic nucleus

Nuclear invasion

Karyoparasitism (e.g. rhodophytes)

Endoparasitism (e.g. apicomplexa)

Page 22: Communications

Composite selves: bacterial origins for Vitamin B12 receptors

• We discovered a novel domain that forms the common denominator for Vitamin B12 binding and recognition in both bacteria and animals. This helped us understand how B12 is taken up by animal guts

•Domain architectures and unusual phyletic distribution of this domain strongly suggested a bacterial origin for the primary animal Vitamin B12 receptor

Page 23: Communications

The medium for biological discovery

The Dali Database

•BLAST

•PSI-BLAST

•HMMER

•HHPRED

•DALI

•MUSTANG

•KALIGN

•MUSCLE

….

Labs (including “Omics” centers)

Primary archival databases

Search methods and strategies

Secondary databases

Journals

Lost in the black hole

Page 24: Communications

Sociology of the process: Complexity, competition and currency

Complexity

•Dispersion of efforts

•Lack of integration

Gold rush for the “hot” issues

Publications seen as currency in the scientific community

•Intense competition

•Secrecy and strife

Transmission of discoveries is hampered

Can we / should we intercede?

Increased Collaboration

Page 25: Communications

Genes: Natural selection; scientific memes: peer review?

Does the axe of peer review, as it stands, hamper effective scientific transmission?

•Great science was done without modern-style peer review

•Long delays in publishing: damaging in a competitive scientific environment

•Inane reviews with hardly any constructive value

•Nitpicking: surely a primate instinct, but does it help in science?

•Obstructionists: peer review as a tool against competitors

•Closed one-sided process

•Crackpot science: What do we do about it?

•Enormous volume of scientific production: strain on referees and journal editors

•Constructive criticism helps!

•Open peer review system: A viable compromise?

•A test case for the model: Biology Direct at BMC journals

Page 26: Communications

Conclusions

Given the “special” interests:

1) Journals and publishers

2) Evaluation of scientists by host institutions

3) Triaging scientific publications

4) Allocating funds for biological research

5) Need to bar crackpots

Given the competition:

1) Blogs

2) Wikis

3) Open access, open peer-review etc.

4) The ubiquity of the internet

5) The drive from the memes and temes!

Will out of the box thinking help?