abc proteins statistical analysis

GENOMIC ANALYSIS OF ABC PROTEINS IN ARCHAEA AND BACTERIA Supervised By: Dr. S. P. Kanaujia Presented By: Mehul Garg 10010621 IIT Guwahati BTP PRESENTATION PHASE-II

Upload: mehul-garg

Post on 27-Jun-2015


Category:

Engineering



DESCRIPTION

ABC (ATP-binding cassette) proteins were identified and analyzed. Their number varied between bacteria and archaea.

TRANSCRIPT

Page 1: ABC Proteins Statistical Analysis

GENOMIC ANALYSIS OF ABC PROTEINS IN

ARCHAEA AND BACTERIA

Supervised By: Dr. S. P. Kanaujia

Presented By: Mehul Garg

10010621

IIT Guwahati

BTP PRESENTATION PHASE-II

Page 2: ABC Proteins Statistical Analysis

ABC Proteins in Archaea and Bacteria

OVERVIEW

• ABC PROTEINS – INTRODUCTION
• DOMAINS OF ABC PROTEINS
• IDENTIFICATION OF DOMAINS
• TOOLS FOR IDENTIFICATION
• PATTERN SEARCH ALGORITHM
• RESULTS AND DISCUSSIONS

4/23/2014

Page 3: ABC Proteins Statistical Analysis


ABC PROTEINS:

The ATP-binding cassette (ABC) genes encode the largest family of transmembrane (TM) proteins.

These proteins bind ATP and use the energy to drive the transport of various molecules across cell membranes.


Page 4: ABC Proteins Statistical Analysis


STRUCTURE :

Proteins are classified as ABC transporters based on the sequence and organization of their ATP-binding domain(s), also known as nucleotide-binding domains (NBDs), together with their transmembrane domains (TMDs) and substrate-binding proteins (SBPs).

The NBDs contain the characteristic Walker A and Walker B motifs, found in all ATP-binding proteins and separated by approximately 90–120 amino acids, as well as the signature (C) motif, located just upstream of the Walker B site.

The TMDs contain 6–11 membrane-spanning α-helices.

The SBPs are present in bacteria and archaea, where they help transporters take up substrate.


Page 5: ABC Proteins Statistical Analysis

ABC TRANSPORTER

This animation displays the domains present in an ABC transporter.

The SBP binds the substrate and initiates transport. The TMD transports the substrate through the membrane. The NBD provides the energy through ATP hydrolysis.


Page 16: ABC Proteins Statistical Analysis

NBDS:

CONSERVED DOMAINS:
Walker A: [AG]-x(4)-G-K-[ST]
Walker B: D-E-x(5)-D
Signature sequence: [LIVMFYC]-[SA]-[SAPGLVFYKQH]-G-[DENQMW]-[KRQASPCLIMFW]-[KRNQSTAVM]-[KRACLVM]-[LIVMFYPAN]-{PHY}-[LIVMFW]-[SAGCLIVP]-{FYWHP}-{KRHP}-[LIVMFYWSTA]
[] : any one of the listed amino acids, {} : none of the listed amino acids, x : any amino acid

(reproduced from wikipedia.org)
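The motif notation above maps almost one-to-one onto regular expressions. The following is a minimal sketch of such a conversion in Python; the function name `prosite_to_regex` is hypothetical and the code is not taken from the presentation. It assumes elements are separated by hyphens and handles only the constructs that appear in the three motifs.

```python
import re

def prosite_to_regex(pattern):
    """Convert a hyphen-separated motif such as '[AG]-x(4)-G-K-[ST]' to a Python regex."""
    parts = []
    for element in pattern.strip().split("-"):
        # Split an element into its core and an optional repeat count, e.g. 'x(4)' or 'x(2,5)'.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core in ("x", "X"):
            core = "."                        # any amino acid
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # none of the listed amino acids
        # '[...]' classes and single letters are already valid regex syntax.
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(core)
    return "".join(parts)

walker_a = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
walker_b = prosite_to_regex("D-E-x(5)-D")
```

The resulting strings can be passed directly to `re.search` to scan a protein sequence.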

Page 17: ABC Proteins Statistical Analysis


IDENTIFICATION OF NBDS:

We scanned all proteins for the Walker A, Walker B and ABC transporter family signature motifs.

In NBDs, the ABC transporter family signature motif is always located between the Walker A and Walker B motifs (about 100 residues downstream of the Walker A motif and 10 residues upstream of the Walker B motif). We therefore checked whether each identified protein contains all three motifs at the correct relative positions.

We searched for the conserved domains in NBDs using the web server GenoList.
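The relative-position check can be sketched in Python as follows. This is an illustration, not the code used in the project: the function name and the two distance limits (`max_a_to_sig`, `max_sig_to_b`) are assumptions chosen loosely around the "about 100" and "about 10" residue distances quoted above.

```python
import re

WALKER_A = re.compile(r"[AG].{4}GK[ST]")
WALKER_B = re.compile(r"DE.{5}D")
SIGNATURE = re.compile(
    r"[LIVMFYC][SA][SAPGLVFYKQH]G[DENQMW][KRQASPCLIMFW][KRNQSTAVM][KRACLVM]"
    r"[LIVMFYPAN][^PHY][LIVMFW][SAGCLIVP][^FYWHP][^KRHP][LIVMFYWSTA]"
)

def find_nbd_arrangement(seq, max_a_to_sig=150, max_sig_to_b=30):
    """Return (walker_a, signature, walker_b) start positions if the motifs occur
    in that order within the assumed distance limits, otherwise None."""
    for a in WALKER_A.finditer(seq):
        # Look for the signature motif downstream of this Walker A match.
        for s in SIGNATURE.finditer(seq, a.end()):
            if s.start() - a.end() > max_a_to_sig:
                break  # later signature matches are even further away
            # Take the first Walker B downstream of the signature motif.
            b = WALKER_B.search(seq, s.end())
            if b and b.start() - s.end() <= max_sig_to_b:
                return a.start(), s.start(), b.start()
    return None
```

A protein that contains all three motifs but in the wrong order, or too far apart, yields `None` and would be rejected as an NBD candidate under these assumptions.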


Page 18: ABC Proteins Statistical Analysis


SERVER USED FOR SEARCHING PATTERN: GENOLIST


GenoList is a server provided by the Pasteur Institute, France. It offers 700 genomes for analysis. Pattern search uses the following syntax:
[] : any one of the amino acids in the brackets is allowed.
[^] : none of the amino acids in the brackets is allowed.
X : any amino acid is allowed.

Page 19: ABC Proteins Statistical Analysis


PATTERN SEARCH :

We used regular expressions in a Python tool. Advantages:

• Having the code lets the user know what the program is doing.
• Only the 700 genomes listed in GenoList can be searched there, and other available pattern-search tools do not allow searching for multiple patterns.
• GenoList allows at most 100 genomes to be selected at a time, whereas the code can search any number of genomes.


Page 20: ABC Proteins Statistical Analysis


CODE:

Different Parts :

1. The program asks the user for the number of patterns.

2. The user is asked for each pattern and the number of mismatches allowed in it.

3. The program then asks the user for the lower and upper bounds on the number of amino acids between consecutive patterns.

4. The program finds all possible combinations of the allowed mismatches and computes the corresponding regular expressions.

5. Each expression is searched for in the input file that the user provides, and the results are written to temporary files according to the number of mismatches.

6. The temporary files are combined and the results are written to a common output file, ordered by the total number of mismatches.
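The core of steps 2 to 4 can be sketched in Python as below. This is a hedged reconstruction, not the original program: all function names are hypothetical, motifs are assumed to be written as Python regular expressions with one token per residue position (no quantifiers such as `{4}`), and the interactive prompts and the file handling of steps 1, 5 and 6 are left out.

```python
import itertools
import re

def tokenize(motif):
    # One token per residue position: a bracketed class such as [AG] or a single character.
    return re.findall(r"\[[^\]]+\]|.", motif)

def variants(motif, k):
    # All regexes obtained by relaxing exactly k non-wildcard positions to '.'.
    tokens = tokenize(motif)
    fixed = [i for i, t in enumerate(tokens) if t != "."]
    out = set()
    for chosen in itertools.combinations(fixed, k):
        out.add("".join("." if i in chosen else t for i, t in enumerate(tokens)))
    return sorted(out)

def combined_regexes(motifs, mismatches, gaps):
    # gaps[i] = (lower, upper) number of residues allowed between motif i and motif i+1.
    per_motif = [variants(m, k) for m, k in zip(motifs, mismatches)]
    for combo in itertools.product(*per_motif):
        parts = [combo[0]]
        for (lo, hi), nxt in zip(gaps, combo[1:]):
            parts.append(".{%d,%d}" % (lo, hi))
            parts.append(nxt)
        yield "".join(parts)

def min_total_mismatches(sequence, motifs, max_mismatches, gaps):
    # Try mismatch combinations in order of increasing total; return the first total that matches.
    ranges = [range(k + 1) for k in max_mismatches]
    for ks in sorted(itertools.product(*ranges), key=sum):
        for regex in combined_regexes(motifs, ks, gaps):
            if re.search(regex, sequence):
                return sum(ks)
    return None
```

For example, `min_total_mismatches(seq, ["[AG]....GK[ST]", "DE.....D"], [1, 1], [(60, 120)])` would report the smallest total number of mismatches at which `seq` carries both Walker motifs within the given spacing; the spacing values here are placeholders.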


Page 21: ABC Proteins Statistical Analysis


IDENTIFICATION OF TMDS :

o Signature motifs are found only in some sub-families of TMDs.

o All TMDs are integral transmembrane proteins composed of four to eight alpha-helices, and their encoding genes are usually organized in an operon with those encoding NBDs.

o We screened the proteins encoded near each NBD gene for transmembrane domains using the web server TMHMM.
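Selecting the "nearby" proteins to submit to TMHMM can be sketched as a simple window over the gene order. The function name, the gene identifiers and the window size are hypothetical; the sketch ignores strand and contig boundaries, which a real operon-based search would need to consider.

```python
def neighbouring_genes(gene_order, nbd_hits, window=2):
    """Return genes lying within `window` positions of any NBD hit, in first-seen order.

    gene_order: gene identifiers in chromosomal order.
    nbd_hits:   identifiers of genes already identified as NBDs.
    """
    index = {gene: i for i, gene in enumerate(gene_order)}
    seen = set(nbd_hits)          # NBD genes themselves are not TMD candidates
    candidates = []
    for hit in nbd_hits:
        i = index[hit]
        start = max(0, i - window)
        stop = min(len(gene_order), i + window + 1)
        for gene in gene_order[start:stop]:
            if gene not in seen:
                seen.add(gene)
                candidates.append(gene)
    return candidates
```

The sequences of the returned candidates would then be checked for transmembrane helices.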


Page 22: ABC Proteins Statistical Analysis


SERVER FOR TRANSMEMBRANE DOMAIN: TMHMM


TMHMM is a server provided by the Technical University of Denmark. Up to 4000 proteins can be analyzed at a time for the presence of transmembrane domains.

Page 23: ABC Proteins Statistical Analysis


SBPS :

In Gram-positive bacteria and archaea, the SBP is attached to the membrane, whereas in Gram-negative bacteria it lies in the periplasm between the outer and inner membranes.


(reproduced from Braibant et al. (2000))

Page 24: ABC Proteins Statistical Analysis

IDENTIFICATION OF SBPS:

Archaea and Gram-positive bacteria:

Our strategy for finding the SBPs of the importers was based on the facts that, in Gram-positive bacteria:

• SBPs are lipoproteins containing a prokaryotic membrane lipoprotein lipid attachment site.
• The genes encoding the SBPs are usually organized in an operon with those encoding NBDs and TMDs.

Gram-negative bacteria:

Our strategy for finding the SBPs of the importers was based on the facts that, in Gram-negative bacteria:

• SBPs are proteins containing a signal peptide.
• The genes encoding the SBPs are usually organized in an operon with those encoding NBDs and TMDs.

Page 25: ABC Proteins Statistical Analysis

RESULTS:

Streptococcus pneumoniae and Beutenbergia cavernae were found to have a high content of ABC assemblies compared with the other genomes.

[Figure: BACTERIA, with reference lines at µ-2σ, µ and µ+2σ]
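The µ ± 2σ reference lines suggest that genomes were called unusually rich or poor in ABC assemblies when their score fell outside that band. A minimal sketch of such a check follows; the function name and the input scores are hypothetical, and the use of the population standard deviation is an assumption, since the slides do not say which estimator was used.

```python
from statistics import mean, pstdev

def flag_outliers(scores, k=2.0):
    """Split genomes into those above mu + k*sigma and those below mu - k*sigma.

    scores: mapping of genome name to score (for example, normalized ABC assembly content).
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)   # population standard deviation (assumed)
    high = sorted(name for name, s in scores.items() if s > mu + k * sigma)
    low = sorted(name for name, s in scores.items() if s < mu - k * sigma)
    return mu, sigma, high, low

# Hypothetical scores, for illustration only.
example = {"genome_%d" % i: 2.0 for i in range(9)}
example["genome_rich"] = 10.0
mu, sigma, high, low = flag_outliers(example)
```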

Page 26: ABC Proteins Statistical Analysis

RESULTS:

Thermofilum pendens has a very high content of ABC systems. This may be because it can sustain life in extreme environments (it is a thermoacidophile), so its need for transporters under extreme conditions might be responsible. Nanoarchaeum equitans has only 2 assemblies, likely because it cannot synthesize most nucleotides, amino acids, lipids and cofactors; the cell most likely obtains these biomolecules from Ignicoccus.

[Figure: ARCHAEA, with reference lines at µ-2σ, µ and µ+2σ]

Page 27: ABC Proteins Statistical Analysis

ABC ASSEMBLY VS NUMBER OF GENES:

[Figure: ABC ASSEMBLY (y-axis) versus NUMBER OF GENES (x-axis); panels: Archaea, Bacteria; linear fits: f(x) = 0.0196685942177455 x + 2.30054678485003 and f(x) = 0.0149410140241395 x − 2.38963573060174]

The number of transporters of all categories increases approximately in proportion to genome size.
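Trend lines of the form f(x) = a·x + b, like those in the figure, are typically obtained by ordinary least squares. The sketch below shows that calculation in plain Python; it assumes, without confirmation from the slides, that this is how the fits were produced, and the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (number of genes, number of ABC assemblies) pairs.
genes = [1000, 2000, 3000]
assemblies = [22, 42, 62]
slope, intercept = fit_line(genes, assemblies)
```

A slope near 0.02 would correspond to roughly two additional ABC assemblies per hundred genes.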

Page 28: ABC Proteins Statistical Analysis

PATHOGENIC BACTERIA:

Mean normalized score: 1.86%, lower than the overall ABC assembly percentage for bacteria. Mycobacterium tuberculosis has the lowest number of ABC proteins.

Page 29: ABC Proteins Statistical Analysis

CONCLUSION:

Bacteria: 45 genomes

Normalized percentage of ABC proteins found: (1.97*3) ~5.93 %

Most of the bacteria used are intracellular parasites. Such bacteria are able to grow inside cells, where the availability of a metabolite can make a gene inessential and lead to its subsequent disruption or deletion. M. tuberculosis has only 38 ABC assemblies, fewer than the 90 found in E. coli.

Archaea: 60 genomes

Normalized percentage of ABC proteins found: (1.37*3) ~4.12 %

Thermofilum pendens was found to have a very high content of ABC systems compared with species of similar genome size.

Nanoarchaeum equitans was found to have only 2 ABC assemblies.

Normalized Score = Number of ABC assemblies / Number of genes in the genome. The normalized percentage of ABC proteins is obtained by multiplying the normalized score by three, the average number of components (NBD, TMD and SBP) per assembly.
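The two formulas can be written out as a short Python sketch. The function names are hypothetical, and the gene count of 4000 used in the example is a placeholder rather than a value from the slides; only the 38 assemblies of M. tuberculosis and the factor of three come from the text.

```python
def normalized_score(abc_assemblies, genes):
    """Normalized score as a percentage: ABC assemblies per gene in the genome."""
    return 100.0 * abc_assemblies / genes

def abc_protein_percentage(score_percent, components=3):
    """Approximate percentage of ABC proteins, assuming `components` proteins
    (NBD, TMD and SBP) per assembly on average."""
    return score_percent * components

# 38 assemblies is from the slides; 4000 genes is a hypothetical genome size.
score = normalized_score(38, 4000)
proteins = abc_protein_percentage(score)
```

With the rounded means reported on the slide, 1.97 × 3 evaluates to 5.91 and 1.37 × 3 to 4.11; the slide's 5.93 and 4.12 presumably come from unrounded means.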

Page 30: ABC Proteins Statistical Analysis


REFERENCES:

Martine Braibant, Philippe Gilot & Jean Content. The ATP binding cassette (ABC) transport systems of Mycobacterium tuberculosis. FEMS Microbiology Reviews, 2000, 24, 449-467.

Sonja-Verena Albers, Sonja M. Koning, Wil N. Konings & Arnold J. M. Driessen. Insights into ABC transport in archaea. Journal of Bioenergetics and Biomembranes, 2004, Vol. 36, No. 1.

Pierre Lechat, Laurence Hummel, Sandrine Rousseau & Ivan Moszer. GenoList: an integrated environment for comparative analysis of microbial genomes. Nucleic Acids Research, 2008, D469-74. DOI:10.1093.

Jannick Dyrløv Bendtsen, Henrik Nielsen, Gunnar von Heijne & Søren Brunak. Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol., 2004, 23-1.

Combet, C., Blanchet, C., Geourjon, C. & Deléage, G. Trends Biochem. Sci., 2000, 25, 147.
