biopython

Biopython. Prepared by: Suhad Jihad, MSc student in Polytechnic

Translation & Translation tables:

What is translation: the process of translating mRNA into the corresponding protein sequence.

What are translation tables: they are the genetic codes that map each codon to an amino acid. The translation tables available in Biopython are based on those from the NCBI. The NCBI defines many tables, each with a name and an ID; by default, translation uses the standard genetic code (NCBI table id 1).
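To make the idea of a translation table concrete, here is a minimal plain-Python sketch (not Biopython's own implementation). It builds the standard genetic code from the compact 64-letter string that NCBI uses for table 1, in which codons are listed with the bases in the order T, C, A, G. The helper names build_table and translate are hypothetical, introduced only for illustration.

```python
from itertools import product

BASES = "TCAG"  # NCBI codon ordering: TTT, TTC, TTA, TTG, TCT, ...

# Amino acids of NCBI table 1 (standard code), one letter per codon; '*' = stop.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each of the 64 codons to its amino acid letter."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    return "".join(table[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


standard = build_table(STANDARD_AAS)
protein = translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", standard)
print(protein)  # MAIVMGR*KGAR*
```

A different genetic code is just a different 64-letter string, which is why each NCBI table can be identified by a name or an ID.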

Translation tables

Difference between two tables

Start and Stop codons

Translation by Biopython:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> messenger_rna = Seq("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG", IUPAC.unambiguous_rna)
>>> messenger_rna
Seq('AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG', IUPACUnambiguousRNA())
>>> messenger_rna.translate()
Seq('MAIVMGR*KGAR*', HasStopCodon(IUPACProtein(), '*'))

Example: translating directly from the coding-strand DNA sequence:

>>> from Bio.Seq import Seq
>>> from Bio.Alphabet import IUPAC
>>> coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", IUPAC.unambiguous_dna)
>>> coding_dna
Seq('ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG', IUPACUnambiguousDNA())
>>> coding_dna.translate()
Seq('MAIVMGR*KGAR*', HasStopCodon(IUPACProtein(), '*'))
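Note that the Bio.Alphabet module was removed in Biopython 1.78, so the imports above only work on older releases. The sketch below omits the alphabet argument, which is how Seq objects are created on newer releases. It also checks that the mRNA and the coding-strand DNA give the same protein.

```python
from Bio.Seq import Seq

# No alphabet argument: Seq infers nothing and simply stores the letters.
messenger_rna = Seq("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

protein_from_rna = str(messenger_rna.translate())
protein_from_dna = str(coding_dna.translate())

print(protein_from_rna)                      # MAIVMGR*KGAR*
print(protein_from_rna == protein_from_dna)  # True
```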

Example: notice that the protein in the previous example contains an internal stop codon. To translate only up to the first in-frame stop codon, pass to_stop=True:

from Bio.Seq import Seq
from Bio.Alphabet import IUPAC

coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG", IUPAC.unambiguous_dna)
print(coding_dna.translate())
print(coding_dna.translate(to_stop=True))
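For reference, here is a small sketch of what the two calls return. It assumes a Biopython installation and is written without the alphabet argument so that it also runs on releases where Bio.Alphabet is no longer available.

```python
from Bio.Seq import Seq

coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

# Default: translate the whole sequence, marking each stop codon with '*'.
full = str(coding_dna.translate())

# to_stop=True: stop at the first in-frame stop codon and leave it out.
up_to_stop = str(coding_dna.translate(to_stop=True))

print(full)        # MAIVMGR*KGAR*
print(up_to_stop)  # MAIVMGR
```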


You can also specify the table using its NCBI table number, which is shorter than the table name:

>>> coding_dna.translate(table=2)
Seq('MAIVMGRWKGAR*', HasStopCodon(IUPACProtein(), '*'))
>>> coding_dna.translate(table=2, to_stop=True)
Seq('MAIVMGRWKGAR', IUPACProtein())

Notice that when you use the to_stop argument, the stop codon itself is not translated, and the stop symbol is not included at the end of your protein sequence.
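As a quick check that the table name and the table number select the same genetic code, the following sketch translates the same sequence both ways. It assumes a Biopython installation and omits the alphabet argument.

```python
from Bio.Seq import Seq

coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")

# NCBI table 2 is named "Vertebrate Mitochondrial".
by_name = str(coding_dna.translate(table="Vertebrate Mitochondrial"))
by_id = str(coding_dna.translate(table=2))

print(by_name)           # MAIVMGRWKGAR*
print(by_name == by_id)  # True
```

In this table TGA codes for tryptophan (W) rather than a stop, which is why the internal '*' of the standard translation disappears.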


Working with translation tables in Biopython:

>>> from Bio.Data import CodonTable
>>> standard_table = CodonTable.unambiguous_dna_by_name["Standard"]
>>> mito_table = CodonTable.unambiguous_dna_by_name["Vertebrate Mitochondrial"]


Alternatively, these tables are labeled with ID numbers 1 and 2, respectively:

>>> from Bio.Data import CodonTable
>>> standard_table = CodonTable.unambiguous_dna_by_id[1]
>>> mito_table = CodonTable.unambiguous_dna_by_id[2]
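To see exactly where these two tables differ, the sketch below walks over all 64 codons and compares what each table assigns. It relies on the stop_codons and forward_table attributes of the table objects; the helper amino_acid is a hypothetical name introduced for illustration, and forward_table is assumed to list only the sense (non-stop) codons.

```python
from itertools import product

from Bio.Data import CodonTable

standard_table = CodonTable.unambiguous_dna_by_id[1]
mito_table = CodonTable.unambiguous_dna_by_id[2]


def amino_acid(table, codon):
    """Return the one-letter amino acid for a codon, or '*' for a stop codon."""
    if codon in table.stop_codons:
        return "*"
    return table.forward_table[codon]


differences = {}
for bases in product("TCAG", repeat=3):
    codon = "".join(bases)
    std = amino_acid(standard_table, codon)
    mito = amino_acid(mito_table, codon)
    if std != mito:
        differences[codon] = (std, mito)

for codon, (std, mito) in sorted(differences.items()):
    print(codon, std, "->", mito)
```

Only four codons change: AGA and AGG become stop codons, ATA codes for methionine instead of isoleucine, and TGA codes for tryptophan instead of a stop.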


You can print the tables using:

>>> print(standard_table)
>>> print(mito_table)

You may find the following properties useful, for example if you are trying to do your own gene finding:

>>> mito_table.stop_codons
['TAA', 'TAG', 'AGA', 'AGG']
>>> mito_table.start_codons
['ATT', 'ATC', 'ATA', 'ATG', 'GTG']
>>> mito_table.forward_table["ACG"]
'T'
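As a rough illustration of how start and stop codon lists can drive simple gene finding, here is a minimal plain-Python sketch. It scans only the three forward reading frames and is not a real gene finder; find_orfs is a hypothetical helper, and the codon lists are copied from the mito_table output above.

```python
def find_orfs(dna, start_codons, stop_codons):
    """Return (start, end) index pairs of open reading frames on the forward strand.

    An ORF opens at the first start codon seen in a frame and closes at the next
    in-frame stop codon. `end` is exclusive and includes the stop codon.
    ORFs that never reach a stop codon are ignored.
    """
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if orf_start is None and codon in start_codons:
                orf_start = i
            elif orf_start is not None and codon in stop_codons:
                orfs.append((orf_start, i + 3))
                orf_start = None
    return orfs


dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

# Vertebrate mitochondrial start and stop codons (NCBI table 2).
mito_starts = ["ATT", "ATC", "ATA", "ATG", "GTG"]
mito_stops = ["TAA", "TAG", "AGA", "AGG"]
mito_orfs = find_orfs(dna, mito_starts, mito_stops)
print(mito_orfs)  # [(0, 39)]

# Standard stop codons; using ATG as the only start codon is a simplification.
standard_orfs = find_orfs(dna, ["ATG"], ["TAA", "TAG", "TGA"])
print(standard_orfs)  # [(0, 24)]
```

The same sequence yields a longer reading frame under the mitochondrial code because TGA is not a stop codon there.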

H.W.1:

Suppose you have this sequence:

ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG

defined with the alphabet IUPAC.unambiguous_dna. Transcribe it in two ways: first using Biopython's built-in function, and second using plain Python only. You need to import:

from Bio.Seq import Seq
from Bio.Alphabet import IUPAC

H.W.2:

Using the genetic code table, find the amino acid corresponding to the codon GAC. Use the table CodonTable.unambiguous_dna_by_id[2]. You need to import:

from Bio.Data import CodonTable
