Whole genome sequencing of Arabidopsis thaliana

WHOLE GENOME SEQUENCING OF Arabidopsis thaliana By BHAVYASREE R K

Upload: bhavya-sree

Post on 11-Jan-2017


TRANSCRIPT

Page 1: Whole genome sequencing of arabidopsis thaliana

WHOLE GENOME SEQUENCING OF Arabidopsis thaliana

By BHAVYASREE R K

Page 2: Whole genome sequencing of arabidopsis thaliana

“GENOME SEQUENCING”

• Idea discussed in the scientific community from 1984 onwards
• 1990: Human Genome Project officially began
• Genome sequencing approaches:
– Clone-by-clone sequencing
– Shotgun sequencing

“GENOMICS”
Determination of genetic information and of the mechanism by which this information is used by the organism

“GENOME”
The complete set of genetic information of an organism

Page 3: Whole genome sequencing of arabidopsis thaliana

Genome sequencing projects

Model organisms: mostly used in genetic and scientific studies

• Yeast
• E. coli
• Caenorhabditis elegans
• Drosophila
• Arabidopsis thaliana

Page 4: Whole genome sequencing of arabidopsis thaliana

Arabidopsis: A model plant

Genome size:
– Nuclear: 125 Mb
– Plastid: 154 kb
– Mitochondria: 367 kb

• Small plant belonging to the family Cruciferae
• Large number of offspring and short generation time
• Convenience and abundance
• Low amount of repetitive DNA
• Basic similarities to other crops
• Susceptible to T-DNA insertions
• Relatively small and simple genome

Page 5: Whole genome sequencing of arabidopsis thaliana

ARABIDOPSIS GENOME ANALYSIS: Initiation and progress

• 1983 - first genetic map published
• 1988-89 - publication of RFLP maps
• 1990 - Multinational Coordinated Arabidopsis thaliana Genome Project initiated
• 1991 - first YAC libraries
• 1995-96 - standard BAC and P1 libraries constructed
• 1996 - Arabidopsis Genome Initiative organised and started sequencing
• 1998 - physical maps of all chromosomes completed
• 1999 - sequence and analysis of chromosomes 2 and 4
• 2000 - sequence and analysis of chromosomes 1, 3 and 5
• 2000 - completion of whole genome sequencing

Page 6: Whole genome sequencing of arabidopsis thaliana

This report includes:
– Completed Arabidopsis genome sequence
– Annotation of predicted genes
– Assignment of functional categories
– Chromosomal dynamics and architecture
– Distribution of transposable elements and other repeats
– Extent of lateral gene transfer from organelles
– Comparison of the genome sequence and structure to that of other Arabidopsis accessions and plant species

Page 7: Whole genome sequencing of arabidopsis thaliana

Sequencing strategy

• “Clone-by-clone sequencing” = “hierarchical shotgun sequencing” = “map-based shotgun sequencing”
• It includes:
– Map construction
– Clone selection
– Subclone library construction
– Random shotgun phase
– Directed finishing phase
– Sequence authentication

Page 8: Whole genome sequencing of arabidopsis thaliana

Steps

• Primary substrates: large-insert BAC, P1 and TAC libraries
• Physical maps of the genome of accession Columbia were assembled by restriction fragment ‘fingerprint’ analysis of BAC clones, by hybridization or PCR of sequence-tagged sites (STS), and by hybridization and Southern blotting
• 47,788 BAC clones were end-sequenced to assemble the contigs

Page 9: Whole genome sequencing of arabidopsis thaliana

• 1,569 minimally overlapping BAC, TAC, cosmid and P1 clones (average insert size: 100 kb) were used to assemble 10 contigs; these clones represent the minimum tiling path
• These clones were selected for shotgun sequencing
• To link regions not covered by cloned DNA, or to optimize the minimum tiling path, 22 PCR products were amplified directly from genomic DNA
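Choosing a minimum tiling path can be viewed as an interval-cover problem. The sketch below is only an illustration of that idea, not the procedure the sequencing consortium used: it assumes each clone has already been placed on the physical map as a (name, start, end) interval, and it picks clones greedily. The clone names and coordinates are hypothetical.

```python
from typing import List, Tuple

# (name, start, end) on the physical map, in kb; hypothetical representation
Clone = Tuple[str, int, int]

def minimum_tiling_path(clones: List[Clone], region_start: int, region_end: int) -> List[Clone]:
    """Greedy interval cover.

    At each step, among the clones that start at or before the position
    covered so far, keep the one that extends farthest to the right.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path: List[Clone] = []
    covered = region_start
    i = 0
    while covered < region_end:
        best = None
        # Scan every clone that overlaps or abuts the covered region.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            # No clone extends the path: a gap that would need another source of DNA.
            raise ValueError(f"gap in clone coverage at position {covered}")
        path.append(best)
        covered = best[2]
    return path

# Hypothetical clones covering a 300 kb region
example_clones = [
    ("BAC_A", 0, 100),
    ("BAC_B", 40, 130),
    ("BAC_C", 90, 200),
    ("P1_D", 120, 180),
    ("BAC_E", 190, 300),
]
print([name for name, _, _ in minimum_tiling_path(example_clones, 0, 300)])
```

In this toy example the path skips BAC_B and P1_D because they add no coverage beyond what their neighbours provide. The `ValueError` branch corresponds to regions not covered by cloned DNA, which the slide says were bridged with PCR products.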

Page 10: Whole genome sequencing of arabidopsis thaliana

• The DNA insert of each selected clone is purified and randomly fragmented by physical shearing
• The broken ends are enzymatically repaired
• Fragments of 2-5 kb are size-fractionated and eluted
• They are subcloned into plasmid or M13 vectors

Page 11: Whole genome sequencing of arabidopsis thaliana

• Sequence reads from the plasmid subclones are derived from universal priming sites
• Sufficient redundant data are generated, and the sequence reads are computationally assembled (>99.99% accuracy at 8-10 fold sequence coverage)
• All available sequenced genetic markers were integrated into the sequence assemblies to verify the sequenced contigs
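To give a feel for what 8-10 fold coverage means for a single clone, here is a small sketch. It assumes a 100 kb insert (the average insert size quoted earlier) and a read length of 500 bp, which is an assumption and not a figure from the slides. The uncovered fraction uses the standard Poisson (Lander-Waterman) approximation e^(-c). It describes how much of the clone is sampled by at least one read, not the >99.99% consensus accuracy quoted above.

```python
import math

def reads_needed(target_length_bp: int, read_length_bp: int, coverage: float) -> int:
    """Number of random reads needed to reach the requested average coverage."""
    return math.ceil(coverage * target_length_bp / read_length_bp)

def expected_uncovered_fraction(coverage: float) -> float:
    """Poisson approximation: probability that a given base is hit by no read."""
    return math.exp(-coverage)

INSERT_BP = 100_000  # average insert size from the slides
READ_BP = 500        # assumed read length, not stated in the slides

for c in (8, 10):
    n = reads_needed(INSERT_BP, READ_BP, c)
    gap = expected_uncovered_fraction(c)
    print(f"{c}x coverage: {n} reads, expected uncovered fraction {gap:.2e}")
```

Under these assumptions a 100 kb clone needs roughly 1,600 to 2,000 reads, and the expected unsampled fraction falls well below 0.1%, which is why a separate directed finishing phase only has to close a small number of remaining gaps.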

Page 12: Whole genome sequencing of arabidopsis thaliana

Outcomes of sequencing project

• 115,409,949 bp (~115.4 Mb) were sequenced
• The unsequenced centromeric and ribosomal DNA repeat regions measure roughly 10 Mb
• 25,498 genes were predicted
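The figures above can be cross-checked with a little arithmetic. The sketch below uses only the numbers on this slide; the 10 Mb unsequenced estimate is approximate, so the derived total and percentage are approximate too.

```python
SEQUENCED_BP = 115_409_949          # from the slide
UNSEQUENCED_BP_APPROX = 10_000_000  # "roughly 10 Mb" from the slide
PREDICTED_GENES = 25_498            # from the slide

# Approximate nuclear genome size: sequenced plus unsequenced regions
total_bp = SEQUENCED_BP + UNSEQUENCED_BP_APPROX

# Share of the nuclear genome that was actually sequenced
sequenced_fraction = SEQUENCED_BP / total_bp

# Average spacing of predicted genes within the sequenced regions
bp_per_gene = SEQUENCED_BP / PREDICTED_GENES

print(f"Estimated nuclear genome: {total_bp / 1e6:.1f} Mb")
print(f"Fraction sequenced: {sequenced_fraction:.1%}")
print(f"Average of one predicted gene per {bp_per_gene / 1e3:.1f} kb")
```

The estimated total of about 125.4 Mb agrees with the 125 Mb nuclear genome size given earlier, and the gene count works out to roughly one gene every 4.5 kb of sequenced DNA.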

Page 13: Whole genome sequencing of arabidopsis thaliana
Page 14: Whole genome sequencing of arabidopsis thaliana

Outcomes of sequencing project

• Characterization of the coding regions
• Genome organization and duplication
• Comparative analysis of Arabidopsis accessions
• Comparison of Arabidopsis and other plant genera
• Integration of 3 genomes in the plant cell
• Transposable elements
• rDNA, telomeres and centromeres

Page 15: Whole genome sequencing of arabidopsis thaliana

• Membrane transport
• DNA repair and recombination
• Gene regulation
• Cellular organization
• Development
• Signal transduction
• Recognizing and responding to pathogens
• Photomorphogenesis and photosynthesis
• Metabolism

Page 16: Whole genome sequencing of arabidopsis thaliana

1001 Genomes Project

• Launched at the beginning of 2008 to discover whole-genome sequence variation in 1001 strains
• Each accession is an inbred line with seeds that are freely available from the stock centre
• Hierarchical approach to selecting the 1001 genomes:
– Sampling 10 individuals from each of 10 populations in each of 10 geographical regions throughout Eurasia, plus at least one North African accession (10x10x10+1)
• Sequence information can be used directly in association studies at the biochemical, metabolic, physiological, morphological and whole-plant fitness levels
• The complete genome sequences of over 80 accessions were released in early 2010 by the Max Planck Institute
• Many more have been added by the Salk Institute, the Gregor Mendel Institute and Monsanto
• As of September 2014, over 1,100 lines had been sequenced
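The hierarchical 10x10x10+1 selection described above can be sketched as nested random sampling. This is only an illustration of the sampling design: the catalogue of regions, populations and individuals is entirely made up, and the function name and parameters are hypothetical.

```python
import random
from typing import Dict, List, Sequence

# region -> population -> list of individual accession names (hypothetical layout)
Catalogue = Dict[str, Dict[str, List[str]]]

def hierarchical_sample(
    catalogue: Catalogue,
    n_regions: int = 10,
    n_populations: int = 10,
    n_individuals: int = 10,
    extra: Sequence[str] = (),
    seed: int = 0,
) -> List[str]:
    """Pick regions, then populations within each region, then individuals."""
    rng = random.Random(seed)
    chosen: List[str] = []
    for region in rng.sample(sorted(catalogue), n_regions):
        populations = catalogue[region]
        for pop in rng.sample(sorted(populations), n_populations):
            chosen.extend(rng.sample(populations[pop], n_individuals))
    chosen.extend(extra)  # e.g. the additional North African accession
    return chosen

# Made-up catalogue: 12 regions x 15 populations x 20 individuals
catalogue = {
    f"region{r}": {
        f"r{r}_pop{p}": [f"r{r}_p{p}_ind{i}" for i in range(20)]
        for p in range(15)
    }
    for r in range(12)
}

selection = hierarchical_sample(catalogue, extra=["north_african_accession"])
print(len(selection))  # 10 * 10 * 10 + 1
```

Sampling level by level, instead of drawing 1001 accessions at random from the whole collection, spreads the selection evenly across regions and across populations within each region.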

Page 17: Whole genome sequencing of arabidopsis thaliana
Page 18: Whole genome sequencing of arabidopsis thaliana

References

• Multinational Arabidopsis Steering Committee (2010) The Multinational Coordinated Arabidopsis thaliana Functional Genomics Project: Annual Report 2010.

• Weigel D & Mott R (2009) The 1001 Genomes Project for Arabidopsis thaliana. Genome Biol 10(5): 107.

• http://1001genomes.org/

• Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815.

• Green E D (2001) Strategies for the systematic sequencing of complex genomes. Nature Reviews Genetics 2(8): 573-583.

• Singh B D (2009) Biotechnology: Expanding Horizons. Kalyani, India.

Page 19: Whole genome sequencing of arabidopsis thaliana

THANK YOU