
Post on 15-Jan-2016


Transcriptome analysis using Open Reading frame ESTs (ORESTES)

Emmanuel Dias Neto, PhD

Lab of Neurosciences, LIM-27

Instituto de Psiquiatria, Faculdade de Medicina

Universidade de São Paulo, SP - BRAZIL

UNESCO - First North-South Human Genome Conference

• Caxambú, MG, Brazil - 1993

• Is there a way to integrate the research performed in developing countries with the US/Europe ‘Human Genome Project’?

• After the completion of the ‘Human Genome sequencing’, how can we gain access to or make use of the technology developed?

How can we learn?

• Initiate an EST sequencing project of a parasite of local importance (Schistosoma mansoni)

• cDNA libraries prepared with Marcelo Bento Soares

• cDNA sequencing performed at TIGR (Craig Venter)

• Some 1,000 ESTs generated

ESTs

“Expressed Sequence Tags”

Partial sequences, usually derived from the ends of cDNA molecules.

[Figure: EST generation. Labels: 500 nt reads at each end of a 4 kb cDNA (5’ to 3’); open reading frame (ORF); oligo dT primers; cDNA adaptors; ESTs (HGP); vector with ~3 kb insert; sequencing primers.]

Main problems found

• Repetitive sequencing of highly expressed genes: high redundancy (~60%)

• Need for large amounts of mRNA in order to obtain a normalized library

• Little information gained from sequences with no database matches

Gene expression in a typical eukaryotic cell

Class          Abundance/gene   Diversity
Abundant       12,000           <10
Intermediate   300              500
Rare           10               11,000

Huang et al., 1999
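As a rough worked example of why random clone picking is dominated by abundant transcripts, the figures above can be turned into each class's share of the mRNA pool. This is only a sketch and not part of the original slides: it assumes "Abundance/gene" means mRNA copies per gene and "Diversity" means the number of distinct genes in the class, and it treats the "<10" abundant genes as exactly 10 (an upper bound). All function and variable names are illustrative.

```python
# (copies per gene, number of genes) for each class, as quoted on the slide.
# ASSUMPTION: the "<10" abundant genes are counted as exactly 10.
CLASSES = {
    "abundant": (12_000, 10),
    "intermediate": (300, 500),
    "rare": (10, 11_000),
}


def total_molecules(classes):
    """Total mRNA molecules summed over all classes."""
    return sum(copies * genes for copies, genes in classes.values())


def class_shares(classes):
    """Fraction of all mRNA molecules contributed by each class."""
    total = total_molecules(classes)
    return {name: copies * genes / total
            for name, (copies, genes) in classes.items()}


def per_gene_share(classes, name):
    """Chance that a randomly picked molecule comes from ONE gene of a class."""
    copies, _genes = classes[name]
    return copies / total_molecules(classes)


if __name__ == "__main__":
    for name, share in class_shares(CLASSES).items():
        print(f"{name:>12}: {share:.1%} of molecules, "
              f"{per_gene_share(CLASSES, name):.4%} per single gene")
```

Under these assumptions the roughly 11,000 rare genes together supply under a third of the molecules, and a single rare gene is 1,200 times less likely to be picked than a single abundant gene, which is the redundancy problem a normalizing protocol has to address.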

Alternative protocol to generate ESTs

• Is there a way to tag rare genes?

• How to generate data from small amounts of mRNA?

• Is it possible to tag the central portion of the transcripts?

Ideas

• The use of a PCR-based strategy should enable the analysis of small amounts of mRNA.

• Using randomly selected primers (in RT-PCR) at low stringency as a means to evaluate other regions of the transcripts...

Randomly selected primers

ORESTES

Factors that contribute to the presence of a gene in a cDNA library

• Abundance

• Nucleotide diversity

Usual cDNA libraries

ORESTES

ORESTES - the data: normalization

Covering a transcript with ORESTES

- The amplification of a gene region requires primer binding on both sides of a point.

- The chance of a primer binding depends on the size of the sequences flanking the amplification point.

- If the size of a transcript is taken as 1, and the distance of the point from the 3’ end is taken as S:

- The probability (P) of an appropriate amplification of a point is P = S(1-S)

Coverage of the central point = 0.5(1-0.5) = 0.5 x 0.5 = 0.25 = 25%

Coverage of a point 10% from the end of a transcript = 0.1 x 0.9 = 0.09 = 9%
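The coverage model above can be checked numerically. The sketch below simply evaluates P = S(1-S) at a few relative positions; the function name and the chosen sample positions are illustrative and not taken from the original slides.

```python
def coverage_probability(s: float) -> float:
    """Return P = S(1 - S) for a point at relative distance s from the 3' end.

    The transcript length is normalized to 1, so s must lie in [0, 1].
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie between 0 and 1")
    return s * (1.0 - s)


if __name__ == "__main__":
    # Sample positions along the transcript (illustrative choice).
    for s in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"S = {s:.2f}  P = {coverage_probability(s):.4f}")
```

Because S(1-S) is symmetric around S = 0.5 and peaks there, the model predicts the strongest coverage in the middle of the transcript and the weakest at either end.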

Position of matches

ORESTES - sequence distribution

ORESTES - the data: comparison with dbEST data

Project Organisation

Coordination: LICR

Sequencing Centers: FM-USP, UNICAMP, EPM, FM-USP/RP, IQ-USP

Project Organisation

Dissected tissue samples: Dept. of Pathology, Hospital A.C. Camargo

RNA coordination (LICR/SP): preparation and validation of all mRNAs to be used

Library coordination (LICR/SP)

• cDNA synthesis and amplification

• ORESTES production and development

• ORESTES sequencing

Fernando Costa (CM), Sérgio Verjovski (QV), Christine Hackel, Arthur Gruber, Helaine Carrer/Dirce Carraro, Mari Cleide Sogayar, Ma Fátima Sonati, Edna Kimura, Gonçalo G. Pereira, Hamza FA El-Dorry

Maria Aparecida Nagai (MR), Marco Antônio Zago (RC), Angelita Gama, Enilza Espeáfrico, Daniel Gianella Neto, Gustavo H Goldman, Suely KN Marie, Ma Luísa Paçó-Larson, Elizabeth Martins, Paulo L. Hoo, Vanderlei Rodrigues

Eloiza Tajara, Marcelo Briones (PM), Sandro Valentini, Rui MB Maciel, Luis Eduardo Andrade, Ismael DG Silva, João Bosco Pesquero

Maria Inês Pardini (IL2), Marina Nóbrega (IL3), Sílvia Rogatto (IL5)

Using ORESTES to help define the complete set of genes expressed in different human tissues/tumours

Generation of Colon ESTs

[Bar chart: ESTs from normal, tumor and total colon samples, CGAP/NIH vs. HCGP; y-axis 0 to 120,000]

HCGP x CGAP = 2.1x more sequences

Generation of Stomach ESTs

[Bar chart: ESTs from normal, tumor and total stomach samples, CGAP/NIH vs. HCGP; y-axis 0 to 60,000]

HCGP x CGAP = 2.5x more sequences

Generation of Breast ESTs

[Bar chart: ESTs from normal, tumor and total breast samples, CGAP/NIH vs. HCGP; y-axis 0 to 160,000]

HCGP x CGAP = 9.1x more sequences

Generation of Head and Neck ESTs

[Bar chart: ESTs from normal, tumor and total head and neck samples, CGAP/NIH vs. HCGP; y-axis 0 to 250,000]

HCGP x CGAP = 34.4x more sequences

Next challenge

Data → Information

The Head & Neck transcriptome initiative

Transcriptional level

Tumor Suppressor genes

- Clusters composed of sequences exclusively derived from normal samples

- Clusters mapping to genomic regions of frequent loss (LOH) in H&N tumours

Total = 78 clusters

Looking for putative tumour suppressor genes

Transcriptional level

Oncogenes

- Clusters composed of sequences exclusively derived from tumour samples

- Clusters mapping to genomic regions frequently amplified in H&N tumours

Total = 271 clusters

Looking for putative oncogenes

[Bar chart: el768, R1cde, R5dfr and tdrf43 in larynx normal vs. larynx tumor; y-axis 0 to 3.5]

Differential gene expression in Larynx tumors

[Bar chart: el768, R1cde, R5dfr and tdfr43 in pharynx normal vs. pharynx tumor; y-axis 0 to 4]

Differential gene expression in Pharynx tumors

[Bar chart: el768 and R1cde in oral cavity normal vs. oral cavity tumor; y-axis 0 to 10]

Differential gene expression in Oral cavity tumors

[Bar chart: R5dfr and tdfr43 in oral cavity normal, oral cavity tumor and tonsil tumor; y-axis 0 to 350]

Differential gene expression in Oral cavity tumors

Transcriptional level

Human gene

HSD00365 - TCGTTATGCCAGTGAAAATGTCAACAAATTGTTGGTAGGGAACAAATGTGA

RC5-BT0377-030200-012-A06 - .........................a.........................

PM2-BT0723-090201-010-c07 - .........................c.........................

PM2-BT0723-130900-002-c07 - .........................c.........................

MR3-GN0190-301100-004-e08 - .........................c.........................

MR4-ET0140-220101-004-d02 - .........................c.........................

MR4-EN0075-220101-006-d02 - .........................c.........................

IL2-FT0160-070800-121-C02 - .........................a.........................

MR4-ET0140-190201-007-h04 - .........................a.........................

MR0-RT0037-121200-004-d02 - .........................a.........................

CM1-HN0016-161100-568-c06 - .........................a.........................

QV3-BN0046-150300-121-a12 - .........................a.........................

QV3-DT0045-210100-063-f03 - .........................a.........................

QV2-NN0045-220800-323-d03 - .........................a.........................

IL5-UM0067-240300-051-g06 - ..............…..........a.........................

CM4-HN0021-241100-457-h02 - .........................a.........................

MR0-RT0037-011200-002-a07 - .........................a.........................

MR2-UM0060-030400-103-g02 - .........................a..............g..........

PM0-IT0018-091100-001-e02 - .........................a.........................

PM1-MT0143-101100-003-a06 - .......*.................a.........................

PM1-MT0143-101100-003-f11 - .........................a.........................

Homo sapiens RAB1, member RAS oncogene family (RAB1), mRNA

Type: Non-synonymous

Codon: aaa-caa

Nucleotide: A-C

Amino acid: K (lysine) - Q (glutamine)
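The classification above (aaa to caa, lysine to glutamine, non-synonymous) can be reproduced with a small lookup against the standard genetic code. This is a minimal sketch, not the pipeline used in the project; the function name `classify_substitution` and the table layout are illustrative assumptions.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order for each
# of the three positions; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str):
    """Translate both codons and label the change.

    Returns (reference amino acid, variant amino acid, label), where the
    label is "synonymous" when both codons encode the same amino acid.
    """
    ref = CODON_TABLE[ref_codon.upper()]
    alt = CODON_TABLE[alt_codon.upper()]
    label = "synonymous" if ref == alt else "non-synonymous"
    return ref, alt, label


if __name__ == "__main__":
    # The codon change reported on the slide for RAB1.
    print(classify_substitution("aaa", "caa"))
```

A change such as aaa to aag would instead be labelled synonymous, since both codons encode lysine.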

"You have made your way from worm to man but much

within you is still worm" (Friedrich Nietzsche,

Zarathustra's Prologue)

S. japonicum

43,707 ESTs

28,839 adult worms

14,868 eggs

Schisto ESTs in Public databases

[Chart: number of ESTs by year (2000 and 2003) for S. japonicum and S. mansoni; y-axis 0 to 160,000]

New drugs ??

Trans R Soc Trop Med Hyg. 2002 Sep-Oct;96(5):465-9.

Acknowledgements

Bioinformatics

- F. Tsukumo, M. Carazolli and G. Pereira (UNICAMP)
- EM Reis, A. Silva, S. Verjovski (IQ/USP)
- WA Silva Jr, MA Zago (USP/RP)

Clinical Group

- André, M. Giuliano, LP Kowalski (H.Câncer)

Genomics & Molecular genetics

- FAD Nunes (FO/USP)
- MM Brentani, Simone, Fátima, E Miracca, MA Nagai (FM/USP)
- DN Nunes, C Colin, MH Bengston, K Marsirer, MC Sogayar (IQ/USP)
- E Kimura, S Leoni (ICB/USP)
- JM Cerutti, GS Guimarães, R Maciel (UNIFESP)
- E Tajara, Ulises, P Rahal (UNESP/SJR Preto)
- S Rogatto, C Rainho (UNESP/Botucatu)
- S Valentim, José Eduardo, Glória (UNESP/Araraquara)
- FG Nóbrega, M Nóbrega (UNIVAP)
- EPB Ojopi, PEM Guimarães (IPq/USP)
- F Costa, F Lopes (Unicamp)
- MCR Costa (USP/RP)


www.neurociencias.org.br

emmanuel@usp.br

Emmanuel Dias Neto, PhD

Laboratory of Neurosciences, Institute and Dept. of Psychiatry

Faculdade de Medicina, University of São Paulo

São Paulo, SP
