Transcriptome analysis using Open Reading Frame ESTs (ORESTES)
Emmanuel Dias Neto, PhD
Lab of Neurosciences, LIM-27
Instituto de Psiquiatria, Faculdade de Medicina
Universidade de São Paulo, SP - BRAZIL
UNESCO - First North-South Human Genome Conference
• Caxambú, MG, Brazil - 1993
• Is there a way to integrate the research performed in developing countries with the US/Europe ‘Human Genome Project’?
• After the completion of the ‘Human Genome sequencing’, how can we gain access to or make use of the technology developed?
How can we learn?
• Initiate an EST sequencing project of a parasite of local importance (Schistosoma mansoni)
• cDNA libraries prepared with Marcelo Bento Soares
• cDNA sequencing performed at TIGR (Craig Venter)
• Some 1,000 ESTs generated
ESTs
“Expressed Sequence Tags”
Partial sequences, usually derived from the ends of cDNA molecules.
[Figure: a 4 kb cDNA drawn 5’ to 3’ with its open reading frame (ORF), made with oligo dT primers and adaptors and cloned into a vector (insert ~3 kb); sequencing primers give ~500 nt ESTs (HGP) from each end]
Main problems found
• Repetitive sequencing of highly expressed genes: high redundancy (~60%)
• Need for large amounts of mRNA in order to obtain a normalized library
• Little information gained from sequences with no database matches
Gene expression in a typical eukaryotic cell
Class          Abundance/gene   Diversity
Abundant       12,000           <10
Intermediate   300              500
Rare           10               11,000
Huang et al., 1999
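The figures above hint at why random EST sequencing is so redundant. The sketch below estimates what share of the mRNA pool and what share of the distinct genes each class represents. It assumes that "Abundance/gene" means mRNA copies per gene and "Diversity" means the number of distinct genes, and it takes "<10" as 10; these readings and the helper name `class_shares` are assumptions made for illustration, not part of the slides.

```python
# (class, assumed mRNA copies per gene, assumed number of distinct genes)
# "<10" abundant genes is taken as 10 here, which is an assumption.
CLASSES = [
    ("Abundant", 12_000, 10),
    ("Intermediate", 300, 500),
    ("Rare", 10, 11_000),
]


def class_shares(classes):
    """Return {class: (share of all mRNA molecules, share of all genes)}."""
    total_mrna = sum(copies * genes for _, copies, genes in classes)
    total_genes = sum(genes for _, _, genes in classes)
    return {
        name: (copies * genes / total_mrna, genes / total_genes)
        for name, copies, genes in classes
    }


if __name__ == "__main__":
    for name, (mrna_share, gene_share) in class_shares(CLASSES).items():
        print(f"{name:<13} mRNA share {mrna_share:6.1%}   gene share {gene_share:6.1%}")
```

Under these assumptions the rare class holds the large majority of genes but well under half of the mRNA molecules, so a clone picked at random is more likely to come from an abundant or intermediate gene.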
Alternative protocol to generate ESTs
• Is there a way to tag rare genes?
• How to generate data from small amounts of mRNA?
• Is it possible to tag the central portion of the transcripts?
Ideas
• A PCR-based strategy should enable the analysis of small amounts of mRNA.
• Using randomly selected primers (in RT-PCR) at low stringency as a means to evaluate other regions of the transcripts...
Randomly selected primers
ORESTES
Factors that contribute to the presence of a gene in a cDNA library
Abundance
Nucleotide diversity
[Figure: usual cDNA libraries compared with ORESTES]
ORESTES - the data: normalization
Covering a transcript with ORESTES
- The amplification of a gene region requires primer binding on both sides of a point.
- The chance of a primer binding depends on the size of the sequences flanking the amplification point.
- If the size of a transcript is taken as 1 and the distance of the point from the 3’ end is taken as S, the probability (P) of an appropriate amplification of that point is P = S(1-S).
Coverage of the central point = 0.5 x (1 - 0.5) = 0.5 x 0.5 = 0.25 = 25%
Coverage of the last 10% of a transcript = 0.1 x 0.9 = 0.09 = 9%
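The rule P = S(1-S) can be tabulated along a transcript to see how strongly it favours the central region. The following is a minimal sketch; the function name `coverage_probability` and the chosen sample positions are illustrative assumptions, not part of the original slides.

```python
def coverage_probability(s: float) -> float:
    """P = S(1 - S): chance of amplifying a point at fractional distance s
    from the 3' end of a transcript whose length is normalized to 1."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie between 0 and 1")
    return s * (1.0 - s)


if __name__ == "__main__":
    # Sample positions chosen for illustration only.
    for s in (0.1, 0.25, 0.5, 0.75, 0.9):
        print(f"S = {s:.2f} -> P = {coverage_probability(s):.4f}")
```

The function is symmetric around S = 0.5, so points equally distant from the two ends get the same probability, and the value peaks at the centre of the transcript.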
Position of matches
ORESTES - sequence distribution
ORESTES - the data: comparison with dbEST data
Project Organisation
Coordination: LICR
Sequencing Centers: FM-USP, UNICAMP, EPM, FM-USP/RP, IQ-USP
Project Organisation
Dissected tissue samples
Dept. of Pathology, Hospital A.C. Camargo
RNA coordination, LICR/SP
Preparation and validation of all mRNAs to be used
Library coordination, LICR/SP
• cDNA synthesis and amplification
• ORESTES production and development
• ORESTES sequencing
Fernando Costa (CM), Sérgio Verjovski (QV), Christine Hackel, Arthur Gruber, Helaine Carrer/Dirce Carraro, Mari Cleide Sogayar, Ma Fátima Sonati, Edna Kimura, Gonçalo G. Pereira, Hamza FA El-Dorry
Maria Aparecida Nagai (MR), Marco Antônio Zago (RC), Angelita Gama, Enilza Espeáfrico, Daniel Gianella Neto, Gustavo H Goldman, Suely KN Marie, Ma Luísa Paçó-Larson, Elizabeth Martins, Paulo L. Hoo, Vanderlei Rodrigues
Eloiza Tajara, Marcelo Briones (PM), Sandro Valentini, Rui MB Maciel, Luis Eduardo Andrade, Ismael DG Silva, João Bosco Pesquero
Maria Inês Pardini (IL2), Marina Nóbrega (IL3), Sílvia Rogatto (IL5)
Using ORESTES to help define the complete set of genes expressed in different human tissues/tumours
Generation of Colon ESTs
[Bar chart: number of ESTs (axis 0 to 120,000) for Normal, Tumor and Total; series CGAP/NIH and HCGP]
HCGP vs CGAP: 2.1x more sequences
Generation of Stomach ESTs
[Bar chart: number of ESTs (axis 0 to 60,000) for Normal, Tumor and Total; series CGAP/NIH and HCGP]
HCGP vs CGAP: 2.5x more sequences
Generation of Breast ESTs
[Bar chart: number of ESTs (axis 0 to 160,000) for Normal, Tumor and Total; series CGAP/NIH and HCGP]
HCGP vs CGAP: 9.1x more sequences
Generation of Head and Neck ESTs
[Bar chart: number of ESTs (axis 0 to 250,000) for Normal, Tumor and Total; series CGAP/NIH and HCGP]
HCGP vs CGAP: 34.4x more sequences
Next challenge
Data Information
The Head & Neck transcriptome initiative
Transcriptional level
Tumor Suppressor genes
- Clusters composed of sequences exclusively derived from normal samples
- Clusters mapping to genomic regions of frequent loss (LOH) in H&N tumours
Total = 78 clusters
Looking for putative tumour suppressor genes
Transcriptional level
Oncogenes
- Clusters composed of sequences exclusively derived from tumour samples
- Clusters mapping to genomic regions frequently amplified in H&N tumours
Total = 271 clusters
Looking for putative oncogenes
Differential gene expression in Larynx tumors
[Bar chart: axis 0 to 3.5; genes el768, R1cde, R5dfr, tdrf43; series Larynx normal and Larynx tumor]
Differential gene expression in Pharynx tumors
[Bar chart: axis 0 to 4; genes el768, R1cde, R5dfr, tdfr43; series Pharynx normal and Pharynx tumor]
Differential gene expression in Oral cavity tumors
[Bar chart: axis 0 to 10; genes el768, R1cde; series Oral cavity normal and Oral cavity tumor]
Differential gene expression in Oral cavity tumors
[Bar chart: axis 0 to 350; genes R5dfr, tdfr43; series Oral cavity normal, Oral cavity tumor and Tonsil tumor]
Transcriptional level
Human gene
HSD00365 - TCGTTATGCCAGTGAAAATGTCAACAAATTGTTGGTAGGGAACAAATGTGA
RC5-BT0377-030200-012-A06 - .........................a.........................
PM2-BT0723-090201-010-c07 - .........................c.........................
PM2-BT0723-130900-002-c07 - .........................c.........................
MR3-GN0190-301100-004-e08 - .........................c.........................
MR4-ET0140-220101-004-d02 - .........................c.........................
MR4-EN0075-220101-006-d02 - .........................c.........................
IL2-FT0160-070800-121-C02 - .........................a.........................
MR4-ET0140-190201-007-h04 - .........................a.........................
MR0-RT0037-121200-004-d02 - .........................a.........................
CM1-HN0016-161100-568-c06 - .........................a.........................
QV3-BN0046-150300-121-a12 - .........................a.........................
QV3-DT0045-210100-063-f03 - .........................a.........................
QV2-NN0045-220800-323-d03 - .........................a.........................
IL5-UM0067-240300-051-g06 - .........................a.........................
CM4-HN0021-241100-457-h02 - .........................a.........................
MR0-RT0037-011200-002-a07 - .........................a.........................
MR2-UM0060-030400-103-g02 - .........................a..............g..........
PM0-IT0018-091100-001-e02 - .........................a.........................
PM1-MT0143-101100-003-a06 - .......*.................a.........................
PM1-MT0143-101100-003-f11 - .........................a.........................
Homo sapiens RAB1, member RAS oncogene family (RAB1), mRNA
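An alignment of this kind can be scanned column by column to find positions where reads disagree with the reference. The sketch below uses the reference sequence shown above and four of the reads (two showing "a" and two showing "c" at the highlighted column); it assumes that "." means "same base as the reference" and rebuilds the rows as 25 dots, the base, and 25 dots. The function name `variable_columns` is hypothetical.

```python
from collections import Counter

# Reference segment (HSD00365) as shown in the alignment above.
REFERENCE = "TCGTTATGCCAGTGAAAATGTCAACAAATTGTTGGTAGGGAACAAATGTGA"

# Four reads from the alignment; "." is assumed to mean "same as reference".
READS = {
    "RC5-BT0377-030200-012-A06": "." * 25 + "a" + "." * 25,
    "PM2-BT0723-090201-010-c07": "." * 25 + "c" + "." * 25,
    "PM2-BT0723-130900-002-c07": "." * 25 + "c" + "." * 25,
    "IL2-FT0160-070800-121-C02": "." * 25 + "a" + "." * 25,
}


def variable_columns(reference, reads):
    """Return {column: Counter of bases} for every column in which at least
    one read carries a base different from the reference."""
    result = {}
    for col, ref_base in enumerate(reference):
        bases = Counter()
        for row in reads.values():
            symbol = row[col]
            bases[ref_base if symbol == "." else symbol.upper()] += 1
        if any(base != ref_base for base in bases):
            result[col] = bases
    return result


if __name__ == "__main__":
    for col, bases in variable_columns(REFERENCE, READS).items():
        print(f"column {col + 1}: reference {REFERENCE[col]}, observed {dict(bases)}")
```

With these four reads only one column is reported, the one carrying the A/C difference. A real analysis would also need to weigh sequencing errors, which this sketch ignores.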
Type: Non-synonymous
Codon: aaa -> caa
Nucleotide: A -> C
Amino acid: K (lysine) -> Q (glutamine)
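The classification of the codon change listed above (aaa to caa) can be checked against the standard genetic code. This is a minimal sketch: the table is built from the standard code in TCAG order, and the function name `classify_codon_change` is an illustrative assumption rather than a tool used in the project.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Translate both codons and report whether the change alters the amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


if __name__ == "__main__":
    print(classify_codon_change("aaa", "caa"))
```

For comparison, a change of aaa to aag would leave the amino acid as lysine and would be reported as synonymous.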
"You have made your way from worm to man, but much within you is still worm" (Friedrich Nietzsche, Zarathustra's Prologue)
S. japonicum
43,707 ESTs
28,839 adult worms
14,868 eggs
Schisto ESTs in Public databases
[Bar chart: number of ESTs (axis 0 to 160,000) by year (2000 and 2003); series S. japonicum and S. mansoni]
New drugs ??
Trans R Soc Trop Med Hyg. 2002 Sep-Oct;96(5):465-9.
Acknowledgements
Bioinformatics
- F. Tsukumo, M. Carazolli and G. Pereira (UNICAMP)
- EM Reis, A. Silva, S. Verjovski (IQ/USP)
- WA Silva Jr, MA Zago (USP/RP)
Clinical Group
- André, M. Giuliano, LP Kowalski (H.Câncer)
Genomics & Molecular genetics
- FAD Nunes (FO/USP)
- MM Brentani, Simone, Fátima, E Miracca, MA Nagai (FM/USP)
- DN Nunes, C Colin, MH Bengston, K Marsirer, MC Sogayar (IQ/USP)
- E Kimura, S Leoni (ICB/USP)
- JM Cerutti, GS Guimarães, R Maciel (UNIFESP)
- E Tajara, Ulises, P Rahal (UNESP/SJR Preto)
- S Rogatto, C Rainho (UNESP/Botucatu)
- S Valentim, José Eduardo, Glória (UNESP/Araraquara)
- FG Nóbrega, M Nóbrega (UNIVAP)
- EPB Ojopi, PEM Guimarães (IPq/USP)
- F Costa, F Lopes (Unicamp)
- MCR Costa (USP/RP)
Emmanuel Dias Neto, PhD
Laboratory of Neurosciences, Institute and Dept. of Psychiatry
Faculdade de Medicina, University of São Paulo
São Paulo, SP