Phytome: A Data Analysis Pipeline, presented by Jason Phillips
Posted on 19-Dec-2015
![Page 1: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/1.jpg)
Phytome
A Data Analysis Pipeline
presented by Jason Phillips
![Page 2: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/2.jpg)
High Level Flow Chart
Retrieve Unigenes → Translate Unigenes → Families
![Page 3: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/3.jpg)
Main Outline
● Unigenes (Where'd they come from, where'd they go?)
● Translation (methods and procedures)
● Building Families (the power of together-ness)
![Page 4: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/4.jpg)
phytome » Unigene
● What are?
● Where from?
● Nine Species
● Arabidopsis, a special case
● Storage
![Page 5: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/5.jpg)
phytome » Unigene » What Are?
Combined ESTs that overlap
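As a rough illustration of "combined ESTs that overlap", the sketch below merges two reads when the end of one matches the start of the other. It is a toy under stated assumptions, not the assembly procedure used to build the unigenes: the function name `merge_overlapping`, the `min_overlap` parameter, and the example reads are all hypothetical, and real assemblers also handle sequencing errors, reverse complements, and many reads at once.

```python
from typing import Optional


def merge_overlapping(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two reads if a suffix of `left` equals a prefix of `right`.

    Hypothetical helper for illustration only: exact matches, one strand,
    longest overlap wins. Returns None when no overlap of at least
    `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            # Keep the shared region once and append the rest of `right`.
            return left + right[k:]
    return None


# Made-up reads that share the four bases "TGCA".
est_a = "ACGTTGCA"
est_b = "TGCAGGT"
combined = merge_overlapping(est_a, est_b)
```

Two reads with no shared end, such as `"AAAA"` and `"CCCC"`, are left unmerged and the function returns `None`.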
![Page 6: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/6.jpg)
phytome » Unigene » Where From?
● TIGR
● Other sources?
![Page 7: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/7.jpg)
phytome » Unigene » Nine Species
![Page 8: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/8.jpg)
phytome » Unigene » Arabidopsis
Highly annotated...
Highly sequenced...
Highly translated...
![Page 9: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/9.jpg)
phytome » Unigene » Storage
species   count
-------------------
ghir      24350
mcry       8455
osat      60778
hann      20520
mtru      36976
lesc      31012
ljap      11025
lsat      21960
atha      27170
-------------------
total:   242246
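The storage counts can be checked with a few lines of Python. This is only a sketch: the dictionary simply copies the per-species numbers from the slide, and the variable names are illustrative rather than taken from the Phytome code base.

```python
# Per-species unigene counts, copied from the storage slide.
unigene_counts = {
    "ghir": 24350,
    "mcry": 8455,
    "osat": 60778,
    "hann": 20520,
    "mtru": 36976,
    "lesc": 31012,
    "ljap": 11025,
    "lsat": 21960,
    "atha": 27170,
}

# Sum of all species; the slide reports 242246.
total = sum(unigene_counts.values())

# Species contributing the most unigenes, and each species' share of the total.
largest = max(unigene_counts, key=unigene_counts.get)
share = {species: count / total for species, count in unigene_counts.items()}

for species, count in sorted(unigene_counts.items(), key=lambda item: -item[1]):
    print(f"{species}  {count:6d}  {share[species]:.1%}")
print(f"total {total}")
```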
![Page 10: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/10.jpg)
phytome » Translation
● Methods
– Estwise
– Estscan
– FrameFinder
● Procedure
● Numbers
![Page 11: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/11.jpg)
phytome » Translation » methods
EST-WISE: homologies via BLAST (sprot + trembl)
ESTSCAN, FRAMEFINDER: ab initio
![Page 12: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/12.jpg)
phytome » Translation » procedure
● EST-WISE (Mac OS X Cluster)
– blast swiss prot: 10.3 hours, 35 nodes (~15 days)
– blast trembl: 35.7 hours, 35 nodes (~52 days)
● ESTSCAN (Mustard)
● FrameFinder (Mustard)
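The figures in parentheses appear to be the cluster wall-clock time multiplied by the node count, expressed in days on a single node. That reading is an assumption, but the arithmetic agrees with both BLAST runs. A minimal sketch, with `single_node_days` as a hypothetical helper name:

```python
def single_node_days(wall_clock_hours: float, nodes: int) -> float:
    """Convert a cluster run into the equivalent days on one node.

    ASSUMPTION: perfect scaling, i.e. one node would need
    wall_clock_hours * nodes hours to do the same work.
    """
    return wall_clock_hours * nodes / 24.0


# Wall-clock hours and node counts as given on the slide.
swiss_prot_days = single_node_days(10.3, 35)
trembl_days = single_node_days(35.7, 35)

print(f"blast swiss prot: ~{swiss_prot_days:.0f} days on one node")
print(f"blast trembl:     ~{trembl_days:.0f} days on one node")
```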
![Page 13: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/13.jpg)
phytome » Translation » numbers
242,246 Unigenes

method        translated   not translated
-----------   ----------   --------------
ESTWISE       151,830      90,416
ESTSCAN       226,988      15,258
FRAMEFINDER   242,242      4
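The smaller numbers on this slide are the unigenes each method failed to translate, which is simply the 242,246 total minus the translated count. The sketch below recomputes them and adds a coverage fraction; the variable names are illustrative and the coverage figure is not on the slide.

```python
TOTAL_UNIGENES = 242246

# Translated sequences per method, as given on the slide.
translated = {
    "estwise": 151830,
    "estscan": 226988,
    "framefinder": 242242,
}

# Unigenes with no translation, and the fraction each method covers.
untranslated = {method: TOTAL_UNIGENES - n for method, n in translated.items()}
coverage = {method: n / TOTAL_UNIGENES for method, n in translated.items()}

for method in translated:
    print(
        f"{method:12s} translated={translated[method]:6d} "
        f"untranslated={untranslated[method]:6d} coverage={coverage[method]:.1%}"
    )
```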
![Page 14: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/14.jpg)
phytome » Families
● Relationships
● Clustering
● Numbers
![Page 15: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/15.jpg)
phytome » Families » Relationships
Blast everything against everything

sequences → blastable db of sequences

query        sbjct        e-value
----------   ----------   -----------
mtru302      ljap4523     1 29
mtru302      lesc25072    1 26
mtru302      hann20270    5 24
osat59606    osat59606    1 157
osat59606    osat4002     1 96
osat59606    atha25166    1 88
...          ...          ...
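Rows like these are easy to load for downstream filtering. The sketch below parses the six example hits from the slide. It assumes the two e-value numbers are a mantissa and a negated base-10 exponent (so `1 29` would mean 1e-29); the slide does not say this explicitly. The `Hit` type and `parse_hits` function are hypothetical names, not part of any BLAST tool.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str
    sbjct: str
    mantissa: int
    exponent: int

    @property
    def evalue(self) -> float:
        # ASSUMPTION: "1 29" on the slide stands for 1e-29.
        return self.mantissa * 10.0 ** -self.exponent


def parse_hits(text: str) -> List[Hit]:
    """Parse whitespace-separated 'query sbjct mantissa exponent' rows.

    Header, rule, and ellipsis lines are skipped because they do not
    have four fields ending in two integers.
    """
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue
        hits.append(Hit(fields[0], fields[1], int(fields[2]), int(fields[3])))
    return hits


SLIDE_ROWS = """\
query sbjct e-value
mtru302 ljap4523 1 29
mtru302 lesc25072 1 26
mtru302 hann20270 5 24
osat59606 osat59606 1 157
osat59606 osat4002 1 96
osat59606 atha25166 1 88
"""

hits = parse_hits(SLIDE_ROWS)
# An all-against-all search also reports each sequence hitting itself.
non_self_hits = [hit for hit in hits if hit.query != hit.sbjct]
```

Dropping self-hits such as `osat59606` against `osat59606` is a common first step before clustering, although the slides do not state whether the pipeline does so.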
![Page 16: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/16.jpg)
phytome » Families » Relationships
But we have 4 sets of sequences!
program   sequences   data set
-------   ---------   -----------
tblastx   242,246     nucleotides
blastp    151,830     estwise
blastp    226,988     estscan
blastp    242,242     framefinder
Which method do we trust?
![Page 17: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/17.jpg)
phytome » Families » Relationships
4 data sets... 4 family interpretations
tb: ~3 days, 28 nodes (~84 days)
ew: ~1/4 day, 21 nodes (~5 days)
es: ~1/4 day, 21 nodes (~5 days)
ff: ~1/4 day, 21 nodes (~5 days)
BLAST OFF!
![Page 18: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/18.jpg)
phytome » Families » Relationships
BLAST RESULTS

Method   size     no blast   no trans   attrition
------   ------   --------   --------   ---------
tb       242246   153        0          153
ew       151830   22         90416      90438
ff       242242   24563      4          24567
es       226988   1345       15258      16603
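In this table, attrition is the sum of the "no blast" and "no trans" columns, and size plus "no trans" equals the 242,246 starting unigenes for every method. The sketch below checks both relations and derives how many sequences are left per method. The "remaining" figure is a derived quantity that does not appear on the slide, and the names are illustrative.

```python
TOTAL_UNIGENES = 242246

# (method, size, no_blast, no_trans, attrition), copied from the slide.
blast_results = [
    ("tb", 242246, 153, 0, 153),
    ("ew", 151830, 22, 90416, 90438),
    ("ff", 242242, 24563, 4, 24567),
    ("es", 226988, 1345, 15258, 16603),
]

remaining = {}
for method, size, no_blast, no_trans, attrition in blast_results:
    # Sequences lost either at translation or by having no BLAST hit.
    assert no_blast + no_trans == attrition
    # Every unigene is either in the data set or failed to translate.
    assert size + no_trans == TOTAL_UNIGENES
    # Derived, not on the slide: sequences that survive both steps.
    remaining[method] = TOTAL_UNIGENES - attrition

for method, count in remaining.items():
    print(f"{method}: {count} sequences remain ({count / TOTAL_UNIGENES:.1%})")
```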
![Page 19: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/19.jpg)
phytome » Families » Clustering
[Figure: TRIBE MCL; labels: evalue, gene]
![Page 20: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/20.jpg)
phytome » Families » Clustering
[Figure: TRIBE MCL; labels: evalue, gene]
![Page 21: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/21.jpg)
phytome » Families » Clustering
query        sbjct        evalue
----------   ----------   ------
atha7499     atha8483     6 78
atha7499     atha7503     4 90
osat23081    atha10704    8 78
osat23081    osat36667    8 78
atha1072     atha5059     2 68
atha1072     lsat15421    2 60
atha1072     lsat21190    1 102
atha1072     atha5059     9 54
...          ...          ...

tribe mcl

fam id   member
------   ---------
....     ......
4035     atha7499
4035     atha7503
4035     atha8483
4036     atha10704
4036     osat23081
4036     osat36667
4037     atha1072
4037     atha5059
4037     lsat15421
4037     lsat21190
....     ......
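To make the step from pairwise hits to families concrete, the sketch below groups the example hits from this slide. It is deliberately not TRIBE MCL: Markov clustering weighs the e-values and can split weakly connected groups, whereas this stand-in just takes connected components with a union-find structure and ignores e-values altogether. On this small example the two approaches happen to agree with the family table, but that should not be expected in general. The function name `connected_families` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def connected_families(pairs: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group sequence ids that are linked by at least one chain of hits.

    Simplified stand-in for TRIBE MCL: plain connected components,
    with no use of e-values.
    """
    parent: Dict[str, str] = {}

    def find(node: str) -> str:
        parent.setdefault(node, node)
        while parent[node] != node:
            # Path halving keeps later lookups short.
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for query, sbjct in pairs:
        root_q, root_s = find(query), find(sbjct)
        if root_q != root_s:
            parent[root_s] = root_q

    groups: Dict[str, Set[str]] = defaultdict(set)
    for member in list(parent):
        groups[find(member)].add(member)
    return list(groups.values())


# Query/subject pairs from the slide (e-values omitted).
example_hits = [
    ("atha7499", "atha8483"),
    ("atha7499", "atha7503"),
    ("osat23081", "atha10704"),
    ("osat23081", "osat36667"),
    ("atha1072", "atha5059"),
    ("atha1072", "lsat15421"),
    ("atha1072", "lsat21190"),
    ("atha1072", "atha5059"),
]

families = connected_families(example_hits)
for members in families:
    print(sorted(members))
```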
![Page 22: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/22.jpg)
phytome » Families » Clustering
[Figure: blast results (tb, ff, es, ew) → TRIBE MCL → families (tb, ff, es, ew)]
![Page 23: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/23.jpg)
phytome » Families » Clustering
Let's look at some histograms!
![Page 24: Phytome A Data Analysis Pipline presented by Jason Phillips](https://reader036.vdocument.in/reader036/viewer/2022062308/56649d2b5503460f94a00231/html5/thumbnails/24.jpg)
What should we do next round?