2015 Summer - Araport Project Overview Leaflet


Upload: araport

Post on 17-Aug-2015

Category: Science



Page 1: 2015 Summer - Araport Project Overview Leaflet

Araport is a free online community resource for Arabidopsis and plant research, funded by the NSF (US) and BBSRC (UK) since 2013. Conceived from a community whitepaper [1], Araport integrates a wide range of data types from globally distributed and financially independent data centers into a "one-stop" data shop [2, 3]. Araport is designed to grow with the science community by providing an extensible framework to incorporate new data sources and create user interfaces to consume them. Find out more at https://www.araport.org.

Araport has taken on the responsibility of updating the Arabidopsis genome annotation. In our latest Araport11 data release, Col-0 gene structures were validated and updated using 113 RNA-seq data sets along with annotation contributions from NCBI, MAKER, UniProt, and individual labs. A pre-release version is available on the Araport JBrowse and the FTP site. Find out more at https://www.araport.org/data/araport11.

ThaleMine, an instance of the InterMine biological data warehouse, serves as our central data repository. Araport currently hosts genome annotation integrated with data from UniProt (proteins), NCBI (publications and Gene Reference Into Function, GeneRIF), KEGG (pathways), ATTED (co-expression), BioAnalytic Resource (BAR; array expression and interactions), EPIC-CoGe (epigenomics), PhytoMine (plant orthologs), and others. Arabidopsis germplasm stocks, genotype, and phenotype data are expected to be available in fall 2015. ThaleMine can be used to search genes, gene functions, biological processes, pathways, and publications. Find out more at https://apps.araport.org/thalemine.
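
Because ThaleMine is an InterMine instance, searches like those described above can in principle also be expressed as InterMine path queries. The following is a minimal sketch, not the Araport team's own code: it only builds the query XML and a results URL and does not contact any server. The `/service` suffix, the `genomic` model name, and the `Gene.*` field names are assumptions based on common InterMine conventions and should be checked against the live ThaleMine instance.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Assumed service root: the ThaleMine URL from the text plus InterMine's
# conventional "/service" suffix. Verify against the live site before use.
THALEMINE_SERVICE = "https://apps.araport.org/thalemine/service"


def build_path_query(symbol: str) -> str:
    """Return InterMine path query XML that looks up genes by symbol."""
    query = ET.Element(
        "query",
        {
            "model": "genomic",
            # Space-separated output columns (assumed field names).
            "view": "Gene.primaryIdentifier Gene.symbol Gene.briefDescription",
        },
    )
    ET.SubElement(
        query, "constraint", {"path": "Gene.symbol", "op": "=", "value": symbol}
    )
    return ET.tostring(query, encoding="unicode")


def build_results_url(symbol: str, fmt: str = "json") -> str:
    """Return a URL for the query results endpoint (nothing is fetched)."""
    params = urlencode({"query": build_path_query(symbol), "format": fmt})
    return f"{THALEMINE_SERVICE}/query/results?{params}"


if __name__ == "__main__":
    # "FLC" is used only as a familiar Arabidopsis gene symbol.
    print(build_path_query("FLC"))
    print(build_results_url("FLC"))
```

Keeping query construction separate from the HTTP request makes the query easy to inspect or paste into the ThaleMine query builder before anything is sent.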

Araport uses innovative middleware based on the iPlant Agave platform to support enrollment, discovery, and access to community-developed web services. For example, ThaleMine Gene Report pages use web services to retrieve, in real time, co-expression data from the ATTED project (Japan) and gene expression pictographs from the BAR project (Canada).
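
The federation pattern described above, in which a report page pulls sections from independent remote services at request time, can be sketched roughly as follows. This is an illustration only and does not use the Agave or Adama APIs: `build_gene_report`, the provider names, and the stub payloads are all hypothetical, and the providers are local stand-ins for what would be HTTP calls to remote services.

```python
from typing import Callable, Dict, List

# A provider maps a gene identifier to a list of records. In a real deployment
# each provider would wrap an HTTP call to a remote web service; here they are
# local stubs so the sketch runs offline.
Provider = Callable[[str], List[dict]]


def build_gene_report(gene_id: str, providers: Dict[str, Provider]) -> dict:
    """Call each registered provider on request and merge the results."""
    report = {"gene": gene_id, "sections": {}, "errors": {}}
    for name, fetch in providers.items():
        try:
            report["sections"][name] = fetch(gene_id)
        except Exception as exc:
            # One failing remote source should not break the whole page.
            report["errors"][name] = str(exc)
    return report


def stub_coexpression(gene_id: str) -> List[dict]:
    # Placeholder payload, not real ATTED data.
    return [{"partner": "GENE_B", "score": 0.0}]


def stub_unavailable(gene_id: str) -> List[dict]:
    # Simulates a remote service that cannot be reached.
    raise TimeoutError("remote service did not respond")


if __name__ == "__main__":
    report = build_gene_report(
        "GENE_A",
        {"coexpression": stub_coexpression, "expression_images": stub_unavailable},
    )
    print(report)
```

Recording failures per provider, rather than raising, reflects the fact that federated sources are run by independent groups and may be unavailable at any given moment.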

[Figure panels: ATTED, BAR]

Araport has instantiated a JBrowse genome viewer which houses the Araport11 data release, including updated gene models, RNA-seq profiles representing 11 tissues, transcript isoforms, and other gene evidence. Additional data tracks include large-scale datasets such as 1001 genome variants (real-time from Ensembl), epigenomics (real-time from EPIC-CoGe), TDNA-seq (NCBI/Ecker lab), TAIR10 gene models, and over 70 additional tracks. Find out more at https://apps.araport.org/jbrowse.
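
JBrowse 1.x accepts `loc` and `tracks` query parameters, so a link that opens the viewer at a given region with chosen tracks can be assembled programmatically. The sketch below assumes the Araport instance follows this standard behavior; the track labels are hypothetical placeholders, and the instance may also require a `data` parameter that is not shown here.

```python
from typing import Sequence
from urllib.parse import urlencode

JBROWSE_BASE = "https://apps.araport.org/jbrowse/"  # URL given in the text


def jbrowse_link(chrom: str, start: int, end: int, tracks: Sequence[str]) -> str:
    """Build a JBrowse link for a 1-based, inclusive genomic region."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    params = {"loc": f"{chrom}:{start}..{end}"}
    if tracks:
        # Track labels are comma-separated in the "tracks" parameter.
        params["tracks"] = ",".join(tracks)
    # Keep ':' and ',' readable instead of percent-encoding them.
    return JBROWSE_BASE + "?" + urlencode(params, safe=":,")


if __name__ == "__main__":
    # "TrackA" and "TrackB" are placeholders, not real Araport track labels.
    print(jbrowse_link("Chr1", 1000, 5000, ["TrackA", "TrackB"]))
```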

Araport hosts Science Apps that integrate data from other web sites using the web services model of on-request data exchange. These Science Apps are developed and published to Araport by the science community and the Araport team. Below is a demonstration Science App which retrieves protein-protein interaction data from EBI in real time.

The Arabidopsis Information Portal

Araport11 Genome Annotation

ThaleMine Data Warehouse

Data Federation in ThaleMine

JBrowse Genome Viewer

Science Apps

Page 2: 2015 Summer - Araport Project Overview Leaflet

Another example of a Science App displays KEGG pathway data obtained at run time by web services. Find out more at https://www.araport.org/apps.

Araport is an open source science resource and maintains an active GitHub repository. The community is not only invited to contribute and expand functionality, but is also empowered to do so. Araport developers use cutting-edge technologies such as iPlant, Agave, Adama, git, jQuery, Bootstrap, Docker, and Swagger. Araport hosts developer workshops and hackathons to help the community enrich and exploit this resource. Find out more at https://www.araport.org/devzone.

Summer internships are available for high school and college undergraduate students who have interests in biology and computer programming, and also for teachers interested in developing curriculum to teach biological subjects using bioinformatics tools. Find out more at https://www.araport.org/araport-internship-program.

[1] International Arabidopsis Informatics Consortium. (2012). Taking the next step: building an Arabidopsis information portal. The Plant Cell, 24(6), 2248-2256. PMID: 22751211

[2] Krishnakumar, V., Hanlon, M. R., Contrino, S., Ferlanti, E. S., Karamycheva, S., Kim, M., Rosen, B. D., Cheng, C., Moreira, W., Mock, S. A., Stubbs, J., Sullivan, J. M., Krampis, K., Miller, J. R., Micklem, G., Vaughn, M. & Town, C. D. (2014). Araport: the Arabidopsis Information Portal. Nucl. Acids Res., 43(D1), D1003-D1009. PMID: 25414324

[3] Hanlon, M. R., Vaughn, M., Mock, S., Dooley, R., Moreira, W., Stubbs, J., Town, C., Miller, J., Krishnakumar, V., Ferlanti, E. & Pence, E. (2015). Araport: an application platform for data discovery. Concurrency Computat.: Pract. Exper., doi: 10.1002/cpe.3542.

J. Craig Venter Institute, US: Chris Town [lead], Jason Miller, Agnes Chan

Texas Advanced Computing Center, US: Matt Vaughn

University of Cambridge, UK: Gos Micklem

Araport is currently seeking ideas for the bioinformatics tools (Science Apps) most needed by the plant community. You are invited to submit ideas for tools that make use of one or more public data sources and/or existing tools. The target user can be either a bench scientist or a bioinformatician. The submitter of the best idea will receive an iPad. Find out more at https://www.araport.org/araportChalleng2015.


Developer Zone

Internships at Araport

References

Participating Groups and PIs

Ideas Challenge 2015 Win an iPad


https://www.araport.org

Email: [email protected]

Twitter: @araport

org

Last modified: Aug 6, 2015