1
Module Development for
Araport
Jason Miller, J. Craig Venter Institute (JCVI)
Plant & Animal Genomes Conference (PAG) 2016
2
www.ARAPORT.org
Community
• Jobs
• Meetings
• Curation
ThaleMine
• Genes pages
• Protein pages
• Analysis tools
Modules
• Science apps
• Web services
• Education
JBrowse
• Annotation
• RNAseq
• SNPs
Arabidopsis Information Portal
4
Modules Proposed
‘The design of the AIP will provide core functionality while remaining flexible to encourage multiple contributors and constant innovation.’
IAIC Whitepaper (2012), Plant Cell: “Taking the next step”.
5
Modules Realized
7
Modules Realized
Core web applications for integration and indexing.
In-house Science Apps
Community Science Apps
8
Interactions Module
Contributed by BAR, 2015
9
Science App by Asher Pasha & Nicholas Provart from BAR at the University of Toronto.
10
How It Works
https://www.araport.org JavaScript in the Browser…
if ( gene && gene.length > 0 )
… calls Araport web services by URL…
$.get('https://api.araport.org…
… which in turn call BAR web services by URL…
http://bar.utoronto.ca/webservices/get_expressologs.php
*This pass-through middle layer exists to avoid cross-origin errors: the browser's same-origin policy restricts JavaScript requests to URLs on a different origin than the page that served the script, unless the remote server opts in through cross-origin resource sharing (CORS).
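The pass-through idea can be sketched in a few lines of Python. This is a minimal illustration, not Araport's actual middle layer: the upstream URL is the one quoted on the slide, but the `locus` parameter name and the `pass_through` and `http_get` helpers are assumptions made for the example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Upstream URL quoted on the slide; the query parameter name below is hypothetical.
BAR_URL = "http://bar.utoronto.ca/webservices/get_expressologs.php"


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def pass_through(locus, fetch=http_get):
    """Relay one query to the upstream service and return its parsed JSON.

    The browser only talks to the server hosting this function, so the
    cross-origin request is made server-side, where the browser's
    same-origin policy does not apply.
    """
    query = urlencode({"locus": locus})  # hypothetical parameter name
    return json.loads(fetch(BAR_URL + "?" + query))
```

Passing `fetch` as an argument keeps the relay logic testable without network access.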
11
How It Works
https://www.araport.org
The graph is interactive.
• Users can rearrange nodes by dragging.
• Users can get details by clicking.
This is Cytoscape.
• The graph is drawn by Cytoscape.js.
• This is a free library for JavaScript.
There are many libraries to choose from!
• jsPhyloSVG: phylogenetic trees
• HighCharts: statistical charts
• jQuery DataTables: interactive tables
• d3.js: all sorts of cool stuff
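Cytoscape.js draws a graph from a list of elements, where each node carries an `id` and each edge carries `source` and `target`. The sketch below shows how a service might shape interaction pairs into that form. The function name and the placeholder gene identifiers are illustrative only; the slides do not show how the BAR app prepares its data.

```python
import json


def to_cytoscape_elements(interactions):
    """Convert (source, target) pairs into Cytoscape.js element dicts."""
    nodes = sorted({gene for pair in interactions for gene in pair})
    elements = [{"data": {"id": gene}} for gene in nodes]
    for source, target in interactions:
        elements.append(
            {"data": {"id": f"{source}-{target}", "source": source, "target": target}}
        )
    return elements


if __name__ == "__main__":
    # Placeholder identifiers, not real interaction data.
    pairs = [("geneA", "geneB"), ("geneA", "geneC")]
    print(json.dumps(to_cytoscape_elements(pairs), indent=2))
```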
12
Code re-use
http://bar.utoronto.ca
The Araport science app (left) reuses code from the pre-existing BAR app (right). These apps look different by choice but they could be made identical.
https://www.araport.org
13
Key Points
• The Interactions Science App
– Example of a visualization module
– Uses the Cytoscape.js library for JavaScript apps
– Displays data from BAR web services via an Araport pass-through
– Developed at BAR by developers who attended an Araport workshop
– Similar code is deployed at Araport and BAR
• We invite you to develop a visualization module
– Araport engineers are available to provide technical support
• Featured speaker:
– Xinbin Dai, The Samuel Roberts Noble Foundation
– “HRGRN: enabling graph search and integrative analysis of Arabidopsis signaling transduction, metabolism and gene regulation networks”
14
Subcellular Localization Module
Contributed by SUBA, 2015
15
Web Service: SUBA3
Web service by Cornelia Hooper and Ian Castleden from the University of Western Australia.
This URL
Returns this data
16
Auto Doc: SUBA3
Automatic documentation at Araport.
These are the service endpoints
• The endpoint is the verb in the URL.
• Verb is followed by a parameter.
• Example: araport/suba/search?locus=AT2G46830
Standard service endpoints at Araport
• /list = which IDs work with this service?
• /search = what are the details for a given ID?
• /prov = who provided this data and how?
Automatic documentation is generated by Araport based on provided metadata.
Implemented with server-side components: Adama and Swagger.
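The verb-plus-parameter pattern above can be sketched as a small URL builder. The base path is copied from the slide's example and is incomplete as written (no scheme or host is shown); `endpoint_url` is a hypothetical helper, not part of any Araport library.

```python
from urllib.parse import urlencode

# Base path as it appears in the slide's example; scheme and host are not shown there.
BASE = "araport/suba"

# The three standard endpoints named on the slide.
STANDARD_VERBS = ("list", "search", "prov")


def endpoint_url(verb, base=BASE, **params):
    """Build '<base>/<verb>' with an optional query string."""
    if verb not in STANDARD_VERBS:
        raise ValueError(f"unknown endpoint: {verb}")
    url = f"{base}/{verb}"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    print(endpoint_url("search", locus="AT2G46830"))
```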
17
Auto Doc: SUBA3
Easy to use “try it out!” button.
Result: this transcription factor localizes to the nucleus.
Results in JavaScript-friendly JSON format.
18
A Web Service Module
• SUBA provides a web services module
– URL query takes an Arabidopsis locus as parameter
– URL responds with a page full of data
  • The data is not formatted for display to humans (e.g. HTML)
  • The data is formatted for JavaScript parsers (in JSON format)
– The service is RESTful in that the data exchange is achieved with just the standard web protocol, HTTP
• We can all use this module…
– Build a Science App that colors genes and pathways
– Build a Science App that scores predicted interactions
– ThaleMine could add subcellular localization to gene lists
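As a rough sketch of how a downstream app could consume such JSON, the example below parses a response and picks a display color per compartment. The response layout (`result`, `locus`, `location`) and the palette are assumptions made for illustration; the slides do not show the actual SUBA field names.

```python
import json

# Hypothetical response body: the real SUBA field names are not shown in the slides.
SAMPLE = '{"result": [{"locus": "AT2G46830", "location": "nucleus"}]}'


def localizations(body):
    """Map each locus in a JSON response to its reported location."""
    records = json.loads(body).get("result", [])
    return {rec["locus"]: rec["location"] for rec in records}


def color_for(location, palette=None):
    """Pick a display color for a compartment, as a gene-coloring app might."""
    palette = palette or {"nucleus": "blue"}  # illustrative palette only
    return palette.get(location, "grey")


if __name__ == "__main__":
    for locus, location in localizations(SAMPLE).items():
        print(locus, location, color_for(location))
```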
19
How They Did It: SUBA3
1. SUBA created a RESTful web service at their university.
• Added local URLs that return JSON instead of HTML.
• Re-used their existing database and web server.
2. SUBA wrote an Araport adapter.
• Wrote an adapter program in Python.
• Adapter calls their URL & prints results in JSON format.
• Added metadata in YAML format (for auto documentation).
• Saved code to a source code repository on Bitbucket.
3. SUBA deployed the adapter to Araport.
• Used ‘curl’ to send Araport the URL of the source code repository.
• Araport checks out the code, compiles it, containerizes it, deploys it.
• Araport generates interactive documentation using Swagger.
http://suba.plantenergy.uwa.edu.au/suba-app…
https://bitbucket.org/athaliana/suba-araport
$ curl -kL -X POST -H "$BEARER_TOKEN" -F "git_repository=https://bitbucket.org/athaliana/suba-araport" https://api.araport.org/community/v0.3
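For readers who prefer a script to the command line, the deployment request can be approximated in Python. This sketch only builds the request and does not send it. The registry URL and repository URL are the ones on the slide; the `build_registration` helper and the shape of the authorization header are assumptions, since the slide passes the whole header through `$BEARER_TOKEN` without showing its contents. The curl `-k` flag (skip certificate checks) is not reproduced.

```python
import uuid
from urllib.request import Request

# Both values appear on the slide.
REGISTRY_URL = "https://api.araport.org/community/v0.3"
REPOSITORY = "https://bitbucket.org/athaliana/suba-araport"


def build_registration(repo_url, auth_header, registry_url=REGISTRY_URL):
    """Build (but do not send) a multipart POST carrying 'git_repository'.

    'auth_header' is a (name, value) pair; its exact contents are an
    assumption because the slide hides them inside $BEARER_TOKEN.
    """
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="git_repository"\r\n'
        "\r\n"
        f"{repo_url}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    headers = {
        "Content-Type": f"multipart/form-data; boundary={boundary}",
        auth_header[0]: auth_header[1],
    }
    return Request(registry_url, data=body, headers=headers, method="POST")
```

Sending would be a matter of passing the returned object to `urllib.request.urlopen`, which follows redirects by default, much like curl's `-L`.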
20
Key Points
• Example of a web services module
– Developed at SUBA, University of Western Australia
– Deployed independently without Araport intervention
• We invite you to develop a web services module
– Araport will provide documentation, indexing
– Araport will promote auto-discovery, interoperability
• Featured speaker:
– Manhoi Hur, Iowa State University, Ames, IA
– “PMR metabolomics and transcriptomics database and its RESTful web APIs: A data sharing resource”
21
JBrowse Module
Contributed by EPIC CoGe, 2015
22
JBrowse Tracks
Track selected.
Track displayed.
Data provided by EPIC CoGe web services. Thanks to Eric Lyons, University of Arizona.
23
Key Points
• A web services module for JBrowse display
– Data + metadata provided by EPIC CoGe web services
– Exposed at Araport using pass-through adapters
  • No code, just metadata
• We invite you to contribute JBrowse track data
– Support for mapped reads in indexed BAM files
– Support for genomic variants in VCF files
• Featured speaker:
– Beth Rowan, Max Planck Institute for Developmental Biology, Tübingen, Germany
– “User friendly tools for the Arabidopsis thaliana 1001 Genomes”
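To give a feel for what "just metadata" can look like for the BAM and VCF tracks mentioned above, the sketch below builds track entries in the style of a JBrowse 1 `trackList.json`. Whether Araport accepts contributions in exactly this form is an assumption; the helper names and the example URLs are hypothetical.

```python
import json


def bam_track(label, bam_url):
    """Track entry for mapped reads in an indexed BAM file (JBrowse 1 style)."""
    return {
        "label": label,
        "type": "JBrowse/View/Track/Alignments2",
        "storeClass": "JBrowse/Store/SeqFeature/BAM",
        "urlTemplate": bam_url,
    }


def vcf_track(label, vcf_url):
    """Track entry for variants in a tabix-indexed VCF file (JBrowse 1 style)."""
    return {
        "label": label,
        "type": "JBrowse/View/Track/CanvasVariants",
        "storeClass": "JBrowse/Store/SeqFeature/VCFTabix",
        "urlTemplate": vcf_url,
    }


if __name__ == "__main__":
    # Placeholder URLs, not real data locations.
    tracks = [
        bam_track("my_reads", "https://example.org/reads.bam"),
        vcf_track("my_variants", "https://example.org/variants.vcf.gz"),
    ]
    print(json.dumps({"tracks": tracks}, indent=2))
```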
24
BLAST Module
Contributed by TACC, 2015
25
BLAST APP
The BLAST app provides basic search against TAIR10 and Araport11 databases. Future versions will provide gene report page hyperlinks and other Science App integrations.
26
How It Works: BLAST APP
JavaScript in the Browser: https://www.araport.org
function submitBlastJob(Agave){ Agave.api.jobs.submit(…
Araport Servers: files, Docker
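The browser call above hands a job description to Agave, which schedules it on the servers. The sketch below assembles such a description as plain data. The top-level keys (`name`, `appId`, `archive`, `inputs`, `parameters`) follow the general shape of an Agave job request, but the app identifier, the `query` input, and the `database` parameter are hypothetical; the BLAST app's real identifiers are not shown in the slides.

```python
import json


def blast_job_request(app_id, query_path, database, name="blast-job"):
    """Assemble a job description of the kind an Agave job submission carries."""
    return {
        "name": name,
        "appId": app_id,
        "archive": True,
        "inputs": {"query": query_path},       # hypothetical input id
        "parameters": {"database": database},  # hypothetical parameter id
    }


if __name__ == "__main__":
    # Hypothetical app id and file path; TAIR10 is one of the databases named on the slide.
    job = blast_job_request("blast-app-0.1", "query.fasta", "TAIR10")
    print(json.dumps(job, indent=2))
```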
27
How It Works: BLAST APP
The source code is public:
https://www.github.org
◀ JavaScript Science App for browsers (upload this using an Araport form)
◀ BLAST Database build script for servers (submit this with Agave tools)
◀ BLAST software build script for servers (submit this with Agave tools)
28
Key Points
• The BLAST Science App
– Built by Araport staff with tools available to end users
– Code is open source (can be re-used)
– Code is portable (can be installed at your site, too)
• We invite you to contribute a computational module
– Develop in almost any programming language
– Define the Docker container for running it
– Use Agave for scheduling jobs, storing files, etc.
– Deploy a JavaScript user interface to Araport
• Featured speaker:
– Michael Hamilton, Colorado State University
– “Predicting differential intron retention with iDiffIR”
29
Summary
• The Araport platform
– Hosts modules from community members
  • Members gain visibility, accessibility, discoverability
  • Members benefit from documentation, tech support
– Hosts many forms of community modules
  • Visualization Science Apps using JavaScript libraries
  • Computation Science Apps and back end software
  • Pure data interchange as RESTful web services
  • JBrowse tracks as RESTful web services
• On-going infrastructure development
– Federated search, ontology-based interoperation
– User workspaces, drag & drop combinations
Araport Developer Workshops
30
Deploying the Atted Science App Tutorial at AIP Developer Workshop, TACC, Nov 2014.
The Atted Science App Tutorial is available as open source on GitHub.
Sign up now for the 2016 workshop
31
Acknowledgements
Araport Data Sources
32
Acknowledgements
J. Craig Venter Institute
• Chris Town (PI)
• Jason Miller
• Agnes Chan
• Erik Ferlanti
• Irina Belyaeva
• Chia-Yi Cheng
• Vivek Krishnakumar
Former team members: Konstantinos Krampis, Svetlana Karamycheva, Maria Kim, Ben Rosen, Christopher Nelson, Seth Schobel
University of Cambridge
• Gos Micklem
• Sergio Contrino
Funding Agencies
Texas Advanced Computing Center
• Matt Vaughn
• Steve Mock
• Matt Hanlon
• Walter Moreira
• Rion Dooley
• Joe Stubbs
• Josue Balandrano Coronel
• Alex Rocha
Abstract
The Arabidopsis Information Portal (www.Araport.org) is an integrated resource for Arabidopsis genomics data, web-based genome browsing, and data mining. Araport is also a community-extensible platform for growth. Community labs are invited to contribute modules devoted to specific experimental data types. Araport’s community modules provide databases, computations, and visualizations. These are exposed as user-friendly web applications (“Science Apps”) or programmer-friendly web services or both. Araport modules are custom branded, auto-documented, and portable to other web sites. Module deployment is automated and developer-driven. Provenance tracking, usage reporting, data indexing and data integration will be automated soon. We will explain the process of developing a module and deploying it to Araport.
34
Requisite Architectural Diagram
[Diagram. Labels recovered from the figure:
• Clients: External programs; Portal programs (www.araport.org), including Science Apps, ThaleMine, and JBrowse
• API (api.araport.org): authentication, metering, logging, versioning, security
• Agave Core (Apps, Jobs, Systems): keep metadata, enroll users
• ADAMA: format data, enroll services
• Back ends: Computing, Storage, Databases, InterMines, Tripal, Others, reached via CGI, SOAP, or REST]