NRNB Annual Report 2013


Upload: alexander-pico

Post on 10-May-2015


DESCRIPTION

Annual progress report for the NIH P41 National Resource for Network Biology

TRANSCRIPT


Annual Progress Report - Research Progress 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

[Figure: The 2013 NRNB Network. Node labels in the diagram are the names of NRNB personnel and collaborators; see Appendix A for a full-page view.]

The 2013 NRNB Network. On the left is a network representation of all NRNB personnel and collaborators (blue circles), all TRD, DBP, Collaboration, and Service projects (orange diamonds), and associated publications (green triangles). Node size is proportional to the number of connections. Thick red borders indicate personnel, projects and publications directly funded by the NRNB P41 grant. On the right is a zoomed inset, inclusive of all NRNB-funded personnel making up the vital core of the NRNB network. There are 276 nodes and 365 connections in the network. NRNB funds 46 (17%) of these nodes, which make 211 (58%) of the connections. As a Cytoscape network [1], we can interactively explore this representation with our External Advisory Committee, offering dynamic views of our projects, collaborations and budgets. Also see Appendix A for a full-page view of the entire network.

1. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T (2011) Cytoscape 2.8: New features for data integration and network visualization. Bioinformatics 27:431–432.
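The percentages quoted in the figure caption can be reproduced from the stated counts. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative and do not come from the report.

```python
# Counts taken from the figure caption; names are illustrative assumptions.
total_nodes = 276
total_edges = 365
funded_nodes = 46
funded_edges = 211


def share(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


node_share = share(funded_nodes, total_nodes)
edge_share = share(funded_edges, total_edges)
print(f"NRNB-funded nodes: {node_share}% of nodes, {edge_share}% of connections")
```

The contrast between the two shares is the point of the caption: a small fraction of funded nodes participates in more than half of all connections.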


Annual Progress Report - Advisory Committee 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

We held our second External Advisory Committee (EAC) meeting on December 12, 2012, in coordination with the annual Cytoscape Workshops and Network Biology Symposium, hosted by NRNB this year at the Gladstone Institutes in San Francisco. In addition to the EAC members listed below, our Program Officer, Doug Healy, was in attendance. The following report was issued by our EAC.

Participating External Advisory Committee Members:

• Stephen Friend, Sage Bionetworks
• David Hill, Dana-Farber Cancer Institute
• Tamara Munzner, University of British Columbia
• Anya Tsalenko, Agilent Technologies
• Marian Walhout, University of Massachusetts Medical School

Overall Perspectives of the NRNB External Advisory Board

All members of the Advisory Committee found that this meeting provided evidence of very strong progress, and appreciated the increased clarity as to how to convey it to outsiders. In the past 18 months all of the major suggestions have been effectively addressed. The supplementary material has allowed a very powerful engagement by Alex Pico and the delivery of an entirely new focus to build out the Cytoscape tools within a “Cytoscape App Store”: http://apps.cytoscape.org. This has been matched by a comprehensive evolution of functionalities within the new version of Cytoscape 3.0 and a coherent maturation of all the Technology Research and Development Projects (TRDs) and associated Driving Biological Projects (DBPs).

The three major suggestions this cycle involve: 1) reviewing both the existing TRDs and DBPs to determine how mid-course optimization of these projects might allow maximal creation of “shining examples” around the strengths of the NRNB, especially by searching for new distal DBPs, 2) resolving the question of how to best measure success for the NRNB with a transition away from paper/citation based metrics to metrics of community enablement and integration, and 3) the importance of preparing for the extension by completing the draft proposal in time to engage the EAC six weeks before it is due to be submitted.

In summary, the NRNB has continued to make excellent progress through the first half of this funding period and the committee is strongly supportive of the overall progress and direction. The comments below, albeit pointedly critical, are designed to help the NRNB position itself for the strongest possible competitive renewal in 18 months. Please see the following descriptions of the specific programs for more detailed comments:


Specific Project Summary Statements

1) TRDs and DBPs (separate one for TRD3 and Cytoscape)

All of the NRNB labs continue to do exciting, cutting-edge work developing network-based approaches to important questions in the biological and social sciences. The “network extracted gene ontology” is one example: a novel way to make better use of ontologies while providing a visual output that offers a clearer representation of functional modules. Integrating statistical and scripting tools into Cytoscape, initially done in the context of social networks, is a decided plus that should have broad applicability. Ongoing work proposes potentially paradigm-shifting ways to answer questions and gain insight beyond traditional approaches, using link clustering and network ontology, for example.

The recent set of publications across the entire spectrum of NRNB activities shows that good progress is being made in developing new network-based tools and demonstrating the value of studying networks. At approximately the halfway point of this grant, the NRNB has provided clear examples of identifying problems or critical biological questions that require novel approaches, proposing and developing solutions based on integrating information into networks, and implementing potentially useful tools for addressing similar questions. Each of the TRDs was individually successful in that regard. The challenge going forward is to clearly demonstrate that these tools and approaches have applicability beyond the questions or problems that the individual TRDs tackled in the first place. One thing to consider now is how to better integrate across multiple TRDs. For example, can the tools being developed in TRDs A, C, and D be used in TRD B? This could be taken on as a collaboration or via a new DBP. Can the tools in TRD D be used to add further insight to the network-as-biomarker or network ontology efforts?

TRD C has made significant and impressive progress in the past year, with flagship projects in Mosaic (ontology-partitioned mosaics) and NeXO (network extracted ontologies). The Mosaic work has already been released as a Cytoscape plugin. The NeXO work is particularly exciting as a path to data-driven ontologies rather than a single monolithic solution that is not sensitive to context.

Several possible avenues for moving forward with the NeXO work were discussed, including the possibility of partnering with the existing GO project via supplemental funding.

In terms of communicating the overall value of the NRNB to the broader scientific community, there are four distinct elements that need to be clearly articulated in terms of what the TRDs are doing and what the NRNB as a whole has accomplished. NRNB to date has clearly shown 1) an ability to identify a problem or driving biological question that cannot be addressed without a network approach; 2) an ability to develop new tools and technology for network analysis and visualization; 3) an ability to implement usable tools and demonstrate proof of concept; and, the most challenging, 4) an ability to demonstrate that the tools are getting into wide use (e.g. via Cytoscape). This will require additional tracking and curation efforts that will be challenging because Cytoscape is now viewed as a “standard tool” and therefore less likely to be cited.

The NRNB is poised to be more than a collection of already successful TRDs. There should be some consideration for a major paper that involves ALL TRDs and many of the DBPs to show how the new suite of Cytoscape tools can help answer a major question in elucidating genotype-to-phenotype relationships. Cytoscape has become a great collection of tools, and NRNB has done great science developing some new tools and using them on a specific question, but the NRNB needs to move beyond being just a developer of Cytoscape tools and should look towards becoming an entity where the “whole is greater than the sum of the parts”.

While the entire spectrum of projects involving all TRDs and DBPs is quite exciting, now is the time to begin considering restructuring the DBPs – potentially eliminating some – as plans are developed for the competitive renewal in 18 months.

One area to consider is whether or not the NRNB should begin to branch out with respect to other disease models – much of the recent success has been focused on cancer – as there is more and more evidence for many genes to be involved in diseases very distinct from the initial disease associated with any given gene.

As previously, Hill’s lab is willing to serve as an alpha or beta test site for data integration and novel visualizations as well as testing plug-ins for statistical analysis coupled to visualizations.

In summary, it is clear that some TRDs are progressing well and are on track to roll out tools for network biology that will be widely used. In other cases, it is not clear that the right audience is being reached. With this in mind, we recommend that the NRNB perform a comprehensive review of all TRD projects and strive to align them with a set of DBPs that represent the most active user communities in network biology, with the following goals:

● Reach out to key/hub user bases for each technology

● Pursue opportunities for cross pollination/integration/pipelines across NRNB technology projects, which are currently being developed in isolation

● Identify other important resources and tools that NRNB TRDs could integrate with

Cytoscape Progress:

The team has made great progress towards Cytoscape 3.0: the beta release has been available for many months, and the full release is coming very soon. Many suggestions from the last meeting have already been incorporated, including identifying which previous plugins are high impact and devoting resources to make sure that these are ported to the new version.

The issue of backwards compatibility was raised again, since Cytoscape 3.0 introduces major API changes that prevent old plugins from working without code updates. The verbal answer made it clear that choices had been carefully considered in consultation with the developer community. In particular, the assurance was made that API compatibility is a guaranteed contract for all 3.x versions with no changes made before version 4.0, thanks to the use of semantic versioning. The suggestion was made once again to ensure that keeping the API stable is a very high priority, because as the user community grows in size the costs of breaking backwards compatibility increase accordingly.
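The compatibility contract implied by semantic versioning can be illustrated with a short check. This is a minimal sketch under stated assumptions: it applies generic semantic-versioning rules to `major.minor.patch` strings, the function names are hypothetical, and it does not represent how Cytoscape itself resolves app compatibility.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_api_compatible(built_against: str, running: str) -> bool:
    """Generic semantic-versioning rule (an assumption, not Cytoscape code):
    code built against one API version keeps working on any later release
    that shares the same major version."""
    built = parse_version(built_against)
    current = parse_version(running)
    same_major = built[0] == current[0]
    not_older = current >= built  # tuples compare element by element
    return same_major and not_older


print(is_api_compatible("3.0.0", "3.1.0"))  # later 3.x release
print(is_api_compatible("2.8.0", "3.0.0"))  # major version changed
```

Under this rule, an app written for any 3.x API remains valid until a 4.0 release, which is the guarantee the committee asked the team to treat as a very high priority.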

The consensus was that the process taken as described verbally was sound; it was just poorly documented in the written report. The suggestion for next time is to more explicitly document several things:

- process taken (to show that care was in fact taken)

- lessons learned: what worked, what didn't

- plans for the future

The team has made great progress in better documenting the use of Cytoscape by the biology community, with compelling statistics about the amount of use (including the impressive number of 1400 NIH grants). The changes made to the cytoscape.org front page, with the Tumblr feed showing images and the explicit encouragement that people should cite its use, are great. The use of resources to also manually track the divergence between citation rates and use is entirely appropriate (with the interesting result that use is at least 2x the citations).

There are many new exciting technical directions. The new AppStore will benefit many constituencies: developers, end-users, and the PIs themselves in documenting usage of its efforts by the community. The set of new features chosen also reflects the needs of many constituencies, for example scaffolding new users with the new welcome/startup screen, and supporting developers with the new API. It's also heartening to see technology transfer from the visualization community with the incorporation of edge bundling.

The report mentioned new support for 3D rendering. Concerns were raised about whether devoting resources to this effort is appropriate given the empirical work from the visualization community that has found many drawbacks to 3D layout of node-link graphs. The verbal answer was that the new modular architecture allows alternate renderers, that 3D was simply one of several, and that it was developed by a community member rather than the core developers.

2) Outreach and Impact

At the last advisory board meeting it was suggested to “distribute open source network technologies to the greater scientific community”. At this meeting Alex Pico presented the NRNB's execution on that suggested deliverable. Simply stated, there has been awesome progress, and much of this stems from the direct leadership of Alex in his new role as Executive Director of the NRNB. Whether measured by the recently published article in Nature Methods, “A travel guide to Cytoscape plugins”, by a visit to the Cytoscape App Store (http://apps.cytoscape.org, which you can reach by googling “cytoscape apps”), or by looking at how often the apps are used, this stands out as a remarkable success. It is now possible to extend this powerful start and consider annotating it with sections for open source and non-open source apps. There is a possibility to begin a dialog between those who desire new apps and those willing to build them. It might even be possible to now have funding listed and contests to encourage the building out of the most requested apps.

3) Moving forward: Ideas and Topics for Discussion

Much of the discussion about moving the NRNB effort forward centered on increasing outreach to potential users of NRNB resources, including Cytoscape, as well as tracking the use of these resources. Big progress has already been made through the http://www.nrnb.org website and the Cytoscape App Store, but more could be done.

Some suggestions for increasing outreach included targeted communications to potential users, either those subscribed to the Cytoscape mailing list or authors of papers using Cytoscape. Connections to social media resources like Twitter or Facebook could be increased. Quantitatively, this outreach could be measured by the number of groups using NRNB resources rather than by the number of papers citing these resources or Cytoscape. Some papers may not cite Cytoscape directly but mention it only in the Supplementary Information, which is not searched, or may not cite it at all.

The impact of Cytoscape and NRNB tools in general could be increased by connecting to other public resources for molecular and computational biology. One example is a connection with GenomeSpace (www.genomespace.org), a platform that connects different bioinformatics tools, making it possible to move data smoothly between these tools and to leverage available analyses and visualizations. Other public resources that could benefit from a connection to Cytoscape include Galaxy, KnowledgeBase, and IGV. Sharing between users could be increased by enabling smooth sharing of Cytoscape networks on Google Drive or Amazon Cloud, as well as through the use of Cytoscape Web.

One area of applications of network biology tools that could be significantly expanded going forward is social network research, especially analysis of social and molecular networks, and interactions between different groups.

The NRNB group has made impressive progress, with tens of successful Google Summer of Code projects. Going forward, it would be great to track the careers of these students, and of students from the NRNB mentorship program, as another way to measure impact on community and science.


4) Suggestions For Next EAC Meeting and Report:

1. Next Report

This year's report was much better than last year's; however, there is still room for improvement.

As suggested, the emphasis shifted from the science results of the DBPs to the more appropriate new developments created through the TRDs; that's a major improvement. However, the problem of documenting to what extent the output of this and previous funding (new tools or methods) is used in biological discovery could be even more clearly addressed.

For example, in the group's own research papers that are not directly about the development of Cytoscape itself, to what extent was the use of Cytoscape instrumental in achieving the research results? We suggest that this story should be told very explicitly.

Another suggestion for the next round is to provide a full list of results or subprojects at a fine-grained level, for example a specific new Cytoscape plugin or a new analysis method proposed in a research paper. For each result, identify progress according to four key milestones:

1. identify problems

2. propose solutions (for example, new methods in published paper)

3. build generally available tool

4. get other people to use it

The goal should not be to reach the final milestone for every idea, but to document progress in terms of moving from earlier ones to later ones. Subprojects may enter at any stage; they don't have to be seeded only through the DBPs in the original grant. Subprojects may also exit at any stage, for example when the decision is made to propose alternate new solutions rather than following up with tool building in every case. It was clear from the verbal discussion that the center should be able to produce some very satisfying accounts of its achievements along these lines, and that these proofs of accomplishment will be a compelling and convincing part of a renewal proposal. This type of reporting will also help with the argument that the impact of Cytoscape and the NRNB goes beyond simple publication counts and citation counts. The deeper goal of the center is to introduce and encourage network methods in the biology community, so documenting the adoption of methods and tools shows progress towards that goal.
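The milestone-based reporting described above could be kept in a small data structure. The following is a minimal sketch, assuming one record per subproject; the class names, field names and the two example entries are hypothetical and are not drawn from the report.

```python
from dataclasses import dataclass
from enum import IntEnum


class Milestone(IntEnum):
    """The four milestones named by the committee, in order."""
    IDENTIFY_PROBLEM = 1
    PROPOSE_SOLUTION = 2
    BUILD_TOOL = 3
    ADOPTED_BY_OTHERS = 4


@dataclass
class Subproject:
    name: str
    entered_at: Milestone   # subprojects may enter at any stage
    current: Milestone      # latest milestone reached so far
    exited: bool = False    # subprojects may also exit at any stage

    def stages_advanced(self) -> int:
        """Number of milestones passed since the subproject entered."""
        return int(self.current) - int(self.entered_at)


# Hypothetical entries, for illustration only.
subprojects = [
    Subproject("example plugin", Milestone.IDENTIFY_PROBLEM, Milestone.BUILD_TOOL),
    Subproject("example method", Milestone.PROPOSE_SOLUTION,
               Milestone.PROPOSE_SOLUTION, exited=True),
]

for sp in subprojects:
    status = "exited" if sp.exited else "active"
    print(f"{sp.name}: {sp.current.name} (+{sp.stages_advanced()} stages, {status})")
```

Recording the entry stage separately from the current stage reflects the committee's point that progress is measured by movement between milestones, not by whether every idea reaches the last one.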

A second suggestion is to more clearly explain the boundary between this P41 and the other sources of funding: the related R01, and the grants supporting the DBPs. Ideker articulated a clear story in response to EAC questions: the $300K/yr R01 funds maintenance, while new technology springs from the $700K/yr P41. The committee approves of this story; it just needs to be told clearly and concisely in the written materials. In particular, document what efforts are funded through the R01 and what are through the P41. Although the NRNB has broader scope than Cytoscape alone, since it is partially funding core Cytoscape work the best way to address this boundary is to at least briefly present the full picture of what work on Cytoscape has been done, and then to explain what parts were funded by the P41. The current report gives the full picture of Cytoscape development, but does not adequately explain the boundary.

The administrative information section is very well done. The budget is clearly explained, with crosscutting breakdowns between categories (staff vs. TRDs vs. PI salaries) and PI groups. The breakdown of expenses according to both FTEs and money was also helpful. The discussion of the importance of actively cultivating an open development community is articulate.

2. Next Meeting

First, the EAC should be sent the relevant written materials to read in advance of the actual meeting. This year, the report was provided on paper to committee members at the start of the meeting, with an electronic version following a few hours into the meeting. This timing is too late, because it's hard to assimilate the written report in parallel with attending to the presentations. The report should be provided to committee members in advance, ideally one week before the meeting, and at bare minimum at least two days before the meeting. The late timing this year was particularly frustrating given that this report was created many months ago, but through an oversight hadn't been forwarded to us.

Second, the EAC agreed that we would best serve the interests of the NRNB by scheduling our next meeting shortly before the renewal proposal is due in what we think will be June 2014. Our intent is to act as pre-reviewers, where we will read a full draft of the proposal in detail before the meeting and then devote the meeting to an in-depth discussion of ways to strengthen and improve it. We propose roughly six weeks before the proposal is due: early enough that our feedback can be responded to, but late enough that the draft proposal is nearly complete rather than preliminary. This meeting would be roughly 1.5 years from now, the same amount of time that has elapsed between our first and second meetings.

Third, a suggestion for the renewal proposal is to have a large set of short testimonials from users, rather than (or in addition to) the more usual approach of full formal letters of support from a small number of people. The testimonials would be a few sentences or a paragraph about how Cytoscape has been valuable in their own work; having dozens or even hundreds of these compiled together in one document might have enormous impact on reviewers.

5) Collaborations and service projects

A major goal of NRNB is to support collaborations with a broad variety of researchers in biomedical science. Different types of collaborations have been initiated, from very small support-style collaborations to larger collaborations that require active participation by NRNB. The EAC was very impressed with the overall number of collaborations. At the time of the previous SAB meeting, there were 36 active research collaborations with NIH-supported researchers. In the last 1.5 years or so, another 60 were added, making a total of 96. One issue is that the majority of collaborations are internal. Better advertisement of NRNB and its collaborative goals at relevant scientific conferences may help to acquire more external collaborations.

Collaborations are only a small part of the NRNB budget with an estimated cost of ~$100,000, but are highly effective at leveraging the NRNB expertise to expand the overall impact and reach of Cytoscape.

The term ‘collaboration’ is used in a way that is somewhat ambiguous: the CSP umbrella includes tiny-scope efforts called ‘support’ (33%), small-scope efforts called ‘consulting’, and medium-scope efforts called ‘collaboration’. However, the DBPs are what we might consider true collaboration, and the hope is that some of the medium-scope efforts would evolve into new DBPs over time, even as some previous DBPs might be scaled back into a smaller role. However, since the term ‘CSP’ is the standard vocabulary defined by the grant, perhaps it is not realistic to rename these medium-scope efforts. It would be useful to see these numbers proportioned for internal versus external collaborations.

These collaborations are currently tracked in a publicly available and transparent way on the NRNB web site with titles, investigators, and NRNB contact. It would be useful if their status could also be tracked.

For the renewal, it will be very important to obtain letters or a filled out survey from collaborators regarding the utility of Cytoscape and how it changed their research.

6) Promising ideas for potential supplemental funding

The first supplemental effort provided to the NRNB enabling the Cytoscape App Store project has turned out to be a remarkable return on investment, demonstrating a capacity for greater creativity and productivity. We highly recommend additional supplemental grants to maintain, or even increase, this level of activity. During the advisory meeting, we explored a number of proposals worth considering:

1. Moving NeXO forward (see TRD A) by partnering with existing GO projects

2. Enable Cytoscape users to record/reuse/host/share workflows and sessions to promote network biology use cases, enriched publications, reproducibility and collaboration.

3. Interface with a specific key technology that targets a strategic community ripe for network biology perspective/tools (e.g., MIDAS, UCSC Genome Browser, NCBO BioPortal, Galaxy, GenomeSpace, Sage Bionetworks/Synapse, DREAM)


Annual Progress Report - Administrative Information 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

Administrative Structure

During the first year, we defined the administrative structure of the resource, including some unique new roles within the organization. The roles of Principal Investigator (PI), Co-PI, External Advisory Committee (EAC), Resource Administrator and Chief Software Architect were defined as in the original grant. We defined a new role of Executive Director (ED) to oversee some of the new resource functions that NRNB provides, including Training & Outreach, Communications and Infrastructure. The ED (Alex Pico, Gladstone Institutes) is responsible for coordinating these efforts as well as conducting all of the necessary tracking and due diligence for the annual reporting to NIH.

During the second year, we defined the new role of Collaboration Coordinator to screen and process collaboration requests to our resource. This has been a vital role in supporting the 80+ ongoing collaborations during the past two years.

During the third year, we defined a proper position for the Roving Engineer who is vital for outreach to new users, app developers and strategic partnerships. Our Roving Engineer is also a major contributor to Cytoscape core design and implementation, embodying the full cycle from users to developers to implementation to release. Finally, we are very pleased to have maintained an active dialog with our EAC members, including Dr. Stephen Friend as chair of the committee.

Budget changes have been minimal over the three years, with the exception of the new Collaboration Coordinator and TRD increases for Pico, Ideker and Sander in Year 2, and the new Roving Engineer and subsequent TRD cuts to Pico and Ideker in Year 3. The trend over time has been toward supporting more Outreach initiatives to fulfill our P41 goals.

[Figure 1, panels A and B: budget area charts. Legend labels: Outreach, TRDs, Admin, Co-PIs; Ideker, Pico, Sander, Bader, Schwikowski, Fowler.]

Figure 1. Budget graphs. Area charts showing the distribution of funds for years 1-3 (x-axis) per category (A) and per group (B). Y-axis is in units of $1,000s of US dollars. Each stripe typically corresponds to an individual with a specific role in NRNB, totaling 6.5 FTEs. Note that groups are sorted by degree of change, which is critical in this style of visualization to minimize misperception of change when slopes are actually parallel.

As the basis for the graphs above, here are itemized tables of FTEs and funding for all three years (Table 1). Highlighted in red are the significant changes in Year 3 to FTEs and total dollars.

| Roles and Groups | FTEs Year 1 | FTEs Year 2 | FTEs Year 3 | $1,000s Year 1 | $1,000s Year 2 | $1,000s Year 3 |
|---|---|---|---|---|---|---|
| Collaboration (Ideker) | 0.00 | 0.50 | 0.63 | 0 | 50 | 50 |
| Admin-Asst. (Ideker) | 1.00 | 0.56 | 0.56 | 52 | 38 | 41 |
| Core Tech. (Ideker) | 0.40 | 0.40 | 0.40 | 47 | 51 | 53 |
| TRD-A (Ideker) | 0.50 | 0.50 | 0.50 | 40 | 45 | 36 |
| Admin-PI (Ideker) | 0.30 | 0.30 | 0.29 | 74 | 78 | 77 |
| Communication (Pico) | 0.30 | 0.30 | 0.25 | 29 | 29 | 25 |
| Admin-ED (Pico) | 0.50 | 0.50 | 0.50 | 56 | 56 | 57 |
| Roving Engineer (Pico) | 0.00 | 0.00 | 0.12 | 0 | 0 | 16 |
| TRD-C (Pico) | 0.20 | 0.48 | 0.13 | 21 | 39 | 17 |
| Co-PI (Pico) | 0.02 | 0.02 | 0.02 | 5 | 5 | 0 |
| TRD-A (Sander) | 0.65 | 0.65 | 0.62 | 90 | 97 | 98 |
| Co-PI (Sander) | 0.02 | 0.02 | 0.02 | 5 | 5 | 5 |
| TRD-C (Bader) | 1.00 | 1.00 | 0.91 | 90 | 93 | 90 |
| Co-PI (Bader) | 0.10 | 0.10 | 0.10 | 0 | 0 | 0 |
| TRD-D (Schwikowski) | 1.00 | 1.08 | 1.08 | 81 | 83 | 83 |
| Co-PI (Schwikowski) | 0.08 | 0.08 | 0.08 | 0 | 0 | 0 |
| TRD-B (Fowler) | 1.00 | 0.72 | 0.20 | 58 | 54 | 53 |
| Co-PI (Fowler) | 0.10 | 0.10 | 0.10 | 21 | 26 | 27 |
| SUBTOTAL | 7.17 | 7.32 | 6.51 | 669 | 750 | 728 |
| Supplement (Ideker) | 0.00 | 0.40 | 0.40 | 0 | 45 | 45 |
| Supplement (Pico) | 0.00 | 1.00 | 1.00 | 0 | 85 | 85 |
| Supplement (Bader) | 0.00 | 0.40 | 0.40 | 0 | 45 | 45 |
| SUBTOTAL | 0.00 | 1.80 | 1.80 | 0 | 175 | 175 |
| GRAND TOTAL | 7.17 | 9.12 | 8.31 | 669 | 925 | 903 |

Table 1. NRNB effort and budget. Annual budgeting of FTEs and $1,000s itemized by roles (per group). Major changes are highlighted in red. Subtotals are provided separately for the main grant and supplemental funding (bold) and Grand Total is in the last row.

Allocation of Resource Access

Beyond the active distribution and support of Cytoscape, which is covered in later sections, NRNB resource allocation can be categorized in the following way:

1. On-site training events: NRNB staff participated in 13 training events during the reporting period. These events include tutorials, workshops and courses.


2. Requests for collaboration and mentorship: For the second consecutive year, we have maintained a high number of active collaborations. Many of these collaborations are coming through our participation in Google Summer of Code (GSoC) and our own NRNB Academy efforts (see #3).

3. Google Summer of Code and NRNB Academy: In addition to receiving requests from potential students through these programs, we also receive requests from a number of groups to join our organization as mentors. This brings new technology and ideas to our effort. GSoC has been our most successful outreach program by far. It is responsible for a quarter of all our NRNB collaborations, and it is the most active period for NRNB.org, granting broad exposure for NRNB in the open source community. Building on the success of this model, we launched NRNB Academy last year. Our Academy follows the same approach as GSoC, organizing around available mentors, ideas and interested students. However, because it is independent of GSoC and 100% volunteer based, we are not restricted to supporting university students. The Research Progress and Highlights provide more details.

4. Requests for training material support: We receive requests for tutorial materials throughout the year from inside and outside the Cytoscape core development team. Our homegrown Open Tutorials system makes it easy to accommodate all such requests. Open Tutorials is an easy-to-use wiki system that provides content formatted to be used as online sessions, slide shows and printed handouts. This year we are seeing more content from more contributors, in addition to a steady rise in visitors (see details in the Training section below).

5. Providing software community support: Our goal is to develop a generic template of services based on the support we provide the Cytoscape community of users and developers. So far we have extended support to Cytoscape, WikiPathways, Cytoscape Web and the cBio Cancer Genomics Portal. These proven resources demonstrate the broader scope of the NRNB mission. We are providing distribution links, showcases, tutorial support, news and event tracking, and GSoC and NRNB Academy participation to these projects. New this year is a gallery page with screenshots of all of these tools.

Awards and Honors

None

Dissemination

Overall

Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. This report describes the Cytoscape software, the infrastructure that supports it, and the activities of the community it serves.

Background

The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Under v2.x, internal programmatic interfaces evolved from one release to the next, leading to the failure of working plugins over time and negative interactions between otherwise working plugins. Ultimately, this resulted in loss of programmer and user productivity, and undermined community confidence in Cytoscape.

v3.0 addresses these issues by adopting modular coding practices promoted by the OSGi¹ architectural framework. This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality. At the logical level, Cytoscape leverages OSGi precepts to produce v3.0 APIs having cleaner and clearer demarcations between functional areas. At the deployment level, OSGi enables on-the-fly substitution of one processing element for another (e.g., apps) in order to tailor Cytoscape to meet user requirements at runtime without reinstalling or reconfiguring Cytoscape.

v3.0 represents a strong investment toward reducing future development and support costs, and increasing reliability and evolvability. We expect to leverage v3.0 as a platform to satisfy the evolving needs of multiple stakeholder groups, and as a platform enabling research on leading edge analysis and visualization techniques. v3.0 is the intended successor to v2.8, with development and support of v2.8 expected to diminish and disappear over time in favor of v3.0 and its successors. v3.0 is upward compatible with v2.8, but not downward compatible.

While v3.0 is a substantial reorganization of v2.8, its launch also marks an evolution in the Cytoscape team’s approach to community engagement, where different community demographics are engaged in different, demographic-sensitive ways. The team identified four major groups: new users, casual (but not new) users, power users, and app developers. The initial v3.0 release was promoted to power users and app developers as a way of delivering v3.0’s advanced capabilities to the groups most able to leverage them, give qualitative and remedial feedback, and promote v3.0 adoption to other Cytoscape users.
This strategy dovetails with v3.0 features (described below) that lower barriers to entry for new and casual users while enabling efficiency and productivity for power users and app developers. The second release (v3.0.1) is imminent; it incorporates various critical fixes and numerous feature requests made by early v3.0 adopters. As such, it will be promoted to the entire Cytoscape community, including new and casual users, and will replace v2.8 as the default Cytoscape download. Compared to v2.8, Cytoscape users will benefit most directly from v3.0 in the long run by:

• experiencing fewer core and app bugs from one release to the next
• gaining access to more and richer apps (due to developers spending less time tracking and fixing bugs)
• gaining more core features with higher biological and logistical value (due to improved flexibility provided by interface-driven development)

The v3.0 Release

Throughout 2012, Cytoscape developers made a number of beta versions available to early adopters. Issues were tracked in RedMine, and were contributed by both developers and early adopters. The final release was made on February 1, 2013, accompanied by updated user documentation, user tutorials, JavaDoc programmer documentation, app developer tutorials, a new App Developer Cookbook (containing useful code snippets), and release notes.

1 www.osgi.org (also used as the basic framework for Eclipse and numerous commercial products)


Additionally, a new and comprehensive user-focused Welcome Letter was created to distinguish between user demographics and engage each appropriately. Principal v3.0 development was carried out by staff and researchers worldwide, including at the following institutions: UC San Diego, Pasteur Institute, University of Toronto, Gladstone Institutes (UC San Francisco), and University of Amsterdam. v3.0 included the following major features:

• Upward compatibility with Cytoscape 2.x networks, attributes, analysis, layout, and display
• App Store (for centralized app availability)
• Friendly Welcome dialog (to engage new and casual users)
• Import network
• Edge bend visual property
• Edge bundling
• Grouping (for hierarchical networks)
• Enhanced search
• Show All in Table Browser
• Multiple network management
• Major refactoring to rationalize/regularize inter-module interfaces (to aid app developers in creating reliable apps)

Major issues remaining after the v3.0 release included:

• Slower startup than v2.x
• Fewer apps (plugins) than v2.x
• Numerous undiscovered or unaddressed bugs (due to major refactoring)
• Smaller network capacity on 32-bit processors

There are 145 apps (plugins) available in v2.x, though many have gone unmaintained and have fallen out of use. Of the v2.x plugins, 8 were delivered in v3.0 as core functionality:

• EnhancedSearch
• MetanodePlugin2
• PSICQUICUniversalClient
• GraphMLReader
• NCBIEntrezgeneUserInterface
• ScriptEngineManager
• JavaScriptEngine
• NetworkAnalyzer

Additionally, the App Store contained another 13 apps (corresponding to many of the most popular v2.x plugins):

• AgilentLiteratureSearch
• Cy3PerformanceReporter
• jActiveModules
• CentiScaPe
• Cyni Toolbox
• MCODE
• ClueGO
• CyPath2
• PathExplorer
• CluePedia
• DynNetwork
• Venn and Euler Diagram
• ClusterOne
• GeneMANIA

Bug Bounty

To foster early investment and engagement in v3.0 by the user community, we created the Cytoscape Bug Bounty program, which paid out small prizes to users identifying high value bugs in the month of February 2013.


The program produced 35 bugs by 17 qualified reporters: 8 crash/data loss, 19 user interface, and 7 cosmetic. Gift cards were given to the top 9 reporters.

It was great fun to participate in the February Bug Bounty. Thank you for organizing it, and, in general, thank you for making the development of Cytoscape an open process. It’s really appreciated, from the point of view of the users, when a software is developed this way.

In general, I’ve found that the new Cytoscape 3.0 version is a great improvement over the previous. The new “Welcome screen”, together with many little improvements to the menus and the interface, gave me a feeling of very user friendly software. The ability of downloading whole species for networks with a click, or to import them from many sources, is attractive to many people, and I know some persons who will use it for their work. The App store is also a nice addition, as it is much better to have a common web page for all the plugins instead of having to look for documentation dispersed into many little websites.²

The v3.0.1 Release

The v3.0.1 Release is scheduled for April 18, 2013. Its main purpose is to eliminate bugs leading to data loss, program crashes, misleading displays, and small user interface issues. Given this, we expect that it will be suitable for use by the entire Cytoscape community (including new and casual users) in preference to v2.8, and we expect v3.0.1 to become the default download on the Cytoscape web site. The first v3.0.1 release candidate (RC) will become available for download by April 4. It will include fixes or resolutions for 98 reported bugs and other issues, including 30 of 35 reported under the Bug Bounty program. Notably, the v3.0.1 release:

• Substantially increases the size of network manageable on 32-bit systems
• Migrates source from SVN to GitHub (to expand collaboration opportunities)

At release time, we expect there to be slightly under 200 bugs or unresolved issues remaining on our backlog, including feature requests and issues requiring substantial development or rework. Additionally, app developers have asked for improved documentation to enable quick and reliable app development. Currently, UC San Diego is upgrading three v2.8 plugins to become v3.0 apps, and expects completion in Q3 2013:

• GenomeSpace
• MiMI
• BiNGO

Additionally, the NRNB has offered Amazon gift certificates as rewards to app developers for the first 20 apps independently developed and submitted.

2 Giovanni Marco Dall’Olio, March 8, 2013 via e-mail


Bug and Issue Tracking

Since early 2011, the Cytoscape team has tracked bugs and issues using the RedMine cloud service. As of v3.0, users can inject reports of bugs and issues into RedMine directly from Cytoscape. A CDF plot of bugs and issues logged over time shows aggressive tracking:

The following CDF shows that the Cytoscape team has responded to logged reports (by addressing them as bug fixes or scheduling them to be addressed in the future).

“Created” means that a ticket was opened, and “Updated” means that a Cytoscape team member has acknowledged it, and has prioritized it for solving or has already solved it.
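The curves described above can be derived from ticket timestamps alone. The following is a minimal sketch, assuming the plots show running totals of tickets over time; the ticket records and the `cumulative_counts` helper are hypothetical and are not taken from RedMine or its API.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_counts(event_dates):
    """Return (day, running_total) pairs, one per distinct day, in date order."""
    per_day = Counter(event_dates)
    days = sorted(per_day)
    running = accumulate(per_day[day] for day in days)
    return list(zip(days, running))


# Hypothetical tickets as (created, updated) pairs.
# updated is None when no team member has acknowledged the ticket yet.
tickets = [
    (date(2013, 2, 1), date(2013, 2, 2)),
    (date(2013, 2, 1), None),
    (date(2013, 2, 4), date(2013, 2, 4)),
    (date(2013, 2, 7), date(2013, 2, 8)),
]

created_curve = cumulative_counts([created for created, _ in tickets])
updated_curve = cumulative_counts(
    [updated for _, updated in tickets if updated is not None]
)

for day, total in created_curve:
    print(day.isoformat(), total)
```

A widening gap between the final values of the two curves would indicate tickets that have been opened but not yet acknowledged.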

Measured Results

Cytoscape Downloads and Web Site Visits

Through March 2013, the overall number of Cytoscape downloads (including v2.8 and v3.0) has continued to rise. The chart below shows the monthly download counts, with data dropouts in November 2007 and March 2009. In February 2013, the download count was 6,685, and the count for March was 7,323.

Since 2012, weekly visits (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February, 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though seem to have increased since v3.0’s February 2013 release.


In examining year-over-year visit patterns, 2013 visits have increased by about 30%, with an uptick corresponding to the v3.0 release timeframe. This pattern is reflected in visits to the download page, too. Visits to the v3.0 page account for about 25% of page visits. (Note that a visit to the v3.0 page is a prerequisite to downloading v3.0, so the number of v3.0 page visits is an upper bound on the count of v3.0 downloads. Visiting the v3.0 page can have many purposes, only one of which is downloading v3.0.)

Between January 1, 2012, and the end of March, 2013, the Cytoscape web site received 393,903 distinct visits. Web site visitors were geographically dispersed worldwide:


Cytoscape visitors arrived most often after performing a Google search, but also arrived from direct links and from links within Cytoscape web pages:


App Store

The App Store opened for business on June 1, 2012. Since then, it has received over 33,000 visits from users worldwide:

Most visits originate from a link within the Cytoscape web site but a significant number of visits launch from search engines and direct links:


Except for during the holiday season, the traffic to the App Store has consistently grown. By March, 2013, weekly visitors numbered between 1,100 and 1,300. Through March, 2013, a total of 33,596 visits were received:

Interest was evenly distributed across a number of app categories:

The most frequently downloaded apps (as of March, 2013) were:

| App | Count |
| --- | --- |
| ClueGO | 1,394 |
| GeneMANIA | 1,230 |
| jActiveModules | 1,196 |
| MCODE | 980 |


Cytoscape Citations

The count of Cytoscape-citing papers continues to rise year over year; the count for 2013 is incomplete (as of March, 2013).

Year-over-year growth has been historically sporadic, and may be showing signs of slowing:

| Period | Year-over-year Growth |
| --- | --- |
| 2004-2005 | 64% |
| 2005-2006 | 72% |
| 2006-2007 | 126% |
| 2007-2008 | 94% |
| 2008-2009 | 80% |
| 2009-2010 | 8% |
| 2010-2011 | 32% |
| 2011-2012 | 19% |
| 2012-2013 | incomplete |
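For reference, year-over-year growth figures of this kind are conventionally computed as the change in the annual count divided by the previous year's count. The sketch below illustrates that formula only; the citation counts in `example_counts` are hypothetical placeholders, since the underlying annual counts are not listed in this report.

```python
def year_over_year_growth(counts_by_year):
    """Return {'Y1-Y2': percent growth} for each pair of consecutive years."""
    years = sorted(counts_by_year)
    growth = {}
    for previous, current in zip(years, years[1:]):
        change = counts_by_year[current] - counts_by_year[previous]
        growth[f"{previous}-{current}"] = round(100 * change / counts_by_year[previous])
    return growth


# Hypothetical annual counts of citing papers (illustration only).
example_counts = {2010: 100, 2011: 132, 2012: 157}

print(year_over_year_growth(example_counts))
# {'2010-2011': 32, '2011-2012': 19}
```

Note that a falling growth percentage still means the annual count is increasing, only more slowly than before.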

Community Outreach

The Cytoscape community consists of core developers, app developers, and users. Communication and outreach is multimodal: Google Groups for contemporaneous discussion, Google video and Hackathons for core developer meetings, papers, web site and social media, and public meetings and symposia.

Google Groups and Video

The Cytoscape team has maintained Google Groups since April, 2011. As of March, 2013, there were 4 groups:


| Group | Membership | Topic Count |
| --- | --- | --- |
| cytoscape-discuss | 1,531 | 2,570 |
| cytoscape-helpdesk | 1,148 | 1,413 |
| cytoscape-announce | 918 | 194 |
| cytostaff | 49 | 2,643 |

The discuss and helpdesk groups facilitate self help (through search), peer assistance, and assistance directly by Cytoscape core developers. The announce group is used by Cytoscape core developers to announce new Cytoscape releases, and by app developers to announce new apps.

The cytostaff group enables communication between Cytoscape core developers to coordinate activities and exchange technical information. Cytoscape core developers also meet on video chat weekly to plan agendas, triage issues, and conduct infrastructure activities.

Hackathons

The Cytoscape team conducted a Hackathon at the Gladstone Institutes in San Francisco on December 12, 2012, concurrently with the annual general Cytoscape symposium. Participants laid out the following roadmap for short and medium term development:

• Table loading performance
• Network panel update
• Command language support
• Search/Filter API
• Property Sheets
• Separation of ViewModel
• Advanced Label Rendering (Zoom/multi-scale)
• JSON package to support external processes
• SBGN symbols
• Table merge
• Vizmapper documentation
• Developer requests
  o Integration to R/scripting
  o XMLRPC/REST access
  o Headless/daemon mode

Web Site and Social Media

The main Cytoscape web site (cytoscape.org) was augmented to include a branch for v3.0, which includes user and developer documentation, links to the Welcome Document and release notes, and links to presentations and social media sites. Notably, videos of app presentations at the December 13-14 general Cytoscape symposium were posted at: http://nrnb.org/presentations.html


Future Risks

The primary objective of the architectural refactoring that transformed Cytoscape v2.8 to v3.0 was to normalize relationships amongst subsystems so that changes could be made in one subsystem without detriment to another. While this evolution has been accomplished, much code was changed, and bugs continue to be discovered and reported by the user community. For now, the community remains forgiving and indulgent, mainly because Cytoscape’s basic functionality appears sound. However, the community perspective may change when v3.0 becomes the default download. While bugs can be fixed on point releases, slow startup times and the slow conversion rate of v2.x plugins into v3.0 apps remain a threat for several quarters. Mitigating strategies include continuing the excellent and diligent support offered by the Cytoscape team and community, which serves to help prioritize release features and to keep user frustration from growing. Additionally, software reliability can be improved by incrementally developing automatic test suites beyond what exists today.

While Cytoscape’s semantic versioning provides app developers with important guarantees of interface- and semantic-consistency as Cytoscape evolves, it’s possible that semantic versioning itself may threaten to retard plugin authorship, rendering Cytoscape unresponsive to scientific requirements in meaningful timeframes. The interfaces defined in Cytoscape 3.0 have been shown to be insufficient for the needs of new apps in some cases. While new interfaces can be added, doing so requires incrementing the minor version number (e.g., from 3.0 to 3.1), which is intended to occur only rarely. Furthermore, the operational complexity and overhead of making new Cytoscape releases virtually guarantee the slow evolution of Cytoscape interfaces.
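The versioning constraint described above can be illustrated with a small compatibility check. This is a sketch of the general semantic versioning convention (same major version, equal or newer minor version), not of Cytoscape's or OSGi's actual resolution logic; the `Version`, `parse` and `app_can_run` names are hypothetical.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int = 0


def parse(text: str) -> Version:
    """Parse 'MAJOR.MINOR[.PATCH]' into a Version, padding missing parts with 0."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return Version(*parts[:3])


def app_can_run(built_against: str, running: str) -> bool:
    """True if an app built against one API version should work on another.

    Assumes the usual semantic versioning rules: a major bump may break
    existing interfaces, a minor bump only adds interfaces, and a patch
    bump changes no interfaces.
    """
    needed, available = parse(built_against), parse(running)
    return available.major == needed.major and available.minor >= needed.minor


print(app_can_run("3.0", "3.0.1"))  # True: patch release, same interfaces
print(app_can_run("3.0", "3.1"))    # True: 3.1 only adds interfaces
print(app_can_run("3.1", "3.0.1"))  # False: app needs interfaces added in 3.1
print(app_can_run("2.8", "3.0"))    # False: major version changed
```

The third case is the one at issue here: an app that needs a new interface must wait for the next minor release before it can ship.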
Mitigating strategies include deliberately hastening the pace of interface-augmenting releases and engaging app developers to aggressively feed interface requests to the team, possibly at the expense of core development.

Notwithstanding the enormous benefits of the architectural refactoring, critical Cytoscape subsystems (e.g., user interface and apps) remain tightly coupled. This coupling threatens (at best) to recapitulate the tangled relationships that triggered the refactoring or (at worst) make the replacement, scaling, or reuse of these subsystems problematic. Eventually, this threatens the evolvability of Cytoscape to serve scientific interests in relevant timeframes. Mitigating strategies include focused refactoring of key subsystems along SOA (service oriented architecture) or COA (component oriented architecture) principles to expose and separate distinct concerns. This type of refactoring can occur while implementing a given use case, and can then be leveraged to benefit subsequent, related use cases.

Patents, Licenses, Inventions, and Copyrights

None. We are committed to an Open-Source dissemination policy.

Training and Outreach

Annual Cytoscape Retreat

The annual Cytoscape Workshops and Symposium was hosted by the National Resource for Network Biology (NRNB) at the Gladstone Institutes on the UCSF Mission Bay campus in San Francisco during this reporting period. In addition to developer meetings, the event included user and new developer tutorials, a Plugin/App Expo, a special Network Biology symposium, and our EAC meeting. The meeting was a huge success, with capacity attendance for the user tutorial and very positive survey responses from attendees.

Workshops

For the reporting period, NRNB has participated in a total of 13 training events in multiple countries. These events include tutorials, workshops and courses. Cytoscape is taught in many classroom and workshop settings. We try to track all of these on our website and Event Tracker. We’ve identified 37 courses offered in the 2012-2013 calendar year! And these are just the ones affiliated with NRNB staff.

Open Tutorials

Our tutorial management system, Open Tutorials, is still the main source for tutorial materials for the Cytoscape project, and is being used both internally by presenters, and by researchers and developers. Visits to Open Tutorials have continued to increase over the last year, with an average of 3750 visits/month, as compared to 2700 visits/month for the previous reporting period. More than half of all visits (57%) are from new visitors. We estimate that the increase in traffic is mainly from users, as we have had only two new editors in the same period. Tutorial development during the past year was focused on a set of user tutorials for Cytoscape 3.0, covering the most common use cases and describing the user interface and new welcome screen. We plan to add several additional user tutorials over the next 6 months. Overall, Open Tutorials has allowed NRNB to reach our goal of providing tutorial support to a broad and diverse community.

Social Media

We have initiated a social media effort for Cytoscape through a number of different tools (http://www.cytoscape.org/community.html). For example, a Twitter account is used for quick announcements (http://twitter.com/cytoscape) and YouTube is utilized for video tutorials (http://www.youtube.com/results?search_query=cytoscape). During this reporting period we continued the popular Tumblr site to capture published figures using Cytoscape. Pairs of figures are posted on a weekly basis on the front page of cytoscape.org based on this Tumblr feed. We now regularly get authors submitting their recent publications to us, asking to feature them via our Tumblr site. This is directly helping to promote the use and citation of Cytoscape.

Google AdWords

We were awarded a non-profit account in the Google AdWords program. We are managing 8 Ad Group campaigns consisting of over 880 keywords and phrases. Last month alone we received over 7,000 clicks on these ads to our NRNB sites. These activities are worth over $8,800 a month (a 550% increase over last year), which we are getting free-of-charge. We have a spending limit of $329 per day through this program, a potential value of $120,000 per year, so we will continue to identify new ads and relevant resources.

Google Summer of Code and NRNB Academy

In addition to the outreach effort described above, we also leverage a Google-sponsored program called Google Summer of Code to attract new developers. This year we are coordinating 30 mentors, leveraging the effort of developers from open source communities surrounding NRNB-related tools. Last summer through the GSoC program we received over 60 student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! Google paid $5,000 per student, making their investment in NRNB $80,000 for 3 months of work.

Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations. Our first graduating student continues to be involved as a contributor, and two of the ongoing students are involved in longer-term ongoing projects as well.
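The dollar figures quoted for GSoC and AdWords follow from simple multiplication of the numbers stated in the text. The short sketch below reproduces that arithmetic; the variable names are illustrative, and the 365-day year is an assumption about how the annual AdWords figure was derived.

```python
# Inputs taken from the text above.
GSOC_STUDENTS = 16
GSOC_PAYMENT_PER_STUDENT_USD = 5_000
ADWORDS_DAILY_LIMIT_USD = 329

# Assumption: the annual AdWords value is the daily limit times 365 days.
DAYS_PER_YEAR = 365

gsoc_investment_usd = GSOC_STUDENTS * GSOC_PAYMENT_PER_STUDENT_USD
adwords_annual_cap_usd = ADWORDS_DAILY_LIMIT_USD * DAYS_PER_YEAR

print(f"GSoC investment: ${gsoc_investment_usd:,}")          # $80,000
print(f"AdWords annual cap: ${adwords_annual_cap_usd:,}")    # $120,085
```

The AdWords cap of $120,085 rounds to the $120,000 per year quoted above.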


Annual Progress Report - Research Highlights 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

Contents

● Network Approach to Building Gene Ontologies
● First Release of Cytoscape 3.0 and the Cytoscape App Store
● NRNB Google Summer of Code Program Reaches New Levels

Network Approach to Building Gene Ontologies

Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We have recently shown that the existing collection of high-throughput network maps, as are now becoming available, can be analyzed to automatically assemble an ontology of gene function that rivals manually curated efforts [1]. Our systematic computational approach combines evidence from physical, genetic and transcriptional networks to produce an ontology comprised of 4,123 biological concepts and 5,766 hierarchical concept relations. Using a new ontology alignment procedure, we found that the network-based ontology captures the majority of known cellular components and identifies approximately 600 new cellular components and component relations, many of which we were able to validate either experimentally or bioinformatically. By working closely with the GO curators, we were able to incorporate selected new components and relations into the Gene Ontology, thus providing proof-of-principle for how to systematically update and revise the GO structure based on large-scale data.

1. Dutkowski, J., Kramer, M., Surma, M.A., Balakrishnan, R., Cherry, J.M., Krogan, N.J., and Ideker, T., A gene ontology inferred from molecular networks. Nat Biotechnol, 2013. 31(1): p. 38-45.

First Release of Cytoscape 3.0 and the Cytoscape App Store

The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Version 3.0 addresses these issues by adopting modular coding practices promoted by the OSGi architectural framework. This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality.

Since 2012, weekly visits (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February, 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though seem to have increased since v3.0’s February 2013 release.

To help address the needs of users, we launched the Cytoscape App Store (http://apps.cytoscape.org) to coincide with the release of Cytoscape 3.0, a major re-architecting of Cytoscape for improved stability, performance, and versatility. The overarching goals of the Cytoscape App Store are to highlight the important features apps add to Cytoscape, to enable researchers to find apps they need, and to let developers promote their apps. For each Cytoscape 3.0 app, the App Store supports unique features like one-click install and comprehensive download statistics. The App Store opened for business on June 1, 2012. Since then, it has received over 33,000 visits from users worldwide. Except for during the holiday season, the traffic to the App Store has consistently grown. By March, 2013, weekly visitors numbered between 1,100 and 1,300. Through March, 2013, a total of 33,596 visits were received.

The App Store is already playing a broader role in the Cytoscape community than just a place for browsing and submitting apps. For instance, we held a competition for the best Cytoscape 3.0 apps in December 2012. The first prize was shared by ClueGO, which visualizes the relationship between gene ontology terms, and DynNetwork, which visualizes networks with time-based movement. We plan to host more competitions in the future to encourage Cytoscape 3.0 app development.
Apps and the app developer community play a critical role in the success of Cytoscape, ensuring its continued relevance and reach as the field of network biology evolves. The new Cytoscape App Store aims to increase the visibility and accessibility of apps, providing support to both Cytoscape users and app developers. We anticipate that traffic will continue to increase as apps, and the App Store, become more prominent in the Cytoscape community.

NRNB Google Summer of Code Program Reaches New Levels

Last summer through the Google Summer of Code (GSoC) program we received over 60 student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! This is almost double the number of students we mentor through GSoC in a typical year and puts NRNB in the top 10 supported organizations out of 180 open source organizations accepted into the Google program. Google paid $5,000 per student, making their investment in NRNB $80,000 for 3 months of work.

Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations.


Summary

Continued advances in high-throughput experimental technologies release enormous amounts of interaction data into the public domain. Analysis of these interactions, and the networks they form, relies in large part on robust bioinformatics technology. The mission of the NRNB (nrnb.org) is to develop and support a suite of bioinformatics tools that broadly enable the study of network biology. In our third year as a resource, we have significantly advanced our goals through basic research, collaboration, dissemination of software tools, and community support. Here, we describe our progress in research, both basic and collaborative. This progress includes the use of network modules for patient diagnostics; tools that use ontologies to enable new network analyses and visualizations; tools that generate ontologies from networks; novel investigations at the interface of social networks and health; and major new releases of our Cytoscape platform and App Store. Each progress report below specifies the associated personnel and FTEs funded by the NRNB grant. In terms of our own research, NRNB enables a stable effort from each of the resource member sites, ranging from 0.20 to 1.08 FTEs. Many of these TRD projects leverage effort from other grants and funding mechanisms as well in order to maximize the return on investment. Nevertheless, without NRNB support, these projects would be significantly diminished, if not discontinued, and would lack the cohesion and synergy provided by a network biology resource (see reports #1-7 below). In terms of the services, training and dissemination, the impact of the NRNB resource is clear.
Specifically, this resource is responsible for the extra effort needed to drive our mailing list response rate to over 90% (see Administrative Information report); the Open Tutorials system for collecting, maintaining and serving tutorial materials; the administration of NRNB’s participation in Google Summer of Code and our own NRNB Academy (see report #9 below); the organization of the annual Network Biology SIG and Cytoscape Workshops; and the new Cytoscape App Store, which has catalyzed the Cytoscape user and developer communities (see report #10 below). These efforts are maintained by the 0.5 FTE executive director and 0.25 FTE communications coordinator roles defined and funded by NRNB. Finally, NRNB has a wide-ranging impact on biomedical research, both nationally and internationally, through its collaboration projects. NRNB member sites were collectively maintaining an estimated two dozen collaborations prior to the formation of this Resource. During the first year, we established close to 40. For the past two years, NRNB has maintained 80-100 collaboration projects. These projects range from the application of Cytoscape as a research tool for network analysis and visualization, to the development of Cytoscape plugins for custom data types and analyses, to the development and application of other network and pathway tools and resources for network biology (see report #8 below). This activity is a direct result of the NRNB roles of executive director, communications coordinator and collaboration coordinator (0.63 FTE). We have come a long way in just three years, and NRNB is still maturing. With continued support, we are committed to maintaining and growing these efforts as a Resource for the network biology community.


Contents

I. Technology Research and Development: Progress and Applications

References and figures are provided for each project and numbered independently. This year, per the direction of our EAC, we are using a 4-Stage model to provide a common context in describing the wide variety of technologies being developed in both our TRD and Collaboration projects. You will see references to "(Stage 2)", for example. The 4-Stage model is described and illustrated at the beginning of the next section (II. Collaboration, Table 1).

1. A Gene Ontology Extracted from Molecular Networks (Ideker)
2. Network Analysis Tools for Cancer Genomics (Sander)
3. Network Analysis Methods for Inferring Causality in Signaling Networks (Sander)
4. Using Cytoscape for Social Network Research (Fowler)
5. Cytoscape 3.0 and CytoscapeWeb for the Visualization and Representation of Biological Networks (Bader)
6. Analyzing Complex Networks Using Ontologies and Cytoscape 3.0 (Pico)
7. The CYNI Modular Network Inference Framework (Schwikowski)

II. Collaboration and Service Projects: Progress

In addition to the direct impact of our TRD projects on our research, NRNB also impacts new science through our many Collaboration and Service Projects (CSPs). A description of each CSP is provided in the bulk of the report. Here, we summarize the scope of our collaborations and provide a new 4-Stage model and illustration to convey the range of our efforts as well as progress from year to year. Major service projects are also described in this section.

8. Collaboration Landscape
9. Google Summer of Code and NRNB Academy

III. Progress on Supplemental Award, 2011-2013

We were awarded a two-year supplemental grant to work on the Cytoscape App Store. This is a progress report on the second year.

10. The Cytoscape App Store (Pico, Bader)

Appendix A. The 2013 NRNB Network

A full-page view of this year’s network representation of NRNB.


I. Technology Research and Development: Progress and Applications

References and figures are provided for each project and numbered independently. This year, per the direction of our EAC, we are using a 4-Stage model to provide a common context in describing the wide variety of technologies being developed in both our TRD and Collaboration projects. You will see references to "(Stage 2)", for example. The 4-Stage model is described and illustrated at the beginning of the next section (II. Collaboration, Table 1).

1. A Gene Ontology Extracted from Molecular Networks (Ideker, 0.5 FTE: Janusz Dutkowski)

Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We have recently shown that the collection of high-throughput network maps now becoming available can be analyzed to automatically assemble an ontology of gene function that rivals manually curated efforts [1]. Our systematic computational approach (Fig. 1) combines evidence from physical, genetic and transcriptional networks to produce a network-extracted ontology (NeXO) comprising 4,123 biological concepts and 5,766 hierarchical concept relations (Fig. 2). Using a new ontology alignment procedure (Fig. 1), we found that the network-based ontology captures the majority of known cellular components and identifies approximately 600 new cellular components and component relations, many of which we were able to validate either experimentally or bioinformatically. By working closely with the GO curators, we were able to incorporate selected new components and relations into the Gene Ontology, thus providing proof-of-principle for how to systematically update and revise the GO structure based on large-scale data (Stages 1 & 2). The network-extracted ontology is a new resource for systems and synthetic biology, i.e. a data-driven catalogue of cellular machinery, from genes, to complexes, to pathways and higher-order processes. It provides a powerful tool for performing multi-scale analysis of biological networks, including automatically identifying, annotating and visualizing the complete hierarchical structure.
We also show how integrating the ontology with additional high-throughput datasets leads to the identification of new components and processes altered in human disease. Based on our results, we suggest a new role for ontologies in bioinformatics: rather than merely being used as a gold standard for performing functional enrichment, ontologies should serve as evolvable models that are validated, revised, and expanded based on new genomic data. Moving forward, it will be interesting to see how the network-extracted ontology can be extended further. For instance, while NeXO represents a rigorous approach to capturing ontology terms and term relations, the ability to systematically annotate the type of relation that occurs between terms (e.g. “is a”, “part of”, “regulates”) poses a separate and very interesting challenge. An in-depth investigation is needed to assess which network properties are best at separating the different types of relations, and whether there are additional data sets that might be brought to bear on this problem (Stage 3). Similarly, while NeXO identifies the majority of known cellular components, it will be interesting to investigate what types of network data could be used to increase the coverage of biological processes and molecular functions. Finally, a key question is whether enough high-quality data exist to build NeXO ontologies for other species, particularly human, and whether it is better to structure a common ontology for all species, as has been done in GO, or to focus on individual species-specific ontologies.

Figure 1. Automated assembly and alignment of gene ontologies. (A) Probabilistic community detection within the input networks yields a binary tree in which nodes correspond to ontology terms and links correspond to parent-child term relations. Unsupported terms are replaced by multi-way joins, and additional parent-child relations are added based on network data. The resulting ontology is aligned against the Gene Ontology, in a way that (B) prohibits non-unique mappings and ancestor-descendant criss-crossing.
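To make the idea of a binary tree of terms more concrete, the following Python sketch greedily merges the two most similar gene groups until a single root remains. It is only an illustration under stated assumptions: it substitutes plain average-linkage agglomerative clustering for the probabilistic community detection described in the caption, and the gene labels, similarity scores, and the function name build_binary_tree are all hypothetical.

```python
from itertools import combinations


def build_binary_tree(similarity):
    """Greedy average-linkage merging of genes into a binary tree.

    similarity maps frozenset({gene_a, gene_b}) -> score (hypothetical
    network evidence; missing pairs count as 0.0). Returns (tree, root),
    where tree maps each new term to its two children.
    """
    genes = sorted({g for pair in similarity for g in pair})
    clusters = {g: (g,) for g in genes}  # cluster id -> member genes
    tree = {}
    next_id = 0

    def avg_sim(members_a, members_b):
        scores = [similarity.get(frozenset((a, b)), 0.0)
                  for a in members_a for b in members_b]
        return sum(scores) / len(scores)

    while len(clusters) > 1:
        # Pick the pair of current clusters with the highest average similarity.
        a, b = max(combinations(sorted(clusters), 2),
                   key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]))
        term = f"term{next_id}"
        next_id += 1
        tree[term] = (a, b)  # parent-child relations of the new term
        clusters[term] = clusters.pop(a) + clusters.pop(b)

    root = next(iter(clusters))
    return tree, root


# Toy input: two tight gene pairs that are only weakly linked to each other.
similarity = {
    frozenset(("A", "B")): 0.9,
    frozenset(("C", "D")): 0.8,
    frozenset(("A", "C")): 0.1,
}
tree, root = build_binary_tree(similarity)
for term, children in tree.items():
    print(term, "->", children)
```

In this toy input the two tight pairs become the first two terms and are then joined under a single root; the later steps in the caption (collapsing unsupported terms, adding extra parent-child relations, alignment to GO) are not attempted here.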


References

1. Dutkowski J, Kramer M, Surma MA, Balakrishnan R, Cherry JM, Krogan NJ, Ideker T: A gene ontology inferred from molecular networks. Nat Biotechnol 2013, 31(1):38-45.

Figure 2. The NeXO ontology is shown as a tree, with nodes indicating terms and edges indicating hierarchical relations between terms, i.e. that one term contains another. Node sizes indicate the number of genes assigned to a term. Node colors represent the degree of correspondence to a term in GO as determined by ontology alignment, with high-level alignments labeled. Insets show the hierarchy identified for the ribosome and actin cytoskeleton.

2. Network Analysis Tools for Cancer Genomics (Sander, 0.62 FTE: Ben Gross)

This project is focused on building network analysis tools for interpreting high-throughput cancer genomic data sets to identify altered disease networks and enable the identification of network-based biomarkers in cancer. Our primary focus is building user-friendly, open source tools for visualizing and analyzing multidimensional cancer genomic data sets (including copy number, mutation, and mRNA expression) in the context of known biological pathways and interaction networks, and making these tools broadly available to clinical, experimental and computational investigators within the cancer research community. Providing such tools to the cancer research community is critical, as numerous large-scale projects, including the Cancer Genome Atlas (TCGA) project and the International Cancer Genome Consortium (ICGC), are profiling dozens of cancer types and subtypes. Identifying altered pathways and networks within each of these cancer types remains a critical and open challenge. During our first several years of NRNB funding, we completed a prototype project for displaying multi-dimensional cancer genomic data in the context of molecular interaction networks. We chose to implement the prototype in CytoscapeWeb [1], as it does not require any additional software installation or Java Web Start. It therefore significantly lowers the barriers to usage, particularly for biologists and clinical researchers, two of our main target user groups. We transitioned our tools from prototype to production mode (Stage 3), and have made our software available to the entire cancer research community. Cancer researchers are now using these tools to perform network analysis on up to 20 different cancer types, including TCGA-funded projects, such as glioblastoma multiforme (GBM) [2] and serous ovarian cancer [3] (Stage 4). The cBioPortal for Cancer Genomics code base has recently reached a stable state and is now being considered as a general framework on which to build our other NRNB-related tools. Our recently finalized drug-target data support in the context of cBioPortal’s network analysis is one such example. During the past year, we improved the network analysis capabilities of the cBioPortal by providing query and visualization of aggregated drug data from multiple resources. With this new feature, the portal currently contains gene-centric drug-target information from the following resources: DrugBank [8], KEGG Drug [9], NCI Cancer Drugs (http://www.cancer.gov/cancertopics/druginfo/alphalist), and Rask-Andersen et al. [10]. Within the network analysis view, drugs are hidden by default, but can be added to the network via the Genes & Drugs menu on the right side of the screen. Users now have the option of displaying FDA-approved drugs, cancer drugs defined by NCI Cancer Drugs, or all drugs targeting the query genes.
For example, when the user queries for the gene EGFR in the portal, we not only show the network context of this gene, but also provide information about the drugs targeting the product of this gene: gefitinib and erlotinib are tyrosine kinase inhibitors that target the catalytic domain of EGFR, and cetuximab and trastuzumab are monoclonal antibodies that target the extracellular domain of EGFR and ERBB2, respectively (Fig. 1) [11].


Figure 1: Improved Network tab: Network analysis of epidermal growth factor receptor networks in serous ovarian cancer. (A) Network view of the EGFR and ERBB2 neighborhood in serous ovarian cancer (TCGA data set) rendered by Cytoscape Web. EGFR and ERBB2 are query genes (thick border), and nearest neighbor genes are color coded by their alteration frequency in ovarian cancer. One can display drugs that target EGFR or ERBB2 (hexagons, orange if FDA approved), as well as details about genomic alterations and links to external resources (lower left panel, example MYC). (B) The portal overlays multidimensional genomic data (copy number, mutation, and mRNA expression) onto all nodes in the network. (C) Edges can represent different interaction types (color-coded, such as “reacts with”). (D) Options for filtering, cropping and searching the network of interest.

Our new drug-target feature is now available as part of the open-access cBioPortal and is helping cancer researchers explore therapy options within the network context of genes of interest (Stage 4).

Outreach Plans

Since its launch in mid-2010, the cBioPortal has been extensively used by cancer researchers around the globe, particularly by The Cancer Genome Atlas (TCGA) network. The portal currently attracts more than 1,500 unique visitors per week. In order to help researchers use cBioPortal in their studies, we are actively communicating with various communities, such as the TCGA network, and publicizing the tool through different channels. During the last year, we adopted and are currently maintaining an e-mail list for users who have questions regarding the use of the cBioPortal. This e-mail list and the questions answered by our group are publicly available at our Google Groups page (http://groups.google.com/group/cbioportal/). Furthermore, we have recently completed a manuscript that explains the use cases of cBioPortal and its network analysis feature in detail in order to encourage wider adoption. We believe this publication (Science Signaling) will help researchers interested in cancer research to use the portal more efficiently. We also participated in last year’s Google Summer of Code (GSoC) program with two separate projects under the NRNB organization. The first project, a Cytoscape 3.0 application to facilitate downloading cancer genomics data through the cBioPortal Web API services, was successfully led by Dazhi Jiao under the advisement of two members of our group. This Cytoscape 3.0 application allows users to download data from cBioPortal, visualize it in the network context either in an overall or sample-specific manner, and analyze it with the help of additional Cytoscape 3.0 applications (see Figure 2). The source code for this project is freely available at our Google Code project web site (http://bit.ly/cbioportal). The software implementation for this project is currently being finalized (Stage 3) and we are planning to distribute this application through Cytoscape’s App Store interface in the next year (Stage 4).

Figure 2: A screenshot of the Mondrian application, an open-source project conducted as part of Google Summer of Code 2012. The image shows how genomics data, downloaded from the cBioPortal through this application, is overlaid onto the user’s network of interest. Once the data is loaded from the cancer studies of interest through the cBioPortal’s Web API, users have the option to explore multi-dimensional cancer-related data within the Cytoscape framework in a fashion that is similar to cBioPortal’s network analysis feature.

Our second GSoC project was led by the summer student Istemi Bahceci, under the co-advisement of one member of our group in conjunction with our NRNB collaborator Ugur Dogrusoz at Bilkent University. The aim of this project was to extend CytoscapeWeb to support the Systems Biology Graphical Notation (SBGN) for more detailed biological pathway visualization. This project was completed over the last summer and we are currently integrating it into the cBioPortal to provide better network analysis options for users (Stage 4; please see the following section).

New Driving Biological Projects

In the next year, we anticipate improving the network analysis feature in two ways: 1) detailed visualization of the pathways and reactions in the network view; 2) inference of indirect drug targets, for potentially interesting therapy options, by using genomic alteration and drug-target data. Currently, interaction types that are shown in the network analysis view are derived from the BioPAX to SIF inference rules [7]. For example, In Same Component indicates that Genes A and B are involved in the same biological component, such as a complex; State Change indicates that Gene A causes a state change, such as a phosphorylation change, within Gene B. This reduction from BioPAX to SIF was necessary because the Cytoscape Web framework did not, at the time, support visualization of more complex elements, such as compartments. With the technology being developed as part of the CSP-100 project (Gary Bader), it has recently become feasible to visualize biological networks in a more detailed way, therefore enabling the use of Systems Biology Graphical Notation (SBGN) for better representation of BioPAX. As part of our NRNB collaboration with Ugur Dogrusoz (Bilkent University, Turkey), we are aiming to adopt SBGN-compliant views to visualize multi-dimensional cancer genomics data within the network context (see Figure 3). This project has recently been implemented as a proof-of-concept prototype and is now being integrated into cBioPortal (Stage 3 -> 4). When complete, this new feature will allow better presentation of proteomics data (e.g. Reverse Phase Protein Array data provided as part of the TCGA network) by allowing users to optionally switch from a gene-centric to a protein-centric view.
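For readers unfamiliar with the reduced representation, the short Python sketch below shows how a SIF-style edge list (one source, one interaction type, and one or more targets per line) might be parsed and grouped by interaction type. The token spellings IN_SAME_COMPONENT and STATE_CHANGE, the gene labels, and the function name parse_sif are assumptions made for illustration, not an excerpt from the cBioPortal code base.

```python
from collections import defaultdict


def parse_sif(text):
    """Group SIF-style lines (source, interaction type, targets...) by type."""
    by_type = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank lines and isolated nodes
        source, interaction, targets = fields[0], fields[1], fields[2:]
        for target in targets:
            by_type[interaction].append((source, target))
    return dict(by_type)


# Hypothetical example edges; the interaction tokens are assumed spellings.
example = """GENE_A\tIN_SAME_COMPONENT\tGENE_B
GENE_A\tSTATE_CHANGE\tGENE_C\tGENE_D
"""
edges = parse_sif(example)
print(edges)
```

Each binary edge carries only a type label, which is why elements such as compartments or multi-step reactions cannot be expressed at this level and a richer notation such as SBGN is needed.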

Figure 3: Proposed additions to the current simple network view. On the left is the simple interaction view derived from BioPAX; on the right is an example visualization of a BioPAX network obtained from Pathway Commons. The latter uses the new Systems Biology Graphical Notation (SBGN) visualization capabilities of the CytoscapeWeb project. The SBGN view provides a more detailed representation of the pathway and also provides a protein-centric view, with proteomics data mapped to specific proteins or phospho-proteins.

In the next year, we are also planning to utilize genomic alteration and pathway data to infer clinically relevant uses of drug-target data. For this, we intend to use downstream and upstream relationships between genes to suggest drugs of possible interest that can indirectly target a particular genomic alteration event in cancer samples (see Figure 4). One historical example of such a case is the use of AKT inhibitors in patients who bear a homozygous PTEN deletion. Without PTEN and its product, AKT proteins, which are downstream of PTEN, cannot be suppressed, and are therefore found to be upregulated in cancer samples that have the homozygous PTEN deletion. In the presence of an AKT inhibitor, this upregulation can be counteracted. A similar example of this concept is the use of CDK4/6 inhibitors when CDKN2A is either mutated or homozygously deleted in cancer cells. Pathway resources, such as Pathway Commons, already provide these types of relationships between genes; we plan to extract this information in a systematic way and combine it with the drug-target data in order to infer such therapy options automatically within the cBioPortal framework. This method and the prototype are currently under development (Stage 2).

Figure 4: Conceptual framework for inferring novel and drug-based therapy options based on specific genomic alteration with the use of pathway context -- e.g. use of AKT inhibitors when PTEN is altered in the tumor.
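The inference outlined in Figure 4 can be sketched in a few lines of Python: when a gene is lost, look for downstream genes that it normally inhibits, then report drugs known to target those downstream genes. This is a minimal illustration under stated assumptions rather than the method under development: the edge list is a toy simplification, the drug identifiers are placeholders rather than real compounds, and the function name suggest_indirect_drugs is hypothetical.

```python
def suggest_indirect_drugs(lost_gene, pathway_edges, drug_targets):
    """Suggest drugs that hit genes normally inhibited by a lost gene.

    pathway_edges: iterable of (upstream, relation, downstream) triples.
    drug_targets:  dict mapping drug name -> set of target genes.
    Returns a dict mapping each de-repressed downstream gene to a sorted
    list of drugs that target it.
    """
    suggestions = {}
    for upstream, relation, downstream in pathway_edges:
        if upstream != lost_gene or relation != "inhibits":
            continue  # only loss of an inhibitor de-represses its target
        drugs = sorted(drug for drug, targets in drug_targets.items()
                       if downstream in targets)
        if drugs:
            suggestions[downstream] = drugs
    return suggestions


# Toy pathway edges mirroring the two examples in the text.
pathway_edges = [
    ("PTEN", "inhibits", "AKT1"),
    ("CDKN2A", "inhibits", "CDK4"),
    ("CDKN2A", "inhibits", "CDK6"),
]
# Placeholder drug names (hypothetical, for illustration only).
drug_targets = {
    "akt_inhibitor_x": {"AKT1"},
    "cdk46_inhibitor_y": {"CDK4", "CDK6"},
}

print(suggest_indirect_drugs("PTEN", pathway_edges, drug_targets))
print(suggest_indirect_drugs("CDKN2A", pathway_edges, drug_targets))
```

A fuller treatment would need to follow paths of more than one step and to distinguish activating from inactivating alterations; both are left out of this sketch.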

3. Network Analysis Methods for Inferring Causality in Signaling Networks (Sander, 0.62 FTE: Ben Gross)

The goal of our second TRD project is to develop network analysis tools that algorithmically infer causality within signaling networks and to make these tools available. High-throughput screens conducted with libraries of small molecules or inhibitory RNAs have the ability to identify compounds that induce tumor suppressive responses in cancer cells [12]. While the effects of such perturbations can be easily linked to transcriptional changes, identifying the causal mechanism remains a major challenge. In collaboration with Somwar and colleagues [13], we used a computational approach to predict the target of a small molecule that reduces growth in lung adenocarcinoma cell lines. Experimental follow-up confirmed the prediction. Building on this concept, we have been working on computational approaches to model the causal signaling cascades that induce the observed transcriptional changes within perturbed cancer cell lines. We have been exploring the use of optimization algorithms adapted from statistical physics to identify the minimal set of interactions able to connect genes that are differentially expressed after a perturbation with candidate targets of the same perturbation (Stage 2). This initial approach relied on an algorithm that solves the Steiner tree problem. Given a set of “terminal” nodes, the Steiner tree is defined as the tree of minimum weight connecting these terminals, allowing the inclusion of additional nodes. Genes that are differentially expressed after a perturbation and/or candidate targets of the same perturbation can be used as terminals. Our prediction was that the resulting Steiner tree would contain both gene interactions able to explain the observed transcriptional changes and the putative target of the perturbation. Within this past year, we determined that this approach does not work as well as expected, and we are now exploring a new algorithmic framework that combines Gaussian graphical models with maximum entropy methods.
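To illustrate the Steiner tree formulation, the Python sketch below connects a set of terminal nodes in a small weighted graph using a simple shortest-path heuristic: starting from one terminal, it repeatedly attaches the closest remaining terminal to the growing tree. This is a generic approximation chosen for brevity, not the statistical-physics optimization referred to above, and it does not guarantee a minimum-weight tree. The node labels, edge weights, and function names are hypothetical, and edge weights are assumed to be non-negative.

```python
import heapq


def add_edge(graph, u, v, weight):
    """Add an undirected weighted edge to an adjacency-dict graph."""
    graph.setdefault(u, {})[v] = weight
    graph.setdefault(v, {})[u] = weight


def steiner_tree_heuristic(graph, terminals):
    """Approximate a Steiner tree; returns a set of frozenset({u, v}) edges."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:]) - tree_nodes
    while remaining:
        # Multi-source Dijkstra: every node already in the tree starts at 0.
        dist = {n: 0.0 for n in tree_nodes}
        prev = {}
        heap = [(0.0, n) for n in tree_nodes]
        heapq.heapify(heap)
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            if u in remaining:
                target = u  # closest terminal not yet in the tree
                break
            for v, w in graph[u].items():
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    prev[v] = u
                    heapq.heappush(heap, (d + w, v))
        if target is None:
            raise ValueError("some terminals are not connected")
        # Walk back along the shortest path until the existing tree is reached.
        node = target
        while node not in tree_nodes:
            tree_edges.add(frozenset((node, prev[node])))
            tree_nodes.add(node)
            node = prev[node]
        remaining.discard(target)
    return tree_edges


# Toy network: A, D and E play the role of terminals (e.g. differentially
# expressed genes or candidate targets); B and C are intermediate nodes.
graph = {}
for u, v, w in [("A", "B", 1), ("B", "C", 1), ("C", "D", 1),
                ("A", "D", 5), ("B", "E", 1), ("E", "F", 4)]:
    add_edge(graph, u, v, w)

tree = steiner_tree_heuristic(graph, ["A", "D", "E"])
print(sorted(tuple(sorted(edge)) for edge in tree))
```

In this toy network the heuristic recruits the non-terminal nodes B and C to link the three terminals, which is the behaviour the approach relies on: intermediate genes that were not measured as changed can still appear in the explanatory tree.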

New Driving Biological Projects

A new biological driver for deriving causality networks is inferring causal relationships within and between data types, such as mutations and copy number changes in cancer genomics. For example, we would like to investigate the relationship between mutations in the TP53 tumor suppressor and the complex copy number profile in ovarian cancer. Another example is the exploration of causal relationships between gene mutations: mutations in the POLE gene, for instance, lead to a characteristic spectrum of mutations in other genes. We have preliminary results and plan to develop a network analysis approach to identify causal relationships. We are also considering looking at interactions between microbial subpopulations, starting with the gut microbiome, where a set of interacting bacterial populations changes under fluctuating constraints provided by the host and nutrient intake. Recent work has shown that the precise composition and evolution of this population is closely coupled to the state of health of the host. Certain deviations from equilibrium present a significant risk of invasion by pathogenic bacteria, as seen with some cancer patients receiving bone marrow transplantations [23]. A more detailed understanding of the relationships between gut microbial subpopulations following such aggressive treatments in the host could inform therapeutic development leading to improved outcomes.

Our Related Publications

• Gao J, et al. Integrative Analysis of Complex Cancer Genomics Profiles using the cBioPortal. Science Signaling Protocol (in press).


• Molinelli* E, Korkut* A, Wang* W, Miller M, Gauthier N, Jing X, Kaushik P, et al. Perturbation Biology: inferring signaling networks in cellular systems. PLoS Comp Bio (in review).

• Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012; 490(7418):61-70.

• Cerami E, et al, The cBio Cancer Genomics Portal: An open platform for exploring multi-dimensional cancer genomics data. Cancer Discovery. May 2012, 2:401.

• The Cancer Genome Atlas Network, Comprehensive Molecular Characterization of Human Colon and Rectal Cancer. Nature 2012; 487(7407):330-337.

• The Cancer Genome Atlas Network, Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012; 489:519-525.

References

1. Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD: Cytoscape Web: an interactive web-based network browser. Bioinformatics 2010, 26(18):2347-2348.
2. TCGA: Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008, 455(7216):1061-1068.
3. TCGA: Integrated genomic analyses of ovarian carcinoma. Nature 2011, 474(7353):609-615.
4. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A et al: Human Protein Reference Database--2009 update. Nucleic acids research 2009, 37(Database issue):D767-772.
5. Matthews L, Gopinath G, Gillespie M, Caudy M, Croft D, de Bono B, Garapati P, Hemish J, Hermjakob H, Jassal B et al: Reactome knowledgebase of human biological pathways and processes. Nucleic acids research 2009, 37(Database issue):D619-622.
6. Schaefer CF, Anthony K, Krupa S, Buchoff J, Day M, Hannay T, Buetow KH: PID: the Pathway Interaction Database. Nucleic acids research 2009, 37(Database issue):D674-679.
7. Cerami EG, Gross BE, Demir E, Rodchenkov I, Babur O, Anwar N, Schultz N, Bader GD, Sander C: Pathway Commons, a web resource for biological pathway data. Nucleic acids research 2011, 39(Database issue):D685-690.
8. Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V et al: DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic acids research 2011, 39(Database issue):D1035-1041.
9. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic acids research 1999, 27(1):29-34.
10. Rask-Andersen M, Almen MS, Schioth HB: Trends in the exploitation of novel drug targets. Nature Reviews Drug Discovery 2011, 10:579-590.
11. Raymond E, Faivre S, Armand JP: Epidermal growth factor receptor tyrosine kinase as a target for anticancer therapy. Drugs 2000, 60:41-42.
12. Somwar R, Shum D, Djaballah H, Varmus H: Identification and preliminary characterization of novel small molecules that inhibit growth of human lung adenocarcinoma cells. Journal of biomolecular screening 2009, 14(10):1176-1184.
13. Somwar R, Erdjument-Bromage H, Larsson E, Shum D, Lockwood WW, Yang G, Sander C, Ouerfelli O, Tempst PJ, Djaballah H et al: Superoxide dismutase 1 (SOD1) is a target for a small molecule identified in a screen for inhibitors of the growth of lung adenocarcinoma cell lines. Proceedings of the National Academy of Sciences of the United States of America 2011, 108(39):16375-16380.
14. Stratton MR, Campbell PJ, Futreal PA: The cancer genome. Nature 2009, 458(7239):719-724.
15. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70.
16. Hanahan D, Weinberg RA: Hallmarks of cancer: the next generation. Cell 2011, 144(5):646-674.
17. Ciriello G, Cerami E, Sander C, Schultz N: Mutual exclusivity analysis identifies oncogenic network modules. Genome research 2012, 22(2):398-406.
18. Vaske CJ, Benz SC, Sanborn JZ, Earl D, Szeto C, Zhu J, Haussler D, Stuart JM: Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 2010, 26(12):i237-245.
19. Vandin F, Upfal E, Raphael BJ: Algorithms for detecting significantly mutated pathways in cancer. Journal of computational biology 2011, 18(3):507-522.
20. Turner N, Tutt A, Ashworth A: Hallmarks of 'BRCAness' in sporadic cancers. Nat Rev Cancer 2004, 4(10):814-819.
21. Storrs C: Combing the Cancer Genome. The Scientist 2012, Mar.
22. Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, Arora VK, Kaushik P, Cerami E, Reva B et al: Integrative genomic profiling of human prostate cancer. Cancer cell 2010, 18(1):11-22.
23. Jenq RR, Ubeda C, Taur Y, Menezes CC, Khanin R, Dudakov JA, Liu C, et al: Regulation of intestinal inflammation by microbiota following allogeneic bone marrow transplantation. J Exp Med 2012, 209(5):903-911.

4. Using Cytoscape for Social Network Research (Fowler, 0.2 FTE: James Fowler)

In addition to the Network Correlation plugin developed in collaboration with Alex Pico's group last year, we have now also used Cytoscape to study the network of interactions in the olfactory system for a manuscript on “Friendship and Natural Selection” (in review) (Stage 1). A theory paper predicting what we observe here was published last year [1]. The target audience for this work is other social network scholars and people interested in applying these techniques to social network data. In our friendship and natural selection paper we show that friends are more genetically related than strangers, at roughly the level of fourth cousins. Any project that takes population structure into account might therefore also consider structure induced by friendship. If gene characteristics are available in Cytoscape, we could also apply the Network Correlation plugin to see how far these characteristics tend to correlate in gene-gene interaction networks (Stage 2-4). In terms of Cytoscape integration with our new work, it would be valuable to have a database of natural selection scores available for each gene in human studies, so scholars could easily visualize which parts of their network are under recent natural selection. I have Pardis Sabeti’s Composite of Multiple Signals scores for about 3 million SNPs. It would also be useful to have tools for translating from SNPs to genes readily available within Cytoscape. We will work with the Cytoscape team to make natural selection data available and to implement methods for its visualization. This work might relate to the visualization TRDs by the Bader and Pico groups within NRNB.

Social Networks and Health

We originally proposed using trend motifs as a new statistical method to investigate "Social Networks and Disease". We are no longer working on this because we have come to believe there are other methods that are more suitable. So we are now at Stage 1 in our work on "Social Networks and Health", which has already led to a number of publications:


• Strully KW, Fowler JH, Murabito J, Benjamin EJ, Levy D, Christakis NA. Aspirin Use and Cardiovascular Events in Social Networks, Social Science & Medicine 74 (7), 1125–1129 (March 2012)

• O'Malley J, Arbesman S, Steiger DM, Fowler JH, Christakis NA. Egocentric Social Network Structure, Health, and Pro-Social Behaviors in a National Panel Study of Americans, PLoS ONE 7(5): e36250 (May 2012)

• Christakis NA, Fowler JH. Social Contagion Theory: Examining Dynamic Social Networks and Human Behavior. Statistics in Medicine 32 (4): 556–577 (February 2013)

• Shakya HB, Christakis NA, Fowler JH. Parental Influence on Substance Use in Adolescent Social Networks. Archives of Pediatrics & Adolescent Medicine 166 (12): 1132-1139 (December 2012)

• Rudolph AE, Crawford ND, Latkin C, Fowler JH, Fuller CM. Individual and Neighborhood Correlates of Membership in Drug Using Networks with a Higher Prevalence of HIV in New York City (2006-2009), Annals of Epidemiology, forthcoming

The target audience for this work includes scholars in public health. I will be teaching a class on networks and we will use existing tools in Cytoscape there that contribute to Stage 4 (broad adoption) for this approach. There are also some non-health-related projects that will be precursors to a new project in which we will match death records to the Facebook data to ascertain social network correlates of health:

• Jones JJ, Settle JE, Bond RM, Fariss CJ, Marlow C, Fowler JH. Inferring Tie Strength from Online Directed Behavior. PLoS ONE 8 (2): e52168 (February 2013)

• Jones JJ, Bond RM, Fariss CJ, Settle JE, Kramer ADI, Marlow C, Fowler JH. Yahtzee: An Anonymized Group Level Matching Procedure. PLoS ONE 8 (2): e55760 (February 2013)

• Bond RM, Fariss CJ, Jones JJ, Kramer ADI, Marlow C, Settle JE, Fowler JH. A 61-Million-Person Experiment in Social Influence and Political Mobilization. Nature 489: 295–298 (13 September 2012)

This work might be ideally suited for a supplement grant. We would use the Facebook and death data to predict longevity and health factors that influence it (like MI). Next, we would develop and disseminate a Facebook App that anyone can download that will give them health stats based on their data. We could then use Cytoscape to show people their networks and the health risks of their friends and friends’ friends (Stage 4).

References

1. Fu F, Nowak MA, Christakis NA, Fowler JH. The Evolution of Homophily. Scientific Reports 2: 845 (13 November 2012)

5. Cytoscape 3.0 and CytoscapeWeb for the Visualization and Representation of Biological Networks (Bader, 0.91FTE: Christian Lopes, Jason Montojo, Igor Rodchenkov)

Technologies developed with NRNB funds

Our goal is to develop new technologies for visualization and representation of biological networks. Our grant aims are:

Aim 1. Simplifying network views by hierarchically organizing networks and their modules.

Aim 2. Showing only the information needed across multiple levels of detail (semantic zooming) and data sources (information layering).

Aim 3. Simultaneously viewing network attributes collected from many biological experiments.

Our major activity over the past year has continued to be ensuring that Cytoscape 3.0 supports the advanced visualization and representation features that we proposed in the NRNB grant, both in system design and performance. This has required major effort on releasing Cytoscape 3.0 and making sure all of the application programming interfaces (APIs) in that system will support our aims. Cytoscape 3.0 has now been released and we are focusing on developing new features. Over the past year, we implemented support for multiple renderers so app authors can contribute their own visualizations. We helped integrate the Metanodes plugin into Cytoscape 3 as CyGroups in collaboration with Scooter Morris at UCSF, which enables hierarchical views. We developed technology to integrate OpenGL into Cytoscape, which required changes to the code building process. This will support faster new visualizations in the future that take advantage of current graphics card functionality. We also mentored a Google Summer of Code (GSoC) project for visualizing dynamic networks (e.g. networks that vary over time).

We have also ported the CytoscapeWeb visualization tool, which we originally developed for the GeneMANIA project, to use open web standards such as HTML5 (Stage 3). This software is now available at github.com/cytoscape/cytoscape.js.
This involved testing various HTML5 canvas and WebGL based libraries for rendering networks within the browser, such as Three.js, KineticJS and EaselJS; performance testing of WebGL used directly for drawing networks; performance testing of HTML5 canvas used directly; and finally implementing a complete HTML5 canvas based renderer that is fully compatible with touch devices, like the Apple iPad. This technology will improve accessibility of Cytoscape visualizations by supporting any computing device that runs a modern web browser (many of which cannot run the Cytoscape Java application). As more groups develop web-based network visualization and analysis functionality, we expect extensive innovation in user interfaces and visualization techniques to occur on this platform.

We are also continuing to lay the groundwork for representation and visualization of detailed biological pathway information in Cytoscape 3. We have completed the following activities in this area (all at Stage 3):

• Developed a new CyPath2 Cytoscape 3.0 app, which can load biological pathway data from the Pathway Commons web service in BioPAX level 3 format.

• Updated BioPAX and Pathway Commons core plugins/apps for Cy2 and Cy3.

Ensuring Cytoscape 3 will enable our stated aims has required tremendous effort, in that we have needed to implement a number of prototype features to test that our API designs are robust. We expect this work will pay off in 2013 now that we have finally released Cytoscape 3 and have started working on novel visualization features in earnest.


We continue to maintain our highly successful Enrichment Map visualization plugin for Cytoscape 2.8, responding to frequent requests by users for new features (Stage 4 of development). This visualization tool is heavily used in all of our collaborations with local biology groups (see collaboration section) and by others (the two papers describing the method have garnered over 90 citations since 2010). We have recently established a collaboration with Jill Mesirov’s group at the Broad Institute, MIT to integrate Enrichment Map into their popular Gene Set Enrichment Analysis (GSEA) software. This work, funded by a separate NIH R01 grant, is now complete and will be released in the coming months.

Over the past year, we have developed strong ties with a new driving biological project: the Cancer Stem Cell program at the Ontario Institute for Cancer Research (OICR) in Toronto, led by John Dick, the discoverer of cancer stem cells (http://oicr.on.ca/oicr-programs-and-platforms/innovation-programs/cancer-stem-cells). Fifteen labs participate in this program, most of which run genomics experiments and are very interested in pathway analysis to help interpret their results. As such, they have funded two full-time research associates in my lab: Veronique Voisin, who does pathway analysis using Enrichment Map, and Shaheena Bashir, a biostatistician who processes raw genomics data to prepare it for pathway analysis. This has led to four published projects in the past year (see below).

Our Related Publications

• Lechman ER, Gentner B, van Galen P, Giustacchini A, Saini M, Boccalatte FE, Hiramatsu H, Restuccia U, Bachi A, Voisin V, Bader GD, Dick JE, Naldini L. Attenuation of miR-126 activity expands HSC in vivo without exhaustion. Cell Stem Cell. 2012 Dec 7;11(6):799-811. doi: 10.1016/j.stem.2012.09.001. Epub 2012 Nov 8. PubMed PMID: 23142521; PubMed Central PMCID: PMC3517970.

• Labbé RM, Irimia M, Currie KW, Lin A, Zhu SJ, Brown DD, Ross EJ, Voisin V, Bader GD, Blencowe BJ, Pearson BJ. A comparative transcriptomic analysis reveals conserved features of stem cell pluripotency in planarians and mammals. Stem Cells. 2012 Aug;30(8):1734-45. doi: 10.1002/stem.1144. PubMed PMID: 22696458.

• Liu JC, Voisin V, Bader GD, Deng T, Pusztai L, Symmans WF, Esteva FJ, Egan SE, Zacksenhaus E. Seventeen-gene signature from enriched Her2/Neu mammary tumor-initiating cells predicts clinical outcome for human HER2+:ERα- breast cancer. Proc Natl Acad Sci U S A. 2012 Apr 10;109(15):5832-7. doi:10.1073/pnas.1201105109. Epub 2012 Mar 28. PubMed PMID: 22460789; PubMed Central PMCID: PMC3326451.

• Bozhena Jhas, Shrivani Sriskanthadevan, Marko Skrtic, Mahadeo A. Sukhai, Veronique Voisin, Yulia Jitkova, Marcela Gronda, Rose Hurren, Rob C. Laister, Gary D. Bader, Mark D. Minden and Aaron D. Schimmer, Metabolic adaptation to chronic inhibition of mitochondrial protein synthesis in acute myeloid leukemia cells PLOS ONE, Accepted Feb.4.2013 PMID:23520503 PMCID:PMC3592803.

6. Analyzing Complex Networks Using Ontologies and Cytoscape 3.0 (Pico, 0.25FTE: Alex Pico, Scooter Morris, Kristina Hanspers)

Making use of the vast wealth of data and knowledge to elucidate the function of biological networks is one of the biggest challenges in bioinformatics. The development of high-throughput technology has given rise to an enormous increase of data on biomolecular expression and interactions, which results in protein interaction networks, gene regulatory networks, signaling networks and metabolic networks. One approach to understanding networks relies on ontologies, such as Gene Ontology, and enrichment analysis. Common approaches, such as BiNGO [1], regard a network as a list of genes, performing gene-level annotation and enrichment methods. However, the same list of genes with different interactions may perform different functions. Thus, the functional analysis of networks should take network topology into consideration.

Following up on last year's work on the Mosaic app for the partitioning and visualization of complex networks using Gene Ontology [2], we focused on the development of the Network Ontology Analysis (NOA) app for Cytoscape. The NOA app implements the NOA algorithm [3] for network-based enrichment analysis, which extends Gene Ontology annotations to network links, or edges (Stage 2). First, NOA assigns ontology terms to interactions based on the known annotations of connected genes by optimizing two novel indexes, ‘Coverage’ and ‘Diversity’. Then, NOA generates two alternative reference sets to statistically rank the enriched functional terms for a given biological network. NOA was shown to be more efficient not only for dynamic regulatory networks but also for rewired protein interaction networks. However, there are several shortcomings with the initial implementation: no graphical interface or visualization of results; limited support for species, gene names and ontology types; no batch mode for large-scale computation; and no integration or interoperability with other network tools.
To overcome these shortcomings, we have reimplemented the NOA algorithm as a Cytoscape app and added interfaces, extensible ontology and identifier support, new visualizations, a batch mode, and interoperability with other Cytoscape apps, such as CyThesaurus and Mosaic (Stage 3). The NOA app facilitates the annotation and analysis of one or more networks in Cytoscape according to user-defined parameters. In addition to tables, the NOA app presents results in the form of heatmaps and overview networks in Cytoscape, which can be exported for publication figures. A manuscript on this work is currently in review (accepted with minor revision). The publication and release of this tool via the Cytoscape App Store brings the project to Stage 4.
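To make the difference between gene-level and edge-level enrichment concrete, here is a minimal sketch of an edge-based over-representation test. It is not the NOA algorithm: the ‘Coverage’ and ‘Diversity’ optimization is not reproduced, and the rule that an edge inherits the terms shared by both of its genes is an assumption made only for this example, as are the toy genes, terms, and function names. Test edges are assumed to be a subset of the reference edges.

```python
from math import comb


def edge_terms(edge, gene_terms):
    """Assumed rule: an edge carries the terms shared by both of its genes."""
    a, b = edge
    return gene_terms.get(a, set()) & gene_terms.get(b, set())


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n edges are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def count_terms(edges, gene_terms):
    """Count how many edges carry each term."""
    counts = {}
    for edge in edges:
        for term in edge_terms(edge, gene_terms):
            counts[term] = counts.get(term, 0) + 1
    return counts


def edge_enrichment(test_edges, reference_edges, gene_terms):
    """Rank terms by over-representation among test edges (no correction)."""
    ref_counts = count_terms(reference_edges, gene_terms)
    test_counts = count_terms(test_edges, gene_terms)
    N, n = len(reference_edges), len(test_edges)
    scored = [
        (term, hypergeom_tail(k, n, ref_counts[term], N))
        for term, k in test_counts.items()
    ]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical annotations and networks, for illustration only.
gene_terms = {
    "g1": {"apoptosis"}, "g2": {"apoptosis"},
    "g3": {"apoptosis", "metabolism"},
    "g4": {"metabolism"}, "g5": {"metabolism"}, "g6": {"metabolism"},
}
reference_edges = [
    ("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g3", "g4"),
    ("g4", "g5"), ("g5", "g6"), ("g4", "g6"), ("g1", "g6"),
]
test_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"), ("g3", "g4")]
ranked = edge_enrichment(test_edges, reference_edges, gene_terms)
```

Because the test operates on edges, two networks built from the same genes but wired differently can yield different rankings, which is the motivation given above for taking topology into account.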

Application

As a proof-of-principle example, we prepared a file containing all the human pathways curated at WikiPathways and translated Ensembl identifiers for batch mode analysis by the NOA plugin. We ran edge-based analysis to make use of the interaction content in the pathways, rather than just treating them like disconnected gene sets. Each pathway was analyzed against the entire collection as reference and no correction was applied. As expected, curated pathways of known function were significantly annotated with relevant GO terms (Fig. 1). Users could add novel pathways or uncharacterized interaction networks to this batch analysis to assess functional annotation of edges in the context of known biology.


Fig. 1. A heatmap of GO-annotated human pathways based on batch mode NOA plugin analysis. Axes are labeled in inset, highlighting apoptosis-related pathways and GO terms.

Cytoscape 3.0

By adding Dr. Scooter Morris to the Gladstone subcontract this year, we gained not only a valuable Roving Engineer for training and outreach (see Administrative Information: Training and Outreach), but also a key contributor to Cytoscape 3.0 design and development. During this reporting period, Dr. Morris worked on porting to 3.0 key plugins that other apps depend on, as well as many core features, including context menus, Tunables, and custom graphics. As part of this work, he met with Bader lab members at the University of Toronto to work on improving the user interface for the VizMapper and extending Metanodes. He also improved the standardization of the Cytoscape context menus and the aesthetics of Cytoscape dialogs by enhancing the Tunables interface.

The focus of the Cytoscape 3 effort during the latter half of this reporting period has been the 3.0.1 release. This is primarily a "bug-fix" release that follows on the heels of the initial release of Cytoscape 3, and is intended to be the "solid" release we want to advocate for new users. In addition to several general fixes, Dr. Morris' work has addressed several issues with graphical annotations and groups that are important for wider use. We also began the design of an API to export the functionality of graphical annotations to other apps, in particular apps such as the GPML App, which will use these features to render pathway diagrams imported from the WikiPathways resource, another NRNB-supported service.

References

1. Maere, S., Heymans, K. and Kuiper, M. (2005) BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks, Bioinformatics, 21, 3448-3449.

2. Zhang, C., et al. (2012) Mosaic: making biological sense of complex networks, Bioinformatics, 28, 1943-1944.

3. Wang, J., et al. (2009) Disease-aging network reveals significant roles of aging genes in connecting genetic diseases, PLoS Comput Biol, 5, e1000521.


7. The CYNI Modular Network Inference Framework (Schwikowski, 1.08FTE: Oriol Guitart)

The objectives of this TRD work are to (1) create 'fill-in-the-algorithm' infrastructure for network inference and (2) make methods accessible to biologists within the Cytoscape framework. Our approach is to create a software infrastructure (CYNI) for network inference algorithms (Fig 1).

During this reporting period, we made progress on the following fronts:

• Working CYNI App

• Documented API

• User and CYNI App writer documentation

• Additional CYNI functionality: Discretization and Imputation

• Implemented and documented downloadable “Hello World” examples

• CYNI App presented at two French network biology meetings

• CYNI App and comprehensive documentation publicly available

The project is now live at http://www.proteomics.fr/Sysbio/CyniProject and is also available at the Cytoscape App Store. Data imputation and discretization techniques are provided along with several known inference algorithms to make this tool fully operational for any kind of inference requirement. While data imputation and discretization techniques allow you to modify Cytoscape tables, network inference algorithms produce a new network after applying the chosen technique.

Cyni Toolbox is not only an application that produces results; it also provides a framework that helps other app developers implement other algorithms. The goal of the Cyni framework is to provide standardized, extensible and configurable default solutions for all components other than the core algorithms, such as the GUI, parameter handling, and configuration of distance/similarity measures. Developers can thereby focus on their core expertise instead of spending significant effort constructing software components foreign to it. Once you are familiar with the principles of Cytoscape Apps, you will notice that there is a new way to develop apps: building an app that depends on another app that provides an API. This option opens up new possibilities but at the same time requires app developers to be more rigorous in their development. Cyni provides two sets of elements, grouped into two types called Algorithms and Metrics, and also allows other app developers to create new Algorithms and Metrics through the Cyni framework. To help them do so, we offer tutorials showing how a new algorithm or metric can be easily developed through Cyni. So far, this has been successful.

CYNI was presented at the French NETBIO network inference community (Paris, Nov. '12) and at the Prospectom workshop (Grenoble, Dec. '12). We received this feedback:

"I was very happy to hear that Cytoscape is going inferential :). And that members of our community are interested in collaborating with your to make their tools available in the form of network inference Apps!"

We also helped implement the first "non-native" CYNI plug-in, which tested and evolved our API, UI elements and documentation. This was published this year as a joint Bioinformatics Application Note [1]. There are other future CYNI Apps in the works from the d'Alche-Buc lab [2]. Overall, we aim to drive the evolution from network inference to iterative network modeling.
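The division of labor described in this section, where imputation and discretization transform a table while an inference algorithm produces a new network, can be sketched in a few lines. This is not the CYNI API: the function names, the mean-imputation and equal-width binning choices, the correlation threshold, and the toy expression table are all assumptions made for illustration.

```python
from math import sqrt


def impute_mean(table):
    """Return a new table where missing values (None) become the row mean."""
    filled = {}
    for name, values in table.items():
        observed = [v for v in values if v is not None]
        mean = sum(observed) / len(observed)
        filled[name] = [mean if v is None else v for v in values]
    return filled


def discretize(table, bins=3):
    """Return a new table with each row binned into levels 0..bins-1."""
    levels = {}
    for name, values in table.items():
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant rows fall into level 0
        levels[name] = [min(int((v - lo) / width), bins - 1) for v in values]
    return levels


def pearson(xs, ys):
    """Pearson correlation, used here as a pluggable similarity metric."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def infer_network(table, metric=pearson, threshold=0.8):
    """Return edges (a, b, score) for row pairs whose |score| >= threshold."""
    names = sorted(table)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = metric(table[a], table[b])
            if score is not None and abs(score) >= threshold:
                edges.append((a, b, score))
    return edges


# Hypothetical expression table: rows are genes, columns are samples.
table = {
    "geneA": [1.0, 2.0, None, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [5.0, 1.0, 4.0, 2.0],
}
imputed = impute_mean(table)    # table in, table out
edges = infer_network(imputed)  # table in, network out
```

Passing the metric as a parameter loosely mirrors the separation between Algorithms and Metrics: a different similarity measure can be swapped in without touching the inference loop.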

References

1. Céline Brouard, Florence d'Alché-Buc, Marie Szafranski: Semi-supervised Penalized Output Kernel Regression for Link Prediction. ICML 2011: 593-600

2. Quach, M., Brunel, N. and d'Alche Buc, F., (2007), Estimating parameters and hidden variables in non-linear state-space models based on ODEs for biological networks inference, Bioinformatics, 23, 23, pp. 3209–3216.

Cytoscape Europe 2013

Our group is also in the early stages of planning a European Cytoscape Workshop and Symposium in 2013. We will report on this in the next APR, but planning has already begun during this reporting period. The audience will include new and existing users and app developers, as well as researchers interested in network biology. Network biology, and interest in Cytoscape in particular, are quite active in Europe, so we anticipate a large response.


II. Collaboration and Service Projects: Progress (1.38FTE: Alex Pico, Rintaro Saito, Kristina Hanspers)

In addition to the direct impact of our TRD projects on our research, NRNB also impacts new science through our many Collaboration and Service Projects (CSPs). A description of each CSP is provided in the bulk of the report. Here, we summarize the scope of our collaborations and provide a new 4-Stage model and illustration to convey the range of our efforts as well as progress from year to year. Major service projects are also described in this section.

4-Stage Model of Technology Development

Table 1. Pairs of columns are used to represent early and late phases for each of 4 major stages of technology development: (1) Identify problem, (2) Proof of concept, (3) Implement solution, and (4) Broad adoption. Each row represents a TRD or Collaboration currently in development through NRNB. The black-filled cells in the table mark the stages of development led by or facilitated by NRNB members and resources. For example, the top row (Suppl-1) is for the supplement project for the Cytoscape App Store, which was originated, developed and recently released by the NRNB team (Stage 1-4). Further down is a row for TRD-C1 by the Bader group, which started at the end of the proof-of-concept stage and has completed an implementation (Stage 2-3). We also have collaborations, like the second-to-last row (CSP-22, Fowler), which are exploring new questions and defining ways to apply network technologies to new fields (Stage 1). Overall, our projects happen to be well distributed across the full range of this model. We anticipate seeing more projects grow in length as they progress, while continuing to bring in new projects, starting out at various stages.

8. Collaboration Landscape

During this reporting period, we augmented our existing collaboration processing system with an easy-to-use web form for requesting, collecting, and visualizing progress reports for every collaboration with NRNB. Each of the 5 NRNB sites has a designated Collaboration Contact who is responsible for managing collaboration requests and updates. It is still easy to start a new collaboration by clicking on the ‘Collaborate’ button found throughout our website, which leads to a simple web-based form that is automatically logged in our Collaboration Tracker spreadsheet. Entries are assessed according to the availability and interest of each group. If accepted, they are marked for entry into our annual reporting system. If not accepted, they are marked as rejected but still recorded for reporting purposes. We now have an administration page that allows us to request updates for any active collaboration. Filling out a simple web form populates the update fields in the Collaboration Tracker. A simple script can be run at any time to generate the table above (Table 1) based on these updates.

Numerous potential collaborators also independently find the collaboration hooks on our website, such as the mentoring programs, which bring in the largest numbers and some of the most diverse and productive collaborations (see below). At the end of year one, we had established close to 40 collaborations. During the course of the second and third years, we have maintained 80-100 collaborations. These range from the application of Cytoscape as a research tool for network analysis and visualization, to the development of Cytoscape plugins for custom data types and analyses, to the development and application of other network and pathway tools and resources for network biology.
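The table-generation script mentioned in this section is not reproduced in this report. As a hedged sketch of how tracker updates could be turned into rows of the 4-Stage table, the example below maps each project's first and last (stage, phase) to a run of filled cells across eight columns. The record layout, the function names, and the exact early/late phases assigned to the three example projects are assumptions; only the project labels and stage ranges come from the Table 1 description.

```python
STAGES = [
    "Identify problem",
    "Proof of concept",
    "Implement solution",
    "Broad adoption",
]


def cell_index(stage, phase):
    """Map (stage 1-4, 'early' or 'late') to a column index 0-7."""
    if stage not in range(1, len(STAGES) + 1) or phase not in ("early", "late"):
        raise ValueError(f"unknown stage/phase: {stage!r}, {phase!r}")
    return (stage - 1) * 2 + (0 if phase == "early" else 1)


def render_row(project, start, end):
    """Render one table row; '#' marks a filled cell and '.' an empty one."""
    first, last = cell_index(*start), cell_index(*end)
    cells = ["#" if first <= i <= last else "." for i in range(2 * len(STAGES))]
    return f"{project:<8} " + " ".join(cells)


# Hypothetical tracker updates: (project, first filled cell, last filled cell).
updates = [
    ("Suppl-1", (1, "early"), (4, "late")),  # Stage 1-4
    ("TRD-C1", (2, "late"), (3, "late")),    # end of Stage 2 through Stage 3
    ("CSP-22", (1, "early"), (1, "late")),   # Stage 1 (phases assumed)
]
table_rows = [render_row(*update) for update in updates]

if __name__ == "__main__":
    print("\n".join(table_rows))
```

Keeping the stage names in one list means a change to the model only requires editing `STAGES`; the column count follows from it.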
Applications of Network Biology

In this category, we are enabling a wide range of medical research applications including the study of Lung Cancer, Breast Cancer, Eating Disorders, Glaucoma, Heart Disease, Leukemia, Glioma, Prostate Cancer, Endometrial Cancer, Colorectal Cancer, Malaria, Fatty Liver Disease, and Diabetes [1-9]. Through NRNB collaborations, Cytoscape is also being applied to the study of the mechanisms underlying adult stem cells, tumorigenesis, hepatocytes, ES cell transcription, chromatin remodeling, pediatric brain tumors, chemogenomics, cell-cell interactions, cardiomyopathy, social networks, kinase interactome, oxidative stress response, DNA damage and repair, cancer stem cells, wound healing, immune system, and secretory vesicle architecture [11-15].

Development of Network Biology Tools and Resources

It is a testament to the extensible model of Cytoscape, and to our outreach efforts to provide training and documentation to developers, that we get an equal number of collaboration requests for developing new Cytoscape features. These features can in turn be applied not only to our immediate collaborators’ research, but more broadly by the Cytoscape user community. This is a very gratifying virtuous cycle that NRNB is specifically enabling and amplifying. In this category, we have established collaborations to develop plugins and apps that connect with public databases to access and load interactions and annotations, and that provide new types of data visualizations and analyses [16]. We also have collaborations to develop additional network biology tools and resources beyond just Cytoscape, including PSICQUIC and PSISCORE, MedSavant, PathVisio, WikiPathways, cBio Portal, SBGN, GeneMANIA, Basysbio, Cancer Gene Encyclopaedia, and VisANT [17].

References

1. Somwar R, Erdjument-Bromage H, Larsson E, Shum D, Lockwood WW, Yang G, Sander C, Ouerfelli O, Tempst PJ, Djaballah H, Varmus HE. Superoxide dismutase 1 (SOD1) is a target for a small molecule identified in a screen for inhibitors of the growth of lung adenocarcinoma cell lines. Proc Natl Acad Sci U S A. 2011 Sep 27;108(39):16375-80. doi: 10.1073/pnas.1113554108. Epub 2011 Sep 19. PubMed PMID: 21930909; PubMed Central PMCID: PMC3182729.

2. Liu JC, Voisin V, Bader GD, Deng T, Pusztai L, Symmans WF, Esteva FJ, Egan SE, Zacksenhaus E. Seventeen-gene signature from enriched Her2/Neu mammary tumor-initiating cells predicts clinical outcome for human HER2+:ERα- breast cancer. Proc Natl Acad Sci U S A. 2012 Apr 10;109(15):5832-7. doi: 10.1073/pnas.1201105109. Epub 2012 Mar 28. PubMed PMID: 22460789; PubMed Central PMCID: PMC3326451.

3. Isserlin R, Merico D, Alikhani-Koupaei R, Gramolini A, Bader GD, Emili A. Pathway analysis of dilated cardiomyopathy using global proteomic profiling and enrichment maps. Proteomics. 2010 Mar;10(6):1316-27. doi: 10.1002/pmic.200900412. PubMed PMID: 20127684; PubMed Central PMCID: PMC2879143.

4. Jenkins RB, Xiao Y, Sicotte H, Decker PA, Kollmeyer TM, Hansen HM, Kosel ML, Zheng S, Walsh KM, Rice T, Bracci P, McCoy LS, Smirnov I, Patoka JS, Hsuang G, Wiemels JL, Tihan T, Pico AR, Prados MD, Chang SM, Berger MS, Caron AA, Fink SR, Halder C, Rynearson AL, Fridley BL, Buckner JC, O'Neill BP, Giannini C, Lachance DH, Wiencke JK, Eckel-Passow JE, Wrensch MR. A low-frequency variant at 8q24.21 is strongly associated with risk of oligodendroglial tumors and astrocytomas with IDH1 or IDH2 mutation. Nat Genet. 2012 Oct;44(10):1122-5. doi: 10.1038/ng.2388. Epub 2012 Aug 26. PubMed PMID: 22922872; PubMed Central PMCID: PMC3600846.

5. Taylor BS, Schultz N, Hieronymus H, Gopalan A, Xiao Y, Carver BS, Arora VK, Kaushik P, Cerami E, Reva B, Antipin Y, Mitsiades N, Landers T, Dolgalev I, Major JE, Wilson M, Socci ND, Lash AE, Heguy A, Eastham JA, Scher HI, Reuter VE, Scardino PT, Sander C, Sawyers CL, Gerald WL. Integrative genomic profiling of human prostate cancer. Cancer Cell. 2010 Jul 13;18(1):11-22. doi: 10.1016/j.ccr.2010.05.026. Epub 2010 Jun 24. PubMed PMID: 20579941; PubMed Central PMCID: PMC3198787.

6. Liu JC, Voisin V, Bader GD, Deng T, Pusztai L, Symmans WF, Esteva FJ, Egan SE, Zacksenhaus E. Seventeen-gene signature from enriched Her2/Neu mammary tumor-initiating cells predicts clinical outcome for human HER2+:ERα- breast cancer. Proc Natl Acad Sci U S A. 2012 Apr 10;109(15):5832-7. doi: 10.1073/pnas.1201105109. Epub 2012 Mar 28. PubMed PMID: 22460789; PubMed Central PMCID: PMC3326451.

7. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012 Jul 18;487(7407):330-7. doi: 10.1038/nature11252. PubMed PMID: 22810696; PubMed Central PMCID: PMC3401966.

8. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature. 2012 Oct 4;490(7418):61-70. doi: 10.1038/nature11412. Epub 2012 Sep 23. PubMed PMID: 23000897; PubMed Central PMCID: PMC3465532.

9. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012 Sep 27;489(7417):519-25. doi: 10.1038/nature11404. Epub 2012 Sep 9. Erratum in: Nature. 2012 Nov 8;491(7423):288. Rogers, Kristen [corrected to Rodgers, Kristen]. PubMed PMID: 22960745; PubMed Central PMCID: PMC3466113.


10. Labbé RM, Irimia M, Currie KW, Lin A, Zhu SJ, Brown DD, Ross EJ, Voisin V, Bader GD, Blencowe BJ, Pearson BJ. A comparative transcriptomic analysis reveals conserved features of stem cell pluripotency in planarians and mammals. Stem Cells. 2012 Aug;30(8):1734-45. doi: 10.1002/stem.1144. PubMed PMID: 22696458.

11. Kumar A, Möcklinghoff S, Yumoto F, Jaroszewski L, Farr CL, Grzechnik A, Nguyen P, Weichenberger CX, Chiu HJ, Klock HE, Elsliger MA, Deacon AM, Godzik A, Lesley SA, Conklin BR, Fletterick RJ, Wilson IA. Structure of a novel winged-helix like domain from human NFRKB protein. PLoS One. 2012;7(9):e43761. doi: 10.1371/journal.pone.0043761. Epub 2012 Sep 11. PubMed PMID: 22984442; PubMed Central PMCID: PMC3439487.

12. Northcott PA, Shih DJ, Peacock J, Garzia L, et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature. 2012 Aug 2;488(7409):49-56. doi: 10.1038/nature11327. PubMed PMID: 22832581.

13. Witt H, Mack SC, Ryzhova M, Bender S, Sill M, Isserlin R, Benner A, Hielscher T, Milde T, Remke M, Jones DT, Northcott PA, Garzia L, Bertrand KC, Wittmann A, Yao Y, Roberts SS, Massimi L, Van Meter T, Weiss WA, Gupta N, Grajkowska W, Lach B, Cho YJ, von Deimling A, Kulozik AE, Witt O, Bader GD, Hawkins CE, Tabori U, Guha A, Rutka JT, Lichter P, Korshunov A, Taylor MD, Pfister SM. Delineation of two clinically and molecularly distinct subgroups of posterior fossa ependymoma. Cancer Cell. 2011 Aug 16;20(2):143-57. doi: 10.1016/j.ccr.2011.07.007. PubMed PMID: 21840481.

14. International Cancer Genome Consortium, Hudson TJ, Anderson W, Artez A, et al. International network of cancer genome projects. Nature. 2010 Apr 15;464(7291):993-8. doi: 10.1038/nature08987. Erratum in: Nature. 2010 Jun 17;465(7300):966. Himmelbaue, Heinz [corrected to Himmelbauer, Heinz]; Gardiner, Brooke A [corrected to Gardiner, Brooke B]; Cross, Anthony [corrected to Cros, Anthony]. PubMed PMID: 20393554; PubMed Central PMCID: PMC2902243.

15. Lechman ER, Gentner B, van Galen P, Giustacchini A, Saini M, Boccalatte FE, Hiramatsu H, Restuccia U, Bachi A, Voisin V, Bader GD, Dick JE, Naldini L. Attenuation of miR-126 activity expands HSC in vivo without exhaustion. Cell Stem Cell. 2012 Dec 7;11(6):799-811. doi: 10.1016/j.stem.2012.09.001. Epub 2012 Nov 8. PubMed PMID: 23142521; PubMed Central PMCID: PMC3517970.

16. Zhang C, Hanspers K, Kuchinsky A, Salomonis N, Xu D, Pico AR. Mosaic: making biological sense of complex networks. Bioinformatics. 2012 Jul 15;28(14):1943-4. doi: 10.1093/bioinformatics/bts278. Epub 2012 May 9. PubMed PMID: 22576176; PubMed Central PMCID: PMC3389769.

17. Aranda B, Blankenburg H, Kerrien S, Brinkman FS, Ceol A, Chautard E, Dana JM, De Las Rivas J, Dumousseau M, Galeota E, Gaulton A, Goll J, Hancock RE, Isserlin R, Jimenez RC, Kerssemakers J, Khadake J, Lynn DJ, Michaut M, O'Kelly G, Ono K, Orchard S, Prieto C, Razick S, Rigina O, Salwinski L, Simonovic M, Velankar S, Winter A, Wu G, Bader GD, Cesareni G, Donaldson IM, Eisenberg D, Kleywegt GJ, Overington J, Ricard-Blum S, Tyers M, Albrecht M, Hermjakob H. PSICQUIC and PSISCORE: accessing and scoring molecular interactions. Nat Methods. 2011 Jun 29;8(7):528-9. doi: 10.1038/nmeth.1637. PubMed PMID: 21716279; PubMed Central PMCID: PMC3246345.

9. Google Summer of Code and NRNB Academy

In addition to the outreach effort described above, we also leverage a Google-sponsored program called Google Summer of Code (GSoC) to attract new developers for Cytoscape core, plugins/apps, WikiPathways, PathVisio and other tools we deem relevant to the NRNB mission. This year is the seventh year that Dr. Pico has coordinated the collective GSoC effort; this is the
third year we’ve participated under the name of “NRNB”. Through the GSoC program we not only recruit new developers but also promote NRNB as an open source-friendly organization, putting us on an exclusive list of ~175 organizations selected from around the world by Google to participate. Dr. Pico attends the annual GSoC Mentors Summit with other NRNB mentors to further engage the open source development community. GSoC also brings in new potential collaborators who want to participate as mentors, in addition to the 40-60 student applicants. This year we have coordinated 30 mentors, leveraging the effort of additional developers from the open source communities surrounding NRNB-related tools. We anticipate getting 15-18 students this summer. Google is paying $5,000 per student, making their investment in NRNB ~$80,000 for 3 months of work. That’s what I call leveraging the community!

Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. The idea behind NRNB Academy is very similar to GSoC, except it’s not restricted to students, it’s not affiliated with Google, and it’s 100% volunteer. Our experience has been that the major draw to our projects in the past has been the opportunity to get direct mentorship in developing Cytoscape and our other tools. The students and external mentors are eager to contribute time and effort when they know it will be guided and effectively amplified by the interaction with NRNB, thus dramatically increasing the odds of a productive output. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in a collaboration and continues to be a source of project ideas and mentors for our GSoC effort.

Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations. Our first graduating student continues to be involved as a contributor, and two of the ongoing students are involved in longer-term ongoing projects as well.


III. Progress on Supplemental Award, 11/2011-07/2013

We were awarded a two-year supplemental grant to work on the Cytoscape App Store. This is a progress report on the second year.

10. The Cytoscape App Store (Pico, 1.0FTE: Samad Lotia, Kristina Hanspers; Bader, 0.45FTE: Jason Montojo, Yue Dong)

To help address the needs of users, we launched the Cytoscape App Store (http://apps.cytoscape.org) to coincide with the release of Cytoscape 3.0, a major re-architecture of Cytoscape for improved stability, performance, and versatility. The overarching goals of the Cytoscape App Store are to highlight the important features apps add to Cytoscape, to enable researchers to find the apps they need, and to enable developers to promote their apps. For each Cytoscape 3.0 app, the App Store supports unique features like one-click install and comprehensive download statistics. Cytoscape 2.x apps are also available on the App Store despite being incompatible with Cytoscape 3.0. We plan to continue supporting these apps throughout the transition to Cytoscape 3.0, though we anticipate rapid growth in new and ported apps for 3.0 as it gains adoption in the community.

Gary Bader's group implemented Cytoscape 3 functionality for the Cytoscape App Store. This involved:

• developing the App Manager for viewing and downloading apps/plugins to Cytoscape
• implementing a user interface for the App Manager, as well as a mechanism for keeping track of installed apps
• implementing a REST API that allows queries to the App Manager for the status of installed apps and tells the App Manager to install an app from the App Store
• developing a system for checking for updates to currently installed apps and installing them

This feature is in Stage 4 of development and we continue to maintain this important component of the Cytoscape software.

The App Store is already playing a broader role in the Cytoscape community than just a place for browsing and submitting apps. For instance, we held a competition for the best Cytoscape 3.0 apps in December 2012. The first prize was shared by ClueGO, which visualizes the relationship between gene ontology terms, and DynNetwork, which visualizes networks with time-based movement. We plan to host more competitions in the future to encourage Cytoscape 3.0 app development.

Apps and the app developer community play a critical role in the success of Cytoscape, ensuring its continued relevance and reach as the field of network biology evolves. The new Cytoscape App Store aims to increase the visibility and accessibility of apps, providing support to both Cytoscape users and app developers. We anticipate that traffic will continue to increase as apps, and the App Store, become more prominent in the Cytoscape community.
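The update check that this section attributes to the App Manager can be illustrated with a minimal Python sketch. It is not the App Manager's actual implementation, which is not shown in this report. The function names, the dotted version format and the version numbers below are all hypothetical. The sketch only shows the general idea of comparing each installed version against the latest version listed in a store.

```python
# Hypothetical sketch: none of these names come from the Cytoscape code base.

def parse_version(text):
    """Turn a dotted version string such as '2.1.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_updates(installed, available):
    """Return {app: (installed_version, newer_version)} for apps with a newer release.

    installed: dict mapping app name -> installed version string
    available: dict mapping app name -> latest version string listed in the store
    """
    updates = {}
    for app, current in installed.items():
        latest = available.get(app)
        if latest is not None and parse_version(latest) > parse_version(current):
            updates[app] = (current, latest)
    return updates


# Made-up version numbers for apps named in this report.
installed_apps = {"ClueGO": "2.0.0", "MCODE": "1.4.0", "GeneMANIA": "3.2.1"}
store_listing = {"ClueGO": "2.0.3", "MCODE": "1.4.0", "GeneMANIA": "3.10.0"}

updates = find_updates(installed_apps, store_listing)
for name, (old, new) in sorted(updates.items()):
    print(f"{name}: {old} -> {new}")
```

Comparing versions as integer tuples rather than as strings avoids ordering "3.10.0" before "3.2.1".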


Appendix A. The 2013 NRNB Network

[Figure: full-page view of the 2013 NRNB network. Node labels (names of NRNB personnel and collaborators) are not reproduced here.]

A network representation of all NRNB personnel and collaborators (blue circles), all TRD, DPB, Collaboration, and Service projects (orange diamonds), and associated publications (green triangles). Node size is proportional to the number of connections. Thick red borders indicate personnel and projects directly funded by the NRNB P41 grant. There are 276 nodes and 365 connections in the network. NRNB funds 46 (17%) of these nodes, which make 211 (58%) of the connections.


Annual Progress Report - Research Highlights 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

Contents
● Network Approach to Building Gene Ontologies
● First Release of Cytoscape 3.0 and the Cytoscape App Store
● NRNB Google Summer of Code Program Reaches New Levels

Network Approach to Building Gene Ontologies

Ontologies are of key importance to many domains of biological research. The Gene Ontology (GO), in particular, has been instrumental in unifying knowledge about biological processes, cellular components, and molecular functions through a hierarchy of concepts and their interrelationships. However, given only partial biological knowledge and inconsistency in how this knowledge is curated, it has been difficult to construct, extend and validate GO in an unbiased manner. We have recently shown that the existing collection of high-throughput network maps, as are now becoming available, can be analyzed to automatically assemble an ontology of gene function that rivals manually curated efforts [1]. Our systematic computational approach combines evidence from physical, genetic and transcriptional networks to produce an ontology composed of 4,123 biological concepts and 5,766 hierarchical concept relations. Using a new ontology alignment procedure, we found that the network-based ontology captures the majority of known cellular components and identifies approximately 600 new cellular components and component relations – many of which we were able to validate either experimentally or bioinformatically. By working closely with the GO curators, we were able to incorporate selected new components and relations into the Gene Ontology, thus providing proof-of-principle for how to systematically update and revise the GO structure based on large-scale data.

1. Dutkowski, J., Kramer, M., Surma, M.A., Balakrishnan, R., Cherry, J.M., Krogan, N.J., and Ideker, T., A gene ontology inferred from molecular networks. Nat Biotechnol, 2013. 31(1): p. 38-45.

First Release of Cytoscape 3.0 and the Cytoscape App Store

The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Version 3.0 addresses these issues by adopting modular coding practices promoted by the OSGi
architectural framework. This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality.

Since 2012, weekly visits to the Cytoscape web site (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though they seem to have increased since v3.0’s February 2013 release.

To help address the needs of users, we launched the Cytoscape App Store (http://apps.cytoscape.org) to coincide with the release of Cytoscape 3.0, a major re-architecture of Cytoscape for improved stability, performance, and versatility. The overarching goals of the Cytoscape App Store are to highlight the important features apps add to Cytoscape, to enable researchers to find the apps they need, and to enable developers to promote their apps. For each Cytoscape 3.0 app, the App Store supports unique features like one-click install and comprehensive download statistics.

The App Store opened for business on June 1, 2012. Except for during the holiday season, traffic to the App Store has grown consistently. By March 2013, weekly visitors numbered between 1,100 and 1,300, and a total of 33,596 visits had been received from users worldwide.

The App Store is already playing a broader role in the Cytoscape community than just a place for browsing and submitting apps. For instance, we held a competition for the best Cytoscape 3.0 apps in December 2012. The first prize was shared by ClueGO, which visualizes the relationship between gene ontology terms, and DynNetwork, which visualizes networks with time-based movement. We plan to host more competitions in the future to encourage Cytoscape 3.0 app development.

Apps and the app developer community play a critical role in the success of Cytoscape, ensuring its continued relevance and reach as the field of network biology evolves. The new Cytoscape App Store aims to increase the visibility and accessibility of apps, providing support to both Cytoscape users and app developers. We anticipate that traffic will continue to increase as apps, and the App Store, become more prominent in the Cytoscape community.

NRNB Google Summer of Code Program Reaches New Levels

Last summer through the Google Summer of Code (GSoC) program we received over 60 student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! This is almost double the number of students we mentor through GSoC in a typical year and puts NRNB in the top 10 supported organizations out of 180 open source organizations accepted into the Google program. Google paid $5,000 per student, making their investment in NRNB $80,000 for 3 months of work.

Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible
project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations.


Annual Progress Report - Administrative Information 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

Administrative Structure

During the first year, we defined the administrative structure of the resource, including some unique new roles within the organization. The roles of Principal Investigator (PI), Co-PI, External Advisory Committee (EAC), Resource Administrator and Chief Software Architect were defined as in the original grant. We defined a new role of Executive Director (ED) to oversee some of the new resource functions that NRNB provides, including Training & Outreach, Communications and Infrastructure. The ED (Alex Pico, Gladstone Institutes) is responsible for coordinating these efforts as well as conducting all of the necessary tracking and due diligence for the annual reporting to NIH.

During the second year, we defined the new role of Collaboration Coordinator to screen and process collaboration requests to our resource. This has been a vital role in supporting the 80+ ongoing collaborations during the past two years.

During the third year, we defined a proper position for the Roving Engineer, who is vital for outreach to new users, app developers and strategic partnerships. Our Roving Engineer is also a major contributor to Cytoscape core design and implementation, embodying the full cycle from users to developers to implementation to release. Finally, we are very pleased to have maintained an active dialog with our EAC members, including Dr. Stephen Friend as chair of the committee.

Budget changes have been minimal over the three years, with the exception of the new Collaboration Coordinator and TRD increases for Pico, Ideker and Sander in Year 2, and the new Roving Engineer and subsequent TRD cuts to Pico and Ideker in Year 3. The trend over time has been toward supporting more Outreach initiatives to fulfill our P41 goals.

[Figure 1: budget area charts, panels A and B. Legend for categories: Outreach, TRDs, Admin, Co-PIs. Legend for groups: Ideker, Pico, Sander, Bader, Schwikowski, Fowler.]

Figure 1. Budget graphs. Area charts showing the distribution of funds for years 1-3 (x-axis) per category (A) and per group (B). Y-axis is in units of $1,000s of US dollars. Each stripe typically corresponds to an individual with a specific role in NRNB, totaling 6.5 FTEs. Note that groups are sorted by degree of change, which is critical in this style of visualization to minimize misperception of change when slopes are actually parallel.

As the basis for the graphs above, here are itemized tables of FTEs and funding for all three years (Table 1). Highlighted in red are the significant changes in Year 3 to FTEs and total dollars.

                          FTEs                    $1,000s
Roles and Groups          Year 1  Year 2  Year 3  Year 1  Year 2  Year 3
Collaboration (Ideker)      0.00    0.50    0.63       0      50      50
Admin-Asst. (Ideker)        1.00    0.56    0.56      52      38      41
Core Tech. (Ideker)         0.40    0.40    0.40      47      51      53
TRD-A (Ideker)              0.50    0.50    0.50      40      45      36
Admin-PI (Ideker)           0.30    0.30    0.29      74      78      77
Communication (Pico)        0.30    0.30    0.25      29      29      25
Admin-ED (Pico)             0.50    0.50    0.50      56      56      57
Roving Engineer (Pico)      0.00    0.00    0.12       0       0      16
TRD-C (Pico)                0.20    0.48    0.13      21      39      17
Co-PI (Pico)                0.02    0.02    0.02       5       5       0
TRD-A (Sander)              0.65    0.65    0.62      90      97      98
Co-PI (Sander)              0.02    0.02    0.02       5       5       5
TRD-C (Bader)               1.00    1.00    0.91      90      93      90
Co-PI (Bader)               0.10    0.10    0.10       0       0       0
TRD-D (Schwikowski)         1.00    1.08    1.08      81      83      83
Co-PI (Schwikowski)         0.08    0.08    0.08       0       0       0
TRD-B (Fowler)              1.00    0.72    0.20      58      54      53
Co-PI (Fowler)              0.10    0.10    0.10      21      26      27
SUBTOTAL                    7.17    7.32    6.51     669     750     728
Supplement (Ideker)         0.00    0.40    0.40       0      45      45
Supplement (Pico)           0.00    1.00    1.00       0      85      85
Supplement (Bader)          0.00    0.40    0.40       0      45      45
SUBTOTAL                    0.00    1.80    1.80       0     175     175
GRAND TOTAL                 7.17    9.12    8.31     669     925     903
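The way the subtotal and total rows of the table relate can be checked with a short Python sketch. It uses only the subtotal, supplement and grand-total figures printed in the table. The variable and function names are illustrative and not part of the report. It confirms that each grand total is the main-grant subtotal plus the supplement subtotal, and it derives the supplement's share of each year's total funding.

```python
import math

YEARS = ("Year 1", "Year 2", "Year 3")

# Subtotal and grand-total rows copied from the table (per year).
main_fte = (7.17, 7.32, 6.51)
supp_fte = (0.00, 1.80, 1.80)
grand_fte = (7.17, 9.12, 8.31)

main_usd_k = (669, 750, 728)
supp_usd_k = (0, 175, 175)
grand_usd_k = (669, 925, 903)

# Individual supplement rows (Ideker, Pico, Bader), also from the table.
supp_rows_fte = [(0.00, 0.40, 0.40), (0.00, 1.00, 1.00), (0.00, 0.40, 0.40)]
supp_rows_usd_k = [(0, 45, 45), (0, 85, 85), (0, 45, 45)]


def column_sums(rows):
    """Sum a list of per-year tuples column by column."""
    return tuple(sum(col) for col in zip(*rows))


def all_close(a, b, tol=0.005):
    """Compare two per-year tuples, allowing for floating-point noise."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b))


fte_ok = all_close(column_sums([main_fte, supp_fte]), grand_fte)
usd_ok = all_close(column_sums([main_usd_k, supp_usd_k]), grand_usd_k)
supp_ok = (all_close(column_sums(supp_rows_fte), supp_fte)
           and all_close(column_sums(supp_rows_usd_k), supp_usd_k))

# Share of each year's total dollars that comes from the supplement.
supp_share = [s / g for s, g in zip(supp_usd_k, grand_usd_k)]

for year, share in zip(YEARS, supp_share):
    print(f"{year}: supplement share of total funding = {share:.1%}")
print("totals consistent:", fte_ok and usd_ok and supp_ok)
```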

Table 1. NRNB effort and budget. Annual budgeting of FTEs and $1,000s itemized by roles (per group). Major changes are highlighted in red. Subtotals are provided separately for the main grant and supplemental funding (bold) and Grand Total is in the last row.

Allocation of Resource Access

Beyond the active distribution and support of Cytoscape, which is covered in later sections, NRNB resource allocation can be categorized in the following way:

1. On-site training events: NRNB staff participated in 13 training events during the reporting period. These events include tutorials, workshops and courses.


2. Requests for collaboration and mentorship: For the second consecutive year, we have maintained a high number of active collaborations. Many of these collaborations are coming through our participation in Google Summer of Code (GSoC) and our own NRNB Academy efforts (see #3).

3. Google Summer of Code and NRNB Academy: In addition to receiving requests from potential students through these programs, we also receive requests from a number of groups to join our organization as mentors. This brings new technology and ideas to our effort. GSoC has been our most successful outreach program by far. It’s responsible for a quarter of all our NRNB collaborations. It is the most active period for NRNB.org, granting broad exposure for NRNB in the open source community. Building on the success of this model, we launched NRNB Academy last year. Our Academy follows the same approach as GSoC, organizing around available mentors, ideas and interested students. However, we are not restricted to supporting university students in our program, as it is independent of GSoC and 100% volunteer based. The Research Progress and Highlights provide more details.

4. Requests for training material support: We receive requests for tutorial materials throughout the year from inside and outside the Cytoscape core development team. Our homegrown Open Tutorials system makes it easy to accommodate all such requests. Open Tutorials is an easy-to-use wiki system that provides content formatted to be used as online sessions, slide shows and printed handouts. This year we are seeing more content from more contributors, in addition to a steady rise in visitors (see details in the Training section below).

5. Providing software community support: Our goal is to develop a generic template of services based on the support we provide the Cytoscape community of users and developers. So far we have extended support to Cytoscape, WikiPathways, Cytoscape Web and the cBio Cancer Genomics Portal. These proven resources demonstrate the broader scope of the NRNB mission. We are providing distribution links, showcases, tutorial support, news and event tracking, and GSoC and NRNB Academy participation to these projects. New this year is a gallery page with screenshots of all of these tools.

Awards and Honors

None

Dissemination

Overall

Cytoscape Version 3.0 (v3.0) was released for unrestricted public use on February 1, 2013. It represents an evolution of v2.x resulting from a two-year collaboration of a multinational, multi-institution team of programmers and biologists. This report describes the Cytoscape software, the infrastructure that supports it, and the activities of the community it serves.

Background

The overall mission of Cytoscape is to be a freely available worldwide asset supporting network analysis and visualization for systems biology science. The major focus of v3.0 is the modularization and rationalization of code to solve stability issues in v2.x encountered as multiple developers pursued multiple agendas. Under v2.x, internal programmatic interfaces evolved from one release to the next, leading to the failure of working plugins over time and
negative interactions between otherwise working plugins. Ultimately, this resulted in loss of programmer and user productivity, and undermined community confidence in Cytoscape.

v3.0 addresses these issues by adopting modular coding practices promoted by the OSGi¹ architectural framework. This enables both the Cytoscape core and externally developed apps (formerly called plugins) to evolve independently without compromising unrelated functionality. At the logical level, Cytoscape leverages OSGi precepts to produce v3.0 APIs having cleaner and clearer demarcations between functional areas. At the deployment level, OSGi enables on-the-fly substitution of one processing element for another (e.g., apps) in order to tailor Cytoscape to meet user requirements at runtime without reinstalling or reconfiguring Cytoscape.

v3.0 represents a strong investment toward reducing future development and support costs, and increasing reliability and evolvability. We expect to leverage v3.0 as a platform to satisfy the evolving needs of multiple stakeholder groups, and as a platform enabling research on leading edge analysis and visualization techniques. v3.0 is the intended successor to v2.8, with development and support of v2.8 expected to diminish and disappear over time in favor of v3.0 and its successors. v3.0 is upward compatible with v2.8, but not downward compatible.

While v3.0 is a substantial reorganization of v2.8, its launch also marks an evolution in the Cytoscape team’s approach to community engagement, where different community demographics are engaged in different, demographic-sensitive ways. The team identified four major groups: new users, casual (but not new) users, power users, and app developers. The initial v3.0 release was promoted to power users and app developers as a way of delivering v3.0’s advanced capabilities to the groups most able to leverage them, give qualitative and remedial feedback, and promote v3.0 adoption to other Cytoscape users. This strategy dovetails with v3.0 features (described below) that lower barriers to entry for new and casual users while enabling efficiency and productivity for power users and app developers.

The second release (v3.0.1) is imminent – it incorporates various critical fixes and numerous feature requests made by early v3.0 adopters. As such, it will be promoted to the entire Cytoscape community, including new and casual users. v3.0.1 will become the default Cytoscape download, replacing v2.8.

Compared with v2.8, Cytoscape users will benefit most directly from v3.0 in the long run through:

• fewer core and app bugs from one release to the next
• the availability of more and richer apps (due to developers spending less time tracking and fixing bugs)
• more core features with higher biological and logistical value (due to improved flexibility provided by interface-driven development)

The v3.0 Release

Throughout 2012, Cytoscape developers made a number of beta versions available to early adopters. Issues were tracked in RedMine, and were contributed by both developers and early adopters. The final release was made on February 1, 2013, accompanied by updated user documentation, user tutorials, JavaDoc programmer documentation, app developer tutorials, a new App Developer Cookbook (containing useful code snippets), and release notes.

¹ www.osgi.org – also used as the basic framework for Eclipse and numerous commercial products


Additionally, a new and comprehensive user-focused Welcome Letter was created to differentiate between different user demographics and engage them appropriately. Principal v3.0 development was carried out by staff and researchers worldwide, including at the following institutes: UC San Diego, Pasteur Institute, University of Toronto, Gladstone Institutes (UC San Francisco), and University of Amsterdam.

v3.0 included the following major features:

• Upward compatibility with Cytoscape 2.x networks, attributes, analysis, layout, and display
• App Store (for centralized app availability)
• Friendly Welcome dialog (to engage new and casual users)
• Import network
• Edge bend visual property
• Edge bundling
• Grouping (for hierarchical networks)
• Enhanced search
• Show All in Table Browser
• Multiple network management
• Major refactoring to rationalize/regularize inter-module interfaces (to aid app developers in creating reliable apps)

Major issues remaining after the v3.0 release included:

• Slower startup than v2.x
• Fewer apps (plugins) than v2.x
• Numerous undiscovered or unaddressed bugs (due to major refactoring)
• Smaller network capacity on 32-bit processors

There are 145 apps (plugins) available in v2.x, though many have gone unmaintained and have fallen out of use. Of the v2.x plugins, 8 were delivered in v3.0 as core functionality:

• EnhancedSearch
• MetanodePlugin2
• PSICQUICUniversalClient
• GraphMLReader
• NCBIEntrezgeneUserInterface
• ScriptEngineManager
• JavaScriptEngine
• NetworkAnalyzer

Additionally, the App Store contained another 13 apps (corresponding to many of the most popular v2.x plugins): AgilentLiteratureSearch, CentiScaPe, ClueGO, CluePedia, ClusterOne, Cy3PerformanceReporter, Cyni Toolbox, CyPath2, DynNetwork, GeneMANIA, jActiveModules, MCODE, PathExplorer, Venn and Euler Diagram

Bug Bounty

To foster early investment and engagement in v3.0 by the user community, we created the Cytoscape Bug Bounty program, which paid out small prizes to users identifying high value bugs in the month of February 2013.
The program produced 35 bugs by 17 qualified reporters: 8 crash/data loss, 19 user interface, and 7 cosmetic. Gift cards were given to the top 9 reporters.

“It was great fun to participate in the February Bug Bounty. Thank you for organizing it, and, in general, thank you for making the development of Cytoscape an open process. It’s really appreciated, from the point of view of the users, when a software is developed this way.

“In general, I’ve found that the new Cytoscape 3.0 version is a great improvement over the previous. The new “Welcome screen”, together with many little improvements to the menus and the interface, gave me a feeling of very user friendly software. The ability of downloading whole species for networks with a click, or to import them from many sources, is attractive to many people, and I know some persons who will use it for their work. The App store is also a nice addition, as it is much better to have a common web page for all the plugins instead of having to look for documentation dispersed into many little websites.”²

The v3.0.1 Release

The v3.0.1 release is scheduled for April 18, 2013. Its main purpose is to eliminate bugs leading to data loss, program crashes, misleading displays, and small user interface issues. Given this, we expect that it will be suitable for use by the entire Cytoscape community (including new and casual users) in preference to v2.8, and we expect v3.0.1 to become the default download on the Cytoscape web site. The first v3.0.1 release candidate (RC) will become available for download by April 4. It will include fixes or resolutions for 98 reported bugs and other issues, including 30 of 35 reported under the Bug Bounty program. Notably, the v3.0.1 release:

• Substantially increases the size of network manageable on 32-bit systems
• Migrates source from SVN to GitHub (to expand collaboration opportunities)

At release time, we expect there to be slightly under 200 bugs or unresolved issues remaining on our backlog, including feature requests and issues requiring substantial development or rework. Additionally, app developers have asked for improved documentation to enable quick and reliable app development. Currently, UC San Diego is upgrading three v2.8 plugins to become v3.0 apps, and expects completion in Q3 2013:

• GenomeSpace
• MiMI
• BiNGO

Additionally, the NRNB has offered Amazon gift certificates as rewards to app developers for the first 20 apps independently developed and submitted.

² Giovanni Marco Dall’Olio, March 8, 2013, via e-mail


Bug and Issue Tracking

Since early 2011, the Cytoscape team has tracked bugs and issues using the RedMine cloud service. As of v3.0, users can submit reports of bugs and issues to RedMine directly from Cytoscape. A CDF plot of bugs and issues logged over time shows aggressive tracking:

The following CDF shows that the Cytoscape team has responded to logged reports (by addressing them as bug fixes or scheduling them to be addressed in the future).

“Created” means that a ticket was opened, and “Updated” means that a Cytoscape team member has acknowledged it, and has prioritized it for solving or has already solved it.
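As a minimal sketch of how such cumulative curves can be derived from ticket timestamps, the Python snippet below counts tickets per month and accumulates the counts, giving an unnormalized CDF for "Created" and for "Updated". The ticket dates are hypothetical and are not taken from the Cytoscape RedMine instance. The function name is likewise illustrative.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def cumulative_counts(dates):
    """Return [((year, month), cumulative ticket count)] in chronological order."""
    per_month = Counter((d.year, d.month) for d in dates)
    months = sorted(per_month)
    running = accumulate(per_month[m] for m in months)
    return list(zip(months, running))


# Hypothetical tickets: (created, updated), with None if not yet acknowledged.
tickets = [
    (date(2012, 11, 3), date(2012, 11, 20)),
    (date(2012, 12, 14), date(2013, 1, 9)),
    (date(2013, 2, 2), date(2013, 2, 5)),
    (date(2013, 2, 18), None),
    (date(2013, 3, 1), date(2013, 3, 4)),
]

created = cumulative_counts(c for c, _ in tickets)
updated = cumulative_counts(u for _, u in tickets if u is not None)

for label, series in (("Created", created), ("Updated", updated)):
    for (year, month), total in series:
        print(f"{label} through {year}-{month:02d}: {total}")
```

When the "Updated" curve tracks the "Created" curve closely, most logged reports have been acknowledged.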

Measured Results

Cytoscape Downloads and Web Site Visits

Through 2013, the overall number of Cytoscape downloads (including v2.8 and v3.0) continues to rise. The chart below shows the monthly download counts, with data dropouts in November,
2007 and March, 2009. In February 2013, the download count was 6,685, and the count for March was 7,323.

Since 2012, weekly visits (outside of holidays) have increased. The Cytoscape v3.0 web page was first put up in October 2012. The trends since the February, 2013 release are too new to yield conclusions, though it seems that visits have measurably increased. Visits to the Cytoscape download page have remained somewhat constant over time, though seem to have increased since v3.0’s February 2013 release.

In examining year-over-year visit patterns, 2013 visits have increased by about 30%, with an uptick corresponding to the v3.0 release timeframe. This pattern is reflected in visits to the download page, too. Visits to the v3.0 page are associated with about 25% of page visits. (Because visiting the v3.0 page is a prerequisite to downloading v3.0, the number of v3.0 page visits is an upper bound on the count of v3.0 downloads. Visiting the v3.0 page can have many purposes, only one of which is downloading v3.0.)

Between January 1, 2012, and the end of March, 2013, the Cytoscape web site received 393,903 distinct visits. Web site visitors were geographically dispersed worldwide:

Cytoscape visitors arrived most often after performing a Google search, but also arrived from direct links and from links within Cytoscape web pages:

App Store

The App Store opened for business on June 1, 2012. Since then, it has received over 33,000 visits from users worldwide:

Most visits originate from a link within the Cytoscape web site but a significant number of visits launch from search engines and direct links:

Except during the holiday season, traffic to the App Store has grown consistently. By March 2013, weekly visitors numbered between 1,100 and 1,300. Through March 2013, a total of 33,596 visits were received:

Interest was evenly distributed across a number of app categories:

The most frequently downloaded apps (as of March, 2013) were:

App               Count
ClueGo            1,394
GeneMANIA         1,230
jActiveModules    1,196
MCODE               980

Cytoscape Citations

The count of Cytoscape-citing papers continues to rise year over year, with the count for 2013 being incomplete (as of March 2013).

Year-over-year growth has been historically sporadic, and may be showing signs of slowing:

Period       Year-over-year Growth
2004-2005    64%
2005-2006    72%
2006-2007    126%
2007-2008    94%
2008-2009    80%
2009-2010    8%
2010-2011    32%
2011-2012    19%
2012-2013    incomplete
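For readers who want to reproduce this kind of table from raw citation counts, the percentages are presumably computed as (count this year - count last year) / count last year. The sketch below shows that calculation; the `yoy_growth` helper and the sample counts are hypothetical and are not the report's underlying citation data.

```python
def yoy_growth(counts_by_year):
    """Map 'Y1-Y2' to the percentage growth from year Y1 to year Y2."""
    years = sorted(counts_by_year)
    growth = {}
    for prev, curr in zip(years, years[1:]):
        before, after = counts_by_year[prev], counts_by_year[curr]
        growth[f"{prev}-{curr}"] = round(100 * (after - before) / before)
    return growth


# Hypothetical counts for illustration only (not Cytoscape citation data).
example_counts = {2001: 200, 2002: 250, 2003: 300}

print(yoy_growth(example_counts))  # {'2001-2002': 25, '2002-2003': 20}
```

Note that a constant absolute increase (50 papers per year in this example) yields a falling percentage, so a declining growth rate does not by itself mean fewer new citing papers.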

Community Outreach

The Cytoscape community consists of core developers, app developers, and users. Communication and outreach are multimodal: Google Groups for contemporaneous discussion, Google video and Hackathons for core developer meetings, papers, the web site and social media, and public meetings and symposia.

Google Groups and Video

The Cytoscape team has maintained Google Groups since April 2011. As of March 2013, there were 4 groups:

Group                 Membership    Topic Count
cytoscape-discuss     1,531         2,570
cytoscape-helpdesk    1,148         1,413
cytoscape-announce      918           194
cytostaff                49         2,643

The discuss and helpdesk groups facilitate self help (through search), peer assistance, and assistance directly by Cytoscape core developers. The announce group is used by Cytoscape core developers to announce new Cytoscape releases, and by app developers to announce new apps.

The cytostaff group enables communication between Cytoscape core developers to coordinate activities and exchange technical information. Cytoscape core developers also meet on video chat weekly to plan agendas, triage issues, and conduct infrastructure activities.

Hackathons

The Cytoscape team conducted a Hackathon at the Gladstone Institutes in San Francisco on December 12, 2012, concurrently with the annual general Cytoscape symposium. Participants laid out the following roadmap for short- and medium-term development:

• Table loading performance
• Network panel update
• Command language support
• Search/Filter API
• Property Sheets
• Separation of ViewModel
• Advanced Label Rendering (Zoom/multi-scale)
• JSON package to support external processes
• SBGN symbols
• Table merge
• Vizmapper documentation
• Developer requests
  o Integration to R/scripting
  o XMLRPC/REST access
  o Headless/daemon mode

Web Site and Social Media

The main Cytoscape web site (cytoscape.org) was augmented to include a branch for v3.0, which includes user and developer documentation, links to the Welcome Document and release notes, and links to presentations and social media sites. Notably, videos of app presentations at the December 13-14 general Cytoscape symposium were posted at: http://nrnb.org/presentations.html

Future Risks

The primary objective of the architectural refactoring that transformed Cytoscape v2.8 to v3.0 was to normalize relationships amongst subsystems so that changes could be made in one subsystem without detriment to another. While this evolution has been accomplished, much code was changed, and bugs continue to be discovered and reported by the user community. For now, the community remains forgiving and indulgent, mainly because Cytoscape’s basic functionality appears sound. However, the community perspective may change when v3.0 becomes the default download. While bugs can be fixed in point releases, slow startup times and the slow conversion rate of v2.x plugins into v3.0 apps remain a threat for several quarters. Mitigating strategies include continuing the excellent and diligent support offered by the Cytoscape team and community, which serves to help prioritize release features and to keep user frustration from growing. Additionally, software reliability can be improved by incrementally developing automatic test suites beyond what exists today.

While Cytoscape’s semantic versioning provides app developers with important guarantees of interface and semantic consistency as Cytoscape evolves, it is possible that semantic versioning itself may slow plugin authorship, rendering Cytoscape unresponsive to scientific requirements in meaningful timeframes. The interfaces defined in Cytoscape 3.0 have been shown to be insufficient for the needs of new apps in some cases. While new interfaces can be added, doing so requires incrementing the minor version number (e.g., from 3.0 to 3.1), which is intended to occur only rarely. Furthermore, the operational complexity and overhead of making new Cytoscape releases virtually guarantee the slow evolution of Cytoscape interfaces.
Mitigating strategies include deliberately hastening the pace of interface-augmenting releases and engaging app developers to aggressively feed interface requests to the team, possibly at the expense of core development.

Notwithstanding the enormous benefits of the architectural refactoring, critical Cytoscape subsystems (e.g., user interface and apps) remain tightly coupled. This coupling threatens (at best) to recapitulate the tangled relationships that triggered the refactoring or (at worst) to make the replacement, scaling, or reuse of these subsystems problematic. Eventually, this threatens the evolvability of Cytoscape to serve scientific interests in relevant timeframes. Mitigating strategies include focused refactoring of key subsystems along SOA (service oriented architecture) or COA (component oriented architecture) principles to expose and separate distinct concerns. This type of refactoring can occur while implementing a given use case, and can then be leveraged to benefit subsequent, related use cases.

Patents, Licenses, Inventions, and Copyrights

None. We are committed to an Open-Source dissemination policy.

Training and Outreach

Annual Cytoscape Retreat

The annual Cytoscape Workshops and Symposium was hosted by the National Resource for Network Biology (NRNB) at the Gladstone Institutes on the UCSF Mission Bay campus in San Francisco during this reporting period. In addition to developer meetings, the event included user and new developer tutorials, a Plugin/App Expo, a special Network Biology symposium, and our EAC meeting. The meeting was a huge success, with capacity attendance for the user tutorial and very positive survey responses from attendees.

Workshops

For the reporting period, NRNB has participated in a total of 13 training events in multiple countries. These events include tutorials, workshops and courses. Cytoscape is taught in many classroom and workshop settings. We try to track all of these on our website and Event Tracker. We’ve identified 37 courses offered in the 2012-2013 calendar year! And these are just the ones affiliated with NRNB staff.

Open Tutorials

Our tutorial management system, Open Tutorials, is still the main source for tutorial materials for the Cytoscape project, and is being used both internally by presenters, and by researchers and developers. Visits to Open Tutorials have continued to increase over the last year, with an average of 3,750 visits/month, as compared to 2,700 visits/month for the previous reporting period. More than half of all visits (57%) are from new visitors. We estimate that the increase in traffic is mainly from users, as we have had only two new editors in the same period. Tutorial development during the past year was focused on a set of user tutorials for Cytoscape 3.0, covering the most common use cases and describing the user interface and new welcome screen. We plan to add several additional user tutorials over the next 6 months. Overall, Open Tutorials has allowed NRNB to reach our goal of providing tutorial support to a broad and diverse community.

Social Media

We have initiated a social media effort for Cytoscape through a number of different tools (http://www.cytoscape.org/community.html). For example, a Twitter account is used for quick announcements (http://twitter.com/cytoscape) and YouTube is utilized for video tutorials (http://www.youtube.com/results?search_query=cytoscape). During this reporting period we continued the popular Tumblr site to capture published figures using Cytoscape. Pairs of figures are posted on a weekly basis on the front page of cytoscape.org based on this Tumblr feed. We now regularly get authors submitting their recent publications to us, asking to feature them via our Tumblr site. This is directly helping to promote the use and citation of Cytoscape.

Google AdWords

We were awarded a non-profit account in the Google AdWords program. We are managing 8 Ad Group campaigns consisting of over 880 keywords and phrases. Last month alone we received over 7,000 clicks on these ads to our NRNB sites. These activities are worth over $8,800 a month (a 550% increase over last year), which we are getting free of charge. We have a spending limit of $329 per day through this program, a potential value of $120,000 per year, so we will continue to identify new ads and relevant resources.

Google Summer of Code and NRNB Academy

In addition to the outreach effort described above, we also leverage a Google-sponsored program called Google Summer of Code (GSoC) to attract new developers. This year we are coordinating 30 mentors, leveraging the effort of developers from open source communities surrounding NRNB-related tools. Last summer through the GSoC program we received over 60 student applications. From these we selected 16 students to mentor on Cytoscape and NRNB-related projects. All 16 projects passed and completed the summer successfully! Google paid $5,000 per student, making its investment in NRNB $80,000 for 3 months of work.

Inspired by this very successful model for recruiting new code contributors, we designed and launched NRNB Academy last year. Through NRNB Academy, we offer anybody the opportunity to work with our open source development team on network biology related tools and resources. The program offers a framework for training by providing project ideas and by pairing participants with mentors. It is completely volunteer-based and offers participants flexible project terms. Since its launch in January 2011, we have had 14 requests from participants, and we currently have 4 students enrolled. The first graduate completed their project in September 2012. In addition to ongoing student projects, the program has also resulted in one collaboration and continues to be a source for project ideas and mentors for our GSoC effort. Based on our experience so far, this program is not only effective in producing useful tools and resources, but it also serves as a mechanism to increase long-term development collaborations. Our first graduating student continues to be involved as a contributor, and two of the ongoing students are involved in longer-term ongoing projects as well.

Annual Progress Report - Advisory Committee 2013 National Resource for Network Biology

P41 GM103504 05/01/2012 - 04/30/2013

We held our second External Advisory Committee (EAC) meeting on December 12, 2012, in coordination with the annual Cytoscape Workshops and Network Biology Symposium hosted by NRNB this year at the Gladstone Institutes in San Francisco. In addition to the EAC members listed below, we also had our Program Officer, Doug Healy, in attendance. The following report was issued by our EAC.

Participating External Advisory Committee Members:

• Stephen Friend, Sage Bionetworks
• David Hill, Dana-Farber Cancer Institute
• Tamara Munzner, University of British Columbia
• Anya Tsalenko, Agilent Technologies
• Marian Walhout, University of Massachusetts Medical School

Overall Perspectives of the NRNB External Advisory Board

All of the members of the Advisory Committee found that this meeting provided evidence of very strong progress and appreciated the increased clarity as to how to convey it to outsiders. In the past 18 months all of the major suggestions have been effectively addressed. The supplementary material has allowed a very powerful engagement by Alex Pico and the delivery of an entirely new focus to build out the Cytoscape tools within a “Cytoscape App Store” (http://apps.cytoscape.org). This has been matched by a comprehensive evolution of functionalities within the new version of Cytoscape 3.0 and a coherent maturation of all the Technology Research and Development Projects (TRDs) and associated Driving Biological Projects (DBPs).

The three major suggestions this cycle involve: 1) reviewing both the existing TRDs and DBPs to determine how mid-course optimization of these projects might allow maximal creation of “shining examples” around the strengths of the NRNB, especially by searching for new distal DBPs, 2) resolving the question of how to best measure success for the NRNB with a transition away from paper/citation based metrics to metrics of community enablement and integration, and 3) preparing for the extension by completing the draft proposal in time to engage the EAC six weeks before it is due to be submitted.

In summary, the NRNB has continued to make excellent progress through the first half of this funding period and the committee is strongly supportive of the overall progress and direction. The comments below, albeit pointedly critical, are designed to help the NRNB position itself for the strongest possible competitive renewal in 18 months. Please see the following descriptions of the specific programs for more detailed comments:

Specific Project Summary Statements

1) TRDs and DBPs (separate one for TRD3 and Cytoscape)

All of the NRNB labs continue to do exciting, cutting-edge work developing new network-based approaches to important questions in the biological and social sciences. The “network extracted gene ontology” is one example of integrating a novel way to better use ontologies while providing a visual output that offers a clearer and better representation of functional modules. Integrating statistical and scripting tools into Cytoscape is a decided plus, initially done in the context of social networks, that should have broad applicability. Ongoing work is proposing potentially paradigm-shifting ways to answer questions and gain insight beyond traditional approaches, using link clustering and network ontology, for example.

The recent set of publications across the entire spectrum of NRNB activities shows that good progress is being made in developing new network-based tools and demonstrating the value of studying networks. At the approximately halfway point of this grant, the NRNB has provided clear examples of identifying problems or critical biological questions that require novel approaches, proposing and developing solutions based on integrating information into networks, and implementing potentially useful tools for addressing similar questions. Each of the TRDs was individually successful in that regard. The challenge going forward is to clearly demonstrate that these tools and approaches have applicability beyond the questions and problems that the individual TRDs tackled in the first place. One thing to consider now is how to better integrate across multiple TRDs. For example, can the tools being developed in TRD A, C, & D be used in TRD B? This could be taken on as a collaboration or via a new DBP. Can the tools in TRD D be used to add further insight to the efforts to develop networks as biomarkers or network ontologies?

TRD C has made significant and impressive progress in the past year, with flagship projects in Mosaic (ontology-partitioned mosaics) and NeXO (network extracted ontologies). The Mosaic work has already been released as a Cytoscape plugin. The NeXO work is particularly exciting as a path to data-driven ontologies rather than a single monolithic solution that is not sensitive to context.

Several possible avenues for moving forward with the NeXO work were discussed, including the possibility of partnering with the existing GO project via supplemental funding.

In terms of communicating the overall value of the NRNB to the broader scientific community, there are four distinct elements that need to be clearly articulated in terms of what the TRDs are doing and what the NRNB as a whole has accomplished. NRNB to date has clearly shown 1) an ability to identify a problem or driving biological question that cannot be addressed without a network approach; 2) an ability to develop new tools and technology for network analysis and visualization; 3) an ability to implement usable tools and demonstrate proof of concept; and, the most challenging, 4) an ability to demonstrate that the tools are getting into wide use (e.g. via Cytoscape). This will require additional tracking and curation efforts that will be challenging because Cytoscape is now viewed as a “standard tool” and therefore less likely to be cited.

The NRNB is poised to be more than a collection of already successful TRDs. There should be some consideration for a major paper that involves ALL TRDs and many of the DBPs to show how the new suite of Cytoscape tools can help answer a major question in elucidating genotype-to-phenotype relationships. Cytoscape has become a great collection of tools and NRNB has done great science developing some new tools and using them on a specific question, but the NRNB needs to move beyond being just a developer of Cytoscape tools and should look towards becoming an entity where the “whole is greater than the sum of the parts”.

While the entire spectrum of projects involving all TRDs and DBPs is quite exciting, now is the time to begin considering restructuring the DBPs – potentially eliminating some – as plans are developed for the competitive renewal in 18 months.

One area to consider is whether or not the NRNB should begin to branch out with respect to other disease models – much of the recent success has been focused on cancer – as there is more and more evidence for many genes to be involved in diseases very distinct from the initial disease associated with any given gene.

As previously, Hill’s lab is willing to serve as an alpha or beta test site for data integration and novel visualizations as well as testing plug-ins for statistical analysis coupled to visualizations.

In summary, it is clear that some TRDs are progressing well and are on track to roll out tools for network biology that will be widely used. In other cases, it is not clear that the right audience is being reached. With this in mind, we recommend that the NRNB perform a comprehensive review of all TRD projects and strive to align them with a set of DBPs that represent the most active user communities in network biology with the following goals:

● Reach out to key/hub user bases for each technology

● Pursue opportunities for cross pollination/integration/pipelines across NRNB technology projects, which are currently being developed in isolation

● Identify other important resources and tools that NRNB TRDs could integrate with

Cytoscape Progress:

The team has made great progress towards Cytoscape 3.0: the beta release has been available for many months, and the full release is coming very soon. Many suggestions from the last meeting have already been incorporated, including identifying which previous plugins are high impact and devoting resources to make sure that these are ported to the new version.

The issue of backwards compatibility was raised again, since Cytoscape 3.0 introduces major API changes that prevent old plugins from working without code updates. The verbal answer made it clear that choices had been carefully considered in consultation with the developer community. In particular, the assurance was made that API compatibility is a guaranteed contract for all 3.x versions, with no breaking changes made before version 4.0, thanks to the use of semantic versioning. The suggestion was made once again to ensure that keeping the API stable is a very high priority, because as the user community grows in size the costs of breaking backwards compatibility increase accordingly.
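The compatibility contract described here can be made concrete with a short sketch of the general semantic-versioning rule: within one major version, an app built against an earlier minor version keeps working, while a major version change breaks the contract. This is an illustration of that rule only, not Cytoscape's actual app-loading logic; the `ApiVersion` type and the `parse` and `app_can_run` helpers are hypothetical names.

```python
from typing import NamedTuple


class ApiVersion(NamedTuple):
    major: int
    minor: int


def parse(text: str) -> ApiVersion:
    """Parse a 'major.minor' (or 'major.minor.patch') version string."""
    major, minor = text.split(".")[:2]
    return ApiVersion(int(major), int(minor))


def app_can_run(required: str, running: str) -> bool:
    """True if an app built against API `required` should work on `running`.

    Semantic-versioning rule: the major version must match, and the running
    minor version must be at least the one the app was built against.
    """
    need, have = parse(required), parse(running)
    return need.major == have.major and need.minor <= have.minor


print(app_can_run("3.0", "3.1"))  # True: interfaces are only added in 3.x
print(app_can_run("3.1", "3.0"))  # False: app needs an interface added in 3.1
print(app_can_run("3.0", "4.0"))  # False: a major bump may break the API
print(app_can_run("2.8", "3.0"))  # False: v2.8 plugins need porting
```

The second case is the tension raised elsewhere in this report: an app that needs a new interface must wait for a minor release before it can ship.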

The consensus was that the process taken as described verbally was sound; it was just poorly documented in the written report. The suggestion for next time is to more explicitly document several things:

- process taken (to show that care was in fact taken)

- lessons learned: what worked, what didn't

- plans for the future

The team has made great progress in better documenting the use of Cytoscape by the biology community, with compelling statistics about the amount of use (including the impressive number of 1400 NIH grants). The changes made to the cytoscape.org front page with the tumblr feed showing images and the explicit encouragement that people should cite its use are great. The use of resources to also manually track the divergence between citation rates and use is entirely appropriate (with the interesting result that use is at least 2x the citations).

There are many new exciting technical directions. The new AppStore will benefit many constituencies: developers, end-users, and the PIs themselves in documenting usage of its efforts by the community. The set of new features chosen also reflects the needs of many constituencies, for example scaffolding new users with the new welcome/startup screen, and supporting developers with the new API. It's also heartening to see technology transfer from the visualization community with the incorporation of edge bundling.

The report mentioned new support for 3D rendering. Concerns were raised about whether devoting resources to this effort is appropriate given the empirical work from the visualization community that has found many drawbacks to 3D layout of node-link graphs. The verbal answer was that the new modular architecture allows alternate renderers, that 3D was simply one of several, and that it was developed by a community member rather than the core developers.

2) Outreach and Impact

At the last advisory board meeting it was suggested to “distribute open source network technologies to the greater scientific community”. At this meeting, Alex Pico presented the NRNB execution on that suggested deliverable. Simply stated, there has been awesome progress, and much of this stems from the direct leadership of Alex in his new role as an Executive Director of the NRNB. Whether measured by the recently published article in Nature Methods, “A travel guide to Cytoscape plugins”, by a visit to the Cytoscape App Store (http://apps.cytoscape.org, which you can reach by googling “cytoscape apps”), or by how often the apps are used, this stands out as a remarkable success. It is now possible to extend this powerful start and consider annotating it with sections for open source and non-open source apps. There is a possibility to begin a dialog between those who desire new apps and those willing to build them. It might even be possible to now have funding listed and contests to encourage the building out of the most requested apps.

3) Moving forward: Ideas and Topics for Discussion

Much of the discussion about moving the NRNB effort forward centered on increasing outreach to potential users of NRNB resources, including Cytoscape, as well as tracking the use of these resources. Big progress has been made already through the http://www.nrnb.org website and the Cytoscape App Store, but more could be done.

Some suggestions for increasing outreach to users included targeted communications to potential users, either those subscribed to the Cytoscape mailing list or authors of papers using Cytoscape. Connections to various social media resources like Twitter or Facebook could be increased. Quantitatively, this outreach could be measured by the number of groups using NRNB resources rather than by the number of papers citing these resources or Cytoscape. Some papers may not cite Cytoscape directly, but mention it only in Supplementary Information that is not searched, or do not cite it at all.

The impact of Cytoscape and NRNB tools in general could be increased by connecting to other public resources for molecular and computational biology. One example is a connection with GenomeSpace (www.genomespace.org), which is a platform that connects different bioinformatics tools, making it possible to move data smoothly between these tools and to leverage available analyses and visualizations. Other public resources that could benefit from a connection to Cytoscape include Galaxy, KnowledgeBase, and IGV. Sharing between users could be increased by enabling smooth sharing of Cytoscape networks on Google Drive or Amazon Cloud, as well as the use of Cytoscape Web.

One area of applications of network biology tools that could be significantly expanded going forward is social network research, especially analysis of social and molecular networks, and interactions between different groups.

The NRNB group made impressive progress with tens of successful Google Summer of Code projects. Going forward it would be great to track the careers of these students and of students from the NRNB mentorship program as another way to measure impact on community and science.

4) Suggestions For Next EAC Meeting and Report:

1. Next Report

This year's report was much better than last year's; however, there is still room for improvement.

As suggested, the emphasis shifted from the science results of the DBPs to the more appropriate new developments created through the TRDs; that's a major improvement. However, the problem of documenting to what extent the output of this and previous funding -- new tools or methods -- are used in biological discovery could be even more clearly addressed.

For example, in the group's own research papers that are not directly about the development of Cytoscape itself, to what extent was the use of Cytoscape instrumental in achieving the research results? We suggest that this story should be told very explicitly.

Another suggestion for the next round is to provide a full list of results or subprojects at a fine-grained level, for example a specific new Cytoscape plugin or a new analysis method proposed in a research paper. For each result, identify progress according to four key milestones:

1. Identify problems

2. propose solutions (for example, new methods in published paper)

3. build generally available tool

4. get other people to use it

The goal should not be to reach the final milestone for every idea, but to document progress in terms of moving from earlier ones to later ones. Subprojects may enter at any stage; they don't have to be seeded only through the DBPs in the original grant. Subprojects may also exit at any stage, for example when the decision is made to propose alternate new solutions rather than following up with tool building in every case. It was clear from the verbal discussion that the center should be able to produce some very satisfying accounts of its achievements along these lines, and that these proofs of accomplishment will be a compelling and convincing part of a renewal proposal. This type of reporting will also help with the argument that the impact of Cytoscape and the NRNB goes beyond simple publication counts and citation counts. The deeper goal of the center is to introduce and encourage network methods in the biology community, so documenting the adoption of methods and tools shows progress towards that goal.
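One possible way to keep such a fine-grained list is a small record per subproject that stores the milestone at which it entered and the milestone reached at each report. The sketch below is only an illustration of that bookkeeping; the `Milestone` and `Subproject` names and the two sample entries are hypothetical and do not describe actual NRNB subprojects.

```python
from dataclasses import dataclass
from enum import IntEnum


class Milestone(IntEnum):
    """The four milestones suggested by the EAC, in order."""
    IDENTIFY_PROBLEM = 1
    PROPOSE_SOLUTION = 2
    BUILD_TOOL = 3
    ADOPTED_BY_OTHERS = 4


@dataclass
class Subproject:
    name: str
    entered_at: Milestone   # subprojects may enter at any stage
    last_report: Milestone  # milestone reached at the previous report
    this_report: Milestone  # milestone reached now

    def advanced(self) -> int:
        """Number of milestones gained since the previous report."""
        return int(self.this_report) - int(self.last_report)


# Hypothetical entries for illustration only.
subprojects = [
    Subproject("example plugin A", Milestone.IDENTIFY_PROBLEM,
               Milestone.PROPOSE_SOLUTION, Milestone.BUILD_TOOL),
    Subproject("example method B", Milestone.PROPOSE_SOLUTION,
               Milestone.PROPOSE_SOLUTION, Milestone.PROPOSE_SOLUTION),
]

for sp in subprojects:
    print(f"{sp.name}: {sp.this_report.name} (+{sp.advanced()} since last report)")
```

Reporting the change between reports, rather than only the current stage, matches the stated goal of documenting movement from earlier milestones to later ones without requiring every idea to reach the last one.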

A second suggestion is to more clearly explain the boundary between this P41 and the other sources of funding: the related R01, and the grants supporting the DBPs. Ideker articulated a clear story in response to EAC questions: the $300K/yr R01 funds maintenance, while new technology springs from the $700K/yr P41. The committee approves of this story; it just needs to be told clearly and concisely in the written materials. In particular, document what efforts are funded through the R01 and what are through the P41. Although the NRNB has broader scope than Cytoscape alone, since it is partially funding core Cytoscape work the best way to address this boundary is to at least briefly present the full picture of what work on Cytoscape has been done, and then to explain what parts were funded by the P41. The current report gives the full picture of Cytoscape development, but does not adequately explain the boundary.

The administrative information section is very well done. The budget is clearly explained, with crosscutting breakdowns between categories (staff vs. TRDs vs. PI salaries) and PI groups. The breakdown of expenses according to both FTEs and money was also helpful. The discussion of the importance of actively cultivating an open development community is articulate.

2. Next Meeting

First, the EAC should be sent the relevant written materials to read in advance of the actual meeting. This year, the report was provided on paper to committee members at the start of the meeting, with an electronic version following a few hours into the meeting. This timing is too late, because it's hard to assimilate the written report in parallel with attending to the presentations. The report should be provided to committee members in advance, ideally one week before the meeting, and at a bare minimum two days before the meeting. The late timing this year was particularly frustrating given that this report was created many months ago, but through an oversight hadn't been forwarded to us.

Second, the EAC agreed that we would best serve the interests of the NRNB by scheduling our next meeting shortly before the renewal proposal is due in what we think will be June 2014. Our intent is to act as pre-reviewers, where we will read a full draft of the proposal in detail before the meeting and then devote the meeting to an in-depth discussion of ways to strengthen and improve it. We propose roughly six weeks before the proposal is due: early enough that our feedback can be responded to, but late enough that the draft proposal is nearly complete rather than preliminary. This meeting would be roughly 1.5 years from now, the same amount of time that has elapsed between our first and second meetings.

Third, a suggestion for the renewal proposal is to have a large set of short testimonials from users, rather than (or in addition to) the more usual approach of full formal letters of support from a small number of people. The testimonials would be a few sentences or a paragraph about how Cytoscape has been valuable in their own work; having dozens or even hundreds of these compiled together in one document might have enormous impact on reviewers.

5) Collaborations and service projects

A major goal of NRNB is to support collaborations with a broad variety of researchers in biomedical science. Different types of collaborations have been initiated, from very small support-style collaborations to larger collaborations that require active participation by NRNB. The EAC was very impressed with the overall number of collaborations. At the time of the previous SAB meeting, there were 36 active research collaborations with NIH-supported researchers. In the last 1.5 years or so, another 60 were added, making a total of 96. One issue is that the majority of collaborations are internal. Better advertisement of NRNB and its collaborative goals at relevant scientific conferences may help to acquire more external collaborations.

Collaborations are only a small part of the NRNB budget with an estimated cost of ~$100,000, but are highly effective at leveraging the NRNB expertise to expand the overall impact and reach of Cytoscape.

The term ‘collaboration’ is used in a way that is somewhat ambiguous: the CSP umbrella includes tiny-scope efforts called ‘support’ (33%), small-scope efforts called ‘consulting’, and medium-scope efforts called ‘collaboration’. However, the DBPs are what we might consider true collaborations, and the hope is that some of the medium-scope efforts would evolve into new DBPs over time, even as some previous DBPs might be scaled back into a smaller role. Since the term ‘CSP’ is the standard vocabulary defined by the grant, perhaps it is not realistic to rename these medium-scope efforts. It would be useful to see these numbers broken down into internal versus external collaborations.

These collaborations are currently tracked in a publicly available and transparent way on the NRNB web site with titles, investigators, and NRNB contact. It would be useful if their status could also be tracked.

For the renewal, it will be very important to obtain letters or a completed survey from collaborators regarding the utility of Cytoscape and how it changed their research.

6) Promising ideas for potential supplemental funding

The first supplement provided to the NRNB, which enabled the Cytoscape App Store project, has turned out to be a remarkable return on investment, demonstrating a capacity for greater creativity and productivity. We highly recommend additional supplemental grants to maintain, or even increase, this level of activity. During the advisory meeting, we explored a number of proposals worth considering:

1. Moving NeXO forward (see TRD A) by partnering with existing GO projects

2. Enable Cytoscape users to record/reuse/host/share workflows and sessions to promote network biology use cases, enriched publications, reproducibility and collaboration.

3. Interface with a specific key technology that targets a strategic community ripe for a network biology perspective and tools (e.g., MIDAS, UCSC Genome Browser, NCBO BioPortal, Galaxy, GenomeSpace, Sage Bionetworks/Synapse, DREAM)