communicating systems biology - why and how we should do better in a digital world

Communicating Systems Biology – Why and How We Should Do Better in a Digital World? Philip E. Bourne, University of California San Diego. [email protected] http://www.slideshare.net/pebourne/ ICBP Houston, April 27, 2012

DESCRIPTION

ICBP Workshop, Houston, April 27, 2012

TRANSCRIPT

Page 1: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

ICBP Houston April 27, 2012

Communicating Systems Biology – Why and How We Should Do Better in a Digital World?

Philip E. Bourne
University of California San Diego

[email protected]
http://www.slideshare.net/pebourne/

Page 2: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Why We Should Do Better

• Discovery processes are increasingly complex and broad in scope

• Data must be connected more closely to the methods under study

• Science is an increasingly social endeavor

http://www.discoveryinformaticsinitiative.org/
Yolanda Gil and Haym Hirsh

Page 3: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Why We Should Do Better

The Scientific Process is Too Slow to Respond to a Crisis – Either Global or Personal

Motivation

http://knol.google.com/k/plos-currents-influenza#

By the time the paper is published we could all be dead

Page 4: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

[Figure: Structure Summary page activity for H1N1 influenza related structures, Jan. 2008 to Jul. 2010. Structures shown: 1RUZ: 1918 H1 Hemagglutinin; 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir]

* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm

In a time of crisis the need for fast access to accurate data, and to any knowledge associated with that data, is paramount

Motivation

Page 5: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

If that is not enough…

For some people the scientific process may be too slow to save their life

Motivation

Page 6: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Josh Sommer – A Remarkable Young Man

Co-founder & Executive Director of the Chordoma Foundation

http://sagecongress.org/Presentations/Sommer.pdf

Motivation

Page 7: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Chordoma

• A rare form of brain cancer

• No known drugs

• Treatment – surgical resection followed by intense radiation therapy

Motivation

http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG

Page 8: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

http://sagecongress.org/Presentations/Sommer.pdf

Motivation

Page 9: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

http://sagecongress.org/Presentations/Sommer.pdf

Motivation

Page 10: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

http://sagecongress.org/Presentations/Sommer.pdf

Motivation

Page 11: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

http://sagecongress.org/Presentations/Sommer.pdf

Motivation

Page 12: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Motivation

http://sagecongress.org/Presentations/Sommer.pdf

Page 13: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation

Motivation

Page 14: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Science is an Increasingly Social Endeavor

Witness the Story of Meredith

Page 15: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

A Requirement is More Open Science

But ….

Page 16: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Openness is Misunderstood by Scientists

• Witness the confusion regarding open access

• Witness PubMed Central

Page 17: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Are the Impediments to Open Science?

Change Reward

You don’t get tenure for starting a blog!

Page 18: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

How Can We Do Better? …

Page 19: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

How Can We Do Better?

• Better communication, data and knowledge access, and new modes of discovery, which means:
– We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
– We need to be more open with both
– We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
– Reward systems need to change
– We need scientist management and discovery tools
– We need to be less fixated on the big data problems
– We need to unleash the full power of the Internet

Easy Hard

Page 20: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Both Are Under Stress

• PubMed contains ~21M entries (May 2011)

• ~100,000 papers indexed per month

• In Feb 2009:
– 67,406,898 interactive searches were done
– 92,216,786 entries were viewed

• 1330 databases reported in NAR 2011

• MetaBase http://biodatabase.org reports 2,651 entries edited 12,587 times

PLoS Comp. Biol. 2005 1(3) e34

Page 21: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Some More Comparisons

Databases versus journals

Journals:
• Journals have a pretty standardized interface
• Journals have a business model
• The quality is declining as numbers increase (?)
• Audience believes they are sustainable

Databases:
• Efforts to make the interfaces different!
• Little attempt at a business model compared to the Web 2.0 world
• Quality is increasing (?)
• Not well sustained

PLoS Comp. Biol. 2008. 4(7): e1000136

Page 22: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

We Need Data and Knowledge About That Data to Interoperate

The Knowledge and Data Cycle

0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB

1. User clicks on content
2. Metadata and webservices to data provide an interactive view that can be annotated
3. Selecting features provides a data/knowledge mashup
4. Analysis leads to new content I can share

PLoS Comp. Biol. 2005 1(3) e34

Page 23: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

We Need Data and Knowledge About That Data to Interoperate – What is Stopping Us?

• Governance – publishers vs. database providers

• Reward
• Metadata standards for provenance, privacy etc.
• Exemplars
• ….

Caveat: Each discipline is different – I speak very much from a biomedical sciences perspective

Page 24: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

A Small Example - The World Wide Protein Data Bank

• The single worldwide repository for data on the structure of biological macromolecules

• Vital for drug discovery and the life sciences

• 41 years old
• Free to all

http://www.wwpdb.org

We need data and knowledge about that data to interoperate

PLoS Comp. Biol. 2005 1(3) e34

Page 25: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The World Wide Protein Data Bank – The Best Case Scenario

• Paper not published unless data are deposited – strong data to literature correspondence

• Highly structured data conforming to extensive ontologies

• DOIs assigned to every structure

http://www.wwpdb.org

We need data and knowledge about that data to interoperate

PLoS Comp. Biol. 2005 1(3) e34

Page 26: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

www.rcsb.org/pdb/explore/literature.do?structureId=1TIM

Example Interoperability: The Database View

We need data and knowledge about that data to interoperate

BMC Bioinformatics 2010 11:220

Page 27: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Example Interoperability: The Literature View

http://biolit.ucsd.edu

Nucleic Acids Research 2008 36(S2) W385-389

We need data and knowledge about that data to interoperate

Page 28: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Semantic Tagging & Widgets are a Powerful Tool to Integrate Data and Knowledge of that Data, But as Yet Not Used Much

Will Widgets and Semantic Tagging Change Computational Biology? PLoS Comp. Biol. 6(2) e1000673

We need data and knowledge about that data to interoperate

Page 29: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Semantic Tagging of Database Content in The Literature or Elsewhere

http://www.rcsb.org/pdb/static.do?p=widgets/widgetShowcase.jsp

PLoS Comp. Biol. 6(2) e1000673

Semantic Tagging
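As a rough illustration of what semantic tagging of database content in text could look like, the sketch below wraps PDB identifiers found in a passage with links. It is only a sketch under stated assumptions: the function name `tag_pdb_ids`, the `pdb-id` class and the URL template are hypothetical and are not taken from the RCSB widgets or from BioLit. Candidate tokens (one digit followed by three alphanumerics) are only tagged when they appear in a caller-supplied set of known identifiers, so that ordinary tokens such as years are left alone.

```python
import re

# Candidate PDB identifiers: one digit followed by three alphanumerics.
PDB_ID = re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")

# Hypothetical URL template for a structure summary page (an assumption).
URL_TEMPLATE = "https://www.rcsb.org/structure/{pdb_id}"


def tag_pdb_ids(text, known_ids):
    """Wrap known PDB identifiers in `text` with an HTML link.

    `known_ids` is a collection of identifiers assumed to come from the
    database; tokens not in it (e.g. the year 2012) are left untouched.
    """
    known = {pdb_id.upper() for pdb_id in known_ids}

    def link(match):
        token = match.group(0)
        pdb_id = token.upper()
        if pdb_id not in known:
            return token  # looks like an ID but is not one we know about
        url = URL_TEMPLATE.format(pdb_id=pdb_id)
        return f'<a class="pdb-id" href="{url}">{token}</a>'

    return PDB_ID.sub(link, text)


if __name__ == "__main__":
    sentence = "In 2012 we revisited 1RUZ and 3b7e."
    print(tag_pdb_ids(sentence, {"1RUZ", "3B7E"}))
```

Checking candidates against a list of known identifiers, rather than trusting the pattern alone, is the design choice that keeps the tagger from linking arbitrary four-character tokens.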

Page 30: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Where Will It All End?

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html

Page 31: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

This is Literature Post-processing

Better to Get the Authors Involved

• Authors are the absolute experts on the content

• More effective distribution of labor

• Add metadata before the article enters the publishing process

We need data and knowledge about that data to interoperate

Page 32: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Word Add-in for Authors

• Allows authors to add metadata as they write, before they submit the manuscript

• Authors are assisted by automated term recognition
– OBO ontologies
– Database IDs

• Metadata are embedded directly into the manuscript document via XML tags, OOXML format
– Open
– Machine-readable

• Open source, Microsoft Public License

http://www.codeplex.com/ucsdbiolit

We need data and knowledge about that data to interoperate
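To make the idea of embedding metadata via XML tags concrete, here is a minimal sketch that wraps a recognized term in a machine-readable element. The element name `term` and the attributes `source` and `id` are illustrative assumptions; they are not the OOXML markup the Word add-in actually writes, and the helper `annotate` is hypothetical.

```python
import xml.etree.ElementTree as ET


def annotate(sentence, term, source, identifier):
    """Return a <p> element in which the first occurrence of `term`
    is wrapped in a <term> element carrying its database metadata."""
    para = ET.Element("p")
    start = sentence.find(term)
    if start < 0:
        para.text = sentence  # nothing recognized, leave the text as-is
        return para
    para.text = sentence[:start]
    tag = ET.SubElement(para, "term", {"source": source, "id": identifier})
    tag.text = term
    tag.tail = sentence[start + len(term):]
    return para


if __name__ == "__main__":
    para = annotate(
        "The structure 1TIM is triosephosphate isomerase.",
        term="1TIM", source="PDB", identifier="1TIM",
    )
    print(ET.tostring(para, encoding="unicode"))
```

Because the annotation travels inside the document, a downstream tool can recover it by parsing the XML and reading the attributes, without re-running term recognition.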

Page 33: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Challenges

• Authors
– Carrot: if one or more publishers fast-tracked a paper that had semantic markup, it might catch on

• Publishers
– Carrot: competitive advantage

We need data and knowledge about that data to interoperate

Page 34: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The Promise – A Hypothetical Example

Immunology Literature

Cardiac Disease Literature

Shared Function

We need data and knowledge about that data to interoperate

Page 35: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

How Can We Do Better?

• Better communication, data and knowledge access, and new modes of discovery, which means:
– We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
– We need to be more open with both
– We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
– Reward systems need to change
– We need scientist management and discovery tools
– We need to be less fixated on the big data problems
– We need to unleash the full power of the Internet

Easy Hard

Page 36: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

One Small Example of the Problem

• Jmol, VMD … are de facto standard, important tools for rendering biological molecules, but
• They are not versatile, i.e. they do not, for example:
– Respond to the data they are reading
– Offer views that match the user's interests
– Allow the user to annotate the data
– Allow those annotations to be shared (published?)

Think More About the Tools

Page 37: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Github is Great But We Need Apps for Science

Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136

Page 38: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

A Few Things to Accelerate the Rate of Scientific Discovery

• Better communication, data and knowledge access, and new modes of discovery, which means:
– We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
– We need to be more open with both
– We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
– Reward systems need to change
– We need scientist management tools
– We need to be less fixated on the big data problems
– We need to unleash the full power of the Internet

Easy Hard

Page 39: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Reward Systems Need to Change

What is Needed?

• Author disambiguation
• Auditing (identification and metrics) of all scholarship - means new tools
• Seniors need to promote alternative forms of scholarship
• Juniors need to respond

Ten Simple Rules for Getting Promoted as a Computational Biologist in Academia. PLoS Comp. Biol. 2011 7(10) e1002001

Page 40: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Are these Alternative Forms of Scholarship?

• Research [Grants]
• Journal Article
• Conference Paper
• Poster Session
• Reviews
• Blogs
• Community Service/Data Curation

Reward Systems Need to Change

Page 41: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Reward Systems Need to Change

Page 42: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

A Unique Identifier is Going to Happen

• It is DOIs for people
• Some scientists will resist
• The winner is ORCID?

Reward Systems Need to Change
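For readers unfamiliar with what a "DOI for people" looks like in practice: an ORCID iD is a 16-character identifier whose last character is a check character computed with the ISO 7064 MOD 11-2 scheme. The sketch below validates that check character. The function names are my own, and the two sample iDs used in the demo are the example identifiers from ORCID's public documentation, not identifiers from this talk.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the
    first 15 digits of an ORCID iD."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Return True if `orcid` (with or without hyphens) has 16 characters
    and a correct final check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    for candidate in ("0000-0002-1825-0097", "0000-0002-1694-233X",
                      "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A checksum of this kind only catches typing errors; it says nothing about whether the identifier has actually been registered to a person.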

Page 43: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Ideally the ID will be Tagged to Every Piece of Scholarly Communication

I Am Not a Scientist, I Am a Number. PLoS Comp. Biol. 2008 4(12) e1000247

Reward Systems Need to Change

Page 44: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

One Solution: Use the Traditional Reward System in New Ways

The Wikipedia Experiment – Topic Pages

• Identify areas of Wikipedia that relate to the journal and that are missing or are stubs

• Develop a Wikipedia page in the sandbox

• Have a Topic Page Editor review the page

• Publish the copy of record with associated rewards

• Release the living version into Wikipedia

Page 45: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

How Can We Do Better?

• Better communication, data and knowledge access, and new modes of discovery, which means:
– We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
– We need to be more open with both
– We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
– Reward systems need to change
– We need scientist management and discovery tools
– We need to be less fixated on the big data problems
– We need to unleash the full power of the Internet

Easy Hard

Page 46: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The Truth About My Laboratory

• I have ?? mail folders!

• The intellectual memory of my laboratory is in those folders

• This is an unhealthy hub and spoke mentality

We Need Scientist Management Tools

Page 47: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The Truth About My Laboratory

• I generate way more negative than positive data, but where is it?
• Content management is a mess
– Slides, posters…..
– Data, lab notebooks ….
– Collaborations, Journal clubs …
• Software is open but where is it?
• Farewell is for the data too

Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136

We Need Scientist Management Tools

http://artbyvida.com/portfolio.php

Page 48: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Many Great Tools Out There

We Need Scientist Management Tools

Taverna

Page 49: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The Dream of Discovery Informatics

• At the end of the day a software agent reviews all of our lab's electronic notebooks. Common themes and individual interests are extracted and searched against recent literature, public data, blogs and other social media, and the results are returned and ranked for perusal the next morning over coffee.
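A toy version of such an agent can be sketched in a few lines: extract the terms that recur across notebook entries, then rank candidate documents by how many of those terms they mention. This is only a sketch under stated assumptions; the function names, the tiny stopword list and the overlap score are all illustrative choices, and a real system would need far better text analysis and real data sources.

```python
import re
from collections import Counter

# Minimal illustrative stopword list (an assumption, not a real resource).
STOPWORDS = {"the", "a", "an", "and", "of", "in", "into", "to", "for", "with", "on"}


def tokens(text):
    """Lowercase word tokens with stopwords removed."""
    words = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def common_themes(notebooks, top_n=5):
    """Terms that occur in the most notebook entries (counted once per entry)."""
    counts = Counter()
    for entry in notebooks:
        counts.update(set(tokens(entry)))
    return [term for term, _ in counts.most_common(top_n)]


def rank(themes, documents):
    """Rank document titles by the number of themes they mention.

    `documents` maps a title to its text; documents with no overlap are dropped.
    Ties are broken alphabetically by title.
    """
    theme_set = set(themes)
    scored = []
    for title, text in documents.items():
        score = len(theme_set & set(tokens(text)))
        if score:
            scored.append((score, title))
    return [title for score, title in sorted(scored, key=lambda s: (-s[0], s[1]))]


if __name__ == "__main__":
    notebooks = [
        "Docked zanamivir into neuraminidase pocket",
        "Neuraminidase mutants reduce zanamivir binding",
    ]
    documents = {
        "Paper A": "Zanamivir resistance in neuraminidase",
        "Blog B": "Neuraminidase structure notes",
        "Paper C": "Chordoma cell lines",
    }
    print(rank(common_themes(notebooks, top_n=2), documents))
```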

Page 50: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

How Can We Do Better?

• Better communication, data and knowledge access, and new modes of discovery, which means:
– We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
– We need to be more open with both
– We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
– Reward systems need to change
– We need scientist management tools
– We need to be less fixated on the big data problems
– We need to unleash the full power of the Internet

Easy Hard

Page 51: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Yes YouTube Can Increase the Rate of Discovery

Unleash the full power of the Internet

Page 52: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

The Lab Experiment

Paper+Rich Media

• My students enjoyed the experience
• The shyest student was actually the most bold in front of the camera
• “We will become a generation of ‘science casters’”
• They liked the exposure for the most part – it puts them out in front rather than the PI

Unleash the full power of the Internet

Page 53: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Organic Growth

• Some of their work viewed 20,000+ times
• Global audience of researchers, educators and academic/research institutions
– 60,000 unique visitors & 2M pageviews/month
– 16,000 registered users & 600 communities
– 5,000 uploads of video content (about journal articles, conferences, research news and classes)
– Growing 4-5% monthly
• Sustainability - evolving a business model supporting journals and conferences

3 Years Later

www.scivee.tv

Unleash the full power of the Internet

Page 54: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Emerged: SciveeCasts

Products

Application   Product                         Primary Customers
Journals      PubCast                         Journals, publishers, societies
Meetings      PosterCast, SlideCast           Societies, conference orgs.
Comm.         PaperCast, Podcast, SlideCast   Societies, journals
Education     PosterCast, SlideCast           Societies, universities
Books         BookCast                        Publishers, book sellers

Unleash the full power of the Internet

Page 55: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Proposal - The TeachU Workflow

Platforms: Mac, PC; Android, iPhone, Windows Phone 7

Step 1: presenter starts PowerPoint
Step 2: presenter starts recording on smart phone
Step 3: presenter stops recording and initiates upload
Step 4: slides are uploaded
Step 5: slides and podcast are automatically synchronized
Step 6: listener plays back synchronized presentation

Diagram labels: Slides, Podcast, Sync File, Website
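The synchronization step in this workflow can be pictured with a small sketch: if the sync file records the time at which each slide came up during the recording, playback only needs to look up the slide for the current audio position. The list-of-tuples layout and the function name `slide_at` are assumptions made for illustration; the slides do not specify the actual sync file format.

```python
from bisect import bisect_right


def slide_at(sync_points, seconds):
    """Return the slide number to show at playback time `seconds`.

    `sync_points` is a list of (start_seconds, slide_number) pairs sorted
    by start time, a hypothetical stand-in for the sync file contents.
    Times before the first entry map to the first slide.
    """
    starts = [start for start, _ in sync_points]
    index = bisect_right(starts, seconds) - 1
    return sync_points[max(index, 0)][1]


if __name__ == "__main__":
    sync_points = [(0.0, 1), (42.5, 2), (90.0, 3)]
    for t in (0, 42.5, 60, 500):
        print(t, "->", slide_at(sync_points, t))
```

A binary search keeps the lookup cheap even when the listener scrubs back and forth through a long recording.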

Page 56: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Acknowledgements

• BioLit Team
– Lynn Fink
– Parker Williams
– Marco Martinez
– Rahul Chandran
– Greg Quinn

• MBT
– John Moreland
– John Beaver

• Microsoft Scholarly Communications
– Pablo Fernicola
– Lee Dirks
– Savas Parastatidis
– Alex Wade
– Tony Hey

• wwPDB team
– Andreas Prlić
– Dimitris Dimitropoulos

• SciVee Team
– Apryl Bailey
– Leo Chalupa
– Lynn Fink
– Marc Friedman (CEO)
– Ken Liu
– Alex Ramos
– Willy Suwanto
– Ben Yukich

http://www.scivee.tv
http://biolit.ucsd.edu
http://www.pdb.org
http://www.codeplex.com/ucsdbiolit

Page 57: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

Questions?

[email protected]

Page 58: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Is Open Science

• Unrestricted access and reuse of scientific knowledge as found in the literature and elsewhere provided attribution is given

• Ditto the data, protocols, software etc. from which that knowledge is derived

• Something catalyzed by the Fourth Paradigm

Page 59: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Motivates Me to Talk About Open Science?

• I am a domain (life) scientist not a computer or information scientist

• I have been co-directing a major open and freely accessible biological data source – the Protein Data Bank (PDB) for the past 11 years.

• Almost 6 years ago I co-founded and remain the founding Editor in Chief of the open access journal PLoS Computational Biology

• I co-founded SciVee.tv to disseminate science in new ways

• There must be a business model to enable persistence and growth

Page 60: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

What Are the Promises of Open Science?

• To accelerate the rate of scientific discovery worldwide

• To enable contributions from a broader geographic and economic base

• To approach learning and comprehension in new ways

• To reach a broader audience including the general public

Page 61: Communicating Systems Biology - Why and How We Should Do Better in a Digital World

MBT Features

http://mbt.sdsc.edu

• Offers a framework, not an end user application
• Responds to the data type
• Supports read/write access
• Encourages others to write end user applications
• Discourages feature creep

Think More About the Tools

Immunome Research, 2007 3(1):3

Immunologists

Medicinal Chemists

BMC Bioinformatics 2005, 6:21.