directions in open science
DESCRIPTION
Directions in Open Science. Mike Travers, SRI Bioinformatics Research Group. For AIC Lunch and Learn, 30 Jan 2012. About this talk: partly a trip report from Open Science Summit 2011, partly an attempt to define open science and explore its impact.
TRANSCRIPT
![Page 1: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/1.jpg)
Directions in Open Science
Mike Travers
SRI Bioinformatics Research Group
For AIC Lunch and Learn, 30 Jan 2012
![Page 2: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/2.jpg)
About this talk
• Partly a trip report from Open Science Summit 2011
• Partly an attempt to define open science and explore its impact
• Partly an excuse to talk about some of my own vaguely related work
• And partly some semi-crazy speculation about future projects in this space
![Page 3: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/3.jpg)
The Open Science Summit unites researchers, life science industry professionals, students, patients and other stakeholders to discuss the future of collaborative science and innovation.
…in-depth sessions on new models for drug discovery and clinical trials, personal genomics, the patent system, the future of scientific publications, and more.
![Page 4: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/4.jpg)
What is Open Science?
• Many different things, but boils down to:
• Removing barriers to scientific communication and collaboration:
– Social
– Technical
– Legal
– Economic
– Bureaucratic
• To accelerate scientific progress
• Utilizing modern technology
![Page 5: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/5.jpg)
Driven by technological change
• The Internet has radically reduced communication costs
• So old institutions of scientific communication are now obstacles
– Closed academic publishers, notably:
• Internet will transform scientific media just like it has newspapers, TV, social life….
• The difference is: science is more important than sharing cat pictures
![Page 6: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/6.jpg)
For-profit academic publishing is a racket
A very lucrative one
Starting to be rumbles of complaint (boycotts) from academics
![Page 7: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/7.jpg)
Open Access
• Most visible and successful branch of open science
• Articles are free to read, pay to publish
• Funders are starting to require some form of public access
![Page 8: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/8.jpg)
Gold: OA journal, Green: OA self-archiving
Open Access to the Scientific Journal Literature: Situation 2009, PLoS ONE, Bo-Christer Björk et al
![Page 9: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/9.jpg)
Research Works Act
• H.R.3699 – “A bill to ensure the continued publication and integrity of the peer-reviewed research works by the private sector.”
No Federal agency may adopt, implement, maintain, continue, or otherwise engage in any policy, program, or other activity that--
(1) causes, permits, or authorizes network dissemination of any private-sector research work without the prior consent of the publisher of such work; or
(2) requires that any actual or prospective author, or the employer of such an actual or prospective author, assent to network dissemination of a private-sector research work.
![Page 10: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/10.jpg)
Myth 1: American consumers have a right to free access to articles their tax dollars fund.
Fact: American taxpayers do not fund peer reviewed research articles; they fund some of the research that is used in those articles…
![Page 11: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/11.jpg)
Beyond Open Access
• Not going to say a whole lot about OA, because:
• It’s easy to understand
• It’s pretty clearly going to win in the long term
• By itself, not a very radical change to how science is done:
– Knowledge is still in paper-sized chunks
– Papers are peer-reviewed prior to publication
– Once something is published, it’s static
• All these parameters are being challenged in some way by other efforts
• George Whitesides (Harvard chemist): “The concept of the scientific paper is eroding before our very eyes”
![Page 12: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/12.jpg)
Variations on publishing
• “Peer review is broken”
– Too slow
– Too biased
– Too rigid
– May be “the worst system except for all the others”
• Pre-peer-review publication
– Eg arXiv.org
• Micropublication
– Crowdsourcing, blogs, wikis….
• Open-notebook science
– No gap at all between bench and publication
• Database-linked publications
• Dynamic Review Papers
![Page 13: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/13.jpg)
Biggest sequencing operation in the world
Generating 6 terabytes/day of genomic data
Open-Source Genomic Analysis of Shiga-Toxin–Producing E. coli O104:H4, Rohde et al 2011 (NEJM)
Toxic E. coli outbreak in Germany, May 2011:
“We released these data into the public domain… which elicited a burst of crowd-sourced, curiosity-driven analyses carried out by bioinformaticians on four continents. Twenty-four hours after the release of the genome, it had been assembled; … Five days after the release of the sequence data, we had designed and released strain-specific diagnostic primer sequences, and within a week, two dozen reports had been filed on an open-source wiki …dedicated to analysis of the strain”
Sequenced the rice genome
https://github.com/ehec-outbreak-crowdsourced
![Page 14: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/14.jpg)
GigaScience is a new integrated database and journal co-published in collaboration between BGI Shenzhen and BioMed Central, to meet the needs of a new generation of biological and biomedical research as it enters the era of "big-data."
![Page 15: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/15.jpg)
Dynamic Review Papers
Conventional paper
Paired with
Dynamically-updated, wiki-based paper/database/model
![Page 16: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/16.jpg)
![Page 17: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/17.jpg)
Driving apps
![Page 18: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/18.jpg)
Who comes to Open Science Summits?
![Page 19: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/19.jpg)
Activist Organizations
![Page 20: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/20.jpg)
Participatory Medicine & Disease Foundations
![Page 21: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/21.jpg)
Startups
Social paper and citation management
Scientific services marketplace
Web-based molecule library management
![Page 22: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/22.jpg)
Citizen Science
![Page 23: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/23.jpg)
Somewhat less garage-y
• Independent research institute, started from data released by Merck
• Repository of experimental data (Sage Commons)
• Network of cooperating institutions
• Starting to build a computational platform (Synapse)
![Page 24: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/24.jpg)
Synthetic Biology
![Page 25: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/25.jpg)
And some individual researchers
• Peter Murray-Rust
Chemist, Cambridge, promoter of Chemical Markup Language and semantic web
“Closed science makes people die!”
• Victoria Stodden
Statistician, Columbia, reproducibility of computational science (cf ClimateGate)
![Page 26: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/26.jpg)
Some open science success stories
• Galaxy Zoo
• FoldIt
• Nutrient Network (NutNet)
• Praziquantel synthesis
![Page 27: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/27.jpg)
![Page 28: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/28.jpg)
Galaxy Zoo
• Citizen science (loosely)
• Image classification task
• Mechanical Turk-like approach (but unpaid)
• About 200K participants
• Discovered a whole new class of galaxies (“green pea”) and a quasar mirror
• 22 published papers in 3 years
![Page 29: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/29.jpg)
![Page 30: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/30.jpg)
![Page 31: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/31.jpg)
![Page 32: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/32.jpg)
Social sharing of algorithms (“recipes”)
Descent with modification
![Page 33: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/33.jpg)
Matthew Todd, chemist at U of Sydney
Schistosomiasis
Looking for a synthesis of the known drug Praziquantel (PZQ) in enantiopure form
Open-notebook science (LabTrove)
![Page 34: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/34.jpg)
![Page 35: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/35.jpg)
Nutrient Network (NutNet)
![Page 36: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/36.jpg)
![Page 37: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/37.jpg)
What paper has the most authors?
• NutNet paper: 40 authors, 41 institutions
• This one from SLAC and elsewhere: 407 authors, but only 35 institutions
![Page 38: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/38.jpg)
Three variations on the scientific process
• Automated Science
• Distributed Science
• Web-scale Intelligent Science
• Open Science as the lubrication / accelerant that makes these feasible
![Page 39: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/39.jpg)
Afferent: Automation for Drug Discovery
• Combinatorial Chemistry
• Planning software to drive lab robots
![Page 40: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/40.jpg)
![Page 41: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/41.jpg)
Distributed Science
• Some science (eg evaluation of drug candidates) is highly parallelizable,
• Hence distributable
• CollabRx was initially supposed to support “virtual pharma companies” that would tie disparate academic research efforts into focused teams
![Page 42: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/42.jpg)
![Page 43: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/43.jpg)
Web-scale Intelligent Science
• Imagine all of science as a giant distributed computational process
• Individual scientists are agents
– Working on a small part of the problem
– Sharing their results
– Getting feedback and funding dependent on success
• Centralized data integration and decision tools used to help determine next useful experiment
![Page 44: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/44.jpg)
Steps towards distributed intelligence
• Adaptive clinical trials
– Rather than a classical trial with two arms run to completion
– Change the distribution of test cases based on ongoing results
• Now imagine this strategy applied more globally across all treatments for a disease
• Credit for this slightly mad vision goes mainly to Marty Tenenbaum:
– AI Meets Web 2.0 (2006)
– Shrager, Tenenbaum, Travers, Cancer Commons: Biomedicine in the Internet Age (2011)
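The adaptive-trial idea on this slide (shift new patients toward whichever arm is doing better so far) can be illustrated with a small simulation. This is only a sketch under stated assumptions: it uses Thompson sampling with Beta posteriors, which is one common allocation rule and not necessarily what any real trial or the cited papers use, and the function name `run_adaptive_trial` and the response rates are invented for illustration.

```python
import random


def run_adaptive_trial(true_response_rates, n_patients, seed=0):
    """Simulate an adaptive trial; return the number of patients assigned to each arm.

    true_response_rates: hypothetical per-arm probability that a patient responds.
    Allocation rule (an assumption of this sketch): Thompson sampling with a
    Beta(successes + 1, failures + 1) posterior per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_response_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(n_patients):
        # Draw one plausible response rate per arm from its current posterior.
        draws = [
            rng.betavariate(successes[arm] + 1, failures[arm] + 1)
            for arm in range(n_arms)
        ]
        # Assign the next patient to the arm whose draw is highest.
        arm = max(range(n_arms), key=draws.__getitem__)

        # Observe the (simulated) outcome and update that arm's counts.
        if rng.random() < true_response_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return [s + f for s, f in zip(successes, failures)]


if __name__ == "__main__":
    allocation = run_adaptive_trial([0.2, 0.8], n_patients=1000)
    print("patients per arm:", allocation)
```

A classical two-arm trial would split patients roughly evenly until completion; here the split drifts toward the better-performing arm as evidence accumulates, which is the behaviour the slide describes.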
![Page 45: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/45.jpg)
What does all that have to do with Open Science?
• Open Science is lowering barriers to collaboration
• So it’s a necessary but not sufficient step towards this new kind of science
• CollabRx may just have been too early:
– the groundwork hasn’t been laid yet,
– we are still working on basics
– (eg standards for representation)
• Reducing friction (or transaction costs) can be incredibly important
![Page 46: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/46.jpg)
“Changing the cost of innovation fundamentally changes the nature of innovation” – Joichi Ito
TCP, HTTP etc are the containerization of data.
So what’s the analog for scientific knowledge?
![Page 47: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/47.jpg)
Standardized Legal and Institutional Mechanisms
![Page 48: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/48.jpg)
A mix of technical, institutional, and legal standardization:
-Standard licenses (parameterizable)
-RDF representation for licenses.
-Web Tools to generate these
-Sites that collect and “market” available materials.
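To make "parameterizable standard licenses" with an "RDF representation" concrete, here is a rough sketch that turns a few license options into RDF in Turtle syntax. Everything specific in it is an assumption: the slide does not say which vocabulary is used, so the sketch borrows terms from the Creative Commons rights vocabulary (`cc:permits`, `cc:requires`, `cc:prohibits`), and the function name, parameters, and `example.org` license URI are hypothetical.

```python
CC_NS = "http://creativecommons.org/ns#"


def license_to_turtle(license_uri, allow_derivatives=True,
                      allow_commercial=True, require_attribution=True):
    """Render a parameterized license as a Turtle document (illustrative only)."""
    permits = ["cc:Reproduction", "cc:Distribution"]
    if allow_derivatives:
        permits.append("cc:DerivativeWorks")

    requires = ["cc:Notice"]
    if require_attribution:
        requires.append("cc:Attribution")

    prohibits = [] if allow_commercial else ["cc:CommercialUse"]

    lines = [f"@prefix cc: <{CC_NS}> .", "", f"<{license_uri}> a cc:License"]
    for predicate, terms in (("cc:permits", permits),
                             ("cc:requires", requires),
                             ("cc:prohibits", prohibits)):
        if terms:
            # One predicate per line, objects separated by commas.
            lines.append(f"    ; {predicate} {', '.join(terms)}")
    lines.append("    .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical license URI, used only for this example.
    print(license_to_turtle("http://example.org/licenses/materials-nc",
                            allow_commercial=False))
```

A web tool of the kind listed above would amount to a form in front of a function like this one: the user picks options, and the tool emits both human-readable terms and a machine-readable description that collecting sites can index.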
![Page 49: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/49.jpg)
BioBike, a platform for open science
• Conceived of as a vehicle for getting biologists to do their own knowledge-based biocomputing.
• Lisp + Frame system + Bioinformatics Tools
– Through-the-web programmability
– Community sharing of code and data
– Visual Programming Language
• Open Source
Jeff Elhai, Arnaud Taton, J. P. Massar, John K. Myers, Michael Travers, Johnny Casey, Mark Slupesky, Jeff Shrager. BioBIKE: A Web-based, programmable, integrated biological knowledge base. Nucleic Acids Research, 2009
![Page 50: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/50.jpg)
![Page 51: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/51.jpg)
![Page 52: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/52.jpg)
BioBike and Open Science
• BioBike wasn’t for Open Science per se
• But it did explore some ideas in web-based biocomputation
• The next-generation BioBike platform:
– Data: Big data, Open data, semantic web integrated
– Programming: Able to deal with large scale and distributed workflows with human elements
– Collaboration: Integrating different communities in a “trading zone”
KnowOS: The (Re)Birth of the Knowledge Operating System. Mike Travers, JP Massar, and Jeff Shrager, International Lisp Conference 2005
![Page 53: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/53.jpg)
What is a platform?
• The economic meaning of “platform” is interesting
• Something that:
– Supports two-sided network effects
– Stands in the middle and extracts a toll
• Examples:
– Credit cards (merchants ↔ consumers)
– Operating systems (application developers ↔ users)
• Science has more complicated networks and relations
– Data providers
– Data consumers
– Service providers
– Analysts (statisticians, eg)
– Patients
• A science platform is not going to make anyone rich like Facebook, but it would be nice to have a powerful and standard way for all these groups to collaborate.
![Page 54: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/54.jpg)
Open Data is outstripping analysis capacity
• Or in other words:
– data is cheap,
– attention, knowledge, & expertise are expensive
• A platform for collaborative computational interpretation of biological data
• To better leverage the expensive resources
![Page 55: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/55.jpg)
identifies advancing new computational infrastructure as a priority for driving innovation in science and engineering.
Scientific discovery and innovation are advancing along fundamentally new pathways opened by the development of increasingly sophisticated software.
the overarching goal of transforming innovations in research and education into sustained software resources that are an integral part of the cyberinfrastructure
![Page 56: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/56.jpg)
Anti-open arguments
• Peer-review is an essential filter; without it too much nonsense gets out
• Electronic availability of articles actually leads to narrowing of science (Evans, 2008)
• Privacy, HIPAA, etc.
• Need to retain IP for economic motivation
• The problem isn’t availability of data; it’s making sense of what we do have
• See PRISM for more
![Page 57: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/57.jpg)
Opener Science
• Science is already pretty open!
• Institutions of openness played a role in the foundation of science, including the first scientific journals
![Page 58: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/58.jpg)
Historical Origins of Open Science
• Before the invention of science, knowledge of the natural world was closely guarded, passed down from master to apprentice.
• The development of institutions of openness was a key factor in the scientific revolution (Paul David, Stanford economist)
• …and the printing press was a key factor in that.
![Page 59: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/59.jpg)
So…
• The printing press is almost 600 years old
• The scientific journal is almost 350 years old
• There’s been some advancement in communication technology since then…
• Science will eventually change:
– Either a modest acceleration of the scientific process,
– Or as significant and discontinuous as the first scientific revolution
• Which one? An open question.
![Page 60: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/60.jpg)
Further Reading
![Page 61: Directions in Open Science](https://reader035.vdocument.in/reader035/viewer/2022062302/56816717550346895ddb8731/html5/thumbnails/61.jpg)
End