
  • Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research Dickson Poon China Centre, St. Hugh’s College, University of Oxford Wednesday 6 and Thursday 7 April 2016


    Alan Turing Institute Symposium on Reproducibility for Data-Intensive Research

    Wednesday 6 April 2016

    Time | Programme | Location

    09:00 – 09:30 | Registration and coffee | Outside of the Lecture Theatre

    09:30 – 09:40 | Welcome and Introduction; outcomes of the symposium (Lucie Burgess, University of Oxford) | Lecture Theatre

    09:40 – 10:10 | What is reproducibility in the setting of computational data analytics? (Prof Carole Goble, Manchester University) | Lecture Theatre

    10:10 – 10:30 | Overview of the ATI (Prof Jared Tanner, University of Oxford) | Lecture Theatre

    10:30 – 11:00 | Coffee break | Wordsworth Tea Room

    11:00 – 12:45 | Session 1 – Data provenance to support reproducibility (Lead: Dr Paolo Missier, University of Newcastle, and Prof Tom Nichols, University of Warwick). Speakers: Prof Luc Moreau, University of Southampton; Prof Dorothy Bishop, University of Oxford; Dr Paolo Missier, University of Newcastle | Lecture Theatre; break-out rooms – see Session format for more details

    12:45 – 13:45 | Lunch | Wordsworth Tea Room

    13:45 – 15:30 | Session 2 – Computational models and simulations (Lead: Prof Jeremy Gibbons, University of Oxford). Speakers: Dr Nicola Botta, Potsdam Institute for Climate Impact Research; Prof Patrik Jansson, Chalmers University of Technology; Dr Camil Demetrescu, Sapienza University of Rome | Lecture Theatre; break-out rooms – see Session format for more details

    15:30 – 16:00 | Coffee break | Wordsworth Tea Room

    16:00 – 17:15 | Lightning talks | Lecture Theatre

    17:15 – 17:45 | Day 1 reportage, plans for day 2 | Lecture Theatre

    19:00 | Pre-dinner drinks; welcome from Lucie Burgess | At the entrance to St Hugh’s College Hall

    19:30 | Conference dinner | St Hugh’s College Hall


    Thursday 7 April 2016

    Time | Programme | Location

    08:30 | Coffee available | Wordsworth Tea Room

    09:00 – 09:15 | Re-cap of day 1 and overview of day 2 | Lecture Theatre

    09:15 – 11:00 | Session 3 – Reproducibility for real-time big data (Lead: Prof David de Roure, University of Oxford). Speakers: Dr Suzy Moat, University of Warwick; Prof David de Roure, University of Oxford | Lecture Theatre; break-out rooms – see Session format for more details

    11:00 – 11:15 | Coffee break | Wordsworth Tea Room

    11:15 – 13:00 | Session 4 – Publication of Data-Intensive Research (Leads: Prof Carole Goble, Manchester University; David Crotty, Richard O’Beirne, Oxford University Press). Speakers: Dr Laurie Goodman, GigaScience; Mr Neil Chue Hong, Software Sustainability Institute | Lecture Theatre; break-out rooms – see Session format for more details

    13:00 – 14:00 | Lunch | Wordsworth Tea Room

    14:00 – 15:45 | Session 5 – Novel architectures and infrastructure to support reproducibility (Leads: Dr Richard Mortier, University of Cambridge, and Dr Adam Farquhar, The British Library). Speakers: Dr Kenji Takeda, Microsoft Research Limited; Dr Richard Mortier, University of Cambridge | Lecture Theatre; break-out rooms – see Session format for more details

    15:45 – 16:30 | Day 2 reportage | Lecture Theatre

    16:30 | End of symposium

    BREAK-OUT GROUPS SESSION FORMAT

    There will be 3 break-out groups per session, each with a different topic; delegates will choose on the day which group to join.

    Questions for small group discussion could include:

    • Current landscape, scientific challenges - What are the latest advances in research relating to the theme? What is the state of the art? What are the key research questions and scientific challenges? Where do the greatest gaps lie?

    • Disciplinary and inter-disciplinary challenges - Which data-intensive scientific disciplines should be brought together, and where do they inter-relate to support research in this area?

    • Foundational and applied research challenges - What are the applied stakeholder challenges, and how should these drive fundamental research (e.g. in health, finance, utilities, engineering)?

    • Benefits and impact - What are the downstream impacts, e.g. less costly computations, more efficient and widespread data re-use, greater transparency and public trust in science?

    • What could an emerging ATI programme look like in this area? What ambitious targets could we achieve within 1, 3 and 5 years? What is the added value of working within the ATI framework? What partners should the ATI work with in this area?


    Session abstracts

    Session location, logistics, reportage

    Keynotes and session talks take place in the LECTURE THEATRE. Following each session there will be three breakout groups, in the Lecture Theatre, the Louey Seminar Room on the 2nd floor, and the Ho Tim Seminar Room on the 1st floor. Each breakout group will have a different topic. Topics and breakout rooms are scheduled below.

    For each breakout group, we invite one delegate to act as scribe and note the key points of the discussion in a Google document, for which links are provided below. Other delegates are welcome to add to the Google notes during each session. We will photograph post-it notes and flip-chart outputs for a visual record. The session chairs will feed back the key points in the reportage sessions. The outcome of the symposium will be a white paper, which we will publish online.

    If you would like to do some pre-reading on the topics covered in the symposium, a list of references is available in BibTeX, RefWorks and EndNote formats in a shared Google folder here: https://drive.google.com/open?id=0B1EyUglIzGARZjFVa3dLUWExbzg

    WEDNESDAY 6 APRIL 2016

    Opening keynote - What is Reproducibility? - Professor Carole Goble, University of Manchester

    ‘When I use a word’, Humpty Dumpty said in rather a scornful tone, ‘it means just what I choose it to mean - neither more nor less.’ [1] It is the same with Reproducible. Reusable. Repeatable. Recomputable. Replicable. Rerunnable. Regeneratable. Reviewable. It is R* mayhem. Or pride [2]. Does it matter? At least it does for computational science. Different shades of ‘reproducible’ matter in the publishing workflow, depending on whether you are testing for robustness (rerun), defence (repeat), certification (replicate), comparison (reproduce) or transferring between researchers (reuse). Different forms of ‘R’ make different demands on the completeness, depth and portability of research [3]. If we view computational tools (software, scripts) as instruments – ‘data scopes’ rather than ‘telescopes’ or ‘microscopes’ – then we need to be clear when we talk about reproducible computational experiments about whether we are rerunning with the same set up on the same (preserved) instrument (say a virtual machine), or reproducing the instrument to replicate the experiment (say a description of an algorithm recoded) or repairing the instrument so we can reuse it for some other experiment (say replacing a defunct web service or a deprecated library). In this talk Professor Carole Goble will discuss the R* brouhaha and its practical consequences for computational data driven science.

    [1] Lewis Carroll, Through the Looking-Glass (1872)

    [2] David De Roure, More Rs than Pirates, http://www.scilogs.com/eresearch/more-rs-than-pirates/

    [3] Juliana Freire, Philippe Bonnet, Dennis Shasha, Computational reproducibility: state-of-the-art, challenges, and database research opportunities, SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data: 593-596, ACM New York, NY, USA, doi: 10.1145/2213836.2213908

    Overview of the Alan Turing Institute – Professor Jared Tanner, University of Oxford

    Professor Jared Tanner will give an overview of the mission and strategic objectives of the newly-established Alan Turing Institute, and will take questions from delegates. The Institute’s mission is to: undertake data science research at the intersection of computer science, mathematics, statistics and systems engineering; provide technically informed advice to policy makers on the wider implications of algorithms; enable researchers from industry and academia to work together to undertake research with practical applications; and act as a magnet for leaders in academia and industry from around the world to engage with the UK in data science and its applications.


    Session 1 – Data provenance to support reproducibility (Wed 6 April, 11:00-12:45)

    Link to Google doc for notes: Group A – https://docs.google.com/document/d/1Ac2J7WVzXfEHeWne
