towards lensfield

Post on 12-May-2015

DESCRIPTION

Lensfield is a desktop and filesystem-based tool designed as a “personal data management assistant” for the scientist. It combines distributed version control (DVCS), software transactional memory (STM) and linked open data (LOD) publishing to create a novel data management, processing and publication tool. The application “just looks after” these technologies for the scientist, providing simple interfaces for typical uses. It is built with Clojure and includes macros that define steps in a common workflow. Functions and Java libraries provide facilities for automatic processing of data, which is ultimately published as RDF in a web application. The progress of data processing is tracked by a fine-grained data structure that can be serialized to disk, so that largely automated processes can include manual steps and programmatic interrupts and then resume seamlessly. Flexibility in operation and minimal barriers to adoption are major design goals.

TRANSCRIPT

Towards Lensfield: data management, processing and semantic publication for vernacular e-science

Nick Day, Jim Downing, Lezan Hawizy, Nico Adams and Peter Murray-Rust

Unilever Centre for Molecular Science Informatics, University of Cambridge

This presentation: CC-By-SA Jim Downing

Linked Data

CC-By-SA-NC jmelchio

CC Images from Flickr

Selling Linked Data

Make it transparent

Make it easy

CC-By mrslogic

Selling Linked Data

Citations

Selling Linked Data

• Visualizations

• Data management

• Automation

Demo

http://code.google.com/p/lensfield/

Lensfield Principles

• Make it easier to do the right thing

• Vernacular

• KISS and Embrace constraints

Constraints

• Work on the desktop without infrastructure installation

• Processing tasks could be anything and aren’t predictable

Re-use

Jumbo-Converters

• Library of chemistry file format converters, semantifiers and enhancers

• Part of the CML Java libraries

• http://sourceforge.net/projects/cml/

Version Control

• Mercurial

• Excellent support for experimentation

• Backup to remote machine

• P2P sharing

• Track script changes with data

• Automatically ignore deterministic intermediates
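The last point can be illustrated with a minimal sketch. It assumes that the tool knows which files are derived outputs and writes them into Mercurial's `.hgignore` file using the `syntax: glob` header. The helper `hgignore_lines` and the file names are hypothetical, the sketch is in Python rather than Lensfield's Clojure, and it does not escape glob metacharacters in paths.

```python
from pathlib import PurePosixPath


def hgignore_lines(derived_paths):
    """Return .hgignore text that excludes the given derived files.

    Hypothetical helper: assumes plain relative paths with no glob
    metacharacters that would need escaping.
    """
    lines = ["syntax: glob"]
    # De-duplicate and sort so the generated file is stable between runs.
    for path in sorted(set(derived_paths)):
        lines.append(PurePosixPath(path).as_posix())
    return "\n".join(lines) + "\n"


# Illustrative derived outputs; the names are made up for this example.
derived = ["out/mol1.cml", "out/mol1.rdf", "out/mol1.cml"]
ignore_text = hgignore_lines(derived)
```

Because the intermediates can be regenerated deterministically from tracked inputs and scripts, leaving them out of the repository loses no information.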

Build metaphor

• Describing state transitions rather than process is better for provenance tracking

• Alternative to graphical programming languages / workflow packages

• The hard problems are re-use and comprehension
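One way to picture the build metaphor is a set of steps that each declare inputs and an output, where a step reruns only when its output is missing or its inputs have changed. The sketch below is a Python illustration under those assumptions, not Lensfield's implementation. The step dictionaries, the in-memory `files` map and the `recorded` digest table are all hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content hash used to detect changed inputs."""
    return hashlib.sha256(data).hexdigest()


def input_digests(step, files):
    return {name: digest(files[name]) for name in step["inputs"]}


def is_stale(step, files, recorded):
    """Stale if the output is missing or any input differs from the last build."""
    if step["output"] not in files:
        return True
    return recorded.get(step["output"]) != input_digests(step, files)


def build(steps, files, recorded):
    """Run only the stale steps, in order; return the outputs that were rebuilt."""
    rebuilt = []
    for step in steps:
        if is_stale(step, files, recorded):
            inputs = [files[name] for name in step["inputs"]]
            files[step["output"]] = step["fn"](*inputs)
            # The recorded digests double as a simple provenance record.
            recorded[step["output"]] = input_digests(step, files)
            rebuilt.append(step["output"])
    return rebuilt


steps = [
    {"inputs": ["raw.txt"], "output": "upper.txt", "fn": lambda b: b.upper()},
    {"inputs": ["upper.txt"], "output": "count.txt",
     "fn": lambda b: str(len(b)).encode()},
]
files = {"raw.txt": b"benzene"}
recorded = {}

first = build(steps, files, recorded)    # everything is built
second = build(steps, files, recorded)   # nothing has changed
files["raw.txt"] = b"toluene"
third = build(steps, files, recorded)    # the change propagates downstream
```

The description says what each output depends on rather than the order in which to run things, so the record of input digests answers the provenance question "what was this file made from?" directly.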

Clojure

• Strong on concurrency

• Functional

• Software Transactional Memory

• Lisp

• Snapshots, pause and resume, continuations
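The pause-and-resume idea can be sketched as a progress record that is serialized between runs. This is a Python stand-in, not the Clojure data structure Lensfield uses. The `run` function, the `stop_after` interrupt parameter and the step names are hypothetical, and JSON is assumed as the on-disk format purely for illustration.

```python
import json


def run(steps, state, stop_after=None):
    """Run the steps not yet marked done; optionally stop after a named step."""
    for name, fn in steps:
        if name in state["done"]:
            continue  # already completed in an earlier session
        state["value"] = fn(state["value"])
        state["done"].append(name)
        if name == stop_after:
            break  # simulated manual step or programmatic interrupt
    return state


steps = [
    ("parse", lambda text: text.split()),
    ("count", len),
    ("double", lambda n: n * 2),
]

state = {"value": "C6H6 C7H8 CH4", "done": []}
state = run(steps, state, stop_after="parse")

snapshot = json.dumps(state)            # this string could be written to disk
resumed = run(steps, json.loads(snapshot))
```

Here the second call picks up after `parse` and completes the remaining steps, leaving `resumed["value"]` at 6.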

Future Development

• Templated Parameter Sweeps & sensitivity analysis

• Design of Experiments

• Multicore performance testing

• Grid processing

http://fascinator.usq.edu.au/

Users

• CLARION project

• Embargo management and publication of Electronic Lab Notebook data.

• OREChem

• Distributed chemistry eScience using Linked Data.

• Computational Chemical engineering

Users

You?

... to use Lensfield!

http://code.google.com/p/lensfield/

CC-By-NC ilonameagher

Thanks

Colleagues

Funds

Collaboration and Inspiration

Nick Day, John Aspden, Lezan Hawizy, Peter Murray-Rust

Nico Adams (Dept of Genetics, Cambridge), Jerry Winter (Unilever), Noel Ruddock (Unilever), Markus Kraft, Weerapong Phadungsukanan (Chemical Engineering, Cambridge)
