Post on 22-Dec-2015

Scipion: Toward software integration, reproducibility and validation in EM image processing

Biocomputing Unit, Instruct Image Processing Center, CNB-CSIC

J.M. de la Rosa Trevín

There are different EM modalities

• Single Particles

• Helical

• 2D Crystallography

• Tomography

There is a long path to produce a 3D model

Sample Preparation

Image Acquisition

2D Analysis

3D reconstruction

3D model

Image acquisition and preprocessing

Data collection

Movie alignment (with DDD)

Micrograph CTF estimation

2D image processing for Single Particles

Particle Picking

Screening - Preprocessing

Alignment and Classification

3D reconstruction

Initial model

3D classification

3D refinement

VALIDATION

The EM field needs software integration

Using different EM software packages is now like the tower of Babel

Appion is certainly a pioneering work in software integration (and our main inspiration)

The number of external tools added to current EM packages (such as Eman2, Xmipp3 or Relion) is increasing

But it is still complicated to use tools from different packages in one project…

Scipion goals

1. Integrate EM software packages to be used in the same project.

2. Full project traceability, improving reproducibility.

3. Execute complete workflows in an automated manner.

4. Easy to install and use.

5. Easy to extend with new protocols.

Goal 1: Integrate EM software packages to be used in the same project.

All vs All is hard to maintain and extend

All conversions: N*N

New package: 2*N

It is better to have a common format

All conversions: N+N

New package: 2
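The counts on these two slides can be checked with a few lines of arithmetic. This is only a sketch: it assumes one directed converter per ordered pair of packages in the all-vs-all case (so N*(N-1), which is on the order of N*N) and one importer plus one exporter per package in the common-format case. The function names are made up for the illustration.

```python
from typing import Tuple


def all_vs_all_converters(n: int) -> int:
    """Directed converters needed when every package talks to every other one."""
    return n * (n - 1)  # on the order of N*N


def common_format_converters(n: int) -> int:
    """One importer and one exporter per package around a shared format."""
    return 2 * n  # N + N


def cost_of_adding_package(n: int) -> Tuple[int, int]:
    """Extra converters needed to add one package to n existing ones.

    Returns (all-vs-all cost, common-format cost).
    """
    all_vs_all = all_vs_all_converters(n + 1) - all_vs_all_converters(n)  # 2*N
    common = common_format_converters(n + 1) - common_format_converters(n)  # 2
    return all_vs_all, common
```

With five packages already supported, adding a sixth costs ten new converters in the all-vs-all design and only two with a common format.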

We bridge across package differences by modeling our domain

3D Reconstruction

Set of Images

Initial Model

3D Volume

Data

Protocols

We need conversion functions for each package
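A minimal sketch of this idea, assuming a small common data model and one conversion function per package. The class names, the package names ("package_a", "package_b") and both text layouts are hypothetical; they are not Scipion's actual classes or any real package's file format.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Image:
    filename: str
    sampling_rate: float  # Angstrom per pixel


@dataclass
class SetOfImages:
    """Common (package-independent) description of a set of images."""
    images: List[Image] = field(default_factory=list)


def to_package_a(image_set: SetOfImages) -> str:
    """Hypothetical converter: space-separated rows under a small header."""
    lines = ["# filename sampling_rate"]
    lines += [f"{img.filename} {img.sampling_rate}" for img in image_set.images]
    return "\n".join(lines)


def to_package_b(image_set: SetOfImages) -> str:
    """Hypothetical converter: comma-separated rows."""
    lines = ["filename,sampling_rate"]
    lines += [f"{img.filename},{img.sampling_rate}" for img in image_set.images]
    return "\n".join(lines)


# One conversion function per package; supporting a new package adds one entry.
WRITERS: Dict[str, Callable[[SetOfImages], str]] = {
    "package_a": to_package_a,
    "package_b": to_package_b,
}


def export(image_set: SetOfImages, package: str) -> str:
    """Convert the common model into the text a given package expects."""
    return WRITERS[package](image_set)
```

The protocols only ever see `SetOfImages`; the package-specific details stay inside the conversion functions.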

Goal 2: Full project traceability, improving reproducibility.

Results should be reproducible: no more “black boxes”

We implemented a simple storage mechanism

Mapper Layer

Data Objects

Protocol Objects
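One way to picture a mapper layer is a small class that writes data and protocol objects to a database and reads them back. The slides do not say which backend is used, so this sketch assumes SQLite through Python's standard `sqlite3` module; `SqliteMapper`, `Volume` and `RefineProtocol` are hypothetical names for illustration only.

```python
import json
import sqlite3


class SqliteMapper:
    """Toy mapper: stores plain objects as (class name, JSON attributes) rows."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            "id INTEGER PRIMARY KEY, classname TEXT, attributes TEXT)"
        )

    def store(self, obj):
        """Save the object's attributes and return its row id."""
        cur = self.db.execute(
            "INSERT INTO objects (classname, attributes) VALUES (?, ?)",
            (type(obj).__name__, json.dumps(vars(obj))),
        )
        self.db.commit()
        return cur.lastrowid

    def load(self, obj_id, classes):
        """Rebuild an object from its row; `classes` maps class names to classes."""
        classname, attributes = self.db.execute(
            "SELECT classname, attributes FROM objects WHERE id = ?", (obj_id,)
        ).fetchone()
        obj = classes[classname].__new__(classes[classname])
        obj.__dict__.update(json.loads(attributes))
        return obj


class Volume:  # hypothetical data object
    def __init__(self, filename, resolution):
        self.filename = filename
        self.resolution = resolution


class RefineProtocol:  # hypothetical protocol object
    def __init__(self, input_volume_id, iterations):
        self.input_volume_id = input_volume_id  # link back to the stored input
        self.iterations = iterations
```

Because each protocol row records its parameters and the ids of its inputs, a stored project can be inspected later to see exactly how a result was produced.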

Goal 3: Execute complete workflows in an automated manner.

Scipion client

Worker Host 1 Worker Host 2

Scipion Server Bookkeeping

Designed to perform distributed execution

Distributed data storage

Big data transfers
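Automated execution of a workflow can be sketched as running steps in dependency order. The example below runs everything sequentially in one process and is not Scipion's scheduler; in a distributed setting, each step whose inputs are ready could be handed to a worker host instead. `run_workflow` and the step names are assumptions made for the illustration, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies):
    """Run every step after the steps it depends on.

    steps: name -> callable taking the dict of results so far.
    dependencies: name -> set of step names that must run first.
    """
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        results[name] = steps[name](results)
    return results


# Hypothetical three-step workflow with placeholder computations.
STEPS = {
    "import_micrographs": lambda r: ["mic1", "mic2"],
    "ctf_estimation": lambda r: {m: "ctf" for m in r["import_micrographs"]},
    "particle_picking": lambda r: 100 * len(r["ctf_estimation"]),
}

DEPENDENCIES = {
    "ctf_estimation": {"import_micrographs"},
    "particle_picking": {"ctf_estimation"},
}
```

Calling `run_workflow(STEPS, DEPENDENCIES)` executes the import first, then CTF estimation, then picking, without any user interaction between steps.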

Goal 4: Easy to install and use.

Goal 5: Easy to extend with new protocols.

(Let's see Scipion in action)

Example 1: Integration of Spider-MDA (in collaboration with Tanvir Shaikh)

Example 2: Integration of Normal Modes analysis and flexible fitting (in collaboration with Slavica Jonic)

Example 3: Integration of ResMap (in collaboration with Alp Kucukelbir and Hemant Tagare)

List of currently integrated protocols

Software package and tools | Protocols integrated into Scipion

Xmipp 3.1 | All

Niko Grigorieff | ctffind3.5/4, frealign9.07 refinement and classification

Eman 2.1 | Initial model, particle picking, 3D refinement

Spider | Filters, align APSR, CAPCA, classify Ward, 3D refinement

Relion | Most programs

Bsoft | Particle picking

ResMap | Local resolution estimation

Dosefgpu | DD movie averaging

Roadmap 2015

May: Alpha release. Pilot installations outside Madrid (a few have already been made).

June: Beta release announcement on the 3DEM list

End of summer: Scipion 1.0 release

There is a team behind it

We need to do it all together! All are welcome.

www.structuralbiology.eu

Follow us on Twitter @instructhub