Scipion: Toward software integration, reproducibility and validation in EM image processing Biocomputing Unit, Instruct Image Processing Center, CNB-CSIC J.M. de la Rosa Trevín


Post on 22-Dec-2015


Page 1

Scipion: Toward software integration, reproducibility and validation in EM image processing

Biocomputing Unit, Instruct Image Processing Center, CNB-CSIC

J.M. de la Rosa Trevín

Page 2

There are different EM modalities

• Single Particles

• Helical

• 2D Crystallography

• Tomography

Page 3

There is a long path to produce a 3D model

Sample Preparation

Image Acquisition

2D Analysis

3D reconstruction

3D model

Page 4

Image acquisition and preprocessing

Data collection

Movie alignment (with DDD)

Micrograph CTF estimation

Page 5

2D image processing for Single Particles

Particle Picking

Screening - Preprocessing

Alignment and Classification

Page 6

3D reconstruction

Initial model

3D classification

3D refinement

VALIDATION

Page 7

The EM field needs software integration

Using different EM software packages is now like the Tower of Babel

Page 8

Appion is certainly a pioneering work in software integration (and our main inspiration)

The number of external tools added to current EM packages (such as Eman2, Xmipp3 or Relion) is increasing

But it is still complicated to use tools from different packages in one project…

Page 9

Scipion goals

1. Integrate EM software packages to be used in the same project.

2. Full project traceability, improving reproducibility.

3. Execute complete workflows in an automated manner.

4. Easy to install and use.

5. Easy to extend with new protocols.

Page 10

Goal 1: Integrate EM software packages to be used in the same project.

Page 11

All vs All is hard to maintain and extend

All conversions: N*N

New package: 2*N

Page 12

It is better to have a common format

All conversions: N+N

New package: 2
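
The counts on these two slides can be checked with a few lines of arithmetic. The sketch below assumes one converter per ordered pair of distinct packages, which gives N*(N-1) for the all-vs-all case (the slide's N*N is the same order of magnitude), and one importer plus one exporter per package for the common format. The function names are illustrative only.

```python
from typing import Tuple


def all_vs_all_converters(n: int) -> int:
    """Converters needed when every package converts directly to every other one."""
    return n * (n - 1)


def common_format_converters(n: int) -> int:
    """One importer plus one exporter per package around a shared format (N + N)."""
    return 2 * n


def cost_of_adding_package(n_existing: int) -> Tuple[int, int]:
    """Converters to write for one new package: (all-vs-all, common format)."""
    return 2 * n_existing, 2


for n in (3, 5, 8):
    print(n, all_vs_all_converters(n), common_format_converters(n),
          cost_of_adding_package(n))
```

With 8 packages, direct conversion needs 56 converters against 16 for the common format, and a ninth package costs 16 new converters against 2.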

Page 13

We bridge across package differences by modeling our domain

3D Reconstruction

Set of Images

Initial Model

3D Volume

Data

Protocols

Page 14

We need conversion functions for each package
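
A minimal sketch of what such conversion functions could look like, assuming a shared `Particle` object in the middle. The two "packages", their field names and their units are hypothetical and do not reflect any real file format or Scipion's actual classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Particle:
    """Hypothetical common representation of one particle image."""
    filename: str
    defocus_angstrom: float


# Hypothetical package A: defocus stored in Angstrom under "defocusU".
def from_package_a(row: Dict[str, str]) -> Particle:
    return Particle(row["image"], float(row["defocusU"]))


def to_package_a(p: Particle) -> Dict[str, str]:
    return {"image": p.filename, "defocusU": f"{p.defocus_angstrom:.1f}"}


# Hypothetical package B: defocus stored in micrometres (1 um = 1e4 Angstrom).
def from_package_b(row: Dict[str, str]) -> Particle:
    return Particle(row["file"], float(row["defocus_um"]) * 1e4)


def to_package_b(p: Particle) -> Dict[str, str]:
    return {"file": p.filename, "defocus_um": f"{p.defocus_angstrom / 1e4:.4f}"}


READERS = {"a": from_package_a, "b": from_package_b}
WRITERS = {"a": to_package_a, "b": to_package_b}


def convert(rows: List[Dict[str, str]], source: str, target: str) -> List[Dict[str, str]]:
    """Always go through the common model: source reader, then target writer."""
    return [WRITERS[target](READERS[source](row)) for row in rows]


rows_a = [{"image": "part001.mrc", "defocusU": "21500.0"}]
print(convert(rows_a, "a", "b"))
```

Supporting a third package in this scheme means adding one reader and one writer to the two dictionaries; no existing function changes.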

Page 15

Goal 2: Full project traceability, improving reproducibility.

Page 16

Results should be reproducible: no more “black boxes”

Page 17

We implemented a simple storage mechanism

Mapper Layer

Data Objects

Protocol Objects
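
One way to picture a mapper layer sitting between objects and storage is sketched below with Python's standard `sqlite3` and `json` modules. The `SqliteMapper` class, the table layout and the example classes are assumptions made for illustration, not Scipion's actual implementation.

```python
import json
import sqlite3


class SqliteMapper:
    """Hypothetical mapper: one row per object (class name + JSON attributes)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS objects ("
            "id INTEGER PRIMARY KEY, classname TEXT, attributes TEXT)"
        )

    def insert(self, obj):
        """Store the object's attributes and return its row id."""
        cur = self.conn.execute(
            "INSERT INTO objects (classname, attributes) VALUES (?, ?)",
            (type(obj).__name__, json.dumps(vars(obj))),
        )
        self.conn.commit()
        return cur.lastrowid

    def select_by_id(self, obj_id, registry):
        """Rebuild an object from its row, or return None if the id is unknown."""
        row = self.conn.execute(
            "SELECT classname, attributes FROM objects WHERE id = ?", (obj_id,)
        ).fetchone()
        if row is None:
            return None
        cls = registry[row[0]]
        obj = cls.__new__(cls)  # skip __init__, restore attributes directly
        obj.__dict__.update(json.loads(row[1]))
        return obj


class Volume:  # example data object
    def __init__(self, filename, sampling_rate):
        self.filename = filename
        self.sampling_rate = sampling_rate


class ReconstructionProtocol:  # example protocol object
    def __init__(self, program, parameters):
        self.program = program
        self.parameters = parameters


registry = {"Volume": Volume, "ReconstructionProtocol": ReconstructionProtocol}
mapper = SqliteMapper()
prot_id = mapper.insert(ReconstructionProtocol("hypothetical_refine", {"iterations": 10}))
vol_id = mapper.insert(Volume("volume.mrc", 1.5))
restored = mapper.select_by_id(prot_id, registry)
print(restored.program, restored.parameters)
```

Because protocol parameters are stored next to the data they produced, a stored run can be inspected or repeated later, which is the traceability that Goal 2 asks for.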

Page 18

Goal 3: Execute complete workflows in an automated manner.
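
Automated execution of a workflow can be thought of as running each step once all the steps it depends on have finished. The sketch below uses the standard-library `graphlib` module (Python 3.9+); the step names follow the earlier slides, but the dependency graph and the `run_workflow` helper are assumptions for illustration, not Scipion's scheduler.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each step maps to the set of steps it depends on (assumed dependencies).
workflow = {
    "movie alignment": {"data collection"},
    "CTF estimation": {"movie alignment"},
    "particle picking": {"movie alignment"},
    "2D classification": {"particle picking", "CTF estimation"},
    "initial model": {"2D classification"},
    "3D refinement": {"initial model", "2D classification"},
    "validation": {"3D refinement"},
}


def run_workflow(graph, run_step):
    """Run every step exactly once, after all of its dependencies."""
    executed = []
    for step in TopologicalSorter(graph).static_order():
        run_step(step)
        executed.append(step)
    return executed


order = run_workflow(workflow, lambda step: print("running", step))
```

Steps with no mutual dependency, such as CTF estimation and particle picking here, could in principle be sent to different worker hosts.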

Page 19

Designed to perform distributed execution

Scipion client

Scipion Server

Bookkeeping

Worker Host 1

Worker Host 2

Distributed data storage

Big data transfers

Page 20

Goal 4: Easy to install and use.

Goal 5: Easy to extend with new protocols.

(Let's see Scipion in action)

Page 21

Example 1: Integration of Spider-MDA (in collaboration with Tanvir Shaikh)

Page 22

Example 2: Integration of Normal Modes analysis and flexible fitting

(in collaboration with Slavica Jonic)

Page 23

Example 3: Integration of ResMap

(in collaboration with Alp Kucukelbir and Hemant Tagare)

Page 24

List of currently integrated protocols

Software package and tools | Protocols integrated into Scipion
Xmipp 3.1 | All
Niko Grigorieff | ctffind3.5/4, frealign9.07 refinement and classification
Eman 2.1 | Initial model, particle picking, 3D refinement
Spider | Filters, align APSR, CAPCA, classify Ward, 3D refinement
Relion | Most programs
Bsoft | Particle picking
ResMap | Local resolution estimation
Dosefgpu | DD movie averaging

Page 25

Roadmap 2015

May: Alpha release. Pilot installations outside Madrid (a few have been made already).

June: Beta release announcement on the 3DEM list

End of summer: Scipion 1.0 release

Page 26

There is a team behind Scipion

Page 27

We need to do it all together! All are welcome.

Page 28

www.structuralbiology.eu

Follow us on Twitter @instructhub