Integration of heterogeneous data sources for high content screening data exploitation and exchange

Institute Curie, 6th January 2015 Elton Rexhepaj, MSc, PhD

Presentation layout

• High content screening and the visualisation functionalities needed

• Curie Institute framework for microscopy imaging content management

• Tools developed to integrate heterogeneous data

• Conclusion and perspectives

Who are we?

BioPhenics is a technological HCS platform that supports research teams in their high-content screening needs.

Translational research projects involve clinicians working on cancer-related cell models for target validation and/or drug repositioning.

Academic research projects involve research groups working on cancer-related.

We support researchers with (1) assay development, (2) image data acquisition, (3) data analysis.

To deal with the complexity of the analysis and of the underlying biological question, software tools are needed for data sharing and communication.

High content screening workflow

• (1) Assay development. Which antibodies (Abs) to use? Incubation time?

• (2) Robotics for scaling up the screening process

• (3) Image acquisition

High throughput removes subjectivity from the data but is still prone to technical biases that need to be checked.

High-content screening (HCS), also known as high-content analysis (HCA) or cellomics, is a method that is used in biological research and drug discovery to identify substances such as small molecules, peptides, or RNAi that alter the phenotype of a cell in a desired manner.

Typical screen: 1000 molecules, 2 replicates = 1TB of data, ~40000 images
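
The figures above imply a rough per-well and per-image budget. The short calculation below is only a back-of-the-envelope sketch: it assumes 1 TB means 10^12 bytes, treats "~40000" as exactly 40000, and assumes one well per molecule and replicate (control wells are ignored). None of these assumptions is stated on the slide.

```python
# Figures quoted on the slide.
MOLECULES = 1000
REPLICATES = 2
TOTAL_BYTES = 10**12      # assumption: 1 TB taken as 10^12 bytes
TOTAL_IMAGES = 40_000     # the slide gives "~40000"

# Assumption: one well per molecule and replicate, controls not counted.
wells = MOLECULES * REPLICATES
images_per_well = TOTAL_IMAGES / wells
bytes_per_image = TOTAL_BYTES / TOTAL_IMAGES

print(f"{images_per_well:.0f} images per well")
print(f"{bytes_per_image / 1e6:.0f} MB per image")
```

Under these assumptions a screen yields about 20 images per well and roughly 25 MB per image.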

High Content Screening Analysis pipeline

Sharing of imaging data can enrich phenotyping and allow better exploitation of the HCS data

CID iManage > Institut Curie

[Diagram: Avadis® iManage. Components: Images, Annotations, Apps, Results, Settings; Access UI, Acquisition UI; Access APIs SOAP/XML/RPC]

Prepare, Acquire, Analyze, Share, Disseminate, Visualize

• Images server + metadata + annotations (manual or analysis results) / attachments (publications, xls file…)

• Acquisition client, web client interface, web admin for project management

• Computing cluster, image storage

• Dynamic organisation, visual search or advanced search functionalities

• Metadata (pixel size, acquisition time,…), annotations

• Automatic analysis without full download

• Data fusion, advanced visualisation

BioImaging Cell and Tissue Core Facility

http://pict-ibisa.curie.fr/

Underlying hardware infrastructure

[Diagram: Avadis® iManage (Images, Annotations, Apps, Results, Settings; Access APIs SOAP/XML/RPC); Collaborator; User; ?]

Software tool needs to be addressed

Real-time Unified Bio-Imaging Exploitation System > RUBIES

As part of the France Bio-Imaging network, we developed a Real-time Unified Bio-Imaging Exploitation System (RUBIES) to answer data sharing needs.

Liferay technology was selected as an open source platform to develop the RUBIES portal.

RUBIES uses open source tools and widely known paradigms to create a collaborative platform to visualise and exchange HCS data.

Heterogeneous data from HCS are analysed in real time in order to enrich databases, which are exposed through a web service layer.

WebLab integration framework: a Service Oriented Architecture in which individual components are encapsulated as web services with common int/com protocols.

Heterogeneous components (algorithms, data sources, indexes, external tools):

• Internal components are separated from the communication layer.

• Web services use standardized WebLab interfaces and data exchange protocols.

• This ensures compatibility between old and new services.

Liferay technology choice for development

• Any kind of information may be created by a service without needing to change other services.

(subject, predicate, object) => (Project URI / hasId / “016”)

• Visualization is done through portal and portlet technology. Each portlet can call user-specific types of data or processing.

• Portlets are loosely coupled elements, which eases maintenance.

• Pages can easily be created by users and tailored to a specific task or work methodology.

Data exchange protocols use RDF/JSON for annotation storage and communication.
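
To make the triple notation concrete, here is a minimal sketch of how the (Project URI / hasId / “016”) statement could be grouped into a JSON document with the subject, then predicate, then list-of-objects nesting used by RDF/JSON. The two URIs and the function name `triples_to_rdf_json` are hypothetical and are not taken from RUBIES; all objects are treated as plain literals for simplicity.

```python
import json
from collections import defaultdict

# Hypothetical identifiers, used only for illustration.
PROJECT_URI = "http://example.org/project/016"
HAS_ID = "http://example.org/schema#hasId"


def triples_to_rdf_json(triples):
    """Group (subject, predicate, object) triples as subject -> predicate -> [objects]."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        # Simplification: every object is stored as a plain literal.
        graph[subject][predicate].append({"type": "literal", "value": obj})
    # Convert the nested defaultdicts to plain dicts before serialisation.
    return {s: dict(p) for s, p in graph.items()}


triples = [(PROJECT_URI, HAS_ID, "016")]
document = triples_to_rdf_json(triples)
print(json.dumps(document, indent=2))
```

Because each statement is a free-standing triple, a service can add a new predicate to the list without any change to the code or to the data produced by other services.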

[Figure panels: Cell count; Lysosome count; IF Granularity. Image formats: Raw tiff; JPEG 2000 (Q=80)]

IF average intensity per well for lysosome staining (endocytosis screening) / Raw data (384 well plates)

Plate robust z-score normalisation of average cell population IF intensity prior to visualisation
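
The slides do not give the formula used. A common definition of the robust z-score replaces the mean and standard deviation with the plate median and the median absolute deviation (MAD); the sketch below assumes that definition, with the usual 1.4826 scaling factor. The well values are made up for illustration.

```python
from statistics import median

# Scaling that makes the MAD comparable to a standard deviation for normal data.
MAD_SCALE = 1.4826


def robust_z_scores(values):
    """Return (x - median) / (1.4826 * MAD) for every well value of one plate."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined for this plate")
    return [(v - center) / (MAD_SCALE * mad) for v in values]


# Made-up well intensities for one plate; the last well is an outlier.
plate = [100, 102, 98, 101, 99, 250]
scores = robust_z_scores(plate)
print([round(s, 2) for s in scores])
```

The median and MAD are barely affected by a few strong hits, so active wells stand out instead of inflating the scale of the whole plate.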

Data pre-processing prior to visualisation

1) Image normalisation and contrast enhancement 2) JPEG compression

Compression is necessary to reduce the volume of client/server data transfer (10-fold decrease) and remains adequate for cell and organelle segmentation.
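
The normalisation method is not named on the slide. One simple possibility, shown here purely as an assumption, is a percentile-based contrast stretch that maps raw intensities to the 8-bit range expected by JPEG encoders. The function names and the 1%/99% cut-offs are illustrative, and the JPEG encoding itself (normally done with an imaging library) is not shown.

```python
def percentile(sorted_values, fraction):
    """Pick a percentile from an already sorted list by rounding the index."""
    index = round(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def stretch_to_8bit(pixels, low=0.01, high=0.99):
    """Linearly rescale intensities between two percentiles to 0..255, clipping outliers."""
    ordered = sorted(pixels)
    lo, hi = percentile(ordered, low), percentile(ordered, high)
    if hi == lo:
        # Flat image: nothing to stretch.
        return [0] * len(pixels)
    scale = 255 / (hi - lo)
    return [min(255, max(0, round((p - lo) * scale))) for p in pixels]


# Made-up 16-bit style intensities occupying a narrow range.
raw = list(range(1000, 1101))
stretched = stretch_to_8bit(raw)
print(stretched[0], stretched[-1])
```

Clipping at percentiles rather than at the minimum and maximum keeps a few saturated pixels from compressing the useful intensity range.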

[Figure: raw well intensity and normalized well intensity for experimental plates in acquisition order; channels: DAPI, Endosome, Lysosome]

Interface for dynamic search of content

Background services provide the capability to search for elements through plain text queries.
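
The slides do not describe how the search services are implemented. As a generic illustration of plain text search over annotations, the sketch below builds a small inverted index and returns the elements that contain every word of a query. The element identifiers and annotation texts are made up, and this is not the RUBIES implementation.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each lowercase token to the set of element ids whose text contains it."""
    index = defaultdict(set)
    for element_id, text in annotations.items():
        for token in text.lower().split():
            index[token].add(element_id)
    return index


def search(index, query):
    """Return the ids of elements that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Made-up annotations keyed by a hypothetical plate/well identifier.
annotations = {
    "plate01_A01": "lysosome staining DAPI control",
    "plate01_A02": "endosome staining DAPI",
}
index = build_index(annotations)
print(sorted(search(index, "dapi staining")))
print(sorted(search(index, "lysosome")))
```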

Selected fields (and the corresponding metadata, in the lower-right corner) can be exported through specialized portlets to external applications, in this case Avadis iManage (commercial).

Export of selected annotation/images

Export of selected annotation/images / multiple selection

Export of mixed automated and experimental annotations with images

Conclusions

RDF is a web standard suitable for representing any kind of information statement as collections of triples for HCS data (NoSQL approach).

Image preprocessing (normalisation/compression) is a critical step for HCS data visualisation (i.e. throughput) and visual assessment of results.

RUBIES can interoperate with other data management systems and hence integrate HCS image and annotation data with other experimental output.

Concurrent access to the portal and user management are supported, with job parallelisation in the computing cluster.

RUBIES is still a work in progress; however, the visualisation interfaces are fully functional and we are happy to share our work with the community.

Further work: user-added annotations and exploitation of images and metadata in order to create cell image dictionaries for pattern recognition.

Franck Perez

Philippe Benaroche

Jacque Chamonix

Elaine Del Nery

Aurianne Lescure

Sarah Tessier

Dmitry Vjostockolevic

Elodie Anthony

Acknowledgments

Curie Institute - BIOPHENICS

Curie Institute – UMR 144

Jean Salamero

Perrine Paul-Gilloteaux
