Post on 12-Jun-2020
Integration of heterogeneous
data sources for high content
screening data exploitation and
exchange
Institute Curie, 6th January 2015 Elton Rexhepaj, MSc, PhD
Presentation layout
• High content screening and the visualisation
functionalities needed
• Curie institute framework for microscopy
imaging content management
• Tools developed to integrate heterogeneous
data
• Conclusion and perspectives
BioPhenics is a technological HCS platform that supports
research teams with their high-content screening needs
Translational research projects involve clinicians working on
cancer-related cell models for target validation and/or drug
repositioning.
Academic research projects involve research groups
working on cancer-related.
Support researchers with (1) assay development, (2) image data
acquisition, (3) data analysis
Who are we ?
To deal with the complexity of the analysis and the underlying biological question,
software tools are needed for data sharing and communication
High content screening
workflow
• (1) Assay development. Which Abs
to use? Incubation time?
• (2) Robotics for scaling up the
screening process
• (3) Image acquisition
High throughput removes subjectivity from the data but is
still prone to technical biases that need to be checked.
High-content screening (HCS), also
known as high-content analysis (HCA)
or cellomics, is a method that is used in
biological research and drug discovery
to identify substances such as small
molecules, peptides, or RNAi that alter
the phenotype of a cell in a desired
manner.
Typical screen: 1000 molecules, 2 replicates = 1TB of data, ~40000 images
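As a back-of-envelope check on these figures, the sketch below derives the implied number of images per well and the average size of one image. It assumes one well per molecule per replicate and a decimal terabyte (10^12 bytes); neither assumption is stated on the slide, and the function name is illustrative only.

```python
def screen_footprint(molecules, replicates, total_bytes, images):
    """Return (wells, images per well, average bytes per image)."""
    # Assumption: each molecule occupies exactly one well per replicate.
    wells = molecules * replicates
    images_per_well = images / wells
    bytes_per_image = total_bytes / images
    return wells, images_per_well, bytes_per_image


# Figures from the slide; 1 TB is taken as 10**12 bytes (assumption).
wells, per_well, per_image = screen_footprint(1000, 2, 10**12, 40_000)
print(f"{wells} wells, {per_well:.0f} images per well, "
      f"{per_image / 1e6:.0f} MB per image")
```

Under these assumptions the screen works out to 2000 wells, 20 images per well and about 25 MB per image.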
High Content Screening Analysis pipeline
Sharing imaging data can enrich phenotyping and allow better
exploitation of the HCS data
CID iManage > Institut Curie
Avadis® iManage: Images, Annotations, Apps, Results, Settings
Acquisition UI, Access UI
Access APIs: SOAP/XML/RPC
Prepare Acquire Analyze Share Disseminate Visualize
Image server + metadata + annotations (manual or analysis results) /
attachments (publications, xls files…)
Acquisition client, web client interface, web admin for project
management
COMPUTING CLUSTER
IMAGE STORAGE
• Dynamic organisation, visual search or advanced search functionalities
• Metadata (pixel size, acquisition time,…), annotations
• Automatic analysis without full download
• Data fusion, advanced visualisation
BioImaging Cell and Tissue Core Facility
http://pict-ibisa.curie.fr/
Underlying hardware infrastructure
Avadis® iManage: Images, Annotations, Apps, Results, Settings
Access APIs: SOAP/XML/RPC
Collaborator / User: ?
Software tool needs to be addressed
Real-time Unified Bio-Imaging Exploitation
System > RUBIES
As part of the France Bio-Imaging network, we developed a Real-time Unified Bio-Imaging
Exploitation System to answer data sharing needs.
Liferay technology was selected as an open source platform to develop the
RUBIES portal.
RUBIES uses open source tools and widely-known paradigms to create a
collaborative platform to visualise and exchange HCS data.
Heterogeneous data from HCS is analysed in real-time in order to enrich
databases which are exposed through a web service layer.
WebLab integration framework: Service Oriented Architecture where individual
components are encapsulated as web services with common interface/communication protocols.
Heterogeneous components (algorithms, data sources, indexes, external tools)
• Internal components are separated from the communication layer
• Web services use standardized WebLab interfaces and data exchange protocols.
• Ensures compatibility between old and new services.
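The separation described in these bullets can be illustrated with a minimal sketch. This is not the WebLab API: the `Service` interface, the `CellCounter` component and the JSON message shape are all hypothetical names chosen for illustration. The point is only that the component logic knows nothing about transport, while the communication layer talks to every service through the same interface.

```python
import json
from typing import Protocol


class Service(Protocol):
    """Hypothetical common interface every wrapped component exposes."""

    def process(self, request: dict) -> dict: ...


class CellCounter:
    """Internal component: plain logic with no knowledge of the transport."""

    def count(self, segmented_labels: list[int]) -> int:
        # Label 0 is treated as background; each other distinct label is one cell.
        return len({label for label in segmented_labels if label != 0})


class CellCountService:
    """Adapter exposing CellCounter through the common interface."""

    def __init__(self, component: CellCounter) -> None:
        self._component = component

    def process(self, request: dict) -> dict:
        return {"cell_count": self._component.count(request["labels"])}


def handle_message(service: Service, message: str) -> str:
    """Communication layer: decode JSON, delegate to a service, encode the reply."""
    return json.dumps(service.process(json.loads(message)))


reply = handle_message(CellCountService(CellCounter()),
                       '{"labels": [0, 1, 1, 2, 0, 3]}')
print(reply)
```

Because `handle_message` depends only on the `process` method, a new component can be added by writing one adapter, without touching existing services.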
Liferay technology choice for development
• Any kind of information may be created by a service without needing to change
other services.
(subject, predicate, object) => (Project URI, hasId, "016")
• Visualization is done through portal and portlet technology. Each portlet can call user-specific types of data or processing.
• Portlets are loosely coupled elements, easing maintenance.
• Pages can be easily created by users in order to be tailored to a specific task or work methodology.
Data exchange protocols use RDF/JSON
for annotation storage and communication
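To make the triple model concrete, here is a minimal sketch that groups (subject, predicate, object) triples into the nested layout used by the W3C RDF/JSON serialization: subject, then predicate, then a list of object values. The URIs under example.org and the `hasPlate` predicate are hypothetical, and the rule used to tell URIs from literals is a simplification for illustration; the actual RUBIES vocabulary is not given in the slides.

```python
import json

# Hypothetical identifiers; the real RUBIES vocabulary is not shown in the slides.
PROJECT = "http://example.org/project/016"
HAS_ID = "http://example.org/hcs#hasId"
HAS_PLATE = "http://example.org/hcs#hasPlate"


def triples_to_rdf_json(triples):
    """Group (subject, predicate, object) triples into an RDF/JSON-style dict."""
    doc = {}
    for subject, predicate, obj in triples:
        # Simplification: anything that looks like an HTTP(S) URL is a URI node,
        # everything else is stored as a plain literal.
        is_uri = obj.startswith(("http://", "https://"))
        value = {"type": "uri" if is_uri else "literal", "value": obj}
        doc.setdefault(subject, {}).setdefault(predicate, []).append(value)
    return doc


triples = [
    (PROJECT, HAS_ID, "016"),
    (PROJECT, HAS_PLATE, "http://example.org/plate/1"),
]
doc = triples_to_rdf_json(triples)
print(json.dumps(doc, indent=2))
```

A new kind of annotation only adds a new predicate key, which is why a service can create new information without changes to the others.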
Figure panels: cell count, lysosome count, IF granularity; raw TIFF vs. JPEG 2000 (Q=80)
IF average intensity per well for lysosome staining (endocytosis
screening) / Raw data (384 well plates)
Plate robust z-score normalisation of average cell population IF
intensity prior to visualisation
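The slides do not give the exact formula used. A common definition of the robust z-score, assumed here, replaces the mean and standard deviation with the plate median and the scaled median absolute deviation (MAD). The sketch below applies it to the per-well values of one plate; the sample values are made up.

```python
from statistics import median

# Scaling factor that makes the MAD comparable to a standard deviation
# for normally distributed data.
MAD_SCALE = 1.4826


def robust_z_scores(well_values):
    """Return (x - median) / (1.4826 * MAD) for every well of one plate."""
    centre = median(well_values)
    mad = median([abs(v - centre) for v in well_values])
    if mad == 0:
        raise ValueError("MAD is zero; the robust z-score is undefined")
    return [(v - centre) / (MAD_SCALE * mad) for v in well_values]


# Made-up average IF intensities for six wells; the last one is an outlier.
plate = [100.0, 102.0, 98.0, 101.0, 99.0, 250.0]
scores = robust_z_scores(plate)
print([round(s, 2) for s in scores])
```

Because the median and MAD are barely affected by a few extreme wells, strong hits stand out clearly without distorting the scale for the rest of the plate.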
1) Image normalisation and contrast enhancement 2) JPEG compression
Data pre-processing prior to visualisation
Compression is necessary to reduce the volume of client/server data transfer (10-fold
decrease) and is also acceptable for cell and organelle segmentation
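The normalisation method is not specified in the slides. One common choice, assumed here purely for illustration, is a linear percentile stretch that maps raw intensities onto the 8-bit range before lossy compression. The sketch works on a flat list of pixel values; the compression step itself is left out because it needs an imaging library.

```python
def stretch_to_8bit(pixels, low_pct=1.0, high_pct=99.0):
    """Linearly rescale intensities between two percentiles to 0-255."""
    ordered = sorted(pixels)
    last = len(ordered) - 1
    # Nearest-rank percentiles (a simplification; no interpolation).
    low = ordered[round(last * low_pct / 100)]
    high = ordered[round(last * high_pct / 100)]
    if high == low:
        return [0] * len(pixels)  # flat image: nothing to stretch
    scale = 255 / (high - low)
    # Values outside the percentile window are clipped to 0 or 255.
    return [min(255, max(0, round((p - low) * scale))) for p in pixels]


# Made-up 16-bit intensities from a dim image with one bright pixel.
raw = [500, 520, 540, 600, 800, 1200, 4000]
stretched = stretch_to_8bit(raw, low_pct=0, high_pct=100)
print(stretched)
```

Clipping at percentiles rather than at the absolute minimum and maximum keeps a few saturated pixels from compressing the useful intensity range.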
Figure axes: experimental plates in acquisition order (x); raw well intensity and normalized well intensity (y); channels: DAPI, Endosome, Lysosome
Background services provide capability to search for elements through plain text queries
Interface for dynamic search of content
Selected fields (and corresponding metadata, in the lower-right corner) can be exported to external applications, in this case Avadis iManage (commercial) through specialized portlets
Export of selected annotation/images
Export of selected annotation/images /
multiple selection
Export of mixed automated and
experimental annotations with images
Conclusions
RDF is a web standard suitable for representing any kind of information
as collections of triples, including HCS data (NoSQL approach).
Image preprocessing (normalisation/compression) is a critical step for HCS
data visualisation (i.e. throughput) and visual assessment of results.
RUBIES can interoperate with other data management systems and hence
integrate HCS image and annotation data with other experimental output.
Concurrent access to the portal and user management are supported, with
job parallelisation on the computing cluster.
RUBIES is still a work in progress; however, the visualisation interfaces are
fully functional, and we are happy to share our work with the community.
Further work: user-added annotations and exploitation of image and
metadata in order to create cell image dictionaries for pattern recognition.
Franck Perez
Philippe Benaroche
Jacque Chamonix
Elaine Del Nery
Aurianne Lescure
Sarah Tessier
Dmitry Vjostockolevic
Elodie Anthony
Acknowledgments Curie Institute - BIOPHENICS
Curie Institute – UMR 144
Jean Salamero
Perrine Paul-Gilloteaux