Workflows and visualisation for machine learning segmentation of massive image data sets

James Lefevre

Hamilton lab

Institute for Molecular Bioscience

The University of Queensland

Lattice Light Sheet Microscopy

• Very high spatial resolution (100 x 100 x 268 nm)

• Very high temporal resolution (100 slices / second)

• Very low photo-toxicity

• Opportunity to generate very rare data (very few working systems worldwide)

• 4 individual lasers: 440 nm (CFP), 488 nm (GFP), 560 nm (RFP), 640 nm (Far Red)

Typical capture: 300 time steps / image stacks over 20 minutes

Each stack 1113 x 1024 x 151 voxels, 688 MB (single channel)
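A rough check on these figures: the stated stack size is consistent with 4 bytes per voxel. That is an inference from the numbers, not something the slide states. Under that assumption, a minimal sketch of the data volume for one capture:

```python
# Back-of-envelope data volume for one capture.
# ASSUMPTION: 4 bytes per voxel (e.g. 32-bit values). The slide gives only
# the stack dimensions and the 688 MB total, which this assumption reproduces.
NX, NY, NZ = 1113, 1024, 151
BYTES_PER_VOXEL = 4
N_STACKS = 300
CAPTURE_SECONDS = 20 * 60

voxels_per_stack = NX * NY * NZ
stack_bytes = voxels_per_stack * BYTES_PER_VOXEL
capture_bytes = stack_bytes * N_STACKS

print(f"voxels per stack : {voxels_per_stack:,}")
print(f"stack size       : {stack_bytes / 1e6:.0f} MB")
print(f"capture size     : {capture_bytes / 1e9:.1f} GB per channel")
print(f"seconds per stack: {CAPTURE_SECONDS / N_STACKS:.0f}")
```

A single-channel capture is therefore on the order of 200 GB before any derived data is stored.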

Approach to analysis of macrophage imaging

• Want automated semantic segmentation into:

– Tent-pole

– Ruffles

– Cell body

– Background

• Segment and analyse each stack / time step, then integrate information

• Identify and track objects

• Understand and quantify events with Stow Group

Stow Lab Imaging: http://doi.org/10.1083/jcb.201804137


Pipeline for segmentation and quantification of 4D imaging

Elements needed for pipeline

• High performance computing (thanks RCC!)

• 4D visualisation for interaction and viewing

• Methods for tracking structures across time

• Statistical analysis

Automated semantic segmentation approaches

Deep learning

• Use 3D U-Net or similar architecture

• Sophisticated segmentations rapidly calculated on GPU clusters

• Requires extensive training data and time

• Uncertain prospects for transfer learning for this problem

Drawback here: lack sufficient training data (many complete segmentations), may lack computational resources for training

Traditional approach

• Work in interactive ImageJ or similar

• Clean image, e.g. noise removal via median filter

• Thresholding to separate classes, spot detection to identify objects, etc.

• Code algorithm in macro, run in batch mode

Drawback: lacks power and reliability at scale; cannot compete with machine learning approaches

Alternative machine learning approaches?

Machine learning with Trainable Weka

• Uses existing algorithms in ImageJ and plugins to calculate useful image features

• Trains a "shallow" machine learning algorithm from Weka ML platform using these features

• Far less training data required; fast, easy training of model

https://imagej.net/Trainable_Weka_Segmentation

Statistics used:

• Local 3D mean, min, max, variance, median

• 3D derivatives

• 3D Hessians

• 3D Gaussian blur

• 3D Laplacian

• 3D Canny edge detection

• 3D Difference of Gaussians

Calculated for r = 1, 2, 4, 8, 16 pixels (less in z dimension)
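To make the feature list concrete, here is a minimal sketch of a multi-scale 3D feature bank. It uses scipy.ndimage as a stand-in for the ImageJ / FeatureJ implementations actually used, so values will not match those plugins exactly. The function name, the feature names and the z_scale factor are hypothetical, and the Hessian and Canny features are omitted for brevity.

```python
import numpy as np
from scipy import ndimage as ndi

def feature_stack(volume, radii=(1, 2, 4, 8, 16), z_scale=0.5):
    """Return {feature_name: 3D array} for a (z, y, x) volume.

    ASSUMPTION: z_scale shrinks the window along z to mimic
    "less in z dimension"; the real factor is not given in the slides.
    """
    vol = volume.astype(np.float32)
    features = {}
    for r in radii:
        rz = max(1, int(round(r * z_scale)))
        size = (2 * rz + 1, 2 * r + 1, 2 * r + 1)   # box window for local statistics
        sigma = (r * z_scale, r, r)                 # Gaussian scale per axis

        mean = ndi.uniform_filter(vol, size=size)
        mean_sq = ndi.uniform_filter(vol * vol, size=size)
        blur = ndi.gaussian_filter(vol, sigma=sigma)
        wide_blur = ndi.gaussian_filter(vol, sigma=tuple(2 * s for s in sigma))

        features[f"mean_r{r}"] = mean
        # Var[x] = E[x^2] - E[x]^2, clipped to remove tiny negative rounding errors.
        features[f"variance_r{r}"] = np.maximum(mean_sq - mean * mean, 0.0)
        features[f"min_r{r}"] = ndi.minimum_filter(vol, size=size)
        features[f"max_r{r}"] = ndi.maximum_filter(vol, size=size)
        features[f"median_r{r}"] = ndi.median_filter(vol, size=size)
        features[f"gaussian_r{r}"] = blur
        features[f"laplacian_r{r}"] = ndi.gaussian_laplace(vol, sigma=sigma)
        features[f"dog_r{r}"] = blur - wide_blur
        features[f"gradient_r{r}"] = ndi.gaussian_gradient_magnitude(vol, sigma=sigma)
    return features
```

Each voxel then has one value per feature and radius, and that vector is what the classifier sees.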

Used random forest

• Fast

• >99% class-weighted cross-validation (CV) accuracy

• Good generalisability (visual assessment)

Replicated training data with scaled intensity for robustness against fluorophore response

Speed advantage in training does not extend to deployment: all features need to be calculated for each image

2D and 3D versions differ only in the image features used
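The intensity-scaling idea can be sketched as follows. This is an illustration under assumptions, not the code used in the project: the function name, the scale factors and the feature_fn callback are hypothetical. Features are recomputed on each scaled copy because some of them (variance, for example) do not scale linearly with intensity.

```python
import numpy as np

def build_training_table(volume, labels, feature_fn, scales=(0.5, 1.0, 2.0)):
    """Build (X, y) from labelled voxels, replicated at several intensity scales.

    volume     : (z, y, x) image
    labels     : (z, y, x) integer array, 0 = unlabelled, 1..K = class
    feature_fn : callable mapping a volume to an (n_features, z, y, x) array
    scales     : ASSUMED intensity multipliers; the slides do not list them
    """
    labelled = labels > 0
    X_parts, y_parts = [], []
    for s in scales:
        feats = feature_fn(volume.astype(np.float32) * s)
        X_parts.append(feats[:, labelled].T)      # (n_labelled, n_features)
        y_parts.append(labels[labelled])
    return np.concatenate(X_parts), np.concatenate(y_parts)


# Toy usage with two stand-in features: raw intensity and its square.
toy_volume = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
toy_labels = np.zeros((2, 3, 4), dtype=np.int64)
toy_labels[0, 0, :2] = 1
toy_labels[1, 2, 1:4] = 2
X, y = build_training_table(toy_volume, toy_labels, lambda v: np.stack([v, v * v]))
print(X.shape, y.shape)
```

The resulting table can be passed to any classifier, such as a random forest.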

GUI features

Builds on ImageJ interface

Create classes

Select training data

Select image features and scales

Select machine learning model

Train, save, load and apply models

Export training data

Compare segmentation to image

Select additional training data -> iterate

Can look at standard ML performance measures, but primarily assess visually over whole image / stack

Trainable Weka

GUI falls short of requirements in several ways – used API and extension code

Trainable Weka at scale

Run model externally, cache features on disk

• Calculate single feature for whole stack

• Segment single slice using all features

Selectively downsample for features at larger radii

HPC cluster, ImageJ headless mode, Research Database Management collection
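The "one feature per pass, one slice per prediction" pattern can be sketched in Python with memory-mapped arrays. The actual pipeline used headless ImageJ and the Trainable Weka API, so everything below (function names, the .npy cache format, the classify callback) is an assumed stand-in that only illustrates the memory pattern.

```python
import os
import tempfile
import numpy as np

def cache_features(volume, feature_fns, cache_dir):
    """Compute one feature at a time for the whole stack and write it to disk."""
    paths = []
    for name, fn in feature_fns.items():
        path = os.path.join(cache_dir, f"{name}.npy")
        np.save(path, fn(volume).astype(np.float32))   # only one feature in RAM at once
        paths.append(path)
    return paths

def segment_by_slice(paths, classify):
    """Classify one z-slice at a time, reading that slice from every cached feature.

    classify: callable mapping an (n_pixels, n_features) array to (n_pixels,) labels.
    """
    stacks = [np.load(p, mmap_mode="r") for p in paths]   # memory-mapped, not loaded
    nz, ny, nx = stacks[0].shape
    out = np.empty((nz, ny, nx), dtype=np.uint8)
    for z in range(nz):
        X = np.stack([np.asarray(s[z]).ravel() for s in stacks], axis=1)
        out[z] = classify(X).reshape(ny, nx)
    return out


# Toy usage: two stand-in features and a threshold in place of a trained model.
demo_volume = np.random.default_rng(0).random((4, 6, 5), dtype=np.float32)
demo_dir = tempfile.mkdtemp()
demo_paths = cache_features(
    demo_volume, {"raw": lambda v: v, "squared": lambda v: v * v}, demo_dir
)
demo_seg = segment_by_slice(demo_paths, lambda X: (X[:, 0] > 0.5).astype(np.uint8))
print(demo_seg.shape, int(demo_seg.sum()))
```

Peak memory is then one full feature stack during caching and one slice of every feature during segmentation, rather than all feature stacks at once.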

Wrote visualisation software using Processing language

See slices in 3D context

Compare images to segmentation, segmentation versions

Memory constraints, stability

110 feature stacks x ~200MB

Processing speed

Some features cost O(radius³)

Automatic processing of large datasets

Selection of training data – 3D context
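A back-of-envelope view of why the largest radii dominate and why downsampling helps. This assumes a brute-force cubic window, so cost per voxel is proportional to the window volume; separable filters are cheaper. The downsampling factor of 2 is illustrative, since the slides do not give the factors used.

```python
# Rough cost model for a brute-force cubic-window filter.
# ASSUMPTIONS: cost per output voxel is proportional to the window volume
# (2r + 1)^3, and the downsampling factor of 2 is illustrative only.

def window_cost(radius, downsample=1):
    """Relative work per original-resolution voxel."""
    r_eff = radius // downsample                  # radius in downsampled voxels
    voxels_in_window = (2 * r_eff + 1) ** 3
    return voxels_in_window / downsample ** 3     # fewer voxels to visit after downsampling

for r in (1, 2, 4, 8, 16):
    print(f"r={r:2d}: window of {window_cost(r):>7.0f} voxels")

speedup = window_cost(16) / window_cost(16, downsample=2)
print(f"r=16 at half resolution: about {speedup:.0f}x less work")

# Memory needed to hold every feature stack at once (figures from the slide).
total_gb = 110 * 200e6 / 1e9
print(f"110 feature stacks x ~200 MB = ~{total_gb:.0f} GB")
```

Halving the resolution for the largest radius cuts the work by well over an order of magnitude under this model, at the price of a coarser feature.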

Deconvolution

Max intensity projections and stats

Segmentation

Computational Workflow

Parallel processing ~300 stacks

Watershed split, object stats and adjacencies

Tracking

Quality check, cropping, intensity adjustment

Meshes and skeletonisation

Further analysis and visualisation


Microvolution software, Wiener GPU cluster

IMB Image portal

Interactive ImageJ, R


Headless ImageJ, Trainable Weka plugin

ImageScience / FeatureJ plugin

Headless ImageJ, MorphoLibJ

3D Objects Counter

Headless ImageJ with 3D ImageJ Suite / mcib3d

Skeletonize3D, AnalyseSkeleton, 3D Objects Counter

Headless ImageJ
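Because each stack is processed independently, the per-stack steps map naturally onto parallel workers. The sketch below shows the idea for the "max intensity projections and stats" step only. It uses a Python thread pool in place of the HPC job scheduling and ImageJ tooling actually used, and the function names are hypothetical.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def summarise_stack(stack):
    """Return the max intensity projection along z and basic intensity statistics."""
    mip = stack.max(axis=0)
    stats = {
        "min": float(stack.min()),
        "max": float(stack.max()),
        "mean": float(stack.mean()),
        "std": float(stack.std()),
    }
    return mip, stats

def summarise_all(stacks, workers=4):
    """Process stacks in parallel; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarise_stack, stacks))


# Toy usage: six tiny constant-valued stacks in place of ~300 real ones.
toy_stacks = [np.full((3, 4, 5), i, dtype=np.float32) for i in range(6)]
results = summarise_all(toy_stacks)
print(len(results), results[2][1]["mean"])
```

On a cluster the same function would more likely run as one job or array task per stack, loading each stack from disk.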

Object detection and tracking

Challenge

• Hundreds of objects over hundreds of time steps

• Segmentation often fails to fully separate objects

• Inevitable variation in segmentation between time steps

• Need to combine objects of different classes to understand

events

Approach

• Watershed split algorithm using edge-distance

• Selective rejoining of objects integrated into tracking

• Created class and object hierarchy – merged tentpole/ruffle class also analysed

Visualisation

Questions:

During training and validation

• Training data selection: what class should that pixel be in?

• Assessment and comparison of base segmentations

Evaluating object segmentation and tracking

• How did my object splitting and joining algorithm go?

• How about tracking over time?

Answer: I need a 3D/4D visualiser that allows instant switching between various types of data, so I wrote a visualiser in Processing 3

Visualiser - Viewing source & segmented data

Training data selection

What class should that pixel be in? How did my last model do?

Need to put image slices into 3D/4D context

View Object Associations in Space

How did my object splitting algorithm go?

Cells separated, but 2 spurious splits

Corrected by recombination algorithm

Tracking Across Time

2 cells over 21 time points

2 cells over 3 time points

Yellow lines track tentpoles

Ruffles splitting and merging
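The slides do not describe the tracking algorithm itself. As an illustration only, one simple way to link objects between consecutive time points and flag splits and merges is by voxel overlap of their labels. The function names and the overlap rule below are assumptions, not the method used in the project.

```python
import numpy as np
from collections import defaultdict

def link_by_overlap(labels_prev, labels_next, min_overlap=1):
    """Return {prev_label: [(next_label, shared_voxels), ...]} for two label images."""
    both = (labels_prev > 0) & (labels_next > 0)
    if not both.any():
        return {}
    pairs, counts = np.unique(
        np.stack([labels_prev[both], labels_next[both]], axis=1),
        axis=0, return_counts=True,
    )
    links = defaultdict(list)
    for (a, b), n in zip(pairs, counts):
        if n >= min_overlap:
            links[int(a)].append((int(b), int(n)))
    return dict(links)

def find_events(links):
    """Splits: one object linked to several successors. Merges: several to one."""
    splits = {a: [b for b, _ in bs] for a, bs in links.items() if len(bs) > 1}
    parents = defaultdict(list)
    for a, bs in links.items():
        for b, _ in bs:
            parents[b].append(a)
    merges = {b: ps for b, ps in parents.items() if len(ps) > 1}
    return splits, merges


# Toy usage: object 1 splits into objects 1 and 3 between two time points.
frame_t0 = np.array([[1, 1, 1, 1, 0, 2, 2]])
frame_t1 = np.array([[1, 1, 3, 3, 0, 2, 2]])
forward_links = link_by_overlap(frame_t0, frame_t1)
forward_splits, forward_merges = find_events(forward_links)
print(forward_links, forward_splits, forward_merges)
```

Running the same functions with the frames swapped reports the event as a merge, so one pass over consecutive frames covers both the splitting and the merging of ruffles.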

Questions

• Understand this new way that cells internalise proteins and molecules from their environment and respond to pathogens

• Apply to tens of thousands of events

• Track each event and its components over time

• What affects their generation? What proteins are crucial?

• What defects are associated with disease?

• Apply and adapt to other cellular systems

Stow Lab Imaging: http://doi.org/10.1083/jcb.201804137


Nick Hamilton

Funding

Stow Lab

Adam Wall

Nick Condon

Yvette Koh

Institute for Molecular Bioscience Microscopy

UQ Research Computing Centre

Training data selection (figure panels: deconvolved image, semantic segmentation, segmentation probability)

Segmentation of 3D LLSM imaging

Macrophage cells

(Image: Nick Condon / Stow Lab)

Segmentation into “tent poles”, ruffles, cell surface

(Image: James Lefevre / Hamilton Lab)

Has been applied to ~2000 3D time points and appears to work well
