lec16: medical image registration (advanced): deformable registration
TRANSCRIPT
MEDICAL IMAGE COMPUTING (CAP 5937)
LECTURE 16: Medical Image Registration II (Advanced):Deformable Registration
Dr. Ulas Bagci, HEC 221, Center for Research in Computer Vision (CRCV), University of Central Florida (UCF), Orlando, FL
SPRING 2017
1
Outline
• Similarity Functions Revisited
  – MSE, NCC, MI, …
• Deformable Image Registration
  – B-spline parameterization and Free Form Deformation
  – Diffeomorphic image registration (DARTEL, …)
• Optimization
2
I. SIMILARITY MEASURES: REVISITED
3
Standard Similarity Measures
• SSD, SAD
• Cross correlation
• Mutual information
4
Difference Measure
5
6
Credit: Itk.org
Limitations of SSD
7
Limitations of SSD
8
Read: Bagci, U., et al. The role of Intensity Standardization in Medical Image Registration, PRL 2010.
Normalized Cross Correlation (NCC)
9
NCC-Example
10
Multi-modal Image Registration
11
Information Theoretic Approaches
• Mutual Information (MI)
• Normalized Mutual Information (NMI)
12
Preliminary: Histogram Calculation
13
Preliminary: Joint Histogram Calculation
14
Information Theoretic Approach
15
Joint Histogram
16
Joint Histogram
17
Entropy
18
Mutual Information
19
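The histogram, joint histogram, entropy, and mutual information slides can be tied together with a short sketch. It assumes the common plug-in estimator, in which MI is computed from a normalized joint histogram; the function name, the bin count, and the synthetic test images are illustrative choices, not taken from the lecture.

```python
import numpy as np

def mutual_information(fixed, moving, bins=32):
    """Estimate MI (in nats) from the joint intensity histogram of two images."""
    joint, _, _ = np.histogram2d(fixed.ravel(), moving.ravel(), bins=bins)
    p_xy = joint / joint.sum()               # joint probability
    p_x = p_xy.sum(axis=1, keepdims=True)    # marginal of the fixed image
    p_y = p_xy.sum(axis=0, keepdims=True)    # marginal of the moving image
    nz = p_xy > 0                            # skip empty bins (0 * log 0 = 0)
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])))

rng = np.random.default_rng(0)
a = rng.random((64, 64))
inverted = 1.0 - a                 # same structure, inverted contrast
unrelated = rng.random((64, 64))   # no shared structure

mi_aligned = mutual_information(a, inverted)
mi_unrelated = mutual_information(a, unrelated)
```

The inverted image has a completely different intensity mapping, so SSD would be large, yet its MI with the original is high because one intensity predicts the other. This is the property that makes MI suitable for multi-modal registration.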
Historical Note
20
Normalization of MI
21
Improvements to MI
22
Requirements on Similarity Measure
23
II. Deformable Registration
24
Non-Rigid Deformation
25
[Figure panels: Source, Target, Before, After]
Reasons for Deformable Registration
26
Deformable Registration
27
Non-Rigid Registration
28
Deformable Registration General Framework
29
Deformation Fields
30
Deformation Fields
31
Deformable Registration General Framework
32
Why do we need regularization?
33
Why do we need regularization?
34
Regularization Strategies
35
Example: Elastic Registration
• Model the deformation as a physical process resembling the stretching of an elastic material
  – The physical process is governed by the internal force and the external force
  – It is described by the Navier linear elastic partial differential equation
• The external force drives the registration process
  – The external force can be the gradient of a similarity measure
    • e.g. a local correlation measure based on intensities, intensity differences, or intensity features such as edges and curvature
  – Or the distance between the curves and surfaces of corresponding anatomical structures
36
Example: Free-Form Deformation (FFD)
• The general idea is to deform an image by manipulating a regular grid of control points that are distributed across the image at an arbitrary mesh resolution.
• Control points can be moved, and the position of individual pixels between the control points is computed from the positions of the surrounding control points.
37
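How a pixel's displacement follows from the surrounding control points can be sketched in one dimension. The sketch assumes the usual cubic B-spline FFD (four neighboring control points per position, weighted by the cubic B-spline basis); the function names, the grid layout with one padding control point on the left, and the numbers are hypothetical.

```python
import numpy as np

def bspline_basis(u):
    """Four cubic B-spline weights for a fractional offset u in [0, 1)."""
    return np.array([
        (1 - u) ** 3 / 6.0,
        (3 * u**3 - 6 * u**2 + 4) / 6.0,
        (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0,
        u**3 / 6.0,
    ])

def ffd_displacement_1d(x, control, spacing):
    """Displacement at position x from control-point displacements.

    Assumed layout: control[j] belongs to the control point located at
    (j - 1) * spacing, i.e. one extra control point pads the grid on the left.
    """
    i = int(np.floor(x / spacing))   # index of the grid cell containing x
    u = x / spacing - i              # fractional position inside that cell
    return float(np.dot(bspline_basis(u), control[i:i + 4]))

spacing = 10.0
control = np.zeros(8)
control[3] = 2.0                     # move only the control point at x = 20

at_control_point = ffd_displacement_1d(20.0, control, spacing)
far_away = ffd_displacement_1d(45.0, control, spacing)
```

Moving a single control point changes the displacement only within a few grid cells around it, which is why B-spline FFDs are described as locally controlled.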
Free Form Deformation
38
Credits: Sederberg and Parry, SIGGRAPH (1986)
Global Motion Model
39
(12 degrees of freedom)
Local Motion Model
• The affine transformation captures only the global motion.
• An additional transformation is required, which models the local deformation.
40
FFD with B-Splines
42
B-Spline / Math
43
44
Deformation with B-Splines
45
Original Lena
Deformation with B-Splines
46
Deformed with B-Spline - Lena
B-Spline Parameterization
47
B-Splines Practically
48
B-Splines Practically
49
Interpolation
• For computation of the cost function, the moving image intensity is evaluated at non-voxel positions, for which intensity interpolation is needed.
• Illustrated options: nearest neighbor and tri-linear interpolation.
52
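The two interpolation schemes named on these slides can be sketched as follows. This is a minimal illustration that assumes (z, y, x) index order, clamping at the image border, and at least two voxels along every axis; the function names are made up for the example.

```python
import numpy as np

def trilinear(volume, point):
    """Tri-linearly interpolate `volume` at a continuous (z, y, x) position."""
    upper = np.array(volume.shape) - 1
    p = np.clip(np.asarray(point, dtype=float), 0, upper)  # clamp to the image
    p0 = np.minimum(np.floor(p).astype(int), upper - 1)    # lower corner voxel
    f = p - p0                                             # offsets in [0, 1]
    value = 0.0
    for dz in (0, 1):
        for dy in (0, 1):
            for dx in (0, 1):
                weight = ((f[0] if dz else 1 - f[0]) *
                          (f[1] if dy else 1 - f[1]) *
                          (f[2] if dx else 1 - f[2]))
                value += weight * volume[p0[0] + dz, p0[1] + dy, p0[2] + dx]
    return float(value)

def nearest_neighbor(volume, point):
    """Return the intensity of the voxel closest to `point`."""
    idx = np.clip(np.rint(point).astype(int), 0, np.array(volume.shape) - 1)
    return float(volume[tuple(idx)])

volume = np.arange(27, dtype=float).reshape(3, 3, 3)  # intensity = 9z + 3y + x
point = (0.4, 1.25, 1.75)
value_linear = trilinear(volume, point)
value_nearest = nearest_neighbor(volume, point)
```

Nearest neighbor is cheap but makes the cost function piecewise constant in the transformation parameters, whereas tri-linear interpolation varies continuously, which gradient-based optimizers generally handle better.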
III. Optimization
53
Optimization
54
Iterative Optimization
55
Gradient Descent
56
Cost Function Derivative
57
58
59
60
61
62
Sampling Strategy
• The most straightforward strategy is to use all voxels from the fixed image; the obvious downside is that this is time-consuming for large images.
• A common approach is to use a subset of voxels, selected on a uniform grid or sampled randomly.
• Another strategy is to pick only those points that are located on striking image features, such as edges.
• 2K samples are often enough for good performance!
• Masks can be used to select a region of interest or to avoid the undesired alignment of artificial edges in the images.
67
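Random sampling inside a mask can be sketched in a few lines. The example assumes a mean-squared-difference cost and uses the 2K sample count mentioned on the slide; the function names and synthetic images are illustrative only.

```python
import numpy as np

def sample_voxels(shape, n_samples=2000, mask=None, rng=None):
    """Draw random voxel coordinates, optionally restricted to a boolean mask."""
    rng = np.random.default_rng() if rng is None else rng
    if mask is None:
        candidates = np.arange(int(np.prod(shape)))
    else:
        candidates = np.flatnonzero(mask)       # flat indices inside the mask
    n = min(n_samples, candidates.size)
    chosen = rng.choice(candidates, size=n, replace=False)
    return np.unravel_index(chosen, shape)      # tuple of per-axis index arrays

def sampled_msd(fixed, moving, coords):
    """Mean squared intensity difference evaluated only at the sampled voxels."""
    diff = fixed[coords] - moving[coords]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
fixed = rng.random((40, 40, 40))
moving = fixed + 0.1                            # constant intensity offset
mask = np.zeros(fixed.shape, dtype=bool)
mask[10:30, 10:30, 10:30] = True                # region of interest

coords = sample_voxels(fixed.shape, n_samples=2000, mask=mask, rng=rng)
cost = sampled_msd(fixed, moving, coords)
```

Here 2,000 of the 64,000 voxels are visited per cost evaluation. Many implementations redraw the samples at every iteration, which turns the optimizer into a stochastic one.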
IV. Advanced Models of Non-Rigid Deformations
68
Deformation Model
69
Small deformation models do not necessarily enforce a one-to-one mapping. If the inverse transformation is correct, composing the forward and inverse transformations should give the identity!
Diffeomorphic Registration
• The large-deformation or diffeomorphic setting is a much more elegant framework. A diffeomorphism is a globally one-to-one (bijective), smooth, and continuous mapping with derivatives that are invertible.
• If the mapping is not diffeomorphic, then topology is not necessarily preserved.
• It is easier to parameterize using a number of velocity fields corresponding to different time periods over the course of the evolution of the diffeomorphism (consider u(t) as a velocity field at time t).
• Diffeomorphisms are generated by initializing with an identity transform and integrating over unit time to obtain the final deformation φ(1).
73
Forward & Inverse Transform
74
DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)
Diffeomorphic Anatomical Registration Through Exponentiated Lie Algebra
75
DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)
• The DARTEL model assumes a flow field (u) that remains constant over time. With this model, the differential equation describing the evolution of a deformation is $\frac{d\varphi}{dt} = u(\varphi^{(t)})$.
• The Euler method is a simple integration approach to extend from the identity (initial) transform, which involves computing new solutions after many successive small time-steps (h): $\varphi^{(t+h)} = \varphi^{(t)} + h\,u(\varphi^{(t)})$.
• Each of these Euler steps is equivalent to the composition $\varphi^{(t+h)} = (x + h u) \circ \varphi^{(t)}$.
78
DARTEL: A Fast Diffeomorphic Method (Ashburner, NeuroImage 2007)
• The use of a large number of small time steps will produce a more accurate solution, for instance (8 steps)
79
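The eight-step Euler integration can be sketched in one dimension and compared with scaling and squaring, which reaches the same end point with three self-compositions instead of eight sequential steps. The sketch assumes a stationary velocity field sampled on a regular grid and linear interpolation for composing maps; the helper names and the sinusoidal velocity are illustrative.

```python
import numpy as np

def compose(outer, inner, grid):
    """Return outer(inner(x)) for 1-D maps sampled on `grid`."""
    return np.interp(inner, grid, outer)

def euler_integrate(velocity, grid, n_steps):
    """Apply the small map x + h*u repeatedly, starting from the identity."""
    h = 1.0 / n_steps
    small = grid + h * velocity
    phi = grid.copy()                       # identity transform
    for _ in range(n_steps):
        phi = compose(small, phi, grid)     # (x + h*u) composed with phi
    return phi

def scaling_and_squaring(velocity, grid, n_squarings):
    """Start from x + u / 2^N and compose the map with itself N times."""
    phi = grid + velocity / 2 ** n_squarings
    for _ in range(n_squarings):
        phi = compose(phi, phi, grid)
    return phi

grid = np.linspace(0.0, 1.0, 201)
velocity = 0.1 * np.sin(np.pi * grid)       # vanishes at both ends of the domain

phi_euler = euler_integrate(velocity, grid, n_steps=8)
phi_fast = scaling_and_squaring(velocity, grid, n_squarings=3)
```

Both routes compose eight copies of the same small deformation, so they agree up to interpolation error, and the resulting map stays strictly increasing, i.e. it does not fold.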
Optimization of DARTEL
• Image registration procedures use a mathematical model to explain the data. Such a model contains a number of unknown parameters that describe how an image is deformed.
• A true diffeomorphism has an infinite number of dimensions and is infinitely differentiable.
• The discrete parameterization of the velocity field, u(x), can be considered as a linear combination of basis functions: $u(x) = \sum_i v_i \rho_i(x)$.
82
ρi(x) is the ith first degree B-spline basis function at position x
v is a vector of coefficients
Optimization of DARTEL
• The aim is to estimate the single “best” set of values for these parameters (v). The objective function, which is the measure of “goodness”, is formulated as the most probable deformation, given the data (D).
• The objective is to find the most probable parameter values and not the actual probability density, so the normalizing factor p(D) is ignored. The single most probable estimate of the parameters is known as the maximum a posteriori (MAP) estimate.
• Or, equivalently, the MAP estimate minimizes the negative logarithm of the posterior.
85
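The MAP argument can be written out as a short derivation. This is a sketch assuming the usual Bayes-rule formulation, with v the parameter vector and D the data as on the slides; the symbol $\hat{v}$ for the estimate is introduced here for convenience.

```latex
\begin{align}
p(v \mid D) &= \frac{p(D \mid v)\, p(v)}{p(D)} \\
\hat{v} &= \arg\max_{v}\; p(D \mid v)\, p(v)
  && \text{($p(D)$ does not depend on $v$)} \\
        &= \arg\min_{v}\; \bigl[ -\log p(D \mid v) - \log p(v) \bigr]
\end{align}
```

The first term of the last line plays the role of the likelihood (image matching) component and the second that of the prior (regularization) component.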
Optimization of DARTEL
• Many nonlinear registration approaches search for a maximum a posteriori (MAP) estimate of the parameters defining the warps, which corresponds to the mode of the probability density.
• The Levenberg–Marquardt (LM) algorithm is a very good general-purpose optimization strategy.
88
A: second order tensor field, H: concentration matrix, b: first derivative of likelihood func.
Optimization of DARTEL
Simultaneously minimize the sum of
– Likelihood component
  • Sum of squares difference
  • $\tfrac{1}{2}\sum_i \sum_k \left(t_k(x_i) - \mu_k(\varphi^{(1)}(x_i))\right)^2$
  • $\varphi^{(1)}$ parameterized by u
– Prior component
  • A measure of deformation roughness
  • $\tfrac{1}{2} u^T H u$
89
Optimization of DARTEL
PRIOR TERM
• $\tfrac{1}{2} u^T H u$
• DARTEL has three different models for H
  – Membrane energy
  – Linear elasticity
  – Bending energy
• H is very sparse
• H: deformation roughness
90
An example H for 2D registration of 6x6 images (linear elasticity)
Optimization of DARTEL
LIKELIHOOD TERM
• Images are assumed to be partitioned into different tissue classes.
  – E.g., a 3 class registration simultaneously matches:
    • Grey matter with grey matter
    • White matter with white matter
    • Background (1 – GM – WM) with background
91
“Membrane Energy”
[Figure panels: Convolution Kernel, Sparse Matrix Representation]
Penalizes first derivatives. Sum of squares of the elements of the Jacobian (matrices) of the flow field.
“Bending Energy”
[Figure panels: Sparse Matrix Representation, Convolution Kernel]
Penalizes second derivatives.
“Linear Elasticity”
• Decompose the Jacobian of the flow field into
  – Symmetric component
    • $\tfrac{1}{2}(J + J^T)$
    • Encodes non-rigid part.
  – Anti-symmetric component
    • $\tfrac{1}{2}(J - J^T)$
    • Encodes rigid-body part.
• Penalize sum of squares of symmetric part.
• Trace of Jacobian encodes volume changes. Also penalized.
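The decomposition behind the linear elasticity penalty can be checked numerically. The sketch below assumes a single 2x2 Jacobian of the flow field at one voxel; the weights `mu` and `lam` are hypothetical placeholders for the elasticity parameters, and the exact weighting used in DARTEL is not reproduced here.

```python
import numpy as np

def decompose_jacobian(J):
    """Split J into its symmetric (non-rigid) and anti-symmetric (rigid) parts."""
    sym = 0.5 * (J + J.T)
    anti = 0.5 * (J - J.T)
    return sym, anti

def elastic_penalty(J, mu=1.0, lam=1.0):
    """Illustrative penalty: squared symmetric part plus squared trace."""
    sym, _ = decompose_jacobian(J)
    return float(mu * np.sum(sym ** 2) + 0.5 * lam * np.trace(J) ** 2)

# A small rotation-like term (off-diagonal) plus some stretching (diagonal).
J = np.array([[0.10, -0.05],
              [0.05,  0.02]])
sym, anti = decompose_jacobian(J)

pure_rotation = np.array([[0.0, -0.05],
                          [0.05, 0.0]])
penalty_rotation = elastic_penalty(pure_rotation)
penalty_mixed = elastic_penalty(J)
```

A purely anti-symmetric Jacobian, which corresponds to a small rotation, incurs no penalty, while stretching and volume change do.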
Gauss-Newton Optimization
• Uses Gauss-Newton
  – Requires a matrix solution to a very large set of equations at each iteration
    $u^{(k+1)} = u^{(k)} - (H + A)^{-1} b$
  – b are the first derivatives of the objective function
  – A is a sparse matrix of second derivatives
  – Computed efficiently, making use of scaling and squaring
95
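The update rule can be tried on a toy problem. The sketch assumes a linear least-squares likelihood with a quadratic prior, so that A is the Gauss-Newton approximation of the likelihood's second derivatives and b is the gradient of the full objective; the matrices are random stand-ins, not registration data.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 5))     # toy linear model standing in for the warp
t = rng.normal(size=20)          # toy target data
H = 0.1 * np.eye(5)              # toy regularization (prior) matrix

def objective(u):
    """0.5 * ||M u - t||^2 + 0.5 * u^T H u."""
    r = M @ u - t
    return float(0.5 * r @ r + 0.5 * u @ H @ u)

def gradient(u):
    """First derivatives b of the objective with respect to u."""
    return M.T @ (M @ u - t) + H @ u

u = np.zeros(5)
A = M.T @ M                      # second derivatives of the likelihood term
# Solve (H + A) step = b instead of forming the inverse explicitly.
u_next = u - np.linalg.solve(H + A, gradient(u))
```

Because this toy objective is exactly quadratic, a single step lands on the minimum. In registration the likelihood is nonlinear in u, so the step is repeated, and for large images the system is solved with sparse or multigrid solvers rather than a dense solve.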
Example
96
Large Deformation Model
• In the small deformation model, the displacements are stored at each voxel w.r.t. their initial position.
• Since regularization acts on the displacement field, highly localized deformations cannot be modeled, because the deformation energy caused by stress increases proportionally with the strength of the deformation.
• In the large deformation model, the displacement is generated via a time-dependent velocity field v
100
where u(x,y,z,0)=(x,y,z).
Many approaches are based on a pre-processing step!
101
102
103
Image Gradients
104
Image Gradients
105
Image Gradients
106
107
Entropy Images
108
Entropy Images
109
Entropy Images
110
Entropy Images
111
112
Phase-based Registration
113
114
Multi-Resolution Registration
115
116
Attribute Vectors
117
Recent Approaches
118
Example: Joint Segmentation and Registration
119
Example: CNN based Registration!
120
The approach addresses the two major limitations of existing intensity-based 2-D/3-D registration technology: 1) slow computation and 2) small capture range. It directly estimates the transformation parameters, unlike iterative optimization.
Example: CNN based Registration!
1. Parameter Space Partition (PSP)
  – Partition the transformation parameter space into zones and train CNN regressors in each zone separately, to break down the complex regression task into multiple simpler sub-tasks.
2. Local Image Residual (LIR)
  – Simplify the underlying mapping to be captured by the regressors. LIR is calculated as the difference between the DRR (digitally reconstructed radiograph) rendered using transformation parameters t and the X-ray image in local patches.
3. Hierarchical Parameter Regression
  – Decompose the transformation parameters and regress them in a hierarchical manner, to achieve highly accurate parameter estimation in each step.
4. CNN Regression Model
121
Example: CNN based Registration!
122
Workflow of LIR feature extraction, demonstrated on X-ray Echo Fusion data. The local ROIs determined by the 3-D points and the transformation parameters are shown as red boxes. The blue box shows a large ROI that covers the entire object.
EVALUATION
123
124
Intensity Based Metrics
125
Landmark Based Metrics
• Identify a set of anatomical landmarks, and measure the distance between them before/after registration
126
Landmark Based Metrics
• Non-rigid registration error can be quantified with certainty only at the available landmarks, and with increasing uncertainty as distance from these landmarks increases.
• A very dense set of landmarks, which is generally not available, would thus be required to gain a complete, global understanding of registration accuracy.
128
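Measuring landmark distances before and after registration can be sketched as follows. The example assumes that the estimated transform maps moving-image landmark coordinates into the fixed-image frame and that coordinates are in voxels scaled by a spacing vector; the landmark values and the toy transform are made up for illustration.

```python
import numpy as np

def landmark_errors(fixed_landmarks, moving_landmarks, transform,
                    spacing=(1.0, 1.0, 1.0)):
    """Per-landmark Euclidean distance in physical units after mapping."""
    mapped = transform(np.asarray(moving_landmarks, dtype=float))
    diff = (mapped - np.asarray(fixed_landmarks, dtype=float)) * np.asarray(spacing)
    return np.linalg.norm(diff, axis=1)

fixed = np.array([[10.0, 20.0, 30.0],
                  [40.0, 50.0, 60.0],
                  [15.0, 25.0, 35.0]])
moving = fixed + np.array([2.0, 0.0, 0.0])      # misaligned by 2 voxels

def identity(p):
    return p                                     # "before registration"

def estimated(p):
    return p - np.array([1.5, 0.0, 0.0])         # imperfect correction

errors_before = landmark_errors(fixed, moving, identity)
errors_after = landmark_errors(fixed, moving, estimated)
```

The result is one error value per landmark, so it says nothing directly about the accuracy of the deformation between landmarks.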
Segmentation Based Metrics
• Tissue overlaps (after segmentation)
129
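Tissue overlap between two binary segmentations can be computed in a few lines. The sketch shows the Jaccard index together with the closely related Dice coefficient, which is a common companion measure added here for comparison; the masks are synthetic squares.

```python
import numpy as np

def jaccard(a, b):
    """Intersection over union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

def dice(a, b):
    """Twice the intersection divided by the sum of the two mask sizes."""
    total = a.sum() + b.sum()
    return float(2.0 * np.logical_and(a, b).sum() / total) if total else 1.0

seg_fixed = np.zeros((10, 10), dtype=bool)
seg_warped = np.zeros((10, 10), dtype=bool)
seg_fixed[2:6, 2:6] = True       # 16 pixels
seg_warped[4:8, 4:8] = True      # 16 pixels, overlapping in a 2x2 block

j = jaccard(seg_fixed, seg_warped)
d = dice(seg_fixed, seg_warped)
```

The two measures are monotonically related by D = 2J / (1 + J), so they rank registrations identically.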
Jacobian Determinants
• The Jacobian determinant is used to define a measure for the plausibility of the transformation by calculating the number of voxels with a negative value or a value of zero:
130
A value det(∇φ(x)) > 1 implies an expansion of volume; 0 < det(∇φ(x)) < 1 indicates a contraction.
φ(x): deformation field
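Counting voxels with a non-positive Jacobian determinant can be sketched in 2-D. The example assumes the deformation is stored as a displacement field on a unit-spaced grid, so that φ(x) = x + d(x), and uses finite differences for the derivatives; the function name and test fields are illustrative.

```python
import numpy as np

def jacobian_determinant_2d(disp_y, disp_x):
    """det(grad phi) for phi(x) = x + d(x), with d given per pixel."""
    dyy, dyx = np.gradient(disp_y)    # derivatives along rows (y), columns (x)
    dxy, dxx = np.gradient(disp_x)
    return (1.0 + dyy) * (1.0 + dxx) - dyx * dxy

yy, xx = np.mgrid[0:8, 0:8].astype(float)

# Uniform 20% expansion along both axes: phi(x) = 1.2 x.
det_expand = jacobian_determinant_2d(0.2 * yy, 0.2 * xx)

# Mirroring along x: phi_x = -x, which folds the grid.
det_fold = jacobian_determinant_2d(np.zeros_like(yy), -2.0 * xx)
n_folded = int(np.sum(det_fold <= 0))
```

For the expansion every determinant is 1.44 (> 1), while the mirrored field has a negative determinant everywhere, so every voxel would be counted as implausible.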
However, …
• Intensity, landmark, and segmentation based methods are highly unreliable!
132
CURT: Completely Useless Registration Tool
Tissue overlap measures (Jaccard index)
Inverse Consistency Error
• For each pair of fixed and moving images, A and B, and for each registration algorithm, the inverse consistency error between the forward transformation, $T_{AB}$, and the backward transformation, $T_{BA}$, is computed as follows:
133
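One common way to compute an inverse consistency error is the mean distance between each point and its round trip through both transformations. The sketch below assumes that form, which may differ in detail from the formula on the slide, and uses simple translations as stand-in transformations.

```python
import numpy as np

def inverse_consistency_error(t_ab, t_ba, points):
    """Mean Euclidean distance between x and T_BA(T_AB(x))."""
    round_trip = t_ba(t_ab(points))
    return float(np.mean(np.linalg.norm(round_trip - points, axis=1)))

# Points on a small 2-D grid in image A.
axes = np.meshgrid(np.arange(5.0), np.arange(5.0), indexing="ij")
points = np.stack(axes, axis=-1).reshape(-1, 2)

shift = np.array([3.0, -1.0])

def t_ab(p):
    return p + shift                              # forward transformation

def t_ba_exact(p):
    return p - shift                              # true inverse

def t_ba_off(p):
    return p - shift + np.array([0.3, 0.4])       # slightly inconsistent inverse

ice_exact = inverse_consistency_error(t_ab, t_ba_exact, points)
ice_off = inverse_consistency_error(t_ab, t_ba_off, points)
```

A zero error only shows that the two transformations are inverses of each other, not that either one is anatomically correct.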
Rohlfing’s TMI Paper (2012) suggests that
• Tissue overlap, image similarity, and inverse consistency error are not reliable surrogates for registration accuracy, whether they are used in isolation or combined.
• Ideally, actual registration errors measured at a large number of densely distributed landmarks (i.e., identifiable anatomical locations) should become the standard for reporting registration errors.
• The evaluation of registration performance should be made as independent as possible from the registration itself. Some obvious dependencies exist between images used for registration and features derived from them for evaluation.
136
Summary
137
Slide Credits and References
• Darko Zikic, MICCAI 2010 Tutorial
• Stefan Klein, MICCAI 2010 Tutorial
• Marius Staring, MICCAI 2010 Tutorial
• J. Ashburner, NeuroImage 2007.
• M. F. Beg, M. I. Miller, A. Trouvé and L. Younes. “Computing Large Deformation Metric Mappings via Geodesic Flows of Diffeomorphisms”. International Journal of Computer Vision 61(2):139–157 (2005).
• M. Vaillant, M. I. Miller, L. Younes and A. Trouvé. “Statistics on diffeomorphisms via tangent space representations”. NeuroImage 23:S161–S169 (2004).
• L. Younes, “Jacobi fields in groups of diffeomorphisms and applications”. Quart. Appl. Math. 65:113–134 (2007).
• www.nirep.org
• Rueckert and Schnabel’s Chapter 5, Medical Image Registration, Springer 2010. Biomedical Image Processing.
138