An Algorithm to Compute Independent Sets of Voxels for Parallelization of ICD-based Statistical Iterative Reconstruction
Sungsoo Ha and Klaus Mueller
Department of Computer Science
Visual Analytics and Imaging (VAI) Lab
Stony Brook University and SUNY Korea
Motivation
• Statistical Iterative Reconstruction Algorithm
[Figure: FBP vs. SIR]
Motivation
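• Statistical Iterative Reconstruction Algorithm
• Weighted Least Squares (WLS) cost function
$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \ge 0} \left\{ \frac{1}{2} (\mathbf{y} - \mathbf{A}\mathbf{x})^{T} \mathbf{W} (\mathbf{y} - \mathbf{A}\mathbf{x}) + R(\mathbf{x}) \right\}$$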
y: Measured projection data
x: Attenuation coefficients of the object to be reconstructed
A: System matrix of size M × N (M line integrals, N voxels)
W: Diagonal matrix for statistical weighting
R(x): Regularization
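As a minimal sketch of how the WLS cost is evaluated, the function below computes the data term plus an optional regularizer for small inputs. The name `wls_cost`, the dense list-of-rows layout for A, and the default zero regularizer are assumptions made for illustration; a real CT system matrix is far too large to store this way.

```python
def wls_cost(A, w_diag, x, y, regularizer=lambda x: 0.0):
    """Evaluate 0.5 * (y - A x)^T W (y - A x) + R(x).

    A       : list of rows, each a list of N coefficients (hypothetical dense layout)
    w_diag  : the M diagonal entries of the weighting matrix W
    x       : current image estimate (N values)
    y       : measured projection data (M values)
    """
    # Residual between the measurements and the forward projection A x.
    residual = [
        y_i - sum(a_ij * x_j for a_ij, x_j in zip(row, x))
        for row, y_i in zip(A, y)
    ]
    # Weighted squared residual; W is diagonal, so this is a plain weighted sum.
    data_term = 0.5 * sum(w * r * r for w, r in zip(w_diag, residual))
    return data_term + regularizer(x)


# Tiny example: 3 line integrals, 2 voxels.
A = [[1, 0], [1, 1], [0, 1]]
cost = wls_cost(A, [1, 2, 1], [1, 2], [1, 2, 2])
```

Every evaluation requires a full forward projection A x, which is where the cost of iterative reconstruction comes from.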
• High cost of forward & back projections
• Iterative nature of the algorithm
Motivation: optimization
                    ICD-based    CG-based
Convergence rate    FAST         SLOW
Parallelization     HARD         EASY
[Figure: voxel grouping patterns of existing parallel ICD variants, drawn on x, y, z axes: GCD (Fessler et al. 1997), B-ICD (Benson et al. 2010), ABCD (Fessler et al. 2011)]
Goal
• Devise an algorithm
– Find voxels that are “fully” independent of each other
– No additional algorithmic & computational complexity
– More accurate (though more complicated) pattern
– Applicable to all CT geometries
                    ICD-based    CG-based
Convergence rate    FAST         SLOW
Parallelization     HARD         EASY
Independency among voxels
• Single voxel update scheme
– Minimizing along one direction at a time
[Equation: single-voxel update, with terms labeled correction, weighting, and update]
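The single-voxel update can be sketched for the WLS data term alone. This is the textbook coordinate-descent step for a quadratic cost, not necessarily the exact form shown on the slide: the regularizer is omitted, and mapping the slide's "correction" and "weighting" labels to the numerator and denominator is an assumption. The function name and the dict-based sparse column layout are hypothetical.

```python
def icd_update_voxel(column, w_diag, residual, x, j):
    """Update voxel j in place and keep residual = y - A x consistent.

    column   : dict {row index i: a_ij}, the non-zeros of column j of A
    w_diag   : diagonal entries of W
    residual : current y - A x (modified in place)
    x        : current image estimate (modified in place)
    """
    # "Correction" (assumed label): a_j^T W e, the weighted back-projected error.
    correction = sum(a * w_diag[i] * residual[i] for i, a in column.items())
    # "Weighting" (assumed label): a_j^T W a_j, the curvature along voxel j.
    weighting = sum(a * a * w_diag[i] for i, a in column.items())
    if weighting == 0.0:
        return 0.0  # voxel j is not seen by any ray; nothing to update
    # Exact 1D minimizer, clipped to enforce x >= 0.
    new_value = max(0.0, x[j] + correction / weighting)
    delta = new_value - x[j]
    x[j] = new_value
    # Only the rows touched by column j change in the residual.
    for i, a in column.items():
        residual[i] -= a * delta
    return delta


# Two voxels sharing row 1, so their updates cannot safely run in parallel.
columns = [{0: 1.0, 1: 1.0}, {1: 1.0, 2: 1.0}]
w = [1.0, 1.0, 1.0]
x = [0.0, 0.0]
e = [1.0, 3.0, 2.0]  # equals y because x starts at zero
icd_update_voxel(columns[0], w, e, x, 0)
```

The residual write-back touches only the rows where the voxel's column is non-zero, which is why two voxels with non-overlapping columns can be updated at the same time.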
Single voxel update
[Figure: cone-beam setup with x-ray source, object, and flat detector; the detector regions related to voxel A and to voxel B are highlighted]
Independent voxel
• CT system matrix view: system matrix of size M × N
– M: # of line integrals
– N: # of voxels
[Figure: system-matrix columns for voxels A, B, and C, marking the overlap between A & C and between B & C]
• Independent
– A, B
• Dependent
– A, C
– B, C
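In the system-matrix view, two voxels are independent when their columns have no non-zero row in common. A minimal sketch of that test, representing each column by the set of row indices where it is non-zero (the row indices and the function name are made up for illustration):

```python
def are_independent(support_a, support_b):
    """True if the two columns share no non-zero row (no common line integral)."""
    return support_a.isdisjoint(support_b)


# Hypothetical supports mirroring the slide: C overlaps both A and B.
A = {0, 1}
B = {4, 5}
C = {1, 2, 4}

pairs = {
    "A,B": are_independent(A, B),
    "A,C": are_independent(A, C),
    "B,C": are_independent(B, C),
}
```

Here A and B come out independent, while C depends on both, matching the example on the slide.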
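Finding set of independent voxels
• Knapsack problem:
$$\min \; \mathrm{ZERO}\Big\{ \sum_{g \in G} g \Big\} \quad \text{s.t.} \quad G = \{ a_k \mid 1 \le k \le N \}, \quad a_m \cap a_n = \mathbf{0} \;\; \forall\, a_m, a_n \in G,\; m \ne n$$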
• Combinatorial NP-hard problem
[Figure: example system-matrix columns A, B, C, D, E, F and a candidate group G built from them]
• First-Fit Decreasing algorithm
1. Sort voxels in descending order of the number of non-zero elements
2. Pick the voxel that contains the largest number of non-zero elements
3. Invalidate all voxels that depend on the selected voxel
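The three steps above can be sketched as a greedy grouping routine. This is an illustration under stated assumptions, not the authors' implementation: each column is reduced to its set of non-zero row indices, and the outer loop that repeats the steps on the leftover voxels until every voxel belongs to a group is an assumption about how the procedure continues. The function name is hypothetical.

```python
def first_fit_decreasing_groups(supports):
    """Partition voxels into groups of mutually independent voxels.

    supports[j] is the set of row indices where column j of A is non-zero.
    Returns a list of groups, each a list of voxel indices.
    """
    # Step 1: sort voxel indices by descending number of non-zero elements.
    remaining = sorted(range(len(supports)),
                       key=lambda j: len(supports[j]), reverse=True)
    groups = []
    while remaining:
        covered = set()           # rows already used by the current group
        group, deferred = [], []
        for j in remaining:
            # Step 2: take the largest remaining voxel that still fits.
            if covered.isdisjoint(supports[j]):
                group.append(j)
                covered |= supports[j]
            else:
                # Step 3: a voxel that depends on the group is invalidated
                # for this group and deferred to a later one (assumed).
                deferred.append(j)
        groups.append(group)
        remaining = deferred
    return groups


# Hypothetical supports: voxel 2 overlaps voxels 0 and 1.
supports = [{0, 1}, {2, 3}, {1, 2, 4}, {5}]
groups = first_fit_decreasing_groups(supports)
```

Because the first voxel of every pass always fits an empty group, each pass assigns at least one voxel and the loop terminates. All voxels within one returned group can be updated in parallel.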
Experiment settings
• Cone-beam CT geometry
• Volume: 128 x 128 x 128 (1 x 1 x 1 mm)
• Flat detector: 512 x 512 (1 x 1 mm)
• SAD: 600 mm
• SID: 1000 mm
• The number of projections
– Varying from 1 to 360
– Uniformly distributed over 360 degrees
Extreme case study
# views   # independent groups   Max. size of independent group   Avg. size of independent group
1         187                    16,186                           11,214
360       13,569                 449                              154
• ABCD (Axial Block Coordinate Descent) algorithm
– Along z-direction: 128
• More parallelism
• No additional complexity
Theoretical parallelism
• Expected speed-up (theoretical parallelism) with ideal GPU implementation
Estimated gain of GPU-accelerated OS-SIR
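$$\mathrm{gain}^{\mathrm{GPU}}_{\mathrm{OS\text{-}SIR}} = \frac{\mathrm{parallelism}}{360 \,/\, (\#\text{ of views per subset})}, \qquad \mathrm{parallelism} = \frac{128^{3}}{\#\,\text{updates}}$$
[Plot: estimated gain vs. number of views per subset]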
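The arithmetic behind the theoretical parallelism can be checked with a short sketch. It assumes that "# updates" equals the number of independent groups, and that the gain divides the parallelism by the number of ordered subsets (360 / views per subset). Both are readings of the slide rather than confirmed details, and the function names are hypothetical.

```python
N_VOXELS = 128 ** 3  # volume size from the experiment settings


def parallelism(num_updates):
    """Average number of voxels that can be updated together."""
    return N_VOXELS / num_updates


def estimated_gain(num_updates, views_per_subset, total_views=360):
    """Parallelism divided by the number of subsets processed in sequence."""
    num_subsets = total_views / views_per_subset
    return parallelism(num_updates) / num_subsets


# Group counts taken from the extreme case study table.
p_one_view = parallelism(187)       # about 11,214 voxels per update
p_all_views = parallelism(13569)    # about 154 voxels per update
```

Under these assumptions the parallelism values reproduce the average group sizes in the table (11,214 and 154), compared with 128 for ABCD along the z-direction.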
Independence visualization
[Figure: independence patterns at slices 32 (bottom), 64 (middle), and 96 (top) for 1, 5, 10, 20, 45, 90, 180, and 360 views]
• At 360 views
[Figure: independence patterns at slices 32 (bottom) and 96 (top)]
• A clue for optimism
[Figure: independence patterns at slices 32 (bottom) and 96 (top), 1 view vs. 360 views]
Conclusion & Future work
• More parallelism than existing methods
– No additional complexity
– One-time computation
– Applicable to all CT geometries
• Hints for GPU implementation of SIR
• Apply to actual GPU-accelerated SIR framework
– Determine optimal computational performance
– Convergence rate
Thanks!
• Q&A
• This research was partially supported by NSF grant IIS-11732 and the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ‘IT Consilience Creative Program (ITCCP)’ (NIPA-2013-H0203-13-1001) supervised by NIPA (National IT Industry Promotion Agency).