Validation of Clustering Methods for PET Data

Prasanna K Velamuru

Advisors:
Dr. Rosemary A. Renaut
Dr. Hongbin Guo
Dr. Huan Liu
Outline
• Motivation
• Clustering Algorithms Validated
• Validation Metrics
• Results
• Significance of Results
• Conclusions and Future Work
PET – Quick Overview
• Diagnostic imaging technique
• Provides functional information
• Gives knowledge of the biochemical basis of both normal and abnormal function
• Integral part of clinical care
  - Oncology, Cardiology, Neurology/Psychiatry
Domain of PET Data Used in This Study
• Measurement of biochemical and physiological parameters from dynamic human brain PET data
• Three-compartment FDG tracer kinetic model
Dynamic Human Brain PET Data
• Siemens 951/31 scanner
• Output image matrix: 128 x 128 x 31 x 21
• 4-dimensional: 3 in space (X, Y, Z), 1 in time
[Figure: sample frames at T = 0.1 min, 0.55 min, 5.75 min, 45 min]
Characteristics of Dynamic PET Data
• Output data is very noisy
  - Low signal-to-noise ratio (SNR)
• Data obtained at irregular time intervals
  - Initial time intervals are very short: to observe the gradient in the output
  - The last time interval is long
    - Output changes little
    - Useful for determining the long-term decay term in the output
    - Contributes significantly to the estimation of the kinetic rates
Time Activity Curves (TACs)
• Vector: [X1, X2, …, X20, X21]
• Defined for each voxel/pixel
• Total for each slice: 128 x 128 = 16,384 TACs
• For the entire brain volume: 128 x 128 x 31 = 507,904 TACs
Integrals
• Obtained by multiplying each individual TAC by the vector of time intervals:
  [X1, X2, …, X20, X21] · [∆t1, ∆t2, …, ∆t20, ∆t21]^T,
  with frame durations [∆t1, …, ∆t21] = [0.0333, 0.0333, …, 0.5000, …, 1.5000, …, 10.000, 30.000]
• Yields a single scalar value
• Approximate estimate
• Reduces the dimension of the data
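The reduction above is just a dot product of each TAC with the frame-duration vector. A minimal NumPy sketch (not the study's code; six illustrative durations stand in for the real scan's 21 frames):

```python
import numpy as np

# Illustrative frame durations patterned on the slide (real scan: 21 frames)
dt = np.array([0.0333, 0.0333, 0.5, 1.5, 10.0, 30.0])

# One TAC per row: toy data with constant activity 1.0 in every frame
tacs = np.ones((4, dt.size))

# Weighted sum over time collapses each TAC to a single scalar integral
integrals = tacs @ dt

print(integrals)  # every voxel reduces to sum(dt)
```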
Clustering: In the Context of PET Data
• Important preprocessing step performed prior to parametric estimation from dynamic PET data
• A form of segmentation
• Provides better information
• Partially overcomes the effect of noise in the data
• Improves the accuracy of voxel-level quantification in PET images
Motivation
• New preprocessing clustering techniques designed and published by Guo et al.
  - Significantly reduce the overall time for clustering
• Basic initial validation done in the past
  - Not comprehensively compared against and validated with classical clustering methods
Preliminary Concepts: Distance Measures
• Measurement of (dis)similarity between multivariate vectors / integrals
• Weighted Minkowski p-norm:
  d_p(x, y) = ( Σ_{l=1}^{n} |(x_l − y_l) w_l|^p )^{1/p}
• Distances weighted by the time-duration intervals [∆t1, ∆t2, …, ∆t20, ∆t21]
• Weighted Euclidean ("timeL2") and weighted Manhattan ("timeL1")
• Found to be good distance measures for PET
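Spelled out in code, the two special cases of the weighted Minkowski norm look like this (an illustrative sketch; the function names are mine, with the frame durations used as weights):

```python
import numpy as np

def minkowski_weighted(x, y, w, p):
    """Weighted Minkowski p-norm distance between two TACs x and y."""
    return np.sum(np.abs((x - y) * w) ** p) ** (1.0 / p)

def time_l1(x, y, dt):
    """Weighted Manhattan ("timeL1"): p = 1, weights = frame durations."""
    return minkowski_weighted(x, y, dt, p=1)

def time_l2(x, y, dt):
    """Weighted Euclidean ("timeL2"): p = 2, weights = frame durations."""
    return minkowski_weighted(x, y, dt, p=2)

dt = np.array([0.5, 1.0, 2.0])   # toy frame durations
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 1.0, 5.0])
print(time_l1(x, y, dt))  # |0*0.5| + |1*1.0| + |2*2.0| = 5.0
print(time_l2(x, y, dt))  # sqrt(0 + 1^2 + 4^2) = sqrt(17)
```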
Preliminary Concepts: Histogram-Based Thresholding
Properties of the last frame:
• Provides more accurate image reconstruction
• Higher SNR compared to the short-time-interval frames
• FDG accumulation is higher
• The distribution of voxel intensities correlates more closely with the spatial variation of tracer in tissue
• Supports thresholding to identify active voxels
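One way last-frame thresholding might be sketched (the quantile rule below is a placeholder assumption of mine, not necessarily the histogram rule used in the study):

```python
import numpy as np

def active_voxel_mask(last_frame, quantile=0.6):
    """Mark voxels whose last-frame intensity exceeds a threshold.
    The quantile-based threshold here is illustrative only."""
    threshold = np.quantile(last_frame, quantile)
    return last_frame > threshold

rng = np.random.default_rng(0)
last_frame = rng.random((8, 8))      # toy 8x8 final-frame image
mask = active_voxel_mask(last_frame)
print(mask.sum(), "active voxels out of", mask.size)
```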
Preliminary Concepts: Preclustering
• Initial thresholding
  - Does not provide enough information to cluster the data
  - Features over the entire set of TACs should be included
• Compute a mean TAC
  - From the set of TACs containing the voxels with the highest frequency
• The mean TAC is used to find an initial cluster
• The search is performed over all voxels
  - Distances are measured with respect to the entire TAC
Preclustering: Illustration
[Figure: precluster example; frequency = 5934, density ≈ 0.03]
Clustering Algorithms Used for Validation: Hierarchical Clustering
• Agglomerative (bottom-up)
• Algorithm:
  - Initialize: each item as a cluster
  - Iterate:
    - select the two most similar clusters
    - merge them
  - Halt: when the required number of clusters is reached
[Figure: dendrogram over items 1-5, from the 5-clustering up to the 1-clustering]
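The iterate-and-merge loop above can be sketched as a naive O(n^3) agglomerative procedure (illustrative only; practical implementations use smarter data structures):

```python
import numpy as np

def agglomerate(points, k, linkage):
    """Merge the two most similar clusters until only k remain.
    `linkage(a, b)` returns the dissimilarity of two clusters,
    each given as an array of rows from `points`."""
    clusters = [[i] for i in range(len(points))]  # each item starts alone
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(points[clusters[i]], points[clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters

# Centroid distance as one possible linkage rule
centroid = lambda a, b: np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))

pts = np.array([[0.0], [0.1], [5.0], [5.1]])
clusters = agglomerate(pts, 2, centroid)
print(clusters)  # the two nearby pairs end up together
```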
Clustering Algorithms Used for Validation: HAL
• Average linkage
  - Cluster similarity = average similarity of all pairs
Clustering Algorithms Used for Validation: HCL
• Centroid linkage
  - Cluster similarity = similarity between centroids
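The two linkage rules differ only in how cluster-to-cluster dissimilarity is defined; a minimal sketch (function names are mine):

```python
import numpy as np

def average_linkage(A, B):
    """HAL: average pairwise distance between all points of A and B."""
    diffs = A[:, None, :] - B[None, :, :]
    return np.linalg.norm(diffs, axis=2).mean()

def centroid_linkage(A, B):
    """HCL: distance between the two cluster centroids."""
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))

A = np.array([[0.0, 0.0], [2.0, 0.0]])
B = np.array([[0.0, 3.0], [2.0, 3.0]])
print(average_linkage(A, B))   # mean of the 4 pairwise distances
print(centroid_linkage(A, B))  # centroids (1,0) and (1,3): distance 3.0
```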
Clustering Algorithms Used for Validation: HCL1/HAL1 (with Preclustering)
• Algorithm outline:
  - Initialization phase
    - Filter out active and inactive voxels
    - Thresholding using the final frame
  - Determine preclusters within each active-voxel interval
  - Repeat the preclustering process until
    - no more active-voxel intervals are available, or
    - the maximum iteration number is reached
  - Perform HAL/HCL on the reduced data set containing active voxels
    - Preclusters and isolated voxels
Clustering Algorithms Used for Validation: HCL2/HAL2 (with Preclustering and Merging)
• Algorithm outline:
  - Perform the same initialization and preclustering steps as in HCL1/HAL1
  - Calculate the mean TAC for all preclusters
  - For all isolated voxels:
    - calculate the distance to the mean TACs of the preclusters
    - merge the voxel with the precluster of closest similarity (minimum distance)
  - Perform HAL/HCL on the reduced data set containing active voxels
    - Preclusters only
Clustering Algorithms Used for Validation: K-Means
• Partitional clustering algorithm
  - Iterative relocation
• Locally minimizes the sum of squared distances ("energy") between the data points and their corresponding cluster centers:
  E = Σ_{l=1}^{k} Σ_{x_i ∈ X_l} [d(x_i, µ_l)]^2
• Initialization of the K cluster centers:
  - Totally random
  - Random perturbation from the global mean
  - Heuristic to ensure well-separated centers
Clustering Algorithms Used for Validation: K-Means Illustration
[Figure sequence: randomly initialize means → assign points to clusters → re-estimate means → re-assign points to clusters → re-estimate means → re-assign points to clusters → re-estimate means and converge]
Clustering Algorithms Used for Validation: K-Means
• Algorithm outline:
  - Initialize the K cluster centers µi randomly, 1 ≤ i ≤ K
  - Repeat until convergence:
    - Cluster assignment step: assign each data point x to the cluster Xi such that the "timeL2"/"timeL1" distance of x from µi (the center of Xi) is minimum
    - Center re-estimation step: re-estimate each cluster center µi as the mean of the points in that cluster
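The two alternating steps can be sketched in plain NumPy (Euclidean distance here; the study's variants substitute the timeL1/timeL2 weighted distances; no empty-cluster handling, toy data only):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random init
    for _ in range(iters):
        # Cluster assignment step: nearest center for each point
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Center re-estimation step: mean of each cluster's points
        new = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):
            break  # converged
        centers = new
    return labels, centers

X = np.array([[0.0], [0.2], [9.8], [10.0]])
labels, centers = kmeans(X, 2)
print(labels, centers.ravel())  # the two tight pairs form the clusters
```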
Clustering Algorithms for Dynamic PET: Intangibles
• Unsupervised
  - No predefined classes
  - No specific examples showing which relationships within the data are of biological significance
• Optimal number of clusters
  - Not known a priori
  - Difficult to rely on visual perception due to noise and high dimension
• How appropriate is the clustering method for the data at hand?
  - Would it be possible to have a better set of clusters?
• Trade-off between computational cost and cluster quality
  - How much tolerance is accepted?
Clustering of PET Data: Average Computational Costs (Minutes) (2-D, Slice 16)

#Clusters        7             6             5             4             3
K-Means    (0.34,0.79)   (0.28,0.33)   (0.16,0.19)   (0.18,0.20)   (0.12,0.16)
HCL2       (0.19,0.41)   (0.20,0.40)   (0.27,0.47)   (0.52,0.65)   (0.44,1.43)
HCL1       (0.71,3.59)   (0.84,3.85)   (0.83,3.23)   (1.29,3.49)   (1.70,3.83)
HCL        (113,183)     (120,184)     (122,184)     (122,184)     (124,185)
HAL2       (1.20,5.59)   (1.58,6.23)   (1.22,6.11)   (1.29,6.89)   (2.06,8.28)
HAL1       (0.53,7.3)    (1.04,7.68)   (0.56,8.13)   (1.53,7.95)   (2.31,8.28)
HAL        (113,186)     (120,186)     (122,185)     (122,187)     (128,191)
Cluster Validation
• Procedure for evaluating the results of a clustering algorithm in a quantitative and objective manner
• Index of cluster validity
  - Used to measure the quality and appropriateness of a clustering structure
• Approach employed in this study:
  - Intra-cluster characteristics (compactness; low magnitude)
  - Inter-cluster characteristics (well-separated; high magnitude)
  - Measures based on a combination of both (overall assessment)
Cluster Validation Metrics: Intra-Cluster
• Average distance from mean
  - For each element x_i in cluster j, 1 ≤ j ≤ p:
    m_{ji} = d(x_i, µ_j), where µ_j = (1/n_j) Σ_m x_{jm}
  - For each cluster:
    m_j = (1/n_j) Σ_i m_{ji}
  - Over all p clusters:
    m = (1/p) Σ_j m_j
  - Sensitive to noisy points
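In code, the three levels of averaging look like this (an illustrative sketch using Euclidean distance; the function name is mine):

```python
import numpy as np

def avg_distance_from_mean(clusters):
    """clusters: list of (n_j, d) arrays. Returns the per-cluster
    averages m_j and the overall average m over all p clusters."""
    m_per_cluster = []
    for Xj in clusters:
        mu = Xj.mean(axis=0)                    # cluster mean
        m_ji = np.linalg.norm(Xj - mu, axis=1)  # element distances
        m_per_cluster.append(m_ji.mean())       # m_j
    return m_per_cluster, float(np.mean(m_per_cluster))  # m

c1 = np.array([[0.0, 0.0], [2.0, 0.0]])  # mean (1,0): both at distance 1
c2 = np.array([[5.0, 5.0], [5.0, 9.0]])  # mean (5,7): both at distance 2
per_cluster, overall = avg_distance_from_mean([c1, c2])
print(per_cluster, overall)  # [1.0, 2.0] 1.5
```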
Cluster Validation Metrics: Intra-Cluster
• Maximum distance from mean
  - Measures the distance from edge to center (radius); influences the HCL algorithm
  - Highly sensitive to noise
• Maximum diameter
  - Distance between the farthest points within a cluster
  - Tests the width of a cluster; sensitive to noisy points
Cluster Validation Metrics: Intra-Cluster
• Average spread
  - Average distance of elements within a cluster to all other elements of the same cluster
  - Measure of the homogeneity of a cluster
• Total energy
  - Sum of squared distances to the mean over all cluster points (recall: the K-means objective function)
  - Expected to be smallest for K-means
  - Useful measure for comparing hierarchical methods with K-means
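Average spread and total energy can be sketched the same way (illustrative, Euclidean distance, per single cluster):

```python
import numpy as np

def average_spread(Xj):
    """Mean distance of each element to all other elements of its cluster."""
    d = np.linalg.norm(Xj[:, None, :] - Xj[None, :, :], axis=2)
    n = len(Xj)
    return d.sum() / (n * (n - 1))  # exclude the zero self-distances

def total_energy(Xj):
    """Sum of squared distances to the cluster mean (K-means objective)."""
    return float(((Xj - Xj.mean(axis=0)) ** 2).sum())

Xj = np.array([[0.0], [2.0]])
print(average_spread(Xj))  # the only pair is at distance 2 -> 2.0
print(total_energy(Xj))    # mean 1: (0-1)^2 + (2-1)^2 = 2.0
```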
Cluster Validation Metrics: Inter-Cluster
• Separation
  - For each element: average distance of element i in cluster j to the elements of cluster k (k ≠ j)
  - For each cluster: average of the individual separation values
  - Measure of the average distance of separation
• Minimum separation
  - min(separation values for a cluster)
  - Sensitive to noisy points
  - Useful for detecting questionable cluster assignments
Cluster Validation Metrics: Inter-Cluster
• Average split
  - Split: distance from point i in cluster j to the closest point in cluster k
  - Average the split values over all points in a cluster, then over all clusters
  - How close are two neighboring clusters?
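The two inter-cluster measures for a single pair of clusters can be sketched as (illustrative; function names are mine):

```python
import numpy as np

def separation(Xj, Xk):
    """Average distance from the elements of cluster j to the
    elements of cluster k (k != j)."""
    d = np.linalg.norm(Xj[:, None, :] - Xk[None, :, :], axis=2)
    return d.mean()

def average_split(Xj, Xk):
    """Distance from each point of cluster j to its closest point
    in cluster k, averaged over cluster j."""
    d = np.linalg.norm(Xj[:, None, :] - Xk[None, :, :], axis=2)
    return d.min(axis=1).mean()

Xj = np.array([[0.0], [1.0]])
Xk = np.array([[3.0], [5.0]])
print(separation(Xj, Xk))     # mean of 3, 5, 2, 4 = 3.5
print(average_split(Xj, Xk))  # mean of min(3,5)=3 and min(2,4)=2 -> 2.5
```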
Cluster Validation Metrics: Silhouette
• Measures the standardized difference between b(i) and a(i):
  s(i) = (b(i) − a(i)) / max(a(i), b(i))
  - a(i): average dissimilarity of object i to all other objects in its own cluster (average spread)
  - b(i): average dissimilarity of object i to all objects in its nearest cluster (separation)
  - s(i) close to 1: well classified; close to 0: not certain; close to −1: misclassified
• Average silhouette width
  - Average value over all clusters
  - > 0.5: good classification
  - < 0.2: lack of cluster structure
  - Useful for determining the optimal number of clusters (maximum width)
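A direct translation of the silhouette definition (an illustrative sketch; scikit-learn's `silhouette_score` offers a production version):

```python
import numpy as np

def silhouette_values(X, labels):
    """s(i) = (b(i) - a(i)) / max(a(i), b(i)) for every point."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    s = np.empty(len(X))
    for i in range(len(X)):
        own = (labels == labels[i])
        # a(i): mean dissimilarity to the rest of i's own cluster
        a = d[i][own].sum() / max(own.sum() - 1, 1)
        # b(i): smallest mean dissimilarity to any other cluster
        b = min(d[i][labels == l].mean() for l in set(labels) - {labels[i]})
        s[i] = (b - a) / max(a, b)
    return s

X = np.array([[0.0], [0.5], [10.0], [10.5]])
labels = np.array([0, 0, 1, 1])
s = silhouette_values(X, labels)
print(s.mean())  # well-separated clusters -> close to 1
```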
Cluster Validation Metrics: Modified Dunn’s Index*
• General form of Dunn’s index for a partition matrix U
• Modified Dunn’s index (average linkage)
• Modified Dunn’s index (complete linkage)

*Bezdek, J. C. and Pal, N. R., Some new indexes of cluster validity, IEEE Trans. on Systems, Man, and Cybernetics, Part B: Cybernetics 28 (1998), no. 3, 301-315.
Cluster Validation Metrics: Modified Dunn’s Index
• Modified Dunn’s index (combined average)
• All three measures:
  - Useful for determining the optimal number of clusters
  - Maximum value: indicator of the optimal number
  - Complete linkage: affected by noisy points
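The index formulas on these two slides did not survive the transcript; for reference, the classical Dunn's index for a c-partition {X_1, …, X_c}, which the cited Bezdek and Pal paper generalizes, has the form:

```latex
% delta(X_i, X_j): an inter-cluster distance; Delta(X_k): a cluster diameter.
% The modified indexes substitute average- or complete-linkage variants
% of delta and Delta.
V_D = \min_{1 \le i \le c} \;
      \min_{\substack{1 \le j \le c \\ j \ne i}}
      \left\{ \frac{\delta(X_i, X_j)}{\max_{1 \le k \le c} \Delta(X_k)} \right\}
```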
Results
Conclusions
• Methods comparable to each other
  - Fast hierarchical methods are comparable to the more expensive traditional methods
• K-means
  - Greater average dissimilarity within clusters
  - Fast
  - Modestly well-separated clusters
• Optimal number of clusters
  - Not very conclusive based on the given measures
  - Most instances give a value between 3 and 6
  - Domain-expert knowledge is required to determine biological relevance
Future Work
• Fuzzy clustering schemes, SOM, PCA
• Using the fast methods to exclusively study the entire brain volume (in progress)
• Advanced statistical techniques
• More indices to obtain the optimal number of clusters (Davies-Bouldin, modified Hubert’s Gamma statistic)
• Publish all results and scripts
Things I Learned
• Practical application and use of clustering concepts
  - Very broad field, several applications, active research area
• Honed Matlab scripting skills
• Opportunity to deal with real data and understand the associated challenges
• Patience: invaluable in such studies
References
• Phelps, M. E., 1992. Positron Emission Tomography. In: Mazziotta, J. and Gilman, S. (Eds.), Clinical Brain Imaging: Principles and Applications.
• Guo, H., Renaut, R., Chen, K. and Reiman, E., 2003. Clustering huge data sets for parametric PET imaging. Biosystems 71(1-2), 81-92.
• Jain, A. K., Murty, M. N. and Flynn, P. J., 1999. Data clustering: a review. ACM Computing Surveys 31(3), 264-323.
• Kaufman, L. and Rousseeuw, P., 1990. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley and Sons, New York, NY.
• Everitt, B. S., Landau, S. and Leese, M., 2001. Cluster Analysis, 4th Edition. Edward Arnold, London, UK.
• Halkidi, M., Batistakis, Y. and Vazirgiannis, M., 2001. On clustering validation techniques. Journal of Intelligent Information Systems 17(2/3), 107-145.
• Kimura, Y., Senda, M. and Alpert, N., 2002. Fast formation of statistically reliable FDG parametric images based on clustering and principal components. Phys. Med. Biol. 47(3), 455-468.
• Jain, A. K. and Dubes, R. C., 1988. Algorithms for Clustering Data. Prentice Hall.
Acknowledgements
• Dr. Rosemary Renaut
• Dr. Hongbin Guo
• Dr. Huan Liu