![Page 1: Feature Identification for Colon Tumor Classification UCI Interdisciplinary Computational and Applied Mathematics Program Representative: Anthony Hou Joint](https://reader035.vdocument.in/reader035/viewer/2022062619/55199c2255034648068b4a11/html5/thumbnails/1.jpg)
Feature Identification for Colon Tumor Classification
UCI Interdisciplinary Computational and Applied Mathematics Program Representative:
Anthony Hou
Joint Work with Melody Lim, Janine Chua, Natalie Congdon
Faculty Advisors: Dr. Fred Park, Dr. Ernie Esser, and Anna Konstorum
Problem Statement
[Figure: tumor spheroids; panels: Control, Chemical Added]
Biological Background
Hepatocyte Growth Factor (HGF) has been shown to be elevated in the colon tumor microenvironment (in vivo)
Increased HGF is correlated with increased growth and dispersiveness
[Figure: tumor spheroids; panels: Control, +HGF]
Experimental Approach
Data obtained from the laboratory of Dr. Marian Waterman in the Department of Microbiology at UC Irvine
Cell line used: primary ‘colon cancer initiating cells’ (CCICs)
Cultured CCICs were trypsinized and spun down
Experimental Approach (cont.)
Single cells were plated in 96-well ultra-low attachment plates with DMEM and supplement, with or without HGF at various concentrations
CCICs were imaged at 10x magnification once a day for 12 days
[Figure: spheroid grown in media + 50 ng/ml HGF, day 8]
Our Motivational Goal
Given a set of data, biologists can see the qualitative effect of a high versus a low concentration of HGF.
We want to find the feature(s) that can discriminate between tumor spheroids grown at high and at low concentrations of HGF.
We hope this discovery can indicate which features are useful in helping biologists measure the amount of HGF in a given colon tumor spheroid.
Image Processing/Computer Vision Background
Classification
We humans have an innate ability to learn to distinguish one object from another
[Figure panels: Control, +HGF]
Now, how can we automate this process for biological images?
Classification Approach
Image Processing
Mathematical features
Shape features: Area, Perimeter/Area, Circularity Ratio, Eccentricity
Texture features: Total Variation/Area, Average Intensity
Why these 6 features?
Given feature: Day
Fisher’s Linear Discriminant (FLD) Classification
Processing Data
Raw +HGF tumor
Segmented +HGF tumor
Thresholded binary image
Boundary of +HGF tumor
Binary image with boundary applied
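The thresholding and boundary steps named on this slide can be sketched as follows. This is a minimal, dependency-free illustration and not the group's actual pipeline: the function names, the fixed threshold `level`, and the assumption that the spheroid is darker than the background are all choices made here for the example.

```python
def threshold(image, level):
    """Binary mask: 1 where the pixel is darker than `level`, else 0.

    Assumes (hypothetically) that the spheroid is darker than the background.
    """
    return [[1 if px < level else 0 for px in row] for row in image]


def boundary(mask):
    """Mark mask pixels that touch background or the image edge (4-neighbours)."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not mask[rr][cc]:
                    out[r][c] = 1
                    break
    return out


# Toy 5x5 image: bright background (200) with a dark 3x3 blob (50).
image = [[200] * 5 for _ in range(5)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 50

mask = threshold(image, level=128)
edge = boundary(mask)
```

In practice the threshold would likely be chosen per image (or automatically) rather than fixed.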
Shape Information
Features from Given Shape
• Area
• Perimeter/Area
• Circularity Ratio
• Eccentricity
HGF Binary
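As a rough sketch, three of the shape features listed above could be computed from a binary mask as shown below. The perimeter is estimated here by counting exposed pixel edges, which is one simple choice among several and tends to overestimate the perimeter of curved shapes. The function name and dictionary keys are hypothetical, and eccentricity is left out of this sketch.

```python
import math


def shape_features(mask):
    """Area, perimeter/area and circularity ratio of a non-empty binary mask.

    mask is a list of rows containing 0/1 values.
    """
    rows, cols = len(mask), len(mask[0])
    area = sum(sum(row) for row in mask)
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            # Each side of a mask pixel facing background (or the image
            # border) contributes one unit of perimeter.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if not inside or not mask[rr][cc]:
                    perimeter += 1
    return {
        "area": area,
        "perimeter_over_area": perimeter / area,
        # A circle with perimeter P has area P^2 / (4*pi), so
        # (shape area) / (circle area) = 4*pi*A / P^2.
        "circularity_ratio": 4 * math.pi * area / perimeter ** 2,
    }


# A 4x4 square: area 16, perimeter 16.
square = [[1] * 4 for _ in range(4)]
features = shape_features(square)
```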
Image Information
Features from Given Image
• Total Variation
• Average Intensity
HGF Segmented
Classification
<V1,V2, …Vn>
Each tumor is mapped to a feature vector, which corresponds to a point in a high-dimensional space. Now how do we separate the two groups?
Fisher’s Linear Discriminant
Describe mapping
Fisher’s Linear Discriminant: find the projection direction that maximizes the ratio of between-class (inter-class) variance to within-class (intra-class) variance
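To make the idea concrete, here is a minimal two-class sketch in plain Python. It uses the standard closed-form result that the maximizing direction is proportional to S_W^{-1}(m1 - m0), where S_W is the within-class scatter matrix and m0, m1 are the class means. The function names, the midpoint threshold, and the toy data are assumptions for illustration; the project's own FLD code is not shown in the slides.

```python
def mean_vector(rows):
    """Component-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mean):
    """Scatter matrix: sum of outer products of deviations from the mean."""
    d = len(mean)
    s = [[0.0] * d for _ in range(d)]
    for x in rows:
        diff = [xi - mi for xi, mi in zip(x, mean)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Fails with ZeroDivisionError if the matrix is singular.
    """
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_fld(class0, class1):
    """Return (w, t): projection direction and decision threshold."""
    m0, m1 = mean_vector(class0), mean_vector(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    sw = [[p + q for p, q in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    w = solve(sw, [q - p for p, q in zip(m0, m1)])
    # Assumption: threshold halfway between the projected class means.
    proj0 = sum(wi * mi for wi, mi in zip(w, m0))
    proj1 = sum(wi * mi for wi, mi in zip(w, m1))
    return w, 0.5 * (proj0 + proj1)


def predict(w, t, x):
    """Class 1 if the projection of x onto w exceeds the threshold, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > t else 0


# Toy two-feature data (made up): class 0 near (2, 2), class 1 near (6, 6).
control = [(1, 2), (2, 1), (2, 3), (3, 2)]
hgf = [(6, 5), (7, 6), (6, 7), (5, 6)]
w, t = fit_fld(control, hgf)
```

With real features on very different scales (area versus a ratio, for example), rescaling the inputs first may help numerical stability.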
Project Overview
Develop a classification scheme for colon tumor spheroids grown in media with and without HGF
Broader goal is to obtain quantitative understanding of HGF action on tumor spheroids.
Feature vectors can be utilized to quantify HGF action on tissue growth in vitro.
Results
Ran FLD code on 6 features: Area, Circularity Ratio, Average Intensity, Eccentricity, Perimeter/Area, TV/Area
Trained on half the data
Repeated Random Sub-sampling Cross Validation was used on all tests
Percent Correct for Control: 91.50%
Percent Correct for +HGF: 90.99%
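The evaluation protocol described on these slides (train on a random half of the data, repeat over many random splits, report percent correct per class) might look like the sketch below. The classifier is passed in as `fit`/`predict` callables. The nearest-mean classifier and the toy one-feature data are stand-ins for illustration only, not the FLD code or the data used in the project, and the number of rounds is an arbitrary assumption.

```python
import random


def repeated_subsampling(samples, labels, fit, predict, rounds=100, seed=0):
    """Percent correct per class (0 = control, 1 = +HGF) over random half splits."""
    rng = random.Random(seed)
    correct = {0: 0, 1: 0}
    total = {0: 0, 1: 0}
    indices = list(range(len(samples)))
    for _ in range(rounds):
        rng.shuffle(indices)
        half = len(indices) // 2
        train, test = indices[:half], indices[half:]
        if len({labels[i] for i in train}) < 2:
            continue  # skip splits whose training half lacks a class
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        for i in test:
            total[labels[i]] += 1
            correct[labels[i]] += int(predict(model, samples[i]) == labels[i])
    return {k: 100.0 * correct[k] / total[k] for k in total}


def fit_nearest_mean(xs, ys):
    """Stand-in classifier: remember the mean feature value of each class."""
    means = {}
    for k in (0, 1):
        vals = [x for x, y in zip(xs, ys) if y == k]
        means[k] = sum(vals) / len(vals)
    return means


def predict_nearest_mean(means, x):
    """Assign x to the class whose mean is closest."""
    return min(means, key=lambda k: abs(x - means[k]))


# Toy, well-separated one-feature data (made up for the example).
values = [1, 2, 3, 4, 5, 6, 21, 22, 23, 24, 25, 26]
classes = [0] * 6 + [1] * 6
scores = repeated_subsampling(values, classes, fit_nearest_mean,
                              predict_nearest_mean, rounds=20)
```

Fixing the seed makes the random splits reproducible, which helps when comparing feature subsets on identical splits.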
Results: Adding Day
Good results, but our goal is to maximize the percentage correct, so we included time (day) as a feature
Features used: Area, Perimeter/Area, TV/Area, Eccentricity, Average Intensity, Circularity Ratio, Day
We observed that some tumors were similar in shape and size, so we needed an additional descriptor to separate them. This happens when a larger control tumor from a later day has a similar area and perimeter to an earlier-stage +HGF tumor.
Percent Correct for Control: 98.88%
Percent Correct for +HGF: 100%
Next Approach
Excellent results, but we were curious whether the same results could be obtained using fewer features
Plot each feature separately to get an idea of its individual classifying potential
Area
The two groups separate because of area differences between control and +HGF tumors
Control = blue, +HGF = red
Circularity Ratio Description
C1 = (area of the shape) / (area of the circle)
where the circle has the same perimeter as the shape
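Writing A for the area of the shape and P for its perimeter (symbols introduced here for illustration, they do not appear on the slide), the definition reduces to a closed form:

```latex
r = \frac{P}{2\pi}, \qquad
A_{\text{circle}} = \pi r^{2} = \frac{P^{2}}{4\pi}, \qquad
C_1 = \frac{A}{A_{\text{circle}}} = \frac{4\pi A}{P^{2}}
```

A perfect circle gives C1 = 1, and any other shape gives a smaller value, since a circle encloses the largest area for a given perimeter.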
Circularity Ratio
The spheroids in both groups (control and +HGF) are relatively circular
Control = blue, +HGF = red
Average Intensity Description
Average Intensity: sum of the image intensities over the shape divided by area
Inversely related to density.
Smaller values indicate less light passing through, suggesting a denser object
+HGF 10 ng/ml Day 11 (10x)
Control Day 8 (10x)
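The average-intensity definition above translates almost directly into code. This is a small sketch assuming the image and the binary mask are given as equally sized lists of rows; the function name is hypothetical.

```python
def average_intensity(image, mask):
    """Sum of the image intensities over the shape divided by the shape's area.

    The mask must contain at least one non-zero pixel.
    """
    total = 0.0
    area = 0
    for image_row, mask_row in zip(image, mask):
        for pixel, inside in zip(image_row, mask_row):
            if inside:
                total += pixel
                area += 1
    return total / area


# Toy 2x2 example: three pixels belong to the shape.
toy_image = [[10, 20], [30, 40]]
toy_mask = [[1, 0], [1, 1]]
value = average_intensity(toy_image, toy_mask)  # (10 + 30 + 40) / 3
```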
Average Intensity
Control = blue, +HGF = red
• Control spheroids have similar average intensities, whereas +HGF spheroids are denser
• Not all +HGF spheroids are very dense, so there is some overlap with controls
Eccentricity Description
Measure of elongation of an object
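The slide does not say how eccentricity was computed. One common definition, assumed here, is the eccentricity of the ellipse that has the same second-order central moments as the shape: 0 for a circle-like shape and approaching 1 for a line-like one. The function name is hypothetical.

```python
import math


def eccentricity(mask):
    """Eccentricity of the ellipse matching the shape's second central moments.

    mask is a non-empty binary mask given as a list of rows of 0/1 values.
    """
    points = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    n = len(points)
    mean_r = sum(r for r, _ in points) / n
    mean_c = sum(c for _, c in points) / n
    # Second-order central moments of the pixel coordinates.
    s_rr = sum((r - mean_r) ** 2 for r, _ in points) / n
    s_cc = sum((c - mean_c) ** 2 for _, c in points) / n
    s_rc = sum((r - mean_r) * (c - mean_c) for r, c in points) / n
    # Eigenvalues of the 2x2 matrix [[s_rr, s_rc], [s_rc, s_cc]].
    half_trace = (s_rr + s_cc) / 2
    root = math.sqrt(((s_rr - s_cc) / 2) ** 2 + s_rc ** 2)
    lam_max, lam_min = half_trace + root, half_trace - root
    if lam_max == 0:
        return 0.0  # a single pixel has no elongation
    return math.sqrt(max(0.0, 1 - lam_min / lam_max))


square = [[1] * 3 for _ in range(3)]   # equal spread in both directions
line = [[1, 1, 1, 1, 1]]               # spread along one direction only
e_square = eccentricity(square)
e_line = eccentricity(line)
```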
Eccentricity
The two groups overlap because most tumors in both groups are circular, except for a few outliers
Control = blue, +HGF = red
Perimeter to Area Ratio
Why Normalize Perimeter by Area?
We do so because a small, jagged object may have the same perimeter as a large, circular object. Dividing by area distinguishes the two, creating a more effective feature for classification.
Perimeter to Area Ratio
This separation is to be expected because the +HGF tumor spheroids are more dispersed than the control tumor spheroids, resulting in greater area.
Control = blue, +HGF = red
Total Variation to Area Ratio Description
At every pixel, estimate the intensity gradient (the differences in intensity in the x and y directions). Sum the gradient magnitudes using a discretization of Total Variation, then normalize by area.
Texture
+HGF 10 ng/ml Day 12 (10x)
Control Day 11 (10x)
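One possible discretization of the TV/Area feature is sketched below. The slides do not specify the scheme, so the choices here are assumptions: forward differences, the isotropic form (square root of the summed squared differences), and differences counted only between pixels that both lie inside the shape. The function name is hypothetical.

```python
import math


def tv_over_area(image, mask):
    """Discrete total variation over the shape, divided by the shape's area.

    The mask must contain at least one non-zero pixel.
    """
    rows, cols = len(image), len(image[0])
    tv = 0.0
    area = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            # Forward differences; taken as zero when the neighbour is
            # outside the image or outside the shape.
            dx = 0.0
            if c + 1 < cols and mask[r][c + 1]:
                dx = image[r][c + 1] - image[r][c]
            dy = 0.0
            if r + 1 < rows and mask[r + 1][c]:
                dy = image[r + 1][c] - image[r][c]
            tv += math.hypot(dx, dy)
    return tv / area


# Toy example: gradient magnitudes are 5, 3, 4 and 0 over 4 pixels.
toy_image = [[0, 3], [4, 0]]
toy_mask = [[1, 1], [1, 1]]
texture = tv_over_area(toy_image, toy_mask)
```

A perfectly flat region gives 0, and rougher texture gives larger values.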
Total Variation to Area Ratio
The two groups overlap because tumors in both groups have similar densities/intensities
Control = blue, +HGF = red
Intuition Through Trial and Error
Given the individual results, we combined the two strongest features, area and perimeter/area, and plotted them against each other in a scatter plot
Area vs. Perimeter/Area
Control = blue, +HGF = red
Results
We obtained reasonably accurate results: only two controls fall on the +HGF side if we draw an imaginary line to separate the two groups
Ran FLD code on Area and Perimeter/Area
Percent Correct for Control: 89.03%
Percent Correct for +HGF: 96.92%
Evaluation
Reasonably good results, but we decided to add the feature Day
Results: Area, Perimeter/Area, Day
Percent Correct for Control: 100%
Percent Correct for +HGF: 100%
“Bad” Features
Plotting graphs of the “good” features and running FLD showed how strong those features really are.
Our first thoughts: Were the “good” features so strong that the “bad” features couldn’t exhibit their full potential as classifiers?
“Bad” features: Circularity Ratio (CR), TV/Area, Average Intensity, Eccentricity
Intuition
We decided to run an FLD test to see whether the “bad” features perform better as a group by themselves
Results: CR, TV/Area, Average Intensity, Eccentricity
Percent Correct for Control: 75.33%
Percent Correct for +HGF: 55.27%
Why?
Final Thoughts
Our belief: “bad” features are not necessarily useless.
Data sets vary; some may include tumors with different textures, shapes, areas, and so on
Our set of features is versatile
After feature identification, the features can be used to pursue broader goals, such as quantifying a certain chemical’s effect on tumors
Conclusion
The effectiveness of the area feature is consistent with the biological hypothesis that HGF increases the cellular mitosis rate, resulting in larger tumors.
The perimeter/area feature quantifies contiguous cell spread; its effectiveness supports the hypothesis that HGF results in a spheroid with a greater perimeter/area ratio.
We tried many sophisticated approaches, but the strongest features turned out to be the simplest ones, which also agreed with biologists’ intuition.
Conclusion (cont.)
Including Day vs. Not Including Day
Day + fewer features = better results
Fewer features (without Day) = worse results
More features (without Day) = good results; separation in high dimensions
Future Goals
Develop methods to quantify cell spread for cells that are no longer attached to the tumor.
Develop an automated segmentation scheme
Occlusions
Existing strong methods worked, but needed more preprocessing
+HGF 10 ng/ml Day 13 (10x)
Future Experiments
EXPERIMENT IDEA #1:
Run experiment with different concentrations of HGF
We want to quantify how HGF acts with respect to increasing concentration
Utilize developed feature vectors to classify images from different concentrations of HGF.
Future Experiments
EXPERIMENT IDEA #2:
Stain spheroids for proteins associated with stem and differentiated cell compartments
Stains can be incorporated into new feature vectors to identify whether HGF-induced changes in stem / differentiated cell concentrations are significant enough to improve image classification.
Acknowledgements
NSF
Professors Jack Xin, Hongkai Zhao, Sarah Eichorn
Advisors: Dr. Fred Park, Dr. Ernie Esser, and Anna Konstorum
Laboratory of Dr. Marian Waterman
Group: Janine Chua, Melody Lim, Natalie Congdon
MBI