
Using Image Processing and Statistical Analysis to Quantify Cell Scattering for Cancer Drug Research

By Gretchen Argast, OSI Pharmaceuticals, LLC, and Paul Fricker, MathWorks

Epithelial to mesenchymal transition (EMT), a process vital to embryonic development, has been linked to the spread of cancer in adults. As a result, there is increased interest in developing cancer drugs that target EMT in addition to drugs that target cell proliferation and survival.

Until recently, measuring how a drug affected one aspect of EMT, cell scattering, was a manual process that involved subjectively assessing the relative closeness of cells in a culture. Researchers at OSI Pharmaceuticals worked with MathWorks consultants to develop an automated system for quantifying the scattering of cells in a sample. Based on MATLAB®, Image Processing Toolbox™, and Statistics Toolbox™, the system measures nucleus-to-nucleus distances of nearest-neighbor cells. The ability to measure scattering is essential to evaluating the efficacy of drugs that may inhibit or reverse EMT because it gives researchers a reliable way to compare the effects of one drug against another.

What is EMT?

In humans and other vertebrates, there are two basic cell types: epithelial and mesenchymal. Several morphological and functional characteristics differentiate the two cell types. For example, epithelial cells depend on cell-to-cell contact for survival. Mesenchymal cells, in contrast, are characterized by their independence from nearby cells and by their mobility, two requirements for cell scattering.

In EMT, cells lose their epithelial traits and acquire mesenchymal traits. EMT is essential for developing embryos because it produces mesenchymal cells that can migrate to form bone, cartilage, and other tissue where needed. In adults, however, EMT is associated with pathologies such as cancer and fibrosis. Because mesenchymal tumor cells are more mobile, and thus more invasive, than epithelial tumor cells, scientists believe that they facilitate metastasis, or the spread of tumor cells. EMT also diminishes the effectiveness of chemotherapy treatments that target epithelial cells.

Analyzing Cell Sample Images

OSI researchers have developed pancreatic and lung tumor models and identified a set of ligands, or binding molecules, that drive EMT in these models. Two of these ligands, hepatocyte growth factor (HGF) and oncostatin M (OSM), induced EMT in the models, enabling us to produce samples that demonstrate the cell scattering associated with EMT. The samples are stained so that the nucleus of each cell shows blue in the images captured by our microscopes (Figure 1).

See more articles and subscribe at mathworks.com/newsletters.

Figure 1. Left: An untreated cell sample showing cell nuclei in blue. Right: A similar sample treated with HGF and OSM (H+O).

To quantify the scattering of the cells, we developed a numerical procedure that uses image processing and statistical analyses. Measuring the spatial density of the cells would be relatively straightforward if the images were completely covered by the cells: We would simply count the number of nuclei in each image and then divide by the total image area. The images that we generate are almost always partially covered, however, making it difficult to estimate the cell density correctly. We decided to develop an alternative approach to quantify the scattering, based on measurements of the distances between the cell nuclei.

To analyze the cell images, we used an algorithm consisting of four main steps:

1. Threshold the entire image to segment the cell nuclei, or clusters of nuclei.

2. Analyze the resulting blobs to determine their sizes (areas).

3. Zoom in on larger blobs to perform a localized analysis, to identify individual cells within the blob.

4. Identify the (x,y)-location for each cell nucleus in the image.

Because the intensity scaling is consistent across all the captured images, we can capture most of the individual blobs using a single hard-coded threshold value. This thresholding procedure produces a binary image in which the cell nucleus is indicated by 1, or white, and its absence is indicated by 0, or black (Figure 2). Using Image Processing Toolbox, we analyzed these black and white images to find the locations and sizes (areas) of all the blobs.
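The thresholding and blob-measurement steps can be illustrated with a short sketch. It is written in plain Python as a stand-in for the MATLAB and Image Processing Toolbox code, which this article does not show. The threshold value `LEVEL`, the toy image, the function names, and the choice of 4-connectivity are all assumptions made for the example.

```python
from collections import deque

def threshold(image, level):
    """Return a binary image: 1 where intensity exceeds level, else 0."""
    return [[1 if px > level else 0 for px in row] for row in image]

def find_blobs(binary):
    """Find 4-connected regions of 1s; return the area and centroid of each."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            area = len(pixels)
            cy = sum(p[0] for p in pixels) / area
            cx = sum(p[1] for p in pixels) / area
            blobs.append({"area": area, "centroid": (cx, cy)})
    return blobs

# Toy 5x6 intensity image with two bright regions (made-up values).
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10],
    [10, 190, 220, 10, 10, 180],
    [10, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10],
]
LEVEL = 128  # hypothetical hard-coded threshold, not the value used in the study
blobs = find_blobs(threshold(image, LEVEL))
for blob in blobs:
    print(blob["area"], blob["centroid"])
```

Each blob's centroid is reported as an (x, y) pair in pixel units, which is the form needed later for the distance calculations.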

Figure 2. A cell sample image after applying a thresholding and erosion procedure.

In some cases, a few cells are so close together that their nuclei appear to be touching one another, and they cannot be distinguished as separate nuclei. To enhance the processing of the images, we sorted the blobs into three categories based on their size. Those with areas below a certain size were deemed to be noise or partially occluded cells, and were discarded from the subsequent analysis. Blobs of intermediate size were classified as individual nuclei that had already been successfully segmented. The largest blobs were presumed to be clusters of overlapping cells requiring further analysis.
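A minimal Python sketch of this three-way sorting follows. The two area cutoffs, `min_area` and `max_single_area`, are hypothetical values chosen for the example; the article does not state the cutoffs that were actually used.

```python
def classify_blobs(areas, min_area, max_single_area):
    """Sort blob indices into noise, single nuclei, and clusters by area."""
    groups = {"noise": [], "single": [], "cluster": []}
    for index, area in enumerate(areas):
        if area < min_area:
            groups["noise"].append(index)      # too small: discarded
        elif area <= max_single_area:
            groups["single"].append(index)     # already segmented nucleus
        else:
            groups["cluster"].append(index)    # needs localized analysis
    return groups

# Made-up blob areas in pixels.
areas = [3, 40, 55, 150, 8, 62]
groups = classify_blobs(areas, min_area=10, max_single_area=80)
print(groups)
```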

To distinguish the individual nuclei within the larger blobs, the algorithm crops the subregions of the image containing the largest blobs and performs local, adaptive thresholding to more accurately distinguish the individual cells (Figure 3).
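The article does not say which adaptive thresholding rule was applied to the cropped regions. The Python sketch below shows one simple possibility, purely as an assumption: recompute the threshold from the crop itself, as the mean intensity of the pixels that passed the global threshold, and count the connected regions again. The crop values, `GLOBAL_LEVEL`, and the function names are made up for the example.

```python
def binarize(crop, level):
    """Return 1 where intensity exceeds level, else 0."""
    return [[1 if px > level else 0 for px in row] for row in crop]

def count_components(binary):
    """Count 4-connected regions of 1s with an iterative flood fill."""
    rows, cols = len(binary), len(binary[0])
    seen = set()
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] == 0 or (r, c) in seen:
                continue
            count += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
    return count

def local_threshold(crop, global_level):
    """Assumed rule: mean intensity of the pixels above the global level."""
    foreground = [px for row in crop for px in row if px > global_level]
    return sum(foreground) / len(foreground)

GLOBAL_LEVEL = 128  # hypothetical global threshold
# Made-up crop: three bright nuclei joined by dimmer bridges.
crop = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 200, 140, 210, 140, 220, 0],
    [0, 200, 140, 210, 140, 220, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
before = count_components(binarize(crop, GLOBAL_LEVEL))
level = local_threshold(crop, GLOBAL_LEVEL)
after = count_components(binarize(crop, level))
print(before, level, after)
```

With the global threshold the crop is a single blob; with the locally computed level the dimmer bridges drop out and the three nuclei separate.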

Figure 3. Left: Unprocessed image of a cluster of nuclei initially identified as a single large blob. Right: The same cluster re-analyzed to identify three individual cell nuclei.

At the end of the image analysis procedure, the algorithm has identified the location of most of the cell nuclei in the image, and stored this data in an array. The success of the algorithm can be verified visually by overlaying the input images with markers at each measured nucleus location (Figure 4).

Figure 4. The processed image with each identified nucleus located and marked in red.

Measuring and Analyzing Distances Between Cells

Once we have processed the images and obtained an array of cell nucleus coordinates, we use basic MATLAB matrix operations to compute the distances between an individual nucleus and all the other nuclei in the cell cluster. To assess the scattering of the cells, we compute the distance between each cell and its nearest neighbor. Each image generates a set of nearest-neighbor distances, with one value for each cell. The distance values computed from the image data are initially measured in pixels, and are converted to microns using a known length scale.
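The nearest-neighbor calculation can be sketched in a few lines of Python (again a stand-in for the MATLAB matrix operations described above). The coordinates and the `MICRONS_PER_PIXEL` scale are made-up values; the real length scale depends on the microscope setup.

```python
import math

def nearest_neighbor_distances(points, microns_per_pixel):
    """For each (x, y) point, return the distance to its closest other point.

    Requires at least two points. Distances are computed in pixels and then
    scaled to microns.
    """
    distances = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        distances.append(nearest * microns_per_pixel)
    return distances

nuclei = [(0, 0), (3, 4), (3, 0), (10, 0)]  # made-up pixel coordinates
MICRONS_PER_PIXEL = 0.5                      # hypothetical length scale
nn = nearest_neighbor_distances(nuclei, MICRONS_PER_PIXEL)
print(nn)
```

The result has one value per nucleus, so an image with N detected nuclei contributes N nearest-neighbor distances to the histogram.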

MATLAB histograms of these nearest-neighbor distances show clearly that the data fall into meaningful distribution patterns. These patterns reveal distinct differences among the four types of cells that we were studying: untreated, HGF-treated, OSM-treated, and HGF+OSM-treated lung cancer cells (Figure 5).

Figure 5. Histograms and curve fitting of nearest-neighbor distances for untreated cells (top left), HGF-treated cells (bottom left), OSM-treated cells (bottom right), and HGF+OSM-treated cells (top right).

These histogram results suggested that the data could be characterized using a statistical distribution. Using Statistics Toolbox we fitted the measured distance values to a series of probability distributions. Narrowing our search to asymmetric, continuous distributions, after an iterative process we found that the loglogistic distribution provided the best fit for the nearest-neighbor distance results.
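For reference, a common two-parameter form of the loglogistic density is given below. It assumes the parameterization in which the logarithm of the distance follows a logistic distribution with location μ and scale σ; the article does not spell out the form used, so treat this as an assumption.

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma x}\,
  \frac{e^{z}}{\left(1 + e^{z}\right)^{2}},
\qquad z = \frac{\ln x - \mu}{\sigma},
\qquad x > 0,\ \sigma > 0
```

Under this form the median distance is $e^{\mu}$, and larger σ gives a broader, more skewed distribution.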

In addition to characterizing the scattering of the cells, one of the main objectives of this project was to develop a method for differentiating the degree of scattering produced by the treatment of cell samples with different ligands. To accomplish this, we used MATLAB to compute the mean (μ) and variance (σ) parameters for the loglogistic distribution for each of the four samples (Figure 6).
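To give a feel for how μ and σ separate samples, here is a rough Python sketch. It assumes the parameterization in which the log of the distance is logistic with location μ and scale σ, and it uses a simple moment estimate on the log-distances rather than the fitting routine in Statistics Toolbox, which may well return somewhat different values. The two distance lists are made up and are not data from the study.

```python
import math
import statistics

def fit_loglogistic_moments(distances):
    """Moment estimate of (mu, sigma) from positive distances.

    If ln(d) is logistic with location mu and scale sigma, then
    mean(ln d) = mu and var(ln d) = sigma**2 * pi**2 / 3.
    """
    logs = [math.log(d) for d in distances]
    mu = statistics.mean(logs)
    sigma = math.sqrt(3.0 * statistics.pvariance(logs, mu)) / math.pi
    return mu, sigma

# Made-up nearest-neighbor distances in microns.
compact = [8.0, 9.5, 10.0, 11.0, 12.5]       # tightly packed cells
scattered = [15.0, 22.0, 30.0, 41.0, 60.0]   # widely scattered cells

mu_c, sigma_c = fit_loglogistic_moments(compact)
mu_s, sigma_s = fit_loglogistic_moments(scattered)
print(f"compact:   mu={mu_c:.3f} sigma={sigma_c:.3f}")
print(f"scattered: mu={mu_s:.3f} sigma={sigma_s:.3f}")
```

In this toy case the scattered sample yields both a larger μ and a larger σ, which is the kind of separation the fitted parameters are meant to capture.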

Figure 6. Statistical fitting results for the four nearest-neighbor data sets.

The statistical fitting plots show that the computed values of μ and σ capture distinct differences in the magnitude of cell scattering in the four data sets. Conversely, when these parameters are computed for a given data set, they can be used to identify which ligand (HGF, OSM, or HGF+OSM) was used to treat the original cell sample. The distributions show that either ligand alone induced scattering in the cells, and that the combined ligand treatment resulted in a further increase in scattering. These distributions reflect what we observe qualitatively in the cells after treatment with ligands. From these results we concluded that the mean and variance parameters of the loglogistic distribution fitting of computed nearest-neighbor distances could be used to reliably quantify the scattering of cell nuclei in a given sample.

In addition to characterizing the responses of the cells to different ligands, we also looked at the effect of drug treatment on the degree of cell scattering. We computed the loglogistic distributions for samples treated with HGF+OSM that were also treated with increasing concentrations of a drug that blocks the effects of HGF (50 nM to 2 μM) (Figure 7). At concentrations of 500 nM and above, the drug inhibited the effects of HGF and reduced the degree of scattering to one that approximated the effects of OSM by itself. This type of analysis is essential for determining the optimal dose for a new drug.

Figure 7. Probability density function (PDF) fitting results for the nearest-neighbor data for a range of doses. Cell scattering induced in lung cancer cells with a combination of ligands (Untreated vs HGF+OSM) was inhibited in a dose-dependent manner when cells were simultaneously treated with HGF+OSM and a drug that targets the receptor of HGF (500 nM to 2 μM).

At the beginning of the EMT quantification project, our goal was to use image analysis techniques with our microscope data to quantify the scattering or density of cells in our samples. After successfully analyzing the basic attributes of the cell nuclei using MATLAB and Image Processing Toolbox, we realized that the resulting data could best be characterized in terms of a statistical distribution. It was easy to transition to a statistical analysis of the data using Statistics Toolbox. MATLAB enabled us to work within a single development environment, from the initial image thresholding and nearest-neighbor distance calculations, through selecting and validating an appropriate statistical distribution, to the final comparison of different ligand dose responses.

With a system in place for quantifying the scattering of cells in a sample, OSI researchers now have an objective computational method for measuring the ability of drugs in development to reduce or reverse EMT, and potentially, for increasing the drug’s ability to inhibit cancer metastasis.

About the Author

Gretchen Argast is a Senior Research Scientist at OSI Pharmaceuticals with expertise in developing EMT models and assays for drug discovery research, as well as translational research for more advanced programs. She holds a B.A. in Biological Sciences from the University of Chicago and a Ph.D. in Pathology from the University of Washington.

Paul Fricker is a MathWorks Principal Consulting Engineer. Paul has more than 15 years’ experience in signal and image processing, modeling and simulation, and application development. He holds a B.Sc. in Chemistry from Dalhousie University, an M.Sc. in Physics from the University of Toronto, and a Ph.D. in Civil Engineering from Massachusetts Institute of Technology.

Products Used

▪ MATLAB

▪ Image Processing Toolbox

▪ Statistics Toolbox

Learn More

▪ New Features for High-Performance Image Processing in MATLAB

▪ MathWorks Consulting Services

Published 2012
92038v00

mathworks.com
© 2012 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.
