


Segmentation of Crystalline Lens in Photorefraction Video

Mark Wendt and Shishir K. Shah∗

Quantitative Imaging Laboratory, University of Houston

Dept. of Computer Science, Houston, TX, U.S.A.
[email protected]

ABSTRACT

With an increase in the percentage of people aged 65 years or older, there is a growing interest in finding ways to slow or reverse some of the effects of aging on the human body. One of these effects is the loss of the ability of the human eye to visually focus on near objects. Understanding the properties of the natural lens and the associated mechanisms of accommodation will greatly increase our knowledge to identify alternate solutions to reverse this phenomenon. Toward that end, a technique called photorefraction is currently being used in laboratory studies involving monkeys. To quantitate the lens properties, there is a need to accurately measure monkey lenses in images produced by this technique. In this paper, we present two probabilistic methods for segmenting the lens from photorefraction video sequences. Results of the developed methods are compared and evaluated against ground-truth segmentations. In addition, the results obtained are also compared to those obtained by a level set segmentation method.

Keywords

Segmentation, Photorefraction, Monkey Lens

1. INTRODUCTION

In 2003, 12% of the U.S. population was 65 years of age or older. That percentage is projected to increase to 20% by 2030 [6]. As the population ages, interest increases in finding ways to slow or reverse some of the effects of aging on the human body. One of these effects is the loss of the ability to visually focus on a near object. Changing focus from far to near is called ‘accommodation’. ‘Presbyopia’ is the age-related loss of the ability to accommodate. This process is complete by the time people reach their early fifties.

Research regarding treatments for presbyopia will benefit from an increased understanding of the mechanisms involved. Toward that end, a technique called photorefraction is currently being used in laboratory studies using monkeys. Monkeys have provided a useful model in the study of accommodation because of their ocular similarity to humans. Video sequences of the monkey eye are produced in photorefraction and calculations are made from frames of the video.

∗Corresponding author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICVGIP ’10, December 12-15, 2010, Chennai, India
Copyright 2010 ACM 978-1-4503-0060-5/10/12 ...$10.00.

Figure 1: Diagram of a photorefraction experiment.

To acquire these sequences, each experiment is recorded to videotape at 30 Hz on a black and white CCD video camera. An opaque occluding device covers the lower half of the lens on the video camera (Figure 1A). An array of infrared light-emitting diodes (IR-LEDs) is located on this occluding device. The system of video camera, occluding device, and LED array is placed in front of the eye at a distance, d. Since the lower half of the camera lens is occluded, only the upper part of the light returning from the lens enters the camera. The originating location of that light on the lens is determined by the optics of the eye. If the eye is focused at a distance greater than d, then the light entering the camera will originate from the upper portion of the lens (Figure 1B). If the eye is focused at a distance less than d, then the light entering the camera will be inverted such that it comes from the lower portion of the lens (Figure 1C). In images of the eye, the upper portion of the lens is brighter in the unaccommodated lens (Figure 2A) than in the accommodated lens (Figure 2B). This change in vertical brightness can be measured as a change in slope of the intensity profile (Figure 2C&D).

Figure 2: Images of a monkey lens in (A) the unaccommodated and (B) accommodated states produce pixel intensity profiles with (C) a negative slope and (D) a flatter slope, respectively.

Analysis of each video frame in a photorefraction session returns a single slope value that is calculated from the intensity profile (Figure 3). This slope value has a strong inverse relationship with the actual refractive ability of the eye as measured in inverse meters (m−1), also known as diopters (D).

In typical photorefraction studies with monkeys, the iris is surgically removed to provide a better view of the structures in the eye. Fully exposed lenses in video images of iridectomized monkey eyes present a number of challenges to segmentation: occlusions from tears, eyelashes, and other structures are common; the defining features of the lens boundary vary from image to image and at different positions within the same image; and the lens illuminates differently at different angles. In addition, certain conditions can cause the appearance of false lens edges that strongly mimic true ones.

The main goal in these experiments is to record the accommodative state of the eye in each frame in order to draw conclusions about changes in accommodation over time. Because dynamic changes in accommodation are being sought, it is necessary to make precise measurements with low noise. Typical experiments run at 30 frames per second, from 48 seconds on the low end to more than an hour on the high end. The large volume of data coming from these experiments, combined with the need for consistent measurements, creates the need for these calculations to be performed automatically.

In this paper, we present an algorithm that is designed to segment lenses in individual photorefraction images. It is further extended such that the segmented boundary is refined by exploiting the temporal dimension of the video. The results of the developed algorithms are compared to results obtained using a Level Set method of segmentation. All three of these algorithms address the ‘Lens Segmentation’ module of the overall processing pipeline shown in Figure 3. The initial step of detecting the Purkinje image is related to detection of the lens center, which forms the initialization for all segmentation algorithms discussed here. The lens center is detected by finding the projection of the LED array on the lens and is accomplished by template matching [14].

The rest of the paper is organized as follows: In Section 2, relevant previous work is discussed which lays the groundwork for the segmentation algorithms described here. Section 3 describes in detail the probabilistic edges (PE) and iterative probability (IP) methods as well as the level set method used for comparison. Methods of validating these algorithms are discussed in Sections 4.1 and 4.2. Section 4.3 presents results from the individual methods as well as comparative results. Section 5 concludes with a discussion of these methods and their significance.

Figure 3: Diagram of photorefraction calculations.

2. PREVIOUS WORK

Many methods have been developed which are designed to follow or delineate the boundaries between regions based on image intensity or gradient. Active contours, or ‘Snakes’, as originally proposed by Kass et al. [8], make use of a parametric curve, v(s) = (x(s), y(s)), which minimizes an energy functional:

E_snake = ∫₀¹ [E_int(v(s)) + E_ext(v(s)) + E_con(v(s))] ds  (1)

The snake is designed such that the functional is minimized as the curve molds around the object of interest. In this functional, an external energy component is responsive to image features such as the gradient, an internal energy component defines the motion of the snake, and a constraint component incorporates an anchoring point or a repulsive force. As originally proposed, snakes have the disadvantages that they must be placed in close proximity to the object of interest and that they do not conform to concavities. Cohen and Cohen [1] addressed this by adding an inflating force to the active contour. Another adaptation to address these issues is to replace the external force with a gradient vector field [18]. In this method, a field of vectors points to the gradient and this field extends to all parts of the image, thus guiding the snake to the desired end point. Others have added an inertial force in a system that adds velocity and mass to the spline points [2]. The addition of mass and velocity to the spline points relates to other systems where particles have been used in an active-contour-like manner. In these systems, the particles behave like real-world particles that have mass and follow Newtonian laws of motion. Szeliski et al. [15] used oriented particles which had not only a position, but an orientation in 3D space. Jalba et al. [7] represented their particles as charged particles in an electric field.

Level set formulations have been proposed for segmentation of curves by incorporating a means of following fronts which propagate with a velocity dependent on their curvature [13]. To do this, the object of interest is seen as the intersection of the plane (or volume) with a higher dimensional surface [11]. This method provides the advantage that it can deal with discontinuous regions and holes by representing them as branches of the higher dimensional surface. It also allows for initialization to occur far from the object of interest. A general problem with these techniques is the tendency to ‘leak’ when the magnitude of the stopping criterion, e.g. the gradient, decreases in certain regions. As an alternative, flux has been used to direct the level sets toward low contrast edges such as elongated blood vessels [17]. Instead of propagating level sets, flux has also been incorporated into graph cuts in combination with length/area [9].

The above methods are useful when a reliable edge type exists such as a gradient or a zero-crossing. However, many of the images in this study have competing edges with similar characteristics. For this, a method is desired which distinguishes between competing edges and assigns probabilities to each. Probabilistic methods have been applied to numerous problems in computer vision. Maximum a posteriori (MAP) estimation was used for higher-level inference of 1-D structures to find roads in satellite images [5]. An analysis of this data indicated that MAP was effective only to the level that the images provided good visual cues [20]. Lower level segmentation was achieved by supervised learning of a large number of general image cues in regions centered at each point in training images [3]. Others have used a limited number of specifically chosen edge characteristics such as color and gradient to aid the classification of edge points [10, 12]. As opposed to edge points, Elder et al. [4] used supervised learning of contours where final boundaries were obtained by probabilistic methods such that a true probability could be assigned to the final contour.

Numerous papers report a multi-step approach using a variety of methods to segment or track objects in highly specific contexts. As one example, Yan et al. [19] described an approach for segmenting objects in medical images that can be described as a closed curve. In this approach, a region of the image which was centered on the object of interest was transformed into the polar domain. Then a pre-computing algorithm was used to locate the pixels on the boundary of the object. The image with selected pixels was transformed back into image coordinates and the pixels were fitted with a Fourier equation.

3. METHODS

This section presents the segmentation methods developed for detecting the lens boundary in all frames of the acquired photorefraction video. The probabilistic edge method (Section 3.1) uses a multi-step approach reminiscent of Yan et al. [19] in that the image is pre-processed into a polar transform, edges are found on the polar image, and then a curve is fit to the edges. However, in this case, supervised learning of edge probabilities was used to find the edge. The results of the probabilistic edge method are further refined by an iterative probability method (Section 3.2). Finally, these results are compared with a level set method (Section 3.3).

3.1 Probabilistic Edge Method

As a first step, the image, I(x, y), was transformed into a polar image, Ip(ψ, r), using the following transform:

Ip(ψ, r) = I(a, b)  (2)

where

a = x0 + cos(ψ) · (r + R0)

and

b = y0 + sin(ψ) · (r + R0).

This is done to simplify edge calculations, which are expected to exhibit radial continuity due to the inherent lens shape. The point (x0, y0) is a selected point near the center of the image. The constant, R0, is the distance from the image center at which the transform begins. The initial point in the polar transform, (x0, y0), was chosen such that it was close to the center of the lens, (xc, yc). Assuming that the lens is circular and that (xc, yc) does lie within the circle, then the lens edge in the resulting polar transform will necessarily be a sinusoid with a single period, 2π. The equation for the sinusoid is given by:

r = r0 + A · sin(ψ + φ).  (3)

The distance from the true center of the circle to the initial center is equal to the amplitude of the sinusoid:

A = √((xc − x0)² + (yc − y0)²)  (4)

and the angular direction of (xc, yc) in relation to (x0, y0) corresponds to the phase, φ, in the sinusoid. The actual radius of the lens is given by r0.
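As an illustration, the transform of Eq. 2 can be sketched in Python with NumPy. The function name and sampling resolution here are our own, and since the paper does not specify an interpolation scheme, nearest-neighbour sampling is assumed:

```python
import numpy as np

def polar_transform(img, x0, y0, R0=0, n_psi=360, n_r=100):
    """Sample I(x, y) along rays from (x0, y0) to build Ip(psi, r), per Eq. 2:
    a = x0 + cos(psi)*(r + R0), b = y0 + sin(psi)*(r + R0)."""
    psi = np.linspace(0.0, 2.0 * np.pi, n_psi, endpoint=False)
    r = np.arange(n_r, dtype=float)
    a = x0 + np.cos(psi)[:, None] * (r[None, :] + R0)
    b = y0 + np.sin(psi)[:, None] * (r[None, :] + R0)
    # Nearest-neighbour sampling, clamped to the image bounds.
    ai = np.clip(np.rint(a).astype(int), 0, img.shape[1] - 1)
    bi = np.clip(np.rint(b).astype(int), 0, img.shape[0] - 1)
    return img[bi, ai]  # Ip[psi, r]
```

A circle of radius r0 centered exactly at (x0, y0) then maps to the constant row r = r0 − R0, while an off-center start point produces the single-period sinusoid of Eq. 3.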

All subsequent operations were performed on the polar image. The set of all edge points, T, in the polar image was calculated using zero-crossings of first and second order derivatives:

T = T1 ∪ T2  (5)

where

T1 = {t1, ..., tn}, where dIp/dr = 0 at each ti,

and

T2 = {t1, ..., tn}, where d²Ip/dr² = 0 at each ti.

The subset of all edges which lie on the boundary of the lens is denoted as T^0. It was observed that first order lens edges, T^0_1, tended to appear as dark intensity minima in regions where the lens and the sclera came together, while second order edges, T^0_2, were high contrast gradients in regions where the retina could be seen between the lens and the sclera. Two first order texture values were also calculated from the polar image. Energy and entropy were calculated from histograms of pixel intensities in a 7×7 window centered at each pixel in Ip.
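A minimal sketch of the zero-crossing edge detection of Eq. 5, applied to one radial profile (fixed ψ) of the polar image. The function name is illustrative, and since the paper does not specify a derivative estimator, a finite-difference gradient is assumed:

```python
import numpy as np

def edge_points(profile):
    """Return candidate edge positions along r for one column of Ip:
    T1 = zero-crossings of dIp/dr, T2 = zero-crossings of d2Ip/dr2 (Eq. 5)."""
    d1 = np.gradient(profile.astype(float))
    d2 = np.gradient(d1)
    t1 = np.where(np.diff(np.sign(d1)) != 0)[0]  # first order edges
    t2 = np.where(np.diff(np.sign(d2)) != 0)[0]  # second order edges
    return t1, t2
```

On a smooth intensity bump, T1 captures its extremum (e.g. a dark minimum at the lens/sclera junction) and T2 captures the flanking inflection points (high contrast gradients).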

A set of characteristics, D, was collected at the positions of all edges, T, in each image. The set of characteristics can be enumerated as D = {d1, ..., dn}. This set included: (d1) first derivative; (d2) second derivative; (d3) energy; (d4) entropy; (d5) the probability of an edge, t ∈ T^0_2, where dIp/dr < 0; (d6) distance from the previous edge, t ∈ T2, where dIp/dr > 0; (d7) distance from the previous edge, t ∈ T2, where dIp/dr < 0; (d8) distance from the previous edge, t ∈ T1; and (d9) distance to the following edge, t ∈ T2, where dIp/dr > 0. Also, an iterative probability factor, d10, was set to a constant value for individual frames or calculated as described later for the Iterative Probability method.

Probability data was collected from thirty images in an image database. To account for the two types of lens edges, T^0_1 and T^0_2, points were manually selected separately for each. Two different sinusoids, S1 and S2 respectively, were then fit to each type of edge using least squares. Inclusion in T^0_1 or T^0_2 was determined by the distance to the corresponding set of calculated sinusoidal points, S1(ψ, r) or S2(ψ, r), in the polar image:

ti ∈ T^0_1 if |ti − si| ≤ 1, si ∈ S1  (6)

and

ti ∈ T^0_2 if |ti − si| ≤ 1, si ∈ S2.  (7)

In thirty images, a total of 745,635 edge points were found such that ti ∉ T^0, while 6388 edge points were found such that ti ∈ T^0.

During supervised learning, data was collected for each characteristic, di ∈ D, at each edge point, t ∈ T. The probabilities, p(D | t ∉ T^0) and p(D | t ∈ T^0), were estimated by fitting a Gaussian to the data. As a result, two parameters, σ and µ, were obtained at each edge point where t ∈ T. In addition, two parameters were obtained in the case where edge points were on the edge of the lens, t ∈ T^0. Once the probabilities were learned, the probability that any edge point, t ∈ T, would be on the edge of the lens could be calculated according to Bayes' rule:

p(t ∈ T^0 | D) = p(D | t ∈ T^0) · p(t ∈ T^0) / p(D)  (8)

For simplicity, the maximum a posteriori (MAP) method was used whereby the probability, p(t ∈ T^0 | D), was estimated by the likelihood, p(D | t ∈ T^0). The posterior probabilities were estimated with likelihood ratios for each edge trait, di, as follows:

p(t ∈ T^0 | di) = 1 / (1 + p(di | t ∉ T^0) / p(di | t ∈ T^0)).  (9)
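Under the Gaussian class-conditional model described above, Eq. 9 reduces to a simple likelihood ratio per trait. A sketch (parameter names are our own; the priors p(t ∈ T^0) are dropped, following the paper's simplification):

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Gaussian density fitted to a characteristic d_i during training."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def edge_probability(d, mu_on, sd_on, mu_off, sd_off):
    """Eq. 9: p(t in T0 | d_i) = 1 / (1 + p(d_i | t not in T0) / p(d_i | t in T0)).
    (mu_on, sd_on) are fit to true lens edges, (mu_off, sd_off) to all others."""
    return 1.0 / (1.0 + gauss_pdf(d, mu_off, sd_off) / gauss_pdf(d, mu_on, sd_on))
```

A trait value near the on-edge mode yields a probability near 1; one near the off-edge mode yields a probability near 0. Per-trait probabilities computed this way are then combined multiplicatively as in Eqs. 12 and 13.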

In addition, two specific posterior probabilities were estimated in response to situations where true second order edges, t ∈ T^0_2, were preceded by a paired high probability first order edge, t_previous ∈ T1. The first estimate modulates the probability of the distance from the previous edge by the probability of the previous edge:

p(t ∈ T^0_2 | d8) = 1 / (1 + 1 / (p(d8 | t ∈ T^0_2) · p(t_previous ∈ T^0_1 | D))).  (10)

Conversely, first order edges are multiplied by an inverse equation that reduces their probability if they are closely followed by a high probability second order edge, t_next ∈ T2:

p(t ∈ T^0_1 | d9) = 1 − p(d9 | t ∈ T^0_1) · p(t_next ∈ T^0_2 | D).  (11)

Final posterior probabilities were combined differently for T1 and T2:

p(t ∈ T^0_1 | D) = ∏_i p(t ∈ T^0_1 | di)  (12)

p(t ∈ T^0_2 | D) = ∏_j p(t ∈ T^0_2 | dj).  (13)

These probabilities are calculated for all edge points, t ∈ T, in Ip and stored as an image which is the same size as Ip.

The best fit to the probabilistic data is determined by an energy minimizing method. The total energy for each sinusoidal fit to the data is calculated by the following equation:

E_total = C_dist · E_dist + C_conn · E_conn + C_prob · E_prob  (14)

where

E_dist = Σ_{n=0}^{ψmax} |rp − tn| · w,  (15)

E_conn = Σ_{n=0}^{ψmax} c_n, with c_n = 1 if (|rp − tn| · w) > 1 and c_n = 0 if (|rp − tn| · w) ≤ 1,  (16)

E_prob = Σ_{n=0}^{ψmax} (w · p(tn ∈ T^0 | D))⁻¹,  (17)

w = max(p(T_ψ | D)) / max(p(T | D))  (18)

where C_dist, C_conn, and C_prob are constants; rp is the position of r that is predicted by the given sinusoid at the given position, ψ; and T_ψ is the maximum probability of all edges at any given value of ψ. The weight factor, w, is used to account for the low probabilities of certain regions of the probability image. Sinusoids are generated and the one which minimizes E_total is chosen as the best fit to the data. Rather than generate every possible sinusoid, initial ranges of parameters r0, A, ψ, and φ were pre-selected using a priori knowledge of the system. The total energy, E_total, was calculated for three evenly spaced values of each parameter and then the parameters were narrowed in a binary-search-like manner.
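The binary-search-like narrowing can be sketched for a single sinusoid parameter as follows. This is a hypothetical helper: the paper does not give the exact interval-halving rule, so we assume the interval is recentered on the best of the three samples and halved each iteration, which converges when E_total is unimodal in that parameter:

```python
import numpy as np

def refine_parameter(energy, lo, hi, iters=20):
    """Narrow one sinusoid parameter of Eq. 14: evaluate E_total at three
    evenly spaced values, then keep a half-width interval around the best."""
    for _ in range(iters):
        xs = np.linspace(lo, hi, 3)
        best = xs[int(np.argmin([energy(x) for x in xs]))]
        half = (hi - lo) / 4.0
        lo, hi = best - half, best + half
    return 0.5 * (lo + hi)
```

Because the interval width halves on every iteration, 20 iterations narrow an initial range by a factor of about 10⁶, far cheaper than an exhaustive sweep over all (r0, A, φ) combinations.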

3.2 Iterative Probability Method

The algorithm described so far segments the lens in individual images or frames. However, the size and position of the lens from frame to frame are strongly related. To take advantage of this, the iterative probability algorithm is used to characterize the data over the time domain.

Initial lens data is calculated using the probabilistic edge (PE) algorithm described previously. Individual frames along with the results of these calculations were stored in an array, M, of odd length l. A weighted mean and a weighted standard deviation are calculated for the lens data in the array. To weight these calculations, the lens results were multiplied by a Gaussian where σ = l/4, µ = l/2, and x is the position of the data within the array. The PE algorithm is then performed on m frames beginning with the middle position in the array. In this case, the height, r, of the polar image, Ip, is set to r0 ± 2·sd, where sd is the standard deviation which was previously calculated. In addition, an iterative probability factor, d10, is set to a Gaussian centered at r0 with σ = sd, thus increasing the probability of the previously obtained lens parameters. As before, the PE algorithm returns the lens parameters, (xc, yc) and r0, which are then used to calculate a new mean and standard deviation. Additionally, the total energy, E_total, is also returned. The iterative probability (IP) algorithm minimizes E_total using gradient descent.


In addition to the information from the previous frame, the lead image can also use information from the previous IP calculation. The new IP factor, d10, averages the radii from the previous image and the previous IP calculation, and expands the sigma value based on the change in the center position of the circle, according to:

µ = (r_previous + r_IP) / 2  (19)

and

σ = σ_init + √(dxc² + dyc²).  (20)
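The Gaussian weighting of the frame array M described above can be sketched as follows (array handling and function names are our own; σ = l/4 and µ = l/2 as stated in the text):

```python
import numpy as np

def gaussian_weights(l):
    """Normalized Gaussian over positions x = 0..l-1 with sigma = l/4, mu = l/2."""
    x = np.arange(l, dtype=float)
    w = np.exp(-0.5 * ((x - l / 2.0) / (l / 4.0)) ** 2)
    return w / w.sum()

def weighted_lens_stats(radii):
    """Weighted mean and standard deviation of the per-frame lens radii in M."""
    radii = np.asarray(radii, dtype=float)
    w = gaussian_weights(len(radii))
    mu = float(np.sum(w * radii))
    sd = float(np.sqrt(np.sum(w * (radii - mu) ** 2)))
    return mu, sd
```

The resulting mean and standard deviation set the search band r0 ± 2·sd for the polar image height and the width of the d10 prior in the next IP iteration.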

3.3 Level Set Method

For comparison, a level set method was applied to the same images and video segments as the proposed methods. The method used was the ShapeDetectionLevelSetImageFilter in the Insight Toolkit (ITK) [16], which is based on the paper by Malladi et al. [11]. Initially, the Purkinje image was located as previously described. Then, the image was convolved with a GradientMagnitudeRecursiveGaussianImageFilter, which acted as a Laplacian of Gaussian (LoG) filter that found the gradient magnitude of the Gaussian blurred image. Additionally, a sigmoidal filter was applied to the image intensity values using the equation:

I′ = (Max − Min) / (1 + e^(−(I − β)/α)) + Min.  (21)

The user controlled the values of α and β in the sigmoidal filter as well as the value of σ in the LoG filter. Additionally, the user was able to start the level set boundary at a selected distance from the Purkinje image.

The application of this filter typically stopped short of the actual lens edge at a position of high luminance. The actual lens edge is typically a dark boundary which can be described as a first order minimum in the image. For this reason, the boundary resulting from the level set method was used as the starting point of a second level set which operated on the original image to find an image minimum. Both applications of the level set method had components for propagation and curvature which were set by the user.
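The sigmoidal remapping of Eq. 21 (the form used by ITK's SigmoidImageFilter) can be sketched directly; the output range and default parameters below are illustrative:

```python
import numpy as np

def sigmoid_filter(img, alpha, beta, out_min=0.0, out_max=255.0):
    """Eq. 21: I' = (Max - Min) / (1 + exp(-(I - beta)/alpha)) + Min.
    beta centers the response on an intensity of interest; alpha sets
    the width of the intensity window that is stretched."""
    i = np.asarray(img, dtype=float)
    return (out_max - out_min) / (1.0 + np.exp(-(i - beta) / alpha)) + out_min
```

Intensities near β are stretched across the output range while intensities far from β are compressed toward Min or Max, which sharpens the speed image used by the level set.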

4. EVALUATION AND RESULTS

4.1 Data

An image database was created which had 126 photorefraction images. These were selected from videotapes of 29 different experimental sessions involving one of six different monkeys. Images from within a single experiment were selected to show a variety of accommodative states as well as different eye positions. Thirty of these images were used for supervised learning of the edge probabilities. Therefore, the remaining 96 images were used for ground truth comparisons with the different methods.

Comparisons of performance on video sequences were performed on a collection of 10 video sequences which ranged in length from 1300 to 1700 frames. Two of these sequences came from experiments in which rapid accommodative changes were stimulated, while the remaining 8 sequences came from experiments with slower accommodative changes.

4.2 Experiments

To validate the developed algorithms in their ability to properly segment a lens in a still image, an expert established ground truth for the 96 test images in the image database. To do this, lens edge estimates were drawn on each image using a short program which draws a circle based on three selected points. Three acceptable estimates were averaged for each image, returning an average radius, r_GT, and an average circle center, (x_GT, y_GT).

Good segmentations of the lens do not exist as single values of radius and center, but rather as a range of these values. However, poor segmentations do not vary smoothly from good ones and tend to separate from the good ones when graphed. Therefore, graphs were created which plotted positive segmentations against the percentage deviation from ground truth for both radius and circle center. An acceptable range, r_limit and d_limit, was defined for the radius and circle center, respectively.

Each algorithm was run on the same 96 test photorefraction images. The results were then evaluated such that if the distance of either the radius or the center point from ground truth exceeded the acceptable range, then the algorithm result was considered to be a false segmentation. Hence, each segmentation was assigned a score S according to:

S = T, if r_limit ≥ |r − r_GT| and d_limit ≥ √((x − x_GT)² + (y − y_GT)²);
S = F, if r_limit < |r − r_GT| or d_limit < √((x − x_GT)² + (y − y_GT)²).

Results are reported as percentages obtained from the number of true segmentations divided by the total number of images.
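The score S can be written directly (identifier names are ours):

```python
import math

def segmentation_score(r, x, y, r_gt, x_gt, y_gt, r_limit, d_limit):
    """Score S from Section 4.2: True only if both the radius deviation and
    the center deviation from ground truth fall within the acceptable limits."""
    center_dev = math.hypot(x - x_gt, y - y_gt)
    return abs(r - r_gt) <= r_limit and center_dev <= d_limit
```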

The algorithms were also run on 10 of the video sequences described earlier. In this case, the probabilistic edge algorithm was used both with and without the IP algorithm. Additionally, IP parameters were evaluated such that there were three different focus frames and two different window lengths (both defined in Figure 7) for a total of six combinations of parameters.

The radius from each frame was compared with a running average of 21 frames calculated from the same data. Then the differences of all of the radii from the running average were computed as a single root-mean-square (RMS) value for each video sequence. RMS values from all 10 video sequences were reported as a single average for each algorithm being tested.
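The per-sequence noise measure can be sketched as follows. Dropping edge frames without a full 21-frame window is our assumption; the paper does not say how sequence ends were handled:

```python
import numpy as np

def rms_from_running_average(radii, window=21):
    """RMS deviation of per-frame radii from a centered running average
    of `window` frames, computed over frames with a full window."""
    radii = np.asarray(radii, dtype=float)
    avg = np.convolve(radii, np.ones(window) / window, mode="valid")
    h = window // 2
    diff = radii[h:len(radii) - h] - avg  # align frames to window centers
    return float(np.sqrt(np.mean(diff ** 2)))
```

Subtracting the running average removes the slow accommodative signal, so the RMS value reflects frame-to-frame segmentation noise rather than real changes in lens radius.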

In the Level Set method, the user controls the contributions of the various components of the level set equation and the associated preprocessing steps by setting the values of eight constants. No group of settings was found which worked best for all images tested. These settings were adjusted with varying results from image to image (Figure 4).

4.3 Results

The probability densities and associated training parameters for the proposed PE and IP algorithms were estimated from the 30 training images. Probabilities were obtained for each edge in the polar image based on the prior probabilities described in Sections 3.1 and 3.2. Figure 5 shows an example polar image overlaid with edge probability estimates. Edges with higher probabilities are shown to be brighter (Figure 5C). Visual inspection of the probabilities overlaid on the original polar image (Figure 5B) shows that the higher probability edges correspond well with the actual lens edge. Additionally, some regions in the horizontal (ψ) axis show higher probabilities than others, indicating that various portions of a lens edge will contribute more than others to the edge calculation.

Figure 4: Two randomly chosen images of final level set positions on a monkey lens image. Panel A shows a good fit to the monkey lens and panel B shows a boundary that stopped interior to the actual lens edge.

Figure 5: Edge probabilities from a sample polar image (A) are shown overlaid on the same image (B) and on a blank image (C).

Quantitative evaluation was performed by comparing lens segmentation results to ground truth. Following manual segmentation of the 96 test images, each image was segmented using both algorithms, PE and Level Set. Two plots were created to determine the range within which a result would be considered positive relative to ground truth, for the radius and for the center of the lens. In Figure 6A, the acceptable range from the ground truth radius is held constant at a high value of 20% while increasing ranges from the ground truth center position are tested. Conversely, in Figure 6B the acceptable range from the ground truth center position is held at 20% while increasing ranges from the ground truth radius are tested. Based on these graphs, the acceptable limit was defined as being within 6% of the ground truth radius. Segmentations that deviated by less than this amount from both the ground truth radius and the ground truth center were counted as positive matches. Using these limits, the probabilistic algorithm correctly identified the lens in 89.6% of the images, compared to 82.3% for the Level Set method.
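The acceptance test described above can be sketched as a small predicate. Names are illustrative, and two details are assumptions: the center offset is taken as Euclidean distance, and the 6% center tolerance is expressed, like the radius tolerance, as a fraction of the ground-truth radius.

```python
import math

def is_positive_match(seg_center, seg_radius, gt_center, gt_radius, limit=0.06):
    """Count a segmentation as a positive match if both the radius error
    and the center offset fall below `limit` * ground-truth radius."""
    tol = limit * gt_radius
    center_err = math.dist(seg_center, gt_center)  # Euclidean offset (assumed)
    radius_err = abs(seg_radius - gt_radius)
    return center_err < tol and radius_err < tol
```

For example, with a ground-truth radius of 100 pixels the tolerance is 6 pixels, so a segmentation whose center drifts 10 pixels fails even if its radius is exact.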

Figure 6: The percentage of images accepted as correct segmentations, plotted over the size of the range of deviation from ground truth for (A) variable distance to the lens center and (B) variable distance to the lens radius.

The iterative probability algorithm was analyzed with six combinations of parameters as described earlier. RMS values from 10 video sequences are shown in Figure 7A. There was essentially no difference in the variance whether the IP algorithm was run with one, two, or three focus frames. The larger window of 59 frames appeared to produce higher RMS values, but the difference was not significant. However, the RMS values themselves had a higher standard deviation in video sequences of rapidly changing accommodation. Inspection of results from these sequences indicates that a large IP window causes erroneous results in the vicinity of rapid accommodative changes. When data from these video sequences was removed (Figure 7B), there was a slight, non-significant trend toward lower RMS values with more focus frames.

The IP algorithm appears to aid the PE algorithm on disjoint video sequences where the lens position changes dramatically from one frame to the next. The RMS value from the Level Set method was large and variable at 1.9 ± 1.3 pixels, while those of the PE method (0.44 ± 0.12) and the PE method guided by the IP algorithm (0.43 ± 0.13) were lower, more consistent, and nearly identical to each other.

Figure 7: Two parameters were varied as the IP algorithm was used to test the variance of the measured radius over either (A) all 10 video sequences or (B) 8 video sequences of eyes with slower accommodative responses.

Figure 8A shows results from a video sequence in which the eye moved during an interval when the video camera had been turned off and on again. When the probabilistic algorithm was run on this sequence without guidance from the IP algorithm (red), it lost the lens location and wandered from that point on. With the IP algorithm guiding the lead frame, the probabilistic algorithm was able to find the lens again. The IP algorithm minimizes the sine fitting energy functional of the PE algorithm, and Figure 8B shows how the final sine fitting energy increases as the probability of a good fit decreases. Figure 9 compares RMS results from radius measurements for the level set method, the PE method, and the IP method used in combination with the PE method. The IP parameter combination used for this comparison was an IP window of 19 frames with 3 focus frames. The level set algorithm parameters were adjusted at the start of each run.

5. DISCUSSION AND CONCLUSION

In comparison to the probabilistic edge method, the level set method had a lower percentage of correct segmentations on single-frame ground truth images and a much higher variance, as measured by the RMS of radius measurements on video sequences. The level set method as used here applies the same criteria to the whole lens regardless of the edge conditions at any part of the lens, and as such it was vulnerable to varying conditions in different radial directions. As indicated by its reasonably good performance on single ground truth images, it segmented initial video frames correctly when the user was able to find the correct parameters. However, image conditions change during video sequences, and there was no guarantee that conditions in future frames would remain sufficiently similar to those of the first frame for the algorithm to continue finding the edge.

The probabilistic method could find multiple potential edges in any radial direction instead of just one. Each potential edge was assigned a probability, and these probabilities were taken into account in the sine fitting functional. As a result, this method was robust in the presence of random noise, although still vulnerable to competing structures in the image. A second advantage of the probabilistic algorithm was that probabilities could be learned for multiple edge types. In the current implementation, first order edges were one type, while second order edges were categorized into two types depending on the value of the first derivative. Second order edges with a first derivative below zero were never sought as a possible lens boundary; rather, they provided information used to calculate the probabilities of the other two types. Other edge types could be added if this method is extended or applied elsewhere. For example, the contact lens on the eye often had a detectable edge which could have been used to influence the probabilities of true edges. Boundaries can also include changes in other features such as texture.

Figure 8: (A) The PE algorithm was used either alone (red) or in conjunction with the IP algorithm (blue) on a video segment in which the camera had been turned off in the vicinity of frame 1500. (B) The values of the sine fitting energy functional minimized by the IP algorithm, plotted over time.

Figure 9: RMS variance in radius obtained on the 10 video sequences.
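The probability-weighted sine fit described above can be sketched in miniature. The paper's actual energy functional is not reproduced here; this sketch assumes that a lens boundary slightly offset from the polar-transform center traces approximately r(ψ) = c0 + c1·cos ψ + c2·sin ψ in the polar image, and fits that model to candidate edges by weighted least squares, so that low-probability edges contribute little to the fit. All names are illustrative.

```python
import math

def weighted_sine_fit(angles, radii, probs):
    """Weighted least-squares fit of r(psi) = c0 + c1*cos(psi) + c2*sin(psi),
    with each candidate edge (angle, radius) weighted by its probability."""
    # Accumulate the 3x3 weighted normal equations A c = b.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for psi, r, w in zip(angles, radii, probs):
        basis = [1.0, math.cos(psi), math.sin(psi)]
        for i in range(3):
            b[i] += w * basis[i] * r
            for j in range(3):
                A[i][j] += w * basis[i] * basis[j]
    return solve3(A, b)

def solve3(A, b):
    """Gaussian elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for k in range(col, 4):
                M[r][k] -= f * M[col][k]
    c = [0.0] * 3
    for i in range(2, -1, -1):
        c[i] = (M[i][3] - sum(M[i][j] * c[j] for j in range(i + 1, 3))) / M[i][i]
    return c
```

Under this model, a residual-style energy evaluated at the fitted coefficients would rise when no sinusoid explains the weighted edges well, consistent with the behavior shown in Figure 8B.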

Measurements of variance from the IP method showed no significant improvement over single, lead-edge measurements of the probabilistic method. The reasons for this are not clear and call for further investigation. However, the IP method did demonstrate the ability to steer the probability algorithm back to the lens when it was lost on a disjoint video segment. This is an important addition to the overall probabilistic method for reasons of both speed and accuracy. The probabilistic method is typically initialized with the full algorithm and then quickly shifted to a faster version that uses a smaller search window. This smaller search window is inherently faster because of the reduced calculations involved, and it is also less vulnerable to false segmentations because it is already focused on the region passed from the previous frame. The IP method now provides a mechanism to deal with situations where the lens is outside of the search window.

6. REFERENCES

[1] L. D. Cohen and I. Cohen. Finite-element methods for active contour models and balloons for 2-D and 3-D images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(11):1131–1147, November 1993.

[2] B. Das and S. Banerjee. Inertial snake for contour detection in ultrasonography images. IEE Proceedings - Vision, Image and Signal Processing, 151(3):235–240, June 2004.

[3] P. Dollar, Z. Tu, and S. Belongie. Supervised learning of edges and object boundaries. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 2, pages 1964–1971, 2006.

[4] J. H. Elder, A. Krupnik, and L. A. Johnston. Contour grouping with prior models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(6):661–674, 2003.

[5] D. Geman and B. Jedynak. An active testing model for tracking roads in satellite images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(1):1–14, 1996.

[6] W. He, M. Sengupta, V. A. Velkoff, K. A. DeBarros, and U.S. Census Bureau. 65+ in the United States: 2005. U.S. Government Printing Office, Washington, DC, 2005.

[7] A. C. Jalba, M. H. F. Wilkinson, and J. B. T. M. Roerdink. CPM: A deformable model for shape recovery and segmentation based on charged particles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(10):1320–1335, 2004.

[8] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1(4):321–331, 1987.

[9] V. Kolmogorov and Y. Boykov. What metrics can be approximated by geo-cuts, or global optimization of length/area and flux. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV '05), volume 1, pages 564–571, Washington, DC, USA, 2005. IEEE Computer Society.

[10] S. Konishi, A. L. Yuille, J. M. Coughlan, and S. C. Zhu. Statistical edge detection: Learning and evaluating edge cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(1):57–74, 2003.

[11] R. Malladi, J. A. Sethian, and B. C. Vemuri. Shape modeling with front propagation: A level set approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(2):158–175, 1995.

[12] D. R. Martin, C. C. Fowlkes, and J. Malik. Learning to detect natural image boundaries using local brightness, color, and texture cues. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(5):530–549, 2004.

[13] S. Osher and J. A. Sethian. Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations. Journal of Computational Physics, 79(1):12–49, 1988.

[14] M. Sonka, V. Hlavac, and R. Boyle. Image Processing, Analysis, and Machine Vision. Brooks/Cole Publishing Company, Pacific Grove, CA, 2nd edition, 1999.

[15] R. Szeliski, D. Tonnesen, and D. Terzopoulos. Modeling surfaces of arbitrary topology with dynamic particles. In Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 1993.

[16] National Library of Medicine. Insight Segmentation and Registration Toolkit (ITK). http://www.itk.org.

[17] A. Vasilevskiy and K. Siddiqi. Flux maximizing geometric flows. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(12):1565–1578, 2002.

[18] C. Y. Xu and J. L. Prince. Snakes, shapes, and gradient vector flow. IEEE Transactions on Image Processing, 7(3):359–369, March 1998.

[19] Z. Yan, B. J. Matuszewski, L.-K. Shark, and C. J. Moore. A novel medical image segmentation method using dynamic programming. In International Conference on Medical Information Visualisation - BioMedical Visualisation (MediVis 2007), pages 69–74, 2007.

[20] A. L. Yuille and J. M. Coughlan. Fundamental limits of Bayesian inference: Order parameters and phase transitions for road tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(2):160–173, 2000.