
HAL Id: hal-00862122
https://hal-iogs.archives-ouvertes.fr/hal-00862122

Submitted on 16 Sep 2013


Stochastic artificial retinas: algorithm, optoelectronic circuits, and implementation issues
Philippe Lalanne, Donald Prévost, Pierre Chavel

To cite this version:
Philippe Lalanne, Donald Prévost, Pierre Chavel. Stochastic artificial retinas: algorithm, optoelectronic circuits, and implementation issues. Applied Optics, Optical Society of America, 2001, 40 (23), pp. 3861-3876. <hal-00862122>


Stochastic artificial retinas: algorithm, optoelectronic circuits, and implementation issues

Philippe Lalanne, Donald Prevost, and Pierre Chavel

An analogy can be established between image processing and statistical mechanics. Many early- and intermediate-vision tasks such as restoration, image segmentation, and motion detection can be formulated as optimization problems that consist in finding the ground states of an energy function. This approach yields excellent results, but, once it is implemented in conventional sequential workstations, the computational loads are too extensive for practical purposes, and even fast suboptimal optimization approaches are not sufficient. We elaborate on dedicated massively-parallel integrated circuits, called stochastic artificial retinas, that minimize the energy function at a video rate. We consider several components of these artificial retinas: stochastic algorithms for restoration tasks in the presence of discontinuities, dedicated optoelectronic hardware to implement thermal motion by photodetection of speckles, and hybrid architectures that combine optoelectronic, asynchronous-analog, and clocked-digital circuits. © 2001 Optical Society of America

OCIS codes: 100.3020, 100.2080, 100.1160.


1. Introduction

In early and intermediate vision, artificial vision systems are confronted with two major problems. The first problem in machine vision is the sheer amount of input data to be acquired, managed, and processed. The extraordinary volume of data in real-time gray-level images leads to communication bottlenecks in the flow of data among imager, memory, and processor in conventional machines. The second problem arises from the fact that the acquisition system provides incomplete two-dimensional and noisy observations of the three-dimensional scene. The interpretation of the data requires solution of an inverse problem. One has to introduce generic constraints into the problem to force the solution to lie in a subspace of the solution space by incorporating physical a priori information about the scene. In this paper we elaborate on dedicated massively-parallel integrated circuits, called stochastic artificial retinas, that rely on regularization methods to solve the inverse problem through video-rate stochastic minimization of appropriate energy functionals. These retinas can be used as single-circuit sensors dedicated to a specialized application or as modular early-vision modules that are assembled at the board level and that are used as front-end processors for a variety of vision applications.

The authors are with the Laboratoire Charles Fabry de l’Institut d’Optique, Centre National de la Recherche Scientifique, B.P. 147, F-91403 Orsay Cedex, France. P. Lalanne’s e-mail address is [email protected]. Received 27 November 2000; revised manuscript received 7 May 2001.
0003-6935/01/233861-16$15.00/0
© 2001 Optical Society of America

Many problems such as those of edge and motion detection, segmentation, and restoration in early and intermediate vision can be formulated in terms of energy minimization (Ref. 1). The energy function combines a posteriori knowledge of the data acquisition system and some a priori knowledge of image structures. The latter is important since it regularizes the inverse problem by restricting the set of possible solutions. Within the framework of Markov random fields (Ref. 2), a priori and a posteriori knowledge is simply combined through Bayes theory. In general, the Markov random field formulation exhibits two interesting features. First, because only small neighborhood systems are involved for most early- and intermediate-vision tasks, the formulation yields a simple form for the energy function whose gradients involve only local computations. Second, experimental evidence (see, for instance, Refs. 2–4) has shown that, in practice, energy functions with small neighborhood systems can be minimized with manageable, albeit still heavy, computational loads.

The criterion that is most often used for solution of the optimization problem is known as a maximum a posteriori (MAP) estimator. Finding the MAP estimate is clearly a formidable computational task: For an image with $N \times N$ pixels and $G$ gray levels, the number of possible intensity images is $G^{N \times N}$, which rules out any direct search, even for small values of $G$ and $N$. A useful computational approach consists in performing a random exploration of the energy landscape following the Gibbs distribution, i.e., visiting configurations with a probability proportional to the exponential of the negative energy divided by the temperature. Several algorithms such as the Gibbs sampler (Ref. 2) permit such random exploration. The computation of the MAP estimate is provided by annealing (Ref. 5). The temperature is continually decreased, so high-energy configurations are less and less frequently visited during the computation. At the end of the annealing, a deep minimum of the energy landscape is obtained in practice.
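To make the size of the search space and the effect of temperature concrete, here is a minimal Python sketch. The function names and the three-level toy energy list are illustrative assumptions and do not come from the paper; the Gibbs weight is taken as exp(-E/T), the convention of Eq. (12).

```python
import math

def log10_num_images(n: int, g: int) -> float:
    """Return log10 of G**(N*N), the number of N x N images with G gray levels."""
    return n * n * math.log10(g)

def gibbs_weights(energies, temperature):
    """Normalized Gibbs probabilities exp(-E/T)/Z for a short list of energies."""
    e_min = min(energies)  # shift energies so the largest weight is exactly 1
    weights = [math.exp(-(e - e_min) / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

# A 64 x 64 image with 256 gray levels (hypothetical example size)
digits = log10_num_images(64, 256)

# Toy three-state landscape sampled at a high and at a low temperature
probs_hot = gibbs_weights([0.0, 1.0, 5.0], temperature=10.0)
probs_cold = gibbs_weights([0.0, 1.0, 5.0], temperature=0.1)
```

At the high temperature the three states are visited with comparable probabilities, whereas at the low temperature almost all of the probability mass sits on the lowest-energy state, which is the behavior that annealing exploits.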

When the random exploration is implemented in a conventional computer, computational loads are expensive, and suboptimal deterministic algorithms are usually preferred for the MAP estimation. In this paper we focus on very large-scale integrated (VLSI) optoelectronic circuits, hereafter called stochastic artificial retinas, for implementing video-rate minimization in early- and middle-vision tasks by use of stochastic rather than deterministic approaches. More precisely, we try to answer the following two questions: A, Can we design massively-parallel machines that are able to perform video-rate stochastic optimization in early and middle vision? B, How do these machines based on stochastic processing compete with their deterministic counterparts, especially in terms of problem-solving power and complexity of implementation? Several related studies based on deterministic minimization were reported previously; see, for instance, Refs. 6–11. The originality of our approach lies in the deliberate choice to consider stochastic processing. This choice was motivated by the fact that stochastic processing is expected to provide better estimates than deterministic processing over an enlarged set of energy functions or of processing tasks.

Obviously, answering questions A and B is difficult, especially because of the large variety of algorithms and models that are involved in image processing. However, we shall see that a partial answer can be obtained for restricted applications. For several reasons that are discussed below, the classic problem of image restoration appears suitable for studying the feasibility and relevancy of stochastic artificial retinas. Thus in this paper we focus on the so-called compound Gauss–Markov random field models, which can be described by the posterior distribution

$$P(x, l \mid y) = \frac{1}{Z}\exp\left[-E(x, l \mid y)/E_0\right], \tag{1}$$

where

$$E(x, l \mid y) = \|y - Fx\|^2 + \lambda^2\left[\|M_l x\|^2 + V(l)\right], \tag{2}$$

$Z$ and $E_0$ are normalization constants, $y$ is an $N \times N$ matrix that represents sparse and noisy observation data, $x$ is a field with continuous variables (pixel process) defined on a regular $N \times N$ lattice, and $l$ is another field with binary variables that represent the presence or absence of edges (line process). As in many related studies (see, for instance, Refs. 2 and 12), the binary sites are placed midway between the two components of each vertical or horizontal pair of pixels; see Fig. 1 for more details. Notice that sites at or near the boundary have fewer neighbors than interior sites. In what follows, we adopt the natural free-boundary model of Fig. 1, and no periodic boundaries or toroidal lattices are considered. In Eq. (2), $\lambda^2$ is the regularization parameter that controls the degree of regularity of the solution or, equivalently, reflects the confidence in the data. This parameter has to be estimated and is related to $E_0$ and to the variance of the white additive Gaussian noise, which is assumed throughout this paper to be responsible for the noisy observation. The $N^2 \times N^2$ matrix $F$ is sparse when it represents a possible blur of the observed data or can be a more globally interacting projector when middle-vision tasks such as tomographic reconstruction are considered. Although the algorithm contribution presented in this paper is valid for any matrix $F$, we are more concerned in what follows with blurring because blurring involves more-modest neighborhood sizes compatible with parallel processing. $V(l)$ is the so-called potential. It is simply $V(l) = \alpha \sum_i l_i$ when noninteracting discontinuities are considered (the energy cost for setting a discontinuity $l_i$ to 1 is simply $\alpha$ for any $i$) or can take more-complex forms (see, for instance, Refs. 2 and 12 and Fig. 6) when interacting boundaries that exploit certain physical and geometric constraints on discontinuities, such as smoothness and connection features, are taken into account. $M_l$ is a $2N^2 \times N^2$ matrix that represents a smoothing operator that incorporates the line field. With free boundary conditions (as is the case in most early-vision problems), $M_l$ has $2N$ useless null lines that can be removed. For simplicity, the sizes of matrices given in what follows will include null lines or columns (boundary effects are omitted for clarity). Specifically, $M_l$ is defined by

$$\|M_l x\|^2 = \sum_{i \neq kN\ (k = 1, 2, 3, \ldots)}^{N^2} (x_i - x_{i+1})^2 (1 - v_i) + \sum_{i=1}^{N^2 - N} (x_i - x_{i+N})^2 (1 - h_i). \tag{3}$$

Fig. 1. Field structure for $N = 4$: circles, pixel field; lines, line field.


In Eq. (3), lexicographic notation is used for numbering pixel and line sites, and horizontal and vertical line sites are denoted $h$ and $v$, respectively. The energy function of Eq. (2) can be viewed as a generalized form of the classic regularization functional

$$E(x \mid y) = \|y - Fx\|^2 + \lambda^2 \|Mx\|^2, \tag{4}$$

incorporating discontinuities. To that end, note that smoothing operator $M$ in Eq. (4) is replaced by operator $M_l$ in Eq. (2), where $l$ denotes the line-site set that results from the union of $h$ and $v$.

In attempting to answer questions A and B, we find restoration problems that rely on the energy function of Eq. (2) attractive for several reasons. First, inasmuch as it deals explicitly with two different fields with binary and continuous variables, the energy function of Eq. (2) appears to be appropriate for use in examining issues related to the implementation of stochastic artificial retinas. Second, as a generalization of the classic regularization functional of Eq. (4), the energy function can be applied to many problems in early and middle vision that previously had been solved with standard regularization (Ref. 13). Third, the essential problem of preserving discontinuities in image restoration is directly addressed by the energy function of Eq. (2). Finally, although much research has addressed the energy function of Eq. (2), to our knowledge no optimal deterministic algorithm exists for interacting boundaries. For noninteracting discontinuities, one can use nearly optimal fully deterministic algorithms (Refs. 14 and 15) to reduce computational costs dramatically. The fact that deterministic approaches are restricted to optimization problems with noninteracting line processes is relevant for access to unique features of stochastic artificial retinas and for answering question B.
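As an illustration of how the energy of Eq. (2) is assembled from the smoothing term of Eq. (3), the following Python sketch evaluates it for the special case $F = I$ with the noninteracting potential $V(l) = \alpha \sum_i l_i$. The function name, the two-dimensional list layout of the fields, and the 4 × 4 step image are assumptions made for this example only.

```python
def weak_membrane_energy(x, y, h, v, lam, alpha):
    """Energy of Eq. (2) for F = I and V(l) = alpha * (number of active line sites).

    x, y : N x N lists of floats (current estimate and observed data).
    v    : N x (N-1) binary lists; v[r][c] = 1 cuts the bond (r, c)-(r, c+1).
    h    : (N-1) x N binary lists; h[r][c] = 1 cuts the bond (r, c)-(r+1, c).
    """
    n = len(x)
    # Data-fidelity term ||y - x||^2
    data = sum((y[r][c] - x[r][c]) ** 2 for r in range(n) for c in range(n))
    # Smoothing term of Eq. (3): squared differences, switched off by line sites
    smooth = 0.0
    for r in range(n):
        for c in range(n - 1):
            smooth += (x[r][c] - x[r][c + 1]) ** 2 * (1 - v[r][c])
    for r in range(n - 1):
        for c in range(n):
            smooth += (x[r][c] - x[r + 1][c]) ** 2 * (1 - h[r][c])
    # Noninteracting potential: every active line site costs alpha
    potential = alpha * (sum(map(sum, v)) + sum(map(sum, h)))
    return data + lam ** 2 * (smooth + potential)

# Hypothetical 4 x 4 step image of height 50, used as both data and estimate
n = 4
step = [[0.0, 0.0, 50.0, 50.0] for _ in range(n)]
no_v = [[0] * (n - 1) for _ in range(n)]
no_h = [[0] * n for _ in range(n - 1)]
edge_v = [[0, 1, 0] for _ in range(n)]  # line sites placed exactly on the step

e_smooth = weak_membrane_energy(step, step, no_h, no_v, lam=4.0, alpha=500.0)
e_edges = weak_membrane_energy(step, step, no_h, edge_v, lam=4.0, alpha=500.0)
```

With these illustrative values, declaring the four line sites on the step lowers the energy because the squared jump (2500) exceeds the cost $\alpha$ (500) of a discontinuity.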

In Section 2 an optimal stochastic algorithm for finding the ground state of the energy function of Eq. (2) is described. This algorithm combines different kinds of noise source to produce samples of the intensity and line sites. For a fixed temperature, it achieves thermal equilibrium, sampling the posterior distribution of Eq. (1). It includes a simulated annealing procedure for computing MAP estimates. The algorithm preserves local computation, is parallel and, as we show in Section 4 below, is suitable for VLSI implementation. By considering restoration problems in one- and two-dimensional image formats, in Section 3 we analyze the overall performance of the new algorithm. We perform computational tests to evaluate the ability of the algorithm to sample at thermal equilibrium and to provide MAP estimates. Moreover, the restoration performances obtained for noninteracting and interacting line processes are compared. Comparative visual results for artificial and real images are provided for MAP estimates. In Section 4 we analyze some relevant architectures for the implementation of stochastic artificial retinas that provide video-rate global minimization. These retinas are seen as VLSI circuits that incorporate photodetectors illuminated by the image to be processed and by time-varying speckle patterns. The speckles provide random illumination, which, once it is converted into photocurrents, acts as a source of thermal noise for stochastic relaxation. A retinal architecture dedicated to the minimization of the energy function of Eq. (2) is described. A 24 × 24 smart VLSI sensor that uses differential speckle detection to minimize Ising-spinlike energy functions at the video rate has been designed and fabricated. Experimental results for the textbook problem of binary image restoration are presented. Section 5 summarizes the main results obtained throughout the paper and provides a discussion of the capabilities and limitations of stochastic artificial retinas for early vision.

2. Stochastic Algorithm for Compound Gauss–Markov Random Fields

Minimizing the nonconvex energy function of Eq. (2) is not an easy task. The reason for this has to do mainly with the dimension of the problem, which is usually extremely large, and with the existence of many local minima. In the research reported in Ref. 2, the Gibbs sampler algorithm was used to sample the two fields ($x$ and $l$), but experimental results were obtained only with multilevel (not continuous) $x_i$ variables and for a small (no more than four) number of gray levels. Several contributions to generalizing the primary work of Ref. 2 to continuous variables of the intensity field were reported previously. It is not our intention to list all these contributions; we restrict ourselves to a brief overview of those that are strongly related to our research and that are useful for its understanding. All these studies rely on the same basic idea and exploit the semiquadratic form of the energy function: For a given line process configuration, the resultant conditional distribution of the intensity field is a nondegenerate multinormal distribution. In Ref. 16 a mixed-annealing (MA) algorithm is proposed that relies on the facts that the conditional energy functional is convex and that its unique minimum can be found straightforwardly. It uses the Gibbs sampler to update the line process but changes the stochastic relaxation of the intensity field into a deterministic relaxation. The deterministic relaxation removes most of the computational burden, namely, the sampling of the continuous fields, but breaks thermal equilibrium. Thus the MA algorithm does not guarantee global minimization. Generalization of the Gibbs sampler to continuous variables (Refs. 17 and 18) is not straightforward, and some care must be taken in general. For the specific case of Gaussian distributions, relatively simple stochastic algorithms for sampling the $x$ field exist (Refs. 19 and 20). These algorithms use $N^2$ independent Gaussian delta-correlated noise samples for each full iteration of the $x$ field. One iteration corresponds to visiting all pixels and performing a basic sampling step. As for the Gibbs sampler, full iterations must be repeated a great many times to guarantee convergence toward thermal equilibrium. In the research reported in Ref. 10, an algorithm, called global hereafter, is proposed. It is based on an implicit formulation of the $x$ field and restricts the stochastic search to subspace $\{x_l^*, l\}$, where $x_l^*$ is the unique intensity configuration that minimizes the conditional energy functional for a given boundary field $l$. This approach allows global optimization to be performed, but the elimination of the $x$ field introduces long-range interaction and sacrifices the computational benefits of local neighborhood systems in early vision. Thus the global algorithm is not parallel. By exploiting the semiquadraticity of the energy function, Geman and Yang (Ref. 21) proposed to sample the multinormal distribution of the $x$ field directly by using an annealing procedure based on a fast Fourier transform, which provides global minimization.

We introduce an algorithm, called the quasi-static relaxation (QSR) algorithm, that provides MAP estimates for the energy function of Eq. (2). This algorithm, which was reported previously in a preliminary study (Ref. 22), is suitable for massively parallel implementation and is especially efficient when small neighboring structures (sparse matrices $F$) are considered.

A Markov chain $\{[X(k), L(k)],\ k = 0, 1, 2, \ldots\}$ with its associated temperature sequence $\{T_k,\ k = 0, 1, 2, \ldots\}$ is constructed as follows. For each $k$, core procedure H is applied as follows:

Subprocedure H1. Generate $X(k)$ from $L(k - 1)$ by means of the conditional distribution of $x$, given $l = L(k - 1)$.

Subprocedure H2. Generate $L(k)$ from the current configuration $X(k)$ by sampling the boundary process, using the Gibbs sampler algorithm as described in Ref. 2, by repeatedly applying the Gibbs sampler for local boundary replacement.
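For the simplest case of a noninteracting line process, the local replacement in Subprocedure H2 can be sketched as follows. This is an illustrative reading of the Gibbs sampler applied to the energy of Eq. (2) at temperature $T$, with weights exp(-E/T); the function names and the numerical values in the usage lines are assumptions, and interacting potentials such as those of Fig. 6 would add neighbor-dependent terms that are not modeled here.

```python
import math
import random

def line_site_on_probability(x_a, x_b, lam, alpha, temperature):
    """P(l = 1 | x) for one noninteracting line site between pixels x_a and x_b.

    Local energies read from Eq. (2): l = 0 costs lam^2 * (x_a - x_b)^2,
    whereas l = 1 costs lam^2 * alpha.
    """
    delta = lam ** 2 * ((x_a - x_b) ** 2 - alpha)  # E(l = 0) - E(l = 1)
    z = -delta / temperature
    # Logistic function 1 / (1 + exp(z)), arranged so that exp never overflows
    if z > 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

def sample_line_site(x_a, x_b, lam, alpha, temperature, rng):
    """Draw the binary line variable from its local conditional distribution."""
    p_on = line_site_on_probability(x_a, x_b, lam, alpha, temperature)
    return 1 if rng.random() < p_on else 0

rng = random.Random(0)
p_jump = line_site_on_probability(0.0, 50.0, lam=4.5, alpha=150.0, temperature=1.0)
p_flat = line_site_on_probability(10.0, 10.0, lam=4.5, alpha=150.0, temperature=1.0)
draw = sample_line_site(0.0, 50.0, lam=4.5, alpha=150.0, temperature=1.0, rng=rng)
```

Because each update involves only the two pixels adjacent to the line site, many sites can be refreshed simultaneously, which is the locality that the text relies on.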

Subprocedure H1 directly exploits the fact that the energy function of Eq. (2) is semiquadratic and thus that the conditional distribution of $x$ given $l$ is Gaussian, as we discuss in detail below. Subprocedure H2 is more usual; see, for instance, Refs. 2 and 16. In a more general form, $n_l$ full iterations over the line field can be provided, and, for each $k$, subsequences $\{L_q(k),\ q = 1, 2, \ldots, n_l\}$ are generated. Clearly, for a given $X$ configuration, $L_1(k)$ is obtained from $L_{n_l}(k - 1)$, $L_q(k)$ ($q \neq 1$) from $L_{q-1}(k)$, and $L(k) = L_{n_l}(k)$. Although the convergence rate of the core procedure toward thermal equilibrium probably depends on parameter $n_l$, we did not explore this issue. Throughout the paper, the numerical results are obtained for $n_l = 1$. Thus the core procedure consists in visiting once all the pixel and line sites. Because only short-range interactions have to be considered in updating the line process, the basic updating step is rather elementary and does not require complex computations. Moreover, as the associated chromatic number is small, a high degree of parallelism can be implemented. For the MAP estimation, a slowly decreasing temperature sequence is implemented. General results of simulated annealing guarantee convergence in distribution to a measure concentrated over the global minima of $E(x, l \mid y)$ if the cooling schedule is sufficiently slow. However, because we are concerned mostly with practical implementation, we opt for a faster cooling schedule and construct the inhomogeneous Markov chain with temperature $T(k)$ given by

$$T(k) = c\,T(k - 1), \tag{5}$$

where $c < 1$. In the numerical and experimental results reported below, control parameter $c$ was set equal to 0.9. This annealing schedule no longer guarantees convergence to optimal solutions, but it does return nearly optimal solutions for most problems (Ref. 23). At each given temperature, nscan full iterations are performed, where the parameter nscan represents the length of each homogeneous Markov chain. In summary, the QSR algorithm involves the following steps:

1. Set the initial temperature $T(0)$.
2. Provide nscan iterations of the core procedure H at temperature $T(k)$.
3. Set the new temperature according to Eq. (5).
4. Return to step 2 as long as $T(k)$ is larger than the final control temperature.
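The four steps above can be sketched as a short Python driver loop. This is only an outline under stated assumptions: `core_procedure` stands for one pass of procedure H (H1 followed by H2) and is supplied by the caller, and the names `qsr_anneal`, `t_initial`, and `t_final` are hypothetical. The usage lines replace procedure H with a dummy sweep counter so that the schedule itself can be inspected.

```python
def qsr_anneal(state, core_procedure, t_initial, t_final, c=0.9, nscan=10):
    """Geometric annealing driver: nscan sweeps per temperature, T(k) = c * T(k-1)."""
    if not 0.0 < c < 1.0:
        raise ValueError("cooling factor c must satisfy 0 < c < 1")
    if t_final <= 0.0:
        raise ValueError("final temperature must be positive")
    temperature = t_initial                     # step 1
    schedule = []
    while temperature > t_final:                # step 4: loop test
        for _ in range(nscan):                  # step 2: nscan passes of procedure H
            state = core_procedure(state, temperature)
        schedule.append(temperature)
        temperature *= c                        # step 3: Eq. (5)
    return state, schedule

# Dummy core procedure that only counts sweeps (placeholder for H1 + H2)
final_state, temps = qsr_anneal(0, lambda s, t: s + 1,
                                t_initial=100.0, t_final=1.0, c=0.9, nscan=3)
```

With $c = 0.9$, cooling from 100 to 1 takes 44 temperature steps, so the total number of sweeps grows linearly with nscan; this is the trade-off between cooling speed and chain length discussed in the numerical section.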

Let us consider step H1 of procedure H. For a given temperature $T$, the conditional distribution of $x$ given $l$, $P_T(x \mid l)$, is normally distributed. Identifying $P_T(x \mid l)$ with the canonical form $A \exp[-(1/2)(x - m)^t C^{-1}(x - m)]$ of a multinormal distribution, we read mean vector $m$ and covariance matrix $C$ simply as

$$m = [F^t F + \lambda^2 M_l^t M_l]^{-1} F^t y, \tag{6a}$$

$$C = \frac{T}{2}[F^t F + \lambda^2 M_l^t M_l]^{-1}. \tag{6b}$$

We now proceed with generating samples according to the multinormal distribution function defined by Eqs. (6). To do so we solve for $x$ the following set of linear equations,

$$(F^t F + \lambda^2 M_l^t M_l)\,x = F^t y + B\,w_{qs}, \tag{7}$$

where $B$ is a symmetric matrix defined hereafter and $w_{qs}$ is a Wiener vector, i.e., a collection of zero-mean independent Gaussian random numbers of unit variance. Clearly, the average solution for $x$ of Eq. (7) satisfies Eq. (6a). For $x$ to be normally distributed with the covariance matrix of Eq. (6b), $B$ has to satisfy the following fluctuation-dissipation relationship (see Appendix A for more details):

$$B B^t = (T/2)[F^t F + \lambda^2 M_l^t M_l]. \tag{8}$$
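A brief sketch of why Eq. (8) is the required condition may help here. The shorthand $\mathcal{A} = F^t F + \lambda^2 M_l^t M_l$ is introduced only for this derivation and is not notation from the paper; the argument uses nothing beyond Eqs. (6) and (7) and the unit covariance of $w_{qs}$.

```latex
\begin{align*}
x &= \mathcal{A}^{-1} F^t y + \mathcal{A}^{-1} B\, w_{qs},
  \qquad \langle w_{qs} \rangle = 0, \quad \langle w_{qs} w_{qs}^t \rangle = I, \\
\langle x \rangle &= \mathcal{A}^{-1} F^t y = m
  \qquad \text{[Eq. (6a)]}, \\
\langle (x - m)(x - m)^t \rangle
  &= \mathcal{A}^{-1} B \,\langle w_{qs} w_{qs}^t \rangle\, B^t \mathcal{A}^{-1}
   = \mathcal{A}^{-1} B B^t \mathcal{A}^{-1}
  \qquad (\mathcal{A} \text{ is symmetric}), \\
\mathcal{A}^{-1} B B^t \mathcal{A}^{-1} &= C = \tfrac{T}{2}\,\mathcal{A}^{-1}
  \;\Longleftrightarrow\;
  B B^t = \tfrac{T}{2}\,\mathcal{A}
  \qquad \text{[Eq. (8)]}.
\end{align*}
```

Since $x$ is an affine function of the Gaussian vector $w_{qs}$, it is itself Gaussian, so matching the mean and the covariance is sufficient to reproduce the distribution defined by Eqs. (6).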

A. General Method

If $w_{qs}$ is an $M$-component vector with $M \geq N^2$, it is always possible to solve Eq. (8) for the $N^2 \times M$ matrix $B$. For instance, to calculate $B$ for $M = N^2$ we can use the unitary transform matrix $U$, which reduces $[F^t F + \lambda^2 M_l^t M_l]$ to diagonal form,

$$U[F^t F + \lambda^2 M_l^t M_l]U^t = \|k_i(l)\,\delta_{ij}\|, \tag{9}$$

where $k_i(l)$, $i = 1, 2, \ldots, N^2$, are the $N^2$ eigenvalues of $[F^t F + \lambda^2 M_l^t M_l]$ and $\delta_{ij}$ is the Kronecker symbol: $\delta_{ij}$ equals 1 if $i$ equals $j$ and 0 otherwise. As a symmetric matrix that satisfies Eq. (8), $B$ is given by

$$B = \sqrt{T/2}\;U^t\,\|k_i(l)^{1/2}\,\delta_{ij}\|\,U. \tag{10}$$

This approach involves $N^2$ independent standard normal random variables. In Ref. 21 the fast-Fourier-transform-based annealing algorithm exploits the relationship between block circulant matrices and a two-dimensional discrete Fourier transform to effectively reduce the burdensome computation of the eigenproblem. It is valid if toroidal boundary conditions are used and, in our opinion, is highly efficient for tomographic reconstruction problems but is computationally expensive for problems that involve short-range interactions. More specifically, we note that every $b_{ij}(l)$ element of matrix $B$ potentially depends on the whole line process, even for a sparse matrix $F$; the local nature of the field interactions is not reflected in $B$.
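The loss of locality can be checked numerically on a small one-dimensional example. The sketch below assumes NumPy is available, takes $F = I$, and builds the symmetric root through `numpy.linalg.eigh` with the prefactor $\sqrt{T/2}$ so that the product satisfies Eq. (8); the helper names and the 8-sample signal length are illustrative choices, not the authors' implementation.

```python
import numpy as np

def weak_string_matrix(lines, lam):
    """I + lam^2 * M_l^t M_l for a 1-D signal with F = I (a tridiagonal matrix)."""
    n = len(lines) + 1
    m = np.zeros((n - 1, n))
    for i, l in enumerate(lines):
        m[i, i], m[i, i + 1] = (1 - l), -(1 - l)   # row i: (1 - l_i)(x_i - x_{i+1})
    return np.eye(n) + lam ** 2 * m.T @ m

def symmetric_root(a, temperature):
    """Symmetric B with B B^t = (T/2) * a, from the eigendecomposition of a."""
    k, vecs = np.linalg.eigh(a)                    # a = vecs @ diag(k) @ vecs.T
    return np.sqrt(temperature / 2.0) * (vecs * np.sqrt(k)) @ vecs.T

temperature = 2.0
a_no_edge = weak_string_matrix([0, 0, 0, 0, 0, 0, 0], lam=4.5)
a_far_edge = weak_string_matrix([0, 0, 0, 0, 0, 1, 0], lam=4.5)  # edge far from pixel 0
b_no_edge = symmetric_root(a_no_edge, temperature)
b_far_edge = symmetric_root(a_far_edge, temperature)

# Number of significant entries in the sparse operator and in its dense root
nonzero_a = np.count_nonzero(a_no_edge)
nonzero_b = np.count_nonzero(np.abs(b_no_edge) > 1e-9)
# Change of the (0, 1) element of B caused by a line site five bonds away
shift_01 = abs(b_far_edge[0, 1] - b_no_edge[0, 1])
```

In this toy case the tridiagonal operator has 22 nonzero entries whereas its symmetric root is full, and the element of $B$ that couples the first two samples changes when a distant line site is switched on.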

B. Present Method

A more practical solution of Eq. (8) is inspired directly by the form of Eqs. (6). Indeed, it is easy to verify that the $N^2 \times 3N^2$ matrix $B$, given by

$$B = \sqrt{T/2}\,[F^t,\ \lambda M_l^t], \tag{11}$$

satisfies the fluctuation-dissipation relationship of Eq. (8). Thus the companion quasistatic vector $w_{qs}$ required for solving Eq. (7) has $3N^2$ components. The larger dimension of the system is compensated for by the fact that the local structure of interaction with the line process is retained in $B$. In other words, every $b_{ij}(l)$ element depends at most on a single line site, and parallel calculation strategies can be employed to increase effectiveness. Summarizing, to implement step H1 of the procedure in the short-range interaction case, do the following:

1. Generate $3N^2$ independent Gaussian random variables to form vector $w_{qs}$.
2. Form matrix $B$ given by Eq. (11).
3. Solve Eq. (7) for $x$.
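The three steps above can be sketched in pure Python for a one-dimensional signal with $F = I$, in which case the system of Eq. (7) is tridiagonal and the noise vector has $N + (N - 1)$ components instead of $3N^2$. The function name, the use of the Thomas algorithm as the linear solver, and the test signal are assumptions made for illustration; the paper does not prescribe a particular solver.

```python
import math
import random

def sample_intensity_1d(y, lines, lam, temperature, rng):
    """One H1 step for a 1-D signal with F = I: build and solve Eq. (7) for x.

    lines[i] = 1 marks a discontinuity between samples i and i + 1.
    """
    n = len(y)
    keep = [1 - l for l in lines]               # (1 - l_i), one weight per bond
    s = math.sqrt(temperature / 2.0)
    # Step 1: independent unit-variance Gaussian numbers forming w_qs
    w_data = [rng.gauss(0.0, 1.0) for _ in range(n)]       # paired with F^t = I
    w_bond = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]   # paired with lam * M_l^t
    # Step 2: right-hand side F^t y + B w_qs with B = sqrt(T/2) [F^t, lam M_l^t]
    rhs = [y[j] + s * w_data[j] for j in range(n)]
    for i in range(n - 1):
        rhs[i] += s * lam * keep[i] * w_bond[i]
        rhs[i + 1] -= s * lam * keep[i] * w_bond[i]
    # Tridiagonal operator I + lam^2 M_l^t M_l
    diag = [1.0] * n
    off = [-lam ** 2 * k for k in keep]
    for i in range(n - 1):
        diag[i] += lam ** 2 * keep[i]
        diag[i + 1] += lam ** 2 * keep[i]
    # Step 3: solve by forward elimination and back substitution
    for i in range(1, n):
        f = off[i - 1] / diag[i - 1]
        diag[i] -= f * off[i - 1]
        rhs[i] -= f * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (rhs[i] - off[i] * x[i + 1]) / diag[i]
    return x

y_step = [0.0, 0.0, 0.0, 50.0, 50.0, 50.0]
x_cut = sample_intensity_1d(y_step, [0, 0, 1, 0, 0], 4.5, 0.0, random.Random(0))
x_smooth = sample_intensity_1d(y_step, [0, 0, 0, 0, 0], 4.5, 0.0, random.Random(0))
x_noisy = sample_intensity_1d(y_step, [0, 0, 1, 0, 0], 4.5, 9.41, random.Random(0))
```

At zero temperature the noise term vanishes and the routine returns the mean vector of Eq. (6a): the step is reproduced exactly when the line site sits on the jump and is smoothed when no line site is active. At a finite temperature the same solve yields a fluctuating sample around that mean.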

The advantage of using Eq. (7) to update the intensity field lies in the fact that the iterative and burdensome stochastic computation process associated with the intensity field is replaced by a fast and deterministic computation. Basically, the computation requires solving Eq. (7). The largest matrix $B$ used for the computation has dimensions $N^2 \times 3N^2$. But, when blurring operators with a narrow support are considered, the matrix is sparse, and efficient algorithms can be employed for solving the system of Eq. (7).

The QSR and MA algorithms are similar and are of equivalent computational complexity; both algorithms use the Gibbs sampler algorithm for sampling the line field and require solution of a system of $N^2$ equations for sampling the intensity field. The difference between the algorithms lies in the fact that, whereas the MA algorithm samples the intensity field at a null temperature and imposes the condition that $x = m$ at every iteration, the QSR algorithm uses independent Gaussian random variables to restore thermal equilibrium.

3. Numerical Results

In this section we report on several tests performed with the QSR algorithm, focusing on its convergence rate and its ability to achieve thermal equilibrium in the coupled field or to provide MAP estimates. Also, the influence of interacting or noninteracting line processes on the quality of the restoration is discussed.

A. Tests Performed on One-Dimensional Signals

We carried out computer simulations to test whether thermal equilibrium was achieved for the coupled field $(x, l)$. The procedure was as follows: For a given data configuration $y$ and from an arbitrary configuration of the coupled field, the system was slowly annealed (nscan = 50) from a high temperature down to a given temperature $T$. At that temperature, 10,000 configurations $(x, l)$ and their corresponding energy values were collected after each global update of the fields. Then the system was annealed to another temperature, $T'$ ($T' < T$). At that temperature, another set of 10,000 energy samples was collected. Directly testing the Boltzmann–Gibbs distribution of Eq. (1) is not feasible because of the hugeness of the configuration space. Instead, we prefer to deal with the canonical distribution $P(E, T)$ at temperature $T$,

$$P(E, T)\,dE = C_T\,V(E)\exp(-E/T), \tag{12}$$

where $C_T$ is a normalization constant that depends only on temperature and $V(E)$ is the unknown state density of the coupled fields. $P(E, T)\,dE$ is the probability of observing a configuration $(x, l)$ with energy in the interval $[E, E + dE)$ at thermal equilibrium. From the two sets of samples collected at temperatures $T'$ and $T$, the two frequency distributions $P'$ and $P$ of observed populations were computed. $P'\Delta E$ and $P\Delta E$ are the numbers of configurations with energy in the interval $[E, E + \Delta E)$ observed at temperatures $T'$ and $T$, respectively. If the sampling is achieved under thermal equilibrium, as $V(E)$ is independent of $T$, the logarithm of the ratio of the two frequency distributions is given by

$$\ln\left[\frac{P}{P'}\right] = \frac{(T - T')}{TT'}\,E + C(T, T'), \tag{13}$$

where $C = \ln[C_T] - \ln[C_{T'}]$ is a negative constant that depends on $T$ and $T'$. According to Eq. (13), the ratio logarithm should vary linearly with energy; a slope $m = (T - T')/TT'$ is expected.
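The linearity test of Eq. (13) amounts to a least-squares fit of the log ratio of two energy histograms. The Python sketch below shows the computation on synthetic, noise-free populations; the function names, the bin centers, and the cubic stand-in for the unknown state density are assumptions for illustration, and the two temperatures are those quoted for the weak-string test.

```python
import math

def expected_slope(t_high, t_low):
    """Slope (T - T') / (T T') predicted by Eq. (13)."""
    return (t_high - t_low) / (t_high * t_low)

def log_ratio_fit(bin_centers, counts_t, counts_tp):
    """Least-squares slope and intercept of ln(P/P') versus energy.

    Only bins populated at both temperatures (the overlapping region) are used.
    """
    pts = [(e, math.log(p / q))
           for e, p, q in zip(bin_centers, counts_t, counts_tp) if p > 0 and q > 0]
    n = len(pts)
    mean_e = sum(e for e, _ in pts) / n
    mean_r = sum(r for _, r in pts) / n
    sxx = sum((e - mean_e) ** 2 for e, _ in pts)
    sxy = sum((e - mean_e) * (r - mean_r) for e, r in pts)
    slope = sxy / sxx
    return slope, mean_r - slope * mean_e

t_high, t_low = 9.41, 8.47
energies = [60.0 + 2.0 * i for i in range(20)]
density = [(e - 50.0) ** 3 for e in energies]          # arbitrary stand-in for V(E)
pop_high = [d * math.exp(-e / t_high) for d, e in zip(density, energies)]
pop_low = [d * math.exp(-e / t_low) for d, e in zip(density, energies)]

slope, intercept = log_ratio_fit(energies, pop_high, pop_low)
```

Because the state density cancels in the ratio, the fitted slope equals $(T - T')/TT'$ whatever density is chosen. The synthetic populations are left unnormalized, so the fitted intercept is zero here rather than the negative constant $C(T, T')$ of Eq. (13).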

For illustration, we consider the weak-string problem defined by the one-dimensional signal shown in Fig. 2. Uncorrelated Gaussian noise with a standard deviation $\sigma$ of 16 is added to form the 128 × 1 input data $y$. No blurring is considered, and matrix $F$ is equal to identity matrix $I$. The parameters of the energy function are $\lambda = 4.5$ and $\alpha = 150$. In Fig. 3, the logarithm $\ln(P/P')$ is plotted versus energy $E$ in the overlapping region of the two histograms. Numerical data correspond to black rectangles, and the vertical bars are the uncertainties as estimated from the population’s square root. Results were gathered for temperatures $T = 9.41$ and $T' = 8.47$ ($m = 0.0118$). The best least-squares line from the data has a slope equal to 0.012 ± 0.004 and a negative intercept of −13.7 ± 0.1 at the origin. Simulation results and theoretical prediction are in good agreement, with departures from the theoretical curve only at both ends, where frequencies are weak. This result clearly illustrates the achievement of thermal equilibrium for the coupled field. In Section 4 below, we apply this test to experimental data obtained with a fully parallel analog–digital machine. The achievement of thermal equilibrium is important in practice, especially if estimates other than the MAP estimate, such as the maximum of the posterior marginals or the thresholded posterior mean, are chosen (Ref. 24).

Fig. 2. 128 × 1 weak string.

Fig. 3. Numerical test performed on the weak-string problem. At each temperature, 10,000 configurations $(x, l)$ are collected. The temperatures are $T = 9.41$ and $T' = 8.47$. Vertical bars are statistical uncertainties.

When a fast cooling schedule such as that of Eq. (5) is used, thermal equilibrium is not perfectly achieved at each temperature $T(k)$. It is intuitively clear that large decrements in $T(k)$ will require longer homogeneous Markov chain lengths to be able to restore quasi equilibrium at the next temperature $T(k + 1)$. Thus there is a trade-off between large decrements and small values of nscan. Although this assertion can be stated in more-mathematical terms (Ref. 25), we do not pursue the details. To gain a quantitative insight into the influence of parameter nscan on the convergence performance of the QSR algorithm, we proceed as follows: From the signal defined in Fig. 2, a set of 50 input data $y$ is generated with independent instances of white Gaussian noise, all with a standard deviation of 16. For every instance of noise and for a given value of nscan, the QSR algorithm is implemented. The mean values of the energy of the 50 configurations obtained at the end of the annealing are listed in Table 1. As we expect, the mean values decrease with increasing values of nscan. For comparison, Table 1 also provides the mean values observed with the MA and the global algorithms. We note that the MA algorithm is clearly not optimal and provides high mean energy values. The lowest mean energy value is obtained for the global algorithm and an nscan of 50. Although this algorithm performs well, we must note that it is extremely demanding in terms of computational loads. Because the global algorithm restricts exploration of the configuration space to the set of all the local minima associated with each state of the line process, we need to solve Eq. (6a) for $m$ for every component of the $x$ vector, i.e., $N^2$ times for each full iteration. For comparison, note that only one inversion per full iteration is required for the QSR algorithm, whatever the number of pixels is.

Table 1. Comparison of Quality of the MAP Estimation for the Weak String of Fig. 2 for Various Algorithms and for Various Values of Parameter nscan (a)

Algorithm (nscan)    Mean Energy
QSR (1)              90.1
QSR (10)             80.5
QSR (50)             78.0
QSR (100)            77.5
MA (10)              217.9
MA (50)              218.6
Global (5)           77.2
Global (10)          77.2
Global (100)         77.0

(a) The mean energy is obtained by averaging over 50 instances of white Gaussian noise.

B. Reconstruction Problems with Two-Dimensional Images

In Subsection 3.A we focused on the weak string as a one-dimensional discontinuity-detecting filter. In this subsection we examine its two-dimensional equivalent, the weak membrane. To qualitatively and quantitatively study the potentialities of the QSR algorithm, we present the results of several numerical computations. The computations were all conducted without any blurring in the image ($F = I$).

For the first experiment, the 64 × 64 input image shown in Fig. 4(a) was considered. This image was synthesized by addition of white Gaussian noise ($\mu = 0$, $\sigma = 65$) to an original synthetic image formed by a linear ramp with a 39 × 39 square base. The minimum and maximum step heights on the left and the right sides of the original image are 50 and 250, respectively. Figure 4(b) shows a restored image obtained with the QSR algorithm. We delineate the outcome of the line process by painting white any pixel sides at a discontinuity. For the numerical computation, no interaction in the line field was considered, and the free parameters ($\lambda$, $\alpha$) were chosen to equal (4, 500). Basically, the restored image exhibits a large smooth area that corresponds to the square base of the original image. However, we note that there are several inclusions at the pixel size level and that the small step on the left side is badly restored. Although the degradation that is due to white noise is severe and no interaction in the line process is used for the restoration, no ending appears in the line process of Fig. 4(b), and closed regions in the image are obtained. This tendency to form unbroken lines without any need to impose additional cost on line endings is an intrinsic property of membrane elasticity that in Ref. 14 is termed hysteresis. It is worth mentioning that, despite the hysteresis, in other experiments performed with the weak membrane the restored image did not always exhibit closed line processes. In particular, the upper left-hand corner of the ramp was smooth in several observations.

For the second experiment we considered the noisy image of Fig. 5(a). This synthetic image was obtained by addition of white Gaussian noise ($\mu = 0$, $\sigma = 65$) to an original synthetic image, a step with a $39 \times 39$ square base. The step height is 50. Figure 5(b) shows the restored image obtained with the QSR algorithm for $(\lambda, \alpha) = (3, 250)$ and $n_{\mathrm{scan}} = 300$. Clearly, the restoration quality is poor, but the values (3, 250) of the free parameters are the best values that we could find in several trials. Figure 5(c) shows the restored image obtained with the QSR algorithm for the interacting line process shown in Fig. 6 and for $\alpha = 300$. Basically, the potential favors turns and continuations and penalizes endings. We note that, except for some minor inclusions of one-pixel size, the step is approximately reconstructed. The net benefit of regularization by explicit incorporation of an interacting line process was also observed for real images with or without blurring. To achieve a more-accurate reconstruction, we tried other potentials. As was proposed in Ref. 26, some line configurations, such as ending boundary placements, were forbidden by constrained optimization. We could not achieve a better restoration quality. In our opinion, for the image of Fig. 5(a), better reconstructions could be achieved with higher-order models of the line process that would involve larger neighborhood structures. We did not explore this method of achieving reconstructions because it might be difficult to implement in dedicated circuitry.

Fig. 4. $64 \times 64$ weak membrane. (a) Hand-drawn ramp image corrupted by white Gaussian noise ($\mu = 0$, $\sigma = 65$). (b) Restored image obtained without interaction of the line process ($\lambda = 4$, $\alpha = 500$, $n_{\mathrm{scan}} = 300$).

Fig. 5. $64 \times 64$ weak membrane. (a) Hand-drawn step image corrupted by white Gaussian noise ($\mu = 0$, $\sigma = 65$). (b) Restored image obtained without interaction of the line process ($\lambda = 3$, $\alpha = 250$, $n_{\mathrm{scan}} = 300$). (c) Restored image obtained with $\lambda = 3$, $\alpha = 250$, $n_{\mathrm{scan}} = 300$, and the potentials of Fig. 6.

Fig. 6. Numerical values of potentials $V_{ij}$ used for the interacting line process. Crosses, nodes; vertical and horizontal lines, existing edges. Rotational invariance is assumed.

4. Implementation of Stochastic Artificial Retinas

Here we discuss architecture and hardware for implementing stochastic artificial retinas. The following discussion is restricted to early-vision tasks with reasonably local neighborhood systems, for which communication requirements are small and a high degree of parallelism can be achieved by suitable coloring of the sites. These retinas provide video-rate stochastic relaxation operations with which nearly optimal estimates can be computed. The retinas are silicon chips built from a mesh of processing elements (PEs), each with its own photodetectors and some computational abilities. The latter are rather weak, but the computational power comes from all the PEs working in parallel. Clearly, the implementation of massive parallelism is not obvious from a practical point of view because the silicon area required per PE quickly becomes too large. Typically, a $256 \times 256$ regular array of PEs could be implemented on a 1-cm² chip in 1-μm complementary metal-oxide semiconductor (CMOS) technology, provided that each PE contains fewer than 30 transistors; this of course is a very small size, which can be achieved only if analog operations are implemented.

A. Parallel Generation of Massive Amounts of Random Numbers

Ideally, a stochastic artificial retina that involves $256 \times 256$ PEs working in parallel at a global chip frequency of 1 MHz requires approximately $65 \times 10^9$ independent random numbers each second. Clearly, such a huge rate deserves particular study. Although noise often appears naturally in systems, paradoxically, when one attempts to use this noise to generate random numbers it is often difficult to obtain sufficiently good statistical properties. Several techniques that exploit natural noise sources (Refs. 27–30) or pseudorandom-number generators (Refs. 31 and 32) are compatible with these requirements.
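The rate quoted above is simple arithmetic. A minimal check in Python, assuming that every PE consumes exactly one random number per clock cycle (the variable names are illustrative only):

```python
# Assumption: one random number per PE per clock cycle.
n_pes = 256 * 256          # processing elements in the array
clock_hz = 1_000_000       # global chip frequency, 1 MHz
rate = n_pes * clock_hz    # random numbers required per second
print(f"{rate:.2e} random numbers per second")
```

The result, about $6.6 \times 10^{10}$, matches the figure of approximately $65 \times 10^9$ given in the text.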

Our approach relies on speckle. Speckle is a natural noise of coherent light that is observable when the random complex amplitudes of many coherent scatterers are added. The resultant interference pattern is recognizable by its random granularity, which can be described by statistical means (Ref. 33). In a preliminary study (Ref. 34) we argued in favor of the use of speckle created by step-index fibers. By computing values as low as $10^{-3}$ for some first-order autocorrelation values and by successfully implementing statistical tests of independence, we concluded that fiber speckle offers good statistical properties of space and time independence. More recently, the differential detection of speckles implemented by a phototransistor pair was shown (Ref. 35) to be an efficient tool for implementing zero-mean Gaussian random current sources for VLSI silicon circuits. Analytical and experimental evidence shows that the photocurrent's standard deviation $\sigma$ is proportional to the mean illumination $\langle I \rangle$ of each phototransistor:

$$\sigma = A \langle I \rangle, \tag{14}$$

and thus one can set it freely by controlling the laser power. In Eq. (14), proportionality factor $A$ depends only on the average number of speckles incident onto each phototransistor and is fixed by the experimental setup. Although it exploits the analog nature of speckle statistics, the differential detection is robust and accurate; deviations from Gaussian laws are less than 1%.
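The proportionality of Eq. (14) can be illustrated with a small Monte Carlo sketch. It rests on an assumption that is not taken from the paper: each phototransistor is modeled as integrating `m_speckles` independent speckle grains, so its intensity is gamma distributed with shape `m_speckles`. Under that model the factor $A$ is $\sqrt{2/m}$; the function names are hypothetical.

```python
import random
import statistics

def differential_photocurrent(mean_i, m_speckles, rng):
    """One sample of the difference between two integrated speckle intensities.

    Assumption: each detector integrates m_speckles independent grains, so its
    intensity is gamma distributed with shape m_speckles and mean mean_i.
    """
    scale = mean_i / m_speckles
    return rng.gammavariate(m_speckles, scale) - rng.gammavariate(m_speckles, scale)

def measured_sigma(mean_i, m_speckles, n=50_000, seed=1):
    """Standard deviation of the differential signal, estimated from n samples."""
    rng = random.Random(seed)
    samples = [differential_photocurrent(mean_i, m_speckles, rng) for _ in range(n)]
    return statistics.pstdev(samples)

m = 20
for mean_i in (1.0, 2.0, 4.0):
    sigma = measured_sigma(mean_i, m)
    # The ratio sigma / <I> should stay close to sqrt(2 / m) for every laser power.
    print(mean_i, round(sigma / mean_i, 3), round((2 / m) ** 0.5, 3))
```

Doubling the mean illumination doubles the standard deviation, which is the mechanism used in the text to set the noise level through the laser power.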

The quasi-Gaussian laws obtained by the differential detection of speckles can be exploited for sampling binary variables. As has been explained above, binary variables are important in early and middle vision because they may encode the presence or absence of discontinuities or may label sites in motion or at rest, for instance. We denote these variables, with on–off states $\{-1, +1\}$, by $S_i$, $i = 1, \ldots, N$. The local site replacement used repeatedly for sampling (heat-bath criterion) is

$$p(S_i = 1) = \frac{1}{1 + \exp(F_i / T)}, \tag{15}$$

where force $F_i$ is the energy difference between the on and off states: $F_i = E(S_i = 1 \mid S_r = s_r, r \neq i) - E(S_i = -1 \mid S_r = s_r, r \neq i)$.

The implementation of the heat-bath criterion with speckles (Ref. 36) relies on the fact that the cumulative distribution function of a Gaussian, the error function, never differs by more than 1% from the sigmoid probability function of Eq. (15). A possible implementation with a comparator is shown in Fig. 7. Four currents are injected into the comparator: the two photocurrents generated by the two phototransistors, which are illuminated by independent speckles, and two other currents, $F_i^+$ and $F_i^-$, related to the force by $F_i = F_i^+ - F_i^-$ and typically representing the positive and negative contributions in $F$. The stochastic PE depicted in Fig. 7 was integrated in the same CMOS technology as that used in the study reported in Ref. 35. Under time-varying speckle illumination, its operation was tested, and we computed the probability that the output voltage $S_i$ equals 5 V, $p(S_i = 1)$, by averaging for different values of the photocurrents encoding the force. Experimental results (circles) are shown in Fig. 8 and are compared with those for the sigmoid heat bath (solid curve). Excellent agreement is obtained. Simple dimensional analysis shows that temperature $T$ is proportional to the average speckle illumination. Thus we can control temperature by changing the laser power. To avoid any confusion, let us note that the S-shaped response shown in Fig. 8 is due only to the probabilistic updating and not to any nonlinear response of the comparator. In the absence of speckle, a Heaviside function is obtained instead of the S-shaped response (null temperature).

Fig. 7. Clipped differential detection of speckle with a dual-rail comparator; $F_i = F_i^+ - F_i^-$.
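The closeness of the Gaussian cumulative distribution to the sigmoid of Eq. (15) can be checked numerically. The sketch below is illustrative only: the ratio `SCALE` between the noise standard deviation and the temperature is an assumed constant (1.702 is a commonly used logistic-to-Gaussian matching value, not a number from the paper), and the comparator is idealized as a pure threshold on zero-mean Gaussian noise.

```python
import math
import random

SCALE = 1.702  # assumed ratio sigma / T; not taken from the paper

def heat_bath(force, temperature):
    """Eq. (15): probability that the spin is set to +1."""
    return 1.0 / (1.0 + math.exp(force / temperature))

def gaussian_threshold(force, sigma):
    """P(noise > force) for zero-mean Gaussian noise with standard deviation sigma."""
    return 0.5 * (1.0 - math.erf(force / (sigma * math.sqrt(2.0))))

def max_deviation(temperature=1.0, span=10.0, steps=4001):
    """Largest gap between the sigmoid and the Gaussian rule on a grid of forces."""
    worst = 0.0
    for k in range(steps):
        f = -span + 2.0 * span * k / (steps - 1)
        gap = abs(heat_bath(f, temperature) - gaussian_threshold(f, SCALE * temperature))
        worst = max(worst, gap)
    return worst

def comparator_estimate(force, temperature, n=20_000, seed=0):
    """Monte Carlo estimate of p(S_i = 1) for an idealized noisy comparator."""
    rng = random.Random(seed)
    sigma = SCALE * temperature
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, sigma) > force)
    return hits / n

print(max_deviation())                                   # stays below 0.01
print(comparator_estimate(1.0, 1.0), heat_bath(1.0, 1.0))
```

Setting `sigma` to zero in this picture turns the comparator into a Heaviside step, which corresponds to the null-temperature behavior described above.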

B. Stochastic Artificial Retinas for Implementing Ising Spin Models

To gain a more-quantitative insight into hardware problems relative to the implementation of stochastic artificial retinas, we built a prototype that is able to perform stochastic processing of spin-glass models (Ref. 37). This choice was also motivated by the fact that spin-glass models exhibit a strong analogy with some simple early-vision tasks (Refs. 24 and 38). According to the analogy, the binary pixels $S_i$ are called spins, and the energy $E(S)$ of a given configuration $S$ is

$$E(S) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1,\, j \neq i}^{N} J_{ij} S_i S_j - \sum_{i=1}^{N} h_i S_i. \tag{16}$$

In Eq. (16), $h_i$ denotes an external local field and $J_{ij}$ is the interaction coefficient between spins $i$ and $j$. We consider a two-dimensional spin-glass model arranged in an image format, with symmetrical ($J_{ij} = J_{ji}$) and binary ($\pm 1$) interaction coefficients. Only horizontal and vertical nearest neighbors are considered; i.e., the system is a first-order Markov field in the i and j directions.
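For the nearest-neighbor model of Eq. (16), the force of Eq. (15) reduces to a local sum, $F_i = -2(\sum_j J_{ij} S_j + h_i)$, because only the terms that contain $S_i$ change when spin $i$ is flipped. The following sketch computes both quantities on a small grid. The data layout (one dictionary entry per bond), the free boundary conditions, and the function names are assumptions made for illustration.

```python
import random

def random_bonds(n, rng, p_plus=0.5):
    """Binary couplings J = +1 or -1 on the horizontal and vertical bonds of an n x n grid."""
    bonds = {}
    for r in range(n):
        for c in range(n):
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < n and cc < n:
                    bonds[((r, c), (rr, cc))] = 1 if rng.random() < p_plus else -1
    return bonds

def energy(spins, bonds, field):
    """Eq. (16); each bond is stored once, so the factor 1/2 cancels the double count."""
    e = 0.0
    for ((r1, c1), (r2, c2)), j in bonds.items():
        e -= j * spins[r1][c1] * spins[r2][c2]
    for r, row in enumerate(spins):
        for c, s in enumerate(row):
            e -= field[r][c] * s
    return e

def force(spins, bonds, field, r, c):
    """F_i = E(S_i = +1) - E(S_i = -1) = -2 * (sum_j J_ij S_j + h_i)."""
    n = len(spins)
    local = field[r][c]
    for rr, cc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
        if 0 <= rr < n and 0 <= cc < n:
            key = tuple(sorted(((r, c), (rr, cc))))  # bonds are keyed smaller site first
            local += bonds[key] * spins[rr][cc]
    return -2.0 * local

rng = random.Random(0)
n = 4
bonds = random_bonds(n, rng, p_plus=0.87)
spins = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]
field = [[0.0] * n for _ in range(n)]
print(energy(spins, bonds, field), force(spins, bonds, field, 1, 1))
```

Because the force at a site depends only on its four neighbors, all sites of one checkerboard color can be updated at the same time without affecting one another.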

The system that we implemented combines speckle and electronic circuitry. It is composed of two main devices: A silicon integrated circuit is in charge of the parallel computation of the energy gradients (forces) involved in the minimization, and a speckle random-number generator provides the circuit with $10^4$ spatially and temporally independent illuminations each microsecond. Following the analogy with statistical mechanics, the integrated circuit alone implements the time evolution of the spin system at a null temperature, whereas the speckle illuminations, converted into photocurrents, act as sources of thermal noise. The temperature is read as the amount of randomness provided by the speckle photocurrents, and we obtain annealing simply by decreasing the laser power.

Fig. 8. Circles and solid curve, experimental data and the fitted heat-bath criterion, respectively. Every circle was estimated over 100,000 collected independent measurements.

The integrated circuit implemented with 1-μm CMOS digital technology is composed of a $24 \times 24$ array of identical PEs. The total circuit area, including connection pads, is 8 mm × 8 mm. Every PE includes an analog and a digital block. The digital block of the $i$th PE stores the spin value $S_i$ and two of the four coupling coefficients $J_{ij}$ in three static memory points. These three static memory points are incorporated into a horizontal shift register, permitting reading and writing of the spin configurations and $J_{ij}$ coefficients. The four bipolar products $J_{ij} S_j$ involved in the computation of the force $F_i$ are implemented with logical XOR gates. The analog block is based on a dual-rail architecture for minimizing on-chip dispersions. This architecture fits nicely with the differential detection of speckles. We compute forces $F_i^+$ and $F_i^-$ by adding the positive and negative bipolar products $J_{ij} S_j$ through current summing at the nodes of the plus and minus rails, respectively. Further details of the electronic implementation can be found in Ref. 39. A spatially uniform external field can be implemented by injection of a global current into either the plus or the minus rail of every PE, whereas any external field configuration can be simulated by imaging of a gray-level mask illuminated by an incoherent source onto the integrated circuit. According to the sign of $h_i$, the light that issues from a square subdomain of the mask is directed so as to illuminate the phototransistor connected to the positive or to the negative rail. Finally, the photocurrents generated by the speckle illumination of the two phototransistors are injected into their associated rails. The same phototransistors are used for the optical inputs of the speckle and of the nonuniform external field. We update spin $i$ according to the heat-bath criterion by latching the voltage that results from comparison of the two rail currents into the $S_i$ memory point. We implement the speckle illumination of the integrated circuit by imaging the exit face of a multimode fiber onto the circuit. The rotation speed of a diffuser inserted between the laser diode and the fiber input face allows the time correlation of the speckle illuminations to be controlled. This duration is adjusted to the inverse of the main clock frequency of the chip, which is used for updating the black and the white spin sites alternately. The whole system is controlled by a personal computer, enabling one to read and write spin values, to write interaction-coefficient values, and to set the global and local external fields, the laser power, and the clock signals.



Many local tests were performed with the prototype to validate the digital–analog operation. We also performed global tests to study the influence of inaccuracies in analog computation on the overall system performance, for instance, on the ability to reach deep minima of the energy by annealing. Two such tests are now described.

The first experimental test evaluates the ability of the implemented system to sample at thermal equilibrium. The $J_{ij}$ coefficients were randomly set to 1 with a probability equal to 0.87, and no external field was applied (spin-glass model without a magnetic field). As the configuration space is very large ($2^{576}$ configurations), the test was performed on the canonical distribution; see Subsection 3.A. For two values of the laser power corresponding to two temperatures, $T$ and $T'$, statistics were collected during time evolution of the system. Figure 9(a) shows the two histograms obtained by collection of 10,000 configurations. In the overlapping region of the two histograms ($E \in [-520; -350]$), the ratio of the histograms is computed, and its logarithm is plotted as a function of energy [Fig. 9(b)]. Because of spatial dispersions that are due to the analog implementation, temperature is defined only locally, at every spin site. For a fixed laser power, the observation of the probability that spin $i$ takes the value 1 as a function of $F_i$ results in a stochastic updating function that differs slightly from the heat-bath criterion. By fitting the observed and expected probabilities, we derive the local temperature $T_i$ associated with spin $i$. The global temperature is defined by averaging over all sites. For the two histograms, the global temperatures $T$ and $T'$ are 3.02 and 2.68, respectively. The slope $m = (T - T')/TT'$ is thus 0.042. In Fig. 9(b) the solid line represents the least-squares fit of slope $m$ to the experimental data. Clearly, a deviation exists. In fact, the best least-squares line fitted from the experimental data has a slope $m' = 0.82\,m$. This deviation is due to inaccuracies of the analog implementation, as was confirmed by numerical simulation. In general, it was found that this test is sensitive and reveals slight deviations from thermal equilibrium. This conclusion also holds for numerical simulations, and in our opinion this test is a good candidate procedure for monitoring the decreasing rate of temperature in annealing.

Fig. 9. Test of thermal equilibrium with the implemented system. (a) Experimental data obtained by collection of 10,000 energies at temperatures $T = 3.02$ and $T' = 2.68$. (b) Logarithm of the ratio $P/P'$ as a function of energy. Circles, experimental data. The slope of the solid line is $(T - T')/TT'$.

Fig. 10. Experimental results. (a) $24 \times 24$ synthetic image corrupted by white channel noise; the original image is a letter L. (b) Restoration with speckle (annealing). (c) Restoration without speckle (gradient descent).
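The expected slope of the first test follows directly from the canonical energy distribution. As a short worked derivation, with $g(E)$ denoting the density of states and $Z$, $Z'$ the partition functions at the two temperatures (notation introduced here for illustration only):

```latex
P(E) = \frac{g(E)}{Z}\, e^{-E/T}, \qquad
P'(E) = \frac{g(E)}{Z'}\, e^{-E/T'}
\quad\Longrightarrow\quad
\ln \frac{P(E)}{P'(E)}
  = \left( \frac{1}{T'} - \frac{1}{T} \right) E + \ln \frac{Z'}{Z}
  = \frac{T - T'}{T T'}\, E + \text{const},
\qquad
m = \frac{3.02 - 2.68}{3.02 \times 2.68} \approx 0.042,
\qquad
m' = 0.82\, m \approx 0.034 .
```

The density of states cancels in the ratio, which is why the test can be run on energy histograms alone without enumerating the configuration space.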

The second test concerns a binary-image restoration problem. The interaction coefficients are all equal to 1. Experimental results are shown in Fig. 10. Figure 10(a) shows a letter L degraded with 25% white channel noise. In the system, this noisy image represents the external field and is encoded through a binary-level mask that is imaged onto the integrated circuit. The restored image, shown in Fig. 10(b), is obtained by annealing in the presence of speckle. Figure 10(c) represents a typical result obtained without speckle. In this case the system is operating at a null temperature and implements a gradient descent algorithm. In a few iterations, it is trapped in a local minimum that depends on the initial configuration. Comparison with numerical results obtained with a personal computer indicates that, despite the inaccuracies of the analog implementation, the system has good overall performance and provides deep minima. For instance, the average energy value obtained over 50 annealings with the implemented system is only 19 larger than that found by numerical computations performed on a personal computer. For comparison, the energies found with a gradient descent algorithm are on the average approximately 200 larger.

For the design of the silicon chip our primary goal was to develop a robust prototype incorporating several test modes to better evaluate the performance of the stochastic processing and not to beat records in terms of integration density or computational speed. With respect to PE integration density, there is room for improvement in both the logical and the analog blocks. For a silicon VLSI chip dedicated to the restoration of binary images, the $J_{ij}$ coefficients need not be reconfigurable, and no global external field has to be implemented. It is reasonable to expect that a moderately risky design of a 1-cm × 1-cm silicon chip fabricated with the same technology will incorporate approximately $100 \times 100$ PEs. With up-to-date technologies, this density can be further increased. However, note that the phototransistor area (at present 30 μm × 30 μm) does not scale down well with decreasing resolution. Further integration will therefore imply other approaches such as the use of focal plane arrays to concentrate light onto the active areas. At present, the global clock frequency of the chip is limited by the phototransistor's cutoff frequency (≈250 kHz), and there is a full iteration over all black and white spins every 10 μs. Thus video-rate annealings are possible if no more than 2000 full iterations are performed. This is not the ultimate limit. In our opinion, increases of 1 or 2 orders of magnitude in the phototransistor response speed can be achieved with a slightly different design. A 1-MHz global clock frequency seems a reasonable prediction for stochastic machines composed of an array of PEs operating on binary variables with local neighborhood systems.

C. Architecture for Image Restoration Problems

In this subsection we elaborate on stochastic artificial retinas for video-rate implementation of the QSR algorithm. A hybrid parallel architecture that comprises analog, digital, and optoelectronic circuits is proposed for sampling the $x$ and $l$ fields under thermal equilibrium. Of course, as we are concerned with VLSI silicon circuits, only short-range neighborhood systems are considered. For simplicity we restrict the following discussion to early-vision tasks for which matrix $F$ is identity matrix $I$. Our approach is directly inspired by the well-known fact that, according to Maxwell's heat theorem, the steady state of an electrical network composed of sources and linear resistances is the global minimum of a convex quadratic form (Ref. 40). The mapping between analog networks and the solution of variational problems has been extensively studied in the context of regularization in early vision. Analog linear networks (Ref. 1) are seen as a natural computational model for finding the global minimum of the classic regularization functional of Eq. (4). For solving nonconvex variational problems, nonlinear networks have to be considered. In Refs. 6 and 24 the use of a hybrid architecture that relies on a sequence of alternate probabilistic and deterministic steps was proposed for implementing the MA algorithm. In Ref. 10, the implementation of the global algorithm was discussed by use of basically the same hybrid architecture made from a grid of digital PEs interacting with a linear neural network. In Refs. 41–43, resistor-with-fuse networks are proposed as a means for minimizing the energy function of Eq. (2) for noninteracting boundaries. By tuning of the voltage control of the fuse resistances, a minimization is performed through a sequence of convex functionals that are free of spurious local minima. Clearly, the architecture presented herein is directly inspired by those related papers. Its originality comes from the fact that rigorous stochastic relaxation schemes are investigated. For this, we introduce the concept of resistive networks corrupted by Gaussian noise. This noise is called quasi-static to emphasize the fact that its temporal correlation length is much larger than the relaxation time of the network. This situation has to be contrasted with that of the well-known Johnson noise, which automatically generates voltage fluctuations in an electric resistance. As was explained by Nyquist (Ref. 44), Johnson noise is white in the sense that its spectrum is almost flat up to frequencies much higher than the inverse of the network relaxation time. To our knowledge, the use of quasi-static noises in linear resistive networks for sampling multinormal distributions was not proposed earlier.

Before going to a description of a stochastic artificial retina for implementing the QSR algorithm, we need to introduce a property of linear resistive networks corrupted by quasi-static Gaussian noise. Let us consider a network with $n + 1$ nodes. Node $i$ ($i = 1, 2, \ldots, n$), whose voltage is denoted $V_i$, is shown in Fig. 11. The resistance that links node $i$ to node $j$ is denoted $R_{ij}$ ($R_{ij} = R_{ji}$), and the grounded resistance at node $i$ is $R_{ii}$. A current of magnitude equal to $J_i$ is injected into node $i$. The associated current generator is also connected to the common ground (node 0) of the resistive network. Moreover, we suppose that, at every link between nodes $i$ and $j$, a Gaussian-noise current source with a magnitude equal to $u_{ij}$ ($u_{ij} = -u_{ji}$) is associated in parallel with resistance $R_{ij}$. The property (P1) of linear resistive networks corrupted by quasi-static Gaussian noise sources can be described as follows:



If all Gaussian sources are independent with zero mean and variance $v_{ij}$ given by

$$v_{ij} = \frac{T}{2} R_{ij}^{-1}, \tag{17}$$

and if the temporal correlation length of the sources is much longer than the RC relaxation time of the resistive network, then the distribution function of the potentials is independent of the node capacitances and is given by

$$P(V) = \frac{1}{Z} \exp\left[ -\frac{K(V)}{T} \right], \tag{18}$$

where the pseudoenergy $K(V)$ is equal to

$$K(V) = \sum_{i=1}^{n} \frac{V_i^2}{R_{ii}} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{(V_i - V_j)^2}{R_{ij}} - 2 \sum_{i=1}^{n} I_i V_i. \tag{19}$$

$K(V)$ is the total power dissipated in the resistances minus twice the total power taken from the grounded current generators; the currents written $I_i$ in Eqs. (19) and (20) are the injected currents $J_i$ of Fig. 11. For $T = 0$, i.e., in the absence of noise, property P1 is simply reduced to Maxwell's heat-dissipation theorem: The stationary generalized voltages are those that minimize $K(V)$. The demonstration of Eqs. (18) and (19) comes mainly from simple electrical and statistical considerations and is given in Appendix 2.

A possible machine for the implementation of the QSR algorithm is shown in Fig. 12. It is composed of two interacting meshes, a four-connected analog noisy resistive network and a locally interconnected array of digital–analog PEs. These two meshes, respectively, implement the intensity and the boundary fields. The analog resistive network works in a purely asynchronous mode. It features switches that set or break connections between adjacent nodes to implement the absence or presence of discontinuities. Its principle of operation can be understood straightforwardly if we rewrite the pseudoenergy $K(V)$ as

$$K(V) = \sum_{i=1}^{n} \frac{(V_i - R_{ii} I_i)^2}{R_{ii}} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{(V_i - V_j)^2}{R_{ij}} - \sum_{i=1}^{n} R_{ii} I_i^2. \tag{20}$$

The last term, which does not depend on the potentials, is not relevant and can be incorporated into the normalizing constant $Z$ of Eq. (18). By reading the generalized potentials in Eq. (20) as node voltages encoding the continuous intensity field $x$, it is easily shown that the steady-state potentials $V$ of the resistive network of Fig. 12 are samples of the multinormal distribution function defined by Eqs. (2) and (3) for $F = I$. This holds, provided that

$$R_{ii} = 1, \qquad R_{ij} = \lambda^{-2} \quad (i \neq j), \tag{21a}$$

$$J_i = y_i, \tag{21b}$$

and independent Gaussian noise sources with variances $T/2$ and $\lambda^2 (T/2)$ are attached to the grounded (vertical) and the lateral resistors, respectively. These noise sources are represented in Fig. 12 as shaded current generators. Following the results of Subsection 4.A, we can implement them by attaching a differential pair of phototransistors to every resistor and by illuminating the resistive network with speckle. According to Eq. (14), the mean illumination of the phototransistors attached to vertical resistors has to be $\lambda^2$ times smaller than that of the phototransistors attached to lateral resistors. In practice, under a spatially uniform speckle, this can be achieved by integration of phototransistors of different sizes, with $\lambda^2$ as the ratio of their areas. The data to be processed are incorporated into the resistive network through current generators; see Eq. (21b). This incorporation can be achieved by imaging of a spatial light modulator that is encoding the data onto photocurrent generators $J$. Or, as was described in Subsection 4.B, the same phototransistors may be used for the inputs of the speckle and of the image to be processed. In this approach, generators $J$ are unnecessary, and the data are directly imaged onto one of the two phototransistors used for the implementation of the Gaussian random sources attached to the vertical resistors.

Fig. 11. Node $i$ of a resistive network described with the generalized voltage coordinates. $R_{ij}$ and $V_i$ denote the resistance of the resistor linking nodes $i$ and $j$ and the voltage at node $i$, respectively. $u_{ij}$ and $J_i$ are current source generators.

Fig. 12. Possible hybrid machine for implementation of the QSR algorithm for $F = I$. It features two interacting meshes, an analog noisy resistive network and a locally interconnected array of stochastic PEs. The resistive network, which is mapped onto a four-connected lattice of nodes, samples the intensity field at thermal equilibrium; the mesh of PEs uses the Gibbs sampler to sample the boundary field. The hybrid machine has two basic cycles. At fixed time intervals, the update of the binary variables $l_i$ encoding the presence or absence of boundaries requires the measurement (large upward arrows) of the voltage difference between the adjacent nodes of the resistive network and some local computation involving the current states of neighboring PEs. The updated binary output is injected into the resistive network (large downward arrow) through the new state of the corresponding switch, which sets or breaks the resistive connection between nodes.
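Property P1 can be checked numerically on the smallest nontrivial case, a network of two nodes joined by one lateral resistor. The sketch below is a sanity check under stated assumptions rather than the authors' demonstration: the injected currents are set to zero, the noise is drawn afresh for every steady state (the quasi-static limit), and the function names are hypothetical. For a pseudoenergy $K(V) = V^{T} G V$ with conductance matrix $G$, the distribution of Eq. (18) is Gaussian with covariance $(T/2)\,G^{-1}$, which is what the simulation is compared against.

```python
import random

def gibbs_covariance(r11, r22, r12, temperature):
    """Covariance (T/2) * G^-1 implied by Eqs. (18) and (19) for two nodes."""
    g12 = 1.0 / r12
    a, d = 1.0 / r11 + g12, 1.0 / r22 + g12   # diagonal of the conductance matrix G
    det = a * d - g12 * g12
    k = temperature / (2.0 * det)
    return k * d, k * g12, k * a              # var(V1), cov(V1, V2), var(V2)

def simulated_covariance(r11, r22, r12, temperature, n=100_000, seed=0):
    """Draw quasi-static noise currents, solve the steady state, accumulate moments."""
    rng = random.Random(seed)
    g12 = 1.0 / r12
    a, d = 1.0 / r11 + g12, 1.0 / r22 + g12
    det = a * d - g12 * g12
    # Eq. (17): every noise source has variance (T/2) / R.
    s11 = (temperature / (2.0 * r11)) ** 0.5
    s22 = (temperature / (2.0 * r22)) ** 0.5
    s12 = (temperature / (2.0 * r12)) ** 0.5
    m11 = m12 = m22 = 0.0
    for _ in range(n):
        u11, u22, u12 = rng.gauss(0, s11), rng.gauss(0, s22), rng.gauss(0, s12)
        i1 = u11 + u12                        # net noise current into node 1
        i2 = u22 - u12                        # u21 = -u12
        v1 = (d * i1 + g12 * i2) / det        # steady state: V = G^-1 i
        v2 = (g12 * i1 + a * i2) / det
        m11 += v1 * v1
        m12 += v1 * v2
        m22 += v2 * v2
    return m11 / n, m12 / n, m22 / n

lam = 2.0  # illustrative value; R_ii = 1 and R_ij = lam**-2 as in Eq. (21a)
print(gibbs_covariance(1.0, 1.0, lam ** -2, 1.0))
print(simulated_covariance(1.0, 1.0, lam ** -2, 1.0))
```

The two printed triples should agree to within sampling error, which illustrates why independent sources with the variances of Eq. (17) are sufficient to sample the multinormal distribution.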

The sampling of the line field is obtained on a locally interconnected mesh of digital–analog PEs, each implementing the heat-bath criterion. Once again, the use of clipped differential detection of speckles may advantageously replace area-demanding silicon digital random-number generators. The operation and implementation of this mesh are similar to those of the stochastic artificial retina described in Subsection 4.B. The main difference comes from the force computation, which includes contributions from the resistive network (voltage differences between adjacent nodes) and from neighboring PEs (the force that results from potentials $V_{ij}$). If an interacting line process such as that shown in Fig. 6 is considered, every PE has to be connected to six neighbors, and the chromatic number for parallel computation is four. Unlike for the Ising-spin model, the PE's contribution to the force does not depend linearly on the line variables. Its implementation requires Boolean operations, which may limit the PE's integration density. The implementation of noninteracting line processes is much easier because the force computation reduces to the voltage differences between adjacent nodes and a constant bias equal to $\alpha$. In this case, no coloring is required, and all the PEs can be updated in parallel.
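For the noninteracting case, the update of one line site can be sketched in a few lines. The energy form used here is an assumption, since Eq. (2) is not reproduced in this section: each link is taken to contribute $\lambda^2 (V_i - V_j)^2 (1 - l) + \alpha\, l$, with $l = 1$ marking a discontinuity, so that the force is $\alpha - \lambda^2 (V_i - V_j)^2$. The function name and the overflow guard are illustrative.

```python
import math

def line_on_probability(v_i, v_j, lam, alpha, temperature):
    """Heat-bath probability that the line element between nodes i and j is on.

    Assumed energy per link: lam**2 * (v_i - v_j)**2 * (1 - l) + alpha * l,
    with l = 1 for a discontinuity; the force is E(l = 1) - E(l = 0).
    """
    force = alpha - lam ** 2 * (v_i - v_j) ** 2
    x = force / temperature
    if x > 700.0:          # exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

# With (lam, alpha) = (4, 500) the switch point is |v_i - v_j| = sqrt(500) / 4.
for dv in (0.0, 5.0, math.sqrt(500.0) / 4.0, 6.0, 10.0):
    print(round(dv, 2), line_on_probability(0.0, dv, 4.0, 500.0, temperature=50.0))
```

Because this force involves only the two node voltages of the link, every line site can be updated at the same clock tick, in line with the remark that no coloring is required.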

With this type of design, we end up with a hybrid machine that involves asynchronous-analog and clocked-digital operations for the intensity and the line fields, respectively. All random-number generation can be performed optically, which requires three differential phototransistor pairs per intensity site and an additional pair per line site. PEs are updated synchronously at clock time $t_i = 1, 2, \ldots$. This updating implies computing the forces, which in turn involves the resistive network and neighboring PEs. For a successful operation, the time interval $\Delta t = t_{i+1} - t_i$ has to be much larger than the RC relaxation time of the resistive network. This ensures that the voltages read by the digital–analog mesh of PEs will effectively correspond to steady states of the resistive network. The relaxation time of the resistive network is difficult to evaluate, as it depends on the line process configuration and on the temperature. Basically it is proportional to the area of the largest smooth part of the network that has no discontinuities. At high temperature, this area is rather small, as many switches break connections between adjacent nodes. At low temperature, larger areas are involved, but the voltage swings are lower. For all practical purposes, the voltage changes induced by the switches typically have to propagate a few nodes away before the decay that is due to the grounded resistors swamps them out, and the voltages at nodes farther away will remain relatively unchanged. A typical value for the relaxation time is a few hundreds of nanoseconds (Ref. 45). Thus a reasonable global clock frequency for the system of Fig. 12 is 1 MHz. If video-rate operation for the minimization is undertaken, approximately $10^3$–$10^4$ full iterations can be performed every 20 ms, depending on the coloring of the edge field.
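The iteration budget per video frame can be estimated with integer arithmetic. This is a rough upper bound under a simplifying assumption that is not stated in the paper: each color class of the edge field takes exactly one clock cycle, with no overhead for reading voltages or for network settling. The function name is hypothetical.

```python
def full_iterations_per_frame(frame_us, clock_period_us, n_colors):
    """Full sweeps of the line field per frame, assuming one clock cycle per color."""
    return frame_us // (clock_period_us * n_colors)

# 20-ms frame, 1-MHz clock (1-us period), one to four color classes.
for colors in (1, 2, 4):
    print(colors, full_iterations_per_frame(20_000, 1, colors))
```

With four colors, as required for the interacting line process of Fig. 6, the bound is 5000 full iterations per frame, within the range of $10^3$–$10^4$ quoted above.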

Clearly, the implementation of such a machine with up-to-date silicon technology is challenging. If noninteracting line processes are considered, a monochip integration is preferable, as it favors the integration density. For interacting line processes, a monochip integration is risky, as the large voltage swings of the digital electronics required for the edge field are likely to contaminate the nearly analog computation. Thus a hybrid approach that relies on the integration of several chips in digital and analog CMOS technologies is more appropriate. The difficulty in such a hybrid system lies in the implementation of the large number of connections required for the interaction of the two meshes. With conventional chip-to-substrate contacts along the perimeter of a chip, time multiplexing has to be implemented, and the speed of the system is likely to be limited by the inevitable bottleneck that one usually has to deal with in vision systems. With forthcoming packaging technologies developed for multichip modules, controlled-collapse chip connection processes will connect the chip directly to the substrate. Thus the contact density is proportional to the chip area. Based on this board-scale integration, more than 10,000 connections/cm² are available (Ref. 46), and the prospects for implementing our proposed multichip system may therefore improve.

5. Concluding Remarks

In this paper we have considered stochastic artificial retinas that are able to implement stochastic relaxations on Markov random fields in early and middle vision. These retinas are hybrid parallel machines that mix optics and digital and analog electronics and provide global optimizations at a video rate. They can be used as single-circuit sensors dedicated to a specialized application or as modular early-vision modules that are assembled at the board level and that are used as front-end processors for a variety of vision applications. As an aid to better understanding of the advantages and drawbacks of building such dedicated machines, several aspects of their construction, including algorithms, computational efficiency, parallelism, hardware, and architectures, have been considered. Although all the numerical and experimental examples covered in the paper are obtained for the maximum a posteriori estimator, the results for issues of algorithm and implementation are not restricted to this estimator. Other, more local, estimates that might provide better performance (Refs. 14 and 47), for instance, the maximizer of the posterior marginals, can be straightforwardly considered.

Throughout the paper we have focused on the problem of image restoration in the presence of discontinuities. In the more-general case, our results may be applied to optimization problems defined by a semiquadratic energy function. This function explicitly depends on a continuous field related to the observed data and on an additional binary field that explicitly encodes the presence or absence of some qualitative property such as the presence of a boundary. For this energy function we presented a rigorous algorithm, called the quasi-static relaxation algorithm, that induces stochastic relaxation in coupled fields. For a fixed temperature, it samples configurations at thermal equilibrium, i.e., under the Gibbs distribution. When it is used with simulated annealing, it provides maximum a posteriori estimates. Its main attractive feature is that, when local neighborhood structures are considered, as is the case in most early-vision tasks, it fully exploits the short-range interactions. Thus it can be implemented with a high degree of parallelism. Moreover, the algorithm is efficient for conventional workstations because sampling of the continuous fields requires only the inversion of a sparse linear system for which efficient inversion routines exist.
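As an illustration of what a semiquadratic energy looks like, the following sketch evaluates a one-dimensional example with a continuous field `x` and a binary boundary field `lines`. The specific form of the energy (a weak-string-style model with F = I), the function name `semiquadratic_energy`, and the parameters `lam` and `alpha` are assumptions made for this example; the energy actually used in the paper is defined in its earlier sections and may differ.

```python
def semiquadratic_energy(x, lines, y, lam=1.0, alpha=2.0):
    """Energy of a 1-D signal x with binary boundaries lines[i] between x[i] and x[i+1].

    Assumed illustrative form (F = I):
        sum_i (x_i - y_i)^2
      + lam^2 * sum_i (1 - l_i) * (x_{i+1} - x_i)^2
      + alpha * sum_i l_i
    For a fixed boundary field the energy is quadratic in x, hence "semiquadratic".
    """
    if len(y) != len(x) or len(lines) != len(x) - 1:
        raise ValueError("need len(y) == len(x) and len(lines) == len(x) - 1")
    data = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    smooth = sum((1 - l) * (x[i + 1] - x[i]) ** 2 for i, l in enumerate(lines))
    penalty = alpha * sum(lines)
    return data + lam ** 2 * smooth + penalty


# A noiseless step edge: switching on the boundary at the jump removes the
# smoothness cost of the jump and replaces it with the (smaller) boundary penalty.
y = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0]
no_boundary = semiquadratic_energy(y, [0, 0, 0, 0, 0], y)
with_boundary = semiquadratic_energy(y, [0, 0, 1, 0, 0], y)
```

In this toy case the configuration with a boundary at the jump has the lower energy, which is the behavior that lets such models preserve discontinuities while smoothing elsewhere.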

Considering restoration problems in one- and two-dimensional image formats, we performed computations to test the algorithm's overall performance. More specifically, we successfully tested the ability of the algorithm to perform sampling under conditions of thermal equilibrium at a fixed temperature and to compute global minima of the energy function. To our knowledge, for interacting boundary fields that reflect some basic constraints such as the absence of ending, no rigorous deterministic algorithms exist, and the QSR algorithm is one of the most efficient algorithms at providing global estimates. The effect on restoration performance of incorporating interacting boundary fields into the energy function was studied. Visual comparative results for synthetic images clearly showed that models with interacting boundaries enhance the restoration quality. For simple reconstruction operators ($F = I$), enhancement was observed for images strongly corrupted by noise. We believe that more-complex problems, such as restoration with blur or tomographic reconstruction, may benefit from interacting boundaries at an intermediate noise level.

Implementing stochastic artificial retinas that provide video-rate operation was discussed. We introduced the imaging of time-varying speckle patterns onto silicon chips as an efficient tool for generating random numbers in VLSI circuits. Once they are converted into photocurrent, the speckles act as sources of thermal noise for stochastic relaxation. These sources are nearly Gaussian, and, when they are used with comparators, they provide an accurate implementation of the heat-bath criterion for sampling binary Markov random fields. The feasibility of building analog–digital machines that mix electronics and speckle random-number generators was demonstrated, and the successful video-rate operation of a 24 × 24 smart VLSI sensor to minimize Ising-spin-like energy functions was validated.
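The claim that a Gaussian noise source followed by a comparator approximates the heat-bath criterion can be checked numerically. The sketch below compares the logistic heat-bath probability with the probability that a signal plus zero-mean Gaussian noise exceeds zero. The sign convention for `delta_e`, the function names, and the noise scaling constant 1.702 (a standard value for matching a logistic curve to a normal distribution function) are assumptions of this example, not parameters taken from the circuit described in the paper.

```python
import math
import random

LOGISTIC_TO_GAUSS = 1.702  # assumed scaling that matches a logistic curve to a normal CDF


def heat_bath_probability(delta_e, temperature):
    """Heat-bath probability of setting a binary site to 1; delta_e = E(site=0) - E(site=1)."""
    return 1.0 / (1.0 + math.exp(-delta_e / temperature))


def comparator_probability(delta_e, temperature):
    """Probability that delta_e plus zero-mean Gaussian noise is positive."""
    sigma = LOGISTIC_TO_GAUSS * temperature
    return 0.5 * (1.0 + math.erf(delta_e / (sigma * math.sqrt(2.0))))


def comparator_update(delta_e, temperature, rng):
    """One stochastic update: output 1 when signal plus noise exceeds zero."""
    noise = rng.gauss(0.0, LOGISTIC_TO_GAUSS * temperature)
    return 1 if delta_e + noise > 0.0 else 0


def max_mismatch(temperature=1.0, span=10.0, steps=2001):
    """Largest gap between the two probabilities for delta_e / temperature in [-span, span]."""
    worst = 0.0
    for k in range(steps):
        delta_e = temperature * (-span + 2.0 * span * k / (steps - 1))
        gap = abs(heat_bath_probability(delta_e, temperature)
                  - comparator_probability(delta_e, temperature))
        worst = max(worst, gap)
    return worst
```

Under these assumptions the two acceptance probabilities differ by roughly 0.01 at most, and the mismatch does not depend on the temperature because both curves depend only on the ratio of `delta_e` to `temperature`.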

An architecture for the implementation of the QSR algorithm was proposed. It relies on dedicated parallel circuits that use speckle random-number generators to implement the stochastic relaxation of both fields. It guarantees the rigorous optimization of several estimators for both interacting and noninteracting line processes. This generality is obtained at the expense of the integration of additional noise sources to implement the stochastic samplings. In our opinion, the extra work involved in using this architecture does not present a serious difficulty; efficient solutions that are compatible with massive integration and good statistical properties, such as the differential detection of speckle, exist.

Stochastic artificial retinas do have two major drawbacks with respect to conventional numerical processors. As their successful integration depends on dedicated analog computations, they are sensitive to the inevitable drift and imprecision of analog circuits; and they are not versatile, inasmuch as every implementation is tailored to a specific processing task. Clearly, the architecture proposed in this paper is not exempt from these two drawbacks; for instance, taking into account a possible blur with a narrow support would lead to a somewhat different and more-complex design. Not only would a network with longer connections be required, but also integration of resistors with negative resistances and control of foreseeable network stability problems would have to be achieved. Hybrid architectures that combine digital and analog computation, such as that of Fig. 12, probably represent a good compromise that exploits the flexibility of digital electronics without sacrificing the net benefit to integration density of analog computations. Although the successful operation of a hybrid machine that implements the QSR algorithm is clearly a continuing challenge, perhaps one of the most important results of this paper is to show that solutions exist for the implementation of optimal stochastic processing techniques in early-vision problems. Stochastic artificial retinas represent a generic platform for building efficient vision machines to operate over a large class of energy functions and estimators. Moreover, they contribute to reducing the gap between sophisticated models developed by researchers on the one hand and dedicated machines designed by engineers on the other, and they present promising prospects for the implementation of video-rate Monte Carlo–like computations.

Appendix A: Demonstration of Eq. (8)

For simplicity, we denote by $G$ the matrix $F^t F + \lambda^2 M_l^t M_l$. From Eq. (7) we have found that $x$ is equal to $G^{-1}F^t y + G^{-1}B\,w^{qs}$. Because the mean vector $\mu$ of $x$ is equal to $G^{-1}F^t y$ [see Eq. (6a)], the covariance matrix $\langle (x-\mu)(x-\mu)^t \rangle$ is simply $G^{-1}B\,\langle w^{qs}(w^{qs})^t \rangle\,B^t (G^{-1})^t$. Because $w^{qs}$ is a Wiener vector, the averaged product $\langle w^{qs}(w^{qs})^t \rangle$ is the identity matrix. Moreover, as $G = G^t$ and consequently $(G^{-1})^t = G^{-1}$, the covariance matrix $\langle (x-\mu)(x-\mu)^t \rangle$ reduces to $G^{-1}BB^tG^{-1}$. Clearly, if $B$ satisfies Eq. (8), the covariance matrix becomes $(T/2)\,G^{-1}$, which is exactly the desired covariance matrix $C$ of Eq. (6b).
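The argument of this appendix can be checked numerically on a small example. The sketch below assumes that Eq. (8) amounts to $BB^t = (T/2)\,G$, which is what the conclusion above implies, and takes $B$ as a scaled Cholesky factor of $G$; this particular choice of $B$, the use of independent standard normal draws for $w^{qs}$, and all function names are assumptions of the example rather than the paper's construction. Only the Python standard library is used, so the dense solver here stands in for the sparse routines mentioned in the text.

```python
import math
import random


def cholesky(a):
    """Lower-triangular factor low with low * low^t = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_spd(low, b):
    """Solve (low * low^t) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def sample_field(g_chol, ft_y, temperature, rng):
    """Draw x = G^{-1} F^t y + G^{-1} B w, with the assumed choice B = sqrt(T/2) * chol(G)."""
    n = len(ft_y)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed: <w w^t> = identity
    scale = math.sqrt(temperature / 2.0)
    bw = [scale * sum(g_chol[i][k] * w[k] for k in range(i + 1)) for i in range(n)]
    return solve_spd(g_chol, [ft_y[i] + bw[i] for i in range(n)])


# Hypothetical 3-sample chain with F = I and first-difference smoothing (lambda = 1).
G_EXAMPLE = [[2.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 2.0]]
```

With `temperature = 2.0`, the empirical covariance of many draws from `sample_field` should approach $G^{-1}$ itself, and the empirical mean should approach $G^{-1}F^t y$.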

Appendix B: Demonstration of Eqs. (18) and (19)

We obtain the steady state of the network of Fig. 12 by writing Kirchhoff's current law. At node $i$ we have

$$\sum_{m=1}^{n} L_{im} V_m = J_i + q_i, \tag{B1}$$

where the total random current $q_i$ is equal to $\sum_{m=1}^{n} u_{mi}$ and $L_{im}$ equals $-(1/R_{im})$ for $m \neq i$ and $\sum_{p=1}^{n} 1/R_{ip}$ for $m = i$. Note that, in Eq. (B1), capacitances at all nodes are not incorporated because they participate in the transient response and not in the steady state. This amounts to considering that the correlation time of the Gaussian noise is much longer than the RC response time of the network. In compact form, Eq. (B1) can be written as $LV = J + q$, where $L$ is a real symmetric matrix formed by the coefficients $L_{im}$ and $V$, $J$, and $q$ are $n \times 1$ vectors with coefficients $V_i$, $J_i$, and $q_i$. Let us denote by $Q$ the covariance matrix of the random variables $q_i$. From Eq. (17) and noting that $u_{ij} = -u_{ji}$, we find from simple linear algebra that

$$Q = (T/2)\,L. \tag{B2}$$

From Eq. (B1) it follows that the potentials $V$ obey a multinormal distribution with mean vector $\mu = L^{-1}J$ and covariance matrix $(T/2)\,L^{-1}$. Thus the distribution function $P(V)$ of the potentials is given by

$$P(V) = \frac{1}{Z'} \exp\left[-\frac{1}{T}\,(V-\mu)^t L\,(V-\mu)\right], \tag{B3}$$

where $Z'$ is a normalizing constant and the superscript $t$ denotes matrix transposition. It is easily shown that the distribution function of Eq. (B3) is equal to that of Eq. (18) for $J(V)$ given by Eq. (19).
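The step leading to Eq. (B2) can be illustrated with a small Monte Carlo check. Eq. (17) is not restated in this appendix, so the sketch below assumes that each resistor $R_{ij}$ carries an independent zero-mean Gaussian noise current $u_{ij}$ of variance $T/(2R_{ij})$, with $u_{ij} = -u_{ji}$. It also assumes that the diagonal sum skips $p = i$ (no resistor from a node to itself). The function names and the three-node resistance values are hypothetical.

```python
import math
import random


def conductance_matrix(resistance):
    """L_im = -1/R_im for m != i, and the sum of 1/R_ip over neighbors p on the diagonal."""
    n = len(resistance)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for m in range(n):
            if m != i and resistance[i][m] is not None:
                g = 1.0 / resistance[i][m]
                lap[i][m] = -g
                lap[i][i] += g
    return lap


def sample_node_currents(resistance, temperature, rng):
    """Draw q_i = sum_m u_mi from antisymmetric branch noise currents (assumed variance T / (2 R))."""
    n = len(resistance)
    q = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if resistance[i][j] is None:
                continue
            u_ij = rng.gauss(0.0, math.sqrt(temperature / (2.0 * resistance[i][j])))
            q[j] += u_ij  # q_j contains u_ij
            q[i] -= u_ij  # q_i contains u_ji = -u_ij
    return q


def empirical_covariance(samples):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, count = len(samples[0]), len(samples)
    mean = [sum(s[i] for s in samples) / count for i in range(n)]
    return [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / count
             for j in range(n)] for i in range(n)]


# Hypothetical fully connected 3-node network; None marks the absent self-connection.
R_EXAMPLE = [[None, 1.0, 4.0], [1.0, None, 2.0], [4.0, 2.0, None]]
```

Under these assumptions, the empirical covariance of many draws from `sample_node_currents` approaches $(T/2)\,L$, with negative off-diagonal entries arising from the antisymmetry of the branch currents.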

We thank our colleagues A. Dupret and E. Belhaire of the Institut d'Electronique Fondamentale, Centre National de la Recherche Scientifique (CNRS), Orsay, and P. Garda of the Universite Pierre et Marie Curie, Paris, who designed for us the analog part of the integrated circuit. Most of the experimental work was carried out by Jean Claude Rodier at the Institut d'Optique Theorique et Appliquee and was supported by the Direction de la Recherche et des Etudes Techniques under contract DRET#92-139. The algorithmic contribution to this study benefitted from the expertise of Line Garnero of the Laboratoire de Neurosciences Cognitives et Imagerie Cerebrale, Centre National de la Recherche Scientifique, Hopital de la Salpetriere, Paris.

References

1. T. Poggio and C. Koch, “Ill-posed problems in early vision: from computational theory to analogue networks,” Proc. R. Soc. London Ser. B 226, 303–323 (1985).

2. S. Geman and D. Geman, “Stochastic relaxation, Gibbs distribution, and the Bayesian restoration of images,” IEEE Trans. Pattern Anal. Mach. Intell. 6, 721–741 (1984).

3. B. Chalmond, “Image restoration using an estimated Markov model,” Signal Process. 15, 115–129 (1988).

4. P. Bouthemy and E. Francois, “Motion segmentation and qualitative dynamic scene analysis from an image sequence,” Int. J. Comput. Vision 10, 157–182 (1993).

5. S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science 220, 671–680 (1983).

6. C. Koch, J. Marroquin, and A. Yuille, “Analog neuronal networks in early vision,” Proc. Natl. Acad. Sci. USA 83, 4263–4267 (1986).

7. D. W. Murray, A. Kashko, and H. Buxton, “A parallel approach to the picture restoration algorithm of Geman and Geman on an SIMD machine,” Image Vision Comput. 4, 133–142 (1986).

8. J. G. Harris, C. Koch, E. Staats, and J. Luo, “Analog hardware for detecting discontinuities in early vision,” Int. J. Comput. Vision 4, 211–222 (1990).

9. A. Lumsdaine, J. L. Wyatt, and I. M. Elfadel, “Nonlinear analog networks for image smoothing and segmentation,” J. VLSI Signal Process. 3, 53–68 (1991).

10. L. Bedini and A. Tonazzini, “Image restoration preserving discontinuities: the Bayesian approach and neural networks,” Image Vision Comput. 10, 108–118 (1992).

11. P. C. Yu, S. J. Decker, H. S. Lee, C. G. Sodini, and J. L. Wyatt, “CMOS resistive fuses for image smoothing and segmentation,” IEEE J. Solid-State Circ. 27, 545–553 (1992).

12. F. C. Jeng and J. W. Woods, “Compound Gauss–Markov random fields for image estimation,” IEEE Trans. Signal Process. 39, 683–697 (1991).

13. M. Bertero, T. A. Poggio, and V. Torre, “Ill-posed problems in early vision,” Proc. IEEE 76, 869–889 (1988).

14. A. Blake and A. Zisserman, Visual Reconstruction (MIT, Cambridge, Mass., 1987), Chap. 4.

15. P. Charbonier, L. Blanc-Feraud, G. Aubert, and M. Barlaud, “Deterministic edge-preserving regularization in computed imaging,” IEEE Trans. Image Process. 6, 298–311 (1997).

16. J. Marroquin, “Surface reconstruction preserving discontinuities,” A. I. Memo 792 (MIT, Cambridge, Mass., 1984).

17. D. Vanderbelt and S. G. Louie, “A Monte Carlo simulated annealing approach to optimization over continuous variables,” J. Comput. Phys. 56, 259–271 (1984).

18. M. A. Styblinsky and T. S. Tang, “Experiments in nonconvex optimization: stochastic approximation with function smoothing and simulated annealing,” Neural Networks 3, 467–483 (1990).

19. R. L. Stratonovitch, Markov Processes and Related Processes, Vol. 1 of Topics in the Theory of Random Noise (Gordon & Breach, New York, 1967), Chap. 4.

20. C. W. Gardiner, “The Ito calculus and stochastic differential equations,” in Handbook of Stochastic Methods (Springer-Verlag, Berlin, 1990), Chap. 4, pp. 106–113.

21. D. Geman and C. Yang, “Nonlinear image recovery with half-quadratic regularization,” IEEE Trans. Image Process. 4, 932–945 (1996).

22. D. Prevost, Ph. Lalanne, L. Garnero, and P. Chavel, “Quasi-static algorithm for image restoration preserving discontinuities,” in Neural and Stochastic Methods in Image and Signal Processing III, S.-S. Chen, ed., Proc. SPIE 2304, 156–164 (1994).

23. E. Aarts and J. Korst, Simulated Annealing and Boltzmann Machines (Wiley, New York, 1989), Chap. 3, pp. 33–52.

24. J. Marroquin, S. Mitter, and T. Poggio, “Probabilistic solution of ill-posed problems in computational vision,” J. Am. Statist. Assn. 82, 76–89 (1987).

25. Ref. 23, Chap. 4, pp. 57–75.

26. D. Geman, S. Geman, C. Graffine, and P. Dong, “Boundary detection by constrained optimization,” IEEE Trans. Pattern Anal. Mach. Intell. 12, 609–628 (1990).

27. J. Alspector, J. W. Gannett, S. Haber, M. B. Parker, and R. Chu, “A VLSI-efficient technique for generating multiple uncorrelated noise sources and its application to stochastic neural networks,” IEEE Trans. Circ. Syst. 38, 109–123 (1991).

28. A. J. Martino and G. M. Morris, “Optical random number generator based on photoevents locations,” Appl. Opt. 30, 981–989 (1991).

29. G. M. Morris, “Optical computing by Monte Carlo methods,” Opt. Eng. 24, 86–90 (1985).

30. J. Marron, A. J. Martino, and G. M. Morris, “Generation of random arrays using clipped laser speckle,” Appl. Opt. 25, 26–30 (1987).

31. A. Dupret, E. Belhaire, and P. Garda, “Scalable array of Gaussian white noise sources for analogue VLSI implementation,” Electron. Lett. 31, 1457–1458 (1996).

32. S. Wolfram, “Random sequence generation by cellular automata,” Adv. Appl. Math. 7, 123–169 (1986).

33. J. W. Goodman, Statistical Optics (Wiley, New York, 1985), Chap. 4.

34. Ph. Lalanne, H. Richard, J. C. Rodier, P. Chavel, J. Taboury, K. Madani, P. Garda, and F. Devos, “2D generation of random numbers by multimode fiber speckle for silicon arrays of processing elements,” Opt. Commun. 76, 387–394 (1990).

35. Ph. Lalanne, E. Belhaire, J. C. Rodier, A. Dupret, P. Garda, and P. Chavel, “Gaussian random number generation by differential detection of speckles,” Opt. Eng. 34, 1835–1837 (1995).

36. G. Premont, Ph. Lalanne, P. Chavel, M. Kuijk, and P. Heremans, “Generation of sigmoid probability functions by clipped differential speckle detection,” Opt. Commun. 129, 347–352 (1996).

37. K. H. Fisher and J. A. Hertz, Spin Glasses, D. Edwards, ed. (Cambridge U. Press, Cambridge, 1991).

38. A. J. Gray, J. W. Kay, and D. M. Titterington, “On the estimation of noisy binary Markov random fields,” Pattern Recogn. 25, 749–768 (1992).

39. A. Dupret, E. Belhaire, J. C. Rodier, Ph. Lalanne, D. Prevost, P. Garda, and P. Chavel, “An optoelectronic CMOS circuit implementing a simulated annealing algorithm,” IEEE J. Solid-State Circ. 31, 1046–1050 (1996).

40. W. Millar, “Some general theorems for non-linear systems possessing resistance,” Phil. Mag. 42, 1150–1160 (1951).

41. J. Harris, C. Koch, and J. Luo, “A two-dimensional analog VLSI circuit for detecting discontinuities in early vision,” Science 248, 1209–1211 (1990).

42. J. Harris, C. Koch, E. Staats, and J. Luo, “Analog hardware for detecting discontinuities in early vision,” Int. J. Comput. Vision 4, 211–223 (1990).

43. A. Lumsdaine, J. L. Wyatt, and I. M. Elfadel, “Nonlinear analog networks for image smoothing and segmentation,” J. VLSI Signal Process. 3, 53–68 (1991).

44. H. Nyquist, “Thermal agitation of electric charge in conductors,” Phys. Rev. 32, 110–113 (1928).

45. H. Kobayashi, J. L. White, and A. A. Abidi, “An active resistor network for Gaussian filtering of images,” IEEE J. Solid-State Circ. 26, 738–747 (1991).

46. R. A. Nordin, A. F. J. Levi, R. N. Nottenburg, J. O'Gorman, T. Tanbun-Ek, and R. A. Logan, “A systems perspective on digital interconnection technology,” J. Lightwave Technol. 10, 811–827 (1992).

47. J. Besag, “On the statistical analysis of dirty pictures,” J. R. Statist. Soc. B 48, 259–302 (1986).
