TRANSCRIPT
A microprocessor-controlled real-time image processor
D. Lavie, Ph.D., and W.K. Taylor, M.Sc., Ph.D.
Indexing terms: Cybernetics and robotics, Image processing, Microprocessors, Pattern recognition
Abstract: A microcomputer-controlled vision system incorporating a grey-level local image processor and a binary frame image processor operating on the S-100 bus is described. Examples are given to illustrate applications in pattern recognition, industrial inspection and the rejection of faulty components by vision-controlled assembly robots.
1 Introduction
Computer vision systems with inputs derived from television cameras or CCD area image sensors have numerous applications in robot vision, industrial inspection and pattern recognition. Except at low resolution and processing speed, however, conventional computer hardware and programming languages are far too slow; for real-time operation at high resolution, the low-cost line-scan parallel-processing hardware and controlling microcomputer software described in this paper have been developed.
Special-purpose high-speed low-resolution processors have been described elsewhere [1, 2], but these require custom-designed LSI chips, which at present are prohibitively expensive for most industrial applications. All systems must first digitise the analogue video image into grey levels, preferably at the standard TV scanning rate, by means of a high-speed ADC. Typical grey levels are 64, 128 and 256, or 6, 7 and 8 bits, and a large frame memory is required to store a single frame. Average resolution is 256 × 256 (65 536) pixels and 8 grey bits, which, for byte-parallel serial input, requires a 64K byte RAM capable of being written into at the rate of 256 bytes in a single TV active scan time of 50 µs, i.e. at 5.12 × 10⁶ bytes/s or 195 ns per byte. Slower memories can be employed if a number of bytes are first fed serially into long fast shift registers and loaded in parallel to parallel-connected memories at a lower rate when the shift registers are full, as in pipeline processing [3].
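These throughput figures can be checked with a few lines of arithmetic (a minimal sketch in Python; the constants are taken directly from the text):

```python
# Write rate needed to store one 256 x 256 x 8-bit frame in real time.
PIXELS_PER_LINE = 256          # bytes per active TV line (one byte per pixel)
ACTIVE_LINE_TIME_S = 50e-6     # single TV active scan time, 50 microseconds

bytes_per_second = PIXELS_PER_LINE / ACTIVE_LINE_TIME_S
ns_per_byte = 1e9 / bytes_per_second

print(bytes_per_second)        # 5.12 million bytes per second
print(ns_per_byte)             # roughly 195 ns per byte
```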
The Microvision-100 system has been designed with the object of reducing cost by employing only standard hardware based on type-74 logic, the 8085 microprocessor and standard S-100 family cards. Real-time operation has been achieved in the grey-level preprocessing by using Schottky TTL local-operator hardware in line-scan mode. The local operators scan through the TV image in synchrony with the electron beam or CCD camera scan, so that each new frame is processed by the selected operator in 1/50 s, thus keeping in step with the real-time flow of information. The results of the processing are stored in RAM at the same synchronous speed under DMA control.
In contrast with bit-plane parallel-processing systems that store the pixel grey-level bits serially in shift registers, one for each pixel, the Microvision-100 processes all the grey levels of a local operator array in parallel. Since there is only a single grey-level image processing unit for 65 536 pixels, in place of one for each pixel in a pure parallel
Paper 2598E, first received 15th March 1982 and in revised form 19th April 1983
Dr. Taylor is, and Dr. Lavie was formerly, with the Department of Electronic & Electrical Engineering, University College, Torrington Place, London WC1E 7JE, England. Dr. Lavie is now with the Israeli Ministry of Defence
IEE PROCEEDINGS, Vol. 130, Pt. E, No. 5, SEPTEMBER 1983
frame processor, it is economically viable to include a wide range of microprocessor-controlled operators within the unit, control being selected during the relatively long TV blanking period.
2 Gradient measurement and edge detection
The literature on image processing contains numerous algorithms for finding the 'edges' of objects, given the grey level at each pixel, and any method requiring a window up to 4 × 4 can be implemented on Microvision-100. If a binary decision of edge or no-edge is required, it is first necessary to compute the grey-level gradient at each pixel and to compare it with a preselected threshold gradient. The Laplacian operator ∇²f(x, y) is a classical measure of changing gradient and can be approximated by ∇²f(i, j) for the quantised pixels. This operator discards the direction of gradient and constant gradients, however, which can be useful for discriminating features in pattern-recognition applications and are easily preserved in the Microvision-100 system. As a simple example of gradient estimation, consider a local scanning operator (see Fig. 1) that forms the two positive or zero values of the four functions

±(a_{i-1,j-1} − a_{i,j})  and  ±(a_{i-1,j} − a_{i,j-1})    (1)

at each pixel i, j, where a_{i,j} is the present digitised video output produced, for example, by the electron beam of a vidicon camera 'reading' pixel i, j. All the other digitised amplitudes are past values and are stored in fast RAM serving as delay lines. In this simple example, only two lines of video are involved, but larger local array operators spreading over n lines require n stored lines in RAM with addresses clocked at the pixel scan rate of 5.12 MHz. With reference to Fig. 1a, it will be seen that the functions of exprs. 1 start at pixel 2, 2 with ±(a_{1,1} − a_{2,2}) and ±(a_{1,2} − a_{2,1}) and finish at pixel 256, 256 with ±(a_{255,255} − a_{256,256}) and ±(a_{255,256} − a_{256,255}). As a useful approximation, the gradient direction may be quantised into the eight compass directions and expressed for N-S or E-W gradients as Δ per horizontal or vertical pixel centre separation d. The same gradient in NW-SE or NE-SW directions produces differences of √2Δ, owing to the larger √2d diagonal pixel separation, as illustrated in Fig. 1b.
It is convenient to avoid negative quantities for negative gradients by selecting the two positive functions in exprs. 1. Thus, for a gradient Δ sloping downwards from NW to SE and having no component in the NE to SW direction, (a_{i-1,j-1} − a_{i,j}) will be +√2Δ, and, for the opposite SE to NW gradient, (a_{i,j} − a_{i-1,j-1}) is positive. In both these examples, (a_{i-1,j} − a_{i,j-1}) = 0. Similarly, for a gradient sloping downwards from W to E, (a_{i-1,j-1} − a_{i,j}) = +Δ
and (a_{i-1,j} − a_{i,j-1}) = −Δ. The distributions of gradient magnitudes and directions over an image have been used in texture recognition [4], but if only the shape of objects is required it is sufficient to compare the largest function in exprs. 1 with Δ_T to produce a binary edge picture, as illustrated in Fig. 2 for a face and a plastics cup. In the latter case, there are strong gradients above Δ_T in the regions corresponding to highlights and shadows, in addition to the true shape edges. These false edges (Figs. 2d-f) are
Fig. 1 Gradient estimation
a Array of 256 × 256 image pixels with grey levels a_{i,j}
b Local operator grey levels used in highest-resolution gradient estimation
Fig. 2 Binary edge pictures
a and c, examples of unprocessed TV images; b and d, e, f, corresponding gradient-magnitude binary-quantisation stored images. Differences between d, e and f are because of changes in sources of illumination
eliminated by taking the logical AND of two frames with different sources of synchronised flash illumination, which takes 40 ms.
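The local operator of exprs. 1 can be modelled in software. The sketch below (Python, written for clarity rather than speed; array bounds and the raster order follow Fig. 1a) forms the magnitude of the two diagonal differences at each pixel and thresholds it against Δ_T:

```python
def edge_map(a, threshold):
    """Binary edge picture from the diagonal-difference gradient estimate.

    `a` is a list of lists of grey levels a[i][j]. At each pixel the
    largest of the positive-or-zero values among +/-(a[i-1][j-1] - a[i][j])
    and +/-(a[i-1][j] - a[i][j-1]) is compared with the threshold
    gradient, as in exprs. 1 of the text.
    """
    rows, cols = len(a), len(a[0])
    edges = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):          # start at pixel 2,2 (index 1,1)
        for j in range(1, cols):
            g = max(abs(a[i - 1][j - 1] - a[i][j]),
                    abs(a[i - 1][j] - a[i][j - 1]))
            edges[i][j] = 1 if g > threshold else 0
    return edges
```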
The pixel notation shown in Fig. 1a is also a representation of the RAM memory locations used to store the pixel intensity bytes, except that only sufficient RAM to store four lines is used for reasons of economy. These same RAMs are used sequentially to store the previous four lines throughout the picture, so that any local operator up to 4 × 4 may be employed. One form of 2 × 2 operator, for example, averages the signals for four pixels to form new pixels containing:
(1/4)(a_{i,j} + a_{i-1,j} + a_{i,j-1} + a_{i-1,j-1})    (2)
in place of a_{i,j}. This has a smoothing or noise-reduction effect but at the same time reduces the resolution of fine detail.
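In software the 2 × 2 averaging operator of expr. 2 might look like this (a sketch only; integer grey levels are assumed, so integer division is used, and border pixels are simply copied):

```python
def smooth_2x2(a):
    """2 x 2 averaging operator: each new pixel is the mean of the
    current pixel and its three already-scanned neighbours (expr. 2).
    Border pixels (i = 0 or j = 0) are left unchanged in this sketch."""
    rows, cols = len(a), len(a[0])
    out = [row[:] for row in a]
    for i in range(1, rows):
        for j in range(1, cols):
            out[i][j] = (a[i][j] + a[i - 1][j]
                         + a[i][j - 1] + a[i - 1][j - 1]) // 4
    return out
```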
The microprocessor is quite incapable of selecting the RAM addresses and read/write lines at the speed required for real-time operation, but during the frame flyback time the micro has time to select DMA control and to specify the operation to be performed throughout the following frame scan. An address counter operating at the pixel rate is controlled by the clock of the vidicon (or CCD area image sensor), so that the digitised camera output a_{i,j} is written into address i, j, starting at i = 1, j = 1, but when i = 5, a_{5,j} is written into the memory address that originally stored a_{1,j}, etc. At this point, a_{1,j} is lost, but since at maximum only the last three rows, in addition to the present row, are required for computation, this loss does not prevent correct operation.
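The four-line economy and the overwriting behaviour described above amount to modulo-4 line addressing; a small software model (illustrative only, not the actual hardware addressing logic):

```python
class FourLineBuffer:
    """Circular store holding only the last four scan lines, as in the
    Microvision-100 line-delay RAM: writing line i overwrites line i - 4."""

    def __init__(self, line_length):
        self.lines = [[0] * line_length for _ in range(4)]

    def write_line(self, i, samples):
        self.lines[i % 4] = list(samples)

    def read_line(self, i):
        # Valid only for the current line and the previous three lines.
        return self.lines[i % 4]
```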
3 System architecture
Efforts have been made to keep the system as simple as possible and to eliminate any unnecessary duplication. The input image memory, for example, is also used for the display storage. Transferring information from the input image memory to the main microcomputer bus lines requires DMA operation, and during that time the microprocessor is idle and its bus line is controlled by the DMA controller. By building a special hardware processor, it has proved possible to perform a variety of logic operations at the same time as the DMA transfer, for example, to AND information stored in the intermediate processor with information already written in a microcomputer block, and to put the result back into the same block, thus saving significant time as compared with conventional methods.
The functional block diagram of the system is shown in Fig. 3. Since it was built mainly to test various algorithms, a significant part was kept flexible under microcomputer software control. The system is divided into three principal parts: the front end, consisting of blocks P1, P2, P3 and
Fig. 3 Functional block diagram of Microvision-100
[Blocks: analogue preprocessor P1, video A/D, delays section and magnitude section (P2-P4), frame processor P5, display storage P6, display interface P7 and interface to the micro bus line P8, together with a sync separator, connected by the video S-100 and micro S-100 bus lines]
P4 in Fig. 3, acquires the video signal and carries out the local-operator algorithm in a raster-scan mode. The second part comprises the frame processor P5, the display storage P6, the display interface P7, and, lastly, P8, the interface to the S-100 bus line of the third part, the microcomputer.
3.1 Front end
A standard composite video signal from the camera is processed by an analogue card P1 to provide synchronisation pulses to the system and the video signal, which is sampled and converted to 6 bits by the flash ADC in P2. Local operator circuits in P3 and P4 that perform the edge detection, as described in Section 2, will be described in greater detail with reference to Fig. 4. The 6-bit bus of the sampled
Fig. 4 Real-time grey-level local-operator circuits that supply the video S-100 bus with processed video data
[Signal flow: sampled video signal into the RAM delay lines; video sync into the control and address formation circuit R10, which also provides control to the A/D]
video is input to the RAM delay line R1, which can store up to four lines of sampled video grey levels. This delay employs fast (45 ns) RAMs, and the local delays R2 and R3 are constructed from shift registers of similar speed. Sampled 6-bit signals from four (or more) pixels are transferred from the local delays, before and after the RAM delay, into the magnitude comparison circuits R4 and R5, where the maximum absolute value between the currently tested pixels is compared with a threshold, set by R6, in R7 and R8 to produce a single-bit decision sequence of 'edge' or 'no edge' at each pixel in R9. This result is transferred, through a serial-to-parallel conversion, as an 8-bit byte to the video data lines of the S-100 bus. At the same time, address data for the S-100 bus and controls for the ADC are generated and updated by the sync signal in R10. The frame processor described in the next Section can fetch the local-operator output at any time.
3.2 Frame processor
The frame processor shown in Fig. 5 is continuously active but only performs DMA operations on the S-100 bus when requested to do so. The 8-bit word logic unit in the frame processor can perform logic operations between its two input ports, one from the display memory board (data D-in, either directly or shifted) and one from the edge detector (data C) or a block of the microcomputer (B(n)-out). Communication between the frame processor and the S-100 bus is only possible during a DMA cycle, when the frame processor transfers the required addresses to the bus. The addresses can be 'barrel shifted' within the assigned block, which enables a shifted replica of the fast logical operation to be written into one of the microblocks B(n).
The display memory block is read continuously, independently of any DMA operations, and provides a continuous monitor display of its contents via a parallel-to-serial (PISO) shift register. This is possible, since the addresses
on the video S-100 bus are synchronised with the camera and are active throughout each frame. The frame processor gets its instructions from the microcomputer through the I-mode control bus in Fig. 5. When the frame processor is
Fig. 5 Frame processor of Microvision-100
[Ports: data D-in from the video S-100 bus, output to display, data B(n)-in and B(n)-out on the micro S-100 bus, plus control data and internal control lines]
instructed to execute frame operations, it requests DMA from the micro. Following DMA acknowledge, it performs the task within one frame time (20 ms). The system is thus able to perform logic operations between two blocks of 8K bytes within 20 ms, which is at least an order of magnitude improvement over normal operation of a microcomputer for similar tasks.
4 Software
A block diagram of the programmable part of Microvision-100 is shown in Fig. 6. At the heart of this is the frame processor with its control inputs I from the micro. It can be regarded as an arithmetic logic unit (ALU) operating between an accumulator (the display memory), a number of 8K-byte-long registers (the micromemory blocks B1, B2, ...) and data from the edge detector. The typical process of the system is described in the literature as bit-plane format manipulation [5].
The software is developed outside Microvision-100 and downloaded into it by a dedicated program inside its monitor, and, since the system is fully S-100 compatible, it is easy to adapt it to one of the flexible operating systems on the market (such as CP/M) and to use its assembler.
Any command which is sent through the I-bus to the frame processor prior to DMA transfer must specify three subcommands: Stream direction, Logic operator and Offset.
Fig. 6 Block diagram of Microvision-100 programmable data flow
The Stream direction subcommand specifies the two blocks between which the logic is to be performed. Possibilities are (a) any micromemory block with the display memory block, (b) edge-detected image data with the display memory (the edge-detector output is treated by the frame processor as a block holding the current preprocessed image, although in fact it is being generated in real time), and (c) the display memory block with itself. The stream command also informs the frame processor in which block the results of the logic operations are to be stored. Possibilities are the display memory block or any of the DMA dedicated blocks. Between DMA operations, the micro can use any part of its total memory, including the dedicated block.
The Logic subcommand informs the frame processor which logic operation to perform between the chosen pair of blocks, A and B say. Possibilities are shown in Table 1.
Table 1: Possible logic operations between any pair of blocks A and B

0      A · B    A · B̄    A
Ā · B  B        A ⊕ B    A + B
Ā · B̄  A ⊕ B̄  B̄        A + B̄
Ā      Ā + B   Ā + B̄   1
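The sixteen entries of Table 1 are the sixteen Boolean functions of two variables, applied bytewise during the DMA transfer. They can be modelled in software as follows (the names are illustrative, not the frame processor's actual mnemonics):

```python
# The sixteen Boolean functions of two 8-bit operands, modelled bytewise.
MASK = 0xFF  # keep results within one byte after Python's signed ~

LOGIC_OPS = {
    "ZERO":      lambda a, b: 0,                    # 0
    "AND":       lambda a, b: a & b,                # A.B
    "A_NOTB":    lambda a, b: a & ~b & MASK,        # A.~B
    "A":         lambda a, b: a,                    # A
    "NOTA_B":    lambda a, b: ~a & b & MASK,        # ~A.B
    "B":         lambda a, b: b,                    # B
    "XOR":       lambda a, b: a ^ b,                # A xor B
    "OR":        lambda a, b: a | b,                # A + B
    "NOR":       lambda a, b: ~(a | b) & MASK,      # ~(A + B)
    "XNOR":      lambda a, b: ~(a ^ b) & MASK,      # ~(A xor B)
    "NOTB":      lambda a, b: ~b & MASK,            # ~B
    "A_OR_NOTB": lambda a, b: (a | ~b) & MASK,      # A + ~B
    "NOTA":      lambda a, b: ~a & MASK,            # ~A
    "NOTA_OR_B": lambda a, b: (~a | b) & MASK,      # ~A + B
    "NAND":      lambda a, b: ~(a & b) & MASK,      # ~(A.B)
    "ONE":       lambda a, b: MASK,                 # 1
}
```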
The Offset subcommand informs the frame processor which address 'barrel shift' to perform while writing the result into the target block. Only microcomputer memory blocks are affected, and the operation enables results to be stored in a shifted Cartesian position in the 256 × 256 image array. In terms of X, Y co-ordinates, for example, an image with centre of enclosing rectangle (Fig. 1a) at X₀, Y₀, stored in display memory, can be translated to a new position with centre at (X₀ + ΔX, Y₀ + ΔY) within any microcomputer block. The offset subcommand supplies the frame processor with the required offsets ΔX and ΔY. This facility is used to shift objects from any random position to a central reference position with the central pixel of the enclosing rectangle at 128, 128 [6].
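Treating a stored binary image as a set of pixel co-ordinates, the effect of the Offset subcommand can be sketched as follows (wrap-around addressing within the 256 × 256 block is assumed, as the term 'barrel shift' implies):

```python
def barrel_shift(block, dx, dy, width=256, height=256):
    """Translate a binary image block by (dx, dy) with wrap-around
    ('barrel shift') addressing, as the Offset subcommand does while
    writing results into a target block.

    `block` is a set of (x, y) pixel co-ordinates."""
    return {((x + dx) % width, (y + dy) % height) for (x, y) in block}
```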
The software includes subroutines that perform all the above and many other operations. The following sequence is an example of a command to carry out the exclusive-or (EXOR) operation between the display memory and one of the microcomputer blocks, leaving a shifted result in that block:
STREAM: MVDTB1 ; the stream is between block 1 and the display memory
LOGIC: BXORD ; perform exclusive-or between the blocks
OFFSET: SHIFT X1, Y1 ; write into the target block B1 the result of shifting by X1 and Y1
Following the above sequence, the microcomputer will send a DMA permission to the frame processor. This command informs the frame processor that it can request DMA, but this will only be done at the start of the next TV frame, i.e. it will not impose DMA during an active TV frame.
The low-level commands described so far can be combined to perform high-level processing subroutines based on bit-plane processing. Operations that are not suitable for bit-plane subroutines can still be carried out using normal serial operation of the microcomputer.
5 Experimental results and conclusions
Some typical applications will be described with reference to Figs. 7a and 8a, which show images of components that have to be recognised, inspected for tolerance variations, and rejected if there are missing or added features. Objects that have large percentage differences, such as the letters of the alphabet and numerals, are easily distinguished, but greater discriminating power is required to inspect components like the camera part in Fig. 7a for small defects such as the absence of the small vertical slit, or incorrect dimensions at any part of the component.
Assuming that a vision-controlled industrial assembly robot is required to search for normal parts, the first step is to train the system by storing processed images of the components. The camera image is solid black in Fig. 7a.
Fig. 7 Component teaching, inspection and recognition
a Component image (solid black) and binary-edge-processed image shifted to central reference position
b Enlarged processed image
c Isolated pixel dots of 'perfect' template match
d Detection of missing vertical slit
Fig. 8 Detection of unwanted additional feature
a and b, c, original and processed component images; e, detection of the unwanted triangular burr on faulty component d
The processed image is shown inside the enclosing rectangle translated to the reference position by the X- and Y-shift operators and stored in B(1), say, by the training operation. Similarly, the processed image of Fig. 8a is stored in B(2). In subsequent use, Microvision-100 seeks to locate and identify the two types of component by first shifting each unknown image to the standard position and then forming the exclusive-or of the unknown, first with B(1) and then with B(2). Assuming that the unknown is Fig. 7a, the results of performing this operation with the B(1) memory are shown in Fig. 7c. This faint random pixel outline of the component is due to quantisation noise [8] and, since in practice the dots are mostly isolated pixels, they can be almost entirely eliminated by forming the logical AND of two frames shifted by one pixel. Accurate recognition of any image is thus achieved by looking for the B(n) that gives a zero or very low '1' pixel count as the result of taking the exclusive-or of B(n) and the unknown processed image. Special hardware for performing any number of such comparisons simultaneously in parallel has been described elsewhere [9, 10]. For a faulty component that has no vertical slit, for example, the large number of mismatch pixels shown in Fig. 7d occurs round the location where the hole should be, to indicate the exact location and shape of the mismatch and also to generate a component-reject signal.
If only the rejection of faulty components is of interest, as in some automated inspection applications, the actual location of the imperfection, and whether it is a missing feature or an unwanted added feature such as the triangular burr in Fig. 8d, does not influence the rejection capability of Microvision-100, since any increase in the exclusive-or image pixel count indicates an error and is directly proportional to the magnitude of the error. A reject threshold can thus be set to reject all components with more than a given overall percentage error.
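The recognition-and-reject procedure of this Section can be summarised in a few lines (a software model only: images are represented as sets of edge-pixel co-ordinates already shifted to the central reference position, and the one-pixel shift direction used to suppress isolated quantisation-noise dots is chosen arbitrarily here):

```python
def mismatch_count(unknown, template):
    """Exclusive-or template comparison with isolated-pixel suppression.

    Quantisation-noise dots are mostly isolated pixels, so the XOR
    result is ANDed with a copy of itself shifted by one pixel before
    the surviving mismatch pixels are counted."""
    xor = unknown ^ template                   # symmetric difference = XOR
    shifted = {(x + 1, y) for (x, y) in xor}   # one-pixel shifted frame
    return len(xor & shifted)                  # logical AND, then count

def reject(unknown, template, threshold):
    """Reject a component whose mismatch count exceeds the threshold."""
    return mismatch_count(unknown, template) > threshold
```

A connected run of missing pixels (a missing slit, an added burr) survives the AND and raises the count, while a lone noise dot does not.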
Components with random orientation in addition to random position require, in general, a 360-degree rotation of the camera image for recognition of shape, position and orientation. The rotation has been achieved in a robotic system by rotating the gripper, together with an axially positioned fibre-optic image guide [6].
When it is necessary to recognise a large number N of images stored in the frame memories, the present system with a single ALU must operate N times sequentially, thus taking 20N ms to compare the unknown image with the N stored images. By incorporating N ALUs and enabling all the B(n) in parallel, the comparisons could be made in 20 ms for any N. The economic viability of this multiplication of hardware depends on the application, and in industrial robotics, for example, the resulting increase in production rate may justify the extension.
The problem of recognising components at random distances can be handled by Microvision-100 if a zoom lens is incorporated to make the enclosing rectangle a standard size. The software equivalent would be to include size-invariant transformations [11].
6 Acknowledgments
The research reported in this paper was supported by the SERC Robotics Initiative and the Israeli Ministry of Defence.
7 References
1 DUFF, M.J.B.: 'Review of the CLIP image processing system'. National computer conference, 1978, pp. 1055-1060
2 STROME, W.M., and GOODENOUGH, D.G.: 'The use of array processors in image analysis', in 'Machine aided image analysis'. Institute of Physics Conference Series 44, 1978, Chapter 6
3 STONE, H.S.: 'Introduction to computer architecture' (Science Research Association Inc., 1977)
4 TAYLOR, W.K.: 'Optical texture recognition'. Institute of Physics Conference Series 13, 1973, pp. 276-284
5 REEVES, A.P.: 'An array processing system with a Fortran-based realisation', Computer Graphics and Image Processing, 1979, 9, pp. 267-281
6 TAYLOR, W.K., LAVIE, D., and ESAT, I.I.: 'A curvilinear snake arm with gripper-axis fibre-optic image processor feedback', Robotica, 1983, 1, pp. 33-39
7 TAYLOR, W.K., and AL-KIBASI, K.T.: 'The UCLM3 programmable pattern recognition machine'. IEEE Conference Publication 74CH0885-4C, 1974, pp. 241-246
8 LAVIE, D., and TAYLOR, W.K.: 'Effects of border variations due to spatial quantisation on binary image template matching', Electron. Lett., 1982, 18, (10), pp. 418-420
9 TAYLOR, W.K., and ERO, G.: 'Real time teaching and recognition system for robot vision', The Industrial Robot, 1980, 7, (2), pp. 99-106
10 TAYLOR, W.K.: 'Comparison apparatus for use in pattern recognition'. US Patent 4 119 946, 1978
11 TAYLOR, W.K.: 'A theory of size and intensity invariance and the origin of visual illusions in the brain', in ROSE, J. (Ed.): 'Progress of cybernetics' (Gordon and Breach, 1969), pp. 419-432