

Browsing High Definition Colour Pictures

Philip Willis and David Milford

School of Mathematics, University of Bath, Claverton Down, Bath, Avon BA2 7AY, England. The work described here was supported by a grant from the SERC.

Abstract

The authors describe a method of presenting colour pictures which allows users to browse by panning and zooming. The pictures are seen as though on a 512 by 512 by 12 bit framestore, but are defined to 4096 by 4096. The implementation allows the entire picture to be seen in overview, with fine details averaged, or to be zoomed in upon with finer detail progressively revealed by a sequence of twofold linear magnifications. Further zooming, beyond the resolution of the picture data, automatically produces a conventional pixel replication effect. In addition, the picture may be panned vertically or horizontally.

Keywords: Picture retrieval; picture compression; picture encoding; colour graphics; raster graphics; man-machine interaction; picture archive.

1. Introduction

The work described here forms part of a larger project concerned with the interactive retrieval of colour pictures from a digital archive. A typical application area, the storage and retrieval of library pictorial information, requires a style of access best described as browsing. The user will either know in advance which types of picture are of interest (but not precisely which ones), or will be able to identify the type after some minutes spent retrieving and examining samples. This aspect of retrieval suggests that an interactive system, directly accessed by the user, will be more effective than any batch method.

Colour pictures are not a conveniently consistent shape and so any solution to the browse problem must be able to cope with a variety of shapes. Further, most colour reproduction processes have a higher definition than display tubes have and are capable of producing pictures substantially larger than a convenient size of computer terminal. These two problems are related through the resolution required, typically at least one order of magnitude better (linearly) than conventional computer peripherals.

2. The Approach Taken

We have tackled the browse problem by developing a special, inexpensive colour terminal for connection to a conventional filing system, either locally, to form a self-contained device, or, perhaps typically, through a network to a shared file store. The terminal supports an encoded form of picture which offers both compression and convenient spatial properties essential for pan and zoom. These operations are performed locally to the display for a very rapid interactive response, with incremental on-demand data retrieval from the archive. To the user the system is almost indistinguishable from a framestore except in two particular regards: in our system, the response is almost instantaneous and zooming reveals new detail.

In this paper, we will primarily describe the data structure we use to organise our pictures to permit this fast retrieval, pan and zoom. First we describe the basic structure for a single frame, then we outline how the existing terminal displays such a picture without recourse to an image buffer. Next we discuss the extended structure for a high definition picture and how this can be accessed. We will also indicate the future direction of the project.

North-Holland Computer Graphics Forum 4 (1985) 203-208

3. Basic Data Structures

Our pictures use various forms of quad encoding. This has been a popular method for some years now. Initially it was used by the image processing community [1,2] but work on algorithms for manipulating and representing its variants continues to be reported [3,4,5]. The representation is achieved by recursively quartering a colour image until each quarter, or quad, is of uniform colour. The encoding records the size of the quad and its colour. Sizes are always a power of two so it is only necessary to record the exponent, leading to a compact representation in which most of the data relates to colour.

To reconstruct a picture it is necessary to know which quad occupies which part of the screen. This can be done explicitly, linking the quads as the leaves of a four-branched quad tree. The tree structure reflects the recursive division of the screen. It can also be done implicitly, storing the quads in the order that they were generated to give a quad list. There are several possible orders, depending on exactly how the coding program was written, but these can be thought of as the orderings produced by the various forms of tree traversal.
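The recursive quartering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `encode_quads`, the list-of-rows image layout and the NW, NE, SW, SE visiting order are all hypothetical, and each quad carries explicit coordinates for readability even though a real quad list leaves position implicit in the ordering.

```python
def encode_quads(image, x=0, y=0, size=None):
    """Recursively quarter a square image (side a power of two) into quads.

    Returns a list of (x, y, exponent, colour) tuples in depth-first order,
    where the side of each quad is 2**exponent pixels.
    """
    if size is None:
        size = len(image)
    first = image[y][x]
    uniform = all(
        image[y + dy][x + dx] == first
        for dy in range(size)
        for dx in range(size)
    )
    if uniform:
        # Only the exponent of the size needs recording.
        return [(x, y, size.bit_length() - 1, first)]
    half = size // 2
    quads = []
    for oy in (0, half):          # top pair of quarters, then bottom pair
        for ox in (0, half):      # left quarter, then right quarter
            quads += encode_quads(image, x + ox, y + oy, half)
    return quads


picture = [
    "AABC",
    "AADE",
    "FFGG",
    "FFGG",
]
quad_list = encode_quads(picture)
```

For this 4 by 4 picture the sketch yields seven quads in place of sixteen pixels: three of side 2 and four of side 1.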

The quad tree is a generally useful form with particular characteristics which are important to our work. The tree can be thought of as a root node, corresponding to the entire picture, below which hang four child nodes, representing the major quarters. These in turn have four children each and so on until a leaf node (area of uniform colour) is reached. Each horizontal layer of the tree thus contains data relevant to a particular degree of resolution. The root resolves only the entire screen area. Successive layers resolve at double the resolution of the previous layer. Hence we can resolve to pixel level on a 512 pixel display with a nine layer tree. Another characteristic of the tree is the localization of picture data. Any given node N can be selected as a sub-tree and all nodes below it in the structure contain data which appears in the area of screen corresponding to quad N. Hence quad-aligned cut-and-stick manipulations correspond to snipping off subtrees and reattach- ing them elsewhere. It is also possible to use this characteristic to segment pictures into visually-related areas, a topic of particular interest to the image processing community. Finally, the encoding of pictures in this manner is compact, typically an order of magnitude compressed.
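The relationship between layers, resolution and locality can be made concrete with a small sketch. The quadrant numbering (0 = NW, 1 = NE, 2 = SW, 3 = SE) and the name `node_region` are assumptions chosen for illustration, not part of the original encoding.

```python
def node_region(path, screen=512):
    """Return (x, y, size) of the screen area covered by a tree node.

    `path` lists the quadrant taken at each layer below the root, using the
    assumed numbering 0 = NW, 1 = NE, 2 = SW, 3 = SE.
    """
    x = y = 0
    size = screen
    for quadrant in path:
        size //= 2                    # each layer doubles the linear resolution
        x += (quadrant % 2) * size    # east half
        y += (quadrant // 2) * size   # south half
    return x, y, size


whole_screen = node_region([])        # the root covers the entire screen
one_pixel = node_region([0] * 9)      # nine quarterings reach a single pixel
```

Every node below a given path lies inside the region returned for that path, which is why detaching a subtree corresponds to cutting out a quad-aligned area of the picture.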

The quad tree and its variants have also been used to encode line drawings. The compression achieved is broadly comparable with that for colour pictures, because lines typically correspond to boundaries between colours. Theoretically, lower compression results because a line requires two transitions (white to black followed by black to white) whereas a colour boundary is only a single transition. This is of little concern to us as we are essentially manipulating our pictures in an image-like manner, but line drawings are usually manipulated symbolically. For example, with a line drawing, zooming in does not reveal thicker lines and zooming back does not produce grey, anti-aliased lines.

Closest to the display, we use a quad list organised in scan order: that is, the quads are ordered in that sequence in which a conventional raster scan would first touch them. This is a particularly simple form to feed to the special decoding hardware in our terminal, as will next become apparent.
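Scan order can be illustrated with a short sketch. A raster scan first touches a quad at its top-left corner, so sorting on that corner, row first and column second, gives the ordering described. The tuple layout (x, y, size, colour) and the name `scan_order` are assumptions for illustration; explicit coordinates are shown only to make the sort visible.

```python
def scan_order(quads):
    """Order quads by where a raster scan first touches them.

    Each quad is an (x, y, size, colour) tuple giving its top-left corner.
    """
    return sorted(quads, key=lambda quad: (quad[1], quad[0]))


# A 4 by 4 picture whose top-left quarter is split into pixel quads,
# listed in depth-first (tree traversal) order.
depth_first = [
    (0, 0, 1, "a"), (1, 0, 1, "b"), (0, 1, 1, "c"), (1, 1, 1, "d"),
    (2, 0, 2, "e"), (0, 2, 2, "f"), (2, 2, 2, "g"),
]
in_scan_order = scan_order(depth_first)
```

In this example the large quad "e" moves ahead of "c" and "d", because the first scan line reaches it before the second scan line begins.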

4. The Quad Store Terminal

Our display [6] consists of a conventional MC68000 processor and RAM and a small amount of dedicated hardware capable of converting the quad list to colour video in real time. This hardware is not defeatable by picture complexity, although there obviously comes a point when the order of magnitude compression that simple quad lists achieve is lost. The real-time aspect is crucial to our overall system, because at no time do we need to store a pixel image: the conversion is simply repeated every frame in synchronism with an ordinary colour monitor. The compression achieved by quad encoding is also important, because it allows us to store relatively large amounts of picture data local to the display and because it is not necessary to close-couple the filestore with the terminal. Indeed, it is entirely feasible to use a 19,200 baud serial line. Picture storage is in the conventional slow RAM attached to the processor, avoiding the costs of large, specially organised image stores. We believe this to be the only display directly based on quad-encoding, although Oliver et al. [7] have proposed an alternative scheme.

Pictures are readily converted between the various quad-based forms, so there is no need to use the quad list elsewhere in the system. It is generally useful to have an explicit form such as the quad tree, because this allows spatial operations to be performed easily. We have demonstrated a system in which the tree form is transmitted to the terminal, and then converted by the local processor to quad list form for display. Such conversion takes less than half a second, even for complex pictures with 20,000 quads: this has been demonstrated with a range of colour pictures.

The decoder hardware essentially implements run-length encoding both across and down the screen. When a quad has been completely displayed it is discarded and enough further quads fetched to fill the gap that has been left. Interlacing introduces a complication, but only with pixel size quads. These are accordingly ordered slightly differently to ensure that there cannot be long runs of pixel quads, all of which occur in the field not currently being displayed. This would defeat the decoder. The design now in use cannot be so defeated, even if the entire picture consists of pixel quads. There is thus no picture complexity limit imposed by the decoder, although clearly very complex pictures would not be compressed by quad encoding.
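The discard-and-refill behaviour can be imitated in software. The following is a sketch only, not a description of the hardware: it assumes a non-interlaced scan, a quad list in scan order that tiles the screen exactly, and (x, y, size, colour) tuples; the name `decode_scanlines` is hypothetical. No pixel image is stored, only the quads currently being displayed.

```python
def decode_scanlines(quads, width, height):
    """Yield one row of colours per scan line from a scan-ordered quad list."""
    pending = iter(quads)   # quads not yet fetched, in scan order
    active = []             # quads still on screen, left to right
    for y in range(height):
        row = []
        still_active = []
        index = 0
        x = 0
        while x < width:
            if index < len(active) and active[index][0] == x:
                quad = active[index]      # continue a quad begun on an earlier line
                index += 1
            else:
                quad = next(pending)      # gap left by a finished quad: fetch another
            _, top, size, colour = quad
            row.extend([colour] * size)   # a run across the line
            if y < top + size - 1:
                still_active.append(quad) # more lines of this quad remain
            x += size
        active = still_active             # completed quads are discarded
        yield row


quads = [
    (0, 0, 2, "A"), (2, 0, 1, "B"), (3, 0, 1, "C"),
    (2, 1, 1, "D"), (3, 1, 1, "E"),
    (0, 2, 2, "F"), (2, 2, 2, "G"),
]
rows = ["".join(row) for row in decode_scanlines(quads, 4, 4)]
```

Because the list is in scan order, the next unfetched quad is always the one needed to fill the leftmost gap on the current line, so the sketch never has to search the list.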

5. Extended Data Structure

The ordinary quad tree will in practice have few leaves corresponding to very large quads and many at pixel level. Milford [8] quotes figures of 60% of quads being at pixel level, 85% at pixel or 2 by 2 pixels. Hence the branches close to the root seldom lead directly to leaves, only to further branches. It is therefore more economical if the root branches directly to the lower levels of the tree and this will also decrease the access time to any given leaf. We adopt a 64 way initial branch from the root, followed by conventional four way branches, for any given frame. This imposes a maximum size of 64 pixels square on any given quad. The screen picture can thus be visualised as an 8 by 8 chequerboard pattern, within each cell of which is a quad-tree.

Such a modified structure is still only capable of representing a given frame and so in fact we extend it further by making the root branch 4096 times, corresponding to 64 by 64 cells in total. Since each cell corresponds to 64 by 64 pixels, we can thereby represent pictures with up to 4096 by 4096 pixels. This structure can be viewed at various pan and zoom settings by simple limits to the traversal, so we now consider these in turn.
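The arithmetic of the 4096 way root can be checked with a short sketch. The row-major numbering of the root's branches and the name `cell_of` are assumptions; the text does not say how the branches are ordered.

```python
CELL_SIZE = 64      # pixels along one side of a cell
CELLS_ACROSS = 64   # cells along one side of the full picture


def cell_of(px, py):
    """Map a picture pixel to (root branch index, offset within that cell).

    Root branches are assumed to be numbered row by row (hypothetical layout).
    """
    cx, cy = px // CELL_SIZE, py // CELL_SIZE
    branch = cy * CELLS_ACROSS + cx
    return branch, (px % CELL_SIZE, py % CELL_SIZE)


total_branches = CELLS_ACROSS * CELLS_ACROSS   # 4096 branches from the root
picture_side = CELL_SIZE * CELLS_ACROSS        # 4096 pixels along each side
last = cell_of(4095, 4095)
```

Within a cell, a conventional four way tree needs at most six layers to reach pixel level, since 64 is two to the power six.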

5.1. Natural size with pan

If we assume that we wish the smallest leaf quad to correspond to one screen pixel, we obtain a picture at 1:1 or natural size. Only part of the entire picture will thus be seen at any instant. The relevant part of the tree corresponds to a window of 8 by 8 cells aligned on the entire grid of 64 by 64 cells. These are easily identified and the corresponding parts of the tree are traversed to produce a quad list for the display decoder. The window can be aligned with any cell boundaries, producing an incremental pan. The increment is the cell, one eighth of the screen size, easily tracked by the viewer.
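Identifying the cells under the window is a matter of enumerating an 8 by 8 block of the grid. The sketch below assumes the window is addressed by the cell at its top-left corner and is clamped to stay inside the picture; the clamping policy and the name `window_cells` are illustrative choices not stated in the text.

```python
def window_cells(left, top, span=8, grid=64):
    """List the (column, row) cells covered by a span-by-span window.

    The requested top-left cell is clamped so the window stays on the grid.
    """
    left = max(0, min(left, grid - span))
    top = max(0, min(top, grid - span))
    return [
        (cx, cy)
        for cy in range(top, top + span)
        for cx in range(left, left + span)
    ]


cells = window_cells(10, 20)          # natural-size view panned to cell (10, 20)
panned_right = window_cells(11, 20)   # one pan increment: one cell to the right
```

A one-cell pan shares seven of its eight columns with the previous window, so only one new column of eight cells has to be traversed afresh.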

5.2. Zoom

Zooming is achieved by varying the notional size of the window: the larger the window, the more of the structure it encompasses and hence the smaller any particular detail will appear on screen. Conversely, reducing the size of the window will produce a zoom-in effect. This can also be thought of as starting the tree traversal from nodes at different depths in the structure: starting at the root gives a complete overview; starting one layer deeper shows one quarter but at higher detail. The display screen has a lower resolution than the tree, so in practice we traverse the tree only as deep as necessary to show screen pixel data. Each layer of the tree corresponds to a doubling of the linear resolution of the picture: hence we achieve a times two incremental zoom each time we go one layer deeper. If the tree does in fact contain further detail the result is a true detail-revealing zoom. If it does not, then the leaf reached is interpreted at correspondingly larger size, producing a conventional pixel replication effect. For this to work at all levels of zoom it is necessary to associate a colour value with every node, not just at the leaves. The appropriate colour value to use is the average of the four below it. Leaf node colours are thus explicitly determined by the 'actual' picture, colour values elsewhere are averages. The resulting pictures are thus fully anti-aliased at zoom settings not requiring pixel replication. It is also possible to use the colour data in a slightly different way, to reveal new symbology rather than new detail, but we have not investigated this.
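The role of the averaged colours can be seen in a small depth-limited traversal. This is a sketch under assumptions: colours are taken to be three 4 bit channels (one plausible reading of a 12 bit colour field), averages use integer division, children are ordered NW, NE, SW, SE, and the names `Node` and `render` are hypothetical.

```python
class Node:
    """Quad tree node carrying a colour at every level, not just the leaves."""

    def __init__(self, colour=None, children=None):
        self.children = children or []
        if self.children:
            # Interior colour: per-channel average of the four children.
            self.colour = tuple(
                sum(child.colour[i] for child in self.children) // 4
                for i in range(3)
            )
        else:
            self.colour = colour


def render(node, x, y, size, depth, out):
    """Paint `node` into the square of `out` at (x, y), descending `depth` layers."""
    if depth == 0 or not node.children:
        # Either the screen needs no more detail (averaged colour is shown)
        # or the tree has none (the leaf is replicated over the whole area).
        for row in range(y, y + size):
            for col in range(x, x + size):
                out[row][col] = node.colour
        return
    half = size // 2
    for i, child in enumerate(node.children):
        render(child, x + (i % 2) * half, y + (i // 2) * half, half, depth - 1, out)


leaves = [Node((0, 0, 0)), Node((4, 4, 4)), Node((8, 8, 8)), Node((12, 12, 12))]
root = Node(children=leaves)

overview = [[None] * 2 for _ in range(2)]
render(root, 0, 0, 2, 0, overview)      # stop early: averaged colour only

replicated = [[None] * 4 for _ in range(4)]
render(root, 0, 0, 4, 2, replicated)    # ask for more depth than the tree holds
```

Stopping the traversal early shows the averaged colour, while asking for more depth than the tree contains paints each leaf over a larger block, which is the pixel replication effect.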

Zoom and pan are easily combined. Essentially the zoom setting determines the layer in which tree traversal starts and pan determines which particular node within the layer. The number of layers traversed is determined by the screen resolution.

6. Results

Pictures at 4k by 4k definition are not easily obtained and so we constructed one as a patchwork of 512 by 512 pictures generated from a volume modelling program in the School of Engineering. This is sufficiently complex, about 100,000 quads in total, to be a demanding test. Each quad has a 12 bit colour field, so the entire picture would occupy 24 Mbytes if represented explicitly.

Fig. 1 Overview of a 4k by 4k by 12 bits picture
Fig. 2 Detail of same picture, times 2 true enlargement
Fig. 3 Detail of same picture, times 4 true enlargement
Fig. 4 Detail times 8 true enlargement: screen pixel equals picture pixel

Figure 1 is an overview of the complete picture. Figures 2 to 4 show successive pan and zoom combinations until the picture is revealed at screen resolution. Hence Figure 1 is at 4k resolution; Figure 2 is at 2k; Figure 3 is at 1k; Figure 4 is at 512.

Figures 5-7 continue the sequence beyond the resolution of the picture data, giving the same effect as a pixel-replicating zoom. Figure 7 thus corresponds to a 64 by 64 sample from the original 4k by 4k picture.
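The figures quoted for the zoom sequence and for explicit storage follow from simple arithmetic, sketched below. The function name `zoom_settings` and its return layout are illustrative only; the constants come from the text.

```python
PICTURE = 4096   # picture pixels along one side
SCREEN = 512     # screen pixels along one side


def zoom_settings(magnification):
    """Return (picture pixels across the window, averaging factor, replication factor)."""
    span = PICTURE // magnification
    if span >= SCREEN:
        return span, span // SCREEN, 1   # several picture pixels averaged per screen pixel
    return span, 1, SCREEN // span       # each picture pixel replicated on screen


# 12 bits for every one of the 4096 by 4096 pixels, expressed in bytes.
explicit_bytes = PICTURE * PICTURE * 12 // 8

sequence = {m: zoom_settings(m) for m in (1, 2, 4, 8, 16, 32, 64)}
```

At times 8 the window spans exactly 512 picture pixels, so one screen pixel equals one picture pixel. At times 64 it spans 64 picture pixels, each replicated 8 by 8. An explicit image at 12 bits per pixel comes to 25,165,824 bytes, which is 24 Mbytes.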

Figure 8 shows the effect of limiting the depth of the tree scan, yielding a picture of lower resolution than the display monitor.

The longest pause between requesting a picture and seeing it appear occurs when there is greatest on-screen complexity. Hence, given Figure 1 on screen and requesting Figure 2, it takes approximately half a second for Figure 2 to appear. The other pictures are produced even more quickly. In particular, using low resolution (as in Figure 8) gives a nearly instantaneous response. This could be exploited when panning to give a near-continuous pan, with an automatic update to the full definition picture when the pan ceases.

Fig. 5 Detail times 16: pixels replicated at 2 by 2
Fig. 6 Detail times 32: pixels replicated at 4 by 4
Fig. 7 Detail times 64: pixels replicated at 8 by 8

7. Current Investigation

Our present research centres around using the above system as the basis of a pictorial archive retrieval device. We intend connecting the terminal to a networked filestore and then investigating methods of paging pictorial data such that the user sees a satisfactory response to retrieval, panning and zooming but without loading the network with too much unwanted data. Once again the quad tree helps us with this. At any given instant there will be a certain picture on screen. Panning will cause edgewise adjacent data to be needed, zooming requires depth-wise data. This locality of reference can be used to fetch data from the filestore in advance of need.
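One way such locality might be turned into a list of prefetch candidates is sketched below. This is purely illustrative: the text describes the paging scheme as future work and specifies none of it, so the node addressing (layer, column, row), the choice of neighbours and children as candidates, and the name `prefetch_candidates` are all assumptions.

```python
def prefetch_candidates(layer, nx, ny, max_layer):
    """Suggest nodes to fetch ahead of need for the node currently on screen.

    A node is addressed as (layer, column, row); layer k is assumed to hold
    2**k nodes along each side.
    """
    side = 2 ** layer
    candidates = []
    # Panning needs the edgewise-adjacent nodes in the same layer.
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        x, y = nx + dx, ny + dy
        if 0 <= x < side and 0 <= y < side:
            candidates.append((layer, x, y))
    # Zooming in needs the four nodes directly below the current one.
    if layer < max_layer:
        candidates += [
            (layer + 1, 2 * nx + ox, 2 * ny + oy)
            for oy in (0, 1)
            for ox in (0, 1)
        ]
    return candidates


from_overview = prefetch_candidates(0, 0, 0, max_layer=3)
```

From the overview there is nothing to pan to, so under these assumptions only the four quarter-picture nodes would be requested.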

Fig. 8 Under-traversed tree, giving a low resolution detail

Acknowledgements

The pictorial data and algorithmic help were kindly supplied by our colleague, Dr J.R. Woodwark, School of Engineering. The project continues with further support from the SERC.

References

1. A. Klinger and C.R. Dyer, "Experiments on picture representation using regular decomposition," Computer Graphics and Image Processing 5 pp. 68-105 (1976).

2. G.M. Hunter and K. Steiglitz, "Operations on images using quad trees," IEEE Transactions on Pattern Analysis and Machine Intelligence 1(2) pp. 145-153 (1979).

3. J.R. Woodwark, "The explicit quad tree as a structure for computer graphics," The Computer Journal 25 pp. 235-238 (May 1982).

4. M.A. Oliver and N.E. Wiseman, "Operations on quadtree encoded images," The Computer Journal 26 pp. 83-91 (Feb 1983).

5. I. Gargantini, "An effective way to represent quadtrees," Communications of the ACM 25 pp. 905-910 (Dec 1982).

6. D.J. Milford and P.J. Willis, "Quad encoded display," IEE Proceedings 131 (Part E: Computers and Digital Techniques), (May 1984).

7. M.A. Oliver, T.R. King, and N.E. Wiseman, "Quadtree scan conversion," Proceedings of Eurographics 84, pp. 265-276 North-Holland, (1984).

8. D.J. Milford, "The display of quadtree encoded pictures," Ph.D. Thesis, University of Bath (1984).