A Hybrid Approach to Medium- and Low-Resolution Font-Scaling and Its OOP Style Implementation


Diss. ETH No 10884

A Hybrid Approach To

Medium- And Low-Resolution

Font-Scaling

And Its OOP Style Implementation

A dissertation submitted to the

Swiss Federal Institute of Technology Zürich

for the degree of

Doctor of Technical Sciences

presented by

Beat Stamm
Dipl. Phys. ETHZ

born 18 March 1963

citizen of Basel, Switzerland

accepted on the recommendation of

Prof. Dr. J. Gutknecht, examiner

Prof. Dr. R. D. Hersch, co-examiner

1994



Contents

0 Introduction
0.0 Bitmapped fonts and font-scaling
0.1 The raster tragedy at low resolution
0.2 The scope of this thesis
0.3 Guide to the reader

1 The Others: A Brief Overview of Existing Approaches
1.0 The MIT Project: A formal approach
1.1 The ETH Project: An interactive image-oriented approach
1.2 Adobe Type 1 Fonts: An approach with hints
1.3 Apple/Microsoft TrueType Fonts: An approach with instructions

2 The Foundations: An Object-Oriented Graphical Toolbox
2.0 Coordinates, pixels, and rectilinear regions
2.0.0 Vector displays
2.0.1 Raster displays
2.0.2 The grid-point model
2.0.3 The checker-board model
2.0.4 A conflict of models
2.0.5 Conclusion 1: Coordinates vs. pixels
2.0.6 Conclusion 2: Rectilinear regions
2.0.7 Conclusion 3: Object-oriented programming
2.1 Concepts of object-oriented programming and their notation
2.1.0 Encapsulation
2.1.1 Late binding
2.1.2 Extensibility & polymorphism
2.1.3 Notation & terminology
2.2 Structuring a graphics library into interchangeable components
2.2.0 Graphical objects
2.2.1 Rendering tools
2.2.2 Raster devices
2.2.3 Levels of abstraction
2.3 Assessment of the new approach for a graphical toolbox

3 The Input: A Simple Outline-Font Editor
3.0 Knots



3.0.0 The entity of interaction
3.0.1 Different ways of marking
3.0.2 Cumulative marking and ad-hoc constraints
3.0.3 Marking beyond two dimensions
3.1 Steps beyond ordinary cut & paste
3.1.0 Ordinary cut & paste
3.1.1 Dragging
3.1.2 Inserting & dragging with extended foci
3.1.3 Canceling
3.1.4 Split & merge
3.1.5 Contour orientation & seed-points
3.1.6 Zooming & other viewing attributes
3.1.7 Convenience vs. non-modality
3.2 An experience with third order Bézier curves
3.2.0 A font-designer's point of view
3.2.1 A font-scaler's "point of view"
3.2.2 Conclusions
3.2.3 Adroitly defined Bézier curves
3.3 Assessment of the approach for an outline-font editor

4 The Backbone: A Formalism For Intelligent Outline-Fonts
4.0 The fundamental raster problem
4.0.0 Scaling - not a linear mapping?
4.0.1 Equality
4.0.2 Symmetry
4.0.3 Connectivity
4.1 An outline-oriented declarative high level language for fonts
4.1.0 Formal language vs. interactivity
4.1.1 High level vs. low level
4.1.2 Declarative vs. imperative
4.1.3 Outline-orientation vs. image-orientation
4.2 Semantic implications: The round-before-use rule
4.3 Formally defined regularity
4.3.0 Components
4.3.1 Attributes
4.3.2 Names & types
4.3.3 Instancing transformations
4.3.4 Hierarchy
4.3.5 Assembly in pixel-space
4.3.6 Scopes



4.3.7 Extensibility
4.4 Implied dynamic regularization
4.4.0 Stern regularity vs. artistic license
4.4.1 Curved contours
4.4.2 Near regularity
4.4.3 Non proportionality
4.4.4 Dynamic glyphs
4.4.5 Snapping into regularity
4.4.6 Well-behaved degeneration
4.4.7 The lower limit
4.5 Steps beyond contours
4.5.0 Jaggies on the fringe
4.5.1 Half-bitting
4.5.2 Drop-outs & non-standard scan-conversion
4.5.3 Device independence & single pixels
4.6 Inter-character spacing & "wysiwyg" type-setting
4.6.0 From characters to words
4.6.1 Left and right side-bearings
4.6.2 Pair-kerning
4.6.3 "Wysiwyg" type-setting
4.7 Formal notation & graphical feed-back
4.7.0 Floating elements
4.7.1 Static feed-back
4.7.2 Dynamic feed-back
4.7.3 Structural feed-back

5 The Implementation: An Extensible OOP Style Application
5.0 A font-scaler's "point of view"
5.0.0 Hierarchy
5.0.1 Attributes
5.0.2 Polymorphous modeling
5.0.3 Inherited decoupling of tasks
5.1 A font-compiler's "point of view"
5.1.0 Syntax & symbols
5.1.1 Terms & expressions
5.1.2 Once more: Polymorphous modeling
5.1.3 Names & scopes
5.2 A single recursive data structure for the font-scaler and -compiler
5.2.0 A conflict of aims?
5.2.1 Compile once, scale many times



5.2.2 Recursive descent parsing & error handling
5.2.3 Persistent objects & intelligent riders
5.3 By way of assessment: Using Oberon-2 for the implementation
5.3.0 Abstract concepts vs. concrete records and pointers
5.3.1 Canonical structures vs. multiple polymorphism
5.3.2 Polymorphism vs. wrapping
5.3.3 Forwarding, delegating & message records vs. methods

6 The Balance: A Competitive Prototype
6.0 The results: Demands met?
6.1 Artistic aspects
6.2 Technical aspects
6.3 Future research topics

7 Conclusions

Appendix A: The Algorithms
A.0 Graphical objects
A.0.0 Lines
A.0.1 Circles
A.0.2 Ellipses & arcs
A.0.3 Bézier & spline curves
A.1 Rendering tools
A.1.0 Recording pixel boundaries
A.1.1 Outlining & bounding boxes
A.1.2 Filling simple objects
A.1.3 Filling & winding numbers
A.2 Raster devices
A.2.0 Copying orthogonal rectangles
A.2.1 Scaling orthogonal rectangles
A.3 Spline-to-Bézier transformation
A.3.0 The easy part
A.3.1 Misleading precision
A.3.2 The hard part
A.3.3 Adequate results for medium- and low-resolution

Appendix B: The Binary Format of Unstructured Outline-Fonts

Appendix C: The Language Definition for Intelligent Outline-Fonts

Appendix D: A Self-Contained Example of a Font Definition

References



Abstract

The present thesis is dedicated to the scaling of fonts for medium- and particularly for low-resolution raster-devices. We start out from the simple but fundamental observation that a scaling function, which inevitably has to quantize, is inherently non-linear. We call this the fundamental raster problem. From this non-linearity, and from the demand to preserve the regularity properties under scaling, the necessity to decompose a font's characters into components ensues inexorably. In contrast to other approaches, however, we do not stop decomposing on the level of glyphs, but we proceed with the level of contours, and down to the level of knots and single numbers. This is a consequence of the fundamental raster problem.

This hierarchy of components unveils far-reaching opportunities. On the one hand, it determines the structural information that is indispensable to preserve regularity on all the levels of hierarchy. On the other hand, we can package individual intelligence into whatever level of hierarchy is most appropriate to do so. With a few intelligible concepts, we provide for dynamic regularization in a novel way. In contrast to commercial approaches, we do not just propose yet another font-description language, but we illustrate a strategy for how this font-description language is to be used, particularly at low resolution. The key ideas of this strategy are adroitly defined Bézier curves and the round-before-use rule. They are both consequences of the desire to define components in font-space, combined with the need to assemble them in pixel-space. As a result, we obtain intelligent outlines, whose semantics closely mirror the world of typography.

Our approach is a hybrid approach in more than one way. On the one hand, the bare shapes are drawn out of a graphical outline-font editor, the structural information is declared textually, while graphical representations of the compiled declarations are introduced back into the formal language, in order to provide feedback. In doing so we cope with the wish to acquire both shape and structure in a way which is most appropriate for the respective task. Besides outline-orientation, on the other hand, the extensibility of contours allows for a certain amount of image-orientation. In doing so we fulfill the demand to talk not only about the geometry of the glyphs and characters, but also about their digitized appearance.

The graphical core of the font-scaler is based upon an object-oriented graphical tool-box. Besides its almost infinite extensibility, this tool-box realizes the strict decoupling of the actual rendering from the pure digitizing, and from the underlying raster-devices. It does so by encapsulating the intelligence for particular rendering algorithms into active objects. These intelligent renderers are propagated to the font-scaler, which is relieved thereby from issues depending neither on the font's geometry, nor on the characters' appearance. The orthogonality of the tool-box makes it a highly valuable basis even for rather demanding general-purpose drawing programs. All the algorithms used in our approach are given in an Appendix. They contain many refinements to existing algorithms that may be of interest to the practitioner in raster graphics.

The implementation of the font-scaler, the proof-by-existence, is an application embedded into an extensible environment. By implementing a substantial part of the environment ourselves, we pursue several goals, not the least of which is to provide ideal premises for our application. As a result, we obtain most probably the simplest and most concise scaler for intelligent fonts ever published in its entirety.



Zusammenfassung (Summary)

The present thesis is dedicated to the scaling of fonts for raster devices of medium and low resolution. We start from the simple but fundamental observation that a scaling function, which inevitably has to quantize, is inherently non-linear. We call this the fundamental raster problem. From this non-linearity, and from the requirement to preserve the regularity properties of a font under scaling, the necessity to decompose the characters of a font into components follows inevitably. In contrast to other approaches, however, we do not stop decomposing at the level of glyphs, but continue at the level of contours, down to the level of knots and single numbers. This is a consequence of the fundamental raster problem.

This hierarchy of components reveals far-reaching opportunities. On the one hand, it determines the structural information needed to preserve regularity on all levels of the hierarchy. On the other hand, we can pack individual intelligence into whichever level of the hierarchy is most suitable for it. With a few comprehensible concepts we provide dynamic regularization in a new way. In contrast to commercial approaches, we do not simply propose yet another font-description language, but illustrate a strategy for applying this font-description language to the scaling of fonts, particularly at low resolution. The key ideas of this strategy are adroitly defined Bézier curves and the round-before-use rule. Both are consequences of the wish to define components in font-space, combined with the necessity to assemble them in pixel-space. As a result, we obtain intelligent outlines whose semantics closely mirror the world of typography.

Our approach is a hybrid approach in more than one respect. On the one hand, the bare shapes are taken from a graphical outline-font editor, the structural information is declared textually, while the graphical representations of the compiled declarations are inserted back into the text in order to provide feedback. In doing so, we fulfill the wish to acquire both the shapes and the structure in the way that is most suitable for the respective task. On the other hand, besides outline-orientation, the extensibility of the contours permits a certain degree of image-orientation. In doing so, we fulfill the requirement to be able to talk not only about the geometry of the glyphs and characters, but also about their quantized appearance.

The graphical core of the font-scaler is based on an object-oriented graphical tool-box. Besides its almost infinite extensibility, this tool-box realizes a strict decoupling of the actual rendering from the pure digitizing and from the underlying raster devices. It does so by encapsulating the intelligence for particular rendering algorithms in active objects. These intelligent rendering tools are propagated to the font-scaler, which is thereby relieved of issues that depend neither on the font's geometry nor on the appearance of the characters. The orthogonality of the tool-box makes it a very valuable basis even for rather demanding general-purpose drawing programs. All the algorithms used in our approach are listed in an appendix. They contain many refinements of existing algorithms, which may be of interest to the practitioner in raster graphics.

The implementation of the font-scaler, the proof of existence, is an application embedded in an extensible environment. By implementing a substantial part of the environment ourselves, we pursue several goals, not least to provide ideal conditions for our application. As a result, we obtain what is perhaps the simplest and most concise scaler for intelligent fonts ever published in full.



0 Introduction

0.0 Bitmapped fonts and font-scaling

Today, an increasing number of personal computers and workstations are equipped with bitmapped displays and laser printers. The flexibility of these raster devices makes the computer available to any kind of graphical output, without investing in special purpose peripherals. The ability to do graphics is delegated to the respective software (or firmware). It actually amounts to nothing more than building up an image out of many small picture elements or pixels.

With that, the computer has become available notably to type-setting and fonts. On a raster device, a bitmapped font comprises a particular choice of pixels that mimic as closely as possible an individual font style and type size, the latter denoting the maximum over all the characters' vertical extensions (in units of length called points or pt, corresponding to about 1/72 of an inch or about 0.35 mm; for a more detailed definition cf. [Andr93]).

the maximum of the extensions in vertical direction

The choice also depends on the device resolution, which denotes the number of pixels per unit length (for instance, dpi or dots per inch). A higher resolution corresponds to smaller pixels, of which more are needed, but it also permits reproducing more details, and vice versa.

In order to make different fonts available to computers, it looks as if all we needed to do is to define which pixels make up which font at what type size and device resolution. On a modern computer we may expect a graphics-oriented interactive software tool that would assist us in our endeavor. Such a pixel-font editor would let us specify the pixels using a pointing device such as a mouse, or allow us to cut-and-paste parts of characters - just all the conveniences of today's computer programs.



But there is one thing we have not reckoned with. Suppose we wanted to define all the pixels for a Times font, for two raster devices (a 72 dpi screen and a 300 dpi printer), four variants (book, italic, bold, italic-bold), and 8 type sizes (from 6 to 36 pt). This requires raster data of just under one megabyte, or about 2^23 pixels. Assume that using our pixel-font editor we manage to specify one pixel per second (sometimes we may need more time for serious artwork, sometimes less, thanks to cut-and-paste etc.). With about 2^6 seconds to the minute, 2^6 minutes per hour, 8 hours a day, and roughly 2^8 days a year (no work on week-ends), we might make it - in just under one year of pixel-editing.
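The estimate above can be retraced with a few lines of arithmetic. The following is only a sketch under the text's own power-of-two approximations; the constant and function names are chosen here for illustration.

```python
# Working time per year under the power-of-two approximations used in the text.
SECONDS_PER_MINUTE = 2 ** 6   # about 60
MINUTES_PER_HOUR = 2 ** 6     # about 60
HOURS_PER_DAY = 2 ** 3        # 8 working hours
DAYS_PER_YEAR = 2 ** 8        # roughly 256 working days, no week-ends

PIXELS_TO_EDIT = 2 ** 23      # raster data of about one megabyte, one bit per pixel


def editing_years(pixels, pixels_per_second=1):
    """Years of manual pixel-editing needed at the given editing speed."""
    seconds_per_year = (SECONDS_PER_MINUTE * MINUTES_PER_HOUR
                        * HOURS_PER_DAY * DAYS_PER_YEAR)
    return pixels / pixels_per_second / seconds_per_year


print(editing_years(PIXELS_TO_EDIT))  # 1.0
```

Since the raster data is slightly less than 2^23 pixels, the result is slightly less than one working year, and this is for a single font family only.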

Even if our estimate should be too high by several orders of binary magnitude, we shall have to consider more than only two different device resolutions. Today's commercial screens have a resolution of 64 to 100 dpi (which is considered low-resolution), matrix printers and facsimile devices range from maybe 100 to 300 dpi, and laser printers start at 300 dpi (which is commonly considered medium-resolution). Furthermore, we may need more than eight type sizes, particularly in view of the zooming capabilities of more demanding type-setting programs. On top of that, there are about 6000 (!) fonts in use in the western world [Karow92b].

These figures show clearly that bitmapped fonts quickly exhaust manual definition. It is thus understandable that we wish to delegate the labor to appropriate software. Now, naively seen, the character A, for instance, is always an A, whether it is somewhat larger or smaller, or whether the resolution is higher or lower. If it were not an A anymore, we would not be able to identify it as an A; the software will have to make a different choice of pixels, depending on the type size and device resolution, but it is always an A.

a different choice of pixels mimics a 72 pt Times A, depending on the device resolution

Therefore, all we seem to have to do is to capture the shape of the characters in a form which is available to the simple mathematical transformation of scaling. The shapes of the characters and fonts are the same at different resolutions. In the simplest case, we could think of an approximation of the shapes by a large number of very short straight lines, but fortunately, apart from straight lines, there are some curves that can be scaled very easily, too. Like this, the characters and fonts become representable in a generic form. The process of producing bitmapped fonts from a generic representation is called font-scaling. Ideally, the fonts are scaled on demand, and the scaling is performed in the twinkling of an eye, which is why it is called somewhat informally on-the-fly font-scaling.
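As a minimal sketch of what scaling a generic representation amounts to, assume that an outline is simply a list of (x, y) knots in font units. The function `scale_outline` and the rectangle below are hypothetical names and data chosen for illustration only; they are not taken from the implementation described later.

```python
from fractions import Fraction


def scale_outline(outline, factor):
    """Scale every knot of an outline by the same factor (a linear map)."""
    return [(x * factor, y * factor) for x, y in outline]


# Hypothetical stem: a rectangle 30 units wide and 300 units tall, in font units.
stem = [(0, 0), (30, 0), (30, 300), (0, 300)]

# Scaling from 72 pt down to 10 pt keeps the shape but yields fractional coordinates.
scaled = scale_outline(stem, Fraction(10, 72))
print(scaled[2])  # (Fraction(25, 6), Fraction(125, 3))
```

The scaled knots generally no longer fall on integer positions, which is where the choice of pixels comes in.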

0.1 The raster tragedy at low resolution

From reading the preceding section the uninitiated reader might object with a question like: "So, what is the problem? All that your font-scaling has to do is to overlay the shapes with a grid, reflecting the targeted type size and device resolution, and determine the pixels which are covered by the shape by at least 50%." In fact, we have met quite a few colleagues, with a solid background in engineering, who thought that all this boils down to is maybe to devise an efficient algorithm for the "50%-rule", like the linear incremental algorithm for straight lines [Bresenham65].

With the legacy of what we shall present in Chapter 2 and Appendix A, we have the tools to scale, digitize, and fill straight lines and curves. The respective algorithms are implemented according to the rules of the engineering profession: they are correct, efficient, and concise. In the outline-font editor to be presented in Chapter 3, these tools are used fruitfully to acquire characters and fonts in electronic form. Yet, if we apply these algorithms to scale a font down to screen resolution (cf. also 4.1.3), we are presented with results such as the one illustrated below.

A Times font, scaled naively down to 24 pt at 72 dpi

Frankly speaking, this feels a bit like being awarded a penalty illegitimately. Unfortunately, we shall find ourselves in a dead-end road - despite undoubtedly correct navigational decisions - again (cf. particularly 4.5.2 sq).

Looking at what happens when we overlay the shapes with a grid and subsequently apply the "50%-rule", we begin to see that the problem is not primarily a lack of efficiency, but a much more fundamental one (cf. 4.0).



The shape of a Times H, overlaid by a grid that reflects 18 pt at 72 dpi (left), and applying the "50%-rule" (right)

Clearly, in terms of pixels the two stems are unequal, although the original shapes are equal. Likewise, the serifs have become irregular - if they are still there at all - despite the regular original shapes that, in particular, are symmetric. Finally, the entire character disintegrated into two disjointed parts, even though the original shape is connected.
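The inequality of the two stems can be reproduced in one dimension. The sketch below assumes unit pixels covering the intervals [i, i + 1) and a hypothetical stem that is 1.4 pixels wide after scaling; the function name and the numbers are illustrative and not taken from the figure.

```python
import math
from fractions import Fraction


def half_coverage_rule(left, right):
    """Indices of unit pixels [i, i + 1) that the span [left, right) covers by at least 50%."""
    on = []
    for i in range(math.floor(left), math.ceil(right)):
        overlap = min(right, i + 1) - max(left, i)
        if overlap >= Fraction(1, 2):
            on.append(i)
    return on


WIDTH = Fraction(7, 5)  # hypothetical stem width of 1.4 pixels after scaling

# Two stems of equal width, placed at different offsets relative to the grid.
left_stem = half_coverage_rule(Fraction(1), Fraction(1) + WIDTH)
right_stem = half_coverage_rule(Fraction(33, 10), Fraction(33, 10) + WIDTH)

print(left_stem, right_stem)  # [1] [3, 4]
```

Both spans have the same width, yet one is reproduced with a single pixel and the other with two, merely because of where they happen to fall on the grid.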

If the algorithms are correct, then maybe the filling paradigm was a bit rash, that is, the rule to turn a pixel "on" if at least 50% thereof is covered by the shape. Maybe we should use the paradigm advocated in [Hersch88], that is, to turn a pixel "on" if its center lies in the interior of the shape. In fact, in the way in which we shall define coordinates and pixels (cf. 2.0.5), and therefore the digitizing (cf. A.0) and filling (cf. A.1) algorithms, we are actually applying this very same filling paradigm, too.

The same initial situation as above, but applying the "pixel-center-rule"

As we can see, the other rule does not help too much. Although the resulting choice of pixels happens to mimic a connected figure, it still introduces far more irregularities than what is acceptable.
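A one-dimensional sketch of the "pixel-center-rule" illustrates why changing the filling paradigm alone does not restore regularity. As before, the unit pixels [i, i + 1), the function name, and the stem width of 1.4 pixels are assumptions made for illustration.

```python
import math
from fractions import Fraction


def pixel_center_rule(left, right):
    """Indices of unit pixels whose center i + 1/2 lies inside the span [left, right)."""
    half = Fraction(1, 2)
    return [i for i in range(math.floor(left), math.ceil(right))
            if left <= i + half < right]


WIDTH = Fraction(7, 5)  # hypothetical stem width of 1.4 pixels after scaling

# The same two equal stems at different grid offsets, now under the center rule.
left_stem = pixel_center_rule(Fraction(1), Fraction(1) + WIDTH)
right_stem = pixel_center_rule(Fraction(33, 10), Fraction(33, 10) + WIDTH)

print(len(left_stem), len(right_stem))  # 1 2
```

Equal widths in font-space still turn into unequal widths in pixels; the irregularity stems from the quantization itself rather than from the particular rule.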

One point should be emphasized right now, in hopes that this may convince the remaining skeptics. In the above example, even the stems fail to reproduce regularly, although they are nothing but orthogonal rectangles. To digitize and fill such rectangles, neither special purpose digitizing nor filling algorithms are necessary. Rather, a primitive of the raster device could be used instead (cf. 2.2.2). Thus, whoever should doubt the correctness of any of the algorithms given in A.0 and A.1, or of either of the filling paradigms, is encouraged to comprehend now that there is a lot more behind it.

We attach a sequence of illustrations without further comments and invite the alert eye to identify further "raster tragedies" (a term found in [Karow]).

72 pt at 300 dpi

36 pt at 300 dpi

24 pt at 300 dpi

18 pt at 300 dpi

12 pt at 300 dpi



9 pt at 300 dpi

6 pt at 300 dpi

Apparently, we have stepped across the Nyquist limit long before we are anywhere near low-resolution font-scaling. The Nyquist limit gives a theoretical lower limit for the minimum sampling rate. It states that a signal can be sampled and reconstructed without loss or distortion if the sampling rate is at least twice the rate of the highest frequency in the original signal [Bigelow85].

Practically, this has the following consequences: The fonts designed at our institute for medium resolution (cf. [Meier91, Meier93] and 3.3) have a font height of 300 to 400 units in font space (cf. 4.0.0), corresponding to a type size of 72 to 96 pt at 300 dpi resolution. In this representation, the thinnest strokes are 3 or 4 pixels wide. Now, if we scale the fonts by a factor of only 1/4, corresponding to a type size of 18 to 24 pt at 300 dpi (!), we are already at the Nyquist limit. In fact, in the above illustration the 24 pt e has a "drop-out" - a missing pixel in what at our institute would be considered a headline type size at printer resolution. Therefore, we may expect naive font-scaling to be bound to fail already for plain text at printer resolution, let alone screen resolution.
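The figures in this paragraph can be checked with a short calculation. The sketch below takes 1 pt as 1/72 inch, as approximated in section 0.0, and reuses a one-dimensional pixel-center rule; the function names and the stroke positions are assumptions made for illustration.

```python
import math
from fractions import Fraction


def size_in_pixels(points, dpi):
    """Convert a type size in points to pixels, taking 1 pt as 1/72 inch."""
    return Fraction(points) * dpi / 72


def pixel_center_rule(left, right):
    """Indices of unit pixels whose center i + 1/2 lies inside the span [left, right)."""
    half = Fraction(1, 2)
    return [i for i in range(math.floor(left), math.ceil(right))
            if left <= i + half < right]


# 72 pt at 300 dpi corresponds to 300 pixels, so one font unit is one pixel there.
print(size_in_pixels(72, 300), size_in_pixels(18, 300))  # 300 75

# A stroke of 3 units, scaled by 1/4, is only 3/4 of a pixel wide.
stroke = 3 * Fraction(1, 4)

# Depending on its (hypothetical) position, it yields one pixel or none at all.
kept = pixel_center_rule(Fraction(1, 5), Fraction(1, 5) + stroke)
lost = pixel_center_rule(Fraction(3, 5), Fraction(3, 5) + stroke)
print(kept, lost)  # [0] []
```

The empty result in the second case is the one-dimensional analogue of a drop-out: a stroke narrower than one pixel may miss every pixel center.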

0.2 The scope of this thesis

The illustrations in the preceding section show clearly that the problem of medium and low resolution font-scaling discloses itself in a multitude of seemingly different facets. Conventionally, now, we could compile a catalogue of problems to be tackled. Subsequently, we would devise an individual recipe for each of these problems, specifying how to cure the respective symptom. Finally, we would assess the resulting approach to medium and low resolution font-scaling by its ability to meet our own demands.

It may be a striking argument for the people from the sales department to be able to outdo competitors with the length of their own catalogue of features. We are convinced, however, that most of them - not to say all of them - fall into one or more of the following four categories:

Regularity: Intelligent font-scaling provides invariance of translation (of stems, serifs, thicknesses of stroke, etc.), mirroring (of serifs, bowls, shoulders, etc.), and existence (of components, connections, features, etc.).

Near regularity: Within the bounds of quantization, intelligent font-scaling avoids disproportionate enlargement of small divergences from sternly regular parts and measures (such as optical corrections and reference line overlaps).

Constrained proportionality: Within the bounds of quantization and the constraints imposed by safeguarding components against disappearance, intelligent font-scaling preserves proportionality of parts and measures.

Digitized appearance: Last but not least, intelligent font-scaling allows influencing the inevitable patterns of jaggedness on the fringe of slanted and curved parts without at the same time allowing direct reference to individual pixels.

We understand that some of these terms may appear somewhat alien to the uninitiated reader, but we are convinced that they will become meaningful, at the latest by the end of Chapter 4.

With that, the scope of this thesis is to devise and implement an algorithm for scaling fonts on-the-fly, for medium and particularly low resolution raster devices, provided the generic representation of the fonts contains enough information to permit avoiding any of the problems of the above categories. Furthermore, it is our personal goal to be comprehensive as far as the employed algorithms are concerned, which comprises the complete scaling algorithm as well as the digitizing and rendering algorithms for scan-converting and filling straight lines and curves. In doing so we put our cards on the table in hopes that this may provide future approaches to both font-scaling in particular and two-dimensional graphics in general with a sound starting capital.

0.3 Guide to the reader

Even to the uninitiated reader it should have become clear by now that what we are about to tackle is indeed a problem. It is not simply an industrious but uninspired piece of work, or maybe a laborious engineer's task. In view of the multitude of mishaps that can happen, scattered over different levels of abstraction, and in accordance with the goals set out in the preceding section, the present thesis has grown to respectable dimensions. Therefore, here is a short guide to the reader:

Chapter 1 (The others: A brief overview of existing approaches) opens a bracket around the main part. It briefly introduces both academic and industrial approaches to font-scaling.

Chapter 2 (The foundations: An object-oriented graphical toolbox) gives the graphical basis of the project. Although today one may take versatile graphical capabilities for granted, the reader is invited to pay attention at least to sections 2.0 (Coordinates, pixels, and rectilinear regions) and 2.3 (Assessment of the new approach for a graphical toolbox).

Chapter 3 (The input: A simple outline-font editor) illustrates an application of the graphical toolbox in our own programming environment. Readers uninterested in its particulars are encouraged at least not to miss out section 3.2 (An experience with third order Bézier curves).

Chapter 4 (The backbone: A formalism for intelligent outline-fonts), particularly 4.4 (Implied dynamic regularization), constitutes the core of our approach. Of this material at most the sections 4.6 (Inter-character spacing & "wysiwyg" type-setting) and 4.7 (Formal notation & graphical feed-back) may be left out in a first reading.

Chapter 5 (The implementation: An extensible OOP style application), finally, illustrates the actual realization of the prototype. It may be a bit far away from the conceptual range of typography.

Chapter 6 (The balance: A competitive prototype) closes the bracket around the main part. It compares two popular industrial approaches with our new approach.

Appendix A (The algorithms) lists and describes all the algorithms we have used in our font-scaler, including many refinements to existing algorithms that may be of interest to the practitioner in raster graphics.

Appendices B, C, and D (The format of the unstructured outline-fonts, The definition of the language for intelligent outline-fonts, and A self-contained example of a font definition) are mostly for reference purposes.

We hope that this helps the reader not to lose track of things or just to restrict the reading to the passages containing the major novelties.



1 The Others: A Brief Overview of Existing Approaches

1.0 The MIT Project: A formal approach with components

The preceding chapter has conveyed a first glimpse at the extent of the low resolution font-scaling problem. It is the very scope of this thesis to contribute a solution to this problem, without introducing a completely new problem to begin with. But since it is not a new problem at all, there must be other approaches to it already. This is why the present chapter gives a brief overview of existing approaches. We are concerned to make clear that this overview is nowhere near exhaustive - being so might easily amount to an entire thesis of its own. Rather, we would like to present a selection of widely differing approaches, and we shall mention related approaches as appropriate.

For historical reasons, and not least because it pursues an approach with a certain similarity to ours, we mention the famous project launched at the Massachusetts Institute of Technology (MIT) some twenty years ago [Coueignoux75]. In the early seventies, memory must have been prohibitively expensive. Reportedly, the hardware available for the project comprised something like 64 kB main memory and 1 MB of disk space. (Incidentally, our first experiences with computer programming took place in 1981 on a computer with 16 kB main memory and a tape connector, later to be replaced by a 140 kB floppy disk. Today, we would not be surprised anymore to have 64 MB main memory and 1 GB of disk space, instead.) Problems regarding these constraints run throughout the whole thesis by Coueignoux.

However, this had a substantial benefit, namely to have to look for a compact font representation. One of the factors contributing to memory effectiveness is to store repeated occurrences of one and the same component only once, which at the same time eliminates the necessity to see to it that equal components be rendered equally: If there is only one instance of the stem or the serif, then all the characters built with that kind of stem or serif can refer to it.
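The idea of storing each component once and letting characters refer to it can be sketched as follows. This is not Coueignoux's data structure; the tables, the pixel sets, and the function names are hypothetical and serve only to show why shared components come out equal.

```python
# Each component is stored once; here it is a set of (x, y) pixels (hypothetical data).
COMPONENTS = {
    "stem": {(x, y) for x in range(2) for y in range(5)},
    "bar": {(x, 0) for x in range(4)},
}

# Characters only refer to components by name, together with an integer offset.
CHARACTERS = {
    "H": [("stem", (0, 0)), ("bar", (2, 2)), ("stem", (6, 0))],
    "I": [("stem", (0, 0))],
}


def place(name, offset):
    """Translate the single stored instance of a component by an integer offset."""
    dx, dy = offset
    return {(x + dx, y + dy) for x, y in COMPONENTS[name]}


def render(char):
    """Assemble a character from the components it refers to."""
    pixels = set()
    for name, offset in CHARACTERS[char]:
        pixels |= place(name, offset)
    return pixels


print(len(render("H")), len(render("I")))  # 24 10
```

Because both stems of the "H" and the stem of the "I" are translated copies of the same stored instance, they cannot differ from one another in their pixels.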

Times "HILFE", exploded view



The method according to which the components are put together is specified by means of quite a comprehensive grammar for roman printed fonts, which is why we have classified it as a formal approach.

The intuitive idea to decompose characters into their constituent glyphs (stems, serifs, etc.), if they are to retain a certain degree of regularity under scaling, is bound to fail in case the considered glyphs are not quite equal, but in a way "similar" to one another.

Times "ET", exploded view

The top arm of the "E" may look like the ones of the "T", but it is not quite
identical to them. Likewise, the bottom left serif of the "E" or the "T", when
mirrored at y := -x, is not the same as the one that is part of the nose of
the "E", yet they look as if they were related to one another. Coueignoux was
of course aware of this fact, which is why he parameterized the glyphs or
primitives, as he calls the two-dimensional building blocks. With that, one
may understand the terminal nodes (or leaves) of the grammar (or blue-prints)
to be modeled by a procedure (one for each character of the alphabet) that
makes use of the parameterized glyphs.

Still, said parametrization is not yet quite sufficient if the character
decomposition is to stay with two-dimensional primitives only, which can be
seen in the following situation.

Times "ceo"

As clearly as above, these characters have a certain affinity to one another,
such as their overall proportions or the thicknesses of stroke. In fact, in
the classification introduced in [Adams89], they appear in the same group.
Yet, to decompose them into equal glyphs gives rise to separation lines which
are not related to the respective characters in a natural way. In spite of
this drawback (which may explain why - according to [Karow89] - it never
became a success) we further pursued the path that decomposes characters into
their constituent parts. We shall see that this approach is not bound to fail
provided the decomposition does not stop at the level of two-dimensional
building blocks.

1.1 The ETH Project: An image-oriented approach

Formal or descriptive approaches have a general drawback, which is the
necessity to "program" fonts. The famous METAFONT [Knuth86] is a specimen of
this category. Whether it is the early version that strokes the skeleton of
characters using brushes of different sizes and orientations, or whether it is
a later version that allows characters to be defined also by outlines, in the
end each character is described by a program. This not only impedes bringing
the wealth of existing fonts to computers, but it is likely to make this
method unavailable to font design, since programming is anticipated to lie
outside the type designer's usual field of activity.

This is one of the reasons why Kohen has adopted quite a different line
[Kohen87, Kohen88]. In his system for medium resolution font design the fonts
are acquired by the use of a graphics editor that has been especially tailored
to the demands of character design (cf. also Chapter 3). After an initial
training phase, this editor was operated productively by professional type
designers (cf. e.g. [Meier91]). The font design system comprised an algorithm
for scaling fonts for medium resolution printers (300 dpi). Basically, it used
a variant of the error diffusion algorithm introduced by [FloydSteinberg75],
which is why we have classified it as an image-oriented approach.

A lot of finesse must have been put into the way error diffusion is done,
since the actual diffusion step is driven by the particular topology of the
glyph at hand (cf. also A.2.1). Reportedly, this accounts for aesthetically
pleasing digitized appearance, or local beauty, as Kohen calls it. Local
beauty regards questions like nicely rastered arcs or half-bitting applied to
specific parts of glyphs.

some Times 24 pt characters, hand-tuned for 300 dpi (enlarged)



Unfortunately, some of the vital parameters of the algorithm are not recorded
for posterity, which is why we could not easily base our own research on it.

There are other reasons why we did not pursue the same path any further,
though. Error diffusion algorithms are to a certain extent statistical in
nature. However, our goal includes font scaling at low resolution, where
characters consist of a few coarse pixels only, thus making statistical
approaches questionable. Furthermore, they are at least a decimal order of
magnitude slower than outline-oriented algorithms (cf. A.2.1 and 4.1.3). On
top of that, they cannot easily model global consistencies, especially in case
of curvilinear glyphs. Kohen was already aware of that and presented in
[KohenGutknecht89] a hybrid approach or, as he calls it, a combined algorithm.

1.2 Adobe Type 1 Fonts: An approach with hints

By the start of the present decade, two companies successfully marketed
commercial products that let personal computers scale fonts on demand down to
screen resolution [Adobe90, Apple90]. Adobe's Type 1 fonts emerged from a long
tradition in page description [Adobe85], while Apple's TrueType fonts
reportedly are the result of printer companies not wanting to pay Adobe
royalties, and constitute the answer to Adobe's initial unwillingness to
unveil the Type 1 Font Format [Fenton90]. Let us first look at Adobe's Type 1
fonts. (Type 3 fonts, PostScript's "user-defined fonts", are fonts with
neither hints as below nor any other "font-intelligence".)

A Type 1 font essentially is a program that uses a small subset of the
original PostScript language, together with a few powerful extensions. To
capture global information of a font or an entire family, a Type 1 font
comprises various dictionaries that list the reference lines (descender, base,
mean, cap, and ascender line, cf. 3.1.6, 3.3), together with their optical
corrections (cf. 4.4.2).

The base and cap line, together with their optical corrections




Adobe calls the dictionaries BlueValues, OtherBlues, FamilyBlues, and
FamilyOtherBlues, etc. In their terminology, family applies to our font scope,
in contrast to the variant scope (regarding scopes cf. 4.3.6), while "other"
specifies reference lines below the base line. The last distinction does not
appear necessary to us.

A Type 1 hint boils down to the fact that an entry may or may not be present
in the dictionary; if it is absent, a default value is assumed. The exact
behavior is well-documented and possibly can even be re-configured. Likewise,
the underlying algorithm may or may not be able to obey the hint; if it is
not, the hint is simply ignored. In particular, rather than having the
primitives for lines and curves only (lineto, curveto, etc.), Type 1 supports
so-called hint commands giving information about the ranges of a zone occupied
by a predominantly horizontal or vertical part. Such a part may be rectilinear
or curvilinear (cf. 4.4.1) and is what Type 1 calls a stem (hstem, vstem,
etc.). For example, a vstem3 specifies the horizontal ranges of 3 vertical
stem zones, which "is especially suited for controlling the stems and counters
of characters such as a lower case m" [Adobe90].

However, Type 1 hint commands do not amount to a decomposition into components
(i.e. horizontal stems or vertical stems etc.). Rather, the values specified
together with the respective commands are in effect for the entire character.
This means for the "E" below that either the serifs on the left and the
crossbars on the right are expected to have the same thickness of stroke
(which is called a hstem in this context), or the hints have to be (and can
be) changed amidst the definition of a character outline path.

A Times "E" with (left) and without (right) equal thicknesses of horizontal

stroke

Also, e.g. a sans-serif "I" (which in its simplest case we would understand to
be made of a single orthogonal rectangle) seems to require an assembly out of
a vertical stem with two horizontal so-called ghost stems at either end, in
order for the actual stem's horizontal alignment with the blue values to work
properly. This suggests to us a certain lack of flexibility of the approach,
even in so far as the shapes are letter-like. (In fact, Adobe recommends using
Type 3 fonts for other shapes.)

An approach similar to Adobe's Type 1 fonts is pursued by Agfa Compugraphic
with their Intellifont format. Intellifonts are used in PCL5, the page
description language that comes with Hewlett-Packard's laser printers, and are
therefore possibly much more widespread than widely known. The Intellifont
format starts out from the IKARUS format [Karow92c] and, as with Type 1 fonts,
adds extra information to the bare outlines, but in contrast to Adobe, Agfa
Compugraphic calls this extra information instructions.

Analogous to Type 1 fonts, Intellifonts distinguish global instructions that
are common to an entire font from local instructions that are specific to
individual characters. On the global side, we are not surprised to be
presented with reference lines and standard dimensions in x- and y-direction,
roughly the equivalent of Adobe's blue values, vstems, and hstems
respectively, and related to the goals targeted by TrueType's
control-value-table (cf. 1.3). On the local side, however, Intellifonts differ
from Type 1 fonts in that they emphasize by concept the role of local extremal
points and, in general, the role of points that delimit the thickness of
stroke. Agfa Compugraphic calls these points skeletal points and stores them
in addition to the other support- or control-points.

A Times "B" with skeletal points

In general, these skeletal points are scaled and rounded to the nearest grid
line. A tree-like structure, called association, is superimposed on these
skeletal points. This permits defining the stem of the above "B" to be scaled
and rounded before the serifs, or having the counterform's correct dimensions
(correct with respect to a given type size and device resolution) take
precedence over the correct overall width of the character at issue (cf.
4.0.0).




1.3 Apple/Microsoft TrueType Fonts: An approach with instructions

In contrast to Type 1 fonts, flexibility is most definitely not a weakness of
TrueType fonts, at least not for lack of it. On the contrary, the vast
flexibility of the TrueType approach may at best befuddle the uninitiated
designer of digital fonts. Actually, TrueType fonts are fully-grown assembly
language programs for a stack machine. Apart from basic arithmetical and
conditional instructions, they comprise special-purpose instructions for
explicitly fitting knots (on- or off-curve points, cf. 3.2.0sq) to grid lines
under subordination to a wealth of parameters and tables. These parameters and
tables are global to the interpreter and may therefore govern processes that
are global to an entire font or variant. They are collected in what TrueType
calls the graphics state.

The graphics state contains in particular the current values for the freedom
vector and the projection vector. These two vectors constrain the way knots
are moved upon grid-fitting: Motion takes place along the freedom vector with
the purpose to assume a distance that, simultaneously,

- is as close as possible to a given distance (measured parallel to the
  projection vector)
- comprises an integral multiple of the targeted grid unit (measured on the
  freedom vector).

To fit diagonal strokes to the grid, the projection vector is perpendicular to
the diagonal, while the freedom vector runs in parallel to the reference
lines.

The way in which said integral multiple is determined depends furthermore on
one of several round-states, which is part of the graphics state, too.

The special-purpose instructions that invoke such a motion are called MDAP,
MIAP, MDRP, and MIRP (Move Direct/Indirect Absolute/Relative Point). In this
context, the term absolute denotes an absolute distance, while relative
relates to a distance relative to a reference point (which may or may not have
been moved already). This helps to avoid, amongst others, unequally rastered
stem and crossbar widths (cf. 4.3.0). In the same context, the term direct
means that the distance is measured directly (parallel to the projection
vector), while indirect replaces the measured distance by the nearest one
found in the CVT (Control-Value-Table) if it is "sufficiently near" to it (the
latter being governed by the control-value-cut-in). With this replacement, all
kinds of distances (stem or crossbar widths, serif heights, etc.) can be
collected into groups of nearly equal distances in order to have them assume
the same values under sufficiently coarse scaling (cf. 4.4.2). Both the
control-value-table and the control-value-cut-in belong to the graphics state
as well. At the lower limit (cf. 4.4.7), the single-width-value and the
single-width-cut-in serve the same purpose. At least the last distinction does
not appear necessary to us.
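The effect of the indirect variants can be illustrated with a small sketch that follows the description above literally. The names `indirect_distance` and `to_pixels`, the table contents, and the cut-in value are assumptions chosen for illustration; in particular, the sketch searches the table for the nearest entry, whereas an actual instruction stream would name the table entry to be used.

```python
import math

def indirect_distance(measured, cvt, cut_in):
    """Replace a measured distance by the nearest control value,
    provided the two differ by no more than the cut-in."""
    nearest = min(cvt, key=lambda entry: abs(entry - measured))
    if abs(nearest - measured) <= cut_in:
        return nearest
    return measured

def to_pixels(distance):
    """Round a scaled distance to whole pixels (round half up)."""
    return math.floor(distance + 0.5)

# Scaled distances in pixel units; 1.4 and 1.6 are two "nearly equal" stems.
cvt = [1.5, 3.0]
stems = [1.4, 1.6, 2.2]

unfitted = [to_pixels(s) for s in stems]
fitted = [to_pixels(indirect_distance(s, cvt, cut_in=0.2)) for s in stems]
```

Rounded on their own, the two nearly equal stems come out 1 and 2 pixels wide; routed through the common control value they both become 2 pixels wide, while the third distance is too far from any table entry and is left alone.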

The rounding can respect a minimum distance, such as to prevent coarsely
scaled distances and simple glyphs like orthogonal rectangles from vanishing
(cf. 4.3.1). For less obvious cases, a more general approach can be activated,
called scan control mode, which analyses the rastered result for
discontinuities (cf. 4.5.2sq). Cases not covered by this most general drop-out
control mechanism can be cured by the possibility to specify exceptions,
called delta instructions, that in the end even permit patching individual
pixels for a specific type size and at a given device resolution.

The possibility to directly refer to individual pixels raises questions
regarding device independence, for the pixel size is a function of the
targeted resolution. Further features in TrueType fonts take us more and more
away from the conceptual range of a type designer. Taking the pick of the
bunch, measured magnitudes may have to be forced to assume non-negative values
(auto-flip), reference line overlaps and related optical corrections are
covered by twilight zones and points (cf. 4.4.2), while for such simple things
as side bearings TrueType introduces phantom points (cf. 4.6.1).

To recapitulate, in the TrueType approach much of the font-intelligence lies
in the way the instructions are used, which is in contrast to the Type 1
approach. Therefore, the TrueType approach may serve as a common and portable
basis for different ways in which the font-intelligence is provided. However,
even if we disregard the assembly language approach altogether, the conceptual
range adopted by the TrueType approach is on a purely geometric level of
abstraction, while the one by Type 1 is much more closely related to the world
of font-design. (The absence of typographical terms and notions applies
particularly to the F3 format by Sun Microsystems, which is why we shall not
go into the F3 format in more detail.) Reportedly, TrueType's flexibility has
to be paid for with higher production costs of a font [Karow92c], while Type
1's simplicity may make it better suited for automatic acquisition of hints
[Hersch91], such as the auto-hinting proposed in [Karow89, HerschBtrisey91b,
Btrisey93].

At this point it may be appropriate to give the prospects for our own
approach. Basically, we shall decompose a font into equal or symmetric
component parts. In contrast to [Coueignoux75], components are not restricted
to entire glyphs, but may be contours, knots, and even single numbers as well.
This permits defining for the components a hierarchical structure that
specifies how more complex components depend on simpler ones. To define the
structure, we shall adopt a conceptual range similar to that of Type 1 fonts
[Adobe90], but without concealing the implementation of the font engine. To
adapt the components to the grid for a given type size and device resolution,
we offer a flexibility close to that of TrueType fonts [Apple90], but
additionally we explain in full detail how this flexibility is best employed
for the grid adaptation. Finally, the particular implementation gives a means
of talking about digitized appearance or local beauty as suggested by
[Kohen88], but without forgoing the efficiency advantage of an
outline-oriented approach. In retrospect, we may therefore understand our own
approach to be a hybrid approach in the sense that it combines the strengths
of the approaches introduced in this section without at the same time
inheriting their weaknesses.



2 The Foundations:

An Object-Oriented Graphical Toolbox

2.0 Coordinates, pixels, and rectilinear regions

2.0.0 Vector displays

The primary source of problems with the quantization of continuous graphical
objects is the naivety with which beginners associate abstract mathematical
coordinates with visible technical pixels. Therefore we first present a couple
of our own experiences made with two inherently different ways of looking at
coordinates and pixels. The conclusions drawn at the end of this section are a
sound definition of coordinates and pixels, and the introduction of
rectilinear regions.

At the beginning of interactive computer graphics in the early 1960s, display
devices were usually some form of line drawing or vector displays
[NewmanSproull82]. On such devices, vertices of polygons to be displayed were
kept in so-called display files or lists that were traversed, in turn, to
direct the electron gun from one vertex to the next one. At the end of the
list, directing the electron gun would simply start over again, and so forth.

a straight line drawn on a vector display

2.0.1 Raster displays

In the late 1960s, display lists were eventually replaced by refresh buffers
containing - instead of vertices of polygons - a two-dimensional array of all
the (discrete) points that the electron beam can display as separate but
closely spaced tiny little dots [NewmanSproull82]. In such devices the
electron gun is directed across the screen in horizontal rows from dot to dot,
rather than being directed from vertex to vertex. For each dot that the
refresh buffer marks as a visible dot, the electron gun is turned on, and off
for invisible dots. These devices are essentially the raster displays that we
are still using today.

The obvious disadvantage of refresh buffers is that we cannot simply draw a
straight line by storing its end points in a display list. Instead, we have to
calculate and store in the refresh buffer a (discrete) choice of those dots
that approximate the (analog) line as closely as possible.

a straight line drawn on a raster device
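What "calculating a choice of dots" amounts to can be sketched in a few lines. The following is a plain digital differential analyzer in exact integer arithmetic, chosen here for brevity; it is an illustration under that assumption, not one of the line algorithms referred to elsewhere in this thesis, and the name `hair_line` is hypothetical.

```python
def hair_line(x0, y0, x1, y1):
    """Return the dots (integer grid points) that approximate the straight
    line from (x0, y0) to (x1, y1) as closely as possible, one dot per step
    along the longer axis."""
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]
    dots = []
    for i in range(steps + 1):
        # floor(i * d / steps + 1/2), computed without floating point
        x = x0 + (2 * i * dx + steps) // (2 * steps)
        y = y0 + (2 * i * dy + steps) // (2 * steps)
        dots.append((x, y))
    return dots

dots = hair_line(0, 0, 7, 3)
```

The end points themselves are always part of the result, which is exactly what makes joining several such lines into a polygon delicate.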

Of course, refresh buffers have obvious advantages as well. Now we can draw
not only straight lines, but also circles and other curves, simply by
calculating a different choice of dots. Another advantage of software
controlled refresh buffers is that they allow us to display solid
two-dimensional objects rather easily, simply by turning on adjacent (rows of)
dots, as is the case in one of the following illustrations. The graphical
primitives have become available to software, rather than being restricted to
hardware.

2.0.2 The grid-point model

To calculate a choice of dots means first of all to associate abstract
mathematical coordinates with these dots. Keeping in mind the way raster
devices access the refresh buffer, it may appear quite natural to simply
number consecutively the discrete geometrical points that the raster device
can address. Doing so all the way through the complete choice of dots,
row-by-row, and within each row, dot-by-dot, the numbering is equivalent to an
integer grid of horizontal and vertical coordinates. Each (geometrical) point
at which a horizontal and a vertical coordinate line intersect constitutes the
center of a (technical) dot.

Upon devising our first algorithms for drawing lines and curves we intuitively
understood the picture elements or pixels that make up the visible result on a
raster display to be the same as the dots just introduced. Since the dots by
themselves are approximations of the infinitesimal points in geometry, we
effectively associated the pixels with the points on the grid. This model to
define coordinates and pixels is usually called the grid-point model.

a straight line drawn in the grid-point model

Based on the grid-point model we implemented a set of algorithms for drawing
hair lines, circles, arbitrary ellipses and natural splines [Stamm89, Ch. 1].

However, if we now wanted to assemble a polygon out of a sequence of such hair
lines, we would run into the "end-point-paranoia", that is, the question
whether the last pixel of one line belongs to the next line already, or not.
This may sound like a nit-picker carefully putting the dots on the i's, since
the pixels are intended to model geometrical points without physical
extension. But in the grid-point model the points actually do have physical
extension. Therefore, if we inverted such a polygon (e.g. to implement
dragging, cf. 3.1.1), the respective pixels would be drawn twice, and thus
remain invisible.

a triangle (left) inverted with missing vertices (right) in the grid-point model
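The missing vertices are easy to reproduce. The sketch below inverts (toggles) every dot of every edge of a triangle; since each vertex is the last dot of one edge and the first dot of the next, it is toggled twice and ends up switched off. The helper `hair_line` is a simple stand-in line routine and `invert_polygon` a hypothetical name; neither is taken from the implementation discussed in this thesis.

```python
def hair_line(x0, y0, x1, y1):
    """Dots approximating the line from (x0, y0) to (x1, y1), end points included."""
    dx, dy = x1 - x0, y1 - y0
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x0, y0)]
    return [(x0 + (2 * i * dx + steps) // (2 * steps),
             y0 + (2 * i * dy + steps) // (2 * steps))
            for i in range(steps + 1)]

def invert_polygon(vertices):
    """Toggle the dots of all edges of a closed polygon; return the dots left on."""
    lit = set()
    for i, start in enumerate(vertices):
        end = vertices[(i + 1) % len(vertices)]
        for dot in hair_line(start[0], start[1], end[0], end[1]):
            lit ^= {dot}          # symmetric difference: on -> off, off -> on
    return lit

triangle = [(0, 0), (8, 0), (4, 6)]
lit = invert_polygon(triangle)
missing = [v for v in triangle if v not in lit]
```

For this triangle all three vertices are missing from the inverted outline, while the dots in the interior of each edge are switched on as expected.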

Likewise, if we used the calculated points not for drawing dots, but as
entries into a table for parity filling (cf. A.1.2sq), we may get two
identical or adjacent table entries at the same vertex. In the triangle below,
this causes the filling to leave most of the respective pixel span empty,
while having it leak out opposite to the vertex at issue. Conversely, if we
decided not to enter the triangle's vertices twice, we may get the pixel span
in the middle, but at the expense of missing vertices at the top and at the
bottom, as with the outlined triangle.




a triangle (left) and a polygon (right) filled with errors in the grid-point model

Even worse, in the polygon above, this causes another vertex to be the source
of a leak. The seemingly natural decision to simply number through all the
displayable dots thus forces the algorithms for calculating choices of dots
into a couple of rather unnatural detours.

2.0.3 The checker-board model

By the time we started rendering genuinely two-dimensional objects, we were
aware of the above problems with the grid-point model. Right from the very
outset, we therefore interpreted the role of pixels differently. Rather than
trying to hide the fact that raster displays cannot approximate geometrical
points any better than by tiny little dots, we accepted that pixels are small
but finite areas. Starting out from a grid of integer coordinates, the role of
a pixel is no longer that of a dot on a grid-point, but that of filling the
gap between adjacent pairs of horizontal and vertical grid lines. This model
to define coordinates and pixels is usually called the checker-board model.

a circle, outlined 3 pixels thick (left) and filled completely (right) in the

checker-board model



Rendering thick lines and curves in the checker-board model can use roughly
the same algorithms as drawing hair lines in the grid-point model. The main
difference regards the way pixels are determined at the edge of such objects.
Rather than calculating the dots that are as close to the analog edge as
possible, this time we have to calculate, on each row of pixels, the first and
the last pixel that the analog object covers by at least 50%.
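A minimal sketch of such a span calculation for a filled disc follows. It rests on a simplifying assumption that should be stated loudly: coverage is estimated along the horizontal centre line of each pixel row only, so that "covered by at least 50%" reduces to rounding the two analog edges to the nearest grid line. The name `disc_spans` is hypothetical.

```python
import math

def disc_spans(cx, cy, r):
    """For a disc of integer radius r centred on the grid point (cx, cy),
    return one (y, left, right) triple per pixel row: the row lies between
    the coordinates y and y + 1, its span between left and right."""
    spans = []
    for y in range(cy - r, cy + r):
        dy = y + 0.5 - cy                      # centre line of this pixel row
        half = math.sqrt(r * r - dy * dy)      # half chord length on that line
        left = math.floor(cx - half + 0.5)     # nearest grid line to the left edge
        right = math.floor(cx + half + 0.5)    # nearest grid line to the right edge
        if right > left:
            spans.append((y, left, right))
    return spans

spans = disc_spans(0, 0, 8)
widest = max(right - left for _, left, right in spans)
```

For a radius of 8 this yields 16 rows, the widest of which is 16 pixels: the digitized disc measures 2r pixels in either direction.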

Based on the checker-board model we implemented a set of algorithms for
drawing thick lines and curves [Stamm89, Ch. 3]. Due to the chosen model for
associating coordinates with pixels, and since the parameters that determine
the geometry of the lines and curves were integers, in none of these
algorithms did we have to face problems like missing vertices or pixel spans,
such as discussed above. It is therefore not surprising to us that this way of
looking at pixels has become the "de facto" industry standard [Hersch].
However, when it comes to outlining, the checker-board model may be a choice
that needs further explanation. We will come back to this right below.

2.0.4 A conflict of models

With the above foundations in two-dimensional graphics we started devising our
first outline-font editor. One of the editor's features was to let the type
designer view the characters both unfilled and filled, which is a vital asset
for a high-quality type design tool. The efficiency of our routines, together
with the computing power of the personal workstations of those days, would
permit manipulating even filled characters at interactive speeds.

However, at this point another case of the "plus-minus-one-paranoia" thwarted
our plans: the outlined characters appeared to be slightly bigger than their
filled versions.

the same circle of radius 8 (middle), outlined in the grid-point model (left),

and filled in the checker-board model (right)




The reason for this mishap is that we unconsciously mingled the grid-point
model with the checker-board model. The routines for unfilled outlines were
employing the grid-point model, while the routines for filled outlines were
adhering to the checker-board model. Using a simple circle as an example, the
consequences thereof are illustrated above.
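The discrepancy can also be reproduced numerically. The sketch below counts the pixel columns occupied by a circle of integer radius r centred on the origin, once outlined with dots on grid points and once filled with pixels between grid lines. Both routines are simplified stand-ins with hypothetical names, not the editor's actual routines; the filling criterion is evaluated on the centre line of each pixel row.

```python
import math

def outlined_columns_grid_point(r):
    """Width in pixels of a circle outlined with dots centred on grid points."""
    dots = set()
    for t in range(-r, r + 1):
        s = math.floor(math.sqrt(r * r - t * t) + 0.5)   # nearest dot to the circle
        dots |= {(t, s), (t, -s), (s, t), (-s, t)}
    xs = [x for x, _ in dots]
    return max(xs) - min(xs) + 1          # every dot occupies one full pixel

def filled_columns_checker_board(r):
    """Width in pixels of the same circle filled between grid lines."""
    lefts, rights = [], []
    for y in range(-r, r):
        half = math.sqrt(r * r - (y + 0.5) ** 2)
        lefts.append(math.floor(-half + 0.5))
        rights.append(math.floor(half + 0.5))
    return max(rights) - min(lefts)       # distance between two coordinates

outlined = outlined_columns_grid_point(8)
filled = filled_columns_checker_board(8)
```

For r = 8 the outlined circle occupies 17 columns and the filled one 16, which is the one-pixel "shrinking" observed when switching between the two views.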

In the grid-point model, the outlined circle has a diameter of 2r + 1 pixels,
hence its graphical bounding box exceeds the mathematically correct bounding
box. In the checker-board model, the filled circle has a correct 2r pixels,
and therefore also its bounding box is correct. For large radii, this may be
totally irrelevant, but in view of font-scaling being targeted at low
resolution, a single pixel is too large an area to be ignored. In fact, even
at the original resolution of the outline font characters, extending over
several hundred (!) pixels on screen, our type designer Hans Ed. Meier would
point out promptly that when switching the viewing attributes to "solid" (cf.
3.1.6) the characters seem to "shrink", which is unacceptable for a
high-quality type design tool.

By now we should have become aware of the fact that there are two different
ways to define coordinates, and that there is rationale for both of them:

In the grid-point model the coordinate points are associated with the pixels,
hence they extend over a small but finite range.

A 3 by 4 array in the grid-point model (left), the coordinate point [2, 1] (right)

The grid-point model may be the natural way to draw hair lines, that is, to
approximate truly one-dimensional objects but, besides implementational
difficulties, it does not yield mathematically consistent bounding boxes.

In the checker-board model the coordinate points are associated with the
intersections of the infinitely thin grid lines, constraining the small but
finite gaps filled up by the pixels.

The same array (left) and coordinate point (right) in the checker-board model



The checker-board model looks like the natural choice for truly
two-dimensional objects and thick lines, but drawing hair lines is not that
obvious in this model.

Unfortunately, the two models are not compatible with one another. There is no
one-to-one mapping that relates one model to the other, and vice versa.

At this point the rash reader might argue astutely that our problem is only
that of specifying the origin of a pixel, that is, whether it is at [1/2, 1/2]
or [0, 0]. However, this does not solve the bounding box problem, but merely
shifts the entire coordinate plane by a vector [-1/2, -1/2], which we could
accomplish just as well by dislocating our computer's monitor as a whole.

the same circle, outlined in the grid-point model, but with the pixel's origin
at [1/2, 1/2] (left), and at [0, 0] (right): Both circles exceed the
mathematically correct bounding box

This means that we have to make a choice for either model of the two, and that
once we have made our choice, we have to comply with our model systematically,
if not to say, painstakingly.

2.0.5 Conclusion 1: Coordinates vs. pixels

The reason why the grid-point model fails for solid objects is the intention
to describe mathematically precise (coordinate) points with imprecise
technical pixels. As long as we do not decouple the notion of a point from the
notion of a pixel, we are always tempted to associate with the coordinate
lines a clearance of half a pixel or more. Like this, any subsequent notions
of length, such as the basic dimensions of characters, stems, and serifs, or
optical corrections to stroke thicknesses and reference line overlaps, are
vague, if not to say, simply wrong at the quantization limit. Therefore, our
first conclusion is to choose the checker-board model, and the following
definitions separate coordinates from pixels:

A coordinate is an integer which denotes an infinitely thin horizontal or
vertical line in the Euclidean plane.

the coordinates x = 0, 1, ..., 7, and y = 0, 1, ..., 5

A point is the intersection of a horizontal with a vertical coordinate.

the point p = (x, y) = (2, 1)

A pixel spans a part of the Euclidean plane which is bordered by a pair of
adjacent vertical and horizontal coordinates.

the pixel a = [x, y, X, Y] = [2, 1, 3, 2]

An orthogonal rectangle spans a part of the Euclidean plane which is bordered
by a pair of vertical and horizontal but not necessarily adjacent coordinates.



the orthogonal rectangle a = [x, y, X, Y] = [2, 1, 6, 4]

This is the only way to define mathematically pure geometric entities! Looking

at the real world, our definition does not appear that academic at all. The first

popular graphical user interface uses the same definition in its toolbox, too

[Apple85, later Petzold92].

From the above definitions, the width w and the height h of an orthogonal
rectangle [x, y, X, Y] are straightforward, but most important.
Mathematically, they are

w := X - x and h := Y - y,

and graphically, they are w and h pixels respectively; the mathematical and
the graphical notions of length or distance correspond.
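That correspondence can be checked mechanically. The following sketch represents an orthogonal rectangle by its four coordinates and enumerates the pixels it spans; the class name `Rect` and the method `pixels` are hypothetical and serve illustration only.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    """Orthogonal rectangle [x, y, X, Y] bordered by integer coordinates."""
    x: int
    y: int
    X: int
    Y: int

    @property
    def width(self):
        return self.X - self.x

    @property
    def height(self):
        return self.Y - self.y

    def pixels(self):
        """Pixels spanned, each named by its lower coordinates (x, y)."""
        return [(i, j) for j in range(self.y, self.Y)
                       for i in range(self.x, self.X)]

a = Rect(2, 1, 6, 4)
single = Rect(2, 1, 3, 2)
```

The rectangle [2, 1, 6, 4] is 4 wide and 3 high and spans 4 times 3 pixels; the rectangle [2, 1, 3, 2] spans the single pixel between the coordinates 2 and 3, and 1 and 2.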

The orthogonal rectangle constitutes a generalization of pixels to arbitrary
integral width and height. Conversely, if

(X - x) = (Y - y) = 1,

then the orthogonal rectangle spans exactly one pixel. Consequently, the
orthogonal rectangle is a null rectangle, if

(x = X) ∨ (y = Y),

that is, its width or its height comprises not even a single pixel.

This may sound academic, but this very conception of distance and its
corresponding graphical appearance have proved most useful - at the latest for
section 4.4, where we talk about dynamic regularization (character components
which may vanish under coarse scaling) and drop-outs (components which must
not do so). Notice, finally, the superiority of the checker-board model: If we
were to define orthogonal rectangles using the grid-point model, then the
mathematical width X - x would span X - x + 1 pixels, which means that null
pixels would have to be defined by a negative mathematical length - an idea
from which we should disassociate ourselves clearly!

To consolidate our understanding of orthogonal rectangles, the following
definitions illustrate, in the order of appearance, the empty area, a single
coordinate point, the union of two bounding boxes, and the result of clipping
one area against another,

∅ := [∞, ∞, -∞, -∞]
(x, y) := [x, y, x, y]
a ∪ b := [Min(ax, bx), Min(ay, by), Max(aX, bX), Max(aY, bY)]
a ∩ b := [Max(ax, bx), Max(ay, by), Min(aX, bX), Min(aY, bY)]

given any point (x, y) and any two orthogonal rectangles a = [ax, ay, aX, aY]
and b = [bx, by, bX, bY].
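These four definitions translate almost literally into code. In the sketch below rectangles are plain tuples (x, y, X, Y); the function names are hypothetical, and the test `is_null` deliberately uses "less than or equal" so that the empty area and the result of clipping two disjoint rectangles count as null as well, which is an assumption going slightly beyond the equality stated above.

```python
import math

# The empty area: neutral element of the union.
EMPTY = (math.inf, math.inf, -math.inf, -math.inf)

def point(x, y):
    """A single coordinate point, seen as a rectangle without extension."""
    return (x, y, x, y)

def union(a, b):
    """Bounding box of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def clip(a, b):
    """Result of clipping rectangle a against rectangle b."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def is_null(r):
    """True if the rectangle comprises not even a single pixel (assumed <=)."""
    return r[2] <= r[0] or r[3] <= r[1]

a = (2, 1, 6, 4)
b = (4, 0, 9, 3)
box = union(point(2, 1), point(5, 3))
```

Starting from the empty area and forming the union with one point after the other is a convenient way to accumulate the bounding box of an outline; clipping two disjoint rectangles yields a null rectangle without any special cases.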

2.0.6 Conclusion 2: Rectilinear regions

For the objectives of both outlining and filling with the same (mathematically
correct) bounding box, we should be looking neither for the pixels that are
closest to some analog edge, nor for pixel spans whose first and last pixels
are covered by at least 50%. Rather, we should be looking for an entity
without physical extension. Therefore, our second conclusion is to calculate
instead the (digital) boundaries between the interior and the exterior pixels
that we would get as a result of filling genuinely two-dimensional objects
according to the checker-board model. We use the term rectilinear regions for
these boundaries and define them as follows:

A rectilinear region is the entity which represents the (digital) boundary ofa not necessarily simply connected set of orthogonal rectangles.

[Figure: the rectilinear region defining the (digitized) shape of a circle]



In this sense, rectilinear regions constitute a natural generalization of orthogonal rectangles to arbitrary, multiply connected, digitized shapes.

Starting out from its rectilinear region, defining a filled two-dimensional object is trivial by concept, while outlining it requires us to make a choice; the choice to turn on interior pixels that are immediately adjacent to the digital boundary, rather than exterior pixels.

[Figure: the rectilinear region defining the shape of a circle (middle), together with the outlined (left) and filled (right) region]

This is quite a natural choice, though, for turning on exterior pixels adjacent to the boundary would simply increase the effect of "shrinking" upon switching the viewing attributes to "solid".
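The choice of outline pixels can be made concrete with a small sketch. It is an illustration only, under the assumption that a filled shape is given as a set of pixel coordinates (px, py), each standing for the unit square [px, py, px + 1, py + 1] of the checker-board model; the function name `outline_pixels` is hypothetical and not part of the toolbox described here.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def outline_pixels(filled: Set[Pixel]) -> Set[Pixel]:
    """Return the interior pixels that touch the digital boundary.

    A pixel touches the boundary if at least one of its four edge
    neighbours lies outside the filled set. Exterior pixels are never
    turned on, so the outline has the same bounding box as the fill.
    """
    neighbours = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {
        (px, py)
        for (px, py) in filled
        if any((px + dx, py + dy) not in filled for dx, dy in neighbours)
    }
```

Since the outline is a subset of the filled pixels, switching from "outlined" to "solid" can only add pixels inside the shape, which is the property argued for above.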

It may be helpful to understand a rectilinear region as one or more closed polygons whose vertices are on grid points and whose edges are on grid lines. Subsequent vertices of such polygons differ from one another either in their x- or in their y-coordinate, i.e. by a so-called "Manhattan move", but not in both. If both directions were allowed to differ simultaneously, the missing coordinate point would have to be interpolated for subsequent outlining or filling.

[Figure: a simultaneous unit step in x- and y-direction (middle) is ambiguous: Without knowing the original shape, both interpolations (left and right) are correct]




This can be done in two different ways: an ambiguity which may result e.g. in a coarsely scaled serif being rendered too large.

The fact that the "Manhattan moves" are restricted to unit steps, i.e. moving "block" by "block", may be considered an implementational decision. In doing so, calculating the pixel boundaries can be done strictly incrementally, which may be simpler but less efficient (but cf. A.1.1). All in all, we have simply generalized "the conventional choice to make (integer) pixel boundaries intuitive" [NewmanSproull82].
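To illustrate what a rectilinear region made of unit "Manhattan moves" looks like as data, the sketch below derives the boundary edges from a set of filled pixels. This is not the incremental algorithm of this thesis (which produces the moves while digitizing a shape); it is a brute-force illustration under the assumption that pixel (px, py) covers the unit square [px, py, px + 1, py + 1]. The name `boundary_edges` and the counter-clockwise orientation are choices made here.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]
GridPoint = Tuple[int, int]
Edge = Tuple[GridPoint, GridPoint]


def boundary_edges(filled: Set[Pixel]) -> Set[Edge]:
    """Unit edges separating interior from exterior pixels.

    Each edge runs between two grid points that differ by exactly one
    unit in either x or y (a unit "Manhattan move"). Edges are oriented
    so that the interior lies to their left (counter-clockwise).
    """
    edges: Set[Edge] = set()
    for (px, py) in filled:
        if (px, py - 1) not in filled:  # bottom edge, heading right
            edges.add(((px, py), (px + 1, py)))
        if (px + 1, py) not in filled:  # right edge, heading up
            edges.add(((px + 1, py), (px + 1, py + 1)))
        if (px, py + 1) not in filled:  # top edge, heading left
            edges.add(((px + 1, py + 1), (px, py + 1)))
        if (px - 1, py) not in filled:  # left edge, heading down
            edges.add(((px, py + 1), (px, py)))
    return edges
```

For a solid block of w by h pixels this yields 2(w + h) unit edges, and no edge ever changes both coordinates at once, so no interpolation ambiguity can arise when the region is subsequently filled or outlined.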

2.0.7 Conclusion 3: Object-oriented programming

We hope that one particular idea follows from reading between the lines of the preceding sections. We did not make all the above considerations in the very same orderly way before implementing the graphics toolbox to be used throughout the rest of our project. It would be a downright lie if we claimed to have done so. Rather, we took certain things for granted, notably the two different models for defining coordinates and pixels, or the idea that speed is the primary concern of the practitioner in raster graphics, and hence that performance justifies next to any amount of code.

But eventually, the need for a major rewrite of the graphics library became apparent. For use by professional typographers, a font-editor has to fill characters without the effect of "shrinking the outlines". Although this is not that difficult to "fix", we would rather package the solution to this problem in such a way that it can be used equally by straight lines and all the present and future curves. Likewise, we would rather have future curves be concerned only with calculating (parts of) rectilinear regions, but not with the problem of how to fill or outline these regions correctly. Last but not least, we would rather not have to record all the "Manhattan moves" that make up a rectilinear region if this should be just for the sake of "interfacing" between finding pixel boundaries and filling or outlining these boundaries correctly.

What these problems have in common is the wish to better structure the respective computer program into re-usable and interchangeable component parts. Today, a popular way to do so is to use some form of object-oriented programming (OOP) style. Since we had already implemented all kinds of digitizing algorithms for solids, respecting the checker-board model and the pixel boundaries, and since we eventually knew how we should have done the outlining in the first place, given the pixel boundaries of all kinds of solids, our third conclusion is to give OOP style a chance to pay off.

Assuming that the interested reader may be familiar with all kinds of problems regarding digital typography, but not necessarily with their implementation in OOP style - let alone our notation to do so - the next section will give a brief introduction to OOP style, together with its notation, in hopes that it be generally intelligible. The next section is followed by a description of the particular factorization chosen, and a critical assessment of our approach.

2.1 Concepts of object-oriented programming and their notation

2.1.0 Encapsulation

Object-oriented programming is one of today's en vogue words, and OOP is its frequently traded abbreviation. While for some readers OOP may have become a bit hackneyed already, the somewhat more traditional programmers could still be afraid of missing the bus. The reality is probably somewhere in between these two extremes. OOP on its own is neither a giant leap forward, nor does it necessarily solve all the remaining problems of computer programming. Rather, we understand OOP as an evolutionary step beyond such well known paradigms as information hiding or structured programming, nevertheless coming in quite handy sometimes. The following account of this evolution should bring the reader, whom we assume to be familiar with the basics of a Pascal-like language, to a rough understanding of modern hybrid languages - traditional imperative languages with OOP extensions.

Computer programs are essentially recipes for processing machine-readable ingredients. These recipes are often termed algorithms, while the ingredients receive the common name data. The introduction of high-level programming languages (cf. also 4.1.1) started to bring computer programming closer to the conceptual range of the actual programmers. A language like FORTRAN (FORmula TRANslator) was targeted at scientists who wished to make iterated formulae evaluations available to computers, while a language like COBOL (COmmon Business Oriented Language) was geared to the book-keeper who wished to automate payrolls or lists of accounts receivable and suchlike.

Ever since the advent of the first high-level programming languages, efforts were made to increase the languages' expressiveness and to avert program errors due to the lack of the latter. The introduction of subroutines or procedures started to involve issues like existence or visibility of data. For the first time, data that are not relevant for the rest of the program could be hidden away from the latter by declaring them in a local scope. This may be the case with bound variables or intermediate results of lengthy computations.

Subsequently, the module concept provided not only for separate compilation, but also for the separation of the existence of data from their visibility. Now it was possible for data to exist without being visible to other modules or compilation units. But not being visible means not being accessible, whether intentionally or by accident. By hiding the data in the implementation part of the module, clients no longer had to know the details of the inter-dependence of data, while invariant properties of the data could be guaranteed to the clients by the capsule called module in Modula-2 or unit in UCSD-Pascal. Such invariants may be as simple as asserting that an index points to the next free slot of a table, or as complex as the correspondences between the accounts payable and the accounts receivable.

The notion of an abstract data type may be considered a preliminary pinnacle of this evolution. An abstract data type or ADT is a type whose internals are not unveiled to its clients directly, but indirectly only through a set of access procedures. With that, the advantages of encapsulation have been made available on the level of variables, rather than modules only, since now clients can own several instances of the same ADT. A simple example may be the implementation of a financial amount type as a very long integer with an extra byte indicating how many decimal digits thereof are actually decimal places; the invariant being that the number of decimal places is always non-negative, and hence the "dollar-and-cents" formatting can rely on this fact.
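The financial amount example might look as follows. This is a hedged sketch in Python rather than in one of the languages discussed in this chapter; the class name `Amount`, its fields and its `format` method are illustrative assumptions. Python enforces privacy by convention only (the leading underscore), whereas a module in Modula-2 or Oberon would hide the fields from clients altogether.

```python
class Amount:
    """Financial amount stored as an integer plus a count of decimal places."""

    def __init__(self, digits: int, places: int) -> None:
        # The invariant is established once, in the only place that may set it.
        if places < 0:
            raise ValueError("number of decimal places must be non-negative")
        self._digits = digits  # all decimal digits, e.g. 123456
        self._places = places  # how many of them are decimal places, e.g. 2

    def format(self) -> str:
        """'Dollar-and-cents' formatting, relying on the invariant places >= 0."""
        sign = "-" if self._digits < 0 else ""
        # Pad with leading zeros so that at least one digit precedes the point.
        text = str(abs(self._digits)).rjust(self._places + 1, "0")
        if self._places == 0:
            return sign + text
        return f"{sign}{text[:-self._places]}.{text[-self._places:]}"
```

Because clients can reach the representation only through the constructor and `format`, several instances can coexist and each of them is guaranteed to satisfy the invariant.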

Regarding the notation of encapsulation, we do not intend to make our own proposal, partly in view of the considerable number of already existing notations (Ada, C++, Modula-2, Oberon, UCSD-Pascal, ...), and partly because we would rather expect the reader to be experienced enough to see, throughout the rest of this thesis, which parts of the examples are to be exported for public use or are better restricted to private use.

2.1.1 Late binding

The preceding sub-section has focused on the ingredients of programs, bringing the restrictive constraints of asserting delicate invariants to considerable flexibility in the form of ADTs. But this is not the only path that leads to OOP. In the present sub-section, therefore, we shall look as well at the recipes for processing the ingredients, and how flexibility can be added there.

Quite early on, programming languages had to face the problem that only a part of the recipe may be known upon designing a computer program. Suppose we wanted to do a program for numeric integration, for which numerous algorithms exist. Usually, these algorithms abstract from the particular function to be integrated by a stereotyped phrase like "Let f(x) be an integrable function". In practice, this means that we have to commit ourselves to a particular function to be integrated already upon implementing the integrating procedure, rather than being allowed to implement the actual integration only and supply the functions to be integrated to the integrating procedure later, in the form of a parameter.
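What it means to supply the function "later, in the form of a parameter" can be sketched briefly. The example below is an illustration in Python, not taken from this thesis; the composite trapezoidal rule is merely one of the numerous integration algorithms alluded to, and the name `integrate` and the default of 1000 sub-intervals are assumptions.

```python
import math
from typing import Callable


def integrate(f: Callable[[float], float], a: float, b: float, n: int = 1000) -> float:
    """Approximate the integral of f over [a, b] by the composite trapezoidal rule.

    The procedure knows nothing about f except that it maps a number to
    a number; which function is integrated is decided by the caller.
    """
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# The same integrating procedure, bound to two different functions by its clients.
area_parabola = integrate(lambda x: x * x, 0.0, 1.0)
area_sine = integrate(math.sin, 0.0, math.pi)
```

The integrating procedure is written once, and each client decides at the call site which function to integrate.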

The designers of Algol-60 were certainly aware of this fact, which is why parameters could be passed to procedures by name (in contrast to by value), and therefore not only the names of variables, but as well those of functions. While this is a mathematically most satisfying solution, it is not always that simple to explain, let alone to realize in practice in a compiler. It is therefore understandable that in Pascal the general call-by-name parameter mechanism was downsized to the call-by-reference mechanism, but complemented by the possibility to supply functions and procedures as actual parameters. With that, the aforementioned demand - to decide later which function to integrate - could be met.

It is known that the appetite grows with the eating. Hence it is not surprising that soon the wish emerged to bind such procedural parameters to a module or ADT for longer than just the integration of some function. This can prove useful when it comes to dealing with exceptions, or for re-directing input and output operations. (Incidentally, an interrupt handler is a closely related situation.) In Modula-2 this demand is reflected by the powerful concept of procedural variables (in contrast to parameters), and therefore procedure types. With that, one can both bind the same algorithm to several ADTs and bind different algorithms to the same ADT in turn. In Oberon, finally, cosmetic changes were made to the syntax of procedure types, making up for their growing use.

As an example to explore the practical benefits of ADTs and procedure types, we might devise an ADT for the windows of a simple windowing system. We observe that all windows are essentially orthogonal rectangles, which may or may not overlap each other on screen. As a result of moving or resizing a window, some parts of some of the windows may become newly visible. Much like with the integration of functions, the algorithm to determine for which part of which window this applies is independent of the actual contents of the windows at issue. Hence the functionality to display a window's contents is an ideal candidate for installing a procedural variable into an ADT. Likewise, if a mouse is moved or one of its buttons has been pressed, or if a key on the keyboard has been pressed, a simple algorithm has to determine which of the windows on screen is supposed to respond to this event, if any at all, but this functionality is again independent of the nature of the particular window.

Therefore, in a first attempt we use a record, such as in Pascal, with some of the record fields actually being procedure types as newly introduced:

    Window = RECORD
      x, y, X, Y: INTEGER { considered "read-only" or not exported at all };
      displayContents: PROCEDURE (x, y, X, Y: INTEGER);
      handleMouse: PROCEDURE (buttons: Buttons; x, y: INTEGER);
      handleKeyboard: PROCEDURE (ch: CHAR);
    END;

The module that implements the windowing system manages a collection of such records, together with procedures for opening and closing windows, and for installing the procedure variables.
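How such a module can dispatch events without knowing what a window contains may be sketched as follows. The sketch is in Python rather than in the Pascal-like notation used here, and everything beyond the record fields shown above is an assumption made for illustration: the class `WindowSystem`, its methods `open`, `window_at` and `dispatch_mouse`, the back-to-front ordering of the window list, and the half-open containment test.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Window:
    # Orthogonal rectangle [x, y, X, Y] occupied by the window on screen.
    x: int
    y: int
    X: int
    Y: int
    # Procedure variables installed by the client (text editor, font editor, ...).
    display_contents: Callable[[int, int, int, int], None]
    handle_mouse: Callable[[int, int, int], None]
    handle_keyboard: Callable[[str], None]


class WindowSystem:
    """Manages a collection of windows, ordered back to front (an assumption)."""

    def __init__(self) -> None:
        self._windows: List[Window] = []

    def open(self, window: Window) -> None:
        """Open a window on top of all others."""
        self._windows.append(window)

    def window_at(self, px: int, py: int) -> Optional[Window]:
        """Front-most window containing the point (px, py), if any."""
        for w in reversed(self._windows):
            if w.x <= px < w.X and w.y <= py < w.Y:
                return w
        return None

    def dispatch_mouse(self, buttons: int, px: int, py: int) -> bool:
        """Forward a mouse event to the responsible window; False if none."""
        w = self.window_at(px, py)
        if w is None:
            return False
        w.handle_mouse(buttons, px, py)
        return True
```

The dispatching code refers to the windows' rectangles only; what happens in response to the event is decided by whichever procedure the client has installed.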

Later on, when we implement a text editor, we can define a procedure such

as the one below

    PROCEDURE DisplayContents(x, y, X, Y: INTEGER);
    BEGIN
      "display part of window's contents bordered by the rectangle [x, y, X, Y]"
    END { DisplayContents };

and use the relevant procedure from the windowing system to assign it to the window in which the text is being edited. If we do the same e.g. for a font editor, and so forth, we begin to see how the windowing system can assert invariant properties, such as to correctly reflect a given collection of windows on screen, even though only part of the recipes to do so are known when the
