TOWARDS AUTOMATIC OPTIMAL RENDERING
OF
THREE-DIMENSIONAL SYNTAX TREES
by
Harm Brouwer
A Bachelor Thesis
Submitted to the Faculty of Arts
of the
UNIVERSITY OF GRONINGEN
in partial fulfillment of the requirements for the
Degree of Bachelor
in
Information Science
June 2008
Thesis Advisors:
Dr. Leonie Bosveld
Dr. Mark de Vries
‘For a long time I limited myself to one color – as a form of discipline.’
PABLO PICASSO
Table of Contents
1. Introduction
2. Linguistic Background
  2.1. Parataxis
    2.1.1. Coordination
    2.1.2. Parenthesis
  2.2. Multidominance
    2.2.1. Multidominance for Movement
    2.2.2. Multidominance for Sharing without Movement
  2.3. Closing Words
3. Information Visualization Background
  3.1. Structural Analysis of the Modified Syntax Tree
  3.2. Conclusions of the Structural Analysis
    3.2.1. Three-dimensional Syntax Trees
    3.2.2. Multidominance in Syntax Trees
    3.2.3. Closing words
  3.3. Design Principles of Three-dimensional Syntax Trees
  3.4. Visual Clutter
    3.4.1. Parametrical Variation
    3.4.2. Design Principle Violation Measurement
4. Visual Optimality: Clutter and Optimality Theory
  4.1. Optimality Theory
5. Syntactic Structure: Building Operations
6. Core Mechanism: Generation and Evaluation
  6.1. Design of the post-derivational OT-procedure
    6.1.1. Syntax Parser
    6.1.2. Generator
      6.1.2.1. Parameter: Angle of Projection
      6.1.2.2. Parameter: z-axis Spacing
      6.1.2.3. Parameter: y-axis Spacing
      6.1.2.4. Parameter: x-axis Spacing
    6.1.3. Set of Constraints
    6.1.4. Evaluator
      6.1.4.1. Metric: Node Occlusion
      6.1.4.2. Metric: Crossing Lines
      6.1.4.3. Metric: Constant Height Difference
      6.1.4.4. Metric: Equal Sister Height
      6.1.4.5. Metric: Constant Depth Difference
      6.1.4.6. Metric: Binary Mother Placement
      6.1.4.7. Metric: Equal Sister Line Length
      6.1.4.8. Metric: Minimal Projection Angle
    6.1.5. Extraction of the Optimal Display
  6.2. Overview of the OT-procedure
7. Conclusions
8. Further work
Acknowledgements
References
Appendix A: Structural Analysis
  I. General Structure
  II. Detailed Information about Items and Links
  III. Potential Movement
Appendix B: Formal Design Principles
1. Introduction
In linguistics, tree diagrams are a well-known and commonly accepted represen-
tation of the constituent analysis of sentences. These tree diagrams are also
known as syntax trees in the context of generative grammar. A syntax tree
graphically consists of a set of nodes and branches and is ordered, binary and
two-dimensional (see Figure 1). The vertical dimension expresses succession in
the sense of substitution, i.e., the combination of two constituents into a larger
constituent (e.g., Verb and NP into VP). The horizontal dimension expresses the
relatedness of constituents within a larger constituent (e.g., Verb selects the NP
[the ball]). In linguistic terms, the vertical dimension, i.e., the set of branches,
represents relations of dominance or subordination. The horizontal dimension
represents precedence. Both subordination and precedence are asymmetrical. A
syntax tree is the immediate constituent analysis of a sentence if the data repre-
sented is a string of words representing a natural language sentence.
Figure 1. Constituent Analysis of 'The man hit the ball'.
(Chomsky, 1957: 27)
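As a minimal sketch of the two relations described above, the constituent analysis of Figure 1 can be written as a nested structure in which nesting encodes dominance and left-to-right order encodes precedence. The tuple encoding and the helper names `leaves` and `depth` are illustrative assumptions, and the category labels follow the usual rendering of Chomsky's diagram rather than anything specified in this thesis.

```python
# A constituent is (label, child, child, ...); a leaf is (label, word).
# Nesting encodes dominance, left-to-right order encodes precedence.
tree = ("Sentence",
        ("NP", ("T", "the"), ("N", "man")),
        ("VP", ("Verb", "hit"),
               ("NP", ("T", "the"), ("N", "ball"))))


def is_leaf(node):
    """A leaf carries exactly one child, and that child is a word."""
    return len(node) == 2 and isinstance(node[1], str)


def leaves(node):
    """Return the words in precedence (left-to-right) order."""
    if is_leaf(node):
        return [node[1]]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words


def depth(node):
    """Return the number of dominance levels, counting this node."""
    if is_leaf(node):
        return 1
    return 1 + max(depth(child) for child in node[1:])
```

Reading off the leaves recovers the sentence, and the depth counts the levels along the vertical dimension.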
Relations of subordination are hypotactic relations. Since Chomsky
(1957), research in generative grammar has mainly focused on hypotactic
phenomena. These phenomena fit quite well into traditional syntax. As a
result, constituent analysis could be visualized with a syntax tree as
described above. However, besides hypotactic phenomena there exist paratactic
phenomena, like coordination and parenthesis. There is disagreement among
linguists on how to analyze these paratactic constructions within traditional
syntax. In addition, coordination can give rise to so-called ‘right node raising’
constructions in which conjuncts share constituent or non-constituent parts of a
sentence. As with parataxis, there is disagreement on how to account for these
constructions. Since McCawley (1982), there has been a growing interest in
parataxis (e.g., De Vries, 2005a; Goodall, 1987; Grootveld, 1994; Kluck, 2007).
This increase led to the development of theories that attempt to account for the
problems stated above. The theory of interest for this thesis assumes the
existence of a third asymmetrical relation besides subordination and precedence (De
Vries, 2005b; De Vries, 2007a; De Vries, 2007b). This assumed third asymmetri-
cal relation is non-subordination or ‘behindance’. In addition, it assumes the in-
volvement of multidominance in constituent or non-constituent ‘right node rais-
ing’ constructions.
These assumptions give rise to two problems for constituent analysis rep-
resentation with a traditional syntax tree. As described above, the traditional
syntax tree is two-dimensional. The assumption of a third asymmetrical relation
might imply a graphical third dimension. In addition, traditional syntax trees
are ordered and binary, and disallow substitution of a set of nodes by a single
node at a lower level. This type of substitution is exactly what constitutes multi-
dominance constructions. More concretely, the traditional syntax tree needs to
be modified in order to facilitate the representation of a third asymmetrical rela-
tion and multidominance constructions.
Although there are multiple ways to modify the traditional syntax tree in
order to represent non-subordination and multidominance, this thesis will focus
on one specific approach. This approach is an extension of the traditional syntax
tree with an additional dimension, i.e., ‘depth’, to accommodate the non-
subordination relation. Within this modified tree diagram, multidominance still
involves substitution of a set of nodes by a single node at a lower level. Elements
of the substituted set of nodes can now reside on different levels of the dimen-
sion representing non-subordination.
The proposed modifications of the traditional syntax tree give rise to an
increase in representational complexity. A representation can be interactive or
non-interactive. In this thesis, the focus is on the latter, since this is a property
of printable media like scientific articles and books. From a mathematical per-
spective, these media restrict images to two-dimensional planes, e.g., a page in a
book. Since the modified syntax tree consists of three dimensions, the points of
this three-dimensional graphical image need to be mapped onto two dimensions.
This mapping is called a three-dimensional projection. Such a projection is pa-
rametric, e.g., the angle of projection can vary.
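To make the idea of a parametric projection concrete, the sketch below maps a point (x, y, z) onto the page with an oblique projection, in which the depth axis is drawn at a chosen angle and foreshortened by a scale factor. The choice of an oblique projection, the function name `project` and the parameters `angle_deg` and `depth_scale` are assumptions made for illustration; the projection actually used in this thesis is described in §6.

```python
import math


def project(point, angle_deg=45.0, depth_scale=0.5):
    """Map a 3D point onto a 2D plane with an oblique projection.

    angle_deg   -- angle at which the depth (z) axis is drawn (assumed name)
    depth_scale -- foreshortening factor for the depth axis (assumed name)
    """
    x, y, z = point
    angle = math.radians(angle_deg)
    return (x + depth_scale * z * math.cos(angle),
            y + depth_scale * z * math.sin(angle))
```

Points with z = 0 keep their coordinates, while points further back are shifted along the direction given by the angle, so varying the angle yields a different two-dimensional image of the same tree.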
Both the addition of a graphical third dimension and the allowance of
multidominance may render the image confusing as a result of overlapping nodes
or crossing lines. This confusion is also known by the term clutter. Lloyd (2005)
describes techniques for clutter measurement. The fact that a three-dimensional
projection is parametric and that each unique configuration of parametrical
values can lead to a different amount of clutter suggests that there is an optimal
parametrical configuration. A parametrical configuration is optimal if the clutter
in the resulting projection is minimal.
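The relation between parametrical configuration and clutter can be illustrated with a small sketch that uses the number of crossing lines as its only clutter measure and the projection angle as its only parameter. The oblique projection, the toy coordinates and all function names are assumptions for illustration, not the metrics or parameters defined later in this thesis.

```python
import math
from itertools import combinations


def project(point, angle_deg, depth_scale=0.5):
    """Oblique projection of a 3D point (an assumed projection type)."""
    x, y, z = point
    angle = math.radians(angle_deg)
    return (x + depth_scale * z * math.cos(angle),
            y + depth_scale * z * math.sin(angle))


def orient(a, b, c):
    """Signed area test: positive if c lies to the left of the line a->b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(s, t):
    """True if two segments properly cross; shared endpoints do not count."""
    (a, b), (c, d) = s, t
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def clutter(nodes, edges, angle_deg):
    """Count crossing lines in the projection for a given angle."""
    pts = {name: project(p, angle_deg) for name, p in nodes.items()}
    segs = [(pts[u], pts[v]) for u, v in edges]
    return sum(segments_cross(s, t) for s, t in combinations(segs, 2))


def best_angle(nodes, edges, angles):
    """Return the candidate angle with the least clutter."""
    return min(angles, key=lambda a: clutter(nodes, edges, a))


# Toy data: one branch in the front plane, one in a plane further back.
nodes = {"A": (0, 0, 0), "B": (2, 2, 0), "C": (0, 2, 3), "D": (2, 0, 3)}
edges = [("A", "B"), ("C", "D")]
print(best_angle(nodes, edges, [0, 45, 90]))  # prints 45
```

In this toy case the two branches cross at 0 and 90 degrees but not at 45 degrees, so 45 is the optimal configuration among the three candidates.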
In contrast to representing a traditional two-dimensional syntax tree,
representing an optimal projection of a three-dimensional syntax tree is a com-
plex and time-consuming process. However, the possibility to measure and re-
duce clutter based on variation in parametrical configuration suggests that the
optimal configuration can be computed. A mechanism that facilitates this com-
putation is a linguistic model from the field of phonology adapted for images.
This linguistic model is Optimality Theory. An Optimality Theory procedure
provides a mechanism for the computation of each possible projection based on
parametrical variation. Each of these projections will be evaluated with respect to
its amount of clutter. The projection with the least amount of clutter represents
the optimal projection and thus the optimal parametrical configuration. Hence,
the research question of this thesis can be defined as in (1).
(1) How can one automatically compute and render an optimal three-
dimensional syntax tree diagram for non-interactive printable media?
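The generate-and-evaluate idea behind this question can be sketched as follows. In Optimality Theory, constraints are strictly ranked, so a candidate that does better on a higher-ranked constraint wins regardless of lower-ranked ones; this corresponds to a lexicographic comparison of violation counts. The candidate values, the three constraints and their ranking below are invented for illustration only and are not the constraint set or ranking proposed in this thesis.

```python
def evaluate(candidates, ranked_constraints):
    """Return the candidate with the best violation profile.

    Each constraint maps a candidate to a violation count. Profiles are
    compared lexicographically, highest-ranked constraint first, which
    models strict domination.
    """
    def profile(candidate):
        return tuple(constraint(candidate) for constraint in ranked_constraints)
    return min(candidates, key=profile)


# Hypothetical candidate projections with precomputed clutter measurements.
candidates = [
    {"angle": 15, "occluded_nodes": 0, "crossing_lines": 2},
    {"angle": 30, "occluded_nodes": 0, "crossing_lines": 1},
    {"angle": 45, "occluded_nodes": 0, "crossing_lines": 1},
    {"angle": 60, "occluded_nodes": 1, "crossing_lines": 0},
]

# Hypothetical ranking: occlusion >> crossing lines >> small angle.
ranking = [
    lambda c: c["occluded_nodes"],
    lambda c: c["crossing_lines"],
    lambda c: c["angle"],
]

winner = evaluate(candidates, ranking)
print(winner["angle"])  # prints 30
```

The candidate at 60 degrees has no crossing lines, yet it loses because it violates the higher-ranked occlusion constraint; the tie between 30 and 45 degrees is resolved by the lowest-ranked constraint.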
The structure of this thesis will be as follows. In §2, I will provide a lin-
guistic background for theories that address the analysis of paratactic and mul-
tidominance constructions. These theories require a modification of traditional
syntax and consequently of traditional syntax trees. The modification of tradi-
tional syntax trees will be placed in a diagrammatic theoretical perspective in
§3. This section will furthermore provide a theoretical background for clutter,
parametrical variation among projections and clutter measurement within a pro-
jection. In §4, I will propose a mechanism to compute an optimal projection,
based on parametrical variation and clutter measurement, in terms of an Opti-
mality Theory (OT) procedure. The input of this OT-procedure will be a sequence
of syntactic structure building operations which I will describe in §5. A detailed
overview of the workings of the OT-procedure will be provided in §6, in which I
will also elucidate the parameters and metrics for clutter measurement. Finally,
conclusions will be provided in §7 and ideas and suggestions for further work in
§8.
2. Linguistic Background
Recently, the linguistic literature has shown a growing interest in paratactic
phenomena, like coordination and parenthesis. Contrary to the case of hypotactic
constructions, there is no consensus within the literature on how to analyze
paratactic constructions with traditional syntax. This disagreement among lin-
guists led to the development of different theories that attempt to accommodate
paratactic constructions. The theory of interest for this thesis assumes an exten-
sion of traditional syntax with a third asymmetrical relation besides subordina-
tion and precedence.
In this section, I will describe linguistic theories on two paratactic phe-
nomena: coordination and parenthesis, and in addition, two types of multidomi-
nance constructions: multidominance for movement and multidominance for
sharing without movement. Two types of sharing without movement can be de-
fined: sharing in combination with coordination and sharing in combination with
parenthesis, i.e., multidominance for sharing without movement relates to para-
taxis.
2.1. Parataxis
Parataxis involves the arrangement of clauses or phrases in a non-subordination
relation. It is therefore the opposite of hypotaxis, which involves the arrange-
ment of clauses or phrases in a subordination relation.
Below, I will discuss two paratactic constructions: coordination and paren-
thesis, which both give rise to analytical difficulties within traditional syntax.
Furthermore, I will discuss theories that involve parallelism to account for these
difficulties. Parallelism implies modification of traditional syntax and conse-
quently of traditional syntax trees.
2.1.1. Coordination
Coordination is one of the paratactic constructions that give rise to
analytical problems within traditional syntax. Sentence (2a) gives an example of
coordination.
(2) a. [Jane and Jack and John] walk home.
b. [Jane [and Jack [and John]]]
(3) a. [The man with the dog with the broken leg] walks home.
b. [The man [with the dog [with the broken leg]]]
Within traditional syntax, the coordination between brackets in (2a) can be
analyzed as in (2b) (e.g., Johannessen, 1998). In sentence (3a) an example of
subordination is given. This can be analyzed as in (3b). The analysis of coordina-
tion (2b) and the analysis of subordination (3b) show no structural differences.
The hierarchy of prepositional phrases in (3b) is correct. However, the suggested
hierarchy of conjuncts in (2b) is not. Conjuncts should be in a parallel non-
subordinate relation instead of in a subordinate relation (e.g., De Vries, 2005a;
Goodall, 1987; Grootveld, 1994; Van Riemsdijk, 1998).
However, there is disagreement on how to accommodate parallelism in tra-
ditional syntax. The idea of ‘behindance’ has been discussed by several authors
(e.g., Goodall, 1987; Grootveld, 1994; Van Riemsdijk, 1998). Nevertheless, the
exact accommodation of their ‘behindance’ relations in traditional syntax differs.
The theory of interest for this thesis is that of De Vries (2005a), who treats ‘be-
hindance’ as firmly rooted in a binary branching Minimalist type of syntax
(Chomsky, 1995).
This theory of coordination extends traditional syntax with an additional
structural relation, i.e., it makes syntax three-dimensional.¹ This implies that
the immediate constituent analysis of a sentence with this modified syntax can-
not be represented with a traditional syntax tree. The traditional syntax tree
needs to be extended with an additional dimension to accommodate the relation
of ‘behindance’.
2.1.2. Parenthesis
Parenthesis constitutes another paratactic construction that gives rise to ana-
lytical problems within traditional syntax. In a study of the syntactic represen-
tation of disjunct constituents, Espinal (1991) argues that the syntactic struc-
ture of a sentence containing disjunctive grammatical sequences may be repre-
sented as a number of semi-independent structures, i.e., the host structure and
the structure(s) of the disjunct constituent(s). Sentence (4) illustrates this.
(4) Apparently Chris, who is my neighbor, drank my coffee.
In sentence (4), ‘who is my neighbor’ is not a normal embedded subclause,
because it constitutes a secondary proposition. Evidence for this is that the ad-
verb ‘apparently’ has no scope over the parenthesis ‘who is my neighbor’. In
other words, it is apparent that Chris drank my coffee, but it is not specifically
apparent that Chris is my neighbor, i.e., the two propositions can be independ-
ently defined as in (5a) and (5b).
(5) a. Apparently Chris drank my coffee.
b. Chris is my neighbor.
¹ The term ‘three-dimensional syntax’ is metaphoric and is not to be confused with a
three-dimensional syntax tree, i.e., it refers to the three asymmetrical relations
‘precedence’, ‘dominance’ and ‘behindance’ as dimensions.
Espinal (1991) argues that an analysis in which the meaning of a disjunct
is related to the meaning of its host structure requires a structural relation
other than dominance and precedence.
De Vries (2007b) also states that parentheses and similar constructions
show structural independence in a certain sense. Despite this
semi-independence, they are syntactically and linearly integrated within their host
sentences. To account for these contradictory properties of parenthesis from a
Minimalist perspective on syntax (Chomsky, 1995), De Vries (2007b) proposes,
among other concepts, a ‘behindance’ relation, i.e., making
traditional syntax three-dimensional. As with the theory of coordination, this
again implies that constituent analysis of a sentence with this modified syntax
cannot be represented with a traditional syntax tree. The traditional syntax tree
needs to be extended with an additional dimension to accommodate the relation
of ‘behindance’.
2.2. Multidominance
In a multidominant relation, a constituent or non-constituent part of a sentence
is dominated by two or more different constituents. I will discuss two
syntactic phenomena for which there exist theories that involve such multidominance
constructions: multidominance for movement and multidominance for sharing
without movement.
Furthermore, I will discuss why these constructions give rise to represen-
tational problems within both a traditional two-dimensional syntax tree and a
three-dimensional syntax tree.
2.2.1. Multidominance for Movement
Displacement is one of the syntactic phenomena for which theories involving
multidominance constructions have been proposed. The idea of displacement is
that a word or phrase can be related to a position in a sentence where it does not
surface (De Vries, 2007a). An example of a syntactic construction which gives
rise to displacement is so-called wh-movement. This is illustrated in sentence (6)
(De Vries, 2007a: 1).
(6) a. This talented girl should purchase a new violin.
b. Which violin should this talented girl purchase _ ?
In (6a) ‘a new violin’ occupies the regular direct object position, whereas in
(6b) ‘which violin’ relates to this regular direct object position, marked by an un-
derscore, but does not surface there.
Several theories have been proposed to account for this relation. On one ac-
count, a trace is left in the regular direct object position (Chomsky, 1981). An-
other approach is to place a copy in the regular direct object position (Chomsky,
1995). However, the multidominance approach to displacement differs signifi-
cantly from theories that involve copies or traces.
In a multidominance approach to movement (e.g., Gärtner, 2002), a dis-
placed syntactic object is dominated by two or more different constituents, i.e., it
is shared. This inherently leads to a representation problem, i.e., the traditional
ordered, binary and two-dimensional syntax tree disallows substitutions of mul-
tiple nodes by a single node at a lower level.
An intuitive solution to this problem is to allow these substitutions in or-
der to accommodate multidominance constructions. This allowance might give
rise to a new visualization problem. Substituting multiple nodes by a single node
at a lower level can graphically lead to crossing lines. In other words, it might
induce confusion or clutter in the representation.
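As a rough sketch of what allowing such substitutions means for the underlying data structure, a syntax tree with multidominance can be stored as a list of mother-daughter links in which a daughter may occur more than once. The node labels below are a simplified, hypothetical analysis of (6b) chosen for illustration, and the function names are assumptions; the sketch only shows how shared nodes can be detected.

```python
from collections import defaultdict

# Mother-daughter links for a simplified, hypothetical analysis of (6b).
# 'which violin' is linked both to its surface position and to the
# regular direct object position, i.e., it is shared.
edges = [
    ("CP", "which violin"), ("CP", "C'"),
    ("C'", "should"), ("C'", "TP"),
    ("TP", "this talented girl"), ("TP", "VP"),
    ("VP", "purchase"), ("VP", "which violin"),
]


def mothers(edges):
    """Map every daughter to the list of nodes that dominate it directly."""
    result = defaultdict(list)
    for mother, daughter in edges:
        result[daughter].append(mother)
    return dict(result)


def multidominated(edges):
    """Return the nodes that have more than one mother."""
    return {d: ms for d, ms in mothers(edges).items() if len(ms) > 1}


print(multidominated(edges))  # prints {'which violin': ['CP', 'VP']}
```

Because the shared node has two mothers, the structure is no longer a tree in the graph-theoretic sense, which is why drawing it can produce lines that cross other branches.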
2.2.2. Multidominance for Sharing without Movement
Besides sharing with traditional movement, two types of sharing without tradi-
tional movement can be defined. These are sharing in combination with coordi-
nation and sharing in combination with parenthesis.
Sharing in combination with coordination can lead to constructions called
Right Node Raising (RNR) or Backward Conjunction Reduction and Across-The-
Board (ATB) movement. The sentences in (8) illustrate RNR and the sentences in
(9) illustrate ATB.
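(8) (De Vries, 2005b: 6) [Dutch]
a. Joop bewondert _, maar Jaap verafschuwt Balkenende.
   ‘Joop admires _, but Jaap detests Balkenende.’
b. Joop heeft een boek _ en Jaap heeft een CD gekocht.
   ‘Joop has bought a book and Jaap has bought a CD.’
c. Ik dacht dat Joop _, maar jij dacht dat Jaap een boek had gekocht.
   ‘I thought that Joop _, but you thought that Jaap had bought a book.’
d. Joop wilde onkruid trekken _, maar Jaap wilde liever zonnebaden in de tuin.
   ‘Joop wanted to pull weeds _, but Jaap preferred to sunbathe in the garden.’
(9) (De Vries, 2005b: 14) [Dutch]
a. Wie [[sloot de deur] en [verliet het gebouw]]?
   ‘Who closed the door and left the building?’
b. Wat heeft [[Joop gekocht] en [Jaap verkocht]]?
   ‘What did Joop buy and Jaap sell?’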
In the RNR example sentences in (8), the underlined constituents are im-
plied in both conjuncts (De Vries, 2005b; Kluck, 2007; McCawley, 1982;
Sampson, 1975; Van Riemsdijk, 2006). In the literature, there are various views
on how to explain RNR. McCawley (1982), among others, proposes an analysis of
RNR in which the relevant constituent is dominated by multiple constituents,
i.e., it is structurally shared. In other words, McCawley and others propose a
multidominance approach to RNR.
In the ATB example sentences in (9), the left-peripheral wh-constituent is
shared by the two conjuncts (Citko, 2005; De Vries, 2007a). Citko (2005)
addresses this problem in terms of parallel merging. De Vries (2007a) proposes a
similar solution in terms of external remerging.
Both RNR and ATB are dependent on coordination. In §2.1.1, I have shown
that there exist theories on the syntax of coordination that assume an extension
of traditional syntax with a ‘behindance’ relation. This makes syntax
three-dimensional. From this perspective, we can state that RNR and ATB imply
three-dimensional syntactic structures that involve multidominance.
Sharing in combination with parenthesis can lead to syntactic amalgams
(Guimarães, 2004; Lakoff, 1974; Van Riemsdijk, 1998). Sentence (10a) illustrates
a syntactic amalgam.
(10) (Guimarães, 2004: 1)
a. Homer drank I don’t remember how many beers at the party.
b. I don’t remember how many beers Homer drank at the party.
Guimarães (2004) concludes that a syntactic amalgam like (10a) is a par-
ticular kind of paratactic-like construction. That is, we can distinguish the host
sentence ‘Homer drank beers at the party’ and the parenthesis ‘I don’t remember
how many’. At first glance, the host sentence and the parenthesis seem to be un-
related. However, if we transform sentence (10a) into (10b), it becomes clear that
there is a relation between the meaning of the host sentence and the parenthe-
sis. This relation can be explained in terms of multidominance.
In §2.1.2, I have shown that there exist theories on the syntax of parenthesis
that assume an extension of traditional syntax with a ‘behindance’ relation.
From this perspective, we can state that syntactic amalgams imply
three-dimensional syntactic structures that involve multidominance.
Visualizing the multidominance approach to sharing without traditional
movement leads to a problem analogous to that of visualizing the
multidominance approach to movement. That is, if we allow the substitution of
multiple nodes by a single node at a lower level in a traditional syntax tree,
this can graphically lead to the crossing of lines. In other
words, this might induce confusion or clutter in the representation.
2.3. Closing Words
In the linguistic literature, the increased interest in paratactic constructions
like coordination and parenthesis has led to the development of theories that
assume an extension of traditional syntax in the form of an additional structural
relation, i.e., making syntax three-dimensional. This structural relation is
‘behindance’ or non-subordination.
The addition of this structural relation implies that immediate constituent
analysis with a traditional syntax tree is problematic. A traditional syntax tree
is two-dimensional, i.e., there is no way to graphically represent the ‘behindance’
relation. A modification of the traditional syntax tree is required to accommo-
date this relation. This modification might be the extension of the two-
dimensional syntax tree with a graphical third dimension, i.e., making the syn-
tax tree also three-dimensional.
Furthermore, there exist phenomena like RNR, ATB and syntactic amal-
gams that are dependent on paratactic structures. Certain theories on the analy-
sis of these phenomena introduce the concept of multidominance. The represen-
tation of multidominance constructions might lead to visualization problems in
two-dimensional as well as three-dimensional syntax trees, i.e., it may induce
confusion or clutter due to crossing lines. The same problem arises with theories
that involve multidominance in the analysis of movement.
3. Information Visualization Background
A traditional syntax tree is a hierarchical tree or hierarchy. Hierarchies are a
type of schematic diagram or spatial diagram (Novick & Hurley, 2001). This
classification suggests that a traditional syntax tree is based upon conventions.
Novick & Hurley (2001) conducted a structural analysis of three types of sche-
matic diagrams: matrices, networks and hierarchies (see Figure 2). More con-
cretely, they defined ten properties on which these spatial diagrams are hy-
pothesized to differ.
Figure 2. Three types of spatial diagrams (panels: matrix, network, hierarchy).
Reconstructed from Novick & Hurley (2001).
Analogous to Novick & Hurley (2001), I provide a structural analysis of the
modified syntax tree based upon these properties. I will use this analysis to
place the three-dimensional syntax tree with the possibility of multidominance
in a diagram theoretic perspective.
Furthermore, in this section I will provide a definition of clutter and
discuss parametric variation and clutter measurement. I will use this definition
and these methodologies to place optimality of a three-dimensional projection
into a perspective of information visualization theory.
3.1. Structural Analysis of the Modified Syntax Tree
Novick & Hurley (2001) define ten properties on which matrices, networks and
hierarchies are hypothesized to differ. They organize these ten properties into
three groups: ‘general structure’, ‘detailed information about items and links’ and
‘potential movement’. These groups respectively define the foundations of a dia-
gram, details on the linking of data within a diagram and movement from one
data point to another within a diagram.
In Appendix A, I provide a detailed structural analysis of the modified
syntax tree in terms of these three groups of properties. I do this from both a
graphical and a syntactic perspective, since these may differ in compatibility with
certain properties. The traditional syntax tree is modified in two ways. The first
modification is an additional dimension. The second modification is the
allowance of multidominance constructions. I address the characteristics of these
modifications separately, because they too may differ in compatibility with
certain properties.
Below, I will provide the conclusions of this detailed structural analysis,
i.e., I will provide an overview of the properties of three-dimensional syntax
trees and syntax trees that allow multidominance.
3.2. Conclusions of the Structural Analysis
From both a syntactic and a graphical perspective, the properties of
three-dimensional syntax trees and the properties of two-dimensional and
three-dimensional syntax trees that allow multidominance show partial compatibility
with both hierarchies and networks. However, none of these trees can be validly
classified as a pure network or as a pure hierarchy (see Table 1).
In order to provide a pure definition for each of the syntax trees, I will de-
rive a set of properties for each of the trees, based on the detailed structural
analysis as provided in Appendix A. More concretely, I will derive a set of prop-
erties for three-dimensional syntax trees (see §3.2.1) and for syntax trees that
allow multidominance (see §3.2.2).
3.2.1. Three-dimensional Syntax Trees
The general structure of a spatial diagram can be defined by four properties:
‘global structure’, ‘building block’, ‘number of sets’ and ‘item/link constraints’
(Novick & Hurley, 2001) (see Appendix A; Table I).
Global structure. From both a syntactic and a graphical perspective, the
global structure of a three-dimensional syntax tree is organized into levels, be-
ginning with a single root node (usually located at the top or right) that
branches out to subsequent levels such that the identities of the nodes at one
level depend on the identities of the nodes at a preceding level.
STRUCTURAL ANALYSIS OF A MODIFIED SYNTAX TREE
General structure
3D 2D [MDom] 3D [MDom] Property Perspective
M N H M N H M N H
Syntactic X X X X Global structure
Graphical X X X X
Syntactic X X X X Building block
Graphical X X X
Syntactic X X X Number of Sets
Graphical X X X
Syntactic Item/Link Constraints
Graphical X
Detailed Information about Items and Links
3D 2D [MDom] 3D [MDom] Property Perspective
M N H M N H M N H
Syntactic X X X Item Distinguishability
Graphical X X X
Syntactic X X X X X X Link Type
Graphical X X X X X X
Syntactic X Absence of a Relation
Graphical X X X
Potential Movement
3D 2D [MDom] 3D [MDom] Property Perspective
M N H M N H M N H
Syntactic X X X Linking Relations
Graphical
Syntactic X X X X X X Existence of Paths
Graphical X X X X X X
Syntactic X X X Traversing the Representation
Graphical X X X
Table 1. Schematic compatibility overview of the structural analysis of the modified syn-
tax tree (M = Matrix, N = Network and H = Hierarchy). See Appendix A for a detailed de-
scription of the properties.
Building block. From a syntactic perspective, the building block of a three-
dimensional syntax tree is a single node that gives rise to at least two other
nodes, or at least two nodes that are narrowed down to a single node, but not
both (i.e., three nodes and two directional links connecting them, arranged as a
‘V’ in some orientation).
However, graphically, it is possible to have a single node that gives rise to
only a single node, i.e., from a graphical perspective, the building block of a
three-dimensional syntax tree is a single node that gives rise to at least a single
other node (see Appendix A for details).
Number of sets. From both a syntactic and a graphical perspective, the
number of sets in a three-dimensional syntax tree is irrelevant, i.e., the repre-
sentation does not naturally suggest that the nodes are arranged into a particu-
lar number or configuration of groups.
Item/link constraints. From a graphical perspective, the item/link
constraints of a three-dimensional syntax tree state that there may be no
(direct) links between nodes at the same level or between nodes in non-adjacent
levels.
Graphically, parallel nodes are on a different ‘depth’ level. However,
syntactically, parallel nodes are on the same level. Consequently, from a syntactic
perspective, the item/link constraints state that there may be no (direct)
links between nodes at the same level or between nodes in non-adjacent levels,
except for nodes in a non-subordinate relation.
The detailed information about items and links in a spatial diagram can be
defined by three properties: ‘item distinguishability’, ‘link type’ and ‘absence of a
relation’ (Novick & Hurley, 2001) (see Appendix A; Table II).
Item distinguishability. From both a graphical and a syntactic perspective,
the item distinguishability in a three-dimensional syntax tree comprises that
nodes at a given level have identical status, but the nodes at different levels dif-
fer in status.
Link type. From both a graphical and a syntactic perspective, the link type
in a three-dimensional syntax tree is a directional link such that processing
flows from one end of the representation to the other.
Absence of a relation. From a graphical and a syntactic perspective, the ab-
sence of a relation in a three-dimensional syntax tree is indicated implicitly due
to constraints on which nodes may be linked (see above for ‘item/link con-
straints’ in three-dimensional syntax trees), but it must be computed for non-
linked nodes in adjacent levels.
The potential movement in a spatial diagram can be defined by three
properties: ‘linking relations’, ‘existence of paths’ and ‘traversing the representation’
(Novick & Hurley, 2001) (see Appendix A; Table III).
Linking relations. From a syntactic perspective, the linking relations in a
three-dimensional syntax tree comprise either a single line that enters and mul-
tiple lines that leave each node (i.e., all depicted relations are one-to-many) or
multiple lines that enter and a single line that leaves each node (i.e., all
depicted relations are many-to-one), but not both. Graphically, however, the
linking relations in a three-dimensional syntax tree comprise a single line that
enters and at least a single line that leaves each node (i.e., all depicted
relations are at least one-to-one).
Existence of paths. From both a syntactic and a graphical perspective, the
existence of paths in a three-dimensional syntax tree comprises paths connecting
subsets of (more than two) nodes.
Traversing the representation. Finally, both graphically and syntactically,
the traversal of a three-dimensional syntax tree comprises that for any pair of
nodes, A and B, there is only one path to get from one to the other (i.e., closed
loops are not allowed).
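The traversal property can be checked mechanically. The following is a minimal sketch, assuming the tree is given as a list of (mother, daughter) links that are read as undirected; the function name `has_unique_paths` and the node labels are hypothetical and not part of the analysis in Appendix A. It relies on the fact that a connected graph has exactly one path between every pair of nodes precisely when it has one link fewer than it has nodes.

```python
from collections import defaultdict

def has_unique_paths(links):
    """Return True if every pair of nodes is joined by exactly one path.

    `links` is a list of (mother, daughter) pairs, read as undirected links.
    """
    neighbours = defaultdict(set)
    for mother, daughter in links:
        neighbours[mother].add(daughter)
        neighbours[daughter].add(mother)
    nodes = set(neighbours)
    if not nodes:
        return True
    # A loop-free connected graph has exactly len(nodes) - 1 distinct links.
    distinct_links = {frozenset(link) for link in links}
    if len(distinct_links) != len(nodes) - 1:
        return False
    # Check connectivity with a depth-first search from an arbitrary node.
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for other in neighbours[stack.pop()]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen == nodes

tree = [("Z", "X"), ("Z", "Y")]
loop = [("A", "B"), ("B", "C"), ("C", "A")]
```

In this sketch `has_unique_paths(tree)` holds, whereas `loop` fails because its three links form a closed loop.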
3.2.2. Multidominance in Syntax Trees
This thesis focuses on three-dimensional syntax trees. However, both modified
two-dimensional and three-dimensional syntax trees can accommodate multidominance
constructions, provided that these constructions are allowed. Since these trees
are almost identical in terms of the structural analysis based on the properties
of Novick & Hurley (2001) (see Table 1), I will discuss them simultaneously.
The general structure of a spatial diagram can be defined by four properties:
‘global structure’, ‘building block’, ‘number of sets’ and ‘item/link constraints’
(Novick & Hurley, 2001) (see Appendix A; Table I).
Global structure. From a syntactic and a graphical perspective, the global
structure of both two-dimensional and three-dimensional syntax trees that allow
multidominance, is compatible with networks, i.e., the representation does not
have any predefined formal structure.
Building block. From a syntactic and a graphical perspective, the building
block of syntax trees that allow multidominance, consists of two nodes and a di-
rectional link between them.
Number of sets. From a syntactic and a graphical perspective, the number
of sets in syntax trees that allow multidominance, is irrelevant, i.e., the repre-
sentations do not naturally suggest that the nodes are arranged into a particular
number or configuration of groups.
Item/link constraints. Both graphically and syntactically, the item/link
constraints in syntax trees that allow multidominance, comprise that except for
sisters, any node may be linked to any other node.
The detailed information about items and links in a spatial diagram can be
defined by three properties: ‘item distinguishability’, ‘link type’ and ‘absence of a
relation’ (Novick & Hurley, 2001) (see Appendix A; Table II).
Item distinguishability. From a syntactic and a graphical perspective, the
item distinguishability in two-dimensional and three-dimensional syntax trees
that allow multidominance, comprises that nodes at a given level have identical
status, but the nodes at different levels differ in status.
Link type. Both graphically and syntactically, the link type of the links in
syntax trees that allow multidominance comprises that links between nodes are
directional such that processing flows from one end of the representation to the
other.
Absence of a relation. From a syntactic and a graphical perspective, the ab-
sence of a relation in a two-dimensional syntax tree that allows multidominance,
is indicated implicitly due to constraints on which nodes may be linked, but it
must be computed for non-linked nodes in adjacent levels. The same conclusion
holds for three-dimensional syntax trees that allow multidominance, but see the
‘item/link constraints’ property in §3.2.1 for a discussion on the syntactic
interpretation of ‘level’ in three-dimensional syntax trees.
The potential movement in a spatial diagram can be defined by three
properties: ‘linking relations’, ‘existence of paths’ and ‘traversing the representation’
(Novick & Hurley, 2001) (see Appendix A; Table III).
Linking relations. From a syntactic perspective, the linking relations in
two-dimensional and three-dimensional syntax trees that allow multidominance,
comprise that any number of lines can enter and leave each node. Thus both one-
to-many and many-to-one (i.e., many-to-many) relations can be represented
simultaneously. Graphically, however, the linking relations in both trees comprise
a single line that enters and at least a single line that leaves each node (i.e., all
depicted relations are at least one-to-one).
Existence of paths. Both graphically and syntactically, the existence of
paths in syntax trees that allow multidominance, comprises paths connecting
subsets of (more than two) nodes.
Traversing the representation. Finally, from a syntactic and a graphical
perspective, the traversal of syntax trees that allow multidominance, comprises
that for any pair of nodes, A and B, there is only one path to get from one to the
other (i.e., closed loops are not allowed).
3.2.3. Closing Words
This thesis focuses on the representation of three-dimensional syntax trees that
allow multidominance. These trees are hybrid, i.e., they share properties from
hierarchies and networks.
First, with respect to the general structure, the global structure and build-
ing block of this type of tree are solely compatible with networks, whereas the
number of sets in the tree is solely compatible with hierarchies. In contrast, the
item/link constraints are incompatible with both.
Second, with respect to the detailed information about items and links, the
item distinguishability is only compatible with hierarchies, the link type is com-
patible with hierarchies and networks and the absence of relations in the tree is
only graphically compatible with hierarchies.
Finally, with respect to the potential movement, the linking relations are
compatible with networks from a syntactic perspective only. The existence of
paths is compatible with networks and hierarchies, and the traversal of a repre-
sentation is solely compatible with hierarchies.
3.3. Design Principles of Three-dimensional Syntax Trees
In the previous sections, I have described the diagrammatic properties of three-
dimensional syntax trees and of syntax trees that allow multidominance. These
properties were derived from a detailed structural analysis of these trees (see
Appendix A). This detailed structural analysis was based on a study by Novick &
Hurley (2001), who defined ten properties on which three types of spatial dia-
grams (i.e., matrices, networks and hierarchies) were hypothesized to differ. The
analysis focused on the general structure, the details about items and links and
the potential traversal of three-dimensional syntax trees and syntax trees that
allow multidominance.
However, these properties do not specify any design principles with respect
to the graphical realization of these trees. For three-dimensional syntax trees
that allow multidominance, these design principles comprise the placement of
nodes, the layout of the text labels that constitute these nodes, the layout of the
lines that connect these nodes and the angle of projection (see Table 2).
DESIGN PRINCIPLES
Nodes
Identifier                     Formal def.       Design principle
dp_all-visible                 iii, viii         All nodes should be visible.
dp_constant-height-diff        ii, iv            Height differences between mother and daughter node(s) should be constant.
dp_constant-depth-diff         ii, v             Depth differences between parallel nodes should be constant.
dp_equal-sister-height         ii, iv            Sister nodes should be placed at an equal height, except when they are sisters in a multidominance construction.
dp_binary-mother               ii, vi, vii       Mother nodes should be placed right above their daughter or in the exact middle of their daughters.
Text labels
Identifier                     Formal def.       Design principle
dp_no-text-overlap             iii, viii         Text labels should not overlap other text labels.
Lines
Identifier                     Formal def.       Design principle
dp_equal-sister-line-length    ii, iv, vi, vii   Lines of sister nodes should have an equal length, except when they are sisters in a multidominance construction.
dp_no-crossing-lines           n/a               There should be no crossing lines.
Angle of Projection
Identifier                     Formal def.       Design principle
dp_minimal-angle               n/a               The angle of projection should be minimal, i.e., zero degrees for two-dimensional syntax trees.
Table 2. Design principles concerning nodes, text labels, lines and the angle of projec-
tion. See Appendix B for the formal definitions.
The design principles for three-dimensional syntax trees that allow multi-
dominance are based on the representational conventions for traditional syntax
trees. In the linguistic literature, there is consensus on these conventions which
comprise ideas like the placement of sisters at an equal height and the place-
ment of a mother right above her daughter or in the exact middle of her daugh-
ters. The idea of design principles is to cover these conventions for traditional
syntax trees within a minimal number of principles. These design principles
should then be adjusted and extended for three-dimensional syntax trees that
allow multidominance. In other words, we have a set of design principles that
fully covers the layout of a three-dimensional syntax tree that allows multidomi-
nance. These design principles comprise the foundation of an optimal tree dia-
gram, i.e., if all the design principles are fully realized in a tree diagram, it is
optimal.
The ‘dp_all-visible’ principle prevents nodes from being invisible and the
‘dp_no-text-overlap’ principle prevents the text labels of these nodes from overlapping. In
other words, the ‘dp_no-text-overlap’ principle implies the ‘dp_all-visible’ princi-
ple. Therefore, we will treat these design principles as a single principle.
The ‘dp_constant-height-diff’ principle specifies a spacing between mother
and daughter node(s). At first glance, this principle might seem to show full
overlap with the ‘dp_equal-sister-height’ principle, which forces sister nodes to
be placed at an equal height. This overlap suggests that violation of one of these
principles implies violation of the other, and thus a double violation penalty.
However, if sisters are not placed at an equal height, i.e., if they violate the
‘dp_equal-sister-height’ principle, this does not mean that both of the sisters
also violate the ‘dp_constant-height-diff’ principle: it is possible that only
one of the sisters violates it. In other words, these principles show an
interaction that results in partial overlap, rather than full overlap.
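A small worked example may make this partial overlap concrete. The sketch below assumes that heights are plain numbers and that the expected height difference between mother and daughter is 1; the helper names are hypothetical, and the code is an illustration rather than the formalization given in Appendix B.

```python
EXPECTED_DIFF = 1.0  # assumed constant height difference between mother and daughter

def violates_constant_height_diff(mother_y, daughter_y):
    """True if this daughter is not exactly EXPECTED_DIFF below her mother."""
    return (mother_y - daughter_y) != EXPECTED_DIFF

def violates_equal_sister_height(sister_ys):
    """True if the sisters are not all placed at the same height."""
    return len(set(sister_ys)) > 1

mother_y = 3.0
sisters = {"X": 2.0, "Y": 1.5}  # X is placed correctly, Y is placed too low

sister_violation = violates_equal_sister_height(sisters.values())
height_violations = {name: violates_constant_height_diff(mother_y, y)
                     for name, y in sisters.items()}
```

Here the sisters jointly violate ‘dp_equal-sister-height’, while only Y violates ‘dp_constant-height-diff’, so the two principles do not always incur a double penalty for both sisters.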
This is also relevant for the ‘dp_binary-mother’ principle and the
‘dp_equal-sister-line-length’ principle. The ‘dp_binary-mother’ principle
requires mother nodes to be placed directly above their daughter or in the
exact middle of their daughters. The ‘dp_equal-sister-line-length’ principle
requires the lines from sister nodes to their mother node to be of equal
length. At first glance, these principles also seem to show full overlap, i.e.,
violation of the ‘dp_binary-mother’ principle seems to imply violation of the
‘dp_equal-sister-line-length’ principle. However, if one of two sisters
violates the ‘dp_constant-height-diff’ principle, a situation can arise in
which the ‘dp_binary-mother’ principle is violated but the
‘dp_equal-sister-line-length’ principle is not. In other words, these
principles show an interaction that results in partial overlap, rather than
full overlap.
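The situation described here can be illustrated with concrete coordinates. The following sketch assumes two-dimensional (x, y) positions with the mother at the origin; the two check functions are hypothetical simplifications of the principles, not the formal definitions of Appendix B.

```python
import math

def mother_is_centred(mother, daughters):
    """dp_binary-mother (sketch): mother's x equals the mean x of her daughters."""
    mean_x = sum(x for x, _ in daughters) / len(daughters)
    return math.isclose(mother[0], mean_x)

def sister_lines_equal(mother, daughters):
    """dp_equal-sister-line-length (sketch): all mother-daughter lines are equally long."""
    lengths = [math.dist(mother, d) for d in daughters]
    return all(math.isclose(lengths[0], length) for length in lengths)

mother = (0.0, 0.0)
daughters = [(-3.0, -4.0), (4.0, -3.0)]  # the second daughter sits one unit too high
```

Both lines have length 5, so the line-length check passes, yet the mean x-position of the daughters is 0.5 rather than 0, so the mother is not in the exact middle. The unequal height of the second daughter is what makes this combination possible.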
The ‘dp_constant-depth-diff’ principle specifies a spacing between parallel
nodes. The ‘dp_no-crossing-lines’ principle prohibits lines from crossing each
other. Finally, the ‘dp_minimal-angle’ principle requires the angle of projection
to be minimal, i.e., zero degrees for a two-dimensional syntax tree and as small
as possible for a three-dimensional syntax tree.
The degree to which these design principles are realized in a graphical
representation determines the amount of visual clutter (see §3.4), and a technique
is required to obtain visual optimality (see §4). In order to implement these
design principles in an automatic mechanism, they need to be formalized (see Appendix B).
3.4. Visual Clutter
Information visualization is the use of computer-supported, interactive, visual
representations of abstract data to amplify cognition (Card, Mackinlay & Shnei-
derman, 1999). The effectiveness of information visualization largely depends on
the ease and accuracy with which users can access the information (Lloyd, 2005).
Visual clutter in a display can degrade this effectiveness.
Lloyd (2005) states that the definition of clutter remains vague in informa-
tion visualization, i.e., the concept of clutter is clear, but it is not explicitly de-
fined in a commonly accepted way. For instance, some definitions apply to spe-
cific graphical representations and others rely too much on subjective judgment.
Clutter causes confusion in one way or another. The way in which clutter
causes confusion may vary from one graphical representation to another, while
the concept of clutter remains the same. Lloyd proposes a general definition of
clutter, see (11). This is the definition I will use throughout this thesis.
(11) CLUTTER def.
Clutter is a state of confusion that degrades both the accuracy and ease of
interpretation of information displays.
(Lloyd, 2005: p. 14)
In her thesis, Lloyd focuses on three types of clutter: ‘density’, ‘outliers’ and
‘occlusion’. Two of these are relevant for three-dimensional syntax trees that al-
low multidominance: ‘density’ and ‘occlusion’, respectively defined as in (12) and
(13). There will be no clutter of the type ‘outliers’, since there are no data points
that significantly vary from the majority of all data points, i.e., because the data
points do not represent quantities.
(12) DENSITY def.
The number of objects present relative to the amount of display space
available.
(Lloyd, 2005: p. 21)
(13) OCCLUSION def.
Objects that either overlap other objects or obstruct other objects from
view.
(Lloyd, 2005: p. 21)
These two types of clutter, ‘density’ and ‘occlusion’, correspond to certain
design principles. The ‘dp_all-visible’, ‘dp_no-text-overlap’ and
‘dp_no-crossing-lines’ principles typically avoid clutter in the sense of
‘occlusion’, whereas the ‘dp_constant-height-diff’ and ‘dp_constant-depth-diff’
principles typically avoid clutter in the sense of both ‘occlusion’ and ‘density’.
However, the ‘dp_equal-sister-height’, ‘dp_binary-mother’, ‘dp_equal-sister-
line-length’ and ‘dp_minimal-angle’ principles cannot be properly explained in
terms of clutter in the sense of ‘occlusion’ or ‘density’. In addition to these two
types of clutter, there is a PREFERENCE FOR SYMMETRY to explain these princi-
ples. This preference for symmetry emphasizes that symmetry in a syntax tree
increases its accuracy and ease of interpretation.
The preference for symmetry corresponds to the ‘dp_equal-sister-height’
principle, i.e., the placement of sisters at an equal height. In addition, it corresponds
to the ‘dp_binary-mother’ principle, i.e., a mother node should be placed
directly above her daughter or in the exact middle of her daughters. It also cor-
responds to the ‘dp_equal-sister-line-length’ principle, i.e., equal length for lines
that connect sisters to their mother. Finally, it corresponds to the ‘dp_minimal-
angle’ principle, i.e., a larger angle of projection psychologically reduces the
amount of symmetry in a projection.
On clutter in general, Tufte (1990) states (14). This is a commonly accepted
truth and the main motivation behind the development of techniques for clutter
reduction.
(14) “Clutter and confusion are failures of design, not attributes of informa-
tion”
(Tufte, 1990)
There are three categories of clutter reduction techniques: ‘information
preserving’, ‘information reducing’ and ‘remapping’ (Lloyd, 2005). The first cate-
gory, ‘information preserving’ comprises techniques that display all data points
and modify display attributes, such as ‘camera angle’ and ‘opacity’, to produce
the least cluttered view. The second category, ‘information reducing’, comprises
techniques that delete data points to find a balance between information loss
and clutter reduction. The last category, ‘remapping’, comprises techniques that
map data onto several different visualizations, each with its own advantages and
disadvantages.
Diagrams such as static three-dimensional projections of three-dimensional
syntax trees that allow multidominance, restrict relevant clutter reduction tech-
niques to a single category: ‘information preserving’, i.e., we don’t want informa-
tion loss and we only want a single visualization. The category of information
preserving clutter reduction techniques comprises techniques that reduce clutter
based on the modification of a single display attribute, e.g., ‘camera angle’. The
application of multiple reduction techniques can lead to both positive and nega-
tive interactions between these techniques.
In addition to clutter reduction techniques, Lloyd (2005) discusses clutter
measurement methods. These measurement methods are specific for a graphical
representation and based on three sources of clutter: ‘outliers’, ‘density’ and ‘oc-
clusion’.
The combined application of different clutter reduction techniques and
clutter measurement methods, suggests that it is possible to find an optimal dis-
play, i.e., the display in which the least amount of clutter is measured. However,
in order to produce the optimal three-dimensional projection of a three-
dimensional syntax tree that allows multidominance, it is required to test each
possible value for a specific display attribute, e.g., each ‘camera angle’. In other
words, it is incorrect to speak of clutter reduction. Rather than clutter reduction,
we will speak of parametrical variation in the projection.
Variation of parametrical values will result in displays with different
amounts of clutter. The amount of clutter in a three-dimensional projection on a
two-dimensional plane of a three-dimensional syntax tree that allows
multidominance is related to the violation of its design principles. As a
consequence, the clutter measurement techniques will consist of methods that
measure the amount of violation of these design principles. We will speak of
design principle violation measurement, rather than clutter measurement.
3.4.1. Parametrical Variation
In the previous section, I stated that variable parametrical values influence dis-
play attributes. Modification of a single display attribute can affect the amount
of violation of multiple design principles. Parameters for a projection comprise,
e.g., the angle of projection and the spacing between nodes along the axes. Each
of these parameters has a value range and an interval. A combination of parametrical
values is what we will call a parametrical configuration. A parametrical
configuration affects the amounts of realization of the design principles within a
projection. In §6, I will define a set of parameters that affect the display attrib-
utes of a three-dimensional projection on a two-dimensional plane of a three-
dimensional syntax tree that allows multidominance.
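The notion of a parametrical configuration can be sketched as the Cartesian product of the value ranges of all parameters. The parameter names, ranges and intervals below are purely illustrative assumptions; the actual parameters are defined in §6.

```python
from itertools import product

def value_range(start, stop, step):
    """Return start, start + step, ... up to and including stop."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]

# Hypothetical parameters; the names, ranges and intervals are illustrative only.
parameters = {
    "angle_of_projection": value_range(0, 45, 15),  # degrees
    "height_spacing": value_range(40, 60, 10),      # display units
    "depth_spacing": value_range(20, 40, 20),       # display units
}

# Every combination of one value per parameter is one parametrical configuration.
configurations = [dict(zip(parameters, values))
                  for values in product(*parameters.values())]
```

With four, three and two values respectively, this yields 24 configurations. The number of configurations grows multiplicatively with every added parameter, which is worth keeping in mind when choosing intervals.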
3.4.2. Design Principle Violation Measurement
Variation in parametrical values can lead to different amounts of design principle
violation. This suggests that there might be a one-to-one mapping between
design principles and violation measurement methods. However, there is some
overlap between certain design principles, i.e., the violation of multiple design
principles can in some cases be measured by a single violation measurement
method. In §6, I will define design principle violation metrics for all the design
principles in Table 2. Each of these metrics will return the amount of violation of
a single design principle or multiple design principles as a value in the continu-
ous range of 0 – 1, i.e., 0% – 100% violation.
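To illustrate what a metric with a 0 to 1 range could look like, the sketch below measures text overlap as the fraction of label pairs whose bounding boxes intersect. This is a hypothetical stand-in, not the metric defined in §6; the box representation and function names are assumptions.

```python
from itertools import combinations

def boxes_overlap(a, b):
    """Axis-aligned boxes given as (left, top, right, bottom), y growing downwards."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def text_overlap_violation(boxes):
    """Fraction of label pairs that overlap: 0.0 (none) to 1.0 (all pairs)."""
    pairs = list(combinations(boxes, 2))
    if not pairs:
        return 0.0
    return sum(boxes_overlap(a, b) for a, b in pairs) / len(pairs)

labels = [(0, 0, 10, 5), (8, 2, 18, 7), (30, 0, 40, 5)]
score = text_overlap_violation(labels)
```

Of the three label pairs, only the first two labels overlap, so the score is one third. Normalizing by the number of pairs keeps the value within the continuous range regardless of the size of the tree.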
4. Visual Optimality: Clutter and Optimality Theory
In §3.4.1 and §3.4.2, I described parametrical variation and design principle vio-
lation measurement, respectively. The variation in parametrical values that in-
fluence display attributes, may lead to different amounts of design principle vio-
lation, i.e., different amounts of clutter. This implies that there is an ‘optimal
display’ for each visualization. I will define ‘optimal display’ as in (15) and use
this definition throughout this thesis. The term ‘optimal display’ is equivalent to
‘optimal projection’, i.e., these terms can be used interchangeably.
(15) OPTIMAL DISPLAY def.
A display is optimal if the amount of measured clutter, i.e., the amount of
measured design principle violation, is minimal.
In order to find the ‘optimal display’ for a visualization, we need a mecha-
nism to generate, evaluate and rank possible displays, i.e., a mechanism in
which the highest ranked display is the ‘optimal display’. A mechanism that fa-
cilitates this, is an Optimality Theory procedure.
4.1. Optimality Theory
Optimality Theory (OT) (Prince & Smolensky, 1993) was initially developed as a
linguistic model for phonology, but it has also been applied in other areas of
linguistics. For instance, Broekhuis & Dekker (2000) used it complementarily to the
computational system of human language as described in the Minimalist program
(Chomsky, 1995). In this complementary setup, they use OT to account for
syntactic phenomena that cannot be satisfactorily explained by the computational
system.
The OT model consists of three components: a GENERATOR, a set of CON-
STRAINTS and an EVALUATOR. In this thesis, I will develop a post-derivational
OT-procedure for visual optimality. As in Broekhuis & Dekker (2000), the input
for the generator will be a set of syntactic structure building operations (see §5).
The set of constraints will be derived from the design principles (see §3.3) and
the evaluator will be based on the measurement techniques for the violation of
these principles (see §3.4.2 and §6.1.4.1).
Putting the post-derivational OT-procedure together (see Figure 3), the
generator will generate a ‘candidate set’ of three-dimensional projections for a
given syntactic structure. The evaluator will rank all of the projections in the
‘candidate set’ based on the amounts of violation of the design principles that
constitute the set of constraints. The highest ranked candidate display(s) repre-
sent(s) the optimal display(s).
A more technical description of each of the components of the post-
derivational OT-procedure will be provided in §6.
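The interplay of generator, constraints and evaluator can be sketched as follows. This is a simplified illustration under stated assumptions: candidates are plain dictionaries, the two constraint functions are invented for the example, and candidates are ranked by an unweighted sum of violations, which is only one possible ranking scheme and not necessarily the one adopted in §6.

```python
def generate(structure, configurations):
    """GENERATOR (sketch): pair the structure with every parametrical configuration."""
    return [{"structure": structure, "config": config} for config in configurations]

def evaluate(candidates, constraints):
    """EVALUATOR (sketch): rank candidates by their summed violation, lowest first."""
    def total_violation(candidate):
        return sum(constraint(candidate) for constraint in constraints)
    return sorted(candidates, key=total_violation)

# Hypothetical constraints, each returning a violation in the range 0-1.
def angle_violation(candidate):
    return candidate["config"]["angle"] / 90

def crowding_violation(candidate):
    return 1 / (1 + candidate["config"]["angle"])

candidates = generate("Merge(X,Y)", [{"angle": a} for a in (0, 10, 45)])
ranking = evaluate(candidates, [angle_violation, crowding_violation])
optimal = ranking[0]
```

In this toy setup the candidate with an angle of 10 degrees ranks highest, because it balances the two invented constraints better than either extreme.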
Figure 3. Data flow diagram of the post-derivational OT-procedure in Yourdon and Coad
notation.
5. Syntactic Structure: Building Operations
Chomsky (1995) states that syntactic structures can be constructed using an op-
eration called Merge. Merge can take lexical objects from the syntactic work-
space or objects from a partial derivation as input. The latter leads to remerging,
see De Vries (2007a), among others. An example of a Merge operation and its
visual representation is given in (16).
(16) Merge(X,Y) → Z
This gives us an operation to define syntactic structures within traditional
syntax, i.e., it is possible to express subordinate relations. However, it is impos-
sible to express relations of non-subordination. In other words, the Merge opera-
tion cannot express the modified syntax as described in §2. Therefore, De Vries
(2007b) argues that there exist two types of Merge. These are the traditional
Merge or d-Merge for dominance and b-Merge for ‘behindance’. These two types
of Merge allow us to define syntactic structures within the modified syntax.
Chomsky (1973) defines the ‘strict cycle condition’ as in (17). This condition
basically states that it is impossible to perform operations on parts of a deriva-
tion that are already completed.
(17) STRICT CYCLE CONDITION def.
No rule can apply to a domain dominated by a cyclic node A in such a way
as to affect solely a proper subdomain of A dominated by a node B which
is also a cyclic node.
(Chomsky, 1973)
Given this condition and the two types of Merge, we can use a sequence of
these Merge operations as a definition of a syntactic structure within the modi-
fied syntax. The exact implementation of these Merge sequences will be de-
scribed in §6.
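As a rough illustration of how a sequence of Merge operations can define a structure, the sketch below represents each operation as a function that builds a new node from two existing objects. The `Node` class, the function signatures and the labels are hypothetical; the implementation in §6 may differ considerably.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    daughters: list = field(default_factory=list)
    relation: str = ""  # "d" for dominance, "b" for behindance, "" for lexical items

def d_merge(x, y, label):
    """d-Merge (sketch): combine x and y under a new node by dominance."""
    return Node(label, [x, y], "d")

def b_merge(x, y, label):
    """b-Merge (sketch): combine x and y under a new node by behindance."""
    return Node(label, [x, y], "b")

# A derivation written as a bottom-up sequence of Merge operations.
x, y, w = Node("X"), Node("Y"), Node("W")
z = d_merge(x, y, "Z")
root = b_merge(z, w, "R")
remerged = d_merge(x, root, "Q")  # remerge: X is taken from the partial derivation
```

Because the remerged X is the same object as the X under Z, it ends up with two mothers, which is how a multidominance construction could arise in such a representation. Each operation in the sketch applies to the current root only and leaves completed parts of the derivation untouched.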
6. Core Mechanism: Generation and Evaluation
The research question of this thesis concerns the automatic generation of an op-
timal display (i.e., an optimal three-dimensional projection) of a three-
dimensional syntax tree that allows multidominance.
I defined an OPTIMAL DISPLAY in terms of a display in which the amount of
measured clutter, i.e., the amount of measured design principle violation, is
minimal. In this definition, the term CLUTTER refers to a state of confusion that
degrades both the accuracy and ease of interpretation of information displays.
This state of confusion arises from the violation of certain DESIGN PRINCIPLES for
a specific representation, i.e., in this specific case, the design principles for a
three-dimensional syntax tree that allows multidominance (as defined in §3.3
and formalized in Appendix B).
The amounts of violation of these design principles may vary due to varia-
tion in parametrical values that influence the attributes of a display, i.e., in this
specific case, the attributes of a display (i.e., a three-dimensional projection on a
two-dimensional plane) of a three-dimensional syntax tree. I will define several
parameters that influence the attributes of a display of a three-dimensional syn-
tax tree that allows multidominance in §6.1.2. In order to measure the effects of
variation of these parametrical values, I will define metrics to measure the
amounts of violation of the design principles for three-dimensional syntax trees
that allow multidominance in §6.1.4.
The possibility of parametrical variation and the possibility to measure the
amounts of violation of design principles, suggest that it is possible to develop a
mechanism that can automatically compute an optimal display of a three-
dimensional syntax tree that allows multidominance. I proposed such a mecha-
nism in terms of a post-derivational Optimality Theory (OT) procedure in §4. In
this section, I will describe this post-derivational OT-procedure and its compo-
nents in detail.
6.1. Design of the post-derivational OT-procedure
An OT-procedure consists of three main components: a GENERATOR, an EVALUA-
TOR and a set of CONSTRAINTS.
In order to generate an optimal display of a three-dimensional syntax tree,
the generator takes a sequence of syntactic structure building operations as in-
put and generates a candidate set of displays. Each of these candidate displays
has a unique configuration based on parametrical values. The evaluator will
evaluate and rank each candidate display against a target configuration, which
specifies target values for each parameter. The evaluation is based on the
amounts of violation of the design principles that constitute the set of con-
straints. The highest ranked candidate display represents the optimal display.
See Figure 3, for a graphical overview of this flow of operation.
However, this describes the OT-procedure solely from the perspective of
Optimality Theory. An actual implementation of the procedure will inevitably
consist of more components and phases. The procedure that I propose is a
post-derivational OT-procedure. As described in the flow of operation, the
generator of the OT-procedure takes a sequence of syntactic structure building
operations as input. Although this is correct from the perspective of Optimality
Theory, it is not feasible in an actual implementation, i.e., the sequence of
syntactic structure building operations first needs to be parsed into a workable
internal data-structure. In other words, we need a pre-derivational phase, i.e., a
SYNTAX PARSER, that transforms the sequence of syntactic structure building
operations into an internal data-structure. This internal data-structure will be a
Scalable Vector Graphics (SVG) Document Object Model (DOM), i.e., a vector
image that is modifiable and facilitates the representation of hierarchies.
The SVG DOM data-structure will be the input for the generator, which
will generate a candidate set of displays. The evaluator will then measure the
amounts of violation of the design principles that constitute the set of
constraints, for each combination of a candidate display and a target
configuration, where a target configuration specifies target values for each
parameter. This will result in an OPTIMALITY THEORY MATRIX (OTM), i.e., a
matrix with the constraints as columns, the combinations of candidate displays
and target configurations as rows and a score in each cell.
From the perspective of Optimality Theory, the evaluator simply returns
the optimal candidate. However, in an actual implementation, the optimal
candidate needs to be extracted from the OTM, i.e., we need an additional phase
to extract the optimal candidate.
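The extraction phase can be illustrated with a minimal Python sketch. It assumes that the optimal candidate is the row of the OTM with the highest total score; the aggregation by row sum, the function name `extract_optimal` and the example values are assumptions made for illustration only.

```python
from typing import List, Sequence


def extract_optimal(otm: Sequence[Sequence[float]]) -> int:
    """Return the index of the row (combination) with the highest total score.

    Assumption: the scores in a row are aggregated by summation.
    Ties are resolved in favour of the first row encountered.
    """
    if not otm:
        raise ValueError("the OTM contains no rows")
    totals: List[float] = [sum(row) for row in otm]
    return max(range(len(totals)), key=totals.__getitem__)


# Hypothetical OTM: three combinations (rows), three constraints (columns).
example_otm = [
    [1.0, 0.6, 0.2],
    [0.9, 0.5, 0.1],
    [0.7, 0.3, 1.0],
]
best_row = extract_optimal(example_otm)
```

A different aggregation, e.g., a strict ranking of constraints, would only require replacing the summation.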
In summary, we have the following components: a syntax parser, an internal
data-structure, a generator, a set of parameters, a set of constraints, a set of
metrics, an evaluator, an OTM and a mechanism to extract the optimal candidate
from the OTM. In the next sections, I will describe the interplay and workings
of these components in detail. See Figure 4 for a graphical overview of the
implementation.
6.1.1. Syntax Parser
The whole process starts with a sequence of syntactic structure building opera-
tions, i.e., a sequence of Merge statements. As argued in §5, there are two types
of Merge: d-Merge and b-Merge. The first type, d-Merge, allows the expression of
subordinate relations and the second type, b-Merge, the expression of the non-
subordinate relations.
A sequence of d-Merge and b-Merge operations can represent a complex
syntactic structure. However, this representation is not a workable format for
the implementation of the OT-procedure, i.e., it needs to be parsed into a more
suitable data-structure. Although there are several suitable data-structures that
can represent the complex syntactic structures, I will take it one level further,
i.e., I will combine syntactic structure and graphical representation into a single
data-structure. This data-structure will be a Scalable Vector Graphics Document
Object Model.
Figure 4. Data flow diagram of the pre-derivational and post-derivational implementa-
tion of the OT-procedure in Yourdon and Coad notation.
Scalable Vector Graphics (SVG) is a vector image standard defined in the
Extensible Markup Language (XML). The fact that it is defined in XML allows us
to use it as a Document Object Model (DOM), i.e., a standard object model for
representing XML and XML-like markup languages. This simply means we can
modify it dynamically within our implementation.
An XML document consists of a set of tags that represent the internal
hierarchy of the document. These tags can be nested. In other words, an XML
document allows us to represent tree-like hierarchies. In a plain XML document,
it is allowed to introduce tags that suit the data. However, as stated above, I will
combine the syntactic structure and its graphical representation in a single
structure, i.e., I will use SVG. In SVG there are predefined tags and attributes
that represent, e.g., shapes, lines and text. In addition, there is also a special
tag to group a set of tags. The possibility of grouping will eventually allow us to
represent both the syntactic structure and the graphical representation in a
single SVG DOM, but I will return to that later.
Before describing how a sequence of Merge operations can be parsed into an
SVG DOM, I need to address one more problem. The sequence of operations in
(18a) is ambiguous, i.e., it is unclear whether a new label ‘B’ is introduced in the
second operation and to which ‘B’ the third operation refers (i.e., the ‘B’
from the first operation or the potential new ‘B’ from the second operation). In
order to account for this problem, an argument of a Merge operation will not
solely consist of a text label, but will also have an identification label, see (18b).
The identification label is obligatory, but the text label only needs to be specified
upon the introduction of a constituent. The sequence of Merge operations in
(18a) can now be represented as in (18c).
(18a) d-Merge(A, B) → C
      d-Merge(C, D) → B
      d-Merge(E, B) → F
      b-Merge(F, G) → H
      d-Merge(D, I) → J
(18b) d-Merge([<“label”>, id], [<“label”>, id]) → [<“label”>, id]
      b-Merge([<“label”>, id], [<“label”>, id]) → [<“label”>, id]
(18c) d-Merge([“A”, A], [“B”, B1]) → [“C”, C]
      d-Merge([C], [“D”, D]) → [“B”, B2]
      d-Merge([“E”, E], [B2]) → [“F”, F]
      b-Merge([F], [“G”, G]) → [“H”, H]
      d-Merge([D], [“I”, I]) → [“J”, J]
Consequently, the input of the syntax parser can be defined as a sequence
of d-Merge and b-Merge operations, similar to the sequence in (18c). The
d-Merge and b-Merge operations are asymmetrical, i.e., the first argument of an
operation precedes the second. A highly simplified translation of (18c) into the
SVG data-structure is given in (19).
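To make the input format of (18c) concrete, the following Python sketch reads one Merge operation per line. It is only an illustration: the regular expression, the type names `Constituent` and `MergeOp`, and the acceptance of `->` as an ASCII alternative to the arrow are assumptions, not part of the notation defined above.

```python
import re
from typing import NamedTuple, Optional


class Constituent(NamedTuple):
    ident: str            # obligatory identification label
    label: Optional[str]  # text label, only present when the constituent is introduced


class MergeOp(NamedTuple):
    kind: str             # "d" (subordinate) or "b" (non-subordinate)
    left: Constituent
    right: Constituent
    result: Constituent


# One bracketed argument: ["label", id] or [id]; straight or curly quotes.
_ARG = r'\[\s*(?:["“]([^"”]*)["”]\s*,\s*)?(\w+)\s*\]'
_OP = re.compile(rf'([db])-Merge\(\s*{_ARG}\s*,\s*{_ARG}\s*\)\s*(?:→|->)\s*{_ARG}')


def parse_merge(line: str) -> MergeOp:
    """Parse a single operation written in the format of (18c)."""
    match = _OP.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a Merge operation: {line!r}")
    kind, l_lab, l_id, r_lab, r_id, n_lab, n_id = match.groups()
    return MergeOp(kind, Constituent(l_id, l_lab),
                   Constituent(r_id, r_lab), Constituent(n_id, n_lab))


ops = [parse_merge(s) for s in (
    'd-Merge(["A", A], ["B", B1]) -> ["C", C]',
    'd-Merge([C], ["D", D]) -> ["B", B2]',
    'd-Merge(["E", E], [B2]) -> ["F", F]',
    'b-Merge([F], ["G", G]) -> ["H", H]',
    'd-Merge([D], ["I", I]) -> ["J", J]',
)]
```

An argument without a text label, such as `[C]`, is returned with `label` set to `None`, which marks it as a reference to an existing constituent.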
In (19), each unique constituent in (18c) is represented by a so-called
group tag, i.e., a <g> tag. Most of these group tags contain an ‘id’ attribute that
corresponds with an identification label in the Merge operations in (18c). However,
one of the tags has a ‘ref’ attribute instead of an ‘id’ attribute. The ‘ref’ attribute
represents a reference to an already existing ‘id’ attribute. This allows us to
represent shared constituents in a multidominance construction. Each of the
<g> tags that carries an ‘id’ attribute contains a <text> tag to represent a text
label, and each non-terminal <g> tag additionally contains two <line> tags.
Each <line> tag represents a line that connects two nodes.
Each of the <g> tags and each of the <line> tags can be assigned
coordinates in the vector image. In other words, the data-structure can represent
the syntactic structure and its graphical representation at the same time.
(19) <svg>
       <g id="H" z="1">
         <text>H</text>
         <g id="F" z="0">
           <text>F</text>
           <g id="E" z="0">
             <text>E</text>
           </g>
           <g id="B2" z="0">
             <text>B</text>
             <g id="C" z="0">
               <text>C</text>
               <g id="A" z="0">
                 <text>A</text>
               </g>
               <g id="B1" z="0">
                 <text>B</text>
               </g>
               <line id="C:A" />
               <line id="C:B1" />
             </g>
             <g id="D" z="0">
               <text>D</text>
             </g>
             <line id="B2:C" />
             <line id="B2:D" />
           </g>
           <line id="F:E" />
           <line id="F:B2" />
         </g>
         <g id="G" z="1">
           <text>G</text>
         </g>
         <line id="H:F" />
         <line id="H:G" />
       </g>
       <g id="J" z="0">
         <text>J</text>
         <g ref="D" />
         <g id="I" z="0">
           <text>I</text>
         </g>
         <line id="J:D" />
         <line id="J:I" />
       </g>
     </svg>
Since SVG is an image standard for two-dimensional vector images, it does
not define a ‘z’ attribute for the <g> tag. I introduce this attribute to represent
the depth of a node, i.e., to facilitate the expression of parallel structures.
This depth is relative rather than absolute, i.e., the ‘z’ attribute specifies a
relative order of parallelism.
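The structure in (19) can be built programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` module to reconstruct the constituent J; the helper names `add_node`, `add_reference` and `add_line` are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET


def add_node(parent: ET.Element, ident: str, label: str, z: int = 0) -> ET.Element:
    """Append a <g> group with an 'id', a relative depth 'z' and a <text> label."""
    group = ET.SubElement(parent, "g", {"id": ident, "z": str(z)})
    ET.SubElement(group, "text").text = label
    return group


def add_reference(parent: ET.Element, ident: str) -> ET.Element:
    """Append an empty <g ref="..."/> that points at an existing constituent."""
    return ET.SubElement(parent, "g", {"ref": ident})


def add_line(parent: ET.Element, child_id: str) -> ET.Element:
    """Append the <line> that connects a mother node to one of its daughters."""
    return ET.SubElement(parent, "line", {"id": f"{parent.get('id')}:{child_id}"})


# Rebuild the constituent J from (19): it shares D and introduces I.
svg = ET.Element("svg")
j = add_node(svg, "J", "J")
add_reference(j, "D")
add_node(j, "I", "I")
add_line(j, "D")
add_line(j, "I")

markup = ET.tostring(svg, encoding="unicode")
```

Because the ‘z’ and ‘ref’ attributes are ordinary XML attributes, a generic XML library can store them even though they have no meaning in SVG itself.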
Parsing a sequence of Merge operations into an SVG DOM representation is
algorithmically complex. However, the global flow of operation of the parser is
quite straightforward. The idea is to return an SVG DOM that represents the
syntactic structure represented by a sequence of Merge operations. In other
words, the first thing the parser should do is create an SVG DOM and configure it
as an SVG document, i.e., create an <svg> document element (root element).
The parser will then process each of the Merge operations separately, i.e.,
although they relate to each other, they are treated independently by the parser.
Each Merge operation will first be split into the left-constituent, the
right-constituent and the new constituent, i.e., the constituent that arises from
merging the left-constituent and the right-constituent. If the left-constituent
already exists, its structure will be extracted from the partial derivation, i.e.,
from the SVG DOM. The same process will be applied to the right-constituent.
If the new constituent does not already exist in the partial derivation,
there are two options: either the Merge operation is a valid subordinate or
non-subordinate Merge operation, or it is a valid subordinate or
non-subordinate multidominance Merge operation. These two options will be
treated separately.
If the Merge operation constitutes a valid subordinate or non-subordinate
Merge operation, a relevant constituent will be constructed, i.e., a constituent in
which the new constituent and the left and right constituents are in a subordi-
nate relation or a constituent in which they are in a non-subordinate relation.
This new constituent will then be added to the SVG DOM at the correct position.
If there were existing left or right constituents, they will be moved into the
structure of the new constituent.
Likewise, if the Merge operation constitutes a valid subordinate or non-
subordinate multidominance Merge operation, a relevant constituent will be con-
structed, i.e., a constituent in which the new constituent and the left and right
constituents are in a subordinate relation or a constituent in which they are in a
non-subordinate relation. However, in both of these relevant constituents either
the left-constituent or the right-constituent is shared, i.e., it is in a multidomi-
nance relation.
In (20), I provide a formal overview of the algorithmic foundation of the
syntax parser in the pseudo code notation of Brookshear (2005).
(20) IMPORT CONSTANT BMERGE;
     IMPORT CONSTANT DMERGE;

     PROCEDURE PARSER(MERGE_OPERATIONS[])
         ; svg dom document
         DEFINE svgdom ← CREATE_DOCUMENT();
         ; svg dom document element (root node)
         DEFINE svgdom_de ← CREATE_DOCUMENT_ELEMENT(svgdom, “svg”);

         ; parse merge operations
         FOREACH mo IN MERGE_OPERATIONS[]
             ; left constituent
             DEFINE lc ← EXTRACT_LEFT_CONSTITUENT(mo);
             IFF NOT IS_NEW_CONSTITUENT(lc)
                 lc ← EXTRACT_CONSTITUENT_FROM_DOM(lc);

             ; right constituent
             DEFINE rc ← EXTRACT_RIGHT_CONSTITUENT(mo);
             IFF NOT IS_NEW_CONSTITUENT(rc)
                 rc ← EXTRACT_CONSTITUENT_FROM_DOM(rc);

             ; new constituent
             DEFINE nc ← EXTRACT_NEW_CONSTITUENT(mo);

             ; normal subordinate or non-subordinate merge
             IFF IS_NEW_CONSTITUENT(nc) AND IS_VALID_MERGE(lc, rc)
                 ; new constituent element
                 DEFINE ce ← NULL;
                 ; create subordinate group element
                 IFF GET_MERGE_TYPE(mo) EQUALS DMERGE
                     ce ← CREATE_SUBORDINATE_CONSTITUENT(lc, rc, nc);
                 ; create non-subordinate group element
                 IFF GET_MERGE_TYPE(mo) EQUALS BMERGE
                     ce ← CREATE_NON_SUBORDINATE_CONSTITUENT(lc, rc, nc);
                 ; attach element at the root
                 IFF IS_NEW_CONSTITUENT(lc) AND IS_NEW_CONSTITUENT(rc)
                     svgdom_de.ATTACH_ELEMENT(ce);
                 ; embed the left constituent
                 IFF NOT IS_NEW_CONSTITUENT(lc) AND IS_NEW_CONSTITUENT(rc)
                     svgdom_de.INSERT_BEFORE(lc, ce);
                     svgdom_de.REMOVE_ELEMENT(lc);
                 ; embed the right constituent
                 IFF IS_NEW_CONSTITUENT(lc) AND NOT IS_NEW_CONSTITUENT(rc)
                     svgdom_de.INSERT_BEFORE(rc, ce);
                     svgdom_de.REMOVE_ELEMENT(rc);
                 ; embed the left and the right constituent
                 IFF NOT IS_NEW_CONSTITUENT(lc) AND NOT IS_NEW_CONSTITUENT(rc)
                     svgdom_de.INSERT_BEFORE(lc, ce);
                     svgdom_de.REMOVE_ELEMENT(lc);
                     svgdom_de.REMOVE_ELEMENT(rc);

             ; multidominance subordinate or non-subordinate merge
             IFF IS_NEW_CONSTITUENT(nc) AND IS_VALID_MULTIDOMINANCE(lc, rc, nc)
                 ; new constituent element
                 DEFINE ce ← NULL;
                 ; create subordinate group element
                 IFF GET_MERGE_TYPE(mo) EQUALS DMERGE
                     ce ← CREATE_MDOM_SUBORDINATE_CONSTITUENT(lc, rc, nc);
                 ; create non-subordinate group element
                 IFF GET_MERGE_TYPE(mo) EQUALS BMERGE
                     ce ← CREATE_MDOM_NON_SUBORDINATE_CONSTITUENT(lc, rc, nc);
                 ; attach the element at the root
                 svgdom_de.ATTACH_ELEMENT(ce);

         RETURN svgdom;
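For comparison, the same flow can be approximated in a few lines of Python. This is a deliberately simplified sketch and not the procedure in (20): it assumes that an existing constituent found at the root is moved into the new structure, that an existing constituent which is already dominated is shared through a `ref` group, and it omits validity checks and the assignment of the ‘z’ attribute. The tuple layout of the input is likewise an assumption.

```python
import xml.etree.ElementTree as ET


def parse(operations):
    """Build an SVG-like tree from (kind, left, right, new) tuples.

    Each of left, right and new is an (id, label) pair; label may be None
    when the constituent was introduced by an earlier operation.
    """
    svg = ET.Element("svg")
    known = {}  # id -> <g> element holding the structure of that constituent

    def daughter(ident, label):
        if ident not in known:                  # newly introduced constituent
            group = ET.Element("g", {"id": ident})
            ET.SubElement(group, "text").text = label
            known[ident] = group
            return group
        group = known[ident]
        if group in list(svg):                  # existing root: move it
            svg.remove(group)
            return group
        return ET.Element("g", {"ref": ident})  # already dominated: share it

    for kind, left, right, (n_id, n_label) in operations:
        if kind not in ("d", "b"):
            raise ValueError(f"unknown Merge type: {kind!r}")
        if n_id in known:
            raise ValueError(f"constituent {n_id!r} already exists")
        mother = ET.Element("g", {"id": n_id})
        ET.SubElement(mother, "text").text = n_label
        for ident, label in (left, right):
            mother.append(daughter(ident, label))
        for ident, _ in (left, right):
            ET.SubElement(mother, "line", {"id": f"{n_id}:{ident}"})
        known[n_id] = mother
        svg.append(mother)
    return svg


tree = parse([
    ("d", ("A", "A"), ("B1", "B"), ("C", "C")),
    ("d", ("C", None), ("D", "D"), ("B2", "B")),
    ("d", ("E", "E"), ("B2", None), ("F", "F")),
    ("b", ("F", None), ("G", "G"), ("H", "H")),
    ("d", ("D", None), ("I", "I"), ("J", "J")),
])
```

Applied to the operations of (18c), the sketch yields two root groups, H and J, with D shared by reference inside J, which mirrors the nesting shown in (19).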
6.1.2. Generator
The output of the SYNTAX PARSER, i.e., the SVG DOM, is the input for the genera-
tor (see Figure 4). The generator will generate a candidate set of displays. Each
display in this candidate set has a unique configuration of parametrical values.
We will refer to this configuration as a CANDIDATE CONFIGURATION.
In the previous section, I argued that an SVG DOM can represent a syntactic
structure and its graphical representation at the same time, because
coordinates can be assigned to certain tags. Consequently, a candidate
display is simply a copy of the SVG DOM with coordinates computed and assigned
by an internal component called the PROJECTOR.
The projector takes the SVG DOM tree as input and assigns coordinates to
the <g> and <line> tags. More concretely, the projector turns the SVG DOM
into a candidate display. The technical design of the projector goes beyond the
scope of this thesis, but the most important point is that it projects a syntax tree
based on a syntactic structure and a candidate configuration.
The number of possible permutations of parametrical values depends on
the number of parameters, their value ranges and intervals, and the number of
possible variations for each parameter within a projection, e.g., the number of
possible height difference variations between nodes depends on the number of
subordinated nodes in the syntactic structure. For example, if a parameter that
specifies height spacing between subordinated nodes has a value range of 50 –
150 pixels and an interval of 5 pixels, its range has a cardinality of 21. In a
perfectly balanced binary tree that contains 7 nodes and no parallel nodes, there
are 6 subordinated nodes. This means that there are 21^6 = 85,766,121 possible
combinations of height differences.
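The arithmetic behind this example can be spelled out as follows, assuming that both endpoints of the value range are included:

```latex
\[
|\mathit{range}| = \frac{150 - 50}{5} + 1 = 21,
\qquad
21^{6} = 85\,766\,121 .
\]
```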
The number of possible variations for a parameter within a projection is
specific to each parameter, i.e., each parameter depends on different factors. As
the main idea behind this thesis is to provide a general solution to the research
question, the number of candidate displays, i.e., candidate configurations, can be
computed as in (21), i.e., as the product, over all parameters, of the cardinality
of each parameter's interval value range raised to the power of the number of
possible variations of this parameter within a specific syntactic structure.
(21) $N_{cc} = \prod_{\forall p \in P} \left( \mathrm{range}(p) + 1 \right)^{\mathrm{pv}(p)}$,
Where range() returns the interval value range of parameter p which is a
member of the set of parameters P and pv() the number of possible varia-
tions for a parameter within a specific syntactic structure.
Each of the Ncc candidates represents a CANDIDATE CONFIGURATION (i.e., a
candidate display), which is not to be confused with a TARGET CONFIGURATION.
Each candidate configuration will be evaluated against each target configura-
tion. A target configuration represents a combination of target values, whereas a
candidate configuration represents combinations of variations of parametrical
values within a projection of a syntactic structure. In terms of design principle
violation measurement as described in §3.4.1, the amount of design principle vio-
lation in the candidate configuration will be measured with respect to the target
values in the target configuration. The total number of target configurations can
be computed as in (22), i.e., the product of the interval value ranges for each pa-
rameter.
(22) $N_{tc} = \prod_{\forall p \in P} \left( \mathrm{range}(p) + 1 \right)$,
Where range() returns the interval value range of parameter p which is a
member of the set of parameters P.
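Both counts are straightforward to compute. The Python sketch below reads range(p) as the number of intervals between the lowest and the highest value, so that range(p) + 1 is the cardinality of the value range; this reading, the `Parameter` type and the function names are assumptions made for the illustration.

```python
from math import prod
from typing import Dict, NamedTuple


class Parameter(NamedTuple):
    low: int       # lowest value of the value range
    high: int      # highest value of the value range
    interval: int  # step between two consecutive values

    def cardinality(self) -> int:
        """Number of values in the range, i.e., range(p) + 1 in (21) and (22)."""
        return (self.high - self.low) // self.interval + 1


def n_tc(parameters: Dict[str, Parameter]) -> int:
    """Number of target configurations, as in (22)."""
    return prod(p.cardinality() for p in parameters.values())


def n_cc(parameters: Dict[str, Parameter], pv: Dict[str, int]) -> int:
    """Number of candidate configurations, as in (21)."""
    return prod(p.cardinality() ** pv[name] for name, p in parameters.items())


# The height-spacing example: 50 - 150 pixels in steps of 5, six variations.
params = {"y-spacing": Parameter(50, 150, 5)}
variations = {"y-spacing": 6}
targets = n_tc(params)
candidates = n_cc(params, variations)
```

The gap between the two numbers shows why the candidate set grows much faster than the set of target configurations as soon as pv() exceeds 1.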
The generator will compute each of the Ncc candidate configurations and
evaluate it against each of the Ntc target configurations. We will refer to the
computation of each of the Ncc candidate configurations and Ntc target
configurations as permutation. Before describing the permutation of the
parametrical values for each candidate configuration, I will describe the simpler
permutation of the parametrical values for each target configuration. An intuitive
and simple approach to the computation of each of the Ntc target configurations is
to nest loops. In the case of four parameters, this results in four loops, i.e., a loop
for each parameter. Each of these loops iterates through the values of the
corresponding parameter. This allows the computation of all permutations, see (23).
(23) ; target configuration matrix for four parameters
     DEFINE TARGET_CONFIGURATION[4];
     FOREACH value_par_1 IN RANGE(PARAMETER_1)
         FOREACH value_par_2 IN RANGE(PARAMETER_2)
             FOREACH value_par_3 IN RANGE(PARAMETER_3)
                 FOREACH value_par_4 IN RANGE(PARAMETER_4)
                     TARGET_CONFIGURATION[0] ← value_par_1;
                     TARGET_CONFIGURATION[1] ← value_par_2;
                     TARGET_CONFIGURATION[2] ← value_par_3;
                     TARGET_CONFIGURATION[3] ← value_par_4;
However, this approach has two problems. First of all, although it allows
the computation of the Ntc target configurations, it is not usable for the Ncc can-
didate configurations, i.e., in the case of the candidate configurations, each pa-
rametrical value can vary per relevant factor within a projection, e.g., between
each mother and daughter. The second problem with this approach is that as
soon as parameters are added or removed, the static approach to permutation
will fail, i.e., in a setup with ten parameters, we would need ten nested loops in-
stead of four. In other words, in order to generalize the solution algorithmically,
we need a more dynamic approach towards permutation. I will describe such a
dynamic approach for the permutation of the Ntc target configurations. This ap-
proach can then be extended to the more complex situation of variation of para-
metrical values within a projection, i.e., the permutation of the Ncc candidate
configurations.
The equation in (22) allows us to compute the total number of target con-
figurations, Ntc, based on the cardinalities of the parametrical interval value
ranges. Consequently, the total number of target configurations and the cardi-
nalities of the parametrical value ranges, allow us to compute the permutations
of the parametrical values. More concretely, the nested loops can be generalized
into a single loop. The number of iterations of this loop equals the total number
of target configurations, i.e., Ntc, see (22).
If an index i represents an iteration of this loop, the parametrical value for
a parameter in an ordered set of parameters can be found by dividing i
successively by the cardinalities of the interval value ranges of the preceding
parameters in the set (integer division, i.e., discarding the remainder) and then
taking the remainder of the result after dividing it by the cardinality of the
interval value range of the relevant parameter.
For example, in a setup of three parameters, P1 with value range 10 – 12
and interval 1, P2 with value range 50 – 52 and interval 2, and P3 with value
range 100 – 103 and interval 3, the cardinalities are three, two and two
respectively, i.e., there are 3 x 2 x 2 = 12 possible displays. The permutations for
the parameters P1, P2 and P3 can now be computed by iterating an index i through
the possible displays and by computing the indices of the parametrical values as
i mod |P1| for the value of P1, (i / |P1|) mod |P2| for the value of P2 and
(i / |P1| / |P2|) mod |P3| for the value of P3, see Table 3.
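The index arithmetic described here is a mixed-radix decoding of the loop index. A minimal Python sketch, in which the function name `configuration` is hypothetical, reproduces the three-parameter example:

```python
from typing import List, Sequence


def configuration(index: int, ranges: Sequence[Sequence[int]]) -> List[int]:
    """Decode a loop index into one value per parameter (mixed-radix decoding).

    For each parameter the running quotient is divided by the cardinality of
    its range; the remainder selects the value, the quotient is passed on.
    """
    values = []
    quotient = index
    for value_range in ranges:
        quotient, position = divmod(quotient, len(value_range))
        values.append(value_range[position])
    return values


ranges = [
    list(range(10, 13, 1)),    # P1: 10, 11, 12
    list(range(50, 53, 2)),    # P2: 50, 52
    list(range(100, 104, 3)),  # P3: 100, 103
]
n_tc = 3 * 2 * 2
table = [configuration(i, ranges) for i in range(n_tc)]
```

Because the function loops over however many ranges it receives, adding or removing a parameter does not require any change to the code, which is the point of replacing the nested loops.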
This dynamic approach to permutation can also be applied to the permuta-
tion of the Ncc candidate configurations. However, this involves more than simple
permutation of the parametrical values, i.e., it depends on the parametrical val-
ues and the possible variations of these values within a syntactic structure. Con-
sequently, the number of parameters for a candidate configuration can be com-
puted as in (24).
PERMUTATIONS OF THREE PARAMETERS
Index   P1 (10 – 12)   Value   P2 (50 – 52)          Value   P3 (100 – 103)               Value
i       i mod |P1|     P1_i    (i / |P1|) mod |P2|   P2_i    (i / |P1| / |P2|) mod |P3|   P3_i
0       0              10      0                     50      0                            100
1       1              11      0                     50      0                            100
2       2              12      0                     50      0                            100
3       0              10      1                     52      0                            100
4       1              11      1                     52      0                            100
5       2              12      1                     52      0                            100
6       0              10      0                     50      1                            103
7       1              11      0                     50      1                            103
8       2              12      0                     50      1                            103
9       0              10      1                     52      1                            103
10      1              11      1                     52      1                            103
11      2              12      1                     52      1                            103
Table 3. Overview of the Permutations for Three Parameters.
(24) $N_{cc\_parameters} = \sum_{\forall p \in P} \mathrm{pv}(p)$,
Where pv() returns the number of possible variations within a specific syn-
tactic structure of parameter p which is a member of the set of parameters
P.
For example, if a parameter that specifies height spacing between subordi-
nated nodes has a value range of 50 – 150 pixels and an interval of 5 pixels, its
range has a cardinality of 21. In a perfectly balanced binary tree that contains 7
nodes and no parallel nodes, this means that there are 7 – 1 = 6 situations in
which height differences can differ, and thus 6 parameters that correspond with
the height difference variation in a projection, i.e., the value of pv() equals 6.
This approach allows us to define a generalized algorithm for the genera-
tor, i.e., an algorithm that is independent of the number of parameters. This
generalized algorithm computes the number of candidate configurations as de-
scribed in (21) and then creates a candidate set of displays based on all the per-
mutations of parametrical values and its possible variations within a syntactic
structure. In (25), I defined this algorithm formally in the pseudo code notation
of Brookshear (2005).
(25) PROCEDURE GENERATE(SVGDOM, PARAMETERS[])
         ; number of candidate configurations (a product, so it starts at 1)
         DEFINE Ncc ← 1;
         ; number of candidate configuration parameters
         DEFINE Ncc_parameters ← 0;
         ; number of parameters
         DEFINE Nparameters ← CARDINALITY(PARAMETERS[]);

         ; determine Ncc and Ncc_parameters
         FOREACH parameter[] IN PARAMETERS[]
             Ncc ← Ncc * (CARDINALITY(RANGE(parameter[])) ^ PV(parameter[]));
             Ncc_parameters ← Ncc_parameters + PV(parameter[]);

         ; candidate displays
         DEFINE CANDIDATES[Ncc];
         ; candidate configuration parameters
         DEFINE CC_PARAMETERS[Ncc_parameters];
         ; a candidate configuration
         DEFINE C_CONFIGURATION[Ncc_parameters];

         ; fill the candidate configuration parameters
         DEFINE t ← 0;
         FOREACH i IN [0...Nparameters]
             DEFINE parameter[] ← PARAMETERS[i];
             FOREACH j IN [t...(t + PV(parameter[]))]
                 CC_PARAMETERS[j] ← parameter[];
             t ← t + PV(parameter[]);

         ; permute the values of the candidate configuration parameters
         FOREACH i IN [0...Ncc]
             FOREACH j IN [0...Ncc_parameters]
                 DEFINE parameter[] ← NULL;
                 DEFINE z ← i;
                 ; integer division by the cardinalities of the preceding parameters
                 FOREACH x IN [0...j]
                     parameter[] ← CC_PARAMETERS[x];
                     z ← z / CARDINALITY(RANGE(parameter[]));
                 parameter[] ← CC_PARAMETERS[j];
                 z ← z MOD CARDINALITY(RANGE(parameter[]));
                 C_CONFIGURATION[j] ← parameter[z];
             CANDIDATES[i] ← PROJECT(SVGDOM, C_CONFIGURATION[]);

         RETURN CANDIDATES[];
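Since the number of candidate configurations grows exponentially, an implementation may prefer to enumerate them lazily instead of allocating the whole candidate set at once. The following Python sketch illustrates this with `itertools.product`; it is an alternative to the index arithmetic of (25), not a transcription of it, it leaves out the projection step, and the parameter names and ranges are made up for the example. Note that `product` varies the last slot fastest, whereas (25) varies the first slot fastest; the set of configurations is the same.

```python
from itertools import islice, product
from typing import Dict, Iterator, List, Sequence, Tuple


def candidate_configurations(
    ranges: Dict[str, Sequence[int]],
    pv: Dict[str, int],
) -> Iterator[Tuple[Tuple[str, int], ...]]:
    """Lazily yield every candidate configuration.

    Each parameter p is expanded into pv[p] slots, one for every place in the
    tree where its value may vary; a configuration assigns one value per slot.
    """
    slots: List[str] = [name for name in ranges for _ in range(pv[name])]
    for values in product(*(ranges[name] for name in slots)):
        yield tuple(zip(slots, values))


# Hypothetical setup: two height-spacing slots and one x-spacing slot.
ranges = {"y-spacing": range(50, 151, 5), "x-spacing": range(50, 151, 50)}
pv = {"y-spacing": 2, "x-spacing": 1}
first_three = list(islice(candidate_configurations(ranges, pv), 3))
```

Each yielded configuration could then be handed to the projector one at a time, so that memory use stays constant regardless of Ncc.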
In §3.4, I argued that projection parameters affect display attributes of a
projection. In this section, I described how a generator generates candidate dis-
plays based on the variation in values of these parameters. In the next sections,
I will describe four parameters that affect the display attributes of a three-
dimensional projection on a two-dimensional plane of a three-dimensional syntax
tree.
6.1.2.1. Parameter: Angle of Projection
The angle of projection or camera angle is the most straightforward three-
dimensional projection parameter. Variation of projection angle can affect the
amount of occlusion in a display, i.e., the amount of overlap of objects and the
amount of view obstruction. In terms of design principles, it affects the visibility
of all nodes, the overlap of text nodes and the crossing of lines.
Parameter Characteristics:
- Parameter name: ‘par_projection-angle’
- Value range: -45 – +45 degrees (along the z-axis)
- Interval: 1 degree
- Affected design principle(s):
o ‘dp_all-visible’
o ‘dp_no-text-overlap’
o ‘dp_no-crossing-lines’
- Interacts with:
o ‘par_z-spacing’
6.1.2.2. Parameter: z-axis Spacing
The z-axis spacing is a parameter that directly interacts with the projection an-
gle, i.e., the parameter specifies a default depth difference for parallel nodes.
Parameter Characteristics:
- Parameter name: ‘par_z-spacing’
- Value range: 50 – 250 pixels
- Interval: 5 pixels
- Affected design principle(s):
o ‘dp_constant-depth-diff’
- Interacts with:
o ‘par_projection-angle’
6.1.2.3. Parameter: y-axis Spacing
The y-axis spacing affects the height distance between a mother and her daugh-
ter node(s) and thus also the equal height of sister nodes, i.e., the parameter
specifies a height difference between mother and daughter node(s).
Parameter Characteristics:
- Parameter name: ‘par_y-spacing’
- Value range: 50 – 150 pixels
- Interval: 5 pixels
- Affected design principle(s):
o ‘dp_constant-height-diff’
o ‘dp_equal-sister-height’
6.1.2.4. Parameter: x-axis Spacing
The x-spacing affects the length of sister lines and the binary placement of
mother nodes, i.e., the parameter specifies a distance between nodes along the x-
axis. However, the default x-axis spacing only defines the spacing between two
adjacent nodes along the x-axis that have no further widening leaves (see formal
principles in Appendix B).
Parameter Characteristics:
- Parameter name: ‘par_x-spacing’
- Value range: 50 – 150 pixels
- Interval: 5 pixels
- Affected design principle(s):
o ‘dp_equal-sister-line-length’
o ‘dp_binary-mother’
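The four parameters above can be collected in a single table from which the number of target configurations follows directly. The Python sketch below assumes that both endpoints of every value range are included, which is consistent with the cardinality of 21 given earlier for a 50 – 150 pixel range with an interval of 5; the dictionary layout is an assumption.

```python
# Value ranges of the four projection parameters (upper bounds inclusive).
PARAMETERS = {
    "par_projection-angle": range(-45, 46, 1),  # degrees
    "par_z-spacing": range(50, 251, 5),         # pixels
    "par_y-spacing": range(50, 151, 5),         # pixels
    "par_x-spacing": range(50, 151, 5),         # pixels
}

# Cardinality of each value range.
cardinalities = {name: len(values) for name, values in PARAMETERS.items()}

# Number of target configurations: the product of the cardinalities.
n_tc = 1
for size in cardinalities.values():
    n_tc *= size
```

Under that assumption the ranges contain 91, 41, 21 and 21 values, which gives 1,645,371 target configurations.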
6.1.3. Set of Constraints
The output of the generator, i.e., the set of candidate displays, will be the input
for the evaluator (see Figure 4). The evaluator will rank each combination of
candidate display (i.e., each candidate configuration) and target configuration,
based on the amount of violation of the design principles that constitute the set
of constraints.
However, not all of the design principles that constitute the set of con-
straints are of equal importance, i.e., the violation of some principles is a heav-
ier violation than the violation of others. In order to account for this discrep-
ancy, each principle will be given a weight factor. Table 4 provides an overview
of the design principles, i.e., the constraints, and their weight factors. The
weight factors in Table 4 are chosen intuitively. They should, however, be em-
pirically verified and if necessary adjusted, i.e., for better and more accurate re-
sults.
Furthermore, due to the similarity of the ‘dp_all-visible’ and the ‘dp_no-
text-overlap’ principles, only the ‘dp_all-visible’ principle is represented in the
set of constraints.
DESIGN PRINCIPLES WITH WEIGHT FACTORS
Nodes
    Identifier                     Weight factor
    dp_all-visible                 2.00
    dp_constant-height-diff        0.75
    dp_constant-depth-diff         0.75
    dp_equal-sister-height         0.90
    dp_binary-mother               0.25
Text labels
    Identifier                     Weight factor
    dp_no-text-overlap             2.00
Lines
    Identifier                     Weight factor
    dp_equal-sister-line-length    0.50
    dp_no-crossing-lines           0.50
Angle of Projection
    Identifier                     Weight factor
    dp_minimal-angle               0.50
Table 4. Design principles concerning nodes, text labels, lines and the angle of
projection, with their corresponding weight factors. The design principles
constitute the set of constraints.
6.1.4. Evaluator
As stated in the previous section, the output of the generator, i.e., the set of can-
didate displays (i.e., the set of candidate configurations), will be the input for
the evaluator (see Figure 4).
The evaluator will compute a score for each design principle for each
combination of candidate display and target configuration. This score is based on
the amount of realization of the design principle in the candidate display with
respect to the target configuration; the amount of realization is the complement
of the amount of violation, see (26). The amount of realization multiplied by the
relevant weight factor constitutes the score for a design principle for a specific
combination of candidate display and target configuration, see (27).
(26) $R = 1 - V$,
Where R is the amount of realization of a specific design principle and V
is the amount of violation measured by the appropriate metric.
(27) $S = R \times W$,
Where S is the score for a design principle for a specific combination of
candidate display and target configuration, R is the amount of realization
of the principle in the candidate display based on the target configuration
and W is the relevant weight factor for the principle, see Table 4.
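Equations (26) and (27) translate into a one-line computation. The Python sketch below uses the weight factors of Table 4; the function name `score` and the range check are assumptions added for the illustration.

```python
# Weight factors as listed in Table 4.
WEIGHTS = {
    "dp_all-visible": 2.00,
    "dp_constant-height-diff": 0.75,
    "dp_constant-depth-diff": 0.75,
    "dp_equal-sister-height": 0.90,
    "dp_binary-mother": 0.25,
    "dp_no-text-overlap": 2.00,
    "dp_equal-sister-line-length": 0.50,
    "dp_no-crossing-lines": 0.50,
    "dp_minimal-angle": 0.50,
}


def score(principle: str, violation: float) -> float:
    """Score of one design principle: S = (1 - V) * W."""
    if not 0.0 <= violation <= 1.0:
        raise ValueError("the amount of violation must lie between 0 and 1")
    realization = 1.0 - violation           # equation (26)
    return realization * WEIGHTS[principle]  # equation (27)


quarter_occluded = score("dp_all-visible", 0.25)
```

A fully violated principle therefore contributes a score of 0, while a fully realized principle contributes its full weight factor.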
The computation of a score for each design principle for each combination
of candidate display and target configuration results in a matrix which I call the
Optimality Theory Matrix (OTM), see Table 5. The columns of the OTM
represent the constraints and the rows represent the combinations. The number of
combinations can be computed as in (28). The algorithm of evaluation is
formalized in (29), using the pseudo code notation of Brookshear (2005).
(28) $N_{combinations} = N_{cc} \times N_{tc}$
OPTIMALITY THEORY MATRIX
                 dp_all-visible   dp_constant-height-diff   …   dp_no-crossing-lines
combination_1    1.0              0.6                       …   0.2
combination_2    0.9              0.5                       …   0.1
combination_3    0.8              0.4                       …   0.0
…                …                …                         …   …
combination_n    0.7              0.3                       …   1.0
Table 5. Optimality Theory Matrix (OTM) with example values.
(29) PROCEDURE EVALUATE(CANDIDATES[], CONSTRAINTS[], PARAMETERS[])
         ; number of candidate configurations
         DEFINE Ncc ← CARDINALITY(CANDIDATES[]);
         ; number of constraints
         DEFINE Nconstraints ← CARDINALITY(CONSTRAINTS[]);
         ; number of parameters
         DEFINE Nparameters ← CARDINALITY(PARAMETERS[]);

         ; compute number of target configurations (a product, so it starts at 1)
         DEFINE Ntc ← 1;
         FOREACH parameter[] IN PARAMETERS[]
             Ntc ← Ntc * CARDINALITY(RANGE(parameter[]));

         ; target configurations
         DEFINE T_CONFIGURATIONS[Ntc];
         ; single target configuration
         DEFINE t_configuration[Nparameters];

         ; compute parametrical configurations
         FOREACH i IN [0...Ntc]
             FOREACH j IN [0...Nparameters]
                 DEFINE parameter[] ← NULL;
                 DEFINE z ← i;
                 ; integer division by the cardinalities of the preceding parameters
                 FOREACH x IN [0...j]
                     parameter[] ← PARAMETERS[x];
                     z ← z / CARDINALITY(RANGE(parameter[]));
                 parameter[] ← PARAMETERS[j];
                 z ← z MOD CARDINALITY(RANGE(parameter[]));
                 t_configuration[j] ← parameter[z];
             T_CONFIGURATIONS[i] ← t_configuration[];

         ; optimality theory matrix
         DEFINE OTM[Ncc * Ntc][Nconstraints];

         ; evaluate the constraints per combination
         DEFINE r ← 0;
         FOREACH i IN [0...Ncc]
             DEFINE candidate ← CANDIDATES[i];
             FOREACH j IN [0...Ntc]
                 DEFINE t_configuration[] ← T_CONFIGURATIONS[j];
                 FOREACH x IN [0...Nconstraints]
                     DEFINE constraint ← CONSTRAINTS[x];
                     OTM[r][x] ← (1 - MEASURE(candidate, constraint, t_configuration[])) *
                                 WEIGHT(constraint);
                 r ← r + 1;

         RETURN OTM[][];
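The evaluation loop can be condensed considerably in a language with first-class functions. The Python sketch below fills an OTM row by row; the signature of `evaluate`, the representation of metrics as functions and the toy metric `spacing_violation` are assumptions chosen to keep the example small, and the candidates and targets are reduced to single numbers.

```python
from typing import Callable, Dict, List, Sequence

# A metric maps (candidate, target configuration) to a violation in [0, 1].
Metric = Callable[[float, float], float]


def evaluate(
    candidates: Sequence[float],
    targets: Sequence[float],
    metrics: Dict[str, Metric],
    weights: Dict[str, float],
) -> List[List[float]]:
    """Return the OTM: one row per (candidate, target) pair, one column per constraint."""
    otm = []
    for candidate in candidates:
        for target in targets:
            otm.append([
                (1.0 - metric(candidate, target)) * weights[name]
                for name, metric in metrics.items()
            ])
    return otm


def spacing_violation(candidate: float, target: float) -> float:
    """Toy metric: relative distance between candidate and target, capped at 1."""
    return min(abs(candidate - target) / target, 1.0)


otm = evaluate(
    candidates=[50, 100],
    targets=[100],
    metrics={"dp_constant-height-diff": spacing_violation},
    weights={"dp_constant-height-diff": 0.75},
)
```

The number of rows equals the number of candidates multiplied by the number of targets, in line with (28).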
As stated above, the evaluator evaluates each combination of candidate
display and target configuration, based on the amount of design principle viola-
tion. In the next sections, I will describe metrics for design principle violation
measurement. Each of these metrics will return the amount of violation as a
value in the continuous range of 0 – 1, i.e., 0% – 100% violation. Since these
metrics measure the amount of violation of design principles, there is a mapping
between metric and design principle. This mapping is given in Table 6. Due to
the similarity between the ‘dp_all-visible’ and ‘dp_no-text-overlap’ principles,
only the ‘dp_all-visible’ principle is represented in the set of constraints, see
§6.1.3.
DESIGN PRINCIPLES WITH WEIGHT FACTORS AND METRICS

Identifier                     Weight fac.   Metric

Nodes
dp_all-visible                 2.00          met_node-occlusion
dp_constant-height-diff        0.75          met_constant-height-diff
dp_constant-depth-diff         0.75          met_constant-depth-diff
dp_equal-sister-height         0.90          met_equal-sister-height
dp_binary-mother               0.25          met_binary-mother

Text labels
dp_no-text-overlap             2.00          met_node-occlusion

Lines
dp_equal-sister-line-length    0.50          met_equal-sister-line-length
dp_no-crossing-lines           0.50          met_crossing-lines

Angle of Projection
dp_minimal-angle               0.50          met_minimal-angle

Table 6. Design principles concerning nodes, text labels and lines, with their corresponding
weight factors and metrics. The design principles constitute the set of constraints.
6.1.4.1. Metric: Node Occlusion
Two of the most straightforward design principle violations are the violation of
the ‘dp_all-visible’ principle and the ‘dp_no-text-overlap’ principle. Due to their
similarity, these principles can be treated as a single principle.
One approach to measuring the amount of violation of these combined
principles is the computation of ‘hyperrectangle’ overlap. A ‘hyperrectangle’ or
‘minimum bounding box’ is the smallest rectangle that fully encloses a
graphical object, i.e., in this case a node in the three-dimensional syntax tree.
The amount of violation of the combined principles can be measured by comput-
ing the sum of the overlapping surfaces as a fraction of the sum of all the ‘hyper-
rectangle’ surfaces. For example, in a configuration of 4 nodes with each a 10 x
20 pixel ‘hyperrectangle’, the total sum of ‘hyperrectangle’ surfaces is 10 x 20 x 4
= 800 pixels. If there is a 5 x 5 pixel overlap and a 10 x 10 pixel overlap, the to-
tal sum of overlapping ‘hyperrectangle’ surfaces is (5 x 5) + (10 x 10) = 125 pix-
els. The amount of violation of the combined design principles can then be com-
puted as 125 / 800 = 0.15625 (= 15.625%).
Metric Characteristics:
- Metric name: ‘met_node-occlusion’
- Based on: hyperrectangle overlap
- Measured design principle(s):
o ‘dp_all-visible’
o ‘dp_no-text-overlap’
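The overlap computation described for this metric can be sketched as follows. This is a minimal sketch, assuming that each node has already been projected and is given as a hypothetical (x, y, width, height) bounding box; the function names and the clamping of the result to 1.0 (needed when many boxes overlap pairwise) are my own assumptions, not part of the metric as defined above.

```python
from itertools import combinations


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned boxes (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    width = min(ax + aw, bx + bw) - max(ax, bx)
    height = min(ay + ah, by + bh) - max(ay, by)
    return width * height if width > 0 and height > 0 else 0


def node_occlusion(boxes):
    """Summed pairwise overlap as a fraction of the summed box surfaces."""
    total = sum(w * h for _, _, w, h in boxes)
    if total == 0:
        return 0.0
    overlap = sum(overlap_area(a, b) for a, b in combinations(boxes, 2))
    return min(1.0, overlap / total)
```

For four 10 x 20 boxes with one 5 x 5 overlap and one 10 x 10 overlap, the sketch reproduces the worked example: 125 / 800 = 0.15625.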
6.1.4.2. Metric: Crossing Lines
Another common design principle violation is the violation of the
‘dp_no-crossing-lines’ principle, e.g., due to multidominance constructions. However, the
design of a metric for the amount of violation of this principle is less
straightforward than for node occlusion. A possible, but rather simplistic, approach is to
measure the amount of violation of this principle by computing the number of
crossed lines as a fraction of the total number of lines. For example, in a
configuration of 10 lines in which 1 line crosses 2 other lines, there are 3 crossed
lines, i.e., the two lines being crossed and the crossing line itself. The amount of
violation can then be computed as 3 / 10 = 0.3 (= 30%).
Metric Characteristics:
- Metric name: ‘met_crossing-lines’
- Based on: number of crossed lines
- Measured design principle(s):
o ‘dp_no-crossing-lines’
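The counting of crossed lines can be sketched as follows. This is a minimal sketch, assuming that each line is given as a pair of projected two-dimensional end points; the orientation-based intersection test is a standard technique that I have substituted here, and lines that merely share an end point (e.g., two branches leaving the same mother) are deliberately not counted as crossing.

```python
from itertools import combinations


def _orientation(p, q, r):
    """Sign of the cross product (q - p) x (r - p): -1, 0 or 1."""
    value = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (value > 0) - (value < 0)


def segments_cross(s, t):
    """True if the two segments intersect in a point interior to both."""
    (a, b), (c, d) = s, t
    return (_orientation(a, b, c) * _orientation(a, b, d) < 0 and
            _orientation(c, d, a) * _orientation(c, d, b) < 0)


def crossing_lines(lines):
    """Number of lines involved in at least one crossing / number of lines."""
    if not lines:
        return 0.0
    crossed = set()
    for i, j in combinations(range(len(lines)), 2):
        if segments_cross(lines[i], lines[j]):
            crossed.update((i, j))
    return len(crossed) / len(lines)
```

For 10 lines of which 1 crosses 2 others, the sketch counts 3 crossed lines and returns 3 / 10 = 0.3, as in the example above.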
6.1.4.3. Metric: Constant Height Difference
A further violation is the violation of the ‘dp_constant-height-diff’ principle.
Violation of this principle can be measured by computing weighted violations as
a fraction of the highest possible amount of violation. For each pair of adjacent
nodes along the y-axis, the distance along the y-axis should equal the target
value for y-axis spacing. Each pair of adjacent nodes along the y-axis for which
the distance along the y-axis differs from the target value for y-axis spacing
constitutes a violation. The size of the deviation from the target value affects the
weight of the violation, i.e., the larger the deviation the heavier the violation.
For example, in a perfectly balanced tree of 7 nodes without parallel nodes,
there are 6 pairs of adjacent nodes along the y-axis, i.e., 6 possible correct
realizations of the ‘dp_constant-height-diff’ principle. If the ‘par_y-spacing’
parameter is set to a target value of 110 pixels in a value range of 50 – 150 pixels,
the maximum amount of violation is the larger of the two distances from the target
value to the bounds of the range, i.e., |50 – 110| = 60 > |150 – 110| = 40. In other
words, the maximum amount of violation of the design principle is 60 pixels.
If there are 2 pairs of adjacent nodes along the y-axis for which the dis-
tance along the y-axis is unequal to the target value for y-axis spacing, the num-
ber of violations is 2. If one of these pairs has a distance of 90 pixels and the
other pair has a distance of 120 pixels, the amount of violation can be computed
as (|90 – 110| / 60 / 6) + (|120 – 110| / 60 / 6) = 0.0833 (= 8.33%). The division
by the total number of correct realizations – 6 – normalizes the violations.
Metric Characteristics:
- Metric name: ‘met_constant-height-diff’
- Based on: number of violations and the amount of each violation of
the target value for y-axis spacing
- Measured design principle(s):
o ‘dp_constant-height-diff’
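The normalization used by this metric can be sketched as follows. This is a minimal sketch with a hypothetical function name, assuming that the distances between all pairs of adjacent nodes along one axis have already been measured and lie within the value range of the spacing parameter; because nothing in it is specific to the y-axis, the same function would serve for spacing along any other axis.

```python
def spacing_violation(distances, target, lowest, highest):
    """Mean deviation from the target spacing, scaled by the largest
    deviation that the value range [lowest, highest] allows."""
    max_deviation = max(abs(lowest - target), abs(highest - target))
    if not distances or max_deviation == 0:
        return 0.0
    scaled = [abs(distance - target) / max_deviation for distance in distances]
    return sum(scaled) / len(distances)
```

With 6 pairs, a target value of 110 pixels in a range of 50 – 150 pixels, and two deviating distances of 90 and 120 pixels, the sketch yields (20 / 60 + 10 / 60) / 6, i.e., approximately 0.0833.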
6.1.4.4. Metric: Equal Sister Height
The metric for the violation of the ‘dp_equal-sister-height’ principle is quite
similar to the metric for the ‘dp_constant-height-diff’ principle. It differs in that
this metric focuses on the equal height of sister nodes. The amount of violation
can be computed by counting the pairs of sisters that are at an unequal
height as a fraction of the total number of sister relations. The inequality of the
sister height affects the weight of the violation, i.e., the larger the difference the
heavier the violation.
For example, in a configuration of a tree consisting of 7 perfectly balanced
nodes without parallel nodes, there are 3 sister couples. If the value range for
the ‘par_y-spacing’ parameter is set to 50 – 150 pixels, we can compute the
maximum amount of violation, i.e., |50 – 150| = 100. In other words, the maxi-
mum amount of violation of the design principle is 100 pixels, i.e., the maximum
height difference between two sisters is 100 pixels.
If there is 1 couple in which the sisters are at an unequal height, this
constitutes a violation. If one of these sisters is at a distance of 90 pixels from her
mother and the other sister is at a distance of 120 pixels, the height difference
between the sisters is |120 – 90| = 30 pixels. The amount of violation can now be
computed as (|120 – 90| / 100 / 3) = 0.1 (= 10%). The division by the total
number of correct realizations – 3 – normalizes the violation.
Metric Characteristics:
- Metric name: ‘met_equal-sister-height’
- Based on: number of sister pairs with nodes at an unequal height
and the amount of height difference.
- Measured design principle(s):
o ‘dp_equal-sister-height’
6.1.4.5. Metric: Constant Depth Difference
The ‘dp_constant-depth-diff’ principle is another design principle for which
violation can be measured. Measurement of the violation of this principle is quite
similar to the measurement of the ‘dp_constant-height-diff’ principle.
For each pair of adjacent nodes along the z-axis, the distance along the
z-axis should equal the target value for z-axis spacing. Each pair of adjacent nodes
along the z-axis for which the distance along the z-axis differs from the target
value for z-axis spacing constitutes a violation. The size of the deviation from the
target value affects the weight of the violation, i.e., the larger the deviation the
heavier the violation.
For example, in a configuration with 9 nodes in which 5 nodes are placed
parallel to 4 nodes, there are 4 pairs of adjacent nodes along the z-axis, i.e., 4
possible correct realizations of the design principle. If the ‘par_z-spacing’
parameter is set to a target value of 200 pixels in a value range of 50 – 250 pixels,
we can compute the maximum amount of violation, i.e., |50 – 200| = 150 > |250
– 200| = 50. In other words, the maximum amount of violation of the design
principle is 150 pixels.
If the distance along the z-axis for 2 pairs of these nodes is unequal to the
target value for z-axis spacing, the number of violations is 2. If the distance for
one of these pairs is 160 and the distance for the other pair is 210, the amount of
violation can be computed as (|160 – 200| / 150 / 4) + (|210 – 200| / 150 / 4) =
0.0833 (= 8.3%). The division by the total number of correct realizations – 4 –
normalizes the violations.
Metric Characteristics:
- Metric name: ‘met_constant-depth-diff’
- Based on: number of violations and the amount of each violation of
the target value for z-axis spacing
- Measured design principles:
o ‘dp_constant-depth-diff’
6.1.4.6. Metric: Binary Mother Placement
Each mother node that is not placed directly above her daughter or in the exact
middle of her daughters constitutes a violation.
A horizontal line from the left daughter to the right daughter, i.e., a line
from the x-coordinate of the left daughter to the x-coordinate of the right
daughter along the same y-coordinate, contains the exact middle of the daughters. A
line from the mother to this exact middle divides the horizontal line
into two parts, i.e., the part left of this line and the part right of it. Ideally,
this line is vertical.
The angle of the intersection point, i.e., the angle between the middle of
the horizontal line and the vertical line, should ideally be 90 degrees at both
sides, i.e., a right angle. The amount of difference from this 90 degrees affects
the amount of violation, i.e., the larger the difference the heavier the violation.
For example, in a configuration of a tree consisting of 7 perfectly balanced
nodes without parallel nodes, there are 3 mother nodes, i.e., 3 possible realiza-
tions of the design principle. If 1 of these 3 mother nodes is not placed in the ex-
act middle of her daughters, there is 1 violation. If the angle between the left
part of the horizontal line and the vertical line is 80 degrees, then the angle be-
tween the right part of the horizontal line and the vertical line must be 180 – 80
= 100 degrees. The amount of violation can then be computed by taking the dif-
ference between the ideal angle and the relevant angle, i.e., (|80 – 100| / 2) = 10
degrees (alternatively, |90 – 80| = 10 degrees or |90 – 100| = 10 degrees) and
dividing it by the ideal angle of 90 degrees and the number of possible correct
realizations. This results in (10 / 90 / 3) = 0.037 (= 3.7%) violation. The division
by the total number of correct realizations – 3 – normalizes the violation.
The computation of the amount of violation for mother nodes that only have
a single daughter is quite similar. The exact middle is straight below the mother
node, i.e., if we assume a horizontal line along the x-axis straight through the
daughter and a vertical line from mother to daughter, the angle between the
horizontal and vertical line should ideally be 90 degrees at both sides.
However, if the daughter is not directly below her mother, the angle
between the horizontal and vertical line will differ from 90 degrees. If, for example,
this angle is 120 degrees on the left side of the vertical line, and thus 180 – 120 =
60 degrees on the right side of the vertical line, the amount of violation can be
computed by taking the difference between the ideal angle and the relevant
angle, i.e., (|120 – 60| / 2) = 30 degrees (alternatively, |90 – 60| = 30 degrees or
|90 – 120| = 30 degrees) and dividing it by the ideal angle of 90 degrees and the
number of possible correct realizations. In a situation in which the number of
possible correct realizations is 3, this results in (30 / 90 / 3) = 0.111 (= 11.1%)
violation. The division by the total number of correct realizations – 3 –
normalizes the violation.
Metric Characteristics:
- Metric name: ‘met_binary-mother’
- Based on: number of mother nodes that are not binary placed and
the ideal placement angle.
- Measured design principles:
o ‘dp_binary-mother’
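The angle computation for this metric can be sketched as follows. This is a minimal sketch, assuming projected two-dimensional coordinates and hypothetical function names; instead of measuring the two angles at the intersection point, it directly computes how far the line from the mother to the middle of her daughters deviates from the vertical, which gives the same deviation in degrees. Taking the mean of the daughters' y-coordinates is my own assumption for the case in which the daughters are not at exactly the same height.

```python
import math


def mother_deviation(mother, daughters):
    """Degrees by which the line from the mother to the middle of her
    daughters deviates from the vertical (0 means ideal placement)."""
    mother_x, mother_y = mother
    middle_x = sum(x for x, _ in daughters) / len(daughters)
    middle_y = sum(y for _, y in daughters) / len(daughters)
    return math.degrees(math.atan2(abs(mother_x - middle_x),
                                   abs(mother_y - middle_y)))


def binary_mother(families):
    """families: list of (mother, [daughter, ...]) coordinate tuples.
    Each deviation is divided by the ideal angle of 90 degrees and the
    result is averaged over all mother nodes."""
    if not families:
        return 0.0
    return sum(mother_deviation(mother, daughters) / 90.0
               for mother, daughters in families) / len(families)
```

A single daughter is handled by the same code, since the middle of one daughter is the daughter herself.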
6.1.4.7. Metric: Equal Sister Line Length
The metric for the ‘dp_equal-sister-line-length’ design principle is quite
straightforward. Each pair of sisters that has an unequal line length constitutes a
violation. The amount of length difference affects the weight of the violation, i.e., the
larger the difference the heavier the violation.
For example, in a perfectly balanced tree of 7 nodes without parallel nodes,
there are 3 pairs of sisters. If the line length of one
pair of sisters is unequal, the number of violations is 1. If the length of the line
that connects the left daughter to her mother is 80 pixels and the length of the
line that connects the right daughter to her mother is 110 pixels, the difference
is |80 – 110| = 30 pixels. In other words, we could say that ideally the line
lengths should have been 80 + (30 / 2) = 95 pixels and 110 - (30 / 2) = 95 pixels.
The amount of violation can then be computed as (30 / 2 / 95 / 3) = 0.053 (=
5.3%). The division by the total number of correct realizations – 3 – normalizes
the violations.
Metric Characteristics:
- Metric name: ‘met_equal-sister-line-length’
- Based on: number of sister pairs with unequal line lengths and the
amount of line length differences.
- Measured design principles:
o ‘dp_equal-sister-line-length’
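The computation for this metric can be sketched as follows. This is a minimal sketch with a hypothetical function name, assuming that the lengths of the two lines of each sister pair have already been measured in the projection; each pair contributes half of its length difference as a fraction of the ideal (mean) length, and the sum is divided by the number of sister pairs.

```python
def equal_sister_line_length(pairs):
    """pairs: list of (left line length, right line length) per sister pair."""
    if not pairs:
        return 0.0
    total = 0.0
    for left, right in pairs:
        ideal = (left + right) / 2.0  # length both lines should ideally have
        if ideal > 0:
            total += (abs(left - right) / 2.0) / ideal
    return total / len(pairs)
```

For three sister pairs of which one has line lengths of 80 and 110 pixels, the sketch yields (15 / 95) / 3, i.e., approximately 0.053.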
6.1.4.8. Metric: Minimal Projection Angle
The last design principle for which the amount of violation can be measured is
the ‘dp_minimal-angle’ principle. Measuring this principle is quite straightfor-
ward. We want the angle of projection to be minimal, i.e., the larger the angle
the larger the violation.
If the parameter ‘par_projection-angle’ has a value range of -45 – +45 de-
grees, the maximum violation is 45 degrees in a certain direction. If a certain
projection has an angle of 20 degrees in some direction, the amount of violation
can be computed as (20 / 45) = 0.44 (= 44%).
Metric Characteristics:
- Metric name: ‘met_minimal-angle’
- Based on: the angle of projection.
- Measured design principles:
o ‘dp_minimal-angle’
6.1.5. Extraction of the Optimal Display
The last phase in the implementation of the OT-procedure is the extraction of
the optimal display from the OTM (see Figure 4). The optimal display is the can-
didate display in the combination of candidate display and target configuration
with the best overall score for all the design principles, i.e., the candidate dis-
play in the combination with the highest total score.
The total score for a combination of candidate display and target configura-
tion is the sum of its scores for the design principles, see (30). Finding the opti-
mal display in an OTM now equals finding the maximum total score, i.e., by it-
erative evaluation of all combinations of candidate displays and target configu-
rations. An algorithm for this is formally defined in (31), using the pseudo code
notation of Brookshear (2005).
(30) $T = \sum_{c \in C} S_c$,
where T represents the total score for a combination and where S_c
represents the score for constraint (design principle) c, which is a member of
the set of constraints C.
(31) PROCEDURE FIND_OPTIMAL(
         CANDIDATES[], CONSTRAINTS[], OTM[][], PARAMETERS[])
         DEFINE Ncandidates ← CARDINALITY(CANDIDATES[]);
         DEFINE Nconstraints ← CARDINALITY(CONSTRAINTS[]);
         ; compute number of target configurations
         ; (the product starts at 1; starting at 0 would keep it at 0)
         DEFINE Ntc ← 1;
         FOREACH parameter[] IN PARAMETERS[]
             Ntc ← Ntc * CARDINALITY(RANGE(parameter[]));
         ; optimal candidate
         DEFINE optimal_candidate ← NULL;
         ; optimal total score
         DEFINE optimal_total ← 0;
         ; extract optimal candidate
         DEFINE r ← 0;
         FOREACH i IN [0...Ncandidates]
             DEFINE candidate ← CANDIDATES[i];
             FOREACH j IN [0...Ntc]
                 ; total score of row r of the OTM
                 DEFINE T ← 0;
                 FOREACH x IN [0...Nconstraints]
                     T ← T + OTM[r, x];
                 IF (T > optimal_total)
                     optimal_total ← T;
                     optimal_candidate ← candidate;
                 r ← r + 1;
         RETURN optimal_candidate;
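The extraction in (31) can be sketched in Python as follows. This is a minimal sketch under assumptions: the OTM is a list of rows as in formula (30), the number of target configurations is passed in directly (a hypothetical simplification of the PARAMETERS[] argument), and rows are ordered per candidate as in (29), so that row r belongs to candidate r // n_target_configurations. Unlike (31), the sketch also returns the winning total score.

```python
def find_optimal(candidates, otm, n_target_configurations):
    """Return (candidate, total score) for the OTM row with the highest sum.
    Ties are resolved in favour of the first row, as with the strict '>'
    comparison in the pseudo code."""
    if not otm:
        return None, 0.0
    totals = [sum(row) for row in otm]
    best_row = max(range(len(otm)), key=lambda r: totals[r])
    return candidates[best_row // n_target_configurations], totals[best_row]
```

One difference from (31) is that an OTM containing only zero scores returns the first candidate here, whereas the pseudo code would return NULL.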
6.2. Overview of the OT-procedure
In the previous sections, I described the workings of the separate components of
the OT-procedure in detail. In Figure 4, I provided an overview of the cohesion of
these components.
The main flow of operation of the OT-procedure starts with a sequence of
syntactic structure building operations, i.e., a sequence of Merge operations.
These Merge operations will be parsed into an SVG DOM by the syntax parser.
Based on this SVG DOM and a set of parameters, the generator will generate a
set of candidate displays, in which each candidate display has a unique candi-
date configuration. The evaluator will evaluate each of these candidate displays
against each of the target configurations, based on the amounts of violation of
the design principles that constitute the set of constraints. This results in an
OTM. Finally, the optimal display will be extracted from this OTM. The flow of
operation is formally defined in (32), using the pseudo code notation of Brook-
shear (2005).
(32) ; predefined set of parameters
     IMPORT PARAMETERS[];
     ; predefined set of constraints
     IMPORT CONSTRAINTS[];
     PROCEDURE OT(MERGE_OPERATIONS[])
         DEFINE svgdom ← NULL;
         DEFINE CANDIDATES[] ← NULL;
         DEFINE OTM[][] ← NULL;
         DEFINE optimal_display ← NULL;
         ; parse the Merge operations into an SVG DOM
         svgdom ← PARSE(MERGE_OPERATIONS[]);
         ; generate the candidate displays
         CANDIDATES[] ← GENERATE(svgdom, PARAMETERS[]);
         ; evaluate the candidates against the target configurations
         OTM[][] ← EVALUATE(CANDIDATES[], CONSTRAINTS[], PARAMETERS[]);
         ; extract the optimal display
         optimal_display ← FIND_OPTIMAL(
             CANDIDATES[],
             CONSTRAINTS[],
             OTM[][],
             PARAMETERS[]);
         RETURN optimal_display;
7. Conclusions
In the linguistic literature, the increase in interest in paratactic phenomena,
like parenthesis and coordination, has led to the development of theories that
imply a third asymmetrical relation besides dominance and precedence. This
third asymmetrical relation is non-subordination or ‘behindance’.
The extension of syntax with a third asymmetrical relation gives rise to
representational problems for the traditional two-dimensional syntax
tree, i.e., there is no way to express the non-subordination or ‘behindance’
relation. This implies a modification of traditional syntax trees in order to
accommodate this relation. Although there are several possibilities to accommodate the
non-subordination or ‘behindance’ relation, I focused on one specific approach,
i.e., a three-dimensional version of the traditional two-dimensional syntax tree.
Besides paratactic phenomena, I also described hypotactic phenomena that
give rise to multidominance constructions, i.e., constructions in which constitu-
ent or non-constituent parts of a sentence are shared. In both two-dimensional
and three-dimensional syntax trees, these constructions might give rise to cross-
ing lines.
The construction of a representation of a three-dimensional syntax tree,
with or without multidominance constructions, is a complex
process. In this thesis, I focused on the representation of such trees in scientific
articles and books. Hence, the research question of this thesis is defined as in (1)
and is repeated for convenience in (33).
(33) How can one automatically compute and render an optimal three-
dimensional syntax tree diagram for non-interactive printable media?
The non-interactive printable media in (33) refer to scientific articles and
books. These media consist of two-dimensional planes, e.g., a page in a book.
Since the three-dimensional syntax tree consists of three dimensions, the points
of this three-dimensional graphical image need to be mapped onto two
dimensions. This mapping is called a three-dimensional projection. Such a
projection is parametrical, e.g., the angle of projection can vary.
The fact that a three-dimensional projection is parametrical suggests that
an optimal projection can be found through parametrical variation. The optimality
of a three-dimensional syntax tree projection depends on the amount of clutter in
the image. The amount of clutter depends on the violation of the design
principles for such a projection.
After providing detailed diagram theoretic structural analyses of a three-
dimensional syntax tree and a three-dimensional syntax tree that allows multi-
dominance, I defined a set of design principles that constitutes the foundation of
an optimal three-dimensional syntax tree projection. The amount of violation of
these design principles can vary due to variation in the parametrical configura-
tion for a specific projection. Furthermore, I defined a set of parameters that af-
fect the display attributes of three-dimensional syntax tree projections and met-
rics to measure the amount of design principle violation within a projection.
An optimal three-dimensional syntax tree projection can now be found by
computing each possible projection based on parametrical variation, and by
measuring the amount of design principle violation in each of these possible
projections; the projection with the lowest amount of design principle violation is
the optimal projection.
In order to provide an answer to the research question in (33), this process
needs to be transformed into an automatic process. I proposed a mechanism for
this in terms of Optimality Theory (OT), which I adapted from the field of pho-
nology. More concretely, I proposed a post-derivational OT-procedure.
This OT-procedure consists of three main components: a generator, an
evaluator, and a set of constraints. From the perspective of OT, the generator
takes a sequence of syntactic structure building operations as input and gener-
ates a set of candidates containing each possible candidate projection, based on
parametrical variation. The sequence of syntactic structure building operations
is a sequence of Merge operations that define a complex syntactic structure. The
evaluator evaluates and ranks each of the candidate projections based on a set of
constraints. This set of constraints consists of the design principles. The highest
ranked candidate is the optimal projection.
In conclusion, I have answered the research question by developing a post-
derivational OT-procedure that computes an optimal diagram based on variation
in parameters that affect the display attributes of a three-dimensional syntax
tree projection.
The solution in terms of Optimality Theory is an elegant solution to the
research question. It is, however, not the only possible solution, i.e., instead of
adapting a mechanism from the field of phonology, statistical methods from the
fields of information retrieval or artificial intelligence could have been used.
The actual advantages and disadvantages of each of the possible
models will have to be studied and empirically verified in future research.
8. Further work
Perhaps the most straightforward suggestion for further work is the actual im-
plementation of the proposed OT-procedure. In fact, an actual implementation is
what gives rise to possible further work from the perspective of algorithms and
usability.
The first suggestion is to empirically test the weight factors for the design
principles in the set of constraints. The weight factors in this thesis are intui-
tively chosen, but adjustments based on empirical analysis might lead to more
accurate results. From the perspective of information science and artificial intel-
ligence, the weight factors can also be trained by applying machine learning
techniques, i.e., the weight factors can be statistically justified.
With respect to the algorithmic design of the implementation, I suggest
investigating ways to reduce the number of candidates. The number of candidate
projections increases enormously as a result of an increasing number of
parameters and their possible variations within a projection. A possible approach to the
reduction of candidates is to search for the optimal projection, rather than to
iteratively generate all candidates and evaluate them afterwards, e.g., by applying
an adapted binary search algorithm. Another speed-increasing modification is to
stop evaluating a candidate if it can no longer become an optimal candidate.
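The last suggestion, stopping the evaluation of a candidate as soon as it can no longer become optimal, can be sketched as follows. This is a minimal sketch of one possible realization, not a worked-out proposal: it assumes non-negative weight factors, hypothetical (measure, weight) constraint pairs in which measure returns a violation in the range 0 – 1, and an evaluation order from the heaviest to the lightest constraint. The score that a combination can still gain is bounded by the sum of the weights of the constraints that have not been evaluated yet.

```python
def find_optimal_pruned(pairs, constraints):
    """pairs: iterable of (candidate, target configuration).
    Returns (best candidate, best score, number of measure evaluations)."""
    ordered = sorted(constraints, key=lambda c: c[1], reverse=True)
    # remaining[k]: maximum score still attainable from constraint k onwards
    remaining = [0.0] * (len(ordered) + 1)
    for k in range(len(ordered) - 1, -1, -1):
        remaining[k] = remaining[k + 1] + ordered[k][1]
    best, best_score, evaluations = None, float("-inf"), 0
    for candidate, t_configuration in pairs:
        score = 0.0
        for k, (measure, weight) in enumerate(ordered):
            if score + remaining[k] <= best_score:
                break  # this combination cannot beat the current best
            score += (1.0 - measure(candidate, t_configuration)) * weight
            evaluations += 1
        else:
            # all constraints were evaluated without pruning
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score, evaluations
```

Whether such pruning pays off in practice depends on how expensive the individual metrics are and on how early a good candidate is encountered; this would have to be verified empirically.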
As stated in the conclusions, the OT-procedure might be replaced by
statistical models from the fields of information retrieval or artificial intelligence. The
question whether different statistical models will actually affect the speed or
effectiveness of the implementation needs to be empirically verified. It might well
be that the biggest speed-increasing factor is the efficiency of the algorithms
that constitute the foundations of these models.
Furthermore, the output of the implementation, i.e., the three-dimensional
syntax tree diagrams, can be subjected to usability studies. A flexible implemen-
tation might be highly configurable and thus facilitate the generation of three-
dimensional syntax tree diagrams with different characteristics. These usability
studies might provide insight into what makes three-dimensional syntax trees
effective and accurate. This insight may lead to the development of additional and
more specific design principles and parameters. Consequently, this will lead to
the development of additional design principle violation metrics or the improve-
ment of the accuracy of the proposed metrics.
Acknowledgements
I would like to take this opportunity to thank my thesis advisors Leonie Bosveld
and Mark de Vries for their help, criticism, ideas and time. Furthermore, I would
like to thank Mark de Vries for introducing me to the subject of this thesis and
John Nerbonne for allowing me to write a thesis apart from the general proce-
dure.
References
Broekhuis, H. & Dekkers, J. (2000). The Minimalist Program and Optimality Theory: Deri-
vations and Evaluation. In J. Dekkers, F. Van der Leeuw & J. Van de Weijer (Eds.),
Optimality Theory: Phonology, Syntax and Acquisition. Oxford: Oxford University
Press.
Brookshear, J. G. (2005). Computer Science: An Overview. Pearson Education, Inc.
Card, S. K., Mackinlay, J. & Shneiderman, B. (1999). Information Visualization, Using Vi-
sion to Think. Morgan Kaufmann.
Chomsky, N. (1957). Syntactic Structures. The Hague: Mouton.
Chomsky, N. (1973). Conditions on Transformations. In S. R. Anderson & P. Kiparsky
(Eds.), A Festschrift for Morris Halle. New York: Reinhart and Winston.
Chomsky, N. (1981). Lectures on Government & Binding. Dordrecht: Foris.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Citko, B. (2005). On the Nature of Merge: External Merge, Internal Merge, and Parallel
Merge. Brandeis University.
Espinal, M. T. (1991). The Representation of Disjunct Constituents. Language, 67, 726-762.
Gärtner, H.-M. (2002). Generalized Transformations and Beyond: Reflections on Minimalist
Syntax. Berlin: Akademie Verlag.
Goodall, G. (1987). Parallel structures in syntax: Coordination, causatives and restructur-
ing. Cambridge: Cambridge University Press.
Grootveld, M. (1994). Parsing Coordination generatively. PhD Dissertation: Leiden Uni-
versity.
Guimarães, M. (2004). Derivation and Representation of Syntactic Amalgams. Doctoral
dissertation: University of Maryland.
Johannessen, J. (1998). Coordination. Oxford: Oxford University Press.
Kluck, M. (2007). The perspective of external remerge on Right Node Raising. In Proceed-
ings of CamLing, 130-137.
Lakoff, G. (1974). Syntactic Amalgams. In M. Galy, R. Fox & A. Bruck (Eds.), Papers from
the 10th meeting of the Chicago Linguistic Society.
Lloyd, N. (2005). Clutter Measurement and Reduction for Enhanced Information
Visualization.
McCawley, J. (1982). Parentheticals and discontinuous constituent structure. Linguistic
Inquiry, 13, 91-106.
Novick, L. R. & Hurley, S. M. (2001). To Matrix, Network, or Hierarchy: That Is The Ques-
tion. Cognitive Psychology, 42, 158-216.
Prince, A. & Smolensky, P. (1993). Optimality Theory: Constraint Interaction in Genera-
tive Grammar. Rutgers University Center for Cognitive Science and Computer Sci-
ence Department, University of Colorado at Boulder.
van Riemsdijk, H. (1998). Trees and Scions - Science and trees. Chomsky 70th Birthday
Celebration Fest-Web-Page.
van Riemsdijk, H. (2006). Grafts Follow from Merge. In M. Frascarelli (Ed.), Phases of In-
terpretation. Berlin: Mouton de Gruyter.
Sampson, G. (1975). The Single Mother Condition. Journal of Linguistics, 11, 1-11.
Tufte, E. R. (1990). Envisioning Information. Cheshire, CT: Graphics Press.
de Vries, M. (2005a). Coordination and Syntactic Hierarchy. Studia Linguistica, 59, 83-105.
de Vries, M. (2005b). Ellipsis in nevenschikking: voorwaarts deleren maar achterwaarts
delen. TABU, 34, 13-46.
de Vries, M. (2007a). Internal and External Remerge: On Movement, Multidominance, and
the Linearization of Syntactic Objects. Manuscript, University of Groningen,
Revised version, August 2007. [under preview].
de Vries, M. (2007b). Invisible Constituents? Parentheses as B-Merged Adverbial Phrases.
In N. Dehé & Y. Kavalova (Eds.), Parentheticals. Amsterdam: John Benjamins.
Appendix A: Structural Analysis
Novick & Hurley (2001) define ten properties on which matrices, networks and
hierarchies are hypothesized to differ. They organize these ten properties into
three groups: ‘general structure’, ‘detailed information about items and links’ and
‘potential movement’. These groups respectively define the foundations of a dia-
gram, details on the linking of data within a diagram and movement from one
data point to another within a diagram.
I will discuss the characteristics of the modified syntax tree in terms of
these three groups of properties and I do this from a graphical and a syntactic
perspective, since these may differ in compatibility with certain properties. The
traditional syntax tree is modified in two ways. The first modification is an addi-
tional dimension. The second modification is the allowance of multidominance
constructions. I address the characteristics of these modifications separately,
because there may be differences in compatibility with certain properties at this
level also.
I. General Structure
The general structure of the three spatial diagrams is defined by four properties:
‘global structure’, ‘building block’, ‘number of sets’ and ‘item/link constraints’
(see Table I).
Global structure. From both a graphical and a syntactic perspective, the
global structure of the three-dimensional syntax tree is compatible with hierar-
chies and networks, but incompatible with matrices. It is incompatible with ma-
trices, because it does not express a factorial combination of possibilities. It is,
however, compatible with networks, because networks do not have a predefined
formal structure. It is also compatible with hierarchies, because it is organized
into levels, beginning with a single root node that branches out to subsequent
levels such that the identities of the nodes at one level depend on the identities
of the nodes at the preceding level. Although the three-dimensional syntax tree
is compatible with both networks and hierarchies, a hierarchy defines its charac-
teristics in more detail.
From both a graphical and a syntactic perspective, the global structure of
two-dimensional and three-dimensional syntax trees that allow multidominance
is compatible with networks, but incompatible with matrices and hierarchies. It
is incompatible with matrices because it does not express a factorial combination
of possibilities. It is also incompatible with hierarchies, because ‘branching out’
suggests substitution of a single node by multiple nodes at a lower level.
Multidominance constructions, in contrast, involve substitution of multiple nodes
by a single node at a lower level. It is, however, compatible with networks,
because networks don’t have any predefined formal structure.
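The contrast between ‘branching out’ and multidominance can be made concrete
with a minimal sketch. It assumes, purely for illustration, that a tree is given
as a list of (mother, daughter) links; the function name shared_nodes and the
node labels are hypothetical and do not come from the analysis itself. A node
with more than one mother marks a point where multiple nodes are replaced by a
single node at a lower level.

```python
from collections import defaultdict


def shared_nodes(links):
    """Return {node: sorted mothers} for every node with more than one mother.

    `links` is an iterable of (mother, daughter) pairs (an assumed encoding).
    """
    mothers = defaultdict(set)
    for mother, daughter in links:
        mothers[daughter].add(mother)
    return {node: sorted(ms) for node, ms in mothers.items() if len(ms) > 1}


# Hypothetical example: D is a daughter of both B and C.
plain = [("A", "B"), ("A", "C")]
multidominance = plain + [("B", "D"), ("C", "D")]

print(shared_nodes(plain))           # {}
print(shared_nodes(multidominance))  # {'D': ['B', 'C']}
```

An empty result means that every node has at most one mother, which is the
pattern that ‘branching out’ presupposes.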
Building block. From a syntactic perspective, the building block of a
three-dimensional syntax tree is compatible with networks and hierarchies, but
incompatible with matrices. It is incompatible with matrices, because there are no
cells or boxes. It is, however, compatible with networks, since there are two
nodes that have a directional link between them. It is also compatible with
hierarchies, because a single node gives rise to at least two other nodes, i.e., in the
context of X-bar theory².
GENERAL STRUCTURE
Global Structure
Matrix All the values of one variable have the values of another
variable in common (i.e., the representation expresses a
factorial combination of properties).
Network The representation does not have any predefined formal
structure, and it does not necessarily have a unique starting
or ending node.
Hierarchy The representation is organized into levels, beginning with a
single root node (usually located at the top or right) that
branches out to subsequent levels such that the identities of
the nodes at one level depend on the identities of the nodes at
a preceding level.
Building Block
Matrix A cell/box denoting the intersection or combination of a value
i on one variable and value j on the other variable.
Network Two nodes and a (directional or non-directional) link between
them.
Hierarchy A single node that gives rise to at least two other nodes, or at
least two nodes that are narrowed down to a single node, but
not both (i.e., three nodes and two directional links
connecting them, arranged as a ‘V’ in some orientation).
Number of Sets
Matrix The rows and columns specify values along two distinct
variables.
Network The nodes specify values along a single variable.
Hierarchy This representation does not naturally suggest that the nodes
are arranged into a particular number or configuration of
groups.
Item/Link Constraints
Matrix Values on the same dimension (i.e., same row or same
column) may not be linked.
Network Any node may be linked to any other node (i.e., there are no
constraints).
Hierarchy There may not be (direct) links between nodes at the same
level or between nodes in non-adjacent levels.
Table I. Properties Related to the General Structure of the Three Spatial Diagrams. Reconstructed from
Novick & Hurley (2001).
From a graphical perspective, the building block of a three-dimensional
syntax tree is compatible with networks, but incompatible with hierarchies and
matrices. It is incompatible with matrices, because there are no cells or boxes. It
is also incompatible with hierarchies, because graphically it is possible to have a
single node that gives rise to only a single node. It is, however, compatible with
networks because there are two nodes that have a directional link between them.
From both a syntactic and a graphical perspective, the building block of a
two-dimensional or three-dimensional syntax tree that allows multidominance
is compatible with networks, but incompatible with hierarchies and matrices. It
is incompatible with matrices, because there are no cells or boxes. It is also
incompatible with hierarchies, because in a multidominance construction, two nodes
can be narrowed down to a single node that gives rise to two other nodes, e.g., in
the case of constituent sharing. It is, however, compatible with networks,
because there are two nodes that have a directional link between them.
Number of sets. Neither a traditional syntax tree nor a three-dimensional
syntax tree arranges nodes into a particular number or configuration of
groups. From both a syntactic and a graphical perspective, the number of sets of
a three-dimensional syntax tree is therefore compatible with hierarchies, but
incompatible with matrices and networks.
The same conclusion holds for two-dimensional and three-dimensional syn-
tax trees that allow multidominance.
Item/Link constraints. From a graphical perspective, the item/link con-
straints of a three-dimensional syntax tree are compatible with hierarchies, but
incompatible with matrices and networks. They are incompatible with matrices,
because values on the same dimension may be linked in a three-dimensional syn-
tax tree. They are also incompatible with networks, because items on the same
level may not be linked in the tree and networks have no constraints. They are,
however, compatible with hierarchies, because hierarchies restrict linkage be-
tween nodes at the same level or nodes at a non-adjacent level.
From a syntactic perspective, however, the interpretation of the term ‘level’
determines the compatibility with hierarchies. Parallel nodes may be linked in a
three-dimensional syntax tree, but syntactically these parallel nodes are on the
same level, so such links violate the item/link constraints of hierarchies. In
conclusion, the item/link constraints of the three-dimensional syntax tree are
incompatible with each of the three spatial diagrams from a syntactic
perspective.
From both a syntactic and a graphical perspective, the item/link constraints
of two-dimensional and three-dimensional syntax trees that allow multidominance
are incompatible with each of the three spatial diagrams. They are
incompatible with matrices, because items on the same dimension may be linked
in a syntax tree. They are also incompatible with networks, because there are
constraints in syntax trees, e.g., two sisters may not be linked. Finally, they are
also incompatible with hierarchies because nodes at non-adjacent levels may be
linked in syntax trees.
² In the context of X-bar theory, certain elements may be null, e.g., a specifier. However, a null
specifier is still represented by the X-bar tree.
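As a rough illustration of the hierarchy item/link constraints in Table I, the
sketch below reports every link that joins two nodes on the same level or on
non-adjacent levels. The encoding (a dictionary of level numbers plus a list of
links), the function name and the node labels are assumptions made for this
example only.

```python
def violating_links(levels, links):
    """Return the links that break the hierarchy item/link constraints.

    `levels` maps each node to its level number; `links` is a list of node
    pairs. A link is allowed only between nodes on adjacent levels.
    """
    return [(a, b) for a, b in links if abs(levels[a] - levels[b]) != 1]


# Hypothetical diagram with one same-level link and one level-skipping link.
levels = {"A": 0, "B": 1, "C": 1, "D": 2}
links = [("A", "B"), ("A", "C"), ("B", "D"), ("B", "C"), ("A", "D")]

print(violating_links(levels, links))  # [('B', 'C'), ('A', 'D')]
```

Under this reading, a link between parallel nodes counts as a violation only
when the two nodes are assigned the same level, which is exactly where the
graphical and the syntactic interpretation of ‘level’ come apart.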
II. Detailed Information about Items and Links
The detailed information about items and links in the three spatial diagrams is
defined by three properties: ‘item distinguishability’, ‘link type’ and ‘absence of a
relation’ (see Table II).
Item distinguishability. The item distinguishability property comprises the
difference in status of nodes or rows. In matrices all rows and in networks all
nodes have identical status. In contrast, in a hierarchy nodes on the same level
have identical status, but nodes at a different level differ in status. The defini-
tion of status depends on the represented world that is mapped onto the hierar-
chy form (Novick & Hurley, 2001).
DETAILED INFORMATION ABOUT ITEMS AND LINKS
Item Distinguishability
Matrix All of the rows have identical status (i.e., are indistinguish-
able except by name), as do all of the columns.
Network All of the nodes have identical status (i.e., are indistinguish-
able except by name).
Hierarchy The nodes at a given level have identical status, but the
nodes at different levels differ in status.
Link Type
Matrix In general, the links between row and column values are
purely associative (i.e., they are non-directional).
Network The links between nodes may be associative (i.e., non-
directional), unidirectional, or bidirectional.
Hierarchy The links between nodes are directional such that processing
flows from one end of the representation to the other.
Absence of a Relation
Matrix The absence of a link between a row value and a column
value typically is indicated explicitly in the representation by
placing a special mark (e.g., an ‘X’) in the relevant cell.
Network The absence of a link between two nodes must be computed in
all cases because there are no constraints on which nodes
may be linked.
Hierarchy The absence of a link between two nodes generally is indi-
cated implicitly due to constraints on which nodes may be
linked, but it must be computed for non-linked nodes in adja-
cent levels.
Table II. Properties providing Detailed Information about the Items and Links in the Three Spatial Dia-
grams. Reconstructed from Novick & Hurley (2001).
From a syntactic perspective, status refers to levels of inclusiveness, i.e.,
subordination. From a graphical perspective, status refers to positioning on the
y-axis. From both a syntactic and a graphical perspective, the item distinguisha-
bility of a three-dimensional syntax tree is compatible with hierarchies, but in-
compatible with matrices and networks. It is incompatible with matrices and
networks because not all nodes or rows have identical status. It is compatible
with hierarchies because nodes at a given level have identical status, but nodes
at different levels differ in status.
The same conclusion holds for two-dimensional and three-dimensional syn-
tax trees that allow multidominance.
Link type. From both a syntactic and a graphical perspective, the link type
for a three-dimensional syntax tree is compatible with networks and hierarchies,
but incompatible with matrices. It is incompatible with matrices because the links
are not purely associative. It is, however, compatible with networks because they
allow links to be unidirectional. It is also compatible with hierarchies because
links between nodes are directional in such a way that processing flows from one
end of the representation to the other.
The same conclusion holds for two-dimensional and three-dimensional syn-
tax trees that allow multidominance.
Absence of relation. From a graphical perspective, the absence of a relation
in a three-dimensional syntax tree is indicated implicitly due to the constraints
on which nodes may be linked, i.e., it is compatible with hierarchies, but incom-
patible with matrices and networks.
However, from a syntactic perspective, the item/link constraints property is
incompatible with each of the three spatial diagrams. Consequently, it is not
possible to validly reason about absence of relations from a syntactic perspec-
tive.
The same conclusion holds for three-dimensional syntax trees that allow
multidominance. In contrast, the absence of relations for two-dimensional syntax
trees that allow multidominance is compatible with hierarchies from both a syn-
tactic and a graphical perspective, i.e., it does not rely on the syntactic interpre-
tation of ‘level’ in parallel constructions. Consequently, it is incompatible with
matrices and networks, because absence of links is indicated implicitly due to
the constraints on which nodes may be linked.
III. Potential Movement
The potential movement in the three spatial diagrams is defined by three
properties: ‘linking relations’, ‘existence of paths’ and ‘traversing the representation’
(see Table III).
Linking relations. The linking relations property focuses on the links going
out of a single node (Novick & Hurley, 2001). From a syntactic perspective, the
linking relations in a three-dimensional syntax tree are compatible with hierar-
chies, but incompatible with matrices and networks. They are incompatible with
matrices, because the links in the syntax tree do not depict both one-to-many
and many-to-one relations in the represented world. They are also incompatible
with networks, because one-to-many and many-to-one relations cannot be repre-
sented simultaneously in the syntax tree. They are, however, compatible with
hierarchies, because a single line enters and multiple lines leave each node, i.e.,
all depicted relations are one-to-many in the context of X-bar theory (see the
‘building block’ property).
From a graphical perspective, the linking relations in a three-dimensional
syntax tree are incompatible with each of the three spatial diagrams. They are
incompatible with matrices and hierarchies, because neither of these diagrams
allows representation of a one-to-one relation. Although networks do allow
representation of a one-to-one relation, they also allow many-to-many relations.
Many-to-many relations are not allowed in three-dimensional syntax trees, so
their linking relations are incompatible with networks as well.
POTENTIAL MOVEMENT
Linking Relations
Matrix The links associated with each row or column value depict
both one-to-many and many-to-one relations in the repre-
sented world, but the existence of these (many-to-many) rela-
tions must be inferred (i.e., is not directly accessible from the
representation).
Network Any number of lines can enter and leave each node. Thus
both one-to-many and many-to-one (i.e., many-to-many) rela-
tions can be represented simultaneously.
Hierarchy Either a single line enters and multiple lines leave each node
(i.e., all depicted relations are one-to-many) or multiple lines
enter and a single line leaves each node (i.e., all depicted
relations are many-to-one), but not both.
Existence of Paths
Matrix This representation does not show paths connecting subsets
of (more than two) items.
Network This representation shows paths connecting subsets of (more
than two) nodes.
Hierarchy This representation shows paths connecting subsets of (more
than two) nodes.
Traversing the Representation
Matrix It does not really make sense to talk about traversing this
type of representation.
Network Multiple paths from one node to another are possible because
closed loops are allowed in this representation.
Hierarchy For any pair of nodes, A and B, there is only one path to get
from one to the other (i.e., closed loops are not allowed).
Table III. Properties related to the Potential for Movement in the Three Spatial Diagrams. Reconstructed
from Novick & Hurley (2001).
From a graphical perspective, the same conclusion holds for two-
dimensional and three-dimensional syntax trees that allow multidominance.
From a syntactic perspective, the linking relations in two-dimensional and
three-dimensional syntax trees that allow multidominance are compatible with
networks, but incompatible with matrices and hierarchies. They are incompatible
with matrices, because many-to-many relations are directly accessible from a
three-dimensional syntax tree that allows multidominance. They are also
incompatible with hierarchies, because hierarchies don’t allow many-to-many
relations. They are compatible with networks, because any number of lines can
enter and leave each node in a network, i.e., one-to-many, many-to-one and
many-to-many relations can be represented.
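The hierarchy definition of linking relations in Table III can be sketched as a
simple check on incoming and outgoing lines. The sketch assumes that a diagram
is encoded as a list of (mother, daughter) links; the function name and the
example labels are hypothetical. It returns True only when all depicted
relations are one-to-many or all are many-to-one, but not a mixture of both.

```python
from collections import Counter


def hierarchy_compatible(links):
    """Check the 'one-to-many or many-to-one, but not both' condition.

    `links` is a list of (mother, daughter) pairs (an assumed encoding).
    """
    incoming = Counter(daughter for _, daughter in links)
    outgoing = Counter(mother for mother, _ in links)
    # One-to-many throughout: no node has more than one line entering it.
    all_one_to_many = all(count <= 1 for count in incoming.values())
    # Many-to-one throughout: no node has more than one line leaving it.
    all_many_to_one = all(count <= 1 for count in outgoing.values())
    return all_one_to_many or all_many_to_one


# Hypothetical examples: a plain binary tree and a tree with a shared node D.
plain = [("A", "B"), ("A", "C")]
multidominance = plain + [("B", "D"), ("C", "D")]

print(hierarchy_compatible(plain))           # True
print(hierarchy_compatible(multidominance))  # False
```

In the second example A has two daughters while D has two mothers, so
one-to-many and many-to-one relations occur in the same diagram, which only a
network permits.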
Existence of paths. From both a syntactic and a graphical perspective, the
existence of paths in a three-dimensional syntax tree is compatible with
networks and hierarchies, but incompatible with matrices. It is incompatible with
matrices, because it does show paths. It is compatible with networks and
hierarchies because it shows paths connecting subsets of (more than two) nodes.
The same conclusion holds for two-dimensional and three-dimensional syn-
tax trees that allow multidominance.
Traversing the representation. From both a syntactic and a graphical per-
spective, the traversal of a three-dimensional syntax tree is compatible with hi-
erarchies, but incompatible with matrices and networks. It is incompatible with
matrices, because it is actually possible to traverse the representation. It is also
incompatible with networks, because multiple paths from one node to another
are impossible in a syntax tree. It is compatible with hierarchies because for any
pair of nodes, A and B, there is only one path to get from one to another.
The same conclusion holds for two-dimensional and three-dimensional syn-
tax trees that allow multidominance.
Appendix B: Formal Design Principles
Formal Design Principles:
For each three-dimensional syntax tree in an x,y,z-Cartesian coordinate space the
following rules hold:
i. Any node A is also a tree A.
ii. There are three constants:
a. α is the default x-spacing.
b. β is the default y-spacing.
c. γ is the default z-spacing.
iii. For any node A:
The text label of A is completely visible (i.e., the node is com-
pletely visible).
iv. For any pair, A and B, of adjacent nodes along the y-axis:
$|A_y - B_y| = |\Delta y| = \beta$.
v. For any pair, A and B, of adjacent nodes along the z-axis:
$|A_z - B_z| = |\Delta z| = \gamma$.
vi. For any mother node C, with a single daughter A:
$(C_z = A_z) \Leftrightarrow (C_x = A_x)$
vii. For any mother node C, with two daughters A and B, where A
precedes B:
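$(C_z = A_z \land C_z = B_z) \Leftrightarrow (C_x = 0.5\,|A_x - B_x| + A_x = 0.5\,|\Delta AB_x| + A_x)$
$(C_z = A_z \land C_z \neq B_z) \Leftrightarrow (A_l = \max[\forall N_x\,(N \in A) \land (N_x > A_x)] - A_x) \land C_x = A_x + A_l + \alpha$
$(C_z = B_z \land C_z \neq A_z) \Leftrightarrow (B_l = B_x - \min[\forall N_x\,(N \in B) \land (N_x < B_x)]) \land C_x = B_x - B_l + \alpha$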
viii. For any pair, A and B, of adjacent nodes along the x-axis, where A
precedes B:
a. $E = \mathrm{NODE}(A_x, A_y, A_z - 1)$
b. $F = \mathrm{NODE}(B_x, B_y, B_z - 1)$
c. $(A_y = B_y \land A_z = B_z) \Leftrightarrow$
$(\lnot E \Leftrightarrow (A_l = \max[\forall N_x\,(N \in A) \land (N_x > A_x)] - A_x)) \land$
$(E \Leftrightarrow (A_l = (\max[\forall N_x\,(N \in A) \land (N_x > A_x)] - A_x) + (\max[\forall N_x\,(N \in E) \land (N_x > E_x)] - A_x))) \land$
$(\lnot F \Leftrightarrow (B_l = B_x - \min[\forall N_x\,(N \in B) \land (N_x < B_x)])) \land$
$(F \Leftrightarrow (B_l = (B_x - \min[\forall N_x\,(N \in B) \land (N_x < B_x)]) + (B_x - \min[\forall N_x\,(N \in F) \land (N_x < F_x)]))) \land$
$(B_x - A_x = \Delta x = A_l + B_l + \alpha)$
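To give a feel for how rules iv, vi, vii and viii interact, the following is a
minimal sketch for the simplest case only. It is not the rendering algorithm of
this thesis, and it rests on several assumptions: all nodes lie in one plane
(the same z, so the nodes E and F of rule viii do not exist), each node has at
most two daughters, label widths are ignored, rule viii is applied to sister
nodes only, an extent is taken to be 0 when no node lies on that side, y
decreases downwards, and both spacing constants are set to 1.0. The names
Node, layout, ALPHA and BETA are hypothetical.

```python
from dataclasses import dataclass, field

ALPHA = 1.0  # default x-spacing (rule ii.a); the value is an assumption
BETA = 1.0   # default y-spacing (rule ii.b); the value is an assumption


@dataclass
class Node:
    label: str
    daughters: list = field(default_factory=list)
    x: float = 0.0
    y: float = 0.0


def walk(node):
    """Yield the node and every node below it (rule i: a node is also a tree)."""
    yield node
    for daughter in node.daughters:
        yield from walk(daughter)


def extents(node):
    """Return how far the tree reaches to the left and right of its root."""
    xs = [n.x for n in walk(node)]
    return node.x - min(xs), max(xs) - node.x


def shift(node, dx):
    """Move a whole tree along the x-axis."""
    for n in walk(node):
        n.x += dx


def layout(node, depth=0):
    """Assign x and y coordinates bottom-up for a single-plane tree."""
    node.y = -depth * BETA                       # rule iv: levels are BETA apart
    for daughter in node.daughters:
        layout(daughter, depth + 1)
    if len(node.daughters) == 1:
        node.x = node.daughters[0].x             # rule vi
    elif len(node.daughters) == 2:
        a, b = node.daughters
        _, a_l = extents(a)                      # right extent of tree A
        b_l, _ = extents(b)                      # left extent of tree B
        shift(b, a.x + a_l + b_l + ALPHA - b.x)  # rule viii: Bx - Ax = Al + Bl + ALPHA
        node.x = 0.5 * abs(a.x - b.x) + a.x      # rule vii, same-plane case
    elif len(node.daughters) > 2:
        raise ValueError("this sketch handles at most two daughters")
    return node


# Hypothetical example tree: S dominates NP and VP; NP dominates D and N.
s = layout(Node("S", [Node("NP", [Node("D"), Node("N")]), Node("VP")]))
np_, vp = s.daughters
print(np_.x, vp.x, s.x)  # 0.5 2.0 1.25
```

In this example NP sits midway between D and N, VP is pushed to the right of
the NP tree by the extent of that tree plus the default spacing, and S is
centred over NP and VP.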