readability metric feedback for aiding node-link ... · node-and-edge readability metrics to...

Readability metric feedbackfor aiding node-linkvisualization designers

C. DunneS. I. Ross

B. ShneidermanM. Martino

Analyzing network data can provide valuable insights in manydiverse fields. However, designing node-link visualizations thateffectively communicate the underlying network is challenging, asfor every network there are many potential unintelligible or evenmisleading layouts. Automated layout algorithms have helped, butfrequently generate ineffective visualizations. In order to buildawareness of effective node-link visualization strategies, we detailnew global readability metrics on a [0,1] continuous scale fornode-node overlap, edge crossing angle, angular resolution, groupoverlap, and visualization coverage. In addition, we define novelnode-and-edge readability metrics to provide more localizedidentification of where improvement is needed. We describethe trade-offs inherent in optimizing individual metrics as wellas recommend metric optimizations for particular tasks. Ourmetrics are implemented in a JavaScriptA API (applicationprogramming interface) to make them widely available todesigners of web-based visualization tools, who can use metricsto direct users towards poor areas of the drawing. Our prototypesystem using the API aims to help designers and theoristsevaluate and compare their layouts.

IntroductionNetwork data structures have been used extensively formodeling entities and their relationships across such diversedisciplines as computer science, sociology, bioinformatics,urban planning, and archeology. Analyzing networksinvolves understanding the complex relationships betweenentities, as well as any attributes, statistics, or groupingsassociated with them. Node-link network visualizations areexcellent for understanding the overall structure of a networkand for many important path-based tasks [1]. Moreover,they are frequently used [2, 3] and the only networkvisualization available in common analysis tools suchas NodeXL [4], Gephi [5], Cytoscape**[6] Pajek [7],and GUESS [8].However, many existing node-link visualizations are

difficult to extract meaning from because of 1) the inherentcomplexity of the relationships, 2) the number of itemsdesigners try to render in limited screen space, and3) for every network, there are many potential unintelligible

or even misleading visualizations. Automated layoutalgorithms have helped, but frequently generate ineffectivevisualizations even when used by expert analysts anddesigners. The results of applying force-directed layoutalgorithms can vary greatly depending on the size andtopology of the network, and the layout generated is highlydependent on the algorithm used. Past work, includingour own, has shown there can be vast improvementsin network visualization algorithms, but it remainsa challenge to automatically produce readable andmeaningful visualizations.We are interested in improving the general effectiveness

of node-link visualizations. By quantifying the readabilityof a layout based on human perception and cognitive studies,we can guide designers, theorists, and analysts towardsimprovements. Past work by Purchase and Leonard [9, 10]provides definitions for several global readability metrics(also called aesthetic criteria), which measure detrimentalfeatures such as edge crossings [see Figure 1(a)–(c)] andrate the layout as a whole. However, there are many seriousreadability problems that do not yet have metrics. We

�Copyright 2015 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done withoutalteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied by any means or distributed

royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor.

C. DUNNE ET AL. 14 : 1IBM J. RES. & DEV. VOL. 59 NO. 2/3 PAPER 14 MARCH/MAY 2015

0018-8646/15 B 2015 IBM

Digital Object Identifier: 10.1147/JRD.2015.2411412

introduce several new global metrics to quantify knownreadability issues such as node-node overlap, edge crossingangle, angular resolution, group overlap, and visualizationcoverage. Moreover, a single value is not sufficient todirect users to problem areas of the layout, which our workaddresses by introducing novel node and edge readabilitymetrics for individual nodes and edges.As there are trade-offs when optimizing readability

metrics, we survey the related literature studying thesetrade-offs and the effect of specific metrics on usertask performance. We also designed and implementedan interactive browser-based system and JavaScript**API (application programming interface) for helpingdesigners evaluate their visual and algorithmic choiceswith respect to alternatives. In summation, this work aimsto raise awareness of network visualization readabilityissues, and our techniques will help guide designersand theorists in creating more effective node-linkvisualizations.

Specifically, the contributions of this paper are as follows:

• New global readability metrics to help understanddifferent aspects of node-link visualization readability:node-node overlap, edge crossing angle, angularresolution, group overlap, and visualizationcoverage.

• Local readability metrics for individual nodes andedges to help users identify problem areas andfix them using interactive or automatic tools.

• JavaScript implementations of these readability metricsas a generally applicable web API.

• A browser-based system to help designers evaluateand compare their node-link visualization layouts,as well as help researchers study the utility of thesemetrics.

• A survey of work on readability metrics, algorithms foroptimizing them, and evaluations of their effectivenessfor various network analysis tasks.

Figure 1

Several examples of readability issues. Top row: Different visualizations of the same network, with the layout obscuring the topology in (a), while (b) and(c) are more understandable with less edge crossings. Middle row: We can eliminate the node-node overlap that makes the central overlapping group in(d) so difficult to understand by zooming out, increasing the spring lengths of the layout algorithm, and iterating the layout algorithm (e). This also helpsus avoid node-edge overlap. Bottom row: In edge tracing tasks, such as finding the length of the shortest path between the bottom right and top left nodesin (f), increasing the edge crossing angles approaching 70 degrees (g) improves user path finding performance.

14 : 2 C. DUNNE ET AL. IBM J. RES. & DEV. VOL. 59 NO. 2/3 PAPER 14 MARCH/MAY 2015

This paper begins with the background of node-linkvisualization readability. Then, we discuss our readabilitymetrics, including perceptual studies on their utility. Wecover a browser-based system and API for comparativeanalysis. We conclude the paper with the limitations of ourtechnique and mention several open avenues for research.This work is based on the initial discussions presentedin [11, 12].

Node-link visualization readabilityLayout algorithms attempt to find an optimal layout of thenetwork, often according to a set of readability metrics (RMs)or associated heuristics. Readability metrics are measuresof how understandable the entire node-link visualization is,based on detrimental features such as the number of edgecrossings or overlapping nodes in the drawing. Traditionallythese RMs have been called aesthetic criteria (e.g., byPurchase and Leonard [9, 10]), though several recent authorschoose to describe network visualizations in terms ofreadability instead of aesthetics [11–15]. We call themreadability metrics because of the ambiguity implied bythe word Baesthetic.[ We are not concerned as much withhow visually pleasing a particular network drawing is;instead, we are interested in how well it communicates theunderlying data. However, some of the most informativevisualizations happen to also be the most beautiful.There is a substantial body of work aimed at developing

and, more recently, empirically verifying the correctnessof a wide variety of RMs. Sugiyama’s book [16] includesan excellent figure showing several simple rule-baseddrawing optimizations (p. 14, Figure 2.3.1). Bennett et al.[17] later provided a summary of readability heuristics alongwith groundings in perceptual principles and experimentalvalidations. Excellent overviews of readability criteria forgeneral networks can also be found in [18–21]. RMs specificfor trees and UML (Unified Modeling Language) diagramsare described in [22] and [23], respectively. The first standardand numerical definitions of many specific RMs weregiven by Purchase and Leonard [10] and were elaboratedon by Purchase [9] who developed seven specific RMformulas. These will form the basis for much of our work.Several layout algorithms try to directly satisfy readability

metrics, such as Davidson and Harel’s [24] use of simulatedannealing to distribute nodes evenly, make edge-lengthsuniform, minimize edge-crossings, and keep nodes fromcoming near edges. Huang et al. [25, 26] developed a layoutto optimize angular resolution and edge crossing angles usingforce-directed sine and cosine forces, respectively, and theangular resolution force is also described by Huang et al.[27]. These approaches all deal with straight-line edges,which are the most common in practice, but for curved orpolyline edges Didimo et al. [28] developed a layout thatadds or removes bends as necessary to optimize edgecrossing angles, edge crossings, and a bend-specific metric

for geodesic edge tendency. Similarly, Simonetto et al. [29]introduced a post-processing algorithm which preserves edgecrossing properties, prevents nodes from overlapping edges,and increases angular resolution using polyline edges.Optimizing the layout for specific RMs can lead to

much more understandable drawings. For example,Figure 1(a)–(c) show how reducing edge crossings canlead to more straightforward representations. Optimizingfor RMs has been shown to promote many common analysistasks, though it does not guarantee the resulting drawingis understandable. The particular RMs that the layoutalgorithms optimize intentionally or indirectly throughheuristics may not be the correct ones for the tasks users aretrying to accomplish. There are often substantial trade-offsin task performance when different RMs are optimized, andcan result in ineffective, unintelligible, or even misleadingdrawings. For example, after reducing the number of edgecrossings in a large drawing, the spatial layout is oftentimessubstantially distorted, and it can alter a viewer’s perceptionof the importance and centrality of individual nodes.Additionally, as the optimization of many RMs is

NP-hard [20], automatic techniques often producesuboptimal network drawings. The International Symposiumon Graph Drawing has met annually for two decades,working to improve automated network layout algorithmsand RMs, among other things, but we believe thatstate-of-the-art automated layout algorithms alone areinsufficient to consistently produce understandable networkdrawings. Additional post-processing algorithms canimprove the layout, but are limited in how much they canmodify the layout. The layout algorithms available to endusers depend on the network analysis tool being used, andpost-processing techniques are rarely included and havedifficulties with evolving networks.Users can be made aware of the common problems RMs

measure, or even quantitative values for RMs to optimizemanually. However, current RMs only provide overallmeasures for the drawing, without any means for focusinguser attention on the problem areas. Users are not providedwith any indication of where to focus their improvements andhow effective they have been. Seasoned network analystsdevelop an ingrained understanding of proper layouttechniques and will interpret the spatial layout accordingly,but novice users are left to fend for themselves. Even expertusers have difficulty applying their layout techniques tonetworks over a few hundred nodes. Furthermore, usersmay not be aware of the optimization trade-offs of particularmetrics and how it affects task performance.

Specific readability metricsPrevious work primarily deals with readability metrics (RMs)for the entire node-link visualization, giving, for example,a single quantitative measure of the problems caused by edgecrossings. We name such measures global readability


metrics, or global RMs, and have developed several thatare not included in the literature. These serve as excellentmeasures for how understandable the whole networkvisualization is, but do not provide the level of specificityneeded to direct users to problem areas. To address thisproblem, we augment both existing and our new globalRMs with novel node and edge readability metricsthat describe how these individual components affect theglobal understanding. We call these node RMs and edgeRMs for short. This is an extension of the idea of individualnode and edge metrics espoused by Herman et al. [30].Our RMs are scaled appropriately to a continuous scale

from [0,1] where 1 indicates the positive maximum ofthe RM. This allows us to assign network readabilityrequirements to particular visualizations based on the contentand information we want them to impart. For example,a journal may recommend 1.00 node occlusion and greaterthan 0.95 edge crossing to publish a node-link visualization,while having different suggestions for UML diagramsor other kinds of networks. However, there are many usefulnode-link visualizations that violate these limits, and theyshould not be eliminated based solely on the RMs. Also notethat while the reverse scale would result in simpler RMformulas in many cases, we choose to use the conventional[0,1] scale keeping with the conventions set by ourcommunity [9, 31].In these formulas, we use a notation similar to that of

Purchase [9], where the network has n nodes and m edges,indexed using subscripts. Using a technique called bendspromotion [9], we can convert polyline edges into m0 straightline edges and replace the bends in the edges with nodesfor a total of n0. However, we will deal predominantlywith straight-line drawings. The ith node is denoted ni 2 Nor n0i 2 N 0, and the ith edge is denoted ei 2 E or e0i 2 E0.In cases where division by 0 could occur in our metriccalculation, we use the function below to simply return 0.

divOrZeroða; bÞ ¼ab if b 9 00 otherwise

n

Node-node overlap @nEuclid defined a point as that which has no part. Historically,network layout algorithms were designed around theseabstract networks [32], with nodes taking up little or nospace [22, 33, 34]. However, practical networks such associograms or UML diagrams represent nodes using text,shapes, colors, pictures, and size [32]. Classical algorithmscan thus frequently result in nodes with non-zero widthand height, overlapping one another in the node-linkvisualization as in Figure 1(d).This node-node overlap, also called node overplotting or

occlusion, is contrary to accepted node-link visualizationreadability guidelines [16], including those for trees [22] andUML diagrams [23]. Moreover, areas of the drawing withhigh overlap make it very difficult for the viewer to get an

accurate count of the number of individual nodes in a clusterto get a sense of its scale. These problems can be reducedsomewhat, but not entirely, through the use of a halo, fog,border, shadow, or 3D effects on nodes to help distinguishthem from each other.Many force-directed layout algorithms include node-node

repulsive forces or equivalent constructs, includingvariants of the spring embedder [35] such as the popularFruchterman-Reingold force-directed algorithm [36] andmore scalable gravitational N -Body approaches such asthose provided by Prefuse [37] using the Barnes-Hutforce calculation algorithm [38]. However, force-directedapproaches cannot usually guarantee all overlaps willbe removed while the area and shape of the drawingare preserved, because they rely on overly large repulsiveforces or post-processing [39]. One notable exception isan approach by Harel and Koren [40].There have also been many algorithms developed for

removing node-node overlaps using post-processing afteran initial layout algorithm. While the problem of creatinga minimum-area layout adjustment is NP-complete, thereare several practical algorithms. These include variantsof the force-scan method [32–34, 41–43], constrainedoptimization [44–46], and force-directed approaches[47–49]. However, many of these have various problemswith scaling up or preserving the network shape.For example, the scan line approach of Dwyer et al.’s

solve_VPSC [44, 45] is a quadratic programming algorithmthat removes overlaps and maintains orthogonal ordering.However, the visualization can become highly skewed [39].One option proposed by Li et al. [33] is varying the edgelengths in a standard force-directed layout. While thispreserves the orthogonal ordering well, it has scaling issuesand can require excessive space [39]. An alternative isthe Voronoi cluster busting algorithm of Lyons et al. [49]and used by Gansner and North [47] for their layout.This algorithm iteratively forms a Voronoi diagram for thelayout and moves nodes to the center of their Voronoi cells.This roughly maintains the network shape, but loses muchof the layout structure and again expands to take up alot of screen space [39]. Another interesting approach byImamichi et al. [50] for 3D visualizations assumes labelsextend from spherical nodes, models these masses witha set of spheres, and solves the sphere packing problem. Thisallows for arbitrary rotation and translation, but is not assuitable to 2D nodes.One of the most effective approaches for node-node

overlap removal appears to be the PRoxImity StressModel (PRISM) algorithm of Gansner and Hu [39], whichiteratively computes node overlap along the edges of aDelaunay triangulation and adjusts those edge lengthsaccordingly to remove the overlap. According to Gansnerand Hu’s evaluations, the PRISM approach scales up well tolarge networks while maintaining a good tradeoff between


preserving the network shape and limiting the area requiredby the adjusted visualization.Despite two decades of research into algorithms for

node-node overlap removal, most widely used networkvisualization tools fail to properly reduce node-node overlap.Examples include NodeXL [51] and Pajek [7], two commonsocial network analysis tools. In a recent user study [52],the authors had to hand-tune the diagrams produced byPajek to avoid node-node overlap. Figure 1(d) and (e) showhow node-node overlap can be eliminated by zooming outand increasing default spring lengths, at the cost ofdecreasing perceived clustering.

Global readability metricWe are not aware of any suitable existing readability metricsfor node-node overlap. We suggest a global RM @n as theratio of the area of the union of the node representationsover their total area if drawn independently. This unionis akin to flattening the node images in a photo editor andthen computing the area. On a continuous scale from 0 to1, 1 indicates that every node is uniquely distinguishablefrom its neighbors, and 0 indicates that all nodes inthe visualization are drawn within the bounds of thelargest node.

a ¼ area[nj2N

boundsðnjÞ

0@

1A

amax ¼Xnj2N

area boundsðnjÞ� �

a� ¼ argmaxnj2N

area boundsðnjÞ� �

@n ¼ divOrZeroða� a�; amax � a�Þ

For each node nj (as we are not interested in overlap of bendspromoted into nodes), we take its non-empty polygonrepresentation or bounds (possibly including a spacingrequirement) boundsðnjÞ, compute the union of all thesepolygons, and the area of this union a. Here, a represents thescreen space that nodes occupy. We then sum the areas ofeach individual polygon ðamaxÞ. In order for the metric tohave a range of [0,1], the areas a and amax must have themaximum node area a� subtracted from it. Dividing a byamax gives us our global RM for node-node overlap @n: a ratioof screen space occupied versus what would be occupied ifthe nodes were drawn disjointly.The union is the most time consuming operation, but

can be done fairly quickly. Martınez et al. [53] describean approach where for p total edges of all polygonsinvolved and k intersections the union can be computedin Oððpþ kÞ log pÞ time. Of course, even simple unionor intersection operations can be sped up using hierarchicalsubdivisions like quadtrees, k-d trees, R-trees, and binaryspace partition (BSP) trees [54], not to mention simplerbounding polygons.

Node readability metricSimilarly, our node RM @nðnjÞ is the proportion of the node’srepresentation that is not obscured by other nodes. A valueof 0 indicates the node occupies no unique points, while 1indicates that it is drawn completely disjointly.

aðnjÞ ¼ area[

nk 6¼nj2NboundsðnjÞ \� boundsðnkÞ

0@

1A

amaxðnjÞ ¼ area boundsðnjÞ� �

@nðnjÞ ¼ 1� divideOrZero aðnjÞ; amaxðnjÞ� �

Here the regularized intersection of rectangles P and Q,denoted P \ �Q, is the closure of the interior of the standardintersection P \ Q. Regularization is used to removelower-dimensional Bdangling[ components (for instance,lines in 2D drawings) [54].For a given node nj, we take its polygon representation

or bounds (possibly including a spacing requirement)boundsðnjÞ and find the regularized intersection of itwith each other overlapping node nk . Again, hierarchicalsubdivision allows us to much more quickly determine whichnodes intersect [54]. Dividing the overlapping area aðnjÞby the area the node would occupy disjointly amxðnjÞwe get the inverse of our [0,1] metric. Subtracting fromone we get our node RM for node-node overlap @nðnjÞ.

Edge readability metricNaturally, an edge RM @nðeiÞ would not be especiallyuseful for manually improving node-node overlap; however,it may play into algorithmic improvement techniques.Also, node-node overlap is usually grouped in the literaturewith node-edge overlap, which is discussed in depthelsewhere [11, 12].

Edge crossing @cThe number of edge crossings or intersections is the mostwidely accepted optimization criteria in the node-linkvisualization literature. In 1953, Moreno [55] wrote, BThefewer the number of lines crossing, the better the sociogram.[Edge crossing is listed as an important general RM in manybooks on network visualization [16, 20, 21], as well asfor automated UML diagram layout [23]. As with node-nodeoverlap, the effect of edge crossings can be somewhatmitigated with a halo, fog, border, shadow, or 3D effectson the edges to help distinguish them from each other.Additionally tapers, curves, or gradient colors can showedge directionality. Moreover, substantial work has alsobeen done in the design of network layout algorithms thatspecifically reduce the number of edge crossings, such as[24, 36, 56–59].Purchase’s seminal RM comparison user study identified

edge crossings as having the greatest impact on humanunderstanding of general networks of the five RMs she


studied [60]. This finding has been empirically validatedin [61–63]. These studies focus on edge tracing taskslike finding the length of the shortest path between twonodes, though use a global count of the number of edgecrossings. Ware et al. [64] suggest that the number of edgecrossings along the relevant edges is more important thana global measure. Additional evidence for the importance ofedge crossing comes from Korner and Albert [65], dealingwith visualizing ordered sets. Moreover, user preferencestudies identify minimizing edge crossings as the mostimportant RM for UML diagrams [63, 66] as well as fornode-link visualizations [67], and when given the optionof improving on an initial force-directed or random layout,users created node-link visualizations with 60% feweredge crossings on average [68]. Korner and Albert [65]theorize that crossed lines could be salient properties thatdistract the user’s visual system from the relationshipsthe drawing was designed to convey.However, Mutzel [58] suggests that allowing some edge

crossings can sometimes result in more readable node-linkvisualizations, and recent literature points to constrainingedge crossing angles being almost as effective as reducingedge crossings (see our metric for edge crossing angle).Furthermore, recent research on node-link visualizationscomparing edge tracing tasks like finding groups to nodeimportance tasks indicates that while reducing edge crossingsimproves edge tracing task performance and user preference,it has little effect on node importance tasks [69–71]. Thiswas further verified in eye tracking studies [72–74]. Huang,Eades, and Hong postulate that this indicates the effectsof edge crossings can vary depending on the situation.Further discussion of the cognitive load imposed by edgecrossings quantified using eye tracking is in [52, 75, 76], andas we suggested before, Figure 1(a)–(c) demonstrate howreducing edge crossings can lead to a much moreunderstandable drawing.

Global readability metricWe take from Purchase [9] the global RM for edge crossings@c based on c, the number of pairwise edge crossings inthe drawing. As suggested, scaling by an approximate upperbound for the number of crossings in the drawing, we canproduce a metric over [0,1].

call ¼Xm0i¼1ði� 1Þ ¼ m0ðm0 � 1Þ

2

cimpossible ¼1

2

Xn0j2N 0

deg n0j

� �deg n0j

� �� 1

� �

cmax ¼ call � cimpossible

@c ¼ 1� divOrZeroðc; cmaxÞ

Here, degðn0jÞ is the degree of node n0j (with bendspromotion). First, we calculate call, the number of crossings

if every pair of edges intersect. Of those, we removecimpossible, the impossible intersections of edges connectedto the same node in a straight-line visualization. This leavesus with cmax, a (probably high) upper bound to the numberof crossings in the drawing. Scaling c by cmax and subtractingfrom 1, we obtain the global RM for edge crossings @c.We can report all c crossings of m0 edges in Oðm0 logm0 þ cÞtime and Oðm0Þ space [54, 77] rather than testing all cmax

pairs. cmax can be computed in Oðn0Þ time and only needsto be calculated once.If the network topology is dynamically changing, only

those nodes with modified degree ð�n0Þ need to be usedto recalculate cimpossible in Oð�n0Þ time and the addedor removed edges must be fed back into the calculationof c. Similarly, if the layout is dynamically changing,then c must be updated for all edges whose locationhas changed.A discussion of various algorithms for line segment

intersection reporting can be found elsewhere [54].The ability to use precomputed results to only test themodified edges for intersections naturally depends on thechoice of algorithm, although some algorithms [77] areiterative and seem particularly suited for the addition ofnew edges.

Edge readability metricOur edge RM for edge crossings @cðe0iÞ is defined for anyedge e0i based on the number of pairwise edge crossings cðe0iÞbetween it and any other edge in the drawing. Scalingas before, we can produce a metric over [0,1]. With thismetric, we can identify the edges with the most crossingsin the drawing.

call e0i

� �¼m0 � 1

cimpossible e0i� �¼ deg src e0i

� �� þ deg tar e0i

� �� 2

cmax e0i� �¼ call e

0i

� �� cimpossible e0i

� �¼m0 � deg src e0i

� �� deg tar e0i

� �� þ 1

@c e0i� �¼ 1� divOrZero c e0i

� �; cmax e0i

� �� callðe0iÞ is the number of edges e0i could intersect inthe drawing, of which we can remove the impossibleintersections cimpossibleðe0iÞ. Edges that have the same sourceor target node as e0i ðsrcðe0iÞ and tarðe0iÞ, respectively) cannotintersect e0i in a straight-line visualization. Thus, we havecmaxðe0iÞ, an upper bound to the number of edges crossing e0i.Scaling cðe0iÞ by cmaxðe0iÞ and subtracting from 1, we get theedge RM for edge crossings.

Node readability metricOur node RM for edge crossings @cðn0jÞ is defined for anynode (and bend) n0j based on cðn0jÞ, the sum of the numberof crossings its connected edges have. We call thesecrossings caused by the node’s position the node’s triggeredcrossings. Again, we scale to a continuous metric scale


of [0,1]. This allows us to identify the nodes whose positionsare the cause of many edge crossings.

c n0j

� �¼

Xe0i2edges n0jð Þ

c e0i� �

cmax n0j

� �¼

Xe0i2edges n0jð Þ

cmax e0i� �

¼X

e0i2edges n0jð Þm0 þ 1�deg src e0i

� �� deg tar e0i

� ��

¼X

e0i2edges n0jð Þm0 þ 1�deg n0j

� ��deg adj n0j; e

0i

� ��

¼ deg n0j

� �m0 þ 1� deg n0j

� ��

�X

e0i2edges n0jð Þdeg adj n0j; e

0i

� ��

@c n0j

� �¼ 1� divOrZero c n0j

� �; cmax n0j

� ��

Here, edgesðn0jÞ is the set of all edges connected to node n0j.We define an upper bound to the number of edge crossingsof connected edges cmaxðn0jÞ as the sum of the individualedge upper bounds cmaxðe0iÞ from the edge RM. For allconnected edges, we can pick the current node n0j as eitherthe source or the target, and use the adjacent node along edgee0i, denoted adjðn0j; e0iÞ, as the other. As degðn0jÞ ¼ jedgesðn0jÞj,we obtain the formula for cmaxðn0jÞ. Again, scaling cðn0jÞby cmaxðn0jÞ and subtracting from 1, we obtain the nodeRM for edge crossings @cðn0jÞ.

Edge crossing angle @caThe impact of edge crossing angles was first discussed byWare et al. [64], who used a neurophysiological view of theuser. Ware et al. claim rapid early-stage neural processingcauses certain features to Bpop out[ to users, and that theseneurons are coarsely tuned when examining angles, roughlybetween þ=� 30 degrees. Although they did not find theimpact of edge crossing angles to be significant, they did findthat another angular measure, path continuity, was. Thisneurophysiological view supplies an explanation for theresults of [72, 73, 75, 78–81], which use eye-tracking userstudies to verify that the angle of edge crossings has asignificant impact on user response time for edge tracingtasks. According to Huang [73], when users trace anedge, their eyes follow it smoothly. With the addition ofa large-angle crossing, eye movement slows but remainssmooth. However, small-angle crossings cause even slowermovement, as well as back-and-forth eye movements aroundthe acute crossings. Moreover, response time significantlydecreases as the crossing angle tended toward 90 degreesbut tended to level off and even increase beyond 70 degrees[81]. This is attributed to extra eye oscillations as well.However, as the size of the network increases creating longer

searching paths, the impact of even large-angle crossingscan build up and become significant [73].Removing edge crossings is more important than

optimizing the edge crossing angle (at least in the range of60 to 80 degrees) [78]. However, based on these results,we can easily claim that we achieve the same level ofvisualization effectiveness by leaving a few more edgecrossings and maximizing crossing angles [79]. SeeFigure 1(f) and (g) for a demonstration of how larger angleedge crossing angles promote path finding tasks.

Edge readability metricWare et al. [64] use the average cosine crossing angle asa crossing angle metric, but only for an individual shortestpath between two nodes.

1

cp

Xcpi¼1

cos �pi

Here cp is the number of crossings on a shortest path pand �pi is the angle between p and the ith crossing edge ofthe path.Our edge RM for edge crossing angle @caðe0iÞ is defined

for any edge e0i as the average deviation of its individualedge crossing angles from an ideal angle # of 70 degrees[81]. # can be changed within [0,90] degrees as necessaryif further evaluations refine our understanding of theideal angle. According to Huang et al. [81], user responsetime at 90 degrees was even slightly worse than at50 degrees.

d e0i� �¼Xc e0ið Þ

j¼1j#� �ijj

dmax c e0i� ��

¼ c e0i� �

#

@ca e0j

� �¼ 1� divOrZero d e0i

� �; dmax e0i

� ��

Here, cðe0iÞ is the number of crossings edge e0i has, and�ij is the crossing angle of edges e0i and e0j, also [0,90]degrees. Note that this can be computed in conjunctionwith the metrics for Edge Crossing to reduce duplicatedcomputation.

Global readability metricThe global RM for edge crossing angle @ca is based on c,the number of pairwise edge crossings in the drawing,and the edge RM crossing angle deviation sum dðe0iÞ foreach individual edge e0i.

d ¼Xm0i¼1

dðe0iÞ ¼Xm0i¼1

Xcðe0iÞj¼1j#� �ijj

dmax ¼ c#

@ca ¼ 1� divOrZeroðd; dmaxÞ


The total deviation d is the sum of the deviation of alledges, and the maximum deviation dmax is c times theideal crossing angle # of 70 degrees [81].

Node readability metricA node RM for edge crossing angle @caðn0jÞ is lessuseful to define for manual improvement, but if useful forautomated improvement it could be constructed just like theglobal RM using edgesðn0jÞ in lieu of m0.

Angular resolution (min) @rmThe angular resolution (min) RM refers to the minimumangle formed by all the edges incident to an individual node.Sugiyama et al. [59], Formann et al. [82], and Colemanand Parker [56] dealt with this early on, and Battista et al.[20] used the minimum angle in the entire drawing as aglobal measure. Purchase [9] defines a minimum angle metricshe called @m, which is the inverse of the later angularresolution measure of Finkel and Tamassia [83]. Purchase[60] found the minimum angle metric had no effect onpath finding tasks, but it was determined relevant for shortestpath tasks in a study by Huang et al. [27]. Huang et al. [27]ranked the minimum angle in the entire drawing as morepredictive of user performance than Purchase’s measure;however, more than shortest path tasks need to be evaluated.For example, angular resolution was found to be importantfor recognizing actor status by Huang et al. [70, 71, 74]. Ofcourse, alternatives to straight-line edges like curvilinearlines [83] can result in visualizations with perfect angularresolution but are not widely used.

Node readability metricInspired by Purchase’s [9] minimum angle metric, wecompute a node RM for angular resolution @rmðn0jÞ for a noden0j based on the deviation dðn0jÞ of the incident edge anglesfrom the ideal minimum angle for that node #j. As the edgesare ideally spread symmetrically around the node, #j issimply 360 degrees divided by the number of incident edgesor degree degðn0jÞ. Here, �jmin is the minimum angle betweenedges incident to node n0j.

#j ¼360�

deg n0j

� �

d n0j

� �¼ #j � �jmin

#j

��

@rm n0j

� �¼ 1� d n0j

� �

After subtracting dðn0jÞ from 1, we obtain our node RM forangular resolution (min) @rmðn0jÞ on a [0,1] scale.

Global readability metricOur initial global RM for angular resolution (min) @rm isthe same as Purchase’s [9] minimum angle metric @rm.

Here d is the average deviation across all nodes of adjacentincident edge angles from the ideal minimum angle for thatnode n0j. These edges can be bend-promoted.

d ¼ 1

n0

Xn0j¼1

d n0j

� �¼ 1

n0

Xn0j¼1

#j � �jmin

#j

��

@rm ¼ 1� d

Edge readability metricAgain, an edge RM @rmðe0iÞ is not as useful for manualimprovement as the angular resolution metric is moreimportant for actor (node) status tasks. However, it couldbe computed from the deviation of incident edge anglesbetween the four nearest incident edges at its sourceand target nodes.

Angular resolution (dev) @rdIn our experiments, the metric for Angular Resolution (min)seems to unfairly penalize high degree nodes that have evena single small angle between incident edges. We proposean alternate set of metrics for angular resolution @rd basedon the average deviation (dev) of angles between eachincident edge at a node, not merely the minimum angleof them. We have the same goal of better capturingthe distribution as Huang et al. [25, 26], who averagedthe standard deviations of edge angles at each node.However, Huang et al. [27] determined that this measurewas not a good predictor of at least shortest-path taskperformance. Further evaluation is necessary to determinewhether our global RM for angular resolution (dev) @rdis a better predictor.

Node readability metricBelow is our refined formulation for an angular resolution(dev) node RM @rdðn0jÞ.

#j ¼360�

deg n0j

� �

d n0j

� �¼ 1

deg n0j

� � Xdeg n0jð Þ

i¼1

#j � �i;ðiþ1Þ#j

��

@rd n0j

� �¼ 1� d n0j

� �

#j, as in Angular Resolution (min), is the ideal angle betweenedges incident to node n0j. Here we are computing thedifference between the ideal angle #j and the angle betweeneach pair of subsequent incident edges �i;ðiþ1Þ. This pair anglecomputation is done modulo degðn0jÞ, so we find the anglebetween the first edge with the last as well. The total numberof these incident edges or degree of a node is given asdegðn0jÞ, which can include bend promoted edges. Thus,we take all of the deviation into account rather than focusingon a single close pair of edges.


Global readability metricOur global RM for angular resolution (dev) @rd is the logicalextension of the Angular Resolution (min) global RM. Herewe use the refined formulation of dðn0jÞ from the angularresolution (dev) node RM @rdðn0jÞ.

d ¼ 1

n

Xnj¼1

dðn0jÞ

¼ 1

n0

Xn0j¼1

1

deg n0j

� � Xdeg n0jð Þ

i¼1

#j � �i;ðiþ1Þ#j

��

0@

1A

@rd ¼ 1� d

Again, the �i;ðiþ1Þ pair angle computation is done modulodegðn0jÞ.

Edge readability metricJust as with the metric for Angular Resolution (min), an edgeRM @rdðe0iÞ is not as useful for manual improvement but canbe similarly computed.

Group overlap @gThe algorithm listed directly below counts the number ofoverlaps between groups (sets) of nodes in the network andthe remaining nodes. Naturally this only makes sense as aglobal RM. It first computes a convex hull for each group andthen finds the number of nodes outside the group that overlapwith the convex hull. The objective is to measure how theoriginal layout of the group affects users’ perceptions ofgroup membership, and how alternate layouts such as thosethat leverage group membership [84–87] improve on theseperceptions.

groups ¼ set of all groups, each a set of points ðxi; yiÞhullCounts ¼ ½�for all g 2 groups docount ¼ 0hull ¼ grahamScanðgÞfor all node 2 G:nodes j node 62 g doif intersects(hull, node) thencount þþ

hullCounts.add(count)return hullCounts

We believe that convex hulls are more appropriate for thismeasure than alternatives such as concave hulls because(1) convex hulls more accurately model the way usersperceive regions of the network, and (2) it is moreefficient to find intersections between convex polygonsthan simple polygons [54]. Additionally, colored convexhulls are often used to show network group structure(e.g., SocialAction [88]).

Two functions are called in the algorithm we just listed,and we assume these are defined elsewhere. The first,grahamScanðSÞ, is the Graham scan algorithm [89]for computing a convex hull of a finite set of pointsin Oðn log nÞ time, where n is the number of points(nodes), in this case jgj. The second, intersectsða; bÞ,computes the intersection of two convex polygons inOðlog nÞ time, where n is the count of the nodes ina and b [54, 90].Let n be the number of nodes and g the number of groups.

The time complexity of the algorithm that we just listed isOðgn log nÞ, which is derived below. Here, jnjj is the numberof sides of the polygon representing a particular node nj.The other uses of jsj indicate the size of the enclosed set s,e.g., jgij is the number of nodes in the set gi. timegs is thetime spent on grahamScanðSÞ for all groups and timeint is thetime spent on intersection calculation.

timegs ¼Xgi¼1jgij log jgij; where gi 2 groups

timeint ¼ gn log maxi

hullðgiÞj jð Þ þmaxjjnjj� ��

As 8i; jhullðgiÞj � jgij � maxiðjgijÞ � n, and as maxjðjnjjÞis a constant for the highest degree polygon used as anode shape,

time¼ timegs þ timeint¼O gn logmaxijgijð Þ

� ��Oðgn log nÞ

Visualization coverage @vcThe global visualization coverage or ink RM denoted@vc is our attempt to quantify the amount of screen spaceused by the visual items in a visualization compared tothe entire space available. It is formulated as the areaoccupied by all visual items divided by the area of thescreen space. The objective of this metric is to measurethe amount of theoretically available screen space, soas to quantify the reduction in ink presented to theuser after filtering or simplification [87, 91]. It can alsomeasure the reduction in ink by using aggregate edges(or no edges) between groups in the Group-in-a-Boxlayouts [84, 86, 87]. Naturally, blank space can be usefulfor showing clusters and relationships. In practice, thismetric should be optimized only partially or with carefulconsideration.Here, we use a notation of a network or graph optionally

with bends promotion G0 and a network visualization V ðG0Þ.Each individual node n 2 N and edge e0 2 E0 is indexedusing subscripts (e.g., ni; e0jVwe are not concernedwith extra nodes from bends promotion). For any node,edge, or visualization k, boundsðkÞ indicates a boundingshape b for that item in the visualization, andareaðbÞ denotes the area of that bounding shape.


The visualization coverage metric @vc is defined asfollows:

bn ¼[n2N

boundsðnÞ

be ¼[e02E0

boundsðe0Þ

a ¼ areaðbn [ beÞnamax ¼ argmax

ni2Narea boundsðniÞð Þ

eamax ¼ argmaxe0j2E0

area bounds e0j

� ��

a� ¼ maxðnamax; eamaxÞamax ¼ area bounds V ðG0Þð Þð Þ@vc ¼ divOrZeroða� a�; amax � a�Þ

First, a union is computed of all the node-bounding shapesand edge bounding shapes in the visualization, includingany meta-nodes and meta-edges. In order for the metricto have a range of [0,1], the areas and amax must have themaximum node or edge area a� subtracted from it. Thisquantity is then divided by the total visualization area.

Tools for designers and theoristsImproving individual readability metrics (RMs) can bebeneficial for other RMs as well, though often there aretradeoffs between them designers may have to weigh. Which

RMs should be improved thus depends on what designers aretrying to convey with their visualizations and intended usecases. Designers of network visualization software who areaware of which RMs their layout algorithms attempt tooptimize and the effects various layout techniques have onhow much of the underlying data is effectively conveyedare likely to produce readable networks.In order to help designers integrate our metrics into as

many network visualization platforms as possible, we havechosen to build a JavaScript API which can be run client orserver side. Using the API, tool designers can get immediatefeedback on the readability effects of their algorithmicchanges. While performance in JavaScript is reducedversus traditional computational geometry tools, we believeJavaScript’s increasingly widespread use for visualizationby novice designers makes it an ideal language for ourproject. However, the dearth of efficient computationalgeometry libraries is problematic though new efforts likethe JavaScript Topology Suite (JSTS) are catching up.In addition to an API, we provide an interactive

browser-based system where designers can analyze theoutput of their tools. A designer can upload a GraphML filerepresenting his or her network, its associated layout, circularnode sizes, and group membership. Our system calculatesany relevant metrics and displays the results to the userin a heatmap as in Figure 2. When the user clicks one of therows in the heatmap, the associated network is loaded in aseparate individual results view (Figure 3 and Figure 4).

Figure 2

Heatmap view of our browser-based readability analysis system, showing readability metrics for several networks ranging from 4 to 10,433 nodes. Userscan upload their own GraphML file to be analyzed. Any of the results can be clicked to view the individual results and interactive node-link visualization,as in Figure 3 and Figure 4. The metrics shown are edge crossing (EC), node-node overlap (NO), edge crossing angle (ECA), angular resolution (min)(AR_M), and angular resolution (dev) (AR_D).


The row of the heatmap is preserved so the user can seethe overall metric results, but each global RM cell isnow clickable to add node and edge RM color coding(if available). Nodes and edges with poor RM values areshown in bright red. This helps users understand the reasonbehind poor global RM results. Also, using this approach,users can rapidly flip between RM rankings and identifyareas that would benefit from algorithmic improvementsand hand-tuning of the layout.We show an example analysis in Figure 3, which shows

a network of Twitter replies and mentions for usersdiscussing NodeXL [51]. The dataset was also collectedusing NodeXL and is available from the NodeXL GraphGallery [92]. The connected components are positionedindividually using the Squarified Treemap Group-in-a-Boxlayout [86], which put the main component on the left and

the others on the right side in a column. Each component islaid out internally using Harel and Koren’s Fast Multi-scaleLayout (FMS) [93]. We can see that due to the layoutand large size of nodes, there is a considerable amount ofnode overlap present in the main connected component.Figure 4 presents another NodeXL [51] Twitter collection,

this time users discussing Embry-Riddle AeronauticalUniversity. This dataset has only one connected component,but clusters found using Clauset-Newman-Moore [94]are positioned separately using the Force-DirectedGroup-in-a-Box layout [84]. Again, clusters are laid outinternally with FMS [93]. We see relatively poor edgecrossing angles, mainly across inter-cluster edges.Our RMs are useful for understanding the effectiveness

of different network layouts, but one of our chiefcontributions is a set of local RMs which can be used to

Figure 3

The individual results and interactive node-link visualization for a network, with color coding indicating the nodes with high node-node overlap (NO) inbright red. The metric abbreviations are described in Figure 2. Network available at: http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=10201.


improve readability for specific nodes and edges. Using ourAPI, the local RMs can be used within a web-based networkanalysis tool to help users identify problem areas of thelayout using color coding and ranked lists. Moreover, theRMs and associated colors can be incrementally updated inreal time as users interactively manipulate the visualization.This provides them with immediate feedback as to howthey are affecting the visualization’s readability, bothon a global and local level. Our previous work [11, 12]describes pilot implementations of these techniquesin NodeXL [51] and SocialAction [88], two desktopnetwork analysis tools.Our initial feedback from designers inside IBM has

been positive, and we intend to deploy this system toseveral internal and external teams. We expect our currentimplementation to be particularly beneficial for userswho perform manual network layouts, such as intelligenceanalysts, network biologists, and designers. We intend to

run evaluations to quantify the benefits and drawbacksof using RM assistance techniques for these kinds of users.Once in widespread use, we will be able to gather feedback

to help refine the interface, as well as RM results for manydifferent types of networks which could help us refine theRM theory especially for calibrating and combining metrics(discussed below).

Limitations and future work

Additional readability metricsThere are many potential RMs that clarify node-linkvisualizations. Each impacts how successfully the resultenables user insights. We have prioritized metrics that appearmost beneficial in perceptual and cognitive studies, but a fulldiscussion of the metric space is beyond the scope of thispaper. See our previous work [11, 12] for a list of many othermetrics that can be considered and links to related literature.

Figure 4

The individual results and interactive node-link visualization for a network, with color coding indicating the edges with poor edge crossing angles (ECA)in bright red. The metric abbreviations are described in Figure 2. Network with layout available at: http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=18806.


Depending on the target use case, there may be RMs morerelevant than the ones described herein which should beinvestigated. There is also a need for local node and edgeversions of current global metrics that we did not cover aspart of our work. Defining RMs for individual clusters orregions would also be helpful, especially for examining largenetworks. Another area of note is the idea of Ghani et al. [95]to develop dynamic network metrics for analyzing thereadability of animated transitions between networks andlayouts. Of course, as new metrics are developed, they shouldbe evaluated for how they affect user task performance. Userstudies can also be used to help identify cases where taskperformance suffers despite high RM values, indicatingpotential RMs to develop.

Readability metric designOur RMs are all on a [0,1] scale with higher valuesequating to more readable layouts. However, the meaningof a specific metric value and the ability to comparevalues across metrics and networks may be limited. This ispartially due to the scales not always being uniform. Forexample, when computing the edge crossing metric @cfor a large network the upper bound for the numberof edge crossings cmax can dwarf the actual numberof crossings.Our RM scales are suited for manually or automatically

optimizing individual RMs, as well as comparing differentlayouts of the same network using a RM. Still, additionalcalibration is required in order to allow for comparingdifferent RMs on the same network or compare a RM acrossnetworks. This calibration will require a corpus of real-worldnetworks, their varied layouts, and RM values, which wehave begun collecting via our browser-based readabilityanalysis system. One straightforward way of improving theRM scales with such corpus is the use of z-scores for each.Additional user studies will be required to gauge theeffectiveness of these calibrations.

Combining metricsThe metrics we have developed can be aggregated in variousways to create an overall metric for a visualization. However,the challenges we face when trying to compare differentRMs on the same network also affect our ability tocombine the RMs effectively. Of particular interest is thetechnique of Huang et al. [31] for combining z-scoresfor each metric. Further user studies are necessaryto determine which of the various RM combinationtechniques are appropriate.If the available metrics are believed insufficient for

capturing the overall effectiveness of the visualizationa cognitive load or mental effort measure may proveeffective. For this purpose, Huang et al. [96] developed avisualization efficiency metric based on response time,response accuracy, and mental effort.

Using metrics to support revisionIn the future, we would like to investigate additional waysRMs can aid in network layout tasks. Our RMs and anyothers can be integrated into a visual taxonomy for the user,which can then be used to help users choose the metricsto combine and optimize for. Once chosen, the combinedlocal RM can be calculated for the surrounding regionas a node is dragged by the user and applied to snap the nodeto local RM maxima, reducing manual effort. Additionally,the RMs could be fed into a fully-automatic layout algorithmthat creates optimal network layouts for a certain task.Rather than optimizing the entire network with such analgorithm, local RMs allow for targeted improvements tospecific nodes and edges, perhaps based on a degree ofinterest measure. For example, node-node overlap maynot be critical to optimize for every nodeVjust the oneswith labels or a high betweenness centrality.

ConclusionAs network analysis and node-link network visualizationsin general become more main-stream, it is important toprovide new designers guidelines for effective visualizationcreation, as without them the layouts produced can beunintelligible or even misleading. By quantifying thereadability of a layout we can guide designers in makingimproved choices and in the future feed the results intocognitive assistive design tools. Past work providesdefinitions for several global readability metrics on a [0,1]continuous scale, which measure detrimental features likeedge crossings and rate the layout as a whole. We introducedadditional global readability metrics for node-node overlap,edge crossing angle, angular resolution (dev), groupoverlap, and visualization coverage.However, a single value is not sufficient to direct users to

problem areas of the layout, which we address with novelnode and edge readability metrics. Our metrics can beused by a user to motivate improvement of the networkdrawing, either by hand with color coding, throughimmediate feedback techniques like snap-to-local-maxima,fully-automatic improvement by feeding readability metricresults back into a layout algorithm, or targeted automaticimprovement. As there are trade-offs when optimizingspecific readability metrics, we survey the related literaturestudying each of these metrics and their effecton user task performance.Both the global and local readability metrics are

implemented as part of a JavaScript API, which we aremaking available to tool designers to integrate within theirsystems, so as to help users identify problem areas in thevisualization to improve. Moreover, our web-basedGraphML analysis system allows for comparative evaluationof multiple layout algorithms and analysis tools. It furtherwill enable the collection of a corpus of real-world networks,their associated layouts, and readability metric values.


This work opens up several avenues of research, includingthe creation of additional metrics, calibration of existingmetrics, ways to combine metrics, and techniques forimproving layouts.

AcknowledgmentsWe appreciate the feedback and support of Marc Smith,Tony Capone, Adam Perer, the Social Media ResearchFoundation, and Microsoft External Research.

*Trademark, service mark, or registered trademark of InternationalBusiness Machines Corporation in the United States, other countries,or both.

**Trademark, service mark, or registered trademark of CytoscapeConsortium Corporation, Sun Microsystems, or Microsoft Corporationin the United States, other countries, or both.

References1. N. Henry and J.-D. Fekete, BMatLink: Enhanced matrix

visualization for analyzing social networks,[ in Proc. 11thIFIP TC Int. Conf. Human-Comput. Interact. INTERACT, 2007,pp. 288–302.

2. A. Aris, BVisualizing and Exploring Networks Using SemanticSubstrates,[ Ph.D. dissertation, Dept. Comput. Sci., Univ.Maryland, College Park, MD, USA, 2008.

3. B. Shneiderman and A. Aris, BNetwork visualization by SemanticSubstrates,[ TVCG: IEEE Trans. Vis. Comput. Graph., vol. 12,no. 5, pp. 733–740, Sep. 2006.

4. M. Smith, B. Shneiderman, N. Milic-Frayling, E. M. Rodrigues,V. Barash, C. Dunne, T. Capone, A. Perer, and E. Gleave,BAnalyzing (social media) networks with NodeXL,[ in Proc.4th Int. Conf. C&T, 2009, pp. 255–264.

5. M. Bastian, S. Heymann, and M. Jacomy, BGephi: An open sourcesoftware for exploring and manipulating networks,[ in Proc.ICWSM, 2009, pp. 1–2.

6. P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang,D. Ramage, N. Amin, B. Schwikowski, and T. Ideker,BCytoscape: A software environment for integrated modelsof biomolecular interaction networks,[ Genome Res., vol. 13,no. 11, pp. 2498–2504, Nov. 2003.

7. V. Batagelj and A. Mrvar, BPajekVProgram for large networkanalysis,[ Connections, vol. 21, pp. 47–57, 1998.

8. E. Adar, BGUESS: A language and interface for graphexploration,[ in Proc. SIGCHI, 2006, pp. 791–800.

9. H. C. Purchase, BMetrics for graph drawing aesthetics,[J. Vis. Languages Comput., vol. 13, no. 5, pp. 501–516,Oct. 2002.

10. H. C. Purchase and D. Leonard, BGraph Drawing AestheticMetrics,[ Key Centre Softw. Technol., Dept. Comput. Sci.,Univ. Queensland, St Lucia, QLD, Australia, Tech. Rep. 361,1996.

11. C. Dunne, BMeasuring and Improving the Readability of NetworkVisualizations,[ Ph.D. dissertation, Univ. Maryland, College Park,MD, USA, 2013.

12. C. Dunne and B. Shneiderman, BImproving graph drawingreadability by incorporating readability metrics: A software toolfor network analysts,[ Univ. Maryland, College Park, MD,USA, Human-Comput. Interact. Lab Tech. Rep. HCIL-2009-13,2009.

13. E. M. Bonsignore, C. Dunne, D. Rotman, M. Smith, T. Capone,D. L. Hansen, and B. Shneiderman, BFirst steps to NetViz Nirvana:Evaluating social network analysis with NodeXL,[ in Proc. Int.Conf. CSE, 2009, vol. 4, pp. 332–339.

14. M. Ghoniem, J.-D. Fekete, and P. Castagliola, BA comparisonof the readability of graphs using node-link and matrix-basedrepresentations,[ in Proc. IEEE Symp. INFOVIS, 2004, pp. 17–24.

15. N. Henry, A. Bezerianos, and J.-D. Fekete, BImproving thereadability of clustered social networks using node duplication,[IEEE Trans. Vis. Comput. Graph., vol. 14, no. 6, pp. 1317–1324,Nov./Dec. 2008.

16. K. Sugiyama, Graph Drawing and Applications for Softwareand Knowledge Engineers, vol. 11. Singapore: World Scientific,2002, ser. Series on Software Engineering and KnowledgeEngineering.

17. C. Bennett, J. Ryall, L. Spalteholz, and A. Gooch, BThe aestheticsof graph visualization,[ in Proc. 3rd Eurograph. Conf. Comput.Aesthetics Graph., Visual. Imag. Comput. Aesthetics, 2007,pp. 57–64.

18. C. Batini, L. Furlani, and E. Nardelli, BWhat is a good diagram?A pragmatic approach,[ in Proc. 4th Int. Conf. ER ApproachSoftw. Eng., 1985, pp. 312–319.

19. G. D. Battista, P. Eades, R. Tamassia, and I. G. Tollis, BAlgorithmsfor drawing graphs: An annotated bibliography,[ Comput.Geometry, vol. 4, no. 5, pp. 235–282, Oct. 1994.

20. G. D. Battista, P. Eades, R. Tamassia, and I. G. Tollis,Graph Drawing: Algorithms for the Visualization of Graphs.Upper Saddle River, NJ, USA: Prentice-Hall, 1998.

21. C. Ware, Information Visualization: Perception for Design.San Francisco, CA, USA: Morgan Kaufmann, 2004.

22. C. Wetherell and A. Shannon, BTidy drawings of trees,[ IEEETrans. Softw. Eng., vol. SE-5, no. 5, pp. 514–520, Sep. 1979.

23. H. Eichelberger, BNice class diagrams admit good design?[ inProc. ACM Symp. SoftVis, 2003, pp. 159–216.

24. R. Davidson and D. Harel, BDrawing graphs nicely usingsimulated annealing,[ ACM Trans. Graph., vol. 15, no. 4,pp. 301–331, Oct. 1996.

25. W. Huang, P. Eades, S.-H. Hong, and C.-C. Lin, BImprovingforce-directed graph drawings by making compromises betweenaesthetics,[ in Proc. IEEE Symp. VL/HCC, 2010, pp. 176–183.

26. W. Huang, P. Eades, S.-H. Hong, and C.-C. Lin, BImprovingmultiple aesthetics produces better graph drawings,[ J. Vis.Languages Comput., vol. 24, no. 4, pp. 262–272, Aug. 2013.

27. W. Huang, M. Huang, and C.-C. Lin, BAesthetic of angularresolution for node-link diagrams: Validation and algorithm,[ inProc. IEEE Symp. VL/HCC, 2011, pp. 213–216.

28. W. Didimo, G. Liotta, and S. A. Romeo, BTopology-drivenforce-directed algorithms,[ in Proc. 18th Int. Conf. GD, 2011,pp. 165–176.

29. P. Simonetto, D. Archambault, D. Auber, and R. Bourqui,BImPrEd: An improved force-directed algorithm that preventsnodes from crossing edges,[ in Proc. 13th Eurograph./IEEEVVGTC Conf. Vis. EuroVis, 2011, pp. 1071–1080.

30. I. Herman, G. Melançon, and M. S. Marshall, BGraph visualizationand navigation in information visualization: A survey,[ IEEETrans. Vis. Comput. Graph., vol. 6, no. 1, pp. 24–43,Jan.–Mar. 2000.

31. W. Huang, C.-C. Lin, and M. L. Huang, BAn aggregation-basedapproach to quality evaluation of graph drawings,[ in Proc. 5thInt. Symp. VINCI, 2012, pp. 110–113.

32. W. Lai and P. Eades, BRemoving edge-node intersectionsin drawings of graphs,[ Inf. Process. Lett., vol. 81, no. 2,pp. 105–110, Jan. 2002.

33. W. Li, P. Eades, and N. S. Nikolov, BUsing spring algorithms toremove node overlapping,[ in Proc. APVis, 2005, pp. 131–140.

34. K. Misue, P. Eades, W. Lai, and K. Sugiyama, BLayout adjustmentand the mental map,[ J. Vis. Languages Comput., vol. 6, no. 2,pp. 183–210, Jun. 1995.

35. P. Eades, BA heuristic for graph drawing,[ in Proc. CN, 1984,vol. 42, pp. 149–160.

36. T. M. J. Fruchterman and E. M. Reingold, BGraph drawing byforce-directed placement,[ Softw.: Practice Exp., vol. 21, no. 11,pp. 1129–1164, Nov. 1991.

37. J. Heer, S. K. Card, and J. A. Landay, BPrefuse: A toolkit forinteractive information visualization,[ in Proc. SIGCHI, 2005,pp. 421–430.

38. J. Barnes and P. Hut, BA hierarchical O(N log N) force-calculationalgorithm,[ Nature, vol. 324, no. 6096, pp. 446–449,Dec. 1986.


39. E. Gansner and Y. Hu, BEfficient node overlap removal usinga proximity stress model,[ in Proc. 16th Int. Symp. GD, 2009,pp. 206–217.

40. D. Harel and Y. Koren, BDrawing graphs with non-uniformvertices,[ in Proc. Working Conf. AVI, 2002, pp. 157–166.

41. P. Eades and W. Lai, BAlgorithms for disjoint node images,[ inProc. 15th ACSC, 1992, pp. 253–265.

42. K. Hayashi, M. Inoue, T. Masuzawa, and H. Fujiwara, BAlayout adjustment problem for disjoint rectangles preservingorthogonal order,[ Syst. Comput. Japan, vol. 33, no. 2, pp. 31–42,2002.

43. X. Huang and W. Lai, BForce-transfer: A new approach toremoving overlapping nodes in graph layout,[ in Proc. 26thACSC, 2003, pp. 349–358.

44. T. Dwyer, K. Marriott, and P. Stuckey, BFast node overlapremoval,[ in Proc. 13th Int. Symp. GD, 2006, pp. 153–164.

45. T. Dwyer, K. Marriott, and P. Stuckey, BFast node overlapremoval-correction,[ in Proc. 14th Int. Symp. GD, 2007,pp. 446–447.

46. K. Marriott, P. Stuckey, V. Tam, and W. He, BRemoving nodeoverlapping in graph layout using constrained optimization,[Constraints, vol. 8, no. 2, pp. 143–171, Apr. 2003.

47. E. Gansner and S. North, BImproved force-directed layouts,[ inProc. 6th Int. Symp. GD, 1998, pp. 364–373.

48. X. Huang, W. Lai, A. S. M. Sajeev, and J. Gao, BA new algorithmfor removing node overlapping in graph visualization,[ Inf. Sci.,vol. 177, no. 14, pp. 2821–2844, Jul. 2007.

49. K. A. Lyons, H. Meijer, and D. Rappaport, BAlgorithms forcluster busting in anchored graph drawing,[ J. Graph AlgorithmsAppl., vol. 2, no. 1, pp. 1–24, 1998.

50. T. Imamichi, Y. Arahori, J. Gim, S.-H. Hong, and H. Nagamochi,BRemoving node overlaps using multi-sphere scheme,[ in Proc.16th Int. Symp. GD, 2009, pp. 296–301.

51. M. Smith, B. Shneiderman, N. Milic-Frayling, E. M. Rodrigues,V. Barash, C. Dunne, T. Capone, A. Perer, and E. Gleave,NodeXL: A Free and Open Network Overview, Discoveryand Exploration Add-In for Excel 2007/2010/2013, SocialMedia Research Foundation. [Online]. Available: http://nodexl.codeplex.com

52. W. Huang, S.-H. Hong, and P. Eades, BPredicting graph readingperformance: A cognitive approach,[ in Proc. APVis, 2006,pp. 207–216.

53. F. Martınez, A. J. Rueda, and F. R. Feito, BA new algorithm forcomputing boolean operations on polygons,[ Comput. Geosci.,vol. 35, no. 6, pp. 1177–1185, 2009.

54. D. M. Mount, BGeometric intersection,[ in The Handbook ofDiscrete and Computational Geometry, 2nd ed. Boca Raton,FL, USA: CRC Press, 2004, pp. 857–876.

55. J. L. Moreno, Who Shall Survive? Foundations of Sociometry,Group Psychotherapy and Sociodrama. Beacon, NY, USA:Beacon House, 1953.

56. M. K. Coleman and D. S. Parker, BAesthetics-based graph layoutfor human consumption,[ Softw.: Practice Experience, vol. 26,no. 12, pp. 1415–1438, Dec. 1996.

57. P. Eades and K. Sugiyama, BHow to draw a directed graph,[J. Inf. Process., vol. 13, no. 4, pp. 424–437, 1990.

58. P. Mutzel, BAn alternative method to crossing minimization onhierarchical graphs,[ SIAM J. Optim., vol. 11, no. 4, pp. 318–333,1997.

59. K. Sugiyama, S. Tagawa, and M. Toda, BMethods for visualunderstanding of hierarchical system structures,[ IEEE Trans.Syst., Man Cybern., vol. 11, no. 2, pp. 109–125, Feb. 1981.

60. H. C. Purchase, BWhich aesthetic has the greatest effect on humanunderstanding?[ in Proc. 5th Int. Symp. GD, 1997, pp. 248–261.

61. H. C. Purchase, BThe effects of graph layout,[ in Proc. OZCHI,1998, pp. 80–86.

62. H. C. Purchase, R. F. Cohen, and M. James, BValidatinggraph drawing aesthetics,[ in Proc. 3rd Int. Symp. GD, 1996,pp. 435–446.

63. H. C. Purchase, D. Carrington, and J.-A. Allder, BEmpiricalevaluation of aesthetics-based graph layout,[ Empirical Softw.Eng., vol. 7, no. 3, pp. 233–255, 2002.

64. C. Ware, H. C. Purchase, L. Colpoys, and M. McGill,BCognitive measurements of graph aesthetics,[ Inf. Vis.,vol. 1, no. 2, pp. 103–110, Jun. 2002.

65. C. Korner and D. Albert, BSpeed of comprehension of visualizedordered sets,[ J. Experimental Psychol.: Appl., vol. 8, no. 1,pp. 57–71, Mar. 2002.

66. H. C. Purchase, J.-A. Allder, and D. Carrington, BGraphlayout aesthetics in UML diagrams: User preferences,[J. Graph Algorithms Appl., vol. 6, no. 3, pp. 255–279,Jun. 2002.

67. W. Huang, S.-H. Hong, and P. Eades, BHow people readsociograms: A questionnaire study,[ in Proc. APVis, 2006,pp. 199–206.

68. F. van Ham and B. E. Rogowitz, BPerceptual organization inuser-generated graph layouts,[ IEEE Trans. Vis. Comput. Graph.,vol. 14, no. 6, pp. 1333–1339, Nov./Dec. 2008.

69. W. Huang, S.-H. Hong, and P. Eades, BLayout effects: Comparisonof sociogram drawing conventions,[ Univ. Sydney, New SouthWales, Australia, Tech. Rep. 575, 2005.

70. W. Huang, S.-H. Hong, and P. Eades, BLayout effects onsociogram perception,[ in Proc. 13th Int. Symp. GD, 2006,pp. 262–273.

71. W. Huang, S.-H. Hong, and P. Eades, BEffects of sociogramdrawing conventions and edge crossings in social networkvisualizations,[ J. Graph Algorithms Appl., vol. 11, no. 2,pp. 397–429, Mar. 2007.

72. W. Huang, BAn eye tracking study into the effects of graphlayout,[ Univ. Sydney, New South Wales, Australia, Tech.Rep., 2006.

73. W. Huang, BUsing eye tracking to investigate graph layouteffects,[ in Proc. APVis, 2007, pp. 97–100.

74. W. Huang, P. Eades, and S.-H. Hong, BBeyond time and error:A cognitive approach to the evaluation of graph drawings,[ inProc. Conf. BEyond Time Errors: Novel Eval. Methods Inf.Vis. BELIV, 2008, pp. 1–8.

75. W. Huang, BBeyond time and error: A cognitive approach tothe evaluation of graph visualizations,[ Ph.D. dissertation,Univ. Sydney, New South Wales, Australia, 2007.

76. C. Korner, BSequential processing in comprehension ofhierarchical graphs,[ Appl. Cognitive Psychol., vol. 18, no. 4,pp. 467–480, 2004.

77. K. Mulmuley, BA fast planar partition algorithm, II,[ J. ACM,vol. 38, no. 1, pp. 74–103, Jan. 1991.

78. W. Huang and M. Huang, BExploring the relative importanceof crossing number and crossing angle,[ in Proc. 3rd Int.Symp. VINCI, 2010, pp. 10:1–10:8.

79. W. Huang, BEstablishing aesthetics based on human graphreading behavior: Two eye tracking studies,[ Pers. UbiquitousComput., vol. 17, no. 1, pp. 93–105, Jan. 2013.

80. W. Huang and P. Eades, BHow people read graphs,[ in Proc.APVis, 2005, pp. 51–58.

81. W. Huang, S.-H. Hong, and P. Eades, BEffects of crossing angles,[in Proc. IEEE PacificVIS, 2008, pp. 41–46.

82. M. Formann, T. Hagerup, J. Haralambides, M. Kaufmann,F. T. Leighton, A. Symvonis, E. Welzl, and G. J. Woeginger,BDrawing graphs in the plane with high resolution,[SIAM J. Comput., vol. 22, no. 5, pp. 1035–1052,Oct. 1993.

83. B. Finkel and R. Tamassia, BCurvilinear graph drawing usingthe force-directed method,[ in Proc. 12th Int. Symp. GD,2005, vol. 3383/2005, pp. 448–453.

84. S. Chaturvedi, C. Dunne, Z. Ashktorab, R. Zacharia, andB. Shneiderman, BGroup-in-a-Box meta-layouts for topologicalclusters and attribute-based groups: Space efficient visualizationsof network communities and their ties,[ Comput. Graph. Forum,vol. 33, no. 8, pp. 52–68, Dec. 2014.

85. A. Noack, BAn energy model for visual graph clustering,[ in Proc.11th Int. Symp. GD, 2004, pp. 425–436.

86. E. M. Rodrigues, N. Milic-Frayling, M. Smith, B. Shneiderman,and D. Hansen, BGroup-in-a-Box layout for multi-facetedanalysis of communities,[ in Proc. IEEE 3rd Int. Conf.SocialCom, 2011, pp. 354–361.


87. B. Shneiderman and C. Dunne, BInteractive network explorationto derive insights: Filtering, clustering, grouping, simplification,[in Proc. 20th Int. Symp. GD, 2012, pp. 2–18.

88. A. Perer and B. Shneiderman, BBalancing systematic andflexible exploration of social networks,[ IEEE Trans. Vis.Comput. Graph., vol. 12, no. 5, pp. 693–700, Sep. 2006.

89. R. L. Graham, BAn efficient algorithm for determining the convexhull of a finite planar set,[ Inf. Process. Lett., vol. 1, no. 4,pp. 132–133, 1972.

90. D. P. Dobkin and D. G. Kirkpatrick, BFast detectionof polyhedral intersection,[ Theoretical Comput. Sci.,vol. 27, no. 3, pp. 241–253, 1983.

91. C. Dunne and B. Shneiderman, BMotif simplification: Improvingnetwork visualization readability with fan, connector, cliqueglyphs,[ in Proc. SIGCHI, 2013, pp. 3247–3256.

92. M. Smith, B. Shneiderman, N. Milic-Frayling, E. M. Rodrigues,V. Barash, C. Dunne, T. Capone, A. Perer, and E. Gleave,(2013) NodeXLGraph Gallery, Social Media Research Foundation.[Online]. Available: http://nodexlgraphgallery.org/

93. D. Harel and Y. Koren, BA fast multi-scale method for drawinglarge graphs,[ J. Graph Algorithms Appl., vol. 6, no. 3, pp. 179–202,2002.

94. A. Clauset, M. E. J. Newman, and C. Moore, BFinding communitystructure in very large networks,[ Phys. Rev. E: Statist., Nonlinear,Soft Matter Phys., vol. 70, Dec. 2004, Art. ID. 066111.

95. S. Ghani, N. Elmqvist, and J. S. Yi, BPerception of animatednode-link diagrams for dynamic graphs,[ Comput. Graph. Forum,vol. 31, no. 3, pp. 1205–1214, Jun. 2012.

96. W. Huang, P. Eades, and S.-H. Hong, BMeasuring effectivenessof graph visualizations: A cognitive load perspective,[ Inf. Vis.,vol. 8, no. 3, pp. 139–152, Jun. 2009.

Received May 19, 2014; accepted for publication June 9, 2014

Cody Dunne IBM Watson Group, Cognitive Visualization Lab,Cambridge, MA 02142 USA ([email protected]). Dr. Dunne is aResearch Staff Member focusing on information visualization, visualanalytics, and cognitive computing. He has worked on improvingthe readability of network visualizations and the application of networkanalysis techniques to real-world problems. Some examples includevisualizing citations in academic literature, interactions of peopleand organizations, relationships in archaeological dig sites, termco-occurrence, thesaurus category relationships, and computer networktraffic flow. He contributes to the NodeXL project, an open sourcenetwork visualization template for Microsoft Excel**. Dr. Dunnereceived his Ph.D. and M.S. degrees in computer science underBen Shneiderman at the University of Maryland Human ComputerInteraction Lab in 2013 and 2009, respectively. He earned aB.A. degree in computer science and mathematics from CornellCollege in 2007.

Steven I. Ross IBM Watson Group, Cognitive Visualization Lab,Cambridge, MA 02142 USA ([email protected]). Mr. Ross isa Senior Technical Staff member of the Cognitive Visualization Lab.He received S.B. and S.M. degrees in computer science from MITin 1980 and 1982, respectively. Mr. Ross was previously a member ofthe IBM Research Division, where he worked on collective intelligence,collaboration, and conversational interaction research. Before joiningIBM Research, he was an architect on the Lotus* 1-2-3 spreadsheet,a founder of Reasonix, an automated reasoning startup, and leaderof the software engineering team at Verbex that developed the firstcommercial continuous speech recognition product.

Ben Shneiderman Department of Computer Science, Universityof Maryland, College Park, MD 20742 USA ([email protected]).Prof. Shneiderman is a Distinguished University Professor in theDepartment of Computer Science, Member of the Institute forAdvanced Computer Studies, and Founding Director (1983–2000)

of the Human-Computer Interaction Laboratory (www.cs.umd.edu/hcil)at the University of Maryland. He received a B.S. degree in math/physics from the City College of New York in 1968 and M.S. andPh.D. degrees in computer science from the State University ofNew York at Stony Brook in 1972 and 1973, respectively.Prof. Shneiderman is a coauthor with Catherine Plaisant of Designingthe User Interface: Strategies for Effective Human-ComputerInteraction (5th edition, 2010). His latest book, with Derek Hansenand Marc Smith, is Analyzing Social Media Networks with NodeXL(www.codeplex.com/nodexl, 2010). He is a Fellow of the AAAS, theAssociation for Computing Machinery (ACM), and the Institute ofElectrical and Electronics Engineers (IEEE), and a Member of theNational Academy of Engineering, in recognition of his pioneeringcontributions to human-computer interaction and informationvisualization.

Mauro Martino IBM Watson Group, Cognitive Visualization Lab,Cambridge, MA 02142 USA ([email protected]). Dr. Martinoleads the Cognitive Visualization Lab and is an artist, designer, andresearcher focusing on the representation of networks of humaninteractions (www.mamartino.com). He was formerly an AssistantResearch Professor at Northeastern University working withAlbert-Laszlo Barabasi at Center for Complex Network Researchand with David Lazer and Fellows at the Institute for QuantitativeSocial Science (IQSS) at Harvard University. Previously, he wasa research affiliate with the SENSEable City Lab at MassachusettsInstitute of Technology (MIT) in Cambridge. His projects have beenshown at international festivals including Ars Electronica, TEDx,and Art Galleries including The Serpentine Gallery (London),GAFTA (San Francisco). His work has been featured on the coverof Nature and Proceedings of the National Academy of Sciences,as well as Nature Communication, Nature Physics, PopularScience, The Economist, The Financial Times, WIRED Magazine,The Guardian, BBC News, MIT News, and Harvard News.


readability metric feedback for aiding node-link ... · node-and-edge readability metrics to...

Documents