
Prefinal Draft Version of: -1- Trier, M. (2008): Towards Dynamic Visualization for Understanding Evolution of Digital Communication Networks. Accepted for publication in Information Systems Research, to appear in 2008.

Towards Dynamic Visualization for Understanding

Evolution of Digital Communication Networks

Matthias Trier

TU Berlin, Franklinstrasse 28/29,

10587 Berlin, Germany

[email protected]

Abstract. The capabilities offered by digital communication are leading to the evolution of new network

structures that are grounded in communication patterns. As these structures are significant for organizations,

much research has been devoted to understanding network dynamics in ongoing processes of electronic

communication. A valuable method for this objective is Social Network Analysis. However, its current focus on

quantifying and interpreting aggregated static relationship structures suffers from some limitations for the domain

of analyzing online communication with high volatility and massive exchange of timed messages. To overcome

these limitations, this paper presents a method for event-based dynamic network visualization and analysis

together with its exploratory social network intelligence software Commetrix. Based on longitudinal data of

corporate e-mail communication, the paper demonstrates how exploration of animated graphs combined with

measuring temporal network changes identifies measurement artifacts of static network analysis, describes

community formation processes and network lifecycles, bridges actor level with network level analysis by

analyzing structural impact of actor activities, and measures how network structures react to external events. The

methods and findings improve our understanding of dynamic phenomena in online communication and motivate

novel metrics that complement Social Network Analysis.

1 Introduction

Electronic media are becoming one of the main means for interaction in the workplace (Fallows,

2002). In addition to changing personal social behavior (e.g. Kraut et al., 2002), these means of

computer-mediated communication affect organizational structures. For example, e-mail has

been shown to complement formal work networks and provide more diverse, participative and

less formally aligned relations (Bikson and Eveland, 1990). In effect the capabilities offered by

digital communication networks are leading to the evolution of new network structures that are

grounded in communication patterns. Such structures are evident, for example, in

the growth of online communities. These are defined as groups of people interacting in a virtual

environment with a purpose, supported by technology, and guided by norms and policies

(Preece, 2000). Such communities are of considerable significance for the corporation as

organizational network structures are knowledge intensive and can constantly adapt their

connection patterns (Monge and Contractor, 2003, p.325).

Contrary to conventional wisdom, in such virtual networks relationships and attachments are

developed and maintained (e.g. Cho et al., 2005). In a shared organizational context, the reduced

social overhead of contacting unacquainted people even allows information flows between

people that have never met face-to-face (Garton et al., 1997). Despite this virtual means of

communicating and the large size of the participating groups, Berge and Collins (2000) found

that most actors still have the perception of community.

The formation of social network structures via interaction of people over time (Krackhardt,

1991) renders communication structures and online communities an object of systematic

research with Social Network Analysis (SNA; e.g. Wellman et al., 1996; Garton et al., 1997). Its

explicit focus on quantitatively analyzing interdependent patterns of social relationships

differentiates SNA from traditional statistics and data analysis (Wasserman and Faust, 1994,

p.3). The analytical approach uses network graph visualization extensively to represent, describe,

and analyze communication matrices of interrelated actors.

However, in the context of describing and explaining evolving relationships within online

communication networks, SNA has an important methodological limitation: “almost all

SNA research is static and cross-sectional rather than dynamic” (Monge and Contractor, 2003,

p.325). This neglects the dynamic nature of social relationships (Emirbayer, 1997), and inherent

formation processes cannot be analyzed. In fact, the sampling method of SNA usually aggregates

the wealth of longitudinal communication data into a single cumulative social network structure.

The resulting analysis can be misleading when temporal and structural change is an inherent

network property, as with online communication networks with their complex processes of

community formation based on massive timed message events. Further, SNA researchers

frequently generate lists of central actors without knowing how important persons reached their

position or whether their status is already declining. Another important drawback is the predominance

of static network images for visual representation and interpretation of structural properties. Such

graphs cannot represent network change (Moody et al., 2005, p. 1207).
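
The aggregation problem described above can be illustrated with a minimal sketch. The event list and actor names are hypothetical and not taken from the paper's dataset; the point is only that two actors with identical degree in the cumulative network can have opposite trajectories once the same measure is computed per period.

```python
from collections import defaultdict

# Hypothetical timed message events: (period, sender, recipient).
events = [
    (1, "ann", "bob"), (1, "ann", "cat"), (1, "ann", "dan"),
    (2, "ann", "bob"), (2, "eve", "cat"),
    (3, "eve", "bob"), (3, "eve", "cat"), (3, "eve", "dan"),
]

def contacts_per_actor(event_list):
    """Map each actor to the number of distinct partners (undirected degree)."""
    partners = defaultdict(set)
    for _, sender, recipient in event_list:
        partners[sender].add(recipient)
        partners[recipient].add(sender)
    return {actor: len(p) for actor, p in partners.items()}

# Static view: all events collapsed into one cumulative network.
aggregate = contacts_per_actor(events)

# Dynamic view: the same measure computed separately for each period.
by_period = {
    period: contacts_per_actor([e for e in events if e[0] == period])
    for period in sorted({e[0] for e in events})
}

print("aggregate:", aggregate["ann"], aggregate["eve"])
for period, degrees in by_period.items():
    print(period, degrees.get("ann", 0), degrees.get("eve", 0))
```

In this toy data both actors end with the same aggregate degree, although one is active only early and the other only late, which is the kind of information a single cumulative network discards.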

To improve existing research methods and to create new insights about the dynamic properties

of online social networks, this paper presents an approach that disaggregates relationships into

their constituting events and suggests event-based dynamic network analysis. The introduced

method has also been implemented in the associated exploratory social network intelligence

software Commetrix (cf. Trier, 2004; Trier, 2005). Based on the notion that visualization of

information is the appropriate way to amplify cognition in complex domains (Card et al., 1999),

and that SNA can be augmented by improving current static visualizations (also cf. Moody et al.,

2005), the approach is to utilize current advances in information visualization to extend

perceptional and analytical inferences about large amounts of dynamic network data. The

individual streaming events are retained together with their time stamps for a more accurate

dynamic visualization and measurement. The software implementation and especially its

visualization are regarded as an important cornerstone that enables exploratory observation of

dynamic network evolution.

The proposed event-based approach is a promising foundation for complementing existing

SNA methods. Examples of its extensions include the analysis of group formation and

stabilization over time, of actor paths to central positions, or of process-oriented activity patterns

with a structural impact on the network. Explicit recognition of relational events is further able to

capture the growth of relationships and the network’s reaction to external events. Generally, the

method provides multiple integrated levels of analysis by linking actor attributes (e.g. types),

actors’ activity patterns, and the resulting impact on general network structures.

The broad research objective of this paper is to illustrate the advantages offered by the

proposed event-based approach to dynamic network analysis in improving understanding of

evolving processes of online communication networks. Specifically, it addresses the following

research questions:

1) How can longitudinal network analysis overcome limitations and measurement artifacts of

summative pictures provided by static SNA? How volatile is the formation of an online

communication network and its actors’ positions?

2) What processes of general network and subgroup formation can be observed and described

with event-based visualization and analysis?

3) How can event-based dynamic network analysis evaluate actor activity, i.e. the structural

impact of actors who actively broker and integrate separate parts of the corporate network?

Which organizational positions have such actors?

4) What is the impact of external events on the network structure and its levels of change?

The paper begins with a brief introduction to Social Network Analysis followed by a discussion

of the main shortcomings of its aggregated data and visualization model. Related research is then

summarized to subsequently present the method of event-based dynamic network analysis and

the associated software Commetrix for visualizing and analyzing the dynamics of evolving

online communication networks. The suggested approach is applied to study the dynamics of the

corporate e-mail communication network of Enron Corporation.

2 SNA concepts and their shortcomings for dynamic analysis

The methodological body of Social Network Analysis (SNA) is frequently applied to observe

and analyze online social networks (e.g. Garton et al., 1997; Cho et al., 2005). SNA typically

builds a network of actors as nodes and their mutual relationships as ties. An overview of typical

measures of SNA is provided in Table 1. These measures include composition variables, i.e. the

number and properties of actors, or structural variables, i.e. the properties of relationships. In an

online communication context, a relationship can be derived by counting exchanged messages.

Relationship strength differs across communication media. For example, compared to e-mail,

instant messages have much higher frequencies of interaction. However, in relative terms, strong

and weak relationships can be identified for a defined technology of electronic communication.

Actors who maintain strong ties are more likely to share the resources they have (Wellman and

Wortley, 1990).
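
As a rough illustration of deriving relationships by counting exchanged messages, the following sketch builds tie strengths from a hypothetical message log. The mean-based cut-off between strong and weak ties is an assumption made for this example only; the text merely states that strength is relative to the medium.

```python
from collections import Counter

# Hypothetical message log: (sender, recipient) pairs.
messages = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"), ("ann", "bob"),
    ("ann", "cat"), ("cat", "dan"), ("dan", "cat"),
]

# Tie strength: number of messages exchanged by an unordered actor pair.
strength = Counter(frozenset(pair) for pair in messages)

# "Strong" is relative to the medium: here, at or above the mean count.
# The mean is an arbitrary illustrative threshold, not taken from the paper.
mean_strength = sum(strength.values()) / len(strength)
strong = {tuple(sorted(p)) for p, n in strength.items() if n >= mean_strength}
weak = {tuple(sorted(p)) for p, n in strength.items() if n < mean_strength}

print("strong ties:", strong)
print("weak ties:", weak)
```

A relative threshold of this kind would let the same procedure be applied to e-mail and to instant messaging, where absolute message counts differ widely.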

Another basic property is network size (cf. Table 1). Larger social networks tend to have more

heterogeneity in their social characteristics and more complexity in their structure (Wellman and

Potter, 1997). Large heterogeneous networks (such as those often found online) are more likely

to exhibit weak ties to different social circles which are beneficial for obtaining more diverse

information (Granovetter, 1973; Garton et al., 1997). A further important property often studied

in network analysis is the centrality of selected actors (cf. Table 1). It has been identified as an

indicator of satisfaction or importance of actors within a network (e.g. Brass, 1984).

Although these measures and roles provide elaborated methods to analyze networks, they

concentrate on structural issues. The snapshot of the final network does not describe how central

actors achieved their final positions or whether the network or its clusters experience stability or decay.

Tab. 1. Overview of structural SNA measures and network roles (for formalized definitions cf. Wasserman and Faust, 1994).

Network Size: Number of nodes in a network, e.g. participating actors.

Relationship Strength, Tie Strength: The strength of the relationship between two actors. It can indicate the frequency of interactions (daily, monthly), count actual interactions, or measure intensity of relationships. In the communication context, relationship strength is increased via timed events in the form of initiated and received messages.

Degree (Activity vs. Prominence): The number of adjacent contacts a node has, e.g. e-mail communication partners. If the direction of the events is contained in the dataset, activity (out-degree) measures the relationship-forming events initiated by the observed actor, e.g. establishing the contact, referring to another author's work, sending messages etc. Prominence (in-degree) measures the events initiated by actors adjacent to the observed node.

Diameter: Longest shortest path (distance in terms of steps) between two nodes in the network, e.g. the longest process (in terms of steps) of forwarding a mail in a network from one side of the network to the other. The larger the diameter, the less likely is the arrival of information on the other end of the network.

Density: Connectedness of the network's nodes. Number of pairwise connections realized between n nodes of a network divided by the number of theoretically possible relationships between those n nodes. Communication networks usually have a low density (sparse network) as not all actors are connected to all others.

Clustering-Coefficient: Measure of sub-group formation and of the density of an ego-network. The number of links between the direct contacts of an observed ego-node divided by the theoretically possible links between its direct contacts. In communication networks, this shows if contacts of an actor tend to share information directly (transitivity).

Centrality Betweenness: Measure of communication control. Number of shortest paths between pairs of nodes which run through the observed node. In an e-mail network this could be the person who forwards important messages and thus is important for the information transfer between pairs of actors. This can be an important network position but is also critical for information transfer in a communication setting.

Centrality Closeness: Distance of a node to all other nodes in the network measured with average shortest path length. In a digital network this measure indicates how fast or efficiently an actor can access the network and how likely it is that information reaches him.

Centrality Degree: A simple centrality measure, counting the relative share of contacts of a node.

Reciprocity: Symmetry of relationships. If there is a relationship from node A to node B and vice versa, then this relationship is called reciprocal. In online communication settings, it can also be a weighting of the links from A to B versus the links from B to A.

Broker role (Gatekeeper): Network position which is located on an exclusive path between two cliques or subcomponents. If removed, adjacent subcomponents get disconnected. Brokers thus control the flow between sections of the network. They tend to have a high betweenness.

Hub role: A hub is a central actor (i.e. with a high degree). Many messages pass this position.

Isolate role: An isolate has a degree of zero and has thus no relationships to others in the network.

Transmitter, Receiver, Carrier role: Transmitters have an in-degree of 0 and an out-degree above 0. They have only sent messages to the network but did not receive any. Receivers have an out-degree of 0 and an in-degree above 0. Carriers have in-degrees and out-degrees above 0 (normal case) and thus received and transmitted information between other nodes.

Pulsetaker role: A pulse taker has a small degree but connects to nodes with a high degree (e.g. hubs). The quotient between indirect links and direct links is high. This can be an efficient position as most information is likely to arrive without the need to maintain many contacts.
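
Several of the measures and roles in Table 1 can be computed directly from a list of directed ties. The sketch below is illustrative only: the actors and ties are hypothetical, density is computed on the undirected version of the network, and reciprocity is taken as the share of directed ties that are answered in the opposite direction, which is one of several possible operationalizations.

```python
# Hypothetical directed message ties (sender, recipient); "fay" never communicates.
actors = ["ann", "bob", "cat", "dan", "fay"]
ties = {("ann", "bob"), ("bob", "ann"), ("bob", "cat"), ("dan", "bob")}

size = len(actors)

# Density on the undirected network: realized pairs / possible pairs.
pairs = {frozenset(t) for t in ties}
density = len(pairs) / (size * (size - 1) / 2)

# Activity (out-degree) and prominence (in-degree) per actor.
out_degree = {a: sum(1 for s, _ in ties if s == a) for a in actors}
in_degree = {a: sum(1 for _, r in ties if r == a) for a in actors}

# Share of directed ties that are answered by a tie in the opposite direction.
reciprocity = sum(1 for s, r in ties if (r, s) in ties) / len(ties)

def role(actor):
    """Classify an actor by in- and out-degree as described in Table 1."""
    i, o = in_degree[actor], out_degree[actor]
    if i == 0 and o == 0:
        return "isolate"
    if i == 0:
        return "transmitter"
    if o == 0:
        return "receiver"
    return "carrier"

roles = {a: role(a) for a in actors}
print(size, density, reciprocity, roles)
```

All of these values describe one frozen state of the network; none of them indicates when a tie was formed or in which order the ties appeared.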

The structure of recent changes remains invisible and unexplored as does the shift of central

positions between nodes. Centrality measures alone do not convey whether a central position is

beneficial for the network evolution or if it is a critical weak point. The actual activity of actors

and their impact on the lifecycle of the community cannot be observed.

Such gaps in recognizing dynamic processes have long been criticized by researchers:

“Models of structure are not sufficient unto themselves. Eventually one must be able to show

how concrete social processes and individual manipulations shape and are shaped by structure”

(White et al. 1976, p. 773; also cf. Emirbayer, 1997). According to Doreian and Stokman (1996),

studying network processes therefore requires the use of time, i.e. temporally ordered

information in addition to descriptions of network structures as summarized information.

Empirical analysis of social network change started with the collection of small numbers of

separate waves of relationship data with a primary focus on aggregated interim states of a

network (e.g. Hammer, 1980; Freeman, 1984; a comprehensive overview is given in Doreian and

Stokman, 1996, p.6). These methods are limited to comparative studies of general differences

between these states on the aggregated network level. The actual sequence of activities is lost

and changes in the relationship pattern can average out between two points of observation.

Hence, such comparative analysis may be employed in domains with little temporal change (e.g.

kinship networks) but seems inappropriate for studying fast-paced online communication.

An approach that takes some repeatedly collected waves of relationship data as input and

estimates the existence of certain dynamic effects in a network is the stochastic actor-driven

model (e.g. Snijders, 2001). It is based on simulating Markov chains of networks between

consecutive observations and assumes that actors analyze their current embeddedness in a

network structure to change their links according to a pre-defined value function. It contains

factors expressing theoretic network effects (e.g. maximization of reciprocity or similarity

among actors). Each factor has a parameter that can be estimated based on the waves of

empirical data. This approach is close to another approach that employs probabilistic ties and

uses a multi-agent based simulation model to predict network behavior (Carley, 2003).

Such studies typically computed general variables at the network level using only a few waves

of aggregated data (also cf. Moody et al., 2005), and did not relate structural change directly to

time units. Thus the notion of pace or fluctuation of the network is not addressed. In terms of

insightful visual representation, the studies mainly rely on line graphs with one or more variables

(e.g. transitivity, reciprocity, density, and centrality) over a time-axis. Despite the key role of

imagery in network research (Freeman, 2000), the above approaches do not exploit dynamic

visualization to leverage the analysis. Other approaches in the field of visualization of dynamic

networks do so; these are briefly discussed next.

3 Related approaches in Visualizing Dynamic Social Networks

Since the beginning of graph theoretic analysis, there has been a slow but continuous evolution of

technical approaches to social network visualization, culminating in the creation of advanced

tools to measure and visualize networks. Until recently, these visualizations simply compared

graphs of the cumulative network states at different times. A related strand of research, not

directly focused on the quantitative analysis of relationship structures, developed rich and

animated representations of online social spaces of electronic communication. Further, software

libraries for dynamic graph drawing have been recently introduced. Finally, approaches that

explicitly discuss and target dynamic network visualization and analysis of continuous (streams

of) data with high sampling rates have begun to emerge. These related concepts are now

elaborated in more detail and discussed in relation to the presented Commetrix approach of

visualizing event-based social network data.

The static social network graph was first introduced by Moreno in 1934. This “sociogram”

contained actors as nodes and their relationships as links between the nodes. Since its invention,

changes have occurred only in the technical methods used to produce the graphs. By now, powerful

software tools for semi-automated analysis and visualization of large network structures have

been developed (for a comprehensive overview see Freeman, 2000). Examples of current analytical

software packages are Ucinet (Borgatti, Everett & Freeman, 1992) or Pajek (Batagelj and Mrvar,

1998). They usually import formatted data files and provide sophisticated statistical analysis.

They further can generate structural network graphs, which can then be exported as images or 3D

models. Although Pajek recently introduced means to define in which time periods nodes or

links were present in order to compute partial networks, such tools are based on data about

aggregated structures and do not automatically capture, evaluate or animate dynamic data and

events from communication sources.

An alternative family of approaches comes from visualizations of online social spaces of

electronic communication. They suggest various intuitive metaphors to represent online social

activity, e.g. graphical tree-like hierarchies of postings (e.g. Smith and Fiore, 2001), a garden

with flower petals, or a tree with leaves to convey the ‘health’ of the electronic group (e.g.

Girgensohn et al., 2003). This has also resulted in the formulation of the concept of Social

Translucency, as “an approach to designing digital systems that emphasizes making social

information visible within the system” (Erickson and Kellogg, 2000). This family of approaches

was the first to employ motion for insightful and ‘living’ virtual representations of changes in the

conversation. However, compared to event-based network analysis, those concepts were

developed to aid the user in visually navigating online spaces. They neither provide a

quantitative network analysis of the displayed dynamic structures nor explicitly focus on

relationships.

A further related development is the advancement of general graph drawing packages. One

example is Graphviz of AT&T Labs Research (Ellson et al., 2004). As an open source graph

visualization package, it is a collection of software for viewing and manipulating abstract graphs

in software engineering, networking, databases, knowledge representation, and

bioinformatics. All early algorithms of Graphviz concentrated on static layouts, until Dynagraph

was introduced in 2004, which includes algorithms that “maintain a model graph with layout

information, and accept a sequence of insert, modify or delete subgraph requests, with the

subgraphs specifying the nodes and edges involved” (Ellson et al., 2004, p.14). The focus,

though, is on interactive editors for general graph drawing with applicable technical layout

concepts and software libraries to dynamically update a graph view. The libraries include no

network analytical approach or perspective and are not focused on social networks.

There are three contemporary approaches that, similar to the method presented in this paper,

work on the actual integration of Social Network Analysis and changing graphs. Perer and

Shneiderman (2006) introduced an approach that includes some functions to trace changes in

network data by hiding links outside a selected moveable time window. Nodes maintain a fixed

position based on the final network configuration. This mode has been termed flipbook by

Moody et al. (2005, p.1234) as it is a static technique that reveals how a network structure

unfolds over time based on interactions. However, the lack of dynamic repositioning of nodes

yields interim networks with uninformative layouts. For example, nodes with an early but weak

relationship would eventually be placed far apart, but early in the sequence would better be

positioned near each other and then move apart to slowly give room for later but stronger

relationships between them. It is hence less suitable for recognizing cluster formation or sudden

changes in actors’ network positions.
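
The flipbook mode described above can be sketched as a time-window filter over timed ties with node coordinates held fixed. This is a generic illustration, not the implementation of Perer and Shneiderman (2006); the data, the coordinates, and the helper name visible_ties are all hypothetical.

```python
# Hypothetical timed ties: (timestamp, actor_a, actor_b).
timed_ties = [
    (1, "ann", "bob"), (2, "bob", "cat"), (5, "cat", "dan"), (6, "ann", "dan"),
]

# Node coordinates stay fixed, e.g. taken from a layout of the final network.
# The values below are made up for illustration.
fixed_positions = {"ann": (0, 0), "bob": (1, 0), "cat": (1, 1), "dan": (0, 1)}

def visible_ties(ties, window_start, window_length):
    """Return the ties whose timestamp lies in [start, start + length)."""
    window_end = window_start + window_length
    return [(a, b) for t, a, b in ties if window_start <= t < window_end]

# Moving the window forward yields one "page" of the flipbook per step.
frames = {start: visible_ties(timed_ties, start, 3) for start in (1, 4)}

for start, ties_in_window in frames.items():
    print(start, ties_in_window)
```

A renderer would draw every frame at the coordinates in fixed_positions, so an early frame inherits a layout that was optimized for ties it does not yet contain.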

Beyond this flipbook technique, the two more advanced approaches of dynamic network

visualization by Gloor et al. (2004) and Moody et al. (2005) try to represent structural change as

motion in a social network graph. Both segment longitudinal data into subsequent time windows

and render their individual network graphs, which are then visualized as an animated sequence.

To provide visual consistency for the changing node locations, positional transitions are

computed between subsequent visualization frames. However, the suggested techniques based on

transitions between time frames produce much unnecessary node movement that results in many

crossings or long edges in the dynamic layout. This is likely to decrease readability for datasets

larger than 50 to 100 nodes due to much simultaneous motion.
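
A positional transition between two consecutive frames can be approximated by linear interpolation of node coordinates. The sketch below is an assumption about the general idea, not the algorithm used by Gloor et al. (2004) or Moody et al. (2005); the function name and the layouts are hypothetical.

```python
def interpolate_layout(start, end, steps):
    """Yield intermediate layouts that move each node linearly from start to end.

    Nodes present in only one of the two layouts are skipped here; a real tool
    would need a policy for fading them in or out.
    """
    shared = start.keys() & end.keys()
    for step in range(1, steps + 1):
        f = step / steps
        yield {
            node: (
                start[node][0] + f * (end[node][0] - start[node][0]),
                start[node][1] + f * (end[node][1] - start[node][1]),
            )
            for node in shared
        }

# Hypothetical layouts of two consecutive time windows.
frame_a = {"ann": (0.0, 0.0), "bob": (4.0, 0.0)}
frame_b = {"ann": (2.0, 2.0), "bob": (4.0, 4.0)}

transition = list(interpolate_layout(frame_a, frame_b, steps=4))
for layout in transition:
    print(layout)
```

Because each frame is laid out independently before interpolation, every node may move in every transition, which is the source of the simultaneous motion criticized above.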

A further obstacle to dynamic network research is that these software tools provide extensions

to visualize network data but lack a direct integration with functionality to compute SNA metrics

for selected network sections. The user interface still exhibits much potential for improving

exploratory analysis and in-depth quantitative insights of the visualized networks or for

manipulations of the dataset (e.g. filtering out a subset). On the other hand, the approach of Perer

and Shneiderman (2006) is focused on easy exploration but does not fully exploit the

opportunities for dynamic visualization. All employed animation algorithms also have potential

for improvement and enrichment to better convey changing properties of actors and relationships

over time.

In summary, conventional SNA methods have developed comparative analysis and stochastic

parameter estimations but are lacking in advanced visualization capabilities for observation and

verification. Only a few recent approaches have started to develop visual means for observing

change in social networks, but they do not study the impact of activities or external events on the

final network structure. Extant visualization techniques still suffer from some limitations and are

not comprehensively connected to exploratory network measurement. Without such integration,

novel measures that better capture network dynamics remain unattainable.

4 A Methodology for Dynamic Visualization and Measurement Approaches

Commetrix is a Java-based tool constructed for event-based dynamic network analysis and

attempts to address the limitations of current approaches. The development of this tool started at

about the same time as the above related approaches (cf. Trier 2004, 2005) and has yielded a

comprehensive set of software-based methods for exploratory static and dynamic visualization

with integrated analysis of social network measures.

The underlying framework for event-based dynamic network analysis consists of a data model

that contains information about the network including the timing of network events. Integrated

with that is a sophisticated visualization technique based on a 2D/3D spring embedder (cf.

Fruchterman and Reingold, 1991) that allows network elements to be added to and deleted from a

graph representation. Finally, a special method for smooth graph transitions has been developed.

First, the fast-paced communication data needs to be sampled and stored in a data model for

systematic analysis. Conventional SNA datasets are based on a graph G = (N, L) which consists

of a finite set of nodes N and a finite set of lines L that are constituted by pairs (n_i, n_j) of nodes

(Wasserman and Faust, 1994, p.122). If nodes represent actors and edges represent relationships,

such a graph is also referred to as a sociogram. The respective matrix which stores relationships

between each pair of actors is called a sociomatrix. The event-based approach now implies


several changes for storing the network data.
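
For comparison with the event-based storage, the conventional representation can be sketched in a few lines of Python. This is only an illustration: the actor names and the helper `sociomatrix` are hypothetical, and the graph is assumed to be undirected and dichotomous.

```python
def sociomatrix(nodes, lines):
    """Return a symmetric 0/1 matrix for an undirected graph G = (N, L)."""
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in lines:
        # Each line (n_i, n_j) marks a tie in both directions.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


N = ["A", "B", "C"]            # hypothetical actors
L = [("A", "B"), ("B", "C")]   # hypothetical relationships
X = sociomatrix(N, L)
```

Such a matrix holds one aggregated value per pair of actors and therefore carries no information about when a tie was created.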

Relating to Doreian and Stokman’s (1997, p.3) definition of a network process as a “series of

events that create, sustain, and dissolve social structures”, relationships are not directly

considered but their constituting timed events are captured. In communication network analysis

such relational events are created by exchanging messages with others. From these events,

relationships can be aggregated. In the most basic sample procedure every message event will

increment the relationship’s strength by a value of 1. The simple case of dichotomous

relationships (absent vs. present ties) can be covered by only modeling a single timed event that

creates the relationship at a specific time. In studies of online communication, replies and carbon

copy e-mails can be stored as relational events or can be intentionally ignored in the sampling

process.
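
The basic sampling procedure described above can be sketched as follows. The event tuples, the function name `aggregate`, and the choice to treat relationships as undirected are assumptions made for illustration; they are not taken from Commetrix.

```python
from collections import Counter
from datetime import datetime

# Hypothetical message events: (sender, recipient, timestamp).
events = [
    ("A", "B", datetime(2000, 1, 3)),
    ("B", "A", datetime(2000, 1, 4)),
    ("A", "C", datetime(2000, 2, 1)),
]


def aggregate(events, until, dichotomous=False):
    """Aggregate all events up to `until` into relationship strengths."""
    strength = Counter()
    for sender, recipient, ts in events:
        if ts <= until:
            # Every message event increments the relationship's strength by 1.
            strength[frozenset((sender, recipient))] += 1
    if dichotomous:
        # Absent vs. present ties: only the existence of a tie is kept.
        return {pair: 1 for pair in strength}
    return dict(strength)


jan = aggregate(events, datetime(2000, 1, 31))
```

Because the events keep their time stamps, the same data can be aggregated up to any point in time, which is what allows the network to be replayed.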

[Figure 1 diagram. Panels: Data Model, Visualization. Elements: Network, Actor, Event, Property (Type, …), Property (Time, Content, Type, …), (Relationship); connectors labeled “has”.]

Fig. 1. The data model stores actors with properties like name, function, type, etc. and events with properties like time, content, or type. Relationships are time oriented aggregations of events. The visualization represents actors as nodes and relationships as arcs and utilizes different visual variables (size, color, saturation, etc.) to encode the properties.

The data model underlying the approach consists of actors, actor properties, events, and event

properties (also cf. Figure 1). For each event several properties are captured. For example, the

time stamp of each message event is recorded as a message property. Hence, the sequence of

messages and the change in relationship structure or strength is represented as a series of

relational events in the data model. Examples of further event properties are keywords, contents,

coded communication types (e.g. socialization vs. task organization), or evaluations, that can


then be used for content-oriented analysis or similarity based grouping. In addition to these

important changes in capturing relationships, the actual actors are modeled together with their

actor properties. The latter can include names, organizations, evaluations, organizational ranks,

types, or locations.
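
A minimal sketch of such a data model is given below, assuming free-form property dictionaries. The class names, field names, and sample values are hypothetical and do not reproduce the actual Commetrix schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Actor:
    actor_id: int
    properties: dict = field(default_factory=dict)  # e.g. name, rank, location


@dataclass
class Event:
    sender: int
    recipient: int
    time: datetime
    properties: dict = field(default_factory=dict)  # e.g. keywords, coded type


actors = {
    1: Actor(1, {"rank": "director"}),
    2: Actor(2, {"rank": "employee"}),
}
events = [
    Event(1, 2, datetime(2000, 2, 27), {"type": "task organization"}),
    Event(2, 1, datetime(2000, 2, 28), {"type": "socialization"}),
]


def events_of_type(events, event_type):
    """Select events by coded communication type for content-oriented analysis."""
    return [e for e in events if e.properties.get("type") == event_type]


task_events = events_of_type(events, "task organization")
```

Relationships are not stored at all in this sketch; they would be derived from the events whenever a particular time window or event type is of interest.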

The visualization represents the data model graphically. As in the conventional sociogram

(social network graph), actors are represented by nodes and edges represent the relationships as

flexible aggregations of message events. The sociogram extended with additional means for

information visualization and the capability to adapt to longitudinal network change yields a

dynamic graph termed ‘communigraph’. Utilizing Bertin’s (1967) concept of visual variables to

encode information, properties can be visualized by label, node size, node color (brightness,

transparency), or a number of rings around the node. Relationship properties are graphically

represented using colors, thickness, length, and labels.

In the domain of dynamic analysis, the representation of change in the graph is a fundamental

part of the visualization. It requires algorithms for handling transitions between incremental

network states in order to represent structural changes with organic movement. This major aspect

of dynamic visualization can be termed the transition problem. Due to its role in differentiating

among alternative approaches to dynamic network visualization, this aspect is now discussed in

more detail.

As already introduced, the related work of Moody et al. (2005) and Gloor et al. (2004) is based

on a sliding time window that is moved through the overall sample period. For each of these time

windows (frames) a network layout is computed. If structural change occurs, two subsequent

network layouts differ in their nodes’ positions. To create a consistent transition, the authors

render interim frames. The visualization is then “gradually adjusting node coordinates and


adding or deleting nodes and arcs” (Moody et al., 2005) or, as Gloor et al. (2004) describe it, “the

animation of the changing layout is interpolated between […] keyframes”. Both approaches thus

calculate network graph layouts at different states (e.g. per day) and then linearly move nodes

from their position in the network of the first time window to their position in the subsequent

time window.
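
The keyframe technique can be illustrated with a short sketch. It is a generic linear interpolation under the assumption of two-dimensional coordinates, not the code of either cited tool; the function name and the sample layouts are hypothetical.

```python
def interpolate_layout(start, end, steps):
    """Yield interim frames that move each node linearly from start to end."""
    for step in range(1, steps + 1):
        t = step / steps
        yield {
            node: (
                (1 - t) * start[node][0] + t * end[node][0],
                (1 - t) * start[node][1] + t * end[node][1],
            )
            for node in start
            if node in end  # nodes missing from a keyframe are skipped here
        }


# Two hypothetical keyframes in which nodes A and B change sides.
frame_1 = {"A": (0.0, 0.0), "B": (1.0, 0.0)}
frame_2 = {"A": (1.0, 1.0), "B": (0.0, 0.0)}
frames = list(interpolate_layout(frame_1, frame_2, steps=4))
```

Every node travels on a straight line to its new position regardless of whether its own relationships changed, so the motion reflects the difference between two layouts rather than the underlying events.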

Careful examination of such layouts shows that their rendering of transition frames disturbs

the impression of organic evolution of network structures. Nodes cross other nodes, swap their

position without need, or move at unintuitive changing speeds or in quickly changing directions

across the screen. The inconsistent motion is caused by two conflicting relocation strategies.

Node movement is alternately governed by the network layout algorithm of timeframe 1 and then

by the positional transition algorithm that moves nodes to their new optimal network position in

timeframe 2. Being trained to evaluate stable parts by their inertia, the observer is distracted from

observing how new nodes find their position while large ‘established’ centers also shift positions

and all adjacent nodes in their clusters with them. The result is a suboptimal impression of

transitions between separate layouts instead of observing network behavior with its events and

their impact on the remaining structure.

To create smoother transitions across time frames, the visualization implemented in the

Commetrix tool avoids linear transitions between rendered keyframes. Rather, new nodes are

added directly to the visual representation at the time when the corresponding event actually occurs.

The technique literally ‘throws’ additional communication elements into the network layout at

the corresponding time to let them find their natural place. As the supplemental videos (available at

www.commetrix.de/enron) show, this results in a very organic view of network evolution. The

novel technique necessitated the development of a dynamic version of the spring embedder


layout algorithm (Fruchterman and Reingold, 1991). It can accommodate new nodes into an

existing network layout. A major reduction in unnecessary node movement has been achieved by

relating node inertia to their number of contacts (degree). As a result, larger structures become

more inert and less connected nodes quickly move towards them. This keeps established parts as

stable as they should appear, while drawing the user’s full attention to moving areas where the

actual change happens. The movement in the evolving graph of online communication thus

directly represents structural changes and in effect, the social network looks like a real living

system of interactive elements in a network relation. In analogy, relationships and nodes older

than the observed time window can be dynamically taken out of the layout procedure. This yields

visualizations that directly show the recent changes in the network’s evolution.
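
The idea of degree-dependent inertia can be sketched with one simplified spring-embedder iteration. The attractive and repulsive forces follow Fruchterman and Reingold (1991); the damping rule (dividing the displacement by 1 + degree), the omission of a cooling temperature, and all names and coordinates are assumptions for illustration, since the exact rule used in Commetrix is not specified here.

```python
import math


def layout_step(pos, edges, k=1.0, step=0.1):
    """Run one simplified spring-embedder iteration with degree-based inertia."""
    degree = {node: 0 for node in pos}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    disp = {node: [0.0, 0.0] for node in pos}
    nodes = list(pos)

    # Repulsive force between every pair of nodes: k^2 / d.
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            force = k * k / d
            disp[a][0] += dx / d * force
            disp[a][1] += dy / d * force
            disp[b][0] -= dx / d * force
            disp[b][1] -= dy / d * force

    # Attractive force along every edge: d^2 / k.
    for a, b in edges:
        dx = pos[a][0] - pos[b][0]
        dy = pos[a][1] - pos[b][1]
        d = math.hypot(dx, dy) or 1e-9
        force = d * d / k
        disp[a][0] -= dx / d * force
        disp[a][1] -= dy / d * force
        disp[b][0] += dx / d * force
        disp[b][1] += dy / d * force

    # Assumed inertia rule: the more contacts a node has, the less it moves.
    return {
        node: (
            pos[node][0] + step * disp[node][0] / (1 + degree[node]),
            pos[node][1] + step * disp[node][1] / (1 + degree[node]),
        )
        for node in pos
    }


def moved(before, after, node):
    """Distance a node travelled between two layouts."""
    return math.dist(before[node], after[node])


layout = {"hub": (0.0, 0.0), "l1": (1.0, 0.0), "l2": (-1.0, 0.0), "l3": (0.0, 1.0)}
edges = [("hub", "l1"), ("hub", "l2"), ("hub", "l3")]

# A new contact event 'throws' a node into the existing layout.
layout["new"] = (2.0, -2.0)
edges.append(("hub", "new"))
after = layout_step(layout, edges)
```

In this example the newly added node travels further than the established hub, which is the intended effect: the established structure stays largely in place while the newcomer moves towards it.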

5 Analysis and Discussion

The research questions posed in the introduction are now addressed by illustrating the

capabilities of this approach with a sample of corporate e-mail data of Enron. The data were

originally published by the Federal Energy Regulatory Commission in May 2002 as a

consequence of the investigations into the fraud and bankruptcy scandals of Enron in December

2001. The original dataset covered 619446 messages (around 92% of monitored e-mails) in 3500

personal e-mail folders over a period of three and a half years. This sample has been refined by

Gervasio of SRI International for the CALO Project (Cognitive Assistant that Learns and

Organizes) and subsequently by Shetty and Adibi from the University of Southern California's

Information Sciences Institute, resulting in a corrected network of 517431 mails of 151 actors

(cf. Shetty and Adibi, 2004; the authors also provide a link to the data source). The managers,

traders and employees were working at different physical locations. In the study presented here,


only those 19811 messages of this set have been considered that originated and terminated within

this set of 151 actors. The sample duration is 38 months, i.e. from May 5th, 1999 to June 21st,

2002. Discussed topics include regulations, internal projects, company image, political

relationships, operations, logistics of arrangements, reports of business trips, and information

about partnerships. The data also includes information exchange of a more personal nature in the

professional context. The sample is well suited to analyzing dynamic network evolution, as the

e-mail contents are known and it consists of strong relationships of timed electronic

communication, required to demonstrate network dynamics. The years 1999 and 2000 represent

everyday operations of the sampled population whereas the years 2001 and 2002 reflect several

external events in the context of Enron’s bankruptcy scandal, whose impact on the network

dynamics is studied.

Isolating Volatility in Communication Patterns and Positions

The first research question concerned the artifacts created by conventional summative SNA.

This method would only use the final static picture as shown in Figure 2d. This cumulated

network contains one large component of 150 actors (1 isolate node has been removed from the

graph). During the sample period, 1526 relationships can be observed with an average

relationship strength of 26 exchanged messages. For better reference to particular sections of the

network layouts, several borders have been manually added based on visual inspection. The final

structure shows that the e-mail network, although completely connected, forms larger subgroups,

which are in the cumulative graph of Figure 2d connected to a very dense center (named section

1) via a larger number of links. The more peripheral sections are smaller and have a stronger

connection within than between sections and thus appear more separated and peripheral. Nodes

have on average 20 contacts. The most central node is node 87, which is connected to 50% of all


actors (74 contacts).

[Figure 2 panels; values refer to node 87. Dgr = degree, BtwC = betweenness centrality, DgrC = degree centrality.]

a) July 1st, 2000. Size: 87 actors in 167 relationships. Node 87: Dgr 5, BtwC 0.11%, DgrC 6%; Rank BtwC: 47th; Rank Dgr: 21st.

b) February 26th, 2001. Size: 115 actors in 488 relationships. Node 87: Dgr 15, BtwC 1.14%, DgrC 13%; Rank BtwC: 44th; Rank Dgr: 101st.

c) October 24th, 2001. Size: 147 actors in 1204 relationships. Node 87: Dgr 16, BtwC 0%, DgrC 11%; Rank BtwC: 123rd; Rank Dgr: 63rd.

d) June 21st, 2002. Size: 150 actors in 1526 relationships (+ 1 isolate). Node 87: Dgr 74, BtwC 9%, DgrC 50%; Rank BtwC: 1st; Rank Dgr: 1st.

Section labels in the panels: Section 1, Section 2, Section 3, Section 4.

Fig. 2. The cumulative evolution of the most central actor’s position in the network. Color represents degree and size represents the betweenness of the node at the respective time. The observed node is red. Network size, degree, betweenness centrality, and degree centrality for the observed node 87 are given. All measures and visual output were computed using Commetrix. The borders between sections were manually added for better reference. The original animated graph is available as a movie at http://www.commetrix.de/enron.

These findings of static analysis can now be contrasted with insights gained from analyzing the

network’s structural change. Figures 2a-c show three snapshots of the animated graph of the

complete evolution. The changing size of node 87 now highlights that this identified central

actor clearly did not establish its position in a steady growth but rather suddenly

towards the end of the overall sampling period. The network metrics of node 87 over time (listed

in Figure 2) show a betweenness centrality rank of 47 out of 87 active nodes in period 1 with only 5

contacts. Subsequently, node 87 remains equally unimportant in terms of centrality until in the

last quarter almost all of its centrality has been achieved (note the difference in node size

between Figure 2c and 2d).
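
The degree centrality percentages listed in Figure 2 can be checked with a short calculation. The sketch assumes the standard normalization degree / (n - 1), with n being the number of active actors at the respective time; that Commetrix uses exactly this normalization is an assumption, although the rounded results agree with the figure.

```python
# (active actors, degree of node 87) for the four snapshots in Figure 2.
snapshots = {
    "a": (87, 5),
    "b": (115, 15),
    "c": (147, 16),
    "d": (150, 74),
}


def degree_centrality_percent(n_actors, degree):
    """Degree as a percentage of all other active actors."""
    return 100 * degree / (n_actors - 1)


dgrc = {
    panel: round(degree_centrality_percent(n, d))
    for panel, (n, d) in snapshots.items()
}
```

The rounded values are 6, 13, 11, and 50 percent. They show that node 87 stayed in contact with roughly a tenth of the active actors until the last snapshot, where the share jumps to one half.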


A detailed analysis highlights the most interesting period: between February 2nd and February

5th, 2002, centrality increased by a factor of 2.8 in just three days within the overall

period of 1137 days (an animation visualizing this abrupt change is available at

www.commetrix.de/enron). Afterwards there are no further significant positional changes until

the end. Analyzing the broader context, node 87 represents the assistant of the leader of the

wholesale trading division. That leader became Enron’s last president (node 44) in August 2001.

This promotion seems to be an external change affecting the position of the assistant node 87.

It can be concluded that the most central node highlighted by summative SNA established its

position not in a steady increase but in a very fast burst of activity. Dynamic analysis hence

directs the focus to unusual patterns in the overall network evolution which would not have been

discovered with static analysis. Once temporal effects such as sudden changes have been

identified in the sample, the analyst can focus on studying the temporal evolution of the network

to decide whether node 87 should still be considered structurally important. With the underlying

event-based data model, the analysis can hence seamlessly shift from the network level to the

actor level.

Generally, dynamic analysis highlights that in digital communication networks central

positions can be very volatile due to the ease with which new relationships are created. This is

especially the case in a corporate network, where a macro-context has an impact on the

initialization of new contacts. Other examples, e.g. sudden changes triggered by the newly

appointed CEO around August 24th, 2001, support this notion. In online communication, the

measure of centrality hence strongly depends on the timing, e.g. SNA would identify another

node (84) in period three. For the studied domain, static measures are thus likely to yield

misleading results. Taking this point one step further, developing a general dynamic measure of


burstiness and volatility (i.e. variance) can help to establish when dynamic analysis is necessary

in order to prevent measurement errors from inappropriate aggregated sampling. At the actor

level, sudden changes can further be related to actor properties (e.g. membership duration) or

used to identify atypical changes as a signal for possible suspect behavior. On the other hand,

dynamic network analysis can also help to remove unrepresentative anomalies from the network

data, which otherwise would result in an incorrect representation of the final structure.
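
What such a measure could look like is sketched below. Both indicators are hypothetical and are not proposed in this form in the paper: the burst share is the largest single increment divided by the total growth, and the volatility is the population variance of the increments. The degree values of node 87 are taken from Figure 2; note that these four snapshots are not equally spaced in time, so the numbers only illustrate the calculation.

```python
def growth_profile(series):
    """Return (burst_share, volatility) for a cumulative time series."""
    increments = [b - a for a, b in zip(series, series[1:])]
    total = sum(increments)
    # Share of the overall growth that happened in the single largest step.
    burst_share = max(increments) / total if total else 0.0
    # Population variance of the increments as a simple volatility indicator.
    mean = sum(increments) / len(increments)
    volatility = sum((x - mean) ** 2 for x in increments) / len(increments)
    return burst_share, volatility


degree_node_87 = [5, 15, 16, 74]      # degree of node 87 in Figure 2a-d
steady_growth = [5, 28, 51, 74]       # hypothetical actor with even growth

burst_share, volatility = growth_profile(degree_node_87)
steady_share, steady_volatility = growth_profile(steady_growth)
```

For node 87, about 84 percent of the growth in degree falls into the last step, whereas the evenly growing comparison series yields one third per step and a volatility of zero. A high burst share would thus flag actors whose final position should not be read from the aggregated network alone.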

Network and Subgroup Formation

Next to observing single nodes and their positional changes in the network’s structure,

dynamic analysis provides improved means to describe the development of the complete network

and its separation of sections over time (research question 2). The following descriptive analysis

of the formation process of the Enron e-mail network is based on a combination of exploratory

visual inspection and time dependent SNA measures performed using Commetrix. The focus was

on identifying typical patterns by which certain social network architectures and subgroups

emerge. Figure 2a-d is again used as a visual reference.

Starting with a small integrated network, the center is increasing its density and section 3

emerges with a burst in activity around node 90 on January 1st, 2000 (Figure 2b). On this date,

the node achieved a betweenness of 19%. The new section is connected to the main center by

only a few nodes. On August 18th, 2000, the spike of section 4 occurs via an exclusive link

between nodes 67 and 106. Minor traces of the slow and broad formation of section 2 are also

visible. In the third quarter of the formation process, section 4 builds many connections to the

center and almost becomes integrated. During this process the initiating node 106 and the initial

exclusive link completely lose importance. Section 2 continues its slow separation in the winter

of 2000/2001. Section 3 establishes a broader connection via many connecting nodes to the


center. In this process, the initiator of section 3, node 90, loses its brokering position

(betweenness declines to 1.5%) and a new actor emerges with node 9 (betweenness 8%). Actors

6 and 76 in the center grow in their degree (denoted by node size). They largely contact nodes

within the center and thus increase the density of that area. In the final period, all peripheral

sections develop more separation but remain connected with each other via the center.

This first detailed descriptive account of dynamic network formation processes with emerging

and decaying sub-structures highlights some general process patterns. Separate sections were

initiated by a very central and active node (e.g. node 90 started section 3), by a central and

exclusive link (e.g. between nodes 67 and 106 for section 4), or by very slow separation of

many nodes (section 2). Such descriptions of structural processes show the potential of dynamic

analysis to support the induction of general theories about dynamic patterns or antecedents of

community formation from empirical data. A further insight is that such cluster formation is not

uniform. Section 4 was moving towards integration with the center and the continuous separation

of section 3 resulted from node 9 taking over the declining central position of node 90. This

suggests a concept of several overlapping lifecycles of network sections instead of assuming

steady and homogeneous growth across the overall network.

Processes of change are even more visible if older messages outside the observed time

window decay and get eliminated from the visualization. This emphasizes the added activity

within the current time window (e.g. one day) without distractions from past accumulated

structures. In this visualization mode, clusters are only persistent if the participating nodes

reactivate their links within the defined time window. Otherwise relationships dissolve again.

This gives a visual impression of networking speed (frequency) and helps to understand who

contacts whom to actually establish the network. Based on such process oriented analysis of


individual activities and their structural impact, the identification of the truly important (i.e.

active) players in online communication networks with high volatility can be improved.

The Structural Impact of Brokering Actions

One application of such activity oriented analysis of network dynamics arises in the context of

research question 3. The changes brought about within subsequent time windows of one day are

studied to analyze brokering activities that span large distances in order to integrate separate

parts of the corporate network. Each action that creates a shortcut in path length of more than

one step is considered a brokering activity. This excludes connections resulting from the

natural tendency of indirect paths of length 2 to become direct paths of length 1 (triadic closure),

i.e. bypassing one intermediary node. For example, if nodes A and D were connected via three

steps (e.g. A-B-C-D) then A performed a brokering activity if A directly created a relationship

to D (A-D). Figure 3 gives an example of how dynamic visualization shows such a brokering

situation. On the observed day, the marked node impacts the overall network structure by

connecting three otherwise disconnected segments of the network. The process results in shorter

network paths and thus contributes to the formation of a more integrated network structure.
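
The brokering criterion can be expressed as a check on the shortest path that existed before a new relationship was created. In the study itself the actions were coded by visual inspection, so the following is only a sketch of how the rule might be automated; the function names are hypothetical, and treating previously unconnected actors as a brokering case is an assumption based on the Figure 3 example.

```python
from collections import deque


def distance(adj, source, target):
    """Breadth-first shortest path length; None if target is unreachable."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return seen[node]
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None


def is_brokering(adj, a, b):
    """True if a new tie a-b shortens the prior path by more than one step."""
    d = distance(adj, a, b)
    # d == 2 is triadic closure; d == 1 means the tie already exists.
    return d is None or d > 2


# Network before the new contact: the chain A-B-C-D plus a separate pair E-F.
adj = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"},
    "E": {"F"}, "F": {"E"},
}
```

With this network, a new tie A-D counts as brokering (prior distance 3), A-C does not (triadic closure), and A-E connects two previously separate segments.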

Research question 3 further concerns the organizational ranks of actors with a high brokering

activity level. To establish this relationship, available data about 95 organizational positions is

utilized. In the example shown in Figure 3, the observed node has the rank president (represented

by node color and label) and its new contacts are also above management level: one director and

two vice presidents. For the following analysis, it has to be noted that the sample is not a random

sample but focuses on people who were related to the Enron case of fraud conspiracy and the

resulting bankruptcy in late 2001 and is thus biased towards upper levels of the hierarchy. The

year 2000 is taken as a subset. This year’s activity is well before the unusual final year of 2001


and should thus give a representative account of networking processes. All brokering activities of

the network have been counted and coded by visual inspection of the animated graph.

[Figure 3 panels: a) February 27th, 2000; b); c) February 28th, 2000]

Fig. 3. A node with job position president connects three otherwise separate clusters (via two vice presidents and one director) on February 28th, 2000. Four separate frames of the corresponding animation. Time filter shows only one past month of mail activity.

In total, 74 actions have been classified as brokering actions in the year 2000. They spread

evenly across the year. For 36 of these actions the job position of the involved actors is known.

Out of these brokering actions, 9 instances (25 percent) have involved only top management

positions and a further 9 (25 percent) only employees. The remaining 18 connecting actions (50

percent) have been cross-hierarchical. This quantitative pattern is supported by the visual

impression: Managers connect with distant employees in a brokering action to join separate parts

of the network and form the single integrated component shown in Figure 2d.

This study of brokering activity demonstrates the multiple levels of analysis facilitated by the

approach. Event-based analysis relates actor attributes (e.g. organizational rank) with actor

activities and their impact on network formation. The technique of sliding time window

visualization and analysis makes it possible to hide past cumulative structures in order to analyze a

network from an activity oriented view. Network structures are disaggregated into individual

networking processes and each incremental network development can be observed and


measured. By that, the structural impact of actor attributes and activities on the overall formation

of the social network structure is uncovered.

The Impact of External Events

Related to the analysis of activities, dynamic visualization of sliding time windows can be

utilized to learn about the reactions of an electronic communication network to external events

(research question 4). The network’s level of activity and change is measured by setting a sliding

time window (e.g. one month) and by moving forward in time taking measurements of active

nodes, active relationships between them, and the current average relationship strength. The

active nodes and relationships of one time window can be interpreted as the incremental addition

of network activity. Figure 4 summarizes the quantitative analysis of the animation.
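
The measurement procedure can be sketched as follows. The event format, the function name `window_metrics`, the 30-day default, and the treatment of relationships as undirected are assumptions for illustration and do not reproduce the Commetrix implementation.

```python
from collections import Counter
from datetime import datetime, timedelta


def window_metrics(events, end, days=30):
    """Return (active actors, active relationships, average strength)."""
    start = end - timedelta(days=days)
    strength = Counter()
    for sender, recipient, ts in events:
        # Only events inside the current time window count as activity.
        if start < ts <= end:
            strength[frozenset((sender, recipient))] += 1
    actors = set().union(*strength) if strength else set()
    average = sum(strength.values()) / len(strength) if strength else 0.0
    return len(actors), len(strength), average


# Hypothetical message events: (sender, recipient, timestamp).
events = [
    ("A", "B", datetime(2001, 1, 5)),
    ("B", "A", datetime(2001, 1, 10)),
    ("B", "C", datetime(2001, 1, 20)),
    ("C", "D", datetime(2001, 3, 1)),
]

january = window_metrics(events, datetime(2001, 1, 31))
march = window_metrics(events, datetime(2001, 3, 15))
```

Calling the function for a sequence of end dates yields the three time series that are plotted per window: active actors, active relationships, and average relationship strength.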

The number of actors slowly increases until July 2001. Then it stagnates at the level of about

130 simultaneously active actors. Despite this stable number of active users per month, this

period is marked by an unprecedented increase in active relationships and in relationship

strength. A constant number of actors are creating new relationships among themselves and

intensifying them, resulting in an increasingly dense network. This temporal effect is

accompanied by a sharp increase in message frequency (middle chart in Figure 4). Further

information about the context of the period reveals that this pattern of change happens at the time

when the Securities and Exchange Commission started its investigations into the Enron fraud

scandal on October 31st, 2001, and Enron filed for bankruptcy on December 2nd, 2001. The Enron

e-mail network seems to react to a fundamental external event with a contraction indicated by a

quick increase in interaction frequency, network activity, and network integration.


[Figure 4: three charts over the period September 1999 to May 2002. Top: Average Relationship Strength with linear trend line (axis 0 to 14). Middle: Frequency (sent messages in 30 day window; axis 0 to 3000). Bottom: Number of Active Authors and Number of Active Relations (axis 0 to 500).]

Fig. 4. Level of network change. The time window shows only the last month of network activity to indicate the change in active actors, active relationships, and current average relationship strength across the time period. A clear peak in change and in relationship strength is visible around the time when the Enron scandal became public.

This finding is another example of the improved analytical understanding of processes in

online communication networks resulting from the combination of events and SNA. The study

can be a starting point for further academic investigations about typical reactions to external

events. Beyond reactions, another important issue is the anticipation of events by analyzing

network behavior to identify indicators of a current general external impact. This also builds a

connection to the first research question, which found anomalies that even affected the final

structure of the network. Anomalies are likely to be more influenced by external factors than by

internal structures. Further, community formation processes (cf. second research question) can be

related to the impact of external events. Generally, the research scope extends from studying

change to studying change of change, e.g. large tendencies and their likely reasons, sudden

activity, or fast restructuring.


6 Conclusions and Outlook

The approach described in this paper has two types of implications: insights about the

dynamics of an e-mail network and methodical insights about how event-based dynamic network

analysis can help researchers and practitioners to learn more about social networks with massive

timed events.

Dynamic analysis of Enron’s corporate e-mail creates a more detailed picture of processes in

online communication networks: Central actors are not constantly maintaining their position but

quickly rise and fall in their centrality ranking. Centrality is thus very volatile and dependent on

time, reflecting a temporal utilization of the network by individuals to carry out organizational

tasks. Very short bursts in activity can affect the overall network structure significantly. Network

sections (and with that possibly communities) emerge and decay and are not necessarily a

persistent structural element. This suggests several overlapping lifecycles of different subnets in

the overall network. Three different activity patterns have been found to initiate such sections:

exclusive nodes, exclusive links, or slow separation of a dense subgroup. Actor- and

activity-oriented dynamic analysis uncovers that integrated network structures are a result of brokering

activities. The analysis of actor attributes showed that managers primarily connected with distant

employees across hierarchies to form the final integrated network. External events induced

reaction patterns marked by fast network contraction with a sharp increase of message frequency

accompanied by increasing network density and intensified relationships among actors.

From a methodological point of view, these findings demonstrate the novel research

perspectives resulting from event-based dynamic network analysis. Networks appear less as a

static phenomenon and more as a versatile structure in constant change and motion.

The main underlying methodological difference in the approach described here is disaggregating


relationships into ordered series of timed events and explicitly recognizing a variety of event

and actor attributes. The resulting dynamic visualization and analysis is computationally

intensive and thus requires sophisticated software support. For that, the exploratory Java-based

tool Commetrix has been employed. Its close integration of SNA and visualization overcomes an

important weakness of other current approaches to network dynamics. During the process of

visual inspection, network metrics such as degree or betweenness can be computed and exported

for the visualized partial structures and their changes. In effect, researchers can now trace and

measure how final structure emerges from single activities at different but connected levels of

analysis. This provides an opportunity to overcome SNA’s current limitation of interpreting

network structures based on a single level of analysis (cf. Monge and Contractor, 2003) despite

the strong interdependency between the actor and the network level (Doreian and Stokman,

1997, p. 15). With such integration of network- and actor-level analysis, samples with unusual

development become an opportunity rather than a threat. If change is detected, researchers scale

their perspective from general static network properties down to patterns of change of actors and

their activities.
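The disaggregation of relationships into ordered series of timed events can be sketched as a small data model. The following is an illustrative assumption about how such a model might look, not a description of Commetrix internals: the class `MessageEvent`, its `subject` attribute, and the functions `build_relationships` and `degree_until` are hypothetical names. The sketch shows how a metric such as degree can be computed for the partial structure that exists up to a chosen point in time.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MessageEvent:
    timestamp: datetime
    sender: str
    recipient: str
    subject: str = ""  # example of an event attribute (assumed)


def build_relationships(events):
    """Group events by undirected actor pair; each series is ordered by time."""
    relationships = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.sender == event.recipient:
            continue  # ignore self-addressed messages
        relationships[frozenset((event.sender, event.recipient))].append(event)
    return relationships


def degree_until(relationships, cutoff):
    """Degree of each actor in the partial network formed up to cutoff."""
    degree = defaultdict(int)
    for pair, series in relationships.items():
        # A relationship exists at cutoff once its first event has occurred.
        if series[0].timestamp <= cutoff:
            for actor in pair:
                degree[actor] += 1
    return dict(degree)


# Hypothetical sample events.
sample_events = [
    MessageEvent(datetime(2001, 1, 10), "ann", "bob"),
    MessageEvent(datetime(2001, 3, 5), "bob", "ann"),
    MessageEvent(datetime(2001, 6, 1), "ann", "cat"),
]
relationships = build_relationships(sample_events)
early_degree = degree_until(relationships, datetime(2001, 4, 1))
final_degree = degree_until(relationships, datetime(2001, 12, 31))
print(early_degree, final_degree)
```

Because every relationship keeps its full event series, the same structure supports both the aggregated static view (all events) and any intermediate state, which is what allows the final structure to be traced back to single activities.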

In addition to this bridging capability, the relevance of developing dynamic network

visualization and analysis is substantiated by the finding that the core metric of SNA, i.e.

centrality, is highly dependent on time. This motivates research into novel methods that identify

important people based on their networking activities and their structural impact. Dynamic

visualization has been emphasized as the primary means to induce hypotheses and theory from

observed network data. The concept of visually moving through subsequent sliding time

windows and removing older message events renders current zones of change an explicit object

of analysis. Such a visualization mode highlights the immediate impact of recent events as


demonstrated by Enron’s peak of networking in coincidence with its bankruptcy filing. With

that, researchers could extend studies of other drastic impacts on networks (e.g. catastrophes) to

derive more informed prediction models for networking behavior. For the practitioner, the

presented approach allows improved detection of emerging organizational communities and their

developing integration with other groups (e.g. after reorganization). Active people who might

not be detected by static metrics can be identified, and changes and activity levels of network

areas can be analyzed to measure network reactions to external stimuli (e.g. campaigns).
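The time dependence of centrality under sliding time windows can be illustrated with a short sketch. It assumes events as (timestamp, sender, recipient) tuples and uses plain degree (number of distinct contacts in the window) as the centrality measure; the function name `top_actors_per_window`, the window and step sizes, and the sample data are hypothetical choices for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def top_actors_per_window(events, start, end, window_days=30, step_days=30):
    """Return (window_end, actors with the highest degree) per sliding window."""
    results = []
    window = timedelta(days=window_days)
    window_end = start + window
    while window_end <= end:
        neighbors = defaultdict(set)
        for timestamp, sender, recipient in events:
            # Older events fall out of the window and no longer count.
            in_window = window_end - window < timestamp <= window_end
            if in_window and sender != recipient:
                neighbors[sender].add(recipient)
                neighbors[recipient].add(sender)
        if neighbors:
            best = max(len(contacts) for contacts in neighbors.values())
            leaders = sorted(a for a, c in neighbors.items() if len(c) == best)
        else:
            leaders = []
        results.append((window_end, leaders))
        window_end += timedelta(days=step_days)
    return results


# Hypothetical sample events: "ann" is active early, "dan" later.
sample = [
    (datetime(2001, 1, 5), "ann", "bob"),
    (datetime(2001, 1, 9), "ann", "cat"),
    (datetime(2001, 2, 10), "dan", "bob"),
    (datetime(2001, 2, 12), "dan", "cat"),
    (datetime(2001, 2, 15), "dan", "ann"),
]
ranking = top_actors_per_window(sample, datetime(2001, 1, 1), datetime(2001, 3, 2))
for window_end, leaders in ranking:
    print(window_end.date(), leaders)
```

In this toy data the most central actor changes from one window to the next, whereas an aggregated static network would report only a single ranking over the whole period.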

Future research will need to augment the exploratory study discussed in this paper to arrive at

a methodology for robust scientific insights into network dynamics. Currently, important

objectives include the quantification and automation of the dynamic measure of brokering activity.

Another challenging field of research is the design of algorithms that automatically identify the

formation of online communities as (emerging) borders between sections of the network to

support current visual inspection and to advance the current descriptive account of network

evolution. This can enable the recognition of typical temporal interaction patterns in large

networks of online communication. If future algorithms can compare masses of incremental

subsequent subnets in order to identify and measure patterns or temporal relationships among

patterns, stability in network structures can be advanced from a general description to a

quantified measure to compare subgroup dynamics within an overall network.

A final direction of our current research recognizes that the message event properties of the

presented method for event-based network analysis can also store contents. Such a combination

of content analysis with dynamic analysis allows new ways of studying innovation diffusion over

time in online communities and can advance SNA towards Social Network Intelligence.


References

Anklam, P. (2002): Knowledge Management: The Collaboration Thread. Bulletin of the American Society for Information Science and Technology, 28(6)2002.

Batagelj, V., Mrvar, A. (1998): Pajek - Program for Large Network Analysis. Connections 21(2)1998, p. 47-57.

Berge, Z. L., Collins, M. P. (2000): Perceptions of e-moderators about their roles and functions in moderating electronic mailing lists. Distance Education: An International Journal, 21(2000)1, p. 81-100.

Bertin, J. (1967): Semiology of Graphics. The University of Wisconsin Press, Madison.

Bikson, T.K., Eveland, J.D. (1990): The interplay of workgroup structures and computer support. In Galegher, J., Kraut, R., and Egido, C. (Eds.) Intellectual teamwork. Norwood, NJ: Erlbaum. p. 243-290.

Borgatti, S., Everett, M., Freeman, L.C. (1992): UCINET IV, Version 1.0, Columbia: Analytic Technologies.

Brass, D.J. (1984): Being in the Right Place: A Structural Analysis of Individual Influence in an Organization. Administrative Science Quarterly, 29(1984)4, p. 518-39.

Card, S.K., Mackinlay, J.D., Shneiderman, B. (1999): Information Visualization, Using Vision to Think. Morgan Kaufmann Publisher, San Francisco, 1999.

Carley, K. M. (2003): Dynamic Network Analysis. In: Breiger, R., Carley, K. M., Pattison, P. (Eds.) Dynamic Social Network Modelling and Analysis: Workshop Summary and Papers (2003).

Cho, H.-K., Trier, M., Kim, E. (2005): The Use of Instant Messaging in Working Relationship Development: A Case Study. Journal of Computer-Mediated Communication, Volume 10, Issue 4, July 2005.

Doreian, P., Stokman, F.N. (1996): The Dynamics and Evolution of Social Networks. In: Evolution of Social Networks, edited by P. Doreian and Frans N. Stokman. New York: Gordon & Breach, p. 1-17.

Ellson, J., Gansner, E.R., Koutsofios, E., North, S.C., Woodhull, G. (2004): Graphviz and Dynagraph – Static and Dynamic Graph Drawing Tools. Graph drawing software. New York: Springer-Verlag, 2004.

Emirbayer, M. (1997): Manifesto for a Relational Sociology. American Journal of Sociology 103(1997)2, p. 281-317.

Erickson, T., Kellogg, W.A. (2000): Social Translucence: An Approach to Designing Systems that Mesh with Social Processes. In Transactions on Computer-Human Interaction. 7(1)2000, ACM Press, New York, 2000. p. 59-83. http://www.research.ibm.com/SocialComputing/Papers/st_TOCHI.htm. Accessed: 2007-01-20.

Fallows, D. (2002): Email at work - few feel overwhelmed and most are pleased with the way email helps them do their jobs. PEW Internet and American Life Project. URL: http://207.21.232.103/pdfs/ PIP_Work_Email_Report.pdf. Accessed: 2007-01-28.

Freeman, L.C. (1984): The impact of computer based communication on the social structure of an emerging scientific speciality. Social Networks, (6)1984, p. 201-221.

Freeman, L.C. (2000): Visualizing Social Networks. Journal of Social Structure. 1(1)2000. http://www.cmu.edu/joss/content/articles/volume1/Freeman.html. Accessed: 2006-10-20.

Fruchterman, T. M. J., Reingold, E. M. (1991) Graph Drawing by Force-Directed Placement. Software - Practice & Experience 21(11)1991, p. 1129-1164.

Garton, L., Haythornthwaite, C., Wellman, B. (1997): Studying Online Social Networks. Journal of Computer Mediated Communication, 3(1997)1.

Girgensohn, A., Lee, A., Zhang, J. (2003): Social Browsers for Visualizing Web Communities. Proceedings of the ACM WWW 2003, Budapest, Hungary, May 2003.

Gloor, P., Laubacher R., Zhao, Y., Dynes, S. (2004): Temporal Visualization and Analysis of Social Networks, NAACSOS Conference, June 27 - 29, Pittsburgh PA, North American Association for Computational Social and Organizational Science, 2004.

Granovetter, M.S. (1973): The Strength of Weak Ties. American Journal of Sociology, 78(1973)6, p. 1360-80.

Krackhardt, D. (1991): The strength of strong ties: The importance of philos in organizations. In N. Nohria and R. Eccles (eds.): Organizations and networks: Theory and practice. Cambridge, MA: Cambridge University Press, p. 216-239.


Kraut, R., Kiesler, S., Boneva, B., Cummings, J. N., Helgeson, V., & Crawford, A. M. (2002): Internet paradox revisited. Journal of Social Issues, 58(1)2002, p. 49-74.

Moody J., McFarland D., Bender-DeMoll S. (2005): Dynamic Network Visualization. American Journal of Sociology. AJS Volume 110 Number 4 (January 2005), p.1206–41

Moreno, J. L. (1934): Who Shall Survive? Nervous and Mental Disease Publishing Company, Washington, 1934.

Perer, A., Shneiderman, B. (2006): Balancing Systematic and Flexible Exploration of Social Networks. IEEE Transactions on Visualization and Computer Graphics 12(5)2006.

Preece, J. (2000): Online Communities: Designing Usability, Supporting Sociability. Chichester and New York: John Wiley & Sons.

Shetty, J., Adibi, J. (2004): The Enron Dataset Database Schema and Brief Statistical Report. URL: http://www.isi.edu/~adibi/Enron/Enron_Dataset_Report.pdf. Accessed: 2006-06-01.

Smith, M.A., Fiore, A.T. (2001): Visualization components for persistent conversations. Proceedings of the SIGCHI conference on Human factors in computing systems, 2001.

Snijders, T.A.B. (2001): The Statistical Evaluation of Social Network Dynamics. In: Sociological Methodology Dynamics, edited by M. Sobel and M. Becker, Basil Blackwell, Boston and London, 2001, p. 361-395.

Trier, M. (2004): IT-Supported Monitoring and Analysis of Social Networks in Virtual Knowledge Communities. In: Proceedings Sunbelt XXIV Conference, Portoroz, Slovenia, 2004.

Trier, M. (2005): IT-supported Visualization of Knowledge Community Structures. Proceedings of 38th IEEE Hawaii International Conference of Systems Sciences, Hawaii, USA, 2005.

Wasserman, S., Faust, K. (1994): Social Network Analysis: Methods and Applications. Cambridge University Press: Cambridge, 1994.

Wellman, B., Potter, S. (1997): The elements of personal community. In: Wellman, B. (ed.): Networks in the global village. Norwood, NJ, 1997.

Wellman, B., Salaff, J., Dimitrova, D., Garton, L., Gulia, M., Haythornthwaite, C. (1996): Computer networks as social networks: Collaborative work, telework, and virtual community. Annual Review of Sociology, 22(1996), p. 213-238.

Wellman, B., Wortley, S. (1990): Different Strokes from Different Folks: Community Ties and Social Support. American Journal of Sociology. 96(1990), p. 558-88.

White, H.C., Boorman, S.A., Breiger, R.L. (1976): Social Structure from Multiple Networks: I. Blockmodels of Roles and Positions. American Journal of Sociology, (81)1976, p. 730-80.