

Computers in Human Behavior 27 (2011) 1021–1032


An approach to quantitatively measuring collaborative performance in online conversations

Paul Dwyer, Atkinson Graduate School of Management, Willamette University, 900 State Street, Salem, OR 97301, United States

Article info

Article history: Available online 6 January 2011

Keywords: Collaboration; Cognitive modeling; Collective thinking

doi:10.1016/j.chb.2010.12.006

Tel.: +1 503 370 6229; fax: +1 503 370 3011. E-mail address: [email protected]

Abstract

Interpersonal dynamics often hinder people from optimizing collaboration. Researchers who monitor the intellectual activity of people as they converse online receive less value when such collaboration is impaired. How can they detect suboptimal collaboration? This study builds on a new metric for measuring collaborative value from the information content of participant contributions to propose a measure of collaborative efficiency, and demonstrates its utility by assessing collaboration around a sample of weblogs. The new collaborative value metric can augment qualitative research by highlighting for deeper investigation conversational themes that triggered elevated collaborative production. Identifying these themes may also define the cognitive box people have built within a collaborative venue. Challenging people to consider fresh ideas by deliberately introducing them into collaborative venues is recommended as the key to overcoming collaborative dysfunction.


1. Introduction

Academic research recognized the importance of collaboration, collective value creation, before it became a major byproduct of the interpersonal connectivity provided by the Internet. It was observed that people collaborated through word-of-mouth and engaged in sensemaking, a pooling of their limited, yet diverse knowledge from which shared coherent thought structures could be formed over whatever new ideas, things and events they encountered (Weick, Sutcliffe, & Obstfeld, 2005). Researchers often want to use these deliberations to gain insights into people's thoughts and opinions. They search the many collaborative venues for mentions of subjects in which they are interested. However, gaining value from online collaborative venues is challenging. Scoble and Israel (2006), the authors of Naked Conversations, highlighted one challenge when they cautioned that online communities may only reflect the views of a vocal minority, a phenomenon they call the echo chamber. Currently, the primary way collaborative dysfunction has been detected is through a qualitative analysis of content (Gress, Fior, Hadwin, & Winne, 2010) or survey responses (Thomson, Perry, & Miller, 2008). The demands of qualitative analysis (time, labor and skill) justifiably deter its use in the larger and more comprehensive studies that are possible given the vast number of online collaborations (Marshall & Rossman, 2010, p. 4). A few studies have used quantitative methods to narrowly measure the echo chamber condition (Adamic & Glance, 2005; Wallsten, 2005). However, this paper seeks to make a methodological contribution to the psychology literature by proposing a more broadly applicable approach to measuring online collaborative performance that, by being algorithmic and therefore automatable, also lowers the barriers to taking advantage of the large volume of online collaborations.

The remainder of the paper is organized as follows. First, literature is reviewed that highlights two examples of natural socio-psychological processes that impair human collaboration. This is the foundation for the study's relevance, as these processes highlight a need for measuring collaboration effectiveness. Second, the literature describing the measurement of collaboration and its outcomes is surveyed, culminating in a description of the first of two prerequisites necessary for measuring collaborative performance: a metric for collaborative value. Two general measures of collaborative value, Metcalfe's Law (ML) and collaboration envelope area (CEA), are compared before selecting CEA to be the basis for a performance metric. Third, the second prerequisite for measuring collaborative performance is addressed: an estimate of ideal collaborative value. A simulation of optimal collaboration is used to derive a model that predicts ideal value from empirical data. Fourth, observed collaborative value is compared to its ideal in a ratio that is proposed as a measure of collaborative performance. Finally, how CEA can augment qualitative analysis is discussed along with some practical recommendations for raising collaborative performance.

2. A review of collaboration theory and measurement

This section begins by discussing how, even though incentives draw people into online collaboration, there are natural socio-psychological processes that often cause such collaboration to be suboptimal. These considerations are the foundation for this study's relevance as they highlight a need for measuring collaboration effectiveness if the results of that collaboration will be used to gain insights and make decisions.

2.1. Incentives and impairments to collaboration

Balasubramanian and Mahajan (2001) asserted that people are drawn into virtual community participation by the promise of three sources of value: (1) focus-related, where the community as a whole benefits from everyone's contribution, (2) consumption, the benefit individuals receive personally, and (3) approval, the satisfaction from seeing others approve of one's contributions. However, the forming of a community does not guarantee optimal or even fruitful collaboration. There are many natural interpersonal processes that interact in complex ways to both help and hinder collaboration. Two examples of natural processes that hinder maximizing collaborative value are described in the paragraphs that follow. It is shown that personal value-seeking can undermine collective value creation.

2.1.1. Cultural tribalism

Scoble and Israel's "echo chamber" effect refers to the illusion of vibrant community that frequent communication between a few parties can create:

You may think you are conversing with the world, when it's just a few people talking frequently, back and forth to each other, creating the illusion of amplification. The echo chamber can deceive a business into thinking it is either more widely successful or further off the mark than it is in reality, because a few people are making a lot of noise (Scoble & Israel, 2006, p. 134).

This phenomenon has been addressed in academic research as cultural tribalism. Kitchin described cultural tribalism in this way:

. . . communities based upon interests and not localities might well reduce diversity and narrow spheres of influence, as like will only be communicating with like. As such, rather than providing a better alternative to real-world communities cyberspace leads to dysfunctional on-line communities. . . (Kitchin, 1998, p. 90).

Cultural tribalism is thus portrayed with extreme pessimism as the ultimate equilibrium condition of all online communities. Since the costs of trial and switching are low, people will sample a large number of online communities and migrate to the ones wherein they feel most at home, those where they hear enough of what they want to hear to feel cognitively at ease. Matz and Wood (2005) supported this view by finding that heterogeneous attitudes create dissonance or tension and discomfort between members of a group. They found that the level of such discomfort was proportional to the numerical minority status of those with atypical views. The discomfort is partly relieved if the minority view-holders felt free to affirm their attitudes without pressure to conform. But true relief of discomfort was only achieved if the minority could persuade the majority of their error, the majority could present more convincing support for their position, or if the minority could leave and join a more compatible group. The migratory aspect of cultural tribalism, given its ease among internet communities, seems to be the result of a natural, unavoidable coping mechanism that will cause communities characterized by diverse thought to diminish over time, with high cognitive diversity being only temporary.

2.1.2. Flocking theory

Reynolds (1987) proposed flocking theory as a computational model that explains how the coordinated movement of a group can emerge from individuals making decisions based on personal information. He discovered that by using three simple "steering behaviors," coordination would emerge without any explicit management activity: separation ("steer to avoid crowding local flockmates"), alignment ("steer toward the average heading of local flockmates"), and cohesion ("steer toward the average position of local flockmates"). These steering behaviors are really just heuristic components of the more complicated calculation of which direction an individual should move to be in its desired location, relative to the group, one unit of time in the future.

Although flocking theory was developed as a solution to modeling the behavior of flocking birds and animals in computer graphics, it has been extensively investigated in a variety of academic disciplines and mathematically modeled by physicists Toner and Tu (1998). Rosen (2002) proposed that flocking theory was a good explanation for self-organization in human social systems. His proposal was based on the idea that communication is the mechanism of cohesion in human society, where a social network of individuals shares access to a collective body of knowledge that acts as a "roadmap" for coordinated action with little centralized control.

Rosen based his model on multiple literature streams. Simmel and Levine said that for social relationships to occur "the personalities must not emphasize themselves too individually. . . with too much abandon and aggressiveness" (Simmel & Levine, 1972, p. 158). Eisenberg and Phillips (1990) proposed that community cohesion is always a balance between autonomy and interdependence, corresponding to Reynolds's separation and cohesion steering behaviors. Rosen concludes that to some extent uniformity and common interest are essential to flock maintenance and that individuals must sacrifice some autonomy to keep group acceptance.

In summary, people whose behavior results in cultural tribalism are particularly motivated by Balasubramanian and Mahajan's third source of value: a need for cognitive support (i.e., they look for people who will say "You are correct."). Those who engage in flocking are motivated by their second source of value: personal fulfillment of a need for social belonging (i.e., they moderate self-expression to preserve a relationship). These two examples of personal value-seeking lead to the expectation that the value produced by collaboration has limits and is therefore measurable. The next section begins by discussing the first prerequisite for measuring collaborative performance: a collaborative value metric.

2.2. Current measures of collaborative value

As already noted, a distinction is made between measuring collaborative performance and one of its prerequisites: measuring the value resulting from collaboration. The literature does not make that distinction, as generally collaborative outcomes are measured without reference to what those outcomes could have been were collaboration ideal. As a result it is necessary to use the goals of the research to assess whether the focus was on measuring performance or value. Only two studies were found where collaborative performance was measured without the use of qualitative scales. In both cases, the studies were narrowly focused on measuring the echo chamber condition. As well, only two collaboration value metrics were found that were not based on scales developed from qualitative research: Metcalfe's Law (ML) and Dwyer's collaboration envelope area (CEA). The next section briefly surveys qualitative literature related to measuring collaborative outcomes.


2.2.1. Qualitative

Gress, Fior, Hadwin and Winne searched five academic databases for articles related to computer-supported collaboration with the intent of compiling a "comprehensive review of measurement tools and methods used" (Gress et al., 2010, p. 807). They found 186 articles, many business-related, containing "340 measures of collaborative constructs" published from January 1999 to September 2006 (Gress et al., 2010, p. 808). The majority of these studies were focused around the results of self-report questionnaires and interviews that captured such constructs as participant awareness, perceptions, recollections and biases about the collaborative process. They also noted that collaboration metrics tended to be narrowly specialized to the context of each study. Although it was not reviewed in Gress et al.'s study, Thomson et al.'s (2008) study exemplifies Gress et al.'s findings: survey data was used to estimate a structural equation model with latent constructs proposed as measures of collaborative outcomes specifically related to collaboration within organizations.

Gress et al. criticized many studies for a lack of procedural skill in ensuring that measures did not overlap collaboration constructs. Marshall and Rossman (2010) cautioned researchers to carefully assess their resources (time, manpower and skill) before starting projects because of the high demands of qualitative research. However, the vast number of online collaborations offers an unprecedented amount of data, and the enticing prospect of performing larger scale, more comprehensive research than could generally be performed in the past. Opening this opportunity beyond those with large resources partly motivates this study. In the next three sections quantitative measures of collaborative outcomes are described.

2.2.2. Measuring echo chambers

In the previous section it was noted that Gress et al. found that collaboration metrics tend to be narrowly specialized to the context of their study. This is also true of the two quantitative measures of the echo chamber condition described in this section. Because collaboration theory associates an echo chamber condition with impaired performance, these measures are considered performance rather than value metrics. Adamic and Glance (2005) made the first attempt to measure the echo chamber condition by searching political blogs for a disproportionate repetition of certain two-word phrases, or bigrams, that they identified as informative using Tomokiyo and Hurst's (2003) algorithms for keyphrase extraction. Wallsten (2005) also looked for such disproportionate repetition of key phrases but compared it to the preponderance of key phrases used by major media outlets at the same time. A weblog is therefore an echo chamber to the extent that it repeats key phrases more than the major media repeats those phrases during the same period. This allowed Wallsten to take into account the general salience of a key phrase at any moment in time. In the next two sections quantitative measures of collaborative value are described.

2.2.3. Metcalfe's law (ML)

Gilder introduced the concept of Metcalfe's Law, "the systematic value of compatibly communicating devices grows as the square of their number," based on Metcalfe's original statement of this "law" in a slide presentation sometime in the 1980s, referring specifically to communication technology like fax machines (Gilder, 1993; Metcalfe, 1995). Gilder expanded the definition by substituting "users" for "compatibly communicating devices," thus making Metcalfe's Law relevant to the context of social networks and collaboration. Odlyzko and Tilly argued that ML overestimated the value of adding connections to a network by observing that the efficiency of communication across network connections is always suboptimal, suggesting the true value is better estimated by "n times the logarithm of n" (Odlyzko & Tilly, 2010, p. 1).

Metcalfe's Law and its variants estimate collaborative value solely from the number of collaborators, assuming that differences in individuals' contributions can be disregarded. However, flocking theory and inherent variation in the knowledge possessed by individuals lead to the expectation that community participants will differ in the amount of unique collaborative value they add. In the next section, a means of measuring collaborative value from the information content of participant contributions is described.

2.2.4. Collaboration envelope area (CEA)

Dwyer (2011) proposed measuring the area of a collaboration envelope, calculated from the information content of text posted to blogs, as a better measure of collaborative value because it takes into account differences in the information content of individuals' contributions. The principles behind the collaboration envelope are unlikely to be familiar to readers, and a basic familiarity with them is essential to understanding why this study builds on that metric instead of ML. Therefore, the next section describes the principles underlying the collaboration envelope, starting with how it is based on a hybrid of bag-of-words modeling from text analysis, and two standard models from information theory: Shannon's entropy and Kullback–Leibler divergence.

2.3. Principles behind the collaboration envelope

2.3.1. Bag-of-words modeling

Bag-of-words modeling translates a body of text into a probability distribution by "classifying [its] words into a smaller number of thematic categories [i.e., bags]. . . the relative occurrence of the different categories indicates the underlying thematic content" (Genovese, 2002, p. 105). The mapping of words into thematic categories is facilitated by looking them up in a tag dictionary, a collection of words that have been previously assigned to one or more thematic categories. A commonly used tag dictionary is the Harvard–Lasswell IV (HL-4) dictionary, which categorizes 11,788 of the most-used words in the English language into 182 psychographic categories.
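To make the mechanics concrete, here is a minimal Python sketch of this step. The three-category tag dictionary below is a hypothetical stand-in (the HL-4 dictionary itself maps 11,788 words into 182 categories and is not reproduced here); only the look-up, count, and normalize logic follows the description above.

```python
from collections import Counter

# Hypothetical miniature tag dictionary; the real HL-4 dictionary maps
# 11,788 words into 182 psychographic categories.
TAG_DICTIONARY = {
    "price": ["economic"], "cost": ["economic"], "buy": ["economic"],
    "happy": ["affect"], "angry": ["affect"],
    "think": ["cognition"], "idea": ["cognition"], "know": ["cognition"],
}

def bag_of_words_distribution(text):
    """Map each dictionary word in `text` to its thematic categories and
    return the relative frequency of each category, i.e., a probability
    distribution over themes."""
    counts = Counter()
    for word in text.lower().split():
        for category in TAG_DICTIONARY.get(word.strip(".,!?"), []):
            counts[category] += 1
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}

print(bag_of_words_distribution("I think the price is fair but I know the cost"))
# {'cognition': 0.5, 'economic': 0.5}
```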

2.3.2. Shannon's entropy

After the words in a body of text are grouped into categories, the probability of occurrence of each category is calculated by dividing the number of words in each category by the total number of words. This set of probabilities forms a probability distribution for a body of text. This probability distribution is reduced to a single value H(X) denoted information entropy, or simply entropy (its expected information content), using the following equation (Shannon, 1948):

H(X) = -\sum_{i=1}^{n} p(x_i) \ln(p(x_i)),    (1)

where X is a body of text with words categorized into n themes, and the probability of occurrence of any theme i is p(x_i). If the words in a body of text belong to one category, entropy is minimized at zero. If the words are uniformly distributed across n categories, entropy is maximized at the logarithm of n. This model of information content assumes that a body of text with more than one theme (i.e., categories of words) contains more information than text with a single theme.
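A direct Python transcription of Eq. (1), using the natural logarithm as the equation does; the two example distributions illustrate the minimum (single theme) and maximum (uniform across n themes) cases just described.

```python
import math

def shannon_entropy(distribution):
    """Eq. (1): H(X) = -sum(p * ln p) over the thematic categories.
    Zero-probability categories contribute nothing to the sum."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0)

print(shannon_entropy({"cognition": 0.5, "economic": 0.5}))  # ln(2) ~ 0.693
print(shannon_entropy({"cognition": 1.0}))                   # 0.0, a single theme
```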

2.3.3. Kullback–Leibler divergence

The difference between the thematic probability distributions of two bodies of text can be measured by their Kullback–Leibler divergence (D_KL) using the following equation (Kullback & Leibler, 1951):

D_{KL}(P \| Q) = \sum_{i=1}^{n} p(x_i) \ln(p(x_i)/q(x_i)),    (2)

where P and Q are bodies of text with words categorized into the same set of n themes, and the probability of occurrence of any theme i is p(x_i) and q(x_i) respectively. It should be noted that Kullback–Leibler divergence is a non-symmetric measure, that is, D_KL(P||Q) is not equal to D_KL(Q||P). That makes D_KL particularly appropriate as a measure of information gain between subsequent additions (i.e., comments as Q) to a conversation (P) over time.
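A Python sketch of Eq. (2). The paper does not say how zero probabilities were handled when a theme appears in P but not Q, so the eps floor below is an assumption of this sketch; the argument order (conversation as P, new comment as Q) follows the assignment stated above.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """Eq. (2): D_KL(P||Q) = sum over themes of p_i * ln(p_i / q_i).
    The eps floor for themes absent from Q is an assumption of this sketch."""
    themes = set(p) | set(q)
    return sum(
        p.get(t, 0.0) * math.log(p.get(t, 0.0) / max(q.get(t, 0.0), eps))
        for t in themes if p.get(t, 0.0) > 0
    )

conversation = {"cognition": 0.4, "economic": 0.6}   # P: the conversation so far
comment = {"cognition": 0.7, "economic": 0.3}        # Q: a new comment
# Note the asymmetry: swapping the arguments gives a different value.
print(kl_divergence(conversation, comment))
print(kl_divergence(comment, conversation))
```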

2.3.4. Area as a measure of discussion coverage

Dwyer (2011) observed that the maximum and minimum levels of information gain across all the conversations in a weblog form the upper and lower boundaries of a collaboration envelope. The upper boundary indicates how broad an array of thematic associations was introduced into conversation. The lower boundary indicates the extent to which conversation also focused on specific themes. The area between the two boundaries was proposed as an indicator of the extent to which conversants explored all the information relevant to a conversation, alternating between adding new themes to keep a discussion alive and deeply delving into specific themes already introduced. Fig. 1A shows an example of a collaboration envelope where the vertical axis is information gain and the horizontal axis is a timeline that marks participation events from the start of each conversation.

This section began with Gress et al. and their review of collaboration measurement tools and methods. It was noted that most of the collaborative outcome metrics are study-specific and derived from qualitative scales. There is no generally applicable means of measuring collaborative performance that lowers the resource barriers demanded by qualitative methods, allowing one to take full advantage of the research opportunities promised by the large volume of online collaborations.

Fig. 1. Example collaboration envelope from a simulation of ideal collaboration. The upper boundary in 1A appears roughly log-linear, so both boundaries were fit to linear regression models with log-transformed absolute-valued information gain as the dependent variable. Fig. 1B shows these models superimposed on the envelope of 1A plotted with a logarithmic-scaled vertical axis.

It was noted that a quantitative value metric is a prerequisite to a performance metric. By virtue of its consideration of content and its basis in information theory, CEA should be a better indicator of collaborative value than ML and its variants. However, empirical data should be used to finalize the choice between the two. The next section begins by introducing a dataset and discusses the rationale used to select it. This data is used to further validate the selection of CEA over ML as the measure of collaborative value, as well as demonstrate the proposed collaboration efficiency ratio.

3. Methods

3.1. Selection of data

It might be thought that if all collaboration is impaired then there is no data that can inform us of how to raise collaborative performance. It should be noted that this state of dysfunction is not binary, dysfunctional or not, but rather occupies a position on a continuum where some are better than others, and we can learn from the less dysfunctional even though they are not perfect.

This study draws its data from twelve diverse weblogs that cover a broad spectrum of human interest. A weblog is a website where an author, or discrete set of authors, displays articles, called posts or entries. Most weblogs permit readers to add comments to posts and thereby are a vehicle for conversation and collective intelligence, tapping the knowledge of a group. The weblogs used in this study were selected in accordance with Kozinets's (2002) netnography methodology. In netnography the principles of ethnography, or unobtrusive observation, are applied to study virtual communities. A selection criterion was specified for subject communities that differed from the practice of standard ethnography: select communities that receive above-average posting traffic, have a large number of contributing members, contain descriptively rich content, and enjoy a high level of member-to-member interaction. These weblogs are treated as independent samples from the population of all weblogs because the HL-4 categories used in this study are too general for it to be affected by any idiosyncratic commonality of interest within the weblogs.

The next section describes the methodologies used to gather and analyze the data. The analysis began by investigating whether ML is justified in disregarding individuals' contributions when measuring the value of collaboration. Then the observed and ideal collaboration envelopes for each weblog were calculated and compared using the proposed collaboration efficiency ratio.

3.2. The data gathering process

The data was gathered using a custom computer program written in C Sharp (a.k.a., C#) that retrieved the entire archive of weblog pages as HTML and then parsed the HTML for author and commenter identification, origination dates and intellectual content. This is commonly called screen scraping. Most weblogs are implemented using a standardized template for information layout, thus easing the task of HTML parsing. All data was written to a Microsoft SQL Server database as it was gathered. The data gathering phase took approximately three months of continuous computer program operation.
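The author's C# program is not available, but the retrieve-and-parse loop it describes can be sketched in a few lines of Python. The CSS selectors below are hypothetical placeholders; every weblog template names its post and comment elements differently, which is why a standardized template eases the task.

```python
import requests
from bs4 import BeautifulSoup

def scrape_post(url):
    """Fetch one archived weblog page and pull out the fields the study
    stored: author and commenter identification, origination dates, and
    intellectual content. Selectors are hypothetical and must be adapted
    to each weblog's template."""
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    post = {
        "author": soup.select_one(".post-author").get_text(strip=True),
        "date": soup.select_one(".post-date").get_text(strip=True),
        "body": soup.select_one(".post-body").get_text(" ", strip=True),
    }
    comments = [
        {"author": c.select_one(".comment-author").get_text(strip=True),
         "date": c.select_one(".comment-date").get_text(strip=True),
         "body": c.select_one(".comment-body").get_text(" ", strip=True)}
        for c in soup.select(".comment")
    ]
    return post, comments
```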

3.3. Calculating information content and gain

After the data was gathered, each body of text was translated into a thematic probability distribution using bag-of-words modeling and the HL-4 dictionary with the process already described in Section 2.3.1. Shannon's entropy (i.e., information content) was calculated as described in Section 2.3.2 from each weblog post's probability distribution and considered to be the initial addition of value to each conversation. Then the Kullback–Leibler divergence (i.e., information gain) was calculated as described in Section 2.3.3 between each new comment and the overall thematic probability distribution of the conversation at the point when the comment was added. In so doing this stage operates with the perspective that each participant builds on the work of those that came before.
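Combining the earlier sketches gives the per-conversation pipeline this section describes: entropy of the post, then a KL-based gain for each comment against the conversation-so-far. Pooling raw category counts to form the running conversation distribution is an assumption of this sketch; the paper specifies only that each comment is compared to the overall distribution at the time it was posted.

```python
import math
from collections import Counter

# Hypothetical miniature tag dictionary, as in the earlier sketch.
TAG_DICTIONARY = {
    "price": ["economic"], "cost": ["economic"],
    "think": ["cognition"], "idea": ["cognition"],
}

def category_counts(text):
    counts = Counter()
    for word in text.lower().split():
        for cat in TAG_DICTIONARY.get(word.strip(".,!?"), []):
            counts[cat] += 1
    return counts

def to_distribution(counts):
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

def kl(p, q, eps=1e-9):
    return sum(pv * math.log(pv / max(q.get(t, 0.0), eps))
               for t, pv in p.items() if pv > 0)

def gain_series(post_text, comment_texts):
    """Entropy of the post is the initial value added to the conversation;
    each comment's gain is then D_KL(conversation so far || comment), per the
    P/Q assignment in Section 2.3.3."""
    running = category_counts(post_text)
    p0 = to_distribution(running)
    series = [-sum(p * math.log(p) for p in p0.values() if p > 0)]
    for text in comment_texts:                # comments in posting order
        c = category_counts(text)
        series.append(kl(to_distribution(running), to_distribution(c)))
        running += c                          # the comment joins the conversation
    return series

print(gain_series("I think the price is fair",
                  ["good idea", "what does it cost. any idea"]))
```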

3.4. Collaborator clusters

It has been noted that ML and its variants ignore differences between collaborators' contributions. Since flocking theory and innate knowledge variation prompted the speculation that people would differ in the collaborative value they provide, this study investigated whether natural segments or clusters could be detected among collaborators and what differences in impact those segments might have. As already stated, the commenters to each weblog were considered to be independent samples from a single population; as a result they were pooled together for the cluster analysis. Three clustering algorithms (MacQueen's (1967) k-means; Chiu, Fang, Chen, Wang, and Jeris's (2001) two-step algorithm; and Kohonen's (1982) neural network-based algorithm as implemented in IBM SPSS Modeler 14) were run separately on the mean information content (i.e., entropy) across the comments each participant wrote. The results of the three algorithms applied to a 40% testing partition were compared on the basis of silhouette coefficient to find the best clustering scheme (Kaufman & Rousseeuw, 1990).
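The SPSS Modeler implementations used in the study are proprietary, so as an illustration the sketch below substitutes scikit-learn's k-means (one of the three algorithms named) and evaluates it with the silhouette coefficient on a 40% test partition. The synthetic mean-entropy data is fabricated purely to make the example runnable.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: one mean-entropy value per commenter, drawn from three
# separated bands (low, medium, high information content).
rng = np.random.default_rng(0)
mean_entropy = rng.normal(loc=[1.0, 2.5, 4.0], scale=0.3, size=(500, 3)).reshape(-1, 1)

train, test = train_test_split(mean_entropy, test_size=0.4, random_state=0)
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(train)
labels = model.predict(test)
# Silhouette values above .5 indicate well-separated clusters
# (Kaufman & Rousseeuw, 1990).
print(silhouette_score(test, labels))
```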

3.5. Calculating collaboration envelope boundaries and area

The goals of this stage of the analysis were twofold: (1) further validate choosing CEA over ML as the metric for collaborative value by comparing values calculated from empirical data, and (2) calculate observed CEA for each weblog as a first step in assessing collaboration performance. This stage particularly addresses the scenario of a researcher wanting to get an overall impression of the quality of collaboration within a venue, such collaboration being spread across many independent sessions. The key decision was the choice of aggregation method. One option was to summarize the final entropy of all the collaborations in each weblog with their mean and standard deviation, assuming the final entropy values fit a normal distribution. However, entropy is maximized when collaborators introduce a uniform distribution of themes. Thus entropy measures exploration, the extent to which the pool of ideas was expanded across all collaborations, but hides the extent to which themes were elaborated and explained. Exploration, elaboration and explanation are the three factors that Bybee (1997) noted as leading to the creation of collaborative value. The paradigm of measuring an area between curves was adopted because it best captured the operation of these factors of value creation: the upper curve represents exploration, while the lower curve embodies explanation, and the area between indicates the extent to which conversants alternated between adding new themes to keep a discussion alive and deeply delving into specific themes already introduced.

3.5.1. Calculating the boundaries

The upper and lower boundaries of each weblog's collaboration envelope were determined by finding the maximum and minimum information gain contributed by a comment posted at each comment index. An example makes this explanation clear: The Unofficial Apple Weblog (TUAW) had 45 posts that inspired 50 comments or more. The upper boundary point at comment index 50 is the maximum information gain provided by one of the 45 fiztieth comments, while the lower boundary point at comment index 50 is the minimum information gain provided by one of the 45 fiftieth comments. Upper boundary values and the mean information gain at each comment index were compared with values calculated using Odlyzko and Tilly's variant of ML. The Odlyzko and Tilly calculation used the size of the pool of commenters observed to be available at each comment index as its parameter.
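A sketch of the boundary extraction as described: for each comment index, take the maximum and minimum gain observed across all of a weblog's conversations.

```python
from collections import defaultdict

def envelope_boundaries(conversations):
    """Given one gain series per conversation (one value per comment index),
    return the upper and lower envelope boundaries: the maximum and minimum
    gain observed at each comment index across all conversations."""
    by_index = defaultdict(list)
    for gains in conversations:
        for i, g in enumerate(gains):
            by_index[i].append(g)
    indices = sorted(by_index)
    return ([max(by_index[i]) for i in indices],   # upper boundary
            [min(by_index[i]) for i in indices])   # lower boundary

# Three toy conversations; for TUAW one would pass the gain series of the 45
# conversations that reached 50 or more comments.
upper, lower = envelope_boundaries([[2.1, 0.8, 1.4], [1.0, 1.9, 0.2], [0.5, 0.7, 0.9]])
print(upper, lower)  # [2.1, 1.9, 1.4] [0.5, 0.7, 0.2]
```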

3.5.2. Calculating the area of the collaboration envelope

It is readily apparent from Fig. 1 that a collaboration envelope is bounded at the top and bottom by rough edges that can be approximated by non-linear mathematical models, but not fully described by them. As a result, the trapezoidal rule was used to approximate the area between each bounding curve (Hoffman, 2001). The trapezoidal rule approximates the area under a function (denoted f(x)) by modeling it as a collection of trapezoids, that is, four-sided figures with one pair of parallel sides, where the area of each trapezoid is found using the following equation:

\int_a^b f(x)\,dx \approx (b - a)\,(f(a) + f(b))/2,    (3)

where a and b mark the position of two parallel sides along the x-axis.
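Applying Eq. (3) piecewise between adjacent comment indices gives the composite form actually needed for an envelope, and subtracting the lower-boundary area from the upper-boundary area yields the CEA. A minimal sketch (the toy boundary values are hypothetical, with a negative lower boundary as in Fig. 1):

```python
def trapezoid_area(xs, ys):
    """Composite application of Eq. (3): sum of (b - a)(f(a) + f(b))/2 over
    adjacent comment indices."""
    return sum((b - a) * (fa + fb) / 2
               for a, b, fa, fb in zip(xs, xs[1:], ys, ys[1:]))

def envelope_area(xs, upper, lower):
    """CEA: the area enclosed between the upper and lower boundary curves."""
    return trapezoid_area(xs, upper) - trapezoid_area(xs, lower)

xs = [1, 2, 3]  # comment indices
print(envelope_area(xs, [2.1, 1.9, 1.4], [-0.5, -0.7, -0.2]))  # 4.7
```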

By virtue of its consideration of content and its basis in information theory, CEA should be a better indicator of collaborative value than ML and its variants; however, choosing between the two only partly addresses the problem of detecting suboptimal collaboration that motivated this study. The next section describes how an agent-based model was used to meet the second prerequisite for measuring collaborative performance: estimating an ideal collaboration envelope.

3.6. Simulating collaboration with agent-based modeling

Agent-based modeling (ABM) is a simulation technique where a system is modeled as a collection of decision-making entities called agents. ABM is also a way of thinking about modeling: it is a bottom-up methodology that describes a system as the outcome of individuals acting autonomously, rather than the result of system-wide laws that dictate individual behavior from the top down. As a result, it is easy to use ABM because only the behavior of a single agent need be programmed and as many agents as desired can be replicated and set loose within the simulation. Many theorists view ABM as providing the most natural way of modeling a system (Bonabeau, 2002; Casti, 1997; Epstein & Axtell, 1996).

The results of an ABM simulation can often be unexpected and counter-intuitive. These are attributes of emergent behavior, that is, behavior resulting from interactions between unsynchronized autonomous entities. Such interactions are often non-linear, sensitive to initial conditions and stochastic. This aspect of ABM makes it a powerful utility for testing theoretical models by revealing what Krippendorf called a behavior space:

The collection of behaviors a system can follow, the set of paths a system is capable of taking. A behavior space represents, sometimes graphically, and/or abstractly, and, often within many dimensions, just what a system can do so that what it actually does can be seen as a special case determined by initial conditions (Krippendorf, 2010).

Fig. 2. Simulated information gain grid. Rather than assume every comment added information gain equal to its distance from the nearest previous comment, gain was calculated after the location of each new comment was found in one of the four quadrants, with its nearest neighbor positioned at the origin. Comments are assumed to add new information content only to the extent they are in front of existing content, while those placed behind explain ideas already expressed.

In this study, the behavior space is the range of all possible information gain sequences that can occur in a weblog conversation where content is encoded into a probability distribution based on the 182 thematic categories in the HL-4 dictionary. The goal of this study's use of ABM was to define the collaboration envelope that encloses every possible information gain sequence that resulted from the discussions within the weblogs in the data set. This was done by simulating a large number of conversations as described in the next paragraph, and then processing their data in the same way as the observed data into a collaboration envelope. If the simulation incorporates none of the processes that undermine collaboration then its results should represent the ideal.

It was noted that Rosen proposed that communication was the mechanism of cohesion in human society, where a social network of individuals shares access to a collective body of knowledge that acts as a "roadmap" for coordinated action with little centralized control (Rosen, 2002). It was also noted that in the sensemaking process collaborators incrementally create a collective body of knowledge by searching their personal knowledge and uttering relevant cognitive associations. Collaborators thus try to assemble a common understanding that approximates all the knowledge known about a subject. A simulation of a population's piecemeal construction of a collective body of knowledge was implemented in Wilensky's NetLogo modeling environment (Version 4.1) (Wilensky, 2010). The simulation took place within a 100 by 100 cell toroidal (i.e., doughnut-shaped) information space that represents all the knowledge about a subject. Each coordinate within the space maps a unit of knowledge; the difference in information content between units varies with the Euclidean distance between each unit's coordinates. To be consistent with weblogs, the space is populated with two classes of agent: many commenters and one weblog author. Collectively, commenters were assumed to add whatever portion of knowledge they possessed from the entire knowledge space after an author initiates discussion. Under the assumption that commenter knowledge was randomly triggered as conversation proceeds, the coordinates of each simulated unit of knowledge were selected at random as the simulation ran. Rather than assume that every comment added information gain equal to its distance from the nearest previous comment (its nearest neighbor), gain was calculated after the location of each new comment was found in one of the quadrants depicted in Fig. 2, with its nearest neighbor positioned at the origin. Fig. 2 assumes that comments only add new information content to the extent they are in front of existing content (i.e., located at coordinates with greater numeric value), while those placed behind explain ideas already expressed. The quadrant labels (i.e., exploration, elaboration and explanation) and their meaning are taken from Bybee's model of constructivist learning, another example of collaboration (Bybee, 1997). The simulation was reset to initial conditions and run 1000 times with commenter populations randomly sampled from a long-tailed distribution with a mean of 50 people.
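The NetLogo source is not reproduced in the paper; the sketch below re-implements its core loop in Python under stated assumptions: a wrapped 100 by 100 space, uniformly random comment coordinates, nearest-neighbor Euclidean distance (on the torus) as the magnitude of gain, and a positive sign only when the comment falls ahead of its nearest neighbor on both axes. The handling of the two mixed (elaboration) quadrants and the exact long-tailed population distribution are simplifications.

```python
import math
import random

SIZE = 100  # 100-by-100 toroidal information space

def torus_delta(a, b):
    """Signed shortest displacement from a to b on a wrapped axis."""
    d = (b - a) % SIZE
    return d - SIZE if d > SIZE / 2 else d

def simulate_conversation(n_comments, rng):
    """One simulated conversation: the author seeds a unit of knowledge, then
    each comment lands at uniformly random coordinates. Gain magnitude is the
    toroidal distance to the nearest previous contribution; the sign is
    positive only when the comment lies 'in front of' its nearest neighbor on
    both axes (the exploration quadrant of Fig. 2), an assumption that
    simplifies the mixed elaboration quadrants."""
    placed = [(rng.randrange(SIZE), rng.randrange(SIZE))]  # the author's post
    gains = []
    for _ in range(n_comments):
        x, y = rng.randrange(SIZE), rng.randrange(SIZE)
        nx, ny = min(placed, key=lambda p: math.hypot(torus_delta(p[0], x),
                                                      torus_delta(p[1], y)))
        dx, dy = torus_delta(nx, x), torus_delta(ny, y)
        sign = 1.0 if dx > 0 and dy > 0 else -1.0  # behind = explanation
        gains.append(sign * math.hypot(dx, dy))
        placed.append((x, y))
    return gains

rng = random.Random(1)
# 1000 runs; lognormal(3.73, 0.6) has mean ~50, standing in for the paper's
# unspecified long-tailed commenter-population distribution.
runs = [simulate_conversation(max(1, int(rng.lognormvariate(3.73, 0.6))), rng)
        for _ in range(1000)]
```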

3.7. Estimating collaborative performance

The simulation was expected to yield a collaboration envelope where the upper and lower boundaries could be used as parameters to a variety of predictive models, with the best-fitting upper and lower models selected as general estimators of ideal boundaries. Then the area of the observed collaboration envelope would be divided by the area of an envelope calculated from the ideal boundaries in a simple ratio intended to estimate collaborative performance or efficiency.

4. Results

4.1. Collaborator clusters

As Section 3.4 describes, three clustering algorithms were run separately on the mean information content across the comments each participant wrote. The results of the three algorithms applied to a 40% testing partition were compared on the basis of silhouette coefficient to find the best clustering scheme. It was found that Chiu et al.'s two-step algorithm achieved not only the best, but highly significant results (silhouette = .653, where values greater than .5 are significant), segmenting participants into three well-defined clusters occupying low, medium and high positions along the spectrum of the mean information content contributed by all collaborators.

It has already been noted that this study's use of information entropy ascribes greater information content to a body of text with a variety of themes. When this idea is applied to people whose information contributions place them at the lower end of the entropy spectrum, it can be said that they tend to focus their contributions on a single theme. That led to them being denoted focused contributors, while those at the high end are broad contributors, and those in the middle are balanced contributors. Fig. 3 shows the time line of participation for the three collaborator segments. It is no surprise that the populous balanced cluster played an important role. However, the shift in the relative prominence of balanced and focused clusters is most interesting. Focused contributors tended to grow more prevalent toward a conversation's later stages when conclusions are likely to be drawn. Broad contributors were prevalent earlier when expanding the variety of ideas is traditionally considered to be ideal in achieving productive collaboration (Osborn, 1957). Note also how the participation of broad contributors (the middle grey line in Fig. 3B) seemed to follow Rogers's (1962) classic diffusion curve, where broad commenters gradually increased in number early in conversation until a peak was reached from which their activity slowly declined.

Fig. 3. Contributor cluster participation timeline and histogram. This is the participation time line for the three types of participants across all weblog conversations. Focused contributors tend to grow more prevalent toward a conversation's later stages, drawing conversation to a conclusion. Broad contributors are prevalent earlier when expanding the variety of ideas is critical to productive collaboration.

This confirmation that individual differences in collaborative production are worthy of note adds more support to selecting CEA over ML as the metric for collaborative value. Metcalfe's Law wrong-headedly assumes differences in individuals' contribution levels can be disregarded, while CEA incorporates such differences. The next section further compares CEA with ML on a new dimension: the area and boundaries of the collaboration envelope are calculated for each weblog and compared with values calculated using Odlyzko and Tilly's variant of ML.

4.2. The ideal and observed collaboration envelopes

As described in Section 3.6, a simulation of a population's idealized piecemeal construction of a collective body of knowledge was implemented in an agent-based model. The simulation yielded the example collaboration envelope already referred to in Fig. 1. It was observed that the upper boundary of the envelope in Fig. 1A appeared to be roughly log-linear. Therefore, as described in Section 3.7, the upper and lower boundaries were fit to linear regression models using IBM SPSS Statistics 17 with log-transformed absolute-valued information gain as the dependent and participation index as the independent variables (upper and lower R^2 = .83 and .49 respectively). These models are depicted in Fig. 1B (dotted lines) superimposed on the envelope of 1A plotted with a logarithm-scaled vertical axis. Standardized estimates of the dependent variable were calculated to create sets of upper and lower boundary z-scores; both sets were found to fit the same linear regression model. The ideal upper (y_ui) and lower (y_li) boundary values of any collaboration envelope can therefore be estimated from the mean (mu) and standard deviation (sigma) of log-transformed absolute-valued observed boundary values, the number of participation indices (s_max) and each participation index (s_i) using the following equations:

y_{u_i} = e^{\left(1.7 - \frac{3.3}{s_{max}}(s_i + 1)\right)\sigma + \mu}    (4)

y_{l_i} = (-1)\, e^{\left(1.7 - \frac{3.3}{s_{max}}(s_i + 1)\right)\sigma + \mu}    (5)
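Read literally from the reconstruction above, Eqs. (4) and (5) ramp a z-score linearly from about +1.7 down to -1.6 across the participation indices before mapping it back through the log-transform. That reading, and the use of separate (mu, sigma) pairs for the upper and lower boundaries, are interpretations that should be checked against the published article. A sketch, including the efficiency ratio from Section 3.7:

```python
import math

def ideal_boundary(mu, sigma, s_max, sign=1.0):
    """Eqs. (4)-(5): boundary value at each participation index s_i, from the
    mean (mu) and standard deviation (sigma) of the log-transformed
    absolute-valued observed boundary values. The z-score falls linearly
    from about +1.7 at s_i = 0 to -1.6 at s_i = s_max - 1."""
    return [sign * math.exp((1.7 - 3.3 / s_max * (s_i + 1)) * sigma + mu)
            for s_i in range(s_max)]

# Hypothetical summary statistics for one weblog's observed boundaries.
upper_ideal = ideal_boundary(mu=0.4, sigma=0.9, s_max=60)
lower_ideal = ideal_boundary(mu=-1.1, sigma=0.7, s_max=60, sign=-1.0)

def efficiency_ratio(observed_cea, ideal_cea):
    """The proposed measure of collaborative performance (Section 3.7):
    observed envelope area divided by ideal envelope area."""
    return observed_cea / ideal_cea
```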

Fig. 4 shows the observed collaboration envelopes associated with TUAW (A) and Freakonomics (B) as well as the estimated ideal boundaries (dotted lines). Fig. 4 uses information gain on a logarithmic scale as the vertical axis and a timeline relative to the point when each conversation began as the horizontal axis. Below each information gain timeline is a histogram showing the volume of content at each comment interval. The other weblogs showed similar graphs. Fig. 4B shows that Freakonomics' collaboration envelope is larger, leading to the supposition that its participants perceive, and draw from, a much larger thought space than TUAW's, and in so doing add more value with their collaboration. Table 1's efficiency ratios and comparisons of Fig. 4A and B show that not only do Freakonomics' participants perceive a larger information space but they are more efficient at exploring it with both broadly themed and focused contributions. Note that both the upper and lower edges of the observed envelope in Fig. 4B are closer to being congruent to the ideal than those in 4A. Further examination of the observed and ideal envelopes leads to the observation that the ideal model predicts greater exploration early in collaboration, when the number of participants is highest and the weblog post that initiated discussion has introduced many themes that could be springboards for discussion. It seems though that collaborators tend to focus prematurely and thus fail to reach the ideal. This situation was less pronounced in the Freakonomics weblog as the observed envelope can be seen to more completely fill the ideal space.

Fig. 4. Example observed and ideal (dotted) collaboration envelopes with participation histograms. Information gain is the vertical axis (logarithmic scale) and a timeline relative to the point when each conversation began is the horizontal axis. Below the information gain timeline is a histogram showing the volume of content at each comment interval.

Thus far the selection of CEA to measure the collaborative value achieved by a community has been supported by its information theory underpinnings and its recognition of individual differences in contribution. How do the attributes of the collaboration envelope quantitatively compare to ML? The correlations between information gain during weblog conversations (i.e., the Kullback–Leibler divergence between a new comment and the conversation thus far) and Odlyzko and Tilly's estimates of collaborative value are given in Table 1.

Table 1
Weblogs used and correlates of collective thinking.

Weblog | Description | CEA | Efficiency (%) | Corr. of change in CEA with: Words | Commenters | Corr. with n ln(n): Mean gain | Max. gain
AutoBlog | Auto test drives and commentary | 36.5 | 25.7 | ns | .051* | ns | .304
Blogoscoped | News about Google | 49.5 | 37.7 | .045* | ns | ns | .673
The Consumerist | Consumer advocacy | 54.1 | 44.1 | .064 | ns | ns | .356
EnGadget | High tech product reviews and news | 100.0 | 73.8 | .098 | .247 | ns | .456
Freakonomics | Blog for the book | 134.7 | 93.8 | .234 | .383 | .065 | .534
Gizmodo | High tech product reviews and news | 50.0 | 39.2 | .039** | .075 | ns | .412
Joystiq | Computer gaming reviews and news | 31.9 | 30.3 | ns | ns | -.360 | ns
Maverick | Entrepreneur Mark Cuban's blog | 82.7 | 62.6 | .336 | .281 | ns | .391
Paul Stamatiou | Product reviews and technical support | 41.8 | 34.5 | .066** | .092* | ns | .335
Townhall | Politics, religion, society | 71.6 | 68.5 | .229 | .108 | ns | .485
TUAW | Unofficial Apple Weblog | 53.0 | 40.2 | ns | .134 | ns | .557
TV Squad | TV commentary | 111.2 | 82.0 | .092 | .216 | ns | .427

Notes: All p < .01, except *p < .05, **p < .1 and non-significant (ns) p > .1.


The third and fourth columns of Table 1 show the size of the observed collaboration envelope and the efficiency ratio, how the observed compares with its estimated ideal. Any correlation in size between observed and ideal was found to be statistically non-significant, even though the ideal is derived from summary statistics of the observed. There was also no significant correlation between the size of the ideal CEA and the efficiency ratio. Therefore it does not seem necessary to have a collaboration venue with a large amount of associated information in order to have efficiency. As Bybee's theory, the agent-based model and Fig. 4B suggest, the key to efficiency seems to be getting the collaborators to fully explore the information space around a venue and its topics of discussion. It may be significant that the most efficient venue was Freakonomics, the weblog in the sample most oriented around science, discussing thought-provoking and counter-intuitive economic phenomena.

The fifth and sixth columns of Table 1 demonstrate that changes in the area of a weblog's collaboration envelope due to each new conversation do not redundantly reflect the information conveyed by simple activity measures, specifically the number of words associated with a post and its comments (column 5) and the number of collaborators (column 6). The collaboration envelope seems to measure the content written while still being consistent with theoretical expectations that prompted ML and its variants to attempt estimating collaborative value solely from the number of collaborators.

It was to be expected that there would be positive correlation between the information gained during weblog conversations and Odlyzko and Tilly's variant of ML. However, the right-most column of Table 1 shows that only the maximum levels of information gain across all conversations were reliably and substantially correlated with Odlyzko and Tilly's model. This study cannot support the proposition that Odlyzko and Tilly's model is more realistic than the original ML because, although not documented in Table 1, the correlation between ML and maximum information gain was generally observed to be similar and often better. However, it must be concluded that ML and its variants only estimate maximum levels of information gain (i.e., the collaboration envelope's upper boundary), not collaborative value per se.

5. Discussion

When considering the implications of this study's findings it is important to be mindful that the scope of this investigation is narrow and conclusions must not overflow its limited bounds. This study primarily makes a methodological contribution to the psychology literature motivated by two problems: (1) there is a paucity of ways to measure the extent to which online collaboration has limited value because natural socio-psychological processes cause people to limit the range of ideas they introduce into their deliberations; and (2) the demands of qualitative analysis (i.e., time, labor and skill) justifiably deter its use in the larger and more comprehensive studies that are possible given the vast number of online collaborations (Marshall & Rossman, 2010, p. 4). The proposed solution builds on a new measure of collaborative value, the CEA, to derive a metric for collaborative efficiency: the efficiency ratio. Both CEA and the efficiency ratio can complement the prevailing qualitative methodologies by drawing the attention of researchers to sources of value-laden user-generated content for deeper examination, thus allowing them to deploy their resources where they should gain the most return. The next section demonstrates how CEA can augment qualitative analysis by showing how sudden increases in CEA over time highlighted, better than simple activity metrics, conversational events that attracted participant attention; these events are prime candidates for qualitative analysis to get deeper insights.

5.1. How monitoring CEA over time empowers qualitative research

If the difference between observed and ideal CEA is the result of prematurely focusing on a few themes and ignoring others, then the attributes of the themes that were most compelling should be investigated with qualitative analysis. These themes can be identified by tracing how CEA changed over time and focusing on the conversational events with the largest changes. Fig. 5 depicts how TUAW's area developed. Note how the area of TUAW's collaboration envelope experienced sudden jumps, interspersed between periods of gradual increase. This general pattern was observed across all the weblogs. Table 2 lists the title and summary of the post that started the conversation inspiring the largest increase in the envelope. The weblog's archive may be searched with each title to see the post's details and comments.

Can monitoring simple activity metrics also highlight these events? In columns 5 and 6 of Table 1 simple indicators of collaborative activity were compared with changes in envelope area. While the collaboration envelope is derived from text content, it is apparent that mere word counting does not duplicate what the area of the envelope measures. However, the stronger correlation between envelope area and the number of commenters may prompt the speculation that the simpler indicator is sufficient. Fig. 6 shows word and commenter counts across the same time interval as Fig. 5. A vertical dashed line marks the time of a major increase in envelope area. Both word and commenter counts register an event at that moment; however, the importance of such events is often obscured by frequent unexpected fluctuations. The collaboration envelope seems more stable in assigning distinct significance to important weblog posts. Similar graphs for the other weblogs also showed that simple activity metrics obscure conversational events that prompted high cognitive activity.


Fig. 5. TUAW cumulative and incremental collaboration envelope area (CEA) over time. Note how CEA experienced sudden jumps, interspersed between periods of gradual increase. This general pattern was observed across all the weblogs. These events are prime targets for qualitative analysis.

Table 2. Principal envelope-expanding events.

Weblog | Weblog post title and discussion summary
AutoBlog | New York Preview: Subaru Impreza WRX gets early release on cover of MT. Motor Trend broke the embargo on withholding photos of the new WRX until after the official corporate press release
Blogoscoped | Google AdWords Feedback Buttons. Concerning Google starting to ask for feedback on the usefulness of the specific ads that show up in search results
The Consumerist | Dealerships rip you off with the "four square"; here's how to beat it. Exposing one of the strategies car dealerships use to gain a negotiating advantage
EnGadget | Federico Rojas: The father of the father of Engadget. A Father's Day remembrance of the one who inspired the writer's love of technology
Freakonomics | Why are we eating so much shrimp? Concerning an interesting pattern in the reasons given by survey respondents
Gizmodo | The PS3 price drop in three acts. Concerning rumors of a price reduction for the Sony PS3
Joystiq | Capcom working on new 2D arcade fighter. Reaction to a teaser site for a new game
Maverick | Why I do not wear a suit and can't figure out why anyone does
Paul Stamatiou | Why Microsoft's Windows Live Messenger won't succeed. Skeptical reactions to the initial release of the software
Townhall | Cloture fails! Bill dead! 53 no votes. Discussion of legislation to put a time limit on consideration of a bill in the Senate
TUAW | How to: expurgate your dictionary. Concerning the removal of naughty words from OS X's built-in dictionary
TV Squad | American Idol: Live Results Show #12. Concerning the latest discharge of a contestant


The collaboration envelope seems more stable in assigning distinct significance to important weblog posts. Similar graphs for the other weblogs also showed that simple activity metrics obscure conversational events that prompted high cognitive activity.
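The comparison reported here can be reproduced on any venue's data with a few lines of code. The sketch below is a hypothetical illustration using Pearson correlation; the per-interval series names are assumptions, and statistics.correlation requires Python 3.10 or later.

# Hypothetical sketch: correlate incremental CEA with simple activity
# metrics over the same intervals. All three inputs are assumed to be
# equal-length, per-interval numeric series.
from statistics import correlation  # Pearson correlation, Python 3.10+

def compare_indicators(delta_cea, word_counts, commenter_counts):
    return {
        "words": correlation(delta_cea, word_counts),
        "commenters": correlation(delta_cea, commenter_counts),
    }

Low correlations, as observed for these weblogs, would indicate that word and commenter counts do not duplicate what the envelope area measures.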

The next section offers some practical insights pertaining to how collaboration within an online venue might be managed and thereby improved.

5.2. Increasing the efficiency of collaboration

It is an axiom that for something to be managed it must be measured. However, it cannot be assumed that axiom is symmetric: that everything measurable is manageable. It is understandable that this study would be looked to for insight into how collaborative efficiency, now that it can be estimated, might be optimized. There are clues in this study's findings that could begin to provide this needed insight. It was noted that the problems that arise in human collaboration are natural phenomena both internal and external to the individuals involved. However, not all natural phenomena seem directed at undermining collaboration. It was noted in the discussion of Fig. 3 that broad contributors seem to naturally enter a collaboration early, expanding the array of ideas considered, while focused collaborators enter later, when the drawing of conclusions would logically be occurring. Consistent with Bybee's model, the interplay of these collaborator segments seems directed at optimizing collaborative outcomes. Would it be best to try magnifying the influences that seem productive while minimizing those that seem counterproductive? As already concluded from Bybee's (1997) model, the key to efficiency seems to be getting the collaborators to fully explore the information space around a topic of discussion, a perspective supported by Santanen, Briggs, and de Vreede (2004). It has been shown that the evolution of the CEA can be monitored to see which ideas activated collaborators into high modes of cognitive production. Researchers might be tempted to try resurrecting these periods of high production by reintroducing the same themes into collaborative venues. However, not only might such attempts exacerbate the processes identified in Sections 2.1.1 and 2.1.2 as causes of collaborative dysfunction, but Zuckerman's treatise on intellectual arousal, where people are seen as occupying a spectrum in their need for novel stimulation and propensity to become bored with the familiar, suggests that collaborators are less likely to be activated by stale themes (Zuckerman, 1979, p. 297).


Fig. 6. Indicators of activity in TUAW conversations. Figure shows word and commenter counts across the same time interval as Fig. 5. A vertical dashed line marks the time of the reference increment in CEA. Both word and commenter counts registered an event at that moment; however, there is little or no correlation between ΔCEA, word and commenter counts. Similar results were observed for the other weblogs.


Indeed, Okazaki posits that active participants in online word-of-mouth are more prone to novelty-seeking (Okazaki, 2009). A better approach is to use longitudinal observation to define the cognitive box, that is, the set of favored ideas that defines the way people think about a venue's topics of discussion. Weblog authors can then, in a manner similar to the Socratic method, decide whether they are satisfied with those perceptions, or whether they should inject new ideas into collaborative venues and thereby challenge people to associate these new ideas with topics typically discussed (Paul & Elder, 2007). This injection of new ideas can be done using subsequent weblog posts that initiate fresh discussion, or with questions posed in comments to an existing post. The size of a venue's collaboration envelope can then be monitored as a source of feedback to see if the newly promoted ideas get traction and inspire rejuvenated collaborative value generation, with its attendant increase in performance and efficiency, in directions initially overlooked by collaborators. Santanen et al. recommend similar delivery of context-relevant stimuli to help collaborators activate knowledge that might not be recalled during unaided deliberation.
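As an illustration of this feedback loop, the sketch below checks whether envelope growth accelerated after a new theme was injected at a known time. The threshold, the function name, and the decision rule itself are assumptions for illustration only, not a rule validated by this study.

# Hypothetical sketch: did an injected idea gain traction? Compares mean
# per-interval CEA increments before and after the injection time t_inject.
from statistics import mean

def injection_gained_traction(increments, t_inject, factor=1.5):
    # increments: time-ordered list of (timestamp, delta_cea) pairs
    before = [d for t, d in increments if t < t_inject]
    after = [d for t, d in increments if t >= t_inject]
    if not before or not after:
        return False  # need observations on both sides of the injection
    return mean(after) > factor * mean(before)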

5.3. Limitations and future research

The collaboration efficiency ratio is intended to be subject-matter agnostic. As a result, it cannot qualitatively differentiate (e.g., good or bad) between ideas expressed in online collaboration, nor predict the practical value of insights derived from the content of user-generated media. Goldenberg, Lehmann, and Mazursky (2001) defer that issue to human judgment. The efficiency ratio identifies venues with suboptimal collaboration but not the precise cause of it. This is one reason why it is noted in the narrative that this methodology is not intended to replace the qualitative analysis of what collaborators write.

A valuable step beyond this study might involve pairing the calculation of collaboration efficiency with a qualitative analysis of content that also assesses the degree of dysfunction. Comparing those two measures would be an interesting validation of the efficiency ratio. Additionally, it was recommended that weblog authors employ Socratic questioning to increase collaboration efficiency on the basis of its success in classrooms. However, it is not certain that this would be effective, nor is it certain whether such intervention would be better executed by starting new conversations or by interjecting questioning comments into conversations already started. Resolving these uncertainties would be a logical next step in this research stream.

References

Adamic, L. A., & Glance, N. (2005). The political blogosphere and the 2004 US election: Divided they blog. In Proceedings of the 3rd international workshop on link discovery.

Balasubramanian, S., & Mahajan, V. (2001). The economic leverage of the virtual community. International Journal of Electronic Commerce, 5(3), 103–138 (Spring).

Bonabeau, E. (2002). Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99(3), 7280–7287.

Bybee, R. W. (1997). Achieving scientific literacy: From purposes to practices. Portsmouth, NH: Heinemann Educational Books.

Casti, J. (1997). Would-be worlds: How simulation is changing the world of science. New York: Wiley.

Chiu, T., Fang, D. P., Chen, J., Wang, Y., & Jeris, C. (2001). A robust and scalable clustering algorithm for mixed type attributes in large database environments. In Proceedings of the seventh ACM SIGKDD international conference on knowledge discovery and data mining (Vol. 263).

Dwyer, P. (2011). Measuring collective cognition in online collaboration venues. International Journal of e-Collaboration, 7(1), 47–61 (January–March).

Eisenberg, E. M., & Phillips, S. R. (1990). What is organizational miscommunication? In J. Wiemann, N. Coupland, & H. Giles (Eds.), Handbook of miscommunication and problematic talk (pp. 85–103). Oxford, UK: Multilingual Matters.

Epstein, J. M., & Axtell, R. L. (1996). Growing artificial societies: Social science from the bottom up. Cambridge, MA: MIT Press.

Genovese, J. E. (2002). Cognitive skills valued by educators: Historical content analysis of testing in Ohio. Journal of Educational Research, 96(2), 101–115 (November–December).

Gilder, G. (1993). Metcalfe's law and legacy. Forbes ASAP (September 3).

Goldenberg, J., Lehmann, D. R., & Mazursky, D. (2001). The idea itself and the circumstances of its emergence as predictors of new product success. Management Science, 47(1), 69–84.

Gress, C. L. Z., Fior, M., Hadwin, A. F., & Winne, P. H. (2010). Measurement and assessment in computer-supported collaborative learning. Computers in Human Behavior, 26, 806–814.



Hoffman, J. D. (2001). Numerical methods for engineers and scientists. New York: McGraw-Hill.

Kaufman, L., & Rousseeuw, P. (1990). Finding groups in data: An introduction to cluster analysis. London: John Wiley & Sons.

Kitchin, R. (1998). Cyberspace: The world in the wires. Chichester, UK: Wiley.

Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 44(2), 59–69 (July).

Kozinets, R. V. (2002). The field behind the screen: Using netnography for marketing research in online communities. Journal of Marketing Research, 39, 61–72.

Krippendorf, K. (2010). A dictionary of cybernetics. <http://pespmc1.vub.ac.be/ASC/indexASC.html> Retrieved 20.05.10.

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22(1), 79–86 (March).

Marshall, C., & Rossman, G. B. (2010). Designing qualitative research. Thousand Oaks, CA: Sage.

Matz, D. C., & Wood, W. (2005). Cognitive dissonance in groups: The consequences of disagreement. Journal of Personality and Social Psychology, 88(1), 22–37 (January).

McQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (pp. 281–297).

Metcalfe, R. (1995). Metcalfe's law: A network becomes more valuable as it reaches more users. Infoworld (October 2).

Odlyzko, A., & Tilly, B. (2010). A refutation of Metcalfe's law and a better estimate for the value of networks and network interconnections. Working paper, University of Minnesota. <http://www.dtc.umn.edu/~odlyzko/doc/metcalfe.pdf> Retrieved 20.05.10.

Okazaki, S. (2009). Social influence model and electronic word of mouth: PC versus mobile internet. International Journal of Advertising, 28(3), 439–472.

Osborn, A. P. (1957). Applied imagination: Principles and procedures of creative thinking (2nd ed.). New York: Scribner.

Paul, R., & Elder, L. (2007). Critical thinking: The art of Socratic questioning. Journal of Developmental Education, 31(1), 36–37 (Fall).

Reynolds, C. W. (1987). Flocks, herds, and schools: A distributed behavioral model. In SIGGRAPH '87 conference proceedings: Computer graphics (Vol. 21, No. 4, pp. 25–34).

Rogers, E. M. (1962). Diffusion of innovations. Glencoe, IL: Free Press.

Rosen, D. (2002). Flock theory: Cooperative evolution and self-organization of social systems. In Proceedings of the computational analysis of social and organizational systems (CASOS) conference.

Santanen, E. L., Briggs, R. O., & de Vreede, G. J. (2004). Causal relationships in creative problem solving: Comparing facilitation interventions for ideation. Journal of Management Information Systems, 20(4), 167–197 (Spring).

Scoble, R., & Israel, S. (2006). Naked conversations. Hoboken, NJ: John Wiley & Sons.

Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379–423, 623–656 (July and October).

Simmel, G., & Levine, D. (1972). Georg Simmel on individuality and social forms. Chicago, IL: University of Chicago Press.

Thomson, A. M., Perry, J. L., & Miller, T. K. (2008). Linking collaboration processes and outcomes: Foundations for advancing empirical theory. In L. Bingham & R. O'Leary (Eds.), Big ideas in collaborative public management (pp. 97–117). Armonk, NY: M.E. Sharpe.

Tomokiyo, T., & Hurst, M. (2003). A language model approach to keyphrase extraction. In Proceedings of the ACL workshop on multiword expressions.

Toner, J., & Tu, Y. (1998). Flocks, herds, schools: A quantitative theory of flocking. Physical Review E, 58(4), 4828–4858 (October).

Wallsten, K. (2005). Political blogs and the bloggers who blog them: Is the political blogosphere an echo chamber? Presented at the American Political Science Association's annual meeting.

Weick, K. E., Sutcliffe, K. M., & Obstfeld, D. (2005). Organizing and the process of sensemaking. Organization Science, 16(4), 409–421 (July–August).

Wilensky, U. (2010). NetLogo. Evanston, IL: Center for Connected Learning and Computer-Based Modeling, Northwestern University. <http://ccl.northwestern.edu/netlogo/> Retrieved 20.05.10.

Zuckerman, M. (1979). Sensation seeking: Beyond the optimal level of arousal. Hillsdale, NJ: Erlbaum.