DIVERSITY OF THOUGHT IN THE BLOGOSPHERE: IMPLICATIONS FOR

INFLUENCING AND MONITORING IMAGE

A Dissertation

PAUL DWYER

Submitted to the Office of Graduate Studies of

Texas A&M University

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

August 2008

Major Subject: Marketing

Approved by:

Chair of Committee, Rajan Varadarajan

Committee Members, Benton Cocanougher

Venkatesh Shankar

Richard Woodman

Head of Department, Jeffrey Conant

ABSTRACT

Diversity of Thought in the Blogosphere: Implications for Influencing and Monitoring

Image.

(August 2008)

Paul Dwyer, B.S.C.E., University of South Florida;

M.B.A., Texas A&M University

Chair of Advisory Committee: Dr. Rajan Varadarajan

A blog, a shortened form of weblog, is a website where an author shares

thoughts in posts or entries. Most blogs permit readers to add comments to posts and

thereby serve as a conversational mechanism. One way that companies have started to use

blogs is to monitor their corporate image (in this dissertation, the term image is used in

reference to corporate, brand and/or product image). This study focuses on how common

socio-psychological processes mediate consumers’ revelation of corporate image in the

blogosphere. Centering resonance analysis, a means of measuring similarity between

two bodies of text, is used in conjunction with multidimensional scaling to locate text as

cognitive objects in a space. Clusters are then detected and measured to quantify

diversity in the thoughts expressed. Detected patterns are studied from a social process

theory perspective, where complex phenomena are hypothesized to be the result of the

interaction of simpler processes.

A majority of blog commenters compromise the expression of their thoughts to

gain social acceptance. This study identifies the most extreme of such people so

companies that monitor blogs can assign less weight to image indications gained from

them as they may be merely expressing thoughts that are intended to maintain social

acceptance.

It was also found that single-theme blogs attract a readership with similarly

narrow interests. The boldest and most diverse thinkers among comment writers have the

most impact because of their ability to provoke the thinking of others. However,

commenters who repeat the same ideas have little effect, suggesting that introducing

shills is unlikely to shift the sentiment of a blog’s readership.

People participate in blog communities for reasons (e.g., need for community)

that may undermine thought diversity. However, there may be value in serving those

needs even though no valuable insights are provided into image or directions for product

development. Members of homogeneous-thinking communities were observed to more

actively participate, with greater longevity. This may increase loyalty to the company

hosting the blog.

ACKNOWLEDGEMENTS

I would like to thank my committee chair, Dr. Varadarajan, and my committee

members, Dr. Cocanougher, Dr. Shankar, and Dr. Woodman, for their guidance and

support throughout the course of this research.

Thanks also go to my friends, colleagues and the department faculty and staff for

making my time at Texas A&M University a great experience.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER

I INTRODUCTION
    Literature Overview
    Research Questions
    Potential Contributions

II OVERVIEW OF RELEVANT LITERATURE
    Cognitive Associations
    Image Management
    Detecting and Extracting Cognitive Associations
    Mapping Cognitive Associations
    Social Process Theory
    Summary

III CONCEPTUAL MODEL AND HYPOTHESES
    Cognitive Diversity
    Cultural Tribalism, Need-for-Cognition and Cognitive Diversity
    Threshold Models of Collective Behavior and Cognitive Diversity
    Flocking Theory and Cognitive Diversity
    Memetics and Cognitive Diversity
    Reciprocity and Cognitive Diversity
    Alternative Model
    Relevance of Sociological Theory
    Summary

IV METHODOLOGY
    Data
    Method
    Limitations
    Summary

V RESULTS
    Classification of Blog Entry Comments
    Classification of Commenters
    Summary

VI DISCUSSION AND CONCLUSIONS
    Review of Research Questions
    Implications for Research in Marketing
    Managerial Implications
    Public Policy Implications
    Limitations
    Summary

REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
VITA

LIST OF FIGURES

FIGURE

1.1 Corporate Image Management and the Blogosphere
1.2 Conceptual Map of the Literature Reviewed
2.1 Corporate Image Management with Blogs
2.2 Betweenness Centrality in Text
2.3 The Power-Law Distribution of Attention in the Blogosphere
3.1 Cognitive Diversity Latent Variable
3.2 Cultural Tribalism and Need-for-Cognition as Opposing Motivations
3.3 Cultural Tribalism / Need-for-Cognition Reflective Sub-Models
3.4 Flocking and Idea Seeding Reflective Sub-Models
3.5 Spectrum of Blog Author Thought Diversity
3.6 Locus of Influence Conceptual Map
3.7 Evangelists and Idea Seeders
3.8 Perceptual Map of Commenter Types
3.9 Memetic Propagation Reflective Sub-Models
3.10 Naïve Model: Hypothesized Relationships
3.11 Alternative Model: Relationships
4.1 Pretest Blog Entry Comment Mapping
4.2 Pretest BFA Commenter Clusters
4.3 Pretest GM Fastlane Commenter Clusters
4.4 Pretest Normal Probability Plot of BFA Cognitive Diversity
4.5 Pretest Normal Probability Plot of GM Fastlane Cognitive Diversity
4.6 Naïve Factor Model Fitted with Pretest Data
4.7 Alternative Factor Model Fitted with Pretest Data
4.8 PC Search DAG
4.9 Greedy Equivalent Search DAG
4.10 PC Search and GES Commonality DAG
5.1 Blog Entry Comment Mapping
5.2 Blog Cultural Tribalism Histogram
5.3 Paneled Histograms of Thought Diversity
5.4 Paneled Histograms of Cultural Tribalism and Need for Cognition
5.5 All Blog Commenter Clusters
5.6 Paneled Histograms of Boldness and Individual Thought Diversity
5.7 All Blog Idea Seeding Histogram
5.8 Individual Blog Commenter Clusters
5.9 Paneled Histograms of Blog Author and Commenter Types
5.10 Fitted Core Naïve Model
5.11 Fitted Alternative Model
A-1 Dendrogram of an Agglomerative Hierarchical Cluster
A-2 The Long Tail of the Blogosphere
A-3 A Social Attractor in Individual Thought Diversity
C-1 Individual Blog Normal Probability Plots of the Cognitive Diversity Factor
C-2 Full Reflective and Formative Naïve Model
C-3 Alternative Model of Potential Interrelationships
D-1 Naïve Model Cognitive Diversity
D-2 Naïve Model Indoctrination / Free Thought
D-3 Naïve Model Cultural Tribalism
D-4 Naïve Model Idea Seeding
D-5 Naïve Model Flocking
D-6 Naïve Model Mass Dissemination
D-7 Naïve Model Reciprocation
D-8 Naïve Model Need for Cognition
D-9 Alternative Model Cognitive Diversity
D-10 Alternative Model Indoctrination / Free Thought
D-11 Alternative Model Cultural Tribalism
D-12 Alternative Model Idea Seeding
D-13 Alternative Model Flocking
D-14 Alternative Model Mass Dissemination
D-15 Alternative Model Need for Cognition

LIST OF TABLES

TABLE

1.1 A Hierarchy of Marketing-Relevant Blog Types
3.1 Hypotheses and Construct Definition Summary
4.1 Data Sources
4.2 Data Source Quantities
4.3 Pretest Confirmatory Factor Analysis, Reliability, Variance Explained and Factorial Invariance
4.4 Pretest Correlation and Φ-matrix of Primary Constructs
4.5 Pretest Multigroup Fitted Naïve Model Parameter Values
4.6 Pretest Multigroup Fitted Alternative Model Parameter Values
4.7 Pretest Alternative Model Tests of Mediation
5.1 Comment Discriminant Function Coefficients (Equation 3.1)
5.2 Commenter Discriminant Function Coefficients (Equation 3.2)
5.3 Normalized Estimates of Mardia’s (1970) Multivariate Coefficient
5.4 Confirmatory Factor Analysis, Reliability, Variance Explained and Factorial Invariance
5.5 Multigroup Fitted Naïve Model Parameter Values
5.6 Multigroup Fitted Alternative Model Parameter Values
5.7 Alternative Model: Tests of Mediation
5.8 Summary of Hypotheses and Results
C-1 Correlation and Φ-matrix of Primary Constructs for AutoBlog
C-2 Correlation and Φ-matrix of Primary Constructs for Blog for America
C-3 Correlation and Φ-matrix of Primary Constructs for Blog Maverick
C-4 Correlation and Φ-matrix of Primary Constructs for The Consumerist
C-5 Correlation and Φ-matrix of Primary Constructs for EnGadget
C-6 Correlation and Φ-matrix of Primary Constructs for The Evangelical Outpost
C-7 Correlation and Φ-matrix of Primary Constructs for GM Fastlane
C-8 Correlation and Φ-matrix of Primary Constructs for Freakonomics
C-9 Correlation and Φ-matrix of Primary Constructs for Gizmodo
C-10 Correlation and Φ-matrix of Primary Constructs for Google Blogoscoped
C-11 Correlation and Φ-matrix of Primary Constructs for Joystiq
C-12 Correlation and Φ-matrix of Primary Constructs for PaulStamatiou
C-13 Correlation and Φ-matrix of Primary Constructs for Townhall
C-14 Correlation and Φ-matrix of Primary Constructs for TV Squad
C-15 Correlation and Φ-matrix of Primary Constructs for The Unofficial Apple Weblog

CHAPTER I

INTRODUCTION

The term blog¹, a shortened form of weblog, refers to a website where an author

shares thoughts in posts or entries. Most blogs permit readers to add comments to posts

and thereby serve as a conversational mechanism. Popular blogs have become meeting places

for communities of readers. Organizations are trying to better understand the relevance

of these online communities to their operations and how they should respond to the

phenomenon.

Murray (2007) ranked blog readership as a percentage of population among

world nations. He reported that Japan ranked highest with 74% of the population reading

blogs, followed by South Korea (43%), China (39%), the United States (27%), and the

United Kingdom (23%) in the top five. In each country, the younger the reader, the more

time they spend reading blogs. Additionally, blog readers were more likely to self-

identify as taking part in political, community or social welfare activities and were thus

designated “influencers” (p. 5).

____________

This dissertation follows the style of Journal of Marketing.

¹ The first use of a term central to this study is italicized and the word is defined in the glossary in Appendix A.

Technorati (2007), a blog search engine, reports that it tracks almost 100 million

blogs. These blogs cover such a variety of subject matter that they resist the creation of

the kind of taxonomy that scientists generally seek to create. The blogosphere eschews

top-down taxonomies in favor of folksonomies, a form of collaborative categorization

based on the aggregation of tags, key words descriptive of content (Golder and

Huberman 2006). Nevertheless, Table 1.1 describes a categorization framework that delineates the types of blogs likely to be of most interest to the marketing community.

The framework begins by partitioning marketing-related blogs into those related to business and those related to public policy and non-profit issues, so as to capture the full breadth of marketing practice. While it may seem that more importance has

been conferred on business blogs, as the category breakdown is more nuanced, the

general structure of the business blog hierarchy can be applied to public policy and non-

profit blogs as well.

Established information providers have been augmenting their current service

offerings with new ones based on the extraction of information from the blogosphere,

the collection of all blogs. One example is VNU, the corporate parent of ACNielsen and

Nielsen Media Research, which announced in January 2006 the acquisition of

BuzzMetrics and Intelliseek, two pioneers in the monitoring of online consumer-

generated media, and merged them into Nielsen BuzzMetrics. This acquisition of new

competencies by an industry giant signals that they perceive consumer-generated media

to be a new information resource that, rather than being a passing fad, is growing in

significance.

One type of information of considerable interest to companies is information about their corporate image². Companies have an image they would like to instill in the minds of customers. They are aware that all methods of communicating that image are imperfect and inefficient to varying degrees. Hence, they need to infer the image in consumers’ minds, compare it to the one they wish to create, and then develop and execute ways to bring the two images into congruence. Nielsen BuzzMetrics offers a service called BrandPulse that monitors blogs and message boards for mentions of its clients’ brands. Blogs of many types are potential sources for marketing

insights even though organizations tend to directly connect with consumers in only a few

blog types. As most of the world economies are either consumerist or in the process of

becoming so, any blog can be a venue for the discussion of corporate brands, trademarks

and other image assets.

² Throughout this dissertation, the term image refers generally to corporate, brand and/or product image.

TABLE 1.1

A Hierarchy of Marketing-Relevant Blog Types

Blog Type: Example and Description

Business-Oriented

    Corporate

        External

            Sanctioned Employee: IBM developerWorks. A large set of blogs authored by IBM product developers from a wide array of product lines.
            Senior Executive: GM Fastlane. GM Vice Chairman Bob Lutz’s blog on current and future product offerings.
            Character: The Family Guy Blog. Supposedly written by a fictional character, often a TV character, mascot or brand icon.

        Internal: A blog intended for the exchange, accumulation and management of internal corporate knowledge.

    Personal

        Communal: Hiptop Nation. A consumer’s blog that became the primary “moblog” (i.e., mobile weblog) for Danger Hiptop and T-Mobile Sidekick enthusiasts to post photos, etc.
        Diary-like: Wil Wheaton. An actor’s personal blog.
        Firm-Sponsored: Bob Balfe. An IBM Lotus developer’s blog, linked to by IBM, but hosted apart from developerWorks.
        Review-like: Paul Stamatiou. The widely linked-to blog of a college student who reviews products and gives technical support.
        Unsanctioned Employee: The Masked Blogger. Rumored to be the blog of an Apple employee. Apple authorizes no external blogs.

    Third-Party

        Corporate Watchdog: WalMart Watch. Analysis of and commentary on WalMart.
        Consumer Advocacy: The Consumerist. A consumer news and retaliation blog.
        Industry-level: AutoBlog. News and reviews of cars in general.

Public Policy and Non-Profit

    Organizational and Personal

        Religion and Culture: The Evangelical Outpost. Commentary on culture and religion from an evangelical perspective.
        Social and Health: Kalyn’s Kitchen. Recipes for healthy meals.
        Political: Blog for America. The official blog of the Democratic Party.

Scoble and Israel (2006), the authors of Naked Conversations, based on their

informal observations, note that blogs can be a rich source of insight into how customers

view a company. However, they caution that these insights may only reflect the views of

a vocal minority, a phenomenon they call the echo chamber. This expression of caution

inspired the general research question that started this study’s line of research: Is there a

way to measure the extent to which a blog has become an echo chamber?

The echo chamber phenomenon is one of many models or social forms that have

been used to explain behavior observed within group or community settings. The term

“social forms” is drawn from social process theory and refers specifically to lesser

socio-psychological processes that operate both in tandem and in conjunction to create

complex social phenomena (Cederman 2005). This study investigates the expression of

thought in a diverse selection of corporate, political and individual weblogs from a social

process theory perspective.

Kawasaki (2004), an entrepreneur and venture capitalist, opines that the most

successful products are polarizing: consumers either love them or hate them; only

mediocre products are viewed with indifference. Kawasaki’s point of view is consistent

with Muniz and O’Guinn (2001) and McAlexander et al. (2002) who see passion as what

motivates community and results in loyalty. It might be thought, then, that the echo chamber, with its attendant limited expression of thought, is the most desirable condition for consumer communities, as it allows companies to monitor their image among their core constituency without the introduction of noise from those who passionately dislike, or are merely indifferent to, them and their products. This argument has surface merit. However, because an echo chamber is an extreme case of limited thought expression, it creates a context that, by focusing on the status quo, engenders few ideas for new products or for solving problems in ways that might expand the customer base without alienating the core. These limited perspectives also provide no indication of how a target market’s determinants of satisfaction may be evolving.

Image monitoring is complementary to satisfaction analysis frameworks like

Kano’s (1984) model, where product attributes are differentiated on the basis of being

basic essentials, determinants of performance and sources of excitement. By maximizing the diversity of perspectives, companies increase the likelihood of finding product enhancements that are basic essentials to new consumer segments while being, at worst, factors of indifference and, at best, performance or excitement enhancers for existing customers. Additionally, it is naïve to believe that other companies will not compete for

their competitors’ loyal and most profitable core. Jones and Sasser (1995) note that for a

consumer to be truly loyal they must be completely satisfied. Furthermore, they note that

complete satisfaction is a moving target and that communing only with existing

customers can miss leading indicators of market shifts and their attendant satisfaction

determinants. They recommend that companies use a variety of methods to listen to

existing, potential and former customers. Clearly, the goal of complete satisfaction can

only be met by maximizing diversity of insight from the market.

Figure 1.1 shows how the blogosphere could fit into a corporate image

management process. Kapferer (2004) notes that corporate image can be affected by

conventional marketing communications such as public relations, advertising,

sponsorship and personal selling, as a part of a multifaceted image management

program. This study investigates the efficacy of adding blogging, as a new and emerging

form of interactive marketing communication, and blog monitoring to the array of image

management activities. Companies can monitor how they are portrayed in the

blogosphere and use those insights to modify their image building activities thus creating

a feedback loop. Consumers are assumed to interact with the blogosphere in a manner

mediated by social forms. This study broadly focuses on this mediation and its impact on

the extent to which the image in the blogosphere presents an accurate representation of

brand image in consumers’ minds.

FIGURE 1.1

Corporate Image Management and the Blogosphere

Literature Overview

This dissertation draws on an eclectic body of past research. Figure 1.2 provides

an overview of the literature underpinnings of this research. In Chapter II, each entity in

Figure 1.2 is described in the order given and greater detail is provided into how these

entities connect together. A more detailed definition of a blog is provided. In addition,

the large body of prior research into the nature of brand cognitive associations and how

these associations have previously been elicited from consumers in a context similar to

that of a blog is summarized. Since blogging uses written communications, advances in

textual analysis are also summarized, culminating in a detailed description of centering

resonance analysis (CRA) which yields a similarity measure for pairs of text bodies.

Then, how this similarity measure can be translated with multidimensional scaling into

points in a cognitive space and used as a basis for measuring diversity in the thoughts

expressed is described. Finally, some alternative models (social process theories) are

discussed that explain the diversity of thought expression, and thought expression in

general, in a group context.
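
The step from a pairwise text-similarity measure to the distances that multidimensional scaling consumes can be illustrated with a minimal sketch. It does not implement CRA: as a loudly labeled assumption, plain word frequency stands in for CRA's word-influence scores, and cosine similarity stands in for resonance. The function names (`word_weights`, `similarity`, `dissimilarity_matrix`) are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def word_weights(text):
    """Map each word to a weight.

    ASSUMPTION: raw word frequency is a stand-in for the word-influence
    scores that CRA would compute; this is not the CRA procedure itself.
    """
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    return Counter(w for w in words if w)


def similarity(text_a, text_b):
    """Cosine similarity of two weight maps, between 0 and 1."""
    a, b = word_weights(text_a), word_weights(text_b)
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def dissimilarity_matrix(texts):
    """Turn pairwise similarities into distances (1 - similarity)."""
    n = len(texts)
    return [
        [0.0 if i == j else 1.0 - similarity(texts[i], texts[j])
         for j in range(n)]
        for i in range(n)
    ]
```

Under these assumptions, two comments that use the same vocabulary receive a distance near 0 and two comments with no words in common receive a distance of 1, which is the form of input a scaling procedure expects.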

FIGURE 1.2

Conceptual Map of the Literature Reviewed

Research Questions

As previously stated, companies are interested in knowing how they are

perceived by consumers and are turning to online community monitoring to gain

insights. However, the extent to which perceptions gained from these communities truly reflect their image in the overall market is not known and has not been researched. Blog-based research can be characterized as quasi-experimental

because it has some of the attributes of experimental scientific research, like the

collection of data to which statistical methods can be applied, but it suffers from a lack

of control over factors that might confound the identification of testable cause-and-effect

relationships. Without an understanding of the processes that are operant within blogs,

not much faith can be placed in findings based on statistical analyses. This research

study aims to identify some of the major processes operating within blogs, particularly those that affect how thoughts are expressed, and thereby to lay groundwork for modeling cause-and-effect within blogs. Toward this end, the

following research questions are investigated:

1. What mechanisms explain the expression of thoughts by individuals in a blog?

Since comments are a vehicle for consumers to express their thoughts, there is a

need to determine which social processes are most relevant in determining

whether or not comments are made, and if so, when they are made.

2. What processes expand cognitive diversity, the expression of diverse thought?

Following Scoble and Israel’s (2006) focus on the echo chamber phenomenon as

a factor that could reduce the expression of diversity in thought, research

questions 2 and 3 seek to look at both sides of the issue of factors affecting the

expression of thought diversity. Research question 2 focuses on which of the

processes identified in the first research question tend to increase the diversity of

thought expression.

3. What processes limit cognitive diversity? Addressing this question has the

potential to shed light on Scoble and Israel’s (2006) concerns about the echo

chamber phenomenon and reveal other factors that might limit cognitive

diversity.

4. What is the relative importance of these processes? The core concept of social

process theory is that social processes act in tandem and in conjunction. It may

be easier to cope with any distortion these processes bring to brand image

insights if it is known how these processes are ranked according to influence.

Potential Contributions

This study seeks to make the following substantive and methodological

contributions to the field of marketing:

1. Substantive: Provide empirically-based insights into how individuals reveal their

cognitive associations in groups, particularly blogs. Demonstrate the extent to

which social process theory explains the expression of thought in blog

communities, and thereby enhances our understanding of the usefulness (value)

of using blogs as a source of image insight.

2. Methodological: Demonstrate a novel integrative application of CRA with a type

of multidimensional scaling that uses a spring-based iterative algorithm, whereby

similar objects (in this case, bodies of text) are pulled together as dissimilar ones

are pushed apart until a stable equilibrium positioning is attained within a

multidimensional space. When these bodies of text have settled into stable

coordinates, thematic clusters are detected and measured to quantify diversity in

expressed thought.

CHAPTER II

OVERVIEW OF RELEVANT LITERATURE

A blog differs from a forum, newsgroup or general online community in that a

single author, or discrete set of authors, initiates the discussion. Most blogs permit

readers to add comments to posts and facilitate trackbacks, permanent linking to posts

by other blogs and websites. The collection of all blogs is often called the blogosphere.

Blogging pundits, such as Scoble and Israel (2006), hail the blog as an ideal forum

where organizations can converse with their market and constituents to gain insights into

how they are perceived. Since their widespread emergence after the Internet achieved

mainstream adoption, a number of marketing researchers have focused on virtual

communities in their work (e.g., Shoham 2004; Dholakia, Bagozzi and Pearo 2004);

however, there is a dearth of research studies that have used blogs as the research

context.

A discussion of the various kinds of image associations that consumers might

reveal in a blog follows in the next section.

Cognitive Associations

People tend to have very complex entwined cognitive associations with

organizations, brands and products. Govers and Schoormans (2005) make a distinction

between people’s perceptions of a brand and a product. Similarly, distinctions can be

made between an organization and a brand in its portfolio. Organization, product and

brand are levels in a hierarchy of consumer associations. While a consumer may have

different associations between these three levels, the nature of these associations is the

same in that they are all cognitive structures. While the literature reviewed in this

chapter tends to focus on one level of this hierarchy, the thoughts expressed generally

apply to all. Since this discussion is focused on cognitive structures in general, the words

organization, brand and product are used interchangeably with “brand” receiving the

most use.

Reputation

Brown et al. (2006) describe the perspectives from which an organization can be

viewed. Identity is the way an organization views itself. Intended image is the way an

organization would like to be perceived by outsiders. Construed image is the way an

organization thinks it is being perceived and reputation is the way it is actually

perceived. This taxonomy is generally in line with Gioia, Schultz and Corley (2000);

however, Gioia et al. made a distinction between image, the transient aspects of how an

organization is perceived, and reputation, the long-term perception. Brown et al. adopted

Brown and Dacin’s (1997) definition of reputation as organizational associations, a

“label for all the information about a company [organization in the broadest sense] that a

person holds” (p. 104).

Brown et al. (2006) also made a distinction between reputation as an attribute of

an organization and reputation as the sum of all mental associations possessed by an

individual. This distinction is important as it explains why there is a need to distinguish

between the various organizational viewpoints: the way it is viewed by others is only

partially under an organization’s control. An organization is constantly communicating

through overt means such as advertising and through more subtle means such as its

products and its actions within the general community. Theorists have proposed various

models for how an organization’s message could be perceived differently than intended.

Shannon and Weaver (1949) introduced an encoding-decoding model where a message

is encoded by a sender, transmitted through a medium and decoded by a receiver into a

different message. They attribute a difference in comprehension to noise, random

interference, in the transmission medium. Later research, such as that reported by

Grayson (1998), attributes it to differences in the interpretive repertoire of the sender and

receiver. The sender encodes the message presuming that the receiver will not decode it

differently. Hall (1980) describes a circuit of communication whereby, through feedback

and clarification, a sender and receiver can converge on a shared meaning. The capacity

for reciprocal communication is one of the most attractive attributes of the blog.

Sabate and Puente (2003) concluded that a good corporate reputation results in

higher profits and higher profits result in a better reputation. Roberts and Dowling

(2002) decomposed reputation into financial (i.e., the company is a good investment) and

general (the company is perceived as a good social actor) components. On this

foundation, Dowling (2006) concluded that companies can achieve superior financial

performance by investing in “being more profitable and in being perceived as good” (p.

135). Dowling observed that a good reputation aids corporate growth by making sales

easier and by making investors more willing to fund a company’s attempts at testing new

ventures. Obloj and Obloj (2006) noted that investments in increasing reputation are

done at the cost of investing in greater profitability. They concluded that superior profits

only accrue when there is a large difference between the reputations of competing firms.

Therefore, firms should strive only to be a close follower in reputation-building, not the

leader in their market.

Personality

In the brand identity prism proposed by Kapferer (1992), personality was

denoted as merely one facet of a body of interdependent associations that make up

image. Aaker (1997) took issue with this portrayal and proposed that it be regarded as a

separate construct: “the set of human characteristics associated to a brand.” Azoulay and

Kapferer (2003) argued that Aaker’s (1997) definition was too broad and not in keeping

with psychology’s definition of personality. They supported Kapferer’s (1992) model

that personality was merely an aspect of image and should only include those human

attributes that are appropriate to brands. As a result, they discarded cognition, skills and

abilities, gender, social class and ethnicity from the array of human characteristics that

can be associated with a brand.

Even though Azoulay and Kapferer (2003) advanced theoretical support for their

views, Freling and Forbes’ (2005) qualitative study of brand personality found that

consumers associate the full spectrum of human characteristics with brands. They ascribe

this to a natural human tendency to anthropomorphize objects with which they have a

relationship.

Belk (1988) noted that consumers make possessions a part of themselves and a

reflection of their identity. According to Belk, possessions serve a variety of self-

expressive purposes: they are a tangible link to one’s past as well as expressions of the

multiple levels of self such as family and groups one belongs to. It would not be

surprising to discover that the identity of a consumer may be so entwined with a

perceived brand image that they are unable to separate the two in their discourse.

Meaning

Brown, Kozinets and Sherry (2003) added the idea of brand meaning to the

pantheon of cognitive associations attached to brands. They described four classes of

meaning: allegory (a brand story), aura (core values), arcadia (idealized community)

and antinomy (paradox: old style, new technology).

Attitude

Petty, Unnava and Strathman (1991) noted that attitude is a relatively global and

enduring evaluation of a product. Ajzen (2001) differed by noting that attitude toward an

object may not be exclusively a global impression, but may be expressed on a series of

independent scales: bad-good, useful-useless, reliable-unreliable, etc. Attitude is relevant

to consumption because, as Ajzen (2001) notes, “attitude predicts behavior”: people

purchase products they have a positive attitude toward. Howard and Sheth (1969)

position attitude as the mediator between choices and intention.

Relationship

Fournier (1998) distinguished a relationship from a one-time transaction and

mere habitual purchase from loyalty. Her definition of relationship also included the idea

that consumers and brands form partnerships of true interdependence where a brand is

typically attributed a human-like personality, much like the findings of Aaker (1997),

and the brand is perceived as an active partner in goal fulfillment. These goals can be

tangible/utilitarian and/or emotional as brands can facilitate the expression of self.

Fournier (1998) also points out that brand relationships exist in the context of other

relationships and may be entwined with them. She also stated that, much like human-to-

human relationships, brand relationships change over time. Arnould and Thompson

(2005) include the brand relationship in their taxonomy of consumer culture theory

because assigning human-like status to an intangible set of perceptions (i.e. a brand) is

an aspect of a constructed reality. John (1999) points out that children involve brands in

their socialization process and thereby create the foundations for brand relationships

early in life.

Knowledge and Expertise

Keller (2003) defined brand knowledge very broadly, including the intangible

aspects discussed here with the purely utilitarian attributes that compose most researchers’

definitions. Mitchell and Dacin (1996) conceptualize knowledge as an awareness of

attributes and how they contribute to performance. Sujan (1985) includes the presence of

product categories in her definition, while Ratneshwar and Shocker (1991) unite

categorization with an appreciation of appropriate usage context. Expertise is

consistently presented as a high degree of knowledge combined with an ability to

properly apply it.

Community

Kleine, Kleine and Kernan (1993) observed that people use brands and products

in their identity-crafting processes. They noted that social connections are an important

part of that process because they allow a person to get feedback from knowledgeable

others about how well they are doing in their identity-crafting. As a result, the

knowledge of brand community is an important aspect of the mental associations that a

consumer has with brands. McAlexander, Schouten and Koenig (2002) supplemented

that view by showing that association with a brand community increases a consumer’s

enjoyment of a product and provides a motivation for having a knowledge of available

communities.

Now that the primary image associations someone might reveal in a blog have

been described, in the next section the issue of corporate image management is

discussed.

Image Management

Figure 2.1 portrays a possible corporate image management process

incorporating blog insights. Companies have a self-image, the way they view

themselves, denoted their corporate identity. That identity is assumed to be the basis of

their intended image, the impression they would like to create in the minds of others.

Companies compare their intended image with a construed image, how they think others

view them. If the intended and construed images are too dissimilar, companies can try to

alter the way they are perceived by either tangible (i.e., through products and behavior)

or intangible (e.g., advertising and public relations) means. Consumers develop an

impression of a company from these image-building activities. That impression is

expressed in blog comments and subsequently read by corporate image monitoring. This

construed image is again compared with the image the company is trying to convey and

image building activities are once again adjusted to compensate for any mismatch. Blog

monitoring is thus an integral part of the image-building feedback loop.

FIGURE 2.1

Corporate Image Management with Blogs

The next section provides an overview of literature that supports examining blog

communications to detect consumer image associations. It then describes how these

associations can be detected and extracted.

Detecting and Extracting Cognitive Associations

As previously stated, this research project is concerned with investigating how

individual brand image cognitive associations are revealed in a group context. In order to

find patterns in how cognitive associations are revealed it is necessary to measure them

and the relationships between them.

Blogging, where an author posts an entry and the world is free to comment, is

similar to Olson and Muderrisoglu’s (1979) free elicitation procedure where

“respondents are free to say anything and everything that comes to mind when presented

with a stimulus probe cue.” They tested the stability of responses between two points in

time and found that 50-60% of the associations revealed in the first experiment also

appeared in the second. They also found that the order of mention of these associations

had a correspondence of about 0.35. This led them to conclude that free responses are a

reasonably consistent and reliable measure of thought structure.

Hutchinson (1983) demonstrated the use of a network-based model of cognitive

associations. He analyzed the transcripts of unconstrained free response from consumers

and product-category experts to construct a network of brand attributes connected by

directional ties (i.e. arrows) showing the order of recall (a temporal network). He built

his research on ideas introduced by a number of researchers who found that cognitive

associations are retrieved from memory in an order that reflects the way they are stored

(Bousfield and Cohen 1955; Bousfield, Sedgewick and Cohen 1954; Friendly 1979). At

the time when he proposed his network mapping of cognitive associations, two and three

dimensional visualizations based on data processed with multidimensional scaling

(Green, Wind and Claycamp 1975) were the state of the art in modeling how “brand-

features” were represented in the mind of the consumer.

Ward and Reingen (1990) introduced another network-based model explaining

how a shared group cognitive structure emerges from individual structures and then, in a

kind of feedback mechanism, homogenizes these individual cognitive structures. This

means that the knowledge or understanding of individuals becomes shared group

knowledge as individuals express their thoughts. Hearing the thoughts of others causes

individuals to modify their personal knowledge, making it more like the knowledge

shared by the whole group. They used this model to explain the process of how a group

arrives at a consensus-based consumption decision. Their work was heavily based on

theories of group polarization, such as those advanced by Burnstein and Sentis (1981).

Groups tend to be attracted to the “initially favored alternative”, that is, the first

alternative that is well explained and supported by logical arguments. Chandrashekaran,

Walker, Ward and Reingen (1996) expanded this model by demonstrating how decisions

could be predicted and not merely explained.

It should be noted that all this prior research confined cognitive association maps

to those involving the purely utilitarian attributes of products, believed at the time to be

most relevant to the purchase decision. As described, consumers associate a much richer

set of intangible and non-fungible attributes with organizations, brands and products. As

Keller (2003) points out, the mapping of all these cognitive associations has not been

explored:

An important future research challenge … is to develop holistic

perspectives toward brand knowledge that would encompass the full

range of all the different kinds of information involved, for example,

approaches to create and apply detailed mental maps for brands. An ideal

mental map representation would be a blueprint of brand knowledge, as

comprehensive while also as parsimonious as possible, that would

provide the necessary breadth and depth of understanding of consumer

behavior and marketing activity. (p. 596)

Even though early research embarked on a very promising trajectory it seems to

have stalled, possibly waiting for more advanced methodologies to be developed. The

paragraphs to follow set the stage for introducing one such new methodology.

Content Analysis

Among the many definitions of content analysis summarized in Kassarjian

(1977), the one provided by Kerlinger (1964) most closely fits the objectives of this

study:

Content analysis, while certainly a method of analysis, is more than that.

It is … a method of observation. Instead of observing people’s behavior

directly, or asking them to respond to scales, or interviewing them, the

investigator takes the communications that people have produced and

asks questions of the communications. (p. 544)

Most of the current methods of content analysis are unchanged from the ones

described by Kassarjian (1977). Methodologies tend to be of two types: manifest and

latent. Manifest methods tend to be word use frequency counting systems that cannot

capture the way words are used in context, only that some words are used more often

than others. Latent methods generally employ a group of human interpreters to overcome

the deficiencies of manifest methods. However, latent methods are often characterized

by inter-rater reliability problems.

Thompson (1997) describes a “hermeneutical framework for deriving marketing

insights” from consumer-generated text that is particularly germane to this study. He

uses the word hermeneutic to denote a situation where the consumer reflects on the role

products play in their present and past circumstances in the context of the broader

culture in which they live. The nature of a blog encourages commenting consumers to

write at length about an issue that interests them. This permits them sufficient scope for

reflection and elaboration. The ongoing communal nature of the blog encourages repeat

contribution, allowing them to further elaborate on issues they want to discuss as time

and reflection allows them to refine their thinking.

Centering Resonance Analysis

Centering Resonance Analysis (CRA), a method of content analysis proposed by

Corman et al. (2002), turns a body of text into a network of nouns and adjectives. The

words that have the most effect in giving the text coherence are then identified as those

with the highest betweenness centrality, or simply betweenness. These are

words that are important because they connect other words together. Figure 2.2 shows

part of the word network composed from a comment posted to GM’s Fastlane blog. It is

apparent that the word “oil” has high betweenness because it connects many other words

together. Corman et al. (2002) postulate that these word networks provide insights into

the way knowledge is structured in the mind of the writer. The words with high

betweenness represent the most salient cognitive associations. A quantitative measure of

a word’s betweenness is called its influence. Two separate bodies of text can be

compared on the basis of commonalities in the influence of the same words used. A

quantitative measure of this similarity is called resonance.

FIGURE 2.2

Betweenness Centrality in Text
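The CRA steps described above can be illustrated with a minimal Python sketch. It assumes that noun phrases have already been extracted from each text (the phrase lists below are invented for illustration and are not drawn from the Fastlane data), links the words within each phrase to one another and to the adjacent phrase, scores each word by normalized betweenness, and compares two texts with a cosine-style normalization of shared-word influence. The function names, the linking rule and the normalization are assumptions of this sketch, not a reproduction of Corman et al.'s (2002) exact procedure.

```python
import math
from collections import deque
from itertools import combinations


def build_word_network(phrases):
    """Undirected word network from non-empty phrases: words in a phrase are
    linked to each other, and the last word of a phrase to the first of the next."""
    graph = {}

    def link(a, b):
        if a != b:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)

    previous_last = None
    for phrase in phrases:
        for word in phrase:
            graph.setdefault(word, set())
        for a, b in combinations(phrase, 2):
            link(a, b)
        if previous_last is not None:
            link(previous_last, phrase[0])
        previous_last = phrase[-1]
    return graph


def influence(graph):
    """Normalized betweenness centrality of every word (Brandes' algorithm)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []
        preds = {v: [] for v in graph}
        paths = dict.fromkeys(graph, 0)
        paths[source] = 1
        dist = dict.fromkeys(graph, -1)
        dist[source] = 0
        queue = deque([source])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while order:  # accumulate dependencies from the farthest words back
            w = order.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    n = len(graph)
    pairs = (n - 1) * (n - 2)  # ordered pairs of other words
    return {v: (s / pairs if pairs > 0 else 0.0) for v, s in score.items()}


def resonance(influence_a, influence_b):
    """Similarity of two texts based on the influence of the words they share."""
    shared = set(influence_a) & set(influence_b)
    raw = sum(influence_a[w] * influence_b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in influence_a.values())
                     * sum(v * v for v in influence_b.values()))
    return raw / norm if norm else 0.0


# Invented phrase lists: text_a is organized around "oil", text_b around "price".
text_a = [["oil", "price"], ["oil", "company"], ["oil", "dependence"]]
text_b = [["oil", "price"], ["fuel", "price"], ["price", "increase"]]
influence_a = influence(build_word_network(text_a))
influence_b = influence(build_word_network(text_b))
print(influence_a, resonance(influence_a, influence_b))
```

In this toy case both texts contain the words "oil" and "price", yet each text is centered on a different one of them, so sharing vocabulary alone does not produce resonance.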

CRA operates in a manner consistent with Thompson’s (1997) framework

because it models a whole body of text as one coherent network and seeks to detect the

role specific words play in that network of meaning. The ability to measure the

resonance between separate texts allows the researcher to construct an integrative

interpretation. This study proceeds on the premise that the most influential words in the

comments are also the most salient organizational, brand and product associations

elicited by the author’s stimulus posting.

Having described the means of extracting cognitive associations from a body of

text, the next section discusses how the many cognitive associations that consumers

reveal can be organized to allow the detection of patterns among them.

Mapping Cognitive Associations

Multidimensional Scaling

Multidimensional scaling (MDS) (Green, Wind and Claycamp 1975) has become

an important tool of marketing research as it is typically used to create two or three

dimensional visualizations of data that leverage a natural human facility for identifying

patterns. Even though many datasets reflect more latent dimensions than can be reduced

effectively to two or three dimensional visual models without too great a loss of

information, there is no need to abandon MDS in such situations. Strictly speaking,

MDS is not a technique for data visualization; it is a means of converting pair-wise

similarity measures expressed as distances (like one minus the resonance between two

bodies of text) into coordinates in a space of any dimensionality. Matrix manipulation

and spring models (described below) are the two most common ways of performing

MDS. Spring models are usually favored when a large number of distances are used

because the manipulation of large matrices is time consuming and resource intensive

when performed on personal computers. Since the use of spring model MDS is unusual

in marketing, the underlying principles are described here. However, first an even more

unusual means of performing MDS using neural networks is described because the

founders of that methodology have used it in situations similar to that of this study.

Woelfel and Fink (1980) introduced Galileo, a computer program that extracts

“objects of cognition” from a body of text and plots them in a multidimensional space

“whose properties are determined by the patterns of interrelations among the objects” (p.

8). Galileo embodies the research findings of Barnett and Woelfel (1979) who

documented critical problems with reducing psychological measurements to the minimal

set of eigenvalue-based dimensions commonly used to define a multidimensional scaling

space. Woelfel and Fink (1980) still claim (Woelfel 2007) to be the only researchers who

have successfully mapped the cognitive content described in a body of text to a multi-

dimensional space. In order to do so, they employed a neural network to translate such

textual content to coordinates. Typically, users of two and three dimensional scaling

have endeavored to assign labels to the axes of the space that denote attributes on which

objects (whether products, customers or cognitive objects) vary. Barnett and Woelfel

(1979) point out that in high dimensional spaces, the axes cannot be labeled and

psychological constructs are points within the space.

Spring Models in Multidimensional Scaling

Kamada and Kawai (1989) devised the algorithm used to perform MDS with

spring models. Their algorithm models each data point as connected to every other data

point by a spring tensioned according to some measure of similarity (such as CRA

resonance). Through an iterative mechanism, similar data points are pulled together as

dissimilar ones are pushed apart until an equilibrium configuration is attained. If that

methodology were used in this study, representing blog entries and comments as spring-

connected objects, once the objects self-organized into stable positions within multi-

dimensional space, thematic clusters could be detected using methods of inherent

classification such as those discussed below. Thematic clusters can then be measured in

the following ways to indicate diversity in expressed thought:

1. The mean number of clusters across blog entries can be a blog-wide measure of

cognitive diversity, while the number of clusters within a blog entry can indicate

the degree of cognitive diversity within a single blog entry conversation.

2. Radius and density: Each cluster has a center of mass (also called a centroid or

barycenter). The mean and standard deviation of the distance between every

cognitive object and the centroid are measures of a cluster’s radius and density,

respectively.

3. Proportion of outliers: The mean and standard deviation of the distance between

every cognitive object and its cluster centroid can also be used to identify

outliers, cognitive objects located far from a cluster’s center.
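The measures listed above can be sketched in a few lines of Python. The sketch assumes each cluster is given as a list of coordinate tuples, uses the population standard deviation, and flags an object as an outlier when its distance from the centroid exceeds the mean distance by more than two standard deviations. The two-standard-deviation cutoff and the function names are assumptions made for illustration; the text above does not fix a threshold.

```python
import math
from statistics import mean, pstdev


def centroid(points):
    """Center of mass of a cluster of coordinate tuples."""
    dims = len(points[0])
    return tuple(mean(p[d] for p in points) for d in range(dims))


def cluster_measures(points, outlier_z=2.0):
    """Radius, density and outlier share of one cluster."""
    center = centroid(points)
    distances = [math.dist(p, center) for p in points]
    radius = mean(distances)      # mean distance to the centroid
    density = pstdev(distances)   # spread of those distances
    cutoff = radius + outlier_z * density  # assumed outlier rule
    outliers = [p for p, d in zip(points, distances) if d > cutoff]
    return {
        "centroid": center,
        "radius": radius,
        "density": density,
        "outlier_share": len(outliers) / len(points),
    }


def blog_diversity(clusters_per_entry):
    """Blog-wide measure: mean number of clusters across blog entries."""
    return mean(len(clusters) for clusters in clusters_per_entry)


# Hypothetical cluster: four cognitive objects at the corners of a square.
square = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
measures = cluster_measures(square)
print(measures)
```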

It should be apparent from the above description that the CRA and spring model

multidimensional scaling algorithms are particularly appropriate for automation as they

make no qualitative determinations that imitate human judgment. There is, therefore, no

need to employ multiple human referees to assess the reliability of CRA or spring model

multidimensional scaling as the mapping of the same body of text to multidimensional

space is certain to have identical results regardless of the number of trials.
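The spring-model idea described in this section can be illustrated with a minimal sketch. It takes a matrix of pair-wise distances (one minus a resonance value), treats every pair of objects as joined by a spring whose rest length is that distance, and repeatedly moves each object along the net spring force. This is a plain gradient-descent relaxation with uniform spring constants, not Kamada and Kawai's (1989) own update scheme, and the resonance values, step size and iteration count are invented for illustration. Repeated runs return identical coordinates here because the starting configuration comes from a seeded generator, which is an assumption of the sketch.

```python
import math
import random


def spring_mds(distances, dims=2, steps=2000, rate=0.05, seed=0):
    """Place objects in `dims` dimensions so that coordinate distances
    approximate the target distances (simple spring relaxation)."""
    n = len(distances)
    rng = random.Random(seed)  # fixed seed: same start, same result
    pos = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    for _ in range(steps):
        forces = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                delta = [pos[i][k] - pos[j][k] for k in range(dims)]
                length = math.sqrt(sum(d * d for d in delta)) or 1e-9
                # Positive stretch: objects are too far apart and are pulled
                # together; negative stretch: they are pushed apart.
                stretch = length - distances[i][j]
                for k in range(dims):
                    pull = stretch * delta[k] / length
                    forces[i][k] -= pull
                    forces[j][k] += pull
        for i in range(n):
            for k in range(dims):
                pos[i][k] += rate * forces[i][k]
    return pos


# Invented resonance values: texts 0 and 1 are similar, as are texts 2 and 3.
resonances = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
distances = [[1.0 - r for r in row] for row in resonances]
coords = spring_mds(distances)
print(coords)
```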

Inherent Classification

Most clustering algorithms, such as K-Means and Expectation Maximization

(EM), require that the researcher specify the number of clusters the algorithm is to find.

However, in this research study, it is necessary to know what clusters are naturally

present. One popular clustering algorithm that does not require the number of clusters to

be specified beforehand was developed by Chiu et al. (2001). Chiu et al. described an

entropy-based hierarchical clustering algorithm that implements agglomerative

hierarchical clustering in two stages:

1. Pre-cluster the data points into a large number of sub-clusters (Chiu et al. call

these “dense regions”).

2. Cluster the resulting sub-clusters into a smaller “optimal” number of clusters.

In the first phase, Chiu et al. (2001) use Zhang et al.’s (1996) BIRCH method of

constructing a cluster feature (CF) tree. Each CF consists of a finite number of data

points that are within a certain radius of each other. The maximum number of data points

per CF and the radius must be selected prior to building the tree. As the dataset is

traversed, data points are added to CFs, new CFs are created and existing CFs are split

until every data point in the dataset is accommodated.
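The pre-clustering idea can be sketched as follows. Each cluster feature is stored as a count, a per-dimension linear sum and a sum of squares, from which the centroid and radius can be computed without revisiting the data points. The sketch keeps the cluster features in a flat list and omits the tree structure and the splitting of existing features, so it is a simplification for illustration rather than the BIRCH algorithm itself; the class name, function name, parameter names and sample points are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClusterFeature:
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: float = 0.0

    def add(self, point):
        """Absorb one data point by updating the summary statistics."""
        if not self.linear_sum:
            self.linear_sum = [0.0] * len(point)
        self.n += 1
        self.linear_sum = [s + x for s, x in zip(self.linear_sum, point)]
        self.square_sum += sum(x * x for x in point)

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root mean squared distance of the absorbed points from the centroid."""
        c = self.centroid()
        variance = self.square_sum / self.n - sum(x * x for x in c)
        return math.sqrt(max(variance, 0.0))


def precluster(points, max_radius, max_points):
    """Assign each point to the nearest feature that can absorb it without
    exceeding the radius or size limit; otherwise start a new feature."""
    features = []
    for point in points:
        placed = False
        for cf in sorted(features, key=lambda f: math.dist(f.centroid(), point)):
            trial = ClusterFeature(cf.n, list(cf.linear_sum), cf.square_sum)
            trial.add(point)
            if trial.n <= max_points and trial.radius() <= max_radius:
                cf.add(point)
                placed = True
                break
        if not placed:
            new_cf = ClusterFeature()
            new_cf.add(point)
            features.append(new_cf)
    return features


# Hypothetical data: two tight groups of points.
sample = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
features = precluster(sample, max_radius=0.5, max_points=10)
print([(cf.n, cf.centroid()) for cf in features])
```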

Prior to the second phase, the “optimal” number of clusters is determined in its

own two-step procedure:

1. Calculate a coarse estimate of the optimal number of clusters using Bayesian

Information Criterion (BIC). BIC is a likelihood-based metric that decreases as

the number of clusters increases to a certain point and then increases beyond that

point. The aim of this first step is to find the number of clusters at the BIC

minimum. Chiu et al. (2001) note that since the BIC estimate almost always

overshoots the true optimum number of clusters, another step is needed to refine

the estimated optimum.

2. Simulate the merging of all the sub-clusters and calculate the ratio of inter-cluster

distance and the number of clusters after each merge. A big jump in the change

in this ratio from one merge to another usually indicates that the merge should not

have occurred. This process can also be done by plotting the number of clusters

versus classification error and then locating the “knee” of this curve. This method

is sometimes criticized for finding the knee locally by looking at pairs of adjacent

points rather than at the whole curve. It is, however, a simple method to use.
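One reading of this two-step procedure can be sketched as follows. The first function returns the number of clusters at the BIC minimum. The second looks, among cluster counts no larger than that coarse estimate, for the count at which the next merge would require a much larger distance than the merge before it, using the ratio of successive merge distances. The BIC values and merge distances below are invented for illustration, and the exact ratio used is an assumption of this sketch rather than a statement of Chiu et al.'s (2001) formula.

```python
def coarse_estimate(bic_by_k):
    """Number of clusters with the smallest BIC value."""
    return min(bic_by_k, key=bic_by_k.get)


def refine_estimate(merge_distance, k_max):
    """merge_distance[k] is the distance between the two closest clusters when
    k clusters remain (the cost of merging down to k - 1 clusters).
    Return the k <= k_max where that cost jumps most relative to k + 1."""
    ratios = {
        k: merge_distance[k] / merge_distance[k + 1]
        for k in merge_distance
        if k <= k_max and (k + 1) in merge_distance
    }
    return max(ratios, key=ratios.get)


# Invented values: BIC bottoms out at five clusters, but merging below three
# clusters suddenly requires joining groups that are far apart.
bic_by_k = {1: 520.0, 2: 410.0, 3: 330.0, 4: 318.0, 5: 312.0, 6: 319.0}
merge_distance = {2: 9.0, 3: 8.0, 4: 1.0, 5: 0.8, 6: 0.7}

k_coarse = coarse_estimate(bic_by_k)
k_refined = refine_estimate(merge_distance, k_coarse)
print(k_coarse, k_refined)
```

In this toy case the refined estimate is smaller than the coarse one, which matches the observation above that the BIC estimate tends to overshoot.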

After the optimal number of clusters has been determined, the CF tree is

collapsed into that number of clusters based on the combining of sub-clusters in close

proximity to each other. Proximity is measured as a log-likelihood distance related to the

decrease in log-likelihood as the two sub-clusters are combined into one cluster. The

distance between sub-clusters w and z is calculated as defined in equation 2.1.

(2.1) $d(w, z) = \xi_w + \xi_z - \xi_{\langle w, z \rangle}$

In equation 2.1, <w, z> denotes the index of the combined cluster and ξ is the entropy, or

information homogeneity, of a cluster. The more homogeneous a cluster with respect to

the distribution of values within it, the lower its entropy. This clustering method

therefore tries to join sub-clusters where the difference between their separate entropies

and their joined entropy is small.
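A small worked example may help make this distance concrete. The sketch below reads equation 2.1 as d(w, z) = ξ_w + ξ_z − ξ_<w, z> and, as an assumption, takes ξ for a cluster described by a single categorical attribute to be the negative of its count-weighted Shannon entropy, so that ξ behaves like a log-likelihood and the distance is never negative. Under that sign convention a perfectly homogeneous cluster has ξ = 0 and more mixed clusters have more negative values. The sign convention, the function names and the toy clusters are illustrative choices and do not cover the continuous-variable part of the measure.

```python
import math
from collections import Counter


def xi(values):
    """Negative count-weighted Shannon entropy of a cluster's values:
    0 for a homogeneous cluster, more negative as the cluster gets more mixed."""
    n = len(values)
    return sum(c * math.log(c / n) for c in Counter(values).values())


def log_likelihood_distance(w, z):
    """Equation 2.1: loss in log-likelihood when w and z are merged."""
    return xi(w) + xi(z) - xi(w + z)


# Hypothetical sub-clusters described by one categorical attribute.
same_a = ["a"] * 4
same_b = ["a"] * 4
other = ["b"] * 4

d_alike = log_likelihood_distance(same_a, same_b)
d_unlike = log_likelihood_distance(same_a, other)
print(d_alike, d_unlike)
```

Merging two sub-clusters with identical contents costs nothing, while merging two sub-clusters with different contents produces a mixed cluster and therefore a large distance, so the former pair would be joined first.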

Now that a means of organizing and analyzing the relationships between

cognitive associations has been described, the literature that offers explanations for these

relationships and any patterns detected is reviewed next.

Social Process Theory

Simmel introduced the idea that all social phenomena are the collective result of

transactions between individuals (Simmel and Levine 1972). He observed that there is

no universal law or force compelling any social act from the top down, it is solely

created from the bottom up by individuals acting under their own volition. Cederman

(2005) sees Simmel as the progenitor of the science of complexity, the search for simple

rules (also called generative rules) that underlie complex phenomena. Phelan (2001)

differentiates between traditional and complexity science:

Traditional science seeks direct causal relations between elements in the

universe whereas complexity theory drops down a level to explain the

rules that govern the interactions between lower-order elements that in

aggregate create emergent properties in higher-level systems. (p. 128)

Cederman (2005) builds on this concept in defining social process theory:

The sociological process approach starts with an observed social

phenomenon, whether unique or ubiquitous, and then postulates a process

constituted by the operation of mechanisms that together generate the

phenomenon in question. (p. 867)

The processes that operate together to generate a phenomenon are often called

social forms.

The blog is a prime example of an arena dominated by the dynamic of

complexity: the blog author can post content but cannot compel response. Readers are

moved to respond in a manner reflecting the diversity of ways they interpret the blog

entry, possibly inspiring a cascade of responses from more readers in a cycle that repeats

until readers no longer have anything to write. In the remainder of the chapter, six social

forms that previous research has found to govern collective action are discussed. These

six complex processes do not constitute an exhaustive list, but are merely the processes

that previous research suggests might be most operant and important to communities in

the blogosphere.

The Echo Chamber

Scoble and Israel’s (2006) “echo chamber” effect refers to the illusion of a

vibrant community that frequent communication between a few parties can create:

Blogging can fool you. You may think you are conversing with the world,

when it’s just a few people talking frequently, back and forth to each

other, creating the illusion of amplification. The echo chamber can

deceive a business into thinking it is either more widely successful or

further off the mark than it is in reality, because a few people are making

a lot of noise. (p. 134)

Three similar phenomena have been addressed in academic research under the terms:

cultural tribalism, groupthink and group polarization. Kitchin (1998) described cultural

tribalism in this way:

… communities based upon interests and not localities might well reduce

diversity and narrow spheres of influence, as like will only be

communicating with like. As such, rather than providing a better

alternative to real-world communities cyberspace leads to dysfunctional

on-line communities … (p. 90)

Cultural tribalism is thus portrayed as the ultimate equilibrium condition of all online

communities. Since the costs of trial and switching are low, people will sample a large

number of communities and migrate to the ones wherein they feel most at home, those

where they hear enough of what they want to hear to feel cognitively at ease. Such

groups are likely to be small, due to an intolerance of dissent, albeit close-knit.

Conversation may be lively, but always among the same people, sharing a limited range

of thoughts. Therefore, this migration to comfortable cognitive spaces causes cognitive

diversity to be sacrificed. If cultural tribalism is as strong an influence as Kitchin (1998)

opines, then communities characterized by diverse thought are likely to diminish over

time with high cognitive diversity being only temporary.

FIGURE 2.3

The Power-Law Distribution of Attention in the Blogosphere

This argument is supported by Shirky (2003), whose meta-analysis cites the

findings of several researchers who observed from different perspectives that attention

metrics (e.g., numbers of readers, comments and trackbacks) in the blogosphere tend to

follow a power-law distribution similar to that depicted in Figure 2.3. Only a few very

popular blogs attract the attention of a large number of people while most blogs,

occupying the long tail of the distribution, attract a niche of people whose unique

preferences closely match the blog’s content.

The discussion thus far might well lead one to wonder whether the

mechanism that underlies cultural tribalism is solely one of random drift to

centers of thought sameness. Wallace (1999) offers another perspective as he

describes a different situation that will lead to low diversity in ideas expressed:

Each person can share what he or she knows with the others, making the

whole at least equal to the sum of the parts. Unfortunately, this is often

not what happens…. As polarization gets underway, the group members

become more reluctant to bring up items of information they have about

the subject that might contradict the emerging group consensus. The

result is a biased discussion in which the group has no opportunity to

consider all the facts, because the members are not bringing them up…

Each item they contribute would thus reinforce the march toward group

consensus rather than add complications and fuel debate. (pp. 81-82)

Wallace (1999) uses the term “polarization,” but is really describing a similar

phenomenon called groupthink. The mechanisms of the two phenomena are similar but

the outcomes differ. In groupthink, a group’s desire for unanimity causes people to

withhold information that might contradict what they perceive to be a growing group

consensus (Janis 1972). In group polarization, people gradually come to advocate

extreme versions of their personal beliefs causing the group to divide into more

homogeneous sub-groups along well-demarcated ideological lines (Moscovici and

Zavalloni 1969). The two processes differ in their details, but both involve individuals

compromising their personal knowledge or beliefs to achieve a form of collective unity.

It is possible therefore that cultural tribalism is a superficial condition that does not truly

indicate the diversity of thought among its members; it may be the result of a deliberate

seeking of collective unity among otherwise diverse thinkers.

However, Matz and Wood (2005) found that heterogeneous attitudes create

dissonance or tension and discomfort between members of a group. They found that the

level of such discomfort was proportional to the numerical minority status of those with

atypical views. The discomfort was partly relieved if the minority view-holders felt free to

affirm their attitudes without pressure to conform. But, true relief of discomfort was only

achieved if the minority could persuade the majority of their error, the majority could

present more convincing support for their position, or if the minority could leave and

join a more compatible group. It should be concluded then that compromising personal

values to achieve collective unity is not a sustainable situation in the blog context as

unity is not a forced necessity. The migratory aspect of cultural tribalism, given its ease

among Internet blogs, seems to be the result of a natural, unavoidable coping

mechanism.

Flocking Theory

Reynolds (1987) proposed flocking theory as a computational model that

explains how the coordinated movement of a group can emerge from individuals making

decisions based on personal information. He discovered that by using three simple

“steering behaviors,” coordination would emerge without any explicit management

activity: separation (“steer to avoid crowding local flockmates”), alignment (“steer

toward the average heading of local flockmates”), and cohesion (“steer toward the

average position of local flockmates”). These steering behaviors are really just heuristic

components of the more complicated calculation of which direction an individual should

move to be in its desired location, relative to the group, one unit of time in the future.
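
The three steering behaviors can be illustrated with a minimal sketch. It is not Reynolds' original implementation: the `Boid` class, the inverse-square weighting of separation, and the equal default weights are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Boid:
    x: float   # position
    y: float
    vx: float  # heading (velocity)
    vy: float


def steering_components(boid, neighbors):
    """Return the (separation, alignment, cohesion) vectors for one boid."""
    if not neighbors:
        zero = (0.0, 0.0)
        return zero, zero, zero
    n = len(neighbors)
    # Separation: steer away from each local flockmate. Closer flockmates
    # push harder (inverse-square weighting is an assumption of this sketch).
    sep_x = sep_y = 0.0
    for other in neighbors:
        dx, dy = boid.x - other.x, boid.y - other.y
        dist_sq = dx * dx + dy * dy
        if dist_sq > 0:
            sep_x += dx / dist_sq
            sep_y += dy / dist_sq
    # Alignment: steer toward the average heading of local flockmates.
    ali = (sum(o.vx for o in neighbors) / n - boid.vx,
           sum(o.vy for o in neighbors) / n - boid.vy)
    # Cohesion: steer toward the average position of local flockmates.
    coh = (sum(o.x for o in neighbors) / n - boid.x,
           sum(o.y for o in neighbors) / n - boid.y)
    return (sep_x, sep_y), ali, coh


def steer(boid, neighbors, w_sep=1.0, w_ali=1.0, w_coh=1.0):
    """Combine the three behaviors into one steering vector (weights assumed)."""
    sep, ali, coh = steering_components(boid, neighbors)
    return (w_sep * sep[0] + w_ali * ali[0] + w_coh * coh[0],
            w_sep * sep[1] + w_ali * ali[1] + w_coh * coh[1])
```

No individual in this sketch has any knowledge of the flock as a whole; each steering vector depends only on local flockmates, which is the sense in which coordination emerges without explicit management.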

Although flocking theory was developed as a solution to modeling the behavior

of flocking birds and animals in computer graphics, it has been extensively investigated

in a variety of academic disciplines and mathematically modeled by the physicists Toner

and Tu (1998). Rosen (2002) proposed that flocking theory was a good explanation for

self-organization in human social systems. He proposed that communication was the

mechanism of cohesion in human society where a social network of individuals shares

access to a collective body of knowledge that acts as a “roadmap” for coordinated action

with little centralized control.

Rosen (2002) based his model on multiple literature streams. Simmel and Levine

(1972) state that for social relationships to occur “the personalities must not emphasize

themselves too individually … with too much abandon and aggressiveness.” Eisenberg

and Phillips (1990) proposed that community cohesion is always a balance between

autonomy and interdependence, corresponding to Reynolds’ (1987) separation and

cohesion steering behaviors. Rosen (2002) concludes that to some extent uniformity and

common interest are essential to flock maintenance and that individuals must sacrifice

some autonomy to keep group acceptance.

From the discussion thus far, flocking and the unity-seeking compromises of

groupthink and group polarization may seem not to be distinct concepts. The important

difference is that people who engage in groupthink and group polarization are motivated

by a perceived need for collective unity, while people who engage in flocking

(“flockers”) are motivated by a personal need for social belonging. Additionally, in the

blogosphere, nothing constrains a flocker from engaging in a cultural tribalism-style

migration to another blog if the discomfort felt from compromise exceeds the value

gained from social acceptance. So the flocker’s compromise probably causes less

dissonance than that felt by the migrating tribalist.

Threshold Models of Collective Behavior

Granovetter (1978) criticized previous research on crowd behavior for being too

accepting of a universal model of individual compliance with social norms. He used as

an example the situation of a riot where individuals who would never start a riot would

participate nonetheless if sufficient other individuals joined in. He proposed that

everyone had a threshold for such behavior and that if the full range of individual

thresholds were uniformly distributed across a crowd, a riot, once started, would propagate

throughout the group getting everyone to participate. This model can be readily applied

to a blog, where a few readers might have something they would like to say but are too

shy to be the first to comment. Once some people who are less shy express their

thoughts, the shyer will feel comfortable enough to post their comments.
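
The cascade logic of a threshold model can be sketched in a few lines. The function below is an illustrative assumption rather than Granovetter's own formulation: each reader's threshold is taken to be the number of comments that must already exist before that reader is willing to post.

```python
def cascade(thresholds):
    """Return the final number of participants in a threshold cascade.

    thresholds[i] is the number of others who must already be participating
    (e.g., comments already posted) before person i joins in.
    """
    active = 0
    while True:
        # Everyone whose threshold is met by the current count participates.
        new_active = sum(1 for t in thresholds if t <= active)
        if new_active == active:
            return active
        active = new_active


# Thresholds spread evenly from 0 to 99: each new participant tips the next.
uniform_crowd = list(range(100))

# The same crowd, except the person with threshold 1 now has threshold 2.
broken_chain = [0, 2] + list(range(2, 100))
```

With `uniform_crowd`, the cascade reaches all 100 people; with `broken_chain`, it stops after the first person, because nobody is willing to be second. The comparison suggests that the outcome depends on the whole distribution of thresholds and not only on its average, so a blog entry with many interested but shy readers could still attract almost no comments.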

Memetics

Marsden (1998) called memetics the study of mind viruses, ideas that are highly

contagious and quickly spread through a population. The term “memetics” is derived

from meme, a word introduced by Dawkins (1976) in his classic book The Selfish Gene.

Dawkins (1976) was primarily concerned with arguing that most human behavior is

directed toward genetic propagation. However, as Blackmore (2001) observed, he

wanted to show that the core processes of evolution (mutation, propagation and natural

selection) are not confined to biological genetics but also apply to other replicators such as

ideas. Through memes or systems of related memes (what Blackmore called memesets),

humanity passes certain cultural practices from generation to generation because these

practices increase survivability. Marsden (1998) described a three-stage process where

acquired ideas (“you get an idea”) become ideology (“the idea gets you”) resulting in

specific behavior. It is readily apparent that this mechanism is superior to forcing people

to behave in a certain way because when the force stops the behavior will probably stop.

However, if a person can be persuaded as to the value of a certain behavior, then that

behavior is much more likely to be continued indefinitely. Just as not everyone is equally

susceptible to catching a biological virus, there are various degrees of resistance and

immunity to memes. Dawkins (2006) described two strategies that have evolved to

maximize the success of meme propagation: intense indoctrination and mass

dissemination. During intense indoctrination, the target person or population is subjected

to the high repetition of a meme, or memeset, in an attempt to overwhelm thresholds of

resistance. Indoctrination, in its attempt to suppress the opposing influence of critical

thinking, often involves the exclusion or ridicule of competing memes. In mass

dissemination, the targets are the people with the lowest barriers to adoption, so the idea

is spread as widely as possible to increase the probability of reaching people with a low

threshold of resistance.

It may seem from the discussion thus far that there is little difference between

contexts influenced by intense indoctrination and those influenced by cultural tribalism

and flocking. However, cultural tribalism and flocking are purely grassroots phenomena, while

indoctrination is driven by top-down intention to persuade others. Ultimately, to be

successful, efforts at indoctrination must inspire grassroots support. People most

susceptible to, and accepting of, the indoctrinated meme will cluster (cultural tribalism)

while others, motivated to be accepted by the core, flock around the periphery.

Need-for-Cognition

Thus far in the discussion, two innate motivations have been discussed along

with their resultant behaviors: the desire for uncompromised self-expression, revealed in

cultural tribalism, and the desire for acceptance, revealed in flocking. It is apparent that

neither of these two motivations contributes to the creation of diverse thought within a

blog as both are associated with situations where diversity of thought will be either

reduced or constrained. For blogs to be centers of diverse thought, influences must be

present that tend to increase the expression of diverse perspectives. As already stated, the

focus of this study is detecting diversity in attitudes toward organizations, brands and

products. It is proposed that the best place to see what attitudes are present is one where

attitudes are formed and modified. The blog context is likely to be such a place as blog

authors, even without exercising such extreme methods as indoctrination, are likely to

use persuasive language intended to affect attitudes. Petty and Cacioppo (1986)

proposed one of the most accepted models of attitude change: the Elaboration

Likelihood Model (ELM). The concept of an elaboration continuum, where thought

among targets of persuasion varies from low to high, is central to the ELM. Part of what

determines the level of thought exercised by targets of persuasion is their innate need-

for-cognition, an enjoyment of cognitively demanding tasks. A community of blog

commenters that reflects the full diversity of the market should exhibit the full spectrum

of the elaboration continuum in their commenting activity. Commenters at higher need-

for-cognition levels should be a grassroots influence that increases the level of thought

diversity expressed in a blog.

Thus far in this discussion, need-for-cognition is portrayed as an internal

motivation that inspires thinking. Neither the psychological nor the marketing literature

connects need-for-cognition with social behavior. However, past research has described

the phenomenon of sensemaking as an internal activity that inspires collaborative social

activity. In the next section, sensemaking is described and need-for-cognition is

suggested to be its underlying motivation.

Need-for-Cognition and Sensemaking

Sensemaking, as the word implies, is an act of making sense of circumstances. It

has already been noted that communicated messages are often perceived differently than

intended by a sender and differently again among receivers because of a noisy

transmission medium or because of differences in the interpretive repertoire of senders

and receivers. Thus, it should be no wonder that blog entries and comments might be a

source of reader confusion. The conceptualization of sensemaking in the blogosphere

used in this dissertation is based on a framework developed by Weick, Sutcliffe and

Obstfeld (2005) to explain how sensemaking occurs, and is a formative process, in

organizations. The association of organizing with blogging is particularly apt because

both are contexts where individuals react to their understanding of current conditions

and thereby shape the evolution of a community. Weick et al. describe sensemaking as

an ongoing individual process of retrospection where a person asks “what is going on

here?” and then uses the explanation they consider most plausible as a basis for action. It

seems reasonable to propose that the depth of such questioning and subsequent inquiry

should be related to an individual’s need-for-cognition. The action resulting from an

individual’s sensemaking changes the context, inspiring others to enter the sensemaking

process to update their understanding of what’s going on before they perform their own

actions. Hence, individuals enter and leave the sensemaking process in an ongoing cycle

driven by the actions of community members. Weick et al. see the sensemaking process

as a response to surprising or unexpected actions; however this study sees the element of

surprise as only affecting the degree of sensemaking an individual undergoes in response

to change. Since there is always a mismatch between how an act is intended and

perceived, all action contains an element of the unexpected.

Weick et al. see sensemaking as starting with a noticing of something “at

variance with the normal” (also described as “disruptive ambiguity”) that is then

investigated and “labeled” so as to become part of the common currency for

communicational exchange. The investigation begins with a personal retrospective

search of past experience in the person who first notices the novel event. The result of

that personal search is the initial labeling. Weick et al. see sensemaking as always a

social process where the initial labeling facilitates discussion with others where the

sensemaking continues and propagates, possibly resulting in a change of the labeling:

“thinking is acted out conversationally.” Behind the sensemaking is always a

consideration of response: “what should I do?” Action and talk thus become a cycle.

Even though every element of conversation in a blog is preserved, not every thought

expressed becomes the impetus for new thought expression. The more unexpected the

thought, the more it is likely to become part of a chain of collective learning.

In the marketing literature, Rosa et al. (1999) show how product markets are

knowledge structures that are socially constructed through a collaborative sensemaking

process among producers and consumers. In this process, consumers and producers tell

each other stories in published media: the producers convey what the products can do;

the consumers reveal what they need and their experiences with products. Through the

process, consumers and producers converge on a common understanding of what a

product category can and should do. The blog should be an ideal platform for the

sensemaking process because it is publicly visible and designed to be an interactive

conversational tool.

Cognitive Diversity

In the preceding sections on cultural tribalism, flocking, memetics and need-for-

cognition, their relevance to cognitive diversity (the diversity of thought expressed in a

blog) has been the focus of discussion. For this reason, the focus of this study is on

cognitive diversity as a construct. This is not the first study to use cognitive diversity as

a focal construct. Recent works by Page (2007) and Surowiecki (2004) extol the virtues

of cognitive diversity in groups of all kinds: diverse thinking leads to better collaborative

outcomes. Page (2007) recognizes a dual nature to diversity in cognition: diversity in

preferences and diversity in thinking styles, or as he calls them, cognitive toolboxes.

Page also notes there is some degree of interdependence between preferences and

toolboxes since people develop the thinking skills needed to satisfy their preferences and

then change their preferences based on their new ways of thinking (pp. 285-296).

Reciprocity and Game Theory

A blog is a public good, depending on the participation of a community for

value, but not diminished in value by the number of readers. In a blog, where the author

posts (a costly effort) in anticipation of response (a payoff), there is always uncertainty

about the amount of response the author will receive from any post in return for his cost

of posting. This is an example of a trust-reciprocity scenario, shown by Pereira, Silva

and Andrade e Silva (2006) to be robust in certain situations:

… reciprocity is a pattern of behavior that may be considered ‘robust’

because it emerges in specific environments where the marginal costs of

such behavior are reasonable, but not prohibitive. (p. 420)

Since the unique feature of a blog is that a single author controls the discussion,

each blog entry is like a sequential game where the author moves first. As already stated,

this first move is in expectation that the value received from all the resultant community

contributions will exceed the cost of posting. If Pereira et al.’s conclusion is applied to

blogs, then it can be supposed that if the value of the author’s post exceeds the cost of

reciprocal response, community members will reply under the pressure of a sense of

obligation to reciprocate equal to the difference between value received from the post,

and the effort expended to contribute a comment. The author’s original post plus the

cascade of member contributions will continue to raise the value received by the

community, increasing the level of obligation felt by those that have not contributed,

inspiring more contribution as thresholds of tolerance for withheld reciprocation (i.e.,

cost of reciprocation) are exceeded. Some community members may have very high

levels of tolerance or costs of reciprocation and be free riders in any specific game.

While their receipt of benefits does not diminish the value of existing content to the rest

of the community, it does cause the maximum value of content to peak at less than what

could result from full participation. While the author’s post might be seen as an act of

altruism, it is rational to assume the author will eventually stop trying to initiate a

response if repeated attempts generate less return than the cost of posting. The “thresholds

of tolerance” in this model are similar to the thresholds in the model proposed by Granovetter (1978).

Summary

This chapter provides a review of the principal streams of literature relevant to

this research study. Brand image is encoded in the minds of consumers in a variety of

ways. Stimuli, similar to that provided by a blog author’s entry, evoke free-form

responses from consumers that reveal their brand associations. These responses can be

mapped to a multidimensional space where similarity between cognitive objects can be

assessed by looking at their spatial proximity. Various social processes have been

described that purport to explain patterns in the spatial proximity of cognitive objects,

including social process theory, which ascribes these patterns to the interaction of many

social processes. This review suggests that social process theory offers a credible

explanation for the way people contribute to blogs and the diversity of thought they

express. The next chapter builds on this conclusion and proposes a conceptual model and

hypotheses to test it.

CHAPTER III

CONCEPTUAL MODEL AND HYPOTHESES

As previously stated, this research study investigates the extent to which postings

and comments in blogs reflect a company’s image in the overall market. This research

study seeks to lay a foundation for answering the above by testing for the effect of six

social processes (or forms), identified in prior research and detailed in Chapter II, that

are known to have an effect on the diversity of thought expressed in a group forum.

First, this chapter presents a discussion on this study’s dependent variable, a

latent variable construct representing cognitive diversity, that is, diversity of thought

expression. Next, six latent independent variable constructs are described. Each

represents one of the social processes described in Chapter II. Two conceptual models,

hypotheses and conceptual support for the hypotheses are presented.

Cognitive Diversity

Once multidimensional scaling has located blog entries and their associated

comments into stable positions within cognitive space, thematic clusters can be detected.

In Chapter II, four cognitive diversity metrics were introduced from measures of

thematic clusters: the number of clusters associated with a blog entry, their mean radius

and density, and the percentage of outliers. As depicted in Figure 3.1, these metrics can

be used as indicator variables for the diversity of thought (i.e., cognitive diversity) latent

variable. The influences on diversity of thought are all formative effects and are defined

as independent variables in the sections to follow. As stated in Chapter I, blog authors

write posts or entries that prompt commenters to write replies. Since it is the relative

locations of comments to each other and their associated blog entry that are the basis for

this study’s measures of cognitive diversity and the constructs hypothesized to affect it,

the blog entry is the unit of analysis, unless otherwise noted.
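
The four indicator metrics can be computed from mapped coordinates and cluster labels, as in the following sketch. The operational definitions used here are assumptions for illustration only: a cluster's radius is taken as the largest member-to-centroid distance, its density as members per unit of radius, and outliers are marked with the label -1. The definitions given in Chapter II govern the actual analysis.

```python
import math
from collections import defaultdict


def diversity_metrics(points, labels, outlier_label=-1):
    """Summarize thematic clusters for one blog entry.

    points: coordinates of the comments in cognitive space.
    labels: one cluster label per point; outlier_label marks outliers.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        if label != outlier_label:
            clusters[label].append(point)

    radii, densities = [], []
    for members in clusters.values():
        dim = len(members[0])
        centroid = tuple(sum(m[k] for m in members) / len(members)
                         for k in range(dim))
        # Assumed definition: radius is the farthest member from the centroid.
        radius = max(math.dist(m, centroid) for m in members)
        radii.append(radius)
        if radius > 0:
            # Assumed definition: density is members per unit of radius.
            densities.append(len(members) / radius)

    n_outliers = sum(1 for label in labels if label == outlier_label)
    return {
        "n_clusters": len(clusters),
        "mean_radius": sum(radii) / len(radii) if radii else 0.0,
        "mean_density": sum(densities) / len(densities) if densities else 0.0,
        "pct_outliers": 100.0 * n_outliers / len(labels) if labels else 0.0,
    }
```

For example, `diversity_metrics([(0, 0), (2, 0), (10, 10), (10, 12), (50, 50)], [0, 0, 1, 1, -1])` describes two tight clusters and one outlying comment.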

FIGURE 3.1

Cognitive Diversity Latent Variable

Cultural Tribalism, Need-for-Cognition and Cognitive Diversity

In Chapter II, cultural tribalism was described as a process whereby like-minded

people are brought together by a desire to feel validated by the exercise of

uncompromised self-expression. Since this is generally only possible among people who

agree, it results in the creation of contexts where members only hear their own limited

ideas and perspectives echoed by each other. Need-for-cognition was described as a

competing motivational influence where people who like intense thinking are attracted

into the blog conversation to engage in high intensity cognitive tasks (like sensemaking)

and thereby expand the diversity of thought expressed. As explained more fully next,

this study conceives of these two competing motivations as occupying opposite

quadrants of a conceptual space defined by variations in individual thought diversity

(i.e., the variety across an individual’s past and current idea expressions) and collective

thought diversity (i.e., current variety across different individuals) as shown in Figure 3.2.

FIGURE 3.2

Cultural Tribalism and Need-for-Cognition as Opposing Motivations

In Chapter I, it was mentioned that blog monitoring can be part of a feedback

loop, allowing companies to monitor and adjust their self-presentation. As such, blog

monitoring facilitates organizational learning. March (1991) argues that cognitive

diversity is the key to such knowledge development because it results in new alternatives

being applied to problem solving rather than a constant recycling of old solutions. March

opines that cognitive diversity is achieved by the constant influx of new voices; these

new people will not have developed enough attachments with or within the group to be

overly concerned with achieving cohesion or forming alliances. Since cultural tribalism

causes people to form long term associations with communities wherein they feel most

comfortable, it is reflected in the mean longevity, or mean time between the first

comment and the current comment, of commenters to the current blog entry (“Mean

Commenter Longevity”) and the percentage of commenters who also commented on the last

blog entry (“% Repeat Contributors (T-1)”). The latter indicator captures a recency

effect; that is, it is assumed that if a person commented on the last blog entry, then they

are more likely to comment on the current blog entry and thus lower the diversity in

thought expression for the whole community. This assumption is based on Holme’s

(2003) finding that ties to virtual communities decay over time with decreasing contact.

A recent comment is evidence of current membership and therefore a strong predictor of

further comments in the immediate future.

However, long term associations are not sufficient to distinguish cultural

tribalism from the other influences that affect cognitive diversity. Commenters that

receive value from satisfying their need-for-cognition can also be expected to be regular

commenters. As described in Chapter II, cultural tribalism is predicted to cause closer-

knit relationships between people who express similar views. As a result, Figure 3.3 also

depicts cultural tribalism as reflected by a negative relationship to: (1) the mean entry-to-

comment and comment-to-comment cognitive distance (denoted “Mean Collective

Thought Separation” in Figure 3.3), (2) the standard deviation of entry-to-comment and

comment-to-comment cognitive distance across blog entries (denoted “Stdev Collective

Thought Separation”), (3) the mean cognitive distance between an individual’s

comments across blog entries (denoted “Mean Individual Thought Separation”), and (4)

the standard deviation of the pairwise cognitive distances between an individual’s

comments across blog entries (denoted “Stdev Individual Thought Separation”). Need-

for-Cognition is depicted in Figure 3.3 as reflected by positive relationships to the same

indicators. Thus,

H1: Cultural tribalism is negatively related to cognitive diversity.

H2: Need-for-Cognition is positively related to cognitive diversity.

Hypotheses 1 and 2 are reflectively modeled in Figure 3.3. In Figure 3.3, the unit

of analysis is the blog entry.

Blog entry comments primarily influenced by cultural tribalism can be

distinguished from those primarily influenced by need-for-cognition by a discriminant

model (equation 3.1) that differentiates on the basis of diversity of thought expressed

using similar metrics to those depicted in Figure 3.3.

(3.1) $Z_j = \beta_0 + \beta_1 \bar{d}_C + \beta_2 \sigma_C + \beta_3 \bar{d}_I + \beta_4 \sigma_I$

FIGURE 3.3

Cultural Tribalism / Need-for-Cognition Reflective Sub-Models

In equation 3.1, the unit of analysis is the blog entry comment, $Z_j$ is the

discriminant Z score of the discriminant function j, $\bar{d}_C$ is the mean entry-to-comment

and comment-to-comment cognitive distance across comments to the current blog entry

(a limited form of the metric denoted “Mean Collective Thought Separation” in Figure

3.3), $\sigma_C$ is the standard deviation of entry-to-comment and comment-to-comment

cognitive distance across comments to the current blog entry (a limited form of the

metric denoted “Stdev Collective Thought Separation” in Figure 3.3), $\bar{d}_I$ is the mean

comment-to-comment cognitive distance for the current commenter across blog entries

(a limited form of the metric denoted “Mean Individual Thought Separation” in Figure

3.3) and $\sigma_I$ is the standard deviation of comment-to-comment cognitive distance for the

current commenter across blog entries (a limited form of the metric denoted “Stdev

Individual Thought Separation” in Figure 3.3).
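
The terms of equation 3.1 can be assembled as in the sketch below. It assumes that the blog entry and comments have already been located as coordinates in cognitive space, that cognitive distance is Euclidean distance in that space, and that the standard deviations are population values. The function names and the coefficients passed in are hypothetical; in the study the coefficients would come from estimating the discriminant function.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def thought_separation(points):
    """Return (mean, standard deviation) of all pairwise distances.

    Returns (0.0, 0.0) when fewer than two points are available.
    """
    distances = [math.dist(a, b) for a, b in combinations(points, 2)]
    if not distances:
        return 0.0, 0.0
    return mean(distances), pstdev(distances)


def discriminant_score(entry_thread, commenter_history, betas):
    """Evaluate Z = b0 + b1*d_C + b2*s_C + b3*d_I + b4*s_I for one comment.

    entry_thread: coordinates of the current blog entry and its comments
        (entry-to-comment and comment-to-comment distances both enter).
    commenter_history: coordinates of the current commenter's comments
        across blog entries.
    betas: five coefficients (hypothetical values in this sketch).
    """
    d_c, s_c = thought_separation(entry_thread)
    d_i, s_i = thought_separation(commenter_history)
    b0, b1, b2, b3, b4 = betas
    return b0 + b1 * d_c + b2 * s_c + b3 * d_i + b4 * s_i
```

Because the entry and its comments are pooled in `entry_thread`, a single call to `thought_separation` covers both the entry-to-comment and the comment-to-comment distances described above.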

The parameters of equation 3.1 can be used to distinguish between contexts

primarily motivated by cultural tribalism and those motivated by need-for-cognition. To

conceptually distinguish between cultural tribalism and need-for-cognition, the

parameters of equation 3.1 can be divided into two groups to create sub-factors that

represent individual thought diversity (mean and standard deviation in individual thought

separation) and collective thought diversity (mean and standard deviation in collective

thought separation). These sub-factors are used as axes of the perceptual map shown in

Figure 3.2. If blog entry comments are plotted on this perceptual map, their type should

be readily distinguishable by their presence in one of the marked quadrants. Since the

unit of analysis of equation 3.1 is the comment, comments must be added to Figure 3.2

one-by-one in the order they were created. Individual and collective thought diversity

coordinates would be calculated at each step using the comment being added, and its

thematic relationship to the comments and blog entry already present.

Threshold Models of Collective Behavior and Cognitive Diversity

In Chapter II, Granovetter’s (1978) model of a generative process was described

as a process that builds on itself. Granovetter used his model to elaborate an example

scenario that assumes everyone has a threshold for violent behavior. If the full variety of

individual thresholds were uniformly distributed within a crowd, a riot, once started,

would propagate throughout the group getting everyone to participate. The idea behind

this process is used in this study to propose a mechanism for the emergence of divergent

themes within the comments to a blog entry. In this situation, the existence of a class of

commenters, denoted idea seeders, is predicted. Idea seeders are the riot-starters of a

blog. They are among the most uninhibited, and therefore the most prolific, commenters.

They are willing to express ideas on the most diverse array of themes. Just as riot

participation spreads from the originators to the rest of a crowd in a cascading fashion,

idea seeders are consistent starters of new thematic clusters; their willingness to express

themselves emboldens others with similar ideas to write comments, cascading through

the readership until all like thoughts are expressed. As the number of commenters

increases, the probability of having an idea seeder among them likewise increases. Thus,

H3: Idea seeding is positively related to cognitive diversity.

Flocking Theory and Cognitive Diversity

In Chapter II, it was noted that Rosen (2002) proposed that flocking theory, a

model explaining how the coordinated movement of a group can emerge from

individuals making decisions based on personal information, was a good explanation for

self-organization in human social systems. Rosen concluded that, to some extent,

uniformity and common interest are essential to flock maintenance and that individuals

must sacrifice some autonomy to keep group acceptance. In a blog conversation,

consistency of theme is created and preserved by commenters controlling how far their

comments diverge from the overall group theme. Commenters balance their desire to

express unique, individualistic ideas with the need to conform to the conversational

direction of the group. They thus take the conversation in the direction they perceive the

flock is already going. Thus,

H4: Flocking behavior is negatively related to cognitive diversity.

It is proposed that the flocking process is closely related to the threshold behavior

discussed earlier, in that idea seeders are the emergent flock leaders. Their comments

initiate a cascade of other comments whose thematic consistency exhibits the flocking

phenomenon. There is therefore some probability that there will be two classes of

commenter in a blog entry conversation: idea seeders who often lead the flock and those

who primarily follow.

An estimate of the effect of flocking and idea seeding on a blog entry is

determined in a three step process: separate idea seeders from flock followers (hereafter

denoted flockers), associate the classified individuals with the blog entries they

commented on and then estimate the value of the flocking and idea seeding latent factors

from the participation history of the associated commenters.

Idea seeders can be separated from flockers by a discriminant model (equation

3.2) that differentiates between commenters on the basis of diversity of thought

expressed. Commenter type is assessed on the basis of how prolific their comment

activity is (C), the mean cognitive distance ($\bar{S}$) between their comments across blog

entries (widest for idea seeders), the standard deviation of the cognitive distances ($\sigma_S$)

between their comments across blog entries (largest for idea seeders), the number of new

thematic clusters they started (K) and the standard deviation of the cognitive distances

($\sigma_K$) between their comments and the centroids of the closest thematic clusters (widest

for idea seeders) across blog entries.

(3.2) $Z_j = \beta_0 + \beta_1 C + \beta_2 \bar{S} + \beta_3 \sigma_S + \beta_4 K + \beta_5 \sigma_K$

In equation 3.2, the unit of analysis is the individual commenter, $Z_j$ is the

discriminant Z score of discriminant function j, and the remaining parameters are as defined

above. It should be noted that the mean cognitive distance between comments and the

centroids of the closest thematic clusters across blog entries is not expected to

differentiate between idea seeders and flockers because, as described in Chapter II,

flockers can have wide variety in their preferred distance from the rest of the flock. The

distinguishing factor is that there will be a high level of consistency in this distance. That

consistency is fully captured in the standard deviation of the cognitive distances between

their comments and the centroids of the closest thematic clusters across blog entries.
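
As an illustration only, the predictors of equation 3.2 and the resulting discriminant score can be sketched in Python. The sketch assumes that each comment has already been located as a coordinate tuple in the cognitive space and that Euclidean distance stands in for cognitive distance. The function names, the data layout and the coefficient values are hypothetical and are not taken from this study.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def commenter_features(comments, clusters_started):
    """Build the five predictors of equation 3.2 for one commenter.

    comments: list of (position, nearest_centroid) pairs, one per comment,
              each item a coordinate tuple in the cognitive space (assumed).
    clusters_started: number of new thematic clusters the commenter began (K).
    """
    positions = [pos for pos, _ in comments]
    # Distances between every pair of the commenter's own comments.
    between = [dist(a, b) for a, b in combinations(positions, 2)]
    # Distance from each comment to the centroid of its closest cluster.
    to_centroid = [dist(pos, cen) for pos, cen in comments]
    return {
        "C": len(comments),
        "S_mean": mean(between) if between else 0.0,
        "S_sd": pstdev(between) if between else 0.0,
        "K": clusters_started,
        "K_sd": pstdev(to_centroid) if to_centroid else 0.0,
    }


def discriminant_score(features, betas):
    """Z = b0 + b1*C + b2*S_mean + b3*S_sd + b4*K + b5*K_sd."""
    order = ["C", "S_mean", "S_sd", "K", "K_sd"]
    return betas[0] + sum(b * features[name] for b, name in zip(betas[1:], order))


# Hypothetical commenter with three comments and two clusters started.
history = [
    ((0.0, 0.0), (0.0, 1.0)),
    ((3.0, 4.0), (3.0, 4.0)),
    ((6.0, 8.0), (6.0, 5.0)),
]
features = commenter_features(history, clusters_started=2)
# Placeholder coefficients; real values would come from fitting the model.
betas = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
print(discriminant_score(features, betas))
```

In practice the coefficients would be estimated by a discriminant analysis over all commenters; the placeholder values here only show how the score is assembled from the five predictors.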

Once commenters are separated into idea seeders and flockers, then the influence

that each type exerts on the cognitive diversity of a blog entry can be modeled as shown

in Figure 3.4. In Figure 3.4, all the parameters of equation 3.2 are useful in assessing the

relationship between idea seeding and cognitive diversity. The relationship between

flocking and cognitive diversity is fully captured by using three indicators: the mean

number of clusters started, the standard deviation of individual comment-to-cluster

centroid distances and the mean number of comments written by commenters to the

current blog entry. This is because flockers are only characterized by being very

consistent in the degree their thought expression over time differs from that of the group

and by being more inhibited in their thought expression as a result of being more careful.

Therefore, flockers will comment less and start fewer new thematic clusters. Variation in

the cognitive distance between their comments across blog entries is not important

because that is completely dependent on the thematic perspective of the flock they

belong to when they comment on the current blog entry.

FIGURE 3.4

Flocking and Idea Seeding Reflective Sub-Models

Memetics and Cognitive Diversity

In Chapter II, memetics was described as the study of mind viruses (i.e., memes).

It was mentioned that Dawkins (2006) described two strategies that evolved to maximize

the success of meme propagation: intense indoctrination and mass dissemination.

Indoctrination uses repetition of the same meme or system of related memes (a

memeset), often while simultaneously excluding or ridiculing competing memes, in a

closed cognitive context (in this case, a blog) to overcome resistance, increase

acceptance and spread of the meme ideas. Mass dissemination spreads a meme or

memeset in as many cognitive contexts as possible, looking for a sympathetic audience

of potential converts with little resistance to the meme ideas. Note that, in the

blogosphere, it is not suggested that mass dissemination occurs under the organizing

influence of some central control. The personality and enthusiasm of individuals drive

them to spread the memes they have embraced in a wide variety of cognitive contexts.

Like flocking, this is proposed to be an example of emergent behavior:

individuals acting in the same manner, of their own volition, inadvertently

create the appearance of collective order.

FIGURE 3.5

Spectrum of Blog Author Thought Diversity

Alternatively, as noted in Chapter II, indoctrination, as a top-down influence, can

only occur under the influence of a central control. In the blogosphere, that control can

only be a blog author, whose cognitive distance and, probably, time interval between

blog entries are less than average across the blogosphere. Since indoctrination is, by

definition, an extreme behavior, it is proposed that blog authors occupy a spectrum of

thought diversity, and by implication, tolerance of diverse thought, as depicted in Figure

3.5. Throughout this discussion of indoctrination, it was assumed that there was some

intent on the part of the blog author to propound a limited set of views. It is possible

though that the appearance of indoctrination might be created by a blog author who

watches the blog to see what topics garner the most discussion and then repeatedly

introduces those topics with no intent but to keep the conversation lively.

In Chapter II, indoctrination and mass dissemination were both connected to

Granovetter’s (1978) thresholding dynamic. It was reasoned that indoctrination seeks to

overcome thresholds of resistance to meme ideas through repetition in a closed context,

while mass dissemination seeks to find people with low thresholds of resistance in a

variety of contexts (e.g., those characterized by both cultural tribalism and need-for-

cognition). Just as rioting behavior can spread through a crowd as differing individual

thresholds of riot resistance are overcome by observing others’ rioting behavior, so

meme acceptance can be expected to spread through any group as the unconverted see

others in their vicinity become converted. This cascading propagation of meme

acceptance is expected to be aided by flocking behavior. People with flocking propensity

will sense if their local group’s thematic direction has been affected by the acceptance of

a meme or memeset. In order to preserve group acceptance, they will control the extent

to which their expressed ideas diverge from their perception of the group norm, thus

aiding in memetic propagation.
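
The cascading dynamic described above can be illustrated with a minimal Python sketch of a threshold model in the spirit of Granovetter (1978). It assumes that each person holds a fixed threshold, expressed as the number of others who must already have adopted a behavior (rioting or, here, expressing a meme) before that person adopts it. The function name and the example thresholds are hypothetical.

```python
def cascade_size(thresholds):
    """Return how many people end up participating.

    thresholds[i] is the number of others who must already be participating
    before person i joins (0 means the person acts unprompted).
    """
    active = 0
    while True:
        # Everyone whose threshold is met by the current count participates.
        now_active = sum(1 for t in thresholds if t <= active)
        if now_active == active:
            # No one new joined, so the cascade has stopped.
            return active
        active = now_active


# Evenly spread thresholds: each new participant tips the next person.
full_cascade = cascade_size(list(range(100)))

# Same crowd, but the person with threshold 1 now needs 2 others.
stalled = cascade_size([0, 2] + list(range(2, 100)))
print(full_cascade, stalled)
```

With thresholds 0, 1, 2, ..., 99 the whole crowd is drawn in, whereas raising the second person's threshold from 1 to 2 leaves only the initiator participating. The outcome therefore depends on the full distribution of thresholds, not on the average level of resistance.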

FIGURE 3.6

Locus of Influence Conceptual Map

On the basis of the foregoing, it is further argued that indoctrination and mass

dissemination can be opposing means of memetic propagation. An individual attempting

to disseminate a competing meme in an indoctrination context will face the opposition of

an entrenched set of memes, and a process of constant reinforcement of those existing

memes, that will work to suppress the propagation of a new meme. If the new meme

does manage to take hold and propagate, the commenters who embrace the new meme

will probably migrate, as a cultural tribe, to a different blog context that may indeed be

an indoctrination arena for the new meme. Recall from Chapter II that indoctrination was

denoted as a top-down influence while cultural tribalism was denoted a “grassroots”

phenomenon. It was noted that for indoctrination to have any effect it must garner

grassroots support. It is argued that indoctrination will more often gain its grassroots

support from cultural tribalism, while blog author free thinking will most often gain its

support from need-for-cognition. These proposed relationships are depicted as a

conceptual map in Figure 3.6.

FIGURE 3.7

Evangelists and Idea Seeders

As already stated, mass dissemination is manifested as an individual, rather than

a systemic, activity where certain prolific commenters have low diversity in thought

expression as they repeat the same ideas. These commenters are denoted as evangelists.

This description might make evangelists appear similar to idea seeders; however, idea

seeders are not only prolific but diverse in the expression of their thoughts. The

distinction is made more explicit in Figure 3.7 which shows an assumed normal

distribution in the diversity of expressed thought among prolific commenters;

commenters in the low diversity tail are evangelists while those in the high diversity tail

are idea seeders.

FIGURE 3.8

Perceptual Map of Commenter Types

The parameters of equation 3.2 can be re-estimated to distinguish between

evangelists, idea seeders and flockers. To conceptually distinguish between flocking,

idea seeding and evangelism, the parameters of equation 3.2 can be divided into two

groups to create sub-factors that represent individual thought diversity (mean and

standard deviation in individual thought separation) and boldness (number of thematic

clusters started, standard deviation in comment-to-closest-thematic-cluster separation

and the number of comments contributed). These sub-factors can be used as axes of the

perceptual map shown in Figure 3.8. If commenters, based on their participation history,

are plotted on this perceptual map, their type should be readily distinguishable by their

presence in one of the marked quadrants.
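
The quadrant reading of the perceptual map can be sketched in Python as follows. This is an assumption-laden illustration: the two sub-factor scores are taken as already computed, the quadrant boundaries are placed at the sample means, and low-boldness commenters are treated as flockers whatever their diversity. Neither the cut points nor the function names come from this study.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw sub-factor scores to z-scores across all commenters."""
    centre, spread = mean(values), pstdev(values)
    return [(v - centre) / spread if spread else 0.0 for v in values]


def classify_commenter(diversity_z, boldness_z):
    """Assign a commenter type from standardized diversity and boldness.

    Assumption: quadrant boundaries sit at z = 0, and any commenter who is
    not bold is a flocker regardless of thought diversity.
    """
    if boldness_z <= 0.0:
        return "flocker"
    return "idea seeder" if diversity_z > 0.0 else "evangelist"


# Hypothetical raw sub-factor scores for four commenters.
diversity = standardize([0.9, 0.1, 0.5, 0.4])
boldness = standardize([12.0, 11.0, 2.0, 3.0])
types = [classify_commenter(d, b) for d, b in zip(diversity, boldness)]
print(types)
```

Bold commenters separate into idea seeders and evangelists on the diversity axis, which mirrors the two tails of the distribution in Figure 3.7.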

Merriam-Webster (2007) defines boldness as “showing a fearless, daring spirit.”

Roget (1962) defines it as “willingness to take risks,” (p. 891) with synonyms such as

“adventuresomeness” and “venturesomeness.” These terms and definitions are very

much in keeping with the concept of entrepreneurship, defined by Merriam-Webster as

“assuming the risks of a business.” Another similar term is innovativeness, defined by

Roget as: “being characterized by, or productive of, new things or ideas.” (p. 139)

Clearly, boldness, innovativeness and entrepreneurship are related terms and concepts.

However, entrepreneurship has an explicit business connotation, whereas boldness is a more

generic term. This difference is important to this study, as a word denoting a willingness

to take risks in expressing ideas is called for, despite the marketing context.

Innovativeness also does not fully capture the need as it involves producing new

ideas, while idea seeders (those high in the quality for which a label is sought) need not

be idea originators; they may also be conduits for ideas acquired elsewhere that they

think are applicable to a particular blog conversation. Idea seeders thus enable what Pink

(2005) calls symphony: taking little pieces of information, and weaving them together so

a broader pattern can be discerned (p. 125). Idea seeders, and later the bolder and more

intellectually diverse flockers, individually bring little pieces of information, both

original and from diverse sources, while the collective conversation weaves them

together to find the broader pattern. This makes idea seeders a new type of influential

(Katz and Lazarsfeld 1955) distinct from the concept of mavens and opinion leaders (as

presented by Vernette 2004). The latter were seen as static in their information

dissemination role (Watts and Dodds 2007), while these are information conduits on a

more happenstance basis, dependent on what the conversation calls to mind and what

their boldness allows them to express to the group.

Now that the relevant constructs are defined, the influence of indoctrination and

mass dissemination on cognitive diversity can then be reflectively modeled as shown in

Figure 3.9.

FIGURE 3.9

Memetic Propagation Reflective Sub-Models

While it should be apparent from the preceding discussion that cognitive

diversity will be lower in blogs highly influenced by indoctrination, it might be thought

that mass dissemination is just an isolated act of individual fanatics that can have no

effect on overall thought diversity. However, it has already been noted that mass

dissemination can cause memes to enjoy a threshold-overwhelming cascading

propagation of acceptance, aided by flocking behavior. Therefore, both mass

dissemination and indoctrination can be major influences on cognitive diversity. Thus,

H5: Mass dissemination is negatively related to cognitive diversity.

H6: Indoctrination is negatively related to cognitive diversity.

Reciprocity and Cognitive Diversity

In Chapter II, it was mentioned that in a blog, the author posts (a costly

effort) in anticipation of response (a payoff). Equation 3.3, a variation of a model

proposed by Gintis (2004), describes the value received by the author ($v_A$) as the

difference between the author’s valuation of the community’s contribution ($B_{TA}$)

and his own cost ($c_A$).

(3.3) $v_A = B_{TA} - c_A = \sum_{k=1}^{K} [\ldots] - c_A$

Where: K is the number of responders, b is the benefit the responder intended to

contribute (0 < b < 1), ε is the responder’s ineffectiveness in contributing the intended

benefit ($0 < \varepsilon < 1$), δ is the author’s premium (or discount) for the value of each

responder’s contribution ($-1 < \delta < 1$), and $c_A$ is the author’s cost of posting.

Both active (those contributing to the discussion) and passive community

members receive the value from the author’s post and from the contributions of active

members. Equation 3.4 shows the value received by each contributing member ($v_C$) and

equation 3.5 shows the value received by passive members ($v_P$).

(3.4) $v_C = B_{OC} + b_{AC} - c_C = \sum_{k \neq C} [\ldots] + b_{AC} - c_C$

(3.5) $v_P = B_{TP} + b_{AP} = \sum_{k} [\ldots] + b_{AP}$

Where: $B_{OC}$ or $B_{TP}$ is the prospective contributor’s valuation of the contributions of

other active members, $b_{AC}$ or $b_{AP}$ is the prospective contributor’s valuation of the

author’s post and $c_C$ is the prospective contributor’s cost of participation.

Equations 3.4 and 3.5 can be applied to the community as a whole. Equation 3.6

is the total value received by a community with at least one participating member ($v_{TC}$).

Equation 3.7 is the value received by a passive community ($v_{TP}$).

(3.6) $v_{TC} = B_{TC} - C_{TC} + b_{AT} = \ldots$

(3.7) $v_{TP} = b_{AT}$

Where: $B_{TC} - C_{TC}$ is the net benefit of community contribution and $b_{AT}$ is the

community’s valuation of the author’s post.

As a game, this scenario can be described by the following matrix:

                          Author
                          Post                   No Post
Community    Reply        $v_{TC}$, $v_A$
             No Reply     $v_{TP}$, $(-c_A)$     0, 0

If the author posts and the community reciprocates, the community receives certain

payoff $v_{TC}$ while the author receives contingent payoff $v_A$. If the author posts and the

community fails to respond, the community receives certain payoff $v_{TP}$, its valuation of

the author’s post. The community always knows the current value of the discussion and

will only post when the value of existing content exceeds the lowest reciprocation cost

of any member under the terms described below. The author never knows his payout

prior to his post; however, for the author to keep posting in repeated games, he must

expect his payoff to at least exceed his cost ($v_A > c_A$).

The community maximizes its self-interest differently based on its expectations

of whether the author will keep on posting content of at least minimal quality from its

own perspective (repeated game) or abandon the blog if no or too little response is made

in the first game (non-repeated game). Since the community can always count on the $v_{TD}$

payout, the $v_{TC}$ payouts are easily calculated. Equilibrium values for the community

payoffs are as given in equations 3.8 and 3.9:

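(3.8) [equation not recoverable from the source; cases: non-repeated and repeated game]

(3.9) [equation not recoverable from the source; cases: non-repeated and repeated game]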

The return to the reciprocating community is greater than that of the passive community

if the game is repeated more than once ($3 \times C_{TC}$ versus $2 \times C_{TC}$). So, this model

supports the prediction that the community will select repeated game play from the

beginning.

Since blogs can be dedicated to a variety of subjects, there can be no fixed

standard by which to assess value; content is valuable if the blog community sees it as

being valuable. This study proposes to objectively measure value under these terms by

adopting a variation of Google’s PageRank algorithm similar to that specified in Dwyer

(2007). Dwyer (2007) observed that Google faced a problem similar to the one faced

here when they wanted to find the most valued web content associated with any search

term. Since webpages can contain content on any topic, and they needed a way of

assessing value without analyzing the content, they adopted a populist criterion for

assessing value whereby the most link-referenced webpages were considered most

valued. According to Bianchini, Gori, and Scarselli (2005), the PageRank ($x_p$) of page p

is computed by taking into account the set of pages (pa[p]) pointing to p:

(3.10) $x_p = d \sum_{q \in pa[p]} \frac{x_q}{h_q} + (1 - d)$

Where $d \in (0,1)$ is a proportioning factor and $h_q$ is the outdegree of q, that is, the

number of links coming out from page q. The proportioning factor determines the

amount of importance added to p by the pages linking to it. Page p has an inherent

importance of $1 - d$.
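
Equation 3.10 defines each page's rank in terms of the ranks of the pages pointing to it, so it is usually solved by repeated substitution. The following Python sketch shows one way to do this on a small link graph. The function name, the fixed iteration count and the example graph are illustrative assumptions, and pages without outgoing links simply pass on no importance.

```python
def pagerank(links, d=0.85, iterations=200):
    """Iterate x_p = d * sum(x_q / h_q for q in pa[p]) + (1 - d).

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {p for targets in links.values() for p in targets}
    outdegree = {q: len(links.get(q, [])) for q in pages}
    x = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Every page starts each round with its inherent importance 1 - d.
        new_x = {p: 1.0 - d for p in pages}
        for q, targets in links.items():
            for p in targets:
                # Page q shares its rank equally among its outgoing links.
                new_x[p] += d * x[q] / outdegree[q]
        x = new_x
    return x


# Hypothetical three-page web: a links to b and c, b to c, c back to a.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
print(ranks)
```

In this example page c, which is referenced by both other pages, ends with the highest rank, and page b, which receives only half of a's outgoing weight, ends with the lowest.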

In this study, a blog entry and its comments are modeled as being connected

based on their resonance or similarity of theme. It is proposed that equation 3.10 be

modified to assess the value of a comment as described by equation 3.11:

(3.11) $v_p = d \sum_{q \in cm[p]} R \cdot v_q + (1 - d)$

In equation 3.11, $v_p$ is the value assigned to comment p, and q denotes all the other comments

with a resonance (R) to p greater than 0. The total value of an author’s entry can be

calculated the same way, as the recursive sum of the values of all resonant comments.
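
The recursive valuation of equation 3.11 can be sketched in Python in the same iterative style. The sketch assumes that resonance is available as a pairwise score between comments, so that R in the equation is read as the resonance between p and q. The function name, the data layout and the example values are hypothetical. The iteration is guaranteed to settle when d times the largest row sum of the resonance scores is below 1, which holds for the small example but need not hold for arbitrary data.

```python
def comment_values(resonance, d=0.85, iterations=200):
    """Iterate v_p = d * sum(R_pq * v_q for resonant q) + (1 - d).

    resonance[p][q] is the resonance between comments p and q; pairs that
    are absent or scored 0 are treated as unconnected.
    """
    comments = list(resonance)
    v = {p: 1.0 for p in comments}
    for _ in range(iterations):
        v = {
            p: d * sum(r * v[q] for q, r in resonance[p].items()
                       if q != p and r > 0)
            + (1.0 - d)
            for p in comments
        }
    return v


# Hypothetical entry with two mutually resonant comments and one isolated one.
values = comment_values({
    "c1": {"c2": 0.5},
    "c2": {"c1": 0.5},
    "c3": {},
})
print(values)
```

A comment that resonates with nothing keeps only its inherent value of 1 - d, while resonant comments reinforce one another's value.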

In Chapter II, it was argued that some proportion of readers become commenters

under the influence of a sense of obligation over prior value received from the blog. It is

argued that the total value calculated in equation 3.11 reflects the degree of future

reciprocity that will result from an author’s entry. It is also argued that since some

proportion of commenters will be influenced by reciprocity, the amount of reciprocity

will be proportional to the total number of commenters. These two indicators are

depicted in their relationship to reciprocity in Figure 3.10. Since reciprocity should bring

in commenters from the full spectrum of commenter types, it is argued that:

H7: Reciprocity is positively related to cognitive diversity.

FIGURE 3.10

Naïve Model: Hypothesized Relationships

All eight latent variables and hypotheses are integrated into a single conceptual

model presented in Figure 3.10. Definitions of the latent variable indicators are

aggregated in Appendix B.

Alternative Model

In this chapter, the relationships between six social forms and cognitive diversity

have been combined into the naïve model of Figure 3.10. As a naïve model, the social

forms are seen exclusively as acting directly on cognitive diversity. However, since this

research study takes a social process theory perspective, it must be recognized that the

six social forms may be interlinked in a complex manner. While it may seem that these

interlinked social forms are interaction effects that should be captured by the

multiplicative interaction terms commonly found in regression models, it is not certain

that the nature of the interlinking will be detected in this manner. The social forms in this

study are often highly dependent on initial conditions, are path dependent, nonlinear and

only loosely coupled to each other. They often also involve random delays and different

timescales. Quantification of the relationships between such processes is seldom

attempted in marketing research; this study lays a foundation for future

research to address the issue.

Nevertheless, certain interrelationships that can be captured in a conventional

structural equation model have been implied in this discussion. Those interrelationships

have been combined into the model shown in Figure 3.11. In addition to including

hypotheses 1 and 2 from the naïve model, as R8 and R9 respectively, this model depicts

three exogenous influences:

a) The spectrum of blog author thought diversity, a top-down influence acting from

indoctrination to free thinking.

b) Mass dissemination, a grass-roots influence acting on the following individually

motivated or need-based social processes that subsequently influence cognitive

diversity (R8 and R9 in Figure 3.11):

i. Cultural tribalism, inspired by a need for self-expression.

ii. Flocking, motivated by a need for acceptance.

iii. Need-for-cognition.

c) Idea seeding, also a grass-roots influence, acting on the same social processes as

mass dissemination.

FIGURE 3.11

Alternative Model: Relationships

It was mentioned in the discussion of Figure 3.5 that the appearance of

indoctrination might be created by a blog author who watches the blog to see what topics

garner the most discussion and then repeatedly introduces those topics with no intent but

to keep the conversation lively. This scenario is included in the model of Figure 3.11

with the inclusion of prior collaborative value (“Total Value T – 1”) as a formative

influence on indoctrination.

In Chapter II, indoctrination and mass dissemination were both connected to

Granovetter’s (1978) thresholding dynamic. It was reasoned that indoctrination seeks to

overcome thresholds of resistance to meme ideas through repetition in a closed context,

while mass dissemination seeks to find people with low thresholds of resistance in a

variety of contexts (e.g., those characterized by both cultural tribalism and need-for-

cognition, relationships R2 and R4 in Figure 3.11). Just as rioting behavior can spread

through a crowd as differing individual thresholds of riot resistance are overcome by

observing others’ rioting behavior, so meme acceptance can be expected to spread

through any group as the unconverted see others in their vicinity become converted. This

cascading propagation of meme acceptance is expected to be aided by flocking behavior.

People with flocking propensity will sense if their local group’s thematic direction has

been affected by the acceptance of a meme or memeset. In order to preserve group

acceptance, they will control the extent to which their expressed ideas diverge from their

perception of the group norm, thus aiding in memetic propagation (R5 and R6 in Figure

3.11).

The logic underlying most of the relationships in this alternative model should

now be clear. However, the manner in which flocking and mass dissemination are

depicted may raise some questions. In this narrative, flocking commenters have been

differentiated from evangelists and idea seeders by their lack of boldness. As flockers

are thus characterizable as followers rather than leaders, the presence of flocking is

portrayed in Figure 3.11 as inspired by the more aggressive influences of cultural

tribalism (R5) and need-for-cognition (R6). It is argued that these more aggressive

influences result in the formation of core groups of interaction around whom flockers

gather depending on which group they desire to be affiliated with.

Mass dissemination, as implemented by evangelists, has been depicted in Figure

3.11 as related to both cultural tribalism (R2) and need-for-cognition (R4), while idea

seeding is exclusively related to the latter (R7). To the blog reader, evangelists and idea

seeders initially look the same. They are both commenters that often introduce radical

ideas into the blog conversation; only with close study does it become apparent that

evangelists repeatedly propound the same ideas. As a result, both will activate need-for-

cognition in the people to whom the ideas are new. The people who like the evangelized

ideas will naturally form a tribe around them.

Relevance of Sociological Theory

Hulberg (2006) examined the standard practices of corporate image management

using Burrell and Morgan’s (1979) sociological framework. Hulberg notes that Burrell

and Morgan’s framework describes four paradigms, or perspectives, that purport to

classify all sociological theories: (1) Functionalist: Consumers’ image perceptions can

be influenced by corporate messages and behavior; (2) Interpretive: Consumers’ image

perceptions are entirely formed through social interactions among them and cannot be

influenced by corporate messages or behavior; (3) Radical humanism: Consumers view

corporate attempts to convey image as manipulative; and (4) Radical structuralism:

Consumers view all corporate action as motivated toward gaining power and profit that

will be used to suppress others.

Clearly, the image management process of Figure 2.1 is predominantly based on

the functionalist paradigm: companies are seen as able to influence image perceptions.

However, the inclusion of the social forms (socio-psychological processes) hypothesized

to be the primary mediators in the expression of consumers’ thoughts to blogs, allows

the use of not only functionalist, but interpretive and radical humanist perspectives:

a) Functionalist. It is readily apparent that the memetic processes of indoctrination

and mass dissemination explicitly assume that one party can influence the image

perceptions of another. Even though idea seeding and reciprocity do not result in

perceptions of image being influenced, they do initiate cascades of behavior

where individuals influence one another by causing boldness and sense of

obligation thresholds to be exceeded. Idea seeding and reciprocity can thus be

viewed from a functionalist perspective.

b) Interpretive. Flocking and cultural tribalism are both processes where outcomes

depend on group norms established without the intervention of the blog author or

company hosting the blog. The degree of cognitive diversity expressed by

flockers will be dependent on individual perceptions of how much divergent

expression the group will tolerate. Since cultural tribalists are motivated by a

need for self-expression, they will be attracted to blog communities already

focusing on their preferred themes. The tribe can be expected to migrate to any

blog whose theme more closely matches their preference, abandoning any blog

whose theme changes sufficiently. Therefore, even though cultural tribalism is

supported by indoctrination that matches tribal preference, the phenomenon is

caused by like individuals coming together and thereafter acting in concert.

c) Radical humanist. As discussed in the section on memetic processes,

indoctrination and mass dissemination are specifically intended to be tools of

persuasion. Thus, any blog readers who are motivated primarily by need-for-

cognition are expected to recognize these influences for what they are and will,

depending on the intensity of their use, either disregard them or abandon the blog

for being too narrow in subject matter. Blog readers motivated by the need for

self-expression underlying cultural tribalism are also expected to have a similar

response. Note that blog readers whose preferences match ideas spread through

these memetic influences will not see them as manipulative.

While the association of sociological processes with image management is not

new, this study increases awareness of the variety of ways that sociological processes

influence image diffusion and the expression of image.

Summary

Earlier in the chapter, the focus of this study was stated to be investigating the

effects of six social forms, identified by prior research to strongly affect cognitive

diversity, the diversity of thought expressed in a group forum. If the operation of these

social forms causes the expression of consumer brand image to be distorted or

incomplete, then companies that mine blogs for brand image insights will not gain an

accurate picture of brand image in the market. To that end, measurable constructs for the

six social forms have been proposed as eight latent variables with suitable indicators.

These latent variable constructs and their indicators are summarized in Table 3.1. These

latent variables have been configured into two models, a naïve model proposing direct

effects on cognitive diversity and an interaction model proposing interrelationships

between seven of the latent variables. Full depictions of these models, with all reflective

and formative relationships, are included in Appendix C as Figures C-2 (Naïve) and C-3

(Alternative). The next chapter details the methodology used to test the hypothesized

relationships.

TABLE 3.1

Hypotheses and Construct Definition Summary

Constructs    Indicators

Mean cluster density. After the comment clusters for each blog entry are

found, the distance from each comment’s location in cognitive space to the

centroid of its cluster can be calculated.

Mean cluster radius. After the comment clusters for each blog entry are

found, the distance from each comment’s location in cognitive space to the

centroid of its cluster can be calculated.

Number of clusters. After the comment clusters for each blog entry are

found, the number of clusters is proposed to be indicative of the number of

conversational themes.

Cognitive Diversity. The many

cognitive styles or ways

individuals think, perceive and

remember information and then

use that information to solve

problems or, in general, enact their

behavior.

Percentage of outliers. After the comment clusters for each blog entry are

found, the distance from each comment’s location in cognitive space to the

centroid of its cluster can be calculated. The mean and standard deviation of

each cluster’s comment-to-centroid distances is calculated and a z-score

calculated for each comment. Any comment with a z-score greater than 2.0

is considered an outlier.

Mean collective thought separation. The mean of entry-to-comment and

comment-to-comment cognitive distances for the current blog entry.

Stdev collective thought separation. The standard deviation of entry-to-

comment and comment-to-comment cognitive distances for the current blog

entry.

H1 Cultural Tribalism. A state that a

virtual community can be in where

the members share a strong

ideological identity, creating a

kind of intellectual segregation

where like only talks to like.

Mean indiv. thought separation. The mean comment-to-comment cognitive

distance, across blog entries, between comments made by the same

individual, a commenter on the current blog entry.

Stdev indiv. thought separation. The standard deviation of entry-to-

comment and comment-to-comment cognitive distances for the current blog

entry.

Mean commenter longevity. The mean time between the first comment and

the current comment of all commenters to the current blog entry.

% Repeat commenters (T - 1). The number of commenters contributing to

the current blog entry who contributed to the previous blog entry.

Mean collective thought separation. The mean entry-to-comment and

comment-to-comment cognitive distances for the current blog entry.

Stdev collective thought separation. See above.

Mean indiv. thought separation. The mean entry-to-comment and comment-

to-comment cognitive distances for the current blog entry.

Stdev indiv. thought separation. See above.

H2 Need-for-Cognition. Part of what

determines the level of thought

exercised is innate need-for-

cognition, an enjoyment of

cognitively demanding tasks

(From Petty and Cacioppo’s

(1986) Elaboration Likelihood

Model).

% First time commenters. The number of commenters contributing to the

current blog entry who have never contributed to previous entries.

Mean clusters started. The mean number of thematic clusters started over

the participation lifetime of a blog commenter.

Stdev indiv. thought separation from cluster centroid. The standard

deviation of the cognitive distance between a commenter’s comments and

the centroids of the closest thematic cluster across blog entries over the

participation lifetime of a blog commenter.

Mean number of comments. The mean number of comments (across blog

entries) created by individuals commenting on the current blog entry.

Mean indiv. thought separation. See above.

H3 Idea Seeding. The most

uninhibited commenters

expressing ideas on the most

diverse array of themes.

TABLE 3.1 (CONT)

Constructs Indicators

Mean clusters started. See above.

Stdev indiv. thought separation from cluster centroid. The mean cognitive

distance between a commenter’s comments and the centroids of the closest

thematic cluster across blog entries over the participation lifetime of a blog

commenter.

H4 Flocking. People control the

expression of their thoughts to

balance individuality with group

cohesion.

Mean number of comments. See above.

Mean clusters started. See above.

Stdev indiv. thought separation from cluster centroid. See above.

Mean number of comments. See above.

Mean indiv. thought separation. See above.

H5 Mass Dissemination. One of the

two primary memetic propagation

mechanisms where the targets are

the people with the lowest barriers

to adoption, so the idea is spread

as widely as possible to increase

the probability of reaching people

with a low threshold of resistance.

Mean time between entries. The mean interval between successive entries

posted to a blog.

Mean entry-to-entry separation. The mean pair-wise cognitive distances

between entries to the same blog.

H6 Indoctrination. One of the two

primary memetic propagation

mechanisms where the target

person or population is subjected

to a high repetition of the meme

idea in an attempt to overwhelm

thresholds of resistance.

Stdev entry-to-entry separation. The standard deviation in the pair-wise

cognitive distances between entries to the same blog.

H7 Reciprocity. As people read the

blog and get value from it, a sense

of obligation to contribute builds.

Total number of commenters. The total number of unique commenters to a

blog entry.

CHAPTER IV

METHODOLOGY

Chapter III introduced an integrated naïve model of how six social processes

interact to affect cognitive diversity and an alternative model portraying certain

interrelationships between the six social processes. In this chapter, a means of testing the

explanatory sufficiency of these models is described. The chapter first describes how the

model in Figure 3.10 was instantiated as a multigroup model. It then shows how blog entry

comments were separated into those influenced primarily by cultural tribalism and those

characterized by need-for-cognition, insofar as they embody the attributes that signal the

presence of those grassroots influences. Within each blog, commenters were segmented

into idea seeders, evangelists and flockers as they embodied the influences of threshold

behavior, mass dissemination and flocking, respectively. All commenters, regardless of

type, were predicted to be capable of engaging in behavior motivated by reciprocation,

need-for-cognition and cultural tribalism. The aggregate model simultaneously captured

the influence of the full spectrum of blog author thought diversity levels and the three

groups of commenters on overall cognitive diversity.

This chapter begins by describing the data set used in this study and the

considerations made in its selection. Then, a step-by-step description is given of how the

raw data was processed to make it suitable for testing the hypotheses presented in

Chapter III. Next, the criterion used in determining whether or not hypotheses are

supported is described. Finally, the limitations associated with the methodology are

described.

Kozinets (2002) introduced netnography as a methodology where the principles

of ethnography, or unobtrusive observation, were applied to the study of a virtual

community of coffee aficionados. He specified a selection criterion for subject

communities that differed from the practice of standard ethnography: select communities

that are focused on a research question-relevant topic, receive above average posting

traffic, have a large number of contributing members, contain descriptively rich content

and thus, in general, enjoy a high level of member-to-member interaction. Having a high

level of member-to-member interactions is important to this study because it increases

the number of cognitive objects (i.e., bodies of text) and, as a result, the probability of

detecting the effects of the social processes described in Chapter III.

The study used author posts and community response data from the 15 diverse

blogs listed in Table 4.1. Consistent with the netnography methodology proposed by

Kozinets (2002), these blogs were purposively selected because they are:

a) Highly active. The authors post frequently and each post tends to attract

more than 20 comments from readers (Table 4.2).

b) Representative of the wide diversity in blog topics relevant to marketing. In light

of the broad scope of marketing as a field of study, as summarized in Table 4.1,

in addition to blogs related to companies and products, blogs related to political

ideology, popular culture and people who are business thought-leaders are

included in the study.

c) Relevant to the research questions. Indoctrination is a social process whose

effects are not likely to be strong in all blogs. Therefore, three blogs (Blog for

America, The Evangelical Outpost and Townhall) were purposely selected as

indoctrination blogs because an informal examination of their posts and

comments indicated a high level of narrowly-scoped ideological content. Three

additional blogs (EnGadget, Gizmodo and Joystiq) were selected as need-for-

cognition blogs because they frequently introduce new products to consumers.

The overview of need-for-cognition and sensemaking theory presented in

Chapters II and III suggests that such blogs might prompt a higher level of

cognitive activity motivated by need-for-cognition as consumers collaboratively

converge on an understanding of the benefits that these new products might offer.

The remaining blogs were selected because their high level of activity should

allow most of the processes studied to be manifested.

d) Allow comments from readers. In Chapters I and II, it was emphasized that the

purpose of blogging was for consumers and producers to converse.

Unfortunately, many corporate blogs do not allow for reader comments. Since

this study focuses on the expression of diverse thought, only blogs that allow

comments are suitable.

e) Retain an archive of posts and comments. Some blogs only show the most recent

activity. Hypotheses 1, 2 and 4 require that the long term activity of commenters

be studied to detect regularities in the commenting activity of individuals.

TABLE 4.1

Data Sources

ID  Blog Name                          Description
1   AutoBlog (71)³                     A blog covering the auto industry with test drives and commentary on articles from other sites.
2   Blog for America (21,406)          The Democratic Party’s blog.
3   Blog Maverick (269)*               Entrepreneur Mark Cuban’s blog.
4   The Consumerist (24)               A consumer retaliation blog.
5   EnGadget (1)                       A product review blog for high-tech consumer goods.
6   The Evangelical Outpost (1668)     Reflections on culture, politics and religion from an evangelical worldview.
7   Fastlane (8890)*                   GM Vice Chairman Bob Lutz’s blog.
8   Freakonomics (137)*                The blog for the book Freakonomics.
9   Gizmodo (3)                        A product review and news blog for high-tech consumer goods.
10  Google Blogoscoped                 A consumer’s blog covering Google and its services.
11  Joystiq (39)                       A product review blog for the gaming industry.
12  PaulStamatiou (99)*                A college student reviewing technology products and offering technical support.
13  Townhall (73)*                     Radio talk show host Hugh Hewitt’s blog on politics, religion and conservative social commentary.
14  TV Squad (104)                     Commentary and review of TV shows.
15  The Unofficial Apple Weblog (52)   Since Apple has no corporate blog, their fans have started blogs about them.

³ The number within parentheses is the weblog’s position in the ranking, maintained by Technorati (a weblog search engine), of the most authoritative (most widely cited by other websites) of the almost 100 million weblogs tracked (as of Aug 07). An asterisk denotes situations where the whole blog archive was used because there were fewer than 2000 blog entries.

TABLE 4.2

Data Source Quantities

Blog                          Dates             Posts    Comments  Commenters
AutoBlog                      Mar 07 – Jul 07   2,036    27,826    14,657
Blog for America              Mar 03 – Dec 04   2,232    252,395   15,388
Blog Maverick                 Nov 99 – Sep 07   435      16,681    7,546
Consumerist                   Mar 07 – Aug      2,599    73,052    9,141
EnGadget                      May 07 – Jul 07   2,030    51,738    14,203
The Evangelical Outpost       Oct 03 – Sep 07   1,731    52,027    4,813
Fastlane                      Jan 05 – Aug 07   266      15,914    5,572
Freakonomics                  Mar 05 – Aug      1,142    23,445    8,375
Gizmodo                       Jul 07 – Aug 07   2,243    39,091    7,078
Google Blogoscoped            Jan 06 – Aug 07   2,460    20,227    4,015
Joystiq                       Jun 07 – Sep 07   2,058    52,130    7,218
Paul Stamatiou                Sep 05 – Sep 07   757      10,195    3,102
Townhall                      Apr 06 – Sep 07   985      15,455    2,689
TV Squad                      Jan 07 – Jul 07   2,034    24,994    6,605
The Unofficial Apple Weblog   Apr 07 – Sep 07   2,165    23,414    7,465
Total                                           25,173   698,584   117,867

Two of the blogs (Blog for America and GM Fastlane) were selected as pretest

blogs because they were expected to demonstrate activity at the extremities of the various

conceptual maps (i.e., Figures 3.2 and 3.6) used to demarcate the diverse influences

affecting blogs.

Method

Data Collection and Preprocessing

The first phase of this study was to collect a sample of approximately 2,000

author posts and associated comments from each blog using a custom software

program that automated the collection. Automated processes, used throughout this study

and noted where appropriate, are particularly relevant to this study because of the large

volume of data. In cases where the blog had fewer than 2,000 author posts, the entire blog

archive was collected. Then, each entry and comment was decomposed into centering

resonance analysis (CRA) word networks, the influence of each node of the word

network was calculated, and then the resonance between posts and comments was

calculated on the following dimensions:

a) Post-to-post. The resonance of a post with all posts that follow it. This was

designed to show the extent to which all posts follow the same theme.

b) Post-to-comment. The resonance of a post with the comments that respond to it

and the resonance between the responding comments. This was designed to

indicate the extent to which the comments thematically diverge from that of the post.

c) Prior post and its comments to subsequent posts. The resonance between past

posts and their comments with future posts. This was designed to reveal the

extent to which authors are responsive to issues important to the commenters.

d) Posts by the same author. Since some blogs have more than one author this

dimension was designed to indicate whether individual authors keep posting on

the same theme.

e) Comments by the same commenter. Capturing the diversity of thought expressed

by individual commenters was an important part of assessing the degree of

autonomy tolerated by the group.

f) Comment-to-comment. The resonance between all comments, regardless of

associated post, was designed to allow clusters to be detected that indicate major

themes or the content of the community’s common knowledge structure.

Corman and Dooley (2008), two of the authors of Corman et al. (2002), and

inventors of Centering Resonance Analysis (CRA), founded Crawdad Technologies,

LLC to market Crawdad, a software program that automates the process of turning text

into CRA word networks, and calculating influence and resonance. They permit the

download of an evaluation copy of their software that processes a body of example text

into a CRA word network and performs the relevant calculations. After reading Corman

et al. (2002) and Wasserman and Faust (1994) it was possible to write a program in

Microsoft’s C-Sharp (C#) programming language that duplicated the output of the

Crawdad software using its example text. The C# program was then used to process the

data for this study.
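
The resonance step can be illustrated with a minimal Python sketch. It assumes that word influence scores (the centrality of each word in a text's CRA network) have already been computed, and it uses the cosine-like standardized word resonance described by Corman et al. (2002). The function name and the toy influence values are hypothetical, and this is not the C# program used in the study.

```python
import math

def word_resonance(influence_a, influence_b):
    """Standardized word resonance between two texts.

    Each argument maps a word to its (non-negative) influence score in
    that text's CRA network, assumed to be precomputed. The result lies
    between 0 (no shared influential words) and 1 (identical profiles).
    """
    shared = set(influence_a) & set(influence_b)
    raw = sum(influence_a[w] * influence_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in influence_a.values()))
    norm_b = math.sqrt(sum(v * v for v in influence_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return raw / (norm_a * norm_b)

# Hypothetical influence scores for a post and one of its comments.
post = {"engine": 0.6, "hybrid": 0.3, "price": 0.1}
comment = {"hybrid": 0.5, "price": 0.4, "dealer": 0.1}
print(round(word_resonance(post, comment), 3))
```

Only words that appear in both networks contribute, so two texts resonate strongly only when they share words that are influential in both.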

MDS and Cluster Detection

As described in Chapter II, CRA’s resonance metric allows a pair-wise

measure of similarity to be calculated for every pair of documents (i.e., posts and comments).

Kamada and Kawai’s (1989) spring-like methodology was therefore adapted to self-assemble

clusters of cognitive objects in a multidimensional space. These clusters were then

demarcated using Chiu et al.’s (2001) two-step methodology, and cluster membership was

confirmed with McQueen’s (1967) k-means.

Spring model multidimensional scaling was implemented with a custom program

written in C# that utilized a modified version of Dwyer’s (2004) WilmaScope 3D graph

visualization library. WilmaScope was modified to use four dimensions instead of three

so as to allow the spring models sufficient freedom to assume their desired unstressed

lengths as explained in Chapter II in the sections entitled Multidimensional Scaling and

Spring Models in Multidimensional Scaling.
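
The idea of letting springs relax toward their unstressed lengths in four dimensions can be sketched as follows. This is a simple pairwise relaxation written for illustration, not the Kamada and Kawai algorithm or the modified WilmaScope library. The desired lengths are assumed to be given (how resonance is converted to a length is not reproduced here), and all names and parameter values are hypothetical.

```python
import math
import random

def spring_embed(target, dims=4, sweeps=2000, step=0.05, seed=0):
    """Place n objects in `dims` dimensions so that pairwise distances
    approach the desired spring lengths in `target` (an n x n matrix)."""
    rng = random.Random(seed)
    n = len(target)
    pos = [[rng.uniform(-1.0, 1.0) for _ in range(dims)] for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            for j in range(i + 1, n):
                delta = [a - b for a, b in zip(pos[i], pos[j])]
                dist = math.sqrt(sum(d * d for d in delta)) or 1e-9
                # Fraction of the current length by which the spring is
                # stretched (> 0) or compressed (< 0).
                strain = (dist - target[i][j]) / dist
                for k in range(dims):
                    shift = step * strain * delta[k]
                    pos[i][k] -= shift  # a stretched spring pulls i toward j
                    pos[j][k] += shift  # and j toward i
    return pos

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical desired lengths between three documents.
desired = [[0.0, 1.0, 2.0],
           [1.0, 0.0, 1.5],
           [2.0, 1.5, 0.0]]
coords = spring_embed(desired)
print(round(distance(coords[0], coords[2]), 2))
```

Each pass moves both endpoints of a spring a small fraction of the way toward its desired length, so a configuration that satisfies all lengths at once is a resting state of the system.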

Negative and Positive Valence

CRA focuses on the detection of thematic consistency between documents;

however, it does not capture differences in positive and negative valence in its

determination of thematic consistency. To address this shortcoming, a separate accounting of

relative frequency in the use of negative words between two compared documents was

maintained. In the previous section, a means of locating a document in a

multidimensional thematic space is described where relative differences are turned into

relative coordinates that can then be used in the detection of thematic clusters. The

accounting of negative valence is used as an additional dimension in the detection of

thematic clusters.
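
One way such an accounting could work is sketched below. The negative-word lexicon, the tokenization and the function names are all hypothetical stand-ins, since the study does not list the words it counted. The sketch only shows how a relative frequency of negative words could serve as an extra coordinate for each document.

```python
# Hypothetical lexicon; the word list used in the study is not reproduced here.
NEGATIVE_WORDS = {"bad", "worse", "broken", "hate", "fail"}

def negative_rate(text):
    """Share of a document's words that appear in the negative lexicon."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    return sum(w in NEGATIVE_WORDS for w in words) / len(words)

def valence_gap(text_a, text_b):
    """Difference in negative-word use between two compared documents."""
    return abs(negative_rate(text_a) - negative_rate(text_b))

print(valence_gap("I hate this broken phone.", "I love this phone."))
```

Two comments on the same theme then sit close together in the thematic dimensions but apart in the valence dimension when one is markedly more negative than the other.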

Multigroup Analysis

After the comment text was transformed into cognitive objects located in clusters

in multidimensional space, the task of identifying blog entry comment (cultural tribalism

and need-for-cognition) and commenter (evangelist, idea seeder or flocker) types began.

This study regards the commenters in the blogs listed in Table 4.1 as cluster samples

from the same larger population, people who comment in blogs. Cluster samples differ

from random samples in that subjects in the same cluster are more similar to each other

than those in a truly random sample. Recognizing a level of similarity seems appropriate

because blogs tend to be topical and thus attract people who, at the least, have a common

interest in the same topic. As a result, the parameters for equations 3.1 and 3.2, the

discriminant models identifying primary influences and commenter types, were

estimated separately using commenter data from each of the 15 blogs. Then, the

parameter estimations were compared to check for consistency. Additionally, it was

necessary to demonstrate both the factorial invariance of the measuring instrument (i.e.,

the 8 factors proposed in Chapter III) and the invariance of the causal structure (the

models in Figures 3.10 and 3.11) across the individual blogs. EQS 6.1 was used in

conjunction with the procedures prescribed by Byrne (2006), and summarized below, to

demonstrate both types of invariance.

In order to use equations 3.1 and 3.2 for segmentation, the parameter coefficients

must have been estimated using data already classified. Discriminant models are then

estimated using a portion (typically 70%) of the pre-classified dataset and the accuracy

of the discriminant model’s ability to classify data items is then tested using the

remaining or holdout sample. Since it is desired to determine whether the number of

clusters naturally present in the data matches the conceptual map topologies given in

Figures 3.2 (cultural tribalism and need-for-cognition) and 3.8 (flockers, idea seeders

and evangelists), Chiu et al.’s (2001) methodology was used to get an initial cluster

demarcation. Then, the assignment of objects to clusters was confirmed with McQueen’s

(1967) widely used k-means clustering algorithm. When the data was thus pre-classified,

the parameters of the discriminant models (equations 3.1 and 3.2) described in this study

were estimated.
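
The confirmation step can be sketched with a plain k-means (Lloyd's iteration) in Python. The two-step procedure of Chiu et al. (2001) is not reproduced; the toy coordinates, the seed and the function name are hypothetical.

```python
import math
import random

def k_means(points, k, iterations=100, seed=0):
    """Assign each point to the nearest of k centroids, then move each
    centroid to the mean of its members, until assignments stop changing."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical coordinates of six cognitive objects in two dimensions.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(points, k=2)
print(labels)
```

Agreement between these labels and the initial demarcation (up to a renaming of the clusters) is what would count as confirmation of cluster membership.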

Classification of Blog Entry Comments

Chapter III described how blog entry comments can be classified as either

influenced primarily by cultural tribalism or need-for-cognition using the discriminant

model equation 3.1 or by plotting the positions of comments in the conceptual map

shown in Figure 3.2, derived from equation 3.1. The two comment types were

distinguished, one from the other, by applying Chiu et al.’s (2001) methodology to all the

comments from a random sample of 25 blog entries selected from each of the two pretest

blogs. Two clusters emerged and cluster membership was confirmed using McQueen’s

(1967) k-means. When this pretest sample was plotted on the conceptual map of Figure

3.2, the result was as displayed in Figure 4.1.

FIGURE 4.1

Pretest Blog Entry Comment Mapping

While the results do not conform perfectly to the proposed perceptual map

regions, the higher density of data points within the so-called cultural tribalism and

need-for-cognition quadrants suggests support for the proposed topology for blog entry

comment segmentation. A notable aspect of these results is that a near equal number of

the comments within the cultural tribalism quadrant belong to the GM Fastlane blog, not

the blog predicted to be an indoctrination blog. A positive aspect of this observation is

that the topology-based segmentation seems to distinguish one comment from another;

however, the unexpected inability to distinguish between blogs suggests that the logic

underlying the initial prediction of blog types may have been too superficial.

The parameters of equation 3.1 were estimated from the pretest dataset and the

accuracy of the resulting model tested with a holdout sample. One discriminant function

emerged from the estimation with a characteristic root eigenvalue greater than 1.0. This

function correctly classified 88.3% of the holdout sample.
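
The holdout procedure can be sketched as follows. A nearest-centroid rule is used here as a simple stand-in for the estimated discriminant function, so the sketch shows only the 70/30 estimate-then-test logic, not equation 3.1 itself. The data, labels and function name are hypothetical.

```python
import math
import random

def holdout_accuracy(samples, labels, train_fraction=0.7, seed=0):
    """Fit a nearest-centroid classifier on a random training portion and
    return the share of holdout items it classifies correctly."""
    rng = random.Random(seed)
    order = list(range(len(samples)))
    rng.shuffle(order)
    cut = int(train_fraction * len(order))
    train, holdout = order[:cut], order[cut:]

    # "Estimation": one centroid per class, from the training portion only.
    centroids = {}
    for label in {labels[i] for i in train}:
        rows = [samples[i] for i in train if labels[i] == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    # "Validation": classify each holdout item by its nearest centroid.
    correct = sum(
        min(centroids, key=lambda lab: math.dist(samples[i], centroids[lab]))
        == labels[i]
        for i in holdout
    )
    return correct / len(holdout)

# Hypothetical pre-classified comments with two indicator values each.
samples = [(0.1 * i, 0.1 * i) for i in range(10)] + \
          [(5 + 0.1 * i, 5 - 0.1 * i) for i in range(10)]
labels = ["tribalism"] * 10 + ["need-for-cognition"] * 10
print(holdout_accuracy(samples, labels))
```

Because the holdout items play no part in estimation, the resulting share is an out-of-sample check of classification accuracy.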

Classification of Commenters

Chapter III described how commenters can be characterized as evangelists, idea

seeders and flockers by plotting their positions on a conceptual map (Figure 3.8, derived

from equation 3.2). Idea seeders can be separated from evangelists by high diversity in

thought expression: idea seeders express thoughts about a wide variety of things while

evangelists always express similar thoughts. Idea seeders and evangelists can be

separated from flockers on the basis of differences in boldness: idea seeders and

evangelists are bolder when expressing their thoughts.

When the commenting activity of 1,000 commenters, 500 randomly selected

from each of the Blog for America (BFA) and the GM Fastlane blog (GM) as a pretest sample,

was demarcated by Chiu et al.’s (2001) methodology, three clusters emerged (Figures 4.2

and 4.3). Cluster assignments were confirmed using McQueen’s (1967) k-means

methodology.

FIGURE 4.2

Pretest BFA Commenter Clusters

FIGURE 4.3

Pretest GM Fastlane Commenter Clusters

The placement of these clusters was considered conclusive and conforming to

expected patterns. It is interesting to note that idea seeding seems to be more pronounced

in the GM Fastlane blog.

In preparation for estimating the parameters of equation 3.2 the commenters

were forcibly classified into the segments of Figure 3.8. Using these topology-based

segment membership assignments, the parameters of equation 3.2 were estimated and

the accuracy of the resulting model tested with a holdout sample. Two discriminant

functions emerged from estimation with each of the two pretest datasets, correctly

classifying 86.5% (GM) and 79.3% (BFA) of the holdout sample.

Construct Validity

Bagozzi, Yi and Phillips (1991) broadly define construct validity “as the extent to

which an operationalization [(i.e., a measuring device)] measures the concept it is

supposed to measure.” (p. 421) Construct validity focuses on the extent to which data

exhibits (1) convergent validity, the extent to which different assessment methods

concur in their measurement of the same construct (also called factorial or measurement

invariance), and (2) discriminant validity, the extent to which the measures of different

constructs are distinct (Campbell and Fiske 1959). The demonstration of factorial

invariance began by establishing baseline confirmatory factor analysis (CFA) results for

each blog dataset. The results were then compared to see if the same indicators loaded

on the same factors. After the factorial structure was found to be the same across blogs,

hypotheses that each free factor loading (each factor’s “first” indicator is constrained to

unity and is thus not free) was equal across blogs were tested by forcing the factor

loadings to be equal and then observing overall fit statistics as well as the Lagrange

Multiplier (LM) Test univariate probability for each factor loading. This probability is a

measure of significance for the LM Test, a test of whether a factor loading

varies across groups (in this case, blogs). LM Test probabilities less than 0.05 are

considered statistically significant. That is, when the LM Test probability is less than

0.05, the parameter does not have the same value across the blogs. This procedure does

not require equal sample sizes, nor does the test of causal structure invariance described

below.
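
The decision rule can be stated compactly in code. The probabilities below are four of the values reported in Table 4.3; the dictionary keys and the function name are illustrative only.

```python
ALPHA = 0.05

# LM Test univariate probabilities for four constrained factor loadings
# (values taken from Table 4.3).
lm_probabilities = {
    "Mean Cluster Radius": 0.000,
    "Mean Commenter Longevity": 0.072,
    "% Repeat Commenters (T - 1)": 0.895,
    "Mean Clusters Started (Flocking)": 0.040,
}

def non_invariant(probabilities, alpha=ALPHA):
    """Loadings whose equality constraint is rejected at the given level."""
    return sorted(name for name, p in probabilities.items() if p < alpha)

print(non_invariant(lm_probabilities))
```

Loadings returned by the function are those that differ across blogs; the remaining loadings can be treated as equal.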

Invariance of causal structure was tested in a manner similar to the above. The

same structural model was simultaneously fitted to the 15 blog datasets while

constraining all factor loadings, structural path loadings and error covariances equal

across the blogs. Then multigroup model fit statistics were observed as well as the LM

Test univariate probability for each constrained parameter.

Underlying the use of structural equation modeling to perform CFA is a strong

assumption of multivariate normality. If evidence shows that this assumption is violated,

it has important implications for the interpretability of the findings. None of the

indicators in the pretest data set demonstrated significant nonzero univariate skewness

(one tail of the distribution is longer than the other) or kurtosis (variance is due to

infrequent extreme deviations from the mean). Normalized estimates of Mardia’s (1970)

multivariate coefficient, an aggregate indicator of kurtosis, were found to be 2.15 (BFA)

and 3.98 (GM). Bentler (2005) suggests that only values greater than 5.0 indicate non-

normally distributed data. Another test of multivariate normality is discussed later.
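
For reference, Mardia's multivariate kurtosis and its normalized estimate are commonly written as below. The symbols are introduced here for illustration: n is the number of cases, p the number of indicators, x_i the i-th observation vector, x̄ the sample mean vector and S the sample covariance matrix. It is an assumption that the normalized estimates quoted above were computed in this large-sample form.

```latex
b_{2,p} = \frac{1}{n} \sum_{i=1}^{n}
          \left[ (x_i - \bar{x})^{\top} S^{-1} (x_i - \bar{x}) \right]^{2},
\qquad
z = \frac{b_{2,p} - p(p+2)}{\sqrt{8\,p\,(p+2)/n}}
```

Under multivariate normality the expected value of b_{2,p} is p(p+2), so z is approximately standard normal in large samples and large positive values signal excess kurtosis.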

The individual pretest blog baseline CFA results showed the same indicators

loading on the same factors as given in Table 4.3. Additionally, as Table 4.3 shows, all

but eight of the 32 core factor construct indicators met Hair et al’s (2005) criterion for

statistical significance (factor loading > 0.4 with a sample size > 200) and 14 met the

criterion for practical significance (loading > 0.5). It must be noted that Hair et al.’s

criteria, as well as most of the reference values quoted in this study, apply to evaluating

instruments used to gather experimental data. The circumstances of this study are much

different as Hair et al.’s criteria are being used to evaluate a means of gaining insights

from non-experimental, albeit primary, empirical data. Therefore, it is argued that Hair

et al’s criteria must be referenced with caution as the level of control over extraneous

variation that characterizes experiments is absent here. That being understood, the

general conformance of this study’s factor loadings and other metrics to those

recommended by Hair et al. (2005) and Byrne (2006) is worthy of note and indicative of

a reasonable level of performance. The multigroup CFA with factor loadings constrained

equal across blogs had good fit (CFI = 0.954, RMSEA = 0.019, χ²(896) = 1,225)

indicating that both blogs have similar, if not the same, factor structures.

TABLE 4.3

Pretest Confirmatory Factor Analysis, Reliability, Variance Explained and Factorial Invariance

Main Factor and Reliability
  Indicator                                              Loading   R²      LM Test Probability

Cognitive Diversity (r = 0.765)
  Mean Cluster Density                                   -0.562    0.317   0.000
  Mean Cluster Radius                                     0.668    0.448   0.000
  Number of Clusters                                      0.612    0.377   0.000
  Percentage of Outliers                                  0.537    0.290   N/A

Cultural Tribalism (r = 0.683)
  Mean Collective Thought Separation                     -0.496    0.246   0.000
  Stdev Collective Thought Separation                    -0.571    0.326   0.000
  Mean Indiv. Thought Separation                         -0.493    0.243   0.000
  Stdev Indiv. Thought Separation                        -0.450    0.211   N/A
  Mean Commenter Longevity                                0.349    0.122   0.072
  % Repeat Commenters (T - 1)                             0.493    0.244   0.895

Flocking (r = 0.625)
  Mean Clusters Started                                  -0.562    0.316   0.040
  Stdev Indiv. Thought Separation From Cluster Centroid  -0.639    0.408   N/A
  Mean Number of Comments                                -0.618    0.381   0.000

Idea Seeding (r = 0.519)
  Mean Clusters Started                                   0.411    0.169   0.092
  Stdev Indiv. Thought Separation From Cluster Centroid   0.419    0.177   0.137
  Mean Number of Comments                                 0.470    0.221   0.039
  Mean Indiv. Thought Separation                          0.365    0.134   0.746
  Stdev Indiv. Thought Separation                         0.413    0.172   N/A

Indoctrination / Free Thought (r = 0.485)
  Mean Time Between Entries                              -0.623    0.406   N/A
  Mean Entry-to-Entry Separation                         -0.580    0.337   0.008
  Stdev Entry-to-Entry Separation                        -0.474    0.225   0.002

Mass Dissemination (r = 0.663)
  Stdev Indiv. Thought Separation From Cluster Centroid   0.501    0.256   0.006
  Mean Indiv. Thought Separation                         -0.515    0.278   0.000
  Stdev Indiv. Thought Separation                        -0.562    0.316   N/A

Need for Cognition (r = 0.640)
  Mean Collective Thought Separation                      0.478    0.230   0.001
  Stdev Collective Thought Separation                     0.601    0.361   0.195
  Mean Indiv. Thought Separation                          0.551    0.304   0.001
  Stdev Indiv. Thought Separation                         0.490    0.251   N/A
  % First Time Commenters                                 0.391    0.153   0.056

Reciprocity (r = 1.0)
  Total Number of Commenters

Note: p < .05. Reliabilities are calculated from the indicators to each individual factor in Table 4.3. Pearson “r” is reported instead of Cronbach’s α. N/A in the right hand column denotes the arbitrary first factor.

Table 4.3 also shows that 13 of the 32 factor loadings exhibited a less than 5%

LM Test univariate probability, indicating that their factor loadings are not the same

across the two blogs even though the structure appears to be the same.

TABLE 4.4

Pretest Correlation and Φ-matrix of Primary Constructs

              1        2        3        4        5        6        7
1 CogDiv      -
2 CultTrib    0.033    -
3 Flocking    0.004    0.001    -
4 IdeaSeed    0.265    0.084    -0.014   -
5 Indoc       0.079    0.475    0.078    0.166    -
6 MassDiss    0.120    0.020    0.206    0.528    0.574    -
7 NFC         0.117    0.183    0.109    0.162    0.942    0.586    -
8 Recip       0.205    0.108    -0.062   0.117    0.981    0.209    0.439

*Note: Correlations are corrected for attenuation. All values are significant to p < .05.

CogDiv = Cognitive Diversity, CultTrib = Cultural Tribalism, IdeaSeed = Idea Seeding,

Indoc = Indoctrination, MassDiss = Mass Dissemination, NFC = Need-for-Cognition,

Recip = Reciprocity.

Discriminant validity was assessed in two stages: (1) correlations between

constructs, corrected for attenuation, were confirmed significantly less than unity (Table

4.4 shows the correlations for the GM Fastlane blog), and (2) a χ² difference test was

performed between the baseline multigroup CFA model (CFI = 0.998, χ²(872) = 888)

and an alternative multigroup model (CFI = 0.106, χ²(928) = 7,338) where the

correlation between each pair of latent constructs was constrained at unity. According to Cheung

and Rensvold (2002), a large CFI or χ² difference demonstrates discriminant validity.

While there is a widely accepted understanding of what constitutes a significant χ²

difference for given degrees of freedom, Cheung and Rensvold’s recommendation that

a ∆CFI > 0.01 be considered significant is only beginning to gain acceptance (Byrne

2006, p. 341). In this study, an attempt is made to disclose both metrics where possible.

It is readily apparent that a ∆CFI of 0.892 and a ∆χ² of 6,450 with degrees of freedom

equal to 56 argue strongly in favor of discriminant validity.
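
The arithmetic behind this comparison can be checked with a short Python sketch. The fit values are those quoted above. The critical value is computed with the Wilson-Hilferty approximation, which is introduced here only to avoid a statistics library and is not a procedure used in the study.

```python
import math

def chi_square_critical_05(df):
    """Approximate upper 5% critical value of the chi-square distribution
    (Wilson-Hilferty approximation)."""
    z = 1.6449  # 95th percentile of the standard normal distribution
    h = 2.0 / (9.0 * df)
    return df * (1.0 - h + z * math.sqrt(h)) ** 3

baseline = {"cfi": 0.998, "chi2": 888, "df": 872}
constrained = {"cfi": 0.106, "chi2": 7338, "df": 928}

delta_cfi = baseline["cfi"] - constrained["cfi"]
delta_chi2 = constrained["chi2"] - baseline["chi2"]
delta_df = constrained["df"] - baseline["df"]

print(round(delta_cfi, 3), delta_chi2, delta_df)
print(delta_chi2 > chi_square_critical_05(delta_df))
```

With 56 degrees of freedom the 5% critical value is roughly 74.5, so the observed difference exceeds it by a very wide margin.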

As has already been intimated, prior to conducting a confirmatory factor analysis,

and the model estimation, it is important to test for multivariate normality in the factor

indicators. This is particularly important in this study as many measures of collective

phenomena have an underlying power-law distribution, as discussed in Chapter II, rather

than a normal distribution. Hair et al. (2005) point out the difficulty in testing

multivariate normality (the joint normality of more than one variable) and suggest that

testing univariate normality is an acceptable substitute, especially if sample sizes are

large. Hair et al. recommend graphical analysis using the normal probability plot as well

as summary statistics such as skewness and kurtosis. Figures 4.4 and 4.5 show normal

probability plots for estimated values of the BFA and GM Fastlane blogs’ cognitive

diversity latent variable. These plots show no normality problems.

FIGURE 4.4

Pretest Normal Probability Plot of BFA Cognitive Diversity

FIGURE 4.5

Pretest Normal Probability Plot of GM Fastlane Cognitive Diversity

Hypothesis Testing

Blog entry comments were characterized as influenced primarily by cultural

tribalism or need-for-cognition and commenters in each blog were segmented into

evangelists, idea seeders and flockers. Then, the models depicted in Figures 3.10 and

3.11 were estimated and hypotheses tested. Although these models are fully described in

Chapter III, the segmentation of blog entry comments and commenters might be a source

of confusion. So, before the estimation of the models is discussed, the implementation

details involving the segmented blog entry comments and commenters will be described.

Since the unit of analysis in the models of Figures 3.10 and 3.11 is the blog

entry, values must be calculated for the 32 indicators used to reflect the eight latent

variable constructs for each blog entry. As explained in Chapter III, the need for

validated self-expression (as manifested in cultural tribalism) and need-for-cognition are

competing motivations within blogs. Each blog entry is approached in retrospect,

complete with all its comments. As each blog entry is examined, its comments are

segmented based on equation 3.1 into those influenced primarily by cultural tribalism

and those primarily influenced by need-for-cognition. Then summary measures of each

segment, mean and standard deviations of individual and collective thought separation,

are calculated to measure the effect of each segment, and its associated influence, on the

whole blog entry. These summary measures are the indicators for the cultural tribalism

and need-for-cognition latent variables in Figures 3.10 and 3.11.

The flocking, mass dissemination and idea seeding factors reflect relationships

between summary measures of cognitive objects produced by individual commenters.

For example, when calculating the values for the three indicators used to reflect flocking,

all the data for the commenters previously identified as flockers who commented on the

current blog entry are retrieved. From this data the mean number of clusters started and

the mean number of comments posted, across all blog entries, are calculated.

Additionally, for each flocker who commented on the current blog entry, the standard

deviation of the cognitive distance between all the comments they post in the lifetime of

their participation and their associated cluster centroids is calculated. The joint reflective

impact of the three indicators on the flocking latent variable becomes the impact of that

construct on the cognitive diversity of the current blog entry.
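
The flocking example can be sketched in Python as follows. The record layout and field names are hypothetical, and averaging the per-commenter standard deviations into a single entry-level value is an assumption made here for illustration, since the aggregation step is not spelled out above.

```python
import statistics

def flocking_indicators(flockers):
    """Entry-level flocking indicators from lifetime records of the flockers
    who commented on the current entry.

    Each record is a dict with hypothetical fields:
      clusters_started   - thematic clusters the commenter started
      comments_posted    - comments posted across all blog entries
      centroid_distances - cognitive distance of each comment to the
                           centroid of its closest thematic cluster
    """
    mean_clusters = statistics.mean([f["clusters_started"] for f in flockers])
    mean_comments = statistics.mean([f["comments_posted"] for f in flockers])
    # Assumption: one standard deviation per flocker, averaged per entry.
    per_flocker_sd = [statistics.stdev(f["centroid_distances"]) for f in flockers]
    mean_sd = statistics.mean(per_flocker_sd)
    return mean_clusters, mean_comments, mean_sd

# Hypothetical records for two flockers on one blog entry.
flockers = [
    {"clusters_started": 1, "comments_posted": 4, "centroid_distances": [0.2, 0.4]},
    {"clusters_started": 3, "comments_posted": 6, "centroid_distances": [0.1, 0.1, 0.1]},
]
print(flocking_indicators(flockers))
```

The same pattern would apply to the idea seeding and mass dissemination factors, with the records drawn from commenters in those segments instead.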

The calculation of the indicators reflective of the other four latent variables is

fully described in Chapter III and should be readily understood. As a result, the

discussion will now turn to estimating the model.

TABLE 4.5

Pretest Multigroup Fitted Naïve Model Parameter Values

              Parameter Value                          Hypothesis Support         LM Test
Hypothesis    Standardized    Unstandardized           Statistical   Practical    Probability

H1            -0.0625         -0.0543 (t = -1.64)      Yes           Yes          0.000
H2             0.6469          0.5435 (t = 9.93)       Yes           Yes          0.000
H3            -0.0300         -0.0372 (t = -0.68)*     No            No           0.026
H4            -0.1042         -0.0843 (t = -2.54)      Yes           Yes          0.000
H5            -0.0017         -0.0017 (t = -0.04)*     No            No           0.000
H6            -0.0748         -0.0721 (t = -1.61)      Yes           Yes          0.000
H7             0.0015          0.0007 (t = 0.02)*      No            No           0.000

*Not significant

Prior to fitting the 32 indicator measurements to the models in Figures 3.10 and

3.11, with a pretest sample of 150 blog entries randomly selected from the GM Fastlane

blog and the Blog for America, the models were translated into simultaneous equations.

Then, multigroup structural equation modeling and maximum likelihood estimation

(using EQS 6.1), with factor loadings and relationships between factors constrained

equal across blogs, was used to solve the naïve and alternative models and obtain the

results reported in Tables 4.5 (Naïve: CFI = 0.568, RMSEA = 0.121, χ2 (78) = 647) and

4.6 (Alternative: CFI = 0.883, RMSEA = 0.063, χ2 (78) = 232), and Figures 4.6 and 4.7.

TABLE 4.6

Pretest Multigroup Fitted Alternative Model Parameter Values

                Parameter Value                          Relationship Support       Path
Relationship    Standardized    Unstandardized           Statistical   Practical    Probability

R1               0.3757          0.4354 (t = 6.07)       Yes           Yes          0.000
R2              -0.2108         -0.2427 (t = -4.34)      Yes           Yes          0.034
R3              -0.0993         -0.1159 (t = -1.93)      Yes           Yes          0.000
R4               0.1398          0.1720 (t = 3.00)       Yes           Yes          0.000
R5               0.3021          0.3258 (t = 6.04)       Yes           Yes          0.003
R6              -0.4416         -0.4498 (t = -8.13)      Yes           Yes          0.000
R7               0.2761          0.3993 (t = 4.61)       Yes           Yes          0.010
R8              -0.0862         -0.0770 (t = -2.24)      Yes           Yes          0.000
R9               0.6483          0.5468 (t = 10.28)      Yes           Yes          0.000

As shown in Table 4.5, only H1, H2, H4 and H6 are supported as indicated by

both statistical significance and the size of the unstandardized parameter (i.e., effect

size). However, statistical significance is not an informative criterion in the main part of

this study, because the large sample size is certain to raise statistical power enough to

make all the estimated values statistically significant. A more meaningful criterion is

whether the sizes of the estimated parameter values (i.e., effect size) show they have

practical significance (Kirk 1996). Kirk eschews the selection of a static and universal

threshold value of practical significance to be applied to every scenario. This study

arbitrarily defines practical significance as an unstandardized parameter estimate whose

absolute value is greater than 0.05. In other words, an influence must have at least a 5%

effect on a target to be considered practically significant. Tables 4.5 and 4.6 also show

that none of the structural paths between factors passes the test for weight invariance

across all blogs (passing would require an LM Test Probability > 0.05). It is readily

apparent that the alternative model fits markedly better than the naïve model

(∆CFI = 0.315, ∆χ2 (0) = 415).
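
As an illustration only, the practical significance rule and the fit comparison can be written out as below. The unstandardized estimates and fit statistics are transcribed from Table 4.5 and the text; the variable and function names are hypothetical. Applying the 0.05 threshold to the absolute value of an estimate is an assumption, inferred from Table 4.5 marking negative estimates such as H1 as practically significant.

```python
# Unstandardized parameter estimates transcribed from Table 4.5.
UNSTANDARDIZED = {
    "H1": -0.0543,
    "H2": 0.5435,
    "H3": -0.0372,
    "H4": -0.0843,
    "H5": -0.0017,
    "H6": -0.0721,
    "H7": 0.0007,
}
PRACTICAL_THRESHOLD = 0.05  # the study's arbitrary cutoff

def practically_significant(estimate, threshold=PRACTICAL_THRESHOLD):
    """Assumed rule: the magnitude of the estimate must exceed the threshold."""
    return abs(estimate) > threshold

supported = [h for h, b in UNSTANDARDIZED.items() if practically_significant(b)]

# Fit statistics reported for the naive and alternative pretest models.
naive = {"cfi": 0.568, "chi2": 647, "df": 78}
alternative = {"cfi": 0.883, "chi2": 232, "df": 78}

delta_cfi = round(alternative["cfi"] - naive["cfi"], 3)
delta_chi2 = naive["chi2"] - alternative["chi2"]
delta_df = naive["df"] - alternative["df"]
```

Under these assumptions the rule selects H1, H2, H4 and H6, and the differences reproduce the reported ∆CFI of 0.315 and ∆χ2 of 415 on zero degrees of freedom.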

FIGURE 4.6

Naïve Factor Model Fitted with Pretest Data

Formal tests of mediation were performed on the model in Figure 3.11 whenever

one construct influenced another through an intermediary construct (e.g., mass

dissemination affects cognitive diversity through cultural tribalism in Figure 3.11). This

was done using Hardy and Bryman’s (2004) procedure of bypassing each intermediary,

one at a time, with a direct path and performing a χ2-difference test. In accord with

Cheung and Rensvold’s (2002) recommendation, ∆CFI was also calculated. When

bypassing a mediator construct fails to significantly improve model fit, the validity of

the base model is supported. Baron and Kenny (1986) specified that mediation be tested

by looking at the estimated parameter value of the path bypassing each mediator; values

close to zero show the mediator is needed. The results of mediation tests are summarized

in Table 4.7; all mediation tests support the proposed alternative model.
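
The χ2-difference test used in this procedure can be sketched as follows. This is a minimal illustration, not the EQS computation: the function names are hypothetical, and the closed-form upper-tail probability used here holds only for an even number of degrees of freedom, which covers the two-degree-of-freedom differences reported in Table 4.7.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^k / k! for k = 0 .. df/2 - 1.
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_difference_test(chi2_base, df_base, chi2_bypass, df_bypass):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    d_chi2 = chi2_base - chi2_bypass
    d_df = df_base - df_bypass
    return d_chi2, d_df, chi2_sf_even_df(d_chi2, d_df)

# Baseline model versus the model with an idea seeding to cognitive
# diversity bypass path (chi-square values as reported in Table 4.7).
d_chi2, d_df, p_value = chi2_difference_test(232, 78, 220, 76)
```

With two degrees of freedom the probability reduces to exp(-∆χ2 / 2), giving approximately 0.368 for a difference of 2, 0.011 for a difference of 9 and 0.0025 for a difference of 12, in line with the bounds shown in Table 4.7.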

TABLE 4.7

Pretest Alternative Model Tests of Mediation

                                         CFI and χ2 Difference Test                               Test of Significance
Relationship                             CFI      χ2               ∆CFI     ∆χ2                    of Bypass Parameters

Baseline                                 0.883    χ2 (78) = 232
Indoctrination to Flocking               0.882    χ2 (76) = 232    0.001    ∆χ2 (2) = 0             0.018 (t = 0.432)
Dissemination to Flocking                0.883    χ2 (76) = 230    0.000    ∆χ2 (2) = 2, p < .5     0.014 (t = 0.243)
Idea Seeding to Flocking                 0.883    χ2 (76) = 230    0.000    ∆χ2 (2) = 2, p < .5     0.104 (t = 1.156)
Indoctrination to Cognitive Diversity    0.890    χ2 (76) = 221    0.007    ∆χ2 (2) = 9, p < .02   -0.034 (t = -0.972)
Dissemination to Cognitive Diversity     0.883    χ2 (76) = 230    0.000    ∆χ2 (2) = 2, p < .5    -0.119 (t = -2.333)
Idea Seeding to Cognitive Diversity      0.891    χ2 (76) = 220    0.008    ∆χ2 (2) = 12, p < .01   0.034 (t = 0.437)

FIGURE 4.7

Alternative Factor Model Fitted with Pretest Data

Even though the fit of the naïve model to the pretest data is fairly good, when

assessed in the context of the small sample size, three of the hypothesized relationships

are both statistically insignificant and too small in effect size for the associated

hypotheses to be considered supported at this stage. There are encouraging indications

though, in the effect sizes and valence polarities of the relationships representing H1, H2,

H4 and H6, where there is good support of the associated hypotheses.

The alternative model is both well fitted to the pretest data and has relationships

between factors that are both statistically and practically significant. Note that the

alternative model suggests explanations for the hypotheses not supported in the naïve

model. The hypothesized association between idea seeding and cognitive

diversity (H3) seems to be unsupported because the grassroots influence of need-for-

cognition is required. Similar speculation seems justified for the hypothesized

relationship between mass dissemination and cognitive diversity (H5): the grassroots

influences of cultural tribalism and need-for-cognition seem to be required. Such

speculation also seems justified by the weakness of the hypothesized relationship

between indoctrination and cognitive diversity (H6) even though the hypothesis was

statistically and practically supported. The unsupported hypothesized effect of

reciprocity on cognitive diversity (H7) may be due to an insufficient conceptualization of

reciprocity. It seems the general principle behind the concept is sound, as lagged value

was associated with increased numbers of participating commenters in the alternative

model.

In this section of the chapter, two alternative models have been compared with

varied results. It seems reasonable to ponder whether there is another model that would

be superior compared to the models considered. In the next two sections, two novel

methodologies are discussed that purport to find causal models from correlation data.

These methodologies are used to search for better models that are then compared with

the ones already described.

Directed Acyclic Graphs

Directed Acyclic Graphs (DAGs) are similar to structural equation models

(SEMs) in that they are a pictorial diagram of the relationships between variables. The

important difference is that where SEMs are representations of simultaneous linear

equations, DAGs represent the findings of an artificial intelligence algorithm that uses

correlations and metadata about temporal relationships between variables to propose an

underlying causal structure. Swanson and Granger (1997) suggested that DAGs could be

used as a data-driven means of inferring causal structure when little theory is available.

Many alternative algorithms are used to construct DAGs; however, the best known

are Spirtes, Glymour and Scheines’ (2000) PC search and Chickering’s (2002) Greedy

Equivalence Search (GES). Regular users of DAG algorithms (e.g., Zhang, Bessler and

Leatham 2006) tend to place greater confidence in relationships found by multiple

algorithms over those found by just one.

FIGURE 4.8

PC Search DAG

PC Search

The PC search algorithm begins by connecting every variable with an undirected

edge (i.e., a line connecting them in the diagram). An edge is removed where the

correlation between its two variables is zero, or becomes zero once other variables are

conditioned on (i.e., the partial correlation is zero). Edges are made

directed (i.e., given an arrow head) in a subsequent step too complex to be described

here. When PC search was applied to the raw correlation matrix underlying that in Table

4.4, the DAG depicted in Figure 4.8 resulted.
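
The edge-removal step of PC search can be illustrated with a simplified sketch. This is not the implementation used to produce Figure 4.8. It is restricted to conditioning sets of size zero and one, it tests for zero correlation with a Fisher z-test at an assumed critical value of 1.96, and it omits the edge orientation step entirely. The three-variable correlation matrix and the sample size of 150 are hypothetical.

```python
import math
from itertools import combinations

def partial_corr(r, x, y, z):
    """First-order partial correlation of x and y given z."""
    num = r[x][y] - r[x][z] * r[y][z]
    den = math.sqrt((1 - r[x][z] ** 2) * (1 - r[y][z] ** 2))
    return num / den

def is_zero(rho, n, cond_size, z_crit=1.96):
    """Fisher z-test: True when rho cannot be distinguished from zero."""
    rho = max(min(rho, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + rho) / (1 - rho))
    return abs(z) * math.sqrt(n - cond_size - 3) < z_crit

def pc_skeleton(r, n):
    """Undirected skeleton using conditioning sets of size 0 and 1 only."""
    p = len(r)
    # Start fully connected, as PC search does.
    edges = {frozenset(pair) for pair in combinations(range(p), 2)}
    # Order 0: drop edges whose plain correlation is zero.
    for x, y in combinations(range(p), 2):
        if is_zero(r[x][y], n, 0):
            edges.discard(frozenset((x, y)))
    # Order 1: drop edges whose correlation vanishes given one neighbour.
    for x, y in combinations(range(p), 2):
        if frozenset((x, y)) not in edges:
            continue
        for z in range(p):
            if z in (x, y):
                continue
            adjacent = frozenset((x, z)) in edges or frozenset((y, z)) in edges
            if adjacent and is_zero(partial_corr(r, x, y, z), n, 1):
                edges.discard(frozenset((x, y)))
                break
    return edges

# Hypothetical chain 0 -> 1 -> 2: variables 0 and 2 correlate only through 1.
corr = [
    [1.00, 0.60, 0.36],
    [0.60, 1.00, 0.60],
    [0.36, 0.60, 1.00],
]
skeleton = pc_skeleton(corr, n=150)
```

In this example the edge between variables 0 and 2 is removed because their correlation disappears once variable 1 is conditioned on, leaving only the two edges of the chain.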

FIGURE 4.9

Greedy Equivalence Search DAG

Greedy Equivalence Search

The Greedy Equivalence Search (GES) algorithm is a good complement to PC

search because it begins by assuming that no edges connect the variables. It uses a two-

step iterative procedure and a Bayesian analysis of probabilities to converge on a highest

likelihood set of edges between the variables. When GES was used on the raw version of

Table 4.4’s correlation matrix, the DAG in Figure 4.9 emerged.

FIGURE 4.10

PC Search and GES Commonality DAG

Figure 4.10 shows the edges found by both the PC Search and GES algorithms.

The edges that match those on the naïve and alternative models are labeled. The findings

of the data-driven algorithms lend support for some of the relationships proposed in the

theory-based alternative model. The inclusion of one of the unsupported hypotheses

from the naïve model (H7) in a mediator role between indoctrination and cognitive

diversity is interesting and worthy of further thought.

Limitations

Barnett and Woelfel (1979) point out that it can be difficult to select the optimal

number of dimensions in the multidimensional space where cognitive objects are plotted.

I propose to use four dimensions in this study because as dimensionality increases, the

computational tasks for self-assembling thematic clusters rise dramatically along with

the computer memory required.

The results of this study, while relevant to the broad spectrum of marketing

phenomena, may not be completely replicable in a blog study closer to the core business

focus of mainstream marketing research. As stated in Chapter III, the social forms in this

study are often highly dependent on initial conditions, are path dependent, nonlinear and

only loosely coupled to each other. They often also involve random delays and different

timescales. The more specific the context of a future blog study, the more likely that

context is to be affected by fewer social processes.

The purposive sample of blogs may cause the effects of processes that may be

rare, such as indoctrination, to appear to have greater importance, relative to the other

processes, than is true of the overall blogosphere.

Summary

In this chapter, a means of testing the models proposed in Chapter III was

described and its effectiveness demonstrated with pretest data. In the next chapter all 15

blogs in the dataset are subjected to the testing procedure described here to create a basis

for final determinations about the proposed hypotheses and models.

CHAPTER V

RESULTS

Chapter IV described this study’s data set and a multifaceted series of procedures

designed to test the hypotheses and models introduced in Chapter III. The data set is a

diverse selection of 15 blogs, selected using guidelines specified by Kozinets (2002) to

ensure a data set with high information content. The blogs were also selected so as to

cover the full breadth of marketing activities: for-profit and non-profit, organizational

and individual. Blog entries and associated comments were converted into word

networks by Centering Resonance Analysis (CRA). The similarity (or resonance)

between these word networks was measured and used as an input to multidimensional

scaling to locate blog entries and comments as points in a cognitive space. Standard

clustering algorithms were used to find thematic clusters among the points in cognitive

space. The clusters were measured on such dimensions as size and density to derive a

measure of cognitive diversity. Cognitive diversity metrics were then used to segment

blog entry comments into those most influenced by cultural tribalism and need-for-

cognition. Such metrics were also used to segment commenters into idea seeders,

evangelists and flockers. Multi-group structural equation modeling was then used to test

a model and hypotheses that related the joint influences of blog author and commenters

on the diversity of thought expressed in blogs.

This chapter begins by describing the results of conducting the tests and

procedures described in Chapter IV on all 15 of the blogs in the full dataset. On the basis

of those results, final determinations are made concerning support for the proposed

hypotheses and models.

Classification of Blog Entry Comments

Chapter III described how blog entry comments can be classified as influenced

primarily by either cultural tribalism or need-for-cognition, using the discriminant

model of equation 3.1 or by plotting the positions of comments in the conceptual map

shown in Figure 3.2, derived from equation 3.1. The two comment types were

distinguished by applying Chiu et al’s (2001) clustering algorithm to all the blog entries

and comments from each of the 15 blogs. Two clusters emerged, and cluster membership

was confirmed using MacQueen’s (1967) k-means. When a random sample of 500

comments was drawn from each blog in the dataset, pooled and plotted on the

conceptual map of Figure 3.2, the result was as displayed in Figure 5.1 (B). While the

results do not conform perfectly to the proposed perceptual map regions, the higher

density of data points within the so-called cultural tribalism and need-for-cognition

quadrants suggests support for the proposed topology for blog entry comment

segmentation. It is easy to see that the clustering algorithms merely cut the mass of

comments in the middle. To get a more nuanced insight, it is necessary to look at the

comment frequency histograms below and to the left of the scatter plot (A and C). Even

though the two clusters contain approximately equal numbers of points (50.3% cultural

tribalism), the cultural tribalism cluster is denser, as shown by its smaller footprint in the

scatter plot and taller height in both the individual and collective thought diversity

histograms. This observation was confirmed by a statistically significant difference (F =

28.53) between the mean distance between cluster members and their cluster centers.

FIGURE 5.1

Blog Entry Comment Mapping

This observation is consistent with cultural tribalism theory; not only are both types of

thought diversity low, but there is less variation even at those low levels. It is also

interesting to note that high and low levels of individual and collective thought diversity

seem to occur together; the two are moderately correlated at 0.450 (p < 0.001).
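
The density comparison reported above rests on the distance of each cluster member from its cluster center. A minimal sketch of such a comparison, using a one-way F statistic on those distances, is given below. The coordinates are hypothetical, the function names are illustrative, and this is not necessarily the exact procedure that produced F = 28.53.

```python
import math
from statistics import mean

def distances_to_centroid(points):
    """Euclidean distance of each point from the centroid of its cluster."""
    centroid = tuple(mean(axis) for axis in zip(*points))
    return [math.dist(p, centroid) for p in points]

def one_way_f(groups):
    """One-way ANOVA F statistic for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical (individual, collective) thought diversity coordinates.
tribalism_cluster = [(0.10, 0.10), (0.20, 0.15), (0.15, 0.30), (0.25, 0.20)]
cognition_cluster = [(0.50, 0.40), (0.90, 0.70), (0.60, 1.10), (1.20, 0.80)]

f_statistic = one_way_f([
    distances_to_centroid(tribalism_cluster),
    distances_to_centroid(cognition_cluster),
])
```

A large F statistic indicates that the mean member-to-center distance differs between the two clusters, which is how a denser cluster would show up in such a comparison.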

FIGURE 5.2

Blog Cultural Tribalism Histogram

The histogram in Figure 5.2 addresses the question of whether the blogs differ in

the number of comments classified as predominantly influenced by cultural tribalism.

The white band indicates a 3σ confidence interval around the mean. There is no

statistically significant difference in the level of cultural tribalism between the blogs.

To gain another perspective on the question of whether blogs differ in the

relative effects of cultural tribalism versus need-for-cognition, the underlying

distributions for individual and collective thought diversity were compared in the

paneled histograms of Figure 5.3. In Figure 5.3, the blogs are identified with a number

corresponding to those in the ID column of Table 4.1. It is apparent from Figure 5.3 A

and B that the blogs differ in these underlying distributions, with some seeming to have

skewed (e.g., 10, 11 in 5.3 A), normal (e.g., 1, 6) and bimodal (e.g., 5, 13, 15)

distributions. The skewed distributions all favor the low end of the scale, perhaps

indicating cases where the whole blog is characterized by cultural tribalism. Bimodal

distributions may indicate blogs in transition, where idea seeding is in conflict with

either indoctrination or mass dissemination for ideological dominance.

FIGURE 5.3

Paneled Histograms of Thought Diversity

To address these speculations more definitively, paneled histograms of estimated

values for the need-for-cognition and cultural tribalism latent variables were constructed

(Figure 5.4). All the blogs seem to have similarly shaped distributions even though they

are located on different parts of their underlying scale; they seem to have normally

distributed levels of cultural tribalism and positively skewed levels of need-for-

cognition. Since these two constructs share in common the individual and collective

thought diversity indicators, and are conceptualized as polar opposites, it may be that the

normal distribution of cultural tribalism is a magnification of the elongated negative tail

of the need-for-cognition distribution. Should this be true, then all the distributions of

Figure 5.3 are bimodal, some merely more apparent than others.

FIGURE 5.4

Paneled Histograms of Cultural Tribalism and Need for Cognition

Regardless of whether the distributions of Figure 5.3 are bimodal or not, the evidence

suggests that need-for-cognition is a universally dominant influence in blogs, something

that bodes well for the possibility of high general levels of cognitive diversity.

The parameters of equation 3.1 were estimated from the dataset and the accuracy

of the resulting model tested with a holdout sample. One discriminant function emerged

from the estimation with an eigenvalue (characteristic root) greater than 1. This function

correctly classified 90.2% of the holdout sample. The coefficients of the discriminant

model are given in Table 5.1.

TABLE 5.1

Comment Discriminant Function Coefficients (Equation 3.1)

Parameter                  Standardized    Unstandardized

β4 (Stdev Individual)      0.457            0.541
β3 (Mean Individual)       0.483            0.565
β2 (Stdev Collective)      0.486            0.553
β1 (Mean Collective)       0.467            0.547
β0 (Constant)                              -0.042

The only interesting theoretical insight from Table 5.1 is that all the parameters play a

near equal role in separating comments primarily motivated by need-for-cognition from

those motivated by cultural tribalism.
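
To show how the coefficients of Table 5.1 might be applied, the sketch below scores a single comment. Equation 3.1 is not reproduced in this chapter, so several things here are assumptions: that the function is a linear combination of the four measures plus the constant, that a score above a cutoff of zero indicates need-for-cognition, and that the inputs are on the same scale as the estimation data. The function and key names are hypothetical.

```python
# Unstandardized coefficients transcribed from Table 5.1.
COEFFICIENTS = {
    "mean_collective": 0.547,   # beta 1
    "stdev_collective": 0.553,  # beta 2
    "mean_individual": 0.565,   # beta 3
    "stdev_individual": 0.541,  # beta 4
}
CONSTANT = -0.042               # beta 0

def discriminant_score(comment):
    """Assumed linear form: constant plus the weighted sum of the measures."""
    return CONSTANT + sum(COEFFICIENTS[k] * comment[k] for k in COEFFICIENTS)

def classify(comment, cutoff=0.0):
    """Assumed decision rule: higher thought diversity means need-for-cognition."""
    if discriminant_score(comment) > cutoff:
        return "need-for-cognition"
    return "cultural tribalism"

# Hypothetical comment with uniformly high thought diversity measures.
high_diversity = {
    "mean_collective": 1.0,
    "stdev_collective": 1.0,
    "mean_individual": 1.0,
    "stdev_individual": 1.0,
}
label = classify(high_diversity)
```

Because the four coefficients are nearly equal, each measure moves the score by roughly the same amount, which is the near-equal role noted above.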

Classification of Commenters

Chapter III described how commenters can be characterized as evangelists, idea

seeders and flockers by plotting their positions on a conceptual map (Figure 3.8, derived

from equation 3.2). Idea seeders can be separated from evangelists by high diversity in

thought expression: idea seeders express thoughts about a wide variety of things while

evangelists always express similar thoughts. Idea seeders and evangelists can be

separated from flockers on the basis of differences in boldness: idea seeders and

evangelists are bolder when expressing their thoughts.

When the commenting activity of 7500 commenters, 500 randomly selected from

each of the blogs in the dataset, was pooled, converted to z-scores and demarcated by

Chiu et al’s (2001) clustering algorithm, three clusters emerged (see Figure 5.5 B).

These cluster assignments were confirmed using MacQueen’s (1967) k-means algorithm.

Compare Figure 5.5 (A) with Figure 3.7, where idea seeders and evangelists are seen as

occupying opposite tails of a normal distribution along an individual thought diversity

axis. In Figure 5.5 (A) the distribution fits the rough pattern of a normal distribution but

the center is distorted into a sharp peak. It seems as though commenters near the mean

exhibit (or perhaps are drawn into) greater similarity in cognitive consistency (the

cognitive distance between an individual’s thought expressions) than commenters away

from the mean on either tail. This higher density section appears to be within the cluster

labeled “Flocking” or “Flockers” in Figure 5.5 (B). It was predicted that flockers would

be below average in boldness but unrestricted in their overall thought diversity. While

that appears to be generally true, it seems that the “average” flocker perceives and

maintains (perhaps through conscious effort or innate tendency to conform) an ideal

level of consistency in their expressed thoughts. Figure 5.6 (B) shows this phenomenon

is consistent across blogs.

FIGURE 5.5

All Blog Commenter Clusters

FIGURE 5.6

Paneled Histograms of Boldness and Individual Thought Diversity

This pattern of things in the same vicinity being drawn to a common point is

similar to the physics concept of a fixed point attractor, a stationary point that exerts a

gravity-like force on nearby objects, drawing them closer. Here, evidence of a social

attractor may be present. A full discussion of these results is beyond the scope of this

chapter and is deferred to Chapter VI. Figure 5.7 shows that unrestricted thought

freedom, as personified by idea seeders, is consistent across all blogs (the white band is a

2σ (95%) confidence interval around the mean).

FIGURE 5.7

All Blog Idea Seeding Histogram

Since the scatter plot of Figure 5.5 (B) holds data from all 15 blogs, it might be

considered indicative of the diversity of boldness and individual thought that

characterizes the whole blogosphere. If that is so, then comparing it with similar plots

using data from one blog might be a qualitative method of assessing cognitive diversity.

Figure 5.8 contains such plots of commenters from all the individual blogs. While most

plots contain the same three basic clusters, the size and location of the clusters vary

noticeably. For example, it could be strongly argued that blog 6 has the profile of a blog

that reflects the diversity present in the whole blogosphere while such blogs as 12, 13

and 14 are more limited. Note that no comparisons can be made with diversity of

thought in the whole market; that must be deferred to future research.

FIGURE 5.8

Individual Blog Commenter Clusters

In Figure 5.8, the clustering algorithms group commenters into similarly located

clusters in all the blogs, with positions roughly as predicted in Figure 3.8. While

interesting, the scatter plots of Figure 5.8 are more instructive when viewed in

conjunction with Figures 5.6 and 5.9. The boldness histograms in Figure 5.6 (A)

generally show bimodal distributions, two separate normally distributed populations on

either side of the mean. The individual thought diversity histograms (Figure 5.6 B) are

consistent in showing one wide normally distributed population, punctuated with a

center peak. The alternative model depicted in Figure 3.11 proposed indoctrination, mass

dissemination and idea seeding to be the exogenous influences driving the other

constructs. Figure 5.9 shows that levels of those influences, as well as flocking, are

roughly normally distributed. Indoctrination seems invariant across blogs, while flocking

changes little. Variance in these influences does not seem to be an obvious explanation

for the variation in commenter cluster configurations shown in Figure 5.8. However,

levels of mass dissemination and idea seeding vary widely and seem logically correlated

with changes in Figure 5.8. Note, for example, that low levels of mass dissemination in

Figure 5.9 (A) blogs 10 and 12 match small mass dissemination clusters in Figure 5.8. A

similar pattern exists for idea seeding in blogs 3 and 9.

FIGURE 5.9

Paneled Histograms of Blog Author and Commenter Types

In preparation for estimating the parameters of equation 3.2, the commenters

were forcibly classified into the segments of Figure 3.8. Using these topology-based

segment membership assignments, the parameters of equation 3.2 were estimated and

the accuracy of the resulting model tested with a holdout sample. Two discriminant

functions emerged from estimation with data from each of the blogs, correctly

classifying an average of 92.1% of their respective holdout samples. Means of the

coefficients of the respective discriminant models are given in Table 5.2.

TABLE 5.2

Commenter Discriminant Function Coefficients (Equation 3.2)

                                    Function 1                        Function 2
Parameter                           Standardized   Unstandardized     Standardized   Unstandardized

β5 (Stdev Comment to Cluster)       0.419           0.671             -0.010         -0.016
(# Clusters)                        0.503           0.810             -0.012         -0.019
(Stdev Individual)                  0.035           0.043              0.714          0.897
(Mean Individual)                   0.038           0.046              0.639          0.777
(# Comments)                        0.391           0.597             -0.051         -0.078
(Constant)                                         -0.063                            -0.033

Construct Validity

As stated in Chapter IV, construct validity focuses on the extent to which data