THE FORECASTING ACCURACY AND EFFECTIVENESS OF COMPLEXITY MANAGER
LINDY-JO SMART
A Thesis
Submitted to the Faculty of Mercyhurst College
In Partial Fulfillment of the Requirements for
The Degree of
MASTER OF SCIENCE IN
APPLIED INTELLIGENCE
DEPARTMENT OF INTELLIGENCE STUDIES
MERCYHURST COLLEGE
ERIE, PENNSYLVANIA
APRIL 2011
DEPARTMENT OF INTELLIGENCE STUDIES
MERCYHURST COLLEGE
ERIE, PENNSYLVANIA
THE FORECASTING ACCURACY AND EFFECTIVENESS OF COMPLEXITY MANAGER
A Thesis
Submitted to the Faculty of Mercyhurst College
In Partial Fulfillment of the Requirements for
The Degree of
MASTER OF SCIENCE IN
APPLIED INTELLIGENCE
Submitted By:
LINDY-JO SMART
Certificate of Approval:
___________________________________
Kristan J. Wheaton
Associate Professor
Department of Intelligence Studies

___________________________________
William J. Welch
Instructor
Department of Intelligence Studies

___________________________________
Phillip J. Belfiore
Vice President
Office of Academic Affairs
April 2011
Copyright © 2011 by Lindy-Jo Smart
All rights reserved.
ACKNOWLEDGEMENTS
First, I would like to thank Kris Wheaton for his incredible guidance and patience
through this process and for always having the time to sit down and work through
challenges.
I would like to thank Bill Welch for taking on the role as my secondary reader.
I would like to thank Hema Deshmukh for helping me complete all statistics in this thesis
and for her patience throughout the process.
I would like to thank Richards Heuer for his personal correspondence throughout the
experiment creation process.
I would like to thank all faculty members in the Intelligence Department at Mercyhurst
College for their dedication, guidance, and for offering such a rewarding challenge that is
the Intelligence Studies graduate program.
I would also like to thank my friends and family for their continual support,
encouragement, and patience with me the past two years.
ABSTRACT OF THE THESIS
The Forecasting Accuracy and Effectiveness of Complexity Manager
A Critical Examination
By
Lindy-Jo Smart
Master of Science in Applied Intelligence
Mercyhurst College, 2011
Associate Professor Kristan J. Wheaton, Chair
The purpose of this study was to assess the forecasting accuracy and effectiveness
of the structured analytic technique, Complexity Manager. The study included an
experiment with Mercyhurst College Intelligence Studies graduate and undergraduate
students placed into small groups to assess an intelligence problem and forecast using
intuition or the structured analytic technique, Complexity Manager. Data was collected
using a researcher-created Forecasting Answering Sheet that included the variables that
each group considered and a researcher-created questionnaire to capture individual
responses to the study. Students who used Complexity Manager spent significantly more
time working with their groups than students who used intuition alone. Students who
used intuition alone generated a greater number of variables. However, there was no
connection between the number of variables generated and forecasting accuracy; the experiment
group produced more accurate forecasts. The results of the study show that the use of
Complexity Manager may increase forecasting accuracy; three out of 24 control groups
forecasted accurately while six out of 23 experiment groups forecasted accurately.
Therefore, the use of this structured analytic technique increases collaboration and yields
more accurate results than intuition alone. To produce more statistically sound results,
further studies will require a higher level of participation.
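Whether a difference of this size (six of 23 experiment groups versus three of 24 control groups forecasting accurately) could plausibly arise by chance can be gauged with Fisher's exact test. The following short Python sketch is an illustration of that test, not an analysis performed in the thesis:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def p_table(x):
        # Hypergeometric probability of x "successes" in the first row
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = p_table(a)
    lo = max(0, col1 - row2)
    hi = min(col1, row1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p_table(x) for x in range(lo, hi + 1) if p_table(x) <= p_obs + 1e-12)

# Accurate vs. inaccurate forecasts: experiment 6 of 23, control 3 of 24
p_value = fisher_exact_two_sided(6, 23 - 6, 3, 24 - 3)
```

A p-value well above conventional thresholds would echo the abstract's caveat that a higher level of participation is needed before the difference can be called statistically sound.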
TABLE OF CONTENTS

TABLE OF CONTENTS
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
CHAPTER 2: LITERATURE REVIEW
    Intelligence Failures
    Cognitive Bias
    Unstructured and Structured Techniques
    Collaboration
    Complexity Manager
    Hypothesis
CHAPTER 3: METHODOLOGY
    Setting
    Participants
    Intervention and Materials
    Measurement Instruments
    Data Collection and Procedures
    Data Analysis
CHAPTER 4: RESULTS
    Survey Responses
    Group Analytic Confidence
    Group Source Reliability
    Variables
    Forecasting Accuracy
    Quality of Variables
CHAPTER 5: CONCLUSIONS
    Discussion
    Limitations
    Recommendations for Further Research
    Conclusions
REFERENCES
APPENDICES
    Appendix A
    Appendix B
    Appendix C
    Appendix D
    Appendix E
    Appendix F
    Appendix G
    Appendix H
    Appendix I
    Appendix J
    Appendix K
LIST OF FIGURES

Figure 2.1 Complexity Manager: Cross-Impact Matrix
Figure 3.1 Number of Participants Per Academic Class
Figure 4.1 Source Reliability Per Group
Figure 4.2 Forecasting Per Group
Figure 5.1 Analytic Confidence Per Group
CHAPTER 1: INTRODUCTION
The entire project cost $40,000 and consisted of 1,200 experiments in a 14-month
period (“Edison Gets the Bright Light Right,” 2009). The team searched the world,
testing materials from beard hair to fishing line to bamboo. Over 40,000 pages of notes
were taken (“Edison’s Lightbulb at The Franklin Institute,” 2011). Then, in 1879, after
testing over 1,600 materials, Thomas Edison and his associates found a filament that
would burn for 15 hours. Edison stated, "I tested no fewer than 6,000 vegetable growths,
and ransacked the world for the most suitable filament material" (“Edison’s Lightbulb at
The Franklin Institute,” 2011). By 1880, Edison had produced a bulb that could last for
1,500 hours and was then placed on the market (“Light Bulb History – Invention of the
Light Bulb,” 2007).
The success of Edison’s invention, as with other scientific findings, is evident
because it produces tangible and in this case, visual results. The longer a filament would
burn, the more successful it was. Failure was depicted through nothingness: a filament
that did not burn for any noticeable length of time. When considering Edison’s filament
experiments compared to experiments in other fields of study, are the results just as
definite and stunning?
In the context of intelligence, the two couldn’t be more different. If intelligence is
successful, nothing may happen. But if intelligence fails, the results could be
catastrophic. It is not something that can be tested within the confines of a vacuum bulb,
nor can thousands of possible solutions be tested before the problem can be resolved. But
intelligence can and must improve somehow. One improvement can come through the use of
structured analytic techniques. However, though structured techniques exist, not all
analysts use them because their effectiveness has not been proven and practical constraints further
hinder their use. Therefore, testing of the methods is needed to increase their use within the
Intelligence Community (IC) and determine the validity of each method.
In “Assessing the Tradecraft of Intelligence Analysis,” Gregory F. Treverton and
C. Bryan Gabbard define structured analytic techniques as “technologies, products, or
processes that will help the analyst in three ways…searching for and dealing with data…
building and testing hypotheses…and third, in communicating more easily both with
those who will help them do their work” (2008, p. 18). Though these techniques are
created to assist analysts, there is no consensus on the need for or value of them
(Treverton & Gabbard, 2008). Some analysts willingly use them while others prefer not
to. Therefore, even though a large number of structured analytic techniques are available
to analysts, without proper training and understanding, the techniques are useless.
Another major issue is that many of these structured analytic techniques have yet
to be tested. Richards Heuer states that the concept of structured analytic techniques
began in the 1980s when Jack Davis began teaching and writing about “alternative
analysis” (Heuer & Pherson, 2010, p. 8). Even though the concept has been around for
almost thirty years, it remains largely untested. Therefore, without proper training for the
use of structured analytic techniques and without proven results to give the methods
authority, the use of structured analytic techniques will remain limited.
Richards Heuer states that all structured analytic techniques provide the same
benefit; they guide communication among analysts who need to share evidence, provide
alternative perspectives, and discuss the significance of evidence (2009). Structured analytic
techniques involve a step-by-step process that externalizes an analyst’s thoughts.
Therefore, their thoughts and ideas can be reviewed and discussed at each step of the
process. The techniques provide structure to individual thought processes and to the
interaction among collaborators to help generate hypotheses and mitigate cognitive
limitations (Heuer, 2009).
Heuer suggests the evaluation of structured analytic techniques because the only
testing the IC has done is through experience and through a small number of colleges and
universities that offer intelligence courses (Heuer, 2009). He further states that there is no
systematic program established for evaluating and validating the techniques (Heuer &
Pherson, 2010). To resolve this issue, Heuer suggests conducting experiments with
analysts using the technique to analyze typical intelligence issues (Heuer & Pherson,
2010). Heuer states that the most effective approach to evaluating the “techniques is to
look at the purpose for which a technique is being used, and then to determine whether or
not it actually achieves that purpose, or if there is some better way” (2009, p. 5).
Structured analytic techniques are needed to mitigate limitations such as
organizational and individual bias and to decrease the number and negative effects of
intelligence failures. Though structured analytic techniques may be an effective way to
mitigate these issues, few have been tested including Complexity Manager, the subject of
this study. Therefore, testing Complexity Manager would increase the validity of
structured analytic techniques, particularly this specific technique.
The purpose of this study was to conduct an experiment to test the effectiveness
of the methodology Complexity Manager according to Richards Heuer’s
recommendations of using intelligence analysts to analyze a typical intelligence issue.
This thesis is one of thousands of filaments that need to be tested to ensure that structured
analytic techniques are serving their purpose: to give the visible, tangible results needed
in the intelligence field of study.
CHAPTER 2: LITERATURE REVIEW
Since the creation of the Intelligence Community (IC), issues surrounding the
validity and soundness of its forecasts have surfaced due to intelligence failures. The IC
has taken steps to proactively avoid further significant intelligence failures. However,
debates persist about both the causes of intelligence failures and how preventable they
actually are. Further, the methods used for reaching a forecast are not universally
agreed upon; opinions vary among analysts on whether structured analytic techniques are
more effective than intuition alone. Using structured analytic techniques
gives the analyst greater confidence in their forecast, especially when managing a
complex situation with multiple hypotheses. However, the very techniques that the
analyst uses have not been proven effective. One reason is that structured analytic
techniques assist the analyst in forecasting the outcome of future events; it is difficult to
state with absolute certainty that the technique is effective. In other words, the technique
may have been used properly, but the forecast may still have been wrong. Regardless, the
testing of structured analytic techniques is essential for not only validating the method
itself, but for the validity of the IC’s ability to forecast. Therefore, the testing of Richards
Heuer’s Complexity Manager will be an initial step toward assessing the strength of this
structured analytic technique.
The literature review will address four areas related to the testing of Richards Heuer’s
technique, Complexity Manager. The first section will address research related to
intelligence failures and will be followed in the second section by a proposed cause of
them, cognitive bias. The third section will focus on research about the use of
unstructured versus structured analytic techniques. Finally, the fourth section will discuss
research related to the use of collaboration and its connection to Complexity Manager.
Intelligence Failures
For the purpose of this thesis, “intelligence” will be defined according to the
Mercyhurst College Institute for Intelligence Studies (MCIIS) definition created by
Kristan Wheaton and Michael Beerbower which states that intelligence is “a process
focused externally, designed to reduce the level of uncertainty for a decision maker using
information derived from all sources” (2006, p. 319). For the purpose of this thesis,
“intelligence failure” will be defined according to Gustavo Diaz’s definition from
“Methodological Approaches to the Concept of Intelligence Failure” which states that an
intelligence failure is: “the failure of the intelligence process and the failure of decision
makers to respond to accurate intelligence” (Diaz, 2005, p. 2).
Diaz cites both Mark Lowenthal and Abram N. Shulsky for the creation of this
definition. Mark Lowenthal’s definition emphasizes that an intelligence failure is a failing
in the intelligence cycle: “the inability of one or more parts of the intelligence process
(collection, evaluation, analysis, production, dissemination) to produce timely, accurate
intelligence on an issue or event of importance to national interest” (Lowenthal, 1985, p.
51, as cited by Diaz, 2005). Shulsky’a definition acknowledges the connection between
intelligence and policy: “a misunderstanding of the situation that leads a government to
take actions that are inappropriate and counterproductive to own interests” (Shulsky,
1991, p. 51, as cited by Diaz, 2005).
The discussion of intelligence failures is not only to understand why they
occurred, but to what extent they can be minimized in the future. There is a lack of
consensus regarding the cause of intelligence failures. Some researchers state that
analytic failure causes intelligence failures while others state that they are caused by the
decision maker’s potential misunderstanding of the intelligence product given to them. In
other words, it can either be caused by faulty analysis or by miscommunication with the
decision maker. Regardless of the cause, researchers often agree that intelligence failures
are inevitable.
In “Methodological Approaches to the Concept of Intelligence Failure” Gustavo
Diaz claims that intelligence failures are inevitable because of unavoidable limitations.
He suggests that there are two schools of thought for intelligence failure: the traditional
point of view and the optimistic point of view. The traditional point of view believes that
policymakers are responsible for the failures because they do not listen to the analysis or
they misinterpret it (Diaz, 2005). The optimistic point of view believes that intelligence
can be improved through the use of technology and that failures can be reduced by the
use of new techniques (Diaz, 2005).
Diaz suggests a third, alternative approach that captures both: there is
not one single source of guilt for intelligence failures. An intelligence failure, like any
human activity, is inevitable because failures and imperfections are normal. Intelligence
cannot always give the same result even with the same environment because there are
always factors that cannot be controlled (Diaz, 2005). Diaz states that accidents in
complex systems, such as a country’s national security, are inevitable because it is
impossible to account for all failures. Not only is it impossible to account for every
factor, but there are limitations to the amount and relevance of data collection and the
reliability of sources. Also, funding limits the amount of resources available to fight the
threats and leads to a need to prioritize them.
Richard Betts, in “Analysis, War, and Decision: Why Intelligence Failures are
Inevitable,” notes barriers inherent to the nature of intelligence, including
ambiguity of data, ambivalence of judgment due to conflicting data, and useless reforms
in response to previous intelligence failures (1978). Therefore, the fallibility of
intelligence and its intrinsic ties to decision making make intelligence
failures inevitable. Betts further states that if decision makers had more time, then
intelligence failures would not occur because problems could be resolved, just as academic problems
are resolved (Betts, 1978). Because time will always be a main concern and reason for
needing intelligence analysis, Betts concludes by suggesting “tolerance for disaster”
(Betts, 1978, p. 89).
Intelligence failures can be inevitable due to the nature of intelligence work, or by
human nature. According to Simon Pope and Audun Josang, co-authors of “Analysis of
Competing Hypothesis using Subjective Logic,” intelligence failures, or errors in general,
are due to problems with framing, resistance to change, risk aversion, limitations of
short-term memory, and cognitive bias. These issues can negatively affect intelligence,
especially where issue outcomes appear similar (Pope & Josang, n.d.).
inevitability of error with human reason and past intelligence failures, the researchers
conclude that to continue to rely solely on intuition would be irresponsible. As the
authors state, “management of intelligence analysis should encourage the application of
products that allow clear delineation of assumptions and chains of inference” (Pope &
Josang, n.d., p. 2). The inevitability of intelligence failures due to cognitive bias will be
discussed in more detail after the discussion of intelligence failures.
The inevitability of intelligence failures may not only be natural, but may be a
necessary part of intelligence. Stephen Marrin, a former CIA analyst, suggests in
“Preventing Intelligence Failures by Learning from the Past” that intelligence failures
occur as a trade-off to another action that could have caused future, unavoidable failures.
Also, imperfections in the intelligence process are the result of unavoidable structural
tradeoffs. He states, for example, that any changes that would have been made to prevent
the attacks of September 11, 2001 would have caused other unavoidable future failures (Marrin, 2004).
Therefore, the only way to make improvements is to understand that everything has
tradeoffs and either work to minimize them or find new ways of doing things that move
beyond the tradeoffs.
An intelligence failure often implies a negative impact on the U.S. national
security; however, failures occur every day in varying degrees. It is not until a failure
is tied to a high-profile situation that it
is then known as an intelligence failure (Marrin, 2004). Marrin also makes the point that
though intelligence failures are becoming more public through investigations, successes
are often not discussed to avoid losing sources and methods (Marrin, 2004).
Consequently, in the public view, failures outnumber successes; the degree of success is
not known. If intelligence failures are inevitable, then the misconception that failures
outnumber successes may also be a necessary part of intelligence to maintain source
confidentiality.
John Hollister Hedley states in “Learning from Intelligence Failures” that from
the United States’ perspective, anything that occurs that catches the U.S. by surprise or
was unintended is seen as an intelligence failure. Aside from the perception of
intelligence failure, Hedley also notes that failure is inevitable because analysts must be
willing to take risks to do their job well; even when information is incomplete,
inaccurate, or contradictory, a decision must be made (Hedley, 2005). Even under these
circumstances, and though it is impossible to learn how to prevent something inevitable
like intelligence failures, the ratio of success to failure could be improved (Hedley, 2005).
Therefore, to improve the ratio, structures and methods could be applied to increase the
likelihood of success.
In Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in
Applying Structured Methods, Master Sergeant Robert D. Folker, Jr. states that the root
cause of intelligence failures is analytic failure: the lack of analysis of the collected raw
data. He states that regardless of what analysts believe about intelligence failures, it is
the opinion of the decision maker that matters most; if the decision maker doesn’t believe
the accuracy of the intelligence then it will not be useful. Therefore, improvements
should focus on improving accuracy and producing timely and useful products (Folker,
2000). Regardless of how inevitable failures are, Folker emphasizes a need for
improvements in the quality of analysts’ work so decision makers can make quality
decisions.
The literature reviewed on intelligence failures stated that failures are caused by
cognitive bias, analytic failure, or the decision maker’s misunderstanding of the
intelligence product they are given. Whatever the cause, the IC agrees that intelligence
failures are inevitable because failure and imperfection are inevitable and normal. Also,
the nature of intelligence requires judgments to be made on time sensitive issues so
tradeoffs must be made. Though failure is inevitable, it is not justifiable or excusable to
not attempt to prevent or reduce the severity of its consequences. Rather, the better
intelligence failures and their causes are understood, the more likely it is that their
effects can be lessened, especially through the factors within the analyst’s control,
including an understanding of their own cognitive bias and its ramifications on their
intelligence products.
Cognitive Bias
In “Fixing the Problem of Analytic Mind-Sets: Alternative Analysis,” Roger
George describes cognitive bias as a mindset from which both the analyst and the
decision maker develop a series of expectations based on past events and draw their own
conclusions. As both are presented with new data, they either validate it because it is
consistent with earlier data or they disregard it if it does not fit into the pattern. As new
events occur, data consistent with earlier patterns of beliefs are more likely to be accepted
as valid, while data that conflicts with expectations lacks precedent. It is human nature for
individuals to “perceive what they expect to perceive,” making the mindset unavoidable
(George, 2004, p. 387). Initially, the mindset can help create experts for data collection;
however, eventually the mindset will make the experts obsolete as they are unable to
accept or process new information or changing events.
Heuer identifies cognitive bias as a mental error that is consistent and predictable.
In Jack Davis’ introduction to Heuer’s Psychology of Intelligence Analysis, Davis
identifies three factors that Heuer recognizes as the cognitive challenges that analysts
face: the mind cannot effectively deal with uncertainty; even if the analyst has an
increased awareness of their biases, this does little to help analysts deal effectively with
uncertainty; and tools and techniques help the analyst apply higher levels of critical
thinking and improve analysis on complex issues, especially when information is
incomplete or deceptive (Davis, 1999). In the chapter, “Thinking About Thinking,”
Heuer notes that weakness and bias are inherent in the human thinking process. However,
they can be alleviated by the analyst’s conscious application of tools and techniques
(Heuer, 1999). Though bias is present, techniques can help mitigate its effects.
Patterns are necessary for analysts to know what to look for and what is
important. These patterns then form the analyst’s mindset and create their perception of
the world (Davis, 1999). Mindsets are unavoidable and objectivity is achieved only by
making assumptions as clear as possible so when others view the analysis, they can
assess its validity. Cognitive bias may be unavoidable, but overt acknowledgment
reduces its negative effect on intelligence analysis.
Cognitive biases can develop within individuals and collectively in the
organization they work for. David W. Robson states in “Cognitive Rigidity: Methods to
Overcome It” that organizations develop mental models that serve as the basis for belief
within the organization. Like an individual’s mindset, these mental models can be
difficult to overcome (Robson, n.d.). This cognitive rigidity can lead to reliance on
hypotheses that purely reinforce what the organization believes to be true and valued.
Just as an expert’s judgment can become obsolete, cognitive rigidity within an
organization can ultimately lead it to dismiss radical alternatives to its approach and
restrict its ability to change over time, even when change is necessary.
Organizations often struggle with cognitive rigidity because, by its very nature, it
is undetectable (Robson, n.d.). Robson notes that this is especially true for organizations
that handle complex problems and forecast possible outcomes. The danger of cognitive
rigidity lies in the experience of the organization; the more experienced the organization,
the more susceptible it is to being set in its mental model. Over time, the organization’s
cognitive framework becomes self-reinforcing as it accepts only data that confirms its
core assumptions on which it was built. For organizations that deliver actionable
intelligence, these frameworks consequently influence the estimation of probability and
may negatively influence the intended solution and possibly lead to an intelligence failure
(Robson, n.d.).
Rob Johnston identifies two general types of bias in “Integrating Methodologists
into Teams of Substantive Experts” that include pattern bias and heuristic bias. Pattern
bias, more commonly known as confirmation bias, is looking for evidence that confirms
rather than rejects a hypothesis; heuristic bias is using inappropriate guidelines or rules to
make predictions (Johnston, 2005). Johnston looks at how each affects experts. He states
that unreliable expert forecasts are often caused by both pattern and heuristic bias.
Becoming an expert requires years of viewing the world through a particular lens;
however, because of these biases, it can also lead to poor intelligence.
Johnston states that intelligence analysis is like other complex tasks that demand
expertise to solve complex problems; the more complex the task, the longer it takes to
build necessary expertise. However, this level of expertise paradoxically makes expert
forecasts unreliable. Johnston notes that experts outperform novices with pattern
recognition and problem solving, but expert predictions are “seldom as accurate as
Bayesian probabilities” (2005, p. 57). Johnston attributes this to cognitive bias and time
constraints. Experts are effective, but only to the extent that their bias does not affect
the quality of their analysis.
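Johnston’s benchmark, Bayesian probabilities, refers to updating belief in a hypothesis in proportion to how likely the new evidence is under that hypothesis, rather than in proportion to how well it fits a preferred view. A minimal sketch of Bayes’ rule in Python (the probability values are illustrative assumptions, not figures from Johnston’s work):

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Return the posterior P(H | E) from the prior P(H) and evidence likelihoods."""
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1 - prior) * p_e_given_not_h)

# Evidence twice as likely if H is true: belief rises from 0.5 to 2/3,
# regardless of whether the evidence "confirms" the analyst's preferred view.
posterior = bayes_update(0.5, 0.8, 0.4)
```

A confirmation-biased judgment departs from this rule by weighting confirming evidence more heavily than its likelihood ratio warrants, which is the failure mode the structured techniques discussed here aim to mitigate.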
Johnston also discussed bias in Analytic Culture in the US Intelligence
Community: An Ethnographic Study. Johnston conducted a series of 439 interviews, focus
groups, and other forms of participation from members of the IC (2005). The purpose of
the work was to identify and describe elements that negatively affect the IC. Within this
study, he focused a section of his work on confirmation bias. Johnston found through
interviews and observation that confirmation bias was the most prevalent form of bias in
the study (Johnston, 2005, p. 21). For example, when Johnston asked the participants to
describe their work process, they responded that the initial steps to investigating an
intelligence issue were to search the previous literature. The issue that
Johnston notes with this is that searches can quickly lead to unintentional searching for
confirming information. Therefore, the evidence collection could quickly become a
search that only confirmed the analyst’s own thoughts and assumptions.
A weakness of Johnston’s ethnographic study on confirmation bias may be the
presence of his own bias. Johnston chose to include only four quotations from interviews
with analysts during his discussion on confirmation bias. All answers that he provides in
the body of his results conclude the same thing: initial searches are done by reading
previous products and past research. However, with over 439 participants in the study, it
is doubtful that only four participants answered this question and it is even less likely that
all 439 participants only discussed literature searches. A more comprehensive look at
confirmation bias within the IC would have been to measure, in a more quantitative form,
the responses to see exactly where the bias stems from. If not, it appears that Johnston
chose those quotes only to make his point that analysts most often use literature searches
to begin the analysis process instead of presenting all the responses.
To test cognitive bias, researchers Brant A. Cheikies, Mark J. Brown, Paul E.
Lehner, and Leonard Adelman assessed the effectiveness of a structured analytic
technique in their study “Confirmation Bias in Complex Analyses.” The researchers
believe that most studies of confirmation bias involve abstract, unrealistic experiments
that do not mirror complex analysis tasks managed by the IC. Therefore, the purpose of
this study was to recreate a study of an actual event, using techniques to assess the
presence of confirmation and anchoring bias and whether the structured analytic technique,
Analysis of Competing Hypotheses (ACH), successfully reduces it. For their study, the
researchers define an anchoring effect as a “tendency to resist change after an initial
hypothesis is formed” (Cheikies et al., 2004, p. 9). The researchers define confirmation
bias as the “tendency to seek confirming evidence and/or bias the assessment of available
evidence in a direction that supports preferred hypotheses” (Cheikies et al., 2004, p. 10).
This was the first recorded experiment that sought to test ACH’s ability to minimize
confirmation bias (Cheikies et al., 2004). The researchers replicated a study by Tolcott,
Marvin, and Lehner, conducted in 1989, to obtain the same confirmation bias results.
Therefore, they could then test ACH’s ability to mitigate confirmation bias using the
previous study as a control. For the study, the researchers used 24 employees from a
research firm. The participants averaged 9.5 years of intelligence analysis experience. All
participants were emailed 60 pieces of evidence regarding The USS Iowa Explosion that
occurred in April 1989, including three hypothesized causes of the explosion: Hypothesis
15
1 (H1) inexperienced rammerman inadvertently caused the explosion, (H2) friction
ignited powder, and (H3) gun captain placed incendiary device. The experiment group
was given the same information but was also given an ACH tutorial.
To test for confirmation and anchoring bias, the researchers arranged for H1 and H3
to receive the most confirming evidence in the first two rounds while H2 received the
least; H1 and H3 were also constructed to be the easiest to visualize. To
analyze the results, two analyses of variance (ANOVA) were performed on the
participants’ confidence ratings.
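The ANOVA computation itself is straightforward to sketch. The short Python example below computes a one-way ANOVA F statistic from first principles; the confidence ratings it uses are hypothetical, invented purely to illustrate the mechanics, since the study's raw data are not reproduced here.

```python
# One-way ANOVA F statistic computed from first principles (stdlib only).
# The confidence ratings below are hypothetical -- invented to illustrate
# the mechanics, not taken from the Cheikes et al. study.

def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of groups."""
    observations = [x for g in groups for x in g]
    grand_mean = sum(observations) / len(observations)
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: how far the group means sit from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(observations) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical confidence ratings (0-100) for a control group and an ACH group.
control = [55, 60, 70, 65, 50, 75]
ach = [45, 50, 40, 55, 60, 35]
f_stat = one_way_anova_f([control, ach])
print(round(f_stat, 2))
```

A large F indicates that the variation between group means is large relative to the variation within groups; the researchers would compare F against the F distribution with the corresponding degrees of freedom to judge significance.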
The results showed that as the participants assessed new evidence, their
assessments were strongly affected by the beliefs they held at the time the evidence was
presented. Evidence that confirmed a participant’s current belief was given more weight
than disconfirming evidence (Cheikes et al., 2004). In this study, ACH reduced confirmation
bias, but only for participants who did not have professional analysis experience.
The study was able to show that an anchoring effect was present and that
ACH was able to minimize an analyst’s tendency toward confirmation bias. The
researchers were effective at establishing an anchoring effect because they built on a
successful study that had previously done the same thing. However, the researchers’ use of
participants who were not experienced in intelligence analysis is a weakness of the study.
Though all participants were interested in analysis, only 12 had analysis experience. The
varying abilities of the participants call the validity of the results into question because
those not trained or experienced in analysis would be less aware not only of the purpose
of weighing criteria and the use of a structured analytic technique, but also of the
presence of cognitive bias.
In his Applied Intelligence Master’s Thesis, “Forecasting Accuracy and Cognitive
Bias in the Analysis of Competing Hypothesis,” Andrew Brasfield looked to further
investigate the inconclusive and varying results from previous studies about the
structured analytic technique, Analysis of Competing Hypotheses (ACH) (2009).
Specifically, Brasfield looked at ACH’s goals of increased forecasting accuracy and
decreased cognitive bias.
Seventy undergraduate and graduate Intelligence Studies student participants were
divided into control and experiment groups and further divided into groups based on
political affiliation to detect the presence of a pre-existing mindset regarding the topic,
the 2008 Washington State gubernatorial election. The two possible outcomes were that
either the incumbent governor or the challenger would win the election. The
participants had access to open source material to gather evidence for their forecast. The
control group used an intuitive process and the experiment group used ACH to structure
their analysis. The participants were given a full week to complete the assignment.
Brasfield tested accuracy by comparing the results of the control group to the
results of the experiment group. To test for cognitive bias, Brasfield looked for a pattern
between the participants’ party affiliation and their forecasts. For the
experiment group, Brasfield also examined whether the evidence in a participant’s ACH
matrix overwhelmingly supported a particular party’s candidate, which would indicate
the presence of cognitive bias.
The results of the election showed that the incumbent won. The results of the
study showed that the experiment group was nine percentage points more accurate than
the control group: 70 percent of the participants in the experiment group forecasted the
winner, compared to 61 percent of the participants in the control group (Brasfield,
2009). Brasfield states that structured analytic techniques “should only improve overall
forecasting accuracy incrementally since intuitive analysis is, for the most part, an
effective method itself” (Brasfield, 2009, p. 39). Therefore, though the improvement is
only minor, the findings show that ACH does improve analysis.
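Brasfield's "incremental" framing can be illustrated with a quick calculation. The sketch below applies a pooled two-proportion z-test to the reported accuracy rates. Because this summary does not give the exact group sizes, the counts are hypothetical: an assumed even 35/35 split of the 70 participants, with roughly 70 and 61 percent correct. Under these assumed counts, a nine-point gap does not reach conventional statistical significance, which is consistent with describing the improvement as incremental.

```python
# Two-proportion z-test sketch (stdlib only). Brasfield's exact group sizes
# are not given in this summary, so the counts below are illustrative:
# roughly 70% vs. 61% correct forecasts, as reported.
from math import sqrt

def two_proportion_z(hits1, n1, hits2, n2):
    """Pooled z statistic for the difference between two proportions."""
    p1, p2 = hits1 / n1, hits2 / n2
    pooled = (hits1 + hits2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical split of the 70 participants: 35 per group.
z = two_proportion_z(25, 35, 21, 35)  # ~71% vs. 60% correct
print(round(z, 2))  # |z| < 1.96, i.e. not significant at the .05 level
```

This is why small-sample studies of structured techniques tend to report suggestive rather than conclusive improvements: with roughly 35 participants per group, the observed gap is well within the range expected from chance variation.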
Regarding cognitive bias, ACH appeared to mitigate bias of party affiliation
among Republicans but not Democrats. Brasfield notes that this may be due to the
Democratic candidate winning the election. For the experiment group using ACH,
participants used more evidence and applied it appropriately. Nearly all control group
participants used evidence that only supported the candidate they forecasted to win,
suggesting confirmation bias in the control group (Brasfield, 2009). Therefore, ACH does
appear to mitigate this bias.
As intelligence failures are inevitable, so is the cognitive bias that contributes to
them. Cognitive bias within the IC is most prevalent in the form of confirmation bias
where analysts and decision makers seek information that conforms most to the data that
they currently have. Within organizations, this takes the shape of confirming the beliefs
and values that the organization already holds. Because intelligence relies on judgment,
an analyst’s cognitive bias can affect the accuracy of the forecast and the decision
maker’s cognitive bias can affect the action they take. The biases of both are a major
contributor to whether intelligence succeeds or fails. Cognitive bias is a mindset that
either rejects or confirms information based on previous experiences. By consciously
recognizing cognitive bias, the analyst is taking the first proactive measure to mitigate
their contribution to an intelligence failure. However, the analyst must do more than just
recognize their bias; they must take the next step to distance their bias from their forecast
either by using intuition or structured analytic techniques.
Unstructured and Structured Techniques
The use of structured analytic techniques for effective forecasting and decision
making is debated within the IC. One side believes unstructured, intuitive
thinking is effective and comes with experience working in the field. The other side
believes that the use of structured analytic techniques organizes and manages complex
situations more effectively than intuition alone. Both sides, however,
recognize the need to overcome bias and the need for a process that effectively aids
strategic decision making. Because bias is inherent and present in every decision that is
made, the analyst must make a conscious application of analytic techniques outside of
their own mind to help reduce the effects of bias. However, intuition is also a powerful
tool that can be used alone or in conjunction with analytic techniques.
Intuition
David Meyers, a professor of Psychology at Hope College, describes the
intuitive process in his book, Intuition: Its Powers and Perils. He states that memory is
not a single, unified system. Rather, it is “two systems operating in tandem” (Meyers,
2002, p. 22). Implicit memory, or procedural memory, is learning how to do something
whereas explicit memory, or declarative memory, is being able to state what and how
something is known. Meyers offers an example: as infants, we learn reactions and skills
used throughout our lives; however, we cannot explicitly recall anything from our first
three years (2002). This phenomenon continues throughout our lifetime. Though we may
not explicitly recall much of our past, we implicitly and intuitively remember skills.
Beyond a basic idea of intuition, Meyers also discusses intuitive expertise.
Compared to novices, experts know more through learned expertise. Meyers describes
William Chase and Herbert Simon’s chess expert study. The researchers found that the
chess experts could reproduce the board layout after looking at the board for only 5
seconds and could also perceive the board in clusters of positions that they were familiar
with. Therefore, the experts could intuitively play 5 to 10 seconds a move without
compromising their level of performance. Through this example, Meyers relays that
experts are able to recognize cues that enable them to access information they have stored
in their memories; expert knowledge is more organized and, therefore, more efficiently
accessible (Meyers, 2002). Experts see large, meaningful patterns while novices are only
able to see the pieces. Another difference between experts and novices is that experts
define problems more specifically. Meyers does note, however, that expertise is
domain-specific: it is confined to a particular field and scope for each individual (Meyers,
2002).
Though intuition allows us to access experiences and apply them efficiently, it
does have its drawbacks. Meyers recognizes three forms of bias that intuition is prone to:
hindsight bias, self-serving bias, and overconfidence bias (Meyers, 2002). Hindsight bias
occurs when events and problems become obvious in retrospect; once the outcome is
known, it is impossible to revert to the previous state of mind (Meyers, 2002). In other
words, we can easily assume in hindsight that we know and knew more than we actually
did. Another disadvantage of intuition is a self-serving bias. Meyers states that in past
experiments, people more readily accepted credit for successes but attributed failure to
external factors or “impossible situations” (Meyers, 2002, p. 94). A third drawback to
intuition is an overconfidence bias that can surface from judgments of past knowledge in
estimates of “current knowledge and future behavior” (Meyers, 2002, p. 98).
Overconfidence is then sustained by seeking information that will confirm decisions
(Meyers, 2002).
Intuition can be a powerful tool, especially when quick decisions need to be
made. However, intuition is subject to bias and an individual’s expertise is limited in
scope and personal ability. Because of this, it is necessary for analysts to remain
cognizant of its possible disadvantages and to use tools that most effectively make use of
their intuition.
In “Role of Intuition in Strategic Decision Making,” Naresh Khatri and H. Alvin
look to fill the gap in research on the role that intuition serves in decision
making. The researchers state that, at the time of their study, there were only a few
scholarly works on intuition and even fewer studies conducted in the field.
The researchers define intuition as a “sophisticated form of reasoning based on
‘chunking’ that an expert hones over years of job-specific experience...in problem-
solving and is founded upon a solid and complete grasp of the details of the business”
(Khatri & Alvin, p. 4). Khatri and Alvin (n.d.) state that intuition can be developed
through exposure to and experience with complex problems, especially when a mentor
guides the learner through the process.
They note six important properties of intuition. The first is that intuition is a
subconscious drawing on experience. The second is that intuition is complex: it
can handle more complex systems than the conscious mind (Parikh, 1994, as cited by
Khatri & Alvin, p. 5). The rational mind thinks more linearly, while intuition can
overcome those limitations. A third property is that the process of intuition is quick.
Intuition can recall a number of experiences in a short period of time; compressing years
of learned behavior into seconds (Isenberg, 1984, as cited by Khatri & Alvin, p.5). A
fourth property of intuition is that it is not emotion. Intuition does not come from
emotion; rather, emotions such as anger or fear cloud the subtle intuitive signals. The
fifth property is that intuition is not bias. The researchers state that there are two sides to
the bias debate over intuition. On one side, cognitive psychology research states that
decision making “is fraught with cognitive bias” (Khatri & Alvin, p. 6). However,
another body of research suggests that intuition is not necessarily biased but “uncannily
accurate” (Khatri & Alvin, p. 6). The researchers reason that the same
cognitive process used for valid judgments is the one that generates the
biased ones; therefore, just as “intuitive synthesis suffers from biases or errors, so does rational
analysis” (Khatri & Alvin, p. 6). Finally, intuition is part of all
decisions, even decisions based on concrete facts. As the researchers note, “at the very
least, a forecaster has to use intuition in gathering and interpreting data and in deciding
which unusual future events might influence the outcome” (Goldberg, 1973, cited by
Khatri & Alvin). The researchers show through these properties that intuition is not an
irrational process because it is based on a deep understanding and rooted in years of
experience surfacing to help make quick decisions.
Khatri and Alvin state that strategic decisions are characterized by incomplete
knowledge; decision makers cannot rely solely on formulas to solve problems, so a
deeper understanding of intuition is necessary. The authors note that intuition
should not be viewed as the opposite of quantitative analysis, nor as a reason to forgo
analysis. Rather, “the need to understand and use intuition exists because few strategic
business decisions have the benefit of complete, accurate, and timely information”
(Khatri & Alvin, p. 8).
Khatri and Alvin surveyed senior managers of computer, banking, and utility
industries in the Northeastern United States and found that intuitive processes play a
strong role in decision making within each respective industry. The industries were
chosen based on the stability of their environments: the computer industry is the least
stable, the banking industry is moderately stable, and the electric and gas companies are
the most stable but the least competitive of the three. Khatri and Alvin acknowledged the effect the size of
the organization has on the culture; small organizations tend to “use more of
informal/intuitive decision making and less of formal analysis than large organizations”
(Khatri & Alvin, p. 14).
The researchers narrowed their scope by sampling organizations that fell within a
specified sales volume range. For the scope of the study, organizations in the computer
and utility industries all had sales of over $10 million and nine banks ranged from $50
million to $350 million in assets. The researchers used both subjective and objective
indicators for measurement of performance. The researchers had a response rate of 68
percent, or 281 individuals from 221 companies.
The industry mean scores were examined using the Newman-Keuls procedure and
a hierarchical regression analysis. The results showed that the computer industry uses a
higher level of intuitive synthesis than banks and the banking industry uses a higher level
of synthesis than the utility industry. The researchers’ three indicators of intuition include
judgment, experience, and gut-feeling. Each of these varied according to industry. They
found that managers in banks and computer companies use more judgment and rely on
their previous experience more so than the utility company managers. Managers of
computer companies rely on gut-feelings significantly more than bank or utility
managers. Therefore, Khatri and Alvin found that intuition is used more in unstable
environments than in stable ones.
Due to their findings, the researchers suggest that intuition be used for strategic
decision making in less stable environments and cautiously in more stable environments.
Khatri and Alvin note that the geographic range, organization size, and choice of
industries were limitations of their study. The study was limited to the Northeast, and
economic conditions can vary widely both regionally and nationally. They note that further
research should draw from larger samples across varying industries. Another limitation of
the study may be the researchers’ use of indicators and their definition of a “stable”
environment. The researchers noted that the indicators were subjective; without making
the indicators as objective as possible, it becomes increasingly difficult to use the
indicators as a standard for comparing data across industries. Also, Khatri and Alvin
state that the use of intuition is more effective in an unstable environment. However,
without a clear definition of what distinguishes a “stable” from an “unstable” environment, the
appropriate use of intuition for a given environment may not be properly determined.
Structured Analytic Techniques
“The National Commission on Terrorist Attacks Upon the United States,” or the
9/11 Commission, was created to evaluate and report on the causes of the terrorist
attacks of September 11, 2001 (Grimmet, 2004). The Commission also reviewed the
evidence collected by all related government agencies about what was known
surrounding the attacks, then reported its findings, conclusions, and
recommendations to the President and Congress on what proactive measures could be
taken against terrorist threats in the future (Grimmet, 2004).
Throughout the document, there are repeated recommendations stressing the
necessity of information sharing not only throughout United States agencies but through
international efforts. The Commission recommended information sharing procedures that
would create a trusted information network balancing security and information sharing
(Grimmet, 2004). The report also repeatedly emphasizes improved analysis. A key
recommendation of the Joint Inquiry of the House and Senate Intelligence Committees stated
that the IC should increase the depth and quality of its domestic intelligence collection
and analysis (Grimmet, 2004). The committees also suggested an “information fusion
center” where all-source terrorism analysis could be improved in both quality and focus
(Grimmet, 2004). The committees also stated that the IC should “implement and fully
utilize data mining and other advanced analytical tools, consistent with applicable law”
(Grimmet, 2004, p. 20). In stating this, the committees recognized the value of using
structured analytic techniques to improve intelligence. Therefore, the use of analytic
techniques is necessary within the IC, especially when working directly on terrorism.
Folker states in Intelligence Analysis in Theater Joint Intelligence Centers: An
Experiment in Applying Structured Methods that a debate exists between unstructured
and structured analytic techniques. It is a difference in thinking of intelligence as either
an art form or a science. The researcher states only a small number of analysts
occasionally use structured analytic techniques when working with qualitative data and
instead rely on unstructured methods (Folker, 2000). In the context of this experiment,
structured analytic techniques are defined as “various techniques used singly or in
combination to separate and logically organize the constituent elements of a problem to
enhance analysis and decision making” (Folker, 2000, p. 5).
Folker states that advocates of unstructured methods feel that intuition is more
effective because structured analytic techniques too narrowly define the intelligence
problem and ignore other important pieces of information (Folker, 2000). Those who use
structured analytic techniques claim that the results are more comprehensive and
accurate. The methods can be applied to a broad range of issues, helping the analyst
increase objectivity. The emphasis on structured analytic techniques is not to replace the
intuition of the analyst, but to implement a logical framework that capitalizes on intuition,
experience, and subjective judgment (Folker, 2000). However, hard evidence exists for
neither position; Folker states that, at the time of the experiment, no study had
adequately assessed whether the use of structured analytic techniques actually improves
qualitative analysis (2000). Folker points to the advantages of structured analytic
techniques when he states:
A structured methodology provides a demonstrable means to reach a conclusion.
Even if it can be proven that, in a given circumstance, both intuitive and scientific
approaches provide the same degree of accuracy, structured methods have
significant and unique value in that they can be easily taught to other analysts as a
way to structure and balance their analysis. It is difficult, if not impossible, to
teach an intelligence analyst how to conduct accurate intuitive analysis. Intuition
comes with experience. (2000, p. 14)
The ability of structured analytic techniques to be taught and replicated is shown to be a
clear advantage over intuition, which is learned through personal experience. Folker states
that even though structured analytic techniques have advantages, they are not used
because of time constraints, a sense of increased accountability, and a lack of
proof that they will actually improve analysis.
Analysts are faced with an ever increasing amount of qualitative data that is used
for solving intelligence problems. In order to use the data in a more objective way, Folker
designed an experiment to test the effectiveness of a structured analytic technique and its
ability to improve qualitative analysis. This was accomplished by comparing the analytic
conclusions drawn by two groups: one that used intuition and one that used
a structured analytic technique. The participants’ answers were then scored as correct or
incorrect and compared statistically to determine which group performed better.
There were 26 total participants in this study: 13 in the control group and 13 in the
experiment group. The low participation level was taken into account, and Fisher’s Exact
Probability Test was used to determine “statistical significance for the hypotheses and for
the influence of the controlled factors (rank, experience, education, and branch of
service)” (Folker, 2000, p. 16). The participants completed a questionnaire to give
demographics and identify any prior training and experience they had. Both groups were
given a map, the same two scenarios, and an answer sheet. All were given one hour to
complete the first scenario and 30 minutes to complete the second. Both
scenarios were built using extensive testing and were based on actual events. The results
indicated that the use of structured analytic techniques improved qualitative analysis and
that the controlled factors did not appear to affect the results.
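Fisher's Exact Probability Test suits this design because of the small, fixed group sizes. The sketch below computes a one-sided Fisher's exact p-value from the hypergeometric distribution using only the standard library; the correct/incorrect counts are hypothetical, since Folker's raw scores are not reproduced in this summary.

```python
# One-sided Fisher's exact test built from the hypergeometric distribution
# (stdlib only). Folker's actual correct/incorrect counts are not reproduced
# in this summary, so the 2x2 table below is hypothetical.
from math import comb

def fisher_exact_one_sided(a, b, c, d):
    """P(observing >= a successes in row 1) with all table margins fixed.

    Table layout:           correct  incorrect
        experiment group       a         b
        control group          c         d
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)
    k_max = min(row1, col1)
    # Sum hypergeometric probabilities over all tables at least as extreme.
    return sum(comb(row1, k) * comb(row2, col1 - k)
               for k in range(a, k_max + 1)) / denom

# Hypothetical scores for two groups of 13: 10/13 correct with the structured
# technique vs. 5/13 correct with intuition alone.
p = fisher_exact_one_sided(10, 3, 5, 8)
print(round(p, 3))
```

The exact test conditions on the table margins rather than relying on large-sample approximations, which is why it is appropriate for a study with only 13 participants per group.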
Folker noted that the time constraint for learning the methodology was a limiting
factor. Folker allotted one hour for teaching Analysis of Competing Hypotheses (ACH)
and stated that the complexity of the scenarios may have affected the results. Therefore,
future studies should either use analysts of varying experience who are already familiar
with the methodology or allot more time for learning it.
In “The Evolution of Structured Analytic Techniques,” Heuer states that
structured analytic techniques are “enablers of collaboration”; that the techniques are the
process by which effective collaboration occurs (Heuer, 2009). Structured analytic
techniques and collaboration should be developed together for the success of both. Heuer
states that there is a need to evaluate the effectiveness of structured techniques beyond
just experience; each needs to be tested.
Heuer states that testing the accuracy of a methodology is difficult because it
assumes that the accuracy of intelligence can be measured. Also, testing for accuracy is
problematic when most intelligence questions are probabilistic (2009). Heuer notes that
this would require a large number of experiments to obtain a meaningful comparison
of the accuracy of one technique against another. Further, the number of participants
needed for these experiments would be unrealistically high. Heuer states that the most
feasible and effective approach for evaluating a technique is to look at the purpose for
which the technique is being used and then determine whether it achieves that purpose or
if there is a better way to achieve that purpose; simple empirical experiments can be
created to test these.
Rather than a debate between the use of unstructured versus structured analytic
techniques, it is possible to view the two sides as existing at either end of a spectrum,
with both being necessary and useful for decision making and forecasting. The emphasis
of structured analytic techniques is not to replace intuition but to create support for the
analyst’s intuition, experience, and subjective judgment; analytic tools increase
objectivity. As shown through the literature, intuition is especially effective for highly
experienced professionals, and analytic tools help analysts attain a higher level of critical
thinking and improve analysis on complex issues. This is necessary in intelligence,
especially when information can be deceptive or datasets can be incomplete. In sum,
structured analytic techniques can be taught and intuition cannot. Though both are
valuable, the use of structured analytic techniques can increase objectivity, especially for
the novice who may have a less developed intuition than an experienced analyst.
Collaboration
Collaboration may be the first step toward reducing confirmation bias. As
Johnston stated in Analytic Culture in the US Intelligence Community: An Ethnographic
Study, analysts often use literature searches as the initial step for assessing an intelligence
problem. However, collaboration could generate multiple hypotheses that a literature
search would miss. Though collaboration could alleviate issues with confirmation bias,
problems with group dynamics could hinder multiple hypotheses generation.
In The Wisdom of Crowds, James Surowiecki proposes the use of groups not only
to generate more ideas, but to increase the quality of the decisions made. Surowiecki
affirms that groups make far better judgments than individuals, even experts. Surowiecki
states that experts are important, but their scope is narrow and limited; there is no
evidence to support that someone can be an expert at something broad like decision
making or policy (2009). Instead, groups of cognitively diverse individuals make better
forecasts than the most skilled decision maker (Surowiecki, 2009).
Surowiecki furthers the argument that cognitively diverse groups are important by
stating that a diverse group of people with varying levels of knowledge and insight are
more capable of making major decisions than “one or two people, no matter how smart
those people are” (Surowiecki, 2004, p. 31). When making decisions, it is best to have a
cognitively diverse group (Surowiecki, 2009). This is because diversity adds perspective
and also eliminates or weakens destructive group decision-making characteristics such
as overly influential members. In other words, conscientious selection helps to keep
dominant personalities from taking over the group.
Surowiecki cites James Shanteau, one of the leading thinkers on the nature of
expertise, to back up his claim. Shanteau asserts that many studies have found individual
expert judgment to be inconsistent with other experts in their field of study (Surowiecki,
2009). Also, they are prone to what Meyers would call the “overconfidence bias”;
experts, like anyone else whose judgments are not calibrated, often overestimate the
likelihood that their decisions are correct. In other words, being an expert does not
necessarily mean accurate decision-making. Experts should be integrated into a group to
make them the most effective they can be.
Surowiecki uses scientific research as an example to show the effectiveness of
collaboration. He states that because scientists collaborate and openly share their data
with others, the scientific community’s knowledge continues to grow and solve complex
problems (Surowiecki, 2009). Collaboration not only improves research, it also fully
utilizes experts’ abilities. Individual judgment is not as accurate or consistent as a
cognitively diverse group. Therefore, diverse groups are needed for sound decision
making.
Wesley Shrum, in his work “Collaborationism,” discusses the motivations and
purpose of collaboration. Shrum states that collaboration should not be generalized
because it occurs in many forms across a wide range of disciplines. For example, some
disciplines require collaboration while others can easily opt not to use it. Shrum asks
what motivates individuals and groups to collaborate when their field does not
necessarily require it. The most common motivation is
resources such as technology and funding. By collaborating, individuals are more likely
to have access to resources they need. Others are motivated by bettering their discipline
through strategic efforts. In other words, if a less established discipline collaborates with
a well established discipline, the less established discipline will gain legitimacy (Shrum,
n.d.). A third motivation is to gain information from other disciplines to solve complex
problems. Shrum (n.d.) states that cross discipline collaboration is increasing.
A major issue that Shrum sees in current collaboration is that it is often
technology-based. Collaboration is designed to “produce knowledge later rather than
now” (Shrum, p. 19). The collaboration is not being used to solve problems or produce
results in the present moment, but to create things such as databases to be used at a later
point in time. Shrum states that the knowledge produced later may not even involve the
same individuals who originally collaborated to create it. This is a problem because the
farther all disciplines move away from the “interactivity of collaboration…the farther we
move from the essential phenomenon that the idea of collaboration centrally entails:
people working together with common objectives” (Shrum, p. 19). Shrum takes a realistic
view of collaboration: individuals use it to get what they need as individuals and can
abandon the process at any point they feel it is no longer useful. Collaboration is being
used for individual instant gratification rather than strategic pursuits. By analyzing
collaboration in its current state, Shrum is able to identify
the benefits and issues surrounding modern collaboration.
In his study, “Processing Complexity in Networks: A Study of Informal
Collaboration and its Effect on Organizational Success” Ramiro Berardo seeks to identify
how individual organizations “active in fragmented policy arenas” are able to achieve
their goals through collaborations and what motivates collaboration (Berardo, 2009, p.
521). The basis of the study is the resource exchange premise that individuals or
organizations rarely have enough resources to pursue their goals; therefore, they must
exchange resources. The more resources the individual or organization is able to acquire,
the more likely they will be able to achieve their goals. Berardo states that it is through
the expansion of connections that the individuals or organizations will be most
successful. What matters is not only the number of connections but, more importantly,
the way the collaborators are connected to others in the network.
Berardo studied a multi-organizational project that was addressing water-related
problems in southwest Florida. Berardo used data collected from the project’s
applicants, data that determined whether or not an applicant received funding.
The data also contained detailed information about the nature of the work each applicant did.
Using this information, Berardo then contacted the 92 applicants who worked on the
project through a semi-structured phone survey. The information provided through the
survey gave Berardo the names of other organizations that participated in the project as
“providers of resources that could be used to strengthen the application” (Berardo, 2009,
p. 527). The data was then entered into a matrix containing information about the
organizations and their common relationships, which revealed the pattern of
organizational participation in the project (Berardo, 2009).
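Berardo's matrix can be pictured as a simple co-participation structure. The sketch below is a hypothetical illustration, not Berardo's actual data or code: the organization names, survey responses, and the degree calculation are all invented for the example.

```python
# Hypothetical survey data, invented for illustration: each applicant
# names the organizations it drew resources from to strengthen its bid.
survey_responses = {
    "OrgA": ["OrgB", "OrgC"],
    "OrgB": ["OrgA"],
    "OrgC": ["OrgA", "OrgB"],
}

# Build a symmetric co-participation matrix: 1 if either organization
# named the other as a partner.
orgs = sorted(survey_responses)
matrix = [[0] * len(orgs) for _ in orgs]
for i, a in enumerate(orgs):
    for j, b in enumerate(orgs):
        if i != j and (b in survey_responses[a] or a in survey_responses[b]):
            matrix[i][j] = 1

# An organization's degree (number of distinct partners) is a rough
# proxy for the resources available to it under the exchange premise.
degree = {org: sum(matrix[i]) for i, org in enumerate(orgs)}
print(degree)  # {'OrgA': 2, 'OrgB': 2, 'OrgC': 2}
```

In a matrix like this, the pattern of 1s shows not only how many partners each organization has but also how the partners are connected to one another, which is the property Berardo emphasizes.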
The main organization controlled 50 percent of the budget and other organizations
became a part of the project through an application process, hoping to obtain funding
from the main organization. The applicants had diverse backgrounds ranging from
financial to legal to technical expertise. Therefore, both the funder and the applicants
benefited because the funder received knowledge from the expertise of the applicants and
the applicants received funding (Berardo, 2009). Berardo explains that all involved in
this process, from the main organization to the experts, were part of an informal
collaboration and that these types of collaborations are becoming more common (2009).
The results of the study showed that a larger number of partners increases the
likelihood that a project will be funded, and that the organizations that are most active
and take a leadership role are the most likely to be funded. This study confirmed the
resource exchange theory: the more partners, the more resources available to improve
quality. Berardo also found that the leading organization is most successful when its
partners collaborate with each other on other projects. However, once the number of
partners in the collaboration exceeds seven, the likelihood of the project obtaining
funding declines, because this creates a level of unmanageable complexity for the main
funding organization (Berardo, 2009).
Berardo states, “there is a limit to the benefits of engaging in collaborative activities with
more and more partners, and that limit is given by the increasing complexity that an
organization is likely to face when the partners provide large amounts of nonredundant
resources” (Berardo, 2009, p. 535).
A weakness of the study was that it examined only one collaborative effort.
Future studies would therefore need to confirm the results by looking at other types of
collaborations, ranging in size and areas of expertise. When considering this study in
terms of collaboration within the intelligence community, factors may differ from the
results of this study. For example, agencies may not be collaborating to mutually benefit
because of funding. Therefore, incentives to collaborate may be different within the IC
than through nonprofit organizations or for-profit companies. The question is, then, what
is the incentive to collaborate within the IC when the resource may not be as
straightforward as funding? In other words, what would be the incentive for an expert
working in the for-profit sector to collaborate with an analyst? This may be why the
individual analyst lacks the motivation to collaborate and mandates for collaboration are
necessary in the field.
In their study, “A Structural Analysis of Collaboration between European
Research Institutes,” researchers Bart Thijs and Wolfgang Glänzel investigate the
influence the research profile has on an institute’s collaborative trends. Thijs and Glänzel
note that there is extensive research on the collaborative patterns of nations, institutes,
and individuals with most of them finding a positive correlation between collaboration
and scientific productivity (2010). The researchers aimed to provide a more micro-level
look at collaborative behavior, focusing less on nations or institutions as a whole and
instead examining the research institute and its international and domestic collaborations. The
researchers classified a research institute by its area of expertise in order to establish what
other types of research institutes it collaborated with and why. The researchers then
looked to find the group that, according to its research profile, was the most preferred
partner for collaboration.
Thijs and Glänzel used data from the Web of Science database of Thomson Reuters
and limited their scope to include only articles, letters, or reviews indexed between 2003
and 2005. The documents were classified into a subject category system, dividing them
into eight different groups: Biology (BIO), Agriculture (AGR), a group of institutions
with a multidisciplinary focus (MDS), Earth and space sciences (GSS), Technical and
natural sciences (TNS), Chemistry (CHE), General medicine (GRM) and, Specialized
medicine (SPM) (Thijs & Glänzel, 2010).
The researchers found that institutions from the multidisciplinary group are the most
likely to be chosen as partners. Also, groups that are closely related are more likely to
collaborate with each other than with other groups: for example, biology with
agriculture; technical and natural sciences with chemistry; and general medicine with
specialized medicine (Thijs & Glänzel, 2010).
Aside from showing what the collaboration strengths are within the sciences, this
study shows that collaborations are usually the strongest within a certain field or focus.
Also, instead of seeking the expertise of a more specialized field for collaboration,
the multidisciplinary group partners with others like itself.
Blaskovich looked into group dynamics in her research, “Exploring the Effect of
Distance: An Experimental Investigation of Virtual Collaboration, Social Loafing, and
Group Decisions.” Global businesses use technology to virtually collaborate with a
dispersed workforce. Past studies have shown that virtual groups produce improved
brainstorming and more thorough analysis (Blaskovich, 2008). While virtual
collaboration (VC) has potential benefits, it may also be counterproductive, resulting in
social loafing: “the tendency for individuals to reduce their effort toward a group task,
resulting in sub-optimal outcomes” (Latane, et al., 1979, as cited by Blaskovich, 2008).
Social loafing has long been considered a contributor to poor group performance, and it
is a critical problem intensified by VC.
In her study, participants were grouped into teams and given a hypothetical
situation. They were to be management accountants responsible for the company’s
resources for information technology investments. The groups were asked to give one of
two recommendations: “(1) expend resources to invest in the internal development of a
new technology system (insource) or (2) use the resources to contract with a third-party
outsourcing company (outsource)” (Blaskovich, 2008, p. 33). The groups were given a
data set with mixed evidence as their source of information.
A total of 279 undergraduate and graduate students were placed randomly into
groups of three. The control groups worked face-to-face in a conference room and the
experiment group worked from individual computers in separate rooms; the VC group
used text-chat as their form of communication. To measure the communication of the
groups, Blaskovich had the groups continually update their recommendations as new
pieces of evidence were introduced. The recommendation pattern of the face-to-face
groups moved toward the outsourcing option regardless of the evidence order. However,
the VC groups were dependent on the order of the evidence. Therefore, Blaskovich
concluded that group recommendations were influenced by the mode and order of the
evidence introduced. The groups made their decisions and submitted them through the
designated group recorder. Then, the face-to-face group members were moved to separate
computers, the VC members logged off of their chat sessions, and all participants
completed a questionnaire about the experiment (Blaskovich, 2008).
Social loafing was recorded as being present according to time spent on the task,
the participants’ ability to recall information about the task, and self-reported evidence
about their personal effort (Blaskovich, 2008). The face-to-face groups spent an average
of 20.6 minutes on the task, while the VC groups spent an average of 22.0 minutes
(Blaskovich, 2008). The mean recall-accuracy score was 8.28 items for the face-to-face
groups and 7.93 for the VC groups. In the self-reports, the VC groups perceived their
effort and level of participation to be lower than the face-to-face groups did
(Blaskovich, 2008).
Blaskovich concludes that VC causes group performance to decline and that
social loafing may be a contributing factor to this. Also, the VC group decisions may be
of poorer quality than that of face-to-face groups because their judgments were
influenced by the order of evidence instead of the quality of evidence (Blaskovich, 2008).
Blaskovich’s research shows that virtual collaboration should be used cautiously when a
virtual group is making a decision or recommendation. Collaboration has been shown to
be beneficial for brainstorming, especially when a diverse group of experts contributes.
However, Blaskovich’s study raises the question of exactly what the advantages and
disadvantages of VC are. Applied to the IC, the study also raises the question of what
level of collaboration is appropriate virtually: whether VC is effective for brainstorming,
and whether decisions or recommendations made by VC participants should be
considered reliable.
The Office of the Director of National Intelligence (ODNI) created the report
“United States Intelligence Community: Information Sharing Strategy” which discusses
the increased need for information sharing, especially after September 11, 2001. The
“need to know” culture that was formed during the Cold War now impedes the
Intelligence Community’s ability to respond properly to terrorist threats (ODNI, 2008).
Therefore, the IC needs to move towards a “need to share” mindset: a more collaborative
approach to properly uncover the threats it now faces (ODNI, 2008). The report stresses
that “information sharing is a behavior and not a technology” (ODNI, 2008, p. 3).
Information sharing has to take place within the community; it has to happen through
effective communication and not just through the availability of new technology.
The Office of the Director of National Intelligence supports the transformation of
the IC culture to emphasize information sharing. However, it recognizes the difficulty
that would come with overhauling the entire culture and mindset of the IC. The new
environment that the ODNI proposes would contain the same information, but it would
make the information available to all authorized agencies that would benefit from the
collaborative analysis (ODNI, 2008). ODNI’s vision and model stress a “responsibility
to provide” that would promote greater collaboration within the IC and with its stakeholders.
Ultimately, the report is stating that the IC has a responsibility to improve communication
and collaboration to effectively manage new threats.
Douglas Hart and Steven Simon’s “Thinking Straight and Talking Straight:
Problems of Intelligence Analysis” discusses the need for structured arguments and
dialogues in intelligence. Hart and Simon note that the 9/11 Commission Report, the 9/11
Joint Congressional Inquiry Report, and other reports have cited a lack of collaboration
as one of the causes of recent intelligence failures (Hart & Simon, 2006).
2006). Hart and Simon propose that dialogues encourage analysts from different
backgrounds to develop common definitions and understandings to decrease potential
misunderstandings. Communication also encourages the exchange of different viewpoints
to reduce confirmation bias. Through the use of communication and brainstorming,
conversations evolve into critical thinking sessions for both the individual and the group:
Critical thinking can be enabled by collaboration, especially when it involves
compiling, evaluating, and combining multi-disciplinary perspectives on complex
problems. Effective collaboration, however, is possible only when analysts can
generate and evaluate alternative and competing positions, views, hypotheses and
ideas (Hart & Simon, 2006, p. 51).
The authors view collaboration as necessary and effective; however, the authors
state that documents like the National Intelligence Estimates (NIE) seem to discourage
collaboration between individuals and agencies: “Enforced consensus relegating
alternative assessments to footnotes…has been a disincentive to collaboration…in
addition, collaboration and sharing generally require extra work that competes with time
spent on individual assignments” (Hart & Simon, 2006, p. 51). Less time is being spent
on collaboration because individual assignments are the priority.
Researchers Jessica G. Turnley and Laura A. McNamara address collaboration
issues in “An Ethnographic Study of Culture and Collaborative Technology in the
Intelligence Community.” The goal of the study was to research improvements in
intelligence analysis that could be implemented through methods that effectively merged
sources and analysis through multi-agency teams. The researchers conducted their
ethnographic study at two intelligence agencies located at three different sites to address
the question: “What does collaboration mean in the analytic environment, and what is the
role of technology in supporting collaboration” (Turnley & McNamara, p. 2).
The research was conducted through interviews of analysts and through group
and daily work routine observations. The researchers visited three sites. Two sites were
within the same agency which the researchers called Intelligence Agency One (IA-1).
This agency focused on strategic intelligence. The third site was an agency that
developed software tools for tactical intelligence. The researchers called this site
Intelligence Agency Two (IA-2). One researcher spent five and one-half weeks observing
and interviewing analysts at IA-1, collecting data through 30 interviews and 40 hours of
observation. At IA-2, the other researcher spent 20 hours becoming familiar with the site
and organization and 40 hours interviewing and observing operations.
At the sites the researchers studied, the word “collaboration” was intrinsically tied to
information, hierarchy, and power in the IC. Therefore, an analyst’s collaboration was
effective only if it did not have a negative impact on the individual’s investments within
the organization. The structure of IA-1 was noted to be
hierarchical with each analyst given a specific area of responsibility and subject focus.
The researcher noted collaboration issues at IA-1 because of this hierarchical structure.
Participants’ responses about issues with collaboration were placed into these five
categories: introversion, a feeling of ownership over subject matter, privilege of
individual effort over group effort for rewards, organizational knowledge, and over-
classification of information.
IA-2, the site responsible for information management technology used to produce
tactical intelligence, also had issues with collaboration, but for a different reason. The
issues at this site stemmed from multiple companies working together while having
different agendas for their participation in the contract. An even bigger issue was defining
ownership of the technology used for collaboration. At this site, an analyst could call up
an inquiry and have multiple resources from multiple sensors displayed on a single
platform. This raised the organizational question of who owns, controls, and manages
the data. Due to power struggles or fear of diminished confidentiality of sources, certain
sensor owners could refuse to give up necessary information, and the collaboration could
be stalled or stopped.
This research was effective at showing how collaboration may already be used,
but that the organization’s culture greatly affects its use. The limitation of this study
was the small number of facilities the researchers were able to visit. Because they visited
only three sites, two of which operated under the same agency, they had a less
comprehensive view of the IC and its use of collaboration.
In “Small Group Processes for Intelligence Analysis,” Heuer discusses the role of
collaboration in the production of quality intelligence products and the elements needed
for successful collaboration. Heuer states that intelligence analysis increasingly requires
a group effort rather than an individual effort (Heuer, 2008). This is because intelligence
products need input from multiple agencies and from subject matter experts outside the
analysts’ field. Collaboration is also encouraged within agencies that have multiple locations
and can work online together to save time and travel costs.
However, there are issues within groups that can be counterproductive.
Individuals can be late to the group’s sessions or may be unprepared. The groups may be
dominated by certain types of individuals, which prevents others from speaking up and
inhibits the full generation of ideas. Also, the position that an individual holds in the
agency can affect the group’s performance. For example, top level professionals are often
less likely to express dissent for fear of retribution or even embarrassment (Heuer, 2008).
Group dynamics play an important role in the effectiveness of collaboration.
To avoid these issues or to mitigate them, Heuer suggests the use of small, diverse
groups of analysts that openly share ideas and an increased use of structured analytic
techniques (2008). Using analysts from multiple agencies will broaden perspectives
“leading to more rigorous analysis” (Heuer, 2008, p. 16). Structured analytic techniques
can give structure to individual thoughts and the interaction between analysts. By using
structured techniques, analysts are providing group members with a written example of
their thought process; this can then be compared and critiqued by the other members
(Heuer, 2008). Heuer states that each step of the structured analytic technique process
induces more divergent and novel discussion than just collaboration alone (2008).
Analysts should not only use tools but should also collaborate with other analysts
or subject matter experts to make sure personal, individual cognitive bias is not affecting
the product and to generate multiple hypotheses. When describing the need for
collaboration in the form of subject matter experts, Heuer stated that expertise is needed
because the methodology itself does not solve the problem. The combination of expertise
and methodology “is always needed” because it is the methodology that guides the
expertise (R. Heuer, personal communication, June 2010). Collaboration also allows
those with diverse backgrounds from various fields to apply their expertise to the
intelligence problem. In other words, more brainstorming allows more hypotheses to be
identified than a literature search alone could provide. With collaboration,
communication and dialogue evolve into critical thinking for both the individual and the
group.
The use of collaboration and structured analytic techniques has gained the attention
of the IC as it considers solutions for minimizing the frequency of intelligence failures.
It is also necessary for individual analysts and organizations to be aware of the presence
of cognitive bias and to take safeguards against its negative effects on intelligence
analysis. However, little evidence exists on the effectiveness of either structured analytic
techniques or intuition; both need to be empirically tested to validate the methods and
their utility for managing complex situations within the IC.
Complexity Manager
According to Richards Heuer, the origin of his idea for Complexity Manager goes
back “over 30 years ago” when a future forecasting technique called Cross-Impact
Analysis was tested at the CIA (R. Heuer, personal communication, September 2010).
Heuer recalls taking a group of analysts through the development of a cross-impact
matrix, used in Complexity Manager, and was inspired by the technique’s effectiveness
as “a learning experience for all the analysts to develop a group understanding of each of
the relationships” (R. Heuer, personal communication, September, 2010).
Taking this experience, along with a broad understanding of how increasingly
complex the world has become, Heuer looked to create a technique that dealt with this
new level of complexity while still allowing ease of use to the analyst. Heuer states that
research organizations often deal with complexity by developing complex models that are
expensive and take a lot of time (R. Heuer, personal communication, September 2010).
However, much of the benefit from such modeling comes in the early stages when
identifying the variables, rating their level of significance, and understanding the
interactions between each. As Heuer states: “that [variable identification and interaction]
is easy to do and can be sufficient enough to generate new insights, and that is what I
tried to achieve with Complexity Manager” (R. Heuer, personal communication,
September, 2010). By using Complexity Manager, the analyst is breaking down the
complex system into its smallest component parts before moving forward to analyze the
entire system. By doing so, the analyst can understand potential outcomes and
unintended side effects of a potential course of action (R. Heuer, personal
communication, June 2010).
Complexity Manager is a structured analytic technique that also makes use of
collaboration to brainstorm multiple hypotheses for a complex issue. Therefore, if proven
effective, Complexity Manager would help to further decrease an analyst’s contributions
to intelligence failures: the influence of cognitive bias would be limited both through the
collaboration involved and through the process of using this structured technique.
Complexity Manager combines the advantages of both structured analytic
techniques and collaboration through small teams of subject matter experts. Complexity
Manager “is a simplified approach to understanding complex systems—the kind of
systems in which many variables are related to each other and may be changing over
time” (Heuer & Pherson, 2010, p. 269). Complexity Manager, as a decision support tool,
helps to organize all options and relevant variables in one matrix. It also provides an
analyst with a framework for understanding and forecasting decisions that a leader,
group, or country is likely to make as well as their goals and preferences. Complexity
Manager is most useful at helping the analyst to identify the variables that are most
significantly influencing a decision. As Heuer states, Complexity Manager “enables
analysts to find a best possible answer by organizing in a systematic manner the jumble
of information about many relevant variables” (Heuer & Pherson, 2010, p. 273).
Complexity Manager is an eight-step process. The following are Richards Heuer’s
directions for use of the structured analytic technique:

1. Define the problem
2. Identify and list relevant variables
3. Create a Cross-Impact Matrix
4. Assess the interaction between each pair of variables
5. Analyze direct impacts
6. Analyze loops and indirect impacts
7. Draw conclusions
8. Conduct an opportunity analysis (Heuer & Pherson, 2010, pp. 273-277).
For a more detailed description of each of the eight steps, consult Heuer and Pherson’s
Structured Analytic Techniques for Intelligence Analysis. Below, Figure 2.1 shows the
Cross-Impact Matrix that is used for recording the nature of the relationships between all
the variables (Heuer & Pherson, 2010, p. 273). Heuer recognizes that the Cross-Impact
Matrix includes the same initial steps that are required to build a computer model or
simulation (Heuer & Pherson, 2010, p. 272). Therefore, when an analyst does not have
the time or budget to build a social network analysis or use the Systems Dynamics
approach, they can gain the same benefits using Complexity Manager through:
“identification of the relevant variables or actions, analysis of all the interactions between
them, and assignment of rough weights or other values to each variables or interaction”
(Heuer & Pherson, 2010, p. 272).
Figure 2.1. The Cross-Impact Matrix is used to assess interactions between variables.
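The bookkeeping behind steps 2 through 5 (listing variables, filling in the matrix, and ranking variables by direct impact) can be sketched briefly. This is a minimal illustration under assumed inputs, not Heuer's specification: the variable names, the signed impact weights, and the scoring rule in `direct_impact` are all invented for the example.

```python
# Hypothetical variables and signed impact ratings, invented for
# illustration: impact[a][b] is the assumed strength of variable a's
# effect on variable b (positive = reinforcing, negative = inhibiting).
variables = ["funding", "staffing", "tech_adoption"]
impact = {
    "funding":       {"staffing": +2, "tech_adoption": +1},
    "staffing":      {"funding": 0, "tech_adoption": +2},
    "tech_adoption": {"funding": +1, "staffing": -1},
}

def direct_impact(var):
    """Sum of the impacts a variable exerts plus those it receives (step 5)."""
    exerted = sum(abs(w) for w in impact[var].values())
    received = sum(abs(row.get(var, 0)) for row in impact.values())
    return exerted + received

# Rank variables by total direct impact to flag the most significant ones.
ranking = sorted(variables, key=direct_impact, reverse=True)
print(ranking)  # ['staffing', 'tech_adoption', 'funding']
```

Filling in and discussing such a matrix cell by cell is exactly the group exercise Heuer describes as generating most of the insight, even without a full computer model.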
In theory, Complexity Manager is able to mitigate cognitive bias through the use
of both small group collaboration and a structured analytic technique. However, there is no
research proving the effectiveness of this technique, nor is there literature on it being
used in the field. When used in the appropriate context, Complexity Manager may be an
effective tool for reducing the risks of intelligence failures caused by cognitive bias.
However, unless it is tested or used in the field by analysts, this will not be known.
Therefore, it is necessary to test the effectiveness of Complexity Manager
through teams of analysts.
Hypothesis
Assessing Complexity Manager as an intelligence analysis tool, I have developed
four testable hypotheses. My first hypothesis is that the groups using Complexity
Manager will have a higher level of confidence in their forecast than those using intuition
alone. My second hypothesis is that analysts using Complexity Manager will produce
higher quality variables than those using intuition alone. My third hypothesis is that the
groups using Complexity Manager will identify more variables than those that used
intuition alone. My fourth hypothesis is that those using Complexity Manager will
produce more accurate forecasts than those that used intuition alone.
CHAPTER 3: METHODOLOGY
The purpose of this study is to assess the forecasting accuracy and effectiveness
of Complexity Manager, a structured technique. To establish the validity of structured
techniques, and of all tools and techniques used by professionals in the Intelligence
Community, it is necessary to evaluate their effectiveness through multiple experiments.
This study is one of many evaluations needed for that purpose.
The following research questions were addressed in this study:
1. Do analysts that use Complexity Manager have a higher level of confidence than
those that use intuition alone?

2. Do analysts that use Complexity Manager assess higher quality variables before
delivering their forecast than those that use intuition alone?

3. Do analysts that use Complexity Manager assess a higher number of variables
before delivering their forecast than those that use intuition alone?

4. Do analysts that use Complexity Manager produce a more accurate forecast than
those that use intuition alone?
The study was designed to compare a structured analytic technique, Complexity
Manager, to intuition alone when forecasting. If Complexity Manager is effective,
advantages of its use would be shown through the data collected. Data was collected by
standardized questionnaires and forms created by the researcher. Pre- and post-
intervention data were collected and analyzed through statistical and descriptive analysis
of results between the control and experiment groups.
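A between-group comparison of this kind can be illustrated with a standard two-sample test. The sketch below uses made-up scores and is not the thesis's actual analysis code; Welch's t statistic is shown because it handles groups with unequal variances.

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    n1, n2 = len(sample_a), len(sample_b)
    m1, m2 = sum(sample_a) / n1, sum(sample_b) / n2
    v1 = sum((x - m1) ** 2 for x in sample_a) / (n1 - 1)  # sample variance
    v2 = sum((x - m2) ** 2 for x in sample_b) / (n2 - 1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

# Made-up variable counts for a control group and an experiment group.
control = [4, 5, 6, 5, 4]
experiment = [7, 8, 6, 9, 7]
t = welch_t(experiment, control)
print(round(t, 2))  # 4.11
```

A large positive t here would indicate that the experiment group identified more variables on average than the control group; the statistic would then be compared against the appropriate t distribution for significance.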
Setting
The research was conducted at Mercyhurst College in Erie, Pennsylvania. The
researcher recruited Intelligence Studies students through classroom visits across the
campus and the intervention was completed at the computer labs at the Intelligence
Studies building. Classroom visits were conducted two weeks before the intervention to
allow for the students to plan for the intervention and to maximize the number of sign-
ups for the researcher. The intervention was conducted in the computer labs at the
Intelligence Studies building because the department supports such endeavors and the
researcher could reserve the two computer labs exclusively for the purpose of the study.
This controlled environment allowed each student to utilize a computer for pre-
intervention collection. Both computer labs were equipped with a projector so the
researcher could present a tutorial to the experiment group on how to use Complexity Manager.
Participants
To ensure that this study was conducted in an ethical manner, the researcher
submitted the study to Mercyhurst College’s Institutional Review Board and obtained
permission before starting the study. A copy of the consent form and all related
documents can be found in the appendix of this thesis.
The participants were selected through purposive sampling to meet the needs and
criteria of the study. The participants were restricted to undergraduate and graduate level
Intelligence Studies students only because of their understanding of the Intelligence
process and the need for analysts to evaluate Complexity Manager. Freshman Intelligence
Studies students were able to participate, even though they had very limited experience in
the field, because the researcher created groups in which each freshman was paired with an
undergraduate upperclassman or second year graduate student. This maximized the
sampling size for the study and varied the level of expertise for each group.

Figure 3.1 shows the distribution of students according to their academic class.
There were 56 females and 106 males for a total of 162 participants. Freshman through
second year graduate students participated in the study: 43 freshmen; 37 sophomores; 28
juniors; 18 seniors; 26 first year graduate students; and 10 second year graduate students.

Figure 3.1. Number of participants per academic class.
When completing initial sign-ups for the experiment, the researcher requested that the
participants disclose information about their education. For the undergraduates, because
they were all intelligence majors, the researcher requested that the participants list any
minors they may have. For the graduate students, the researcher requested that these
participants list their undergraduate major and minor if applicable. This was done to show
the range of expertise of the participants. Not all undergraduate intelligence students had
minors; 28 of the 126 had a declared minor. Undergraduate intelligence students that
participated in the study had the following minors: Russian Studies, Business
Intelligence, Business Administration, History, Criminal Justice, Criminal Psychology,
Philosophy, Psychology, Political Science, Spanish, Computer Systems, and Asian
Studies.
The 36 graduate students that participated in the study disclosed the following
undergraduate majors: Intelligence, Social and Political Thought, History, English,
Political Science, Spanish, International Affairs, Russian Studies, Psychology,
Telecommunication and Business, International Business, Security and Intelligence,
Biochemistry, Mathematics, French, Forensics, Sociology, Social Work, and Criminal
Justice. Seventeen of the 36 graduate students had the following minors at the undergraduate
level: Science and Technology, International Affairs, Life Sciences, Political Science,
Mandarin, Spanish, Russian Studies, French, Middle Eastern Studies, Asian Pacific
Studies, Economics, Public Policy, and East Asian Studies.
Intervention and Materials
The independent variable for this intervention was the experiment group’s use of
Complexity Manager; the dependent variables were the accuracy of the groups’ forecasts
and the number and quality of the variables that the groups produced.
The researcher first consulted Richards Heuer and Randy Pherson’s book
Structured Analytic Techniques for Intelligence Analysis because it contained the step-by-
step procedure for using Complexity Manager. Then the researcher created forms based
on the procedure and from further instruction from email correspondence with Mr.
Richards Heuer. The methodology form was created to replicate the step-by-step
procedure while allowing for participants’ maximum understanding of the methodology
in a short period of time; each step of the procedure and instructions guiding the
participant were put on separate pages (See Appendix F). Each form created was used to
collect data directed towards the research questions and all forms were approved by the
Institutional Review Board at Mercyhurst College.
Measurement Instruments
The researcher collected data through a post-intervention questionnaire, through assessment of the groups' forecasting accuracy, and through the number and quality of the variables documented by the groups.
Questionnaire Answers
The questionnaire for the control group contained nine questions. Questions 1-4 asked for the amount of time and the number of variables that the individual contributed compared to the amount of time and the number of variables that the group produced; these quantitative amounts could be compared across individuals and groups. Questions 5-8 asked the participant to rank, on a scale of one to five, their knowledge of the intelligence issue, the clarity of the instructions, the availability of open source information, and the helpfulness of working in teams for the assigned task. The final question asked for general comments about the experiment.
The questionnaire for the experiment group consisted of thirteen questions. Questions 1-9 were identical to the control group's questions. Questions 10-13 were specific to the use of Complexity Manager: the usefulness of Complexity Manager for assessing significant variables, understanding of Complexity Manager before the experiment, understanding of Complexity Manager after the experiment, and whether the participant would use Complexity Manager for future tasks. All questions provided space for the participant to comment further. (See Appendices H and I for both questionnaires.)
Forecasting Accuracy
All participants were tasked with forecasting whether the vote for the Sudan Referendum, set for January 9, 2011, would occur as scheduled or would be delayed. The use of an actual event provided a definite outcome against which to compare the groups' forecasts. (See Appendix G for the forecasting worksheet.)
Number of Variables
Along with forecasting whether the Sudan Referendum would occur on the set date, the participants were also tasked with identifying the variables that were most influential in deciding the course of the Sudan Referendum. The researcher compared the number of variables that the control group produced with the number the experiment group produced to assess whether Complexity Manager increased the number of variables considered.
Quality of Variables
The quality of the variables recorded by the control group was qualitatively compared to that of the experiment group. The researcher assessed quality by visually comparing the thoroughness and comprehensiveness of the control and experiment groups' variables.
Data Collection and Procedures
Pre-Intervention
From October 11, 2010, to October 19, 2010, the researcher visited eleven Intelligence Studies classes. Recruitment occurred at the beginning of each class period. The researcher handed out sign-up sheets requesting general information: name, email address, undergraduate minor if applicable, and a ranking of preferred days to participate in the study, November 1, 2010, to November 4, 2010. The form also requested graduate students to include their undergraduate major and minor, if applicable. Because many of the second-year graduate students did not have a class during the week of recruitment, emails requesting participation in the experiment were sent only to second-year graduate students. By October 19, 2010, 239 students had volunteered to participate. All undergraduate and first-year graduate professors offered extra credit to those students who participated in the study.
The researcher then entered all the sign-up data into a spreadsheet and organized the participants into one of the four dates, November 1 through 4, 2010, with nearly all the participants receiving their first-ranked choice. Once the participants were organized into days, the researcher organized them into groups of three members each, with at least one freshman assigned to each group. On October 25, 2010, the researcher emailed the participants to let them know their assigned date and time. From October 25, 2010, to October 31, 2010, participants who were unable to attend emailed the researcher; at this point, 17 participants withdrew from the study. From November 1 to November 4, 2010, the researcher emailed the participants on the morning of their assigned date to remind them that the experiment would occur that evening.
Intervention
At the beginning of the intervention, participants were asked to sit with their assigned group. At the front of the room was a list of all the participants organized into groups of three and four. Each group had a number, and the participants were to sit at the computers with corresponding numbers. All documentation the participants would need was placed at the computers before the intervention began. After all participants were seated, they signed a consent form, and the researcher then gave instructions for the intervention; both documents are located in Appendices C and D. After all forms that would be used for the intervention were explained, the researcher addressed the participants who had group members missing. Those who did not have a full group of three were asked to come to the front of the room so they could be moved into another group. This instruction and reconfiguration of groups took 10 minutes.
For the control group, the next step was to begin collection. The groups were
given a list of possible sources they could use to begin their collection process and the
groups independently divided the workload. Please see Appendix E for this document.
After an hour of collection, the groups reconvened to brainstorm possible variables and to
give their forecast on a Forecasting Answer Sheet the researcher created. Please see
Appendix G for this document.
Before the experiment group began collection, the researcher gave a brief PowerPoint tutorial on how to use Complexity Manager. The researcher then described a packet created for the groups to work through the methodology step by step. The participants were also given the directions as written by Richards Heuer and Randy Pherson in their book, Structured Analytic Techniques for Intelligence Analysis. This tutorial and explanation took 10 minutes. The groups were then given an hour for collection and the same list of possible sources as the control group. After an hour of collection, the groups reconvened to brainstorm possible variables using Complexity Manager. The experiment group participants then gave their forecast on the Forecasting Answer Sheet the researcher created. Please see Appendix F for the methodology packet. All participants were given two and one-half hours to complete the experiment.
Post-Intervention
The post-intervention period included completing the questionnaire described in the Measurement Instruments section. The students were also given a debriefing statement describing the purpose of the experiment. Please see Appendix J for the debriefing statement. On November 4, 2010, the researcher emailed the names of all the students who participated in the study to the professors who offered extra credit.
Data Analysis
Descriptive and inferential statistics were used to analyze the survey responses, group analytic confidence, group source reliability, and the number of variables both the control and experiment groups considered. The data were subdivided for analysis purposes, and Statistical Package for the Social Sciences (SPSS) software was used to identify the mean and standard deviation for the control and experiment groups. An independent samples t test was used to compare the mean scores and to identify any significant differences between the control and experiment data sets' mean scores. The survey questions compared between the control and experiment groups were: individual amount of time spent working in the study; group amount of time spent working in the study; previous knowledge of the Sudan Referendum before beginning the study; clarity of instructions; availability of open source materials; and how helpful it was to work in teams. The variable counts compared between the control and experiment groups were: economic, social, political, geographic, military, and technology, as well as the total number of variables for each group. The quality of the variables was analyzed descriptively and assessed for content.
CHAPTER 4: RESULTS
The results will be presented in the order referenced in the Methods section of this study: survey responses, group analytic confidence, group source reliability, the number of variables the control and experiment groups considered, and the forecasting accuracy of both groups. This will be followed by the descriptive analysis of the quality of variables. Please see Appendix K for complete SPSS data.
Survey Responses
Surveys were distributed to each individual after their group completed and
returned their forecasting answer sheet to the researcher.
Time
Surveys asked each individual to state the approximate amount of time they spent working individually and the amount of time they spent working with their group. Eighty control group members and 65 experiment group members answered the survey question regarding the amount of time they each spent working individually. The control group's individual time ranged from 20 minutes to 110 minutes; the experiment group's ranged from 15 minutes to 120 minutes. Using SPSS software, the results showed that there was no significant difference between the control and experiment groups for the individual amount of time spent working, t (142) = -.797, p (.455) > (α = 0.05).
Eighty control group members and 67 experiment group members answered the question regarding the amount of time they spent working as a group. The control group's time working together ranged from 5 to 90 minutes; the experiment group's ranged from 25 to 150 minutes. Using SPSS software, the results showed that there was a significant difference between the control and experiment groups for the amount of time spent working together, t (145) = -7.71, p (0.000) < (α = 0.05). The experiment group had a greater mean (M = 74.1045 minutes, SD = 30.30058) than the control group (M = 40.4375 minutes, SD = 20.71802); the experiment group spent more time working in their groups than the control group did.
Knowledge of Sudan Referendum
To gauge the understanding of the subject matter used for the intervention, the
researcher asked the students to state their knowledge of the Sudan Referendum prior to
beginning the study. The students were given a scale ranging from 1 to 5 with 1
indicating that the individual had little knowledge of the Sudan Referendum and 5
indicating that the individual had great knowledge about the Sudan Referendum.
Eighty-three control group members and 67 experiment group members answered the survey question regarding their prior knowledge of the Sudan Referendum. Using SPSS software, the results showed that there was no significant difference between the control and experiment groups' knowledge of the Sudan Referendum, t (115.143) = -1.699, p (0.092) > (α = 0.05). The experiment group had a slightly greater mean (M = 1.7463, SD = 1.17219) than the control group (M = 1.4578, SD = .83083).
Clarity of Instructions
To gauge the participants' perception of the clarity of the researcher's instructions, the researcher asked the students to rate it on a scale of 1 to 5, with 1 indicating little clarity and 5 indicating that the directions were entirely clear.
Eighty-three control group members and 67 experiment group members answered the survey question regarding the clarity of instruction. Using SPSS software, the results showed that there was a significant difference between the control and experiment groups' perception of the clarity of the instructions provided by the researcher, t (140.859) = 6.098, p (0.000) < (α = 0.05). The control group had a greater mean (M = 4.3614, SD = .77426) than the experiment group (M = 3.5821, SD = .78140); the control group perceived the instructions to be clearer. This is likely due to the differences in the directions between the two groups. The control group had more straightforward instructions: collaborate with the team to come up with a forecast. The experiment group's task was more ambiguous, with the added instruction of learning and using Complexity Manager. Though the process of the experiment was explained to both groups, the experiment group may have perceived the instructions to be less clear because of the added complexity of learning and applying a structured analytic technique.
Open Source Availability
To gauge the participants' perception of the availability of open source information regarding the Sudan Referendum, the researcher asked the students to rate it on a scale of 1 to 5, with 1 indicating little availability and 5 indicating an abundance of open source information regarding the Sudan Referendum.
Eighty-three control group members and 67 experiment group members answered the survey question regarding the availability of open source materials. Using SPSS software, the results showed that there was no significant difference between the control and experiment groups' perception of the availability of open source materials regarding the Sudan Referendum, t (138.980) = -0.914, p (0.362) > (α = 0.05). The experiment group had a slightly greater mean (M = 4.2985, SD = .79801) than the control group (M = 4.1807, SD = .76739).
Team Helpfulness
Individuals were placed into groups of 3 or 4 to complete a team forecast. To gauge the participants' perception of how helpful it was to work in teams for the study, the researcher asked the students to rate it on a scale of 1 to 5, with 1 indicating that working in teams was not helpful and 5 indicating that it was very helpful to work in teams.
Eighty-three control group members and 67 experiment group members answered the survey question regarding the helpfulness of teamwork. Using SPSS software, the results showed that there was no significant difference between the control and experiment groups' perception of the helpfulness of working in teams for this study, t (115.648) = 1.175, p (0.242) > (α = 0.05). The control group had a slightly greater mean (M = 4.4819, SD = .70471) than the experiment group (M = 4.3134, SD = .98794). Both the control group and the experiment group found that working in a team was helpful.
Initially, it would seem that those who use a structured analytic technique would value teamwork more because it consciously facilitates collaboration. However, the participants' academic major may have overshadowed this and played a larger role in their perception of team helpfulness. The Intelligence Studies major at Mercyhurst College values and draws heavily on the use of groups to facilitate learning and collaboration. Therefore, all students likely came into the experiment with the mindset that teamwork adds value and validity to the forecast. Another factor in the shared perception of team helpfulness for both the control and experiment groups is the nature of the task. The amount of learning that had to be done would have been very difficult for one person to complete in a two and one-half hour timeframe. Therefore, a team would likely be a welcome solution to the workload regardless of whether a structured analytic technique was used. A third factor was the varied level of individual experience with analysis: 53% of the participants in the control group were freshmen or sophomores, and 39% of participants in the experiment group were freshmen or sophomores. Collectively, freshmen and sophomores accounted for 46% of the total participants. Therefore, it is likely that many of the freshmen and sophomores valued working on a team with more experienced upperclassmen.
Group Analytic Confidence
On the group forecasting answer sheet, the researcher requested that the groups
give their analytic confidence for their forecast regarding the Sudan Referendum. The
participants were to gauge their confidence, with "High" being the most confident and "Low" being the least confident.
Twenty-four control groups and 23 experiment groups gauged their analytic confidence. Normality assumptions were not satisfied because the sample size was small (fewer than 30 per group), so the Mann-Whitney test was used. The results showed that there was no difference between the control and experiment groups' analytic confidence, p (0.458) > (α = 0.05). The implications of this finding will be explored in more detail in the Conclusions chapter.
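The Mann-Whitney comparison described above can be sketched outside SPSS. The ratings below are hypothetical placeholders coded 1 (Low) through 4 (High), not the study's data, and SciPy is an assumed substitute for the SPSS procedure.

```python
# Hedged sketch: a Mann-Whitney U test on ordinal analytic-confidence
# ratings, as used when the normality assumption did not hold.
# Ratings are hypothetical, coded 1 = Low ... 4 = High.
from scipy import stats

control_conf = [2, 2, 3, 1, 3, 2, 4, 2, 3]
experiment_conf = [2, 3, 2, 1, 2, 3, 2, 4, 3]

u_stat, p_value = stats.mannwhitneyu(control_conf, experiment_conf,
                                     alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.3f}")
# A p value above alpha = 0.05 would mirror the chapter's finding
# of no difference between the groups.
```

The Mann-Whitney test compares rank distributions rather than means, which is why it is the appropriate fallback here for small samples of ordinal ratings.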
Group Source Reliability
On the group forecasting answer sheet, the researcher requested that the groups
give their source reliability for their forecast regarding the Sudan Referendum. The
participants were to gauge their confidence in the sources used for forecasting, with "High" being the most confident and "Low" being the least confident.

Twenty-four control groups and 23 experiment groups gauged their source reliability. Normality assumptions were not satisfied because the sample size was small (fewer than 30 per group), so the Mann-Whitney test was used. The results showed that there was no difference between the control and experiment groups' source reliability, p (0.914) > (α = 0.05). Figure 4.1 shows that the majority of the control and experiment groups had medium reliability in sources. No group indicated that they had low source reliability.

Figure 4.1. Source reliability per group.
Variables
Individuals were placed into groups of three or four (24 control groups and 23 experiment groups) and were asked to give a team forecast that included a list of the variables considered in reaching their group forecast. The researcher created categories of variables for the groups' consideration: economic, social, political, geographic, military, and technology. The researcher examined the variables through both statistics and descriptive analysis, recognizing that it is not only the quantity but also the quality of the variables that makes forecasts accurate. Implications of all the variable findings will be explored in more detail in the Conclusions chapter.
Economic Variables
Using SPSS software, the results showed that there was a difference between the control and experiment groups' number of economic variables considered, t (34.545) = 4.476, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.9583, SD = 1.26763) than the experiment group (M = 1.6522, SD = .64728).
Social Variables
Normality was not satisfied for the experiment group, so the Mann-Whitney test for independent samples was used. The results showed that there was a difference between the control and experiment groups' number of social variables considered, p (0.000) < (α = 0.05). The experiment group produced 34% fewer social variables than the control group.
Political Variables
Using SPSS software, the results showed that there was a difference between the control and experiment groups' number of political variables considered, t (36.222) = 4.608, p (0.000) < (α = 0.05). The control group had a greater mean (M = 3.3333, SD = 1.43456) than the experiment group (M = 1.7826, SD = .79524).
Geographic Variables
Normality was not satisfied for the experiment group, so the Mann-Whitney test for independent samples was used. The results showed that there was a difference between the control and experiment groups' number of geographic variables considered, p (0.000) < (α = 0.05). The experiment group produced 48% fewer geographic variables than the control group.
Military Variables
Using SPSS software, the results showed that there was a difference between the control and experiment groups' number of military variables considered, t (42.888) = 7.178, p (0.000) < (α = 0.05). The control group had a greater mean (M = 2.8750, SD = .89988) than the experiment group (M = 1.2174, SD = .67126).
Technology Variables
Using SPSS software, the results showed that there was a difference between the control and experiment groups' number of technology variables considered, t (31.176) = 3.519, p (0.001) < (α = 0.05). The control group had a greater mean (M = 2.3750, SD = 1.43898) than the experiment group (M = 1.1739, SD = .83406).
Total Variables
Using SPSS software, the results showed that there was a difference between the control and experiment groups' number of total variables considered, t (39.195) = 8.295, p (0.000) < (α = 0.05). The control group had a significantly greater mean (M = 17.1250, SD = 4.08936) than the experiment group (M = 8.8696, SD = 2.59903). Again, implications regarding all variables considered can be found in the Conclusions chapter.
Forecasting Accuracy
Twenty-four control groups and 23 experiment groups forecasted whether the vote for the Sudan Referendum would occur on January 9, 2011, or would be delayed. On January 9, 2011, the voting process did begin as scheduled (Ross, 2011). Three of the 24 control groups and 6 of the 23 experiment groups accurately forecasted the event. Using SPSS software, it was determined that there was no statistical difference between the control and experiment groups' ability to accurately forecast, p (0.2367) > (α = 0.05). Although assumptions of normality were not satisfied due to the small sample size,
the raw data does show that twice as many experiment groups accurately forecasted the event (see Figure 4.2).

Figure 4.2. Forecasting per group.
Nineteen of the 24 control groups and 16 of the 23 experiment groups inaccurately forecasted the event. Using SPSS software, it was determined that there was no difference between the control and experiment groups' inaccurate forecasts, p (0.4505) > (α = 0.05). Three groups' forecasts were not included in the statistical testing: one control group and one experiment group did not give a forecast, and one control group forecasted that the chances were even.
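The chapter reports the accuracy comparison only as a p value, without naming the SPSS procedure used. As a hedged sketch, not the thesis's actual method, a Fisher's exact test (a standard choice for small 2x2 counts) can be run on the accurate/inaccurate tallies reported above, with the three excluded groups omitted; SciPy stands in for SPSS here.

```python
# Hedged sketch: the chapter does not name the exact SPSS test used for
# forecasting accuracy; Fisher's exact test is one standard option for
# small 2x2 tables. Counts are taken from the chapter (groups that gave
# no forecast or an "even" forecast are excluded).
from scipy import stats

#        accurate, inaccurate
table = [[3, 19],   # control groups
         [6, 16]]   # experiment groups

odds_ratio, p_value = stats.fisher_exact(table)
print(f"odds ratio = {odds_ratio:.3f}, p = {p_value:.4f}")
# A p value above alpha = 0.05 would agree with the chapter's conclusion
# of no statistically significant difference in accuracy.
```

An odds ratio below 1 here reflects the raw pattern noted in the chapter: the control groups were accurate less often than the experiment groups, even though the difference is not statistically significant at these sample sizes.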
Quality of Variables
Because quantity may not reflect the quality of information, the researcher also descriptively analyzed the written variables completed by the 24 control groups and 23 experiment groups. Quality, according to Merriam-Webster's Collegiate Dictionary, is defined as "a degree of excellence, superiority in kind" (n.d.). In Structured Analytic Techniques for Intelligence Analysis, Heuer and Pherson discuss a three-step approach to the evaluation of structured analytic techniques. In this evaluation, they note that quality of analysis is not restricted to accuracy alone. Heuer and Pherson suggest that quality of analysis is measured by "clarity of presentation, transparency in how the conclusion was
reached, and construction of an audit trail for subsequent review” (2010, p. 317).
Considering the definition of quality and Heuer and Pherson’s measure of quality of
analysis, the researcher defined quality variables as “variables that are superior as shown
through clarity of presentation and transparency in how conclusions were reached.”
The researcher did not include “construction of an audit trail” because both the
control and the experiment group were asked to write out the variables considered.
Therefore, this instruction required both groups to leave an audit trail of their variables
and significant findings. Using the above definition of quality, the researcher found that
quality variables were presented in two ways: completeness of the description and
specificity.
Completeness of the Description
Both the control and the experiment groups consistently cited similar variables
for consideration when forecasting. Both groups consistently spoke of border disputes,
issues involving oil rights, and ethnic tensions. However, the teams in the control group
routinely used full sentences while only one team in the experiment group used full
sentences. Though this does not increase the validity of the data, it does show the
completeness of the team’s thought; it showed clarity of presentation. The teams that
used complete sentences were also able to show cause and effect. Therefore, the ability to
show cause and effect allowed for transparency in how the analysts arrived at their
conclusions.
For example, one team in the control group writes, “Southern Sudan’s economy is
mostly comprised from oil revenue that it receives from the North. However, because the
north has ceased paying its share of oil revenues in foreign currency, turmoil internally
will likely result.” One team in the experiment group writes on the same topic, “Share of
oil revenues.” Both groups are stating that oil revenues are a variable to consider when
forecasting whether the Sudan Referendum will occur on the date scheduled, but the
team in the control group conveyed why oil revenues would affect the possibility of
delay.
Specificity
The completeness of the description allowed for more specific variables to be
considered. The teams in the control group more frequently cited specific pieces of
evidence for consideration while the experiment group used broader concepts for
consideration. For example, one team in the control group writes, “The Northern
government has control over the TV and radio signals, and only allows broadcasts that
are in line with their policies.” One team in the experiment group writes, “Low tech
capabilities affect other key variables.” The team in the control group is citing specifics;
the Northern government has control over certain technologies in the country. The team
in the experiment group is stating that a low level of technical capability in Sudan affects
other variables. In this control group example, as in the greater part of the control group, the team is not only stating a specific piece of evidence but also showing the split between north and south Sudan, the reason for the Sudan Referendum. In the experiment group example, the team is stating that there is a connection between a low level of technology and how it affects the other variables; however, this broad generalization does not allow for an understanding of the urgency of the technological issues in Sudan and how they could affect the possible delay of the referendum.
Two factors could have influenced the result of variable specificity. The first is a possible lack of experience with the method and the topic. By using broad topics such as “political pressure” and “security issues,” the students were decreasing the complexity
of the task by equating the variables to things they were more familiar with. “Political
pressure” and “security issues” are more common to the students than the specifics of the
Sudan Referendum. Therefore, by broadening the variables, the students could more
easily use the cross impact matrix; assessing how politics affected security, rather than
how one specific instance in Sudan affected the other. The other factor could be a lack of
clarity. While the control group may have cited specific instances for their variables, the
experiment group may have placed those specific instances into broader categories,
which they identified as variables. Therefore, clarity may be lacking: is a variable a specific example (“The Northern government has control over the TV and radio signals, and only allows broadcasts that are in line with their policies”) or a broadened understanding of specific instances (“Low tech capabilities affect other key variables”)?
A definition of “variable” in the context of Complexity Manager may help to clarify what
exactly is needed for the structured technique to be most effective.
CHAPTER 5: CONCLUSION
Throughout the IC's history, intelligence failures have driven reform in organizational structure, information sharing, and the use of structured analytic techniques. However, if the use of structured analytic techniques is the solution to decreasing the possibility or severity of future intelligence failures, then structured analytic techniques should be tested to ensure each is a valid means of reducing such risks. Though encouraged, testing of the techniques is limited. The intent of this study was to test one structured analytic technique and to offer directions for further research. The purpose of this study was to assess the validity of the structured method Complexity Manager. To do so, the researcher designed an experiment comparing the use of Complexity Manager against intuition alone. The research was conducted at Mercyhurst College's Intelligence Studies Program with participants from every academic class level, freshman to second-year graduate students.
Discussion
Do analysts that use Complexity Manager have a higher level of confidence than those that use intuition alone?
Analytic confidence is based on the use of a structured analytic technique, source
reliability, source corroboration, level of expertise on the subject, amount of
collaboration, task complexity, and time pressure (Peterson, 2008). The results of the
survey questions asking groups to rate their level of analytic confidence show that those that used Complexity Manager did not have a higher level of confidence than those that used intuition alone. Furthermore, looking at the results from the other survey questions that identify components of analytic confidence confirms that the experiment group did
not experience a higher level of confidence than the control group. The control group and the experiment group had no difference for their source reliability; level of expertise on subject matter; and amount of collaboration, shown through “team helpfulness.” Assessing task complexity and time pressure, both the control group and experiment group were given the same task with the same amount of time. However, the experiment group (M = 74.1045 minutes, SD = 30.30058) spent a greater amount of time working than the control group (M = 40.4375 minutes, SD = 20.71802). The time that the experiment groups spent working together may have been related to the teams' need to learn and then use Complexity Manager.

Figure 5.1. Analytic confidence per group.
This finding suggests that using a structured analytic technique may not increase
analytic confidence, but may better calibrate the analyst. The analysts could have lacked confidence not only in their ability to use the structured analytic technique but also in the analysis that it helped to produce. Task complexity was high and there
was a time constraint of 2.5 hours; however, the majority of both the control and
experiment groups had a medium analytic confidence. Nine groups gave low analytic
confidence; three control and six experiment. Overall, the experiment group had a lower
confidence level, but had a greater forecasting accuracy. Therefore, this suggests that
there may not be a connection between analytic confidence and the use of a structured
analytic technique. Analytic confidence may have no bearing on forecasting accuracy. In
other words, having a high analytic confidence may not suggest that the analyst is more
likely to forecast accurately. In summary, this finding suggests that using a structured
analytic technique could assist analysts in assessing their own analytic confidence, but
does not improve that confidence. Further studies yielding a larger number of group or
individual analytic confidence ratings would be needed to confirm this statistically.
Do analysts that use Complexity Manager have higher-quality variables assessed
before delivering their forecast than those that use intuition alone?
After a descriptive analysis comparing the control group to the experiment group,
the researcher concluded that those that used Complexity Manager did not have higher-
quality variables than those that used intuition alone. The teams that used Complexity
Manager often used short, broad generalizations, while those that used intuition alone
wrote complete sentences that identified specific points of conflict between north and
south Sudan. These complete sentences allowed the control group to fully explain the
cause and effect of each variable. The experiment group spent a greater amount of time
working together than the control group, yet the quality of their variables was lower than
that of the control group.
Two factors may have influenced the style of reporting. The experiment group
was given an answer sheet for their forecast along with a methodology packet: a step-by-
step guide to using Complexity Manager on whose pages they could complete the steps
directly. The control group, however, was only given an answer sheet. The first factor
the researcher considered was redundancy: having to write the variables twice may have
influenced the experiment group to write only short, broad generalizations on their
answer sheet. However, the methodology packets reveal the same statements. This leads
to a second factor that may have influenced the style of writing. For Complexity
Manager, the experiment group was tasked with completing a cross-impact matrix, and
to do so, the variables first had to be listed in the left-hand column. The teams may have
judged the lines too short for whole sentences and therefore transferred only those short
statements onto the answer sheet. Though this may account for the length of the
sentences, it does not account for why the experiment group often used broader concepts,
such as referencing oil refineries, while the control group used more specific statements,
such as stating where the refineries were and why they were a source of conflict.
Do analysts that use Complexity Manager have a higher number of variables
assessed before delivering their forecast than those that use intuition alone?
As shown through the intervention results, analysts that used Complexity
Manager did not have a higher number of variables assessed before delivering their
forecast. The control group had a greater number of variables assessed in every category.
Also, the control group had a greater mean total number of variables (M = 17.1250, SD =
4.08936) than the experiment group (M = 8.8696, SD = 2.59903).
The number of variables assessed was not associated with more accurate forecasts;
quantity had no bearing on quality. Accumulating more pieces of evidence could easily
bias the analyst into thinking that the more evidence found, the more likely it would be
that the Sudan Referendum would not occur on the scheduled date. Though the
experiment group had fewer variables assessed, they produced a greater number of
accurate forecasts. This suggests that the experiment group weighed the significance of
each variable rather than totaling the pieces of evidence confirming the likelihood of one
event over another. Therefore, using a structured analytic technique may have helped
decrease analyst bias when forecasting.
Do analysts that use Complexity Manager produce a more accurate forecast than
those that use intuition alone?
There was no statistically significant difference between the control and experiment
groups in producing accurate forecasts. This may have been due to the small sample size
of both groups. The p-value of 0.2367 means that, if there were truly no difference
between the groups, results at least this extreme would be expected roughly 24% of the
time, so the result falls short of conventional significance. The raw data nonetheless
trend in one direction: 6 out of 23 experiment groups produced accurate forecasts, while
only 3 out of 24 control groups did. This trend is consistent with, though it does not
confirm, the possibility that Complexity Manager assisted the experiment group in
producing more accurate forecasts.
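The thesis does not name the statistical test used. A Pearson chi-square test of independence without continuity correction on the reported counts (6 accurate of 23 experiment groups versus 3 accurate of 24 control groups) reproduces a p-value near 0.2367, so that test is assumed in this sketch:

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (df = 1, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    stat = 0.0
    for obs, r, k in ((a, 0, 0), (b, 0, 1), (c, 1, 0), (d, 1, 1)):
        expected = rows[r] * cols[k] / n
        stat += (obs - expected) ** 2 / expected
    # For one degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))

# Accurate vs. inaccurate forecasts: experiment 6 of 23, control 3 of 24.
stat, p = chi2_2x2(6, 17, 3, 21)
print(round(stat, 3), round(p, 4))  # approximately 1.4 and 0.2367
```

That the computed p-value matches the reported 0.2367 supports, but does not prove, the assumption that an uncorrected chi-square test was used.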
Analytic Confidence findings showed that 3 control groups and 6 experiment
groups gave a low confidence rating. Forecasting Accuracy findings also showed that 3
control groups and 6 experiment groups produced accurate forecasts. However, only one
experiment group that forecasted accurately gave a low confidence rating; 8 of the 9
accurate forecasts gave a medium analytic confidence rating.
Two further conclusions can be drawn from the forecasting accuracy of the
experiment group. The first conclusion is that forecasting accuracy may be connected to
collaboration, a required process within the Complexity Manager technique. The
experiment group spent more time working collaboratively than the control group and
also produced more accurate forecasts (time in minutes: experiment group mean =
74.1045, control group mean = 40.4375). The second conclusion is that there appears
to be no connection between the number of variables assessed and forecasting accuracy;
the control group recorded a greater number of variables than the experiment group but
did not forecast more accurately.
Shannon Ferrucci recorded similar findings in her 2009 Master’s thesis, “Explicit
Conceptual Models: Synthesizing Divergent and Convergent Thinking.” When assessing
the size of conceptual models that participants produced in her study, Ferrucci found that
though the experimental group’s conceptual models were larger than the control group’s
models, the control group forecasted better than the experiment group (Ferrucci, 2009).
Ferrucci suggested that the large number of concepts that the experiment group created in
their models created confusion and decreased their ability to understand the most relevant
information for completing their forecast (2009). As in Ferrucci’s experiment, the large
number of variables that the control group created may have overwhelmed the analysts
and made it more difficult to select the most relevant variables that would affect the
possible delay for the Sudan Referendum.
Another factor that may account for the control groups’ low forecasting accuracy
could be a connection between the number of variables assessed and cognitive bias.
Robert Katter, Christine Montgomery, and John Thompson found in their 1979 study,
“Cognitive Processes in Intelligence Analysis: A Descriptive Model and Review of
Literature,” that intelligence is conceptually driven rather than data driven. This
understanding is important because it shows how the analyst arrives at their conclusions.
An analyst’s forecast is not pure data. Instead, it is a process driven by how the analyst
interprets that data after moving through their cognitive model (Katter et al., 1979). The
purpose of the cognitive model is to account for “inputs” of the analyst, with input
meaning stimuli from the external world or what is in their internal memory (Katter et al.,
1979). The model has three parts that are summarized below:
1. The individual’s initial processing of outside information is automatically
conducted in less than a second. Then the new information is automatically
compared with information already stored in the memory. When even just a
gross match is found, the new information that matches existing memory
patterns is stored.
2. New information that may not fit into the existing memory patterns can be
automatically ignored or viewed as irrelevant or uninteresting.
3. The central cognitive function consists of a continuous “Compare/Construct”
cycle that modifies the memory-storage. Three types of information
modification are: sensory information filtering, memory information
consolidation, and memory access interference.
In this study of Complexity Manager, the control group had nothing in place to force
them to be more cognizant of their cognitive models as they recorded all of their
variables for forecasting. Therefore, without an external regulator such as a structured
analytic technique, the control groups’ forecasts were more negatively affected by their
cognitive models, in the form of cognitive bias.
Limitations
One limitation of the study was the sample size: although 162 individuals participated,
the number of forecasts was limited because the individuals were grouped into teams of
three and four, so instead of 162 forecasts, only 47 were given. Time constraints were
also a limitation. The study was conducted during a two and one half hour time period.
This did not allow for full development or understanding of the issue. The researcher
intentionally chose this time period to maximize the number of participants and decrease
the number of drop-outs. Also, the participants were students with time constraints due to
other classes and obligations. A third limitation was the participants’ level of expertise:
they were students with limited knowledge of the field and very
limited knowledge of both Complexity Manager and the intelligence topic, the Sudan
Referendum.
Three other limitations of this study relate directly to the implementation of the
intervention. The first limitation was the amount of time that may be appropriate for
learning not only Complexity Manager, but any structured analytic technique. One of the
main considerations when restricting the intervention to 2.5 hours was maximizing the
sample size. The participants, being students, had many other obligations. If the
researcher had asked participants to commit to a longer intervention, the sample size may
have decreased significantly. However, the time restriction may not have allowed for
proper understanding and absorption of Complexity Manager. The second limitation,
beyond the time restrictions on learning the structured analytic technique, was deciding
when to provide the Complexity Manager tutorial. The researcher gave the tutorial
directly after giving
the tasking for the analysis. This was done to allow the groups to work at their own pace
and complete their analysis earlier if they desired to. However, this turned into a
limitation because it overloaded the participants with information. Collection may have
suffered because the students were more concerned with understanding the technique.
The third limitation is the timing of the intervention. Mercyhurst College operates on a
trimester system, with each term lasting ten weeks. The researcher gave the intervention
during the eighth week of the term. Though students may have attended to earn extra
credit knowing the end of the term was near, this may also have negatively affected the
intervention. Participants dropped out of the intervention because they had other
obligations such as team meetings and projects. For those that attended the intervention,
they may have worked more quickly through it than if it had been held earlier in the term
when they had less pressing obligations to manage. If the technique had been presented
separately and more thoroughly, and if the intervention had taken place earlier in the
term, the results may have more accurately reflected the purpose of the study.
Recommendations for Future Research
Based on the results of this study, there are several recommendations for future
research. Limitations concerning time constraints for training the participants in
Complexity Manager could be mitigated or eliminated if the training were held separately
from the intervention. Not only could understanding increase, but this would allow the participants
to analyze more complex issues that have multiple outcomes, a major function of
Complexity Manager. Along with taking more time to train the participants, having
participants that are trained in a particular area of expertise could also improve the
intervention.
Therefore, the second recommendation is to either include professionals that have
had more experience in the field or to choose a topic within an area of expertise that
would be more familiar to all participants. The participants in this intervention had to
familiarize themselves with a topic that was largely unknown to most of them while also
learning a new structured analytic technique. Having participants who are subject matter
experts or who have a higher level of expertise could help them utilize Complexity
Manager more fully, exploring the potential outcomes and unintended side effects of a
potential course of action. In other words, involving subject
matter experts and working through issues with multiple outcome possibilities are two
major components of Complexity Manager that could be tested in further studies.
The researcher of this intervention focused on variables, analyst confidence, and
the forecasting accuracy of Complexity Manager. The third recommendation is to
compare Complexity Manager to another well-tested structured technique. The researcher
compared intuition to the use of Complexity Manager. Doing so showed that those that
used an intuitive process produced a greater number of specified variables compared to
those that used Complexity Manager. However, those that used Complexity Manager
worked significantly longer in their groups. Having both the control and experiment
group use a structured analytic method could eliminate the dramatic difference in
working time between the two groups and isolate the question: “Is Complexity Manager
more effective than other techniques at assisting analysts in brainstorming variables that
impact a complex issue?”
The fourth recommendation is to obtain a higher number of participants or
forecasts for the study. Because collaboration was necessary for Complexity Manager,
the researcher organized the participants into groups of three to four, which significantly
reduced the number of forecasts. Increased participation, or a method that allows for
individual forecasting, would minimize this limitation. Further studies
using these recommendations could more fully assess the validity of Complexity
Manager.
Conclusions
Complexity Manager originated as a way for analysts to develop a group
understanding of each of the relationships within a complex system. Heuer created
Complexity Manager to help analysts generate new insights through variable
identification and interaction in order to understand the potential outcomes and
unintended side effects of a potential course of action. Heuer states that Complexity
Manager is useful for identifying the variables that are most significantly influencing the
decision at hand and enables the analyst to find the best possible answer to an intelligence
question by organizing information into the structured technique.
The dynamics of the variable interactions were not noted in this study because
this experiment focused on how effective Complexity Manager is at variable
identification and its correlation to forecasting accuracy, or, the best possible answer to
an intelligence question. The results show that those that used Complexity Manager
identified fewer and less specific variables, yet produced a higher number of accurate
forecasts than those that did not use Complexity Manager. This suggests, though it does
not statistically confirm, that Complexity Manager is effective at identifying variables
that lead to more accurate forecasts than intuition alone.
Using Complexity Manager did not assist participants in identifying more
variables than those that did so intuitively. This may suggest that those that used intuition
actually applied a divergent process, brainstorming, before forecasting, or that
Complexity Manager is not effective at helping analysts draw out a higher volume of, or
higher quality, variables. In either case, the larger number of variables recorded was not
associated with greater forecasting accuracy.
The use of teams greatly reduced the number of forecasts, which in turn reduced
the sample size for extracting statistically significant results. However, the results of this
study suggest that Complexity Manager increases forecasting accuracy. Collaboration is
a necessary part of Complexity Manager; the teams must brainstorm variables and work
through the matrix together. The increased time spent collaborating and following the
steps of the structured analytic technique may have increased the number of accurate
forecasts in the experiment group.
This experiment showed that Complexity Manager’s strengths include promoting
effective collaboration, possibly improving analytic confidence calibration, and aiding
forecasting accuracy. One area for improvement is a stronger definition of what a
variable is in the context of Complexity Manager: is it a specific event that could be a
catalyst for other events? Is it broader? Is a single variable composed of multiple
significant events that are categorized under one general umbrella? Or is it a combination
of both? Considering the entire Intelligence Community, how should analysts balance
single significant, or seemingly insignificant, events against more general trends?
Final Thoughts
The testing of the effectiveness of one analytic technique, at this point in time,
seems to be secondary to gathering empirical evidence regarding the collective benefits
and abilities that structured analytic techniques offer. Instead of testing one technique at
a time, this researcher recommends testing two techniques against each other or two
techniques against intuition alone. This would increase the number of techniques tested
and would keep intuition in place as the control. More importantly,
comparing techniques against each other could help to show emerging patterns through
the strengths and weaknesses that may overlap within all structured analytic techniques.
This would improve all structured analytic techniques.
Though this is only one study assessing the effectiveness and forecasting accuracy
of one structured analytic technique, it did produce quantitative results suggesting that
structured techniques may decrease bias and increase forecasting accuracy. One by one,
experiments and results such as this add to the evidence for each structured analytic
technique and benefit the Intelligence Community as a whole.
REFERENCES
Berardo, R. (2009). Processing complexity in networks: a study of informal collaboration and its effects on organizational success. Policy Studies Journal, 37(3), 521-539. Retrieved September 24, 2010, from Academic Search Complete. doi: 10.1111/j.1541-0072.2009.00326.x
Betts, R. (1978). Analysis, war, and decision: why intelligence failures are inevitable. World Politics, 31(1), 69-89. Retrieved June 20, 2010, from http://www.jstor.org/stable/2009967.
Blaskovich, J.L. (2008). Exploring the effect of distance: an experimental investigation of virtual collaboration, social loafing, and group decisions. Journal of Information Systems, 22(1), 27-46. Retrieved September 3, 2010, from Academic Search Complete.
Brasfield, A.D. (2009) Forecasting accuracy and cognitive bias in the analysis of competing hypotheses (Unpublished master’s thesis). Mercyhurst College, Erie, PA.
Cheikes, B. A., Brown, M. J., Lehner, P.E., & Adelman, L. (October 2004). Confirmation bias in complex analyses. 1-16. Retrieved June 13, 2010, from http://www.mitre.org/work/tech_papers/tech_papers_04/04_0985/04_0985.pdf.
Davis, J. (1999). Improving intelligence analysis at CIA: Dick Heuer’s contribution to intelligence analysis. In Heuer, R., Jr. (1999). Psychology of Intelligence Analysis, Center for the Study of Intelligence: Central Intelligence Agency, xiii-xxv. Retrieved May 31, 2010, from https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/PsychofIntelNew.pdf.
Diaz, G. (January 2005). Methodological approaches to the concept of intelligence failure. UNISCI Discussion Papers, Number 7, 1-16. Retrieved July 31, 2010, from http://revistas.ucm.es/cps/16962206/articulos/UNIS0505130003A.PDF.
Edison's Lightbulb at The Franklin Institute. (2011). Retrieved April 1, 2011, from http://www.fi.edu/learn/sci-tech/edison-lightbulb/edison-lightbulb.php?cts=electricity.
Folker, R. D., Jr. (2000). Intelligence Analysis in Theater Joint Intelligence Centers: An Experiment in Applying Structured Methods. Occasional Paper Number Seven, 1-45. Retrieved June 13, 2010, from http://www.fas.org/irp/eprint/folker.pdf.
Ferrucci, S. (2009). Explicit conceptual models: synthesizing divergent and convergent thinking (Unpublished master’s thesis). Mercyhurst College, Erie, PA.
George, R. Z. (2004). Fixing the problem of analytical mind-sets: alternative analysis. International Journal of Intelligence and Counterintelligence, 17(3), 385-404. doi: 10.1080/08850600490446727.
Goodman, M. (2003). 9/11: The failure of strategic intelligence. Intelligence and National Security, 18(2), 59-71. doi: 10.1080/02684520310001688871.
Grimmet, R.F. (2004). Terrorism: key recommendations of the 9/11 commission and recent major commissions and inquiries. (Congress Research Service). Washington, DC. Retrieved September 3, 2010, from http://www.au.af.mil/au/awc/awcgate/crs/rl32519.pdf.
Hart, D. & Simon, S. (2006). Thinking straight and talking straight: problems of intelligence analysis. Survival, 48(1), 35-59. doi: 10.1080/00396330600594231.
Hedley, J. (2005). Learning from intelligence failures. International Journal of Intelligence and Counterintelligence, 18(3), 436. doi: 10.1080/08850600590945416.
Heuer, R., Jr. (2009). The evolution of structured analytic techniques. Presentation to the National Academy of Science, National Research Council Committee on Behavioral and Social Science Research to Improve Intelligence Analysis for National Security. Washington, D.C.
Heuer, R., Jr. (1999). Psychology of Intelligence Analysis. Center for the Study of Intelligence: Central Intelligence Agency, 1-183. Retrieved May 31, 2010, from https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/PsychofIntelNew.pdf.
Heuer, R. J., Jr. (2008). Small group processes for intelligence analysis, 1-38. Retrieved September 16, 2010 from http://www.pherson.org/Library/H11.pdf.
Heuer, R. J. Jr. & Pherson, R. (2010). Structured Analytic Techniques for Intelligence Analysis. Washington, D.C.: CQ Press.
Johnston, R. (2005). Analytic culture in the US Intelligence Community: an ethnographic study. Retrieved July 31, 2010, from http://www.au.af.mil/au/awc/awcgate/cia/analytic_culture.pdf.
Johnston, R. Integrating methodologists into teams of substantive experts. Studies in Intelligence, 47(1), 57-65. Retrieved June 13, 2010, from http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA525552&Location=U2&doc=GetTRDoc.pdf.
Katter, R., Montgomery, C., & Thompson, J. (1979). Cognitive processes in intelligence analysis: a descriptive model and review of the literature (Technical Report 445). Arlington: US Army Intelligence Security and Command.
Khatri, N. & Alvin, H. Role of intuition in strategic decision making. 1-38. Retrieved May 31, 2010, from http://www3.ntu.edu.sg/nbs/sabre/working_papers/01-97.pdf.
Lefebvre, S. J. (2003). A look at intelligence analysis. Retrieved from http://webzoom.freewebs.com/swnmia/A%20Look%20At%20Intelligence%20Analysis.pdf.
Light Bulb History - Invention of the Light Bulb. (2007). Retrieved April 1, 2011, from http://www.ideafinder.com/history/inventions/lightbulb.htm.
Marrin, S. (2004). Preventing intelligence failures by learning from the past. International Journal of Intelligence and Counterintelligence, 17(4), 655-672. Retrieved June 20, 2010, from http://dx.doi.org/10.1080/08850600490496452.
Myers, D.G. (2002). Intuition: Its Powers and Perils. London: Yale University Press.
National Commission on Terrorist Attacks. (2004). The 9/11 commission report. Washington, DC: US Government Printing Office. Retrieved October 15, 2010, from http://govinfo.library.unt.edu/911/report/911Report.pdf.
October 21, 1879: Edison Gets the Bright Light Right. This Day In Tech. Wired.com. (2009). Retrieved April 1, 2011, from http://www.wired.com/thisdayintech/2009/10/1021edison-light-bulb/.
The Office of the Director of National Intelligence (2008). United States Intelligence Community Information Sharing Strategy (ODNI Publication No. A218084). Washington, DC. Retrieved September 3, 2010, from http://www.dni.gov/reports/IC_Information_Sharing_Strategy.pdf.
Pope, S. & Josang, A. Analysis of Competing Hypothesis using subjective logic. 10th International Command and Control Research and Technology Symposium: The Future of C2 Decisionmaking and Cognitive Analysis. Retrieved May 31, 2010, from http://www.cs.umd.edu/hcil/VASTcontest06/paper126.pdf.
Quality. (n.d.) In Merriam-Webster’s collegiate dictionary. Retrieved from http://www.merriam-webster.com/dictionary/quality.
Treverton, G. & Gabbard, C.B. (2008). Assessing the tradecraft of intelligence analysis. Santa Monica: RAND National Security Research Division.
Ross, W. (2011, January 11). Southern Sudan votes on independence. BBC. Retrieved from http://www.bbc.co.uk/news/world-africa-12144675.
Robson, D.W. Cognitive rigidity: methods to overcome it. Retrieved May 31, 2010, from https://analysis.mitre.org/proceedings/Final_Papers_Files/40_Camera_Ready_Paper.pdf.
Shrum, W. Collaborationism. Retrieved September 24, 2010, from http://worldsci.net/papers.htm#Collaboration.
Surowiecki, J. (2004). The Wisdom of Crowds. New York: Anchor Books.
Thijs, B. & Glänzel, W. (2010). A structural analysis of collaboration between European research institutes. Research Evaluation, 19(1), 55-65. doi: 10.3152/095820210X492486.
Thomas Edison's biography: Edison Invents! Smithsonian Lemelson Center. (n.d.). Retrieved April 1, 2011, from http://invention.smithsonian.org/centerpieces/edison/000_story_02.asp.
Turnley, J. G. & McNamara, L. An ethnographic study of culture and collaborative technology in the intelligence community. Sandia National Laboratory, 1-21. Retrieved May 31, 2010, from http://est.sandia.gov/consequence/docs/JICRD.pdf.
Wheaton, K. J. & Beerbower, M. T. (2006). Towards a new definition of intelligence. Stanford Law & Policy Review, 17(2), 319-330. Retrieved September 13, 2010, from LexisNexis.
Appendix A: IRB Approval
Appendix B: Structured Methods Experiment Sign Up

Name:

Class Year:

E-mail Address:
Undergraduate Minor (If Applicable):
Graduate Student’s Undergraduate Major (If Applicable):
Graduate Student’s Undergraduate Minor (If Applicable):
Please select a ranked preference for the dates: (Rank Session Preference: 1=Highest, 4=Lowest)
Monday, November 1, 2010: 6 pm _____
Tuesday, November 2, 2010: 6pm _____
Wednesday, November 3, 2010: 6 pm _____
Thursday, November 4, 2010: 6 pm _____
Upon completion, please return this form to Lindy Smart.
Appendix C: Participation Consent Form
You have been invited to participate in a study about forecasting in Intelligence analysis.
Your participation in the experiment involves the following: team assignments, a team
evaluation of a designated subject, and returning the completed forms back to the
researcher of the experiment. Teams will be given one hour for collection and then will
reconvene for up to an hour and a half to put the analysis together and give a team
forecast.
Your name will only be used to notify professors of your participation in order for them
to assign extra credit. There are no foreseeable risks or discomforts associated with your
participation in this study. Participation is voluntary and you have the right to opt out of
the study at any time for any reason without penalty.
I, ____________________________, acknowledge that my involvement in this research
is voluntary and agree to submit my data for the purpose of this research.
_________________________________    __________________
Signature                                                        Date

_________________________________    __________________
Printed Name                                                  Class

Name(s) of professors offering extra credit: ____________________________________

Researcher’s Signature: ___________________________________________________

If you have any further questions about forecasting or this research, you can contact me at

Research at Mercyhurst College which involves human participants is overseen by the Institutional Review Board. Questions or problems regarding your rights as a participant should be addressed to Tim Harvey; Institutional Review Board Chair; Mercyhurst College; 501 East 38th Street; Erie, Pennsylvania 16546-0001; Telephone (814) 824.3372.

Lindy Smart, Applied Intelligence Master’s Student, Mercyhurst College
Kristan Wheaton, Research Advisor, Mercyhurst College
Appendix D: Forecasting Thesis Experiment
Instructions
You are an analyst working at the Embassy of the United States of America in Sudan.
You have been tasked with forecasting whether the vote for the Sudan Referendum
set for January 9, 2011 will occur as scheduled or if it will be delayed. You are also
to identify the variables that are most influential for deciding the course of the
Sudan Referendum. The state-level high committees responsible for organizing the
referendum expect delays but the United Nations is committed to conducting it on time.
You and your teammates will be assigned areas of expertise in the economic, political,
military, social, technological, and geographic area of Sudan. You will use open source
information to complete your task. You will be given a list of sources to use at a starting
point for collection. Teams will be given one hour for collection and then will reconvene
for up to an hour and a half to put the analysis together and give a team forecast.
Researcher Contact: Lindy Smart
Appendix E: Starting Point for Collection
Possible Sources:
http://www.usip.org/
http://www.state.gov/
http://www.bloomberg.com/
http://www.pbs.org/newshour/
http://www.alertnet.org/
http://www.reuters.com/
http://www.hrw.org/
http://news.yahoo.com/
http://www.washingtontimes.com/news/
http://allafrica.com/
http://www.bbc.co.uk/news/world/africa/
http://www.janes.com/
http://w3.nexis.com/new/ (Must have your username and password)
http://merlin.mercyhurst.edu/ (Databases available through the Mercyhurst Library)
Appendix F: Structured Methods Experiment
Methodology
1. State the problem to be analyzed, including the time period to be covered by the
analysis:
__________________________________________________________________
__________________________________________________________________
2. Brainstorming list of relevant variables:
Economic:
Political:
Social:
Technology:
Military:
Geographic:
Direction and magnitude of the impact:
+ Strong positive impact    - Strong negative impact
+ Medium positive impact    - Medium negative impact
+ Weak positive impact      - Weak negative impact

Use plus and minus signs to show whether the variable being analyzed has a positive or negative impact on the paired variable. The size of the plus or minus sign signifies the strength of the impact on a three-point scale: 3 = strong, 2 = medium, 1 = weak. If the variable being analyzed has no impact on the paired variable, the cell is left empty. If a variable might change in a way that could reverse the direction of its impact, from positive to negative or vice versa, this is shown by using both a plus and a minus sign.
3. List the variables in the Cross-Impact Matrix, putting the most important variables at the top. (The matrix is not limited to 10 variables, and it may contain fewer than 10.)
[10 × 10 cross-impact matrix: columns labeled A–J across the top and rows labeled A–J down the left side.]
Reading the Matrix: The cells in each row show the impact of the variable represented by that row on each of the variables listed across the top of the matrix. The cells in each column show the impact of each variable listed down the left side of the matrix on the variable represented by the column.
Please note: The size of the matrix shown here does not reflect the actual size of the matrix given to students, who received a matrix that fit on a single page in its entirety.
DIRECTIONS FOR COMPLETING THE CROSS-IMPACT MATRIX
4. As a team, assess the interaction between each pair of variables and enter the results into the relevant cells of the matrix. For each pair of variables, ask the question: Does this variable impact the paired variable in a manner that will increase or decrease the impact or influence of that variable?
a. When entering ratings in the matrix, it is best to take one variable at a time, first going down the column and then working across the row. Each pair of variables is evaluated twice; for example, the impact of variable A on variable B and the impact of variable B on variable A.
b. After rating each pair of variables, and before doing further analysis, consider pruning the matrix to eliminate variables that are unlikely to have a significant effect on the outcome.
c. Measure the relative significance of each variable by adding up the weighted values in each row and column. Record the totals in each row and column.
i. The sum of the weights in each row is a measure of each variable’s impact on the system as a whole.
ii. The sum of the weights in each column is a measure of how much each variable is affected by all the other variables.
iii. Those variables most impacted by the other variables should be monitored as potential indicators of the direction in which events are moving or as potential sources of unintended consequences.
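The row and column weighting described in step c can be sketched as a short computation. The following Python sketch uses a hypothetical 3-variable matrix (the variables and weights are illustrative, not taken from the thesis): it sums the signed weights across each row to score a variable's impact on the system, and down each column to score how much that variable is affected by the others.

```python
# Sketch of the cross-impact weighting described above.
# Rows = impact OF a variable on the others; columns = impact ON a variable.
# Weights follow the form's convention: +/-3 strong, +/-2 medium, +/-1 weak,
# 0 for an empty cell. The variables and weights are hypothetical examples.

variables = ["A", "B", "C"]
matrix = [
    [0,  3, -1],   # impact of A on A, B, C
    [2,  0,  1],   # impact of B on A, B, C
    [-1, 2,  0],   # impact of C on A, B, C
]

# Row totals: each variable's impact on the system as a whole.
row_totals = {v: sum(row) for v, row in zip(variables, matrix)}

# Column totals: how much each variable is affected by all the others.
col_totals = {v: sum(row[j] for row in matrix) for j, v in enumerate(variables)}

for v in variables:
    print(f"{v}: impact on system = {row_totals[v]}, "
          f"affected by system = {col_totals[v]}")
```

Variables with large column totals are the ones the instructions flag as candidate indicators to monitor.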
5. Write about the impact of each variable, starting with Variable A. (Use the following pages to write out your answers.)
a. Describe the variable further if clarification is necessary (For example, if one of the variables you identified is “Weak Government Officials” then use this space to write exactly what you meant. You may want to include names, party affiliations, and examples of why the officials are “weak”).
Variable A: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable A with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable A: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable A has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable A has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable B: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable B with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable B: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable B has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable B has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable C: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable C with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable C: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable C has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable C has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable D: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable D with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable D: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable D has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable D has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable E: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable E with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable E: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable E has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable E has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable F: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable F with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable F: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable F has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable F has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable G: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable G with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable G: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable G has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable G has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable H: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable H with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable H: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable H has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable H has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable I: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable I with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable I: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable I has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable I has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
Variable J: 1. Describe the variable further if clarification is necessary:
2. Identify all the variables that impact on Variable J with a rating of 2 or 3 (Medium or Strong Effect) and briefly explain the nature, directions, and, if appropriate, the timing of this impact:
Variables that impact Variable J: (Shown in the COLUMNS)
a. How strong is it and how certain?
b. When might these impacts be observed?
c. Will the impacts be felt only in certain conditions?
3. Identify and discuss all variables on which Variable J has an impact
with a rating of 2 or 3 (Medium or Strong Effect):
Variables on which Variable J has an impact: (Shown in the ROWS)
a. How strong is it and how certain?
b. Identify and discuss the potentially good or bad side effects of these impacts.
Good side effects:
Bad side effects:
6. Analyze loops and indirect impacts:
a. Identify any feedback loops.
b. Determine if the variables are static or dynamic.
i. Static: Static variables are expected to remain more or less unchanged during the period covered by the analysis.
ii. Dynamic: Dynamic variables are changing or have the potential to change.
c. Determine if the dynamic variables are predictable or unpredictable.
i. Predictable: Predictable change includes established trends or established policies that are in the process of being implemented.
ii. Unpredictable: Unpredictable change may be a change in leadership or an unexpected change in policy or available resources.
Feedback loops:
Static Variables:
Dynamic-Predictable:
Dynamic-Unpredictable:
7. Draw conclusions: Using the data about the individual variables assembled in Steps 5 and 6, draw conclusions about the system as a whole.
a. What is the most likely outcome, or what changes might be anticipated during the specified time period?
b. What are the driving forces behind the outcome?
c. What things could happen to cause a different outcome?
d. What desirable or undesirable side effects should be anticipated?
Appendix G: Forecasting Thesis Experiment
Answer Sheet
Names and Corresponding Professor Offering Extra Credit:
Name ____________________ Prof. Offering Extra Credit: ____________________
Name ____________________ Prof. Offering Extra Credit: ____________________
Name ____________________ Prof. Offering Extra Credit: ____________________
Name ____________________ Prof. Offering Extra Credit: ____________________
Forecast:
Variable(s) considered:
Economic:
Social:
Political:
Geographic:
Military:
Technological:
Source Reliability (circle one): LOW MEDIUM HIGH
Analytic Confidence (circle one): LOW MEDIUM HIGH
Appendix H: Follow-Up Questionnaire: Control Group
Thanks for your participation! Please take a few moments to answer the following
questions. Your feedback is greatly appreciated.
1. Individual amount of time spent on the assigned task? _____________
2. Amount of time spent working with your group? _____________
3. Number of variables that you contributed to the group? _____________
4. Total number of variables that the group considered before forecasting? _____________
5. Please rate your level of knowledge of the Sudan Referendum before the
experiment with 1 being no knowledge at all and 5 being a very thorough
understanding.
1 2 3 4 5
6. Please rate the clarity of the instructions for the task with 1 being not clear at all
and 5 being very clear.
1 2 3 4 5
7. Please rate the availability of open source information with 1 being very difficult
to find and 5 being very easily found.
1 2 3 4 5
8. Please rate how helpful it was to work in teams for this task with 1 being not
helpful at all and 5 being very helpful.
1 2 3 4 5
9. Please provide any additional comments
you may have about the experiment:
Appendix I: Follow-Up Questionnaire: Experiment Group
Thanks for your participation! Please take a few moments to answer the following
questions. Your feedback is greatly appreciated.
1. Individual amount of time spent on the assigned task? _____________
2. Amount of time spent working with your group? _____________
3. Number of variables that you contributed to the group? _____________
4. Total number of variables that the group considered before forecasting? ________
5. Please rate the clarity of the instructions for the task with 1 being not clear at all
and 5 being very clear.
1 2 3 4 5
6. Please rate the availability of open source information with 1 being very difficult
to find and 5 being very easily found.
1 2 3 4 5
7. Please rate how helpful it was to work in teams for this task with 1 being not
helpful at all and 5 being very helpful.
1 2 3 4 5
8. Please rate your level of knowledge of the Sudan Referendum before the
experiment with 1 being no knowledge at all and 5 being a very thorough
understanding.
1 2 3 4 5
9. Please rate how helpful Complexity Manager was for assessing significant
variables before forecasting the assigned tasks with 1 being not helpful at all and
5 being very helpful.
1 2 3 4 5
Comment:
10. Please rate your level of understanding of Complexity Manager before the
experiment with 1 being no understanding at all and 5 being a very thorough
understanding.
1 2 3 4 5
Comment:
11. Please rate your level of understanding of Complexity Manager after the
experiment with 1 being no understanding and 5 being a very thorough
understanding.
1 2 3 4 5
Comment:
12. Would you use Complexity Manager for future tasks? (circle one)
Yes No
Comments:
13. Please provide any additional comments you may have about Complexity
Manager or the experiment overall:
Appendix J: Complexity Manager
Participant Debriefing
Thank you for participating in this research. I appreciate your contribution and
willingness to support the student research process.
The purpose of this study is to determine the accuracy of Complexity Manager for
forecasting compared to unstructured methods. This experiment was designed to test if
Complexity Manager aided analysts in accurately forecasting compared to using the
intuitive process alone. This is the first experiment conducted on Complexity Manager, and it adds to the existing body of experiments on the effectiveness and accuracy of structured methodologies. Participants were assigned at random, and both the control group and the experiment group were placed into teams of three to simulate the subject-matter-expert collaboration this methodology requires.
The results of this experiment will be given to Mr. Richards Heuer, the creator of
Complexity Manager.
If you have any further questions about Complexity Manager or this research you can
contact me.
Appendix K: SPSS Testing
Time in Minutes: Individual
Case Processing Summary: Time in Minutes (Individual)

              Valid            Missing          Total
Group         N     Percent    N     Percent    N     Percent
Control       80    100.0%     0     0.0%       80    100.0%
Experiment    64    100.0%     0     0.0%       64    100.0%
Independent Samples Test: Time in Minutes (Individual)

                              Levene's F   Sig.   t      df      Sig. (2-tailed)  Mean Difference
Equal variances assumed       19.425       .000   -.797  142     .427             -3.35938
Equal variances not assumed                       -.750  92.938  .455             -3.35938
Time in Minutes: Group
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Time in Minutes for Group Control 80 40.4375 20.71802 2.31635
Experiment 67 74.1045 30.30058 3.70181
Independent Samples Test: Time in Minutes for Group

                              Levene's F   Sig.   t       df       Sig. (2-tailed)  Mean Difference
Equal variances assumed       14.017       .000   -7.963  145      .000             -33.66698
Equal variances not assumed                       -7.710  113.292  .000             -33.66698
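The Welch t statistic in the "equal variances not assumed" row can be reproduced from the group summary statistics alone. Below is a sketch using only the Python standard library; the means, standard deviations, and group sizes are taken from the Group Statistics table above, and the formulas are the standard Welch t-test and Welch-Satterthwaite approximation rather than anything specific to SPSS.

```python
import math

# Summary statistics from the Group Statistics table (Time in Minutes for Group)
n1, mean1, sd1 = 80, 40.4375, 20.71802   # Control
n2, mean2, sd2 = 67, 74.1045, 30.30058   # Experiment

# Welch's t-test (equal variances not assumed)
v1, v2 = sd1**2 / n1, sd2**2 / n2
se = math.sqrt(v1 + v2)
t = (mean1 - mean2) / se

# Welch-Satterthwaite degrees of freedom
df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# t ≈ -7.710, df ≈ 113.292, matching the SPSS row above
print(round(t, 3), round(df, 3))
```

The negative t simply reflects that the experiment group (Complexity Manager) spent substantially more time than the control group.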
Analytic Confidence
Source Reliability
Survey Questions
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Knowledge of Sudan Control 83 1.4578 .83083 .09120
Experiment 67 1.7463 1.17219 .14321
Clarity of Instructions Control 83 4.3614 .77426 .08499
Experiment 67 3.5821 .78140 .09546
Open Source Control 83 4.1807 .76739 .08423
Experiment 67 4.2985 .79801 .09749
Team Help Control 83 4.4819 .70471 .07735
Experiment 67 4.3134 .98794 .12070
Independent Samples Test: Survey Questions
(Columns: Levene's F, Sig.; t, df, Sig. 2-tailed)
Knowledge of Sudan Equal variances assumed 10.211 .002 -1.760 148 .080
Equal variances not assumed -1.699 115.143 .092
Clarity of Instructions Equal variances assumed .011 .917 6.104 148 .000
Equal variances not assumed 6.098 140.859 .000
Open Source Equal variances assumed .044 .834 -.918 148 .360
Equal variances not assumed -.914 138.980 .362
Team Help Equal variances assumed 6.172 .014 1.217 148 .225
Equal variances not assumed 1.175 115.648 .242
Variables
Economic Variables
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Economic Variable Control 24 2.9583 1.26763 .25875
Experiment 23 1.6522 .64728 .13497
Independent Samples Test: Economic Variable

                              Levene's F   Sig.   t      df      Sig. (2-tailed)
Equal variances assumed       9.613        .003   4.419  45      .000
Equal variances not assumed                       4.476  34.545  .000
Social Variables
Political Variables
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Political Variable Control 24 3.3333 1.43456 .29283
Experiment 23 1.7826 .79524 .16582
Independent Samples Test: Political Variable

                              Levene's F   Sig.   t      df      Sig. (2-tailed)
Equal variances assumed       4.010        .051   4.555  45      .000
Equal variances not assumed                       4.608  36.222  .000
Geographic Variables
Military Variables
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Military Variable Control 24 2.8750 .89988 .18369
Experiment 23 1.2174 .67126 .13997
Independent Samples Test: Military Variable

                              Levene's F   Sig.   t      df      Sig. (2-tailed)
Equal variances assumed       3.229        .079   7.133  45      .000
Equal variances not assumed                       7.178  42.488  .000
Technology Variables
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Technology Variable Control 24 2.3750 1.43898 .29373
Experiment 23 1.1739 .83406 .17391
Independent Samples Test: Technology Variable

                              Levene's F   Sig.   t      df      Sig. (2-tailed)
Equal variances assumed       9.883        .003   3.481  45      .001
Equal variances not assumed                       3.519  37.176  .001
Total Variables
Group Statistics
Group N Mean Std. Deviation Std. Error Mean
Total Variables Control 24 17.1250 4.08936 .83474
Experiment 23 8.8696 2.59903 .54193
Independent Samples Test: Total Variables

                              Levene's F   Sig.   t      df      Sig. (2-tailed)
Equal variances assumed       6.132        .017   8.219  45      .000
Equal variances not assumed                       8.295  39.195  .000
Forecasting Accuracy
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Group 1 Number of Successes: 3; Sample Size: 24
Group 2 Number of Successes: 6; Sample Size: 23

Intermediate Calculations
Group 1 Proportion: 0.125
Group 2 Proportion: 0.260869565
Difference in Two Proportions: -0.135869565
Average Proportion: 0.191489362
Z Test Statistic: -1.183389197

Two-Tail Test
Lower Critical Value: -1.959963985
Upper Critical Value: 1.959963985
p-Value: 0.236654937
Result: Do not reject the null hypothesis
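The figures above follow the standard pooled two-proportion z-test. The sketch below reproduces the intermediate calculations using only the Python standard library (statistics.NormalDist requires Python 3.8+); the assignment of Group 1 to the control condition and Group 2 to the experiment condition is an assumption based on the group sizes (24 and 23) in the variable tables above.

```python
import math
from statistics import NormalDist

# Success counts and sample sizes from the test above
x1, n1 = 3, 24    # Group 1 (assumed control): accurate forecasts / sample size
x2, n2 = 6, 23    # Group 2 (assumed experiment)

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                 # "Average Proportion"

# Pooled standard error and z statistic
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# z ≈ -1.1834, p ≈ 0.2367; |z| < 1.96, so the null is not rejected at 0.05
print(round(z, 4), round(p_value, 4))
```

The same code reproduces the second test below by substituting the success counts 19 and 16.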
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Group 1 Number of Successes: 19; Sample Size: 24
Group 2 Number of Successes: 16; Sample Size: 23

Intermediate Calculations
Group 1 Proportion: 0.791666667
Group 2 Proportion: 0.695652174
Difference in Two Proportions: 0.096014493
Average Proportion: 0.744680851
Z Test Statistic: 0.754624002

Two-Tail Test
Lower Critical Value: -1.959963985
Upper Critical Value: 1.959963985
p-Value: 0.450474618
Result: Do not reject the null hypothesis