CONTRAST IN JUDGMENTS OF MENTAL HEALTH
By
Laura H. Kushner
Submitted to the
Faculty of the College of Arts and Sciences
of American University
in Partial Fulfillment of
the Requirements for the Degree of
Master of Arts
In
Psychology
Chair:
2008
Scott Parker
Kathleen C. Gun
American University
Washington, D.C. 20016
UMI Number: 1460430
Copyright 2008 by ProQuest LLC.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest LLC 789 E. Eisenhower Parkway
PO Box 1346 Ann Arbor, MI 48106-1346
ABSTRACT
The present study examined contrast in judgments of mental health. Subjects
rated word/definition pairs on a 0 (Low) to 100 (High) mental health scale. We first
established some pairs as exhibiting Low, Moderate, or High mental health. In
Experiment 1, subjects who first saw High words rated Moderate words lower than
subjects who first saw Moderate words. Thus, contrast occurred for Moderate words.
Experiment 2 used Low (rather than High) and Moderate words but found no contrast.
The results echoed previous research on contrast and mental health.
Contrast was induced for the Moderate words in the Moderate/Healthy condition:
Moderate words were moved away from their baseline ratings. Contrast did not occur in
the Moderate/Low condition: Moderate words were not moved away from their baseline
ratings. In neither case did the Moderate words induce changes in the ratings of the
extreme words.
ACKNOWLEDGEMENTS
I would like to thank Scott for his unlimited patience, guidance, and support. I
would also like to thank Dr. Haaga and Dr. Gunthert for their valuable feedback and time.
Thanks to my mom and dad for their continued encouragement and willingness to listen.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGMENTS
TABLE OF CONTENTS
Chapter
1. INTRODUCTION: Contrast in Judgments of Mental Health
Background
Contrast and Perception of Mental Health
Goals of the Study
2. EXPERIMENT 0: CALIBRATION
Participants
Materials
Procedure
Results
3. EXPERIMENT 1: HIGH MENTAL HEALTH/MODERATE MENTAL HEALTH
Participants
Materials
Procedure
Results
4. EXPERIMENT 2: LOW MENTAL HEALTH/MODERATE MENTAL HEALTH
Experiment 2A Method
Participants
Materials
Procedure
Results
Discussion
Experiment 2B Method
Participants
Materials
Procedure
Results
Discussion
5. GENERAL DISCUSSION
APPENDIX
A. INSTRUCTIONS
B. TABLES
REFERENCES
CHAPTER 1
INTRODUCTION
Contrast in Judgments of Mental Health
The contrast effect is a psychological phenomenon that has been demonstrated in
myriad circumstances. The effect occurs after exposure to a stimulus along a continuum;
the subsequent perception of another stimulus from the same continuum will often be
pushed in the opposite direction, away from that of the initial stimulus. For example, if
one simultaneously places one's right hand in hot water and left hand in cold water for
one minute and then immediately places both hands in a container of lukewarm water, the
right hand will perceive the lukewarm water as cold while the left hand will perceive the
lukewarm water as hot (Locke 1690/1964).
In his seminal book, Outline of Psychology, Wilhelm Wundt included several
sections on the contrast effect. Throughout these sections, it is evident that Wundt
believed contrast was fundamental, widespread, and significant. Wundt went so far as
to include it as one of three fundamental principles of psychical phenomena (Wundt
1907). Since Wundt, a multitude of studies have investigated and demonstrated the
omnipresence of the contrast effect. Because of contrast's generality, the phenomenon
is likely to apply to clinical judgments. For example, it is likely that the assessment of
one person may be influenced by judgments previously made of other people. The
present experiment investigates the contrast phenomenon in the context of the perception of
mental health.
Background
Before examining contrast's role in mental health judgments, it is useful to have
more background on contrast and related phenomena. As described above, contrast
occurs when ratings or judgments of a stimulus are pushed in the opposite direction of an
original stimulus. Contrast also has an opposite, known as assimilation. Assimilation
occurs when ratings or judgments of a stimulus are pushed toward the ratings of an
original stimulus (Suls & Wheeler, 2007).
Although a plethora of literature exists on the topic of assimilation and contrast,
an overarching theory does not exist that definitively explains why contrast occurs in
some situations while assimilation occurs in others. Nevertheless, work by Stapel (2007)
has helped clarify the issue.
Historically, contrast and assimilation were generally researched by different
groups of investigators. Contrast was principally examined by psychophysicists while
assimilation was studied predominantly by social psychologists (Suls & Wheeler, 2007).
Each group usually believed that assimilation (if you were a social psychologist) or
contrast (if you were a psychophysicist) was the default mechanism for human
judgments.
Stapel (2007) argues that this division among groups of psychologists continues
today. He emphasizes that research in social cognition tends to focus on issues relating
to categorization and interpretation, thereby leading to assimilation. Other researchers
(Stapel refers to them as social judgment and comparison researchers) tend to focus
more on comparative issues of social judgments, thereby leading to contrast. As a result
of these different theoretical underpinnings, different kinds of procedures were
developed by each group that tended to lead to assimilation in some instances and to
contrast in others. It will be helpful to explain those kinds of procedures in more detail.
To begin, there are several situations that can increase the chance that contrast
will occur. When subjects are presented a series of stimuli that they are asked to
evaluate, this generally produces contrast. In many of these kinds of experiments, a
series of evaluative judgments are initially made of stimuli that come from one region of
a scale (the context stimuli). Next, judgments are made regarding a different series of
stimuli from a different region on the scale (the test stimuli). These kinds of experiments
are common within the sensory psychophysics literature (e.g. Conner, Land & Booth,
1987, taste; Frederiksen, 1975, loudness and brightness; Pol, Hijman, Baare & van Ree,
1998, smell) and generally produce contrast. Similar designs have also found contrast in
other perceptual domains (e.g. Kenrick & Gutierres, 1980, using physical attractiveness
ratings; Samsom & Rachman, 1992, using fear evoking stimuli).
While there is no consensus as to why these kinds of situations produce
contrast, Lockhead (1982) stated that contrast will be "Commonly found between the
response to the current stimulus and the average magnitude of several preceding stimuli"
(p. 28). For example, if several loud tones precede a more moderate tone, that moderate
tone will be judged as being quieter than if it had been preceded by several quiet tones.
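The tone example can be made concrete with a toy calculation. The sketch below is not a model proposed by Lockhead (1982) or used in the present study; it assumes, purely for illustration, that a judgment shifts linearly away from the mean of the preceding stimuli. The function name, the strength parameter k, and the decibel values are all hypothetical.

```python
from statistics import mean


def judged_magnitude(test, context, k=0.5):
    """Toy contrast rule: shift the judgment away from the context mean.

    k is a hypothetical contrast strength; k = 0 means no context effect.
    """
    return test + k * (test - mean(context))


moderate_tone = 60  # illustrative loudness value, not data from any study

# The same moderate tone judged after loud versus quiet context tones.
after_loud = judged_magnitude(moderate_tone, [80, 85, 90])
after_quiet = judged_magnitude(moderate_tone, [30, 35, 40])

print(after_loud, after_quiet)
```

Under this assumed rule the moderate tone is judged as quieter after the loud series and as louder after the quiet series, which is the pattern the example describes.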
Just as certain procedures tend to produce contrast, certain other procedures tend
to lead to assimilation. One such procedure is a priming paradigm. In this kind of
design, subjects are first exposed (often implicitly) to traits or behaviors that are
ambiguously related to the property to be judged. Next, subjects are asked to evaluate
the property in question. For example, Higgins, Rholes, and Jones (1977) exposed
subjects to a list of either socially desirable traits or socially undesirable traits. Later
subjects were asked to judge a hypothetical person based on an ambiguous verbal
description. Subjects who were previously exposed to the socially desirable traits on
average judged the hypothetical person as more socially desirable than did people who
had been exposed to the undesirable traits. It is thought that this situation is likely to
cause assimilation because the overlap in the context between the prime and the
subsequent evaluation will cause the subject's evaluative response to migrate toward the
prime (Wedell, Hicklin & Smarandescu, 2007).
Another class of studies in social cognition also produces results that are
assimilative - those that use numerical anchoring. In these studies, the subject first gets
exposed to a number rather than a trait or social context. The subject then gives a
numerical rating of something and that rating is typically moved closer to the number
that was initially presented. For example, Mussweiler and Englich (2005) exposed
subjects to a series of random symbols on a computer screen. After a series of three
symbols were presented for one second each, a number (the anchor) was presented for
thirty-three milliseconds. In this experiment, the subject did not report being exposed to
a number. Immediately following the presentation of numbers and symbols, subjects
were asked to consider, "How many Euros does a midsize car cost on average?" When
subjects were not exposed to a numerical value, the average estimate of the car's price was M =
18,312 euros. Subjects exposed to a high anchor value (30,000) gave an average
estimate of the price of a car (M = 21,219 euros) that was significantly higher than did
subjects who were exposed to a low anchor value (10,000) and then asked to estimate
the price of a car (M = 17,150 euros). A number of studies have produced similar results
(e.g., Epley & Gilovich, 2001, estimating the freezing point of vodka; Blankenship,
Wegener, Petty, Detweiler-Bedell, & Macy, 2008, estimating the age of various famous
men when a significant event occurred in their lives).
Interestingly, the results from the Higgins, Rholes, and Jones (1977) study violate
Lockhead's theory regarding the importance of the stimuli presented prior to the
response stimuli. According to Lockhead (1982), while contrast is likely to occur if
subjects average several preceding stimuli, assimilation is more likely to occur if one
stimulus is presented and then subjects are asked to evaluate an immediate, subsequent
stimulus. With Higgins, Rholes, and Jones (1977), a series of traits were presented to
each subject, yet assimilation occurred. Of note, in this priming study subjects were not
evaluating the initial personality traits. Instead, they were only evaluating the
hypothetical person's characteristics (after the trait had been presented). The current
research will be set up in such a way that both the initial stimuli and the test stimuli are
evaluated. Historically, this type of paradigm has produced contrast.
Additionally, recent research has demonstrated that when test stimuli are
simultaneously presented (as a group) assimilation is more likely to occur than if the
stimuli are presented sequentially. Zellner and Cobuzzi (2008) asked subjects to
evaluate their liking for pieces of art on a 201-point bipolar hedonic scale (+100
indicated the most attractive art imaginable, while -100 indicated the most unattractive
art imaginable). Subjects were placed in one of three groups: Grouping, Comparing, and
Control. Grouping subjects observed three paintings by Goya at the same time, the
middle one being the "test" painting. The test painting had been selected based on
previous research that suggested it would have a positive hedonic rating while the two
outer paintings or "context" paintings were selected based on previous research that
suggested they would have negative hedonic ratings (Dolese, Zellner, Vasserman, &
Parker, 2005).
First, subjects were asked, based on the paintings, how much they liked the artist.
Then, they were asked to evaluate their liking for each painting. The Comparing
subjects also saw the three paintings at the same time; however, these subjects were first
asked to rank order the paintings by preference and then to evaluate their liking for each
painting. Finally, the Control group saw only the test painting and rated their liking for
it.
When these paintings had been viewed sequentially (Dolese, Zellner, Vasserman
& Parker, 2005), contrast occurred. But here, the Grouping group rated the test painting
as significantly less hedonically positive than the Control group did, exhibiting assimilation.
However, there was no significant difference in ratings of the test painting between the
Comparing group and the Control group. In other words, when viewed simultaneously
and understood to be part of a single group, the two context paintings pulled down the
ratings of the middle "test" painting. Simultaneous viewing can induce assimilation for
stimuli that when viewed sequentially induce contrast.
In addition to design structure leading to assimilation, social influence may also
tend to lead subjects to more assimilative responses. In a famous experiment, Sherif
(1936) exposed subjects to an illusion in which a pinpoint of light appeared to be moving
(known as the autokinetic effect). Subjects were asked (on an individual basis) to
estimate how much the light moved. In a later phase of the experiment, subjects were
asked to estimate how much the light moved while they were in a group. In this phase,
subjects' answers began to converge. This experiment indicates that a person's
desire to conform to the standards set by others may lead to assimilation. Other examples
of social influence leading to assimilation are abundant (e.g. Asch, 1956, estimating line
length; Yin & Saltzstein, 1968, estimating number of dots).
The importance of the impact of study design in determining contrast or
assimilation was highlighted in a recent study by Zellner, Strickhouser, and Tornow
(2004). In phase one of this study subjects were shown liquid (a tropical fruit drink) in a
cup. They were asked to rate how much they expected to like the drink after simply
looking at it, after being told it was a tropical fruit drink, and after being told it was cough
syrup flavoring. Subjects rated their taste expectations on a 201-point hedonic scale
ranging from -100 "most unpleasant taste imaginable" to 100 "most pleasant taste
imaginable". Subjects rated their expectations differently in the three conditions: when
they were told nothing (M = .09), when they were told it was a tropical fruit drink (M =
53.60), and when they were told it was cough syrup (M = -36.00).
In the second phase of the experiment different subjects were given the fruit juice
to drink. Some of the subjects were told that the liquid was a cough syrup, others were
told it was a tropical fruit drink, and finally some were told nothing about the drink. In
this paradigm, subjects who were told the drink was cough syrup rated how much they
liked the juice significantly higher than did the subjects who were told that the drink was
a tropical fruit juice. Contrast occurred, presumably as a result of violation of the
subjects' expectations about how much they would like a cough syrup. As highlighted
above, this paradigm closely matches the kind of design that is likely to produce
contrast.
In the final phase of the experiment different subjects were presented with this
same drink. This time, the subjects were told that prior subjects had rated the drink. All
subjects were told a specific numerical rating that had been previously assigned to the
drink. The ratings that were specified were the expectations that were the data of phase
one of the experiment. Thus, some subjects were told the juice was rated .09, some were
told 53.60, and some were told -36.00. In this experiment, subjects who received the
prior negative rating gave the drink a lower taste liking rating than subjects who had
been given a prior positive rating. Assimilation occurred.
Although the final phase of the experiment does not exactly match the
aforementioned priming procedure, subjects were previously given an exact number to
use as a reference point in this scenario. Another facet that may have led to assimilation
is that subjects were socially influenced, paralleling Sherif (1936). It is possible that
their actual preferences of the stimuli were not affected, but they did not want their
opinions to deviate from those of the previous raters. In any case, Zellner et al. (2004)
succeeded in creating contrast and assimilation using the same stimulus set simply by
slightly altering the instructional procedures of the experiment.
While research has clarified the instances in which contrast and assimilation
occur, there is as yet no theory that can explain which phenomenon occurs under what
circumstances and why. In the current study, judgments relating to mental health are
examined. A series of context stimuli are presented followed by a series of different
stimuli. As discussed above, this kind of paradigm typically leads to contrast.
Therefore, the intent of this research is to expand knowledge of contrast in mental health
assessment.
Contrast and Perception of Mental Health
One of the first studies to examine judgments of mental health and illness was
done by Arnhoff (1954). Arnhoff's work is seminal because, although contrast was not
observed with mental health and illness ratings, it was the first study to investigate the
topic. Two hundred twenty-two schizophrenic responses to requests for definitions of
various vocabulary items were standardized in order to obtain "a number of stimuli
whose stimulus values were known" (Arnhoff, 1954, p. 272). In the initial calibration
phase of the study, twenty-two experienced clinicians rated the words and definitions on
"the severity of the disorder in the thinking processes exhibited in the responses"
(Arnhoff, 1954, p. 272) using an 11-point scale where 1 indicated little pathology while
11 represented an extremely pathological response. The mean rating for each stimulus
definition was calculated. Only those definition responses that had an average rating
between 4 and 8 were included in the next phase of the experiment.
In the next phase, 180 subjects (ranging in education from psychology
undergraduates to experienced clinical psychologists) were recruited and divided into
three groups. Each group was given a set of the 4-8 vocabulary items and definitions
and asked to rate the definitions on the aforementioned 11-point scale. Each of the three
groups was then presented with another series of the 4-8 items and definitions. Before
the second series of definitions were presented, the first group read an example of a
definition at the high end of the scale and its corresponding 11 rating. The second group
read an example of a definition at the low end of the scale and its corresponding 1 rating.
Finally, the third group received no example of high- or low-rated items. Examples of
the extreme stimuli prior to the ratings of subsequent words were used to examine
whether these "anchors" might influence how the subsequent, moderate words were
rated. Arnhoff found no difference in ratings among the three groups; the "anchors" did
not influence the ratings of subsequent moderate rated words. It is worth noting that this
procedure was more likely to elicit assimilation than contrast. The extreme words were
not evaluated; rather, they were used as an anchoring point. Also, because subjects were
told that the extremes had previously been rated as such, a social influence component
may also have influenced the outcome. However, in 1957 Campbell, Hunt, and Lewis
investigated contrast's role in influencing perception of mental illness and had different
results.
Campbell, Hunt, and Lewis (1957) designed a study that had a conceptual
foundation similar to that of Arnhoff, and it was the first study to successfully observe
contrast in judgments of mental health and illness. Their study used some
word/definition pairs that were drawn from the Arnhoff study. More words were also
added to the normal and extreme ends of the pathology (thought disturbance) scale.
Forty-nine college sophomores judged the degree of pathology of the Arnhoff
word/definitions and the new word/definitions. An additional 50 items were also added
in the middle range of pathology. These items were preliminarily rated by 26 students
and were sorted into categories using the Thurstone technique of equal-appearing
intervals. These three groups of words (the high-pathology, middle-pathology, and
low-pathology words) were then pooled together to create a pathology continuum. Finally,
the word/definitions were divided into nine separate groups that spanned a full range of
pathology: these groups were formed by choosing the words that were most consistently
rated by the subjects during the standardization phase. Within this nine-point scale,
group number one words represented no disturbance, while group number nine words
represented extreme disturbance. A sample of a few of the chosen items included: "(1)
Gamble: To take a chance a risk. (5) Brim: Outside diameter with a margin. (9) Stave:
That's before, that's long but not happiness" (Campbell et al., 1957, p. 348).
The experimental phase began by placing the standardized stimuli into booklets
that were presented to subjects. Five word/definition pairs were placed on each page (in
counterbalanced orders). In phase one of the experiment, the booklets contained stimuli
from one half of the 1-9 scale (either items ranging 1-5 on thought disturbance or items
ranging 5-9 on thought disturbance). Phase two was the transition phase of the
experiment (containing mid-scale, 2-8, items). Finally, phase three contained stimuli
from the opposite half of the 1-9 scale (again, items ranging from 1-5 on thought
disturbance or 5-9). There were two conditions in the experiment: one group (the
low-high group) first got the low-rated items, then the mid-scale items, and finally the
high-rated items; the second group (the high-low group) began with the high-rated items and
proceeded to the low-rated items. As before, subjects were asked to rate each item for
degree of thought disturbance on the aforementioned 1-9 scale.
The critical data were the ratings of items that had previously received an
average rating of "5" during the standardization phase of the experiment. After phase
one of the experiment, there was a difference between the low-high and the
high-low groups' ratings of those mid-scale stimuli. This experiment demonstrated that
subjects who were first exposed to items exhibiting high disturbance rated items in the
medium range of disturbance as less severely disturbed than did subjects who were first
exposed to low disturbance items (Campbell et al., 1957). Although Arnhoff did not find
any kind of effect when extreme words were presented prior to moderate words,
Campbell et al.'s paradigm demonstrated that contrast can affect judgments of thought
disturbance. Note that while Campbell et al. successfully demonstrated contrast by
comparing the low-high and high-low groups to each other, they did not test whether
either group's ratings differed from the original value "5" of the standardization stimuli.
Thus their study did not determine whether contrast occurred in both directions or only
in one.
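The distinction between the two kinds of comparison can be sketched in a few lines. The example below is only an illustration under stated assumptions: the ratings are invented, not data from Campbell et al. (1957), and the two helper functions are hypothetical names that compute ordinary t statistics. A between-groups comparison asks whether the two groups differ from each other, while a one-sample comparison asks whether a group has moved away from the standardization value of 5.

```python
from math import sqrt
from statistics import mean, stdev


def welch_t(a, b):
    """Welch t statistic for the difference between two independent group means."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def one_sample_t(x, baseline):
    """t statistic for the difference between a group mean and a fixed baseline."""
    return (mean(x) - baseline) / (stdev(x) / sqrt(len(x)))


# Hypothetical ratings of mid-scale ("5") items, invented for illustration only.
low_first = [5.0, 5.5, 4.5, 5.0, 5.5, 4.5]   # saw low-disturbance items first
high_first = [4.0, 3.5, 4.5, 4.0, 3.5, 4.5]  # saw high-disturbance items first

between_groups = welch_t(low_first, high_first)
low_vs_baseline = one_sample_t(low_first, 5.0)
high_vs_baseline = one_sample_t(high_first, 5.0)

print(between_groups, low_vs_baseline, high_vs_baseline)
```

In these invented data the groups differ clearly from each other, yet only the high-first group departs from the baseline. A between-groups difference alone therefore cannot show whether contrast occurred in one direction or in both.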
Another important facet of the research done by Campbell et al. (1957) is that
after phase one and phase two, the results from phase three of the study became more
assimilative. Campbell et al. (1957) theorized that in this relatively complex judgment
task, subject variables such as experience and comprehension of the instructions may
influence responses. Specifically, the subjects who had the strongest contrast reactions
in phases one and two of the experiment were the least likely to assimilate in phase
three. Campbell et al. (1957) suggested that the subjects who found the stimuli in phase
one the least ambiguous were the most likely to produce contrast responses throughout
the entirety of the experiment. Finally, it is important to note that by the end of phase
three of the experiment each subject had seen 130 word/definition pairs; some kind of
task fatigue may have interacted with the phenomena under investigation.
Manis and Paskewitz (1984a) found an additional factor, the degree of similarity
between context and test stimuli, that influences the probability of contrast occurring. In
their series of experiments, subjects were exposed to one of two different kinds of
stimuli. One kind was definitions provided by Arnhoff (1954) that had previously been
rated as exhibiting disordered thinking; the second was handwriting samples. Prior to
the experiment, a "standardization group" (otherwise unspecified) rated these
handwriting samples for psychopathology; average ratings were computed to measure
the amount of psychopathology that each item indicated.
Manis and Paskewitz (1984a) varied whether subjects were initially presented
with definitions or with handwriting samples. Both of these stimulus types were
subdivided into two separate subgroups, those that had been previously standardized as
relatively pathological and those that had been previously standardized as being
non-pathological. The pathological definitions ranged from 8.0 to 10.7 on an 11-point
scale of pathology, while the non-pathological definitions ranged from 1.5 to 3.9 on the
same 11-point scale. Also, the pathological handwriting samples ranged from 7.8 to 10.3
on a separate 11-point scale of pathology while the low pathology handwriting samples
ranged from 1.6 to 3.0 on the same 11-point scale. During this initial, induction phase of
the experiment, subjects were presented with one of the four possible combinations of
stimulus type and pathology level: high-pathology definition, low-pathology definition,
high-pathology handwriting, or low-pathology handwriting. All subjects were told that
the samples came from "a representative cross-section of patients in a state hospital" (p.
222). Each subject was initially shown six moderate pathology stimuli as warm-up,
followed by 20 of the more extreme stimuli described above. Subjects decided whether
the creator of the definition or handwriting sample was schizophrenic or not: a yes/no
decision.
Next, in the experimental phase, subjects were told that the stimuli were
"derived from another representative sample of patients" (Manis & Paskewitz, 1984a,
p. 222). The test items in the experimental phase were both handwriting samples and
word/definition pairs. The word/definition test items fell within the moderate range of
pathology, between 4.1 and 7.8 on the previously mentioned 11-point pathology scale.
There were two separate kinds of handwriting stimuli; half of the test stimuli fell
between 1.7 and 7.1 on the 11-point psychopathology scale, while the other stimuli fell
between 4.1 and 6.6. Some of the subjects rated solely the word/definition pairs, others
rated only the handwriting samples (the 1.7 to 7.1 stimuli), and some subjects rated a
stimulus set that included both word/definition pairs and handwriting samples (the 4.1 to
6.6 stimuli). Subjects rated the degrees of thought disturbance of the creators of stimuli
on a 7-point scale where 1 represented a person with normal thought, while 7
represented a highly thought-disordered person. Subjects again rated whether the creator
of the stimulus was schizophrenic or not. Finally, subjects rated the confidence of their
diagnosis of schizophrenia (or not) on a separate 7-point scale where 1 represented a
guess and 7 represented very certain. Because the two 7-point scales ended up being
highly correlated, the scales were combined to form one 14-point scale that ranged from
1 (very certain the person is not schizophrenic) to 14 (very certain that the person is
schizophrenic). Manis and Paskewitz translated this 14-point scale into a single score,
presumably using a z-score transformation to equally weight both scales.
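Since the exact scoring procedure is not specified, the following is only a sketch of what an equally weighted z-score composite could look like. It assumes that each scale is standardized across subjects and the two standardized values are averaged; the function names and the ratings are hypothetical and are not taken from Manis and Paskewitz (1984a).

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def composite(scale_a, scale_b):
    """Average the z-scores of two scales so that each contributes equally."""
    return [(a + b) / 2 for a, b in zip(z_scores(scale_a), z_scores(scale_b))]


# Hypothetical ratings from five subjects, invented for illustration only.
disturbance = [1, 3, 4, 6, 7]   # 7-point thought-disturbance ratings
certainty = [2, 5, 8, 11, 14]   # 14-point diagnostic-certainty ratings

scores = composite(disturbance, certainty)
print(scores)
```

Standardizing first keeps the scale with the wider numerical range from dominating the combined score, which is the sense in which such a transformation weights both scales equally.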
For some participants the induction phase stimuli were of the same type as the
experimental phase stimuli (definitions or handwriting samples), while for other
participants the induction and experimental phase stimuli were of different types. So, for
example, subjects who were first presented with definitions could be asked to rate
psychopathology of more definitions or could be asked to rate handwriting samples.
Contrast occurred in the experiment. Subjects first exposed to high pathology
induction stimuli rated the subsequent moderate stimuli lower than subjects who were
first exposed to low pathology induction stimuli followed by moderate stimuli. When
the initial, induction stimulus type matched the experimental stimulus type, contrast was
more likely to occur. For example, if a subject was exposed to a non-pathological
definition in the initial phase of the experiment, the subject was likely to rate a
moderate definition as more pathological than if the initial stimulus exposure
had been a handwriting sample. The contrast effect was strongest with the experimental
stimuli that immediately followed the initial stimuli. Results were different for the
subjects who were presented 'mis-matched' stimuli (for example, definitions followed
by handwriting samples). For these groups, the contrast effect was present; however, the
effect size was much smaller. Manis and Paskewitz (1984a) questioned whether the
'mis-match' data results were replicable, and referenced a similar experiment by
Parducci, Knobel and Thomas (1976) in which subjects rated the sizes of geometric
forms (circles and squares) and contrast did not occur in the 'mis-match' condition.
Manis and Paskewitz (1984a, p. 225) concluded that, "the presentation of a biased
induction series ... leads to contrast in estimating the amount of psychopathology implied
by other stimuli of the same type; when, however, stimuli from a different domain are
evaluated with respect to the psychopathology that they imply, the preceding induction
experience does not have a consistent impact". In other words, extreme vocabulary
definitions can create contrast with other vocabulary definitions, and extreme
handwriting samples can create contrast with other handwriting samples. However, if a
degree of similarity does not exist between induction series stimuli and response stimuli,
the magnitude of contrast-like responses significantly decreases. The authors declared,
"The most important finding in these studies is the relative specificity of the contrast
phenomenon" (p. 227).
Manis and Paskewitz (1984b) published another study that also demonstrated the
specificity of contrast while simultaneously examining the effect of subjects'
expectations. In the experiment, half of the subjects were initially presented with high-pathology
word/definition pairs and the other half were presented with low-pathology
word/definition pairs. Subjects were asked, "Was the definition produced by a
schizophrenic or non-schizophrenic patient?" (p. 367). After this induction phase,
subjects were presented with definitions and handwriting samples and evaluated the
degree of pathology evident in the word or handwriting sample. Some subjects were
asked to state their expectation in connection with the test item. So, before being
presented with the item, some subjects were asked to guess the percentage of
schizophrenic items in the whole list and also asked if the following stimuli would be
from a schizophrenic or non-schizophrenic patient. Subjects' expectations were
congruent with the induction series they were initially assigned; the high-pathology
group believed the test group would include more schizophrenic responses than the
low-pathology group.
In addition to the expectation effect, contrast also occurred. Subjects from the
low-pathology induction group rated the second group of definitions as being more
pathological than did those in the high-pathology conditions. Supporting their previous
study's findings regarding specificity of stimuli, handwriting samples were not affected
by contrast. In sum, the work of Manis and Paskewitz (1984b) identifies factors that can
both suppress and enhance contrast.
The present study utilizes word/definition pairs as stimuli, following in the
dominant tradition in this field. However, not all research into contrast in judgments of
mental health has used word/definition pairs as the stimuli to be evaluated. For example,
Bieri, Orcutt, and Leman (1963) extended research on contrast and clinical judgment
using a different sort of stimuli and suggesting that the presence of extreme stimuli
presented prior to moderate stimuli creates ratings for those moderate stimuli that are
different from the baseline ratings of moderate stimuli. In the first portion of their
experiment, thirty-three clinical practitioners rated social behaviors of psychological
case reports on a 20-point scale of pathology where 1 represented very mild
maladjustment and 20 represented extreme maladjustment. The behaviors with the
largest amount of inter-rater agreement were placed into one of three categories: low
pathology, middle pathology, or high pathology. Four behavior examples formed one
case.
In the next portion of the experiment, sixty-five social work students were asked
to rate the pathology of the cases on the same 20-point scale. Each student only saw one
category of case (Low, Middle, or High). Cases were also divided according to the
predominant behavior the cases exhibited: aggressive or dependent pathology. This
portion of the experiment provided the non-anchored ratings of the middle cases.
After standardization data had been gathered, 176 more subjects saw two
extreme cases (either high or low pathology) followed by a middle pathology case.
Some of the subjects saw the aggressive cases while others saw the dependent cases.
There were eight possible experimental groups based on order of cases presented, kind
of pathology presented, and whether the stimulus behavior was categorized as aggressive
or dependent. The presentation of three cases constituted phase one of the study. There
were three other phases of the experiment that evaluated the impact of time on ratings;
however, the details of the other phases extend beyond the scope of this paper. Results
from phase one echoed several other studies of contrast. The subjects who rated the low
pathology cases first rated the middle cases as having significantly higher pathology than
did the subjects who rated the high pathology cases prior to the middle cases.
Additionally, Bieri et al. (1963) examined the ratings of the middle cases in phase one of
the study compared with the ratings of the non-anchored middle cases. Each of the eight
groups in phase one had moderate ratings that were indicative of contrast; for example,
subjects presented with high pathology cases first rated moderate cases as having less
pathology than the ratings of the non-anchored middle cases. However, only two of the
eight groups had statistically significant differences. Bieri et al. did not specify which
two groups showed a significant difference between phase one middle cases and
non-anchored middle cases.
Bieri et al. (1963) provided preliminary evidence to suggest that exposure to
low-pathology stimuli (or to high-pathology stimuli) prior to moderate stimuli creates ratings
for the moderate stimuli that are significantly different from baseline ratings of those
moderate stimuli. However, the conclusions were not definitive; only two of the eight
groups of stimuli yielded significant results and we don't know which ones they were.
Thus the question of just what judgments about mental health are moved by contrast
remains unsettled.
In 1971, Perrett found contrast with another kind of mental health/illness stimuli
while investigating whether the professional experience of the subjects would impact
magnitude of the contrast effect. In this study, abbreviated case histories were gathered
from either the closed files of psychiatric inpatients or from the MMPI Atlas. The case
histories reflected a wide range of psychological disturbances. Initially, 80 histories
were rated by five clinical psychologists on a seven-point scale of psychopathology (1
representing "very, very mild" and 7 representing "very, very severe" pathology).
Among the 80 histories, the ones whose ratings had the smallest standard deviations
were selected for use as stimuli in the experiment. The six case histories that received
moderate (3, 4, 5) ratings were selected to be the critical test stimuli that could be
affected by contrast.
In the experiment, subjects were divided into two groups, differing in the context
in which they saw the test stimuli. Each group saw a total of 20 case histories, and rated
their psychopathology on the same seven-point scale that was initially used by the
clinical psychologists. One group saw 14 histories that were previously rated as having
low psychopathology (1, 2, or 3 on the seven-point scale) and immediately afterwards
saw the six test stimuli. The other group saw 14 histories that were previously rated as
having high psychopathology (5, 6, or 7 on the seven-point scale) and immediately after
saw the six test stimuli. Because the six moderate test stimuli were common to both
groups, the differences between their ratings by the two groups were used to evaluate
contrast.
Perrett (1971) also divided the subjects according to the kind of background
experiences that they possessed. Some of the subjects worked as therapists in an
inpatient setting, some of the subjects worked as therapists in an outpatient setting, and
some of the subjects were undergraduates enrolled in a psychology course. It was
hypothesized that those subjects who worked in an inpatient setting and saw the most
severe kinds of pathology on a regular basis would therefore be the least impacted by
contrast.
The results showed that moderate test cases were rated as more severely
pathological when they appeared following milder case histories than when they
appeared following more severe cases. Thus, the ratings exhibited contrast. Also, the
experienced clinicians and the undergraduates were affected differently by the stimuli.
Contrast was higher for the students than it was for the clinicians. While the variable of
experience did produce statistically significant results, Perrett emphasized that the largest
effect size was a result of what level of pathology the subjects saw prior to the test cases.
More recently, Wedell, Parducci and Lane (1990) examined specific factors (the
number of categories on a rating scale and the kind of anchors on the scale) that may
influence how contrast affects clinical judgment. They used as stimuli the same thirty-four
case histories of psychiatric inpatients as did Perrett (1971); six were the "moderate"
test stimuli, 14 the low psychopathology stimuli and 14 the high psychopathology
stimuli from the Perrett study.
Because Wedell et al. (1990) were testing a number of factors that could
influence contrast, they did two separate experiments. In the first experiment, subjects
judged the mental health of psychiatric patients based on their case histories. In this
phase subjects were assigned to use one of three different category rating scales: a 3-point,
7-point, or 100-point scale. In each of these scales, a rating of one or zero
represented a case history with a "very, very mild disturbance". The upper limit of the
scale (either 3, 7, or 100) represented a case history with a "very, very severe
disturbance".
In addition to the size of the category rating scale, the experimenters also
manipulated what kind of stimuli the subjects saw prior to the moderate test cases. Some
of the subjects saw only mild cases prior to rating moderate cases and some saw only
severe cases prior to rating moderate cases (the "restricted range" groups). Some of the
other subjects saw mostly mild or severe cases prior to the test cases but had three case
examples of the opposite severity included (the "full range" groups).
Like Perrett (1971), Wedell et al. (1990) found that the factor with the largest
impact on contrast was the degree of pathology of the cases that preceded the test cases.
Additionally, they also found a statistically significant difference between groups as a
function of the range of cases that was presented prior to the test cases. For the full range
groups, who saw a mix of mild and severe cases prior to the moderate cases, the contrast
was smaller than for the groups who were in the restricted range groups. Also, increasing
the number of rating categories from 3 to 7 decreased the magnitude of contrast but there
was no significant difference in contrast between 7- and 101-point scales. Wedell et al.
attribute the heightened contrast in the 3-point scale to response biases rather than to
enhanced perceptual changes. The present study is interested in the perceptual aspect of
contrast, shifts in the degrees of apparent mental health, rather than response biases and
hence uses a 100-point scale.
In the second experiment, Wedell et al. (1990) tested whether anchoring the two
opposite ends of the scale might diminish the contrast effect. This experiment used only
a 7-point category scale. Subjects were assigned to one of three anchor rating groups. In
the first group, subjects saw an anchor that provided a case history example for each
endpoint of the scale (exemplar anchor). In the second group, subjects saw an anchor
that provided a detailed description for each category that could be selected (DSM
anchor). Subjects in the third group did not see any anchoring information. Again, in
this experiment the same restricted and full range groups of both mild and severe stimuli
were used.
Results were similar to those of the first experiment. Contrast was most affected
by the degree of pathology of the cases that preceded the test cases and mirrored the
aforementioned results. Again, there was also a statistically significant difference in
contrast as a function of the range of pathology that was presented prior to the test cases.
In the second experiment, both kinds of anchors equally attenuated the contrast effect in
the restricted range cases, "29% reduction for exemplar anchors and 25% reduction for
DSM anchors" (Wedell et al., 1990, p. 326). There was also a trend for the anchors to
affect the full range group, but not to a statistically significant degree. Overall, the
variables of type of pathology and the range of pathology had more of an impact on
contrast than did whether or not anchors were present.
In sum, Wedell et al. (1990) demonstrated that the strength of the contrast effect
decreased when at least seven rating categories were used or when anchors that spanned
the stimulus range were used; however, contrast was most strongly influenced by the
degree of pathology of the stimuli that the subjects saw prior to the moderate stimuli.
An earlier study by Campbell, Hunt and Lewis (1958) had also examined the
effects of using detailed anchors. In their study, Campbell et al. used vocabulary
responses that had been previously rated on a nine-point scale of pathology: a rating of
one indicated "well-organized and normal" thinking, while a rating of nine indicated,
"totally disorganized and eccentric" thinking (p. 213). The methods of the study
replicated their aforementioned 1957 study. However, in this study, some of the subjects
received a detailed explanation/anchor of every value on the 9-point scale. For example,
they were shown that a rating of four meant that, "distinct traces of disorganization and
eccentricity are shown" (p. 215). Like Wedell et al. (1990), Campbell et al. (1958) had
found that contrast occurred both when subjects were presented with the
descriptor/anchor for each scale value and when they were given only the end-point
labels. However, the effect of contrast was larger when the endpoints were labeled than
when a detailed example was given for every value of the scale. The present study uses a
100-point rating scale with anchors only at the endpoints.
While the literature on contrast and mental health has largely examined the effect
of extremely pathological stimuli on moderate stimuli, it has not looked at whether
moderate stimuli have contrast effects on the perception of extreme stimuli.
Furthermore, the research has not determined whether contrast can push mental health
assessments in both directions - toward both more and less pathology. Though
bidirectional contrast occurs in some settings, it does not in others. Parker, Bascom et al.
(in press) demonstrated bidirectional contrast occurring simultaneously when people
listened to various kinds of music. In their study, six different melodies were composed
by the researchers. Three of the melodies were categorized as "bad" because they
violated the norms of popular and classical music. The remaining three melodies were
categorized as "good" because they had similar structure to popular music. Half of the
subjects heard the three "good" melodies before the "bad", the remaining half heard the
three "bad" melodies before the "good". Subjects were asked to rate, "how much you
liked listening to it" using a 201-point bipolar hedonic scale (from -100 "hated listening
to it" to 100 "loved listening to it"). Subjects' ratings of the "bad" melodies were higher
if they were rated first, than if they were heard second. The "good" melodies were rated
higher if they were rated second, than if they were rated first. However, loudness
contrast does not seem to be equally powerful in both directions. Parker and Schneider
(1994; see also Parker, Murphy & Schneider, 2002) found that intense sounds affect the
loudnesses of soft sounds, but saw little influence of soft sounds on the perception of
loud ones. The present study will look for bidirectionality of contrast in mental health
judgments and, in addition, will look into whether moderate stimuli influence the
judgments of extreme stimuli.
Word/definition pairs are used in this experiment, following the largest body
of research on contrast and mental health judgments. Although several of the
aforementioned studies used case histories, these studies often used psychology students
or clinicians as subjects. The use of words and definitions makes the stimuli more
accessible to a general population.
The present study uses groups of four words followed by a pair of test stimuli,
the ones used to test contrast. Throughout the previously described studies of contrast
and mental health, there was a wide variance in the number of stimuli presented. For
example, Bieri et al. (1963) presented three case histories during a given phase of the
study while Campbell et al. (1957) presented 130 word/definition pairs to each subject.
A "four plus two" stimuli pattern was used for a number of reasons. First, as confirmed
by previous studies, contrast does not seem to be particularly sensitive to stimulus
number. It is a robust phenomenon and would likely occur with a wide variety of
stimulus number presentations. Secondly, as discussed by Campbell et al. (1957), if
subjects are repeatedly exposed to a larger number of moderate words, contrast may
initially appear but eventually give way to assimilation. Other previous research had used
varying numbers of context and test stimuli and still found contrast (e.g., Parker, Bascom,
Rabinovitz, & Zellner, in press, three context & three test stimuli; Zellner, Allen, Henley
& Parker, 2006, four context & two test stimuli; Dolese, Zellner, Vasserman, & Parker,
2005, five context and two test stimuli; Zellner, Rohm, Bassetti & Parker, 2003, eight
context and two test stimuli). The present study, then, follows the procedure of Zellner,
Allen et al.
In previous studies of contrast and mental pathology, there has been sufficient
evidence to suggest that exposure to stimuli at a well-defined place on a scale can create
contrast for stimuli at a different well-defined place on a scale. However, only one
previous study has suggested that exposure to low-pathology stimuli (or to high-pathology
stimuli) prior to moderate stimuli creates ratings for the moderate stimuli that
are significantly different from baseline ratings of those moderate stimuli. Also, there
have been no investigations in the mental health judgment literature into whether initial
exposures to moderate stimuli can shift ratings of either low or high rated stimuli. These
considerations motivate the present study.
Goals of the Study
1.) Re-examine the contrast effect and its relation to mental health perception
(i.e., confirm that it occurs).
2.) Re-examine whether the presence of the unhealthy stimuli/healthy stimuli
shifts ratings for moderate stimuli away from the baseline ratings of those
moderate stimuli.
3.) Investigate whether initial exposures to moderate stimuli can shift ratings of
either low- or high-rated stimuli.
CHAPTER 2
EXPERIMENT 0: CALIBRATION
The objective of the calibration experiment was to find a set of words that people
rated reliably and that could be used to investigate contrast.
Method
Participants
Seventy-seven people (31 females, 46 males, ranging from college age to older
adults) participated in this experiment. Participants were randomly selected from the
general population and asked if they would be willing to participate in a psychology
experiment that would take approximately ten minutes. Participants were excluded from
the study if they were not fluent English speakers. Participants were not offered
compensation for their time.
Materials
The creation of the stimuli was inspired by word/definition pairs similar to those
developed by Arnhoff (1954). Sixty words and definitions were created (some) or
adopted (the rest) in order to find responses that would be rated as exhibiting varying
degrees of mental health/mental illness.
Procedure
Subjects were run individually. Each participant rated the full list of 60
word/definition pairs. In all cases, 10 word/definition pairs were placed on six different
sheets of paper. Subjects were asked to read each word/definition pair aloud and rate
their perceptions of the definitions. Five distinct 6-page sets of the 60 stimuli were
constructed, recombining stimuli into the six groups of 10 in roughly counterbalanced
orders.
Participants rated the definitions in one of two conditions. Half of the
participants evaluated the mental health of the person who created the definitions on a
scale ranging from 0 = no mental health to 100 = perfect mental health, while the other
half evaluated the mental illness of the person who created the definitions on a scale
ranging from 0 = no mental illness to 100 = extreme mental illness. In both conditions
five approximately counter-balanced random stimulus-page orders were used. Subjects'
ratings were made orally, and the experimenter transcribed their responses.
Results
In general, ratings of words in the mental health condition had lower standard
deviations than did ratings of words in the mental illness condition, so the mental illness
ratings were discarded, and mental health ratings are used in the following studies.
Based on the calibration data, four word/definition pairs were selected whose mean
ratings fell into the Low mental health range and were similar to each other (M=37.6,
SD=18.3), four word/definition pairs were similarly selected from the Moderate mental
health range (M=61.4, SD=13.6), and four word/definition pairs were selected from the
High mental health range (M=89.9, SD=7.3). (One of the High mental health range
word/definition pairs, "cushion", was used in the Arnhoff [1954] study). These words
are listed in Table 1 and were then used in Experiments 1 and 2.
CHAPTER 3
EXPERIMENT 1: HIGH MENTAL HEALTH/
MODERATE MENTAL HEALTH
Method
Participants
Sixty people (sex breakdown unknown), ranging from college age to older
adults, participated in this experiment. Participants were randomly selected from the
general population and asked if they would be willing to participate in a psychology
experiment that would take approximately five minutes. Participants were excluded
from the study if they were not fluent English speakers. Participants were not offered
compensation for their time.
Materials
The eight word/definition pairs selected on the basis of the calibration
experiment were used as the stimuli: the four High pairs and the four Moderate pairs
shown in Table 1.
Procedures
The procedure was similar to that of the calibration experiment. Subjects were
run individually. Each participant rated six word/definition pairs. Twenty-four subjects
received the four Moderate word/definition pairs followed by two High word/definition
pairs (the Moderate/High group). The remaining 36 subjects received four High
word/definition pairs followed by two moderate word/definition pairs (the
High/Moderate group). The orders of the word/definition pairs within their category of
Moderate or High mental health were randomized and counterbalanced. For the four
word/definition pairs that were presented first there were two possible stimulus orders
that subjects might have seen: the four word pairs in one order (ABCD) or the reverse
order (DCBA) - see Table 1. After the initial presentation of four word/definition pairs
from one category (Moderate or High), subjects only saw two word/definition pairs from
the other category. Each subject saw one of the following combinations of two: (AB)
(BA) (CD) or (DC). Thus sixteen distinct stimulus orders were used in the experiment.
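The count of sixteen orders follows from crossing the two groups, the two possible orders of the first four pairs, and the four possible test-pair combinations (2 x 2 x 4). A minimal Python sketch of that enumeration is given here; the variable and function names are illustrative and are not taken from the study materials.

```python
from itertools import product

# Labels follow the text: two presentation groups, two orders for the
# first four pairs, and four possible two-pair test combinations.
GROUPS = ("Moderate/High", "High/Moderate")
FIRST_ORDERS = ("ABCD", "DCBA")
TEST_PAIRS = ("AB", "BA", "CD", "DC")


def stimulus_orders():
    """Return every (group, first-four order, test pair) combination."""
    return list(product(GROUPS, FIRST_ORDERS, TEST_PAIRS))


orders = stimulus_orders()
print(len(orders))  # number of distinct stimulus orders
```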
Each word/definition pair was placed on a different sheet of paper. Subjects were
asked to read each word/definition pair aloud and rate the mental health of the person
who created the definitions on a scale ranging from 0 = no mental health to 100 = perfect
mental health.
Subjects' ratings were made orally, and the experimenter transcribed their
responses.
Results
Table 2 shows the average ratings for all words in all conditions of Experiment 1.
Each subject's four responses for Moderate words in the Moderate/High group were
averaged; those averages had Mean= 64.9 and SD =10.7. (Table 2 shows the Mean
rating for all words in all conditions of Experiment 1 and Experiment 0). Notice that the
value 64.9 is similar to the calibration data for those words, where their average rating
was 61.4. Each subject's two responses for Moderate words in the High/Moderate group
were also averaged; those averages had Mean= 53.9 and SD=19.4. There was a
significant difference between these two groups' average ratings of Moderate words,
t(58) = 2.62, p < .011. The estimated value of omega-squared, the proportion of variance
in the two populations attributable to differences in their means (an effect-size measure),
was .09. Moderate words were rated higher by the Moderate/High group (which saw
those words first) than by the High/Moderate group.
Conversely, no difference was seen in the two groups' ratings of the High words.
For each subject we also calculated the average of the ratings for the High words in both
the Moderate/High group (M=85.1, SD=18.4) and the High/Moderate group (M=87.5,
SD=9.9). (Both of those, in particular the value for the High/Moderate group, are
similar to the calibration data where the mean rating for those words was 89.9; see Table
2.) There was no significant difference between the two sets of ratings of the High
words, t(58) = .67, p > .50. The estimated value of omega-squared was 0.
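For readers who wish to check the effect sizes, the reported omega-squared values are consistent with the common two-group estimator computed from t, namely (t^2 - 1) / (t^2 + n1 + n2 - 1), with negative estimates set to 0. The Python sketch below assumes that estimator; the thesis does not state which formula was actually used, and the function name is hypothetical.

```python
def omega_squared(t: float, n1: int, n2: int) -> float:
    """Estimate omega-squared for two independent groups from a t statistic.

    Assumes the standard estimator (t^2 - 1) / (t^2 + n1 + n2 - 1).
    Negative estimates are reported as 0.
    """
    estimate = (t**2 - 1) / (t**2 + n1 + n2 - 1)
    return max(0.0, estimate)


# Moderate words: t(58) = 2.62 with group sizes 24 and 36
print(round(omega_squared(2.62, 24, 36), 2))  # 0.09
# High words: t(58) = .67 with the same group sizes
print(omega_squared(0.67, 24, 36))  # 0.0
```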
Prior to the completion of t-tests, all distributions in the experiment were tested
for normality by assessing skewness and kurtosis. In one of the moderate samples
(Moderate/High group), the kurtosis value indicated that the sample might deviate from
a normal distribution. As a result, a Mann-Whitney test was also performed on this data,
comparing the ratings of the Moderate words in the Moderate/High and High/Moderate
groups. Like the t-test, it found the ratings to be significantly lower in the
High/Moderate group (n = 24, 36; U = 584, 280; z = 2.29; p < .022). Effect size was
measured using Cliff's (1993) d, whose absolute value ranges from 0 to 1. For this test, d = .36.
Thus the significant difference found with the t-test was reconfirmed with the
Mann-Whitney test.
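The reported z can be recovered from the U statistics with the usual normal approximation, sketched below in Python. The sketch assumes no correction for ties or continuity, and the function names are illustrative. The same U values give a dominance estimate of (U1 - U2) / (n1 * n2), roughly .35, which is close to the reported d = .36; the small gap may reflect rounding or the handling of tied ratings, which is an assumption here and not something the thesis states.

```python
from math import sqrt


def mann_whitney_z(u: float, n1: int, n2: int) -> float:
    """Normal approximation for U (no tie or continuity correction assumed)."""
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


def cliffs_d_from_u(u1: float, u2: float, n1: int, n2: int) -> float:
    """Dominance statistic implied by the two U values."""
    return (u1 - u2) / (n1 * n2)


z = mann_whitney_z(584, 24, 36)
d = cliffs_d_from_u(584, 280, 24, 36)
print(round(z, 2), round(d, 2))
```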
Discussion
Contrast occurred for the Moderate words: Moderate words were moved
significantly below their baseline ratings when preceded by High words. This is the first
demonstration of a shift from baseline ratings in judgments of mental health.
However, no contrast occurred for the High words. Whether this is a discovery
about contrast or a ceiling effect is not clear. The High words that were selected for the
experiment had a mean rating of nearly 90. It is possible that Moderate words presented
first might have increased the ratings of High words; however the High words were
already at an extreme end on the mental health scale.
CHAPTER 4
EXPERIMENT 2: LOW MENTAL HEALTH/
MODERATE MENTAL HEALTH
Experiment 2A: Method
Participants
Forty-eight people (sex breakdown unknown), ranging from college age to older
adults, participated in this experiment. Participants were randomly selected from the
general population and asked if they would be willing to participate in a psychology
experiment that would take approximately five minutes. Participants were excluded
from the study if they were not fluent English speakers. Participants were not offered
compensation for their time.
Materials
Eight word/definition pairs selected on the basis of the calibration experiment
were used as the stimuli: the four Low pairs and the four Moderate pairs. See Table 1.
Procedures
The procedure was similar to that of Experiment 1. Subjects were run
individually. Each participant rated six word/definition pairs. Twenty-four subjects
received the four Moderate word/definition pairs followed by two Low word/definition
pairs (the Moderate/Low group). The remaining 24 subjects received four Low
word/definition pairs followed by two Moderate word/definition pairs (the
Low/Moderate group). The orders of the word/definition pairs within their category of
Moderate or Low mental health were randomized and counterbalanced. As in Experiment
1, for the four word/definition pairs that were presented first there were two possible
stimulus orders that subjects might have seen: the four word pairs in one order (ABCD)
or the reverse order (DCBA) - see Table 1. After the initial presentation of four
word/definition pairs from one category (Moderate or Low), subjects only saw two
word/definition pairs from the other category. Each subject saw one of the following
combinations of two: (AB) (BA) (CD) or (DC). Thus sixteen distinct stimulus orders
were used in the experiment.
Each word/definition pair was placed on a different sheet of paper. Subjects were
asked to read each word/definition pair aloud and rate the mental health of the person
who created the definitions on a scale ranging from 0 = no mental health to 100 = perfect
mental health.
Subjects' ratings were made orally, and the experimenter transcribed their
responses.
Results
Each subject's four responses for Moderate words in the Moderate/Low group
were averaged; those averages had Mean= 73.6 and SD =11.3. (Table 3 shows the mean
ratings for all words in all conditions of Experiment 2 and Experiment 0. Notice that the
value 73.6 is noticeably higher than the Experiment 0 calibration data for those words,
where their average rating was 61.4). Each subject's two responses for Moderate words
in the Low/Moderate group were also averaged; those averages had Mean= 75.7 and
SD = 13.3. There was not a significant difference between these two groups' average
ratings of the Moderate words, t(46) = .62, p > .54. The estimated value of
omega-squared was 0.
Additionally, no difference was seen in the two groups' ratings of the Low
words. For each subject we also calculated the average of the ratings for the Low words
in both the Moderate/Low group (M=47.9, SD=15.3) and the Low/Moderate group
(M=47.9, SD=18.5). (Both of those are also noticeably higher than the calibration data
where the average rating for those words was 36.7; see Table 3.) There was no
significant difference between the two sets of ratings of the Low words, t(46) = .106,
p > .99. The estimated value of omega-squared was 0.
Discussion
Contrast did not occur in the Low mental health/Moderate mental health
experiment. The ratings of the Moderate words in both conditions and the Low words in
both conditions were nearly identical. Because the scores for both the Moderate words
and the Low words (when presented first) were much higher than the original calibration
data, the experiment was replicated. Although it is difficult to know why the Moderate
scores and Low scores (when presented first) were so different from the calibration data,
one hypothesis is that these subjects were run during the summer. Fewer college
students were subjects during this time; therefore, most of the subjects came from the
general community as opposed to the American University campus. In order to have a
subject sample that more closely resembled the subject sample of Experiment 1,
Experiment 2B was run.
Experiment 2B: Method
Participants
Forty-eight people (sex breakdown unknown), ranging from college age to older
adults, participated in this experiment. Participants were randomly selected from the
general population and asked if they would be willing to participate in a psychology
experiment that would take approximately five minutes. Participants were excluded
from the study if they were not fluent English speakers. Participants were not offered
compensation for their time. This experiment was run during the fall; thus, more college
students were part of the sample.
Materials
Identical to Experiment 2A
Procedures
Identical to Experiment 2A
Results
As in Experiment 2A, each subject's four responses for Moderate words in the
Moderate/Low group were averaged; those averages had Mean = 65.2 and SD = 12.4.
(Table 3 shows the mean ratings for all words in all conditions of Experiment 2B and
Experiment 0. Note that the value 65.2 is more similar to the calibration data for the
Moderate words, 61.45, than to the ratings of the Moderate words presented first in
Experiment 2A.) Each subject's two responses for Moderate words in the Low/Moderate
group were also averaged; those averages had Mean = 67.2 and SD = 18.2. There was not
a significant difference between these two groups' average ratings of the Moderate
words, t(45) = .44, p > .66. The estimated value of omega-squared was 0.
Additionally, no difference was seen in the two groups' ratings of the Low
words. Each subject's ratings of the Low words were averaged. Those averages had
Mean = 45.2 and SD = 18.6 in the Moderate/Low group, and Mean = 40.1 and
SD = 12.8 in the Low/Moderate group. There was no significant
difference between the two sets of ratings for the Low words, t(45) = 1.1, p > .28. The
estimated value of omega-squared was 0.
Discussion
Again, contrast did not occur with either the Low or Moderate words. The scores
for the Moderate words (presented first) were closer to the original calibration data than
they had been in Experiment 2A; however, this change did not affect contrast.
CHAPTER 5
GENERAL DISCUSSION
Contrast occurred for the Moderate words in the Moderate/High condition
(Experiment 1): Moderate words were moved away from their baseline ratings in the
direction of lowered mental health. However, contrast did not occur in the
Moderate/Low condition (Experiment 2A/2B): Moderate words were not moved away
from their baseline ratings. In neither experiment did the Moderate words induce
changes in the ratings of the extreme words.
The results of this study echoed previous research on contrast and mental health.
Similar to the findings of Campbell, Hunt, and Lewis (1957), Perrett (1971), Manis and
Paskewitz (1984), and Wedell, Parducci, and Lane (1990), this study confirmed that
contrast occurs in judgments of mental health. This study also echoed the preliminary
conclusions of Bieri, Orcutt, and Leaman (1963). The presence of the High words shifted
ratings for Moderate words away from the baseline ratings of those Moderate words.
While Bieri et al. demonstrated a general trend towards these results, this study was the
first to confirm overall statistical significance.
This was also the first study to measure contrast in mental health judgments.
Previous research had focused on scales of thought disturbance or pathology. This
experiment instead used a scale of mental health, where one end of the scale represented
unhealthy and the other represented healthy. In the calibration phase of the study both a
mental health scale and a mental illness scale were used to rate the word/definition pairs.
However, the mental health ratings had less variability than the mental illness ratings,
and that was the reason for choosing to use them to study contrast. Perhaps rating mental
illness was generally more difficult than rating mental health. Anecdotally, more
subjects had trouble assigning judgments in the mental illness condition, and often stated
that without more information it was difficult to make judgments.
A potential concern regarding studies with contrast is that significant findings do
not reflect perceptual differences in actual stimuli, but rather, may simply reflect a
change in the way people are using the rating scale. Manis and Paskewitz (1984a)
allayed this worry and demonstrated that a shift of the perceptual locations of the ratings
on the scale was not the cause of the phenomenon. As described previously, both
handwriting samples and words/definitions were used in their study. Subjects first were
exposed to low pathology words/definitions, then subjects were exposed to either
moderate pathology words/definitions or moderate pathology handwriting samples. The
magnitude of contrast was much greater when the moderate words/definitions were
presented than when the handwriting samples were presented. The use of the
handwriting sample group functioned as a control to ensure that the previous exposure to
the low pathology words/definitions was not simply influencing how subjects used the
rating scale.
Also, because six separate hypothesis tests were performed in this experiment
(and only one produced significant results), it is necessary to consider the possibility that
a Type I error might be present. To account for this possibility, a Bonferroni
correction was used, which suggested that the critical p-value should be .0083. In the
significant Moderate/High condition the p-value was .011, close enough to suggest that
the Moderate words were actually affected by the High words. Also, it is possible to
perform the Bonferroni correction with an N of two because only two t-tests were done
within the Moderate/High condition. This correction provides a critical p-value of .025,
which this experiment satisfactorily meets.
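The two critical values discussed above follow from dividing a family-wise alpha by the number of tests. The short sketch below assumes a family-wise alpha of .05 (implied by the .0083 and .025 values but not stated explicitly); the function name is illustrative.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test critical p-value under a Bonferroni correction."""
    return alpha / n_tests


p_observed = 0.011  # p-value reported for the Moderate/High comparison

# Family-wise alpha of .05 is an assumption inferred from the reported cutoffs.
for n_tests in (6, 2):
    critical = bonferroni_alpha(0.05, n_tests)
    verdict = "significant" if p_observed < critical else "not significant"
    print(f"{n_tests} tests: critical p = {critical:.4f}, p = {p_observed} is {verdict}")
```

Under these assumptions the observed p-value falls short of the six-test criterion but meets the two-test criterion, which matches the reasoning in the paragraph above.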
Consistent with previous experiments of this type, the current experiment
produced contrast. In the experiment, a series of one kind of stimulus was presented to
each subject (4 Healthy words). Next, a series of another kind of stimulus was presented to
each subject (2 Moderate words). In reporting results, the average of the Healthy words
and the average of the Moderate words were calculated. As Lockhead (1982) discussed,
this kind of design would likely produce contrast. If the experiment were to have
presented each subject with only one Healthy stimulus followed by one Moderate
stimulus (to be evaluated), results might have been observed to be more assimilative.
Also, as Zellner and Cobuzzi (2008) demonstrated, assimilation is more likely to
occur when presentation of stimuli is simultaneous. In the context of mental health
stimuli, a simultaneous presentation would not have been feasible.
Given the nature of the experiment's design, it is interesting to wonder why
contrast did not occur in the Low/Moderate condition. One possibility is that there was
not a high degree of consistency among the raters on any of the less mentally healthy
items. The Low and Moderate words had standard deviations that were 16.24 and 13.62
respectively (the SD for the High words was a much lower 4.17). This lack of
intersubject agreement in both the Low and Moderate stimuli could explain why it was
difficult to obtain consistent results within the contrast phase. The high variability
among the Moderate words could also explain why the Moderate words did not affect
the High words in Experiment 1. For example, it is possible that for some subjects a
word that was classified as having Low Mental Health might have induced more
mentally healthy perceptions than some of the words that were classified as High Mental
Health. If a subject perceived a Low Mental Health word as having similar or even
higher mental health than a word classified as High Mental Health, contrast would surely
be limited. In the future, it would be beneficial to devise Low and Moderate mental
health stimuli that could yield more consistent intersubject agreement.
Another possibility as to why contrast did not occur in the Low/Moderate
condition is that the perceived differences in mental health between Low and Moderate
words are not as great as the perceived differences between Moderate and High words.
Despite the fact that there was a 30-point difference between Low and Moderate
and a 30-point difference between Moderate and High (on the 100-point mental health
scale), the scale may not have been an interval scale. If the perceived difference between Low and
Moderate words was smaller than the perceived difference between High and Moderate
words, contrast between the Low and Moderate words may have been less likely to
occur. For instance, if two types of words were already perceived to be similar in degree
of pathology, the magnitude of the contrast effect would diminish, if not disappear.
Future research might be able to find other Low stimuli that do induce measurable
contrast.
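The spacing between the categories on the rating scale can be checked directly against the calibration averages in Table 1. The sketch below simply subtracts those reported averages; the variable names are illustrative, and the numbers are the rounded category averages from Table 1.

```python
# Category averages from the calibration data (Table 1).
calibration_means = {"Low": 36.7, "Moderate": 61.5, "High": 89.9}

low_to_moderate = round(calibration_means["Moderate"] - calibration_means["Low"], 1)
moderate_to_high = round(calibration_means["High"] - calibration_means["Moderate"], 1)

print(f"Low to Moderate gap:  {low_to_moderate}")
print(f"Moderate to High gap: {moderate_to_high}")
```

Both gaps are close to, though somewhat under, 30 points, with the Low to Moderate gap the smaller of the two. This says nothing about perceived differences, which is the question the paragraph above raises.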
While it is necessary to examine potential procedural reasons why the
experimental paradigm did not produce contrast between Low and Moderate words, it is
possible that the lack of contrast may speak to something more general about the
phenomenon. Maybe the contrast effect is not as general as Wundt suggested. There
might be an aspect of the effect that makes it more likely to occur in one direction than
the other with this sort of stimuli.
Finally, Moderate words never induced contrast on either type of extreme word.
It is possible that the extreme stimuli were not changed from their baseline ratings
because of perceptual ceiling and floor effects; both the Low mental health and High
mental health stimuli were already toward the end points of the mental health scale
(M=36.67 and M=89.90, respectively). While the Low mental health stimuli still had
room to migrate to the lower end of the scale, the aforementioned lack of consistency
among the ratings might have impacted those kinds of results. Again, the possibility
exists that, in general, Moderate stimuli do not induce contrast. Alternatively, it could
be that contrast operates only in one direction for mental health judgments as it seems to
in loudness.
The clinical implications of this study mirror many previous studies that
examined contrast and mental health: contrast affects clinicians' judgments. For
example, private practice clinicians tend to rate the same patients as more
psychologically troubled than do state hospital clinicians (Campbell et al., 1957). While
this study did not use clinicians as subjects, the results support the same conclusions. As
much as clinicians desire to assess and treat clients independently from one another,
previous exposure to other clients makes this impossible.
Clinicians should be aware of the effect that previous clients can have on the
judgment of their future clients. While awareness that contrast occurs may not entirely
eliminate the effect, it would likely lessen it. Also, clinicians should use strategies that
have been shown to lessen contrast in terms of rating mental health and illness. As
described previously, Wedell et al. (1990) showed that the magnitude of contrast
decreased when subjects were given a detailed description of every point of a particular
scale. If clinicians need to subjectively score clients, it is essential that they have a
detailed exemplar for every particular category. Also, Wedell et al. (1990) found that as
the number of rating categories increased, the degree of contrast decreased. Therefore, if
a clinician has a choice to use a larger scale to rate clients, it would likely increase the
accuracy of actual judgments and limit the effect of contrast. Finally, as Manis and
Paskewitz (1984a) demonstrated, when two different types or categories of stimuli are
presented, the magnitude of contrast decreases significantly and its impact is no longer consistent.
Perhaps clinicians could use this categorization effect to their advantage; for example,
diagnose one client with an affective disorder (one category) prior to diagnosing a client
with an anxiety disorder (a distinct category).
The current study may also have some implications that do not mirror previous
research. In these experiments, the pathology in the word/definition stimuli was
disorganization. When the majority of these studies were done in the nineteen fifties and
sixties, disorganization may have been the extent of the lay public's general knowledge
about mental illness. Currently, however, a lay person may have a more complex view
of mental health and illness. For example, information about post-traumatic stress
disorder, depression, and anxiety disorders is common in the media. In this
experiment, when subjects were asked to reflect on the mental health of the individual
who created the definition, a modern subject may have had a very different conception of
mental health than did a subject fifty years ago. Finally, contrast was never found with
the Low mentally healthy words; therefore, the ability to have confidence in all clinical
implications of the study is limited.
This experiment corroborated previous findings on contrast and mental health.
Although Bieri et al. (1963) reported some instances of movement away from
baseline ratings, this was the first study to show definitively that shifts from
baseline ratings occur. Finally, this study was one of the first studies to demonstrate
contrast in terms of mental health, not pathology. To summarize, Moderate words were
moved away from their baseline ratings in the direction of lowered mental health in the
High/Moderate condition of Experiment 1. Future studies should examine whether
changing some variables could reveal contrast with Low and Moderate words.
APPENDIX A
INSTRUCTIONS
I'll have to read these instructions to you or else I'll leave something out.
We are interested in learning about your impressions of people's mental health.
What will happen in this experiment is that you will read out loud some word definitions that come from a wide variety of people. What we would like you to do is to evaluate the mental health of the people who provided the definitions. After you read each word and its definition, you will rate the mental health of the individual who created the definition. There are no right or wrong answers, we are only interested in your impressions.
The ratings will be on a zero to a hundred point scale.
(Hand Scale) Notice that on the scale:
Zero indicates someone with a total lack of mental health, while one hundred indicates someone with perfect mental health. So, for example, a rating of 85 would represent someone you perceive as having very good mental health.
What we need you to do is to read each word and definition and then rate the mental health of the person.
Do not rate definitions according to how intelligent you think the person who made the definition was. Even though a definition may be incorrect, that does not necessarily indicate anything about the mental health of the individual.
Try to be as discriminating as possible.
Any questions?
(Hand out definitions)
After the study, hand out thank you sheet.
APPENDIX B
TABLES
Table 1: Stimuli List
Contrast Stimuli    Definition                                            Mean Rating   Standard Deviation

Low
  A. Smoke          Rings and poof its magic                              37.9          16.7
  B. Turban         Wraps royally around the clock                        33.7          16.0
  C. Combat         Guns that shoot the noise hurts                       37.8          15.9
  D. Lipstick       Talk too much mouth glued together                    37.3          16.4

Moderate
  A. King           Rules the roost, man of the land                      64.6          12.9
  B. Philosophy     Thoughts fill my vacant mind                          62.1          14.3
  C. Grass          Mow all day, grows at night                           59.4          14.6
  D. Bride          I do answers the popped question                      59.7          12.7

High
  A. Runway         Long stretch of road where planes take off and land   90.6           4.3
  B. Chaos          Confusion, opposite of order                          89.6           4.4
  C. Cushion        A padded device for comfort                           89.0           3.6
  D. Vase           Object used to contain flowers                        90.4           4.4

Low Average = 36.7, Standard Deviation = 16.2
Moderate Average = 61.5, Standard Deviation = 13.6
High Average = 89.9, Standard Deviation = 4.2
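The category summaries under Table 1 can be compared with the item-level values. The sketch below assumes that each category summary is a simple average of the four item means and, separately, of the four item standard deviations; the table does not say how the summaries were computed, so this is an assumption, and the helper name is illustrative.

```python
from statistics import mean

# (mean rating, SD) for each word/definition pair, copied from Table 1.
table1 = {
    "Low": [(37.9, 16.7), (33.7, 16.0), (37.8, 15.9), (37.3, 16.4)],
    "Moderate": [(64.6, 12.9), (62.1, 14.3), (59.4, 14.6), (59.7, 12.7)],
    "High": [(90.6, 4.3), (89.6, 4.4), (89.0, 3.6), (90.4, 4.4)],
}


def category_summary(items):
    """Average the item means and, separately, the item SDs (assumed method)."""
    return mean(m for m, _ in items), mean(sd for _, sd in items)


for category, items in table1.items():
    avg_mean, avg_sd = category_summary(items)
    print(f"{category}: average = {avg_mean:.3f}, average SD = {avg_sd:.3f}")
```

Under this assumption the averages land within rounding of the summary values reported for each category.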
Table 2: Words and Ratings-Moderate/High
Moderate Words        Mean Rating (SD)
  Calibration         61.4 (13.6)
  High/Moderate       53.6 (19.4)
  Moderate/High       64.9 (10.7)

High Words            Mean Rating (SD)
  Calibration         89.9 ( 7.3)
  Moderate/High       85.1 (18.4)
  High/Moderate       87.5 ( 9.9)
Table 3: Words and Ratings-Moderate/Low
Moderate Words             Mean Rating (SD)
  Calibration              61.4 (13.6)
  Moderate/Low (2A)        73.6 (11.3)
  Low/Moderate (2A)        75.7 (13.3)
  Moderate/Low (2B)        65.2 (12.4)
  Low/Moderate (2B)        67.2 (18.2)

Low Words                  Mean Rating (SD)
  Calibration              37.6 (18.3)
  Moderate/Low (2A)        47.9 (18.5)
  Low/Moderate (2A)        47.9 (15.3)
  Moderate/Low (2B)        45.2 (18.6)
  Low/Moderate (2B)        40.1 (12.8)
REFERENCES
Arnhoff, F. N. (1954). Some factors influencing the unreliability of clinical judgments. Journal of Clinical Psychology, 10, 272-275.
Asch, S. E. (1956). Studies of independence and conformity: I. A minority of one against a unanimous majority. Psychological Monographs: General and Applied, 70, (416).
Bieri, J., Orcutt, B.A., & Leaman, R. (1963). Anchoring effects in sequential clinical judgments. Journal of Abnormal and Social Psychology, 67, 616-623.
Blankenship, K. L., Wegener, D. T., Petty, R. E., Detweiler-Bedell, B., & Macy, C. L. (2008). Elaboration and consequences of anchored estimates: An attitudinal perspective on numerical anchoring. Journal of Experimental Social Psychology, 44, 1465-1476.
Campbell, D. T., Hunt, W. A., & Lewis, N. A. (1957). The effects of assimilation and contrast in judgments of clinical materials. American Journal of Psychology, 70, 347-360.
Campbell, D.T., Hunt, W.A., & Lewis, N.A. (1958). The relative susceptibility of two rating scales to disturbances resulting from shifts in stimulus context. Journal of Applied Psychology, 42, 213-217.
Cliff, N. (1993). Dominance statistics: Ordinal analyses to answer ordinal questions. Psychological Bulletin, 114, 494 - 509.
Conner, M., Land, D., & Booth, D. (1987). Effect of stimulus range on judgements of sweetness intensity in a lime drink. British Journal of Psychology, 78, 357-364.
Dolese, M., Zellner, D., Vasserman, M., & Parker, S. (2005). Categorization affects hedonic contrast in the visual arts. Bulletin of Psychology and the Arts, 5, 21-25.
Epley, N., & Gilovich, T. (2004). Are adjustments insufficient? Personality and Social Psychology Bulletin, 30, 447-460.
Frederiksen, J.R. (1975). Two models for psychophysical judgment: Scale invariance with changes in stimulus range. Perception & Psychophysics, 17, 147-157.
Higgins, E.T., Rholes, W.S., & Jones, C.R. (1977). Category accessibility and impression formation. Journal of Experimental Social Psychology, 54, 181-192.
Kenrick, D.T., & Gutierres, S.E. (1980). Contrast effects and judgments of physical attractiveness: When beauty becomes a social problem. Journal of Personality and Social Psychology, 38, 131-140.
Locke, J. (1964). An essay concerning human understanding. New York: New American Library. (Original work published 1690).
Lockhead, G.R. (1982). Sequential predictors of choice in psychophysical tasks. In Kornblum, S., & Requin, J. (Eds.) Preparatory states and processes. (pp. 27-47) Hillsdale, NJ: Lawrence Erlbaum Associates, Publishers.
Manis, M., & Paskewitz, J. R. (1984a). Specificity in contrast effects: Judgments of psychopathology. Journal of Experimental Social Psychology, 20, 217-230.
Manis, M., & Paskewitz, J. R. (1984b). Judging psychopathology: Expectation and contrast. Journal of Experimental Social Psychology, 20, 363-381.
Martin, L. L. (1986). Set/reset: Use and disuse of concepts in impression formation. Journal of Personality and Social Psychology, 51, 493-504.
Mussweiler, T., & Englich, B. (2005). Subliminal anchoring: Judgmental consequences and underlying mechanisms. Organizational Behavior and Human Decision Processes, 98, 133-143.
Parducci, A. (1968). The Relativism of Absolute Judgments. Scientific American, 219, 84-90.
Parducci, A., Knobel, S. & Thomas, C. (1976). Independent contexts for category ratings: A range-frequency model. Psychological Review, 72, 407-408.
Parker, S., Bascom, J., Rabinovitz, B., & Zellner, D. (in press). Hedonic contrast with musical stimuli. Psychology of Aesthetics, Creativity and the Arts.
Parker, S., Murphy, D., & Schneider, B. (2002). Top-down gain control in the auditory system: Evidence from identification and discrimination experiments. Perception & Psychophysics, 64, 598-615.
Parker, S., & Schneider, B. (1994). The stimulus range effect: Evidence for top-down control of sensory intensity in audition. Perception & Psychophysics, 56, 1-11.
Perrett, L. F. (1971). Immediate and background contextual effects in clinical judgment. Unpublished doctoral dissertation, University of California, Los Angeles.
Pol, H.E.H., Hijman, R., Baare, W.F.C., & van Ree, J.M. (1998). Effects of context on judgments of odor intensities in humans. Chemical Senses, 23, 131-135.
Rota, L. M., & Zellner, D.A. (2007). The categorization effect in hedonic contrast: Experts differ from novices. Psychonomic Bulletin & Review, 14, 179-183.
Samsom, D., & Rachman, S. (1992). A search for contrast effects with fear evoking stimuli. British Journal of Clinical Psychology, 31, 33-44.
Schwarz, N., & Bless, H. (1992). Constructing reality and its alternatives: An inclusion/exclusion model of assimilation and contrast effects in social judgment. In L. L. Martin & A. Tesser (Eds.), The construction of social judgments (pp. 217-245). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Sherif, M. (1936). The psychology of social norms. New York: Harper.
Stapel, D.A. (2007). In the mind of the beholder: The interpretation comparison model of accessibility effects. In Stapel, D.A., & Suls, J. (Eds.) Assimilation and contrast in social psychology. (pp. 143-164) New York, NY: Psychology Press, Inc.
Stapel, D.A., Koomen, W., & van der Pligt, J. (1997). Categories of category accessibility: The impact of trait concept versus exemplar priming on person judgments. Journal of Experimental Social Psychology, 33, 47-76.
Suls, J., & Wheeler, L. (2007). Psychological magnetism: A brief history of assimilation and contrast in psychology. In Stapel, D.A., & Suls, J. (Eds.) Assimilation and contrast in social psychology. (pp. 9-44) New York, NY: Psychology Press, Inc.
Wedell, D.H., Hicklin, S.K., & Smarandescu, L.O. (2007). Contrasting models of assimilation and contrast. In Stapel, D.A., & Suls, J. (Eds.) Assimilation and contrast in social psychology. (pp. 45-74) New York, NY: Psychology Press, Inc.
Wedell, D.H., Parducci, A., & Lane, M. (1990). Reducing the dependence of clinical judgment on the immediate context: Effects of number of categories and types of anchors. Journal of Personality and Social Psychology, 58, 319-329.
Wilson, T.D., Lisle, D.J., Kraft, D., & Wetzel, C.G. (1989). Preferences as expectation-driven inferences: Effects of affective expectation on affective experience. Journal of Personality and Social Psychology, 56, 519-530.
Wundt, W. (1907). Outline of Psychology. New York, New York: G. E. Stechert & Co.
Yin, R. K., & Saltzstein, H. D. (1968). Transfer of social influence effects on psychological judgments. Journal of Psychology, 68, 313-319.
Zellner, D. A., & Cobuzzi, J. (Eds.) (2008). Fechner Day 2008. Proceedings of the 24th Annual Meeting of the International Society for Psychophysics, Toronto, Canada: The International Society for Psychophysics.
Zellner, D. A., Allen, D., Henley, M., & Parker, S. (2006). Hedonic contrast and condensation: Good stimuli make mediocre stimuli less good and less different. Psychonomic Bulletin & Review, 13, 235-239.
Zellner, D.A., Rohm, E.A., Bassetti, T.L., & Parker, S. (2003). Compared to what? Effects of categorization on hedonic contrast. Psychonomic Bulletin & Review, 10, 468-473.
Zellner, D.A., Strickhouser, D., & Tornow, C.E. (2004). Disconfirmed hedonic expectations produce perceptual contrast, not assimilation. American Journal of Psychology, 117, 363-387.