Am J Psychiatry 161:12, December 2004 2163
Reviews and Overviews
http://ajp.psychiatryonline.org
The Hamilton Depression Rating Scale:
Has the Gold Standard Become a Lead Weight?
R. Michael Bagby, Ph.D.
Andrew G. Ryder, M.A.
Deborah R. Schuller, M.D.
Margarita B. Marshall, B.Sc.
Objective: The Hamilton Depression Rating Scale has been the gold standard for the assessment of depression for more than 40 years. Criticism of the instrument has been increasing. The authors review studies published since the last major review of this instrument in 1979 that explicitly examine the psychometric properties of the Hamilton depression scale. The authors’ goal is to determine whether continued use of the Hamilton depression scale as a measure of treatment outcome is justified.
Method: MEDLINE was searched for studies published since 1979 that examine psychometric properties of the Hamilton depression scale. Seventy studies were identified and selected, and then grouped into three categories on the basis of the major psychometric properties examined: reliability, item-response characteristics, and validity.
Results: The Hamilton depression scale’s internal reliability is adequate, but many scale items are poor contributors to the measurement of depression severity; others have poor interrater and retest reliability. For many items, the format for response options is not optimal. Content validity is poor; convergent validity and discriminant validity are adequate. The factor structure of the Hamilton depression scale is multidimensional but with poor replication across samples.
Conclusions: Evidence suggests that the Hamilton depression scale is psychometrically and conceptually flawed. The breadth and severity of the problems militate against efforts to revise the current instrument. After more than 40 years, it is time to embrace a new gold standard for assessment of depression.
(Am J Psychiatry 2004; 161:2163–2177)
The Hamilton Depression Rating Scale (1) was devel-
oped in the late 1950s to assess the effectiveness of the first
generation of antidepressants and was originally pub-
lished in 1960. Although Hamilton (1) recognized that the
scale had “room for improvement” (p. 56) and that further
revision was necessary, the scale quickly became the stan-
dard measure of depression severity for clinical trials of
antidepressants (2, 3). The Hamilton depression scale has
retained this function and is now the most commonly
used measure of depression (3). Our objective in this arti-
cle is to provide a review of the Hamilton depression scale
literature published since the last major evaluation of its
psychometric properties, more than 20 years ago (4). More
recent reviews have appeared (3, 5–7), but they have not systematically examined the literature with regard to a
broad range of measurement issues. Significant develop-
ments in psychometric theory and practice have been
made since the 1950s and need to be applied to instru-
ments currently in use. We evaluate the Hamilton depres-
sion scale in light of these current standards and conclude
by presenting arguments for and against retaining, revis-
ing, or rejecting the Hamilton depression scale as the gold
standard for assessment of depression.
Method
Studies for the review were identified by means of MEDLINE searches for both “depression” and “Hamilton.” All studies published during the period since the last major review (January 1980
to May 2003) were considered. Studies selected for review had to
be explicitly designed to evaluate empirically the psychometric
properties of the instrument or to review conceptual issues re-
lated to the instrument’s development, continued use, and/or
shortcomings. At least 20 published versions of the Hamilton de-
pression scale exist, including both longer and shortened ver-
sions. This review was limited to studies that examined the origi-
nal 17-item version, as the majority of the studies that evaluated
the scale’s psychometrics used the 17-item version. Only a small
number of studies evaluated other versions, and most of these
versions contain the original 17 items. Seventy articles met the selection criteria and were categorized into three groups on the basis of the major psychometric property examined: reliability, item response, and validity. Table 1 lists the articles included in
the review.
Results
Reliability
Clinician-rated instruments should demonstrate three
types of reliability: 1) internal reliability, 2) retest reliability,
and 3) interrater reliability. Cronbach’s alpha statistic (78)
is used to evaluate internal reliability, and estimates ≥0.70
TABLE 1. Characteristics of Studies Examining the Psychometric Properties of the Hamilton Depression Rating Scale a
Study | Year | Language | N | % of Female Subjects | Subjects | Psychometric Properties Examined (Reliability, Item Response, Validity)
Aben et al. (8) | 2002 | Dutch | 202 | 46 | Stroke patients | ×
Addington et al. (9) | 1990 | English | 250 | NR | Schizophrenia inpatients | ×
Addington et al. (10) | 1996 | English | 112 c | 60 | Schizophrenia inpatients | × ×
Addington et al. (10) | 1996 | English | 89 d | NR | Schizophrenia inpatients | × ×
Akdemir et al. (11) | 2001 | Turkish | 94 | 66 | Psychiatric patients | × ×
Baca-García et al. (12) | 2001 | Spanish | 1 | 100 | Dysthymia outpatient | ×
Bech (5) | 1981 | Danish | 66 | 70 | Depressed inpatients | × ×
Bech et al. (13) | 1992 | Multilingual | 1,128 | NR | Psychiatric patients | × ×
Bech et al. (14) | 2002 | Danish | 650 | NR | Psychiatric patients | × ×
Berard and Ahmed (15) | 1995 | English | 22 | 64 | Elderly psychiatric outpatients | × ×
Berrios and Bulbena-Villarasa (16) | 1990 | Castilian | 1,204 | 59 | Psychiatric outpatients | × ×
Brown et al. (17) | 1995 | English | 259 | NR | Medical outpatients | ×
Carroll et al. (18) | 1981 | English | 278 | NR | Depressed patients | ×
Cicchetti and Prusoff (19), Time 1 | 1983 | English | 86 | NR | Depressed outpatients | ×
Cicchetti and Prusoff (19), Time 2 | 1983 | English | 81 | NR | Depressed outpatients | ×
Craig et al. (20) | 1985 | English | 32 | 0 | Schizophrenia inpatients | × ×
Daradkeh et al. (21) | 1997 | Arabic | 73 | 58 | Depressed inpatients | × ×
Deluty et al. (22) | 1986 | English | 70 | 39 | Psychiatric inpatients | × ×
Demitrack et al. (23) | 1998 | NR | 85 | 66 | Professionals/laypersons | ×
Entsuah et al. (24), Sample 1 | 2002 | Multilingual | 865 | 65 | Psychiatric patients | ×
Entsuah et al. (24), Sample 2 | 2002 | Multilingual | 757 | 64 | Psychiatric patients | ×
Entsuah et al. (24), Sample 3 | 2002 | Multilingual | 450 | 62 | Psychiatric patients | ×
Faries et al. (25) | 2000 | NR | 1,658 | NR | Depressed outpatients | ×
Feinberg et al. (26) | 1981 | English | NR | NR | Depressed patients | ×
Fleck et al. (27) | 1995 | French | 60 | 77 | Psychiatric outpatients | ×
Fuglum et al. (28) | 1996 | Danish | NR | NR | Depressed patients | × ×
Gastpar and Gilsdorf (29) | 1990 | Multilingual | 122 | 66 | Depressed patients | ×
Gibbons et al. (30) | 1993 | English | 370 | 72 | Psychiatric patients | × ×
Gilley et al. (31), Sample 1 | 1995 | English | 185 | 56 | Alzheimer’s disease patients | × ×
Gilley et al. (31), Sample 2 | 1995 | English | 54 | 39 | Comparison subjects with normal cognition | × ×
Gilley et al. (31), Sample 3 | 1995 | English | 57 | 37 | Parkinson’s disease patients | × ×
Gottlieb et al. (32) | 1988 | English | 43 | 67 | Neurological patients | × ×
Gullion and Rush (33) | 1998 | English | 324 | 67 | Depressed patients | ×
Hammond (34) | 1998 | English | 100 | 74 | Elderly medical patients | ×
Hooijer et al. (35) | 1991 | Flemish | 56 | NR | Mental health professionals | ×
Hotopf et al. (36) | 1998 | English | 49 | 65 | Primary care patients | ×
Kobak et al. (37) | 1999 | English | 113 | NR | Psychiatric patients/community comparison subjects | × ×
Koenig et al. (38) | 1995 | English | 38 | 55 | Elderly medical patients | ×
Lambert et al. (39) | 1986 | NR | 1,850 | NR | Psychiatric patients | ×
Lambert et al. (40) | 1988 | English | 13 | 31 | Psychiatric inpatients/outpatients | ×
Leentjens et al. (41) | 2000 | Dutch | 63 | 37 | Parkinson’s disease patients | ×
Leung et al. (42) | 1999 | Chinese | 93 | 56 | Psychiatric inpatients | × ×
McAdams et al. (43) | 1996 | English | 101 | 23 | Schizophrenia outpatients | ×
Maier and Philipp (44) | 1985 | German | 280 | NR | Psychiatric outpatients | ×
Maier et al. (45), Sample 1 | 1988 | German | 130 | NR | Psychiatric inpatients | × × ×
Maier et al. (45), Sample 2 | 1988 | German | 48 | NR | Psychiatric inpatients | × × ×
Maier et al. (46) | 1988 | German | 130 | NR | Psychiatric inpatients | ×
Marcos and Salamero (47) | 1990 | Spanish | 234 | 76 | Community geriatric subjects | ×
Meyer et al. (48) | 2001 | English | 196 | 68 | Medical outpatients | ×
Middelboe et al. (49) | 1994 | Danish | 36 | 64 | Medical outpatients | ×
Moberg et al. (50) | 2001 | English | 20 | 70 | Geriatric consultation/liaison patients | ×
Mottram et al. (51) | 2000 | English | 433 | 73 | Elderly psychiatric referrals | ×
Naarding et al. (52), Sample 1 | 2002 | Dutch | 44 | 36 | Stroke inpatients | ×
Naarding et al. (52), Sample 2 | 2002 | Dutch | 274 | 60 | Alzheimer’s disease patients | ×
Naarding et al. (52), Sample 3 | 2002 | Dutch | 85 | 40 | Parkinson’s disease patients | ×
O’Brien and Glaudin (53), Sample 1 | 1988 | English | 183 | 70 | Psychiatric outpatients | ×
O’Brien and Glaudin (53), Sample 2 | 1988 | English | 182 | 70 | Psychiatric outpatients | ×
(continued)
reflect adequate reliability (79, 80). The internal reliability
of individual items is calculated by using corrected item-
to-total correlation with Pearson’s r; items should have a
correlation greater than 0.20 (79, 80). Retest reliability as-
sesses the extent to which multiple administrations of the
scale generate the same results. When scores on an instrument are expected to change in response to effective treatment, it is necessary to demonstrate that these scores remain the same in the absence of treatment. Interrater
reliability assesses the extent to which multiple raters gen-
erate the same result. Although Pearson’s r is often used to
compute these estimates, the preferred method is the
intraclass r (81), which allows for adjustment for agree-
ment by chance. Estimates of retest and interrater reliabil-
ity should be at a minimum of 0.70 (Pearson’s r) and 0.60
(intraclass r) (82). For retest reliability of scale items, Pear-
son’s r >0.70 is considered acceptable (83).
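The internal reliability criteria described above can be illustrated with a minimal sketch. It assumes a complete subjects-by-items matrix of ratings; the function names and the small `ratings` array are hypothetical and are not data from any reviewed study. Cronbach's alpha is computed from item and total-score variances, and each item is correlated with the sum of the remaining items (the corrected item-to-total correlation).

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (subjects x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # sample variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

def corrected_item_total(scores):
    """Pearson r of each item with the sum of the *other* items."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    return np.array([
        np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
        for j in range(scores.shape[1])
    ])

# Hypothetical ratings: 6 subjects x 4 items (illustrative only).
ratings = np.array([
    [0, 1, 0, 2],
    [1, 1, 1, 0],
    [2, 2, 1, 1],
    [2, 3, 2, 2],
    [3, 3, 2, 0],
    [4, 4, 3, 1],
])

alpha = cronbach_alpha(ratings)
item_r = corrected_item_total(ratings)

# Apply the thresholds quoted in the text: alpha >= 0.70, item r > 0.20.
scale_adequate = alpha >= 0.70
weak_items = np.where(item_r <= 0.20)[0]
```

In this illustrative matrix the scale as a whole meets the 0.70 criterion even though one item fails the item-level criterion, which is the pattern the review goes on to describe for the Hamilton depression scale.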
Internal Reliability
Table 2 summarizes the results from studies examining
internal reliability of the total Hamilton depression scale. Es-
timates ranged from 0.46 to 0.97, and 10 studies reported es-
timates ≥0.70. Table 3 summarizes the studies that exam-
ined internal reliability at the item level. The majority of
Hamilton depression scale items show adequate reliability.
Six items met the reliability criteria in every sample (guilt, middle insomnia, psychic anxiety, somatic anxiety, gastrointestinal, general somatic), and an additional six items met the criteria in all but one sample (depressed mood, suicide, early insomnia, late insomnia, work and interests, hypochondriasis). Loss of insight was the item with the most variable findings, suggesting a potential problem with this item.
Interrater Reliability
Total Hamilton depression scale interrater reliabilities
are displayed in Table 2. Pearson’s r ranged from 0.82 to
TABLE 1. Characteristics of Studies Examining the Psychometric Properties of the Hamilton Depression Rating Scale a (continued)
Study | Year | Language | N | % of Female Subjects | Subjects | Psychometric Properties Examined (Reliability, Item Response, Validity)
O’Hara and Rehm (54) | 1983 | English | 20 | 0 | Depressed outpatients | ×
Olsen et al. (55) | 2003 | Danish | 91 | 74 | Psychiatric and medical patients | ×
Onega and Abraham (56) | 1997 | English | 206 | 70 | Geriatric psychiatric outpatients | ×
Pancheri et al. (57) | 2002 | Italian | 186 | 62 | Depressed outpatients | × ×
Paykel (58), Sample 1 | 1990 | English | 101 | NR | Depressed inpatients | × ×
Paykel (58), Sample 2 | 1990 | English | 118 | NR | Psychiatric outpatients | × ×
Paykel (58), Sample 3 | 1990 | English | 167 | NR | General practice outpatients | × ×
Potts et al. (59) | 1990 | English | 694 | 74 | Depressed outpatients | ×
Ramos-Brieva and Cordero-Villafafila (60) | 1988 | Spanish | 135 | 70 | Depressed inpatients/outpatients | × ×
Rehm and O’Hara (61) | 1985 | English | 158 | 100 | Community (symptomatic) subjects | × ×
Reynolds and Kobak (62) | 1995 | English | 357 | 59 | Psychiatric outpatient/nonreferred community subjects | ×
Riskind et al. (63) | 1987 | English | 191 | 54 | Psychiatric outpatients | × ×
Santor and Coyne (64), Sample 1 | 2001 | English | 316 | NR | Primary care outpatients | ×
Santor and Coyne (64), Sample 2 | 2001 | English | 318 | 70 | Depressed outpatients | ×
Santor and Coyne (65) | 2001 | English | 732 | NR | Depressed patients | ×
Sayer et al. (66) | 1993 | English | 114 | 61 | Psychiatric inpatients | × ×
Senra Rivera et al. (67) | 2000 | Castilian | 52 | 65 | Depressed patients | × ×
Shain et al. (68) | 1990 | English | 45 | 64 | Depressed adolescent inpatients | ×
Smouse et al. (69) | 1981 | English | NR | NR | Depressed patients | ×
Steinmeyer and Möller (70) | 1992 | German | 223 e | 68 | Psychiatric inpatients | ×
Steinmeyer and Möller (70) | 1992 | German | 174 f | 68 | Psychiatric inpatients | ×
Strik et al. (71), Sample 1 | 2001 | Dutch | 156 | 0 | Medical patients | × ×
Strik et al. (71), Sample 2 | 2001 | Dutch | 50 | 100 | Medical patients | × ×
Teri and Wagner (72) | 1991 | English | 75 | 68 | Alzheimer’s patients | ×
Thase et al. (73) | 1983 | English | 147 | 100 | Depressed outpatients | × ×
Thompson et al. (74) | 1998 | English | 242 | 100 | Psychiatric referrals | ×
Whisman et al. (75) | 1989 | English | 70 | 100 | Depressed outpatients | × ×
Williams (76) | 1988 | English | 23 | 65 | Psychiatric inpatients | ×
Zheng et al. (77) | 1988 | Chinese | 329 | 47 | Psychiatric inpatients/outpatients | × ×
a Studies were published between January 1980 and May 2003 and identified by means of a MEDLINE search for both “depression” and “Hamilton.”
b NR = not reported.
c Number of subjects providing data at time 1.
d Number of subjects providing follow-up data 3 months after admission.
e Number of subjects providing baseline (i.e., pretreatment) data.
f Number of subjects providing endpoint (week 6) data after treatment with either paroxetine or amitriptyline.
0.98, and the intraclass r ranged from 0.46 to 0.99. Some
investigators provided evidence that the skill level or ex-
pertise of the interviewer and the provision of structured
queries and scoring guidelines affect reliability (19, 23, 35,
54). Across studies, however, the best-estimate mean of interrater reliability for studies reporting higher levels of interviewer skill and use of expert raters, structured queries, and scoring guidelines did not differ significantly from that for other studies (z=0.81, n.s.).
At the individual item level, interrater reliability is poor
for many items. Cicchetti and Prusoff (19) assessed reli-
ability before treatment initiation and 16 weeks later at
trial end. Only early insomnia was adequately reliable be-
fore treatment, and only depressed mood was adequately
reliable after treatment. Thirteen items had coefficients
<0.50 before treatment, and 11 items had coefficients
<0.50 after treatment. Rehm and O’Hara (61) performed a
similar analysis with data from two samples. Six items
showed adequate reliability in the first sample (early in-
somnia, middle insomnia, late insomnia, somatic anxiety,
gastrointestinal, loss of libido), as did 10 in the second
sample (depressed mood, guilt, suicide, early insomnia,
middle insomnia, late insomnia, work/interests, psychic
anxiety, somatic anxiety, gastrointestinal). Loss of insight
showed the lowest interrater agreement in both samples.
Craig et al. (20) found that only one item, work/interests,
had adequate interrater reliability. Moberg et al. (50) re-
ported that nine items demonstrated adequate reliability
when the standard Hamilton depression scale was admin-
istered (depressed mood, guilt, suicide, early insomnia,
late insomnia, agitation, psychic anxiety, hypochondria-
sis, loss of insight), but all items showed adequate reliabil-
ity when the scale was administered with interview guide-
lines. Potts et al. (59) demonstrated that a single omnibus
coefficient can mask specific problems. Using a structured
interview version of the Hamilton depression scale, they
TABLE 2. Studies Reporting Reliability Estimates for the Total 17-Item Hamilton Depression Rating Scale a
Study | Year | Estimates reported (among: Internal Reliability, Cronbach’s alpha; Interrater Reliability, Pearson’s r; Interrater Reliability, Intraclass r; Retest Reliability, Pearson’s r)
Addington et al. (9) | 1990 | 0.82
Addington et al. (10) | 1996 | 0.93
Akdemir et al. (11) | 2001 | 0.75, 0.87–0.98 b, 0.85
Baca-García et al. (12) | 2001 | 0.97
Cicchetti and Prusoff (19), Time 1 | 1983 | 0.46
Cicchetti and Prusoff (19), Time 2 | 1983 | 0.82
Craig et al. (20) | 1985 | 0.95
Deluty et al. (22) | 1986 | 0.96
Demitrack et al. (23) | 1998 | 0.65–0.79 b
Fuglum et al. (28) | 1996 | 0.86, 0.81
Gastpar and Gilsdorf (29) | 1990 | 0.48
Gilley et al. sample 1 (31) | 1995 | 0.92
Gottlieb et al. (32) | 1988 | 0.99
Hammond (34) | 1998 | 0.46
Kobak et al. (37) | 1999 | 0.91, 0.98
Koenig et al. (38) | 1995 | 0.97
Leung et al. (42) | 1999 | 0.94
Maier et al. (45), Sample 1 | 1988 | 0.70
Maier et al. (45), Sample 2, Time 1 | 1988 | 0.72
Maier et al. (45), Sample 2, Time 2 | 1988 | 0.70
McAdams et al. (43) | 1996 | 0.77
Meyer et al. (48) | 2001 | 0.57–0.80 b
Middelboe et al. (49) | 1994 | 0.75
O’Hara and Rehm (54), Expert raters | 1983 | 0.91
O’Hara and Rehm (54), Novice raters | 1983 | 0.76
Pancheri et al. (57) | 2002 | 0.90
Potts et al. (59) | 1990 | 0.82, 0.92
Ramos-Brieva and Cordero-Villafafila (60) | 1988 | 0.72
Rehm and O’Hara (61), Study 1 | 1985 | 0.76, 0.78–0.91 b
Rehm and O’Hara (61), Study 2 | 1985 | 0.91–0.96 b
Reynolds and Kobak (62) | 1995 | 0.92, 0.96
Riskind et al. (63) | 1987 | 0.73
Shain et al. (68) | 1990 | 0.97
Teri and Wagner (72) | 1991 | 0.65–0.97 b
Whisman et al. (75) | 1989 | 0.85
Williams (76) | 1988 | 0.81
Zheng et al. (77) | 1988 | 0.71, 0.92
a Estimates are from studies published between January 1980 and May 2003 that measured psychometric properties of the Hamilton depression scale. Studies were identified by means of a MEDLINE search for both “depression” and “Hamilton.”
b Range over multiple pairs of raters.
found an overall intraclass coefficient of 0.92; however,
two trained psychiatrists differed at least 20% of the time
in their ratings of psychic anxiety, psychomotor agitation,
and psychomotor retardation, and they differed by at least
two points 15% of the time in their ratings of loss of libido.
The ratings of trained raters disagreed with the psychia-
trists’ ratings on psychomotor agitation (50% of the time),
hypochondriasis (60%), loss of libido (90%), and loss of
energy (100%).
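The point that a single omnibus coefficient can mask item-level disagreement can be made concrete with a small sketch. The two rating matrices below are hypothetical, and Pearson's r on total scores stands in for the intraclass coefficient used in the study discussed above; the aim is only to show how offsetting disagreements on one item can leave agreement on totals high.

```python
import numpy as np

# Hypothetical ratings from two raters (rows = patients, columns = items).
rater_a = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [1, 2, 0],
    [2, 1, 2],
    [2, 3, 1],
    [3, 2, 2],
    [3, 4, 1],
    [4, 3, 2],
])
rater_b = rater_a.copy()
# Rater B rates the third item differently for every patient, by one point,
# in alternating directions.
rater_b[:, 2] = [1, 0, 1, 1, 2, 1, 2, 1]

def disagreement_rate(a, b):
    """Proportion of patients on whom the two raters differ, per item."""
    return (a != b).mean(axis=0)

# Agreement on total scores (omnibus view).
total_r = np.corrcoef(rater_a.sum(axis=1), rater_b.sum(axis=1))[0, 1]

# Agreement item by item (item-level view).
item_rates = disagreement_rate(rater_a, rater_b)
```

Here the total scores correlate highly across raters while the raters never agree on the third item, so reporting only the total-score coefficient would hide the problem.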
Retest Reliability
Retest reliability for the Hamilton depression scale
ranged from 0.81 to 0.98 (Table 2). Retest reliability at the
item level (Table 3) ranged from 0.00 to 0.85. Williams (76)
argued in favor of using structured interview guides to
boost item and total scale reliability and developed the
Structured Interview Guide for the Hamilton Depression
Rating Scale. This effort increased the mean retest reliabil-
ity across individual items to 0.54, although only four
items met the criteria for adequate reliability (depressed
mood, early insomnia, psychic anxiety, and loss of libido).
Item Characteristics
Content and scaling. Standard psychometric practice
dictates that items within an instrument should measure a
single symptom and contain response options linked to
increasing or decreasing amounts of that symptom. Each
item is assumed to contribute equally to the total score or
be backed with evidence in support of differential weight-
ing. These criteria are not consistently met by using the
current scaling procedure or the options for rating symp-
toms. Although improperly scaled items can cause prob-
lems in quantitative measurement, evaluation of item
scaling takes place first at a qualitative level. Some Hamilton depression scale items measure single symptoms
along a meaningful continuum of severity; many do not.
The item assessing depressed mood includes a combina-
tion of affective, behavioral, and cognitive features, such
as gloomy attitude, pessimism about the future, subjective
feeling of sadness, and tendency to weep. The general so-
matic symptoms item, which is also symptomatically het-
erogeneous, includes feelings of heaviness, diffuse back-
ache, and loss of energy. Headache is coded only as part of
somatic anxiety along with such symptoms as indigestion,
palpitations, and respiratory difficulties. Genital symp-
toms for women entail loss of libido and menstrual distur-
bances. The problems inherent in the heterogeneity of
these rating descriptors reduce the potential meaningful-
ness of these items, a problem exacerbated if the different
components of an item actually measure multiple con-
structs and thus measure different effects.
Most items on the Hamilton depression scale are at least
scaled so that increasing scores represent increasing se-
verity. It is less clear whether the anchors used for different
scores on certain items actually assess the same underly-
ing construct/syndrome. This ambiguity is most obvious
for severity ratings involving psychotic features. The feel-
ings of guilt item, for example, is graded as follows: 0=ab-
sent, 1=self-reproach, 2=ideas of guilt or rumination over
past errors or sinful deeds, 3=present illness is a punish-
ment, and 4=hears accusatory or denunciatory voices
and/or experiences threatening visual hallucinations. A
patient with guilt-themed hallucinations may be more se-
verely ill than a patient who has nonpsychotic guilty feel-
ings, but is he/she feeling more guilt? The psychotic fea-
tures may instead represent a qualitatively different
construct/syndrome associated with more severe illness.
Similarly, the hypochondriasis item progresses through
bodily self-absorption (rated 1) and preoccupation with
health (rated 2) before switching to querulous attitude
(rated 3) and then again to hypochondriacal delusions
(rated 4). These item-scoring anchors violate basic mea-
surement principles, because nominal scaling and ordinal
scaling are combined in a single item.
Although Hamilton (1) explained the rationale for the
inclusion of both 3-point and 5-point items, the argument
was not made on the grounds of differential weighting. Hamilton believed that certain items would be difficult to
anchor dimensionally and therefore assigned them fewer
response options. The end result is that certain items con-
tribute more to the total score than others. Contrasting
psychomotor retardation and psychomotor agitation, for
example, reveals that a severe manifestation of the former
contributes 4 points, whereas an equally severe manifes-
tation of the latter contributes 2 points. Similarly, some-
one who weeps all the time can contribute 3 or 4 points on
depressed mood, whereas someone who feels tired all the
time can contribute only 2 points on the general somatic
symptoms item.
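The implicit weighting described above follows directly from the arithmetic of a summed score. The sketch below uses only the four items and maximum ratings mentioned in the text; the dictionary and function names are hypothetical, and the shares are computed over these four items only, not over the full 17-item scale.

```python
# Maximum ratings for the four items contrasted in the text.
max_rating = {
    "psychomotor retardation": 4,
    "psychomotor agitation": 2,
    "depressed mood": 4,
    "general somatic symptoms": 2,
}

def implicit_weights(max_by_item):
    """Share of the maximum possible sum that each item can contribute."""
    total = sum(max_by_item.values())
    return {item: m / total for item, m in max_by_item.items()}

weights = implicit_weights(max_rating)

# A maximally severe rating of retardation counts twice as much toward the
# sum as a maximally severe rating of agitation.
ratio = max_rating["psychomotor retardation"] / max_rating["psychomotor agitation"]
```

Because the number of response options was chosen for ease of anchoring rather than to reflect clinical importance, these shares act as weights that were never justified empirically.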
Item Response Analysis
A psychiatric rating scale should measure a single psy-
chopathological construct (i.e., an illness or syndrome)
and be composed of items that adequately cover a range of
symptoms that are consistently associated with the syn-
drome. Item response theory, a method used increasingly
in the evaluation and construction of psychometric in-
struments, permits empirical evaluation of these pre-
mises. It is important to note that this method was not
available when the original Hamilton depression scale was
developed, although some researchers more recently used
this method to evaluate this instrument. According to item
response theory, a scale and its constituent items may
have good reliability estimates but still fail to meet item re-
sponse theory criteria. For example, if a depression scale
were composed only of items measuring mild depression,
the instrument would have great difficulty distinguishing
between moderate and severe cases of depression, as both
would be characterized by high scores on all items. This is-
sue is particularly pressing in studies of clinical change;
not only is a wide range of severity often represented in
this research, but individual patients are expected to move
along this continuum as they improve. Continued use of
items insensitive to change underestimates the strength of actual treatment effects and makes it necessary to have
larger samples to demonstrate that an effect is statistically
significant. Falsely identifying patients as not having
changed represents an additional source of “noise” and
weakens the “signal” of a true treatment effect. A prag-
matic implication of such lack of sensitivity is that new
compounds shown to be promising in the laboratory may
appear spuriously ineffective in clinical trials.
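The example of a scale composed only of mild items can be sketched with a logistic item response function. This is a simplified illustration under stated assumptions: items are treated as dichotomous (symptom present or absent), the item difficulties and the two severity levels are hypothetical values on an arbitrary latent scale, and the function names are invented for this sketch.

```python
import math

def p_endorse(theta, difficulty):
    """Logistic item response function: probability that a symptom is
    rated present for a patient at severity `theta`."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def expected_score(theta, difficulties):
    """Expected sum score over dichotomous items at severity `theta`."""
    return sum(p_endorse(theta, b) for b in difficulties)

# Hypothetical item difficulties on the latent severity scale.
mild_only = [-2.0, -1.5, -1.0, -0.5]    # all items tap mild depression
full_range = [-2.0, -0.5, 1.0, 2.5]     # items spread across severity

moderate, severe = 1.5, 3.0             # hypothetical patient severities

# How far apart are the expected scores of a moderate and a severe patient?
gap_mild_only = expected_score(severe, mild_only) - expected_score(moderate, mild_only)
gap_full_range = expected_score(severe, full_range) - expected_score(moderate, full_range)
```

With only mild items, both patients are expected to endorse nearly everything, so the expected scores differ very little; spreading item difficulties across the severity range widens the gap and therefore improves sensitivity to change.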
A related issue concerns the extent to which a severity
score actually measures a single unidimensional syndrome. To summarize a syndrome with a single score requires a precise understanding of what that score represents. The implicit assumption is that the severity score
represents a single dimension (84); if depression is hetero-
geneous, interpretation of a single summed score is un-
clear. If, for example, items assessing psychological and
physical symptoms were only loosely related, a single
score would not distinguish between two potentially dif-
ferent groups of depressed patients—one group whose
symptoms were primarily psychological and another
group with primarily vegetative symptoms. Any effects of
an intervention targeting only one of these aspects would
be harder to detect.
Gibbons et al. (85) presented a strategy for identifying a
unidimensional set of items from a psychiatric rating scale
and evaluating the extent to which these items adequately
measure the full range of depression severity. Subse-
quently, a subset of Hamilton depression scale items that
would measure a single dimension of depression across a
wide range of severity was developed (30). This subset in-
cluded depressed mood, which was sensitive at low levels;
work/interests, psychic anxiety, and loss of libido, which
were sensitive at mild levels; somatic anxiety, psychomo-
tor agitation, and guilt, which were sensitive at moderate
levels; and suicide, which was sensitive at severe levels.
These items were proposed as a psychometrically stronger
form of the full Hamilton depression scale.
Santor and Coyne (64, 65) used item response theory to
examine the functioning of the full Hamilton depression
scale and its individual items. In one of these studies (65)
they examined individual Hamilton depression scale item
performance in a combined sample of primary care pa-
tients and depressed patients from the National Institute
of Mental Health Treatment of Depression Collaborative
Research Program. One expects different item ratings at
TABLE 3. Studies Reporting Item Reliability Estimates for the 17-Item Hamilton Depression Rating Scale a
Reliability Measure and Study | Year | Depressed Mood | Guilt | Suicide | Early Insomnia | Middle Insomnia | Late Insomnia | Work/Interests
Internal reliability b
Berrios and Bulbena-Villarasa (16), Sample 1 | 1990 | 0.32 | 0.24 | 0.26 | 0.25 | 0.32 | 0.31 | 0.39
Berrios and Bulbena-Villarasa (16), Sample 2 | 1990 | 0.37 | 0.38 | 0.40 | 0.23 | 0.37 | 0.42 | 0.33
Gastpar and Gilsdorf (29), Time 1 | 1990 | 0.10 | 0.22 | -0.04 | 0.04 | 0.22 | 0.13 | 0.09
Gastpar and Gilsdorf (29), Time 2 | 1990 | 0.65 | 0.39 | 0.50 | 0.44 | 0.46 | 0.53 | 0.73
Paykel (58), Sample 1 | 1990 | 0.52 | 0.31 | 0.31 | 0.24 | 0.21 | 0.38 | 0.59
Paykel (58), Sample 2 | 1990 | 0.42 | 0.38 | 0.47 | 0.27 | 0.34 | 0.30 | 0.58
Paykel (58), Sample 3 | 1990 | 0.52 | 0.41 | 0.49 | 0.34 | 0.35 | 0.34 | 0.59
Rehm and O’Hara (61) | 1985 | 0.63 | 0.26 | 0.47 | 0.40 | 0.41 | 0.37 | 0.46
Interrater reliability c
Cicchetti and Prusoff (19), Time 1 | 1983 | 0.37 | 0.18 | 0.59 | 0.76 | 0.57 | 0.42 | 0.33
Cicchetti and Prusoff (19), Time 2 | 1983 | 0.72 | 0.37 | 0.64 | 0.57 | 0.45 | 0.49 | 0.64
Moberg et al. (50) d, Standard administration | 2001 | 0.90 | 0.80 | 0.90 | 0.61 | 0.39 | 0.89 | 0.50
Moberg et al. (50) d, Interview guidelines | 2001 | 0.96 | 0.83 | 0.81 | 0.97 | 0.78 | 0.89 | 0.87
Rehm and O’Hara (61) e, Above median split | 1985 | 0.61 | 0.39 | 0.49 | 0.74 | 0.79 | 0.72 | 0.56
Rehm and O’Hara (61) e, Below median split | 1985 | 0.84 | 0.82 | 0.92 | 0.91 | 0.79 | 0.92 | 0.73
Retest reliability f
Akdemir et al. (11) | 2001 | 0.61 | 0.78 | 0.67 | 0.69 | 0.79 | 0.76 | 0.73
Williams (76) | 1988 | 0.80 | 0.63 | 0.64 | 0.80 | 0.62 | 0.30 | 0.54
a Estimates are from studies published between January 1980 and May 2003 that measured psychometric properties of the Hamilton depression scale. Studies were identified by means of a MEDLINE search for both “depression” and “Hamilton.”
b Correlation of item scores with total scores. An uncorrected Pearson’s r>0.20 was considered significant. Significant correlations are shown in boldface type.
c Interrater Pearson’s r≥0.70 was considered significant; intraclass r≥0.60 was considered significant. Significant correlations are shown in boldface type.
d The study included both standard and interview guideline methods; interrater reliability was calculated by using the intraclass r.
e The subjects were assigned to two groups by means of a median split according to Hamilton depression scale total scores; interrater Pearson’s r values were calculated for both groups.
f Test-retest Pearson’s r >0.70 was considered acceptable. Acceptable correlations are shown in boldface type.
different levels of depression severity, with zeroes more
common at mild levels of overall depression and higher item scores more common with more severe overall depression. Moreover, whereas most items on the Hamilton
depression scale are, overall, sensitive to depression sever-
ity, 12 items had at least one problematic response option
(the five items that had no such problems were depressed
mood, guilt, suicide, work/interests, and psychic anxiety)
(64). For example, the likelihood of receiving a rating of 1
on the insomnia items was essentially the same regardless
of the overall severity of depression, but the likelihood of
receiving a rating of 4 on somatic anxiety was very low
even when overall depression was severe. These findings
confirm that the rating scheme is not ideal for many items
on the Hamilton depression scale, with the unfortunate
effect of decreasing the capacity of the Hamilton depres-
sion scale to detect change (6, 7).
Rasch Analysis
Additional efforts to analyze the performance of indi-
vidual Hamilton depression scale items and to identify an
underlying single dimension of depression severity have
benefited from a technique known as Rasch analysis, a
method similar to item response theory. Rasch analysis
proposes an ideal underlying dimension based on mathematical and theoretical reasoning about the construct that is being measured and then assesses the extent to which
actual data correspond to this ideal. This approach was
first applied to the Hamilton depression scale by Bech et
al. (86), who confirmed that six items previously shown to
have properties associated with unidimensionality (87)
could be combined to create a shorter scale that met the
formal Rasch criteria. This six-item scale was thus pro-
posed as a better measure than the full Hamilton depres-
sion scale for assessing depression severity along a single
dimension; the six-item scale is composed of items for depressed mood, guilt, work/interests, psychomotor retardation, psychic anxiety, and general somatic symptoms
(87). The unidimensionality of this six-item subscale has
since been confirmed in two studies that used Rasch
methods (13, 14). Maier and Philipp (44) used Rasch anal-
ysis to confirm unidimensionality for a subset of Hamilton
depression scale items. The resulting scale was similar to
that obtained by Bech et al. (86). In another study that
used Rasch analysis (46), six items were found to be problematic: suicide, psychomotor agitation, somatic anxiety,
general somatic symptoms, hypochondriasis, and loss of
insight.
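TABLE 3. Studies Reporting Item Reliability Estimates for the 17-Item Hamilton Depression Rating Scale (continued)
Reliability Measure and Study | Retardation | Agitation | Psychic Anxiety | Somatic Anxiety | Gastrointestinal | General Somatic | Loss of Libido | Hypochondriasis | Weight Loss | Loss of Insight | Mean
Internal reliability
Berrios and Bulbena-Villarasa (16), Sample 1 | 0.24 | 0.24 | 0.42 | 0.35 | 0.33 | 0.29 | 0.29 | 0.34 | 0.29 | 0.06 | 0.29
Berrios and Bulbena-Villarasa (16), Sample 2 | 0.31 | 0.35 | 0.36 | 0.29 | 0.37 | 0.34 | 0.30 | 0.36 | 0.26 | 0.06 | 0.32
Gastpar and Gilsdorf (29), Time 1 | 0.03 | 0.07 | 0.39 | 0.34 | 0.28 | 0.32 | 0.05 | 0.34 | -0.04 | 0.25 | 0.16
Gastpar and Gilsdorf (29), Time 2 | 0.40 | 0.40 | 0.64 | 0.58 | 0.53 | 0.55 | 0.55 | 0.23 | 0.11 | 0.27 | 0.47
Paykel (58), Sample 1 | 0.33 | 0.37 | 0.53 | 0.41 | 0.52 | 0.25 | 0.27 | 0.33 | 0.40 | 0.45 | 0.38
Paykel (58), Sample 2 | 0.33 | 0.20 | 0.33 | 0.47 | 0.63 | 0.50 | 0.39 | 0.43 | 0.49 | 0.23 | 0.40
Paykel (58), Sample 3 | 0.21 | 0.25 | 0.50 | 0.42 | 0.41 | 0.44 | 0.23 | 0.16 | 0.42 | -0.07 | 0.35
Rehm and O’Hara (61) | 0.14 | 0.18 | 0.54 | 0.46 | 0.27 | 0.38 | 0.13 | 0.33 | 0.25 | 0.16 | 0.34
Interrater reliability
Cicchetti and Prusoff (19), Time 1 | 0.39 | 0.20 | 0.19 | 0.34 | 0.43 | 0.30 | 0.39 | 0.29 | 0.57 | -0.02 | 0.37
Cicchetti and Prusoff (19), Time 2 | 0.26 | 0.32 | 0.40 | 0.45 | 0.51 | 0.42 | 0.59 | -0.04 | 0.06 | -0.03 | 0.40
Moberg et al. (50), Standard administration | 0.46 | 0.89 | 0.67 | 0.57 | 0.34 | 0.57 | 0.39 | 0.76 | 0.58 | 0.63 | 0.64
Moberg et al. (50), Interview guidelines | 0.75 | 0.97 | 0.88 | 0.84 | 0.95 | 0.92 | 0.94 | 0.89 | 1.00 | 1.00 | 0.90
Rehm and O’Hara (61), Above median split | 0.54 | 0.51 | 0.52 | 0.88 | 0.75 | 0.55 | 0.77 | 0.35 | 0.51 | 0.37 | 0.59
Rehm and O’Hara (61), Below median split | 0.69 | 0.52 | 0.71 | 0.82 | 0.73 | 0.68 | 0.69 | 0.64 | 0.54 | 0.22 | 0.72
Retest reliability
Akdemir et al. (11) | 0.85 | 0.66 | 0.80 | 0.79 | 0.71 | 0.66 | 0.76 | 0.79 | 0.08 | 0.79 | 0.70
Williams (76) | 0.32 | 0.11 | 0.78 | 0.66 | 0.59 | 0.61 | 0.70 | 0.55 | 0.58 | 0.00 | 0.54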
Validity
Validity of psychiatric rating scales such as the Hamil-
ton depression scale comprises 1) content, 2) convergent,
3) discriminant, 4) factorial, and 5) predictive validity.
Content validity is assessed by examining scale items to
determine correspondence with known features of a syndrome.
Convergent validity is adequate when a scale shows
Pearson’s r values of at least 0.50 in correlations
with other measures of the same syndrome. Discriminant
validity is established by showing that groups differing in
their diagnostic status can be separated by using the scale.
Predictive validity for symptom severity measures such as
the Hamilton depression scale is determined by a statisti-
cally significant (p<0.05) capacity to predict change with
treatment. Factorial validity is established by using factor
analysis or related techniques (e.g., principal-component
analysis) to demonstrate that a meaningful structure can
be found in multiple samples. An a priori factor-loading
criterion of 0.40 has been used to identify which items
are part of which factors (88).
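The convergent validity criterion stated above (Pearson's r of at least 0.50 with another measure of the same syndrome) can be sketched as follows. The score lists and function names are hypothetical and serve only to show the computation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / math.sqrt(sxx * syy)


def meets_convergent_criterion(r, threshold=0.50):
    """Apply the a priori criterion: r of at least 0.50 counts as adequate."""
    return r >= threshold


# Hypothetical paired scores for the same patients on two depression measures.
hamilton_totals = [8, 12, 15, 19, 23, 27]
other_measure = [10, 14, 13, 22, 25, 30]
r = pearson_r(hamilton_totals, other_measure)
print(round(r, 2), meets_convergent_criterion(r))
```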
Content validity. Because of its wide use and long clinical
tradition, the Hamilton depression scale seems both to
define and to measure depression. One could criticize
DSM-IV for not adequately capturing Hamilton depres-
sion scale depression as much as one could criticize the
Hamilton depression scale for not providing full coverage
of DSM-IV depression. Nonetheless, the operational crite-
ria provided in DSM-IV are used as the official nosology
for much of psychiatry worldwide. The criteria for major
depression have been revised three times in response to
developments in field trial research and clinical consensus
based on expert opinion, most recently in 1994. Research-
ers have developed a number of longer versions of the
Hamilton depression scale that include additional symptoms
such as the reverse vegetative features of atypical depression.
However, the core items of the Hamilton depression
scale have remained unchanged for more than 40
years. It is reasonable to ask whether this instrument cap-
tures depression as it is currently conceptualized. Several
symptoms contained within the Hamilton depression
scale are not official DSM diagnostic criteria, although
they are recognized as features associated with depression
(e.g., psychic anxiety). For other symptoms included in the
Hamilton depression scale (e.g., loss of insight, hypochon-
driasis), the link with depression is more tenuous. More
critically, important features of DSM-IV depression are of-
ten buried within more complex items and sometimes are
not captured at all. The work/interests item includes an-
hedonic features along with listlessness, indecisiveness,
social avoidance, and lowered productivity. It is impossi-
ble to determine the extent to which anhedonia per se in-
fluences severity. Guilt is captured in both Hamilton de-
pression scale depression and DSM-IV depression, but the
Hamilton depression scale contains no explicit assess-
ment of feelings of worthlessness. Decision-making diffi-
culties are buried within the work/interests item of the
Hamilton depression scale, but concentration difficulties
are not included. The reverse vegetative symptoms (weight
gain, hyperphagia, and hypersomnia) were provided by
Hamilton (1) as additional items but are not scored on the
original Hamilton depression scale.
TABLE 4. Studies Reporting Estimates of Convergent Validity of the 17-Item Hamilton Depression Rating Scale, Compared With Other Depression Measures [a]
r
Study  Year  Beck Depression Inventory  Brief Psychiatric Rating Scale  Center for Epidemiologic Studies Depression Scale  Clinical Global Impression Scale  Carroll Rating Scale for Depression  Global Assessment Scale
Akdemir et al. (11)  2001  0.48  0.56
Berard and Ahmed (15)  1995  0.48
Brown et al. (17)  1995  0.70 – 0.85 [b]
Carroll et al. (18)  1981  0.60  0.71
Craig et al. (20)  1985  0.56  0.65
Feinberg et al. (26)  1981  0.77  0.75
Gottlieb et al. (32)  1988
  Low-severity group  0.89
  High-severity group  0.57
Hotopf et al. (36)  1998  0.77
Kobak et al. (37)  1999  0.89
Leung et al. (42)  1999
Maier et al. total sample (46)  1988
Olsen et al. (55)  2003
Rehm and O’Hara (61)  1985  0.73 – 0.86
Senra Rivera et al. (67)  2000
  Time 1  0.70
  Time 2  0.92
Whisman et al. (75)  1989
  Time 1  0.27  0.41
  Time 2  0.67  0.68
Zheng et al. (77)  1988  –0.47
[a] Estimates are from studies published between January 1980 and May 2003 that measured psychometric properties of the Hamilton depression scale. Studies were identified by means of a MEDLINE search for both “depression” and “Hamilton.”
[b] Multiple assessments over an 8-month period.
Convergent validity. A wide range of instruments has
been used to examine the convergent validity of the Hamil-
ton depression scale (Table 4). Most of the correlation co-
efficients met the preestablished criterion, and the Hamil-
ton depression scale showed adequate convergent validity
in correlations with all but two scales, including the major
depression section of the Structured Clinical Interview for
DSM-IV. The latter finding provides evidence of noncorre-
spondence between the Hamilton depression scale and
DSM-IV.
Discriminant validity. Two approaches have been used
to evaluate the discriminant validity of the Hamilton depression
scale. In the first approach, several studies used
the receiver operating characteristic (ROC) curve as a
statistical means of determining the cutoff scores for
detecting depression and then
provided corresponding rates of sensitivity, specificity,
positive predictive power, and negative predictive power
for the Hamilton depression scale in distinguishing depressed
and nondepressed subjects. In other studies, researchers
have examined the capacity of the Hamilton depression
scale to distinguish different groups of clinical
patients (e.g., patients with endogenous versus those with
nonendogenous depression, patients with anxiety versus
those with depression) using statistical techniques to detect
mean group differences. Classification rates resulting
from receiver operating characteristic curve analysis have
not been widely reported in the Hamilton depression scale
literature. Our search identified only seven studies (Table 5), and
some of these investigations sought to detect depression in
samples of patients with medical conditions other than
psychiatric disorders (Table 1). Sensitivity, specificity, and
negative predictive power were generally consistent and
large, but positive predictive power was more variable, and
two studies reported very low positive predictive power.
TABLE 4 (continued)
r
Hospital Anxiety and Depression Scale  Montgomery-Åsberg Depression Rating Scale  Major Depression Inventory  Minnesota Multiphasic Personality Inventory  Raskin Depression Scale  Structured Clinical Interview for DSM-IV  Visual Analogue Scale  Zung Self-Rating Depression Scale
0.37
0.38
– 0.65
0.49
0.01
0.67
0.85  0.65
0.86
0.67  0.81
0.68
0.88
0.20
0.50
0.62
TABLE 5. Studies Reporting Classification Accuracy Rates for the 17-Item Hamilton Depression Rating Scale [a]
Study  Year  Cutoff Score [b]  Sensitivity  Specificity  Positive Predictive Value  Negative Predictive Value
Aben et al. (8)  2002  12  0.78  0.75  0.37  0.95
Leentjens et al. (41)  2000  13/14  0.88  0.86  0.84  0.89
Leung et al. (42)  1999  12/13  0.88  0.86  0.84  0.89
Mottram et al. (51)  2000  15/16  0.88  0.99  0.99  0.97
Naarding et al. (52)  2002
  Sample 1  10/11  0.73  1.00  1.00  0.88
  Sample 2  13/14  0.45  0.96  0.76  0.86
  Sample 3  15/16  0.70  0.99  0.93  0.91
Strik et al. (71)  2001  11/12  0.76  0.86  0.41  0.99
Thompson et al. (74)  1998  — [c]  0.69 – 0.87 [d]  0.99 – 1.00 [d]  — [c]  — [c]
Mean  12.6/13.5  0.76  0.91  0.77  0.92
[a] Rates are from studies published between January 1980 and May 2003 that measured psychometric properties of the Hamilton depression scale. Studies were identified by means of a MEDLINE search for both “depression” and “Hamilton.”
[b] The minimum score above which sensitivity and specificity are maximized in the detection of depression with the Hamilton depression scale for a given study. Where two scores are given, the lower score represents the threshold below which cases are classified as nondepressed, and the higher score represents the threshold above which cases are classified as depressed.
[c] Not reported.
[d] Range of scores across multiple assessments.
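The four classification rates discussed above can be computed from total scores, diagnostic status, and a cutoff, as in the minimal sketch below. The scores, the function names, and the convention that a score at or above the cutoff counts as screen-positive are assumptions made for illustration. The second function applies Bayes' rule to show that positive predictive power depends on the base rate of depression in the sample as well as on sensitivity and specificity; whether base rates explain the low values in the two studies noted above is not established by the source.

```python
def classification_rates(scores, depressed, cutoff):
    """Count a score at or above `cutoff` as screen-positive and return the four rates."""
    tp = fp = tn = fn = 0
    for score, is_case in zip(scores, depressed):
        positive = score >= cutoff
        if positive and is_case:
            tp += 1
        elif positive:
            fp += 1
        elif is_case:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # A rate with an empty denominator is undefined and reported as None.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value implied by Bayes' rule at a given prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical total scores and diagnostic status (1 = depressed).
scores = [4, 9, 11, 13, 15, 21]
depressed = [0, 0, 1, 0, 1, 1]
print(classification_rates(scores, depressed, cutoff=12))

# Same sensitivity and specificity, two hypothetical base rates.
print(ppv_at_prevalence(0.78, 0.75, 0.50), ppv_at_prevalence(0.78, 0.75, 0.15))
```

Repeating the first call over a range of cutoffs and keeping the cutoff with the largest sum of sensitivity and specificity is one simple way to choose a threshold from such data.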
The second type of discriminant validity study attempts
to distinguish different clinical groups. In a comparison of
healthy, depressed, and bipolar depressed individuals,
Rehm and O’Hara (61) found that the total Hamilton de-
pression scale score clearly differentiated these three cate-
gories, with the depressed patients scoring higher than the
healthy participants and with the bipolar depressed pa-
tients scoring higher than both of the other groups. At the
item level, four items—psychomotor agitation, gastro-
intestinal symptoms, loss of insight, and weight loss—failed to differentiate depressed from healthy subjects.
Only psychic anxiety and hypochondriasis significantly
differentiated the subjects with unipolar and bipolar de-
pression. Kobak et al. (37) showed significant total scale
score differences between individuals with major depres-
sion, individuals with minor depression, and healthy com-
parison subjects. Zheng et al. (77) reported that the Hamil-
ton depression scale was able to discriminate psychiatric
patients classified as mildly, moderately, and severely dys-
functional on the basis of Global Severity Scale scores.
Thase et al. (73) found that the Hamilton depression scale
could distinguish patients with endogenous depression
from patients with nonendogenous depression, with pa-
tients in the former category having higher scores. Gott-
lieb et al. (32) reported no significant differences between
the Hamilton depression scale scores of patients classified
as having low-severity versus high-severity Alzheimer’s
disease. Several researchers have investigated the capacity
of the Hamilton depression scale to differentiate between
patients with anxiety and those with depression. Prusoff
and Klerman (89) suggested the Hamilton depression
scale could indeed separate these constructs, and Maier et
al. (45) demonstrated that the Hamilton depression scale
had a higher correlation with an external measure of depression
than with an external measure of anxiety, but the
saturation of the Hamilton depression scale with anxiety-related
concepts was nonetheless considerable.
Predictive validity. Edwards et al. (90) performed a meta-
analysis of 19 studies with a total of 1,150 patients that
compared the predictive validity of the Hamilton depres-
sion scale and the Beck Depression Inventory. Treatments
included pharmacotherapy, behavior therapy, cognitive
restructuring, dynamic psychotherapy, and various com-
binations. The Hamilton depression scale was found to be
TABLE 6. Studies Reporting Factor Analyses and Principal-Component Analyses of the 17-Item Hamilton Depression Rating Scale [a]
Study  Year  Number of Factors  Depressed Mood  Guilt  Suicide  Early Insomnia  Middle Insomnia  Late Insomnia  Work/Interests
Addington et al. (10)  1996
  Time 1  7  I  I, V  V  —  II  II, V, VI  I, IV
  Time 2  7  I, II, VII  II, III, VII  VII  III  III  III, V  II
Akdemir et al. (11)  2001  6  —  II  II  III  III  III  —
Berrios and Bulbena-Villarasa (16)  1990
  Sample 1  4  I  I, II  I  I  I  I  II
  Sample 2  4  I  I, II  I  IV  I, IV  I  I, II, IV
Brown et al. (17)  1995  6  III  III  III  I  V  V  VI
Daradkeh et al. (21)  1997  5  II  II, IV  I  I  III
Fleck et al. (27)  1995  3  I  I  —  III  III  III  I
Gibbons et al. (30)  1993  5  I, IV  I  I  —  II  II  I, IV
Marcos and Salamero (47)  1990  3  II  —  II  III  III  III  II
O’Brien and Glaudin (53)  1988
  Sample 1  6  I  I, VI  I  —  IV  IV  I, II
  Sample 2  8  III  VII  VI  II  II  II  III
Onega and Abraham (56)  1997  4  I  I  I  II  II  II  I
Pancheri et al. (57)  2002  4  III  II  —  I  I  I  III
Ramos-Brieva and Cordero-Villafafila (60)  1988  5  III  II, III  I, III  I  I  I  III
Smouse et al. (69)  1981  3  I  I  I, II  I, II  I, II  I, II  I
Steinmeyer and Möller (70)  1992
  Time 1  6  II  V  V  III  III  III  II
  Time 2  2  I, II  II  I, II  II  I  I  I
Zheng et al. (77)  1988  5  III  IV  III  V  V  V  IV
[a] Results are from studies published between January 1980 and May 2003 that measured psychometric properties of the Hamilton depression scale. Studies were identified by means of a MEDLINE search for both “depression” and “Hamilton.” Roman numerals indicate the number of the factor on which the item loaded significantly. A factor loading of ≥0.40 was considered statistically significant.
more sensitive to change, compared to the Beck Depres-
sion Inventory. Lambert et al. (39) performed a meta-anal-
ysis that included 36 studies and a total of 1,850 patients
and that compared the Hamilton depression scale to the
Beck Depression Inventory and the Zung Self-Rating Depression
Scale. They reported that the Hamilton depression
scale was more sensitive to change than were the two
self-report measures. Sayer et al. (66) also demonstrated
that the Hamilton depression scale outperformed the
Beck Depression Inventory in detecting change. Lambert
et al. (40) reported that the Beck Depression Inventory is
more likely to show treatment effects at 12 weeks than the
Zung Self-Rating Depression Scale or the Hamilton de-
pression scale; the Zung Self-Rating Depression Scale and
the Hamilton depression scale were more likely to detect
changes after 3 weeks.
One disadvantage of a multidimensional instrument
such as the Hamilton depression scale in detecting change
is that specific treatments may affect only a single dimension.
If the total score includes somatic symptoms that actually
reflect treatment side effects, estimates of treatment
response will be spuriously low (44). In two studies and
one meta-analysis researchers addressed this issue using
the various unidimensional core depression item sets de-
scribed earlier in the section on item characteristics (91,
92). The six-item subscale developed by Bech et al. (87)
was found to be at least as responsive as the full Hamilton
depression scale. A meta-analysis of eight fluoxetine stud-
ies with 1,658 patients showed that the different uni-
dimensional subscales (44, 87) were more sensitive to
change than was the full Hamilton depression scale score.
These results were replicated in a second meta-analysis of
four tricyclic antidepressant studies (25).
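A comparison of the responsiveness of the full scale and a unidimensional subscale can be sketched as below. The six item names follow the subscale of Bech et al. as listed in the section on Rasch analysis; the dictionary keys, the patient data, and the choice of a standardized mean change (mean pretreatment minus mean posttreatment score, divided by the pretreatment standard deviation) are assumptions for illustration and may differ from the effect-size indices used in the cited meta-analyses.

```python
from statistics import mean, stdev

# Items of the six-item subscale; the key names are hypothetical labels.
SIX_ITEMS = (
    "depressed_mood",
    "guilt",
    "work_interests",
    "retardation",
    "psychic_anxiety",
    "general_somatic",
)


def subscale_score(ratings, items=SIX_ITEMS):
    """Sum the ratings of the selected items for one assessment."""
    return sum(ratings[item] for item in items)


def standardized_change(pre_scores, post_scores):
    """(mean pre - mean post) / SD of pre scores; larger values mean more detected change."""
    return (mean(pre_scores) - mean(post_scores)) / stdev(pre_scores)


# Hypothetical single assessment: item ratings for one patient.
ratings = {
    "depressed_mood": 3, "guilt": 2, "work_interests": 3,
    "retardation": 1, "psychic_anxiety": 2, "general_somatic": 1,
    "early_insomnia": 2, "weight_loss": 1,
}
print(subscale_score(ratings))

# Hypothetical pre/post totals for the same patients on two scoring schemes.
full_pre, full_post = [24, 20, 28, 22], [18, 17, 20, 19]
six_pre, six_post = [12, 10, 14, 11], [6, 6, 8, 7]
print(standardized_change(full_pre, full_post), standardized_change(six_pre, six_post))
```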
Factorial validity. A total of 15 studies with 17 samples
reported a factor analysis of the Hamilton depression
scale (Table 6). In most of the studies, researchers used the
eigenvalue ≥1 rule to determine the number of factors, ex-
tracted those factors from the data using principal-com-
ponent analysis, and then determined the optimal config-
uration of items on factors using varimax rotation. The
number of factors identified ranged from two to eight. In-
somnia items appeared consistently on the same factor in
13 data sets, suggesting a sleep disturbance factor. There
was some support for the presence of a general depression
factor, as depressed mood, guilt, and suicide appeared to-
gether on the same factor in six data sets, and the combi-
nation of depressed mood, suicide, and psychic anxiety appeared on the same factor in seven data sets. Support
was also found for an anxiety/agitation factor, with the ag-
itation, psychic anxiety, and somatic anxiety items ap-
pearing together in six samples. Clearly, the Hamilton de-
pression scale is not unidimensional, as separate sets of
items do seem to reliably represent general depression
and insomnia factors; however, the exact structure of the
Hamilton depression scale’s multidimensionality remains
unclear.
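Two of the decision rules described above can be sketched in a few lines: retaining factors whose eigenvalue is at least 1, and assigning an item to every factor on which its loading reaches 0.40 (the criterion given in the footnote to Table 6). The eigenvalues and loadings below are hypothetical, the extraction and varimax rotation steps are not shown, and the use of the absolute value of the loading is an assumption.

```python
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def n_factors_by_eigenvalue_rule(eigenvalues, minimum=1.0):
    """Number of factors retained under the eigenvalue >= 1 rule."""
    return sum(1 for value in eigenvalues if value >= minimum)


def factors_for_item(loadings, threshold=0.40):
    """Roman-numeral labels of the factors on which an item loads.

    `loadings` holds the item's rotated loadings, one per retained factor.
    Taking the absolute value is an assumption of this sketch.
    """
    return [ROMAN[i] for i, loading in enumerate(loadings) if abs(loading) >= threshold]


# Hypothetical eigenvalues of an item correlation matrix.
eigenvalues = [4.2, 1.6, 1.0, 0.8, 0.6]
print(n_factors_by_eigenvalue_rule(eigenvalues))

# Hypothetical rotated loadings for three items on three factors.
rotated = {
    "depressed_mood": [0.71, 0.12, 0.08],
    "early_insomnia": [0.10, 0.66, 0.05],
    "guilt": [0.52, 0.03, 0.44],
}
for item, loadings in rotated.items():
    print(item, factors_for_item(loadings))
```

An item that loads on more than one factor, or on none, is reported accordingly, which corresponds to the multiple numerals and blank entries in Table 6.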
Retardation  Agitation  Psychic Anxiety  Somatic Anxiety  Gastrointestinal  General Somatic  Loss of Libido  Hypochondriasis  Weight Loss  Loss of Insight
IV III, VI VII I VII I, VII IV, VII III I, VI VIII II, III I, II I I VI VI IV I, IV, V IV — I, II II I VI IV IV I VI
II I I I, II I, III I IV I III IIIII II I, III I, III I I I, IV I I, III IIIIII I I I II VI V IV IV II
I, II I, III V V V II, III II IVI II II II II I — II II — IV I I I V IV — — III, V IIIII — II I — I I I — —
V V I II III II I II, IV III VII
I III V, VI IV V VII VI IV VIIII IV III III II I I III II IV
— II — I I, IV — – I IV —
III II II II IV IV IV V V IIIII I, II I, III I I I
IV I II I VI IV VI I IV IV — II I I I I — — — — IV II I I I I III I I II
Conclusions
The Hamilton depression scale has been the standard for
the assessment of depression for more than 40 years. Re-
searchers and policy makers charged with the task of pro-
viding standards to evaluate treatment outcomes in de-
pression are faced with three possible solutions: retain,
revise, or reject. The last solution argues for the development
of a new instrument or the replacement of the Hamilton
depression scale with existing, psychometrically superior
instruments.
Many of the psychometric properties of the Hamilton
depression scale are adequate and consistently meet es-
tablished criteria. The internal, interrater, and retest reli-
ability estimates for the overall Hamilton depression scale
are mostly good, as are the internal reliability estimates at
the item level. Similarly, established criteria are met for
convergent, discriminant, and predictive validity, al-
though the latter does suffer somewhat due to multidi-
mensionality. At the item level, interrater and retest coeffi-
cients are weak for many items, and the internal reliability coefficients indicate that some items are problematic. The
lack of individual item reliability is not necessarily a fatal
psychometric flaw; what is critical is that the items as a
whole provide adequate reliability.
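The point that the items as a whole can be reliable even when individual items are weak can be illustrated with Cronbach's alpha, a widely used index of internal reliability. Its use here is an assumption made for illustration (this section does not name the coefficient), and the ratings and function name are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a set of items.

    `item_scores` holds one list per item; each list contains that item's
    ratings for the same respondents, in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent across all items.
    totals = [sum(ratings) for ratings in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical ratings: three items that track one another closely
# and a fourth item that is only loosely related to them.
consistent_items = [
    [0, 1, 2, 3, 4],
    [1, 1, 2, 3, 4],
    [0, 2, 2, 3, 3],
]
weak_item = [2, 0, 3, 1, 2]
print(cronbach_alpha(consistent_items))
print(cronbach_alpha(consistent_items + [weak_item]))
```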
Evaluation of item response shows that many of the
individual items are poorly designed and sum to generate a
total score whose meaning is multidimensional and un-
clear. The problem of multidimensionality was highlighted
in the evaluation of factorial validity, which showed a fail-
ure to replicate a single unifying structure across studies.
Although the unstable factor structure of the Hamilton de-
pression scale may be partly attributable to the diagnostic
diversity of population samples, well-designed scales assessing
clearly defined constructs produce factor structures
that are invariant across different populations (88).
Finally, the Hamilton depression scale is measuring a con-
ception of depression that is now several decades old and
that is, at best, only partly related to the operationalization
of depression in DSM-IV.
These findings indicate that continued use of the Hamil-
ton depression scale requires, at the very least, a complete
overhaul of its constituent items. Accumulated empirical
evidence offers some hope that substantial revision can
redress a number of psychometric problems, thereby pro-
viding an improved measure. Shortened versions of the
Hamilton depression scale converge on a common set of
core features and in general have proven more effective in
detecting change. The truncated item sets for these instru-
ments, however, are limited in that they do not permit
capture of the full depressive syndrome. Other studies
based on item response theory methods have indicated
that modifications of the rating scheme are readily imple-
mented and can enhance the unidimensionality of these
core symptoms in a manner that allows uniform assess-
ment of change. Identifying a core set of symptoms with
proven psychometric qualities, along with making rating
scheme changes that would allow consistent assessment
of the severity of depression, could provide a foundation
for a reconstructed scale. One advantage of such a revision
is that it would maintain continuity with the long-stand-
ing use of the original Hamilton depression scale. This sort
of transition is probably more palatable and therefore
more readily acceptable to regulatory commissions.
The Depression Rating Scale Standardization Team re-
vised the Hamilton depression scale (i.e., the GRID-HAMD
[93, 94]) by employing several of the methodological ad-
vances we have been advocating in this article. They used
item response theory methods to inform, in part, the re-
vision process; developed clear structured interview
prompts and scoring guidelines; and to some extent stan-
dardized the scoring system. We nonetheless believe that
by making an effort to retain the original 17 items, the De-
pression Rating Scale Standardization Team failed to ad-
dress many of the flaws of the original instrument. Most of
the items still measure multiple constructs, items that
have consistently been shown to be ineffective have been
retained, and the scoring system still includes differential
weighting of items. Moreover, the GRID-HAMD content is
virtually unchanged from the original. All the items that
appeared on the Hamilton depression scale in 1960
are included in the GRID-HAMD. Thus, this revision has
neither removed items based on outdated concepts nor
added items that incorporate contemporary definitions of
depression.
Rejection of the Hamilton depression scale and replace-
ment with an alternative existing measure or the imple-
mentation of a new instrument has scientifically compel-
ling advantages over revision. The Inventory of Depressive
Symptomatology (95) and the Montgomery-Åsberg Depression
Rating Scale (96), designed to address the limitations
of the Hamilton depression scale, represent two
potential replacement alternatives. Although these instru-
ments measure contemporary definitions of depression
(33), neither item response theory methods nor other con-
temporary measurement techniques were employed in
their development. As indicated earlier, such techniques,
especially item response theory, maximize the capacity of
an instrument to detect change. On the other hand, the de-
velopment and implementation of a new instrument that
is based on current knowledge of depression and that takes
advantage of psychometric and statistical advances might
offer the best solution. The decision to replace the Hamil-
ton depression scale with either an existing instrument or a
newly developed instrument would ultimately rest on con-
sensus that such an instrument could capture more ade-
quately the full spectrum of the depression construct and
on empirical evidence of the new instrument’s superiority
in detecting treatment effects.
In conclusion, we have been struck with the marked
contrast between the effort and scientific sophistication
involved in designing new antidepressants and the con-
tinued reliance on antiquated concepts and methods for
assessing change in the severity of the depression that
these very medications are intended to affect. Effort in
both areas is critical to the accessibility of new medica-
tions for patients with depression. Many scales and instru-
ments used in psychiatry today are based on—or at least
include—current DSM symptoms, and the measurement
of depression should follow this trend. It is time to retire
the Hamilton depression scale. The field needs to move
forward and embrace a new gold standard that incorpo-
rates modern psychometric methods and contemporary
definitions of depression.
Received Dec. 7, 2003; revision received Feb. 26, 2004; accepted
March 22, 2004. From the Centre for Addiction and Mental Health,
University of Toronto; and the Department of Psychology, University
of British Columbia, Vancouver, B.C. Address reprint requests to Dr.
Bagby, Centre for Addiction and Mental Health, 250 College St., Tor-
onto, Ont., Canada M5T 1R8; [email protected] (e-mail).
Supported in part by Eli Lilly and Co. and by a Senior Research Fel-
lowship from the Ontario Mental Health Foundation to Dr. Bagby. Mr.
Ryder was supported by a postdoctoral fellowship from the Michael
Smith Foundation for Health Research, Vancouver, B.C., Canada.
The authors thank Arun Ravindrun and Sid Kennedy for their com-
ments and Natasha Owen for assistance with the manuscript.
References
1. Hamilton M: A rating scale for depression. J Neurol Neurosurg
Psychiatry 1960; 23:56 – 62
2. Demyttenaere K, De Fruyt J: Getting what you ask for: on the
selectivity of depression rating scales. Psychother Psychosom
2003; 72:61 – 70
3. Williams JB: Standardizing the Hamilton Depression Rating
Scale: past, present, and future. Eur Arch Psychiatry Clin Neu-
rosci 2001; 251(suppl 2):II6 – II12
4. Hedlund JL, Vieweg BW: The Hamilton Rating Scale for Depression:
a comprehensive review. J Operational Psychiatry 1979; 10:149 – 165
5. Bech P: Rating scales for affective disorders: their validity and
consistency. Acta Psychiatr Scand Suppl 1981; 295:1 – 101
6. Bech P: Psychometric development of the Hamilton scales: the
spectrum of depression, dysthymia and anxiety, in The Hamil-
ton Scales. Edited by Bech P, Coppen A. Berlin, Springer-Verlag,
1990, pp 72 – 79
7. Maier W: The Hamilton Depression Scale and its alternatives: a
comparison of their reliability and validity, ibid, pp 64 – 71
8. Aben I, Verhey F, Lousberg R, Lodder J, Honig A: Validity of the
Beck Depression Inventory, Hospital Anxiety and Depression
Scale, SCL-90, and Hamilton Depression Rating Scale as screen-
ing instruments for depression in stroke patients. Psychoso-
matics 2002; 43:386 – 393
9. Addington D, Addington J, Schissel B: A depression rating scale
for schizophrenics. Schizophr Res 1990; 3:247 – 251
10. Addington D, Addington J, Atkinson M: A psychometric com-
parison of the Calgary Depression Scale for Schizophrenia and
the Hamilton Depression Rating Scale. Schizophr Res 1996; 19:
205 – 212
11. Akdemir A, Turkcapar MH, Orsel SD, Demirergi N, Dag I, Ozbay
MH: Reliability and validity of the Turkish version of the Hamil-
ton Depression Rating Scale. Compr Psychiatry 2001; 42:161 –
165
12. Baca-Garcia E, Blanco C, Saiz-Ruiz J, Rico F, Diaz-Sastre C, Cic-
chetti DV: Assessment of reliability in the clinical evaluation of
depressive symptoms among multiple investigators in a multi-
center clinical trial. Psychiatry Res 2001; 102:163 – 173
13. Bech P, Allerup P, Maier W, Albus M, Lavori P, Ayuso JL: The
Hamilton scales and the Hopkins Symptom Checklist (SCL-90):
a cross-national validity study in patients with panic disorders.
Br J Psychiatry 1992; 160:206 – 211
14. Bech P, Tanghoj P, Andersen HF, Overo K: Citalopram dose-re-
sponse revisited using an alternative psychometric approach
to evaluate clinical effects of four fixed citalopram doses compared
to placebo in patients with major depression. Psychopharmacology
(Berl) 2002; 163:20 – 25
15. Berard RMF, Ahmed N: Hospital Anxiety and Depression Scale
(HADS) as a screening instrument in a depressed adolescent
and young adult population. Int J Adolesc Med Health 1995; 8:
157 – 166
16. Berrios GE, Bulbena-Villarasa A: The Hamilton Depression
Scale and the numerical description of the symptoms of de-
pression, in The Hamilton Scales. Edited by Bech P, Coppen A.
Berlin, Springer-Verlag, 1990, pp 80 – 92
17. Brown C, Schulberg HC, Madonia MJ: Assessing depression in
primary care practice with the Beck Depression Inventory and
the Hamilton Rating Scale for Depression. Psychol Assess 1995;
7:59 – 65
18. Carroll BJ, Feinberg M, Smouse PE, Rawson SG, Greden JF: The
Carroll Rating Scale for Depression, I: development, reliability
and validation. Br J Psychiatry 1981; 138:194 – 200
19. Cicchetti DV, Prusoff BA: Reliability of depression and associ-
ated clinical symptoms. Arch Gen Psychiatry 1983; 40:987 – 990
20. Craig TJ, Richardson MA, Pass R, Bregman Z: Measurement of
mood and affect in schizophrenic inpatients. Am J Psychiatry
1985; 142:1272 – 1277
21. Daradkeh T, Abou-Saleh M, Karim L: The factorial structure of
the 17-item Hamilton Depression Rating Scale. Arab J Psychia-
try 1997; 8:6 – 12
22. Deluty BM, Deluty RH, Carver CS: Concordance between clini-
cians’ and patients’ ratings of anxiety and depression as medi-
ated by private self-consciousness. J Pers Assess 1986; 50:93 –
106
23. Demitrack MA, Faries D, Herrera JM, DeBrota D, Potter WZ: The
problem of measurement error in multisite clinical trials.
Psychopharmacol Bull 1998; 34:19 – 24
24. Entsuah R, Shaffer M, Zhang J: A critical examination of the
sensitivity of unidimensional subscales derived from the
Hamilton Depression Rating Scale to antidepressant drug ef-
fects. J Psychiatr Res 2002; 36:437 – 448
25. Faries D, Herrera J, Rayamajhi J, DeBrota D, Demitrack M, Pot-
ter WZ: The responsiveness of the Hamilton Depression Rating
Scale. J Psychiatr Res 2000; 34:3 – 10
26. Feinberg M, Carroll BJ, Smouse PE, Rawson SG: The Carroll Rat-
ing Scale for Depression, III: comparison with other rating in-
struments. Br J Psychiatry 1981; 138:205 – 209
27. Fleck MP, Poirier-Littre MF, Guelfi JD, Bourdel MC, Loo H: Facto-
rial structure of the 17-item Hamilton Depression Rating Scale.
Acta Psychiatr Scand 1995; 92:168 – 172
28. Fuglum E, Rosenberg C, Damsbo N, Stage K, Lauritzen L, Bech
P (Danish University Antidepressant Group): Screening and
treating depressed patients: a comparison of two controlled
citalopram trials across treatment settings: hospitalized pa-
tients vs patients treated by their family doctors. Acta Psychiatr
Scand 1996; 94:18 – 25
29. Gastpar M, Gilsdorf U: The Hamilton Depression Rating Scale in
a WHO collaborative program, in The Hamilton Scales. Edited
by Bech P, Coppen A. Berlin, Springer-Verlag, 1990, pp 10 – 19
30. Gibbons RD, Clark DC, Kupfer DJ: Exactly what does the Hamil-
ton Depression Rating Scale measure? J Psychiatr Res 1993; 27:
259 – 273
31. Gilley DW, Wilson RS, Fleischman DA, Harrison DW, Goetz CG,
Tanner CM: Impact of Alzheimer’s-type dementia and informa-
tion source on the assessment of depression. Psychol Assess
1995; 7:42 – 48
32. Gottlieb GL, Gur RE, Gur RC: Reliability of psychiatric scales in
patients with dementia of the Alzheimer type. Am J Psychiatry
1988; 145:857 – 860
33. Gullion CM, Rush AJ: Toward a generalizable model of symp-
toms in major depressive disorder. Biol Psychiatry 1998; 44:
959 – 972
34. Hammond MF: Rating depression severity in the elderly physically
ill patient: reliability and factor structure of the Hamilton
and the Montgomery-Åsberg Depression Rating Scales. Int J
Geriatr Psychiatry 1998; 13:257 – 261
35. Hooijer C, Zitman FG, Griez E, van Tilburg W, Willemse A, Dink-
greve MA: The Hamilton Depression Rating Scale (HDRS);
changes in scores as a function of training and version used. J
Affect Disord 1991; 22:21 – 29
36. Hotopf M, Sharp D, Lewis G: What’s in a name? a comparison
of four psychiatric assessments. Soc Psychiatry Psychiatr Epide-
miol 1998; 33:27 – 31
37. Kobak KA, Greist JH, Jefferson JW, Mundt JC, Katzelnick DJ:
Computerized assessment of depression and anxiety over the
telephone using interactive voice response. MD Comput 1999;
16:64 – 68
38. Koenig HG, Pappas P, Holsinger T, Bachar JR: Assessing diagnostic
approaches to depression in medically ill older adults: how
reliably can mental health professionals make judgments
about the cause of symptoms? J Am Geriatr Soc 1995; 43:472 –
478
39. Lambert MJ, Hatch DR, Kingston MD, Edwards BC: Zung, Beck,
and Hamilton Rating Scales as measures of treatment out-
come: a meta-analytic comparison. J Consult Clin Psychol
1986; 54:54 – 59
40. Lambert MJ, Masters KS, Astle D: An effect-size comparison of
the Beck, Zung, and Hamilton rating scales for depression: a
three-week and twelve-week analysis. Psychol Rep 1988; 63:
467 – 470
41. Leentjens AF, Verhey FR, Lousberg R, Spitsbergen H, Wilmink
FW: The validity of the Hamilton and Montgomery-Åsberg depression
rating scales as screening and diagnostic tools for depression
in Parkinson’s disease. Int J Geriatr Psychiatry 2000;
15:644 – 649
42. Leung CM, Wing YK, Kwong PK, Lo A, Shum K: Validation of the
Chinese-Cantonese version of the Hospital Anxiety and Depres-
sion Scale and comparison with the Hamilton Rating Scale of
Depression. Acta Psychiatr Scand 1999; 100:456 – 461
43. McAdams LA, Harris MJ, Bailey A, Fell R, Jeste DV: Validating
specific psychopathology scales in older outpatients with
schizophrenia. J Nerv Ment Dis 1996; 184:246 – 251
44. Maier W, Philipp M: Improving the assessment of severity of de-
pressive states: a reduction of the Hamilton Depression Rating
Scale. Pharmacopsychiatry 1985; 18:114 – 115
45. Maier W, Philipp M, Heuser I, Schlegel S, Buller R, Wetzel H: Improving
depression severity assessment, I: reliability, internal
validity and sensitivity to change of three observer depression
scales. J Psychiatr Res 1988; 22:3 – 12
46. Maier W, Heuser I, Philipp M, Frommberger U, Demuth W: Im-
proving depression severity assessment, II: content, concurrent
and external validity of three observer depression scales. J Psy-
chiatr Res 1988; 22:13 – 19
47. Marcos T, Salamero M: Factor study of the Hamilton Rating
Scale for Depression and the Bech Melancholia Scale. Acta Psy-
chiatr Scand 1990; 82:178 – 181
48. Meyer JS, Li YS, Thornby J: Validating mini-mental status, cog-
nitive capacity screening and Hamilton depression scales utiliz-
ing subjects with vascular headaches. Int J Geriatr Psychiatry
2001; 16:430 – 435
49. Middelboe T, Ovesen L, Mortensen EL, Bech P: Depressive
symptoms in cancer patients undergoing chemotherapy: a
psychometric analysis. Psychother Psychosom 1994; 61:171 –
177
50. Moberg PJ, Lazarus LW, Mesholam RI, Bilker W, Chuy IL, Ney-
man I, Markvart V: Comparison of the standard and structured
interview guide for the Hamilton Depression Rating Scale in
depressed geriatric inpatients. Am J Geriatr Psychiatry 2001; 9:35 – 40
51. Mottram P, Wilson K, Copeland J: Validation of the Hamilton
Depression Rating Scale and Montgomery and Åsberg Rating
Scales in terms of AGECAT depression cases. Int J Geriatr Psychi-
atry 2000; 15:1113 – 1119
52. Naarding P, Leentjens AF, van Kooten F, Verhey FR: Disease-
specific properties of the Rating Scale for Depression in pa-
tients with stroke, Alzheimer’s dementia, and Parkinson’s dis-
ease. J Neuropsychiatry Clin Neurosci 2002; 14:329 – 334
53. O’Brien KP, Glaudin V: Factorial structure and factor reliability
of the Hamilton Rating Scale for Depression. Acta Psychiatr
Scand 1988; 78:113 – 120
54. O’Hara MW, Rehm LP: Hamilton Rating Scale for Depression:
reliability and validity of judgments of novice raters. J Consult
Clin Psychol 1983; 51:318 – 319
55. Olsen LR, Jensen DV, Noerholm V, Martiny K, Bech P: The internal
and external validity of the Major Depression Inventory in
measuring severity of depressive states. Psychol Med 2003; 33:
351 – 356
56. Onega LL, Abraham IL: Factor structure of the Hamilton Rating
Scale for Depression in a cohort of community-dwelling eld-
erly. Int J Geriatr Psychiatry 1997; 12:760 – 764
57. Pancheri P, Picardi A, Pasquini M, Gaetano P, Biondi M: Psycho-
pathological dimensions of depression: a factor study of the
17-item Hamilton depression rating scale in unipolar de-
pressed outpatients. J Affect Disord 2002; 68:41 – 47
58. Paykel ES: Use of the Hamilton Depression Scale in General
Practice, in The Hamilton Scales. Edited by Bech P, Coppen A.
Berlin, Springer-Verlag, 1990, pp 40 – 47
59. Potts MK, Daniels M, Burnam MA, Wells KB: A structured inter-
view version of the Hamilton Depression Rating Scale: evi-
dence of reliability and versatility of administration. J Psychiatr
Res 1990; 24:335 – 350
60. Ramos-Brieva JA, Cordero-Villafafila A: A new validation of the
Hamilton Rating Scale for Depression. J Psychiatr Res 1988; 22:
21 – 28
61. Rehm LP, O’Hara MW: Item characteristics of the Hamilton Rat-
ing Scale for Depression. J Psychiatr Res 1985; 19:31 – 41
62. Reynolds WM, Kobak KA: Reliability and validity of the Hamil-
ton Depression Inventory: a paper-and-pencil version of the
Hamilton Depression Rating Scale clinical interview. Psychol
Assess 1995; 7:472 – 483
63. Riskind JH, Beck AT, Brown G, Steer RA: Taking the measure of
anxiety and depression: validity of the reconstructed Hamilton
scales. J Nerv Ment Dis 1987; 175:474 – 479
64. Santor DA, Coyne JC: Evaluating the continuity of symptomatol-
ogy between depressed and nondepressed individuals. J Ab-
norm Psychol 2001; 110:216 – 225
65. Santor DA, Coyne JC: Examining symptom expression as a func-
tion of symptom severity: item performance on the Hamilton
Rating Scale for Depression. Psychol Assess 2001; 13:127 – 139
66. Sayer NA, Sackeim HA, Moeller JR, Prudic J, Devanand DP,
Coleman EA, Kiersky JE: The relations between observer-rating
and self-report of depressive symptomatology. Psychol Assess
1993; 5:350 – 360
67. Senra Rivera C, Racano Perez C, Sanchez Cao E, Barba Sixto S:
Use of three depression scales for evaluation of pretreatment
severity and of improvement after treatment. Psychol Rep
2000; 87:389 – 394
68. Shain BN, Naylor M, Alessi N: Comparison of self-rated and cli-
nician-rated measures of depression in adolescents. Am J Psy-
chiatry 1990; 147:793 – 795
69. Smouse PE, Feinberg M, Carroll BJ, Park MH, Rawson SG: The
Carroll Rating Scale for Depression, II: factor analyses of the
feature profiles. Br J Psychiatry 1981; 138:201 – 204
70. Steinmeyer EM, Möller HJ: Facet theoretic analysis of the
Hamilton-D scale. J Affect Disord 1992; 25:53 – 61
71. Strik JJ, Honig A, Lousberg R, Denollet J: Sensitivity and specificity
of observer and self-report questionnaires in major and mi-
nor depression following myocardial infarction. Psychosomat-
ics 2001; 42:423 – 428
72. Teri L, Wagner AW: Assessment of depression in patients with
Alzheimer’s disease: concordance among informants. Psychol
Aging 1991; 6:280 – 285
73. Thase ME, Hersen M, Bellack AS, Himmelhoch JM, Kupfer DJ:
Validation of a Hamilton subscale for endogenomorphic de-
pression. J Affect Disord 1983; 5:267 – 278
74. Thompson WM, Harris B, Lazarus J, Richards C: A comparison
of the performance of rating scales used in the diagnosis of
postnatal depression. Acta Psychiatr Scand 1998; 98:224 – 227
75. Whisman MA, Strosahl K, Fruzzetti AE, Schmaling KB, Jacobson
NS, Miller DM: A structured interview version of the Hamilton
Rating Scale for Depression: reliability and validity. Psychol Assess
1989; 1:238 – 241
76. Williams JB: A structured interview guide for the Hamilton De-
pression Rating Scale. Arch Gen Psychiatry 1988; 45:742 – 747
77. Zheng YP, Zhao JP, Phillips M, Liu JB, Cai MF, Sun SQ, Huang MF:
Validity and reliability of the Chinese Hamilton Depression Rat-
ing Scale. Br J Psychiatry 1988; 152:660 – 664
78. Cronbach LJ: Coefficient alpha and the internal structure of
tests. Psychometrika 1951; 16:297 – 334
79. Briggs SR, Cheek JM: The role of factor analysis in the develop-
ment and evaluation of personality scales. J Pers 1986; 54:
106 – 148
80. Nunnally JC, Bernstein IH: Psychometric Theory, 3rd ed. New
York, McGraw-Hill, 1994
81. Fleiss JL, Shrout PE: The effects of measurement errors on
some multivariate procedures. Am J Public Health 1977; 67:
1188 – 1191
82. Landis JR, Koch GG: The measurement of observer agreement
for categorical data. Biometrics 1977; 33:159 – 174
83. Anastasi A, Urbina S: Psychological Testing, 7th ed. New York,
MacMillan, 1997
84. Bock RD, Gibbons RD, Muraki E: Full information item factor
analysis. Applied Psychol Measurement 1988; 12:261 – 280
85. Gibbons RD, Clark DC, VonAmmon CS, Davis JM: Application of
modern psychometric theory in psychiatric research. J Psychi-
atr Res 1985; 19:43 – 55
86. Bech P, Allerup P, Gram LF, Reisby N, Rosenberg R, Jacobsen O,
Nagy A: The Hamilton depression scale: evaluation of objectiv-
ity using logistic models. Acta Psychiatr Scand 1981; 63:290 – 299
87. Bech P, Gram LF, Dein E, Jacobsen O, Vitger J, Bolwig TG: Quan-
titative rating of depressive states. Acta Psychiatr Scand 1975;
51:161 – 170
88. Gorsuch RL: Factor Analysis. Hillsdale, NJ, Lawrence Erlbaum
Associates, 1983
89. Prusoff B, Klerman GL: Differentiating depressed from anxious
neurotic outpatients. Arch Gen Psychiatry 1974; 30:302 – 309
90. Edwards BC, Lambert MJ, Moran PW, McCully T, Smith KC, Ell-
ingson AG: A meta-analytic comparison of the Beck Depression
Inventory and the Hamilton Rating Scale for Depression as
measures of treatment outcome. Br J Clin Psychol 1984;
23(part 2):93 – 99
91. O’Sullivan RL, Fava M, Agustin C, Baer L, Rosenbaum JF: Sensi-
tivity of the six-item Hamilton Depression Rating Scale. Acta
Psychiatr Scand 1997; 95:379 – 384
92. Hooper CL, Bakish D: An examination of the sensitivity of the
six-item Hamilton Rating Scale for Depression in a sample of
patients suffering from major depressive disorder. J Psychiatry
Neurosci 2000; 25:178 – 184
93. Kalai A, Ginertini M, Kobak K, Engelhardt N, Williams JBW,
Evans K, Bech P, Lipsitz J, Olin J, Pearson J, Rothman M: The
GRID-HAMD: a reliability study in patients with major depres-
sion, in Abstracts of the 43rd Annual New Clinical Drug Evalua-
tion Unit (NCDEU) Meeting. Bethesda, Md, NIMH, 2003, Poster
I-19
94. Kalai A, Williams JB, Kobak KA, Lipsitz J, Engelhardt N, Evans K,
Olin J, Pearson J, Rothman M, Bech P: The new GRID HAM-D: pi-
lot testing and international field trials. Int J Neuropsychophar-
macol 2002; 5:S147 – S148
95. Rush AJ, Giles DE, Schlesser MA, Fulton CL, Weissenburger J,
Burns C: The Inventory for Depressive Symptomatology (IDS):
preliminary findings. Psychiatry Res 1986; 18:65 – 87
96. Montgomery SA, Åsberg M: A new depression scale designed to
be sensitive to change. Br J Psychiatry 1979; 134:382 – 389