The CalWORKs Project for Substance Abuse, Mental Health and Domestic Violence Issues in Welfare Reform Programs: Technical Manual

By Daniel Chandler

March 2000

CALWORKS PROJECT COLLABORATING ORGANIZATIONS AND STAFF

California Institute for Mental Health (www.cimh.org)
2030 J Street
Sacramento, CA 95814
916-556-3480 (x-106)
916-446-4519 FAX

Sandra Naylor Goodwin, Executive Director/Project Director

Joan Meisel, Policy and Practice Consultant

Daniel Chandler, Research Director

Pat Jordan, Consultant

Shelley Kushner, Project Coordinator

Debbie Richardson Cox, Project Assistant

Children & Family Futures (www.cffutures.com)
4940 Irvine Blvd., Suite 202
Irvine, CA 92620
714-505-3525
714-505-3626 FAX

Nancy K. Young, Director

Sid Gardner, President

Karen Sherman & Shaila Simpson, Associates

Family Violence Prevention Fund (www.fvpf.org)
383 Rhode Island Street, Suite 304
San Francisco, CA 94133
415-252-8990
415-252-8991 FAX

Janet Carter, Managing Director

Kelly Mitchell-Clark, Program Manager

Cindy Marano, Consultant

Carol Ann Peterson, Consultant

Acknowledgement: We appreciate the generous financial support of the National Institute of Justice, Violence Against Women Office. Additional funding has been provided by California counties, the Wellness Foundation and the David and Lucile Packard Foundation.

TABLE OF CONTENTS

I. An introduction to screening

A. Screening and trust

B. Screening instruments as a method of identification

II. Screening instruments, gold standards, and cut-points

A. Screening for mental health diagnoses

Recommendations for mental health screening instrument and cutpoints

B. Screening for problems with alcohol

Recommendations for alcohol screening instrument and cutpoints

C. Screening for problems with illicit drugs

Recommendations for illicit drug screening instrument and cutpoints

D. Screening for domestic violence

Recommendations for domestic violence screening instrument and cutpoints

Appendix I: Calculation of specificity, sensitivity and positive predictive value

Appendix II: Age and partner specific tables for domestic violence

Screening for AOD/MH/DV Page 1

I. AN INTRODUCTION TO SCREENING

A. SCREENING AND TRUST

Screening. This report provides technical information to administrators who are considering using screening instruments to assist in identifying women with alcohol and other drug (AOD), mental health (MH), or domestic violence (DV) issues, abbreviated AOD/MH/DV throughout. It should be used in conjunction with the Screening Guide, which is being published at the same time and is available from the California Institute for Mental Health or on the web at www.cimh.org.¹

Screening needs to be distinguished from assessment and “appraisal.”

Screening is the use of a simple, usually brief, set of questions that can indicate the need for a thorough assessment of AOD/MH/DV issues. The questions can be self-administered or administered by a staff member who may or may not be an AOD/MH/DV specialist. The outcome of a “positive screen” is a referral to a specialist for an assessment. We are also specifically referring to “instruments,” that is, sets of questions that have been scientifically validated.²

Assessment is the detailed evaluation of diagnosis, severity, and history that is necessary in order to determine whether treatment or services are appropriate and if so to design a treatment or service plan.

Appraisal refers to one of the formal steps in the CalWORKs Welfare-to-Work process. We do not use this term in relationship to AOD, MH, or DV issues.

One of the key reasons that these short screening instruments have been little used³ is that their psychometric properties have never been described for a population receiving welfare, particularly in a welfare reform context. This report uses information from CalWORKs Project research in two California counties to choose between and validate screening instruments for AOD, mental health and domestic violence.

This Manual provides the background and rationale for recommendations we offer to administrators and practitioners regarding which instruments to use and which cut-points to use to achieve different ends.

Trust. These instruments only “work” in a context of trust and helpfulness. Although the questions are not “direct”—such as how often do you smoke marijuana—their purpose can be surmised by respondents. Honest answers can only be expected if the screens are administered in a setting in which CalWORKs applicants believe that if they divulge sensitive information it will be used to help them. There are many possible ways of establishing such a trusting setting, including using the screening instruments in the context of a home visit.

¹ The Guide contains copies of each of the recommended instruments as well as much information on the implementation of a program that uses screening instruments. A Microsoft Word file with all of the recommended instruments in English and Spanish is also available: www.cimh.org/project.html

² Many counties ask questions about AOD/MH/DV but they are ad hoc rather than having an empirical basis.

³ Los Angeles is the only county of major size which we know to have used a standardized screening instrument.

A caution is also in order: the insensitive use of screening instruments may well cause distress or distrust among CalWORKs applicants or recipients. In fact, it is better to avoid screening instruments entirely if a trusting context cannot be established. The issue of establishing trust is discussed at considerable length in the Guide.

The CalWORKs Project Research in Kern and Stanislaus Counties Tested Screening Instruments for AOD, MH and DV

The CalWORKs Project research in Kern and Stanislaus counties offered an opportunity to validate screening instruments for AOD/MH/DV conditions with welfare recipients. The research included interviews with 347 women who had applied for cash aid (CalWORKs) in Stanislaus County and 356 women who were on-going recipients of cash aid under CalWORKs in Kern County. The Kern sample included 18 undocumented workers and 31 women with a disability, none of whom were required to participate in Welfare to Work activities. We have included these women in this analysis because a) their prevalence rates are very similar to those of the Welfare to Work group and b) we have recommended that counties provide Welfare to Work service opportunities to sanctioned or exempted women.⁴ Since the research interview classed each person as having or not having domestic violence issues within the past 12 months, mental health issues within the past 12 months, or AOD abuse or dependence within the past 12 months, these classifications are used as a “gold standard” with which to compare the results of screening instruments. The methodology for this research, and the characteristics of the samples, are described in a recent CIMH report available on the web.⁵ The overall prevalence rates we found are summarized in Table 1.

Please note that the respondents in the research are all women, so the properties of the tests described below apply only to women. In general, it has been found that prevalence and cut-points differ for men and women, so extrapolating to male CalWORKs recipients may not be valid. And the domestic violence screening instruments we tested were specifically designed for use with women.

B. SCREENING INSTRUMENTS AS A METHOD OF IDENTIFICATION

There are many approaches to identifying CalWORKs recipients who have AOD/MH/DV issues. They range from direct personal questions to observation of signs or symptoms to self-disclosure by the recipient. How counties are approaching the issue of identification is considered at some length in the CalWORKs Project Six County Case Study.⁶ The use of brief screening instruments should be considered in the larger context of a county’s identification efforts.

Table 1: Overall prevalence of AOD/MH/DV diagnoses/issues in CalWORKs Project Research in Kern County and Stanislaus County: Summer of 1999

Number of     Kern Recipients  Stanislaus Applicants
Conditions    (N=347)          (N=356)
None          45%              30%
One Only      34%              38%
Two Only      19%              26%
Three         2%               6%
TOTAL         100%             100%

⁴ Meisel, J., & Chandler, D. (2000). The CalWORKs Project Six County Case Study Project Report. Sacramento: California Institute for Mental Health, 2030 J Street, Sacramento, CA 95814.

⁵ Chandler, D., & Meisel, J. (2000). The Prevalence of Mental Health, Alcohol and Other Drug, & Domestic Violence Issues Among CalWORKs Participants in Kern and Stanislaus Counties. Sacramento: California Institute for Mental Health. Available at www.cimh.org

⁶ Meisel and Chandler, op cit.

Properties of Screening Instruments

Our companion report, the Screening Guide, discusses the advantages and disadvantages of using screening instruments. It also provides step-by-step factors to consider in designing a screening program. In this Technical Manual we present the considerations relevant to a choice of screening instruments and the data that back up our recommendations for which instruments to choose and which threshold or “cut point” indicates need for a formal assessment.

Costs of screening. Screening instruments have three types of costs. 1) There is a cost for administering instruments. 2) There is a cost for false positives, as assessments are expensive. And 3) there is the most important cost, that of false negatives: persons who need assistance but whom our test wrongly leads us to believe do not need help. These costs depend in large part on the psychometric properties of the test. An ideal test would be cheap to administer, identify virtually all persons in need and not wrongly identify many of those not in need. The ideal is rarely realized in any context, and is very unlikely in screening for AOD/MH/DV issues. Thus administrators will need to use the properties of the screening instruments we describe below to balance the three different costs to attain an optimum for their particular agency. The advantage of having known psychometrics, which we present below, is that this balancing can be conducted knowledgeably rather than blindly.
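As a rough illustration of this balancing, the sketch below totals the three costs for 1,000 screened applicants at two MH5 cut-offs. The sensitivity, specificity and prevalence figures are those reported in Table 2; the dollar amounts, the `CutPoint` class and the `screening_costs` function are hypothetical and serve only to show the arithmetic, not to suggest actual costs.

```python
from dataclasses import dataclass


@dataclass
class CutPoint:
    label: str
    prevalence: float   # share of the population with the condition
    sensitivity: float  # share of those with the condition who screen positive
    specificity: float  # share of those without the condition who screen negative


def screening_costs(cp, n, admin_cost, assessment_cost, miss_cost):
    """Return the three cost components for n persons screened."""
    with_condition = n * cp.prevalence
    without_condition = n - with_condition
    false_positives = without_condition * (1 - cp.specificity)
    false_negatives = with_condition * (1 - cp.sensitivity)
    return {
        "administration": n * admin_cost,
        "false_positives": false_positives * assessment_cost,
        "false_negatives": false_negatives * miss_cost,
    }


# Test statistics as reported in Table 2 (MH5, major depression).
low = CutPoint("cutoff .20", prevalence=0.27, sensitivity=0.89, specificity=0.63)
high = CutPoint("cutoff .60", prevalence=0.27, sensitivity=0.31, specificity=0.96)

# Hypothetical unit costs in dollars, chosen only for illustration.
for cp in (low, high):
    costs = screening_costs(cp, n=1000, admin_cost=2,
                            assessment_cost=100, miss_cost=500)
    print(cp.label, {k: round(v) for k, v in costs.items()},
          "total:", round(sum(costs.values())))
```

Raising the assumed cost of a missed case favors the lower cut-off, while raising the assumed cost of an assessment favors the higher one; an agency would substitute its own figures.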

Instrument requirements. In looking for screening instruments we wanted a very short set of questions (generally no more than five) that do not arouse anxiety by their directness. Instruments also had to be readily scoreable and interpretable to recipients.⁷

⁷ There are some situations in which a more detailed screening instrument would be useful (for use with persons already known to be at high risk, for example) and we have provided references to such instruments in Appendix II of the Screening Guide.

Context for administration. Although we will consider this issue more later on, it is important to recognize that the context for validation of the screening instruments did not match that which would be involved in their actual use. In a research interview in which absolute confidentiality and privacy is promised, CalWORKs recipients may be more ready to answer screening questions than they would be in a CalWORKs office or community based organization. The second stage of this process is developing, with some pilot counties, procedures that will maximize the efficiency of the screening instruments that this phase has validated with the population.

Even in the research context there was one important difference between how we administered screens for domestic violence and mental health compared to those for AOD. The screening questions for the former were asked prior to the gold standard questions—the usual way of testing instruments. However, we were concerned that the alcohol and drug screening questions might bias answers to the gold standard instruments, so we put the screening questions at the end. Although they “fit” better there, they may be affected by the participant having already answered a series of related questions.

The effects of age and race/ethnicity. Because test properties, as well as prevalence, may change from subpopulation to subpopulation we also explore the effects of age and race on the properties of screening tests. It is important to know how stable the properties of the screening instruments are and how to adjust their cut-offs to take account of subpopulation differences.

What counts as a “no” answer on a screening question. One important qualification is that all questions are converted into a “yes” or “no” answer, with “yes” being those who were asked the question and said “yes” and “no” being all others. The reason for this is two-fold:

First, because of the way the skips in the survey were arranged, some questions were not asked of everyone: a previous response could skip a respondent out of the section. For those who were not asked, we treated the skip-out as equivalent to a “no.” For example, women who skipped out of the alcohol section because they did not drink at least 12 drinks during the previous year were presumed to answer no to all of the questions on the screen for alcohol abuse.

Second, the alternative would have been to use the responses only from those who answered the specific questions. The prevalence rate would then have been higher than we found overall, since only those at risk were asked all the screening items. That approach also does not correspond to the screening situation we think will be most common: a set of questions asked of all CalWORKs participants. If the screening were limited to persons for whom there was already reason to suspect high risk, then our approach would not be as useful.
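The coding rule described here can be sketched in a few lines. This is a minimal illustration, assuming that a question that was never asked is stored as `None`; the function names and the sample respondents are hypothetical.

```python
from typing import Optional, Sequence


def to_yes_no(answer: Optional[str]) -> bool:
    """True only for an explicit "yes"; skipped (None) and all other answers count as "no"."""
    return answer is not None and answer.strip().lower() == "yes"


def count_yes(answers: Sequence[Optional[str]]) -> int:
    """Number of screening items counted as "yes" for one respondent."""
    return sum(to_yes_no(a) for a in answers)


# Hypothetical respondent who skipped out of the alcohol section: no item was asked.
skipped_out = [None, None, None, None]
# Hypothetical respondent who was asked every item.
asked_all = ["yes", "no", "yes", "no"]

print(count_yes(skipped_out), count_yes(asked_all))
```

Under this rule every respondent contributes to the denominator, which matches a setting where the screen is given to all participants.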

What makes a screen good. In comparing each screening instrument with its “gold standard” equivalent there are several factors we look at:⁸

How good is the instrument at detecting persons who are “positive” for the condition? This is usually termed the “sensitivity” of the screening instrument. Those who have the condition but are not detected are called “false negatives.”

How good is the instrument at screening out those who really do not have the condition? This is called the “specificity” of the instrument. Those who do not have the condition but whom the screening instrument flags as having it are called “false positives.” If the rate of false positives is high it means the instrument will be inefficient, referring many to assessments who will turn out negative.

In actual usage, we would not know in advance how a person scored on the “gold standard,” so a very important measure is the “positive predictive value” of the screening instrument. This tells us what percentage of those who score positive on the screening instrument are likely to score positive on a more detailed assessment. (The positive predictive value is directly related to sensitivity and specificity.)

If instead of trying to identify persons with a condition, we were interested in “ruling out” persons (saying, for example, here is a group for which we do not have to worry about mental health problems), then we look at negative predictive value. It tells us the percentage of those screening negative who would actually be negative if assessed. Generally, we would want a very high negative predictive value before saying definitely someone does not have an AOD/MH/DV condition.
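The four measures above can be computed from a two-by-two table of screen result against gold standard. The sketch below uses the standard definitions; the counts are hypothetical, and `ppv_from_rates` shows one way the positive predictive value follows from sensitivity, specificity and prevalence.

```python
def screen_statistics(tp, fp, fn, tn):
    """Test statistics from counts of true/false positives and negatives."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),        # detected among those with the condition
        "specificity": tn / (tn + fp),        # screened out among those without it
        "ppv": tp / (tp + fp),                # truly positive among screen positives
        "npv": tn / (tn + fn),                # truly negative among screen negatives
        "prevalence": (tp + fn) / total,
        "percent_positive": (tp + fp) / total,
    }


def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for 400 persons screened.
stats = screen_statistics(tp=80, fp=90, fn=20, tn=210)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
print("ppv from rates:", round(ppv_from_rates(0.8, 0.7, 0.25), 3))
```

Because prevalence enters the second function directly, the same instrument yields a lower positive predictive value in a population where the condition is rarer.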

Each of these values is a “raw” or uncalibrated measure. We also calculated calibrated versions of the test performance statistics. These are most suited to choosing between tests.⁹ Calibration is important because it takes out the effects of the level of the test and the prevalence rate. For example, if a test were set to be positive for 99 percent of the sample, it would also include 99 percent of those with a diagnosis, but it would not be a good test. Similarly, even a random sample will generate a “hit” rate equal to the prevalence rate. To deal with this issue, the values are recalibrated to range from zero, when the test does no better than a random sample of the population (with a given prevalence), to 1.00, when it identifies all those with a diagnosis over and above a random sample.
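The manual does not spell out the calibration formula. The sketch below assumes a chance-corrected rescaling that fits the description: the level of the test (percent positive) is what a random selection of the same size would achieve, so it is subtracted and the result is divided by the maximum possible improvement. The function names are illustrative, and the exact form used in the analysis may differ.

```python
def calibrated_sensitivity(sensitivity, percent_positive):
    """Assumed form: 0 when sensitivity equals the level of the test, 1 when it is perfect."""
    return (sensitivity - percent_positive) / (1 - percent_positive)


def calibrated_specificity(specificity, percent_positive):
    """Assumed form: 0 when specificity equals the share screened negative, 1 when perfect."""
    percent_negative = 1 - percent_positive
    return (specificity - percent_negative) / (1 - percent_negative)


# The example from the text: a test positive for 99 percent of the sample
# captures 99 percent of those with a diagnosis, which is no better than chance.
print(calibrated_sensitivity(0.99, 0.99))

# First row of Table 2 (MH5, cutoff .20): sensitivity 89%, specificity 63%, 48% positive.
print(round(calibrated_sensitivity(0.89, 0.48), 2))
print(round(calibrated_specificity(0.63, 0.48), 2))
```

On this reading, a cut-off that flags nearly everyone earns little credit for its high raw sensitivity.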

⁸ In fact a diagnosis or positive score on a longer instrument is not equivalent to knowing whether a disorder exists or does not since there is always some unreliability in the assigning of diagnoses. Although we do not take account of this unreliability in the analyses below, its general effect is to make confidence intervals wider.

⁹ Kraemer, H. C. (1992). Evaluating Medical Tests: Objective and Quantitative Guidelines. Newbury Park: Sage Publications.

We also report the Receiver Operating Curve (ROC) proportion for each test. This is a measure of the overall power of the screening questions to predict a diagnosis, independent of cut-point. It can range from .5, if the model has no predictive power, to 1.00 if it predicts 100 percent of the gold standard cases. The ROC proportions range from about .70 to .98 for the screening instruments presented here, so there is a wide range of performance across screens and diagnoses.
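One way to read the ROC proportion, assuming it is the area under the ROC curve, is as the chance that a randomly chosen person with the diagnosis has a more "positive" screen result than a randomly chosen person without it. The sketch below computes that quantity directly from pairs; the function name and the scores (counts of "yes" answers) are hypothetical.

```python
def roc_area(scores_with_condition, scores_without_condition):
    """Share of (with, without) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in scores_with_condition:
        for n in scores_without_condition:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_with_condition) * len(scores_without_condition))


# Hypothetical numbers of "yes" answers on a four-item screen.
with_condition = [3, 2, 4, 1]
without_condition = [0, 1, 0, 2]

print(roc_area(with_condition, without_condition))
```

Identical score distributions in the two groups give .5 and complete separation gives 1.00, matching the range described above.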

Inconsistent items. If a four item test is to be scored using a cut-off of 1, 2, or 3, then the items should be consistent: either each item has the same likelihood of predicting a positive on the gold standard, or the probabilities of the different items are “nested” so that all those who answer yes to D will also have answered yes to C, B and A, etc. Not all of the instruments met this criterion.
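The "nested" condition can be checked on response patterns. The sketch below is one simple way to do so, assuming items are ordered A, B, C, D from most to least commonly endorsed; the function names and the sample patterns are hypothetical.

```python
def is_nested(pattern):
    """True if no "yes" follows a "no" when items are ordered A, B, C, D."""
    seen_no = False
    for answer in pattern:
        if not answer:
            seen_no = True
        elif seen_no:
            return False
    return True


def share_nested(patterns):
    """Proportion of respondents whose answers follow the nested ordering."""
    return sum(is_nested(p) for p in patterns) / len(patterns)


# Hypothetical respondents; each tuple holds answers to items A, B, C, D.
patterns = [
    (True, True, False, False),    # nested: yes to A and B only
    (True, False, True, False),    # not nested: yes to C without B
    (False, False, False, False),  # nested: no to everything
    (True, True, True, True),      # nested: yes to everything
]

print(share_nested(patterns))
```

A low share of nested patterns would suggest that a simple count of "yes" answers mixes items with different meanings.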

A practical example

As an aid to understanding these test properties, try figuring out the results for a population of 1000 new applicants for aid. For example, using the statistics below taken from Table 2 (showing information for the MH5 with “Major Depression” as the gold standard and a cut-point of .20), we would expect:

1) Out of the 1000 persons there would be 270 who were “truly” depressed. (27% prevalence times 1000)

2) Using the .20 cut-off in this illustration, however, we would identify 480 as positive on the test and needing to be assessed. (48% is the percent positive, or level of the test; so 48% times 1000)

3) Those scoring positive on the test would include 89 percent of the 270 who are depressed or 240 persons. (Sensitivity of 89% times the 270 true positives.)

4) Since there are 270 true positives there are 730 true negatives. The 63 percent specificity means that the test correctly identifies as “negative” 63 percent of the true negatives (460 persons).

5) Out of the 480 who scored positive, 45 percent or 216 would be identified as being depressed upon an assessment (positive predictive value).

6) Of those with a negative on the test, 90 percent would be found to be a true negative if we were to do an assessment on them (negative predictive value).

7) The ROC of .80 indicates an overall moderate predictive capacity.

Illustration taken from Table 2.

Instrument  Cutoff  Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                Positive                            Predictive  Predictive  Operating
                                on Test                             Value       Value       Curve
MH5         .20     27%         48%       89%          63%          45%         90%         .80
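The steps of the practical example can be reproduced with a few lines of arithmetic. The sketch below uses only the printed percentages from the illustration; the variable names are illustrative.

```python
N = 1000  # new applicants for aid

# Printed values for the MH5 at a cutoff of .20.
prevalence, percent_positive = 0.27, 0.48
sensitivity, specificity = 0.89, 0.63
ppv, npv = 0.45, 0.90

truly_depressed = round(N * prevalence)                        # step 1
screened_positive = round(N * percent_positive)                # step 2
depressed_and_flagged = round(sensitivity * truly_depressed)   # step 3
not_depressed = N - truly_depressed
correctly_negative = round(specificity * not_depressed)        # step 4
confirmed_on_assessment = round(ppv * screened_positive)       # step 5
screened_negative = N - screened_positive
negative_if_assessed = round(npv * screened_negative)          # step 6

print(truly_depressed, screened_positive, depressed_and_flagged)
print(correctly_negative, confirmed_on_assessment, negative_if_assessed)
```

Computed from the printed percentages, steps 3 and 5 give different counts of depressed persons among those screening positive (240 and 216), so the figures are best read as approximate.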

II. SCREENING INSTRUMENTS, GOLD STANDARDS, AND CUT-POINTS

A. SCREENING FOR MENTAL HEALTH DIAGNOSES

Choice of screening instruments to be tested

Although there are numerous mental health screening instruments available, the psychometric properties of most are quite similar, with longer instruments not performing significantly better than short ones.¹⁰ We selected the mental health instrument that has seemed to perform best in a variety of trials (it is sometimes called the Mental Health Inventory or MHI-5).¹¹ It is a five item screen developed in the Medical Outcomes Study. Three of these items are used as well in the SF-12 Health Survey, also derived from the Medical Outcomes Study and a very widely used instrument for the brief assessment of functional health status.¹² Thus we have two instruments to test: the five item MH5 screen and the mental health subscore from the SF-12 Health Survey (which uses other SF-12 Health Survey items as well as the three mental health specific items and is weighted based on population studies).

For each of the five MH5 questions there are six possible responses ranging from “all of the time” to “none of the time.” The three questions also included in the SF-12 Health Survey (italicized in the original) are items 3, 4 and 5.

1. How much of the time during the past 4 weeks…have you been a very nervous person?

2. How much of the time during the past 4 weeks…have you felt so down in the dumps that nothing could cheer you up?

3. How much of the time during the past 4 weeks…have you felt calm and peaceful?

4. How much of the time during the past 4 weeks…did you have a lot of energy?

5. How much of the time during the past 4 weeks…have you felt downhearted and blue?

¹⁰ Schade, C. P., Jones, E. R. J., & Wittlin, B. (1998). A ten-year review of the validity and clinical utility of depression screening. Psychiatric Services, 49(1), 55-61.

¹¹ Weinstein, M., Berwick, D., Goldman, P., Murphy, J., & Barsky, A. (1989). A comparison of three psychiatric screening tests using receiver operating characteristic (ROC) analysis. Med Care, 27(6), 593-607; Berwick, D., Murphy, J., Goldman, P., Ware, J. E., Jr., Barsky, A., & Weinstein, M. (1991). Performance of a five-item mental health screening test. Med Care, 29(2), 169-176.

¹² The Medical Outcomes Study and the SF-12 Health Survey are described in: Ware, J. E., Jr., Kosinski, M., & Keller, S. D. (1995). SF-12 Health Survey: How to Score the SF-12 Health Survey Physical and Mental Health Summary Scales. Boston, MA: The Health Institute, New England Medical Center.

Properties of the instruments

Choice of a “Gold Standard” for Mental Health. Because depression has been shown to be correlated with welfare status we chose as one “gold standard” the diagnosis of major depression (overall prevalence 28 percent) as generated by the Composite International Diagnostic Interview, a part of our research interview. A second gold standard was a very broad criterion of “any mental health diagnosis” (overall prevalence of 39 percent). Two narrower standards were “two or more mental health disorders” (overall prevalence of 19 percent) and “has a mental health diagnosis and reports of ‘a lot’ of interference with daily activities” (overall prevalence 27 percent). In the recommendations section there is further consideration of an appropriate gold standard. At this stage we want to be sure we look at all the possible standards.

Age and race. Controlling for age and race did not improve the predictive power of the models for “any diagnosis” or for “Dx plus interference” but did improve prediction for “major depression.” However, separate analysis for Hispanics and African-Americans (the two racial/ethnic groups that stood out) for major depression showed the same screening instrument to be most effective for these groups as for the population as a whole.¹³

Individual items versus scale scores. Of the five mental health specific items (MH5), not all were significant for each race/ethnicity category. For example, “feeling blue” was only marginally statistically significant overall, but was the most significant for African-Americans. Overall, however, selecting a model from among the five items did not result in any improvements in efficiency over the MH5 or the weighted mental health scale score (MCS12). For each of the gold standards multiple cut-offs are given below for both the MH5 and the MCS12. The range of cut-offs to be displayed was chosen by inspecting a graph of sensitivity vs. specificity.

Note that if the SF-12 Health Survey is used the weighted score will be calculated by a vendor (it is probably too complicated for individual welfare departments to calculate and certainly too complicated to hand score). These scores are calculated in such a way that a high score means a low probability of having the diagnosis.

If the MH5 is used, the response categories need to be set up in such a way that the low score means the symptom is more frequent. The overall score then is created by summing the item scores and dividing by 5. As with the MCS12, the higher the score the lower the probability of having the diagnosis.
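A minimal scoring sketch follows. It assumes each response is recorded as its position among the six options, from 1 for “all of the time” to 6 for “none of the time,” and that the two positively worded items (items 3 and 4, calm and energy) are reversed so that a low score always means more frequent symptoms. Both the numeric coding and the choice of reversed items are assumptions made for illustration, as are the names `mh5_score` and `POSITIVELY_WORDED`.

```python
# Assumption: items 3 ("calm and peaceful") and 4 ("a lot of energy") are reverse-coded.
POSITIVELY_WORDED = {3, 4}


def mh5_score(responses):
    """Mean MH5 item score.

    responses maps item number (1-5) to the position of the chosen option,
    1 = "all of the time" through 6 = "none of the time".
    """
    if set(responses) != {1, 2, 3, 4, 5}:
        raise ValueError("responses must cover items 1 through 5")
    total = 0
    for item, position in responses.items():
        if not 1 <= position <= 6:
            raise ValueError("each response must be between 1 and 6")
        # Reverse positively worded items so low always means more symptoms.
        total += (7 - position) if item in POSITIVELY_WORDED else position
    return total / 5


# Hypothetical respondent with fairly frequent symptoms.
example = {1: 2, 2: 3, 3: 5, 4: 4, 5: 2}
print(mh5_score(example))
```

With this coding the mean ranges from 1 (most symptomatic) to 6 (least symptomatic), the same direction as the scores shown in the tables.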

Reading the tables. The test statistics described in Part I run across the table (column labels). In the left hand column is the description of the instrument and the cutoff points being tested. The cut-off point is described in two ways. First there is the probability of having the diagnosis. This means that if the table says “cutoff=.20,” as in the first line of Table 2, everyone with a probability of .20 or more is classified as a “yes” or a “hit” and referred for assessment. That is, the percent positive on the test will consist of all those with a .20 or greater probability.

¹³ The instrument worked better for African-Americans, however, so somewhat higher “hit” rates could be expected if the population was largely African-American.

Beneath the probability is the actual score on the test that would result in that probability. Remember that both the MCS12 and the MH5 are scored in such a way that lower scores mean a high probability of having the diagnosis. So, to use the same example, when the MH5 (in the first table) shows a cut-off of .20 probability, this corresponds with an actual mean score on the 5 items of 4.3, so anything less than 4.3 would be counted as a “hit” and a referral made (if this were the cut-off point we adopted).
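The correspondence between probability cut-offs and scores can be applied as a simple lookup. The sketch below uses the MH5 scores printed in Table 2 for major depression; the dictionary and function names are illustrative.

```python
# Probability cut-off -> mean MH5 score, as printed in Table 2 (major depression).
TABLE_2_MH5_SCORE_AT_CUTOFF = {0.20: 4.3, 0.30: 3.7, 0.40: 3.3, 0.50: 2.9, 0.60: 2.5}


def refer_for_assessment(mean_score, probability_cutoff):
    """A mean score below the threshold for the chosen cut-off counts as a "hit"."""
    threshold = TABLE_2_MH5_SCORE_AT_CUTOFF[probability_cutoff]
    return mean_score < threshold


# The same hypothetical mean score leads to different decisions at different cut-offs.
print(refer_for_assessment(3.4, 0.20))
print(refer_for_assessment(3.4, 0.50))
```

Because lower scores mean a higher probability of the diagnosis, moving to a higher probability cut-off lowers the score threshold and refers fewer people.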

Calibrated sensitivity and specificity and efficiency. The calibrated sensitivity and specificity are indicated on each table to show the most sensitive, specific and efficient cut-off for each test (MH5 and MCS12) across all cut-offs. So for each gold standard and instrument, the most sensitive test is shown by (blue) italicized and bolded type. The same applies to specificity. The most efficient test is bolded (red) across the row. The most efficient test would ordinarily be the choice unless there was a reason to optimize sensitivity (for example, if we are more interested in identifying persons who need an assessment than in optimizing efficiency).

Table 2: Depression as the gold standard: MH5 (see “Reading the tables” above for how to read tables)

Instrument  Cutoff  Score   Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                        Positive                            Predictive  Predictive  Operating
                                        on Test                             Value       Value       Curve
MH5         .20     4.3     27%         48%       89%          63%          45%         90%         .80
MH5         .30     3.7     27%         35%       67%          77%          52%         86%         .80
MH5         .40     3.3     27%         25%       54%          86%          58%         84%         .80
MH5         .50     2.9     27%         16%       40%          92%          65%         81%         .80
MH5         .60     2.5     27%         11%       31%          96%          75%         79%         .80

Table 3: Depression as the gold standard: MCS12 (see above for how to read tables)

Instrument  Cutoff  Score   Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                        Positive                            Predictive  Predictive  Operating
                                        on Test                             Value       Value       Curve
MCS12       .20     46.6    26%         45%       76%          66%          45%         88%         .79
MCS12       .30     41.5    26%         32%       65%          81%          55%         87%         .79
MCS12       .40     36.3    26%         24%       54%          87%          60%         84%         .79
MCS12       .50     32.0    26%         17%       42%          92%          65%         81%         .79
MCS12       .60     27.7    26%         11%       31%          95%          70%         79%         .79

Table 4: Any diagnosis (including PTSD) as the gold standard: MH5 (see above for how to read tables)

Instrument  Cutoff  Score   Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                        Positive                            Predictive  Predictive  Operating
                                        on Test                             Value       Value       Curve
MH5         .25     4.7     39%         59%       85%          57%          56%         86%         .80
MH5         .37     4.1     39%         43%       73%          77%          67%         82%         .80
MH5         .50     3.7     39%         35%       61%          82%          69%         77%         .80
MH5         .60     3.3     39%         19%       47%          92%          78%         73%         .80
MH5         .70     2.9     39%         16%       35%          95%          82%         70%         .80

Table 5: Any diagnosis (including PTSD) as the gold standard: MCS12 (see above for how to read tables)

Instrument  Cutoff  Score   Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                        Positive                            Predictive  Predictive  Operating
                                        on Test                             Value       Value       Curve
MCS12       .22     52.75   39%         65%       88%          50%          53%         87%         .79
MCS12       .32     47.8    39%         47%       72%          69%          59%         80%         .79
MCS12       .42     43.5    39%         37%       63%          79%          66%         77%         .79
MCS12       .52     39.5    39%         29%       56%          88%          75%         76%         .79
MCS12       .62     35.4    39%         22%       44%          92%          78%         73%         .79

Table 6: Gold Standard: Any diagnosis (including PTSD) PLUS at least one diagnosis that respondent said interfered “a lot” with normal activities

Instrument  Cutoff  Score   Prevalence  Percent   Sensitivity  Specificity  Positive    Negative    Receiver
                                        Positive                            Predictive  Predictive  Operating
                                        on Test                             Value       Value       Curve
MH5         .20     4.3     28%         48%       78%          63%          45%         88%         .79
MH5         .30*    3.8     28%         38%       70%          74%          50%         86%         .79
MH5         .50     2.9     28%         25%       51%          85%          57%         82%         .79
MH5         .60     2.5     28%         14%       38%          92%          64%         79%         .79
MH5         .70     2.0     28%         11%       30%          96%          73%         78%         .79
MCS12       .12     55.5    27%         73%       95%          36%          36%         95%         .77
MCS12       .22     47.05   27%         44%       71%          66%          44%         86%         .77
MCS12       .32*    40.6    27%         30%       60%          81%          54%         84%         .77
MCS12       .42     35.93   27%         22%       48%          87%          59%         82%         .77
MCS12       .52     31.25   27%         15%       36%          93%          65%         79%         .77


Table 7: Gold Standard: Two or more diagnoses (including PTSD)

Instrument, cutoff and score | Prevalence | Percent Positive on Test | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value | Receiver Operating Curve
MH5 Cutoff = .12, Score = 4.3 | 19% | 48% | 88% | 61% | 35% | 95% | .82
MH5 Cutoff = .22, Score = 3.5 | 19% | 30% | 66% | 79% | 43% | 91% | .82
MH5 Cutoff = .32, Score = 3.1 | 19% | 20% | 54% | 88% | 51% | 89% | .82
MH5 Cutoff = .42, Score = 2.7 | 19% | 14% | 40% | 94% | 60% | 87% | .82
MH5 Cutoff = .52, Score = 2.3 | 19% | 9% | 33% | 97% | 71% | 86% | .82
MCS12 Cutoff = .12, Score = 47.63 | 19% | 46% | 81% | 63% | 34% | 93% | .81
MCS12 Cutoff = .22, Score = 39.9 | 19% | 29% | 69% | 81% | 45% | 92% | .81
MCS12 Cutoff = .32, Score = 34.5 | 19% | 21% | 56% | 88% | 52% | 89% | .81
MCS12 Cutoff = .42, Score = 30.2 | 19% | 14% | 41% | 92% | 56% | 87% | .81
MCS12 Cutoff = .52, Score = 26.0 | 19% | 10% | 30% | 95% | 62% | 85% | .81


RECOMMENDATIONS FOR MENTAL HEALTH SCREENING INSTRUMENT AND CUTPOINTS

Considerations in choice of instrument

Both recommended mental health screening instruments developed out of the Medical Outcomes Study, a major population survey of the 1980s. One version uses five questions (the MH5) while the other uses three direct questions on mental health from the SF-12 Health Survey but includes other related questions (on functional disability) and weights them all in terms of national norms. In the CalWORKs population, neither test performed as well as would have been desired. However, if the tests perform in practice as they did in the research setting, one might expect ten percent to fifteen percent of CalWORKs recipients to be referred for mental health services.

If screening for health-related functional difficulties is being contemplated then the mental health score (MCS12) of the SF-12 Health Survey screening instrument would be the choice for mental health screening. If only mental health, and not physical health, is the subject of screening, then the MH5 would be the choice. Both tests perform very similarly.

Aside from the issue of whether functional health problems are screened for, there are several practical considerations to be weighed:

1. The MH5 can be scored at the time of administration, although it is complex enough that it may not be feasible for CalWORKs recipients to do the scoring. Both addition and division are required in order to get a mean, and the scores are read in a counter-intuitive way—low scores indicate greater likelihood of mental disorder.

2. The SF-12 Health Survey employs a complex weighted scoring method. Although permission to use the SF-12 Health Survey for free is routinely granted, scoring is complicated. There are a variety of vendors available who offer fax-back and other administration methods. With the SF-12 Health Survey, scoring is not immediate and cannot be done by the respondent, which limits the ability to use self-scoring as a way of building trust.

3. The mental health scale on the SF-12 Survey has national norms associated with it, which the MH5 does not.

4. A big advantage of the SF-12 Health Survey is that the mental health questions are unlikely to be threatening to respondents both because they occur in the context of a set of health questions and because the questions themselves are indirect (feeling calm, having energy). While the MH5 uses similarly indirect questions, the fact that there are five of them together would be more likely to raise the defenses of people (“I’m not crazy”).

Overall, our suggestion is that the SF-12 Health Survey in its entirety makes most sense in the context of CalWORKs. It need not be administered at the same time as the AOD and DV screens are, so the need for off-site scoring would not necessarily be a disadvantage. However, we present summary statistics for both tests.


Considerations in choice of gold standard

Depending on the choice of gold standard, prevalence of a mental disorder was found to range from 19 to 39 percent. While depression has been the disorder most often linked to longer welfare tenure or more difficulty working, it is by no means the only diagnosis that could have these effects; many other diagnoses could be equally or even more disabling. We definitely would like to include PTSD, for example. In fact, we tabulated the diagnoses given to CalWORKs recipients identified as having a mental health issue by the co-located behavioral health team in Stanislaus County and found that only 25 percent of those identified had a diagnosis of major depression.

However, the criterion of “any diagnosis” would appear too broad, as a great many people in the population live and work with diagnosable disorders such as specific phobias. At the other extreme, while persons with two or more diagnoses have been found in the National Comorbidity Survey to have more functional impairments, this standard fails because many persons with only one diagnosis could need treatment.

In the CalWORKs context we suggest that the gold standard of at least one diagnosis for which respondents judged the symptoms to interfere “a lot” with their “life or activities” is most appropriate.

Considerations in choice of cut-point

The prevalence of a diagnosis that involves “a lot” of interference with life or activities is 28 percent (25 percent in Kern and 30 percent in Stanislaus). Depending on which cut-point is used on the MCS12 or the MH5, between 73 and 11 percent of the CalWORKs recipients in Kern and Stanislaus were identified as positive. The percentage of those with this condition who are correctly identified ranges from 95 percent to 30 percent. However, to identify 95 percent one would need to refer for assessment 73 percent of all those screened, an unrealistic number.

It seems unlikely that a CalWORKs program could assess more than 25 percent of its applicants/recipients, and perhaps far fewer. Therefore we limit possible cut-points to those that would not result in more than approximately 25 percent being referred for further testing or assessment. For the MH5 this would mean probability scores of .50 or above, which is equal to a mean score on the five items of 2.9 or less (although both the most efficient and the most sensitive test would use a probability score of .30). For the MCS12 the cut-point would be probability scores of .42 and above (which is equal to a score of 35.93 or less), although again both the most sensitive and most efficient cut-points are lower.
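The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the rows are copied from Table 6, while the function name `pick_cutoff` and the hard cap of 25 percent are assumptions made for the example (the text says "approximately 25 percent").

```python
# Rows copied from Table 6:
# (probability cutoff, percent positive on test, sensitivity in percent)
TABLE6 = {
    "MH5": [(0.20, 48, 78), (0.30, 38, 70), (0.50, 25, 51),
            (0.60, 14, 38), (0.70, 11, 30)],
    "MCS12": [(0.12, 73, 95), (0.22, 44, 71), (0.32, 30, 60),
              (0.42, 22, 48), (0.52, 15, 36)],
}


def pick_cutoff(rows, max_percent_referred=25):
    """Return the most sensitive row whose referral rate is within the cap.

    Hypothetical helper: the 25 percent cap is treated as a hard limit here.
    """
    feasible = [row for row in rows if row[1] <= max_percent_referred]
    if not feasible:
        return None
    return max(feasible, key=lambda row: row[2])


for name, rows in TABLE6.items():
    cutoff, pct_positive, sensitivity = pick_cutoff(rows)
    print(f"{name}: cutoff {cutoff:.2f}, {pct_positive}% referred, "
          f"sensitivity {sensitivity}%")
```

Under these assumptions the rule selects the .50 cutoff for the MH5 and the .42 cutoff for the MCS12. Raising the cap would move the choice toward the more sensitive cutoffs.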

If, in practice, the use of the recommended cut-points results in fewer referrals than would be suggested by the research data, a lower probability cut-point (that is, a higher cutoff score) could be adopted. Likewise, if the percentage of “positive” results on later testing or assessment exceeds that predicted from the research, the cut-point could be reduced. In the table below we show predicted outcomes of screening using the recommended cut-points for both the MCS12 and the MH5, assuming 1000 persons are screened.


Table 8: Predicted outcomes of screening for mental health issues using the recommended instruments, gold standard and cut-points

MCS12 with cutoff score of 35.93 or less

Number screened: 1000
Number with a diagnosis that interfered a lot with life and activities during past year: 270
Number identified on test as a “positive” and referred for further testing or assessment: 220
Number of persons identified after further testing or assessment as having a disabling psychiatric condition within past year: 130
Number of “false positives” who were referred for further testing or assessment but are not identified as having a disabling psychiatric condition within past year: 90
Number of “false negatives” who were not referred for further testing or assessment but did have a disabling psychiatric condition within past year: 140

MH5 with cutoff score of 2.9 or less

Number screened: 1000
Number with a diagnosis that interfered a lot with life and activities during past year: 280
Number identified on test as a “positive” and referred for further testing or assessment: 250
Number of persons identified after further testing or assessment as having a disabling psychiatric condition within past year: 143
Number of “false positives” who were referred for further testing or assessment but are not identified as having a disabling psychiatric condition within past year: 107
Number of “false negatives” who were not referred for further testing or assessment but did have a disabling psychiatric condition within past year: 137
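The counts in Table 8 appear to follow from the prevalence, percent positive and positive predictive value reported in Table 6 for the recommended cut-points. The following Python sketch shows that arithmetic; the function name and the round-half-up convention are assumptions made for illustration, not a description of how the original figures were produced.

```python
def predicted_counts(n_screened, prevalence_pct, positive_pct, ppv_pct):
    """Predicted screening outcomes from whole-number percentages.

    Hypothetical helper; integer arithmetic with round-half-up is an
    assumption chosen so that results do not depend on floating point.
    """
    with_condition = n_screened * prevalence_pct // 100
    referred = n_screened * positive_pct // 100
    # True positives = referred persons times positive predictive value.
    true_positives = (referred * ppv_pct + 50) // 100
    return {
        "with_condition": with_condition,
        "referred": referred,
        "true_positives": true_positives,
        "false_positives": referred - true_positives,
        "false_negatives": with_condition - true_positives,
    }


# Table 6 rows at the recommended cut-points.
mcs12 = predicted_counts(1000, prevalence_pct=27, positive_pct=22, ppv_pct=59)
mh5 = predicted_counts(1000, prevalence_pct=28, positive_pct=25, ppv_pct=57)
print(mcs12)
print(mh5)
```

The same function can be reused for other instruments by passing in the relevant row of a results table.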

B. SCREENING FOR PROBLEMS WITH ALCOHOL

Choice Of Screening Instruments To Be Tested

Most tests of screening instruments with women have been conducted in prenatal care, in emergency rooms or in primary care. The emergency room population is similar to the welfare reform population in that most clients were on AFDC. TWEAK, RAPS and CAGE are the three instruments shown to have greatest sensitivity and specificity. The CAGE, although very widely used, has been found in other tests not to be equally sensitive across ethnic and racial groups;14 nor are the items equally discriminating.15 In recent tests the TWEAK has performed best for women in pregnancy, and the RAPS in emergency rooms.16 A recent review recommends the TWEAK for women.17

In administering these questions in the survey, persons who had clearly indicated they were non-drinkers were skipped out of this section. We have assumed that, as non-drinkers, they would have answered each question negatively.

The questions in each instrument are shown below (note that there is some overlap between instruments):

TWEAK

Can you tell me how many drinks you can hold? [Six or more is cut-off]

Have close friends or relatives complained about your drinking in the past year?

Do you sometimes take a drink in the morning when you first get up?

Has a friend or family member ever told you about things you said or did while you were drinking, that you could not remember? If yes, Did that happen in the last 12 months?

Do you sometimes feel the need to cut down on your drinking?

14 Cherpitel, C. J. (1998). Differences in performance of screening instruments for problem drinking among blacks, whites and Hispanics in an emergency room population. J Stud Alcohol, 59(4), 420-426.
15 Volk, R. J., Cantor, S. B., Steinbauer, J. R., & Cass, A. R. (1997). Item bias in the CAGE screening test for alcohol use disorders. J Gen Intern Med, 12(12), 763-769.
16 Cherpitel, C. J. (1995). Screening for alcohol problems in the emergency room: a rapid alcohol problems screen. Drug Alcohol Depend, 40(2), 133-137.
17 Sinha, R. (2000). Women. In A. S. G. Zernig, M. Kurz, and S. S. O'Malley, (Ed.), Handbook of Alcoholism. Boca Raton: CRC Press.

RAPS

Do you sometimes take a drink in the morning when you first get up?

Has a friend or family member ever told you about things you said or did while you were drinking, that you could not remember? If yes, Did that happen in the last 12 months?

How often during the last year have you had a feeling of guilt or remorse after drinking? (Circle the number below that is most accurate.) Would you say….Daily or almost daily, weekly, monthly, less than monthly, never.

How often during the last year have you failed to do what was normally expected from you because of drinking? Would you say….Daily, weekly, monthly, less than monthly, never.

During the last year have you lost friends (girlfriends/boyfriends) because of drinking?

CAGE

Do you sometimes feel the need to cut down on your drinking?

During the past 12 months, have people annoyed you by criticizing your drinking?

Have you ever felt bad or guilty about your drinking? [Inferred from Never vs. Any on, “How often during the last year have you had a feeling of guilt or remorse after drinking?”]

Do you sometimes take a drink in the morning when you first get up?

The “Gold Standards” For Problems With Alcohol

We tested the screening instruments against two standards. The more restrictive was for alcohol dependence or abuse (DSM-IV diagnoses generated by the CIDI). The more general added to this group those who reported that they drank five drinks or more at least once a month in the prior year (binge drinkers).

FINDINGS

Alcohol Dependence Or Abuse As The “Gold Standard”

Prevalence. Prevalence for alcohol dependence and/or abuse is around 9 percent overall and even lower among Hispanics (slightly over 5 percent). Screening with such a low prevalence is more difficult. The alcohol screens did quite well in these circumstances, perhaps reflecting a fairly extensive history of instrument validation (although not specifically with welfare reform participants). One consequence of the low prevalence, however, was that we could only test the instruments at a low cut-off (any yes answer vs. no yes answer), because there were too few positive cases (N of 10 is required) to use a higher cut-off. Other research, however, has already pointed to the need to use a low cut-point for women as opposed to the higher cut-off validated for men.18

Was any particular screen better than the others? We tested both the separate instruments and the complete set of nine items. (The nine items were used in different combinations in the three instruments they were drawn from.) Of particular interest was the fact that not all of the items were equally predictive. In fact, of the five items in the TWEAK only three were predictive at a statistically significant level and of the four items of the CAGE only two were predictive. The “Best Model” presented below was created by using only items that were statistically significant.

Note that the positive predictive value is relatively low for all screens. Practically, this means that around 50 percent of those referred for assessment will be negative for a diagnosis at assessment. Given the low numbers identified (70 to 110 out of 1000), however, the costs for the assessments would seem reasonable.
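The link between low prevalence and a modest positive predictive value can be illustrated with the standard identity relating the two. The sketch below is not the report's calibrated, covariate-adjusted calculation. It uses illustrative inputs of 8 percent prevalence, 70 percent sensitivity and 94 percent specificity (chosen to be close to the values reported for the alcohol screens), and because published figures are rounded and model-adjusted, it will only approximate them.

```python
def screen_rates(prevalence, sensitivity, specificity):
    """Return (proportion positive on test, positive predictive value).

    All arguments are proportions between 0 and 1. Hypothetical helper
    based on the standard Bayes relationship, not on the report's models.
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    positive = true_pos + false_pos
    if positive == 0:
        raise ValueError("no positives expected; PPV is undefined")
    return positive, true_pos / positive


# Same test properties at a low and at a higher prevalence.
low = screen_rates(prevalence=0.08, sensitivity=0.70, specificity=0.94)
high = screen_rates(prevalence=0.28, sensitivity=0.70, specificity=0.94)
print(f"prevalence 8%:  {low[0]:.1%} positive, PPV {low[1]:.1%}")
print(f"prevalence 28%: {high[0]:.1%} positive, PPV {high[1]:.1%}")
```

With identical sensitivity and specificity, roughly half of the positives are true cases at 8 percent prevalence, compared with about four in five at 28 percent prevalence.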

Taking age and ethnicity into account.

For the TWEAK and the RAPS, Hispanic race/ethnicity was highly significant (age over 30 was not). The table below shows the test properties for the TWEAK and the RAPS controlling statistically for Hispanic/non-Hispanic ethnicity. For the CAGE, race was not significant,19 but age (over or under 30) was, so for the CAGE we have controlled for age in the table below.20

The properties of the “Best Model”—the most predictive model using age, race and the nine items found in the three screens—were nearly identical to the TWEAK. Given this near equivalence and the fact that the Best Model has not been validated on a sample different from the one in which it was derived,21 we do not suggest using it in practice.

Choosing between sensitivity, specificity and efficiency. The calibrated sensitivity and specificity are indicated to show the most sensitive, specific and efficient test—taking into account prevalence and the percent positive on the test at different cut-off points. The most sensitive test is shown by italicized and bolded type (in blue). The same applies to specificity. The most efficient test is bolded across the row (in red). The most efficient test would ordinarily be the choice unless there was a reason to optimize sensitivity (i.e., if we are more interested in identifying persons who need an assessment than in optimizing efficiency) or specificity (i.e., we are more interested in “ruling out” persons who do not have a diagnosis). In this case, if we want the most sensitive test of the three tests then we would choose the TWEAK. The CAGE maximizes both specificity and efficiency. Given the low prevalence and low percent positive on the test, however, it seems reasonable to maximize sensitivity—which would mean using the TWEAK.

18 Cherpitel, C. (1997). Analysis of cut points for screening instruments for alcohol problems in the emergency room. Journal of Studies on Alcohol, 56(6), 695-700.
19 This is an interesting finding in view of the fact that the CAGE was more affected by racial differences in some other trials. See: Cherpitel, C. J. (1998). Differences in performance of screening instruments for problem drinking among blacks, whites and Hispanics in an emergency room population. J Stud Alcohol, 59(4), 420-426.
20 The actual sensitivity and specificity varied little, however, regardless of whether age and race were controlled.
21 Because unique characteristics of any sample may affect psychometric properties, it is usually recommended that tests be validated on a new sample. We are unable to do that in the case of this Best Model.

Table 9: Gold standard of dependence/abuse (controlling for Hispanic ethnicity for TWEAK and RAPS and age for the CAGE and Best Model)

Abuse/dependence | Prevalence | Percent Positive on Test | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value | Receiver Operating Curve
TWEAK | 8% | 11% | 70% | 94% | 47% | 97% | .86
RAPS | 8% | 7% | 60% | 97% | 62% | 97% | .82
CAGE | 8% | 8% | 60% | 97% | 60% | 97% | .82
BEST MODEL22 | 8% | 12% | 74% | 94% | 48% | 98% | .86

Binge Drinking Or Alcohol Dependence Or Abuse As The “Gold Standard”

There were 53 women in the two counties who scored positively on alcohol dependence or abuse; an additional 44 reported binge drinking (total N of 97). We evaluated the screens to see how they worked with the broader “gold standard” of abuse/dependence/binge drinking. Again, the cut-off was a “yes” on one or more questions.

The effect of age and ethnicity. Age and race/ethnicity were included in each model initially but they did not improve prediction significantly in any of the models.

Table 10: Gold standard of dependence/abuse or binge drinking (age and ethnicity held constant if statistically significant)

Abuse/dependence/binge | Prevalence | Percent Positive on Test | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value | Receiver Operating Curve
TWEAK | 14% | 11% | 57% | 96% | 70% | 93% | .77
RAPS | 14% | 7% | 41% | 98% | 77% | 91% | .70
CAGE | 14% | 8% | 44% | 98% | 80% | 92% | .71
BEST MODEL23 | 14% | 11% | 58% | 96% | 70% | 93% | .77

As has been reported elsewhere with regard to these screens,24 performance is poorer when high risk drinking (binge drinking) is added to dependence/abuse as the gold standard. Given the small number of persons overall who are alcohol dependent/abusive or binge drinkers, the number of false positives seems small enough so that any woman answering any question positively should receive an assessment.

22 Five of the nine questions were statistically significant: more than six, cut-down, amnesia, failed, guilt.
23 Four of the nine questions were optimum: complain, sixplus, cut-down, guilt.
24 Cherpitel, C. J. (1998). Differences in performance of screening instruments for problem drinking among blacks, whites and Hispanics in an emergency room population. J Stud Alcohol, 59(4), 420-426.

Inconsistent items. In the analysis of the CAGE, two of the four items turned out not to be at all statistically significant. This means not only that they did not add to predictive power but also that the four items are not nested, making choice of a cutoff point difficult.

Choosing between sensitivity, specificity and efficiency. The calibrated sensitivity and specificity are indicated to show the most sensitive, specific and efficient test—taking into account prevalence and the percent positive on the test at different cut-off points. The most sensitive test is the TWEAK, while the CAGE is most specific. The most efficient test is the TWEAK. So choosing the TWEAK maximizes both sensitivity and efficiency.

RECOMMENDATIONS FOR ALCOHOL SCREENING INSTRUMENT AND CUTPOINTS

Considerations in choice of instrument

Because of the generally very low rates of identification of CalWORKs recipients with alcohol abuse or dependence, we put primary reliance on sensitivity in comparing instruments. We tested the TWEAK, CAGE, RAPS and a set of optimized items (“Best Model”). The TWEAK has been found to outperform the CAGE with women in perinatal programs, and it performed somewhat better in our research with CalWORKs recipients as well. In essence, regardless of the gold standard adopted, the TWEAK received a higher percentage of positive scores than did the other tests and, even though its positive predictive value was somewhat lower than the other instruments, the percentage of women identified having either alcohol dependence/abuse or dependence/abuse/bingeing was higher. For example, in Kern and Stanislaus combined, 5.2 percent of the sample would have been identified with a dependence/abuse problem by the TWEAK versus 4.3 percent for the RAPS and 4.8 percent for the CAGE. When binge drinking is considered as well, the percentages correctly identified are 7.7 for the TWEAK, 5.4 for the RAPS, and 6.4 for the CAGE. Given the low referral rates for alcohol problems, the greater sensitivity of the TWEAK seems to be the most important finding.25
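The percentages quoted above can be reproduced from Tables 9 and 10 by multiplying the percent positive on the test by the positive predictive value, which is the method footnote 25 describes. The Python sketch below does this; the dictionary layout and function name are assumptions made for the example.

```python
# (percent positive on test, positive predictive value in percent),
# copied from Table 9 (dependence/abuse) and Table 10 (plus binge drinking).
SCREENS = {
    "dependence/abuse": {"TWEAK": (11, 47), "RAPS": (7, 62), "CAGE": (8, 60)},
    "dependence/abuse/binge": {"TWEAK": (11, 70), "RAPS": (7, 77),
                               "CAGE": (8, 80)},
}


def percent_identified(positive_pct, ppv_pct):
    """Percent of the whole sample correctly identified by the screen."""
    return round(positive_pct * ppv_pct / 100, 1)


for standard, screens in SCREENS.items():
    for name, (positive_pct, ppv_pct) in screens.items():
        print(f"{standard:24s} {name:6s} "
              f"{percent_identified(positive_pct, ppv_pct)}")

# Alternative route: prevalence times sensitivity (Table 9, TWEAK row).
# It gives a slightly different figure because the inputs are rounded.
alt_tweak = round(8 * 70 / 100, 1)
print("TWEAK via prevalence x sensitivity:", alt_tweak)
```

The gap between the two routes for the TWEAK (5.2 versus 5.6) is an example of the rounding differences that footnote 25 mentions.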

Considerations in choice of gold standard

The two gold standards considered were a) diagnosis on the CIDI of alcohol dependence or alcohol abuse and b) diagnosis of alcohol dependence or abuse or binge drinking, defined as drinking five or more drinks at one time at least once a month. As noted above, inclusion of binge drinking increased the percentage of recipients identified by the TWEAK from 5.6 to 7.7. In light, too, of recent findings that misuse of alcohol that does not involve dependence accounts for high percentages of the negative effects of alcohol on job performance,26 we recommend using the broader standard.

25 These figures are obtained by multiplying the positive predictive value by the percent positive on the test. Rounding makes these somewhat different from the results obtained by multiplying the prevalence by the sensitivity.
26 A Harvard study of 14,000 workers in Fortune 500 companies showed most loss of productivity was due to the drinking behaviors of non-alcoholics. http://www.nmisp.org/alcohol/alc-work.htm

Considerations in choice of cut-point

Because of the relatively low prevalence of alcohol problems we did not have a large enough sample size to test a high cutoff. Therefore, the only cut-point we can speak to is a “low” cutoff of one or more “yes” answers on any of the five questions on the TWEAK. In practice, one might experiment with asking those who answer only one question positively to take a further, more detailed test, such as the SASSI, while those who answer yes to two or more questions would be referred directly for a professional assessment.
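The two-tier approach suggested above could be sketched as follows. This is a hypothetical illustration only: the function names, the argument names and the wording of the outcomes are assumptions, and the amnesia item is assumed to have already been reduced to a single yes/no answer by the interviewer.

```python
def tweak_yes_count(max_drinks_held, complained, morning_drink,
                    amnesia, cut_down):
    """Count positive answers on the five TWEAK items.

    The first item is positive when the respondent reports being able to
    hold six or more drinks; the other four are yes/no answers.
    """
    answers = [max_drinks_held >= 6, complained, morning_drink,
               amnesia, cut_down]
    return sum(bool(answer) for answer in answers)


def triage(yes_count):
    """Map the number of positive answers to a suggested next step."""
    if yes_count >= 2:
        return "refer for professional assessment"
    if yes_count == 1:
        return "offer further, more detailed test"
    return "no referral"


count = tweak_yes_count(max_drinks_held=6, complained=False,
                        morning_drink=False, amnesia=False, cut_down=False)
print(count, triage(count))
```

Because only the low cutoff could be tested in this research, the split between one and two or more positive answers remains an experiment rather than a validated rule.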

The table below shows the outcomes we would predict based on the research if 1000 CalWORKs recipients were screened for dependence, abuse or binge drinking using the TWEAK.

Table 11: Predicted outcomes of screening for alcohol issues using the recommended instrument, gold standard and cut-point

TWEAK for dependence, abuse or binge drinking

Number screened: 1000
Number with a research-based CIDI diagnosis of abuse or dependence or who drink 5 drinks at a time at least once a month during the previous year: 140
Number identified on test as a “positive” and referred for further testing or assessment: 110
Number of persons identified after further testing or assessment as meeting criteria for alcohol abuse or dependence or binge drinking within past year: 77
Number of “false positives” who were referred for further testing or assessment but were not identified with dependence/abuse/bingeing within past year: 33
Number of “false negatives” who were not referred for further testing or assessment but did report dependence/abuse/bingeing within past year: 63

C. SCREENING FOR PROBLEMS WITH ILLICIT DRUGS

Options are very limited if a brief screen is wanted. There are two versions of a “drug-CAGE”: one combines drugs and alcohol, the other adapts the alcohol CAGE for drugs. Neither has been shown particularly sensitive to light use or substance “abuse” but both have been shown to work fairly well for drug dependence.27 We chose to test the version that asks only about drugs, not alcohol. The items in the screen are shown below.

DRUG CAGE ITEMS

In the last 12 months have you felt you should cut down on your drug use?

In the last 12 months have people annoyed you by criticizing your drug use?

In the last 12 months have you felt bad or guilty about your drug use?

Sometimes people feel bad when a drug wears off. Did that ever happen to you in the past year? If so, Did you ever take another drug when that happened?

Drug Dependence or Abuse as the Gold Standard

Effects of race and age. Age was not statistically significant in a logistic regression model containing the four drug CAGE questions, race and age. However, African-Americans were significantly less likely to be classed as having a drug abuse/dependence diagnosis, and Hispanics were close to significantly less likely. Since the number of cases was insufficient to test the screens separately for African-Americans and/or Hispanics, both age and race were left in the model.

Choosing between sensitivity, specificity and efficiency. Since only one screening instrument is being tested, the calibrated measures used in earlier analyses are not necessary.

Inconsistent items. In the drug CAGE, only two of the items (cut down and annoy) were statistically significant. This means that the probability associated with each question is neither the same nor “nested.” Practically, the test could be reduced to two items.

Therefore we also estimated a model that included only the two statistically significant items (and race and age). As can be seen from the table below, the two item screen performs almost exactly the way the four item screen does. In addition it is possible to set a higher cut-off (to reduce false positives) of a “yes” on both questions. Based on this population (and realizing the bias of administration in the research setting), the two item test (“cut down” and “annoy”) performs very well.

Based on this population and method of administration, the drug CAGE performs very well, identifying almost all the women with drug abuse or dependence. Since the prevalence is low, the false positives (39 out of every 100 identified as positive) will still be a relatively small number. For example, since the test identifies about 10 percent as positive, this means 40 out of 1000 persons tested will be assessed negative for drug dependence or abuse (though they might be positive for less severe drug use patterns). Using the two item test but referring for assessment only if both are positive would reduce the number of negative assessments to 22 of 1000 but it would also reduce the number identified from 59 to 45.

27 Midanik, L., Zahnd, E., & Klein, D. (1998). Alcohol and drug CAGE screeners for pregnant, low-income women: the California Perinatal Needs Assessment. Alcohol Clin Exp Res, 22(1), 121-125.
Brown, R., & Rounds, L. (1995). Conjoint screening questionnaires for alcohol and other drug abuse: criterion validity in a primary care practice. Wis Med J, 94(3), 135-140.

Table 12: Drug CAGE (drug dependence or abuse) controlling for age and race/ethnicity

Items and cut-off | Prevalence | Percent Positive on Test | Sensitivity | Specificity | Positive Predictive Value | Negative Predictive Value | Receiver Operating Curve
All four items. Cut-off yes on any 1 | 6% | 10% | 98% | 96% | 61% | 100% | .98
“Cut down” and “annoy” only. Cut-off of either | 6% | 10% | 98% | 96% | 60% | 100% | .98
“Cut down” and “annoy” only. Cut-off of both | 6% | 6% | 76% | 99% | 78% | 98% | .98

Drug Dependence or Abuse OR Other Illicit Drug Use as the Gold Standard

A broader population to test would be anyone who was diagnosed with drug abuse or dependence or who admitted to having used any illicit drug (or prescription drug used illicitly) at least five times in the prior year. These rates were considerably higher, especially in Stanislaus, and varied significantly by race/ethnicity (see table below). They ranged from 4 percent of Hispanics in Kern to 38 percent of African Americans in Stanislaus (although the sample size was small in the latter group).

Again, because the screens are likely to be used in counties with similarly varying prevalence figures (which will be unknown to those administering the screens) we have chosen to model the sample as a whole, which results in prevalence figures in between those of the two counties. For example, the overall prevalence of drug use/abuse/dependence in Kern is 9 percent; in Stanislaus it is 29 percent; and in the combined sample it is 19 percent.


Table 13: Illicit drug use, research findings by county and race

Race/ethnicity | Kern N | Kern (Percent) | Stanislaus N | Stanislaus (Percent)
White | 97 | (14) | 168 | (33)
Hispanic | 157 | (4) | 122 | (23)
Black | 74 | (9) | 34 | (38)
Other Race | 19 | (10) | 32 | (22)
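The county-level and combined prevalence figures quoted earlier can be approximated from Table 13 as sample-size-weighted averages of the subgroup percentages. The Python sketch below assumes that N in the table is the number of respondents in each group and that the figure in parentheses is the percent of that group meeting the broader standard. Because the table percentages are rounded, the results only approximate the reported 9, 29 and 19 percent.

```python
# (N, percent) pairs copied from Table 13.
TABLE13 = {
    "Kern": {"White": (97, 14), "Hispanic": (157, 4),
             "Black": (74, 9), "Other Race": (19, 10)},
    "Stanislaus": {"White": (168, 33), "Hispanic": (122, 23),
                   "Black": (34, 38), "Other Race": (32, 22)},
}


def pooled_percent(groups):
    """Sample-size-weighted average of subgroup percentages."""
    total_n = sum(n for n, _ in groups)
    cases = sum(n * pct / 100 for n, pct in groups)
    return 100 * cases / total_n


kern = pooled_percent(TABLE13["Kern"].values())
stanislaus = pooled_percent(TABLE13["Stanislaus"].values())
combined = pooled_percent(
    [group for county in TABLE13.values() for group in county.values()]
)
print(f"Kern {kern:.1f}%, Stanislaus {stanislaus:.1f}%, "
      f"combined {combined:.1f}%")
```

The combined figure sits between the two county figures because the two samples are of similar size, which is the point made in the text about modeling the sample as a whole.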

Inconsistent items. The model that included use of illicit drugs is significant for all four of the drug CAGE items; however, the “eye opener” and “guilty” items contributed little to the prediction and resulted in “non-nested” items as before, complicating scoring. A model with just “cut down” and “annoy” in it had virtually identical predictive power (100 percent of those endorsing the “annoy” item were positive for the “gold standard”).

Table 14: Drug CAGE (drug dependence or abuse or illicit use) controlling for age and race/ethnicity

                                                                  Percent                             Positive    Negative    Receiver
                                                                  Positive                            Predictive  Predictive  Operating
                                                      Prevalence  on Test   Sensitivity  Specificity  Value       Value       Curve
All four items. Cut-off any 1                         19%         10%       50%          100%         97%         89%         .80
“Cut down” and “annoy” only. Cut-off if either one    19%         10%       49%          100%         97%         89%         .80
“Cut down” and “annoy” only. Cut-off if both          19%         5%        24%          100%         100%        84%         .79

Effects of adding drug use. Both the full four-item and reduced two-item model had low sensitivity, identifying only half of those who had used illicit drugs. (Requiring as a cut-off a “yes” on both items of the reduced model had very low sensitivity and is not recommended.) However, since the percentage using illicit drugs is more than double that of those abusing or dependent on them, even with low sensitivity about a third more women were identified using this criterion than abuse/dependence alone.

RECOMMENDATIONS FOR ILLICIT DRUG SCREENING INSTRUMENT AND CUTPOINTS

Considerations in choice of gold standard and instrument

In screening for illicit drugs we need to consider the gold standard and instrument in conjunction with each other. For drug dependence and abuse, an instrument composed of only two of the four items on the Drug-CAGE, the instrument we tested, performed as well as all four items together. This was not the case, however, when we added to the gold standard any use at least five times in the past year of an illicit drug (or misused prescription drug). So if the gold standard is to be dependence/abuse alone, we suggest using only the two highly predictive items, since that minimizes the number of drug-related questions being asked and may thus reduce defensiveness. But if the gold standard also includes other drug use, then all four items would need to be included.

We believe using a diagnosis of dependence or abuse is most appropriate in the context of CalWORKs since either diagnosis indicates both negative social consequences due to drug use and a degree of subjective discomfort with drug-related behaviors. It seems unlikely that recipients who are not experiencing negative consequences of drug use will be receptive to referral for drug abuse services. This issue is explored at greater length in the CalWORKs Project report on prevalence.28 However, it is reasonable to refer for further testing or assessment those who do endorse either of the items, so if illicit drug users who do not meet criteria for abuse or dependence do say “yes” to either item they would be referred regardless of the gold standard.

Considerations in choice of cut-point

We tested using a “yes” on one or on both of the “cut-down” and “annoy” items, and found a considerable loss of sensitivity if two are used. Thus we would recommend that a “yes” on either question lead to either further testing (with the SASSI, for example) or to a referral for assessment by an AOD professional.

The table below shows the outcomes we would predict based on the Kern and Stanislaus research results if the two questions were included in a screening instrument filled out by 1000 CalWORKs recipients.

Because the sensitivity and specificity were so high in this test (98 and 96 percent respectively), it seems likely that the results were influenced by the administration of the screening items after the full CIDI. Nonetheless, this would only eliminate lack of precision due to dishonesty. The items themselves are highly predictive.

28 Chandler & Meisel. Op cit. Page 36.


Table 15: Predicted outcomes of screening for illicit drug issues using the recommended instrument, gold standard and cut-point

“Yes” on either the “cut-down” or “annoy” items of the drug-CAGE for drug dependence or abuse

Number screened: 1000

Number with a research-based CIDI diagnosis of drug abuse or dependence during the previous year: 60

Number identified on test as a “positive” and referred for further testing or assessment: 100

Number of persons identified after further testing or assessment as meeting criteria for drug abuse or dependence within past year: 58 (see footnote 29)

Number of “false positives” who were referred for further testing or assessment but were not identified with drug dependence/abuse within past year: 40

Number of “false negatives” who were not referred for further testing or assessment but did report drug dependence/abuse within past year: 1
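The arithmetic behind a predicted-outcomes table of this kind can be sketched as follows. The function below is a minimal illustration, not the project's own calculation: it takes the rounded prevalence (6 percent), sensitivity (98 percent) and specificity (96 percent) quoted in the text and projects them onto 1000 people screened. The function and key names are hypothetical. Because the inputs are rounded, the counts come out close to, but not identical with, the published figures, which footnote 29 indicates were rounded from the study's own results.

```python
def predicted_outcomes(n_screened, prevalence, sensitivity, specificity):
    """Expected screening counts for a group of n_screened people."""
    with_condition = n_screened * prevalence
    without_condition = n_screened - with_condition
    true_positives = with_condition * sensitivity
    false_negatives = with_condition - true_positives
    false_positives = without_condition * (1 - specificity)
    return {
        "with_condition": with_condition,
        "referred": true_positives + false_positives,
        "true_positives": true_positives,
        "false_positives": false_positives,
        "false_negatives": false_negatives,
    }


# Rounded figures quoted in the text for the two-item screen, "yes" on either.
outcomes = predicted_outcomes(
    1000, prevalence=0.06, sensitivity=0.98, specificity=0.96
)
for name, count in outcomes.items():
    print(f"{name}: {count:.1f}")
```

The same function can be reused with a different prevalence to see how the number of referrals and false positives would shift in a county where drug abuse or dependence is more or less common.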

D. SCREENING FOR DOMESTIC VIOLENCE

Choice Of Screening Instruments To Be Tested

A literature search revealed three short screening instruments for domestic violence, none of which had been tested on a welfare or welfare reform population. They were the HITS, a five-item test developed for use in family practice settings; the Woman Abuse Screening Tool, also for use in family practice; and a three-item Partner Violence Screen (PVS) for emergency department use.30 Only HITS and the PVS had been tested in an optimal way, with screening test and gold standard test administered to the same persons consecutively.31

29 Rounding from 58.3; the number of false negatives is rounded from 1.4.

30 Sherin, K. M., Sinacore, J. M., Li, X. Q., Zitter, R. E., & Shakil, A. (1998). HITS: a short domestic violence screening tool for use in a family practice setting. Fam Med, 30(7), 508-512; Brown, J. B., Lent, B., Brett, P. J., Sas, G., & Pederson, L. L. (1996). Development of the Woman Abuse Screening Tool for use in family practice. Fam Med, 28(6), 422-8; Feldhaus, K. M., Koziol-McLain, J., Amsbury, H. L., Norton, I. M., Lowenstein, S. R., & Abbott, J. T. (1997). Accuracy of 3 Brief Screening Questions for Detecting Partner Violence in the Emergency Department. JAMA, 277(17), 1357-1361.

31 An alternate but inferior method is to define two groups in a separate process and then see if the screening test distinguishes them. For example, in testing the Woman Abuse Screening Tool a group of known domestic violence victims was compared with a group of professional acquaintances of the authors.


We eliminated the HITS because it had a direct approach (“How often does your partner physically hurt you, etc.”) that we thought inappropriate for screening in welfare offices.

The Woman Abuse Screening Tool (WAST) “long” form included seven items. The “short form,” which we used, comprised two questions: “In general, how would you describe your relationship…a lot of tension, some tension, no tension?” and “Do you and your partner work out arguments with great difficulty, some difficulty, no difficulty?” However, we added a third question from the longer version because of our concern with emotional abuse: “Has your partner ever abused you emotionally…often, sometimes, never?” In developing this instrument Brown et al. had paid considerable attention to which items caused discomfort or embarrassment in the family practice setting. Brown et al. concluded that scoring worked best when answering either “often” or “sometimes” or both was considered to be a positive response. We also tested this “combined positive” against using only the “often” category and agreed that predictive qualities of the model were better with the combined positive. Because of the way the questions were phrased, they were only tested on women with a current partner.

The PVS consisted of one direct question on physical violence: “Have you been hit, kicked, punched or otherwise hurt by someone within the past year? If so, by whom: current partner, past partner, someone else?” Two other questions focused on perceived safety: “Do you feel safe in your current relationship?” and “Is there a partner from a previous relationship who is making you feel unsafe now?” The part of the first question asking who caused the injury is not a part of the screen per se.

Choice Of The “Gold Standards” For Domestic Violence

Unlike AOD and mental health, in which DSM-IV diagnoses serve as a widely recognized gold standard, domestic violence lacks clear-cut agreement as to what behaviors by an intimate partner would constitute a “case” of domestic violence. The instrument most often used, the Conflict Tactics Scale (CTS), is limited in the range of behaviors it measures.32 We have, however, used many of the items in the CTS as they permit comparability. We have adopted measures of emotional abuse and controlling behaviors from a 1993 national survey in Canada and the 1995 National Institute of Justice survey in the United States.33 We restricted our definition, as well, to acts committed by “a current or past partner.” Incidents were recorded separately for the previous year and any time in the past. Prevalence rates for all of these measures are reported in our November 2000 report The Prevalence of Mental Health, Alcohol and Other Drug, & Domestic Violence Issues Among CalWORKs Participants in Kern and Stanislaus Counties.

From the standpoint of a welfare worker, the most useful measure would be one of domestic abuse which is occurring or being threatened at the time of the interview with the CalWORKs recipient. Although we did, in fact, ask such a question at the end of the domestic violence section of the interview (where it would be likely to be answered truthfully), the question was unfortunately phrased, asking if the woman felt unsafe at the time she was enrolled in the CalWORKs welfare-to-work program. Although the first interview took place five months after the deadline for welfare-to-work enrollment, in fact in Kern County many clients had not yet had an interview regarding the new requirements of CalWORKs. Thus the question was ambiguous regarding time for a sizeable proportion of the respondents.34

32 Straus, M. A., & Gelles, R. J. (1990). Physical Violence in American Families. New Brunswick: Transaction Publishers. Also see: Morse, B. J. (1995). Beyond the Conflict Tactics Scale: assessing gender differences in partner violence. Violence Vict, 10(4), 251-272.

33 Johnson, H., & Sacco, V. F. (1995). Researching violence against women: Statistics Canada's national survey. Canadian Journal of Criminology, 37(3), 281-304; Tjaden, P., & Thoennes, N. (1998). Prevalence, Incidence, and Consequences of Violence Against Women: Findings From the National Violence Against Women Survey (http://www.ncjrs.org/txtfiles/172837.txt): National Institute of Justice, Violence Against Women Office.

Lacking a valid measure tied to a CalWORKs enrollment visit we tested the screening instruments on two progressively more limited standards. The broadest recorded a “yes” or “no” for any type of abuse during the prior 12 months. “Any Abuse” included physical abuse, stalking, threats, forced sex, verbal abuse, controlling behavior or any of three types of abusive control. A second standard was for “Physical Abuse.”35

Partner-Status and Age

Partners. The WAST only asks questions about a “current partner.” The Partner Violence Screen asked about both current and past partners.

As it turned out, the assumption that the person making a woman feel unsafe is a current partner is not at all justified by fact. Feldhaus states:

“One important observation is that questions about partner violence should not be limited to women who are married or who have current partners, nor should questions about partner abuse be restricted to current partners. First, we observed that women may have multiple, changing relationships, and some are unsure how to define them (current or past). In addition the highest prevalence rate of partner violence…was noted in women with only a previous relationship and no current partner.”

In the Kern/Stanislaus population, we found that of 136 women who reported physical abuse or forced sex in the prior 12 months, 45 percent involved a current partner, 52 percent a past partner, and 3 percent both a current and past partner.

However, we found (as did Feldhaus in describing the PVS) some ambiguity about partners. One of the first questions we asked in the interview had to do with whether the woman currently had a partner—a husband or other man she lived with or someone else she was romantically involved with. All of these women are classified as the head of the household for welfare purposes (none were in the two-parent category). Based on the initial questions only 305 women had a partner, yet 315 answered the screening questions for those having a partner. Similarly, of 33 women who said their current partner’s abusive behavior had not stopped at the time of the interview, 4 had initially reported not having a partner. The analysis that follows, however, limited the questions that assume a partner to the women indicating explicitly they had a current partner.

34 In addition, 49 of the women in Kern were not required to participate in welfare to work.

35 See The Prevalence of Mental Health, Alcohol and Other Drug, & Domestic Violence Issues Among CalWORKs Participants in Kern and Stanislaus Counties for the specific items. Women reporting any physical abuse in the last year reported a mean of four behaviors out of seven possible.


The practical implications of partner ambiguity are that a) a screen should include both past and current intimate partners and b) the screen should flexibly allow women to categorize abusive partners. So, for example, women should be permitted to answer questions about both current and past partners, not skipped on the basis of a relationship screening question.

Age. Age was a highly significant predictor, particularly for Any Abuse. In the next section we present results for statistical models controlling for age. However, in Appendix II we have included models for those 30 and under and those over 30 for women with and without partners. To see the sometimes very substantial effects of age and partner-status please consult the tables in the Appendix.

FINDINGS FOR DOMESTIC VIOLENCE

Usefulness of the WAST items. The WAST items could only be tested on women with a current partner. Using the responses of any woman who said she had a current partner, we entered five questions into logistic regression models. The questions used were the three questions from the WAST (amount of tension, amount of argument, emotional abuse), and two from the PVS: “Do you feel safe in your current relationship?” and “Have you been hit, kicked, punched, or otherwise hurt by someone within the past year?” Age (over 30), education, and race were also included in the model.

The “indirect” items (lot of tension in relationship, lot of arguments) were not useful predictors, dropping out of the logistic regression equation as not close to statistically significant—even for the Any Abuse standard. So while indirect—and therefore non-threatening—they were not predictive.

For the variables left in the model (emotional abuse, current partner unsafe, hit/injured), the logistic regression analyses showed that the likelihood of a “positive” on the gold standard increased monotonically, so that the more items that are selected, the greater the chance of a correct identification.

Possibly because the emotional abuse item was only asked of women with a current partner (and it proved quite predictive), the screens performed somewhat better for women with a current partner than those without. It may, therefore, be worthwhile adding the emotional abuse item to the three questions of the PVS. If so, we suggest changing the form of this question so that it reflects current and past partners, as in “Has a current or past partner abused you emotionally in the past 12 months?”

Because the emotional abuse item had a different N (since it was asked only of women with a partner), we could not include it in the model with the three PVS questions. However, we calculated its predictive power as a separate item among the 315 women with partners. A total of 83, or 26 percent, answered “yes” to the question of whether their partner had ever abused them emotionally. Of these 83 persons, 56 (67 percent) also reported “any abuse” and 29 (35 percent) reported physical abuse within the previous year.


Given the poor results for the indirect WAST items and the limited sample size for emotional abuse, we limited our test of screening properties to the PVS.36

Performance of the PVS. Because all three of the PVS items have to do with safety we feel a positive answer to any one should result in a referral for assessment. Therefore we only show one cut-point for each of the gold standards.

Table 16: Any “yes” answer to PVS items, Any Abuse and Physical Abuse as the gold standards, controlling for age and partner status when required37

                                                      Percent                             Positive    Negative    Receiver
                                                      Positive                            Predictive  Predictive  Operating
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       Curve
Any Abuse as Gold Standard                42%         41%       65%          76%          66%         75%         .74
Physical Abuse as Gold Standard           21%         28%       78%          85%          58%         93%         .85

RECOMMENDATIONS FOR DOMESTIC VIOLENCE SCREENING INSTRUMENT AND CUTPOINTS

Considerations in choice of instrument

Because of the low predictive power of the indirect questions in the WAST about arguments and tension, we suggest the use of the Partner Violence Screen (PVS). However, the three PVS questions (whether a current or past partner makes the woman feel unsafe, and whether she has been hurt or injured in the past year) all have to do with physical safety, while much that is repressive about domestic violence is emotional. We therefore suggest adding from the WAST the question on emotional abuse within the last year. Unfortunately, we did not test the emotional abuse question on women who did not have a partner, but its good performance among women who do have a partner supports the decision to include it for all women. Emotional abuse is also much more predictive for older women. The set of questions we propose for identifying the need for further tests or assessment of domestic violence follows. It consists of the PVS supplemented with the optional emotional abuse question.

36 The age and partner specific tables in Appendix II also include the emotional abuse question for women with partners.

37 Neither partner status nor age contributed to the predictive power of the model for “physical abuse” but they were highly predictive for “any abuse.”

1. Do you feel safe in your current relationship?

A. I AM NOT IN A RELATIONSHIP RIGHT NOW

B. NO

C. YES

2. Is there a partner from a previous relationship who is making you feel unsafe right now?

A. YES

B. NO

3. Has a partner (current or past) abused you emotionally within the past year?

A. YES

B. NO

4. Have you been hit, kicked, punched, or otherwise hurt by someone within the past year?

A. NO

B. YES

If “YES,” who did that to you?

A. CURRENT PARTNER

B. PREVIOUS PARTNER

C. SOMEONE ELSE
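The way the manual treats answers to these four questions (any answer indicating the woman is not safe leads to referral, while a “yes” on the emotional abuse question alone may lead to a more comprehensive screen) can be sketched in code. This is a minimal illustration under stated assumptions: the class, field names and result labels are hypothetical, and the follow-up question about who caused the injury is left out because it is not part of the screen itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreenAnswers:
    """Answers to the four proposed questions (field names are hypothetical)."""

    # Question 1: None means "I am not in a relationship right now".
    feels_safe_in_current_relationship: Optional[bool]
    # Question 2: a previous partner is making her feel unsafe now.
    past_partner_makes_unsafe: bool
    # Question 3: emotionally abused by a current or past partner in the past year.
    emotionally_abused_past_year: bool
    # Question 4: hit, kicked, punched or otherwise hurt in the past year.
    hurt_by_someone_past_year: bool


def screen_result(answers: ScreenAnswers) -> str:
    """Apply the 'any safety answer counts as positive' cut-point."""
    safety_positive = (
        answers.feels_safe_in_current_relationship is False
        or answers.past_partner_makes_unsafe
        or answers.hurt_by_someone_past_year
    )
    if safety_positive:
        return "refer for assessment"
    if answers.emotionally_abused_past_year:
        return "offer more comprehensive screen"
    return "no referral"


example = ScreenAnswers(
    feels_safe_in_current_relationship=None,
    past_partner_makes_unsafe=True,
    emotionally_abused_past_year=False,
    hurt_by_someone_past_year=False,
)
print(screen_result(example))
```

Treating “not in a relationship” as a separate value rather than as “no” keeps women without a current partner in the screen, so that their answers about past partners are still scored.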


Considerations in choice of gold standard

Use of the standard of “any abuse” in the prior 12 months appears too broad. In some age and partner categories more than 50 percent of the women would qualify and 42 percent of the population was identified by the PVS (without the emotional abuse question). Choosing “physical abuse” as a gold standard reduces the women in the population who have this issue to 21 percent overall. Physical abuse, with the issues of safety it presents, is also much more clearly related to difficulty meeting CalWORKs work activity requirements. Finally, the three questions of the PVS are all oriented toward safety. This is reflected in the higher sensitivity of the test for physical abuse than “any” abuse.

Considerations in choice of cut-point

Our recommendation is to use a “yes” to any question on safety (hit/injured, current partner unsafe, past partner unsafe) as the cut-off. [A “yes” on the emotional abuse question without other affirmative answers could lead to asking the CalWORKs recipient to fill out a more comprehensive screening instrument for domestic violence.38] Any “yes” on a question that implies that the woman is not safe must be followed by both a referral to a domestic violence professional for assessment and immediate further questions regarding whether it is safe for the woman to return to her partner or home.

Table 17: Predicted outcomes of screening for physical abuse for women using the PVS and any “yes” as a cutoff

PVS test for physical abuse: current or past partner make you feel unsafe; hit or otherwise injured in past year

Number screened: 1000

Number of persons identified as having been physically abused during the previous year: 210

Number identified on test as a “positive” and referred for further testing or assessment: 270

Number of persons identified after further testing or assessment as meeting criteria for physical abuse within past year: 162

Number of “false positives” who were referred for further testing or assessment but were not identified with physical abuse within past year: 108

Number of “false negatives” who were not referred for further testing or assessment but did report physical abuse within past year: 48

38 Overall, 23 percent of the women reported physical abuse within the past year. For women without partners we were unable to test the emotional abuse question. For women with partners, the most sensitive cut point was to use a “yes” on emotional abuse. However, inclusion of emotional abuse resulted in referring 43 percent of the population for further testing or an assessment, much higher than the maximum of 25 percent we adopted earlier as what is likely to be feasible.


APPENDIX I: CALCULATION OF SPECIFICITY, SENSITIVITY AND POSITIVE PREDICTIVE VALUE

The Relationship of Screening to True Cases of AOD/MH/DV

                              True Classification
Screening prediction          Have AOD/MH/DV    No AOD/MH/DV    ALL CASES
Have AOD/MH/DV                A                 B               A + B
Do not have AOD/MH/DV         C                 D               C + D
ALL CASES                     A + C             B + D           A + B + C + D

True positive cases = A

True negative cases = D

False positive cases = B

False negative cases = C

Sensitivity = A/(A+C)

Specificity = D/(B+D)

Positive predictive value = A/(A+B)

Negative predictive value = D/(C+D)

Prevalence = (A+C)/(A+B+C+D)
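The definitions above translate directly into a few lines of code. The sketch below is illustrative only: the function name is hypothetical, positive predictive value is computed as A/(A+B), and the example uses the counts from Table 17 (162 true positives, 108 false positives, 48 false negatives per 1000 women screened), with the true negatives derived as the remainder, which is an assumption of this example.

```python
def screening_measures(a, b, c, d):
    """Compute the Appendix measures from the four cells of the 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    total = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive_predictive_value": a / (a + b),
        "negative_predictive_value": d / (c + d),
        "prevalence": (a + c) / total,
    }


# Counts per 1000 women screened, taken from Table 17.
# Assumption: D (true negatives) is whatever remains of the 1000.
a, b, c = 162, 108, 48
d = 1000 - (a + b + c)

measures = screening_measures(a, b, c, d)
for name, value in measures.items():
    print(f"{name}: {value:.0%}")
```

Run on the Table 17 counts, the measures come out close to the percentages reported for physical abuse in Table 16, with small differences attributable to rounding in the published tables.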


APPENDIX II: AGE AND PARTNER SPECIFIC TABLES FOR DOMESTIC VIOLENCE

ANY ABUSE AS THE GOLD STANDARD

Women with Partner: Results for Any Abuse Using Any “Yes” as Cut-Off (Emotional Abuse, Current Partner Unsafe39, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Any Abuse: Have Partner, Age 18-30        60%         42%       59%          82%          83%         58%         .74
Any Abuse: Have Partner, Age 31-54        41%         45%       70%          73%          64%         78%         .77

Women with Partner: Results for Any Abuse Using “Yes” on HIT as Cut-Off (Emotional Abuse, Current Partner Unsafe40, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Any Abuse: Have Partner, Age 18-30        59%         26%       42%          97%          95%         53%         .74
Any Abuse: Have Partner, Age 31-54        40%         19%       43%          96%          88%         91%         .77

39 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.

40 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.


Women with No Partner: Results for Any Abuse Using Any “Yes” as Cut-Off (Past Partner Unsafe41, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Any Abuse: No Partner, Age 18-30          52%         31%       49%          89%          83%         61%         .70
Any Abuse: No Partner, Age 31-54          32%         22%       45%          90%          67%         78%         .69

Women with No Partner: Results for Any Abuse Using “Yes” on HIT as Cut-Off (Past Partner Unsafe42, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Any Abuse: No Partner, Age 18-30          53%         25%       43%          95%          91%         60%         .70
Any Abuse: No Partner, Age 31-54          32%         14%       36%          96%          80%         77%         .69

41 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.

42 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.


Physical Abuse as the Gold Standard

Women with Partner: Results for Physical Abuse Using Any “Yes” as Cut-Off (Emotional Abuse, Current Partner Unsafe43, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Physical Abuse: Have Partner, Age 18-30   26%         42%       86%          73%          52%         94%         .89
Physical Abuse: Have Partner, Age 31-54   20%         45%       96%          68%          43%         99%         .93

Women with Partner: Results for Physical Abuse Using “Yes” on HIT as Cut-Off (Emotional Abuse, Current Partner Unsafe44, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Physical Abuse: Have Partner, Age 18-30   26%         26%       84%          94%          83%         94%         .89
Physical Abuse: Have Partner, Age 31-54   20%         20%       77%          94%          77%         94%         .93

43 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.

44 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.


Women with No Partner: Results for Physical Abuse Using Any “Yes” as Cut-Off (Past Partner Unsafe45, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Physical Abuse: No Partner, Age 18-30     27%         30%       79%          86%          68%         92%         .86
Physical Abuse: No Partner, Age 31-54     15%         22%       71%          87%          49%         94%         .82

Women with No Partner: Results for Physical Abuse Using “Yes” on HIT as Cut-Off (Past Partner Unsafe46, and Hit/Injured)

                                                      Percent                             Positive    Negative
                                                      Positive                            Predictive  Predictive
                                          Prevalence  on Test   Sensitivity  Specificity  Value       Value       ROC
Physical Abuse: No Partner, Age 18-30     26%         25%       78%          94%          82%         92%         .86
Physical Abuse: No Partner, Age 31-54     15%         14%       65%          95%          68%         94%         .82

45 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.

46 For those having a current partner, only “current partner unsafe” could be tested rather than both “current” and “past” partners. For those having no partner, only “past” partners were tested.


Summary of Calibrated Sensitivity, Specificity and Efficiency

Columns: Most Sensitive, Most Specific, Most Efficient

HAVE PARTNER ANY ABUSE
Age 18-30 “Any Yes” X X
Age 18-30 “HIT” X
Age 31-54 “Any Yes” X X
Age 31-54 “HIT” X

NO PARTNER ANY ABUSE
Age 18-30 “Any Yes” X X
Age 18-30 “HIT” X X
Age 31-54 “Any Yes” X X
Age 31-54 “HIT” X

HAVE PARTNER PHYSICAL ABUSE
Age 18-30 “Any Yes” X
Age 18-30 “HIT” X X X
Age 31-54 “Any Yes” X
Age 31-54 “HIT” X X

NO PARTNER PHYSICAL ABUSE
Age 18-30 “Any Yes” X
Age 18-30 “HIT” X X X
Age 31-54 “Any Yes” X
Age 31-54 “HIT” X X