
Validity of the SDS-17

Measure of Social Desirability

in the American Context

Brian F. Blake, Ph.D.

Cleveland State University

Jillian Valdiserri, M.A.

Marketing Strategies Inc.

Kimberly A. Neuendorf, Ph.D.

Cleveland State University

Jacqueline Nemeth, M.A.

American Greetings

“How To” Series

November 2005

Brian F. Blake, Ph.D., Senior Editor

Allyson Graf, Co-Editor

Rhiannon Hamilton, Co-Editor

Entire Series available: http://academic.csuohio.edu:8080/cbrsch/home.html


RESEARCH REPORTS IN CONSUMER BEHAVIOR

These analyses address issues of concern to marketing and advertising professionals and to academic researchers investigating consumer behavior. The reports present original research and cutting edge analyses conducted by faculty and graduate students in the Consumer-Industrial Research Program at Cleveland State University. Subscribers to the series include those in advertising agencies, market research organizations, product manufacturing firms, health care institutions, financial institutions and other professional settings, as well as in university marketing and consumer psychology programs. To ensure quality and focus of the reports, only a handful of studies will be published each year.

“Professional” Series - Brief, bottom line oriented reports for those in marketing and advertising positions. Included are both B2B and B2C issues.

“How To” Series - For marketers who deal with research vendors, as well as for professionals in research positions. Data collection and analysis procedures.

“Behavioral Science” Series - Testing concepts of consumer behavior. Academically oriented.


AVAILABLE PUBLICATIONS:

Professional Series

• Lyttle, B. & Weizenecker, M. Focus groups: A basic introduction, February, 2005.

• Arab, F., Blake, B.F., & Neuendorf, K.A. Attracting Internet shoppers in the Iranian market, February, 2003.

• Liu, C., Blake, B.F., & Neuendorf, K.A. Internet shopping in Taiwan and U.S., February, 2003.

• Jurik, R., Blake, B.F., & Neuendorf, K.A. Attracting Internet shoppers in the Austrian market, January, 2003.

• Blake, B.F., & Smith, L. Marketers, Get More Actionable Results for Your Research Dollar!, October, 2002.

How To Series

• Blake, B.F., Valdiserri, J., Neuendorf, K.A., & Nemeth, J. Validity of the SDS-17 measure of social desirability in the American context, November, 2005.

• Blake, B.F., Dostal, J., & Neuendorf, K.A. Identifying constellations of website features: Documentation of a proposed methodology, February, 2005.

• Saaka, A., Sidon, C., & Blake, B.F. Laddering: A “How to do it” manual – with a note of caution, February, 2004.

• Blake, B.F., Schulze, S., & Hughes, J.M. Perceptual mapping by multidimensional scaling: A step by step primer, July, 2003.

Behavioral Science Series

• Shamatta, C., Blake, B.F., Neuendorf, K.A., Dostal, J., & Guo, F. Comparing website attribute preferences across nationalities: The case of China, Poland, and the USA, October, 2005.

• Blake, B.F., Dostal, J., & Neuendorf, K.A. Website feature preference constellations: Conceptualization and measurement, February, 2005.

• Blake, B.F., Dostal, J., Neuendorf, K.A., Salamon, C., & Cambria, N.A. Attribute preference nets: An approach to specifying desired characteristics of an innovation, February, 2005.

• Blake, B.F., Neuendorf, K.A., Valdiserri, C.M., & Valdiserri, J. The Online Shopping Profile in the cross-national context: The roles of innovativeness and perceived newness, February, 2005.

• Blake, B.F., & Neuendorf, K.A. Cross-national differences in website appeal: A framework for assessment, July, 2003.

• Blake, B.F., Neuendorf, K.A., & Valdiserri, C.M. Appealing to those most likely to shop new websites, June, 2003.

• Blake, B.F., Neuendorf, K.A., & Valdiserri, C.M. Innovativeness and variety of information shopping, April, 2003.


RESEARCH REPORTS IN CONSUMER BEHAVIOR

EDITORIAL DIRECTOR: DR. BRIAN BLAKE

Dr. Brian Blake has a wide variety of academic and professional experiences. His early career... academically, rising from Assistant Professor to tenured Professor at Purdue University, his extensive published research spanned the realms of psychology (especially consumer, social, and cross-cultural), marketing, regional science, sociology, community development, applied economics, and even forestry. Professionally, he was a consultant to the U.S. State Department and to the USDA, as well as to private firms. Later on...on the professional front, he co-founded a marketing research firm, Tactical Decisions Group, and turned it into a million-dollar organization. After he merged it with another firm to form Triad Research Group, the combined firm was one of the largest market research organizations based in Ohio. His clients ranged from large national firms (e.g., Merck and Co., Dupont, Land o’ Lakes) to locally based organizations (e.g., MetroHealth System, American Greetings, Progressive Insurance, Liggett Stachower Advertising). On the academic side, he moved to Cleveland State University and co-founded the Consumer-Industrial Research Program (CIRP). Some of Cleveland’s best and brightest young marketing research professionals are CIRP graduates. In the last few years...academically, he is actively focusing upon establishing CIRP as a center for cutting edge consumer research. Professionally, he is a market research consultant for a variety of clients.

CO EDITOR (2005-2007): ALLYSON GRAF

A first year student of the Consumer Industrial Research Program at Cleveland State University, Allyson is happily pursuing her goal to become a skilled researcher in the consumer/marketing field. She graduated summa cum laude from Hiram College in 2004 with a B.A. in psychology. Allyson’s curiosity about consumer behavior was piqued at Hiram while conducting research for her senior seminar project concerning consumer perception of advertisements. After obtaining her M.A., Allyson plans to continue her education by earning her Ph.D. and working at the college or university level, conducting research and teaching.

CO EDITOR (2006-2007): RHIANNON HAMILTON

Rhiannon Hamilton was a regular contributor to the University of Cincinnati’s student newspaper, as well as to a local independent newspaper during her undergraduate studies in Psychology. After graduating with high honors, she pursued her passion for research working for several years on various projects for local hospitals and the Federal Government. Rhiannon’s interest specifically for Consumer Psychology developed during the year she spent conducting field research for R.J. Reynolds. She is currently in her first year of CIRP studies, at the conclusion of which she plans to establish her career.


Validity of the SDS-17 Measure of Social Desirability in the American Context

Summary

In Germany, Stober and colleagues presented evidence for the validity of the

SDS-17, a new measure of social desirability bias (1999; 2001). In the current

investigation, three experiments (n=800) assessed the SDS-17’s validity in the US

environment. In all conditions SDS-17 scores correlated highly with Marlowe-Crowne

scores. In Study 1, a group administration of a paper and pencil booklet, SDS-17 scores

of 327 college students were higher under Fake Good than Standard conditions, and both

were higher than scores in the Honest condition. Study 2, an online survey of a

demographically diverse adult sample (n=257), showed that the increase in SDS-17

scores under Fake Good conditions occurs also in a Web survey and that SDS-17 scores

were unrelated to one’s demographic profile. Study 3, a group administration to 216

college students, revealed again that scores under Fake Good were higher than those

under Standard administration and that SDS-17 scores correlated more highly with the

Impression Management than with the Self Deception subscales of the BIDR. The SDS-

17 appeared valid for the US environment as a measure of socially desirable responding.

The evidence, however, encourages its further assessment as an index of social

desirability bias per se.


Behavioral scientists (e.g., Crowne & Marlowe, 1960; Edwards, 1957) have long

been concerned with social desirability bias (SDB), defined as distorting one’s self-

presentation to make a favorable impression upon others. It continues to be seen in

various quarters, for example, consumer psychology (e.g., Fisher, 2000), as a source of

contamination that should be controlled in self-report measures. It has been well

established (e.g., Barger, 2002; Ellingson, Sackett, & Hough, 1999; Helmes & Holden,

2003), though, that SDB may be but a component of the more general multidimensional

phenomenon of socially desirable responding (SDR), defined as presenting oneself as

having characteristics appreciated by others. Beyond reflecting just SDB, SDR may

represent the internalization of cultural values (Fisher & Katz, 2000), the expression of

ongoing personality traits like emotional stability and conscientiousness (Ones,

Viswesvaran, & Reiss, 1996), or an overly favorable but honest self-evaluation devised to

maintain self-esteem (Paulhus, 1984), among other factors.

Attempts to unravel the role of SDB face the problem that there is no

consensually agreed-upon measure of SDB. One of the closest constructs is the

Impression Management dimension of Paulhus’s Balanced Inventory of Desirable

Responding (BIDR; Paulhus, 1984; 1994; 2002); this refers to the enhanced positivity of

one’s self presentation resulting from situational demands or temporary inclinations to

describe oneself to others in favorable terms. The conceptual basis for this measure,

though, has been questioned (e.g., Helmes & Holden, 2003; Kroner & Weekes, 1996;

Pauls & Crost, 2004). A new measure of SDB with demonstrated construct validity

could be helpful in clarifying the nature and impacts of SDB.


In Germany, Stober (1999; 2001) proposed a new measure of social desirability

bias. It was intended to overcome particular limitations of the most used index of SDR

(e.g., Beretvas, Meyers & Leite, 2002), the Marlowe-Crowne Scale (Crowne & Marlowe,

1960) and its shorter subscales (e.g., Barger, 2002). Initial validation studies (Stober,

1999; 2001) and applications (Stober & Wolfradt, 2001) were strongly supportive.

However, as is true for psychological measures in general (e.g., Van de Vijver, 2003), the

validity of SDR measures may not be directly generalizable across cultures (e.g., Johnson

& Van de Vijver, 2003). The purpose of this series of studies was to assess the reliability

and validity of the SDS-17 in the United States context.

SDS-17 Development

As conceived by Stober (1999; 2001), “social desirability” is a readiness to give

biased, distorted self descriptions that portray oneself in a manner that can make a

favorable impression on others. The SDS-17 is composed of 16 true-false items, e.g., “I

will never live off other people,” “I sometimes litter.” It is a balanced index in that one’s

score is increased by a true response on nine items, and by a false response on seven

items. Its statements were intended to have contemporary referents and phrasing. Stober

observed that various items on the original and shortened forms of the Marlowe-Crowne

(MC) are dated (e.g., “My table manners at home are as good as when I eat out in a

restaurant;” “I never make a long trip without checking the safety of my car”). Indeed,

similar observations have been made for quite some time (e.g., Ballard, Crino, &

Rubenfeld, 1988). Further, MC statements often incorporate culture bound referents

(e.g., driving cars, voting for political candidates), as do other SDR scales such as the

BIDR of Paulhus (1984; 1994; 2002). The SDS-17 contains few obvious references to


cultural institutions and organizations. Moreover, since the length of the MC (33 items)

can be troublesome (e.g., Fisher, 2000), the brevity of the SDS-17 is advantageous.
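The balanced keying described above can be illustrated with a short scoring sketch. The item positions and the keying below are placeholders chosen for illustration, not the published SDS-17 key; only the split of nine true-keyed and seven false-keyed items comes from the text.

```python
# Hypothetical keying: which items earn a point for a "true" answer is an
# assumption made for illustration; the actual key is given by Stober (2001).
TRUE_KEYED = frozenset(range(0, 9))    # nine items scored for a "true" answer
FALSE_KEYED = frozenset(range(9, 16))  # seven items scored for a "false" answer


def score_balanced_scale(responses):
    """Count answers given in the socially desirable direction.

    responses: sequence of 16 booleans (True = "true", False = "false").
    """
    if len(responses) != len(TRUE_KEYED) + len(FALSE_KEYED):
        raise ValueError("expected 16 true-false responses")
    score = 0
    for index, answer in enumerate(responses):
        if index in TRUE_KEYED and answer:
            score += 1          # desirable answer on a true-keyed item
        elif index in FALSE_KEYED and not answer:
            score += 1          # desirable answer on a false-keyed item
    return score
```

Under this keying a respondent who answers "true" to everything scores 9 rather than 16, which is the purpose of balancing: agreeing with every statement cannot by itself produce the maximum score.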

Convergent validity was suggested (Stober, 1999; 2001) by the scale’s correlation

with the MC (.74, p<.01, in the 1999 and .68, p<.01, in the 2001 study), the revised Lie

Scale of the Eysenck Personality Questionnaire (r=.60, p<.01; Eysenck & Eysenck,

1991), and the Sets of Four Scale (r=.52, p<.01; Borkenau & Ostendorf, 1992).

Discriminant validity was indicated by nonsignificant correlations with neuroticism,

extraversion, psychoticism, and with openness to experience as gauged by the revised

Eysenck Personality Questionnaire and the NEO Five Factor Inventory (Costa &

McCrae, 1993), respectively. Discriminant validity was lower than would be desirable

for an index intended to measure SDB, though, as seen in significant positive correlations

with agreeableness and conscientiousness measured by the Five Factor Inventory (Costa

& McCrae, 1993).

Befitting an index intended to assess readiness to provide distorted responses,

Stober (2001) demonstrated that SDS-17 scores were lower when respondents were given

“standard” questionnaire instructions (i.e., told to complete the questionnaire truthfully)

than when given instructions to produce responses presenting oneself favorably (i.e., told

to imagine that the way they responded was important to their application for a good job,

typically called a “fake good” condition). Further, comparison was made with the two

dimensions of the BIDR (version 6, Paulhus, 1994; German version, Musch, 1999). The

Impression Management (IM) subscale of the BIDR was developed to reflect conscious

dissimulation of responses with the aim of making a favorable impression on others,

while the Self Deceptive Enhancement (SDE) subscale referred to responding geared to


seeing oneself in a favorable light and so to protect positive self esteem (Paulhus, 1984).

Stober (2001) found that the SDS-17 correlated higher with the IM subscale (r=.37,

p<.01) than with the SDE subscale (r=.21, p<.05), suggesting that the SDS-17

represented distorted self presentation more than a chronic tendency to protect one’s self

esteem via thinking of oneself as having appealing characteristics.

Unanswered Questions

National Setting. Is the SDS-17 scale appropriate for use in the United States?

Responses that convey a favorable impression in one society do not necessarily do so in

another (e.g., Middleton & Jones, 2000; Smith, Smith, & Seymour, 1993), and validation

may not transfer directly across nations (cf. Van de Vijver, 2003). Hence, the SDS-17

must be empirically validated for use in the US and other nations.

Motivation to Distort. In the initial validation research (Stober 1999; 2001), subject

responses in the assumed “fake good” (job application) condition were compared to those

in the “standard” (respond frankly) condition. Two issues can be raised about this

procedure. First, it is possible that one may “fake good” differently in a job interview

than in other settings (e.g., applying for fraternity membership or meeting a blind date).

A more conservative approach, then, would be to instruct subjects to make themselves

appear good or appealing in the eyes of others without specifying a particular context.

Second, under the standard conditions it is quite possible that subjects are still motivated

to distort. A more convincing approach would be to establish, in addition to the standard

setting, an “honest” condition in which subjects are strongly encouraged to respond

truthfully.


Administration Vehicle. Given the increasing popularity of internet surveys in the US, is

the SDS-17 valid not only in the traditional group administration via paper and pencil

booklets but also in the online environment? Differences in SDR can occur between

online and paper and pencil (e.g., Pettit, 2002), and between responding alone or in a

group (e.g., Richman, Kiesler, Weisband, & Drasgow, 1999). Although such studies

have not yielded entirely consistent results (Hancock & Flowers, 2001), a conservative

approach would be to validate the SDS-17 separately for group paper and pencil and for

online administration.

Demographic Differences. In the US, do SDS-17 scores vary with one’s demographic

characteristics? Stober (2001) found in Germany that scores increased with age, but were

unrelated to gender. In the US are there differences associated not only with age and

gender but also with other key variables, notably education, employment status, income,

or marital status?

Overview of Studies

Study 1, based on group administration of a print questionnaire to a university

student sample, asked whether SDS-17 scores are higher when individuals attempt to

answer in a socially desirable manner (Fake Good condition) than when responding in a

typical test administration (Standard condition) and whether both are higher than when

respondents are asked to respond with frank and open answers (Honest condition). Study

2 asked whether the increase in SDS-17 scores under Fake Good conditions occurs

among post-college-age adults taking part in an online survey. Both Studies 1 and 2

assessed the SDS-17’s convergent validity by asking whether it correlates with the MC

scale. In Study 3, university students again responded to a print questionnaire in a group


administration. Included in the booklet were the IM and SDE subscales of the Paulhus

BIDR (1994) as well as the MC scale. Again, responses were obtained under two

conditions, Fake Good and Standard.

Predictions

Validity of the SDS-17 as an index of SDR in general and SDB in particular

would be indicated if, first, in all three studies the SDS-17 scores were higher under Fake

Good than under Standard conditions and if, second, in Study 1 scores in both the Fake

Good and Standard conditions were higher than those in the Honest condition. Third,

validity as an index of SDR (but not necessarily of SDB) would be suggested if SDS-17

scores correlated positively with Marlowe-Crowne scores in all conditions in all three

studies, although the correlations under Fake Good conditions may be less reflective of

the scale’s construct validity than would be the correlations in the Standard or Honest

conditions (cf. Pauls & Crost, 2004). Fourth, a stronger correlation of the SDS-17 with

the Impression Management than with the Self Deceptive Enhancement scales would

support the Stober conceptualization of the SDS-17 as an index of SDB. Fifth, while

there was no compelling basis for offering a prediction, it was asked whether SDS-17

scores varied systematically with one’s demographic characteristics.

Selection of Validation Measures

The MC scale, 33 true-false items balanced for positive and negative wording,

was used to establish convergent validity of the SDS-17 as a measure of the SDR because

the MC is the most widely employed indicator of SDR (Beretvas et al., 2002). Due to its

multidimensionality (e.g., Ellingson et al., 1999; Helmes & Holden, 2003), among other

factors (Barger, 2002), the MC cannot be construed as a measure purely of SDB.


Despite challenges to the BIDR’s validity as an index of SDB (e.g., Helmes &

Holden, 2003), the version 6 IM and SDE subscales were used because: a) the IM

dimension represents the construct underlying the SDS-17, as demonstrated by Stober’s

(2001) use of it to assess the construct validity of the SDS-17; b) the IM-SDE distinction

is used in numerous settings (cf. Ellingson et al., 1999); c) a body of research does

support its assumptions (e.g., Hancock & Flowers, 2001; Musch, Brockhaus, & Broeder,

2002); and d) as noted earlier, there is a dearth of measures of the SDB construct that

have been widely used and successfully validated as an index uniquely measuring SDB.

Both the 10 item IM subscale (e.g., “I sometimes tell lies if I have to”) and the 10 item

SDE subscale (e.g., “I have always been honest with myself”) have a true-false response

format and are balanced for positively and negatively worded statements.

Method

Study 1

Subjects. Undergraduate students (n=327) enrolled in psychology and communication

courses at a larger Midwestern urban university (n=252) and at a smaller residential

college (n=75) were offered course credit for participation in a survey of their shopping

behavior and of their orientation toward a variety of issues.

Conditions. Subjects were randomly assigned to one of three conditions and were given

a questionnaire booklet. In the Standard condition (conducted in the manner suggested

by Dillman (2000) for group administration of a print questionnaire), subjects were told

that their answers would be confidential and they were asked to respond truthfully. In the

Honest condition, it was explained that the experiment was designed to evaluate how the


questionnaire was susceptible to distortion by persons attempting to make themselves

appear attractive in the eyes of others. It was noted that they had been randomly assigned

to the Honest condition and that other persons were being asked to “fake good.”

Investigators exhorted the subjects to answer as frankly as possible and assured them

their answers would be anonymous and confidential. In the Fake Good condition,

subjects were also informed that the goal of the study was to test the responsivity of the

questionnaire to social desirability bias. They were told that they were not to answer in

terms of their actual opinions but to answer in a manner which would make them appear

to be a good person with admirable qualities, without specifying which qualities they

should emphasize. Subjects were instructed to answer the demographic questions

truthfully.

Questionnaire. The booklet included the 16 SDS-17 and the 33 MC items, along with the

six demographic characteristics described in Table 1. Other items, irrelevant to this

study, pertained to shopping for products and services.

Study 2

Respondents. A convenience sample of 257 friends, relatives, coworkers, and others

known to graduate students on the research team were asked to participate. An attempt

was made to secure as demographically and geographically diverse a sample as possible

as long as respondents were over 21 years of age. When recruited, participants were told

simply that the survey involved measuring their orientations and consumer behavior and

that their answers would be confidential. Participants were directed to the University’s

website containing the questionnaire.


Conditions. When accessing the website, respondents were assigned randomly to the

Standard or the Fake Good conditions and given the same instructions and explanations

as were used in Study 1.

Questionnaire. The instrument contained the same measures as did Study 1, but

formatted for online presentation.

Study 3

The 216 subjects were recruited from psychology and communication undergraduate

classes at a large Midwestern urban university and assigned to either the Standard or the

Fake Good condition in the same manner as in Study 1. This questionnaire also

contained the 20 BIDR items.

Subject Characteristics

Of 814 participants, 800 provided usable data (See Table 1). Study 1 was composed of

young, mainly single students, a substantial number of whom were employed part-time.

There was considerable spread in their own or family income. Study 2 respondents were

predominantly well-educated and fairly affluent. The student sample in Study 3 was

again generally young, single, and employed at least part time. Both Study 2 and Study 3

respondents were predominantly female. Thus, the two student samples were fairly

typical of many U.S. student populations used in psychological research and the online

sample, while upscale, included more demographic diversity.


Table 1

Subject Characteristics

Characteristic                      Study 1    Study 2    Study 3

Sample Size
  Total                             327        257        216
  Fake Good                         115        121        105
  Standard                          118        136        111
  Honest                            94         ---        ---

Age (mean years)                    21.57      36.45      22.95

Gender (% male)                     49.7%      36.2%      33.0%

Education
  High School or less               ---        6.7%       ---
  Post High School                  93.4%      23.3%      90.3%
  College Graduate                  6.4%       70.0%      9.7%

Employment
  % Full-time                       17.1%      35.8%      12.5%
  % Part-time                       45.6%      13.2%      48.6%
  Other*                            37.3%      51.0%      38.9%

Income
  0 - $20,000                       27.2%      12.7%      36.9%
  $20,001 - $30,000                 22.0%      8.6%       10.7%
  $30,001 - $50,000                 5.8%       23.0%      17.0%
  $50,001 or more                   38.5%      55.8%      35.4%

Marital Status
  Single                            89.9%      39.6%      86.6%
  Married                           5.5%       52.9%      12.0%
  Divorced, Widowed, Separated      4.0%       7.5%       1.4%

* Includes unemployed, homemakers, retired, etc.


Results

Preliminary Analyses

Demographic Differences Between Conditions. Differences between conditions in

respondent profiles on the seven demographic characteristics were assessed separately for

each study. Only three of the 35 comparisons were statistically significant and only one

(income in Study 2) was of any magnitude. In light of the lack of relationship in Study 2

between respondent demographic characteristics and SDS-17 scores as discussed later,

these differences were felt to be inconsequential.

Homogeneity of the Validation Scales.

Table 2 presents the KR-20 coefficients for the MC and the BIDR scales in each

condition. The two scales appeared to be highly reliable in each condition in each study

and so could serve as evidence of convergent validity for the SDS-17 measure.

Table 2

Homogeneity (KR-20) Coefficients of Social Desirability Measures

              Study 1           Study 2           Study 3
Condition     MC     SDS-17     MC     SDS-17     MC     BIDR   IM     SDE    SDS-17

Fake Good     .94    .92        .87    .89        .91    .89    .86    .80    .80

Standard      .76    .70        .71    .70        .77    .82    .76    .71    .64

Honest        .77    .76        ---    ---        ---    ---    ---    ---    ---
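For readers who want to see how a KR-20 coefficient of the kind reported in Table 2 is obtained from item-level data, the following is a minimal sketch. It assumes the usual textbook form of the formula with the population variance of total scores; the report does not state which variance convention was used, and the toy data are invented.

```python
from statistics import pvariance


def kr20(item_matrix):
    """KR-20 for dichotomous items.

    item_matrix: list of rows, one per respondent, each a list of 0/1 item
    scores already keyed so that 1 is the scored direction.
    """
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    if n_people < 2 or n_items < 2:
        raise ValueError("need at least two respondents and two items")

    totals = [sum(row) for row in item_matrix]
    total_variance = pvariance(totals)  # population variance (an assumption)
    if total_variance == 0:
        raise ValueError("total scores do not vary")

    # Sum of p*q across items, where p is the proportion scoring 1 on the item.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people
        sum_pq += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Invented toy data: four respondents, three items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```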

Establishment of Conditions. Tests of differences among the conditions in scores on the

MC Scale revealed the following: In Study 1, an ANOVA indicated that the conditions

differed significantly (F(2,313)=11.67, p<0.001). The Tukey HSD showed that MC scores


in the Fake Good condition (M=24.49, n=113) were higher than in the Standard condition

(M=16.34, n=116, p<0.001) and the Honest condition (M=13.99, n=87, p<0.001), while

scores for the Standard were higher than those for the Honest group (p<0.01). In Study 2,

the MC scores were higher under Fake Good (M=25.13, n=114) than under Standard

conditions (M=16.92, n=123, t(235)=-12.72, p<0.001). In Study 3, the Levene test

revealed that the variances were not homogeneous across the groups (F=7.75, p<0.01).

The t-test (assuming heterogeneous variances) again found differences in mean MC

scores between the Fake Good (M=30.84, n=100) and the Standard groups (M=15.64,

n=106, t(201.18)=-23.05, p<0.001). It can be concluded that the manipulation of the Fake

Good, Standard, and Honest Conditions was successful in all three studies, as verified via

the MC, independently of the SDS-17 scores.

SDS-17 Scale Homogeneity

As shown in Table 2, the SDS-17 scale was highly consistent internally when

individuals were attempting to fake good. Under more typical survey conditions the

measure had satisfactory reliability in three of the four conditions. Overall, the measure

appears to have satisfactory to good internal consistency.

Demographic Differences

Under “normal” conditions do scores on the SDS-17 scale vary with one’s

demographic profile? Using data from the Standard condition of Study 2, the

demographic factors were converted to dummy variables and entered simultaneously as

predictors in an ordinary least squares multiple regression against SDS-17 scores as the

criterion. No significant differences between demographic groups emerged in their

SDS-17 scores (R=0.32, R²=0.10, adjusted R²=-0.03; F(16,106)=0.76, p=0.73).
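The regression statistics above are internally consistent, which can be checked with a short sketch. It assumes 16 dummy predictors and n = 123 cases, both inferred here from the reported degrees of freedom (16 and 106) rather than stated in the text; the function name is illustrative.

```python
def regression_summary(multiple_r, n_cases, n_predictors):
    """Derive R-squared, adjusted R-squared, and the overall F ratio from R."""
    r_squared = multiple_r ** 2
    df_model = n_predictors
    df_error = n_cases - n_predictors - 1
    # Adjusted R-squared penalizes R-squared for the number of predictors.
    adjusted = 1 - (1 - r_squared) * (n_cases - 1) / df_error
    # Overall F: explained variance per predictor over error variance per df.
    f_ratio = (r_squared / df_model) / ((1 - r_squared) / df_error)
    return r_squared, adjusted, f_ratio, (df_model, df_error)


# n = 123 is inferred from df_error = 106 with 16 predictors (an assumption).
r2, adj_r2, f_ratio, dfs = regression_summary(0.32, 123, 16)
print(round(r2, 2), round(adj_r2, 2), round(f_ratio, 2), dfs)
```

A negative adjusted R² simply indicates that the predictors explain less variance than would be expected by chance given their number.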


Response to Conditions

The means and standard deviations of SDS-17 scores in each condition in each

study are shown in Table 3.

Table 3

Means and Standard Deviations of SDS-17 Scores in Each Condition

Condition     Study 1          Study 2          Study 3

Fake Good     13.57 (3.81)     13.67 (3.44)     14.88 (2.00)

Standard      8.38 (3.18)      7.99 (3.11)      7.50 (2.94)

Honest        7.20 (3.38)      ---              ---

In Study 1, an ANOVA showed that the differences among the conditions were

significant (F(2,319)=102.56, p<0.001). The Tukey HSD test observed a difference in the

Fake Good vs. the Standard (p<0.001) and vs. the Honest (p<0.001), and in the Standard

vs. the Honest (p<0.05). Cohen’s d showed the Standard-Fake Good difference (d=1.49)

and the Honest-Fake Good difference (d=1.79) to be quite strong. The Standard-Honest

difference (d=0.36) was more modest, but meaningful. In Study 2, the difference was

again significant in the predicted direction (t(245)=-13.66, p<0.001) and, given d=1.74,

quite powerful. Finally, in Study 3, the Levene test (F=28.74, p<0.001) showed that the

within group variances were different. The t-test with heterogeneous variances was yet

again significant (t(195.00)=-21.69, p<0.001) and the difference substantial (d=2.99). Thus,

in all three studies, individuals’ SDS-17 scores varied with conditions in the manner

expected.
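The effect sizes and the unequal-variance t-test reported above can be approximated from the summary statistics in Table 3. The sketch below assumes the pooled-standard-deviation form of Cohen's d and the Welch-Satterthwaite degrees of freedom, and it borrows the group sizes from Table 1; the sizes actually analyzed may have been slightly smaller because of missing data, so the results should land near, not exactly on, the reported values.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d using the pooled standard deviation of the two groups."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t statistic and degrees of freedom without assuming equal variances."""
    se_a = sd_a ** 2 / n_a   # squared standard error of mean a
    se_b = sd_b ** 2 / n_b   # squared standard error of mean b
    t_value = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_value, df


# Study 3, Fake Good vs. Standard; group sizes taken from Table 1 (assumption).
d = cohens_d(14.88, 2.00, 105, 7.50, 2.94, 111)
t_value, df = welch_t(14.88, 2.00, 105, 7.50, 2.94, 111)
print(d, t_value, df)
```

The sign of t depends only on which group is entered first; the report subtracts in the opposite order and so shows negative values.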


MC and BIDR Scale Correlations. Assessments of convergent validity were quite

positive. In each condition in each study the Pearson product moment correlation

between the SDS-17 scale and the MC was strong and positive (see Table 4). Table 4 also

shows that the correlation between the SDS-17 index and the overall BIDR is significant

in each condition in Study 3. Additionally, the SDS-17 measure correlated more highly

with the IM than with the SDE subscale both in the Fake Good (Hotelling-Williams

t=-2.57, p<.05) and in the Standard (Hotelling-Williams t=-2.39, p<.05) conditions.

Table 4

Pearson Product Moment Correlations of the SDS-17 Scale with the MC and BIDR Scales

              Study 1     Study 2     Study 3
Condition     MC          MC          MC        BIDR      IM        SDE

Fake Good     .80**       .91**       .84**     .37**     .49**     .24*

Standard      .72**       .78**       .74**     .38**     .50**     .19

Honest        .70**       ---         ---       ---       ---       ---

* - p<.05, ** - p<.001
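The comparison of the IM and SDE correlations involves two dependent correlations that share one variable (the SDS-17). A minimal sketch of Williams's t for that situation follows, using the formula commonly given for it; whether this is the exact computation the authors ran is an assumption. The test also requires the correlation between the IM and SDE subscales, which the report does not give, so the value 0.30 and the sample size of 106 used below are purely hypothetical.

```python
import math


def williams_t(r_xy, r_xz, r_yz, n):
    """Williams's t for H0: corr(x, y) == corr(x, z) in a single sample.

    r_xy, r_xz: the two correlations being compared (x is the shared variable).
    r_yz: correlation between the two non-shared variables.
    Returns (t, df) with df = n - 3.
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    # Determinant of the 3x3 correlation matrix.
    det = 1 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2 + 2 * r_xy * r_xz * r_yz
    r_bar = (r_xy + r_xz) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar ** 2 * (1 - r_yz) ** 3
    t_value = (r_xy - r_xz) * math.sqrt((n - 1) * (1 + r_yz) / denom)
    return t_value, n - 3


# Standard condition, Study 3: r(SDS-17, IM) = .50 and r(SDS-17, SDE) = .19
# come from Table 4; r(IM, SDE) = .30 and n = 106 are hypothetical values.
t_value, df = williams_t(0.50, 0.19, 0.30, 106)
print(t_value, df)
```

Because the IM-SDE correlation is invented here, the resulting t is illustrative only and is not expected to match the reported statistic.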

Discussion

The three studies strongly support the viability of the SDS-17 as a measure of

SDR. First, internal consistency was acceptable in six of the seven assessments; the

exception (alpha=.63) might have reflected context effects in the questionnaire’s item

sequence. Second, in all three experiments SDS-17 scores consistently increased with

situational demands to display socially desirable responses.


Third, convergent validity was established via substantial correlations with the Marlowe-

Crowne and the BIDR scales, well established SDR indicators. Fourth, internal

consistency, known group and convergent validity were shown both for the online survey

situation and for paper and pencil group administration. Fifth, the absence of a

significant relationship with the six demographic characteristics in Study 2 renders the

SDS-17 easier to use in practice, since a person’s demographic profile would not have to

be considered when interpreting SDS-17 scores.

Whether SDS-17 scores represent solely SDB, however, is a separate question.

That the scale does reflect SDB is supported by its higher correlation in Study 3 with the

IM than with the SDE subscale of the BIDR. Further, in the Standard condition, SDS-17

correlated significantly with the IM, but not with the SDE, subscale. The SDS-17 did

correlate low, but significantly, with the SDE scale in the Fake Good context. The latter

correlation perhaps may be artifactual, given that conceptually unrelated scales

(Ellingson et al., 1999) in general and social desirability measures (Pauls & Crost, 2004)

in particular may intercorrelate more strongly under Fake Good conditions. The

BIDR/SDS-17 correlations tend to support the SDS-17 as an index of SDB, at least under

typical administrative conditions. The caveat, however, arises from the lack of a

commonly agreed-upon standardized measure of SDB per se to serve as a basis for

convergent validity.
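
The comparison of SDS-17 correlations with the IM and SDE subscales can be illustrated with a short sketch. It assumes, as is conventional but not stated here, that the reported coefficients are Pearson product-moment correlations between scale totals. The function name and all scores below are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant variable has no defined correlation")

    return sxy / sqrt(sxx * syy)


# Hypothetical scale totals for six respondents in one condition.
sds17 = [4, 7, 9, 11, 12, 15]
im = [3, 6, 6, 9, 10, 13]
sde = [5, 4, 8, 6, 9, 7]

print("SDS-17 with IM: ", round(pearson_r(sds17, im), 2))
print("SDS-17 with SDE:", round(pearson_r(sds17, sde), 2))
```

Computing both coefficients within each instructional condition would reproduce the layout of the correlation table. A formal comparison of the two dependent correlations would need an additional test that is not sketched here.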

Given the encouraging results obtained here, additional research is warranted to

address issues beyond the limits of the present investigation. First, evidence for the

construct validity of the SDS-17 or of any measure proposed as an index of SD bias

(dissimulation, distortion) could profit from a rather infrequently used approach, the

direct measurement of the magnitude of each person’s dissimulation/distortion on an

item-by-item, scale-by-scale basis. For example, Ellingson et al. (1999) directly gauged

the degree of dissimulation on an item via a repeated measures design comparing a

person’s response under fake good and under honest instructional conditions. If such

analyses can be corrected for acquiescence (e.g., Van de Vijver, 2003), non-

comparability of positively and negatively worded items (Wong, Rindfleisch, &

Burroughs, 2003), sequence effects and other potential artifacts, they might help

differentiate between SDB and SDR (particularly, non-SDB components of SDR).

Second, norms based on a more nationally representative respondent sample would help

investigators assess the degree of SDB/SDR shown by an individual or by a respondent

sample as a whole (as embodied, for example, in the group mean).
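
The direct, item-by-item measurement of dissimulation described above can be sketched as a within-person difference between responses given under fake good and under honest instructions. This is only an illustration of the general idea, not the procedure of Ellingson et al. (1999). The function names, the 0/1 scoring, and the treatment of reverse-keyed items are assumptions.

```python
def item_dissimulation(honest, fake_good, reverse_keyed=frozenset()):
    """Per-item shift toward the socially desirable pole for one respondent.

    honest, fake_good: the same person's item scores under the two instructions.
    reverse_keyed: indices of items for which a LOWER score is more desirable.
    A positive value means the fake good answer was more desirable.
    """
    if len(honest) != len(fake_good):
        raise ValueError("both administrations must cover the same items")

    shifts = []
    for index, (h, f) in enumerate(zip(honest, fake_good)):
        shift = f - h
        if index in reverse_keyed:
            shift = -shift  # flip so that positive always means "more desirable"
        shifts.append(shift)
    return shifts


def scale_dissimulation(honest, fake_good, reverse_keyed=frozenset()):
    """Mean item-level shift, a simple scale-level dissimulation index."""
    shifts = item_dissimulation(honest, fake_good, reverse_keyed)
    return sum(shifts) / len(shifts)


# Hypothetical 1/0 answers from one respondent on four items; item 2 is reverse keyed.
honest_answers = [1, 0, 1, 0]
fake_good_answers = [1, 1, 0, 0]

print(item_dissimulation(honest_answers, fake_good_answers, {2}))   # [0, 1, 1, 0]
print(scale_dissimulation(honest_answers, fake_good_answers, {2}))  # 0.5
```

Keying the shift by item direction is one simple way to address the non-comparability of positively and negatively worded items. Corrections for acquiescence and sequence effects would need further steps.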

Cross-cultural research requires measures that are valid within each relevant

culture and, even better, that are equivalent across cultures (cf. Van de Vijver, 2003).

Some support for the SDS-17’s validity is now available in the US as well as in

Germany. If subsequent validation research is also supportive, the SDS-17 may be

valuable for cross-cultural investigations.

References

Ballard, B., Crino, M. D., & Rubenfeld, S. (1988). Social desirability bias and the

Marlowe-Crowne Social Desirability scale. Psychological Reports, 63, 227-237.

Barger, S. (2002). The Marlowe-Crowne affair: Short forms, psychometric structure and

social desirability. Journal of Personality Assessment, 79(2), 286-305.

Beretvas, S. N., Meyers, J. L., & Leite, W. L. (2002). A reliability generalization study

of the Marlowe-Crowne Social Desirability Scale. Educational & Psychological

Measurement, 62(4), 570-589.

Borkenau, P., & Ostendorf, F. (1992). Social desirability scales as moderator and

suppressor variables. European Journal of Personality, 6, 199-214.

Costa, P. T., & McCrae, R. R. (1993). Revised NEO Personality Inventory (NEO PI-R)

and NEO Five Factor Inventory: Professional Manual. Odessa, FL:

Psychological Assessment Resources.

Crowne, D. P., & Marlowe, D. (1960). A new scale of social desirability independent of

psychopathology. Journal of Consulting Psychology, 24, 349-354.

Dillman, D. (2000). Mail and Internet surveys: The tailored design method. New York:

John Wiley.

Edwards, A. L. (1957). The social desirability variable in personality assessment and

research. New York: Dryden.

Ellingson, J. E., Sackett, P. R., & Hough, L. M. (1999). Social desirability corrections in

personality measurement: Issues of applicant comparison and construct validity.

Journal of Applied Psychology, 84(2), 155-166.

Eysenck, H. J., & Eysenck, S. B. G. (1991). Manual of the Eysenck Personality Scales

(EPS Adult). London: Hodder and Stoughton.

Fisher, R. J. (2000). The future of social desirability research in marketing. Psychology

& Marketing, 17(2), 73-77.

Fisher, R. J., & Katz, J. E. (2000). Social desirability bias and the validity of

self-reported values. Psychology & Marketing, 17(2), 105-120.

Hancock, D. R., & Flowers, C. (2001). Comparing social desirability responding on

World Wide Web and paper-administered surveys. Educational Technology Research

& Development, 49(1), 5-13.

Helmes, E., & Holden, R. (2003). The construct of social desirability: One or two

dimensions? Personality and Individual Differences, 34, 1015-1023.

Holden, R. R., & Fekken, G. C. (1989). Three common social desirability scales:

Friends, acquaintances, or strangers. Journal of Research in Personality, 23, 180-

191.

Johnson, T. P., & Van de Vijver, F. J. R. (2003). Social desirability in cross-cultural

research. In J. A. Harkness, F. J. R. Van de Vijver, & P. P. Mohler (Eds.), Cross-

cultural survey methods (pp. 195-206). Hoboken, NJ: Wiley-Interscience.

Kroner, D. G., & Weekes, J. R. (1996). Balanced Inventory of Desirable Responding:

Factor structure, reliability, and validity with an offender sample. Personality and

Individual Differences, 21(3), 323-333.

Middleton, K. L., & Jones, J. L. (2000). Socially desirable response sets: The impact of

country culture. Psychology & Marketing, 17(2), 149-163.

Musch, J. (1999). German version of the Balanced Inventory of Desirable Responding

(BIDR, version 6). Unpublished manuscript, University of Bonn, Germany.

Musch, J., Brockhaus, R., & Broeder, A. (2002). An inventory for the assessment of two

factors of social desirability. Diagnostica, 48(3), 121-129.

Ones, D. S., Viswesvaran, C., & Reiss, A. D. (1996). Role of social desirability in

personality testing for personnel selection: The red herring. Journal of Applied

Psychology, 81, 660-679.

Paulhus, D. L. (1984). Two component models of socially desirable responding. Journal

of Personality and Social Psychology, 46, 598-609.

Paulhus, D. L. (1994). Balanced Inventory of Desirable Responding: Reference manual

for BIDR version 6. Unpublished manuscript, University of British Columbia,

Vancouver, Canada.

Paulhus, D. L. (2002). Socially desirable responding: The evolution of a construct. In

H. I. Braun & D. N. Jackson (Eds.), The role of constructs in psychological and

educational measurement (pp. 49-69). Mahwah, NJ: Lawrence Erlbaum

Associates, Publishers.

Pauls, C. A., & Crost, N. W. (2004). Effects of faking on self deception and impression

management scales. Personality & Individual Differences, 37, 1137-1151.

Pettit, F. A. (2002). A comparison of World Wide Web and paper-and-pencil personality

questionnaires. Behavior Research Methods, Instruments, & Computers, 34(1),

50-55.

Richman, W. L., Kiesler, S., Weisband, S., & Drasgow, F. (1999). A meta-analytic study

of social desirability distortion in computer-administered questionnaires,

traditional questionnaires, and interviews. Journal of Applied Psychology, 84(5),

754-775.

Smith, S., Smith, K., & Seymour, K. (1993). Social desirability of personality items as a

predictor of endorsement: A cross-cultural analysis. Journal of Social

Psychology, 133(1), 43-52.

Stober, J. (1999). The Social Desirability Scale-17 (SDS17): Development and first

results on reliability and validity. Diagnostica, 45, 173-177.

Stober, J. (2001). The Social Desirability Scale-17 (SDS17): Convergent validity,

discriminant validity, and relationship with age. European Journal of

Psychological Assessment, 17(3), 222-232.

Stober, J., & Wolfradt, U. (2001). Worry and social desirability: Opposite relationships

for socio-political and social-evaluation worries. Personality and Individual

Differences, 31(4), 605-613.

Van de Vijver, F. J. R. (2003). Bias and equivalence: Cross-cultural perspectives. In J.

A. Harkness, F. J. R. Van de Vijver, & P. P. Mohler (Eds.), Cross-cultural survey methods

(pp. 143-156). Hoboken, NJ: Wiley-Interscience.

Wong, N., Rindfleisch, A., & Burroughs, J. (2003). Do reverse-worded items confound

measures in cross-cultural consumer research? The case of the Material Values

Scale. Journal of Consumer Research, 30(1), 72-92.
