t02-measuring the involvement construct
8/10/2019 t02-Measuring the Involvement Construct
Measuring the Involvement Construct
A bipolar a
THE JOURNAL OF CONSUMER RESEARCH
been tested for internal reliability, stability, or validity.
Hence a standardized, general, valid, and multiple-item
measure of involvement should be useful.
BACKGROUND AND CRITERIA
FOR MEASURING INVOLVEMENT
A measure of involvement, independent of the behavior
that results from involvement, would allow the
researcher to use the same measure across various research
studies. This measure should also be sensitive to
the proposed areas that affect a person's involvement
level. These areas might be classified into three categories
(Bloch and Richins 1983; Houston and Rothschild
1978):
1. Personal: inherent interests, values, or needs that
motivate one toward the object
2. Physical: characteristics of the object that cause
differentiation and increase interest
3. Situational: something that temporarily increases
relevance or interest toward the object
In Houston and Rothschild's (1978) framework, different
situations and different people are two factors that
lead to various levels of involvement. Houston and
Rothschild integrate physical characteristics of the
product as part of the situational factor. Coinciding with
Bloch and Richins (1983), the present article separates
the physical from the situational and allows the same
physical object to be subjected to different levels of
involvement given different situations.
The evidence for the three factors (physical, personal,
or situational) that influence the consumer's
level of involvement or response to products, advertising,
and purchase decisions is found in the literature.
For example, Wright (1974) found that variation in the
type of media (print versus audio) influenced the response
given to the same message (physical). Lastovicka
and Gardner (1978a) demonstrated that the same product
has different involvement levels across people (personal),
and Clarke and Belk (1978) demonstrated that
different purchase situations for the same products
cause differences in search and evaluation or raise the
level of involvement (situational). Based on this prior
reasoning, a measure of involvement might be developed
that would pick up differences across people, objects,
and situations.
Different types of scales were pretested before selecting
a measurement approach that seemed to be generalizable
across all product categories. First, a series of
vignettes was developed to represent involvement. The
vignettes were similar to scenarios found in Lastovicka
and Gardner (1978b). Problems arose with developing
enough generalizable scenarios for a reliable scale. Likert
scale items posed a problem because items that
seemed to be appropriate for frequently purchased
goods did not seem to apply to durable goods and vice
versa.
The most effective and generalizable type of scale
appeared to be a semantic differential type (Osgood, Suci,
and Tannenbaum 1957). The Semantic Differential
consists of a series of bipolar items, each measured on
a seven-point rating scale. It is easy to administer and
score, takes only a few minutes to complete, and is
applicable to a wide array of objects. The descriptors or
phrases easily relate across product categories and can
be appropriate to other domains, such as purchase
decisions or advertisements. (However, the main focus of
this article and scale development is involvement with
products.) The steps taken to develop the measure were:
1. Define the construct to be measured.
2. Generate items that pertain to the construct.
3. Judge the content validity of generated items (item
reduction).
4. Determine the internal reliability of items judged to
have content validity (item reduction).
5. Determine the stability of internally reliable items
over time (item reduction).
6. Measure the content validity of the 20 selected items
as a whole.
7. Measure the criterion-related validity, which is the
ability of the scale to discriminate among different
products for the same people and different situations
for the same product and same people.
8. Test the construct validity or theoretical value of the
scale by gathering data and testing whether the scale
discriminates on self-reported behavior.
DEFINING THE CONSTRUCT
This article will adopt the general view of involvement
that focuses on personal relevance (Greenwald
and Leavitt 1984; Krugman 1967; Mitchell 19
Rothschild 1984). In the advertising domain, involvement
is manipulated by making the ad relevant: the
receiver is personally affected, and hence motivated to
respond to the ad (e.g., Petty and Cacioppo 1981). In
product class research, the concern is with the relevance
of the product to the needs and values of the consumer.
In purchase decision research, the concern is that the
decision is relevant, and hence that the consumer will
be motivated to make a careful purchase decision (e.g.,
Clarke and Belk 1978). Although each is a different
domain of research, in general, high involvement means
personal relevance (Greenwald and Leavitt 1984).
In this study, the definition of involvement used for
the purposes of scale development was:
A person's perceived relevance of the object based on
inherent needs, values, and interests.
This definition recognized past definitions of involvement
(e.g., Engel and Blackwell 1982; Krugman 19
nition may be applied to advertisements, products,
7) in advertising focused on persona connections.
efined involvement with advertising as
s perception of the relevancy of the ad con-
of purchase and involvement inter-
red to response involvement and defined it as a func-
enduring involvement or a need derived from a
e in the individual's hierarchy of needs.
ITEM GENERATION AND
CONTENT VALIDITY
A semantic differential scale was to be developed
e earlier definition of involvement. Thus, a
The first step was to judge the pro-
68
word pairs was tested in two phases: (1) initial
ion of poor word pairs, and (2)
final judging of the
Three expert judges (senior Ph.D. candidates in con-
ith "advertisement;" and third, replacing the
ve of involvement, (2) somewhat representative
nt. Word pairs that were not rated as representative
Word pairs that were dropped included traditional
tudes used in the psychology and mar-
of involvement. The judges decided
scale that represent the low end of involvement
were generally not negative, as they would be if measuring
attitudes, but rather were "who cares" descriptors,
e.g., unimportant, unexciting, doesn't matter,
of no concern.
Five new judges then rated the remaining 43 word
pairs using the same procedure. Only 23 items were
consistently rated as representing the involvement construct
(80 percent agreement over products, purchase
decisions, and advertisements for each word pair). This
meant that at least 12 of the possible 15 judgments for
each word pair (five judges over three objects) had to
be rated as representative of the involvement construct.
Agreement across judges and within each area for the
23 word pairs was as follows: advertisements, 84 percent;
products, 87 percent; and purchase decisions, 7
percent.
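The 80 percent retention rule described above reduces to a small piece of arithmetic. A minimal sketch in Python follows; the function names and the boolean encoding of a judgment are illustrative assumptions, not something specified in the article.

```python
import math


def min_agreeing(n_judges: int, n_objects: int, threshold: float) -> int:
    """Smallest count of 'representative' judgments that reaches the threshold."""
    total = n_judges * n_objects
    # round() strips floating-point noise before the ceiling is taken
    return math.ceil(round(threshold * total, 9))


def is_retained(judgments, threshold: float = 0.80) -> bool:
    """judgments: one boolean per judge-object pair (True = rated representative).

    The boolean encoding is a hypothetical choice made for this sketch.
    """
    needed = math.ceil(round(threshold * len(judgments), 9))
    return sum(judgments) >= needed


# Five judges rating a word pair against three objects gives 15 judgments.
print(min_agreeing(5, 3, 0.80))  # 12
```

A word pair with 12 or more of its 15 judgments marked representative passes `is_retained`; one with 11 does not.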
Twenty-three was assumed to be too low a number
of items with which to start data collection (French and
Michael 1966; Nunnally 1978). Thus, seven additional
items were added to the item pool to raise the initial
number to 30 (five of these seven were eventually
dropped). For example, trivial-grand (45 percent
agreement) was changed to trivial-fundamental, and
inspiring-discouraging (55 percent agreement) was
changed to inspiring-uninspiring and returned to the
list. Therefore, a thirty-item scale emerged from the
content validity phase that trained and knowledgeable
judges agreed measured involvement over three
domains: products, advertisements, and purchase
decisions. However, this study focused on, and further
validation procedures were carried out on, involvement
with products.
INTERNAL SCALE RELIABILITY
The next task was to administer the 30 items as a
scale over different product categories to measure the
internal consistency or inter-item correlation. Two
product classes, watches and athletic shoes, were
selected because they were thought to be used by the
subjects. One hundred and fifty-two undergraduate
psychology students completed the scale during class time.
Approximately half of the subjects filled out the scale
pertaining to athletic shoes and the other half filled out
the scale pertaining to watches. The results show that
for both product categories, 26 bipolar items had an
item-to-total score correlation of 0.50 or more, and a
Cronbach alpha level of 0.95.
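The two statistics reported here, Cronbach's alpha and the item-to-total correlation, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the author's procedure: the data layout (one row of item scores per subject), the function names, and the choice of an uncorrected item-to-total correlation (the item is included in the total) are all assumptions.

```python
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(rows):
    """rows: one list of k item scores per subject."""
    k = len(rows[0])
    item_variances = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variances / total_variance)


def item_total_correlations(rows):
    """Correlation of each item with the summed scale score (item included)."""
    totals = [sum(r) for r in rows]
    return [pearson(col, totals) for col in zip(*rows)]


def weak_items(rows, cutoff=0.50):
    """Indices of items whose item-to-total correlation falls below the cutoff."""
    return [i for i, r in enumerate(item_total_correlations(rows)) if r < cutoff]
```

With real data, `rows` would hold one list of 30 ratings per subject for a single product class, and `weak_items(rows)` would list the candidates for deletion at the 0.50 level.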
Six adjective pairs with relatively low item-to-total
correlations were dropped; interestingly, most of these
adjective pairs had been returned earlier to the item
pool. Factor analyses, using varimax rotation with
squared multiple correlations in the diagonals for factor
extraction, were carried out over both products to
check if the items selected for deletion loaded onto one
particular dimension or were amorphous across factors.
For both watches and athletic shoes, one factor explained
the major variation in the data, accounting for
70.3 percent and 69.3 percent of the common variance,
respectively (eigenvalues 13.3 and 13.2). Watches had
two more factors, accounting for 11.6 percent and 5.6
percent of the common variance (eigenvalues 2.2 and
1.1), and athletic shoes had three more factors, accounting
for 11.7 percent, 5.9 percent, and 5.7 percent
of the common variance (eigenvalues 2.2, 1.2, and 1.1).
The results of the factor analyses showed that the
items selected for deletion did not load together on any
unique factor across either product category. Since the
first factor accounts for approximately 70 percent of
the variance, and none of the remaining items had a
loading of zero or less on that first dimension, the scale
development continued on the assumption of a simple
linear combination of the individual items (Comrey
1973). The assumption is that no individual item is sufficient,
and that it is the scale taken as a whole that
tends to measure the involvement construct (Nunnally
1978).
TEST-RETEST RELIABILITY
Test-retest reliability of the remaining 24 items was
examined over two new subject samples and four new
product categories. Sixty-eight psychology students initially
rated calculators and mouthwash. Forty-five MBA
students rated breakfast cereals and red wine. The order
of the products was counterbalanced: half of the subjects
in each group rated one product category first, and
the other half rated the other product category first.
The scales were administered during class time and took
about five minutes to complete.
Three weeks later the scales were administered over
the same product categories to the same subjects. Thirteen
psychology students and 19 MBA subjects were
lost to attrition; thus, 55 psychology students and 26
MBA students were used to measure test-retest reliability.
The average Pearson correlation between Time
1 and Time 2 on the 24 items was 0.90. Individual item-to-item
correlations ranged from 0.31 to 0.93. Four additional
items with average test-retest correlations below
0.60 were deleted. The resulting twenty-item involvement
score test-retest correlations for each product were
as follows: calculators, r = 0.88; mouthwash, r = 0.89;
breakfast cereals, r = 0.88; and red wine, r = 0.93. These
product categories were also tested for internal scale
reliability. The Cronbach alpha ranged from 0.95 to
0.97 over the four products.
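The test-retest step can be sketched in the same way. The snippet below is a simplified illustration: it works on a single product, whereas the article averages item correlations over products, and the data layout and function names are assumptions made for the example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (undefined if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def total_score_retest(time1, time2):
    """Correlation between summed scale scores at Time 1 and Time 2.

    time1, time2: one list of item scores per subject, same subject order.
    """
    return pearson([sum(r) for r in time1], [sum(r) for r in time2])


def unstable_items(time1, time2, cutoff=0.60):
    """Indices of items whose Time 1 / Time 2 correlation is below the cutoff."""
    per_item = [pearson(a, b) for a, b in zip(zip(*time1), zip(*time2))]
    return [i for i, r in enumerate(per_item) if r < cutoff]
```

Applied per product and then averaged across products, the per-item correlations would give the quantity the 0.60 deletion rule refers to.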
Therefore, a twenty-item scale emerged from the internal
reliability and stability phases of scale development
for products. Twenty items allowed an adequate
sampling of the possible items that represent involvement
with products and yet was long enough to ensure
a high level of reliability.¹ On a practical level, the scale
fits neatly on one page and only takes a few moments
to complete. The scale was then counterbalanced so
that ten random items were reverse scored. Since each
bipolar item was rated on a seven-point scale, the total
possible score ranged from a low of 20 to a high of 140.
The scale was named the Personal Involvement Inventory
(PII) and is listed in Appendix A.
SECOND CONTENT VALIDITY
A second measure of content validity was obtained
from the open-ended responses of 45 MBA students
over three product categories: 35mm cameras, red wine,
and breakfast cereals. After completing the scales for
each product, subjects answered the following open-ended
question:
Now we would like you to state, in your own words, why
you rated each product category as you did.
Subjects were then divided into three groups (high,
medium, or low) for each product class according to
their scale scores.² Examples of the open-ended responses
appear in the Exhibit.
Two expert judges (senior Ph.D. candidates in consumer
behavior) blind to the scale scores evaluated the
total set of open-ended responses. For each product
category, the judges sorted the comments into three
groups indicative of low involvement, medium involvement,
and high involvement with the product category,
based on how well the responses represented involvement,
as defined earlier.
Interjudge reliability on the classification of the responses
was 80 percent agreement for 35mm cameras,
84 percent agreement for red wines, and 80 percent
agreement for breakfast cereals. Classifications on which
the two expert judges did not agree were then given to
¹Although the current analyses do not suggest what the reliability
is for subsets of the scale items, the case may be that a smaller number
of items would be almost as reliable as the 20 items. The problem of
reducing the scale to fewer items lies in deciding which items to select
as subsets, since individual items differed in their reliability across
product categories. A subset of items that may approach the reliability
of the 20 items for one product may not approach the same reliability
for another product. This variation is evident in that the test-retest
total score correlation ranged from 0.88 to 0.93 over products, and
test-retest for the 20 individual items ranged from 0.44 to 0.93 over
various products. The twenty-item measure should outperform any
subset of the scale; besides, decreasing the number of items would
not really make the scale any easier to administer, but may serve to
decrease the domain of items judged as being representative of involvement
and also lower the reliability of the scale. Researchers who
may use this scale are warned not to haphazardly reduce the number
of items.
²The classification of subjects into low, medium, and high scorers
was based on an overall distribution developed over 3 product categories
(Table 3) and several hundred subjects. All scores were tabulated
on the PII scale range presented in the Figure. Subjects whose
PII scores fell into the bottom 25 percent of the overall distribution
were classified as having low involvement with the product. Subjects
whose PII scores fell into the middle 50 percent of the distribution
were classified as having medium involvement, and subjects whose
PII scores were in the top 25 percent of the distribution were classified
as having high involvement with the product. For development of
this classification scheme see Appendix B.
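The scoring rule (20 seven-point items, ten of them reverse scored, totals from 20 to 140) and the low/medium/high split can be sketched as follows. This is an illustration under assumptions: which ten items are reversed is not given in this section, so `reversed_items` is a hypothetical parameter, and the default cutoffs of 70 and 110 are the ones quoted in the Exhibit, assumed here to stand in for the 25th and 75th percentile boundaries.

```python
N_ITEMS = 20
SCALE_MIN, SCALE_MAX = 1, 7


def pii_total(responses, reversed_items):
    """Sum 20 seven-point ratings, flipping the reverse-scored items.

    reversed_items: set of zero-based item positions to reverse
    (a hypothetical encoding; the actual positions come from the printed scale).
    """
    if len(responses) != N_ITEMS:
        raise ValueError("expected 20 item responses")
    total = 0
    for position, score in enumerate(responses):
        if not SCALE_MIN <= score <= SCALE_MAX:
            raise ValueError(f"item {position} is outside the 1-7 range")
        # A reversed item maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
        total += (SCALE_MIN + SCALE_MAX - score) if position in reversed_items else score
    return total


def classify(total, low_cut=70, high_cut=110):
    """Low below low_cut, high above high_cut, medium otherwise."""
    if total < low_cut:
        return "low"
    if total > high_cut:
        return "high"
    return "medium"
```

For a different reference sample, the two cutoffs would be replaced by that sample's own 25th and 75th percentile scores.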
EXHIBIT
OPEN-ENDED RESPONSES ON CONTENT VALIDITY
35mm Cameras
High involvement for cameras (score greater than 110).
a. Subject 1. Cameras are important, but not essential. They
provide a creative and historical outlet for me.
b. Subject 12. Cameras interest me and are an important hobby
to me.
Low involvement for cameras (scores less than 70).
a. Subject 17. Because I never use 35mm cameras and am not
extremely interested in them.
b. Subject 37. It's a nice product to have but not a high priority.
I have several but as I recall, none of the purchases was an
involved purchase.
Red Wine
High involvement for red wine (score greater than 110).
a. Subject 22. Red wine adds a lot to the appropriate meals.
b. Subject 6. I have always wanted to know more about wines
and enjoy it when people I know teach me about them.
Low involvement for red wine (score less than 70).
a. Subject 20. I'm not interested in wines nor do I particularly
appreciate the mystique that surrounds wines, in general.
b. Subject 36. OK for socials and getting drunk.
Breakfast cereals
High involvement for breakfast cereals (score greater than 110).
a. Subject 27. I eat cereal, healthy efficient 'wake up America.'
Cereal is good for you.
b. Subject 8. Because they are diet foods.
Low involvement for breakfast cereals (score less than 70).
a. Subject 3. I think breakfast cereals are a sham. I only eat
grapenuts. It infuriates me to see breakfast cereals
advertised to be eaten with toast, juice, etc. What's the use,
jaw exercise? I refuse to buy cereal for my child.
b. Subject 31. I eat cereal for convenience; it is easy and fast. I
have no interest in them nor am I fascinated with them.
e to classify. The categories of responses, as
presented in Table 1. These data indicate
responses from the subjects, thus adding
ditional modicum of support to the validity of the
CRITERION-RELATED VALIDITY
Criterion-related validity is demonstrated by com-
ne or more external variables that provide a direct
was the simple ordering or classification of prod-
Twenty-one products classified in other studies as
TABLE 1
RELATIONSHIP BETWEEN THE SCALE SCORES AND
THE OPEN-ENDED RESPONSES

35mm Cameras
                      Judges' ratings
Scale scores      Low    Medium    High    (Total)
Low                 7       1        0       (8)
Medium              4      12        7      (23)
High                0       4       10      (14)
(Total)           (11)    (17)     (17)     (45)
Collapsed for chi-square
Low + Medium       11      13        7
High                0       4       10

Red wine
                      Judges' ratings
Scale scores      Low    Medium    High    (Total)
Low                12       1        0      (13)
Medium              8       9        8      (25)
High                0       1        6       (7)
(Total)           (20)    (11)     (14)     (45)
Collapsed for chi-square
Low                12       1        0
Medium + High       8      10       14

Breakfast cereals
                      Judges' ratings
Scale scores      Low    Medium    High    (Total)
Low                19       3        0      (22)
Medium              9       9        1      (19)
High                0       2        2       (4)
(Total)           (28)    (14)      (3)     (45)
Collapsed for chi-square
Low                19       3        0
Medium + High       9      11        3

χ² = 10.4, df = 2, p < 0.01.
χ² = 17.0, df = 2, p
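The chi-square statistics attached to the collapsed panels follow the usual Pearson formula for a contingency table. A minimal sketch, assuming the collapsed red wine counts as read from Table 1 (rows: low scale scores versus medium and high combined; columns: judges' ratings of low, medium, high); the function name is illustrative, and a value computed this way need not reproduce the published statistic exactly.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row and column total is non-zero, so that no expected
    count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, df


# Collapsed red wine panel, as read from Table 1 (an assumption of this sketch).
red_wine = [[12, 1, 0], [8, 10, 14]]
statistic, df = chi_square(red_wine)
print(round(statistic, 1), df)
```

Collapsing rows before the test keeps the expected counts from becoming very small in sparse cells, which is presumably why the panels were collapsed.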
, 138) = 39.9, p
TABLE 2
RELATIONSHIP BETWEEN CONSTRUCT VALIDITY STATEMENTS AND LOW, MEDIUM, OR HIGH PII SCORES:
MEANS, STANDARD DEVIATIONS, AND CORRELATIONS

Construct validity statements
1. I would be interested in reading information about how the product is made.
2. I would be interested in reading the Consumer Reports article about this product.
3. I have compared product characteristics among brands.
4. I think there are a great deal of differences among brands.
5. I have a most-preferred brand of this product.

Instant coffee
Statement    Low (32)        Medium (12)     High (12)       r
1            3.28  (2.0)     4.42 (2.3)      4.25 (2.3)      .30=
2            3.00' (1.8)     4.75 (2.3)      4.92 (2.3)      .47"
3            2.59* (1.8)     3.42 (2.1)      5.25 (2.0)      .52=
4            3.94' 0-6)      4.67 (1.1)      6.33 (.8)       .63=
5            2.88' (1.9)     4.83 (1.8)      6.17 (1.7)      .68=

Laundry detergent
Statement    Low (4)         Medium (28)     High (25)       r
1            1.25' (.5)      4.04 (1.7)      4.48 (2.4)      .37=
2            2.75* (2.9)     4.46 (2.0)      5.00 (2.1)      .33=
3            1.75  (1.5)     4.36 (1.8)      4.80 (2.4)      .42=
4            2.25" (1.0)     4.00 (1.7)      5.20 (2.1)      .42=
5            2.50" (3.0)     4.68 (1.6)      5.44 (1.9)      .42=

Color television
Statement    Low (9)         Medium (26)     High (12)       r
1            3.56  (2.1)     4.00 (1.9)      4.23 (2.1)      .14
2            4.56  (1.9)     4.65 (1.9)      5.36 (2.1)      .27=
3            3.11' (1.9)     3.85 (1.9)      4.59 (2.3)      .23
4            4.11' (1.2)     4.85 (1.5)      5.73 (1.7)      .33=
5            2.56' (1.4)     4.77 (1.7)      5.55 (1.9)      .50=

The construct validity statements are measured on a seven-point scale: (1) strongly disagree to (7) strongly agree.
r = Pearson correlation between PII score and response to construct validity question.
p < 0.01.
p < 0.05.
Low scores significantly different than high scores, p < 0.01.
Low scores significantly different than high scores, p < 0.05.
Numbers in parentheses in table heading are numbers of subjects in each group. Numbers in parentheses in table body are standard deviations.
color television because it may be their responsibility
to do the family laundry. If this is true, they would
value the product's benefits and they would be likely
to be interested in the quality of the product because
they need the product to perform their household duties.
Televisions, however, may not fall under their responsibility
for maintenance or interest them as much. Electronics,
solid state, or color tuning may not be relevant
to them. And if a television does not affect them personally,
housewives might have relatively low involvement
with this product.
The results of the MANOVA for all five statements
were significant for the three products (instant coffee
F(10, 98) = 6.56, p < 0.001; laundry detergent F(10, 100)
= 2.34, p < 0.05; and color television F(10, 100) = 2.00,
p
8/10/2019 t02-Measuring the Involvement Construct
8/13
348
THE JOURN L OF CONSUMER RESE R
is made. To tap this dimension, subjects were asked the
extent to which they agreed with the statement "I have
compared product characteristics among brands of
____." For all products, the high scorers had significantly
greater agreement with the statement than low
scorers.
Perception of brand differences. The next proposition
tested was that high involvement scorers would
perceive greater differences among brands in the product
class than low involvement scorers. This proposition
stems from writings of Robertson (1976), who suggests
that high involvement implies that beliefs about product
attributes are strongly held, whereas low involvement
individuals do not hold strong beliefs about product
attributes. Thus, the strength of the belief system to the
attributes emphasizes the perception of differences
among brands on the attributes where beliefs are
strongly held. Subjects were asked to respond to the
statement "I think there are a great deal of differences
among brands of ____." High scorers always perceived
greater differences (p < 0.01) among brands than
low scorers in the product class.
Brand preferences. People highly involved in a
product class were hypothesized to have a most preferred
brand in the product category. The preference
of a particular brand stems from the perception of differences
among brands. Since high involvement implies
perceiving greater differences about product attributes,
then the consumer should have a greater preference
based on that product differentiation. Again, over all
three products, high scorers showed a significantly (p
< 0.01) greater agreement with the statement "I have
a most preferred brand of ____" than low scorers.
In conclusion, the various measures of construct validity
used the correlation of two paper and pencil tests
on the same subjects as evidence that the proposed scale
does tap the construct of involvement, as applied to
product categories. Although no one result is an excellent
test of the scale, each finding adds to the weight of
evidence that the scale is an acceptable measure of involvement,
as applied to product categories.
FACTOR ANALYSES
OF THE PII
An investigation of the dimensionality of the twenty-item
scale was carried out for each product category
used in the scale development. The items were factor
analyzed using varimax rotation with squared multiple
correlations in the diagonal for factor extraction. The
general pattern of results showed one main factor and
(usually) one minor or residual factor for every product
category. The major factor accounted for a range of
common variance from 65 percent for jeans to 100 percent
for instant coffee. Over all products, all items loaded
positively on the first factor, which indicates that the
assumption of a simple linear combination of the scale
items was not violated.
SENSITIVITY TO SITUATIONAL
DIFFERENCES
The second content validity, the criterion validity,
and the construct validity sections have demonstrated
that the level of involvement with product categories
varies greatly over individuals. For any product category,
there seem to be individuals who have low involvement
with the product and individuals who have
high involvement with the product. Additionally, the
average level of involvement varies across the different
products. For example, students rated bubble bath
on the PII and rated automobiles 122 on the PII. This
demonstrates that different products are perceived differently
by the same people. The scale is also proposed
to be sensitive to different situations, a third factor that
causes involvement, given the same people and the same
products.
Previous studies by Clarke and Belk (1978) and Belk
(1981) demonstrated that some purchase situations can
be more involving than others. They found that the
purchase of some previously uninvolving products as
gifts can raise the level of involvement in the purchase
decision. To investigate the possibility of rating purchase
situations on the scale, the PII was administered
over two purchase situations for wine to 41 members
of the clerical and administrative staff used in the previous
construct validity study.* Each subject rated two
purchase situations: (1) the purchase of a bottle of wine
for everyday consumption, and (2) the purchase of a
bottle of wine for a special dinner party. The scale items
were internally reliable for these purchase decisions.
Cronbach alphas were 0.98 and 0.97 respectively, and
the item-to-total correlations were generally above 0.
For this data collection, these situations were counterbalanced
across subjects. The mean scale score for
the everyday consumption was 78 (s = 34), and for the
special dinner party was 106 (s = 24). A related measures
t-test was significant at t(40) = 5.42,
p