chapter12 slides
Welcome to PowerPoint slides for
Chapter 12
Factor Analysis for Data Reduction
Marketing Research: Text and Cases
by Rajendra Nargundkar
Introduction
1. Factor Analysis is a set of techniques used for understanding variables by grouping them into “factors” consisting of similar variables
2. It can also be used to confirm whether a hypothesized set of variables groups into a factor or not
3. It is most useful when a large number of variables needs to be reduced to a smaller set of “factors” that contain most of the variance of the original variables
4. Generally, Factor Analysis is done in two stages, called
• Extraction of Factors and
• Rotation of the Solution obtained in stage 1
5. Factor Analysis is best performed with interval or ratio-scaled variables
Slide 1
Application Areas/Example
1. In marketing research, a common application area of Factor Analysis is to understand underlying motives of consumers who buy a product category or a brand
2. The worked out example in the chapter will help clarify the use of Factor Analysis in Marketing Research
3. In this example, we assume that a two wheeler manufacturer is interested in determining which variables his potential customers think about when they consider his product
4. Let us assume that twenty two-wheeler owners were surveyed by this manufacturer (or by a marketing research company on his behalf). They were asked to indicate on a seven point scale (1=Completely Agree, 7=Completely Disagree), their agreement or disagreement with a set of ten statements relating to their perceptions and some attributes of the two-wheelers.
5. The objective of doing Factor Analysis is to find underlying "factors" which would be fewer than 10 in number, but would be linear combinations of some of the original 10 variables
Slide 2
The research design for data collection can be stated as follows-
Twenty 2-wheeler users were surveyed about their perceptions and image attributes of the vehicles they owned. Ten questions were asked to each of them, all answered on a scale of 1 to 7 (1= completely agree, 7= completely disagree).
1. I use a 2-wheeler because it is affordable.
2. It gives me a sense of freedom to own a 2-wheeler.
3. Low maintenance cost makes a 2-wheeler very economical in the long run.
4. A 2-wheeler is essentially a man’s vehicle.
5. I feel very powerful when I am on my 2-wheeler.
6. Some of my friends who don’t have their own vehicle are jealous of me.
7. I feel good whenever I see the ad for 2-wheeler on T.V., in a magazine or on a hoarding.
8. My vehicle gives me a comfortable ride.
9. I think 2-wheelers are a safe way to travel.
10. Three people should be legally allowed to travel on a 2-wheeler.
Slide 3
Slide 4
The input data containing responses of twenty respondents to the 10 statements are in Appendix 1, in the form of a 20 row by 10 column matrix (reproduced below).
            QUESTION NO.
S.No.    1   2   3   4   5   6   7   8   9  10
  1      1   4   1   6   5   6   5   2   3   2
  2      2   3   2   4   3   3   3   5   5   2
  3      2   2   2   1   2   1   1   7   6   2
  4      5   1   4   2   2   2   2   3   2   3
  5      1   2   2   5   4   4   4   1   1   2
  6      3   2   3   3   3   3   3   6   5   3
  7      2   2   5   1   2   1   2   4   4   5
  8      4   4   3   4   4   5   3   2   3   3
  9      2   3   2   6   5   6   5   1   4   1
 10      1   4   2   2   1   2   1   4   4   1
 11      1   5   1   3   2   3   2   2   2   1
 12      1   6   1   1   1   1   1   1   2   2
 13      3   1   4   4   4   3   3   6   5   3
 14      2   2   2   2   2   2   2   1   3   2
 15      2   5   1   3   2   3   2   2   1   6
 16      5   6   3   2   1   3   2   5   5   4
 17      1   4   2   2   1   2   1   1   1   3
 18      2   3   1   1   2   2   2   3   2   2
 19      3   3   2   3   4   3   4   3   3   3
 20      4   3   2   7   6   6   6   2   3   6
Slide 4 contd
Slide 5
The data are subjected to Factor Analysis in two stages (though there are two stages, both outputs can be requested at the same time, at least in SPSS, by the process described in the SPSS Commands Appendix to the chapter).
1. In stage 1, we request the software package used (SPSS, Statistica, etc.) to EXTRACT factors with an eigenvalue of 1 or higher. The extraction method requested is PRINCIPAL COMPONENTS. This gives us the output in Figs. 2 and 3.
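The extraction stage described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the SPSS procedure itself: it assumes principal components are taken from the correlation matrix of the 20 x 10 response matrix, and the function name `extract_principal_components` is hypothetical. The data are typed in from the table in these slides, so any transcription slip would change the numbers; the result should be compared against Fig. 3 rather than taken on trust.

```python
import numpy as np

# Responses of the 20 respondents (rows) to the 10 statements (columns),
# copied from the data table in these slides.
X = np.array([
    [1, 4, 1, 6, 5, 6, 5, 2, 3, 2],
    [2, 3, 2, 4, 3, 3, 3, 5, 5, 2],
    [2, 2, 2, 1, 2, 1, 1, 7, 6, 2],
    [5, 1, 4, 2, 2, 2, 2, 3, 2, 3],
    [1, 2, 2, 5, 4, 4, 4, 1, 1, 2],
    [3, 2, 3, 3, 3, 3, 3, 6, 5, 3],
    [2, 2, 5, 1, 2, 1, 2, 4, 4, 5],
    [4, 4, 3, 4, 4, 5, 3, 2, 3, 3],
    [2, 3, 2, 6, 5, 6, 5, 1, 4, 1],
    [1, 4, 2, 2, 1, 2, 1, 4, 4, 1],
    [1, 5, 1, 3, 2, 3, 2, 2, 2, 1],
    [1, 6, 1, 1, 1, 1, 1, 1, 2, 2],
    [3, 1, 4, 4, 4, 3, 3, 6, 5, 3],
    [2, 2, 2, 2, 2, 2, 2, 1, 3, 2],
    [2, 5, 1, 3, 2, 3, 2, 2, 1, 6],
    [5, 6, 3, 2, 1, 3, 2, 5, 5, 4],
    [1, 4, 2, 2, 1, 2, 1, 1, 1, 3],
    [2, 3, 1, 1, 2, 2, 2, 3, 2, 2],
    [3, 3, 2, 3, 4, 3, 4, 3, 3, 3],
    [4, 3, 2, 7, 6, 6, 6, 2, 3, 6],
], dtype=float)


def extract_principal_components(data, min_eigenvalue=1.0):
    """Return all eigenvalues and the loadings of the retained components."""
    corr = np.corrcoef(data, rowvar=False)      # 10 x 10 correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= min_eigenvalue            # eigenvalue-of-1 criterion
    # Loading = eigenvector scaled by sqrt(eigenvalue). The sign of each
    # column is arbitrary, so signs may differ from a package's output.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


eigvals, loadings = extract_principal_components(X)
print("Eigenvalues:", np.round(eigvals, 5))
print("Factors retained:", loadings.shape[1])
print("Unrotated loadings:\n", np.round(loadings, 5))
```

Because the eigenvalues of a correlation matrix add up to the number of variables (10 here), each eigenvalue divided by 10 is the share of total variance carried by that component.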
Fig. 2: Factor Matrix (Unrotated)
            Factor 1   Factor 2   Factor 3
VAR00001     .17581     .66967     .49301
VAR00002    -          -.60774     .25369
VAR00003    -           .81955     .21827
VAR00004     .96647    -.03627    -.09745
VAR00005     .95098     .16594    -.13593
VAR00006     .95184    -.08442    -.02522
VAR00007     .97128     .09591    -.04636
VAR00008    -           .77498    -.03757
VAR00009    -           .73502    -.48213
VAR00010     .16143     .31862    -.81356
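Four Factor 1 entries in Fig. 2 survive only as a lone "-". Their approximate size can be inferred, under two assumptions: that the "-" marks a negative loading whose digits were lost, and that each communality in Fig. 3 equals the sum of the squared loadings on the three retained factors (which holds for orthogonal factors). The helper name `implied_magnitude` is hypothetical, and the results are estimates from rounded inputs, not values read from the original output.

```python
import math

# (communality from Fig. 3, Factor 2 loading, Factor 3 loading from Fig. 2)
# for the four rows whose Factor 1 loading is incomplete in Fig. 2.
rows = {
    "VAR00002": (0.45214, -0.60774, 0.25369),
    "VAR00003": (0.73056, 0.81955, 0.21827),
    "VAR00008": (0.79869, 0.77498, -0.03757),
    "VAR00009": (0.77745, 0.73502, -0.48213),
}


def implied_magnitude(communality, *known_loadings):
    """Absolute value of the one missing loading, given the communality."""
    remainder = communality - sum(l * l for l in known_loadings)
    return math.sqrt(max(remainder, 0.0))   # guard against rounding below 0


implied = {name: implied_magnitude(*vals) for name, vals in rows.items()}
for name, magnitude in implied.items():
    print(f"{name}: |Factor 1 loading| is roughly {magnitude:.3f}")

# Sanity check on a complete row: VAR00001 has Factor 1 loading .17581.
check = implied_magnitude(0.72243, 0.66967, 0.49301)
print(f"VAR00001 check: {check:.3f}")
```

The same identity applied to VAR00001, whose three loadings are all present, gives back its printed Factor 1 loading to about three decimals, which supports the assumption.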
Slide 6
Interpretation of the Output
1. The first step in interpreting the output is to look at the factors extracted, their eigenvalues and the cumulative percentage of variance (fig. 3, reproduced below).
Fig. 3: Final Statistics
Variable    Communality  *  Factor  Eigenvalue  Pct of Var  Cum Pct
VAR00001      .72243     *    1      3.88282      38.8       38.8
VAR00002      .45214     *    2      2.77701      27.8       66.6
VAR00003      .73056     *    3      1.37475      13.7       80.3
VAR00004      .94488     *
VAR00005      .95038     *
VAR00006      .91376     *
VAR00007      .95474     *
VAR00008      .79869     *
VAR00009      .77745     *
VAR00010      .78946     *
1. We note that three factors have been extracted, based on our criterion that only factors with eigenvalues of 1 or more should be extracted. We see from the Cum. Pct. (Cumulative Percentage of Variance Explained) column in Fig. 3 that the three factors extracted together account for 80.3 percent of the total variance (information contained in the original ten variables). This is a pretty good bargain, because we are able to economise on the number of variables (from 10 we have reduced them to 3 underlying factors), while we lose only about 20 percent of the information content (80 percent is retained by the 3 factors extracted out of the 10 original variables).
2. This represents a reasonably good solution for our problem.
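The percentage columns of Fig. 3 follow directly from the eigenvalues. As a small worked check, assuming the analysis was run on standardized variables so that the total variance equals the number of variables (10), each eigenvalue divided by 10 gives that factor's share of variance:

```python
# Eigenvalues of the three retained factors, from Fig. 3.
eigenvalues = [3.88282, 2.77701, 1.37475]
n_variables = 10  # total variance of 10 standardized variables

pct_of_var = [100.0 * ev / n_variables for ev in eigenvalues]

cum_pct = []
running_total = 0.0
for pct in pct_of_var:
    running_total += pct
    cum_pct.append(running_total)

for i, (pct, cum) in enumerate(zip(pct_of_var, cum_pct), start=1):
    print(f"Factor {i}: {pct:.1f}% of variance, {cum:.1f}% cumulative")

print(f"Variance not captured: {100.0 - cum_pct[-1]:.1f}%")
```

This reproduces the 38.8, 27.8 and 13.7 percent figures and the 80.3 percent cumulative total, leaving a little under 20 percent of the variance unexplained.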
Slide 6 contd...
Slide 7
1. Now, we try to interpret what these 3 extracted factors represent. This we can accomplish by looking at figs. 4 and 2, the rotated and unrotated factor matrices.
Fig. 4: Rotated Factor Matrix
            Factor 1   Factor 2   Factor 3
VAR00001     .13402     .34749     .76402
VAR00002    -.18143    -.64300    -.07596
VAR00003    -.10944     .62985     .56742
VAR00004     .96986    -.06383    -.01338
VAR00005     .96455     .13362     .04660
VAR00006     .94544    -.13868     .02600
VAR00007     .97214     .02862     .09411
VAR00008    -.26169     .85203     .06517
VAR00009     .00891     .87772    -.08347
VAR00010     .07209    -.10990     .87874
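The slides do not say which rotation method produced Fig. 4. As an illustration only, the sketch below assumes varimax, a common orthogonal rotation, implemented with the usual iterative SVD scheme and without Kaiser normalization. It is applied to the six rows of Fig. 2 whose three loadings are all legible, so it demonstrates the mechanics of rotation and is not expected to reproduce Fig. 4. The function name `varimax` and its parameters are this sketch's own.

```python
import numpy as np

# Rows of Fig. 2 (unrotated) with all three loadings legible:
# VAR00001, VAR00004, VAR00005, VAR00006, VAR00007, VAR00010.
partial = np.array([
    [0.17581, 0.66967, 0.49301],
    [0.96647, -0.03627, -0.09745],
    [0.95098, 0.16594, -0.13593],
    [0.95184, -0.08442, -0.02522],
    [0.97128, 0.09591, -0.04636],
    [0.16143, 0.31862, -0.81356],
])


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns (rotated loadings, rotation)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (rotated ** 3 - rotated @ col_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                    # nearest orthogonal matrix
        new_criterion = s.sum()
        if criterion and new_criterion < criterion * (1 + tol):
            break                            # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation


rotated, rotation = varimax(partial)
print(np.round(rotated, 5))
```

Because the rotation matrix is orthogonal, each variable's sum of squared loadings is the same before and after rotation: rotation redistributes the explained variance among the factors without changing how much of each variable is explained.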
1. Looking at fig. 4, the rotated factor matrix, we notice that variable nos. 4, 5, 6 and 7 have loadings of 0.96986, 0.96455, 0.94544 and 0.97214 on factor 1 (we look down the Factor 1 column in fig. 4, and look for high loadings close to 1.00). This suggests that Factor 1 is a combination of these four original variables. Fig. 2 also suggests a similar grouping. Therefore, there is no problem interpreting factor 1 as a combination of “a man’s vehicle” (statement in variable 4), “feeling of power” (variable 5), “others are jealous of me” (variable 6) and “feel good when I see my 2-wheeler ads” (variable 7).
2. At this point, the researcher’s task is to find a suitable phrase which captures the essence of the original variables which form the underlying concept or “factor”. In this case, factor 1 could be named “male ego”, or “machismo”, or “pride of ownership” or something similar. With the same mathematical output, interpretations of different researchers may differ.
Slide 7 contd...
1. Now we will attempt to interpret factor 2. We look in fig 4, down the column for Factor 2, and find that variables 8 and 9 have high loadings of 0.85203 and 0.87772, respectively. This indicates that factor 2 is a combination of these two variables.
2. But if we look at fig. 2, the unrotated factor matrix, a slightly different picture emerges. Here, variable 3 also has a high loading on factor 2, along with variables 8 and 9. It is left to the researcher which interpretation he wants to use, as there are no hard and fast rules. Assuming we decide to use all three variables, the related statements are “low maintenance”, “comfort” and “safety” (from statements 3, 8 and 9). We may combine these variables into a factor called “utility” or “functional features” or any other similar word or phrase which captures the essence of these three statements / variables.
Slide 8
3. For interpreting Factor 3, we look at the column labelled factor 3 in fig. 4 and find that variables 1 and 10 are loaded high on factor 3. According to the unrotated factor matrix of fig. 2, only variable 10 loads high on factor 3. Supposing we stick to fig. 4, then the combination of “affordability” and “cost saving by 3 people legally riding on a 2-wheeler” gives the impression that factor 3 could be “economy” or “low cost”.
4. We have now completed interpretation of the 3 factors with eigenvalues of 1 or more. We will now look at some additional issues which may be of importance in using factor analysis.
Slide 8 contd...
Slide 9
Additional Issues in Interpreting Solutions
1. We must guard against the possibility that a variable may load highly on more than one factor. Strictly speaking, a variable should load close to 1.00 on one and only one factor, and load close to 0 on the other factors. If this is not the case, it indicates that either the sample of respondents has more than one opinion about the variable, or that the question/variable may be unclear in its phrasing.
2. The other issue important in practical use of factor analysis is the answer to the question “what should be considered a high loading and what is not a high loading?” Here, unfortunately, there is no clear-cut guideline, and many a time, we must look at relative values in the factor matrix. Sometimes, 0.7 may be treated as a high value, while sometimes 0.9 could be the cutoff for high values.
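The two issues above can be made concrete with the rotated loadings of Fig. 4. The sketch below assumes a cutoff of 0.7 for a "high" loading (one of the values mentioned above) and, as a purely illustrative choice, flags a variable as cross-loaded when two or more absolute loadings exceed 0.5. Both thresholds and all names in the code are assumptions of this sketch, not rules from the chapter.

```python
# Rotated loadings from Fig. 4: variable number -> (Factor 1, 2, 3).
rotated = {
    1: (0.13402, 0.34749, 0.76402),
    2: (-0.18143, -0.64300, -0.07596),
    3: (-0.10944, 0.62985, 0.56742),
    4: (0.96986, -0.06383, -0.01338),
    5: (0.96455, 0.13362, 0.04660),
    6: (0.94544, -0.13868, 0.02600),
    7: (0.97214, 0.02862, 0.09411),
    8: (-0.26169, 0.85203, 0.06517),
    9: (0.00891, 0.87772, -0.08347),
    10: (0.07209, -0.10990, 0.87874),
}

HIGH_CUTOFF = 0.7    # assumed threshold for a "high" loading
CROSS_CUTOFF = 0.5   # assumed threshold for flagging cross-loadings

groups = {1: [], 2: [], 3: []}
unassigned = []
cross_loaded = []

for var, loads in rotated.items():
    abs_loads = [abs(l) for l in loads]
    best = max(range(3), key=lambda j: abs_loads[j])   # strongest factor
    if abs_loads[best] >= HIGH_CUTOFF:
        groups[best + 1].append(var)
    else:
        unassigned.append(var)
    if sum(a >= CROSS_CUTOFF for a in abs_loads) > 1:
        cross_loaded.append(var)

print("Factor groups:", groups)
print("No high loading:", unassigned)
print("Cross-loaded:", cross_loaded)
```

With these thresholds the grouping matches the interpretation given earlier from fig. 4 (variables 4 to 7 on factor 1, 8 and 9 on factor 2, 1 and 10 on factor 3), while variable 3 is flagged because it loads moderately on both factor 2 and factor 3. Raising the cutoff to 0.9 would leave only factor 1 with any assigned variables, which shows how sensitive the interpretation is to the chosen threshold.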
Slide 9…contd…Additional Issues (Contd.)
1. The proportion of variance in any one of the original variables which is captured by the extracted factors is known as Communality. For example, fig. 3 tells us that after 3 factors were extracted and retained, the communality is 0.72243 for variable 1, 0.45214 for variable 2 and so on (from the column labelled communality in fig. 3). This means that 0.72243 or 72.24 percent of the variance (information content) of variable 1 is being captured by our 3 extracted factors together. Variable 2 exhibits a low communality value of 0.45214. This implies that only 45.214 percent of the variance in variable 2 is captured by our extracted factors. This may also partially explain why variable 2 is not appearing in our final interpretation of the factors (in the earlier section). It is possible that variable 2 is an independent variable which is not combining well with any other variable, and therefore should be further investigated separately. “Freedom” could be a different concept in the minds of our target audience.
2. As a final comment, it is again the author’s recommendation that we use the rotated factor matrix (rather than unrotated factor matrix) for interpreting factors, particularly when we use the principal components method for extraction of factors in stage 1.
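The definition of communality given above can be checked numerically. For orthogonal factors, a variable's communality is the sum of its squared loadings across the retained factors, and this sum is unchanged by an orthogonal rotation. The sketch below recomputes the communalities from the rotated loadings of Fig. 4 and compares them with the Communality column of Fig. 3. The variable names in the code are this sketch's own.

```python
# Rotated loadings (Fig. 4) and reported communalities (Fig. 3),
# listed in order VAR00001 ... VAR00010.
rotated = [
    (0.13402, 0.34749, 0.76402),
    (-0.18143, -0.64300, -0.07596),
    (-0.10944, 0.62985, 0.56742),
    (0.96986, -0.06383, -0.01338),
    (0.96455, 0.13362, 0.04660),
    (0.94544, -0.13868, 0.02600),
    (0.97214, 0.02862, 0.09411),
    (-0.26169, 0.85203, 0.06517),
    (0.00891, 0.87772, -0.08347),
    (0.07209, -0.10990, 0.87874),
]
reported = [0.72243, 0.45214, 0.73056, 0.94488, 0.95038,
            0.91376, 0.95474, 0.79869, 0.77745, 0.78946]

# Communality = sum of squared loadings across the three factors.
computed = [sum(l * l for l in row) for row in rotated]

for i, (c, r) in enumerate(zip(computed, reported), start=1):
    print(f"VAR{i:05d}: computed {c:.5f}, reported {r:.5f}")

lowest = min(range(10), key=lambda i: computed[i]) + 1
print("Lowest communality: variable", lowest)
```

The computed and reported values agree to about three decimal places, and variable 2 ("freedom") has the lowest communality, consistent with its absence from the interpreted factors.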