
Subject variations in sorting data: Revisiting the Points-of-View model

David Bimler, John Kirkland Massey University

7th International Conference on Social Science Methodology

CCA and the Method of Sorting

• Cognitive anthropology gives us CCA or Cultural Consensus Analysis (Romney & Moore 1998).

• Subjects respond to a list of questions about some domain;

• Converted into matrix of inter-subject correlations.

• The latent factor structure of the matrix indicates whether subjects share a consensus “shared cognitive structure”.

• Sometimes the subjects’ responses are judgements of similarities between pairs of items or concepts. CCA can then be combined with MDS (or clustering).

• Similarities have been quantified with Method of Triads (Moore &c., 2002; Romney &c., 1997)

• Method of Sorting (Boster & D’Andrade, 1989; Boster & Johnson, 1989; Coxon 1999).

Presenter

Presentation Notes

The central notion of CCA – “Culture as a shared cognitive representation”, that can be quantified in some way. Boster & D’Andrade – stuffed birds. Boster & Johnson – outlines of fish species. Moore, Romney & Hsia (2002) – terms for basic colours. Romney, Moore & Rusch (1997) – emotion terms.

Example: 21 Animal names

See also Henley (1969); Howard & Howard (1977); Arabie, Hubert & De Soete (1997);…

antelope, beaver, camel, cat, chimp, chipmunk, cow, deer, dog, elephant, giraffe, goat, gorilla, horse, lion, monkey, rabbit, rat, sheep, tiger, zebra

Sorted by 170 3rd- and 4th-form pupils from a NZ secondary school (most aged 13 or 14).

Presenter

Presentation Notes

These are the most frequent names from a list of 30 names used in a series of studies.

Data collection: GPA-sorting procedure

1. Grouping

2. Partition

3. Additive

Presenter

Presentation Notes

Initial sorting into groups; then a finer subdivision of each group into subgroups; then merging groups into larger groups, as often as the subject is willing to continue.

Each subject’s sorting decisions can be written as a 21-by-21 matrix of inter-item dissimilarities.

For S1, ‘antelope’ and ‘deer’ go together at every stage

A ‘0’ entry means those two items are always grouped together, at every level of sorting

A ‘1’ means they are never grouped together.

Most entries are 1s.

We ‘unwrap’ the matrix into a 210-by-1 column…

Presenter

Presentation Notes

Actually we only unwrap the lower half-matrix below the diagonal, since it’s symmetric.
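The encoding described above can be sketched as follows. This is a minimal illustration, not the authors' own procedure: the slides only fix the two endpoints (0 = together at every level, 1 = never together), so scoring intermediate pairs as the fraction of levels at which they are apart is an assumption, and the five-item sort and the function names are hypothetical.

```python
from itertools import combinations

def dissimilarity_matrix(levels, items):
    """Fraction of sorting levels at which each pair of items is split apart.

    `levels` is a list of partitions, one per stage of the sort; each
    partition is a list of sets of item names (an assumed encoding).
    """
    index = {name: k for k, name in enumerate(items)}
    n = len(items)
    d = [[0.0] * n for _ in range(n)]
    for a, b in combinations(items, 2):
        apart = sum(
            1 for partition in levels
            if not any(a in group and b in group for group in partition)
        )
        d[index[a]][index[b]] = d[index[b]][index[a]] = apart / len(levels)
    return d

def unwrap(matrix):
    """Stack the entries below the diagonal, row by row, into one column."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

# Hypothetical three-level sort of five animal names.
items = ["antelope", "cat", "deer", "dog", "rat"]
levels = [
    [{"antelope", "deer"}, {"cat"}, {"dog"}, {"rat"}],  # finest subdivision
    [{"antelope", "deer"}, {"cat", "dog"}, {"rat"}],    # initial grouping
    [{"antelope", "deer"}, {"cat", "dog", "rat"}],      # after one merge
]
d = dissimilarity_matrix(levels, items)
column = unwrap(d)
print(len(column))  # 5 * 4 / 2 = 10 entries; 21 items would give 210
```

In this toy sort 'antelope' and 'deer' score 0 and most other pairs score 1, which mirrors the pattern reported for S1.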

…Then probe for latent structure, using PCA.

Presenter

Presentation Notes

Principal Components Analysis being the simplest form of factor analysis.

Loadings on F1 are all positive; mean loading = 0.58 (not very high).

• 2nd principal component is still significant: it accounts for 10.3% of variance (vs. 36.2% for F1).

• No single consensus.

⇒ Oblimin rotation to simple structure:

Presenter

Presentation Notes

This is a situation where the central CCA assumption of a single shared representation doesn’t really fit the data. To examine the range of variation among the subjects, we rotate the factor structure.
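The first step of this analysis (inter-subject correlations, then the leading principal component and its share of variance) can be sketched in plain Python. The four subject columns below are made up for illustration, power iteration stands in for whatever PCA routine was actually used, and the oblimin rotation is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def first_component(r, iterations=500):
    """Leading eigenvalue and loadings of a correlation matrix (power iteration)."""
    n = len(r)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(r[i][j] * v[j] for j in range(n)) for i in range(n))
    if sum(v) < 0:  # fix the arbitrary sign so loadings read as positive
        v = [-x for x in v]
    return eigenvalue, [math.sqrt(eigenvalue) * x for x in v]

# Hypothetical unwrapped dissimilarity columns, one per subject.
subjects = [
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 1, 0],
]
r = [[pearson(a, b) for b in subjects] for a in subjects]
eigenvalue, loadings = first_component(r)
share = eigenvalue / len(subjects)  # proportion of variance, since trace(r) = n
print(round(share, 3), [round(x, 2) for x in loadings])
```

A mean loading well below 1 together with a sizeable second eigenvalue is the kind of signal that prompted the rotation in the talk.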

Loading plot for rotated factor structure

Presenter

Presentation Notes

The subjects vary along a range… now the rotated factors are the ends of that range.

Each factor is a column of similarity values

Presenter

Presentation Notes

At this point, it is worth looking at each factor in more detail.

…Can be arranged back into similarity matrix

Presenter

Presentation Notes

Each factor (in this case, F2) is simply a prototypal or idealised similarity matrix.
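Arranging a factor column back into a square matrix is just the inverse of the unwrapping step. A minimal sketch, assuming the column was built from the lower half-matrix row by row; the function name and the placeholder diagonal are illustrative choices, since the diagonal is not part of the unwrapped data.

```python
def fold(column, n, diagonal=0.0):
    """Rebuild a symmetric n-by-n matrix from its unwrapped lower half."""
    if len(column) != n * (n - 1) // 2:
        raise ValueError("column length does not match n items")
    matrix = [[diagonal] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i):
            matrix[i][j] = matrix[j][i] = column[k]
            k += 1
    return matrix

# A 210-entry factor column folds back into a 21-by-21 matrix.
factor_column = [float(k) for k in range(210)]  # stand-in values, not real scores
factor_matrix = fold(factor_column, 21)
print(len(factor_matrix), len(factor_matrix[0]))  # 21 21
```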

[Dendrogram of F2. Leaves in order: antelope, elephant, giraffe, zebra, lion, tiger, camel, chimp, gorilla, monkey, beaver, chipmunk, cat, dog, rabbit, rat, cow, goat, sheep, horse, deer]

Hierarchical cluster structure of F2 is dominated by a division into exotic / “zoo” animals and familiar animals

Presenter

Presentation Notes

To get an idea of the internal structure of the factor, we can subject it to cluster analysis.
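The slides do not say which clustering method produced the dendrograms. As an illustration only, here is average-linkage agglomerative clustering applied to a small dissimilarity matrix; the four animals and their dissimilarities are invented, and the linkage rule is an assumption.

```python
def average_linkage(d, labels):
    """Agglomerative clustering; returns merges as (left, right, height)."""
    clusters = [[k] for k in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a):
                pairs = [d[i][j] for i in clusters[a] for j in clusters[b]]
                height = sum(pairs) / len(pairs)  # mean between-cluster dissimilarity
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        merges.append((
            sorted(labels[i] for i in clusters[b]),
            sorted(labels[i] for i in clusters[a]),
            height,
        ))
        merged = clusters[b] + clusters[a]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical dissimilarities: two familiar and two exotic animals.
labels = ["cat", "dog", "lion", "tiger"]
d = [
    [0.0, 0.2, 0.9, 0.9],
    [0.2, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.1],
    [0.9, 0.9, 0.1, 0.0],
]
merges = average_linkage(d, labels)
for left, right, height in merges:
    print(left, "+", right, "at", round(height, 2))
```

A factor holds similarity values, so in practice it would first be converted to dissimilarities (for example by subtracting from the maximum) before a routine like this is applied.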

Two-dimensional MDS model for F2:

D1 = Familiar – Exotic?

D2 = ???

Presenter

Presentation Notes

For another perspective, we subject the factor to MDS. In this case, 2 dimensions were adequate. The 1st dimension accommodates the “familiar / exotic” distinction. The 2nd dimension has no single, consistent interpretation. We can embed the cluster structure within the MDS solution.
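The MDS variant used in the talk is not stated. As a stand-in, the sketch below implements classical (Torgerson) MDS: double-centre the squared dissimilarities, then take the leading eigenvectors, here found by power iteration with deflation. It assumes the input behaves like Euclidean distances; with strongly non-Euclidean data the power iteration could lock onto a negative eigenvalue, and a non-metric routine would be the safer choice.

```python
import math

def classical_mds(d, dims=2, iterations=1000):
    """Coordinates whose distances approximate the dissimilarities in d."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # arbitrary starting vector
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Check on distances between the corners of a 3-by-4 rectangle.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
d = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]
coords = classical_mds(d)
```

The recovered configuration matches the rectangle up to rotation and reflection, which is all any MDS solution can determine.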

Repeat the process for F1

[Dendrogram of F1. Leaves in order: antelope, deer, camel, horse, zebra, cow, goat, sheep, elephant, giraffe, beaver, chipmunk, rabbit, rat, chimp, gorilla, monkey, cat, lion, tiger, dog]

F1 structure has a cluster of large animals; smaller clusters of ‘varmints’, primates, predators

Presenter

Presentation Notes

“Varmints” was Shepard’s term, from an early analysis of the animal domain.

MDS model for this factor has 3 dimensions

1st and 2nd dimensions

Presenter

Presentation Notes

D2 essentially separates the primates from the others – it could be “intelligence”, or “closeness to humans”.

1st and 3rd dimensions

Presenter

Presentation Notes

D3 separates the carnivores from everything else. Henley obtained a very similar result back in 1969, though she used 30 animal names, and pairwise similarity judgements rather than sorting.

• These factors are idealised patterns of responses. Rival points of view (Tucker & Messick, 1963).

• Combining them in the right proportions – defined by the factor loadings – produces a personal cognitive space for each subject, underlying his / her sorting sequence.

• 4th-form pupils have higher loading on F1 than 3rd formers; lower on F2.

• No sex difference.

• Progressing to a more complex view of animal names with more distinctions − more developed cognitive basis?

Mean factor loading    3rd     4th     t        p

Rotated F1             0.38    0.56    -3.923   0.000

Rotated F2             0.30    0.13    3.495    0.001
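A comparison of this kind can be sketched as a two-sample t statistic on the per-subject loadings. The slides do not say which variant of the test was used, so the pooled-variance form below is an assumption, and the loadings are invented; they are not the pupils' data.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical F1 loadings for a few 3rd- and 4th-form pupils.
third_form = [0.2, 0.4, 0.3, 0.5]
fourth_form = [0.5, 0.6, 0.7, 0.6]
t = pooled_t(third_form, fourth_form)
degrees_of_freedom = len(third_form) + len(fourth_form) - 2
print(round(t, 3), degrees_of_freedom)
```

A negative t here means the first group loads lower than the second, matching the sign convention of the F1 row in the table.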

“Points of view” model of individual difference

Presenter

Presentation Notes

It turns out that this generalisation of CCA is simply a re-invention of a model of individual difference that was around in the 1960s. Each “Point of View” is not a dimension, but a prototype similarity structure that may contain several dimensions. Is it an *interesting* model? Can we predict a subject’s position along the range of variation between two points of view? It seems that we can.
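One simple reading of "combining the points of view in proportions defined by the factor loadings" is a weighted sum of the prototype columns. The sketch below assumes exactly that linear form; the three-entry prototype columns are invented, and the weights are the 4th-form mean loadings from the table, used purely as example numbers.

```python
def personal_structure(factors, loadings):
    """Weighted sum of prototype similarity columns, one weight per factor."""
    length = len(factors[0])
    return [
        sum(w * f[k] for w, f in zip(loadings, factors))
        for k in range(length)
    ]

# Hypothetical prototype columns for two points of view (three item pairs).
f1 = [1.0, 0.0, 0.5]
f2 = [0.0, 1.0, 0.5]
subject = personal_structure([f1, f2], loadings=(0.56, 0.13))
print([round(x, 3) for x in subject])
```

Each subject is then described by just two numbers, which is the economy the Points-of-View model claims over dimension-weighting models.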

Presenter

Presentation Notes

Here’s the plot of individual factor loadings again, but colour-coded according to class.

• Could describe this age trend as a decrease in salience of a ‘familiarity’ axis; increase in salience of Size, Predacity, Intelligence axes.

• Would then fit within the weighted-Euclidean framework for individual variations.

• However, that requires 4 parameters to characterise each subject (vs. only two parameters for the Points-of-View model).

• Each P.o.V. is not a dimension, but a similarity structure that may contain several dimensions.
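For contrast, the weighted-Euclidean framework gives each subject one salience weight per axis of a shared space. A minimal sketch with four axes follows; the coordinates and weights are invented, and the axis order (familiarity, size, predacity, intelligence) is only a labelling assumption.

```python
import math

def weighted_distance(x, y, weights):
    """Distance between two items under one subject's axis weights."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

# Hypothetical coordinates on (familiarity, size, predacity, intelligence).
cat = (1.0, -0.5, 0.5, 0.0)
lion = (-1.0, 0.5, 1.0, 0.0)

younger = (1.0, 0.0, 0.0, 0.0)  # attends only to familiarity
older = (0.0, 1.0, 1.0, 1.0)    # attends to the other three axes

d_younger = weighted_distance(cat, lion, younger)
d_older = weighted_distance(cat, lion, older)
print(d_younger, round(d_older, 3))
```

Four weights per subject are needed here, against two loadings in the Points-of-View description of the same trend.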

Dimension weights and “points of view”

Presenter

Presentation Notes

According to a series of studies by Chan, the reverse trend appears in Alzheimer’s Syndrome: the cognitive axes become less salient again. Want to repeat this last point.

Application #2: Colour caps

First three unrotated factors capture 36.3%, 21.5%, 12.2% of variance. Mean loading on F1 = 0.56.

12 mm circular “caps”

16 samples, forming a circle in “colour space”

Hierarchical sorts from 36 subjects

Presenter

Presentation Notes

Positions of the colours in the “CIE-LAB” colour space. Here the data are complete hierarchical sorts, starting with each cap as a separate group, clumping them progressively into larger and larger groups. Again, the assumption of a single representation is not adequate.

Factor loadings after oblimin rotation

8 subjects known to be “colour blind” (red-green deficient). All high on F3.

Normal subjects polarised to F1 or F2

Presenter

Presentation Notes

We retained 3 factors this time. If anyone asks about the 9th subject in among the colour-blind ones – he was enlisted from a sample of college students, and we could not trace him afterwards for tests.

[Figure: MDS solutions for Factor 1, Factor 2 and Factor 3]

Presenter

Presentation Notes

Re-arranged F1 into a similarity matrix and analysed it with MDS (see first solution). It describes a strategy of clustering the blue and green caps first, into clusters that then successively swallow up the rest of the colour circle (‘chaining’). F2 describes the strategy of clustering red and purple caps, then progressively adding the rest of the circle to those clusters. F3 describes the colour-deficient strategy of clustering red, green and yellow caps together on one hand, and purple, blue and blue-green caps together on the other hand.

Application #3: Kinship terms

• 15 Kinship concepts (Rosenberg & Kim data). 85 sorts.

• F1 and F2 account for 51.4%, 13.7% of variance (then down to 7.2% for F3). Mean loading on F1 = 0.64.

Rotated version

Presenter

Presentation Notes

The data here were single-stage sorts. From Rosenberg (1982), “The method of sorting in multivariate research with applications…” Retained and rotated two factors. Fewer than 85 points, because some of the sorts were repeated by multiple subjects. Rosenberg arranged the sorts to emphasise 8 different sorting strategies; I’ve coloured the points accordingly.

[Dendrogram of F1. Leaves in order: G.father, G.son, G.mother, G.daughter, brother, sister, father, daughter, mother, son, nephew, niece, aunt, uncle, cousin]

F1 cluster structure shows a 3-way split: Nuclear-family relationships; Two generations away; Collaterals (‘the rellies’)

Presenter

Presentation Notes

MDS reveals that there is another (minor) distinction within the factor, separating males from females.

[Dendrogram of F2. Leaves in order: G.father, G.mother, father, mother, cousin, aunt, uncle, G.daughter, G.son, nephew, niece, daughter, son, brother, sister]

F2 structure has 3 generational clusters: Older generations; Younger generations; contemporaries

Presenter

Presentation Notes

1st dimension is ranking the kinship terms in order of generation; The 2nd dimension does not have a simple interpretation.

2nd and 3rd dimensions

Presenter

Presentation Notes

The role of the 2nd and 3rd dimensions is to allow maximum separation between younger and older direct-line kinships… …also between the younger and older collaterals.

Application #4: Occupation names

• 16 names of occupations (Project on Occupational Cognition, 1975). 103 H-sorts.

• First 3 factors account for 38.9%, 6.5%, 5.7%.

• Mean loading on F1 = 0.41.

• The 3 teachers and 2 accountants attend more to F1 (according to personal maps). Not so for doctors.

Presenter

Presentation Notes

Note that we’re back to hierarchical-sorting data now. Retained two factors from the PCA solution (it may be that further factors are also worth examining). For most of the subjects, their job is not recorded, since they were recruited by post-code rather than through a professional group.

F1 is polarised along a ‘status’ dimension:

[Dendrogram of F1. Leaves in order: Minister, Teacher, Actuary, Accountant, Solicitor, Civ.Srv.Exec, Psych.nurse, Policeman, Amb.Driver, Labourer, Porter, Barman, Machinist, Carpenter, Comm.Rep., Lorry drvr]

Presenter

Presentation Notes

Possible interpretations of this dimension are ‘status’, prestige, remuneration.

F2 is a three-way split between “People skills”; Cognitive skills; Manual skills.

[Dendrogram of F2. Leaves in order: Minister, Psych.nurse, Teacher, Amb.Driver, Policeman, Actuary, Accountant, Civ.Srv.Exec, Solicitor, Labourer, Lorry drvr, Porter, Barman, Machinist, Carpenter, Comm.Rep.]

Presenter

Presentation Notes

Nothing to add here.

MDS represents F2 in 2D:

Presenter

Presentation Notes

Two-dimensional MDS solution was adequate.

Conclusions

• When comparing subjects’ sorting choices, the distance between their similarity matrices does not work.

• Dominated by the relative numbers of 0 and 1 entries in each matrix (in turn determined by the number of piles).

⇒ use correlation between similarity matrices. Compensates for numbers of 0s and 1s.

• Single factor underlying matrix of inter-subject correlations? ⇒ Subjects all apply the same implicit model to the domain (the cultural consensus analysed by CCA).

• Otherwise, subjects may fall into distinct sub-groups; or form a continuum between two factors.

• The factors themselves are idealised similarity structures (Points of View) − amenable to MDS or clustering.
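The first conclusion, that raw distance between sorting matrices mostly reflects the number of piles, can be illustrated with three invented single-stage sorts of six items. The item labels, the sorts and the function names are all hypothetical.

```python
import math
from itertools import combinations

def sort_to_vector(piles, items):
    """Unwrapped 0/1 dissimilarities: 0 if a pair shares a pile, else 1."""
    pile_of = {item: k for k, pile in enumerate(piles) for item in pile}
    return [0 if pile_of[a] == pile_of[b] else 1
            for a, b in combinations(items, 2)]

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

items = "ABCDEF"
lumper = sort_to_vector(["ABC", "DEF"], items)                 # two big piles
splitter_1 = sort_to_vector(["AB", "C", "D", "E", "F"], items)  # refines the lumper
splitter_2 = sort_to_vector(["A", "B", "CD", "E", "F"], items)  # unrelated grouping

print(euclidean(splitter_1, splitter_2) < euclidean(splitter_1, lumper))  # True
print(pearson(splitter_1, lumper) > pearson(splitter_1, splitter_2))      # True
```

By distance the two splitters look alike simply because both matrices are nearly all 1s, even though they share no grouping; correlation instead pairs the first splitter with the lumper whose piles it refines.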

Presenter

Presentation Notes

The Cultural Consensus theorists have an important insight for handling sorting data… …But they failed to take the next step, and look at what it means if the assumption of a single shared mental structure isn’t true. If they had done that, they would have re-invented a wheel from the 1960s.

Thank you

• Acknowledgements to the Project on Occupational Cognition, which conducted the hierarchical sorting of job titles,

• And the UK Data Service at the University of Essex, who are the custodians of the data.

• Emma Barraclough helped collect the colour data.

• Enormous debt to A. P. M. Coxon for delivering the talk.
