Article In Press 5/12/2014 @ Journal of Research in Personality Subject to Final Copy Editing

The Comprehensive Approach to Analyzing Multivariate Constructs

Ryne A. Sherman & David G. Serfass

Florida Atlantic University

Author Notes

Ryne A. Sherman, Florida Atlantic University; David G. Serfass, Florida Atlantic

University; Correspondence regarding this article may be addressed to Ryne Sherman by e-mail

at [email protected].

All statistical analyses were conducted using R (R Core Team, 2014). We thank Dustin

Wood for comments on a prior draft of this article. All errors and omissions remain our own.

Running Head: ANALYZING MULTIVARIATE CONSTRUCTS

Abstract

Many psychological constructs of interest to personality psychologists, such as personality,

behavior, and emotions, are made up of many variables. Moreover, similarity metrics, such as

self-other agreement, profile similarity, or behavioral consistency, result from calculations

conducted across many variables. When analyzed using a comprehensive approach, such

multivariate constructs present unique analytic challenges. Such challenges are not well

addressed in standard graduate statistics textbooks or presently available in standard commercial

software. This article introduces the ‘multicon’ package, freely available in the R statistical

package, designed to aid researchers interested in taking a comprehensive approach to analyzing

multivariate constructs. Realistic examples from personality psychology are provided to

demonstrate the utility of this package.

Keywords: multivariate constructs; profile correlations; R; replicability


The Comprehensive Approach to Analyzing Multivariate Constructs

Is personality related to behavior? How do extraverts behave differently from introverts?

How well do two people agree about what someone else’s personality is like? How accurately

can we judge someone else’s personality? How similar/consistent are people or situations?

Personality scientists are often concerned with these sorts of questions and many more like them.

However, answering questions such as these can be quite complicated. To see why, compare

these questions to another question: What is the relationship between a person’s height and

weight? A key difference is that the constructs of interest in the first set of questions are

multivariate, while the constructs in the latter question are not. Multivariate constructs, as the

name implies, refer to psychological constructs that consist of many psychological variables.1

Many constructs of interest to personality psychologists are multivariate in nature: personality,

behavior, emotions, motives, situations, etc.

The difficulty with multivariate constructs is that they make answering questions like

those posed at the outset challenging. For example, answering the question about the

relationship between personality and behavior requires, at minimum, some definition of what is

meant by “personality” and “behavior.” Depending on one’s particular perspective, the

multivariate construct of personality might include thousands of traits (Allport & Odbert, 1936),

one-hundred (Block, 1961), or merely a handful (i.e., 5; McCrae & Costa, 2008). Regardless,

most personality scientists recognize that personality is a multivariate construct. Behavior is also

a multivariate construct although arguably psychologists have put less effort into taxonomizing

behavior than personality (Furr, 2009).

There are roughly two strategies psychologists have used to deal with the problem of

multivariate constructs.2 The first strategy reduces the construct(s) of interest to a smaller


number (e.g., 1-6) of more mentally tractable, often empirically derived, essential variables. We

refer to this strategy as the essential approach. For example, instead of “personality” (broadly

construed) one might focus on just a single trait (e.g., extraversion) or a subset of broad traits

(e.g., the Big 5). Likewise, instead of “behavior” (broadly construed) one might focus on just a

single behavior (e.g., talkativeness) or on a subset of broad behaviors (e.g., interpersonal

behaviors from the Interpersonal Circumplex).

The second strategy for dealing with the problem of multivariate constructs tries to avoid

data reduction as much as possible, preferring to comprehensively assess and analyze the many

relationships between the constructs of interest. We refer to this strategy as the comprehensive

approach (Sherman & Wood, 2014). A researcher employing this approach may use measures

designed such that each item represents a distinct characteristic such as the California Adult Q-

set (CAQ: Block, 1961) or the Inventory for Individual Differences in the Lexicon (IIDL: Wood,

Nye, & Saucier, 2010). Alternatively, a comprehensive approach may even employ measures

designed to assess essential variables (e.g., the NEO PI-R: Costa & McCrae, 1992; the Big Five

Inventory: John & Srivastava, 1999; the HEXACO-PI-R: Lee & Ashton, 2004), but treat each

item as if it were to be analyzed separately (cf. Biesanz, 2010; Biesanz & Human, 2010; Human

& Biesanz, 2011a, b).

There are strengths and weaknesses to both approaches. The essential approach reduces

complex multivariate constructs such as personality and behavior into mentally tractable subsets.

This makes the research conceptually easier to transmit to other scientists and beyond. The

comprehensive approach, on the other hand, can be mentally taxing (i.e., who wants to look at a

correlation matrix with 100×67 = 6700 unique elements?; see section “Are these two multivariate

constructs related”). An additional advantage of the essential approach is that it can drastically


reduce the number of variables analyzed resulting in lower Type I error rates. The

comprehensive approach often involves computing a large number of correlations and risks

identifying noise as signal. However, the essential approach may miss or obscure associations

between the constructs of interest (cf. Brown & Sherman, in press; Fast & Funder, 2008; Hirsh,

DeYoung, Xu, & Peterson, 2010). The comprehensive approach is less likely to miss or obscure

such associations. Lastly, both the essential and comprehensive approaches can be used to

answer questions about agreement, similarity, or consistency at the nomothetic (e.g., item) level.

However, comprehensive approaches—which include more variables—may be superior for

addressing these questions at the idiographic (e.g., person, profile) level because the increased

number of variables increases the reliability of such profiles.

A perhaps less-well recognized difference between the essential and comprehensive

approaches is that the statistical tools for conducting analyses from an essential approach are

well-described in graduate statistics textbooks, widely available in standard commercial software

(e.g., SAS, SPSS, Excel), and easy to implement. The comprehensive approach, on the other

hand, comes with a unique set of problems (e.g., how to handle so many variables, how to

appropriately test for profile similarity) requiring different data analytic methods. Such methods

are not (a) well-described in textbooks, (b) widely available in standard commercial software, or

(c) easy to implement.

This article introduces the ‘multicon’ package—an R package offering functions

designed to deal with the problems inherent with the comprehensive approach for handling

multivariate constructs (Sherman, 2014). In this article, we provide examples of realistic

questions a personality scientist may encounter and show how a researcher using a

comprehensive approach might use the functions available in the ‘multicon’ package to address


these questions. Table 1 provides a summary of the types of questions we address in this article

along with the functions from the ‘multicon’ package used to address them. All datasets used in

these examples are built into the ‘multicon’ package making it easy to follow along.3 Although

we refer to differences between the essential and comprehensive approaches to handling

multivariate constructs, this article is not meant to create, or resolve, a conflict between these two

approaches. Indeed, as noted previously, both approaches have strengths and weaknesses. As

such, this article will primarily focus on analytic issues involved in using a comprehensive

approach and describe the tools provided by the ‘multicon’ package to help resolve them.

Are these two multivariate constructs related?

We began by asking what appears to be a simple question: Is personality related to behavior?

Let us say that we have measured personality with the 100-item CAQ (Block, 1961) and

behavior with the 67-item Riverside Behavioral Q-sort (RBQ: Funder, Furr, & Colvin, 2000;

Furr, Wagerman, & Funder, 2010). The essential approach to this question would be to first, for

both personality and behavior, reduce the number of items measured to some essential subset.

Such subsets could be derived empirically (e.g., factor analysis, principal components) or

theoretically (e.g., the interpersonal circumplex; see Markey, Funder, & Ozer, 2003). The second

step using the essential approach would then be to examine the associations (correlations)

between the resultant subsets of variables. Almost all software packages, commercial or

otherwise, are designed to make such analyses easy and convenient.

A comprehensive approach to this question, though, would aim to analyze the full set of

correlations between all 100 personality items and the 67 behaviors. Calculating such a

correlation matrix is usually quite easy in just about any statistical package. However, as

previously noted, perusing a matrix of 6700 correlations will likely prove mentally


intractable. Thus, an alternative method for quantifying the degree of relationship between

personality and behavior is needed. One method is to count the total number of statistically

significant correlations in the matrix (cf. Block, 1960). Another is to determine if the average

magnitude amongst the 6700 correlations is larger than one would expect if the constructs were

not related (Sherman & Funder, 2009). Following Sherman and Funder (2009), a randomization

test can be used to do both of these simultaneously. The test randomly reassigns CAQ profiles to

RBQ profiles, creating a pseudo dataset, and calculates both the total number of statistically

significant correlations and the average absolute r amongst the 6700 correlations in this pseudo

dataset. To better illustrate this process, imagine picking up each subject’s CAQ profile (keeping

all 100 scores intact) and randomly reassigning this profile to a subject’s RBQ profile. In doing

so, one is simulating a random relationship between personality and behavior, while maintaining

the dependencies (covariation) within the multivariate constructs. Next, one calculates the

100×67 correlation matrix on this pseudo dataset and records the number of statistically

significant correlations and the average absolute r of this correlation matrix. These numbers

represent simulated values under a model of a random relationship between personality and

behavior. Repeating this procedure many times allows for the formation of a sampling

distribution, to which we can compare the observed results from the original dataset. Calculating

the proportion of simulated values greater than or equal to the observed values (for the number

statistically significant and the average absolute r respectively) yields a p-value indicating the

probability of obtaining the originally observed results under chance.
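
The procedure just described can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the implementation used by the rand.test function: the data are simulated, and every name below (X, Y, summarize_cors, null_dist) is hypothetical.

```r
# Minimal base-R sketch of a profile-shuffling randomization test.
# All data are simulated; object and function names are hypothetical.
set.seed(1)
n <- 100                                   # number of subjects
latent <- rnorm(n)                         # shared source of covariation
X <- sapply(1:10, function(j) 0.5 * latent + rnorm(n))  # 10 "personality" items
Y <- sapply(1:8,  function(j) 0.5 * latent + rnorm(n))  # 8 "behavior" items

# Average absolute r and number of significant correlations (two-tailed)
summarize_cors <- function(X, Y, alpha = 0.05) {
  r <- cor(X, Y)                           # 10 x 8 correlation matrix
  df <- nrow(X) - 2
  t_stat <- r * sqrt(df / (1 - r^2))
  p <- 2 * pt(-abs(t_stat), df)
  c(abs_r = mean(abs(r)), n_sig = sum(p < alpha))
}

observed <- summarize_cors(X, Y)

sims <- 1000
null_dist <- t(replicate(sims, {
  shuffled <- sample(n)                    # reassign whole X profiles to Y profiles
  summarize_cors(X[shuffled, , drop = FALSE], Y)
}))

expected <- colMeans(null_dist)            # values expected by chance
p_values <- c(abs_r = mean(null_dist[, "abs_r"] >= observed["abs_r"]),
              n_sig = mean(null_dist[, "n_sig"] >= observed["n_sig"]))
print(rbind(observed, expected, p_values))
```

Because entire rows of X are shuffled together, the covariation among the X items (and among the Y items) is preserved in every pseudo dataset; only the link between the two constructs is broken.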

Conducting such an analysis using standard commercial software is either not possible or

would require an arduous amount of programming. The rand.test function in the ‘multicon’

package conducts such an analysis. In this example, we use the rand.test function to determine


whether personality (as measured by the CAQ) has an overall relationship with behavior (as

measured by the RBQ).

install.packages("multicon") # Only if this is the first time using this package

library(multicon) # Load the multicon package

data(caq) # Loading the CAQ dataset

data(beh.comp) # Loading the behavior dataset

rand.test(caq, beh.comp, sims=10000) # The analysis; could take a minute or so

It should be noted that because the sims argument is set to 10,000, which is ten times more than

the default value, this analysis may take 30 seconds or more. The output from this analysis is a

list with two objects: one for the average absolute correlation ($AbsR) and the other for the

number of statistically significant results ($Sig).4

# Output below

$AbsR

Average Absolute r

N 205.0000

Observed 0.0699

Exp. By Chance 0.0559

Standard Error 0.0011

p 0.0000

99.9% Upperbound p 0.0000

99.9% Lowerbound p 0.0000

95th % 0.0578

$Sig

Number Significant

N 205.0000

Observed 790.0000



p 0.0000



95th % 391.0000

There are 205 valid cases (listwise deletion is used). The observed average absolute r was .0699.

This can be compared to the value expected by chance alone which is .0559 with a standard error

of .0011. The resulting probability (p-value) of observing a value of .0699 under a null model of

no association between personality and behavior is <.0001.5 A similar list of findings is reported

for the number of statistically significant results showing 790 observed statistically significant

associations, a null expected value of 335, and a p-value of <.0001. Given the arbitrariness of

statistical significance levels, the results based on the average absolute r values are usually


preferred (Sherman & Funder, 2009). Overall, these results demonstrate the relationship between

the multivariate constructs of personality and behavior is much greater than one would expect by

chance alone. Or in other words, personality does really seem to be related to behavior.

How is a particular variable of interest related to a multivariate construct?

The prior example concerns the case where a researcher is interested in the relationship

between two multivariate constructs. However, sometimes researchers are interested in the

relationship between a single variable of interest and some other multivariate construct.

Notable examples include: who is at risk to abuse drugs (Block, Block, & Keyes, 1988; Shedler

& Block, 1990; Walton & Roberts, 2004), how is childhood personality related to adult political

orientation (Block & Block, 2006), what kinds of people are liked by others (Wortman & Wood,

2011), what kinds of people are likely to procrastinate (Watson, 2001), or how is a particular

personality trait associated with adult behavior (Nave, Sherman, Funder, Hampson, & Goldberg,

2010).

To use an example with real data, let us say that we are interested in the association

between trait extraversion and the aforementioned 67 behaviors from the RBQ. Using an

essential approach, we might first attempt to empirically reduce the 67 behaviors to a more

manageable set. Then, after identifying such a set we might correlate these behavioral

dimensions with extraversion.6

A comprehensive approach to the question of the relationship between extraversion and

behavior would be more interested in the correlations between extraversion and all 67 behaviors.

Computing these correlations (and the associated t-test) is easily done with just about any data

analytic software. Consistent with tradition using this approach (Block, 1961; Funder, 2013), an

abbreviated table (i.e., just those reaching some level of statistical significance) of these


correlations is shown in Table 2. Because such tables are common in work with multivariate

constructs, but sometimes arduous to put together, the q.cor function in the ‘multicon’ package

generates the information for such tables. Moreover, an object resulting from the q.cor function

can be quickly summarized into a tidy table by passing it to R’s generic print function.

data(beh.comp)# Loading the behavioral composites dataset

data(RSPdata) # Loading the RSP data set to get extraversion scores

ext.obj1 <- q.cor(RSPdata$sEXT, beh.comp, sex=RSPdata$ssex, fem=1, male=2, sims=1)

data(rbqv3.items) # Loading the item content for the RBQ

print(ext.obj1, rbqv3.items, "RBQ", short=TRUE) # Viewing the results easily

The q.cor function takes several arguments. The first argument is the variable of interest (in this

case extraversion). The second argument is the multivariate construct of interest (in this case the

RBQ scores). The third argument is a variable denoting the sex of the participants. Traditionally,

research using this approach examines the correlations for the full sample and separately by sex

(Block, 1961). However, any binary variable can be passed to this argument. The fourth and fifth

arguments tell the q.cor function the codes for the aforementioned binary variable for females

and males respectively. Finally, the sims argument tells the function how many randomly

simulated datasets to use for the randomization test (discussed shortly). For simplicity, we have

set this number to 1 at this point.7

This example also passes four arguments to the print function. The first is the object

created by q.cor just discussed. The second is a vector containing the item content for the

behavioral items. The third argument (“RBQ”) is a character string indicating a short abbreviation for the

list of behavioral items. In practice, neither of these latter two arguments needs to be included.

The print function will create generic item names, if these arguments are not specified (e.g.,

item1, item2). The fourth argument (short=TRUE) returns only an abbreviated list of the results

(i.e., the same as those in Table 2) by removing any items that do not have a p-value of less than


.10 for the combined sample or less than .05 for either sex. By default (short=FALSE), the full

list of items and their correlations is returned.

Executing the code just described generates a table similar to that shown in Table 2. It is

worth noting that the vector correlation between the full set of correlations for women and men

(i.e., the 67 correlations for women correlated with the 67 correlations for men) is returned (r =

.67), as an indicator of the consistency of the results across sex. The value is reported in the note

at the bottom of Table 2.
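
To make the idea of a vector correlation concrete, the following base R sketch computes one on simulated data. The names (trait, items, group) are hypothetical stand-ins, and the code is an illustration of the concept rather than the q.cor function's internals.

```r
# Sketch of a "vector correlation" across two groups, using simulated data.
# All names (trait, items, group) are hypothetical stand-ins.
set.seed(2)
n <- 200
group <- rep(c(1, 2), each = n / 2)          # e.g., 1 = women, 2 = men
trait <- rnorm(n)
loadings <- seq(-0.5, 0.5, length.out = 20)  # items differ in how they relate to the trait
items <- sapply(loadings, function(b) b * trait + rnorm(n))

# One correlation per item, computed separately within each group
r_group1 <- as.vector(cor(trait[group == 1], items[group == 1, ]))
r_group2 <- as.vector(cor(trait[group == 2], items[group == 2, ]))

# Correlate the two vectors of 20 correlations with each other
vector_r <- cor(r_group1, r_group2)
vector_r
```

A high value indicates that items correlating strongly with the variable of interest in one group tend to do so in the other group as well.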

The main results in Table 2 are based on 67 correlations and significance tests. As such,

we are bound to find both some large correlations and statistically significant results, even if the

data were generated randomly. What is needed is a statistic that can establish whether the pattern

of correlations shown in Table 2 is more than just noise. The aforementioned rand.test function

in the ‘multicon’ package does just that. In this case however, instead of randomly reassigning

entire personality profiles to behavioral profiles, only the extraversion scores for each subject are

randomly reassigned to behavioral profiles to create pseudo datasets. Otherwise, the procedures

(i.e., calculating and recording the average absolute r and the number significant on each pseudo

dataset to form a sampling distribution) are the same.


rand.test(RSPdata$sEXT, beh.comp, sims=10000)

# Output below

$AbsR

Average Absolute r

N 205.0000

Observed 0.0904



p 0.0016



95th % 0.0705

$Sig

Number Significant

N 205.0000

Observed 15.0000



p 0.0022



95th % 8.0000

In this example, the rand.test function takes three arguments. The first is a vector containing the

scores for the variable of interest (extraversion), the second is a data.frame or matrix containing

the multivariate construct of interest (behavior), and the third is the number of sims (which we

have changed from the usual default of 1,000 to 10,000 to increase precision).

As before, the results from this analysis are divided into two sections: one for the average

absolute correlation ($AbsR) and the other for the number of statistically significant results

($Sig). The observed average absolute r was .0904 between trait-level extraversion and the 67

behavioral composites. This can be compared to the value expected by chance which is .0559

with a standard error of .0081. The resulting probability (p-value) of observing a value of .0904

under a null model of no association between extraversion and behavior is .0016 and the 99.9%

confidence interval for this p-value is .0003 to .0029 (indicating our p-value is accurate to within

about .0026). A similar list of findings is reported for the number of statistically significant

results showing 15 observed statistically significant associations, a null expected value of 3.32,

and a p-value of .0022.
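
The reported 99.9% bounds on the p-value are consistent with a normal approximation to the binomial sampling error of a simulated proportion. The sketch below applies that approximation to the values given in the text; treating this as the source of the bounds is an assumption, not a statement about how the rand.test function computes them.

```r
# Normal-approximation bounds for a Monte Carlo p-value.
# Assumption: this is one way such bounds can be obtained; it is not
# taken from the multicon source code.
p_hat <- 0.0016                      # reported p-value for the average absolute r
sims  <- 10000                       # number of randomizations
z     <- qnorm(1 - 0.001 / 2)        # two-sided 99.9% critical value
se    <- sqrt(p_hat * (1 - p_hat) / sims)
bounds <- c(lower = p_hat - z * se, upper = p_hat + z * se)
round(bounds, 4)
```

Rounded to four decimals, these bounds are .0003 and .0029, matching the interval in the text. Because the standard error shrinks with the square root of the number of simulations, a larger sims value gives a more precise p-value.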


Of perhaps most interest, the q.cor function automatically calls the rand.test function so

that they need not be conducted separately:

ext.obj <- q.cor(RSPdata$sEXT, beh.comp, sex=RSPdata$ssex, fem=1, male=2, sims=1000)

print(ext.obj, rbqv3.items, "RBQ", short=TRUE)

The output from these lines of code is the same as from the q.cor output previously, but

this time the number of sims (which is passed to the rand.test function) has been set to 1000.

Perhaps the most important value to most researchers will be the p-value for the average absolute

association. These values have been added as the last row in Table 2, labeled “Average Absolute

r.” These p-values indicate that there are meaningful (i.e., non-random) relationships between

self-reported extraversion and the behavioral composites. As such, we can proceed with more

confidence and justification that what we are interpreting in Table 2 is more than just noise. As

Table 2 shows, those who scored high on extraversion were more likely to be talkative, have a

high energy level, and speak in a loud voice. Conversely, those who scored low on extraversion

were more likely to act reserved, with little expression, and to keep others at a distance.

How replicable are the associations between a variable of interest and a multivariate construct?

Although the randomization test in the previous analysis indicates that the average association

between extraversion and behavior is greater than we would expect by chance, it says nothing

about the replicability of the results displayed in Table 2. Specifically, how much should we

expect the overall observed pattern of associations abbreviated in Table 2 to replicate in new

samples? Estimating the replicability of a typical effect in psychology requires, in most cases,

conducting the study again on a new sample. Interestingly however, the expected replicability of

the pattern of results in Table 2 can be estimated without the need to conduct a new study (see

Sherman & Wood, 2014 for details). The note at the bottom of Table 2 indicates the estimated

replicabilities for the full patterns of correlations between extraversion and the behavioral


composites. These values should be interpreted as the expected correlation between the observed

full pattern of correlations between extraversion and behavior and the pattern of correlations one

would observe if one were to conduct the study again on a new sample of the same size drawn

from the same population (Sherman & Wood, 2014). In other words, these values represent the

replicability of the patterns of correlations expressed as an alpha reliability metric. The expected

replicabilities for the patterns of correlations are computed using the vector.alpha function in the

‘multicon’ package.

round(vector.alpha(RSPdata$sEXT, beh.comp),2) # Full sample, use round to get 2 digits

# Output below

Results

N 205.00

Average r 0.01

Alpha 0.67

Lower Limit 0.54

Upper Limit 0.77

The results indicate the sample size (listwise deletion is used), the average correlation amongst

the transposed cross-products of Z-scores (see Sherman & Wood, 2014 for technical details), the

estimated replicability (Alpha) and the confidence intervals (95% by default) for the replicability

estimate. In this case we see a replicability value of .67 indicating that we would expect the full

pattern of results, abbreviated in Table 2, to correlate approximately .67 [.54, .77] with the

results from a new sample of the same size (N=205) drawn from the same population. Such a

value also bolsters our confidence and justification that we can proceed with substantive

interpretations of the pattern of results observed in Table 2.
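
The logic behind this estimate can be sketched in base R. The code below is an illustration built from the description above (cross-products of Z-scores, transposed, then summarized as a reliability); it uses simulated data and a standardized (Spearman-Brown) form of alpha, which is an assumption that happens to reproduce the reported values. It is not the vector.alpha function's implementation, and all names are hypothetical.

```r
# Sketch of the replicability logic with simulated data (names hypothetical).
set.seed(3)
n <- 205
x <- rnorm(n)                                   # variable of interest
loadings <- seq(-0.3, 0.3, length.out = 67)
Y <- sapply(loadings, function(b) b * x + rnorm(n))   # 67 "behaviors"

# Cross-products of Z-scores: one row per subject, one column per item.
# Column sums divided by (n - 1) equal the ordinary correlations.
zx <- as.vector(scale(x))
ZY <- scale(Y)
cross <- zx * ZY                                # n x 67 matrix

# Transpose so that subjects play the role of "items" in a reliability analysis
subject_cors <- cor(t(cross))                   # n x n matrix
r_bar <- mean(subject_cors[upper.tri(subject_cors)])  # average inter-subject r
alpha <- n * r_bar / (1 + (n - 1) * r_bar)      # standardized alpha (assumed form)
c(average_r = r_bar, alpha = alpha)

# Plugging in the rounded values reported above
reported_alpha <- 205 * 0.01 / (1 + 204 * 0.01)
round(reported_alpha, 2)
```

With the rounded average r of .01 and N = 205, the formula gives approximately .67, in line with the output shown above. Even a very small average correlation among subjects' cross-product profiles can therefore yield a substantial replicability estimate when the sample is large.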

How well do judges agree about a target?

The examples thus far have concerned questions of how a multivariate construct is

related to another construct of interest (e.g., another multivariate construct or single variable). At

other times researchers are interested in questions of agreement, similarity, or consistency, in

multivariate constructs rated by different judges or measured across time. Indeed, perhaps one of


the most foundational questions of personality psychology pertains to the agreement between

judges. Agreement among independent judgments about what targets are like provides strong

evidence for the existence of some real attributes belonging to the targets (Funder & Dobroth,

1987; Norman & Goldberg, 1966). Because consensus among independent judgments of targets’

personalities is so well-established (Albright, Kenny, & Malloy, 1988; Albright, Malloy, Dong,

Kenny, Fang, Winquist, & Yu, 1997; Kenny, Albright, Malloy, & Kashy, 1994) personality

researchers are rarely interested in only estimating such effects today. More often personality

scientists are interested in consensus as an indicator of the reliability of a set of informant reports

about a target (Vazire, 2006). For example, many studies gather personality reports from

multiple acquaintances of a target and average these ratings to form informant composites (e.g.,

Back, Stopfer, Vazire, Gaddis, Schmukle, Egloff, & Gosling, 2010; Carlson, Vazire, & Furr,

2011; Colvin & Funder, 1991; Funder, Kolar, & Blackman, 1995; Oltmanns & Turkheimer,

2009; Vazire & Mehl, 2008). These informant composites are then used to predict some other

outcome of interest. Because the acquaintances are typically not distinguishable judges (i.e., each

acquaintance rates only one target and there are no psychologically important differences

between acquaintances), the appropriate reliability statistic for such composites comes from the

intraclass correlation (ICC: Shrout & Fleiss, 1979).

An item-level ICC is easy to compute using popular commercial software. However,

when working with a multivariate construct such as personality, researchers may be interested in

computing many (e.g., 100, one for each CAQ item) ICCs at a given time, something that can be

rather burdensome in popular commercial software. The item.ICC function in the ‘multicon’

package computes such ICCs easily.


data(acq1) # A data.frame containing 100 personality judgments from the first acquaintance

data(acq2) # A data.frame containing 100 personality judgments from the second acquaintance

item.ICC(acq1, acq2)

The item.ICC function takes at least two arguments, which must be data.frames of the same size

with the columns containing the corresponding items (i.e., the first item is in the first column in

both data.frames). In the case of multiple raters or occasions, one can simply add additional

data.frames of the same size.

The results for this example provide all six possible ICCs (Shrout & Fleiss, 1979) for the

pairs of acquaintances across all 100 personality characteristics. By applying the describe

function from the ‘psych’ package (Revelle, 2014; automatically loaded with ‘multicon’) to these

results, we can obtain a summary of the results across all 100 personality characteristics.

describe(item.ICC(acq1, acq2))

# Output below

var n mean sd median trimmed mad min max range skew kurtosis se

ICC1 1 100 0.11 0.09 0.10 0.11 0.08 -0.14 0.36 0.50 0.14 0.11 0.01

ICC1k 2 100 0.18 0.15 0.17 0.19 0.15 -0.33 0.53 0.86 -0.41 0.60 0.01

ICC2 3 100 0.11 0.09 0.10 0.11 0.08 -0.13 0.36 0.49 0.19 0.11 0.01

ICC2k 4 100 0.18 0.15 0.18 0.19 0.15 -0.30 0.53 0.83 -0.34 0.48 0.01

ICC3 5 100 0.11 0.09 0.10 0.11 0.09 -0.13 0.37 0.50 0.20 0.15 0.01

ICC3k 6 100 0.18 0.15 0.18 0.19 0.15 -0.31 0.54 0.85 -0.34 0.53 0.01

In this example, the average reliability (across all 100 items rated) for a single rater was .11 (SD

= .09) and the average reliability of an item composite was .18 (SD = .15). Moreover, the

composite reliabilities (ICC1,k) ranged from a low of -.33 to a high of .53, indicating

wide variability across the items in terms of agreement.
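
For readers who want to see what an item-level ICC involves, the following base R sketch computes the one-way ICCs of Shrout and Fleiss (1979), ICC(1,1) and ICC(1,k), for each item in a simulated two-rater dataset. The data and all names (rater1, rater2, icc_oneway) are hypothetical, and the code is not the item.ICC function itself.

```r
# Sketch: one-way ICCs for every item, with simulated two-rater data.
# Names and data are hypothetical.
set.seed(4)
n_targets <- 150
n_items <- 10
true_scores <- matrix(rnorm(n_targets * n_items), n_targets, n_items)
rater1 <- 0.4 * true_scores + matrix(rnorm(n_targets * n_items), n_targets, n_items)
rater2 <- 0.4 * true_scores + matrix(rnorm(n_targets * n_items), n_targets, n_items)

icc_oneway <- function(r1, r2) {
  k <- 2                                         # raters per target
  ratings <- cbind(r1, r2)
  target_means <- rowMeans(ratings)
  grand_mean <- mean(ratings)
  # Between-target and within-target mean squares from a one-way ANOVA
  msb <- k * sum((target_means - grand_mean)^2) / (length(r1) - 1)
  msw <- sum((ratings - target_means)^2) / (length(r1) * (k - 1))
  c(ICC1  = (msb - msw) / (msb + (k - 1) * msw), # reliability of a single rater
    ICC1k = (msb - msw) / msb)                   # reliability of the k-rater composite
}

# One row of ICCs per item
iccs <- t(sapply(seq_len(n_items), function(j) icc_oneway(rater1[, j], rater2[, j])))
round(colMeans(iccs), 2)
```

In the one-way case the composite ICC is the Spearman-Brown step-up of the single-rater ICC, which is why the composite values are always the larger of the two whenever the single-rater ICC is positive.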

Functions such as item.ICC are perfect for questions about item-level agreement,

similarity, or consistency. However, sometimes researchers may be interested in profile-level

agreement instead, a particular strength of the comprehensive approach. The Profile.ICC

function in the ‘multicon’ package makes such computations effortless. Using the

aforementioned acquaintance ratings we can do the following:


Profile.ICC(acq1,acq2) # The profile-level ICCs between the two judges

describe(Profile.ICC(acq1,acq2)) # Descriptives for the agreements

# Output below

var n mean sd median trimmed mad min max range skew kurtosis se

ICC1 1 205 0.35 0.21 0.37 0.36 0.25 -0.18 0.82 0.99 -0.26 -0.70 0.01

ICC1k 2 205 0.48 0.27 0.54 0.51 0.25 -0.43 0.90 1.33 -0.98 0.70 0.02

ICC2 3 205 0.35 0.22 0.37 0.36 0.25 -0.19 0.82 1.00 -0.26 -0.69 0.02

ICC2k 4 205 0.48 0.27 0.54 0.51 0.25 -0.45 0.90 1.35 -0.99 0.74 0.02

ICC3 5 205 0.35 0.21 0.37 0.36 0.25 -0.18 0.81 1.00 -0.25 -0.70 0.01

ICC3k 6 205 0.48 0.27 0.54 0.51 0.25 -0.45 0.90 1.35 -0.98 0.71 0.02

Like the item.ICC function, the Profile.ICC function takes at least two arguments that must be

data.frames of the same size, but this time the analysis is done on the rows rather than the

columns. In the case of multiple raters or more occasions, one can simply add additional

data.frames of the same size.

In this example the average profile-level reliability (across all 205 acquaintance pairs) for

a single rater was .35 (SD = .21) and the average reliability of a composite profile was .48 (SD =

.27). Such information may be valuable, and worth reporting, when creating composite profiles

from two or more raters of a target. In addition, such values also provide individual “consensus

scores” for each target which may be used to understand which targets are more judgable than

others (Colvin, 1993).

How accurate are judgments about a target?

When judges (or time periods) are distinguishable, the usual Pearson’s correlation is

often the preferred metric for indexing agreement, similarity, or consistency. One particular

index of similarity of interest to personality scientists is accuracy of judgments. The question of

accuracy in personality judgments has a long history and this is hardly the place to review it (see

Funder, 1999; Jussim, 2012; Kenny, 1994 for excellent reviews). Instead, we simply note that

accuracy in personality judgment is often quantified via agreement (e.g., self-other) between

judges at either the item-level (e.g., Funder & Colvin, 1988; Küfner, Back, Nestler, & Egloff,


2010; Watson, 1989) or the profile-level (e.g., Biesanz & Human, 2010; Human & Biesanz,

2011, 2012; Letzring, 2008; Letzring, Wells, & Funder, 2006).

Assessing agreement, similarity, or consistency at the item-level using basic R software is

straightforward. In the next example, self-ratings on the CAQ are correlated with acquaintance

composite ratings on the CAQ creating a 100 × 100 correlation matrix. Because the items are in

the same order (i.e., corresponding columns in the two datasets), the diagonal of this matrix

contains the item-level agreements for each of the 100 CAQ items. If we are interested in the

descriptive statistics (e.g., means, medians, SDs) for these 100 correlations, the describe.r

function in the ‘multicon’ package does the appropriate calculations applying r-to-Z

transformations and back when necessary. Finally, we may be interested in estimating

confidence intervals around the average item-level agreement. We can use R’s built-in t.test

function to calculate these.

data(acq.comp) # Acquaintance composites of personality on the 100-item CAQ

data(caq) # Self-reported personality on the 100-item CAQ

diag(cor(acq.comp, caq)) # The agreements on the 100-items

describe.r(diag(cor(acq.comp, caq))) # Describing the agreements

t.test(fisherz(diag(cor(acq.comp, caq)))) # t-test against zero

R’s built-in cor function takes two arguments: the data.frames containing the personality ratings

of interest from acquaintances and from the self. Applying the diag function to the resulting

100×100 correlation matrix returns the correlations of interest (i.e., one accuracy correlation per

item). The describe.r function summarizes these 100 correlations appropriately applying r-to-Z

transformations (and back). Finally, R’s built-in t.test function computes a 95% confidence

interval around this average value. In this example we see that the average item-level agreement

is r = .17 (SD = .09) with a minimum of -.08 and a maximum of .41. The 95% confidence

interval around this average item-level agreement is [.15, .18], suggesting that the average

item-level self-other agreement of .17 is precisely estimated and greater than zero.
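To make the r-to-Z logic concrete, the following is a minimal base R sketch of averaging item-level agreements in Z units and back-transforming the result. It uses simulated data rather than the CAQ data, all object names (self.sim, acq.sim, item.r, and so on) are hypothetical, and it is not intended to reproduce the internals of describe.r.

```r
# Minimal sketch: summarize item-level agreements via Fisher's r-to-Z.
# Simulated data; all object names are hypothetical.
set.seed(1)
n <- 205   # persons
k <- 100   # items
true.score <- matrix(rnorm(n * k), n, k)
self.sim <- true.score + matrix(rnorm(n * k, sd = 2), n, k)
acq.sim  <- true.score + matrix(rnorm(n * k, sd = 2), n, k)

item.r <- diag(cor(acq.sim, self.sim))  # one agreement correlation per item
item.z <- atanh(item.r)                 # Fisher r-to-Z transformation
mean.r <- tanh(mean(item.z))            # average in Z units, back-transformed to r

# 95% confidence interval computed in Z units, then back-transformed to r
ci.r <- tanh(as.numeric(t.test(item.z)$conf.int))

mean.r
ci.r
```

Note that in this sketch the confidence limits are back-transformed to the r metric, which is a choice made here for readability.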


Assessing profile-level agreement (or accuracy) using the basic R statistical software, or

any commercially available software package, is somewhat less straightforward. At the

minimum, it typically involves first transposing one’s dataset and then computing correlations on

the “new” columns (formerly the rows). However, the Profile.r function in the ‘multicon’

package easily computes profile correlations without any extra steps. Using the same

acquaintance composites and self-report ratings on the CAQ, the following code can be used to

quantify profile-level agreement. Again, describe.r gets the appropriate descriptive statistics for

the agreement coefficients:

Profile.r(acq.comp, caq) # The profile accuracy scores

describe.r(Profile.r(acq.comp, caq)) # Describing the accuracy scores

# Output below

var n miss mean sd median trimmed mad min max range skew kurtosis se

1 1 205 0 0.47 0.22 0.47 0.47 0.23 -0.04 0.81 0.82 -0.02 -0.35 0.02

In this example the average profile-level agreement between acquaintance CAQ composites and

self-reports is .47 (SD = .22) with a minimum of -.04 and a maximum of .81. One complication

with using profile-level agreement as an indicator of accuracy, however, is that such correlations

are confounded by normativeness (Cronbach, 1955; Furr, 2008). In other words, a positive

association between two profiles may not actually reflect agreement or knowledge of another

particular person, but simply knowledge of what people are like in general (i.e., describing the

average person). Thus, a t-test of the average profile agreement against zero would not

appropriately test the hypothesis that people are accurate in knowing each other above chance

levels.
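For readers curious about what the transposition approach involves, the following base R sketch computes profile correlations on simulated data in two equivalent ways. The object names are hypothetical, and the code illustrates the general technique rather than the implementation of Profile.r.

```r
# Minimal sketch: profile-level correlations in base R.
# Simulated data; rows are targets, columns are items.
set.seed(2)
n <- 6
k <- 20
x.sim <- matrix(rnorm(n * k), n, k)          # e.g., acquaintance profiles
y.sim <- x.sim + matrix(rnorm(n * k), n, k)  # e.g., self profiles for the same targets

# Transpose so that profiles become columns, correlate,
# and keep the matched pairs found on the diagonal
prof.r <- diag(cor(t(x.sim), t(y.sim)))

# Equivalent row-by-row computation
prof.r2 <- sapply(seq_len(n), function(i) cor(x.sim[i, ], y.sim[i, ]))

all.equal(prof.r, prof.r2)
```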

Furr (2008) provided two different routes to resolving this issue. The first is to create an

empirical estimate of the true baseline level profile agreement. This can be done by randomizing

the profile pairs so that they are matched with a different profile (e.g., acquaintance ratings for

subject 1 are paired with self-ratings from subject 2, etc.), computing the average agreement


amongst these randomly paired profiles and considering it the baseline (Letzring et al., 2006).

More ideally, one could calculate the average profile agreement between all non-paired profiles

and test the observed average profile agreement against this number. The second solution offered

by Furr (2008) is to first remove the normative (i.e., average) profiles from both sets of profiles

and then to calculate profile agreement as one normally would on these “distinctive” profiles.

Such agreements, often referred to as “distinctive” profile agreements, can then be appropriately

tested against a baseline of zero. The Profile.r function in the ‘multicon’ package has an option

for easily conducting both of these analyses.8

prof.out <- Profile.r(acq.comp, caq, distinct=TRUE)

str(prof.out)

prof.out$Agreement # The overall and distinctive profile accuracies

round(describe.r(prof.out$Agreement),2) # Their descriptives

round(prof.out$Tests,3) # And their appropriate test statistics

By setting the distinct option in the Profile.r function to TRUE we get an object

containing (a) the mean (normative) acquaintance composite profile, (b) the mean (normative)

self-reported profile, (c) the correlation between the two normative profiles, (d) both the overall

and distinctive profile agreements for each subject, and (e) tests of statistical significance for

both the average overall and average distinctive profile agreements. Once again, by applying

describe.r to the agreements we get their descriptive statistics.

# Output below

            var   n miss mean   sd median trimmed  mad   min  max range  skew kurtosis   se

Overall       1 205    0 0.47 0.22   0.47    0.47 0.23 -0.04 0.81  0.82 -0.02    -0.35 0.02

Distinctive   2 205    0 0.17 0.15   0.16    0.16 0.15 -0.19 0.55  0.67  0.35     0.07 0.01

Overall Distinctive

N 205.000 205.000

Mean 0.466 0.170

baseline 0.360 0.000

t 8.052 16.307

p-value 0.000 0.000


In this example the average overall agreement is r = .47 (SD = .22), which is the same as the

average profile agreement indicated previously. In addition, the average distinctive profile

agreement is r = .17 (SD = .15). Testing these against their appropriate baselines (.36 and .00

respectively) we see that both results are unlikely to have occurred under the null hypothesis of

no association between self-other ratings (ps < .001).
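The two routes described by Furr (2008) can be sketched in base R as follows. This is an illustration on simulated data with hypothetical object names; for simplicity it averages raw correlations rather than applying r-to-Z transformations, and it should not be read as the code used inside Profile.r.

```r
# Minimal sketch: empirical baseline and distinctive profile agreement.
# Simulated data; rows are targets, columns are items.
set.seed(3)
n <- 50
k <- 40
norm.prof  <- rnorm(k)                     # shared "average person" profile
unique.sim <- matrix(rnorm(n * k), n, k)   # target-specific signal
true.sim   <- sweep(unique.sim, 2, norm.prof, "+")
x.sim <- true.sim + matrix(rnorm(n * k), n, k)
y.sim <- true.sim + matrix(rnorm(n * k), n, k)

# Route 1: baseline from all non-paired profiles
all.r    <- cor(t(x.sim), t(y.sim))        # every x profile with every y profile
overall  <- diag(all.r)                    # matched pairs
baseline <- mean(all.r[row(all.r) != col(all.r)])

# Route 2: remove each set's normative (item mean) profile, then correlate
x.dist   <- sweep(x.sim, 2, colMeans(x.sim))
y.dist   <- sweep(y.sim, 2, colMeans(y.sim))
distinct <- diag(cor(t(x.dist), t(y.dist)))

c(overall = mean(overall), baseline = baseline, distinctive = mean(distinct))
```

In this simulation the baseline is well above zero because every profile contains the shared normative component, which is the confound the distinctive scores are meant to remove.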

Sometimes researchers are not interested in just assessing the average level of profile

agreement and testing it against its baseline. In fact, sometimes predicting profile agreements

(e.g., accuracy scores) is the question of interest (e.g., Who is easy to judge?: Colvin, 1993; Who

is a good judge?: Letzring, 2008; Who is similar to whom?: Wortman, Wood, Furr, Fanciullo, &

Harms, 2014). In such cases where profile agreements are later correlated with some other

variable(s) of interest, it may be of importance to know the reliability of the profile agreements

themselves (Wood & Brumbaugh, 2009). One way of assessing the reliability (or replicability;

see Sherman & Wood, 2014) of a pattern of profile agreements relies on the fact that correlations

are simply averages of cross-products of standardized scores. Thus, much in the same way as one

computes internal consistency for composites from a rating scale, one may apply the same logic

to cross-products of standardized scores and compute alpha on these values (see Sherman &

Wood, 2014 for details). The R function Profile.r.rep in the ‘multicon’ package computes the

reliabilities (or replicabilities) for both overall and distinctive patterns of profile agreements.

Profile.r.rep(acq.comp, caq)

# Output below

Replicability Lower Limit Upper Limit

Overall 0.7191478 0.6358432 0.7916674

Distinctive 0.4642754 0.3053722 0.6026062

In this example, the replicability for the pattern of self-acquaintance overall agreements is

.72 [95% CI = .64, .79] while for distinctive agreements it is .46 [.30, .60]. These numbers

indicate that if one were to randomly draw another set of 100 items from the population of items


from which the 100 CAQ items were generated, and have these same participants rate

themselves again, we would expect the patterns of profile agreements to correlate with each

other at .72 and .46 for overall and distinctive profile agreements respectively. Such values have

implications for researchers using profile similarity scores in subsequent analyses. For example,

some researchers may desire to correlate similarity scores with length of acquaintanceship to

ostensibly test the hypothesis that people who know each other longer judge each other more

accurately. With replicability values of .72 and .46 respectively, we know that these are the

upper bounds on the possible association between length of acquaintance and self-other

agreement (much in the same way that reliability is the upper bound on validity).
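The logic of computing alpha on cross-products of standardized scores can be sketched in base R as follows. This is an assumption-laden illustration on simulated data with hypothetical object names: it computes only a point estimate (no confidence limits) and is not the implementation used by Profile.r.rep.

```r
# Minimal sketch: alpha computed on cross-products of standardized scores.
# Simulated data; rows are targets, columns are items.
set.seed(4)
n <- 100
k <- 30
signal   <- matrix(rnorm(n * k), n, k)
noise.sd <- runif(n, 0.5, 3)   # targets differ in noise, so agreement varies
x.sim <- signal + matrix(rnorm(n * k), n, k) * noise.sd
y.sim <- signal + matrix(rnorm(n * k), n, k) * noise.sd

# Standardize each profile (row) with the population SD, so that the mean
# cross-product of a pair of rows equals their Pearson correlation
z.row <- function(m) {
  centered <- m - rowMeans(m)
  centered / sqrt(rowMeans(centered^2))
}

cp     <- z.row(x.sim) * z.row(y.sim)   # n x k matrix of cross-products
prof.r <- rowMeans(cp)                  # row-wise profile correlations

# Cronbach's alpha, treating the k cross-product columns as "items"
alpha.cp <- (k / (k - 1)) * (1 - sum(apply(cp, 2, var)) / var(rowSums(cp)))
alpha.cp
```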

Although the correlation is a popular choice for quantifying profile agreement, similarity,

or consistency, researchers may alternatively be interested in a regression approach providing

both an intercept and slope between pairs of profiles. The Profile.reg function in the ‘multicon’

package makes assessing profile agreement via regression straightforward and includes options

for centering the profiles (“group” [default] – within-profile centering, “grand” – between-profile

centering, and “none” – no centering) and standardizing (FALSE [default] – no standardizing

and TRUE – standardized with the level determined by the center argument).

Profile.reg(acq.comp, caq) # Intercepts and slopes; defaults to group-mean (within-S) centering

Profile.reg(acq.comp, caq, std=TRUE) # Standardized

Profile.reg(acq.comp, caq, std=TRUE, center="grand") # Grand-mean standardizing instead

The Profile.reg function takes two arguments. The first argument is a data.frame containing the

predictor profiles (i.e., X). The second argument is a data.frame containing the predicted profiles

(i.e., Y). As can be seen by running these examples, an intercept and a slope are returned for each

pair of profiles, with again various options for how variables should be centered and/or

standardized.
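As a rough illustration of the regression approach, the base R sketch below fits one regression per pair of profiles on simulated data. The object names are hypothetical, and centering only the predictor profile is an assumption made for this sketch; it does not claim to match the centering or standardizing rules implemented in Profile.reg.

```r
# Minimal sketch: one regression per pair of profiles.
# Simulated data; rows are targets, columns are items.
set.seed(5)
n <- 5
k <- 25
x.sim <- matrix(rnorm(n * k, mean = 5), n, k)
y.sim <- 1 + 0.5 * x.sim + matrix(rnorm(n * k, sd = 0.5), n, k)

reg.out <- t(sapply(seq_len(n), function(i) {
  x.c <- x.sim[i, ] - mean(x.sim[i, ])  # within-profile centering of the predictor
  coef(lm(y.sim[i, ] ~ x.c))            # intercept and slope for this pair
}))
colnames(reg.out) <- c("intercept", "slope")
reg.out
```

Because the predictor is centered within each profile, each intercept in this sketch equals the mean of the corresponding predicted profile.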


How do profiles differ?

Although researchers are often interested in agreement, similarity, or consistency

amongst pairs of profiles, on some occasions they may be interested in how profiles differ, or

how one profile is distinctive from another. For example, in one recent study third-party ratings

of a situation were statistically removed from participant-ratings of the same situation in order to

retain “distinctive” self-ratings, or construals: that is, how individuals saw the situation

differently from observers, which can be used as a measure of bias in situation perception.

These construals were subsequently correlated with personality (Sherman, Nave, & Funder,

2013). Conducting such an analysis using standard commercial software can be a cumbersome

task involving multiple transpositions of the raw data, storing of residuals, and recombining data

sets. The Profile.resid function in the ‘multicon’ package makes obtaining distinctive profiles

(i.e., residuals) from pairs of profiles easy.

resid.out <- Profile.resid(acq.comp, caq)

head(resid.out)

The Profile.resid function takes two arguments. The first is a data.frame containing the

predicting profiles (i.e., X) and the second is a data.frame containing the predicted profiles (i.e.,

Y).

In this example, self-reported CAQ profiles are predicted from acquaintance composite

CAQ profiles of the target. The resulting residuals for each pair of profiles are retained. Because

each CAQ profile contains 100 items and there are 205 subjects in this data set, the resulting

object (resid.out) is a 205×100 data.frame containing the distinctive self-reported CAQ profile

scores (residuals) after controlling for acquaintance composite profile scores. Intuitively, one

might also think that a difference score approach, wherein acquaintance CAQ composite scores

are simply subtracted from self-reported CAQ scores, would yield the same results. While these


two approaches are related, they are not identical. Mathematically, if the correlation between the

self-reported CAQ profile and the acquaintance composite profile were 1.00, the difference score

method and the regression based method provided by Profile.resid would return identical results.

On the other hand, if the correlation were .00, nothing would be removed from the self-reported

CAQ profile using Profile.resid. This is not true of the difference score method. Therefore, the

size of the relationship between the two profiles is an important aspect of how differences are

calculated when using the regression based approach provided by Profile.resid. This aspect is not

captured by the difference score approach, which implicitly assumes all pairs of profiles are

equally correlated (Sherman et al., 2012).
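The contrast between the two methods can be illustrated with a small base R sketch. The data are simulated, the object names are hypothetical, and the perfectly correlated case assumes the two profiles also share the same mean and scale.

```r
# Minimal sketch: regression residuals versus difference scores
# for a single pair of profiles. Simulated data.
set.seed(6)
k <- 100
x           <- rnorm(k)   # e.g., one acquaintance composite profile
y.related   <- x          # a self profile correlating 1.00 with x (same mean and scale)
y.unrelated <- rnorm(k)   # a self profile roughly uncorrelated with x

resid.related   <- resid(lm(y.related ~ x))
resid.unrelated <- resid(lm(y.unrelated ~ x))

# r = 1.00: both methods leave essentially nothing behind
max(abs(resid.related))
max(abs(y.related - x))

# r near .00: the residuals remain almost perfectly correlated with the
# original profile, whereas the difference score still subtracts all of x
r.resid <- cor(resid.unrelated, y.unrelated)
r.diff  <- cor(y.unrelated - x, y.unrelated)
c(residual = r.resid, difference = r.diff)
```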

In the case where a researcher is interested in examining distinctiveness at the item-level

instead of at the profile level (i.e., statistically removing the effect of one item on another for

each pair of items rather than for pairs of profiles), the ‘multicon’ package also includes the

function item.resid.

head(item.resid(acq.comp, caq))

The output format is the same as with Profile.resid except that the residuals come from item-

level regressions rather than profile-level.
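The item-level variant can be sketched in base R by running one regression per item across persons. The data below are simulated and the object names are hypothetical; the sketch shows the general idea and is not the source code of item.resid.

```r
# Minimal sketch: item-level residuals (one regression per item, across persons).
# Simulated data; rows are persons, columns are items.
set.seed(7)
n <- 40
k <- 5
x.sim <- matrix(rnorm(n * k), n, k)
y.sim <- 0.6 * x.sim + matrix(rnorm(n * k), n, k)

# For each item (column), regress Y on X across persons and keep the residuals
item.res <- sapply(seq_len(k), function(j) resid(lm(y.sim[, j] ~ x.sim[, j])))

dim(item.res)

# By construction, each item's residuals are uncorrelated with its predictor item
resid.x.r <- diag(cor(item.res, x.sim))
round(resid.x.r, 10)
```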

Using multivariate constructs to test theoretical predictions

Because using a comprehensive approach to dealing with multivariate constructs often

involves many analyses, one could criticize this approach as being

entirely exploratory and atheoretical. In fact, however, by employing template matching (Bem &

Funder, 1978), research using a comprehensive approach is often theoretically oriented (e.g.,

Sherman, Nave, & Funder, 2012; Sherman, Figueredo, & Funder, 2013). Template matching

entails correlating (or matching) an observed profile of measured characteristics with a

theoretically derived profile of those characteristics. The resulting template match scores, which


indicate the degree to which a particular profile corresponds with the theoretical template, can be

used in subsequent analyses. In one recent study, participant self-reported CAQ profiles were

correlated with a theoretically derived template for the prototypical slow-life history individual

(Sherman et al., 2013). Like many of the other analyses described in this article, standard

commercial software does not provide an easy and efficient method for getting template match

scores. However, the temp.match function in the ‘multicon’ package easily computes template

match scores.

data(caq) # Self-reported personality on the 100-item CAQ

data(opt.temp)

opt.temp # The optimally adjusted person

temp.match(opt.temp, caq) # Overall template match scores

describe.r(temp.match(opt.temp, caq))

The temp.match function takes two arguments. The first is the template itself, which is a vector

containing a score for each item in the template. The second is a data.frame containing the

profiles of scores to be matched to the template.
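Conceptually, an overall template match score is the correlation between one person's profile and the template. The base R sketch below illustrates this on simulated data; the template and profiles are hypothetical stand-ins, and the code is not the implementation of temp.match.

```r
# Minimal sketch: template match scores as profile-template correlations.
# Simulated data; rows are persons, columns are items.
set.seed(8)
n <- 10
k <- 30
template <- rnorm(k)                    # hypothetical template (one score per item)
profiles <- matrix(rnorm(n * k), n, k)
profiles[1, ] <- template + rnorm(k, sd = 0.3)  # make person 1 resemble the template

# Correlate each person's profile with the template
match.r <- apply(profiles, 1, function(p) cor(p, template))
round(match.r, 2)
```

In this simulation the first person, whose profile was built from the template, receives the highest match score.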

In this example self-reported CAQ profiles for 205 participants are correlated with the

optimally adjusted person template for the CAQ (Block, 1961). The now familiar describe.r

function from the ‘multicon’ package returns the descriptive statistics for these template match

scores. Like other profile-level analyses though, these template match scores include both

normative and distinctive components (Furr, 2008). By setting the distinct option in the

temp.match function to TRUE however, both overall and distinctive (controlling for

normativeness) template match scores are returned.

temp.match(opt.temp, caq, distinct=TRUE)

describe.r(temp.match(opt.temp, caq, distinct=TRUE)$Matches)

Interestingly, the results of this analysis reveal that while the average overall template match

score with the optimally adjusted template is r = .50, when normativeness (the average

personality profile) is removed, the average distinctive template match score is r = .00. Such a


result is in line with a flood of recent empirical evidence indicating that psychological

adjustment is highly associated with normativeness (Baird, Le, & Lucas, 2006; Fleeson & Wilt,

2010; Human, Biesanz, Finseth, Pierce, & Le, 2014; Klimstra, Hale, Raaijmakers, & Meeus,

2011; Klimstra, Luyckx, Hale, Goossens, & Meeus, 2010; Letzring, 2008; Sherman, et al., 2012;

Wood, Gosling, & Potter, 2007). Because template match scores are often correlated with other

measures of interest researchers may also be interested in knowing the reliability or replicability

of the scores themselves. For example, in one study Sherman and colleagues (2013) computed

template match scores for the prototypical slow-life history person and then correlated these

scores with behavior. The temp.match.rep function in the ‘multicon’ package computes such

replicabilities, with confidence intervals, for both overall and distinctive template match scores

following the logic outlined by Sherman and Wood (2014).

temp.match.rep(opt.temp, caq)

# Output below

Replicability Lower Limit Upper Limit

Overall 0.8073255 0.7501756 0.85707647

Distinctive -0.4568784 -0.8890083 -0.08069408

The arguments given to temp.match.rep are identical to those passed to temp.match (i.e., the

template followed by the data.frame to be matched to the template). The results of this analysis

indicate that while overall template match scores are quite replicable/reliable (.81 [.75, .86]), the

distinctive template match scores are not (-.46 [-.89, -.08]). Indeed, the replicability/reliability is

so low that the distinctive template match scores reflect little more than random noise. Thus, the

functions available in the ‘multicon’ package can illuminate the reliability of profile similarities

using statistics that have not been available until recently. In some cases, as in this example,

these statistics can be useful by indicating that one should not proceed with interpreting the

correlates of a particular set of profile similarity scores at all.


Discussion

This article introduced the ‘multicon’ package for the R statistical software and

highlighted some of its functions most relevant for researchers interested in examining

multivariate constructs from a comprehensive approach. As a reminder, Table 1 provides a

summary of the kinds of questions researchers using a comprehensive approach may be

interested in and the corresponding analytic function available in the ‘multicon’ package. It is

worth noting that the list of functions shown in Table 1, and described herein, is not an

exhaustive list of the functions available in the ‘multicon’ package. For instance, the ‘multicon’

package also includes functions for ipsatizing (within-person standardizing) data (ipsatize),

calculating summary statistics for multi-trait multi-method matrices (MTMM), and several group

mean plotting functions following the recommendations made by Cumming (2012; 2014)

including graphs with error (confidence) bars (egraph) and cat’s eye plots (catseye). Further,

some of the functions shown here include a number of additional arguments with options not

discussed. We anticipate continual refinements to the ‘multicon’ package in the coming years to

make comprehensive analysis of multivariate constructs more flexible and widely available. In

particular, we encourage practitioners to provide their suggestions on how current functions

might be improved or for new functions to be added to the package.

Conclusion

The comprehensive approach to analyzing personality, behavior, emotions, and situations

offers many opportunities for researchers hoping to find deeper relationships amongst these

inherently multivariate constructs. But there are a number of difficulties for researchers wanting to

employ such an approach. A primary difficulty with the comprehensive approach is that standard

commercial software either does not offer such analytic tools or makes such analyses


cumbersome and confusing. The ‘multicon’ package offers numerous functions containing many

of the recently developed solutions to these difficulties.


References

Albright, L., Kenny, D. A., & Malloy, T. E. (1988). Consensus in personality judgments at zero

acquaintance. Journal of Personality and Social Psychology, 55(3), 387-395.

Albright, L., Malloy, T. E., Dong, Q., Kenny, D. A., Fang, X., Winquist, L., & Yu, D. (1997).

Cross-cultural consensus in personality judgments. Journal of Personality and Social

Psychology, 72, 558-569.

Allport, G. W., & Odbert, H. S. (1936). Trait-names: A psycho-lexical study. Psychological

Monographs, 47(211).

Back, M. D., Stopfer, J. M., Vazire, S., Gaddis, S., Schmukle, S. C., Egloff, B., & Gosling, S. D.

(2010). Facebook profiles reflect actual personality, not self-idealization. Psychological

Science, 21(3), 372-374.

Block, J. (1960). On the number of significant findings to be expected by chance.

Psychometrika, 25(4), 369-380.

Block, J. (1961). The Q-sort Method in Personality Assessment and Psychiatric Research.

Springfield, IL: Charles C. Thomas.

Block, J., & Block, J. H. (2006). Nursery school personality and political orientation two decades

later. Journal of Research in Personality, 40, 734-749.

Block, J., Block, J. H., & Keyes, S. (1988). Longitudinally foretelling drug usage in adolescence:

Early childhood personality and environmental precursors. Child Development, 59, 336-355.

Biesanz, J. C. (2010). The social accuracy model of interpersonal perception: Assessing

individual differences in perceptive and expressive accuracy. Multivariate Behavioral

Research, 45, 853-885.


Biesanz, J. C., & Human, L. J. (2010). The cost of forming more accurate impressions:

Accuracy-motivated perceivers see the personality of others more distinctively but less

normatively than perceivers without an explicit goal. Psychological Science, 21, 589-594.

Brown, N. A., & Sherman, R. A. (in press). Predicting interpersonal behavior using the Inventory

for Individual Differences in the Lexicon (IIDL). Journal of Research in Personality.

Carlson, E. N., Vazire, S., & Furr, R. M. (2011). Meta-insight: Do people really know how

others see them? Journal of Personality and Social Psychology, 101(4), 831-846.

Colvin, C. R. (1993). Judgable people: Personality, behavior, and competing explanations.

Journal of Personality and Social Psychology, 64, 861-873.

Colvin, C. R., & Funder, D. C. (1991). Predicting personality and behavior: A boundary on the

acquaintanceship effect. Journal of Personality and Social Psychology, 60(6), 884-894.

Costa, P. T., Jr., & McCrae, R. R. (1992). The NEO PI-R professional manual. Odessa, FL:

Psychological Assessment Resources, Inc.

Cronbach, L. J. (1955). Processes affecting scores on “understanding of others” and “assumed

similarity.” Psychological Bulletin, 52, 177-193.

Cumming, G. (2012). Understanding the New Statistics: Effect Sizes, Confidence Intervals, and

Meta-Analysis. New York: Routledge.

Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7-29.

Fast, L. A., & Funder, D. C. (2008). Personality as manifest in word use: Correlations with self-

report, acquaintance-report, and behavior. Journal of Personality and Social Psychology, 94,

334-346.


Fleeson, W., & Wilt, J. (2010). The relevance of Big Five trait content in behavior to subjective

authenticity: Do high levels of within-person behavioral variability undermine or enable

authenticity achievement? Journal of Personality, 78(4), 1354-1382.

Funder, D. C. (1995). On the accuracy of personality judgments: A realistic approach.

Psychological Review, 102, 652-670.

Funder, D. C. (1999). Personality Judgment: A Realistic Approach to Person Perception. San

Diego: Academic Press.

Funder, D. C. (2013). The Personality Puzzle (6th edition). New York: Norton.

Funder, D. C., & Colvin, C. R. (1988). Friends and strangers: Acquaintanceship, agreement, and

the accuracy of personality judgment. Journal of Personality and Social Psychology, 55,

149-158.

Funder, D. C., & Dobroth, K. M. (1987). Differences between traits: Properties associated with

interjudge agreement.

Funder, D. C., Furr, R. M., & Colvin, C. R. (2000). The Riverside Behavioral Q-sort: A tool for

the description of social behavior. Journal of Personality, 68, 451-489.

Funder, D. C., Kolar, D. W., & Blackman, M. C. (1995). Agreement among judges of

personality: Interpersonal relations, similarity, and acquaintanceship. Journal of Personality

and Social Psychology, 69, 656-672.

Furr, R. M. (2008). A framework for profile similarity: Integrating similarity, normativeness, and

distinctiveness. Journal of Personality, 76(5), 1267-1316.

Furr, R. M. (2009). Personality psychology as a truly behavioral science. European Journal of

Personality, 23, 369-401.


Furr, R. M., Wagerman, S., & Funder, D. C. (2010). Personality as manifest in behavior: Direct

behavioral observation using the revised Riverside Behavioral Q-sort (RBQ-3.0). In C. R.

Agnew, D. E. Carlston, W. G. Graziano, & J. R. Kelly (Eds.), Then a miracle occurs:

Focusing on behavior in social psychological theory and research. (pp. 186-204). Oxford

University Press.

Hirsh, J. B., DeYoung, C. G., Xu, X., & Peterson, J. B. (2010). Compassionate liberals and polite

conservatives: Associations of agreeableness with political ideology and values. Personality

and Social Psychology Bulletin, 36, 655-664.

Human, L. J., & Biesanz, J. C. (2011a). Through the looking glass clearly: Accuracy and

assumed similarity in well-adjusted individuals’ first impressions. Journal of Personality and

Social Psychology, 100(2), 349-364.

Human, L. J., & Biesanz, J. C. (2011b). Target adjustment and self-other agreement: Utilizing

trait observability to disentangle judgeability and self-knowledge. Journal of Personality and

Social Psychology, 101(1), 202-216.

Human, L. J., & Biesanz, J. C. (2012). Accuracy and assumed similarity in first impressions of

personality: Differing associations at different levels of analysis. Journal of Research in


Human, L. J., Biesanz, J. C., Finseth, S. M., Pierce, B., & Le, M. (2014). To thine own self be

true: Psychological adjustment promotes judgeability via personality-behavior congruence.

Journal of Personality and Social Psychology, 106(2), 286-303.

John, O. P., & Srivastava, S. (1999). The Big-Five trait taxonomy: History, measurement, and

theoretical perspectives. In L. A. Pervin & O. P. John (Eds.), Handbook of Personality:

Theory and Research. (Vol. 2, pp. 102-138): New York: Guilford Press.


Jussim, L. (2012). Social Perception and Social Reality. New York: Oxford University Press.

Kenny, D. A. (1994). Interpersonal Perception: A Social Relations Analysis. New York:

Guilford Press.

Kenny, D. A., Albright, L., Malloy, T. E., & Kashy, D. A. (1994). Consensus in interpersonal

perception: Acquaintance and the big five. Psychological Bulletin, 116(2), 245-258.

Klimstra, T. A., Hale III, W. W., Raaijmakers, Q. A. W., & Meeus, W. H. J. (2011).

Hypermaturity and immaturity of personality profiles in adolescents. European Journal of

Personality, 26(3), 203-211.

Klimstra, T. A., Luyckx, K., Hale III, W. W., Goossens, L., & Meeus, W. H. J. (2010).

Longitudinal associations between personality profile stability and adjustments in college

students: Distinguishing among overall stability, distinctive stability, and within-time

normativeness. Journal of Personality, 78(4), 1163-1184.

Küfner, A. C. P., Back, M. D., Nestler, S., & Egloff, B. (2010). Tell me a story and I will tell you

who you are! Lens model analyses of personality and creative writing. Journal of Research

in Personality, 44(4), 427-435.

Lee, K., & Ashton, M. C. (2004). Psychometric properties of the HEXACO personality

inventory. Multivariate Behavioral Research, 39, 329-358.

Letzring, T. D. (2008). The good judge of personality: Characteristics, behaviors, and observer

accuracy. Journal of Research in Personality, 42, 914-932.

Letzring, T. D., Wells, S. M., & Funder, D. C. (2006). Quantity and quality of available

information affect the realistic accuracy of personality judgment. Journal of Personality and

Social Psychology, 91, 111-123.


Markey, P. M., Funder, D. C., & Ozer, D. J. (2003). Complementarity of interpersonal behavior

in dyadic interactions. Personality and Social Psychology Bulletin, 29, 1082-1090.

McCrae, R. R., & Costa, P. T., Jr. (2008). The five-factor theory of personality. In O. P. John, R.

W. Robins, & L. A. Pervin (Eds.), Handbook of Personality: Theory and Measurement (3rd

ed. pp 159-181). New York: Guilford.

Nave, C. S., Sherman, R. A., Funder, D. C., Hampson, S. E., & Goldberg, L. R. (2010). On the

contextual independence of personality: Teachers’ assessments predict directly observed

behavior after four decades. Social Psychological and Personality Science, 1, 327-334.

Norman, W. T., & Goldberg, L. R. (1966). Raters, rates, and randomness in personality structure.

Journal of Personality and Social Psychology, 4(6), 681-691.

Oltmanns, T. F., & Turkheimer, E. (2009). Person perception and personality pathology. Current

Directions in Psychological Science, 18(1), 32-36.

R Core Team (2014). R: A language and environment for statistical computing. [Computer

software] Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/.

Revelle, W. (2014). psych: Procedures for personality and psychological research, Northwestern

University, Evanston, IL, USA http://CRAN.R-project.org/package=psych (Version 1.4.4).

Shedler, J., & Block, J. (1990). Adolescent drug use and psychological health: A longitudinal

inquiry. American Psychologist, 45, 612-630.

Sherman, R. A. (2014). multicon: An R package for the analysis of multivariate constructs

(version 1.2). http://CRAN.R-project.org/package=multicon.


Sherman, R. A., & Funder, D. C. (2009). Evaluating correlations in studies of personality and

behavior: Beyond the number of significant findings to be expected by chance. Journal of

Research in Personality, 43(6), 1053-1063.

Sherman, R. A., Figueredo, A. J., & Funder, D. C. (2013). The behavioral correlates of overall

and distinctive life history strategy. Journal of Personality and Social Psychology, 105(5),

873-888.

Sherman, R. A., Nave, C. S., & Funder, D. C. (2010). Situational similarity and personality

predict behavioral consistency. Journal of Personality and Social Psychology, 99(2), 330-

343.

Sherman, R. A., Nave, C. S., & Funder, D. C. (2012). Properties of persons and situations related

to overall and distinctive personality-behavior congruence. Journal of Research in


Sherman, R. A., Nave, C. S., & Funder, D. C. (2013). Situational construal is related to

personality and gender. Journal of Research in Personality, 47, 1-14.

Sherman, R. A., & Wood, D. (2014). Estimating the expected replicability of a pattern of

correlations and other measures of association. Multivariate Behavioral Research, 49(1), 17-

40.

Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability.

Psychological Bulletin, 86, 420-428.

Vazire, S. (2006). Informant reports: A cheap, fast, and easy method for personality assessment.

Journal of Research in Personality, 40, 472-481.


Vazire, S., & Mehl, M. R. (2008). Knowing me, knowing you: The accuracy and unique

predictive validity of self-ratings and other-ratings of daily behavior. Journal of Personality

and Social Psychology, 95(5), 1202-1216.

Walton, K. E., & Roberts, B. W. (2004). On the relationship between substance use and

personality traits: Abstainers are not maladjusted. Journal of Research in Personality, 38,

515-535.

Watson, D. (1989). Strangers’ ratings of five robust personality factors: Evidence of a surprising

convergence with self-report. Journal of Personality and Social Psychology, 57(1), 120-128.

Watson, D. (2001). Procrastination and the five-factor model: A facet level analysis. Personality

and Individual Differences, 30, 149-158.

Wood, D., & Brumbaugh, C. C. (2009). Using revealed mate preferences to evaluate market

force and differential preference explanations for mate selection. Journal of Personality and

Social Psychology, 96(6), 1226-1244.

Wood, D., Gosling, S. D., & Potter, J. (2007). Normality evaluations and their relations to

personality traits and well-being. Journal of Personality and Social Psychology, 93(5), 861-

879.

Wood, D., Nye, C. D., & Saucier, G. (2010). Identification and measurement of a more

comprehensive set of person-descriptive trait markers from the English lexicon. Journal of

Research in Personality, 44, 258-272.

Wortman, J., & Wood, D. (2011). The personality traits of liked people. Journal of Research in


Wortman, J., Wood, D., Furr, R. M., Fanciullo, J., & Harms, P. D. (2014). The relations between

actual similarity and experienced similarity. Journal of Research in Personality, 49, 31-46.


Table 1.

A non-exhaustive summary of research questions that can be asked and analyzed using the ‘multicon’ package

| Research Question | ‘multicon’ Function | Notable Options |
| --- | --- | --- |
| Is this multivariate construct related to some other multivariate construct? | rand.test | |
| Is this single variable related to some multivariate construct? | rand.test | |
| What are the correlations between a variable of interest and a multivariate construct for the full sample and separate by sex? (conducts rand.test for full sample and each sex) | q.cor | |
| How replicable is the pattern of associations between a variable of interest and a multivariate construct? | vector.alpha | |
| | | distinct=FALSE |
| Do the pairs of profiles agree? (Distinguishable cases; Overall and Distinctive agreement; correlations) | Profile.r | distinct=TRUE; includes tests of overall and distinctive agreement |
| How replicable are the profile agreements? (Overall and Distinctive agreement; correlations) | Profile.r.rep | |
| What are the descriptive statistics for a bunch of correlations? | describe.r | |
| Do the pairs of profiles agree? (Overall agreement; regression) | Profile.reg | Centering (group, grand, or none) with the center= option. Standardizing (T / F) with the std= option |
| How do the pairs of profiles differ from each other? (regression) | Profile.resid | |
| How do the corresponding items differ from each other? (regression) | item.resid | |
| Do the profiles match some template? (Overall agreement; correlation) | temp.match | distinct=FALSE |
| Do the profiles match some template? (Overall and Distinctive agreement; correlations) | temp.match | distinct=TRUE |
| How replicable are the template match scores? (Overall and Distinctive agreement; correlations) | temp.match.rep | |


Table 2.

The Behavioral Correlates of Extraversion

| ## - RBQ Item (essential dimension) | Combined | Women | Men |
| --- | --- | --- | --- |
| Positive Correlates | | | |
| 20 - Is talkative (III) | .35*** | .36*** | .33*** |
| 15 - High enthusiasm and energy level (VII) | .26*** | .23* | .31** |
| 56 - Speaks in a loud voice (V) | .22** | .13 | .33*** |
| 48 - Expresses sexual interest (VI) | .21** | .26** | .15 |
| 02 - Volunteers Information about Self (IV) | .17* | .24* | .09 |
| 07 - Exhibits social skills (IV) | .14* | .09 | .20* |
| 30 - Appears to regard self as phys. Attractive (VI) | .13+ | .01 | .27** |
| Negative Correlates | | | |
| 08 - Reserved and unexpressive (III) | -.35*** | -.34*** | -.37*** |
| 40 - Keeps other(s) at a distance (III) | -.28*** | -.24* | -.32** |
| 50 - Gives up when faced w/obstacles (III) | -.28*** | -.40*** | -.15 |
| 13 - Exhibits awkward interpersonal style (VI) | -.17* | -.22* | -.14 |
| 60 - Seems detached from situation (III) | -.17* | -.11 | -.24* |
| 36 - Behaves in fearful or timid manner (III) | -.16* | -.23* | -.07 |
| 18 - Expresses agreement frequently (VII) | -.15* | -.11 | -.19+ |
| 55 - Behaves in competitive manner (IV) | -.15* | -.21* | -.11 |
| 64 - Concentrates; Work hard at task (IV) | -.14* | -.11 | -.16 |
| 51 - Behaves in stereotypical gender style or manner (VI) | -.12+ | -.14 | -.10 |
| 06 - Appears relaxed and comfortable (I) | -.11 | -.20* | -.02 |
| Average Absolute r | .090** | .102+ | .106* |

Note. RBQ Item content is abbreviated. *** = p < .001, ** = p < .01, * = p < .05, + = p < .10. Ns are 205, 105, and 100 for Combined, Women, and Men respectively. Male-female vector correlation r = .62. Estimated pattern replicabilities are .67, .46, and .47 respectively.


Footnotes

1 We are grateful to Mike Furr for suggesting the name multivariate constructs.

2 We say “roughly” because Funder’s (2013) single trait, essential trait, and many trait

approaches to personality correspond to the strategies we identify. We do not differentiate

between Funder’s single trait and essential trait approaches here for the sake of simplicity.

3 More information about this dataset can be found in Sherman, Nave, and Funder (2010).

4 The rand.test function has an argument to set the random seed, which is by default set to 2.

This ensures that this function returns the same result for the same analysis every time. Setting

the seed to FALSE, or to a different value, will change the exact values slightly because different

random reassignments will be used.

5 Those following along will note this code generated a warning stemming from the rand.test

function suggesting that the confidence intervals for the p-values are not precise. Because rand.test

relies on resampling it provides 99.9% confidence intervals for the accuracy of the resulting p-

values. The warning indicates that one might want to use a larger number of resamples

(simulations) to obtain a more precise p-value estimate. However, for extreme p-values, the

number of simulations necessary to generate an accurate confidence interval can be extremely

large.

6 As an exercise, we actually conducted such an analysis on the data about to be discussed. These

results and their comparison to the results found using a comprehensive approach, which may be

of interest to some, are available at the end of the R file associated with this manuscript.

However, as noted previously, the goal of this article is not to compare the essential and

comprehensive approaches to this problem, so we will not do so here.


7 In addition, this analysis returns more warnings indicating that our estimated p-value is likely to

be imprecise. This is in part due to the fact that we set the number of sims=1 for the purpose of

this demonstration. See also footnote #5.

8 It is worth noting that Furr’s (2008) method of creating distinctive profiles involves subtracting

the mean profile from each profile in the set. The Profile.r function discussed here uses a

regression based approach to create distinctive profiles by predicting the scores in each profile in

the set from the mean profile and retaining the residuals.
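
The behavior described in footnotes 4 and 5 can be illustrated with a minimal base R sketch. It is not the rand.test implementation: the helper perm_test, its arguments, and the toy data are hypothetical, and the use of an exact binomial interval (binom.test) for the p-value is our assumption about one reasonable way to quantify resampling error. The defaults mirror the values named in the footnotes (seed of 2, 99.9% interval).

```r
# Hypothetical helper (not part of 'multicon'): a two-group randomization
# test on a mean difference, with a confidence interval for the p-value.
perm_test <- function(x, g, sims = 1000, seed = 2, conf = 0.999) {
  # A fixed seed reproduces the same random reassignments on every run;
  # seed = FALSE leaves the random number generator untouched.
  if (!identical(seed, FALSE)) set.seed(seed)
  obs <- abs(mean(x[g == 1]) - mean(x[g == 2]))
  null <- replicate(sims, {
    gs <- sample(g)  # randomly reassign the group labels
    abs(mean(x[gs == 1]) - mean(x[gs == 2]))
  })
  hits <- sum(null >= obs)
  # Assumption: exact binomial interval for the proportion of reassignments
  # at least as extreme as the observed difference.
  ci <- binom.test(hits, sims, conf.level = conf)$conf.int
  list(p = hits / sims, ci = as.numeric(ci))
}

# Toy data for illustration only.
x <- c(5.1, 4.8, 6.0, 5.5, 5.9, 6.4, 6.8, 7.1, 6.2, 7.4)
g <- rep(1:2, each = 5)

r1 <- perm_test(x, g)
r2 <- perm_test(x, g)
same_result <- identical(r1, r2)  # same seed, same reassignments

# More resamples give a narrower interval around the estimated p-value.
wide   <- diff(perm_test(x, g, sims = 200)$ci)
narrow <- diff(perm_test(x, g, sims = 5000)$ci)
```

In this sketch, calling perm_test with seed = FALSE or a different seed would change the estimated p-value slightly, and for very small p-values the interval remains wide relative to the estimate unless sims is made very large.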
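
The contrast drawn in footnote 8 can be sketched in base R. This is an illustration under stated assumptions, not the Profile.r source code: the data are simulated, and the object names (profiles, mean_profile, distinct_sub, distinct_reg) are ours.

```r
# Simulated data for illustration only: 5 profiles (rows) on 6 items (columns).
set.seed(1)
profiles <- matrix(rnorm(30, mean = 5, sd = 2), nrow = 5, ncol = 6)

# The mean profile: the average score on each item across profiles.
mean_profile <- colMeans(profiles)

# Subtraction approach: remove the mean profile from every profile.
distinct_sub <- sweep(profiles, 2, mean_profile, FUN = "-")

# Regression approach: predict each profile from the mean profile
# and retain the residuals.
distinct_reg <- t(apply(profiles, 1, function(p) resid(lm(p ~ mean_profile))))

# Correlation of each distinctive profile with the mean profile.
r_sub <- apply(distinct_sub, 1, cor, y = mean_profile)
r_reg <- apply(distinct_reg, 1, cor, y = mean_profile)
```

Here the regression residuals in distinct_reg are uncorrelated with the mean profile by construction, whereas the difference scores in distinct_sub generally are not; this is the practical difference between the two ways of defining distinctiveness.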
