estimating the functional dimensionality of neural representations · estimating the functional...

Estimating the functional dimensionality of neural

representations

Christiane Ahlheima,∗, Bradley C. Lovea,b,

aDepartment of Experimental Psychology, University College London, 26 Bedford Way,London WC1H 0AP, United Kingdom

bThe Alan Turing Institute, United Kingdom

Abstract

Recent advances in multivariate fMRI analysis stress the importance of infor-mation inherent to voxel patterns. Key to interpreting these patterns is esti-mating the underlying dimensionality of neural representations. Dimensionsmay correspond to psychological dimensions, such as length and orientation,or involve other coding schemes. Unfortunately, the noise structure of fMRIdata inflates dimensionality estimates and thus makes it difficult to assess thetrue underlying dimensionality of a pattern. To address this challenge, wedeveloped a novel approach to identify brain regions that carry reliable task-modulated signal and to derive an estimate of the signal’s functional dimen-sionality. We combined singular value decomposition with cross-validationto find the best low-dimensional projection of a pattern of voxel-responsesat a single-subject level. Goodness of the low-dimensional reconstructionis measured as Pearson correlation with a test set, which allows to test forsignificance of the low-dimensional reconstruction across participants. Usinghierarchical Bayesian modeling, we derive the best estimate and associateduncertainty of underlying dimensionality across participants. We validatedour method on simulated data of varying underlying dimensionality, show-ing that recovered dimensionalities match closely true dimensionalities. Wethen applied our method to three published fMRI data sets all involving pro-cessing of visual stimuli. The results highlight three possible applications ofestimating the functional dimensionality of neural data. Firstly, it can aid

∗Corresponding authorEmail addresses: [email protected] (Christiane Ahlheim), [email protected]

(Bradley C. Love)

.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint

https://doi.org/10.1101/232454

http://creativecommons.org/licenses/by/4.0/

evaluation of model-based analyses by revealing which areas express reliable,task-modulated signal that could be missed by specific models. Secondly,it can reveal functional differences across brain regions. Thirdly, knowingthe functional dimensionality allows assessing task-related differences in thecomplexity of neural patterns.

Keywords: neural representations, dimensionality reduction, multivariateanalysis

2


https://doi.org/10.1101/232454


1. Introduction1

A growing number of fMRI studies are investigating the representational2

geometry of voxel response patterns. For example, using representational3

similarity analysis (RSA; Kriegeskorte and Kievit, 2013), researchers have4

characterized visual object representations along the ventral stream (Khaligh-5

Razavi and Kriegeskorte, 2014) and how these representations vary across6

tasks (Bracci et al., 2017).7

Interpreting representational geometry in neural responses can be diffi-8

cult. For example, RSA tests for a hypothesized representational pattern,9

but an important and more fundamental question should be addressed first,10

namely whether there is any dimensionality to the underlying neural pattern11

and, if so, what that dimensionality is.12

Knowing whether a pattern has dimensionality should be prerequisite for13

RSA and other multivariate representational analyses because a particular14

similarity structure can only be found when there is sufficient dimensional-15

ity to represent the proposed relations. For example, searching for a flavor16

space with dimensions sweet, sour, bitter, salty and umami would be a fool’s17

errand in brain areas that contain little or no dimensionality. Furthermore,18

independent of the particular geometry, the dimensionality of a neural pat-19

tern is informative of how many features of a task are represented in a brain20

region, which can inform our understanding of an area’s function.21

There are many methods of dimensionality reduction and estimation,22

most of which involve low-rank matrix approximation and aim to maximize23

the correspondence between the original and the approximated matrix. For24

example, two common approaches to estimate the dimensionality of an ob-25

served neural or behavioral pattern are principal component analysis (PCA)26

or relatedly, multidimensional scaling (MDS).27

PCA, or the closely related factor analysis and singular value decompo-28

sition (SVD) (Hastie et al., 2009), is widely used in the study of individual29

differences and aids estimating how many latent components, or “factors”,30

underlie a pattern of (item) responses across participants, as for instance31

in the context of intelligence (Spearman, 1904) or personality tests (Cattell,32

1947). In the context of neuroimaging, PCA has been used to identify brain33

networks (Huth et al., 2012; Friston et al., 1993). PCA derives how much34

variance of the observed pattern is explained by each underlying component.35

Similarly, MDS finds the best representation of original distances in a36

low-dimensional space (Kriegeskorte and Kievit, 2013). For example, two37

3


https://doi.org/10.1101/232454


stimuli like a chair and table that are very close to each other in the high-38

dimensional space will be represented closely in the low-dimensional projec-39

tion achieved by MDS, whereas two stimuli that were very distant from each40

other, for instance a chair and a bunny, will be projected far apart. MDS has41

been successfully applied to behavioral as well as neural data to reveal which42

stimulus features underly observed representational geometries (Bracci and43

Op de Beeck, 2016; Kriegeskorte and Kievit, 2013; Kriegeskorte et al., 2008),44

though it has been questioned to which extent results from MDS are inter-45

pretable (Goddard et al., 2017). For reasons outlined below, we will focus on46

SVD to estimate the dimensionality of neural representations, though other47

methods could be paired with our general approach, including nonlinear ap-48

proaches such as Nonlinear PCA (Kramer, 1991).49

Estimating the dimensionality of neural data brings its own unique chal-50

lenges. In a noise-free scenario, dimensionality can be defined as the number51

of linear orthogonal components (singular- or eigenvalues) underlying a ma-52

trix that are larger than zero (Shlens, 2014), indicating that the component53

fits some variance in the data. Unfortunately, actual recordings of neural ac-54

tivity always contain noise, which inflates non-signal components above zero55

(Fusi et al., 2016; Diedrichsen et al., 2013). This noise makes it challenging56

to determine which areas contain signal and, if so, what the dimensionality57

of the signal is.58

One criterion, which we adopt in the work reported here, is to choose59

the number of components that should maximize reconstruction accuracy60

(measured by correlation) on new data (i.e., test data). While even for61

data with low or moderate true dimensionality more components will always62

increase fit for existing data (i.e., training data), performance on test data63

(i.e., generalization, prediction) will usually be best for a moderate number of64

components because these components largely reflect true signal as opposed65

to noise in the observed training sample.66

The problem of distinguishing between signal and noise in a neural pat-67

tern is related to the bias-variance trade-off in supervised learning and model-68

selection. Overly simple models (few components) are highly biased, fitting69

training data poorly and not performing well on test data. These overly70

simple models cannot pick-up on nuances in the signal. Conversely, overly71

complex models (many components) are too sensitive to the variance in the72

training date (i.e., overfit). Although they fit the training data very well,73

overly complex models treat noise in the training data as signal and, there-74

fore, generalize poorly. Thus, the sweet spot for test performance should be75

4


https://doi.org/10.1101/232454


at some moderate number of components that largely reflect true signal (see76

Figure 1 A). Thus, identifying the true number of underlying components is77

analogous to deciding which model best explains the data.78

One naive way to navigate this trade-off between simple and complex79

models is to use some arbitrary cutoff, such as including the number of com-80

ponents that captures some amount of variance in the training data or decid-81

ing based on visual inspection which components may carry signal (known82

as scree plot, Cattell, 1966). In the case of fMRI, where the signal-to-noise83

ratio depends on multiple factors like scanner settings, experimental design,84

and physiological activity (Huettel et al., 2003), estimating the underlying85

dimensionality based on an arbitrary cut-off criterion for explained variance86

could be misleading. Likewise, although identifying relevant components87

via visual inspection works for small datasets, it is not applicable to large88

datasets as fMRI data, as it would require a manual decision for each voxel.89

Furthermore, the size of fMRI datasets (usually thousands of voxels) calls90

for a computationally efficient and automated approach, making estimating91

the dimensionality for the whole brain feasible. Thus, for neuroimaging data,92

there is a need for an efficient, systematic and objective approach that can93

both identify areas with statistical significant dimensionality and provide a94

useful estimate of the underlying dimensionality.95

Previous efforts to estimate the dimensionality of neural response pat-96

terns have applied linear classifiers to neural data to evaluate dimensionality97

(Rigotti et al., 2013; Diedrichsen et al., 2013). Rigotti et al. (2013) were able98

to show that dimensionality of single-cell recordings in monkey PFC is linked99

to successful task-performance, indicating that dimensionality of neural pat-100

terns is task-sensitive. In line with this, Diedrichsen et al. (2013) showed101

that the dimensionality of motor cortex representations differs depending on102

the task. Using a combination of PCA and linear Gaussian classifiers, the103

authors showed that motor cortex representations of different force levels104

are low dimensional, whereas usage of different fingers was associated with105

multidimensional neural patterns (Diedrichsen et al., 2013). Notably, both106

studies focused on estimating task-related changes in dimensionality in a pre-107

scribed brain region, rather than estimating which areas across the brain had108

significant dimensionality.109

In the present work, we expand on previous contributions by evaluat-110

ing a novel approach that, in a robust and computationally efficient manner,111

tests for which areas display statistically significant dimensionality, estimates112

the dimensionality, and provides an indication of the certainty of the esti-113

5


https://doi.org/10.1101/232454


mate. We combine singular value decomposition (SVD) and cross-validation114

to identify areas across the brain with underlying dimensionality. We derive115

which of all possible low-dimensional reconstructions of the fMRI signal is the116

best dimensionality estimate of a held-out test run, and quantify the good-117

ness of the low-dimensional reconstruction via Pearson correlation. Using a118

cross-validation procedure to identify the best dimensionality estimate boosts119

that only components that carry signal and thus generalize to new data are120

kept. By assessing the significance of the correlation, we can distinguish be-121

tween areas that show reliable signal with underlying dimensionality vs. areas122

that do not show a reliable task-modulation. After establishing significant123

functional dimensionality, we use Bayesian modeling to derive a population124

estimate and associated uncertainty of the degree of dimensionality. We will125

refer to this task-dependent dimensionality as functional dimensionality.126

Through simulations and evaluation of three (published) fMRI datasets,127

we find that our method successfully identifies areas with significant func-128

tional dimensionality and provides reasonable estimates of the underlying129

dimensionality. In the first fMRI dataset, participants performed a catego-130

rization task which required differential attention to various stimulus features131

(Mack et al., 2013). The second study investigated shape- and category spe-132

cific neural responses to the presentation of natural images (Bracci and Op de133

Beeck, 2016). The third study involved categorization tasks that varied sys-134

tematically in their attentional demands (Mack et al., 2016), which we predict135

should affect functional dimensionality.136

Across all three studies, we were able to identify areas carrying functional137

dimensionality in a manner that supported and extended the original find-138

ings. Focusing on wholebrain effects in the the first two studies, we identified139

a consistent network of areas showing functional dimensionality during vi-140

sual stimulus processing. This network encompassed areas that were reported141

by the original authors as being task-relevant, identified through represen-142

tational similarity analysis and cognitive model fitting (Bracci and Op de143

Beeck, 2016; Mack et al., 2013). Furthermore, functional dimensionality was144

revealed in additional areas, highlighting the sensitivity of our method and145

suggesting that reliable task-modulated signal was present that was not ex-146

plained by the models the original authors tested. In the last study, we147

combined a region-of-interest approach and multilevel Bayesian modeling to148

show that dimensionality varied depending on task-requirements, which fol-149

lows from the original authors’ claims but remained untested until now (Mack150

et al., 2016). We outline how the notion and identification of functional di-151

6


https://doi.org/10.1101/232454


A

1 3 5 7 9 11 13 15number of components

0

0.05

0.1

0.15

0.2

0.25

0.3

corre

latio

n

B

number of components

corre

latio

ntrainingtest

overfitting

number of components number of components

Figure 1: Illustration of the concept of overfitting and generalizability. A: Asmore components are added to a low-dimensional reconstruction, the correlationbetween the training data and the reconstruction approaches the maximum of 1for a full-dimensional reconstruction (purple curve). Adding components is equiv-alent to adding model parameters to improve fit, which reduces the model’s biasand increases its variance. For the correlation between the reconstructed train-ing and independent test data (red curve), adding components initially improvesperformance but at some point reduces performance due to overfit (see Parpartet al., 2017, for a related illustration). B: Reconstruction correlations achieved byall possible low-dimensional reconstructions for a simulated ground-truth dimen-sionality of 4. Reconstruction correlations rise as more components are added upto the point where the true dimensionality is reached, and decrease afterwards.Results are averaged across 6 runs and 1000 simulated voxel patterns with varyingsignal-to-noise ratios.

mensionality can aid the analysis and understanding of neuroimaging data152

in various ways.153

2. General Methods154

Neuroimaging data, such as fMRI, M/EEG, or single-cell recordings, can155

be represented as a matrix of n voxels, neurons, or sensors × m conditions.156

For example, BOLD response patterns in the fusiform face area (FFA) to157

3 different stimulus conditions can be expressed as a matrix Y of the size158

n (number of voxels) × 3 (face, house, or tool stimulus condition). The159

maximum possible dimensionality is the minimum of n and m, which in this160

example would be 3, assuming many voxels in FFA were included in the161

7


https://doi.org/10.1101/232454


analysis. However, functional dimensionality could be lower. For example,162

dimensionality would be lower if the region only responded to face stimuli163

and showed the same lower response to house and tool stimuli.164

Various methods exist that allow to estimate a matrix’s dimensionality165

and a review of all of them is beyond the scope of this paper. The approach we166

present here is modular and estimates a matrix’s dimensionality by combining167

low-rank approximation with cross-validation and significance testing. This168

modularity allows to flexibly choose the dimensionality reduction technique169

which best fits with ones requirements. Here, we used SVD (which is often170

used to compute PCA solutions) because it is a well-understood, easy to171

implement, and a computationally efficient low-rank matrix approximation.172

The choice of SVD, as well as how the data matrix is normalized is in-173

formed by our understanding of the underlying neural signal. Because voxels174

differ greatly from one another in their overall activity level and activity lev-175

els can drift over runs, we demean each row (i.e., voxel) of the data matrix by176

run. In contrast, we do not demean each column, as would typically be done177

with approaches that focus on the covariance of the column vectors (e.g.,178

PCA). The reason we do not normalize by column (i.e., condition) is that we179

are open to the possibility that different stimuli may be partially coded by180

overall activity levels of a population of voxels. For example, imagine a brain181

area only responds strongly to faces, but not to other stimuli. An SVD with182

demeaned voxels (i.e., rows) would be sensitive to this dimension of represen-183

tation, whereas a procedure that effectively worked with demeaned columns184

would not be sensitive to this task-driven difference in neural activity (see185

Davis et al., 2014; Hebart and Baker, 2017, for a related discussion).186

In the following section, we describe how a combination of SVD and187

cross-validation can be used to test whether an observed neural pattern can188

be successfully reconstructed using a low-rank approximation, assessed as a189

significant Pearson correlation between a low-rank approximation and a held190

out test set, and how this technique provides an estimate of the pattern’s un-191

derlying dimensionality (see Figure 2 for an overview of all steps). As all our192

examples are fMRI data sets, we will describe the steps using fMRI termi-193

nology, though the procedure could be applied to any type of neuroimaging194

data. We provide the code and data to replicate the analyses presented here195

and for use on other datasets at osf.io/tpq92.196

8


osf.io/tpq92

https://doi.org/10.1101/232454


j correlation coefficientsper participant

significantnot significant

no dimensionality inferred

r1,i…rj,ir1,1…rj,1

i

r1…ri

averaging across j

t-test

r1…ri ,

sd(r1)…

sd(ri)

Bayesian modeling

NV (µ, σ)

dimensionality = NV(µ, σ)

j-2 training

validation

test

j matrices of beta estimates (pre-whitened, mean-centered

if necessary)

all low-dimensional reconstructions of training data

SVD

rmax with validation data ➞ dimensionality estimate k

r

k-dimensional reconstruction of training + validation, correlate

with test r

Step 2

Step 3

Step 4

Step 1

9


https://doi.org/10.1101/232454


Figure 2 (previous page): Step 1: Prior to dimensionality estimation, raw dataare pre-processed with preferred settings and software and beta estimates derivedfrom a GLM are obtained for each condition of interest. The resulting j matricesof size n (number of voxels) ×m (number of conditions) are pre-whitened andmean-centered (by row, i.e., voxel) to remove baseline differences across runs.Step 2: a combination of cross-validation and SVD is implemented to find the bestdimensionality estimate k for each run j. Pearson correlations between all possiblelow-dimensionality reconstructions of the data and a held-out test set quantify thegoodness of each reconstruction for each run j (see Figure 3 for details). Step 3: theresulting j correlations are averaged for each participant and tested for significance,for instance using t-tests, across all participants. Step 4: If the reconstructioncorrelations are significant across participants, a hierarchical Bayesian model canbe used to derive the best estimate of the degree of functional dimensionality (seeFigure 4 for details). For each participant, the average estimated dimensionalityand standard deviation of this estimate is calculated and a population estimateand respective standard deviation (uncertainty in the estimate) is derived acrossall participants.

2.1. Step 1: Data pre-processing197

We developed the presented method with application to fMRI data in198

mind, though it can be easily adapted to fit requirements of single cell record-199

ings or M/EEG data. The method takes beta estimates resulting from a200

GLM fit to the observed BOLD response as input. In all studies presented201

here, standard pre-processing steps were performed using SPM 12 (Wellcome202

Department of Cognitive Neurology, London, United Kingdom), but the pre-203

cise nature of the preprocessing and implemented GLM is not critical to our204

method. Functional data were motion corrected, co-registered and spatially205

normalized to the Montreal Neurological Institute (MNI) space.206

To reduce the impact of the structured noise, which is correlated across207

voxels, on the dimensionality estimation and to improve the reliability of208

multivariate voxel response patterns (Walther et al., 2016), we applied mul-209

tivariate noise-normalization, that is, spatial pre-whitening, before estimat-210

ing the functional dimensionality. We used the residual time-series from the211

fitted GLM to estimate the noise covariance Σnoise and used regularization to212

shrink it towards the diagonal (Ledoit and Wolf, 2004). Each n×m matrix213

of beta estimates Y was then multiplied by Σ− 1

2noise (Walther et al., 2016).214

In fMRI data, the baseline activation can differ across functional runs.215

10


https://doi.org/10.1101/232454


This has important implications for our approach presented here, as it can216

bias the correlation between neural patterns across runs. To account for this,217

we demeaned the pre-whitened beta estimates across conditions, resulting in218

an average estimate of zero for each voxel. This demeaning reduces the219

possible maximum dimensionality of the data to kmax = m − 1. Notably,220

demeaning of voxels is conceptually different from demeaning conditions,221

which would have been implemented by PCA, as it preserves differences222

between conditions, whereas PCA would remove those.223

2.2. Step 2: Evaluating all possible SVD (dimensional) models224

The dimensionality of a matrix is defined as its number of non-zero sin-225

gular values, identified via singular value decomposition (SVD). SVD is the226

factorization of an observed n × m matrix M of the form UΣV ᵀ. U and227

V are matrices of size m × m and n × n, respectively, and Σ is an n × m228

matrix, whose diagonal entries are referred to as the singular values of M .229

A k-dimensional reconstruction of the matrix M can be achieved by only230

keeping the k largest singular values in Σ and replacing all others with zero,231

resulting in Σ. This is known as Eckart-Yong theorem (Eckart and Young,232

1936), leading to equation 1:233

M = UΣV ᵀ (1)

To estimate the dimensionality of fMRI data, we applied SVD to j(number234

of runs) matrices Y of n(number of voxel) × m(number of beta estimates),235

with the restriction of n > m.236

Critically, fMRI beta estimates are noisy estimates of the true signal.237

In the presence of noise, all singular values of a matrix will be non-zero,238

requiring the definition of a cut-off criterion to assess the number of singular239

values reflecting signal. Removing noise-carrying components from a matrix240

is beneficial, as it avoids overfitting to the noise and thus, improves the241

generalizability of the low-dimensional reconstruction to another sample (see242

Figure 1 A for an illustration of the concept of overfitting). We aimed to avoid243

any subjective (arbitrary) criterion as percentage of explained variance or244

alike (Cattell, 1966). To that end, we implemented a nested cross-validation245

procedure at the core of our method to identify singular values that carry246

signal (see step 1 of the general overview depicted in Figure 2 and Figure 3247

for a detailed illustration of the cross-validation approach). This allows us248

to overcome the inflation of dimensionality of fMRI data due to noise and249

test which areas of the brain carry signal with functional dimensionality.250

11


https://doi.org/10.1101/232454


Data are partitioned j × (j − 1) times into training (Ytrain), validation251

(Yval), and test (Ytest) data. The (demeaned and pre-whitened) j−2 training252

runs are averaged, and SVD is applied to the resulting n×m matrix Ytrain.253

We then build all possible low-dimensional reconstructions of the averaged254

training data, with dimensionality ranging from 1 to m−1. Low-dimensional255

reconstructions are generated by keeping only the k highest singular values256

and setting all others to zero. Each low-dimensional reconstruction of matrix257

Ytrain is correlated with the held-out Yval. This is repeated for each possible258

partitioning in training and validation, resulting in j−1 × m−1 correlation259

coefficients. Correlations are Fisher’s z-transformed and averaged across the260

j − 1 partitionings. The dimensionality with the average highest correlation261

is picked as best estimate k of the underlying dimensionality. As keeping262

components that reflect noise rather than signal lowers the correlation with263

an independent data set, the highest correlation is not necessarily achieved264

by keeping more components. This procedure thus avoids inflated dimen-265

sionality estimates.266

After identifying the best dimensionality estimate k for run j, the training267

and validation runs from 1 to j−1 are averaged together and SVD is applied268

to the averaged data. We then generate a k-dimensional reconstruction of269

the averaged data. The quality of this final low-dimensional reconstruction270

is measured as Pearson correlation with Ytest. We chose Pearson correlation271

and not mean-square error (MSE), which is suggested by the use of SVD as272

measure of reconstruction quality, because MSE is influenced by the variance273

of the reconstructed data, which depends on its dimensionality k.274

2.3. Step 3: Determining statistical significance275

The approach results in j estimates of the underlying dimensionality and276

j corresponding test correlations per participant. Under the null-hypothesis277

of no dimensionality, and thus, only noise present in the matrix, reconstruc-278

tion correlations averaged across runs are distributed around zero. Thus,279

across-participants significance of the averaged reconstruction correlations280

can be assessed using one-sample t-tests or non-parametric alternatives, as281

for instance permutation tests (Nichols and Holmes, 2003), and established282

correction methods for multiple comparisons, like threshold-free cluster en-283

hancement (TFCE, see Smith and Nichols, 2009).284

Only if a significant, k-dimensional, reconstruction correlation can be285

established across participants, we refer to an area as showing functional di-286

mensionality. It should be noted that a significant reconstruction correlation287

12


https://doi.org/10.1101/232454


only indicates that the underlying functional dimensionality is one or bigger.288

However, testing for a dimensionality of two or larger can be achieved by289

removing not only the voxel-mean before estimating the dimensionality, but290

also the condition mean, effectively removing univariate differences between291

conditions.292

2.4. Step 4: Estimating the degree of dimensionality293

The previously described steps allow us to identify which areas carry294

reliable signal with functional dimensionality, but do not provide a precise295

estimate of the degree of the underlying dimensionality. The best population296

estimate of a region’s functional dimensionality should optimally combine297

information across participants, giving more weight to participants with more298

reliable estimates, and should furthermore reflect how peaked the distribution299

of underlying population estimates is, accounting for the fact that different300

participants could express different true dimensionality.301

Given a significant reconstruction correlation across participant, j esti-302

mates of the degree of dimensionality are obtained (for each voxel or ROI)303

for each participant. In a noise-free scenario, all j estimates reflect the true304

dimensionality and thus, direct inference could be made solely based on these305

estimates. Under noise, these estimates could over- or underestimate the true306

dimensionality. The less reliable the j dimensionality estimates, the higher307

the variance across them. Mere averaging of the j estimates across par-308

ticipants would discard this information, weighting all participants equally,309

irrespective of their reliability. Down-weighting the influence of less reliable310

dimensionality estimates on the population estimate leads to a better popu-311

lation estimate (Kruschke, 2014).312

To account for this, we implemented a multilevel Bayesian model using the313

software package Stan (The Stan Development Team, 2017). Given the mean314

and standard deviation of j dimensionality estimates per participant, the315

model derives the best estimate for the true degree of dimensionality across all316

participants. Due to the nature of the multilevel model, individual estimates317

are subject to shrinkage towards the estimated population mean, and the318

degree of shrinkage is more pronounced for estimates with higher variance and319

stronger deviation from the estimated population mean (Kruschke, 2014).320

Additionally to the estimate of the population dimensionality, the model321

returns estimates for the population dimensionality’s variance, reflecting the322

uncertainty of the dimensionality estimate. For each individual participant,323

13


https://doi.org/10.1101/232454


separately pre-whitened and mean-centred

n vo

xel

m beta estimates

j-2 training data

validation data

test data

m-1 low-dimensional reconstructions

r1 … rk-1

[…]

k-dimensional reconstruction

(training + validation)

max r : dimensionality

estimate k

j ×(j-1) partitioning into training, validation and test

SVD

test correlation rj

j runs

∑

averaged training data

Figure 3: Illustration of the combination of SVD and cross-validation, correspond-ing to step 2 in Figure 2. For each searchlight or ROI, j (number of runs) n(number of voxels) × m (number of beta estimates) matrices are used to estimatethe functional dimensionality. For all possible partitions of j runs into training,validation and test data, we first average all training runs and build all possiblelow-dimensional reconstructions of these averaged data using SVD. All reconstruc-tions are then correlated with the validation run, resulting in j − 1 correlationcoefficients and respective dimensionalities. The dimensionality that results in thehighest average correlation across j − 1 runs is picked as dimensionality estimatek for this fold and a k-dimensional reconstruction of the average of the trainingand validation runs is correlated with a held-out test-run, resulting in a final re-construction correlation. In total, j reconstruction correlations are returned thatcan be averaged and tested for significance across participants using one-samplet-tests or alike. To derive a better estimate of the underlying dimensionality, thej dimensionality estimates per participant can be submitted to the hierarchicalBayesian model (step 4 in Figure 2)

14


https://doi.org/10.1101/232454


the model estimates the participant’s true underlying dimensionality and re-324

turns the uncertainty of this estimate. As we did not have strong priors325

regarding the dimensionality of the neural patterns, we implemented a uni-326

form prior over the population dimensionality estimates, reflecting that the327

dimensionality could be anything from 1 to m − 1. This can be adapted to328

be informative for studies estimating the functional dimensionality of neural329

patterns with stronger priors. Figure 4 shows an illustration of the model.330

The model assumed that all individual average dimensionality estimates331

yi come from a truncated individual t-distribution, centered at the true indi-332

vidual dimensionality µi which comes from a common truncated normal dis-333

tribution with mean µ and variance σ2, see Equation 2. We chose a truncated334

t-distribution at the individual level to account for the fact that there is only335

a limited number (j) of samples underlying each participant’s dimensionality336

estimate. The uniform prior distribution over the true dimensionality ranged337

from 1 to m− 1.338

yi ∼ T (j − 1, µi, σi), 1 ≤ yi ≤ m− 1,with

µi ∼ N(µ, σ), 0 ≤ µi ≤ σmax,

σi ∼ N(σi, 1), 0 ≤ σi ≤ max(σi), and

σi ∼ U(0,max(σi)).

(2)

The maximum population variance was defined as the expected variance339

of this uniform distribution 112

(m−2)2, reflecting the prior that each partici-340

pant could express a different, true dimensionality. On the subject-level, the341

maximum variance was defined as342

max(σ2i ) =

j

j − 1∗ (m− 1 − m

2)2 (3)

which corresponds to the maximum possible variance across j dimension-343

ality estimates.344

3. Simulations345

Before applying our method to real fMRI data, we tested the validity of346

our method through dimensionality-recovery studies on simulated fMRI data.347

Estimating the dimensionality for simulated cases where the true underlying348

dimensionality is known allowed us to assess whether our procedure results349

in a reliable dimensionality estimate.350

15


https://doi.org/10.1101/232454


nv(µ, σ)

µ

~

σ

min = 0, max = max(σi)

σi

~

xi σi^

µi

student t (µi, σi) nv(σi, 1)~~

min = 1, max = m-1 min = 0, max = 1/12 (m-2)2

observed values

estimated values

~~

Figure 4: Illustration of the implemented multilevel model to estimate the degreeof functional dimensionality, corresponding to step 4 in Figure 2. The observedaveraged dimensionality estimates per participant are assumed to be sampled froman underlying subject-specific t-distribution with mean µi and standard deviationσi. The standard deviation σi of the participants’ dimensionality estimates isassumed to be sampled from a normal distribution with mean σi and a standarddeviation of 1. The subject-specific t-distributions of µi are assumed to comefrom a population distribution with a normally distributed mean µ and varianceσ. Subject-specific standard deviations σi are assumed to come from a uniformdistribution, ranging from 0 to max(σi). At the top level, a uniform prior isimplemented. Mean and variance of the normal distribution of population meansµ are assumed to come from a uniform distribution ranging from 1 to m− 1 and0 to σmax, respectively. Distributions were derived from https://github.com/

rasmusab/distribution_diagrams.

16


https://github.com/rasmusab/distribution_diagrams

https://github.com/rasmusab/distribution_diagrams

https://doi.org/10.1101/232454


3.1. Methods351

Simulated data were created using the RSA toolbox (Nili et al., 2014) and352

custom Matlab code. Parameters of the simulation were picked in accordance353

with the study by Mack and colleagues (2013). We simulated fMRI data of354

presentation of 16 different stimuli, presented for 3 sec, three repetitions per355

run, and six runs, closely matching the specifications of the original study.356

To mimic a searchlight-approach, we defined the size of the cubic sphere357

4 × 4 × 4 voxels, resulting in a simulated pattern of 64 voxels.358

We simulated data with a dimensionality of 2, 4, and 6. We set the mean359

signal to noise ratio (SNR) to match empirically observed reconstruction360

correlation magnitudes of .25. As in the real data, reconstruction correlations361

varied across participants, ranging from .10 to .50. Thus, participants differed362

in their reliability of the dimensionality estimates.363

To generate data with varying ground-truth dimensionality k, we first364

generated true, i.e. noise-free, n(voxel) × m(conditions) matrices with un-365

derlying pre-defined dimensionality. This was achieved by applying PCA to366

a random 16 × 16 matrix and building a k-dimensional reconstruction of it.367

Rows of this matrix were added to a n× 16 matrix. For each row, i.e. voxel,368

a specific amplitude was drawn from a normal distribution and added.369

In the next step, we calculated the dot-product of the generated beta370

matrices and generated design matrices, which were HRF convolved. This371

resulted in noise-free fMRI time series.372

A noise matrix was generated by randomly sampling from a Gaussian dis-373

tribution. The n(voxel) × t(timesteps) matrix was then spatially smoothed374

and temporally smoothed with a Gaussian kernel of 4 FWHM. Finally, this375

temporally and spatially smoothed noise matrix was added to the noise-free376

time-series and the design matrix was fit the the resulting data using a GLM.377

This resulted in a (noisy) voxel × conditions beta matrix for each simulated378

run. The generated beta matrices were then passed on to the dimensionality379

estimation.380

We capitalized on our prior knowledge of possible dimensionalities that381

could underly the pattern and thus tested only for reconstruction correlations382

that could be achieved by keeping either 2, 4, or 6 components. This resulted383

in three reconstruction correlations per run. Reconstruction correlations were384

averaged across runs and we assessed how often each of the possible models385

of dimensionality achieved the highest correlation across subjects, for each386

respective ground-truth. Ideally, for each participant, the highest reconstruc-387

tion correlation would be achieved by the k-dimensional reconstruction that388

17


https://doi.org/10.1101/232454


2 4 6winning model

2

4

6

grou

nd tr

uth

0.10.20.30.40.50.60.70.80.9

grou

nd tr

uth

winning model

Figure 5: Results from the simulation. Confusion matrix indicating with whichpercentage a 2, 4, or 6 dimensional model was picked as best model given a ground-truth dimensionality of 2, 4, or 6. Values in the diagonal, that is, where thecorrect model was picked for a given ground-truth, were consistently higher thanoff-diagonal values.

fits with the underlying ground truth. However, due to noise, deviations from389

this are possible and we aimed to assess how likely those deviations could be390

expected to occur.391

To gather a reliable estimate of the performance of our procedure, we ran392

a total of 1000 of these simulations for each ground-truth dimensionality.393

3.2. Results394

Across 1000 simulations of data with a ground-truth dimensionality of 2,395

4, or 6, we found that the highest reconstruction correlations were generally396

achieved by the low-dimensional reconstruction of the data that matched397

the ground-truth (see Figure 5), with 93.9%, 85.7%, and 76.7% correctly398

classified, respectively.399

As described earlier, key to our method is the fact that keeping more400

components than actually underly an observed pattern results in a reduced401

reconstruction correlation, thereby allowing us to identify the best dimen-402

sionality estimate based on the achieved reconstruction correlation. Figure403

1 B illustrates how the reconstruction correlations drop as components are404

added that do not carry signal for the case of a true underlying dimensionality405

of 4.406

18


https://doi.org/10.1101/232454


3.3. Discussion407

By applying our procedure to simulated fMRI data with different under-408

lying ground truth dimensionality, we tested how well estimated dimension-409

alities match with true dimensionalities. The results confirm the validity of410

our approach, showing that for data with reasonable signal-to-noise ratio,411

estimated dimensionalities match closely the underlying ground truth. One412

observation is that estimates become more confusable at higher dimension-413

alities.414

4. Data sets415

Following the successful tests of our procedure with simulated data, we416

applied our method to three different, previously published fMRI datasets, all417

employing visual stimuli and testing healthy populations. We tested three418

core aims of our method: 1) Identifying areas carrying functional dimen-419

sionality, 2) Using functional dimensionality to assess sensitivity to stimulus420

features, and 3) Measuring task-dependent differences in dimensionality.421

4.1. Identifying areas carrying functional dimensionality422

Using data from a category learning study by Mack et al. (2013), we aimed423

to identify areas carrying functional dimensionality and compare them with424

the areas found by the original authors’ model-based analysis. Model-based425

analyses make specific assumptions about representational geometry that426

our approach does not. Furthermore, these analyses require some underlying427

dimensionality to identify an area. Therefore, we expected our method to428

reveal significant functional dimensionality in all areas that were reported in429

the original study, as well as additional areas that were reliably modulated by430

the task in a way that was not captured by the model tested in the original431

publication.432

4.1.1. Methods433

Participants were trained on categorizing nine objects that differed on four434

binary dimensions: shape (circle/triangle), color (red/green), size (large/small),435

and position (left/right). During the fMRI session, participants were pre-436

sented with the set of all 16 possible stimuli and had to perform the same437

categorization task. Out of 23 participants, 20 were included in the final438

analysis presented here, with 19 participants completing 6 runs composed of439

48 trials and one participant completing 5 runs.440

19


https://doi.org/10.1101/232454


Standard pre-processing steps were carried out using SPM12 (Penny et al.,441

2006) and beta estimates were derived from a GLM containing one regres-442

sor per stimulus (16 in total, see Supplemental Materials for details). The443

dataset was retrieved from osf.io/62rgs.444

We ran a whole-brain searchlight with a 7mm radius sphere to estimate445

which brain areas carry signal with functional dimensionality, that is, signal446

that could be reliably predicted across runs based on a low-dimensional recon-447

struction. For each searchlight, data were pre-whitened and mean-centered448

as described above. Dimensionality estimation was performed as previously449

described and the resulting j correlations and dimensionality estimates were450

ascribed to the center of the searchlight. The code for the searchlight was451

based on the RSA toolbox (Nili et al., 2014).452

For each voxel, the j correlation coefficients were averaged and their sig-453

nificance was assessed via non-parametric one-sample t-tests across subjects454

using FSL’s randomise function (Winkler et al., 2014). Results were family-455

wise error (FWE) corrected using a TFCE threshold of p < .05.456

In their original analysis, the authors fit a cognitive model to participants457

classification behavior to estimate attention-weights to the single stimulus458

features. Based on these attention weights, they derived model-based simi-459

larities between stimuli and used RSA to examine which brain regions show460

a representational geometry that matches with these predictions. We repli-461

cated this analysis using the same beta estimates that were passed on to the462

dimensionality estimation in order to maximize comparability of the two ap-463

proaches. As for estimating the dimensionality, we ran a whole-brain search-464

light with a 7mm radius sphere (based on the RSA toolbox, Nili et al., 2014).465

We averaged voxel response patterns across runs and calculated the repre-466

sentational distance matrices (RDM) as all pairwise 1−Pearson correlation467

distance. We assessed correspondence of these RDMs with the model-based468

distance matrices via Spearman correlation. The resulting Spearman corre-469

lation for each participant was assigned to the center of the searchlight and470

their significance was assessed via non-parametric one-sample t-tests across471

subjects using FSL’s randomise function (Winkler et al., 2014). Results were472

family-wise error (FWE) corrected using a TFCE threshold of p < .05.473

4.1.2. Results474

We aimed to identify areas that show functional dimensionality and ex-475

amine how those overlap with the authors’ original findings implementing a476

model-based analysis. We found significant dimensionality (i.e., reconstruc-477

20


osf.io/62rgs

https://doi.org/10.1101/232454


model-based RSA of stimulus similarities

functional dimensionality

overlap

Figure 6: Areas that showed significant functional dimensionality (green), signif-icant fit with the RSA comparing neural representational similarity with model-based predictions of stimulus similarity (orange), or both (yellow). FWE-correctedusing a TFCE threshold of p < .05. Notably, our method identifies large clustersof functional dimensionality in prefrontal cortex, indicating that areas here wereconsistently engaged by the task, though their patterns did not fit with the imple-mented cognitive model.

tion correlations) in an extended network of occipital, parietal and prefrontal478

areas (see Figure 6). In these areas, signal was reliable across runs and showed479

functional dimensionality.480

As can be seen in Figure 6, our method successfully identified all areas481

that were found in the original model-based analysis, which bolsters the482

authors original interpretation of their results. Notably, we were able to483

identify further areas that did not show a fit with the implemented attention-484

based model, suggesting that signal changes in those areas reflect a different485

aspect of the task space than captured by the cognitive model.486

4.1.3. Discussion487

Within the first dataset, we showed that by identifying areas with signifi-488

cant functional dimensionality, it is possible to reveal areas that can plausibly489

be tested for correspondence with a hypothesized representational similarity490

structure, as for instance derived from a cognitive model. More specifically,491

we were able to identify all areas that have been reported in the original492

analysis by Mack et al. (2013) to show a representational similarity as pre-493

dicted by a cognitive model. Additionally, we found further areas that had494

not been revealed in the original analysis to show functional dimensionality.495

This indicates that those areas have a reliable functional dimensionality but496

reflect cognitive processes or task-aspects that are not captured by the cog-497

nitive model. For instance, activation in the medial BA 8 has been found498

21


https://doi.org/10.1101/232454


to correlate with uncertainty and task-difficulty (Volz et al., 2005; Huettel,499

2005; Crittenden and Duncan, 2014), suggesting that the neural patterns in500

this region in the current task might reflect processes related to the difficulty501

or category uncertainty of the categorization decision for each stimulus. To-502

gether, the findings highlight the potential of our procedure to aid evaluation503

of model performance and identify areas ahead of model-fitting.504

4.2. Using functional dimensionality to assess sensitivity to stimulus features505

Using data from a study with real-world categories and photographic506

stimuli by Bracci and Op de Beeck (2016), we tested whether different507

brain regions show functional dimensionality in response to different stimulus508

groupings (i.e., depending on how the stimulus-space is summarized). For ex-509

ample, the columns in the data matrix may be organized along either visual510

categories or shape. In this fashion, our technique could be useful in eval-511

uating general hypotheses regarding the nature and basis of the functional512

dimensionality in brain regions.513

4.2.1. Methods514

During the experiment, participants were presented repeatedly with 54515

different natural images that were of nine different shapes and belonged to six516

different categories (minerals, animals, fruit/vegetables, music instruments,517

sport instruments, tools), allowing the authors to dissociate between neural518

responses reflecting shape or category information.519

Standard pre-processing of the data was carried out using SPM12 (see520

Supplemental Material for details). In line with the authors original analysis,521

we tested for differences depending on whether the stimuli were averaged to522

emphasize their category or shape information. To that end, we constructed523

two separate GLMs. The first GLM (catGLM) was composed of one regressor524

per category (six in total), thus averaging across objects shapes. The second525

GLM (shapeGLM) consisted of nine different regressors, one for each shape,526

averaging neural responses across object categories. In both GLMs, regres-527

sors were convolved with the HRF and six motion-regressors as covariates of528

no interest were included.529

Dimensionality was estimated separately for both GLMs. We ran a whole-530

brain searchlight with a 7mm sphere on the beta estimates of the respective531

GLM, again pre-whitening and mean-centering voxel patterns within each532

searchlight before estimating the dimensionality. Reconstruction correlations533

were averaged across runs for each participant and tested for significance534

22


https://doi.org/10.1101/232454


category GLM

shape GLM

overlap

Figure 7: Areas showing significant functional dimensionality for the shape GLM(green), the category GLM (orange), or both (yellow). Results are FWE-correctedusing an TFCE threshold of p < .05. Across both GLMs, posterior and parietalregions show functional dimensionality. Prefrontal regions show more pronouncedfunctional dimensionality for the category GLM, in line with the original findings.

across participants using FSL’s randomise function (Winkler et al., 2014).535

Results were FWE corrected using a TFCE threshold of p < .05.536

4.2.2. Results537

When testing for functional dimensionality for the shape-sensitive GLM,538

we found significant reconstruction correlations in bilateral posterior occipito-539

temporal and parietal regions, indicating functional dimensionality in these540

areas. Additionally, a significant cluster was revealed in the left lateral pre-541

frontal cortex (see Figure 7). Testing for functional dimensionality for the542

category-sensitive GLM also revealed strong significant correlations in occip-543

ital and posterior-temporal regions, but notably showed more pronounced544

correlations in bilateral lateral and medial prefrontal areas as well. This is545

in line with the authors original findings that showed that neural patterns in546

parietal and prefrontal ROIs correlated more strongly with a model reflect-547

ing category similarities, whereas shape similarities were largely restricted to548

occipital and posterior temporal ROIs.549


With the second dataset, we tested whether different areas are identi-551

fied to express significant functional dimensionality depending on how the552

underlying task-space is summarized. In line with the original authors’553

findings (Bracci and Op de Beeck, 2016), we found more pronounced func-554

tional dimensionality in prefrontal regions for the GLM emphasizing the555

category-information across stimuli, compared to the one focusing on shape-556

23


https://doi.org/10.1101/232454


information. Likewise, functional dimensionality in occipital regions was557

more pronounced for the shape-based GLM.558

However, compared to the authors’ original findings, we did not find a559

sharp dissociation between shape and category. For example, we find both560

shape and category dimensionality present in early visual regions and shape561

dimensionality extending into frontal areas. As discussed in the previous562

section, our method provides a general test of dimensionality whereas the563

original authors evaluate specific representational accounts that make ad-564

ditional assumptions about shape and category similarity structure. Com-565

paring results suggest that to some degree the dissociation found in Bracci566

and Op de Beeck (2016) rests on these specific assumptions. A more gen-567

eral test of functional dimensionality, for stimuli organized along shape or568

category, provides additional information to assist in interpreting the cogni-569

tive function of these brain regions, which complements testing more specific570

representational accounts.571

4.3. Measuring task-dependent differences in dimensionality572

In this third dataset, we consider whether the underlying dimensional-573

ity of neural representations changes as a function of task. In Mack et al.574

(2016), participants learned a categorization rule over a common stimulus575

set that either depended on one or two stimulus dimensions. We predicted576

that the estimated functional dimensionality, as measured by our hierarchi-577

cal Bayesian method, should be higher for the more complex categorization578

problem, extending the original authors’ findings.579

4.3.1. Methods580

Participants learned to classify bug stimuli that varied on three binary581

dimensions (mouth, antenna, legs) into two contrasting categories based on582

trial-and-error learning. Over the course of the experiment, participants583

completed two learning problems (in counterbalanced order). Correct classi-584

fication in type I problem required attending to only one of the bugs features,585

whereas classification in type II problem required combining information of586

two features in an exclusive-or manner.587

Previous research has shown that neural dimensionality appropriate for588

the problem at hand is linked to successful task performance (Rigotti et al.,589

2013). Thus, we hypothesized that dimensionality of the neural response590

would be higher for type II compared to type I in areas known to process591

visual features, as for instance lateral occipito-temporal cortex (LOC; see e.g.592

24


https://doi.org/10.1101/232454


Eger et al., 2008). We included data from 22 participants in our analysis (one593

participant was excluded due to artifacts in the fMRI data, please refer to594

the Supplemental Material for further details on the experiment and data595

preprocessing). The dataset was retrieved from osf.io/5byhb.596

In order to infer the degree of functional dimensionality, we estimated it597

across ROIs encompassing LOC in the left and right hemisphere separately598

for the two categorization tasks. Because the relevant stimulus dimensions599

were learned through trial-and-error learning, we excluded the first functional600

run (early learning) of each problem and analyzed the remaining three runs601

for each problem.602

Prior to estimating the dimensionality, data were pre-whitened and mean-603

centered. Dimensionality was estimated across all voxels for each ROI and604

problem, resulting in 3 (runs) × 2 (ROIs) × 2 (problems) correlation co-605

efficients and dimensionality estimates. Correlation coefficients were aver-606

aged per participant, ROI and problem and tested for significance using607

one-sample t-tests. To derive the best population estimate for the under-608

lying dimensionality for each ROI and problem, we implemented the above609

described hierarchical Bayesian model. To that end, we calculated mean and610

standard deviation of each participant’s dimensionality estimate per ROI611

and problem and used those summary statistics to estimate the degree of612

underlying dimensionality for each ROI and problem.613

4.3.2. Results614

Estimating dimensionality across two different ROIs in LOC and two615

different tasks allowed us to test whether the estimated dimensionality differs616

across problems with different task-demands. As participants had to pay617

attention to one stimulus feature in the type I problem and two stimulus618

features in the the type II problem, we hypothesized that dimensionality of619

the neural response would be higher for type II compared to type I in an620

LOC ROI.621

Both ROIs showed significant reconstruction correlations across both tasks622

(lLOC, type I: t21 = 3.08, p = .006; rLOC, type I: t21 = 2.21, p = .038; lLOC,623

type II: t21 = 3.03, p = .006; rLOC, type II: t21 = 3.37, p = .003). This shows624

that signal in the LOC showed reliable functional dimensionality across runs625

for both problem types, which is a prerequisite for estimating the degree of626

functional dimensionality.627

To estimate whether the dimensionality differed across problems, we ana-628

lyzed the data by implementing a multilevel Bayesian model using Stan (The629

25


osf.io/5byhb

https://doi.org/10.1101/232454


A B

0 1 2 3 4 5 6 7 8estimated dimensionality; right LOC

problem 1problem 2

0 1 2 3 4 5 6 7 8estimated dimensionality; left LOC

type I problem

0 1 2 3 4 5 6 7 8estimated dimensionality; right LOC

problem 1problem 2type II problem

Figure 8: Results of estimating functional dimensionality for two different catego-rization problems. A: Outline of the two ROIs in left and right LOC. B: Histogramsof posterior distributions of estimated dimensionalities in left and right LOC forthe type I and II problems. Dimensionalities were estimated by implementing sep-arate multilevel models for each ROI and model using Stan. Across both ROIs,the peak of the posterior distributions of the estimated dimensionality for type IIwas higher than for type I, mirroring the structure of the two problems.

Stan Development Team, 2017), see Figure 2 for an illustration of the model.630

As hypothesized, the estimated underlying dimensionality was higher for the631

type II problem compared to type I (type I: µleft = 2.92 (CI 95% : 1.33, 4.33),632

µright = 2.66 (CI 95% : 1.23, 4.14); type II: µleft = 4.74 (CI 95% : 3.20, 6.46),633

µright = 4.69 (CI 95% : 3.56, 6.06), see Figure 8).634


Besides knowing which areas show neural patterns with functional di-636

mensionality, an important question concerns the degree of the underlying637

dimensionality. Using data from a categorization task where participants638

had to attend to either one or two features of a stimulus, we demonstrate639

how our method can be used to test whether the degree of underlying dimen-640

sionality of neural patterns varies with task demands. A notable strength641

of the dataset for our research question is that the authors used the same642

stimuli in a within-subject paradigm, counterbalancing the order of the two643

categorization tasks across subjects. This allowed us to investigate how the644

dimensionality of a neural pattern changes with task, while controlling for645

possible effects due to differences in signal-to-noise ratios across participants646

or brain regions.647

Our results show that, as expected, the degree of underlying functional648

dimensionality is higher when the task required attending to two stimulus649

26


https://doi.org/10.1101/232454


features instead of only one. Notably, this assumption was implicit to the650

conclusions drawn by the authors in the original publication (Mack et al.,651

2016). The authors analyzed neural patterns in hippocampus and imple-652

mented a cognitive model to show that stimulus-specific neural patterns were653

stretched across relevant compared to irrelevant dimensions. Thus, irrelevant654

dimensions were compressed and the dimensionality of the neural pattern655

was reduced the less dimensions were relevant to the categorization problem.656

Our approach allows to directly assess this effect without the need of fitting657

a cognitive model.658

5. General Discussion659

Multivariate and model-based analyses of fMRI data have deepened our660

understanding of the human brain and its representational spaces (Norman661

et al., 2006; Kriegeskorte and Kievit, 2013; Haxby et al., 2014; Turner et al.,662

2017). However, before evaluating specific representational accounts, it is663

sensible to first ask the more basic question of whether brain areas displays664

functional dimensionality more generally. Here, we presented a novel ap-665

proach to estimate an area’s functional dimensionality by a combined SVD666

and cross-validation procedure. Our procedure identifies areas with signif-667

icant functional dimensionality and provides an estimate, reflecting uncer-668

tainty, of the degree of underlying dimensionality. Across three different data669

sets, we confirmed and extended the findings from the original contributions.670

After verifying the operation of the method with a synthetic (simulated)671

dataset in which the ground truth dimensionality was known, we applied672

our method to three published fMRI datasets. In each case, the procedure673

confirmed and extended the authors’ original findings, advancing our un-674

derstanding of the function of the brain regions considered. Each of three675

datasets highlighted a potential use of estimating functional dimensionality.676

In the first study, working with data from Mack et al. (2013), we demon-677

strated that testing for functional dimensionality can complement model-678

based fMRI analyses that evaluate more specific representational hypothe-679

ses. First, one cannot find a rich relationship between model representations680

and brain measures when there is no functional dimensionality in regions681

of interest. Second, there might be additional areas that display significant682

functional dimensionality that do not show correspondence with the model.683

These additional areas invite further analysis as they might implement684

processes and representations outside the scope of the tested model. Func-685

27


https://doi.org/10.1101/232454


tional dimensionality can indicate interesting unexplained signal. For exam-686

ple, in the first dataset examined, functional dimensionality was found in687

all the areas identified by Mack et al. (2013), plus medial BA 8, which is a688

candidate region for task difficulty and response conflict (see Alexander and689

Brown, 2011, for a model of medial prefrontal cortex function), which was690

not the authors’ original focus but may merit further study.691

In the second study, working with data from Bracci and Op de Beeck692

(2016), we demonstrated how stimuli could be grouped or organized in differ-693

ent fashions to explore how dimensional organization varies across the brain.694

In this case, the data matrix was either organized along shape or category.695

We found neural patterns of shape and category selectivity consistent with696

the authors’ original results. However, we found the selectivity to be more697

mixed in our analyses and identified additional responsive regions, mirroring698

our results when we considered data from Mack et al. (2013).699

Our method may have been more sensitive to signal because it makes700

fewer assumptions about the underlying representational structure and al-701

lows for individual differences in the underlying dimensions. In this sense,702

assessing functional complexity complements existing analysis procedures.703

Indeed, our approach could be used to evaluate multiple stimulus groupings704

to inform feature selection in encoding models (Diedrichsen and Kriegeskorte,705

2017; Naselaris et al., 2011).706

In a third study, working with data from Mack et al. (2016), we evaluated707

whether our method could identify changes in task-driven dimensionality. By708

combining estimates of functional dimensionality with a hierarchical Bayesian709

model, we found that the functional dimensionality in LOC was higher when710

a category decision required using two features rather than one. These results711

are consistent with the original authors’ theory but were hitherto untestable.712

In summary, assessing functional dimensionality across these three studies713

complemented the original analyses and revealed additional nuances in the714

data. In each case, our understanding of the neural function was further715

constrained. Moreover, comparing the results to those from model-based716

and other multivariate approaches was informative in terms of understanding717

underlying assumptions and their importance.718

Of course, as touched upon in the Introduction, there are many possible719

ways to assess dimensional structure in brain measures and progress has been720

made on this challenge (Rigotti et al., 2013; Machens et al., 2010; Rigotti and721

Fusi, 2016; Diedrichsen et al., 2013; Bhandari et al., 2017). Here, our aim722

was to specify a general, computational efficient, robust, and relatively simple723

28


https://doi.org/10.1101/232454


and interpretable procedure that can easily be applied to whole brain data724

to first test for statistical significant functional dimensionality and, if found,725

to provide an estimate of its magnitude using Bayesian hierarchical modeling726

to make clear the uncertainty in that estimate.727

We hope our contribution is useful to researches interested in further728

exploring their data, whether it be fMRI, MEG, EEG, or single-cell record-729

ings. Researchers may consider variants of our method. For example, as730

mentioned in the Introduction, the SVD could be substituted with another731

procedure depending on the needs and assumptions of the researchers. There732

is no magic bullet to the difficult problems of estimating the underlying di-733

mensionality of noisy neural data, but we have made progress on this issue734

both theoretically and practically. In doing so, we have also provided addi-735

tional insights into the brain basis of visual categorization. We hope that736

by demonstrating the merits of estimating the functional dimensionality of737

neural data that we motivate others to take advantage of this additional and738

complementary viewpoint on neural function.739

6. Data availability740

A Matlab toolbox for estimating functional dimensionality of fMRI data741

as well as data needed to replicate the analyses presented here will be made742

available after publication. Nifti files and code for the analyses presented743

here are available from the authors upon request.744

7. Acknowledgments745

This work was funded by the National Institutes of Health [grant number746

1P01HD080679]; the Leverhulme Trust [grant number RPG-2014-075]; and747

Wellcome Trust Senior Investigator Award [WT106931MA] to Bradley C.748

Love. Correspondences regarding this work can be sent to [email protected]

or [email protected]. The authors are grateful to all study authors for sharing750

their data and wish to thank all members of the LoveLab and Jan Balaguer751

for valuable input. Declarations of interest: none.752

29


mailto:[email protected]

mailto:[email protected]

https://doi.org/10.1101/232454


8. References753

Alexander, W. H. and Brown, J. W. (2011). Medial prefrontal cortex as an754

action-outcome predictor. Nature Neuroscience, 14(10):1338–1344.755

Bhandari, A., Rigotti, M., Gagne, C., Fusi, S., and Badre, D. (2017). Char-756

acterizing human prefrontal cortex representations with fMRI. In Society757

for Neuroscience, Washington, DC:.758

Bracci, S., Daniels, N., and Op de Beeck, H. (2017). Task Context Overrules759

Object- and Category-Related Representational Content in the Human760

Parietal Cortex. Cerebral Cortex, pages 1–12.761

Bracci, S. and Op de Beeck, H. (2016). Dissociations and associations be-762

tween shape and category representations in the two visual pathways. Jour-763

nal of Neuroscience, 36(2):432–444.764

Cattell, R. B. (1947). Confirmation and clarification of primary personality765

factors. Psychometrika, 12(3):197–220.766

Cattell, R. B. (1966). The Scree Test For The Number Of Factors. Multi-767

variate Behavioral Research, 1(2):245–276.768

Crittenden, B. M. and Duncan, J. (2014). Task difficulty manipulation re-769

veals multiple demand activity but no frontal lobe hierarchy. Cerebral770

Cortex, 24(2):532–540.771

Davis, T., LaRocque, K. F., Mumford, J. A., Norman, K. A., Wagner, A. D.,772

and Poldrack, R. A. (2014). What do differences between multi-voxel and773

univariate analysis mean? How subject-, voxel-, and trial-level variance774

impact fMRI analysis. NeuroImage, 97:271–283.775

Diedrichsen, J. and Kriegeskorte, N. (2017). Representational models: A776

common framework for understanding encoding, pattern-component, and777

representational-similarity analysis. PLoS Computational Biology, 13(4):1–778

33.779

Diedrichsen, J., Wiestler, T., and Ejaz, N. (2013). A multivariate method780

to determine the dimensionality of neural representation from population781

activity. NeuroImage, 76:225–235.782

30


https://doi.org/10.1101/232454


Eckart, C. and Young, G. (1936). The approximation of one matrix by783

another of lower rank. Psychometrika, 1(3):211–218.784

Eger, E., Ashburner, J., Haynes, J.-D., Dolan, R. J., and Rees, G. (2008).785

fMRI activity patterns in human LOC carry information about object786

exemplars within category. Journal of cognitive neuroscience, 20(2):356–787

370.788

Friston, K. J., Frith, C. D., Liddle, P. F., and Frackowiak, R. S. J. (1993).789

Functional Connectivity: The Principal-Component Analysis of Large790

(PET) Data Sets. Journal of Cerebral Blood Flow & Metabolism, 13(1):5–791

14.792

Fusi, S., Miller, E. K., and Rigotti, M. (2016). Why neurons mix: High di-793

mensionality for higher cognition. Current Opinion in Neurobiology, 37:66–794

74.795

Goddard, E., Klein, C., Solomon, S. G., Hogendoorn, H., Thomas, A., and796

Klein, C. (2017). Interpreting the dimensions of neural feature represen-797

tations revealed by dimensionality reduction. NeuroImage.798

Hastie, T., Tibshirani, R., and Friedman, J. (2009). Unsupervised Learn-799

ing. In The Elements of Statistical Learning, chapter 14, pages 485–585.800

Springer, New York, NY, second edition.801

Haxby, J. V., Connolly, A. C., and Guntupalli, J. S. (2014). Decoding Neu-802

ral Representational Spaces Using Multivariate Pattern Analysis. Annual803

review of neuroscience, pages 435–456.804

Hebart, M. N. and Baker, C. I. (2017). Deconstructing multivariate decoding805

for the study of brain function. NeuroImage.806

Huettel, S. A. (2005). Decisions under Uncertainty: Probabilistic Context807

Influences Activation of Prefrontal and Parietal Cortices. Journal of Neu-808

roscience, 25(13):3304–3311.809

Huettel, S. A., Song, A. W., and McCarthy, G. (2003). Functional magnetic810

resonance imaging. Sinauer Associates, Inc., Sunderland, Massachusetts,811

USA, second edition.812

31


https://doi.org/10.1101/232454


Huth, A. G., Nishimoto, S., Vu, A. T., and Gallant, J. L. (2012). A Continu-813

ous Semantic Space Describes the Representation of Thousands of Object814

and Action Categories across the Human Brain. Neuron, 76(6):1210–1224.815

Khaligh-Razavi, S. M. and Kriegeskorte, N. (2014). Deep Supervised, but816

Not Unsupervised, Models May Explain IT Cortical Representation. PLoS817

Computational Biology, 10(11).818

Kramer, M. A. (1991). Nonlinear principal component analysis using autoas-819

sociative neural networks. AIChE Journal, 37(2):233–243.820

Kriegeskorte, N. and Kievit, R. A. (2013). Representational geometry: In-821

tegrating cognition, computation, and the brain. Trends in Cognitive Sci-822

ences, 17(8):401–412.823

Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H.,824

Tanaka, K., and Bandettini, P. A. (2008). Matching Categorical Object825

Representations in Inferior Temporal Cortex of Man and Monkey. Neuron,826

60(6):1126–1141.827

Kruschke, J. K. (2014). Doing Bayesian data analysis : a tutorial with R,828

JAGS, and Stan.829

Ledoit, O. and Wolf, M. (2004). A well-conditioned estimator for large-830

dimensional covariance matrices. Journal of Multivariate Analysis,831

88(2):365–411.832

Machens, C. K., Romo, R., and Brody, C. D. (2010). Functional, but not833

anatomical, separation of what and when in prefrontal cortex. Journal of834

Neuroscience, 30(1):350–360.835

Mack, M. L., Love, B. C., and Preston, A. R. (2016). Dynamic updating836

of hippocampal object representations reflects new conceptual knowledge.837

Proceedings of the National Academy of Sciences of the United States of838

America, 113(46):13203–13208.839

Mack, M. L., Preston, A. R., and Love, B. C. (2013). Decoding the brain’s840

algorithm for categorization from its neural implementation. Current Bi-841

ology, 23(20):2023–2027.842

32


https://doi.org/10.1101/232454


Naselaris, T., Kay, K. N., Nishimoto, S., and Gallant, J. L. (2011). Encoding843

and decoding in fMRI. NeuroImage, 56(2):400–410.844

Nichols, T. and Holmes, A. (2003). Nonparametric Permutation Tests845

for Functional Neuroimaging. Human Brain Function: Second Edition,846

15(1):887–910.847

Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., and848

Kriegeskorte, N. (2014). A Toolbox for Representational Similarity Anal-849

ysis. PLoS Computational Biology, 10(4):e1003553.850

Norman, K. A., Polyn, S. M., Detre, G. J., and Haxby, J. V. (2006). Be-851

yond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in852

Cognitive Sciences, 10(9):424–430.853

Parpart, P., Jones, M., and Love, B. (2017). Heuristics as Bayesian inference854

under extreme priors. Cognitive Psychology, in press.855

Penny, W., Friston, K., Ashburner, J., Kiebel, S., and Nichols, T. (2006).856

Statistical Parametric Mapping: The Analysis of Functional Brain Images:857

The Analysis of Functional Brain Images, volume 8. Academic press.858

Rigotti, M., Barak, O., Warden, M. R., Wang, X.-J., Daw, N. D., Miller,859

E. K., and Fusi, S. (2013). The importance of mixed selectivity in complex860

cognitive tasks. Nature, 497(7451):1–6.861

Rigotti, M. and Fusi, S. (2016). Estimating the dimensionality of neural862

responses with fMRI Repetition Suppression. Arxiv.863

Shlens, J. (2014). A tutorial on principal component analysis. arXiv.864

Smith, S. M. and Nichols, T. E. (2009). Threshold-free cluster enhancement:865

Addressing problems of smoothing, threshold dependence and localisation866

in cluster inference. NeuroImage, 44(1):83–98.867

Spearman, C. (1904). General Intelligence, Objectively Determined and Mea-868

sured. American Journal of Psychology, 15(2):201–292.869

The Stan Development Team (2017). MatlabStan: the MATLAB interface870

to Stan.871

33


https://doi.org/10.1101/232454


Turner, B. M., Forstmann, B. U., Love, B. C., Palmeri, T. J., and Van Maa-872

nen, L. (2017). Approaches to analysis in model-based cognitive neuro-873

science. Journal of Mathematical Psychology, 76:65–79.874

Volz, K. G., Schubotz, R. I., and Von Cramon, D. Y. (2005). Variants of875

uncertainty in decision-making and their neural correlates. Brain Research876

Bulletin, 67(5):403–412.877

Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., and Diedrich-878

sen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern879

analysis. NeuroImage, 137:188–200.880

Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., and Nichols,881

T. E. (2014). Permutation inference for the general linear model. Neu-882

roImage, 92:381–397.883

34


https://doi.org/10.1101/232454


estimating the functional dimensionality of neural representations · estimating the functional...

Documents