estimating the functional dimensionality of neural representations · estimating the functional...
TRANSCRIPT
Estimating the functional dimensionality of neural
representations
Christiane Ahlheima,∗, Bradley C. Lovea,b,
aDepartment of Experimental Psychology, University College London, 26 Bedford Way,London WC1H 0AP, United Kingdom
bThe Alan Turing Institute, United Kingdom
Abstract
Recent advances in multivariate fMRI analysis stress the importance of infor-mation inherent to voxel patterns. Key to interpreting these patterns is esti-mating the underlying dimensionality of neural representations. Dimensionsmay correspond to psychological dimensions, such as length and orientation,or involve other coding schemes. Unfortunately, the noise structure of fMRIdata inflates dimensionality estimates and thus makes it difficult to assess thetrue underlying dimensionality of a pattern. To address this challenge, wedeveloped a novel approach to identify brain regions that carry reliable task-modulated signal and to derive an estimate of the signal’s functional dimen-sionality. We combined singular value decomposition with cross-validationto find the best low-dimensional projection of a pattern of voxel-responsesat a single-subject level. Goodness of the low-dimensional reconstructionis measured as Pearson correlation with a test set, which allows to test forsignificance of the low-dimensional reconstruction across participants. Usinghierarchical Bayesian modeling, we derive the best estimate and associateduncertainty of underlying dimensionality across participants. We validatedour method on simulated data of varying underlying dimensionality, show-ing that recovered dimensionalities match closely true dimensionalities. Wethen applied our method to three published fMRI data sets all involving pro-cessing of visual stimuli. The results highlight three possible applications ofestimating the functional dimensionality of neural data. Firstly, it can aid
∗Corresponding authorEmail addresses: [email protected] (Christiane Ahlheim), [email protected]
(Bradley C. Love)
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
evaluation of model-based analyses by revealing which areas express reliable,task-modulated signal that could be missed by specific models. Secondly,it can reveal functional differences across brain regions. Thirdly, knowingthe functional dimensionality allows assessing task-related differences in thecomplexity of neural patterns.
Keywords: neural representations, dimensionality reduction, multivariateanalysis
2
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
1. Introduction1
A growing number of fMRI studies are investigating the representational2
geometry of voxel response patterns. For example, using representational3
similarity analysis (RSA; Kriegeskorte and Kievit, 2013), researchers have4
characterized visual object representations along the ventral stream (Khaligh-5
Razavi and Kriegeskorte, 2014) and how these representations vary across6
tasks (Bracci et al., 2017).7
Interpreting representational geometry in neural responses can be diffi-8
cult. For example, RSA tests for a hypothesized representational pattern,9
but an important and more fundamental question should be addressed first,10
namely whether there is any dimensionality to the underlying neural pattern11
and, if so, what that dimensionality is.12
Knowing whether a pattern has dimensionality should be prerequisite for13
RSA and other multivariate representational analyses because a particular14
similarity structure can only be found when there is sufficient dimensional-15
ity to represent the proposed relations. For example, searching for a flavor16
space with dimensions sweet, sour, bitter, salty and umami would be a fool’s17
errand in brain areas that contain little or no dimensionality. Furthermore,18
independent of the particular geometry, the dimensionality of a neural pat-19
tern is informative of how many features of a task are represented in a brain20
region, which can inform our understanding of an area’s function.21
There are many methods of dimensionality reduction and estimation,22
most of which involve low-rank matrix approximation and aim to maximize23
the correspondence between the original and the approximated matrix. For24
example, two common approaches to estimate the dimensionality of an ob-25
served neural or behavioral pattern are principal component analysis (PCA)26
or relatedly, multidimensional scaling (MDS).27
PCA, or the closely related factor analysis and singular value decompo-28
sition (SVD) (Hastie et al., 2009), is widely used in the study of individual29
differences and aids estimating how many latent components, or “factors”,30
underlie a pattern of (item) responses across participants, as for instance31
in the context of intelligence (Spearman, 1904) or personality tests (Cattell,32
1947). In the context of neuroimaging, PCA has been used to identify brain33
networks (Huth et al., 2012; Friston et al., 1993). PCA derives how much34
variance of the observed pattern is explained by each underlying component.35
Similarly, MDS finds the best representation of original distances in a36
low-dimensional space (Kriegeskorte and Kievit, 2013). For example, two37
3
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
stimuli like a chair and table that are very close to each other in the high-38
dimensional space will be represented closely in the low-dimensional projec-39
tion achieved by MDS, whereas two stimuli that were very distant from each40
other, for instance a chair and a bunny, will be projected far apart. MDS has41
been successfully applied to behavioral as well as neural data to reveal which42
stimulus features underly observed representational geometries (Bracci and43
Op de Beeck, 2016; Kriegeskorte and Kievit, 2013; Kriegeskorte et al., 2008),44
though it has been questioned to which extent results from MDS are inter-45
pretable (Goddard et al., 2017). For reasons outlined below, we will focus on46
SVD to estimate the dimensionality of neural representations, though other47
methods could be paired with our general approach, including nonlinear ap-48
proaches such as Nonlinear PCA (Kramer, 1991).49
Estimating the dimensionality of neural data brings its own unique chal-50
lenges. In a noise-free scenario, dimensionality can be defined as the number51
of linear orthogonal components (singular- or eigenvalues) underlying a ma-52
trix that are larger than zero (Shlens, 2014), indicating that the component53
fits some variance in the data. Unfortunately, actual recordings of neural ac-54
tivity always contain noise, which inflates non-signal components above zero55
(Fusi et al., 2016; Diedrichsen et al., 2013). This noise makes it challenging56
to determine which areas contain signal and, if so, what the dimensionality57
of the signal is.58
One criterion, which we adopt in the work reported here, is to choose59
the number of components that should maximize reconstruction accuracy60
(measured by correlation) on new data (i.e., test data). While even for61
data with low or moderate true dimensionality more components will always62
increase fit for existing data (i.e., training data), performance on test data63
(i.e., generalization, prediction) will usually be best for a moderate number of64
components because these components largely reflect true signal as opposed65
to noise in the observed training sample.66
The problem of distinguishing between signal and noise in a neural pat-67
tern is related to the bias-variance trade-off in supervised learning and model-68
selection. Overly simple models (few components) are highly biased, fitting69
training data poorly and not performing well on test data. These overly70
simple models cannot pick-up on nuances in the signal. Conversely, overly71
complex models (many components) are too sensitive to the variance in the72
training date (i.e., overfit). Although they fit the training data very well,73
overly complex models treat noise in the training data as signal and, there-74
fore, generalize poorly. Thus, the sweet spot for test performance should be75
4
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
at some moderate number of components that largely reflect true signal (see76
Figure 1 A). Thus, identifying the true number of underlying components is77
analogous to deciding which model best explains the data.78
One naive way to navigate this trade-off between simple and complex79
models is to use some arbitrary cutoff, such as including the number of com-80
ponents that captures some amount of variance in the training data or decid-81
ing based on visual inspection which components may carry signal (known82
as scree plot, Cattell, 1966). In the case of fMRI, where the signal-to-noise83
ratio depends on multiple factors like scanner settings, experimental design,84
and physiological activity (Huettel et al., 2003), estimating the underlying85
dimensionality based on an arbitrary cut-off criterion for explained variance86
could be misleading. Likewise, although identifying relevant components87
via visual inspection works for small datasets, it is not applicable to large88
datasets as fMRI data, as it would require a manual decision for each voxel.89
Furthermore, the size of fMRI datasets (usually thousands of voxels) calls90
for a computationally efficient and automated approach, making estimating91
the dimensionality for the whole brain feasible. Thus, for neuroimaging data,92
there is a need for an efficient, systematic and objective approach that can93
both identify areas with statistical significant dimensionality and provide a94
useful estimate of the underlying dimensionality.95
Previous efforts to estimate the dimensionality of neural response pat-96
terns have applied linear classifiers to neural data to evaluate dimensionality97
(Rigotti et al., 2013; Diedrichsen et al., 2013). Rigotti et al. (2013) were able98
to show that dimensionality of single-cell recordings in monkey PFC is linked99
to successful task-performance, indicating that dimensionality of neural pat-100
terns is task-sensitive. In line with this, Diedrichsen et al. (2013) showed101
that the dimensionality of motor cortex representations differs depending on102
the task. Using a combination of PCA and linear Gaussian classifiers, the103
authors showed that motor cortex representations of different force levels104
are low dimensional, whereas usage of different fingers was associated with105
multidimensional neural patterns (Diedrichsen et al., 2013). Notably, both106
studies focused on estimating task-related changes in dimensionality in a pre-107
scribed brain region, rather than estimating which areas across the brain had108
significant dimensionality.109
In the present work, we expand on previous contributions by evaluat-110
ing a novel approach that, in a robust and computationally efficient manner,111
tests for which areas display statistically significant dimensionality, estimates112
the dimensionality, and provides an indication of the certainty of the esti-113
5
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
mate. We combine singular value decomposition (SVD) and cross-validation114
to identify areas across the brain with underlying dimensionality. We derive115
which of all possible low-dimensional reconstructions of the fMRI signal is the116
best dimensionality estimate of a held-out test run, and quantify the good-117
ness of the low-dimensional reconstruction via Pearson correlation. Using a118
cross-validation procedure to identify the best dimensionality estimate boosts119
that only components that carry signal and thus generalize to new data are120
kept. By assessing the significance of the correlation, we can distinguish be-121
tween areas that show reliable signal with underlying dimensionality vs. areas122
that do not show a reliable task-modulation. After establishing significant123
functional dimensionality, we use Bayesian modeling to derive a population124
estimate and associated uncertainty of the degree of dimensionality. We will125
refer to this task-dependent dimensionality as functional dimensionality.126
Through simulations and evaluation of three (published) fMRI datasets,127
we find that our method successfully identifies areas with significant func-128
tional dimensionality and provides reasonable estimates of the underlying129
dimensionality. In the first fMRI dataset, participants performed a catego-130
rization task which required differential attention to various stimulus features131
(Mack et al., 2013). The second study investigated shape- and category spe-132
cific neural responses to the presentation of natural images (Bracci and Op de133
Beeck, 2016). The third study involved categorization tasks that varied sys-134
tematically in their attentional demands (Mack et al., 2016), which we predict135
should affect functional dimensionality.136
Across all three studies, we were able to identify areas carrying functional137
dimensionality in a manner that supported and extended the original find-138
ings. Focusing on wholebrain effects in the the first two studies, we identified139
a consistent network of areas showing functional dimensionality during vi-140
sual stimulus processing. This network encompassed areas that were reported141
by the original authors as being task-relevant, identified through represen-142
tational similarity analysis and cognitive model fitting (Bracci and Op de143
Beeck, 2016; Mack et al., 2013). Furthermore, functional dimensionality was144
revealed in additional areas, highlighting the sensitivity of our method and145
suggesting that reliable task-modulated signal was present that was not ex-146
plained by the models the original authors tested. In the last study, we147
combined a region-of-interest approach and multilevel Bayesian modeling to148
show that dimensionality varied depending on task-requirements, which fol-149
lows from the original authors’ claims but remained untested until now (Mack150
et al., 2016). We outline how the notion and identification of functional di-151
6
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
A
1 3 5 7 9 11 13 15number of components
0
0.05
0.1
0.15
0.2
0.25
0.3
corre
latio
n
B
number of components
corre
latio
ntrainingtest
overfitting
number of components number of components
Figure 1: Illustration of the concept of overfitting and generalizability. A: Asmore components are added to a low-dimensional reconstruction, the correlationbetween the training data and the reconstruction approaches the maximum of 1for a full-dimensional reconstruction (purple curve). Adding components is equiv-alent to adding model parameters to improve fit, which reduces the model’s biasand increases its variance. For the correlation between the reconstructed train-ing and independent test data (red curve), adding components initially improvesperformance but at some point reduces performance due to overfit (see Parpartet al., 2017, for a related illustration). B: Reconstruction correlations achieved byall possible low-dimensional reconstructions for a simulated ground-truth dimen-sionality of 4. Reconstruction correlations rise as more components are added upto the point where the true dimensionality is reached, and decrease afterwards.Results are averaged across 6 runs and 1000 simulated voxel patterns with varyingsignal-to-noise ratios.
mensionality can aid the analysis and understanding of neuroimaging data152
in various ways.153
2. General Methods154
Neuroimaging data, such as fMRI, M/EEG, or single-cell recordings, can155
be represented as a matrix of n voxels, neurons, or sensors × m conditions.156
For example, BOLD response patterns in the fusiform face area (FFA) to157
3 different stimulus conditions can be expressed as a matrix Y of the size158
n (number of voxels) × 3 (face, house, or tool stimulus condition). The159
maximum possible dimensionality is the minimum of n and m, which in this160
example would be 3, assuming many voxels in FFA were included in the161
7
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
analysis. However, functional dimensionality could be lower. For example,162
dimensionality would be lower if the region only responded to face stimuli163
and showed the same lower response to house and tool stimuli.164
Various methods exist that allow to estimate a matrix’s dimensionality165
and a review of all of them is beyond the scope of this paper. The approach we166
present here is modular and estimates a matrix’s dimensionality by combining167
low-rank approximation with cross-validation and significance testing. This168
modularity allows to flexibly choose the dimensionality reduction technique169
which best fits with ones requirements. Here, we used SVD (which is often170
used to compute PCA solutions) because it is a well-understood, easy to171
implement, and a computationally efficient low-rank matrix approximation.172
The choice of SVD, as well as how the data matrix is normalized is in-173
formed by our understanding of the underlying neural signal. Because voxels174
differ greatly from one another in their overall activity level and activity lev-175
els can drift over runs, we demean each row (i.e., voxel) of the data matrix by176
run. In contrast, we do not demean each column, as would typically be done177
with approaches that focus on the covariance of the column vectors (e.g.,178
PCA). The reason we do not normalize by column (i.e., condition) is that we179
are open to the possibility that different stimuli may be partially coded by180
overall activity levels of a population of voxels. For example, imagine a brain181
area only responds strongly to faces, but not to other stimuli. An SVD with182
demeaned voxels (i.e., rows) would be sensitive to this dimension of represen-183
tation, whereas a procedure that effectively worked with demeaned columns184
would not be sensitive to this task-driven difference in neural activity (see185
Davis et al., 2014; Hebart and Baker, 2017, for a related discussion).186
In the following section, we describe how a combination of SVD and187
cross-validation can be used to test whether an observed neural pattern can188
be successfully reconstructed using a low-rank approximation, assessed as a189
significant Pearson correlation between a low-rank approximation and a held190
out test set, and how this technique provides an estimate of the pattern’s un-191
derlying dimensionality (see Figure 2 for an overview of all steps). As all our192
examples are fMRI data sets, we will describe the steps using fMRI termi-193
nology, though the procedure could be applied to any type of neuroimaging194
data. We provide the code and data to replicate the analyses presented here195
and for use on other datasets at osf.io/tpq92.196
8
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
j correlation coefficientsper participant
significantnot significant
no dimensionality inferred
r1,i…rj,ir1,1…rj,1
i
r1…ri
averaging across j
t-test
r1…ri ,
sd(r1)…
sd(ri)
Bayesian modeling
NV (µ, σ)
dimensionality = NV(µ, σ)
j-2 training
validation
test
j matrices of beta estimates (pre-whitened, mean-centered
if necessary)
all low-dimensional reconstructions of training data
SVD
rmax with validation data ➞ dimensionality estimate k
r
k-dimensional reconstruction of training + validation, correlate
with test r
Step 2
Step 3
Step 4
Step 1
9
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Figure 2 (previous page): Step 1: Prior to dimensionality estimation, raw dataare pre-processed with preferred settings and software and beta estimates derivedfrom a GLM are obtained for each condition of interest. The resulting j matricesof size n (number of voxels) ×m (number of conditions) are pre-whitened andmean-centered (by row, i.e., voxel) to remove baseline differences across runs.Step 2: a combination of cross-validation and SVD is implemented to find the bestdimensionality estimate k for each run j. Pearson correlations between all possiblelow-dimensionality reconstructions of the data and a held-out test set quantify thegoodness of each reconstruction for each run j (see Figure 3 for details). Step 3: theresulting j correlations are averaged for each participant and tested for significance,for instance using t-tests, across all participants. Step 4: If the reconstructioncorrelations are significant across participants, a hierarchical Bayesian model canbe used to derive the best estimate of the degree of functional dimensionality (seeFigure 4 for details). For each participant, the average estimated dimensionalityand standard deviation of this estimate is calculated and a population estimateand respective standard deviation (uncertainty in the estimate) is derived acrossall participants.
2.1. Step 1: Data pre-processing197
We developed the presented method with application to fMRI data in198
mind, though it can be easily adapted to fit requirements of single cell record-199
ings or M/EEG data. The method takes beta estimates resulting from a200
GLM fit to the observed BOLD response as input. In all studies presented201
here, standard pre-processing steps were performed using SPM 12 (Wellcome202
Department of Cognitive Neurology, London, United Kingdom), but the pre-203
cise nature of the preprocessing and implemented GLM is not critical to our204
method. Functional data were motion corrected, co-registered and spatially205
normalized to the Montreal Neurological Institute (MNI) space.206
To reduce the impact of the structured noise, which is correlated across207
voxels, on the dimensionality estimation and to improve the reliability of208
multivariate voxel response patterns (Walther et al., 2016), we applied mul-209
tivariate noise-normalization, that is, spatial pre-whitening, before estimat-210
ing the functional dimensionality. We used the residual time-series from the211
fitted GLM to estimate the noise covariance Σnoise and used regularization to212
shrink it towards the diagonal (Ledoit and Wolf, 2004). Each n×m matrix213
of beta estimates Y was then multiplied by Σ− 1
2noise (Walther et al., 2016).214
In fMRI data, the baseline activation can differ across functional runs.215
10
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
This has important implications for our approach presented here, as it can216
bias the correlation between neural patterns across runs. To account for this,217
we demeaned the pre-whitened beta estimates across conditions, resulting in218
an average estimate of zero for each voxel. This demeaning reduces the219
possible maximum dimensionality of the data to kmax = m − 1. Notably,220
demeaning of voxels is conceptually different from demeaning conditions,221
which would have been implemented by PCA, as it preserves differences222
between conditions, whereas PCA would remove those.223
2.2. Step 2: Evaluating all possible SVD (dimensional) models224
The dimensionality of a matrix is defined as its number of non-zero sin-225
gular values, identified via singular value decomposition (SVD). SVD is the226
factorization of an observed n × m matrix M of the form UΣV ᵀ. U and227
V are matrices of size m × m and n × n, respectively, and Σ is an n × m228
matrix, whose diagonal entries are referred to as the singular values of M .229
A k-dimensional reconstruction of the matrix M can be achieved by only230
keeping the k largest singular values in Σ and replacing all others with zero,231
resulting in Σ. This is known as Eckart-Yong theorem (Eckart and Young,232
1936), leading to equation 1:233
M = UΣV ᵀ (1)
To estimate the dimensionality of fMRI data, we applied SVD to j(number234
of runs) matrices Y of n(number of voxel) × m(number of beta estimates),235
with the restriction of n > m.236
Critically, fMRI beta estimates are noisy estimates of the true signal.237
In the presence of noise, all singular values of a matrix will be non-zero,238
requiring the definition of a cut-off criterion to assess the number of singular239
values reflecting signal. Removing noise-carrying components from a matrix240
is beneficial, as it avoids overfitting to the noise and thus, improves the241
generalizability of the low-dimensional reconstruction to another sample (see242
Figure 1 A for an illustration of the concept of overfitting). We aimed to avoid243
any subjective (arbitrary) criterion as percentage of explained variance or244
alike (Cattell, 1966). To that end, we implemented a nested cross-validation245
procedure at the core of our method to identify singular values that carry246
signal (see step 1 of the general overview depicted in Figure 2 and Figure 3247
for a detailed illustration of the cross-validation approach). This allows us248
to overcome the inflation of dimensionality of fMRI data due to noise and249
test which areas of the brain carry signal with functional dimensionality.250
11
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Data are partitioned j × (j − 1) times into training (Ytrain), validation251
(Yval), and test (Ytest) data. The (demeaned and pre-whitened) j−2 training252
runs are averaged, and SVD is applied to the resulting n×m matrix Ytrain.253
We then build all possible low-dimensional reconstructions of the averaged254
training data, with dimensionality ranging from 1 to m−1. Low-dimensional255
reconstructions are generated by keeping only the k highest singular values256
and setting all others to zero. Each low-dimensional reconstruction of matrix257
Ytrain is correlated with the held-out Yval. This is repeated for each possible258
partitioning in training and validation, resulting in j−1 × m−1 correlation259
coefficients. Correlations are Fisher’s z-transformed and averaged across the260
j − 1 partitionings. The dimensionality with the average highest correlation261
is picked as best estimate k of the underlying dimensionality. As keeping262
components that reflect noise rather than signal lowers the correlation with263
an independent data set, the highest correlation is not necessarily achieved264
by keeping more components. This procedure thus avoids inflated dimen-265
sionality estimates.266
After identifying the best dimensionality estimate k for run j, the training267
and validation runs from 1 to j−1 are averaged together and SVD is applied268
to the averaged data. We then generate a k-dimensional reconstruction of269
the averaged data. The quality of this final low-dimensional reconstruction270
is measured as Pearson correlation with Ytest. We chose Pearson correlation271
and not mean-square error (MSE), which is suggested by the use of SVD as272
measure of reconstruction quality, because MSE is influenced by the variance273
of the reconstructed data, which depends on its dimensionality k.274
2.3. Step 3: Determining statistical significance275
The approach results in j estimates of the underlying dimensionality and276
j corresponding test correlations per participant. Under the null-hypothesis277
of no dimensionality, and thus, only noise present in the matrix, reconstruc-278
tion correlations averaged across runs are distributed around zero. Thus,279
across-participants significance of the averaged reconstruction correlations280
can be assessed using one-sample t-tests or non-parametric alternatives, as281
for instance permutation tests (Nichols and Holmes, 2003), and established282
correction methods for multiple comparisons, like threshold-free cluster en-283
hancement (TFCE, see Smith and Nichols, 2009).284
Only if a significant, k-dimensional, reconstruction correlation can be285
established across participants, we refer to an area as showing functional di-286
mensionality. It should be noted that a significant reconstruction correlation287
12
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
only indicates that the underlying functional dimensionality is one or bigger.288
However, testing for a dimensionality of two or larger can be achieved by289
removing not only the voxel-mean before estimating the dimensionality, but290
also the condition mean, effectively removing univariate differences between291
conditions.292
2.4. Step 4: Estimating the degree of dimensionality293
The previously described steps allow us to identify which areas carry294
reliable signal with functional dimensionality, but do not provide a precise295
estimate of the degree of the underlying dimensionality. The best population296
estimate of a region’s functional dimensionality should optimally combine297
information across participants, giving more weight to participants with more298
reliable estimates, and should furthermore reflect how peaked the distribution299
of underlying population estimates is, accounting for the fact that different300
participants could express different true dimensionality.301
Given a significant reconstruction correlation across participant, j esti-302
mates of the degree of dimensionality are obtained (for each voxel or ROI)303
for each participant. In a noise-free scenario, all j estimates reflect the true304
dimensionality and thus, direct inference could be made solely based on these305
estimates. Under noise, these estimates could over- or underestimate the true306
dimensionality. The less reliable the j dimensionality estimates, the higher307
the variance across them. Mere averaging of the j estimates across par-308
ticipants would discard this information, weighting all participants equally,309
irrespective of their reliability. Down-weighting the influence of less reliable310
dimensionality estimates on the population estimate leads to a better popu-311
lation estimate (Kruschke, 2014).312
To account for this, we implemented a multilevel Bayesian model using the313
software package Stan (The Stan Development Team, 2017). Given the mean314
and standard deviation of j dimensionality estimates per participant, the315
model derives the best estimate for the true degree of dimensionality across all316
participants. Due to the nature of the multilevel model, individual estimates317
are subject to shrinkage towards the estimated population mean, and the318
degree of shrinkage is more pronounced for estimates with higher variance and319
stronger deviation from the estimated population mean (Kruschke, 2014).320
Additionally to the estimate of the population dimensionality, the model321
returns estimates for the population dimensionality’s variance, reflecting the322
uncertainty of the dimensionality estimate. For each individual participant,323
13
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
separately pre-whitened and mean-centred
n vo
xel
m beta estimates
j-2 training data
validation data
test data
m-1 low-dimensional reconstructions
r1 … rk-1
[…]
k-dimensional reconstruction
(training + validation)
max r : dimensionality
estimate k
j ×(j-1) partitioning into training, validation and test
SVD
test correlation rj
j runs
∑
averaged training data
Figure 3: Illustration of the combination of SVD and cross-validation, correspond-ing to step 2 in Figure 2. For each searchlight or ROI, j (number of runs) n(number of voxels) × m (number of beta estimates) matrices are used to estimatethe functional dimensionality. For all possible partitions of j runs into training,validation and test data, we first average all training runs and build all possiblelow-dimensional reconstructions of these averaged data using SVD. All reconstruc-tions are then correlated with the validation run, resulting in j − 1 correlationcoefficients and respective dimensionalities. The dimensionality that results in thehighest average correlation across j − 1 runs is picked as dimensionality estimatek for this fold and a k-dimensional reconstruction of the average of the trainingand validation runs is correlated with a held-out test-run, resulting in a final re-construction correlation. In total, j reconstruction correlations are returned thatcan be averaged and tested for significance across participants using one-samplet-tests or alike. To derive a better estimate of the underlying dimensionality, thej dimensionality estimates per participant can be submitted to the hierarchicalBayesian model (step 4 in Figure 2)
14
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
the model estimates the participant’s true underlying dimensionality and re-324
turns the uncertainty of this estimate. As we did not have strong priors325
regarding the dimensionality of the neural patterns, we implemented a uni-326
form prior over the population dimensionality estimates, reflecting that the327
dimensionality could be anything from 1 to m − 1. This can be adapted to328
be informative for studies estimating the functional dimensionality of neural329
patterns with stronger priors. Figure 4 shows an illustration of the model.330
The model assumed that all individual average dimensionality estimates331
yi come from a truncated individual t-distribution, centered at the true indi-332
vidual dimensionality µi which comes from a common truncated normal dis-333
tribution with mean µ and variance σ2, see Equation 2. We chose a truncated334
t-distribution at the individual level to account for the fact that there is only335
a limited number (j) of samples underlying each participant’s dimensionality336
estimate. The uniform prior distribution over the true dimensionality ranged337
from 1 to m− 1.338
yi ∼ T (j − 1, µi, σi), 1 ≤ yi ≤ m− 1,with
µi ∼ N(µ, σ), 0 ≤ µi ≤ σmax,
σi ∼ N(σi, 1), 0 ≤ σi ≤ max(σi), and
σi ∼ U(0,max(σi)).
(2)
The maximum population variance was defined as the expected variance339
of this uniform distribution 112
(m−2)2, reflecting the prior that each partici-340
pant could express a different, true dimensionality. On the subject-level, the341
maximum variance was defined as342
max(σ2i ) =
j
j − 1∗ (m− 1 − m
2)2 (3)
which corresponds to the maximum possible variance across j dimension-343
ality estimates.344
3. Simulations345
Before applying our method to real fMRI data, we tested the validity of346
our method through dimensionality-recovery studies on simulated fMRI data.347
Estimating the dimensionality for simulated cases where the true underlying348
dimensionality is known allowed us to assess whether our procedure results349
in a reliable dimensionality estimate.350
15
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
nv(µ, σ)
µ
~
σ
min = 0, max = max(σi)
σi
~
xi σi^
µi
student t (µi, σi) nv(σi, 1)~~
min = 1, max = m-1 min = 0, max = 1/12 (m-2)2
observed values
estimated values
~~
Figure 4: Illustration of the implemented multilevel model to estimate the degreeof functional dimensionality, corresponding to step 4 in Figure 2. The observedaveraged dimensionality estimates per participant are assumed to be sampled froman underlying subject-specific t-distribution with mean µi and standard deviationσi. The standard deviation σi of the participants’ dimensionality estimates isassumed to be sampled from a normal distribution with mean σi and a standarddeviation of 1. The subject-specific t-distributions of µi are assumed to comefrom a population distribution with a normally distributed mean µ and varianceσ. Subject-specific standard deviations σi are assumed to come from a uniformdistribution, ranging from 0 to max(σi). At the top level, a uniform prior isimplemented. Mean and variance of the normal distribution of population meansµ are assumed to come from a uniform distribution ranging from 1 to m− 1 and0 to σmax, respectively. Distributions were derived from https://github.com/
rasmusab/distribution_diagrams.
16
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
3.1. Methods351
Simulated data were created using the RSA toolbox (Nili et al., 2014) and352
custom Matlab code. Parameters of the simulation were picked in accordance353
with the study by Mack and colleagues (2013). We simulated fMRI data of354
presentation of 16 different stimuli, presented for 3 sec, three repetitions per355
run, and six runs, closely matching the specifications of the original study.356
To mimic a searchlight-approach, we defined the size of the cubic sphere357
4 × 4 × 4 voxels, resulting in a simulated pattern of 64 voxels.358
We simulated data with a dimensionality of 2, 4, and 6. We set the mean359
signal to noise ratio (SNR) to match empirically observed reconstruction360
correlation magnitudes of .25. As in the real data, reconstruction correlations361
varied across participants, ranging from .10 to .50. Thus, participants differed362
in their reliability of the dimensionality estimates.363
To generate data with varying ground-truth dimensionality k, we first364
generated true, i.e. noise-free, n(voxel) × m(conditions) matrices with un-365
derlying pre-defined dimensionality. This was achieved by applying PCA to366
a random 16 × 16 matrix and building a k-dimensional reconstruction of it.367
Rows of this matrix were added to a n× 16 matrix. For each row, i.e. voxel,368
a specific amplitude was drawn from a normal distribution and added.369
In the next step, we calculated the dot-product of the generated beta370
matrices and generated design matrices, which were HRF convolved. This371
resulted in noise-free fMRI time series.372
A noise matrix was generated by randomly sampling from a Gaussian dis-373
tribution. The n(voxel) × t(timesteps) matrix was then spatially smoothed374
and temporally smoothed with a Gaussian kernel of 4 FWHM. Finally, this375
temporally and spatially smoothed noise matrix was added to the noise-free376
time-series and the design matrix was fit the the resulting data using a GLM.377
This resulted in a (noisy) voxel × conditions beta matrix for each simulated378
run. The generated beta matrices were then passed on to the dimensionality379
estimation.380
We capitalized on our prior knowledge of possible dimensionalities that381
could underly the pattern and thus tested only for reconstruction correlations382
that could be achieved by keeping either 2, 4, or 6 components. This resulted383
in three reconstruction correlations per run. Reconstruction correlations were384
averaged across runs and we assessed how often each of the possible models385
of dimensionality achieved the highest correlation across subjects, for each386
respective ground-truth. Ideally, for each participant, the highest reconstruc-387
tion correlation would be achieved by the k-dimensional reconstruction that388
17
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
2 4 6winning model
2
4
6
grou
nd tr
uth
0.10.20.30.40.50.60.70.80.9
grou
nd tr
uth
winning model
Figure 5: Results from the simulation. Confusion matrix indicating with whichpercentage a 2, 4, or 6 dimensional model was picked as best model given a ground-truth dimensionality of 2, 4, or 6. Values in the diagonal, that is, where thecorrect model was picked for a given ground-truth, were consistently higher thanoff-diagonal values.
fits with the underlying ground truth. However, due to noise, deviations from389
this are possible and we aimed to assess how likely those deviations could be390
expected to occur.391
To gather a reliable estimate of the performance of our procedure, we ran392
a total of 1000 of these simulations for each ground-truth dimensionality.393
3.2. Results394
Across 1000 simulations of data with a ground-truth dimensionality of 2,395
4, or 6, we found that the highest reconstruction correlations were generally396
achieved by the low-dimensional reconstruction of the data that matched397
the ground-truth (see Figure 5), with 93.9%, 85.7%, and 76.7% correctly398
classified, respectively.399
As described earlier, key to our method is the fact that keeping more400
components than actually underly an observed pattern results in a reduced401
reconstruction correlation, thereby allowing us to identify the best dimen-402
sionality estimate based on the achieved reconstruction correlation. Figure403
1 B illustrates how the reconstruction correlations drop as components are404
added that do not carry signal for the case of a true underlying dimensionality405
of 4.406
18
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
3.3. Discussion407
By applying our procedure to simulated fMRI data with different under-408
lying ground truth dimensionality, we tested how well estimated dimension-409
alities match with true dimensionalities. The results confirm the validity of410
our approach, showing that for data with reasonable signal-to-noise ratio,411
estimated dimensionalities match closely the underlying ground truth. One412
observation is that estimates become more confusable at higher dimension-413
alities.414
4. Data sets415
Following the successful tests of our procedure with simulated data, we416
applied our method to three different, previously published fMRI datasets, all417
employing visual stimuli and testing healthy populations. We tested three418
core aims of our method: 1) Identifying areas carrying functional dimen-419
sionality, 2) Using functional dimensionality to assess sensitivity to stimulus420
features, and 3) Measuring task-dependent differences in dimensionality.421
4.1. Identifying areas carrying functional dimensionality422
Using data from a category learning study by Mack et al. (2013), we aimed423
to identify areas carrying functional dimensionality and compare them with424
the areas found by the original authors’ model-based analysis. Model-based425
analyses make specific assumptions about representational geometry that426
our approach does not. Furthermore, these analyses require some underlying427
dimensionality to identify an area. Therefore, we expected our method to428
reveal significant functional dimensionality in all areas that were reported in429
the original study, as well as additional areas that were reliably modulated by430
the task in a way that was not captured by the model tested in the original431
publication.432
4.1.1. Methods433
Participants were trained on categorizing nine objects that differed on four434
binary dimensions: shape (circle/triangle), color (red/green), size (large/small),435
and position (left/right). During the fMRI session, participants were pre-436
sented with the set of all 16 possible stimuli and had to perform the same437
categorization task. Out of 23 participants, 20 were included in the final438
analysis presented here, with 19 participants completing 6 runs composed of439
48 trials and one participant completing 5 runs.440
19
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Standard pre-processing steps were carried out using SPM12 (Penny et al.,441
2006) and beta estimates were derived from a GLM containing one regres-442
sor per stimulus (16 in total, see Supplemental Materials for details). The443
dataset was retrieved from osf.io/62rgs.444
We ran a whole-brain searchlight with a 7mm radius sphere to estimate445
which brain areas carry signal with functional dimensionality, that is, signal446
that could be reliably predicted across runs based on a low-dimensional recon-447
struction. For each searchlight, data were pre-whitened and mean-centered448
as described above. Dimensionality estimation was performed as previously449
described and the resulting j correlations and dimensionality estimates were450
ascribed to the center of the searchlight. The code for the searchlight was451
based on the RSA toolbox (Nili et al., 2014).452
For each voxel, the j correlation coefficients were averaged and their sig-453
nificance was assessed via non-parametric one-sample t-tests across subjects454
using FSL’s randomise function (Winkler et al., 2014). Results were family-455
wise error (FWE) corrected using a TFCE threshold of p < .05.456
In their original analysis, the authors fit a cognitive model to participants457
classification behavior to estimate attention-weights to the single stimulus458
features. Based on these attention weights, they derived model-based simi-459
larities between stimuli and used RSA to examine which brain regions show460
a representational geometry that matches with these predictions. We repli-461
cated this analysis using the same beta estimates that were passed on to the462
dimensionality estimation in order to maximize comparability of the two ap-463
proaches. As for estimating the dimensionality, we ran a whole-brain search-464
light with a 7mm radius sphere (based on the RSA toolbox, Nili et al., 2014).465
We averaged voxel response patterns across runs and calculated the repre-466
sentational distance matrices (RDM) as all pairwise 1−Pearson correlation467
distance. We assessed correspondence of these RDMs with the model-based468
distance matrices via Spearman correlation. The resulting Spearman corre-469
lation for each participant was assigned to the center of the searchlight and470
their significance was assessed via non-parametric one-sample t-tests across471
subjects using FSL’s randomise function (Winkler et al., 2014). Results were472
family-wise error (FWE) corrected using a TFCE threshold of p < .05.473
4.1.2. Results474
We aimed to identify areas that show functional dimensionality and ex-475
amine how those overlap with the authors’ original findings implementing a476
model-based analysis. We found significant dimensionality (i.e., reconstruc-477
20
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
model-based RSA of stimulus similarities
functional dimensionality
overlap
Figure 6: Areas that showed significant functional dimensionality (green), signif-icant fit with the RSA comparing neural representational similarity with model-based predictions of stimulus similarity (orange), or both (yellow). FWE-correctedusing a TFCE threshold of p < .05. Notably, our method identifies large clustersof functional dimensionality in prefrontal cortex, indicating that areas here wereconsistently engaged by the task, though their patterns did not fit with the imple-mented cognitive model.
tion correlations) in an extended network of occipital, parietal and prefrontal478
areas (see Figure 6). In these areas, signal was reliable across runs and showed479
functional dimensionality.480
As can be seen in Figure 6, our method successfully identified all areas481
that were found in the original model-based analysis, which bolsters the482
authors original interpretation of their results. Notably, we were able to483
identify further areas that did not show a fit with the implemented attention-484
based model, suggesting that signal changes in those areas reflect a different485
aspect of the task space than captured by the cognitive model.486
4.1.3. Discussion487
Within the first dataset, we showed that by identifying areas with signifi-488
cant functional dimensionality, it is possible to reveal areas that can plausibly489
be tested for correspondence with a hypothesized representational similarity490
structure, as for instance derived from a cognitive model. More specifically,491
we were able to identify all areas that have been reported in the original492
analysis by Mack et al. (2013) to show a representational similarity as pre-493
dicted by a cognitive model. Additionally, we found further areas that had494
not been revealed in the original analysis to show functional dimensionality.495
This indicates that those areas have a reliable functional dimensionality but496
reflect cognitive processes or task-aspects that are not captured by the cog-497
nitive model. For instance, activation in the medial BA 8 has been found498
21
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
to correlate with uncertainty and task-difficulty (Volz et al., 2005; Huettel,499
2005; Crittenden and Duncan, 2014), suggesting that the neural patterns in500
this region in the current task might reflect processes related to the difficulty501
or category uncertainty of the categorization decision for each stimulus. To-502
gether, the findings highlight the potential of our procedure to aid evaluation503
of model performance and identify areas ahead of model-fitting.504
4.2. Using functional dimensionality to assess sensitivity to stimulus features505
Using data from a study with real-world categories and photographic506
stimuli by Bracci and Op de Beeck (2016), we tested whether different507
brain regions show functional dimensionality in response to different stimulus508
groupings (i.e., depending on how the stimulus-space is summarized). For ex-509
ample, the columns in the data matrix may be organized along either visual510
categories or shape. In this fashion, our technique could be useful in eval-511
uating general hypotheses regarding the nature and basis of the functional512
dimensionality in brain regions.513
4.2.1. Methods514
During the experiment, participants were presented repeatedly with 54515
different natural images that were of nine different shapes and belonged to six516
different categories (minerals, animals, fruit/vegetables, music instruments,517
sport instruments, tools), allowing the authors to dissociate between neural518
responses reflecting shape or category information.519
Standard pre-processing of the data was carried out using SPM12 (see520
Supplemental Material for details). In line with the authors original analysis,521
we tested for differences depending on whether the stimuli were averaged to522
emphasize their category or shape information. To that end, we constructed523
two separate GLMs. The first GLM (catGLM) was composed of one regressor524
per category (six in total), thus averaging across objects shapes. The second525
GLM (shapeGLM) consisted of nine different regressors, one for each shape,526
averaging neural responses across object categories. In both GLMs, regres-527
sors were convolved with the HRF and six motion-regressors as covariates of528
no interest were included.529
Dimensionality was estimated separately for both GLMs. We ran a whole-530
brain searchlight with a 7mm sphere on the beta estimates of the respective531
GLM, again pre-whitening and mean-centering voxel patterns within each532
searchlight before estimating the dimensionality. Reconstruction correlations533
were averaged across runs for each participant and tested for significance534
22
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
category GLM
shape GLM
overlap
Figure 7: Areas showing significant functional dimensionality for the shape GLM(green), the category GLM (orange), or both (yellow). Results are FWE-correctedusing an TFCE threshold of p < .05. Across both GLMs, posterior and parietalregions show functional dimensionality. Prefrontal regions show more pronouncedfunctional dimensionality for the category GLM, in line with the original findings.
across participants using FSL’s randomise function (Winkler et al., 2014).535
Results were FWE corrected using a TFCE threshold of p < .05.536
4.2.2. Results537
When testing for functional dimensionality for the shape-sensitive GLM,538
we found significant reconstruction correlations in bilateral posterior occipito-539
temporal and parietal regions, indicating functional dimensionality in these540
areas. Additionally, a significant cluster was revealed in the left lateral pre-541
frontal cortex (see Figure 7). Testing for functional dimensionality for the542
category-sensitive GLM also revealed strong significant correlations in occip-543
ital and posterior-temporal regions, but notably showed more pronounced544
correlations in bilateral lateral and medial prefrontal areas as well. This is545
in line with the authors original findings that showed that neural patterns in546
parietal and prefrontal ROIs correlated more strongly with a model reflect-547
ing category similarities, whereas shape similarities were largely restricted to548
occipital and posterior temporal ROIs.549
4.2.3. Discussion550
With the second dataset, we tested whether different areas are identi-551
fied to express significant functional dimensionality depending on how the552
underlying task-space is summarized. In line with the original authors’553
findings (Bracci and Op de Beeck, 2016), we found more pronounced func-554
tional dimensionality in prefrontal regions for the GLM emphasizing the555
category-information across stimuli, compared to the one focusing on shape-556
23
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
information. Likewise, functional dimensionality in occipital regions was557
more pronounced for the shape-based GLM.558
However, compared to the authors’ original findings, we did not find a559
sharp dissociation between shape and category. For example, we find both560
shape and category dimensionality present in early visual regions and shape561
dimensionality extending into frontal areas. As discussed in the previous562
section, our method provides a general test of dimensionality whereas the563
original authors evaluate specific representational accounts that make ad-564
ditional assumptions about shape and category similarity structure. Com-565
paring results suggest that to some degree the dissociation found in Bracci566
and Op de Beeck (2016) rests on these specific assumptions. A more gen-567
eral test of functional dimensionality, for stimuli organized along shape or568
category, provides additional information to assist in interpreting the cogni-569
tive function of these brain regions, which complements testing more specific570
representational accounts.571
4.3. Measuring task-dependent differences in dimensionality572
In this third dataset, we consider whether the underlying dimensional-573
ity of neural representations changes as a function of task. In Mack et al.574
(2016), participants learned a categorization rule over a common stimulus575
set that either depended on one or two stimulus dimensions. We predicted576
that the estimated functional dimensionality, as measured by our hierarchi-577
cal Bayesian method, should be higher for the more complex categorization578
problem, extending the original authors’ findings.579
4.3.1. Methods580
Participants learned to classify bug stimuli that varied on three binary581
dimensions (mouth, antenna, legs) into two contrasting categories based on582
trial-and-error learning. Over the course of the experiment, participants583
completed two learning problems (in counterbalanced order). Correct classi-584
fication in type I problem required attending to only one of the bugs features,585
whereas classification in type II problem required combining information of586
two features in an exclusive-or manner.587
Previous research has shown that neural dimensionality appropriate for588
the problem at hand is linked to successful task performance (Rigotti et al.,589
2013). Thus, we hypothesized that dimensionality of the neural response590
would be higher for type II compared to type I in areas known to process591
visual features, as for instance lateral occipito-temporal cortex (LOC; see e.g.592
24
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Eger et al., 2008). We included data from 22 participants in our analysis (one593
participant was excluded due to artifacts in the fMRI data, please refer to594
the Supplemental Material for further details on the experiment and data595
preprocessing). The dataset was retrieved from osf.io/5byhb.596
In order to infer the degree of functional dimensionality, we estimated it597
across ROIs encompassing LOC in the left and right hemisphere separately598
for the two categorization tasks. Because the relevant stimulus dimensions599
were learned through trial-and-error learning, we excluded the first functional600
run (early learning) of each problem and analyzed the remaining three runs601
for each problem.602
Prior to estimating the dimensionality, data were pre-whitened and mean-603
centered. Dimensionality was estimated across all voxels for each ROI and604
problem, resulting in 3 (runs) × 2 (ROIs) × 2 (problems) correlation co-605
efficients and dimensionality estimates. Correlation coefficients were aver-606
aged per participant, ROI and problem and tested for significance using607
one-sample t-tests. To derive the best population estimate for the under-608
lying dimensionality for each ROI and problem, we implemented the above609
described hierarchical Bayesian model. To that end, we calculated mean and610
standard deviation of each participant’s dimensionality estimate per ROI611
and problem and used those summary statistics to estimate the degree of612
underlying dimensionality for each ROI and problem.613
4.3.2. Results614
Estimating dimensionality across two different ROIs in LOC and two615
different tasks allowed us to test whether the estimated dimensionality differs616
across problems with different task-demands. As participants had to pay617
attention to one stimulus feature in the type I problem and two stimulus618
features in the the type II problem, we hypothesized that dimensionality of619
the neural response would be higher for type II compared to type I in an620
LOC ROI.621
Both ROIs showed significant reconstruction correlations across both tasks622
(lLOC, type I: t21 = 3.08, p = .006; rLOC, type I: t21 = 2.21, p = .038; lLOC,623
type II: t21 = 3.03, p = .006; rLOC, type II: t21 = 3.37, p = .003). This shows624
that signal in the LOC showed reliable functional dimensionality across runs625
for both problem types, which is a prerequisite for estimating the degree of626
functional dimensionality.627
To estimate whether the dimensionality differed across problems, we ana-628
lyzed the data by implementing a multilevel Bayesian model using Stan (The629
25
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
A B
0 1 2 3 4 5 6 7 8estimated dimensionality; right LOC
problem 1problem 2
0 1 2 3 4 5 6 7 8estimated dimensionality; left LOC
type I problem
0 1 2 3 4 5 6 7 8estimated dimensionality; right LOC
problem 1problem 2type II problem
Figure 8: Results of estimating functional dimensionality for two different catego-rization problems. A: Outline of the two ROIs in left and right LOC. B: Histogramsof posterior distributions of estimated dimensionalities in left and right LOC forthe type I and II problems. Dimensionalities were estimated by implementing sep-arate multilevel models for each ROI and model using Stan. Across both ROIs,the peak of the posterior distributions of the estimated dimensionality for type IIwas higher than for type I, mirroring the structure of the two problems.
Stan Development Team, 2017), see Figure 2 for an illustration of the model.630
As hypothesized, the estimated underlying dimensionality was higher for the631
type II problem compared to type I (type I: µleft = 2.92 (CI 95% : 1.33, 4.33),632
µright = 2.66 (CI 95% : 1.23, 4.14); type II: µleft = 4.74 (CI 95% : 3.20, 6.46),633
µright = 4.69 (CI 95% : 3.56, 6.06), see Figure 8).634
4.3.3. Discussion635
Besides knowing which areas show neural patterns with functional di-636
mensionality, an important question concerns the degree of the underlying637
dimensionality. Using data from a categorization task where participants638
had to attend to either one or two features of a stimulus, we demonstrate639
how our method can be used to test whether the degree of underlying dimen-640
sionality of neural patterns varies with task demands. A notable strength641
of the dataset for our research question is that the authors used the same642
stimuli in a within-subject paradigm, counterbalancing the order of the two643
categorization tasks across subjects. This allowed us to investigate how the644
dimensionality of a neural pattern changes with task, while controlling for645
possible effects due to differences in signal-to-noise ratios across participants646
or brain regions.647
Our results show that, as expected, the degree of underlying functional648
dimensionality is higher when the task required attending to two stimulus649
26
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
features instead of only one. Notably, this assumption was implicit to the650
conclusions drawn by the authors in the original publication (Mack et al.,651
2016). The authors analyzed neural patterns in hippocampus and imple-652
mented a cognitive model to show that stimulus-specific neural patterns were653
stretched across relevant compared to irrelevant dimensions. Thus, irrelevant654
dimensions were compressed and the dimensionality of the neural pattern655
was reduced the less dimensions were relevant to the categorization problem.656
Our approach allows to directly assess this effect without the need of fitting657
a cognitive model.658
5. General Discussion659
Multivariate and model-based analyses of fMRI data have deepened our660
understanding of the human brain and its representational spaces (Norman661
et al., 2006; Kriegeskorte and Kievit, 2013; Haxby et al., 2014; Turner et al.,662
2017). However, before evaluating specific representational accounts, it is663
sensible to first ask the more basic question of whether brain areas displays664
functional dimensionality more generally. Here, we presented a novel ap-665
proach to estimate an area’s functional dimensionality by a combined SVD666
and cross-validation procedure. Our procedure identifies areas with signif-667
icant functional dimensionality and provides an estimate, reflecting uncer-668
tainty, of the degree of underlying dimensionality. Across three different data669
sets, we confirmed and extended the findings from the original contributions.670
After verifying the operation of the method with a synthetic (simulated)671
dataset in which the ground truth dimensionality was known, we applied672
our method to three published fMRI datasets. In each case, the procedure673
confirmed and extended the authors’ original findings, advancing our un-674
derstanding of the function of the brain regions considered. Each of three675
datasets highlighted a potential use of estimating functional dimensionality.676
In the first study, working with data from Mack et al. (2013), we demon-677
strated that testing for functional dimensionality can complement model-678
based fMRI analyses that evaluate more specific representational hypothe-679
ses. First, one cannot find a rich relationship between model representations680
and brain measures when there is no functional dimensionality in regions681
of interest. Second, there might be additional areas that display significant682
functional dimensionality that do not show correspondence with the model.683
These additional areas invite further analysis as they might implement684
processes and representations outside the scope of the tested model. Func-685
27
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
tional dimensionality can indicate interesting unexplained signal. For exam-686
ple, in the first dataset examined, functional dimensionality was found in687
all the areas identified by Mack et al. (2013), plus medial BA 8, which is a688
candidate region for task difficulty and response conflict (see Alexander and689
Brown, 2011, for a model of medial prefrontal cortex function), which was690
not the authors’ original focus but may merit further study.691
In the second study, working with data from Bracci and Op de Beeck692
(2016), we demonstrated how stimuli could be grouped or organized in differ-693
ent fashions to explore how dimensional organization varies across the brain.694
In this case, the data matrix was either organized along shape or category.695
We found neural patterns of shape and category selectivity consistent with696
the authors’ original results. However, we found the selectivity to be more697
mixed in our analyses and identified additional responsive regions, mirroring698
our results when we considered data from Mack et al. (2013).699
Our method may have been more sensitive to signal because it makes700
fewer assumptions about the underlying representational structure and al-701
lows for individual differences in the underlying dimensions. In this sense,702
assessing functional complexity complements existing analysis procedures.703
Indeed, our approach could be used to evaluate multiple stimulus groupings704
to inform feature selection in encoding models (Diedrichsen and Kriegeskorte,705
2017; Naselaris et al., 2011).706
In a third study, working with data from Mack et al. (2016), we evaluated707
whether our method could identify changes in task-driven dimensionality. By708
combining estimates of functional dimensionality with a hierarchical Bayesian709
model, we found that the functional dimensionality in LOC was higher when710
a category decision required using two features rather than one. These results711
are consistent with the original authors’ theory but were hitherto untestable.712
In summary, assessing functional dimensionality across these three studies713
complemented the original analyses and revealed additional nuances in the714
data. In each case, our understanding of the neural function was further715
constrained. Moreover, comparing the results to those from model-based716
and other multivariate approaches was informative in terms of understanding717
underlying assumptions and their importance.718
Of course, as touched upon in the Introduction, there are many possible719
ways to assess dimensional structure in brain measures and progress has been720
made on this challenge (Rigotti et al., 2013; Machens et al., 2010; Rigotti and721
Fusi, 2016; Diedrichsen et al., 2013; Bhandari et al., 2017). Here, our aim722
was to specify a general, computational efficient, robust, and relatively simple723
28
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
and interpretable procedure that can easily be applied to whole brain data724
to first test for statistical significant functional dimensionality and, if found,725
to provide an estimate of its magnitude using Bayesian hierarchical modeling726
to make clear the uncertainty in that estimate.727
We hope our contribution is useful to researches interested in further728
exploring their data, whether it be fMRI, MEG, EEG, or single-cell record-729
ings. Researchers may consider variants of our method. For example, as730
mentioned in the Introduction, the SVD could be substituted with another731
procedure depending on the needs and assumptions of the researchers. There732
is no magic bullet to the difficult problems of estimating the underlying di-733
mensionality of noisy neural data, but we have made progress on this issue734
both theoretically and practically. In doing so, we have also provided addi-735
tional insights into the brain basis of visual categorization. We hope that736
by demonstrating the merits of estimating the functional dimensionality of737
neural data that we motivate others to take advantage of this additional and738
complementary viewpoint on neural function.739
6. Data availability740
A Matlab toolbox for estimating functional dimensionality of fMRI data741
as well as data needed to replicate the analyses presented here will be made742
available after publication. Nifti files and code for the analyses presented743
here are available from the authors upon request.744
7. Acknowledgments745
This work was funded by the National Institutes of Health [grant number746
1P01HD080679]; the Leverhulme Trust [grant number RPG-2014-075]; and747
Wellcome Trust Senior Investigator Award [WT106931MA] to Bradley C.748
Love. Correspondences regarding this work can be sent to [email protected]
or [email protected]. The authors are grateful to all study authors for sharing750
their data and wish to thank all members of the LoveLab and Jan Balaguer751
for valuable input. Declarations of interest: none.752
29
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
8. References753
Alexander, W. H. and Brown, J. W. (2011). Medial prefrontal cortex as an754
action-outcome predictor. Nature Neuroscience, 14(10):1338–1344.755
Bhandari, A., Rigotti, M., Gagne, C., Fusi, S., and Badre, D. (2017). Char-756
acterizing human prefrontal cortex representations with fMRI. In Society757
for Neuroscience, Washington, DC:.758
Bracci, S., Daniels, N., and Op de Beeck, H. (2017). Task Context Overrules759
Object- and Category-Related Representational Content in the Human760
Parietal Cortex. Cerebral Cortex, pages 1–12.761
Bracci, S. and Op de Beeck, H. (2016). Dissociations and associations be-762
tween shape and category representations in the two visual pathways. Jour-763
nal of Neuroscience, 36(2):432–444.764
Cattell, R. B. (1947). Confirmation and clarification of primary personality765
factors. Psychometrika, 12(3):197–220.766
Cattell, R. B. (1966). The Scree Test For The Number Of Factors. Multi-767
variate Behavioral Research, 1(2):245–276.768
Crittenden, B. M. and Duncan, J. (2014). Task difficulty manipulation re-769
veals multiple demand activity but no frontal lobe hierarchy. Cerebral770
Cortex, 24(2):532–540.771
Davis, T., LaRocque, K. F., Mumford, J. A., Norman, K. A., Wagner, A. D.,772
and Poldrack, R. A. (2014). What do differences between multi-voxel and773
univariate analysis mean? How subject-, voxel-, and trial-level variance774
impact fMRI analysis. NeuroImage, 97:271–283.775
Diedrichsen, J. and Kriegeskorte, N. (2017). Representational models: A776
common framework for understanding encoding, pattern-component, and777
representational-similarity analysis. PLoS Computational Biology, 13(4):1–778
33.779
Diedrichsen, J., Wiestler, T., and Ejaz, N. (2013). A multivariate method780
to determine the dimensionality of neural representation from population781
activity. NeuroImage, 76:225–235.782
30
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Eckart, C. and Young, G. (1936). The approximation of one matrix by783
another of lower rank. Psychometrika, 1(3):211–218.784
Eger, E., Ashburner, J., Haynes, J.-D., Dolan, R. J., and Rees, G. (2008).785
fMRI activity patterns in human LOC carry information about object786
exemplars within category. Journal of cognitive neuroscience, 20(2):356–787
370.788
Friston, K. J., Frith, C. D., Liddle, P. F., and Frackowiak, R. S. J. (1993).789
Functional Connectivity: The Principal-Component Analysis of Large790
(PET) Data Sets. Journal of Cerebral Blood Flow & Metabolism, 13(1):5–791
14.792
Fusi, S., Miller, E. K., and Rigotti, M. (2016). Why neurons mix: High di-793
mensionality for higher cognition. Current Opinion in Neurobiology, 37:66–794
74.795
Goddard, E., Klein, C., Solomon, S. G., Hogendoorn, H., Thomas, A., and796
Klein, C. (2017). Interpreting the dimensions of neural feature represen-797
tations revealed by dimensionality reduction. NeuroImage.798
Hastie, T., Tibshirani, R., and Friedman, J. (2009). Unsupervised Learn-799
ing. In The Elements of Statistical Learning, chapter 14, pages 485–585.800
Springer, New York, NY, second edition.801
Haxby, J. V., Connolly, A. C., and Guntupalli, J. S. (2014). Decoding Neu-802
ral Representational Spaces Using Multivariate Pattern Analysis. Annual803
review of neuroscience, pages 435–456.804
Hebart, M. N. and Baker, C. I. (2017). Deconstructing multivariate decoding805
for the study of brain function. NeuroImage.806
Huettel, S. A. (2005). Decisions under Uncertainty: Probabilistic Context807
Influences Activation of Prefrontal and Parietal Cortices. Journal of Neu-808
roscience, 25(13):3304–3311.809
Huettel, S. A., Song, A. W., and McCarthy, G. (2003). Functional magnetic810
resonance imaging. Sinauer Associates, Inc., Sunderland, Massachusetts,811
USA, second edition.812
31
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Huth, A. G., Nishimoto, S., Vu, A. T., and Gallant, J. L. (2012). A Continu-813
ous Semantic Space Describes the Representation of Thousands of Object814
and Action Categories across the Human Brain. Neuron, 76(6):1210–1224.815
Khaligh-Razavi, S. M. and Kriegeskorte, N. (2014). Deep Supervised, but816
Not Unsupervised, Models May Explain IT Cortical Representation. PLoS817
Computational Biology, 10(11).818
Kramer, M. A. (1991). Nonlinear principal component analysis using autoas-819
sociative neural networks. AIChE Journal, 37(2):233–243.820
Kriegeskorte, N. and Kievit, R. A. (2013). Representational geometry: In-821
tegrating cognition, computation, and the brain. Trends in Cognitive Sci-822
ences, 17(8):401–412.823
Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H.,824
Tanaka, K., and Bandettini, P. A. (2008). Matching Categorical Object825
Representations in Inferior Temporal Cortex of Man and Monkey. Neuron,826
60(6):1126–1141.827
Kruschke, J. K. (2014). Doing Bayesian data analysis : a tutorial with R,828
JAGS, and Stan.829
Ledoit, O. and Wolf, M. (2004). A well-conditioned estimator for large-830
dimensional covariance matrices. Journal of Multivariate Analysis,831
88(2):365–411.832
Machens, C. K., Romo, R., and Brody, C. D. (2010). Functional, but not833
anatomical, separation of what and when in prefrontal cortex. Journal of834
Neuroscience, 30(1):350–360.835
Mack, M. L., Love, B. C., and Preston, A. R. (2016). Dynamic updating836
of hippocampal object representations reflects new conceptual knowledge.837
Proceedings of the National Academy of Sciences of the United States of838
America, 113(46):13203–13208.839
Mack, M. L., Preston, A. R., and Love, B. C. (2013). Decoding the brain’s840
algorithm for categorization from its neural implementation. Current Bi-841
ology, 23(20):2023–2027.842
32
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Naselaris, T., Kay, K. N., Nishimoto, S., and Gallant, J. L. (2011). Encoding843
and decoding in fMRI. NeuroImage, 56(2):400–410.844
Nichols, T. and Holmes, A. (2003). Nonparametric Permutation Tests845
for Functional Neuroimaging. Human Brain Function: Second Edition,846
15(1):887–910.847
Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., and848
Kriegeskorte, N. (2014). A Toolbox for Representational Similarity Anal-849
ysis. PLoS Computational Biology, 10(4):e1003553.850
Norman, K. A., Polyn, S. M., Detre, G. J., and Haxby, J. V. (2006). Be-851
yond mind-reading: multi-voxel pattern analysis of fMRI data. Trends in852
Cognitive Sciences, 10(9):424–430.853
Parpart, P., Jones, M., and Love, B. (2017). Heuristics as Bayesian inference854
under extreme priors. Cognitive Psychology, in press.855
Penny, W., Friston, K., Ashburner, J., Kiebel, S., and Nichols, T. (2006).856
Statistical Parametric Mapping: The Analysis of Functional Brain Images:857
The Analysis of Functional Brain Images, volume 8. Academic press.858
Rigotti, M., Barak, O., Warden, M. R., Wang, X.-J., Daw, N. D., Miller,859
E. K., and Fusi, S. (2013). The importance of mixed selectivity in complex860
cognitive tasks. Nature, 497(7451):1–6.861
Rigotti, M. and Fusi, S. (2016). Estimating the dimensionality of neural862
responses with fMRI Repetition Suppression. Arxiv.863
Shlens, J. (2014). A tutorial on principal component analysis. arXiv.864
Smith, S. M. and Nichols, T. E. (2009). Threshold-free cluster enhancement:865
Addressing problems of smoothing, threshold dependence and localisation866
in cluster inference. NeuroImage, 44(1):83–98.867
Spearman, C. (1904). General Intelligence, Objectively Determined and Mea-868
sured. American Journal of Psychology, 15(2):201–292.869
The Stan Development Team (2017). MatlabStan: the MATLAB interface870
to Stan.871
33
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint
Turner, B. M., Forstmann, B. U., Love, B. C., Palmeri, T. J., and Van Maa-872
nen, L. (2017). Approaches to analysis in model-based cognitive neuro-873
science. Journal of Mathematical Psychology, 76:65–79.874
Volz, K. G., Schubotz, R. I., and Von Cramon, D. Y. (2005). Variants of875
uncertainty in decision-making and their neural correlates. Brain Research876
Bulletin, 67(5):403–412.877
Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., and Diedrich-878
sen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern879
analysis. NeuroImage, 137:188–200.880
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., and Nichols,881
T. E. (2014). Permutation inference for the general linear model. Neu-882
roImage, 92:381–397.883
34
.CC-BY 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted January 12, 2018. . https://doi.org/10.1101/232454doi: bioRxiv preprint