medical image analysis - university of texas at dallasaria/papers/bazarganinosratinia2014.pdf2.1....

14
Joint maximum likelihood estimation of activation and Hemodynamic Response Function for fMRI q Negar Bazargani , Aria Nosratinia Department of Electrical Engineering, University of Texas at Dallas, Richardson, TX 75080, USA article info Article history: Received 20 July 2013 Received in revised form 28 January 2014 Accepted 29 March 2014 Available online 24 April 2014 Keywords: fMRI Hemodynamic Response Function Maximum likelihood estimation Singular value decomposition Tikhonov regularization abstract Blood Oxygen Level Dependent (BOLD) functional magnetic resonance imaging (fMRI) maps the brain activity by measuring blood oxygenation level, which is related to brain activity via a temporal impulse response function known as the Hemodynamic Response Function (HRF). The HRF varies from subject to subject and within areas of the brain, therefore a knowledge of HRF is necessary for accurately computing voxel activations. Conversely a knowledge of active voxels is highly beneficial for estimating the HRF. This work presents a joint maximum likelihood estimation of HRF and activation based on low-rank matrix approximations operating on regions of interest (ROI). Since each ROI has limited data, a smoothing con- straint on the HRF is employed via Tikhonov regularization. The method is analyzed under both white noise and colored noise. Experiments with synthetic data show that accurate estimation of the HRF is possible with this method without prior assumptions on the exact shape of the HRF. Further experiments involving real fMRI experiments with auditory stimuli are used to validate the proposed method. Ó 2014 Elsevier B.V. All rights reserved. 1. Introduction In Blood Oxygenation Level Dependent (BOLD) fMRI, the hemo- dynamic response starts immediately after the stimulus is applied and increases to its peak with 3–10 s delay, then decays with a post stimulus undershoot and returns to its baseline (Jezzard et al., 2002, chap. 14). It has been observed that the neurovascular sys- tem is well approximated by a linear and stationary model if the inter stimulus interval (ISI) is more than a few seconds (Boynton et al., 1996; Dale and Buckner, 1997). Using a linear time invariant model the hemodynamic changes can be characterized as the con- volution of the stimulus with the impulse response function of the system, which is called the Hemodynamic Response Function (HRF). The estimation of the activation levels typically requires knowledge of the HRF. Furthermore, minimum mean square error (MMSE) estimation of the HRF requires knowledge of activation levels. The cross dependency motivates our approach to joint estimation of these quantities. This paper proposes a joint maximum likelihood method for the simultaneous estimation of the HRF and activation levels in fMRI data. Our method does not require a prior for the probability distributions of the HRF or the activation levels. This is useful in sit- uations where obtaining reliable and verifiable prior distributions on the HRF or the activation levels are difficult. It is well known that the HRF varies across different regions of the brain (Aguirre et al., 1998). The shape of the HRF is observed to be constant across neighboring voxels in a homogeneous region (Ciuciu et al., 2003; Gössl et al., 2001b), thus a more robust HRF estimation may be obtained from the data corresponding to a set of voxels within a neighborhood, instead of a single voxel. Two approaches are explored for establishing the ROIs. The first approach is to deduce the ROI via analyzing the data with AFNI (Analysis of Functional NeuroImages) imaging analysis tool. 1 AFNI is a widely used set of programs that implement well-known algo- rithms for processing, analyzing and displaying fMRI data (Cox, 1996), and is often used for benchmarking new developments. In order to capture HRF variations, our joint estimation process is applied individually on each of the ROIs. The second approach does not involve using another algorithm as the input to ours, but rather uses the proposed joint estimation algorithm in a two-step ‘‘boot- strap’’ approach, as follows. In the first step, the brain is segmented into equal-size cubes and the joint MLE method is applied to each segment, without a priori assumptions about any of the regions. The resulting activation levels are then used to determine the ROI. http://dx.doi.org/10.1016/j.media.2014.03.005 1361-8415/Ó 2014 Elsevier B.V. All rights reserved. q This work appeared in part in Asilomar 2008. Corresponding author. Present address: Philips Healthcare, 595 Miner Rd, Highland Heights, OH 44143, USA. Tel.: +1 469 713 9982. E-mail addresses: [email protected] (N. Bazargani), aria@utdallas. edu (A. Nosratinia). 1 http://afni.nimh.nih.gov. Medical Image Analysis 18 (2014) 711–724 Contents lists available at ScienceDirect Medical Image Analysis journal homepage: www.elsevier.com/locate/media

Upload: dokien

Post on 24-May-2018

214 views

Category:

Documents


1 download

TRANSCRIPT

Medical Image Analysis 18 (2014) 711–724

Contents lists available at ScienceDirect

Medical Image Analysis

journal homepage: www.elsevier .com/locate /media

Joint maximum likelihood estimation of activation and HemodynamicResponse Function for fMRI q

http://dx.doi.org/10.1016/j.media.2014.03.0051361-8415/� 2014 Elsevier B.V. All rights reserved.

q This work appeared in part in Asilomar 2008.⇑ Corresponding author. Present address: Philips Healthcare, 595 Miner Rd,

Highland Heights, OH 44143, USA. Tel.: +1 469 713 9982.E-mail addresses: [email protected] (N. Bazargani), aria@utdallas.

edu (A. Nosratinia). 1 http://afni.nimh.nih.gov.

Negar Bazargani ⇑, Aria NosratiniaDepartment of Electrical Engineering, University of Texas at Dallas, Richardson, TX 75080, USA

a r t i c l e i n f o

Article history:Received 20 July 2013Received in revised form 28 January 2014Accepted 29 March 2014Available online 24 April 2014

Keywords:fMRIHemodynamic Response FunctionMaximum likelihood estimationSingular value decompositionTikhonov regularization

a b s t r a c t

Blood Oxygen Level Dependent (BOLD) functional magnetic resonance imaging (fMRI) maps the brainactivity by measuring blood oxygenation level, which is related to brain activity via a temporal impulseresponse function known as the Hemodynamic Response Function (HRF). The HRF varies from subject tosubject and within areas of the brain, therefore a knowledge of HRF is necessary for accurately computingvoxel activations. Conversely a knowledge of active voxels is highly beneficial for estimating the HRF. Thiswork presents a joint maximum likelihood estimation of HRF and activation based on low-rank matrixapproximations operating on regions of interest (ROI). Since each ROI has limited data, a smoothing con-straint on the HRF is employed via Tikhonov regularization. The method is analyzed under both whitenoise and colored noise. Experiments with synthetic data show that accurate estimation of the HRF ispossible with this method without prior assumptions on the exact shape of the HRF. Further experimentsinvolving real fMRI experiments with auditory stimuli are used to validate the proposed method.

� 2014 Elsevier B.V. All rights reserved.

1. Introduction

In Blood Oxygenation Level Dependent (BOLD) fMRI, the hemo-dynamic response starts immediately after the stimulus is appliedand increases to its peak with 3–10 s delay, then decays with a poststimulus undershoot and returns to its baseline (Jezzard et al.,2002, chap. 14). It has been observed that the neurovascular sys-tem is well approximated by a linear and stationary model if theinter stimulus interval (ISI) is more than a few seconds (Boyntonet al., 1996; Dale and Buckner, 1997). Using a linear time invariantmodel the hemodynamic changes can be characterized as the con-volution of the stimulus with the impulse response function of thesystem, which is called the Hemodynamic Response Function(HRF).

The estimation of the activation levels typically requiresknowledge of the HRF. Furthermore, minimum mean square error(MMSE) estimation of the HRF requires knowledge of activationlevels. The cross dependency motivates our approach to jointestimation of these quantities.

This paper proposes a joint maximum likelihood method for thesimultaneous estimation of the HRF and activation levels in fMRI

data. Our method does not require a prior for the probabilitydistributions of the HRF or the activation levels. This is useful in sit-uations where obtaining reliable and verifiable prior distributionson the HRF or the activation levels are difficult.

It is well known that the HRF varies across different regions ofthe brain (Aguirre et al., 1998). The shape of the HRF is observedto be constant across neighboring voxels in a homogeneous region(Ciuciu et al., 2003; Gössl et al., 2001b), thus a more robust HRFestimation may be obtained from the data corresponding to a setof voxels within a neighborhood, instead of a single voxel. Twoapproaches are explored for establishing the ROIs. The firstapproach is to deduce the ROI via analyzing the data with AFNI(Analysis of Functional NeuroImages) imaging analysis tool.1 AFNIis a widely used set of programs that implement well-known algo-rithms for processing, analyzing and displaying fMRI data (Cox,1996), and is often used for benchmarking new developments. Inorder to capture HRF variations, our joint estimation process isapplied individually on each of the ROIs. The second approach doesnot involve using another algorithm as the input to ours, but ratheruses the proposed joint estimation algorithm in a two-step ‘‘boot-strap’’ approach, as follows. In the first step, the brain is segmentedinto equal-size cubes and the joint MLE method is applied to eachsegment, without a priori assumptions about any of the regions.The resulting activation levels are then used to determine the ROI.

2 http://www.fil.ion.ucl.ac.uk/spm.3 http://www.sourceforge.net/projects/marsbar.

712 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

The activations are discarded after determining the ROI, and then theROI obtained from the first round are used in a second round of jointMLE, which determines the final values of both activation and HRF.Thus, the joint MLE proposed in this paper can also be used as astand-alone method.

1.1. Background and prior work

Parametric and nonparametric models have been used for mod-eling the HRF. In parametric models, a pre-determined shape isassumed for the HRF and its parameters are estimated. Amongthe parametric models one may name the Poisson density(Friston et al., 1994), the Gaussian function (Rajapakse et al.,1998), the Gamma density (Cohen, 1997; Lange and Zeger, 1997),shifted Gamma density (Boynton et al., 1996), difference of twoGamma functions (Glover, 1999) and truncated Gaussian functions(Gössl et al., 2001b). Parametric models may introduce bias in theHRF estimate, since they cannot capture HRF shape variabilityacross voxels.

Since the HRF varies across brain regions, subjects and experi-ments (Aguirre et al., 1998), capturing these variabilities requiresa flexible model, e.g., a linear combination of basis functions. Pastwork in this area includes combining canonical shape for the HRFand its temporal derivative (Friston et al., 1998), spline basis sets,finding basis sets using principal components (Aguirre et al.,1998; Woolrich et al., 2004a), cosine functions (Zarahn, 2002),radial basis functions (Riera et al., 2004), linear combination ofFourier basis functions (Josephs et al., 1997) and spectral basisfunctions (Liao et al., 2002). Lindquist et al. (Lindquist andWager, 2007) performed a comparative study with various basisfunctions.

Various nonparametric methods have been proposed for HRFestimation, where no prior shape is assumed for the HRF. Goutteet al. (Goutte et al., 2000) uses smooth finite impulse response(FIR) filters, where the HRF model has as many free parametersas the data points. The nonparametric model is flexible enoughto capture an arbitrary shape for the HRF, but it is often sensitiveto noise. Bayesian nonparametric estimations of the HRF were pro-posed in (Ciuciu et al., 2003; Marrelec et al., 2003a). Afonso et al.(2008) also uses a Bayesian framework to jointly estimate theHRF and activation levels. The HRF is estimated voxel-wise in thesemethods.

Because of the low signal-to-noise ratio in the fMRI data, voxel-wise estimation methods may be too noisy. It has been observed(Ciuciu et al., 2003; Gössl et al., 2001a) that for a specific task,the shape of the HRF is constant across neighboring voxels, thusa more robust HRF estimation may be obtained from the data cor-responding to a set of voxels within a neighborhood, instead of asingle voxel. Therefore, in this paper, we follow a region-basedapproach.

A region-based formulation for the HRF estimation was pro-posed in (Makni et al., 2005; Makni et al., 2008), where a singleHRF shape was used with varying amplitude within a ROI. A jointestimation of the HRF and the activation level for a ROI was pro-posed in (Makni et al., 2005, 2008), however, unlike our work(Makni et al., 2005, 2008) use a Bayesian framework that requiresknowledge of the prior distributions for activation levels, HRF andother parameters. Our work takes a maximum likelihood approachto this problem. Our method does not require the prior probabilitydistributions of the HRF or the activation levels. This is useful in sit-uations where obtaining reliable and verifiable prior probabilitydistributions on the HRF or the activation levels are difficult. Ithas been common for ROI-based analyses to depend on othermethods to determine the ROI. In (Makni et al., 2008) and in(Chaari et al., 2013) regions are first identified as functionallyhomogeneous parcels in the mask of the gray matter using a

specific procedure. In (Makni et al., 2005) the input ROIs are SPMclusters obtained from maps based on standard Statistical Param-eter Mapping (SPM) software2 activation using MARSBAR toolbox.3

Makni et al. (2008) also requires parcellation and a knowledge ofactive regions. In one part of our work, we use AFNI data analysisto provide the ROI, based on the existing tradition of the above-men-tioned methods. But we also provide a second ‘‘boot-strap’’ two-stepmethod where we first apply the proposed joint MLE on a cube-seg-mentation of the brain to produce the ROI, and then apply the pro-posed joint MLE a second time to produce the activation and HRFvalues.

Due to the level of noise and limitation on the data points aris-ing from a region-dependent model on the HRF, we also considersmoothing constraints. It has been shown by Buxton et al. (2004)that the HRF is smooth. Earlier works on the use of smoothnessconstraint in the HRF estimation include smoothing a finiteimpulse response (Goutte et al., 2000), as well as constraints onthe second order discrete derivative operator (Ciuciu et al., 2003;Marrelec et al., 2003a). In addition, Tikhonov regularization(Tikhonov and Arsenin, 1977) has been proposed for imposingsmoothness constraints on the HRF estimate. Two approachesbased on using Tikhonov regularization and smoothing splinesare introduced in (Zhang et al., 2007; Vakorin et al., 2007). In bothpapers, the HRF estimation is done voxel-wise on a block designexperimental paradigm. The impact of regularization constrainton the HRF estimates has been studied in (Casanovaa et al.,2008) by comparing regularization-based techniques to the leastsquare estimation methods.

To summarize, the main contribution of this paper consists ofmaximum likelihood joint estimation of the HRF and activationlevels, with a regularization that allows our method to work withsmaller amounts of data, thus making it possible to focus on smal-ler ROI and therefore better capture the variations of the HRFacross the areas of the brain. The method is developed in the pres-ence of either white noise or colored noise. Table 1 shows a com-parison between the main features of this work with SPMsoftware and the AFNI toolbox.

2. Materials and methods

2.1. Formulation

Throughout the paper, vectors and matrices are denoted by boldlowercase and uppercase symbols, respectively. Vectors aredefined as column vectors. The time series at voxel i at time t isrepresented as

yiðtÞ ¼ ai ðsðtÞ � hðtÞÞ þ eiðtÞ; eiðtÞ � Nð0;CÞ ð1Þ

where yi is a N � 1 vector, N represents the number of scans, t is thetime which is between 1 and N, activation level at that voxel is rep-resented by ai, sðtÞ is the stimulus time series which is convolvedwith the HRF represented by hðtÞ, and eiðtÞ is a zero-mean Gaussianrandom variable that represents the observation noise with thecovariance matrix C. In this formulation we just model activationsignal and Gaussian noise, so other artifacts such as drift areremoved from the measured signal. We remove the drift by fittinga polynomial to the fMRI time series (Glover, 1999) and subtractingthe resulting fit. In this formulation only one stimulus is applied tothe subject. The solution for the multiple stimuli experiment is dis-cussed in Section 2.2.6. The size of ROI is denoted with M. Weassume that the HRF is approximately invariant across homogenous

Table 1Comparison of SPM, AFNI and proposed joint MLE

SPM AFNI Joint MLE

Activation levelestimation

Yes. For each voxel No. It is not part of the model Yes. For each voxel

HRF estimation Fixed basis functions, estimates parameters. Fixedacross voxels

1. FIR – estimate for each voxel Estimates HRF for a ROI2. Fixed a priori (no estimation)

Capturingvariability ofHRF

Time and dispersion derivation, or using FIRmodel. voxel-by-voxel

Yes (voxel-by-voxel) Yes (across ROIs)

Voxel-based orregion-based?

Voxel-based Voxel-based Region-based

Can handlemultiplestimuli?

Yes Yes (different HRF for eachstimulus can be estimated)

Yes (different HRF for each ROI)

Noise serialcorrelation

Yes, auto regressive (AR (1) for classical, forBayesian AR with order defined in the process)

Can model ARMA (1,1) with3dREMLfit

AR (1)

Summary Low flexibility in estimation of HRF via basisfunctions

More flexibility in HRF via FIR,but noisy estimate of HRF

Joint estimation of HRF and activation, ROI-based,regularized. More reliable HRF estimate for low SNR regions

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 713

ROI, and the activation levels for each voxel within each ROI mayvary. The combined time series equation for a ROI is:

Y ¼ S

h1

h2

..

.

hp

266664

377775 a1 a2 � � � aM½ � þ E ð2Þ

where Y ¼ ½y1 . . . yM � is a N �M matrix of observations, whosecolumn yi is the time series observed at voxel i. The HRF is repre-sented by ½h1 � � � hp�t . In agreement with the existing results as wellas common sense, we assume the HRF starts from zero, rises to itspeak and decays to zero eventually. The time-sampled version ofthe HRF is represented as a FIR filter that has p nonzero values.The column vector a ¼ a1 a2 � � � aM½ �t represents the activation lev-els for the M voxels in that ROI, E contains the observation noise. Forthe purposes of exposition this noise is at first assumed to be white,and then the results are extended to the case of temporally corre-lated noise. The noise in fMRI is spatially correlated (Woolrichet al., 2004b) as well. However, since the goal of this study is to lookat activation levels and HRF estimations, we simplify the noise byjust accounting for the temporal correlation. The spatially uncorre-lated noise model is prevalent in works where HRF is estimated,including (Afonso et al., 2008; Casanovaa et al., 2008; Goutteet al., 2000; Lindquist and Wager, 2007; Makni et al., 2005, 2008).

The matrix S represents the stimulus data and has a Toeplitzstructure in order to implement convolution by matrix multiplica-tion. Specifically:

S ¼

sð1Þ 0 � � � 0sð2Þ sð1Þ � � � 0

..

. ...

� � � ...

sðNÞ sðN � 1Þ � � � sðN � pþ 1Þ

266664

377775 ð3Þ

2.2. Maximum likelihood estimation

2.2.1. Rank-one approximation in white noiseBy assuming independent, identically distributed Gaussian

noise eiðtÞ � N ð0;r2Þ, the likelihood function for the voxel timeseries in a region with M voxels becomes

pðY; h;a;rÞ ¼YMi¼1

pðyi; h;ai;rÞ

/YMi¼1

r�N exp � 12r2 kyi � Shaik2

2

� �ð4Þ

where yi is the time series vector of voxel i with activation level ai

and r is the standard deviation of the noise at each of the voxels inthe ROI. It is assumed that the amount of noise is the same in a ROI.Maximizing the likelihood leads to the following optimization:

minh;a

XM

i¼1

kyi � Shaik22 ð5Þ

Recalling that Y ¼ ½y1 . . . yM �, Eq. (5) can be converted from a sum ofvector norms to its equivalent in Frobenius norm, transforming theoptimization problem as follows:

minh;akY � Shatk2

F ð6Þ

Since the product of two matrices is limited in its rank to the lowerof the constituent matrix ranks, the product SðhatÞ is a rank-onematrix. Thus, the problem represented by expression (6) is essen-tially to find a rank-one approximation to the observations Y. Fromthe Eckart and Young theorem (Eckart and Young, 1936) it followsthat a low-rank approximation either in the Frobenius norm orthe spectral norm is given by a truncated sum of the singular valuedecomposition of the original matrix. Specifically, the answer to:

minBkY � BkF ð7Þ

s:t: rankðBÞ 6 k

is given by:

Bopt ¼Xk

i¼1

giuivti ð8Þ

where gi are the ordered singular values and ui and vi are therespective left and right singular vectors of matrix Y. For rank-oneapproximation, only one of the terms will be retained, therefore:

Bopt ¼ Shat ¼ g1u1vt1 ð9Þ

from which we set at ¼ g1vt1 and Sh ¼ u1, the estimates are as

follows:

a ¼ g1v1 h ¼ Syu1 ð10Þ

where Sy is the pseudo inverse of S. To resolve the scaling ambiguitybetween h and a, a normalization constraint is imposed on h, i.e.,khk2 ¼ 1.

2.2.2. Regularization in white noiseWe now turn to the practical question of the required data for

the reliable computation of the maximum likelihood solution.

714 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

The HRF is estimated in a homogenous ROI, this imposes a practicalupper limit on the number of voxels that can be used in ourestimation.

To solve this problem, we turn to earlier investigations whichhave shown that the HRF is a smooth function of time (Buxtonet al., 2004). We incorporate this knowledge into our algorithmby imposing a smoothing constraint on the estimated HRF usingthe Tikhonov regularization technique (Tikhonov and Arsenin,1977). The cost function to be minimized in this case is

minh;akY � Shatk2

F þ kkDhk22 ð11Þ

where k is the regularization parameter. The smoothness constraintis imposed by a discrete second derivative matrix D which isdefined as

D ¼

�2 1 0 � � � � � � 0

1 �2 1 0 . .. ..

.

0 . .. . .

. . .. . .

. ...

..

. . .. . .

. . .. . .

.0

..

. . ..

0 1 �2 10 � � � � � � 0 1 �2

2666666666664

3777777777775

ð12Þ

We use a regularized rank-one matrix approximation to minimizethis penalized cost function. We first find the rank-one matrixapproximation of Y, which is Yð1Þ ¼ g1u1vt

1. We define u ¼ u1 andv ¼ g1v1. By substituting h with Syu1 and a with v, the iterativeestimation procedure will be simplified. The minimization problemis transformed to:

minu;vtkY � uvtk2

F þ kkLuk22 ð13Þ

where L ¼ DSy. Now the vectors u and v are updated iterativelyuntil convergence.4 A relaxation method is used to solve the prob-lem, resulting in the following update equations, whose derivationis relegated to Appendix.

uðkþ1Þ ¼ ðvðkÞtvðkÞIþ kLtLÞ�1

YvðkÞ ð14Þ

vðkþ1Þ ¼ Ytuðkþ1Þ

uðkþ1Þtuðkþ1Þð15Þ

After convergence and finding the optimum v and u, the estimatesof h and a are

a ¼ vopt h ¼ Syuopt ð16Þ

Selecting a suitable regularization parameter is an important issue.One of the prominent techniques that is used to find near-optimumregularization parameter is the generalized cross-validationcriterion (Golub et al., 1979), however, due to the nature of the costfunction in (13) and optimizing over two vectors, the derivation ofcross validation is mathematically intractable. Therefore we pro-pose another approach for choosing the regularization parameterwhich is explained in Section 3.

It is noteworthy that the basic matrix analysis and optimizationtechniques used in this section have a long and storied history, buttheir application to the specific problem at hand and the adapta-tion to the requirements of our problem are novel. In the passing,we note the distinction of our approach with respect to earlierworks that also used smoothing, e.g. (Shen and Huang, 2008)which finds the sparse principal component in their analysis. Thework in (Shen and Huang, 2008) solves the problem for a sparse-ness constraint while we modify the regularization term to be

4 This is similar to the technique used in (Shen and Huang, 2008).

the smoothing constraint and solve for the regularized rank-onematrix approximation.

2.2.3. Rank-one approximation in colored noiseVarious results, e.g. (Woolrich et al., 2001), indicate that the

additive noise in fMRI is not purely white, but rather is better mod-eled by a random sequence that is correlated in time. Therefore weextend our results to capture the more accurate model for thenoise. Using formulation for a voxel time series as Eq. (1), the like-lihood function for all the voxels time series in the ROI becomes

pðY; h;a;CÞ ¼YMi¼1

pðyi; h;ai;CÞ

/YMi¼1

jCj�12 exp �1

2ðyi � ShaiÞtC�1ðyi � ShaiÞ

� �ð17Þ

where h; a and C, the noise covariance matrix, are the unknownparameters. We model the fMRI noise as a first order autoregressive(AR) process. The use of AR (1) noise model has precedents in theliterature, e.g. (Bullmore et al., 1996). We assume that noise statis-tics are stationary within a homogeneous region of neighboringvoxels, similar to a global AR(1) which was proposed inFrackowiak et al. (2004). However, no assumptions are made inour work about global variations of the noise statistics. It followsthat the noise covariance matrix of the voxels in a given ROI is fixedand is represented by C. For a AR (1) temporal noise model, C ¼ r2

e C,where r2

e is the variance of the input white Gaussian noise in the AR(1) model and C is a matrix that its ‘th element is c‘ ¼ qj‘j. Bysubstituting C in Eq. (17), the maximum likelihood solution takesthe following form

minh;a;C

XM

i¼1

ðyi � ShaiÞtC�1ðyi � ShaiÞ þ lnC ð18Þ

Because C�1 is a positive definite matrix, it can be decomposed tosquare roots, i.e., C�1 ¼WtW. By substituting in Eq. (18),

minh;a;W

XM

i¼1

kWðyi � ShaiÞk22 þ lnC ð19Þ

Or,

minW

minh;akWY �WShatk2

F þ lnC ð20Þ

It should be noted that W itself is completely characterized by onescalar parameter q, the delay-one autocorrelation of noise. Severalvalues of q in the range of 0 to 1, as mentioned in Worsley et al.(2002), are tested. The minimization is done in two steps. For eachq, the matrix W is generated. First we optimize the inner minimiza-tion with respect to h and a, which gives a result as a function of W(or q). In the second step, the minimum over W (effectively a scalaroptimization over q) is found.

2.2.4. Regularized rank-one approximation in colored noiseThe optimization in this case is

minW

minh;akWY �WShatk2

F þ kkDhk22 þ lnC ð21Þ

The noise model is AR (1) as the previous section. For each correla-tion coefficient q, the matrix W is generated. For each W, weoptimize over h and a using the regularization algorithm explainedin Section 2.2.2. Then we minimize over W (effectively a scalaroptimization over q) and the optimum values are obtained for theactivation levels and the HRF estimates.

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 715

2.2.5. Iterative estimationIn previous sections all the voxels in the region were used in the

estimation of the HRF and were treated as active voxels. We extendthe estimation algorithm to the case where inactive voxels or insig-nificantly activated voxels also exist in the ROI. Inactive voxelsharm the reliability of the HRF and activation levels estimate. Weaddress this problem by iterating between the estimation of theHRF and the activation levels.

The iterative process consists of two halves. In the first half-iter-ation, the HRF and activation levels are estimated via a rank-onematrix approximation in the same way discussed in the previoussections considering all the voxels in the ROI. This gives an initialestimate for activation levels and the HRF. In the second half-iter-ation, the most recently estimated HRF is convolved with the stim-ulus time series and used as a design matrix in a generalized linearmodel (GLM). The new estimates for activation levels is obtainedby the least square estimation that is well-known in the literature.We then calculate Student’s t-test for each voxel in a ROI toperform statistical inference on activation levels. A threshold fort-values is selected according to the desired significance level.The t-values for each voxel are converted to P-values based onthe degrees of freedom of the dataset. The significance level ofP-value = 0.001 which is corrected for multiple comparisons is cho-sen for selecting the t-test threshold. The Bonferroni correctionmethod (Abdi, 2007) is used for multiple comparison correction,where for a single voxel the significance level is calculated by divid-ing the significance level 0.001 by the number of the tests (voxels).If P-value of each voxel is less than 0:001=Number of tests, thenthe voxel is active and the null hypothesis is rejected.

Those voxels with t-tests less than a threshold will be excludedfrom the next iteration. The iterations will stop when no furtherimprovement on the HRF estimate is observed. After convergence,the last HRF estimate is used for calculating the t-tests for all of thevoxels in the region.

It is important to select a good threshold for t-test values inorder to exclude inactive voxels. If the threshold is set too high,the number of the participating voxels decreases, which makesthe estimates less reliable, while setting the threshold too low willinclude too many low-quality voxels which will also harm the esti-mate. The iterative process outlined in this section can be used withany of the algorithms proposed previously. In particular, it can beused in the context of white or colored noise, and with or withoutregularization constraint. The effect of the selection of the t-testthreshold on the estimations is studied and the results are shownin Fig. 8. The convergence of the iterative algorithm is studied byexamining the changes in the estimated HRF from one iteration toanother. At most 10 iterations are sufficient to achieve convergence.

2.2.6. Extension to multiple stimuliAll of the above proposed algorithms can be extended to multiple

stimuli. If the responses to different stimuli are in different brain loca-tions, the estimations are straightforward. For each stimulus we findthe joint MLE solution in every ROI. Then for each ROI and each stim-ulus, we evaluate the objective function with its corresponding esti-mates. For each stimuli the estimates which give the lowest error at agiven ROI, are selected as the response to that stimuli.

In the case where two stimuli give responses in the same region,the same approach as mentioned above can still work for the eventrelated design as long as the stimulus time series are non-overlap-ping in time. There are several examples in the literature for thenon-overlapping, orthogonal designs (Henson et al., 2002).

2.3. Region of interest selection

For real fMRI data two approaches are used to obtain ROIs. Thefirst method is to use AFNI toolbox and analyze the data to locate

active areas of the brain. Then use 3dclust program in AFNI tofind the active clusters. These ROIs from AFNI are used as inputsfor the joint MLE algorithm.

The second approach is to segment the brain into equal sizecubes. The optimal cube size for this segmentation reflects atrade-off. The cubes (ROIs) should be small enough to guaranteethe invariance of the HRF within each cube but large enough tocontain reliable information for the estimation. Cubes of size 27–64 voxels represent a reasonable trade-off that produce goodresults. This cube segmentation uniformly divides the space.

For the analysis with the brain segmentation method, the jointMLE method is applied to each cube separately, so the cubes act asa rough approximation to desired regions. The resulting activationlevels are then used to determine the active regions. The ROIs fromthe first round are used in a second round of joint MLE to estimatethe final activation levels and HRF. In other words, we can ‘‘boot-strap’’ our method by running it in the first round with cube-shaped regions.

3. Results

3.1. Synthetic data

The proposed algorithm is applied to a synthetic fMRI time ser-ies constructed by convolving a stimulus time series with a bench-mark HRF. Our benchmark HRF is the difference of two Gammafunctions with known parameters as in (Glover, 1999)

hðtÞ ¼ ta1b1

� �a1

e�ðt�ða1b1ÞÞ

b1 � ct

a2b2

� �a2

e�ðt�ða2b2ÞÞ

b2 ð22Þ

where a1 ¼ 6; a2 ¼ 12; b1 ¼ b2 ¼ 0:9 and c ¼ 0:35. The synthetictime series also reflects the magnitude of activation levels and aGaussian noise. The noise statistics are fixed for the voxels in theROI and are chosen such that the signal-to-noise ratio (SNR) is withinthe typical range of the SNR in the real fMRI data, which is between0.2 and 1 (Jezzard et al., 2002). The SNR is defined asjjShaijj2=ðN � r2Þ. We apply our algorithm on both block designand event related experiments. For the block design experiment,the stimulus time series has a period of 60 s, with 30 s on and 30 soff intervals. For the event related experiment, there are 51 stimuliwith random inter stimulus intervals (ISI). The time series hasN ¼ 300 samples. M different time series for the voxels in a ROI aregenerated in this way. We performed experiments with white Gauss-ian noise and with colored Gaussian noise having AR (1) model.

For selecting the regularization parameter k, we calculate themean square error (MSE) between the benchmark and the esti-mated HRF for each value of k for the synthetic data set. We test200 different values for k, with the range from zero to 107. Theimplementation is done in MATLAB, on Intel core2, 2:0 GHz proces-sor. The computation time for selecting the optimum k depends onthe number of voxels in a ROI and changes from 60 to 72 s forM ¼ 50 and M ¼ 100 voxels, respectively. The regularizationparameter corresponds to the least MSE is selected as the optimumvalue. The process of choosing the optimum k is done on the blockdesign and event related simulated data sets. The chosen optimumregularization parameters on simulated data are used for the realfMRI data experiments.

3.1.1. Synthetic data with white Gaussian noiseThe MLE algorithm with rank-one approximation is applied to

the synthetic data with additive white Gaussian noise. The resultsare presented for both regularized and non-regularized cases inblock design and event related experiments.

For the error analysis, we generate voxel’s data with blockdesign paradigm and perform Q ¼ 500 Monte Carlo cycles over

716 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

the entire ROI. The activation levels are drawn from a Normal dis-tribution, ai � Nð3;0:01Þ. The MSE between the benchmark andthe estimated HRF, and the MSE between the benchmark and theestimated activation levels are calculated. First, an average solutionfor the estimated HRF, �h, over Q ¼ 500 realizations is computed.Then the estimated variance is approximated by Vð�hÞ, and theMSE is calculated as follows

�h ¼ 1Q

XQ

q¼1

hq ð23Þ

Vð�hÞ ¼ 1Q

XQ

q¼1

ðhq � �hÞ2

ð24Þ

MSEðh; hÞ � Vð�hÞ þ ðh� �hÞ2 ð25Þ

Fig. 1 shows the MSE between the benchmark and the esti-mated HRF with no regularization for a block design experiment(top) and for an event related experiment (bottom). Experimentaldata are produced for several SNRs and for observation setscontaining M ¼ 50 and 100 voxels. The results show that the per-formance of estimation improves with both higher values forSNR and M. The reported SNR at each case is the average of SNRamong voxels in the ROI.

The performance of the method including regularization con-straint is shown in Fig. 2 for observation sets containing M ¼ 50and 100 voxels for various SNR. The results for a block designand an event related experiments are shown on top and bottomof the figure, respectively. By imposing the smoothing constrainton the HRF shape, improvements are achieved in the form of smal-ler MSE between the benchmark and average of the HRF estimates.

0 5 10 15 20 250

0.005

0.01

0.015

0.02

0.025

0.03

0.035

0.04

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, no regularizationSNR=0.5, no regularizationSNR=0.8, no regularizationSNR=1, no regularization

0 5 10 15 20 250

0.5

1

1.5

2

2.5

3

3.5

4

4.5x 10−4

Time (Seconds)

Mea

n Sq

uare

Erro

r SNR=0.2, no regularizationSNR=0.5, no regularizationSNR =0.8, no regularizationSNR=1, no regularization

Fig. 1. Mean square error between the benchmark HRF and the estimated HRF on ROIsexperiment (a) and (b); and for an event related experiment (c) and (d) with white nois

To evaluate the estimate of the HRF, Monte Carlo simulation isperformed on a block design and also on an event related para-digm. In these simulations SNR ¼ 0:2, the number of voxels isM ¼ 50, and Monte Carlo simulation is performed Q ¼ 500 times.The mean of the HRF estimate and the variance of the estimationat each point are shown in Fig. 3 for block and event related para-digms. The HRF estimates are presented with and without impos-ing the smoothing constraint. It is observed that the smoothnessconstraint improves the accuracy of the mean of the HRF estimate(closer to the benchmark HRF) and improves (reduces) the estima-tion variance. In the interest of brevity the results are reported onlyat a single SNR, but results at a wide range of SNRs have beenobtained that show a similar trend.

Similar error analysis experiments are performed with the blockdesign paradigm to evaluate the activation level estimates. The ROIhas M ¼ 100 voxels and SNR ¼ 0:5. The benchmark coefficients aredrawn from a Gaussian distribution with mean 3 and variance of0.01. For each voxel, the mean and variance of the activation levelsare calculated via Monte Carlo. In Fig. 4a the absolute value of thedifference between the benchmark and estimated activation levelsat each voxel are plotted. It is observed that the regularization con-straint reduces the estimation error. The variances of the estimatedactivation levels are shown in Fig. 4b. The regularization constraintresults in a smaller estimation variance. Experiments at other SNRlevels were also performed which verify the expected trends withrespect to SNR and number of voxel observations, but were notincluded in the paper in the interest of brevity. The performanceof the estimation algorithm on estimating activation levels is eval-uated on the event related paradigm using Monte Carlo simulation.It is observed that the regularization constraint improves the HRF

0 5 10 15 20 250

0.005

0.01

0.015

0.02

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, no regularizationSNR=0.5, no regularizationSNR=0.8, no regularizationSNR=1, no regularization

0 5 10 15 20 250.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2 x 10−4

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, no regularizationSNR=0.5, no regularizationSNR=0.8, no regularizationSNR=1, no regularization

with M = 50 (left) and M = 100 (right) voxels with different SNRs for a block designe assumption and without regularization.

5 10 15 20 25

0

1

2

3

4

5

6

7x 10−3

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, with regularizationSNR=0.5, with regularizationSNR=0.8, with regularizationSNR=1, with regularization

0 5 10 15 20 250

0.5

1

1.5

2

2.5

3

3.5

4

x 10−3

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, with regularizationSNR=0.5, with regularizationSNR=0.8, with regularizationSNR=1, with regularization

0 5 10 15 20 250

0.5

1

1.5

2

2.5

3

3.5x 10−4

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2, with regularizationSNR−0.5, with regularizationSNR=0.8, with regularizationSNR=1,with regularization

0 5 10 15 20 250.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

2.2x 10−4

Time (Seconds)

Mea

n Sq

uare

Erro

r

SNR=0.2,with regularizationSNR=0.5,with regularizationSNR=0.8, with regularizationSNR=1, with regularization

Fig. 2. Mean square error between the benchmark HRF and the estimated HRF on M = 50 (left) and M = 100 (right) voxels with different SNRs for a block design experiment(a) and (b); and for an event related experiment (c) and (d) with white noise assumption and regularization constraint.

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 717

estimate as well as the activation levels estimate. It also improves(reduces) the estimation variance. These experiments show thatwhile event related and block design experiments both benefitfrom regularization, the improvements are more pronounced inthe latter case. In the case that voxels in a ROI have larger differ-ences between their activation levels, the same results for erroranalysis with Monte Carlo have been obtained. The variances ofthe estimates are higher compared to the case with smaller differ-ence between activation levels.

We compare the performance of the rank-one MLE method withthe Statistical Parameter Mapping (SPM) software which uses thegeneralized linear model (GLM) analysis. In SPM–GLM the HRF ispre-calculated using a mixture of Gamma functions and is fixedduring the analysis, the canonical HRF is used for this experiment,while in our approach the HRF is estimated jointly with activationlevels. The comparison is on synthetic data corrupted by whiteGaussian noise. In these simulations the number of voxels isM ¼ 100, activation levels are drawn from a Normal distributionNð3;0:1Þ and Monte Carlo simulation is performed Q ¼ 500 times.Several SNR values has been tested for block design and eventrelated experiments. The average of the MSE which was calculatedin Eq. (25), is computed for the HRF estimate and activation levelsestimates and reported in Table 2. Comparing the results showsmaller average MSE for the joint MLE approach compared toSPM–GLM. The smallest average MSE is obtained with the regular-ized MLE approach.

Fig. 5 compares the benchmark and the estimated HRFs for onerealization of a simulated time series of M ¼ 50 voxels. For thisexperiment, the SNR is 0:5 and the HRF estimates are plotted for

both block design and event related experiments. The regulariza-tion constraint improves the HRF estimate, and furthermore theimprovements are more pronounced, once again, for the blockdesign experiment.

3.1.2. Experiments with colored noiseThe noise correlations are modeled by a first-order autoregres-

sive AR (1) model. No regularization constraint is imposed on theHRF in this experiment. For the error analysis, M ¼ 50 syntheticfMRI time series are generated with the block design paradigm.We perform Q ¼ 500 Monte Carlo cycles over the entire ROI withM voxels. The estimation results are compared for various valuesof the correlation coefficient for noise. The goal of this experimentis to study the effect of whitening the colored noise on the HRF andactivation levels estimation.

The correlation coefficient q ¼ 0:4 is similar to the noisecorrelation observed in the real fMRI data sets. For a block designexperiment with SNR = 0.3, q ¼ 0:4 and the activation levels drawnfrom a Nð3;0:01Þ, no significant difference is observed in the HRFestimate by accounting for the colored noise. This result is inagreement with the observations made by previous investigators,e.g. (Makni et al., 2008; Marrelec et al., 2003b), which report thatunder the above-mentioned conditions the noise model has littleimpact on the estimated HRF shape.

We also investigate the effect of colored noise on detectionperformance. Fig. 6 shows the t-tests calculated in three differentscenarios: using the benchmark (ideal) noise correlation coeffi-cient, estimating the noise correlation coefficient, and ignoringthe correlation of noise (assuming it is white even though it is

0 5 10 15 20 25−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (Seconds)

% B

OLD

sig

nal c

hang

e

benchmark HRFwithout regularization

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (Seconds)

% B

OLD

sig

nal c

hang

e

benchmark HRFwith regularization

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (Seconds)

% B

old

sign

al c

hang

e

benchmark HRFwithout regularization

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (Seconds)

% B

old

sign

al c

hang

e

benchmark HRFwith regularization

Fig. 3. Average of the HRF estimate and the variance of the estimate at each point on a ROI with 50 voxels, with SNR ¼ 0:2 for (a) a block design experiment withoutregularization, (b) a block design experiment with regularization, (c) an event related experiment without regularization and (d) an event related experiment withregularization.

0 20 40 60 80 1000

0.02

0.04

0.06

0.08

0.1

0.12

0.14

voxels

diffe

renc

e of

ben

chm

ark

and

estim

ated

coe

ffici

ents

without regularizationwith regularization

0 20 40 60 80 1004

5

6

7

8

9

10

11

12

13

14x 10−3

voxels

varia

nce

of th

e es

timat

ed c

oeffi

cien

ts without regularizationwith regularization

Fig. 4. Comparing the benchmark and estimated activation levels by imposing regularization for a block design experiment with 100 voxels in the ROI and SNR ¼ 0:5. (a)Absolute value of the difference between estimated and benchmark activation levels. (b) Variance of the activation levels estimates.

718 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

not). It is observed that taking into account the noise correlationresult in t-values closer to the t-value calculated with the bench-mark covariance.

We repeat this experiment for various other values of correla-tion coefficient and SNR, and the results demonstrate the sametrend as seen in Fig. 6. The improvement in the t-values diminishas the SNR increases. This is also in agreement with the resultreported in (Makni et al., 2008): that using an AR (1) noise modelat low SNR slightly improves the detection.

Although colored noise models yield a small improvement atlow SNR for activation detection, our simulations show that thecorresponding improvements for the HRF estimation are not atall significant.

3.1.3. Iterative estimationWe apply our iterative algorithm on a synthetic data generated

as follows: an ROI is constructed with 40 active and 10 inactivevoxels. The observation length is N ¼ 300 and observations are

Table 2Monte Carlo comparison between SPM and joint MLE. Average of MSE betweenbenchmark and estimated values are shown for block and event related designs for aROI with 100 voxels.

Average of MSE SNR = 0.5 SNR = 0.8 SNR = 1

HRFNoReg, Block design 0.0078 0.0049 0.0041HRFReg, Block design 0.0071 0.0044 0.0037HRFSPM, Block design 0.0162 0.0162 0.0162

HRFNoReg, Event related 1:04� 10�4 6:43� 10�5 5:27� 10�5

HRFReg, Event related 9:95� 10�5 6:16� 10�5 5:02� 10�5

HRFSPM, Event related 0.0162 0.0162 0.0162

aNoReg, Block design 0.2093 0.0907 0.0684aReg, Block design 0.1832 0.0812 0.0617aSPM, Block design 0.3169 0.2974 0.2963

aNoReg, Event related 0.0657 0.0387 0.0297aReg, Event related 0.0657 0.0387 0.0297aSPM, Event related 0.3077 0.2716 0.2565

0 10 20 30 40 50

2

4

6

8

10

12

14

Voxels

T−te

st

t−value with benchmark covariance t−value with no whiteningt−value with estimated covariance

Fig. 6. T-tests calculated with the benchmark correlation coefficient, comparedwith the t-tests calculated with and without whitening.

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 719

contaminated with additive white Gaussian noise. The iterativeestimation algorithm is implemented with a regularizationconstraint.

For an event related paradigm with SNR ¼ 0:2 and activationlevels drawn from a normal distribution ai � Nð3;0:01Þ, Fig. 7ashows the benchmark HRF as well as the HRF estimates at the firstand last iterations. The t-tests are shown in Fig. 7b. The HRF esti-mate and activation detection performance improve through iter-ations. The t-test statistics improve through iterations andapproach the t-tests calculated with the benchmark HRF. We alsoperform simulations for SNR from 0.2 to 1. The results show thatthe advantage of the iterative algorithm is more pronounced atlower SNRs. We note that this was not unexpected, since themethod was devised in the first place to address the small data setsresulting from a region-based estimation of the HRF. We observedthe same improvement in the HRF and activation level estimationsfor the block design paradigm.

We investigate the effect of choosing the t-test threshold on theestimations for a ROI with 40 active and 10 inactive voxels. Fig. 8a,shows the estimated HRF for a block design experiment with theSNR of 0:8 for different t-test thresholds. The activation levels aredrawn from a Gaussian distribution with mean 2 and variance0.5. The results for an event related design are shown in Fig. 8b,the SNR is 0:2 for this experiment. By setting a high t-test thresh-old, the number of voxels whose t-test exceeds the thresholddecreases and the estimates become less reliable. The significancelevel of P-value = 0.001 corrected for multiple comparisons, corre-sponds to a t-test threshold which will result in reliable estimates.

0 5 10 15 20 25−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (Seconds)

% B

OLD

sig

nal c

hang

e

benchmark HRFwithout regularizationwith regularization

Fig. 5. HRF estimate for a ROI with 50 voxels and SNR = 0.5, illustrating the effect of th

3.2. Real fMRI data

3.2.1. Analysis with active ROIs from AFNIThe real data was acquired from a healthy subject at the Univer-

sity of Texas Southwestern at Dallas on a 3-T scanner. The experi-ment is an event related paradigm with 27 auditory stimuli, theevents are jittered with random inter stimulus intervals. The voxelsize is 3.125 � 3.125 � 4 mm and the number of scans is 160,TR = 2 s and the only preprocessing step done on the data is theregistration. The size of the whole brain data set is 64 � 64 � 40voxels.

The regions of interest are obtained as follow. The data is detr-ended (using Legendre polynomials of degree 3) and then analyzedwith AFNI toolbox using 3dDeconvolve to locate the active areasof the brain. The criteria for defining the voxel clusters are thethreshold for the activation map and the size of the cluster. Theactive contiguous voxels which are above these specific thresholdsare defined as ROIs. The program 3dclust in AFNI toolbox findsclusters that fit these criteria. We set the threshold for the F-testto be greater than 3:045 and the number of voxels in each clusterto be greater than 10 voxels. We obtain 4 ROIs with thesethresholds.

The iterative algorithm with regularization constraint and withthe color noise assumption is applied to each ROI. The HRF andactivation levels for each of the voxels in each region are estimated.The Student’s t-test is calculated for each voxel.

The activated brain regions identified by our approach are com-pared with the activation maps from AFNI. The axial and coronal

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

Time (Seconds)

% B

OLD

sig

nal c

hang

e

benchmark HRFwithout regularizationwith regularization

e regularization constraint in (a) block design and (b) event related experiments.

0 5 10 15 20 25

−0.1

0

0.1

0.2

0.3

0.4

0.5

Time (Seconds)

% B

OLD

sig

nal c

hang

e

HRF at first iterationHRF at last iterationBenchmark HRF

0 10 20 30 40 50

−2

−1

0

1

2

3

4

5

6

7

8

voxels

T−te

st

t−values at first iterationt−values at last iterationt−values with benchmark HRF

Fig. 7. (a) HRF estimates and (b) t-tests for a ROI with 50 voxels including 40 active voxels. one realization of event related with SNR = 0.2 and white noise assumption.

0 5 10 15 20 25

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

Time (seconds)

% B

OLD

sig

nal c

hang

e

t>4.2t>5.6t>7t>11.2t>12.6t>14benchmarh HRF

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (seconds)

% B

OLD

sig

nal c

hang

e

t>2.7t>5.4t>6.3t>7.2benchmarh HRF

Fig. 8. Effect of the t-test threshold on the estimated HRF, for the iterative algorithm in a ROI with 40 active and 10 inactive voxels for (a) a block design experiment withSNR ¼ 0:8, (b) an event related experiment with SNR ¼ 0:2.

720 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

views of the activation maps for one of the ROIs (ROI 1) are shownin Fig. 9. The cross-hair in Fig. 9 shows the voxel of interest. Thesubfigure a shows activation maps from our MLE approach andFig. 9b shows the HRF estimate with our iterative approach for thatvoxel and its corresponding ROI. The bottom row shows the regionof interest and the estimated HRF obtained from AFNI. The activa-tion maps are superimposed on the anatomical data. The HRF esti-mate with AFNI for the same voxel is shown in Fig. 9d. The resultson the two other ROIs are shown in Figs. 10 and 11, respectively.

The subfigure c in each of these figures shows the ROI fromAFNI, which is the input to the MLE. Thus, the activate regions fromiterative joint MLE algorithm which are shown in subfigure ashould be consistent with the AFNI activation maps in subfigurec. Our method is not too far from the AFNI when signal-to-noiseratio (SNR) is good, because in that case the precise determinationof the HRF is not crucial since at high SNR the activation is some-what robust to the precise form of the HRF. At very high SNR even asingle voxel may have enough information to produce a sufficientlyclose approximation of the HRF estimate. The difference betweenour algorithm and AFNI, one would expect, are not in entirely dif-ferent activation, but rather on the voxels in the activation regionwhere the SNR is lowest. Thus it is not surprising that the overalllocation of the new activation is not different from AFNI, but theshape of the new region is slightly different and improved.

In AFNI the activation analysis as well as the HRF estimation areperformed voxel-wise. In our approach, the HRF estimation is per-formed over a region. The HRF is estimated in AFNI via a finiteimpulse response filter.

3.2.2. Analysis with no knowledge on active ROIsThe same real fMRI data as in Section 3.2.1 is analyzed without

using another algorithm to provide input ROIs to the joint MLEalgorithm, but uses the joint MLE algorithm in a two-step ‘‘boot-strap’’ approach as mentioned in Section 2.3. In the first round,the brain is segmented into equal-size cubes of 3� 3� 3 voxels.The iterative algorithm with the colored noise assumption isapplied to each cube separately. The resulting activation levelsfrom first round are used to determine the active regions. Thethreshold for t-test to define active regions is chosen based onthe desired significance level for false positive errors. The regionsfrom first round are used in second round of joint MLE to estimatefinal activation levels and HRF.

The activation maps and HRF estimates from joint MLE are com-pared with the results from AFNI. Fig. 12 subfigure a shows coronaland sagittal views of the t-test activation map for ROIs defined asactive with joint MLE algorithm. The HRF estimate for the ROIwhich is shown with crosshairs is in subfigure b. The activationmap from AFNI which is F-test are shown in subfigure c, the HRFestimate from AFNI is shown in subfigure d. Fig. 12 shows thateven without prior knowledge of active ROI, our joint MLEapproach is capable of detecting active regions. As mentioned pre-viously, in AFNI the activation analysis as well as the HRF estima-tion are performed voxel-wise. In our approach, the HRFestimation is performed over a region.

We observe slight differences with AFNI, which are explainedby the differences between F-test and t-test and the differencesbetween the algorithms used in AFNI and our method. However,

0 5 10 15 20 25−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (seconds)

% B

OLD

sig

nal c

hang

e

Fig. 9. Axial and coronal views of a single slice activation maps for ROI 1. (a) Activation maps from the MLE method, (b) estimated HRF from MLE method, (c) region ofinterest obtained from AFNI and (d) estimated HRF from AFNI.

0 5 10 15 20 25−0.3−0.2−0.1

00.10.20.30.40.50.6

Time (seconds)

% B

OLD

sig

nal c

hang

e

Fig. 10. Axial and coronal views of a single slice activation maps for ROI 2. (a) Activation maps from the MLE method, (b) estimated HRF from MLE method, (c) region ofinterest obtained from AFNI and (d) estimated HRF from AFNI.

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 721

we note that varying amounts of difference may be observedbetween AFNI and our method, for example when signal-to-noiseratio is low on a per-voxel basis, but the active ROI is not too small.

In such cases, a more reliable HRF estimate and hence more reli-able estimate of activation level, can be obtained by our proposedjoint MLE method since it is ROI-based.

0 5 10 15 20 25−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

Time (seconds)

% B

OLD

sig

nal c

hang

e

Fig. 11. Axial and coronal views of a single slice activation maps for ROI 3. (a) Activation maps from the MLE method, (b) estimated HRF from MLE method, (c) region ofinterest obtained from AFNI and (d) estimated HRF from AFNI.

0 5 10 15 20 25−0.3

−0.2

−0.1

0

0.1

0.2

0.3

0.4

0.5

0.6

% B

OLD

sig

nal c

hang

e

Fig. 12. Coronal and sagittal views of a single slice activation maps. (a) Activation maps from the MLE method without prior knowledge on active ROI, (b) estimated HRF fromMLE method, (c) activation map based on AFNI 3dDeconvolve algorithm and (d) estimated HRF from AFNI.

722 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

4. Conclusion

This paper proposes a maximum likelihood method for jointlyestimating the activation levels and the Hemodynamic Response

Function for a region of interest in fMRI without using prior distri-butions. This will enable capturing variabilities in the HRF acrossthe brain regions. The proposed algorithm can use ROIs from AFNIor any other analysis tool as input. It also has the ability to work

N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724 723

without pre-defined ROIs by applying our algorithm on segmentedcubes in the first round. A rank-one approximation is used for esti-mating both the HRF and activation levels in a computationallytractable manner. A smoothness constraint is imposed on theHRF estimate to improve the results under limited observationdata, which often happens in a region-based estimation of theHRF. Both block design and event related paradigms can be usedwith this method.

We investigated the effect of colored noise. Our results showthat the accurate modeling of noise autocorrelation will not signif-icantly affect the HRF estimation, which is consistent with theresults reported earlier in the literature.

To have accurate HRF estimates, inactive voxels must beexcluded from HRF estimation, but activation estimation itself isdependent on HRF values. To address this problem, an iterativemethod for the estimation of the HRF and activation values is pro-posed. The algorithms were tested using synthetic and real dataover a wide range of parameters.

Acknowledgment

The authors would like to thank the radiology department ofthe University of Texas South Western medical center at Dallasfor providing the fMRI data.

Appendix A. Regularized rank-one matrix approximation

In this appendix we address the computation of the regularizedrank-one minimization problem

minu;vtkY � uvtk2

F þ kkLuk22 ðA:1Þ

Clearly the answer may deviate from the right and left singular vec-tors due to the penalty factor. Our approach to find vectors u and vis found via a relaxation method, specifically, by the successive opti-mization of u while v is held constant, and vice versa.

First, for a fixed v we optimize u. The minimization criterion(A.1) can be rewritten asX

j

Xi

ðyji � ujv iÞ2 þ kutLtLu ðA:2Þ

By expanding the squares, we have

minu

Xj

Xi

y2ji � 2

Xj

Xi

yjiujv i þX

j

Xi

u2j v

2i þ kutLtLu ðA:3Þ

The first term does not have any effect on the minimization. There-fore, we need to minimize

minu� 2utYvþ utuvtv þ kutLtLu ðA:4Þ

By taking the derivative with respect to u and setting it to zero, weget

�2Yvþ 2vtvuþ 2kLtLu ¼ 0 ðA:5Þ

In each iteration k, we can use the equation above to calculate u interms of observed values as well as the latest vðkÞ, the value of v initeration k.

uðkþ1Þ ¼ ðvðkÞtvðkÞIþ kLtLÞ�1

YvðkÞ ðA:6Þ

We now perform the remaining task, namely to find an update for vgiven the latest calculated u. By expanding the squares in a mannersimilar to Eq. (A.3) and ignoring the terms that do not involve v,

minv� 2

Xj

Xi

yjiujv i þ utuvtv ðA:7Þ

which is equivalent to

minv� 2Ytuvþ utuvtv ðA:8Þ

By taking derivative with respect to v and setting it to zero

�2Ytuþ 2utuv ¼ 0 ðA:9Þ

The solution is

vðkþ1Þ ¼ Ytuðkþ1Þ

uðkþ1Þtuðkþ1Þ: ðA:10Þ

References

Abdi, H., 2007. Bonferroni and Sidak corrections for multiple comparisons. In:Salkind, N.J. (Ed.), Encyclopedia of Measurement and Statistics. Sage, ThousandOaks (CA), pp. 103–107.

Afonso, D., Sanches, J., Lauterbach, M., 2008. Robust brain activation detection infunctional MRI. In: 15th IEEE International Conference on Image Processing(ICIP), San Diego, CA, pp. 2960–2963.

Aguirre, G., Zarahn, E., D’Esposito, M., 1998. The variability of human BOLDhemodynamic response. Neuroimage 8, 360–369.

Boynton, G., Engel, S., Glover, G., Heeger, D., 1996. Linear systems analysis offunctional magnetic resonance imaging in human V1. J. Neurosci. 16, 4207–4221.

Bullmore, E., Brammer, M., Williams, S., Rabe-Hesketh, S., Janot, N., David, A.,Mellers, J., Howard, R., Sham, P., 1996. Statistical methods of estimation andinference for functional MR image analysis. Mag. Reson. Med. 35, 261–277.

Buxton, R., Uludag, K., Dubowitz, D.J., Liu, T.T., 2004. Modeling the hemodynamicresponse to brain activation. NeuroImage 23, S220–S233.

Casanovaa, R., Ryalib, S., Serencesc, J., Yanga, L., Kraftd, R., Laurientia, P., Maldjiana,J., 2008. The impact of temporal regularization on estimates of the boldhemodynamic response function: a comparative analysis. Neuroimage 40,1606–1618.

Chaari, L., Vincent, T., Forbes, F., Dojat, M., Ciuciu, P., 2013. Fast joint detection–estimation of evoked brain activity in event-related fMRI using a variationalapproach. IEEE Trans. Medical Imag. 32, 821–837.

Ciuciu, P., Poline, J.B., Marrelec, G., Idier, J., Pallier, C., Benali, H., 2003. Unsupervisedrobust nonparametric estimation of the hemodynamic response function forany fMRI experiment. IEEE Trans. Medical Imag. 22, 1235–1251.

Cohen, M., 1997. Parametric analysis of fMRI data using linear systems methods.NeuroImage 6, 93–103.

Cox, R., 1996. AFNI: software for analysis and visualization of functional magneticresonance neuroimages. Comput. Biomedical Res. 29, 162–173.

Dale, A.M., Buckner, R.L., 1997. Selective averaging of rapidly presented individualtrials using fMRI. Human Brain Mapp. 5, 329–340.

Eckart, C., Young, G., 1936. The approximation of one matrix by another of lowerrank. Psychometrika 1, 211–218.

Frackowiak, R., Ashburner, J.T., Penny, W., Zeki, S., Friston, K., Frith, C., Dolan, R.,Price, C., 2004. Human Brain Function. second ed.. Elsevier Science.

Friston, K., Jezzard, P., Turner, R., 1994. Analysis of functional MRI time-series.Human Brain Mapp. 1, 153–171.

Friston, K., Josephs, O., Rees, G., Turner, R., 1998. Nonlinear event-related responsesin fMRI. Magnet. Reson. Med. 39, 41–52.

Glover, G., 1999. Deconvolution of impulse response in event-related bold fMRI.Neuroimage 9, 416–429.

Golub, G., Heath, M., Wahba, G., 1979. Generalized cross-validation as a method forchoosing a good ridge parameter. Technometrics 21, 215–223.

Gössl, C., Auer, D., Fahrmeir, L., 2001a. Bayesian spatiotemporal inference infunctional magnetic resonance imaging. Biometrics 57, 554–562.

Gössl, C., Fahrmeir, L., Auer, D., 2001b. Bayesian modeling of the hemodynamicresponse function in bold fMRI. NeuroImage 14, 140–148.

Goutte, C., Nielsen, F., Hansen, L., 2000. Modeling the haemodynamic response infMRI using smooth FIR filters. IEEE Trans. Medical Imag. 19, 1188–1201.

Henson, R., Price, C., Rugg, M., Turner, R., Friston, K., 2002. Detecting latencydifferences in event-related BOLD responses: application to words versusnonwords and initial versus repeated face presentations. Neuroimage 15, 83–97.

Jezzard, P., Matheews, P., Smith, S., 2002. Functional MRI an Introduction toMethods. Oxford University Press.

Josephs, O., Turner, R., Friston, K., 1997. Event-related fMRI. Human Brain Mapp. 5,243–248.

Lange, N., Zeger, S., 1997. Non-linear Fourier time series analysis for human brainmapping by functional magnetic resonance imaging. J. Roy. Stat. Soc. Ser. C:Appl. Stat. 46, 1–29.

Liao, C., Worsley, K., Poline, J., Aston, J., Duncan, G., Evans, A., 2002. Estimating thedelay of the fMRI response. Neuroimage 16, 593–606.

Lindquist, M., Wager, T., 2007. Validity and power in hemodynamic responsemodeling: a comparison study and a new approach. Human Brain Mapp. 28,764–784.

724 N. Bazargani, A. Nosratinia / Medical Image Analysis 18 (2014) 711–724

Makni, S., Ciuciu, P., Idier, J., Poline, J., 2005. Joint detection–estimation of brainactivity in functional MRI: a multichannel deconvolution solution. IEEE Trans.Signal Process. 53, 3448–3502.

Makni, S., Idier, J., Vincent, T., Thirion, B., Dehaene-Lambertz, G., Ciuciu, P., 2008. Afully Bayesian approach to the parcel-based detection–estimation of brainactivity in fMRI. Neuroimage 41, 941–969.

Marrelec, G., Benali, H., Ciuciu, P., Pelegrini Issac, M., Poline, J., 2003a. Robustestimation of the hemodynamic response function in event-related BOLD fMRIusing basic physiological information. Human Brain Mapp. 19, 1–17.

Marrelec, G., Ciuciu, P., Pelegrini-Issac, M., Benali, H., 2003b. Estimation of thehemodynamic response function in event-related functional MRI: directedacyclic graphs for a general Bayesian inference framework. Inf. Process. Med.Imaging 18, 635–646.

Rajapakse, J., Kruggel, F., Maisog, J., Von Cramon, D., 1998. Modeling hemodynamicresponse for analysis of functional MRI time-series. Human Brain Mapp. 6, 283–300.

Riera, J., Watanabe, J., Kazuki, I., Naoki, M., Aubert, E., Ozaki, T., et al., 2004. A state-space model of the hemodynamic approach: nonlinear filtering of BOLD signals.Neuroimage 21, 547–567.

Shen, H., Huang, J., 2008. Sparse principal component analysis via regularized lowrank matrix approximation. J. Multivar. Anal. 99, 1015–1034.

Tikhonov, A., Arsenin, V., 1977. Solution of Ill-posed Problems. W.H. Winston &Sons, Washington, DC.

Vakorin, V., Borowsky, R., Sarty, G., 2007. Characterizing the functional MRIresponse using Tikhonov regularization. Stat. Med. 26, 3830–3844.

Woolrich, M., Behrens, T., Smith, S., 2004a. Constrained linear basis sets for HRFmodelling using variational bayes. Neuroimage 21, 1748–1761.

Woolrich, M., Jenkinson, M., Brady, J., Smith, S., 2004b. Fully Bayesian spatio-temporal modeling of FMRI data. IEEE Trans. Medical Imag. 23, 213–231.

Woolrich, M., Ripley, B., Brady, M., Smith, S., 2001. Temporal autocorrelation inunivariate linear modeling of fMRI data. Neuroimage 14, 1370–1386.

Worsley, K., Liao, C., Aston, J., Petre, V., Duncan, G., Morales, F., Evans, A., 2002. Ageneral statistical analysis for fMRI data. NeuroImage 15, 1–15.

Zarahn, E., 2002. Using larger dimensional signal subspaces to increase sensitivity infMRI time series analyses. Human Brain Mapp. 17, 13–16.

Zhang, C., Jiang, Y., Yu, T., 2007. A comparative study of one-level and two-levelsemiparametric estimation of hemodynamic response function for fMRI data.Stat. Med. 26, 3845–3861.