the time-rescaling theorem and its application to neural spike

22
NOTE Communicated by Jonathan Victor The Time-Rescaling Theorem and Its Application to Neural Spike Train Data Analysis Emery N. Brown [email protected] Neuroscience Statistics Research Laboratory, Department of Anesthesia and Critical Care, Massachusetts General Hospital, Boston, MA 02114, U.S.A., and Division of Health Sciences and Technology, Harvard Medical School / Massachusetts Institute of Technology, Cambridge, MA 02139, U.S.A. Riccardo Barbieri [email protected] Neuroscience Statistics Research Laboratory, Department of Anesthesia and Critical Care, Massachusetts General Hospital, Boston, MA 02114, U.S.A. Val  erie Ventura [email protected] Robert E. Kass [email protected] Department of Statistics, Carnegie Mellon University, Center for the Neural Basis of Cognition, Pittsburg, PA 15213, U.S.A. Loren M. Frank [email protected] Neuroscience Statistics Research Laboratory, Department of Anesthesia and Critical Care, Massachusetts General Hospital, Boston, MA 02114, U.S.A. Measuring agreement between a statistical model and a spike train data series, that is, evaluating goodness of t, is crucial for establishing the models validity prior to using it to make inferences about a particular neural system. Assessing goodness-of- t is a challenging problem for point process neural spike train models, especially for histogram-based models such as perstimulus time histograms (PSTH) and rate functions estimated by spike train smoothing. The time-rescaling theorem is a well- known result in probability theory, which states that any point process with an integrable conditional intensityfunction may be transformed into a Poisson process with unit rate. We describe how the theorem may be used to develop goodness-of- t tests for both parametric and histogram- based point process models of neural spike trains. We apply these tests in two examples: a comparison of PSTH, inhomogeneous Poisson, and inho- mogeneous Markov interval models of neural spike trains from the sup- Neural Computation 14, 325–346 (2001) c ° 2001 Massachusetts Institute of Technology

Upload: others

Post on 12-Feb-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

NOTE Communicated by Jonathan Victor

The Time-Rescaling Theorem and Its Application to NeuralSpike Train Data Analysis

Emery N BrownbrownsrlbmghharvardeduNeuroscience Statistics Research Laboratory Department of Anesthesia and CriticalCare Massachusetts General Hospital Boston MA 02114 USA and Division ofHealth Sciences and Technology Harvard Medical SchoolMassachusetts Institute ofTechnology Cambridge MA 02139 USA

Riccardo BarbieribarbierisrlbmghharvardeduNeuroscience Statistics Research Laboratory Department of Anesthesia and CriticalCare Massachusetts General Hospital Boston MA 02114 USA

Val Acircerie VenturavventurastatcmveduRobert E KasskassstatcmueduDepartment of Statistics Carnegie Mellon University Center for the Neural Basis ofCognition Pittsburg PA 15213 USA

Loren M FranklorenneurostatmghharvardeduNeuroscience Statistics Research Laboratory Department of Anesthesia and CriticalCare Massachusetts General Hospital Boston MA 02114 USA

Measuring agreement between a statistical model and a spike train dataseries that is evaluating goodness of t is crucial for establishing themodelrsquos validity prior to using it to make inferences about a particularneural system Assessing goodness-of-t is a challenging problem forpoint process neural spike train models especially for histogram-basedmodels such as perstimulus time histograms (PSTH) and rate functionsestimated by spike train smoothing The time-rescaling theorem is a well-known result in probability theory which states that any point processwith an integrable conditional intensity function may be transformed intoa Poisson process with unit rate We describe how the theorem may beused to develop goodness-of-t tests for both parametric and histogram-based point process models of neural spike trains We apply these tests intwo examplesa comparison of PSTH inhomogeneous Poisson and inho-mogeneous Markov interval models of neural spike trains from the sup-

Neural Computation 14 325ndash346 (2001) cdeg 2001 Massachusetts Institute of Technology

326 E N Brown R Barbieri V Ventura R E Kass and L M Frank

plementary eye eld of a macque monkey and a comparison of temporaland spatial smoothers inhomogeneous Poisson inhomogeneous gammaand inhomogeneous inverse gaussian models of rat hippocampal placecell spiking activity To help make the logic behind the time-rescaling the-orem more accessible to researchers in neuroscience we present a proofusing only elementary probability theory arguments We also show howthe theorem may be used to simulate a general point process model of aspike train Our paradigm makes it possible to compare parametric andhistogram-based neural spike train models directly These results sug-gest that the time-rescaling theorem can be a valuable tool for neuralspike train data analysis

1 Introduction

The development of statistical models that accurately describe the stochasticstructure of neural spike trains is a growing area of quantitative research inneuroscience Evaluating model goodness of tmdashthat is measuring quan-titatively the agreement between a proposed model and a spike train dataseriesmdashis crucial for establishing the modelrsquos validity prior to using it tomake inferences about the neural system being studied Assessing goodnessof t for point neural spike train models is a more challenging problem thanfor models of continuous-valued processes This is because typical distancediscrepancy measures applied in continuous data analyses such as the aver-age sum of squared deviations between recorded data values and estimatedvalues from the model cannot be directly computed for point process dataGoodness-of-t assessments are even morechallenging forhistogram-basedmodels such as peristimulus time histograms (PSTH) and rate functionsestimated by spike train smoothing because the probability assumptionsneeded to evaluate model properties are often implicit Berman (1983) andOgata (1988) developed transformations that under a given model convertpoint processes like spike trains into continuous measures in order to assessmodel goodness of t One of the theoretical results used to construct thesetransformations is the time-rescaling theorem

A form of the time-rescaling theorem is well known in elementary prob-ability theory It states that any inhomogeneous Poisson process may berescaled or transformed into a homogeneous Poisson process with a unit rate(Taylor amp Karlin 1994) The inverse transformation is a standard method forsimulating an inhomogeneous Poisson process from a constant rate (homo-geneous) Poisson process Meyer (1969) and Papangelou (1972) establishedthe general time-rescaling theorem which states that any point process withan integrable rate function may be rescaled into a Poisson process with aunit rate Berman and Ogata derived their transformations by applying thegeneral form of the theorem While the more general time-rescaling theo-rem is well known among researchers in point process theory (Br Acircemaud1981 Jacobsen 1982 Daley amp Vere-Jones 1988 Ogata 1988 Karr 1991) the

The Time-Rescaling Theorem 327

theorem is less familiar to neuroscience researchers The technical nature ofthe proof which relies on the martingale representation of a point processmay have prevented its signicance from being more broadly appreciated

The time-rescaling theorem has important theoretical and practical im-plications for application of point process models in neural spike train dataanalysis To help make this result more accessible to researchers in neu-roscience we present a proof that uses only elementary probability theoryarguments We describe how the theorem may be used to develop goodness-of-t tests for both parametric and histogram-based point process modelsof neural spike trains We apply these tests in two examples a compar-ison of PSTH inhomogeneous Poisson and inhomogeneous Markov in-terval models of neural spike trains from the supplementary eye eld ofa macque monkey and a comparison of temporal and spatial smoothersinhomogeneous Poisson inhomogeneous gamma and inhomogeneous in-verse gaussian models of rat hippocampal place cell spiking activity Wealso demonstrate how the time-rescaling theorem may be used to simulatea general point process

2 Theory

21 The Conditional Intensity Function and the Joint Probability Den-sity of the Spike Train Dene an interval (0 T] and let 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T be a set of event (spike) times from a point processFor t 2 (0 T] let N (t) be the sample path of the associated counting processThe sample path is a right continuous function that jumps 1 at the eventtimes and is constant otherwise (Snyder amp Miller 1991) In this way N (t)counts the number and location of spikes in the interval (0 t] Therefore itcontains all the information in the sequence of events or spike times Fort 2 (0 T] we dene the conditional or stochastic intensity function as

l(t | Ht ) D limD t0

Pr(N (t C D t) iexcl N (t) D 1 | Ht)D t

(21)

where Ht D f0 lt u1 lt u2 uN (t) lt tg is the history of the process up totime t and uN (t) is the time of the last spike prior to t If the point process is aninhomogeneous Poisson process then l(t | Ht ) D l (t) is simply the Poissonrate function Otherwise it depends on the history of the process Hencethe conditional intensity generalizes the denition of the Poisson rate It iswell known that l (t | Ht ) can be dened in terms of the event (spike) timeprobability density f (t | Ht ) as

l(t | Ht ) Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht) du

(22)

for t gt uN (t) (Daley amp Vere-Jones 1988 Barbieri Quirk Frank Wilson ampBrown 2001) We may gain insight into equation 22 and the meaning of the

328 E N Brown R Barbieri V Ventura R E Kass and L M Frank

conditional intensity function by computing explicitly the probability thatgiven Ht a spike uk occurs in [t t C D t) where k D N (t) C 1 To do this wenote that the events fN (t C Dt) iexcl N (t) D 1 | Htg and fuk 2 [t t C D t) | Htgare equivalent and that therefore

Pr (N (t C D t) iexcl N (t) D 1 | Ht ) D Pr(uk 2 [t t C Dt) | uk gt t Ht )

DPr(uk 2 [t t C Dt) | Ht )

Pr (uk gt t | Ht)

D

R tCDtt f (u | Ht ) du

1 iexclR t

uN (t)f (u | Ht ) du

frac14 f (t | Ht)D t

1 iexclR t

uN (t)f (u | Ht ) du

D l (t | Ht )D t (23)

Dividing by D t and taking the limit gives

limD t0

Pr(uk 2 [t t C D t) | Ht )D t

Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht ) du

D l(t | Ht ) (24)

which is equation 21 Therefore l (t | Ht)D t is the probability of a spike in[t t C Dt) when there is history dependence in the spike train In survivalanalysis the conditional intensity is termed the hazard function because inthis case l (t | Ht )D t measures the probability of a failure ordeath in [t tCD t)given that the process has survived up to time t (Kalbeisch amp Prentice1980)

Because we would like to apply the time-rescaling theorem to spike traindata series we require the joint probability density of exactly n event timesin (0 T] This joint probability density is (Daley amp Vere-Jones 1988 BarbieriQuirk et al 2001)

f (u1 u2 un N (T) D n)

D f (u1 u2 un unC1 gt T)

D f (u1 u2 un N (un) D n) Pr (unC1 gt T | u1 u2 un)

DnY

kD1

l(uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l(u | Hu) dufrac14

cent exp

(iexcl

Z T

un

l (u | Hu) du

) (25)

The Time-Rescaling Theorem 329

where

f (u1 u2 un N (un) D n)

DnY

kD1

l (uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l (u | Hu) dufrac14

(26)

Pr(unC1 gt T | u1 u2 un) D exp

(iexcl

Z T

un

l(u | Hu) du

) (27)

and u0 D 0 Equation 26 is the joint probability density of exactly n eventsin (0 un] whereas equation 27 is the probability that the n C 1st eventoccurs after T The conditional intensity function provides a succinct wayto represent the joint probability density of the spike times We can nowstate and prove the time-rescaling theorem

Time-Rescaling Theorem Let 0 lt u1 lt u2 lt lt un lt T be a realizationfrom a point process with a conditional intensity function l(t | Ht ) satisfying0 lt l (t | Ht ) for all t 2 (0 T] Dene the transformation

L (uk) DZ uk

0l (u | Hu ) du (28)

for k D 1 n and assume L (t) lt 1 with probability one for all t 2 (0 T]Then the L (uk) rsquos are a Poisson process with unit rate

Proof Let tk D L (uk) iexcl L (ukiexcl1 ) for k D 1 n and set tT DR Tun

l(u | Hu) du To establish the result it sufces to show that the tksare independent and identically distributed exponential random variableswith mean one Because the tk transformation is one-to-one and tnC1 gt tTif and only if unC1 gt T the joint probability density of the tkrsquos is

f (t1 t2 tn tnC1 gt tT )

D f (t1 tn) Pr (tnC1 gt tT | t1 tn) (29)

We evaluate each of the two terms on the right side of equation 29 Thefollowing two events are equivalent

ftnC1 gt tT | t1 tng D funC1 gt T | u1 u2 ung (210)

Hence

Pr(tnC1 gt tT | t1 t2 tn ) D Pr(unC1 gt T | u1 u2 un )

D exp

(iexcl

Z T

un

l (u | Hun) du

)

D expfiexcltTg (211)

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

326 E N Brown R Barbieri V Ventura R E Kass and L M Frank

plementary eye eld of a macque monkey and a comparison of temporaland spatial smoothers inhomogeneous Poisson inhomogeneous gammaand inhomogeneous inverse gaussian models of rat hippocampal placecell spiking activity To help make the logic behind the time-rescaling the-orem more accessible to researchers in neuroscience we present a proofusing only elementary probability theory arguments We also show howthe theorem may be used to simulate a general point process model of aspike train Our paradigm makes it possible to compare parametric andhistogram-based neural spike train models directly These results sug-gest that the time-rescaling theorem can be a valuable tool for neuralspike train data analysis

1 Introduction

The development of statistical models that accurately describe the stochasticstructure of neural spike trains is a growing area of quantitative research inneuroscience Evaluating model goodness of tmdashthat is measuring quan-titatively the agreement between a proposed model and a spike train dataseriesmdashis crucial for establishing the modelrsquos validity prior to using it tomake inferences about the neural system being studied Assessing goodnessof t for point neural spike train models is a more challenging problem thanfor models of continuous-valued processes This is because typical distancediscrepancy measures applied in continuous data analyses such as the aver-age sum of squared deviations between recorded data values and estimatedvalues from the model cannot be directly computed for point process dataGoodness-of-t assessments are even morechallenging forhistogram-basedmodels such as peristimulus time histograms (PSTH) and rate functionsestimated by spike train smoothing because the probability assumptionsneeded to evaluate model properties are often implicit Berman (1983) andOgata (1988) developed transformations that under a given model convertpoint processes like spike trains into continuous measures in order to assessmodel goodness of t One of the theoretical results used to construct thesetransformations is the time-rescaling theorem

A form of the time-rescaling theorem is well known in elementary prob-ability theory It states that any inhomogeneous Poisson process may berescaled or transformed into a homogeneous Poisson process with a unit rate(Taylor amp Karlin 1994) The inverse transformation is a standard method forsimulating an inhomogeneous Poisson process from a constant rate (homo-geneous) Poisson process Meyer (1969) and Papangelou (1972) establishedthe general time-rescaling theorem which states that any point process withan integrable rate function may be rescaled into a Poisson process with aunit rate Berman and Ogata derived their transformations by applying thegeneral form of the theorem While the more general time-rescaling theo-rem is well known among researchers in point process theory (Br Acircemaud1981 Jacobsen 1982 Daley amp Vere-Jones 1988 Ogata 1988 Karr 1991) the

The Time-Rescaling Theorem 327

theorem is less familiar to neuroscience researchers The technical nature ofthe proof which relies on the martingale representation of a point processmay have prevented its signicance from being more broadly appreciated

The time-rescaling theorem has important theoretical and practical im-plications for application of point process models in neural spike train dataanalysis To help make this result more accessible to researchers in neu-roscience we present a proof that uses only elementary probability theoryarguments We describe how the theorem may be used to develop goodness-of-t tests for both parametric and histogram-based point process modelsof neural spike trains We apply these tests in two examples a compar-ison of PSTH inhomogeneous Poisson and inhomogeneous Markov in-terval models of neural spike trains from the supplementary eye eld ofa macque monkey and a comparison of temporal and spatial smoothersinhomogeneous Poisson inhomogeneous gamma and inhomogeneous in-verse gaussian models of rat hippocampal place cell spiking activity Wealso demonstrate how the time-rescaling theorem may be used to simulatea general point process

2 Theory

21 The Conditional Intensity Function and the Joint Probability Den-sity of the Spike Train Dene an interval (0 T] and let 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T be a set of event (spike) times from a point processFor t 2 (0 T] let N (t) be the sample path of the associated counting processThe sample path is a right continuous function that jumps 1 at the eventtimes and is constant otherwise (Snyder amp Miller 1991) In this way N (t)counts the number and location of spikes in the interval (0 t] Therefore itcontains all the information in the sequence of events or spike times Fort 2 (0 T] we dene the conditional or stochastic intensity function as

l(t | Ht ) D limD t0

Pr(N (t C D t) iexcl N (t) D 1 | Ht)D t

(21)

where Ht D f0 lt u1 lt u2 uN (t) lt tg is the history of the process up totime t and uN (t) is the time of the last spike prior to t If the point process is aninhomogeneous Poisson process then l(t | Ht ) D l (t) is simply the Poissonrate function Otherwise it depends on the history of the process Hencethe conditional intensity generalizes the denition of the Poisson rate It iswell known that l (t | Ht ) can be dened in terms of the event (spike) timeprobability density f (t | Ht ) as

l(t | Ht ) Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht) du

(22)

for t gt uN (t) (Daley amp Vere-Jones 1988 Barbieri Quirk Frank Wilson ampBrown 2001) We may gain insight into equation 22 and the meaning of the

328 E N Brown R Barbieri V Ventura R E Kass and L M Frank

conditional intensity function by computing explicitly the probability thatgiven Ht a spike uk occurs in [t t C D t) where k D N (t) C 1 To do this wenote that the events fN (t C Dt) iexcl N (t) D 1 | Htg and fuk 2 [t t C D t) | Htgare equivalent and that therefore

Pr (N (t C D t) iexcl N (t) D 1 | Ht ) D Pr(uk 2 [t t C Dt) | uk gt t Ht )

DPr(uk 2 [t t C Dt) | Ht )

Pr (uk gt t | Ht)

D

R tCDtt f (u | Ht ) du

1 iexclR t

uN (t)f (u | Ht ) du

frac14 f (t | Ht)D t

1 iexclR t

uN (t)f (u | Ht ) du

D l (t | Ht )D t (23)

Dividing by D t and taking the limit gives

limD t0

Pr(uk 2 [t t C D t) | Ht )D t

Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht ) du

D l(t | Ht ) (24)

which is equation 21 Therefore l (t | Ht)D t is the probability of a spike in[t t C Dt) when there is history dependence in the spike train In survivalanalysis the conditional intensity is termed the hazard function because inthis case l (t | Ht )D t measures the probability of a failure ordeath in [t tCD t)given that the process has survived up to time t (Kalbeisch amp Prentice1980)

Because we would like to apply the time-rescaling theorem to spike traindata series we require the joint probability density of exactly n event timesin (0 T] This joint probability density is (Daley amp Vere-Jones 1988 BarbieriQuirk et al 2001)

f (u1 u2 un N (T) D n)

D f (u1 u2 un unC1 gt T)

D f (u1 u2 un N (un) D n) Pr (unC1 gt T | u1 u2 un)

DnY

kD1

l(uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l(u | Hu) dufrac14

cent exp

(iexcl

Z T

un

l (u | Hu) du

) (25)

The Time-Rescaling Theorem 329

where

f (u1 u2 un N (un) D n)

DnY

kD1

l (uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l (u | Hu) dufrac14

(26)

Pr(unC1 gt T | u1 u2 un) D exp

(iexcl

Z T

un

l(u | Hu) du

) (27)

and u0 D 0 Equation 26 is the joint probability density of exactly n eventsin (0 un] whereas equation 27 is the probability that the n C 1st eventoccurs after T The conditional intensity function provides a succinct wayto represent the joint probability density of the spike times We can nowstate and prove the time-rescaling theorem

Time-Rescaling Theorem Let 0 lt u1 lt u2 lt lt un lt T be a realizationfrom a point process with a conditional intensity function l(t | Ht ) satisfying0 lt l (t | Ht ) for all t 2 (0 T] Dene the transformation

L (uk) DZ uk

0l (u | Hu ) du (28)

for k D 1 n and assume L (t) lt 1 with probability one for all t 2 (0 T]Then the L (uk) rsquos are a Poisson process with unit rate

Proof Let tk D L (uk) iexcl L (ukiexcl1 ) for k D 1 n and set tT DR Tun

l(u | Hu) du To establish the result it sufces to show that the tksare independent and identically distributed exponential random variableswith mean one Because the tk transformation is one-to-one and tnC1 gt tTif and only if unC1 gt T the joint probability density of the tkrsquos is

f (t1 t2 tn tnC1 gt tT )

D f (t1 tn) Pr (tnC1 gt tT | t1 tn) (29)

We evaluate each of the two terms on the right side of equation 29 Thefollowing two events are equivalent

ftnC1 gt tT | t1 tng D funC1 gt T | u1 u2 ung (210)

Hence

Pr(tnC1 gt tT | t1 t2 tn ) D Pr(unC1 gt T | u1 u2 un )

D exp

(iexcl

Z T

un

l (u | Hun) du

)

D expfiexcltTg (211)

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 327

theorem is less familiar to neuroscience researchers The technical nature ofthe proof which relies on the martingale representation of a point processmay have prevented its signicance from being more broadly appreciated

The time-rescaling theorem has important theoretical and practical im-plications for application of point process models in neural spike train dataanalysis To help make this result more accessible to researchers in neu-roscience we present a proof that uses only elementary probability theoryarguments We describe how the theorem may be used to develop goodness-of-t tests for both parametric and histogram-based point process modelsof neural spike trains We apply these tests in two examples a compar-ison of PSTH inhomogeneous Poisson and inhomogeneous Markov in-terval models of neural spike trains from the supplementary eye eld ofa macque monkey and a comparison of temporal and spatial smoothersinhomogeneous Poisson inhomogeneous gamma and inhomogeneous in-verse gaussian models of rat hippocampal place cell spiking activity Wealso demonstrate how the time-rescaling theorem may be used to simulatea general point process

2 Theory

21 The Conditional Intensity Function and the Joint Probability Den-sity of the Spike Train Dene an interval (0 T] and let 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T be a set of event (spike) times from a point processFor t 2 (0 T] let N (t) be the sample path of the associated counting processThe sample path is a right continuous function that jumps 1 at the eventtimes and is constant otherwise (Snyder amp Miller 1991) In this way N (t)counts the number and location of spikes in the interval (0 t] Therefore itcontains all the information in the sequence of events or spike times Fort 2 (0 T] we dene the conditional or stochastic intensity function as

l(t | Ht ) D limD t0

Pr(N (t C D t) iexcl N (t) D 1 | Ht)D t

(21)

where Ht D f0 lt u1 lt u2 uN (t) lt tg is the history of the process up totime t and uN (t) is the time of the last spike prior to t If the point process is aninhomogeneous Poisson process then l(t | Ht ) D l (t) is simply the Poissonrate function Otherwise it depends on the history of the process Hencethe conditional intensity generalizes the denition of the Poisson rate It iswell known that l (t | Ht ) can be dened in terms of the event (spike) timeprobability density f (t | Ht ) as

l(t | Ht ) Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht) du

(22)

for t gt uN (t) (Daley amp Vere-Jones 1988 Barbieri Quirk Frank Wilson ampBrown 2001) We may gain insight into equation 22 and the meaning of the

328 E N Brown R Barbieri V Ventura R E Kass and L M Frank

conditional intensity function by computing explicitly the probability thatgiven Ht a spike uk occurs in [t t C D t) where k D N (t) C 1 To do this wenote that the events fN (t C Dt) iexcl N (t) D 1 | Htg and fuk 2 [t t C D t) | Htgare equivalent and that therefore

Pr (N (t C D t) iexcl N (t) D 1 | Ht ) D Pr(uk 2 [t t C Dt) | uk gt t Ht )

DPr(uk 2 [t t C Dt) | Ht )

Pr (uk gt t | Ht)

D

R tCDtt f (u | Ht ) du

1 iexclR t

uN (t)f (u | Ht ) du

frac14 f (t | Ht)D t

1 iexclR t

uN (t)f (u | Ht ) du

D l (t | Ht )D t (23)

Dividing by D t and taking the limit gives

limD t0

Pr(uk 2 [t t C D t) | Ht )D t

Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht ) du

D l(t | Ht ) (24)

which is equation 21 Therefore l (t | Ht)D t is the probability of a spike in[t t C Dt) when there is history dependence in the spike train In survivalanalysis the conditional intensity is termed the hazard function because inthis case l (t | Ht )D t measures the probability of a failure ordeath in [t tCD t)given that the process has survived up to time t (Kalbeisch amp Prentice1980)

Because we would like to apply the time-rescaling theorem to spike traindata series we require the joint probability density of exactly n event timesin (0 T] This joint probability density is (Daley amp Vere-Jones 1988 BarbieriQuirk et al 2001)

f (u1 u2 un N (T) D n)

D f (u1 u2 un unC1 gt T)

D f (u1 u2 un N (un) D n) Pr (unC1 gt T | u1 u2 un)

DnY

kD1

l(uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l(u | Hu) dufrac14

cent exp

(iexcl

Z T

un

l (u | Hu) du

) (25)

The Time-Rescaling Theorem 329

where

f (u1 u2 un N (un) D n)

DnY

kD1

l (uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l (u | Hu) dufrac14

(26)

Pr(unC1 gt T | u1 u2 un) D exp

(iexcl

Z T

un

l(u | Hu) du

) (27)

and u0 D 0 Equation 26 is the joint probability density of exactly n eventsin (0 un] whereas equation 27 is the probability that the n C 1st eventoccurs after T The conditional intensity function provides a succinct wayto represent the joint probability density of the spike times We can nowstate and prove the time-rescaling theorem

Time-Rescaling Theorem Let 0 lt u1 lt u2 lt lt un lt T be a realizationfrom a point process with a conditional intensity function l(t | Ht ) satisfying0 lt l (t | Ht ) for all t 2 (0 T] Dene the transformation

L (uk) DZ uk

0l (u | Hu ) du (28)

for k D 1 n and assume L (t) lt 1 with probability one for all t 2 (0 T]Then the L (uk) rsquos are a Poisson process with unit rate

Proof Let tk D L (uk) iexcl L (ukiexcl1 ) for k D 1 n and set tT DR Tun

l(u | Hu) du To establish the result it sufces to show that the tksare independent and identically distributed exponential random variableswith mean one Because the tk transformation is one-to-one and tnC1 gt tTif and only if unC1 gt T the joint probability density of the tkrsquos is

f (t1 t2 tn tnC1 gt tT )

D f (t1 tn) Pr (tnC1 gt tT | t1 tn) (29)

We evaluate each of the two terms on the right side of equation 29 Thefollowing two events are equivalent

ftnC1 gt tT | t1 tng D funC1 gt T | u1 u2 ung (210)

Hence

Pr(tnC1 gt tT | t1 t2 tn ) D Pr(unC1 gt T | u1 u2 un )

D exp

(iexcl

Z T

un

l (u | Hun) du

)

D expfiexcltTg (211)

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

328 E N Brown R Barbieri V Ventura R E Kass and L M Frank

conditional intensity function by computing explicitly the probability thatgiven Ht a spike uk occurs in [t t C D t) where k D N (t) C 1 To do this wenote that the events fN (t C Dt) iexcl N (t) D 1 | Htg and fuk 2 [t t C D t) | Htgare equivalent and that therefore

Pr (N (t C D t) iexcl N (t) D 1 | Ht ) D Pr(uk 2 [t t C Dt) | uk gt t Ht )

DPr(uk 2 [t t C Dt) | Ht )

Pr (uk gt t | Ht)

D

R tCDtt f (u | Ht ) du

1 iexclR t

uN (t)f (u | Ht ) du

frac14 f (t | Ht)D t

1 iexclR t

uN (t)f (u | Ht ) du

D l (t | Ht )D t (23)

Dividing by D t and taking the limit gives

limD t0

Pr(uk 2 [t t C D t) | Ht )D t

Df (t | Ht )

1 iexclR t

uN (t)f (u | Ht ) du

D l(t | Ht ) (24)

which is equation 21 Therefore l (t | Ht)D t is the probability of a spike in[t t C Dt) when there is history dependence in the spike train In survivalanalysis the conditional intensity is termed the hazard function because inthis case l (t | Ht )D t measures the probability of a failure ordeath in [t tCD t)given that the process has survived up to time t (Kalbeisch amp Prentice1980)

Because we would like to apply the time-rescaling theorem to spike traindata series we require the joint probability density of exactly n event timesin (0 T] This joint probability density is (Daley amp Vere-Jones 1988 BarbieriQuirk et al 2001)

f (u1 u2 un N (T) D n)

D f (u1 u2 un unC1 gt T)

D f (u1 u2 un N (un) D n) Pr (unC1 gt T | u1 u2 un)

DnY

kD1

l(uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l(u | Hu) dufrac14

cent exp

(iexcl

Z T

un

l (u | Hu) du

) (25)

The Time-Rescaling Theorem 329

where

f (u1 u2 un N (un) D n)

DnY

kD1

l (uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l (u | Hu) dufrac14

(26)

Pr(unC1 gt T | u1 u2 un) D exp

(iexcl

Z T

un

l(u | Hu) du

) (27)

and u0 D 0 Equation 26 is the joint probability density of exactly n eventsin (0 un] whereas equation 27 is the probability that the n C 1st eventoccurs after T The conditional intensity function provides a succinct wayto represent the joint probability density of the spike times We can nowstate and prove the time-rescaling theorem

Time-Rescaling Theorem Let 0 lt u1 lt u2 lt lt un lt T be a realizationfrom a point process with a conditional intensity function l(t | Ht ) satisfying0 lt l (t | Ht ) for all t 2 (0 T] Dene the transformation

L (uk) DZ uk

0l (u | Hu ) du (28)

for k D 1 n and assume L (t) lt 1 with probability one for all t 2 (0 T]Then the L (uk) rsquos are a Poisson process with unit rate

Proof Let tk D L (uk) iexcl L (ukiexcl1 ) for k D 1 n and set tT DR Tun

l(u | Hu) du To establish the result it sufces to show that the tksare independent and identically distributed exponential random variableswith mean one Because the tk transformation is one-to-one and tnC1 gt tTif and only if unC1 gt T the joint probability density of the tkrsquos is

f (t1 t2 tn tnC1 gt tT )

D f (t1 tn) Pr (tnC1 gt tT | t1 tn) (29)

We evaluate each of the two terms on the right side of equation 29 Thefollowing two events are equivalent

ftnC1 gt tT | t1 tng D funC1 gt T | u1 u2 ung (210)

Hence

Pr(tnC1 gt tT | t1 t2 tn ) D Pr(unC1 gt T | u1 u2 un )

D exp

(iexcl

Z T

un

l (u | Hun) du

)

D expfiexcltTg (211)

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 329

where

f (u1 u2 un N (un) D n)

DnY

kD1

l (uk | Huk) exp

raquoiexcl

Z uk

ukiexcl1

l (u | Hu) dufrac14

(26)

Pr(unC1 gt T | u1 u2 un) D exp

(iexcl

Z T

un

l(u | Hu) du

) (27)

and u0 D 0 Equation 26 is the joint probability density of exactly n eventsin (0 un] whereas equation 27 is the probability that the n C 1st eventoccurs after T The conditional intensity function provides a succinct wayto represent the joint probability density of the spike times We can nowstate and prove the time-rescaling theorem

Time-Rescaling Theorem Let 0 lt u1 lt u2 lt lt un lt T be a realizationfrom a point process with a conditional intensity function l(t | Ht ) satisfying0 lt l (t | Ht ) for all t 2 (0 T] Dene the transformation

L (uk) DZ uk

0l (u | Hu ) du (28)

for k D 1 n and assume L (t) lt 1 with probability one for all t 2 (0 T]Then the L (uk) rsquos are a Poisson process with unit rate

Proof Let tk D L (uk) iexcl L (ukiexcl1 ) for k D 1 n and set tT DR Tun

l(u | Hu) du To establish the result it sufces to show that the tksare independent and identically distributed exponential random variableswith mean one Because the tk transformation is one-to-one and tnC1 gt tTif and only if unC1 gt T the joint probability density of the tkrsquos is

f (t1 t2 tn tnC1 gt tT )

D f (t1 tn) Pr (tnC1 gt tT | t1 tn) (29)

We evaluate each of the two terms on the right side of equation 29 Thefollowing two events are equivalent

ftnC1 gt tT | t1 tng D funC1 gt T | u1 u2 ung (210)

Hence

Pr(tnC1 gt tT | t1 t2 tn ) D Pr(unC1 gt T | u1 u2 un )

D exp

(iexcl

Z T

un

l (u | Hun) du

)

D expfiexcltTg (211)

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

330 E N Brown R Barbieri V Ventura R E Kass and L M Frank

where the last equality follows from the denition of tT By the multivariatechange-of-variable formula (Port 1994)

f (t1 t2 tn ) D | J| f (u1 u2 un N (un) D n) (212)

where J is the Jacobian of the transformation between uj j D 1 n andtk k D 1 n Because tk is a function of u1 uk J is a lower triangularmatrix and its determinant is the product of its diagonal elements denedas | J| D |

QnkD1 Jkk| By assumption 0 lt l (t | Ht) and by equation 28 and

the denition of tk the mapping of u into t is one-to-one Therefore bythe inverse differentiation theorem (Protter amp Morrey 1991) the diagonalelements of J are

Jkk Duk

tkD l(uk | Huk

) iexcl1 (213)

Substituting |J | and equation 26 into equation 212 yields

f (t1 t2 tn ) DnY

kD1

l(uk | Huk) iexcl1

nY

kD1

l (uk | Huk)

cent expraquo

iexclZ uk

ukiexcl1

l (u | Hu ) dufrac14

DnY

kD1

expfiexcl[L (uk) iexcl L (ukiexcl1 )]g

DnY

kD1

expfiexcltkg (214)

Substituting equations 211 and 214 into 29 yields

f (t1 t2 tn tnC1 gt tT ) D f (t1 tn) Pr(tnC1 gt tT | t1 tn)

D

sup3nY

kD1

expfiexcltkgacute

expfiexcltTg (215)

which establishes the result

The time-rescaling theorem generates a history-dependent rescaling ofthe time axis that converts a point process into a Poisson process with a unitrate

22 Assessing Model Goodness of Fit We may use the time-rescalingtheorem to construct goodness-of-t tests for a spike data model Once a

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 331

model has been t to a spike train data series we can compute from itsestimated conditional intensity the rescaled times

tk D L (uk) iexcl L (ukiexcl1 ) (216)

If the model is correct then according to the theorem the tks are indepen-dent exponential random variables with mean 1 If we make the furthertransformation

zk D 1 iexcl exp(iexcltk) (217)

then zks are independent uniform random variables on the interval (0 1) Be-cause the transformations inequations 216 and 217 are both one-to-one anystatistical assessment that measures agreement between the zks and a uni-form distribution directly evaluates how well the original model agrees withthe spike train data Here we present two methods Kolmogorov-Smirnovtests and quantile-quantile plots

To construct the Kolmogorov-Smirnov test we rst order the zks fromsmallest to largest denoting the ordered values as z (k) s We then plot the val-ues of the cumulative distribution function of the uniform density dened

as bk D kiexcl 12

n for k D 1 n against the z (k) s If the model is correct then thepoints should lie on a 45-degree line (Johnson amp Kotz 1970) Condencebounds for the degree of agreement between the models and the data maybe constructed using the distribution of the Kolmogorov-Smirnov statisticFor moderate to large sample sizes the 95 (99) condence bounds arewell approximated as bk sect 136n12 (bk sect 163n12) (Johnson amp Kotz 1970)We term such a plot a Kolmogorov-Smirnov (KS) plot

Another approach to measuring agreement between the uniform prob-ability density and the zks is to construct a quantile-quantile (Q-Q) plot(Ventura Carta Kass Gettner amp Olson 2001 Barbieri Quirk et al 2001Hogg amp Tanis 2001) In this display we plot the quantiles of the uniformdistribution denoted here also as the bks against the z(k) s As in the case ofthe KS plots exact agreement occurs between the point process model andthe experimental data if the points lie on a 45-degree line Pointwise con-dence bands can be constructed to measure the magnitude of the departureof the plot from the 45-degree line relative to chance To construct point-wise bands we note that if the tks are independent exponential randomvariables with mean 1 and the zks are thus uniform on the interval (0 1)then each z (k) has a beta probability density with parameters k and n iexcl k C 1dened as

f (z | k n iexcl k C 1) Dn

(n iexcl k) (k iexcl 1)zkiexcl1 (1 iexcl z)niexclk (218)

for 0 lt z lt 1 (Johnson amp Kotz 1970) We set the 95 condence boundsby nding the 25th and 975th quantiles of the cumulative distribution

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

332 E N Brown R Barbieri V Ventura R E Kass and L M Frank

associated with equation 218 for k D 1 n These exact quantiles arereadily available in many statistical software packages For moderate tolarge spike train data series a reasonable approximation to the 95 (99)condence bounds is given by the gaussian approximation to the bino-mial probability distribution as z(k) sect 196[z(k) (1 iexcl z(k) ) n]12 (z(k) sect2575[z(k) (1 iexcl z(k) ) n]12) To our knowledge these local condence boundsfor the Q-Q plots based on the beta distribution and the gaussian approxi-mation are new

In general the KS condence intervals will be wider than the corre-sponding Q-Q plot intervals To see this it sufces to compare the widthsof the two intervals using their approximate formulas for large n From thegaussian approximation to the binomial the maximum width of the 95condence interval for the Q-Q plots occurs at the median z(k) D 050 andis 2[196 (4n) 12] D 196niexcl12 For n large the width of the 95 condenceintervals for the KS plots is 272niexcl12 at all quantiles The KS condencebounds consider the maximum discrepancy from the 45-degree line alongall quantiles the 95 bands show the discrepancy that would be exceeded5 of the time by chance if the plotted data were truly uniformlydistributedThe Q-Q plot condence bounds consider the maximum discrepancy fromthe 45-degree line for each quantile separately These pointwise 95 con-dence bounds mark the amount by which each value z(k) would deviatefrom the true quantile 5 of the time purely by chance The KS bounds arebroad because they are based on the joint distribution of all n deviationsand they consider the distribution of the largest of these deviations TheQ-Q plot bounds are narrower because they measure the deviation at eachquantile separately Used together the two plots help approximate upperand lower limits on the discrepancy between a proposed model and a spiketrain data series

3 Applications

31 An Analysis of Supplementary Eye Field Recordings For the rstapplication of the time-rescaling theorem to a goodness-of-t analysis weanalyze a spike train recorded from the supplementary eye eld (SEF) of amacaque monkey Neurons in the SEF play a role in oculomotor processes(Olson Gettner Ventura Carta amp Kass 2000) A standard paradigm forstudying the spiking propertiesof these neurons is a delayed eye movementtask In this task the monkey xates is shown locations of potential targetsites and is then cued to the specic target to which it must saccade Nexta preparatory cue is given followed a random time later by a go signalUpon receiving the go signal the animal must saccade to the specic targetand hold xation for a dened amount of time in order to receive a rewardBeginning fromthe point of the specic target cue neural activity is recordedfor a xed interval of time beyond the presentation of the go signal Aftera brief rest period the trial is repeated Multiple trials from an experiment

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 333

such as this are jointly analyzed using a PSTH to estimate ring rate for anite interval following a xed initiation point That is the trials are timealigned with respect to a xed initial point such as the target cue The dataacross trials are binned in time intervals of a xed length and the rate in eachbin is estimated as the average number of spikes in the xed time interval

Kass and Ventura (2001) recently presented inhomogeneous Markov in-terval (IMI) modelsas an alternative to the PSTH for analyzing multiple-trialneural spike train data These models use a Markov representation for theconditional intensity function One form of the IMI conditional intensityfunction they considered is

l(t | Ht ) D l (t | uN (t) h ) D l1 (t | h )l2 (t iexcl uN (t) | h ) (31)

where uN (t) is the time of the last spike prior to t l1 (t | h ) modulates r-ing as a function of the experimental clock time l2 (t iexcl uN (t) | h ) repre-sents Markov dependence in the spike train and h is a vector of modelparameters to be estimated Kass and Ventura modeled log l1 (t | h ) andlog l2 (t iexcl uN (t) | h ) as separate piecewise cubic splines in their respectivearguments t and t iexcl uN (t) The cubic pieces were joined at knots so thatthe resulting functions were twice continuously differentiable The numberand positions of the knots were chosen in a preliminary data analysis In thespecial case l (t | uN (t) h ) D l1 (t | h ) the conditional intensity function inequation 31 corresponds to an inhomogeneous Poisson (IP) model becausethis assumes no temporal dependence among the spike times

Kass and Ventura used their models to analyze SEF data that consistedof 486 spikes recorded during the 400 msec following the target cue signalin 15 trials of a delayed eye movement task (neuron PK166a from Olsonet al 2000 using the pattern condition) The IP and IMI models were tby maximum likelihood and statistical signicance tests on the spline co-efcients were used to compare goodness of t Included in the analysiswere spline models with higher-order dependence among the spike timesthan the rst-order Markov dependence in equation 31 They found thatthe IMI model gave a statistically signicant improvement in the t relativeto the IP and that adding higher-order dependence to the model gave nofurther improvements The t of the IMI model was not improved by in-cluding terms to model between-trial differences in spike rate The authorsconcluded that there was strong rst-order Markov dependence in the r-ing pattern of this SEF neuron Kass and Ventura did not provide an overallassessment of model goodness of t or evaluate how much the IMI and theIP models improved over the histogram-based rate model estimated by thePSTH

Using the KS and Q-Q plots derived from the time-rescaling theorem it ispossible to compare directly the ts of the IMI IP and PSTH models and todetermine which gives the most accurate description of the SEF spike trainstructure The equations for the IP rate function lIP (t | hIP ) D l1 (t | hIP )

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

334 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and for the IMI conditional intensity function lIMI (t | uN (t) hIMI ) Dl1 (t | hIMI )l2 (t iexcl uN (t) hIMI ) are given in the appendix along with a dis-cussion of the maximum likelihood procedure used to estimate the co-efcients of the spline basis elements The estimated conditional inten-sity functions for the IP and IMI models are respectively lIP (t | OhIP ) andlIMI (t | uN (t) OhIMI ) where Oh is the maximum likelihood estimate of thespecic spline coefcients For the PSTH model the conditional intensityestimate is the PSTH computed by averaging the number of spikes in eachof 40 10 msec bins (the 400 msec following the target cue signal) acrossthe 15 trials The PSTH is the t of another inhomogeneous Poisson modelbecause it assumes no dependence among the spike times

The results of the IMI IP and PSTH model ts compared by KS andQ-Q plots are shown in Figure 1 For the IP model there is lack of t atlower quantiles (below 025) because in that range its KS plot lies just out-side the 95 condence bounds (see Figure 1A) From quantile 025 andbeyond the IP model is within the 95 condence bounds although be-

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 335

yond the quantile 075 it slightly underpredicts the true probability modelof the data The KS plot of the PSTH model is similar to that of the IPexcept that it lies entirely outside the 95 condence bands below quan-tile 050 Beyond this quantile it is within the 95 condence bounds TheKS plot of the PSTH underpredicts the probability model of the data toa greater degree in the upper range than the IP model The IMI model iscompletely within the 95 condence bounds and lies almost exactly onthe 45-degree line of complete agreement between the model and the dataThe Q-Q plot analyses (see Figure 1B) agree with KS plot analyses witha few exceptions The Q-Q plot analyses show that the lack of t of theIP and PSTH models is greater at the lower quantiles (0ndash050) than sug-gested by the KS plots The Q-Q plots for the IP and PSTH models alsoshow that the deviations of these two models near quantiles 080 to 090are statistically signicant With the exception of a small deviation belowquantile 010 the Q-Q plot of the IMI lies almost exactly on the 45-degreeline

Figure 1 Facing page (A) Kolmogorov-Smirnov (K-S) plots of the inhomoge-neous Markov interval (IMI) model inhomogeneous Poisson (IP) and perstim-ulus time histogram (PSTH) model ts to the SEF spike train data The solid45-degree line represents exact agreement between the model and the dataThe dashed 45-degree lines are the 95 condence bounds for exact agree-ment between the model and experimental data based on the distribution of theKolmogorov-Smirnov statistic The 95 condence bounds are bk sect 136niexcl 1

2 where bk D (k iexcl 1

2) n for k D 1 n and n is the total number of spikes The

IMI model (thick solid line) is completely within the 95 condence boundsand lies almost exactly on the 45-degree line The IP model (thin solid line)has lack of t at lower quantiles (lt 025) From the quantile 025 and be-yond the IP model is within the 95 condence bounds The KS plot of thePSTH model (dotted line) is similar to that of the IP model except that it liesoutside the 95 condence bands below quantile 050 Beyond this quantileit is within the 95 condence bounds yet it underpredicts the probabilitymodel of the data to a greater degree in this range than the IP model doesThe IMI model agrees more closely with the spike train data than either the IPor the PSTH models (B) Quantile-quantile (Q-Q) plots of the IMI (thick solidline) IP (thin solid line) and PSTH (dotted line) models The dashed lines arethe local 95 condence bounds for the individual quantiles computed fromthe beta probability density dened in equation 218 The solid 45-degree linerepresents exact agreement between the model and the data The Q-Q plotssuggests that the lack of t of the IP and PSTH models is greater at the lowerquantiles (0ndash050) than suggested by the KS plots The Q-Q plots for the IP andPSTH models also show that the deviations of these two models near quantiles080 to 090 are statistically signicant With the exception of a small devia-tion below quantile 010 the Q-Q plot of the IMI lies almost exactly on the45-degree line

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

336 E N Brown R Barbieri V Ventura R E Kass and L M Frank

We conclude that the IMI model gives the best description of these SEFspike trains In agreement with the report of Kass and Ventura (2001) thisanalysis supports a rst-order Markov dependence among the spike timesand not a Poisson structure as would be suggested by either the IP or thePSTH models This analysis extends the ndings of Kass and Ventura byshowing that of the three models the PSTH gives the poorest description ofthe SEF spike train The IP model gives a better t to the SEF data than thePSTH model because the maximum likelihood analysis of the parametricIP model is more statistically efcient than the histogram (method of mo-ments) estimate obtained from the PSTH (Casella amp Berger 1990) That isthe IP model t by maximum likelihood uses all the data to estimate theconditional intensity function at all time points whereas the PSTH anal-ysis uses only spikes in a specied time bin to estimate the ring rate inthat bin The additional improvement of the IMI model over the IP is dueto the fact that the former represents temporal dependence in the spiketrain

32 An Analysis of Hippocampal Place Cell Recordings As a secondexample of using the time-rescaling theorem to develop goodness-of-ttests we analyze the spiking activity of a pyramidal cell in the CA1 regionof the rat hippocampus recorded from an animal running back and forth ona linear track Hippocampal pyramidal neurons have place-specic ring(OrsquoKeefe amp Dostrovsky 1971)a given neuron res onlywhen the animal is ina certain subregion of the environment termed the neuronrsquos place eld On alinear track these elds approximately resemble one-dimensional gaussiansurfaces The neuronrsquos spiking activity correlates most closely with the an-imalrsquos position on the track (Wilson amp McNaughton 1993) The data serieswe analyze consists of 691 spikes from a place cell in the CA1 region of thehippocampus recorded from a rat running back and forth for 20 minutes ona 300 cm U-shaped track The track was linearized for the purposes of thisanalysis (Frank Brown amp Wilson 2000)

There are two approaches to estimating the place-specic ring maps ofa hippocampal neuron One approach is to use maximum likelihood to ta specic parametric model of the spike times to the place cell data as inBrown Frank Tang Quirk and Wilson (1998) and Barbieri Quirk et al(2001) If x(t) is the animalrsquos position at time t we dene the spatial functionfor the one-dimensional place eld model as the gaussian surface

s (t) D expraquo

a iexcl b (x (t) iexcl m )2

2

frac14 (32)

where m is the center of place eld b is a scale factor and expfag is themaximum height of the place eld at its center We represent the spike timeprobability density of the neuron as either an inhomogeneous gamma (IG)

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 337

model dened as

f (uk | ukiexcl1 h )

Dys (uk )C (y )

microZ uk

ukiexcl1

ys (u) duparay iexcl1

expraquo

iexclZ uk

ukiexcl1

ys(u) dufrac14

(33)

or as an inhomogeneous inverse gaussian (IIG) model dened as

f (uk | ukiexcl1 h )

Ds(uk)

micro2p

hR uk

ukiexcl1s(u) du

i3para 1

2

exp

8gtlt

gtiexcl

12

plusmnR uk

ukiexcl1s (u) du iexcl y

sup22

y 2R uk

ukiexcl1s(u) du

9gt=

gt (34)

where y gt 0 is a location parameter for both models and h D (m a b y )is the set of model parameters to be estimated from the spike train If weset y D 1 in equation 33 we obtain the IP model as a special case ofthe IG model The parameters for all three modelsmdashthe IP IG and theIIGmdashcan be estimated from the spike train data by maximum likelihood(Barbieri Quirk et al 2001) The models in equations 33 and 34 are Markovso that the current value of either the spike time probability density orthe conditional intensity (rate) function depends on only the time of thepreviousspike Because of equation 22 specifying the spike time probabilitydensity is equivalent to specifying the conditional intensity function If welet Oh denote the maximum likelihood estimate of h then the maximumlikelihood estimate of the conditional intensity function for each model canbe computed from equation 22 as

l(t | Ht Oh ) Df (t | uN (t) Oh )

1 iexclR t

uN (t)f (u | uN (t) Oh ) du

(35)

for t gt uN (t) The estimated conditional intensity from each model maybe used in the time-rescaling theorem to assess model goodness of t asdescribed in Section 21

The second approach is to compute a histogram-based estimate of theconditional intensity function by using either spatial smoothing (Muller ampKubie 1987 Frank et al 2000) or temporal smoothing (Wood Dudchenkoamp Eichenbaum 1999) of the spike train To compute the spatial smooth-ing estimate of the conditional intensity function we followed Frank et al(2000) and divided the 300 cm track into 42 cm bins counted the numberof spikes per bin and divided the count by the amount of time the animalspends in the bin We smooth the binned ring rate with a six-point gaus-sian window with a standard deviation of one bin to reduce the effect of

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

338 E N Brown R Barbieri V Ventura R E Kass and L M Frank

running velocity The spatial conditional intensity estimate is the smoothedspatial rate function To compute the temporal rate function we followedWood et al (1999) and divided the experiment into time bins of 200 msecand computed the rate as the number of spikes per 200 msec

These two smoothing procedures produce histogram-based estimatesof l(t) Both are histogram-based estimates of Poisson rate functions be-cause neither the estimated spatial nor the temporal rate functions makeany history-dependence assumption about the spike train As in the analy-sis of the SEF spike trains we again use the KS and Q-Q plots to comparedirectly goodness of t of the ve spike train models for the hippocampalplace cells The IP IG and IIG models were t to the spike train data by max-imum likelihood The spatial and temporal rate models were computed asdescribed The KS and Q-Q plotgoodness-of-t comparisonsare in Figure 2

The IG model overestimates at lower quantiles underestimates at inter-mediate quantiles and overestimates at the upper quantiles (see Figure 2A)

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 339

The IP model underestimates the lower and intermediate quantiles andoverestimates the upper quantiles The KS plot of the spatial rate model issimilar to that of the IP model yet closer to the 45-degree line The temporalrate model overestimates the quantiles of the true probability model of thedata This analysis suggests that the IG IP and spatial rate models are mostlikely oversmoothing this spike train whereas the temporal rate model un-dersmooths it Of the ve models the one that is closest to the 45-degreeline and lies almost entirely within the condence bounds is the IIG Thismodel disagrees only with the experimental data near quantile 080 Becauseall of the models with the exception of the IIG have appreciable lack of tin terms of the KS plots the ndings in the Q-Q plot analyses are almostidentical (see Figure 2B) As in the KS plot the Q-Q plot for the IIG model isclose to the 45-degree line and within the 95 condence bounds with theexception of quantiles near 080 These analyses suggest that IIG rate modelgives the best agreement with the spiking activity of this pyramidal neuronThe spatial and temporal rate function models and the IP model have thegreatest lack of t

33 Simulating a General Point Process by Time Rescaling As a sec-ond application of the time-rescaling theorem we describe how the the-orem may be used to simulate a general point process We stated in theIntroduction that the time-rescaling theorem provides a standard approachfor simulating an inhomogeneous Poisson process from a simple Poisson

Figure 2 Facing page (A) KS plots of the parametric and histogram-based modelts to the hippocampal place cell spike train activity The parametric modelsare the inhomogeneous Poisson (IP) (dotted line) the inhomogeneous gamma(IG) (thin solid line) and the inhomogeneous inverse gaussian (IIG)(thick solidline) The histogram-based models are the spatial rate model (dashed line) basedon 42 cm spatial bin width and the temporal rate model (dashed and dotted line)based a 200msec time bin width The 45-degree solid line represents exact agree-ment between the model and the experimental data and the 45-degree thin solidlines are the 95 condence bounds based on the KS statistic as in Figure 1 TheIIG model KS plot lies almost entirely within the condence bounds whereasall the other models show signicant lack of t This suggests that the IIG modelgives the best description of this place cell data series and the histogram-basedmodels agree least with the data (B) Q-Q plots of the IP (dotted line) IG (thinsolid line) and IIG (thick solid line) spatial (dashed line) temporal (dotted anddashed line) models The 95 local condence bounds (thin dashed lines) arecomputed as described in Figure 1 The ndings in the Q-Q plot analyses arealmost identical to those in the KS plots The Q-Q plot for the IIG model is closeto the 45-degree line and within the 95 condence bounds with the exceptionof quantiles near 080 This analysis also suggests that IIG rate model gives thebest agreement with the spiking activity of this pyramidal neuron The spatialand temporal rate function models and the IP model have the greatest lack of t

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

340 E N Brown R Barbieri V Ventura R E Kass and L M Frank

process The general form of the time-rescaling theorem suggests that anypoint process with an integrable conditional intensity function may be sim-ulated from a Poisson process with unit rate by rescaling time with respectto the conditional intensity (rate) function Given an interval (0 T] the sim-ulation algorithm proceeds as follows

1 Set u0 D 0 Set k D 1

2 Draw tk an exponential random variable with mean 1

3 Find uk as the solution to tk DR uk

ukiexcl1l (u | u0 u1 ukiexcl1 ) du

4 If uk gt T then stop

5 k D k C 1

6 Go to 2

By using equation 23 a discrete version of the algorithm can be constructedas follows Choose J large and divide the interval (0 T] into J bins each ofwidth D D T J For k D 1 J draw a Bernoulli random variable ucurren

k withprobability l(kD | ucurren

1 ucurrenkiexcl1 )D and assign a spike to bin k if ucurren

k D 1 andno spike if ucurren

k D 0While in many instances there will be faster more computationally ef-

cient algorithms for simulating a point process such as model-based meth-ods for specic renewal processes (Ripley 1987) and thinning algorithms(Lewis amp Shedler 1978 Ogata 1981 Ross 1993) the algorithm above is sim-ple to implement given a specication of the conditional intensity functionFor an example of where this algorithm is crucial for point process simu-lation we consider the IG model in equation 32 Its conditional intensityfunction is innite immediately following a spike if y lt 1 If in addition yis time varying (y D y (t) lt 1 for all t) then neither thinning nor standardalgorithms for making draws from a gamma probability distribution maybe used to simulate data from this model The thinning algorithm fails be-cause the conditional intensity function is not bounded and the standardalgorithms for simulating a gamma model cannot be applied because y istime varying In this case the time-rescaling simulation algorithm may beapplied as long as the conditional intensity function remains integrable asy varies temporally

4 Discussion

Measuring how well a point process model describes a neural spike traindata series is imperative prior to using the model for making inferences Thetime-rescaling theorem states that any point process with an integrable con-ditional intensity function may be transformed into a Poisson process witha unit rate Berman (1983) and Ogata (1988) showed that this theorem maybe used to develop goodness-of-t tests for point process models of seismo-logic data Goodness-of-t methods for neural spike train models based on

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 341

time-rescaling transformations but not the time-rescaling theorem have alsobeen reported (Reich Victor amp Knight 1998 Barbieri Frank Quirk Wilsonamp Brown 2001) Here we have described how the time-rescaling theoremmay be used to develop goodness-of-t tests for point process models ofneural spike trains

To illustrate our approach we analyzed two types of commonlyrecordedspike train data The SEF data are a set of multiple short (400 msec) series ofspike times each measured under identical experimental conditions Thesedata are typically analyzed by a PSTH The hippocampus data are a long(20 minutes) series of spike time recordings that are typically analyzed witheither spatial or temporal histogram models To each type of data we tboth parametric and histogram-based models Histogram-based models arepopularneural data analysis toolsbecause of the ease with which they can becomputed and interpreted These apparent advantages do not override theneed to evaluate the goodness of t of these models We previously usedthe time-rescaling theorem to assess goodness of t for parametric spiketrain models (Olson et al 2000 Barbieri Quirk et al 2001 Ventura et al2001) Our main result in this article is that the time-rescaling theorem can beused to evaluate goodness of t of parametric and histogram-based modelsand to compare directly the accuracy of models from the two classes Werecommend that before making an inference based on either type of modela goodness-of-t analysis should be performed to establish how well themodel describes the spike train data If the model and data agree closelythen the inference is more credible than when there is signicant lack of tThe KS and Q-Q plots provide assessments of overall goodness of t For themodels t by maximum likelihood these assessments can be applied alongwith methods that measure the marginal value and marginal costs of usingmore complex models such as Akaikiersquos information criterion (AIC) andthe Bayesian information criterion (BIC) in order to gain a more completeevaluation of model agreement with experimental data (Barbieri Quirk etal 2001)

We assessed goodness of t by using the time-rescaling theorem to con-struct KS and Q-Q plots having respectively liberal and conservative con-dence bounds Together the two sets of condence bounds help character-ize the range of agreement between the model and the data For example amodelwhose KSplot lies consistently outside the 95 KS condence bounds(the IP model for the hippocampal data) agrees poorly with the data Onthe other hand a model that is within all the 95 condence bounds of theQ-Q plots (the IMI model for the SEF data) agrees closely with the data Amodel such as the IP model for the SEF data that is within nearly all theKS bounds may lie outside the Q-Q plot intervals In this case if the lackof t with respect to the Q-Q plot intervals is systematic (ie is over a set ofcontiguous quantiles) this suggests that the model does not t the data well

As a second application of the time-rescaling theorem we presented analgorithm for simulating spike trains from a point process given its condi-

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

342 E N Brown R Barbieri V Ventura R E Kass and L M Frank

tional intensity (rate) function This algorithm generalizes the well-knowntechnique of simulating an inhomogeneous Poisson process by rescaling aPoisson process with a constant rate

Finally to make the reasoning behind the time-rescaling theorem moreaccessible to the neuroscience researchers we proved its general form usingelementary probability arguments While this elementary proof is most cer-tainly apparent to experts in probability and point process theory (D Brill-inger personal communication Guttorp 1995) itsdetails to our knowledgehave not been previously presented The original proofs of this theorem usemeasure theory and are based on the martingale representation of point pro-cesses (Meyer 1969 Papangelou 1972 Br Acircemaud 1981 Jacobsen 1982) Theconditional intensity function (see equation 21) is dened in terms of themartingale representation Our proof uses elementary arguments because itis based on the fact that the joint probability density of a set of point processobservations (spike train) has a canonical representation in terms of the con-ditional intensity function When the joint probability density is representedin this way the Jacobian in the change of variables between the original spiketimes and the rescaled interspike intervals simplies to a product of the re-ciprocals of the conditional intensity functions evaluated at the spike times

The proof also highlights the signicance of the conditional intensityfunction in spike train modeling its specication completely denes thestochastic structure of the point process This is because in a small time in-terval the product of the conditional intensity function and the time intervaldenes the probability of a spike in that interval given the historyof the spiketrain up to that time (see equation 23) When there is no history dependencethe conditional intensity function is simply the Poisson rate function Animportant consequence of this simplication is that unless history depen-dence is specically included then histogram-based models such as thePSTH and the spatial and temporal smoothers are implicit Poisson models

In both the SEF and hippocampusexamples the histogram-based modelsgave poor ts to the spike train These poor ts arose because these mod-els used few data points to estimate many parameters and because theydo not model history dependence in the spike train Our parametric mod-els used fewer parameters and represented temporal dependence explicitlyas Markov Point process models with higher-order temporal dependencehave been studied by Ogata (1981 1988) Brillinger (1988) and Kass andVentura (2001) and will be considered further in our future work Paramet-ric conditional intensity functions may be estimated from neural spike traindata in any experiments where there are enough data to estimate reliably ahistogram-based model This is because if there are enough data to estimatemany parameters using an inefcient procedure (histogrammethod of mo-ments) then there should be enough data to estimate a smaller number ofparameters using an efcient one (maximum likelihood)

Using the K-S and Q-Q plots derived from the time-rescaling theoremit is possible to devise a systematic approach to the use of histogram-based

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 343

models That is it is possible to determine when a histogram-based modelaccurately describes a spike train and when a different model class temporaldependence or the effects of covariates (eg the theta rhythm and the ratrunning velocity in the case of the place cells) should be considered and thedegree of improvement the alternative models provide

In summary we have illustrated how the time-rescaling theorem may beused to compare directly goodness of t of parametric and histogram-basedpoint process models of neural spiking activity and to simulate spike traindata These results suggest that the time-rescaling theorem can be a valuabletool for neural spike train data analysis

Appendix

A1 Maximum Likelihood Estimation of the IMI Model To t the IMImodel we choose J large and divide the interval (0 T] into J bins of widthD D T J We choose J so that there is at most one spike in any bin In thisway we convert the spike times 0 lt u1 lt u2 lt lt uniexcl1 lt un middot T into abinary sequence ucurren

j where ucurrenj D 1 if there is a spike in bin j and 0 otherwise

for j D 1 J By denition of the conditional intensity function for the IMImodel in equation 31 it follows that each ucurren

j is a Bernoulli random variablewith the probability of a spike at jD dened as l1 ( jD | h )l2 ( jD iexclucurren

N ( jD ) | h )D We note that this discretization is identical to the one used to constructthe discretized version of the simulation algorithm in Section 33 The logprobability of a spike is thus

log(l1 ( jD | h ) ) C log(l2 ( jD iexcl ucurrenN ( jD ) | h )D ) (A1)

and the cubic spline models for log(l1 ( jD | h ) ) and log(l2 ( jD iexcl ucurrenN ( jD ) | h ) )

are respectively

log(l1 ( jD | h ) ) D3X

lD1

hl ( jD iexcl j1 ) lC

Ch4 ( jD iexclj2 )3C C h5 ( jD iexcl j3) 3

C (A2)

log(l2 ( jD iexcl ucurrenN ( jD ) | h ) ) D

3X

lD1

hlC5 ( jD iexcl ucurrenN ( jD ) iexclc 1 ) l

C

C h9 ( jD iexcl ucurrenN ( jD ) iexclc 2 ) 3

C (A3)

whereh D (h1 h9 ) and the knots are dened asjl D lT4 for l D 1 2 3 andc l is the observed 100l3 quantile of the interspike interval distribution ofthe trial spike trains for lD 1 2 We have for Section 31 that hIP D (h1 h5)

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

344 E N Brown R Barbieri V Ventura R E Kass and L M Frank

and hIMI D (h1 h9) are the coefcients of the spline basis elements for theIP and IMI models respectively

The parameters are estimated by maximum likelihood with D D 1 msecusing the gam function in S-PLUS (MathSoft Seattle) as described in Chap-ter 11 of Venables and Ripley (1999) The inputs to gam are the ucurren

j s the timearguments jD and jD iexclucurren

N ( jD ) the symbolic specication of the spline mod-els and the knots Further discussion on spline-based regression methodsfor analyzing neural data may be found in Olson et al (2000) and Venturaet al (2001)

Acknowledgments

We are grateful to Carl Olson for allowing us to use the supplementary eyeeld data and Matthew Wilson for allowing us to use the hippocampal placecell data We thank David Brillinger and Victor Solo for helpful discussionsand the two anonymous referees whose suggestions helped improve theexposition This research was supported in part by NIH grants CA54652MH59733 and MH61637 and NSF grants DMS 9803433 and IBN 0081548

References

Barbieri R Frank L M Quirk M C Wilson M A amp Brown E N (2001)Diagnostic methods for statistical models of place cell spiking activity Neu-rocomputing 38ndash40 1087ndash1093

Barbieri R Quirk M C Frank L M Wilson M A amp Brown E N (2001)Construction and analysis of non-Poisson stimulus-response models of neu-ral spike train activity J Neurosci Meth 105 25ndash37

Berman M (1983)Comment on ldquoLikelihood analysis of point processes and itsapplications to seismological datardquo by Ogata Bulletin Internatl Stat Instit50 412ndash418

Br Acircemaud P (1981) Point processes and queues Martingale dynamics BerlinSpringer-Verlag

Brillinger D R (1988) Maximum likelihood analysis of spike trains of interact-ing nerve cells Biol Cyber 59 189ndash200

Brown E N Frank L M Tang D Quirk M C amp Wilson M A (1998) Astatistical paradigm for neural spike train decoding applied to position pre-diction from ensemble ring patterns of rat hippocampal place cells Journalof Neuroscience 18 7411ndash7425

Casella G amp Berger R L (1990) Statistical inference Belmont CA DuxburyDaley D amp Vere-Jones D (1988) An introduction to the theory of point process

New York Springer-VerlagFrank L M Brown E N amp Wilson M A (2000) Trajectory encoding in the

hippocampus and entorhinal cortex Neuron 27 169ndash178Guttorp P (1995) Stochastic modeling of scientic data London Chapman amp

Hall

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

The Time-Rescaling Theorem 345

Hogg R V amp Tanis E A (2001) Probability and statistical inference (6th ed)Englewood Cliffs NJ Prentice Hall

Jacobsen M (1982)Statistical analysis of counting processes New York Springer-Verlag

Johnson A amp Kotz S (1970) Distributions in statistics Continuous univariatedistributionsmdash2 New York Wiley

Kalbeisch J amp Prentice R (1980)The statisticalanalysis of failure time data NewYork Wiley

Karr A (1991) Point processes and their statistical inference (2nd ed) New YorkMarcel Dekker

Kass R E amp Ventura V (2001)A spike train probability model Neural Comput13 1713ndash1720

Lewis P A W amp Shedler G S (1978) Simulation of non-homogeneous Poissonprocesses by thinning Naval Res Logistics Quart 26 403ndash413

Meyer P (1969)D Acircemonstration simpli Acircee drsquoun th Acirceoreme de Knight InS AcirceminaireprobabilitAcirce V (pp 191ndash195) New York Springer-Verlag

Muller R U amp Kubie J L (1987)The effects of changes in the environment onthe spatial ring of hippocampal complex-spike cells Journal of Neuroscience7 1951ndash1968

Ogata Y (1981) On Lewisrsquo simulation method for point processes IEEE Trans-actions on Information Theory IT-27 23ndash31

Ogata Y (1988) Statistical models for earthquake occurrences and residualanalysis for point processes Journal of American Statistical Association 83 9ndash27

OrsquoKeefe J amp Dostrovsky J (1971) The hippocampus as a spatial map Pre-liminary evidence from unit activity in the freely-moving rat Brain Res 34171ndash175

Olson C R Gettner S N Ventura V Carta R amp Kass R E (2000) Neuronalactivity in macaque supplementary eye eld during planning of saccades inresponse to pattern and spatial cues Journal of Neurophysiology84 1369ndash1384

Papangelou F (1972) Integrability of expected increments of point processesand a related random change of scale Trans Amer Math Soc 165 483ndash506

Port S (1994) Theoretical probability for applications New York WileyProtter M H amp Morrey C B (1991)A rst course in real analysis (2nd ed) New

York Springer-VerlagReich D Victor J amp Knight B (1998) The power ratio and the interval

map Spiking models and extracellular recordings Journal of Neuroscience18 10090ndash10104

Ripley B (1987) Stochastic simulation New York WileyRoss S (1993) Introduction to probability models (5th ed) San Diego CA Aca-

demic PressSnyder D amp Miller M (1991)Random point processes in time and space (2nd ed)

New York Springer-VerlagTaylor H M amp Karlin S (1994) An introduction to stochastic modeling (rev ed)

San Diego CA Academic PressVenables W amp Ripley B (1999) Modern applied statistics with S-PLUS (3rd ed)

New York Springer-Verlag

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001

346 E N Brown R Barbieri V Ventura R E Kass and L M Frank

Ventura V Carta R Kass R Gettner S amp Olson C (2001) Statistical analysisof temporal evolution in single-neuron ring rate Biostatistics In press

Wilson M A amp McNaughton B L (1993) Dynamics of the hippocampal en-semble code for space Science 261 1055ndash1058

Wood E R Dudchenko P A amp Eichenbaum H (1999) The global record ofmemory in hippocampal neuronal activity Nature 397 613ndash616

Received October 27 2000 accepted May 2 2001