NeuroCog FX® - computerized screening of cognitive functions in epilepsy patients

Christian Hoppe*,1, Klaus Fliessbach1, Uwe Schlegel2, Christian E. Elger1, Christoph

Helmstaedter1

1 Department of Epileptology, University of Bonn Medical Centre, Sigmund-Freud-Str. 25,

53105 Bonn, Germany. 2 Department of Neurology, Knappschaftskrankenhaus, Ruhr-University, Bochum, Germany.

* Corresponding author. Fax: ++49 (0)228 / 287-16294.

E-mail address: [email protected] (C. Hoppe).

Word count: 6,742 words (text body only)

NeuroCog FX® - Hoppe et al.

- 2 -

Abstract (145 words)

NeuroCog FX®, a computerized neuropsychological screening for serial examinations of patients

with epilepsy and other neurological diseases, was developed to fill the gap between unspecific

ratings and comprehensive assessments. Eight subtests address attention, working memory,

verbal and figural memory, and language. The test duration is less than 30 minutes. In research

contexts, the test can be applied at multiple sites by non-academic personnel. Normative data

were recorded from healthy subjects (N=244; age range=16-75 years; retest: N=44; validation:

N=40) and unselected patients from an Epileptology unit (N=212; retest: N=94; validation:

N=126). Psychometric analyses confirmed sufficient reliability and concurrent validity,

particularly in patients. NeuroCog FX® memory and overall performance scores showed “fair” to

“good” diagnostic utility with regard to deficits revealed by established tests. NeuroCog FX®

provides reliable and valid measures of cognitive performance and may be used in clinical and

research contexts as a screening instrument.

Keywords:

Epilepsy, cognition, memory, computer-based testing

1. Introduction

The maintenance or restitution of cognitive functions is a major therapeutic aim in the treatment

of epilepsy and other neurological diseases. In epilepsy, cognitive performance may be impaired

by relatively stable factors such as focal brain lesions (e.g. congenital malformation) but also by

dynamic factors such as underlying progressive diseases (e.g. encephalitis, tumor), seizure

activity, adverse effects of antiepileptic drugs, and effects of epilepsy surgery [1,2,3]. Besides

seizure control, neuropsychological functioning is a key determinant of health-related quality of

life and as such a major factor for treatment success. Furthermore, neuropsychological alterations

may be important indicators of latent disease dynamics. In other neurological conditions such as

brain tumors [4], dementia [5], multiple sclerosis [6], or Parkinson’s disease [7] the role of

cognitive functioning and the need for adequate neuropsychological evaluation have been

recognized as well.

For the valid individual diagnostic evaluation (e.g. presurgical work-up in epilepsy)

comprehensive neuropsychological assessment by experienced neuropsychologists is essential

[8]. However, in other clinical or research contexts this ‘gold standard’ may not be required. For

example, neurologists in private practice may want to select patients for more comprehensive

neuropsychological examination based on an objective, economical measure. Also, for multiple

follow-up examinations of cognitive performance, the administration of a complete testing

battery is inappropriate. Similarly, in neuropharmacological research contexts, serial extensive

individual neuropsychological evaluations may appear inappropriate at an early stage of drug

development.

There is a need for economical but nevertheless objective, reliable, valid, standardized, and

appropriate screening instruments for serial cognitive examinations of patients with epilepsy and

other neurological diseases. Notably, screening instruments are not developed to replace

established neuropsychological tests but to offer alternatives for potentially inadequate

‘measures’ of cognitive performance and change (e.g. global change rating scales). In particular,

the test duration should be short (below 30 minutes) and non-academic personnel (e.g. study

nurses, doctor’s assistants, medical students) should be able to administer, score, and file the test

results. However, the individual diagnostic evaluation generally requires professional

neuropsychological education and experience.

Computerized testing appears promising for the purpose of screening patients and might fill the

gap between unspecific ratings and comprehensive neuropsychological assessments [9,10,11]. The

software defines the test procedure allowing the highly standardized administration by different

testers at different sites. Scoring and filing of the data are automated which increases the

objectivity and facilitates scientific use. Furthermore, random selection of items from a bigger

pool allows multiple serial examinations with short follow-up intervals. In contrast, most of the

established paper-pencil tests have a limited number of validated parallel versions (if at all).

Several computer-based test systems have been published during the last two decades [12].

However, these tools have had little impact on neuropsychological research in epilepsy so far. A

PubMed search identified only eleven studies using computerized cognitive testing in epilepsy

patients during the last five years (as of July 15, 2009; search term: ‘computer* cogniti*

epilep*’; identified batteries: BARS, CalCAP, CDR, CNS Vital Signs, FePsy, RTB [12]), but the

number of scientific publications is not necessarily representative of clinical use.

Here we report on the development and psychometric evaluation of NeuroCog FX®

(‘neurocognitive effects’), a computer-based neuropsychological screening battery for serial

clinical or scientific examinations of individual patients. Based on previous data, the test was

introduced in a German paper in 2006 [13]. The eight subtests refer to well-established

neuropsychological paradigms and address four separate cognitive domains which were selected

for their clinical relevance in patients with epilepsy and brain tumors (primary CNS lymphomas):

attention, working memory, memory (comprising verbal and figural learning and recognition),

and language [14,15]. The two memory subtests are based on a former version developed in our

unit which has been shown to be sensitive to the immediate effects of high-intensity vagus nerve

stimulation and postictal material-specific memory disturbance after lateralized focal seizures

[16,17]. Besides application in epilepsy patients, the test is presently used in studies of the Glioma

Network, a multicentre research consortium of the German Cancer Foundation, and in several

other neurological studies (e.g. myotonic dystrophy).

2. Methods

2.1. NeuroCog FX®

Table 1 shows the functional domains being addressed, descriptions of the computerized tasks,

and the measures being recorded and standardized.

> Table 1

The test system was programmed in Borland Delphi 6 by one of the authors (C. Hoppe) and runs

on current PCs (or laptops) operating under MS Windows 95 or higher. Hardware requirements

(e.g. graphic resolution) are specified in the technical manual. At the beginning of each subtest a

short instruction is shown which may be complemented by the examiner according to the test

manual. The Two-Back test instruction includes a short demonstration. All other tests are

administered right after instruction without further familiarization or practice. In the most recent

version, all reactions from the subject are recorded via the keyboard, i.e. no paper or additional

hardware is needed (e.g. mouse, touchpad). With regard to subjects with motor impairment, the

spacebar (as biggest key) was defined as the standard reaction key. However, the Digit Span test

requires data entry via the number keys. The examiner is permitted to assist the subject if

necessary. The language of test administration is German; an English version is under

construction.

The test was administered in patients and healthy subjects by non-academic personnel (doctoral

medical students) under the supervision of experienced neuropsychologists. Patients were tested

in a seated position in the neuropsychological lab rooms of our department. Control subjects

underwent computerized testing at different quiet and undisturbed places but established

neuropsychological tests were also performed in the lab rooms. Test administrators were

instructed to ensure optimal and constant lighting conditions. The test administrators always

remained present and no group tests were performed.

The subtests provide raw scores, i.e. number of correct reactions and errors, or reaction times,

respectively. The Digit Span test is standardized for the maximum length of reproducible digit

sequences (i.e., span). The recognition memory scores are calculated by hits minus (0.5*false

alarms) with hits being correctly confirmed items and false alarms being erroneously confirmed

distracters. Thus, random response strategies (e.g. confirming or rejecting each single item)

likewise result in a score of 0, perfect performance equals the number of learning items (12 or 7,

respectively), and scores markedly below 0 may indicate possible malingering [18].
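
The correction described above can be sketched in a few lines of Python. This is an illustration only, not the original Delphi code: the function name is hypothetical, and the number of distracters (two per learning item) is an assumption chosen so that confirming every item yields a score of 0, as the text states.

```python
def recognition_score(hits: int, false_alarms: int) -> float:
    """Corrected recognition score: hits minus half of the false alarms."""
    if hits < 0 or false_alarms < 0:
        raise ValueError("counts must be non-negative")
    return hits - 0.5 * false_alarms


# Hypothetical verbal list: 12 learning items (from the text) and
# 24 distracters (ASSUMPTION: two distracters per learning item).
N_TARGETS = 12
N_DISTRACTERS = 24

confirm_all = recognition_score(N_TARGETS, N_DISTRACTERS)  # every item confirmed
reject_all = recognition_score(0, 0)                       # every item rejected
perfect = recognition_score(N_TARGETS, 0)                  # all hits, no false alarms
worse_than_guessing = recognition_score(2, 20)             # markedly below 0
```

Under these assumptions both indiscriminate strategies score 0, perfect performance equals the number of learning items, and a subject who mostly confirms distracters falls clearly below 0.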

For the evaluation of validity and reliability, the Phonematic Fluency test was primarily

administered in a written form (i.e., patient writing words on a sheet of paper) with a standard

initial letter (P) according to an established German testing procedure (Leistungs-Prüf-System,

Subtest 6, by Horn [19]). However, written administration would exclude patients with motor

handicaps from being assessed. Furthermore, serial follow-up assessments require randomly

changing initial letters. According to our previous neuropsychological experience with epilepsy

patients, test performance remains more or less unaffected by the test mode (oral vs. written)

because the requirement to write the words is no limiting factor. However, initial letter selection

will influence performance. In its recent version, Phonematic Fluency is administered in an oral

form, i.e. patients say words and the tester counts the number of words by clicking a button on

the screen. The initial letters are randomly selected by the computer (set: L, P, or S). Meanwhile,

further consecutive patients from our outpatient clinic have been included allowing statistical

analyses of the effects of the different test modes (see below).

Based on exploratory Principal Component Analyses (PCA; see below) two measures of overall

performance, instead of a single total score, were defined. SCORE was defined by the mean of

the mean standard values of the scored subtests on working memory (Digit Span, Two-Back),

memory (Verbal and Figural Memory), and verbal fluency (Phonematic Fluency). RTT was

defined by the mean of the standard values from the three reaction-time based tests (Simple

Reaction, Go/No Go, Inverted Go/No Go).
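
A minimal Python sketch of the two overall measures follows. It assumes one standard value per subtest and reads "the mean of the mean standard values" as averaging within each domain first and then across the three domains; both points are assumptions, and all identifiers are hypothetical.

```python
from statistics import mean

# Domain grouping taken from the text; the key names are hypothetical.
SCORE_DOMAINS = {
    "working_memory": ["digit_span", "two_back"],
    "memory": ["verbal_memory", "figural_memory"],
    "verbal_fluency": ["phonematic_fluency"],
}
RTT_SUBTESTS = ["simple_reaction", "go_no_go", "inverted_go_no_go"]


def overall_scores(sv: dict) -> tuple:
    """Return (SCORE, RTT) from a mapping of subtest name -> standard value."""
    # ASSUMPTION: average within each domain, then average the domain means.
    domain_means = [
        mean(sv[name] for name in subtests)
        for subtests in SCORE_DOMAINS.values()
    ]
    score = mean(domain_means)
    rtt = mean(sv[name] for name in RTT_SUBTESTS)
    return score, rtt


# Hypothetical standard values for one subject (not data from the study).
example = {
    "digit_span": 100.0, "two_back": 90.0,
    "verbal_memory": 110.0, "figural_memory": 100.0,
    "phonematic_fluency": 100.0,
    "simple_reaction": 90.0, "go_no_go": 100.0, "inverted_go_no_go": 110.0,
}
score, rtt = overall_scores(example)
```

Averaging within domains first keeps the single fluency subtest from being outweighed by the two-subtest domains; a flat mean over all five subtests would weight the domains differently.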

2.2. Healthy subjects

Psychometric evaluation and standardization were primarily done in healthy subjects. None of

these subjects had a history of neurological or psychiatric disease. All healthy subjects (and the

parents of minors) gave informed consent.

CON-TOTAL sample: Normative data were derived from N=244 healthy subjects with a mean

age of 42.1 years (median: 40.6; range=17-80, SD=17.7; gender male/female: 107/137;

handedness right/left/ambidextrous: 215/13/16). Subjects were recruited according to the

predefined ranges of the normative age groups: adolescents (16-29 years; n=87); younger adults

(30-44 years; n=57); older adults (45-59 years; n=48); and seniors (60-75 years; n=52).

The level of education of subjects from the normative sample was tested by a paper-pencil

multiple-choice word/pseudowords vocabulary test (Mehrfachwahl-Wortschatztest, MWT-B

[20]). The mean MWT-B IQ of the normative sample was 115.4 (SD=15.3). This high mean IQ

value was the result of the dated norms (1977) and actually corresponds to present average

intelligence levels as indicated by the almost identical MWT-B test results in a recent

standardization study in 235 other healthy adult subjects (unpublished data; cf. Flynn effect [21]).

CON-RELIABLE subsample: Subtest reliabilities were estimated by test-retest correlations. A

subgroup of N=44 healthy subjects with a mean age of 36.8 years (median=28.4, range=17-72,

SD=16.3; male/female: 17/27; handedness right/left/ambidextrous: 40/1/3) underwent the

computerized test battery twice with a mean interval of 2.0 months (median=2.0, range=0.9-3.7,

SD=0.5). To evaluate later practice effects, nineteen of these subjects were assessed four times in

total with a mean inter-test interval of 1.36 months.

CON-VALIDATE subsample: Subtest validities were estimated by concurrent validation, i.e.

correlations of newly introduced and established neuropsychological measures. A subgroup of

N=40 healthy subjects with the mean age of 41.3 years (median=39.9, range=17-66, SD=14.6;

male/female: 13/27; handedness right/left: 36/4) were tested by the computerized test as well as

by an established comprehensive neuropsychological test battery.

2.3. Patients

Additional data for psychometric evaluation and preliminary clinical evaluation including the

analysis of sensitivity/specificity of the test scores were recorded in unselected consecutive

adolescent and adult patients from the Bonn Department of Epileptology. All patients (and

parents of the minors) gave their informed consent to participate in the study.

PAT-TOTAL sample: This sample comprised all data from the first application of

NeuroCog FX® from N=212 consecutive patients with a mean age of 38.1 years (median=37.0;

range=15-74, SD=13.7; male/female: 100/112; handedness right/left/ambidextrous: 178/16/18).

The first N=40 patients were enrolled prospectively while the subsequent N=172 patients were

selected for neuropsychological testing for diverse clinical indications (e.g. subjective

complaints, medication change, baseline/outcome evaluation of in-house neuropsychological

training).

PAT-RELIABLE subsample: A subgroup of N=94 patients with the mean age of 38.1 years

(median=35.1, range=16-74, SD=14.1; gender male/female: 42/52; handedness

right/left/ambidextrous: 79/6/9) underwent NeuroCog FX® assessment at least twice within a

maximum interval of three months (test-retest interval/days: mean=13.9, median=6.0, range=0-105,

SD=24.9) for clinical indications.

PAT-VALIDATE subsample: A subgroup of N=126 patients with the mean age of 38.6 years

(median=40.0, range=16-70, SD=13.1; gender male/female: 60/66; handedness

right/left/ambidextrous: 101/6/19) underwent NeuroCog FX® but were also selected for

comprehensive neuropsychological testing with an established test battery for diverse clinical

reasons. The neuropsychological standard battery could not be completed in all patients; patient

selection for this study was based on completed verbal and figural memory assessment (see

below).

2.4. Factor analyses

In many assessment batteries subtest scores are finally summed up in a single total score (e.g.

full-scale IQ). To obtain rational parameters of overall test performance, exploratory PCA was

applied to the NeuroCog FX® subtest raw scores from the total sample of all subjects (CON- and

PAT-TOTAL, N=379; extraction criterion: Eigenvalue>1, factor rotation: VARIMAX with

Kaiser’s normalization). Notably, PCA was not performed to test whether the neuropsychological

functional domains addressed by the computerized tasks are extracted as statistically independent

factors.

2.5. Psychometric evaluation

Reliabilities and practice effects. Subtest reliabilities were estimated by Pearson’s product-

moment correlation of test and retest scores (r12) in the CON-RELIABLE sample. Additionally,

Spearman’s rank correlation coefficients were calculated to rule out possible effects of the non-

normal data distribution. Practice effects were estimated by the mean differences (MD) between

group means of the test and retest raw scores in the CON-RELIABLE subsample.
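
The two estimates named above can be illustrated with a short Python sketch. The sign convention for MD (retest minus test, so that improvement in an accuracy score is positive) is an assumption, as are the function names; the numbers are invented for illustration and are not study data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of paired test/retest scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def practice_effect(test, retest):
    """Difference of group means. ASSUMPTION: computed as retest minus test."""
    return mean(retest) - mean(test)


# Invented raw scores for four subjects (illustration only).
test_scores = [10, 12, 14, 16]
retest_scores = [11, 13, 15, 17]

r12 = pearson_r(test_scores, retest_scores)
md = practice_effect(test_scores, retest_scores)
```

In this toy example every subject gains exactly one point, so the rank order is perfectly preserved (r12 of 1) while the group mean shifts by one point; reliability and practice effect are separate properties of a measure.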

In addition, effects of repeated testing were separately evaluated in the PAT-RELIABLE sample.

In most of these patients, re-evaluation with NeuroCog FX® aimed at the evaluation of cognitive

effects of clinical interventions such as antiepileptic medication changes or cognitive training

performed during the test-retest interval. Though being retrospective and confounded by a variety

of treatments, this data provides important additional information on reliability and practice

effects in a clinical target population of the test.

Concurrent validities. NeuroCog FX® was validated in the CON-VALIDATE subsample of

healthy subjects (N=40; for sample characteristics see above) and, separately, in the PAT-

VALIDATE patient sample (N=126). All subjects underwent computerized testing and a well-

established comprehensive neuropsychological assessment battery; Table 2 lists the established

tests (for detailed descriptions, see [22]).

> Table 2

Computerized and established testing was administered in a single session in healthy subjects but

with an interval in patients (N=126; sequence NeuroCog FX® – standard test battery: N=28,

range=1-291, median: 90 days; both batteries on the same day: N=57; sequence standard test

battery – NeuroCog FX®: N=41, range=1-326, median: 8 days).

Concurrent validities of NeuroCog FX® subtests were evaluated by two different approaches.

Firstly, Pearson’s product-moment correlations of NeuroCog FX® subtests and the domain-

related subtests from the established neuropsychological battery were calculated as validity

estimates; possible effects of non-normal data distribution were controlled for by also calculating

the respective Spearman’s rank correlations. Secondly, PCA with VARIMAX-rotation was

applied on raw scores from both instruments to test the functional coherence of NeuroCog FX®

measures and respective established tests. Both analyses were performed separately for healthy

subjects (CON-VALIDATE) and patients (PAT-VALIDATE).

2.6. Individual diagnostic evaluation

Standardization. Data were standardized to allow age-related individual diagnostic evaluation and

comparisons between different subtests. Normal distribution was tested by the Kolmogorov-

Smirnov goodness-of-fit test. To ensure the usual reference of standard values (SV; mean=100,

SD=10) and percentiles (PR; e.g. SV=90 refers to PR=16) despite the non-normal distribution of

raw scores, SV were assigned to raw scores separately for the pre-defined age groups based on a

set of selected PR (plane transformation [23]: PR/SV: 0/60, 1/70, 3/80, 10/85, 16/90, 20/92, 30/95,

40/97, 50/100, 60/103, 70/105, 80/108, 84/110, 90/115, 97/120, 99/130, 100/140 with PR 100/SV

140 being reserved for future performance exceeding performance of the normative sample).
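
The percentile-based assignment can be sketched as a lookup in Python. The PR/SV anchor pairs are copied from the list above; everything else is an assumption made for illustration: the percentile rank is taken as the share of normative scores strictly below the raw score, the SV of the highest anchor not exceeding that rank is assigned, and higher raw scores are treated as better (reaction times would have to be reversed first).

```python
from bisect import bisect_right

# Anchor pairs from the text: percentile rank (PR) -> standard value (SV).
PR_ANCHORS = [0, 1, 3, 10, 16, 20, 30, 40, 50, 60, 70, 80, 84, 90, 97, 99, 100]
SV_ANCHORS = [60, 70, 80, 85, 90, 92, 95, 97, 100, 103, 105, 108, 110, 115, 120, 130, 140]


def percentile_rank(raw, norm_scores):
    """ASSUMPTION: PR = percentage of normative scores strictly below raw."""
    if not norm_scores:
        raise ValueError("normative sample must not be empty")
    below = sum(1 for s in norm_scores if s < raw)
    return 100.0 * below / len(norm_scores)


def standard_value(raw, norm_scores):
    """Assign the SV of the highest PR anchor that does not exceed the PR."""
    pr = percentile_rank(raw, norm_scores)
    idx = bisect_right(PR_ANCHORS, pr) - 1
    return SV_ANCHORS[idx]


# Hypothetical normative age group with raw scores 1..100.
norms = list(range(1, 101))

sv_median = standard_value(51, norms)    # PR 50
sv_pr16 = standard_value(17, norms)      # PR 16
sv_lowest = standard_value(1, norms)     # PR 0
sv_above = standard_value(101, norms)    # exceeds every normative score
```

With this rule a score above the whole normative sample is the only way to reach PR 100, which matches the statement that SV 140 is reserved for performance exceeding that of the normative sample.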

Reliable change. The statistical evaluation of differences between the test scores in individual

patients (e.g. changes from test to retest in the same measure) requires the calculation of a

confidence interval based on critical differences (∆-crit) and practice effects (shift of expectancy

value) of the respective score [23]. Thereby, ∆-crit refers to a defined significance level α of the

confidence interval and reflects the non-perfect reliability (rr) of the measure (rr<1). ∆-crit is

defined by $\Delta\text{-crit} = \pm z_{\alpha} \cdot SE_{\Delta}$ with the standard error of change $SE_{\Delta} = SD_{Test} \cdot \sqrt{2\,(1 - \mathit{rr})}$ and

$SD_{Test}$ as the standard deviation of the test score. Here, ∆-crit were calculated for α=.10 with $z_{\alpha} = \pm 1.64$

and the reliability rr was estimated by Pearson’s product-moment correlation of test and

retest score r12 (i.e., $\Delta\text{-crit} = (\pm 1.64) \cdot SD_{Test} \cdot \sqrt{2\,(1 - r_{12})}$). The 90%-confidence intervals for

each score were then calculated by MD ± ∆-crit with MD as the test-retest difference of group

means which are used as an estimate of the true practice effect. The outer limits of the confidence

intervals are given, i.e. differences of scores already indicating significant decline or

improvement, respectively.
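
As a worked illustration of these definitions, the following Python sketch computes ∆-crit and the interval MD ± ∆-crit and classifies an individual test-retest difference. The function names, the treatment of the interval limits as exclusive, the assumption that higher scores mean better performance, and the example numbers are all hypothetical.

```python
from math import sqrt

Z_ALPHA_10 = 1.64  # z value used in the text for alpha = .10


def critical_difference(sd_test, r12, z=Z_ALPHA_10):
    """Delta-crit = z * SD_Test * sqrt(2 * (1 - r12))."""
    if not -1.0 <= r12 <= 1.0:
        raise ValueError("r12 must lie in [-1, 1]")
    return z * sd_test * sqrt(2.0 * (1.0 - r12))


def reliable_change_interval(sd_test, r12, practice_effect, z=Z_ALPHA_10):
    """Interval MD +/- Delta-crit around the expected practice effect MD."""
    d = critical_difference(sd_test, r12, z)
    return practice_effect - d, practice_effect + d


def classify_change(diff, interval):
    """ASSUMPTION: higher scores are better; limits themselves count as no change."""
    lower, upper = interval
    if diff < lower:
        return "decline"
    if diff > upper:
        return "improvement"
    return "no reliable change"


# Hypothetical measure: SD = 10, r12 = 0.5, practice effect MD = +2 points.
d_crit = critical_difference(10.0, 0.5)
interval = reliable_change_interval(10.0, 0.5, 2.0)
labels = [classify_change(x, interval) for x in (20.0, 5.0, -15.0)]
```

Because the interval is centered on MD rather than on 0, a patient must improve by more than the expected practice gain plus ∆-crit to count as improved, while a smaller drop already counts as decline.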

Determination of diagnostic thresholds. To identify the optimal diagnostic thresholds, diagnostic

classifications (affected versus unaffected) based on scores from the computerized screening

battery were compared to categorizations based on established neuropsychological tests. The

analysis was focused on memory deficits which are particularly important in epilepsy but also

other neurological diseases [1,2,24]. Based on the assumption that a functional deficit is reliably

indicated by age-corrected below-average scores in established tests (SV<85 = mean - 1.5 SD, ∆-

crit being considered), the total rate of type I and type II classification errors, the relative

reduction of incorrect classifications as compared to random classification, positive and negative

predictive values, likelihood ratios positive and negative, sensitivity, specificity, and correlation

coefficients (ϕ, χ2) were calculated for different thresholds of the computerized scores (SV<80,

<85, <90, <95) in the merged PAT- and CON-VALIDATE subsamples. The same coefficients of

diagnostic utility were calculated for the NeuroCog FX® overall performance score, SCORE,

with regard to the identification of other cognitive deficits (SV<85) in at least two non-memory

tests from the established assessment battery.
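
Most of the coefficients listed above follow from a 2x2 cross-tabulation of the screening classification against the established-test classification. The Python sketch below shows one conventional way to compute them; it is not the SPSS procedure used in the study, the names are hypothetical, and the relative reduction of incorrect classifications is left out because its exact definition is not given in the text.

```python
from dataclasses import dataclass
from math import inf, sqrt


@dataclass(frozen=True)
class Confusion:
    tp: int  # flagged by screening, deficit in established test
    fp: int  # flagged by screening, no deficit in established test
    fn: int  # not flagged, deficit in established test
    tn: int  # not flagged, no deficit in established test


def cross_tabulate(screen_sv, reference_deficit, threshold):
    """Flag a case when its screening standard value is below the threshold."""
    tp = fp = fn = tn = 0
    for sv, deficit in zip(screen_sv, reference_deficit):
        flagged = sv < threshold
        if flagged and deficit:
            tp += 1
        elif flagged:
            fp += 1
        elif deficit:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def diagnostic_utility(c):
    affected, unaffected = c.tp + c.fn, c.fp + c.tn
    flagged, cleared = c.tp + c.fp, c.fn + c.tn
    if 0 in (affected, unaffected, flagged, cleared):
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    sens, spec = c.tp / affected, c.tn / unaffected
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": c.tp / flagged,
        "npv": c.tn / cleared,
        "lr_positive": sens / (1 - spec) if spec < 1 else inf,
        "lr_negative": (1 - sens) / spec if spec > 0 else inf,
        "error_rate": (c.fp + c.fn) / (affected + unaffected),
        "phi": (c.tp * c.tn - c.fp * c.fn)
        / sqrt(affected * unaffected * flagged * cleared),
    }


# Invented counts and values for illustration only (not study results).
utility = diagnostic_utility(Confusion(tp=30, fp=10, fn=10, tn=50))
table = cross_tabulate([78, 92, 84, 101], [True, False, False, True], threshold=85)
```

Running `cross_tabulate` once per candidate threshold (80, 85, 90, 95) and comparing the resulting coefficients is one way to reproduce the kind of threshold comparison described in the paragraph above.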

2.7. Cognitive performance in patients

In addition to psychometric analyses, the group differences between patients and controls were

analyzed by MANOVAs and MANCOVA (age as covariate). Findings from the

computerized measures were compared with respective findings based on established tests.

2.8. Statistics

The significance level was set to α=.05 for all tests. Statistical analyses were performed with SPSS 17

(SPSS Inc., 2008).

3. Results

The median test duration was 24 minutes in healthy subjects and 28 minutes in patients

(maximum: 35 minutes). Some patients had difficulties understanding the Two-Back test despite

the integrated demonstration tool; consequently, in 3 out of 212 patients this test could not be

performed.

3.1. Factor analyses

Explorative PCA on raw scores from the total sample of subjects (CON- and PAT-TOTAL,

N=379) extracted two factors (Eigenvalue>1, VARIMAX rotation): SCORE, comprising all

scored subtests (Digit Span, Two-Back, Verbal and Figural Memory, and Phonematic Fluency);

and RTT, comprising the three reaction-time based tests (Simple Reaction, Go/No Go, and

Inverted Go/No Go). All factor loadings >.30 are shown in Table 3.

> Table 3

The model explained 60% of the variance. Deleting the Two-Back test from analysis for its low

reliability (see below) results in a slightly improved model with identical factors explaining 64%

of the variance. Based on this data, instead of defining a single total score, two measures of

overall performance, SCORE and RTT, were defined. The SCORE and RTT standard values

shared about 16% of their variance (r=0.42, P<.001).

3.2. Psychometric evaluation

Reliability and validity were estimated based on Pearson’s product-moment correlation.

Importantly, non-parametric Spearman’s rank correlation generally yielded similar results.

Reliability and practice effects. Table 4A shows the practice effect estimates (i.e., group mean

differences between test and retest raw scores, MD) and the reliability estimates calculated by

Pearson’s product-moment correlations (r12) of test and retest raw scores in the sample of healthy

subjects.

> Table 4A

With the exception of the Two-Back and the Simple Reaction test all subtests yielded significant

practice effects, i.e. improvements from test to retest. Also SCORE, but not RTT, showed

significantly increased scores from test to retest. For those 19 subjects who underwent the test

four times, further improvements occurred in single measures from second to third application

(reaction times in the Simple Reaction and the Go/No Go test) but not between third and fourth

application (data not shown) indicating sufficient test stability.

All but one of the test-retest correlations were significant (5 scores P<.001, 2 scores P<.01). The

correlations were large for overall performance standard values (SCORE: r12=0.71; RTT:

r12=0.55) and medium to large in the majority of subtests (0.45 ≤ r12 ≤ 0.69). However, the Two-

Back test showed no significant test-retest correlation (r12=0.21, P>.05). Near-to-the-maximum

mean values and the small variance in this measure indicate a possible ceiling effect.

In addition, we estimated practice effects and reliabilities based on patient data from the PAT-

RELIABLE sample (Table 4B).

> Table 4B

Practice effects were clearly smaller in patients and reached significance only for Digit Span

(improved) and Simple Reaction (decelerated). Furthermore, the test-retest correlations were

large for SCORE and all subtests (0.70 ≤ r12 ≤ 0.84) indicating high reliability of all measures

when estimated based on patient data. However, due to larger variance, the critical differences for

significant change are not smaller than those derived from healthy subjects.

Concurrent validity. Table 5 shows the concurrent validity estimates, i.e. the Pearson’s product-

moment correlation coefficients of corresponding measures from NeuroCog FX® and the

established test battery, separately for healthy subjects (CON-VALIDATE) and patients (PAT-

VALIDATE).

> Table 5

In healthy subjects, reaction times and the Two-Back test score did not correlate with tests from

the validation test battery which, however, did not include directly comparable counterparts.

Significant correlations were obtained for the tests on working memory (small to medium),

memory (medium to large), and verbal fluency (large). In patients, all but one of the subtests from the

computerized battery showed medium to large correlations with established counterparts.

However, Verbal Memory and one of the VLMT scores (retention) showed only a small

correlation in patients (r=-0.19, P<.05; healthy subjects: r=-0.49).

Concurrent validity was also tested by explorative PCA on the raw scores of computerized and

established measures separately in healthy subjects and patients (CON- and PAT-VALIDATE

subsamples). The Two-Back test was excluded from this analysis due to its low reliability. From

the established battery the Maze Test and Verbal Semantic Fluency were excluded due to the

frequent missing values. Table 6A and 6B show all VARIMAX-rotated factor loadings >.30 for

healthy subjects and patients (CON- and PAT-VALIDATE). Measures and factors were arranged

by the subtests from NeuroCog FX®.

> Table 6A and 6B

Analyses in both samples extracted 6 factors (Eigenvalue > 1). The computerized reaction time

tests loaded on a single factor but were not associated with established measures. The other

subtests loaded on different factors and were associated with their established neuropsychological

counterparts. For example, Verbal Memory loaded on a factor together with scores from the

Verbal Learning and Memory Test. However, in patients the verbal retention score loaded on a

factor together with Digit Span instead of Verbal Memory. Thus, PCA confirmed concurrent

validity of the NeuroCog FX® subtests but with a limitation regarding verbal retention.

3.3. Individual diagnostic evaluation

Standardization. In healthy subjects, the raw scores of the following subtests were skewed,

indicating ceiling effects: Two-Back, Go/No Go, Inverted Go/No Go, and to a lesser extent

Verbal Memory. Possible ceiling effects were already considered in the previous reliability and

validation analyses. Standardization was based on the first test administration data from healthy

subjects for the predefined four age groups. Group-wise descriptive statistics and results from

univariate ANOVAs on age group effects (with post-hoc Scheffé tests) are shown in Table 7.

Post-hoc tests revealed no significant group differences between the younger adults (30-44 years)

and the older adults (45-59 years). Performance of seniors was significantly lower in all subtests.

> Table 7

Notably, with the exception of Figural Memory, neither the subtest raw scores nor the subtest

standard values were distributed normally (Kolmogorov-Smirnov goodness-of-fit test, P<.05).

The overall standard values, SCORE and RTT, showed normal distribution in patients, healthy

subjects, and in the total sample (P>.05). Standard values were assigned according to plane

transformation (see method section) [23]. Since then, the age-related normative data have been

integrated into an upgrade of the computerized test, which now provides automatic scoring and

normative age-related evaluation together with a graphical performance profile (Figure 1 shows

an example from the most recent test release).

> Figure 1

Reliable change. Tables 4A and 4B show the critical differences ∆-crit (α=.10) and the reliable

change indices indicating the outer limits of significant changes (decline/improvement) based on

90%-confidence intervals for raw scores in individual subjects. Measures of overall performance,

SCORE and RTT, may be regarded as changed if test and retest standard values differ by

approximately one SD (SV ±10).
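
As a rough illustration of how such critical differences can be computed, the following Python sketch uses the common formula ∆-crit = z · SD · sqrt(2 · (1 − r12)), with z ≈ 1.645 for α=.10 (two-sided). The formula is an assumption made here for illustration; the method section remains the authoritative definition. It does reproduce the tabulated value for Digit Span in healthy subjects (SD = 2.2, r12 = 0.68, ∆-crit = 2.9). The function names and the way the practice effect is subtracted are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_difference(sd: float, r12: float, alpha: float = 0.10) -> float:
    """Smallest test-retest difference regarded as reliable (two-sided)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.645 for alpha = .10
    return z * sd * sqrt(2 * (1 - r12))


def classify_change(score1: float, score2: float, sd: float, r12: float,
                    practice: float = 0.0, alpha: float = 0.10) -> str:
    """Label a retest result after removing the mean practice effect.

    Assumes that higher scores mean better performance; for reaction
    times the labels would have to be swapped.
    """
    diff = (score2 - score1) - practice
    crit = critical_difference(sd, r12, alpha)
    if diff > crit:
        return "improved"
    if diff < -crit:
        return "declined"
    return "unchanged"


# Digit Span, healthy subjects: SD = 2.2, r12 = 0.68
print(round(critical_difference(2.2, 0.68), 1))
```

With these inputs a retest change of one item (7 to 8 correct responses) stays within the interval, whereas a drop from 7 to 3 would be labeled a decline.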

Identification of diagnostic thresholds. Tables 8A-C show several coefficients of the diagnostic

utility of NeuroCog FX® memory subtests and the overall performance score, SCORE, at

different thresholds of the age-corrected standard values (SV<80, <85, <90, <95) with regard to

established indicators of memory and other cognitive deficits (threshold: SV<85). The analyses

are based on the combined data from patients and healthy subjects (PAT- and CON-VALIDATE)

but, notably, only single control subjects showed neuropsychological deficits according to the

definition.

> Tables 8A, 8B, & 8C

Regarding the different aspects of verbal memory, applying a strict threshold of SV<80 (2 SD,

corresponding to about 8 items in four trials; see Table 4A) to the computerized scores showed

advantageous diagnostic properties such as lowest rate of classification errors, highest relative

error reduction, and a “fair” likelihood ratio positive (Table 8A). However, at this threshold the

likelihood ratio negative is unfavorable and the sensitivity is rather low, i.e. the rate of unidentified affected

patients is rather high. For the verbal learning score and the recognition score slightly better

values were achieved as compared to the final score and the retention score.

Regarding figural learning (Table 8B), the diagnostic utility appears higher than for verbal

learning and memory, with the best outcome being achieved if a less strict threshold of SV<90 (1 SD,

corresponding to about 6 figures in four trials; see Table 4A) is applied to the Figural Memory

standard value.

Finally, the overall performance of the scored tests (including the computerized memory tests),

represented by SCORE, was explored regarding its utility to diagnose other cognitive deficits,

irrespective of possible memory deficits (Table 8C). A balance of sensitivity (0.69) and

specificity (0.72) might be achieved for SV<95 (i.e., 0.5 SD) but regarding rate of classification

errors, relative error reduction, and likelihood ratio positive, SCORE shows similar or

advantageous values for the more usual threshold of SV<90.
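
The coefficients discussed in this section can be derived from a 2x2 cross-tabulation of the screening result (standard value below a chosen threshold) against the reference classification from the established tests. The Python sketch below shows the standard definitions of sensitivity, specificity, classification error rate, and likelihood ratios. The class and function names and all example counts are hypothetical and are not taken from Tables 8A-C; relative error reduction is omitted because its definition is not given in this section.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # deficit on established tests, screening below threshold
    fp: int  # no deficit, screening below threshold (false alarm)
    fn: int  # deficit, screening not below threshold (miss)
    tn: int  # no deficit, screening not below threshold


def tabulate(screen_sv, deficit, threshold):
    """Cross-tabulate screening standard values against reference deficit flags."""
    c = Confusion(0, 0, 0, 0)
    for sv, has_deficit in zip(screen_sv, deficit):
        flagged = sv < threshold
        if flagged and has_deficit:
            c.tp += 1
        elif flagged:
            c.fp += 1
        elif has_deficit:
            c.fn += 1
        else:
            c.tn += 1
    return c


def diagnostic_utility(c: Confusion) -> dict:
    """Standard diagnostic coefficients; assumes both classes are non-empty."""
    n = c.tp + c.fp + c.fn + c.tn
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "error_rate": (c.fp + c.fn) / n,  # false alarms plus misses
        "lr_positive": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_negative": (1 - sens) / spec if spec > 0 else float("inf"),
    }


# Hypothetical counts, for illustration only
print(diagnostic_utility(Confusion(tp=30, fp=15, fn=10, tn=45)))
```

Calling `tabulate` repeatedly with thresholds of 80, 85, 90, and 95 would show the trade-off described above: a higher cut-off flags more subjects, which raises sensitivity at the expense of specificity.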

3.4. Neuropsychological performance in patients

A MANCOVA on the NeuroCog FX® data from the merged PAT-TOTAL and CON-TOTAL

sample (N=379, complete data) yielded significant effects of the covariate age (Wilks λ=0.73,

F(8, 369)=17.0, P<.001) and the group factor (Wilks λ=0.74, F(8, 369)=16.5, P<.001), indicating general

cognitive impairment in patients irrespective of age in a group level analysis; univariate post-hoc

analyses confirmed the group effect in each single measure. MANOVA on the age-corrected

standard values of overall performance (SCORE, RTT) also showed a significant main effect of

the group factor (N=361; Wilks λ=0.68, F(2, 358)=83.4, P<.001); post-hoc univariate analyses

confirmed the group effect for both parameters.

Of eight available scores, the mean number of below-average scores (SV<90) was 2.8 (SD: 2.2;

median 2.0) in patients and 1.1 (SD: 1.3; median 1.0) in controls (χ²=95.5, P<.001). Thus,

analyses at the individual level revealed rather specific profiles of impaired and

unimpaired functions in patients. Figure 2 shows the subtest performance profiles in terms of

percentages of subjects with below-average standard values. SCORE showed greater group

differences than RTT. Verbal memory and phonematic fluency appeared most susceptible to

cognitive deterioration.

> Figure 2

In order to compare the findings from the computerized tests and the established

neuropsychological assessment battery, further group analyses were performed on established

measures from the merged CON- and PAT-VALIDATE samples (N= 103). Multivariate analysis

again yielded a main effect of the group factor but no effect of age was obtained in this sample.

Post-hoc univariate analyses yielded effects of age as a covariate only for Figural Memory

(F=11.7, P<.01) and Simple Reaction (F=4.1, P<.05) as well as for 3 of 15 established measures

(CIT Interference, VLMT learning score, DCS learning score). Thus, no evidence could be

revealed for a general inappropriateness of computerized testing in older subjects with

presumably less computer experience. Significant main effects of the group factor (corrected for

age) were obtained for 6 of 8 single measures but not for Simple Reaction and Inverted Go/No Go

which, however, showed significant group differences in the TOTAL samples. A similar pattern

of general cognitive deterioration in epilepsy patients was revealed by established measures.

Patients showed significantly lower performance on all established tests and measures except for

the Corsi Block Tapping (P<.001 for most of the measures).

3.5. Phonematic fluency

The effects of changing the conditions of the Phonematic Fluency test were assessed in the

PAT-VALIDATE sample. In line with our hypothesis, patients who completed the written form of

the Phonematic Fluency test in the early stages of test development (N=42) did not differ in their

results (raw score, standard value) from the orally tested patients enrolled later (N=82;

Mann-Whitney test, raw score: P=.86, standard value: P=.30). However, the initial letter selection

significantly affected performance with ‘S’ yielding higher scores than ‘L’ and ‘P’ (‘L’: N=52,

mean: 10.0; ‘P’: N=92, mean: 9.7; ‘S’: N=48; mean: 12.6; ANOVA: F=6.95, P=.001, post-hoc

Scheffé tests: ‘S’>‘L’: P=.018, ‘S’>‘P’: P=.002). Therefore, the following correction for initial

letter ‘S’ has since been integrated into the program: the raw score is reduced by 20%

(truncated) before the age-corrected standard value is assigned. After this

correction the mean standard values for the three initial letter groups were equal (ANOVA,

F=0.01, P=.91).
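
The correction for initial letter ‘S’ can be sketched in a few lines of Python. The text does not state whether the subtracted amount or the corrected score is truncated; the sketch assumes that the corrected score (80% of the raw score) is truncated to an integer. The function name is hypothetical and does not reflect the actual program code.

```python
def corrected_fluency_score(raw_score: int, initial_letter: str) -> int:
    """Reduce the raw score by 20% for initial letter 'S'; leave 'L' and 'P' unchanged.

    Assumption: the corrected score, not the subtracted amount, is truncated.
    """
    if initial_letter.upper() == "S":
        return (raw_score * 4) // 5  # 80% of the raw score, truncated
    return raw_score


print(corrected_fluency_score(12, "S"))  # 12 * 0.8 = 9.6, truncated to 9
print(corrected_fluency_score(12, "P"))  # unchanged: 12
```

The corrected raw score would then be passed to the age-related standard value assignment in the same way as scores obtained with ‘L’ or ‘P’.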

4. Discussion

NeuroCog FX® is a PC-based cognitive screening instrument which was developed to fill the

diagnostic gap between unspecific ratings and comprehensive neuropsychological test batteries.

For example, economic identification of patients for more comprehensive neuropsychological

examinations, frequent follow-up examinations of cognitive performance, or the preliminary

scientific evaluation of cognitive drug effects may motivate and justify using a screening

instrument. NeuroCog FX® can be administered in about 30 minutes. The test can be

administered by non-academic personnel (e.g. medical students). The most recent version of the

test system integrates scoring (including determination of age-corrected standard values) and electronic

filing for later scientific use. The tasks address four cognitive domains: attention

(psychomotor speed), working memory (capacity, manipulation), verbal and figural memory

(learning and recognition), and language (phonematic fluency). These functions are related to the

quality of life and are known to be sensitive indicators of treatment effects and latent disease

dynamics in epilepsy and brain tumor diseases [1-7]. Two measures of overall performance,

SCORE (scored tests) and RTT (reaction-time based tests), were defined according to the results

from PCA. Individual diagnostics can be performed based on the age-related normative data and

the provided thresholds for clinical classification but strictly requires professional

neuropsychological expertise. The test is also intended for use in other neurological diseases and

is currently being evaluated in diverse studies (e.g. primary CNS lymphoma, glioma, myotonic

dystonia, and septicemia).

To avoid effects of a non-normal distribution of raw scores, plane transformation was applied,

i.e. an age-group related, percentile-based “manual” assignment of standard values to the eight

subtest raw scores (normative sample: N=244; age range = 16-80 years; four age groups with

N≥48) [23]. No group differences were found between younger and older adults (30-44 years

versus 45-60 years), but seniors (60-75 years) showed lower performance in all subtests. No

education-related norms were provided. In early-onset diseases, like many types of epilepsy, a

correction for the lower educational level would mask neurocognitive deficits. In late-onset

diseases, individual diagnostic evaluations must carefully account for the educational background

of a patient and should ultimately rely mainly on the intraindividual performance profile (Fig. 1).
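
One common way to implement a percentile-based assignment of standard values is sketched below in Python: the raw score is located in the normative sample of the subject's age group, its mid-rank percentile is mapped to a z-value through the inverse normal distribution, and the result is scaled to mean 100 and SD 10. This is an assumption for illustration only; whether NeuroCog FX® uses mid-ranks, clipping at the extremes, or fixed lookup tables is not stated here, and the function name is hypothetical.

```python
from statistics import NormalDist


def standard_value(raw: float, norm_scores: list[float]) -> int:
    """Percentile-based standard value (mean 100, SD 10) within one age group.

    Assumes that higher raw scores mean better performance; reaction
    times would have to be reversed before this step.
    """
    n = len(norm_scores)
    below = sum(s < raw for s in norm_scores)
    ties = sum(s == raw for s in norm_scores)
    pct = (below + 0.5 * ties) / n             # mid-rank percentile
    pct = min(max(pct, 0.5 / n), 1 - 0.5 / n)  # keep away from 0 and 1
    z = NormalDist().inv_cdf(pct)
    return round(100 + 10 * z)


# Hypothetical normative sample for one age group
norms = [float(x) for x in range(1, 12)]
print(standard_value(6.0, norms))  # the median raw score maps to SV 100
```

Because the mapping depends only on ranks, skewed raw score distributions (e.g. ceiling effects) do not distort the scale of the resulting standard values, although many tied scores still limit how finely subjects can be distinguished.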

Based on a sample of healthy subjects (CON-RELIABLE), the test-retest correlations of all but

one of the NeuroCog FX® subtests were medium to large, confirming sufficient reliability.

However, the Two-Back score showed no significant test-retest correlation but a near-to-the-

maximum group mean and small variance (r12=.22, P=.15; mean=8.8, SD=2.3) indicating a

ceiling effect which might have contributed to the low test-retest correlation. In contrast, patients

showed “large” test-retest correlations in all scores (0.70<r12<0.85) including the Two-Back test

(r12=.70, P<.001; mean=5.4, SD=4.0). We conclude that NeuroCog FX® is suited for serial use

in epilepsy patients but may be inappropriate for studies in high-functioning subjects.

From a clinical perspective, an individual classification of the course of performance during

follow-up examinations may be required. Reliable change indices (90%-confidence intervals)

indicated that significant individual change in the NeuroCog FX® subtests corresponds to a raw score

difference of approximately one normative standard deviation (i.e., 10 standard value points).

This corresponds to findings on reliabilities and confidence intervals of established

neuropsychological measures. For example, for scores from the VLMT used in this study the

critical differences (P=.10) were about 1 SD for learning score, final score, and recognition score

and 1.6 SD for the retention score (unpublished data from a comprehensive normative study

performed in 2003, N=81 retested healthy subjects).

Concurrent validity was estimated based on correlations between computerized and established

neuropsychological measures. In patients, all computerized tests showed specific correlations

with established neuropsychological counterparts (small to large, 0.19<|r|<0.67). Validity

estimates appeared slightly lower for healthy subjects, probably due to ceiling effects as shown

above for reliability analysis. Explorative PCA, separately performed on both samples (CON-

and PAT-VALIDATE), confirmed that the computerized subtests (except the reaction times)

loaded on different factors together with their established neuropsychological counterparts. We

conclude that the NeuroCog FX® subtests address the different cognitive functions as intended

and, thus, cover important aspects of attention, working memory, verbal and figural memory, and

language. However, this screening instrument is not suited to replace established tests (and was

not intended to).

The clinical validity and the utility of individual diagnostic applications were further explored by

evaluating classification error rates, predictive values, likelihood ratios, sensitivity, and

specificity at different standard value thresholds of computerized test scores. The provided data

on diagnostic utility allow the user to define categorical thresholds according to his/her specific

requirements (Table 8A-C). Under consideration of critical differences, falling below a threshold

of (normative mean – 1.5 standard deviations) in established measures appears as a reasonable

criterion for diagnosing functional impairment. The focus of a more detailed analysis was set on

the detection of memory deficits which are an important co-morbidity of epilepsy. Importantly,

only single healthy subjects showed neuropsychological deficits according to this definition, and

NeuroCog FX® classified almost 100% of the healthy subjects correctly. The diagnostic utility of

the Figural Memory test appeared high for a threshold of SV<90 (Table 8B). For the Verbal

Memory test, the classification error rate was minimal at a rather strict threshold of SV<80 (Table

8A). For screening purposes, the rate of affected but non-identified patients (1-sensitivity) could

be decreased by applying a less strict threshold (i.e. a higher cut-off standard value) but, of course, at the expense of lower specificity. We

conclude that the computerized memory tests to some extent cover important aspects of verbal

and figural learning and memory as captured by established tests.

However, a substantial portion of subjects (about 30%) showed normal performance in the

computerized tests while failing established tests, and vice versa. The computerized tests

strongly differ from the established tests with regard to presentation and test modalities (e.g.

recognition versus free reproductions, verbal test: reading versus listening) which contributes to

this dissociation. In group studies, failure in verbal delayed free recall, which may remain

undetected by recognition tests, is correlated with (left) hippocampal dysfunction [24,25]. A

dissociation of intact recognition but impaired recall is also known from amnesic conditions [26].

Although the diagnostic properties of the established retention score (with regard to hippocampal

lesions) are unknown, NeuroCog FX® might be inappropriate, or require completion by further

tests, in studies on hippocampal function or dysfunction. Notably, in patients, the verbal retention

score did not load on one factor together with the Verbal Memory score (Table 6B). Besides

memory deficits, the overall performance measure, SCORE, showed sufficient diagnostic utility

for the detection of other cognitive deficits (e.g. working memory, verbal fluency).

After the early stages of test development (standardization, reliability/validity analyses; N=40

patients) further patients were selected for computerized and neuropsychological testing for

diverse clinical indications such as subjective complaints, medication changes, and

baseline/outcome evaluations of neuropsychological training. Some of the patients still

underwent diagnostic procedures and, thus, did not have a clear diagnosis of epilepsy. Therefore,

the clinical data, although recorded from unselected Epileptology unit patients, cannot be

unequivocally attributed to epilepsy. For this mixed patient population, group level analyses of

NeuroCog FX® scores as well as established measures revealed a pattern of general cognitive

impairment. However, single patients showed rather differential profiles of affected and

unaffected functions (median: 2 of 8 NeuroCog FX® subtests with SV <90). In terms of

percentage of subjects with below-average scores (SV<90; Fig. 2), Verbal Memory (patients vs.

healthy subjects: 44% vs. 14%) and Phonematic Fluency (45% vs. 13%) showed the greatest

group mean differences which is consistent with earlier findings based on established measures

[1,2,24].

We would like to close the discussion with some remarks on the benefits and pitfalls of

computerized screening, although a full review of this issue is beyond the scope of this article. The

most important risk of each screening approach, irrespective of whether or not it is computerized,

is that the tasks fail to address clinically relevant cognitive functions. For example, cognitive

effects of topiramate had been controversially discussed based on data from computerized

screening until established tests properly addressing language and other cognitive functions

associated with the frontal lobes were applied [27,28]. During the last two decades, several

computer-based or computer-assisted neuropsychological test systems were developed and applied also in

studies on epilepsy [12]. Some systems established for research on neurological diseases, such as

the Automated Neuropsychological Assessment Metrics (ANAM), to our knowledge, have never

been used in epilepsy research so far [29]. Like NeuroCog FX® subtests, most of the

computerized tasks represent computer adaptations of well-established neuropsychological

paradigms. Consequently, computerized testing has provided few theoretical or conceptual advances

to the field. NeuroCog FX® offers a selection of tasks which was based on the

neuropsychological expertise of our group in epilepsy and brain tumors [13-17,22,24,25]. Concurrent

validation analyses confirmed that the tasks actually address the intended cognitive domains. On

a group level, NeuroCog FX® is likely to detect systematic alterations in attention, working

memory, verbal and figural memory, or language.

The possible benefits of computerized testing such as high standardization and ease of

administration or automated scoring and filing have already been mentioned above. Most recent

reviews on the use of computerized testing finally arrive at a positive evaluation [10,11,29]. Beyond

the established psychometric criteria, important computer-specific issues are the application in

computer-naïve populations (e.g. older subjects) and technical issues related to hardware and

software specifications. Notably, analyses of age effects on NeuroCog FX® performance did not

reveal any computer-specific difficulties with the procedure in elderly subjects. Furthermore, the

user requirements for NeuroCog FX® are very low. All reactions are recorded via the keyboard,

most of them via the spacebar. If necessary, the tester may assist the testee. Neither patients nor

healthy subjects complained about the usability of the computerized test. The major technical

concerns refer to exact time measurement and timing of stimulus material as required for event-

related psychophysiological experiments [30]. NeuroCog FX® was evaluated on different

hardware and operating system platforms and is intended to be used flexibly. Users are advised to

close all other programs when running the test. Nevertheless, time measurement and timing

will not be exact at a millisecond level. However, psychometric analyses confirmed sufficient

reliability of the median reaction times despite the error variance caused by this technical

shortcoming. Scores from reaction-time based tests are not summed up with the other tests in a

single total score.

While the software guarantees a high standardization of the test procedure and the scoring

process, the hardware platforms and devices as well as relevant environmental conditions may

vary at different sites and times. We recommend that subjects are tested under comparable

conditions during follow-up examinations (i.e., same hardware, especially same screen; same

room, especially similar lighting conditions). Some further technical issues of computerized

testing, for example monitor flickering, have lost much of their former relevance due to new

technical developments. However, different lighting conditions might influence contrast and,

ultimately, performance (e.g. visuomotor reaction times). Therefore, our testers were instructed to

ensure optimal lighting and perfect viewing conditions. Graphics requirements (e.g. resolution)

are specified in the technical manual. Notably, data for psychometric analyses of NeuroCog FX®

were recorded under real-life conditions, i.e. these analyses already account for the error variance

caused by technical or environmental shortcomings and variation.

Korczyn and Aharonson (2007) in their review favor self-explaining systems allowing complete

self-administration in non-demented subjects even with aphasia [14]. Although NeuroCog FX®

would be suited for group administrations (e.g. no auditory signals) and even self-administration,

we do not recommend this use. In our experience, neuropsychological examination of patients

requires an assistant who ensures the correct understanding of the tasks, keeps the patient

motivated, and provides all of the necessary support (‘testing to the limits’). Usability criteria

proposed for self-administered computerized tests (so-called “controlled” or “supervised” mode)

do not apply to NeuroCog FX®, which is recommended for the so-called “managed” mode, i.e.

administration with high level human supervision [31].

5. Conclusion

NeuroCog FX® is an economical screening tool that allows repeated, science-based testing

during individual treatments or in multicenter group studies if more comprehensive

neuropsychological evaluations are inappropriate or unavailable. The test provides objective,

standardized, and sufficiently reliable and valid measures of clinically relevant cognitive

functions in epilepsy. The test does not replace a comprehensive neuropsychological evaluation.

The tool is presently under evaluation for use in patients with other neurological diseases [32].

Conflict of Interest Statement

With permission of the University of Bonn Medical Centre NeuroCog FX® is marketed by three

of the co-authors (C. Hoppe, K. Fließbach, C. Helmstaedter).

Acknowledgment

Thanks to Nina Stephanie Lehnen, Alexander Höinghaus, Frederike Adler, and Johanna Michel

for data recording in patients and healthy subjects as part of their MD theses. We are also grateful to

all participants of the studies. Special thanks to the anonymous reviewers for their detailed and

extraordinarily helpful comments.

Tables and Figures

Table 1: NeuroCog FX®: subtests, functions, task descriptions, and measures.

Note: Measures not selected for standardization are shown in parentheses. Tests are

shown according to the predefined standard administration sequence.

a In case of motor handicaps, the examiner is allowed to assist the subject in typing.

b The background for the different administration rules applied on this test is explained in

the method section. No significant effects of type of administration (oral versus

written) were obtained in the patient samples; effects of different initial letters (L, P, S)

are corrected (see results section).

Table 2: Established neuropsychological assessment battery.

Note: The tests are described in more detail in [22].

Table 3: PCA on NeuroCog FX® scores (PAT- and CON-TOTAL, N=379).

Note: VARIMAX rotation. All factor loadings ≥0.30 are shown.

Table 4A: Reliability, practice effects and critical differences in healthy controls (CON-

RELIABLE).

a T-test for paired samples.

r12: Pearson’s product-moment test-retest correlation (estimate of reliability); M1: group

mean of first test application; M2: group mean of retest; SD = standard deviation; ∆-crit:

critical differences (for details see method section, α=.10); C.I. = 90%-confidence

intervals for raw scores (declined/improved).

ns = non-significant, * p<.05, ** p<.01, *** p<.001

Table 4B: Reliability, practice effects and critical differences in patients (PAT-

RELIABLE).

For notes, see Table 4A.

Table 5: Concurrent validity estimates (CON- and PAT-VALIDATE).

+ p<.10, * p<.05, ** p<.01

Note: Pearson’s product-moment correlation coefficients.

Table 6A: Concurrent validation (PCA) in healthy controls (CON-VALIDATE).

Note: VARIMAX rotation. All factor loadings ≥0.30 are shown. The Two-Back test has

been excluded from this analysis.

Table 6B: Concurrent validation (PCA) in patients (PAT-VALIDATE, N=82).

For notes, see Table 6A.

Table 7: Performance in age groups: Means, standard deviations and results from

ANOVAs.

Table 8A: Detecting verbal memory deficits (CON- and PAT-VALIDATE, N=156).

a Deficits were indicated by age-corrected below-average scores of established

neuropsychological measures (SV<85, critical differences being considered).

b Including false alarms (type I errors) and misses (type II errors).

Table 8B: Detecting figural memory deficits (CON- and PAT-VALIDATE, N=138).

For notes, see Table 8A.

Table 8C: Detecting other cognitive deficits (CON- and PAT-VALIDATE, N=149).

a The median number of below-average scores was 1.0. Cognitive deficits were indicated

by two or more age-corrected below-average scores in neuropsychological tests exclusive

of scores from verbal and figural list learning (SV<85, critical differences being

considered).

b Including false alarms (type I errors) and misses (type II errors).

Table 1

NeuroCog FX®: subtests, functions, tasks, and measures.

Columns: Subtest / Function / Task / Measures

Subtest: Digit Span
Function: verbal short-term memory
Task: Successive visual presentation of single digits (1/second) from a digit sequence with increasing length (3-9); 2 trials for each span. Immediately recall the digit sequence by typing (number keys) (footnote a)
Measures: score: number of correct responses; (digit span: maximal length of correctly reproduced digit sequence)

Subtest: Two-Back
Function: working memory
Task: Continuous presentation of single digits (1/second). React (spacebar) as fast as possible if present digit equals the second to the last digit
Measures: score: hits minus false alarms; (reaction time: median of reaction times for hits)

Subtest: Simple Reaction
Function: alertness
Task: React as fast as possible when a blue circle occurs (spacebar)
Measures: reaction time: median of reaction times

Subtest: Go/No Go
Function: selective attention
Task: React (Go) as fast as possible if a blue circle occurs (spacebar) but ignore yellow circles (No go)
Measures: (hits); (false alarms); reaction time: median of reaction times for hits

Subtest: Inverted Go/No Go
Function: susceptibility to interference effects and cognitive flexibility
Task: Vice versa: React (Go) as fast as possible if a yellow circle occurs (spacebar) but ignore blue circles (No go)
Measures: (hits); (false alarms); reaction time: median of reaction times for hits

Subtest: Verbal Memory
Function: verbal learning and recognition
Task: 3 trials of word list learning (12 nouns from 6 predefined lists or random selection)
  learning: visual presentation (1 item per second)
  subsequent yes/no recognition tests (item distracter ratio = 1:2, paced presentation, spacebar press indicates ‘yes’, max. reaction interval: 2 seconds, same distracters but word sequence re-arranged from trial to trial)
  plus delayed yes/no recognition test (retention interval filled by Figural Memory)
  pool: 72 items (word frequency < 5 per million) and 140 distracters (word frequency <6 per million) from the CELEX database (Max-Planck-Institute of Neurolinguistics, Nijmegen/Netherlands)
Measures: (hits); (false alarms); total score: hits – false alarms/2; (reaction time: median reaction time for hits); (reaction time: median reaction time for false alarms)

Subtest: Figural Memory
Function: figural learning and recognition
Task: 3 trials of figure list learning (7 checkerboard patterns with 4 indicated yellow squares in a 3x3 blue matrix, from 6 predefined lists or random selection)
  learning: visual presentation (1 item per 2 seconds)
  subsequent yes/no recognition tests (item distracter ratio = 1:2, paced presentation, spacebar press to indicate ‘yes’, max. reaction interval: 2 seconds, same distracters but pattern sequence re-arranged from trial to trial)
  plus delayed yes/no recognition test (retention interval filled by Verbal Memory/delayed recognition)
  pool: 126 possible patterns, 42 items, 84 distracters
Measures: (hits); (false alarms); total score: hits – false alarms/2; (reaction time: median reaction time for hits); (reaction time: median reaction time for false alarms)

Subtest: Phonematic Fluency
Function: phonematic literal word fluency
Task: former version (CON-TOTAL, N=42 from PAT-VALIDATE): write words with initial letter P (paper-pencil test)
  present version (N=82 from PAT-VALIDATE): name words with random first letter (L, P, or S),
  each type of words but counting, conjunctions, or declinations are not permitted
  time: 1 minute
  program shows initial letter and elapsed time (footnote b) on the screen and allows the examiner to count correct words via button clicks
Measures: score: number of correct words

Table 2

Established neuropsychological assessment battery.

Columns: Function / Test/s / Subtests / Measures

Function: Attention and executive functions
  Test: Test für cerebrale Insuffizienz (c.I.T.) [Test for cerebral insufficiency]
    Subtest: Counting of Symbols. Measure: time for completion
    Subtest: AB-Interference. Measure: time for completion
  Test: Trail Making Test (TMT)
    Subtest: Forms A and B. Measure: time for completion
  Test: Maze Test (from Chapuis)
    Measure: time for completion

Function: Short-term memory and working memory
  Test: Span tests from Wechsler Memory Scale (WMS-III)
    Subtest: Digit Span/Forward. Measure: span
    Subtest: Digit Span/Reversed. Measure: span
    Subtest: Corsi Block Tapping Forward. Measure: block span

Function: Verbal memory
  Test: Verbaler Lern- und Merkfähigkeitstest (VLMT) [Rey Auditory Verbal and Learning Test]
    Measure: learning score: total of recalled words during learning (trials 1-5)
    Measure: final score: recalled words after retention (trial 7)
    Measure: retention score: loss of words from trial 5 to trial 7 (negative)
    Measure: recognition score: yes/no recognition, hits minus false alarms

Function: Figural memory
  Test: Diagnosticum für Cerebralschädigung - revidiert (DCS) [DCS – a visual learning and memory test for neuropsychological evaluation]
    Measure: learning score: total recalled figures during learning (trials 1-5)
    Measure: final score: number of recalled items in trial 5

Function: Word fluency, phonematic
  Test: Leistungs-Prüf-System [Performance-Test-System]
    Subtest: Subtest 6: Word fluency (written version). Measure: number of correct words

Function: Word fluency, semantic
  Test: Demenz-Test [Dementia-Test]
    Subtest: Supermarket Test (written). Measure: number of correct words

Table 3

PCA on NeuroCog FX® scores (CON- and PAT-TOTAL, N=379).

| NeuroCog FX® Tests – Measures | Factor SCORE | Factor RTT |
|---|---|---|
| Digit Span – score | .736 | |
| Two Back – score | .708 | |
| Verbal Memory – score | .568 | -.333 |
| Figural Memory – score | .682 | |
| Phonematic Fluency – score | .714 | |
| Simple Reaction – reaction time | | .792 |
| Go/No Go – reaction time | | .873 |
| Inverted Go/No Go – reaction time | | .867 |

Table 4A

Reliability, practice effects and critical differences in healthy controls (CON-RELIABLE).

| Subtest – Measure (max. raw score) | n | r12 | M1 (SD) | M2 (SD) | M2 - M1 a | 90%-∆-crit | C.I. |
|---|---|---|---|---|---|---|---|
| Digit Span – score (9) | 44 | 0.68 *** | 7.4 (2.2) | 8.2 (2.1) | +0.8 ** | 2.9 | -3 / +4 |
| Two-Back – score (10) | 41 | 0.21 ns | 8.8 (2.3) | 9.2 (1.4) | +0.4 ns | 4.8 | -5 / +6 |
| Simple Reaction – reaction time/ms | 44 | 0.45 ** | 262 (54) | 261 (48) | -1 ns | 93 | +93 / -95 |
| Go/No Go – reaction time/ms | 40 | 0.54 *** | 362 (58) | 342 (57) | -20 * | 92 | +73 / -113 |
| Inverted Go/No Go – reaction time/ms | 44 | 0.57 *** | 373 (63) | 349 (49) | -24 ** | 96 | +73 / -121 |
| Verbal Memory – total score (48) | 44 | 0.45 ** | 41.2 (4.1) | 43.6 (3.8) | +2.4 *** | 7.1 | -5 / +10 |
| Figural Memory – total score (28) | 44 | 0.52 *** | 14.6 (5.5) | 16.5 (5.3) | +2.0 * | 8.9 | -8 / +11 |
| Phonematic Fluency – score (“P”) | 44 | 0.69 *** | 14.2 (3.1) | 16.0 (3.3) | +1.8 *** | 4.0 | -3 / +6 |
| SCORE – standard value | 36 | 0.62 *** | 102.4 (6.2) | 107.1 (6.1) | +4.7 *** | 8.9 | -5 / +14 |
| RTT – standard value | 36 | 0.55 *** | 100.1 (8.9) | 102.6 (7.5) | +2.5 ns | 13.9 | -12 / +16 |

Table 4B

Reliability, practice effects and critical differences in patients (PAT-RELIABLE).

| Subtest – Measure (max. raw score) | n | r12 | M1 (SD) | M2 (SD) | M2 - M1 a | 90%-∆-crit | 90%-CI |
|---|---|---|---|---|---|---|---|
| Digit Span – score (9) | 85 | 0.82 *** | 5.1 (2.4) | 5.5 (2.2) | +0.4 * | 2.4 | -3 / +3 |
| Two-Back – score (10) | 82 | 0.70 *** | 5.4 (4.0) | 5.1 (4.3) | -0.3 ns | 5.1 | -5 / +6 |
| Simple Reaction – reaction time/ms | 89 | 0.81 *** | 299 (90) | 314 (111) | +14 * | 92 | +79 / -107 |
| Go/No Go – reaction time/ms | 88 | 0.74 *** | 411 (107) | 409 (117) | -2 ns | 127 | +130 / -126 |
| Inverted Go/No Go – reaction time/ms | 89 | 0.80 *** | 427 (109) | 415 (113) | -12 ns | 114 | +127 / -103 |
| Verbal Memory – total score (48) | 82 | 0.85 *** | 32.7 (10.1) | 32.3 (10.5) | -0.4 ns | 9.1 | -10 / +9 |
| Figural Memory – total score (28) | 82 | 0.72 *** | 8.7 (6.0) | 9.3 (6.2) | +0.7 ns | 7.4 | -6 / +9 |
| Phonematic Fluency – corrected score | 82 | 0.81 *** | 8.8 (4.5) | 8.9 (4.6) | +0.02 ns | 4.6 | -5 / +5 |
| SCORE – standard value | 75 | 0.84 *** | 87.2 (10.8) | 87.5 (10.6) | +0.3 ns | 10.1 | -10 / +11 |
| RTT – standard value | 80 | 0.78 *** | 91.4 (12.3) | 92.2 (12.8) | +0.8 ns | 13.5 | -13 / +15 |
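The critical differences in Tables 4A and 4B can be approximated with the classical reliable-change formula, ∆-crit = z · SD · sqrt(2 · (1 − r12)). The sketch below is an assumption rather than a documented procedure: this section does not state the formula, the choice of the baseline SD (M1) and of z = 1.645 for a two-sided 90% level is inferred from the tabulated values, and the function names are hypothetical. The interval is taken as the mean practice effect plus or minus ∆-crit, rounded outward to whole points.

```python
import math

Z_90 = 1.645  # two-sided 90% quantile of the standard normal distribution


def critical_difference(sd, r12, z=Z_90):
    """Retest change that exceeds measurement error (assumed formula).

    sd  : standard deviation at the first examination (assumption)
    r12 : retest reliability (correlation between examinations 1 and 2)
    """
    return z * sd * math.sqrt(2.0 * (1.0 - r12))


def change_interval(practice_effect, delta_crit):
    """Interval of non-significant change around the mean practice effect,
    rounded outward to whole score points (assumed rounding rule)."""
    lower = math.floor(practice_effect - delta_crit)
    upper = math.ceil(practice_effect + delta_crit)
    return lower, upper


# Digit Span row of Table 4A: SD = 2.2, r12 = 0.68, M2 - M1 = +0.8
delta = critical_difference(sd=2.2, r12=0.68)
interval = change_interval(0.8, delta)
print(round(delta, 1), interval)
```

For the Digit Span row of Table 4A this yields a critical difference of about 2.9 and an interval of -3 / +4, in line with the tabulated values. Other rows may deviate slightly if a different SD (for example a pooled one) was used.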

Table 5

Concurrent validity estimates.

| NeuroCog FX® | Established neuropsychological assessment battery | CON-VALIDATE (N=40): r | PAT-VALIDATE: N | PAT-VALIDATE: r |
|---|---|---|---|---|
| Digit Span | Digit Span/Forward | 0.30 + | 96 | 0.54 *** |
| | Digit Span/Reversed | 0.36 * | 112 | 0.50 *** |
| Two-Back | TMT Form B – time | -0.04 ns | 99 | -0.48 *** |
| | CIT Interference – time | +0.12 ns | 98 | -0.40 *** |
| | Maze Test – time | -0.36 * | 76 | -0.33 ** |
| Simple Reaction – time | CIT Symbol Counting – time | 0.10 ns | 82 | 0.42 *** |
| | TMT Form A – time | 0.26 * | 105 | 0.32 ** |
| Verbal Memory | VLMT – learning score (trials 1-5) | 0.47 ** | 124 | 0.62 *** |
| | VLMT – final score (trial 7) | 0.56 *** | 124 | 0.49 *** |
| | VLMT – retention score (∆5-7) | -0.49 ** | 124 | -0.19 * |
| | VLMT – recognition score | 0.49 ** | 123 | 0.54 *** |
| Figural Memory | DCS – learning score (trials 1-5) | 0.46 ** | 110 | 0.58 *** |
| | DCS – final score (trial 5) | 0.38 * | 109 | 0.54 *** |
| Phonematic Fluency | Verbal Fluency, phonematic literal | 0.60 *** | 86 | 0.67 *** |
| | Verbal Fluency, semantic (supermarket) | 0.45 ** | 37 | 0.34 * |

Table 6A

Concurrent validation (PCA) in healthy controls (CON-VALIDATE).

Factors

Tests - Measures 5 2 1 4 3 6

NeuroCog FX®

Digit Span – score .664 -.507

Simple Reaction .755 -.338

Go/No Go .782 -.339

Inverted Go/No Go .859

Verbal Memory .590 .308 .338

Figural Memory -.362 .542

Phonematic Fluency .782

Established test battery

Digit Span/Forward – span .629

Digit Span/Reversed – span .809

VLMT – learning score .759 -.424

VLMT – final score .933

VLMT – retention score -.778

VLMT – recognition .776

DCS – learning score .453 .599 .391

DCS – final score .599 .513

Verbal Fluency, phonematic – score .817

Corsi Block Tapping – span .704

TMT Form A – time -.796 .336

TMT Form B – time -.534 .556

CIT Counting Symbols – time .781

CIT Interference – time .423 .589

Table 6B

Concurrent validation (PCA) in patients (PAT-VALIDATE, N= 82).

Factors

Tests - Measures 4 2 1 3 5 6

NeuroCog FX®

Digit Span – score .676 -.318

Simple Reaction .821

Go/No Go .865

Inverted Go/No Go .839

Verbal Memory .621 .368

Figural Memory .306 .679

Phonematic Fluency .815

Established test battery

Digit Span/Forward – span .606 .479

Digit Span/Reversed – span .690 .307

CIT Counting Symbols – time -.735 .351

CIT Interference - time -.492 .349 -.444

VLMT – retention score -.410 -.360 .361

VLMT – learning score .811

VLMT – final score .861

VLMT – recognition .884

DCS – learning score .885

DCS – final score .873

Word Fluency, phonematic – score .801

TMT Form A – time -.486 .436

TMT Form B – time -.390 -.366 .542

Corsi Block Tapping – span -.842

Table 7

Performance in age groups: Means, standard deviations and results from ANOVAs.

| Measure | 16-29 yrs., N=87 (A) | 30-44 yrs., N=57 (B) | 45-59 yrs., N=48 (C) | 60-75 yrs., N=52 (D) | ANOVA P | Post-hoc Scheffé tests |
|---|---|---|---|---|---|---|
| Digit Span | 7.7 (2.0) | 7.4 (2.1) | 6.7 (2.4) | 6.3 (2.1) | .001 | AB > D |
| Two Back | 8.5 (2.6) | 8.1 (2.7) | 7.9 (2.3) | 5.6 (3.6) | .000 | ABC > D |
| Simple Reaction (ms) | 256 (44) | 281 (65) | 273 (51) | 330 (80) | .000 | ABC > D |
| Go/No Go (ms) | 348 (54) | 360 (57) | 376 (61) | 440 (98) | .000 | ABC > D |
| Inverted Go/No Go (ms) | 356 (51) | 379 (55) | 389 (64) | 435 (79) | .000 | ABC < D, A < C |
| Verbal Memory | 42.5 (3.7) | 41.4 (4.4) | 40.9 (4.1) | 34.4 (8.9) | .000 | ABC > D |
| Figural Memory | 17.3 (4.8) | 13.9 (5.3) | 13.1 (6.8) | 7.6 (5.7) | .000 | A > BC > D |
| Phonematic Fluency | 13.0 (3.7) | 13.2 (4.8) | 14.8 (4.3) | 11.7 (3.3) | .003 | C > D |

Table 8A

Detecting verbal memory deficits (CON- and PAT-VALIDATE, N= 156).

Affected patients/controls (%) a: Verbal Learning Score 35 / 2 (24.3); Verbal Final Score 48 / 1 (32.0); Verbal Retention Score 28 / 0 (17.9); Verbal Recognition Score 43 / 1 (28.4)

NeuroCog FX® Verbal Memory (SV thresholds)

| Measure | Learning <80 | Learning <85 | Learning <90 | Final <80 | Final <85 | Final <90 | Retention <80 | Retention <85 | Retention <90 | Recognition <80 | Recognition <85 | Recognition <90 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Positively tested (%) | 20.4 | 35.5 | 46.1 | 20.9 | 35.3 | 46.4 | 20.5 | 35.3 | 46.8 | 20.6 | 34.8 | 46.5 |
| Classification errors (%) b | 21.1 | 28.3 | 33.6 | 28.1 | 32.0 | 34.0 | 24.4 | 34.0 | 41.7 | 23.2 | 29.7 | 33.5 |
| Relative error reduction (%) b | -39.5 | -33.6 | -30.1 | -28.9 | -28.4 | -30.2 | -21.7 | -16.2 | -13.1 | -37.8 | -31.7 | -30.8 |
| Positive predictive value | 0.58 | 0.44 | 0.40 | 0.59 | 0.50 | 0.48 | 0.34 | 0.27 | 0.25 | 0.63 | 0.48 | 0.44 |
| Negative predictive value | 0.84 | 0.87 | 0.89 | 0.75 | 0.78 | 0.82 | 0.86 | 0.87 | 0.88 | 0.80 | 0.82 | 0.86 |
| Likelihood ratio positive | 4.30 | 2.49 | 2.07 | 3.10 | 2.12 | 1.95 | 2.40 | 1.71 | 1.50 | 4.21 | 2.34 | 2.02 |
| Likelihood ratio negative | 0.58 | 0.48 | 0.38 | 0.70 | 0.61 | 0.48 | 0.73 | 0.68 | 0.63 | 0.61 | 0.55 | 0.43 |
| Sensitivity | 0.49 | 0.65 | 0.76 | 0.39 | 0.55 | 0.69 | 0.39 | 0.54 | 0.64 | 0.45 | 0.59 | 0.73 |
| Specificity | 0.89 | 0.74 | 0.63 | 0.88 | 0.74 | 0.64 | 0.84 | 0.69 | 0.57 | 0.89 | 0.75 | 0.64 |
| Correlation (ϕ) | 0.40 | 0.35 | 0.34 | 0.30 | 0.28 | 0.32 | 0.22 | 0.18 | 0.16 | 0.39 | 0.32 | 0.33 |
| χ2 | 24.05 | 18.38 | 17.27 | 13.90 | 12.38 | 15.31 | 7.38 | 5.01 | 4.19 | 23.08 | 15.92 | 17.05 |
| χ2-Test (df=1), significance (P) | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.007 | 0.025 | 0.041 | 0.000 | 0.000 | 0.000 |

Table 8B

Detecting figural memory deficits (CON- and PAT-VALIDATE, N= 138).

Affected patients/controls (%) a: Figural Learning Score 48 / 1 (35.5); Figural Final Score 49 / 0 (35.5)

NeuroCog FX® Figural Memory (SV thresholds)

| Measure | Learning <80 | Learning <85 | Learning <90 | Final <80 | Final <85 | Final <90 |
|---|---|---|---|---|---|---|
| Positively tested (%) | 5.8 | 28.3 | 34.1 | 6.0 | 28.4 | 34.3 |
| Classification errors (%) b | 31.2 | 21.7 | 18.8 | 33.6 | 24.6 | 21.6 |
| Relative error reduction (%) b | -16.2 | -50.3 | -58.5 | -12.0 | -44.3 | -52.7 |
| Positive predictive value | 0.88 | 0.74 | 0.74 | 0.75 | 0.71 | 0.72 |
| Negative predictive value | 0.68 | 0.80 | 0.85 | 0.66 | 0.77 | 0.82 |
| Likelihood ratio positive | 12.71 | 5.27 | 5.30 | 5.20 | 4.26 | 4.40 |
| Likelihood ratio negative | 0.87 | 0.46 | 0.33 | 0.90 | 0.52 | 0.39 |
| Sensitivity | 0.14 | 0.59 | 0.71 | 0.12 | 0.55 | 0.67 |
| Specificity | 0.99 | 0.89 | 0.87 | 0.98 | 0.87 | 0.85 |
| Correlation (ϕ) | 0.27 | 0.51 | 0.59 | 0.20 | 0.45 | 0.53 |
| χ2 | 10.02 | 35.83 | 47.25 | 5.42 | 27.19 | 37.36 |
| χ2-Test (df=1), significance (P) | 0.002 | 0.000 | 0.000 | 0.020 | 0.000 | 0.000 |

Table 8C

Detecting other cognitive deficits (CON- and PAT-VALIDATE, N= 149).

Established non-memory tests. Affected patients/controls (%) a: 63 / 8 (47.7)

NeuroCog FX® SCORE (overall performance) (SV thresholds)

| Measure | <80 | <85 | <90 | <95 |
|---|---|---|---|---|
| Positively tested (%) | 10.7 | 19.5 | 32.2 | 47.7 |
| Classification errors (%) b | 42.3 | 37.6 | 28.9 | 29.5 |
| Relative error reduction (%) b | -12.2 | -22.6 | -41.3 | -40.8 |
| Positive predictive value | 0.75 | 0.76 | 0.79 | 0.69 |
| Negative predictive value | 0.56 | 0.59 | 0.67 | 0.72 |
| Likelihood ratio positive | 3.30 | 3.45 | 4.17 | 2.45 |
| Likelihood ratio negative | 0.88 | 0.76 | 0.53 | 0.43 |
| Sensitivity | 0.17 | 0.31 | 0.54 | 0.69 |
| Specificity | 0.95 | 0.91 | 0.87 | 0.72 |
| Correlation (ϕ) | 0.19 | 0.28 | 0.44 | 0.41 |
| χ2 | 5.37 | 11.49 | 28.20 | 24.81 |
| χ2-Test (df=1), significance (P) | 0.020 | 0.001 | 0.000 | 0.000 |
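The diagnostic utility measures in Tables 8A to 8C all follow from a 2×2 cross-classification of the screening result (standard value below threshold or not) against the criterion (deficit in the established tests or not). The sketch below shows one way to compute them. The function name is hypothetical, the four cell counts are back-calculated from the SV<90 column of Table 8C rather than reported in the manuscript, and the definition of the relative error reduction (observed errors relative to the errors expected if screening result and criterion were independent) is an assumption that happens to be consistent with the tabulated values.

```python
import math


def screening_metrics(tp, fp, fn, tn):
    """Diagnostic utility measures from a 2x2 table.

    tp: affected, screened positive      fp: unaffected, screened positive
    fn: affected, screened negative      tn: unaffected, screened negative
    """
    n = tp + fp + fn + tn
    affected, unaffected = tp + fn, fp + tn
    positive, negative = tp + fp, fn + tn

    sensitivity = tp / affected
    specificity = tn / unaffected
    errors = (fp + fn) / n
    # Error rate expected by chance if both classifications were independent
    # (assumed baseline for the relative error reduction).
    chance_errors = (positive * unaffected + negative * affected) / n ** 2
    phi = (tp * tn - fp * fn) / math.sqrt(
        affected * unaffected * positive * negative
    )
    return {
        "positively_tested": positive / n,
        "classification_errors": errors,
        "relative_error_reduction": errors / chance_errors - 1.0,
        "ppv": tp / positive,
        "npv": tn / negative,
        "lr_positive": sensitivity / (1.0 - specificity),
        "lr_negative": (1.0 - sensitivity) / specificity,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "phi": phi,
        "chi2": n * phi ** 2,  # Pearson chi-square for a 2x2 table, df = 1
    }


# Cell counts back-calculated (assumption) for Table 8C, SV < 90:
# 71 affected and 78 unaffected subjects, N = 149.
m = screening_metrics(tp=38, fp=10, fn=33, tn=68)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these counts the output agrees with the SV<90 column of Table 8C to rounding (for example sensitivity 0.54, specificity 0.87, likelihood ratio positive 4.17). The likelihood ratios are undefined when specificity equals 1 or 0, so a production version would need to guard against division by zero.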

Figure 1: NeuroCog FX® cognitive performance profile.

Figure 2: Performance profiles: CON- and PAT-TOTAL.

Percentage of subjects with SV<90 in the respective NeuroCog FX® subtest (Chi-square tests, P<.001 for each subtest).
