© 2003 BJU International | 92, 703–706 | doi:10.1046/j.1464-410X.2003.04462.x

Blackwell Science, Ltd, Oxford, UK. BJU International, 1464-410X, November 2003, 92, 7

Original Article

Prospective vs retrospective assessment of lower urinary tract symptoms in patients with advanced prostate cancer: the effect of ‘response shift’

J. REES, D. WALDRON*, C. O’BOYLE†, P. EWINGS and R. MACDONAGH

Taunton and Somerset Hospital, *University College Hospital, Galway, and †Royal College of Surgeons in Ireland, Dublin, Ireland

Accepted for publication 10 April 2003

OBJECTIVE

To compare prospectively obtained symptom scores (pre-tests) with retrospective assessment (then-tests) in patients with newly diagnosed advanced prostate cancer.

PATIENTS AND METHODS

Patients with newly diagnosed locally advanced or metastatic prostate cancer were recruited. They completed the International Prostate Symptom Score (IPSS) and Symptom Problem Index (SPI) before starting treatment. At 3 and 6 months after diagnosis they again completed these questionnaires, but also retrospectively reassessed their initial symptom level. Healthy age-matched controls were recruited from primary care and completed the same questionnaires; in all, 76 patients and 17 controls participated.

RESULTS

The IPSS and SPI scores decreased significantly over the 6 months of the study. Patients retrospectively rated their level of symptoms and symptom bother as higher than their contemporaneous assessments. This was not the case in the control group.

CONCLUSION

These results question the assumption that contemporaneously collected pre-test scores are interchangeable with retrospectively assessed then-tests. This suggests that caution is required when comparing the results of studies that use these two alternative techniques of data collection. The difference between then-test and pre-test scores may represent an example of a phenomenon termed ‘response shift’, in which, by adapting to their disease, patients changed the internal standards by which they assessed their symptoms.

KEYWORDS

prostate cancer, response shift, lower urinary tract symptoms

INTRODUCTION

For many men with prostate cancer or benign prostatic enlargement (BPE), LUTS and the effect of these symptoms on daily life are the principal causes of morbidity. These symptoms have been extensively investigated in BPE, with symptom scores devised by the AUA [1] and endorsed by the WHO. Although originally designed to assess patients with BPE, the IPSS and the accompanying Symptom Problem Index (SPI or Bother Score) can also be used to evaluate the severity of LUTS in men with prostate cancer, and to gauge the response to local and systemic treatments [2].

When using symptom scores to assess response to treatment it is essential that the instrument accurately reflects the experience of the patient. Traditionally, questionnaires are administered before intervention (pre-test) and repeated at a fixed interval after starting treatment (post-test). However, some studies [3] elected not to use a pre-test, but instead asked patients at the time of the post-test to recollect their symptom status before intervention. This form of administration has been termed a ‘retrospective pre-test’, or ‘then-test’ [4]. Previous reports suggested that the then-test is an acceptable substitute for carrying out a pre-test [5], but recent work on the adaptation of patients to chronic disease (‘response shift’) suggests that not only are the pre-test and then-test likely to be different, but also that it is the then-test that provides the more appropriate comparator with the post-test when assessing real change [6]. The aim of the present study was to compare the results of prospective and retrospective symptom assessment in a group of patients with newly diagnosed advanced prostate cancer, and in a group of healthy age-matched controls.

PATIENTS AND METHODS

Patients with newly diagnosed locally advanced or metastatic prostate cancer were recruited from three centres in the South-West of England (Taunton and Somerset Hospital, Southmead Hospital and Bristol Royal Infirmary), after obtaining approval from the relevant ethics committees. Locally advanced disease was defined as two of three from: (i) a PSA level of >10 ng/mL; (ii) a Gleason grade of >7; and (iii) clinical stage T3 [7]. Patients were excluded from the study if they were unable to complete the questionnaires, either through lack of understanding, poor literacy or limited life-expectancy (<6 months). Controls were recruited from primary care, with the recruiting practices reflecting a variety of environments, both urban and rural, as with the patient population. GPs were asked to recruit men aged 60–85 years and considered to have no significant health problems.

The patients were seen in their own homes within a week of their diagnosis (i.e. before starting treatment) and at this first visit completed the IPSS and SPI. Subsequent visits took place 3 and 6 months later, repeating the questionnaires and incorporating a then-test, with both groups asked to re-evaluate their symptoms at the time of the previous assessment. It was emphasized that the then-test was not asking patients to recall their previous answers, but instead to provide a renewed judgement on their degree of symptoms/bother at the previous assessment.

The seven questions of the IPSS are answered from ‘not at all’ to ‘almost always’, and scored from 0 to 5. The total range is therefore 0–35 (asymptomatic to very symptomatic). Added to the end of the IPSS is a single question about quality of life (QoL): ‘If you were to spend the rest of your life with your urinary condition just the way it is now, how would you feel about that?’ The answers to this question range from ‘delighted’ to ‘terrible’, scored 0–6. The SPI repeats the seven questions of the IPSS, but asks how much the symptoms have bothered the patient (rather than merely recording the presence of symptoms). The answers range from ‘not a problem’ to ‘big problem’ and are scored from 0 to 4, giving a range of scores of 0–28 (‘no bother’ to ‘extremely bothersome’).
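The scoring rules described above can be illustrated with a minimal sketch. The function and constant names here are hypothetical and are not taken from the study; the sketch assumes only what the text states, namely seven IPSS items scored 0–5, seven SPI items scored 0–4, and a single QoL item scored 0–6.

```python
from typing import Sequence

N_ITEMS = 7          # both the IPSS and the SPI use seven questions
IPSS_ITEM_MAX = 5    # 'not at all' (0) to 'almost always' (5)
SPI_ITEM_MAX = 4     # 'not a problem' (0) to 'big problem' (4)
QOL_MAX = 6          # 'delighted' (0) to 'terrible' (6)


def _total(answers: Sequence[int], item_max: int) -> int:
    """Sum one questionnaire after checking item count and item range."""
    if len(answers) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} answers, got {len(answers)}")
    for a in answers:
        if not isinstance(a, int) or not 0 <= a <= item_max:
            raise ValueError(f"answer {a!r} is outside 0..{item_max}")
    return sum(answers)


def ipss_total(answers: Sequence[int]) -> int:
    """Total IPSS, range 0 (asymptomatic) to 35 (very symptomatic)."""
    return _total(answers, IPSS_ITEM_MAX)


def spi_total(answers: Sequence[int]) -> int:
    """Total SPI (bother score), range 0 to 28."""
    return _total(answers, SPI_ITEM_MAX)


def qol_score(answer: int) -> int:
    """Single QoL item, range 0 to 6."""
    if not isinstance(answer, int) or not 0 <= answer <= QOL_MAX:
        raise ValueError(f"QoL answer {answer!r} is outside 0..{QOL_MAX}")
    return answer
```

The same functions would apply to a then-test, because the retrospective assessment uses the same questionnaire; only the time point the patient is asked to judge differs.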

Values between and within groups were compared using unpaired and paired t-tests, respectively. Although the original data show some evidence of skewness, the sample size ensures that the Normal distribution is a reasonable approximation for the sampling distribution of the means, hence justifying a parametric approach. However, to test the sensitivity of this assumption, nonparametric tests (Mann–Whitney and Wilcoxon, as appropriate) were also undertaken and the P values obtained were similar; for simplicity, only the t-tests are reported here.
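As an illustration of the two comparisons named above, the following sketch computes the paired and unpaired t statistics using only the Python standard library. It is not the authors' analysis code. The unpaired version assumes the pooled-variance (Student) form, which the text does not specify, and the function names are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic for within-group change (second minus first).

    Returns (t, degrees of freedom).
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)          # sample SD of the differences
    return mean_d / (sd_d / math.sqrt(n)), n - 1


def unpaired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Unpaired t statistic for a between-group difference (x minus y).

    Assumption: pooled variance, i.e. the classical Student form.
    Returns (t, degrees of freedom).
    """
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    se = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se, nx + ny - 2
```

In practice a statistics package would normally be used to obtain the P values as well; for example, SciPy provides `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` for the paired and unpaired cases.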

RESULTS

Between 20 July 2000 and 19 July 2001, 76 patients met the inclusion/exclusion criteria for the study. All agreed to participate and gave informed consent. Despite excluding patients from the study with a predicted life-expectancy of <6 months, two patients died shortly after their diagnosis, and therefore completed only the first assessment. Three patients had catheters in situ at the initial visit, increasing to nine at 3 months and seven at 6 months, and thus different numbers completed the questionnaires at each assessment. Seventeen controls were recruited by a variety of GPs in Bristol and Somerset; all 17 completed the study.

The mean (SD) ages of the patients and controls were 72.8 (8.5) and 71.8 (5.0) years, respectively. At presentation, the median (range) PSA level of the patients was 58.1 (4.4–5050) ng/mL. The clinical stage was recorded as T3 (according to the TNM classification) in 59 patients, with seven T2 and 10 T4 cases; 64 of the 76 patients (84%) had histological confirmation of their diagnosis, with 12 (16%) treated as a result of a clinical diagnosis. Most tumours were moderately differentiated, with a Gleason grade of 5–7 in 41 patients; 22 patients had poorly differentiated tumours (Gleason 8–10) and one had a well-differentiated tumour. Distant metastases were identified in 19 patients and the remaining 57 were therefore considered to have locally advanced disease. Sixty-one patients received treatment with hormonal therapy alone (LHRH analogue), whilst 11 received radical radiotherapy (6 weeks for 5 days/week) in addition to their neoadjuvant hormonal therapy. All patients receiving hormonal therapy continued on this treatment throughout the study. Three patients did not receive treatment during the study, but were placed on an active surveillance or ‘watchful waiting’ regimen.

The mean IPSS at each assessment is summarized in Table 1; the ‘visit 2 then-test’ refers to an assessment by the patient of their symptoms at the time of visit 1, while the ‘visit 3 then-test’ refers to an assessment at the time of visit 2. The mean IPSS was consistently higher in patients than in controls, and this was significant at all assessments (P < 0.001). Table 2 summarizes the difference between scores using a traditional pre/post-test design, and that seen using the retrospective then-test scores. Using pre/post-testing, there was a statistically significant improvement of 2.9 points in patients over the first 3 months (P = 0.001), with an insignificant deterioration in the subsequent 3 months (0.9 points, P = 0.1). There were no significant changes in the control group using this method. Using the retrospective score, the magnitude of improvement increased significantly in the first 3 months to 4.2 points (P < 0.001). In the second 3 months, instead of an insignificant deterioration, the results showed a statistically significant improvement in mean IPSS (1.8 points, P = 0.04). Again, there were no statistically significant changes in the control group.

TABLE 1 The mean IPSS for patients and controls

| Visit | Mean (SD) IPSS, patients | Mean (SD) IPSS, controls | Difference (95% CI) |
|---|---|---|---|
| 1 | 11.6 (7.8) | 4.4 (4.4) | 7.2 (4.3–10.0) |
| 2 | 8.2 (5.4) | 4.0 (3.3) | 4.2 (2.1–6.3) |
| 2 (then-test) | 13.2 (9.2) | 4.0 (3.7) | 9.2 (6.4–12.1) |
| 3 | 9.0 (6.1) | 3.7 (3.3) | 5.3 (3.1–7.5) |
| 3 (then-test) | 10.8 (7.1) | 3.6 (2.9) | 7.2 (5.0–9.5) |

P < 0.001, unpaired t-tests for the difference in mean scores between patients and controls for all assessments.

TABLE 2 Differences in the mean IPSS using pre/post-tests vs then/post-tests

| Test* | Group | N | 0–3 months, difference (P) | N | 3–6 months, difference (P) |
|---|---|---|---|---|---|
| Pre-test & then-test | Patients | 69† | −1.6 (0.02) | 67† | −2.3 (<0.001) |
| Pre-test & then-test | Controls | 17 | 0.4 (0.6) | 17 | 0.4 (0.5) |
| Pre-test & post-test | Patients | 66† | −2.9 (0.001) | 67† | 0.9 (0.1) |
| Pre-test & post-test | Controls | 17 | −0.4 (0.6) | 17 | −0.3 (0.6) |
| Then-test & post-test | Patients | 66† | −4.3 (<0.001) | 68† | −1.8 (0.04) |
| Then-test & post-test | Controls | 17 | 0.0 (1.0) | 17 | 0.1 (0.6) |

*Paired t-test; †Differences are for pairs analysed in paired t-test, i.e. 66 pairs between visits 1 and 2, and therefore these values are different from those calculated for all patients at each visit, as shown in Table 1.

In patients the mean difference between the contemporary pre-test scores and the retrospective then-test scores was 1.6 points in the first 3 months and 2.3 points in the second 3-month period. These differences were both statistically significant (P = 0.02 and <0.001, respectively, paired t-test).

For QoL effects of the LUTS, again the patient scores were statistically significantly higher than control scores at all visits, indicating that patients were less happy about their urinary symptoms. For the patients, both methods showed a statistically significant improvement in the first 3 months after diagnosis (P < 0.001), but with a greater change using the retrospective assessment. However, in the second 3 months traditional pre/post-testing showed no significant change, whilst then/post-testing showed a highly significant improvement (P < 0.001). These changes were not apparent in the control group.

Patients consistently showed higher levels of bother (SPI score) than controls and this was statistically significant at all assessments. The level of bother improved over the first 3 months of the study but this improvement was greater using then/post-tests than with pre/post-tests (4.0 vs 2.2 points, respectively). In the second 3 months, there was an insignificant deterioration using traditional testing, but the then-test showed a highly significant improvement in patient bother scores (P < 0.001). Again, there were no statistically significant changes in the control group. The difference between pre-test and then-test scores in the patients was highly significant in both the first and second 3-month periods (P < 0.001, paired t-test).

DISCUSSION

Assessing symptoms and symptom bother suggests that hormonal treatment (with or without radiotherapy) was effective in reducing the impact of LUTS in this group of patients with advanced prostate cancer. However, the results do not support the assumption that contemporaneously collected pre-test scores are interchangeable with retrospectively assessed then-tests. On the IPSS, QoL item and SPI, patients retrospectively assessed their symptoms as more severe/bothersome than they had at the time of the pre-test. This increased the magnitude of the improvement on some assessments, and even reversed the direction of change on others from deterioration to improvement.

The difference in scores between pre-tests and then-tests may represent an example of ‘response shift’. In using symptom scales and QoL instruments, it is presumed that individuals evaluating themselves have an internalized standard for judging their level of functioning, and that this standard will not differ from experimental to control groups, or change from one testing to the next [8]. However, patients may adapt to their disease, changing this internal standard, and thus changing the criteria by which they assess their symptoms/QoL. This change in a subject’s basis for determining his or her level of functioning on a given dimension is referred to as ‘response shift’, and has been postulated as a source of bias when using self-reported instruments [9].

According to the theory of response shift, pre-tests and post-tests are measured in patients who may have changed their internal standards, re-conceptualized the meaning of QoL and/or changed their values about the relative importance of the various constituents of QoL [6]. The then-test has been suggested as a method to compensate for this [4,10]. The assumption is that the patient will use the same criteria for the conventional post-test and the ‘then-rating’, allowing meaningful comparison between the scores. The difference between the then-test and post-test thus represents the actual change taking place, whilst the difference between then-test scores and pre-test scores allows an estimate of the degree of response-shift bias (Fig. 1) [11]. Using the then-test method, a significant response shift was identified in the present patients, with statistically significant differences between contemporary (pre-test) and retrospective (then-test) scores on the IPSS and SPI at both comparisons, and on the QoL question in the second 3 months. There were no statistically significant changes in the control group using this technique.
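The then-test logic described above can be sketched as a simple decomposition. The scores below are hypothetical illustrative values, not study data, and the function name is an assumption made for this example. The sign convention is that a negative change means fewer symptoms, and the response-shift estimate is taken as then-test minus pre-test.

```python
from statistics import mean
from typing import Dict, Sequence


def decompose_change(
    pre: Sequence[float], then: Sequence[float], post: Sequence[float]
) -> Dict[str, float]:
    """Split the reported change into actual change and response shift.

    reported change = post - pre   (conventional pre/post-test design)
    actual change   = post - then  (then/post-test design)
    response shift  = then - pre   (estimate of response-shift bias)
    All three lists must refer to the same patients in the same order.
    """
    if not (len(pre) == len(then) == len(post)):
        raise ValueError("all three score lists must have the same length")
    return {
        "reported": mean(po - pr for pr, po in zip(pre, post)),
        "actual": mean(po - th for th, po in zip(then, post)),
        "response_shift": mean(th - pr for pr, th in zip(pre, then)),
    }


# Hypothetical scores for three patients (illustration only).
example = decompose_change(
    pre=[12, 10, 14],    # score given at the first visit
    then=[14, 12, 15],   # first-visit symptoms as re-judged at follow-up
    post=[9, 8, 11],     # score at follow-up
)
```

When the same patients contribute to all three scores, the reported change equals the actual change plus the response shift. The differences in Table 2 need not add up in this way, because each comparison was based on a slightly different set of pairs.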

Then-test results were statistically significantly different from pre-test results using the IPSS and SPI, but the fundamental question is whether these differences represent response shift or some other confounding factor(s). Social desirability may play a part in the retrospective worsening of scores, in which the subject feels their symptoms or QoL should have improved with treatment, and therefore they will place lower scores on then-test evaluation. It is also possible that patients may feel an implicit pressure to ‘please the doctor’, again leading them to lower their retrospective QoL scores. Although the theory of response shift was not explained to the subjects, it is likely that many will have understood the purpose of then-testing. Finally, the issue of recall bias and memory must be considered. Could re-framing of memory play a part in changing retrospective scores? Undoubtedly, memory is important for then-testing, but importantly, the subjects were not asked to remember how they scored the questionnaire at the previous assessment; instead they were asked to re-score the questionnaire as they felt they were at the previous visit, i.e. using the same set of standards, values and concepts about QoL. Thus, although memory has a role, it is by no means the only factor. Furthermore, an attempt was made to minimize the effects of recall bias and memory in this study by using relatively short intervals between assessments (3 months). It was considered that changing this interval in either direction could cause potential problems. If the then-test is assessed too close to the pre-test, it is possible that subjects will remember their previous answers and score the questionnaire accordingly. Conversely, if the then-test is applied after a longer interval it is possible that memory effects may have an increasing role. The effect of changes in timing of the then-test is an interesting area for future research.

FIG. 1. A hypothetical example of the then-test approach to measuring response shift (from [10]); red closed circles, pre-test to post-test; green open circles, then-test to post-test. [Figure not reproduced. Axes: mean score against time. Plotted points: pre-test, then-test, post-test. Labelled effects: reported treatment effect, response shift effect, actual treatment effect.]

These results question the comparison of studies which use retrospectively collected symptom/QoL assessment (e.g. the National Prostatectomy Audit) [3] with those using contemporaneously collected data. The present study suggests that comparing data collected using the different techniques is likely to be unreliable, and may lead to inaccurate conclusions. If the then-test method removes the bias of response shift, it can be argued that it actually represents more accurately the true changes taking place. This would suggest that, in the research setting, it may be preferable to incorporate both a pre-test and then-test, at least where this does not mean a significant questionnaire overload.

ACKNOWLEDGEMENTS

The authors thank Messrs D. Gillatt (Southmead Hospital) and R. Persad (Bristol Royal Infirmary) for access to their patients, and their Clinical Research Fellows, Messrs M. Sugiono, J. P. Meyer and B. Patel for their assistance in recruiting patients for the study.

REFERENCES

1 Barry MJ, Fowler FJ, O’Leary MP, Holtgrewe HL, Mebust WK, Cockett ATK. The American Urological Association symptom index for benign prostatic hyperplasia. J Urol 1992; 148: 1549–57

2 Kirby RS, Christmas TJ, Brawer MK. Prostate Cancer. London: Mosby International, 2001

3 Emberton M, Neal DE, Black N et al. The effect of prostatectomy on symptom severity and quality of life. Br J Urol 1996; 77: 233–47

4 Howard GS, Ralph KM, Gulanick NA, Maxwell SE, Nance DW, Gerber SK. Internal invalidity in pre-test–post-test self-report evaluations and a re-evaluation of retrospective pre-tests. Appl Psychol Measurement 1979; 3: 1–23

5 Emberton M, Challands A, Styles R, Wightman J, Black N. Recollected versus contemporary patient reports of pre-operative symptoms in men undergoing transurethral prostatectomy for benign disease. J Clin Epid 1995; 48: 749–56

6 Schwartz CE, Sprangers MAG. Methodological approaches for assessing response shift in longitudinal health-related quality of life research. Social Sci Med 1999; 48: 1531–48

7 Roach M. Management of locally advanced prostate cancer: new definitions and strategies. WJM 1998; 169: 290–1

8 Postulart D, Adang EMM. Response shift and adaptation in chronically ill patients. Med Decis Making 2000; 20: 186–93

9 Sprangers MAG. Response shift bias. A challenge to the assessment of patients’ quality of life in cancer clinical trials. Cancer Treatment Rev 1996; 22 (Suppl. A): 55–62

10 Sprangers MAG, Van Dam FSAM, Broersen J et al. Revealing response shift in longitudinal research on fatigue. The use of the then-test approach. Acta Oncologica 1999; 38: 709–18

11 Breetvelt IS, Van Dam FSAM. Under-reporting by cancer patients: the case of response shift. Social Sci Med 1991; 32: 981–7

Correspondence: J. Rees, Department of Urology, Taunton and Somerset Hospital, Musgrove Park, Taunton, TA1 5DA, UK. e-mail: [email protected]

Abbreviations: BPE, benign prostatic enlargement; SPI, Symptom Problem Index; QoL, quality of life.