31st july talk (20021)

Clinical Trial Writing II: Sample Size Calculation and Randomization. Liying XU (Tel: 22528716), CCTER, CUHK, 31st July 2002

Upload: vijay-pithadia

Post on 12-May-2015

TRANSCRIPT

Page 1: 31st july talk (20021)

Clinical Trial Writing II
Sample Size Calculation and Randomization

Liying XU (Tel: 22528716)
CCTER, CUHK
31st July 2002

Page 2: 31st july talk (20021)

1. Sample Size Planning

Page 3: 31st july talk (20021)

1.1 Introduction

Fundamental Points

Clinical trials should have sufficient statistical power to detect differences between groups that are considered to be of clinical interest. Therefore, calculation of sample size with provision for adequate levels of significance and power is an essential part of planning.

Page 4: 31st july talk (20021)

Five Key Questions Regarding the Sample Size

• What is the main purpose of the trial?
• What is the principal measure of patient outcome?
• How will the data be analyzed to detect a treatment difference? (The test statistic: t-test, χ² or CI.)
• What type of results does one anticipate with standard treatment?
• H0 and HA: how small a treatment difference is it important to detect, and with what degree of certainty? (δ, α and β.)
• How to deal with treatment withdrawals and protocol violations? (Data set used.)

Page 5: 31st july talk (20021)

SSC: Only an Estimate

• Parameters used in the calculation are estimates with uncertainty, and are often based on very small prior studies
• Population may be different
• Publication bias: overly optimistic
• Different inclusion and exclusion criteria
• Mathematical models are approximations

Page 6: 31st july talk (20021)

What should be in the protocol?

• Sample size justification
• Methods of calculation
• Quantities used in calculation:
  • Variances
  • Mean values
  • Response rates
  • Difference to be detected

Page 7: 31st july talk (20021)

Realistic and Conservative

Overestimated size:
• unfeasible
• early termination

Underestimated size:
• justify an increase
• extension in follow-up
• incorrect conclusion (WORSE)

Page 8: 31st july talk (20021)

What is α (Type I error)?

The probability of erroneously rejecting the null hypothesis. (Putting a useless medicine onto the market!)

Page 9: 31st july talk (20021)

What is β (Type II error)?

The probability of erroneously failing to reject the null hypothesis. (Keeping a good medicine away from patients!)

Page 10: 31st july talk (20021)

What is Power?

Power quantifies the ability of the study to find true differences of various values of δ.

Power = 1-β = P(accept H1 | H1 is true): the chance of correctly identifying H1 (correctly identifying a better medicine).

Page 11: 31st july talk (20021)

What is δ?

δ is the minimum difference between groups that is judged to be clinically important:
• Minimal effect which has clinical relevance in the management of patients
• The anticipated effect of the new treatment (larger)

Page 12: 31st july talk (20021)

The choice of α and β depends on:
• the medical and practical consequences of the two kinds of errors
• prior plausibility of the hypothesis
• the desired impact of the results

Page 13: 31st july talk (20021)

The Choice of α and β

• α=0.10 and β=0.2 for preliminary trials that are likely to be replicated.
• α=0.01 and β=0.05 for trials that are unlikely to be replicated.
• α=β if both test and control treatments are new, about equal in cost, and there are good reasons to consider them both relatively safe.

Page 14: 31st july talk (20021)

The Choice of α and β

• α>β if there is no established control treatment and the test treatment is relatively inexpensive, easy to apply and not known to have any serious side effects.
• α<β (the most common approach: 0.05 and 0.2) if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new, costly, and produces serious side effects.

Page 15: 31st july talk (20021)

1.2 SSC for Continuous Outcome Variables

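H0: δ = μC - μI = 0
HA: δ = μC - μI ≠ 0

If the variance σ² is known, the test statistic is

$$Z = \frac{\bar{x}_C - \bar{x}_I}{\sigma\sqrt{1/N_C + 1/N_I}}$$

If |Z| > Zα, H0 will be rejected at the α level of significance.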

Page 16: 31st july talk (20021)

A total sample of 2N would be needed to detect a true difference δ between μI and μC with power (1-β) and significance level α, by the formula:

$$2N = \frac{4(Z_\alpha + Z_\beta)^2 \sigma^2}{\delta^2}$$

Page 17: 31st july talk (20021)

Example 1

An investigator wishes to estimate the sample size necessary to detect a 10 mg/dl difference in cholesterol level in a diet intervention group compared to the control group. The variance from other data is estimated to be (50 mg/dl)². For a two-sided 5% significance level, Zα=1.96, and for 90% power, Zβ=1.282.

$$2N = \frac{4(1.96+1.282)^2(50)^2}{10^2} \approx 1050$$
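The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch, assuming a two-sided test and using Python's standard `statistics.NormalDist` for the normal quantiles; the function name `total_n_continuous` and its parameter names are illustrative and do not come from the talk.

```python
from statistics import NormalDist


def total_n_continuous(delta, sigma, alpha=0.05, power=0.90):
    """Total sample size 2N for comparing two means (two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 1.282 for 90% power
    return 4 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


# Example 1: delta = 10 mg/dl, sigma = 50 mg/dl; the result is close to 1050
print(round(total_n_continuous(delta=10, sigma=50)))
```

Because the quantiles are not rounded to three decimals, the result can differ from the slide's figure by a unit or two before rounding up to a convenient total.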

Page 18: 31st july talk (20021)

Example1a Baseline Adjustment

An investigator interested in the mean levels of change might want to test whether diet intervention lowers serum cholesterol from baseline levels when compared with a control.

H0: ΔC - ΔI = 0
HA: ΔC - ΔI ≠ 0

With σΔ = 20 mg/dl and δ = 10 mg/dl:

$$2N = \frac{4(1.96+1.282)^2(20)^2}{10^2} \approx 170$$

Page 19: 31st july talk (20021)

A Professional Statement

A sample size of 85 in each group will have 90% power to detect a difference in means of 10.0, assuming that the common standard deviation is 20.0, using a two-group t-test with a 0.05 two-sided significance level.

Page 20: 31st july talk (20021)

Values of f(α,β) to be used in formula for sample size calculation

                        β (Type II error)
α (Type I error)    0.05    0.1     0.2     0.5
0.1                 10.8    8.6     6.2     2.7
0.05                13.0    10.5    7.9     3.8
0.02                15.8    13.0    10.0    5.4
0.01                17.8    14.9    11.7    6.6

$$f(\alpha,\beta) = (Z_\alpha + Z_\beta)^2$$
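The table of f(α,β) values can be regenerated from the normal quantiles. The sketch below assumes that α is two-sided and β is one-sided, which is the convention that matches the tabulated numbers; the function name `f` mirrors the slide, and everything else is illustrative.

```python
from statistics import NormalDist


def f(alpha, beta):
    """f(alpha, beta) = (Z_alpha + Z_beta)^2, with two-sided alpha."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2


# One row per alpha, one column per beta, in the same order as the table
for alpha in (0.1, 0.05, 0.02, 0.01):
    row = [round(f(alpha, beta), 1) for beta in (0.05, 0.1, 0.2, 0.5)]
    print(alpha, row)
```

The computed values agree with the table to within about 0.1; small differences come from rounding in the published table.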

Page 21: 31st july talk (20021)

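1.3 SSC for a Binary Outcome

Two independent samples

$$Z = \frac{\hat{p}_C - \hat{p}_I}{\sqrt{\bar{p}(1-\bar{p})(1/N_C + 1/N_I)}}, \qquad \bar{p} = \frac{r_C + r_I}{N_C + N_I}$$

where r_C and r_I are the numbers of events in the control and intervention groups.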

Page 22: 31st july talk (20021)

$$\bar{p} = (p_C + p_I)/2$$

$$2N = \frac{4(Z_\alpha + Z_\beta)^2\,\bar{p}(1-\bar{p})}{(p_C - p_I)^2}$$

Page 23: 31st july talk (20021)

Example 2

Suppose the annual event rate in the control group is anticipated to be 20%. The investigator hopes that the intervention will reduce the annual rate to 15%. The study is planned so that each participant will be followed for 2 years. Therefore, if the assumptions are accurate, approximately 40% of the participants in the control group and 30% of the participants in the intervention group will develop an event.

Page 24: 31st july talk (20021)

$$2N = \frac{4(1.96+1.282)^2(0.35)(0.65)}{(0.4-0.3)^2} = 956 \approx 960$$
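The Example 2 calculation can be reproduced in a few lines. This is a sketch under the same assumptions as the formula for two independent proportions (two-sided α, equal group sizes); the function name `total_n_binary` is hypothetical.

```python
from statistics import NormalDist


def total_n_binary(p_c, p_i, alpha=0.05, power=0.90):
    """Total sample size 2N for comparing two proportions (two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_c + p_i) / 2  # average of the two anticipated event rates
    return 4 * (z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) / (p_c - p_i) ** 2


# Example 2: 40% events in control, 30% in intervention; close to 956
print(round(total_n_binary(0.40, 0.30)))
```

The same function, called with other pairs of proportions and with power 0.80 or 0.90, gives totals close to the two-sided columns of a standard sample size table.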

Page 25: 31st july talk (20021)

A Professional Statement

A two-group χ² test with a 0.05 two-sided significance level will have 90% power to detect the difference between a Group 1 proportion, P1, of 0.40 and a Group 2 proportion, P2, of 0.30 (odds ratio of 0.643) when the sample size in each group is 480.

Page 26: 31st july talk (20021)

Table 1.3 Approximate total sample size for comparing various proportions in two groups with significance level (α) of 0.05 and power (1-β) of 0.8 and 0.9

True proportions                     α=0.05 (one-sided)    α=0.05 (two-sided)
pC               pI                  1-β       1-β         1-β       1-β
(Control group)  (Intervention group) 0.90     0.80        0.90      0.80

0.60             0.50                 850       610        1040       780
                 0.40                 210       160         260       200
                 0.30                  90        70         120        90
                 0.20                  50        40          60        50
0.50             0.40                 850       610        1040       780
                 0.30                 210       150         250       190
                 0.25                 130        90         160       120
                 0.20                  90        60         110        80
0.40             0.30                 780       560         960       720
                 0.25                 330       240         410       310
                 0.20                 180       130         220       170
0.30             0.20                 640       470         790       590
                 0.15                 270       190         330       250
                 0.10                 140       100         170       130
0.20             0.15                1980      1430        2430      1810
                 0.10                 440       320         540       400
                 0.05                 170       120         200       150
0.10             0.05                 950       690        1170       870

Page 27: 31st july talk (20021)

From Table 1.3 you can see:

• Two-sided instead of one-sided test at the same α: N ↑
• The power 1-β ↑: N ↑
• The difference between pC and pI ↓: N ↑

Page 28: 31st july talk (20021)

Paired Binary Outcome

McNemar’s test

d = difference in the proportion of successes (d = pI - pC)

f = the proportion of participants whose response is discordant (the pair of outcomes are not the same)

$$N_p = \frac{(Z_\alpha + Z_\beta)^2 f}{d^2}$$

Page 29: 31st july talk (20021)

Example 3

Consider an eye study where one eye is treated for loss in visual acuity by a new laser procedure and the other eye is treated by standard therapy. The failure rate on the control, pC, is estimated to be 0.4, and the new procedure is projected to reduce the failure rate to 0.20. The discordant rate f is assumed to be 0.50.

Page 30: 31st july talk (20021)

α=0.05, power 1-β=0.90, f=0.5, pC=0.4, pI=0.2

$$N_p = \frac{(1.96+1.282)^2(0.5)}{(0.4-0.2)^2} \approx 132 \text{ pairs}$$
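The number of pairs in Example 3 can be verified as follows. This is a sketch of the paired-outcome formula N_p = (Zα + Zβ)² f / d², assuming a two-sided α; the function name `n_pairs_mcnemar` is illustrative.

```python
from math import ceil
from statistics import NormalDist


def n_pairs_mcnemar(d, f, alpha=0.05, power=0.90):
    """Number of pairs needed for a paired binary outcome (two-sided alpha)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) ** 2 * f / d ** 2


# Example 3: d = 0.4 - 0.2 = 0.2, discordant proportion f = 0.5
pairs = n_pairs_mcnemar(d=0.2, f=0.5)
print(ceil(pairs))  # unrounded value is a little over 131, so round up
```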

Page 31: 31st july talk (20021)

1.4 Adjusting for Non-adherence

R_O = drop-out rate

R_I = drop-in rate

$$N^* = \frac{N}{(1 - R_O - R_I)^2}$$

If R_O=0.20 and R_I=0.05, N* = 1.78N.
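The non-adherence inflation factor is easy to tabulate. A minimal sketch, with the hypothetical function name `adjust_for_nonadherence`, assuming the adjusted size is N divided by (1 - R_O - R_I)²:

```python
def adjust_for_nonadherence(n, drop_out, drop_in):
    """Inflate sample size n for drop-outs and drop-ins."""
    retained = 1 - drop_out - drop_in
    if retained <= 0:
        raise ValueError("drop-out plus drop-in rate must be below 1")
    return n / retained ** 2


# With 20% drop-out and 5% drop-in the factor is about 1.78
print(round(adjust_for_nonadherence(1, 0.20, 0.05), 2))
```

Because the factor grows with the square of the lost fraction, modest non-adherence already inflates the required size substantially.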

Page 32: 31st july talk (20021)

1.5 Adjusting the Multiple Comparison

α' = α/k

k = the number of multiple comparison variables
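To see what this adjustment costs in sample size, the adjusted α' can be fed back into the factor (Zα + Zβ)². The sketch below assumes a two-sided α and 90% power; the choice of k = 3 and the function names are purely illustrative.

```python
from statistics import NormalDist


def adjusted_alpha(alpha, k):
    """Per-comparison significance level alpha' = alpha / k."""
    return alpha / k


def f(alpha, beta):
    """(Z_alpha + Z_beta)^2 with two-sided alpha."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2


alpha_prime = adjusted_alpha(0.05, k=3)
inflation = f(alpha_prime, 0.10) / f(0.05, 0.10)
print(f"alpha' = {alpha_prime:.4f}, sample size multiplied by {inflation:.2f}")
```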

Page 33: 31st july talk (20021)

Table 1.4 Adjusting for Randomization Ratio

Randomization Ratio    Increase in total N
1:1                    0
1:2                    +12.5%
1:3                    +33%
1:4                    +56%
1:5                    +80%
1:6                    +100%
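The percentages in Table 1.4 are consistent with a standard result that the slide does not state explicitly: for a 1:r allocation with equal variances and fixed power, the total sample size is multiplied by (1+r)²/(4r). The sketch below rests on that assumption; the function name `relative_increase` is hypothetical.

```python
def relative_increase(r):
    """Fractional increase in total N for a 1:r allocation versus 1:1."""
    return (1 + r) ** 2 / (4 * r) - 1


for r in range(1, 7):
    print(f"1:{r}  +{100 * relative_increase(r):.1f}%")
```

For 1:6 the formula gives slightly more than 100%, which the table appears to round down.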

Page 34: 31st july talk (20021)

1.6 Adjusting for loss of follow up

If p is the proportion of subjects lost to follow-up, the number of subjects must be increased by a factor of 1/(1-p).
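As a quick illustration of this factor, the sketch below inflates a planned sample size for an anticipated loss to follow-up. The function name and the example figures (a planned total of 960 and 20% loss) are illustrative choices, not values prescribed by the talk.

```python
def adjust_for_loss(n, p_lost):
    """Inflate sample size n by 1/(1 - p) for a proportion p lost to follow-up."""
    if not 0 <= p_lost < 1:
        raise ValueError("p_lost must be in [0, 1)")
    return n / (1 - p_lost)


# A planned total of 960 with 20% anticipated loss needs about 1200 recruits
print(round(adjust_for_loss(960, 0.20)))
```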

Page 35: 31st july talk (20021)

1.7 Other Factors:

• the rate of attrition of subjects during a trial
• intermediate analyses

Page 36: 31st july talk (20021)

Sample size re-estimation

• Event rates are lower than anticipated
• Variability is larger than expected
• Without unblinding data and making treatment comparisons

Page 37: 31st july talk (20021)

1.8 Power Calculation (assuming we compare two medicines)

Power Depends on 4 Elements:

The real difference between the two medicines, δ
• Big δ → big power

The variation among individuals, σ
• Small σ → big power

The sample size, n
• Large n → big power

Type I error, α
• Large α → big power
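The direction of each of the four effects can be checked numerically. The sketch below uses the normal approximation for comparing two means with equal group sizes and ignores the negligible chance of rejecting in the wrong direction; `power_two_means` and its arguments are hypothetical names, and the baseline values are borrowed from the earlier cholesterol example.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / 2)
    se = sigma * sqrt(2 / n_per_group)  # standard error of the difference
    return nd.cdf(delta / se - z_alpha)


base = power_two_means(delta=10, sigma=20, n_per_group=85)
print(f"baseline power: {base:.2f}")
print("bigger delta raises power:", power_two_means(12, 20, 85) > base)
print("smaller sigma raises power:", power_two_means(10, 15, 85) > base)
print("larger n raises power:", power_two_means(10, 20, 120) > base)
print("larger alpha raises power:", power_two_means(10, 20, 85, alpha=0.10) > base)
```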

Page 38: 31st july talk (20021)

Sensitivity of the sample size estimate to a variety of deviations from these assumptions:

• a power table

Page 39: 31st july talk (20021)

Table 1 Statistical Power of the Tanzania Vitamin and HIV Infection Trial (N=960)

               Effect of B: 0%        Effect of B: 15%       Effect of B: 30%
               Loss to follow up      Loss to follow up      Loss to follow up
Effect of A    0%     20%    33%      0%     20%    33%      0%     20%    33%
30%            89%    82%    74%      85%    76%    68%      79%    69%    61%
25%            75%    65%    58%      69%    59%    52%      62%    52%    45%

Page 40: 31st july talk (20021)

Example 4 Regret for Low Power Due to Small Sample?

I have a set of data in which the mean change between the 2 groups is significantly different (p<0.05). But when I calculate the power it gives only 50%. How should I interpret this? Also, can someone kindly advise whether it is meaningful (or pointless) to calculate the power when the result is statistically significant?
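One common reading of this situation, which the slide itself does not spell out, is that power computed after the fact from the observed effect is just a transformation of the p-value. The sketch below assumes a two-sided z-test and plugs the observed effect in as if it were the true effect; `observed_power` is a hypothetical helper.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Post-hoc power when the observed effect is treated as the true effect."""
    nd = NormalDist()
    z_obs = nd.inv_cdf(1 - p_value / 2)   # observed test statistic
    z_crit = nd.inv_cdf(1 - alpha / 2)    # critical value
    # Probability of exceeding the critical value in either direction
    return nd.cdf(z_obs - z_crit) + nd.cdf(-z_obs - z_crit)


for p in (0.05, 0.01, 0.001):
    print(p, round(observed_power(p), 2))
```

Under these assumptions a result that is only just significant corresponds to an observed power of about 50%, so the figure adds no information beyond the p-value.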

Page 41: 31st july talk (20021)

Books and Software

• Sample Size Tables for Clinical Studies (second edition), by David Machin, Michael Campbell, Peter Fayers and Alain Pinol. Blackwell Science, 1997.
• PASS 2000, available in CCTER
• nQuery 4.0, available in CCTER

Page 42: 31st july talk (20021)

2. Randomization

Page 43: 31st july talk (20021)

Randomization

Definition: randomization is a process by which each participant has the same chance of being assigned to either intervention or control.

Page 44: 31st july talk (20021)

Fundamental Point

Randomization tends to produce study groups comparable with respect to known and unknown risk factors, removes investigator bias in the allocation of participants, and guarantees that statistical tests will have valid significance levels.

Page 45: 31st july talk (20021)

Two Types of Bias in Randomization

• Selection bias occurs if the allocation process is predictable. If any bias exists as to what treatment particular types of participants should receive, then a selection bias might occur.
• Accidental bias can arise if the randomization procedure does not achieve balance on risk factors or prognostic covariates, especially in small studies.

Page 46: 31st july talk (20021)

Fixed Allocation Randomization

Fixed allocation randomization procedures assign the intervention to participants with a pre-specified probability, usually equal, and that allocation probability is not altered as the study progresses.

• Simple randomization
• Blocked randomization
• Stratified randomization

Page 47: 31st july talk (20021)

Randomization Types

Simple randomization

Page 48: 31st july talk (20021)

Simple Randomization

• Option 1: toss an unbiased coin for a randomized trial with two treatments (call them A and B).
• Option 2: use a random digit table. A randomization list may be generated by using the digits, one per treatment assignment, starting with the top row and working downwards.
• Option 3: use a random number-producing algorithm, available on most digital computer systems.
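Option 3 might look like the following in practice. This is a minimal sketch using Python's standard `random` module; the function name, the treatment labels and the seed are illustrative, and a real trial would keep the list with an independent party.

```python
import random


def simple_randomization(n, treatments=("A", "B"), seed=None):
    """Assign each of n participants independently, with equal probability."""
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    return [rng.choice(treatments) for _ in range(n)]


allocation = simple_randomization(20, seed=2002)
print("".join(allocation))
print({t: allocation.count(t) for t in ("A", "B")})
```

Printing the counts makes the main weakness visible: nothing forces the two arms to end up the same size.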

Page 49: 31st july talk (20021)

Advantages

Each treatment assignment is completely unpredictable, and probability theory guarantees that in the long run the numbers of patients on each treatment will not be radically different. It is also easy to implement.

Page 50: 31st july talk (20021)

Disadvantages

• Unequal groups: one treatment is assigned more often than another.
• Time imbalance or chronological bias: one treatment is given with greater frequency at the beginning of a trial and another with greater frequency at the end of the trial.

Simple randomization is not often used, even for large studies.

Page 51: 31st july talk (20021)

Randomization Types

Blocked randomization

Page 52: 31st july talk (20021)

Blocked Randomization (permuted block randomization)

• Blocked randomization ensures exactly equal treatment numbers at certain equally spaced points in the sequence of patient assignments.
• A table of random permutations is used containing, in random order, all possible combinations (permutations) of a small series of figures.
• Block size: 6, 8, 10, 16, 20.

Page 53: 31st july talk (20021)

Advantages

The balance between the number of participants in each group is guaranteed during the course of randomization. The number in each group will never differ by more than b/2 when b is the length of the block.
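The balance guarantee can be demonstrated with a small permuted-block generator. This is a sketch with two arms and a fixed block length; the function names and the seed are illustrative, and shuffling a balanced block stands in for looking up a table of random permutations.

```python
import random


def blocked_randomization(n_blocks, block_size=6, treatments=("A", "B"), seed=None):
    """Concatenate randomly permuted blocks, each balanced across treatments."""
    if block_size % len(treatments) != 0:
        raise ValueError("block size must be a multiple of the number of arms")
    rng = random.Random(seed)
    balanced = list(treatments) * (block_size // len(treatments))
    sequence = []
    for _ in range(n_blocks):
        block = balanced[:]
        rng.shuffle(block)  # random permutation of a balanced block
        sequence.extend(block)
    return sequence


def max_imbalance(sequence):
    """Largest |#A - #B| seen at any point in the sequence."""
    diff, worst = 0, 0
    for arm in sequence:
        diff += 1 if arm == "A" else -1
        worst = max(worst, abs(diff))
    return worst


seq = blocked_randomization(n_blocks=5, block_size=6, seed=7)
print("".join(seq), max_imbalance(seq))
```

Whatever the seed, the running difference between arms never exceeds half the block length and returns to zero at the end of every block.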

Page 54: 31st july talk (20021)

Disadvantages

• Analysis may be more complicated (in theory); a correct analysis could have greater power.
• Varying the block size can keep the randomization from becoming predictable.
• Mid-block inequality might occur if an interim analysis is intended.

Page 55: 31st july talk (20021)

Randomization Types

Stratified randomization

[Figure: stratification tree. Geographic location (U.S., Europe), then previous exposure (Yes, No), then site (lymph, skin, breast).]

Page 56: 31st july talk (20021)

Stratified Randomization

Stratified randomization process involves measuring the level of the selected factors for participants, determining to which stratum each belongs, and performing the randomization within the stratum. Within each stratum, the randomization process itself could be simple randomization, but in practice most clinical trials use some blocked randomization strategy.
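The process described above, blocked randomization carried out separately within each stratum, can be sketched as follows. The class name, the stratum labels and the block size of four are illustrative assumptions; each stratum simply keeps its own queue of pending assignments.

```python
import random
from collections import defaultdict


class StratifiedBlockRandomizer:
    """Permuted-block randomization performed independently in each stratum."""

    def __init__(self, block_size=4, treatments=("A", "B"), seed=None):
        if block_size % len(treatments) != 0:
            raise ValueError("block size must be a multiple of the number of arms")
        self.block_size = block_size
        self.treatments = treatments
        self._rng = random.Random(seed)
        self._pending = defaultdict(list)  # stratum -> unused assignments

    def assign(self, stratum):
        queue = self._pending[stratum]
        if not queue:  # start a new permuted block for this stratum
            block = list(self.treatments) * (self.block_size // len(self.treatments))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop(0)


randomizer = StratifiedBlockRandomizer(block_size=4, seed=1)
stratum = ("40-49 yr", "Male", "Current smoker")
first_block = [randomizer.assign(stratum) for _ in range(4)]
print(first_block)
```

Keying the queues by a tuple of factor levels means a new stratum needs no setup: its first participant triggers the first block.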

Page 57: 31st july talk (20021)

Table 3. Stratification Factors and Levels (3×2×3=18 Strata)

Age            Sex          Smoking history
1. 40-49 yr    1. Male      1. Current smoker
2. 50-59 yr    2. Female    2. Ex-smoker
3. 60-69 yr                 3. Never smoked

Page 58: 31st july talk (20021)

Table 4 Stratified Randomization with Block Size of Four

Strata    Age      Sex    Smoking    Group assignment
1         40-49    M      Current    ABBA BABA ...
2         40-49    M      Ex         BABA BBAA ...
3         40-49    M      Never      etc.
4         40-49    F      Current
5         40-49    F      Ex
6         40-49    F      Never
7         50-59    M      Current
8         50-59    M      Ex
9         50-59    M      Never
10        50-59    F      Current
11        50-59    F      Ex
12        50-59    F      Never
etc.

Page 59: 31st july talk (20021)

Advantages

Stratification makes the two study groups appear comparable with regard to the specified factors. The power of the study can also be increased by taking the stratification into account in the analysis.

Page 60: 31st july talk (20021)

Disadvantages

The prognostic factors used in stratified randomization may be unimportant, and other factors identified later may be of more importance.

Page 61: 31st july talk (20021)

Mechanism

Trial type                        Mechanism
No central registration office    Randomization list, sealed envelopes
Double-blind drug trial           Pharmacist will be involved
Multi-centre trial                Central registration office
Single-centre trial               Independent person responsible for patient registration and randomization

Page 62: 31st july talk (20021)

An Example of Stratified Randomization

Patients will be stratified according to the following criteria:

1) Treatment center (Hospital A vs Hospital B vs Hospital C)

2) N-stage (N2 vs N3)

3) T-stage (T1-2 vs T3-4)

Page 63: 31st july talk (20021)

What should be in the protocol?

A dynamic allocation scheme will be used to randomize patients in equal proportions within each of 12 strata. The scheme first creates time-ordered blocks of size divisible by three and then uses simple randomization to divide the patients in each block into three treatment arms, in equal proportion. The block sizes will be chosen randomly so that each block contains either 6 or 9 patients.

Page 64: 31st july talk (20021)

Cont…

This procedure helps to ensure both randomness and investigator blinding (the block sizes are known only to the statistician), as recommended by Freedman et al. Randomization will be generated by the consulting statistician in sealed envelopes, labeled by stratum, which will be unsealed after patient registration.

Page 65: 31st july talk (20021)

Adaptive Randomization

• Number adaptive: biased coin method
• Baseline adaptive (MINIMIZATION)
• Outcome adaptive

Page 66: 31st july talk (20021)

Biased Coin Method

Advantages
• Investigators cannot determine the next assignment by discovering the blocking factor.

Disadvantages
• Complexity in use
• Statistical analysis cumbersome

Page 67: 31st july talk (20021)

Minimization

Minimization is a well-accepted statistical method to limit imbalance in relatively small randomized clinical trials in conditions with known important prognostic baseline characteristics.

It is called minimization because the imbalance in the distribution of prognostic factors is minimized.

Page 68: 31st july talk (20021)

Table 1 Some baseline characteristics of patients in a controlled trial of mustine versus talc in the control of pleural effusions in patients with breast cancer (Fentiman et al, 1983)

Treatment                                                           Mustine (n=23)    Talc (n=23)
Mean age (SE)                                                       50.3 (1.5)        55.3 (2.2)
Stage of disease: 1 or 2                                            52%               74%
                  3 or 4                                            48%               26%
Mean interval in months between BC diag. and effusion diag. (SE)    33.1 (6.2)        60.4 (13.1)
Postmenopausal                                                      43%               74%

Page 69: 31st july talk (20021)

Minimization Factors

Age (years)                                                                ≤50       or    >50
Stage of disease                                                           1 or 2    or    3 or 4
Time between diagnosis of cancer and diagnosis of effusions (months)       ≤30       or    >30
Menopausal                                                                 Pre       or    Post

Page 70: 31st july talk (20021)

Table 2 Characteristics of the first 29 patients in a clinical trial using minimization to allocate treatment

                            Mustine    Talc
Age             ≤50         7          6
                >50         8          8
Stage           1 or 2      11         11
                3 or 4      4          3
Time interval   ≤30m        6          4
                >30m        9          10
Menopausal      Pre         7          5
                Post        8          9

Page 71: 31st july talk (20021)

Table 3 Calculation of imbalance in patient characteristics for allocating treatment to the thirtieth patient

                        Mustine (n=15)    Talc (n=14)
Age >50                 8                 8
Stage 3 or 4            4                 3
Time interval ≤30m      6                 4
Postmenopausal          8                 9
Total                   26                24
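The totals in Table 3 can be reproduced from the Table 2 counts. The sketch below assumes the simplest form of minimization, in which the new patient goes to the arm with the smaller total and ties are broken at random; in practice a weighted random element is often added. The dictionary layout and function names are illustrative.

```python
import random

# Counts for the first 29 patients, copied from Table 2
counts = {
    "Mustine": {"age": {"<=50": 7, ">50": 8}, "stage": {"1-2": 11, "3-4": 4},
                "interval": {"<=30m": 6, ">30m": 9}, "menopausal": {"pre": 7, "post": 8}},
    "Talc": {"age": {"<=50": 6, ">50": 8}, "stage": {"1-2": 11, "3-4": 3},
             "interval": {"<=30m": 4, ">30m": 10}, "menopausal": {"pre": 5, "post": 9}},
}

# The thirtieth patient's characteristics, as used in Table 3
patient_30 = {"age": ">50", "stage": "3-4", "interval": "<=30m", "menopausal": "post"}


def imbalance_totals(counts, patient):
    """Sum, per arm, the existing patients who share each of the new patient's levels."""
    return {arm: sum(levels[factor][level] for factor, level in patient.items())
            for arm, levels in counts.items()}


def choose_arm(totals, rng=random):
    """Pick the arm with the smallest total; break ties at random."""
    smallest = min(totals.values())
    return rng.choice([arm for arm, total in totals.items() if total == smallest])


totals = imbalance_totals(counts, patient_30)
print(totals, "->", choose_arm(totals))
```

With totals of 26 for mustine and 24 for talc, this rule allocates the thirtieth patient to talc, after which the counts would be updated before the next patient arrives.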

Page 72: 31st july talk (20021)

Advantages

• It can reduce imbalance to a minimum, especially in small trials.
• A computer program is available (called Mini), and it is also not difficult to perform 'by hand'.
• Minimization and stratification on the same prognostic factors produce similar levels of power, but minimization may add slightly more power if stratification does not include all of the covariates.

Page 73: 31st july talk (20021)

Disadvantages

It is a somewhat more complicated process than simple randomization.

Page 74: 31st july talk (20021)

Practical Considerations

Study type                            Randomization
Large studies                         Blocked
Large, multicentre studies            Stratified by centre
Small studies                         Blocked and stratified by centre
Large number of prognostic factors    Minimization
Large studies                         Stratified analysis without stratified randomization