selecting a valid sample size for longitudinal and multilevel studies in cancer research: software...

Selecting a Valid Sample Size for Longitudinal and Multilevel Studies in Cancer Research: Software and MethodsDeborah H. Glueck, Sarah M. Kreidler, Brandy M. Ringham, Keith E. Muller

1

Outline• Determining sample size and power for complex designs• Calculating power with our free, web-based software• Writing the grant• Novel results for missing data• Questions

2


3

Ethics of Sample Size

• If the sample size is too small, the study may be inconclusive and waste resources

• If the sample size is too large, then the study may expose too many participants to possible harms due to research

4

1 2

3 4

Previous Study on Sensory Focus to Alleviate Pain• Participants categorized into

four coping styles

• Randomized to one of two treatment arms:

sensory focusstandard of care

• Measured experienced pain after root canal

5

(Logan, Baron, Kohout, 1995)

Hig

hLo

w

Perceived Control

Des

ired

Cont

rol

HighLow

Memory of Pain TrialStudy Design

6

Memory of Pain TrialResearch Question

7

Memory of Pain TrialStudy Population

• Recruit participants who have a high desire/low felt coping style

• 30 patients / week

• 40% consent rate for previous studies

8

How do we calculate an accurate sample size?

9

Inputs for Power Analysis

• Type I error rate:

• Desired power:

• Loss to follow-up:

10



• Desired power:


11

0.01



• Desired power:


12

0.01

0.90



• Desired power:


13

0.01

0.90

25%


14

GLIMMPSE is a user-friendly online tool for calculating power and sample size for multilevel and longitudinal studies.

http://glimmpse.samplesizeshop.org/

GLIMMPSE

Salient Software Features

• Free

• Requires no programming expertise

• Allows saving study designs for later use

• Also available on smartphones

• Coming soon on iPad

Create a Study Design

17

Create a Study Design

18

Select Guided Study Design

Main Application Screen

19Left navigation bar allows

access to input screens

Tools and documentation appear on top navigation bar

Solving For

20

Solving For

21

Green checkmark = completePencil = incomplete

Solving For

22

Desired Power

23Enter each desired power value here and click the enter key

Desired Power

24

Predictors

25

Predictors

26Enter predictors names

Predictors

27

Enter categories for each predictor

Enter predictors names

Response Variables

28

Response Variables

29Enter each outcome variable and click “Enter”

Repeated Measures

30

Repeated Measures

31

Enter the units (time), the number of repeated measures (3), and the spacing (equal)

Hypothesis

time by treatment interaction

Hypothesis

33

Hypothesis

34

Select the type of hypothesis

Hypothesis

35

Select the factors for the hypothesis

Statistical Test

36

Statistical Test

37

Type I Error Rate

38

Type I Error Rate

39Enter each Type I error rate value and click “Enter”

Choices for Means and Variance

• So far, all inputs are known as part of the study design

•We now must obtain reasonable values for mean differences and variability to complete the calculation

40

• Pilot study

• Similar published research

• Unpublished internal studies

• Clinical experience41

Where Can I Find Means, Variances, and Correlations?

Means

Intervention Baseline 6 Months 12 Months

Sensory Focus(SF) 3.6 2.8 0.9

Standard of Care

(SOC) 4.5 4.3 3.0

Intervention Difference

(SF - SOC)

Net Difference Over Time

(12 Months - Baseline)

0.9 1.5 -2.1

-1.2

Variances and Correlations

• Consider the sources of variability and correlation in the study design

• Repeated measures within a given participant will be correlated

• Outcome measurements will vary between participants


Correlation Between Outcomes Over TimeGedney, Logan, and Baron (2003) identified predictors of the amount of experienced pain recalled over time…One of the findings was that memory of pain intensity at 1 week and 18 months had a correlation of 0.4. We assume that the correlation between measures 18 months apart will be similar to the correlation between measures 12 months apart. Likewise, the correlation between measures 6 months apart will be only slightly greater than the correlation between measures 18 months apart.


Standard Deviation of the OutcomeLogan, Baron, and Kohout (1995) examined whether sensory focus therapy during a root canal procedure could reduce a patient’s experienced pain. The investigators assessed experienced pain on a 5 point scale both immediately and at one week following the procedure. The standard deviation of the measurements was 0.98.

50

GLIMMPSE MeansSpecifying a Mean Difference

51

GLIMMPSE MeansSpecifying a Mean Difference

Enter the expected net mean difference for the interaction

Glimmpse VariabilitySpecifying Correlations Across Time

52


53

A separate tab will appear for each level of repeated

measures, with an additional tab for responses variables


54

Enter correlations between repeated measurements

Glimmpse VariabilitySpecifying Variability in Responses

55

Enter the standard deviation of the response variable

Scale Factor for Variability

56

Glimmpse Calculate Button

• Green indicates that you are ready to calculate• Gray indicates that the study design incomplete. • Clicking the gray “Calculate” button will show a list of

incomplete screens.57

Glimmpse Results

58Minimum total sample size to achieve at least 90% power

Glimmpse Results: Detail View

59


60

We plan a repeated measures ANOVA using the Hotelling-Lawley trace to test for a time by treatment interaction.

61

Sample Size Calculation Summary

We plan a repeated measures ANOVA using the Hotelling-Lawley trace to test for a time by treatment interaction.

62


• Type I error rate• α = 0.01

• Hypothesis test• Wrong: power = treatment

data analysis = time x treatment

• Right: power = time x treatmentdata analysis = time x treatment

63

Aligning Power Analysis with Data Analysis

Based on previous studies, we predict memory of pain measures will have a standard deviation of 0.98 and the correlation between baseline and 6 months will be 0.5. Based on clinical experience, we believe the correlation will decrease slowly over time, for a correlation of 0.4 between pain recall measures at baseline and 12 months.

64


Based on previous studies, we predict memory of pain measures will have a standard deviation of 0.98 and the correlation between baseline and 6 months will be 0.5. Based on clinical experience, we believe the correlation will decrease slowly over time, for a correlation of 0.4 between pain recall measures at baseline and 12 months.

65


• Give all the values needed to recreate the power analysis

• Provide appropriate citation

66

Justifying the Power Analysis

For a desired power of 0.90 and a Type I error rate of 0.01, we estimated that we would need 44 participants to detect a mean difference of 1.2.

67


For a desired power of 0.90 and a Type I error rate of 0.01, we estimated that we would need 44 participants to detect a mean difference of 1.2.

68


Pow

er

69

Accounting for Uncertainty

Mean Difference

Pow

er

Mean Difference70

0.90


Pow

er

Mean Difference

0.90

71


We plan a repeated measures ANOVA using the Hotelling-Lawley Trace to test for a time by treatment interaction. Based on previous studies, we predict measures of pain recall will have a standard deviation of 0.98. The correlation in pain recall between baseline and 6 months will be 0.5. Based on clinical experience, we predict that the correlation will decrease slowly over time. Thus, we anticipate a correlation of 0.4 between pain recall measures at baseline and 12 months. For a desired power of 0.90 and a Type I error rate of 0.01, we need to enroll 44 participants to detect a mean difference of 1.2. 72

Sample Size Calculation Summary Draft

• 25% loss to follow-up

• Inflate calculated sample size by 25%

73

Handling Missing Data

74

Handling Missing Data

• 25% loss to follow-up

• Inflate calculated sample size by 25%

Over 12 months, we expect 25% loss to follow up. We will inflate the sample size by 25% to account for the attrition, for a total enrollment goal of 56 participants, or 28 participants per treatment arm.

75


Over 12 months, we expect 25% loss to follow up. We will inflate the sample size by 25% to account for the attrition, for a total enrollment goal of 56 participants, or 28 participants per treatment arm.

76


• Is the target population sufficiently large?

• Can recruitment be completed in the proposed time period?

77

Demonstrating Enrollment Feasibility

• 30 patients per week with a high desire / low felt coping style

• 40% consent rate

78

Sample size needed56

Sample size available

Planned Sample Size vs. Available Sample Size



79


Sample size available36


3 week enrollment period


Sample size available60



80


5 week enrollment period

The clinic treats 30 patients per week with the high desire/low felt coping style. Based on recruitment experience for previous studies, we expect a 40% consent rate. At an effective enrollment of 12 participants per week, we will reach the enrollment goal of 56 participants in 5 weeks time.

81


The clinic treats 30 patients per week with the high desire/low felt coping style. Based on recruitment experience for previous studies, we expect a 40% consent rate. At an effective enrollment of 12 participants per week, we will reach the enrollment goal of 56 participants in 5 weeks time.

82


• Aims typically represent different hypotheses

• Maximum of the sample sizes calculated for each aim

83

Planning for Multiple Aims


84

Background• The Kenward and Roger Wald test in the linear mixed model

improves Type I error control for small samples.

• For data analysis, Kenward and Roger described a scaled Wald statistic and corresponding central F reference distribution

• We describe a noncentral F power approximation for the Kenward and Roger (1997) Wald test

Existing power method for mixed models

• Helms (1992) described a noncentral F power approximation for the unscaled Wald test

• Stroup (1999) suggested an “exemplary data” approach• Tu et al. (2004, 2007) derived a power function based on the

asymptotic distribution of GEE estimates• Muller et al. (2007) presented power methods for “reversible”

mixed models

The linear mixed model

• Univariate model with one row for each observation• Observations may be correlated within an independent sampling

unit• The model accommodates missing data and several covariance

structures• Several methods of estimation

y11

y12

y21

y22

y23

1 0 0

0 1 0

1 0 0

0 1 0

0 0 1

e11

e12

e21

e22

e23

μ1

μ2

μ3

The Wald test of fixed effects• The Wald statistic is

• The reference distribution of the Wald statistic depends on the estimation method

New method: Overall strategy• Use a three-moment approximation to obtain the reference

distribution of the scaled Wald statistic under the alternative

• Approximate the moments of the unscaled Wald statistic under the null and the alternative

Approximating the moments of the unscaled Wald statistic• The exact distribution for REML estimates is unknown

• Approximate the distribution of REML estimates with known results from other techniques

• Consider each component (mean and covariance) of the Wald statistic separately

• Assume independence and combine to form an approximate F

Power for the Kenward and Roger test• Specify the Type I error rate, the hypothesis, choices for

means, and the covariance of a complete data case.

• For each independent sampling unit, specify the pattern of missing data and the design matrix

• Calculate the approximate reference distribution for the Kenward and Roger scaled Wald statistic

• Using the null distribution, obtain the critical value

• Using the alternative distribution, calculate power

Results: Longitudinal studies

Outline• Determining sample size and power for complex designs• Calculating power with our free, web-based software• Writing the grant• Handling missing data• Questions

93

Questions?

94

selecting a valid sample size for longitudinal and multilevel studies in cancer research: software...

Documents