
Post on 17-Jan-2018


Overlap Methods Derived from Visual Analysis in Single Case Research Methodology

In collaboration with Brian Reichow and Mark Wolery

Topics

• Rationale and History of Overlap Methods
• Overview of Methods and Their Problems
  1. PND
  2. PEM
  3. PEM-T (ECL)
  4. PAND
  5. R-IRD*
  6. NAP*
  7. TauU*
• Recommendations for Overlap Methods

Rationale

• Assumption: Multiple studies are needed to be confident in the evidence supporting a practice
  – Need to aggregate findings from multiple studies
  – Meta-analytic methods are well established for group experimental research:
    • Aggregate findings
    • Quantify the magnitude
    • Conduct analyses of moderator variables to account for differences in magnitude across studies (Lipsey & Wilson, 2001)
  – Consensus does not exist for SCRD
• How to calculate effect sizes?

Issues in Meta-analysis of SCRD

• Data are not independent
  – On a single individual
  – Using the same definitions
  – Using the same data collection procedures
  – In the same context
  – Under the same procedures
  – Often with short intervals between observations
• Compromises certain analysis techniques
• Compromises some quantitative synthesis methods

Issues in Meta-analysis of SCRD

With single subject research, to identify a functional relation between independent and dependent variables:

• Threats to internal validity must be controlled
• The design needs an adequate number of replications of the experimental conditions
• The data patterns must shift consistently in the predicted (therapeutic) direction with each change in experimental conditions

Single subject studies use a replication logic for making judgments about functional relations

Overlap Methods: Advantages

• Distribution-free indices
  – Do not assume normality or constant variance
  – Small amounts of data (short phase lengths) make it impossible to test parametric assumptions
• Can be calculated by hand
• Directly interpretable
  – e.g., percent of data showing no overlap
• Allow the interventionist to remain in control

PND: Percent of Non-overlapping Data

Overlap Methods: PND
Percentage of Non-overlapping Data (PND)

• Oldest of the overlap methods (Scruggs & Mastropieri, 1998; Scruggs, Mastropieri, & Casto, 1987)
• Used extensively
• Easily calculated
• Does not assume data are independent
• Does not make other assumptions necessary in regression methods
• Interpreted as: “The percentage of Phase B data exceeding the single highest Phase A datum point.”

Overlap Methods: PND
Calculated by:

1. Identify the intended direction of change
2. Draw a straight line from the highest (or lowest) datum point in Phase A through Phase B, and count the number of data points in Phase B above (or below) the line
3. Quotient = (# above the line / total number in Phase B) × 100
4. >70% is effective, 50% to 70% is questionable effectiveness, and <50% is no observed effect (Scruggs & Mastropieri, 1998)
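The steps above can be sketched in Python. This is a minimal illustration, not part of the original deck; the function name and the `direction` parameter are my own. It uses the practice data that appears later in the deck (Phase A: 0 4 3 0 0; Phase B: 5 2 3 5 3 5 6 7):

```python
def pnd(phase_a, phase_b, direction="increase"):
    """Percentage of Non-overlapping Data (Scruggs, Mastropieri, & Casto, 1987).

    Counts the Phase B points beyond the most extreme Phase A point
    in the intended direction of change, as a percentage of Phase B.
    """
    if direction == "increase":
        reference = max(phase_a)          # highest baseline datum point
        beyond = sum(1 for v in phase_b if v > reference)
    else:
        reference = min(phase_a)          # lowest baseline datum point
        beyond = sum(1 for v in phase_b if v < reference)
    return 100 * beyond / len(phase_b)

# 5 of the 8 Phase B points exceed the baseline maximum of 4:
print(pnd([0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]))  # 62.5
```

Note how a single extreme baseline point (the 4) sets the bar for every Phase B point, which is exactly the sensitivity to outliers criticized below.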

1. PND is compromised by a baseline datum point at the floor or ceiling

[Graph: sessions vs. number of responses; a single baseline datum point at the ceiling yields PND = 0 despite an apparent change from baseline to intervention]

2. PND is compromised by variability in the baseline condition, because it relies on the most extreme datum point in the baseline, perhaps the one that is least representative of the data pattern

[Graph: variable baseline; PND = 0 despite an apparent change from baseline to intervention]

3. PND is compromised by trends in data within conditions

[Graphs: four sessions-by-responses panels with within-condition trends; PND = 0 (despite an apparent change), PND = 100, PND = 66.7, and PND = 92.3]

4. PND is compromised by the number of data points in the intervention condition

[Graphs: two panels differing in the number of intervention data points; both yield PND = 100]

5. PND does not measure magnitude of difference

[Graphs: two markedly different intervention data patterns; both yield PND = 60]

The lower graph resembles a standard learning curve; PND does not handle this pattern well.

[Graph: extinction-burst pattern; PND = 50]

Extinction bursts may be problematic

Overlap Methods: PND—Problems

6. PND does not address critical issues of consistent replications
• Adding the PNDs across replications can lead to inaccurate conclusions

[Graphs: three tiers of a multiple baseline; PND = 100, PND = 50, and PND = 100]

100 + 100 + 50 = 250; 250/3 = 83.3; Mean PND = 83.3

Overlap Methods: PND—Problems

7. Compared with consensus visual analysis, PND resulted in an error in about 1 of 5 condition changes, a high rate of errors (Wolery, Busick, Reichow, & Barton, 2010)

PEM: Percent Exceeding the Median

Overlap Methods: PEM
Percentage of Data Exceeding the Median (Ma, 2006)

• Developed to improve upon PND
• Rather than the highest datum point, PEM relies on the more stable and representative Phase A median level
• As with the PND, the PEM is not compromised by serial dependency and other data assumptions
• Simple calculation
• Designed to eliminate the problem of a baseline datum point being at floor or ceiling
• Designed to not rely on the most extreme datum point
• Less influenced by variability in baseline

Overlap Methods: PEM
Calculated by:

1. Draw a line at the median of Phase A data through Phase B data
2. Count the number of data points in Phase B above (or below) the line, divide by the total number of data points in Phase B, and multiply by 100
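The two steps can be sketched in Python (a hedged illustration, not from the deck; function name and `direction` parameter are my own), again using the practice data that appears later (Phase A: 0 4 3 0 0; Phase B: 5 2 3 5 3 5 6 7):

```python
from statistics import median

def pem(phase_a, phase_b, direction="increase"):
    """Percentage of Data Exceeding the Median (Ma, 2006)."""
    baseline_median = median(phase_a)     # the line drawn through Phase B
    if direction == "increase":
        exceeding = sum(1 for v in phase_b if v > baseline_median)
    else:
        exceeding = sum(1 for v in phase_b if v < baseline_median)
    return 100 * exceeding / len(phase_b)

# The baseline median is 0, so every Phase B point exceeds it:
print(pem([0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]))  # 100.0
```

Contrast this with PND's 62.5 on the same data: the median ignores the extreme baseline points 4 and 3, which is precisely the design goal stated above.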

[Graphs: a pattern with no real change yields PEM = 57.1; a pattern with a clear change yields PEM = 100]

1. PEM is compromised by trends in the data

[Graphs: with trends in the data, the two panels yield PEM = 66.7 and PEM = 92.3]

2. PEM is compromised by the number of intervention data points

[Graphs: two panels differing in the number of intervention data points; both yield PEM = 100]

3. PEM is not an effect size estimate, because it does not measure magnitude of difference

[Graphs: two markedly different intervention data patterns; both yield PEM = 60]

Doesn’t do well with learning curves

[Graph: extinction-burst pattern; PEM = 60]

Extinction bursts may be problematic

Overlap Methods: PEM—Problems

4. The PEM does not address the critical analysis question (do data patterns consistently shift with each change in the experimental conditions?)
• Adding the PEMs across replications can lead to the same inaccurate conclusions as with the PND
5. Compared with consensus visual analysis, PEM resulted in an error in about 1 of 6 condition changes, a high rate of errors (Wolery et al., 2010)
6. The PEM appears to over-estimate effects and does not discriminate well between graphs (Parker & Hagan-Burke, 2007)

PEM-T: Percent Exceeding the Median Trend line

Overlap Methods: PEM-T
Percentage of Data Exceeding a Median-Based Trend (Wolery et al., 2010)

• As with the PND and PEM, the PEM-T is not compromised by serial dependency and other data assumptions
• Simple calculation, but requires graphing on semi-logarithmic paper (and re-graphing data)
• Designed to eliminate the problems of:
  – a baseline datum point being at floor or ceiling
  – relying on the most extreme baseline point
  – trends in the data

Overlap Methods: PEM-T
Calculated by:

1. Graph data on a semi-logarithmic chart
2. Calculate and draw a split-middle line of trend estimation for Phase A data and extend it through Phase B
3. Count the number of Phase B data points above (or below) the split-middle line of trend estimation
4. Divide the count from Step 3 by the number of data points in Phase B and multiply the quotient by 100
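The steps above can be sketched in Python. This is a simplified sketch under stated assumptions: the split-middle line is taken as the line through (median session, median log10 value) of each half of Phase A, all values must be positive for the log scale, and the sample data (a doubling baseline followed by a flat intervention phase) are invented for illustration, not from the deck.

```python
import math
from statistics import median

def pem_t(phase_a, phase_b, direction="increase"):
    """Percent of Phase B points beyond a split-middle trend line
    fit to Phase A on a log10 scale (sketch of the PEM-T idea)."""
    n = len(phase_a)
    half = n // 2
    # Each half of Phase A contributes one point on the trend line:
    x1 = median(range(1, half + 1))
    x2 = median(range(n - half + 1, n + 1))
    y1 = median(math.log10(v) for v in phase_a[:half])
    y2 = median(math.log10(v) for v in phase_a[-half:])
    slope = (y2 - y1) / (x2 - x1)

    def predicted(x):                     # trend-line value at session x
        return 10 ** (y1 + slope * (x - x1))

    sessions_b = range(n + 1, n + len(phase_b) + 1)
    if direction == "increase":
        beyond = sum(1 for x, v in zip(sessions_b, phase_b) if v > predicted(x))
    else:
        beyond = sum(1 for x, v in zip(sessions_b, phase_b) if v < predicted(x))
    return 100 * beyond / len(phase_b)

# A doubling baseline (2, 4, 8, 16) projects to 32, 64, 128, 256 in
# Phase B, so a flat Phase B at 10 falls entirely below the trend:
print(pem_t([2, 4, 8, 16], [10, 10, 10, 10], direction="decrease"))  # 100.0
```

The point of the projection step is visible here: a baseline that is already rising does not inflate the score, because Phase B is compared against the extended trend rather than against a flat median.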

PEM-T not compromised by trends in the data

[Semi-logarithmic graphs: with trends in the data, a pattern with change yields PEM-T = 100 and a pattern with no change yields PEM-T = 0]

1. PEM-T is compromised by the number of intervention data points

[Semi-logarithmic graphs: two panels differing in the number of intervention data points yield PEM-T = 66.7 and PEM-T = 92.3]

2. PEM-T is not an effect size estimate, because it does not measure magnitude of difference

[Semi-logarithmic graphs: two different magnitudes of change both yield PEM-T = 100]

[Semi-logarithmic graphs: two markedly different data patterns both yield PEM-T = 60]

Doesn’t do well with learning curves

Extinction bursts may be problematic

Overlap Methods: PEM-T—Problems

3. The PEM-T does not address the critical analysis question (do data patterns consistently shift with each change in the experimental conditions?)
  – Adding the PEM-Ts across replications can lead to the same inaccurate conclusions as with the PND
4. Compared with consensus visual analysis, PEM-T resulted in an error in about 1 of 8 condition changes, a high rate of errors (Wolery et al., 2010)

PAND: Percent of All Non-overlapping Data

Overlap Methods: PAND
Percentage of All Non-overlapping Data (Parker, Hagan-Burke, & Vannest, 2007)

• Percentage of data remaining after determining the fewest data points that must be removed to eliminate all between-phase overlap
• As with the PND, PEM, PEM-T, and IRD, it is not compromised by serial dependency or other data assumptions
• Simple calculation, although a bit more complex than PND or PEM
• Used for the multiple baseline design, but perhaps can be used with long (60-80 data point) A-B-A-B designs

Overlap Methods: PAND
Calculated by:

1. Count the total number of data points in all tiers
2. Identify how many data points need to be removed to eliminate overlap
3. Count the number of remaining data points
4. Divide the count in Step 3 by the count in Step 1 and multiply by 100
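The minimal-removal step can be sketched for a single A-B tier (the published PAND pools this count across all tiers of a multiple baseline). This is an illustration under my own naming, assuming improvement means an increase; the data are the practice values used later in the deck:

```python
def min_removals(phase_a, phase_b):
    """Fewest points to remove so every kept Phase A value is below
    every kept Phase B value. Tries each cut point t and removes the
    A points at or above t and the B points below t."""
    values = sorted(set(phase_a) | set(phase_b))
    candidates = values + [values[-1] + 1]
    return min(
        sum(1 for v in phase_a if v >= t) + sum(1 for v in phase_b if v < t)
        for t in candidates
    )

def pand(phase_a, phase_b):
    """Percentage of All Non-overlapping Data for one A-B tier."""
    total = len(phase_a) + len(phase_b)
    return 100 * (total - min_removals(phase_a, phase_b)) / total

a, b = [0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]
print(min_removals(a, b))     # 2 (removing the baseline 4 and 3 suffices)
print(round(pand(a, b), 1))   # 84.6
```

Scanning cut points works because any overlap-free split of the two phases corresponds to some threshold separating the kept A values from the kept B values.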

PAND: Issues

1. Variable baseline data are likely to compromise PAND
2. Trends in the data will compromise PAND
3. Requires some overlap between phases to have sensitive estimates (Parker et al., 2007)
4. Increasing the length of the intervention condition can produce a higher PAND
5. PAND does not assess magnitude, because different patterns (e.g., shallow and sharp learning curves) can have the same values

PAND: Issues

7. Combining data across tiers
  – violates the within-tier analysis logic of the multiple baseline design
  – violates the replication logic of the multiple baseline design
  – assumes each tier is equivalent in terms of changes (or lack thereof) in data patterns
8. The method for calculating confidence intervals gives the impression of statistical precision, when indeed it is absent

PAND: Issues

9. The PAND does not address the critical analysis question (do data patterns consistently shift with each change in the experimental conditions?)

IRD: Improvement Rate Difference

Overlap Methods: IRD
Improvement Rate Difference (Parker, Vannest, & Brown, 2009)

• As with the PND and PEM, the IRD is not compromised by serial dependency or other data assumptions
• Simple calculation, although a bit more complex than PND or PEM
• Allows confidence intervals to be calculated
• Has a history of use in group health care research
• Note: “rate” in IRD is not about rate of behavior

R-IRD: Calculation

• IRD (Parker et al., 2009) was developed to improve upon PAND by providing an easily interpretable, reputable effect size index (with a sampling distribution).

• IRD calculation begins with the same method (fewest data points that must be removed to eliminate all overlap) as PAND, but in a second step converts the results to two improvement rates (IR), for phase A and B respectively.

• The two IR values are finally subtracted to obtain the “Improvement Rate Difference” (IRD).

[Graph from Parker et al. (2009): A-B-A-B design with phase lengths A1 = 7, B1 = 16, A2 = 4, B2 = 12]

A1 to B1: (13/16 = 81%) – (0/7 = 0%) = IRD of 81%
B1 to A2: (13/16 = 81%) – (2/4 = 50%) = IRD of 31%
A2 to B2: (11/12 = 92%) – (0/4 = 0%) = IRD of 92%

IRD = (81 + 31 + 92)/3 = 68%

R-IRD: Calculation

• The original IRD article recommended that in the first step, data point ‘removal’ “should be balanced across the contrasted phases” (Parker et al., 2009, p. 141) for more robust results.

• A more robust IRD solution was later described and formalized as “Robust IRD” (R-IRD).

• R-IRD requires rebalancing (by hand) of a 2 x 2 matrix

• IRD is interpreted as the difference in the proportion of high or “improved” scores between phases B and A.

R-IRD: Calculation

• The superior robust version of IRD (R-IRD) requires that quadrants be balanced. Balancing allows for a more conservative effect in instances where a large number of data points may be removed arbitrarily from one side and a few from the other, which can unduly influence the results.

• R-IRD is unbiased in the sense that it does not allow bias in the removal of data points from A versus B, as some datasets provide two or more equally good removal solutions.

R-IRD: Calculation

1. Determine the fewest data points that must be removed to eliminate overlap
2. Balance quadrants W and Z
3. Then balance Y; Phase A improvement rate: A = W / (W + Y)
4. Then balance X; Phase B improvement rate: B = X / (X + Z)
5. R-IRD = B – A
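These steps can be sketched in Python for a single A-B contrast. This is a hedged sketch under my own naming, assuming improvement means an increase and that balancing splits the minimal removal count evenly between the two phases (W = Z = r/2); the data are the practice values used later in the deck:

```python
def r_ird(phase_a, phase_b):
    """Robust Improvement Rate Difference (sketch): find the fewest
    removals r that eliminate overlap, balance them across phases,
    then take the difference of the two improvement rates."""
    values = sorted(set(phase_a) | set(phase_b))
    candidates = values + [values[-1] + 1]
    r = min(  # fewest removals over all cut points (same step as PAND)
        sum(1 for v in phase_a if v >= t) + sum(1 for v in phase_b if v < t)
        for t in candidates
    )
    w = z = r / 2                                        # balanced quadrants
    improvement_b = (len(phase_b) - z) / len(phase_b)    # X / (X + Z)
    improvement_a = w / len(phase_a)                     # W / (W + Y)
    return improvement_b - improvement_a

# Matches the worked practice example: .875 - .2 = .675
print(round(r_ird([0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]), 3))
```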

IRD is not influenced by a data point at floor or ceiling

[Graph: a single baseline datum point at the ceiling; A1 to B1: (7/8 = 88%) – (0/7 = 0%) = IRD of 88%]

1. IRD is compromised by variability in the baseline condition

[Graph: variable baseline; A1 to B1: (5/8 = 63%) – (2/7 = 29%) = IRD of 34%]

“An IRD of 50% (.50) indicates that half the scores are overlapping, so did not improve from phase A to B” (Parker et al., 2009, p. 139).

2. IRD is compromised by trends in the conditions

[Graph: trends in the conditions; A1 to B1: (5/8 = 63%) – (3/7 = 43%) = IRD of 20%]

3. IRD is compromised by the number of data points in the intervention condition

[Graphs: the same data pattern with different intervention lengths: (7/7 = 100%) – (1/3 = 33%) = IRD of 66.7, versus (7/7 = 100%) – (1/13 = 8%) = IRD of 92%]

4. IRD does not measure magnitude of difference

[Graphs: two markedly different intervention data patterns both yield (10/10 = 100%) – (0/10 = 0%) = IRD of 100]

IRD

5. The IRD does not address replication logic. Although the IRD has some advantages over other overlap methods, it is still flawed.

Practice: Calculating R-IRD

Phase A: 0 4 3 0 0
Phase B: 5 2 3 5 3 5 6 7

[Graph: both phases plotted, partitioned into quadrants W and Y (Phase A improved/not improved) and X and Z (Phase B improved/not improved)]

1. Determine the fewest data points that must be removed to eliminate overlap = 2
2. Balance quadrants W and Z = 1, 1
3. Then balance X = 7; Phase B: X / (X + Z) = 7/(1 + 7) = .875
4. Then balance Y = 4; Phase A: W / (W + Y) = 1/(1 + 4) = .2
5. R-IRD = B – A = .875 – .2 = .675

NAP: Non-overlap of All Pairs

Overview of NAP

• The percentage of data that improve from A to B; operationally, the percentage of all pairwise comparisons from Phase A to B which show improvement or growth (Parker & Vannest, 2009)
• NAP’s limitations include that it is insensitive to trends and outliers

Calculating NAP

1. NAP begins with all pairwise comparisons (#Pairs = nA × nB) between phases.

2. Each paired comparison has one of three outcomes: improvement over time (Pos), deterioration (Neg), or no change over time (Tie).

3. NAP is calculated as (Pos + .5 × Tie) / #Pairs.
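The three steps map directly to code. This is an illustrative sketch (function name and `direction` parameter are my own), using the practice data from the next slide:

```python
def nap(phase_a, phase_b, direction="increase"):
    """Non-overlap of All Pairs: (Pos + .5 * Tie) / #Pairs over all
    nA x nB cross-phase pairwise comparisons."""
    pos = neg = tie = 0
    for a in phase_a:
        for b in phase_b:
            if a == b:
                tie += 1                                  # no change
            elif (b > a) == (direction == "increase"):
                pos += 1                                  # improvement
            else:
                neg += 1                                  # deterioration
    pairs = len(phase_a) * len(phase_b)
    return (pos + 0.5 * tie) / pairs

# 40 pairs: 34 Pos, 4 Neg, 2 Tie -> (34 + 1)/40
print(nap([0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]))  # 0.875
```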

Practice: Calculating NAP

Phase A: 0 4 3 0 0
Phase B: 5 2 3 5 3 5 6 7

[Graph: both phases plotted; nA = 5, nB = 8]

# of Pairs = 5 × 8 = 40
#Pos = 34, #Neg = 4, #Tie = 2
NAP = (#Pos + .5 × #Tie)/#Pairs = (34 + .5 × 2)/40 = .875

TauU (Kendall’s Tau + Mann-Whitney U)

Overview of TauU

• NAP’s major limitation, insensitivity to data trend, led to the development of a new index that integrates non-overlap and trend: TauU (Parker, Vannest, Davis, & Sauber, 2011).
• TauU melds Kendall’s Rank Correlation (KRC) and the Mann-Whitney U (MW-U), which are transformations of one another and share the same S sampling distribution.
• The TauU score is not affected by the ceiling effect present in other non-overlap methods, and performs well in the presence of autocorrelation.

Calculating TauU

Simplest TauU (non-overlap only)

• Conduct the same pairwise comparisons (nA × nB = #Pairs) across phases as in NAP, resulting in a Pos, Neg, or Tie for each pair
• The TauU simple non-overlap form (not considering trend) is TauU = (Pos – Neg) / #Pairs
• Thus, NAP is the percent of non-overlapping data, whereas TauU is the percent of non-overlapping minus overlapping data.
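The simple non-overlap form differs from NAP only in how ties and overlaps enter the numerator. A sketch (naming my own), using the practice data from the next slide:

```python
def tau_u(phase_a, phase_b, direction="increase"):
    """Simplest (non-overlap only) TauU: (Pos - Neg) / #Pairs.
    Ties contribute to neither Pos nor Neg."""
    pos = neg = 0
    for a in phase_a:
        for b in phase_b:
            if a == b:
                continue                                  # tie
            if (b > a) == (direction == "increase"):
                pos += 1                                  # improvement
            else:
                neg += 1                                  # deterioration
    return (pos - neg) / (len(phase_a) * len(phase_b))

# 40 pairs: 34 Pos, 4 Neg -> (34 - 4)/40
print(tau_u([0, 4, 3, 0, 0], [5, 2, 3, 5, 3, 5, 6, 7]))  # 0.75
```

Unlike NAP, overlapping pairs actively subtract from the score here, so TauU penalizes overlap rather than merely failing to credit it.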

Practice: Calculating TauU

Phase A: 0 4 3 0 0
Phase B: 5 2 3 5 3 5 6 7

[Graph: both phases plotted; nA = 5, nB = 8]

# of Pairs = 5 × 8 = 40
#Pos = 34, #Neg = 4, #Tie = 2
TauU = (#Pos – #Neg)/#Pairs = (34 – 4)/40 = .75

www.singlecaseresearch.org

Summary

Many different synthesis metrics
• Each has significant limitations and flaws
• None are satisfactory
• PND has been quite popular (maybe most flawed), but increasing popularity of PAND is evident

Common problems
• Variability, trends, and “known” data patterns (extinction bursts, learning curves, delayed effects)
• Failure to measure magnitude
• Ignoring the replication logic of SSR

Summary

• Complete non-overlap measures offer the most robust option (NAP, TauU)
  – Complete measures equally emphasize all scores
  – Incomplete measures emphasize particular scores (e.g., median)
• Interpretation of ES is a tricky business, with context, social significance, clinical significance, effects of prior studies, and the behaviors under examination all a part of the interpretation

Recommend using IRD, NAP, or TauU

For more information:

Ma, H. H. (2006). An alternative method for quantitative synthesis of single-subject research: Percentage of data points exceeding the median. Behavior Modification, 30, 598–617.

Parker, R. I., & Vannest, K. J. (2009). An improved effect size for single case research: Non-overlap of all pairs (NAP). Behavior Therapy, 40, 357–367.

Parker, R. I., Vannest, K. J., & Brown, L. (2009). The improvement rate difference for single case research. Exceptional Children, 75, 135–150.

Parker, R. I., Vannest, K. J., & Davis, J. L. (2011). Effect size in single-case research: A review of nine nonoverlap techniques. Behavior Modification, 35, 303–322.

Wolery, M., Busick, M., Reichow, B., & Barton, E. (2010). Comparison of overlap methods for quantitatively synthesizing single-subject data. Journal of Special Education, 44, 18–28.
