statistical sampling overview and principles alvin binns 205-220-4522 [email protected]

Statistical Sampling Overview and Principles

Alvin Binns 205-220-4522

[email protected]

• Provider X is identified for billing excessive ambulance services. A decision was made to pull all of the provider's ambulance services for a specified two-year period.

Results: 3,000 claims, 7,000 lines, and $1.8M in payments.

Scenario

• Time

• Cost

• Available resources

• Available staff

Reasons for Sampling

What is Sampling?

• Sampling is the selection of observations to acquire some knowledge of a statistical universe (population).

• From the characteristics of samples, we can infer the characteristics of universes, if the sample is representative of the universe.

• In order for statistics to be good estimates of parameters, they must, on average, return the value of the universe parameter

• When the expected value of a statistic equals a universe parameter, we call the statistic an unbiased estimator of that universe parameter
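The idea of an unbiased estimator can be checked by brute force on a tiny universe. The sketch below is only an illustration: the five dollar amounts and the function names are hypothetical, not taken from the slides. It enumerates every possible sample of a given size and shows that the sample means average out to the universe mean.

```python
from fractions import Fraction
from itertools import combinations


def mean(values):
    """Exact arithmetic mean, using fractions to avoid rounding error."""
    values = list(values)
    return sum(Fraction(v) for v in values) / len(values)


def expected_sample_mean(universe, n):
    """Average the sample mean over every possible sample of size n."""
    sample_means = [mean(s) for s in combinations(universe, n)]
    return mean(sample_means)


# Hypothetical overpayment amounts (dollars) for a five-unit universe.
universe = [0, 10, 20, 50, 120]

universe_mean = mean(universe)                      # the universe parameter
expected_stat = expected_sample_mean(universe, 2)   # expected value of the statistic

# The two values agree, which is what "unbiased" means here.
print(universe_mean, expected_stat)
```

Because each possible sample is counted once, this corresponds to equal-probability selection without replacement.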

How Do I Get a Representative Sample?

• How do you ensure that your statistic is an unbiased estimator?

How Do I Get a Representative Sample?

RANDOMIZATION!!!

• A sample that is randomly selected from a universe yields sample statistics that are unbiased estimates of the universe parameters

• Many software packages, such as SAS and RAT-STATS, have valid random number generators

Randomization

• Another idea behind random sampling is that each sampling unit has a known probability of being selected

Probability Sampling

• Universe: The events or things of interest that the researcher wishes to investigate.

Example: All Medicare beneficiaries who received a left heart catheterization from Dr. John Doe between January 1, 2007 and June 30, 2008, paid up to September 30, 2008.

Sampling Terms

• Samples are usually drawn by taking a subset of sampling units from the total universe

• Sampling units are non-overlapping collections of elements from the universe that together cover the entire universe (e.g., claims, beneficiaries)

Sampling Terms

• We can infer the values of the universe from the sample by the use of estimation

• Ideally, we would like to gather information from the sample and then estimate that value for the entire universe

• These estimates calculated from the sample data are called statistics

Estimation

Statistics

[Diagram: ENTITY row: the Sample estimates the Census. CHARACTERISTIC row: the Statistic estimates the Parameter.]

• In a simple random sample of 100 units out of 1,000, suppose the sample shows a total overpayment of $5,000

• The Mean Total Overpayment would then be:

Estimation

\[
\hat{\tau}_{op} = N\bar{y} = 1000 \times \frac{\$5{,}000}{100} = \$50{,}000
\]
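The slide's example (100 units sampled from 1,000, with $5,000 of overpayment found in the sample) can be sketched in Python. The function name and the flat $50-per-unit sample are hypothetical; only the totals come from the slide.

```python
def estimate_total_overpayment(universe_size, sample_overpayments):
    """Mean-per-unit estimate: universe size times the sample mean overpayment."""
    n = len(sample_overpayments)
    if n == 0:
        raise ValueError("sample must contain at least one unit")
    sample_mean = sum(sample_overpayments) / n
    return universe_size * sample_mean


# Hypothetical sample: 100 units whose overpayments sum to $5,000.
sample = [50.0] * 100
estimate = estimate_total_overpayment(1000, sample)
print(f"${estimate:,.2f}")
```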

Why Should I Care?

• HCFA Ruling 86-1 allows the use of statistical sampling for the purpose of estimating a provider’s overpayment to the Medicare trust fund

• Thus, we can use sampling to estimate overpayments to providers and avoid having to review the entire universe!

• CMS guidelines for Statistical Sampling for Overpayment Estimation

• Program Integrity Manual Section 3.10

• Some of the issues addressed are:

– Methodologies

– Sample Size

– Estimation techniques

CMS Sampling Guidelines

• This replaces and clarifies (for older cases) the old HCFA Sampling Guidelines Appendix (CR 1363)

– “This program memorandum (PM) provides clarified guidance and direction for Medicare carriers to use when conducting statistical sampling for overpayment estimation. The attached replaces the prior Sampling Guidelines Appendix for reviews conducted after issuance of this PM. For reviews conducted prior to this issuance, the attached are a clarification to aid interpretation of the earlier instructions, particularly where specific numbers are suggested”

Sampling Guidelines

• Simple Random Sampling

• Cluster Sampling

• Stratified Sampling

• Other Methodologies

Sampling Methodologies

• This is the most straightforward method of sampling

• X number of sampling units are randomly selected from Y total sampling units in the Universe

• Each sampling unit has an equal probability of being selected
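A minimal sketch of simple random sampling, using Python's standard `random` module in place of the SAS or RAT-STATS generators the deck mentions. The function name, the unit numbering from 1 to Y, and the seed value are assumptions made for illustration.

```python
import random


def simple_random_sample(universe_size, sample_size, seed):
    """Draw sample_size distinct unit numbers from 1..universe_size without replacement."""
    if not 0 < sample_size <= universe_size:
        raise ValueError("sample size must be between 1 and the universe size")
    rng = random.Random(seed)  # recording the seed lets the same draw be reproduced
    return sorted(rng.sample(range(1, universe_size + 1), sample_size))


# X = 100 units drawn from Y = 1,000; the seed is an arbitrary illustrative value.
selected = simple_random_sample(universe_size=1000, sample_size=100, seed=20080930)
print(len(selected), selected[:5])
```

Keeping the seed with the case file is one way to make the selection repeatable on request.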

Simple Random Sampling

• A cluster sample is a probability sample in which each sampling unit is a collection, or cluster, of elements

• A good example is the random selection of beneficiaries, then selecting all relevant claims from each beneficiary
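The beneficiary example can be sketched as follows, assuming claims arrive as (beneficiary, claim) pairs. The identifiers, the function name, and the seed are all hypothetical.

```python
import random
from collections import defaultdict


def cluster_sample(claims, n_clusters, seed):
    """Randomly pick beneficiaries, then keep every claim for each one picked.

    `claims` is a list of (beneficiary_id, claim_id) pairs.
    """
    by_beneficiary = defaultdict(list)
    for beneficiary_id, claim_id in claims:
        by_beneficiary[beneficiary_id].append(claim_id)

    rng = random.Random(seed)
    # The sampling unit is the beneficiary (the cluster), not the claim.
    chosen = rng.sample(sorted(by_beneficiary), n_clusters)
    return {b: by_beneficiary[b] for b in chosen}


# Hypothetical claims data: four beneficiaries with differing numbers of claims.
claims = [
    ("B1", "C101"), ("B1", "C102"),
    ("B2", "C201"),
    ("B3", "C301"), ("B3", "C302"), ("B3", "C303"),
    ("B4", "C401"),
]
sampled = cluster_sample(claims, n_clusters=2, seed=1)
print(sampled)
```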

Cluster Sampling

• A stratified random sample is one obtained by separating the universe elements into non-overlapping groups, called strata, and then selecting a simple random sample from each stratum

• An example of this would be samples involving multiple procedure codes, selecting simple random samples from each code

• Stratified random sampling generally has less sampling variability than other sampling designs

Stratified Random Sampling

Stratified Random Sampling – Proportional Allocation Example

Stratum (procedure code)   99211   99212   99213   99214   99215   Total
Universe units                50     150     300     450      50    1000
Sample units                   5      15      30      45       5     100
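Proportional allocation can be sketched as below. The universe counts assume the reading of the slide in which each stratum contributes 10% of its units (so 99213 has 300 units and 99214 has 450); the function name and the largest-remainder rounding rule are illustrative choices, not something the slides specify.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Allocate sample_size across strata in proportion to stratum size.

    Uses the largest-remainder method so the allocations always sum to sample_size.
    """
    total = sum(stratum_sizes.values())
    # Integer part of each stratum's exact share, and what is left over.
    base = {k: (size * sample_size) // total for k, size in stratum_sizes.items()}
    remainder = {k: (size * sample_size) % total for k, size in stratum_sizes.items()}
    leftover = sample_size - sum(base.values())
    # Hand any remaining units to the strata with the largest remainders.
    for k in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        base[k] += 1
    return base


# Assumed universe counts per procedure code (1,000 units in total).
universe = {"99211": 50, "99212": 150, "99213": 300, "99214": 450, "99215": 50}
allocation = proportional_allocation(universe, 100)
print(allocation)
```

A simple random sample of the allocated size would then be drawn within each stratum.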

• PIM 3.10 states about sample sizes:

– “It is neither possible nor desirable to specify a minimum sample size that applies to all situations”

– “…real-world economic constraints must be taken into account. As stated earlier, sampling is used when it is not administratively feasible to review every sampling unit in the target universe. In practice, sample sizes may be determined by available resources. That does not mean, however, that the resulting estimate of overpayment is not valid as long as proper procedures for the execution of probability sampling have been followed. A challenge to the validity of the sample that is sometimes made is that the particular sample size is too small to yield meaningful results. Such a challenge is without merit as it fails to take into account all of the other factors that are involved in the sample design”

PIM 3.10 – Sample Sizes

• CSA procedure:

– If we can, we like to pull at least 10% of the universe; however, this is not a rule that is set in stone

– At a minimum, we must pull 30 sampling units to satisfy distribution requirements through the central limit theorem
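One possible reading of this rule of thumb, as a sketch only: treat 10% of the universe as a target and 30 units as a floor, and never ask for more units than the universe holds. The function name, its parameters, and the capping behavior are assumptions; PIM 3.10 itself does not set a minimum.

```python
def suggested_sample_size(universe_size, percent=10, floor=30):
    """Rule-of-thumb size: the larger of `percent`% of the universe and `floor`,
    capped at the universe size."""
    if universe_size <= 0:
        raise ValueError("universe size must be positive")
    # Integer ceiling of percent% of the universe (avoids floating-point rounding).
    target = (universe_size * percent + 99) // 100
    return min(universe_size, max(floor, target))


for size in (20, 44, 1000, 3000):
    print(size, suggested_sample_size(size))
```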

PIM 3.10 – Sample Sizes

• PIM 3.10 also states:

– “In most situations the lower limit of a one-sided 90 percent confidence interval should be used as the amount of overpayment to be demanded for recovery from the physician or supplier. The details of the calculation of this lower limit involve subtracting some multiple of the estimated standard error from the point estimate, thus yielding a lower figure.”

PIM 3.10 – Overpayment

• It further states that:

– “This procedure, which, through confidence interval estimation, incorporates the uncertainty inherent in the sample design, is a conservative method that works to the financial advantage of the physician or supplier. That is, it yields a demand amount for recovery that is very likely less than the true amount of overpayment, and it allows a reasonable recovery without requiring the tight precision that might be needed to support a demand for the point estimate. However, you are not precluded from demanding the point estimate where high precision has been achieved.”


• What we really do then is calculate the Mean Total Overpayment and subtract a multiple of the standard error from it to obtain the lower limit of the confidence interval


• Below is the formula for the total variance for cluster sampling


\[
\hat{V}(\hat{\tau}_{op}) = N^2 \left(\frac{N-n}{Nn}\right) \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}
\]
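The calculation can be sketched in Python under a few stated assumptions: the variance is the usual cluster-total form with a finite population correction (universe size N, sample size n, sample standard deviation s of the cluster totals), and the multiplier is 1.282, the standard normal 90th percentile. The slides do not state the multiplier; 1.282 is an inference that happens to be consistent with the figures in the sample size comparison (44 beneficiaries in the universe, 20 sampled). Function and constant names are hypothetical.

```python
import math

# Assumed multiplier for a one-sided 90% bound (standard normal 90th percentile).
Z_90_ONE_SIDED = 1.282


def total_standard_error(universe_size, sample_size, sample_std_dev):
    """Standard error of the estimated total, with finite population correction."""
    N, n, s = universe_size, sample_size, sample_std_dev
    variance = N**2 * ((N - n) / (N * n)) * s**2
    return math.sqrt(variance)


def lower_bound(point_estimate, standard_error, multiplier=Z_90_ONE_SIDED):
    """Point estimate minus a multiple of the standard error."""
    return point_estimate - multiplier * standard_error


# Summary figures for the 20-beneficiary sample: mean and standard deviation
# of the per-beneficiary overpayment totals.
point = 44 * 17462.81
se = total_standard_error(44, 20, 30551.07)
print(round(point, 2), round(se, 2), round(lower_bound(point, se), 2))
```

Small differences from the deck's figures are expected because the inputs here are already rounded to the cent.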

• Look at how the overpayments work:


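[Figure: overpayment ($) on the vertical axis. Two intervals around the Mean Total Overpayment are compared: OP w/ Small Variance (Large n), with a narrow gap between the 90% Upper Limit and 90% Lower Limit, and OP w/ Large Variance (Small n), with a wide gap between the 90% Upper Limit and 90% Lower Limit.]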

Sample Size Comparison

Analysis Variable : CLUSTAMT

N     Mean        Std Dev     Sum          Minimum   Maximum
20    17,462.81   30,551.07   349,256.24   0.00      81,603.20
10    17,462.81   31,388.24   174,628.12   0.00      81,603.20
5     18,323.58   35,639.19   91,617.92    0.00      81,603.20

Estimation of Total Amount of Refund and Its Lower 1-sided 90% C.I.

Sample Size   Univ. Size   Point Estimate   Std. Error   90% 1-sided Lower Bound
20            44           $768,363.73      221,995.08   $483,766.04
10            44           $768,363.73      383,912.92   $276,187.37
5             44           $806,237.70      660,329.49   $91,617.92

Difference in lower bound between sample sizes of 5 and 10 beneficiaries: $184,569.45

Difference in lower bound between sample sizes of 10 and 20 beneficiaries: $207,578.67

Bottom Line

• Large Sample Sizes

– Use when the expected overpayment is large

– Use in high profile cases

– Resource intensive

– Increase precision even more using stratified sampling plans

• Small Sample Sizes

– Use when the expected overpayment is small

– Use in routine, low $ cases

– Not as resource intensive

– Does not work as well for stratified sampling

Sub-samples

• It is often beneficial to evaluate a sub-sample (sample size of about 30) before moving to a full statistical sample.

• Get a good idea of the point estimate (Mean Total Overpayment).

• Sampling for Consent Settlements.

Summary for Sampling

• Define the Universe

• Determine the sampling methodology

• Create the sampling Frame

• Determine sample size

• Create your sample

After Sampling review is completed

• Perform overpayment Projection

Questions?

Thank You!
