Evaluating Program Outcomes

Farrokh Alemi, Ph.D., July 04, 2004

Your Experience
Do you have experience collecting satisfaction surveys?
Using the data from satisfaction surveys?
What do these data tell you? Are they useful?


Objectives
Use statistical process control to evaluate the effectiveness of programs
Use satisfaction surveys and health status measures to examine the impact of care
Understand how program evaluations can go wrong


Evaluating Changing Programs

Ideal program evaluation
Program is constant
Patients are assigned randomly
Takes months to complete

Continuous Program Evaluation
On-going data and timely results
Allows continuous change in the program
Few evaluation requirements


Study Design
Multiple Observations of Two Groups

[Figure: outcome measure plotted against observations (0 to 10) for two groups, experimental and control]


Evaluation Measures

| Objective | Possible measures |
|---|---|
| The approval of the program and enrollment of the patients | Lapsed time from announcement of the funding to enrollment of first patient |
| Correspondence between what was approved and done | The percent of planned program objectives actually implemented |
| The demand for the proposed program | The ratio of patients to the service capacity |
| Description of the patients | The number and demographic background of patients |
| Satisfaction with the program | Patient satisfaction surveys |
| Provider satisfaction | Measures of conflict among providers and satisfaction with care |
| Impact on patient outcomes | Measures of mortality and morbidity or patient health status |


Why measure satisfaction?
Give customers a voice
Evaluate based on customer’s own values
Measure facets of care not easily examined
    compassionate bedside skills
    efficient attendance to needs
    participation in decision-making
    adequate communication and information
Change market share


Advantages
Not as rare as death or other adverse events
Available for all diseases and all institutions
Not likely to be affected by case mix, though affected by expectations
Positive outcome


Definition of Patient Satisfaction

Pascoe envisages patient satisfaction as healthcare recipients' reactions to their care.
A reaction that is composed of both a cognitive evaluation and an emotional response.
No right or wrong, all reactions are valid.


Process of Making Satisfaction Judgments

Each patient begins with a comparison standard against which care is judged.
The standard can be ideal care, a minimal expectation, an average of past experiences, or a sense of what one deserves.
The patient can assimilate discrepancies between expected and actual care.
What is not assimilated affects the patient's ratings of satisfaction.

We do not know how to measure expectations.


Examples of Satisfaction Surveys

Patient Satisfaction Questionnaire (PSQ)
Patient Judgments of Hospital Quality Questionnaire (PJHQ)
Medline database includes surveys specific to different clinical areas


PSQ Contains Eight Dimensions

Technical care
Interpersonal behavior
Access
Availability
Continuity of care
Environment
Finances


Patient Judgments of Hospital Quality Questionnaire

Nursing care
Medical care
Hospital environment
Information
Admission procedures
Discharge procedures
Finances


Constructing a Satisfaction Instrument

Let a focus group generate the questions.
Group the questions into general dimensions.
Survey a large sample of patients.
Drop questions with skewed distributions or with a high rate of missing responses.
Create a shorter version by dropping highly correlated items.
Check the validity of the questions by comparing responses to objective measures.
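
The screening steps above (drop items with skewed distributions or many missing responses, then drop highly correlated items) might be sketched as follows. This is only an illustration: the function name `screen_items`, the thresholds, and the sample item names are assumptions, not part of the original slides, and "skewed" is approximated here by the share of answers falling in a single rating category.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant item carries no correlation information
    return sxy / sqrt(sxx * syy)

def screen_items(responses, max_missing=0.2, max_top_share=0.8, max_corr=0.9):
    """responses maps item name -> list of ratings (None = missing),
    with respondents in the same order for every item.
    All three thresholds are illustrative assumptions."""
    kept = []
    for item, ratings in responses.items():
        answered = [r for r in ratings if r is not None]
        # Drop items with a high rate of missing responses.
        if 1 - len(answered) / len(ratings) > max_missing:
            continue
        # Drop items where almost everyone picks the same rating (skewed).
        top_share = max(answered.count(v) for v in set(answered)) / len(answered)
        if top_share > max_top_share:
            continue
        # Drop items that nearly duplicate an item already kept.
        redundant = False
        for other in kept:
            pairs = [(a, b) for a, b in zip(ratings, responses[other])
                     if a is not None and b is not None]
            if len(pairs) >= 3:
                r = pearson([p[0] for p in pairs], [p[1] for p in pairs])
                if abs(r) > max_corr:
                    redundant = True
                    break
        if not redundant:
            kept.append(item)
    return kept

# Hypothetical survey data: five items rated 1-5 by eight patients.
survey = {
    "nurse_courtesy": [5, 4, 3, 5, 2, 4, 1, 3],
    "nurse_respect":  [5, 4, 3, 5, 2, 4, 1, 3],      # duplicates nurse_courtesy
    "parking":        [5, 5, 5, 5, 5, 5, 5, 4],      # nearly all top ratings
    "billing":        [3, None, None, 2, None, 4, None, 1],  # half missing
    "wait_time":      [2, 5, 1, 3, 4, 2, 5, 3],
}
print(screen_items(survey))
```

In practice the thresholds would be set by the survey team, and a formal skewness statistic could replace the simple top-category share used here.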

Problems with Satisfaction Measures

Are patients satisfied with their overall care or with a specific agent of care?

Are reports of satisfaction biased by patients' respect, trust, confidence, and gratitude to their doctors, nurses, and healthcare?

Statistical analyses of satisfaction ratings suggest that technical and interpersonal dimensions are not always evaluated independently by patients.

Are we measuring quality of care or quality of the cure?

The earlier satisfaction is measured, the higher the satisfaction rating.

Most people are satisfied, so large databases are needed to detect small changes in satisfaction.

Focus on time to dissatisfied customer.
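
The slide does not define "time to dissatisfied customer". One plausible reading, offered here only as an assumption, is to count how many consecutive satisfied patients are seen before each dissatisfied one. Because dissatisfaction is rare, these counts can change noticeably even when the overall percentage of satisfied patients barely moves. The function name and the cutoff for "dissatisfied" below are hypothetical.

```python
def satisfied_runs(ratings, dissatisfied_below=3):
    """Count consecutive satisfied patients before each dissatisfied one.

    ratings: ratings in the order patients were surveyed.
    dissatisfied_below: ratings under this value count as dissatisfied
    (an assumed cutoff for a 1-5 scale).
    """
    runs = []
    count = 0
    for rating in ratings:
        if rating < dissatisfied_below:
            runs.append(count)  # close the current run of satisfied patients
            count = 0
        else:
            count += 1
    return runs  # a trailing, still-open run is not reported

# Hypothetical ratings from twelve consecutive patients.
ratings = [5, 4, 5, 2, 4, 4, 5, 5, 5, 1, 2, 4]
print(satisfied_runs(ratings))  # [3, 5, 0]
```

Longer runs after a program change would suggest that dissatisfied customers are becoming less frequent.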


Patients’ Health Status
Rely on client’s self report
Measure ability to function, not preferences for life styles
Available on numerous diseases across many institutions


Patients’ Health Status
Many disease-specific measures are available. See Medline for details
A widely used example is the general SF-36 and its shortened version SF-12


Components of SF-36

Physical Health
    Physical functioning
    Role - physical
    Bodily pain
    General health

Mental Health
    Vitality
    Social functioning
    Role - emotional
    Mental health


Reliability & Validity
More than 4000 studies
With few exceptions, overall reliability exceeding 0.70
Numerous studies showing the validity of the instrument in differentiating among sick and well patients
Translated and used widely


Reliability of Components

| Scale | Lowest Possible Score (Floor) | Highest Possible Score (Ceiling) | Reliability |
|---|---|---|---|
| Physical Functioning | Very limited in performing all physical activities, including bathing or dressing (0.8%) | Performs all types of physical activities including the most vigorous without limitations due to health (38.8%) | 0.93 |
| Role-Physical | Problems with work or other daily activities as a result of physical health (10.3%) | No problems with work or other daily activities (70.9%) | 0.89 |
| Bodily Pain | Very severe and extremely limiting pain (0.6%) | No pain or limitations due to pain (31.9%) | 0.90 |
| General Health | Evaluates personal health as poor and believes it is likely to get worse (0.0%) | Evaluates personal health as excellent (7.4%) | 0.81 |
| Vitality | Feels tired and worn out all of the time (0.5%) | Feels full of pep and energy all of the time (1.5%) | 0.86 |
| Social Functioning | Extreme and frequent interference with normal social activities due to physical and emotional problems (0.6%) | Performs normal social activities without interference due to physical or emotional problems (52.3%) | 0.68 |
| Role-Emotional | Problems with work or other daily activities as a result of emotional problems (9.6%) | No problems with work or other daily activities (71.0%) | 0.82 |
| Mental Health | Feelings of nervousness and depression all of the time (0.0%) | Feels peaceful, happy, and calm all of the time (0.2%) | 0.84 |
| Physical Component Summary | Limitations in self-care, physical, social, and role activities, severe bodily pain, frequent tiredness, health rated "poor" (0.0%) | No physical limitations, disabilities, or decrements in well-being, high energy level, health rated "excellent" (0.0%) | 0.92 |
| Mental Component Summary | Frequent psychological distress, social and role disability due to emotional problems, health rated "poor" (0.0%) | Frequent positive affect, absence of psychological distress and limitations in usual social/role activities due to emotional problems, health rated "excellent" (0.0%) | 0.88 |

Adapted from SF-36.org

Problems with Health Status Measures

Patients’ perception of the adequacy of their health status should depend on their lifestyle choices. SF-36 does not take this into account.


Assumptions in Use of Statistical Process Control Tools

Data need to be collected over multiple time periods and not just before and after the intervention

Patients are likely to be recruited over multiple time periods anyway

Data need to be collected from both the experimental and control groups


Use of Statistical Process Controls

Process limits are set based on our expectations

Calculated from patterns among outcomes of the control group

Observed rates are compared to expected limits


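Control Cases are Weighted Based on Their Similarity to Experimental Group

$$O = \frac{\sum_{j=1}^{M} W_j O_j}{\sum_{j=1}^{M} W_j}$$

$$S = \left[\frac{\sum_{j=1}^{M} W_j \,(O_j - O)^2}{\sum_{j=1}^{M} W_j - 1}\right]^{0.5}$$

$$\text{Upper limit} = O + 3S, \qquad \text{Lower limit} = O - 3S$$

Here $O_j$ is the outcome of control case $j$, $W_j$ is its similarity weight, and $M$ is the number of control cases.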
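
The weighted average, weighted standard deviation, and three-sigma limits above can be computed directly. The sketch below is a minimal illustration: the function names, the sample outcomes, and the weights are made up for the example, and it assumes the weights sum to more than 1 so that the denominator of S stays positive.

```python
from math import sqrt

def weighted_control_limits(outcomes, weights):
    """Return (lower limit, weighted mean O, upper limit) for control cases.

    outcomes: outcome O_j of each control case
    weights:  similarity weight W_j of each control case
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_w = sum(weights)
    if total_w <= 1:
        raise ValueError("sum of weights must exceed 1")
    # O = sum(W_j * O_j) / sum(W_j)
    mean = sum(w * o for w, o in zip(weights, outcomes)) / total_w
    # S = sqrt( sum(W_j * (O_j - O)^2) / (sum(W_j) - 1) )
    s = sqrt(sum(w * (o - mean) ** 2 for w, o in zip(weights, outcomes))
             / (total_w - 1))
    return mean - 3 * s, mean, mean + 3 * s

def outside_limits(observed, lower, upper):
    """Indices of observed outcomes falling outside the limits."""
    return [i for i, x in enumerate(observed) if x < lower or x > upper]

# Hypothetical control-group outcomes and similarity weights.
control_outcomes = [70, 80, 90]
control_weights = [1, 2, 1]
lower, center, upper = weighted_control_limits(control_outcomes, control_weights)
print(round(lower, 2), center, round(upper, 2))

# Hypothetical outcomes observed in the experimental group over four periods.
experimental = [78, 85, 110, 60]
print(outside_limits(experimental, lower, upper))  # [2]
```

Experimental observations outside the limits are the ones that depart from what the control group's pattern would lead one to expect.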


Limitation: poorly defined populations

Threat of poorly defined populations. It is not reasonable to compare patients and organizations that differ so widely from each other.

How is this threat addressed? The proposed method weighs cases similar to the program site more heavily. It formalizes what is implicitly always done: it allows the investigators to specify the characteristics on which cases must be matched.
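
The slides do not say how the similarity weight of a control case is calculated. One simple possibility, shown here purely as an assumption, is the fraction of investigator-specified characteristics on which a control case matches the program site. The function name, the characteristic names, and the sample values are hypothetical.

```python
def similarity_weight(case, program_profile, characteristics):
    """Fraction of the chosen characteristics on which a control case
    matches the program site (0 = no match, 1 = full match).

    This matching rule is an illustrative assumption, not the method
    described in the slides.
    """
    matches = sum(1 for c in characteristics
                  if case.get(c) == program_profile.get(c))
    return matches / len(characteristics)

# Characteristics the investigators chose to match on (hypothetical).
characteristics = ["age_group", "sex", "diagnosis"]
program_profile = {"age_group": "65+", "sex": "F", "diagnosis": "CHF"}

control_cases = [
    {"age_group": "65+", "sex": "F", "diagnosis": "CHF"},       # full match
    {"age_group": "65+", "sex": "M", "diagnosis": "CHF"},       # two of three
    {"age_group": "18-44", "sex": "M", "diagnosis": "asthma"},  # no match
]
weights = [similarity_weight(c, program_profile, characteristics)
           for c in control_cases]
print(weights)
```

Weights of this kind could serve as the W_j values when averaging control outcomes, so that cases resembling the program site count more.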


Limitation: treatment contamination

Potential threat of treatment contamination. Typically one expects that well-defined interventions be used to address construct validity and to protect against diffusion, contamination, and imitation of treatments by the comparison groups. This ensures that any "significant" difference can indeed be ascribed to a specific intervention.

How is this threat addressed? In the proposed evaluation, if the program has led to improved outcomes, then improvements will be detected when data are compared to historical trends. Furthermore, since cycles of improvement are introduced at specific points in time, we will be able to attribute the improvement to a specific modification of the program.


Limitation: poorly defined control group

Potential threat of poorly defined control group. Changes in the "control" group may introduce biases that limit our understanding of maturation and regression towards the mean in the control group. Moreover, they might generate variability.

How is this threat addressed? If the observed improvement is due to aging or maturation of the clients, such differences are also occurring in the control cases. Similarly, regression towards the mean occurs for both the program and the control cases. So the proposed design protects against both maturation and regression towards the mean.


Limitation: baseline differences

Potential threat of baseline differences. Clients differ in their baseline values. Some are at high risk and others are at low risk for incidence of adverse outcomes. Where you end up depends in part on where you started. To ignore these differences would brand effective programs as useless.

How is this threat addressed? We gather data on baseline. Patients are asked to provide data prior to the intervention and after the intervention. The analysis compares outcomes to historical trends.
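
One way to use the before-and-after data is to work with each patient's change from baseline rather than the final score alone, and to track the average change period by period. The sketch below assumes this change-score approach, which the slides do not spell out; the function name and the sample records are hypothetical.

```python
def mean_change_by_period(records):
    """Average change from baseline for each time period.

    records: list of (period, baseline_score, follow_up_score) tuples.
    Returns a dict mapping period -> mean (follow-up minus baseline).
    """
    totals = {}
    for period, before, after in records:
        running_sum, n = totals.get(period, (0.0, 0))
        totals[period] = (running_sum + (after - before), n + 1)
    return {p: s / n for p, (s, n) in sorted(totals.items())}

# Hypothetical health status scores for patients enrolled in two periods.
records = [
    (1, 40, 50),  # low baseline, gains 10 points
    (1, 60, 64),  # high baseline, gains 4 points
    (2, 30, 45),
    (2, 70, 75),
]
print(mean_change_by_period(records))  # {1: 7.0, 2: 10.0}
```

The resulting series of period averages is what would then be compared against the historical trend.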


Limitation: not having a replication

Potential threat of not having a replication. In the Western scientific method, there is an absolute requirement for replication of scientific findings. Without replication, it is impossible to refute the claim that any "result" was simply due to chance variation among study subjects.

How is this threat addressed? Though the intervention is changed, there is a replication of a different sort going on in the proposed approach. If outcomes have not changed, then we consider the intervention to be a replication of the earlier one.

Take Home Lessons
You can evaluate outcomes of changing programs (e.g. patient satisfaction and patients’ health status) using statistical process controls

