Evaluating Program Outcomes
Farrokh Alemi, Ph.D.
July 04, 2004
Your Experience
Do you have experience collecting satisfaction surveys?
Using the data from satisfaction surveys?
What do these data tell you? Are they useful?
Objectives
Use statistical process control to evaluate the effectiveness of programs
Use satisfaction surveys and health status measures to examine the impact of care
Understand how program evaluations can go wrong
Evaluating Changing Programs
Ideal Program Evaluation
  Program is constant
  Patients are assigned randomly
  Takes months to complete
Continuous Program Evaluation
  On-going data and timely results
  Allows continuous change in the program
  Few evaluation requirements
Study Design
Multiple Observations of Two Groups
[Figure: outcome measure (vertical axis) plotted against observations 0 to 10 (horizontal axis) for the experimental and control groups]
Evaluation Measures
Objective | Possible measures
The approval of the program and enrollment of the patients | Lapsed time from announcement of the funding to enrollment of first patient
Correspondence between what was approved and done | The percent of program objectives planned actually implemented
The demand for the proposed program | The ratio of patients to the service capacity
Description of the patients | The number and demographic background of patients
Satisfaction with the program | Patient satisfaction surveys
Provider satisfaction | Measures of conflict among providers and satisfaction with care
Impact on patient outcomes | Measures of mortality and morbidity or patient health status
Why measure satisfaction?
Give customers a voice
Evaluate based on customer's own values
Measure facets of care not easily examined
  compassionate bedside skills
  efficient attendance to needs
  participation in decision-making
  adequate communication and information
Change market share
Advantage
Not as rare as death or other adverse events
Available for all diseases and all institutions
Not likely to be affected by case mix, though affected by expectations
Positive outcome
Definition of Patient Satisfaction
Pascoe envisages patient satisfaction as healthcare recipients' reactions to their care.
A reaction that is composed of both a cognitive evaluation and an emotional response.
No right or wrong; all reactions are valid.
Process of Making Satisfaction Judgments
Each patient begins with a comparison standard against which care is judged.
The standard can be an ideal care, a minimal expectation, an average of past experiences, or a sense of what one deserves.
The patient can assimilate discrepancies between this expected and actual care.
What is not assimilated affects patient ratings of satisfaction.
We do not know how to measure expectations.
Examples of Satisfaction Surveys
Patient Satisfaction Questionnaire (PSQ)
Patient Judgments of Hospital Quality Questionnaire (PJHQ)
Medline database includes surveys specific to different clinical areas
PSQ Contains Eight Dimensions
Technical care
Interpersonal behavior
Access
Availability
Continuity of care
Environment
Finances
Patient Judgments of Hospital Quality Questionnaire
Nursing care
Medical care
Hospital environment
Information
Admission procedures
Discharge procedures
Finances
Constructing a Satisfaction Instrument
Let a focus group generate the questions.
Group the questions into general dimensions.
Survey a large sample of patients.
Drop questions with skewed distributions or with a high rate of missing responses.
Create a shorter version by dropping highly correlated items.
Establish the construct validity of the questions by comparing responses to objective measures.
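The item-screening steps above (dropping skewed items, items with many missing responses, and highly correlated items) can be sketched roughly as follows. This is only an illustration, not the author's procedure: the function names, the data layout (question id mapped to a list of answers, with None for a missing answer), and the three cutoffs are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness (third standardized moment) of the answers."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return 0.0
    return mean(((v - m) / s) ** 3 for v in values)


def pearson(xs, ys):
    """Pearson correlation over respondents who answered both questions."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def screen_items(responses, max_missing=0.2, max_abs_skew=1.0, max_corr=0.9):
    """Return the question ids that survive screening, in their original order.

    The cutoffs are hypothetical defaults, not values from the slides.
    """
    kept = []
    for item, answers in responses.items():
        present = [a for a in answers if a is not None]
        missing_rate = 1 - len(present) / len(answers)
        # Drop items with a high rate of missing responses.
        if not present or missing_rate > max_missing:
            continue
        # Drop items whose answers are heavily skewed.
        if abs(skewness(present)) > max_abs_skew:
            continue
        # Drop items that nearly duplicate an item already kept.
        if any(abs(pearson(answers, responses[k])) > max_corr for k in kept):
            continue
        kept.append(item)
    return kept
```

Because items are compared only against those already kept, the order of the questions decides which member of a highly correlated pair survives; a real instrument would make that choice on substantive grounds.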
Problems with Satisfaction Measures
Are patients satisfied with their overall care or a specific agent of care?
Problems with Satisfaction Measures
Are reports of satisfaction biased by patients' respect, trust, confidence, and gratitude to their doctors, nurses, and healthcare?
Problems with Satisfaction Measures
Statistical analyses of satisfaction ratings suggest that technical and interpersonal dimensions are not always evaluated independently by patients
Are we measuring quality of care or quality of the cure?
Problems with Satisfaction Measures
The earlier satisfaction is measured, the higher the satisfaction rating
Problems with Satisfaction Measures
Most people are satisfied, so large databases are needed to detect the small changes that reveal a program's impact
Focus on time to dissatisfied customer
Patients' Health Status
Rely on client's self report
Measure ability to function, not preferences for lifestyles
Available on numerous diseases across many institutions
Patients' Health Status
Many disease-specific measures are available. See Medline for details
A widely used example is the general SF-36 and its shortened version SF-12
Components of SF-36
SF-36
  Physical Health
    Physical functioning
    Role - physical
    Bodily pain
    General health
  Mental Health
    Vitality
    Social functioning
    Role - emotional
    Mental health
Reliability & Validity
More than 4000 studies
With few exceptions, overall reliability exceeding 70%
Numerous studies showing the validity of the instrument in differentiating among sick and well patients
Translated and used widely
Reliability of Components
Scale | Lowest Possible Score (Floor) | Highest Possible Score (Ceiling) | Reliability
Physical Functioning | Very limited in performing all physical activities, including bathing or dressing (0.8%) | Performs all types of physical activities including the most vigorous without limitations due to health (38.8%) | 0.93
Role-Physical | Problems with work or other daily activities as a result of physical health (10.3%) | No problems with work or other daily activities (70.9%) | 0.89
Bodily Pain | Very severe and extremely limiting pain (0.6%) | No pain or limitations due to pain (31.9%) | 0.90
General Health | Evaluates personal health as poor and believes it is likely to get worse (0.0%) | Evaluates personal health as excellent (7.4%) | 0.81
Vitality | Feels tired and worn out all of the time (0.5%) | Feels full of pep and energy all of the time (1.5%) | 0.86
Social Functioning | Extreme and frequent interference with normal social activities due to physical and emotional problems (0.6%) | Performs normal social activities without interference due to physical or emotional problems (52.3%) | 0.68
Role-Emotional | Problems with work or other daily activities as a result of emotional problems (9.6%) | No problems with work or other daily activities (71.0%) | 0.82
Mental Health | Feelings of nervousness and depression all of the time (0.0%) | Feels peaceful, happy, and calm all of the time (0.2%) | 0.84
Physical Component Summary | Limitations in self-care, physical, social, and role activities, severe bodily pain, frequent tiredness, health rated "poor" (0.0%) | No physical limitations, disabilities, or decrements in well-being, high energy level, health rated "excellent" (0.0%) | 0.92
Mental Component Summary | Frequent psychological distress, social and role disability due to emotional problems, health rated "poor" (0.0%) | Frequent positive affect, absence of psychological distress and limitations in usual social/role activities due to emotional problems, health rated "excellent" (0.0%) | 0.88
Adapted from SF-36.org
Problems with Health Status Measures
Patients' perception of adequacy of their health status should depend on their lifestyle choices. SF-36 does not take this into account.
Assumptions in Use of Statistical Process Control Tools
Data need to be collected over multiple time periods and not just before and after the intervention
  Patients are likely to be recruited over multiple time periods anyway
Data need to be collected from both the experimental and control groups
Use of Statistical Process Controls
Process limits are set based on our expectations
Calculated from patterns among outcomes of the control group
Observed rates are compared to expected limits
Control Cases are Weighted Based on Their Similarity to Experimental Group
$O = \frac{\sum_{j=1}^{M} W_j O_j}{\sum_{j=1}^{M} W_j}$
$S = \left[ \frac{\sum_{j=1}^{M} W_j (O_j - O)^2}{-1 + \sum_{j=1}^{M} W_j} \right]^{0.5}$
Upper limit $= O + 3S$
Lower limit $= O - 3S$
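A minimal sketch of the weighted limits above, assuming that O_j is the outcome of control case j, W_j is its similarity weight, and M is the number of control cases (the slide does not define the symbols, so this reading is an interpretation). The function names and the `sigmas` parameter are hypothetical.

```python
from math import sqrt


def weighted_control_limits(outcomes, weights, sigmas=3.0):
    """Return (lower limit, weighted mean O, upper limit) from control cases.

    outcomes: O_j for each control case (assumed meaning)
    weights:  W_j, the similarity of each control case to the experimental group
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_w = sum(weights)
    if total_w <= 1:
        raise ValueError("sum of weights must exceed 1 for S to be defined")
    # O = sum(W_j * O_j) / sum(W_j)
    center = sum(w * o for w, o in zip(weights, outcomes)) / total_w
    # S = [sum(W_j * (O_j - O)^2) / (sum(W_j) - 1)] ^ 0.5
    variance = sum(w * (o - center) ** 2 for w, o in zip(weights, outcomes)) / (total_w - 1)
    s = sqrt(variance)
    return center - sigmas * s, center, center + sigmas * s


def outside_limits(observed, lower, upper):
    """List (index, value) for experimental observations beyond the limits."""
    return [(i, x) for i, x in enumerate(observed) if x < lower or x > upper]


# Usage with made-up numbers, for illustration only.
lower, center, upper = weighted_control_limits([10, 12, 14], [1, 1, 1])
flagged = outside_limits([7, 19, 12], lower, upper)
```

Observed rates in the experimental group that fall outside the limits are the ones that depart from what the weighted control cases lead us to expect.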
Limitation: poorly defined populations
Threat of poorly defined populations. It is not reasonable to compare patients and organizations with such differences to each other.
How is this threat addressed? The proposed method weighs cases similar to the program site more heavily. It formalizes what is implicitly always done: it allows the investigators to specify the characteristics on which cases must be matched.
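The slides do not say how similarity is turned into a weight. One illustrative possibility, assumed here and not taken from the source, is to weight each control case by the share of investigator-specified characteristics on which it matches the program site. The function name and the characteristic names are hypothetical.

```python
def similarity_weight(control_case, program_site, characteristics):
    """Fraction of the specified characteristics on which the case matches the site.

    This matching rule is an assumption made for illustration; the source
    only states that similar cases are weighted more heavily.
    """
    if not characteristics:
        raise ValueError("at least one characteristic is required")
    matches = sum(
        1 for c in characteristics if control_case.get(c) == program_site.get(c)
    )
    return matches / len(characteristics)


# Usage with made-up characteristics.
site = {"age_group": "65+", "diagnosis": "diabetes", "setting": "outpatient"}
case = {"age_group": "65+", "diagnosis": "asthma", "setting": "outpatient"}
weight = similarity_weight(case, site, ["age_group", "diagnosis", "setting"])
```

A case that matches on every characteristic gets weight 1 and a case that matches on none gets weight 0, so dissimilar cases contribute little or nothing to the expected limits.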
Limitation: treatment contamination
Potential threat of treatment contamination. Typically one expects well-defined interventions to be used to address construct validity and to protect against diffusion, contamination, and imitation of treatments by the comparison groups. This ensures that any "significant" difference can indeed be ascribed to a specific intervention.
How is this threat addressed? In the proposed evaluation, if the program has led to improved outcomes, then improvements will be detected when data are compared to historical trends. Furthermore, since cycles of improvement are introduced at specific points in time, we will be able to attribute the improvement to specific modifications of the program.
Limitation: poorly defined control group
Potential threat of poorly defined control group. Changes in "control" may introduce biases that limit our understanding of maturation and regression towards the mean in the control group. Moreover, they might generate variability.
How is this threat addressed? If the observed improvement is due to aging or maturation of the clients, such differences are also occurring in the control cases. Similarly, regression towards the mean occurs for both the program and the control cases. So the proposed design protects against both maturation and regression towards the mean.
Limitation: baseline differences
Potential threat of baseline differences. Clients differ in their baseline values. Some are at high risk and others are at low risk for incidence of adverse outcomes. Where you end up depends in part on where you started. To ignore these differences would brand effective programs as useless.
How is this threat addressed? We gather baseline data. Patients are asked to provide data prior to the intervention and after the intervention. The analysis compares outcomes to historical trends.
Limitation: not having a replication
Potential threat of not having a replication. In the Western scientific method, there is an absolute requirement for replication of scientific findings. Without replication, it is impossible to refute the claim that any "result" was simply due to chance variation among study subjects.
How is this threat addressed? Though the intervention changes, there is a replication of a different sort going on in the proposed approach. If outcomes have not changed, then we consider the intervention to be a replication of the earlier one.
Take Home Lessons
You can evaluate outcomes of changing programs (e.g., patient satisfaction and patients' health status) using statistical process controls
Limitation: