
EVALUATION PARADIGMS IN MEDICAL EDUCATION

Zarrin Seema Siddiqui

Department of Medical Education, University of Western Australia, WA 6009, Perth, Australia.

Correspondence: Dr. Zarrin Seema Siddiqui, Lecturer, Medical Education, Faculty of Medicine and Dentistry, University of Western Australia, MBDP: M515, 1st Floor N Block, SCGH, Nedlands WA 6009, Perth, Australia. E-mail: [email protected]

Received July 21, 2005; accepted February 17, 2006.

JCPSP 2006, Vol. 16 (4): 291-293

ABSTRACT

Medical education is regularly challenged with new and innovative ideas in the fields of curricula, teaching-learning processes and assessment. Evaluation of these approaches and techniques provides vital information regarding their subsequent use and application to enhance the quality of learning experiences for students. It is therefore essential to choose an evaluation approach/model that provides meaningful and valid information to the concerned stakeholders. Here, two major paradigms of evaluation, i.e. scientific and naturalistic, are discussed with emphasis on their use, strengths and limitations. It is concluded that no single paradigm is superior to the other, and the ultimate choice is left to the evaluator, depending on the purpose and the questions that need to be answered through the evaluation.

KEY WORDS: Evaluation. Medical education. Educational intervention.

INTRODUCTION

Medical education worldwide has undergone important changes and shown notable advancement in recent years.1 Various interventions have been introduced, and a need has been recognised to apply evidence-based approaches to the impact of these interventions.2 Evaluation thus becomes an integral part of these interventions. It implies a methodology that allows us to look at the results of what has been done and to influence, in an effective way, the actions going forward.3 Surprisingly, evaluation is one particular aspect of the educational cycle that we, as human beings, are constantly performing in one way or another: we judge a television programme, a sporting event, a new department store or simply our colleagues to determine their worth, merit or significance. In the context of education, however, evaluation is used to determine the effectiveness of programs and to ascertain that the objectives/outcomes have been achieved. This provides information to program staff and stakeholders in order to identify changes that can further improve program effectiveness.

Educational experts have suggested many evaluation models to achieve the above-mentioned aims, each having its own limitations and strengths.4 In this article, two major paradigms of evaluation used in medical education, i.e. scientific and naturalistic, are discussed with reference to their key emphasis, methodology and analysis. The strengths and limitations associated with each paradigm are also identified.

SCIENTIFIC PARADIGM

There was great emphasis on scientific research methods in the initial stages, when efforts were being made to introduce program evaluation as a separate discipline. The basic idea behind the scientific method is planned assessment of program effects by means of scientific measurement. It therefore refers to a model grounded in measuring learning attainment. Behavioural changes are measured against predefined learning outcomes and provide objective information to assist further development. The paradigm is therefore focussed on the predetermined objectives and the final outcomes rather than on intervening in the learning process.5

METHODOLOGY

There are two general categories.

1. TRUE EXPERIMENTAL DESIGN: The subjects are randomly assigned to program and comparison groups. Randomisation is the vital ingredient of this process, ensuring that the groups are comparable and that observed differences in outcomes are not the result of extraneous factors or pre-existing differences (a minimal numerical sketch of this and the following designs is given after the list). For example, a study was conducted to determine the impact of an educational intervention on students' learning of clinical research methodology, in which participants were divided into control and experimental groups to compare the outcomes.6 One of the limitations noted by the author was the exchange of learning material among students in the two groups, which may have influenced the results.

2. QUASI-EXPERIMENTAL DESIGN: If the researcher feels that randomisation is not possible or practical, a quasi-experimental design is recommended, which may be of the following types:

(a) NON-EQUIVALENT GROUP, POST-TEST ONLY: A study was conducted to compare the knowledge scores of medical students in a problem-based learning (PBL) and a traditional curriculum on public health topics. The results showed that PBL students were significantly more successful in the knowledge test used.7 Here only the outcome measure is used for comparison. This, however, does not rule out the possibility that one group was already better than the other before the experiment, or that other influential factors were at work.



(b) NON-EQUIVALENT GROUP, PRE-TEST POST-TEST: The pre- and post-test design partially eliminates the limitation discussed above. Still, there may be problems resulting from students in the control group being exposed to the experimental condition.

(c) POST-TEST ONLY, CONTROL GROUP: When a large number of students or teachers is involved, it is practically impossible and time-consuming to administer both pre- and post-tests; this design is therefore used.

(d) TIME SERIES DESIGN: Several measurements are taken from both groups before and after the experimental treatment. This provides a more reliable evaluation, although the earlier problems may still occur if the groups differ at the onset of the evaluation.

(e) ASSESSMENT OF TREATMENT GROUP ONLY: In this form of evaluation only the treatment group is considered; yet without information about what would have occurred in the absence of the intervention, it is hard to know whether the program has actually had any impact.
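The distinction between these designs can be made concrete with a small numerical sketch. The Python fragment below is purely illustrative: the score data are invented and do not come from the studies cited above. It shows random assignment followed by a post-test comparison for a true experimental design, and a comparison of score gains for a non-equivalent group pre-test/post-test design.

```python
import random
import statistics

def randomise(students):
    """True experimental design: shuffle and split subjects into two comparable groups."""
    shuffled = random.sample(students, k=len(students))
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]  # experimental, control

# Invented post-test scores for a randomised two-group comparison.
students = list(range(20))                                  # 20 anonymised student IDs
experimental, control = randomise(students)
post_test = {sid: random.gauss(70, 8) for sid in students}  # placeholder scores
difference = (statistics.mean(post_test[s] for s in experimental)
              - statistics.mean(post_test[s] for s in control))
print(f"True experimental design, post-test difference: {difference:.1f}")

# Non-equivalent group, pre-test/post-test design: the groups are pre-existing,
# so score gains rather than raw post-test scores are compared.
pbl_pre, pbl_post = [55, 60, 58], [72, 75, 70]              # invented scores
trad_pre, trad_post = [57, 59, 61], [66, 68, 65]
pbl_gain = statistics.mean(post - pre for pre, post in zip(pbl_pre, pbl_post))
trad_gain = statistics.mean(post - pre for pre, post in zip(trad_pre, trad_post))
print(f"Gain (PBL): {pbl_gain:.1f}, gain (traditional): {trad_gain:.1f}")
```

A real evaluation would, of course, apply proper inferential statistics to such data and attend to the comparability and contamination issues noted above; the sketch only shows where randomisation and the pre-test enter the design.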

STRENGTHS

Experimental designs are especially useful in addressing evaluation questions about the effectiveness and impact of programs.8 With the use of comparative data as a context for interpreting findings, experimental designs increase our confidence that observed outcomes are the result of a given program or innovation rather than a function of extraneous variables or events. For example, experimental designs may help to answer questions such as:

Would adopting a new integrated educational program improve student performance?

Is problem-based learning having a positive impact on student achievement and faculty satisfaction?

How is the professional development program influencing teachers' collegiality and classroom practice?

LIMITATIONS

The main limitation associated with this paradigm is that objectives are sometimes difficult to predict, or may change as the course proceeds, and unintended learning may be more important than the expected outcome of a learning program.5 The use of randomised controlled trials is especially limited in medical education for a number of reasons, including the lack of appropriate resources and funding.9,10 Other limitations are the need to define the parameters of the experiment precisely and to control variables, which can literally be done only in laboratories; a genuine control is impossible. Practical difficulties in separating groups often result in contamination of designs.11 Withholding the intervention from the control group also raises ethical issues. Sample reliability and the lack of a control group, as mentioned earlier, may also affect the results of an evaluation. Finally, even when the purpose of the evaluation is to assess the impact of a program, logistical and feasibility issues constrain the whole framework.

NATURALISTIC PARADIGM

This model is derived from the ethnographic methodologies developed by anthropologists. The rationale behind it was that no other model really captures the context of a program, which involves students, their families, teaching staff and other surrounding elements of the community. The naturalistic model provides detailed information about individuals, groups or institutions as they occur naturally. Compared to the experimental model, the naturalistic model considers and values the positions of multiple audiences. It does not rely solely on numerical data, and it focuses on program activity rather than intent. For example, in 1996-97 a structured training course was organised by a British university for Hungarian medical teachers. To assess the relevance of this course to the needs of the participants, the evaluator adopted a naturalistic approach using in-depth interviews. These interviews were triangulated using observation and documentary analysis. Since the aim of the evaluation was to construct the meaning of the value of the MSc experience for the foreign participants, this seemed to be the most appropriate model.12

METHODOLOGY

The naturalistic model relies mainly on qualitative data and analysis. Four important forms of qualitative analysis commonly adopted within the naturalistic model13 are phenomenological analysis, content analysis, analytic induction and constant comparative analysis.

The evaluator attempts to look for information that can be identified across multiple data sources and methods. Categories or themes are identified and the relationships among categories are established. Finally, further evidence to support the categories and relationships is collected.
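As a rough illustration of this cross-source approach (not the procedure of any of the studies cited here), the sketch below tallies hypothetical coded themes from interviews, observations and documents and flags themes that appear in more than one source, which is the sense in which triangulation strengthens a category.

```python
from collections import Counter

# Hypothetical coded excerpts: each data source maps to the themes an evaluator
# has assigned while reading transcripts, field notes and programme documents.
coded_data = {
    "interviews":   ["relevance", "workload", "feedback", "relevance"],
    "observations": ["feedback", "group dynamics", "relevance"],
    "documents":    ["workload", "relevance"],
}

theme_counts = Counter(theme for themes in coded_data.values() for theme in themes)
sources_per_theme = {
    theme: [src for src, themes in coded_data.items() if theme in themes]
    for theme in theme_counts
}

# Themes supported by more than one source are candidates for well-triangulated
# categories; single-source themes call for further evidence before being reported.
for theme, sources in sorted(sources_per_theme.items()):
    status = "triangulated" if len(sources) > 1 else "needs further evidence"
    print(f"{theme}: {theme_counts[theme]} mentions across {sources} ({status})")
```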

Dornan et al. used phenomenological analysis in their study to evaluate how clinicians perceive their role in problem-based medical education and how closely those perceptions match the curriculum they are teaching.14 In another study, the role of the autopsy in the modern undergraduate curriculum was investigated using content analysis within a theoretical sample.15

In addition to qualitative data in narrative form, quantitative data may also be included in the analysis; nevertheless, the model is highly reliant on the expertise of the evaluator to interpret the data, determine the significance of the results and draw conclusions.

STRENGTHS

Naturalistic methods have greater validity because they encourage multiple data types and sources. This yields rich and timely information about important aspects of program implementation, the interaction between various stakeholders, the problems encountered by program staff, and so on. It also decreases the possibility of missing intended effects and offers some degree of flexibility to the evaluator, which differentiates it from the experimental model.


LIMITATIONS

There is a high degree of reliance on the subjectivity and reliability of human observers; their personal observations and intuitive analysis can therefore lead to biased opinions. Similarly, data collection and analysis become laborious tasks and may potentially be very expensive.

DISCUSSION

These two major paradigms have been debated for a long period. One group favours experimental methods for program evaluation on the basis that qualitative methods often provide misleading information, which may be detrimental to the decision-making process. The other group argues that qualitative methods best serve the purpose of evaluation and rejects any merit of experimental or quasi-experimental methods. They see the qualitative model as highly responsive and emphasise that an entire evaluation can be performed through naturalistic methods of information collection.16,17

An interesting debate between Robert F. Boruch and Milbrey W. McLaughlin has been reported in the literature.18 The occasion was the annual meeting of the Evaluation Network and the Evaluation Research Society in the United States. The issue for discussion was the recommendation to the Congress and the Department of Education regarding mandatory use of experimental methods, where appropriate, by the federal government. Boruch argued that field tests are essential for an unbiased estimate of effects. He further stressed that the quality of an evaluation will be enhanced provided experimental methods are implemented at federal or state levels. McLaughlin, on the contrary, pointed out the rigidity of experimental methods, which makes an evaluator ask the wrong questions and use the wrong measures, thus failing to provide valid information for policy and practice. It is evident from the debate that individuals within each paradigm (naturalistic or scientific), when evaluating the other paradigm, may overlook the strengths of that model, while both paradigms have their own pros and cons. It is finally left to evaluators to make the choice, considering a series of questions (Table I) based on Stufflebeam's four essential attributes for a sound and fair programme evaluation.19 Alternatively, an evaluation matrix may be developed in which the set of research questions is tabulated against a selection of possible analysis tools.20 This enables evaluators to select the most appropriate and feasible method.
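One simple way to realise such a matrix is sketched below. The questions, tools and ratings are invented placeholders (they are not drawn from Reeves' matrix tool); the point is only to show how tabulating questions against candidate methods makes the trade-offs visible.

```python
# Hypothetical evaluation matrix: each research question is scored against
# candidate data-collection/analysis tools (0 = unsuitable, 2 = well suited).
questions = [
    "Does the new integrated program improve exam performance?",
    "How do clinical teachers experience the PBL tutorials?",
]
tools = ["controlled comparison", "in-depth interviews", "document analysis"]

# Ratings assigned by the evaluator after weighing utility, feasibility,
# propriety and accuracy (Table I); the values here are illustrative only.
matrix = {
    questions[0]: {"controlled comparison": 2, "in-depth interviews": 1, "document analysis": 0},
    questions[1]: {"controlled comparison": 0, "in-depth interviews": 2, "document analysis": 1},
}

for question, ratings in matrix.items():
    best_tool = max(tools, key=lambda tool: ratings[tool])
    print(f"{question}\n  -> most appropriate and feasible tool: {best_tool}")
```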

CONCLUSION

No single model is comprehensive enough for conducting an effective evaluation. It is worthwhile to try a combination of methods when the situation allows, depending upon the purpose of the evaluation. However, if the key concern is a program's impact on participant outcomes, or if multiple programs are being compared with regard to their effectiveness for decision-making, experimental designs seem appropriate. Otherwise, the evaluation should be enriched with detailed information to enable timely and responsive feedback for the stakeholders.

REFERENCES

1. Yolsal N, Bulut A, Karabey S, Ortayli N, Bahadir G, Aydin Z. Development of training of trainers programmes and evaluation of their effectiveness in Istanbul, Turkey. Med Teach 2003; 25: 319-24.

2. Reed D, Price EG, Windish DM, Wright SM, Gozu A, Hsu EB, et al. Challenges in systematic reviews of educational intervention studies. Ann Intern Med 2005; 142: 1080-9.

3. Wolfensohn J. Evaluation and poverty reduction conference. Washington DC: The World Bank, 1999.

4. Stufflebeam DL. Evaluation models: new directions for evaluation. San Francisco: Jossey-Bass, 2001.

5. Brigley S, Littlejohns P, Young Y, McEwan J. Continuing medical education: the question of evaluation. Med Educ 1997; 31: 67-71.

6. Pelayo-Alvarez M, Albert-Rox X, Gil-Latore F, Gutierrez-Sigler D. Feasibility analysis of a personalised training plan for learning research methodology. Med Educ 2000; 34: 139-45.

7. Gurpinar E, Musal B, Aksakoglu G, Ucku R. Comparison of knowledge scores of medical students in problem-based learning and traditional curriculum on public health topics. BMC Med Educ 2005; 5: 7.

8. Barry G, Herman J. True and quasi-experimental designs. Pract Assess Res Eval 1997; 5(14).

9. Smith M. The whole is greater: combining qualitative and quantitative approaches in evaluation studies. In: Naturalistic evaluation. New directions for program evaluation. San Francisco: Jossey-Bass, 1986.

10. Wilkes M, Bligh J. Evaluating educational interventions. BMJ 1999; 318: 1269-72.

11. Kember D. To control or not to control: the question of whether experimental designs are appropriate for evaluating teaching innovations in higher education. Assess Eval Higher Educ 2003; 28: 89-101.

12. Bolden K, Willoughby SA, Claridge MT, Lewis AP. Training Hungarian primary healthcare teachers: the relevance of a UK postgraduate course for health educators. Med Educ 2000; 34: 61-5.

13. Payne D. Designing educational project and program evaluations: a practical overview based on research and experience. Boston: Kluwer Academic Publishers, 1994.

14. Dornan T, Scherpbier A, King N, Boshuizen H. Clinical teachers and problem-based learning: a phenomenological study. Med Educ 2005; 39: 163-70.

15. Burton J. The autopsy in modern undergraduate medical education: a qualitative study of uses and curriculum considerations. Med Educ 2003; 37: 1073-81.

16. Parlett M, Hamilton D. Evaluation and illumination: a new approach to the study of innovative programs. In: Beyond the numbers game. Berkeley, CA: McCutchan, 1978.

17. Guba E, Lincoln Y. Effective evaluation: improving the usefulness of evaluation results through responsive and naturalistic approaches. San Francisco: Jossey-Bass, 1981.

18. Davis B. Boruch and McLaughlin debate. Eval News 1982; 3: 11-20.

19. Curran V, Christopher J, Lemire F, Collins A, Barrett B. Application of a responsive evaluation approach in medical education. Med Educ 2003; 37: 256-66.

20. Reeves T. Evaluation matrix. Available at: http://mime1.marc.gatech.edu/MM_Tools/EM.html. Accessed 29 September 2005.


Table I: Essential questions for evaluation.

Utility: Will the evaluation model be useful in serving the information needs of the intended users?

Feasibility: Will the evaluation model follow practical and feasible means for collecting evaluative information?

Propriety: Will the evaluation model be conducted in an ethical manner, with due regard for the welfare of those involved in the evaluation, as well as those affected by its results?

Accuracy: Will the evaluation model convey technically adequate information about the features that determine the worth or merit of the programme being evaluated?
