Comparing the Effectiveness of Evaluating
Practical Capabilities Through Hands-On On-Line
Exercises Versus Conventional Methods
Isabel Garcia, Alfonso Duran and Manuel
[email protected], [email protected], [email protected]
Abstract - Two interrelated methodological transformations involved in the current transition of European universities towards the European Higher Education Area (EHEA) are the role of applied capabilities and the evaluation process. In this context, this paper presents the results of a structured comparison, spanning five consecutive editions of a course, of the impact of alternative evaluation methods in courses aimed at the development of applied engineering capabilities. The comparison perspective is twofold: how accurately the evaluation method measures the competence level attained by the students, and how it affects their active learning. The experiment was conducted in a simulation course from the Industrial Engineering curriculum, and its aim was to evaluate the capability of using simulation software. Evaluation was traditionally based on a written final exam; two other evaluation methods were then introduced: a computer-based exam and a team project assignment. The assessment of the evaluation methods was carried out by both faculty members and students (through anonymous surveys). Results suggest that both group assignments and the computer exam perform far better, in this environment, than written exams. The comparison between group assignments and the computer exam is less straightforward, being dependent on which criterion is being appraised.
Index Terms – Evaluating capabilities, on-line testing, evaluation methodologies, problem-based learning.
INTRODUCTION
The current transition of European universities towards the European Higher Education Area (EHEA) requires a move towards student-centered higher education and away from teacher-driven provision, as well as a renewed emphasis on employability and the development of transferable skills and capabilities [1], [2], [3]. Out of the many methodological transformations involved, two significant and interrelated components are the role of applied capabilities and the assessment of learning outcomes.
EHEA’s recommendations encourage a shift from the highly
theoretical approach widespread in most national higher
education systems, such as the Spanish university system,
towards placing a higher emphasis on applied capabilities.
That is in turn related to the major overhaul proposed for the evaluation procedures; the currently prevailing approach, based solely on written final exams, is postulated to encourage rote learning and to be inappropriate for appraising applied capabilities. According to the European University Association's Trends V report to the Conference of Ministers of Education, meeting in London on 17/18 May 2007 to discuss the culmination of the Bologna process by 2010, a majority of the participating institutions continue to rely on traditional end-of-year examinations to assess student knowledge [2]. Progress is, however, being made, as shown by comparison with the equivalent figures in the earlier Trends III report.
The recently approved legal framework aimed at revamping the Spanish higher education system to adapt it to the EHEA's requirements highlights the focus on the development of capabilities, as opposed to the mere accumulation of knowledge, and the need to establish appropriate evaluation procedures for these capabilities [4].
In the USA, the Accreditation Board for Engineering and Technology (ABET), among the criteria it applies for accrediting engineering programs during the 2007-2008 accreditation cycle, requires that engineering programs demonstrate that their students attain applied capabilities such as "an ability to design and conduct experiments, as well as to analyze and interpret data" and "an ability to use the techniques, skills, and modern engineering tools necessary for engineering practice" [5]. It also requires the implementation of an appropriate assessment process, with documented results, demonstrating that the degree of achievement of these capabilities is being measured. There are, however, some worrying indicators, such as the sustained "grade inflation" reported for a wide sample of US universities [6].
Appropriate assessment and evaluation procedures contribute to the effectiveness of the educational process through two complementary mechanisms. On the one hand, students' expectations about the evaluation system heavily condition their chosen course of action. On the other hand, the evaluation results will only be used to continuously improve the educational process if the quality
of the evaluation is perceived as being high. Additionally, in highly competitive educational environments, such as Spanish engineering schools, evaluation procedures also determine which students do and do not finally obtain the engineering degree; the net impact of this filtering is again contingent on the appropriateness of the assessment and evaluation procedures.
The choice of the most appropriate assessment method(s) depends on a number of parameters, such as the specific educational outcome to be measured and the resources available, since the resource requirements of the various assessment approaches differ widely. Proponents of mastery exams point to options, such as applying Item Response Theory to the exam results in order to assess student learning, and focusing on the feedback loop to continuously improve the educational program, that can lead to an overall satisfactory result under certain circumstances [7]. However, for some educational outcomes, such as ABET's "soft" professional skills, conventional assessment approaches are clearly not up to the task [8].
OBJECTIVES AND RESEARCH DESIGN
Within this framework, the research project presented in this paper was started in 2003 at the Engineering School of the University Carlos III de Madrid (UC3M). Its goal was the structured comparison of the impact of alternative evaluation methods in courses in which some of the objectives are linked to acquiring practical capabilities in the use of a software tool. The incidence of the evaluation methods was compared from two perspectives: how accurately they measure the actual competence level attained, and how they affect active learning by the students. These two basic perspectives had to be complemented with an estimation of resource consumption, in terms of both student time and instructor time, and of the parameters on which this resource usage depends (e.g. number of students enrolled), in order to understand the feasibility of their implementation.
The course chosen, "Quantitative Methods in Management II", from the Industrial Engineering curriculum, covers discrete event simulation and optimization (60% of the credits devoted to simulation and 40% to optimization). The experiment was conducted on the discrete event simulation part of the course. As programming is unavoidable in simulation, a substantial part of the students' effort is devoted to developing the capability of constructing models and carrying out experiments using a commercial simulation software package (Witness®). Traditionally, the evaluation was based solely on a written final exam. This approach fits well for theoretical and numerical exercises, but it was considered less adequate for assessing the capabilities associated with the use of a software tool.
Two other evaluation methods were then introduced. A group project assignment (team development of a simulation project) was used as a major evaluation element for two years. In the other three years, the evaluation involved a practical, computer-based exam, whereby students were summoned to a computer lab and assigned a practical case, for which they individually had to develop a model and carry out experiments using the simulation software. The resulting model was then uploaded to the instructors' system for grading.
The results have been appraised from both perspectives (measurement accuracy and impact on active learning). Assessments were carried out by both faculty members and students (through anonymous surveys). In each case, the first year was considered a "warm-up" period, during which initial difficulties were ironed out; comparative measurements therefore took place in the second year. Thus, there are three sets of data to be compared: pre-2003 data from the steady-state, final-examination-based alternative, and data from 2004 and 2006 corresponding to the second year of each of the alternative evaluation methods.
ASSESSMENT THROUGH A PRACTICAL, COMPUTER-BASED EXAM
Until 2002, grading for this course was based on a conventional written final exam. Since a large percentage of the coursework was devoted to hands-on simulation work in the laboratory, 40% of the simulation part of the written exam consisted of questions aimed at assessing the students' competence in actually designing and developing simulation models. Additionally, attendance at the practical sessions was monitored, and students were required to carry out a set of structured exercises using the simulation software.
To overcome the limitations of written exams in assessing this type of applied capability, the simulation evaluation was then split into two different exams. Theoretical concepts were still tested through a conventional written final exam, accounting for 50% of the grade. For the remaining 50%, an on-line, computer-based exam was designed.
For the design of the computer exam there was little prior experience to draw on, so a careful design phase was required before implementation. The exam takes place in the same labs as the practical sessions. This has two main advantages: the students are familiar with the setting, which helps reduce the stress of facing this new kind of exam, and the reliability of the computers has been evaluated beforehand, so that the real capacity of the lab (in terms of the number of computers expected to be available) is known and corrective actions in case of a computer failure can be better planned.
Students are given approximately one and a half hours to individually perform a set of simulation, programming and experimentation exercises, taking as a starting point a file that is copied over the network into each student's PC directory when they log in. Instructions for the exercises are handed out on paper. Exercises of diverse complexity are included to facilitate discrimination among the various levels of acquisition of the capabilities. At the end of the exercise, the students are asked to upload their files to a server using an FTP client application.
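As an illustration of this final step, here is a minimal sketch using Python's standard-library FTP client. The paper does not name the client application actually used, and the host, account, and file name below are all hypothetical.

    # Illustrative sketch only: host, account, and file name are hypothetical,
    # and the exam's real FTP client application is not specified in the paper.
    from ftplib import FTP

    HOST = "exam-server.example.edu"    # hypothetical instructors' server
    MODEL_FILE = "student_model.mod"    # hypothetical finished model file

    with FTP(HOST) as ftp:
        ftp.login(user="exam", passwd="exam-password")  # hypothetical account
        with open(MODEL_FILE, "rb") as f:
            # STOR uploads the file; the lab profile described below ensures
            # that uploading is the only file transfer students can perform.
            ftp.storbinary(f"STOR {MODEL_FILE}", f)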
A special profile was created for the exam. It gives access exclusively to a network-served Witness® license, a predetermined directory on the PC's local disk, and the FTP client application. The FTP application is configured so that file downloading and overwriting are forbidden and only file uploading is allowed. Additionally, access to removable media such as USB drives is disabled. This special profile, along with the use of several versions of the exercises, guarantees that the students actually work individually. The possibility of copying among students was one of the concerns about this type of exam, since it opens new channels of interaction compared with the written exam (e.g. exchanging solutions through the server or through e-mail). The proximity of the computers in the lab and the vertical position of the screens are also specific characteristics of these exams.
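The paper does not say how the upload-only behavior was enforced on the server side. As one possible realization, this sketch uses the third-party pyftpdlib package to grant an exam account upload rights while refusing listing, downloading, deletion, renaming, and overwriting.

    # Sketch under assumptions: the actual server software is not named in the
    # paper; pyftpdlib (pip install pyftpdlib) stands in for whatever was used.
    import os

    from pyftpdlib.authorizers import DummyAuthorizer
    from pyftpdlib.handlers import FTPHandler
    from pyftpdlib.servers import FTPServer

    class UploadOnlyHandler(FTPHandler):
        """Refuse STOR commands that would overwrite an existing file."""

        def ftp_STOR(self, file, mode="w"):
            if os.path.exists(file):
                self.respond("550 Overwriting existing files is forbidden.")
                return
            return super().ftp_STOR(file, mode)

    authorizer = DummyAuthorizer()
    # Hypothetical exam account: perm "ew" allows changing directory ("e") and
    # storing new files ("w"), but grants no list, download, delete, or rename.
    authorizer.add_user("exam", "exam-password", "/srv/exam_uploads", perm="ew")

    UploadOnlyHandler.authorizer = authorizer
    FTPServer(("0.0.0.0", 2121), UploadOnlyHandler).serve_forever()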
The experience gained in the first year in which the system was implemented showed how critical it was that the whole examination process be thoroughly familiar to the students beforehand. Thus, the practical sessions had to precisely mimic the examination environment, including downloading the initial files and uploading the final result. Uploading the file containing the work carried out during each practical session to the assigned location on the server provided an additional way to monitor progress throughout the course, and allowed for longer exercises that could be solved over several consecutive practical sessions.
The exercises students are asked to solve reach the same degree of complexity as those solved in the practical sessions. To save time and concentrate on the valuable part of the exercises, some of the programming is already provided in the start-up file that is copied when they log in. The paper instructions ask the students to complete the programming following a specific sequence until the simulation model of a simple production or service system (e.g. a manufacturing area of a plant) is completed. For some questions (typically validation proofs), the students are asked to complement the file solution with an explanation that they must write on the instruction sheet.
Usually, two different versions of the exam are given to the students to prevent them from copying. The versions are carefully designed so that the complexity of the exercises is the same. This can be accomplished, for example, by dividing the system into two subsystems and inverting the order of construction of the system in the two versions. For example, if the students are asked to program a manufacturing system, it could have a transportation subsystem and a processing subsystem: in version A the transportation could come first and the processing second, and in version B the opposite sequence. To give coherence to both systems (the one in version A and the "inverted" one in version B), the systems may be described as being different, as long as the logic of the model to be programmed remains the same. For example, in system A the transportation subsystem could be the arrival of material at the plant, and in system B it could be the transportation of the final product to the warehouse. To facilitate this approach, several start-up files are copied to every PC, and on the instruction sheet the students are asked to work only with the one that corresponds to the version of the exam they receive.
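A small sketch of one way such a scheme could be administered (entirely hypothetical; the paper describes the versioning only in prose): both start-up files are present on every PC, and adjacent seats receive different versions.

    # Hypothetical sketch: alternate exam versions across adjacent seats so
    # that neighboring students never share a start-up file.
    START_FILES = {"A": "version_a_start.mod", "B": "version_b_start.mod"}

    def version_for_seat(seat_number: int) -> str:
        """Assumed seating policy: odd seats get version A, even seats B."""
        return "A" if seat_number % 2 == 1 else "B"

    for seat in range(1, 7):  # first six seats of an assumed lab layout
        v = version_for_seat(seat)
        print(f"seat {seat}: exam version {v}, open {START_FILES[v]}")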
Figure 1 shows an example of a start-up working file.
FIGURE 1
WORKING FILE.
ASSESSMENT THROUGH A GROUP PROJECT ASSIGNMENT
As an alternative to the computer-based practical exam, a group project assignment was used for two years.
All students were asked to study a specific type of system through simulation. For example, in 2006 (when the survey of this type of evaluation was conducted), the students were asked to choose a gas station in their vicinity whose "as is" and "to be" queue designs were to be simulated. The use of the same type of system for all the groups allowed for a highly standardized level of complexity in all phases of the project. The likelihood of one team copying the work of another was reduced by requiring each team to choose a different gas station. This design
resulted from the experience gained in the first edition of the project assignment (academic year 2005). On that occasion, each group chose a different system. As a result, some groups were more fortunate than others in their choice, in terms of the feasibility of the study, the interest of the results, and so on. These difficulties also affected the faculty members, requiring a greater coordination and supervision effort.
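To make the kind of model concrete, the following is a minimal sketch of an "as is" gas-station queue. It is written with the open-source SimPy library purely for illustration; the students actually built their models in the commercial Witness® package, and every parameter below is invented.

    # Illustrative sketch only: students used Witness®, not SimPy, and all
    # parameters are invented. Cars arrive, queue for a free pump, refuel, leave.
    import random
    import simpy

    ARRIVAL_MEAN = 2.0   # minutes between car arrivals (assumed)
    SERVICE_MEAN = 5.0   # minutes spent at a pump (assumed)
    N_PUMPS = 3          # pumps in the "as is" design (assumed)

    waits = []

    def car(env, pumps):
        arrived = env.now
        with pumps.request() as req:
            yield req                                   # wait in the queue
            waits.append(env.now - arrived)
            yield env.timeout(random.expovariate(1.0 / SERVICE_MEAN))

    def arrivals(env, pumps):
        while True:
            yield env.timeout(random.expovariate(1.0 / ARRIVAL_MEAN))
            env.process(car(env, pumps))

    env = simpy.Environment()
    pumps = simpy.Resource(env, capacity=N_PUMPS)
    env.process(arrivals(env, pumps))
    env.run(until=8 * 60)                               # one 8-hour day

    print(f"{len(waits)} cars served, mean wait {sum(waits) / len(waits):.1f} min")

Re-running such a model with a modified configuration (e.g. one more pump) and comparing the waiting statistics is precisely the "as is" versus "to be" experiment the assignment called for.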
Another remarkable characteristic of the design of the project assignment is that individual members of the team had the freedom to specialize in specific tasks, although they were asked to have a reasonable knowledge of the overall project. In the report, they were asked to make the distribution of work among the members of the group explicit.
Even though the project assignment is basically an alternative to the conventional written exam for the evaluation of the practical capability of using the simulation software, it should be highlighted that it also helped to attain other important and difficult-to-fulfill objectives. The design of the assignment therefore incorporates a variety of objectives. Besides the evaluation of the practical capabilities in the use of simulation software, the most important objective stems from the opportunity of working on an integrative applied problem, which gives participants the chance to model real systems, applying the theoretical contents of the course and developing a complete study from beginning to end. Other complementary objectives are team working and improving oral and written communication capabilities.
The main disadvantage of this approach is that it consumes far more resources, for both students and faculty, than the other alternatives. It requires, for example, team work, which has value in itself but leads to problems in evaluation. Even if copying the assignment among groups is not an issue, thanks to its design, there is the risk that some students within the team act as free riders. To reduce the impact of this potential risk, the group assignment accounted for only 33.3% of the total grade (two thirds of the 50% devoted to the practical capability). The remaining 16.7% was evaluated through a question related to the assignment but included in the individual, conventional written final exam. Theoretical concepts, accounting for the remaining 50% of the grade, were tested through conventional questions in this same written final exam.
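The resulting weighting scheme can be summarized in a short sketch (a hypothetical helper, with all grades on the 0-10 scale used in the next section):

    # Hypothetical helper summarizing the weighting described above:
    # 50% theory + 33.3% group assignment + 16.7% individual question = 100%.
    def final_grade(theory: float, assignment: float, individual_q: float) -> float:
        """Combine the three grade components (each on a 0-10 scale)."""
        return 0.50 * theory + (1 / 3) * assignment + (1 / 6) * individual_q

    # With the average scores reported below (theory 6.8, assignment 7.6,
    # individual question 8.1), this weighting yields roughly 7.3 overall.
    print(round(final_grade(6.8, 7.6, 8.1), 1))  # -> 7.3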
RESULTS AND DISCUSSION
As described above, three sets of data were used for the comparison: 2002 data, representing the steady state while using only the conventional written final exam; data for the second year in which the computer-based practical exam was used; and data for the second year in which the group project assignment was used. For the computer exam as well as for the project assignment, the first year was considered a "warm-up" period and was therefore excluded. Quantitative data included average grades for the capability-oriented and for the theoretical-concepts-oriented parts of the grade. The number of participating students varied by year, between 34 and 57. Anonymous surveys, encompassing both closed and open questions, were filled out by the students in the second year of using the computer-based practical exam and in the second year of using the group project assignment. Faculty members involved in the exercise were also interviewed.
On a 10-point scale, the average grade for the capability-oriented part was 3.8 for the 2002 data (conventional written final exam, in which this part had a 40% weight) and 7.2 for the second year of using the computer-based practical exam (when this part accounted for 50% of the grade). As for the second year of using the group project assignment, the average grade for the assignment itself, which accounted for 33.3% of the total grade, was 7.6, whereas the assignment-related individual question in the written exam, which accounted for another 16.7%, had an average score of 8.1. Average scores for the theoretical-concepts-oriented part of the grade were similar in the first two cases (conventional written final exam and computer-based practical exam), with values of 5.4 and 5.5, whereas for the project assignment case it was higher, at 6.8. This higher result is not surprising, as a team project assignment is expected to have a positive impact on the students' understanding of the theoretical concepts.
Survey questions asked students to compare the alternative evaluation method they were using (computer-based practical exam or group project assignment) with the conventional written final exam, with which they were all familiar, since that is the assessment method most commonly used at UC3M. Students were not asked to compare the computer-based practical exam with the group project assignment, since they had experienced only one of the two approaches. For the closed questions, this comparison encompassed learning outcomes, motivation, soft-skills development, and workload requirements. Open questions inquired about the perceived strong and weak points.
Student feedback was generally very positive regarding learning outcomes, motivation, and soft-skills development. On a 5-level Likert item inquiring whether the adoption of a computer-based practical exam (as opposed to a conventional written final exam) increased the students' motivation to proactively engage in the practical sessions, 71% of the respondents agreed (responses 4 or 5). The average score was 3.77, with a standard deviation of 1.19.
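The summary statistics reported here are the standard ones over the raw 1-5 responses, as this short sketch with invented data illustrates (the raw survey data are not published in the paper):

    # Invented responses for illustration only: agreement is the share of 4s
    # and 5s; mean and standard deviation are computed as reported above.
    from statistics import mean, stdev

    responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4]  # hypothetical 1-5 Likert answers

    agreement = sum(r >= 4 for r in responses) / len(responses)
    print(f"agreed (4 or 5): {agreement:.0%}")
    print(f"mean: {mean(responses):.2f}, std dev: {stdev(responses):.2f}")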
Similarly, 86% answered that it had led to a higher level of knowledge of the software tool, and 84% considered that
unsupervised, individual, proactive work in the lab had been useful for their preparation.
However, only 30% thought that it had led to a better grasp of the theoretical concepts; that result is consistent with the negligible impact observed on the average scores for the theoretical-concepts-oriented part of the grade (5.5 vs. 5.4 with the conventional exam).
On the other hand, workload requirements were perceived by 54% of the students to be higher than with traditional methods.
As for the open questions, in the case of the computer-based practical exam, most students stated that this evaluation procedure was more appropriate for the subject matter, and therefore provided a fairer and more precise assessment. A significant number of responses also stated that it led to deeper learning, even though it required additional effort. 83% of the students were in favor of maintaining the computer-based practical test, whereas only 10% preferred a conventional written final exam and 7% had mixed feelings.
Regarding the project assignment, student feedback was quite similar, highlighting the positive impact on learning outcomes. However, in this case the perception that workload requirements were higher than with traditional methods was much more acute: 100% of the students thought so, and over 50% described the workload requirements as "a lot heavier".
From the faculty members' perspective, the feedback was very similar, with a very positive perception of the effectiveness of the alternative evaluation methods in promoting the students' active learning, but at the same time a much heavier assessment workload, particularly for the project assignment option. As for their ability to precisely and fairly measure the knowledge acquired by the students, both methods were considered superior to conventional exams. The project assignment and the computer-based practical exam allowed the faculty to properly assess the level acquired by the students, although in the case of the group assignment what was accurately graded was the team as a whole, not the individuals. In an attempt to mitigate this intra-team blurring, 16.7% of the grade was evaluated through an individual, assignment-related question in the final exam.
CONCLUSIONS
Results suggest that both group assignments and the computer exam perform far better, in this environment, than traditional written exams. The comparison between group assignments and the computer exam is less straightforward, since their relative impact depends on which of the chosen criteria is being appraised. While the computer exam allows for a more accurate individual evaluation of the practical capability of software use, the group assignment adds other important formative assets related to the course as a whole (not only the practical capability of software use).
However, the increased workload requirements for both students and instructors, particularly for the group assignment option, require careful resource planning before implementation.
REFERENCES
[1] Crosier, D., Purser, L., Smidt, H., "Trends V: Universities Shaping the European Higher Education Area", European University Association report, 2007.
[2] Education Ministers of Bologna Process countries, "London Communiqué – Towards the European Higher Education Area: Responding to Challenges in a Globalised World", 2007.
[3] Huba, M. E., Freed, J., "Learner-Centered Assessment on College Campuses: Shifting the Focus from Teaching to Learning", Needham Heights, MA: Allyn-Bacon, 2000.
[4] Ministerio de Educación y Ciencia de España, "Real Decreto 1393/2007, de 29 de octubre, por el que se establece la ordenación de las enseñanzas universitarias oficiales" [Royal Decree 1393/2007, of 29 October, establishing the organization of official university degree programs], Boletín Oficial del Estado, No. 260, 2007, pp. 44037-44048.
[5] ABET, "Criteria for Accrediting Engineering Programs. Effective for Evaluations During the 2007-2008 Accreditation Cycle", Engineering Accreditation Commission, Baltimore, MD, 2007.
[6] Rojstaczer, S., "Grade Inflation at American Colleges and Universities", accessible at www.gradeinflation.com/, 2003.
[7] Qualters, D. M., et al., "Improving Learning in First-Year Engineering Courses Through Interdisciplinary Collaborative Assessment", Journal of Engineering Education, Vol. 97, No. 1, 2008.
[8] Shuman, L. J., Besterfield-Sacre, M., McGourty, J., "The ABET "Professional Skills" – Can They Be Taught? Can They Be Assessed?", Journal of Engineering Education, Vol. 94, No. 1, 2005, pp. 41-55.