1 assessment principles

7/25/2019 1 Assessment Principles

Prof Dr Vincent Pang

Universiti Malaysia Sabah

INSTRUCTIONAL ASSESSMENT

PRINCIPLES


VALIDITY


What is Validity?

A test is valid if it measures what it purports to measure.

Allen, M.J. and Yen, W.M. (1979)

A test is valid if it does what it is intended to do.

Nunnally, J.C. (1978)

Validity Types (Approaches?)

Face

Content

Criterion Related: Predictive and Concurrent

Construct


Face Validity

Determined by judgements made by the stakeholders, based on surface appearance.

Content Validity

Determined by expert judgements of the appropriateness of the contents of a measure.

Refers to the degree to which one has representatively sampled from the domain of meaning.

Receives the most attention in the construction of achievement and proficiency measures within psychology and education.

A test specification table can be used to enhance content validity.
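A test specification table can be sketched as a plan of how many items each topic and cognitive level should receive, which is then compared with the items actually written. The topics, levels and item counts below are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical blueprint: planned number of items per (topic, cognitive level).
blueprint = {
    ("Fractions", "Knowledge"): 2,
    ("Fractions", "Application"): 3,
    ("Decimals", "Knowledge"): 2,
    ("Decimals", "Application"): 1,
}

# Hypothetical draft test: each item is tagged with the cell it samples.
draft_items = [
    ("Fractions", "Knowledge"),
    ("Fractions", "Knowledge"),
    ("Fractions", "Application"),
    ("Fractions", "Application"),
    ("Decimals", "Knowledge"),
    ("Decimals", "Knowledge"),
    ("Decimals", "Knowledge"),
]


def coverage_gaps(blueprint, items):
    """Return {cell: actual - planned} for every cell that is off target."""
    actual = Counter(items)
    cells = set(blueprint) | set(actual)
    return {
        cell: actual.get(cell, 0) - blueprint.get(cell, 0)
        for cell in sorted(cells)
        if actual.get(cell, 0) != blueprint.get(cell, 0)
    }


gaps = coverage_gaps(blueprint, draft_items)
for (topic, level), diff in gaps.items():
    print(f"{topic} / {level}: {diff:+d} item(s) relative to plan")
```

Cells with a negative difference are under-sampled and cells with a positive difference are over-sampled relative to the plan.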

Criterion Related Validity

Two basic types

1. Concurrent validity

2. Predictive validity

Comparison of performance is made with some available criterion.

A criterion is a standard by which you will judge the outcomes of your measurement.

1. Concurrent Validity

Correlate test scores with criterion scores obtained at about the same time.

The ability of a measure to indicate an individual's present standing on the criterion variable.

The criterion is an independent measure of the same trait that the test is designed to measure.

Usually measured at the same time as the test is administered.

E.g. the test achievement of a student is compared against the teacher's rating of his or her ability (criterion).
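A minimal sketch of the correlation step, assuming the Pearson coefficient is used as the validity coefficient (the slides do not name a specific coefficient). The scores and ratings are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: achievement test scores and teacher ratings (1-5 scale)
# collected for the same six students at about the same time.
test_scores = [45, 58, 62, 70, 81, 90]
teacher_ratings = [2, 3, 3, 4, 4, 5]

r = pearson_r(test_scores, teacher_ratings)
print(f"concurrent validity coefficient r = {r:.2f}")
```

A coefficient close to 1 would support concurrent validity; a value near 0 would suggest the test and the criterion are not tapping the same trait.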

2. Predictive Validity

Determined by correlating test scores with criterion scores obtained after examinees have had a chance to perform what is predicted by the test.

A measure is validated with reference to future standing on the criterion variable.

The criterion is a measure of what the test is designed to predict.

Usually measured after the students have had a chance to exhibit the predicted behaviour.

E.g. Employment Test

The purpose of an employment test is to predict success on the job.

The most appropriate test of its validity is predictive validity: to what extent does the test predict what it is supposed to predict?

Give the test to applicants for a position.

For all those hired, compare their test scores to supervisors' ratings after 6 months on the job.

The supervisors' ratings are the criterion.

If employees' test scores agree closely with the supervisors' ratings, then the predictive validity of the test is supported.
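One way to check whether test scores order employees similarly to the supervisors' ratings is a rank correlation. The sketch below assumes Spearman's rho with the shortcut formula that holds only when there are no tied values; the data are hypothetical, and the slides do not prescribe this particular statistic.

```python
def ranks(values):
    """Rank values from 1 (lowest) to n (highest); ties are not supported."""
    if len(set(values)) != len(values):
        raise ValueError("this simple sketch assumes no tied values")
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rank correlation using the no-ties shortcut formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical data for five hired applicants: selection test score at hiring
# and supervisor rating after 6 months on the job (the criterion).
selection_test = [55, 62, 70, 78, 91]
supervisor_rating = [3.1, 2.8, 3.6, 4.0, 4.5]

rho = spearman_rho(selection_test, supervisor_rating)
print(f"rank correlation between test and later ratings: {rho:.2f}")
```

With real data that contain tied scores or ratings, a version of the coefficient that handles ties would be needed instead of this shortcut.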

Construct Validity

For variables with no obvious body of content, e.g. intelligence, depression, stress, attitudes towards Facebook.

For variables without existing criteria.

Measurable elements of the construct are listed based on literature/theories.

The magnitude of the construct can be indicated by statistical methods, e.g. factor analysis, Rasch analysis.

RELIABILITY


Reliability

Is the instrument measuring something consistently and dependably?

It pays no attention to what that something is.

Reliability is a necessary, but not sufficient, condition for validity.

Do repeated applications of the instrument under similar conditions yield consistent results?

Consistency?

Would the test yield the same result tomorrow?

If you changed codes/languages, would the test yield the same result?

If you chose another set of items from the content domain, would the test yield the same result?

Intra-Rater Consistency

Consistent mood.

Same level of strictness/leniency.

Same scoring standard.

Examination item by item instead of candidate by candidate.

Inter-Rater Agreement

Using a particular measure, two or more different observers should provide similar ratings of the same phenomenon.

If they do so, the inter-rater reliability of the measure is supported.

Not needed for objectively scored instruments.
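Agreement between two raters can be quantified in several ways. The sketch below assumes Cohen's kappa, which corrects the observed proportion of agreement for the agreement expected by chance; the slides do not name a specific index, and the grades are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same scripts."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of scripts")
    n = len(rater_a)
    # Proportion of scripts on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category only")
    return (observed - expected) / (1 - expected)


# Hypothetical grades given by two examiners to the same ten essay scripts.
rater_a = ["A", "B", "B", "C", "A", "B", "C", "C", "A", "B"]
rater_b = ["A", "B", "C", "C", "A", "B", "C", "B", "A", "B"]

kappa = cohen_kappa(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement, while a value near 0 means the raters agree no more often than chance would predict.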

Reliability or Validity?

[Slides 16 to 21: figures only; no text survives in this extract]

OBJECTIVITY

ADMINISTRABILITY

INTERPRETABILITY

MORE EFFECTIVE PRACTICES


Objectivity

Relates to accuracy in scoring.

A test is objective when every examiner gives the same score to the same answer.

Objective and subjective items with fixed responses are highly objective.

Objectivity can be enhanced by:

An analytic scoring scheme

Moderation of scoring

Administrability

The efficacy with which a test is run.

Indicators of high administrability:

Exam hall is prepared promptly

Test commences on time

Clear instructions to candidates

Effective invigilation

Systematic script collection

Timely scoring/marking

On-schedule announcement of results

Interpretability

An interpretable test provides more information to examiners and candidates.

Indicators of high interpretability:

Achievement

Position/rank of candidates

Strengths and weaknesses of candidates

Suitability of teaching strategies

Suitability of items*

Informs the attainment of learning outcomes: cognitive, affective, psychomotor
