VALIDITY AND TEST VALIDATION
Prepared by Olga Simonova, Inna Chmykh, Svetlana Borisova, Olga Kuznetsova
Based on materials by Anthony Green


Validity

ABC Test of English: Results
Ivana 45%
Irina 78%

Which student is better at English?


Validity

Some aspects may not be tested: construct under-representation.

[Diagram label: Assessment tasks]


Validity

[Diagram label: Language Ability]

Some abilities that are important to success in a test may not be connected to real-world language abilities:
• ability to cope with exam stress;
• awareness of how multiple-choice questions are written;
• willingness to guess, etc.

These are construct-irrelevant factors.


What is validity?

Tests are tools for helping us to make good decisions.

Construct relevance:
• a test of maths (even if it’s very reliable) can’t tell us about someone’s ability to sing;
• a test of written grammar can’t tell us much about someone’s ability to hold a conversation.

Construct representation:
• does the test cover all aspects of the relevant abilities?


What is validity?

‘validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests’

(American Educational Research Association et al., 1999)

This means that test results can be valid for one purpose and for one particular population of test takers, but not for others. A test may be valid for placement purposes on a general language course, but not for employment selection.


Building a validation argument

What do we want the results to mean? What evidence can we collect to find out if scores really support this interpretation?
• evaluation – the test taker’s performance is a fair reflection of his/her abilities;
• generalization – similar scores would be obtained if the test taker was given a different form of the test, or if the raters scoring his/her performance were different;
• explanation – the test reflects a coherent theory of language ability;
• utilisation – the tested abilities are relevant to the decision being made about the test taker.


Validation in the assessment cycle:

• at different stages in the cycle, different questions need to be answered;

• different types of validity may be more relevant at each stage;

• tests made for different purposes raise different issues.


Building a validation argument:

• Evaluation – the test taker’s performance is a fair reflection of his/her abilities. Test form and administration.

• Generalization – similar scores would be obtained if the raters scoring his/her performance were different. Test score and rating scales.

• Explanation – the test reflects a coherent theory of language ability. Specification.

• Utilisation – the tested abilities are relevant to the decision being made about the test taker. Test purpose and target language use domain.




Validity in test design

“Tests for the measurement of language abilities must be constructed according to a coherent validity framework based on the latest developments in theory and practice.”

(Weir, 2005)


Socio-cognitive approach (O’Sullivan & Weir, 2010)

[Diagram components: Context validity; Cognitive validity; Test task; Performance; Scoring validity; Consequential validity; Criterion-related validity]


Content (context) validity

Content validity is based on subject experts' judgments of test content.

Does the content of the test adequately cover all the aspects of language ability we are interested in for making this decision?


Content (context) validity

A test is said to have content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.

(Hughes, 2003)

The term content validity was traditionally used to refer to the content coverage of the task. Context validity is preferred as a more inclusive superordinate which signals the need to consider the discoursal, social and cultural context as well as the linguistic parameters under which the task is performed (its operations and conditions).

(Weir and Shaw, 2005)


Cognitive (or theory-based) validity

Do test takers go through the same mental processes when responding to test tasks as when they use language in the real world in the situations we are interested in?


Cognitive (or theory-based) validity

Theory-based validity involves collecting a priori evidence through piloting and trialling before the test event, for example through verbal reports from test takers on the cognitive processing activated by the test task, and a posteriori evidence involving statistical analysis of scores following test administration.

(Weir and Shaw, 2005)


Scoring validity

Scoring validity accounts for the extent to which test scores:
• are based on appropriate criteria;
• exhibit consensual agreement in their marking;
• are as free as possible from measurement error;
• are stable over time;
• engender confidence as reliable decision-making indicators.

(Weir and Shaw, 2005)
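
One way to put a number on "consensual agreement in marking" is a chance-corrected agreement index. The sketch below is not taken from the slides: it computes Cohen's kappa for two raters, assuming each rater awards one categorical band per script. The function name `cohen_kappa` and the band scores are hypothetical illustrations.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one band per script."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of scripts")
    n = len(rater_a)
    # Proportion of scripts on which the raters awarded the same band.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's band frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[band] * counts_b[band] for band in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical band throughout
    return (observed - expected) / (1 - expected)


# Hypothetical band scores (1-5) awarded by two raters to ten scripts.
rater_1 = [3, 4, 2, 5, 3, 3, 4, 2, 1, 4]
rater_2 = [3, 4, 2, 4, 3, 2, 4, 2, 1, 5]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2))  # 0.61
```

Kappa treats bands as unordered categories, so a one-band disagreement counts the same as a four-band one; a weighted variant may suit ordered rating scales better.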


Scoring validity

Scoring validity = reliability

Are the test scores consistent enough for us to have confidence in the results?
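
Consistency of scores is often summarised with an internal-consistency coefficient. As a hedged illustration (the slides do not name a specific statistic), the sketch below computes Cronbach's alpha from a table of item scores. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores has one row per test taker, one column per item."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every test taker needs a score on every item")
    # Variance of each item across test takers.
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of the total scores.
    total_var = pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) responses: five test takers, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # 0.8
```

What counts as "consistent enough" depends on the stakes of the decision; any threshold applied to the coefficient is a judgment call rather than something the formula supplies.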


Criterion-related validity

Criterion-related validity relates to the degree to which results on the test agree with those provided by some independent and highly dependable assessment of the candidate's ability. This independent assessment is thus the criterion measure against which the test is validated.

(Hughes, 2003)

Are the test results consistent with other evidence we have about test takers’ abilities?

Criterion-related validity takes two forms:
• concurrent validity;
• predictive validity.


Concurrent validity

“involves the comparison of the test scores with some other measures of the same candidates taken at roughly the same time as the test.”

(Alderson et al., 1995:177)

Do scores on our test agree with the results of other tests of the same abilities?
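
Agreement between two sets of scores is commonly expressed as a correlation coefficient. The following is a minimal sketch, assuming the same candidates took both our test and an established test at about the same time. The helper `pearson_r` and all scores are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired lists with at least two scores each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores in one list do not vary")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical percentage scores for the same six candidates.
our_test = [45, 78, 60, 52, 90, 67]
established_test = [50, 80, 58, 55, 88, 70]
r = pearson_r(our_test, established_test)
print(round(r, 2))
```

A high coefficient only shows that the two tests rank candidates similarly; it is useful evidence only if the comparison test is itself a dependable measure of the same abilities.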


Predictive validity

Predictive validity entails the comparison of test scores with some other measure for the same candidates taken some time after the test has been given.

(Alderson et al., 1995)

The degree to which a test can predict candidates' future performance.

(Hughes, 2003)

Did the test accurately predict which test takers were going to perform best in their jobs, in class, etc.?
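
When the later measure is a rating or ranking rather than a test score, a rank correlation is one reasonable way to ask whether the test predicted who would perform best. The sketch below assumes hypothetical entry-test scores and supervisor ratings collected some months later. The helpers `average_ranks`, `pearson_r` and `spearman_rho` are illustrative names, not part of the source.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 (lowest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation between two paired lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: entry-test scores and later supervisor ratings (1-10).
entry_test = [45, 78, 60, 52, 90, 67]
later_rating = [4, 8, 5, 6, 9, 7]
rho = spearman_rho(entry_test, later_rating)
print(round(rho, 2))  # 0.94
```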


Consequential validity (impact)

Does the introduction and use of the test have the intended social consequences?

Is there any:
• bias in scoring and interpretation of results?
• unfairness in test use?
• positive or negative effect on teaching and learning?


Face validity

Face validity refers to the test's “surface credibility or public acceptability” (Alderson et al., 1995:172). Bachman (1990:307) states that “face validity is the appearance of real life.”

Do test takers, teachers, politicians and the public generally believe in the value of the test?


Face validity

The assessment is credible to users: it looks as though it measures the skills or abilities of interest.

For example, a multiple-choice grammar test does not look as though it really tests the ability to speak English in real-world situations. All kinds of evidence could be used to show that people who pass the test are actually able to communicate effectively, but users may not be convinced because test takers are not actually required to speak. If the test does not have face validity, it is unlikely to be successful.


Construct validity

In recent years the term construct validity has been used to refer to the general, overarching notion of validity. It is not enough to assert that a test has construct validity; empirical evidence is needed.

(Hughes, 2003)

Validation means presenting and examining the arguments for using the test as a reasonable justification for any decision.


Round-up: suitable data for test validity

Face validity
Questionnaires to and interviews with candidates, administrators and other users.

Context validity
a) Compare test content with specifications/syllabus.
b) Questionnaires to and interviews with 'experts' such as teachers, subject specialists, applied linguists.
c) Expert judges rate test items and texts according to a precise list of criteria.

Cognitive validity
Students introspect on their test-taking procedures, either concurrently or retrospectively. Keystroke logs. Eye-tracking.

Concurrent validity
a) Compare students' test scores with their scores on another test.
b) Compare students' test scores with teachers' rankings.
c) Compare students' test scores with other measures of ability, such as teachers' ratings of the students.


Suitable data for test validity

Predictive validity
a) Compare students' test scores with their scores on tests taken some time later.
b) Compare students' test scores with success in final exam.
c) Compare students' test scores with other measures of their ability taken some time later, such as employers' assessments of their ability.

Construct validity
a) Compare performance on each subtest with other subtests.
b) Compare performance on each subtest with total of all other subtests.
d) Compare students' test score with students' biodata and psychological characteristics.
e) Multitrait-multimethod studies.
f) Factor analysis.
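
The subtest comparisons listed under construct validity can be sketched in a few lines. The example below correlates each subtest with the total of all the other subtests, assuming a hypothetical three-part test. The subtest names, scores and the helper names `pearson_r` and `subtest_rest_correlations` are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two paired score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


def subtest_rest_correlations(scores):
    """Correlate each subtest with the summed scores on all other subtests."""
    result = {}
    for name, own_scores in scores.items():
        others = [s for other, s in scores.items() if other != name]
        # Total of the remaining subtests for each student.
        rest_totals = [sum(per_student) for per_student in zip(*others)]
        result[name] = pearson_r(own_scores, rest_totals)
    return result


# Hypothetical subtest scores for the same six students.
scores = {
    "reading": [12, 18, 15, 9, 20, 14],
    "listening": [11, 17, 16, 10, 19, 13],
    "writing": [10, 16, 14, 8, 18, 15],
}
correlations = subtest_rest_correlations(scores)
for name, value in correlations.items():
    print(name, round(value, 2))
```

Excluding a subtest's own score from the total avoids inflating the correlation, which is presumably why the comparison is with the total of all other subtests rather than the full test total.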


Who is a validator?

Roles

• Designers

• Producers

• Organisers

• Administrators

• Assessees

• Scorers

• Users

Example validity questions

• Does the design of the test reflect an adequate theory of language?

• Is an appropriate balance of abilities required for success on the test?

• Do the test items reflect the designers’ intentions?

• Is the test organised and administered in a way that will ensure fairness?

• Do assessees respond to the test tasks in a way that reflects realistic language processing?

• Do scorers consistently and accurately capture the qualities of test takers’ performance?

• Are decisions taken by users justified by the test?


Who is a validator?

Assessment developers (teachers, testing agencies):
• to check the quality of their own work;
• to showcase the quality of their tests.

Assessment users:
• to check that tests are giving them accurate and relevant information.

Independent agencies:
• to enforce/encourage good quality assessment.


Conclusion

• Test validation, according to Alderson et al. (1995:193), is 'time-consuming and difficult'.

• However, it is essential, as a test without validity cannot be useful as a decision-making tool.

• Applied linguists and teachers should focus more of their efforts on practical research in this field.