1 escuela: ingles nombres: maria arias cordova david james axelson language testing second bimester...

1

SCHOOL: ENGLISH

NAMES: MARIA ARIAS CORDOVA

DAVID JAMES AXELSON

LANGUAGE TESTING

SECOND BIMESTER

DATE: APRIL – AUGUST 2009

WHAT IS TESTING?

Testing is a matter of using data to establish evidence of learning. But evidence does not occur concretely in a natural state; it is an abstract inference, and so it is a matter of judgment.

3

THE PURPOSE OF VALIDATION

The purpose of validation in language testing is to ensure the defensibility and fairness of interpretations based on test performance.

5


The scrutiny of a procedure such as a legal trial will involve both reasoning and examination of the facts.

The reasoning may involve legal argumentation, and appeals to the common sense, insight, and human understanding of the jury members, as well as careful examination of the evidence.

5


Test validation similarly involves thinking about the logic of the test, particularly its design and its intentions, and also involves looking at empirical evidence (the hard facts) emerging from data from test trials or operational administrations.

6

QUALITIES OF A GOOD TEST

A good test has the following qualities:

- It is valid.
- It is reliable.
- It is practical.
- It has no negative effects on the teaching program.

7

PRACTICALITY

A good test is practical when it is within the means of financial limitations, time constraints, ease of administration, and scoring and interpretation.

8

PRACTICALITY

A test that is prohibitively expensive is not practical.

A test of language proficiency that takes a student ten hours to complete is impractical.

9

PRACTICALITY

A test that takes a few minutes for a student to take and several hours for an examiner to evaluate is impractical.

A test that requires individual one-to-one proctoring is impractical.

10

PRACTICALITY

The extent to which a test is practical sometimes hinges on whether it is designed to be norm-referenced or criterion-referenced. In norm-referenced tests, each test-taker's score is interpreted in relation to a mean, median, standard deviation, and/or percentile rank. The purpose of such tests is to place test-takers along a mathematical continuum in rank order.
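A minimal Python sketch of how a norm-referenced interpretation places each score relative to the group. The scores are invented for illustration, and `percentile_rank` is a hypothetical helper that uses one common definition (the percentage of scores strictly below a given score); other definitions exist.

```python
import statistics


def percentile_rank(scores, score):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)


# Hypothetical raw scores for a group of ten test-takers.
scores = [52, 61, 61, 67, 70, 74, 78, 83, 88, 95]

mean = statistics.mean(scores)
median = statistics.median(scores)
std_dev = statistics.pstdev(scores)  # population standard deviation

print(f"mean={mean:.1f} median={median:.1f} sd={std_dev:.1f}")

# Rank order: each test-taker's score is read against the group, not a fixed standard.
for s in sorted(scores, reverse=True):
    print(s, f"percentile rank: {percentile_rank(scores, s):.0f}")
```

A criterion-referenced reading of the same scores would instead compare each one with a fixed standard, such as a pass mark.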

11

PRACTICALITY

Typical of norm-referenced tests are standardized tests intended to be administered to large audiences, with results quickly disseminated to test-takers. Such tests must have fixed, predetermined responses in a form that can be electronically scanned. Practicality is a primary issue.

The most important quality of any test is how practical it is to administer.

12

RELIABILITY

In general terms, reliability is the ability of a person or system to perform and maintain its functions in routine circumstances, as well as in hostile or unexpected circumstances.

13

VALIDITY

The most complex criterion of a good test is validity, the degree to which the test actually measures what it is intended to measure.

14

FACE VALIDITY

A test has face validity when it appears valid to the examinees who take it, the personnel who administer it, and other untrained observers.

15

RELIABILITY

The reliability of a testing procedure can be expressed as a set of two probabilities, the definition of which varies by field. In medicine, sensitivity and specificity are conventionally used. In other fields, the probabilities of detection and false call are conventionally used.

http://en.wikipedia.org/wiki/Probability

http://en.wikipedia.org/wiki/Medicine

http://en.wikipedia.org/wiki/Sensitivity_(tests)

http://en.wikipedia.org/wiki/Specificity_(tests)
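As a rough illustration of the two probabilities used in medicine, the Python sketch below computes sensitivity and specificity from counts of correct and incorrect decisions. The function names and the counts are assumptions made up for this example.

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of actual positive cases that the test detects."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of actual negative cases that the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: 50 positive cases and 100 negative cases.
sens = sensitivity(true_positives=45, false_negatives=5)
spec = specificity(true_negatives=80, false_positives=20)

print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```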

16

RELIABILITY

If you give the same test to the same subject or matched subjects on two different occasions, the test itself should yield similar results; it should have test reliability.
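One common way to check whether two administrations yield similar results is to correlate the two sets of scores. The sketch below assumes a Pearson correlation is an acceptable index for this purpose; `pearson_r` is a helper written for this example and the scores are invented.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical scores of the same five students on two occasions.
first_occasion = [12, 15, 9, 18, 14]
second_occasion = [13, 14, 10, 17, 15]

r = pearson_r(first_occasion, second_occasion)
print(f"test-retest correlation: {r:.2f}")
```

A value close to 1 suggests that students keep roughly the same rank order on both occasions; a value near 0 suggests the scores are not dependable.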

17

RELIABILITY

Reliability means:

- dependability
- trustworthiness
- precision

18

THREATS TO TEST VALIDITY

Why is face validity not enough?

What can threaten the validity, meaningfulness, interpretability, and fairness of assessments (scores, ratings)?

19

THREATS TO TEST VALIDITY

Possible problem areas:

- Test content
- Test method
- Test construct

20

CONTENT VALIDITY

A test has content validity if it measures knowledge of the content domain it was designed to measure. Another way of saying this is that content validity primarily concerns how adequately and representatively the test items sample the content area to be measured.

21

CONTENT VALIDITY

For example: a comprehensive math achievement test would lack content validity if good scores depended primarily on knowledge of English, or if it only had questions about one aspect of math (e.g., algebra). Content validity is primarily an issue for educational tests, certain industrial tests, and other tests of content knowledge like the Psychology Licensing Exam.

22

TEST METHOD

A test method is a definitive procedure that produces a test result (ASTM definition).

The test result can be qualitative (yes/no), categorical, or quantitative (a measured value). It can be a personal observation or the output of a precision measuring instrument.

http://en.wikipedia.org/wiki/ASTM

http://en.wikipedia.org/wiki/Measuring_instrument



23

TEST CONSTRUCT

Test Construct refers to those aspects of knowledge or skill possessed by the candidate which are being measured.

Test Construct involves being clear about what knowledge of language consists of, and how that knowledge is deployed in actual performance.

24

THREATS TO TEST VALIDITY

Possible problem areas:

Test content: What the test contains.

Test method: The way in which the candidate is asked to engage with the materials and tasks in the test, and how these responses will be scored.

Test construct: The underlying ability being captured by the test.

25

ESSAY TESTS

To write compositions or essay tests seems very easy. Much easier, for example, than writing multiple-choice questions. All one seems to have to do is write a topic and leave the student to compose an answer. The following prompt is very common:

“HEALTHY FOOD.” Discuss.

26

ESSAY TESTS

Format:

- Introduction. Introduce your topic.
- Background. Give historical or philosophical background data to orient the reader to the topic.
- Thesis and arguments. State the main points, including causes and effects, methods used, dates, places, and results.
- Conclusion. Include the significance of each event and finish up with a summary.

27

INTRODUCTION

The business practices of the Intel Corporation, a technology company best known for the production of microprocessors for computers, illustrate the importance of brand marketing. Intel was able to achieve a more than 1,500 percent increase in sales, moving from $1.2 billion in sales to more than $33 billion, in a little more than 10 years. Although the explosion of the home-computer market certainly accounted for some of this dramatic increase, the brilliance of its branding strategy also played a significant role.

28

BACKGROUND

Intel became a major producer of microprocessor chips in 1978, when its 8086 chip was selected by IBM for use in its line of home computers. The 8086 chip and its successors soon became the industry standard, even as Intel’s competitors sought to break into this potentially lucrative market. Intel’s main problem in facing its competitors was its lack of trademark protection for its series of microchips. Competitors were able to exploit this lack by introducing clone products with similar sounding names, severely inhibiting Intel’s ability to create a brand identity.

29

THESIS AND ARGUMENTS

In an effort to save its market share, Intel embarked on an ambitious branding program in 1991. The corporation’s decision to invest more than $100 million in this program was greeted with skepticism and controversy. Many within the company argued that the money could be better spent researching and developing new products, while others argued that a company that operated within such a narrow consumer niche had little need for such an aggressive branding campaign. Despite these misgivings, Intel went ahead with its strategy, which in a short time became a resounding success.

30

CONCLUSION

Ironically, the success of Intel’s branding strategy led to a marketing dilemma for the company. In 1992, Intel was prepared to unveil its new line of microprocessors. However, the company faced a difficult decision: release the new product under the current brand logo and risk consumer apathy, or give the product a new name and brand and risk undoing all the work put into the branding strategy. In the end, Intel decided to move forward with a new brand identity. It was a testament to the strength of Intel’s earlier branding efforts that the new product line was seamlessly integrated into the public consciousness.

TOPICS

Some people like doing work by hand. Others prefer using machines. Which do you prefer? Use specific reasons and examples to support your answer.

Some people think that children should begin their formal education at a very early age and should spend most of their time on school studies. Others believe that young children should spend most of their time playing. Compare these two views. Which view do you agree with? Why?

32

TOPICS

Some people think that the family is the most important influence on young adults. Other people think that friends are the most important influence on young adults. Which view do you agree with? Use examples to support your position.

Some students prefer to study alone. Others prefer to study with a group of students. Which do you prefer? Use specific reasons and examples to support your answer.

33

KINDS OF ESSAY TESTS

ORAL INTERVIEWS

SUMMARIES

INFORMATION GAP ACTIVITIES

34

ORAL INTERVIEWS

J. B. Heaton explains that in real life the two skills of listening and speaking are fully integrated in most everyday situations involving communication. Consequently, an excellent way of testing speaking is the oral interview, since listening and speaking can be assessed in a natural situation.

35

SUMMARIES

Summaries are used most often to test reading or listening comprehension and writing skills. Writing summaries may closely replicate many real-life activities.

36

INFORMATION GAP ACTIVITIES

Work out what the differences are

37

TESTING READING SKILLS

VOCABULARY TESTS often provide a good guide to reading ability. It is usually necessary for students to demonstrate not only a knowledge of the meaning of a particular word but also an awareness of the other words with which it is generally used. In addition to their usefulness in proficiency tests, vocabulary tests are also useful in progress tests, as they lend themselves to follow-up work in class.

38

TRUE / FALSE ITEMS

1. ___ Children learn to recognize and produce the sounds of the language by listening to its spoken form.

2. ___ One remarkable thing about first language acquisition is the low degree of similarity which we see in the early language of children all over the world.

3. ___ Many sentences such as “Mummy juice” and “baby fall down” are known as telegraphic speech.

39

MULTIPLE-CHOICE ITEMS

Writing multiple-choice items is not too difficult after you have had a little practice. For most purposes three options are enough. Remember that the distractors should appear correct to any students who are not sure of the answer. Avoid writing absurd distractors which everyone can easily see are wrong. However, all the distractors should be written within the student’s range of proficiency and at the same level as the correct answer.

40

MULTIPLE-CHOICE ITEM

Example:

According to the author, one cause of mountain formation is the

a. effect of climate change on sea level
b. slowing down of volcanic activity
c. force of Earth's crustal plates hitting each other
d. replacement of sedimentary rock with volcanic rock

Correct answer: c

41

MATCHING ITEMS

Matching items are also very useful for testing vocabulary in context. It is necessary to instruct the students to write the correct word from the story at the side of each word listed below it.

42

MATCHING ITEM

Example:

Column A        Column B
1. shy          a. cheerful
2. happy        b. thin
3. sad          c. become scared
4. slim         d. sorrowful

43

TESTING WRITING SKILLS

Jeremy Harmer explains that like many other aspects of English language teaching, the type of writing we get students to do will depend on their age, interests and level.

44


GRAMMAR AND STRUCTURE

- Multiple-choice
- Error recognition
- Re-arrangement
- Changing words
- Blank-filling

45


- Controlled writing
- Transformation
- Broken sentences
- Notes and diaries
- Free writing

46


GRAMMAR AND STRUCTURE - Multiple-choice items

Multiple-choice items test an ability to recognize sentences which are grammatically correct.

47


ERROR RECOGNITION

Students must choose the underlined word or phrase which is incorrect.


RE-ARRANGEMENT

Students are required to unscramble sentences. They must write out each sentence, putting the words and phrases in their correct order. This type of item is useful for testing awareness of the order of adjectives, the position of adverbs, inversion and other areas of grammar.

49


CHANGING WORDS

A completely different type of question requires students to put verbs into their correct tense or voice. This type of question is quite easy and straightforward to construct. However, it is important to provide an interesting context.

50


BLANK-FILLING

Blank-filling items should consist of paragraphs providing an interesting and relevant context. It is important to choose the words to omit very carefully so that they are all grammatical words (e.g. to, in, is, the).

51

CENTRAL TENDENCY

The Central Tendency of a distribution is an estimate of the “center” of a distribution of values.

CENTRAL TENDENCY

There are three major types of estimates of Central Tendency:

- Mean - Median - Mode

CENTRAL TENDENCY

The Mean or average is probably the most commonly used method of describing central tendency.

CENTRAL TENDENCY

The Mean

To compute the mean, add up all the values and divide by the number of values.

CENTRAL TENDENCY

The Mean

For example: 20, 20, 20, 18, 17, 14, 14, 12

The sum of these 8 values is 135, so the mean is 135/8 = 16.875.

CENTRAL TENDENCY

The Median

The median is the score found at the exact middle of the set of values. One way to compute the median is to list all scores in numerical order, and then locate the score in the center of the sample.

CENTRAL TENDENCY

The Median

For example: 15, 15, 15, 15, 15, 17, 18, 20

There are 8 scores, and scores #4 and #5 represent the halfway point. Since both these scores are 15, the median is 15.

CENTRAL TENDENCY

The Median

If the two middle scores have different values, you would have to interpolate to determine the median.
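The three estimates of central tendency can be checked with Python's standard `statistics` module. The sketch below reuses the eight scores from the median example; the second list, `other_scores`, is invented to show the case where the two middle scores differ. Note that `statistics.median` resolves that case by averaging the two middle values, which is the simplest form of interpolation.

```python
import statistics

# The eight scores from the median example, already in numerical order.
scores = [15, 15, 15, 15, 15, 17, 18, 20]

mean = statistics.mean(scores)      # sum of the values divided by their number
median = statistics.median(scores)  # middle of the ordered list
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)

# Hypothetical list whose two middle scores (15 and 16) differ.
other_scores = [14, 15, 16, 18]
interpolated_median = statistics.median(other_scores)  # halfway between 15 and 16

print(interpolated_median)
```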

DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

General concerns:

- The candidate or the student should be able to answer the questions on the basis of what is in the passage; the questions should not require outside knowledge.

- Questions should cover all the important parts of the passage. Questions should not be asked exclusively about one section of the passage while other sections are neglected.

- Overlap among questions should be avoided. With many questions based on one passage, it is inevitable that more than one question may relate to a particular portion or aspect of the passage; care should be taken, however, that such questions explore different perspectives of the material.

DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

The stem:

- The stem should formulate the question or problem as simply and directly as possible. Avoid irrelevant verbiage.

- The stem should be as direct as possible; that is, it should have a focus and should clearly identify the problem. The candidate should not have to read all of the options to see what the question is asking.


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

- Capitalize words such as NOT, LEAST, EXCEPT, etc., when they are used in the stem to call for a negative or unexpected response.

- If a word or phrase is used at the beginning of each option, move that word or phrase to the stem to avoid unnecessary repetition.


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

- Refer to the passage as such, not as “selection,” “excerpt,” etc.

- Use specific line references when questions refer to specific words, phrases, or arguments in the passage.


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

The key:

- There should be one and only one correct or clearly best answer.

- The key should not stand out in any way, e.g., by length, degree of precision, or language. Item writers often submit questions with the key so carefully qualified that it is twice as long as the distractors; it may help to write the key first so that the distractors can be tailored to be parallel.
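The advice about the length of the key can be turned into a quick automatic check when drafting items. The Python sketch below is only an illustration: `key_stands_out_by_length` is a hypothetical helper, the 1.5 ratio is an arbitrary threshold chosen for this example, and the options are invented.

```python
def key_stands_out_by_length(key, distractors, ratio=1.5):
    """Return True if the key has noticeably more words than the longest distractor.

    `ratio` is an assumed threshold: 1.5 means "more than one and a half times
    as many words as the longest distractor".
    """
    if not distractors:
        raise ValueError("an item needs at least one distractor")
    longest_distractor = max(len(d.split()) for d in distractors)
    return len(key.split()) > ratio * longest_distractor


# Hypothetical options for one item.
distractors = ["an inborn reflex", "a classroom skill", "a written habit"]
over_qualified_key = (
    "a gradual process that, in most documented cases, "
    "depends on regular exposure to spoken language"
)
parallel_key = "a gradual process"

print(key_stands_out_by_length(over_qualified_key, distractors))
print(key_stands_out_by_length(parallel_key, distractors))
```

A check like this only flags length; precision and language still need to be judged by the item writer.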


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

The options:

- Do not use “All of the above” or “None of the above” as options. The need to use “All of the above” may be an indication that a Roman numeral format would be appropriate.

- All options should be as parallel as possible in grammatical structure, diction, and length.

65

DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

Unacceptable sample options:

The passage implies that an advantage of adopting the author’s theories is that we would increase our

- knowledge of atmospheric processes
- national survival
- the formulation of a set of hypotheses regarding motion in space


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

The options:

- All options must fit the stem; i.e., they should not be easily identifiable as incorrect responses simply because they make no sense grammatically or idiomatically.

- Avoid options that overlap or subsume each other, or options that give away the answers to other questions.


DO'S AND DON'TS IN WRITING FOR READING COMPREHENSION

- Avoid using a pair of opposites in the options if one of the pair is the key. If such a pair of opposites is used, the item is likely to operate as a two-choice rather than a four-choice item, and the probability of guessing the correct answer is increased.

- Arrange options in logical order, if one exists, or according to length (for example, shortest to longest).
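The effect of a pair of opposites on guessing can be put in numbers. The sketch below assumes that a candidate who does not know the answer guesses at random among the options that still look plausible; the function names and the 40-item test are assumptions for illustration.

```python
from fractions import Fraction


def chance_of_guessing(plausible_options):
    """Probability of picking the key by a blind guess among the plausible options."""
    return Fraction(1, plausible_options)


def expected_chance_score(num_items, plausible_options):
    """Number of items a candidate would be expected to get right by guessing alone."""
    return num_items * chance_of_guessing(plausible_options)


four_choice = chance_of_guessing(4)  # all four options look plausible
two_choice = chance_of_guessing(2)   # only the pair of opposites looks plausible

print(four_choice, two_choice)

# On a hypothetical 40-item test:
print(expected_chance_score(40, 4), expected_chance_score(40, 2))
```

Under these assumptions the chance of a lucky guess doubles, from 1 in 4 to 1 in 2, when the item operates as a two-choice item.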

Computers and Language Testing

Rapid developments in computer technology have had a major impact on test delivery. Already, many important national and international language tests, including TOEFL, are moving to computer-based testing (CBT). Stimulus texts and prompts are presented not in examination booklets but on the screen, with candidates being required to key in their responses. The advent of CBT has not necessarily involved any change in the test content, which may remain quite conservative in its assumptions, but often simply represents a change in test method.


The proponents of computer-based testing can point to a number of advantages. First, scoring of fixed-response items can be done automatically, and the candidate can be given a score immediately. Second, the computer can deliver tests that are tailored to the particular abilities of the candidate.
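Automatic scoring of fixed-response items amounts to comparing each response with an answer key. The Python sketch below shows the idea; the item numbers, the key, and the function name `score_responses` are invented for this example and do not describe how any particular test is scored.

```python
def score_responses(answer_key, responses):
    """Count the items where the candidate's response matches the key.

    Unanswered items are simply missing from `responses` and earn no credit.
    """
    return sum(
        1 for item, correct in answer_key.items() if responses.get(item) == correct
    )


# Hypothetical four-item multiple-choice test.
answer_key = {1: "c", 2: "a", 3: "d", 4: "b"}
responses = {1: "c", 2: "b", 3: "d"}  # item 4 left unanswered

score = score_responses(answer_key, responses)
print(f"{score} out of {len(answer_key)}")
```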


It seems inefficient for all candidates to take all the questions on a test; clearly some are so easy for some candidates that they provide little information on their abilities, while others are too hard to be of use. It makes sense to use the very limited time available for testing to focus on those items that are just within, and just beyond, a candidate’s threshold of ability.
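The idea of focusing on items near a candidate's threshold of ability can be sketched as a simple selection rule: offer the unused item whose difficulty is closest to the current ability estimate, then move the estimate up or down after each answer. This is a deliberately simplified illustration, not the procedure used by any operational adaptive test; the difficulty scale, the step size, and all names are assumptions.

```python
def next_item(item_difficulties, ability, already_used):
    """Pick the unused item whose difficulty is closest to the ability estimate."""
    candidates = {
        item: difficulty
        for item, difficulty in item_difficulties.items()
        if item not in already_used
    }
    if not candidates:
        return None  # item bank exhausted
    return min(candidates, key=lambda item: abs(candidates[item] - ability))


def update_ability(ability, answered_correctly, step=0.5):
    """Raise the estimate after a correct answer, lower it after a wrong one."""
    return ability + step if answered_correctly else ability - step


# Hypothetical item bank: item label -> difficulty on an arbitrary scale.
item_bank = {"A": -1.0, "B": 0.0, "C": 0.6, "D": 1.5}

ability = 0.5
first = next_item(item_bank, ability, already_used=set())
ability = update_ability(ability, answered_correctly=True)
second = next_item(item_bank, ability, already_used={first})

print(first, second)
```

In this run the very easy item "A" is never presented, which is the efficiency gain the paragraph above describes.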


The use of computers for delivery of test materials raises questions of validity. For example, different levels of familiarity with computers will affect people’s performance with them, and interaction with the computer may be a stressful experience for some students or candidates (McNamara, 2000, pp. 79-81).

LEARNING THEORY:

Intrinsic Motivation / Teacher extrinsic motivation

Structure: focused practice / lots of oral practice

Sequence: learn well before moving on to next point.

Reinforcement: review and feedback

INPUT PROCESS OUTPUT

CORRECTION REVIEW FEEDBACK

LEARNING THEORY

ERROR CORRECTION FEEDBACK

LOCAL ERRORS

GLOBAL ERRORS

Consulted Bibliography

McNamara, Tim (2000). Language Testing. Oxford University Press.

Heaton, J. B. (1998). Classroom Testing. Longman Keys to Language Teaching. Longman. London, New York.

Richards, Jack C. (2005). Communicative Language Teaching. Cambridge University Press.

Brown, Douglas (2001). Teaching by Principles. Longman, United States.

IBT Tests (2004). McGraw-Hill.

Freeman, Donald; Richards, Jack C. (2001). Teacher Learning in Language Teaching.

THANK YOU
