THE ANALYSIS OF VALIDITY, RELIABILITY, DISCRIMINATION POWER AND LEVEL OF DIFFICULTY OF FIRST MID-TERM TEST IN THE CASE OF THE EIGHTH GRADE STUDENTS OF SMP 33 SEMARANG (In the Academic Year of 2008/2009) a final project submitted in partial fulfillment of requirements for the degree of Sarjana Pendidikan in English by Ajeng Desy H 2201405080 ENGLISH DEPARTMENT FACULTY OF LANGUAGES AND ARTS SEMARANG STATE UNIVERSITY 2009


THE ANALYSIS OF VALIDITY, RELIABILITY,

DISCRIMINATION POWER AND LEVEL OF DIFFICULTY

OF FIRST MID-TERM TEST IN THE CASE OF THE EIGHTH

GRADE STUDENTS OF SMP 33 SEMARANG (In the Academic Year of 2008/2009)

a final project

submitted in partial fulfillment of requirements

for the degree of Sarjana Pendidikan

in English

by

Ajeng Desy H

2201405080

ENGLISH DEPARTMENT

FACULTY OF LANGUAGES AND ARTS

SEMARANG STATE UNIVERSITY

2009


APPROVAL

The final project was approved by the Board of Examiners of the English Department of the Faculty of Languages and Arts of Semarang State University on 19 August 2009.

Board of Examiners

1. Chairperson: Dra. Malarsih, M.Sn, NIP. 131764021

2. Secretary: Drs. Ahmad Sofwan, PhD, NIP. 131813664

3. First examiner: Drs. Suprapto, M. Hum, NIP. 131125925

4. Second examiner/second advisor: Frimadhona Syafri, S.S, M. Hum, NIP. 132300419

5. Third examiner/first advisor: Drs. Amir Sisbiyanto, M. Hum, NIP. 131281220

Approved by the Dean of the Faculty of Languages and Arts

Prof. Dr. Rustono, M.Hum, NIP. 131281222


DECLARATION

I, Name: Ajeng Desy Hidayati, Student number (NIM): 2201405080, Study program/department: English Education, Faculty of Languages and Arts, Semarang State University, hereby truthfully declare that the final project entitled:

“THE ANALYSIS OF VALIDITY, RELIABILITY, DISCRIMINATION POWER AND LEVEL OF DIFFICULTY OF FIRST MID-TERM TEST IN THE CASE OF EIGHTH GRADE STUDENTS OF SMP 33 SEMARANG IN THE ACADEMIC YEAR 2008/2009”

which I wrote to fulfill one of the requirements for the Sarjana degree, is genuinely my own work, produced through research, supervision, discussion, and presentation/examination. All quotations, whether direct or indirect, and whether obtained from library sources, electronic media, direct interviews, or other sources, are presented in the manner customary in academic writing.

Accordingly, although the examiners and advisors have signed this final project as a mark of its validity, the entire content of this work remains my own responsibility. Should any irregularity be found later, I am willing to accept the consequences.

This declaration is made to be used as needed.

Semarang, August 2009

The declarant,

Ajeng Desy Hidayati NIM. 2201405080


They only are the (true) believers whose hearts feel fear when Allah is mentioned, and when the revelations of Allah are recited unto them they increase their faith, and who trust in their Lord (Al-Anfaal: 2)

Where there is a will, there is a way.

Dedicated to:

My parents

My brothers

My lovely and my friends


ACKNOWLEDGEMENT

Foremost, I wish to take this opportunity to express my gratitude to God the Almighty for the blessing, inspiration, and guidance that led me to complete this final project.

First of all, I address my deepest appreciation to Drs. Amir Sisbiyanto, M. Hum, my first adviser, who has given me valuable guidance and unfailing encouragement from the beginning until this final project was completed. I also extend my gratitude to Frimadhona Syafri, S.S, M. Hum, my second adviser, who has given many suggestions and corrections for its improvement.

In addition, my thanks go to Mrs. Endang Sarwo Sri, S. Pd, the headmaster of SMP 33 Semarang, who gave me permission to conduct the experimental study there. Special thanks also go to Mrs. Aniek Rita, the teacher of the eighth grade of SMP 33 Semarang, who helped me to conduct the experimental study there.

Furthermore, I owe a special debt of gratitude to all members of the teaching staff of the English Department for their continuous guidance during my years of study there.

Finally, I would like to express my thanks to all my family members, friends, and my buddy for their help and moral support.


ABSTRACT

Hidayati, Ajeng Desy. 2009. The Analysis of Validity, Reliability, Discrimination Power, and Level of Difficulty of First Mid-Term Test. The case of eighth grade students of SMP 33 Semarang. In the academic year 2008/2009. Final Project. English Department. Faculty of Languages and Arts. Semarang State University. First advisor: Drs. Amir Sisbiyanto M. Hum. Second advisor: Frimadhona Syafri S.S M.Hum

Key words: validity, reliability, discrimination power, level of difficulty, test item.

A good English test helps students learn the language by requiring them to study hard, emphasizing the objectives of the course, and showing them in which parts of the course they need improvement. A test that is intended to measure students’ achievement has to fulfill the requirements of a good test, such as validity and reliability. Several factors contribute to building a good test: relevance, balance, efficiency, specificity, difficulty, discrimination, variability, and reliability.

In this study, the writer focuses her research on the English mid-term test administered to the eighth grade students of SMP 33 Semarang in the academic year of 2008/2009. The study seeks to answer the following question: “How good is the English mid-term test for the eighth grade of SMP 33 Semarang in the academic year 2008/2009?” The general objective of the study is to obtain an objective description of the structure of a good test item. The writer used a quantitative approach to analyze the data. In writing this final project, the writer conducted two activities. The first was library activity, in which the writer selected books that give information or supporting data for reference. The second was field activity, which was used to collect the data.

The analysis of the test shows 33 valid items and 17 invalid items. The reliability of the test is 0.39, which the writer considers still reliable. From the point of view of discrimination power, the test can be considered poor because the mean discrimination power is 0.17: there are 8 good items, 13 marginal items, and 29 poor items. In terms of difficulty level, the test is categorized as moderate because the mean is 0.41: there are 11 difficult items, 34 moderate items, and 5 easy items. Based on these results, the writer offers some suggestions.

First, the constructor of the test should be aware of the characteristics of a good test, especially in determining difficulty levels and discrimination power. Second, items that can still be used should be revised and saved, while items with a negative discrimination value should be discarded, because a negative value means that the students in the lower group performed better than the students in the upper group. Finally, the writer suggests that the test should not be used as the English final test unless it has been revised. The writer hopes that the result of this item analysis can serve as an example for analyzing other test items and can encourage teachers to make good English tests.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF APPENDICES

CHAPTER

1. INTRODUCTION
1.1 Background of the Study
1.2 Reason for Choosing the Topic
1.3 Statement of the Problem
1.4 Objective of the Study
1.5 Significance of the Problem
1.6 Limitation of the Study
1.7 Outline of the Research

2. REVIEW OF THE RELATED LITERATURE
2.1 Characteristics of a Good Test
2.2 Multiple-choice Items
2.2.1 The Uses of Multiple-Choice Items
2.2.2 Characteristic of Multiple-choice Items
2.2.3 Rules for Constructing Multiple-Choice Items
2.3 Validity
2.4 Reliability
2.5 Item Discrimination Power
2.6 Item Difficulty
2.7 Curriculum

3. METHODS OF INVESTIGATION
3.1 Population
3.2 Sample and Sampling Technique
3.3 Identification of the Problem
3.4 Technique of Data Collection
3.5 Technique and Data Analysis
3.5.1 Analysis of Validity
3.5.2 Analysis of Reliability
3.5.3 Difficulty Level Analysis
3.5.4 Discrimination Power Analysis

4. RESULT OF THE ANALYSIS
4.1 Result of the Analysis
4.1.1 Analysis of Item Validity
4.1.2 Analysis of Item Reliability
4.1.3 Analysis of the Difficulty Level
4.1.4 Analysis of Discrimination Power
4.2 Discussions

5. CONCLUSIONS AND SUGGESTIONS
5.1 Conclusions
5.2 Suggestions

BIBLIOGRAPHY

APPENDICES

LIST OF TABLES

Table
3.5.4 Discrimination Power Values
4.2 Result of the Item Analysis

LIST OF APPENDICES

Appendix
1. Analysis of each Item
2. Computation of Reliability
3. Computation of Discrimination Power
4. Computation of Level of Difficulty
5. List of students in the Upper group and Lower Group
6. List of the Correspondent


CHAPTER I

INTRODUCTION

1.1 Background of the Study

Language is a means of communication. By using language, people can express their feelings, thoughts, and ideas. People use language to communicate with others in fulfilling their daily needs, so language plays an important role in human life. English, as the first international language, is important in global communication. English as a school subject aims to develop the ability to communicate in the language, both spoken and written.

Realizing the role of English, the government has included English as a compulsory subject in Junior High School and Senior High School, and even in Elementary School. In Elementary School, English is taught as part of the local content curriculum. The government is making an effort to increase the quality of education, especially in English, so children are introduced to English as early as possible.

To assess students’ achievement, it is useful for the teacher to conduct a test or examination. A test plays an important role in the teaching and learning process as an integral part of the instructional program, providing information that serves as a basis for a variety of instructional decisions. As stated by Cohen (1998:101), a test is intended to measure students’ achievement and the degree of success of the teaching-learning program. Through testing, we can measure students’ knowledge of or ability in English.


According to Madsen (1983:3-5), testing is an important part of every teaching and learning experience. A well-made test of English can help students in at least two ways. First, it can help create positive attitudes toward instruction by giving students a sense of accomplishment and a feeling that the teacher’s evaluation of them matches what he has taught them. Second, a good English test helps students learn the language by requiring them to study hard, emphasizing course objectives, and showing them where they need to improve.

A good test should fulfill some requirements, such as validity and reliability. According to Ebel (1979:232), several factors build a good test: relevance, balance, efficiency, specificity, difficulty, discrimination, variability, and reliability. To make a good test, the teacher should make sure that the test meets the requirements stated above.

Finally, by analyzing the items of the English mid-term test for the eighth grade students of SMP 33 Semarang in terms of validity, reliability, discrimination power, and level of difficulty, the writer hopes that test writers can build good tests for each grade.

1.2 Reason for Choosing the Topic

According to Weir (1995:3), in evaluating students’ achievement the test should relate closely to both the content of what has been learned and what the students have been taught. By analyzing the test through item analysis, we can indicate which items are reliable and valid and which are not, and we can tell which test has good quality. A good test must be reliable, valid, and moderate in terms of difficulty level, which means that the test is neither too difficult nor too easy. The test must also meet the criterion of discrimination power, under which an item is classified as satisfactory, good, reasonably good, or poor. An item classified as poor should be discarded, because it cannot distinguish well between the students in the upper group and the students in the lower group.

1.3 Statement of the Problem

Through this study the writer would like to find the answer to the following question: “How good is the mid-term test for the eighth grade students of Junior High School?”

More specifically, in analyzing the test items, the writer limits the problem to the following questions:

(1) What is the validity of the test items?

(2) What is the reliability of the test items?

(3) What is the difficulty level of test items?

(4) What is the discrimination power of the test items?

1.4 Objective of the Study

The general objective of the study is to obtain an objective description of the structure of a good test item. This objective is specified into the following goals:

(1) To describe the validity of each test item.

(2) To describe the reliability of each test item.

(3) To describe the value of the difficulty level of each test item.

(4) To describe the value of the discrimination power of each test item.


1.5 Significance of the Problem

The advantages that can be gained from this study are as follows:

(1) For students: Students can use the result of the study to make their study more effective with regard to the right materials. According to Madsen (1983:4), a good test of English can help students in at least two ways. First, such a test can help create positive attitudes toward instruction by giving students a sense of accomplishment. Second, the English test can help students learn the language by requiring them to study hard, emphasizing course objectives, and showing them where they need to improve.

(2) For the teacher: Teachers can use the result of the study as a reference when they want to analyze test items. The test plays several important roles, such as providing insight into ways of improving the evaluation process and providing a means for teachers to diagnose whether they have taught effectively.

(3) For the test maker: The test maker may use it as a supplement in constructing tests.

(4) For the writer: The study can increase the writer’s own skill in constructing test items.

1.6 Limitation of the Study

The writer analyzes the English mid-term test for the eighth grade students of Junior High School in the form of multiple-choice items, in the belief that:

(1) Multiple-choice items have only one correct answer, so the grader will grade the answers consistently.

(2) By using this type of item analysis, the discrimination power and difficulty level of the test can be practically determined.

1.7 Outline of the Research

This study consists of five chapters. Chapter I covers the background of the study, the reason for choosing the topic, the statement of the problem, the objective of the study, the significance of the study, and the limitation of the study. Chapter II reviews the related literature in connection with the analysis of test items in the Junior High School mid-term test. Chapter III describes the method of investigation, which consists of the population, the sample and sampling technique, the identification of the problem, the technique of data collection, and the technique of data analysis. Chapter IV discusses the data analysis and its interpretation. Chapter V offers some conclusions and suggestions.


CHAPTER II

REVIEW OF RELATED LITERATURE

2.1 Characteristics of a Good Test

Tests contribute directly to the teaching-learning process in classroom instruction, and they are useful in programmed instruction, curriculum development, marking, guidance and counseling, school administration, and research (Gronlund, 1976:7).

Regarding the test roles, Vallete (1977:3) states “… classroom test plays three

important roles in the second language program: they define course objective, they

stimulate students’ progress, and they evaluate class achievement.”

Tests provide information about the success of teachers’ and students’ efforts to teach and to learn. The need for good tests of educational achievement has therefore become more intense. According to Ebel (1979:14), the imperfect tests we now use serve us far better than no tests at all.

Before constructing a test, we must recognize the characteristics of a good test. A good test should be valid and reliable. As Harris states, any test that we use must be appropriate in terms of objectives, dependable in the evidence it provides, and applicable to our particular situation (Harris, 1969:13).

The first characteristic of a good test is validity. Brown (1988:163) defines the validity of a test as the extent to which the test measures what it is supposed to measure and nothing else. Gronlund (1981:66) states that validity refers to the results of a test or evaluation instrument for a given group of individuals, not to the instrument itself. Validity is concerned with the specific use to be made of the results and with the truthfulness of our proposed interpretation.

The second characteristic that a good test must meet is reliability. Reliability refers to the consistency of test scores, that is, how consistent they are from one measurement to another. Vallete and Harris make the same statement: reliability refers to the stability of the test score. An important consideration, then, is determining whether or not a test is reliable.

Besides that, there are several types of test based on their criteria:

A. Based on the function the test gives.

Vallete divides the type of the test as follows:

(1) Summative test. This test is usually given at the end of a marking

period and measures the ‘sum’ total of the material covered (at the end

of the academic year or term)

(2) Formative test. This test is given during the course of instruction; its purpose is to show which aspects of the chapter the student has mastered and where remedial work is necessary. (Vallete, 1977:11)

B. Based on the way the test is scored.

This type of the test is also divided into two parts. They are:

(1) Objective test. It is a test which has only one correct answer; therefore, whether the test is scored by one teacher or another, and whether it is scored today or next week, it is always scored the same way (e.g. multiple-choice test).

(2) Subjective test. It is one that does not have a single right answer. The

result of the test may be different if it is scored by different persons.

(Vallete, 1977:10)

C. Based on the test constructor.

Harris divides this type of test as follows:

(1) Standardized test. It is formal and large-scale test which is prepared by

professional testing services to assist institutions in the selection,

placement, and evaluation of the students. Usually, it has been proved

in terms of validity and reliability.

(2) Teacher-made test. It is generally prepared, administered, and scored by a teacher. (Harris, 1969:1)

D. Based on the objective of the test.

Harris divides this type of test into three parts. They are:

(1) Achievement test to measure the extent of students’ achievement of the

instructional goals.

(2) Aptitude test (prognostic test) to determine whether or not they will be

successful in a certain field or study.

(3) General proficiency test to measure what a person already knows

(learned) in the target language, but the aim is to determine whether

this language ability corresponds to specific language requirements.

(Harris, 1969:2)

So, it can be concluded that the quality of a test depends on its validity, reliability, discrimination power, and level of difficulty. Validity refers to whether the test measures what it is intended to measure, and reliability refers to the consistency of the test scores. Discrimination power refers to how well the test discriminates between the students in the upper group and the students in the lower group. The level of difficulty refers to the percentage of students who got the item right.

2.2 Multiple-choice Items

According to McNamara (2000:5), the multiple-choice format is a format for test questions in which candidates have to choose from a number of presented alternatives, only one of which is correct. Brown (2004:194) states that the most popular method of testing a reading knowledge of vocabulary and grammar is the multiple-choice format, mainly for reasons of practicality: it is easy to administer and can be scored quickly.

According to John Boker (http://www.uab.edu/uasomume/cdm/test.htm), the advantages and disadvantages of multiple-choice items are as follows:

a. Advantages: A multiple-choice test can measure all levels of student ability, it enables wide sampling of subject content, it is quick and easy to score, it enables objective scoring, and it can be analyzed for effectiveness.

b. Disadvantages: Good multiple-choice items are difficult to construct, and they tend to measure simple recall.

2.2.1 The Uses of Multiple-Choice Items

According to Gronlund (1998:60-75), multiple-choice items are appropriate for both classroom-based and large-scale situations. Gronlund (1982:39) suggests that multiple-choice items can be used to measure both knowledge outcomes and various types of intellectual skills.

Gronlund (1976:190-195) states that multiple-choice items can be used to measure:

a. Knowledge of terminology

b. Knowledge of specific facts

c. Knowledge of conventions

d. Knowledge of trends and sequence

e. Knowledge of classification and categories

f. Knowledge of criteria

g. Knowledge of methodology

h. Knowledge of principles and generalization

i. Knowledge of theories and structure

2.2.2 Characteristic of Multiple-choice Items

A multiple-choice item consists of a problem and a list of suggested solutions (Gronlund, 1976:188). The problem may be stated as a direct question or an incomplete statement; it presents the problem situation and is called the stem of the item. The list of suggested solutions may include words, numbers, symbols, or phrases; these provide possible solutions to the problem and are called alternatives.

The alternatives include the correct answer, while the remaining alternatives, which are plausible wrong answers, are called distracters. The function of the distracters is to distract those students who are not certain of the answer.

2.2.3 Rules for Constructing Multiple-Choice Items

Ideally, a multiple-choice item presents students with a task that is both important and clearly understood, and one that can be answered correctly only by those who have achieved the desired learning. Gronlund (1982:40-44) puts forward the following hints for constructing multiple-choice items:

a. The stem of the item should be meaningful by itself and should present a defined problem.

b. Present a single, clearly formulated problem in the stem of the item.

c. State the stem of the item in simple, clear language.

d. Put as much of the wording as possible in the stem of the item.

e. State the stem in positive form wherever possible.

f. Emphasize negative wording whenever it is used in the stem of an item.

g. Make certain that the intended answer is correct or clearly best.

h. Make all alternatives grammatically consistent with the stem of the item

and parallel in form.

i. Avoid verbal clues that might enable students to select the correct answer

or eliminate an incorrect alternative.

j. Make the distracters plausible and attractive to the uninformed.

k. Vary the relative length of the correct answer to eliminate length as a clue.

l. Vary the position of the correct answer in a random manner.

Most of the test items in the final tests or mid-term tests in junior or senior high schools are in the form of multiple-choice items. The writer considers that this is based on the principles of constructing test items: the multiple-choice item is practical, and its scoring procedure is specific and efficient. In addition, both test takers and test makers are used to the multiple-choice form.

2.3 Validity

Validity refers to whether or not a test measures what it purports to measure. A test cannot be valid unless it is also reliable, for an unreliable test does not measure anything consistently.

Gronlund (1976:81-97) claims that there are three basic types of validity that are commonly used in educational and psychological measurement. They are:

a. Content validity

It may be defined as the extent to which a test measures a representative sample of the subject-matter content and the behavioral changes under consideration.

b. Criterion-related validity

It is the extent to which test performance is related to some other valued measure of performance. It matters whenever test scores are to be used to predict future performance, or to estimate current performance, on some valued measure other than the test itself.

c. Construct validity

Construct validity may be defined as the extent to which test performance can be interpreted in terms of certain psychological constructs.

A number of factors tend to influence the validity of test results. Gronlund (1979:98-100) points out that some of these influences can be found in the instrument itself, some in the typical responses of the pupils to the test situation, and still others in the nature of the group tested and in the composition of the criterion measures used.

In this final project, the writer used content validity. Content validity means that the test must represent a sample of the content of whatever the test claims to test. In this case, the items of the Junior High School mid-term test have to match the material that the test is meant to measure.

2.4 Reliability

Reliability deals with the consistency of the results, that is, how consistent test scores or other evaluation results are from one measurement to another. If a test is reliable, then a student’s score on it, when compared to the scores of his classmates, should be similar to his relative score on another test measuring the same information. Gronlund (1982:132) claims that reliability refers to the consistency of test scores, that is, how consistent they are from one measurement to another. Reliability measures provide an estimate of how much variation we might expect under different conditions.

Reliability is the consistency of the test, meaning how consistent or repeatable the test is. If the test is reliable, the first test and the next test give the same measure. In other words, if a student is compared with his classmates on one test, his standing relative to them will not change on the next test covering the same field.
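The idea that students keep the same relative standing across two administrations can be illustrated with a correlation between the two sets of scores. The sketch below is only an illustration of that idea: this section does not say which reliability estimate the study computed, and the function name and the five pairs of scores are hypothetical.

```python
from math import sqrt


def pearson_r(first: list[float], second: list[float]) -> float:
    """Pearson correlation between two score lists for the same students."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(first)
    mean_x = sum(first) / n
    mean_y = sum(second) / n
    # Sum of cross-products and sums of squares around the two means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both administrations")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of five students on two administrations of the same test.
test_1 = [35, 28, 42, 19, 31]
test_2 = [33, 30, 40, 21, 29]
print(f"r = {pearson_r(test_1, test_2):.2f}")
```

A coefficient close to 1 means that the students are ranked almost identically on both occasions, which is the sense of consistency described above.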

2.5 Item Discrimination Power

Item discrimination, or discrimination power, explains how well an item performs in separating the better students from the poorer ones. If the good students tend to do well on an item and the poor students do badly on the same item, then the item is a good one because it distinguishes the good students from the poor ones. This is the idea underlying the index of discrimination.

To calculate the discrimination power we can use the following steps:

(1) Find the number in the upper group who got the item right.

(2) Find the number in the lower group who got the item right.

(3) Subtract the number getting it right in the lower group from the number getting it right in the upper group.

(4) Divide this figure by one half of the total number of papers in the upper and lower groups.

$$D = \frac{R_U - R_L}{\tfrac{1}{2}T}$$

$D$: the discrimination power.

$R_U$: the number of students in the upper group who answer the item correctly.

$R_L$: the number of students in the lower group who answer the item correctly.

$\tfrac{1}{2}T$: one half of the total number of students included in the item analysis.

(Gronlund, 1981:259)

The value of D is interpreted as follows:

(1.0): obtained when all the students in the upper group answer correctly and no one in the lower group does.

(.00): obtained when an equal number of students in the upper and lower groups answer the item correctly.

(-): obtained when more students in the lower group than in the upper group answer correctly.

In calculating the discrimination power, the writer divided the students into three groups: the upper group, the middle group, and the lower group. From the calculation, we can classify the items by discrimination power as satisfactory, good, reasonably good, or poor. An item with a zero or negative discrimination index is a bad item. A negative value means that the students in the lower group performed better than the students in the upper group. Such an item must be revised or discarded.
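As a minimal sketch of the calculation described in this section, the following Python function computes the discrimination power of one item from the upper-group and lower-group counts. The function name, the example counts, and the group size of 40 students per group are hypothetical and are not taken from the study's data.

```python
def discrimination_power(ru: int, rl: int, total: int) -> float:
    """Return D = (RU - RL) / (T / 2).

    ru    -- students in the upper group who answered the item correctly
    rl    -- students in the lower group who answered the item correctly
    total -- T, all students in the upper and lower groups combined
    """
    if total <= 0 or total % 2 != 0:
        raise ValueError("total must be a positive even number (two equal groups)")
    half = total // 2
    if not (0 <= ru <= half and 0 <= rl <= half):
        raise ValueError("ru and rl must each lie between 0 and total / 2")
    return (ru - rl) / half


# Hypothetical item: 30 of 40 upper-group students and 18 of 40
# lower-group students answer correctly.
d = discrimination_power(ru=30, rl=18, total=80)
print(f"D = {d:.2f}")
```

With these made-up counts the index is positive, so the item favors the upper group. Swapping the two counts gives the same magnitude with a negative sign, which is the case that calls for revising or discarding the item.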

2.6 Item Difficulty

The difficulty of a test item is indicated by the percentage of students who get the item right. The more difficult an item is, the fewer students select the correct option; the easier an item is, the more students select it.

Teachers often hold mistaken opinions about difficulty: some feel that they can earn the respect of their students by giving them easy tests, while others give more difficult test items in order to earn the respect of students and parents.

Several factors affect how difficult test items should be. According to Mahren and Lehman (1984:31), the decision of how difficult the test should be depends on a variety of factors. They are:

(1) The purpose of the test

(2) The ability level of the students

(3) The age or grade level of the students

The index of item difficulty (P) can be computed in two ways. The first is to divide the number of students who answered an item correctly (R) by the total number of students tested (T), and then multiply by one hundred.

$$P = \frac{R}{T} \times 100$$

The second way is to divide the students into the upper group and the lower group only, assuming that the responses of the students in the middle group follow essentially the same pattern.

Expressed as a proportion, that is, R divided by T without multiplying by one hundred, the index of difficulty runs from 0.00 to 1.00, with 1.00 indicating the easiest possible item. An index in the range from 0.31 to 0.70 indicates that the item is considered moderate or acceptable. The most difficult items run from 0.00 to 0.30.

By knowing how many students answered the item correctly, we can calculate the item difficulty. It can be calculated in two ways, but in this final project the writer used the first: dividing the number of students who got the item right by the total number of students tested, and multiplying by one hundred. In this way, we can determine which items are difficult, moderate, and easy.

2.7 Curriculum

The curriculum used in the eighth grade of SMP 33 Semarang is KTSP (Kurikulum Tingkat Satuan Pendidikan). In the KTSP, several materials are given to the students. They are:

1) Descriptive Text

2) Recount Text

3) Narrative Text

4) Asking and Giving Permission

5) Asking for and Giving Help

6) Refusing Help

7) Giving Response

8) Giving Argument

9) Invitation

10) Announcement

11) Short Message

The writer compares each item with the curriculum to determine whether the item is suitable or not.


CHAPTER III

METHODS OF INVESTIGATION

In the third chapter, the writer presents the population, the sample and sampling technique, the identification of the problems, the technique of data collection and the technique of data analysis. In this research, the writer used two methods to obtain the data required for the study, namely library research and analysis of students’ work.

3.1 Population

The population of this study is the eighth grade students of Junior High School in the first semester, which covers three classes, each consisting of forty students. Therefore, the total number of the population is one hundred and twenty.

3.2 Sample and Sampling Technique

To make the study effective, the writer selects a sample. The writer takes three classes at the school, each consisting of forty students, so the total sample is one hundred and twenty students. In this study, the writer uses the random sampling technique to take the sample.

According to Brink (1974:33), random sampling refers to the process of drawing a random sample of individuals from some population. The writer carries out simple random sampling by taking three classes at a school and collecting the students’ answer sheets to be analyzed.


3.3 Identification of the Problem

Most teachers of Junior and Senior High Schools still do not know how to construct a good test. They make tests without paying attention to the characteristics or the qualities of a good test.

There are four problems related to teacher-made English test items. The problems are:

a. The validity level

b. The reliability level

c. The difficulty level

d. The discrimination power

3.4 Technique of Data Collection

In this study the intended test is the mid-term test of the eighth grade students of Junior High School. The data are in the form of the students’ answer sheets and the test items of the mid-term test. The writer selects the eighth grade students of Junior High School to get the required data. Before the test is administered to the students, the writer contacts the English teacher of the selected school to ensure that the materials are no longer in use. Then, she begins to analyze the test.

3.5 Technique of Data Analysis

The data to be analyzed in this study are taken from the students’ answer sheets of the mid-term test of the eighth grade of Junior High School. They are used to analyze the quality of the test items.

The purpose of this analysis is to identify the quality of each item, whether it is a good, moderate, or bad item. Through item analysis, we can also find information about the weaknesses or shortcomings of the items. Here the item analysis consists of the following:

3.5.1 Analysis of Validity

Validity refers to whether or not a test measures what it is supposed to measure. In this study the writer used content validity, which means that each item should measure what it is supposed to measure. The writer compares each item with the KTSP curriculum: if the item meets the criteria of the curriculum material, it is considered a valid item; otherwise it is considered invalid.

3.5.2 Analysis of Reliability

The formula used to estimate the reliability of the test is the Kuder-Richardson 20 formula. According to Brown (2005:185), the Kuder-Richardson 20 formula is the most accurate and flexible formula for calculating reliability. The formula is:

$$\text{K-R20} = \frac{k}{k-1}\left(1 - \frac{\sum S_i^2}{S_t^2}\right)$$

K-R20 = Kuder-Richardson formula 20

k = number of the items

Si² = item variance

St² = test score variance (Brown, 2005:181)

The formula to calculate the variance is:

$$S_t^2 = \frac{\sum Y^2 - \frac{\left(\sum Y\right)^2}{N}}{N}$$

St² = the variance

∑ = the sum of

Y = the total score

N = the number of respondents
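The reliability and variance formulas can be sketched in code as follows. This is a minimal illustration rather than the writer's actual calculation: the function names and the small response matrix in the usage note are hypothetical, and every answer is assumed to be scored 1 (correct) or 0 (incorrect).

```python
from typing import List


def population_variance(values: List[float]) -> float:
    """Variance as (sum(Y^2) - (sum(Y))^2 / N) / N."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(v * v for v in values)
    return (total_of_squares - total * total / n) / n


def kr20(responses: List[List[int]]) -> float:
    """K-R20 for a matrix where responses[s][i] is 1 if student s
    answered item i correctly and 0 otherwise (assumed scoring)."""
    k = len(responses[0])  # number of items
    item_variances = [
        population_variance([row[i] for row in responses]) for i in range(k)
    ]
    total_scores = [sum(row) for row in responses]
    total_variance = population_variance(total_scores)
    if total_variance == 0:
        raise ValueError("all students have the same total score")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 4 students, 3 items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(example), 2))
```

For the hypothetical matrix above, the item variances sum to 0.625 and the variance of the total scores is 1.25, so the coefficient is (3/2)(1 - 0.625/1.25) = 0.75.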

3.5.3 Difficulty Level Analysis

A good test item is neither too difficult nor too easy for the group of students tested. An item is considered easy if more than 75 percent of the group respond to it correctly. If between 25 percent and 75 percent of the students in a group respond correctly to an item, the item is considered moderate. A hard item is one which fewer than 25 percent of the students answer correctly.

Based on the above, the difficulty level criteria used are:

(1) 0.00 ≤ P ≤ 0.25: the item is difficult

(2) 0.25 < P ≤ 0.75: the item is moderate

(3) 0.75 < P ≤ 1.00: the item is easy

The formula is:

$$P = \frac{R}{T}$$

P = difficulty level or index of difficulty

R = the number of students responding correctly to the item

T = the total number of students responding to the item (Nitko, 1983:288)
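The difficulty index and its three categories can be sketched as follows. The function names are hypothetical, and the category boundaries assume that an index of exactly 0.25 counts as difficult and exactly 0.75 as moderate, in line with the ranges 0.00 to 0.25, 0.26 to 0.75 and 0.76 to 1.00 used in the results chapter.

```python
def difficulty_index(correct: int, total: int) -> float:
    """P = R / T: proportion of students who answered the item correctly."""
    if total <= 0:
        raise ValueError("total number of students must be positive")
    return correct / total


def classify_difficulty(p: float) -> str:
    """Map a difficulty index onto the three categories (assumed boundaries)."""
    if p <= 0.25:
        return "difficult"
    if p <= 0.75:
        return "moderate"
    return "easy"


# Example: 108 of 120 students answer an item correctly.
p = difficulty_index(108, 120)
print(p, classify_difficulty(p))
```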

3.5.4 Discrimination Power Analysis

Item discrimination tells how well the item performs in separating the better students from the poorer students. The formula for computing item discrimination power is as follows:

$$D = \frac{R_U - R_L}{\frac{1}{2}T}$$

D = the index of discrimination power (DP)

RU = the number of students in the upper group who answer the item correctly

RL = the number of students in the lower group who answer the item correctly

½ T = one half of the total number of the students included in the item analysis (Gronlund, 1982:103)

An index of 0.00 is obtained when an equal number of students in each group answer correctly.

An index of 1.00 is the highest possible value, indicating that all students in the upper group answered the item correctly and all students in the lower group answered it incorrectly.

A negative index is obtained when more students in the lower group than in the upper group answer correctly.

Items with zero or negative DP should be removed from the test and then discarded or improved. Ebel and Frisbie (1991:232) classify the discrimination power values as follows:

Discrimination index    Item evaluation
0.40 and above          Very good item
0.30-0.39               Reasonably good but possibly subject to improvement
0.20-0.29               Marginal item, usually needing improvement
0.19 and below          Poor item, to be rejected or improved by revision

In estimating the discrimination power, the writer divided the sample into three groups: an upper group, a middle group, and a lower group. To form the upper and lower groups, the writer ranked the sample from 1 to 120. From the ranking, the upper group consists of the 27% of students who got the highest scores in the whole sample, and the lower group consists of the 27% of students who got the lowest scores. The remaining students are categorized as the middle group. The list of upper group and lower group students can be seen in appendix 6.

By using the criteria above, the writer analyzed the items to determine whether or not the test can be said to be a good test.
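The grouping and the discrimination formula can be sketched as follows. This is an illustration under stated assumptions, not the writer's procedure: the function names are hypothetical, 27% of the sample is rounded to the nearest whole student (32 for a sample of 120), ties in rank are broken arbitrarily, and an index that falls between two printed bands (for example 0.195) is assigned to the lower band.

```python
from typing import Dict, List, Tuple


def split_groups(scores: Dict[str, float], fraction: float = 0.27
                 ) -> Tuple[List[str], List[str]]:
    """Rank students by total score and return (upper, lower) groups."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = max(1, round(len(ranked) * fraction))
    return ranked[:size], ranked[-size:]


def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """D = (RU - RL) / (T / 2), where T = 2 * group_size."""
    return (upper_correct - lower_correct) / group_size


def classify_discrimination(d: float) -> str:
    """Categories after the Ebel and Frisbie table (assumed band edges)."""
    if d >= 0.40:
        return "very good"
    if d >= 0.30:
        return "reasonably good"
    if d >= 0.20:
        return "marginal"
    return "poor"


# Hypothetical sample of 120 students with distinct total scores.
scores = {f"s{i}": i for i in range(120)}
upper, lower = split_groups(scores)
print(len(upper), len(lower))

# Example: 28 upper-group and 11 lower-group students answer correctly.
d = discrimination_index(28, 11, len(upper))
print(round(d, 2), classify_discrimination(d))
```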


CHAPTER IV

RESULT OF THE STUDY

4.1 Result of the Analysis

This study analyzed four aspects of the test items: validity, reliability, discrimination power and level of difficulty. The aim of this study is to analyze the test items of the mid-term test of the eighth grade students of SMP 33 Semarang.

By analyzing these four aspects of the test items, we can identify whether each item is good, moderate or poor. We can also identify the weaknesses of the items and how to revise them. From the analysis of the test items of the mid-term test of the eighth grade students of SMP 33 Semarang, we obtain the following data.

4.2.1 Analysis of Item validity

The writer used Pearson’s product moment correlation to calculate the validity value of each test item. The value for each item was then compared with the table of r product moment values. A test item is categorized as valid if its value of r is higher than the value in the table, and invalid otherwise.
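The comparison described above can be sketched as follows. This is an assumption-laden illustration: the text does not say which two variables were correlated, so the sketch assumes the common choice of each item's 0/1 score against the students' total scores. The function names are hypothetical, and the critical value `r_table` must be read from an r product moment table for the actual sample size and significance level; it is not computed here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for constant data")
    return numerator / denominator


def is_valid_item(item_scores: Sequence[int], total_scores: Sequence[float],
                  r_table: float) -> bool:
    """An item counts as valid when its r exceeds the tabled critical value."""
    return pearson_r(item_scores, total_scores) > r_table


# Hypothetical data: one item's 0/1 scores and the total scores of 4 students.
print(is_valid_item([1, 1, 0, 0], [4, 3, 2, 1], r_table=0.5))
```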

From the validity calculation, the writer got the following results:

(1) There are 33 test items which fulfill the requirement of validity. They are items number 1, 2, 3, 5, 6, 7, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 24, 25, 26, 27, 30, 35, 36, 37, 38, 39, 40, 42, 43, 44, 47, 49, 50.

(2) There are 17 items which are invalid. They are items number 4, 8, 9, 13, 21, 22, 23, 28, 29, 31, 32, 33, 34, 41, 45, 46, 48.


From those 50 test items, the analysis shows that several items can still be used in subsequent tests.

4.2.2 Analysis of Item Reliability

To get the coefficient of reliability of the test, the writer applied the Kuder-Richardson 20 formula. From the calculation, the coefficient of reliability of the test is 0.39. This value was then compared with the table of r product moment values at the 0.05 (5%) level of significance. Because the calculated value of r is higher than the value in the table, it can be concluded that the test items used in the English mid-term test for the eighth grade of SMP 33 Semarang in the academic year 2008/2009 are reliable. The calculation of the reliability is listed in appendix 2.

4.2.3 Analysis of the Difficulty Level

By using Nitko’s formula, the difficulty level of an item can be analyzed by calculating the proportion of students who got the item right. The level of item difficulty is categorized into three levels. They are:

(1) Index 0.00 to 0.25 is categorized as difficult items.

(2) Index 0.26 to 0.75 is categorized as moderate items.

(3) Index 0.76 to 1.00 is categorized as easy items.

The result of the data analysis is as follows:

(1) Items that belong to the difficult level are items number 10, 12, 15, 17, 18, 19, 20, 21, 26, 28, and 33.

(2) Items that can be classified as moderate items are items number 1, 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 16, 23, 24, 25, 27, 29, 32, 34, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50.

(3) Items that can be regarded as easy items are items number 2, 22, 30, 31, and 36.

From those 50 items, the English mid-term test of the eighth grade students of SMP 33 Semarang can be categorized as moderate in terms of difficulty level, since the mean difficulty level is 0.41. The items that are considered easy can still be used to encourage and motivate the weaker students. An example of the calculation of the difficulty level can be seen in appendix 3.

4.2.4 Analysis of Discrimination Power

In analyzing the discrimination power of the test items, the writer used Gronlund’s formula. It tells how well an item performs in separating the upper group and lower group students. The discrimination power of the test items is categorized into four categories, as follows:

(1) Items with an index of 0.40 and above are categorized as very good items.

(2) Items with an index of 0.30 to 0.39 are reasonably good items, but possibly subject to improvement.

(3) Items with an index of 0.20 to 0.29 are marginal items, usually needing improvement.

(4) Items with an index of 0.19 and below are poor items. These items should be rejected or improved by revision.

From the data analysis, the results are as follows:

(1) 29 items are categorized as poor items. They are items number 1, 2, 4, 5, 9, 11, 12, 13, 14, 15, 16, 17, 19, 20, 21, 22, 24, 27, 28, 30, 31, 33, 36, 37, 41, 42, 43, 46, 47.

(2) 13 items can be classified as marginal items. They are items number 6, 7, 8, 10, 18, 23, 26, 29, 32, 38, 40, 44, 50.

(3) There are 6 items that belong to the reasonably good category. They are items number 25, 34, 39, 45, 48, and 49.

(4) There are only 2 items that can be classified as very good items. They are items number 3 and 35.

The mean of the discrimination power is 0.17, so as a whole the mid-term test items are categorized as poor. In this item analysis there are 5 items with negative values of discrimination power. A negative value of discrimination power means that more students in the lower group than in the upper group answered the question correctly. These items are item number 13 with a discrimination power of -0.063, item number 15 with -0.125, item number 30 with -0.03, item number 31 with -0.06 and item number 42 with -0.06. These items should be rejected. An example of the discrimination power calculation is listed in appendix 4.

4.2 Discussions

The goal of this study of the first-term mid-term test of the eighth grade students of SMP 33 Semarang was to determine which items fulfill the characteristics of a good test (items that can be used again in subsequent tests) and which items do not (items that should be discarded or need some revision). The analysis focused on identifying the quality of each item, whether it can be classified as a good, moderate, or poor item. In terms of validity, an item can be classified as valid or invalid. From the point of view of difficulty level, a good item is one which is neither too easy nor too difficult. From the point of view of discrimination power, a good item is one which can discriminate between the lower and upper group students.

According to Gronlund (1981:151-160), there are some criteria for determining which items can still be used, should be revised, or should be discarded.

(1) An item can be used if it meets one of the following criteria.

a. Valid, reliable, good discrimination power and moderate difficulty level.

b. Valid, reliable, satisfactory discrimination power and moderate difficulty level.

(2) An item can be used with several revisions if it meets one of the following criteria.

a. Valid, reliable, good discrimination power, but the difficulty level is too easy or too difficult.

b. Valid, reliable, satisfactory discrimination power, but the difficulty level is too easy or too difficult.

c. Valid, reliable, poor discrimination power and moderate difficulty level.

d. Not valid, reliable, good discrimination power and moderate difficulty level.

e. Not valid, reliable, satisfactory discrimination power and moderate difficulty level.

(3) An item should be discarded if it meets one of the following criteria.

a. Valid, reliable, poor discrimination power, and the difficulty level is too easy or too difficult.

b. Not valid, reliable, good discrimination power, but the difficulty level is too easy or too difficult.

c. Not valid, reliable, satisfactory discrimination power, but the difficulty level is too easy or too difficult.

d. Not valid, reliable, poor discrimination power and moderate difficulty level.

e. Not valid, reliable, poor discrimination power, and the difficulty level is too easy or too difficult.
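The criteria listed above can be condensed into a small decision function. The condensation is an observation about the list, not something the source states: every listed combination treats the test as reliable, and the outcome then depends only on how many of three defects an item has (not valid, poor discrimination power, difficulty level not moderate). The function name and the string labels are hypothetical.

```python
def item_decision(valid: bool, discrimination: str, difficulty: str) -> str:
    """Return "use", "revise" or "discard" for one item.

    discrimination: "good", "satisfactory" or "poor"
    difficulty: "easy", "moderate" or "difficult"
    Reliability is a property of the whole test and is assumed here.
    """
    defects = 0
    if not valid:
        defects += 1
    if discrimination == "poor":
        defects += 1
    if difficulty != "moderate":
        defects += 1
    # No defect: use; one defect: revise; two or three defects: discard.
    return ("use", "revise", "discard")[min(defects, 2)]


print(item_decision(True, "good", "moderate"))
print(item_decision(True, "poor", "moderate"))
print(item_decision(False, "poor", "easy"))
```

Checking the function against each of the twelve listed combinations reproduces the three groups, which is why the defect count is a convenient shorthand for the list.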

Based on the results of the item analysis, which includes the analysis of validity, reliability, difficulty level and discrimination power of the items, the results of the data analysis are explained in detail as follows:

(1) The items which can be used again are as follows:

a. There are 5 items that can be classified as valid, reliable, with good discrimination power and moderate difficulty level. They are items number 3, 25, 35, 39 and 49.

b. There are 6 items that can be classified as valid, reliable, with satisfactory discrimination power and moderate difficulty level. They are items number 6, 7, 38, 40, 44 and 50.

(2) The items which can still be used but need several revisions are as follows:

a. There are no items that can be classified as valid, reliable, with good discrimination power but a difficulty level that is too easy or too difficult.

b. There are 2 items that can be categorized as valid, reliable, with satisfactory discrimination power but a difficulty level that is too easy or too difficult. They are items number 10 and 26.

c. There are 13 items that are classified as valid, reliable, with poor discrimination power and moderate difficulty level. They are items number 1, 5, 11, 13, 14, 16, 19, 24, 27, 37, 42, 43 and 47.

d. There are 3 items that can be classified as not valid, reliable, with good discrimination power and moderate difficulty level. They are items number 34, 45 and 48.

e. There are 5 items that are considered not valid, reliable, with satisfactory discrimination power and moderate difficulty level. They are items number 8, 18, 23, 29 and 32.

(3) The items which should be discarded are the following:

a. There are 7 items that can be classified as valid, reliable, with poor discrimination power and a difficulty level that is too easy or too difficult. They are items number 2, 12, 15, 17, 20, 30 and 36.

b. There are no items that are considered not valid, reliable, with good discrimination power but a difficulty level that is too easy or too difficult.

c. There are no items that are considered not valid, reliable, with satisfactory discrimination power but a difficulty level that is too easy or too difficult.

d. There are 5 items that can be classified as not valid, reliable, with poor discrimination power and moderate difficulty level. They are items number 4, 9, 22, 41 and 46.

e. The items that are considered not valid, reliable, with poor discrimination power and a difficulty level that is too easy or too difficult are items number 21, 28, 31 and 33.

For further explanation, see the following descriptions of the items.

Item number 1

Question I am an SMP student. My name is Rini. I have one sister and

two brothers. My sister’s name is Tuti and my Brother’s names

are Fauzan and Doni. My father’s name is Syahbudin and my

mother’s name is anis. I live with my family at Gunung Talang

street. I am twelve years old. Tuti is ten years old and fauzan is

six years old. My father is forty-three and my mother is thirty-

five. We are happy family

Rini’s sister is… years old.

a. 4

b. 6

c. 10

d. 12

Result        A    B    C*   D
Upper 27%     0    1    18   13
Middle 46%    7    1    25   22
Lower 27%     11   1    12   21
Total         28   3    55   56

Validity      Valid
P value       0.46
D value       0.19

The item above meets the criterion of the materials, namely descriptive text, so it can be said to be a valid item because it meets the criteria of the KTSP curriculum. With a P value of 0.46, it is classified as a moderate item because only 55 students chose the correct answer. From the D value, this item can be categorized as a poor item, since 18 students from the upper group chose the correct answer and 12 students from the lower group did the same. Based on these criteria, this item can still be used in the next test with several revisions.

Item number 2

Question What is the text about?

a. Rini’s address

b. Rini’s school

c. Rini’s family

d. Rini’s age

Result        A    B    C*   D
Upper 27%     1    0    30   1
Middle 46%    1    2    51   3
Lower 27%     1    1    27   2
Total         3    3    108  6

Validity      Valid
P value       0.9
D value       0.09

This item is said to be easy because 108 students out of 120 chose the correct answer. In terms of discrimination power, this item is categorized as a poor item because 27 students from the lower group chose the correct answer and only 30 students from the upper group did the same. Although this item can be said to be valid, in the writer’s opinion it should be discarded.
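As an illustration, the two indices for item number 2 can be recomputed from its table. This is a sketch under one assumption that the text does not state explicitly: that the upper and lower groups each contain 32 students (27% of 120, rounded), so that ½ T = 32.

$$P = \frac{R}{T} = \frac{108}{120} = 0.90$$

$$D = \frac{R_U - R_L}{\frac{1}{2}T} = \frac{30 - 27}{32} \approx 0.09$$

Under that assumption, both results agree with the P value and D value reported for this item.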

Item number 3

Question How many children’s do Rini’s parents have?

a. 6

b. 5

c. 4

d. 3

Result        A    B    C*   D
Upper 27%     1    0    28   3
Middle 46%    15   8    6    17
Lower 27%     3    0    11   18
Total         19   8    45   38

Validity      Valid
P value       0.37
D value       0.53

This item is considered a very good item in terms of D value, since 28 students in the upper group chose the correct answer and only 11 students in the lower group did the same. The item is neither too easy nor too difficult; it is a moderate item since 45 students chose the correct answer. It also meets the criterion of a valid item because it tests descriptive text. Based on the criteria above, it can be said that this is a good item.

Item number 4

Question “I can’t pack my bag and go home.”

The underlined words mean…the bag.

a. ask someone to bring

b. put his clothes in

c. put his belongings into

d. carry things with

Result        A    B    C*   D
Upper 27%     4    8    15   5
Middle 46%    9    7    22   12
Lower 27%     2    10   11   9
Total         15   25   48   26

Validity      Invalid
P value       0.4
D value       0.125

The table above shows that this item is neither too easy nor too difficult; in other words, it can be categorized as a moderate item in terms of difficulty level. In terms of discrimination power, this item is classified as a poor item, since 15 students in the upper group and 11 students in the lower group answered the item correctly. In terms of validity, this item is invalid because it does not meet the criteria of the curriculum. Considering the criteria of difficulty level, discrimination power and validity, this item can still be used in the next test with several revisions.

Item number 5

Question The tree trunk conducts water and dissolves materials

from the roots to the leaves, flowers, and fruits of the plant.

It also supports the branches and the twigs. The leaves,

flowers, and fruits grow along the twigs.

The roots absorb water and minerals from the soil to feed

all parts of the tree. They also anchor a tree in the soil to hold

the tree upright against the force of strong wind.

The roots have…function.

a. two

b. three

c. four

d. five

Result        A*   B    C    D
Upper 27%     16   13   1    2
Middle 46%    16   37   2    1
Lower 27%     15   16   1    0
Total         47   66   4    3

Validity      Valid
P value       0.39
D value       0.03

This item belongs to the reading comprehension part of the test, since there is a passage that the students have to read in order to answer the question. It tests descriptive text, so it can be said to be a valid item. By the discrimination power criteria, it is considered a poor item since its D value is only 0.03. In terms of difficulty level, it is a moderate item since 47 students chose the correct answer. Nevertheless, by Gronlund’s criteria this item can still be used with several revisions.

Item number 6

Question Which one is the correct order of the parts of the tree,

from bottom to the upper parts?

a. trunk - roots – branches – twigs – fruits

b. trunk – twig - roots – branches – leaves

c. roots – trunk – twigs – branches – fruits

d. roots – trunk – branches – twig – leaves

Result        A    B    C    D*
Upper 27%     9    7    2    14
Middle 46%    13   23   1    19
Lower 27%     6    19   1    6
Total         28   49   4    39

Validity      Valid
P value       0.32
D value       0.25

Because the P value is 0.32, this item is considered a moderate item from the point of view of difficulty level. In terms of discrimination power, it is considered a marginal item since the D value is 0.25. By the criterion of validity, this item is classified as valid. Based on these criteria, this item can still be used in the next test, since it has satisfactory discrimination power and a moderate difficulty level.

Item number 7

Question The first paragraph above is told about…

a. the function of the trunk

b. the function of the roots

c. the function of the leaves

d. the function of the twigs

Result        A*   B    C    D
Upper 27%     15   15   2    0
Middle 46%    22   29   4    1
Lower 27%     8    17   1    1
Total         45   61   7    2

Validity      Valid
P value       0.37
D value       0.21

The P value of this item is 0.37, which shows that it is a moderate item, since 45 students chose the correct answer. The D value of 0.21 marks it as a marginal item: 15 students from the upper group chose the correct answer and only 8 students from the lower group did the same. From the point of view of validity, it is a valid item. This item is categorized as a good item, so it can still be used in the next test.

Item number 8

Question Ali : “What is Mr. Bakri’s profession?

Amat: “He is a…

Look at his table!

It is full of tools. There are hammer, an axe, a handsaw,

a pencil, etc.

a. sailor c. carpenter

b. painter d. teacher

Result        A    B    C*   D
Upper 27%     1    6    24   1
Middle 46%    0    12   37   5
Lower 27%     3    9    16   4
Total         4    27   77   10

Validity      Invalid
P value       0.64
D value       0.25

From the table above, we can see that this is a moderate item, since 77 students answered the item correctly. From the point of view of discrimination power, it is a marginal item, since 24 students from the upper group and 16 students from the lower group chose the correct answer. By the validity criteria, however, this item is classified as invalid because it does not meet the criteria of the curriculum. Nevertheless, as a marginal item it can still be used in the next test with several revisions.

Item number 9

Question My father is a farmer he is work at…,especially in the

rainy season, he grows rice.

a. in the farm

b. in the rice field

c. in the garden

d. in the park

Result        A    B*   C    D
Upper 27%     15   15   1    1
Middle 46%    38   20   0    0
Lower 27%     23   10   1    0
Total         76   45   2    1

Validity      Invalid
P value       0.37
D value       0.15

In terms of difficulty level and discrimination power, this item is a moderate and poor item: from the table above we can see that 45 students chose the correct answer and that the difference between the upper group and the lower group is only 5 students. In addition, this item is classified as invalid. In the writer’s opinion, this item should be discarded since it is an invalid item.

Item number 10

Question I still have an assignment to do. I… it after lunch.

a. have been finishing

b. finishing

c. was finishing

d. will finish

Result        A    B    C    D*
Upper 27%     4    16   3    9
Middle 46%    7    31   12   7
Lower 27%     3    16   10   2
Total         14   63   25   18

Validity      Valid
P value       0.15
D value       0.21

The item above is considered a difficult item because only 18 students out of 120 answered the question correctly. In terms of discrimination power, it is considered a marginal item, since 9 students in the upper group chose the correct answer and 2 students from the lower group did the same. From the point of view of validity, this item is valid because it tests the future tense. Based on the criteria above, the item can still be used in the next test with several revisions.

Item number 11

Question Adhi : How many times do you swim a week?

Tio : Actually twice but last week I only…once

because I prepared the exam.

a. swam

b. swim

c. will swim

d. have swim

Result        A*   B    C    D
Upper 27%     18   6    5    3
Middle 46%    40   11   3    2
Lower 27%     18   8    3    1
Total         76   25   11   6

Validity      Valid
P value       0.63
D value       0

From the discrimination power index, we can be sure that this item is poor, because there is no difference between the number of students in the upper group and in the lower group who chose the correct option. From the difficulty level index, this item is classified as moderate, since 76 students out of 120 answered the item correctly. In terms of validity, it can be categorized as a valid item because it tests the past tense. So, this item can still be used in the following test with several revisions.

Item number 12

Question Lia : Can I have some apples?

Dio : …do you want?

Lia : The Australian ones.

a. how many c. which

b. what d. how much

Result        A    B    C*   D
Upper 27%     12   10   8    2
Middle 46%    37   15   4    0
Lower 27%     17   12   2    1
Total         66   37   14   3

Validity      Valid
P value       0.12
D value       0.19

From the number of students who answered the item correctly, it can be seen that this is a difficult item. In terms of discrimination power, this item has a D value of 0.19, so it can be categorized as a poor item. From the point of view of validity, it is a valid item because it tests asking for something. Although it is a valid item, it should be discarded since it is poor in terms of discrimination power and difficult in terms of difficulty level.

Item number 13

Question A clever crow

One day a crow was tired and 13)…He looked

everywhere for some 14)…to drink, but he could not find

any. At last he 15)…an old jar which there was a little water.

The jar was so tall and the water was so low that he could not

reach it with his short bill. He thought for a while, then he

16)…away to pick up some stones. She 17)…the stones into

the jar one after another, and the water came up higher and

higher. At last the crow was able to drink as much as she

liked.

a. hungry

b. insects

c. water

d. leaves

Result        A*   B    C    D
Upper 27%     17   2    10   3
Middle 46%    36   7    12   1
Lower 27%     19   1    11   1
Total         72

Validity      Valid
P value       0.6
D value       -0.06

This item tests vocabulary. There are some clues in the passage, and the students have to guess what the answer is. This item is neither too easy nor too difficult; it has a moderate difficulty level. From the point of view of discrimination power, it is categorized as a poor item: more students in the lower group than in the upper group chose the correct answer, so it has a negative D index. From the point of view of validity, the item tests narrative text, so it meets the criteria of validity. Based on these criteria, this item can still be used with several revisions.

Item number 14

Question a. meat

b. angry

c. amazing

d. thirsty

Result        A    B    C*   D
Upper 27%     5    0    26   0
Middle 46%    13   0    41   0
Lower 27%     6    3    23   0
Total         24   3    90   0

Validity      Valid
P value       0.75
D value       0.09

From the table above, we can see that in terms of difficulty level this is a moderate item, since 90 students answered the item correctly. In terms of discrimination power, it is classified as a poor item, since 26 students in the upper group and 23 students in the lower group chose the correct answer. By the validity criteria, this item is valid. This item can therefore still be used in the following test, but with several revisions.

Item number 15

Question a. find

b. found

c. founded

d. finding

Result        A    B*   C    D
Upper 27%     16   6    3    7
Middle 46%    21   9    1    24
Lower 27%     9    10   3    10
Total         46   25   7    41

Validity      Valid
P value       0.2
D value       -0.125

This item is a poor item in terms of discrimination power since it has a negative value, which means that the students in the lower group did better than the students in the upper group. By the criterion of difficulty level, it is classified as a difficult item, since only 25 students out of 120 answered the item correctly. Although it is categorized as valid, this item should definitely be discarded.

Item number 16

Question a. flew

b. put

c. threw

d. saw

Result        A*   B    C    D
Upper 27%     9    4    6    13
Middle 46%    22   6    9    16
Lower 27%     6    4    4    16
Total         37   14   19   45

Validity      Valid
P value       0.3
D value       0.09

From the number of students who answered the item correctly, it can be seen that this is a moderate item. From the point of view of discrimination power, it is a poor item because 9 students from the upper group and 6 students from the lower group chose the correct answer. In terms of validity, this item is classified as valid. Considering the criteria of validity, discrimination power and difficulty level, this item can still be used in the following test with several revisions.

Item number 17

Question a. dropped

b. found

c. reached

d. sent

Result A* B C D

Upper 27% 9 0 22 1

Middle 46% 9 1 36 7

Lower 27% 6 3 20 3

Total 24 4 78 11

Validity Valid

P value 0.2

D value 0.09

In terms of discrimination power (DP), this item is considered a poor item, since the difference between the numbers of students in the upper group (UG) and the lower group (LG) who chose the correct answer is only 3. In terms of difficulty level, it is a difficult item, since only 24 out of 120 students chose the correct answer. Although the item is categorized as valid, it should not be used in the next test.

Item number 18

Question Bita : Mom I want to make an omelet. What ingredients do

I need to prepare?

Mom: Prepare three eggs, five…of onion, a little salt, and

some vegetable oil.

a. sheets

b. slices

c. packs

d. sacks

Result A B* C D

Upper 27% 0 14 12 6

Middle 46% 2 11 26 15

Lower 27% 1 5 12 14

Total 3 30 50 35

Validity Invalid

P value 0.25

D value 0.28

From the table above we can see that this item is considered a difficult item, since only 30 students chose the correct answer. Based on the discrimination power criterion, it is a marginal item, since its D value is 0.28. In terms of validity, this item is categorized as invalid, but it can still be used in the next test with several revisions.

Item number 19

Question Ani : I want to take my pill….

Lia : Sure. Wait a minute please.

a. Do you want some?

b. Can you get me a glass of water please?

c. Can you take me to the doctor please?

d. Will you buy it for me please?

Result A B* C D

Upper 27% 7 13 11 1

Middle 46% 13 9 31 1

Lower 27% 8 7 16 1

Total 28 29 58 3

Validity Valid

P value 0.24

D value 0.19

This item is considered difficult and poor in terms of difficulty level and discrimination power. Only 29 students answered the item correctly, so its P value is 0.24. Its D value is only 0.19, because 13 students in the upper group and 7 students in the lower group answered the item correctly. The item is considered valid because it meets the criteria of the curriculum: it tests asking for help. Although the item is valid, it should be discarded because of its discrimination power and difficulty level.

Item number 20

Question Reno : I think this shirt needs ironing.

Leo : No, I …it. Touch it. It is still warm.

a. iron c. am ironing

b. will iron d. have ironed

Result A B C D*

Upper 27% 2 9 15 6

Middle 46% 8 5 32 9

Lower 27% 4 1 22 5

Total 14 15 69 20

Validity Valid

P value 0.16

D value 0.19

This item can be classified as a difficult, poor, and valid item. Its difficulty level index is 0.16, since only 20 students answered the item correctly. Its discrimination power index is 0.19, which means that it is a poor item, since the difference between the upper group and the lower group is only 6 students. In terms of validity, the item is classified as valid, because it tests the present perfect tense.

Item number 21

Question Anto and Willy are my cousins. The word cousins

means…

a. aunt’s sisters

b. uncle’s daughters

c. aunt’s brothers

d. uncle’s sisters

Result A* B C D

Upper 27% 7 8 7 10

Middle 46% 10 17 3 26

Lower 27% 7 8 5 12

Total 24 33 15 48

Validity Invalid

P value 0.2

D value 0

This item is almost the same as the previous one, number 20, in that it can be classified as difficult and poor in terms of difficulty level and discrimination power. Its difficulty level index is 0.2, since only 24 students out of 120 answered the item correctly. Its D value is 0, which means that there is no difference between the students in the upper group and those in the lower group. The item is also considered invalid, so it should be discarded.

Item number 22

Question Euro 2000, one of the biggest 22)…on earth beside the World

Cup is about to begin. Football freaks everywhere will be

performed in Europe’s top teams. Who will 23) …in the mini

World Cup?

a. shows c. festivals

b. clubs d. tournament

Result A B C D*

Upper 27% 1 3 0 28

Middle 46% 1 10 5 40

Lower 27% 0 6 3 23

Total 2 19 8 91

Validity Invalid

P value 0.7

D value 0.16

From the number of students who answered the item correctly, it can be seen that this item is classified as an easy item. Its difficulty level index is 0.7, while its discrimination power index is 0.16, which means that the item is poor: 28 students in the upper group and 23 students in the lower group chose the correct answer. Because the item is also invalid, it is better to discard it.

Item number 23

Question a. meet c. fight

b. compete d. play

Result A B* C D

Upper 27% 5 13 0 14

Middle 46% 7 13 2 50

Lower 27% 6 5 0 21

Total 18 31 2 85

Validity Invalid

P value 0.25

D value 0.25

From the table above we can see that the item is considered a moderate item, since 31 students out of 120 answered it correctly. Its discrimination power index is 0.25, which means that the item is considered marginal, because the difference between the upper group and the lower group is 8 students. Based on the validity criterion, this item is classified as invalid. So, considering the difficulty level, the discrimination power, and the validity, the item can be regarded as a good item and can still be used in the following test with several revisions.

Item number 24

Question These animals have become very rare in Indonesia

because people used to hunt them for skin and sport.

They look like large cats and have yellow fur with black

spots and long tails. The animals are…

a. lions c. tigers

b. leopards d. panthers

Result A B C* D

Upper 27% 0 6 26 0

Middle 46% 1 21 33 0

Lower 27% 6 5 22 0

Total 7 42 81 0

Validity Valid

P value 0.67

D value 0.125

Based on the difficulty level index, this item is considered moderate: 81 students answered it correctly. Based on the discrimination power index, this item has a D value of 0.125, which means that it is a poor item, because 26 students in the upper group and 22 students in the lower group answered the item correctly. In terms of validity, this item is considered valid, because it tests descriptive text.

Item number 25

Question One day, there was a fox bragging to a cat. Then the cat said

that he was so smart and knowing a lot of tricks and, also

having a hundreds different ways to escape from enemies.

Then the cat answer that, unfortunately, he only knew one

trick. He then asked the fox to teach them some tricks. The

fox agreed. After that they heard a pack of wild dogs running

toward them. The cat ran up a tree and disappeared. Then he

said that was the only trick he knew. He then asked to the fox

which trick he was going to use. The fox sat there trying to

decide which trick to use. He thought a long time. Then he

decided to run. But it was too late. What happened then? At

then the wild dog got there before he could run away and ate

him up.

The word “trick” in the passage above has the similar

meaning with…

a. ideas

b. ways

c. strategies

d. solutions

Result A B C* D

Upper 27% 7 0 16 9

Middle 46% 13 1 12 26

Lower 27% 9 2 5 16

Total 29 3 33 51

Validity Valid

P value 0.27

D value 0.34

A total of 33 students answered this item correctly, which means that it is considered a moderate item. In terms of discrimination power, it is classified as a good item, since 16 students in the upper group answered the item correctly and only 5 students in the lower group chose the correct answer. In terms of validity, it is categorized as valid, because it tests narrative text, which meets the criteria of the curriculum. So, based on the criteria above, this item can still be used in the following test with several revisions.

Item number 26

Question The underlined word in the text above has the similar

meaning with…

a. gone c. taught

b. moved d. existed

Result A* B C D

Upper 27% 12 8 5 7

Middle 46% 8 14 4 30

Lower 27% 5 15 1 11

Total 25 37 10 48

Validity Valid

P value 0.20

D value 0.21

From the number of students who answered the question correctly, it is clear that this item is a difficult item; its difficulty level index is 0.2. From the discrimination power index, we can conclude that this item is marginal, since its D value is 0.21: 12 students in the upper group and 5 students in the lower group chose the correct answer. In terms of validity, this item is categorized as valid, so based on the criteria above it can still be used in the next test with several revisions.

Item number 27

Question The text above is a kind of…

a. exposition text c. recount text

b. descriptive text d. narrative text

Result A B C D*

Upper 27% 0 19 6 7

Middle 46% 2 21 8 22

Lower 27% 1 8 17 6

Total 3 48 31 35

Validity Valid

P value 0.29

D value 0.03

This item is considered a moderate item based on its difficulty level index of 0.29, since 35 students answered the item correctly. Its discrimination power index is only 0.03, so the item is poor: only 7 students in the upper group answered the item correctly, and 6 students in the lower group did the same. This item can still be used in the following test with several revisions.

Item number 28

Question Ferry : How much is the bus…?

Dio : It is Rp. 5000.

a. cost c. price

b. fare d. fee

Result A B* C D

Upper 27% 10 6 12 4

Middle 46% 30 10 11 3

Lower 27% 17 1 13 1

Total 57 17 36 8

Validity Invalid

P value 0.14

D value 0.15

From the table above we can see that this item is categorized as difficult in terms of difficulty level, because only 17 students chose the correct answer. In terms of discrimination power, it is classified as a poor item, since its index is only 0.15: only 6 students in the upper group chose the correct answer, and 1 student in the lower group did the same. This item should be rejected, since it is too difficult, poor, and invalid.

Item number 29

Question Hendy got…when he went to Medan by ship.

a. sea-sick

b. dizzy

c. headache

d. cold

Result A* B C D

Upper 27% 16 8 0 8

Middle 46% 26 17 0 10

Lower 27% 9 8 5 10

Total 51 33 5 28

Validity Invalid

P value 0.42

D value 0.21

A total of 51 students answered this item correctly, so it is categorized as a moderate item, with a difficulty level index of 0.42. Its discrimination power index is 0.21, which means that it is a marginal item. In terms of validity, it is classified as invalid. Considering the criteria above, this item can still be used in the next test with several revisions.

Item number 30

Question Dear Ayodya,

Come to my house tonight at 7 pm and do not forget to bring

your math book.

Thank. Dina

The text above is a…

a. recommendation c. announcement

b. invitation d. message

Result A B C D*

Upper 27% 0 3 0 29

Middle 46% 0 6 0 51

Lower 27% 0 0 1 30

Total 0 9 1 110

Validity Valid

P value 0.91

D value -0.03

From the number of students who answered the item correctly, it is clear that this is an easy item, since only 10 students answered it incorrectly. In terms of discrimination power, it is considered a poor item, since its D value is negative, which means that the students in the lower group performed better on the item than the students in the upper group. In terms of validity, the item is considered valid, because it tests invitation. So, based on the criteria above, this item should be discarded.

Item number 31

Question Keep out off the grass!

We usually find the sign above in the…

a. park c. library

b. hospital d. zoo

Result A* B C D

Upper 27% 28 0 0 3

Middle 46% 50 1 0 4

Lower 27% 30 2 0 0

Total 108 3 0 7

Validity Invalid

P value 0.9

D value -0.06

This item is almost the same as the previous one, number 30. It is considered an easy, poor, and invalid item. Its difficulty index is 0.9, since 108 students chose the correct answer. Its discrimination power index is negative, -0.06, which means that the students in the lower group did better than the students in the upper group: only 28 students in the upper group chose the correct answer, while 30 students in the lower group did the same. This item should be rejected.

Item number 32

Question The academy award is held every year. The underlined

word has the same meaning with…

a. annually c. daily

b. yearly d. weekly

Result A* B C D

Upper 27% 16 7 4 5

Middle 46% 22 7 40 2

Lower 27% 7 12 7 6

Total 45 26 51 13

Validity Invalid

P value 0.37

D value 0.28

This item is considered moderate in terms of difficulty level, since 45 students chose the correct answer. Its discrimination power index is 0.28, so it is a marginal item: 16 students in the upper group and 7 students in the lower group chose the correct answer. In terms of validity, this item is considered invalid. So, based on the criteria above, this item can still be used in the following test with several revisions.

Item number 33

Question After examine the patient the doctor writes a…

a. recite

b. prescription

c. recipe

d. check

Result A B* C D

Upper 27% 0 10 19 3

Middle 46% 4 7 40 2

Lower 27% 19 4 7 2

Total 23 21 66 7

Validity Invalid

P value 0.17

D value 0.19

From the table above we can see that this item is considered a difficult item. Only 21 students chose the correct answer, so its difficulty level index is 0.17. In terms of discrimination power, it is considered a poor item, since only 10 students in the upper group chose the correct answer, while 4 students in the lower group did the same. Because the item is also considered invalid, it is better to reject it.

Item number 34

Question Harry Potter has no parents. He is …

a. a vendor

b. an orphan

c. a witch

d. a tailor

Result A B* C D

Upper 27% 7 15 9 1

Middle 46% 19 13 18 3

Lower 27% 19 4 7 2

Total 45 32 34 6

Validity Invalid

P value 0.26

D value 0.34

A total of 32 students out of 120 chose the correct answer. This is a moderate item in terms of difficulty level, with a difficulty level index of 0.26. In terms of discrimination power, it is considered a good item, since its discrimination power index is 0.34: 15 students in the upper group answered the item correctly, while only 4 students in the lower group did the same. In terms of validity, this item is considered invalid. Based on the criteria above, this item can still be used in the following test with several revisions.

Item number 35

Question Cricket, dragonfly, and grasshopper are the example of…

a. poultry c. mammals

b. reptiles d. insects

Result A B C D*

Upper 27% 0 11 1 20

Middle 46% 2 16 7 7

Lower 27% 15 6 13 17

Total 17 33 21 44

Validity Valid

P value 0.36

D value 0.40

This item is almost the same as the previous one, number 34. It is considered moderate and good in terms of difficulty level and discrimination power, with a difficulty level index of 0.36 and a discrimination power index of 0.4. There are 20 students in the upper group and 7 students in the lower group who answered the item correctly. Based on the criteria of validity, discrimination power, and difficulty level, this item can still be used in the following test.

Item number 36

Question Yesterday evening, Jeremy went to the sport hall to…36)

badminton with his friend. They played until 10 at night. Then,

with his friend, he went to a café until 1 a.m. in the morning.

Then, they went home. But when he got home nobody opened

the…37) for him. Everybody was in deep sleep. Well, we

had…38). He climbed through the window. But one of his

neighbors saw him. She thought he was….39). She phoned the

police. Then, the policeman….40) him.

a. join

b. play

c. walk

d. climb

Result A B* C D

Upper 27% 1 30 0 0

Middle 46% 6 51 0 0

Lower 27% 2 26 0 3

Total 9 107 0 3

Validity Valid

P value 0.89

D value 0.125

This item has a difficulty level index of 0.89, so it can be called an easy item: 107 out of 120 students chose the correct answer. Its discrimination power index is 0.125, because 30 students from the upper group and 26 students from the lower group chose the correct answer, so the item can be classified as poor. In terms of validity, the item is categorized as valid, because it tests recount text. According to Gronlund's criterion, this item should be discarded.

Item number 37

Question a. door c. floor

b. roof d. wall

Result A* B C D

Upper 27% 22 4 2 3

Middle 46% 42 5 0 9

Lower 27% 19 9 1 3

Total 83 18 3 15

Validity Valid

P value 0.69

D value 0.09

This item is categorized as moderate and poor in terms of level of difficulty and discrimination power. Its level of difficulty index is 0.69 and its discrimination power index is 0.09, because 22 students in the upper group and 19 students in the lower group chose the correct answer. In terms of validity, this item is categorized as valid. So, although the item is moderate in difficulty, it should be discarded.

Item number 38

Question a. a way c. an idea

b. a trick d. a thinking

Result A B C* D

Upper 27% 5 10 13 4

Middle 46% 7 23 17 9

Lower 27% 12 10 5 5

Total 24 43 35 18

Validity Valid

P value 0.29

D value 0.25

Although the discrimination power of this item is marginal, it can be categorized as a good item, since it is moderate in terms of difficulty level: 35 out of 120 students chose the correct answer, including 13 students in the upper group and only 5 students in the lower group. Based on the validity criterion, this item is categorized as valid. So, according to the criteria above, this item can still be used in the next test.

Item number 39

Question a. a burglar c. a carpenter

b. an undertaker d. a surgeon

Result A* B C D

Upper 27% 13 6 8 5

Middle 46% 16 21 8 6

Lower 27% 3 14 13 1

Total 32 41 29 12

Validity Valid

P value 0.26

D value 0.31

This item is considered a good test item, since its discrimination power value of 0.31 means that it discriminates well between the upper group and the lower group students. In terms of difficulty level, it is classified as a moderate item, because 32 out of 120 students answered it correctly. In terms of validity, it is classified as valid. In addition, the item has good distractors, so according to the criteria above, it can still be used in the following test.
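The remark about good distractors can be checked against the option totals in the table. The sketch below assumes a common rule of thumb, namely that a distractor is functioning when at least 5% of the responding students choose it. This cutoff and the helper name are assumptions for illustration and are not taken from the thesis.

```python
def distractor_shares(option_totals, correct_option):
    """Return each distractor's share of all responses (the key is excluded)."""
    n_responses = sum(option_totals.values())
    return {
        option: count / n_responses
        for option, count in option_totals.items()
        if option != correct_option
    }


MIN_SHARE = 0.05  # assumption: rule-of-thumb cutoff, not taken from the thesis

# Item 39 totals from the table above; option A is the key.
item39_totals = {"A": 32, "B": 41, "C": 29, "D": 12}
shares = distractor_shares(item39_totals, correct_option="A")
functioning = {option: share >= MIN_SHARE for option, share in shares.items()}
print(functioning)
```

Under this assumed cutoff, all three distractors of item 39 attract enough students. An option that no student chooses would be flagged as non-functioning.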

Item number 40

Question a. watched c. laughed

b. caught d. arrested

Result A B* C D

Upper 27% 9 11 10 2

Middle 46% 27 19 2 5

Lower 27% 12 3 11 6

Total 48 33 23 13

Validity Valid

P value 0.27

D value 0.25

With a P value of 0.27, this item is considered a moderate test item, because 33 out of 120 students chose the correct answer. In terms of discrimination power, the item can be classified as marginal; from the table above we can see that the difference between the upper group and the lower group students who answered it correctly is 8 students, which means that the item does not discriminate between them very well. In terms of validity, it is a valid item. According to Gronlund's criterion, this item can still be used in the following test.

Item number 41

Question You can buy medicine in…

a. hospital c. drugstore

b. butcher d. market

Result A B C* D

Upper 27% 13 0 19 0

Middle 46% 30 0 23 0

Lower 27% 13 0 19 0

Total 56 0 61 0

Validity Invalid

P value 0.50

D value 0

From the table above we can see that the discrimination power of this item is poor, since there is no difference between the upper group and the lower group students. Based on the difficulty level criterion, this item can be categorized as moderate, since 61 out of 120 students chose the correct answer. From the point of view of validity, this item is categorized as invalid. So, according to the criteria above, this item should be discarded.

Item number 42

Question Yuli : How could an Indonesian sprinter, Kalimanto,

defeat a Malaysian sprinter, Rusman Alwi in

the last Sea Games?

Tuti : Everyone knows that Kalimanto ran …Rusman

Alwi.

a. fast c. faster than

b. as fast as d. the fastest

Result A B C* D

Upper 27% 11 4 15 2

Middle 46% 28 4 19 3

Lower 27% 12 3 16 1

Total 41 11 50 6

Validity Valid

P value 0.41

D value -0.03

This item is similar to the previous item, since it is categorized as moderate and poor. Its discrimination power index is negative, since the lower group students performed better on the item than the upper group students. Its difficulty level index is 0.41. In terms of validity, the item is classified as valid. According to the criteria above, the writer concluded that this item should be discarded.

Item number 43

Question LOLI

This is my lovely pet. I call her Loli. Loli loves eat

vegetables. She liked cabbages and cucumber, but she loves

carrot most of all. I put Loli on my backyard. She likes to dig

the soil.

The text above is a kind of…

a. narrative text c. procedure text

b. descriptive text d. recount text

Result A B* C D

Upper 27% 12 17 2 1

Middle 46% 27 18 0 9

Lower 27% 15 11 0 6

Total 54 46 2 16

Validity Valid

P value 0.38

D value 0.19

From the table above, we can see that this item is classified as a moderate, poor, and valid item. In terms of difficulty level, it is classified as moderate, since its difficulty level index is 0.38. In terms of discrimination power, it is classified as poor, since only 17 students in the upper group answered the item correctly, and 11 students in the lower group did the same. It is a valid item because it tests descriptive text. So, as with the previous item, this item should be discarded, or can still be used with several revisions.

Item number 44

Question Loli is a…

a. tortoise c. giraffe

b. walrus d. rabbit

Result A B C D*

Upper 27% 0 4 3 25

Middle 46% 3 3 3 45

Lower 27% 6 5 4 17

Total 9 12 10 87

Validity Valid

P value 0.72

D value 0.25

This item can be categorized as a good test item, since it is classified as moderate, marginal, and valid. A total of 87 out of 120 students answered the item correctly, so it is classified as a moderate item. In terms of discrimination power, it is classified as marginal, since 25 students in the upper group chose the correct answer and only 17 students in the lower group did the same. Based on the validity criterion, this item can be classified as valid. So, according to the criteria above, this item can still be used in the following test.

Item number 45

Question If you got a …, you will go to the dentist.

a. stomachache

b. toothache

c. headache

d. sore throat

Result A B* C D

Upper 27% 2 27 2 1

Middle 46% 21 24 3 4

Lower 27% 9 17 5 1

Total 32 68 10 6

Validity Invalid

P value 0.56

D value 0.31

From the number of students who answered the item correctly, we can see that this is a moderate item, since 68 students chose the correct answer. Its discrimination power index is 0.31, which means that it is classified as a good item: 27 students in the upper group chose the correct answer and only 17 students in the lower group did the same. In terms of validity, this item is classified as invalid, so it can still be used in the following test, but only with several revisions.

Item number 46

Question Look at the jumbled words, arrange them into good order!

The-beautifully-sings-girl-songs-the

1 2 3 4 5 6

a. 3-6-4-1-5-2 c. 3-5-6-1-4-2

b. 3-6-5-1-4-2 d. 3-5-4-1-6-2

Result A* B C D

Upper 27% 13 3 10 6

Middle 46% 11 2 29 10

Lower 27% 9 0 21 2

Total 33 5 60 18

Validity Invalid

P value 0.27

D value 0.125

From the table above we can see that this item is categorized as a poor item, since only 13 students in the upper group chose the correct answer and 9 students in the lower group did the same, so the discrimination power index is 0.125. In terms of difficulty level, the item is classified as moderate, since 33 students chose the correct answer. Based on the validity criterion, the item is classified as invalid, so it should be discarded.

Item number 47

Question Anza : Your father got angry last night,….?

Aris : Yes. He got mad because we went home late.

a. did he c. didn’t he

b. does he d. doesn’t he

Result A B C* D

Upper 27% 12 6 14 0

Middle 46% 22 17 14 0

Lower 27% 7 14 8 3

Total 41 37 36 3

Validity Valid

P value 0.3

D value 0.19

From the table above we can see that this item is similar to the previous item. It is classified as a moderate, poor, and valid item. In terms of difficulty level, it is categorized as moderate, since 36 students chose the correct answer. In terms of discrimination power, it is classified as poor, since only 14 students in the upper group chose the correct answer and 8 students in the lower group did the same. So, like the previous item, this item can still be used in the following test with several revisions.

Item number 48

Question Look at the jumbled sentences below. Arrange into good order

of procedure text.

1. First, weigh the exact amount of rice that will pour.

It should not be more than 4 measure cup.

2. Then, after cooking, open the cover and mingle the

cooked rice for a while.

3. Second, wash the rice. Put in the inner pot and adjust

the quantity of water.

4. How to cook rice using the magic com.

5. After that, plug in the cable into the electricity socket

and push the cooking button. A light red will turn on.

6. At the same time, you can cook another meal like

vegetables. Put them in the steam pot.

The good order of the sentences above is…

a. 4-3-5-2-6-1

b. 4-2-3-5-6-1

c. 4-1-3-6-5-2

d. 4-5-3-6-1-2

Result A B C* D

Upper 27% 4 7 18 2

Middle 46% 25 9 18 2

Lower 27% 20 2 8 1

Total 49 18 44 5

Validity Invalid

P value 0.36

D value 0.31

From the number of students who answered the item correctly, we can see that this item is classified as moderate, since 44 students answered it correctly. From the table above, it can be seen that this is a good item in terms of discrimination power, since 18 students in the upper group chose the correct answer and only 8 students in the lower group did the same. In terms of validity, this item does not meet the criteria of the curriculum, so it is classified as invalid. So, based on the criteria above, this item can still be used in the following test with several revisions.

Item number 49

Question All students of class 8 are supposed to come to school on

Monday, 20th October 2008 at 8.00 a.m.

Our football team will play against SMP Bina Bangsa in the

final match of Student Football Cup.

Let’s support our team!!!

The text above is a kind of…

a. invitation c. news item

b. announcement d. recommendation

Result A B* C D

Upper 27% 3 18 9 2

Middle 46% 19 25 5 0

Lower 27% 15 7 7 3

Total 37 50 21 5

Validity Valid

P value 0.41

D value 0.34

This item can be classified as a good and moderate item based on its discrimination power and level of difficulty, similar to the previous item. Its difficulty level index is 0.41, because 50 students answered the item correctly. Its discrimination power index is 0.34, which means that the item discriminates well between the upper group and the lower group students. Based on the validity criterion, this item can be classified as valid, because it tests announcement text, which meets the criteria of the curriculum. So, according to the criteria of difficulty level, discrimination power, and validity, this item is classified as a good item and can still be used in the following test.

Item number 50

Question Look at the chart below!

[Chart: China, India, and Indonesia in 1980 and 2000; axis scale from 0 to 1200]

According to the chart above, the Asia’s population …from 1980

to 2000 rapidly.

a. improved

b. increased

c. developed

d. decreased

Result A B* C D

Upper 27% 5 17 1 9

Middle 46% 16 25 0 12

Lower 27% 11 8 3 10

Total 32 50 4 31

Validity Valid

P value 0.4

D value 0.28

This item can also be classified as a good item, since it is moderate, marginal, and valid. With a P value of 0.4, it is a moderate item, since 50 students answered it correctly. Its discrimination power index is 0.28, because 17 students in the upper group chose the correct answer and only 8 students in the lower group did the same. Based on the validity criterion, this item is classified as valid. So, based on the criteria above, the writer concluded that this item can still be used in the following test.

For clearer information about the results of the analysis, see the following table.

Item number   Validity   Reliability   Discrimination power   Level of difficulty
1             valid      reliable      poor                   moderate
2             valid      reliable      poor                   easy
3             valid      reliable      very good              moderate
4             invalid    reliable      poor                   moderate
5             valid      reliable      poor                   moderate
6             valid      reliable      marginal               moderate
7             valid      reliable      marginal               moderate
8             invalid    reliable      marginal               moderate
9             invalid    reliable      poor                   moderate
10            valid      reliable      marginal               difficult
11            valid      reliable      poor                   moderate
12            valid      reliable      poor                   difficult
13            invalid    reliable      poor                   moderate
14            valid      reliable      poor                   moderate
15            valid      reliable      poor                   difficult
16            valid      reliable      poor                   moderate
17            valid      reliable      poor                   difficult
18            valid      reliable      marginal               difficult
19            valid      reliable      poor                   difficult
20            valid      reliable      poor                   difficult
21            invalid    reliable      poor                   difficult
22            invalid    reliable      poor                   easy
23            invalid    reliable      marginal               moderate
24            valid      reliable      poor                   moderate
25            valid      reliable      poor                   moderate
26            valid      reliable      marginal               difficult
27            valid      reliable      poor                   moderate
28            invalid    reliable      poor                   difficult
29            invalid    reliable      marginal               moderate
30            valid      reliable      poor                   easy
31            invalid    reliable      poor                   easy
32            invalid    reliable      marginal               moderate
33            invalid    reliable      poor                   difficult
34            invalid    reliable      poor                   moderate
35            valid      reliable      poor                   moderate
36            valid      reliable      poor                   easy
37            valid      reliable      poor                   moderate
38            valid      reliable      marginal               moderate
39            valid      reliable      poor                   moderate
40            valid      reliable      marginal               moderate
41            invalid    reliable      poor                   moderate
42            valid      reliable      poor                   moderate
43            valid      reliable      poor                   moderate
44            valid      reliable      marginal               moderate
45            invalid    reliable      good                   moderate
46            invalid    reliable      poor                   moderate
47            valid      reliable      poor                   moderate
48            invalid    reliable      good                   moderate
49            valid      reliable      good                   moderate
50            valid      reliable      marginal               moderate

The items vary in both difficulty level and discrimination power. Eleven items can be
classified as difficult; most of them test grammar. Five items are classified as easy,
and most of them test reading comprehension. The reading comprehension items are easy
because the students simply match the material in the question with material that is
clearly stated in the passage. The easy items can still be used in the following test
to encourage the weaker students.

From the discrimination power point of view, there are 5 items with a negative value,
which means that these items cannot discriminate well between the students in the lower
group and the students in the upper group. This may be because the students' learning
was incomplete or incorrect, or because the students chose the answer blindly.

In terms of reliability, the test as a whole is reliable because the calculated r is
greater than the r table value. From the validity point of view, there are 17 invalid
items. Several factors can influence validity. The first is an inappropriate level of
difficulty, which makes the discrimination unreliable. The second is ambiguous
statements, which cause confusion and misinterpretation among the students. The third
is poor construction of the test.
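The counts quoted in this discussion can be re-derived from the summary table. The sketch below is only an illustration: the item numbers are transcribed by hand from the table in this section, and the set names and the `tally` function are hypothetical helpers, not part of the original analysis.

```python
# Item numbers per category, transcribed from the summary table in this section.
INVALID = {4, 8, 9, 13, 21, 22, 23, 28, 29, 31, 32, 33, 34, 41, 45, 46, 48}
DIFFICULT = {10, 12, 15, 17, 18, 19, 20, 21, 26, 28, 33}
EASY = {2, 22, 30, 31, 36}
ALL_ITEMS = set(range(1, 51))  # the test has 50 items


def tally():
    """Return the number of items in each validity and difficulty category."""
    moderate = ALL_ITEMS - DIFFICULT - EASY
    valid = ALL_ITEMS - INVALID
    return {
        "difficult": len(DIFFICULT),
        "easy": len(EASY),
        "moderate": len(moderate),
        "invalid": len(INVALID),
        "valid": len(valid),
    }


counts = tally()
print(counts)
```

Under this transcription the tally agrees with the figures in the text: 11 difficult items, 5 easy items and 17 invalid items.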

So, according to the analysis, there are 11 items that can still be used in the
following test: item numbers 3, 6, 7, 25, 35, 38, 39, 40, 44, 49, 50. There are 23
items that can still be used in the following test but need several revisions: item
numbers 1, 5, 8, 10, 11, 14, 16, 19, 18, 47, 23, 24, 26, 27, 29, 32, 34, 37, 42, 43,
45, 47, 48. And there are 16 items that should be discarded: item numbers 2, 4, 9, 12,
15, 17, 20, 21, 28, 30, 31, 33, 36.


CHAPTER V

CONCLUSION AND SUGGESTIONS

5.1 Conclusion

From the results of the analysis of the English mid-term test for the eighth grade
students of SMP 33 Semarang, the writer draws the following conclusions.

(1) The mean of the difficulty index is 0.41, which means that the items are classified
as moderate. On the whole, the English mid-term test meets the requirements of a
good test in terms of difficulty level.

(2) In the analysis of discrimination power, the mean of the D value is 0.17, which
means that the items cannot discriminate well between the lower group and the upper
group students.

(3) In terms of reliability, the calculation using the Kuder-Richardson 20 formula
shows that the reliability coefficient is 0.39, which means that the test as a whole
has moderate reliability.

(4) In terms of validity, there are 33 items that meet the criteria of the KTSP
curriculum and 17 items that do not; the latter are considered invalid items.
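As a rough check on conclusions (1) and (2), the means can be recomputed from the R, U and L rows of Appendix 1. This is a sketch under stated assumptions: the three lists are hand-transcribed from the appendix, the group size of 32 is taken from its "1/2 T" row, and the variable names are illustrative only.

```python
# Rows transcribed from Appendix 1 (one value per item, 50 items).
R_CORRECT = [55, 108, 45, 48, 47, 39, 45, 77, 45, 18,
             76, 14, 72, 90, 25, 37, 24, 30, 29, 20,
             24, 91, 31, 81, 33, 25, 35, 17, 51, 110,
             108, 45, 21, 32, 44, 107, 83, 35, 32, 33,
             61, 50, 46, 87, 68, 33, 36, 44, 50, 50]
UPPER = [18, 30, 28, 15, 16, 14, 15, 24, 15, 9,
         18, 8, 17, 26, 6, 9, 9, 14, 13, 6,
         7, 28, 13, 26, 16, 12, 7, 6, 16, 29,
         28, 16, 10, 15, 20, 30, 22, 13, 13, 11,
         19, 15, 17, 25, 27, 13, 14, 18, 18, 17]
LOWER = [12, 27, 11, 11, 15, 6, 8, 16, 10, 2,
         18, 2, 19, 23, 10, 6, 6, 5, 7, 5,
         7, 23, 5, 22, 5, 5, 6, 1, 9, 30,
         30, 7, 4, 4, 7, 26, 19, 5, 3, 3,
         19, 16, 11, 17, 17, 9, 8, 8, 7, 8]

N_STUDENTS = 120  # T in the difficulty formula
GROUP_SIZE = 32   # 1/2 T in the discrimination formula (assumed from Appendix 1)

# Mean difficulty index P = R / T, averaged over all items.
mean_p = sum(r / N_STUDENTS for r in R_CORRECT) / len(R_CORRECT)

# Mean discrimination power D = (RU - RL) / (1/2 T), averaged over all items.
mean_d = sum((u - l) / GROUP_SIZE for u, l in zip(UPPER, LOWER)) / len(UPPER)

# Items whose lower group outperformed the upper group (negative D).
negative_items = [i + 1 for i, (u, l) in enumerate(zip(UPPER, LOWER)) if u < l]

print(round(mean_p, 3), round(mean_d, 3), negative_items)
```

With these inputs the mean D comes out at about 0.17, as reported, and five items have a negative D. The mean P comes out near 0.42, slightly above the 0.41 quoted in conclusion (1); the difference may come from truncation rather than rounding.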

So, judged by the criteria of a good test, there are 11 items that can be used in the
following test and 23 items that can also be used in the next test but need some
revision. There are 16 items that do not meet the criteria of a good test and should
be rejected or discarded.

Finally, the writer concludes that the items in the first-term mid-term test for the
eighth grade of SMP 33 Semarang can be placed at a marginal level. In addition, this
English test lacks face validity and discrimination power, since many of the items
have low validity and poor discrimination power. If the teacher wants to use it as an
instrument of evaluation, it needs some improvement or revision.

5.2 Suggestions

Constructing a good language test is not an easy task. Based on the conclusions above,
the writer would like to offer the following suggestions.

First, test constructors have to know the characteristics of a good language test,
especially how to build good discrimination power and an appropriate difficulty level.
Second, the items that can still be used should be revised and saved, while items with
a negative discrimination value should be discarded, because a negative value means
that the lower group students performed better than the students in the upper group.

There are some points to consider in constructing test items.

(1) Prepare the test items well before they are given to the students. This helps the
constructor to develop good test items. Check them several times to find mistakes
that might have been missed.

(2) Make sure that the test is related to the intended learning outcomes to be
measured.

(3) The difficulty level of each item should match the students' ability, and the test
should also be able to discriminate between the students in the upper group and
those in the lower group.

Finally, it is hoped that the results of this analysis can be used as a reference in
the preparation of the next test.

BIBLIOGRAPHY

Arikunto, S. 1995. Dasar-Dasar Evaluasi Pendidikan. Jakarta: Bumi Aksara.

Brink, T.D.T. 1974. Evaluation: A Practical Guide for Teachers. University of Missouri: McGraw-Hill.

Brown, J.D. 2005. Testing in Language Programs: A Comprehensive Guide to English Language Assessment. International Edition 2005. New York: McGraw-Hill.

Brown, J.D. 1988. Understanding Research in Second Language Learning. Cambridge: CUP.

Ebel, Robert L. 1979. Essentials of Educational Measurement. Third edition. Englewood Cliffs, New Jersey: Prentice Hall, Inc.

Gronlund, Norman E. 1981. Measurement and Evaluation in Teaching. Fourth edition. New York: Macmillan.

Gronlund, N.E. 1982. Constructing Achievement Tests. Third edition. USA: Prentice Hall, Inc.

Harris, David P. 1969. Testing English as a Second Language. New York: McGraw-Hill.

Madsen, Harold S. 1983. Techniques in Testing. Hong Kong: Oxford University Press.

McNamara, T. 2000. Language Testing. Oxford: Oxford University Press.

Nitko, A.J. 1983. Educational Tests and Measurement: An Introduction. London: Harcourt Brace Jovanovich, Inc.

Popham, W. J. 1981. Modern Educational Measurement. London: Prentice Hall, Inc. Englewood Cliffs.

Valette, Rebecca M. 1977. Modern Language Testing. Second edition. New York: Harcourt Brace Jovanovich, Inc.

http://www.uab.edu/uasomume/cdm/test.htm. Downloaded on January 3rd 2009.

http://www.socialresearchmethods.net/kb/measval.php. Downloaded on June 3rd 2009.

APPENDICES

Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Y Y²E_1 1 1 0 0 0 0 1 1 0 0 1 0 1 1 1 1 0 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 1 1 0 1 0 23 529E_2 1 1 0 0 0 0 1 1 0 0 1 1 1 1 0 1 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 1 20 400E_3 1 1 0 1 0 0 1 1 1 0 1 1 1 1 0 1 0 0 0 0 0 1 0 1 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 0 0 22 484E_4 1 1 0 0 0 1 0 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 1 1 1 0 1 0 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 0 0 0 1 1 24 576E_5 1 1 0 0 0 0 0 1 0 1 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 1 1 0 0 1 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0 22 484E_6 1 1 0 1 0 1 1 1 0 1 1 0 1 1 1 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 23 529E_7 1 1 1 0 0 0 0 1 0 0 1 0 1 1 1 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 18 324E_8 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 1 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 1 0 1 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 22 484E_9 1 1 1 0 0 0 0 1 0 0 1 0 1 1 0 1 1 0 0 0 0 1 0 0 1 0 1 0 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 20 400E_10 1 1 0 1 0 0 0 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 0 1 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 20 400E_11 1 1 0 0 0 0 0 1 1 1 1 1 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 0 0 0 0 1 0 21 441E_12 0 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 1 0 0 0 0 0 0 0 1 1 1 22 484E_13 1 1 1 0 0 1 1 1 0 0 1 0 1 1 1 1 0 0 0 0 0 1 1 0 0 1 1 0 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 24 576E_14 1 1 0 0 0 0 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 20 400E_15 1 1 0 1 0 0 1 1 1 0 1 0 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 22 484E_16 1 1 1 0 0 0 0 1 0 0 1 0 1 1 0 1 1 0 0 0 0 1 0 0 0 0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 1 20 400E_17 0 1 1 1 1 0 0 1 0 0 1 0 1 1 0 1 1 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 
0 1 0 0 0 1 1 0 0 0 1 0 22 484E_18 1 1 0 1 0 1 0 1 0 1 1 1 1 1 0 1 0 0 0 0 0 1 0 0 1 0 1 0 1 1 1 1 0 1 0 1 1 1 1 0 0 0 0 1 1 0 1 0 1 0 27 729E_19 1 1 0 0 0 0 1 1 0 0 1 0 1 1 1 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 1 1 0 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0 1 1 24 576E_20 1 1 1 1 0 0 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 0 0 1 0 0 0 1 0 1 0 1 1 1 23 529E_21 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 18 324E_22 1 1 0 0 0 0 1 1 1 0 1 0 1 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 1 1 1 23 529E_23 1 1 1 1 0 0 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 19 361E_24 1 1 0 0 0 1 0 1 0 0 1 0 1 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 1 1 1 0 1 1 0 1 1 1 1 0 0 0 0 1 1 0 0 0 1 0 25 625E_25 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 1 1 1 0 0 0 1 0 0 0 1 1 1 23 529E_26 1 1 0 0 0 0 0 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 1 17 289E_27 1 1 1 1 0 0 1 1 1 0 1 0 1 1 0 1 1 0 0 0 0 1 1 0 0 1 0 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 23 529E_28 1 1 0 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 1 1 1 1 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 19 361E_29 1 1 0 0 0 1 1 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0 0 0 1 0 19 361E_30 1 1 0 0 0 0 1 1 1 0 1 0 1 1 1 1 1 0 0 0 0 1 0 1 0 0 1 0 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 1 0 0 1 0 0 0 23 529E_31 0 1 1 1 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 1 0 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 0 18 324E_32 1 1 0 0 0 0 1 1 0 0 1 0 1 1 1 1 1 0 0 1 0 1 0 0 1 0 1 0 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 23 529E_33 1 1 0 1 0 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 0 0 1 1 0 0 1 1 0 27 729E_34 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 0 0 0 1 1 0 1 0 0 0 0 1 0 0 0 0 1 1 21 441E_35 1 1 1 0 0 0 0 1 1 0 1 0 1 1 0 1 0 0 0 0 0 1 1 0 0 1 1 0 1 1 1 0 0 1 1 1 1 1 0 
0 0 0 0 0 0 1 1 0 0 1 24 576E_36 0 1 0 1 1 0 0 1 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 1 0 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 18 324E_37 1 1 1 0 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 18 324E_38 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 17 289E_39 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 0 1 17 289E_40 1 1 0 0 0 1 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 19 361

Appendix 1 ITEM ANALYSIS OF THE ENGLISH MID-TERM TEST OF SMPN 33 SEMARANG IN THE ACADEMIC YEAR 2008/2009

E_40 1 1 0 0 0 1 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 19 361A_1 0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 1 1 1 1 1 0 1 0 0 1 21 441A_2 0 1 1 0 1 0 0 0 1 0 0 0 1 1 1 0 0 0 0 1 1 1 0 1 0 1 0 0 0 1 1 1 0 0 0 1 0 0 1 0 1 1 0 0 1 0 0 0 0 0 20 400A_3 1 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 1 1 0 0 0 1 1 0 1 1 1 1 0 0 0 18 324A_4 0 1 0 0 0 1 1 1 1 1 0 0 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 1 1 1 1 0 0 0 1 1 1 1 1 0 0 1 0 1 24 576A_5 0 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 1 0 0 0 1 1 0 1 0 1 1 1 0 1 1 0 1 0 0 0 19 361A_6 0 1 1 0 1 0 0 0 1 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 1 0 0 0 17 289A_7 0 1 1 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 1 1 0 0 0 1 1 0 1 0 0 1 1 0 1 1 0 1 0 0 0 19 361A_8 0 1 0 1 1 1 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 0 0 0 1 1 0 1 0 1 1 0 1 1 1 0 0 0 1 22 484A_9 0 1 0 1 1 1 0 0 0 0 1 0 1 1 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 1 1 1 1 0 0 1 1 0 0 0 1 1 0 1 1 0 0 0 0 1 21 441A_10 1 1 1 1 1 0 0 1 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1 1 0 1 1 0 1 0 0 0 25 625A_11 0 1 0 1 0 0 1 1 1 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 0 1 0 0 1 1 1 1 1 0 0 1 0 0 21 441A_12 1 1 0 0 0 1 1 0 1 1 0 0 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 1 0 1 1 1 1 0 0 0 1 0 1 1 1 0 0 1 0 1 24 576A_13 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 0 0 0 18 324A_14 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 1 1 1 0 0 1 0 1 23 529A_15 0 1 0 1 0 0 0 1 1 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 1 0 0 0 1 19 361A_16 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 0 18 324A_17 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 1 0 0 0 0 1 0 1 0 0 1 1 1 1 1 0 0 1 0 1 23 529A_18 0 1 0 
1 1 1 1 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 1 20 400A_19 0 1 0 0 0 0 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 1 1 0 1 1 0 1 0 0 0 15 225A_20 1 1 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 1 1 0 1 0 0 0 0 0 1 1 0 0 1 1 1 0 1 0 0 0 1 1 1 1 1 0 1 1 0 24 576A_21 0 1 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 1 0 1 1 0 0 1 0 0 0 0 1 17 289A_22 0 1 0 1 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 1 1 0 1 1 0 1 0 0 1 0 0 0 19 361A_23 1 1 1 0 1 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 0 1 0 0 1 1 1 0 0 0 0 1 0 0 0 0 1 1 1 0 0 0 0 0 0 0 18 324A_24 0 1 1 1 1 1 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 1 1 0 1 1 1 1 0 0 0 1 1 0 1 1 1 1 0 0 1 25 625A_25 0 1 1 1 0 1 0 0 0 0 1 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 1 1 0 0 0 1 1 1 1 0 1 1 1 0 0 1 0 1 1 0 0 22 484A_26 0 1 0 1 1 1 1 1 1 1 1 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 0 1 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 1 25 625A_27 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 1 1 1 0 0 0 1 1 0 0 1 1 1 0 1 1 0 0 0 0 1 19 361A_28 0 0 1 0 1 0 1 0 0 0 1 0 1 1 0 0 0 1 1 0 1 1 0 1 0 0 0 1 1 1 1 0 0 1 1 1 0 0 1 1 0 1 0 1 1 1 0 0 1 0 25 625A_29 0 1 1 1 1 0 0 0 0 0 1 0 0 1 0 0 0 1 0 0 1 0 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 24 576A_30 0 1 1 0 1 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 1 0 1 0 18 324A_31 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 0 1 1 0 0 0 1 0 1 1 0 0 0 0 1 1 1 0 1 1 1 1 0 0 0 1 1 0 0 1 0 0 0 0 0 24 576A_32 0 1 1 0 1 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 0 17 289A_33 0 1 1 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 0 1 1 0 1 1 0 1 0 0 0 17 289A_34 0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 1 1 0 0 0 1 1 0 1 1 1 1 1 0 0 18 324A_35 0 1 0 1 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 1 0 0 0 0 0 16 256

A_36 0 1 1 1 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 0 16 256A_37 1 0 1 1 1 1 1 0 1 1 0 1 0 0 1 0 1 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0 1 1 1 1 0 0 0 1 1 1 0 0 1 26 676A_38 0 1 1 0 1 0 0 0 0 0 1 0 1 1 1 0 1 1 0 0 1 0 0 1 0 0 0 0 1 1 1 0 1 1 1 1 0 1 0 0 1 1 0 1 1 1 0 1 1 0 26 676A_39 0 1 0 0 0 0 1 1 1 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0 1 18 324A_40 1 1 0 0 0 0 0 0 1 0 1 0 1 1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 0 0 16 256B_1 0 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 0 16 256B_2 1 1 1 1 1 1 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 1 0 1 0 1 0 1 0 1 1 1 0 1 0 1 1 0 0 0 1 0 1 1 1 0 1 1 1 0 27 729B_3 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 1 0 0 1 1 1 0 0 1 1 0 15 225B_4 1 1 0 0 1 1 1 1 1 1 1 1 1 1 0 1 1 0 0 1 1 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 0 1 1 0 0 1 24 576B_5 0 1 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 1 0 0 1 0 1 1 0 1 1 0 1 0 1 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 1 1 1 26 676B_6 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 0 15 225B_7 1 1 1 0 1 0 1 0 1 0 0 0 0 1 0 0 0 1 1 0 0 1 1 1 0 0 0 0 1 1 1 0 0 0 1 1 1 0 1 1 1 1 1 0 1 0 1 1 0 0 26 676B_8 0 0 0 0 0 0 1 1 1 0 0 0 0 1 0 0 0 1 0 1 0 0 0 1 0 0 1 0 0 1 1 1 0 0 1 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 19 361B_9 0 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 1 1 1 0 0 1 1 0 1 1 0 0 0 1 1 1 20 400B_10 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 1 1 1 1 0 0 0 1 1 1 0 0 1 1 1 0 0 1 1 0 1 1 1 0 0 1 1 1 24 576B_11 1 1 0 0 0 0 0 1 1 1 0 1 0 1 1 0 1 1 1 1 0 1 1 1 0 1 0 0 0 1 1 1 0 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 0 32 1024B_12 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 1 0 1 0 1 0 0 1 1 1 0 1 1 1 1 1 1 1 1 0 1 1 1 22 484B_13 1 1 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 1 0 1 0 0 0 1 0 1 1 0 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 0 1 0 20 400B_14 0 1 
1 0 1 0 1 0 0 0 0 1 1 1 0 0 0 0 1 0 0 1 1 1 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 33 1089B_15 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 1 1 0 0 1 1 1 0 1 0 0 1 1 0 1 0 0 1 1 0 19 361B_16 1 0 1 1 1 1 0 0 1 1 0 0 0 0 1 0 1 0 0 1 0 1 0 0 0 1 1 1 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 1 22 484B_17 1 0 0 1 1 1 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 1 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 1 18 324B_18 1 0 1 1 1 1 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0 1 1 0 0 1 19 361B_19 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 0 1 0 0 15 225B_20 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 1 1 0 0 1 1 1 0 1 0 1 1 1 0 1 0 0 0 0 1 20 400B_21 1 1 1 0 0 0 0 1 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 1 1 0 0 0 1 1 1 1 0 1 1 1 0 0 1 1 0 1 1 1 0 0 1 0 1 26 676B_22 0 1 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 0 1 1 0 1 1 1 0 0 1 0 0 1 1 1 0 0 1 1 1 24 576B_23 0 1 0 0 0 1 0 1 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 1 1 0 0 1 0 1 1 0 0 1 0 1 1 0 0 1 0 0 1 1 1 0 0 1 1 1 22 484B_24 1 0 0 0 1 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 0 0 16 256B_25 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 0 1 0 1 15 225B_26 0 1 1 0 1 0 0 1 0 0 0 1 1 1 0 1 0 0 0 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 1 1 0 0 0 18 324B_27 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 1 1 1 1 0 1 1 1 1 1 0 1 1 0 0 19 361B_28 0 1 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 1 1 0 1 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 1 0 19 361B_29 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 1 1 1 0 1 1 1 0 1 0 1 1 1 0 1 0 0 1 0 1 21 441B_30 0 1 1 1 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 0 0 1 0 1 1 0 0 1 0 1 0 0 0 1 1 0 1 1 1 0 1 1 1 1 24 576B_31 0 1 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1 0 0 1 1 1 1 1 0 1 1 1 0 1 1 1 1 1 1 1 0 1 1 1 1 28 784B_32 0 0 0 
0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 1 0 14 196B_33 0 1 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 0 0 1 1 1 0 0 1 1 0 16 256B_34 1 1 0 1 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 1 0 0 1 21 441B_34 1 1 0 1 0 0 1 1 1 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 1 0 1 1 0 0 1 21 441B_35 0 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 1 0 0 0 1 1 1 0 0 1 1 0 1 1 0 0 0 1 1 1 20 400B_36 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 1 14 196B_37 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 1 1 0 1 1 1 1 0 0 0 1 0 1 1 0 0 0 0 0 0 19 361B_38 1 1 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 1 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 33 1089B_39 0 1 1 1 1 0 1 1 1 0 0 0 0 1 0 0 0 1 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 0 1 26 676B_40 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 1 0 0 1 1 0 1 1 0 0 0 0 0 1 1 0 0 0 1 0 0 13 169∑x 55 108 45 48 47 39 45 77 45 18 76 14 72 90 25 37 24 30 29 20 24 91 31 81 33 25 35 17 51 110 108 45 21 32 44 107 83 35 32 33 61 50 46 87 68 33 36 44 50 50 2507 54139

R 55 108 45 48 47 39 45 77 45 18 76 14 72 90 25 37 24 30 29 20 24 91 31 81 33 25 35 17 51 110 108 45 21 32 44 107 83 35 32 33 61 50 46 87 68 33 36 44 50 50

T 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120 120

P 0,458 0,9 0,375 0,4 0,392 0,325 0,375 0,642 0,375 0,15 0,633 0,117 0,6 0,75 0,208 0,308 0,2 0,25 0,242 0,167 0,2 0,758 0,258 0,675 0,275 0,208 0,292 0,142 0,425 0,917 0,9 0,375 0,175 0,267 0,367 0,892 0,692 0,292 0,267 0,275 0,508 0,417 0,383 0,725 0,567 0,275 0,3 0,367 0,417 0,417

criteria mod easy mod mod mod mod mod mod mod diff mod diff mod mod diff mod diff diff diff diff diff easy mod mod mod diff mod diff mod easy easy mod diff mod mod easy mod mod mod mod mod mod mod mod mod mod mod mod mod mod

U 18 30 28 15 16 14 15 24 15 9 18 8 17 26 6 9 9 14 13 6 7 28 13 26 16 12 7 6 16 29 28 16 10 15 20 30 22 13 13 11 19 15 17 25 27 13 14 18 18 17

L 12 27 11 11 15 6 8 16 10 2 18 2 19 23 10 6 6 5 7 5 7 23 5 22 5 5 6 1 9 30 30 7 4 4 7 26 19 5 3 3 19 16 11 17 17 9 8 8 7 8

D 0,188 0,094 0,531 0,125 0,031 0,25 0,219 0,25 0,156 0,219 0 0,188 -0,063 0,094 -0,125 0,094 0,094 0,281 0,188 0,031 0 0,156 0,25 0,125 0,344 0,219 0,031 0,156 0,219 -0,031 -0,063 0,281 0,188 0,344 0,406 0,125 0,094 0,25 0,313 0,25 0 -0,031 0,188 0,25 0,313 0,125 0,188 0,313 0,344 0,281

1/2 T 32

criteria poor poor v good poor poor mar mar mar poor mar poor poor poor poor poor poor poor mar poor poor poor poor mar poor good mar poor poor mar poor poor mar poor mar v good poor poor mar good mar poor poor poor mar good poor poor good good mar

p 0,458 0,9 0,458 0,4 0,392 0,325 0,375 0,642 0,375 0,15 0,633 0,117 0,6 0,75 0,208 0,308 0,2 0,25 0,242 0,167 0,2 0,758 0,258 0,675 0,275 0,208 0,292 0,142 0,425 0,917 0,9 0,375 0,175 0,267 0,367 0,892 0,692 0,292 0,267 0,275 0,508 0,417 0,383 0,725 0,567 0,275 0,3 0,367 0,417 0,417

q 0,542 0,1 0,542 0,6 0,608 0,675 0,625 0,358 0,625 0,85 0,367 0,883 0,4 0,25 0,792 0,692 0,8 0,75 0,758 0,833 0,8 0,242 0,742 0,325 0,725 0,792 0,708 0,858 0,575 0,083 0,1 0,625 0,825 0,733 0,633 0,108 0,308 0,708 0,733 0,725 0,492 0,583 0,617 0,275 0,433 0,725 0,7 0,633 0,583 0,583

pq 0,248 0,09 0,248 0,24 0,238 0,219 0,234 0,23 0,234 0,128 0,232 0,103 0,24 0,188 0,165 0,213 0,16 0,188 0,183 0,139 0,16 0,183 0,192 0,219 0,199 0,165 0,207 0,122 0,244 0,076 0,09 0,234 0,144 0,196 0,232 0,097 0,213 0,207 0,196 0,199 0,25 0,243 0,236 0,199 0,246 0,199 0,21 0,232 0,243 0,243

Si² 9,798

St² 16

r 0,39

criteria reliable

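Row groups in the table above: difficulty level (R, T, P, criteria); discrimination power (U, L, D, 1/2 T, criteria); reliability (p, q, pq, Si², St², r, criteria).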

Appendix 2

The following is an example of the reliability calculation. To estimate the reliability
of the test, the writer uses the Kuder-Richardson 20 formula:

\[ r = \frac{k}{k-1}\left(1 - \frac{\sum pq}{S^2}\right) \]

r = reliability coefficient of the test items

k = number of items in the test

p = the difficulty index

q = the proportion of the students who give the wrong answer (q = 1 - p)

S² = the variance of the total test scores

The formula to calculate the variance is as follows:

\[ S^2 = \frac{\sum Y^2 - \dfrac{\left(\sum Y\right)^2}{N}}{N} \]

S² = the variance

∑ = the sum of

Y = the total score

N = the number of respondents

From Appendix 1, the variance is calculated as follows:

\[ S^2 = \frac{54719 - \dfrac{(2517)^2}{120}}{120} = 16.04, \qquad \sum pq = 9.798 \]

\[ r = \frac{50}{50-1}\left(1 - \frac{9.798}{16.04}\right) = 0.39 \]

Because the calculated r is larger than the r table value from the product moment table, the test items are reliable.
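The Kuder-Richardson 20 calculation can be retraced in a few lines of code. This is a minimal sketch, assuming the standard KR-20 form with the k/(k-1) factor and the population variance formula given in this appendix; the function names are hypothetical, and the inputs (∑Y = 2517, ∑Y² = 54719, N = 120, k = 50, ∑pq = 9.798) are the figures quoted in the worked example.

```python
def variance(sum_y, sum_y_sq, n):
    """Variance of the total scores: (sum(Y^2) - (sum(Y))^2 / N) / N."""
    return (sum_y_sq - sum_y ** 2 / n) / n


def kr20(k, sum_pq, total_variance):
    """Kuder-Richardson 20 reliability: k/(k-1) * (1 - sum(pq) / S^2)."""
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Figures quoted in the worked example of this appendix.
s2 = variance(sum_y=2517, sum_y_sq=54719, n=120)
r = kr20(k=50, sum_pq=9.798, total_variance=s2)

print(round(s2, 2), round(r, 3))
```

With these inputs the variance is about 16.04 and the coefficient evaluates to roughly 0.397, which the thesis reports as 0.39.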

Appendix 3

The following are the calculations of the discrimination power of the items in the
English mid-term test for eighth grade students of SMP 33 Semarang. The formula to
calculate the discrimination power is:

\[ D = \frac{\mathit{RU} - \mathit{RL}}{\tfrac{1}{2}T} \]

D: discrimination power

RU: the number of students in the upper group who answer the item correctly

RL: the number of students in the lower group who answer the item correctly

½ T: one half of the total number of students included in the item analysis

Based on Appendix 1, where ½ T = 32, examples of the discrimination power calculation are as follows:

1. D = (18 - 12) / 32 = 0.19

2. D = (30 - 27) / 32 = 0.09

3. D = (28 - 11) / 32 = 0.53

4. D = (15 - 11) / 32 = 0.125

For the remaining items, the calculation is the same as in the examples above.
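The discrimination power formula can be sketched as a small function. Two assumptions are made here and should be checked against the thesis's own criteria: the divisor ½ T is taken as 32, the value in the "1/2 T" row of Appendix 1 (it reproduces the printed results 0.19, 0.09, 0.53 and 0.125), and the cut-offs in `classify_d` are inferred from the labels in Appendix 1 rather than stated in this section. Both function names are illustrative.

```python
def discrimination_power(ru, rl, half_t):
    """D = (RU - RL) / (1/2 T)."""
    return (ru - rl) / half_t


def classify_d(d):
    """Label a D value; the cut-offs are assumed, inferred from Appendix 1."""
    if d >= 0.40:
        return "very good"
    if d >= 0.30:
        return "good"
    if d >= 0.20:
        return "marginal"
    return "poor"


HALF_T = 32  # assumed size of each of the upper and lower groups

# (RU, RL) for the four worked examples in this appendix.
examples = {1: (18, 12), 2: (30, 27), 3: (28, 11), 4: (15, 11)}

results = {}
for item, (ru, rl) in examples.items():
    d = discrimination_power(ru, rl, HALF_T)
    results[item] = (round(d, 3), classify_d(d))
    print(item, results[item])
```

A negative D, for example RU = 17 and RL = 19, falls into the "poor" band under these assumed cut-offs.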

Appendix 4

The item difficulty level of the mid-term test for eighth grade students of SMP 33 Semarang is calculated as follows:

\[ P = \frac{R}{T} \]

P = difficulty level or index of difficulty

R = the number of students responding correctly to the item

T = the total number of students responding to the item

Based on Appendix 1, examples of the difficulty level calculation are as follows:

1. P = 55 / 120 = 0.46

2. P = 108 / 120 = 0.9

3. P = 47 / 120 = 0.39

4. P = 48 / 120 = 0.4

For the remaining items, the calculation is the same as in the examples above.
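The difficulty index can be sketched in the same way. The formula P = R / T is the one given in this appendix; the classification cut-offs (easy above 0.75, difficult at 0.25 or below, moderate in between) are an assumption inferred from the labels in Appendix 1, and the function names are hypothetical.

```python
def difficulty_index(r, t):
    """P = R / T: the proportion of students answering the item correctly."""
    return r / t


def classify_p(p):
    """Label a P value; the cut-offs are assumed, inferred from Appendix 1."""
    if p > 0.75:
        return "easy"
    if p > 0.25:
        return "moderate"
    return "difficult"


T_STUDENTS = 120

# R values of the four worked examples in this appendix.
example_r = [55, 108, 47, 48]

labels = []
for r in example_r:
    p = difficulty_index(r, T_STUDENTS)
    labels.append(classify_p(p))
    print(r, round(p, 2), labels[-1])
```

Under these assumed cut-offs, only the second example (108 of 120 students correct) counts as easy; the other three are moderate.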

Appendix 5

The following is the list of the students in the upper group and the lower group and the points that they got.

Upper Group              Lower Group
Student number  Point    Student number  Point
E_4             24       E_26            17
E_13            24       E_38            17
E_18            27       E_39            17
E_19            24       A_6             17
E_24            25       A_19            15
E_33            27       A_21            17
E_35            24       A_32            17
A_4             24       A_33            17
A_10            25       A_35            16
A_12            24       A_36            16
A_20            24       A_40            16
A_24            25       B_1             16
A_26            25       B_3             15
A_28            25       B_6             15
A_29            34       B_19            15
A_31            24       B_24            16
A_37            26       B_25            15
A_38            26       B_32            14
B_2             27       B_33            16
B_4             24       B_36            14
B_5             26       B_40            13
B_7             26       B_17            18
B_10            24       E_37            18
B_11            32       E_31            18
B_14            33       A_30            18
B_21            26       B_26            18
B_22            24       A_13            18
B_30            24       A_39            18
B_31            28       A_16            18
B_38            33       A_23            18
B_39            26       E_7             18
A_37            23       E_21            18

Appendix 6

The following is the list of the respondents in the study.

No   Name
1    Aditya Pratama Putra
2    Ahmad Untung Susilo
3    Amanda Mariana Wattimu
4    Ari Dwi Asto Kencono
5    Avi Oktaviani
6    Bagus Ari Setyawan
7    Bunga Orchidya
8    Chandra Septiana
9    Cesar Catur Wijayanti
10   Desi Ratna Sari
11   Desi Wulandari
12   Dewa Putu Andika Khrisna
13   Dewi Sulistyaningrum
14   Gede Restu Yoga Sanjaya
15   Hira Nur Prasetyo
16   I Gde Wirabawa M
17   Ika Febri Isnaini
18   Kurnia Merdeka Wati
19   Luviana Septianingsih
20   Lydia Novanda
21   Maria Dewi Arumsari
22   Nieko Sudarsono
23   Nur Cahyo
24   Puput Arnta Kurniasari
25   Rahmawati Agustin Setyorini
26   Rheda Pradipta Yahya
27   Ria Fitriana
28   Rida Sonang Roha Boru S
29   Ridwan Yulianto
30   Rony Kevin P. Manulang
31   Rudito Haris Pratomo
32   Samuel Kristianto Saputra
33   Shean Aprillien Dee Ronnie
34   Shelita Lintang Rumanti
35   Tiar Anjani
36   Titus Ardian Wibowo
37   Tomy Margiono
38   Wahid Suryo Hidayat
39   Yeremia Tri Wicaksono
40   Yogi Yuli Saputro
41   Adi Nuriawan
42   Aditya Ferry Nugroho
43   Aditya Yoga Saputra
44   Ananda Ginanjar Bagaswo
45   Annisa Fitri Nu Syahid
46   Aulia Ghassani
47   Bagus Setyo Ardiyani
48   Bayu Nur Satria
49   Bayu Tirta Mahardika
50   Dany Saputri
51   Desy Kurniawati
52   Dhita Kusuma Dewi Sumarno
53   Dimas Panji Wibisono
54   Dyah Hutami Widowati
55   Eka Rektyaningsih
56   Elsa Setyowati
57   Ferry Kurniawan
58   Hanifa
59   Herlina Kurnia Pratiwi
60   Inar Ristiana
61   Indah Juwita Sari
62   Indra Dewantoro
63   Innamugrahan Haritz Zuhd
64   Irvan Prasetyo
65   Ivan Setyo Pratama
66   Kartika Tri Wulandari
67   Lisa Arum Sari
68   Made Rikmawan
69   Muchamat Hasim
70   Muh. Haris Denys
71   Nur Indah Hayati
72   Sakti Nugroho
73   Septiyanto Bagus Nugroho
74   Setia Adi Widodo
75   Shora Pradani
76   Supriyanto
77   Tirza Cynara Putri
78   Titik Kurniawati
79   Wahyu Handoyo
80   Wahyu Wiknyo
81   Aditian Ovandri
82   Ajeng Perdana Sakti
83   Angga Saputro Wijayanto
84   Anik Puji Lestari
85   Bhaskoro Yunanto
86   Christian Adie Nugroho
87   Deyana Tri Buana
88   Devianto
89   Eka Setyawati
90   Erni Puspa Setyowati
91   Fauzi Rahman
92   Gilang Riswanda
93   Hesty Setyadi Putri
94   Ilham Dwiki Putra Nugroho
95   Ilham Salafudin
96   Julius Alexa Purnama Putri
97   Melando Firmansyah
98   Melina Laila Ulya
99   Muhammad Faisal Adilla
100  Muhammad Haidar Pandu
101  Nanda Rizky Kurniawan
102  Nanda Rizky Putro Pamungkas
103  Nurhasan Abdulloh
104  Nurlaila Trihatiningsih
105  Octariyo Harawan
106  Ponco Prandopo Hadi
107  Raditya Prasetiya Putro
108  Rani Kusumaningsih
109  Ria Ayu Fitri Astuti
110  Ridho Pangestu
111  Ridhon Putro Agil
112  Rismayanti
113  Rizky Fajar Susanto
114  Robby Budi Pangestu
115  Ronny Budi Pangestu
116  Satria Bagus Pamungkas
117  Wiwd Fitriani
118  Yohana Rizky Ekawati
119  Youlanda Permata Sari
120  Yuristin Pelita Sari