ITEM ANALYSIS ON THE VALIDITY OF ENGLISH SUMMATIVE
TEST FOR THE FIRST YEAR STUDENTS
(A Case study at the first year SMP YPPUI Ciledug Tangerang School year
2005/2006)
A skripsi
Submitted to the English Teachers Training Program as Partial Fulfillment
of the Requirements for the Degree of Sarjana Pendidikan
By:
Ade Rosita
Reg. No. 102014023778
FACULTY OF TARBIYAH AND TEACHERS' TRAINING
STATE ISLAMIC UNIVERSITY
SYARIF HIDAYATULLAH JAKARTA
2006
ITEM ANALYSIS ON THE VALIDITY OF ENGLISH SUMMATIVE
TEST FOR THE FIRST YEAR STUDENTS
(A Case study at the first year SMP YPPUI Ciledug Tangerang School year
2005/2006)
A Skripsi Presented to the Tarbiyah and Teachers' Training Faculty in Partial
Fulfillment of the Requirements for the Sarjana Degree (S1)
By:
Ade Rosita NIM. 102014023778
Approved by:
Drs. Nasrun Mahmud, MPd. Advisor
ENGLISH DEPARTMENT
FACULTY OF TARBIYAH AND TEACHERS' TRAINING
SYARIF HIDAYATULLAH STATE ISLAMIC UNIVERSITY
JAKARTA
2006/1427 H
CHAPTER II
THEORETICAL FRAMEWORK
A. The Meaning of Test
One of the instruments of evaluation is the test, and the term has been defined
in several ways. M. Buchari, as quoted by Suharsimi Arikunto, said that "Test is
a trial which is held to know some results from a certain subject which is taken
from a student or a group of students".11 By testing, a teacher can find out
what the students have achieved in their learning.
According to Anthony J. Nitko, a test is a "Systematic procedure for
observing and describing one or more characteristics of person with the aid of
either a numerical or category system".12 Gronlund defined evaluation as "a
systematic process of determining the extent to which instructional objectives are
achieved by a student".13
Mochtar Buchari himself said that "Test is a trial which is held to know some
results from a certain subject which is taken from a student or a group of
students".14
11 Dr. Suharsimi Arikunto, Dasar-dasar Evaluasi Pendidikan, (Jakarta: Bina
Aksara, 1987), p. 199
12 Anthony J. Nitko, Educational Test and Measurement An Introduction, (New York:
Harcourt Brace Jovanovich, Inc., 1983), p. 6
13 N.E. Gronlund, Measurement and Evaluation in Teaching, (USA: Macmillan
Publishing Company, 1985), p. 25
14 M. Buchari M. Ed, Tehnik-tehnik dalam Evaluasi Pendidikan, (Bandung, 1980), p. 119
B. Types of Test
There are many types of test used to measure students' achievement. Four
types of achievement test are commonly used by teachers in the classroom:
1. Placement test
A placement test is designed to determine the pupils' performance at the
beginning of instruction.
2. Formative test
A formative test is used at the end of a unit in the course book or after a
lesson. Its result also gives the students immediate feedback.
3. Diagnostic test
A diagnostic test is intended to diagnose learning difficulties during
instruction. Its main aim is to determine the causes of learning
difficulties and then to formulate a plan for remedial action.
4. Summative test
A summative test is intended to show the standard that the students have
now reached in relation to other students at the same stage. Therefore it
typically comes at the end of a course or unit of instruction.15
15 Drs. Wilmar Tinambunan, Evaluation of students achievement, (Jakarta:
Depdikbud.,1998), p.7-9
Based on the statements above, the writer summarizes that a test is, in
general, a systematic and objective procedure for finding out what someone knows
and is able to do as a result of learning.
Teachers carry out a number of tests in the classroom. For practical
purposes the writer presents only two of them, the formative and the summative
test, because both are directly related to the analysis written in this skripsi.
Norman E. Gronlund states that "Formative test is used to monitor the learning
process during the instructional program usually teachers make the test by
themselves".16
Furthermore, the summative test gives teachers not only a final report on
the program's achievement but also a comparison of their individual students'
ability and achievement against the instructional objectives of the teaching and
learning activities.
C. Types of Item
The questions, exercises and tasks appearing on a test are called items.
The kinds of items on a test are:
1. Choice items, which include true-false items, multiple-choice items, and
matching exercises.
2. Completion items, which present an incomplete sentence; the examinee is
required to supply a word or short phrase that best completes the sentence.
16 Norman E. Gronlund, Measurement and Evaluation in Teaching, (New York:
Macmillan Publishing Co., Inc., 1981), 4th ed., p. 6
3. Short-answer items. In this type of item, the student is usually not free
to give expression to creative and imaginative thoughts.
4. Essay items, which permit the testing of a student's ability to organize
ideas and thoughts and allow for creative verbal expression.
D. The Criteria of a Good Test
1. Validity
J.B. Heaton said, "The validity of a test is the extent to which it measures
what it is supposed to measure and nothing else".17 Validity must be
considered in any measurement: it must be seen whether the test used
really measures what it is supposed to measure. In brief, the validity of a
test is the extent to which the test measures what it is intended to
measure. There are four types of validity:
a. Face validity
Face validity means the way the test looks to the testees, teachers,
moderators, and administrators. It is therefore useful to show a test to
colleagues or friends in order to discover its absurdities and
ambiguities.
b. Content validity
Content validity is concerned with the materials that the students have
learned. The test should cover samples of the teaching materials given.
To fulfill this, the teacher should refer to the teaching
17 J.B. Heaton, Writing English Language Test, (Longman, 1988), p. 153
syllabus. J.B. Heaton says, "Content validity depends on careful
analysis of the language being tested and of the particular course
objectives; the test should be so constructed as to contain a
representative sample of the course".
c. Construct validity
Construct validity deals with the construct and the underlying theory of
language learning and testing. J.B. Heaton states, "If the test has
construct validity it is capable of measuring certain specific
characteristics in accordance with a theory of language and behavior
and learning".
d. Empirical validity
There are two kinds of empirical validity, concurrent validity and
predictive validity, which depend on whether the test scores are
correlated with concurrent or subsequent criterion measures.
If we use a test of English as a second language to screen university
applicants and then correlate test scores with grades made at the end of
the first semester, we are attempting to determine the predictive validity
of the test. If, on the other hand, we follow up the test immediately by
having an English teacher rate each student's English proficiency on
the basis of his class performance during the first week and correlate
the two measures, we are seeking to establish the concurrent validity of
the test.18
18 Ibid., p. 154-155
2. Reliability
A test should be reliable as a measuring instrument. A test cannot measure
anything well unless it measures consistently. According to J. Charles
Alderson, Caroline Clapham and Dianne Wall, "a test cannot be valid
unless it is reliable".19
If the test is administered to the same students on different occasions and
the results do not differ, the test can be said to be reliable.
3. Practicality
The third characteristic of a good test is practicality, or usability. In the
preparation of a new test, the teacher must keep in mind a number of very
practical considerations, which involve economy, ease of administration,
scoring, and interpretation of results.
Economy means the test is not costly. The teacher must take into account
the cost per copy; how many scorers will be needed (the more personnel
who must be involved in giving and scoring a test, the more costly the
process becomes); and how long the administering and scoring will take,
choosing a short test rather than a longer one.
Ease of administration and scoring means that the test administrator can
perform his task quickly and efficiently. We must also consider the ease
with which the test can be administered.
19 J. Charles Alderson, Caroline Clapham and Dianne Wall, Language Test Construction
and Evaluation, (Cambridge: Cambridge University Press, 1995), p. 187
Ease of interpretation and application. J.B. Heaton states, "The final point
concerns the presentation of the test paper itself". Where possible, it
should be printed or typewritten and appear neat, tidy and aesthetically
pleasing. Nothing is worse and more disconcerting to the testees than an
untidy test paper, full of misspellings, omissions and corrections. A neat
paper makes it easy for the testees to interpret the test items.20
Besides meeting these criteria, a more important and specific characteristic
of a test is the quality of its items. To know the quality of the test items,
teachers should use a method called item analysis.
E. Item Analysis
Item analysis has been defined in several ways. Anthony J. Nitko states
that "Item analysis refers to the process of collecting, summarizing, and using
information about individual test items especially information about pupils'
response to items".21
Item analysis is an important and necessary step in the preparation of a good
multiple-choice test. For this reason, it is suggested that every classroom
teacher who uses multiple-choice test data should know something of item
analysis: how it is done and what it means.22
20 Op. cit., p. 161
21 Anthony J. Nitko, Educational Test and Measurement an Introduction, (New
York: Harcourt Brace Jovanovich, Inc., 1983), p. 284
22 John W. Oller, Language Test at School, (London: Longman Group, 1979), p. 245
For the teacher-made test, the important uses of item analysis are:
determining whether an item functions as the teacher intended; giving feedback
to students about their performance and a basis for class discussion; giving
feedback about pupil difficulties; identifying areas for curriculum improvement;
revising the items; and improving item-writing skill.
Item analysis usually provides two kinds of information on items:23
1. Item facility, which helps us decide if the test items are at the right
level for the target group, and
2. Item discrimination, which allows us to see if the individual items are
providing information on candidates' abilities consistent with that
provided by the other items on the test.
Item facility expresses the proportion of the people taking the test who got
a given item right. (Item difficulty is sometimes used to express similar
information, in this case the proportion that got an item wrong.) Where the test
purpose is to make distinctions between candidates, to spread them out in terms of
their performance on the test, the items should be neither too easy nor too
difficult. If the items are too easy, then people with differing levels of ability or
knowledge will all get them right, and the differences in ability or knowledge will
not be revealed by the item. Similarly, if the items are too hard, then able and less
able candidates alike will get them wrong and the item will not help us in
distinguishing between them.
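As a minimal sketch of the two proportions described above, the following Python
functions compute item facility and item difficulty for a single item. The
function names and the ten sample responses are hypothetical and are not taken
from the test analyzed in this skripsi; each response is assumed to be scored
1 (right) or 0 (wrong).

```python
def item_facility(responses):
    """Proportion of test takers who got the item right.

    `responses` is a list of 0/1 scores for one item (assumed scoring).
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


def item_difficulty(responses):
    """Proportion of test takers who got the item wrong."""
    return 1 - item_facility(responses)


# Hypothetical responses of ten test takers to one item.
sample = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(item_facility(sample))              # 0.7
print(round(item_difficulty(sample), 2))  # 0.3
```

An item with a facility close to 1 or close to 0 would, following the paragraph
above, tell us little about the differences between candidates.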
23 H.G. Widdowson, Language Testing, (Oxford: Oxford University Press, 2000), p. 60
Analysis of item discrimination addresses a different target: consistency of
performance by candidates across items. The usual method for calculating item
discrimination involves comparing performance on each item by different groups
of test takers: those who have done relatively well on the test overall and those
who have done relatively poorly. For example, as items get harder, we would
expect those who do best on the test overall to be the ones who in the main get
them right. Poor item discrimination indices are a signal that an item deserves
revision.
If there are a lot of items with problems of discrimination, the information
coming out of the test is confusing, as it means that some items are suggesting
that certain candidates are relatively better, while other items suggest that
other individuals are better, and no clear picture of the candidates' abilities
emerges from the test. (The scores, in other words, are misleading and not
reliable indicators of the underlying abilities of the candidates.) Such a test
will need considerable revision.24
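The group comparison described above can be sketched in Python as follows. The
text does not give a formula, so this is only one common way of doing it, under
stated assumptions: test takers are ranked by total score, the upper and lower
groups are each taken as a fixed fraction of the test takers (one half by
default, a hypothetical choice), and the index is the difference between the
proportions answering the item correctly in the two groups. The function name
and the sample data are illustrative only.

```python
def discrimination_index(item_scores, total_scores, fraction=0.5):
    """Difference between the proportion correct in the upper group and
    in the lower group, with groups formed by ranking on total score.

    item_scores  : 0/1 score of each test taker on one item
    total_scores : total test score of each test taker, in the same order
    fraction     : share of test takers placed in each group (assumption)
    """
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two score lists of equal length (>= 2)")
    group_size = max(1, int(n * fraction))
    # Indices of test takers ordered from highest to lowest total score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    p_upper = sum(item_scores[i] for i in upper) / group_size
    p_lower = sum(item_scores[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical data: six test takers, one item.
totals = [30, 28, 25, 15, 12, 10]
good_item = [1, 1, 1, 0, 0, 0]   # answered correctly only by high scorers
poor_item = [0, 0, 0, 1, 1, 1]   # answered correctly only by low scorers
print(discrimination_index(good_item, totals))  # 1.0
print(discrimination_index(poor_item, totals))  # -1.0
```

A value near zero or below zero would mark the kind of item that, in the terms
used above, deserves revision.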
24 Ibid., p.61
CHAPTER III
RESEARCH METHODOLOGY AND FINDINGS
A. RESEARCH METHODOLOGY
1. Research Design
a. Research Method
To solve the problem presented in the statement of the problem and the
limitation of the study, the writer did both library research, by reading
books on the characteristics of a good English test, and field research, by
analyzing the test paper, conducting an interview, and taking the test
instrument to be analyzed as data.
b. Time and Location
The writer took the test paper and the test instruments on 21st
December 2005. The school used as the case study is SMP YPPUI Ciledug, a
junior high school located at Jl. Raden Fatah No. 36, Sudimara Barat,
Ciledug, Tangerang.
c. Techniques of Sample Taking
In this research, the writer took the sample from the first year students
of SMP YPPUI Ciledug Tangerang. The total number of students taken as the
sample is 40.
d. Techniques of Data Collecting
To collect the data she needed, the writer used the steps below:
1). Observation
In carrying out her observation, the writer visited the school to ask for
the students' results on the summative test of the English subject and for
the question sheet of the test to be analyzed. The writer also interviewed
the English teacher of the first year of SMP YPPUI Ciledug and obtained the
statement of research from the Headmaster of SMP YPPUI Ciledug, Tangerang.
2). Documentation
Documentation means collecting the files or data of related information,
including the results of the first grade students' examination in the odd
semester.
e. Techniques of Data Analysis
To analyze the data, the writer used descriptive analysis and the
quantitative research method; the writer processed and analyzed the data by
using the following formula:20
\[
r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^{2} - (\sum X)^{2}\right]\left[N\sum Y^{2} - (\sum Y)^{2}\right]}}
\]
Where:
r = Validity of item
X = Score of each testee on the item
N = Total number of testees
Y = Total score of each testee on the test
20 Dr. Sumarna Surapranata, Analisis, Validitas, Reliabilitas, dan Interpretasi hasil tes
implementasi kurikulum 2004, (Bandung: PT Remaja Rosdakarya, 2004), p. 74
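The product-moment formula above can be sketched in Python as follows. This is
an illustration under stated assumptions, not the writer's own computation: X is
read as each testee's score on the item (1 for right, 0 for wrong) and Y as each
testee's total score on the test, which is the usual reading of this formula.
The function name and the sample scores are hypothetical.

```python
import math


def item_validity(item_scores, total_scores):
    """Product-moment correlation between item scores (X) and
    total scores (Y), computed from the raw-score formula."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two score lists of equal length (>= 2)")
    sum_x = sum(item_scores)
    sum_y = sum(total_scores)
    sum_xy = sum(x * y for x, y in zip(item_scores, total_scores))
    sum_x2 = sum(x * x for x in item_scores)
    sum_y2 = sum(y * y for y in total_scores)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        # Happens when every testee has the same X or the same Y.
        raise ValueError("correlation is undefined for constant scores")
    return numerator / denominator


# Hypothetical data: six testees, one item.
item = [1, 1, 1, 0, 0, 0]
totals = [30, 28, 25, 15, 12, 10]
r = item_validity(item, totals)
print(r > 0)  # True: high scorers tend to answer this item correctly
```

A negative result would mean that testees with low total scores tend to answer
the item correctly while testees with high total scores do not.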
2. Research Findings
a. Description of Data
The type of test studied by the writer is a summative test: the final test
of the odd semester for the first year students of junior high school in the
academic year 2005-2006. The test consists of 40 multiple-choice items. Each
item consists of a stem and four options, one of which is the key while the
others are distracters. The test was held on Wednesday, 21st December, and
the students were given 90 minutes to answer the test items.
According to the first year English teacher of SMP YPPUI Ciledug (the
school that is the place of this case study), in the interview with the
writer, the test was organized and set by the Regional Office of National
Education.
b. Analysis of Data
The validity of each item was calculated with the product-moment formula.
A positive mark shows that the item has been useful as it should be. A
negative mark shows that the key answer is not useful as it should be: the
lower group responds correctly with the key, but the upper group answers
wrongly. The result, the key answer and the corresponding table in the
appendix for each item are as follows (the key answer is not stated for the
items with a negative mark):

Item   Key   Validity (r)   Table in appendix
 1     B      0.914          6
 2     -     -0.302          7
 3     C      0.386          8
 4     A      0.350          9
 5     C      0.410         10
 6     A      0.354         11
 7     C      0.406         12
 8     C      0.150         13
 9     C      0.089         14
10     A      0.562         15
11     B      0.097         16
12     D      0.100         17
13     -     -0.302         18
14     B      0.110         19
15     B      0.006         20
16     D      0.124         21
17     A      0.518         22
18     A      0.264         23
19     B      0.261         24
20     C      0.264         25
21     B      0.332         26
22     A      0.495         27
23     B      0.064         28
24     A      0.225         29
25     -     -0.031         30
26     B      0.068         31
27     C      0.104         32
28     -     -0.315         33
29     A      0.344         34
30     B      0.065         35
31     B      0.211         36
32     C      0.232         37
33     C      0.114         38
34     B      0.339         39
35     -     -0.070         40
36     D      0.125         41
37     B      0.079         42
38     C      0.014         43
39     -     -0.070         44
40     D      0.179         45
c. The Interpretation of Data
Based on the data analysis, the writer concludes the following about the
item validity of the English summative test. There are 6 items (no. 2, 13, 25,
28, 35, and 39) which must be revised, because their negative mark shows that
the items have not been useful as they should be.
The other items have a positive mark, which shows that the items have
been useful as they should be. Based on the computation of the validity
analysis, 6 items (no. 5, 7, 10, 17, 18, and 22) have enough quality, 11 items
(no. 3, 4, 6, 16, 20, 21, 24, 29, 31, and 32) have low quality, and 17 items
(no. 1, 8, 11, 12, 14, 15, 19, 23, 26, 27, 30, 33, 36, 37, 38, and 40) have
the lowest quality.
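The list of items to be revised can be checked against the validity results
reported in the Analysis of Data with a short Python sketch. The coefficients
below are copied from that section, with decimal commas written as decimal
points; the variable and function names are hypothetical. Only the sign of each
coefficient is used, because the cut-off values behind the "enough", "low" and
"lowest" quality groups are not stated in the text.

```python
# Validity coefficient of each item as reported in the Analysis of Data,
# keyed by item number.
reported_r = {
    1: 0.914, 2: -0.302, 3: 0.386, 4: 0.350, 5: 0.410,
    6: 0.354, 7: 0.406, 8: 0.150, 9: 0.089, 10: 0.562,
    11: 0.097, 12: 0.100, 13: -0.302, 14: 0.110, 15: 0.006,
    16: 0.124, 17: 0.518, 18: 0.264, 19: 0.261, 20: 0.264,
    21: 0.332, 22: 0.495, 23: 0.064, 24: 0.225, 25: -0.031,
    26: 0.068, 27: 0.104, 28: -0.315, 29: 0.344, 30: 0.065,
    31: 0.211, 32: 0.232, 33: 0.114, 34: 0.339, 35: -0.070,
    36: 0.125, 37: 0.079, 38: 0.014, 39: -0.070, 40: 0.179,
}


def items_to_revise(coefficients):
    """Return, in ascending order, the items with a negative coefficient."""
    return sorted(item for item, r in coefficients.items() if r < 0)


print(items_to_revise(reported_r))  # [2, 13, 25, 28, 35, 39]
```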
CHAPTER IV
CONCLUSION AND SUGGESTION
A. CONCLUSION
Based on the data analysis and the data interpretation in the previous chapter,
the writer concludes the following about the item validity of the English test for
the odd semester for the first year students of SMP YPPUI Ciledug:
Based on the statistical calculation of item validity (empirical validity), the
writer interprets the empirical validity of the test as being at the level of "bad",
because 18% of the items must be revised, 18% have enough quality, 24% have
low quality and 40% have the lowest quality.
B. SUGGESTION
From the conclusion above, the writer would like to give some suggestions
as follows:
1. To fulfill the characteristics of a good test, the items should be examined and
analyzed first.
2. The content of the test should be in line with the curriculum and the GBPP
and should not deviate from the material which was given to the testees.
3. The test maker should set particular objectives for the test which are related
to the curriculum, so that the test is representative enough of the
curriculum.
BIBLIOGRAPHY
Anas, Sudijono, Prof, Drs., Pengantar Evaluasi Pendidikan, Jakarta: PT. Raja
Grafindo Persada, 2003, Cet. ke 4
Anas, Sudijono, Prof, Drs., Pengantar Evaluasi Pendidikan, Jakarta: PT. Raja
Grafindo Persada, 2005, Cet. Ke 5
Bailey, Kathleen M., Learning About Language Assessment: Dilemmas, Decisions, and
Directions, United States of America: ITP, an International Thomson
Publishing Company, 1998
Brown, H. Douglas, Teaching by Principles: An Interactive Approach to Language
Pedagogy, Longman: San Francisco University, 2001, Second Edition
Gronlund, N.E., Measurement and Evaluation in Teaching, New York: Macmillan
Publishing co., 1985, fifth edition
Harris, David P., Testing English as a Second Language, New York: Tata McGraw-Hill,
Inc., 1969
Heaton, J.B., Writing English Language Test, Longman:1988.
Murcia, Celce, Marianne, Teaching English as a second or foreign Language, Second
Edition. United States of America., 1991.
Nitko, Anthony J., Educational Test and Measurement An Introduction, New York:
Harcourt Brace Jovanovich, Inc., 1983.
Nunnaly, Jum C., Educational Measurement and Evaluation. New York: Mc Graw
Hill Book Company, 1964.
Nurkancana, Wayan and Sumartana, Evaluasi Pendidikan, Surabaya : Usaha
Nasional, 1986.
Oller, John W., Language Test at School, A Pragmatic Approach, London: Longman,
1979.
Purwanto, Ngalim, Prinsip-prinsip dan Tehnik Evaluasi Pengajaran, Bandung: PT
Remaja Rosdakarya.,1989.
Slameto, Drs., Evaluasi Pendidikan, Jakarta: Bumi Aksara.,2001, Cet. Ke 3
Sudjana, Nana, Drs., Penilaian Hasil Proses Belajar Mengajar, Bandung: PT Remaja
Rosdakarya., 1991
Surapranata, Sumarna, Dr., Analisis, Validitas, Reliabilitas dan Interpretasi Hasil Tes
Implementasi Kurikulum 2004, Bandung: PT Remaja Rosdakarya, 2004.
Tim Penyusun, Pedoman Penulisan Skripsi, Tesis, dan Disertasi, Cet 2, Jakarta: UIN
Jakarta Press., 2002.
Widdowson, H G. Language Testing, Oxford University, 2000.
Wilmar Tinambunan, Evaluation of students Achievement, Jakarta: Depdikbud.,1998