pilot study

PILOT STUDY: RELIABILITY & VALIDITY

SHAHIZAH BINTI SHUKRI

Educational Multimedia 5th Colloquium

WHAT IS A PILOT STUDY?

• A mini-version of a full-scale study, or a trial run done in preparation for the complete study.

• Also called a ‘feasibility’ study.

• The specific pre-testing of research instruments, including questionnaires or interview schedules (Polit et al. & Baker, in Nursing Standard, 2002:33-44).

• It is “reassessment without tears” (Blaxter, Hughes & Tight, 1996:121): trying out all the research techniques and methods the researcher has in mind to see how well they will work in practice.

OBJECTIVES OF A PILOT STUDY

1. A pilot study is a small experiment designed to test logistics.

2. Gather information prior to a large study.

3. Improve the actual study’s quality and efficiency.

4. Reveal deficiencies in the design of a proposed experiment or procedure, so that these can be addressed ahead of time.

5. A good research strategy requires careful planning, and a pilot study will often be a part of this strategy.

The Value of a Pilot Study

• Welman and Kruger (1999:146) listed the following three values of a pilot study:

i. to detect possible flaws in measurement procedures and in the operationalisation of independent variables.

ii. to identify unclear or ambiguous items in a questionnaire.

iii. to observe the non-verbal behaviour of participants, which gives important information about any embarrassment or discomfort experienced concerning the content or wording of items in a questionnaire.

Pilot Study Quotes

• Quotes concerning the value and goal of pilot studies:

i. “to see if the beast will fly” (De Vos, 2002:410)

ii. “reassessment without tears” (Blaxter, Hughes & Tight, 1996:121)

iii. “Do not take the risk. Pilot test first.” (Van Teijlingen & Hundley, 2001:2)

Recommendations

• Pilot studies should have a well-defined set of aims and objectives to ensure methodological rigour and scientific validity.

• Participants in an external pilot should not later be included in the main study to make savings in recruitment, because then the decision to proceed with the main study would not be made independently of the results of the pilot study.

• The analysis of a pilot study should be mainly descriptive or should focus on confidence interval estimation.

• Results from hypothesis testing should be treated as preliminary and interpreted with caution, as no formal power calculations have been carried out.

• The temptation not to proceed with the main study when significant differences are found should be avoided.
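The advice to report confidence intervals rather than hypothesis tests can be illustrated with a feasibility outcome. The sketch below assumes a hypothetical pilot in which 18 of 30 invited respondents completed the questionnaire, and it uses the Wilson score interval for a proportion. The interval method, the function name and the figures are illustrative choices, not taken from the slides.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return centre - half_width, centre + half_width


# Hypothetical pilot data: 18 of 30 invited respondents completed the form.
completed, invited = 18, 30
low, high = wilson_interval(completed, invited)
print(f"completion rate {completed / invited:.2f}, "
      f"95% CI {low:.2f} to {high:.2f}")
```

With a sample this small the interval is wide, which is why pilot results are best read as rough estimates for planning the main study.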

Pilot Study guidelines

• The number of respondents is not fixed precisely; at least 25 people is suggested, and between 50 and 75 is better.

• For a new study, run the pilot test twice.

• Do not use the actual focus group of the main study.

Sample Sizes for Pilot Studies

• Sample size calculations may not be required for some pilot or exploratory studies.

• A pilot study may be used to generate information to be used for sample size calculations.

• The sample for a pilot needs to be representative of the target population.

– It should be sufficient to address the key feasibility objectives.

– It should also be based on the same inclusion/exclusion criteria.

Limitations

1. A pilot study is done on a smaller scale. Thus, the actual results of the study may vary from the results of the pilot study.

2. A pilot study is usually carried out on members of the relevant population, but not on those who will form part of the final sample.

3. A pilot study is normally small in comparison with the main experiment and can therefore provide only limited information on the sources and magnitude of variation of response measures.

VALIDITY

• Validity is arguably the most important criterion for the quality of a test.

• The term validity refers to whether or not a test measures what it intends to measure.

• There are several ways to estimate the validity of a test, including content validity, construct validity, criterion-related validity (concurrent & predictive) and face validity.

The question of validity is raised in the context of three points:

1. The form of the test

2. The purpose of the test

3. The population for whom it is intended

Therefore, we cannot ask the general question “Is this a valid test?”. The question to ask is “How valid is this test for the decision that I need to make?” or “How valid is the interpretation I propose for the test?”

Validity is thus a requirement for both quantitative and qualitative research.

TYPES OF VALIDITY

Content: related to objectives and their sampling.

Construct: referring to the theory underlying the target.

Criterion: related to concrete criteria in the real world. It can be concurrent or predictive.

Concurrent: correlating highly with another measure that has already been validated.

Predictive: capable of anticipating some later measure.

A correlation coefficient is a statistical summary of the relation between two variables. It is the most common way of reporting the answer to such questions as the following: Does this test predict performance on the job? Do these two tests measure the same thing? Do the ranks of these people today agree with their ranks a year ago?
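As a rough illustration of how such a coefficient is obtained, the sketch below computes the Pearson product-moment correlation between test scores and later job ratings. The data and the names `pearson_r`, `test_scores` and `job_ratings` are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    # Sum of cross-products and sums of squared deviations.
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: a selection test and supervisor ratings a year later.
test_scores = [52, 60, 65, 70, 78, 85]
job_ratings = [2.9, 3.1, 3.0, 3.8, 4.0, 4.4]
print(f"validity coefficient r = {pearson_r(test_scores, job_ratings):.2f}")
```

The same function answers the other two questions as well: pass the scores from two tests, or the same people's scores from two occasions.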

According to Cronbach, to the question “what is a good validity coefficient?” the only sensible answer is “the best you can get”, and it is unusual for a validity coefficient to rise above 0.60, though that is far from perfect prediction.

All in all, we always need to keep in mind the contextual questions:

• What is the test going to be used for?

• How expensive is it in terms of time, energy and money?

• What implications are we intending to draw from test scores?

https://statistics.laerd.com/statistical-guides/pearson-correlation-coefficient-statistical-guide.php

RELIABILITY

How representative is the measurement?

• Research requires dependable measurement.

• Measurements are reliable to the extent that they are repeatable; any random influence that tends to make measurements differ from occasion to occasion or from circumstance to circumstance is a source of measurement error (Nunnally, 1978).

• Reliability is the degree to which a test consistently measures whatever it measures. (Gay, 1987)

• Errors of measurement that affect reliability are random errors and errors of measurement that affect validity are systematic or constant errors.

There are three major categories of reliability for most instruments: test-retest, equivalent forms, and internal consistency.

Reliability

Test-Retest

The degree to which scores are consistent over time. It indicates score variation that occurs from testing session to testing session as a result of errors of measurement.

Equivalent-Forms

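Used when it is likely that test takers will recall responses made during the first session and when alternate forms are available. Administer both forms to the same group and correlate the two sets of scores. The obtained coefficient is called the coefficient of equivalence, or the coefficient of stability and equivalence when the two forms are administered at different times.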

Internal Consistency

Determining how all items on the test relate to all other items. The Kuder-Richardson formula is an estimate of reliability that is essentially equivalent to the average of the split-half reliabilities computed for all possible halves.

Two indexes of internal consistency:

• Split-half reliability

• Coefficient alpha
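The Kuder-Richardson estimate can be sketched for items scored right (1) or wrong (0). The slides do not say which Kuder-Richardson formula is meant; the sketch assumes formula 20 (KR-20), and the function name and data are hypothetical.

```python
from statistics import pvariance


def kr20(item_scores):
    """KR-20 reliability for dichotomous (0/1) items.

    item_scores holds one row per test taker and one column per item.
    """
    n_items = len(item_scores[0])
    if n_items < 2:
        raise ValueError("KR-20 needs at least two items")
    n_people = len(item_scores)
    # Variance of the total scores (population form, matching p * q below).
    total_var = pvariance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    pq_sum = 0.0
    for column in zip(*item_scores):
        p = sum(column) / n_people  # proportion answering the item correctly
        pq_sum += p * (1 - p)
    return n_items / (n_items - 1) * (1 - pq_sum / total_var)


# Hypothetical pilot data: 5 test takers, 4 items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(answers):.2f}")
```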

INTER-RATER RELIABILITY

The extent to which two or more individuals (coders or raters) agree. Inter-rater reliability assesses the consistency of how a measuring system is implemented. It depends on the ability of two or more individuals to be consistent. Training, education and monitoring can enhance inter-rater reliability.

For example, suppose two or more teachers use a rating scale to rate students’ oral responses in an interview (1 being most negative, 5 being most positive). If one teacher gives a "1" to a student response while another gives a "5", the inter-rater reliability is clearly low.
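Agreement between two raters can be quantified in a few lines. The sketch below computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement statistic that the slides do not mention and that is used here only as an illustration. The ratings and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of responses on which both raters gave the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length rating lists")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected if both raters assigned categories independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings (1 to 5) of eight oral responses by two teachers.
teacher_1 = [1, 2, 3, 3, 4, 5, 5, 2]
teacher_2 = [1, 2, 3, 4, 4, 5, 4, 2]
print(f"percent agreement = {percent_agreement(teacher_1, teacher_2):.2f}")
print(f"Cohen's kappa     = {cohen_kappa(teacher_1, teacher_2):.2f}")
```

Kappa treats the ratings as unordered categories, so a "1" versus "5" disagreement counts the same as a "4" versus "5" disagreement. For ordered scales a weighted statistic or a correlation may be more informative.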

INTRA-RATER RELIABILITY

A type of reliability assessment in which the same assessment is completed by the same rater on two or more occasions. These different ratings are then compared, generally by means of correlation. Since the same individual completes both assessments, the rater's subsequent ratings may be contaminated by knowledge of the earlier ratings.

Split half reliability

Splitting a test into two equivalent halves and then assessing the consistency of the scores across the two halves of the test. Divide the test into halves and correlate the scores from the two halves, then apply the Spearman-Brown formula to the half-test correlation to estimate the reliability of the full-length test. A low correlation indicates that the test is unreliable; a high correlation indicates that the test is reliable.
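The procedure can be sketched as follows, assuming an odd/even split of the items (one common way to form two halves; the slides do not specify how to split). The Spearman-Brown step uses the standard formula r_full = 2 r_half / (1 + r_half). The data and function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cross / math.sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step the half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected reliability).

    item_scores holds one row per respondent and one column per item.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical pilot data: 5 respondents, 6 Likert items (1 to 5).
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 2, 3, 2, 2, 3],
    [3, 3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 1],
]
r_half, r_full = split_half_reliability(responses)
print(f"half-test r = {r_half:.2f}, Spearman-Brown reliability = {r_full:.2f}")
```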

Coefficient alpha

Lee Cronbach (1951) developed coefficient alpha, also known as Cronbach's alpha. Coefficient alpha tells you the degree to which the items are interrelated. Rule of thumb: at a minimum, greater than or equal to .70 for research purposes, and somewhat greater than that value (e.g. ≥ .90) for clinical testing purposes.
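A minimal sketch of the calculation, using the standard formula alpha = k / (k - 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The data, the function name and the 0.70 cut-off used in the check (the commonly cited research rule of thumb) are illustrative assumptions.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's coefficient alpha.

    item_scores holds one row per respondent and one column per item.
    """
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents (sample variance).
    item_var_sum = sum(variance(column) for column in zip(*item_scores))
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    if total_var == 0:
        raise ValueError("total scores do not vary")
    return n_items / (n_items - 1) * (1 - item_var_sum / total_var)


# Hypothetical pilot data: 5 respondents, 4 Likert items (1 to 5).
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
verdict = "meets" if alpha >= 0.70 else "falls below"
print(f"alpha = {alpha:.2f} ({verdict} the 0.70 research rule of thumb)")
```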

(α) correlation coefficients are as follows:

0.00 to +1.00 = the basic range

.60 to .70 = satisfactory coefficients

.70 to .80 = stability coefficients

.80 to .90 = customary coefficients

.90 to .95 = sufficient coefficients

.80 to .90 = acceptable reliability

.90 to +1.00 = very good reliability

.95 to +1.00 = acceptable standardised test for internal consistency

James Popham (1990), Modern Educational Measurement: A Practitioner’s Perspective, 2nd Edition, Englewood Cliffs, New Jersey: Prentice Hall, p. 127.

Hair (2003)

What are some ways to improve validity?

• Make sure your goals and objectives are clearly defined and operationalized. Expectations of students should be written down.

• Match your assessment measure to your goals and objectives. Additionally, have the test reviewed by faculty at other schools to obtain feedback from an outside party who is less invested in the instrument.

• Get students involved; have the students look over the assessment for troublesome wording, or other difficulties.

• If possible, compare your measure with other measures, or data that may be available.

RELATIONSHIP BETWEEN VALIDITY & RELIABILITY

• Validity and reliability are closely related.

• A test cannot be considered valid unless the measurements resulting from it are reliable.

• However, results from a test can be reliable without necessarily being valid.

Additional information

1. Pearson product-moment correlation

2. Validity and reliability in qualitative and quantitative research

3. Comparison of values of Pearson’s and Spearman’s correlation coefficients on the same sets of data
