
PSG FAIMER REGIONAL INSTITUTE

SCHOLARLY REPORT

M-L WEB DISCUSSION – JUNE 2011

QUESTIONNAIRE

BY

DR NANDITA HAZRA
PSG-FAIMER FELLOW 2010

ASSOCIATE PROFESSOR
DEPARTMENT OF MICROBIOLOGY

ARMED FORCES MEDICAL COLLEGE, PUNE

PSG-FAIMER ML Web Assignment, Jun 2011 Report

MODERATORS - Asma and Subish (2011 PSG-FRI Fellows)
CO-MENTORS - Nandita and Mahalaxmi (2010 PSG-FRI Fellows)
SENIOR FELLOWS - Vinutha (2009 PSG-FRI Fellow)
FACULTY - Medha, Supten, Animesh and Meera (PSG-FRI Faculty)

TABLE OF CONTENTS

S NO  TITLE                                                    PAGE
1     INTRODUCTION                                             4
2     AIM OF STUDY                                             5
3     OBJECTIVES OF STUDY                                      5
4     MATERIAL AND METHODS                                     6
      (a) Programme for Intersession Assignments 2011-2012
      (b) Programme for Intersession Assignment on Questionnaire Design 2011
      (c) Summary of Preliminary Needs Assessment on Questionnaire Design 2011/2010/Other Fellows
5     WEEKLY SUMMARY SUBMITTED BY 2011 FELLOWS
      Summary Week 1 with SLOs                                 13
      (a) Types of questionnaire
      (b) Steps in questionnaire design
      (c) Characteristics of an ideal questionnaire
      (d) When to use a questionnaire
      (e) Types of questions
      (f) Pretesting
      (g) Administering the questionnaire
      (h) Limitations of a questionnaire (including postal)
      (i) Challenges of questionnaire administration
      (j) Rating scales including Likert with examples
      Summary Week 2 with SLOs                                 20
      (a) Classification of validity and various measurements of reliability
      (b) Reliability and validity in qualitative and quantitative research
      (c) Advantages and disadvantages of various validation techniques
      (d) Questions answered by FAIMER Faculty
      Summary Week 3 with SLOs                                 28
      (a) Pilot testing in research - various aspects discussed
      Summary Week 4 with SLOs                                 36
      (a) Principles of data analysis
6     REVIEW OF LITERATURE                                     44
      (a) Principles of questionnaire design
      (b) Qualities of a good question
      (c) Nine steps to development of a questionnaire
      (d) Preliminary work in questionnaire development
      (e) Data analysis
      (f) Types of questions
      (g) Questions to be avoided in a questionnaire
      (h) Qualities of a good question
      (i) Data analysis
      (j) Limitations of a questionnaire
      (k) Questionnaire design process
      (l) Likert scale
      (m) Validity
      (n) Reliability
7     CONCLUSION                                               70
8     APPENDICES
      A - Planning stage of questionnaire design - a post view 71
      B - Shades of Likert scales                              74
      C - Needs assessment at onsite (Ack: PSG FAIMER Fellow 2008 Amol) on which this session was based  78
      Questionnaire research design flowchart                  80
      Time considerations                                      81
9     BIBLIOGRAPHY                                             82


INTRODUCTION

The design of a questionnaire is a craft. A questionnaire should be appropriate, intelligible, unambiguous, unbiased, capable of coping with all possible responses, satisfactorily coded, piloted and ethical. A questionnaire is a series of questions asked of individuals to obtain statistically useful information about a given topic. When properly constructed and responsibly administered, questionnaires become a vital instrument by which statements can be made about specific groups of people or entire populations.

Questionnaires are frequently used in research. They are a valuable method of collecting a wide range of information from a large number of individuals, often referred to as respondents. Adequate questionnaire construction is critical to the success of a survey. Inappropriate questions, incorrect ordering of questions, incorrect scaling, or bad questionnaire format can make the survey valueless, as it may not accurately reflect the views and opinions of the participants. A useful method for checking a questionnaire and making sure it is accurately capturing the intended information is to pretest among a smaller subset of target respondents.

A questionnaire is a list of written questions that can be completed in one of two basic ways. Firstly, respondents could be asked to complete the questionnaire with the researcher not present, as in a postal questionnaire. Secondly, respondents could be asked to complete the questionnaire by verbally responding to questions in the presence of the researcher, this variation being called a structured interview. Although the two variations are similar (a postal questionnaire and a structured interview could contain exactly the same questions), the difference is that the respondent's anonymity is better protected by a postal questionnaire than by a structured interview.

Questionnaires are usually restricted to two basic types of question:

Closed-ended (or "closed question"): a question for which a researcher provides a suitable list of responses (e.g. Yes / No). This produces mainly quantitative data.

Open-ended (or "open question"): a question where the researcher does not provide the respondent with a set answer from which to choose; rather, the respondent is asked to answer "in their own words". This produces mainly qualitative data.

AIM OF STUDY

The aim of the Mentor Learner Web Discussion on Questionnaire was to familiarize Fellows with the concept of questionnaire design and to enhance their knowledge and skills in questionnaire design, in order to enable the 2011 Fellows to develop questionnaire(s) for their Curriculum Innovation Project.

OBJECTIVES OF STUDY

- Understand what a questionnaire is and how best to use it
- Understand the types of questionnaires available
- List the various types of open-ended and closed-ended questions that can be used in a questionnaire
- Be able to formulate question instructions and layout of questionnaires for own CIP
- Understand how to use the Likert scale in the CIP questionnaire
- Learn how to validate the individual statements in a questionnaire and use tests of reliability
- Understand how to pilot a questionnaire and its role in questionnaire development
- Learn to use the various techniques in the analysis of data generated from questionnaires

MATERIAL AND METHODS

METHODOLOGY
PSG FAIMER uses online mentoring and Learning Web Discussion through the listserv to train FAIMER Fellows during the second and fourth onsite sessions. The PSG-FRI ML Web is an effective tool to encourage active learning and to discuss important issues. The topic selected for Jun 2011 was "QUESTIONNAIRE DESIGN".

FAIMER Intersession discussions: FAIMER conducts year-long discussions on the web, termed List-serve ML-Web sessions.

Choice of the topic: The topic was selected from a wide range of choices: Questionnaire design, PBL, Ethics in Medical Education, Clinical Skills training, Need-based Curriculum and E-learning. Questionnaire design was chosen to assist FAIMER Fellows in the conduct of their CIP, or Curriculum Innovation Project, at their home institutions, since most CIPs require the development of questionnaires. The aim was to increase the utility of the session for Fellows through contributions from Senior Fellows and Faculty.

PROGRAMME FOR INTERSESSION ASSIGNMENTS 2011-2012

MONTH          | ASSIGNMENT                  | MODERATORS (2011 FELLOWS)   | CO-MENTORS (2010 FELLOWS)
June 2011      | Questionnaire design        | Subish, Asma                | Mahalaxmi, Nandita
July 2011      | Problem Based Learning      | Komala, Sunitha, Reneega    | Sarala, Anand
August 2011    | Ethics in Medical Education | Sharada, Rashmi             | Sudha, Chetna
October 2011   | Clinical Skills training    | Sarath, Shobha, Piryani     | Latha, Smitha
November 2011  | Need Based Curriculum       | Althaf, Salim               | Narasimman, Niranjan
December 2011  | E-Learning                  | Sunil, Kalaiselvan          | Samuel, Prashanth, Bhadresh
January 2012   | Humor in Medical Education  | Ashakiran, Ramesh           | Thomas Mathew, Renu, Brogen


Choice of the theme for discussion: Since the Jun team wanted to make the ML web session innovative, a new idea of making a different beginning was thought of. The planning discussion was coordinated by the Fellows and Faculty associated with the session. This was done under the e-mail heading "QUESTIONNAIRE DESIGN: NEED TO GET THINGS READY", and consensus was reached on a plan to be followed. However, this plan was flexible. Moderators were encouraged during the planning phase to take over the discussion as much as possible. The slots were also kept open to be shared by the two 2011 and two 2010 Fellows, so that if any person was busy during a particular time, the others arranged to take over the session. It was decided to conduct the Jun 2011 discussions in the following way.

PROGRAMME FOR INTERSESSION ASSIGNMENT ON QUESTIONNAIRE DESIGN JUN 2011

Conduct of the session during the month: As the planning in the previous month progressed, the Jun 2011 team witnessed new thoughts arising and queries being raised. These were all accommodated in separate threads; the original line of discussion was followed in the major thread.

Participation: There was good participation from the 2011, 2010 and 2009 Fellows, with almost all members contributing to the discussion and learning from one another. The team did not hesitate to send reminders to participants who were engaged in conferences or busy with academic, professional or personal work.

DATES           | TOPIC ASSIGNED                                                | 2011   | 2010
01-07 Jun 2011  | 1. Introduction and review of literature. 2. Questionnaire development of individual projects. 3. Assignment to understand the types of questionnaires. | Asma   | Nandita
08-14 Jun 2011  | Validation and reliability of questionnaires                  | Asma   | Maha
15-21 Jun 2011  | Pretesting/piloting of questionnaires                         | Subish | Nandita
22-28 Jun 2011  | Analysis of individual project questionnaires and methods to analyze data | Asma | Maha

Planning was done on-site and also through e-mail among the moderator, co-moderators and Faculty mentors. The topics were assigned as follows:

End of May: The needs assessment survey on questionnaire design and the specific learning objectives of the month-long session were posted on the listserv for feedback from all the participants.
First week: Begin discussion on the topic 'What is a questionnaire?'. Discuss the different types of questionnaires which Fellows intend to use in their CIP. Fellows are to post their questionnaires for comments and improvement.
Second week: Validity and reliability.
Third week: Pre-testing/piloting of a questionnaire.
Fourth week: Analysis of data generated from the questionnaire.

SUMMARY OF PRELIMINARY NEEDS ASSESSMENT ON QUESTIONNAIRE DESIGN

SURVEY RESPONSES BY THE PSG FAIMER FELLOWS 2011 BATCH

Types of questionnaires used in CIP:
Closed ended questions - 2
KAP questions - 1
Open ended questions - 1
Retro pre question - 2
Pre and post test - 1
Confidence level questionnaire - 1
Structured questionnaire - 1
Semi structured - 1
Likert scale - 1

Assistance needed in questionnaire design in CIP:
Reliability - 3
Validation - 5
Refining - 4
Retro pre - 1


1. Rank, in order of importance, the features that need to be covered in the ML web discussion on Questionnaire designing

Feature | Least Important | Less Important | Slightly Important | More Important | Most Important
Discussion on what a questionnaire is and how best to use it | 0.0% (0) | 14.3% (1) | 14.3% (1) | 42.9% (3) | 28.6% (2)
Discussion on the types of questionnaires available | 0.0% (0) | 14.3% (1) | 14.3% (1) | 28.6% (2) | 42.9% (3)
Discussion on the various types of open ended and closed ended questions that can be used in a questionnaire | 0.0% (0) | 0.0% (0) | 28.6% (2) | 42.9% (3) | 28.6% (2)
Discussion on the issues for formulation of question instructions and layout of questionnaires for own CIP | 0.0% (0) | 0.0% (0) | 14.3% (1) | 42.9% (3) | 42.9% (3)
Discussion on use of the Likert Scale | 0.0% (0) | 0.0% (0) | 25.0% (2) | 25.0% (2) | 50.0% (4)
Discussion about validation of the questionnaire | 0.0% (0) | 0.0% (0) | 0.0% (0) | 28.6% (2) | 71.4% (5)
Discussion about reliability of the questionnaire | 0.0% (0) | 0.0% (0) | 0.0% (0) | 28.6% (2) | 71.4% (5)
Discussion on issues related to pre-testing/piloting | 0.0% (0) | 0.0% (0) | 0.0% (0) | 50.0% (4) | 50.0% (4)
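The "x.x% (n)" figures in the needs-assessment tables can be reproduced from raw counts. A minimal sketch, using hypothetical responses chosen to match the first row above (n = 7 respondents on the five-point importance scale):

```python
from collections import Counter

# Hypothetical responses to one needs-assessment item
# (1 = Least Important ... 5 = Most Important).
responses = [4, 5, 4, 2, 3, 4, 5]

counts = Counter(responses)
n = len(responses)

# Percentage (count) per scale point, in the same "x.x% (n)" style as the tables.
labels = ["Least Important", "Less Important", "Slightly Important",
          "More Important", "Most Important"]
for point, label in enumerate(labels, start=1):
    c = counts.get(point, 0)
    print(f"{label}: {100 * c / n:.1f}% ({c})")
```

With these inputs the loop prints 0.0% (0), 14.3% (1), 14.3% (1), 42.9% (3), 28.6% (2), matching the first table row.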

OTHER FELLOWS

Types of questionnaires used in educational research:
Closed ended questions - 4
KAP questions - 1
Open ended questions - 6
Retro pre question - 3
Pre and post test - 3
Confidence level questionnaire -
Structured questionnaire -
Semi structured -
Likert scale - 2
Semi quantitative - 1
Quantitative - 1
Qualitative - 1

Assistance needed in questionnaire design:
Reliability - 5
Validation - 6
Refining -
Retro pre - 1
Piloting - 2
Translation - 1
Statistical tools - 1
Methods other than Likert scale - 1


1. Rank, in order of importance, the features that need to be covered in the ML web discussion on Questionnaire designing

Feature | Least Important | Less Important | Slightly Important | More Important | Most Important | Response Count
Discussion on what a questionnaire is and how best to use it | 0.0% (0) | 10.0% (1) | 30.0% (3) | 40.0% (4) | 20.0% (2) | 10
Discussion on the types of questionnaires available | 0.0% (0) | 9.1% (1) | 36.4% (4) | 36.4% (4) | 18.2% (2) | 11
Discussion on the various types of open ended and closed ended questions that can be used in a questionnaire | 0.0% (0) | 0.0% (0) | 20.0% (2) | 60.0% (6) | 20.0% (2) | 10
Discussion on the issues for formulation of question instructions and layout of questionnaires for own CIP | 0.0% (0) | 10.0% (1) | 10.0% (1) | 50.0% (5) | 30.0% (3) | 10
Discussion on use of the Likert Scale | 9.1% (1) | 0.0% (0) | 18.2% (2) | 36.4% (4) | 36.4% (4) | 11
Discussion about validation of the questionnaire | 0.0% (0) | 0.0% (0) | 0.0% (0) | 9.1% (1) | 90.9% (10) | 11
Discussion about reliability of the questionnaire | 0.0% (0) | 0.0% (0) | 0.0% (0) | 18.2% (2) | 81.8% (9) | 11
Discussion on issues related to pre-testing/piloting | 0.0% (0) | 0.0% (0) | 11.1% (1) | 11.1% (1) | 77.8% (7) | 9

QUESTIONNAIRE DESIGN
WEEKLY SUMMARY SUBMITTED BY 2011 FELLOWS

ML-WEB DISCUSSION - SUMMARY OF WEEK 01-07 Jun 2011
Submitted by PSG FAIMER Fellow 2011 Asma

TOPIC: INTRODUCTION AND REVIEW OF LITERATURE

SPECIFIC LEARNING OBJECTIVES FOR WEEK 1
At the end of the first week ML-Web discussion the participants should be able to:

1. Understand what a questionnaire is and how best to use it.
2. List the types of questionnaires available.
3. List the various types of open-ended and closed-ended questions that can be used in a questionnaire.
4. Be able to formulate question instructions and layout of questionnaires for own CIP.
5. Understand how to use the Likert scale in the CIP questionnaire.

SET INDUCTION: The Questionnaire group started with a month-long theatre act titled "Questionnaire: Who dunnit? Who nailed it?" Character roles in the discussions were actively taken up by FAIMERly over the week-long discussion. The SLOs were set based on responses to a Survey Monkey questionnaire administered to FAIMERly.

Summary of the active discussion follows:

A questionnaire is a research instrument consisting of a series of questions and other prompts for the purpose of gathering information from respondents. It is a series of questions asked of individuals to obtain statistically useful information about a given topic. When properly constructed and responsibly administered, questionnaires become a vital instrument by which statements can be made about specific groups of people or entire populations.

Type of questionnaire | Areas of use | Method of administration
Structured questionnaire | Used in a large interview of >50 respondents. Typically used where it is possible to closely anticipate the possible responses. | Telephone / face-to-face / self-completion
Semi-structured questionnaire | Used widely in business market research where there is a need to accommodate different responses. Also used where the responses cannot be anticipated. | Face-to-face / telephone
Unstructured questionnaire | The basis of many studies into technical or narrow markets. Also used in in-depth interviewing and in group discussions. Allows probing and searching where a skilled researcher is not fully sure of the responses before the interview. | Group discussions / industrial visit interviews / depth telephone interviews

STEPS IN QUESTIONNAIRE DESIGN - FLOW DIAGRAM

CHARACTERISTICS OF AN IDEAL QUESTIONNAIRE

The design of questionnaires is a craft which has been badly neglected by the medical profession. A questionnaire should be appropriate, intelligible, unambiguous, unbiased, capable of coping with all possible responses, satisfactorily coded, piloted, and ethical. The key steps in designing a questionnaire are to:
1. decide what data you need;
2. select items for inclusion;
3. design the individual questions;
4. compose the wording;
5. design the layout and presentation;
6. think about coding;
7. prepare the first draft and pretest;
8. pilot and evaluate the form;
9. perform the survey.
Despite the apparently complicated nature of the task, theoretical knowledge is no substitute for practical experience.

Personal interviews are a way to get in-depth and comprehensive information. They involve one person interviewing another for personal or detailed information. Typically, an interviewer asks questions from a written questionnaire and records the answers verbatim. Personal interviews are time consuming and generally used only when subjects are not likely to respond to other survey methods.

Telephone surveys are the fastest method of gathering information from a relatively large sample (100-400 respondents). The interviewer follows a prepared script that is essentially the same as a written questionnaire. However, unlike a mail survey, the telephone survey allows the opportunity for some opinion probing. Telephone surveys generally last less than ten minutes.

Mail surveys are a cost-effective method of gathering information. They are ideal for large sample sizes, or when the sample comes from a wide geographic area. Because there is no interviewer, there is no possibility of interviewer bias. The main disadvantage is the inability to probe respondents for more detailed information.

E-mail and internet surveys are relatively new, and little is known about the effect of sampling bias in internet surveys. While this is clearly the most cost-effective and fastest method of distributing a survey, the demographic profile of the internet user does not represent the general population, although this is changing. Before doing an e-mail or internet survey, one must carefully consider the effect that this bias might have on the results.

When to use a questionnaire?

There is no all-encompassing rule for when to use a questionnaire. The choice depends on a variety of factors, including the type of information to be gathered and the resources available for the study. A questionnaire should be considered in the following circumstances.

a. When resources and money are limited. A questionnaire can be quite inexpensive to administer. Although preparation may be costly, any data collection scheme will have similar preparation expenses. The administration cost per person of a questionnaire can be as low as postage and a few photocopies. Time is also an important resource that questionnaires can maximize. If a questionnaire is self-administering, such as an e-mail questionnaire, potentially several thousand people could respond in a few days. It would be impossible to get a similar number of interviews completed in the same short time.

b. When it is necessary to protect the privacy of the participants. Questionnaires are easy to administer confidentially. Often confidentiality is necessary to ensure participants will respond honestly, if at all. Examples of such cases include studies that need to ask embarrassing questions about private or personal behavior.

c. When corroborating other findings. In studies that have the resources to pursue other data collection strategies, questionnaires can be a useful confirmation tool. More costly schemes may turn up interesting trends, but occasionally there will not be resources to run these other tests in large enough participant groups to make the results statistically significant. A follow-up large-scale questionnaire may be necessary to corroborate these earlier results.

Deciding on what to include in a questionnaire

There are six common ways to get information: literature searches, talking with people, focus groups, personal interviews, telephone surveys and mail surveys.


A literature search involves reviewing all readily available materials. It is a very inexpensive method of gathering information. Talking with people is a good way to get information during the initial stages of a research project. It can be used to gather information that is not publicly available, or that is too new to be found in the literature. Although often valuable, the information has questionable validity because it is highly subjective and might not be representative of the population.

Open-ended versus Closed-ended Questions

An open-ended question is one in which you do not provide any standard answers to choose from. For example, these are both open-ended questions:

How old are you? ______ years.
What do you like best about your job?

A closed-ended question is one in which you provide the response categories, and the respondent just chooses one:

What do you like best about your job?
(a) The people
(b) The diversity of skills you need to do it
(c) The pay and/or benefits
(d) Other: ______________________________ (write in)
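The contrast above (closed-ended responses yield quantitative data; write-ins yield qualitative data) can be sketched in code. All the answer data here is hypothetical, invented purely for illustration:

```python
# A minimal sketch of how closed-ended answers yield quantitative data
# while the "Other" write-in yields qualitative data.
question = "What do you like best about your job?"
options = {
    "a": "The people",
    "b": "The diversity of skills you need to do it",
    "c": "The pay and/or benefits",
    "d": "Other",
}

# Hypothetical completed forms: (chosen option, write-in text if option d).
answers = [("a", None), ("c", None), ("a", None), ("d", "the commute"), ("b", None)]

tally = {key: 0 for key in options}   # quantitative: counts per category
write_ins = []                        # qualitative: free-text responses
for choice, text in answers:
    tally[choice] += 1
    if choice == "d" and text:
        write_ins.append(text)

print(tally)        # {'a': 2, 'b': 1, 'c': 1, 'd': 1}
print(write_ins)    # ['the commute']
```

The tally can be analyzed statistically straight away, while the write-in list still needs qualitative coding before it can be summarized.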

REFINING THE QUESTIONNAIRE: PRETESTING

The pretest is a try-out of the questionnaire to see how it works and whether changes are necessary before the start of the actual survey. About 15 to 20 respondents, whose characteristics are reasonably similar to the survey population, will be adequate for a pretest. The questionnaire is then revised and finalized on the basis of the pretest results. Pretesting can help you determine the strengths and weaknesses of your survey concerning question format, wording and order.
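Drawing a pretest group of the size suggested above can be done by simple random sampling without replacement. A minimal sketch; the 200-person sampling frame and the respondent-ID format are hypothetical:

```python
import random

# Hypothetical sampling frame: IDs of people resembling the survey population.
frame = [f"respondent_{i:03d}" for i in range(1, 201)]

# The text suggests 15-20 pretest respondents; draw 20 without replacement.
random.seed(42)                      # fixed seed so the draw is reproducible
pilot_group = random.sample(frame, k=20)

print(len(pilot_group))              # 20
print(len(set(pilot_group)))         # 20 -- no one is sampled twice
```

In practice the frame would come from an institutional roster, and the draw would be documented so the pretest respondents can be excluded from the main survey.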

Pretest technique | Objectives
Focus groups | Determine how respondents define key words, terms, and phrases. Determine whether respondents interpret phrases and questions as the researcher intends. Obtain a general assessment of respondents' ability to perform required tasks (e.g. recall relevant information, estimate frequency of specific behaviors, etc.).
Think-alouds | Identify questions that respondents cannot answer accurately (e.g., recall problems, inability to estimate frequencies accurately).
Response latency | Identify questions that are too complex or that are difficult to understand. Measure attitude strength.
Computer-assisted coding of concurrent protocols | Identify respondent comprehension, retrieval, judgement, and response problems.
Expert panel review of questionnaire | Identify potential respondent comprehension and/or response problems. Identify potential interviewer problems. Identify potential data analysis problems. Obtain suggestions for revising questions and/or the questionnaire.
Use of rating forms | Identify questions that are awkward or difficult to read.
Group discussion of behavior coding results | Assess respondent interest. Obtain suggestions for revising questions and/or the questionnaire. Identify sampling problems.
Vignettes | Assess whether different question wording affects respondents' interpretation of a question. Identify terms and concepts that respondents interpret differently from researchers.
Behavior coding | Identify problem questions based on the frequency of occurrence of respondent behavior.

Administering the Questionnaires:

A focus group is used as a preliminary research technique to explore people's ideas and attitudes. It is often used to test new approaches. A group of 6 to 20 people meet in a conference-room-like setting with a trained moderator. The moderator leads the group's discussion and keeps the focus on the areas you want to explore. The disadvantage is that the sample is small and may not be representative of the population in general.

Putting questions quickly into a questionnaire is a great ability, but that "ability" comes into question if it is done hastily without considering the statement of the problem, the study objectives, the hypothesis and, most importantly, the "trait" or "traits" the study aims to measure. This is what we call "validity".

Limitations of using a questionnaire survey

Some disadvantages of questionnaires:
- Questionnaires, like many evaluation methods, occur after the event, so participants may forget important issues.
- Questionnaires are standardized, so it is not possible to explain any points in the questions that participants might misinterpret. This could be partially solved by piloting the questions on a small group of students, or at least friends and colleagues; it is advisable to do this anyway.
- Open-ended questions can generate large amounts of data that can take a long time to process and analyze. One way of limiting this is to limit the space available to students so their responses are concise, or to sample the students and survey only a portion of them.
- Respondents may answer superficially, especially if the questionnaire takes a long time to complete. The common mistake of asking too many questions should be avoided.
- Students may not be willing to answer the questions. They might not wish to reveal the information, or they might think that they will not benefit from responding, perhaps even be penalised for giving their real opinion. Students should be told why the information is being collected and how the results will be beneficial. They should be asked to reply honestly, and told that a negative response is just as useful as a more positive opinion. If possible the questionnaire should be anonymous.

Limitations of a postal questionnaire:
1. The format of questionnaire design makes it difficult for the researcher to examine complex issues and opinions. Even where open-ended questions are used, the depth of answers that the respondent can provide tends to be more limited than with almost any other method of research. This makes it difficult for a researcher to gather information that is rich in depth and detail.
2. Where the researcher is not present, it is always difficult to know whether or not a respondent has understood a question properly.
3. The researcher has to hope the questions asked mean the same to all the respondents as they do to the researcher.
4. The response rate (that is, the number of questionnaires that are actually returned to the researcher) tends to be very low for postal questionnaires.
5. The problem of the self-selecting sample is particularly apparent in relation to questionnaires. When a response rate is very low, the responses received may only be the opinions of a very highly motivated section of the sample.

A major strength of the postal questionnaire is that one can cover a large sample; for example, a nationwide survey may not be possible with conventional methods.
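The response rate defined in point 4 is simply questionnaires returned divided by questionnaires sent out. A minimal sketch with hypothetical figures:

```python
# Response rate = questionnaires returned / questionnaires sent out.
# The figures below are hypothetical.
sent = 400
returned = 92

response_rate = returned / sent
print(f"response rate: {response_rate:.1%}")   # response rate: 23.0%
```

Reporting the rate alongside the raw counts lets readers judge how serious the self-selection problem described in point 5 might be.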

Challenges Of Questionnaire administering in Community surveys:

1) Low coverage: Inability to cover the appropriate sample calculated.2) Time of Questionnaire administering: In field surveys there is a higher response rate from Women in household, with nonparticipation of the male gender.3) Issue of Privacy: In culturally and socially sensitive health issues.4) False Reporting: In the absence of respondents, questionnaire will be filled in by other family members5) Too lengthy Questionnaires: A large number of variables go unfilled which get captured during data handling.6) Self filled questionnaires: Questionnaires needed to be administered on a one to one basis seem to be filled in by the respondent by discussing with peers or family members leading to respondent bias.7) Low response rates: Unfamiliarity with the personnel may lead to withholding key and sensitive information as in Family planning studies. USE OF RATING SCALES IN QUESTIONNAIRES:A series of response options to research questions, representing degrees of a particular characteristic. The options are specifically ordered with sequential values (known as "ordinal") and have little overlap between neighboring options.

Characteristics of Rating Scales

Rating scales can have many characteristics, including the following:

Type of scale - The scale can be formatted in a number of ways, including as a Likert scale, a semantic differential scale, a magnitude estimation scale, or a continuum.

Scale size - Scale size refers both to the overall length of the scale and to whether there is an odd or even number of options (an odd number gives a middle point). Research suggests between 5 and 9 scale points, depending on the topic and the skills of the respondents (Krosnick and Fabrigar, 1997).

Likert scales are probably the most common type of scale in usability questionnaires: bipolar, with a range from positive to negative responses to a question.

Semantic differential scales have options labeled at each end. The Questionnaire for User Interaction Satisfaction (QUIS), published by the University of Maryland, uses this format.


In magnitude estimation, respondents do not use a pre-defined scale but are instructed to provide ratings proportionate to a baseline rating; the only requirement is that the ratings be positive. McGee (2003, 2004) provides details of this method, which he used successfully to evaluate the perceived usability of software.

Type of labels – Scales can have some or all of their responses labeled, using text or numbers. Numbers may suggest regular intervals between response options, but they can also affect respondents’ selections (respondents tend to avoid negative numbers).

Utility of a rating scale – Professionals often use rating scales in questionnaires to collect data from users on subjective topics such as satisfaction and ease of use, but rating scales can be used with more objective topics as well.

Using a Likert scale in a CIP:

1. Which of the following best describes the material covered in this session? The material was:
Too basic / Just right / Too advanced

2. I was engaged throughout this session.
Strongly disagree / Moderately disagree / Disagree / Agree / Slightly agree / Moderately agree / Strongly agree

3. My knowledge and/or skills increased as a result of this session.
Strongly disagree / Moderately disagree / Disagree / Agree / Slightly agree / Moderately agree / Strongly agree
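With ordinal Likert data such as the above, the best summary measure is the mode, the most frequent response. A minimal Python sketch, using invented replies to item 2, shows how it might be tallied:

```python
from collections import Counter

def likert_mode(responses):
    """Return the most frequent response (the mode), usually the best
    summary for ordinal Likert data, together with its count."""
    label, count = Counter(responses).most_common(1)[0]
    return label, count

# Hypothetical replies to item 2 from ten session participants.
replies = ["Agree", "Strongly agree", "Agree", "Moderately agree",
           "Agree", "Slightly agree", "Strongly agree", "Agree",
           "Moderately agree", "Agree"]
print(likert_mode(replies))  # -> ('Agree', 5)
```

Because Likert options are ordered categories rather than equal intervals, the mode (and, with care, the median) is safer than the mean as a summary.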


M-L WEB DISCUSSION-SUMMARY OF WEEK 2 (8-14 JUNE 2011) Submitted by PSG FAIMER Fellow 2011 Subish Palaian

TOPIC: VALIDATION AND RELIABILITY OF QUESTIONNAIRE

SPECIFIC LEARNING OBJECTIVES FOR WEEK 2
At the end of the second week of the ML-Web discussion, the participants should be able to:

1. Understand the concepts of validity and reliability.
2. List the various types of validity and reliability.
3. Use the various tests on a questionnaire and determine that it is valid and reliable.

The discussion ran from 8 to 14 June 2011. Altogether there were 44 postings on the ML Web regarding the topic. The opening posting was made by Mahalakshmi with the message as under.

Some of the initial comments from the FAIMERLY are as under:

Renu: Validity means your questionnaire will measure what it intends to measure. Reliability means your questionnaire will measure the same thing every time.

Subish: 'Reliability is the consistency of the measurement'. In other words, it is the repeatability of the measurement and ‘Validity is the strength of the conclusions, inferences’.

Sharada: Reliability refers to the repeatability, stability or the internal consistency of a questionnaire and validity refers to whether a questionnaire is measuring what it purports to.

Classification of validity and various measurements of reliability

Validity

It is the degree to which a questionnaire reflects reality. There are a number of different facets to validity.

Internal validity: the degree to which questions within an instrument agree with each other, i.e., that a subject will respond to similar questions in a similar way.

External validity: the ability to make generalizations about a population beyond that of the sample tested.


Statistical validity: this is related to internal validity, and assesses whether the differences in the questionnaire results between patient groups can appropriately be subjected to statistical tests of significance.

Longitudinal validity: whether a questionnaire returns the same results in a given population over time, assuming all else remains equal.

Linguistic validity: whether the wording of the questionnaire is understood in the same way by everyone who completes it.

Discriminant validity: the ability of the questionnaire to detect true differences between groups, and detect no difference when there isn’t one. It is the opposite of convergent validity and means that the indicators of one construct hang together or converge, but also are negatively associated with opposing constructs.

Construct validity: the ability of a measure to assess correctly a particular cause-and-effect relationship between the measure and some other factor. It is measured with multiple indicators and addresses the question: if the measure is valid, do the various indicators operate in a consistent manner?

Face Validity: judgment by the scientific community that the indicator really measures the construct. It addresses the question: On the face of it, do people believe that the definition and method of measurement fit?

Content Validity: addresses the question: is the full content of a definition represented in a measure? A conceptual definition holds ideas: it is a "space" containing ideas and concepts. Measures should sample or represent all ideas or areas in the conceptual space. Content validity involves three steps. First, specify the content in a construct's definition. Next, take a sample from all areas of the definition. Finally, develop one or more indicators that tap all of the parts of the definition.

Criterion Validity: uses some standard or criterion to indicate a construct accurately. The validity of an indicator is verified by comparing it with another measure of the same construct in which a researcher has confidence. There are two subtypes of this kind of validity:
Concurrent - agrees with a preexisting measure.
Predictive - agrees with future behavior (e.g., a conservatism measure: test it on conservative groups, who should score high, then on liberal groups, who should score low; if both hold, the measure is "validated" by the pilot testing).


Convergent Validity: This kind of validity applies when multiple indicators converge or are associated with one another and multiple measures of the same construct hang together or operate in similar ways.

In general, validity is not an absolute quality. It is a continuum, with a questionnaire being valid to a certain degree in certain circumstances, and researchers must decide (preferably before the validation study is run) what degree of validity is considered sufficient.

Each type of validity is distinct, meaning that a questionnaire can have one kind of validity but not another. Because of that, a questionnaire can never really be fully “validated.” It can only be validated for x patient population, under y conditions, and so forth.

In conclusion, validity is that we are measuring what we want to measure and reliability is the consistency or repeatability of the measure.

Both validity and reliability are important for a questionnaire. If the tool cannot measure what it intends to, it becomes invalid. Anybody using the tool should be able to measure consistently (reliability) what it intends to measure (validity).

Validity as a unitary concept

The unitary view stresses that validity is a single, overarching quality of a measure and should not be subdivided.

In practice, however, validity is usually subdivided (http://maaw.info/ValidityNotes.htm):
· based on experimental design and control criteria, e.g., internal / external / statistical conclusion / construct
· based on the Trinitarian model of validity, e.g., criterion-related / construct / content


There is an online tutorial available on the nitty-gritty of Reliability and Validity: http://education.calumet.purdue.edu/vockell/research/workbook/workbook5.htm

Similarly at: http://education.calumet.purdue.edu/vockell/research/workbook/workbook5.htm#Statement33b

The American Psychological Association has defined three specific tools (content validity / criterion-related validity / construct validity) for estimating aspects of validity. These technical tools refer to specific aspects of the overall concept of validity. By examining them, we can gain insights which may be useful to us in constructing, administering, and interpreting various data collection processes.

Again, the site gives elaborate examples and exercises to clarify all doubts.

Here is an interesting paper on this background: "Deficiencies in medical education research quality are widely acknowledged. Content, internal structure, and criterion validity evidence support the use of the Medical Education Research Study Quality Instrument (MERSQI) to measure education research quality, but predictive validity evidence has not been explored." http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2517948/


Reliability

It is the degree to which a questionnaire will produce the same result if administered again (the "test-retest" concept). It is also a measure of the degree to which a questionnaire can reflect a true change.

Inter-Rater or Inter-Observer Reliability: used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon. If, using the same bathroom weighing scale, 'A' measures a person at 60 kg and 'B' also measures the same person at 60 kg, that is good inter-rater reliability.

Test-Retest Reliability: used to assess the consistency of a measure from one time to another. If I measure a person at 60 kg today and, half an hour later, measure again with the same weighing machine and find the same 60 kg, that is good test-retest reliability.
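Test-retest consistency is commonly quantified by correlating the two sets of scores. A small Python sketch (the paired scores below are invented for illustration) computes the Pearson correlation between two administrations of the same instrument:

```python
def pearson_r(x, y):
    """Pearson correlation between paired scores, a common way of
    quantifying test-retest reliability."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical totals from the same 5 respondents, two weeks apart.
first  = [12, 15, 19, 22, 25]
second = [13, 14, 20, 21, 26]
print(round(pearson_r(first, second), 3))  # -> 0.979
```

A coefficient close to 1 suggests stable responses over time; a low value suggests either genuine change in views or ambiguous wording.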

Parallel-Forms Reliability: used to assess the consistency of the results of two tests constructed in the same way from the same content domain. If 'A' measures a person at 60 kg with a bathroom scale and obtains the same 60 kg with a digital weighing machine, that is good parallel-forms reliability.

Internal Consistency Reliability (Cronbach's alpha): used to assess the consistency of results across items within a test. It is used where more than one question is used to measure one construct; we check whether the questions elicit similar responses from a person. http://www.socialresearchmethods.net/kb/reltypes.php
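The Cronbach's alpha calculation itself is short. Below is a self-contained Python sketch using sample variances; the Likert scores are invented for illustration:

```python
from statistics import variance  # sample variance (n - 1 denominator)

def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire.

    items: one list per question, each holding that question's score for
    every respondent (respondents in the same order in every list).
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))

# Invented 5-point Likert scores: 3 items answered by 6 respondents.
q1 = [4, 5, 3, 4, 2, 5]
q2 = [4, 4, 3, 5, 2, 5]
q3 = [3, 5, 2, 4, 1, 5]
print(round(cronbach_alpha([q1, q2, q3]), 2))  # -> 0.95
```

When the items move together across respondents, the variance of the totals dominates the sum of the item variances and alpha approaches 1; uncorrelated items push it towards 0.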

To cite an example, http://education.calumet.purdue.edu/vockell/research/workbook/workbook5.htm#Statement61

The standard error of measurement provides a slightly different approach to describing the reliability of a data collection process. The standard error of measurement describes the range within which the "true" score of an individual is likely to occur. The more reliable a test is, the narrower this range will be.

For example, a student might score 77 on an unreliable reading test with a reliability of 0.50 and a standard error of 10. Since the test is unreliable, the score would probably be different if the student retook the same test or an alternate form of it. The standard error of 10 means that there is a good chance that the student's actual score - which could be ascertained by taking the test or its alternate forms a large number of times to rule out chance fluctuations - probably falls between 67 and 87.

If the test were more reliable - say, with a reliability of .80 and a standard error of 3 - we would have greater confidence in the accuracy of a given student's score. If a student received a score of 77 on a test with a standard error of 3, we would estimate that the actual score lies somewhere within the range of 74 to 80.
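The arithmetic behind these bands follows from SEM = SD x sqrt(1 - reliability). The sketch below reproduces the first example; the standard deviation of about 14.14 is an assumed value, chosen only so that a reliability of 0.50 yields the standard error of 10 used above:

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def score_band(observed: int, sem: float):
    """Band of +/- one SEM around an observed score, as in the
    67-87 example above."""
    return observed - sem, observed + sem

# Assumed SD of ~14.14, so reliability 0.50 gives SEM ~= 10.
sem = standard_error_of_measurement(14.142, 0.50)
print(round(sem, 1), score_band(77, 10))  # -> 10.0 (67, 87)
```

The same formula shows why the more reliable test tightens the band: as reliability rises towards 1, sqrt(1 - reliability) shrinks and so does the SEM.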

Reliability and validity in qualitative research

Traditionally, researchers from different disciplines have placed great emphasis on the concepts of reliability and validity in research.

Reliability: the ability of a measurement procedure to produce the same results when used in different places by different researchers, consistently measuring the same features. An example is a ruler: it reliably measures length regardless of when, where or by whom it is used.

Validity: the extent to which a research tool measures what it is supposed to measure.

In designing research studies these aspects must be considered carefully. Researchers using quantitative methods try to ensure that their data collection instruments are reliable by removing any extraneous variables.

Extraneous variable: any variable other than the independent variable which may influence the effect to be measured.

When designing research tools such as questionnaires, researchers using quantitative approaches spend long periods developing and refining their questionnaires to ensure that responses will be reliable. One way of doing this is a simple test-retest approach: the researcher administers a questionnaire to a group of respondents on one occasion and then, perhaps several weeks later, administers the same questionnaire to the same group, looking for consistency between the two sets of responses. If most respondents do not answer as they did on the first occasion, the researcher might assume either that they have changed their views or that the way the questions are phrased does not generate a consistent interpretation. The latter explanation is the more likely, because it is improbable that a range of people would all change their views on a given topic over a short period of time.

In qualitative research we are asking people to describe things in their own words. The nature of the response we get is therefore likely to be different. Therefore, we could not claim that our questions were going to result in the same or consistent responses from all respondents. Because the instruments used for collecting data in qualitative research do not yield consistent responses, reliability in qualitative research is said to be low compared with highly structured, quantitative questionnaires.

In qualitative research, however, it is easier to measure what is supposed to be measured and therefore validity is said to be high in qualitative research. Although quantitative research tools may gather consistent responses, highly structured approaches make it harder to be sure that we are measuring what we are supposed to measure – to guarantee validity. Quantitative researchers will spend a long period of time developing tools to ensure they address the issue of validity. For the qualitative researcher, however, this can be much more straightforward. For example, if we were to ask a group of people to tell us what they felt about the National Health Service today we might anticipate the sorts of things they would say. If trying to get the same information by structured questionnaire we would need to take care not to anticipate the responses in the way in which we phrased the questions.

There are also some additional ways in which we can check the validity of our data in qualitative research. We can refer back to the person interviewed and ask him or her to check the way we have interpreted his or her response by asking 'does this interpretation ring true for you?' If the interpretation is considered inaccurate then we can review our procedures for selection of ideas and concepts and reconsider the data.

Advantages and disadvantages of the various validation techniques

Responses of fellows on the advantages and disadvantages of the various methods of validation commonly used in medical research (n=3):

Method of validation: Expert opinion, discussion with the department, MEU faculty, peer feedback
Advantages:
1. One of the simplest ways to obtain evidence for validation
2. No statistical procedure needed
3. The questionnaire is critically reviewed, which aids fine adjustment
4. Offers important feedback from many peers and subject experts
Disadvantages:
1. The opinion is taken from a sample that will not be part of the study sample
2. Disagreement/contradiction among reviewers
3. Time consuming

Method of validation: Applying the questions to a small group of participants
Advantages:
1. Easy
2. Provides practical training in administering the questionnaire, strengthening later performance
3. Feedback can be obtained from students
4. Testing is done on a sample representative of the bigger sample
Disadvantages:
1. Needs statistical methods
2. Internal consistency not checked
3. Increases the resources needed (cost and time)
4. May result in modification or redesign

Method of validation: Cronbach's alpha
Advantages:
1. Internal consistency is checked
Disadvantages:
1. Difficult for those not familiar with Excel and SPSS
2. Not an independent tool of validity

Method of validation: Cognitive interview (think aloud, verbal probing)
Advantages (think aloud):
1. Direct interview with the person concerned
2. Respondents can clarify their doubts
3. Freedom from interviewer-imposed bias
4. Minimal interviewer training requirements
Advantages (verbal probing):
5. Control of the interview
6. Ease of training the subject
Disadvantages (think aloud):
1. Need for subject training
2. Subject resistance
3. Burden on the subject
4. Tendency for the subject to stray from the task
5. Bias in subject information processing
Disadvantages (verbal probing):
6. Artificiality
7. Potential for bias

Questions and answers (By Dr. Shital Bhandary)

1. I want to explore the 'practical circumstances' in medical education research where we can apply the concepts of validity and reliability.
- If you intend to use any research tool, quantitative or qualitative, it should be validated. Reliability, on the other hand, becomes abstract with qualitative tools but is indispensable for quantitative ones. Nonetheless, people have developed tools to measure the reliability of qualitative studies as well, e.g., consistency between the researchers before, during, and after coding.

2. Do we need to find out validity and reliability for all types of questionnaires?
- Yes, and it is a "must" if the questionnaire is quantitative.

3. What type of validity (content validity, criterion-related validity or construct validity) suits medical education best? (For example, criterion-related validity fits well in social science research, which is based on abstract concepts, at least to begin with.)
- It depends. Medical education research also ranges from very simple to very complex, and all the concepts that apply to other research methods apply here too. So it is better to select the type of validity according to the kind of medical education research being conducted. I'm attaching a nice review of these by Dr. Yu and another on validating a questionnaire; there are many others from the peers.

4. Which is the best method of measuring reliability in medical education research?
- It depends on what is needed for the specific research method and tool. For instance, Cronbach's alpha is sufficient in most cases to measure internal consistency, but we (being statisticians) recommend the standard error of measurement (SEM) and, if possible, the g-coefficient, because they are highly sensitive and specific.

5. What are the units of measurement of validity and reliability? What are the cut-off values to accept that a tool (questionnaire or scale) is valid or reliable?
- Validity is abstract (some aspects can be measured statistically and are unitless; I don't know if we can call a correlation coefficient a unit!), so no units are assigned. But you should have at least 6 people validate your tool (so that you can have better reliability). The cut-off values of well-known measures like Cronbach's alpha range from 0.6 to 0.99 (or 1). For OSCEs, 0.6 is now accepted as reliable, whereas for high-stakes MCQs a Cronbach's alpha greater than 0.9 is a "must".

There are similar cut-offs for other types of research, available in books, articles and on the web. Unfortunately, most of them are subjective, but you may find them accepted even in "indexed" publications. One other measure we often use is consensus, which can be assessed via consensus analysis or the Fleiss/kappa coefficient; the kappa coefficient has two (or more) different theories, and an in-depth understanding of its theory is a must before proceeding to apply it and interpret the result.

For face validity, there are new terms like "language validity", where lay persons (literate non-content experts) are invited to judge whether the language is appropriate for the intended population. Content experts often forget that the tools they develop to measure an "ability", "trait", etc. may not actually measure what they intend if developed alone, so it is always better to seek the help of peers (content and non-content experts) to validate the tools. I have even found people designing questionnaires without knowing what they are actually measuring. Reliability, as much of the literature shared here notes, is a necessary but not sufficient condition for validity. But remember: an unreliable tool can never be valid!
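The kappa coefficient mentioned above for measuring consensus can be sketched for the two-rater case (Cohen's kappa; Fleiss' kappa generalises the idea to more raters). The judgements below are invented for illustration:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters.
    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    p_e = sum(ca[c] * cb[c] for c in ca.keys() | cb.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Invented validity judgements ('ok'/'revise') from two reviewers
# on eight questionnaire items.
a = ["ok", "ok", "revise", "ok", "revise", "ok", "ok", "revise"]
b = ["ok", "ok", "revise", "ok", "ok",     "ok", "ok", "revise"]
print(round(cohens_kappa(a, b), 2))  # -> 0.71
```

Raw agreement here is 7/8, but kappa discounts the agreement expected by chance, which is why it is preferred when reporting inter-rater consensus.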

ML WEB DISCUSSION-SUMMARY OF WEEK 3 (15-21 JUNE 2011) Submitted by PSG FAIMER Fellow 2011 Asma

TOPIC: PRETESTING AND PILOTING OF QUESTIONNAIRE

SPECIFIC LEARNING OBJECTIVES FOR WEEK 3
At the end of the third week of the ML-Web discussion, the participants should be able to:

Understand the concepts of pretesting/piloting and how best to use them.

The following questions were put up by PSG FAIMER FELLOW 2011 SUBISH for discussion:

1. What is meant by pilot testing?

2. Importance of pilot testing in research.

3. Can the pilot test data be included in the main study?

4. What are the common difficulties in pilot testing?

5. How to modify the proposed research based on pilot study finding?

6. Can a pilot study finding be published in biomedical journal?

7. How many samples must be included in a pilot study? How to calculate the sample size in such a case?

8. Is pilot testing usually done for the questionnaire alone or for the entire research process?

And any other comments are welcome.

The responses from the Fellows to these queries were as follows:-

1. What is meant by pilot testing?

PSG FAIMER FELLOW 2011 SHARADA: Pilot testing means running a small-scale test (trial run) of the questionnaire.

PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: The term 'pilot studies' refers to mini versions of a full-scale study (also called 'feasibility' studies), as well as the specific pre-testing of a particular research instrument such as a questionnaire or interview schedule.

PSG FAIMER FELLOW 2010 ANAND: It is to check whether the questionnaire designed is apt, feasible and measurable, and meets the hypothesis/aim of the study; just a preliminary analysis (specifically restricted to questionnaire design).

PSG FAIMER FELLOW 2010 SUDHA: A pilot, or feasibility study, is a small experiment designed to test logistics and gather information prior to a larger study, in order to improve the latter's quality and efficiency.

PSG FAIMER FELLOW 2010 SAM: Pilot testing is a study done to test the questionnaire in the field to assess feasibility issues.

2. Importance of pilot testing in research.

PSG FAIMER FELLOW 2011 SHARADA: Pilot testing enables us to find:
· Errors in the questionnaire: appropriateness of the questions to what is being tested, ambiguities, time taken, requirements for reformatting
· Reactions of the respondents: acceptability of the questions, whether the wording is understood, range of responses
· Whether the sampling procedure works: time required for administering, extent to which instructions are followed
· Whether data processing and analysis work as planned

PSG FAIMER FELLOW 2011G.KALAISELVAN: I think that pilot testing is needed only when the study is new and if you want to explore any new idea.

PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: Pilot studies are a crucial element of a good study design. Conducting a pilot study does not guarantee success in the main study, but it does increase the likelihood. Pilot studies fulfill a range of important functions and can provide valuable insights for other researchers. There is a need for more discussion amongst researchers of both the process and outcomes of pilot studies.


Reasons for conducting pilot studies:
· Developing and testing the adequacy of research instruments
· Assessing the feasibility of a (full-scale) study/survey
· Designing a research protocol
· Assessing whether the research protocol is realistic and workable
· Establishing whether the sampling frame and technique are effective
· Assessing the likely success of proposed recruitment approaches
· Identifying logistical problems which might occur using proposed methods
· Estimating variability in outcomes to help determine sample size
· Collecting preliminary data
· Determining what resources (finance, staff) are needed for a planned study
· Assessing the proposed data analysis techniques to uncover potential problems
· Developing a research question and research plan
· Training a researcher in as many elements of the research process as possible
· Convincing funding bodies that the research team is competent and knowledgeable
· Convincing funding bodies that the main study is feasible and worth funding
· Convincing other stakeholders that the main study is worth supporting

A pilot study helps to improve the internal validity of a questionnaire in the following ways:
· Administer the questionnaire to pilot subjects in exactly the same way as it will be administered in the main study
· Ask the subjects for feedback to identify ambiguities and difficult questions
· Record the time taken to complete the questionnaire and decide whether it is reasonable
· Discard all unnecessary, difficult or ambiguous questions
· Assess whether each question gives an adequate range of responses
· Establish that replies can be interpreted in terms of the information that is required
· Check that all questions are answered
· Re-word or re-scale any questions that are not answered as expected
· Shorten, revise and, if possible, pilot again

PSG FAIMER FELLOW 2010 ANAND: Importance of pilot testing in research:
a. Preliminary check
b. Helps in validating our questionnaire
c. Enables modifications wherever necessary
d. Informs us whether we are on the right track (related to the aims and objectives of the study)


e. Enables us to know about the feasibility of the project (questionnaire)

PSG FAIMER FELLOW 2010 SUDHA: A pilot study can reveal deficiencies in the design of a proposed experiment or procedure, and these can then be addressed before time and resources are expended on large-scale studies. A good research strategy requires careful planning, and a pilot study will often be part of this strategy. A pilot study may address a number of logistical issues:
• Check that the instructions given to investigators (e.g. randomization procedures) are comprehensible;
• Check that investigators and technicians are sufficiently skilled in the procedures;
• Check the correct operation of equipment;
• Check the reliability and validity of results;
• Detect a floor or ceiling effect (e.g. if a task is too difficult or too easy there will be skewed results);
• Assess whether the level of intervention is appropriate (e.g. the dose of a drug);
• Identify adverse effects (pain, suffering, distress or lasting harm) caused by the procedure, and the effectiveness of actions to reduce them (e.g. analgesia dose rate and schedule).

PSG FAIMER FELLOW 2010 SAM: A pilot study is done to look at the following issues:
a) Whether the subjects are able to understand the questions
b) Whether the investigators are familiar with the skill of applying the questionnaire
c) Whether any equipment proposed to be used is working well

3. Can the pilot test data be included in the main study?PSG FAIMER FELLOW 2011SHARADA: I think the pilot data cannot be added to the main data but would be proof for having piloted the questionnaire.PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: A more common problem is deciding whether to include pilot study participants or site(s) in the main study? Here the concern is that they have already been exposed to an intervention and, therefore, may respond differently from those who have not previously experienced it. This may be positive, for example the participants may become more adept at using a new tool or procedure.The concern about including participants from the pilot study in the main study arises because only those involved in the pilot, and not the whole group, will have had the experience. In some cases however it is simply not possible to exclude these pilot-study participants because to do so would result in too small a sample in the main study. This problem arises in particular where the samples are clusters, for example schools, prisons or hospitals. In such cases one can conduct a sensitivity analysis (or sub-group


analysis) to assess to what extent the process of piloting influences the size of the intervention effect. Social scientists engaged in predominantly quantitative research are likely to argue that "an essential feature of a pilot study is that the data are not used to test a hypothesis or included with data from the actual study when the results are reported" (Peat et al. 2002: 57).

PSG FAIMER FELLOW 2010 ANAND: Yes.

PSG FAIMER FELLOW 2010 SUDHA: If a pilot study does not lead to modification of materials or procedures, then the data might be suitable for incorporation into the main study. The sampling strategy used to select subjects, and the possibility of changes over time, should be carefully considered before incorporating pilot data. Even if the pilot data are not used in this way, and even if the final design differs markedly from the pilot, it is useful to include information on the pilot study in any publications or reports arising from the main experiment, as this can inform the design of future experiments.

PSG FAIMER FELLOW 2010 SAM: The pilot study is not designed to answer the research question; hence it cannot be used for the main study.

4. What are the common difficulties in pilot testing?

PSG FAIMER FELLOW 2011 SHARADA: The difficulties in piloting the questionnaire are identification of a representative population for testing, the time factor and probably the cost factor.

PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: Pilot studies are not free of problems or, shall we say, limitations. These include the possibility of making inaccurate predictions or assumptions on the basis of pilot data; problems arising from contamination; and problems related to funding. Completing a pilot study successfully is not a guarantee of the success of the full-scale survey. A further concern is that of contamination. This may arise in two ways:
1. Where data from the pilot study are included in the main results;
2. Where pilot participants are included in the main study, but new data are collected from these people.
Contamination is less of a concern in qualitative research, where researchers often use some or all of their pilot data as part of the main study. Qualitative data collection and analysis is often progressive, in that a second or subsequent interview in a series should be 'better' than the previous one, as the interviewer may have gained insights from previous interviews which are used to improve interview schedules and specific questions. Some have therefore argued that in qualitative approaches separate pilot studies are not necessary (e.g. Holloway 1997: 121).


However, Frankland and Bloor (1999: 154) argue that piloting provides the qualitative researcher with a "clear definition of the focus of the study", which in turn helps the researcher to concentrate data collection on a narrow spectrum of projected analytical topics. Piloting of qualitative approaches can also be carried out if "the researcher lacks confidence or is a novice, particularly when using the interview technique" (Holloway 1997: 121).

PSG FAIMER FELLOW 2010 ANAND:
a. Time factor
b. Expertise
c. Whom to include in the pilot testing
d. Sample size
e. Bias (sometimes)

PSG FAIMER FELLOW 2010 SUDHA: Many times the emphasis is on the sample size and not the feasibility. So there are no clear feasibility objectives, no clear analytical plans, and certainly no clear criteria for success of feasibility. It can be dangerous to use pilot studies to estimate treatment effects, as such estimates may be unrealistic or biased because of the limited sample sizes. Therefore, if not used cautiously, results of pilot studies can potentially mislead sample size or power calculations.

PSG FAIMER FELLOW 2010 SAM:
a) Problems of doing anything for the first time
b) Sample size
c) Time
d) Expertise

5. How to modify the proposed research based on pilot study findings?

PSG FAIMER FELLOW 2011 SHARADA: Modifications can be made by changing the instruments (modifying the questionnaire) and the procedures, based on the findings of the pilot testing.

PSG FAIMER FELLOW 2010 ANAND: One can look into the inputs from all members pertaining to the methodology, the measuring instrument, the questions framed, and the assessment.

PSG FAIMER FELLOW 2010 SUDHA: Based on the difficulties faced during the pilot phase, the tools can be modified. Usually a second pilot is done with the modified tools.

PSG FAIMER FELLOW 2010 SAM: The questionnaire needs to be modified based on the pilot study results. The modified questionnaire needs to be tested again to check whether our objective is being met.

6. Can a pilot study finding be published in a biomedical journal?

PSG FAIMER FELLOW 2011 SHARADA: A pilot study can be published; however, the study group should be large enough for the data to be relevant to the scientific community (others please comment on this!).


PSG FAIMER FACULTY RAVI: Yes, a pilot study can be published, but usually as a 'preliminary report' or a 'short communication', and the limitations need to be mentioned clearly (clarification needed from other members).

PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: Why are pilot studies not reported? Publication bias may occur because of a tendency for journals to accept only papers that have statistically significant results and not to report non-significant effects (Mahoney 1977; Chann 1982; Dickersin 1990). A recent study exploring research on passive smoking found a difference of two years in the median time to publication between findings from significant and non-significant studies (Misakian & Bero 1998). It follows that papers reporting methodological issues, such as those identified during the pilot phase of a study, will also be less attractive to publishers. Selective publication of research results has been recognised as a problem. It may lead to an overestimation of the effectiveness of interventions, exposing patients to useless or harmful treatments, while overestimation of adverse effects may mean that patients are denied effective forms of care (Oxman et al. 1994). This has to be overcome. It has been said that pilot studies are likely to be "underdiscussed, underused and underreported" (Prescott and Soeken 1989: 60). This is in contrast to what our colleague Subish always says: "Pilot testing is like tasting food while cooking." He also says, "If properly checked, five grains of rice can tell one whether five kg of rice is cooked or not." Full reports of pilot studies are rare in the research literature (Lindquist 1991; Muoio et al. 1995; van Teijlingen et al. 2001). When reported, they often only justify the research methods or the particular research tool used. Too often research papers refer to only one element of the pilot study, for example the 'pre-testing' or 'pilot testing' of a questionnaire (De Vaus 1993). Such papers simply state: "the questionnaire was tested for validity and reliability." Concluding: well-designed and well-conducted pilot studies can inform us about the best research process and occasionally about likely outcomes. Therefore investigators should be encouraged to report their pilot studies, and in particular to report in more detail the actual improvements made to the study design and the research process.

PSG FAIMER FELLOW 2010 ANAND: Yes.

PSG FAIMER FELLOW 2010 SUDHA: Yes, it can be published. This ensures that the feasibility issues are made known to the research community.

PSG FAIMER FELLOW 2010 SAM: Not sure.

7. How many samples must be included in a pilot study? How is the sample size calculated in such a case?

PSG FAIMER FELLOW 2011 SHARADA: I am not sure of the number needed for pilot testing (clarification needed!).

PSG FAIMER FACULTY RAVI: The number needed for pilot testing is crucial. Perhaps very few are needed for qualitative studies, and about 10% of the study sample in the case of quantitative studies.

PSG FAIMER FELLOW 2010 ANAND: That is the main challenge in pilot testing. It would be very beneficial to approach a statistician for


calculating the sample size, as there are many factors required for the sample size estimation.

PSG FAIMER FELLOW 2010 SUDHA: In general, sample size calculations may not be required for some pilot studies. However, the sample should be representative of the target population. It should also be based on the same inclusion/exclusion criteria as the main study. As a rule of thumb, a pilot study should be large enough to provide useful information about the aspects that are being assessed for feasibility.

PSG FAIMER FELLOW 2010 SAM: Not sure.

8. Is pilot testing usually done only for the questionnaire, or for the entire research process?

PSG FAIMER FELLOW 2011 SHARADA: Pilot testing can be done for the questionnaire and for the statistical methods that we will be applying to the outcomes from the questionnaire.

PSG FAIMER FELLOW 2010 ANAND: It can be used for both.

PSG FAIMER FELLOW 2010 SUDHA: It is usually done for the entire research process.

PSG FAIMER FELLOW 2010 SAM: Yes.

ADDITIONAL COMMENTS FROM FELLOWS/ FACULTY:

PSG FAIMER FACULTY AMOL: The question of 'piloting a questionnaire' becomes important when you plan to collect a large data set over a wide area, or 'Interrupted Time Series' data (measurements done at multiple points in time), or multiple-group non-experimental designs, using that particular questionnaire. After you work out a questionnaire (framing questions; deciding the number of questions, including skip and jump questions; layout; font size; formatting; paper quality), pre-testing is done before you order the final print, to check points such as: is there any problem of wording? Is there any repetition? How much time is required to complete each interview? Did you forget any important variable to include or exclude? For example, if you forget to record the gender of children in a survey on child malnutrition, it leads to serious frustration at the time of analysis and ultimately a waste of resources. [I have one bitter experience, where I forgot to include one variable and still remember its consequences. So, it's better to spend some extended time on piloting your questionnaire.] During pre-testing, you may fill up questionnaires till the 'saturation point', the point at which no new problems come up in the questionnaire administration. Some researchers say that this point is reached well before a sample of 30. [I found that 10 to 15 forms do it.]

PSG FAIMER FELLOW 2011 RAMS: Why/what/who is the "pilot" in pilot testing?


PSG FAIMER FACULTY SUPTEN: As per http://dictionary.reference.com/browse/pilot, the origin of 'pilot' is 1520–30; earlier pylotte < Middle French pillotte < Italian pilota, dissimilated variant of pedota < Medieval Greek *pēdṓtēs steersman, equivalent to pēd(á) rudder (plural of pēdón oar) + -ōtēs agent suffix. "Pilot" as a verb may denote: to steer; to lead, guide, or conduct, as through unknown places or intricate affairs. As an adjective it may denote: serving as an experimental or trial undertaking prior to full-scale operation or use, as in a pilot project. "A pilot experiment, also called a pilot study, is a small scale preliminary study conducted before the main research, in order to check the feasibility or to improve the design of the research. Pilot studies, therefore, may not be appropriate for case studies. They are frequently carried out before large-scale quantitative research, in an attempt to avoid time and money being wasted on an inadequately designed project. A pilot study is usually carried out on members of the relevant population, but not on those who will form part of the final sample. This is because it may influence the later behavior of research subjects if they have already been involved in the research." (from: http://en.wikipedia.org/wiki/Pilot_experiment)

PSG FAIMER FELLOW 2011 SUNITA: Very nice discussion. In the beginning, Subish posed very important questions which cover all the essence of piloting. The what, when, why and how of piloting the questionnaire is now much clearer. The resource attached may add to the discussion.

PSG FAIMER FELLOW 2011 DR SARATH GILLELLAMUDI: Pilot study procedures help to improve the internal validity of a questionnaire in the following ways:
· Administer the questionnaire to pilot subjects in exactly the same way as it will be administered in the main study
· Ask the subjects for feedback to identify ambiguities and difficult questions
· Record the time taken to complete the questionnaire and decide whether it is reasonable
· Discard all unnecessary, difficult or ambiguous questions
· Assess whether each question gives an adequate range of responses
· Establish that replies can be interpreted in terms of the information that is required
· Check that all questions are answered
· Re-word or re-scale any questions that are not answered as expected
· Shorten, revise and, if possible, pilot again.

PSG FAIMER FELLOW 2009 LATHA RAJENDRA KUMAR: I want to add that for determining the sample size we sometimes use G*Power. I recently came across this for a grant writing. Whether G*Power can be used for a pilot study or only for the actual study, I am not sure. Maybe Shital can answer this.


PSG FAIMER FACULTY SHITAL: Theory recommends pre-testing at least 10% of the calculated sample size for the study/project. However, many use a "rule of thumb", and the sample size for a pre-test can be as low as 10–15. There is no hard and fast/golden rule on the pre-test sample size. The only thing that matters most is whether the pre-test sample would give enough power to establish the reliability and validity of the tool that we are pre-testing. So, there is no need to use sophisticated sample size determination software for a pre-test. Hope this helps.

PSG FAIMER FACULTY ANIMESH: Thank you for answering, clarifying and, in the process, enlightening all of us. I wasn't aware of G*Power and thought it was some kind of power of a test or some statistical term. Now I know that a pilot study needs about 10% of the original sample, arbitrarily, though there are no mandatory guidelines. Till now I was following my intuition and doing so; good to have the right information now. For some who may still be wondering (like I was till a while ago), I would like to add that G*Power is a software program that helps with various statistical tests and computations, including sample size. The best part is that it is free and available online, unlike SPSS, STATA etc. Please check: http://www.psycho.uni-duesseldorf.de/aap/projects/gpower/

Excerpts from the website: G*Power 2 performs high-precision statistical power analyses for the most common statistical tests in behavioral research, that is, t-tests (independent samples, correlations, and any other t-test), F-tests (ANOVAs, multiple correlation and regression, and any other F-test), and Chi2-tests (goodness of fit and contingency tables). G*Power 2 computes power values for given sample sizes, effect sizes, and alpha levels (post hoc power analyses); sample sizes for given effect sizes, alpha levels, and power values (a priori power analyses); and alpha and beta values for given sample sizes, effect sizes, and beta/alpha ratios (compromise power analyses). The program may be used to display graphically the relation between any two of the relevant variables, and it offers the opportunity to compute effect size measures from basic parameters defining the alternative hypothesis. G*Power 2 is free. You may give the program to friends and colleagues who might find it useful. However, if you want to include G*Power on a shareware or freeware CD-ROM, or if you want to distribute it together with commercial software, you must ask the authors for permission. Further, the newer version G*Power 3 is now available.

PSG FAIMER FELLOW 2011 PIRYANI: I feel that the interpretations in the discussion have clarified the queries generated in our minds; please continue with the same vigor.
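As an illustration of the a priori power analysis that G*Power performs, the same kind of calculation can be sketched in Python with the free statsmodels library. The effect size, alpha and power values below are assumptions chosen for the example, not recommendations:

```python
# Sketch of an a priori power analysis (the kind G*Power does),
# here using statsmodels instead of G*Power itself.
from statsmodels.stats.power import TTestIndPower

# Assumed inputs: medium effect size (Cohen's d = 0.5),
# alpha = 0.05, desired power = 0.80, two-sample t-test.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(round(n_per_group))  # roughly 64 participants per group
```

With a larger assumed effect size, the required sample shrinks quickly, which is why pilot-based effect size estimates must be treated cautiously, as noted above.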


WEB DISCUSSION – SUMMARY OF WEEK 4 (22–28 JUNE 2011)
Submitted by PSG FAIMER Fellow 2011 Subish

TOPIC: PRINCIPLES OF DATA ANALYSIS OF QUESTIONNAIRE

SPECIFIC LEARNING OBJECTIVES FOR WEEK 4

At the end of the fourth week of the ML-Web discussion, the participants should be able to:

1. Understand the concept of analysis of data generated from Questionnaires.

2. Be able to use the various methods of data analysis for data from Questionnaires.


The discussion continued from 21st to 28th of June, 2011. Altogether there were 21 postings in the ML Web regarding the topic.

Basics of data analysis

Statistics is a set of methods used to collect, analyze, present, and interpret data. Unless all of these components fit together, the numbers remain abstract, without any heart and soul.

Who will help us out with data analysis?

Computers play a very important role in statistical data analysis. The statistical software package SPSS offers extensive data-handling capabilities and numerous statistical analysis routines that can analyze small to very large data sets. The computer will assist in the summarization of data, but statistical data analysis focuses on the interpretation of the output to make inferences and predictions.

Role of a good questionnaire

Here lies the importance of designing and executing a good questionnaire (Let us remember here the concept of validity and reliability).

Applying what you have learnt: Statistical inference refers to extending the knowledge obtained from a random sample to the whole population from which it was drawn. In mathematics this is known as inductive reasoning.

Types of data: Data can be either quantitative or qualitative. Qualitative data are labels or names used to identify an attribute of each element. Quantitative data are always numeric and indicate either how much or how many.


“Remember that in your CIP questionnaire you come across both quantitative as well as qualitative variables.”

How to code the data you have?

Data are often recorded manually on data sheets. Unless the numbers of observations and variables are small the data must be analyzed on a computer. The data will then go through three stages:

Coding: The data are transferred, if necessary, to coded sheets.

Typing:  The data are typed and stored by at least two independent data entry persons. For example, when the Current Population Survey and other monthly surveys were taken using paper questionnaires, the U.S. Census Bureau used double key data entry.

Editing:  The data are checked by comparing the two independent typed data. The standard practice for key-entering data from paper questionnaires is to key in all the data twice. Ideally, the second time should be done by a different key entry operator whose job specifically includes verifying mismatches between the original and second entries. It is believed that this "double-key/verification" method produces a 99.8% accuracy rate for total keystrokes.
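The double-key/verification workflow described above can be sketched in a few lines of Python. The records and field names are invented for illustration:

```python
# Sketch of double-key data entry verification: the same paper questionnaires
# are keyed twice, and mismatching cells are flagged for a verifier to resolve.
def find_mismatches(entry1, entry2):
    """Compare two independently keyed datasets (lists of records,
    each record a dict of field -> value) and return the discrepancies
    as (row, field, first_value, second_value) tuples."""
    mismatches = []
    for row, (rec1, rec2) in enumerate(zip(entry1, entry2)):
        for field in rec1:
            if rec1[field] != rec2.get(field):
                mismatches.append((row, field, rec1[field], rec2.get(field)))
    return mismatches

keyer_a = [{"id": 1, "age": 34}, {"id": 2, "age": 41}]
keyer_b = [{"id": 1, "age": 34}, {"id": 2, "age": 14}]  # typo: 41 keyed as 14
print(find_mismatches(keyer_a, keyer_b))  # [(1, 'age', 41, 14)]
```

In practice the flagged cells are then checked against the original paper forms, which is what pushes keystroke accuracy toward the 99.8% figure quoted above.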

Type of data and levels of measurement

Information can be collected in statistics using qualitative or quantitative data.

Qualitative data, such as the eye color of a group of individuals, are not computable by arithmetic relations. They are labels that advise in which category or class an individual, object, or process falls. They are called categorical variables.


Quantitative data sets consist of measures that take numerical values for which descriptions such as means and standard deviations are meaningful. They can be put into an order and further divided into two groups: discrete data or continuous data. Discrete data are countable data, for example the number of days of hospitalization. Continuous data arise when the parameters (variables) are measurable and are expressed on a continuous scale, for example the height of a person.

The first activity in statistics is to measure or count. Measurement/counting theory is concerned with the connection between data and reality. A set of data is a representation (i.e., a model) of the reality based on numerical and measurable scales. Data are called "primary type" data if the analyst has been involved in collecting the data relevant to his/her investigation. Otherwise, it is called "secondary type" data.

Data come in the forms of Nominal, Ordinal, Interval and Ratio. Data can be either continuous or discrete.

 Since statisticians live for precision, they prefer Interval/Ratio levels of measurement.

Analyzing the data

Statistical data analysis divides the methods for analyzing data into two categories: exploratory methods and confirmatory methods. Exploratory methods are used to discover what the data seem to be saying, using simple arithmetic and easy-to-draw pictures to summarize the data. Confirmatory methods use ideas from probability theory in the attempt to answer specific questions. Probability is important in decision making because it provides a mechanism for measuring, expressing, and analyzing the uncertainties associated with future events.

Reporting the results


Through inference, estimates of, and tests of claims about, the characteristics of a population can be obtained from a sample. The results may be reported in the form of a table, a graph or a set of percentages. As only a small collection (the sample) has been examined and not the entire population, the reported results must reflect the uncertainty through the use of probability statements and intervals of values.


Summary of the four levels of measurement: appropriate descriptive statistics and graphs (James Neill, 2009)

Nominal / Categorical
Properties: discrete; arbitrary (no order)
Examples: dichotomous (Yes/No, gender); types/categories (colour, shape)
Descriptive statistics: frequencies, percentage, mode
Graphs: bar, pie

Ordinal / Rank
Properties: ordered categories; ranks
Examples: ranking favorites; academic grades
Descriptive statistics: frequencies, mode, median, percentiles
Graphs: bar, pie, stem & leaf

Interval
Properties: equal distances between values; discrete (e.g., Likert scale) or metric (e.g., deg. F); interval scales with >5 points can usually be treated as ratio
Examples: discrete – thoughts, behaviours, feelings, etc. on a Likert scale; metric – deg. C or F
Descriptive statistics: frequencies and mode (if discrete), median, mean, SD, skewness, kurtosis
Graphs: bar and pie (if discrete), stem & leaf, box plot, histogram (if metric)

Ratio
Properties: continuous/metric; a meaningful zero allows ratio statements (e.g., A is twice as large as B)
Examples: age, weight, VO2 max, deg. Kelvin
Descriptive statistics: mean, SD, skewness, kurtosis
Graphs: histogram, box plot, stem & leaf (may need to round leaves)
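As a small worked example of this advice for ordinal Likert data, Python's standard library can produce the recommended descriptive statistics. The responses below are invented:

```python
# Descriptive statistics for a discrete 5-point Likert item.
# With Likert-scale data the mode (the most frequent response) is the
# recommended measure; the median is also defensible for ordinal data.
from collections import Counter
from statistics import mode, median

# Invented ratings: 1 = strongly disagree ... 5 = strongly agree
responses = [4, 5, 4, 3, 4, 2, 5, 4, 3, 4]

print(Counter(responses))  # frequencies, suitable for a bar chart
print(mode(responses))     # 4
print(median(responses))   # 4.0
```

The mean would also print, but for a 5-point item it can mislead, since the distances between categories are not guaranteed to be equal.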


Usefulness of MS Excel in data analysis

In general, statisticians do not recommend Excel for statistical analysis; even though it has many statistical routines, these have been found problematic in various tests. Please read this on the fallacies of Excel as data analysis software: http://people.umass.edu/evagold/excel.html Nonetheless, MS Excel is an excellent tool for data entry, and an Excel form is very easy to use. For a comparison of many data analysis packages including Excel, please follow this link: http://brenocon.com/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/

A Brief description of Epi-Info

Epi-Info is "free" whereas SPSS is "commercial", and this is the biggest difference for the scholarly community in this part of the world, where we (institutes and individuals) don't spend much on genuine software (as far as I know, even our laptops and desktops are/were installed with counterfeit operating systems and software!). Epi-Info is comprehensive for data entry (creating data entry forms, entering data, customized data entry) and basic statistical analysis, whereas SPSS is intended for powerful statistical data analysis only. For instance, you cannot calculate Cronbach's alpha in Epi-Info directly, as there is no button available for it. Epi-Info was created mainly (but not only) to be used in field epidemiological intervention programs, whereas SPSS was created for all-round statistical routines, although the Base SPSS module needs add-on modules for special statistical routines. The data file of the current Windows-based Epi-Info is in MS Access format, whereas SPSS has its own data structure known as SAV files. Epi-Info can be installed directly on Windows only, whereas SPSS is available for Windows, Mac and Unix. MS Access data is very heavy (lots of space) whereas SAV is very light (less space). Data manipulation in Epi-Info is cumbersome compared to SPSS, as you need to know Structured Query Language (SQL), which is the backbone of any database like MS Access. Epi-Info's programming language (check codes) is as powerful as SPSS's, but it takes a lot of time to learn, as there are few resources available, whereas SPSS programming (syntax, macros, scripts) is considered a simple, easy yet comprehensive programming language. Epi-Info has many additional tools which are very handy, like Nutristat (for anthropometric data analysis) and StatCalc (for analyzing contingency tables and calculating sample size). The add-on modules for SPSS must be bought separately; however, if you download the evaluation version of SPSS, it includes the Base module as well as the add-on modules. SPSS sells separate software for sample size and text analysis. SPSS has been bought by IBM, and IBM's marketing research tools are now included with it (at more cost, of course!), whereas Epi-Info's source code (the core program files used to build the software) is now open source (anyone can manipulate the code, use imagination and create something new), and a community of developers is working to make it better (however, all of these are still in alpha, i.e. for programmers and developers, and we are still awaiting a beta, i.e. something stable enough for users to run and try). Please see http://epiinfo.codeplex.com/ for more.

For a simple SPSS tutorial: http://glimo.vub.ac.be/downloads/eng_spss_basic.pdf
For a simple Epi-Info tutorial: http://www.glitc.org/epicenter/publications/tools/Epi_Info_Beginners_Manual.pdf

52

Page 53: psgfri2010.wikispaces.compsgfri2010.wikispaces.com/file/view/QUESTIONNAIRE... · Web viewWith Likert scale data, the best measure to use is the mode, or the most frequent response

A group of programmers and statisticians is working to develop an open-source "SPSS clone", which is free and known as PSPP. It has fewer features, but we can give it a try, why not! See: http://www.gnu.org/software/pspp/ Statisticians around the world who advocate FOSS (free and open-source software) now use the open-source, free and comprehensive software R, which has thousands of statistical, epidemiological, econometric etc. routines. But R has a command-line interface (like MS-DOS) and is difficult to learn in the beginning. See http://www.r-project.org/ for more. It is often heard from beginners that when they use SPSS, the result tables contain a lot of information; this makes beginners and number-phobic people more phobic, whereas Epi-Info gives much simpler tables.

GraphPad InStat is another simple option; one could download it from http://mac.softpedia.com/get/Math-Scientific/GraphPad-InStat.shtml

Statistical analysis using SPSS

An essential prerequisite before trying out SPSS is choosing the correct statistical test; a table giving an overview of when each test is appropriate is useful here. In deciding which test to use, it is important to consider the types of variables that you have (i.e., whether your variables are categorical, ordinal or interval, and whether they are normally distributed).

One sample t-test

A one-sample t-test allows us to test whether a sample mean (of a normally distributed interval variable) significantly differs from a hypothesized value. For example, in your CIP on PBL, one may want to test whether the mean score of the students' performance after the introduction of PBL in one's institution is comparable to that of other medical schools where this technique has been introduced. Applying the t-test gives values such as the mean score of the students, the SD (standard deviation) and the SE (standard error). One can also find out whether the scores are normally distributed; if they are not, or are skewed, this may be due to problems with sampling, data collection, etc.
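A minimal sketch with SciPy, assuming invented post-PBL scores and a hypothesized mean of 60:

```python
# One-sample t-test: do post-PBL scores differ from a hypothesized mean of 60?
from scipy import stats

scores = [62, 58, 71, 66, 59, 64, 68, 61, 65, 70]  # invented data
t_stat, p_value = stats.ttest_1samp(scores, popmean=60)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

Here the sample mean (64.4) sits above 60, so the t statistic is positive; the p-value tells us whether the difference is larger than chance alone would produce.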

Binomial test

A one sample binomial test allows us to test whether the proportion of successes on a two-level categorical dependent variable significantly differs from a hypothesized value.
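A sketch with SciPy's `scipy.stats.binomtest` (older SciPy versions expose this as `binom_test`), using invented counts:

```python
# One-sample binomial test: does an observed pass rate of 35/50
# differ from a hypothesized pass rate of 50%? (Counts invented.)
from scipy.stats import binomtest

result = binomtest(k=35, n=50, p=0.5)
print(f"observed proportion = {35 / 50}, p = {result.pvalue:.4f}")
```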

Chi-square goodness of fit

A chi-square goodness of fit test allows us to test whether the observed proportions for a categorical variable differ from hypothesized proportions.  For example, let's suppose that we believe that the general population consists of 10% Hispanic, 10% Asian, 10% African American and 70% White folks.  We want to test whether the observed proportions from our sample differ significantly from these hypothesized proportions. 
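The example above can be sketched with SciPy's goodness-of-fit routine; the observed counts are invented, while the expected proportions are the ones from the text:

```python
# Chi-square goodness of fit against the hypothesized population mix:
# 10% Hispanic, 10% Asian, 10% African American, 70% White.
from scipy.stats import chisquare

observed = [20, 15, 18, 147]  # invented sample counts, n = 200
n = sum(observed)
expected = [0.10 * n, 0.10 * n, 0.10 * n, 0.70 * n]  # [20, 20, 20, 140]
chi2, p = chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 1.80
```

For these invented counts the statistic is small and the p-value large, so the sample is consistent with the hypothesized proportions.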

Two independent samples t-test

An independent samples t-test is used when you want to compare the means of a normally distributed interval dependent variable for two independent groups. This test will help you understand whether there is any significant difference in the PBL scores between males and females in your class.
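A minimal SciPy sketch, with invented scores for the two groups:

```python
# Two independent samples t-test: PBL scores of male vs female students.
from scipy import stats

males = [60, 65, 58, 72, 66, 61]    # invented data
females = [68, 70, 63, 74, 69, 71]  # invented data
t_stat, p_value = stats.ttest_ind(males, females)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```

The sign of t simply reflects which group's mean is larger (here the female mean is higher, so t is negative); the p-value carries the significance.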

Wilcoxon-Mann-Whitney test

The Wilcoxon-Mann-Whitney test is a non-parametric analog of the independent samples t-test and can be used when you do not assume that the dependent variable is a normally distributed interval variable (you only assume that the variable is at least ordinal).
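A sketch using scipy's `mannwhitneyu` on invented ordinal (1 to 5) ratings from two groups:

```python
from scipy import stats

# Hypothetical ordinal ratings (1-5) from two independent groups
group_a = [3, 4, 2, 5, 4, 3, 5]
group_b = [2, 3, 1, 2, 3, 2, 4]

u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.3f}")
```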

Chi-square test

A chi-square test is used when you want to see if there is a relationship between two categorical variables. Suppose that during data analysis you categorize the scores into two or three categories, say students with low or high scores, and want to test whether there is any association between the type of score and gender. That is where the chi-square test comes in.
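The score-category-by-gender example can be sketched with scipy's `chi2_contingency`; the cell counts below are hypothetical:

```python
from scipy import stats

# Rows: male, female; columns: low score, high score (hypothetical counts)
table = [[12, 8],
         [6, 14]]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, dof = {dof}")
```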

Fisher's exact test

The Fisher's exact test is used when you want to conduct a chi-square test but one or more of your cells have an expected frequency of five or less. This may happen if your class has far fewer females than males.
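A sketch with scipy's `fisher_exact`, using small, invented cell counts of the kind that would make a plain chi-square test unreliable:

```python
from scipy import stats

# Rows: male, female; columns: low score, high score (hypothetical small counts)
table = [[8, 2],
         [3, 7]]

odds_ratio, p_value = stats.fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
```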

One Way Anova

A one-way analysis of variance (ANOVA) is used when you have a categorical independent variable (with two or more categories) and a normally distributed interval dependent variable, and you wish to test for differences in the means of the dependent variable broken down by the levels of the independent variable. For example, use it to find out whether the difference in mean PBL scores across three subjects, say Medicine, Surgery and Gynecology, is statistically significant.
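The three-subject example can be sketched with scipy's `f_oneway`; the scores below are hypothetical:

```python
from scipy import stats

medicine = [70, 68, 75, 72, 66]
surgery = [64, 60, 67, 63, 65]
gynecology = [71, 69, 73, 70, 74]

f_stat, p_value = stats.f_oneway(medicine, surgery, gynecology)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```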

The Kruskal Wallis test

It is used when you have one independent variable with two or more levels and an ordinal dependent variable. In other words, it is the non-parametric version of ANOVA, and a generalized form of the Mann-Whitney test, since it permits two or more groups.
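A sketch with scipy's `kruskal` on invented ordinal ratings from three groups:

```python
from scipy import stats

# Hypothetical ordinal ratings (1-5) from three groups
h_stat, p_value = stats.kruskal([4, 5, 3, 4], [2, 3, 2, 1], [3, 4, 5, 4])
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
```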


Paired t-test

A paired (samples) t-test is used when you have two related observations (i.e., two observations per subject) and you want to see if the means of these two normally distributed interval variables differ from one another. For example, to test whether students' exam scores have improved significantly after introducing PBL, compare the written (pre-PBL) test scores with the post-PBL scores.
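A sketch of the pre/post comparison with scipy's `ttest_rel`; both score lists are hypothetical, with one pre and one post score per student:

```python
from scipy import stats

pre_pbl = [55, 60, 48, 62, 57, 53, 66, 59]
post_pbl = [61, 66, 52, 69, 60, 58, 72, 63]

t_stat, p_value = stats.ttest_rel(pre_pbl, post_pbl)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```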

McNemar test

You would perform McNemar's test if you were interested in the marginal frequencies of two binary outcomes. These binary outcomes may be the same outcome variable on matched pairs (like a case-control study) or two outcome variables from a single group. 
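SciPy has no dedicated McNemar function, but the exact version of the test reduces to a binomial test on the discordant pairs (under the null, they split 50/50). A sketch with invented paired pass/fail counts:

```python
from scipy import stats

# Hypothetical paired outcomes (pass/fail before vs after PBL):
#                  after-pass   after-fail
# before-pass          30            5      -> b = 5
# before-fail          15           10      -> c = 15
b, c = 5, 15  # discordant pairs

# Exact McNemar test: two-sided binomial test on the discordant split
p_value = stats.binomtest(b, b + c, 0.5).pvalue
print(f"p = {p_value:.3f}")
```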

Logistic regression

It helps you to find out the independent predictors of the type of score in PBL. We know that performance in PBL depends on various factors such as the student's IQ, sex, subject, teacher, etc. Logistic regression tells us which of these multiple predictors is the most important after excluding the effect of the others.
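In practice one would use a statistics package for this, but the idea can be sketched in plain numpy: fit a logistic model by gradient descent on invented data, then compare the size of the standardized coefficients. Everything below (the predictors, the data, the learning rate) is hypothetical:

```python
import numpy as np

# Hypothetical predictors: hours of self-study and a prior-knowledge score;
# outcome: high (1) vs low (0) PBL score
X = np.array([[1.0, 40], [2.0, 55], [3.0, 60], [4.0, 70],
              [5.0, 72], [6.0, 80], [2.5, 50], [5.5, 78]])
y = np.array([0, 0, 0, 1, 1, 1, 0, 1])

# Standardize the predictors and add an intercept column
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
Xs = np.hstack([np.ones((len(Xs), 1)), Xs])

w = np.zeros(Xs.shape[1])
for _ in range(5000):                   # plain batch gradient descent
    prob = 1 / (1 + np.exp(-Xs @ w))    # predicted probabilities
    w -= 0.1 * Xs.T @ (prob - y) / len(y)

# On standardized inputs, a larger |coefficient| indicates a stronger predictor
print(dict(zip(["intercept", "study_hours", "prior_score"], w.round(2))))
```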

Correlation

Correlation is useful when you want to see the relationship between two (or more) normally distributed interval variables.
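A sketch with scipy's `pearsonr`, using invented written-exam and PBL scores for the same students:

```python
from scipy import stats

written = [58, 62, 55, 70, 66, 73, 60, 68]
pbl = [54, 60, 50, 68, 63, 71, 57, 66]

r, p_value = stats.pearsonr(written, pbl)
print(f"r = {r:.2f}, p = {p_value:.4f}")
```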

Linear regression

Simple linear regression allows us to look at the linear relationship between one normally distributed interval predictor and one normally distributed interval outcome variable. For example, we may wish to look at the relationship between written exam scores and PBL scores; in other words, predicting the written score from the PBL score.
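The prediction example can be sketched with scipy's `linregress`; both score lists are hypothetical:

```python
from scipy import stats

pbl_scores = [54, 60, 50, 68, 63, 71, 57, 66]        # predictor
written_scores = [58, 62, 55, 70, 66, 73, 60, 68]    # outcome

res = stats.linregress(pbl_scores, written_scores)
predicted = res.intercept + res.slope * 65           # predicted written score at PBL = 65
print(f"slope = {res.slope:.2f}, r = {res.rvalue:.2f}, prediction = {predicted:.1f}")
```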


REVIEW OF LITERATURE

When a questionnaire is handed to a respondent, an elaborate and

subtle process is started which is intended to end in the transmission of useful and accurate information from the respondent to the inquirer. Consider what this process involves. A question or series of questions have to be posed in a clear, comprehensible, and appropriate manner so that the respondent can formulate, articulate, and transmit the answers effectively. These answers must be recorded, coded, and analyzed without bias, errors, or misrepresentation of the respondents' views. A well designed questionnaire ensures the smooth unfolding of this chain of events from start to finish.

PRINCIPLES OF QUESTIONNAIRE DESIGN

Ensure that your first question is relevant to everyone, easy, and interesting.

Avoid vague response quantifiers when precise quantifiers can be used.

QUALITIES OF A GOOD QUESTION
1. Appropriate
2. Intelligible
3. Unambiguous
4. Unbiased
5. Omnicompetent
6. Appropriately coded
7. Piloted
8. Ethical

PRELIMINARY DECISIONS IN QUESTIONNAIRE DESIGN

There are NINE STEPS involved in the development of a questionnaire:


1. Decide the information required.
2. Define the target respondents.
3. Choose the method(s) of reaching your target respondents.
4. Decide on question content.
5. Develop the question wording.
6. Put questions into a meaningful order and format.
7. Check the length of the questionnaire.
8. Pre-test the questionnaire.
9. Develop the final survey form.

PRELIMINARY WORK IN QUESTIONNAIRE DEVELOPMENT

The key steps in designing a questionnaire include:

1. Decide what data you need
2. Select items
3. Design the individual questions
4. Compose the wording
5. Design the layout and presentation
6. Think about coding
7. Prepare the first draft
8. Pretest
9. Pilot
10. Evaluate the form
11. Perform the survey

Despite the apparently complicated nature of the task, theoretical knowledge is no substitute for practical experience.

DECIDING ON THE DATA REQUIRED

It should be noted that one does not start by writing questions. The first step is to decide 'what are the things one needs to know from the respondent in order to meet the survey's objectives?' These appear in the research brief and the research proposal.

Though one may already have an idea about the kind of information to be collected, additional help can be obtained from secondary data, previous rapid rural appraisals and exploratory research. In respect of secondary data, the researcher should be aware of what work has been done on the same or similar problems in the past, what factors have not yet been examined, and how the present survey questionnaire can build on what has already been discovered. Further, a small number of preliminary informal interviews with target respondents will give a glimpse of reality that may help clarify ideas about what information is required.

To maximize the response rate

At the beginning, state the purpose of the study and how the questionnaire will help.
Explain how the answers are important to the study and why the questions need to be answered carefully.
Assert that responses will be treated with confidentiality and anonymity.
Give the questionnaire a short and meaningful title, as per the project to which it is assigned.
Keep the questionnaire as short and succinct as possible.
Be creative and make it attractive.
Outline what the purpose of the survey is and why the response is important.
Provide clear instructions as to how each question should be answered, e.g., whether you are expecting one or more answers, or whether answers should be ranked (and if so, is 1 high or low?).
Make it convenient: enclose a stamped addressed envelope if appropriate.
Offer an incentive for responding if appropriate.
State how to return the questionnaire and by what date.

Planning was done during the onsite session and through e-mail and telephone among moderators, mentors and faculty. The plan of action was posted on the listserv for feedback from all the fellow participants.

Types Of Questions That May Be Used In A Questionnaire

1. Open Format Questions- Open format questions are those questions that give your respondents an opportunity to express their opinions. In these types of questions, there is no predetermined set of responses and the person is free to answer however he/she chooses. By including open format questions in your questionnaire, you can get true, insightful and even unexpected suggestions. Qualitative questions fall under the category of open format questions. An ideal questionnaire would include an open format question at the end that asks the respondent for suggestions for changes or improvements.

2. Closed Format Questions

Closed format questions are questions that include multiple choice answers. Multiple choice questions fall under the category of closed format questions. These multiple choices could either be in even numbers or in odd numbers. By including closed format questions in your questionnaire design, you can


easily calculate statistical data and percentages. Preliminary analysis can also be performed with ease. Closed format questions can be asked to different groups at different intervals. This can enable you to efficiently track opinion over time.

3. Leading Questions

Leading questions are questions that push your audience toward a particular type of answer. In a leading question, the answer options are not balanced. An example of a leading question would be one with choices such as fair, good, great, poor, superb and excellent, where the positive options far outnumber the negative ones. By asking a question and then giving answers such as these, you bias the opinion you obtain from your audience.

4. Importance Questions

In importance questions, the respondents are usually asked to rate the importance of a particular issue on a rating scale of 1 to 5. These questions can help you grasp which issues are important to your respondents.

5. Likert Questions

Likert questions can help you ascertain how strongly your respondent agrees with a particular statement. Likert questions can also help you assess how the students feel towards a certain issue.
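When it comes to analysis, Likert responses are ordinal, so the usual summary is the mode (the most frequent response) rather than the mean. A sketch in plain Python with invented responses:

```python
from collections import Counter

# Hypothetical Likert responses (1 = strongly disagree ... 5 = strongly agree)
responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]

mode, count = Counter(responses).most_common(1)[0]
print(f"Modal response: {mode} (chosen by {count} of {len(responses)} respondents)")
```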

6. Dichotomous Questions

Dichotomous questions are simple questions that ask respondents to answer just yes or no. One major drawback of a dichotomous question is that it cannot capture any answer between yes and no.

7. Bipolar Questions

Bipolar questions are questions that have two extreme answers. The respondent is asked to mark his/her responses between the two opposite ends of the scale.

8. Rating Scale Questions

In rating scale questions, the respondent is asked to rate a particular issue on a scale that ranges from poor to good. Rating scale questions usually have an even number of choices, so that respondents are not given the choice of a middle option.

Questions to be avoided in a questionnaire

1. Embarrassing Questions

Embarrassing questions are questions that ask respondents details about personal and private matters. Embarrassing questions are mostly avoided because you would lose the trust of your respondents and may force them to not respond.

2. Positive/ Negative Connotation Questions

Ideal questions should have neutral or subtle overtones. While defining a question, strong negative or positive overtones must be avoided. Depending on the positive or negative connotation of your question, different data is obtained.

3. Hypothetical Questions

Hypothetical questions are questions that are based on speculation and fantasy. An example of a hypothetical question would be “If you were the HOD of a Dept what would be the changes that you would bring?” Though questions such as these force the respondent to give his or her ideas on a particular subject, these kinds of questions do not give consistent or clear data. Hypothetical questions are mostly avoided in questionnaires.

Methods and Guidelines to Avoid Common Questionnaire Bloopers

Over the years, I’ve often heard colleagues say "let’s throw a questionnaire together and find out what our users think about our product". Implicit in this statement is the assumption that questionnaires are easy to design, administer, and analyze. This assumption is far from the truth.

The design of questionnaires involves social processes (collaboration among stakeholders), persuasive processes (getting respondents to answer the questions), business processes (do I have the questions and response categories that will yield data to help us answer our business questions), cognitive processes (understanding about how memory and context affect respondents answers), and analytic processes (how do I analyze and present the data). Throwing a questionnaire together is at best a waste of time and at worst, a source of flawed data that could affect your company’s reputation and revenue. In this short article, I will present a set of methods and principles that will help you avoid the most common questionnaire bloopers.


Apply the basic rules of user-centered design to the design of questionnaires

Design a questionnaire in much the same way that you would design a product (except that your design cycle might be in days rather than weeks or months). Start by gathering requirements from your stakeholders. What do your stakeholders want/need to know about users and the product? Follow the requirements gathering with a clear definition of goals and an explicit statement of how to build trust and provide respondents with benefits that outweigh the costs of filling out the questionnaire. Conduct prototype reviews and iterative testing. Ensure that you have a data analysis plan so that you understand what to do when all the data pour in.

Gather Requirements and Questions from Stakeholders

Interview key stakeholders about what they know and don’t know about users and how they use a product. Stakeholders include actual users, sales, marketing, development, documentation, training, senior management, and technical support.

Brainstorm with the product team about what they want to learn from a survey. If you can do this as part of a regular product team meeting, you can get many of the stakeholders in the room at the same time.

Distribute 3x5 cards at meetings or individually and ask stakeholders from different groups to write 1-3 questions that they would like to ask users. Avoid asking for too much here. This technique can be useful for getting insight into what issues are most important for different groups.

Conduct a short brainwriting session. Brainwriting is an extension of brainstorming where each person writes a question on a card and then passes it on to the next person who then reads the previous question and passes the card on to the next person who sees the first two questions and adds a third question. The premise is that seeing the questions of others will prompt additional relevant questions. This can be done in about fifteen minutes at team meetings and yield a large selection of questions.

Conduct a focus group to find out what issues are important to key user groups. Although focus groups are often derided by usability experts as weak sources of design information, they can be an excellent source of requirements for questionnaire design. The more open-ended nature of a focus group can provide input for more structured online or paper questionnaires.

Be Explicit About the Goals of Your Questionnaire

Too often, the goals of a questionnaire and of each question on it are not clear. Is the purpose of the questionnaire to understand customer loyalty issues, gather information before or after a usability test, gather requirements, understand why the product is not selling as expected, or understand how people use the product? I would recommend that each question on a questionnaire relate to a specific business goal and user experience issue.

Consider How to Establish Trust, Increase Rewards, and Reduce Social Costs for Respondents

You can design your questionnaire to create trust among respondents and influence the respondent’s expectations about the benefits and costs associated with filling out the questionnaire. Don Dillman’s classic book, Mail and Internet Surveys: The Tailored Design Method, Second Edition (2000, p. 27), notes that you can increase trust in the questionnaire by:

Providing tokens of appreciation in advance (though be careful not to make the tokens too large since this may be a source of bias)

Indicating clearly that the sponsor is legitimate and can do something with the results

Making the questionnaire appear important
Indicating how the data will be used

Dillman’s suggestions for increasing rewards to respondents include:

Design an interesting questionnaire
Use positive language that makes the respondent feel like a collaborator
Provide tangible rewards
Thank the user for helping
Ask people for advice

Suggestions for reducing the costs of completing a questionnaire include

Make the questionnaire usable
Avoid embarrassing questions (don’t ask "how old are you?")
Minimize the need for personal information
Make every question relevant and avoid lengthy questionnaires
Allow users to change answers easily in online surveys

Create Prototypes of the Questionnaire and Review Against Principles of Survey Design

Design a prototype questionnaire, including the cover page, and compare it with the principles of questionnaire design. These principles should cover language, relevance, page layout, response categories, and ordering of the questions. I recommend that the questionnaire designer ask four people to review the questionnaire, and that you interview a few people not closely associated with the project as they read the questionnaire and think aloud about their reactions to it.

Devise a Data Analysis Plan

A common error in designing and implementing a questionnaire is failing to devise a data analysis plan that spells out how answers will be coded (for example, how you will code non-responses, unusual responses, or ratings where people circle two numbers when you only want a single answer), what analyses you will do on single questions and sets of questions, and what hypotheses you have and which questions will be used to test those hypotheses. You should do this even if you have survey software that does an automatic analysis of the data. You might find that your automated software doesn’t allow some of the analyses that you need to answer the questions that are important to your stakeholders.

Conduct Limited Testing of the Questionnaire With Actual Users

Get a small sample of users (or people as close to the expected users as possible) and have them fill out the questionnaire under realistic conditions and give you feedback. Make your final changes based on this input and do a final edit.

Qualities of a Good Question

1. Evokes the truth. Questions must be non-threatening. Anonymous questionnaires that contain no identifying information are more likely to produce honest responses than those identifying the respondent. If your questionnaire does contain sensitive items, be sure to clearly state your policy on confidentiality.

2. Can accommodate all possible answers. Multiple choice items are the most popular type of survey questions because they are generally the easiest for a respondent to answer and the easiest to analyze. Asking a question that does not accommodate all possible responses can confuse and frustrate the respondent.

3. Has mutually exclusive options. A good question leaves no ambiguity in the mind of the respondent. There should be only one correct or appropriate choice for the respondent to make.

4. Produces variability of responses. When a question produces no variability in responses, we are left with considerable uncertainty about why we asked the question and what we learned from the information. If a question does not produce variability in responses, it will not be possible to perform any statistical analyses on the item. Design your questions so they are sensitive to differences between respondents.

5. Follows comfortably from the previous question. Grouping questions that are similar will make the questionnaire easier to complete, and the respondent will feel more comfortable. Questionnaires that jump from one unrelated topic to another feel disjointed and are not likely to produce high response rates.

6. Does not presuppose a certain state of affairs. Among the most subtle mistakes in questionnaire design are questions that make an unwarranted assumption. If there is any possibility that the respondent may not know the answer to your question, include a "don't know" response category.

7. Does not imply a desired answer. The wording of a question is extremely important. We are striving for objectivity in our surveys and, therefore, must be careful not to lead the respondent into giving the answer we would like to receive. Leading questions are usually easily spotted because they use negative phraseology.

8. Does not use emotionally loaded or vaguely defined words. This is one of the areas overlooked by both beginners and experienced researchers. Quantifying adjectives (e.g., most, least, majority) are frequently used in questions. It is important to understand that these adjectives mean different things to different people.

9. Does not use unfamiliar words or abbreviations. Remember who your audience is and write your questionnaire for them. Do not use uncommon words or compound sentences. Write short sentences. Abbreviations are okay if you are absolutely certain that every single respondent will understand their meanings. If there is any doubt at all, do not use the abbreviation.

10. Is not dependent on responses to previous questions. Branching in written questionnaires should be avoided. While branching can be used as an effective probing technique in telephone and face-to-face interviews, it should not be used in written questionnaires because it sometimes confuses respondents. An example of branching is: "Have you ever answered a questionnaire? (Yes or No) If no, go to question 3."

11. Does not ask the respondent to order or rank a series of more than five items. Questions asking respondents to rank items by importance should be avoided. Ranking becomes increasingly difficult as the number of items increases, and the answers become less reliable. Limiting the number of items to five will make it easier for the respondent to answer.

DATA ANALYSIS

THE PROBLEMS USUALLY FACED
1. How many questions to formulate? (ideal number)
2. What type of questions? (close-ended, open-ended, both)
3. Formulating an understandable question
4. Length of the question
5. Close-ended questions can be analyzed easily, but are difficult to design
6. How to interpret the open-ended questions?
7. Difficulty in validation, and what statistics to use?
8. Designing simple, clear and comprehensive questions
9. Method of administration (written/e-mail)

LIMITATIONS ARE:
1. Questioning is done after the event, so participants may forget important issues.
2. Data obtained from open-ended questions take time to process and analyze.
3. Many questions and long questions must be avoided.
4. Participants may not be willing to respond, as they feel it may not benefit them.
5. A postal questionnaire does not ensure compliance and sincerity of the responses.
6. Close-ended questions may not give true answers and original perceptions.
7. We will not be able to get an in-depth analysis of what the group feels.
8. Respondents may fill it in merely as a formality, or may discuss among themselves and then fill it in.
9. Low response rate; coaxing is required to respond within the time.
10. If not standardized, we may not get valid information.
11. Incomplete and wrong answers (especially for sensitive questions).


STEPS TO DESIGNING A QUESTIONNAIRE

1. Review the literature on various themes and plausible items/descriptors used to assess structured learning in physical examination.
2. Develop a (new) questionnaire based on the literature search and the existing one.
3. Form the "project group: my project to our project" and discuss the compiled questionnaire with them in a series of meetings.
4. Give ample time for the project committee members to review the questionnaire and incorporate their comments and suggestions. (Note that the number of items may come down heavily after this process.)
5. Finalize the questionnaire in the project group and share it with other colleagues who will be involved in assessing the proposed skills and/or experts in this area. (Expect further modification of the items/themes.)
6. Pilot test the questionnaire with objective and subjective (open-ended comments, interviews) inputs from students, faculty and project committee members.
7. Analyze the pilot test results and make the necessary adjustments in the project committee (don't do the adjustment alone; share and finalize).
8. Prepare a process documentation report and try to publish it (up to here is pre-validation of the tool).
9. Validate the questionnaire in vivo, i.e. with real students, real faculty, etc.
10. Analyze the results, making sure to incorporate the objective (checklist) items as well as open-ended comments from students and faculty.
11. Prepare a full article based on the results (this is validation of the tool).


PRETESTING

When you have developed your questions and questionnaire using the good practice guidelines, you should test both of these. There are various methods that can be used to test questions and questionnaires. Some are quick and inexpensive, such as reading the questions out loud or asking your colleagues to complete the questionnaire. Other methods, for example conducting a pilot study, are more expensive and time consuming – but an essential stage of questionnaire development.


A few methods that are used to test questions and questionnaires are summarized below. The nature of your survey and resources available to you will dictate which of the testing methods you should adopt. Note that each method has its own advantages and disadvantages, and large-scale surveys typically use more than one method.

A peer review

This involves asking a number of people who are involved with the survey subject, or with questionnaires in general, to review the questionnaire. A peer review can range from sending the questionnaire informally to colleagues to asking a panel of experts to review it. This is typically an inexpensive method of testing, and it can highlight issues with the questionnaire and offer solutions. The disadvantage of a peer review is that you are not testing the questionnaire on people who are similar to those who will ultimately complete the survey.

Cognitive interviewing

Cognitive testing is a method of interviewing used to understand how people answer survey questions. It can be used to find out whether a question is working as intended and whether respondents can answer it correctly. The questionnaire is completed by a test respondent, and an interviewer then asks them how or why they answered the questions in that particular way. A number of respondents should be interviewed, and they should be representative of the survey population. Cognitive interviewing can uncover problems that would go unnoticed once the survey was in the field, and it can offer solutions. The method needs only a small number of respondents and is relatively cheap; if one respondent reveals a problem, the survey can be changed in time for the next cognitive interview. The disadvantages of this approach are that the interviewer's interpretation of responses can be subjective, and that it requires some specialist training to undertake properly.

Focus group discussions

A focus group test is similar to a cognitive interview, but in a group situation, with a discussion leader or moderator and dedicated note-taker. This would normally involve around five to eight participants who can adequately represent the survey population. They would complete the questionnaire, while the moderator observes any difficulties encountered. Once the forms are completed, any mistakes or missed questions can be analyzed. This is followed by debriefing the respondents, involving pre-planned questions and those based on observation. Typically, debriefing questions ask about the participants’ interpretation of terms and how they came up with their


answers. Like cognitive interviewing, focus group discussions can uncover hidden problems and offer solutions. Again, the method needs only a small number of respondents and is relatively cheap. The main disadvantage of focus group discussions is that it is not easy to know whether an observed problem is a significant one: vocal individuals in the group can unduly influence others, and problems encountered by only one individual may be noticed and commented on by everyone in the group.

Field/Pilot testing

A field test or pilot survey is a very small-scale survey that should replicate the exact conditions of the full survey to follow. Its advantage is that it immediately highlights problems with the routing in the questionnaire, questions that are frequently left unanswered, and how often the "don't know" and "no opinion" answer options are used. However, it can be very expensive. If it is a pilot face-to-face or telephone survey, an interviewer debriefing session, and possibly a respondent debriefing session, should follow. Advice on how to pilot a survey can be found in Survey Design and Analysis.

Interview debrief

The debriefing session normally involves the interviewers/participants reading through the questionnaire slowly while the researcher asks whether they encountered any problems with each question. A debrief can identify which questions were difficult to read or understand, from both the interviewer's and the respondent's point of view. For any problems that are highlighted, solutions can be proposed and discussed.

Behaviour coding

Behaviour coding is conducted by observing the respondent's actions. This method is usually undertaken by a coder and mainly used for face-to-face surveys, where the coder can more easily observe the interviewer and respondent and see how they interact. The coder can note whether the interviewer asks the questions exactly as worded, and whether the respondent asks for clarification or takes a long time to answer. Behaviour coding can detect problems in questions and is relatively cheap, but it does not suggest ways to solve the problems. For telephone surveys, the average delay between the interviewer asking the question and entering an answer can be calculated: questions with longer time gaps should be examined carefully to see if they are complicated or difficult to understand, with a view to simplifying them. Regardless of which question testing method you use, it is necessary to repeat the review process each time changes are made to the


questionnaire. This is necessary to ensure that the changes made result in genuine improvements and do not introduce any new problems.

Advantages of Written Questionnaires

Questionnaires are very cost-effective compared to face-to-face interviews, especially for studies involving large sample sizes and large geographic areas. Written questionnaires become even more cost-effective as the number of research questions increases.

Questionnaires are easy to analyze. Data entry and tabulation for nearly all surveys can be easily done with many computer software packages.

Questionnaires are familiar to most people. Nearly everyone has had some experience completing questionnaires and they generally do not make people apprehensive.

Questionnaires reduce bias. There is uniform question presentation and no middle-man bias. The researcher's own opinions will not influence the respondent to answer questions in a certain manner, and there are no verbal or visual cues to influence the respondent.

Questionnaires are less intrusive than telephone or face-to-face surveys. When respondents receive a questionnaire in the mail, they are free to complete it on their own timetable. Unlike with other research methods, the respondent is not interrupted by the research instrument.

Disadvantages Of Written Questionnaires

Low response rates. Low response is the curse of statistical analysis; it can dramatically lower our confidence in the results. Response rates vary widely from one questionnaire to another (10%-90%); however, well-designed studies consistently produce high response rates.

Another disadvantage of questionnaires is the inability to probe responses. Questionnaires are structured instruments. They allow little flexibility to the respondent with respect to response format. In essence, they often lose the "flavor of the response" (i.e., respondents often want to qualify their answers). By allowing frequent space for comments, the researcher can partially overcome this disadvantage. Comments are among the most helpful of all the information on the questionnaire, and they usually provide insightful information that would have otherwise been lost.

A large proportion of communication is visual, and gestures and other visual cues are not available with written questionnaires. The lack of personal contact will have different effects depending on the type of information being requested; a questionnaire requesting factual information will probably not be affected by it.


Finally, questionnaires are simply not suited for some people. For example, a written survey to a group of poorly educated people might not work because of reading skill problems. More frequently, people are turned off by written questionnaires because of misuse.

Devise a Data Analysis Plan

A common error in designing and implementing a questionnaire is failing to devise a data analysis plan. The plan should spell out how answers will be coded (for example, how you will code non-responses, unusual responses, or ratings where people circle two numbers when you want a single answer), what analyses you will run on single questions and on sets of questions, and any hypotheses you have along with the questions that will be used to test them. Do this even if you have survey software that analyzes the data automatically: you may find that the software does not support some of the analyses you need to answer the questions that matter to your stakeholders.
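The coding rules of such a plan can even be written down executably, so the plan and the data entry cannot drift apart. The sketch below assumes a hypothetical 5-point agree-disagree item; the special codes for non-responses and double-circled answers are invented for illustration, not a standard.

```python
# Hypothetical coding scheme from a data analysis plan for one 5-point item.
CODES = {
    "strongly disagree": 1, "disagree": 2, "neutral": 3,
    "agree": 4, "strongly agree": 5,
}
NON_RESPONSE = -9     # question left blank
MULTIPLE_MARKS = -8   # respondent circled two numbers on a single-answer item

def code_answer(raw):
    """Map a raw answer to the numeric code specified in the analysis plan."""
    if raw is None or raw == "":
        return NON_RESPONSE
    if isinstance(raw, (list, tuple)):
        if len(raw) > 1:
            return MULTIPLE_MARKS
        raw = raw[0]
    return CODES[raw.strip().lower()]

coded = [code_answer(a) for a in ["Agree", "", ("agree", "neutral"), "strongly disagree"]]
print(coded)  # [4, -9, -8, 1]
```

Writing the plan this way also forces the decision (before data collection) about whether the special codes are excluded from, or reported alongside, the substantive analyses.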

Conduct Limited Testing of the Questionnaire With Actual Users

Get a small sample of users (or people as close to the expected users as possible) and have them fill out the questionnaire under realistic conditions and give you feedback. Make your final changes based on this input and do a final edit.

Principles of Questionnaire Design

Ensure that your first question is relevant to everyone, easy, and interesting.

Avoid vague response quantifiers when precise quantifiers can be used.

LIKERT SCALE

Originally developed by Rensis Likert in 1932, the Likert scale is a psychometric scale commonly used in questionnaires, and is the most widely used scale in survey research, such that the term is often used interchangeably with rating scale even though the two are not synonymous. Like Thurstone or Guttman Scaling, Likert Scaling is a unidimensional scaling


method. This type of rating scale is the most widely used attitude scaling technique. Likert rating scales are used in various settings, including clinical, educational, administrative, and organizational contexts. Reasons for its popularity include that it is relatively easy to construct, it yields reliable scores, and it is flexible in its ability to measure many types of affective characteristics. When responding to a Likert questionnaire item, respondents specify their level of agreement with a statement.

A response or rating scale is a set of categories designed to elicit information about a quantitative or a qualitative attribute. Commonly a person selects a number on a 1-to-10 rating scale that is considered to reflect the perceived quality of a product. There are a variety of response or rating scales (1-to-7, 1-to-9, 0-to-4). All of these odd-numbered scales have a middle value, which is often labeled Neutral or Undecided. It is also possible to use a forced-choice response scale with an even number of responses and no middle neutral or undecided choice. In this situation, the respondent is forced to decide whether they lean more towards the "agree" or the "disagree" end of the scale for each item.

Rating scales yield a single score that references the direction and intensity of a person's attitude. The items are intended to differentiate respondents with favorable attitudes from those with unfavorable attitudes, and to allow for a range of responses between these two points on the continuum.

The Rating Scales could be of any of the following types:

Nominal: numbers are used as labels.
Ordinal: numbers indicate relative position, as in a Likert scale.
Interval: numbers indicate the magnitude of difference, e.g., opinion or attitude scales (no absolute zero).
Ratio: numbers indicate the magnitude of difference with a fixed zero, e.g., age or income.

Some common rating scales are listed in Appendix A.

How to Use the Likert Scale in Statistical Analysis

The Likert scale is commonly used in survey research. It is often used to measure respondents' attitudes by asking the extent to which they agree or disagree with a particular question or statement. On the surface, survey data using the Likert scale may seem easy to analyze, but there are important issues for a data analyst to consider.


Likert Rating Scale Statements

Select simple, short statements, rarely exceeding 20 words and containing only one complete thought, that are believed to cover the entire range of the affective scale of interest. Keep the language of the statements simple, clear, and direct.

Avoid statements that:

- refer to the past rather than the present
- are factual or capable of being interpreted as factual
- may be interpreted in more than one way
- are irrelevant to the psychological object or construct under consideration
- are likely to be endorsed by almost everyone or by almost no one
- contain universals such as all, always, none, and never, as they may introduce ambiguity
- contain words such as only, just, merely, and others of a similar nature
- contain words that may not be understood
- use double negatives.

Instructions for Use of the Likert Scale

Step 1: Get your data ready for analysis by coding the responses. For example, suppose you have a survey that asks respondents whether they agree or disagree with a set of positions in a political party's platform. Each position is one survey question, and the scale uses the following responses: strongly agree, agree, neutral, disagree, strongly disagree. In this example, we code the responses as follows: strongly disagree = 1, disagree = 2, neutral = 3, agree = 4, strongly agree = 5.

Step 2: Remember to differentiate between ordinal and interval data, as the two types require different analytical approaches. If the data are ordinal, we can say that one score is higher than another; we cannot say how much higher, as we can with interval data, which tell you the distance between two points. Here is the pitfall with the Likert scale: many researchers treat it as an interval scale, which assumes that the differences between each response are equal in distance. The truth is that the Likert scale does not tell us that. In our example, it only tells us that people with higher-numbered responses are more in agreement with the party's positions than those with lower-numbered responses.

Step 3: Begin analyzing your Likert scale data with descriptive statistics. Although it may be tempting, resist the urge to take the numeric responses and compute a mean. Adding a response of "strongly agree" (5) to two responses of "disagree" (2) would give a mean of 3, but what is the significance of that number? Fortunately, there are other measures of central tendency besides the mean. With Likert scale data, the best measure to use is the mode, or the most frequent response. This makes the survey results much easier for the analyst (and for the audience of your presentation or report) to interpret. You can also display the distribution of responses (percentages that agree, disagree, etc.) in a graphic such as a bar chart, with one bar for each response category.

Step 4: Proceed next to inferential techniques, which test the hypotheses posed by the researcher. There are many approaches available, and the best one depends on the nature of your study and the questions you are trying to answer. A popular approach is to analyze responses using rank-based nonparametric tests, such as the Mann-Whitney or Kruskal-Wallis test. Suppose in our example we wanted to analyze responses to questions on foreign policy positions with ethnicity as the independent variable. If our data include responses from Anglo, African-American, and Hispanic respondents, we could compare the three groups using the Kruskal-Wallis test.

Step 5: Simplify your survey data further by combining the four substantive response categories (e.g., strongly agree, agree, disagree, strongly disagree) into two nominal categories, such as agree/disagree or accept/reject. This opens up other analysis possibilities; the chi-square test is one approach for analyzing data categorized in this way.
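The descriptive and simplification steps above can be sketched in a few lines of stdlib Python. The coded responses below are hypothetical; the mode and distribution mirror Step 3, and the two-category collapse (dropping the neutral midpoint) mirrors Step 5.

```python
from collections import Counter

# Hypothetical coded responses (1 = strongly disagree ... 5 = strongly agree)
responses = [5, 4, 4, 3, 2, 4, 5, 1, 4, 3]

# Step 3: summarize with the mode and the response distribution, not the mean
counts = Counter(responses)
mode_response = counts.most_common(1)[0][0]
distribution = {code: 100 * n / len(responses) for code, n in sorted(counts.items())}

# Step 5: collapse into two nominal categories, dropping the neutral midpoint;
# a chi-square test could then compare agree/disagree counts across groups
agree = sum(1 for r in responses if r >= 4)
disagree = sum(1 for r in responses if r <= 2)

print(mode_response)    # 4
print(agree, disagree)  # 6 2
```

The `distribution` dictionary holds the percentages that would feed a bar chart with one bar per response category.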

Remember that there are many approaches to analysis. Consider your research questions when determining the best analytical approach for your study.

Applying the Likert Scale to a study


Define the Focus. As in all scaling methods, the first step is to define what it is you are trying to measure. Because Likert scaling is a unidimensional method, the concept to be measured is assumed to be one-dimensional in nature.

Generating the Items. Create the set of potential scale items. These should be items that can be rated on a 1-to-5 or 1-to-7 Disagree-Agree response scale. More often than not, it's helpful to engage a number of people in the item creation step using brainstorming to create the items. It's desirable to have as large a set of potential items as possible at this stage, about 80-100 would be best.

Rating the Items. The next step is to have a group of peers rate the items. Usually you would use a 1-to-5 rating scale where:

1 = strongly unfavorable to the concept
2 = somewhat unfavorable to the concept
3 = undecided
4 = somewhat favorable to the concept
5 = strongly favorable to the concept

Notice that, as in other scaling methods, the peers are not telling you what they believe -- they are judging how favorable each item is with respect to the construct of interest.

Selecting the Items. The next step is to compute the intercorrelations between all pairs of items, based on the ratings of the peers. In making judgements about which items to retain for the final scale, there are several analyses you can do:

Throw out any items that have a low correlation with the total (summed) score across all items

In most statistics packages it is relatively easy to compute this type of Item-Total correlation. First, you create a new variable which is the sum of all of the individual items for each respondent. Then, you include this variable in the correlation matrix computation (if you include it as the last variable in the list, the resulting Item-Total correlations will all be the last line of the correlation matrix and will be easy to spot). How low should the correlation be for you to throw out the item? There is no fixed rule here -- you might eliminate all items with a correlation with the total score less than 0.6, for example.

For each item, get the average rating for the top quarter of judges and the bottom quarter. Then, do a t-test of the differences between the mean value for the item for the top and bottom quarter judges.


Higher t-values mean that there is a greater difference between the highest and lowest judges. In more practical terms, items with higher t-values are better discriminators, so you want to keep these items. In the end, you will have to use your judgement about which items are most sensibly retained. You want a relatively small number of items on your final scale (e.g., 10-15) and you want them to have high Item-Total correlations and high discrimination (e.g., high t-values).
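The top-versus-bottom-quarter comparison can be sketched as a two-sample t statistic. The helper below computes Welch's t by hand from stdlib functions so the example stays self-contained; the judge ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t statistic for two independent samples (illustrative helper)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Hypothetical ratings of one item by the top-quarter and bottom-quarter judges
top_quarter = [5, 5, 4, 5]
bottom_quarter = [1, 2, 1, 2]

# A large t value means the item discriminates well between high and low judges
t = t_statistic(top_quarter, bottom_quarter)
print(round(t, 2))  # 8.51
```

In practice you would compute this for every candidate item and keep those with the highest t values alongside the item-total correlation criterion.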

Administering the Scale. A "true" Likert scale uses a 5-point agree-disagree response scale; scales that use other sets of anchors are called Likert-type scales. However, it is important to remember that increasing the number of points on a Likert scale does not necessarily help, because most respondents are unable to make finer distinctions. A mid-point allows respondents to select a neutral option and may be important if a respondent is truly ambivalent on a topic.

Respondents are asked the amount they agree or disagree with a number of statements. Each respondent is asked to rate each item on some response scale. For instance, they could rate each item on a 1-to-5 response scale where:

1 = strongly disagree
2 = disagree
3 = undecided
4 = agree
5 = strongly agree

The final score for the respondent on the scale is the sum of their ratings for all of the items (this is why this is sometimes called a "summated" scale).

REVERSAL ITEMS

On some scales, you will have items that are reversed in meaning from the overall direction of the scale. These are called reversal items. You will need to reverse the response value for each of these items before summing for the total. That is, if the respondent gave a 1, you make it a 5; if they gave a 2 you make it a 4; 3 = 3; 4 = 2; and, 5 = 1.
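The reversal mapping (1 -> 5, 2 -> 4, 3 -> 3, ...) is just `points + 1 - value` on a 5-point scale. A minimal sketch, with hypothetical item names and responses:

```python
def reverse_code(value, points=5):
    """Reverse a response on a `points`-point scale: 1 -> 5, 2 -> 4, 3 -> 3, ..."""
    return points + 1 - value

# Hypothetical responses; q2 is a reversal item worded against the scale direction
responses = {"q1": 2, "q2": 1, "q3": 4}
reversal_items = {"q2"}

coded = {q: reverse_code(v) if q in reversal_items else v
         for q, v in responses.items()}
total = sum(coded.values())  # the "summated" scale score
print(coded["q2"], total)  # 5 11
```

Reversing before summing is essential: leaving q2 uncoded here would give a total of 7 and understate the respondent's position on the scale.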

Example: The Employment Self Esteem Scale

Here's an example of a ten-item Likert Scale that attempts to estimate the level of self esteem a person has on the job. Notice that this instrument has no center or neutral point -- the respondent has to declare whether he/she is in agreement or disagreement with the item.

INSTRUCTIONS: Please rate how strongly you agree or disagree with each of the following statements by placing a check mark in the appropriate box.


Response options for every item: Strongly Disagree | Somewhat Disagree | Somewhat Agree | Strongly Agree

1. I feel good about my work on the job.
2. On the whole, I get along well with others at work.
3. I am proud of my ability to cope with difficulties at work.
4. When I feel uncomfortable at work, I know how to handle it.
5. I can tell that other people at work are glad to have me there.
6. I know I'll be able to cope with work for as long as I want.
7. I am proud of my relationship with my supervisor at work.
8. I am confident that I can handle my job without constant assistance.
9. I feel like I make a useful contribution at work.
10. I can tell that my coworkers respect me.

Utility of Likert Scale

“There has been a continuing and fierce debate on the use of Likert-type rating scales (Knapp, 1990). Although the response categories in Likert scales have a rank order and should be viewed as ordinal-level measurement, it has become common practice to assume that Likert-type categories constitute interval-level measurement, i.e., that the intervals between values are equal (Jamieson, 2004). On the issue of the usage of Likert-type rating scales, two opposite positions have been held by researchers. Some researchers have argued that the aforementioned use of


Likert scales may lead to errors in interpreting data and the relations inferred from data, while others have proposed that the danger is probably not as grave as it has been made out to be, and that the results we get from using summated scales and assuming equal intervals are quite satisfactory (e.g., Kerlinger & Lee, 2000)” (p. 123).

Likert-type rating scales have rank order and are therefore ordinal-level measurement. This is not a "view" but a statistical fact. The common practice of treating such data as interval-level measurements is therefore an assumption, not a property of the data.

Type of Scale: points 1 to 5 on the continuum

Agreement: Strongly Agree | Agree | Neither Agree nor Disagree | Disagree | Strongly Disagree
Frequency: Always | Often | About Half the Time | Seldom | Never
Satisfaction: Very Satisfied | Satisfied | Neither Satisfied nor Dissatisfied | Dissatisfied | Very Dissatisfied
Effectiveness: Very Effective | Effective | Neither Effective nor Ineffective | Ineffective | Very Ineffective
Quality: Very Good | Good | Average | Poor | Very Poor
Expectancy: Much Better than Expected | Better than Expected | As Expected | Worse than Expected | Much Worse than Expected
Extent: To a Very Great Extent | To a Great Extent | Somewhat | To a Small Extent | To a Very Small Extent

VALIDITY

Validating Questionnaires

In order for a questionnaire to be useful, the data it produces must be trustworthy, i.e., we must know that the results are meaningful and can be applied more generally than to just the sample tested. Proving that trustworthiness for questionnaires involving subjective clinical endpoints is


not trivial, and ensuring that the resulting data reflect the "truth" has spawned an entire field of study. The term "validation" has a variety of meanings in clinical research, the first and most obvious being the assessment of computer systems to ensure they function as expected. "Validation" is also the process by which any data collection instrument, including questionnaires, is assessed for its dependability. Validating questionnaires is challenging because they usually evaluate subjective measures, which can be influenced by a range of factors that are hard to control. A blood pressure machine can be assessed for accuracy and calibrated to ensure consistent readings; the SF-36 quality of life questionnaire obviously cannot be similarly assessed. That said, there are ways to evaluate, or validate, a questionnaire. Validation involves establishing that the instrument produces data that are reliable and true. There are a number of ways to define this, some of which are outlined below.

Reliability: the degree to which a questionnaire will produce the same result if administered again, or the "test-retest" concept. It is also a measure of the degree to which a questionnaire can reflect a true change.

Validity: the degree to which a questionnaire reflects reality. There are a number of different facets to validity.

Internal validity: the degree to which questions within an instrument agree with each other, i.e., that a subject will respond to similar questions in a similar way. It also affects the likelihood of producing false positives and false negatives.

External validity: the ability to make generalizations about a population beyond the sample tested.

Sensitivity: the degree to which the instrument can identify a true positive, e.g., accurately identify a person who does have the condition.

Specificity: similar to sensitivity, this is the degree to which the instrument can identify a true negative, e.g., correctly identify people who do not have the disease. Sensitivity and specificity are the other side of the coin from internal validity.

Statistical validity: related to internal validity; assesses whether differences in questionnaire results between patient groups can appropriately be subjected to statistical tests of significance.

Longitudinal validity: whether a questionnaire returns the same results in a given population over time, assuming all else remains equal.

Linguistic validity: whether the wording of the questionnaire is understood in the same way by everyone who completes it.

Discriminant validity: the ability of the questionnaire to detect true differences between groups, and to detect no difference when there isn't one.


Construct validity: the ability of a measure to assess correctly a particular cause-and-effect relationship between the measure and some other factor.
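The sensitivity and specificity definitions above reduce to two simple ratios once a questionnaire's classifications are tabulated against a gold-standard diagnosis. The counts below are invented for illustration:

```python
# Hypothetical results of a screening questionnaire against a gold standard
tp, fn = 45, 5    # people with the condition: correctly / incorrectly classified
tn, fp = 90, 10   # people without it: correctly / incorrectly classified

sensitivity = tp / (tp + fn)  # proportion of true positives identified
specificity = tn / (tn + fp)  # proportion of true negatives identified
print(sensitivity, specificity)  # 0.9 0.9
```

With these counts the questionnaire identifies 90% of true cases and correctly clears 90% of non-cases; a validation study would report both figures, since raising one typically lowers the other.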

“Validity” is not an absolute quality. It’s a continuum, with a questionnaire being valid to a certain degree in certain circumstances, and researchers must decide (preferably before the validation study is run) what degree of validity is considered sufficient. The above categories also suggest that there are types of validity that relate to the internal validity of the questionnaire (are similar questions answered similarly), others that relate to the ability of the questionnaire to determine a given state in a patient (e.g., that it varies in alignment with the severity of the condition), and still others that involve the validity of comparing different groups on the basis of the questionnaire. Each type of validity is distinct, meaning that a questionnaire can have one kind of validity but not another. Because of that, a questionnaire can never really be fully “validated.” It can only be validated for x patient population, under y conditions, and so forth. This implies that it may not be appropriate, for example, to use a lymphoma quality of life questionnaire in a melanoma study if the questionnaire hasn’t been validated for that particular population, unless it has been shown to be applicable to cancer patients generally.

The validity of the results can be impacted by more than just the design of the questionnaire itself. Some questionnaires must be administered by individuals who have been trained in survey administration generally, or that one in particular. Others can be administered by any experienced clinician, or nurse, or indeed completed by the patient. If an otherwise valid questionnaire is administered by the wrong individual, the results are compromised. Similarly, some instruments must be used in their original published form, and changing the layout to create a CRF may compromise the results.

Results can also be compromised if the questionnaire is not completed at the expected times (either time or day or relative to some other event), or in the right setting. The process of validating an instrument varies depending upon what aspect(s) of validity are being assessed. Generally it involves running a study that is designed to determine a specific kind of validity, although it is sometimes possible to add a validation arm onto a trial with other primary objectives. One way to check the validity of a questionnaire is to compare its results with results from more objective measures. For example, a questionnaire assessing a patient’s perception of their chronic obstructive pulmonary disease (COPD) may be compared to measures of their lung function, and the results of each compared between groups of healthy subjects and ill patients. If the instrument has appropriate specificity, sensitivity and discriminant validity, one should see a good correlation between the lung functions of the more severely ill patients with “worse” scores on the questionnaire. The degree to which the differences in the scores vary in alignment with the lung function tests across the healthy and


ill subjects is the measure of the validity of the instrument at identifying patients who have COPD. If the same questionnaire was developed in the US in English, and researchers wanted to use it in Italy, it would need to be translated into Italian. The Italian version would then have to be tested to see whether it varied with degree of illness in the way the English one did, or at least in a reliable and predictable fashion. Of course, there may be cultural differences that may require changing the content of the instrument. “Walking the length of a city block” is generally understood in the US, but the concept is meaningless in rural France.Establishing longitudinal validation is particularly relevant to clinical trials, in that determining the degree to which the use of an instrument repeatedly in a study affects the instrument results. On the one hand, in order to be able to draw conclusions from the results, the same instrument should be used throughout the study. For that to work it must be longitudinally valid. There is a well document test-retest effect, however; the first time a subject completes a given questionnaire the results are independent. After that, the subject is no longer naïve to the questions, and their answers in the second questionnaire may be influenced by their memory of their prior experience. Part of the process of validating instruments used over time is statistically evaluating thatrelationship.

RELIABILITY

Types of Reliability

It is not possible to calculate reliability exactly. Instead, we have to estimate reliability, and this is always an imperfect endeavor. Here, I want to introduce the major reliability estimators and talk about their strengths and weaknesses. There are four general classes of reliability estimates, each of which estimates reliability in a different way. They are:

Inter-Rater or Inter-Observer Reliability
Used to assess the degree to which different raters/observers give consistent estimates of the same phenomenon.

Test-Retest Reliability
Used to assess the consistency of a measure from one time to another.

Parallel-Forms Reliability
Used to assess the consistency of the results of two tests constructed in the same way from the same content domain.

Internal Consistency Reliability
Used to assess the consistency of results across items within a test.

Each of these is discussed in turn below.


Inter-Rater or Inter-Observer Reliability

Whenever you use humans as a part of your measurement procedure, you have to worry about whether the results you get are reliable or consistent. People are notorious for their inconsistency. We are easily distractible. We get tired of doing repetitive tasks. We daydream. We misinterpret.

Actually estimating inter-rater reliability: If your measurement consists of categories -- the raters are checking off which category each observation falls in -- you can calculate the percent of agreement between the raters. For instance, let's say you had 100 observations that were being rated by two raters. For each observation, the rater could check one of three categories. Imagine that on 86 of the 100 observations the raters checked the same category. In this case, the percent of agreement would be 86%. OK, it's a crude measure, but it does give an idea of how much agreement exists, and it works no matter how many categories are used for each observation.

The other major way to estimate inter-rater reliability is appropriate when the measure is a continuous one. There, all you need to do is calculate the correlation between the ratings of the two observers. For instance, they might be rating the overall level of activity in a classroom on a 1-to-7 scale. You could have them give their rating at regular time intervals (e.g., every 30 seconds). The correlation between these ratings would give you an estimate of the reliability or consistency between the raters.
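Both estimates described above are straightforward to compute. A minimal Python sketch, using invented ratings purely for illustration (only the standard library is used; the Pearson correlation is written out by hand to show the arithmetic):

```python
# Sketch of the two inter-rater estimates described above.
# The ratings below are made up for illustration only.

def percent_agreement(rater_a, rater_b):
    """Share of observations on which two raters chose the same category."""
    matches = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return matches / len(rater_a)

def pearson_r(x, y):
    """Pearson correlation between two continuous rating series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Categorical case: two raters, three categories, ten observations.
a = [1, 2, 2, 3, 1, 1, 2, 3, 3, 2]
b = [1, 2, 2, 3, 1, 2, 2, 3, 1, 2]
print(percent_agreement(a, b))  # 0.8 -> 80% agreement

# Continuous case: 1-to-7 classroom-activity ratings at regular intervals.
r1 = [4, 5, 3, 6, 2, 5, 4]
r2 = [4, 6, 3, 5, 2, 5, 3]
print(round(pearson_r(r1, r2), 2))  # ≈ 0.88
```

The same `pearson_r` helper also serves for the test-retest estimate below, where the two series are the same sample measured on two occasions rather than two raters.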

Test-Retest Reliability

We estimate test-retest reliability when we administer the same test to the same sample on two different occasions. This approach assumes that there is no substantial change in the construct being measured between the two occasions. The amount of time allowed between measures is critical. We know that if we measure the same thing twice, the correlation between the two observations will depend in part on how much time elapses between the two measurement occasions. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation. This is because the two observations are related over time -- the closer in time we get, the more similar the factors that contribute to error. Since this correlation is the test-retest estimate of reliability, you can obtain considerably different estimates depending on the interval.


Parallel-Forms Reliability

In parallel-forms reliability you first have to create two parallel forms. One way to accomplish this is to create a large set of questions that address the same construct and then randomly divide the questions into two sets. You administer both instruments to the same sample of people. The correlation between the two parallel forms is the estimate of reliability. One major problem with this approach is that you have to be able to generate lots of items that reflect the same construct. This is often no easy feat. Furthermore, this approach assumes that the randomly divided halves are parallel or equivalent; even by chance, this will sometimes not be the case.

The parallel-forms approach is very similar to the split-half reliability described below. The major difference is that parallel forms are constructed so that the two forms can be used independently of each other and considered equivalent measures. For instance, we might be concerned about a testing threat to internal validity. If we use Form A for the pretest and Form B for the posttest, we minimize that problem. It would be even better if we randomly assigned individuals to receive Form A or B on the pretest and then switched them on the posttest. With split-half reliability we have an instrument that we wish to use as a single measurement instrument, and we only develop randomly split halves for purposes of estimating reliability.

Internal Consistency Reliability


In internal consistency reliability estimation we use our single measurement instrument, administered to a group of people on one occasion, to estimate reliability. In effect, we judge the reliability of the instrument by estimating how well the items that reflect the same construct yield similar results. We are looking at how consistent the results are for different items for the same construct within the measure. There are a wide variety of internal consistency measures that can be used.

Average Inter-Item Correlation

The average inter-item correlation uses all of the items on our instrument that are designed to measure the same construct. We first compute the correlation between each pair of items, as illustrated in the figure. For example, if we have six items we will have 15 different item pairings (i.e., 15 correlations). The average inter-item correlation is simply the average or mean of all these correlations. In the example, we find an average inter-item correlation of .90, with the individual correlations ranging from .84 to .95.

Average Item-Total Correlation

This approach also uses the inter-item correlations. In addition, we compute a total score for the six items and use that as a seventh variable in the analysis. The figure shows the six item-to-total correlations at the bottom of the correlation matrix. They range from .82 to .88 in this sample analysis, with an average of .85.
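The two coefficients just described can be sketched as follows. The six-item, eight-respondent data set is invented for illustration (it is not the data behind the figures in the text); each row holds all respondents' answers to one item:

```python
# Average inter-item and average item-total correlations for a
# six-item scale, using made-up Likert-type scores.
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation between two score series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

items = [                      # items[i] = all respondents' scores on item i
    [4, 5, 3, 4, 2, 5, 4, 3],
    [4, 4, 3, 5, 2, 5, 4, 2],
    [5, 5, 2, 4, 3, 4, 4, 3],
    [4, 5, 3, 4, 2, 5, 5, 3],
    [3, 5, 3, 4, 2, 4, 4, 2],
    [4, 4, 2, 5, 2, 5, 4, 3],
]

# Average inter-item correlation: mean of all 15 pairwise correlations.
pair_rs = [pearson_r(a, b) for a, b in combinations(items, 2)]
avg_inter_item = sum(pair_rs) / len(pair_rs)

# Average item-total correlation: correlate each item with the total score.
totals = [sum(item[r] for item in items) for r in range(len(items[0]))]
item_total_rs = [pearson_r(item, totals) for item in items]
avg_item_total = sum(item_total_rs) / len(item_total_rs)

print(len(pair_rs), round(avg_inter_item, 2), round(avg_item_total, 2))
```

With six items there are indeed 15 pairings, exactly as the text notes.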


Split-Half Reliability

In split-half reliability we randomly divide all items that purport to measure the same construct into two sets. We administer the entire instrument to a sample of people and calculate the total score for each randomly divided half. The split-half reliability estimate, as shown in the figure below, is simply the correlation between these two total scores. In the example it is .87.
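A minimal sketch of that computation, assuming made-up item scores (the random seed merely fixes one particular split):

```python
# Split-half reliability: randomly halve the items, total each half
# per respondent, and correlate the two half-totals. Data are invented.
import random

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_r(items, seed=0):
    """items[i] = all respondents' scores on item i."""
    idx = list(range(len(items)))
    random.Random(seed).shuffle(idx)     # one random division into halves
    half = len(idx) // 2
    n = len(items[0])
    totals_a = [sum(items[i][r] for i in idx[:half]) for r in range(n)]
    totals_b = [sum(items[i][r] for i in idx[half:]) for r in range(n)]
    return pearson_r(totals_a, totals_b)

items = [
    [4, 5, 3, 4, 2, 5, 4, 3],
    [4, 4, 3, 5, 2, 5, 4, 2],
    [5, 5, 2, 4, 3, 4, 4, 3],
    [4, 5, 3, 4, 2, 5, 5, 3],
    [3, 5, 3, 4, 2, 4, 4, 2],
    [4, 4, 2, 5, 2, 5, 4, 3],
]
r_half = split_half_r(items)
print(round(r_half, 2))
```

One point the text does not dwell on: because each half is only half as long as the full instrument, the raw half-test correlation is usually stepped up with the Spearman-Brown correction, r_full = 2r / (1 + r), before being reported as the reliability of the whole scale.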

Cronbach's Alpha (α)

Imagine that we compute one split-half reliability, then randomly divide the items into another set of split halves and recompute, and keep doing this until we have computed all possible split-half estimates of reliability. Cronbach's Alpha is mathematically equivalent to the average of all possible split-half estimates, although that's not how we compute it. Notice that when I say we compute all possible split-half estimates, I don't mean that each time we go and measure a new sample! That would take forever. Instead, we calculate all split-half estimates from the same sample. Because we measured all of our sample on each of the six items, all we have to do is have the computer analysis do the random subsets of items and compute the resulting correlations. The figure shows several of the split-half estimates for our six-item example and lists them as SH with a subscript. Just keep in mind that although Cronbach's Alpha is equivalent to the average of all possible split-half correlations, we would never actually calculate it that way. Some clever mathematician (Cronbach, I presume!) figured out a way to get the mathematical equivalent a lot more quickly.

Comparison of Reliability Estimators

Each of the reliability estimators has certain advantages and disadvantages. Inter-rater reliability is one of the best ways to estimate reliability when your measure is an observation. However, it requires multiple raters or observers. As an alternative, you could look at the correlation of ratings of the same single observer repeated on two different occasions. For example, let's say you collected videotapes of child-mother interactions and had a rater code the videos for how often the mother smiled at the child. To establish inter-rater reliability you could take a sample of videos and have two raters code them independently. To estimate test-retest reliability you could have a single rater code the same videos on two different occasions. You might use the inter-rater approach especially if you were interested in using a team of raters and you wanted to establish that they yielded consistent results. If you get a suitably high inter-rater reliability, you could then justify allowing them to work independently on coding different videos. You might use the test-retest approach when you only have a single rater and don't want to train any others. On the other hand, in some studies it is reasonable to do both to help establish the reliability of the raters or observers.

The parallel-forms estimator is typically only used in situations where you intend to use the two forms as alternate measures of the same thing. Both the parallel-forms and all of the internal consistency estimators have one major constraint -- you have to have multiple items designed to measure the same construct. This is relatively easy to achieve in certain contexts like achievement testing (it's easy, for instance, to construct lots of similar addition problems for a math test), but for more complex or subjective constructs this can be a real challenge. If you do have lots of items, Cronbach's Alpha tends to be the most frequently used estimate of internal consistency.

The test-retest estimator is especially feasible in most experimental and quasi-experimental designs that use a no-treatment control group. In these designs you always have a control group that is measured on two occasions (pretest and posttest). The main problem with this approach is that you don't have any information about reliability until you collect the posttest and, if the reliability estimate is low, you're pretty much sunk.

Each of the reliability estimators will give a different value for reliability. In general, the test-retest and inter-rater reliability estimates will be lower in value than the parallel-forms and internal consistency ones because they involve measuring at different times or with different raters. Since reliability estimates are often used in statistical analyses of quasi-experimental designs (e.g., the analysis of the nonequivalent group design), the fact that different estimates can differ considerably makes the analysis even more complex.

CONCLUSION

Questionnaire design is a long process that demands careful attention.

A questionnaire is a powerful evaluation tool and should not be taken lightly. Design begins with an understanding of the capabilities of a questionnaire and how they can help your research. If it is determined that a questionnaire is to be used, the greatest care goes into the planning of the objectives.

Questionnaires are like any scientific experiment: one does not simply collect data and then see whether something interesting turns up. One forms a hypothesis and designs an experiment that will help prove or disprove it.

Questionnaires are versatile, allowing the collection of both subjective and objective data through the use of open or closed format questions. Modern computers have only made the task of collecting and extracting valuable material more efficient. However, a questionnaire is only as good as the questions it contains. There are many guidelines that must be met before your questionnaire can be considered a sound research tool. The majority deal with making the questionnaire understandable and free of bias. Mindful review and testing are necessary to weed out minor mistakes that can cause great changes in meaning and interpretation. When these guidelines are followed, the questionnaire becomes a powerful and economical evaluation tool.


Appendix A

PLANNING STAGE OF QUESTIONNAIRE DESIGN: Questionnaire Design Jun 2011 activity, 15 Apr 2011 - 25 May 2011

On 15 Apr 2011, Nandita Hazra (PSG Fellow 2010) sent a test mail to start the online inter-group activity on Questionnaire Design for 2011, to be conducted on the ML Web in June 2011. Dr Thomas Chacko was requested to be a part of the thread, and his email id was included so that the group could benefit from his vast experience during the planning process.

This is the group which moderated “Questionnaire” in Jun 2011

2011 fellows - Subish and Asma (Moderators)
2010 fellows - Maha and Nandita (Co-mentors)
2009 fellow - Vinutha
Faculty - Thomas Chacko, Meera, Medha, Amol, Animesh

The guidelines passed on to the 2011 fellows were as follows:


1. All planning was private to the group and was not to be posted inadvertently on the ML Web until the due date of 01 Jun 2011.
2. To ensure this, 2011 fellows were to check, when pressing the reply button, that the mail was being sent only to this link and not to the listserv.
3. Suggestions for the conduct of the session were welcome from all in the group.
4. Subish and Asma were free to key in their thoughts, but would need to do an adequate review of literature first; Maha and Nandita could assist in that. Please feel free to mail for any help required.
5. By 01 May 2011, we were to start finalizing our plans for this session, so be active on this group.
6. Best wishes and all the best for your individual CIPs (2011 fellows). Congratulations again on becoming a part of FAIMERly, and welcome to a new learning experience.

DATES | TOPIC ASSIGNED (questions assigned / innovation to be used) | ASSIGNED TO (2011 / 2010 / 2009 fellows and Faculty)
1-15 May 2011 | Planning by group, including survey questions for needs assessment, learning objectives, and a plan of action for who will carry out each week's session | Asma, Subish (2011); Maha, Nandita (2010); Vinutha (2009); Dr Thomas, Medha, Meera, Amol, Animesh (Faculty)
05 May 2011 | Reminder to 2011 fellows for submission of questionnaires by 07 Jun for peer analysis | Asma
15 May 2011 | Needs assessment questionnaire | Maha, Nandita
25 May 2011 | Learning objectives to be placed on the listserv, based on the needs assessment | Asma, Nandita
01-07 Jun 2011 | Introduction and review of literature | Asma, Nandita
08-14 Jun 2011 | Questionnaire development of individual projects | Nandita
15-21 Jun 2011 | Validation and reliability of questionnaires | Subish, Maha
22-28 Jun 2011 | Analysis of individual project questionnaires and methods to analyze data | Asma, Maha

The senior faculty member, Dr Thomas Chacko, said that since this was a very useful topic for the fellows' capacity building in implementing their projects, it made sense to start with a learning needs assessment before the session, since 45 days were available. A pre-survey would identify what the 2011 fellows needed to know for preparing their project questionnaires; in addition, going through the fellows' first progress reports (to be received by mid-May) would indicate their objectives, proposed study designs and evaluation plans.

2009 fellow Vinutha Shankar said that it would be good to start early and abide by the "plan-plan-plan" formula for successful implementation. It was also decided to post the objectives that had been tentatively planned during the onsite session. Animesh felt that the group could invite some questionnaires from the 2011 fellows (and perhaps from others too, if the group felt it appropriate). The Questionnaire Group could then take up a few questionnaires, analyse their strengths and weaknesses, and post the analysis on the listserv during the discussion, inviting comments from all; this could serve as a good "set induction". A need was also felt to tell the fellows sending their CIP questionnaires that they should be ready to take this exercise and its comments in a positive spirit. Asma and Subish were requested to insert their names in the table, as decided during the onsite session, and to start communicating to this group how they intended to moderate their respective weeks (preferably before 09 May 2011, so that planning could go ahead).

THE SUGGESTED PLAN IS AS FOLLOWS:

05 May 2011: POST THIS ON ML WEB

Dear 2011 Fellows,


The Jun 2011 Questionnaire group will be moderating the session on “Questionnaire”.

Each of the fellows of the 2011 group will be asked to submit their Questionnaires being used in the CIP during the first two weeks of June, i.e. 01-14 Jun 2011 for analysis of strengths and weaknesses by peers and Faculty.

This is an early reminder to request you to be ready with your questionnaires so that you may submit them for peer review. Dr Piryani has already done so, but is welcome to submit again during this period if modifications have been made.

Cheerio and looking forward to interacting with you all on the ML Web.

QUESTIONNAIRE Group

25 May 2011: POST THIS ON ML WEB

SPECIFIC LEARNING OBJECTIVES:

At the end of the Jun 2011 Intersession ML Web Discussion on Questionnaire, each participant should be able to:

1. Understand what a questionnaire is and how best to use it
2. Understand the types of questionnaires available
3. List the various types of open-ended and closed-ended questions that can be used in a questionnaire
4. Formulate question instructions and the layout of questionnaires for their own CIP
5. Understand how to use the Likert scale in the CIP questionnaire
6. Learn how to validate the individual statements in a questionnaire by using correlation matrix computation, and use the t-test to determine the most valid statements for the questionnaire
7. Understand how to pilot a questionnaire and its role in questionnaire development
8. Learn to use the various techniques in the analysis of data generated from questionnaires

Appendix B

LIKERT SCALES


Likert scales vary in the number of points in the scale. The five-point scale used here is the most common, but some Likert scales have 4-point response scales, eliminating the not sure/undecided category. Some even have 7-point response scales.

AGREEMENT
Strongly Agree / Agree / Undecided / Disagree / Strongly Disagree
Agree Strongly / Agree Moderately / Agree Slightly / Disagree Slightly / Disagree Moderately / Disagree Strongly
Agree / Disagree
Agree / Undecided / Disagree
Agree Very Strongly / Agree Strongly / Agree / Disagree / Disagree Strongly / Disagree Very Strongly
Yes / No
Completely Agree / Mostly Agree / Slightly Agree / Slightly Disagree / Mostly Disagree / Completely Disagree
Disagree Strongly / Disagree / Tend to Disagree / Tend to Agree / Agree / Agree Strongly

FREQUENCY
Very Frequently / Frequently / Occasionally / Rarely / Very Rarely / Never
Always / Very Frequently / Occasionally / Rarely / Very Rarely / Never
Always / Usually / About Half the Time / Seldom / Never
Almost Always / To a Considerable Degree / Occasionally / Seldom
A Great Deal / Much / Somewhat / Little / Never
Often / Sometimes / Seldom / Never
Always / Very Often / Sometimes / Rarely / Never

IMPORTANCE
Very Important / Important / Moderately Important / Of Little Importance / Unimportant

QUALITY
Very Good / Good / Barely Acceptable / Poor / Very Poor
Extremely Poor / Below Average / Average / Above Average / Excellent
Good / Fair / Poor

LIKELIHOOD
Like Me / Unlike Me
To a Great Extent / Somewhat / Very Little / Not at All
True / False
Definitely / Very Probably / Probably / Possibly / Probably Not / Very Probably Not
Almost Always True / Usually True / Often True / Occasionally True / Sometimes But Infrequently True / Usually Not True / Almost Never True
True of Myself / Mostly True of Myself / About Halfway True of Myself / Slightly True of Myself / Not at All True of Myself


 Appendix C

NEEDS ASSESSMENT DONE AT ONSITE SESSION TO IMPROVE THE ONLINE SESSION (Contributions from Faculty, 2008, 2009, 2010 Fellows)

Table 1: What went well in the ML Web discussion?

Session designing: Innovations (2); Planning in advance (6); Practical approach
Self-management: Assigning tasks to each member; Time management; Teamwork (3)
Participation: Active participation; Good responses
Communication: Communication with group (2); Excellent co-mentors / role of co-mentors (2)
Co-ordination: Support from Fellows and Faculty; Co-operation and interaction among group members
Learning: Good learning experience; Learned to moderate; Collaborative learning; Sharing of a lot of literature
Commitment: Hard work
Environment: Comfortable with the listserv; Positive change in the department

(Numbers in parentheses are response counts. Coding type: In vivo; Category: Descriptive)


Table 2: What were the challenges faced?

Time: Time management (3)
Response rate: Motivating others to respond (2); Poor response rate (1); Delayed responses (2)
Facilitating the discussion: Dealing with deviation/digression; Keeping the discussion interesting
Other obligations: Social and personal obligations (festivals, conferences, holidays, etc.); Other assignments
Handling of information: Lost communication details of participants; How to make a report out of the discussion; Storage of information and its retrieval (2)

(Numbers in parentheses are response counts. Coding type: In vivo; Category: Descriptive)

Table 3: How could it have been even better?

Seeking expert opinion; More co-ordination between co-mentors; Clear objectives and session planning in advance (5); Needs-based sessions (2); Making it interesting through innovations (3); Better time management (3); Teamwork; Devotion of more time; Keeping time for the listserv; Avoiding tangential talk; Timely responses


[Flow chart: the questionnaire research design sequence]

1. Design Methodology
2. Determine Feasibility
3. Develop Instruments (compose wording; design individual questions; design layout and coding)
4. Select Sample
5. Conduct Pilot Test
6. Revise Instruments
7. Conduct Research
8. Analyze Data
9. Prepare Report


QUESTIONNAIRE RESEARCH DESIGN proceeds in an orderly and specific manner. Each item in the flow chart depends upon the successful completion of all the previous items. Therefore, it is important not to skip a single step. There are two feedback loops in the flow chart to allow revisions of the methodology and instruments.

TIME CONSIDERATIONS IN QUESTIONNAIRE DEVELOPMENT

Researchers commonly underestimate the time required to complete a research project. To avoid this, use the following form as an initial checklist in developing time estimates. Be generous with your estimates, as things almost always take longer than we think they should.

This checklist contains two time estimates for each task. The first one (Hours) is your best estimate of the actual number of hours required to complete the task. The second one (Duration) is the amount of time that will pass until the task is completed.

For example, the estimate for goal clarification may be four hours, but other commitments may allow spending only two hours a day on the study. The "hours" estimate is thus four hours, and the "duration" estimate is two days.

To arrive at your final time estimates, add the individual estimates. The hours estimate is used for budget planning (i.e., arranging finances), and the duration estimate is used to develop a project timeline.

                                                                  Hours    Duration
1. Goal clarification ........................................... ________ ________
2. Overall study design ......................................... ________ ________
3. Selecting the sample ......................................... ________ ________
4. Designing the questionnaire and cover letter ................. ________ ________
5. Conduct pilot test ........................................... ________ ________
6. Revise questionnaire (if necessary) .......................... ________ ________
7. Printing time ................................................ ________ ________
8. Locating the sample (if necessary) ........................... ________ ________
9. Time in the mail and response time ........................... ________ ________
10. Attempts to get non-respondents ............................. ________ ________
11. Editing the data and coding open-ended questions ............ ________ ________
12. Data entry and verification ................................. ________ ________
13. Analyzing the data .......................................... ________ ________
14. Preparing the report ........................................ ________ ________
15. Printing and distribution of the report ..................... ________ ________
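The hours-versus-duration bookkeeping above can be sketched in a few lines. The task names and figures below are hypothetical, not taken from the report:

```python
# Summing per-task "hours" (for the budget) versus per-task "duration"
# (for the time line). Figures are illustrative only.
import math

# task: (estimated work hours, hours available per day for this task)
tasks = {
    "Goal clarification": (4, 2),
    "Designing the questionnaire": (12, 3),
    "Conduct pilot test": (6, 2),
}

total_hours = sum(hours for hours, _ in tasks.values())   # budget planning
total_duration = sum(math.ceil(hours / per_day)           # project time line
                     for hours, per_day in tasks.values())

print(total_hours)     # 22 hours of work
print(total_duration)  # 9 calendar days (2 + 4 + 3)
```

Rounding each task's duration up with `math.ceil` reflects that a partially used day still occupies a slot on the calendar.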

REFERENCES

DESIGNING A QUESTIONNAIRE

Bennett AE, Ritchie K. Questionnaires in medicine: A guide to their design and use. London: Oxford University Press (for the Nuffield Provincial Hospitals Trust), 1975.

Hulley SB, Cummings SR, eds. Designing clinical research. Baltimore: Williams and Wilkins, 1988.

McDowell I, Newell C. Measuring health. A guide to rating scales and questionnaires. New York: Oxford University Press, 1987.

Eastwood RP. Sales control by quantitative methods. New York: Columbia University Press, 1940.

Cartwright A. Health surveys in practice and in potential: a critical review of their scope and methods. London: King Edward's Hospital Fund, 1983.

Sudman S, Bradburn NM. Asking questions. London: Jossey-Bass, 1982.

Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. Oxford: Oxford University Press, 1989.

LIKERT SCALE IN QUESTIONNAIRE DESIGN


Resources:

Research Methods: Tips on Survey Research Methods.

How to Use the Likert Scale in Statistical Analysis. eHow.com. http://www.ehow.com/how_4855078_use-likert-scale-statistical-analysis.html

Jamieson, S. (2004). Likert scales: how to (ab)use them. Medical Education, 38, 1217-1218.

Kerlinger, F.N., & Lee, H.B. (2000). Foundations of Behavioral Research (4th ed.). Fort Worth, TX: Harcourt College Publishers.

Wu, Y., & Tsai, C. (2007). Developing an information commitment survey for assessing students’ web information searching strategies and evaluative standards for web materials. Educational Technology & Society, 10(2), 120-132.

http://www.iiep.unesco.org/fileadmin/user_upload/Cap_Dev_Training/Training_Materials/Quality/Qu_Mod8.pdf

QUESTIONNAIRE DESIGN

Foddy W (1993). Constructing questions for interviews and questionnaires: Theory and practice in social research. Cambridge University Press, Melbourne.

Oppenheim AN (1992). Questionnaire design, interviewing and attitude measurement. Pinter Publishers, London.

Schuman H, Presser S (1996). Questions and answers in attitude surveys: experiments on question form, wording, and context. Sage Publications, San Diego.

Streiner DL, Norman GR (1995). Health Measurement Scales: a practical guide to their development and use. Oxford University Press, Oxford.

Stone DH. How to do it: Design a questionnaire. BMJ, Volume 307, 13 November 1993.

Vilie Farah. http://www.brighthub.com/internet/web-development/articles/114491.aspx

SPSS: http://www.statisticshell.com/designing_questionnaires.pdf


Fox, J.E. and Fricker, S.S. (2009). Designing ratings scales for questionnaires. Presented at the Usability Professionals' Association Annual Conference. Portland, OR, USA. 11-Jun-2009. Discusses the effects of rating scales on questionnaire results.

Fox, J.E. and Fricker, S.S. (2008). Beyond words: Strategies for designing good questionnaires. Presented at the Usability Professionals' Association Annual Conference. Baltimore, MD, USA. 19-Jun-2008.

Dumas, J. and Tullis, T. (2009). Annotated Bibliography of Rating Scale Literature. 25-May-2009. Extensive list of references on rating scales and questionnaires.

Tullis, T. and Dumas, J. (2009). Rating scales: What the research says. PowerPoint slides presented at the Boston Usability Professionals' Association Mini-UPA Conference, 28-May-2009.

Jurek Kirakowski. Questionnaires in Usability Engineering (last updated 2-Jun-2000).

Borgatti SP (1996; revised September 30, 1998). Principles of Questionnaire Design. www.analytictech.com/mb313/principl.htm

RELIABILITY

Cook TD, Campbell DT. Quasi-Experimentation: Design and Analysis Issues for Field Settings. Boston, MA: Houghton Mifflin Company, 1979.

Damato S, Bonatti C, Frigo V, Pappagallo S, et al. Validation of the Clinical COPD Questionnaire in Italian language. Health and Quality of Life Outcomes 2005, 3:9.

http://www.socialresearchmethods.net/kb/relandval.php
