data edit code

Upload: shashank-shekhar

Post on 02-Jun-2018

TRANSCRIPT

  • 8/11/2019 Data Edit Code

    Learning Objectives

    1. Explain the concepts of editing and coding

    2. List the important considerations in editing and coding

    3. List and explain the key issues in error-checking and data transformation

    4. Explain the contents and uses of a code book

    5. Edit and code completed questionnaires

    PROCESSING OF DATA

    The data collected in research is processed and analyzed to reach conclusions or to verify the hypothesis made.

    Processing of data is important because it makes further analysis of the data easier and more efficient. Processing of data technically means:

    1. Editing of the data

    2. Coding of data

    3. Classification of data

    4. Tabulation of data.

    Overview of the Stages of Data Analysis

    EDITING

    The process of checking and adjusting responses in the completed questionnaires for omissions, legibility, and consistency, and readying them for coding and storage.

    Types of Editing

    1. Field Editing

    Preliminary editing by a field supervisor on the same day as the interview to catch technical omissions, check legibility of handwriting, and clarify responses that are logically or conceptually inconsistent.

    2. In-house (Central) Editing

    Editing performed by central office staff; often done more rigorously than field editing. It involves:

    1. Backtracking

    2. Allocating missing values

    3. Plug values

    Purpose of Editing

    1. For consistency between and among responses

    2. For completeness in responses to reduce effects of item non-response

    3. To better utilize questions answered out of order

    4. To facilitate the coding process

    Editing for Completeness

    Item Nonresponse: The technical term for an unanswered question on an otherwise complete questionnaire, resulting in missing data.

    Plug Value: An answer that an editor plugs in to replace blanks or missing values so as to permit data analysis; the choice of value is based on a predetermined decision rule.

    Impute: To fill in a missing data point through the use of a statistical algorithm that provides a best guess for the missing response based on available information.
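
    The difference between a plug value and an imputed value can be sketched in a few lines of Python. The sample ages, the plug value 99, and the use of the mean as the imputation rule are illustrative assumptions only; the slides do not prescribe a particular rule or algorithm.

```python
from statistics import mean

# Hypothetical ages from five questionnaires; None marks item nonresponse.
ages = [34, None, 29, 41, None]

PLUG_VALUE = 99  # assumed decision rule: 99 stands for "not answered"


def plug(values, plug_value):
    """Replace every missing answer with a fixed, predetermined plug value."""
    return [plug_value if v is None else v for v in values]


def impute_mean(values):
    """Replace every missing answer with the mean of the observed answers."""
    observed = [v for v in values if v is not None]
    best_guess = mean(observed)
    return [best_guess if v is None else v for v in values]


plugged = plug(ages, PLUG_VALUE)   # missing answers become 99
imputed = impute_mean(ages)        # missing answers become the observed mean
```

    A plug value only marks the gap so that the record can be processed, whereas an imputed value is meant to stand in for the missing answer during analysis.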

    Facilitating the Coding Process

    Data Clean-up: Checking written responses for any stray marks

    Editing and Tabulating "Don't Know" Answers:

    - Legitimate don't know (no opinion)

    - Reluctant don't know (refusal to answer)

    - Confused don't know (does not understand)

    Editing (contd.)

    Pitfalls of Editing

    - Allowing subjectivity to enter into the editing process. Data editors should be intelligent, experienced, and objective.

    - Failing to have a systematic procedure for assessing the questionnaires developed by the research analyst. An editor should have clearly defined decision rules to follow.

    Pretesting Edit

    Editing during the pretest stage can prove very valuable for improving questionnaire format and for identifying poor instructions or inappropriate question wording.

    CODING

    The process of identifying and classifying each answer given by a respondent with a numerical score or other character symbol.

    The numerical score or symbol is called a code, and serves as a rule for interpreting, classifying, and recording data.

    Identifying responses with codes is necessary if data is to be processed by computer.

    Coding - Continued

    Coded data is often stored electronically in the form of a data matrix - a rectangular arrangement of the data into rows (representing cases) and columns (representing variables).

    The data matrix is organized into fields, records, and files:

    Field: A collection of characters that represents a single type of data

    Record: A collection of related fields, i.e., fields related to the same case (or respondent)

    File: A collection of related records, i.e., records related to the same sample
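
    As a rough illustration of fields, records, and files, the sketch below reads a small coded file with Python's standard csv module. The variable names (resp_id, gender, satisfaction) and the three rows are made up for the example.

```python
import csv
import io

# Hypothetical coded file: one record (row) per respondent,
# one field (column) per variable.
raw_file = """resp_id,gender,satisfaction
001,1,4
002,2,5
003,1,2
"""

# The file is a collection of records; each record is a collection of fields.
records = list(csv.DictReader(io.StringIO(raw_file)))

first_record = records[0]                        # all fields for respondent 001
one_field = first_record["gender"]               # a single field in that record
gender_column = [r["gender"] for r in records]   # one variable across all cases
```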

    Coding

    Codebook formulation: the formal standardization of all the variables under study. While designing it, we must take care of:

    - Appropriateness to the research objective

    - Comprehensive options

    - Mutually exclusive options

    - Single variable entry

    Coding

    Coding of closed-ended structured questions:

    - Dichotomous questions

    - Ranking questions

    - Checklist / multiple responses

    - Scaled questions

    - Missing values

    Key Issues in Coding

    1. Pre-coding and post-coding

    2. Pre-Coding Fixed-Alternative Questions (FAQs) - Writing codes for FAQs on the questionnaire before the data collection

    3. Coding Open-Ended Questions - A 3-stage process:

    (a) Perform a test tabulation, (b) Devise a coding scheme, (c) Code all responses

    Two Rules For Code Construction are:

    a) Coding categories should be exhaustive

    b) Coding categories should be mutually exclusive and independent
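
    The three stages for open-ended questions, together with the two construction rules, might look like this in Python. The answers, the category codes, and the catch-all code 9 are hypothetical; they only show one way to keep a scheme exhaustive and mutually exclusive.

```python
from collections import Counter

# Hypothetical open-ended answers to "Why did you choose this store?"
answers = ["low price", "close to home", "low price", "friendly staff", "cheap"]

# (a) Test tabulation: count the raw answers to see which themes recur.
test_tabulation = Counter(answers)

# (b) Coding scheme (assumed): each known answer maps to exactly one code,
#     which keeps the categories mutually exclusive.
coding_scheme = {
    "low price": 1,
    "cheap": 1,
    "close to home": 2,
    "friendly staff": 3,
}
OTHER = 9  # catch-all code so that the scheme stays exhaustive

# (c) Code all responses.
codes = [coding_scheme.get(answer, OTHER) for answer in answers]
```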

    Issues in Coding - Continued

    4. Maintaining a Code Book - A book that identifies each variable in a study, the variable's description, code name, and position in the data matrix

    5. Production Coding - The physical activity of transferring the data from the questionnaire or data collection form [to the computer] after the data has been collected. Sometimes done through a coding sheet: ruled paper drawn to mimic the data matrix

    6. Combining Editing and Coding
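
    A code book can be held in a simple data structure. The sketch below is one possible layout, not a standard format; the class name CodeBookEntry, the variables, and the code meanings are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CodeBookEntry:
    variable: str       # code name of the variable
    description: str    # what the variable measures
    column: int         # position in the data matrix (1-based)
    codes: dict = field(default_factory=dict)  # permitted code -> meaning


code_book = [
    CodeBookEntry("resp_id", "Respondent number", 1),
    CodeBookEntry("gender", "Gender of respondent", 2,
                  {1: "Male", 2: "Female", 9: "Missing"}),
    CodeBookEntry("satisf", "Overall satisfaction", 3,
                  {1: "Very low", 2: "Low", 3: "Neutral",
                   4: "High", 5: "Very high", 9: "Missing"}),
]


def decode(variable, code):
    """Look up the meaning of a code for a given variable."""
    entry = next(e for e in code_book if e.variable == variable)
    return entry.codes[code]
```

    With this layout, decode("gender", 2) returns the label stored for code 2 of the gender variable.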

    AFTER CODING ..

    1. Data Entry - The transfer of codes from questionnaires (or coding sheets) to a computer. Often accomplished in one of three ways:

    a) On-line direct data entry

    b) Optical scanning for highly structured questionnaires

    c) Keyboarding: data entry via a computer keyboard; often requires verification

    After Coding - Continued

    2. Error Checking - Verifying the accuracy of data entry and checking for some kinds of obvious errors made during the data entry.

    Often accomplished through frequency analysis.
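
    A frequency analysis for error checking can be as simple as counting every entered value and flagging those that are not permitted codes. The item, its valid codes, and the entered values below are assumed for the example.

```python
from collections import Counter

# Hypothetical entered codes for a 5-point satisfaction item (9 = missing).
VALID_CODES = {1, 2, 3, 4, 5, 9}
entered = [4, 5, 3, 44, 2, 9, 5, 0, 4]

# Frequency analysis: how often each entered value occurs.
frequencies = Counter(entered)

# Any value outside the permitted codes is an obvious data-entry error.
suspect = {value: n for value, n in frequencies.items()
           if value not in VALID_CODES}
```

    Here the values 44 and 0 would be flagged for checking against the original questionnaires. A value that is wrong but still a valid code would not be caught this way, which is why keyed data often also needs verification.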

    After Coding - Continued

    3. Data Transformation - Converting some of the data from the format in which they were entered to a format most suitable for a particular statistical analysis.

    Often accomplished through re-coding, to:

    - reverse-score negative (or positive) statements into positive (or negative) statements;

    - collapse the number of categories of a variable
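
    Both kinds of re-coding are short operations in code. The sketch assumes a 5-point agreement scale and an arbitrary three-way grouping; neither comes from the slides.

```python
# Hypothetical 5-point agreement scores
# (1 = strongly disagree ... 5 = strongly agree) on a negatively worded item.
negative_item = [1, 2, 5, 4, 3]


def reverse_score(score, low=1, high=5):
    """Mirror a score on its scale so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - score


reversed_item = [reverse_score(s) for s in negative_item]

# Collapse five categories into three (assumed grouping):
# 1 = disagree, 2 = neutral, 3 = agree.
COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
collapsed = [COLLAPSE[s] for s in negative_item]
```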

    EDITING REQUIRES SOME CAREFUL CONSIDERATIONS:

    - The editor must be familiar with the interviewer's mindset, the objectives, and everything related to the study.

    - Different colors should be used when editors make entries in the data collected.

    - Editors should initial all answers or changes they make to the data.

    - The editor's name and the date of editing should be placed on the data sheet.

    CODING:

    Classification of responses may be done on the basis of one or more common concepts.

    In coding, a particular numeral or symbol is assigned to the answers in order to put the responses in some definite categories or classes.

    The classes of responses determined by the researcher should be appropriate and suitable to the study.

    Coding enables efficient and effective analysis as the responses are categorized into meaningful classes.

    Coding decisions are considered while developing or designing the questionnaire or any other data collection tool.

    Coding can be done manually or through computer.

    CLASSIFICATION:

    Classification of the data implies that the collected raw data is categorized into groups having common features.

    Data having common characteristics are placed in a common group.

    The entire data collected is categorized into various groups or classes, which convey a meaning to the researcher.

    Classification is done in two ways:

    1. Classification according to attributes.

    2. Classification according to the class intervals.
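
    The two ways of classifying can be sketched as two groupings of the same records. The respondents, the attribute (marital status), and the class width of 20000 are hypothetical.

```python
from collections import defaultdict

# Hypothetical respondents: (marital status, monthly income).
respondents = [
    ("single", 18000),
    ("married", 42000),
    ("married", 27000),
    ("single", 55000),
]

# 1. Classification according to attributes:
#    group by a descriptive characteristic.
by_status = defaultdict(list)
for status, income in respondents:
    by_status[status].append(income)

# 2. Classification according to class intervals:
#    group a numerical variable into ranges of an assumed width
#    (0-19999, 20000-39999, 40000-59999, ...).
WIDTH = 20000
by_interval = defaultdict(list)
for _, income in respondents:
    lower = (income // WIDTH) * WIDTH
    by_interval[(lower, lower + WIDTH - 1)].append(income)
```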

    CLASSIFICATION ACCORDING TO THE ATTRIBUTES:

    Here the data is classified on the basis of common characteristics that can be descriptive, like literacy, sex, honesty, marital status, etc., or numerical, like weight, height, income, etc.

    Descriptive features are qualitative in nature and cannot be measured quantitatively, but they are still considered while making an analysis.

    The analysis used for such classified data is known as statistics of attributes, and the classification is known as classification according to the attributes.

    TABULATION:

    The mass of data collected has to be arranged in some kind of concise and logical order.

    Tabulation summarizes the raw data and displays it in the form of statistical tables.

    Tabulation is an orderly arrangement of data in rows and columns.

    OBJECTIVES OF TABULATION:

    1. Conserves space & minimizes explanation and descriptive statements.

    2. Facilitates the process of comparison and summarization.

    3. Facilitates detection of errors and omissions.

    4. Establishes the basis of various statistical computations.
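
    As a small illustration of tabulation, the following sketch arranges coded responses into a two-way table with row and column totals. The data (gender by preferred brand) and the layout widths are assumptions made for the example.

```python
from collections import Counter

# Hypothetical coded responses: (gender, preferred brand).
responses = [("F", "A"), ("M", "B"), ("F", "A"), ("M", "A"), ("F", "B")]

counts = Counter(responses)
rows = sorted({gender for gender, _ in responses})
cols = sorted({brand for _, brand in responses})


def build_table():
    """Return the cross-tabulation as lines of text, with row and column totals."""
    lines = ["Gender  " + "".join(f"{c:>6}" for c in cols) + f"{'Total':>7}"]
    for r in rows:
        cells = [counts[(r, c)] for c in cols]
        lines.append(f"{r:<8}" + "".join(f"{n:>6}" for n in cells)
                     + f"{sum(cells):>7}")
    col_totals = [sum(counts[(r, c)] for r in rows) for c in cols]
    lines.append(f"{'Total':<8}" + "".join(f"{n:>6}" for n in col_totals)
                 + f"{sum(col_totals):>7}")
    return lines


table = build_table()
print("\n".join(table))
```

    The totals row and column make omissions easy to spot: the grand total should equal the number of responses entered.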

    BASIC PRINCIPLES OF TABULATION :

    1. Tables should be clear, concise & adequately titled.

    2. Every table should be distinctly numbered for easy reference.

    3. Column headings & row headings of the table should be clear & brief.

    4. Units of measurement should be specified at appropriate places.

    5. Explanatory footnotes concerning the table should be placed at appropriate places.

    6. Source of information of data should be clearly indicated.

    7. The columns & rows should be clearly separated with dark lines.

    8. Demarcation should also be made between data of one class and that of another.

    9. Comparable data should be put side by side.

    10. The figures in percentage should be approximated before tabulation.

    11. The figures, symbols, etc. should be properly aligned and adequately spaced to enhance readability.

    12. Abbreviations should be avoided.

    Post tabulation

    Exploratory analysis

    Statistical software packages:

    - MS Excel

    - Minitab

    - Statistical Analysis System (SAS)

    - Statistical Package for the Social Sciences (SPSS)

    ANALYSIS OF DATA

    The important statistical measures that are used to analyze the research or the survey data are:

    1. Measures of central tendency (mean, median & mode)

    2. Measures of dispersion (standard deviation, range, mean deviation)

    3. Measures of asymmetry (skewness)

    4. Measures of relationship (correlation and regression)

    5. Association in case of attributes.

    6. Time series Analysis
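
    Several of these measures can be computed with Python's standard library alone. The two samples below are invented, the standard deviation is taken as the population version, and the correlation is a hand-written Pearson coefficient; treat the block as a sketch rather than a full analysis.

```python
import math
import statistics

# Hypothetical paired observations on two variables.
x = [2, 4, 4, 4, 5, 5, 7, 9]
y = [1, 3, 2, 5, 4, 6, 8, 9]

# Measures of central tendency.
central = {
    "mean": statistics.mean(x),
    "median": statistics.median(x),
    "mode": statistics.mode(x),
}

# Measures of dispersion.
dispersion = {
    "range": max(x) - min(x),
    "std_dev": statistics.pstdev(x),  # population standard deviation
    "mean_deviation": sum(abs(v - central["mean"]) for v in x) / len(x),
}


def pearson_r(a, b):
    """Pearson correlation coefficient, a measure of linear relationship."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


r = pearson_r(x, y)
```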

    TESTING THE HYPOTHESIS

    Several factors are considered in determining the appropriate statistical technique to use when conducting a hypothesis test. The most important are:

    1. The type of data being measured.

    2. The purpose or the objective of the statistical inference.

    Hypotheses can be tested by various techniques. The hypothesis testing techniques are divided into two broad categories:

    1. Parametric Tests.

    2. Non-Parametric Tests.

    PARAMETRIC TESTS:

    These tests depend upon assumptions, typically that the population(s) from which data are randomly sampled have a normal distribution. Types of parametric tests are:

    1. t-test

    2. z-test

    3. F-test

    4. χ²-test
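
    As an example of a parametric test, the sketch below computes the one-sample t statistic, t = (sample mean - hypothesized mean) / (s / sqrt(n)), using only the standard library. The sample and the hypothesized mean of 50 are assumptions; the resulting statistic would then be compared with a critical value from a t table for n - 1 degrees of freedom.

```python
import math
import statistics

# Hypothetical sample; H0: the population mean equals 50.
sample = [52, 48, 55, 51, 49, 53, 50, 54]
MU_0 = 50

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)

# One-sample t statistic and its degrees of freedom.
t_stat = (sample_mean - MU_0) / (sample_sd / math.sqrt(n))
degrees_of_freedom = n - 1
```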

    NON PARAMETRIC TESTS

    The various types of Non Parametric Tests are:

    1. Wilcoxon Signed-Rank Test (for comparing two related populations)

    2. Kolmogorov-Smirnov Test (to test whether or not the sample of data is consistent with a specified distribution function)

    3. Runs Test (in studies where measurements are made according to some well-defined ordering, either in time or space, a frequent question is whether or not the average value of the measurement is different at different points in the sequence. This test provides a means of testing this.)

    4. Sign Test (a single-sample test that can be used instead of the single-sample t-test or paired t-test)

    5. Chi square test
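
    The sign test is simple enough to compute directly. The sketch below uses hypothetical before/after scores, drops tied pairs, and obtains an exact two-sided p-value from the binomial distribution with p = 0.5; it is one common formulation, not the only one.

```python
from math import comb

# Hypothetical paired measurements (before, after) for ten respondents.
before = [72, 65, 80, 58, 77, 69, 74, 61, 83, 70]
after = [75, 70, 78, 64, 81, 69, 79, 66, 85, 73]

# Keep only the non-zero differences; tied pairs are dropped.
diffs = [a - b for a, b in zip(after, before) if a != b]
n = len(diffs)
positives = sum(d > 0 for d in diffs)


def sign_test_p_value(k, n):
    """Exact two-sided p-value under H0: plus and minus signs are equally likely."""
    tail = min(k, n - k)
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * one_sided)


p_value = sign_test_p_value(positives, n)
```

    Only the direction of each difference is used, not its size, which is why the test makes no assumption of normality.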

    INTERPRETATION:

    Interpretation draws out the relationships among the collected data on the basis of the analysis. It looks beyond the data of the research and takes in other research, theory and hypotheses. In a way, interpretation acts as a tool to explain the observations of the researcher during the research period, and it acts as a guide for future research.

    WHY Interpretation?

    - The researcher understands the abstract principle underlying the findings.

    - Interpretation links up the findings with those of other similar studies.

    - The researcher is able to make others understand the real importance of his research findings.
