8.processing
Research methodology notes that may be useful to medical and paramedical UG and PG students pursuing research.
Processing & Analysis of data
D.A. Asir John Samuel, MPT (Neuro Paed),
Lecturer, Alva’s College of Physiotherapy,
Moodbidri
Dr.Asir John Samuel (PT), Lecturer, ACP
Processing operations
• Editing
• Coding
• Classification
• Tabulation
Editing
• Process of examining the collected raw data
• Editing is done to assure that data are
accurate, consistent with other facts gathered,
uniformly entered, as complete as possible
• Field editing
• Central editing
Field editing
• Review of reporting forms by the investigator
for completing, translating or rewriting
• Needed because individual writing styles can make entries hard to read
• Done on the very day of the interview or on the next day
• Errors of omission must not be corrected by simply
guessing
Central editing
• Takes place when all forms or schedules have
been completed and returned to the office
• Correct errors such as an entry in wrong place,
wrong month, and the like
• Respondent can be contacted for clarification
• No bias
Coding
• Process of assigning numerals or other
symbols to answers
• Should be appropriate to research problem
under consideration
• Necessary for effective analysis
• Extraction of data
Classification
• Large volume of raw data is reduced into
homogeneous groups
• Arranging data in groups or classes on basis of
common characteristics
• Classification according to attributes
• Classification according to class-intervals
Tabulation
• Arranging in concise and logical order
• Summarising raw data and displaying in
compact form
• Orderly arrangement of data in columns and
rows
Tabulation is essential because it
• Conserves space and reduces explanatory and
descriptive statements to a minimum
• Facilitates process of comparison
• Facilitates summation of items and detection
of errors and omissions
• Provides a basis for various statistical computations
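As a rough illustration of how coding, classification and tabulation fit together, the following Python sketch works on made-up data. The responses, the ages, the code book values (1, 2, 9), the 20-year class width and all variable names are assumptions chosen for the example; they do not come from the slides.

```python
from collections import Counter

# Hypothetical raw data: answers to one question and respondent ages.
responses = ["Yes", "No", "Don't know", "Yes", "Yes", "No"]
ages = [23, 35, 41, 58, 62, 37]

# Coding: assign a numeral to each answer (assumed code book).
CODEBOOK = {"Yes": 1, "No": 2, "Don't know": 9}
coded = [CODEBOOK[answer] for answer in responses]

# Classification according to class-intervals (assumed width of 20 years).
def age_class(age, start=20, width=20):
    lower = start + ((age - start) // width) * width
    return f"{lower}-{lower + width - 1}"

classes = [age_class(age) for age in ages]

# Tabulation: count the items in each category and lay them out in rows and columns.
code_table = Counter(coded)
class_table = Counter(classes)

print("Code  Frequency")
for code in sorted(code_table):
    print(f"{code:<5} {code_table[code]}")

print("Age class  Frequency")
for label in sorted(class_table):
    print(f"{label:<10} {class_table[label]}")
```

Note that "Don't know" is given its own code here, so it stays a separate category in the table.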
Problems in processing
• Problem concerning “Don’t Know” responses
• Use of percentages
Problem concerning “Don’t Know” responses
• When DK group is small, it is of little significance
• In big group, it becomes matter of concern
• Respondent may actually not know the answer, or
• Researcher may fail in obtaining appropriate
information (failure of questioning process)
• Keep as a separate category in tabulation
Use of percentages
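• Two or more percentages must not be averaged
unless each is weighted by group size
• Too large percentages should be avoided
because they are difficult to understand and tend to confuse
• Percentages hide the base value from which they are computed
• If the base is ignored, real differences may not be correctly read
• Percentage decrease can never exceed 100 percent, so the
higher figure is taken as the base when calculating a decrease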
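The rule about averaging percentages can be checked with a small Python sketch. The two groups, their sizes and the helper name `percent_decrease` are hypothetical values made up for the illustration.

```python
# Hypothetical groups: (percentage responding "Yes", group size).
groups = [(20.0, 50), (60.0, 150)]

# Simple average of the percentages ignores group size.
simple_average = sum(pct for pct, _ in groups) / len(groups)

# Weighted average: each percentage is weighted by its group size.
total_size = sum(size for _, size in groups)
weighted_average = sum(pct * size for pct, size in groups) / total_size

def percent_decrease(old, new):
    """Percentage decrease, taking the higher (old) figure as the base."""
    return (old - new) / old * 100

print(simple_average)            # unweighted mean of 20% and 60%
print(weighted_average)          # same as 100 "Yes" answers out of 200 respondents
print(percent_decrease(80, 20))  # a fall from 80 to 20
```

The weighted figure matches the percentage obtained by pooling both groups, which is why the unweighted average can mislead when group sizes differ.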
Statistics in Medical Research
• Documentation of medical history of diseases,
their progression, variability between patients,
association with age, gender, etc.
• Efficacy of various types of therapy
• Definition of normal range
• Epidemiological studies
Statistics in Medical Research
• Study the effect of environment, socio-
economic and seasonal factors
• Provide assessment of the state of health in the
community, and of met and unmet needs
• Success/failure of specific health programme
• Promote health legislation
• Evaluate total health programme of action
Statistics in Medical Research - Limitation
• Does not deal with individual facts
• Conclusions are not exact
• Can be misused
• Cannot be handled properly by untrained people
Normal distribution
• Represented by an infinite family of curves,
each defined uniquely by 2 parameters: the mean
and the SD of the population
• The curves are always symmetrical and bell
shaped. The width of the curve is defined by
the population SD
Normal distribution
• Mean, median and mode coincide
• It extends from - ∞ to + ∞
• Symmetrical about the mean
• Approx 68% of distribution is within 1SD of
mean (68.27%)
- Approx 95% is within 2SD (exactly 95% within 1.96 SD)
- Approx 99.7% is within 3SD (99% within 2.58 SD)
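These coverage figures can be reproduced with the standard library. The sketch below relies on the identity that the proportion of a normal distribution within k SD of the mean equals erf(k/√2); the function name `coverage_within` is an assumption made for the example.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the coverage for the multiples of SD mentioned on the slide.
for k in (1, 1.96, 2, 2.58, 3):
    print(f"within {k} SD: {coverage_within(k) * 100:.2f}%")
```

This shows why 1.96 and 2.58 are quoted alongside 2 SD and 3 SD: they are the multipliers that give almost exactly 95% and 99%.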
Normal distribution
• The total area under the curve is 1
• The value of measure of skewness is zero. It is
not skewed
• The curve is asymptotic. It approaches but
never touches baseline at extremes
• Practically all of the area (about 99.7%) lies between -3σ
distance on the left and +3σ distance on the right
Normal distribution - Uses
• Construct confidence interval
• Many statistical techniques make an
underlying assumption of normality
• Distribution of sample means is approximately normal for large samples
• Normality is important in statistical inference
Skewness
• Measure of lack of symmetry in a distribution
• Positively skewed
- Right tail is longer
- Mass of distribution is concentrated on left
side
- Distribution is said to be right skewed
Negatively skewed
• Left tail is longer
• Mass of distribution is concentrated on right
side
• Distribution is said to be left skewed
• Value of skewness is 0 for normal distribution
Kurtosis
• Measure of degree of peakedness in distribution
• For normal distribution, value of kurtosis is 3
• Leptokurtic – High peakedness
• Mesokurtic – normal
• Platykurtic – Low peakedness
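A minimal sketch of how skewness and kurtosis might be computed, assuming the moment-based definitions (third and fourth central moments scaled by powers of the second). The slides do not state which formula is intended, so this choice, the sample data and the function names are all assumptions.

```python
import random

def central_moment(data, order):
    """Average of (x - mean) ** order over all observations."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)

def skewness(data):
    """Moment coefficient of skewness: 0 for a symmetric distribution."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Moment coefficient of kurtosis: about 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]          # long right tail
left_skewed = [-x for x in right_skewed]    # mirror image, long left tail

# Simulated normal data (fixed seed so the run is repeatable).
rng = random.Random(0)
normal_sample = [rng.gauss(0, 1) for _ in range(20000)]

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
print(kurtosis(normal_sample))
```

The sign of the skewness follows the longer tail: positive for the right-skewed data, negative for the left-skewed data.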
Descriptive statistics
• Measures of location
- Central tendency
- Mean, median and mode
• Measures of variation
- Dispersion
- Range, quartile, IQR, variance and SD
Mean
• Sum of all observations divided by total no. of
observations
Mean - merits
• Well understood by most people
• Computation of mean is easy
• More stable
• All items in a series are taken into account
• Used in further statistical calculation
• Good basis for comparison
Mean - Demerits
• Affected by extreme values
• Cannot be computed by mere observation
• Not suitable for skewed distribution
• May not be an actual item
• Cannot be used for qualitative data
Median
• Middle most observation when data are
arranged in ascending/descending order of
magnitude
• Divides the data into 2 halves such that no. of
items below it is same as no. of items above
Median
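Odd n: median = $\left(\frac{n+1}{2}\right)$th ordered observation
Even n: median = mean of the $\left(\frac{n}{2}\right)$th and $\left(\frac{n}{2}+1\right)$th ordered observations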
Median - Merits
• Widely used measure of central tendency
• Not influenced by extreme values
• Can be determined even if extremes are not known
• Useful for skewed distribution
Median - Demerits
• When no. of items is small, median may not
be representative
• It is affected by frequency of neighboring
items
• Not a typical representation of series
Mode
• Most frequently occurring observation in data
• If all values are different, there is no mode
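The three measures of central tendency can be compared side by side with Python's `statistics` module. The data sets and the helper `mode_or_none` are hypothetical, introduced only to show how an extreme value moves the mean but not the median, and how the "no mode" case might be handled.

```python
import statistics
from collections import Counter

def mode_or_none(data):
    """Most frequent value, or None when every value occurs only once.
    If several values tie, the one seen first is returned."""
    value, count = Counter(data).most_common(1)[0]
    return value if count > 1 else None

scores = [2, 3, 3, 5, 7]          # odd number of observations
with_outlier = [2, 3, 3, 5, 70]   # same data with one extreme value
even_count = [2, 3, 5, 7]         # even number of observations

print(statistics.mean(scores), statistics.mean(with_outlier))
print(statistics.median(scores), statistics.median(with_outlier))
print(statistics.median(even_count))   # mean of the two middle observations
print(mode_or_none(scores), mode_or_none(even_count))
```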
Mode - Merits
• Can be computed by mere observation
• Simple
• Precise
• Less time consuming
• Less strain
Mode - Demerits
• Not amenable to further algebraic
treatment
• Not rigidly defined
• Affected by no. of frequency of items
Measures of Dispersion (variation)
• Range
• Interquartile range
• Variance
• Standard Deviation
Range
• Difference between largest and smallest value
Range = Largest no. – Smallest no.
Quartile
• Values that divide data into 4 equal parts when
data is arranged in ascending order
Q1 = [(n+1)/4]th ordered observation
Q2 = [2(n+1)/4]th ordered observation
Q3 = [3(n+1)/4]th ordered observation
Interquartile range
• Provides range which covers middlemost 50%
of observations
• Good measure of dispersion if there are
extreme values
IQR = Q3 – Q1
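A minimal sketch of the quartile positions and the IQR, assuming the k(n+1)/4 position rule with linear interpolation when the position falls between two observations. The interpolation step, the function name `quartile` and the sample data are assumptions; other quartile conventions exist and give slightly different values.

```python
def quartile(data, k):
    """k-th quartile (k = 1, 2, 3) at the [k(n+1)/4]th ordered observation."""
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 4      # 1-based position, may be fractional
    if position <= 1:
        return ordered[0]
    if position >= n:
        return ordered[-1]
    lower = int(position)           # observation just below the position
    fraction = position - lower
    # Interpolate linearly between the two neighbouring observations.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

values = [14, 2, 10, 6, 4, 12, 8]   # hypothetical observations, n = 7

q1, q2, q3 = (quartile(values, k) for k in (1, 2, 3))
iqr = q3 - q1
data_range = max(values) - min(values)

print(q1, q2, q3)
print("IQR =", iqr, "Range =", data_range)
```

With n = 7 the positions are the 2nd, 4th and 6th ordered observations, so no interpolation is needed in this particular run.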
Variance
• Sum of squares of difference of each
observation from mean, divided by n-1
Variance = $\frac{\sum (x - \bar{x})^2}{n - 1}$
Variance - Merits
• Easy to calculate
• Indicate the variability clearly
• Most informative
Variance - Demerits
• Variance is expressed in squared units, not in the units of the original observations
Standard Deviation (SD)
• Square root of variance
SD = $\sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
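The variance and SD definitions, with n-1 in the denominator, can be worked through step by step. The data below are hypothetical; the hand calculation is compared against `statistics.variance` and `statistics.stdev`, which also divide by n-1.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
n = len(data)

mean = sum(data) / n
# Sum of squared differences of each observation from the mean.
sum_of_squares = sum((x - mean) ** 2 for x in data)

variance = sum_of_squares / (n - 1)   # sample variance
sd = math.sqrt(variance)              # SD is the square root of the variance

print(mean, sum_of_squares)
print(variance, statistics.variance(data))
print(sd, statistics.stdev(data))
```

Because the differences are squared, the variance is in squared units, while the SD returns to the units of the original data.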
Standard Deviation - Merits
• Most widely used
• Used in calculating standard error
Standard Deviation -Demerits
• Lengthy process
• Gives greater weightage to extreme values