statistics · 2020. 4. 8.
TRANSCRIPT
STATISTICS
WHY??
• It is an extension of mathematics.
• 95% of medical students are not from a mathematical background, so statistics is difficult to digest.
1. Antibiotics reduce the duration of viral throat infections by 1-2 days.
2. Five per cent of women aged 30-49 consult their GP each year with heavy menstrual bleeding.
3. At our health centre, 50 patients were diagnosed with angina last year.
4. 95% of medical students are not from a mathematical background.
We use statistics every day, often without realising it.
WHY DO WE NEED STATISTICAL CALCULATION??
• We have a tendency to draw the strongest possible conclusion from a limited amount of data.
• The human brain is interested in finding patterns and relationships, but tends to overgeneralize them.
• Important differences are often hidden by variability or
study limitations.
STATISTICS
“There are three kinds of lies -- lies, damned lies, and statistics.”
Ø Collecting and Analysing data
Ø If you cannot distinguish Truth from Faulty Reasoning, then you are
vulnerable to manipulation and to decisions that are not in your best interest.
Ø Statistics provides tools that you need in order to react intelligently to
information you hear or read.
DEFINITIONS
STATISTICS: - the study of methods of collecting, classifying, presenting & analysing data & drawing scientific inferences from them.
- Collecting and Analysing data
We measure and analyze this variability to arrive at meaningful interpretations.
STATISTICS
EXIT POLL 2018
DEFINITIONS
STATISTICS: - Collecting and Analysing data
BIOSTATISTICS: - the branch of statistics concerned with medical science.
Mode of transmission of HIV
Exposure Route                              HIV Transmission Risk
Blood transfusion                           >98%
Perinatal                                   20-40%
Sexual intercourse                          0.1 to 1%
  Vaginal                                   0.05-0.1%
  Anal                                      0.065-0.5%
  Oral                                      0.005-0.01%
Injecting drug use                          0.67%
Needle stick exposure                       0.3%
Mucous membrane splash to eye, oro-nasal    0.09%
USES OF BIOSTATISTICS
Ø To measure the health problems of the community.
Ø To know the health needs of the community.
Ø To define the targets/goals for various health programmes.
Ø Planning & Interpretation of research projects.
Ø Evaluation of various health measures.
Ø To study trends in population.
Community Diagnosis
Researches
Impact evaluation
STATISTICS
DESCRIPTIVE STATISTICS
INFERENTIAL STATISTICS
1. DESCRIPTIVE STATISTICS
Descriptive statistics are just descriptive.
They do not involve generalizing beyond the data at hand.
A common and important use is in sports statistics.
Meaningful presentation of data can be:
Ø the tabular, graphical or pictorial display of data,
Ø condensation of large data into tables,
Ø preparation of summary measures to give a concise description of complex information and also to exhibit pattern that may be found in data sets.
2. INFERENTIAL STATISTICS
• Refers to decisions.
• Medical research doesn’t stop at just describing the characteristic of disease or situation.
• It tries to determine whether characteristics of a situation are unusual or if they have happened by chance.
DATA - CERTAIN KEY DEFINITIONS
Variable: A class of measurements or a characteristic on which individual observations or measurements are made is called a variable; examples include weight, height, and blood pressure, among others.
Data: Discrete observations of attributes or events that contain little meaning when considered alone.
Information: Data can be transformed into information by reducing them, summarizing them & adjusting them for variations such as age, gender etc.
FLOW OF STUDY/RESEARCH
Collection of data
Classification of data
Presentation of data
Analysis of data
Interpretation of data
DATA TYPES: DEPENDING ON SOURCE
1. Primary Data
• Original data which are collected and recorded by the investigator.
• They give first-hand information, e.g. experiments and surveys.
2. Secondary data
• Data collected by another person but used by the investigator for their own purposes.
• E.g. hospital records, published reports and articles.
DATA TYPES: DEPENDING ON NATURE
1. Qualitative (Categorical)
• Collected by counting the individuals having the same attribute or character.
• In this type of data there is only one variable, e.g. the number of persons (frequency).
2. Quantitative (Continuous/Discrete)
• In this type of data there are two variables: the attribute as well as its frequency.
• E.g. the number of children of a specific age, and measurements such as height and weight.
DATA- 4 SCALES OF MEASUREMENT (NOIR)
Nominal Ordinal Interval Ratio
Nominal
• Either this or that
• E.g. black/white, blood group, gender
• If there are only 2 categories: binary (dichotomous)
Ordinal
• Meaningful ORDER/RANK
• E.g. mild/moderate/severe; 1st, 2nd, 3rd
• The size of the interval carries no value, i.e. the difference between 1st & 2nd is not necessarily the same as that between 2nd & 3rd
Interval
• Scaled data with meaningful intervals
• E.g. 40 °C
Ratio
• Scaled data with decimal values
• E.g. weight, height
• 120/min is twice 60/min
Nominal and Ordinal scales: Qualitative data
Interval and Ratio scales: Quantitative data
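The difference between the interval and ratio scales can be checked with a little arithmetic. The sketch below is illustrative only: the 20 °C comparison value and the helper name `celsius_to_kelvin` are assumptions added for the example, and the slide's "120/mt" is read as a rate per minute. A ratio of two Celsius readings changes when the zero point is moved, whereas a ratio of two rates does not depend on an arbitrary zero.

```python
# Illustrative sketch: ratios are meaningful on a ratio scale,
# but not on an interval scale such as degrees Celsius.

def celsius_to_kelvin(celsius):
    """Shift the zero point from the Celsius zero to absolute zero."""
    return celsius + 273.15

# Interval scale: 40 vs 20 degrees Celsius (20 is an assumed comparison value).
ratio_celsius = 40 / 20
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

# Ratio scale: 120 per minute vs 60 per minute, as on the slide.
ratio_rate = 120 / 60

print(f"40 / 20 as Celsius numbers:    {ratio_celsius:.2f}")
print(f"Same temperatures in kelvin:   {ratio_kelvin:.3f}")
print(f"120 per minute / 60 per minute: {ratio_rate:.2f}")
```

The Celsius ratio of 2 disappears once the same temperatures are expressed in kelvin, so "twice as hot" is not a valid reading of 40 °C against 20 °C, while "twice as fast" is valid for the rates.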
EXAMPLES OF VARIOUS TYPES OF STATISTICAL DATA
Name of variable     Type of variable   List of Categories
Age                  Quantitative       0 to 100 years
Sex                  Qualitative        Male, Female
Height               Quantitative       100 to 200 cm
Income               Quantitative       Rs. 30 to 250 per day
Religion             Qualitative        Hindu, Muslim
Employment Status    Qualitative        Unemployed, Employed
RAW TO SYSTEMATIC
Ø The observations made on the subjects one after the other are called raw data.
Ø The first step in handling the data, after it has been collected is to ‘reduce’ and summarize it.
Ø Tables are used to categorize and summarize data while graphs are used to provide an overall visual representation.
Ø To develop Graphs and diagrams, we need to first of all, condense the data in a table
DATA TRANSFORMATION
Unordered/Raw Data
• Raw Data- accumulation of information
Ordered Data
• organized in order of magnitude from the smallest value to the largest or vice versa
Grouped Data-
Frequency Table
GROUPED DATA - FREQUENCY TABLE
Besides arranging the data in an ordered array, grouping the data is another useful way of summarizing them.
We classify the data in appropriate groups which are called “classes”.
The basic purpose behind classification or grouping is to help comparison and also to accommodate a large number of observations into a few classes only, by condensation so that similarities and dissimilarities can be easily brought out.
EXAMPLE- GROUPED DATA/FREQUENCY TABLE
Table 2: Age distribution of the 100 children
Age group (in months)   No. of children
1-4                     36
5-8                     33
9-12                    31
Total                   100
PRESENTATION OF DATA
Principles of Presentation
Ø To arrange the data in such a way that it arouses interest in a reader.
Ø To make the data sufficiently concise without losing important details.
Ø To present the data in simple form in order to make it possible to form some impressions and draw some conclusions directly or indirectly.
Ø To help in further statistical analysis.
Ø Presentation can be in 2 forms: Tabulation & Graphs
TABULATION
Tabulation is the presentation of data in a systematic and scientific way, in a form that brings out the special significance of the data.
STEPS:
To group a set of observations we select a set of contiguous, non-overlapping intervals such that each value in the set of observations can be placed in one and only one of the intervals.
These intervals are usually referred to as class intervals.
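The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the raw ages are hypothetical (they are not the 100 children from the slides), and the helper name `find_class` is invented for the example. It sorts the raw data, assigns each value to a class interval, and raises an error if a value does not fall in exactly one interval.

```python
# Sketch: sort raw observations and count them into class intervals.
from collections import Counter

# Hypothetical raw ages in months (illustrative values only).
raw_ages = [3, 11, 7, 1, 9, 5, 12, 4, 8, 2]

# Class intervals as inclusive (lower, upper) limits, as used in the slides.
class_intervals = [(1, 4), (5, 8), (9, 12)]

def find_class(value, intervals):
    """Return the one interval containing value; raise if there is not exactly one."""
    matches = [(lo, hi) for lo, hi in intervals if lo <= value <= hi]
    if len(matches) != 1:
        raise ValueError(f"{value} falls in {len(matches)} intervals, expected 1")
    return matches[0]

ordered_ages = sorted(raw_ages)  # ordered data
frequency = Counter(find_class(age, class_intervals) for age in ordered_ages)

for lo, hi in class_intervals:
    print(f"{lo}-{hi}: {frequency[(lo, hi)]}")
print("total:", sum(frequency.values()))
```

Raising an error on zero or multiple matches is one way to enforce the "one and only one interval" rule; overlapping limits such as 1-4 and 4-8 would be rejected for the value 4.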
TERMS USED IN CLASSIFICATION
CLASS LIMIT – Limit within which the class interval lies is known as class limit.
CLASS MAGNITUDE – It is the difference between the upper and lower levels of class limit.
CLASS FREQUENCY – Number of observations or items which fall in one class interval is known as class frequency.
For example the above data can be grouped into different age groups of 1-4, 5-8 and 9-12.
The class interval 1-4 includes the values 1, 2, 3 and 4.
The smallest value 1 is called its lower class limit whereas the highest value 4 is called its upper class limit.
The middle value of 1-4 i.e. 2.5 is called the midpoint or class mark.
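The terms above can be computed directly for the three age groups. The sketch below follows the definitions as given in these slides (class magnitude as upper limit minus lower limit); the function name `describe_class` is an assumption made for the example.

```python
# Sketch: class limits, class magnitude and midpoint for the slide's age groups.
class_intervals = [(1, 4), (5, 8), (9, 12)]

def describe_class(lower, upper):
    """Summarize one class interval using the definitions given in the slides."""
    return {
        "lower_limit": lower,
        "upper_limit": upper,
        # Class magnitude: difference between the upper and lower class limits.
        "magnitude": upper - lower,
        # Midpoint (class mark): the average of the two limits.
        "midpoint": (lower + upper) / 2,
    }

summaries = [describe_class(lo, hi) for lo, hi in class_intervals]
for s in summaries:
    print(f"{s['lower_limit']}-{s['upper_limit']}: "
          f"magnitude {s['magnitude']}, midpoint {s['midpoint']}")
```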
Points to remember:
Class limits should be so fixed as to display the main characteristics of the distribution accurately.
The magnitude of the class interval should be determined in accordance with the size of the data.
The class intervals should be of uniform magnitude and the class limits should be whole numbers.
The number of groups should be neither very small nor very large, and indeterminate classes should be avoided.
GUIDELINES FOR TABULATION
• The table should be attractive, impressive and clean.
• Due prominence should be given to title and sub-title.
• The title should be simple, clear and should give complete description.
• The complicated tables should be avoided.
• The rows and columns should be serialized.
• The unit of measurement should be specified and defined.
• The source of material should be given clearly.
• A table should be easily read and understood so that there may be no wastage of time.
• Full details of deliberate exclusion of observations in a collected series must be given.
CHECKLIST FOR DESIGNING A GOOD TABLE
Five support components are needed to describe the data displayed in a table:
1- The table title should give a clear and accurate description of the data. It should answer the three questions “what”, “where” and “when”. Be short and concise, and avoid using verbs.
When only one variable is involved the table is referred to as a one way table.
When two variables are involved the table is referred to as a cross tabulation or two way table.
Table 2: Age distribution of the 100 children
Age group (in months)   No. of children
1-4                     36
5-8                     33
9-12                    31
Total                   100
Table 2: Age distribution of the 100 children by gender
                        No. of children
Age group (in months)   Female   Male   Total
1-4                     14       22     36
5-8                     15       18     33
9-12                    16       15     31
Total                   45       55     100
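The margins of the two way table can be checked by summing the cells. The sketch below copies the cell counts from the table above; the variable names are illustrative assumptions.

```python
# Sketch: recompute row, column and grand totals of the two way table.

# Cell counts copied from the age group by gender table in the slides.
counts = {
    "1-4": {"Female": 14, "Male": 22},
    "5-8": {"Female": 15, "Male": 18},
    "9-12": {"Female": 16, "Male": 15},
}

# Row totals: children per age group.
row_totals = {age: sum(by_sex.values()) for age, by_sex in counts.items()}

# Column totals: children per gender.
column_totals = {}
for by_sex in counts.values():
    for sex, n in by_sex.items():
        column_totals[sex] = column_totals.get(sex, 0) + n

grand_total = sum(row_totals.values())

print("Row totals:", row_totals)
print("Column totals:", column_totals)
print("Grand total:", grand_total)
```

The row totals reproduce the one way table (36, 33, 31), which shows that a one way table is simply the margin of the corresponding two way table.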
GRAPHICAL PRESENTATION OF DATA
[Bar chart of Table 2: No. of children (vertical axis, 28 to 38) for the age groups 1-4 m, 5-8 m and 9-12 m]
THE ADVANTAGES OF GRAPHS VERSUS TABLES
• Tables have the advantages of:
Ø displaying more complex data with precision and flexibility
Ø requiring less technical skill or facilities to prepare
Ø using less space for a given amount of information.
• Graphs have the advantages of:
Ø simplicity and clarity
Ø memorable visual images
Ø being able to show complex relationships.
GRAPHS
DATA           SCALE                             GRAPH/DIAGRAM
Qualitative    Nominal, Ordinal                  Bar Diagram; Pie Chart; Pictogram; Map Diagram/Spot Map
Quantitative   Interval (Scale), Ratio (Scale)   Histogram; Frequency Polygon/curve line (Chart Scatter Diagram); Cumulative Frequency Curve; Line Chart Graph
GRAPHS
Graphs for Qualitative Data:
1. Bar Chart
2. Pie Chart
3. Component Band Chart
4. Pictogram
5. Rate maps & Spot Maps
BAR CHARTS
Bar charts are best suited for displaying numbers or percentages that compare two or more categories of data.
For Qualitative data.
The bars should always be of the same width, and they should be shaded or cross-hatched so that there is no confusion between the bars & the intervening space.
For several comparisons of the totals, the bars can be categorised or sub-divided (multiple or component bar charts).
PIE CHART
Simple representation of categorical/qualitative data.
Each slice of the pie corresponds to the frequency of one category, or to its proportion of the total quantity.
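As a small worked example, the share of each slice can be computed from the frequencies. The sketch below uses the age group counts from Table 2 of these slides; the variable names are illustrative assumptions.

```python
# Sketch: convert category frequencies into pie chart percentages.

# Frequencies from the slides' table of 100 children by age group.
frequencies = {"1-4 m": 36, "5-8 m": 33, "9-12 m": 31}

total = sum(frequencies.values())

# Each slice's share of the whole, as a percentage.
percentages = {label: 100 * n / total for label, n in frequencies.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct:.0f}%")
```

Because the total here is exactly 100 children, the percentages equal the raw counts; with any other total the division step is what makes the slices comparable.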
PIE CHART
[Pie chart: Age. 1-4 m: 36 (36%); 5-8 m: 33 (33%); 9-12 m: 31 (31%)]
PIE CHARTS & COMPONENT BAND CHARTS
Ø Pie charts and component band charts display how a whole entity is divided into its parts.
Ø A pie chart represents this information with a circle and a component band chart represents it with a bar – both are divided into sections representing the different components.
Ø In general, it is better to use component band charts for comparing how two or more whole entities are divided into their component parts than it is to place pie charts side by side.
PIE CHART
For pie charts, a useful rule is to place the pieces of the pie in order according to their size, starting at the equivalent of twelve o’clock and then progressing clockwise.
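The ordering rule can be sketched as a layout calculation. In the example below the age group counts come from the slides but are deliberately listed out of order; the angle convention (degrees measured clockwise from twelve o'clock) and the variable names are assumptions made for the illustration.

```python
# Sketch: order pie slices by size and lay them out clockwise from twelve o'clock.

# Counts from the slides, deliberately listed out of order.
frequencies = {"9-12 m": 31, "1-4 m": 36, "5-8 m": 33}

total = sum(frequencies.values())

# Largest slice first.
ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

layout = []
start = 0.0  # degrees clockwise from twelve o'clock
for label, n in ordered:
    sweep = 360 * n / total  # angle taken up by this slice
    layout.append((label, start, start + sweep))
    start += sweep

for label, begin, end in layout:
    print(f"{label}: from {begin:.1f} to {end:.1f} degrees")
```

With a plotting library, the same effect is usually obtained by sorting the values first and asking for a clockwise layout; in matplotlib, for instance, `pie` accepts `startangle=90` and `counterclock=False`.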
COMPONENT BAND CHARTS
PICTOGRAM
Ø It is a popular method of presenting data to laymen or illiterate persons, i.e. to those who cannot understand charts.
Ø Small pictures or symbols are used to present the data.
SPOT MAPS
Ø Spot maps and rate maps display geographical locations of cases or rates.
Ø John Snow used this spot map to display where the cases of cholera occurred relative to the famous pump (Figure 4.1).
RATE MAPS
Rate maps are slightly different in that geographical areas are shaded according to the differences in values; prevalence, incidence or mortality are often shown on rate maps.
Areas with the highest rates are typically shaded with the darkest shades or the brightest colors.
State wise number of malaria cases in India during 2007–2017
GRAPHS
Graphs for Quantitative Data:
1. Histogram
2. Frequency Polygon/frequency curve lines (Chart scatter diagram)
3. Cumulative frequency graphs (Ogive)
4. Line Chart Graph
BAR CHART → PIE
BAR CHART → HISTOGRAM??
HISTOGRAM
Ø It is a graphical presentation of a frequency distribution. It looks like a bar chart with all the bars stacked together in an orderly fashion, with no space between the bars.
Ø It is used for quantitative, continuous data: on the X-axis we plot the class intervals (of the exclusive type) and on the Y-axis we plot the frequencies.
Ø The heights of the bars represent either the number or percentage of observations within each interval.
HISTOGRAM (with normal distribution curve)
FREQUENCY POLYGONS
Ø It is a diagram made over a histogram.
Ø It is drawn by plotting points at the midpoint of each class interval and connecting them with straight lines, drawn from each point to the next.
The bell-shaped curve of the normal distribution is one important example (Figure 4.3).
EXAMPLE – DRAW A HISTOGRAM & FREQUENCY POLYGON OF THE DATA GIVEN:
Age group (in years)   Number of persons
0-10                   8
10-20                  6
20-30                  9
30-40                  4
40-50                  3
50-60                  5
[Figure: HISTOGRAM SHOWING DISTRIBUTION OF NUMBER OF PERSONS WITH AGE-GROUP. Horizontal axis: age groups in years (10 to 60); vertical axis: number of persons (0 to 9).]
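The example can be worked through in code. The sketch below uses the six class intervals and frequencies given in the exercise; the text rendering of the bars and the variable names are assumptions chosen to keep the example free of plotting libraries. It prints one bar per class and computes the points of the frequency polygon at the class midpoints.

```python
# Sketch: histogram bars and frequency polygon points for the exercise data.

# Class intervals (in years) and frequencies from the exercise.
class_intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [8, 6, 9, 4, 3, 5]

# Histogram: one bar per class, bar length equal to the frequency.
for (lower, upper), f in zip(class_intervals, frequencies):
    print(f"{lower:>2}-{upper:<2} | {'#' * f} {f}")

# Frequency polygon: a point at the midpoint of each class, joined in order.
polygon_points = [
    ((lower + upper) / 2, f)
    for (lower, upper), f in zip(class_intervals, frequencies)
]
print("Polygon points:", polygon_points)

# Bar heights can also be expressed as percentages of all observations.
total_persons = sum(frequencies)
percentages = [100 * f / total_persons for f in frequencies]
print("Total persons:", total_persons)
```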
LINE GRAPHS
Ø Line graphs are best suited for displaying the amount of change or difference in a continuous variable, which is usually shown on the vertical axis.
CUMULATIVE FREQUENCY
Marks      Frequency   Cumulative Frequency
31 – 40    5           5
41 – 50    12          17
51 – 60    26          43
61 – 70    13          56
71 – 80    5           61
81 – 90    3           64
Cumulative frequency is a running total.
Check that the final value (64) is the same as the total number of your pieces of data.
CUMULATIVE FREQUENCY GRAPH (OGIVE)
[Graph: cumulative frequency (Cf, vertical axis, 0 to 70) plotted against marks (x, horizontal axis, 30 to 100)]
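The running total can be reproduced with a short script. The sketch below uses the marks table from the slides; plotting each cumulative value against the upper class limit is a common convention for an ogive and is an assumption here, since the slides do not state which point of the interval is used.

```python
# Sketch: cumulative frequencies and ogive points for the marks table.
from itertools import accumulate

# Class intervals and frequencies from the slides.
upper_limits = [40, 50, 60, 70, 80, 90]
frequencies = [5, 12, 26, 13, 5, 3]

# Cumulative frequency is a running total of the frequencies.
cumulative = list(accumulate(frequencies))

# The last running total must equal the number of observations.
assert cumulative[-1] == sum(frequencies)

# Ogive points: (upper class limit, cumulative frequency), an assumed convention.
ogive_points = list(zip(upper_limits, cumulative))

for limit, cf in ogive_points:
    print(f"up to {limit} marks: {cf}")
```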
WHEN TO DO WHAT
Data                                                        Method Used
Frequency of Occurrence (Comparison of Magnitude)           Bar Chart/Pie Chart
Trends over time                                            Line Graph
Distribution (not related to time)                          Histogram/Frequency Polygon
Association (Looking for correlation between 2 variables)   Scatter diagram
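The table above can be read as a simple lookup. The sketch below encodes it as a dictionary; the shortened key names and the function name `suggest_chart` are assumptions made for the illustration, not part of the slides.

```python
# Sketch: the "when to do what" table as a lookup.

CHART_FOR_PURPOSE = {
    "frequency of occurrence": ["Bar Chart", "Pie Chart"],
    "trends over time": ["Line Graph"],
    "distribution": ["Histogram", "Frequency Polygon"],
    "association": ["Scatter Diagram"],
}

def suggest_chart(purpose):
    """Return the chart types listed in the table for a given purpose."""
    key = purpose.strip().lower()
    if key not in CHART_FOR_PURPOSE:
        raise KeyError(f"No entry in the table for: {purpose!r}")
    return CHART_FOR_PURPOSE[key]

print(suggest_chart("Association"))
print(suggest_chart("Trends over time"))
```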
MCQ
1. The best method to show association between Ht. & Wt. of children in a class: A. Bar Chart B. Histogram C. Pictogram D. Scatter Diagram
2. Trends can be represented by: A. Line Diagram B. Scatter Diagram C. Pictogram D. Histogram
3. Frequency distribution is studied by: A. Pie Diagram B. Histogram C. Pictogram D. Line Diagram
MCQ
1. Histogram is used to describe:
a. Quantitative data of a group of patients
b. Qualitative data of a group of patients
c. Data collected on Nominal Scale
d. Data collected on Ordinal Scale
2. Ogive is:
a. Bar Chart b. Histogram c. Cumulative frequency curve d. Frequency polygon
THANKS..