Algonquin College - Jan Ladas 2
BIOSTATISTICS CONTINUED
Previously discussed:
- Descriptive statistical techniques
- The first measures of spread / central tendency
Information about central tendency is important. Equally important is information about the spread of data in a set.
VARIABILITY/DISPERSION
Three terms associated with variability / dispersion:
- Range
- Variance
- Standard Deviation
(They describe the spread around the central tendency)
VARIABILITY/DISPERSION
Range:
The numerical difference between the highest and lowest scores
Subtract the lowest score from the highest score
e.g.: c = {19, 21, 73, 4, 102, 88}
Range = 102 – 4 = 98
N.B.: easy to find but unreliable, because it depends only on the two most extreme scores
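The subtraction above can be checked with a minimal Python sketch. The variable names are illustrative assumptions, not part of the slides.

```python
# Scores from the example set c on this slide.
scores = [19, 21, 73, 4, 102, 88]

# Range = highest score minus lowest score.
data_range = max(scores) - min(scores)

print(data_range)  # 98
```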
VARIABILITY/DISPERSION
Variance: The measure of average deviation or spread of scores around the mean
- Based on each score in the set
Calculation:
1. Obtain the mean of the distribution
2. Subtract the mean from each score to obtain a deviation score
3. Square each deviation score
4. Add the squared deviation scores
5. Divide the sum of the squared deviation scores by the number of subjects in the sample
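The five calculation steps can be followed one at a time in a short Python sketch. It reuses the scores from the Range example as an assumption, since this slide gives no data of its own. It divides by the number of subjects, as step 5 states. Many texts divide by n - 1 instead when a sample is used to estimate a population variance (`statistics.variance` in Python), and that variant is not shown here.

```python
import statistics

# Assumed data: the set used in the Range example.
scores = [19, 21, 73, 4, 102, 88]

mean = sum(scores) / len(scores)              # step 1: mean of the distribution
deviations = [x - mean for x in scores]       # step 2: deviation scores
squared = [d ** 2 for d in deviations]        # step 3: squared deviation scores
sum_of_squares = sum(squared)                 # step 4: add them up
variance = sum_of_squares / len(scores)       # step 5: divide by number of subjects

# Cross-check against the standard library's divide-by-n variance.
library_variance = statistics.pvariance(scores)

print(round(variance, 2))          # 1431.14
print(round(library_variance, 2))  # 1431.14
```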
VARIABILITY/DISPERSION
Standard Deviation of a set of scores is the positive square root of the variance
- a number which tells how much the data are spread around the mean
Interpretation of Variance and Standard Deviation:
“The greater the dispersion around the mean of the distribution, the greater the standard deviation and variance”
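The interpretation can be illustrated with two small data sets that share the same mean but differ in spread. This is a sketch only: both sets are hypothetical, and the helper `std_dev` is an illustrative name that uses the divide-by-n variance.

```python
import math

# Hypothetical scores: both sets have a mean of 50.
narrow = [49, 50, 50, 51]   # scores close to the mean
wide = [10, 30, 70, 90]     # scores far from the mean

def std_dev(scores):
    """Positive square root of the divide-by-n variance."""
    mean = sum(scores) / len(scores)
    variance = sum((x - mean) ** 2 for x in scores) / len(scores)
    return math.sqrt(variance)

print(round(std_dev(narrow), 2))  # 0.71
print(round(std_dev(wide), 2))    # 31.62
```

The wider set has the larger standard deviation, and therefore the larger variance.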
KURTOSIS
Kurtosis of a data set relates to how tall and thin, or short and flat the data set is.
- Leptokurtic = tall and thin
- Mesokurtic = normal, about average
- Platykurtic = short and flat
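The slides give no formula for kurtosis. One common way to put a number on it, assumed here, is the excess kurtosis: the fourth central moment divided by the squared variance, minus 3, so that a normal curve scores about 0. The data sets and the `tolerance` cut-off below are hypothetical choices for illustration.

```python
def excess_kurtosis(scores):
    """Fourth central moment / squared variance, minus 3 (normal curve is about 0)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return m4 / m2 ** 2 - 3

def describe(scores, tolerance=0.5):
    """Label a data set; `tolerance` is an arbitrary illustrative cut-off."""
    k = excess_kurtosis(scores)
    if k > tolerance:
        return "leptokurtic"
    if k < -tolerance:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # evenly spread scores
peaked = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]    # most scores piled at the centre

print(describe(flat))    # platykurtic
print(describe(peaked))  # leptokurtic
```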
NORMAL CURVE (BELL)
- A population distribution which appears very commonly in life science
- Bell-shaped curve that is symmetrical around the mean of the distribution
- Called “normal” because its shape occurs so often
- May vary from narrow (pointy) to wide (flat) distribution
- The mean of the distribution is the focal point from which all assumptions may be made
- Think in terms of percentages – easier to interpret the distribution
THE NORMAL CURVE
The most used frequency distribution in biostatistics.
Characteristics:
1. Total area under the curve is equal to 1.00 or 100%
2. Mean = mode = median
3. The area under the curve is broken into equal segments which are one standard deviation in width
4. The proportion of area under the curve between:
A. the mean and 1 SD (+ or -): 34.13%
B. the 1st and 2nd SD: 13.59%
C. the 2nd and 3rd SD: 2.14%
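The areas between whole standard deviations can be computed from the cumulative distribution of a standard normal curve. This is a minimal sketch using Python's `statistics.NormalDist`. The helper name `area_between` is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def area_between(z_low, z_high):
    """Proportion of the area under the curve between two z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

areas = {
    "mean to 1 SD": area_between(0, 1),
    "1st to 2nd SD": area_between(1, 2),
    "2nd to 3rd SD": area_between(2, 3),
}

for label, area in areas.items():
    print(f"{label}: {area * 100:.2f}%")
# mean to 1 SD: 34.13%
# 1st to 2nd SD: 13.59%
# 2nd to 3rd SD: 2.14%
```

Because the curve is symmetrical, doubling the three areas gives the share of scores within 3 SD of the mean, about 99.7%.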
RESEARCH TECHNIQUES
Inferential Statistics (Statistical Inference):
- Techniques used to provide a basis for generalizing about the probable characteristics of a large group when only a portion of the group is studied
- The mathematical result can be applied to the larger population
DEFINITIONS RELATING TO RESEARCH TECHNIQUES
Population:
- Entire group of people, items, materials, etc. with at least one basic defined characteristic in common
- Contains all subjects of interest
- A complete set of actual or potential observations
e.g.: all Ontario dentists or all brands of toothpaste
Sample:
- A subset (representative portion) of the population
- Does not have exactly the same characteristics as the population but can be made truly representative by using probability sampling methods and by using an adequate sample size (5 types of “sampling”)
DEFINITIONS RELATING TO RESEARCH TECHNIQUES
Parameters:
- Numerical descriptive measures of a population obtained by collecting a specific piece of information from each member of the population
- Number inferred from sample statistics
e.g.: 2,000 women over age 50 with heart disease
DEFINITIONS RELATING TO RESEARCH TECHNIQUES
Statistic: A number describing a sample characteristic.
Results from manipulation of sample data according to certain specified procedures
A characteristic of a sample chosen for study from the larger population
e.g.: 210 women out of 500 with diabetes have heart problems
DEFINITIONS RELATING TO RESEARCH TECHNIQUES
Statistics:
- Characteristics of samples used to infer parameters (characteristics of populations)
- A set of tools for collecting / organizing, presenting and analyzing numerical facts or observations
Survey:
- The process of collecting descriptive data from a population
SAMPLING PROCEDURES
5 Types of Samples:
1. A random sample – by chance
2. A stratified sample – categorized then random
3. A systematic sample – every nth item
4. A judgment sample – prior knowledge
5. A convenience sample – readily available
RANDOM SAMPLE
1. A random sample is one in which every element in the population has an equal and independent chance of being selected. This method is preferred when possible because it equalizes the effect of variables not under investigation but which may influence the observations. It also controls possible selection bias on the part of the researcher.
e.g.: a sample of 1,000 out of 5,000 students from 50 universities; lottery numbers or names drawn from a hat
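Drawing names from a hat can be sketched with Python's `random.sample`, which selects without replacement. The student identifiers and the fixed seed are hypothetical and are there only to make the sketch reproducible.

```python
import random

# Hypothetical population of 5,000 students.
population = [f"student_{i}" for i in range(1, 5001)]

# A fixed seed is used only so the sketch gives the same draw each run.
rng = random.Random(42)

# Draw 1,000 students without replacement.
sample = rng.sample(population, k=1000)

print(len(sample))       # 1000
print(len(set(sample)))  # 1000 (no student is drawn twice)
```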
STRATIFIED RANDOM SAMPLE
2. Stratified random sampling is employed when it may be necessary to select elements of the population according to certain sub groups or categories, e.g. age or gender. This method allows for the control of the variable on which categorization is made. Sample subjects are then randomly chosen from the population making up each category.
e.g.: list of names per university – random selection of 1/5 of the names
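The "categorize, then draw at random" idea can be sketched as follows. The university names, list sizes and the helper `stratified_sample` are hypothetical and used for illustration only.

```python
import random

# Hypothetical lists of names, one list (stratum) per university.
strata = {
    "University A": [f"A_{i}" for i in range(100)],
    "University B": [f"B_{i}" for i in range(50)],
    "University C": [f"C_{i}" for i in range(25)],
}

def stratified_sample(strata, one_in, rng):
    """Randomly pick 1/one_in of the names within each stratum."""
    chosen = {}
    for name, members in strata.items():
        k = len(members) // one_in
        chosen[name] = rng.sample(members, k)
    return chosen

rng = random.Random(0)  # fixed seed for reproducibility only
chosen = stratified_sample(strata, one_in=5, rng=rng)

for name, members in chosen.items():
    print(name, len(members))
# University A 20
# University B 10
# University C 5
```

Every category is represented in proportion to its size, which a single random draw from the pooled list would not guarantee.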
SYSTEMATIC SAMPLE
3. Systematic samples are selected by deciding to observe every nth item in the population. This method is not random because not every element in the population has an equal and independent chance for selection.
e.g.: every 5th name from a list; odd or even numbers
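Taking every nth item from a list maps directly onto list slicing. This is a sketch with a hypothetical list of 20 names. The helper `systematic_sample` and its `start` parameter are illustrative assumptions.

```python
# Hypothetical ordered list of 20 names.
names = [f"name_{i}" for i in range(1, 21)]

def systematic_sample(items, n, start=None):
    """Take every nth item; `start` is the 0-based index of the first item taken."""
    if start is None:
        start = n - 1  # default: begin with the nth item itself
    return items[start::n]

every_fifth = systematic_sample(names, 5)
print(every_fifth)  # ['name_5', 'name_10', 'name_15', 'name_20']
```

Once the starting point is fixed, every other selection is determined, which is why the elements do not have independent chances of selection.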
JUDGEMENT SAMPLE
4. A judgement sample has characteristics similar to those of a stratified random sample. The sample is selected when the researcher, with prior knowledge of the population or question under investigation, arbitrarily chooses certain criteria for representation, e.g.: income, educational level, place of residence, etc.
Could be biased.
CONVENIENCE SAMPLE
5. A convenience sample is chosen because it is most readily available. It may or may not be representative of the larger population. Convenience samples are often chosen on the basis of geographical accessibility.
Reliability is questionable – could be biased.
VARIABLES
The items of a study that are measured.
Independent Variable(s) (intervention):
- All the factors that influence the characteristics which are under investigation
- Some of the Independent Variables will be manipulated as part of the study or experiment = “controlled”
e.g.: age, gender, type of oral hygiene aid, amount of drug administered
VARIABLES
Independent Variable(s) (intervention):
“Uncontrolled” variables cannot be manipulated:
- Subject’s prior experience
- Subject’s knowledge base
- Subject’s emotional state
- Subject’s values, beliefs
e.g.: dental hygienist evaluating tooth brushing method for children = “controlled variable”
VARIABLES
Dependent Variable(s):
- The measurable result or outcome which the researcher hopes will change or not change as a result of the intervention
- Their values are determined by all of the independent variables operational at the time of the study (both controlled and uncontrolled)
N.B.: called dependent because the result depends on the independent variable
e.g.: subject’s plaque scores / gingival condition (measured before and after)
Result depends on method used.
POTENTIAL PATHOGENS ON NON-STERILE GLOVES
1. Method = experimental
- Brief outline of experiment
2. Independent variables = items of a study that are measured = the intervention
- Gloves – material and origin
- Petri dishes with growth substances
- Time and temperature of incubation
- Testing methods for identification
- Soap – type, amount and use
- Air exposure etc.
POTENTIAL PATHOGENS ON NON-STERILE GLOVES
Dependent variable = measurable result
= The types and numbers of micro-organisms found on the tested gloves
CONCEPT OF SIGNIFICANCE
Probability – P (symbol)
When using inferential statistics, we often deal with statistical probability.
Probability is the expected relative frequency of a particular outcome by chance, or the likelihood of something occurring
e.g.: coin toss
PROBABILITY
Rules of probability:
1. The (P) of any one event occurring is some value from 0 to 1 inclusive
2. The sum of the probabilities of all possible outcomes in an experiment must equal 1
* Numerical values can never be negative nor greater than 1
P = 0: non-event (the event will never happen)
P = 1: the event will always happen
PROBABILITY
Calculating probability:
P = (number of possible successful outcomes) / (number of all possible outcomes)
e.g.: Coin flip:
1 successful outcome (heads) / 2 possible outcomes: P = .5 or 50%
e.g.: Throw of a die:
1 successful outcome / 6 possible outcomes: P = .167 or 16.7%
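The ratio of successful to possible outcomes can be computed exactly with Python's `fractions.Fraction`. This is a small sketch, and the function name `probability` is an illustrative assumption.

```python
from fractions import Fraction

def probability(successful, possible):
    """Successful outcomes divided by all possible outcomes, kept as an exact fraction."""
    return Fraction(successful, possible)

coin = probability(1, 2)  # heads on one coin flip
die = probability(1, 6)   # one chosen face on one throw of a die

print(float(coin))           # 0.5
print(round(float(die), 4))  # 0.1667

# The probabilities of all six faces of the die add up to 1.
total = sum(probability(1, 6) for _ in range(6))
print(total)  # 1
```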
HYPOTHESIS TESTING
The first step in determining statistical significance is to establish a hypothesis
A hypothesis is used to answer questions about differences or to test the credibility of a statement
e.g.: does brand X toothpaste really whiten teeth more than brand Y?
HYPOTHESIS TESTING
Null hypothesis (Ho) = there is no statistically significant difference between brand X and brand Y
Positive hypothesis = brand X does whiten more
* Ho – most often used as the hypothesis
* Ho – assumed to be true
Therefore the purpose of most research is to examine the truth of a theory or the effectiveness of a procedure and make them seem more or less likely!
HYPOTHESIS CHARACTERISTICS
A hypothesis must have these characteristics in order to be researchable:
Feasible
- Adequate number of subjects
- Adequate technical expertise
- Affordable in time and money
- Manageable in scope
Interesting to the investigator
Novel
- Confirms or refutes previous findings
- Extends previous findings
- Provides new findings
HYPOTHESIS CHARACTERISTICS
Ethical
Relevant
- To scientific knowledge
- To clinical and health policy
- To future research direction
SIGNIFICANCE LEVEL
A number (α = alpha) that acts as a cut-off point: when P falls below it, we agree that a difference exists = Ho is rejected. Alpha is almost always 0.01, 0.05 or 0.10.
Represents the amount of risk we are willing to take of being wrong in our conclusion
P < 0.10 = 10% chance
P < 0.05 = 5% chance
P < 0.01 = 1% chance (cautious)
The cut-off point (critical value) is set before conducting the study (usually P < 0.05)
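The effect of the chosen cut-off can be seen in a hypothetical coin example that is not taken from the slides. Ho is "the coin is fair", and the observed result is 9 heads in 10 tosses. The sketch computes the chance of a result at least that extreme under Ho (a one-sided calculation, which is an assumption) and compares it with alpha.

```python
from math import comb

def p_at_least(heads, tosses):
    """Chance of `heads` or more heads in `tosses` flips of a fair coin."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses

def reject_null(p_value, alpha=0.05):
    """Ho is rejected when P falls below the cut-off alpha."""
    return p_value < alpha

p = p_at_least(9, 10)  # 11 / 1024

print(round(p, 4))                 # 0.0107
print(reject_null(p, alpha=0.05))  # True
print(reject_null(p, alpha=0.01))  # False
```

The same data lead to different conclusions at 0.05 and at 0.01, which is why the cut-off is fixed before the study is run.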
ERRORS
Type I (Alpha):
- Is made when we reject the null hypothesis when, in fact, it is true; this could lead to practicing worthless treatments that do not work.
Type II (Beta):
- Is made when we do not reject the null hypothesis when, in fact, it is false; this could lead to overlooking a promising treatment.
e.g.: the law – with “innocent” as Ho, convicting an innocent person is a Type I error and acquitting a guilty person is a Type II error
DEGREE OF FREEDOM (d.f.)
Most tests for statistical significance require application of the concept of d.f.
d.f. refers to the number of observed values which are free to vary after we have placed certain restrictions on the data collected
* d.f. usually equals the sample size minus 1
e.g.: 8, 2, 15, 10, 15, 7, 3, 12, 15, 13 (10 values, sum = 100)
Once the sum is fixed at 100, nine of the values are free to vary and the tenth is determined by the others
d.f. = number (10) minus 1 = 9
- Takes chance into consideration
- A penalty for uncertainty, so the larger the sample the less the penalty
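The ten values in the example can be used to show why only nine of them are free to vary once their total is fixed. This is a minimal sketch, and the variable names are illustrative assumptions.

```python
# The ten values from the example.
values = [8, 2, 15, 10, 15, 7, 3, 12, 15, 13]

total = sum(values)                   # the restriction placed on the data
degrees_of_freedom = len(values) - 1  # sample size minus 1

# With the total fixed, the first nine values may be anything,
# but the tenth is then forced to make up the difference.
free_values = values[:-1]
forced_last = total - sum(free_values)

print(total)               # 100
print(degrees_of_freedom)  # 9
print(forced_last)         # 13
```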