wayne daniel ppt

Upload: kmeena73

Post on 02-Jun-2018


  • 8/10/2019 Wayne Daniel Ppt

    1/186

    /04/34

    Lectures of Stat -145

    (Biostatistics)

    Textbook: Biostatistics: Basic Concepts and Methodology for the Health Sciences

    By

    Wayne W. Daniel

    Prepared By:

    Sana A. [email protected]

    Chapter 1

    Introduction To

    Biostatistics

    Text Book : Basic Concepts andMethodology for the Health Sciences 2

    Key words:

    Statistics, data, biostatistics, variable, population, sample

    Introduction: Some Basic Concepts

    Statistics is a field of study concerned with:

    1- the collection, organization, summarization, and analysis of data;

    2- the drawing of inferences about a body of data when only a part of the data is observed.

    Statisticians try to interpret and communicate the results to others.

    Data:

    The raw material of statistics is data. We may define data as figures. Figures result from the process of counting or from taking a measurement.

    For example:

    - When a hospital administrator counts the number of patients (counting).
    - When a nurse weighs a patient (measurement).

    * A variable:

    It is a characteristic that takes on different values in different persons, places, or things.

    For example:

    - heart rate,
    - the heights of adult males,
    - the weights of preschool children,
    - the ages of patients seen in a dental clinic.

    * Biostatistics:

    The tools of statistics are employed in many fields: business, education, psychology, agriculture, economics, etc.

    When the data analyzed are derived from the biological sciences and medicine, we use the term biostatistics to distinguish this particular application of statistical tools and concepts.

    Types of variables: quantitative and qualitative

    Quantitative Variables

    A quantitative variable can be measured in the usual sense.

    For example:

    - the heights of adult males,
    - the weights of preschool children,
    - the ages of patients seen in a dental clinic.

    Qualitative Variables

    Many characteristics are not capable of being measured. Some of them can be ordered (called ordinal) and some of them cannot be ordered (called nominal).

    For example:

    - classification of people into socio-economic groups,
    - hair color.

    Types of quantitative variables: discrete and continuous

    A discrete variable is characterized by gaps or interruptions in the values that it can assume.

    For example:

    - the number of daily admissions to a general hospital,
    - the number of decayed, missing, or filled teeth per child in an elementary school.

    A continuous variable can assume any value within a specified relevant interval of values assumed by the variable.

    For example:

    - height,
    - weight,
    - skull circumference.

    No matter how close together the observed heights of two people, we can find another person whose height falls somewhere in between.

    Types of qualitative variables: nominal and ordinal

    Nominal: as the name implies, it consists of naming or classifying into various mutually exclusive categories.

    For example:

    - male, female
    - sick, well
    - married, single, divorced

    Ordinal: whenever qualitative observations can be ranked or ordered according to some criterion.

    For example:

    - blood pressure level (high, good, low)
    - grades (excellent, very good, good, fail)

    * A population:

    It is the largest collection of values of a random variable for which we have an interest at a particular time.

    For example:

    The weights of all the children enrolled in a certain elementary school.

    Populations may be finite or infinite.

    * A sample:

    It is a part of a population.

    For example:

    The weights of only a fraction of these children.

    Exercises

    Question (6), Page 17; Question (7), Page 17 (Situation A, Situation B)

    Exercises:

    Q6: For each of the following variables, indicate whether it is a quantitative or a qualitative variable:

    (a) The blood type of some patient in the hospital. (Qualitative nominal)
    (b) Blood pressure level of a patient. (Qualitative ordinal)
    (c) Weights of babies born in a hospital during a year. (Quantitative continuous)
    (d) Gender of babies born in a hospital during a year. (Qualitative nominal)
    (e) The distance between the hospital and the house. (Quantitative continuous)
    (f) Under-arm temperature of day-old infants born in a hospital. (Quantitative continuous)

    Q7: For each of the following situations, answer questions a through d:

    (a) What is the population?
    (b) What is the sample in the study?
    (c) What is the variable of interest?
    (d) What is the type of the variable?

    Situation A: A study of 300 households in a small southern town examined whether each household had a school-age child present.

    (a) Population: All households in a small southern town.
    (b) Sample: 300 households in a small southern town.
    (c) Variable: Whether the household had a school-age child present.
    (d) The variable is qualitative nominal.

    Situation B: A study of 250 patients admitted to a hospital during the past year recorded how far the patients lived from the hospital.

    (a) Population: All patients admitted to a hospital during the past year.
    (b) Sample: 250 patients admitted to a hospital during the past year.
    (c) Variable: Distance the patient lived away from the hospital.
    (d) The variable is quantitative continuous.

    Choose the right answer:

    1. The variable is a
    a. subset of the population.
    b. parameter of the population.
    c. relative frequency.
    d. characteristic of the population to be measured.
    e. class interval.

    2. Which of the following is an example of a discrete variable?
    a. the number of students taking statistics in this term at KSU.
    b. the time to exercise daily.
    c. whether or not someone has a disease.
    d. height of certain buildings.
    e. level of education.

    3. Which of the following is not an example of a discrete variable?
    a. the number of students in the statistics class.
    b. the number of times a child cries in a certain street.
    c. the time to run a certain distance.
    d. the number of buildings in a certain street.
    e. the number of educated persons in a family.

    4. Which of the following is an example of a nominal qualitative variable?
    a. blood pressure level.
    b. the number of times a child brushes his/her teeth.
    c. whether or not someone fails an exam.
    d. weight of babies at birth.
    e. the time to run a certain distance.

    5. The continuous variable is a
    a. variable with a specific number of values.
    b. variable which cannot be measured.
    c. variable that takes on values within intervals.
    d. variable with no mode.
    e. qualitative variable.

    6. Which of the following is an example of a continuous variable?
    a. the number of visitors to the clinic yesterday.
    b. the time to finish the exam.
    c. the number of patients suffering from a certain disease.
    d. whether or not the answer is true.

    7. The discrete variable is a
    a. qualitative variable.
    b. variable that takes on values within an interval.
    c. variable with a specific number of values.
    d. variable with no mode.

    8. Which of the following is an example of a nominal variable?
    a. age of visitors to a clinic.
    b. the time to finish the exam.
    c. whether or not a person is infected by influenza.
    d. weight for a sample of girls.

    9. The nominal variable is a
    a. variable with a specific number of values.
    b. qualitative variable that cannot be ordered.
    c. variable that takes on values within an interval.
    d. quantitative variable.

    10. Which of the following is an example of a nominal variable?
    a. the number of persons who are injured in an accident.
    b. the time to finish the exam.
    c. whether or not the medicine is effective.
    d. socio-economic level.

    11. The ordinal variable is a
    a. variable with a specific number of values.
    b. variable that takes on values within an interval.
    c. qualitative variable that can be ordered.
    d. variable that has more than one mode.

    Chapter (2)

    Strategies for Understanding the Meanings of Data

    (Pages 19–27)

    Key words:

    Frequency table, bar chart, range, width of interval, mid-interval, histogram, polygon

    Descriptive Statistics

    Frequency Distribution for Discrete Random Variables

    Example:

    Suppose that we take a sample of size 16 from children in a primary school and get the following data about the number of their decayed teeth:

    3, 5, 2, 4, 0, 1, 3, 5, 2, 3, 2, 3, 3, 2, 4, 1

    To construct a frequency table, we need three columns:

    1. Variable name
    2. Frequency (f): how many times each value occurs
    3. Relative frequency (R.f) = Frequency / n

    No. of decayed teeth    Frequency    Relative frequency
    0                       1            0.0625
    1                       2            0.125
    2                       4            0.25
    3                       5            0.3125
    4                       2            0.125
    5                       2            0.125
    Total                   n = 16       1
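    The three-column table above can be reproduced with a short script. This is only a sketch: the helper name `frequency_table` is a hypothetical choice for illustration, and the data are the 16 observations from the example.

```python
from collections import Counter

# Number of decayed teeth for the 16 sampled children (data from the example).
teeth = [3, 5, 2, 4, 0, 1, 3, 5, 2, 3, 2, 3, 3, 2, 4, 1]


def frequency_table(values):
    """Return rows of (value, frequency, relative frequency), sorted by value."""
    n = len(values)
    counts = Counter(values)
    return [(value, counts[value], counts[value] / n) for value in sorted(counts)]


table = frequency_table(teeth)
for value, f, rf in table:
    print(f"{value:>5} {f:>9} {rf:>18.4f}")
# The frequencies must add up to n and the relative frequencies to 1.
print(f"Total {sum(f for _, f, _ in table):>9} {sum(rf for _, _, rf in table):>18.4f}")
```

    The last line is a quick consistency check: the frequency column sums to the sample size and the relative frequency column sums to 1.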

    Representing the simple frequency table using the bar chart

    [Bar chart: frequency (vertical axis, 0 to 6) against number of decayed teeth (horizontal axis, 0 to 5); bar heights 1, 2, 4, 5, 2, 2.]

    We can represent the above simple frequency table using the bar chart.

    From the chart we can get:

    1. The sample size?
    2. The number of children with 2 decayed teeth?
    3. The relative frequency of children with 4 decayed teeth?

    2.3 Frequency Distribution for Continuous Random Variables

    For large samples, we cannot use the simple frequency table to represent the data. We need to divide the data into groups, intervals, or classes.

    So, we need to determine:

    1- The number of intervals (k).
    Too few intervals are not good because information will be lost.
    Too many intervals are not helpful to summarize the data.

    2- The range (R).
    It is the difference between the largest and the smallest observation in the data set. [R = Max − Min]

    3- The width of the interval (w).
    Class intervals generally should be of the same width.

    Class interval    Frequency (f)
    30–39             11
    40–49             46
    50–59             70
    60–69             45
    70–79             16
    80–89             1
    Total             189

    Sum of frequencies = sample size = n

    The Cumulative Frequency:

    It can be computed by adding successive frequencies.

    The Cumulative Relative Frequency:

    It can be computed by adding successive relative frequencies.

    The Mid-interval:

    It can be computed by adding the lower bound of the interval to its upper bound and then dividing by 2. [(Lower bound + Upper bound)/2]

    For the above example, the following table represents the cumulative frequency, the relative frequency, the cumulative relative frequency, and the mid-interval.

    Class interval    Mid-interval    Frequency (f)    Cumulative frequency    Relative frequency (R.f)    Cumulative relative frequency
    30–39             34.5            11               11                      0.0582                      0.0582
    40–49             44.5            46               57                      0.2434                      -
    50–59             54.5            -                127                     -                           0.6720
    60–69             -               45               -                       0.2381                      0.9101
    70–79             74.5            16               188                     0.0847                      0.9948
    80–89             84.5            1                189                     0.0053                      1
    Total                             189                                      1

    R.f = freq / n
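    To check a completed table, the derived columns can be computed directly from the class intervals and frequencies. The following is a minimal sketch; the variable names are illustrative assumptions, and only the intervals and frequencies come from the table.

```python
from itertools import accumulate

# Class intervals and frequencies from the grouped frequency table.
intervals = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]
freqs = [11, 46, 70, 45, 16, 1]

n = sum(freqs)                                        # sample size
mid_intervals = [(lo + hi) / 2 for lo, hi in intervals]
cum_freqs = list(accumulate(freqs))                   # running total of frequencies
rel_freqs = [f / n for f in freqs]                    # R.f = freq / n
cum_rel_freqs = [cf / n for cf in cum_freqs]          # cumulative frequency / n

rows = zip(intervals, mid_intervals, freqs, cum_freqs, rel_freqs, cum_rel_freqs)
for (lo, hi), mid, f, cf, rf, crf in rows:
    print(f"{lo}-{hi}  {mid:>5}  {f:>3}  {cf:>3}  {rf:.4f}  {crf:.4f}")
```

    Because this sketch divides the unrounded cumulative frequency by n, its last printed digit can differ slightly from a table built by adding already-rounded relative frequencies.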

    Example:

    From the above frequency table, complete the table, then answer the following questions:

    1- The number of subjects with age less than 50 years?
    2- The number of subjects with age between 40 and 69 years?
    3- The relative frequency of subjects with age between 70 and 79 years?
    4- The relative frequency of subjects with age more than 69 years?
    5- The percentage of subjects with age between 40 and 49 years?
    6- The percentage of subjects with age less than 60 years?
    7- The range (R)?
    8- The number of intervals (k)?
    9- The width of the interval (w)?

    Representing the grouped frequency table using the histogram

    To draw the histogram, the true class limits should be used. They can be computed by subtracting 0.5 from the lower limit and adding 0.5 to the upper limit for each interval.
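    The adjustment by 0.5 can be sketched in a few lines. The function name `true_class_limits` and its `half_unit` parameter are hypothetical, introduced only for this illustration; 0.5 is appropriate here because the class limits are whole numbers.

```python
# Class intervals from the grouped frequency table (ages in whole years).
intervals = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89)]


def true_class_limits(lower, upper, half_unit=0.5):
    """Widen a class interval by half a unit on each side."""
    return lower - half_unit, upper + half_unit


true_limits = [true_class_limits(lo, hi) for lo, hi in intervals]
for (lo, hi), (true_lo, true_hi) in zip(intervals, true_limits):
    print(f"{lo}-{hi}  ->  {true_lo} to {true_hi}")
```

    With true class limits, the upper limit of each class coincides with the lower limit of the next, so the histogram bars touch without gaps.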

    Frequency    True class limits
    11           29.5

    Representing the grouped frequency table using the polygon

    [Frequency polygon: frequency (vertical axis, 0 to 80) plotted against the mid-intervals 34.5, 44.5, 54.5, 64.5, 74.5, 84.5.]

    Q1] For a sample of patients, we obtain the following graph for the approximate hours spent without pain after a certain surgery.

    1) The type of the graph is:
    (a) curve (b) not known (c) line (d) histogram (e) bar chart (f) polygon

    2) The number of patients who stayed the longest time without pain is:
    (a) 1 (b) 80 (c) 5 (d) 6 (e) 15 (f) 10

    3) The percent of patients who spent 3.5 hours or more without pain is:
    (a) 25% (b) 30% (c) 50% (d) 18.75% (e) 68.75% (f) 37.5%

    4) The lowest number of hours spent without pain is:
    (a) 6.5 (b) 25 (c) 5 (d) 0.5 (e) 1 (f) 10

    6) The mode equals:
    (a) we can't find it (b) 6 (c) 2, 4 (d) 15 (e) 3 (f) 80

    [H.W] The following histogram shows the frequency distribution of pathologic tumor size (in cm) for a sample of 110 cancer patients:

    1. The percent of cancer patients with approximate level of pathologic tumor size = 2 cm is:
    (a) 18% (b) 50% (c) 16.36% (d) 32.72% (e) 36% (f) 0%

    2. The number of cancer patients with the lowest pathologic tumor size is:
    (a) 0.5 (b) 3 (c) 11 (d) 13 (e) 15 (f) 24

    3. The approximate size of pathologic tumor with the highest percentage of patients is:
    (a) 3 (b) 1 (c) 36 (d) 110 (e) 32.72% (f) 16.36%

    4. What is the approximate value of the sample mean?
    (a) 18.33 (b) 36 (c) 1 (d) 1.75 (e) 1.586 (f) we can't find it

    5. The mode equals:
    (a) 1 (b) 3 (c) 36 (d) 55 (e) 110 (f) we can't find it

    6. The approximate value of the sample variance is:
    (a) 3 (b) 0.6 (c) 0.774 (d) 1.586 (e) 110 (f) we can't find it

    Exercises

    Pages: 31–34

    Questions: 2.3.2(a), 2.3.5(a)

    H.W.: 2.3.6, 2.3.7(a)

    Exercises:

    Q2.3.2: Janardhan et al. (A-2) conducted a study in which they measured incidental intracranial aneurysms (IIAs) in 125 patients. The researchers examined postprocedural complications and concluded that IIAs can be safely treated without causing mortality and with a lower complications rate than previously reported.

    The following are the sizes (in millimeters) of the 159 IIAs in the sample.

    Class interval    Frequency
    0-4               29
    5-9               87
    10-14             26
    15-19             10
    20-24             4
    25-29             1
    30-34             2
    Total             159

    (a) Use the frequency table to prepare:

    * A relative frequency distribution
    * A cumulative frequency distribution
    * A cumulative relative frequency distribution

    (b) What percentage of the measurements are between 10 and 14 inclusive?

    (c) How many observations are less than 20?

    (d) What proportion of the measurements are greater than or equal to 25?

    (e) What percentage of the measurements are either less than 10 or greater than 19?

    Q2.3.5: The following table shows the number of hours 45 hospital patients slept following the administration of a certain anesthetic.

    Class interval    Frequency
    1-5               21
    6-10              16
    11-15             6
    16-20             2
    Total             45

    (a) From these data construct:

    * A relative frequency distribution

    (b) How many of the measurements are greater than 10? Ans: 8

    (c) What percentage of the measurements are between 6 and 15? Ans: 49%

    (d) What proportion of the measurements is less than or equal to 15? Ans: 0.96

    Q2.3.6: The following are the numbers of babies born during a year in 60 community hospitals.

    Class interval    Frequency
    20-24             5
    25-29             6
    30-34             9
    35-39             3
    40-44             5
    45-49             8
    50-54             11
    55-59             13
    Total             60

    (a) From these data construct:

    * A relative frequency distribution

    Q2.3.7: In a study of physical endurance levels of male college freshmen, the following composite endurance scores based on several exercise routines were collected.

    Class interval    Frequency
    115-134           6
    135-154           7
    155-174           16
    175-194           31
    195-214           37
    215-234           28
    235-254           18
    255-274           8
    275-294           3
    295-314           1
    Total             155

    (a) From these data construct:

    * A relative frequency distribution

    Section (2.4):

    Descriptive Statistics: Measures of Central Tendency

    Pages 38–41

    Key words:

    Descriptive statistic, measure of central tendency, statistic, parameter, mean ($\mu$), median, mode.

    The Statistic and the Parameter

    A Statistic: It is a descriptive measure computed from the data of a sample.

    A Parameter: It is a descriptive measure computed from the data of a population.

    Since it is difficult to measure a parameter from the population, a sample of size n is drawn, whose values are $x_1, x_2, \ldots, x_n$. From these data, we measure the statistic.

    Measures of Central Tendency

    A measure of central tendency is a measure which indicates where the middle of the data is.

    The three most commonly used measures of central tendency are: the mean, the median, and the mode.

    The Mean:

    It is the average of the data.

    The Population Mean:

    $\mu = \dfrac{\sum_{i=1}^{N} X_i}{N}$, which is usually unknown, so we use the sample mean to estimate or approximate it.

    The Sample Mean:

    $\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$

    Example:

    Here is a random sample of size 10 of ages, where

    $x_1 = 42$, $x_2 = 28$, $x_3 = 28$, $x_4 = 61$, $x_5 = 31$, $x_6 = 23$, $x_7 = 50$, $x_8 = 34$, $x_9 = 32$, $x_{10} = 37$.

    $\bar{x} = (42 + 28 + \cdots + 37) / 10 = 36.6$
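    As a quick check of the arithmetic, the sample mean of the ten ages can be computed directly from its definition. This is a small sketch; `sample_mean` is a hypothetical helper name.

```python
# Random sample of 10 ages from the example.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)


mean_age = sample_mean(ages)
print(f"n = {len(ages)}, sum = {sum(ages)}, mean = {mean_age}")
```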

    Properties of the Mean:

    Uniqueness. For a given set of data there is one and only one mean.

    Simplicity. It is easy to understand and to compute.

    Affected by extreme values, since all values enter into the computation.

    Example: Assume the values are 115, 110, 119, 117, 121, and 126. The mean = 118.

    But assume that the values are 75, 75, 80, 80, and 280. The mean = 118, a value that is not representative of the set of data as a whole.
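    The effect of the extreme value can be seen by recomputing the mean with and without it. The sketch below uses the two data sets from the example; the variable names are illustrative only.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


clustered = [115, 110, 119, 117, 121, 126]
with_extreme = [75, 75, 80, 80, 280]
without_extreme = [x for x in with_extreme if x != 280]

print(mean(clustered))        # every value lies close to this mean
print(mean(with_extreme))     # same mean, but pulled up by the single value 280
print(mean(without_extreme))  # mean of the remaining four values
```

    Dropping the single value 280 moves the mean from 118 to 77.5, which is why the mean alone can misrepresent a data set that contains an extreme value.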

    The Median:

    When the data are ordered, it is the observation that divides the set of observations into two equal parts, such that half of the data are before it and the other half are after it.

    * If n is odd, the median will be the middle observation. It will be the ((n+1)/2)th ordered observation. When n = 11, the median is the 6th observation.

    * If n is even, there are two middle observations. The median will be the mean of these two middle observations, that is, the mean of the (n/2)th and (n/2 + 1)th ordered observations. When n = 12, the median is the 6.5th observation, which is halfway between the 6th and 7th ordered observations.

    Example:

    For the same random sample, the ordered observations will be: 23, 28, 28, 31, 32, 34, 37, 42, 50, 61.

    Since n = 10, the median is the 5.5th observation, i.e. (32 + 34)/2 = 33.
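    The odd and even cases of the rule translate into code as follows. This is a sketch rather than a library routine; `median_by_position` is a hypothetical name, and Python's 0-based indexing shifts each position by one.

```python
def median_by_position(values):
    """Median via the ordered-position rule for odd and even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th ordered observation is at index n // 2.
        return ordered[middle]
    # Even n: mean of the (n/2)th and (n/2 + 1)th ordered observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
median_age = median_by_position(ages)
print(sorted(ages), median_age)
```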

    Properties of the Median:

    Uniqueness. For a given set of data there is one and only one median.

    Simplicity. It is easy to calculate.

    It is not affected by extreme values as is the mean.

    The Mode:

    It is the value which occurs most frequently.

    If all values are different, there is no mode.

    Sometimes, there is more than one mode.

    Example:

    For the same random sample, the value 28 is repeated two times, so it is the mode.

    Properties of the Mode:

    Sometimes, it is not unique.

    It may be used for describing qualitative data.

    It is not affected by extreme values.
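    The three cases (one mode, several modes, no mode) can be sketched with a counter. The function name `modes` is hypothetical, and returning an empty list when every value occurs once is an assumption chosen to match the convention stated above.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # all values are different, so there is no mode
    return sorted(value for value, count in counts.items() if count == highest)


ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
print(modes(ages))                  # one mode
print(modes([1, 1, 2, 2, 3]))       # more than one mode
print(modes([1, 2, 3]))             # no mode
print(modes(["A", "O", "A", "B"]))  # works for qualitative data too
```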

    Examples

    Find the mean and the mode for the following frequency table.

    Age (x)    Frequency (f)    x f
    5          2                10
    6          3                18
    7          4                28
    10         1                10
    Total      10               66

    $\bar{x} = \dfrac{\sum x f}{n} = \dfrac{66}{10} = 6.6$

    Mode = 7 (it has the highest frequency).
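    The same calculation can be written as a weighted sum, where each value is multiplied by its frequency. This is a minimal sketch; `frequency_mean` and `frequency_mode` are hypothetical helper names.

```python
# Values and frequencies from the frequency table.
ages = [5, 6, 7, 10]
freqs = [2, 3, 4, 1]


def frequency_mean(values, frequencies):
    """Mean of a frequency table: sum of x*f divided by n = sum of f."""
    n = sum(frequencies)
    return sum(x * f for x, f in zip(values, frequencies)) / n


def frequency_mode(values, frequencies):
    """Value(s) with the highest frequency."""
    highest = max(frequencies)
    return [x for x, f in zip(values, frequencies) if f == highest]


print(frequency_mean(ages, freqs), frequency_mode(ages, freqs))
```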

    Examples

    Find the mean and the mode for the following grouped frequency table.

    Age        Frequency (f)    Midpoint (x)    x f
    1 - 3      2                2               4
    4 - 6      1                5               5
    7 - 9      4                8               32
    10 - 12    3                11              33
    Total      10                               74

    $\bar{x} = \dfrac{\sum x f}{n} = \dfrac{74}{10} = 7.4$

    Mode: the interval (7 - 9). We cannot give an exact number, only the interval with the highest frequency.
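    For grouped data the only extra step is replacing each class by its midpoint before taking the weighted mean. The sketch below assumes the class limits and frequencies from the table; the variable names are illustrative.

```python
# Class limits and frequencies from the grouped frequency table.
classes = [(1, 3), (4, 6), (7, 9), (10, 12)]
freqs = [2, 1, 4, 3]

# Each class is represented by its midpoint (lower + upper) / 2.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)
grouped_mean = sum(m * f for m, f in zip(midpoints, freqs)) / n

# Only the modal class can be identified, not an exact modal value.
modal_class = classes[freqs.index(max(freqs))]

print(midpoints, grouped_mean, modal_class)
```

    The result is an approximation of the mean of the raw data, since every observation in a class is treated as if it equalled the class midpoint.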

    Examples

    Find the mean and the mode for the following bar chart.

    [Bar chart: frequency (vertical axis, 0 to 6) against number of decayed teeth (horizontal axis, 0 to 5); bar heights 1, 2, 4, 5, 2, 2.]

    Solution:

    Mode = 3 (it has the highest frequency).

    $\bar{x} = \dfrac{\sum x f}{n} = \dfrac{(0 \times 1) + (1 \times 2) + (2 \times 4) + (3 \times 5) + (4 \times 2) + (5 \times 2)}{1 + 2 + 4 + 5 + 2 + 2} = \dfrac{43}{16} = 2.6875$

    Key words:

    Descriptive statistic, measure of dispersion, range, variance, coefficient of variation.

    2.5. Descriptive Statistics: Measures of Dispersion

    A measure of dispersion conveys information regarding the amount of variability present in a set of data.

    Note:

    1. If all the values are the same, there is no dispersion.
    2. If the values are different, there is dispersion:
    a) If the values are close to each other, the amount of dispersion is small.
    b) If the values are widely scattered, the dispersion is greater.

    Ex. Figure 2.5.1, Page 43

    Measures of dispersion are:

    1. Range (R).
    2. Variance.
    3. Standard deviation.
    4. Coefficient of variation (C.V).

    1. The Range (R):

    Range = largest value − smallest value = $x_L - x_S$

    Note: The range depends on only two values.

    Example 2.5.1, Page 40: Refer to Ex 2.4.2, Page 37.

    Data: 43, 66, 61, 64, 65, 38, 59, 57, 57, 50.

    Find the range. Range = 66 − 38 = 28

    2. The Variance:

    It measures dispersion relative to the scatter of the values about their mean.

    a) Sample variance ($S^2$):

    $S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean.

    Example 2.5.2, Page 40: Refer to Ex 2.4.2, Page 37.

    Find the sample variance of the ages, $\bar{x} = 56$.

    Solution: $S^2 = [(43-56)^2 + (66-56)^2 + \cdots + (50-56)^2] / (10-1) = 810/9 = 90$

    b) Population variance ($\sigma^2$):

    $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $\mu$ is the population mean.

    3. The Standard Deviation:

    It is the square root of the variance = $\sqrt{\text{Variance}}$

    a) Sample standard deviation = $S = \sqrt{S^2}$

    b) Population standard deviation = $\sigma = \sqrt{\sigma^2}$
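    The variance and standard deviation formulas can be checked against the worked example. This is a sketch under the definitions above: the sample version divides by n − 1 and the population version by N. The function names are hypothetical.

```python
from math import sqrt

# Ages from Example 2.5.2.
ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


s2 = sample_variance(ages)
s = sqrt(s2)  # sample standard deviation
print(f"S^2 = {s2}, S = {s:.4f}")
print(f"Treating the same values as a whole population: {population_variance(ages)}")
```

    The two variances differ only in the divisor, so for the same ten values the sample variance (810/9) is larger than the population variance (810/10).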

    Note:

    You can use the calculator to find:

    1. Mean
    2. Variance
    3. Standard deviation

    (How to use the calculator is in the Appendix.)

    4. The Coefficient of Variation (C.V):

    It is a measure used to compare the dispersion in two sets of data, and it is independent of the unit of measurement.

    $C.V = \dfrac{S}{\bar{X}} (100)$

    where S is the sample standard deviation and $\bar{X}$ is the sample mean.

    Example 2.5.3, Page 46:

    Suppose two samples of human males yield the following data:

                          Sample 1        Sample 2
    Age                   25-year-olds    11-year-olds
    Mean weight           145 pounds      80 pounds
    Standard deviation    10 pounds       10 pounds

    We wish to know which is more variable.

    Solution:

    C.V (Sample 1) = (10/145) × 100 = 6.9%

    C.V (Sample 2) = (10/80) × 100 = 12.5%

    So the weights of the 11-year-olds (Sample 2) show more variation.
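    The comparison can be reproduced in a few lines. This is a minimal sketch; `coefficient_of_variation` is a hypothetical helper that applies the C.V formula to a standard deviation and a mean.

```python
def coefficient_of_variation(std_dev, mean):
    """C.V = (S / mean) * 100, expressed as a percentage."""
    if mean == 0:
        raise ValueError("C.V is undefined when the mean is zero")
    return std_dev / mean * 100


cv_sample1 = coefficient_of_variation(10, 145)  # 25-year-olds
cv_sample2 = coefficient_of_variation(10, 80)   # 11-year-olds

print(f"Sample 1: {cv_sample1:.1f}%  Sample 2: {cv_sample2:.1f}%")
```

    Both samples have the same standard deviation, so the difference in C.V comes entirely from the smaller mean weight in Sample 2.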

    Exercises

    Pages: 52–53

    Questions: 2.5.1, 2.5.2, 2.5.3

    H.W.: 2.5.4, 2.5.5, 2.5.6, 2.5.14

    * You can also solve the review questions on page 57: Q: 12, 13, 14, 15, 16, 19

    Exercises:

    For each of the data sets in the following exercises, compute (use a calculator if possible):

    (a) The mean
    (b) The median
    (c) The mode
    (d) The range
    (e) The variance
    (f) The standard deviation
    (g) The coefficient of variation

    Q2.5.3:

    Butz et al. (A-10) evaluated the duration of benefit derived from the use of noninvasive positive-pressure ventilation by patients with amyotrophic lateral sclerosis on symptoms, quality of life, and survival. One of the variables of interest is partial pressure of arterial carbon dioxide (PaCO2). The values below (mm Hg) reflect the result of baseline testing on 30 subjects as established by arterial blood gas analyses.

    seven rat pups from the experiment involving the carotid artery.

    500 570 560 570 450 560 570

    (a) The mean  Ans: 540
    (b) The median  Ans: 560
    (c) The mode  Ans: 570
    (d) The range  Ans: 120
    (e) The variance  Ans: 2200
    (f) The standard deviation  Ans: 46.90
    (g) The coefficient of variation  Ans: 8.69%
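    Where a calculator is not at hand, the seven answers can be checked with Python's standard `statistics` module. The sketch below uses the seven observations listed above; the `summary` dictionary is just an illustrative container, and `variance` and `stdev` are the sample versions (divisor n − 1).

```python
from statistics import mean, median, multimode, stdev, variance

# The seven observations listed above.
values = [500, 570, 560, 570, 450, 560, 570]

summary = {
    "mean": mean(values),
    "median": median(values),
    "mode": multimode(values),             # list of the most frequent value(s)
    "range": max(values) - min(values),
    "variance": variance(values),          # sample variance, divisor n - 1
    "std_dev": stdev(values),              # sample standard deviation
}
summary["cv_percent"] = summary["std_dev"] / summary["mean"] * 100

for name, value in summary.items():
    print(f"{name}: {value}")
```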

    H.W: Q2.5.1:

    Porcellini et al. (A-8) studied 13 HIV-positive patients who were treated with highly active antiretroviral therapy (HAART) for at least 6 months. The CD4 T cell counts ($\times 10^6$/L) at baseline for the 13 subjects are listed below.

    230 205 313 207 227 245 173
    58 103 181 105 301 169

    (a) The mean  Ans: 193.6
    (b) The median  Ans: 205
    (c) The mode  Ans: no mode
    (d) The range  Ans: 255
    (e) The variance  Ans: 5568.09
    (f) The standard deviation  Ans: 74.62
    (g) The coefficient of variation  Ans: 38.543%

    Text Book : Basic Concepts andMethodology for the Health Sciences 82

  • 8/10/2019 Wayne Daniel Ppt

    42/186

    /04/34

    Q2.5.2:H.W Shrair and Jasper (A-9) investigatedwhether decreasing the venous return in young

    rats would affect ultrasonic vocalizations(USVs). Their research showed no significantchange in the number of ultrasonicvocalizations when blood was removed fromeither the superior vena cava or the carotidartery. Another important variable measuredwas the heart rate (bmp) during the withdrawal

    of blood. The data below presents the heartrate of

    Text Book : Basic Concepts andMethodology for the Health Sciences 83

    40.0 47.0 34.0 42.0 54.0 48.0 53.6 56.958.0 45.0 54.5 54.0 43.0 44.3

    53.9 41.8 33.0 43.1 52.4 37.9 34.5

    40.1 33.0 59.9 62.6 54.1 45.7 40.6 56.659.0

    (a) The mean (b) The median

    Ans: 47.41 Ans: 46.35(c) The mode (d) The rangeAns: 33, 54 Ans: 29.6

    Text Book : Basic Concepts andMethodology for the Health Sciences 84

  • 8/10/2019 Wayne Daniel Ppt

    43/186

    /04/34

    (e) The variance (f) The standard deviation

    Ans: 46.53 Ans: 8.75

    (g) The coefficient of variation

    (H.W) Q2.5.4:

    According to Starch et al. (A-11), hamstring tendon grafts have been the "weak link" in anterior cruciate ligament reconstruction. In a controlled laboratory study, they compared two techniques for reconstruction: either an interference screw or a central sleeve and screw on the tibial side. For eight cadaveric knees, the measurements below represent the required force (in newtons) at which initial failure of graft strands occurred for the central sleeve and screw technique.

    172.5 216.63 212.62 98.97 66.95 239.76 19.57 195.72

    (a) The mean  Ans: 152.84
    (b) The median  Ans: 184.11
    (c) The mode  Ans: no mode
    (d) The range  Ans: 220.19
    (e) The variance  Ans: 6494.732
    (f) The standard deviation  Ans: 80.5899
    (g) The coefficient of variation  Ans: 52.73%

    Q2.5.5:

    Cardosi et al. (A-12) performed a 4-year retrospective review of 102 women undergoing radical hysterectomy for cervical or endometrial cancer. Catheter-associated urinary tract infection was observed in 12 of the subjects. Below are the numbers of postoperative days until diagnosis of the infection for each subject experiencing an infection.

    16 10 49 15 6 15 8 19 11 22 13 17

    (a) The mean  Ans: 16.75
    (b) The median  Ans: 15
    (c) The mode  Ans: 15
    (d) The range  Ans: 43
    (e) The variance  Ans: 124.0227
    (f) The standard deviation  Ans: 11.1365
    (g) The coefficient of variation  Ans: 66.49%


    Q2.5.6:

    The purpose of a study by Nozama et al. (A-13) was to evaluate the outcome of surgical repair of pars interarticularis defect by segmental wire fixation in young adults with lumbar spondylolysis. The authors found that segmental wire fixation historically has been successful in the treatment of nonathletes with spondylolysis, but no information existed on the results of this type of surgery in athletes. In a retrospective study, the authors found 20 subjects who had the surgery between 1993 and 2000. For these subjects, the data below represent the duration in months of follow-up care after the operation.

    103 68 62 60 60 54 49 44 42 41
    38 36 34 30 19 19 19 19 17 16

    (a) The mean  Ans: 41.5
    (b) The median  Ans: 39.5
    (c) The mode  Ans: 19
    (d) The range  Ans: 87
    (e) The variance  Ans: 490.264
    (f) The standard deviation  Ans: 22.1419
    (g) The coefficient of variation  Ans: 53.35%

    Q2.5.14:

    In a pilot study, Huizinga et al. (A-14) wanted to gain more insight into the psychosocial consequences for children of a parent with cancer. For the study, 14 families participated in semistructured interviews and completed standardized questionnaires. Below is the age of the sick parent with cancer (in years) for the 14 families.

    37 48 53 46 42 49 44 38 32 32 51 51 48 41

    (a) The mean  Ans: 43.7143
    (b) The median  Ans: 45
    (c) The mode  Ans: 32, 48, 51
    (d) The range  Ans: 21
    (e) The variance  Ans: 48.0659
    (f) The standard deviation  Ans: 6.93296
    (g) The coefficient of variation  Ans: 15.8597%


    Chapter 3

    Probability

    The Basis of Statistical Inference

    Key words:

    Probability, objective probability, subjective probability, equally likely, mutually exclusive, multiplicative rule, conditional probability, independent events, Bayes' theorem


    3.1 Introduction

    The concept of probability is frequently encountered in everyday communication. For example, a physician may say that a patient has a 50-50 chance of surviving a certain operation. Another physician may say that she is 95 percent certain that a patient has a particular disease.

    Most people express probabilities in terms of percentages. It is, however, more convenient to express probabilities as fractions. Thus, we may measure the probability of the occurrence of some event by a number between 0 and 1.

    The more likely the event, the closer the number is to one. An event that cannot occur has a probability of zero, and an event that is certain to occur has a probability of one.


    3.2 Some Definitions of Probability

    1. Equally likely outcomes: outcomes that have the same chance of occurring.

    2. Mutually exclusive: two events A and B are said to be mutually exclusive if they cannot occur simultaneously, that is, $A \cap B = \emptyset$.


    The universal set (S): the set of all possible outcomes.

    The empty set ($\emptyset$): contains no elements.

    The event E: a set of outcomes in S that has a certain characteristic.

    Classical Probability: If an event can occur in N mutually exclusive and equally likely ways, and if m of these possess a trait E, the probability of the occurrence of event E is equal to m/N. In set notation, for an event A,

    P(A) = n(A)/n(S)

    For example: in the rolling of a die, each of the six sides is equally likely to be observed. So, the probability that a 4 will be observed is equal to 1/6.


    Relative Frequency Probability:

    Def: If some process is repeated a large number of times, n, and if some resulting event E occurs m times, the relative frequency of occurrence of E, m/n, will be approximately equal to the probability of E:

    P(E) = m/n
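The relative-frequency definition can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it simulates a fair die with Python's `random` module, the function name `relative_frequency` and the seed value are arbitrary choices, and the event E is "a 4 is observed", whose classical probability is 1/6.

```python
import random


def relative_frequency(n_trials, seed=0):
    """Roll a simulated fair die n_trials times; return m/n for the event 'a 4 is observed'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    m = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == 4)
    return m / n_trials


classical = 1 / 6
for n in (60, 600, 60_000):
    # As n grows, m/n should settle near the classical value 1/6.
    print(n, relative_frequency(n), classical)
```

For small n the relative frequency can be far from 1/6; the agreement is only approximate and improves as the number of repetitions increases.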


    3.3 Elementary Properties of Probability

    Given some process (or experiment) with n mutually exclusive and exhaustive events $E_1, E_2, E_3, \ldots, E_n$, then

    1. $P(E_i) \ge 0$, $i = 1, 2, 3, \ldots, n$
    2. $P(E_1) + P(E_2) + \cdots + P(E_n) = 1$
    3. $P(E_i + E_j) = P(E_i) + P(E_j)$, where $E_i$ and $E_j$ are mutually exclusive


    Rules of Probability

    1. Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

    2. If A and B are mutually exclusive (disjoint), then $P(A \cap B) = 0$, and the addition rule becomes $P(A \cup B) = P(A) + P(B)$.

    3. Complementary rule: $P(A') = 1 - P(A)$, where $A' = \bar{A}$ is the complement of event A.

    Consider Example 3.4.1, Page 63.


    Table 3.4.1 in Example 3.4.1

    Family history of Mood Disorders    Early = 18 (E)    Later > 18 (L)    Total
    Negative (A)                              28                35             63
    Bipolar Disorder (B)                      19                38             57
    Unipolar (C)                              41                44             85
    Unipolar and Bipolar (D)                  53                60            113
    Total                                    141               177            318


    Find the following probabilities:

    1. $P(C)$ =
    2. $P(L)$ =
    3. $P(A^c)$ =
    4. $P(L \cup D)$ =
    5. $P(A \cap E)$ =
    6. $P(L \cap B)$ =
    7. $P(E \cap A^c)$ =
    8. $P(L \cup A^c)$ =


    ** Answer the following questions:

    Suppose we pick a person at random from this sample.

    1. What is the probability that this person will be 18 years old or younger (E)?
    2. What is the probability that this person has a family history of unipolar mood disorder (C)?
    3. What is the probability that this person has no family history of unipolar mood disorder ($\bar{C}$)?
    4. What is the probability that this person is 18 years old or younger, or has no family history of unipolar mood disorder ($\bar{C}$)?
    5. What is the probability that this person is more than 18 years old and has a family history of unipolar and bipolar mood disorders (D)?
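A sketch of how several of these questions could be checked against the counts in Table 3.4.1, using exact fractions. The dictionary layout and the helper names (`p_history`, `p_onset`, `p_joint`) are assumptions made for illustration; they do not come from the textbook.

```python
from fractions import Fraction

# Counts from Table 3.4.1: counts[family_history][age_at_onset]
counts = {
    "A": {"E": 28, "L": 35},  # Negative
    "B": {"E": 19, "L": 38},  # Bipolar disorder
    "C": {"E": 41, "L": 44},  # Unipolar
    "D": {"E": 53, "L": 60},  # Unipolar and bipolar
}
n = sum(sum(row.values()) for row in counts.values())  # 318 subjects


def p_history(h):
    """Probability of a family-history category (row total / n)."""
    return Fraction(sum(counts[h].values()), n)


def p_onset(o):
    """Probability of an age-at-onset category (column total / n)."""
    return Fraction(sum(row[o] for row in counts.values()), n)


def p_joint(h, o):
    """Probability of a family-history category and an onset category together."""
    return Fraction(counts[h][o], n)


p_E = p_onset("E")                       # question 1
p_C = p_history("C")                     # question 2
p_not_C = 1 - p_C                        # question 3, complementary rule
p_L_and_D = p_joint("D", "L")            # question 5
# Addition rule: P(L or D) = P(L) + P(D) - P(L and D)
p_L_or_D = p_onset("L") + p_history("D") - p_joint("D", "L")

for label, p in [("P(E)", p_E), ("P(C)", p_C), ("P(not C)", p_not_C),
                 ("P(L and D)", p_L_and_D), ("P(L or D)", p_L_or_D)]:
    print(label, p, round(float(p), 4))
```

`Fraction` reduces each result automatically, so the printed fractions may not show 318 in the denominator even though they are equal to the count ratios.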


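    Conditional Probability:

    $P(A \mid B)$ is the probability of A assuming that B has happened.

    $$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{or} \quad \frac{n(A \cap B)}{n(B)}, \qquad P(B) \ne 0$$

    $$P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad \text{or} \quad \frac{n(A \cap B)}{n(A)}, \qquad P(A) \ne 0$$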


    Example 3.4.2 Page 64

    From the previous Example 3.4.1 (Page 63), answer the following:

    1. Suppose we pick a person at random and find that he is 18 years old or younger (E). What is the probability that this person will be one with a Negative family history of mood disorders (A)?

    2. Suppose we pick a person at random and find that he has a family history of unipolar and bipolar mood disorders (D). What is the probability that this person will be 18 years old or younger (E)?
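Both questions are conditional probabilities, so each can be read off Table 3.4.1 as a cell count divided by the total of the conditioning row or column. The short sketch below assumes the counts from that table; the variable names are illustrative.

```python
from fractions import Fraction

n_total = 318      # all subjects in Table 3.4.1
n_E = 141          # Early onset (E)
n_D = 113          # Unipolar and bipolar family history (D)
n_A_and_E = 28     # Negative history (A) and Early onset (E)
n_E_and_D = 53     # Early onset (E) and history D

# P(A | E) from counts: n(A and E) / n(E)
p_A_given_E = Fraction(n_A_and_E, n_E)

# The same value from probabilities: P(A and E) / P(E)
p_A_given_E_alt = Fraction(n_A_and_E, n_total) / Fraction(n_E, n_total)

# P(E | D) = n(E and D) / n(D)
p_E_given_D = Fraction(n_E_and_D, n_D)

print(p_A_given_E, round(float(p_A_given_E), 4))
print(p_E_given_D, round(float(p_E_given_D), 4))
```

The two forms agree because the common divisor n(S) cancels, which is why the definition can be written with either probabilities or counts.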


    Calculating a Joint Probability:

    Example 3.4.3 Page 64

    Suppose we pick a person at random from the 318 subjects. Find the probability that he will be Early (E) and have a Negative family history of mood disorders (A).

    Exercise: Example 3.4.5 Page 66

    Exercise: Example 3.4.6 Page 67


    Independent Events:

    If the occurrence of A has no effect on the probability of B, we say that A and B are independent events. Then,

    1. $P(A \cap B) = P(A)\,P(B)$
    2. $P(A \mid B) = P(A)$
    3. $P(B \mid A) = P(B)$


    Example 3.4.7 Page 68

    In a certain high school class consisting of 100 students, 60 girls and 40 boys, it is observed that 24 girls and 16 boys wear eyeglasses. If a student is picked at random from this class:

    1. What is the probability that the student wears eyeglasses?
    2. What is the probability that a student picked at random wears eyeglasses, given that the student is a boy?
    3. What is the probability of the joint occurrence of the events of wearing eyeglasses and being a girl?


    Solution

    Take the following symbols:

    B: is a boy.          $B^c$: is not a boy (a girl).
    G: wears glasses.     $G^c$: does not wear glasses.

    $n(S) = 100$, $n(B) = 40$, $n(G \cap B) = 16$, $n(G \cap B^c) = 24$

              B                    $B^c$                  Total
    G         $n(B \cap G)$        $n(B^c \cap G)$        $n(G)$
    $G^c$     $n(B \cap G^c)$      $n(B^c \cap G^c)$      $n(G^c)$
    Total     $n(B)$               $n(B^c)$               $n(S)$

    1. $P(G)$ =
    2. $P(G \mid B)$ =
    3. $P(G \cap B^c)$ =

              B       $B^c$    Total
    G         16      24
    $G^c$
    Total     40               100
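One way to fill in the table and answer the three questions is sketched below, starting only from the counts stated in the example. The variable names are assumptions chosen to mirror the symbols B, G, and S.

```python
from fractions import Fraction

n_S = 100            # students in the class
n_B = 40             # boys
n_G_and_B = 16       # boys who wear eyeglasses
n_G_and_girl = 24    # girls who wear eyeglasses

n_G = n_G_and_B + n_G_and_girl           # everyone who wears eyeglasses

p_G = Fraction(n_G, n_S)                 # question 1: P(G)
p_G_given_B = Fraction(n_G_and_B, n_B)   # question 2: P(G | B)
p_G_and_girl = Fraction(n_G_and_girl, n_S)  # question 3: P(G and not B)

# If P(G | B) equals P(G), wearing eyeglasses is independent of being a boy.
independent = (p_G_given_B == p_G)

print(p_G, p_G_given_B, p_G_and_girl, independent)
```

Comparing `p_G_given_B` with `p_G` ties this example back to the definition of independent events: knowing that the student is a boy does not change the probability of wearing eyeglasses in this class.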


    Example 3.4.8 Page 69

    Suppose that of 1200 admissions to a general hospital during a certain period of time, 750 are private admissions. If we designate these as a set A, then compute $P(A)$ and $P(\bar{A})$.

    Exercise: Example 3.4.9 Page 76


    Marginal Probability:

    Definition: Given some variable that can be broken down into m categories designated by $A_1, A_2, \ldots, A_i, \ldots, A_m$ and another jointly occurring variable that is broken down into n categories designated by $B_1, B_2, \ldots, B_j, \ldots, B_n$, the marginal probability of $A_i$, $P(A_i)$, is equal to the sum of the joint probabilities of $A_i$ with all the categories of B. That is,

    $$P(A_i) = \sum_j P(A_i \cap B_j), \quad \text{for all values of } j$$

    Example 3.4.9 Page 76: Use the data of Table 3.4.1 and the rule of marginal probabilities to calculate P(E), P(L), P(A), ...
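The marginal-probability rule is a sum of joint probabilities over the other variable. The sketch below applies it to the cell counts of Table 3.4.1; the `joint` dictionary and the `marginal` helper are hypothetical names used only for this illustration.

```python
from fractions import Fraction

# Cell counts of Table 3.4.1 keyed by (family history, age at onset).
joint_counts = {
    ("A", "E"): 28, ("A", "L"): 35,
    ("B", "E"): 19, ("B", "L"): 38,
    ("C", "E"): 41, ("C", "L"): 44,
    ("D", "E"): 53, ("D", "L"): 60,
}
n = sum(joint_counts.values())  # 318

# Joint probabilities: one per table cell.
joint = {cell: Fraction(count, n) for cell, count in joint_counts.items()}


def marginal(category, position):
    """Sum the joint probabilities of all cells containing `category`.

    position 0 selects the family-history label, position 1 the onset label.
    """
    return sum((p for cell, p in joint.items() if cell[position] == category),
               Fraction(0))


print("P(E) =", marginal("E", 1))
print("P(L) =", marginal("L", 1))
print("P(A) =", marginal("A", 0))
```

Summing the marginals of one variable over all of its categories returns 1, which is a quick consistency check on the table.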


    Exercise:

    Page 76-77 Questions :

    3.4.1, 3.4.3,3.4.4

    H.W.

    3.4.5 , 3.4.7


    Q3.4.1: In a study of violent victimization of women and men, Porcelli et al. (A-2) collected information from 679 women and 345 men aged 18 to 64 years at several family practice centers in the metropolitan Detroit area. Patients filled out a health history questionnaire that included a question about victimization. The following table shows the sample subjects cross-classified by sex and type of violent victimization reported. The victimization categories are defined as no victimization, partner victimization (and not by others), victimization by persons other than partners (friends, family members, or strangers), and those who reported multiple victimization.

                 No Victimization (V)   Partners (P)   Nonpartners (N)   Multiple Victimization (T)   Total
    Women (W)           611                  34              16                    18                   679
    Men (M)             308                  10              17                    10                   345
    Total               919                  44              33                    28                  1024

    (a) Suppose we pick a subject at random from this group. What is the probability that this subject will be a woman?

    (b) What do we call the probability calculated in part a?

    (c) Show how to calculate the probability asked for in part a by two additional methods.

    (d) If we pick a subject at random, what is the probability that the subject will be a woman and have experienced partner abuse?

    (e) What do we call the probability calculated in part d?

    (f) Suppose we picked a man at random. Knowing this information, what is the probability that he experienced abuse from nonpartners?

    (g) What do we call the probability calculated in part f?

    (h) Suppose we pick a subject at random. What is the probability that it is a man or someone who experienced abuse from a partner?

    (i) What do we call the method by which you obtained the probability in part h?
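The numerical parts (a), (d), (f), and (h) can be worked from the table as sketched below. The slides give no answers for this exercise, so the values are simply what the table counts yield; the dictionary layout and variable names are assumptions for illustration.

```python
from fractions import Fraction

# Table for Q3.4.1: table[sex][victimization category]
table = {
    "W": {"V": 611, "P": 34, "N": 16, "T": 18},
    "M": {"V": 308, "P": 10, "N": 17, "T": 10},
}
n = sum(sum(row.values()) for row in table.values())   # 1024 subjects
n_W = sum(table["W"].values())                         # 679 women
n_M = sum(table["M"].values())                         # 345 men
n_P = table["W"]["P"] + table["M"]["P"]                # 44 partner victimization

p_W = Fraction(n_W, n)                    # (a) marginal probability
p_W_and_P = Fraction(table["W"]["P"], n)  # (d) joint probability
p_N_given_M = Fraction(table["M"]["N"], n_M)  # (f) conditional probability
# (h) addition rule: P(M or P) = P(M) + P(P) - P(M and P)
p_M_or_P = Fraction(n_M, n) + Fraction(n_P, n) - Fraction(table["M"]["P"], n)

for label, p in [("(a)", p_W), ("(d)", p_W_and_P),
                 ("(f)", p_N_given_M), ("(h)", p_M_or_P)]:
    print(label, p, round(float(p), 4))
```

The comments name the kind of probability each line computes, which is what parts (b), (e), (g), and (i) ask for.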


    H.W Q3.4.3: Fernando et al. (A-3) studied drug-sharing among injection drug users in the South Bronx in New York City. Drug users in New York City use the term "split a bag" or "get down on a bag" to refer to the practice of dividing a bag of heroin or other injectable substances. A common practice includes splitting drugs after they are dissolved in a common cooker, a procedure with considerable HIV risk. Although this practice is common, little is known about the prevalence of such practices. The researchers asked injection drug users in four neighborhoods in the South Bronx if they ever "got down on" drugs in bags or shots. The results classified by gender and splitting practice are given below:

    Gender     Split Drugs    Never Split Drugs    Total
    Male           349              324             673
    Female         220              128             348
    Total          569              452            1021

    State the following probabilities in words and calculate:

    (a) $P(\text{Male} \cap \text{Split Drugs})$  Ans: 0.3418
    (b) $P(\text{Male} \cup \text{Split Drugs})$  Ans: 0.8746
    (c) $P(\text{Male} \mid \text{Split Drugs})$  Ans: 0.6134
    (d) $P(\text{Male})$  Ans: 0.6592

    Q3.4.4: Laveist and Nuru-Jeter (A-4) conducted a study to determine if doctor-patient race concordance was associated with greater satisfaction with care. Toward that end, they collected a national sample of African-American, Caucasian, Hispanic, and Asian-American respondents. The following table classifies the race of the subjects as well as the race of their physician:

                                                  Patient Race
    Physician's Race          Caucasian   African-American   Hispanic   Asian-American   Total
    White                        779            436             406          175          1796
    African-American              14            162              15            5           196
    Hispanic                      19             17             128            2           166
    Asian/Pacific-Islander        68             75              71          203           417
    Other                         30             55              56            4           145
    Total                        910            745             676          389          2720

    (a) What is the probability that a randomly selected subject will have an Asian/Pacific-Islander physician? Ans: 0.1533

    (b) What is the probability that an African-American subject will have an African-American physician? Ans: 0.2174

    (c) What is the probability that a randomly selected subject in the study will be Asian-American and have an Asian/Pacific-Islander physician? Ans: 0.075

    (d) What is the probability that a subject chosen at random will be Hispanic or have a Hispanic physician? Ans: 0.2625

    (e) Use the concept of complementary events to find the probability that a subject chosen at random in the study does not have a white physician. Ans: 0.3397

    Q3.4.5:

    If the probability of left-handedness in a certain group of people is 0.5, what is the probability of right-handedness (assuming no ambidexterity)?


    Q3.4.6:

    The probability is 0.6 that a patient selected at random from the current residents of a certain hospital will be a male. The probability that the patient will be a male who is in for surgery is 0.2. A patient randomly selected from current residents is found to be a male; what is the probability that the patient is in the hospital for surgery?

    Ans: 0.3333


    Q3.4.7:

    In a certain population of hospital patients the probability is 0.35 that a randomly selected patient will have heart disease. The probability is 0.86 that a patient with heart disease is a smoker. What is the probability that a patient randomly selected from the population will be a smoker and have heart disease?

    Ans: 0.301
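Q3.4.6 and Q3.4.7 use the same relationship in opposite directions: the conditional-probability definition gives P(surgery | male), and rearranging it into the multiplication rule gives P(smoker and heart disease). A minimal sketch with the numbers from the two exercises follows; the variable names are illustrative.

```python
# Q3.4.7: multiplication rule, P(A and B) = P(B) * P(A | B)
p_heart_disease = 0.35
p_smoker_given_heart_disease = 0.86
p_smoker_and_heart_disease = p_heart_disease * p_smoker_given_heart_disease
print(round(p_smoker_and_heart_disease, 3))

# Q3.4.6: conditional probability, P(A | B) = P(A and B) / P(B)
p_male = 0.6
p_male_and_surgery = 0.2
p_surgery_given_male = p_male_and_surgery / p_male
print(round(p_surgery_given_male, 4))
```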

    Bayes' Theorem, Pages 79-83

    Suppose a patient has a blood test done in the laboratory. Sometimes the result is positive (indicating that he has the disease) and sometimes the result is negative (indicating that he does not have the disease).


    So, we have the following cases:

                                          The patient has the disease (D)                The patient does not have the disease ($\bar{D}$)
    Lab result is positive (T)            Sensitivity of the symptom, $P(T \mid D)$      Wrong result (false positive)
    Lab result is negative ($\bar{T}$)    Wrong result (false negative)                  Specificity of the symptom, $P(\bar{T} \mid \bar{D})$


    Definition 1: The sensitivity of the symptom

    This is the probability of a positive result given that the subject has the disease. It is denoted by $P(T \mid D)$.

    Definition 2: The specificity of the symptom

    This is the probability of a negative result given that the subject does not have the disease. It is denoted by $P(\bar{T} \mid \bar{D})$.

    Definition 3: The predictive value positive of the symptom

    This is the probability that the subject has the disease given that the subject has a positive screening test result. It is calculated using Bayes' theorem through the following formula:

    $$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid \bar{D})\,P(\bar{D})}$$

    where $P(D)$ is the rate of the disease, and

    $$P(\bar{D}) = 1 - P(D), \qquad P(T \mid \bar{D}) = 1 - P(\bar{T} \mid \bar{D})$$

    Note that the numerator is equal to sensitivity times rate of the disease, while the denominator is equal to sensitivity times rate of the disease plus (1 minus the specificity) times (1 minus the rate of the disease).

    Definition 4: The predictive value negative of the symptom

    This is the probability that a subject does not have the disease given that the subject has a negative screening test result. It is calculated using Bayes' theorem through the following formula:

    $$P(\bar{D} \mid \bar{T}) = \frac{P(\bar{T} \mid \bar{D})\,P(\bar{D})}{P(\bar{T} \mid \bar{D})\,P(\bar{D}) + P(\bar{T} \mid D)\,P(D)}$$

    where $P(\bar{T} \mid D) = 1 - P(T \mid D)$.


    Example 3.5.1 page 82

    A medical research team wished to evaluate a proposed screening test for Alzheimer's disease. The test was given to a random sample of 450 patients with Alzheimer's disease and an independent random sample of 500 patients without symptoms of the disease. The two samples were drawn from populations of subjects who were 65 years or older. The results are as follows.

    Test Result              Yes (D)    No ($\bar{D}$)    Total
    Positive (T)               436            5            441
    Negative ($\bar{T}$)        14          495            509
    Total                      450          500            950


    In the context of this example:

    a) What is a false positive?

    A false positive is when the test indicates a positive result (T) when the person does not have the disease ($\bar{D}$).

    b) What is a false negative?

    A false negative is when the test indicates a negative result ($\bar{T}$) when the person has the disease (D).

    c) Compute the sensitivity of the symptom.

    $$P(T \mid D) = \frac{436}{450} = 0.9689$$

    d) Compute the specificity of the symptom.

    $$P(\bar{T} \mid \bar{D}) = \frac{495}{500} = 0.99$$


    e) Suppose it is known that the rate of the disease in the general population is 11.3%. What are the predictive value positive and the predictive value negative of the symptom?

    The predictive value positive of the symptom is calculated as

    $$P(D \mid T) = \frac{P(T \mid D)\,P(D)}{P(T \mid D)\,P(D) + P(T \mid \bar{D})\,P(\bar{D})} = \frac{(0.9689)(0.113)}{(0.9689)(0.113) + (0.01)(1 - 0.113)} = 0.925$$

    The predictive value negative of the symptom is calculated as

    $$P(\bar{D} \mid \bar{T}) = \frac{P(\bar{T} \mid \bar{D})\,P(\bar{D})}{P(\bar{T} \mid \bar{D})\,P(\bar{D}) + P(\bar{T} \mid D)\,P(D)} = \frac{(0.99)(0.887)}{(0.99)(0.887) + (0.0311)(0.113)} = 0.996$$
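The two Bayes' theorem formulas translate directly into code. The sketch below is an illustration, not part of the textbook: the function names are assumptions, and the inputs are the sensitivity, specificity, and disease rate of Example 3.5.1.

```python
def predictive_value_positive(sensitivity, specificity, rate):
    """P(D | T): probability of disease given a positive test."""
    true_positive = sensitivity * rate
    false_positive = (1 - specificity) * (1 - rate)
    return true_positive / (true_positive + false_positive)


def predictive_value_negative(sensitivity, specificity, rate):
    """P(not D | not T): probability of no disease given a negative test."""
    true_negative = specificity * (1 - rate)
    false_negative = (1 - sensitivity) * rate
    return true_negative / (true_negative + false_negative)


# Example 3.5.1: counts from the screening-test table, disease rate 11.3%.
sensitivity = 436 / 450
specificity = 495 / 500
rate = 0.113

ppv = predictive_value_positive(sensitivity, specificity, rate)
npv = predictive_value_negative(sensitivity, specificity, rate)
print(round(ppv, 3), round(npv, 3))
```

Keeping the disease rate as a separate argument makes it easy to see how strongly both predictive values depend on how common the disease is in the population.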

    Exercise:

    Page 83

    Questions :

    3.5.1, 3.5.2

    H.W.:

    Page 87 : Q4,Q5,Q7,Q9,Q21


    Q3.5.1:

    A medical research team wishes to assess the usefulness of a certain symptom (call it S) in the diagnosis of a particular disease. In a random sample of 775 patients with the disease, 744 reported having the symptom. In an independent random sample of 1380 subjects without the disease, 21 reported that they had the symptom.

    (a) In the context of this exercise, what is a false positive?

    (b) What is a false negative?

    (c) Compute the sensitivity of the symptom.

    (d) Compute the specificity of the symptom.

    (e) Suppose it is known that the rate of the disease in the general population is 0.001. What is the predictive value positive of the symptom?

    (f) What is the predictive value negative of the symptom?

    (h) What do you conclude about the predictive value of the symptom on the basis of the results obtained in part g?

    Q3.5.2:

    Dorsay and Helms (A-6) performed a retrospective study of 71 knees scanned by MRI. One of the indicators they examined was the absence of the "bow-tie sign" in the MRI as evidence of a bucket-handle or "bucket-handle type" tear of the meniscus. In the study, surgery confirmed that 43 of the 71 cases were bucket-handle tears. The cases may be cross-classified by "bow-tie sign" status and surgical results as follows:

                                                        Tear Surgically    Tear Surgically Confirmed
                                                        Confirmed (D)      As Not Present ($\bar{D}$)     Total
    Positive Test (absent bow-tie sign) (T)                   38                     10                     48
    Negative Test (bow-tie present) ($\bar{T}$)                5                     18                     23
    Total                                                     43                     28                     71


    (a) What is the sensitivity of testing to see if the absent bow-tie sign indicates a meniscal tear? Ans: 0.8837

    (b) What is the specificity of testing to see if the absent bow-tie sign indicates a meniscal tear? Ans: 0.6429

    (c) What additional information would you need to determine the predictive value of the test?

    (d) Suppose it is known that the rate of the disease in the general population is 0.1. What is the predictive value positive of the symptom? Ans: 0.2156

    (e) What is the predictive value negative of the symptom? Ans: 0.9803
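The parts of Q3.5.2 can be worked directly from the four cell counts of the table, as sketched below with exact fractions. The variable names are assumptions; the comparison with the in-sample proportion 38/48 is added only to illustrate why part (c) asks for additional information (the disease rate in the population).

```python
from fractions import Fraction

# Cell counts from the Q3.5.2 table.
true_pos, false_pos = 38, 10    # positive test: tear confirmed / not present
false_neg, true_neg = 5, 18     # negative test: tear confirmed / not present

sensitivity = Fraction(true_pos, true_pos + false_neg)   # 38/43
specificity = Fraction(true_neg, true_neg + false_pos)   # 18/28

rate = Fraction(1, 10)  # disease rate stated in part (d)

# Bayes' theorem with the population disease rate.
ppv = sensitivity * rate / (sensitivity * rate + (1 - specificity) * (1 - rate))
npv = specificity * (1 - rate) / (specificity * (1 - rate) + (1 - sensitivity) * rate)

# Proportion of positive tests in the sample that were true tears.
# It reflects the sample's own tear rate (43/71), not the population rate.
in_sample_positive_fraction = Fraction(true_pos, true_pos + false_pos)

for label, value in [("sensitivity", sensitivity), ("specificity", specificity),
                     ("PPV", ppv), ("NPV", npv),
                     ("in-sample 38/48", in_sample_positive_fraction)]:
    print(label, round(float(value), 4))
```

The gap between the in-sample proportion and the Bayes result shows that the predictive value cannot be read off the table alone when the sample was not drawn to match the population disease rate.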


    Chapter 4: Probabilistic Features of Certain Data Distributions

    Pages 93-111

    Key words:

    Probability distribution, random variable, Bernoulli distribution, binomial distribution, Poisson distribution


    The Random Variable (X):

    When the values of a variable (height, weight, or age) can't be predicted in advance, the variable is called a random variable.

    An example is adult height. When a child is born, we can't predict exactly his or her height at maturity.


    4.2 Probability Distributions for Discrete Random Variables

    Definition:

    The probability distribution of a discrete random variable is a table, graph, formula, or other device used to specify all possible values of a discrete random variable along with their respective probabilities.


    The Cumulative Probability Distribution of X, F(x):

    It shows the probability that the variable X is less than or equal to a certain value:

    $F(x) = P(X \le x)$


    Example 4.2.1 page 94:

    Number of Programs (x)    Frequency (f)    P(X = x) = f/n    $F(x) = P(X \le x)$
    1                              62              0.2088              0.2088
    2                              47              0.1582              0.3670
    3                              39              0.1313              0.4983
    4                              39              0.1313              0.6296
    5                              58              0.1953              0.8249
    6                              37              0.1246              0.9495
    7                               4              0.0135              0.9630
    8                              11              0.0370              1.0000
    Total                         297              1.0000


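    Properties of the probability distribution of a discrete random variable (properties 3 and 4 are stated for integer-valued X):

    1. $0 \le P(X = x) \le 1$
    2. $\sum P(X = x) = 1$
    3. $P(a \le X \le b) = P(X \le b) - P(X \le a - 1)$
    4. $P(X < b) = P(X \le b - 1)$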


    Example 4.2.2 page 96: (use table in example 4.2.1)

    What is the probability that a randomly selected family will be one who used three assistance programs?

    Example 4.2.3 page 96: (use table in example 4.2.1)

    What is the probability that a randomly selected family used either one or two programs?


    Example 4.2.4 page 98: (use table in example 4.2.1)

    What is the probability that a family picked at random will be one who used two or fewer assistance programs?

    Example 4.2.5 page 98: (use table in example 4.2.1)

    What is the probability that a randomly selected family will be one who used fewer than four programs?

    Example 4.2.6 page 98: (use table in example 4.2.1)

    What is the probability that a randomly selected family used five or more programs?


    Example 4.2.7 page 98: (use table in example 4.2.1)

    What is the probability that a randomly selected family is one who used between three and five programs, inclusive?
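Examples 4.2.2 through 4.2.7 can all be answered from the frequency column of the table in Example 4.2.1. The sketch below rebuilds P(X = x) and F(x) from those frequencies with exact fractions; the helper name `F` and the variable names are assumptions chosen for this illustration.

```python
from fractions import Fraction

# Frequencies from Example 4.2.1: number of programs -> number of families.
frequencies = {1: 62, 2: 47, 3: 39, 4: 39, 5: 58, 6: 37, 7: 4, 8: 11}
n = sum(frequencies.values())  # 297 families

# Probability distribution P(X = x) = f / n.
pmf = {x: Fraction(f, n) for x, f in frequencies.items()}


def F(x):
    """Cumulative distribution F(x) = P(X <= x)."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))


p_three = pmf[3]                 # Example 4.2.2: P(X = 3)
p_one_or_two = pmf[1] + pmf[2]   # Example 4.2.3: P(X = 1 or X = 2)
p_two_or_fewer = F(2)            # Example 4.2.4: P(X <= 2)
p_fewer_than_four = F(3)         # Example 4.2.5: P(X < 4) = P(X <= 3)
p_five_or_more = 1 - F(4)        # Example 4.2.6: P(X >= 5) = 1 - P(X <= 4)
p_three_to_five = F(5) - F(2)    # Example 4.2.7: P(3 <= X <= 5)

for label, p in [("4.2.2", p_three), ("4.2.3", p_one_or_two),
                 ("4.2.4", p_two_or_fewer), ("4.2.5", p_fewer_than_four),
                 ("4.2.6", p_five_or_more), ("4.2.7", p_three_to_five)]:
    print(label, round(float(p), 4))
```

The last three lines of the calculation use the cumulative distribution in the way the properties of a discrete distribution describe: strict inequalities and intervals are rewritten in terms of F at the neighbouring integer.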


    Find the following probability

    For the following probability distribution table:

    P(X= 15) =

    P(X = 10)=

    P( X = 11)=

    P(X 10)=

    P( 5 X 15)=

    P(X 10.5) =

    Mean =


    X       P(X = x)
    5       0.23
    10      0.30
    15      -------
    20      0.44

    For the following probability distribution table:

    Find:

    P(X 6) = P(X < 5)=

    P( X = 7)=

    P(X =5)=

    P(X > 4)=

    P( 4 X 6)= Mean=


    X      $P(X \le x)$
    3      0.10
    4      0.25
    5      0.55
    6      0.60
    7      0.78
    8      1


    4.3 The Binomial Distribution:

    The binomial distribution is one of the most widely encountered probability distributions in applied statistics. It is derived from a process known as a Bernoulli trial.

    A Bernoulli trial is:

    When a random process or experiment, called a trial, can result in only one of two mutually exclusive outcomes, such as dead or alive, sick or well, the trial is called a Bernoulli trial.

    Text Book : Basic Concepts andMethodology for the Health Sciences 155

    The Bernoulli Process

    A sequence of Bernoulli trials forms a Bernoulli process under the following conditions:

    1- Each trial results in one of two possible, mutually exclusive outcomes. One of the possible outcomes is denoted (arbitrarily) as a success, and the other is denoted a failure.

    2- The probability of a success, denoted by p, remains constant from trial to trial. The probability of a failure, 1-p, is denoted by q.

    3- The trials are independent; that is, the outcome of any particular trial is not affected by the outcome of any other trial.

    The probability distribution of the binomial random variable X, the number of successes in n independent trials, is:

    $$f(x) = P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$

    where $\binom{n}{x}$ is the number of combinations of n distinct objects taken x of them at a time:

    $$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}, \qquad x! = x(x-1)(x-2)\cdots(1)$$

    * Note: 0! = 1

    Properties of the binomial distribution

    1. $f(x) \ge 0$

    2. $\sum f(x) = 1$

    3. The parameters of the binomial distribution are n and p.

    4. $\mu = E(X) = np$

    5. $\sigma^2 = Var(X) = npq$
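The binomial formula and the properties listed above can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 4 and p = 0.3 that do not come from the text; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p): C(n, x) * p^x * q^(n - x)."""
    q = 1.0 - p
    return comb(n, x) * p**x * q ** (n - x)


# Illustrative values only (an assumption, not from the text).
n, p = 4, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                    # property 2: sums to 1
mean = sum(x * fx for x, fx in enumerate(pmf))      # property 4: n * p
variance = sum((x - mean) ** 2 * fx for x, fx in enumerate(pmf))  # property 5: n * p * q

print(total, mean, variance)
```

With these values the mean should agree with np = 1.2 and the variance with npq = 0.84, up to floating-point rounding.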


    Example 4.3.1 page 100

    If we examine all birth records from the North Carolina State Center for Health Statistics for year 2001, we find that 85.8 percent of the pregnancies had delivery in week 37 or later (full-term birth).

    If we randomly selected five birth records from this population, what is the probability that exactly three of the records will be for full-term births?
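As a sketch of how this example could be evaluated, the binomial formula is applied with n = 5, p = 0.858 and x = 3. The variable names are assumptions chosen for illustration.

```python
from math import comb

n, p, x = 5, 0.858, 3          # five records, 85.8% full-term, exactly three
q = 1 - p

# P(X = 3) = C(5, 3) * p^3 * q^2
prob = comb(n, x) * p**x * q ** (n - x)

print(round(prob, 4))  # prints 0.1274
```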

    Exercise: example 4.3.2 page 104

    Example 4.3.3 page 104

    Suppose it is known that in a certain population 10 percent of the population is color blind. If a random sample of 25 people is drawn from this population, find the probability that:

    a) Two or fewer will be color blind.

    b) 24 or more will be color blind.

    c) Between two and five inclusive will be color blind.

    d) At most one will be color blind.
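Parts (a) to (d) are all sums of binomial probabilities with n = 25 and p = 0.10. The sketch below shows one way to compute them; `binomial_pmf` and `binomial_cdf` are hypothetical helper names, not functions from the textbook.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x): sum of the pmf from 0 to x."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1))


n, p = 25, 0.10

p_a = binomial_cdf(2, n, p)                           # two or fewer
p_b = 1 - binomial_cdf(23, n, p)                      # 24 or more (numerically almost 0)
p_c = binomial_cdf(5, n, p) - binomial_cdf(1, n, p)   # between 2 and 5 inclusive
p_d = binomial_cdf(1, n, p)                           # at most one

print(round(p_a, 4), p_b, round(p_c, 4), round(p_d, 4))
```

Note that "between two and five inclusive" is P(X ≤ 5) - P(X ≤ 1), because X = 2 must stay in the interval.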

    Exercise: example 4.3.4 page 106


    4.4 The Poisson Distribution

    If the random variable X is the number of occurrences of some random event in a certain period of time or space (or some volume of matter), then the probability distribution of X is given by:

    $$f(x) = P(X = x) = \frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$

    The symbol e is the constant equal to 2.7183. $\lambda$ (lambda) is called the parameter of the distribution and is the average number of occurrences of the random event in the interval (or volume).

    Properties of the Poisson distribution

    1. $f(x) \ge 0$

    2. $\sum f(x) = 1$

    3. The parameter of the Poisson distribution is $\lambda$.

    4. $\mu = E(X) = \lambda$

    5. $\sigma^2 = Var(X) = \lambda$
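The Poisson properties above can be checked numerically by truncating the infinite sum. This is a rough sketch, assuming an illustrative rate of 2 occurrences per interval and a cut-off of 60 terms; both are assumptions, and `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


lam = 2.0          # illustrative rate (assumption, not from the text)
xs = range(60)     # the terms beyond x = 59 are negligible for this rate

total = sum(poisson_pmf(x, lam) for x in xs)                        # close to 1
mean = sum(x * poisson_pmf(x, lam) for x in xs)                     # close to lam
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)   # close to lam

print(total, mean, variance)
```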


    Example 4.4.1 page 111

    In a study of drug-induced anaphylaxis among patients taking rocuronium bromide as part of their anesthesia, Laake and Rottingen found that the occurrence of anaphylaxis followed a Poisson model with $\lambda = 12$ incidents per year in Norway. Find:

    1- The probability that, in the next year, exactly three patients receiving rocuronium will experience anaphylaxis.

    2- The probability that fewer than two patients receiving rocuronium will experience anaphylaxis in the next year.

    3- The probability that more than two patients receiving rocuronium will experience anaphylaxis in the next 6 months.

    4- The expected number of patients receiving rocuronium who will experience anaphylaxis in the next year.

    5- The variance of the number of patients receiving rocuronium who will experience anaphylaxis in the next year.

    6- The standard deviation of the number of patients receiving rocuronium who will experience anaphylaxis in the next year.
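The six parts of this example can be sketched as follows. The key step is rescaling the rate: 12 incidents per year becomes 6 incidents per 6 months. The helper name `poisson_pmf` and the variable names are assumptions for illustration.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


lam_year = 12.0               # incidents per year (from the example)
lam_half = lam_year / 2       # rate rescaled to a 6-month interval

p1 = poisson_pmf(3, lam_year)                              # exactly three in a year
p2 = poisson_pmf(0, lam_year) + poisson_pmf(1, lam_year)   # fewer than two in a year
p3 = 1 - sum(poisson_pmf(x, lam_half) for x in range(3))   # more than two in 6 months

mean = lam_year               # expected value equals lambda
variance = lam_year           # variance equals lambda
std_dev = sqrt(variance)

print(p1, p2, p3, mean, variance, std_dev)
```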


    Example 4.4.2 page 111: Refer to example 4.4.1

    1- What is the probability that at least three patients in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?

    2- What is the probability that exactly one patient in the next month will experience anaphylaxis if rocuronium is administered with anesthesia?

    3- What is the probability that none of the patients in the next month will experience anaphylaxis if rocuronium is administered with anesthesia?

    4- What is the probability that at most two patients in the next year will experience anaphylaxis if rocuronium is administered with anesthesia?

    Exercises: examples 4.4.3, 4.4.4 and 4.4.5, pages 111-113

    Exercises: Questions 4.3.4, 4.3.5, 4.3.7, 4.4.1, 4.4.5

    Exercises:

    Q4.3.4 (page 111): The same survey database cited shows that 32 percent of U.S. adults indicated that they have been tested for HIV at some point in their life. Consider a simple random sample of 15 adults selected at that time. Find the probability that the number of adults who have been tested for HIV in the sample would be:

    Hint:

    $$f(x) = P(X = x) = \binom{n}{x} p^{x} q^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$

    (a) Three (Ans. 0.1457)

    (b) Less than two (Ans. 0.02477)

    (c) At most one (Ans. 0.02477)

    (d) At least three (Ans. 0.9038)

    (e) Between three and five, inclusive.

    Q4.3.5

    Refer to Q4.3.4; find the mean and the variance.

    (Answer: mean = 4.8, variance = 3.264)
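The stated answers to Q4.3.4 and Q4.3.5 can be reproduced with a short script. This is a sketch assuming n = 15 and p = 0.32 as given in the question; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 15, 0.32
q = 1 - p

ans_a = binomial_pmf(3, n, p)                               # exactly three
ans_b = sum(binomial_pmf(x, n, p) for x in range(2))        # less than two: X = 0, 1
ans_c = sum(binomial_pmf(x, n, p) for x in range(2))        # at most one: same event as (b)
ans_d = 1 - sum(binomial_pmf(x, n, p) for x in range(3))    # at least three
ans_e = sum(binomial_pmf(x, n, p) for x in range(3, 6))     # three to five inclusive

mean = n * p            # Q4.3.5: np
variance = n * p * q    # Q4.3.5: npq

print(ans_a, ans_b, ans_c, ans_d, ans_e, mean, variance)
```

Parts (b) and (c) describe the same event for a whole-number count, which is why they share one answer.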


    Q4.4.3: If the mean number of serious accidents per year in a large factory is five, find the probability that in the current year there will be:

    Hint: $f(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$

    a) Exactly seven accidents (per year)

    b) Ten or more accidents (per year)

    c) No accidents (per 6 months)

    d) Fewer than one accident (per year)

    Q4.4.4: For Q4.4.3, find:

    1. The mean per 6 months.

    2. The variance and standard deviation per 3 months.

    4.5 Continuous Probability Distribution

    Pages 114-127

    Key words:

    Continuous random variable, normal distribution, standard normal distribution, T-distribution

    Now consider distributions of continuous random variables.

    Properties of continuous probability distributions:

    1- Area under the curve = 1.

    2- P(X = a) = 0, where a is a constant.

    3- Area between two points a, b = P(a < X < b).

    4.6 The normal distribution:

    It is one of the most important probability distributions in statistics.

    The normal density is given by

    $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty, \quad -\infty < \mu < \infty, \quad \sigma > 0$$

    $\pi$, e : constants

    $\mu$ : population mean.

    $\sigma$ : population standard deviation.

    Characteristics of the normal distribution (page 111):

    The following are some important characteristics of the normal distribution:

    1- It is symmetrical about its mean, $\mu$.

    2- The mean, the median, and the mode are all equal.

    3- The total area under the curve above the x-axis is one.

    4- The normal distribution is completely determined by the parameters $\mu$ and $\sigma$.

    5- The normal distribution depends on the two parameters $\mu$ and $\sigma$. $\mu$ determines the location of the curve (as seen in figure 4.6.3), but $\sigma$ determines the scale of the curve, i.e. the degree of flatness or peakedness of the curve (as seen in figure 4.6.4).

    [Figure 4.6.3: three normal curves with means $\mu_1 < \mu_2 < \mu_3$]

    [Figure 4.6.4: three normal curves with standard deviations $\sigma_1 < \sigma_2 < \sigma_3$]

    The standard normal distribution:

    It is a special case of the normal distribution with a mean of 0 and a standard deviation of 1.

    The equation for the standard normal distribution is written as

    $$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \qquad -\infty < z < \infty$$

    Characteristics of the standard normal distribution

    1- It is symmetrical about 0.

    2- The total area under the curve above the x-axis is one.

    3- We can use table (D) in the Appendix to find the probabilities and areas.

    How to use tables of Z

    Note that the cumulative probabilities $P(Z \le z)$ are given in tables for -3.49 < z < 3.49. Thus,

    $P(-3.49 < Z < 3.49) \approx 1$.

    For the standard normal distribution,

    P(Z > 0) = P(Z < 0) = 0.5

    Example 4.6.1:

    If Z has a standard normal distribution, then

    1) P(Z < 2) = 0.9772 is the area to the left of 2, and it equals 0.9772.

    Example 4.6.2:

    P(-2.55 < Z < 2.55) is the area between -2.55 and 2.55. It equals

    P(-2.55 < Z < 2.55) = 0.9946 - 0.0054 = 0.9892.

    Example 4.6.2:

    P(-2.74 < Z < 1.53) is the area between -2.74 and 1.53.

    P(-2.74 < Z < 1.53) = 0.9370 - 0.0031 = 0.9339.

    Example 4.6.3:

    P(Z > 2.71) is the area to the right of 2.71. So,

    P(Z > 2.71) = 1 - 0.9966 = 0.0034.

    Example:

    P(Z = 0.84) is the area at z = 0.84. So,

    P(Z = 0.84) = 0
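The table look-ups in these examples can be reproduced with the error function from Python's standard library. This is a minimal sketch; `phi` is a hypothetical name for the standard normal cumulative probability P(Z ≤ z), computed here instead of read from table (D).

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Cumulative probability P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


left_of_2 = phi(2.0)                     # area to the left of 2
between_255 = phi(2.55) - phi(-2.55)     # area between -2.55 and 2.55
between_mixed = phi(1.53) - phi(-2.74)   # area between -2.74 and 1.53
right_of_271 = 1.0 - phi(2.71)           # area to the right of 2.71

for value in (left_of_2, between_255, between_mixed, right_of_271):
    print(round(value, 4))
```

Every interval probability is a difference of two cumulative values, which is the same subtraction carried out with the table.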


    Exercise

    Given the standard normal distribution, find the following by using the tables:

    4.6.1: The area to the left of Z = 2

    4.6.2: The area under the curve between Z = 0 and Z = 1.43

    4.6.3: P(Z 0.55) =

    P(Z -2.35) =

    Given the following probabilities, find z1:

    4.6.11: $P(Z \le z_1) = 0.0055$ (z1 = -2.54)

    4.6.12: $P(-2.67 \le Z \le z_1) = 0.9718$ (z1 = 1.97)

    4.6.13: $P(Z > z_1) = 0.0384$ (z1 = 1.77)

    4.6.11: $P(z_1 < Z \le 2.98) = 0.1117$ (z1 = 1.21)

    How to transform a normal distribution (X) to the standard normal distribution (Z)?

    This is done by the following formula:

    $$z = \frac{x - \mu}{\sigma}$$

    Example:

    If X is normal with $\mu = 3$, $\sigma = 2$, find the value of the standard normal Z if X = 6.

    Answer:

    $$z = \frac{x - \mu}{\sigma} = \frac{6 - 3}{2} = 1.5$$
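The transformation is a one-line computation. The sketch below wraps it in a function; `standardize` is a hypothetical name used for illustration.

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """Convert a value x from N(mu, sigma^2) to its standard normal z score."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


z = standardize(6, mu=3, sigma=2)
print(z)  # prints 1.5
```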


    4.7 Normal Distribution Applications

    The normal distribution can be used to model the distribution of many variables that are of interest. This allows us to answer probability questions about these random variables.

    Example 4.7.1:

    The Uptime is a custom-made, lightweight, battery-operated activity monitor that records the amount of time an individual spends in the upright position. In a study of children ages 8 to 15 years, the researchers found that the amount of time children spend in the upright position followed a normal distribution with a mean of 5.4 hours and a standard deviation of 1.3 hours. Find:

    If a child is selected at random, then:

    1- The probability that the child spends less than 3 hours in the upright position in a 24-hour period:

    $P(X < 3) = P\left(\frac{X - \mu}{\sigma} < \frac{3 - 5.4}{1.3}\right) = P(Z < -1.85) = 0.0322$

    2- The probability that the child spends more than 5 hours in the upright position in a 24-hour period:

    $P(X > 5) = P\left(\frac{X - \mu}{\sigma} > \frac{5 - 5.4}{1.3}\right) = P(Z > -0.31) = 1 - P(Z < -0.31) = 1 - 0.3783 = 0.6217$

    3- The probability that the child spends exactly 6.2 hours in the upright position in a 24-hour period:

    P(X = 6.2) = 0
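The same probabilities can be computed directly rather than read from the table. This sketch assumes the stated mean of 5.4 hours and standard deviation of 1.3 hours; `normal_cdf` is a hypothetical helper name. It uses the exact z value instead of rounding z to two decimals, so results can differ from table look-ups in the third or fourth decimal place.

```python
from math import erf, sqrt


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), via the standardized value z."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


mu, sigma = 5.4, 1.3

p_less_3 = normal_cdf(3, mu, sigma)        # less than 3 hours
p_more_5 = 1 - normal_cdf(5, mu, sigma)    # more than 5 hours
p_exact = 0.0                              # a single point has zero area

print(p_less_3, p_more_5, p_exact)
```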

    4- The probability that the child spends from 4.5 to 7.3 hours in the upright position in a 24-hour period:

    $P(4.5 < X < 7.3) = P\left(\frac{4.5 - 5.4}{1.3} < \frac{X - \mu}{\sigma} < \frac{7.3 - 5.4}{1.3}\right) = P(-0.69 < Z < 1.46) = P(Z < 1.46) - P(Z < -0.69) = 0.9279 - 0.2451 = 0.6828$

    Exercises

    Q4.7.1: For another subject (a 29-year-old male) in the study by Diskin, acetone levels were normally distributed with a mean of 870 and a standard deviation of 211 ppb. Find the probability that on a given day the subject's acetone level is:

    (a) between 600 and 1000 ppb

    (b) over 900 ppb

    (c) under 500 ppb

    (d) at 700 ppb

    Q4.7.2: In the study of fingerprints, an important quantitative characteristic is the total ridge count for the 10 fingers of an individual. Suppose that the total ridge counts of individuals in a certain population are approximately normally distributed with a mean of 140 and a standard deviation of 50. Find the probability that an individual picked at random from this population will have a ridge count of:

    (a) 200 or more (Answer: 0.1151)

    (b) less than 200 (Answer: 0.8849)

    (c) between 100 and 200 (Answer: 0.6730)

    (d) between 200 and 250 (Answer: 0.1012)

    The T Distribution:

    1- It has a mean of zero.

    2- It is symmetric about the mean.

    3- It ranges from $-\infty$ to $\infty$.

    4- Compared to the normal distribution, the t distribution is less peaked in the center and has higher tails.

    5- It depends on the degrees of freedom (n-1).

    6- The t distribution approaches the standard normal distribution as (n-1) approaches $\infty$.

    7- Values of t can be found in table (E) in the Appendix.

    Examples

    t(7, 0.975) = 2.3646

    t(24, 0.995) = 2.7969

    If P(T(18) > t) = 0.975, then t = -2.1009

    If P(T(22) < t) = 0.99, then t = 2.508
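Table values of t can also be approximated numerically. The sketch below is one possible approach using only the Python standard library, not the method behind table (E): it integrates the t density with Simpson's rule and inverts the cumulative probability by bisection. The names `t_pdf`, `t_cdf` and `t_quantile`, the step count and the search bounds are all assumptions made for illustration.

```python
from math import gamma, pi, sqrt


def t_pdf(t: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t: float, df: int, steps: int = 2000) -> float:
    """P(T <= t) by Simpson's rule on [0, |t|], using symmetry about 0."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = s * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def t_quantile(prob: float, df: int, lo: float = -50.0, hi: float = 50.0) -> float:
    """Value t with P(T <= t) = prob, found by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_a = t_quantile(0.975, 7)
t_b = t_quantile(0.995, 24)
t_c = t_quantile(1 - 0.975, 18)   # P(T > t) = 0.975 means P(T <= t) = 0.025
t_d = t_quantile(0.99, 22)

print(round(t_a, 4), round(t_b, 4), round(t_c, 4), round(t_d, 4))
```

The third case shows the conversion needed when the probability is given as an upper-tail area: the cumulative probability is one minus that area, which puts t on the negative side.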


    Find:

    $t_{0.95,\,10} = 1.8125$

    $t_{0.975,\,18} = 2.1009$

    $t_{0.01,\,20} = -2.528$

    $t_{0.10,\,29} = -1.311$

    6.4 Sampling Distributions:

    6.4.1 Definition:

    The probability distribution of a statistic is called a sampling distribution.

    6.4.1.1 Sampling Distributions of Means:

    $$E(\bar{X}) = \mu_{\bar{X}} = \mu, \qquad V(\bar{X}) = \sigma^2_{\bar{X}} = \frac{\sigma^2}{n} \qquad (3)$$

    6.4.2 Definition:

    6.4.2.1 Central Limit Theorem:

    If $\bar{X}$ is the mean of a random sample of size n taken from a population with mean $\mu$ and finite variance $\sigma^2$, then the limiting form of the distribution of

    $$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad \text{as } n \to \infty \qquad (4)$$

    is approximately the standard normal distribution;

    $$Z \sim N(0, 1), \qquad E(\bar{X}) = \mu, \qquad V(\bar{X}) = \frac{\sigma^2}{n}, \qquad \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \qquad (5)$$

    Example:

    An electrical firm manufactures light bulbs that have a length of life that is approximately normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.

    Solution:

    Let X be the length of life and $\bar{X}$ the average life; n = 16, $\mu = 800$, $\sigma = 40$.

    $$P(\bar{X} < 775) = P\left(Z < \frac{775 - 800}{40/\sqrt{16}}\right) = P(Z < -2.5) = 0.0062$$
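The solution above can be sketched in code. The important step is dividing the population standard deviation by the square root of n before standardizing. The helper name `phi` and the variable names are assumptions for illustration.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Cumulative probability P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


n, mu, sigma = 16, 800.0, 40.0

standard_error = sigma / sqrt(n)      # standard deviation of the sample mean: 40 / 4 = 10
z = (775.0 - mu) / standard_error     # standardized value of the sample mean
prob = phi(z)                         # P(sample mean < 775)

print(z, round(prob, 4))  # prints -2.5 0.0062
```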

    Sampling Distribution of the Sample Proportion:

    Let X = no. of elements of type A in the sample

    P = population proportion = no. of elements of type A in the population / N

    $\hat{p}$ = sample proportion = no. of elements of type A in the sample / n = x/n

    $$x \sim \text{binomial}(n, p), \qquad E(x) = np, \qquad V(x) = npq$$

    1. $E(\hat{p}) = E\left(\frac{x}{n}\right) = p$

    2. $V(\hat{p}) = V\left(\frac{x}{n}\right) = \frac{pq}{n}, \qquad q = 1 - p$

    3. For large n, we have:

    $$\hat{p} \sim N\left(p, \frac{pq}{n}\right), \qquad Z = \frac{\hat{p} - p}{\sqrt{pq/n}} \sim N(0, 1)$$

    Example:

    If X has a binomial distribution with n = 10 and p = 0.1, find $P(\hat{p} < 0.2)$.

    Solution:

    $$E(\hat{p}) = p = 0.1$$

    $$V(\hat{p}) = \frac{pq}{n} = \frac{(0.1)(0.9)}{10} = 0.009$$

    $$P(\hat{p} < 0.2) = P\left(Z < \frac{0.2 - 0.1}{\sqrt{\frac{(0.1)(0.9)}{10}}}\right) = P\left(Z < \frac{0.1}{\sqrt{0.009}}\right) = P(Z < 1.05) = 0.85314$$
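The steps of this solution can be sketched as follows, assuming the normal approximation for the sample proportion. As in the slide, z is rounded to two decimals before the cumulative probability is taken, to mirror a table look-up. The helper name `phi` is an assumption.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Cumulative probability P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


n, p = 10, 0.1
q = 1 - p

mean_p_hat = p                     # E(p_hat) = p
var_p_hat = p * q / n              # V(p_hat) = pq / n
z = (0.2 - p) / sqrt(var_p_hat)    # standardized value of p_hat = 0.2

prob_table = phi(round(z, 2))      # table-style answer, z rounded to 1.05
prob_exact = phi(z)                # same probability without rounding z

print(var_p_hat, z, prob_table, prob_exact)
```

The two probabilities differ slightly because rounding z to 1.05 discards the third decimal place.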

    Chapter 6

    Using sample data to make estimates about population parameters (P 162-172)

    Key words:

    Point estimate, interval estimate, estimator, confidence level, $\alpha$, confidence interval for a mean $\mu$, confidence interval for two means, confidence interval for a population proportion P, confidence interval for two proportions

    6.1 Introduction:

    Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population.

    For any parameter, we can compute two types of estimate: a point estimate and an interval estimate.

    A point estimate is a single numerical value used to estimate the corresponding population parameter.

    Note:

    Point estimate for ($\mu$) is $\bar{X}$

    Point estimate for ($\sigma$) is S

    Definition:

    An interval estimate consists of two numerical values defining a range of values that, with a specified degree of confidence, we feel includes the parameter being estimated.

    6.2 Confidence Interval for a Population Mean: (C.I)

    Suppose researchers wish to estimate the mean of some normally distributed population.

    They draw a random sample of size n from the population and compute $\bar{x}$, which they use as a point estimate of $\mu$.

    Because random sampling involves chance, $\bar{x}$ can't be expected to be equal to $\mu$.

    The value of $\bar{x}$ may be greater than or less than $\mu$.

    It would be much more meaningful to estimate $\mu$ by an interval.

    The $(1-\alpha)100$ percent confidence interval (C.I.) for $\mu$:

    We want to find two values, L (lower bound) and U (upper bound), between which $\mu$ lies with high probability, i.e.

    $P(L \le \mu \le U) = 1 - \alpha$

    For example:

    When $\alpha = 0.01$, then $1 - \alpha$ =

    When $\alpha = 0.05$, then $1 - \alpha$ =

    When $\alpha = 0.05$, then $1 - \alpha$ =

    For example:

    $(1-\alpha)100\%$ is the level of confidence.

    When $(1-\alpha)100\% = 90\%$, then $\alpha$ =

    When $(1-\alpha)100\% = 99\%$, then $\alpha$ =

    When $(1-\alpha)100\% = 80\%$, then $\alpha$ =

    $(1-\alpha)100\%$ confidence interval for the mean $\mu$:

    When the value of sample size (n):

    population is normal or not normal ($n \ge 30$)        population is normal (n < 30)

    is known is not k