IEM Outline Lecture Notes Autumn 2016



    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 1

Required Reading:
Ref. File 1: Section 1.13
Ref. File 3: Introduction and Sections 3.1 to 3.4, 3.7

    KEYS TO PASSING THIS UNIT:

(i) Undertake the required reading from the reference files each week. (It may be necessary to re-read some sections more than once.) Approximately 4 hours per week.

(ii) Carefully study lecture material and take notice of advice given in lectures.

(iii) Attempt tutorial exercises before tutorials and work out where you have difficulties, which hopefully can be resolved in tutorials.

(iv) Make a conscious effort to keep up with the material presented.

    1. INTRODUCTION TO UNIT

    1.1 How Can We Define Statistics?

Statistics, for our purposes, encompasses the following major activities:

(i) Collection and description of information, or data - “descriptive statistics”. We will normally be dealing


with a subset of a larger collection or set of data. The subset is called a sample, the larger set a population.

(ii) Using sample data to make inferences about a population - “statistical inference”.

    1.2 Why Study Statistics?

(i) (Major) It can be useful. It can help us to make decisions in the face of uncertainty.

(ii) People are bombarded with statistics all the time. Often statistics is used in ways that are not warranted. It is important not to be fooled by people who misuse statistics.

(iii) It is important to have a clear understanding of the strengths and limitations of statistical analysis.

    1.3 Structure of the Subject

• Descriptive Statistics:

How we summarise the characteristics of raw data (using graphs, summary measures, etc.)

• Probability Theory and Probability Distributions (“deductive statistics”):

Rules (or axioms) for calculating probabilities of certain things (called events) happening.


Probability theory can be considered part of descriptive statistics.

Here we will be concerned with making probability statements about a given population.

• Sampling Theory and Sampling Distributions (the basis of “inductive statistics”):

Here we will be concerned with making probability statements about characteristics of samples, given assumptions about the population from which the sample was drawn.

• Point and Interval Estimation:

Point Estimation - Here we will be concerned with producing a particular estimate (a number), based on sample data, of a characteristic of a population.

Interval Estimation - Here we will not give an estimate of a population characteristic, but rather a range in which we are confident (to some degree) the true value of the population characteristic lies.

     

• Hypothesis Testing:

Under this heading we will be looking at ways of testing hypotheses about characteristics of populations, based on sample data.


• Regression Analysis:

In this case we will be concerned with estimating linear relationships between different variables, i.e. linear equations.

We will go on to examine statistical tests associated with estimated regression equations.

• Introduction to Differential Calculus

    2. DESCRIPTIVE STATISTICS

    2.1 Some Basic Definitions Relating to Data

    (i) Elementary Units and Frames:

Statistical data normally represents measurements or observations of a certain characteristic or variable of interest of each member of a set of objects or people.

Each object (or person) for which the characteristic is or can be measured is called an elementary unit.

The set or listing of all possible elementary units is called a frame.


    (ii) Population/Sample:

A statistical population is the set of measurements or observations of a characteristic of interest for all elementary units in a frame.

A population may comprise a finite or infinite number of elements (observations), depending on the context.

    A statistical sample is a subset of a population.

    (iii) Parameters/Statistics:

For our purposes: the numerical characteristics which describe a population are called parameters of the population.

The numerical values calculated from sample data are called sample statistics. These sample statistics can be thought of as describing or characterizing the sample.

    (iv) Qualitative and Quantitative Variables:

Populations may be quantitative or qualitative. Data from quantitative populations is called quantitative or interval data. Data from qualitative populations is called qualitative, nominal or categorical data.

Data from a quantitative population can be expressed numerically in a meaningful way. The variable (or characteristic) associated with a quantitative population is called a quantitative variable.


Data from qualitative populations cannot be expressed numerically in a meaningful way. The variable (or characteristic) associated with a qualitative population is called a qualitative or categorical variable.

Note: Just because we assign a numerical code to a qualitative variable does not mean the variable is quantitative.

    (v) Discrete and Continuous Quantitative Variables:

A discrete quantitative variable can assume only certain discrete numerical values (on the number line); i.e. there are gaps between the various values. Depending on the variable, there could be a finite or infinite number of these discrete values.

A continuous quantitative variable can assume any value in a specific range or interval. The interval can be of finite or infinite width.

Note: By definition there are an infinite number of values a continuous variable can take.

    2.2 Frequency Distributions

    (a) Introduction

Suppose we have a set of raw statistical data. At this stage we will make no distinction as to whether we are talking about a statistical population or sample.


In studying the data it is often useful to initially group the raw data into different classes or categories. A frequency distribution for a set of data lists the number of observations or ‘data points’ in each class used for grouping (the class frequencies). The classes of a frequency distribution must be mutually exclusive (an observation cannot fall into two classes) and exhaustive (any observation must belong to a class).

    (b) Frequency Distributions for Quantitative Data

Each class of a frequency distribution of quantitative data usually has a lower and an upper limit, although sometimes it is necessary or convenient to have open-ended classes, i.e. classes which have either an upper or lower limit but not both.

Example:
Suppose we have data on the number of children in 100 households as follows:

Class                    Frequency
0 to under 2 children    30
2 to under 4 children    55
4 to under 6 children    13
6 or more children        2

The class width is the difference between successive lower class limits or upper class limits.

    Note: An open-ended class has no class width.


General Advice for Forming Frequency Distributions:

• The number of classes should generally be between 5 and 20.

• Class widths are ideally equal, but this may not always be possible, and open-ended classes may be necessary.

• Class limits should be chosen such that the class midpoint is close to the average of observations in the class. This is because in calculating summary statistics based on grouped data the midpoint is used as representative of all observations in the class.

(c) Relative, Cumulative and Cumulative Relative Frequency Distributions

A relative frequency distribution shows the proportion of all observations falling in each class. It is obtained by dividing the class frequencies ($f_i$) by the total number of observations in the data (‘n’).

A cumulative frequency distribution shows, for each class $i$, the total of the first $i$ frequencies.

A cumulative relative frequency distribution shows, for each class $i$, the total of the first $i$ relative frequencies.


For the previous example we have

Class (i)       Frequency (f_i)   Cumulative Freq.   Relative Freq.   Cumulative Rel. Freq.
0 to under 2    30                 30                0.30             0.30
2 to under 4    55                 85                0.55             0.85
4 to under 6    13                 98                0.13             0.98
6+ children      2                100                0.02             1.00
Total          100                                   1.00

An ogive is a graph of the cumulative relative frequency distribution.
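The calculations in the table above can be reproduced in a few lines of code. The following is a minimal Python sketch (not part of the original notes) that starts from the class frequencies of the household example and derives the relative, cumulative and cumulative relative frequencies.

```python
# Minimal sketch: relative, cumulative and cumulative relative frequencies
# from the class frequencies of the household example above.

classes = ["0 to under 2", "2 to under 4", "4 to under 6", "6+ children"]
freq = [30, 55, 13, 2]            # class frequencies f_i
n = sum(freq)                     # total number of observations (100)

cum_freq = []
running = 0
for f in freq:
    running += f                  # running total of frequencies
    cum_freq.append(running)

rel_freq = [f / n for f in freq]              # f_i / n
cum_rel_freq = [cf / n for cf in cum_freq]    # cumulative relative frequencies

for c, f, cf, rf, crf in zip(classes, freq, cum_freq, rel_freq, cum_rel_freq):
    print(f"{c:13s} {f:4d} {cf:4d} {rf:5.2f} {crf:5.2f}")
```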

    2.3 Histograms

Histograms give us a convenient way of visualising the distribution of observations over classes. They take the form of a series of adjacent (contiguous) rectangles, one for each class, with the base of each rectangle centred over the corresponding class midpoint.

In a frequency histogram the areas of the rectangles are proportional to the class frequencies, with the factor of proportionality the same for all classes. Thus if all the classes have the same width, each rectangle will have the same base width and the class frequencies can be represented by the rectangle heights.

In a relative frequency histogram the areas of the rectangles are proportional to the relative frequencies.

Similarly, cumulative and cumulative relative frequency histograms can be defined.


Note: Frequency and relative frequency histograms will have the same shape.

Example:
Consider the following distribution

Class              Frequency   Rel. Freq.   Cum. Freq.
0.5 to under 2.5   10          0.1           10
2.5 to under 4.5   30          0.3           40
4.5 to under 6.5   50          0.5           90
6.5 to under 8.5   10          0.1          100

[Frequency Histogram: rectangle heights equal to the class frequencies 10, 30, 50, 10 over the class boundaries 0.5, 2.5, 4.5, 6.5, 8.5]

[Relative Frequency Histogram: rectangle heights equal to the relative frequencies 0.1, 0.3, 0.5, 0.1 over the same class boundaries]


[Cumulative Frequency Histogram: rectangle heights equal to the cumulative frequencies 10, 40, 90, 100 over the same class boundaries]

    2.4 Shapes of Distributions

The frequency or relative frequency histogram gives us a representation of the shape of the distribution of the data being analysed.

There are several terms commonly used to describe the shapes of distributions.

A distribution is described as negatively skewed (skewed to the left) if it has the following shape.

[Figure: A Distribution that is Skewed to the Left (relative frequency against variable value)]


A distribution is positively skewed (skewed to the right) if it has the following shape.

[Figure: A Distribution that is Skewed to the Right (relative frequency against variable value)]

A distribution is symmetric if it has the following shape.

[Figure: A Symmetric Distribution (relative frequency against variable value)]

The above are all examples of unimodal distributions. A bimodal distribution has two peaks.


    2.5 Bivariate Frequency Distributions

Often it is of interest to classify observations of elementary units according to two variables (characteristics). This allows one to gauge the relationship between the two variables.

Example:
Consider the final results of 50 students in a particular subject. Each student’s final grade and gender are recorded, allowing the derivation of the following bivariate frequency distribution.

                          Grade
Gender         HD   Dist.   Credit   Pass   Fail   Row Total
Male            5    4       10       6      2      27
Female          2    3       11       2      5      23
Column Total    7    7       21       8      7      50

Each combination of grade and gender is represented by a cell in the bivariate frequency distribution, which contains the frequency of that combination in the data.

The row totals represent, in this example, the marginal frequencies of males and females in the class (27 and 23, respectively).

The column totals represent the marginal frequencies of final grades.


Marginal frequencies, represented by the row and column totals, each refer to one variable only.

We can express the information in a bivariate frequency distribution as a relative frequency distribution by dividing each entry in the distribution by the total number of observations.

    Example:

For the previous example, the bivariate relative frequency distribution is given by

                          Grade
Gender         HD     Dist.   Credit   Pass   Fail   Row Total
Male           0.10   0.08    0.20     0.12   0.04   0.54
Female         0.04   0.06    0.22     0.04   0.10   0.46
Column Total   0.14   0.14    0.42     0.16   0.14   1.00

The row and column totals in the above table are called the marginal relative frequencies.
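As an illustration (not part of the original notes), a minimal Python sketch that derives the bivariate relative frequency table and the marginal relative frequencies directly from the grade/gender counts:

```python
# Minimal sketch: bivariate relative frequencies and marginal relative
# frequencies for the grade/gender example (counts taken from the table above).

grades = ["HD", "Dist.", "Credit", "Pass", "Fail"]
counts = {
    "Male":   [5, 4, 10, 6, 2],
    "Female": [2, 3, 11, 2, 5],
}
n = sum(sum(row) for row in counts.values())      # 50 students in total

# Joint (bivariate) relative frequencies: each cell divided by n.
rel = {g: [c / n for c in row] for g, row in counts.items()}

# Marginal relative frequencies: row and column totals of the relative table.
row_marginals = {g: sum(row) for g, row in rel.items()}
col_marginals = [sum(rel[g][j] for g in rel) for j in range(len(grades))]

print(rel["Male"])      # approximately [0.10, 0.08, 0.20, 0.12, 0.04]
print(row_marginals)    # approximately {'Male': 0.54, 'Female': 0.46}
print(col_marginals)    # approximately [0.14, 0.14, 0.42, 0.16, 0.14]
```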


3. MEASURES OF CENTRAL TENDENCY AND DISPERSION

In this section we shall look at important ways of summarising data from both populations and samples. We shall be concerned with measures of the

• ‘centre’ of a frequency distribution
• ‘dispersion’ of values in a frequency distribution

    3.1 Summation Notation

Suppose we have ‘n’ numbers. By labelling the numbers $(1, 2, 3, \ldots, n)$, we can represent the numbers by

$x_i, \quad i = 1, \ldots, n$

The sum of the numbers can be denoted

$\sum_{i=1}^{n} x_i = x_1 + x_2 + \ldots + x_n$

$\sum_{i=1}^{n} x_i$ is a shorthand way of writing the sum.

Theorem (Basic Properties of Summation Notation)
Given ‘c’ is some constant and $a_1, a_2, \ldots, a_n$ are ‘n’ numbers:

(i) $\sum_{i=1}^{n} c a_i = c \sum_{i=1}^{n} a_i$


(ii) $\sum_{i=1}^{n} (a_i + c) = \sum_{i=1}^{n} a_i + nc$

(iii) $\sum_{i=1}^{n} (a_i + c)^2 = \sum_{i=1}^{n} a_i^2 + 2c \sum_{i=1}^{n} a_i + nc^2$

(iv) $\sum_{i=1}^{n} (a_i - c)^2 = \sum_{i=1}^{n} a_i^2 - 2c \sum_{i=1}^{n} a_i + nc^2$

Example:
Consider the following four labelled numbers:

$a_1 = 1, \quad a_2 = 3, \quad a_3 = 2, \quad a_4 = 1$

Use property (iii) of the above theorem to calculate $\sum_{i=1}^{4} (a_i + 1)^2$.
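A minimal Python sketch (not part of the original notes) that checks property (iii) numerically, using the four values shown in the example above with $c = 1$:

```python
# Minimal sketch: verify summation property (iii) for a_1 = 1, a_2 = 3,
# a_3 = 2, a_4 = 1 with c = 1.

a = [1, 3, 2, 1]
c = 1

lhs = sum((a_i + c) ** 2 for a_i in a)                                   # sum of (a_i + c)^2
rhs = sum(a_i ** 2 for a_i in a) + 2 * c * sum(a) + len(a) * c ** 2      # right-hand side of (iii)

print(lhs, rhs)   # both sides give the same value (33 for these numbers)
```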


    3.2 Measures of Central Tendency

For each measure considered there are population and sample versions. We will suppose here there are N values in the population and ‘n’ values in a sample.

Note that at this stage we are only concerned with quantitative variables, and we assume the population contains a finite number of values.

Definition (Mean of a Finite Quantitative Population)
If $x_1, x_2, x_3, \ldots, x_N$ represents a finite population of ‘N’ quantitative data points, then the mean of this population is given by

Population mean $\mu = \dfrac{x_1 + x_2 + \ldots + x_N}{N} = \dfrac{\sum_{i=1}^{N} x_i}{N}$

($\mu$ is the Greek letter ‘mu’)

Definition (Mean of a Sample from a Quantitative Population)
If $x_1, x_2, x_3, \ldots, x_n$ represents a particular sample of size ‘n’ from a quantitative population, then the mean of this sample is given by

Sample mean $\bar{x} = \dfrac{x_1 + x_2 + \ldots + x_n}{n} = \dfrac{\sum_{i=1}^{n} x_i}{n}$


Definition (Mode of a Set of Data)
The mode is the data value that occurs most frequently in a set of data (population or sample).

Definition (The Median of a Set of Data)
If quantitative data is arranged in ascending or descending order, the middle value of the data is called the median. If there is an even number of data points, the median is typically taken to be the arithmetic average of the two middle values.

Example:
Consider the following set of data, which we can assume to be a sample from a population.

 1    1    5    4   12    4
 3    1    2    7    6    6
 5    1    1    5    8    9
10    2    4    2    6   30

$n = 24$, $x_1 = 1$, $x_3 = 5$, $x_{11} = 6$, etc. (if we label across rows then down)
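A minimal Python sketch (not part of the original notes) computing the mean, median and mode for this sample:

```python
# Minimal sketch: sample mean, median and mode for the 24 observations above.
import statistics

x = [1, 1, 5, 4, 12, 4,
     3, 1, 2, 7, 6, 6,
     5, 1, 1, 5, 8, 9,
     10, 2, 4, 2, 6, 30]          # labelled across rows then down: x_1 = 1, x_3 = 5, x_11 = 6

print(statistics.mean(x))          # 5.625
print(statistics.median(x))        # 4.5 (average of the 12th and 13th ordered values)
print(statistics.mode(x))          # 1 (the most frequently occurring value)
```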


    Comparison of the Mean, Median and Mode

The mean takes account of all observation values; therefore it can be affected by extreme values or outliers, i.e. values which differ greatly from the majority of values.

The median and mode are unaffected by extremely high or low values.

The mode may not represent a “central” value in the distribution, as in the above example, but it may be useful, for example, for qualitative data.

If the frequency (or relative frequency) distribution is perfectly symmetric and unimodal, the mean, median and mode will coincide.

[Figure: Symmetric Distribution; the mean, median and mode coincide]

If the distribution is skewed to the right (positively skewed) and unimodal, mode < median < mean.


[Figure: Distribution that is Skewed to the Right; mode < median < mean]

If the distribution is skewed to the left (negatively skewed) and unimodal, mean < median < mode.

[Figure: Distribution that is Skewed to the Left; mean < median < mode]


     MAIN POINTS

     

• A statistical population is a set of measurements or characteristics of elementary units of interest.

• Once a population is defined, a sample is a subset from the population.

• Parameters are numerical characteristics of a population.

• Sample statistics are numerical characteristics of a sample.

• A frequency or relative frequency distribution describes how data is distributed over different classes or categories.

• A histogram shows graphically a frequency, relative frequency or cumulative frequency distribution (the areas of the ‘contiguous’ rectangles are proportional to the frequencies or relative frequencies).

• The mean is affected by ‘extreme’ values; the median and the mode are not affected by ‘extreme’ values.

• The population mean is denoted $\mu$; the sample mean is denoted $\bar{x}$.

• The median divides a set of quantitative data into two equal halves.


    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 2

Required Reading:
Ref. File 1: Section 1.1
Ref. File 3: Sections 3.5(a)-(d), 3.5(f)
Ref. File 4: Introduction and Sections 4.1, 4.2

3. MEASURES OF CENTRAL TENDENCY AND DISPERSION CONTINUED

    3.3 Measures of Dispersion

    (a) The Range

Definition (Range of a Set of Data)
The range of a set of quantitative data is the difference between the highest and lowest data values.

    (b) The Mean Absolute Deviation

Definition (Deviation from the Mean)
Consider a particular value $x_i$ from a finite data set. The deviation from the mean of this value is defined as

• $(x_i - \mu)$ if the population mean $\mu$ is known
• $(x_i - \bar{x})$ if only a sample mean $\bar{x}$ is available


Definition (Mean Absolute Deviation)
(i) If $x_1, x_2, x_3, \ldots, x_N$ represents a finite quantitative population, then the population mean absolute deviation is given by

Population MAD $= \dfrac{\sum_{i=1}^{N} |x_i - \mu|}{N}$

(ii) If $x_1, x_2, x_3, \ldots, x_n$ represents a sample from a quantitative population, then the sample mean absolute deviation is given by

Sample MAD $= \dfrac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$

    (c) The Standard Deviation and Variance

Another, more mathematically convenient, way of analysing the deviations from the mean is to square them. This leads to the definition of the variance.


Definition (Variance of a Finite Quantitative Population)
If $x_1, x_2, x_3, \ldots, x_N$ represent a finite population of N quantitative data points, then the variance of this population is given by

(Finite) Population variance $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Definition (Variance of a Sample from a Quantitative Population)
If $x_1, x_2, x_3, \ldots, x_n$ represent a particular sample of size ‘n’ from a quantitative population, then the variance of this sample is given by

Sample variance $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Alternatively we can equivalently write:

Population variance $\sigma^2 = \dfrac{\sum_{i=1}^{N} x_i^2 - N\mu^2}{N}$

Sample variance $s^2 = \dfrac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$


The standard deviation is defined as the positive square root of the variance.

Definition (Finite Population and Sample Standard Deviations)
(i) If $x_1, x_2, x_3, \ldots, x_N$ represent a finite quantitative population, then the population standard deviation is given by

Population standard deviation $\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

(ii) If $x_1, x_2, x_3, \ldots, x_n$ represent a sample from a quantitative population, then the sample standard deviation is given by

Sample standard deviation $s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$


An advantage of the standard deviation over the variance is that it is expressed in the original units of measure.

Example:
Calculate $s^2$ and ‘s’ for the previous 24-number example. (36.3315, 6.0276)
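A minimal Python sketch (not part of the original notes) verifying these values with the shortcut formula for the sample variance:

```python
# Minimal sketch: sample variance and standard deviation for the 24-number
# sample, using the shortcut formula given above.
import math

x = [1, 1, 5, 4, 12, 4, 3, 1, 2, 7, 6, 6,
     5, 1, 1, 5, 8, 9, 10, 2, 4, 2, 6, 30]
n = len(x)
xbar = sum(x) / n                                           # sample mean

s2 = (sum(xi ** 2 for xi in x) - n * xbar ** 2) / (n - 1)   # sample variance
s = math.sqrt(s2)                                           # sample standard deviation

print(round(s2, 4), round(s, 4))   # 36.3315 and 6.0276 (to 4 decimal places)
```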

    3.4 The Coefficient of Variation

The coefficient of variation is useful for comparing the variability of data sets with means that differ significantly, or data sets based on different units of measure.

Definition (Coefficient of Variation)
(i) For a population with mean $\mu$ and standard deviation $\sigma$:

Population coefficient of variation $= \dfrac{\sigma}{\mu}$

(ii) For a sample with mean $\bar{x}$ and standard deviation ‘s’:

Sample coefficient of variation $= \dfrac{s}{\bar{x}}$


Example:
Suppose we wish to compare the variability of the weights of a given sample of people with the variability of their daily calorie intake. We are told

sample mean of weights = 68 kg
sample standard deviation of weights = 5 kg
sample mean of daily calorie intake = 1200 calories
sample standard deviation of daily calorie intake = 300 calories
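The notes do not state the resulting coefficients; a minimal Python sketch of the comparison (the numerical conclusions below follow directly from the figures given):

```python
# Minimal sketch: sample coefficients of variation for the example above.

cv_weight = 5 / 68         # s / x-bar for weights (about 0.074, i.e. 7.4%)
cv_calorie = 300 / 1200    # s / x-bar for calorie intake (0.25, i.e. 25%)

print(round(cv_weight, 4), round(cv_calorie, 4))
# Relative to its mean, daily calorie intake is considerably more variable
# than weight, even though the two variables use different units of measure.
```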

    3.5 Chebyshev’s Theorem and the Empirical Rule 

Theorem (Chebyshev’s Theorem)
For any quantitative population with a finite variance, the proportion of data points less than ‘c’ standard deviations from the mean is at least $1 - \dfrac{1}{c^2}$, where $c > 0$.

For hump-shaped or bell-shaped (unimodal) distributions, Chebyshev’s theorem will give a conservative indication of the concentration of population data points around the mean. In such cases we can refer to the empirical rule.


The Empirical Rule
For a bell-shaped distribution of sample or population data, it will be approximately true that

• 68% of the data points will lie within 1 standard deviation of the mean.
• 95% of the data points will lie within 2 standard deviations of the mean.
• 99.7% of the data points will lie within 3 standard deviations of the mean.
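A minimal Python sketch (not part of the original notes) placing Chebyshev's lower bound next to the empirical-rule proportions for c = 1, 2, 3:

```python
# Minimal sketch: Chebyshev's bound 1 - 1/c^2 versus the empirical rule.

empirical = {1: 0.68, 2: 0.95, 3: 0.997}     # approximate proportions for bell-shaped data

for c in (1, 2, 3):
    chebyshev = 1 - 1 / c ** 2               # guaranteed minimum proportion within c std devs
    print(c, chebyshev, empirical[c])
# For c = 2, Chebyshev guarantees at least 0.75 of the data within 2 standard
# deviations, while the empirical rule suggests about 0.95 for bell-shaped data,
# illustrating how conservative Chebyshev's bound is in that case.
```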


    4. INTRODUCTORY PROBABILITY THEORY

    4.1 Basic Set Theory

    A set is a collection of objects or elements.

Definitions (Sets)
The set of all elements of interest in a particular problem or context is called the universal set, which can be denoted by, say, $\Omega$. Other basic definitions relating to sets are as follows:

(i) The null set, denoted $\emptyset$, contains no elements.
(ii) If an element denoted ‘x’ is a member of a set A, this is commonly denoted $x \in A$; if ‘x’ is not a member of set A, this can be denoted $x \notin A$.
(iii) The intersection of sets A and B, denoted $A \cap B$, is the set of elements in both A and B.
(iv) The union of sets A and B, denoted $A \cup B$, is the set of elements in A and/or B.
(v) Set A is said to be a subset of set B, denoted $A \subseteq B$, if all elements in A are also in B; if A is not a subset of B, this can be denoted $A \not\subseteq B$.
(vi) The complement of set A, denoted $\bar{A}$, is the set of elements in $\Omega$ but not in A.
(vii) If $A \cap B = \emptyset$, we say A and B are mutually exclusive or disjoint sets; they have no element in common.


Venn diagrams are often a convenient way of portraying sets and the relationship between them. An example is the following diagram.

[Venn diagrams: the universal set $\Omega$ containing overlapping sets A and B, with the overlap representing $A \cap B$; and a second diagram in which A and B do not overlap, i.e. A and B are disjoint/mutually exclusive]

Example:
Suppose we have the set $\Omega = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$.

Define $A = \{1, 3, 5, 7, 9\}$ and $B = \{1, 3, 4, 7\}$.
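A minimal Python sketch (not part of the original notes) applying the set operations of this section to the sets of the example above:

```python
# Minimal sketch: basic set operations with Python's built-in set type.

omega = set(range(1, 11))        # universal set {1, ..., 10}
A = {1, 3, 5, 7, 9}
B = {1, 3, 4, 7}

print(A & B)            # intersection A ∩ B
print(A | B)            # union A ∪ B
print(omega - A)        # complement of A (elements of Ω not in A)
print(B <= A)           # is B a subset of A?  False, since 4 is not in A
print(A & B == set())   # are A and B mutually exclusive?  False
```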


    4.2 Terminology Related to Statistical Experiments

An experiment, in a statistical sense, is an act or process that leads to an outcome which cannot be predicted with certainty.

Definition (Simple Events and Events)
A simple event of an experiment is an outcome that cannot be decomposed into simpler outcomes. An event is a collection or set of one or more simple events. An event is said to have occurred if a simple event included in the event occurs.

Definition (Sample Space of a Statistical Experiment)
The sample space of an experiment, which will be denoted $\Omega$, is the set of all possible simple events. It can be described as the event consisting of all simple events.

Venn diagrams often provide a convenient way of depicting sample spaces and events.


Definition (Discrete Sample Space)
A discrete sample space consists of either a finite number of simple events or a countable and infinite number of simple events.

Definition (Continuous Sample Space)
A continuous sample space consists of simple events that represent all the points in an interval on the real number line. The interval could be of finite or infinite width.

    4.3 Basic Concepts of Probability

    (a) Probabilities of Events as Relative Frequencies

Definition (Probability of an Event)
If $f_E$ is the frequency with which event ‘E’ occurs in ‘n’ repetitions (trials) of an experiment under identical conditions/rules, $P(E)$ is defined as

$P(E) = \lim_{n \to \infty} \dfrac{f_E}{n}$


    (b) Definition of a Probability Distribution

Definition (Probability Distribution)
A probability model or probability distribution for an experiment takes the form of either a list of probabilities of simple events or some other representation of the relative frequency distribution of the underlying population associated with the experiment.

    (c) Axioms of Probability

Suppose an experiment has a sample space $\Omega$. Any assignment of probabilities to events in $\Omega$ (subsets of $\Omega$) must satisfy the following axioms:

1. For any event ‘E’ in $\Omega$, $0 \le P(E) \le 1$.
2. $P(\Omega) = 1$.
3. The probability of an event that is the union of a collection of mutually exclusive events is given by the sum of the probabilities of these mutually exclusive events. (The ‘additive property of probability’)


(d) Assigning Probabilities to Simple Events in Discrete Sample Spaces

There are three broad approaches to assigning probabilities to events.

(i) The Underlying Population Relative Frequency Distribution is Known or Assumed

In this case the relative frequencies of the simple events can be considered the probabilities of these simple events.

As a special case, the ‘classical’ or ‘equally likely’ approach to assigning probabilities is applicable in experiments where it is reasonable to assume that each simple event is equally likely. In this case, if there are ‘n’ simple events, each will occur with probability 1/n.

(ii) The Underlying Population Relative Frequency Distribution is Not Known or Assumed, but the Experiment is Repeatable

This approach relies on past observation of outcomes from an experiment that allows an approximate determination of relative frequencies of simple events and events.

In terms of this approach, the probability of an event is approximated by the relative frequency of the event in a ‘large’ number of identical trials of the experiment considered. This is often referred to as the ‘empirical’ or ‘relative frequency’ approach to assigning probabilities.


(iii) The Underlying Population Relative Frequency Distribution is Not Known or Assumed, and the Experiment is Not Repeatable

In many circumstances an experiment may not be repeatable, i.e. it will only happen once. In such circumstances people assign subjective probabilities to the experiment outcomes which reflect their personal beliefs.

For two events ‘A’ and ‘B’ defined on a sample space $\Omega$:

• $P(A \cap B)$ = probability of simple events in both ‘A’ and ‘B’.
• $P(A \cup B)$ = probability of simple events in ‘A’ and/or ‘B’.

[Venn diagram: sample space $\Omega$ containing overlapping events A and B; the overlap is $A \cap B$]


Events ‘A’ and ‘B’ are said to be mutually exclusive if $A \cap B = \emptyset$. It follows immediately that, if ‘A’ and ‘B’ are mutually exclusive,

$P(A \cap B) = 0$

[Venn diagram: sample space $\Omega$ containing non-overlapping events A and B]

Two Approaches to Determining the Probability of an Event Defined on a Discrete Sample Space:

(i) Add up the probabilities of the simple events included in the event.
(ii) Use various probability rules and laws relating to unions, intersections and complements of events (considered later).


The first approach above can be formalised as performance of the following steps:

(i) Define the experiment.
(ii) List the simple events and assign probabilities to them in a way consistent with the axioms of probability.
(iii) Determine the simple events included in the event of interest.
(iv) Sum the probabilities of the simple events in the event of interest to find its probability.

Example:
Consider the experiment of tossing a fair die once and let ‘A’ be the event of obtaining an odd number of dots on the upward facing side.


Example:
Suppose $\Omega = \{1, 2, 3, 4, 5, 6, 7, 8\}$, $A = \{1, 3, 5, 6\}$, $B = \{2, 3, 4, 5, 8\}$, where $\Omega$ is the sample space of a statistical experiment and all the simple events are equally likely.

Example:
Suppose that for $\Omega = \{1, 2, 3, 4, 5, 6, 7, 8\}$:

$P(1) = P(2) = P(3) = P(6) = 0.1$
$P(4) = P(7) = P(8) = 0.08$
$P(5) = 0.36$

with $A = \{1, 3, 5, 6\}$, $B = \{2, 3, 4, 5, 8\}$.
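A minimal Python sketch (not part of the original notes) applying the first approach, i.e. summing the probabilities of the simple events included in each event of the example above:

```python
# Minimal sketch: P(A) and P(B) by summing simple-event probabilities.

P = {1: 0.1, 2: 0.1, 3: 0.1, 6: 0.1,    # simple-event probabilities; they sum to 1
     4: 0.08, 7: 0.08, 8: 0.08,
     5: 0.36}

A = {1, 3, 5, 6}
B = {2, 3, 4, 5, 8}

P_A = sum(P[s] for s in A)              # 0.1 + 0.1 + 0.36 + 0.1
P_B = sum(P[s] for s in B)              # 0.1 + 0.1 + 0.08 + 0.36 + 0.08

print(round(P_A, 2), round(P_B, 2))     # 0.66 and 0.72
```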


     MAIN POINTS

     

• For a finite population, variance $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

• For a sample, variance $s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

• The standard deviation is the square root of the variance: it has the same units of measure as the data.

• Chebyshev’s theorem applies to all statistical populations.

• The empirical rule applies only to hump-shaped distributions.

• The coefficient of variation measures dispersion relative to the mean. It allows us to compare the dispersions of data sets with different means and units of measure.

• In set notation:
  $\cap$ means ‘and’
  $\cup$ means ‘and/or’
  $\bar{A}$ means ‘not A’


     

• In statistical experiments:
  Simple events cannot be decomposed into simpler outcomes.
  The sample space is the set of all simple events.
  Events are collections or sets of one or more simple events.
  An event occurs if any of its included simple events occurs.

• All statistical experiments can be thought of as sampling from a statistical population.

• Probabilities must obey certain axioms.


    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 3

Required Reading:
Ref. File 4: Sections 4.3, 4.4, 4.6

    4. PROBABILITY THEORY CONTINUED

    4.4 Discrete Bivariate Probability Distributions

Definitions (Joint and Marginal Probabilities)
Suppose a statistical experiment for which simple events take the form of intersections of outcomes with respect to two or more variables. For such a statistical experiment:

• The probabilities of the simple events are referred to as joint probabilities.
• The probabilities of events representing outcomes with respect to one of the variables only are called marginal probabilities.
• A listing or other representation of the joint probabilities is called a joint probability distribution.


Example:
Suppose we have the following data on all 1950 first year students at a particular university.

                           Work Status
Age in Years    Not Working   Part-Time   Full-Time   Row Total
Under 25        1200          200         250         1650
25 - 34          100           75         100          275
35 or over        10            5          10           25
Column Total    1310          280         360         1950

Consider the experiment of selecting one of the students at random. Define the following events for the experiment:

A: Under 25
B: 25 - 34
C: 35 or over
D: Not working
E: Part-time worker
F: Full-time worker

Calculate the following probabilities:

$P(A), \; P(C), \; P(D), \; P(D \cap A), \; P(C \cup F), \; P(\bar{C}), \; P(\bar{C} \cap E)$

($11/13$, $1/78$, $131/195$, $8/13$, $5/26$, $77/78$, $11/78$)
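A minimal Python sketch (not part of the original notes) computing several of these probabilities as relative frequencies from the table of counts; the helper function `prob` is introduced here only for illustration:

```python
# Minimal sketch: probabilities for the first-year student example.
from fractions import Fraction

counts = {                       # rows: age group, columns: work status
    "under 25":   {"not working": 1200, "part-time": 200, "full-time": 250},
    "25-34":      {"not working": 100,  "part-time": 75,  "full-time": 100},
    "35 or over": {"not working": 10,   "part-time": 5,   "full-time": 10},
}
n = Fraction(1950)

def prob(cells):
    """Sum the counts of the listed (age, status) cells and divide by n."""
    return sum(Fraction(counts[a][s]) for a, s in cells) / n

P_A = prob([("under 25", s) for s in ("not working", "part-time", "full-time")])
P_C = prob([("35 or over", s) for s in ("not working", "part-time", "full-time")])
P_D = prob([(a, "not working") for a in counts])
P_D_and_A = prob([("under 25", "not working")])

print(P_A, P_C, P_D, P_D_and_A)   # 11/13, 1/78, 131/195, 8/13
```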


    4.5 Useful Counting Techniques

    (a) The Multiplicative Rule

Theorem (Multiplicative Rule of Counting)
Suppose two sets of elements, sets A and B, consist of $n_A$ and $n_B$ distinct elements, respectively; $n_A$ and $n_B$ need not be equal. Then it is possible to form $n_A n_B$ distinct pairs of elements consisting of one element from set A and one element from set B, without regard to order within a pair.

Example:
If a take-away food store sells 10 different food items and 5 different types of drink, $10 \times 5 = 50$ distinct food/drink pairs are possible.

The multiplicative rule can be extended naturally. Thus $n_1 n_2 \ldots n_k$ different sets of ‘k’ elements are possible if one selects an element from each of ‘k’ groups consisting of $n_1, n_2, \ldots, n_k$ distinct elements, respectively.

Example:
Suppose we select 5 people at random. What is the probability that they were born on different days of the week, assuming an individual has an equal probability of being born on any of the seven days of the week? (Approx. 0.1499)


A simple event here is an ordered sequence of 5 elements, the first representing the day of the week the first person was born on, the second the day the second person was born on, and so forth.
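A minimal Python sketch (not part of the original notes) of this calculation using the extended multiplicative rule:

```python
# Minimal sketch: probability that 5 randomly selected people were all born
# on different days of the week.

favourable = 7 * 6 * 5 * 4 * 3   # ordered sequences of 5 distinct weekdays
total = 7 ** 5                   # all ordered sequences of 5 weekdays

print(favourable / total)        # approximately 0.1499
```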

    (b) Permutations

Definition (Permutations)
A permutation is an ordered sequence of elements.

Definition (Factorial Notation)
If ‘N’ is a non-negative integer, we define

$N! = N(N-1)(N-2) \ldots (3)(2)(1)$  (‘N-factorial’)

and

$0! = 1$


Theorem (Number of Permutations)
The total number of possible distinct permutations (ordered sequences) of ‘R’ elements selected (without replacement) from ‘N’ distinct elements, denoted ${}^N P_R$, is given by

${}^N P_R = \dfrac{N!}{(N-R)!}$

Example:
Consider the numbers 1, 2, 3, 4. How many permutations of these four numbers taken 2 at a time can be found? (12)

    (c) Combinations

Definition (Combinations)
A set of ‘R’ elements selected from a set of ‘N’ distinct elements without regard to order is called a combination.

Theorem (Number of Combinations)
The total number of possible combinations of ‘R’ elements selected from a set of ‘N’ distinct elements is given by

${}^N C_R = \dfrac{N!}{R!(N-R)!}$


Example:
In how many ways can a committee of 4 people be chosen from a group of 7 people? (35)
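A minimal Python sketch (not part of the original notes) checking the two counting examples with the standard library (Python 3.8+):

```python
# Minimal sketch: counting permutations and combinations.
import math

print(math.perm(4, 2))   # 4!/(4-2)! = 12 ordered pairs from {1, 2, 3, 4}
print(math.comb(7, 4))   # 7!/(4! 3!) = 35 possible committees of 4 from 7 people
```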

    (d) Permutations of ‘N’ Non-Distinct Elements

Theorem (Number of Permutations of ‘N’ Non-Distinct Elements)
Consider a set of ‘N’ elements of which $N_1$ are alike, $N_2$ are alike, ....., and $N_r$ are alike, where $N_i \ge 1$ $(i = 1, \ldots, r)$ and $\sum_{i=1}^{r} N_i = N$. Then the number of distinct permutations of these ‘N’ elements is given by

$\dfrac{N!}{N_1! N_2! \ldots N_r!}$

If the above result is specialized to the case where ‘x’ is the number of distinct arrangements (or distinct permutations) of ‘N’ objects where ‘R’ are alike and $(N - R)$ are alike, then

$x = \dfrac{N!}{R!(N-R)!} = {}^N C_R$


Example:
Say we have 3 black flags and 2 red flags. How many distinct ways are there of arranging these flags in a row? (10)

Example:
Suppose there are 6 applicants for 2 similar jobs. As the personnel manager is too lazy he simply selects 2 of the applicants at random and gives them each a job. What is the probability that he selects one of the 2 best applicants and 1 of the 4 worst applicants? (8/15)
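A minimal Python sketch (not part of the original notes) confirming both answers with factorials and combinations:

```python
# Minimal sketch: the flag-arrangement and job-applicant examples.
import math
from fractions import Fraction

# 3 black and 2 red flags: permutations of 5 non-distinct elements.
arrangements = math.factorial(5) // (math.factorial(3) * math.factorial(2))
print(arrangements)                     # 10

# One of the 2 best and one of the 4 worst applicants, out of C(6, 2) equally
# likely pairs of applicants.
p = Fraction(math.comb(2, 1) * math.comb(4, 1), math.comb(6, 2))
print(p)                                # 8/15
```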


    4.6 Conditional Probability

Definition (Conditional Probability)
The probability of event ‘A’ occurring given that event ‘B’ occurs, or the conditional probability of ‘A’ given ‘B’ (has occurred), is denoted $P(A|B)$. Provided $P(B) \ne 0$, this conditional probability is defined to be

$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$

Example:
Suppose that a survey of women aged 20-30 years suggests the following joint probability table relating to marital status and desire to become pregnant within the next 12 months.


                         Desire
Marital status   Pregnancy   No pregnancy   Total
Married          0.08        0.47           0.55
Unmarried        0.02        0.43           0.45
Total            0.10        0.90           1.00
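The notes do not state the resulting conditional probabilities; a minimal Python sketch of how they follow from the joint table above:

```python
# Minimal sketch: conditional probabilities from the joint probability table.

P_married = 0.55
P_preg_and_married = 0.08
print(round(P_preg_and_married / P_married, 4))     # P(pregnancy | married), about 0.1455

P_unmarried = 0.45
P_preg_and_unmarried = 0.02
print(round(P_preg_and_unmarried / P_unmarried, 4)) # P(pregnancy | unmarried), about 0.0444
```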

Theorem (Multiplicative Law of Probability)
Suppose events ‘A’ and ‘B’ are defined on a sample space $\Omega$. Then

$P(A \cap B) = P(A)P(B|A) = P(B)P(A|B)$

Example:
Define events ‘A’ and ‘B’ in the following way:

‘A’: A student achieves a mark of over 65% in a first year statistics exam
‘B’: A student goes on to complete her bachelor’s degree.


Suppose past experience indicates

$P(A) = 0.7$, $P(B|A) = 0.88$

    4.7 Independence of Events

Sometimes, whether an event ‘B’ has occurred or not will have no effect on the probability of ‘A’ occurring. In this case we say events ‘A’ and ‘B’ are independent.

Definition (Independent and Dependent Events)
Events ‘A’ and ‘B’ are said to be statistically independent if

$P(A \cap B) = P(A)P(B)$

If $P(A \cap B) \ne P(A)P(B)$, the events are said to be statistically dependent.


Alternative Definition (Independent and Dependent Events)
Events ‘A’ and ‘B’ are said to be statistically independent if

$P(A|B) = P(A)$
$P(B|A) = P(B)$

Otherwise the events are said to be statistically dependent.

Example:
Consider the single die tossing experiment again and define the following events:

‘A’: an odd number of dots results
‘B’: a number of dots greater than 2 results

Are ‘A’ and ‘B’ independent?
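A minimal Python sketch (not part of the original notes) checking independence for these events, treating the six faces as equally likely simple events:

```python
# Minimal sketch: are 'odd number' and 'more than 2 dots' independent for a fair die?
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}                    # odd number of dots
B = {3, 4, 5, 6}                 # more than 2 dots

def P(event):
    return Fraction(len(event), len(omega))    # equally likely simple events

print(P(A & B))                  # 1/3
print(P(A) * P(B))               # (1/2)(2/3) = 1/3, equal to P(A ∩ B), so A and B are independent
```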


    4.8 More Useful Probability Rules

    (a) The Additive Law of Probability

Theorem (Additive Law of Probability)
For two events ‘A’ and ‘B’ defined on a sample space $\Omega$

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Example:
Again suppose that for $\Omega = \{1, 2, 3, 4, 5, 6, 7, 8\}$:

$P(1) = P(2) = P(3) = P(6) = 0.1$
$P(4) = P(7) = P(8) = 0.08$
$P(5) = 0.36$

with $A = \{1, 3, 5, 6\}$, $B = \{2, 3, 4, 5, 8\}$.
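A minimal Python sketch (not part of the original notes) applying the additive law to this example and checking the result against direct summation over $A \cup B$:

```python
# Minimal sketch: additive law of probability for the example above.

P = {1: 0.1, 2: 0.1, 3: 0.1, 6: 0.1, 4: 0.08, 7: 0.08, 8: 0.08, 5: 0.36}
A = {1, 3, 5, 6}
B = {2, 3, 4, 5, 8}

P_A = sum(P[s] for s in A)                    # 0.66
P_B = sum(P[s] for s in B)                    # 0.72
P_A_and_B = sum(P[s] for s in A & B)          # P({3, 5}) = 0.46

P_union_rule = P_A + P_B - P_A_and_B          # additive law
P_union_direct = sum(P[s] for s in A | B)     # direct summation over A ∪ B

print(round(P_union_rule, 2), round(P_union_direct, 2))   # both give 0.92
```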


    (b) The Complementation Rule

Theorem 4.7 (Complementation Rule)
Suppose an event ‘E’ and its complement $\bar{E}$ defined on some sample space $\Omega$. Then

$P(\bar{E}) = 1 - P(E)$

(c) The Law of Total Probability

Theorem (Law of Total Probability)
Suppose a sample space $\Omega$ and a set of ‘k’ events $E_1, E_2, \ldots, E_k$ such that

• $P(E_i) > 0$  $(i = 1, \ldots, k)$
• $E_i \cap E_j = \emptyset$  $(i \ne j)$  (i.e. the events are mutually exclusive)
• $E_1 \cup E_2 \cup \ldots \cup E_k = \Omega$  (i.e. the events are exhaustive on $\Omega$)

Then for any event ‘A’ defined on $\Omega$:

$P(A) = P(E_1 \cap A) + P(E_2 \cap A) + \ldots + P(E_k \cap A)$
$= P(E_1)P(A|E_1) + P(E_2)P(A|E_2) + \ldots + P(E_k)P(A|E_k)$
$= \sum_{j=1}^{k} P(E_j)P(A|E_j)$


     MAIN POINTS

     

• In some statistical experiments the number of basic outcomes in the sample space or event of interest can be enumerated by using the ‘multiplicative rule’, permutation or combination formulae, depending on how a basic outcome can be represented most appropriately.

• $P(A|B)$ means the probability event ‘A’ occurs given that event ‘B’ has occurred. The conditional probability definition is

$P(A|B) = \dfrac{P(A \cap B)}{P(B)}$

• Multiplicative law of probability: $P(A \cap B) = P(A)P(B|A) = P(B)P(A|B)$

• Events ‘A’ and ‘B’ are statistically independent if the probability of ‘A’ occurring is not affected by whether ‘B’ has occurred.

• Events ‘A’ and ‘B’ are independent if $P(A \cap B) = P(A)P(B)$, or equivalently $P(A|B) = P(A)$.

• Additive law of probability: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$


    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 4

Required Reading:
Ref. File 4: Sections 4.7 to 4.9
Ref. File 5: Introduction and Sections 5.1 to 5.4

    4. PROBABILITY THEORY CONTINUED

    4.9 Sampling With and Without Replacement

Definition (Random Sample from a Statistical Population)
A random sample of ‘n’ elements from a statistical population is such that every possible combination of ‘n’ elements from the population has an equal probability of being in the sample.

Many experiments involve taking a random sample from a finite population. If we sample with replacement, we effectively return each observation to the population before making the next selection. In this way the population from which we are sampling remains the same from one selection to the next; provided sampling is random, the successive outcomes will be independent.

If we sample without replacement from a finite population, the outcome of any one selection will depend on the outcomes of all previous selections; the population is reduced with each selection.


Example:
Suppose that in a given street 50 residents voted in the last election. Of these, 15 voted for party ‘A’, 30 voted for party ‘B’ and 5 voted for neither party ‘A’ nor ‘B’. Suppose that one evening a candidate for the next election visits the residents of the street to introduce herself. What is the probability that the first two eligible voters she meets voted for party ‘A’ at the last election? (3/35)

Example:
Consider the experiment of successively drawing 2 cards from a deck of 52 playing cards. Define the following events:

$A_1$: ace on first draw
$A_2$: ace on second draw

What is the probability of selecting 2 aces if sampling (drawing) is (i) without replacement, and (ii) with replacement? (1/221, 1/169)
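A minimal Python sketch (not part of the original notes) of these calculations, including the street-voter example above:

```python
# Minimal sketch: sampling with and without replacement.
from fractions import Fraction

street = Fraction(15, 50) * Fraction(14, 49)    # two party-'A' voters in a row, no replacement
print(street)                                   # 3/35

without = Fraction(4, 52) * Fraction(3, 51)     # two aces, second draw from a reduced deck
with_repl = Fraction(4, 52) * Fraction(4, 52)   # two aces, deck restored before the second draw
print(without, with_repl)                       # 1/221 and 1/169
```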


Note: If we simultaneously select a sample of ‘n’ elements, we are effectively sampling without replacement.

    4.10 Probability Trees

Tree diagrams can be a useful aid in calculating the probabilities of intersections of events (i.e. joint probabilities).

Example:
Greasy Mo’s take-away food store offers special $10 meal deals consisting of a small pizza or a kebab, together with a can of soft drink, a milkshake or a cup of fruit juice. Past experience has shown that 60% of meal deal buyers choose a pizza (‘P’), 40% choose kebabs (‘K’), 75% choose soft drink (‘S’), 20% choose a milkshake (‘M’) and 5% choose fruit juice (‘J’). Assume the events ‘P’ and ‘K’ are independent of the events ‘S’, ‘M’ and ‘J’. What is the probability that a meal deal customer (chosen at random) will choose a pizza and fruit juice? (0.03)

    The tree diagram for this example can be drawn as below.


[Probability tree: the first branching is P (0.6) or K (0.4); each of these branches into S (0.75), M (0.2) or J (0.05). Multiplying along each path gives the joint probabilities:]

$P(P \cap S) = 0.6(0.75) = 0.45$
$P(P \cap M) = 0.6(0.2) = 0.12$
$P(P \cap J) = 0.6(0.05) = 0.03$
$P(K \cap S) = 0.4(0.75) = 0.30$
$P(K \cap M) = 0.4(0.2) = 0.08$
$P(K \cap J) = 0.4(0.05) = 0.02$
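A minimal Python sketch (not part of the original notes) that enumerates the branches of this tree and multiplies along each path, using the assumed independence of the food and drink choices:

```python
# Minimal sketch: joint probabilities from the meal-deal probability tree.

food = {"P": 0.6, "K": 0.4}                  # pizza or kebab
drink = {"S": 0.75, "M": 0.2, "J": 0.05}     # soft drink, milkshake or fruit juice

for f, pf in food.items():
    for d, pd in drink.items():
        # Independence lets us multiply the probabilities along each branch.
        print(f, d, round(pf * pd, 2))
# The (P, J) branch gives 0.6 * 0.05 = 0.03, the probability asked for above.
```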


5. PROBABILITY DISTRIBUTIONS OF DISCRETE RANDOM VARIABLES

    5.1 Probability Distributions and Random Variables

A probability distribution can be considered a theoretical model for a relative frequency distribution of data from a real life population.

A probability distribution thus specifies the probabilities associated with the various outcomes of a statistical experiment. It can take the form of a table, a graph or some formula.

From now on we shall be concerned with the characteristics of probability distributions. However, to facilitate our study we shall now represent simple events and events associated with statistical experiments by values of random variables.

Definition (Random Variable)
A random variable X is a rule that assigns to each simple event of a statistical experiment a unique numerical value.

The above definition can also be expressed in the following slightly more mathematical way.


Alternative Definition (Random Variable)
A random variable X is a real valued function for which the domain is the sample space of a statistical experiment.

In most statistical experiments of interest, outcomes give rise to quantitative data that can be considered values of the random variable being studied.

In experiments which give rise to categorical or qualitative data, a random variable can normally also be defined.

Example:
Consider the experiment of selecting a person at random and noting their hair colour.


Definition (Discrete Random Variable)
A discrete random variable can only assume a finite, or infinite and countable, number of values.

Definition (Continuous Random Variable)
A continuous random variable can assume any value in an interval (finite or infinite).

Definition (Discrete Probability Distribution)
A discrete probability distribution lists a probability for, or provides a means (e.g. a rule or formula) of assigning a probability to, each value a discrete random variable can take.

Suppose our random variable is called X. Then $P(X = x)$ represents the probability that the random variable takes on the particular value ‘x’.

Properties of the Discrete Probability Distribution of a Random Variable X:

• $0 \le P(X = x) \le 1$ for all values of ‘x’
• $\sum_{\text{all } x} P(X = x) = 1$


Example:
Consider again the experiment of tossing a fair die once and noting the number of dots on the upward facing side (X).

Definition (Cumulative Distribution Function)
The cumulative distribution function of a random variable X, denoted $F(x)$, is defined as

$F(x) = P(X \le x)$

where ‘x’ is any real number.

    5.2 Expected Values of Random Variables

It is of interest to have a measure of the centre of the probability distribution of a random variable X. This role is filled by the expected value of X.


Definition (Expected Value of a Discrete Random Variable)
The expected value of a discrete random variable X is defined as

$E(X) = \sum_{\text{all } x} x P(X = x)$

If the statistical experiment considered generates values of the random variable that coincide with values in the population considered, and the theoretical probability distribution of the random variable and the population relative frequency distribution are the same, the mean of the theoretical distribution of X will be the same as the population mean (i.e. $\mu$). That is, $E(X) = \mu$.

Example:
Suppose you buy a lottery ticket for $10. The sole prize in the lottery is $100,000 and 100,000 tickets are sold. If the lottery is fair (i.e. each ticket sold has an equal chance of winning), what will be your expected gain from buying the lottery ticket? (-9)
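A minimal Python sketch (not part of the original notes) of this expected-gain calculation, with the gain defined net of the $10 ticket price:

```python
# Minimal sketch: expected gain from the $10 lottery ticket.
from fractions import Fraction

p_win = Fraction(1, 100000)
gain = {100000 - 10: p_win,      # win the prize, net of the ticket price
        -10: 1 - p_win}          # lose the ticket price

expected_gain = sum(x * p for x, p in gain.items())
print(expected_gain)             # -9, i.e. an expected loss of $9
```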


Theorem (Expected Value of a Function of a Discrete Random Variable)
Suppose a function $g(X)$ of a discrete random variable X. The expected value of this function, if it exists, is given by

$E[g(X)] = \sum_{\text{all } x} g(x) P(X = x)$

Theorem 5.2 (Various Properties of Expected Values)

• If ‘c’ is any constant then $E(c) = c$
• If ‘c’ is any constant and $g(X)$ is any function of a discrete or continuous random variable X then $E[c\,g(X)] = c\,E[g(X)]$
• If $g_i(X)$ $(i = 1, \ldots, k)$ are ‘k’ functions of a discrete or continuous random variable X then $E[g_1(X) + \ldots + g_k(X)] = E[g_1(X)] + \ldots + E[g_k(X)]$
• If $h(X)$ and $g(X)$ are two functions of a discrete or continuous random variable X such that $h(X) \le g(X)$ for all X, then $E[h(X)] \le E[g(X)]$


    5.3 The Variance of a Random Variable

To gauge the dispersion of a random variable X about its expected value or mean we can calculate the expected value of its squared distance $(X - E(X))^2$ from the mean. This is called the variance of the random variable X, denoted $Var(X)$.

Definition (Variance of a Random Variable)
The variance of any random variable X (discrete or continuous) is given by

$Var(X) = E[(X - E(X))^2]$

Definition (Standard Deviation of a Random Variable)
The standard deviation of any random variable X (discrete or continuous) is given by

$SD(X) = \sqrt{Var(X)} = \sqrt{E[(X - E(X))^2]}$

Again assuming the probability distribution of X is an accurate representation of the population relative frequency distribution of X, we can write $Var(X) = \sigma^2$, where $\sigma^2$ is the population variance.


An alternative way of writing (and calculating) $Var(X)$ is

$Var(X) = E(X^2) - [E(X)]^2 = \sum_{\text{all } x} x^2 P(X = x) - [E(X)]^2$  (if X is discrete)

Example:
Suppose a lottery offers 3 prizes: $1,000, $2,000 and $3,000. 10,000 tickets are sold and each ticket has an equal chance of winning a prize. Calculate the variance and standard deviation of the random variable X representing the value of the prize won by a ticket. (1399.64, 37.4118)

x        P(X = x)       x²           x·P(X = x)   x²·P(X = x)
0        9997/10000     0            0            0
1,000    1/10000        1,000,000    0.1          100
2,000    1/10000        4,000,000    0.2          400
3,000    1/10000        9,000,000    0.3          900
Total                                0.6          1400
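A minimal Python sketch (not part of the original notes) reproducing the totals in the working table above:

```python
# Minimal sketch: variance and standard deviation of the prize value X.
import math

dist = {0: 9997 / 10000, 1000: 1 / 10000, 2000: 1 / 10000, 3000: 1 / 10000}

EX = sum(x * p for x, p in dist.items())          # E(X) = 0.6
EX2 = sum(x ** 2 * p for x, p in dist.items())    # E(X^2) = 1400

var = EX2 - EX ** 2                               # Var(X) = E(X^2) - [E(X)]^2
print(round(var, 2), round(math.sqrt(var), 4))    # 1399.64 and 37.4118
```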


If we wish to determine the variance of a linear function $Y = g(X) = a + bX$ of a random variable X, the following rule can be used

$Var(Y) = Var(a + bX) = b^2 Var(X)$


    5.4 The Binomial Distribution

The binomial distribution is a discrete probability distribution based on ‘n’ repetitions of an experiment whose outcomes are represented by a Bernoulli random variable.

    (a) Bernoulli Experiments

A Bernoulli experiment (or trial) is such that only 2 outcomes are possible. These outcomes can be denoted success (‘S’) and failure (‘F’), with probabilities ‘p’ and $(1 - p)$, respectively.

A Bernoulli random variable Y is usually defined so that it takes the value 1 if the outcome of a Bernoulli experiment is a success, and the value 0 if the outcome is a failure. Thus

$P(Y = 1) = p$
$P(Y = 0) = (1 - p)$

The mean and variance of a Bernoulli random variable defined in the above way are

$E(Y) = p$
$Var(Y) = p(1 - p)$


    (b) Binomial Experiments

Definition (Binomial Experiment)
A binomial experiment fulfils the following requirements:

(i) There are ‘n’ repetitions or ‘trials’ of a Bernoulli experiment for which there are only two outcomes, ‘success’ or ‘failure’.
(ii) All trials are performed under identical conditions.
(iii) The trials are independent.
(iv) The probability of success ‘p’ is the same for each trial.
(v) The random variable of interest, say X, is the number of successes observed in the ‘n’ trials.

Theorem (The Binomial Probability Function)
Let X represent the number of successes in a binomial experiment consisting of ‘n’ trials and with a probability ‘p’ of success on each trial. The probability of ‘x’ successes in such an experiment is given by

$P(X = x) = {}^n C_x \, p^x (1-p)^{n-x}$  for $x = 0, 1, 2, 3, \ldots, n$


Example:
A company that supplies reverse-cycle air conditioning units has found from experience that 70% of the units it installs require servicing within the first 6 weeks of operation. In a given week the firm installs 10 air conditioning units. Calculate the probability that, within 6 weeks

• 5 of the units require servicing (0.1029 approx.)
• none of the units require servicing (0 approx.)
• all of the units require servicing (0.0282 approx.)
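A minimal Python sketch (not part of the original notes) evaluating the binomial probability function for this example, with the helper `binom_pmf` introduced only for illustration:

```python
# Minimal sketch: binomial probabilities for the air-conditioning example
# (n = 10 installations, p = 0.7 chance each needs servicing within 6 weeks).
import math

def binom_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

print(round(binom_pmf(5, 10, 0.7), 4))    # 0.1029
print(round(binom_pmf(0, 10, 0.7), 4))    # 0.0 (about 6e-06, effectively 0 as in the notes)
print(round(binom_pmf(10, 10, 0.7), 4))   # 0.0282
```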


    (c) Cumulative Binomial Probabilities

(Extract of Appendix 3)

CUMULATIVE BINOMIAL PROBABILITIES: $P(X \le x \mid p, n)$

                                           p
 n    x   0.05    0.10    0.15    0.20    0.25    0.30    0.35    0.40   ....   0.70
 1    0  0.9500  0.9000  0.8500  0.8000  0.7500  0.7000  0.6500  0.6000  ....  0.3000
      1  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  ....  1.0000
 2    0  0.9025  0.8100  0.7225  0.6400  0.5625  0.4900  0.4225  0.3600  ....  0.0900
      1  0.9975  0.9900  0.9775  0.9600  0.9375  0.9100  0.8775  0.8400  ....  0.5100
      2  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  ....  1.0000
 3    0  0.8574  0.7290  0.6141  0.5120  0.4219  0.3430  0.2746  0.2160  ....  0.0270
      1  0.9928  0.9720  0.9393  0.8960  0.8438  0.7840  0.7183  0.6480  ....  0.2160
      2  0.9999  0.9990  0.9966  0.9920  0.9844  0.9730  0.9571  0.9360  ....  0.6570
      3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  ....  1.0000
 ....
10    0  0.5987  0.3487  0.1969  0.1074  0.0563  0.0282  0.0135  0.0060  ....  0.0000
      1  0.9139  0.7361  0.5443  0.3758  0.2440  0.1493  0.0860  0.0464  ....  0.0001
      2  0.9885  0.9298  0.8202  0.6778  0.5256  0.3828  0.2616  0.1673  ....  0.0016
      3  0.9990  0.9872  0.9500  0.8791  0.7759  0.6496  0.5138  0.3823  ....  0.0106
      4  0.9999  0.9984  0.9901  0.9672  0.9219  0.8497  0.7515  0.6331  ....  0.0473
      5  1.0000  0.9999  0.9986  0.9936  0.9803  0.9527  0.9051  0.8338  ....  0.1503
      6  1.0000  1.0000  0.9999  0.9991  0.9965  0.9894  0.9740  0.9452  ....  0.3504
      7  1.0000  1.0000  1.0000  0.9999  0.9996  0.9984  0.9952  0.9877  ....  0.6172
      8  1.0000  1.0000  1.0000  1.0000  1.0000  0.9999  0.9995  0.9983  ....  0.8507
      9  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  0.9999  ....  0.9718
     10  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  ....  1.0000

Example:
Referring to the previous air conditioning unit example, calculate the probability that, within 6 weeks of installation:

•  less than 8 of the air conditioners require servicing. (0.6172 approx.)
•  4 or more of the air conditioners require servicing. (0.9894 approx.)
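The two cumulative probabilities can also be built up from the probability function rather than read from the table; a sketch (not part of the notes), using the same n = 10 and p = 0.7:

```python
# Sketch: cumulative binomial probabilities for the same example (n = 10, p = 0.7).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x): sum the probability function up to and including x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 10, 0.7
print(round(binom_cdf(7, n, p), 4))        # P(X < 8) = P(X <= 7) = 0.6172
print(round(1 - binom_cdf(3, n, p), 4))    # P(X >= 4) = 1 - P(X <= 3) = 0.9894
```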


Example:
Again referring to the previous air conditioning unit example, use the cumulative binomial tables to calculate the probability that, within 6 weeks of installation:

•  exactly 5 units require servicing (0.103)
•  all 10 units require servicing (0.0282)
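A point probability can be recovered from the cumulative table by differencing, P(X = x) = P(X ≤ x) − P(X ≤ x − 1). A sketch using the table values quoted above for n = 10, p = 0.70:

```python
# Sketch: point probabilities from cumulative table values, P(X = x) = F(x) - F(x - 1).
F4, F5 = 0.0473, 0.1503       # table values F(4) and F(5) for n = 10, p = 0.70
F9, F10 = 0.9718, 1.0000      # table values F(9) and F(10)

print(round(F5 - F4, 4))      # P(X = 5)  -> 0.103
print(round(F10 - F9, 4))     # P(X = 10) -> 0.0282
```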

    (d) Characteristics of the Binomial Distribution

Theorem (Mean and Variance of a Binomial Random Variable)
Let X represent the number of successes in a binomial experiment consisting of ‘n’ trials, and where the probability of success on each trial is ‘p’. Then

E(X) = np        Var(X) = np(1 − p)
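These formulas can be checked by computing E(X) and Var(X) directly from the probability function; a sketch (not from the notes) for the illustrative case n = 10, p = 0.7:

```python
# Sketch: verify E(X) = np and Var(X) = np(1 - p) by direct summation over the pmf.
from math import comb

n, p = 10, 0.7
pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))
var = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))

print(round(mean, 4), n * p)                         # 7.0 7.0
print(round(var, 4), round(n * p * (1 - p), 4))      # 2.1 2.1
```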


Each combination of ‘n’ and ‘p’ gives a particular binomial distribution. We say ‘n’ and ‘p’ are the parameters of the binomial distribution.

If p = 0.5, the binomial distribution is symmetric.

Example
Suppose n = 5 and p = 0.5.

(probability histogram: bars of height 0.0313, 0.1563, 0.3125, 0.3125, 0.1563, 0.0313 at X = 0, 1, 2, 3, 4, 5)

The binomial distribution will be skewed to the left (i.e. ‘negatively skewed’) if p > 0.5, and skewed to the right (i.e. ‘positively skewed’) if p < 0.5. In either case the tendency to be skewed diminishes as ‘n’ increases.
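A short sketch (not from the notes) that prints the probability histogram heights for the symmetric case in the example above, and for a skewed case p = 0.1 chosen for comparison:

```python
# Sketch: binomial probabilities for n = 5 with p = 0.5 (symmetric) and p = 0.1 (skewed).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n = 5
for p in (0.5, 0.1):
    row = " ".join(f"{binom_pmf(x, n, p):.5f}" for x in range(n + 1))
    print(f"p = {p}: {row}")

# p = 0.5: 0.03125 0.15625 0.31250 0.31250 0.15625 0.03125   (symmetric)
# p = 0.1: 0.59049 0.32805 0.07290 0.00810 0.00045 0.00001   (positively skewed)
```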


     

•  The binomial distribution is a model for the relative frequency (probability) distribution of numbers of successes in ‘n’ trials of a Bernoulli experiment.

•  The binomial distribution can be represented by the probability function

   P(X = x) = nCx · p^x · (1 − p)^(n−x)

   where ‘n’ is the number of trials, ‘x’ the number of successes and ‘p’ the probability of success at each trial.


    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 5

Required Reading:
Ref. File 6: Introduction and Sections 6.1 to 6.4

    6. CONTINUOUS PROBABILITY DISTRIBUTIONS

    6.1 Introduction

From now on we shall be mainly concerned with studying the distributions of continuous random variables. As we have noted, a continuous random variable can assume any value in a given interval.

The probability distribution for a continuous random variable X will have a smooth curve or line as its graphical representation. The heights of the points on this curve will be given by a function of ‘x’, denoted f(x), which is variously called the probability density function, the probability distribution or simply the density function of the random variable X.

Areas under a density function f(x) represent probabilities of X taking on values in the corresponding intervals.


(figure: density curve y = f(x); the shaded area between a and b represents Area = P(a ≤ X ≤ b))

Properties of Density Functions
If f(x) is a valid density function, it satisfies the following two properties:

(i)  f(x) ≥ 0 for all x

(ii) ∫ f(x) dx = 1, where the integral is taken over the entire range of x (i.e. the total area under the density function equals 1)

Note: For a continuous random variable the probability associated with any particular value of the variable is 0.

The mean and variance of a continuous random variable are normally determined using calculus.


    6.2 The Uniform Distribution

“If a random variable X can take on any value in a given finite interval a ≤ x ≤ b, and the probability of the variable taking a value in a given finite sub-interval is the same as the probability the variable takes a value in any other finite sub-interval of the same width, we say the variable X is uniformly distributed.” We have the following formal definition.

Definition (Uniform Random Variable)
A continuous random variable X is said to be uniformly distributed over the finite interval a ≤ X ≤ b if and only if its density function is given by

f(x) = 1/(b − a)    if a ≤ x ≤ b
f(x) = 0            if x < a or x > b

We can calculate probabilities with respect to the random variable X in the above definition from

P(c ≤ X ≤ d) = (d − c)/(b − a)    for c ≥ a and d ≤ b


(figure: uniform density f(x) = 1/(b − a) over the interval from a to b; total area = 1, and the area between c and d gives P(c ≤ X ≤ d))

Theorem (Expected Value and Variance of a Uniform Random Variable)
Suppose the random variable X is uniformly distributed over the finite interval a ≤ x ≤ b. The expected value and variance of X are, respectively

E(X) = (a + b)/2        Var(X) = (b − a)²/12

Example:
The amount of petrol sold daily by a service station (say X) is known to be uniformly distributed between 4,000 and 6,000 litres inclusive. What is the probability of sales on any one day being between 5,500 and 6,000 litres? (0.25)
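A sketch of the calculation for the petrol example, using the uniform probability formula above (not part of the notes):

```python
# Sketch: P(c <= X <= d) = (d - c) / (b - a) for X uniform on [a, b].
def uniform_prob(c, d, a, b):
    """Probability a uniform(a, b) variable falls between c and d (a <= c <= d <= b)."""
    return (d - c) / (b - a)

# Petrol example: X uniform on [4000, 6000]; P(5500 <= X <= 6000)
print(uniform_prob(5500, 6000, 4000, 6000))   # 0.25
```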


    6.3 The Normal (Gaussian) Distribution

The normal distribution represents a family of “bell-shaped” distributions that are distinguished according to their mean and variance.

Definition (Normally Distributed Random Variable)
A random variable X is normally distributed if and only if it has a density function of the following form:

f(x) = (1/(σ√(2π))) · e^(−(x − μ)²/(2σ²))    for all real ‘x’

Where:

•  μ and σ² are parameters of the distribution of X. They are used to represent E(X) and Var(X), respectively.

•  ‘e’ is the irrational number that serves as the base for natural logarithms (e ≈ 2.7182...).

•  π is the irrational number representing the ratio of the circumference of a circle to its diameter (π ≈ 3.1415...).

A normal distribution with mean μ and variance σ² is usually denoted N(μ, σ²).

The normal distribution has a positive density for all real ‘x’. Therefore it can strictly speaking never exactly match the distribution of a variable that only takes on non-negative values. However, even in such cases it can often give a very good approximation.
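A sketch (not from the notes) of the density formula as code; evaluating it at a few points simply shows that the curve is highest at x = μ and falls away symmetrically:

```python
# Sketch: the normal density function f(x) with mean mu and variance sigma^2.
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return (1.0 / (sigma * sqrt(2.0 * pi))) * exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Illustrative values: mu = 0, sigma = 1 (the standard normal density)
for x in (-2, -1, 0, 1, 2):
    print(x, round(normal_pdf(x, 0.0, 1.0), 4))
# The largest value (about 0.3989) occurs at x = mu; the curve is symmetric about mu.
```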


The normal distribution is symmetric about μ.

(figure: bell-shaped normal density curve y = f(x), symmetric about μ)

For any normal distribution it will be the case that, approximately:

•  68% of its values will fall within one standard deviation (σ) of μ.
•  95.5% of its values will fall within two standard deviations (2σ) of μ.
•  99.7% of its values will fall within three standard deviations (3σ) of μ.

Computing areas under a normal density function is difficult, but we can use a table showing probabilities associated with the standardised normal random variable (many calculators and Microsoft Excel are also able to calculate these probabilities).


The standard normal distribution has a mean of 0 and a variance (and standard deviation) of 1. A standard normal variable is often denoted Z. Thus

Z ~ N(0, 1)

Probabilities relating to X ~ N(μ, σ²) can be calculated by first calculating the standardised Z scores corresponding to the value(s) of X and then using the standard normal probability table. This is formalized by the following theorem.

Theorem 6.2 (The Standardizing Transformation of Non-Standard Normal Probabilities)
A random variable X is normally distributed with mean μ and variance σ² if and only if Z = (X − μ)/σ is a standard normal random variable, that is

X ~ N(μ, σ²)  if and only if  Z = (X − μ)/σ ~ N(0, 1)


Also note that a linear function of a normal variable is also normally distributed.

(Extract of Appendix 5)

AREAS UNDER THE STANDARD NORMAL DISTRIBUTION

The table below gives areas under the standard normal distribution between 0 and z.

(figure: standard normal curve with the area between 0 and z shaded)

  z      .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
 0.0    .0000   .0040   .0080   .0120   .0160   .0199   .0239   .0279   .0319   .0359
 0.1    .0398   .0438   .0478   .0517   .0557   .0596   .0636   .0675   .0714   .0754
 0.2    .0793   .0832   .0871   .0910   .0948   .0987   .1026   .1064   .1103   .1141
 0.3    .1179   .1217   .1255   .1293   .1331   .1368   .1406   .1443   .1480   .1517
 0.4    .1554   .1591   .1628   .1664   .1700   .1736   .1772   .1808   .1844   .1879
 0.5    .1915   .1950   .1985   .2019   .2054   .2088   .2123   .2157   .2190   .2224
 0.6    .2258   .2291   .2324   .2357   .2389   .2422   .2454   .2486   .2518   .2549
 0.7    .2580   .2612   .2642   .2673   .2704   .2734   .2764   .2794   .2823   .2852
 0.8    .2881   .2910   .2939   .2967   .2996   .3023   .3051   .3078   .3106   .3133
 0.9    .3159   .3186   .3212   .3238   .3264   .3289   .3315   .3340   .3365   .3389
 1.0    .3413   .3438   .3461   .3485   .3508   .3531   .3554   .3577   .3599   .3621
 1.1    .3643   .3665   .3686   .3708   .3729   .3749   .3770   .3790   .3810   .3830
 1.2    .3849   .3869   .3888   .3907   .3925   .3944   .3962   .3980   .3997   .4015
 1.3    .4032   .4049   .4066   .4082   .4099   .4115   .4131   .4147   .4162   .4177
 1.4    .4192   .4207   .4222   .4236   .4251   .4265   .4279   .4292   .4306   .4319
 1.5    .4332   .4345   .4357   .4370   .4382   .4394   .4406   .4418   .4429   .4441
 1.6    .4452   .4463   .4474   .4484   .4495   .4505   .4515   .4525   .4535   .4545
 1.7    .4554   .4564   .4573   .4582   .4591   .4599   .4608   .4616   .4625   .4633
 1.8    .4641   .4649   .4656   .4664   .4671   .4678   .4686   .4693   .4699   .4706
 1.9    .4713   .4719   .4726   .4732   .4738   .4744   .4750   .4756   .4761   .4767
 ...     ....    ....    ....    ....    ....    ....    ....    ....    ....    ....
 3.8    .4999   .4999   .4999   .4999   .4999   .4999   .4999   .4999   .4999   .4999
 3.9    .5000   .5000   .5000   .5000   .5000   .5000   .5000   .5000   .5000   .5000

Example:
If Z ~ N(0, 1), determine the following probabilities:

•  P(Z ≥ 0)  (0.5)
•  P(Z ≥ 0.5)  (0.3085)
•  P(−0.1 ≤ Z ≤ 0.9)  (0.3557)
•  P(Z ≥ 1.64)  (0.0505)
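The probabilities above can be checked without the table by writing the standard normal CDF in terms of the error function erf from Python's math module; a sketch (not part of the notes):

```python
# Sketch: standard normal probabilities via Phi(z) = 0.5 * (1 + erf(z / sqrt(2))).
from math import erf, sqrt

def phi(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

print(round(1 - phi(0), 4))            # P(Z >= 0)           = 0.5
print(round(1 - phi(0.5), 4))          # P(Z >= 0.5)         = 0.3085
print(round(phi(0.9) - phi(-0.1), 4))  # P(-0.1 <= Z <= 0.9) = 0.3558 (table: 0.3557)
print(round(1 - phi(1.64), 4))         # P(Z >= 1.64)        = 0.0505
```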


Example:
If X ~ N(12, 4), calculate P(X ≤ 6.26), P(7 ≤ X ≤ 13) and P(X ≥ 15.5). (0.0021, 0.6853, 0.0401)
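The standardizing transformation handles such probabilities directly; a sketch (not from the notes) that reproduces the three answers above, and would handle the fish-market example below in the same way. The helper normal_prob and its use of None for open-ended intervals are illustrative choices.

```python
# Sketch: P(a <= X <= b) for X ~ N(mu, sigma^2) via standardization Z = (X - mu) / sigma.
from math import erf, sqrt

def phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2); pass None for an open-ended bound."""
    lo = 0.0 if a is None else phi((a - mu) / sigma)
    hi = 1.0 if b is None else phi((b - mu) / sigma)
    return hi - lo

mu, sigma = 12, 2          # X ~ N(12, 4), so the standard deviation is 2
print(round(normal_prob(None, 6.26, mu, sigma), 4))   # P(X <= 6.26)    = 0.0021
print(round(normal_prob(7, 13, mu, sigma), 4))        # P(7 <= X <= 13) = 0.6853
print(round(normal_prob(15.5, None, mu, sigma), 4))   # P(X >= 15.5)    = 0.0401
```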


Example:
From several years' records, a fish market manager has determined that the weight of deep sea bream sold in the market (X) is approximately normally distributed with a mean of 420 grams and a standard deviation of 80 grams. Assuming this distribution will remain unchanged in the future, calculate the expected proportions of deep sea bream sold over the next year weighing

(a) Between 300 and 400 grams. (0.3345)

(b) Between 300 and 500 grams. (0.7745)

(c) More than 600 grams. (0.0122)


Example:
It is known that 60% of cars registered in a given town use unleaded petrol. A random sample of 200 cars is selected. Determine the probability that, of the cars in the sample:

•  130 use unleaded petrol. (0.021)
•  more than 130 use unleaded petrol. (0.0643)
•  less than 130 use unleaded petrol. (0.9147)
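The quoted answers appear to be based on the normal approximation to the binomial with a continuity correction (see the last MAIN POINT below); a sketch of that calculation, not part of the original notes:

```python
# Sketch: normal approximation (with continuity correction) to binomial probabilities.
# X ~ binomial(n = 200, p = 0.6); approximate X by Y ~ N(np, np(1 - p)).
from math import erf, sqrt

def phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 200, 0.6
mu = n * p                       # 120
sigma = sqrt(n * p * (1 - p))    # sqrt(48), about 6.93

# P(X = 130)  is approximated by P(129.5 <= Y <= 130.5)
print(round(phi((130.5 - mu) / sigma) - phi((129.5 - mu) / sigma), 4))  # 0.0203 (notes: 0.021)
# P(X > 130)  is approximated by P(Y >= 130.5)
print(round(1 - phi((130.5 - mu) / sigma), 4))                          # 0.0648 (notes: 0.0643)
# P(X < 130)  is approximated by P(Y <= 129.5)
print(round(phi((129.5 - mu) / sigma), 4))                              # 0.9148 (notes: 0.9147)
```

The small differences from the notes' answers arise because the notes round the z-scores to two decimals before using the standard normal table.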


MAIN POINTS

•  The graphical representation of a continuous random variable is the graph of its density function; this is the counterpart of the probability histogram for a discrete random variable.

•  The probability that a continuous random variable takes on a value in some range is given by an area under the density function.

•  The uniform distribution has a constant density function.

•  If X is normally distributed with a mean μ and variance σ², we can write this information as X ~ N(μ, σ²).

•  The standard normal random variable Z is such that Z ~ N(0, 1).

•  Areas under a normal density function can be calculated with reference to the ‘standard normal table’, making use of the symmetry of the distribution as needed.

•  The normal distribution can be used to approximate binomial probabilities provided np ≥ 5 and n(1 − p) ≥ 5 (with μ = np and σ² = np(1 − p)); the approximation can be improved by using a continuity correction.


    200052 INTRODUCTION TO ECONOMIC METHODS

    LECTURE - WEEK 6

Required Reading:
Ref. File 7: Introduction and Sections 7.1 to 7.4

    7. INTRODUCTION TO ESTIMATION

    7.1 Estimators and Their Properties

From now on we will mainly be concerned with ‘random samples of random variables’.

Definition (Random Sample of Size ‘n’ of a Random Variable)
Consider a set of random variables X1, X2, ....., Xn. This set of random variables is said to represent a random sample of size ‘n’ of the random variable X if

(i) X1, X2, ....., Xn are all statistically independent, and

(ii) X1, X2, ....., Xn each have the same probability distribution (or distribution function) as the random variable X.

We will mostly use an upper case italicized letter to denote a random variable, and a lower case non-italicized letter to denote an actual realization or value of the variable.


Definition (Sample Statistic)
Suppose the random variables X1, X2, ....., Xn are associated with a sample of size ‘n’ from a statistical population. Then any function of (or formula containing) X1, X2, ....., Xn that does not depend on any unknown parameter is called a sample statistic.

Definition (Estimator/Estimate of a Population Parameter)
Suppose the random variables X1, X2, ....., Xn are associated with a sample of size ‘n’ from a statistical population. Then a sample statistic involving X1, X2, ....., Xn that is used to estimate a parameter of the population or associated probability distribution is called an estimator of the parameter, and a realization of the sample statistic (an actual number) is called an estimate of the parameter.


Definition (Sample Mean and Variance of a Random Variable)
Suppose the random variables X1, X2, ....., Xn represent a random sample of size ‘n’ of the random variable X. The sample mean and variance of X are then defined as, respectively

Sample Mean of X:       X̄ = (1/n) Σ Xi               (sum over i = 1, ..., n)

Sample Variance of X:   S² = Σ (Xi − X̄)² / (n − 1)   (sum over i = 1, ..., n)
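A minimal sketch of these two formulas applied to an illustrative set of sample values (the numbers below are arbitrary):

```python
# Sketch: sample mean and sample variance (divisor n - 1) of a set of observations.
x = [12.0, 15.0, 11.0, 14.0, 13.0]     # arbitrary illustrative sample values
n = len(x)

x_bar = sum(x) / n
s_squared = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

print(x_bar)       # 13.0
print(s_squared)   # 2.5
```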

If an estimator is used to obtain a single value estimate of a parameter, this estimate is called a point estimate.

An interval estimate describes a range, or interval, of values in which the population parameter is believed to be. An interval estimate is normally centred around a point estimate.

Since estimators are functions of random variables, they will also be random variables whose values vary from sample to sample. The probability distribution of an estimator is called a sampling distribution.


Most statistical inference is based on a knowledge of the sampling distributions of estimators.

    Properties of Estimators

Definition (Unbiased Estimator)
Consider an estimator θ̂ of some population parameter θ. θ̂ is an unbiased estimator of θ if E(θ̂) = θ. If E(θ̂) ≠ θ, θ̂ is said to be a biased estimator of θ, with the value of the bias given by B(θ̂) = E(θ̂) − θ.

(θ is the lower case version of the Greek letter ‘theta’)


Definition (Relative Efficiency of an Estimator)
If θ̂1 and θ̂2 are both unbiased estimators of a population parameter θ with unequal variances, θ̂1 is said to be relatively more efficient than θ̂2 if

Var(θ̂1) < Var(θ̂2)

Definition (Consistency of an Estimator)
An estimator θ̂ of some population parameter θ is said to be a consistent estimator of θ if, as the (random) sample size increases, the probability increases of the estimator yielding an estimate in some arbitrary fixed interval, however small, centred round the true parameter value.

Theorem (Sufficient Condition for Consistency of an Estimator)
An estimator θ̂ of some population parameter θ is a consistent estimator of this parameter if

lim(n→∞) E(θ̂) = θ    and    lim(n→∞) Var(θ̂) = 0


    7.2 The Sampling Distribution of the Sample Mean

Example:
Suppose we know that in a large city 20% of households possess no car, 60% possess one car and 20% possess two cars. If we let X be the number of cars in a household we can write the probability distribution of X as

    x          0     1     2
    P(X = x)   1/5   3/5   1/5

Determine the sampling distribution of X̄ based on random samples of size 2.

    x1   x2   x̄      P((X1 = x1) ∩ (X2 = x2))
    0    0    0       1/5 × 1/5 = 1/25
    0    1    0.5     1/5 × 3/5 = 3/25
    0    2    1       1/5 × 1/5 = 1/25
    1    0    0.5     3/5 × 1/5 = 3/25
    1    1    1       3/5 × 3/5 = 9/25
    1    2    1.5     3/5 × 1/5 = 3/25
    2    0    1       1/5 × 1/5 = 1/25
    2    1    1.5     1/5 × 3/5 = 3/25
    2    2    2       1/5 × 1/5 = 1/25


    x̄           0      0.5    1      1.5    2
    P(X̄ = x̄)    1/25   6/25   11/25  6/25   1/25
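The sampling distribution above can be generated by enumerating all ordered samples of size 2, exactly as in the table; a sketch (not part of the notes):

```python
# Sketch: sampling distribution of the sample mean for samples of size 2
# from the car-ownership distribution P(X=0)=1/5, P(X=1)=3/5, P(X=2)=1/5.
from fractions import Fraction
from itertools import product

pmf = {0: Fraction(1, 5), 1: Fraction(3, 5), 2: Fraction(1, 5)}

dist_xbar = {}
for x1, x2 in product(pmf, repeat=2):          # all 9 ordered samples
    x_bar = Fraction(x1 + x2, 2)
    dist_xbar[x_bar] = dist_xbar.get(x_bar, Fraction(0)) + pmf[x1] * pmf[x2]

for x_bar in sorted(dist_xbar):
    print(x_bar, dist_xbar[x_bar])
# 0 -> 1/25, 1/2 -> 6/25, 1 -> 11/25, 3/2 -> 6/25, 2 -> 1/25
```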

Theorem (The Central Limit Theorem)
Consider a random sample X1, X2, ....., Xn of size ‘n’ of a random variable X with a finite mean E(X) = μ and a finite variance Var(X) = σ². Then:

(i) If X is (exactly) normally distributed, the sample mean X̄ will be exactly normally distributed with a mean μ and a variance σ²/n.

(ii) If X is not normally distributed, the sample mean X̄ will be approximately normally distributed with a mean μ and a variance σ²/n for large sample sizes. This approximation is generally considered to be valid when n ≥ 30.


Note: Var(X̄) decreases as ‘n’ increases and approaches zero in the limit. This, together with the fact that X̄ is unbiased, ensures that X̄ is a consistent estimator of μ.

Note: The standard deviation of an estimator is often called the standard error of the estimator, although often this term is used for an estimate of the standard deviation of an estimator.

Example:
A particular type of light bulb has a mean life of 6,000 hours and a standard deviation of bulb life of 400 hours. What percentage of random samples made up of 100 observations of bulb lives will yield mean bulb lives between 5,950 and 6,050 hours? (78.88%)
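A sketch of the light-bulb calculation (not from the notes): by the central limit theorem X̄ is approximately N(6000, 400²/100), so the standard error of X̄ is 40 and the relevant z-scores are ±1.25.

```python
# Sketch: P(5950 < X_bar < 6050) when X_bar ~ N(6000, 400^2 / 100) approximately.
from math import erf, sqrt

def phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma, n = 6000, 400, 100
se = sigma / sqrt(n)                                   # standard error of the mean = 40

prob = phi((6050 - mu) / se) - phi((5950 - mu) / se)   # z-scores are +1.25 and -1.25
print(round(100 * prob, 2), "percent")                 # 78.87 percent (78.88% in the notes,
                                                       # which use the table value 0.3944)
```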


    8. INTERVAL ESTIMATION

    8.1 Introduction and Terminology

A confidence interval not only comprises an interval of possible population parameter values, but also some measure of the degree of belief or confidence that the interval does indeed contain the parameter in question.

The level of confidence associated with a confidence interval is the probability that we will obtain a realization of the interval that contains the population parameter, i.e. before we actually take a sample. It is usually denoted (1 − α)100%, where α is the probability (0 ≤ α ≤ 1) of obtaining a realization of the interval that does not contain the population parameter.

Confidence intervals are constructed on the basis of knowledge of the sampling distribution of the estimator (or some function thereof) and a predetermined α.

The z_α Notation:

z_α is used to denote the value of the standard normal variable Z such that

P(Z > z_α) = α


Z = (X̄ − μ)/(σ/√n) ~ N(0, 1)

Therefore, for a given α:

P(−z_(α/2) ≤ Z ≤ z_(α/2)) = 1 − α

P(−z_(α/2) ≤ (X̄ − μ)/(σ/√n) ≤ z_(α/2)) = 1 − α

P(−z_(α/2)·σ/√n ≤ X̄ − μ ≤ z_(α/2)·σ/√n) = 1 − α

P(−X̄ − z_(α/2)·σ/√n ≤ −μ ≤ −X̄ + z_(α/2)·σ/√n) = 1 − α

P(X̄ − z_(α/2)·σ/√n ≤ μ ≤ X̄ + z_(α/2)·σ/√n) = 1 − α    (*)

Thus the (1 − α)100% confidence interval for an observed x̄ is given by

(x̄ − z_(α/2)·σ/√n ,  x̄ + z_(α/2)·σ/√n)    (**)
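A sketch (not from the notes) that turns the interval formula into code; the values x̄ = 50, σ = 10, n = 64 and the 95% level are arbitrary illustrative choices (z_0.025 ≈ 1.96).

```python
# Sketch: (1 - alpha)100% confidence interval for mu when sigma is known,
# using x_bar +/- z_(alpha/2) * sigma / sqrt(n).
from math import sqrt

def mean_ci(x_bar, sigma, n, z_half_alpha):
    half_width = z_half_alpha * sigma / sqrt(n)
    return (x_bar - half_width, x_bar + half_width)

# Illustrative numbers: x_bar = 50, sigma = 10, n = 64, 95% confidence (z_0.025 ~ 1.96)
lower, upper = mean_ci(50.0, 10.0, 64, 1.96)
print(round(lower, 2), round(upper, 2))   # 47.55 52.45
```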

     

     


    8.3 Properties of Confidence Intervals

The width of a confidence interval for the population mean, where we are justified in using the normal distribution, is given by

(x̄ + z_(α/2)·σ_X̄) − (x̄ − z_(α/2)·σ_X̄) = 2·z_(α/2)·σ_X̄ = 2·z_(α/2)·σ/√n

For a given confidence level and a given σ, the confidence interval width decreases with increasing ‘n’. This leads to a criterion for choosing ‘n’.

If we wish to use a calculated x̄ to estimate μ to within ‘D’ (units) with (1 − α)100% confidence, we should choose ‘n’ such that

n ≥ (z_(α/2)·σ / D)²

(assuming a normally distributed population or an n ≥ 30)


Example:
A clothing shop located in a busy shopping arcade is interested in estimating the mean age of people who frequent the arcade. The shop intends to use this information in determining the appropriate range of clothing it should stock in order to maximize sales. A sample of people is to be selected at random in the arcade and questioned by the shop manager about their age. What should the sample size be if the shop manager wishes to use a calculated x̄ to estimate the average age of people who frequent the arcade to within 1.5 years, with 95% confidence, assuming the population standard deviation is approximately 7.5? (97)


MAIN POINTS

•  A random sample of a random variable is such that the random variables representing the sample are independently and identically distributed.

•  An estimator of a population parameter is a formula containing the random variables representing sample values.

•  The probability distribution of an estimator is called a sampling distribution.

•  An unbiased estimator of a parameter has a mean equal to the parameter value.

•  A consistent estimator has a probability distribution that becomes ‘more concentrated’ around the true parameter value as n tends to infinity.

•  E(X̄) = μ and Var(X̄) = σ²/n (if the Xi's represent a random sample of the random variable X).

•  X̄ is an unbiased and consistent estimator of μ.

•  The central limit theorem says that even if we are sampling from a non-normal distribution, the distribution of the sample mean will be approximately normal provided the sample size is sufficiently large.