measures+of+variation

Upload: imranmughalmani

Post on 08-Apr-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/7/2019 Measures+of+Variation

    1/10

    Measures of Variation

    From our raw data, we were able to calculate a measure of central location. Although wefound five measures of central location, we shall, for the remainder of this course,concentrate only on the arithmetic meanHaving found the mean of our data set, we can now proceed to calculate a statistic thatmeasures how much are observed data varies around its mean.Suppose that we had two sections of students (X & Y) taking an exam graded out of ten.Observation X Y

    1 7 62 9 103 6 64 9 45 4 26 7 87 5 108 8 69 8 9

    10 7 9

    S 70 S 70

    X = SX/n = 7 Y = SY/n = 7

    The mean of both data sets is 7, yet closer inspection reveals that there is greater variationin data set Y than in data set X. For starters, the top score in X is 9, and the low score is4. By contrast the high and low scores in Y are 10 and 2 respectively.But this is hardly rigorous, what we need is a statistic that is calculated from as much ofour data as possible, not merely the high and low scores.

    The statistics of choice will be theVariance, the Standard Deviation and the Coefficientof Determination. But. before I start throwing equations around the shop, I need to sell you on the idea ofwhy we should be interested in a measure of variation what does it tell us?

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    2/10

    Consider first meteorology. The average July temperature in Buffalo NY is 69F theaverage July temperature in Seattle WA is also 69F. However, the average Januarytemperature in Buffalo is 25F, while Seattle its a balmy 41F.In finance, the riskyness of an asset is measured by the standard deviation (variation) of

    its price over a period of time (say, 250 days). Indeed, there is a relationship between anassets return and its riskyness, assets with low risk (such as US T-Bills) also have lowreturns, while assets with higher risk, have high returns (for example, the stock ofGoogle). Even in the universe of stocks, some are considered stable (General Electric,IBM, to name but two), while others are considered to be volatile, such as the stocks inthe bio-technology sector. For those of you who might be interested the translation of the

    word riskinto Mandarin Chinese is ?? , It means risk but opportunity, not just plain risk

    In sports, the idea of variation is pegged to consistency. For example in baseball, thecloser might not necessarily be the best pitcher on a team, but hes probably the most

    consistent. In golf, major tournaments are decided over four rounds. The winner is rarelythe golfer who scored the lowest round in the tournament, but is definitely the mostconsistent.Anyhow, the variation found in a data set is measured as following way.

    Variance =

    ( ) / 5 18. 203 1 0. 665 7 0 0 1 194 . 04 452 . 729 8 ( )

    1n

    XXni

    1i

    2

    i

    =

    =

    We subtract the mean from each observation and sum the squares. We then divide by thenumber of observation minus 1. The reason why we have to square the deviation from themean is because the simple sum of the mean deviations will be zero. The reason wedivide by n-1 rather than n, is a bit tricky but Ill deal with that later. Lets have lookat the X data from the previous page, where we found the mean to be 7.

    Observation X ( XX 2)XX(

    1 7 0 02 9 2 43 6 -1 1

    4 9 2 45 4 -3 96 7 0 07 5 -2 48 8 1 19 8 1 1

    10 7 0 0S = 70 S = 0 S = 24

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    3/10

    Therefore our variance = 24 9 = 2? or 2.6667

    We can repeat the process for our Ydata and will find that its variance is 7.1111, or sevenand one ninth.

    So, the variance of the Ydata is larger than the variance of the Xdata,which is what weexpected when we first looked at the data. However, what we now want is a way tointerpret 2.6667 and 7.1111 what do those numbers mean? The answer is notimmediately obvious, we had to square the deviations from the mean in order to ensurethat they did not sum to zero, but in doing so we inflated each deviation.The obvious thing to do would be to somehow undo the squaring, by taking the squareroot of the variance. This statistic is called the Standard Deviation.

    Standard Deviation =( ) / 5 18. 2188 0. 6654 0 0 1 251. 28 536. 3698

    2/1ni

    1i

    2

    1n

    XX

    =

    =

    So the standard deviation of X = ( 2.6667)1/2 = 1.6330 And the standard deviation of Y = (7.1111)1/2 = 2.6667

    Note: The fact that the standard deviation of Y happened to be the variance of X was

    purely accidental. The data sets are independent of each other.Now we can interpret the standard deviation.Standard Deviation: The standard deviation is the average deviation, of the

    individual observations, from their mean.

    For practical purpose, we are only really interested in the standard deviation, the fact thatwe have to calculate the variance first, is neither here or there.

    NotationThe variance of a sample is denoted S2The Standard deviation of a sample is denoted S

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    4/10

    S2 The sample variance is a statistic, it is our best estimate of the population

    variance, which denoted by s2(sigma squared). s2 is a parameter. S The sample standard deviation is a statistic, it is our best estimate of the

    population standard deviation, which denoted by s (sigma). s is a

    parameter.

    Recall that we referred to the mean of a data set as its first moment. The standarddeviation is called the second moment.

    Degrees of FreedomWe can now return to the thorny issue of why we divided by (n-1) rather than simply (n).The short answer is that we lostone degree of freedom, but I would venture to guess that

    this fact alone is not particularly helpful.Formally,Degrees of freedom are the number of independent pieces of information (ourobservations) that are available to estimate another piece of information. Moreconcretely, the number of degrees of freedom is the number of independent observationsin a sample of data that are available to estimate a parameter of the population fromwhich that sample is drawn.Example 1) If we have two observations, when calculating the mean we have two

    independent observations; however, when calculating the variance, wehave only one independentobservation, since the two observations areequally distant from the mean.

    Example 2) Suppose we have three observation (n = 3), if I tell you that the arithmetic

    mean of this data set is five (5), I have lost a degree of freedom. Otherwisestated, only two of the original three variables can actually vary, the thirdhas to be fixed it can no long vary.

    I have three volunteers: Tom, Dick and Harry who are free to chose any

    number they wish, but I tell them that the mean of their choices must be 5.Tom scratches his head and comes up with three (3).

    Dick choose nine (9)

    Now Harry, unlike Tom & Dick, cannot choose any number he wants; heis constrained by the fact that the mean is five (5). Therefore he must sayfour (4) because only for can give us a mean of five (3 + 9 + 4)/3 = 5.

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    5/10

    Thus, having calculate the mean, we no longer have (n) variables that can vary, we nowonly have (n-1), the last has to be fixed.Alternative Way to Calculate the Standard Deviation/Variance

    The formula given earlier on in this note has the advantage of being intuitive, we canimmediately see that we are summing squares of deviations from the mean.Pedagogically this is a desirable quality. Unfortunately, it is not computationallyefficient, in the sense that we can compute the standard deviation using fewer steps.Rather that simply give you the alternative formula, I will derive it for you. I do this notto aggravate you, or to show off, but I want to give you a sense of what MathematicalStatistics looks like. Since this is a course in Business Statistics, you are not required tolearn this, but I believe you will benefit from the exercise.We will start with the variance

    ( ) / 5 18. 203 1 0. 666 8 0 0 1 155 . 64 515 . 009 8 ( )

    1n

    XX

    S

    ni

    1i

    2

    2

    ==

    = (1)

    Since I am not going to play with the denominator, I will omit it for clarity and bring itback later. Again for clarity, Ill also lose the super & subscripts from the Sigma.

    (

    2XX (2)

    I expand the term in the bracket

    )X.X2XX( 22 + (3)Now, I will run the sigma operator through the equation. We treat S in exactly the sameway as we wood a constant (like a fixed number).

    + X.X2XX 22 (4)Now before this gets too unmanageable, why dont we concoct a little data setto help usunravel the three above terms.

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    6/10

    Suppose X = {4, 8, 6} SoSX = 18 n = 3 and, X= 6

    X 2X X X X2

    4 36 24 16

    8 36 48 64

    6 36 36 36

    SX = 18 SX 2 =108 SX X = 108 SX2 = 116

    So, we notice that a) SX 2 = SX X Think about why this has to be so.

    b) SX2

    = n.X2

    Returning to equation (4)

    + X.X2XX 22 (4)From a) above we get

    + 222 X.2XX (5)

    22 XX (6)From b) above we get

    22 X.nX (7)Replacing the denominator we get the variance

    Variance =1n

    X.nX 22

    And, taking the square root we recover the standard deviation

    Standard Deviation =

    2/122

    1n

    X.nX

    (8)

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    7/10

    We can now double check if this new equation is actually correct, with our original XData.Observation X X2

    1 7 492 9 813 6 364 9 815 4 166 7 497 5 258 8 649 8 64

    10 7 49

    S(X) = 70 SX2 = 514

    n = 10

    X = 7

    Standard Deviation =

    2/122

    1n

    X.nX

    =

    2/1

    9

    )49(10514

    = 1.6330 Yes!!!J

    I dont care which method you use as long as the answer is correct. Computers use theabove method because it is computationally more efficient than the mean deviationmethod.The coefficient of Variation

    Suppose we have a random variable X

    The coefficient of Variation is given by:X

    S.V.C =

    This quantity, which gives the standard deviation as a proportion of the mean, issometimes informative. For example, the value S = 10 has little meaning unless we can

    compare it to something else.. If S is observed to be 10 and X is observe to be 1,000, theamount of variation is small relative to the size of the mean. However, if S is observed to

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    8/10

    be 10 and X is observed to be 5, the variation is quite large relative to the size of themean.Example: In statistics, the term precision has a special meaning.

    Precision means variation in repeated measurement

    If we were interested in testing a measuring instrument, such as thosestupid plastic things nurses shove into ones ear to take your temperature.

    A Coefficient of variation of 10/1,000 = 0.01 might be acceptable.However, a coefficient of variation of 10/5 = 2 might be unacceptable.

    Example: We have two stocks: ABC Corp. and XYZ Corp. Which has the most risk.

    ABC has a standard deviation of $12 and an average price of $50

    XYZ has a standard deviation of $6 and an average price $24 .

    Coefficient of Variation for ABC Corp. is $12/$50 = 0.24

    Coefficient of Variation for XYZ Corp. is $6/$24 = 0.25

    Remember, in finance less risky is good, more risky is bad

    Thus, ABC Corp. is slightly less risk (but not by much).

    Summary

    1) Our main measure of variability is the sample Standard Deviation denotes S

    2) S is given by either

    ( ) / 5 18. 2188 0. 6654 0 0 1 266. 76 289. 40982/1

    ni

    1i

    2

    1n

    XX

    =

    = or,

    2/122

    1n

    X.nX

    3) S2is the sample variance given by

    ( ) / 5 18. 2031 0. 6665 0 0 1 336. 6 2

    1n

    XXni

    1i

    2

    =

    = or,1n

    X.nX22

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    9/10

    4) The Coefficient of Variation given byX

    S.V.C = allows us to compare the

    relative variability of two data sets.5) Degrees of freedom is the number of independent observations in a sample of data

    that are available to estimate a parameter of the population from which thatsample is drawn. For the purposes of this course, whenever a mean is calculatedwe lose one degree of freedom. Later on in the course we will be dealing with twotwo random variables and how they vary together (Covariance). Not surprisingly,if we calculate the mean of both variables for the sake of calculating theircovariance we lose two degrees of freedom.

    6) S and S2 are sample statistics, they are our best estimate of the population

    standard deviation and variance, s and s2 these are population parameters

    7) The coefficient of variation for a random variable X, is given byXS is also a

    sample statistic. It is our best estimate of the population coefficient of variation,

    which is a parameter given byX

    X

    , where X is the population mean of X

    (parameter), and X is the population standard deviation of X (also a

    parameter). There is no ancient Greek letter for the Coefficient of variation.

    Incidentally, these ancient Greek letter were not chosen at random.

    Is pronounced mu, chosen to represent the mean

    S Is the ancient Greek capital letter sigma, chosen to represent the sum

    s Is the ancient Greek lower case letter sigma, chosen to represent the Standard deviation.

    ? Is the ancient Greek capital letter pi , chosen to represent the product.

    p Is the ancient Greek lower case letter pi, which you know from gradeschool to represent the mathematical constants, approximately equal to3.14159. It represents the ratio of any circle's circumference to its diameterin Euclidean geometry.

    Never be afraid of notation, its like manners, its there to put you at ease, not to

    frighten you.

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/
  • 8/7/2019 Measures+of+Variation

    10/10

    JOIN KHALID AZIZ

    ICMAP STUDENTS

    DO NOT WASTE YOUR PRECIOUS TIME

    * STAGE1 FUNDAMENTALS OF FINANCIAL ACCOUNTING

    RS 2000 FOR COMPLETE SYLLABUS

    ECONOMICS RS 2000 FOR COMPLETE SYLLABUS

    *STAGE 2 COST ACCOUNTING RS 2500 FOR COMPLETE SYLLABUS

    *STAGE 3 FINANCIAL ACCOUNTING RS 3000 FOR COMPLETESYLLABUS

    COST ACCOUNTING APPRAISAL RS 3000 FOR COMPLETE

    SYLLABUS

    CONTACT:

    0322-3385752

    R-1173, ALNOOR SOCIETY, BLOCK 19, F.B.AREA,

    NEAR POWER HOUSE, KARACHI.

    ase purchase PDFcamp Printer on http://www.verypdf.com/ to remove this watermark.

    http://www.verypdf.com/http://www.verypdf.com/