Statistics for Management Assignment 1

Upload: rithesh-kc

Post on 06-Apr-2018


  • 8/3/2019 Statistics for Management Assignment 1

    1/14

    Master of Business Administration - MBA Semester 1

    Subject Code MB0040

    Subject Name STATISTICS FOR MANAGEMENT

    (Book ID: B1129)

    Assignment Set- 1

    Q 1. (a) Statistics is the backbone of decision-making. Comment.

    Ans:- Due to advanced communication network, rapid changes in consumer behavior, varied

    expectations of variety of consumers and new market openings, modern managers have a difficult

    task of making quick and appropriate decisions. Therefore, there is a need for them to depend more

    upon quantitative techniques like mathematical models, statistics, operations research and

    econometrics.

    As you can see, what the General Manager is doing here is using Statistics to solve a problem and to

    increase profits.

    Decision making is a key part of our day-to-day life. Even when we wish to purchase a television,

    we like to know the price, quality, durability, and maintainability of various brands and models

    before buying one. As you can see, in this scenario we are collecting data and making an optimum

    decision. In other words, we are using Statistics.

    Again, suppose a company wishes to introduce a new product. It has to collect data on market

    potential, consumer likings, availability of raw materials and feasibility of producing the product.

    Hence, data collection is the back-bone of any decision making process.

    Many organizations find themselves data-rich but poor in drawing information from it. Therefore, it

    is important to develop the ability to extract meaningful information from raw data to make better

    decisions. Statistics plays an important role in this aspect.

    Statistics is broadly divided into two main categories.

    Figure 1.1 illustrates the two categories. The two categories of Statistics are descriptive statistics

    and inferential statistics.


    Figure 1.1: Divisions in Statistics

    Descriptive Statistics: Descriptive statistics is used to present the general description of data which

    is summarized quantitatively. This is mostly useful in clinical research, when communicating the

    results of experiments.

    Inferential Statistics: Inferential statistics is used to make valid inferences from the data which are

    helpful in effective decision making for managers or professionals.

    Statistical methods such as estimation, prediction and hypothesis testing belong to inferential

    statistics. Researchers draw conclusions from the collected data samples regarding

    the characteristics of the larger population from which the samples are taken. So, we can say Statistics is

    the backbone of decision-making.

    Q.1. (b) Give plural meaning of the word Statistics?

    Ans:- Plural of Word Statistic:

    The word statistics is used as the plural of the word statistic, which refers to a numerical quantity like mean, median, variance etc., calculated from sample values.

    In the plural sense, the word statistics refers to numerical facts and figures collected in a systematic manner with a definite purpose in any field of study. In this sense, statistics are also aggregates of facts which are expressed in numerical form. For example, statistics on industrial production, statistics on population growth of a country in different years etc.


    For example, if we select 15 students from a class of 80 students, measure their heights and find the average height, this average would be a statistic.

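    Q 2. a. In a bivariate data on x and y, variance of x = 49, variance of y = 9 and covariance(x, y) = -17.5. Find coefficient of correlation between x and y.

    Ans:- We know that:

    $$r = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y} = \frac{-17.5}{\sqrt{49} \times \sqrt{9}} = \frac{-17.5}{7 \times 3} \approx -0.83$$

    Hence, there is a high negative correlation.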
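
    The arithmetic in this answer can be checked with a few lines of Python. This is only a sketch: the function name correlation_from_moments is made up for illustration, and it assumes the usual definition of the coefficient of correlation as the covariance divided by the product of the two standard deviations.

```python
import math


def correlation_from_moments(var_x: float, var_y: float, cov_xy: float) -> float:
    """Coefficient of correlation from two variances and a covariance."""
    # Standard deviations are the square roots of the variances.
    return cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))


# Values given in the question.
r = correlation_from_moments(var_x=49, var_y=9, cov_xy=-17.5)
print(round(r, 2))
```

    A value close to -1 indicates a strong inverse relationship between x and y.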

    Q 2. b. Enumerate the factors which should be kept in mind for proper planning?

    Ans:- Planning a Statistical Survey

    The relevance and accuracy of data obtained in a survey depend upon the care exercised in planning. A properly planned investigation can lead to the best results with least cost and time. Steps involved in the planning stage:

    Step-1:


    Nature of the problem to be investigated should be clearly defined in an

    unambiguous manner.

    Step-2:

    The objective of the investigation should be stated at the outset. The objective could be to:

    Obtain certain estimates.

    Establish a Theory.

    Verify an existing statement

    Find relationship between characteristics

    Step-3:

    The scope of the investigation has to be made clear. The scope of investigation refers to the area to be covered, identification of units to be studied, nature of characteristics to be observed, accuracy of measurement, analytical methods, time, cost and other resources required.

    Step-4:

    Whether to use data collected from primary or secondary source should be

    determined in advance.

    Step-5:

    The organization of investigation is the final step in the process. It encompasses the determination of the number of investigators required, their training, the supervision work needed and the funds required.

    Q 3. The percentage sugar content of Tobacco in two samples was represented in table 11.11. Test whether their population variances are same.

    Table 1. Percentage sugar content of Tobacco in two samples

    Sample A    2.4    2.7    2.6    2.1    2.5
    Sample B    2.7    3      2.8    3.1    2.2    3.6

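    Ans:- Required values for sample A (method I), with deviations D taken from the assumed mean 2.5:

    X        D = X - 2.5    D^2
    2.4      -0.1           0.01
    2.7       0.2           0.04
    2.6       0.1           0.01
    2.1      -0.4           0.16
    2.5       0.0           0.00
    TOTAL    -0.2           0.22

    Required values for sample B (method II), with deviations D taken from the assumed mean 3:

    X        D = X - 3      D^2
    2.7      -0.3           0.09
    3.0       0.0           0.00
    2.8      -0.2           0.04
    3.1       0.1           0.01
    2.2      -0.8           0.64
    3.6       0.6           0.36
    TOTAL    -0.6           1.14

    $$S_1^2 = \frac{1}{n_1 - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_1}\right] = \frac{1}{4}\left[0.22 - \frac{0.04}{5}\right] = 0.053$$

    $$S_2^2 = \frac{1}{n_2 - 1}\left[\sum d^2 - \frac{(\sum d)^2}{n_2}\right] = \frac{1}{5}\left[1.14 - \frac{0.36}{6}\right] = 0.216$$

    $$F = \frac{S_2^2}{S_1^2} = \frac{0.216}{0.053} \approx 4.08$$

    With (5, 4) degrees of freedom the tabulated value of F at the 5% level is 6.26. Since the calculated F is smaller than the tabulated value, the difference between the two variances is not significant.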
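
    The variance-ratio (F) test for these two samples can be sketched in Python as a cross-check of the hand calculation. Two things here are assumptions rather than part of the answer: the larger sample variance is placed in the numerator, and the critical value 6.26 is the upper 5% point of F with (5, 4) degrees of freedom as read from a standard F table.

```python
from statistics import variance

# Percentage sugar content from the question.
sample_a = [2.4, 2.7, 2.6, 2.1, 2.5]
sample_b = [2.7, 3.0, 2.8, 3.1, 2.2, 3.6]

# statistics.variance divides by n - 1, matching the S^2 formula.
var_a = variance(sample_a)
var_b = variance(sample_b)

# Put the larger variance in the numerator so that F >= 1.
if var_a >= var_b:
    f_ratio, df_num, df_den = var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
else:
    f_ratio, df_num, df_den = var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Assumption: upper 5% point of F(5, 4), copied from a standard F table.
F_CRITICAL_5_4 = 6.26

print(f"S1^2 = {var_a:.3f}, S2^2 = {var_b:.3f}")
print(f"F = {f_ratio:.2f} on ({df_num}, {df_den}) degrees of freedom")
print("significant" if f_ratio > F_CRITICAL_5_4 else "not significant")
```

    Working from the raw observations avoids the sign and squaring slips that are easy to make with the assumed-mean shortcut.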

    Q 4. a. Explain the characteristics of business forecasting?

    Ans:- Characteristics of business forecasting

    Based on past and present conditions

    Business forecasting is based on the past and present economic condition of the business. To forecast the future, various data, information and facts concerning the past and present economic condition of the business are analysed.

    Based on mathematical and statistical methods

    The process of forecasting includes the use of statistical and mathematical methods. By using these

    methods, the actual trend which may take place in future can be forecasted.

    Period

    The forecasting can be made for long term, short term, medium term or any specific period.

    Estimation of future

    Business forecasting estimates the probable economic conditions of the future.

    Scope

    The forecasting can be physical as well as financial.

    Q 4. b. Differentiate between prediction, projection and forecasting.

    Ans:- Prediction, projection and forecasting


    A great amount of confusion seems to have grown up in the use of the words forecast, prediction and

    projection.

    Forecasts are made by estimating future values of the external factors by means of prediction,

    projection or forecast and from these values calculating the estimate of the dependent variable.

    Q 5. What are the components of time series? Bring out the significance of moving average in

    analyzing a time series and point out its limitations.

    Ans:- Components of Time Series

    The behavior of a time series over periods of time is called the movement of the time series. The

    time series is classified into the following four components:

    i) Long term trend or secular trend

    ii) Seasonal variations

    iii) Cyclic variations

    iv) Random variations

    Method of moving averages

    Moving averages method is used for smoothing the time series. That is, it smoothes the fluctuations

    of the data by the method of moving averages.

    When the period of the moving average is odd, the procedure for determining the trend by this method is described in the figure below.

    [Figure: Procedure for determining the trend when moving average is odd]

    By plotting these trend values (if desired) you can obtain the trend curve with the help of which you

    can determine the trend whether it is increasing or decreasing. If needed, you can also compute

    short-term fluctuations by subtracting the trend values from the actual values.
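
    Since the procedure figure is not reproduced in this transcript, the following Python sketch shows one common way to carry out an odd-period moving average. It assumes the textbook convention that each average is written against the middle observation of its window. The function name and the yearly figures are hypothetical and serve only to show the mechanics.

```python
def centred_moving_average(values, period):
    """Trend values for an odd period, aligned with the middle observation.

    Positions without a full window (the first and last period // 2
    observations) are returned as None.
    """
    if period < 1 or period % 2 == 0:
        raise ValueError("this sketch handles odd periods only")
    half = period // 2
    trend = [None] * len(values)
    for mid in range(half, len(values) - half):
        window = values[mid - half : mid + half + 1]
        trend[mid] = sum(window) / period
    return trend


# Hypothetical yearly figures, used only to show the mechanics.
sales = [21, 22, 23, 25, 24, 22, 25, 26, 27, 26]
trend = centred_moving_average(sales, period=3)

# Short-term fluctuation = actual value minus trend value.
fluctuations = [
    None if t is None else actual - t for actual, t in zip(sales, trend)
]
print(trend)
print(fluctuations)
```

    The None entries at both ends illustrate one limitation of the method: no trend value is obtained for the first and last few periods.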

    When period of moving averages is even

    When the period of the moving average is even (such as 4 years), we compute the moving averages by using the steps described in the figure below.

    [Figure: Procedure for determining the trend when moving average is even]
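
    For an even period, a plain moving average falls between two time points, so a second averaging step is commonly used to centre it. The Python sketch below assumes that two-step convention (average each window, then average adjacent pairs). The function name and the yearly figures are hypothetical.

```python
def centred_even_moving_average(values, period):
    """Centred trend values for an even period such as 4 years."""
    if period < 2 or period % 2 != 0:
        raise ValueError("this sketch handles even periods only")
    # Step 1: plain moving averages; each one falls between two time points.
    averages = [
        sum(values[i : i + period]) / period
        for i in range(len(values) - period + 1)
    ]
    # Step 2: average adjacent pairs so each result lines up with a time point.
    centred = [(a + b) / 2 for a, b in zip(averages, averages[1:])]
    half = period // 2
    trend = [None] * len(values)
    for offset, value in enumerate(centred):
        trend[offset + half] = value
    return trend


# Hypothetical yearly figures, used only to show the mechanics.
sales = [21, 22, 23, 25, 24, 22, 25, 26, 27, 26]
even_trend = centred_even_moving_average(sales, period=4)
print(even_trend)
```

    With a 4-year period the first two and last two years receive no trend value.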

    Q 6. List down various measures of central tendency and explain the difference between them?

    Ans:- Measures of Central Tendency

    Several different measures of central tendency are defined below.

    1 Arithmetic Mean

    The arithmetic mean is the most common measure of central tendency. It is simply the sum of the numbers divided by the number of numbers. The symbol μ is used for the mean of a population. The symbol M is used for the mean of a sample. The formula is shown below:

    $$M = \frac{\sum X}{N}$$

    where $\sum X$ is the sum of all the numbers in the sample and N is the number of numbers in the sample. As an example, the mean of the numbers 1, 2, 3, 6, 8 is (1 + 2 + 3 + 6 + 8)/5 = 20/5 = 4, regardless of whether the numbers constitute the entire population or just a sample from the population.
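
    As a quick check of the worked example, the mean can be computed directly in Python. The function name arithmetic_mean is chosen here for illustration; the standard library function statistics.mean would give the same result.

```python
def arithmetic_mean(numbers):
    """Sum of the numbers divided by the number of numbers."""
    if not numbers:
        raise ValueError("the mean of an empty list is undefined")
    return sum(numbers) / len(numbers)


sample = [1, 2, 3, 6, 8]
m = arithmetic_mean(sample)
print(m)
```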

    The table, Number of touchdown passes (Table 1: Number of touchdown passes),

    shows the number of touchdown (TD) passes thrown by each of the 31 teams in

    the National Football League in the 2000 season.

    The mean number of touchdown passes thrown is 20.4516 as shown below.

    [Table 1: Number of touchdown passes]

    Although the arithmetic mean is not the only "mean" (there is also a geometric mean), it is by far the most commonly used. Therefore, if the term "mean" is used without specifying whether it is the arithmetic mean, the geometric mean, or some other mean, it is assumed to refer to the arithmetic mean.

    2 Median

    The median is also a frequently used measure of central tendency. The median is the midpoint of a distribution: the same number of scores is above the median as below it. For the data in the table, Number of touchdown passes (Table 1: Number of touchdown passes), there are 31 scores. The 16th highest score (which equals 20) is the median because there are 15 scores below the 16th score and 15 scores above the 16th score. The median can also be thought of as the 50th percentile. Let's return to the made-up example of the quiz on which you made a three, discussed previously in the module Introduction to Central Tendency and shown in Table 2: Three possible datasets for the 5-point make-up quiz.

    [Table 2: Three possible datasets for the 5-point make-up quiz]

    For Dataset 1, the median is three, the same as your score. For Dataset 2, the median is 4. Therefore, your score is below the median. This means you are in the lower half of the class. Finally, for Dataset 3, the median is 2. For this data set, your score is above the median and therefore in the upper half of the distribution.

    Computation of the Median: When there is an odd number of numbers, the median is simply

    the middle number. For example, the median of 2, 4, and 7 is 4. When there is an even

    number of numbers, the median is the mean of the two middle numbers. Thus, the

    median of the numbers 2, 4, 7, 12 is (4+7)/2 = 5.5.
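
    The two cases of the computation rule (odd and even number of values) can be sketched in Python as follows. The function is written out for illustration; statistics.median in the standard library follows the same rule.

```python
def median(numbers):
    """Middle value for an odd count, mean of the two middle values otherwise."""
    ordered = sorted(numbers)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty list is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([2, 4, 7]))      # odd number of values
print(median([2, 4, 7, 12]))  # even number of values
```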

    3 Mode

    The mode is the most frequently occurring value. For the data in the table, Number of touchdown passes (Table 1: Number of touchdown passes), the mode is 18 since more teams (4) had 18 touchdown passes than any other number of touchdown passes. With continuous data such as response time measured to many decimals, the frequency of each value is one since no two scores will be exactly the same (see discussion of continuous variables). Therefore the mode of continuous data is normally computed from a grouped frequency distribution. The Grouped frequency distribution (Table 3: Grouped frequency distribution) table shows a grouped frequency distribution for the target response time data. Since the interval with the highest frequency is 600-700, the mode is the middle of that interval (650).

    [Table 3: Grouped frequency distribution]
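
    The rule for a grouped mode (take the midpoint of the interval with the highest frequency) can be sketched in Python. The frequencies below are hypothetical, because the original table is not reproduced in this transcript; only the fact that 600-700 is the busiest interval comes from the text.

```python
# Hypothetical grouped frequencies: (lower limit, upper limit) -> frequency.
# Only the position of the peak (600-700) is taken from the text.
grouped = {
    (500, 600): 3,
    (600, 700): 6,
    (700, 800): 5,
    (800, 900): 4,
    (900, 1000): 2,
}


def grouped_mode(frequencies):
    """Midpoint of the class interval with the highest frequency."""
    (lower, upper), _ = max(frequencies.items(), key=lambda item: item[1])
    return (lower + upper) / 2


print(grouped_mode(grouped))
```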


    Proportions and Percentages

    When the focus is on the degree to which a population possesses a particular attribute, the measure of interest is a percentage or a proportion.

    A proportion refers to the fraction of the total that possesses a certain attribute. For example, we might ask what proportion of women in our sample weigh less than 135 pounds. Since 3 women weigh less than 135 pounds, the proportion would be 3/5 or 0.60.

    A percentage is another way of expressing a proportion. A percentage is equal to the proportion times 100. In our example of the five women, the percent of the total who weigh less than 135 pounds would be 100 * (3/5) or 60 percent.
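
    A small Python sketch of the same calculation is given below. The five weights are hypothetical, since the sample itself is not listed in the text; they are chosen only so that three of the five fall below 135 pounds.

```python
# Hypothetical weights in pounds; three of the five are below 135.
weights = [120, 128, 131, 140, 152]

# Count the sample members that possess the attribute of interest.
count_below = sum(1 for w in weights if w < 135)

proportion = count_below / len(weights)        # fraction of the total
percentage = 100 * count_below / len(weights)  # proportion times 100

print(proportion, percentage)
```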

    Notation

    Of the various measures, the mean and the proportion are most important. The notation used to describe these measures appears below:

    μ: Refers to a population mean.

    x̄: Refers to a sample mean.

    P: The proportion of elements in the population that has a particular attribute.

    p: The proportion of elements in the sample that has a particular attribute.

    Q: The proportion of elements in the population that does not have a specified attribute. Note that Q = 1 - P.

    q: The proportion of elements in the sample that does not have a specified attribute. Note that q = 1 - p.


    Q 6 b. What is a confidence interval, and why is it useful? What is a confidence level?

    Ans:- Confidence Intervals

    In statistics, a confidence interval (CI) is a particular kind of estimate of a parameter and is used to indicate the reliability of an estimate. It is an observed interval (i.e. it is calculated from the observations), in principle different from sample to sample, that frequently includes the parameter of interest if the experiment is repeated. How frequently the observed interval contains the parameter is determined by the confidence level or confidence coefficient.

    A confidence interval with a particular confidence level is intended to give the assurance that, if the statistical model is correct, then taken over all the data that might have been obtained, the procedure for constructing the interval would deliver a confidence interval that included the true value of the parameter the proportion of the time set by the confidence level. More specifically, the meaning of the term "confidence level" is that, if confidence intervals are constructed across many separate data analyses of repeated (and possibly different) experiments, the proportion of such intervals that contain the true value of the parameter will approximately match the confidence level; this is guaranteed by the reasoning underlying the construction of confidence intervals.
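
    The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original answer: the population is taken to be normal with a known standard deviation, the interval is the usual z-interval for a mean, and the population mean, standard deviation, sample size and number of trials are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist


def coverage(true_mean, sigma, n, level, trials, seed=0):
    """Share of simulated intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Two-sided critical value of the standard normal, e.g. about 1.96 for 95%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        centre = sum(sample) / n
        if centre - half_width <= true_mean <= centre + half_width:
            hits += 1
    return hits / trials


# Assumed illustrative settings: mean 50, sigma 10, samples of 25.
rate = coverage(true_mean=50.0, sigma=10.0, n=25, level=0.95, trials=2000)
print(f"{rate:.3f} of the intervals contain the true mean")
```

    The printed share should lie close to 0.95, which is what the confidence level promises about the procedure rather than about any single interval.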

    A confidence interval does not predict that the true value of the parameter has a particular probability of being in the confidence interval given the data actually obtained. (An interval intended to have such a property, called a credible interval, can be estimated using Bayesian methods; but such methods bring with them their own distinct strengths and weaknesses).

    The confidence level sets the boundaries of a confidence interval; this is conventionally set at 95% to coincide with the 5% convention of statistical significance in hypothesis testing. In some studies wider (e.g. 99%) or narrower (e.g. 90%) confidence intervals will be required. This rather depends upon the nature of your study. You should consult a statistician before using CI's other than 95%.

    You will hear the terms confidence interval and confidence limit used. The confidence interval is the range Q-X to Q+Y where Q is the value that is central to the study question, Q-X is the lower confidence limit and Q+Y is the upper confidence limit.


    Familiarize yourself with alternative CI interpretations:

    Common

    A 95% CI is the interval that you are 95% certain contains the true population value as it might be estimated from

    a much larger study. The value in question can be a mean, difference between two means, a proportion etc. The CI

    is usually, but not necessarily, symmetrical about this value.

    Pure Bayesian

    The Bayesian concept of a credible interval is sometimes put forward as a more practical concept than the confidence interval. For a 95% credible interval, the value of interest (e.g. size of treatment effect) lies with a 95% probability in the interval. This interval is then open to subjective molding of interpretation. Furthermore, the credible interval can only correspond exactly to the confidence interval if the prior probability is so called "uninformative".

    Pure frequentist

    Most pure frequentists say that it is not possible to make probability statements, such as the common CI interpretation, about the study values of interest in hypothesis tests.

    Neymanian

    A 95% CI is the interval which will contain the true value on 95% of occasions if a study were repeated many times using samples from the same population. Neyman originated the concept of CI as follows: If we test a large number of different null hypotheses at one critical level, say 5%, then we can collect all of the non-rejected null hypotheses into one set. This set usually forms a continuous interval that can be derived mathematically, and Neyman described the limits of this set as confidence limits that bound a confidence interval. If the critical level (probability of incorrectly rejecting the null hypothesis) is 5% then the interval is 95%. Any values of the treatment effect that lie outside the confidence interval are regarded as "unreasonable" in terms of hypothesis testing at the critical level.