summary of stats

Upload: nabhonil-basu-choudhury

Post on 02-Apr-2018

  • 7/27/2019 summary of stats

    1/35

    CORRELATION

    A statistical technique used to analyse the strength and direction of the relationship between two quantitative variables is called correlation analysis.

    Two variables are said to be correlated if a change in one variable is accompanied by a change in the other.

    E.g.: 1) Frequency of smoking and lung damage, 2) Sales revenue and expenses incurred on advertising.

    Importance of correlation

    If variables are linearly related to each other, then correlation helps in estimating one from the other, e.g.:

    Advertisement and sales

    Prices and demand

    We use regression analysis to find the value of one variable from the other.

    TYPES OF CORRELATION

    POSITIVE AND NEGATIVE

    LINEAR AND NON-LINEAR

    SIMPLE, PARTIAL AND MULTIPLE

    POSITIVE CORRELATION AND NEGATIVE CORRELATION

    POSITIVE CORRELATION

    If the variables vary in the same direction, correlation is said to be POSITIVE. If one variable increases, the other also increases; if one variable decreases, the other also decreases.

    NEGATIVE CORRELATION

    If the variables vary in opposite directions, correlation is said to be NEGATIVE. If one variable increases, the other decreases; if one decreases, the other increases.

    LINEAR CORRELATION AND NON-LINEAR CORRELATION

    LINEAR CORRELATION

    If the change in one variable tends to bear a constant ratio to the change in the other variable, then the correlation is said to be LINEAR.

    NON-LINEAR CORRELATION

    If the change in one variable does not bear a consistent ratio to the change in the other variable, then the correlation is said to be NON-LINEAR.
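The constant-ratio idea can be illustrated with a small Python sketch. The data and the helper names `change_ratios` and `is_linear` are hypothetical, chosen only for illustration; real data only tends towards a constant ratio, so the exact equality check below is an idealisation.

```python
from fractions import Fraction

def change_ratios(xs, ys):
    """Return (change in y) / (change in x) for each consecutive pair of points.

    Assumes consecutive x values differ, so no ratio divides by zero.
    """
    return [
        Fraction(ys[i + 1] - ys[i], xs[i + 1] - xs[i])
        for i in range(len(xs) - 1)
    ]

def is_linear(xs, ys):
    """True when every consecutive change ratio is identical."""
    return len(set(change_ratios(xs, ys))) <= 1

# Hypothetical illustrative data
xs = [1, 2, 3, 4, 5]
linear_ys = [3, 5, 7, 9, 11]      # y = 2x + 1: the ratio is the same at every step
nonlinear_ys = [1, 4, 9, 16, 25]  # y = x^2: the ratio changes at every step

print(change_ratios(xs, linear_ys))
print(change_ratios(xs, nonlinear_ys))
print(is_linear(xs, linear_ys), is_linear(xs, nonlinear_ys))
```

Exact fractions are used instead of floats so that the equality test is not disturbed by rounding.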

    SIMPLE, PARTIAL AND MULTIPLE CORRELATION

    When only two variables are involved, it is simple correlation.

    When three or more variables are involved, we can compute either partial or multiple correlation.

    Methods of correlation

    Graphic:

    Scatter diagram

    Algebraic:

    1. Karl Pearson's coefficient of correlation

    2. Rank method

    Scatter Diagram

    A scatter diagram is a graph or chart which helps to determine whether there is a relationship between two variables by examining the graph of the observed data.

    A scatter diagram can give us two types of information:

    Patterns that indicate whether the variables are related.

    If the variables are related, what kind of line, or estimating equation, describes this relationship.

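    KARL PEARSON'S COEFFICIENT OF CORRELATION

    Karl Pearson's coefficient of correlation is denoted by r. The coefficient of correlation r measures the degree of linear relationship between two variables, say x and y.

    \[
    r = \frac{N \sum dx\,dy - \sum dx \sum dy}{\sqrt{N \sum dx^2 - \left(\sum dx\right)^2}\,\sqrt{N \sum dy^2 - \left(\sum dy\right)^2}}
    \]

    Here N is the number of paired observations, and dx and dy are the deviations of x and y from their respective (assumed) means.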
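A minimal Python sketch of this computation follows, assuming dx and dy are deviations from assumed means. The function name `pearson_r`, its parameters and the advertising/sales figures are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys, assumed_x=0.0, assumed_y=0.0):
    """Pearson's r from deviations dx, dy taken about assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = sqrt(n * sum_dx2 - sum_dx ** 2) * sqrt(n * sum_dy2 - sum_dy ** 2)
    return numerator / denominator

# Hypothetical data: advertising expense (x) and sales revenue (y)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_raw = pearson_r(xs, ys)                                  # deviations about 0
r_shifted = pearson_r(xs, ys, assumed_x=3, assumed_y=4)    # deviations about assumed means
print(round(r_raw, 4), round(r_shifted, 4))
```

Both calls give the same r, because the formula does not depend on which assumed means are chosen.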

    Interpretation of Correlation Coefficient (r)

    The value of the correlation coefficient r ranges from -1 to +1.

    If r = +1, then the correlation between the two variables is said to be perfect and positive.

    If r = -1, then the correlation between the two variables is said to be perfect and negative.

    If r = 0, then there exists no linear correlation between the variables.

    REGRESSION

    The statistical technique that expresses the relationship between two or more variables in the form of an equation, so as to estimate the value of one variable from the given value of another, is called regression analysis.

    E.g.: estimating a firm's profit from its sales.

    Difference between dependent variable and independent variable

    Independent Variable

    1. The known variable is called the independent variable.

    2. What we typically call X.

    3. Variable that is controlled or manipulated.

    4. It is plotted on the horizontal axis.

    5. An input variable.

    Dependent Variable

    1. The variable we are trying to predict is the dependent variable.

    2. What we typically call Y.

    3. Variable that cannot be controlled or manipulated.

    4. It is plotted on the vertical axis.

    5. An output variable.

    Difference between Regression and Correlation

    Regression

    A statistical method used to describe the nature of the relationship.

    In linear regression analysis, one variable is considered the dependent variable and the other the independent variable.

    Correlation

    A statistical method used to determine whether a relationship between two or more variables exists.

    In correlation analysis, we examine the degree of association between two variables.

    Advantages of Regression Analysis

    It helps in developing a regression equation by which the value of a dependent variable can be estimated given a value of an independent variable.

    It helps to determine the standard error of estimate, which measures the variability or spread of values of a dependent variable with respect to the regression line.

    Estimation using the Regression Line

    The equation for a straight line where the dependent variable Y is determined by the independent variable X is:

    Y = a + bX

    where

    a = Y-intercept

    b = slope of the line

    Y = value of dependent variable

    X = value of independent variable

    THE METHOD OF LEAST SQUARES

    It is a method of fitting a line that minimizes the sum of the squared errors between the estimated points on the line and the actual points that were used to draw it.

    In this method, Y represents the individual values of the observed points measured along the Y-axis, and \(\hat{Y}\) (Y-hat) symbolizes the individual values of the estimated points. The estimated line is:

    \[
    \hat{Y} = a + bX
    \]
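The slides do not state how a and b are obtained. A minimal sketch using the standard closed-form least-squares formulas, b = (NΣXY - ΣXΣY) / (NΣX² - (ΣX)²) and a = mean(Y) - b·mean(X), is given below. The function names and the data are hypothetical, chosen only for illustration.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) of the line Y-hat = a + b*X that minimizes squared error."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * sum_x / n                                  # intercept
    return a, b

def predict(a, b, x):
    """Estimate Y for a given X from the fitted line."""
    return a + b * x

# Hypothetical data: X = advertising expense, Y = sales revenue
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_least_squares(xs, ys)
estimate = predict(a, b, 6)  # estimated Y for a new X value
print(a, b, estimate)
```

The fit requires at least two distinct X values; otherwise the slope's denominator is zero.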

    COEFFICIENT OF DETERMINATION

    A convenient way of interpreting the value of the correlation coefficient is to use the square of the coefficient of correlation, which is called the coefficient of determination.

    The coefficient of determination is \(r^2\):

    \[
    r^2 = 1 - \frac{\sum (Y - \hat{Y})^2}{\sum (Y - \bar{Y})^2}
    \]
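As a rough illustration, the formula can be evaluated directly in Python. The data are hypothetical, and the intercept 2.2 and slope 0.6 are the least-squares values for this particular data set; the function name is likewise an assumption made for the example.

```python
def coefficient_of_determination(ys, y_hats):
    """r^2 = 1 - sum((Y - Y_hat)^2) / sum((Y - Y_bar)^2)."""
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total

# Hypothetical data with its least-squares line Y_hat = 2.2 + 0.6 * X
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
y_hats = [a + b * x for x in xs]

r_squared = coefficient_of_determination(ys, y_hats)
print(round(r_squared, 4))
```

For a least-squares line the result equals the square of Pearson's r for the same data, which is what lets r² be read as the share of variation in Y explained by the line.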

    STANDARD ERROR OF ESTIMATE

    The standard error of estimate measures the variability of the scatter of the observed values around the regression line.

    It is given by:

    \[
    S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n - 2}}
    \]

    If \(S_e = 0\), the estimating equation is expected to be a perfect estimator of the dependent variable.
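A short Python sketch of the standard error of estimate, under the same assumptions as before: the data are hypothetical, the line Y-hat = 2.2 + 0.6X is the least-squares fit for that data, and the function name is chosen only for illustration.

```python
from math import sqrt

def standard_error_of_estimate(ys, y_hats):
    """S_e = sqrt(sum((Y - Y_hat)^2) / (n - 2)); needs more than two observations."""
    n = len(ys)
    squared_errors = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))
    return sqrt(squared_errors / (n - 2))

# Hypothetical data with its least-squares line Y_hat = 2.2 + 0.6 * X
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
y_hats = [2.2 + 0.6 * x for x in xs]

s_e = standard_error_of_estimate(ys, y_hats)

# When every observed point lies on the line, S_e is exactly 0
perfect_ys = [2 * x + 1 for x in xs]
s_e_perfect = standard_error_of_estimate(perfect_ys, perfect_ys)
print(round(s_e, 4), s_e_perfect)
```

The second call shows the limiting case described above: with no scatter around the line, the standard error is zero.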

    WHAT DOES TIME-SERIES MEAN?

    A time series is a sequence of data points, typically measured at successive points in time spaced at uniform intervals.

    A time series is a set of measurements of a variable that are ordered through time.

    Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data.

    DIFFERENCE WITH REGRESSION ANALYSIS

    Time Series Analysis

    Time series forecasting is the use of a model to predict future values based on previously observed values.

    It shows or suggests periodicity in the data, such as seasonal and cyclical effects.

    Regression Analysis

    Regression analysis is often employed to test theories that the current value of one time series affects the current value of another time series.

    Regression analysis cannot explain seasonal and cyclical effects.

    COMPONENTS OF TIME SERIES

    SECULAR TREND

    CYCLICAL VARIATIONS

    SEASONAL VARIATIONS

    IRREGULAR VARIATIONS

    [Figure: Upward trend of sales of laptops in Ranchi, 2000 to 2007. Horizontal axis: years; vertical axis: units (2000 to 10000).]

    [Figure: Declining trend of using landline phones in India, 2000 to 2011. Horizontal axis: years; vertical axis: units in '000 (30 to 180).]

    CYCLICAL VARIATION

    Cyclical variations are long-term movements that represent consistently recurring rises and declines in activity.

    Timing is the most important factor affecting cyclical variations.

    For example: the business cycle, which consists of the recurrence of the up and down movements of business activity.

    [Figure: Cyclical variation (business cycle). Horizontal axis: time; vertical axis: economic activities; phases labelled prosperity or boom, and depression.]

    SEASONAL VARIATION

    Seasonal variations are those periodic movements in business activity which occur regularly every year.

    Since these variations repeat over a period of twelve months, they can be predicted fairly accurately.

    Seasonal variations are caused by climate and weather conditions, customs, festivals and habits.

    For example: sales of cold drinks are higher in the summer season than in any other season.

    [Figure: Sales of cold drinks, 2000 to 2006. Horizontal axis: years; vertical axis: units (10000 to 20000).]

    IRREGULAR VARIATION

    Irregular variations refer to variations in business activity which do not repeat in a definite pattern.

    In this type of variation, the pattern of the variable is unpredictable.

    Irregular variations are caused by unpredictable factors such as natural disasters (earthquakes, floods), wars, etc. No one has control over them.

    For example: production of cars went down sharply after the earthquake in Japan in March 2011.

    [Figure: Production of cars in Japan, 2005 to 2011. Horizontal axis: years; vertical axis: units (100000 to 350000).]

    Thank you