welcome [stjohnscollege.edu.in] · presenting numeric data through pictograms, cartograms, bar...
TRANSCRIPT
WELCOME
PREPARED BY :G.KAMALAM, B.COM.,M.COM.,M.PHIL.,SET,PGDCA,PGDPM
ASSISTANT PROFESSOR (SF)DEPARTMENT OF COMMERCE CSST.JOHN’S COLLEGEPALAYAMKOTTAI
BUSINESS STATISTICS
INTRODUCTION TO STATISTICS
Importance :
• Statistical knowledge helps you use theproper methods to collect the data, employthe correct analyses, and effectively presentthe results.
• Statistics is a crucial process behind how wemake discoveries in science, make decisionsbased on data, and make predictions
Limitations
• Statistical methods are best applicable toquantitative data.
• Statistics cannot be applied to heterogeneousdata.
• If sufficient care is not exercised in collecting,analyzing and interpreting the data, statisticalresults might be misleading.
• Only a person who has an expert knowledge ofstatistics can handle statistical data efficiently.
Sources Of Data
• Primary, or "statistical" sources are data thatare collected primarily for creating officialstatistics, and include statistical surveysand censuses. Secondary, or "non-statistical"sources, are data that have been primarilycollected for some other purpose(administrative data, private sector data etc).
Techniques
• Two main statistical methods are used indata analysis: descriptive statistics, whichsummarize data from a sample using indexessuch as the mean or standard deviation, andinferential statistics, which draw conclusionsfrom data that are subject to random variation(e.g., observational errors, samplingvariation).
Sampling
In statistics, quality assurance, and surveymethodology, sampling is the selection of asubset of individuals from within a statisticalpopulation to estimate characteristics of thewhole population. Statisticians attempt for thesamples to represent the population inquestion
Classification And Tabulation Of Data
• A process of condensing data and presentingit in a compact form, byputting data into statistical table, iscalled tabulation
• Data classification is based on similarattributes and variables of the observations.Conversely, in tabulation the data is arrangedin rows and columns
Diagrammatic Representation Of Data
• Diagrammatic presentation is a technique ofpresenting numeric data through Pictograms,Cartograms, Bar Diagrams & Pie Diagrams etc.It is the most attractive and appealing way torepresent statistical data
Measures Of Central Tendency
• A measure of central tendency is a summarystatistic that represents the center point ortypical value of a dataset. ... In statistics, thethree most common measures ofcentral tendency are the mean, median, andmode. Each of these measures calculates thelocation of the central point using a differentmethod.
Mean
• The statistical mean refers to the mean oraverage that is used to derive the centraltendency of the data in question. It isdetermined by adding all the data points in apopulation and then dividing the total by thenumber of points. The resulting number isknown as the mean or the average.
Median
• The median is a simple measure of centraltendency. To find the median, we arrange theobservations in order from smallest to largestvalue. If there is an odd number ofobservations, the median is the middle value.If there is an even number of observations,the median is the average of the two middlevalues
Mode
• The mode of a set of data values is the valuethat appears most often. If X is a discreterandom variable, the mode is the value x atwhich the probability mass function takes itsmaximum value. In other words, it is the valuethat is most likely to be sampled
Formula for Mean
1) Arithmetic Mean:• x̄ = ( Σ xi ) / n• x ̄ =∑fx / ∑f
2) Geometric Mean:• Gm =∑ log x / n• Gm = ∑ f log x / n• Gm = ∑ f log m / n
3) Harmonic Mean:• Hm = N / ∑ (1/x)• Hm = N / ∑ f(1/x)
Formulae
Median:
• Median = (n+1/2)th item
• Median = L + (n/2) – cf / f x c
Mode:
• Z = L + Δ 1 / Δ 1 + Δ 2 x c
Measures Of Dispersion
Range:
• The range of a set of data is the differencebetween the largest and smallest values.
• Formula is Range = L-S
Mean Deviation
• Definition of mean deviation. : the mean ofthe absolute values of the numericaldifferences between the numbers of a set(such as statistical data) and their mean ormedian.
• Formula is MD = ∑ |D| / N
MD = ∑ F|D| / N, this is used for both discrete and continuous series
Standard Deviation
• The standard deviation is a statistic thatmeasures the dispersion of a dataset relativeto its mean and is calculated as the squareroot of the variance
• Formula is σ = √ ∑ x2 / N
σ = √ ∑ Fx2 / N
σ = √ ∑ Fm2 / N
Skewness
• In probability theory and statistics, skewnessis a measure of the asymmetry of theprobability distribution of a real-valuedrandom variable about its mean. Theskewness value can be positive or negative, orundefined.
Formula of Skewness
• skewness = (3 * (mean - median)) / standarddeviation
Coefficient of Skewness:
• The coefficient compares the sample distributionwith a normal distribution. The larger the value,the larger the distribution differs from a normaldistribution. A value of zero meansno skewness at all. A large negative value meansthe distribution is negatively skewed.
CORRELATION
• Correlation is a statistical technique that canshow whether and how strongly pairs ofvariables are related. For example, height andweight are related; taller people tend to beheavier than shorter people . Anintelligent correlation analysis can lead to agreater understanding of your data.
Types of correlation
• Usually, in statistics, we measure four types ofcorrelations:
• Pearson correlation
• Kendall rank correlation
• Spearman correlation
• Point-Biserial correlation.
Regression
In statistical modeling, regression analysis is a set ofstatistical processes for estimating therelationships between a dependent variable andone or more independent variables.
Analysis:
• Regression analysis is a setof statistical methodsused for the estimation of relationships between adependent variable and one or more independentvariables
Regression Equations
• The Regression Equation. A regressionequation is a statistical model thatdetermined the specific relationship betweenthe predictor variable and the outcomevariable.
Regression Formula:
• Y= a + bX
Index Numbers
• An index number is the measure of change ina variable (or group of variables) overtime. Index numbers are one of the mostused statistical tools in economics. Indexnumbers are not directly measurable, butrepresent general, relative changes. They aretypically expressed as percents
Types of Index Numbers
• One such very important tool are indexnumbers. They help reveal the trends andtendencies of the economy and also help inthe formulation of economic policies and laws.There are broadly three types of indexnumbers – price index numbers, value indexnumbers, and quantity index numbers.
Analysis of Time Series
• Time series analysis is a statistical techniquethat deals with time series data, ortrend analysis. ... Time series data: A set ofobservations on the values that a variabletakes at different times. Cross-sectional data:Data of one or more variables, collected at thesame point in time.
Importance
• Time series analysis is use in order tounderstand the underlying structure andfunction that produce the observations.Understanding the mechanisms of a timeseries allows a model to be developed thatexplains the data in such a way thatprediction, monitoring, or control can occur.
Components of Time series
• An observed time series can be decomposedinto three components:
• the trend (long term direction)
• the seasonal (systematic, calendar relatedmovements)
• irregular (unsystematic, short termfluctuations).
Methods of Least Square
• The least squares method isa statistical procedure to find the best fit for aset of data points by minimizing the sum ofthe offsets or residuals of points from theplotted curve. Least squares regression isused to predict the behavior of dependentvariables
Measurement of Trend
• Measurement of Trend by the Method ofMoving Average
• It measures the trend by eliminating thechanges or the variations by means of amoving average. The simplest of the meanused for the measurement of a trend is thearithmetic means (averages).
THANK YOU