mca_unit-3_computer oriented numerical statistical methods

Unit-3 FREQUENCY DISTRIBUTION

RAI UNIVERSITY, AHMEDABAD

Course: MCA Subject: Computer Oriented Numerical

Statistical Methods Unit-3




Unit-III- Frequency Distribution

Sr. No.   Name of the Topic

1    Introduction, Collection of data, Classification of data
2    Introduction to frequency distribution, Class limit, Class interval, Class frequency, Class mark, Class boundaries, Width of a class
3    Frequency density, Relative frequency, Percentage frequency, Cumulative frequency
4    Introduction, Arithmetic mean, Mean for discrete frequency distribution, Mean for continuous frequency distribution, Weighted arithmetic mean
5    Properties of A.M., Merits and demerits of A.M.
6    Median for raw data, Discrete frequency distribution, Continuous frequency distribution
7    Merits and demerits of median
8    Mode for raw data, D.f.s., c.f.s.
9    Merits and demerits of mode
10   Introduction, Range, Coefficient of range, Merit and demerit of range
11   Quartiles, Quartile deviation, Coefficient of quartile deviation
12   Mean deviation and coefficient of mean deviation, Merit and demerit of mean deviation
13   Standard deviation and variance for all types of frequency distribution
14   Coefficient of variation



1.1 Introduction:

A sequence of observations made on a set of objects included in a sample drawn from a population is known as statistical data.

(1) Ungrouped data

Data which have not been arranged in a systematic order are called raw data or ungrouped data.

(2) Grouped data

Data presented in the form of a frequency distribution are called grouped data.

1.2 Collection of Data:

The first step in any enquiry (investigation) is the collection of data. The data may be collected for the whole population or for a sample only; it is mostly collected on a sample basis. Collection of data is a very difficult job. The enumerator or investigator is the well-trained person who collects the statistical data. The respondents (informants) are the persons from whom the information is collected.

1.3 Classification of Data:

Data classification is the process of organizing data into categories for its most

effective and efficient use.

A well-planned data classification system makes essential data easy to find and

retrieve.

Written procedures and guidelines for data classification should define what

categories and criteria the organization will use to classify data and specify the

roles and responsibilities of employees within the organization regarding data

stewardship. Once a data-classification scheme has been created, security standards

that specify appropriate handling practices for each category and storage standards

that define the data's lifecycle requirements should be addressed.

Here is an example of what a data classification scheme might look like:



Category 4: Highly sensitive corporate and customer data that if disclosed could

put the organization at financial or legal risk.

Example: Employee social security numbers, customer credit card numbers

Category 3: Sensitive internal data that if disclosed could negatively affect

operations.

Example: Contracts with third-party suppliers, employee reviews

Category 2: Internal data that is not meant for public disclosure.

Example: Sales contest rules, organizational charts

Category 1: Data that may be freely disclosed with the public.

Example: Contact information, price lists

1.3.1 Types of Data:

There are two types (sources) for the collection of data.

(1) Primary Data (2) Secondary Data

(1) Primary Data:

Primary data are first-hand information collected, compiled and published by an organization for some purpose. They are the most original data in character and have not undergone any sort of statistical treatment.

Example: Population census reports are primary data because they are collected, compiled and published by the population census organization.

(2) Secondary Data:

Secondary data are second-hand information which has already been collected by someone (an organization) for some purpose and is available for the present study. Secondary data are not pure in character and have undergone some treatment at least once.

Example: The Economic Survey of England is secondary data because it is collected by more than one organization, such as the Bureau of Statistics, the Board of Revenue, the banks, etc.



1.3.2 Difference between Primary and Secondary Data:

The difference between primary and secondary data is only a change of hand. Primary data are first-hand information collected directly from one source. They are the most original data in character and have not undergone any sort of statistical treatment, while secondary data are obtained from some other sources or agencies. They are not pure in character and have undergone some treatment at least once.

For example: Suppose we are interested in finding the average age of MS students. We can collect the age data by two methods: either by collecting it directly from each student personally or by getting the ages from the university record. The data collected by direct personal investigation are called primary data and the data obtained from the university record are called secondary data.

1.3.3 Types of Classification:

1. Geographical Classification:

i.e. Area wise, e.g. cities, districts, etc.

2. Chronological Classification:

i.e. on the basis of time

3. Qualitative Classification:

i.e. according to some attributes

4. Quantitative Classification:

i.e. in terms of magnitudes

Quantitative classification refers to the classification of data according to some

characteristics that can be measured, such as height, weight, income, sales, profits

etc.

In this type of classification there are two elements, namely:

(i) The variable

(ii) The frequency: frequency is how often something occurs.



2.1 Definition: Frequency Distribution

By counting frequencies we can make a frequency distribution table.

There are two types of frequency distribution:

(a) Discrete frequency distribution.

(b) Continuous frequency distribution.

Example: Newspapers

These are the numbers of newspapers sold at a local shop over the last 10 days:

22, 20, 18, 23, 20, 25, 22, 20, 18, 20

Let us count how many times each number occurs. Table-A is an example of a discrete frequency distribution and Table-B is an example of a continuous frequency distribution.

TABLE-A

Papers Sold   Frequency
18            2
19            0
20            4
21            0
22            2
23            1
24            0
25            1

It is also possible to group the values. Here they are grouped into classes:

TABLE-B

Papers Sold   Frequency
15-19         2
20-24         7
25-29         1
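The counting behind Table-A and Table-B can be sketched in a few lines of Python. This is only an illustration: the names `sales`, `table_a`, `table_b` and `classes` are introduced here and are not part of the original notes.

```python
from collections import Counter

# Newspapers sold over the last 10 days (data from the example above).
sales = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Table-A: discrete frequency distribution, one row per value from 18 to 25.
counts = Counter(sales)
table_a = {value: counts.get(value, 0)
           for value in range(min(sales), max(sales) + 1)}

# Table-B: grouped distribution with the inclusive classes 15-19, 20-24, 25-29.
classes = [(15, 19), (20, 24), (25, 29)]
table_b = {f"{lo}-{hi}": sum(1 for x in sales if lo <= x <= hi)
           for lo, hi in classes}

for label, freq in table_b.items():
    print(label, freq)
```

The class limits are treated as inclusive at both ends, which matches the way Table-B groups the whole-number sales figures.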

2.2 Class Limit:

Class limits are the smallest and largest observations (data, events etc.) in each class. Therefore, each class has two limits: a lower and an upper.

For example, in Table-B above, for the first class 15-19:

The lower class limit is = 15

The upper class limit is = 19

2.3 Class Interval:

The difference between the upper and lower limit of a class is known as class

interval of that class.

Table-B


15-19 2

20-24 7

25-29 1

For example, in Table-B the class 15-19 has class interval = 19 - 15 = 4.

A simple formula to estimate an appropriate class interval is

$i = \frac{L - S}{k}$

where $L$ = largest item, $S$ = smallest item, $k$ = number of classes.



2.3.1 Example: If the salaries of 100 employees in a commercial undertaking vary between Rs. 10,000 and Rs. 30,000 and we want to form 10 classes, find the class interval.

Solution:

$i = \frac{L - S}{k}$

Here $L = 30{,}000$, $S = 10{,}000$, $k = 10$.

$i = \frac{30{,}000 - 10{,}000}{10} = \frac{20{,}000}{10} = 2000$

2.4 Class Frequency:

The number of observations corresponding to a particular class is known as the

frequency of that class or the class frequency.

Table-B


15-19 2

20-24 7

25-29 1

For example, in Table-B the class 20-24 has frequency 7, i.e. there are 7 days on which the number of papers sold was between 20 and 24.

If we add together frequencies of all individual classes, we obtain the total

frequency.

The total frequency of Table-B = 2+7+1=10

2.5 Class mark or class mid-point:

It is the value lying half-way between the lower and upper class limits of a class interval.

$\text{Class mark} = \frac{\text{upper limit of the class} + \text{lower limit of the class}}{2}$

2.6 Class Boundaries:

Class Boundaries are the midpoints between the upper class limit of a class and the

lower class limit of the next class in the sequence. Therefore, each class has an

upper and lower class boundary.

For Example: Table-B


15-19 2

20-24 7

25-29 1

For the first class in Table-B, 15-19:

The lower class boundary is the midpoint between 14 and 15, that is 14.5

The upper class boundary is the midpoint between 19 and 20, that is 19.5

2.7 Width of a Class:

The width of a class is the difference between two consecutive lower class limits or, equivalently, the difference between two consecutive upper class limits.

For example, in Table-B:


15-19 2

20-24 7

25-29 1

Difference between two consecutive lower class limits

20-15 = 5



Difference between two consecutive upper class limits

24-19 = 5

∴Class width=5
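The class mark, class boundaries and class width defined in sections 2.5 to 2.7 can be computed together. The sketch below assumes inclusive class limits with a constant gap between consecutive classes (1 for Table-B, since 19 is followed by 20); the function name `class_summary` is hypothetical.

```python
def class_summary(lower, upper, gap=1):
    """Class mark, boundaries and width for a class with inclusive limits.

    `gap` is the distance between the upper limit of one class and the
    lower limit of the next class (1 for Table-B: 19 -> 20).
    """
    lower_boundary = lower - gap / 2
    upper_boundary = upper + gap / 2
    return {
        "mark": (lower + upper) / 2,
        "lower_boundary": lower_boundary,
        "upper_boundary": upper_boundary,
        "width": upper_boundary - lower_boundary,
    }

# Classes of Table-B.
for lower, upper in [(15, 19), (20, 24), (25, 29)]:
    print(f"{lower}-{upper}", class_summary(lower, upper))
```

Note that the width (5) is taken between the class boundaries, so it agrees with the difference between consecutive lower limits, whereas the class interval of section 2.3 (upper limit minus lower limit) is 4.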

3.1 Frequency Density:

$\text{Frequency density} = \frac{\text{frequency}}{\text{class width}}$

Frequency density is used to draw a histogram.

The following table shows the ages of 25 children on a school bus:

Age Frequency

5-10 6

11-15 15

16-17 4

> 17 0

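To draw the histogram we need the frequency density of each class. The ages are whole numbers, so class 5-10 covers 6 ages and class 11-15 covers 5 ages.

For class 5-10:

$\text{Frequency density} = \frac{\text{frequency}}{\text{class width}} = \frac{6}{6} = 1$

For class 11-15:

$\text{Frequency density} = \frac{\text{frequency}}{\text{class width}} = \frac{15}{5} = 3$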



3.2 Relative frequency:

Relative frequency is the ratio of the number of times an event occurs to the

number of occasions on which it might occur in the same period.

In other words, how often something happens divided by all outcomes.

Example: if your team has won 9 games from a total of 12 games played:

* the frequency of winning is 9

* the relative frequency of winning is $\frac{9}{12} = 0.75 = 75\%$

3.3 Percentage frequency:

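Percentage frequency expresses the frequency of a class as a percentage of the total frequency.

$\text{Percentage frequency} = \frac{100 \times \text{frequency of given class}}{\text{total frequency}}$

For example, in Table-B:

Papers Sold   Frequency
15-19         2
20-24         7
25-29         1

For class 15-19 the frequency is 2 and the total frequency is 10.

$\therefore \text{Percentage frequency for class 15-19} = \frac{100 \times 2}{10} = 20\%$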



3.4 Cumulative frequency:

The total of a frequency and all frequencies so far in a frequency distribution.

It is the 'running total' of frequencies.
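As an illustration, the relative, percentage and cumulative frequencies of Table-B can be obtained as follows. The variable names are chosen here for the sketch only.

```python
from itertools import accumulate

# Table-B: classes and their frequencies.
labels = ["15-19", "20-24", "25-29"]
freqs = [2, 7, 1]
total = sum(freqs)  # total frequency

relative = [f / total for f in freqs]          # share of the total
percentage = [100 * f / total for f in freqs]  # same share in per cent
cumulative = list(accumulate(freqs))           # running total of frequencies

for row in zip(labels, freqs, relative, percentage, cumulative):
    print(*row)
```

The last cumulative frequency always equals the total frequency, which is a quick check on the table.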

4.1 Introduction:

A measure of central tendency is a single value that attempts to describe a set of

data by identifying the central position within that set of data. Measures of central

tendency are sometimes called measures of central location. They are also classed

as summary statistics. The mean (often called the average) is most likely the

measure of central tendency that you are most familiar with, but there are others,

such as the median and the mode.

The mean, median and mode are all valid measures of central tendency.

4.2 Arithmetic Mean:

The most popular and widely used measure for representing the entire data by one value is what most laymen call an average and what statisticians call the arithmetic mean. Its value is obtained by adding together all the items and dividing this total by the number of items.

Arithmetic mean may either be

i. Simple arithmetic mean, or

ii. Weighted arithmetic mean

We first discuss the simple arithmetic mean.

There are two methods for finding the simple arithmetic mean:

1. Direct Method

2. Short-cut Method

4.2.1 Direct Method for finding Arithmetic Mean:

$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{\sum X}{N}$

Here $\bar{X}$ = arithmetic mean

$\sum X$ = sum of all the values of the variable $X$

$N$ = number of observations

Steps for finding Arithmetic mean :

1. Add together all the values of the variable 𝑋 and obtain the total

i.e., ∑ 𝑋

2. Divide this total by the number of observations, i.e., 𝑁

Example: The following data are the monthly incomes (in Rs.) of 10 employees in an office:

14780, 15760, 26690, 27750, 24840, 24920, 16100, 17810, 27050, 26950

Calculate the arithmetic mean of the incomes.

Solution: Here $N = 10$.

$\therefore \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{\sum X}{N}$

$\therefore \bar{X} = \frac{14780 + 15760 + 26690 + 27750 + 24840 + 24920 + 16100 + 17810 + 27050 + 26950}{10}$

$\therefore \bar{X} = \frac{2{,}22{,}650}{10}$

$\therefore \bar{X} = 22{,}265$

Hence the average income is Rs. 22,265.
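The direct method translates almost word for word into code. The following minimal sketch uses the income data of the example; the function name `arithmetic_mean` is introduced here for illustration.

```python
def arithmetic_mean(values):
    """Direct method: sum of all values divided by the number of values."""
    return sum(values) / len(values)

# Monthly incomes (in Rs.) of the 10 employees.
incomes = [14780, 15760, 26690, 27750, 24840,
           24920, 16100, 17810, 27050, 26950]

mean_income = arithmetic_mean(incomes)
print(sum(incomes), mean_income)
```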

4.2.2 Short-cut Method for finding Arithmetic Mean:

The arithmetic mean can also be calculated from an arbitrary origin. When deviations are taken from an arbitrary origin, the formula for calculating the arithmetic mean is

$\bar{X} = A + \frac{\sum d}{N}$

where $A$ is the assumed mean and $d$ is the deviation of an item from the assumed mean, i.e. $d = (X - A)$.

Steps for finding Arithmetic mean by Short-cut Method:

1. Take an assumed mean.

2. Take the deviations of the items from the assumed mean and denote these deviations by $d$.

3. Obtain the sum of these deviations, i.e. $\sum d$.

4. Apply the formula $\bar{X} = A + \frac{\sum d}{N}$.

Example: The following data are the monthly incomes (in Rs.) of 10 employees in an office:

14780, 15760, 26690, 27750, 24840, 24920, 16100, 17810, 27050, 26950

Calculate the arithmetic mean of the incomes by using the short-cut method.

Solution:

Suppose the assumed mean is $A = 22000$.

Employee   Income (Rs.)   d = (X - 22000)
1          14,780         -7220
2          15,760         -6240
3          26,690         +4690
4          27,750         +5750
5          24,840         +2840
6          24,920         +2920
7          16,100         -5900
8          17,810         -4190
9          27,050         +5050
10         26,950         +4950
N = 10                    Σd = 2650

$\bar{X} = A + \frac{\sum d}{N}$

Here $A = 22000$, $\sum d = 2650$, $N = 10$.

$\bar{X} = 22000 + \frac{2650}{10} = 22{,}265$

Hence the average income is Rs. 22,265.
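A short sketch of the short-cut method shows that the result does not depend on which assumed mean is chosen. The function name `shortcut_mean` and the second assumed mean (20000) are illustrative choices, not taken from the notes.

```python
def shortcut_mean(values, assumed_mean):
    """Short-cut method: A + (sum of deviations from A) / N."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)

incomes = [14780, 15760, 26690, 27750, 24840,
           24920, 16100, 17810, 27050, 26950]

# Assumed mean used in the worked example.
print(shortcut_mean(incomes, 22000))
# Any other assumed mean leads to the same arithmetic mean.
print(shortcut_mean(incomes, 20000))
```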

4.3 Calculation of Arithmetic Mean-Discrete frequency Distribution:

In discrete series arithmetic mean may be computed by applying

1. Direct Method

2. Short-Cut Method

4.3.1 Direct Method:

The formula for computing the mean is

$\bar{X} = \frac{\sum fX}{N}$

where $f$ = frequency

$X$ = the variable in question

$N$ = total number of observations, i.e. $\sum f$

Steps for finding Arithmetic Mean for Discrete frequency Distribution:

1. Multiply the frequency of each row by the variable and obtain the total $\sum fX$.

2. Divide the total obtained in step 1 by the number of observations, i.e. the total frequency.

Example: From the following data of the marks obtained by 60 students of a class, calculate the arithmetic mean by the direct method:

Marks   No. of Students
20      8
30      12
40      20
50      10
60      6
70      4

Solution:

Let the marks be denoted by $X$ and the number of students by $f$.

Calculation of Arithmetic Mean

Marks (X)   No. of Students (f)   fX
20          8                     160
30          12                    360
40          20                    800
50          10                    500
60          6                     360
70          4                     280
            N = 60                ΣfX = 2460

$\bar{X} = \frac{\sum fX}{N} = \frac{2460}{60} = 41$

Hence, the average marks = 41

4.3.2 Short-Cut Method:

According to this method,

$\bar{X} = A + \frac{\sum fd}{N}$

where $A$ = assumed mean;

$d = (X - A)$;

$N$ = total number of observations, i.e. $\sum f$.

Steps for finding Arithmetic Mean for Discrete frequency Distribution:

1. Take an assumed mean.

2. Take the deviations of the variable X from the assumed mean and

denote the deviations by 𝑑.

3. Multiply these deviations with the respective frequency and take the

total ∑ 𝑓𝑑.

4. Divide the total obtained in third step by the total frequency.

Example: From the following data of the marks obtained by 60 students of a class, calculate the arithmetic mean by the short-cut method:

Marks   No. of Students
20      8
30      12
40      20
50      10
60      6
70      4

Solution:

Suppose the assumed mean is $A = 40$.

Calculation of Arithmetic Mean by short-cut Method

Marks (X)   No. of Students (f)   d = (X - 40)   fd
20          8                     -20            -160
30          12                    -10            -120
40          20                    0              0
50          10                    +10            +100
60          6                     +20            +120
70          4                     +30            +120
            N = 60                               Σfd = 60

$\bar{X} = A + \frac{\sum fd}{N} = 40 + \frac{60}{60} = 40 + 1 = 41$

Hence the arithmetic mean by the short-cut method is 41.

Note: The value of the arithmetic mean is the same under both methods, so either one can be used for finding the arithmetic mean.
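Both methods for a discrete frequency distribution can be checked against each other. The sketch below uses the marks data of the example; the function names `mean_direct` and `mean_shortcut` are hypothetical.

```python
def mean_direct(xs, fs):
    """Direct method: sum of f*X divided by the total frequency."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)

def mean_shortcut(xs, fs, assumed_mean):
    """Short-cut method: A + (sum of f*d) / N with d = X - A."""
    total_fd = sum(f * (x - assumed_mean) for x, f in zip(xs, fs))
    return assumed_mean + total_fd / sum(fs)

marks = [20, 30, 40, 50, 60, 70]
students = [8, 12, 20, 10, 6, 4]

print(mean_direct(marks, students))        # direct method
print(mean_shortcut(marks, students, 40))  # short-cut method with A = 40
```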

4.4 Calculation of Arithmetic Mean – Continuous Frequency Distribution



In continuous frequency distribution arithmetic mean may be computed by

applying any of the following methods:

1. Direct Method

2. Short-Cut Method

4.4.1 Direct Method: (Mean of Continuous frequency distribution)

When the direct method is used,

$\bar{X} = \frac{\sum fm}{N}$

where $m$ = mid-point of each class

$f$ = frequency of each class

$N$ = total frequency

Steps for finding Arithmetic mean by Direct Method (C.F.D.):

1. Obtain the mid-point of each class and denote it by $m$.

2. Multiply these mid-points by the respective frequency of each class and obtain the total $\sum fm$.

3. Divide the total obtained in step 2 by the sum of the frequencies, i.e. $N$.

Example: From the following data compute the arithmetic mean by the direct method:

Marks             0-10   10-20   20-30   30-40   40-50   50-60
No. of Students   5      10      25      30      20      10

Solution:

Calculation for Arithmetic Mean

Marks (X)   Mid-point (m)   No. of Students (f)   fm
0-10        5               5                     25
10-20       15              10                    150
20-30       25              25                    625
30-40       35              30                    1050
40-50       45              20                    900
50-60       55              10                    550
                            N = 100               Σfm = 3,300

$\bar{X} = \frac{\sum fm}{N} = \frac{3300}{100} = 33$

Hence the value of the arithmetic mean is 33.

4.4.2 Short-Cut Method: (Mean of Continuous frequency distribution)

When the short-cut method is used, the arithmetic mean is computed by applying the following formula:

$\bar{X} = A + \frac{\sum fd}{N}$

where $A$ = assumed mean

$d$ = deviation of a mid-point from the assumed mean, i.e. $(m - A)$

$N$ = total number of observations

Steps for finding Arithmetic Mean by Short-Cut Method: (C.F.D.)

1. Take an assumed mean.

2. From the mid-point of each class deduct the assumed mean.

3. Multiply the respective frequencies of each class by these deviations and obtain the total $\sum fd$.

4. Apply the formula $\bar{X} = A + \frac{\sum fd}{N}$.

Example: From the following data compute the arithmetic mean by the short-cut method:

Marks             0-10   10-20   20-30   30-40   40-50   50-60
No. of Students   5      10      25      30      20      10

Solution:

Take the assumed mean $A = 35$.

$N = 100$

Calculation of Arithmetic mean by short-cut method

Marks (X)   Mid-point (m)   No. of Students (f)   d = (m - 35)   fd
0-10        5               5                     -30            -150
10-20       15              10                    -20            -200
20-30       25              25                    -10            -250
30-40       35              30                    0              0
40-50       45              20                    +10            +200
50-60       55              10                    +20            +200
                            N = 100                              Σfd = -200

$\bar{X} = A + \frac{\sum fd}{N} = 35 - \frac{200}{100} = 35 - 2 = 33$

Hence the value of the arithmetic mean by the short-cut method for the continuous frequency distribution is 33.
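For a continuous frequency distribution the only extra step is replacing each class by its mid-point. The following sketch reproduces both calculations for the marks data; all variable names are chosen here for illustration.

```python
# Classes as (lower, upper) limits with their frequencies.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 10, 25, 30, 20, 10]

# Step 1: mid-point of each class.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
n = sum(freqs)

# Direct method: sum of f*m divided by N.
mean_direct = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Short-cut method with assumed mean A = 35: A + sum of f*(m - A) / N.
assumed = 35
sum_fd = sum(f * (m - assumed) for f, m in zip(freqs, midpoints))
mean_shortcut = assumed + sum_fd / n

print(mean_direct, sum_fd, mean_shortcut)
```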

Example: The mean marks of 100 students were found to be 40. Later on it was discovered that a score of 53 was misread as 83. Find the correct mean corresponding to the correct score.

Solution:

We are given $N = 100$, $\bar{X} = 40$.

Since $\bar{X} = \frac{\sum X}{N}$,

$\sum X = N\bar{X} = 100 \times 40 = 4000$

But this is not the correct $\sum X$.

Correct $\sum X$ = Incorrect $\sum X$ - wrong item + correct item

Correct $\sum X = 4000 - 83 + 53 = 3970$

$\therefore \text{Correct } \bar{X} = \frac{\text{Correct } \sum X}{N} = \frac{3970}{100} = 39.7$

Hence the correct mean = 39.7

4.5 Weighted arithmetic mean

One of the limitations of the arithmetic mean discussed above is that it gives equal importance to all the items. But there are cases where the relative importance of the different items is not the same. The formula for computing the weighted arithmetic mean is:

$\bar{X}_w = \frac{\sum WX}{\sum W}$

where $\bar{X}_w$ represents the weighted arithmetic mean, $X$ represents the variable values, and $W$ represents the weights attached to the variable values.

Steps for finding Weighted mean:

1. Multiply the weights by the variable $X$ and obtain the total $\sum WX$.

2. Divide this total by the sum of the weights, i.e. $\sum W$.

Example: Calculate the weighted average of the following data:

Course            BA   BSc   MA   MCA   MBA
% of Pass         70   65    75   90    80
No. of Students   20   30    30   50    40

Solution:

Calculation for weighted average (mean)

% of Pass (X)   No. of Students (W)   WX
70              20                    1400
65              30                    1950
75              30                    2250
90              50                    4500
80              40                    3200
ΣX = 380        ΣW = 170              ΣWX = 13300

$\text{Weighted Average} = \frac{\sum WX}{\sum W} = \frac{13300}{170} = 78.24$

Example: A train runs 25 miles at a speed of 30 m.p.h., another 50 miles at a speed of 40 m.p.h., then due to repairs of the track travels for 6 minutes at a speed of 10 m.p.h., and finally covers the remaining 24 miles at a speed of 24 m.p.h. What is the average speed in miles per hour?

Solution:

Time taken in covering 25 miles at a speed of 30 m.p.h. = 50 minutes.

Time taken in covering 50 miles at a speed of 40 m.p.h. = 75 minutes.

Distance covered in 6 minutes at a speed of 10 m.p.h. = 1 mile.

Time taken in covering 24 miles at a speed of 24 m.p.h. = 60 minutes.

Therefore, taking the time taken as weights, we have the weighted mean as

Speed in m.p.h. (X)   Time taken in min (W)   WX
30                    50                      1500
40                    75                      3000
10                    6                       60
24                    60                      1440
                      ΣW = 191                ΣWX = 6000

$\therefore \text{Weighted arithmetic mean} = \bar{X}_w = \frac{\sum WX}{\sum W} = \frac{6000}{191} = 31.41 \text{ m.p.h.}$

Hence the value of the weighted arithmetic mean is 31.41 m.p.h.
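The train example can be checked in code, both through the weighted mean formula and through the more direct "total distance divided by total time". This is a sketch only; `weighted_mean` is a name introduced here.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum of W*X divided by sum of W."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

speeds = [30, 40, 10, 24]   # speed on each stretch, in m.p.h.
minutes = [50, 75, 6, 60]   # time spent at each speed, used as weights

avg_speed = weighted_mean(speeds, minutes)

# Cross-check: total distance (miles) divided by total time (hours).
distance = sum(s * t / 60 for s, t in zip(speeds, minutes))
hours = sum(minutes) / 60

print(round(avg_speed, 2), distance, round(distance / hours, 2))
```

Weighting the speeds by time works because speed multiplied by time gives distance, so the weighted mean is exactly total distance over total time.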

5.1 Mathematical Properties Of Arithmetic Mean:

The following are a few important mathematical properties of the arithmetic mean:

1. The sum of the deviations of the items from the arithmetic mean (taking signs into account) is always zero, i.e. $\sum (X - \bar{X}) = 0$.

2. The sum of the squared deviations of the items from the arithmetic mean is minimum, that is, less than the sum of the squared deviations of the items from any other value.

3. For any value $A$ other than $\bar{X}$, $\sum (X - A)^2$ is greater than $\sum (X - \bar{X})^2$. This property, that the sum of the squared deviations is least when taken from the mean, is of immense use in regression analysis, which shall be discussed later.
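These properties can be verified numerically. The sketch below reuses the income data from section 4.2 as an arbitrary test set; the helper `sum_squared_deviations` is hypothetical.

```python
values = [14780, 15760, 26690, 27750, 24840,
          24920, 16100, 17810, 27050, 26950]
mean = sum(values) / len(values)

# Property 1: deviations from the mean add up to zero.
sum_of_deviations = sum(x - mean for x in values)

def sum_squared_deviations(data, origin):
    """Sum of squared deviations of the data from a chosen origin."""
    return sum((x - origin) ** 2 for x in data)

# Property 2: the sum of squares is smallest when taken about the mean.
about_mean = sum_squared_deviations(values, mean)
about_22000 = sum_squared_deviations(values, 22000)

# The two sums differ by exactly N * (mean - origin)^2.
excess = len(values) * (mean - 22000) ** 2

print(sum_of_deviations, about_mean, about_22000, excess)
```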

5.2 Merits of arithmetic mean:

Arithmetic mean is the simplest measurement of central tendency of a group. It is

extensively used because:

It is easy to calculate and easy to understand.

It is based on all the observations.

It is rigidly defined.

It provides good basis of comparison.

It can be used for further analysis and algebraic treatment.

5.3 Demerits of the arithmetic mean:

It is affected by the extreme values.

It may lead to a wrong conclusion.

It is unrealistic.

The arithmetic mean cannot be obtained if even a single observation is missing.

It cannot be located by inspection or by a graphical method.

6.1 Median:



The median is the middle score for a set of data that has been arranged in order of

magnitude.

If the number of observations is even, the average of the two middle values is taken.

The median is better for describing the typical value.

Example:

In order to calculate the median, suppose we have the data below:

65 55 89 56 35 14 56 55 87 45 92

We first need to rearrange that data into order of magnitude (smallest first):

14 35 45 55 55 56 56 65 87 89 92

Our median mark is the middle mark, in this case 56 (the sixth of the eleven ordered values).

The median by definition refers to the middle value in a distribution. The median is

that value of the series which divides the group into two equal parts, one part

comprising all values greater than the median value and the other part comprising

all the values smaller than the median value.

6.2 Steps for Calculation of Median – Individual Observations

1. Arrange the data in ascending or descending order of magnitude. (Both arrangements give the same answer.)

2. Apply the formula: Median $M$ = size of the $\frac{N+1}{2}$th item.

Example: Find the median for the following data:

5, 15, 10, 15, 5, 10, 10, 20, 25 and 15.

Solution:

First arrange all the observations in ascending order:

5, 5, 10, 10, 10, 15, 15, 15, 20, 25

Here $N = 10$.

Hence, Median $M$ = size of the $\frac{N+1}{2}$th item = $\frac{10+1}{2}$ = 5.5th item, i.e. the average of the 5th and 6th items = $\frac{10+15}{2} = 12.5$

Example: Find the median for the following data:

25900, 26950, 27020, 27200, 28280

Solution:

Arranged in ascending order, the observations are:

25900, 26950, 27020, 27200, 28280

Here $N = 5$.

Hence, Median $M$ = size of the $\frac{N+1}{2}$th item = $\frac{5+1}{2} = \frac{6}{2}$ = 3rd item = 27020
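The rule for individual observations (middle item for odd N, average of the two middle items for even N) can be sketched as follows. The function name `median` is chosen for this illustration; Python's standard library offers `statistics.median` with the same behaviour.

```python
def median(values):
    """Median of raw data via the (N + 1) / 2-th item of the sorted values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the (N + 1) / 2-th item is a single middle value.
        return ordered[mid]
    # Even N: average the two items on either side of position (N + 1) / 2.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 15, 10, 15, 5, 10, 10, 20, 25, 15]))
print(median([25900, 26950, 27020, 27200, 28280]))
print(median([65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]))
```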

6.3 Steps for Calculation of Median of Continuous Frequency Distribution:

1. Determine the particular class in which the value of the median lies.

2. Use $\frac{N}{2}$ as the rank of the median and not $\frac{N+1}{2}$.

3. After ascertaining the class in which the median lies, use the following formula to determine the exact value of the median:

$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$

where

$L$ = lower limit of the median class, i.e. the class in which the middle item of the distribution lies.

$c.f.$ = cumulative frequency of the class preceding the median class, i.e. the sum of the frequencies of all classes lower than the median class.

$f$ = simple frequency of the median class.

$i$ = class interval of the median class.

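Example: Find the median of the following data.

Cost               10-20   20-30   30-40   40-50   50-60
Items in a group   4       5       3       6       3

Solution:

Calculation for Median

Cost    Number of items in the group   Cumulative frequency
10-20   4                              4
20-30   5                              9
30-40   3                              12
40-50   6                              18
50-60   3                              21

Here $N = 21 \Rightarrow \frac{N}{2} = 10.5$

The cumulative frequency first reaches 10.5 in the class 30-40, so the median class is 30-40.

From the formula,

$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$

Here $L = 30$, $i = 10$, $c.f. = 9$, and $f = 3$ (the simple frequency of the median class, not its cumulative frequency).

$\text{Median} = 30 + \frac{10.5 - 9}{3} \times 10 = 30 + 5 = 35$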
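The grouped-median formula can be sketched in code. Note that $f$ in the formula is the simple frequency of the median class (3 for the class 30-40), not its cumulative frequency (12); with $f = 3$ the example gives a median of 35. The function name `grouped_median` is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution.

    classes: list of (lower, upper) class limits in ascending order
    freqs:   simple frequency of each class
    """
    half = sum(freqs) / 2           # rank N / 2 of the median
    cumulative = 0                  # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:  # the median falls in this class
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty distribution")

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [4, 5, 3, 6, 3]
print(grouped_median(classes, freqs))
```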

7.1 Mathematical Property of Median:

1. The sum of the deviations of the items from median, ignoring signs, is the

least.

For example, the median of 4, 6, 8, 10, 12 is 8. The deviations from 8 ignoring signs are 4, 2, 0, 2, 4 and the total is 12. This total is smaller than the one obtained if deviations are taken from any other value. Thus if deviations are taken from 7, the values ignoring signs would be 3, 1, 1, 3, 5 and the total 13.

7.2 Merits of Median:


It is based on all the observations.

It is rigidly defined.



It eliminates the impact of extreme values.

It can be used for further analysis and algebraic treatment.

Median can be found out just by inspection in some cases.

7.3 Demerits Of Median:

It simply ignores the extreme values.

It may lead to a wrong conclusion when the distribution of observations is irregular.

In the continuous case the median can only be estimated.

8.1 Mode:

The value of the variable which occurs most frequently in a distribution is called the mode. For a moderately asymmetrical distribution the mode can also be estimated from the empirical relation Mode = 3 Median – 2 Mean.

8.2 Calculation of Mode – Individual Observations.

To determine the mode, count the number of times the various values repeat themselves; the value occurring the maximum number of times is the modal value. The more often the modal value appears relative to the other values, the more valuable the mode is as an average to represent the data.

Example-- The following is the number of problems that Ms. Matty assigned

for homework on 10 different days. What is the mode?

8, 11, 9, 14, 9, 15, 18, 6, 9, 10

Solution:

Ordering the data from least to greatest, we get:

6, 8, 9, 9, 9, 10, 11, 14, 15, 18

The score which occurs most often is 9.

Therefore, the mode is 9.



Example-2 In a crash test, 11 cars were tested to determine what impact speed

was required to obtain minimal bumper damage. Find the mode of the speeds

given in miles per hour below.

24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24

Solution:

Ordering the data from least to greatest, we get:

15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26

Since both 18 and 24 occur three times, the modes are 18 and 24 miles per hour.

This data set is bimodal.

Example-3 A marathon race was completed by 5 participants. What is the

mode of these times given in hours?

2.7 hr, 8.3 hr, 3.5 hr, 5.1 hr, 4.9 hr

Solution:

Ordering the data from least to greatest, we get:

2.7, 3.5, 4.9, 5.1, 8.3

Since each value occurs only once in the data set, there is no mode for this set of

data.
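The three examples (one mode, two modes, no mode) can be reproduced with a small counting routine. This sketch follows the convention used above that a data set in which every value occurs only once has no mode; the function name `modes` is introduced here.

```python
from collections import Counter

def modes(values):
    """Return the list of modal values (empty if no value repeats)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([8, 11, 9, 14, 9, 15, 18, 6, 9, 10]))              # homework problems
print(modes([24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24]))     # crash-test speeds
print(modes([2.7, 8.3, 3.5, 5.1, 4.9]))                        # marathon times
```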

8.3 Steps for Calculation of Mode for Continuous Frequency Distribution:

1. Construct the table.

2. Find the modal class, i.e. the class with the highest frequency.

3. Note the frequency of the modal class and the frequencies of the classes preceding and succeeding it.

4. Apply the formula

$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$

where $L$ = lower limit of the modal class

$f_1$ = frequency of the modal class

$f_0$ = frequency of the class preceding the modal class

$f_2$ = frequency of the class succeeding the modal class

$i$ = class interval of the modal class

Example: Calculate the mode of the following data:

Marks   10-20   20-30   30-40   40-50   50-60
F       5       20      25      15      5

Solution:

Construction of the table to find the mode:

Marks   F
10-20   5
20-30   20
30-40   25
40-50   15
50-60   5

The modal class is 30-40, since the highest frequency (25) occurs there.

$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$

where $L$ = lower limit of the modal class = 30

$f_1$ = frequency of the modal class = 25

$f_0$ = frequency of the class preceding the modal class = 20

$f_2$ = frequency of the class succeeding the modal class = 15

$i$ = class interval of the modal class = 10

$\text{Mode} = 30 + \frac{25 - 20}{(2 \times 25) - 20 - 15} \times 10$

$\text{Mode} = 30 + \frac{50}{15}$

$\text{Mode} = 33.33$
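The grouped-mode formula can be sketched as below. Two points are assumptions of this sketch rather than part of the notes: a missing neighbouring class (when the modal class is the first or last one) is treated as having frequency 0, and ties for the highest frequency are resolved by taking the first such class. The function name `grouped_mode` is hypothetical.

```python
def grouped_mode(classes, freqs):
    """Mode of a continuous frequency distribution.

    Uses Mode = L + (f1 - f0) / (2*f1 - f0 - f2) * i, where the modal
    class is the class with the highest frequency.
    """
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of modal class
    lower, upper = classes[k]
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                  # assumption: 0 if absent
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0     # assumption: 0 if absent
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

classes = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
freqs = [5, 20, 25, 15, 5]
print(round(grouped_mode(classes, freqs), 2))
```

The formula breaks down (division by zero) when the modal class and both neighbours have the same frequency; the sketch does not handle that case.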

9.1 Merits of Mode:


It eliminates the impact of extreme values

It can be identified by using graphical method

9.2 Demerits of Mode

It is not suitable for further mathematical treatments.

It may lead to a wrong conclusion when the distribution is bimodal.

It is difficult to compute in some cases.

Mode is influenced by length of the class interval.

10.1 Introduction

Averages give us information about the concentration of the observations around the central part of the distribution, but they tell us nothing further about the data.

According to George Simpson and Fritz Kafka, “An average does not tell the full story. It is hardly fully representative of a mass, unless we know the manner in which the individual items scatter around it.”

“Dispersion is the measure of the variations of the items.” - A.L. Bowley

10.2 RANGE

The range is the difference between the two extreme values of the given observations.

Range = Largest value – Smallest value

10.3 Co-efficient of Range

$\text{Co-efficient of Range} = \frac{\text{Largest value} - \text{Smallest value}}{\text{Largest value} + \text{Smallest value}} = \frac{L - S}{L + S}$

Example: Find the co-efficient of range of the marks of 10 students from the following:

65, 35, 48, 99, 56, 88, 78, 20, 66, 53

Solution:

Range = L - S

Range = 99 - 20

Range = 79

$\text{Co-efficient of Range} = \frac{99 - 20}{99 + 20} = \frac{79}{119} = 0.66$
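A quick check of the range and its coefficient for the marks data is sketched below; note that the denominator is 99 + 20 = 119. The variable names are illustrative.

```python
marks = [65, 35, 48, 99, 56, 88, 78, 20, 66, 53]

largest, smallest = max(marks), min(marks)
value_range = largest - smallest                   # L - S
coefficient = value_range / (largest + smallest)   # (L - S) / (L + S)

print(value_range, largest + smallest, round(coefficient, 2))
```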

10.4 Merits:

1. It is easy to compute and understand.

2. It gives an idea about the distribution immediately.

10.5 Demerits:

1. The range depends only on the two extreme items, hence it is not reliable.

2. It cannot be applied to open-end classes.

3. It is not suitable for mathematical treatment.

11.1 Quartile Deviation:

The interquartile range is the range which includes the middle 50 per cent of the distribution. That is, one quarter of the observations at the lower end and another quarter of the observations at the upper end of the distribution are excluded in computing the interquartile range. In other words, the interquartile range represents the difference between the third quartile and the first quartile.

Symbolically,

$\text{Interquartile range} = Q_3 - Q_1$

Very often the interquartile range is reduced to the form of the semi-interquartile range, or quartile deviation, by dividing it by 2.

$\text{Quartile Deviation or Q.D.} = \frac{Q_3 - Q_1}{2}$

11.2 Coefficient of Quartile Deviation:

Quartile deviation is an absolute measure of dispersion. The relative measure corresponding to this measure, called the coefficient of quartile deviation, is calculated as follows.

$\text{Coefficient of Q.D.} = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$

Example: Find the value of the quartile deviation and its coefficient from the following data:

Roll No.   1    2    3    4    5    6    7
Marks      20   28   40   12   30   15   50

Solution:

Marks arranged in ascending order: 12 15 20 28 30 40 50

$Q_1$ = size of $\frac{N+1}{4}$th item = $\frac{7+1}{4}$ = 2nd item

Size of the 2nd item is 15. Thus $Q_1 = 15$.

$Q_3$ = size of $3\left(\frac{N+1}{4}\right)$th item = size of $\frac{3 \times 8}{4}$th item = 6th item

Size of the 6th item is 40. Thus $Q_3 = 40$.

$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{40 - 15}{2} = 12.5$

Now we find the coefficient of Q.D.

$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 15}{40 + 15} = \frac{25}{55} = 0.455$
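The quartile calculation for individual observations can be sketched as follows. The positions $(N+1)/4$ and $3(N+1)/4$ are whole numbers in this example; for fractional positions the sketch interpolates linearly between neighbouring items, which is an assumption added here and not something the notes specify. The helper name `item_at` is hypothetical.

```python
def item_at(ordered, position):
    """Value at a 1-based position in sorted data.

    Fractional positions are linearly interpolated between the two
    neighbouring items (an assumption of this sketch).
    """
    whole = int(position)
    frac = position - whole
    value = ordered[whole - 1]
    if frac:
        value += frac * (ordered[whole] - ordered[whole - 1])
    return value

marks = sorted([20, 28, 40, 12, 30, 15, 50])
n = len(marks)

q1 = item_at(marks, (n + 1) / 4)        # 2nd item
q3 = item_at(marks, 3 * (n + 1) / 4)    # 6th item
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(q1, q3, qd, round(coeff_qd, 3))
```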



Example: Compute the coefficient of quartile deviation from the following data:

Marks             10   20   30   40   50   60
No. of Students   4    7    15   8    7    2

Solution:

Calculation of coefficient of quartile deviation

Marks   Frequency   c.f.
10      4           4
20      7           11
30      15          26
40      8           34
50      7           41
60      2           43

$Q_1$ = size of $\frac{N+1}{4}$th item = $\frac{43+1}{4}$ = 11th item

Size of the 11th item is 20. Thus $Q_1 = 20$.

$Q_3$ = size of $3\left(\frac{N+1}{4}\right)$th item = $\frac{3 \times 44}{4}$ = 33rd item

Size of the 33rd item is 40. Thus $Q_3 = 40$.

$\text{Q.D.} = \frac{Q_3 - Q_1}{2} = \frac{40 - 20}{2} = 10$

$\text{Coefficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 20}{40 + 20} = 0.333$
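For a discrete frequency distribution the required item is located through the cumulative frequencies: it is the first value whose c.f. reaches the item's position. A minimal sketch, with the helper name `value_at` introduced here:

```python
from bisect import bisect_left
from itertools import accumulate

marks = [10, 20, 30, 40, 50, 60]
freqs = [4, 7, 15, 8, 7, 2]

cum = list(accumulate(freqs))   # cumulative frequencies
n = cum[-1]                     # total frequency N

def value_at(position):
    """Value of the position-th item: first value whose c.f. >= position."""
    return marks[bisect_left(cum, position)]

q1 = value_at((n + 1) / 4)       # 11th item
q3 = value_at(3 * (n + 1) / 4)   # 33rd item
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(cum, q1, q3, qd, round(coeff_qd, 3))
```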

12.1 Mean Deviation:

Mean deviation is the arithmetic mean of the deviations of a series computed from any measure of central tendency, i.e. deviations from the mean, mode or median. The absolute values of all the deviations are considered.

The mean deviation is also known as the average deviation. It is the average difference between the items in a distribution and the median or mean of that series. Theoretically there is an advantage in taking the deviations from the median, because the sum of the deviations of items from the median is minimum when signs are ignored.

12.2 Computation of Mean Deviation-Individual Observations:

If $X_1, X_2, X_3, \ldots, X_N$ are $N$ given observations, then the mean deviation about an average $A$ is given by

$$\text{M.D.} = \frac{1}{N}\sum |X - A| = \frac{1}{N}\sum |D| = \frac{\sum |D|}{N}$$

where $|D| = |X - A|$, read as "mod $(X - A)$", is the modulus or absolute value of the deviation, ignoring plus and minus signs.

12.2 Steps for Computation of mean deviation: (Individual Observations)

1. Compute the median of the series.

2. Take the deviations of the items from the median, ignoring ± signs, and denote these deviations by $|D|$.

3. Obtain the total of these deviations, i.e. $\sum |D|$.

4. Divide the total obtained in step (3) by the total number of observations.

12.3 Coefficient of Mean Deviation:

The relative measure corresponding to the mean deviation, called the coefficient of mean deviation, is obtained by dividing the mean deviation by the particular average used in computing it. Thus if the mean deviation has been computed from the median, the coefficient of mean deviation is obtained by dividing the mean deviation by the median.

$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Mean, Median or Mode}}$$

Example--Calculate mean deviation and coefficient of mean deviation from

the following data:



100 200 300 400 500 600 700

Solution:

Calculation for Mean Deviation

𝑿 |𝑫| = |𝑿 − 𝑨|=|𝑿 − 𝟒𝟎𝟎|

100 300

200 200

300 100

400 0

500 100

600 200

700 300

∑|𝑫| = 𝟏𝟐𝟎𝟎

$$\text{Arithmetic Mean} = A = \frac{\sum X_i}{N} = \frac{2800}{7} = 400$$

$$\text{M.D.} = \frac{1}{N}\sum |X - A| = \frac{1}{N}\sum |D| = \frac{1200}{7} = 171.43$$

$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Mean}} = \frac{171.43}{400} = 0.4286 \approx 0.43$$
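The calculation for individual observations can be sketched as follows. The function name `mean_deviation` is hypothetical; the average is passed in explicitly because the mean deviation may be taken about the mean, the median or the mode.

```python
def mean_deviation(values, average):
    """Mean of the absolute deviations |X - A| about the chosen average A."""
    return sum(abs(x - average) for x in values) / len(values)


data = [100, 200, 300, 400, 500, 600, 700]
mean = sum(data) / len(data)          # A = 2800 / 7 = 400
md = mean_deviation(data, mean)       # 1200 / 7
coeff = md / mean                     # divide by the same average that was used
print(mean, round(md, 2), round(coeff, 2))
```

In this data set the median is also 400, so taking deviations from the median would give the same result.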

12.2.2 Calculation of Mean Deviation- (Discrete frequency distribution):

In a discrete series the formula for calculating mean deviation is

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

where $|D|$ denotes the deviation from the median, ignoring signs, and $N = \sum f$.

Steps for calculation:

1. Calculate the median of the series.

2. Take the deviations of the items from the median, ignoring signs, and denote them by $|D|$.

3. Multiply each deviation by its frequency and obtain the total $\sum f|D|$.

4. Divide the total obtained in step (3) by the number of observations. This gives the value of the mean deviation.

Example--Calculate mean deviation from the following series

X 10 11 12 13 14

f 3 12 18 12 3

Solution:

Calculation of Mean Deviation

X 𝒇 |𝑫| 𝒇|𝑫| 𝒄. 𝒇.

10 3 2 6 3

11 12 1 12 15

12 18 0 0 33

13 12 1 12 45

14 3 2 6 48

$N = 48$, $\sum f|D| = 36$

Median = size of the $\left(\frac{N+1}{2}\right)$th item = size of the $\left(\frac{48+1}{2}\right)$th item = 24.5th item

Size of 24.5th item is 12, hence Median = 12.

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{36}{48} = 0.75$$
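A short sketch of the discrete-series procedure is given below. The names `median_discrete` and `mean_deviation_discrete` are hypothetical, and the median lookup assumes the values are in ascending order and that the median position falls inside a single value's run of frequencies, as it does here.

```python
from itertools import accumulate


def median_discrete(values, freqs):
    """Value whose cumulative frequency first reaches the (N+1)/2 th position."""
    position = (sum(freqs) + 1) / 2
    for value, cf in zip(values, accumulate(freqs)):
        if cf >= position:
            return value


def mean_deviation_discrete(values, freqs, average):
    """Sum of f * |X - A| divided by N = sum of f."""
    n = sum(freqs)
    return sum(f * abs(x - average) for x, f in zip(values, freqs)) / n


x = [10, 11, 12, 13, 14]
f = [3, 12, 18, 12, 3]
median = median_discrete(x, f)
md = mean_deviation_discrete(x, f, median)
print(median, md)   # 12 0.75
```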

12.2.3 Calculation of Mean Deviation – Continuous Frequency distribution



For calculating mean deviation in a continuous series the procedure remains the same as discussed above. The only difference is that here we have to obtain the mid-points of the various classes and take the deviations of these mid-points from the median (or the mean). The formula is the same, i.e.,

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

Example-- Calculate the mean deviation from mean for the following data:

Class Interval 2-4 4-6 6-8 8-10

frequency 6 8 4 2

Solution:

Calculation for Mean Deviation

Class   Mid value ($m$)   Frequency ($f$)   $fm$   $|D| = |m - A|$   $f|D|$
2-4     3                 6                 18     2.2               13.2
4-6     5                 8                 40     0.2               1.6
6-8     7                 4                 28     1.8               7.2
8-10    9                 2                 18     3.8               7.6
Total                     20                104                      29.6

$$A = \frac{\sum fm}{\sum f} = \frac{104}{20} = 5.2$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{29.6}{20} = 1.48 \quad (\text{here } N = \sum f = 20)$$

$$\text{M.D.} = 1.48$$
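For a continuous series the only extra step is computing the class mid-points. The sketch below works through the same data; the variable names are illustrative and the deviations are taken from the mean, as the example asks.

```python
classes = [(2, 4), (4, 6), (6, 8), (8, 10)]
freqs = [6, 8, 4, 2]

mids = [(lo + hi) / 2 for lo, hi in classes]              # mid value m of each class
n = sum(freqs)                                            # N = sum of f
mean = sum(f * m for f, m in zip(freqs, mids)) / n        # A = sum(fm) / N
md = sum(f * abs(m - mean) for f, m in zip(freqs, mids)) / n
print(round(mean, 2), round(md, 2))
```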

12.4 Merits of Mean Deviation:

1. It is simple to understand and easy to calculate

2. The computation process is based on all items of the series



3. It is less affected by the extreme items.

4. This measure is flexible, since it can be calculated from the mean, median, or mode.

5. This measure is rigidly defined.

12.5 Demerits of Mean Deviation:

1. This measure is not a very accurate measure of dispersion.

2. Not suitable for further mathematical calculation.

3. It is rarely used.

4. Because signs are ignored and only absolute values are considered, it is mathematically unsound.

13.1 Standard Deviation:

The famous statistician Karl Pearson introduced the concept of standard deviation

in 18

This is the most widely accepted measure of dispersion and is used in many statistical applications. Standard deviation is also referred to as root-mean-square deviation or mean square error, because it is the square root of the mean of the squared deviations from the arithmetic mean.

The standard deviation is denoted by the Greek letter $\sigma$.

13.2 Variance:

The term variance was used to describe the square of the standard deviation by R. A. Fisher in 1918.

The concept of variance is highly important in advanced work, where it is possible to split the total variance into several parts, each attributable to one of the factors causing variation in the original series.

Variance is defined as follows

$$\text{Variance} = \frac{\sum (X - A)^2}{N}$$

where $A$ is the arithmetic mean. Thus variance is nothing but the square of the standard deviation,

$$\text{i.e., } \text{Variance} = \sigma^2 \quad \text{or} \quad \sigma = \sqrt{\text{Variance}}$$



In a frequency distribution where deviations are taken from an assumed mean, variance may be computed directly as follows

$$\text{Variance} = \left\{ \frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2 \right\} \times i^2$$

where $d = \frac{X - A}{i}$ and $i$ = class interval.

13.2. Calculation Of Standard Deviation- Individual Observation

There are two methods of calculating standard deviation for individual observations:

(i) Direct Method – Deviation taken from actual mean

(ii) Short- cut Method – Deviation taken from assumed mean

13.2.1 (i) Direct Method:

The following are the steps:

1. Find the actual mean of the given observations.

2. Compute the deviation of each observation from the mean, $(X - \text{Mean})$.

3. Square the deviations and find their sum, i.e. $\sum (X - A)^2$.

4. Divide the total by the number of observations and take the square root of the quotient; this value is the standard deviation.

$$\sigma = \sqrt{\frac{\sum (X - A)^2}{N}}$$

Example-- Calculate the standard deviation from the following data:

15, 12, 17, 10, 21, 18, 11, 16

Solution:

Calculation of S.D. from Mean

Values (𝑿) (𝑿 − 𝑨) (𝑿 − 𝑨)𝟐



15 0 0

12 -3 9

17 2 4

10 -5 25

21 6 36

18 3 9

11 -4 16

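16 1 1

$\sum X = 120$, $\sum (X - A)^2 = 100$

$$A = \frac{\sum X_i}{N} = \frac{120}{8} = 15$$

$$\text{S.D.} = \sigma = \sqrt{\frac{\sum (X - A)^2}{N}} = \sqrt{\frac{100}{8}} = \sqrt{12.5}$$

$$\sigma = 3.54$$

$$\text{Variance} = \sigma^2 = 12.5$$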
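The direct method translates almost line for line into code. The sketch below is illustrative; the function name `std_dev_direct` is hypothetical. The check against `statistics.pstdev` from the Python standard library is included because that function also divides by $N$, as the formula here does.

```python
import math
import statistics


def std_dev_direct(values):
    """Return (sigma, variance) using deviations from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance), variance


data = [15, 12, 17, 10, 21, 18, 11, 16]
sigma, variance = std_dev_direct(data)
print(sigma, variance)

# Cross-check with the standard library's population standard deviation.
print(statistics.pstdev(data))
```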

13.2.2 (ii) Short-Cut Method - Deviation taken from assumed mean

This method is used when the arithmetic mean is a fractional value, since taking deviations from a fractional value is tedious. To save calculation time we apply this method; the formula is

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left( \frac{\sum d}{N} \right)^2}$$

where $d$ = deviations from the assumed mean = $(X - A)$

$N$ = number of observations

Steps for calculations:

1. Take the deviations of the items from an assumed mean, i.e. obtain $(X - A)$. Denote these deviations by $d$.

2. Take the total of these deviations, i.e., obtain $\sum d$.

3. Square these deviations and obtain the total $\sum d^2$.

4. Substitute the values of $\sum d^2$, $\sum d$ and $N$ in the above formula.
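A short derivation shows why the short-cut formula agrees with the direct method whatever assumed mean is chosen. The notation is an assumption made for this illustration: $\bar{X}$ denotes the actual mean, $A$ the assumed mean, $d = X - A$ and $\bar{d} = \sum d / N$.

```latex
\begin{aligned}
\bar{X} &= A + \bar{d}, \qquad X - \bar{X} = d - \bar{d} \\
\sum (X - \bar{X})^2 &= \sum d^2 - 2\bar{d}\sum d + N\bar{d}^2 = \sum d^2 - N\bar{d}^2 \\
\sigma^2 &= \frac{\sum (X - \bar{X})^2}{N} = \frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2
\end{aligned}
```

The second line uses $\sum d = N\bar{d}$. The correction term $\left(\sum d / N\right)^2$ vanishes exactly when the assumed mean equals the actual mean.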

Example--Blood serum cholesterol levels of 10 persons are as under:

240, 260, 290, 245, 255, 288, 272, 263, 277, 251

Calculate standard deviation and variance with the help of assumed

mean.

Solution:

Calculation of Standard Deviation

𝑿 𝒅 = (𝑿 − 𝟐𝟔𝟒) 𝒅𝟐

240 -24 576

260 -4 16

290 +26 676

245 -19 361

255 -9 81

288 +24 576

272 +8 64

263 -1 1

277 +13 169

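251 -13 169

$\sum X = 2641$, $\sum d = +1$, $\sum d^2 = 2689$

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left( \frac{\sum d}{N} \right)^2}$$

Now $\sum d^2 = 2689$, $\sum d = +1$, $N = 10$

$$\therefore \sigma = \sqrt{\frac{2689}{10} - \left( \frac{1}{10} \right)^2}$$

$$\therefore \sigma = \sqrt{268.9 - 0.01} = \sqrt{268.89}$$

$$\therefore \sigma = 16.398$$

$$\text{Variance} = \sigma^2 = 268.89$$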
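The short-cut method can be sketched in a few lines. The function name `std_dev_assumed_mean` is hypothetical; the second call uses a different assumed mean (250, chosen arbitrarily for this illustration) to show that the result does not depend on the choice.

```python
import math


def std_dev_assumed_mean(values, assumed_mean):
    """Return (sigma, variance) from deviations d = X - A about an assumed mean A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    variance = sum_d2 / n - (sum_d / n) ** 2
    return math.sqrt(variance), variance


cholesterol = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]
sigma, variance = std_dev_assumed_mean(cholesterol, 264)
print(round(sigma, 3), round(variance, 2))

# A different assumed mean gives the same standard deviation.
sigma_alt, _ = std_dev_assumed_mean(cholesterol, 250)
print(round(sigma_alt, 3))
```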

13.3 Calculation of Standard Deviation – Discrete frequency distribution

For calculating standard deviation in discrete series, any of the following methods

may be applied:

1. Actual Mean Method

2. Assumed Mean Method

3. Step deviation method

13.3.1 Actual Mean Method:

When this method is applied, deviations are taken from the actual mean. The formula applied is

$$\sigma = \sqrt{\frac{\sum f (X - A)^2}{N}}$$

However, in practice this method is rarely used, because if the actual mean is a fraction the calculations take a lot of time.

Steps for calculation:

(1) Compute the mean of the observations.

(2) Compute the deviations from the mean, $d = (X - A)$.

(3) Square the deviations ($d^2$) and multiply these values by the respective frequencies $f$, i.e., $f d^2$.

(4) Sum the products $f d^2$ and apply the formula

$$\sigma = \sqrt{\frac{\sum f d^2}{N}} = \sqrt{\frac{\sum f (X - A)^2}{N}}$$



Example-- Compute standard deviation and variance from the following data

Marks 10 20 30 40 50

Frequency 2 8 10 8 2

Solution:

Construct the table to compute the standard deviation

Marks (X) 𝒇 𝒇𝑿 𝒅 = 𝑿 − 𝟑𝟎 𝒅𝟐 𝒇𝒅𝟐

10 2 20 -20 400 800

20 8 160 -10 100 800

30 10 300 0 0 0

40 8 320 10 100 800

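50 2 100 20 400 800

$\sum f = 30$, $\sum fX = 900$, $\sum f d^2 = 3200$

$$\text{Mean} = A = \frac{\sum fX}{N} = \frac{900}{30} = 30$$

$$\sigma = \sqrt{\frac{\sum f d^2}{N}} = \sqrt{\frac{\sum f (X - A)^2}{N}} = \sqrt{\frac{3200}{30}}$$

$$\sigma = \sqrt{106.67} = 10.328$$

$$\text{Variance} = \sigma^2 = 106.67$$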

13.3.2 Assumed Mean Method:

When this method is used, the following formula is applied:

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2}$$

where $d = (X - A)$.

Steps for calculation:

1. Take any one of the given values as the assumed mean $A$.

2. Compute the deviations from the assumed mean, $d = X - A$.

3. Multiply these deviations by their frequencies to obtain $f d$.

4. Square the deviations ($d^2$) and multiply these values by the respective frequencies $f$, i.e., $f d^2$.

5. Sum the products $f d$ and $f d^2$ and apply the formula above.

Example-- Compute standard deviation and variance from the following data

Marks 10 20 30 40 50

Frequency 2 8 10 8 2

Solution: Construct the table to compute the standard deviation

Marks (X) 𝒇 𝒅 = 𝑿 − 𝟐𝟎 𝒅𝟐 𝒇𝒅 𝒇𝒅𝟐

10 2 -10 100 -20 200

20 8 0 0 0 0

30 10 10 100 100 1000

40 8 20 400 160 3200

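50 2 30 900 60 1800

$\sum f = 30$, $\sum f d = 300$, $\sum f d^2 = 6200$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2}$$

Now, here $\sum f d^2 = 6200$, $\sum f d = 300$, $N = \sum f = 30$

$$\sigma = \sqrt{\frac{6200}{30} - \left( \frac{300}{30} \right)^2}$$

$$\sigma = \sqrt{206.67 - 100} = \sqrt{106.67}$$

$$\sigma = 10.328$$

$$\text{Variance} = \sigma^2 = 106.67$$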
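The two discrete-series examples use the same marks with different reference points (30, the actual mean, and 20, an assumed mean). The sketch below, with the hypothetical helper `sd_assumed_mean`, shows that one formula covers both cases, because the correction term is zero when the assumed mean equals the actual mean.

```python
import math


def sd_assumed_mean(values, freqs, assumed_mean):
    """Standard deviation of a discrete frequency distribution, d = X - A."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for x, f in zip(values, freqs))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for x, f in zip(values, freqs))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)


marks = [10, 20, 30, 40, 50]
freqs = [2, 8, 10, 8, 2]

sd_from_30 = sd_assumed_mean(marks, freqs, 30)   # actual mean: sum of fd is 0
sd_from_20 = sd_assumed_mean(marks, freqs, 20)   # assumed mean: sum of fd is 300
print(round(sd_from_30, 3), round(sd_from_20, 3))
```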

13.3.3 Step deviation Method:

When this method is used we take the deviations of the mid-points from an assumed mean and divide these deviations by the width of the class interval, i.e. $i$. In case the class intervals are unequal, we divide the deviations of the mid-points by a common factor and use $c$ instead of $i$ in the formula for calculating standard deviation.

The formula for calculating standard deviation is:

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2} \times i$$

where $d = \frac{X - A}{i}$ and $i$ = class interval.

Steps for calculation:

1. Take any one of the given values as the assumed mean $A$.

2. Compute the step deviations from the assumed mean, $d = \frac{X - A}{i}$, i.e. take a common factor $i$ and divide all the deviations by it.

3. Multiply these deviations by their frequencies to obtain $f d$.

4. Square the deviations ($d^2$) and multiply these values by the respective frequencies $f$, i.e., $f d^2$.

5. Sum the products $f d$ and $f d^2$ and apply the formula.

Example-- The annual salaries of a group of employees are given in the

following table:

Salaries 45000 50000 55000 60000 65000 70000 75000 80000

Number of persons 3 5 8 7 9 7 4 7

Calculate the standard deviation of the salaries.



Solution:

Calculation of standard deviation

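Salaries ($X$)   No. of persons ($f$)   $d = \frac{X - 60000}{5000}$   $fd$   $fd^2$
45000            3                      -3                             -9     27
50000            5                      -2                             -10    20
55000            8                      -1                             -8     8
60000            7                      0                              0      0
65000            9                      +1                             +9     9
70000            7                      +2                             +14    28
75000            4                      +3                             +12    36
80000            7                      +4                             +28    112

$N = 50$, $\sum f d = 36$, $\sum f d^2 = 240$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2} \times i$$

Now here $\sum f d = 36$, $\sum f d^2 = 240$, $N = 50$, $i = 5000$

$$\therefore \sigma = \sqrt{\frac{240}{50} - \left( \frac{36}{50} \right)^2} \times 5000$$

$$\therefore \sigma = \sqrt{4.8 - 0.5184} \times 5000 = \sqrt{4.2816} \times 5000$$

$$\therefore \sigma = 10346.01$$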
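The step deviation method can be sketched as below. The function name `sd_step_deviation` and its parameter names are hypothetical; `step` plays the role of the common factor $i$.

```python
import math


def sd_step_deviation(values, freqs, assumed_mean, step):
    """Standard deviation using step deviations d = (X - A) / i."""
    n = sum(freqs)
    d = [(x - assumed_mean) / step for x in values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    # Multiply by the step at the end to return to the original units.
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * step


salaries = [45000, 50000, 55000, 60000, 65000, 70000, 75000, 80000]
persons = [3, 5, 8, 7, 9, 7, 4, 7]
sigma = sd_step_deviation(salaries, persons, assumed_mean=60000, step=5000)
print(round(sigma, 2))
```

Dividing by the common factor keeps the intermediate numbers small for hand calculation; in code it makes no difference to the result, which is the same as applying the assumed mean method directly to the salaries.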

13.4 Calculation of Standard Deviation – Continuous Series.

In a continuous series any of the methods discussed above for a discrete frequency distribution can be used. However, in practice it is the step deviation method that is most used.

The formula is

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2} \times i$$

where $d = \frac{m - A}{i}$, $m$ = mid-point of the class and $i$ = class interval.

Steps for calculation:

1. Find the mid-points of the various classes.

2. Take the deviations of these mid-points from an assumed mean.

3. Wherever possible take out a common factor $i$ from these deviations and denote the resulting column by $d$.

4. Multiply the frequencies of each class by these deviations and obtain $\sum f d$.

5. Square the deviations, multiply them by the respective frequencies of each class and obtain $\sum f d^2$.

Thus the only difference in procedure in the case of a continuous series is to find the mid-points of the various classes.

Example--Calculate Standard deviation from the following data:

Age 20-25 25-30 30-35 35-40 40-45 45-50

No. of Persons 170 110 80 45 40 35

Solution:

Take assumed average = 32.5

Calculation of Standard deviation

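Age     Mid point ($m$)   No. of persons ($f$)   $d = \frac{m - 32.5}{5}$   $fd$    $fd^2$
20-25   22.5              170                    -2                         -340    680
25-30   27.5              110                    -1                         -110    110
30-35   32.5              80                     0                          0       0
35-40   37.5              45                     +1                         +45     45
40-45   42.5              40                     +2                         +80     160
45-50   47.5              35                     +3                         +105    315

$N = 480$, $\sum f d = -220$, $\sum f d^2 = 1310$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2} \times i$$

$$\sigma = \sqrt{\frac{1310}{480} - \left( \frac{-220}{480} \right)^2} \times 5$$

$$\sigma = \sqrt{2.729 - 0.210} \times 5$$

$$\sigma = \sqrt{2.519} \times 5$$

$$\sigma = 1.5872 \times 5 = 7.936$$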
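The same calculation for grouped data can be sketched by first deriving the mid-points from the class limits. The variable names are illustrative; the assumed mean 32.5 and class width 5 are taken from the example.

```python
import math

classes = [(20, 25), (25, 30), (30, 35), (35, 40), (40, 45), (45, 50)]
persons = [170, 110, 80, 45, 40, 35]
assumed_mean, width = 32.5, 5

mids = [(lo + hi) / 2 for lo, hi in classes]            # mid-point m of each class
d = [(m - assumed_mean) / width for m in mids]          # step deviations
n = sum(persons)
sum_fd = sum(f * di for f, di in zip(persons, d))
sum_fd2 = sum(f * di ** 2 for f, di in zip(persons, d))
sigma = math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
print(n, sum_fd, sum_fd2, round(sigma, 3))
```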

14.1 Coefficient of Variation:

The standard deviation discussed above is an absolute measure of dispersion. The corresponding relative measure is known as the coefficient of variation.

This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is used in problems where we want to compare the variability of two or more series.

The series for which the coefficient of variation is greater is said to be more variable or, conversely, less consistent, less uniform, less stable.

On the other hand, the series for which the coefficient of variation is smaller is said to be less variable or more consistent, more uniform, more stable.

Coefficient of variation is denoted by C.V. and is obtained as follows:

$$\text{C.V.} = \frac{\sigma}{A} \times 100$$

where $\sigma$ = standard deviation and $A$ = arithmetic mean.



Example--From the prices of shares of 𝑿 and 𝒀 below find out which is more

stable in value:

𝑿 35 54 52 53 56 58 52 50 51 49

𝒀 108 107 105 105 106 107 104 103 104 101

Solution:

In order to find out which shares are more stable, we have to compare coefficient

of variations.

Calculation of Coefficient of Variation

𝑿 𝒙 = (𝑿 − 𝑨) 𝒙𝟐 𝒀 𝒚 = (𝒀 − 𝑨) 𝒚𝟐

35 -16 256 108 +3 9

54 +3 9 107 +2 4

52 +1 1 105 0 0

53 +2 4 105 0 0

56 +5 25 106 +1 1

58 +7 49 107 +2 4

52 +1 1 104 -1 1

50 -1 1 103 -2 4

51 0 0 104 -1 1

49 -2 4 101 -4 16

∑ 𝑿 =510 ∑ 𝒙 =0 ∑ 𝒙𝟐 =350 ∑ 𝒀 =1050 ∑ 𝒚=0 ∑ 𝒚𝟐 =40

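Coefficient of variation of $X$:

$$\text{C.V.} = \frac{\sigma}{A} \times 100$$

Here $A = \frac{\sum X}{N} = \frac{510}{10} = 51$

$$\sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{350}{10}} = 5.916$$

$$\therefore \text{C.V.} = \frac{\sigma}{A} \times 100 = \frac{5.916}{51} \times 100$$

$$\therefore \text{C.V.} = 11.6$$

Coefficient of variation of $Y$:

$$\text{C.V.} = \frac{\sigma}{A} \times 100$$

Here $A = \frac{\sum Y}{N} = \frac{1050}{10} = 105$

$$\sigma = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{40}{10}} = 2$$

$$\therefore \text{C.V.} = \frac{\sigma}{A} \times 100 = \frac{2}{105} \times 100$$

$$\therefore \text{C.V.} = 1.905$$

Since the coefficient of variation is much smaller for shares $Y$, they are more stable in value.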
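The comparison can be reproduced with a short sketch. The function name `coefficient_of_variation` is hypothetical; it uses the population standard deviation (division by $N$), in line with the formulas in this unit.

```python
import math


def coefficient_of_variation(values):
    """C.V. = (sigma / mean) * 100, with sigma computed by dividing by N."""
    n = len(values)
    mean = sum(values) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sigma / mean * 100


x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

cv_x = coefficient_of_variation(x)
cv_y = coefficient_of_variation(y)
print(round(cv_x, 1), round(cv_y, 3))   # 11.6 1.905

more_stable = "Y" if cv_y < cv_x else "X"
print(more_stable)                       # Y
```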

EXERCISE

Q-1 Evaluate the following Questions:

(1). Find the Mean, Median and Mode of the following data.

Cost 10-20 20-30 30-40 40-50 50-60

Items in a group 4 5 3 6 3

(2). Find the Mean, Median and Mode of the following distribution:

Class Interval 0-10 10-20 20-30 30-40 40-50 50-60 60-70 70-80

Frequency 5 9 8 12 28 20 12 11

(3).On his first 5 biology tests, Bob received the following scores: 72, 86, 92, 63, and 77. What test score must Bob earn on his sixth test so that his average (mean score) for all six tests will be 80? Show how you arrived at your answer.

Q-2 Evaluate the following Questions:



(1). Calculate quartile deviation and its coefficient from the following data:

Wages in Rupees per Day   Less than 35   35-37   38-40   41-43   Over 43

Number of Wage earners    14             62      99      18      7

(2). Calculate Mean Deviation for the following data

Size 0-10 10-20 20-30 30-40 40-50 50-60 60-70

Frequency 7 12 18 25 16 14 8

(3). The owner of a restaurant is interested in how much people spend at the restaurant. He examines 10 randomly selected receipts for parties of four and writes down the following data: 44, 50, 38, 96, 42, 47, 40, 39, 46, 50.

Find mean, standard deviation and variance.

15. Reference Book and Website Name:

1. http://www.emathzone.com/tutorials/basic-statistics/collection-of-statistical-data.html

2. http://wizznotes.com/mathematics/statistics/class-limits-boundaries-and-intervals

3. http://tistats.com/definitions/class-width/

4. http://mathespk.blogspot.in/2011/10/frequency-density.html

5. http://www.mathsisfun.com/definitions/relative-frequency.html

6. http://www.mathsisfun.com/definitions/cumulative-frequency.html

7. http://www.mathgoodies.com/lessons/vol8/mode.html

8. Statistical Methods by S.P.Gupta.
