Chapter 13 Statistics

13.1 Introduction

In Class IX, you have studied the classification of given data into ungrouped as well as grouped frequency distributions. You have also learnt to represent the data pictorially in the form of various graphs such as bar graphs, histograms (including those of varying widths) and frequency polygons. In fact, you went a step further by studying certain numerical representatives of the ungrouped data, also called measures of central tendency, namely, mean, median and mode. In this chapter, we shall extend the study of these three measures, i.e., mean, median and mode from ungrouped data to that of grouped data. We shall also discuss the concept of cumulative frequency, the cumulative frequency distribution and how to draw cumulative frequency curves, called ogives.

13.2 Mean of Grouped Data

The mean (or average) of observations, as we know, is the sum of the values of all the observations divided by the total number of observations. From Class IX, recall that if $x_{1}, x_{2}, \ldots, x_{n}$ are observations with respective frequencies $f_{1}, f_{2}, \ldots, f_{n}$, then this means observation $x_{1}$ occurs $f_{1}$ times, $x_{2}$ occurs $f_{2}$ times, and so on.

Now, the sum of the values of all the observations $=f_{1} x_{1}+f_{2} x_{2}+\ldots+f_{n} x_{n}$, and the number of observations $=f_{1}+f_{2}+\ldots+f_{n}$.

So, the mean $\bar{x}$ of the data is given by

$$ \bar{x}=\dfrac{f_{1} x_{1}+f_{2} x_{2}+\cdots+f_{n} x_{n}}{f_{1}+f_{2}+\cdots+f_{n}} $$

Recall that we can write this in short form by using the Greek letter $\Sigma$ (capital sigma) which means summation. That is,

$$ \bar{x}=\dfrac{\sum_{i=1}^{n} f_{i} x_{i}}{\sum_{i=1}^{n} f_{i}} $$

which, more briefly, is written as $\bar{x}=\dfrac{\sum f_{i} x_{i}}{\Sigma f_{i}}$, if it is understood that $i$ varies from 1 to $n$.

Let us apply this formula to find the mean in the following example.

Example 1 : The marks obtained by 30 students of Class $\mathrm{X}$ of a certain school in a Mathematics paper consisting of 100 marks are presented in the table below. Find the mean of the marks obtained by the students.

Marks obtained $\left(\boldsymbol{x}_{\boldsymbol{i}}\right)$ 10 20 36 40 50 56 60 70 72 80 88 92 95
Number of students $\left(\boldsymbol{f} _{\boldsymbol{i}}\right)$ 1 1 3 4 3 2 4 4 1 1 2 3 1

Solution: Recall that to find the mean marks, we require the product of each $x_{i}$ with the corresponding frequency $f_{i}$. So, let us put them in a column as shown in Table 13.1.

Table 13.1

Marks obtained $\left(\boldsymbol{x_i}\right)$ Number of students $\left(\boldsymbol{f_i}\right)$ $\boldsymbol{f_i} \boldsymbol{x_i}$
10 1 10
20 1 20
36 3 108
40 4 160
50 3 150
56 2 112
60 4 240
70 4 280
72 1 72
80 1 80
88 2 176
92 3 276
95 1 95
Total $\Sigma f_{i}=30$ $\Sigma f_{i} x_{i}=1779$

Now, $$ \bar{x}=\dfrac{\sum f_{i} x_{i}}{\sum f_{i}}=\dfrac{1779}{30}=59.3 $$

Therefore, the mean marks obtained is 59.3.
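The calculation in Table 13.1 can also be checked with a short program. The following is a minimal sketch in Python; the function name `frequency_mean` and the variable names are illustrative choices, not part of the text.

```python
def frequency_mean(values, freqs):
    """Mean of observations x_i that occur with frequencies f_i."""
    total_fx = sum(f * x for x, f in zip(values, freqs))  # sum of f_i * x_i
    total_f = sum(freqs)                                   # sum of f_i
    return total_fx / total_f

# Data of Example 1 (Table 13.1)
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

mean_marks = frequency_mean(marks, students)
print(round(mean_marks, 1))  # 59.3
```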

In most of our real life situations, data is usually so large that to make a meaningful study it needs to be condensed as grouped data. So, we need to convert given ungrouped data into grouped data and devise some method to find its mean.

Let us convert the ungrouped data of Example 1 into grouped data by forming class-intervals of width, say, 15. Remember that, while allocating frequencies to each class-interval, students falling in any upper class-limit would be considered in the next class, e.g., the 4 students who have obtained 40 marks would be considered in the class-interval 40-55 and not in 25-40. With this convention in mind, let us form a grouped frequency distribution table (see Table 13.2).

Table 13.2

Class interval $10-25$ $25-40$ $40-55$ $55-70$ $70-85$ $85-100$
Number of students 2 3 7 6 6 6
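The allocation of frequencies to class-intervals can be sketched in Python as follows. The function `group_counts` and its parameters are hypothetical names introduced only for this illustration. The sketch assumes equal class widths and that no observation equals the upper limit of the last class.

```python
def group_counts(values, freqs, start, width, n_classes):
    """Count observations per class [lower, upper).

    An observation equal to an upper class-limit falls in the next class.
    Assumes no observation equals the upper limit of the last class.
    """
    counts = [0] * n_classes
    for x, f in zip(values, freqs):
        index = int((x - start) // width)  # position of the class containing x
        counts[index] += f
    return counts

# Data of Example 1
marks = [10, 20, 36, 40, 50, 56, 60, 70, 72, 80, 88, 92, 95]
students = [1, 1, 3, 4, 3, 2, 4, 4, 1, 1, 2, 3, 1]

class_counts = group_counts(marks, students, start=10, width=15, n_classes=6)
print(class_counts)  # [2, 3, 7, 6, 6, 6]
```

Note that the 4 students with 40 marks are counted in the third class (40-55), in line with the convention stated above.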

Now, for each class-interval, we require a point which would serve as the representative of the whole class. It is assumed that the frequency of each class-interval is centred around its mid-point. So the mid-point (or class mark) of each class can be chosen to represent the observations falling in the class. Recall that we find the mid-point of a class (or its class mark) by finding the average of its upper and lower limits. That is,

$$ \text { Class } \text { mark }=\dfrac{\text { Upper class limit }+ \text { Lower class limit }}{2} $$

With reference to Table 13.2, for the class $10-25$, the class mark is $\dfrac{10+25}{2}$, i.e., 17.5. Similarly, we can find the class marks of the remaining class intervals. We put them in Table 13.3. These class marks serve as our $x_{i}$ ’s. Now, in general, for the $i$ th class interval, we have the frequency $f_{i}$ corresponding to the class mark $x_{i}$. We can now proceed to compute the mean in the same manner as in Example 1.

Table 13.3

Class interval Number of students $\left(\boldsymbol{f}_{\boldsymbol{i}}\right)$ Class mark $\left(\boldsymbol{x}_{\boldsymbol{i}}\right)$ $\boldsymbol{f}_{\boldsymbol{i}} \boldsymbol{x_i}$
$10-25$ 2 17.5 35.0
$25-40$ 3 32.5 97.5
$40-55$ 7 47.5 332.5
$55-70$ 6 62.5 375.0
$70-85$ 6 77.5 465.0
$85-100$ 6 92.5 555.0
Total $\sum f_{i}=30$ $\sum f_{i} x_{i}=1860.0$

The sum of the values in the last column gives us $\Sigma f_{i} x_{i}$. So, the mean $\bar{x}$ of the given data is given by

$$ \bar{x}=\dfrac{\Sigma f_{i} x_{i}}{\Sigma f_{i}}=\dfrac{1860.0}{30}=62 $$

This new method of finding the mean is known as the Direct Method.
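As a check on Table 13.3, the direct method can be sketched in Python. The helper names `class_marks` and `direct_mean` are assumptions made for this illustration.

```python
def class_marks(intervals):
    """Class mark = (upper class limit + lower class limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in intervals]

def direct_mean(intervals, freqs):
    """Direct method: sum of f_i * x_i divided by sum of f_i."""
    marks = class_marks(intervals)
    return sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)

# Data of Table 13.2
intervals = [(10, 25), (25, 40), (40, 55), (55, 70), (70, 85), (85, 100)]
freqs = [2, 3, 7, 6, 6, 6]

grouped_mean = direct_mean(intervals, freqs)
print(class_marks(intervals))  # [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
print(grouped_mean)            # 62.0
```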

We observe that Tables 13.1 and 13.3 are using the same data and employing the same formula for the calculation of the mean but the results obtained are different. Can you think why this is so, and which one is more accurate? The difference in the two values is because of the mid-point assumption in Table 13.3, 59.3 being the exact mean, while 62 an approximate mean.

Sometimes when the numerical values of $x_{i}$ and $f_{i}$ are large, finding the product of $x_{i}$ and $f_{i}$ becomes tedious and time consuming. So, for such situations, let us think of a method of reducing these calculations.

We can do nothing with the $f_{i}$’s, but we can change each $x_{i}$ to a smaller number so that our calculations become easy. How do we do this? What about subtracting a fixed number from each of these $x_{i}$’s? Let us try this method.

The first step is to choose one among the $x_{i}$’s as the assumed mean, and denote it by ‘$a$’. Also, to further reduce our calculation work, we may take ‘$a$’ to be that $x_{i}$ which lies in the centre of $x_{1}, x_{2}, \ldots, x_{n}$. So, we can choose $a=47.5$ or $a=62.5$. Let us choose $a=47.5$.

The next step is to find the difference $d_{i}$ between each of the $x_{i}$’s and $a$, that is, the deviation of each $x_{i}$ from ‘$a$’.

i.e., $$ d_{i}=x_{i}-a=x_{i}-47.5 $$

The third step is to find the product of $d_{i}$ with the corresponding $f_{i}$, and take the sum of all the $f_{i} d_{i}$ ’s. The calculations are shown in Table 13.4.

Table 13.4

Class interval Number of students $\left(\boldsymbol{f}_{\boldsymbol{i}}\right)$ Class mark $\left(\boldsymbol{x}_{\boldsymbol{i}}\right)$ $\boldsymbol{d_i}=\boldsymbol{x}_{\boldsymbol{i}}-\mathbf{4 7 . 5}$ $\boldsymbol{f}_{\boldsymbol{i}} \boldsymbol{d_i}$
$10-25$ 2 17.5 -30 -60
$25-40$ 3 32.5 -15 -45
$40-55$ 7 47.5 0 0
$55-70$ 6 62.5 15 90
$70-85$ 6 77.5 30 180
$85-100$ 6 92.5 45 270
Total $\Sigma f_{i}=30$ $\Sigma f_{i} d_{i}=435$

So, from Table 13.4, the mean of the deviations, $\bar{d}=\dfrac{\Sigma f_{i} d_{i}}{\Sigma f_{i}}$.

Now, let us find the relation between $\bar{d}$ and $\bar{x}$.

Since in obtaining $d_{i}$, we subtracted ’ $a$ ’ from each $x_{i}$, so, in order to get the mean $\bar{x}$, we need to add ’ $a$ ’ to $\bar{d}$. This can be explained mathematically as:

$$ \begin{aligned} \text { Mean of deviations, } \quad\quad\quad\quad \bar{d} & =\dfrac{\Sigma f_{i} d_{i}}{\Sigma f_{i}} \\ \text { So, } \quad\quad\quad\quad \bar{d} & =\dfrac{\Sigma f_{i}\left(x_{i}-a\right)}{\Sigma f_{i}} \\ & =\dfrac{\Sigma f_{i} x_{i}}{\Sigma f_{i}}-\dfrac{\Sigma f_{i} a}{\Sigma f_{i}} \\ & =\bar{x}-a \dfrac{\Sigma f_{i}}{\Sigma f_{i}} \\ & =\bar{x}-a \\ \text { So, } \quad\quad\quad\quad \bar{x} & =a+\bar{d} \\ \text { i.e., } \quad\quad\quad\quad\bar{x} & =a+\dfrac{\Sigma f_{i} d_{i}}{\Sigma f_{i}} \end{aligned} $$

Substituting the values of $a, \Sigma f_{i} d_{i}$ and $\Sigma f_{i}$ from Table 13.4, we get

$$ \bar{x}=47.5+\dfrac{435}{30}=47.5+14.5=62 . $$

Therefore, the mean of the marks obtained by the students is 62 .

The method discussed above is called the Assumed Mean Method.
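The three steps of the assumed mean method can be sketched in Python as below. The function name `assumed_mean_method` is an illustrative choice; the data are those of Table 13.4.

```python
def assumed_mean_method(class_marks, freqs, a):
    """Return a + (sum of f_i * d_i) / (sum of f_i), where d_i = x_i - a."""
    deviations = [x - a for x in class_marks]                  # d_i
    total_fd = sum(f * d for f, d in zip(freqs, deviations))   # sum of f_i * d_i
    return a + total_fd / sum(freqs)

class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

result = assumed_mean_method(class_marks, freqs, a=47.5)
print(result)  # 62.0
```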

Activity 1 : From Table 13.3, find the mean by taking each of the $x_{i}$ (i.e., 17.5, 32.5, and so on) as ‘$a$’. What do you observe? You will find that the mean determined in each case is the same, i.e., 62. (Why?)

So, we can say that the value of the mean obtained does not depend on the choice of ‘$a$’.
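This observation can be tested numerically. The sketch below, with illustrative variable names, repeats the assumed mean calculation with every class mark of Table 13.3 taken in turn as $a$.

```python
class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]
n = sum(freqs)

means = []
for a in class_marks:
    # Mean of the deviations d_i = x_i - a for this choice of a
    d_bar = sum(f * (x - a) for f, x in zip(freqs, class_marks)) / n
    means.append(a + d_bar)

print(means)  # [62.0, 62.0, 62.0, 62.0, 62.0, 62.0]
```

Each choice of $a$ changes $\bar{d}$ by exactly the amount that $a$ itself changes in the opposite direction, so the sum $a+\bar{d}$ stays the same.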

Observe that in Table 13.4, the values in Column 4 are all multiples of 15. So, if we divide the values in the entire Column 4 by 15, we would get smaller numbers to multiply with $f_{i}$. (Here, 15 is the class size of each class interval.)

So, let $u_{i}=\dfrac{x_{i}-a}{h}$, where $a$ is the assumed mean and $h$ is the class size.

Now, we calculate $u_{i}$ in this way and continue as before (i.e., find $f_{i} u_{i}$ and then $\Sigma f_{i} u_{i}$). Taking $h=15$, let us form Table 13.5.

Table 13.5

Class interval $\boldsymbol{f}_{\boldsymbol{i}}$ $\boldsymbol{x}_{\boldsymbol{i}}$ $\boldsymbol{d_i}=\boldsymbol{x}_{\boldsymbol{i}}-\boldsymbol{a}$ $\boldsymbol{u_i}=\dfrac{\boldsymbol{x}_{\boldsymbol{i}}-\boldsymbol{a}}{\boldsymbol{h}}$ $\boldsymbol{f}_{\boldsymbol{i}} \boldsymbol{u_i}$
$10-25$ 2 17.5 -30 -2 -4
$25-40$ 3 32.5 -15 -1 -3
$40-55$ 7 47.5 0 0 0
$55-70$ 6 62.5 15 1 6
$70-85$ 6 77.5 30 2 12
$85-100$ 6 92.5 45 3 18
Total $\Sigma f_{i}=30$ $\Sigma f_{i} u_{i}=29$

Let $$ \bar{u}=\dfrac{\Sigma f_{i} u_{i}}{\Sigma f_{i}} $$

Here, again let us find the relation between $\bar{u}$ and $\bar{x}$.

We have, $$ u_{i}=\dfrac{x_{i}-a}{h} $$

Therefore, $$ \begin{aligned} \bar{u} & =\dfrac{\Sigma f_{i} \dfrac{\left(x_{i}-a\right)}{h}}{\Sigma f_{i}}=\dfrac{1}{h}\left[\dfrac{\Sigma f_{i} x_{i}-a \Sigma f_{i}}{\Sigma f_{i}}\right] \\ & =\dfrac{1}{h}\left[\dfrac{\Sigma f_{i} x_{i}}{\Sigma f_{i}}-a \dfrac{\Sigma f_{i}}{\Sigma f_{i}}\right] \\ & =\dfrac{1}{h}[\bar{x}-a] \end{aligned} $$

So, $$ \begin{aligned} h \bar{u} & =\bar{x}-a \\ \end{aligned} $$

i.e., $$\bar{x} =a+h \bar{u}$$

So, $$ \bar{x}=a+h\left(\dfrac{\Sigma f_{i} u_{i}}{\Sigma f_{i}}\right) $$

Now, substituting the values of $a, h, \Sigma f_{i} u_{i}$ and $\Sigma f_{i}$ from Table 13.5, we get

$$ \begin{aligned} \bar{x} & =47.5+15 \times\left(\dfrac{29}{30}\right) \\ & =47.5+14.5=62 \end{aligned} $$

So, the mean of the marks obtained by the students is 62.

The method discussed above is called the Step-deviation method.
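A minimal Python sketch of the step-deviation method follows, using the data of Table 13.5. The function name `step_deviation_mean` is an assumption made for this illustration.

```python
def step_deviation_mean(class_marks, freqs, a, h):
    """Return a + h * (sum of f_i * u_i) / (sum of f_i), where u_i = (x_i - a) / h."""
    u_values = [(x - a) / h for x in class_marks]              # u_i
    total_fu = sum(f * u for f, u in zip(freqs, u_values))     # sum of f_i * u_i
    return a + h * total_fu / sum(freqs)

class_marks = [17.5, 32.5, 47.5, 62.5, 77.5, 92.5]
freqs = [2, 3, 7, 6, 6, 6]

total_fu = sum(f * (x - 47.5) / 15 for f, x in zip(freqs, class_marks))
result = step_deviation_mean(class_marks, freqs, a=47.5, h=15)
print(total_fu)          # 29.0
print(round(result, 2))  # 62.0
```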

We note that :

  • The step-deviation method will be convenient to apply if all the $d_{i}$’s have a common factor.
  • The mean obtained by all the three methods is the same.
  • The assumed mean method and step-deviation method are just simplified forms of the direct method.
  • The formula $\bar{x}=a+h \bar{u}$ still holds if $a$ and $h$ are not as given above, but are any non-zero numbers such that $u_{i}=\dfrac{x_{i}-a}{h}$.

Let us apply these methods in another example.

Example 2 : The table below gives the percentage distribution of female teachers in the primary schools of rural areas of various states and union territories (U.T.) of India. Find the mean percentage of female teachers by all the three methods discussed in this section.

Percentage of female teachers $15-25$ $25-35$ $35-45$ $45-55$ $55-65$ $65-75$ $75-85$
Number of States/U.T. 6 11 7 4 4 2 1

Source : Seventh All India School Education Survey conducted by NCERT

Solution : Let us find the class marks, $x_{i}$, of each class, and put them in a column (see Table 13.6):

Table 13.6

Percentage of female teachers Number of States $/$ U.T. $\left(\boldsymbol{f}_{\boldsymbol{i}}\right)$ $\boldsymbol{x}_{\boldsymbol{i}}$
$15-25$ 6 20
$25-35$ 11 30
$35-45$ 7 40
$45-55$ 4 50
$55-65$ 4 60
$65-75$ 2 70
$75-85$ 1 80

Here we take $a=50, h=10$, then $d_{i}=x_{i}-50$ and $u_{i}=\dfrac{x_{i}-50}{10}$.

We now find $d_{i}$ and $u_{i}$ and put them in Table 13.7.

Table 13.7

Percentage of female teachers Number of states/U.T. $\left(\boldsymbol{f}_{\boldsymbol{i}}\right)$ $\boldsymbol{x}_{\boldsymbol{i}}$ $\boldsymbol{d_i}=\boldsymbol{x}_{\boldsymbol{i}}-\mathbf{5 0}$ $\boldsymbol{u}_{\boldsymbol{i}}=\dfrac{\boldsymbol{x_i}-\mathbf{5 0}}{\mathbf{1 0}}$ $\boldsymbol{f_i} \boldsymbol{x_i}$ $\boldsymbol{f_i} \boldsymbol{d_i}$ $\boldsymbol{f_i} \boldsymbol{u_i}$
$15-25$ 6 20 -30 -3 120 -180 -18
$25-35$ 11 30 -20 -2 330 -220 -22
$35-45$ 7 40 -10 -1 280 -70 -7
$45-55$ 4 50 0 0 200 0 0
$55-65$ 4 60 10 1 240 40 4
$65-75$ 2 70 20 2 140 40 4
$75-85$ 1 80 30 3 80 30 3
Total $\mathbf{3 5}$ $\mathbf{1 3 9 0}$ $\mathbf{- 3 6 0}$ $\mathbf{- 3 6}$

From the table above, we obtain $\Sigma f_{i}=35, \quad \Sigma f_{i} x_{i}=1390$,

$$ \Sigma f_{i} d_{i}=-360, \quad \Sigma f_{i} u_{i}=-36 $$

Using the direct method, $\bar{x}=\dfrac{\Sigma f_{i} x_{i}}{\Sigma f_{i}}=\dfrac{1390}{35}=39.71$

Using the assumed mean method,

$$ \bar{x}=a+\dfrac{\Sigma f_{i} d_{i}}{\Sigma f_{i}}=50+\dfrac{(-360)}{35}=39.71 $$

Using the step-deviation method,

$$ \bar{x}=a+\left(\dfrac{\Sigma f_{i} u_{i}}{\Sigma f_{i}}\right) \times h=50+\left(\dfrac{-36}{35}\right) \times 10=39.71 $$

Therefore, the mean percentage of female teachers in the primary schools of rural areas is 39.71 .
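The agreement of the three methods in Example 2 can be checked with a few lines of Python. This is only a sketch with illustrative variable names; the data are those of Table 13.7.

```python
class_marks = [20, 30, 40, 50, 60, 70, 80]
freqs = [6, 11, 7, 4, 4, 2, 1]
a, h = 50, 10
n = sum(freqs)

# Direct method: sum of f_i * x_i divided by sum of f_i
direct = sum(f * x for f, x in zip(freqs, class_marks)) / n
# Assumed mean method: a + (sum of f_i * d_i) / (sum of f_i)
assumed = a + sum(f * (x - a) for f, x in zip(freqs, class_marks)) / n
# Step-deviation method: a + h * (sum of f_i * u_i) / (sum of f_i)
step = a + h * sum(f * (x - a) / h for f, x in zip(freqs, class_marks)) / n

print(round(direct, 2), round(assumed, 2), round(step, 2))  # 39.71 39.71 39.71
```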

Remark : The result obtained by all the three methods is the same. So the choice of method to be used depends on the numerical values of $x_{i}$ and $f_{i}$. If $x_{i}$ and $f_{i}$ are sufficiently small, then the direct method is an appropriate choice. If $x_{i}$ and $f_{i}$ are numerically large numbers, then we can go for the assumed mean method or step-deviation method. If the class sizes are unequal, and $x_{i}$ are large numerically, we can still apply the step-deviation method by taking $h$ to be a suitable divisor of all the $d_{i}$ ’s.

Example 3 : The distribution below shows the number of wickets taken by bowlers in one-day cricket matches. Find the mean number of wickets by choosing a suitable method. What does the mean signify?

Number of wickets $20-60$ $60-100$ $100-150$ $150-250$ $250-350$ $350-450$
Number of bowlers 7 5 16 12 2 3

Solution : Here, the class size varies, and the $x_{i}$’s are large. Let us still apply the step-deviation method with $a=200$ and $h=20$. Then, we obtain the data as in Table 13.8.

Table 13.8

Number of wickets taken Number of bowlers $\left(\boldsymbol{f}_{\boldsymbol{i}}\right)$ $\boldsymbol{x}_{\boldsymbol{i}}$ $\boldsymbol{d}_{\boldsymbol{i}}=\boldsymbol{x_i}-\mathbf{2 0 0}$ $\boldsymbol{u}_{\boldsymbol{i}}=\dfrac{\boldsymbol{d_i}}{\mathbf{2 0}}$ $\boldsymbol{u}_{\boldsymbol{i}} \boldsymbol{f_i}$
$20-60$ 7 40 -160 -8 -56
$60-100$ 5 80 -120 -6 -30
$100-150$ 16 125 -75 -3.75 -60
$150-250$ 12 200 0 0 0
$250-350$ 2 300 100 5 10
$350-450$ 3 400 200 10 30
Total $\mathbf{4 5}$ $\mathbf{- 1 0 6}$

So, $\bar{u}=\dfrac{-106}{45}$. Therefore, $\bar{x}=200+20\left(\dfrac{-106}{45}\right)=200-47.11=152.89$.

This tells us that, on an average, the number of wickets taken by these 45 bowlers in one-day cricket is 152.89 .
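Since the class sizes in Example 3 are unequal, $h$ is simply a convenient divisor rather than a class size. The sketch below illustrates that the result does not depend on this choice. The function name and the extra divisors 5 and 25 are assumptions added for illustration; only $h=20$ comes from the example.

```python
def step_deviation_mean(class_marks, freqs, a, h):
    """Return a + h * (sum of f_i * u_i) / (sum of f_i), where u_i = (x_i - a) / h."""
    u_values = [(x - a) / h for x in class_marks]
    u_bar = sum(f * u for f, u in zip(freqs, u_values)) / sum(freqs)
    return a + h * u_bar

intervals = [(20, 60), (60, 100), (100, 150), (150, 250), (250, 350), (350, 450)]
bowlers = [7, 5, 16, 12, 2, 3]
class_marks = [(lower + upper) / 2 for lower, upper in intervals]

# h = 20 follows Example 3; 5 and 25 are extra illustrative divisors.
means = [step_deviation_mean(class_marks, bowlers, a=200, h=h) for h in (20, 5, 25)]
print([round(m, 2) for m in means])  # [152.89, 152.89, 152.89]
```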

Now, let us see how well you can apply the concepts discussed in this section!

Activity 2 :

Divide the students of your class into three groups and ask each group to do one of the following activities.

1. Collect the marks obtained by all the students of your class in Mathematics in the latest examination conducted by your school. Form a grouped frequency distribution of the data obtained.

2. Collect the daily maximum temperatures recorded for a period of 30 days in your city. Present this data as a grouped frequency table.

3. Measure the heights of all the students of your class (in cm) and form a grouped frequency distribution table of this data.

After all the groups have collected the data and formed grouped frequency distribution tables, the groups should find the mean in each case by the method which they find appropriate.

EXERCISE 13.1

1. A survey was conducted by a group of students as a part of their environment awareness programme, in which they collected the following data regarding the number of plants in 20 houses in a locality. Find the mean number of plants per house.

Number of plants $0-2$ $2-4$ $4-6$ $6-8$ $8-10$ $10-12$ $12-14$
Number of houses 1 2 1 5 6 2 3

Which method did you use for finding the mean, and why?


Solution

To find the class mark $(x_i)$ for each interval, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

$x_i$ and $f_i x_i$ can be calculated as follows.

Number of plants Number of houses $(\boldsymbol{f_i})$ $\boldsymbol{x_i}$ $\boldsymbol{f_i} \boldsymbol{x_i}$
$0-2$ 1 1 $1 \times 1=1$
$2-4$ 2 3 $2 \times 3=6$
$4-6$ 1 5 $1 \times 5=5$
$6-8$ 5 7 $5 \times 7=35$
$8-10$ 6 9 $6 \times 9=54$
$10-12$ 2 11 $2 \times 11=22$
$12-14$ 3 13 $3 \times 13=39$
Total 20 162

From the table, it can be observed that $\sum f_i=20$

$\sum f_i x_i=162$

Mean,

$\bar{{}x}=\dfrac{\sum f_i x_i}{\sum f_i}$

$=\dfrac{162}{20}=8.1$

Therefore, mean number of plants per house is 8.1.

Here, direct method has been used as the values of class marks $(x_i)$ and $f_i$ are small.

2. Consider the following distribution of daily wages of 50 workers of a factory.

Daily wages (in ₹) $500-520$ $520-540$ $540-560$ $560-580$ $580-600$
Number of workers 12 14 8 6 10

Find the mean daily wages of the workers of the factory by using an appropriate method.


Solution

To find the class mark for each interval, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Class size $(h)$ of this data $=20$

Taking 550 as assumed mean (a), $d_i, u_i$, and $f_i u_i$ can be calculated as follows.

Daily wages (in Rs) Number of workers $(\boldsymbol{f_i})$ $\boldsymbol{x_i}$ $\boldsymbol{d_i}=\boldsymbol{x_i}-\mathbf{550}$ $\boldsymbol{u_i}=\dfrac{\boldsymbol{d_i}}{\mathbf{20}}$ $\boldsymbol{f_i} \boldsymbol{u_i}$
$500-520$ 12 510 -40 -2 -24
$520-540$ 14 530 -20 -1 -14
$540-560$ 8 550 0 0 0
$560-580$ 6 570 20 1 6
$580-600$ 10 590 40 2 20
Total 50 -12

From the table, it can be observed that

$ \begin{aligned} \sum f_i & =50 \\ \sum f_i u_i & =-12 \\ \text{ Mean } \bar{x} & =a+\left(\dfrac{\sum f_i u_i}{\sum f_i}\right) h \\ & =550+\left(\dfrac{-12}{50}\right) 20 \\ & =550-\dfrac{24}{5} \\ & =550-4.8 \\ & =545.2 \end{aligned} $

Therefore, the mean daily wage of the workers of the factory is Rs 545.20.

3. The following distribution shows the daily pocket allowance of children of a locality. The mean pocket allowance is Rs 18. Find the missing frequency $f$.

Daily pocket allowance (in ₹) $11-13$ $13-15$ $15-17$ $17-19$ $19-21$ $21-23$ $23-25$
Number of children 7 6 9 13 $f$ 5 4

Solution

To find the class mark $(x_i)$ for each interval, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Given that, mean pocket allowance, $\bar{{}x}=$ Rs 18

Taking 18 as assumed mean (a), $d_i$ and $f_i d_i$ are calculated as follows.

Daily pocket allowance (in Rs) Number of children $(\boldsymbol{f_i})$ Class mark $\boldsymbol{x_i}$ $\boldsymbol{d_i}=\boldsymbol{x_i}-\mathbf{18}$ $\boldsymbol{f_i} \boldsymbol{d_i}$
$11-13$ 7 12 -6 -42
$13-15$ 6 14 -4 -24
$15-17$ 9 16 -2 -18
$17-19$ 13 18 0 0
$19-21$ $f$ 20 2 $2 f$
$21-23$ 5 22 4 20
$23-25$ 4 24 6 24
Total $\sum f_i=44+f$ $2 f-40$

From the table, we obtain

$\sum f_i=44+f$

$\sum f_i d_i=2 f-40$

$\bar{{}x}=a+\dfrac{\sum f_i d_i}{\sum f_i}$

$18=18+(\dfrac{2 f-40}{44+f})$

$0=(\dfrac{2 f-40}{44+f})$

$2 f-40=0$

$2 f=40$

$f=20$

Hence, the missing frequency, $f$, is 20 .
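The same answer can be reached by rearranging the direct formula: if the known classes contribute $\sum f_i$ and $\sum f_i x_i$, and the unknown class has class mark $x_k$, then $\bar{x}(\sum f_i+f)=\sum f_i x_i+f x_k$. A minimal Python sketch of this rearrangement is given below. The function name `missing_frequency` and the use of `None` to mark the unknown entry are assumptions made for this illustration, and the sketch assumes the given mean differs from $x_k$.

```python
def missing_frequency(class_marks, freqs, target_mean):
    """Solve for the single unknown frequency, marked as None in freqs."""
    k = freqs.index(None)                      # position of the unknown class
    known = [(f, x) for f, x in zip(freqs, class_marks) if f is not None]
    known_f = sum(f for f, _ in known)         # sum of the known f_i
    known_fx = sum(f * x for f, x in known)    # sum of the known f_i * x_i
    # target_mean * (known_f + f) = known_fx + f * x_k, solved for f
    return (known_fx - target_mean * known_f) / (target_mean - class_marks[k])

class_marks = [12, 14, 16, 18, 20, 22, 24]
freqs = [7, 6, 9, 13, None, 5, 4]

f = missing_frequency(class_marks, freqs, target_mean=18)
print(f)  # 20.0
```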

4. Thirty women were examined in a hospital by a doctor and the number of heartbeats per minute were recorded and summarised as follows. Find the mean heartbeats per minute for these women, choosing a suitable method.

Number of heartbeats per minute $65-68$ $68-71$ $71-74$ $74-77$ $77-80$ $80-83$ $83-86$
Number of women 2 4 3 8 7 4 2

Solution

To find the class mark of each interval $(x_i)$, the following relation is used.

$ x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2} $

Class size, $h$, of this data $=3$

Taking 75.5 as assumed mean (a), $d_i$, $u_i$ and $f_i u_i$ are calculated as follows.

Number of heartbeats per minute Number of women $(\boldsymbol{f_i})$ $\boldsymbol{x_i}$ $\boldsymbol{d_i}=\boldsymbol{x_i}-\mathbf{75.5}$ $\boldsymbol{u_i}=\dfrac{\boldsymbol{d_i}}{\mathbf{3}}$ $\boldsymbol{f_i} \boldsymbol{u_i}$
$65-68$ 2 66.5 -9 -3 -6
$68-71$ 4 69.5 -6 -2 -8
$71-74$ 3 72.5 -3 -1 -3
$74-77$ 8 75.5 0 0 0
$77-80$ 7 78.5 3 1 7
$80-83$ 4 81.5 6 2 8
$83-86$ 2 84.5 9 3 6
Total 30 4

From the table, we obtain

$\sum f_i=30$

$\sum f_i u_i=4$

Mean $\bar{{}x}=a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h$

$ \begin{aligned} & =75.5+(\dfrac{4}{30}) \times 3 \\ & =75.5+0.4=75.9 \end{aligned} $

Therefore, the mean number of heartbeats per minute for these women is 75.9.

5. In a retail market, fruit vendors were selling mangoes kept in packing boxes. These boxes contained varying number of mangoes. The following was the distribution of mangoes according to the number of boxes.

Number of mangoes $50-52$ $53-55$ $56-58$ $59-61$ $62-64$
Number of boxes 15 110 135 115 25

Find the mean number of mangoes kept in a packing box. Which method of finding the mean did you choose?


Solution

It can be observed that the class intervals are not continuous: there is a gap of 1 between the upper limit of one class and the lower limit of the next. Therefore, $\dfrac{1}{2}$ has to be added to the upper class limit and $\dfrac{1}{2}$ has to be subtracted from the lower class limit of each interval.
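This adjustment can be sketched in Python as follows. The function name `make_continuous` is hypothetical, and the sketch assumes the gap between consecutive classes is the same throughout.

```python
def make_continuous(intervals):
    """Widen each class by half the gap between consecutive classes."""
    gap = intervals[1][0] - intervals[0][1]   # e.g. 53 - 52 = 1
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in intervals]

inclusive = [(50, 52), (53, 55), (56, 58), (59, 61), (62, 64)]
continuous = make_continuous(inclusive)
marks = [(lower + upper) / 2 for lower, upper in continuous]

print(continuous)  # [(49.5, 52.5), (52.5, 55.5), (55.5, 58.5), (58.5, 61.5), (61.5, 64.5)]
print(marks)       # [51.0, 54.0, 57.0, 60.0, 63.0]
```

The class marks are the same as those of the original intervals, while the class size becomes 3.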

Class mark $(x_i)$ can be obtained by using the following relation.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Class size $(h)$ of this data $=3$

Taking 57 as assumed mean (a), $d_i, u_i, f_i u_i$ are calculated as follows.

Class interval $\boldsymbol{{}f}_i$ $\boldsymbol{{}x} _{\boldsymbol{{}i}}$ $\boldsymbol{{}d} _{\boldsymbol{{}i}}=\boldsymbol{{}x} _{\boldsymbol{{}i}}-\mathbf{5 7}$ $\boldsymbol{{}u} _{\boldsymbol{{}i}}=\dfrac{\boldsymbol{{}d} _{\boldsymbol{{}i}}}{\mathbf{3}}$ $\boldsymbol{{}f} _{\boldsymbol{{}i}} \boldsymbol{{}u} _{\boldsymbol{{}i}}$
$49.5-52.5$ 15 51 -6 -2 -30
$52.5-55.5$ 110 54 -3 -1 -110
$55.5-58.5$ 135 57 0 0 0
$58.5-61.5$ 115 60 3 1 115
$61.5-64.5$ 25 63 6 2 50
Total 400 25

It can be observed that

$\sum f_i=400$

$\sum f_i u_i=25$

$ \begin{aligned} \text{ Mean, } \bar{{}x} & =a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h \\ & =57+(\dfrac{25}{400}) \times 3 \\ & =57+\dfrac{3}{16}=57+0.1875 \\ & =57.1875 \\ & \simeq 57.19 \end{aligned} $

Mean number of mangoes kept in a packing box is 57.19.

The step-deviation method is used here as the values of $f_i, d_i$ are big and also, there is a common factor of all the $d_i$’s.

6. The table below shows the daily expenditure on food of 25 households in a locality.

Daily expenditure (in ₹) $100-150$ $150-200$ $200-250$ $250-300$ $300-350$
Number of households 4 5 12 2 2

Find the mean daily expenditure on food by a suitable method.


Solution

To find the class mark $(x_i)$ for each interval, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Class size $=50$

Taking 225 as assumed mean (a), $d_i, u_i, f_i u_i$ are calculated as follows.

Daily expenditure (in Rs) $\boldsymbol{{}f} _{\boldsymbol{{}i}}$ $\boldsymbol{{}x} _{\boldsymbol{{}i}}$ $\boldsymbol{{}d} _{\boldsymbol{{}i}}=\boldsymbol{{}x} _{\boldsymbol{{}i}}-\mathbf{2 2 5}$ $\boldsymbol{{}u} _{\boldsymbol{{}i}}=\dfrac{\boldsymbol{{}d} _{\boldsymbol{{}i}}}{\mathbf{5 0}}$ $\boldsymbol{{}f} _{\boldsymbol{{}i}} \boldsymbol{{}u} _{\boldsymbol{{}i}}$
$100-150$ 4 125 -100 -2 -8
$150-200$ 5 175 -50 -1 -5
$200-250$ 12 225 0 0 0
$250-300$ 2 275 50 1 2
$300-350$ 2 325 100 2 4
Total 25 -7

From the table, we obtain

$\sum f_i=25$

$\sum f_i u_i=-7$

$ \text{ Mean, } \begin{aligned} \bar{{}x} & =a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h \\ & =225+(\dfrac{-7}{25}) \times(50) \\ & =225-14 \\ & =211 \end{aligned} $

Therefore, the mean daily expenditure on food is Rs 211.

7. To find out the concentration of $\mathrm{SO}_{2}$ in the air (in parts per million, i.e., ppm), the data was collected for 30 localities in a certain city and is presented below:

Concentration of $\mathrm{SO}_{2}$ (in ppm) Frequency
$0.00-0.04$ 4
$0.04-0.08$ 9
$0.08-0.12$ 9
$0.12-0.16$ 2
$0.16-0.20$ 4
$0.20-0.24$ 2

Find the mean concentration of $\mathrm{SO}_{2}$ in the air.


Solution

To find the class marks for each interval, the following relation is used.

$ x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2} $

Class size of this data $=0.04$

Taking 0.14 as assumed mean (a), $d_i, u_i, f_i u_i$ are calculated as follows.

Concentration of $\mathbf{S O}_2$ (in ppm) Frequency $\boldsymbol{{}f}_i$ Class mark $\boldsymbol{{}x} _{\boldsymbol{{}i}}$ $\boldsymbol{{}d} _{\boldsymbol{{}i}}=\boldsymbol{{}x} _{\boldsymbol{{}i}}-\mathbf{0 . 1 4}$ $\boldsymbol{{}u} _{\boldsymbol{{}i}}=\dfrac{\boldsymbol{{}d} _{\boldsymbol{{}i}}}{\mathbf{0 . 0 4}}$ $\boldsymbol{{}f} _{\boldsymbol{{}i}} \boldsymbol{{}u} _{\boldsymbol{{}i}}$
$0.00-0.04$ 4 0.02 -0.12 -3 -12
$0.04-0.08$ 9 0.06 -0.08 -2 -18
$0.08-0.12$ 9 0.10 -0.04 -1 -9
$0.12-0.16$ 2 0.14 0 0 0
$0.16-0.20$ 4 0.18 0.04 1 4
$0.20-0.24$ 2 0.22 0.08 2 4
Total 30 -31

From the table, we obtain

$\sum f_i=30$

$\sum f_i u_i=-31$

Mean, $\bar{{}x}=a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h$

$ \begin{aligned} & =0.14+\left(\dfrac{-31}{30}\right)(0.04) \\ & =0.14-0.04133 \\ & =0.09867 \\ & \simeq 0.099 \text{ ppm} \end{aligned} $

Therefore, the mean concentration of $\mathrm{SO}_{2}$ in the air is $0.099$ ppm.

8. A class teacher has the following absentee record of 40 students of a class for the whole term. Find the mean number of days a student was absent.

Number of days $0-6$ $6-10$ $10-14$ $14-20$ $20-28$ $28-38$ $38-40$
Number of students 11 10 7 4 4 3 1

Solution

To find the class mark of each interval, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Taking 17 as assumed mean (a), $d_i$ and $f_i d_i$ are calculated as follows.

Number of days Number of students $(\boldsymbol{f_i})$ $\boldsymbol{x_i}$ $\boldsymbol{d_i}=\boldsymbol{x_i}-\mathbf{17}$ $\boldsymbol{f_i} \boldsymbol{d_i}$
$0-6$ 11 3 -14 -154
$6-10$ 10 8 -9 -90
$10-14$ 7 12 -5 -35
$14-20$ 4 17 0 0
$20-28$ 4 24 7 28
$28-38$ 3 33 16 48
$38-40$ 1 39 22 22
Total 40 -181

From the table, we obtain

$\sum f_i=40$

$\sum f_i d_i=-181$

Mean, $\bar{{}x}=a+(\dfrac{\sum f_i d_i}{\sum f_i})$

$=17+(\dfrac{-181}{40})$

$=17-4.525$

$=12.475$

$\simeq 12.48$

Therefore, the mean number of days for which a student was absent is 12.48.

9. The following table gives the literacy rate (in percentage) of 35 cities. Find the mean literacy rate.

Literacy rate (in %) $45-55$ $55-65$ $65-75$ $75-85$ $85-95$
Number of cities 3 10 11 8 3

Solution

To find the class marks, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Class size $(h)$ for this data $=10$

Taking 70 as assumed mean (a), $d_i, u_i$, and $f_i u_i$ are calculated as follows.

Literacy rate (in %) Number of cities $(f_i)$ $x_i$ $d_i=x_i-70$ $u_i=\dfrac{d_i}{10}$ $f_i u_i$
$45-55$ 3 50 -20 -2 -6
$55-65$ 10 60 -10 -1 -10
$65-75$ 11 70 0 0 0
$75-85$ 8 80 10 1 8
$85-95$ 3 90 20 2 6
Total 35 -2

From the table, we obtain

$\sum f_i=35$

$\sum f_i u_i=-2$

$ \text{ Mean, } \begin{aligned} \bar{{}x} & =a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h \\ & =70+(\dfrac{-2}{35}) \times(10) \\ & =70-\dfrac{20}{35} \\ & =70-\dfrac{4}{7} \\ & =70-0.57 \\ & =69.43 \end{aligned} $

Therefore, the mean literacy rate is $69.43 \%$.

13.3 Mode of Grouped Data

Recall from Class IX, a mode is that value among the observations which occurs most often, that is, the value of the observation having the maximum frequency. Further, we discussed finding the mode of ungrouped data. Here, we shall discuss ways of obtaining a mode of grouped data. It is possible that more than one value may have the same maximum frequency. In such situations, the data is said to be multimodal. Though grouped data can also be multimodal, we shall restrict ourselves to problems having a single mode only.

Let us first recall how we found the mode for ungrouped data through the following example.

Example 4 : The wickets taken by a bowler in 10 cricket matches are as follows:

$$ \begin{array}{llllllllll} 2 & 6 & 4 & 5 & 0 & 2 & 1 & 3 & 2 & 3 \end{array} $$

Find the mode of the data.

Solution : Let us form the frequency distribution table of the given data as follows:

Number of wickets 0 1 2 3 4 5 6
Number of matches 1 1 3 2 1 1 1

Clearly, 2 is the number of wickets taken by the bowler in the maximum number (i.e., 3) of matches. So, the mode of this data is 2.

In a grouped frequency distribution, it is not possible to determine the mode by looking at the frequencies. Here, we can only locate a class with the maximum frequency, called the modal class. The mode is a value inside the modal class, and is given by the formula:

$$ \text { Mode }=l+\left(\dfrac{f_{1}-f_{0}}{2 f_{1}-f_{0}-f_{2}}\right) \times h $$

where $l=$ lower limit of the modal class,

$h=$ size of the class interval (assuming all class sizes to be equal),

$f_{1}=$ frequency of the modal class,

$f_{0}=$ frequency of the class preceding the modal class,

$f_{2}=$ frequency of the class succeeding the modal class.

Let us consider the following examples to illustrate the use of this formula.

Example 5 : A survey conducted on 20 households in a locality by a group of students resulted in the following frequency table for the number of family members in a household:

Family size $1-3$ $3-5$ $5-7$ $7-9$ $9-11$
Number of families 7 8 2 2 1

Find the mode of this data.

Solution : Here the maximum class frequency is 8 , and the class corresponding to this frequency is $3-5$. So, the modal class is $3-5$.

Now

modal class $=3-5$, lower limit $(l)$ of modal class $=3$, class size $(h)=2$

frequency $\left(f_{1}\right)$ of the modal class $=8$,

frequency $\left(f_{0}\right)$ of class preceding the modal class $=7$,

frequency $\left(f_{2}\right)$ of class succeeding the modal class $=2$.

Now, let us substitute these values in the formula :

$$ \begin{aligned} \text { Mode } & =l+\left(\dfrac{f_{1}-f_{0}}{2 f_{1}-f_{0}-f_{2}}\right) \times h \\ \\ & =3+\left(\dfrac{8-7}{2 \times 8-7-2}\right) \times 2=3+\dfrac{2}{7}=3.286 \end{aligned} $$

Therefore, the mode of the data above is 3.286.
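The substitution into the mode formula can be sketched in Python as below. The function name `grouped_mode` is an illustrative choice. Taking $f_0$ or $f_2$ as 0 when the modal class is the first or last class is an assumption of this sketch, not something stated in the text, and the sketch assumes equal class sizes and a single modal class.

```python
def grouped_mode(intervals, freqs):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for equal class sizes."""
    k = freqs.index(max(freqs))          # position of the modal class
    l, upper = intervals[k]              # lower and upper limits of the modal class
    h = upper - l                        # class size
    f1 = freqs[k]
    # Assumption: a missing neighbouring class is treated as having frequency 0.
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Data of Example 5
family_size = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11)]
families = [7, 8, 2, 2, 1]

mode = grouped_mode(family_size, families)
print(round(mode, 3))  # 3.286
```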

Example 6 : The marks distribution of 30 students in a mathematics examination is given in Table 13.3 of Example 1. Find the mode of this data. Also compare and interpret the mode and the mean.

Solution : Refer to Table 13.3 of Example 1. Since the maximum number of students (i.e., 7) have got marks in the interval 40 - 55, the modal class is $40-55$. Therefore,

the lower limit $(l)$ of the modal class $=40$,

the class size $(h)=15$,

the frequency $\left(f_{1}\right)$ of modal class $=7$,

the frequency $\left(f_{0}\right)$ of the class preceding the modal class $=3$,

the frequency $\left(f_{2}\right)$ of the class succeeding the modal class $=6$.

Now, using the formula:

$$ \begin{aligned} & \text { Mode }=l+\left(\dfrac{f_{1}-f_{0}}{2 f_{1}-f_{0}-f_{2}}\right) \times h, \end{aligned} $$

we get $$ \begin{aligned} & \text { Mode }=40+\left(\dfrac{7-3}{14-6-3}\right) \times 15=52 \end{aligned} $$

So, the mode marks is 52 .

Now, from Example 1, you know that the mean marks is 62.

So, the maximum number of students obtained 52 marks, while on an average a student obtained 62 marks.

Remarks :

1. In Example 6, the mode is less than the mean. But for some other problems it may be equal or more than the mean also.

2. It depends upon the demand of the situation whether we are interested in finding the average marks obtained by the students or the average of the marks obtained by most of the students. In the first situation, the mean is required and in the second situation, the mode is required.

Activity 3 : Continue with the same groups as formed in Activity 2 and the situations assigned to them. Ask each group to find the mode of its data. The groups should also compare the mode with the mean, and interpret the meaning of both.

Remark : The mode can also be calculated for grouped data with unequal class sizes. However, we shall not be discussing it.

EXERCISE 13.2

1. The following table shows the ages of the patients admitted in a hospital during a year:

Age (in years) $5-15$ $15-25$ $25-35$ $35-45$ $45-55$ $55-65$
Number of patients 6 11 21 23 14 5

Find the mode and the mean of the data given above. Compare and interpret the two measures of central tendency.

Solution

To find the class marks $(x_i)$, the following relation is used.

$x_i=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Taking 30 as assumed mean (a), $d_i$ and $f_i d_i$ are calculated as follows.

| Age (in years) | Number of patients $f_i$ | Class mark $x_i$ | $d_i=x_i-30$ | $f_i d_i$ |
| --- | --- | --- | --- | --- |
| $5-15$ | 6 | 10 | -20 | -120 |
| $15-25$ | 11 | 20 | -10 | -110 |
| $25-35$ | 21 | 30 | 0 | 0 |
| $35-45$ | 23 | 40 | 10 | 230 |
| $45-55$ | 14 | 50 | 20 | 280 |
| $55-65$ | 5 | 60 | 30 | 150 |
| Total | 80 | | | 430 |

From the table, we obtain

$\sum f_i=80$

$\sum f_i d_i=430$

Mean, $\bar{{}x}=a+\dfrac{\sum f_i d_i}{\sum f_i}$

$ \begin{aligned} & =30+(\dfrac{430}{80}) \\ & =30+5.375 \\ & =35.375 \\ & \simeq 35.38 \end{aligned} $

Mean of this data is 35.38 . It represents that on an average, the age of a patient admitted to hospital was 35.38 years.

It can be observed that the maximum class frequency is 23 belonging to class interval 35 - 45 .

Modal class $=35-45$

Lower limit ( $l$ ) of modal class $=35$

Frequency $(f_1)$ of modal class $=23$

Class size $(h)=10$

Frequency $(f_0)$ of class preceding the modal class $=21$

Frequency $(f_2)$ of class succeeding the modal class $=14$

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =35+(\dfrac{23-21}{2(23)-21-14}) \times 10 \\ & =35+(\dfrac{2}{46-35}) \times 10 \\ & =35+\dfrac{20}{11} \\ & =35+1.82 \\ & =36.82 \approx 36.8 \end{aligned} $

Mode is 36.8. It represents that the age of maximum number of patients admitted in hospital was 36.8 years.
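The assumed mean calculation in this solution can be checked with a short Python sketch. The function name `assumed_mean_method` is our own, chosen for illustration; the data are those of the table above.

```python
def assumed_mean_method(class_limits, frequencies, a):
    """Mean of grouped data by the assumed mean method."""
    # Class mark = (lower limit + upper limit) / 2
    class_marks = [(lower + upper) / 2 for lower, upper in class_limits]
    # Deviation of each class mark from the assumed mean a
    deviations = [x - a for x in class_marks]
    sum_fd = sum(f * d for f, d in zip(frequencies, deviations))
    return a + sum_fd / sum(frequencies)


age_classes = [(5, 15), (15, 25), (25, 35), (35, 45), (45, 55), (55, 65)]
patients = [6, 11, 21, 23, 14, 5]

mean_age = assumed_mean_method(age_classes, patients, a=30)
print(mean_age)  # 35.375
```

Choosing a different assumed mean, say `a=40`, changes the deviations but not the final mean.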

2. The following data gives the information on the observed lifetimes (in hours) of 225 electrical components :

Lifetimes (in hours) $0-20$ $20-40$ $40-60$ $60-80$ $80-100$ $100-120$
Frequency 10 35 52 61 38 29

Determine the modal lifetimes of the components.

Solution

From the data given above, it can be observed that the maximum class frequency is 61 , belonging to class interval 60 - 80.

Therefore, modal class $=60-80$

Lower class limit $(l)$ of modal class $=60$

Frequency $(f_1)$ of modal class $=61$

Frequency $(f_0)$ of class preceding the modal class $=52$

Frequency $(f_2)$ of class succeeding the modal class $=38$

Class size $(h)=20$

Mode $=l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h$

$=60+(\dfrac{61-52}{2(61)-52-38})(20)$

$ \begin{aligned} & =60+(\dfrac{9}{122-90})(20) \\ & =60+(\dfrac{9 \times 20}{32}) \\ & =60+\dfrac{90}{16}=60+5.625 \\ & =65.625 \end{aligned} $

Therefore, modal lifetime of electrical components is 65.625 hours.

3. The following data gives the distribution of total monthly household expenditure of 200 families of a village. Find the modal monthly expenditure of the families. Also, find the mean monthly expenditure :

Expenditure (in ₹) Number of families
$1000-1500$ 24
$1500-2000$ 40
$2000-2500$ 33
$2500-3000$ 28
$3000-3500$ 30
$3500-4000$ 22
$4000-4500$ 16
$4500-5000$ 7

Solution

It can be observed from the given data that the maximum class frequency is 40, belonging to the class interval 1500 - 2000.

Therefore, modal class $=1500-2000$

Lower limit $(l)$ of modal class $=1500$

Frequency $(f_1)$ of modal class $=40$

Frequency $(f_0)$ of class preceding modal class $=24$

Frequency $(f_2)$ of class succeeding modal class $=33$

Class size $(h)=500$

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =1500+(\dfrac{40-24}{2(40)-24-33}) \times 500 \\ & =1500+(\dfrac{16}{80-57}) \times 500 \\ & =1500+\dfrac{8000}{23} \\ & =1500+347.826 \\ & =1847.826=1847.83 \end{aligned} $

Therefore, modal monthly expenditure was Rs 1847.83.

To find the class mark, the following relation is used.

Class mark $=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Class size $(h)$ of the given data $=500$

Taking 3250 as assumed mean (a), $d_i, u_i$, and $f_i u_i$ are calculated as follows.

| Expenditure (in Rs.) | No. of families $f_i$ | Class mark $x_i$ | $d_i=x_i-3250$ | $u_i=\dfrac{d_i}{500}$ | $f_i u_i$ |
| --- | --- | --- | --- | --- | --- |
| $1000-1500$ | 24 | 1250 | -2000 | -4 | -96 |
| $1500-2000$ | 40 | 1750 | -1500 | -3 | -120 |
| $2000-2500$ | 33 | 2250 | -1000 | -2 | -66 |
| $2500-3000$ | 28 | 2750 | -500 | -1 | -28 |
| $3000-3500$ | 30 | $3250=a$ | 0 | 0 | 0 |
| $3500-4000$ | 22 | 3750 | 500 | 1 | 22 |
| $4000-4500$ | 16 | 4250 | 1000 | 2 | 32 |
| $4500-5000$ | 7 | 4750 | 1500 | 3 | 21 |
| Total | $\Sigma f_i=200$ | | | | $\Sigma f_i u_i=-235$ |

From the table, we obtain

$\begin{aligned} & \sum f_i=200, \quad \sum f_i u_i=-235 \\ & \text { Mean monthly expenditure }=\bar{x}=a+\dfrac{\sum f_i u_i}{\sum f_i} \times h \\ & =3250-\dfrac{235}{200} \times 500 \\ & =3250-587.5 \\ & =\text { Rs. } 2662.50\end{aligned}$
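As a check on the step deviation calculation, here is a minimal Python sketch using the class marks and frequencies from the table, with $a=3250$ and $h=500$. The function name `step_deviation_mean` is our own.

```python
def step_deviation_mean(class_marks, frequencies, a, h):
    """Mean of grouped data by the step deviation method."""
    # u_i = (x_i - a) / h
    u = [(x - a) / h for x in class_marks]
    sum_fu = sum(f * ui for f, ui in zip(frequencies, u))
    # Mean = a + (sum of f_i u_i / sum of f_i) * h
    return a + sum_fu * h / sum(frequencies)


expenditure_class_marks = [1250, 1750, 2250, 2750, 3250, 3750, 4250, 4750]
families = [24, 40, 33, 28, 30, 22, 16, 7]

mean_expenditure = step_deviation_mean(
    expenditure_class_marks, families, a=3250, h=500
)
print(mean_expenditure)  # 2662.5
```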

4. The following distribution gives the state-wise teacher-student ratio in higher secondary schools of India. Find the mode and mean of this data. Interpret the two measures.

Number of students per teacher Number of states / U.T.
$15-20$ 3
$20-25$ 8
$25-30$ 9
$30-35$ 10
$35-40$ 3
$40-45$ 0
$45-50$ 0
$50-55$ 2

Solution

It can be observed from the given data that the maximum class frequency is 10 belonging to class interval 30 - 35 .

Therefore, modal class $=30-35$

Class size $(h)=5$

Lower limit $(l)$ of modal class $=30$

Frequency $(f_1)$ of modal class $=10$

Frequency $(f_0)$ of class preceding modal class $=9$

Frequency $(f_2)$ of class succeeding modal class $=3$

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =30+(\dfrac{10-9}{2(10)-9-3}) \times(5) \\ & =30+(\dfrac{1}{20-12}) 5 \\ & =30+\dfrac{5}{8}=30.625 \end{aligned} $

Mode $=30.6$

It represents that most of the states/U.T have a teacher-student ratio as 30.6.

To find the class marks, the following relation is used.

Class mark $=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Taking 32.5 as assumed mean (a), $d_i, u_i$, and $f_i u_i$ are calculated as follows.

| Number of students per teacher | Number of states/U.T. $(f_i)$ | $x_i$ | $d_i=x_i-32.5$ | $u_i=\dfrac{d_i}{5}$ | $f_i u_i$ |
| --- | --- | --- | --- | --- | --- |
| $15-20$ | 3 | 17.5 | -15 | -3 | -9 |
| $20-25$ | 8 | 22.5 | -10 | -2 | -16 |
| $25-30$ | 9 | 27.5 | -5 | -1 | -9 |
| $30-35$ | 10 | 32.5 | 0 | 0 | 0 |
| $35-40$ | 3 | 37.5 | 5 | 1 | 3 |
| $40-45$ | 0 | 42.5 | 10 | 2 | 0 |
| $45-50$ | 0 | 47.5 | 15 | 3 | 0 |
| $50-55$ | 2 | 52.5 | 20 | 4 | 8 |
| Total | 35 | | | | -23 |

Mean, $\bar{{}x}=a+(\dfrac{\sum f_i u_i}{\sum f_i}) h$

$ \begin{aligned} & =32.5+(\dfrac{-23}{35}) \times 5 \\ & =32.5-\dfrac{23}{7}=32.5-3.29 \\ & =29.21 \end{aligned} $

Therefore, mean of the data is 29.2.

It represents that

  • Most states/U.T. have about 30.6 students per teacher (the mode).
  • On an average, there are 29.2 students per teacher (the mean).

5. The given distribution shows the number of runs scored by some top batsmen of the world in one-day international cricket matches.

Runs scored Number of batsmen
$3000-4000$ 4
$4000-5000$ 18
$5000-6000$ 9
$6000-7000$ 7
$7000-8000$ 6
$8000-9000$ 3
$9000-10000$ 1
$10000-11000$ 1

Find the mode of the data.

Solution

From the given data, it can be observed that the maximum class frequency is 18 , belonging to class interval $4000-5000$ .

Therefore, modal class $=4000-5000$

Lower limit $(l)$ of modal class $=4000$

Frequency $(f_1)$ of modal class $=18$

Frequency $(f_0)$ of class preceding modal class $=4$

Frequency $(f_2)$ of class succeeding modal class $=9$

Class size $(h)=1000$

Mode $=l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h$

$=4000+(\dfrac{18-4}{2(18)-4-9}) \times 1000$

$=4000+(\dfrac{14000}{23})$

$=4000+608.695$

$=4608.695$

Therefore, mode of the given data is 4608.7 runs.

6. A student noted the number of cars passing through a spot on a road for 100 periods each of 3 minutes and summarised it in the table given below. Find the mode of the data :

Number of cars $0-10$ $10-20$ $20-30$ $30-40$ $40-50$ $50-60$ $60-70$ $70-80$
Frequency 7 14 13 12 20 11 15 8

Solution

From the given data, it can be observed that the maximum class frequency is 20, belonging to the class interval 40 - 50.

Therefore, modal class $=40-50$

Lower limit ( $l$ ) of modal class $=40$

Frequency $(f_1)$ of modal class $=20$

Frequency $(f_0)$ of class preceding modal class $=12$

Frequency $(f_2)$ of class succeeding modal class $=11$

Class size $(h)=10$

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =40+[\dfrac{20-12}{2(20)-12-11}] \times 10 \\ & =40+(\dfrac{80}{40-23}) \\ & =40+\dfrac{80}{17} \end{aligned} $

$=40+4.7$

$=44.7$

Therefore, mode of this data is 44.7 cars.

13.4 Median of Grouped Data

As you have studied in Class IX, the median is a measure of central tendency which gives the value of the middle-most observation in the data. Recall that for finding the median of ungrouped data, we first arrange the data values of the observations in ascending order. Then, if $n$ is odd, the median is the $\left(\dfrac{n+1}{2}\right)$ th observation. And, if $n$ is even, then the median will be the average of the $\dfrac{n}{2}$ th and the $\left(\dfrac{n}{2}+1\right)$ th observations.

Suppose, we have to find the median of the following data, which gives the marks, out of 50, obtained by 100 students in a test :

Marks obtained 20 29 28 33 42 38 43 25
Number of students 6 28 24 15 2 4 1 20

First, we arrange the marks in ascending order and prepare a frequency table as follows :

Table 13.9

Marks obtained Number of students
(Frequency)
20 6
25 20
28 24
29 28
33 15
38 4
42 2
43 1
Total $\mathbf{1 0 0}$

Here $n=100$, which is even. The median will be the average of the $\dfrac{n}{2}$ th and the $\left(\dfrac{n}{2}+1\right)$ th observations, i.e., the 50th and 51st observations. To find these observations, we proceed as follows:

Table 13.10

Marks obtained Number of students
20 6
upto 25 $6+20=26$
upto 28 $26+24=50$
upto 29 $50+28=78$
upto 33 $78+15=93$
upto 38 $93+4=97$
upto 42 $97+2=99$
upto 43 $99+1=100$

Now we add another column depicting this information to the frequency table above and name it as cumulative frequency column.

Table 13.11

Marks obtained Number of students Cumulative frequency
20 6 6
25 20 26
28 24 50
29 28 78
33 15 93
38 4 97
42 2 99
43 1 100

From the table above, we see that:

50th observation is 28 (Why?)

51st observation is 29

So, $\quad$ Median $=\dfrac{28+29}{2}=28.5$

Remark : The part of Table 13.11 consisting of Column 1 and Column 3 is known as the Cumulative Frequency Table. The median marks 28.5 conveys the information that about $50 \%$ of the students obtained marks less than 28.5 and another $50 \%$ obtained marks more than 28.5.
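The way the 50th and 51st observations are located from the cumulative frequencies can be sketched in Python as follows. The helper name `observation_at` is our own; the data are those of Table 13.9.

```python
from itertools import accumulate

marks = [20, 25, 28, 29, 33, 38, 42, 43]   # already in ascending order
students = [6, 20, 24, 28, 15, 4, 2, 1]

# Running totals of the frequencies: [6, 26, 50, 78, 93, 97, 99, 100]
cumulative = list(accumulate(students))


def observation_at(position):
    """Value of the observation at a 1-based position in sorted order."""
    for value, cf in zip(marks, cumulative):
        # The first value whose cumulative frequency reaches the position
        if position <= cf:
            return value
    raise ValueError("position exceeds the number of observations")


n = cumulative[-1]
if n % 2 == 0:
    # n even: average of the (n/2)th and (n/2 + 1)th observations
    median_marks = (observation_at(n // 2) + observation_at(n // 2 + 1)) / 2
else:
    # n odd: the ((n + 1)/2)th observation
    median_marks = observation_at((n + 1) // 2)

print(median_marks)  # 28.5
```

The 50th observation is 28 because the cumulative frequency first reaches 50 at the marks 28, while the 51st observation falls in the next group, with marks 29.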

Now, let us see how to obtain the median of grouped data, through the following situation.

Consider a grouped frequency distribution of marks obtained, out of 100, by 53 students, in a certain examination, as follows:

Table 13.12

Marks Number of students
$0-10$ 5
$10-20$ 3
$20-30$ 4
$30-40$ 3
$40-50$ 3
$50-60$ 4
$60-70$ 7
$70-80$ 9
$80-90$ 7
$90-100$ 8

From the table above, try to answer the following questions:

How many students have scored marks less than 10 ? The answer is clearly 5 .

How many students have scored less than 20 marks? Observe that the number of students who have scored less than 20 include the number of students who have scored marks from 0 - 10 as well as the number of students who have scored marks from $10-20$. So, the total number of students with marks less than 20 is $5+3$, i.e., 8 . We say that the cumulative frequency of the class $10-20$ is 8 .

Similarly, we can compute the cumulative frequencies of the other classes, i.e., the number of students with marks less than 30 , less than $40, \ldots$, less than 100 . We give them in Table 13.13 given below:

Table 13.13

Marks obtained Number of students
(Cumulative frequency)
Less than 10 5
Less than 20 $5+3=8$
Less than 30 $8+4=12$
Less than 40 $12+3=15$
Less than 50 $15+3=18$
Less than 60 $18+4=22$
Less than 70 $22+7=29$
Less than 80 $29+9=38$
Less than 90 $38+7=45$
Less than 100 $45+8=53$

The distribution given above is called the cumulative frequency distribution of the less than type. Here $10, 20, 30, \ldots, 100$ are the upper limits of the respective class intervals.

We can similarly make the table for the number of students with scores more than or equal to 0, more than or equal to 10, more than or equal to 20, and so on. From Table 13.12, we observe that all 53 students have scored marks more than or equal to 0. Since there are 5 students scoring marks in the interval $0-10$, this means that there are $53-5=48$ students getting more than or equal to 10 marks. Continuing in the same manner, we get the number of students scoring 20 or above as $48-3=45$, 30 or above as $45-4=41$, and so on, as shown in Table 13.14.

Table 13.14

Marks obtained Number of students
(Cumulative frequency)
More than or equal to 0 53
More than or equal to 10 $53-5=48$
More than or equal to 20 $48-3=45$
More than or equal to 30 $45-4=41$
More than or equal to 40 $41-3=38$
More than or equal to 50 $38-3=35$
More than or equal to 60 $35-4=31$
More than or equal to 70 $31-7=24$
More than or equal to 80 $24-9=15$
More than or equal to 90 $15-7=8$

The table above is called a cumulative frequency distribution of the more than type. Here $0,10,20, \ldots, 90$ give the lower limits of the respective class intervals.
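Both cumulative frequency distributions can be built from the frequencies of Table 13.12 in a few lines. The following Python sketch uses variable names of our own choosing.

```python
from itertools import accumulate

# Frequencies of the classes 0-10, 10-20, ..., 90-100 (Table 13.12)
frequencies = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]

# Less than type: running totals, one per upper limit 10, 20, ..., 100
less_than = list(accumulate(frequencies))
print(less_than)  # [5, 8, 12, 15, 18, 22, 29, 38, 45, 53]

# More than type: total minus what lies below each lower limit 0, 10, ..., 90
total = less_than[-1]
more_than = [total] + [total - cf for cf in less_than[:-1]]
print(more_than)  # [53, 48, 45, 41, 38, 35, 31, 24, 15, 8]
```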

Now, to find the median of grouped data, we can make use of any of these cumulative frequency distributions.

Let us combine Tables 13.12 and 13.13 to get Table 13.15 given below:

Table 13.15

Marks Number of students $(\boldsymbol{f})$ Cumulative frequency $(\mathbf{c f})$
$0-10$ 5 5
$10-20$ 3 8
$20-30$ 4 12
$30-40$ 3 15
$40-50$ 3 18
$50-60$ 4 22
$60-70$ 7 29
$70-80$ 9 38
$80-90$ 7 45
$90-100$ 8 53

Now in a grouped data, we may not be able to find the middle observation by looking at the cumulative frequencies as the middle observation will be some value in a class interval. It is, therefore, necessary to find the value inside a class that divides the whole distribution into two halves. But which class should this be?

To find this class, we find the cumulative frequencies of all the classes and $\dfrac{n}{2}$. We now locate the class whose cumulative frequency is greater than (and nearest to) $\dfrac{n}{2}$. This is called the median class. In the distribution above, $n=53$. So, $\dfrac{n}{2}=26.5$. Now 60 - 70 is the class whose cumulative frequency 29 is greater than (and nearest to) $\dfrac{n}{2}$, i.e., 26.5 .

Therefore, $60-70$ is the median class.

After finding the median class, we use the following formula for calculating the median.

$$ \text { Median }=l+\left(\dfrac{\dfrac{n}{2}-\mathrm{cf}}{f}\right) \times h $$

where

$l$ = lower limit of median class

$n$ = number of observations

$cf$ = cumulative frequency of class preceding the median class

$f$ = frequency of median class

$h$ = class size (assuming class size to be equal)

Substituting the values $\dfrac{n}{2}=26.5, l=60, \mathrm{cf}=22, f=7, h=10$ in the formula above, we get

$$ \begin{aligned} \text { Median } & =60+\left(\dfrac{26.5-22}{7}\right) \times 10 \\ & =60+\dfrac{45}{7} \\ & =66.4 \end{aligned} $$

So, about half the students have scored marks less than 66.4 , and the other half have scored marks more than 66.4.
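The procedure above (locate the median class, then apply the formula) can be sketched in Python. The function name `grouped_median` is our own, and the sketch assumes equal class sizes.

```python
from itertools import accumulate


def grouped_median(lower_limits, class_size, frequencies):
    """Median of grouped data, assuming all classes have the same size."""
    cumulative = list(accumulate(frequencies))
    half = cumulative[-1] / 2  # n / 2
    # Median class: the first class whose cumulative frequency exceeds n/2
    k = next(i for i, cf in enumerate(cumulative) if cf > half)
    # cf = cumulative frequency of the class preceding the median class
    cf_before = cumulative[k - 1] if k > 0 else 0
    return lower_limits[k] + (half - cf_before) / frequencies[k] * class_size


# Data of Table 13.12: classes 0-10, 10-20, ..., 90-100
marks_lower_limits = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
marks_frequencies = [5, 3, 4, 3, 3, 4, 7, 9, 7, 8]

median_marks = grouped_median(marks_lower_limits, 10, marks_frequencies)
print(round(median_marks, 1))  # 66.4
```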

Example 7 : A survey regarding the heights (in $\mathrm{cm}$ ) of 51 girls of Class $\mathrm{X}$ of a school was conducted and the following data was obtained:

Height (in cm) Number of girls
Less than 140 4
Less than 145 11
Less than 150 29
Less than 155 40
Less than 160 46
Less than 165 51

Find the median height.

Solution : To calculate the median height, we need to find the class intervals and their corresponding frequencies.

The given distribution being of the less than type, $140, 145, 150, \ldots, 165$ give the upper limits of the corresponding class intervals. So, the classes should be below 140, 140 - 145, 145 - 150, ..., 160 - 165. Observe that from the given distribution, we find that there are 4 girls with height less than 140, i.e., the frequency of class interval below 140 is 4. Now, there are 11 girls with heights less than 145 and 4 girls with height less than 140. Therefore, the number of girls with height in the interval $140-145$ is $11-4=7$. Similarly, the frequency of $145-150$ is $29-11=18$, for $150-155$, it is $40-29=11$, and so on. So, our frequency distribution table with the given cumulative frequencies becomes:

Table 13.16

Class intervals Frequency Cumulative frequency
Below 140 4 4
$140-145$ 7 11
$145-150$ 18 29
$150-155$ 11 40
$155-160$ 6 46
$160-165$ 5 51

Now $n=51$. So, $\dfrac{n}{2}=\dfrac{51}{2}=25.5$. This observation lies in the class $145-150$. Then,

$l$ (the lower limit) $=145$,

cf $($ the cumulative frequency of the class preceding $145-150)=11$,

$f$ $($ the frequency of the median class $145-150)=18$,

$h$ (the class size) $=5$.

Using the formula, Median $=l+\left(\dfrac{\dfrac{n}{2}-\mathrm{cf}}{f}\right) \times h$, we have

$$ \begin{aligned} \text { Median } & =145+\left(\dfrac{25.5-11}{18}\right) \times 5 \\ & =145+\dfrac{72.5}{18}=149.03 . \end{aligned} $$

So, the median height of the girls is $149.03 \mathrm{~cm}$.

This means that the height of about $50 \%$ of the girls is less than this height, and $50 \%$ are taller than this height.
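The conversion of the less than type distribution of Example 7 into class frequencies, followed by the median calculation, can be sketched in Python as below. The variable names are our own.

```python
# Less than type distribution of Example 7
upper_limits = [140, 145, 150, 155, 160, 165]
cumulative = [4, 11, 29, 40, 46, 51]

# Frequency of a class = its cumulative frequency minus the previous one
frequencies = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]
print(frequencies)  # [4, 7, 18, 11, 6, 5]

half = cumulative[-1] / 2  # n / 2 = 25.5
# Median class: the first class whose cumulative frequency exceeds n/2
k = next(i for i, cf in enumerate(cumulative) if cf > half)

# Here k > 0, so the lower limit is the previous class's upper limit
lower_limit = upper_limits[k - 1]            # 145
class_size = upper_limits[k] - lower_limit   # 5
cf_before = cumulative[k - 1]                # 11

median_height = lower_limit + (half - cf_before) / frequencies[k] * class_size
print(round(median_height, 2))  # 149.03
```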

Example 8 : The median of the following data is 525. Find the values of $x$ and $y$, if the total frequency is 100 .

Class intervals Frequency
$0-100$ 2
$100-200$ 5
$200-300$ $x$
$300-400$ 12
$400-500$ 17
$500-600$ 20
$600-700$ $y$
$700-800$ 9
$800-900$ 7
$900-1000$ 4

Solution:

Class intervals Frequency Cumulative frequency
$0-100$ 2 2
$100-200$ 5 7
$200-300$ $x$ $7+x$
$300-400$ 12 $19+x$
$400-500$ 17 $36+x$
$500-600$ 20 $56+x$
$600-700$ $y$ $56+x+y$
$700-800$ 9 $65+x+y$
$800-900$ 7 $72+x+y$
$900-1000$ 4 $76+x+y$

It is given that $n=100$

$$ \text{So,} \quad 76+x+y=100, \quad \text{ i.e.,}\quad x+y=24 \tag{1} $$

The median is 525 , which lies in the class $500-600$

So, $\quad l=500, \quad f=20, \quad$ cf $=36+x, \quad h=100$

Using the formula :

Median $=l+\left(\dfrac{\dfrac{n}{2}-\mathrm{cf}}{f}\right) h$, we get

$$ 525=500+\left(\dfrac{50-36-x}{20}\right) \times 100 $$

i.e., $$ 525-500=(14-x) \times 5 $$

i.e., $$ 25=70-5 x $$

i.e., $$ 5 x=70-25=45 $$

So, $$ x=9 $$

Therefore, from (1), we get $9+y=24$

i.e., $y=15$
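The algebra of Example 8 can be verified with exact arithmetic. The Python sketch below rearranges the median formula for $x$ and then uses the total frequency to obtain $y$; the variable names are our own.

```python
from fractions import Fraction

n = 100
median, l, f, h = 525, 500, 20, 100

# Sum of the known frequencies, leaving out x and y
known_total = 2 + 5 + 12 + 17 + 20 + 9 + 7 + 4   # 76
# cf of the class preceding 500-600 is 36 + x
cf_without_x = 2 + 5 + 12 + 17                   # 36

# median = l + ((n/2 - (36 + x)) / f) * h
# =>  x = n/2 - 36 - (median - l) * f / h
x = Fraction(n, 2) - cf_without_x - Fraction((median - l) * f, h)
y = n - known_total - x
print(x, y)  # 9 15

# Substitute back into the median formula as a check
check = l + (Fraction(n, 2) - (cf_without_x + x)) / f * h
print(check)  # 525
```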

Now, that you have studied about all the three measures of central tendency, let us discuss which measure would be best suited for a particular requirement.

The mean is the most frequently used measure of central tendency because it takes into account all the observations, and lies between the extremes, i.e., the largest and the smallest observations of the entire data. It also enables us to compare two or more distributions. For example, by comparing the average (mean) results of students of different schools of a particular examination, we can conclude which school has a better performance.

However, extreme values in the data affect the mean. For example, the mean of classes having frequencies more or less the same is a good representative of the data. But, if one class has frequency, say 2, and the five others have frequency 20, 25, 20, 21,18 , then the mean will certainly not reflect the way the data behaves. So, in such cases, the mean is not a good representative of the data.

In problems where individual observations are not important, and we wish to find out a ’typical’ observation, the median is more appropriate, e.g., finding the typical productivity rate of workers, average wage in a country, etc. These are situations where extreme values may be there. So, rather than the mean, we take the median as a better measure of central tendency.

In situations which require establishing the most frequent value or most popular item, the mode is the best choice, e.g., to find the most popular T.V. programme being watched, the consumer item in greatest demand, the colour of the vehicle used by most of the people, etc.

Remarks:

1. There is an empirical relationship between the three measures of central tendency :

$$ 3 \text { Median }=\text { Mode }+2 \text { Mean } $$

2. The median of grouped data with unequal class sizes can also be calculated. However, we shall not discuss it here.
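The empirical relationship in Remark 1 can be rearranged to estimate any one measure from the other two. The Python sketch below uses function names of our own and applies it to the mean (62) and mode (52) found in Example 6. The result is only a rough estimate, since the relationship is empirical and not exact.

```python
def estimate_median(mean, mode):
    """Rearranged empirical relation: Median = (Mode + 2 * Mean) / 3."""
    return (mode + 2 * mean) / 3


def estimate_mode(mean, median):
    """Rearranged empirical relation: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


# Mean and mode of the marks in Example 6
estimated_median = estimate_median(mean=62, mode=52)
print(round(estimated_median, 2))  # 58.67
```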

EXERCISE 13.3

1. The following frequency distribution gives the monthly consumption of electricity of 68 consumers of a locality. Find the median, mean and mode of the data and compare them.

Monthly consumption (in units) Number of consumers
$65-85$ 4
$85-105$ 5
$105-125$ 13
$125-145$ 20
$145-165$ 14
$165-185$ 8
$185-205$ 4

Solution

To find the class marks, the following relation is used.

Class mark $=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Taking 135 as assumed mean (a), $d_i$, $u_i$, and $f_i u_i$ are calculated according to the step deviation method as follows.

| Monthly consumption (in units) | Number of consumers $(f_i)$ | Class mark $x_i$ | $d_i=x_i-135$ | $u_i=\dfrac{d_i}{20}$ | $f_i u_i$ |
| --- | --- | --- | --- | --- | --- |
| $65-85$ | 4 | 75 | -60 | -3 | -12 |
| $85-105$ | 5 | 95 | -40 | -2 | -10 |
| $105-125$ | 13 | 115 | -20 | -1 | -13 |
| $125-145$ | 20 | 135 | 0 | 0 | 0 |
| $145-165$ | 14 | 155 | 20 | 1 | 14 |
| $165-185$ | 8 | 175 | 40 | 2 | 16 |
| $185-205$ | 4 | 195 | 60 | 3 | 12 |
| Total | 68 | | | | 7 |

From the table, we obtain

$\sum f_i u_i=7$ $\sum f_i=68$

Class size $(h)=20$

Mean, $\bar{{}x}=a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h$

$ \begin{aligned} & =135+\dfrac{7}{68} \times 20 \\ & =135+\dfrac{140}{68} \\ & =137.058 \end{aligned} $

From the table, it can be observed that the maximum class frequency is 20, belonging to class interval 125 - 145.

Modal class $=125-145$

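Lower limit $(l)$ of modal class $=125$

Class size $(h)=20$

Frequency $(f_1)$ of modal class $=20$

Frequency $(f_0)$ of class preceding modal class $=13$

Frequency $(f_2)$ of class succeeding the modal class $=14$

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =125+(\dfrac{20-13}{2(20)-13-14}) \times 20 \\ & =125+\dfrac{140}{13} \\ & =135.77 \end{aligned} $

The cumulative frequencies of the classes are 4, 9, 22, 42, 56, 64 and 68. The cumulative frequency just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{68}{2}=34$) is 42, belonging to class interval 125 - 145.

Therefore, median class $=125-145$, with $l=125$, $cf=22$, $f=20$ and $h=20$.

$ \begin{aligned} \text{ Median } & =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h \\ & =125+(\dfrac{34-22}{20}) \times 20 \\ & =137 \end{aligned} $

Therefore, the median, mean and mode of the data are 137, 137.06 and 135.77 respectively. The three measures are approximately the same in this case.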

2. If the median of the distribution given below is 28.5, find the values of $x$ and $y$.

Class interval Frequency
$0-10$ 5
$10-20$ $x$
$20-30$ 20
$30-40$ 15
$40-50$ $y$
$50-60$ 5
Total 60

Solution

The cumulative frequency for the given data is calculated as follows.

Class interval Frequency Cumulative frequency
$0-10$ 5 5
$10-20$ $x$ $5+x$
$20-30$ 20 $25+x$
$30-40$ 15 $40+x$
$40-50$ $y$ $40+x+y$
$50-60$ 5 $45+x+y$
Total $(n)$ 60

From the table, it can be observed that $n=60$

$45+x+y=60$

$$ x+y=15 \tag{1} $$

Median of the data is given as 28.5 which lies in interval 20 - 30 .

Therefore, median class $=20-30$

Lower limit $(l)$ of median class $=20$

Cumulative frequency ( $c f$ ) of class preceding the median class $=5+x$

Frequency $(f)$ of median class $=20$

Class size $(h)=10$

Median $=l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h$

$28.5=20+[\dfrac{\dfrac{60}{2}-(5+x)}{20}] \times 10$

$8.5=(\dfrac{25-x}{2})$

$17=25-x$

$x=8$

From equation (1),

$8+y=15$

$y=7$

Hence, the values of $x$ and $y$ are 8 and 7 respectively.

3. A life insurance agent found the following data for distribution of ages of 100 policy holders. Calculate the median age, if policies are given only to persons having age 18 years onwards but less than 60 year.

Age (in years) Number of policy holders
Below 20 2
Below 25 6
Below 30 24
Below 35 45
Below 40 78
Below 45 89
Below 50 92
Below 55 98
Below 60 100

Solution

Here, the class widths are not all the same, but the frequencies need not be adjusted according to the class intervals. The given frequency table is of the less than type, represented with upper class limits. The policies were given only to persons with age 18 years onwards but less than 60 years. Therefore, class intervals with their respective cumulative frequencies can be defined as below.

| Age (in years) | Number of policy holders $(f_i)$ | Cumulative frequency $(cf)$ |
| --- | --- | --- |
| $18-20$ | 2 | 2 |
| $20-25$ | $6-2=4$ | 6 |
| $25-30$ | $24-6=18$ | 24 |
| $30-35$ | $45-24=21$ | 45 |
| $35-40$ | $78-45=33$ | 78 |
| $40-45$ | $89-78=11$ | 89 |
| $45-50$ | $92-89=3$ | 92 |
| $50-55$ | $98-92=6$ | 98 |
| $55-60$ | $100-98=2$ | 100 |
| Total $(n)$ | 100 | |

From the table, it can be observed that $n=100$.

Cumulative frequency $(cf)$ just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{100}{2}=50$) is 78, belonging to interval $35-40$.

Therefore, median class $=35-40$

Lower limit $(l)$ of median class $=35$

Class size $(h)=5$

Frequency $(f)$ of median class $=33$

Cumulative frequency ( $c f$ ) of class preceding median class $=45$

$ \begin{aligned} \text{ Median } & =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h \\ & =35+(\dfrac{50-45}{33}) \times 5 \\ & =35+\dfrac{25}{33} \\ & =35.76 \end{aligned} $

Therefore, median age is 35.76 years.

4. The lengths of 40 leaves of a plant are measured correct to the nearest millimetre, and the data obtained is represented in the following table :

Length (in mm) Number of leaves
$118-126$ 3
$127-135$ 5
$136-144$ 9
$145-153$ 12
$154-162$ 5
$163-171$ 4
$172-180$ 2

Find the median length of the leaves.

(Hint : The data needs to be converted to continuous classes for finding the median, since the formula assumes continuous classes. The classes then change to 117.5 - 126.5, 126.5 - 135.5, …, 171.5 - 180.5.)

Solution

The given data does not have continuous class intervals. It can be observed that the difference between the upper limit of a class and the lower limit of the next class is 1. Therefore, $\dfrac{1}{2}=0.5$ has to be subtracted from each lower class limit and added to each upper class limit.

Continuous class intervals with respective cumulative frequencies can be represented as follows.

| Length (in mm) | Number of leaves $f_i$ | Cumulative frequency |
| --- | --- | --- |
| $117.5-126.5$ | 3 | 3 |
| $126.5-135.5$ | 5 | $3+5=8$ |
| $135.5-144.5$ | 9 | $8+9=17$ |
| $144.5-153.5$ | 12 | $17+12=29$ |
| $153.5-162.5$ | 5 | $29+5=34$ |
| $162.5-171.5$ | 4 | $34+4=38$ |
| $171.5-180.5$ | 2 | $38+2=40$ |

From the table, it can be observed that the cumulative frequency just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{40}{2}=20$) is 29, belonging to class interval 144.5 - 153.5.

Median class $=144.5-153.5$

Lower limit $(l)$ of median class $=144.5$

Class size $(h)=9$

Frequency $(f)$ of median class $=12$

Cumulative frequency ( $c f$ ) of class preceding median class $=17$

Median

$ =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h $

$=144.5+(\dfrac{20-17}{12}) \times 9$

$=144.5+\dfrac{9}{4}=146.75$

Therefore, median length of leaves is $146.75 mm$.
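The conversion to continuous classes described in the hint, together with the median calculation, can be sketched in Python. The variable names are our own; the sketch assumes the gap between consecutive classes is the same throughout.

```python
from itertools import accumulate

# Classes as stated in the question (not continuous)
stated_classes = [(118, 126), (127, 135), (136, 144), (145, 153),
                  (154, 162), (163, 171), (172, 180)]
leaves = [3, 5, 9, 12, 5, 4, 2]

# Gap between one class's upper limit and the next class's lower limit
gap = stated_classes[1][0] - stated_classes[0][1]   # 127 - 126 = 1

# Subtract half the gap from lower limits, add half the gap to upper limits
continuous_classes = [
    (lower - gap / 2, upper + gap / 2) for lower, upper in stated_classes
]

cumulative = list(accumulate(leaves))
half = cumulative[-1] / 2   # n / 2 = 20
# Median class: the first class whose cumulative frequency exceeds n/2
k = next(i for i, cf in enumerate(cumulative) if cf > half)

lower, upper = continuous_classes[k]
cf_before = cumulative[k - 1] if k > 0 else 0
median_length = lower + (half - cf_before) / leaves[k] * (upper - lower)
print(continuous_classes[k], median_length)  # (144.5, 153.5) 146.75
```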

5. The following table gives the distribution of the life time of 400 neon lamps :

Life time (in hours) Number of lamps
$1500-2000$ 14
$2000-2500$ 56
$2500-3000$ 60
$3000-3500$ 86
$3500-4000$ 74
$4000-4500$ 62
$4500-5000$ 48

Find the median life time of a lamp.

Solution

The cumulative frequencies with their respective class intervals are as follows.

Life time Number of lamps $(\boldsymbol{{}f} _{\boldsymbol{{}i}})$ Cumulative frequency
$1500-2000$ 14 14
$2000-2500$ 56 $14+56=70$
$2500-3000$ 60 $70+60=130$
$3000-3500$ 86 $130+86=216$
$3500-4000$ 74 $216+74=290$
$4000-4500$ 62 $290+62=352$
$4500-5000$ 48 $352+48=400$
Total $(n)$ 400

It can be observed that the cumulative frequency just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{400}{2}=200$) is 216, belonging to class interval 3000 - 3500.

Median class $=3000-3500$

Lower limit $(l)$ of median class $=3000$

Frequency $(f)$ of median class $=86$

Cumulative frequency $(cf)$ of class preceding median class $=130$

Class size $(h)=500$

Median

$ =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h $

$ \begin{aligned} & =3000+(\dfrac{200-130}{86}) \times 500 \\ & =3000+\dfrac{70 \times 500}{86} \end{aligned} $

$=3406.976$

Therefore, median life time of lamps is 3406.98 hours.

6. 100 surnames were randomly picked up from a local telephone directory and the frequency distribution of the number of letters in the English alphabets in the surnames was obtained as follows:

Number of letters $1-4$ $4-7$ $7-10$ $10-13$ $13-16$ $16-19$
Number of surnames 6 30 40 16 4 4

Determine the median number of letters in the surnames. Find the mean number of letters in the surnames. Also, find the modal size of the surnames.

Solution

The cumulative frequencies with their respective class intervals are as follows.

Number of letters Frequency $(\boldsymbol{{}f} _{\boldsymbol{{}i}})$ Cumulative frequency
$1-4$ 6 6
$4-7$ 30 $30+6=36$
$7-10$ 40 $36+40=76$
$10-13$ 16 $76+16=92$
$13-16$ 4 $92+4=96$
$16-19$ 4 $96+4=100$
Total $(n)$ 100

It can be observed that the cumulative frequency just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{100}{2}=50$) is 76, belonging to class interval 7 - 10.

Median class $=7-10$

Lower limit $(l)$ of median class $=7$

Cumulative frequency ( $c f$ ) of class preceding median class $=36$

Frequency $(f)$ of median class $=40$

Class size $(h)=3$

Median

$ =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h $

$=7+(\dfrac{50-36}{40}) \times 3$

$=7+\dfrac{14 \times 3}{40}$

$=8.05$

To find the class marks of the given class intervals, the following relation is used.

Class mark $=\dfrac{\text{ Upper class limit }+ \text{ Lower class limit }}{2}$

Taking 11.5 as assumed mean (a), $d_i, u_i$, and $f_i u_i$ are calculated according to step deviation method as follows.

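| Number of letters | Number of surnames $f_i$ | $x_i$ | $d_i=x_i-11.5$ | $u_i=\dfrac{d_i}{3}$ | $f_i u_i$ |
| --- | --- | --- | --- | --- | --- |
| $1-4$ | 6 | 2.5 | -9 | -3 | -18 |
| $4-7$ | 30 | 5.5 | -6 | -2 | -60 |
| $7-10$ | 40 | 8.5 | -3 | -1 | -40 |
| $10-13$ | 16 | 11.5 | 0 | 0 | 0 |
| $13-16$ | 4 | 14.5 | 3 | 1 | 4 |
| $16-19$ | 4 | 17.5 | 6 | 2 | 8 |
| Total | 100 | | | | -106 |

From the table, we obtain $\sum f_i=100$ and $\sum f_i u_i=-106$.

Mean, $\bar{x}=a+(\dfrac{\sum f_i u_i}{\sum f_i}) \times h$

$ \begin{aligned} & =11.5+(\dfrac{-106}{100}) \times 3 \\ & =11.5-3.18 \\ & =8.32 \end{aligned} $

The maximum class frequency is 40, belonging to class interval 7 - 10. So, modal class $=7-10$, with $l=7$, $h=3$, $f_1=40$, $f_0=30$ and $f_2=16$.

$ \begin{aligned} \text{ Mode } & =l+(\dfrac{f_1-f_0}{2 f_1-f_0-f_2}) \times h \\ & =7+(\dfrac{40-30}{2(40)-30-16}) \times 3 \\ & =7+\dfrac{30}{34} \\ & =7.88 \end{aligned} $

Therefore, the median and mean number of letters in the surnames are 8.05 and 8.32 respectively, while the modal size of the surnames is 7.88.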

7. The distribution below gives the weights of 30 students of a class. Find the median weight of the students.

Weight (in kg) $40-45$ $45-50$ $50-55$ $55-60$ $60-65$ $65-70$ $70-75$
Number of students 2 3 8 6 6 3 2

Solution

The cumulative frequencies with their respective class intervals are as follows.

Weight (in kg) Frequency (fi) Cumulative frequency
$40-45$ 2 2
$45-50$ 3 $2+3=5$
$50-55$ 8 $5+8=13$
$55-60$ 6 $13+6=19$
$60-65$ 6 $19+6=25$
$65-70$ 3 $25+3=28$
$70-75$ 2 $28+2=30$
Total $(n)$ 30

Cumulative frequency just greater than $\dfrac{n}{2}$ (i.e., $\dfrac{30}{2}=15$) is 19, belonging to class interval $55-60$.

Median class $=55-60$

Lower limit $(l)$ of median class $=55$

Frequency $(f)$ of median class $=6$

Cumulative frequency $(cf)$ of class preceding the median class $=13$

Class size $(h)=5$

Median

$ =l+(\dfrac{\dfrac{n}{2}-c f}{f}) \times h $

$=55+(\dfrac{15-13}{6}) \times 5$

$=55+\dfrac{10}{6}$

$=56.67$

Therefore, median weight is $56.67 kg$.

13.5 Summary

In this chapter, you have studied the following points:

1. The mean for grouped data can be found by :

(i) the direct method : $\bar{x}=\dfrac{\Sigma f_{i} x_{i}}{\Sigma f_{i}}$

(ii) the assumed mean method : $\bar{x}=a+\dfrac{\Sigma f_{i} d_{i}}{\Sigma f_{i}}$

(iii) the step deviation method : $\bar{x}=a+\left(\dfrac{\Sigma f_{i} u_{i}}{\Sigma f_{i}}\right) \times h$,

with the assumption that the frequency of a class is centred at its mid-point, called its class mark.

2. The mode for grouped data can be found by using the formula:

$$ \text { Mode }=l+\left(\dfrac{f_{1}-f_{0}}{2 f_{1}-f_{0}-f_{2}}\right) \times h $$

where symbols have their usual meanings.

3. The cumulative frequency of a class is the frequency obtained by adding the frequency of the given class to the frequencies of all the classes preceding it.

4. The median for grouped data is found by using the formula:

$$ \text { Median }=l+\left(\dfrac{\dfrac{n}{2}-\mathrm{cf}}{f}\right) \times h \text {, } $$

where symbols have their usual meanings.

A NOTE TO THE READER

For calculating mode and median for grouped data, it should be ensured that the class intervals are continuous before applying the formulae. The same condition also applies to the construction of an ogive. Further, in the case of ogives, the scale may not be the same on both the axes.


