Using averages

There are several ways of calculating the average of a set of data including the mean, mode and median.


This guide explains the different types of average (mean, median and mode). It details their use, how to calculate them, and when they can be used most effectively.

Other Useful Guides: Working with Percentages, Measures of Variability.

Introduction

The term average is used frequently in everyday life to express an amount that is typical for a group of people or things. For example, you may read in a newspaper that on average people watch 3 hours of television per day. We understand from the use of the term average that not everybody watches exactly 3 hours of television each day: some watch more and some less. Nevertheless, the figure of 3 hours per day is a good indicator of the amount of TV watched in general.

Averages are useful because they:

  • summarise a large amount of data into a single value; and
  • indicate that there is some variability around this single value within the original data.

In everyday language most people have an inherent understanding of what the term average means. However, within the language of mathematics there are three different definitions of average known as the mean, median and mode. 

The mean, median and mode are each calculated using different methods and when applied to the same set of original data they often result in different average values. It is important to understand what each of these mathematical measures of average tells you about the original data and consider which measure, the mean, median or mode, is the most appropriate to calculate should you wish to use an average value to describe a dataset.

 

Part one: The Mean

What is the mean?

The mean is the most commonly used mathematical measure of average and is generally what is being referred to when people use the term average in everyday language. The mean is calculated by totalling all the values in a dataset; this total is then divided by the number of values that make up the dataset.

For example, to find out the mean amount borrowed by 6 students in a tutorial group taking out a student loan in 1998/9, the amounts borrowed by each student have been collected. These 6 amounts form the dataset given in table 1.

Amounts borrowed: £1,170, £1,890, £1,530, £1,160, £1,870, £1,520 (total £9,140)

Table 1: Amounts borrowed by 6 students taking out a student loan in 1998/9

In order to find the mean loan, the total amount borrowed (£9,140) is divided by the number of students (6), which gives £1,523.33, or approximately £1,523.
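As a minimal sketch, the same calculation can be written in a few lines of Python. The six amounts are those listed for this dataset in the Excel section of this guide; the variable names are illustrative only.

```python
# Amounts borrowed (in pounds) by the six students in table 1.
loans = [1170, 1890, 1530, 1160, 1870, 1520]

total = sum(loans)          # add up all the values
count = len(loans)          # number of values in the dataset
mean_loan = total / count   # mean = total divided by count

print(f"Total: £{total:,}")          # Total: £9,140
print(f"Mean:  £{mean_loan:,.2f}")   # Mean:  £1,523.33
```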

Formula for the mean

Whilst it is not vital to know the mathematical formula for the calculation of the mean you may want to include it at some point in a report or dissertation. 

The formula for the mean is written in the following way:

$$\bar{X} = \frac{\sum X}{n}$$

where:

  • $\bar{X}$ is the symbol for the mean and is referred to as "bar X" (pronounced "bar ex")
  • $\Sigma$ is the Greek symbol sigma and simply means sum or add up
  • $X$ refers to each of the individual values that make up the dataset
  • $n$ is the number of values that make up the dataset

 

Rewriting the equation in words gives: “the mean is equal to the sum of the individual values in the dataset, divided by the number of values in the dataset”.

When to use the mean

The mean is a good measure of the average when a dataset contains values that are relatively evenly spread with no exceptionally high or low values – this was the case with the data on student loans given in table 1.

If a dataset contains one or two very high or very low values the mean will be less typical as it will be adversely influenced by these exceptional value(s). This can be seen in table 2, where the mean salary of 6 graduates who responded to a survey about salaries in their first jobs is calculated to be £23,995 (£143,970 divided by 6).

Starting salaries: £14,870, £18,750, £19,100, £21,650, £22,400, £47,200 (Steve); total £143,970

Table 2: Graduate starting salaries

Examining the dataset shows that 5 of the 6 graduates earn less than the mean salary of £23,995 and it is Steve’s exceptionally high salary that produces the high mean value. In this example, the mean gives a misleading impression of the amount a typical graduate earns in their first job. For datasets containing extremely high or low values the median (see next section) is a better measure of the average value.
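The effect of a single exceptional value can be checked with a short Python sketch. The salaries are those listed in order in the median section of this guide; recalculating the mean without the highest salary is an illustration added here, not a step from the original example.

```python
# Starting salaries (in pounds) of the six graduates in table 2.
salaries = [14870, 18750, 19100, 21650, 22400, 47200]

mean_all = sum(salaries) / len(salaries)

# Leave out the single highest salary to see how much it shifts the mean.
without_top = sorted(salaries)[:-1]
mean_without_top = sum(without_top) / len(without_top)

# Count how many graduates earn less than the overall mean.
below_mean = sum(1 for s in salaries if s < mean_all)

print(f"Mean of all six salaries: £{mean_all:,.0f}")          # £23,995
print(f"Mean without the highest: £{mean_without_top:,.0f}")  # £19,354
print(f"Salaries below the overall mean: {below_mean} of {len(salaries)}")  # 5 of 6
```

Removing one value lowers the mean by more than £4,600, which shows how strongly the mean depends on that single salary.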

When not to use the mean

The mean is generally an inappropriate measure of average for data that are measured on ordinal scales. Ordinal data are ranked in categories, where a higher score indicates a higher or better rank than a lower score. Such data are frequently used in surveys that ask people to indicate a preference. The information is relative, and the differences between ranks are not necessarily equal. For example, in response to a question regarding the flavour of a new blend of coffee, a score of 10 implies a better taste than a score of 1, but it does not mean that the flavour is ten times as good!

 

Part two: The Median 

What is the median?

The median refers to the middle value in a dataset, when the values are arranged in order of magnitude from smallest to largest or vice-versa. When there is an odd number of values in the dataset the middle value is straightforward to find. When there is an even number of values, the mid-point between the two central values is the median.

For example, if the prices of seven sandwiches bought on campus are placed in order the median will be the 4th price in the sequence:

£1.10, £1.26, £1.30, [£1.40], £1.45, £1.85, £2.00

When the six starting salaries from table 2 are placed in order of magnitude, the median value lies half-way between the 3rd and 4th salaries:

£14,870, £18,750, £19,100,    £21,650, £22,400, £47,200

The median value lies half-way between £19,100 and £21,650 and is £20,375 ((£19,100+£21,650)÷2).
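Both cases (an odd and an even number of values) can be sketched in Python as follows. The function name median is a hypothetical helper written for this illustration; Python's built-in statistics.median gives the same results.

```python
def median(values):
    """Return the middle value of a dataset (illustrative helper)."""
    ordered = sorted(values)          # arrange in order of magnitude
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]        # odd count: the single middle value
    # even count: mid-point between the two central values
    return (ordered[middle - 1] + ordered[middle]) / 2

sandwiches = [1.10, 1.26, 1.30, 1.40, 1.45, 1.85, 2.00]
salaries = [14870, 18750, 19100, 21650, 22400, 47200]

print(median(sandwiches))  # 1.4
print(median(salaries))    # 20375.0
```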

When to use the median

The median is a good measure of the average value when the data include exceptionally high or low values because these have little influence on the outcome. 

The median is the most suitable measure of average for data classified on an ordinal scale.

The median is also easy to calculate but this does not imply that it is an inferior measure to the mean – what is important is to use an appropriate measure to determine the average. 

Another area where the median is useful is with frequency data.  Frequency data give the numbers of people or things in particular categories. For example, the frequency distribution of shoe sizes for a sample of 21 women was collected and is summarised in table 3.

Shoe size    Frequency
4            5
5            6
6            7
7            2
8            1
Total        21

Table 3: Frequency distribution of shoe sizes

A common mistake is to think that the median shoe size is 6 since this is the middle value in the first column. This is incorrect since it is the frequency information rather than the category (shoe size) that must be considered. There are 21 women in the sample, so when the shoe sizes are arranged in order of magnitude the median shoe size will be the 11th (middle) value in that list. Counting through the frequency column, the first 5 women take size 4 and the next 6 take size 5, so the 11th value corresponds to a shoe size of 5.

An alternative way of finding the median shoe size in this case is to write out each shoe size once for every time it occurs in the sample, and use this list to work out the median:

4, 4, 4, 4, 4, 5, 5, 5, 5, 5, [5], 6, 6, 6, 6, 6, 6, 6, 7, 7, 8

Using this method it is easier to see that the median shoe size is 5.
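The counting approach can also be sketched in Python by keeping a running (cumulative) total of the frequencies. The function name is hypothetical, and the sketch assumes an odd number of observations, as in this sample of 21.

```python
# Frequency table: shoe size -> number of women (counts taken from the
# written-out list above).
shoe_sizes = {4: 5, 5: 6, 6: 7, 7: 2, 8: 1}

def median_from_frequencies(freq):
    """Median of an odd-sized sample given as {value: frequency}."""
    total = sum(freq.values())
    target = (total + 1) // 2      # position of the middle value (11th of 21)
    running = 0
    for value in sorted(freq):
        running += freq[value]     # cumulative frequency so far
        if running >= target:
            return value           # first category that reaches the middle position

print(median_from_frequencies(shoe_sizes))  # 5
```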

 

Part three: The Mode

What is the mode?

The mode is the value that occurs with the greatest frequency in a dataset. It is representative or typical because it is the most common value. There may be more than one mode in a dataset if several values are equally common; alternatively there may be no mode. In table 3 (frequency distribution of shoe sizes), size 6 is the mode (or modal class), since this shoe size occurs the most frequently (7 times) in the sample.
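The three possibilities (one mode, several modes, no mode) can be illustrated with a short Python sketch. The function name modes is hypothetical, and treating a dataset in which every value occurs exactly once as having no mode is the convention assumed here.

```python
from collections import Counter

def modes(values):
    """Return all most-common values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []                      # empty dataset: nothing to report
    highest = max(counts.values())
    if highest == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

shoe_sizes = [4] * 5 + [5] * 6 + [6] * 7 + [7] * 2 + [8]

print(modes(shoe_sizes))        # [6]
print(modes([1, 1, 2, 2, 3]))   # [1, 2]
print(modes([10, 20, 30]))      # []
```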

When to use the mode

The mode is the only measure of average that can be used with nominal data. For example, late-night users of the library were classified by faculty as: 14% science students, 32% social science students, and 54% biological sciences students. No median or mean can be calculated, but the mode is biological sciences students, as students from this faculty were the most common.
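For nominal data the mode is simply the category with the largest share, which can be sketched in Python as below. The dictionary and variable names are illustrative only.

```python
# Share of late-night library users by faculty (percentages from the text).
users_by_faculty = {
    "science": 14,
    "social science": 32,
    "biological sciences": 54,
}

# The categories have no natural order, so they cannot be ranked or
# averaged; the mode is the category with the largest share.
modal_faculty = max(users_by_faculty, key=users_by_faculty.get)

print(modal_faculty)  # biological sciences
```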

 

Part four: Calculating Averages Using Excel

In addition to the manual methods described above, you can also calculate the mean, median and mode in Excel using special commands. The commands are entered into the formula bar towards the top of the spreadsheet and are preceded by =. This use of = informs Excel that a calculation needs to be performed on the data. The corresponding cells in the spreadsheet show the result of the calculation. The example below shows how Excel can be used to find the mean, median and mode of the student loan data originally given in table 1. Column A shows the different categories of student, whilst column B shows the amounts borrowed.

 

Excel Screen Shot

 

To calculate a mean, Excel uses the term average, and in the example spreadsheet the command =AVERAGE(B2:B7) has been typed in cell D2. This command automatically calculates the mean of the loans in cells B2 to B7. Excel performs the calculation instantly and the mean value of £1,523.33 is immediately shown in cell D2. However, the command used to perform the calculation remains displayed in the formula bar for as long as cell D2 is active (or highlighted), which is indicated by the box surrounding it.

In cell D4 the command for the mode, =MODE(B2:B7), has been entered; however, as no value occurs more than once in this dataset there is no modal value and the result is given as #N/A.

In cell D6 the command for the median, =MEDIAN(B2:B7), has been entered, giving a median loan of £1,525 (half-way between £1,520 and £1,530).

These examples show the quick method of calculating averages using a cell range. Each of the commands can also be written out in a longer format with each of the different amounts of student loan entered as a separate value.

For example, using the command =MEDIAN(1170, 1890, 1530, 1160, 1870, 1520) will produce an identical result to =MEDIAN(B2:B7). However, if the amount of one of the loans in column B is changed, the cell range method will automatically adjust the median, whereas the longer format will require manual adjustment of the command.
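As a cross-check outside Excel, the mean and median of the same six loans can be reproduced with Python's standard statistics module. This is a sketch for comparison only; the variable names are illustrative.

```python
import statistics

# The six loan amounts entered in cells B2 to B7.
loans = [1170, 1890, 1530, 1160, 1870, 1520]

mean_loan = statistics.mean(loans)      # counterpart of =AVERAGE(B2:B7)
median_loan = statistics.median(loans)  # counterpart of =MEDIAN(B2:B7)

print(round(mean_loan, 2))  # 1523.33
print(median_loan)          # 1525.0
```

The mode is left out of this sketch because statistics.mode does not behave like Excel's MODE when no value repeats: from Python 3.8 it returns the first value in the data rather than signalling that there is no mode.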

Where next?

This guide has outlined the three different types of mathematical averages (mean, median and mode) and explained how to calculate them both manually and in the popular spreadsheet package Excel. More resources on Excel can be found here.

If you require specific help with the use of maths or statistics, the Student Learning Centre offers a range of services to meet the needs of both undergraduate and postgraduate students. These include the Maths Help service, a collection of text books and computer-based resources.
