Histograms and Cumulative Frequency
This topic covers how to represent and analyse grouped data using two key diagrams. Histograms show the shape and spread of data, and are especially useful when class widths are unequal, while cumulative frequency graphs allow quick estimation of the median and quartiles.
Part of the ESAT Mathematics 1 syllabus — revision for the Engineering and Science Admissions Test (ESAT), the UAT-UK admissions test for Cambridge, Imperial, Oxford and UCL.
Key points
- The fundamental principle of a histogram is that the AREA of each bar, not its height, is proportional to the frequency.
- The vertical axis of a histogram should be Frequency Density. Plotting plain frequency is acceptable only when all class intervals have equal width.
- For continuous data given to the nearest whole number, a class like '10-14' has boundaries 9.5 and 14.5, making its width 5, not 4.
- Cumulative frequency graphs plot a 'running total' of frequencies on the y-axis against the UPPER BOUNDARY of each class on the x-axis.
- The median and quartiles are found from a cumulative frequency graph by reading across from the appropriate value on the y-axis (e.g., Total Frequency / 2 for the median) and then down to the x-axis.
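The reading-across procedure can be sketched in Python. This is a minimal sketch, assuming the plotted points are joined by straight lines (a cumulative frequency polygon), so that reading across and then down amounts to linear interpolation; a hand-drawn smooth curve would give slightly different estimates. The function names `cumulative_points` and `read_across` are hypothetical, and the data are the parcel masses from the worked example in this topic.

```python
from itertools import accumulate


def cumulative_points(lowest_boundary, upper_boundaries, frequencies):
    """Return the (x, y) points of a cumulative frequency graph.

    The graph starts at (lowest boundary, 0); every other point pairs a
    class's UPPER boundary with the running total of frequencies.
    """
    running_totals = list(accumulate(frequencies))
    return [(lowest_boundary, 0)] + list(zip(upper_boundaries, running_totals))


def read_across(points, target):
    """Estimate the x-value at which the cumulative frequency equals target.

    Assumes straight-line segments between plotted points.
    """
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target lies outside the plotted range")


# Classes 0 < m <= 5, 5 < m <= 15, 15 < m <= 20 with frequencies 10, 12, 18
points = cumulative_points(0, [5, 15, 20], [10, 12, 18])
n = points[-1][1]                       # total frequency, here 40

lower_quartile = read_across(points, n / 4)
median = read_across(points, n / 2)
upper_quartile = read_across(points, 3 * n / 4)
print(lower_quartile, round(median, 2), round(upper_quartile, 2))
```

Under the straight-line assumption, the median estimate falls inside the 5 < m ≤ 15 class, because the cumulative frequency passes N/2 = 20 between the points (5, 10) and (15, 22).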
Formulae
Frequency Density = Frequency / Class Width
Used to calculate the height of a bar for a histogram. This is essential when class intervals are unequal.
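As a small illustration, the formula can be written as a Python function. The name `frequency_density` and its parameters are hypothetical, chosen only for this sketch; the example values are those of the 15 < m ≤ 20 class in the worked example table.

```python
def frequency_density(frequency, lower_boundary, upper_boundary):
    """Height of a histogram bar: frequency divided by class width."""
    class_width = upper_boundary - lower_boundary
    if class_width <= 0:
        raise ValueError("upper boundary must exceed lower boundary")
    return frequency / class_width


# Class 15 < m <= 20 with frequency 18: width 5, so height 18 / 5
height = frequency_density(18, 15, 20)
print(height)  # 3.6
```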
Definitions
- Frequency Density
- The frequency per unit of the data range. It is calculated as Frequency / Class Width and represents the height of a bar in a histogram.
- Cumulative Frequency
- A running total of the frequencies. The value for a given class shows the total number of data points up to and including that class.
- Class Boundaries
- The precise values that separate data classes. They ensure there are no gaps between classes for continuous data (e.g., the upper boundary of one class is the lower boundary of the next).
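The boundary rule for rounded data can be sketched as follows. This assumes the data were recorded to the nearest `unit` (1 for whole numbers), so each stated class limit extends by half a unit in each direction; the function name `class_boundaries` is hypothetical.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) for data rounded to `unit`."""
    half = unit / 2
    return lower_limit - half, upper_limit + half


# The class '10-14' for data given to the nearest whole number
low, high = class_boundaries(10, 14)
width = high - low
print(low, high, width)  # 9.5 14.5 5.0
```

The width comes out as 5 rather than 4 because the class covers every value that rounds to 10, 11, 12, 13 or 14.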
Worked example
The masses of 40 parcels are recorded. The data is summarised in the table below. If a histogram is constructed to represent this data, what is the height of the bar for the 5 < m ≤ 15 kg class?

| Mass (m kg) | Frequency |
|-------------|-----------|
| 0 < m ≤ 5 | 10 |
| 5 < m ≤ 15 | 12 |
| 15 < m ≤ 20 | 18 |
- 1
First, identify the frequency and class width for the target interval, 5 < m ≤ 15.
- 2
The frequency for this class is given as 12.
- 3
The class width is the difference between the upper and lower boundaries:
15 - 5 = 10
- 4
Use the formula for frequency density:
Frequency Density = Frequency / Class Width.
- 5
Substitute the values:
Frequency Density = 12 / 10 = 1.2.
Answer: 1.2
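As a check on the answer, the same calculation can be run for every class in the table. This is an illustrative sketch only; the variable names are hypothetical. It also confirms that multiplying each height by its class width recovers the original frequency, which is the sense in which a bar's area represents frequency.

```python
# (lower boundary, upper boundary, frequency) for each class in the table
classes = [
    (0, 5, 10),
    (5, 15, 12),
    (15, 20, 18),
]

# Bar height = frequency density = frequency / class width
heights = [freq / (upper - lower) for lower, upper, freq in classes]
print(heights)  # [2.0, 1.2, 3.6]

# Bar area = height x width, which should give back each frequency
areas = [h * (upper - lower) for h, (lower, upper, _) in zip(heights, classes)]
total_frequency = sum(areas)
print(round(total_frequency))  # 40
```

Note that the middle class has the largest width but the smallest height, even though its frequency is not the smallest.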
Common mistakes
- Plotting frequency instead of frequency density on a histogram's y-axis. This is the most common error; plotting frequency is only valid if all class widths are identical.
- Miscalculating class width for discrete data. For example, the class '20-24' (inclusive integers) has a width of 5 (from boundaries 19.5 to 24.5), not 4 (from 24 - 20).
- On a cumulative frequency graph, plotting points against the class midpoint instead of the correct upper class boundary.
- Forgetting that the area of a histogram bar represents the frequency. If asked for a frequency, you must calculate height × width (i.e., frequency density × class width).
No-calculator tips
- When calculating Frequency Density (Frequency / Width), always simplify the fraction before converting to a decimal. For example, 18/30 simplifies to 6/10, which is easily seen as 0.6.
- To find quartiles from a cumulative frequency graph with a total frequency N, calculate the exact y-axis positions first: N/4 (Lower Quartile), N/2 (Median), and 3N/4 (Upper Quartile). Working these out first is more reliable than judging the positions by eye.
- When reading values from a graph, check the scale of each axis carefully. A single grid square might represent 2, 5, or 0.5 units, not always 1.
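The first two tips can be checked with exact arithmetic. The sketch below uses Python's standard `fractions` module; the total frequency of 40 is taken from the worked example, and the variable names are hypothetical. `Fraction` reduces fully to 3/5 rather than stopping at 6/10, but both give the same decimal.

```python
from fractions import Fraction

# Simplify before converting to a decimal
density = Fraction(18, 30)      # reduced automatically to lowest terms
print(density)                  # 3/5
print(float(density))           # 0.6

# y-axis positions to read across from, for total frequency N
n = 40
positions = {
    "lower quartile": n / 4,
    "median": n / 2,
    "upper quartile": 3 * n / 4,
}
print(positions)
```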