The ONS has embarked on a programme of ‘statistical archaeology’ projects to make more historical census outputs available to the public in digital form. The coming years will see the digitisation of census outputs from 1921 to 1961. The first of these projects focused on the 1961 Census Small Area Statistics tables.
In 1961 local authorities were given the option to obtain, for a fee, additional census outputs down to parish council, ward, and enumeration district level. These Small Area Statistics (SAS) were produced as paper computer printouts and microfilm. The data were never published digitally, and all that remains are scanned images of the printouts and microfilm.
This project aimed to breathe new life into the data by retrieving and processing content from the Office for National Statistics 1961 Census Image Library. The digitisation of the data was carried out by the University of Salford’s Pattern Recognition and Image Analysis Research Lab (PRImA).
The digitisation process started with Optical Character Recognition (OCR) of more than 140,000 digital images of the SAS outputs held by the ONS. This process was able to recognise approximately 95% of the characters and numbers shown in the images.
In an effort to obtain the remaining 5% of data, the PRImA team used the novel approach of crowdsourcing via the citizen science platform Zooniverse. Members of the public classified images by submitting the figures shown in the images presented to them. The Zooniverse project led to more than 2,800 volunteers submitting more than 5 million classifications.
In planning for the 1961 Census the decision was made that topics involving mainly national rather than local statistics, or where the classification was into relatively few groups, were candidates for sample tabulation. Information on economic activity (such as occupation, industry and workplace), education, and household composition was mainly required on a national basis, and while migration was of local interest, the main classifications were short; these were therefore suitable for sample treatment. Population count, housing statistics, information on sex, age and marital condition, and birthplace and nationality were needed for every administrative area and were therefore tabulated on a full count basis.
Consideration of the proposed sample-based tabulations led to the conclusion that a sample of 10% would provide data of sufficient precision. Further detail on the sampling procedures can be found in the 1961 SAS user guide. The tables produced as part of the 1961 SAS were split into those containing full (100%) counts and those containing 10% sample counts.
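As a rough illustration of what a 10% sample means for precision, the sketch below scales a sample count up to an estimated full count and attaches an approximate standard error. It assumes simple random sampling without replacement, which may not match the actual 1961 procedure described in the user guide. The function names and the example figures are hypothetical and are not taken from the SAS tables.

```python
import math


def scale_up(sample_count: int, one_in: int = 10) -> int:
    """Estimate a full count from a count observed in a 1-in-`one_in` sample."""
    return sample_count * one_in


def approx_standard_error(sample_count: int, sample_size: int, one_in: int = 10) -> float:
    """Approximate standard error of the scaled-up estimate.

    Assumption (not from the source): simple random sampling without
    replacement, so the sample count has variance of roughly
    n * p * (1 - p) * (1 - f), where f is the sampling fraction.
    """
    f = 1 / one_in                      # sampling fraction, e.g. 0.1
    p = sample_count / sample_size      # proportion observed in the sample
    var_sample_count = sample_size * p * (1 - p) * (1 - f)
    return one_in * math.sqrt(var_sample_count)


# Hypothetical area: 500 sampled households, 50 of which have some characteristic.
estimate = scale_up(50)
se = approx_standard_error(50, 500)
print(f"estimated full count: {estimate}, approximate standard error: {se:.1f}")
```

In this made-up case the estimate of 500 carries a standard error of about 64, which shows why sample-based figures for very small areas should be read with more caution than the full count tables.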
This release is the first publication from an ongoing project to digitise multiple years of historical census data, ranging from 1921 through to 1961. The data recovered from the original images do not have full coverage and there are missing values. Please consider this a Beta publication. We welcome your feedback on your experience using this data at census.historical.research@ONS.gov.uk.
Start exploring 1961 SAS data on Nomis