Quality assurance of administrative data used in consumer price inflation statistics

1. Executive summary

There are a range of consumer price inflation measures in use in the UK; notably the Consumer Prices Index including owner occupiers’ housing costs (CPIH), and the Consumer Prices Index (CPI), which omits these housing costs.

CPIH is the first measure of inflation in our Consumer Price statistics bulletin. It was launched in 2013, but was subsequently de-designated as a National Statistic following the identification of required improvements to the methodology. We have now implemented all of the improvements, and are seeking re-designation for CPIH as a National Statistic.

The construction of CPIH and CPI is complex. Price and expenditure data are required for each of the approximately 700 items in the “basket” of goods and services. A variety of different data sources are used for this purpose.

The data used in the compilation of CPIH and CPI can be categorised as follows:

Price collection from shops in various locations around the country (commonly referred to as the “local” collection), which is contracted to an external company called TNS
Individual prices collected through a website, phone call to the supplier, or from a brochure.
Expenditure weights or prices calculated from survey data, which are sourced from within ONS, or from another government department.
Expenditure weights or prices calculated from administrative data, which taken from or compiled within ONS, other government departments, or commercial companies.

The owner occupiers’ housing costs (OOH) component of CPIH uses 4 administrative data sources to calculate the cost of owning, maintaining and living in one’s home; these are sourced from the Valuation Office Agency (VOA) in England, and from the Welsh and Scottish governments (Northern Ireland data are currently used from the TNS collection). Data from a range of sources are used to weight price data to reflect the owner occupied housing market.

Incorporating so many different data sources into any statistic, but particularly one used as a key economic measure, involves a certain degree of risk. Administrative data in particular may be collected and compiled by third parties, outside the Code of Practice for official statistics.

Our production processes are certified under an external quality management system: ISO9001: 2015. However, to further assure ourselves and users of the quality of our statistics, we have undertaken a thorough quality assessment of these data sources. This assessment is a continuous process, and we will publish updates periodically.

We have followed the Quality Assurance of Administrative Data (QAAD) toolkit, as described by the Office for Statistics Regulation (OSR). Using the toolkit, we established the level of assurance we are seeking (or “benchmark”) for each source. The assurance levels are set as either “basic”, “enhanced” or “comprehensive”, depending on:

the risk of quality concerns for that source, based on various factors, such as the source’s weight in the headline index, the complexity of the data source, contractual and communication arrangements currently in place, and other important considerations
the public interest profile of the item which is being measured, and its contribution to the headline index

The majority of items in the consumer prices basket of goods and services are constructed from just three key sources of data: the local price collection from TNS, expenditure data from Household Final Consumption Expenditure in the national accounts, and further expenditure data from the Living costs and Food Survey. This means that there are a few sources which will need a higher level of assurance, and many sources which are only used for one component of the index and so do not require a particularly high level of assurance.

Through engagement with our suppliers, we have assessed the assurance level that we have currently achieved by considering:

the operational context of the data; why and how it is collected
the communication and agreements in place between ourselves and the supplier
the quality assurance procedures undertaken by the supplier
the quality assurance procedures undertaken by us

The details below summarise the quality assurance benchmarks that were set, and the assurance levels that we have assessed each source at during this assessment.

For used cars Autotrader, the:

risk was low
profile was high
benchmark QA level was enhanced
achieved assessment is still in progress

For RDG LENNON, the:

risk was low
profile was high
benchmark QA level was enhanced
achieved assessment is still in progress

For Valuation Office Agency rental price data, the:

risk was medium
profile was high
benchmark QA level was comprehensive
achieved assessment was comprehensive

For Welsh Government rental price data, the:

risk was low
profile was medium
benchmark QA level was enhanced
achieved assessment was comprehensive

For Scottish Government rental price data, the:

risk was low
profile was medium
benchmark QA level was enhanced
achieved assessment was comprehensive

For Mintel, the:

risk was medium
profile was medium
benchmark QA level was enhanced
achieved assessment was comprehensive

For Glasses, the:

risk was low to medium
profile was low
benchmark QA level was basic
achieved assessment was basic (incomplete)

For Moneyfacts, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For HESA, the:

risk was low
profile was high
benchmark QA level was basic
achieved assessment was basic

For Consumer Intelligence, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For Kantar, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For IDBR, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For Website, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For Direct Contact, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For Brochures, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic

For HHFCE, the:

risk was high
profile was high
benchmark QA level was comprehensive
achieved assessment was enhanced

For LCF, the:

risk was low
profile was high
benchmark QA level was enhanced
achieved assessment was comprehensive

For TNS, the:

risk was medium
profile was high
benchmark QA level was comprehensive
achieved assessment was comprehensive

For BEIS, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic to enhanced

For IPS, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment was basic to enhanced

For Home and Communities agency, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment is still in progress

For the Department of Transport, the:

risk was low
profile was low
benchmark QA level was basic
achieved assessment is still in progress

As a result of this assessment, we have put in place an action plan to improve our quality assurance in some areas:

Household Final Consumption Expenditure (HHFCE) data also require a comprehensive level of assurance; however, we would like more information on the complex array of data sources used to compile the statistics. HHFCE QAAD assessment, still in progress
finally, there are a number of data sources which require basic assurance, for which we have not received all the requested quality assurance information; we will work with these suppliers to gain the level of assurance we require

We will continue to engage with our data suppliers to better understand any quality concerns that may arise, and to raise their understanding of how their data are used in the construction of consumer price inflation measures. The QAAD will be updated as new alternative data sources are introduced into live production.

Back to table of contents

2. Introduction

There are currently two key consumer price inflation measures in the UK. The Consumer Prices Index including owner occupiers’ housing costs (CPIH) is the first measure of consumer price inflation in our statistical bulletin, and is currently the most comprehensive measure of inflation. This addresses some of the shortcomings of the Consumer Prices Index (CPI), which is an internationally comparable measure of inflation, but does not include a measure of owner occupiers’ housing costs (OOH): a major component of household budgets¹. Both of these measures are based on the same data sources (with the exception of OOH and Council Tax, which are in CPIH but not CPI). These data sources are numerous and often complex. We therefore seek to assess the quality of each of these sources.

Our assessment of data sources is carried out in accordance with the Office for Statistics Regulation's Quality Assurance of Administrative Data (QAAD) toolkit. We are striving for a proportionate approach in assessing the required level of quality assurance for the many and varied data sources used in the compilation of CPI and CPIH. We seek to highlight and address the shortcomings that we have identified, and reassure users that the quality of the source data is monitored and fit for purpose.

In this paper, we set out the steps we have taken to quality assure our data, and our assessment of each source. In section 3 we discuss important quality considerations for CPIH and CPI. In section 4 we outline our approach to assessing our data sources. In section 5 we discuss the assurance levels we are seeking for each data source, and the resulting assessment and, in section 7, we detail our next steps towards achieving full assurance. Our detailed quality assurance information for each source is provided in Annex A.

This publication is part of an ongoing process of dialogue with our suppliers, to increase our understanding of any quality concerns in the source data, and to raise awareness of how it is utilised. Through this document, we aim to provide information and assurance to users that the sources used to construct our consumer price inflation measures are sufficient for the purposes for which they are used. We will therefore review this document every 2 years. We do not address the construction of, or rationale for, our OOH measure in CPIH here. This is discussed in detail in the CPIH Compendium. For more information on our consumer price inflation measures, please refer to our Quality and Methodology Information page.

Notes for: Introduction

The Retail Prices Index (RPI) is a legacy measure, only to be used for the continuing indexation of index linked gilts and bonds. It is not a National Statistic.

Back to table of contents

3. Quality considerations

When considering the quality of UK consumer price inflation measures, there are some broader considerations that users should bear in mind. The first is the de-designation of CPIH as a National Statistic in 2014. The second is external accreditation under ISO9001:2015 for consumer price statistics processes. These are described in more detail in this section. Detail on the quality assurance procedures applied to our statistics is reproduced in Annex B.

3.1 Loss of National Statistics status

CPIH was introduced in early 2013, following a lengthy development process overseen by the Consumer Prices Advisory Committee (CPAC) between 2009 and 2012. CPIH became a National Statistic in mid-2013, but was later de-designated in 2014 after required improvements to the OOH methodology were identified. These were:

improvements to the process for determining comparable replacement properties when a price update for a sampled property becomes unavailable, leading to more viable matches
bringing the process for replacing properties for which there is no comparable replacement into line with that used for other goods and services in consumer price statistics
optimising the sample of properties used at the start of the year, to increase the pool of properties from which comparable replacements can be selected
reassessing the length of time for which a rent price can be considered valid before a replacement property is found

The required methodological improvements were implemented in 2015, and the series was fully revised to accommodate these changes. On 3 March 2016, the Office for Statistics Regulation (OSR) released their assessment report on CPIH, reviewing the statistic against all areas of the Code of Practice for Official Statistics.

We have subsequently undertaken an assessment of all data sources used in the production of CPIH using the OSR’s Quality Assurance of Administrative Data toolkit (QAAD). We have aimed to demonstrate that we have investigated, managed and communicated appropriate and sufficient quality assurance of all our data sources. Additionally, we have published a range of supporting information, such as the CPIH Compendium, which sets out the rationale for our choice of OOH measure, and the methodology behind it, the Comparing measures of private rental growth in the UK article, and the Understanding the different approaches of measuring owner occupiers’ housing costs article. The CPIH was re-designated as a National Statistic on 31 July 2017. More details on CPIH can be found via the CPIH compendium.

3.2 ISO9001 Accreditation

Prices Production areas are externally accredited under the quality standard ISO9001:2015. This is an international standard based on a set of quality management principles:

customer focus
leadership
engagement of people
process approach
improvement
evidence-based decision making
relationship management

It promotes the adoption of a process approach, which will enable understanding and consistency in meeting requirements, considering processes in terms of added value, effective process performance and improvements to processes based on evidence and information. In other words, the main purpose of this standard is to ensure the quality of our production processes, to ensure that we fully evaluate risks and to ensure that we strive for continuous improvement.

The standard is applied to all areas of production involved in the compilation of the whole range of consumer price inflation statistics. Prices documentation is reviewed by trained internal auditors, based on an annual cycle planned by the quality manager. The depth of the audit is based on how frequently the processes change. A review by an external auditor is also conducted on an annual basis, and a 3-year strategic review is also conducted to assess suitability for re-certification.

Back to table of contents

4. Approach to assessment

We have conducted our assessment of data sources used in Consumer Prices Index including owner occupiers’ housing costs (CPIH) using the Office for Statistics Regulation’s QAAD toolkit. We took the following steps for each data source:

establish the risk of quality concerns with the data
establish the level of public interest in the item that the data are being used to measure
determine benchmark quality assurance levels, based on the risk and public interest.
contact the suppliers of administrative data to understand their own practices and approach to quality assurance; generally, this consists of the following steps:
- send out questionnaires to our data suppliers requesting information on their QA procedures
- conduct follow up meetings with our data suppliers to request further information and clarification
- maintain ongoing dialogues with data suppliers to develop a better understanding of any quality issues in the data, and raise awareness of how the source data are used
review our own quality assurance and validation procedures and processes
conduct an assessment of each data source using the four practice areas of the Quality Assurance of Administrative Data (QAAD) toolkit:
- operational context and data collection
- communication with data suppliers
- quality assurance procedures of the data supplier
- quality assurance procedures of producer
determine an overall quality assurance level based on our assessment
if this assurance level does not match the benchmark assurance level, then put steps in place to work towards meeting the required assurance level
review the quality assurance on an ongoing basis; we will publish a QAAD update every 2 years

4.1 Setting the benchmarks

In accordance with the QAAD toolkit, we have sought assurance for each data source based on the risk of quality concerns associated with that data source, and the public interest in the particular item being measured by that data source.

We considered a high, medium or low risk of data quality concerns based on:

the weight that the item being measured by a particular data source carries in headline CPIH or Consumer Price Index (CPI); we consider items with a weight less than 1.5% to be very small, items with a weight between 1.5% and 5% to be small, items with a weight between 5% and 10% to be medium, and items with a weight higher than 10% to be large.
the complexity of the data source; for example, whether it is compiled from a number of different sources, or based on survey data, which we would consider to be lower risk due to the fact that data are collected for statistical purposes and have a holistic, well designed collection strategy, their reliability is better understood, and quality assurance and validation procedures are typically robust
the existing contractual and communication arrangements currently in place
how much the measurement of a particular item depends on that data source (in other words, what would we do if we did not have this data?)
other considerations, such as any existing published information on data collection, methodology or quality assurance, or mitigation of high risk factors with the data

We considered a high, medium or low public interest profile based on:

the level of media or user interest in the particular item being measured
the economic or political importance of the particular item being measured
the contribution of the item being measured to the headline index, since we would consider both CPIH and CPI to be economically and politically important
any additional scrutiny from commentators, based on particular concerns about the data

Together the risk of quality concerns and public interest profile are combined to set an overall assurance level that is required for a particular source. For more information on how we assessed the overall assurance level, please see the UK Statistics Authority Quality assurance matrix.

4.2 QAAD practice areas

We have aimed to assess the quality of each data source based on four broad practice areas. These relate to the quality assurance of official statistics and the administrative data used to produce them: our knowledge of the operational context in which the data are recorded, building good communication links with our data suppliers, an understanding of our suppliers’ quality processes and standards, and the quality processes and standards that we apply. This is in line with the Office for Statistics Regulations expectations for quality assurance of data sources. The full assessments for each data source can be found in Annex A.

Breakdown of the four practice areas associated with data quality

The operational context and admin data collection practice area covers:

the environment and processes for compiling the administrative data
factors which affect data quality and cause bias
safeguards which minimise the risks
the role of performance measurements and targets; the potential for distortive effects

The communication with data partners practice area covers:

collaborative relationships with data collectors, suppliers, IT specialists, policy and operational officials
formal agreements detailing arrangements
regular engagement with collectors, suppliers and users

The Quality Assurance (QA) principles, standards and checks by data suppliers practice area covers:

data assurance arrangements in data collection and supply
quality information about the data from suppliers
role of operational inspection and internal/external audit in data assurance processes

The producers’ QA investigations and documentations practice area covers:

QA checks carried out by the statistics producer
quality indicators for input data and output statistics
strengths and limitations of the data in relation to use
explanation for users about the data quality and impact on the statistics

Back to table of contents

5. Assurance level assessment

5.1 Setting the benchmarks

In this section we describe each of our data sources, and consider the assurance level that we are seeking (or “benchmark”) for these. We also summarise our current assessment of the data and outline any further steps that may be required to reach the benchmark assurance level. We will also use this process to build engagement with our suppliers to better understand the data source, as well as raising awareness of how the data are used in consumer price inflation statistics.

In the section that follows, the weights provided are for Consumer Prices Index including owner occupiers’ housing costs (CPIH) (the first measure of consumer price inflation in our bulletin) in February 2017 (expect for rail fares which has been updated and provides weights information in February 2024).

It is a feature of consumer price statistics that we require a data source for each of the approximately 700 items in the basket of goods and services. The majority of the index is constructed from just three data sources – the local price collection, conducted by an external company called TNS, and expenditure data from the Living Costs and Food Survey (LCF) and the Household Final Consumption Expenditure (HHFCE) branch of the national accounts.

Remaining items tend to be constructed from data sources which are quite specific to the item being measured. A consequence of this is that the distribution of assurance levels required for assessment is very heavily weighted towards basic assurance. This is because we have a few data sources which are used for the vast majority of items, and relatively few items which all require a bespoke data source.

Benchmark assurance levels are summarised below. The assurance levels required for this QAAD assessment are set out in detail below, with explanations provided accordingly. The assurance levels are based on an assessment of the risk of quality concerns, and the public interest profile, as described in section 3.2. These are used to set the overall assurance level.