These data source overviews are intended to give you a high-level view of new data sources included in the Administrative Data Research Outputs. The emphasis is on the statistical quality of each source and how this affects the scope of its use in producing Research Outputs, rather than on the operational quality of the administrative data source. We expect to update this overview in future years as our understanding and use of the data progress.

2017 publications

The latest version of the income and benefits datasets we received was the 2016 delivery, with the latest complete year across all datasets being the tax year ending 2016. These datasets were used in Research Output publications in 2017. The only exception to this is the “New mothers’ income” publication, which used the 2015 delivery of data.

The 2016 delivery of the income and benefits datasets included Child Benefit, Universal Credit and Personal Independence Payment data. These data were supplied to us by the Department for Work and Pensions, with Child Benefit data originally sourced from HM Revenue and Customs.

The Digital Economy Act 2017 was passed into law in April 2017. The Act gives the Office for National Statistics (ONS) a right of access to information held by government departments, other public bodies, charities and large and medium-sized businesses, for statistics and research purposes.

Previous publications

Source information

Dataset: Income and benefits

Supplier: Department for Work and Pensions (DWP) and HM Revenue and Customs (HMRC)

Version: 2015 delivery

Geography: UK (HMRC datasets) and Great Britain (DWP datasets)

Time period: Data from tax year ending 2010. Latest complete year across all datasets is tax year ending 2014.

Overview

Income and benefits is a collection of datasets we receive, which contain information about individuals and households receiving benefits, individuals in employment (excluding those in self-employment) and individuals receiving a pension. The datasets hold information that helps infer a person’s residency, some components of income, and other characteristics based on interactions with government systems.

The income datasets supplied to us are sourced from HMRC data, which have been provided to DWP. The datasets contain all individuals in the UK whose pensions, pay, National Insurance contributions (NICs) and statutory payments information is reported to HMRC via the Pay As You Earn (PAYE) system (see note 1). The datasets also contain individuals in the UK who are claimants of tax credits, which are administered by HMRC.

The benefits datasets supplied to us have been sourced from DWP systems and contain all individuals who are in receipt of benefits (see note 2) in Great Britain.

Notes

  1. The datasets shared do not contain information about each of these separate components, but instead provide an aggregate income figure for each tax year.

  2. The benefits covered by the data received are Incapacity Benefit, Carer’s Allowance, Income Support, Jobseeker’s Allowance, Attendance Allowance, State Pension, Disability Living Allowance, Severe Disablement Benefit, Widow’s Benefit/Bereavement Benefit, Pension Credit, Employment and Support Allowance and Housing Benefit.

Data-sharing arrangements

The income and benefits datasets were supplied to us under the Statistics and Registration Service Act 2007 legal gateway. A Memorandum of Understanding between HMRC, DWP and ONS was put in place to share anonymised datasets, allowing us to undertake research as part of the drive to use administrative data to improve the quality and availability of data.

The data received are held in a secure environment, which can be accessed only by our analysts who meet a set of security standards.

The Digital Economy Act 2017 amended the Statistics and Registration Service Act 2007 to provide us with greater and easier access to a range of data sources held within the public and private sectors, improving the quality and usability of official statistics and National Statistics. The amendments created a legal gateway for data owners to provide access to data they hold, for us to fulfil our statistical functions. The amended legislation also established the statutory conditions to enable us to work in partnership with data holders to identify and address the main security, privacy and resource implications of our access to data.

In addition to setting out strict limitations on the use of data provided in this way, the legislation also reinforced sanctions for the misuse of data and the main protections set out in the Data Protection Act 1998. These safeguards collectively ensure that data holders and the public can be confident that data will be used in a proportionate and accountable fashion to support the production of statistics and statistical research for the public good.

Content

The datasets contain anonymised individual-level information on those claiming certain benefits, receiving tax credits and receiving pensions or income paid through the PAYE system. The datasets exclude income submitted via self-assessment (including self-employment income).

The datasets contain a unique identifier created by DWP specifically for our purposes. The identifier allows for the linking of the income and benefits data to each other and to the DWP Customer Information System (CIS; see note 1). The address and demographic characteristics of individuals included in the income and benefits datasets are obtained through linking to the CIS.

The datasets include a variable for the income or benefit amount paid to an individual or household and, for some benefits, variables giving the dates of the claims. More information on the content of the datasets was published in the Income Research Outputs Report in December 2016.

Notes

  1. The CIS contains basic information (including name, address and date of birth) on all individuals who have ever had a National Insurance number. For more information on the CIS, see DWP Customer Information System.
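
As a rough illustration of how a shared identifier supports this kind of linkage, the sketch below combines hypothetical income and benefit records and attaches details from a hypothetical CIS extract. The field names (`person_id`, `amount`, `postcode`, `year_of_birth`) and the record layout are assumptions made for illustration only; they are not the actual variables in the datasets.

```python
from collections import defaultdict


def link_to_cis(income_rows, benefit_rows, cis_rows):
    """Sum income and benefit amounts per identifier and attach CIS details.

    All field names are hypothetical; the real datasets may be structured
    differently.
    """
    # Index the CIS extract by the shared identifier for quick lookup.
    cis_by_id = {row["person_id"]: row for row in cis_rows}
    totals = defaultdict(lambda: {"income": 0.0, "benefits": 0.0})

    for row in income_rows:
        totals[row["person_id"]]["income"] += row["amount"]
    for row in benefit_rows:
        totals[row["person_id"]]["benefits"] += row["amount"]

    linked = []
    for person_id, amounts in sorted(totals.items()):
        cis = cis_by_id.get(person_id)  # None when no CIS record matches
        linked.append({
            "person_id": person_id,
            "income": amounts["income"],
            "benefits": amounts["benefits"],
            "postcode": cis["postcode"] if cis else None,
            "year_of_birth": cis["year_of_birth"] if cis else None,
        })
    return linked
```

Records without a matching CIS entry are kept in this sketch, with the address and demographic fields left empty, so that unmatched identifiers remain visible rather than being silently dropped.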

Coverage

The income and benefits datasets include three groups of people. These groups are:

  • all individuals who have income that is submitted to HMRC via the PAYE system from employments (excluding self-employments), occupational pensions or personal pensions

  • all individuals who have claimed tax credits and, where applicable, information on the partner linked to the claim

  • all individuals who claim benefits and, for some benefits, information about other individuals linked to the claim (for example, a partner)

The income and benefits datasets exclude any individual who is not employed, not in receipt of a pension, not claiming benefits and not part of a household claiming benefits. Those excluded include individuals who are self-employed, those whose only source of individual income is a benefit missing from this data supply (such as Child Benefit, Universal Credit or Personal Independence Payment) and those who do not work or claim benefits (for example, children and students).
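
The coverage rule described above can be read as a simple test: a person appears in the data only if they fall into at least one of the three groups. The sketch below expresses that reading in Python. The class and flag names are hypothetical and chosen for illustration; they do not correspond to variables in the datasets.

```python
from dataclasses import dataclass


@dataclass
class PersonFlags:
    """Hypothetical per-person flags; not variables from the actual datasets."""
    has_paye_income: bool = False           # employment or pension income reported via PAYE
    on_tax_credit_claim: bool = False       # claimant, or partner linked to a claim
    on_covered_benefit_claim: bool = False  # claimant, or person linked to a claim, for a benefit in the supply


def appears_in_datasets(person: PersonFlags) -> bool:
    """Return True if the person falls into at least one of the three covered groups."""
    return (
        person.has_paye_income
        or person.on_tax_credit_claim
        or person.on_covered_benefit_claim
    )


# Someone whose only income is from self-employment sets none of the flags.
self_employed_only = PersonFlags()
paye_employee = PersonFlags(has_paye_income=True)
```

Under these assumptions, `appears_in_datasets(self_employed_only)` is false and `appears_in_datasets(paye_employee)` is true, which mirrors the exclusions listed above.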

Statistical use in Administrative Data Census Project

Within the Census Transformation Programme, the income and benefits data were used in the 2016 research outputs as:

  • activity data (see note 1), to improve the coverage of population estimates by providing an indication of whether an individual was likely to be resident at a point in time; further information on the use of the data within the 2016 population research outputs is given in the methodology report

  • characteristics data, to contribute to the production of census-type outputs including income; for more information see the Income Research Outputs Report published in December 2016

Notes

  1. "Activity" can be defined as an individual interacting with an administrative system, for example, for National Insurance or tax purposes, when claiming a benefit, attending hospital or updating information on government systems in some other way. Only demographic information (such as name, date of birth and address) and dates of interaction are needed from such data sources to improve the coverage of our population estimates.
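
To make the idea of "activity" concrete, the sketch below flags whether any recorded interaction date falls within a look-back window before a reference date. This is an illustrative assumption only: the function name, the 365-day default window and the rule itself are hypothetical, and the actual rules used in the population research outputs are those set out in the methodology report.

```python
from datetime import date, timedelta


def shows_activity(interaction_dates, reference_date, window_days=365):
    """Return True if any interaction falls in the window ending on reference_date.

    The window length is an assumed parameter, not a published ONS rule.
    """
    window_start = reference_date - timedelta(days=window_days)
    return any(window_start <= d <= reference_date for d in interaction_dates)


# Hypothetical example: one interaction inside the window, one well before it.
reference = date(2016, 6, 30)
recent = shows_activity([date(2015, 9, 1)], reference)
stale = shows_activity([date(2014, 1, 1)], reference)
```

Only dates of interaction are needed for a check of this kind, which is consistent with the note above that demographic information and interaction dates are sufficient for this use.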

Future

We continue to work closely with the data suppliers and are in discussions about the future long-term supply of the data.