1. Disclaimer

These Research Outputs on income are not official statistics. Rather, they are published as outputs from research into a methodology that differs from the one currently used in the production of income statistics. These outputs must not be interpreted as an indicator of poverty or living standards.

It’s important that the information and research presented here be read alongside the outputs to aid interpretation and avoid misunderstanding. These outputs must not be reproduced without this disclaimer and warning note.

2. Main points

  • These are the second Administrative Data Census Research Outputs on income; they are the continuation of our research to see if it’s feasible to produce small-area multivariate income outputs.

  • In response to user feedback, this year’s publication includes both individual and household gross income distributions (see note 1) at lower layer super output area (LSOA) level for the tax year ending 2016.

  • At this stage, the Research Outputs are limited to income from the Pay As You Earn (PAYE) and benefits systems (which include tax credits); therefore, a number of components of income are missing, for example, income from self-employment and investments taxed via Self Assessment – for more information on the data sources included in the Income Research Outputs, see section 5 of the 2016 publication and the Data Source Overview: Income and Benefits.

  • Despite the limitations of the analysis, we’re encouraged to see that results broadly reflect the patterns we expect; for example, a higher percentage of both individuals and households fall within the higher income bands in London and the south east.

  • We’d like your feedback on this publication and our future plans; in particular, we’d welcome feedback on the income definition used in this publication and whether this definition meets your needs – please send your feedback to Admin.Data.Census.Project@ons.gsi.gov.uk.

Notes for: Main points

  1. Gross income is income before deductions such as National Insurance contributions and tax.

3. Things you need to know about this release

  • These Research Outputs are not official statistics and they should not be interpreted as an indicator of poverty or living standards; rather, they are published as outputs from research into a methodology that differs from the one currently used in the production of official income statistics.

  • The official estimates of household income for small geographic areas are the small area model-based income estimates.

  • The term “income” used throughout this publication refers to income estimated from Pay As You Earn (PAYE) and benefits (which includes tax credits) data only.

  • Income recorded via Self Assessment (which includes income received from self-employment, property rental and investments) and from some benefits isn’t included in these Research Outputs.

  • The outputs refer to income received during the tax year ending 2016.

  • Two main methodological factors have changed since last year’s Income Research Output publication: the inclusion of Child Benefit data in the income measure, and the use of Statistical Population Dataset (SPD) V2.0 (see note 1) as the population base.

  • Due to these changes, you shouldn’t compare results in this publication with results in the 2016 publication as a time series; furthermore, these Income Research Outputs haven’t been adjusted for inflation.

  • Our work to produce estimates of the number of “households” from administrative data has so far produced estimates of the number of “occupied addresses” – as a result, these Income Research Outputs for households are based on the concept of an “occupied address”.

  • When the outputs refer to household income, they’re actually representing income at an address; you can find more information on the differences between a traditional “household” and an “occupied address” in the Occupied Address Research Outputs publication.

  • Household income was equivalised using the Organisation for Economic Co-operation and Development (OECD)-modified equivalence scale; for more information on the methodology used to produce household income, see section 8.

Notes for: Things you need to know about this release

  1. A Statistical Population Dataset (SPD) is a single, coherent dataset that forms the basis for estimating the population. It’s produced by linking records across multiple administrative data sources and applying a set of inclusion and distribution rules.

4. What’s our overall aim?

In last year’s publication, we outlined the user need for small area, multivariate income outputs. This need can’t be met by including a question on the census questionnaire, because of the negative effect on response and the poor quality of the resulting data.

Therefore, we’re carrying out research to see if it’s feasible to fill this gap in official statistics by using administrative data both as an enhanced output for the 2021 Census and within an Administrative Data Census. The methodology presented here could be used in either case, although it’s currently presented using an administrative data population base.

Our current focus is on producing small area income outputs and understanding the precise user needs for the definition of income. We’ll then extend the research to multivariate outputs. While there are still components of income missing in the data, this publication represents the type of outputs that could be provided to meet the most commonly stated user need in last year’s feedback (for more information, please see Annex A).

In our next release, we plan to reproduce the estimates for the tax year ending 2016, but with the addition of income processed through the Self Assessment system, subject to the data being provided to the Office for National Statistics (ONS). The Digital Economy Act 2017 provides a legal gateway for the Self Assessment data to be shared and we’re currently working with HM Revenue and Customs (HMRC) to achieve this.

The users who provided feedback to last year’s publication told us they needed household income rather than individual income. They also said that income outputs at lower layer super output area (LSOA) level would be sufficient. This year we’ve included LSOA distributions for both household and individual income.

In the 2021 Census topic consultation some users presented a need for income data at output area level. Therefore, we originally intended to produce this in 2018, subject to data quality and availability. The National Statistician’s Data Ethics Advisory Committee suggested that we engage with users to fully understand the demand for output area level data. If you do have a need for income outputs at output area level, please contact us to tell us about your needs.

The majority of users said gross income was the correct definition to work towards, while some stated a preference for net income. Some feedback suggests that users interested in the lower end of the distribution felt gross income would be sufficient, whereas those interested in higher incomes would require net income.

It’s important that we understand requirements as we look to provide more measures, such as median income. Therefore, we would appreciate more information from you about your needs for gross or net income. Please send your feedback to Admin.Data.Census.Project@ons.gsi.gov.uk.

This research runs alongside other work reviewing how we collect income data. This includes a project looking at combining our surveys that collect income information into a single Household Finance Survey.

Our surveys currently produce a range of income statistics that don’t meet all user needs for income information. We’re working with those producing income outputs across the Government Statistical Service to make sure there’s a coherent approach to meeting as many user needs as possible. This is likely to be through a combination of administrative and survey data.

Definition of income

The definition of income we’re continuing to work towards for both individuals and households is “gross income”, as outlined in the Canberra Group Handbook. The components of gross income included in the target definition are:

  • income from employment (which includes employee income and income from self-employment)

  • property income (which includes investment income from both financial and non-financial assets)

  • current transfers received (which includes social security schemes and pensions)

For households, we’ve equivalised our measure of gross income using the Organisation for Economic Co-operation and Development (OECD)-modified equivalence scale. Equivalisation adjusts household income to reflect the different resource needs of households of different sizes and compositions. The OECD-modified equivalence scale uses single-adult households as the reference group, with a value of one.

Although the OECD-modified scale is generally used to equivalise net or disposable household income measures, we aren’t aware of an alternative scale for use with gross household income measures. For more information on how we used the OECD-modified scale in our analysis, see section 8.

To inform future development of this work, we’re keen to understand your needs for different income definitions. Please send your feedback to Admin.Data.Census.Project@ons.gsi.gov.uk.

5. What do the Income Research Outputs measure and which population groups do they cover?

This section discusses the income measure used in the Income Research Outputs, with reference to both the coverage of income sources and coverage of the population. This helps to inform interpretation of the outputs and identify limitations in their use.

The income sources and income measure

In section 5 of the 2016 publication, we discussed the availability of administrative data sources for the different components of gross income. For the 2017 publication of the Income Research Outputs, the overall availability of data sources on gross income has largely stayed the same. However, since last year the Department for Work and Pensions (DWP) has shared one more administrative data source with us – Child Benefit data (see note 1). Child Benefit data were originally sourced from HM Revenue and Customs (HMRC) and were provided to DWP and passed to the Office for National Statistics (ONS) with HMRC agreement.

The addition of Child Benefit data is helping us move towards the full picture of income. However, we still don’t have full availability of data on all components and the income measure used in the Research Outputs doesn’t meet the internationally agreed definition of gross income (as defined in the Canberra Group Handbook).

Components of income included in the 2017 Income Research Outputs are:

  • gross earnings (net of pension contributions) from employment, including any benefits in kind paid through Pay As You Earn (PAYE)

  • state support – most benefits, including tax credits and Child Benefit (see Annex B of the 2016 publication for more information)

  • income from occupational and personal pensions included on the PAYE dataset

Components of income included in the international definition of gross income (as defined in the Canberra Group Handbook) but excluded from the income measure in the 2017 Income Research Outputs are:

  • income from self-employment or income from an employer not paid via PAYE

  • investment income including interest from Individual Savings Accounts (ISAs) and other saving accounts, bonds, stocks and shares

  • some state support – some benefits including Winter Fuel Payment, Universal Credit and Personal Independence Payment

We aim to include as many of these components as possible in the future with improved methods and availability of administrative data.

For further information on the income components included in the income measure, see section 6 of the 2016 Income Research Output publication.

Coverage of the England and Wales population – individuals

The 2017 Income Research Outputs cover some income information for 88% of people aged 16 and over in England and Wales on Statistical Population Dataset (SPD) V2.0 (see note 2). This is an increase of one percentage point compared with the 2016 Income Research Outputs and is likely to be due to Child Benefit data being included.

Figure 1 shows the percentage of the population with some PAYE and benefits income information in the 2016 publication compared with the 2017 publication of the Income Research Outputs by age and sex. The overall pattern of coverage for both men and women hasn’t changed with coverage generally increasing with age for both sexes. For a detailed discussion of the coverage pattern by age and sex and the potential reasons for this, see section 6 of the 2016 Income Research Output publication.

There has been little change in coverage of the male population between the 2016 and 2017 publications. However, for females there has been a notable increase in coverage between the ages of 25 and 56 years. The inclusion of Child Benefit data in the 2017 publication largely explains this increase, as 89% of Child Benefit claimants included in the outputs were female.

For both men and women in the 2016 and 2017 Income Research Outputs, coverage is highest for those of State Pension age. By age 67 for men and age 64 for women, some income information is available for more than 99% of the population. However, between the reference date for the 2016 publication and the reference date for this publication, State Pension age increased for women and this increase is reflected in the coverage. In the 2016 Income Research Outputs, the increase in coverage occurs between ages 60 and 61. In this year’s outputs, coverage doesn’t increase notably until ages 61 and 62.

At local authority (LA) level, the percentage of the aged 16 and over population on Statistical Population Dataset (SPD) V2.0 with some PAYE and benefits income information varied from 74% to 93%. Not all individuals in England and Wales will have had income as defined by our target definition. Therefore, we didn’t expect to obtain income information for 100% of the aged 16 and over population.

Overall coverage at LA level increased compared with last year’s publication due to the addition of Child Benefit data to the income outputs. Last year, 84 LAs in England and Wales had some PAYE and benefits income information for at least 90% of their 16 and over population. This has increased to 151 LAs this year (Figure 2).

LAs with some PAYE and benefits income information for less than 80% of their aged 16 and over population included some LAs in London as well as Richmondshire, Oxford and Cambridge. This was similar to last year. For information on why population coverage is lower in these LAs see section 6 of the 2016 publication.

Coverage of the population with PAYE and benefits income information varies more at lower layer super output area (LSOA) level than at LA level (these data are available in the related download). Factors discussed in section 6 of the 2016 publication that affect LAs explain some of this variation in coverage at LSOA level. For example, because of limitations of the methodology, LSOAs where armed forces make up a large proportion of the population may have lower coverage (for more information, please see section 8). LSOAs with large student populations may also have lower coverage, as student loans aren’t included in our definition of gross income.

Coverage of the England and Wales population – households

Of households on Statistical Population Dataset (SPD) V2.0 with a valid Unique Property Reference Number (UPRN; see note 3), 82% had some income information available for all household members aged 16 and over for the tax year ending 2016. Of the remaining households, 15% had income information available for some, but not all, household members aged 16 and over, and 3% had “no income information” available for any household members aged 16 and over (Table 1).
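The three coverage categories described above can be illustrated with a short sketch. This is not the code used to produce the outputs; the function name, the category labels and the input layout (a list of age and income-flag pairs) are assumptions made for illustration only.

```python
def coverage_category(members):
    """Classify a household by income coverage of its members aged 16 and over.

    members: list of (age, has_income_info) tuples, one per household member.
    Both the tuple layout and the returned labels are hypothetical.
    """
    # Only members aged 16 and over count towards the coverage category.
    adult_flags = [has_info for age, has_info in members if age >= 16]
    if not adult_flags:
        return None  # no members aged 16 and over, so no category applies
    if all(adult_flags):
        return "all"
    if any(adult_flags):
        return "some"
    return "none"
```

For example, a household with two adults where only one has PAYE or benefits income information would fall into the "some" category, regardless of any children at the address.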

Table 1 also shows that household composition varied between households with different levels of coverage. For example, 31% of households where income information was available for some, but not all, household members had at least one individual aged 16 to 18 years. This compared with 3% of households where income information was available for all household members aged 16 and over.

We don’t necessarily expect all household members aged 16 and over to have income information that falls under our definition of gross income. In particular, we expect lower coverage for 16- to 18-year-olds as described in section 6 of the 2016 Income Research Outputs. However, for other household members, we’re missing income information that should be included under our income definition, for example, income processed through Self Assessment.

These two factors make it difficult to interpret coverage of these households. This interpretation will be easier in future releases when Self Assessment data are included. We’ll also be able to use data on household composition (due to be published in 2018) in future releases to look at the patterns of income and data coverage further.

Income information was included in the household outputs for all individuals (including those aged under 16 years). All individuals were also included in the equivalisation calculation regardless of whether income information was available for them or not. Therefore, for households where income information was available for some, but not all, household members, the household measure is likely to be more biased than for households where income information was available for all members.

Notes for: What do the Income Research Outputs measure and which population groups do they cover?

  1. Child Benefit is a state benefit administered by HMRC. People responsible for a child under 16 years old (or under 20 years old and in approved education or training) are eligible to apply for Child Benefit. However, only one person can claim Child Benefit for a child.

  2. A Statistical Population Dataset (SPD) is a single, coherent dataset that forms the basis for estimating the population. It’s produced by linking records across multiple administrative data sources and applying a set of inclusion and distribution rules.

  3. A Unique Property Reference Number (UPRN) is a unique alphanumeric identifier for every spatial address in Great Britain and can be found in Ordnance Survey’s address products.

6. What do the outputs show for individuals?

Please make sure that you’ve read the “Disclaimer” and “Things you need to know about this release” sections before continuing. For information on how these individual income distributions were constructed, please see the methodology in section 8. As explained in section 5, coverage of these individual outputs varies by age and sex.

Geographical comparisons

Figure 3 shows the Pay As You Earn (PAYE) and benefits individual income distribution of the population aged 16 and over in England and Wales for the tax year ending 2016.

For England and Wales combined, income of the population aged 16 and over was skewed towards the lower income bands. It’s important to note that the Income Research Outputs often underestimate an individual’s income. For more information on the components of income that aren’t included in the Research Outputs, please see section 5.

Income information wasn’t available for 12% of the aged 16 and over population in England compared with 10% in Wales.

Higher percentages of the aged 16 and over population in England fell within the higher income bands compared with Wales.

Figure 4 shows the percentage of the aged 16 and over population who had an annual income of £20,000 or below in the tax year ending 2016 for each lower layer super output area (LSOA) in England and Wales.

LSOAs shaded darker on the map indicate areas where a greater percentage of the population had an income of £20,000 or below.

LSOAs with lower percentages of their population with an income of £20,000 or below were clustered around London and the south east of England.

7. What do the outputs show for households?

Please make sure that you have read the “Disclaimer” and “Things you need to know about this release” sections before continuing. For information on how these household income distributions were produced, see the methodology in section 8. Coverage of these household outputs varies by household type. For more information on this, and the potential bias introduced, see section 5.

Geographical comparisons

Figure 5 shows the gross equivalised household Pay As You Earn (PAYE) and benefits income distribution for England and Wales.

For England and Wales combined, household income was skewed towards the lower income bands. It’s important to note the Income Research Outputs often underestimate equivalised household income due to components of income being missing from the income measure. For more information on the components of income that aren’t included in the Research Outputs see section 6 of the 2016 publication.

Higher percentages of households in England fell within the higher income bands in comparison with Wales.

For 2% of households in Wales and 3% of households in England, no income information was available for any household member aged 16 and over.

The £20,000.01 to £30,000.00 income band was the most common for gross equivalised household incomes in both England and Wales.

Figure 6 shows the distribution of gross equivalised household income, from PAYE and benefits, by region.

Much like individual income, higher percentages of households in London and the South East fell within the higher income bands compared with other regions. All regions except London had a similar percentage of households within the £20,000.01 to £30,000.00 income band, only varying by one percentage point.

London was the region with the greatest percentage of households with “no income information”, around 5%. In all other regions this was below 3%.

Figure 7 shows the percentage of households that had an equivalised annual income of £20,000 or below in the tax year ending 2016 for each lower layer super output area (LSOA) in England and Wales.

LSOAs shaded darker on the map indicate areas where a greater percentage of households had an equivalised income of £20,000 or below.

LSOAs with lower percentages of households with an equivalised income of £20,000 or below were clustered around London (to the south) and the south east of England. There were also a few in the east and east midlands.

The south west, Wales and the north east were the main areas with the greatest percentages of households with an equivalised income of £20,000 or below.

The following results demonstrate the type of analysis that could be possible with the Income Research Outputs in the future. Due to limitations of the current methodology, these results must be interpreted with caution. In particular, the population groups analysed are affected differently by coverage of the Income Research Outputs. Coverage figures for each of these population groups at local authority (LA) level are available in the data downloads accompanying Figures 8 and 9.

In all LAs, a greater percentage of households had equivalised incomes of £20,000 or below when all household members were aged 60 and over compared with all other households (Figure 8). For both groups, LAs where fewer households had equivalised incomes of £20,000 or below were clustered around London and the south east.

In the majority of LAs, a greater percentage of households had equivalised incomes of £20,000 or below when the household included people aged under 16 years (Figure 9). This difference is expected, as in households with people aged under 16 years, fewer household members are likely to contribute to the total household income.

Greater variation in the percentage of households with an income of £20,000 or below was observed across LAs in London for households with people aged under 16 years.

8. Methodology

Main points about the method

  • Administrative datasets on Pay As You Earn (PAYE) income and benefits (including tax credits) were linked to derive gross annual estimated income amounts.

  • This year we added Child Benefit data to the income measure as an additional administrative data source.

  • Statistical Population Dataset (SPD) V2.0 (see note 1) was used as the population base and income information was included only for individuals present on SPD V2.0.

  • We’ve used the concept of an “occupied address” rather than a traditional “household” to produce the household income distributions; you can find more information on the differences between a traditional “household” and an “occupied address” in the Occupied Address Research Outputs publication.

  • Income information for people aged 16 and over was included in the individual income measure; income of all household members (including people aged under 16 years) was included in the household income measure.

  • Income information for many of the armed forces population wasn’t included in either the individual or household income measures – the armed forces population was added to SPD V2.0 at aggregate level; therefore, for the majority of the armed forces, it wasn’t possible to link their individual income information at record level or to allocate them to a Unique Property Reference Number (UPRN; see note 2).

Method used to produce individual annual income amounts

The 2017 Income Research Outputs were produced by combining income information from five administrative data sources shared with us by the Department for Work and Pensions (DWP). These were:

  • HM Revenue and Customs (HMRC) tax credit data

  • HMRC’s Pay As You Earn (PAYE) data

  • DWP’s National Benefits Database (NBD)

  • DWP’s Single Housing Benefit Extract (SHBE)

  • HMRC’s Child Benefit data (new in the 2017 publication)

For more information on these data sources, see section 5 of the 2016 publication and the Data Source Overview: Income and Benefits.

The fifth administrative data source included in this year’s Income Research Outputs was HMRC’s Child Benefit data. The dataset supplied to us contained Child Benefit information at individual child level. This dataset contained a unique identifier for each child alongside one for the corresponding claimant with start and end dates for the claim. Claimants appeared more than once in the dataset if they claimed Child Benefit for more than one child during the tax year.

The amount of Child Benefit received for a claim wasn’t included in the dataset. Therefore, to calculate an annual income amount from Child Benefit for each claimant, we used the standard weekly rates of Child Benefit.

We allocated the rate for a first child (£20.70) to each claimant’s longest claim within the tax year and the rate for additional children (£13.70) to all other claims. We then multiplied these weekly rates by the length of each claim (in weeks) to get a claim amount. Following this, we aggregated these amounts for each claimant to get their annual Child Benefit income amount. As described in the 2016 publication, we also calculated an annual individual income amount for each of the other administrative data sources.
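The Child Benefit calculation described above can be sketched as follows. This is a minimal illustration rather than the production method: the function name and input format are hypothetical, claim lengths are assumed to be supplied already in weeks, and a tie for the longest claim is resolved by taking the first one listed, a case the text doesn’t address.

```python
FIRST_CHILD_WEEKLY_RATE = 20.70       # applied to the claimant's longest claim
ADDITIONAL_CHILD_WEEKLY_RATE = 13.70  # applied to every other claim


def annual_child_benefit(claim_weeks):
    """Estimate a claimant's annual Child Benefit income in pounds.

    claim_weeks: list of claim lengths in weeks within the tax year,
    one entry per child claimed for (hypothetical input format).
    """
    if not claim_weeks:
        return 0.0
    # Index of the longest claim; max() keeps the first index when tied.
    longest = max(range(len(claim_weeks)), key=lambda i: claim_weeks[i])
    total = 0.0
    for i, weeks in enumerate(claim_weeks):
        if i == longest:
            rate = FIRST_CHILD_WEEKLY_RATE
        else:
            rate = ADDITIONAL_CHILD_WEEKLY_RATE
        total += rate * weeks  # claim amount = weekly rate x weeks claimed
    return round(total, 2)
```

Under these assumptions, a claimant with one claim lasting the full 52 weeks and a second lasting 26 weeks would be assigned 52 weeks at the first-child rate plus 26 weeks at the additional-child rate.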

Figure 10 demonstrates how these five administrative data sources were combined and linked to Statistical Population Dataset (SPD) V2.0 (see note 1) for 2015, to create the individual Income Research Outputs. All this linking was done using a unique identifier created by DWP specifically for our purposes.

Income information was only included in the Income Research Outputs for individuals present on SPD V2.0 for 2015. You can find a more detailed description of the methodology used to combine datasets and produce the individual level Income Research Outputs in section 9 of the 2016 publication.

Distributions of individual annual income were then produced for the aged 16 and over population on SPD V2.0 at local authority (LA) and lower layer super output area (LSOA) level. Distributions by age and sex were also produced at national level.

SPD V2.0 2015 refers to the resident population in England and Wales on 30 June 2015. Residents with income information for the tax year ending 2016 but who arrived in England and Wales after this date aren’t present on SPD V2.0 and are therefore excluded from the Income Research Outputs. This is a consequence of the mismatch between the reference date of the population base and the income reference period (the tax year ending 2016), and it results in some valid income information being omitted.

SPD V2.0 contains demographic information on the age, sex and address of individuals. Income information was included only for individuals aged 16 and over on the SPD V2.0 reference date. Therefore, income information was included in the Income Research Outputs for individuals aged 15 years at the start of the tax year, but who had turned 16 years old by 30 June 2015. In contrast, income information was excluded for individuals who turned 16 years old during the tax year, but after 30 June 2015.
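The age rule above can be illustrated with a short sketch. The function names are hypothetical and the only facts taken from the text are the 30 June 2015 reference date and the age threshold of 16; this is not the code used to produce the outputs.

```python
from datetime import date

SPD_REFERENCE_DATE = date(2015, 6, 30)  # SPD V2.0 2015 reference date


def age_on(date_of_birth, reference_date=SPD_REFERENCE_DATE):
    """Return age in completed years on the reference date."""
    years = reference_date.year - date_of_birth.year
    # Subtract one year if the birthday hasn't yet occurred in the reference year.
    if (reference_date.month, reference_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def included_in_individual_outputs(date_of_birth):
    """True if the person was aged 16 or over on the SPD reference date."""
    return age_on(date_of_birth) >= 16
```

For example, someone born on 1 May 1999 was 15 at the start of the tax year but had turned 16 by 30 June 2015, so their income would be included; someone born on 1 July 1999 turned 16 during the tax year but after the reference date, so their income would be excluded.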

SPD V2.0 has resolved the address conflicts that happened with the SPD V1.0 methodology. Therefore, this year individual incomes didn’t need to be weighted across LAs. You can find more information on these address conflicts and how they were resolved in section 6 of the methodology report for SPD V2.0.

Method used to produce household annual income amounts (equivalised)

In 2017, we published occupied address (household) estimates from administrative data for 2011 and 2015 as a Research Output. The 2015 occupied address (household) estimates were used alongside the income and benefits data from the Department for Work and Pensions (DWP) and HM Revenue and Customs (HMRC) to produce distributions of household annual income. However, as we can’t currently produce estimates of the number of traditionally defined “households” from administrative data, these “household” Income Research Outputs are based on the concept of income at an address. You can find more information on the differences between a traditional “household” and an “occupied address” in the Occupied Address Research Outputs publication.

As for the individual Income Research Outputs, the five administrative data sources on PAYE and benefits income were combined to create an income dataset. This income dataset was then linked to SPD V2.0, which contained Unique Property Reference Numbers (UPRNs; see note 2) and demographic information. All individuals with a UPRN allocated to them on SPD V2.0 were grouped into addresses (households) using their UPRN (Figure 11).

Once individuals had been grouped into households using their UPRN, household income was equivalised using the Organisation for Economic Co-operation and Development (OECD)-modified equivalence scale. Equivalisation adjusts household income to reflect the different resource needs of households of different sizes and compositions. Single-adult households are the reference group used in the OECD-modified equivalence scale, with a value of one.

Using the OECD-modified scale, a weighting was applied to each individual within the household. A weighting of 1.0 was applied to the first household member aged 16 and over, 0.5 to any others aged 16 and over, 0.5 to 14- or 15-year-olds, and 0.3 to those under 14 years. The total weight for the household was then calculated. The total income of all household members was calculated and divided by the household weight to get the equivalised household income.
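The weighting and division described above can be sketched as follows. The function names and the input format (a list of member ages and a total income figure) are assumptions for illustration; the weights are those stated in the text.

```python
def household_weight(ages):
    """Return the equivalisation weight for a household, given member ages."""
    adults = sum(1 for age in ages if age >= 16)
    aged_14_or_15 = sum(1 for age in ages if 14 <= age <= 15)
    under_14 = sum(1 for age in ages if age < 14)
    if adults == 0:
        # The weights assume a first member aged 16 and over.
        raise ValueError("household has no member aged 16 and over")
    # 1.0 for the first member aged 16+, 0.5 for each other member aged 16+,
    # 0.5 for each 14- or 15-year-old and 0.3 for each child under 14.
    return 1.0 + 0.5 * (adults - 1) + 0.5 * aged_14_or_15 + 0.3 * under_14


def equivalised_income(total_household_income, ages):
    """Divide total household income by the household weight."""
    return total_household_income / household_weight(ages)
```

As a worked example under these assumptions, a household of two adults, one 15-year-old and one 10-year-old has a weight of 1.0 + 0.5 + 0.5 + 0.3 = 2.3, so a total income of £46,000 would be equivalised to £20,000.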

Households with seven or more members aged 16 and over were removed from the outputs. This was to try to remove non-traditional households such as student halls of residence and care homes. This filtering resulted in the removal of 0.4% of households. Households without any members aged 16 and over were also removed.
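The two removal rules above amount to a simple filter on the number of household members aged 16 and over. A minimal sketch, using a hypothetical function name and a list of member ages as input:

```python
def keep_household(ages):
    """Return True if the household stays in the outputs.

    Households are dropped when they have seven or more members aged
    16 and over, or no members aged 16 and over at all.
    """
    adults = sum(1 for age in ages if age >= 16)
    return 1 <= adults <= 6
```

Note that the threshold applies only to members aged 16 and over, so an address with six adults and several children would be retained under this rule.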

Notes for: Methodology

  1. A Statistical Population Dataset (SPD) is a single, coherent dataset that forms the basis for estimating the population. It’s produced by linking records across multiple administrative data sources and applying a set of inclusion and distribution rules.

  2. A Unique Property Reference Number (UPRN) is a unique alphanumeric identifier for every spatial address in Great Britain and can be found in Ordnance Survey’s address products.

9. Feedback

We’re keen to get your feedback on these Research Outputs and the methodology we used to produce them. This includes any ideas you have on how we could improve the outputs, and potential uses of the data. We’re also interested in feedback on the income definition and lowest level of geography that you need, any observations you’ve made about the coverage and patterns of our data, and any thoughts you have regarding our future plans.

Please email your feedback to Admin.Data.Census.Project@ons.gsi.gov.uk. Don’t forget to include the title of the output in your response.

10. Annex A: Feedback on the 2016 publication

We received feedback on last year’s Income Research Output publication through an online survey and a series of engagement events held with representatives from various local authorities.

This section provides a summary of the feedback we received.

What do you use information about income for?

Respondents described a variety of uses for income data. Some of the most popular were:

  • as an indicator of deprivation, fuel poverty, financial inequalities and living standards

  • to inform services, policy and planning

  • to analyse the economic state of particular areas both over time and comparatively between similar geographical areas

  • to measure housing affordability and provision
