Health Index methods and development: 2015 to 2021

1. Overview

This methodology article accompanies two releases presenting the Health Index results at local and national levels, and an article detailing the indicators contained within the Health Index.
The Health Index has been designed with the support of health experts to present a single number measuring the health of an area, with a clear breakdown of how different measures of health are combined to produce this value.
The development of the Health Index has followed guidance from the Competence Centre on Composite Indicators and Scoreboards (COIN).
Data have been selected from a wide variety of sources to allow comparisons across time and by geography, down to lower-tier local authority (LTLA) level.
Data selection has been based on agreed principles, such as the aim to measure health and its drivers rather than direct measures of health services.
Factor analysis has been used to group individual indicators of health into subdomains, guided by expert advice; factor analysis results have informed each indicator's weight towards the total Index value.

2. Handling data changes

Handling missing data for 2020 and 2021

The coronavirus (COVID-19) pandemic has naturally affected public health. It also forced changes to how some health variables were measured and whether certain indicators could be measured at all.

While COVID-19 and related restrictions have had the most obvious effect on 2020 and 2021 results and data collection, other events or longer-term trends may also have contributed to differences between results.

Of the 56 indicators that form the Health Index, 21 were unavailable or unusable in 2021, compared with 12 in 2020 and two in 2019. The 2021 figure includes four indicators for which the underlying data are not yet published but are expected later in the year. We prioritised timeliness for this release, rather than waiting for these data, to make the Health Index more useful.

At the indicator level, all missing values were handled following our existing imputation methodology, which in these cases means 2019 or 2020 scores were often held constant. More detail on how we impute missing values is in Section 10: Imputation of missing data (COIN Step 3).
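The held-constant case described above can be sketched in a few lines. This is an illustrative sketch only, not the full imputation method set out in Section 10; the function name `carry_forward` and the example scores are hypothetical.

```python
from typing import Dict, Optional


def carry_forward(scores: Dict[int, Optional[float]]) -> Dict[int, Optional[float]]:
    """Fill each missing year with the most recent earlier available score.

    Years before the first available score stay as None.
    """
    filled: Dict[int, Optional[float]] = {}
    last_seen: Optional[float] = None
    for year in sorted(scores):
        value = scores[year]
        if value is not None:
            last_seen = value
        filled[year] = last_seen
    return filled


# Hypothetical indicator scores: 2020 and 2021 are unavailable.
example_scores = {2018: 101.2, 2019: 99.8, 2020: None, 2021: None}
imputed_scores = carry_forward(example_scores)
```

In this hypothetical series, the 2019 score is reused for both 2020 and 2021, which is what "held constant" means for an indicator with no usable data in those years.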

The children and young people subdomain was especially affected by the pandemic, with only one of its five indicators available for 2021 and two of five for 2020. Our conversations with public health experts suggested that the available indicators show a different trend to the one expected for the indicators held constant.

Measures such as pupil absenteeism were unavailable but are expected to have increased, which would have contributed to a lower Health Index score. For this reason, we have decided to hold the children and young people subdomain itself constant for 2020 and 2021. This avoids misleading users about the impact of the pandemic on this subdomain. The available indicators are included in the Health Index for 2020 and 2021, but they do not affect their subdomain score for those years.

For 2021, we applied the same approach to the access to services subdomain. In this subdomain, only one indicator had usable data, and we would expect to see variation in the other indicators if they were not imputed. To avoid presenting misleading results, the subdomain is held constant in 2021, even though its indicators are included in the Health Index for 2021.
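Holding a whole subdomain constant differs from imputing an indicator: a score could still be computed from the available indicators, but it is not the one published. The sketch below illustrates this under the assumption that a held year takes the score of the most recent year that was not held (2019 for children and young people, 2020 for access to services). The function name and all values are hypothetical.

```python
from typing import Dict, Optional, Set


def hold_subdomain_constant(
    computed: Dict[int, float], held_years: Set[int]
) -> Dict[int, float]:
    """Replace the computed score in each held year with the score
    from the most recent earlier year that was not held.

    Assumption for illustration: "held constant" means reusing the
    last non-held year's subdomain score.
    """
    published: Dict[int, float] = {}
    last_unheld: Optional[float] = None
    for year in sorted(computed):
        if year in held_years:
            if last_unheld is None:
                raise ValueError("no earlier year to hold the score constant from")
            published[year] = last_unheld
        else:
            last_unheld = computed[year]
            published[year] = last_unheld
    return published


# Hypothetical subdomain scores computed from whichever indicators exist.
computed_scores = {2018: 100.5, 2019: 99.0, 2020: 103.2, 2021: 104.1}
published_scores = hold_subdomain_constant(computed_scores, {2020, 2021})
```

Here the computed 2020 and 2021 values are ignored and the 2019 score is published for both years, so a partial set of indicators cannot move the subdomain in a misleading direction.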

Updating data

The existing data were reviewed to ensure we use the most up-to-date and accurate data. For 11 indicators, the back series was updated where data producers had published new versions of their data. The new versions reflect updates to methodology, improvements or revisions made by the producer since we last collected the data, or re-weighting to more recent population estimates. These indicators are:

all four personal well-being indicators
cancer screening (bowel screening component)
child poverty
disability
healthy eating
job-related training
overweight and obesity in adults
suicides

For a further four indicators, the back series was updated because of changes to our own Health Index calculations, including how we handle the data to produce our indicators. These indicators are:

personal crime - to remove bike theft and shoplifting, because these are already counted in the low-level crime indicator
rough sleeping - to retain true zeros, which were previously imputed
sedentary behaviour - to apply the correct year convention, as used for other indicators
sexually transmitted infections (diagnosis and test components) - to use new data provided through Fingertips

The size and extent of these changes vary by indicator. Detailed information about the changes to each indicator is available on request.

Handling data and population estimates following Census 2021

The Health Index 2021 covers the period for which Census 2021 data are available. The census has affected our indicators and the associated mid-year population estimates in different ways, so we have handled the affected data case by case.

Household overcrowding data for 2021 are now available and are comparable with the 2011 data, allowing us to build a more accurate picture of household overcrowding in each year. Previously, there was a single datapoint, for 2011, which was used as the 2015 baseline and as every other year's score, in line with our imputation methods. This meant household overcrowding scores were stable at 100 for England, and lower-tier local authorities (LTLAs) retained the same score every year.

We used linear interpolation between the 2011 and 2021 data to estimate the level of household overcrowding in 2015. The estimated value for 2015 was used as the baseline in 2015 (a score of 100 for England). This means that, in the Health Index 2021, the household overcrowding indicator has a time series from 2015 in which scores are no longer held constant across years at any geography.
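The interpolation step can be illustrated with a short sketch. The overcrowding values below are hypothetical and the function name `interpolate_linear` is an assumption for illustration; only the use of the 2011 and 2021 endpoints to estimate 2015 comes from the text above.

```python
def interpolate_linear(x0: float, y0: float, x1: float, y1: float, x: float) -> float:
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    if x1 == x0:
        raise ValueError("the two known points must have different x values")
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical percentages of overcrowded households in one area.
overcrowding_2011 = 4.0
overcrowding_2021 = 5.0

# 2015 lies four-tenths of the way from 2011 to 2021.
estimate_2015 = interpolate_linear(2011, overcrowding_2011, 2021, overcrowding_2021, 2015)
```

Because 2015 is four years into the ten-year gap, the estimate is the 2011 value plus 0.4 times the change between the two censuses.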

Population estimates had the largest impact on our Health Index 2021. The Office for National Statistics (ONS) population estimates have shown differences between Census 2021 data and estimates rolled forward since the 2011 Census. Population estimates are important in our methods, and we use them in relation to our geographies and regional medians to help us create the Health Index scores from indicator to country level.

To prevent the Health Index scores showing artificial increases and decreases caused by the change from 2020 mid-year population estimates to the census-based population in 2021, we have re-used 2020 population estimates for 2021. The same method has been applied to both the Integrated Care Systems (ICS) scores and our Health Index scores. The release of Health Index 2022 will include a revised back-series using ONS' latest population estimates.
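The effect of re-using 2020 populations can be sketched as a lookup that maps each data year to the population year used with it. The population-weighted mean below is an assumption chosen purely for illustration and is not a statement of how the Health Index aggregates areas; the area names, values and populations are hypothetical.

```python
from typing import Dict

# 2021 data are paired with 2020 population estimates; all other
# years use their own estimates.
POPULATION_YEAR_OVERRIDE: Dict[int, int] = {2021: 2020}


def population_year(data_year: int) -> int:
    """Return the population-estimate year to use for a given data year."""
    return POPULATION_YEAR_OVERRIDE.get(data_year, data_year)


def weighted_mean(values: Dict[str, float], populations: Dict[str, float]) -> float:
    """Population-weighted mean of area values (illustrative aggregation only)."""
    total_population = sum(populations[area] for area in values)
    return sum(values[area] * populations[area] for area in values) / total_population


# Hypothetical populations by estimate year and area.
populations_by_year = {
    2020: {"Area A": 1000.0, "Area B": 3000.0},
    2021: {"Area A": 1400.0, "Area B": 2600.0},  # census-based, not used
}
values_2021 = {"Area A": 100.0, "Area B": 110.0}

populations_used = populations_by_year[population_year(2021)]
aggregate_2021 = weighted_mean(values_2021, populations_used)
```

With the override in place, any change in the 2021 aggregate comes from the area values themselves rather than from the switch in population base.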

The impact of the census population is also reflected in a few of the indicators we use in the Health Index. Where data providers use ONS population estimates to produce their data, the difference between the census-based population and the population estimates up to 2020 has led to an inconsistent time series. In these cases, we are using the previously published back series up to 2020 and our imputation method for 2021. The affected indicators are alcohol misuse, frailty and self-harm, all sourced from Hospital Episode Statistics (HES) produced by NHS England.

Once new population estimates are available, the data provider will update the back series and the data will be usable from 2021 onwards.
