In late 2017, Office for National Statistics began an audit to understand the data that are available on the nine protected characteristics covered by the Equality Act 2010.
When the call for contributions closed on 1 March 2018, we had received 50 responses from a range of governmental and non-governmental organisations, identifying almost 400 sources of data.
The volume of sources identified by the audit varied for the different characteristics, but further work is needed to establish the depth of coverage within these sources.
This initial stocktake of the ethnicity data has highlighted potential issues with the comparability and coherence of the data sources.
A number of areas for further work are identified and these will be taken forward working collaboratively with experts from a range of organisations.
In late 2017, Office for National Statistics (ONS) began an audit of data sources and publications that are available to understand equalities in the UK today. Our aim is to work with others to ensure that the right data are available to address the main social and policy questions about fairness and equity in society, including outcomes for all nine of the protected characteristic groups covered by the Equality Act (2010).
In recent years, we’ve seen an increasing demand for robust data to monitor equalities, driven by a range of developments including:
the creation of the Government Equalities Office in 2007, which now leads work on policy relating to women, sexual orientation and transgender equality, and a range of equalities legislation
the Women and Equalities Committee being appointed by the House of Commons in June 2015 to examine the expenditure, administration and policy of the Government Equalities Office
on 1 January 2016, the Sustainable Development Goals of the 2030 Agenda for Sustainable Development officially coming into force, based on the premise of “leaving no-one behind”
in April 2017, new regulations coming into force making it mandatory for employers with 250 or more employees to publish and report information about their gender pay gap
These events have increased the demand for equalities data in the UK as well as having an impact on its availability and accessibility. Also during this period, the Digital Economy Act (2017) came into force, increasing opportunities for making better use of existing data, and data science methods have expanded our horizons, creating new opportunities for sourcing, analysing and presenting data.
Taken together, this suggests the time is right to take stock of the state of equalities data in the UK, to understand where we are now and the challenges and opportunities ahead. On the premise that better data is the basis for better decisions, this audit is a first step in that direction.
Given the high volume of responses to the audit and the detailed work required to understand the quality and coverage of data in relation to each of the protected characteristics covered by the Equality Act, we focus here on summarising the overall response to the audit and what we can learn from it.
As a first step, we’ve also provided a high-level look at the data available on ethnicity. We began with this because ethnicity data are commonly available and so provide a good “first look” at the audit results. This is an area of particular policy interest as reflected in the commissioning of the Race Disparity Audit.
Our initial look at the ethnicity data highlights the need to work collaboratively with a range of experts to fully understand the quality, coverage and granularity of the data as well as where improvements are required. The plan is now to convene a Technical Advisory Group to explore the data on ethnicity more fully as well as the data available for all the other protected characteristics. For further details, please see the Next steps section.
The audit aimed to catalogue publications, datasets and sources relating to the nine protected characteristics. As a starting point, we carried out an initial search of GOV.UK to identify relevant sources of official statistics. This first draft was then distributed within Office for National Statistics (ONS), both for feedback on its content and to capture any additional resources that had been missed in the original search.
In the second stage, we used a crowd-sourcing approach for feedback and further input, advertising on the Government Statistical Service (GSS) website and sending the audit directly to established contacts in a number of other government departments, including the devolved administrations.
Recognising the research that is being carried out by organisations and individuals outside of government and the evidence that they are generating, in the final stage we targeted non-governmental organisations with a demonstrated interest in the issues. Again using a crowd-sourcing approach, we advertised through the National Council for Voluntary Organisations (NCVO) mailing list and social media platforms including the Economic and Social Research Council Twitter feed.
For each of the resources listed, we asked respondents to provide information on the following:
the protected characteristic(s) that it covers
the organisation responsible for it
the underlying dataset – the survey source or administrative dataset
the theme – this was a free text field to capture the broad subject matter of the source
its geographical coverage – UK, Great Britain, country-specific or lower levels of geography
its geographical granularity – whether the source is currently broken down into lower levels of geography, such as country, region, local authority
the date of the most recent publication
how often it’s released
the time period for which the data are available
In some cases, these fields were left blank by respondents, and we have completed them where possible. For example, we allocated a type of dataset to each source unless it consisted of more than one type of dataset or the underlying dataset could not easily be identified; in these cases the field was left blank.
Similarly, where possible, we allocated themes to each source. These were chosen to align with the domains defined in the Measurement framework for equality and human rights, although the health domain was expanded to include well-being; this was felt to better align with the resources that had been identified in the audit.
As far as possible, the themes reflect what respondents had included in the free text field. Where the free text response didn’t align with our themes, for example, where the theme was a specific protected characteristic, the field was left blank. In contrast, where the response indicated more than one theme, duplicate records were created for that source, each of which captured a different theme. This enables users to filter the spreadsheet by theme and capture all the sources. We also followed this procedure for records for which the free text response was blank.
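The theme allocation rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure actually used to build the audit spreadsheet: the `AuditRecord` type, its field names and the `expand_by_theme` function are hypothetical, and the theme list is taken from the note to the summary section. How blank free text responses were handled in practice is not modelled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

# Themes as listed in the notes to the summary section of this article.
AUDIT_THEMES = {
    "Education",
    "Work",
    "Living standards",
    "Health and well-being",
    "Justice and personal security",
    "Participation",
}


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical, simplified row of the audit spreadsheet."""
    source: str
    characteristics: Tuple[str, ...]
    theme: Optional[str] = None  # None stands for a blank theme field


def expand_by_theme(
    source: str,
    characteristics: Sequence[str],
    free_text_themes: Sequence[str],
) -> List[AuditRecord]:
    """Return one record per recognised theme, or one blank-theme record."""
    matched = [t for t in free_text_themes if t in AUDIT_THEMES]
    if not matched:
        # The response did not align with any theme: leave the field blank.
        return [AuditRecord(source, tuple(characteristics), None)]
    # More than one theme: duplicate the record, one copy per theme.
    return [AuditRecord(source, tuple(characteristics), t) for t in matched]


records = expand_by_theme("Example survey", ["age", "sex"], ["Education", "Work"])
for record in records:
    print(record.source, "|", record.theme)
```

Because every theme gets its own row, filtering the rows on a single theme returns every source tagged with it, which is the behaviour the duplication is meant to provide.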
Some respondents provided links to websites that were sources of further evidence and analysis. We haven’t yet assessed their potential to provide specific types of additional evidence. We intend to work collaboratively with data producers and users to make a more detailed assessment of the available data (see Next steps).
Notes for: Our approach
- The Equality Act defines the following as protected characteristics: age; disability; gender reassignment; marriage and civil partnership; pregnancy and maternity; race; religion or belief; sex; and sexual orientation.
The audit was designed to be an initial stocktake of the evidence that currently exists as a basis for the work we intend to take forward (see Next steps). It is a live document that will continue to evolve over time as new data sources are added and existing data sources are updated. This report covers the findings from the audit as of 1 March 2018, when our call for contributions closed.
A copy of the audit spreadsheet is published alongside this article. We continue to welcome additions to it and feedback on our approach. To provide feedback or comments please email email@example.com.
The response to the audit
In total we received 50 responses to the audit, 39 from government departments and agencies and 11 from non-governmental organisations, including academics, charities and think-tanks.
These responses provided links to almost 400 sources of data in a variety of formats, including articles, statistical bulletins, CSV files, datasets or tables, headline commentary and figures, infographics, statistical releases and web tools. Links to a further 55 websites were also provided.
Coverage of the protected characteristics and themes
The number of sources listing age and sex as protected characteristics was higher than for those identifying any of the other protected characteristics. This is likely to reflect the fact that age and sex are routinely captured in data collection and are used as standard breakdowns in most statistical releases. In contrast, some of the other characteristics, for example, sexual orientation and gender reassignment, are not routinely included in data collection, so are listed in fewer of the resources in the audit.
Further work is needed to establish the depth of coverage of the protected characteristics in these sources. While age breakdowns may be provided, they may not cover all age groups, for example, children. Conversely, while characteristics such as sexual identity and gender reassignment may be covered in more depth in the resources that include them, there are fewer resources against which to compare results. As such, the volume of sources reported shouldn’t necessarily be taken as an indication of better depth or coverage of any given characteristic or theme.
For the majority of the reported sources it was possible to allocate a theme. Of these, just over half of the sources related to the health and well-being theme.
Slightly less than a quarter of the reported sources were UK-wide, with the remainder covering individual countries or combinations of the countries within the UK. Around three-quarters were reported to be available at lower levels of geography, for example, broken down by country, region and so on, and around a quarter were reported to be available at local authority level. Further work is needed to establish which of the protected characteristics are available at these lower levels of geography.
Around three-quarters of the data sources reported are regular publications, updated at least annually if not more frequently. For the annual releases, data are generally available within a year of the end of the reporting period. Ad hoc or occasional releases tend to cover more specific pieces of analysis, often on some of the least-covered protected characteristics. The majority of these come from surveys so, as part of the working group, we will look at whether these provide data that can be updated on a more regular basis.
Accessibility and transparency
Each of the records included in the audit is an online resource, though it may be a report or table and users are not necessarily able to access the underlying dataset in all cases. In many cases, the way in which the data are presented in these online resources enables users to easily access and understand the main supporting information, for example, the source, quality information and the underlying methodology used to generate it. However, there are examples where it is more difficult to access these important pieces of information.
Notes for: Summary outcome of the audit
- The themes were Education, Work, Living standards, Health and well-being, Justice and personal security and Participation.
The Equality Act 2010 identifies race as one of the protected characteristics, defining it in relation to colour, nationality and ethnic or national origins. Although race is the protected characteristic, ethnicity is the primary source of data collection in the UK and is therefore used in monitoring equality (see note 1).
Coverage of ethnicity data
The audit identified 150 sources of data on ethnicity, the majority of which were produced by government departments, with similar numbers coming from surveys and the census as from administrative data sources. These data sources covered the full range of themes, with health and well-being again representing the highest proportion. A larger proportion of ethnicity sources related to education and justice than was seen overall, with these deriving mainly from administrative data sources, such as school Management Information Systems and the Prison National Offender Management Information System.
For the sources of data on ethnicity, there were fewer UK-wide sources and more country-specific sources than overall, with the highest numbers for England (32 sources) and Scotland (30 sources). Just over two-thirds of all the sources identified were reported to be available at lower levels of geography, including country and region and around a third were reported to be available at local authority level. As with the aggregate level, further work is needed to determine if the full ethnicity breakdown is available at these lower levels of geography for all the sources.
To adequately capture the experience of all members of the population, it is important to include as many ethnic minority groups as possible in any analysis, including White ethnic minorities. The GSS harmonised principles on ethnicity allow for data to be collected at a broad level using five aggregate categories (see Comparability and coherence of data sources section for further information about harmonisation). However, using broad categories does not allow for differences within these categories to be picked up.
In common with the findings from the Race Disparity Audit, our audit revealed a lack of data showing detailed breakdowns for minority ethnic groups. This is a common problem, particularly with survey data where sample sizes are too small for these groups to allow meaningful analysis. There are ways in which this can be dealt with, for example, by boosting the data collection to target groups of interest or combining multiple years of data collection, but these have their issues, including implications for sample design and costs and identifying year-on-year changes. Although collecting data at this level of detail presents a challenge, it is essential so that we can adequately understand the situation of different ethnic groups.
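The trade-off involved in combining multiple years of data can be made concrete with a small sketch. It is an illustration under stated assumptions, not a method used in the audit: the function name `pool_sample_sizes`, the group labels, the counts and the minimum sample size of 50 are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Tuple


def pool_sample_sizes(
    yearly_samples: Dict[int, Dict[str, int]],
    min_sample: int,
) -> Dict[str, Tuple[int, bool]]:
    """Sum respondents per group across years and flag whether the pooled
    sample reaches a chosen (hypothetical) minimum for analysis."""
    pooled: Dict[str, int] = defaultdict(int)
    for groups in yearly_samples.values():
        for group, n in groups.items():
            pooled[group] += n
    return {group: (n, n >= min_sample) for group, n in pooled.items()}


# Made-up respondent counts for two placeholder groups over three survey years.
yearly = {
    2015: {"Group A": 40, "Group B": 12},
    2016: {"Group A": 35, "Group B": 15},
    2017: {"Group A": 38, "Group B": 10},
}

for group, (n, usable) in pool_sample_sizes(yearly, min_sample=50).items():
    print(group, n, "usable" if usable else "too small")
```

In this made-up example the smaller group still falls short after three years are pooled, and any pooled estimate describes the whole period rather than a single year, which is why year-on-year change becomes harder to identify.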
It is recognised that disadvantage may be experienced differently by those with multiple protected characteristics, for example, a woman from an ethnic minority group. This is referred to as intersectionality. For this reason, it is important that data are available to effectively monitor the intersection of different protected characteristics.
As might be expected given the lack of detailed data on minority ethnic groups, few sources of ethnicity data identified by the audit were broken down by other protected characteristics. Those that were broken down in this way provided breakdowns by age or sex, although these were generally only available for aggregated ethnic groups. That said, a number of the ethnicity sources were also sources of data for other protected characteristics, so there is scope to explore whether further intersectional analysis may be possible.
Reporting of ethnicity
In our guide for collecting and classifying ethnicity data, Office for National Statistics (ONS) recognises ethnicity as a subjective and multi-faceted concept that is self-defined and reflects how people see themselves. Because of this, a person’s ethnicity can change at different times, depending on the social and political context. ONS and the United Nations Statistics Division describe some of the criteria used to identify ethnic group as nationality, country of birth, language, religion, national or geographical origin and skin colour.
The subjective nature of ethnicity means that it should always be self-reported wherever possible, though the guide notes that some individuals, for example, children, may need help to understand the categories.
For the sources of data included in the audit, it is not always clear whether ethnicity has been self-reported. In general, if the guidance is being followed, as it should be on government surveys, ethnicity should be self-reported. However, not all the outputs on ethnicity are based on self-reporting, for example, data on youth cautions (see note 2). Where ethnicity is not self-reported, the quality of the data can be compromised and this will also impact comparability across different sources. Ethnicity data that are not self-reported may also result in excessive use of “Other” categories, as seen in data about detentions under the Mental Health Act (see note 3), or high levels of missing data. Further work is needed to establish the extent of this issue in the ethnicity evidence base.
The majority of the data sources in the audit that include ethnicity are released annually or more frequently and are published within a year of the end of the reporting period, indicating that timely data are available on a regular basis. The audit includes relatively few ad hoc or occasional releases relating to ethnicity and over half of these derive from surveys so there may be scope to explore whether these could be more regularly updated.
Comparability and coherence of data sources
The Government Statistical Service (GSS) has produced a set of harmonised principles to be used when collecting and analysing ethnicity data. The aim of these principles is to ensure comparability and consistency in statistical outputs across the GSS. They were developed in collaboration with stakeholders and cover a range of different modes of data collection.
It is important to note that the harmonised principles are different for England, Wales, Scotland and Northern Ireland, in part reflecting differences in legislation between the countries. However, there are specific recommendations for their use across Great Britain and the UK to deal with these differences.
The current harmonised principles, which have been in place since 2011, define 18 categories of ethnic group in England and Wales, 19 categories in Scotland and 16 categories for Northern Ireland (see note 4). Where sample sizes do not allow the full categorisation to be applied, or to aid comparability between countries with different categorisations, the data should be aggregated to five main broad headings, with notes to explain the differences.
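Aggregating detailed categories to broad headings amounts to a lookup followed by a sum, as the sketch below shows. It is illustrative only: the `BROAD_GROUP` mapping covers just a handful of categories, its labels are assumptions that should be checked against the published harmonised principles for the relevant country, and the function name and counts are made up.

```python
from collections import Counter
from typing import Dict

# Partial, illustrative mapping from detailed categories to broad headings.
# The labels are assumptions; the published principles are the authority.
BROAD_GROUP = {
    "Irish": "White",
    "Gypsy or Irish Traveller": "White",
    "White and Asian": "Mixed/Multiple ethnic groups",
    "Indian": "Asian/Asian British",
    "Pakistani": "Asian/Asian British",
    "Caribbean": "Black/African/Caribbean/Black British",
    "Arab": "Other ethnic group",
}


def aggregate_to_broad(detailed_counts: Dict[str, int]) -> Dict[str, int]:
    """Sum counts for detailed categories under their broad heading."""
    totals: Counter = Counter()
    for detailed, n in detailed_counts.items():
        if detailed not in BROAD_GROUP:
            # Fail loudly instead of silently placing unknown labels in "Other".
            raise KeyError(f"No broad heading defined for {detailed!r}")
        totals[BROAD_GROUP[detailed]] += n
    return dict(totals)


# Made-up counts for illustration.
print(aggregate_to_broad({"Indian": 120, "Pakistani": 80, "Irish": 45, "Arab": 10}))
```

Raising an error for an unmapped label is a deliberate choice in this sketch: it keeps non-harmonised or outdated category names visible rather than letting them inflate an "Other" total.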
In their measurement framework, the Equality and Human Rights Commission (EHRC) recommend that these principles are used when collecting ethnicity data for use in monitoring equalities. However, the audit overall showed that these principles aren’t being applied consistently.
There were numerous sources identified by the audit where analysis had been produced using aggregations that were not harmonised. A number of others were also using earlier versions of the harmonised principles, which are now out of date. This applied in some cases even for relatively recent pieces of analysis.
In addition, some sources were using terminology that was not consistent with the principles and there were examples of categories aggregated under acronyms such as BME and BAME to refer to all except the “White” ethnic group. Information on what is included in these acronyms was often missing and it was unclear whether minority “White” groups, for example, “Gypsy/Irish Travellers”, were included.
This lack of alignment with the harmonised principles hinders comparability between different data sources. Further work is needed to establish why the harmonised standards are not being used consistently by all producers of statistics and how this situation could be improved. The Race Disparity Unit has identified similar inconsistencies in the data presented on the Ethnicity Facts and Figures website and we will be working together to improve harmonisation across the GSS.
Notes for: Initial findings on ethnicity data
1. An exception to this is in the recording of hate crimes in the justice system; these are described as racial hate crimes, resulting from the perception of a person’s real or perceived race.
2. See the Ethnicity Facts and Figures website for further details.
3. See the Ethnicity Facts and Figures website for further details.
4. To comply with Northern Irish legislation, there is only one White category and Irish Traveller is a main category, separate from White.
The audit was intended as a first step towards collaboratively developing a data infrastructure by building on and bringing together what already exists on inequalities. The initial findings reported here have highlighted a number of areas for further work on ethnicity data.
We expect that the data on the remaining protected characteristics will similarly highlight the need for further work in these other areas. Our priority is therefore to convene technical working groups consisting of experts from a range of organisations to take forward this work. These groups will fully explore the existing data sources for all the protected characteristics, including their potential to be used for further analysis, identify where the gaps are and prioritise the areas for further work.
If you are interested in participating in these groups, please contact firstname.lastname@example.org.