1. Executive summary

This is the latest in a series of updates on the work to utilise data collected by Her Majesty’s Revenue and Customs (HMRC) from Value Added Tax (VAT) returns as an administrative data source for Short-term Output Indicators (STOI) and National Accounts.

The STOI in scope are the Index of Production (IoP), Index of Services (IoS) and Output in the construction industry. Gross domestic product (GDP) data that use these indicators are therefore also in scope.

This article:

  • details the 10 industries being considered for the pilot use of HMRC turnover data later this summer (section 3);
  • describes progress since the previous article (section 4);
  • outlines next steps towards the use of HMRC turnover data (section 5).

2. Background

This is the fourth in a series of articles that describe progress in using HM Revenue and Customs (HMRC) VAT turnover as a main administrative data source. The first and second articles described our broad plans to use this data as a comprehensive and cost-effective replacement for elements of the Monthly Business Survey (MBS). The third article outlined our intention to pilot changes, commencing summer 2016, to the Index of Services and the output approach to measuring GDP.

The MBS is split into 5 sampling strata, or Bands, based on the registered employment of each business on the Inter-Departmental Business Register (IDBR). Band 1 represents the smallest businesses in a given industry; the sample is selected at random from the universe of businesses contained on the IDBR for the sampling period. Bands 2 and 3 similarly represent small and medium-sized businesses, which are also selected at random. Band 4 represents the largest businesses in a given industry, and the selection constitutes a census of businesses in excess of the employment cut-off for that industry. In addition, businesses below the census cut-off with turnover in excess of £60 million can represent a significant part of an industry. We ensure that businesses satisfying these criteria continue to be selected; they are known as Band 5. There are few of these businesses.
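The banding rules above can be sketched in Python as follows. This is only an illustration: the article does not publish the employment cut-offs, which vary by industry, so the `BandCutoffs` values used in the example are hypothetical, as are the class and function names. The only figure taken from the text is the £60 million turnover threshold for Band 5.

```python
from dataclasses import dataclass

# From the text: businesses below the census cut-off with turnover
# in excess of £60 million are placed in Band 5.
BAND_5_TURNOVER_GBP = 60_000_000


@dataclass(frozen=True)
class BandCutoffs:
    """Employment cut-offs for one industry (hypothetical structure)."""
    band_1_max: int      # largest registered employment still in Band 1
    band_2_max: int      # largest registered employment still in Band 2
    census_cutoff: int   # employment above this is in the Band 4 census


def assign_band(employment: int, turnover_gbp: float, cutoffs: BandCutoffs) -> int:
    """Return the MBS Band (1 to 5) for a business, following the rules in the text."""
    if employment > cutoffs.census_cutoff:
        return 4  # census of the largest businesses
    if turnover_gbp > BAND_5_TURNOVER_GBP:
        return 5  # below the census cut-off but with very high turnover
    if employment <= cutoffs.band_1_max:
        return 1
    if employment <= cutoffs.band_2_max:
        return 2
    return 3


# Illustrative cut-offs only; these numbers are NOT from the article.
example_cutoffs = BandCutoffs(band_1_max=9, band_2_max=49, census_cutoff=99)
print(assign_band(employment=20, turnover_gbp=1_500_000, cutoffs=example_cutoffs))
```

Under this sketch, the pilot would leave businesses returned as Band 4 or 5 on the survey and consider HMRC turnover data only for those returned as Bands 1 to 3.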

The pilot would continue the MBS for Band 4 and 5 businesses, where the survey is a census of activity, and consider the extent to which the survey could be replaced by HMRC turnover data for small and medium-sized businesses in Bands 1 to 3.


3. Candidate industries

3.1 Confirming the candidate industries

In the previous article we described how we were analysing 20 industries as candidates for the pilot. These had been chosen either because quarter-on-quarter growth rates showed good coherence between the Monthly Business Survey (MBS) and raw HMRC turnover data (that is, before the data is cleaned), or because a large MBS sample of small businesses was used to measure a relatively minor element of an industry based on turnover.

We have reduced the number of candidate industries from 20 to 10 for a variety of reasons, which will also allow us to maximise analysis time. We have:

  • avoided industries where specific additional questions are included on an MBS form (for example, Industry 59 "Film and TV production, sound recording and music publishing" where grant income is collected);
  • excluded industries where the sample size was not large enough to justify the work involved in the pilot phase (for example, Industry 75 "Veterinary activities" where only 30 forms may have been impacted at most);
  • excluded industries for more specific reasons (for example, Industry 53.2 "Courier activities" was not pursued because a review of the industry had recently been published, and it was felt that changing methodology so soon after a review may lead to a degree of confusion).

The remaining 10 candidate industries are drawn exclusively from the services industries for 2 reasons. Firstly, the MBS for the production industries collects export turnover data (due to an international legal requirement) and we decided that in this pilot phase we will exclude data series where a simple replacement cannot be identified. Secondly, the Index of Production is published more quickly than the Index of Services and we felt that in this pilot phase we should work to improve and accelerate processes in preparation for the more challenging demands of the Index of Production.

It is important to stress that no decision has yet been made on which Bands (Band 1 only, Bands 1 and 2, or Bands 1 to 3) or industries will be included in the pilot when this commences later this summer. This will be considered in the next phase of the project and will depend on the outcome of the analysis.

3.2 The significance of the candidate industries

The 10 candidate industries for the pilot are listed in Table 1, which describes the significance of each industry in relation to the total economy, expressed in parts per thousand (ppt). Specifically, this represents each industry’s proportion of the sum of gross value added (GVA) produced by the economy in 2012. GVA is derived as output less intermediate consumption (that is, outputs less inputs).

For example, UK Standard Industrial Classification (SIC) 2007 industry 55 "Accommodation" accounts for 6.8792 parts per thousand of the total economy. At October 2015, the total annual turnover of all businesses held on the IDBR for industry 55, from which the MBS is sampled, was split between employment Band 4 (58%) and the sampled Bands 1, 2 and 3 (9%, 10% and 24% respectively). We have used these percentages as a purely notional means to measure the significance, or weight, of each employment Band for each industry. In practice the significance of each Band will vary month on month due to the differences in the amount of turnover provided.
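The notional weighting described above amounts to multiplying an industry's ppt weight by each Band's share of IDBR turnover. A minimal Python sketch using the industry 55 figures quoted in the text is given below. The function name is illustrative, and one assumption is made: the quoted percentages sum to 101% (presumably through rounding), so the sketch rescales them so that the Band weights add back to the industry total.

```python
# Figures quoted in the text for SIC 2007 industry 55 "Accommodation".
INDUSTRY_55_PPT = 6.8792  # parts per thousand of total GVA
INDUSTRY_55_BAND_SHARES_PCT = {"Band 1": 9, "Band 2": 10, "Band 3": 24, "Band 4": 58}


def notional_band_weights(industry_ppt, band_shares_pct):
    """Split an industry's ppt weight across Bands in proportion to turnover share.

    Assumption: shares are rescaled by their own total, because the rounded
    percentages quoted in the article do not sum to exactly 100.
    """
    total_pct = sum(band_shares_pct.values())
    return {band: industry_ppt * pct / total_pct
            for band, pct in band_shares_pct.items()}


weights = notional_band_weights(INDUSTRY_55_PPT, INDUSTRY_55_BAND_SHARES_PCT)
sampled_ppt = sum(weights[band] for band in ("Band 1", "Band 2", "Band 3"))

for band, ppt in weights.items():
    print(f"{band}: {ppt:.4f} ppt")
print(f"Sampled Bands 1 to 3: {sampled_ppt:.4f} ppt")
```

Summing the Band 1 to 3 weights in this way across all candidate industries gives the notional maximum share of the economy that the pilot could affect.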

The 10 candidate industries equate to 119.6582 ppt or some 12% of the total economy. However, this includes the contribution of businesses in employment Band 4 – the large businesses where MBS constitutes a census of activity. The use of HMRC turnover will not impact these businesses. To explain the potential maximum impact of the pilot approach we have, in common with the previous article, used the October 2015 MBS selection.

Table 1 illustrates that the Band 4 businesses in the 10 industries account for 6.8% of the economy while the sampled Bands account for 5.2% (Band 1 at 1.7%, Band 2 at 1.4% and Band 3 at 2.1%).

3.3 Comparing HMRC and MBS samples for the candidate industries

In Table 2 we state the number of MBS questionnaire forms despatched to each candidate industry in October 2015 for Bands 1 to 3. In total 4,057 MBS forms were despatched for businesses in Bands 1 to 3. In comparison we have also stated the number of businesses supplying HMRC turnover data for October within 6 months of that period (this will exclude some annual returns and late quarterly returns) at some 428,000. In essence, our analysis will consider how 4,000 MBS forms and 428,000 VAT returns compare in estimating growth.


4. Recent progress

Since the publication of the previous article in December 2015 we have been testing and improving our processing of the data. For example, a review of data processing between receipt of data from HMRC and the commencement of results checking has identified some duplication which will allow an extra day for future results analysis.

Four data challenges were described in the previous article – cleaning of suspicious data, forecasting, the apportionment or mapping of data from HMRC VAT unit to the Office for National Statistics (ONS) Reporting Unit (RU), and the identification of methods to improve the monthly nature of the data.

The cleaning of suspicious data was covered at some length in the second article. As is common with the Monthly Business Survey (MBS), a degree of manual editing is required after automatic cleaning to ensure that the final results are of an appropriate quality. An initial view from manually cleaning 5 of the candidate industries suggests that few businesses require manual editing, but where they do the impact can be significant. We are confident that we can identify businesses that require intervention in this way and that we can improve the automated cleaning rules, and we will seek to limit the ongoing need for manual editing.

We have recently begun to forecast data and the early results appear encouraging. Although the scale of 428,000 returns from businesses is a significant strength, the returns are not available in as timely a manner as MBS data. Initial estimates will therefore need to be forecast, and an analysis of the accuracy of these forecasts over time will be a main determinant of the success of the pilot approach. In addition, further work is required to ensure that the process can be repeated quickly in a production environment.

The data we receive from HMRC is reported at VAT unit level. However, we require data that is available at the ONS RU level as this can be more detailed. This allows a more accurate assessment of the characteristics of often large businesses grouped in an enterprise that cover a variety of different industries. We currently use registered employment to apportion or map data between VAT unit and ONS RU but other variables can equally be considered. We have prepared an approach which will test the impact of using registered turnover and will analyse the findings and consider other options over the next few months.
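The apportionment step can be illustrated with a short Python sketch. It assumes the simplest case, in which the turnover of one VAT unit is shared across several Reporting Units in proportion to a chosen variable; the function name, the RU identifiers and all figures are hypothetical. Passing registered employment as the weights corresponds to the current practice, and passing registered turnover to the alternative being tested.

```python
from typing import Dict, Mapping


def apportion_vat_turnover(vat_turnover: float,
                           ru_weights: Mapping[str, float]) -> Dict[str, float]:
    """Share one VAT unit's turnover across Reporting Units pro rata to weights.

    `ru_weights` maps an RU identifier to the apportionment variable,
    for example registered employment or registered turnover.
    """
    total_weight = sum(ru_weights.values())
    if total_weight <= 0:
        raise ValueError("apportionment weights must sum to a positive value")
    return {ru: vat_turnover * weight / total_weight
            for ru, weight in ru_weights.items()}


# Hypothetical VAT unit covering two RUs in different industries.
by_employment = apportion_vat_turnover(1000.0, {"RU_A": 30, "RU_B": 70})
by_turnover = apportion_vat_turnover(1000.0, {"RU_A": 450_000, "RU_B": 550_000})
print(by_employment)
print(by_turnover)
```

Comparing the two results for the same VAT unit shows how sensitive the industry allocation is to the choice of apportionment variable.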

When the pilot is launched we will ensure that calendar days are used to apportion data from quarterly and annual returns to months, in preference to the current practice of dividing returns by 3 or 12 respectively. Further improvements, for example the use of trading days for appropriate industries, will be considered after the pilot.
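The difference between the two apportionment methods can be shown with a small Python sketch. It assumes that a return covers whole calendar months; the function name and the turnover figure are illustrative rather than taken from our processing system.

```python
import calendar


def apportion_by_calendar_days(total, start_year, start_month, n_months):
    """Split a return's total across its months in proportion to days in each month.

    Assumes the return period starts on the first day of `start_month`
    and covers `n_months` whole calendar months (3 for quarterly, 12 for annual).
    """
    months = []
    year, month = start_year, start_month
    for _ in range(n_months):
        days_in_month = calendar.monthrange(year, month)[1]
        months.append(((year, month), days_in_month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    total_days = sum(days for _, days in months)
    return {period: total * days / total_days for period, days in months}


# Hypothetical quarterly return of 900 covering January to March 2015 (90 days).
by_days = apportion_by_calendar_days(900.0, 2015, 1, 3)
equal_split = 900.0 / 3
for (year, month), value in by_days.items():
    print(f"{year}-{month:02d}: calendar days {value:.1f}, equal split {equal_split:.1f}")
```

In this example February receives less than one third of the quarterly total because it has fewer days, whereas dividing by 3 would give every month the same value.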


5. Next steps

5.1 Parallel run of historical HMRC turnover data

Since January 2014 we have frozen each HMRC delivery. Between April and early June we will be re-processing this historical data and reviewing forecasting at each month. We will compare HMRC data with Monthly Business Survey (MBS) results, analyse the impacts of revisions to forecasts and data cleaning over a concentrated period, and compare quarter-on-quarter growth rates for MBS and cleaned HMRC turnover. This intensive and significant stage of the pilot will determine the extent to which MBS data may be replaced with HMRC turnover.
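The quarter-on-quarter comparison can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method used in the parallel run: both inputs are taken to be monthly turnover series covering the same whole quarters, the function names are hypothetical, and the two series in the example are invented purely to show the calculation.

```python
def quarterly_totals(monthly):
    """Sum a monthly series into consecutive quarters."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) for i in range(0, len(monthly), 3)]


def qoq_growth(quarterly):
    """Quarter-on-quarter growth rates, in percent."""
    return [100.0 * (current / previous - 1.0)
            for previous, current in zip(quarterly, quarterly[1:])]


def growth_gaps(mbs_monthly, hmrc_monthly):
    """HMRC growth minus MBS growth for each quarter, in percentage points."""
    mbs_growth = qoq_growth(quarterly_totals(mbs_monthly))
    hmrc_growth = qoq_growth(quarterly_totals(hmrc_monthly))
    return [hmrc - mbs for mbs, hmrc in zip(mbs_growth, hmrc_growth)]


# Invented monthly turnover for two quarters; not real MBS or HMRC data.
mbs_example = [100.0, 102.0, 101.0, 104.0, 105.0, 103.0]
hmrc_example = [980.0, 1010.0, 1000.0, 1030.0, 1045.0, 1020.0]
print(growth_gaps(mbs_example, hmrc_example))
```

Because the comparison is made on growth rates rather than levels, the much larger coverage of the HMRC series does not in itself produce a gap; only differences in movement between the two sources do.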

5.2 July article

The next article in July will reflect on the analysis of the candidate industries and determine how the scope and timetable of the pilot will progress.


6. References

Allcoat, J (2015) “Feasibility study into the use of HMRC turnover data within Short-term Output Indicators and National Accounts” Office for National Statistics

Stephens, M and Allcoat, J (2015) “Exploitation of HMRC VAT data” Office for National Statistics

Stephens, M and Allcoat, J (2015) “HMRC VAT update December 2015” Office for National Statistics


Contact details for this Methodology

John Allcoat
stoi.development@ons.gsi.gov.uk
Telephone: +44 (0)1633 456616