Table of contents
- Main points
- Change in Coronavirus (COVID-19) Infection Survey data collection method
- Likelihood of testing positive for COVID-19 by data collection method
- Representativeness of the Coronavirus (COVID-19) Infection Survey population sample by data collection method
- Future developments
- How the data are measured
- Coronavirus (COVID-19) Infection Survey, Quality Report data
- Collaboration
- Glossary
- Related links
- Cite this article
1. Main points
Following on from our September quality report, we have continued to assess the impact of the change to our data collection method from study worker home visits to remote data collection on our coronavirus (COVID-19) estimates.
We compared the estimated likelihood of testing positive for COVID-19 on a nose and throat swab by data collection method while adjusting for several variables. We also assessed the representativeness of our remote data collection and study worker population samples by comparing them with demographic data from the 2021 Census.
In the first week (11 to 17 July) of the period studied (11 to 31 July 2022), participants who provided a swab sample by remote data collection were more likely to test positive than those who provided a swab sample at a study worker home visit; after this, there was no statistical evidence of a difference between the groups in their likelihood of testing positive.
Remote data collection and study worker population samples are representative of the Census 2021 population by sex, age and region.
The demographic profiles of remote data collection and study worker home visit population samples are very similar to each other.
3. Likelihood of testing positive for COVID-19 by data collection method
In our August quality report, we compared modelled estimates of the percentage of people testing positive for coronavirus (COVID-19) on a nose and throat swab by data collection method. The estimates were produced using the same methods as those in our weekly Coronavirus (COVID-19) Infection Survey bulletin, which use a Bayesian multi-level regression and post-stratification (MRP) model that adjusts for age, sex and region. More information on the methods used to produce COVID-19 positivity rates in our weekly bulletin can be found in our methods article.
To further assess the impact of the change in how the data are collected, we compared the estimated likelihood of testing positive for COVID-19 on a nose and throat swab by data collection method and calendar date while adjusting for several variables. The analysis is based on regression models similar to those presented in our Analysis of populations in the UK by risk of testing positive for coronavirus (COVID-19) September 2021 publication, which provides a more detailed explanation of the methods used.
The models presented here additionally include an interaction between data collection method and calendar date to test for variation in any effect of data collection method on the likelihood of testing positive for COVID-19 over time.
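As a rough guide to what such an interaction means, the model can be written in the following general form. This is a sketch under stated assumptions, not the exact specification used in the analysis: the symbols are introduced here for illustration, and the article does not state whether calendar date enters as a categorical, linear or smoothed term. Here $R_i$ is 1 for a remotely collected swab and 0 for a study worker home visit, $t_i$ is the calendar date of the swab, and $x_i$ holds the other variables controlled for.

```latex
\operatorname{logit} \Pr(Y_i = 1)
  = \beta_0 + \alpha_{t_i} + R_i \left( \beta_R + \delta_{t_i} \right) + \gamma^{\top} x_i ,
\qquad
\mathrm{OR}_t = \exp\left( \beta_R + \delta_t \right)
```

In this form, $\alpha_t$ captures how positivity changes over time for home visits, and $\delta_t$ lets the remote versus home visit comparison differ by date. If every $\delta_t$ were zero, the odds ratio for remote collection would be the same on every day.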
The analysis in this section uses data from 11 to 31 July 2022 and includes 4,104 positive results from 89,030 people who provided a nose and throat swab to study workers, and 2,430 positive results from 53,718 people who provided a nose and throat swab via post or courier. Our first regression model allowed us to test the effect of data collection method by calendar date on the likelihood of testing positive for COVID-19 on a nose and throat swab, while controlling for the following demographic variables:
- age
- sex
- geographical region the participant lives in
- ethnicity
- deprivation score
- household size
- whether the household was multigenerational
- urban or rural classification of the participant's address
- effect of a disability (from not having a disability to affected "a lot" by a disability)
The likelihood of testing positive for COVID-19 for those who provided a nose and throat swab sample remotely, compared with those who provided a swab sample at study worker home visits, by calendar date from 11 to 31 July 2022, is shown in Figure 1.
Results show that between 11 and 17 July 2022, participants who provided a swab sample by remote data collection were more likely to test positive than those who provided a swab sample at a study worker home visit. It is possible that those who provided a swab sample remotely in the first few days after launch differed in a way that made their risk of testing positive higher than that of those who provided a swab sample at a study worker home visit in the same period. For example, symptomatic participants may have provided a swab sample remotely sooner in the early days of the online survey launch than participants with no symptoms, so that they could know their infection status.
All participants who provided a swab sample by remote data collection at the beginning of the online survey launch were at the start of their 14-day data collection window. Subsequent samples could have been taken at any time during a participant's 14-day data collection window, and invitations to participants to move to the remote data collection approach were also staggered. This means that, after the first week, the samples received on any given day came from participants at different points in their testing windows. In the first week there was no such overlap with other data collection windows, which is why the behaviour of symptomatic participants may have affected the results in that week.
Between 18 and 31 July 2022, there was no statistical evidence of a difference between those who provided a swab sample by remote data collection and those who provided a swab sample at a study worker home visit in the likelihood of testing positive for COVID-19 on a nose and throat swab. This finding supports overall comparability of the results obtained from remote data collection and study worker home visits.
The odds ratios for this analysis are shown in Figure 1. An odds ratio of greater than 1 indicates a greater likelihood of an outcome in the specified group compared with the reference group, and an odds ratio of less than 1 indicates a lower likelihood. In this case, an odds ratio of greater than 1 indicates an increased likelihood of testing positive for COVID-19 for those who provided a swab sample remotely compared with those who provided a swab sample with a study worker home visit. An odds ratio of less than 1 indicates a decreased likelihood of testing positive for COVID-19.
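To illustrate how an odds ratio is calculated, the sketch below computes a crude (unadjusted) odds ratio from the counts quoted earlier in this section. It is an illustration only: it pools the whole period from 11 to 31 July 2022, controls for no other variables, and assumes each positive result can be treated as one person testing positive. It therefore does not reproduce the modelled daily odds ratios in Figure 1, and the function names are introduced here for illustration.

```python
# Counts quoted in this section for 11 to 31 July 2022.
REMOTE_POSITIVE, REMOTE_TOTAL = 2_430, 53_718   # swabs returned by post or courier
VISIT_POSITIVE, VISIT_TOTAL = 4_104, 89_030     # swabs given to study workers


def odds(positive: int, total: int) -> float:
    """Odds of testing positive: positives divided by non-positives."""
    return positive / (total - positive)


def odds_ratio(pos_group: int, total_group: int,
               pos_ref: int, total_ref: int) -> float:
    """Odds in the specified group divided by odds in the reference group."""
    return odds(pos_group, total_group) / odds(pos_ref, total_ref)


# Remote collection is the specified group; home visits are the reference group.
crude_or = odds_ratio(REMOTE_POSITIVE, REMOTE_TOTAL, VISIT_POSITIVE, VISIT_TOTAL)
print(f"Crude odds ratio (remote vs home visit): {crude_or:.3f}")
```

The crude value is slightly below 1. Because it is unadjusted and pooled over three weeks, it cannot show the day-by-day pattern that the interaction model is designed to test.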
Figure 1: There was no statistical evidence of a difference in the likelihood of testing positive for coronavirus (COVID-19) between remote and study worker home visit data collection methods, from 18 to 31 July 2022
Estimated likelihood of testing positive for COVID-19 on nose and throat swabs by day for those who provided a swab sample remotely compared with those who provided a swab sample at a study worker home visit, UK, 11 to 31 July 2022
Notes:
An odds ratio of greater than 1 indicates a greater likelihood of an outcome in the specified group compared with the reference group, and an odds ratio of less than 1 indicates a lower likelihood.
This model controls for age, sex, geographical region the participant lives in, ethnicity, deprivation score, household size, whether the household was multigenerational, urban or rural classification of participant’s address, and the effect of a disability.
We carried out a sensitivity analysis using a second regression model, which controlled for the variables mentioned previously, as well as other variables that are associated with COVID-19 positivity, such as COVID-19 vaccinations, previous COVID-19 infection and recent contact with hospitals. When controlling for these additional variables, the results comparing the two data collection methods were very similar. Odds ratios from this model and the previous model can be found in Tables 1a and 1b of the Coronavirus (COVID-19) Infection Survey quality report: December 2022 dataset.
All variables used and variables considered for these models can be found in Section 6: How the data are measured.
5. Future developments
The findings presented in this article, together with those from our August 2022 and September 2022 Coronavirus (COVID-19) Infection Survey quality reports, indicate that the change to a remote data collection method has had minimal impact on survey results.
We are continuing to conduct comparative analyses and will publish further findings over the coming months. These analyses will include the representativeness of participants across more demographic variables.
6. How the data are measured
Likelihood of testing positive for coronavirus (COVID-19) by data collection method
The models described in Section 3: Likelihood of testing positive for COVID-19 by data collection method test the effect of data collection method by day on the likelihood of testing positive for COVID-19, while controlling for several other variables. Variables controlled for in our first model were:
- age
- sex
- geographical region the participant lives in
- ethnicity
- deprivation score
- household size
- whether the household was multigenerational
- urban or rural classification of the participant's address
- effect of a disability (from not having a disability to affected "a lot" by a disability)
Variables controlled for in our second model were:
- all of the variables controlled for in our first model
- work status (responses were grouped into "Employed, working", "Employed, not working", "Not working", "Retired" and "Child/student")
- whether the participant was previously infected with COVID-19 based on a positive swab test (in the survey, the English national testing programme or self-reported)
- whether the participant had travelled abroad in the previous 28 days
- COVID-19 vaccinations
- contact with hospitals in the previous 28 days
- contact with care homes in the previous 28 days
- whether the participant currently smoked
Additional variables considered for the model that were not included were:
- whether a child aged 16 years or under lived in the household
- whether an adult aged 70 years or over lived in the household
- days worked outside the home
- whether the participant worked in a patient-facing healthcare role, a health and social care role or a care home
- whether the participant worked in a role that involves direct contact with others
- work sector
- work or school location (at home or elsewhere)
- social distancing at work or school
- how the participant travels to work or school
These variables were not included in the model because our screening process revealed no statistical evidence of association between them and the likelihood of testing positive for COVID-19.
8. Collaboration
The Coronavirus (COVID-19) Infection Survey analysis was produced by the Office for National Statistics (ONS) in collaboration with our research partners at the University of Oxford, the University of Manchester, UK Health Security Agency (UK HSA) and Wellcome Trust. Of particular note are:
- Sarah Walker - University of Oxford, Nuffield Department for Medicine: Professor of Medical Statistics and Epidemiology and Study Chief Investigator
- Koen Pouwels - University of Oxford, Health Economics Research Centre, Nuffield Department of Population Health: Senior Researcher in Biostatistics and Health Economics
- Thomas House - University of Manchester, Department of Mathematics: Reader in Mathematical Statistics
9. Glossary
Deprivation
Deprivation is based on an index of multiple deprivation (IMD) (PDF, 2.18MB) score or equivalent scoring method for the devolved administrations, from 1, which represents most deprived, up to 100, which represents least deprived. The hazard or odds ratio shows how a 10-unit increase in deprivation score, which is equivalent to 10 percentiles or 1 decile, affects the likelihood of testing positive for COVID-19.
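As an illustration of how an odds ratio quoted per 10-unit increase can be read for other differences in deprivation score, the sketch below rescales it. It assumes the deprivation score enters the model as a linear term on the log-odds scale, which the definition above implies but does not state outright. The odds ratio used is a made-up value for illustration, not an estimate from the survey, and the function name is hypothetical.

```python
def rescale_odds_ratio(or_per_10_units: float, units: float) -> float:
    """Odds ratio for a `units`-point change in deprivation score.

    Assumes a linear effect on the log-odds scale, so the log odds ratio
    scales in proportion to the size of the change.
    """
    return or_per_10_units ** (units / 10)


# Hypothetical value for illustration only; not an ONS estimate.
EXAMPLE_OR_PER_10 = 0.98

for units in (10, 20, 50):
    rescaled = rescale_odds_ratio(EXAMPLE_OR_PER_10, units)
    print(f"{units}-unit increase: odds ratio {rescaled:.3f}")
```

Under this assumption, a difference of two deciles corresponds to the per-decile odds ratio squared, not doubled.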
SARS-CoV-2
This is the scientific name given to the specific virus that causes COVID-19.
Effect of a disability
To measure how severely a disability affected participants, we asked them if any long-lasting health conditions reduced their ability to carry out day-to-day activities, as part of our Coronavirus (COVID-19) Infection Survey questionnaire. The response options for this question were: "Yes, a lot", "Yes, a little" or "Not at all".
Odds ratio
An odds ratio indicates the likelihood of an individual testing positive for COVID-19 given a particular characteristic or variable. When a characteristic or variable has an odds ratio of 1, this means there is neither an increase nor a decrease in the likelihood of testing positive for COVID-19 compared with the reference category. An odds ratio greater than 1 indicates an increased likelihood of testing positive for COVID-19 compared with the reference category. An odds ratio less than 1 indicates a decreased likelihood of testing positive for COVID-19 compared with the reference category.
Confidence interval
A confidence interval gives an indication of the degree of uncertainty of an estimate, showing the precision of a sample estimate. The 95% confidence intervals are calculated so that if we repeated the study many times, 95% of the time the true unknown value would lie between the lower and upper confidence limits. A wider interval indicates more uncertainty in the estimate. Overlapping confidence intervals indicate that there may not be a true difference between two estimates. For more information, see our methodology page on statistical uncertainty.
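To make the idea of overlapping intervals concrete, the sketch below calculates simple 95% confidence intervals for the crude percentage testing positive in each data collection group, using the counts quoted in Section 3. This is a simplified illustration and not the method used for our published estimates: it uses a normal approximation, treats each positive result as one independent person, and makes no adjustment for other variables. The function names are introduced here for illustration.

```python
import math

Z_95 = 1.96  # standard normal multiplier for a 95% interval


def wald_interval(positive: int, total: int, z: float = Z_95) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion."""
    p = positive / total
    standard_error = math.sqrt(p * (1 - p) / total)
    return p - z * standard_error, p + z * standard_error


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two (lower, upper) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


# Counts quoted in Section 3 for 11 to 31 July 2022.
remote = wald_interval(2_430, 53_718)   # remote data collection
visit = wald_interval(4_104, 89_030)    # study worker home visits

print(f"Remote:     {remote[0]:.2%} to {remote[1]:.2%}")
print(f"Home visit: {visit[0]:.2%} to {visit[1]:.2%}")
print("Intervals overlap:", intervals_overlap(remote, visit))
```

The two crude intervals overlap, which is consistent with there being no clear difference between the groups over the period as a whole.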
11. Cite this article
Office for National Statistics (ONS), published 21 December 2022, ONS website, methodology article, Coronavirus (COVID-19) Infection Survey, Quality Report: December 2022