The amount and variety of available data is growing at an increasing pace. Data now comes from many sources and in many formats, including audio, video, computer logs, purchase transactions, sensors and social networking sites. This has given rise to Big Data: large, often unstructured datasets that are potentially available in real time. At the same time, new data science techniques for maximising the value of both Big Data and other data sources are constantly being developed.

Big Data is a big topic. As the UK's largest producer of official statistics, we want to understand the effect it may have on our statistical processes and outputs. Our Big Data Team is investigating the advantages and challenges of using alternative sources of data and data science techniques in official statistics. This includes projects such as exploring web-scraped price data, machine learning for matching addresses and natural language processing for coding textual survey responses.

The team works closely with the ONS Data Science Campus and is part of the wider Government Data Science Partnership.

We regularly publish the outcomes of our work, and occasionally blog about it too. You can find some of our reports in the download section of this page. For a complete list, visit our GitHub.io page, which also contains some of our code.

We are committed to protecting the confidentiality of all information we hold. To produce statistics using alternative data sources, we are interested only in trends or patterns in the data, not in personal data about individuals. However, we recognise that accessing data from the private sector or from the internet may raise concerns around security and privacy. We ensure that all of our work fully complies with legal requirements and our obligations under the Code of Practice for Official Statistics. We also work closely with the National Statistician’s Data Ethics Advisory Committee to consider the ethical issues associated with using these types of data sources within official statistics. For instance, we have developed guidance for web-scraping in official statistics.

For more information about the Big Data Team and its projects, please contact us via email at ons.big.data.project@ons.gsi.gov.uk.

Want to work on data science within government? Go to the Government Statistical Service Data Scientist recruitment page.