Highly experimental research, based on web-scraped supermarket data for 30 everyday grocery items, shows that the lowest-priced items have increased in cost by around as much as average food and non-alcoholic drinks prices (with both rising around 6% to 7% over the 12 months to April 2022).
There is considerable variation across the 30 items, with the prices for six items falling over the year, but the prices of five items rising by 15% or more.
The difference between the lowest-cost version of an item and the next lowest-cost version of it is often large; for over two-thirds of the items monitored, the next item was at least 20% more expensive.
The data provided in this article are highly experimental; varying the methodology, particularly the choice of substitutes for missing products, could give notably different estimates. More information on why we have chosen this approach is available in Section 9: Strengths and limitations.
Using innovative analytical methods to track the lowest-priced grocery items
With rising prices seen across many goods and services, the Office for National Statistics (ONS) has looked at the question of how prices of everyday grocery items have changed for the lowest-cost products.
To try to answer this, we apply new and highly experimental methods, making use of web-scraped supermarket data to capture the price changes of everyday grocery items. Over the year to April 2022, online grocery price quotes were collected from seven major supermarket retailers' websites. Prices were assessed for 30 everyday food and drink items, covering fresh fruit and vegetables, cupboard staples, chilled products, as well as meat and fish.
The lowest-priced everyday grocery items have seen a notable variation in price change, with some items showing increases of over 15%, while other items fell in price
There was a very wide range of price movements for the lowest prices, when looking over the year to April 2022.
For 13 of the 30 items monitored, the average lowest price, across the seven retailers, increased at a faster rate than the latest available official consumer price inflation measure for food and non-alcoholic beverages (a 6.7% increase over the 12 months to April 2022). For 10 items, the lowest-cost price increased by more than 10%, and for 5 of those 10 items the lowest-cost price rose by 15% or more.
The items where the lowest prices rose at the fastest rate were pasta (up 50% between April 2021 and April 2022), crisps (17%), bread (16%), minced beef (16%) and rice (15%).
For 6 of the 30 items, the lowest prices fell on average over the 12 months to April 2022. Price decreases were measured for potatoes (a 14% fall in price), cheese (7%), pizza (4%), chips (3%), sausages (3%) and apples (1%).
It is important to consider that for each of the 30 items, the overall figure can be made up of different price movements at the product level.
An example of this is vegetable oil - which could be drawn from a variety of products including sunflower oil and rapeseed oil - where prices rose by 9%. However, for vegetable oil, there are instances where we are missing some product-level price data, at some retailers, for the cheapest oils. In one case, this caused us to track the price of a more expensive oil. For these reasons, the estimate of 9% should be treated with additional caution. We provide further explanation on issues with the data used for this analysis in Section 9: Strengths and limitations.
Figure 1: There was a substantial range of price movement for the lowest prices
Lowest price of selected 30 everyday groceries, item-level price changes, April 2022 compared with April 2021
In cash terms, the largest price rises, on average, were measured for beef mince (up 32 pence for 500g to £2.34) and chicken breast (up 28 pence to £3.50 for 600g). Pasta (an increase of 17 pence), vegetable oil (14 pence), and crisps and rice (both increased by 12 pence) showed the next largest increases in cash terms.
The largest average fall in the lowest price was measured for potatoes (down 12 pence to 75 pence for a 2.5kg bag), followed by cheese (down 7 pence to 88 pence for 255g) and pizza (down 4 pence to 95 pence for 300g).
Table 1: Lowest price of groceries, April 2021 and April 2022, pence
|Item|Size|Price (pence) in April 2021|Price (pence) in April 2022|Change (pence)|
|Mixed frozen vegetables|1000g|78|89|11|
|Orange fruit juice|1000ml|63|72|9|
Figure 2 shows the extent to which prices have changed over the year to April 2022. We have presented item indices with prices in April 2021 given a reference figure of 100. If an item was £2 in April 2021 and rose to £2.20, the item index would increase from 100 in April 2021 to 110, reflecting a 10% price increase. Alternatively, if the same item had fallen in price to £1.80, the index value would fall to 90 reflecting a 10% price reduction.
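The index arithmetic described above can be sketched in a few lines. This is a minimal illustration only; the function name `price_index` is hypothetical and is not taken from the published method.

```python
def price_index(price_pence: float, reference_price_pence: float) -> float:
    """Return the index value of a price, with the reference period set to 100."""
    if reference_price_pence <= 0:
        raise ValueError("reference price must be positive")
    return 100 * price_pence / reference_price_pence


# The worked example from the text: an item priced at £2.00 (200 pence) in April 2021.
print(price_index(220, 200))  # 110.0, a 10% price increase
print(price_index(180, 200))  # 90.0, a 10% price reduction
```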
For a subset of items, most notably pasta, bread, biscuits and crisps, the rate of increase in the lowest price picked up in late 2021.
However, the exact timing of price increases varied depending on the individual product. For example, the lowest price of baked beans rose by 10% over October and November 2021, while pasta prices rose 32% over November and December 2021.
Other items saw a more gradual increase or decrease in the lowest price. For example, the lowest price of potatoes trended downward over the 12 months and was 14% lower in April 2022 than a year earlier. Several items had a very stable lowest price throughout the entire period, such as sausages and yoghurt.
Figure 2: Lowest-cost prices saw varied movements over the year to April 2022
Lowest price of groceries, item-level price index, April 2021=100
Combining the lowest-cost items into an index shows that, overall, the prices of the cheapest items have risen since April 2021 in line with official measures of inflation
Figure 3 shows that an overall groceries index that combines the lowest prices of 30 everyday items follows a similar trend to official measures of inflation for the food and non-alcoholic beverages component of the consumer price index including owner occupiers' housing costs (CPIH).
The lowest prices of the 30 everyday items, weighted by retailer and item, rose by 6.0% in the year to April 2022. When comparing March 2022 with April 2022, the lowest prices rose by 0.9%. Section 9: Strengths and limitations, provides more information on the approach to weighting retailers and items.
Over the month from March to April 2022, the items that saw the largest increase in the lowest price were breakfast cereal (up 6%), mixed frozen vegetables and vegetable oil (both increased by 5%).
Looking across all items over time, there was some evidence of shrinkflation - with pack sizes reducing but costing the same. Since the item size is collected by the web scrapers, this analysis accounts for changes to the product size.
It is also worth noting that there was evidence that sugar-free or low-salt versions of some lowest-cost items are often the same price as the standard versions of these products.
Items may not always be available instore or online, which is reflected in the data picked up by our web scrapers. Value ranges often represent a substantial saving and, where they are not available, the price difference to the next lowest-priced available item is often large.
By looking at all the available products within an item category, the difference between the lowest-cost and next lowest-priced item can be calculated. For over two-thirds of the sample items, the cost of substituting an item would have been at least 20% higher than the previous lowest-cost item. For four items, the difference was more than 50%.
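The gap between the lowest-cost and next lowest-priced product can be sketched as follows. This is an illustration under stated assumptions rather than the method used for the published figures: the function name `substitution_gap` and the example unit prices are hypothetical.

```python
def substitution_gap(unit_prices_pence: list[float]) -> float:
    """Percentage by which the second-lowest unit price exceeds the lowest."""
    ordered = sorted(unit_prices_pence)
    if len(ordered) < 2:
        raise ValueError("need at least two products to measure a gap")
    lowest, next_lowest = ordered[0], ordered[1]
    return 100 * (next_lowest - lowest) / lowest


# Hypothetical unit prices (pence) for three products matching one item.
gap = substitution_gap([90, 36, 45])
print(gap)  # 25.0: the next cheapest product costs 25% more than the cheapest
```

If two products share the lowest unit price, the gap is zero, so the result depends on exactly which products are judged to match the item.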
As a result, the measure of the lowest price presented in this analysis can be sensitive to product availability and the specific products that are being substituted. Different approaches to substituting items can result in very different trends. Section 9: Strengths and limitations explains the full limitations of this analysis.
Analysis of lowest-cost items, UK
Dataset | Released 30 May 2022
Data tables containing the item list and volumes, price change and indices published alongside the Office for National Statistics' (ONS’) analysis of lowest-cost items.
Consumer price inflation
Consumer price inflation is the rate at which the prices of goods and services bought by households rise or fall. It is estimated by using price indices. For an overview of the indices and their uses, please see our article, Consumer price indices, a brief guide: 2017.
Consumer Prices Index including owner occupiers' housing costs (CPIH)
CPIH is the most comprehensive measure of inflation. It extends the Consumer Prices Index (CPI) to include a measure of the costs associated with owning, maintaining and living in one's own home, known as owner occupiers' housing costs (OOH), along with Council Tax. Both are substantial expenses for many households and are not included in the CPI.
Shrinkflation
Shrinkflation is a term used to describe the process of a product's size (volume or concentration) being reduced while its price remains the same. For this analysis, the item size was collected by the web scrapers so we have been able to account for changes to the product size.
Web scraping
Web scraping is the activity or process of taking information from a website. During the coronavirus (COVID-19) pandemic, the Office for National Statistics (ONS) developed web-scraping capability as part of a previous project, which is explained in our Online price changes for high-demand products methodology. This work was expanded to cover a wider range of grocery products.
Data sources and quality
The web-scraped data have been collected from seven grocery retailers: Asda, the Co-op, Iceland, Morrisons, Sainsbury's, Tesco and Waitrose.
For each item and retailer, the price of the lowest-priced product available was selected (after adjusting for the size of the item) over time to see how it changed.
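Selecting the lowest-priced product per retailer after adjusting for size could look something like the sketch below. The `Quote` structure, the pence-per-100g unit and the example products are all assumptions made for illustration; the published analysis does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    retailer: str
    product: str
    price_pence: float
    size_g: float


def unit_price(quote: Quote) -> float:
    """Price in pence per 100g, so packs of different sizes can be compared."""
    return 100 * quote.price_pence / quote.size_g


def cheapest_by_retailer(quotes: list[Quote]) -> dict[str, Quote]:
    """For each retailer, keep the quote with the lowest unit price."""
    best: dict[str, Quote] = {}
    for quote in quotes:
        current = best.get(quote.retailer)
        if current is None or unit_price(quote) < unit_price(current):
            best[quote.retailer] = quote
    return best


# Hypothetical quotes for one item at two retailers.
quotes = [
    Quote("Retailer A", "value pack", 45, 500),     # 9 pence per 100g
    Quote("Retailer A", "branded pack", 120, 1000), # 12 pence per 100g
    Quote("Retailer B", "standard pack", 60, 500),  # 12 pence per 100g
    Quote("Retailer B", "small pack", 40, 400),     # 10 pence per 100g
]
best = cheapest_by_retailer(quotes)
print({retailer: quote.product for retailer, quote in best.items()})
```

Comparing unit prices rather than shelf prices is also what allows a reduction in pack size at an unchanged price to show up as a price rise.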
These data differ from the data used to compile the official measures of inflation for food and non-alcoholic beverage items. For the 179 food and non-alcoholic drink products, local price collectors collect 50,000 prices by visiting sampled retailers in over 140 locations across the UK.
These data enabled us to identify the grocery items most likely to be bought by households on low incomes.
A total of 30 items were chosen as a good trade-off between coverage of a high proportion of expenditure, and the time and complexity costs of adding more items to the analysis.
The approach started with the items with the highest expenditure and the largest quantity bought by households in the lowest-equivalised income decile. But other factors such as the substitutability of items, the representativeness of other comparable items and including a broad range of products were also factored in. For example, prices of minced beef are likely to respond to similar economic factors as other beef and even other meat products.
The complete list of sampled items was prepared in collaboration with external stakeholders.
When selecting products from retailer websites, we only include items within a per-item size band. For example, for sausages, we focus on items between 350g and 700g, inclusive. The size bands are used to:
define representative items
exclude more expensive bulk or large multipack products
create a more consistent comparison between retailers
address data quality issues, such as misclassifications and product sizing errors
The size bands used are based on a manual review of common product sizing for the retailers, and pragmatic decisions to maximise availability and consistency of data.
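A size-band filter of this kind can be sketched as below. Only the sausages band (350g to 700g, inclusive) comes from the text; the dictionary name `SIZE_BANDS_G` and the function `in_size_band` are hypothetical.

```python
# Size bands in grams, (lower, upper), both inclusive.
# Only the sausages band is stated in the article; bands for other items are not given here.
SIZE_BANDS_G = {"sausages": (350, 700)}


def in_size_band(item: str, size_g: float) -> bool:
    """Return True if a product's size falls within the band for its item."""
    lower, upper = SIZE_BANDS_G[item]
    return lower <= size_g <= upper


print(in_size_band("sausages", 454))   # True
print(in_size_band("sausages", 1000))  # False: a bulk pack is excluded
```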
To identify the basket items in the web-scraped data, we use manually defined keywords for each item. This builds on part of a previous Office for National Statistics (ONS) project, which is explained in our Online price changes for high-demand products methodology.
For example, to identify "apples" we might expect the product to be listed in the "Fresh Fruit" section of the retailer product catalogue and to contain the word "apples" in the product name. In practice, each item has an accompanying list of product catalogue sections, keywords to match in the product name, and a separate list of words to exclude products (for example, for "apples" we would not want to include "toffee apples").
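The keyword approach can be illustrated with a small rule-matching sketch. The `ItemRule` structure, the whole-word matching and the example rule are assumptions for illustration; the actual keyword lists are not published in this article.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemRule:
    sections: frozenset       # catalogue sections in which the item may appear
    include_words: frozenset  # at least one must appear in the product name
    exclude_words: frozenset  # none may appear in the product name


def matches(rule: ItemRule, section: str, product_name: str) -> bool:
    """Match on whole words, so that 'pineapples' does not count as 'apples'."""
    words = set(re.findall(r"[a-z]+", product_name.lower()))
    return (
        section.lower() in rule.sections
        and bool(words & rule.include_words)
        and not (words & rule.exclude_words)
    )


# Hypothetical rule for the "apples" item.
APPLES = ItemRule(
    sections=frozenset({"fresh fruit"}),
    include_words=frozenset({"apples"}),
    exclude_words=frozenset({"toffee"}),
)

print(matches(APPLES, "Fresh Fruit", "Gala Apples 6 Pack"))    # True
print(matches(APPLES, "Fresh Fruit", "Toffee Apples 4 Pack"))  # False
print(matches(APPLES, "Fresh Fruit", "Sweet Pineapples"))      # False
```

Matching on whole words avoids one obvious false positive, but it would still miss products named "Apple" in the singular unless that word were added to the rule.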
This approach is simple to apply, and quick to iterate and add new items, but it does have some drawbacks. There is a great deal of manual quality assurance required to check that the keywords select the expected products. It is also hard to guarantee that all the target products have been found. Additionally, there is some judgement required around which products to allow as representatives for an item, factoring in packaging, product composition, reliability of size information, and comparability of the products.
Product price quote data are pooled by month, and we exclude products that were only available for a week, or less, of that month. This helps address issues with data collection, for example, where the collection for a retailer has been affected by changes to the website.
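One way to express this availability rule is sketched below. It assumes that "a week, or less" means seven or fewer distinct daily collections in a calendar month; that interpretation, and the function name `products_to_keep`, are ours for illustration.

```python
from collections import defaultdict
from datetime import date


def products_to_keep(observations, min_days=8):
    """Return (product, year, month) keys seen on at least `min_days` distinct days.

    `observations` is an iterable of (product, collection_date) pairs.
    The default of 8 assumes 'a week, or less' means 7 or fewer days.
    """
    days_seen = defaultdict(set)
    for product, day in observations:
        days_seen[(product, day.year, day.month)].add(day)
    return {key for key, days in days_seen.items() if len(days) >= min_days}


# Hypothetical collection: one product seen on 10 days in April 2022, another on 3.
observations = (
    [("value pasta", date(2022, 4, d)) for d in range(1, 11)]
    + [("other pasta", date(2022, 4, d)) for d in range(1, 4)]
)
kept = products_to_keep(observations)
print(kept)  # {('value pasta', 2022, 4)}
```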
The product size information in our data is not perfect. The web-scraped data do not have product size information at all for a large number of products, and where it is present it is sometimes incorrect, ambiguous or in nonstandard units. To address this, we have manually searched for size information for several products and applied unit conversion where appropriate. Where we do not have any size information, or the size information is larger or smaller than is plausible, we drop the product from the analysis.
Items are aggregated according to the Consumer Prices Index (CPI) weights at classification of individual consumption by purpose (COICOP) 5 level. In most cases, each item belonged to a unique COICOP5 category. For four COICOP5 categories, we have a pair of items belonging in the category. In these cases, the category weight was split evenly between the two items.
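The even split of a category weight between a pair of items can be sketched as follows. The category names, item names and weights are placeholders, since the article does not list which categories contain two items.

```python
from collections import Counter


def item_weights(category_weights: dict, item_category: dict) -> dict:
    """Share each COICOP5 category weight evenly among the items assigned to it."""
    items_per_category = Counter(item_category.values())
    return {
        item: category_weights[category] / items_per_category[category]
        for item, category in item_category.items()
    }


# Placeholder categories and weights, not actual CPI weights.
category_weights = {"category_a": 30.0, "category_b": 20.0}
item_category = {"item_1": "category_a", "item_2": "category_b", "item_3": "category_b"}

print(item_weights(category_weights, item_category))
# {'item_1': 30.0, 'item_2': 10.0, 'item_3': 10.0}
```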
As we use prices from seven high street grocery retailers, we need to use a weight structure to produce weighted average prices across those retailers. We use three main data sources to estimate the market share of each retailer within low-income households: UK grocery market share data, Index of Multiple Deprivation (IMD) Income Domain, and UK supermarket location data.
UK grocery market shares, provided by Kantar, capture aggregate market shares across all income levels. Therefore, we have developed a method to calibrate those shares to obtain grocery market shares for low-income households.
The Income Domain of the IMD ranks small areas (Lower layer Super Output Areas, LSOAs) within each country of the UK according to their income deprivation levels. We first match each grocery store to the income decile of the neighbourhood in which it is located. Then, we calculate the proportion of each retailer's stores in the lowest-income decile. Finally, we apply these proportions to the retailer's market share to approximate the market share of each retailer in the lowest-income decile areas. These shares are then rescaled to sum to one and used as retailer weights.
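The three calibration steps can be sketched as below. The retailer names, market shares and store deciles are invented for illustration, and the sketch assumes that decile 1 denotes the lowest-income areas; none of these values come from the Kantar, IMD or store location data.

```python
def retailer_weights(market_share: dict, store_deciles: dict, lowest_decile: int = 1) -> dict:
    """Scale each retailer's market share by its share of stores in the lowest-income
    decile, then rescale the results so that they sum to one."""
    raw = {}
    for retailer, share in market_share.items():
        deciles = store_deciles[retailer]
        proportion_lowest = deciles.count(lowest_decile) / len(deciles)
        raw[retailer] = share * proportion_lowest
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no retailer has stores in the lowest-income decile")
    return {retailer: value / total for retailer, value in raw.items()}


# Invented inputs: overall market shares and the income decile of each store's area.
market_share = {"Retailer A": 0.5, "Retailer B": 0.5}
store_deciles = {
    "Retailer A": [1, 1, 5, 9],  # half of stores in decile 1
    "Retailer B": [1, 2, 3, 4],  # a quarter of stores in decile 1
}
weights = retailer_weights(market_share, store_deciles)
print(weights)  # Retailer A receives about two-thirds of the weight
```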
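To produce the index series at each aggregation level, we use a bilateral Lowe index, which uses quantities in a choice of period to weight each item. The formula for the Lowe Index is given as:
$$P_{Lo}^{0,t} = \frac{\sum_{i} p_i^{t} q_i^{b}}{\sum_{i} p_i^{0} q_i^{b}}$$
where $p_i^t$ is the price of product i in period t, $p_i^0$ is its price in the price reference period 0, $q_i^b$ is its quantity in period b, and b can be any period, or range of periods.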
At the first level, to produce the item group index series, we apply this formula for each item group, using unit prices as pᵢᵗ for each product i in period t and retailer weights as q, which we assume to be constant over time.
To produce the final index series, at the highest level, we use the item group indices and apply the Lowe formula again, this time treating each item group as a product with index values pᵢᵗ for each item group i in period t and aggregating according to the CPI weights at COICOP5 for q.
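The two-level aggregation can be sketched in code as follows. This is an illustration of a Lowe index applied twice, not the production implementation; the function name `lowe_index`, the retailers, prices and weights are all invented.

```python
def lowe_index(prices_t: dict, prices_0: dict, quantities: dict) -> float:
    """Lowe index (reference period = 100) with fixed quantities as weights."""
    numerator = sum(prices_t[k] * quantities[k] for k in quantities)
    denominator = sum(prices_0[k] * quantities[k] for k in quantities)
    return 100 * numerator / denominator


# Level 1: one index per item, weighting retailers' lowest unit prices.
retailer_w = {"Retailer A": 0.5, "Retailer B": 0.5}  # invented retailer weights
pasta = lowe_index(
    prices_t={"Retailer A": 60, "Retailer B": 60},
    prices_0={"Retailer A": 40, "Retailer B": 40},
    quantities=retailer_w,
)
potatoes = lowe_index(
    prices_t={"Retailer A": 90, "Retailer B": 90},
    prices_0={"Retailer A": 100, "Retailer B": 100},
    quantities=retailer_w,
)

# Level 2: treat each item index as a "price" (100 in the reference period)
# and weight by the item's expenditure weight (invented here).
item_w = {"pasta": 1, "potatoes": 3}
overall = lowe_index(
    prices_t={"pasta": pasta, "potatoes": potatoes},
    prices_0={"pasta": 100, "potatoes": 100},
    quantities=item_w,
)
print(pasta, potatoes, overall)  # 150.0 90.0 105.0
```

Because the weights are held fixed, a large rise in a low-weight item (pasta here) moves the overall index less than a smaller fall in a high-weight item.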
Limitations in measuring the lowest price of groceries
The estimates presented here are highly experimental and are subject to great uncertainty. All average estimates will reflect a combination of different price movements for each individual product.
There are several limitations of the analysis. An important one is that the data are based on prices and product characteristics (including pack size) that have been scraped from retailers' websites. This means that the available products represent the retailer's online catalogue, rather than the range of products available or bought in local stores that month.
We do not have the data to say that a lowest-priced product is actively being purchased by consumers. Although our dataset includes price and product details, we do not have sales or expenditure data (all we know is that it is available to purchase from the retailer's website).
As we wish to focus on the very lowest-cost products, not an average of numerous product prices, the estimates of price change are created from a very small number of price quotes. For each month, figures for each item are based on seven prices at most - these are the products with the lowest unit price from each retailer. This means that the analysis is extremely sensitive to the input data.
With any new experimental process, there may be problems with its implementation. In this case, data have not been collected under ideal conditions. The inability of web scrapers to immediately adapt to changes to retailer websites means some data were missed on occasion. As data are collected on a daily basis, it is not possible for us to go back and recollect missing data. Where data have been missed, we have developed processes to account for missing prices.
Impact of product substitution on results
Where a product selected in the previous month was not available in the collected data, a substitution was made to the next lowest-priced, similar available item. A substitution was also made where a new, similar item was available at a lower price. It is important to note the findings are very sensitive to the approach for substituting products and, as noted above, data on online price quotes and product availability may not reflect instore conditions.
The difference between the cheapest and the next cheapest item is often substantial, and so any type of substitution can have a notable impact on the index and corresponding price change over the year.
The impact of substitution can act to either increase or decrease the price index, at any point in the time series. For example, the lowest-priced or value-range item may not happen to be available at the start of the time period but may come back in stock at the end of the time period. This would act to reduce the index in later periods.
To identify the underlying trend in the lowest prices, without introducing excessive noise from product churn - where the range of available products on online stores continually changes - it may be beneficial to constrain the amount of product substitution that is occurring.
One approach to doing this is to limit the allowed substitutions so that they are within a strict percentage price difference range. The benefit of this is that it reflects the fact that spending decisions can change depending on whether a substituted item is substantially more expensive or not. This approach - which is the one we use for the headline results - removes much of the volatility in the index and results in a stronger upward trend throughout the year. Substitutes were not selected if the price was 20% higher or lower than the existing item. This threshold was informed by sensitivity analysis balancing the volatility of the resulting data, the effect on the number of eligible products and the comparability of the substitute item.
A drawback of this approach is that it can cause us to miss the entry of cheaper products for an item where the new products come in at a price point far enough below the current product.
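The constrained substitution rule, and the drawback just described, can be illustrated with a short sketch. It assumes that a candidate is ineligible when its unit price differs from the existing product's by 20% or more; the exact treatment of the boundary is our assumption, as are the function name `choose_substitute` and the example prices.

```python
def choose_substitute(current_unit_price, candidate_unit_prices, max_diff=0.20):
    """Return the cheapest candidate whose unit price is within `max_diff`
    (as a proportion) of the current product's, or None if there is none."""
    eligible = [
        price
        for price in candidate_unit_prices
        if abs(price - current_unit_price) / current_unit_price < max_diff
    ]
    return min(eligible, default=None)


# Invented unit prices (pence): the existing product cost 50.
candidates = [30, 55, 58, 80]

constrained = choose_substitute(50, candidates)
unconstrained = choose_substitute(50, candidates, max_diff=float("inf"))
print(constrained)    # 55: the 30p product is 40% cheaper, so it is skipped
print(unconstrained)  # 30: with no constraint, the cheapest product is chosen
```

The constrained call passes over the much cheaper 30p product, which is the missed-entry effect noted above, while the unconstrained call would also accept the 80p product if nothing cheaper were available.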
Another approach would be to instead allow no constraint when choosing a substitute item. That is, to pick the cheapest product (by unit price within a size band) matching each item, with no regard to the price difference for the substitute item. For some items, this approach could lead to some very expensive products being considered as substitute items, meaning that we would see far greater volatility in the overall time series.
This alternative approach to substitution would result in a notable reduction in the index from February 2022 onwards, reflecting some value items not being available (in online stores) at the very start of the period, with the only available substitutes at that point being much more expensive products. When cheaper value products become available in later months, this alternative method would show reductions in the index relative to the results from our chosen, more constrained, substitution approach.
This analysis is part of our current and future analytical work related to the cost of living, which has also included developing our personal inflation calculator to show you how inflation is affecting your household costs.
As we have outlined, since the analysis is based on web-scraped data, there were, inevitably, limitations to the analysis of lowest-cost items that we could carry out.
Our ongoing transformation programme to include new improved data sources, and developing our methods and systems for the production of UK consumer price statistics will notably improve our capability to reflect our changing economy and produce more robust, timely and granular inflation statistics for businesses, individuals, and government.
We welcome feedback on this work, which can be addressed to: firstname.lastname@example.org
Contact details for this Article
Telephone: +44 1633 455121