|Data collection||Administrative data|
|How compiled||Death certificate records|
|Geographic coverage||England and Wales|
|Related publications||Deaths of homeless people in England and Wales – local authority estimates: 2013 to 2017|
|Last revised||25 February 2019|
This quality and methodology report contains information on the quality characteristics of the data (including the European Statistical System five dimensions of quality) as well as the methods used to create it.
The information in this report will help you to:
understand the strengths and limitations of the data
learn about existing uses and users of the data
understand the methods used to create the data
decide suitable uses for the data
reduce the risk of misusing data
These figures are produced as Experimental Statistics, which are in the testing phase and not yet fully developed; they have yet to be assessed against the rigorous quality standards of National Statistics. Comments and suggestions to improve the quality of this output and make it more useful to users are invited, and can be sent by email to firstname.lastname@example.org.
Deaths of homeless people were identified from the death registration records held by the Office for National Statistics (ONS), and a statistical method called capture-recapture modelling was applied to estimate the most likely number of additional registrations not identified as homeless people.
The figures reported are the total estimated numbers, except where specifically described as being based on identified records only; the method used provides a robust but conservative estimate, so the real numbers may still be higher.
Definitions of homelessness exist for different purposes and with variations across the UK for legal and policy reasons; the Government Statistical Service (GSS) Harmonisation Team have explored the feasibility of harmonising definitions of homelessness for official statistics and a report on this work will be published on the GSS website on 28 February 2019.
The meaning of homelessness in this release is not based on a pre-existing definition but follows from the scope for identification of affected individuals in the death registration data; the records identified are mainly people sleeping rough, or using emergency accommodation such as homeless shelters and direct access hostels, at or around the time of death.
The first official estimates of the number of deaths of homeless people in England and Wales, published on 20 December 2018, covered deaths registered in the years 2013 to 2017. Selected breakdowns by age and sex, cause of death, time of year, and geographical area were given. The geographical areas were England and Wales combined and separately, English regions, and combined authorities (“city regions”).
A follow-up article giving estimates of the number of deaths of homeless people for local authorities in England and Wales was published on 25 February 2019. Information on the distribution of these deaths by decile of The Index of Multiple Deprivation (IMD) and Welsh Index of Multiple Deprivation (WIMD) and by the 2011 rural-urban classification was also included. We plan to update both the national and local area figures annually.
Uses and users
Homelessness is an important problem affecting some of the most vulnerable people in society, but it is difficult to measure as well as to solve. The government’s Rough Sleeping Strategy for England set new aims, including that deaths or serious harm of people who sleep rough should be rigorously investigated, while the Welsh Rough Sleeping Action Plan called for better monitoring and measuring of the extent of rough sleeping. The UK Statistics Authority published a review of housing and planning statistics as a whole in November 2017.
In addition, every local authority (LA) has a rough sleeping strategy and typically employs outreach workers. The information in this analysis can inform these activities.
Strengths and limitations
The cross-referencing of different mentions of homelessness in death certificates provides a strong dataset of identified homelessness.
These data provide users with valuable insight into the changing patterns of deaths while homeless in England and Wales.
The deaths while homeless figures are produced using the same methods for all local authorities in England and Wales, so data for one local authority are comparable with data for other local authorities.
It is important to be aware of the limitations of the local authority figures. In particular, the method did not allow any estimated deaths to be allocated to local authorities where there were no identified deaths of homeless people in the relevant year. This means that a small number of deaths may have occurred in areas that are shown as having no deaths in these figures. We plan to use a more sophisticated estimation method to overcome this in the next release (deaths registered in 2018).
The figures presented show deaths registered each year, rather than deaths occurring each year. A substantial proportion (approximately 83% for the five-year period 2013 to 2017) of deaths of homeless people are certified by a coroner. This means that, due to the length of time it can take for an inquest to be completed, some of the deaths registered in (for example) 2017 will have occurred in earlier years, while some deaths that occurred in 2017 will not yet be included in the figures. These differences are likely to have relatively little impact at England and Wales level but can have more influence on figures for smaller geographical areas such as local authorities. See the latest report on the impact of registration delays on mortality statistics for more information.
These figures are produced as Experimental Statistics, which are in the testing phase and not yet fully developed. There are no recent improvements.
How we collect the data, main data sources and accuracy
The figures are compiled using information supplied when a death is registered. A record for each death registered in England and Wales is held on the Office for National Statistics (ONS) death registrations database. Further details about the information held on the ONS death registrations database, and the methods used to quality assure the data can be found in the User guide to mortality statistics.
All deaths in England and Wales are coded by ONS according to the International Classification of Diseases (ICD) produced by the World Health Organisation. The Tenth Revision (ICD 10) has been used by ONS since 2001.
How we process the data
The figures in this release were produced following a two-stage process. First, the complete death registration records held by ONS, for deaths registered in the relevant calendar years, were analysed using multiple search strategies to identify all those deaths where there was evidence that the deceased was homeless at or around the time of death. Then, the results of the searches were used in a statistical modelling technique known as capture-recapture to estimate a total figure, which allows for the likelihood of more deaths of homeless people being present in the data but not identified.
Five search strategies were used, which are detailed in this section.
The recorded place of residence contained any of a list of text expressions such as “no fixed abode”, “homeless” and “night shelter” or the name or address of a known homeless hostel or project. An extensive list of addresses was compiled from publicly available sources. While this list was necessarily incomplete, the statistical model was found to be robust against even a substantial number of omissions.
Similarly, the recorded place of death contained any of a list of text expressions such as “no fixed abode”, “homeless” and “night shelter” or the name or address of a known homeless hostel or project.
The death had been investigated by a coroner, and the details received by ONS after the inquest included any of the text expressions or addresses outlined previously. The information provided by coroners is broader and may be more precise than for deaths that do not require an inquest.
The record contained a “communal establishment code”, which specified a homeless hostel or shelter. These codes are assigned by ONS during the initial processing of a death registration, based on a periodically updated list of known postcodes of institutions of all kinds, ranging from hospitals to prisons.
The death occurred in hospital or in a hostel or similar location, and the recorded postcode of the place of residence was identical to the postcode of the place of death. This search ensured the inclusion of homeless people who had been found in need of medical attention in the street and subsequently died in hospital, or certain other possible scenarios.
The records identified by these searches were checked individually to prevent the incorrect inclusion of deaths, such as a person who lived in a hostel that catered for a non-homeless client group. No definite homeless deaths were identified below the age of 15 years, which was taken as the lower age cut-off. An upper age cut-off of 75 years was applied; this was important to exclude deaths of elderly people in a care home or after a long hospital stay, for whom in some cases no residential address had been recorded.
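The five searches can be pictured as producing one 0/1 flag per search for each death record, which together form that record's capture history. The following is a minimal Python sketch under stated assumptions, not the ONS implementation: the field names, the `HOMELESS_HOSTEL` code value and the three-item expression list are hypothetical, and matching against the list of hostel addresses and the individual checking of records are omitted.

```python
import re

# Only the three example expressions quoted in the text; the real list is longer.
EXPRESSIONS = ["no fixed abode", "homeless", "night shelter"]
PATTERN = re.compile("|".join(re.escape(e) for e in EXPRESSIONS), re.IGNORECASE)


def mentions_homelessness(text):
    """True if the free-text field contains any of the listed expressions."""
    return bool(text) and PATTERN.search(text) is not None


def capture_history(record):
    """Return one 0/1 flag per search strategy for a single death record.

    All dictionary keys below are hypothetical field names.
    """
    residence_pc = record.get("residence_postcode")
    same_postcode = residence_pc is not None and residence_pc == record.get("death_postcode")
    return [
        int(mentions_homelessness(record.get("place_of_residence"))),   # search 1
        int(mentions_homelessness(record.get("place_of_death"))),       # search 2
        int(mentions_homelessness(record.get("coroner_text"))),         # search 3
        int(record.get("communal_code") == "HOMELESS_HOSTEL"),          # search 4
        int(bool(record.get("died_in_hospital_or_hostel")) and same_postcode),  # search 5
    ]


records = [
    {"place_of_residence": "No Fixed Abode", "place_of_death": "General Hospital"},
    {"place_of_residence": "12 High Street", "coroner_text": "deceased was homeless"},
    {"place_of_residence": "12 High Street", "place_of_death": "12 High Street"},
]
histories = [capture_history(r) for r in records]
# Records with at least one flag set are the "identified" deaths.
identified = [h for h in histories if any(h)]
```

A record found by several searches has several flags set; these overlaps between searches are what the capture-recapture model uses to estimate how many records were missed by all of them.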
How we analyse and interpret the data
The estimation was carried out using the widely used Rcapture package in the R programming language, for which published documentation is available. The calculations estimate the most probable size of an unknown closed population based on multiple captures (searches), using Poisson loglinear regression models and an iteratively reweighted least squares algorithm, which is simple and numerically stable. Based on the nature of the data and the diagnostic and goodness-of-fit statistics produced by the package, the Chao model was selected out of several alternatives. This is a robust but conservative (lower-bound) model, so that the figures produced should be taken as the lowest probable estimates.
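To give a feel for what a lower-bound estimate means, the sketch below applies the classical closed-form Chao lower bound, n + f1² / (2 × f2), where n is the number of identified units, f1 the number found by exactly one search and f2 the number found by exactly two. This is a simplified Python stand-in, not the Rcapture loglinear fit described above, and the toy capture histories are invented for illustration.

```python
from collections import Counter


def chao_lower_bound(histories):
    """Classical Chao lower bound for population size: n + f1^2 / (2 * f2)."""
    captured = [h for h in histories if sum(h) > 0]
    n = len(captured)
    freq = Counter(sum(h) for h in captured)
    f1, f2 = freq[1], freq[2]
    if f2 == 0:
        # Bias-corrected form, used here only to avoid dividing by zero.
        return n + f1 * (f1 - 1) / 2
    return n + f1 ** 2 / (2 * f2)


# Invented toy data: 10 identified units across three searches.
toy = [[1, 0, 0]] * 4 + [[0, 1, 0]] * 2 + [[1, 1, 0]] * 3 + [[1, 1, 1]]
estimate = chao_lower_bound(toy)  # n = 10, f1 = 6, f2 = 3
```

Only the units seen once or twice drive the correction, which is why the result is read as a minimum: units that are very hard to find contribute little to f2 and so are under-represented.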
Further description of the capture-recapture estimation
The parameters of interest in the Rcapture package depend on whether the population is assumed to be closed, open, or both. Births and deaths, together with immigration and emigration, can occur in an open population, but not in a closed one. Death registrations for a given (past) year are a closed population.
To estimate this population size, a model is fitted to the data. Following Otis, Burnham, White, and Anderson (1978), the model can incorporate up to three sources of variation among capture probabilities:
a temporal effect (t)
a heterogeneity between units (h)
a behavioural effect (b)
A temporal effect causes the capture probabilities to vary among capture occasions; heterogeneity causes the capture probabilities to vary among units. A behavioural effect means that the first capture changes the behaviour of a unit, so the capture probability differs before and after the first capture. These sources of variation lead to eight fundamental closed population models:
- M0 (no source of variation)
- Mt (temporal effect)
- Mh (heterogeneity)
- Mb (behavioural effect)
- Mth (temporal effect and heterogeneity)
- Mtb (temporal and behavioural effects)
- Mbh (behavioural effect and heterogeneity)
- Mtbh (all three sources of variation)
The analysis of data from a closed population capture-recapture experiment amounts to finding the best-fitting model and estimating the population size from the chosen model. All models are fitted using the R function glm, which produces maximum likelihood estimates of the loglinear parameters. The maximisation is done through an iteratively reweighted least squares algorithm, which is simple and numerically stable. An estimate of the population size N is then derived from the loglinear parameters. The output from Rcapture consists of descriptive statistics, heterogeneity charts, box plots of Pearson residuals and a table of abundance estimations and model fits.
|fi: number of units captured i times|
|ui: number of units captured for the first time on occasion i|
|vi: number of units captured for the last time on occasion i|
|ni: number of units captured on occasion i|
Table 3: Descriptive statistics for 2017 data
The descriptive function of the Rcapture package computes basic capture-recapture frequency statistics. It displays, for i = 1, . . . , t, the number of units captured i times (fi), the number of units captured for the first time on occasion i (ui), the number of units captured for the last time on occasion i (vi) and the number of units captured on occasion i (ni). If the ni statistics vary among capture occasions, there is a temporal effect. The descriptive function also gives the m-array matrix, which contains recapture frequencies for units released on each occasion.
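The four frequency statistics can be computed directly from a matrix of capture histories, as in this minimal Python sketch. The function name `descriptive_stats` and the toy data are illustrative assumptions; this is not the Rcapture code, and the m-array is not computed.

```python
def descriptive_stats(histories):
    """Compute f_i, u_i, v_i and n_i from 0/1 capture histories.

    Index i - 1 in each returned list corresponds to i in the text.
    """
    t = len(histories[0])
    f = [0] * t  # units captured exactly i times
    u = [0] * t  # units captured for the first time on occasion i
    v = [0] * t  # units captured for the last time on occasion i
    n = [0] * t  # units captured on occasion i
    for history in histories:
        caught = [i for i, flag in enumerate(history) if flag]
        if not caught:
            continue  # never-captured units are unobserved by definition
        f[len(caught) - 1] += 1
        u[caught[0]] += 1
        v[caught[-1]] += 1
        for i in caught:
            n[i] += 1
    return {"f": f, "u": u, "v": v, "n": n}


# Invented toy data: five units, three capture occasions.
toy = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 1], [1, 1, 1]]
stats = descriptive_stats(toy)
```

In this toy example every occasion captures three units, so the ni values give no sign of a temporal effect.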
The plot.descriptive function in the R package Rcapture explores possible heterogeneity in the capture probabilities. In the absence of heterogeneity (model M0), the graphs produced by plot.descriptive are expected to be linear. Rivest (2007) shows that the fi graph should be concave downward when there is a temporal effect. This effect is typically small and the graph of the fi stays almost linear for model Mt.
Furthermore, from the work of Lindsay (1986) on mixing distributions in an exponential family, the fi graph for model Mh and the ui graph for models Mh and Mbh should be convex, up to sampling errors. The shape of the fi graph for model Mth depends on the relative importance of the temporal effect and the heterogeneity. So, the plot.descriptive function can bring out heterogeneity among capture probabilities in a dataset through graphs with a convex shape.
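The linear and convex shapes described above can be checked numerically. The Python sketch below assumes the fi graph is drawn on the scale log(fi / C(t, i)) against i, which is an assumption about the plotting scale not stated in this report, and it uses expected rather than observed frequencies. Second differences close to zero indicate a straight line; positive second differences indicate a convex curve.

```python
from math import comb, log


def scaled_log_frequencies(f):
    """Return log(f_i / C(t, i)) for i = 1..t, where f[i - 1] holds f_i > 0."""
    t = len(f)
    return [log(f[i - 1] / comb(t, i)) for i in range(1, t + 1)]


def second_differences(y):
    """Discrete curvature: zero for a straight line, positive for a convex curve."""
    return [y[i + 1] - 2 * y[i] + y[i - 1] for i in range(1, len(y) - 1)]


def expected_f(population, t, probabilities):
    """Expected f_i when units are split equally among the capture probabilities."""
    share = population / len(probabilities)
    return [
        sum(share * comb(t, i) * p ** i * (1 - p) ** (t - i) for p in probabilities)
        for i in range(1, t + 1)
    ]


# One common capture probability (model M0): the points lie on a straight line.
homogeneous = second_differences(scaled_log_frequencies(expected_f(1000, 5, [0.2])))
# Two groups with different capture probabilities (model Mh): the curve is convex.
heterogeneous = second_differences(scaled_log_frequencies(expected_f(1000, 5, [0.1, 0.5])))
```

With real data the same quantities are subject to sampling error, so the shape is judged by eye on the plot rather than by the sign of individual second differences.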
The fit of a model can also be judged through its residuals. The function boxplot.closedp from the R package Rcapture was used to produce boxplots of the Pearson residuals for the different fitted models. The graph brings out badly fitted data.
The final estimate is produced using the function closedp from the R package Rcapture. When several models have been fitted, they are compared and one is selected. The function closedp was used to generate the deviance, degrees of freedom and Akaike Information Criterion (AIC) for each model. These statistics are useful tools to compare models and to assess the goodness of their fit. Under the assumption of a good fit, the deviance of a model follows a chi-squared distribution with the model’s degrees of freedom. Also, likelihood ratio tests can be constructed to compare nested models, and a smaller AIC indicates a better model.
Number of captured units: 491
Abundance estimations and model fits:
|Model||Abundance||Standard error||Deviance||Degrees of freedom||AIC||BIC||Fit information|
|Mh Chao (LB)||669.2||23.2||858.788||29||937.487||945.880||OK|
|Mth Chao (LB)||597.2||17.0||261.519||24||350.219||379.594||OK|
Table 4: Estimates and best fit for the 2017 data
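The comparison by AIC can be illustrated with the two rows of Table 4. This Python sketch assumes the table columns are, in order, abundance, standard error, deviance, degrees of freedom, AIC, BIC and fit information; that column order is inferred from the values rather than stated in the table as extracted. It simply picks the row with the smaller AIC.

```python
# Values copied from Table 4; the column assignment is an assumption (see above).
models = {
    "Mh Chao (LB)": {"abundance": 669.2, "deviance": 858.788, "df": 29, "aic": 937.487},
    "Mth Chao (LB)": {"abundance": 597.2, "deviance": 261.519, "df": 24, "aic": 350.219},
}

# A smaller AIC indicates a better model.
best = min(models, key=lambda name: models[name]["aic"])

# AIC difference of each model from the best one.
delta_aic = {name: m["aic"] - models[best]["aic"] for name, m in models.items()}
```

On these figures the Mth Chao (LB) row has the lower AIC and the lower deviance, and its abundance estimate is 597.2.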
How we quality assure and validate the data
Rigorous quality assurance is carried out at all stages of production. Specific procedures include:
- scrutinising input data to investigate the accuracy of any abnormal values
- scrutinising trends in the total population, projected over time for plausibility
- comparing current deaths while homeless with previous deaths while homeless and estimates, to see where large changes are taking place and understand the reasons for these
- examining sex ratios to find any areas of imbalance
- comparison between local authorities, to check for outliers
- checking output tables to ensure that there are no errors or inaccuracies during the creation of published tables
Following initial investigation of the methods, two sensitivity analyses were carried out.
The statistical model was rerun using a method of randomly removing from 50 to 200 out of approximately 600 specific addresses from the list of addresses, to gauge the impact of the incompleteness of the available lists. The results of four iterations showed that removing a substantial proportion of the addresses made no more than 2% difference to the final estimate.
The model was rerun removing each of the separate lists (searches) one at a time. Again, the results were found to be relatively stable. The model was also run with and without two independent data sources in addition to the death registration records, namely the national list of deaths compiled by The Bureau of Investigative Journalism and a dataset provided in confidence by the CHAIN service in London. Both produced higher but less robust estimates, due largely to the difficulty of accurately matching the records (which often contained minimal identifying details) to the death registrations.
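The idea of rerunning the model with each search removed in turn can be sketched as follows. This Python illustration uses the classical Chao lower bound, n + f1² / (2 × f2), as a simplified stand-in for the Rcapture model, and the toy capture histories are invented; it shows the mechanics of the check only, not the stability reported above.

```python
from collections import Counter


def chao_lower_bound(histories):
    """Classical Chao lower bound, a stand-in for the fitted loglinear model."""
    captured = [h for h in histories if sum(h) > 0]
    freq = Counter(sum(h) for h in captured)
    f1, f2 = freq[1], freq[2]
    if f2 == 0:
        return len(captured) + f1 * (f1 - 1) / 2  # bias-corrected fallback
    return len(captured) + f1 ** 2 / (2 * f2)


def leave_one_search_out(histories):
    """Re-estimate the total after dropping each search (column) in turn."""
    t = len(histories[0])
    results = {}
    for dropped in range(t):
        reduced = [[flag for i, flag in enumerate(h) if i != dropped] for h in histories]
        results[dropped] = chao_lower_bound(reduced)
    return results


# Invented toy data: 10 identified units across three searches.
toy = [[1, 0, 0]] * 4 + [[0, 1, 0]] * 2 + [[1, 1, 0]] * 3 + [[1, 1, 1]]
full_estimate = chao_lower_bound(toy)
sensitivity = leave_one_search_out(toy)
```

With only ten invented units the estimates move a great deal when a search is dropped; the point of the real analysis was to confirm that, on the full dataset, they did not.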
How we disseminate the data
Deaths while homeless are available online, by local authority and region for 2013 to 2017.
Links from the release calendar make the release date and location of each new set of deaths while homeless clear. Deaths while homeless can be downloaded free of charge in Microsoft Excel format. A statistical bulletin accompanies each publication. The underlying data for the charts and tables in the bulletin can be downloaded. Supporting documentation is also available on the deaths while homeless webpage.
Other data not published on the web are available on request by emailing email@example.com. Metadata describing the limitations of the data for more detailed tables are provided with each individual request. Most queries can be answered from the website datasets or supporting methods documents. Any additional enquiries regarding deaths while homeless can be made by emailing firstname.lastname@example.org.
How we review and maintain the data processes
Future revisions to the deaths while homeless analysis may be required to reflect occasional or post-census revisions to the subnational population projections. This is in line with the ONS revision policy for population statistics.
We have received user feedback and assessment of user needs and perceptions via correspondence and attendance at stakeholder meetings and plan to act on their recommendations to provide greater detail and regularity to the publication.
We received many enquiries while producing this bulletin. A considerable number were from stakeholders requesting local authority level data. We have responded to this by planning a local authority level analysis, which is due for release in early 2019.
We are grateful for advice and assistance in the development of these Experimental Statistics from The Bureau of Investigative Journalism, the Combined Homelessness and Information Network (CHAIN), the Homeless Impact Centre, Homelessness and Troubled Families team at the Ministry for Housing, Communities and Local Government, Planning and Housing Research – Greater Manchester Combined Authority, University College London – Institute of Health Informatics, and the Welsh Government. However, Office for National Statistics (ONS) independently produces these statistics, including determining the focus, content, commentary, illustration and interpretation of the measures presented, and the comments provided from other organisations are purely advisory.
We welcome feedback from users on the content, relevance and format of our outputs and user feedback is requested at the bottom of all emails sent by customer service teams within the division.
Feedback is also received through regular attendance of our researchers at user group meetings and conferences. In addition, the views of a wide range of users were sought as part of the UK Statistics Authority assessment of mortality statistics.
Adelstein A and Mardon C (1975), ‘Suicides 1961 to 1974’, in: Population Trends, number 2, London: Her Majesty's Stationery Office
Otis DL, Burnham KP, White GC and Anderson DR (1978), ‘Statistical Inference from Capture Data on Closed Animal Populations’, Wildlife Monographs, volume 62, Wildlife Society
Rivest LP (2007), ‘Why a Time Effect Has a Limited Impact on Capture-Recapture Estimates in Closed Populations’, Canadian Journal of Statistics, under revision
Lindsay BG (1986), ‘Exponential Family Mixture Models (With Least-Squares Estimators)’, The Annals of Statistics, volume 14, pages 124 to 137