1. Methodology background

National Statistic | National Statistic | Experimental | Experimental | National Statistic
Survey name | Adult cancer survival in England | Adult cancer survival by stage at diagnosis for England | Cancer survival for children in England | Geographic patterns of cancer survival
Data collection | Administrative data | Administrative data | Administrative data | Administrative data
Frequency | Annual | Annual | Annual | Annual
How compiled | Administrative data | Administrative data | Administrative data | Administrative data
Geographic coverage | England | England | England | England, NHS England Regions, Cancer Alliances, Sustainability and Transformation Partnerships
Related publications | Adult cancer survival in England | Adult cancer survival by stage at diagnosis for England | Cancer survival for children in England | Geographic patterns of cancer survival

2. About this Quality and Methodology Information report

This quality and methodology report contains information on the quality characteristics of the data (including the European Statistical System five dimensions of quality (PDF, 156KB)) as well as the methods used to create them. The information in this report will help you to:

  • understand the strengths and limitations of the data

  • learn about existing uses and users of the data

  • understand the methods used to create the data

  • decide suitable uses for the data

  • reduce the risk of misusing data

3. Important points

  • This Quality and Methodology Information (QMI) report on cancer survival has been updated in collaboration with Public Health England.

  • The main changes to this report are to simplify the descriptions of the methodology applied in each cancer survival publication and explain why each choice of methodology has been made.

  • Updated life tables have been published to support the survival estimates for adults with historic data to provide users with a consistent set of survival estimates.

  • All cancer survival publications are based on the data summarised by the Cancer registrations statistics bulletin.

  • For adults, survival estimates give the estimated proportion of patients surviving their cancer 1, 5 or 10 years after diagnosis, based on detailed patient data and taking account of the mortality of the general population.

  • For children, survival estimates do not need to adjust for the mortality of the general population and a simpler methodology is followed.

  • All survival estimates calculated have at least one year of potential experience after diagnosis, except for those using the hybrid approach, which is designed to predict survival for newly-diagnosed patients.

  • 1-, 5- and 10-year cancer survival estimates for England for both children and adults are included in the Cancer survival in England release.

  • 1- and 5-year subnational survival estimates are included in the Geographic patterns of cancer survival release and the Index of cancer survival release.

  • The adult Cancer survival in England and Geographic patterns of cancer survival releases use a different methodology and diagnosis periods to those used in the Index of cancer survival release, so estimates are not directly comparable.

4. Quality summary

Overview

The Office for National Statistics (ONS) in partnership with Public Health England (PHE) publishes a collection of bulletins on cancer incidence and survival, with accompanying Quality and Methodology Information (QMI) reports for cancer incidence and the Index of cancer survival.

Survival estimates are produced for the most common cancers in adults (aged 15 to 99 years), including by stage at diagnosis, and for all cancers combined in children (aged 0 to 14 years).

This report provides details of the methodology used by the ONS and PHE partnership, to produce the following National Statistics and Experimental Statistics:

  • Cancer survival in England, which includes:

    • Adult cancer survival in England (National Statistics)
    • Adult cancer survival by stage at diagnosis for England (Experimental Statistics)
    • Cancer survival for children in England (Experimental Statistics)
  • Geographic patterns of cancer survival in England (National Statistics)

The Geographic patterns of cancer survival in England bulletin provides cancer survival estimates for common cancers at a subnational level, namely NHS England Regions, Cancer Alliances (CAs) and Sustainability and Transformation Partnerships (STPs).

Subnational survival estimates are also available by Clinical Commissioning Group (CCG) and these are produced as an index of cancer survival using a different methodology. STPs and CAs are included in the index of cancer survival bulletin to provide a single measure of cancer survival over time at these geographic levels. The alternative process is published in the Index of cancer survival Quality and Methodology Information report.

Those wishing to look at survival in CAs and STPs may choose between the Geographic patterns of cancer survival and the Index of cancer survival. The Geographic patterns of cancer survival presents 1- and 5-year survival estimates in CAs and STPs for individual cancer sites that pass the post-estimation quality assurance checks for robustness (see Section 6). The Index of cancer survival presents 1- and 5-year survival for all malignant cancers (excluding prostate and non-melanoma skin cancer) combined. The Geographic patterns bulletin gives more detail for individual cancer sites, but the Index of cancer survival covers a wider range of cancers in its estimates.

This analysis has been prepared jointly (from June 2017) with the National Cancer Registration and Analysis Service (NCRAS) within PHE.

Cancer is a leading cause of death, accounting for just over one-quarter of deaths in England. More than one in two people born after 1960 will develop cancer at some point in their life. In July 2015, an Independent Cancer Taskforce published Achieving world-class cancer outcomes: a strategy for England 2015 to 2020, which included the aim to improve survival rates for cancer patients.

The Taskforce report sets out how the government plans to improve cancer outcomes (including improving survival rates through reductions in the proportion of patients who are diagnosed with cancer at an advanced stage), screening and treatment standards.

These National Statistics publications on cancer survival enable changes over time to be monitored and progress towards these aims to be assessed.

A guide to choosing which publication is suitable for a number of potential uses is also available.

Uses and users

Users of cancer survival estimates include government organisations such as the NHS, local bodies responsible for commissioning cancer services, health policy-makers, cancer charities, academics and researchers, cancer registries, the public and the media.

Population-based cancer survival statistics may be used to:

  • plan services aimed at cancer prevention and treatment

  • feed into national cancer plans – the Department of Health and Social Care identified cancer as a specific improvement area for preventing people dying prematurely, given that a significant gap remains in survival compared with the European average; the Independent Cancer Task Force set out six strategic priorities (PDF, 4.9MB) to help improve cancer survival in England, including reducing CCG variation and the ambition to increase 1-year survival to 75% by 2020 (PDF, 4.9MB) for all cancers combined

  • inform the NHS Outcomes Framework, which was established to monitor overall changes in performance of the NHS and the quality of health outcomes, and includes 1- and 5-year net survival from colorectal, breast and lung cancers; the NHS Five Year Forward View set out “that improvements in outcomes will require action on three fronts: better prevention, swifter access to diagnosis, and better treatment and care for all those diagnosed with cancer”

  • provide reliable and accessible information about cancer outcomes to a wide range of groups, including patients and health professionals via health awareness campaigns, cancer information leaflets and websites

  • inform cancer research

Strengths and limitations

The main strengths of the cancer survival statistical bulletins include:

  • they show the effect of health policy on the survival of cancer patients in England

  • this effect of health policy on cancer survival is shown for patients with different:

    • ages at diagnosis
    • types of cancer
    • stages at diagnosis
    • geographical footprints
  • the use of administrative data means that survival estimates are population-based

  • they present age-standardised rates wherever possible to enable users to reliably compare results over time and, in the Geographic patterns of cancer survival, between areas

The main limitations of the cancer survival statistical bulletins include:

  • they do not show overall survival of a patient where they have been newly-diagnosed with cancer but die from an unrelated condition

  • these statistics are not applicable to an individual, newly-diagnosed patient; the survival of a newly-diagnosed individual will depend upon many other factors, such as their individual prognosis, their treatment or other diseases, and thus their survival may vary significantly from that reported by the publications

  • these statistics cannot be used to infer continued survival time for individuals who have already lived a certain amount of time since being diagnosed

  • a recognised staging system is not available for all types of cancer

  • in some cases, usually due to small numbers of cases, it is impossible to produce robust estimates of survival for one or more of the age groups, cancer sites, geographies or follow-up periods; all such non-robust estimates are suppressed

  • cancer data files are dynamic and new cases can be registered “late”, modified and, more rarely, cancelled

Recent improvements

There have been four important recent improvements to note:

  • in the publication for Cancer survival in England for 2012 to 2016 and followed up to 2017, it was possible to study survival by stage over five diagnosis years for the first time in England

  • in the publication for Cancer survival in England, the results of the survival by stage estimates have been presented together with the combined stage estimates

  • new life tables that reflect the changing mortality trends have been published and applied for 2006 to 2010 cancer diagnoses followed up to 2011 onwards

  • the coding applied to the underlying data has been strengthened; more information can be found in an impact paper

5. Quality characteristics of the Cancer survival statistical bulletins data

This section provides a range of information that describes the quality and characteristics of the data and identifies issues that should be noted when using the output.

Relevance

(The degree to which the statistical outputs meet users’ needs.)

Cancer survival is generally influenced by a combination of stage at diagnosis, treatment quality and patient factors (for example, age, frailty and other health conditions). Therefore, there has been an increasing interest in cancer survival by stage at diagnosis for several years. This was reflected in an Office for National Statistics (ONS) cancer output consultation in 2012. With improvements in the collection of stage data for cancer registrations by Public Health England (PHE) for diagnoses in or after 2012, it is now possible to provide cancer survival estimates by stage at diagnosis for 22 of the most common stageable tumour sites.

The adult cancer survival by stage at diagnosis analysis is used by the National Awareness and Early Diagnosis Initiative (NAEDI), which aims to improve cancer survival by earlier diagnosis. The data can help show the improvement in survival that could be made if more cancers were diagnosed earlier. They also show the pattern of survival and stage, which may help show where most improvement can be made.

The survival by stage analysis covers the most commonly occurring stageable tumour sites, which are also included in the Public Health Outcomes Framework (PHOF) experimental indicator 2.19 (Cancer diagnosed at an early stage), with one exception: we have excluded non-Hodgkin lymphoma. Survival from non-Hodgkin lymphoma varies significantly between patients with the same stage, depending on tumour morphology, and individual survival outcomes are too varied to publish as a group. PHE also publishes counts of stage by tumour site (XLS, 2.77MB) annually, with quarterly updates that form the basis of the PHOF indicator; a wider set of data from the same source is used to inform the survival by stage analysis.

The estimates of 5-year survival from the Cancer survival for children in England publication are used to evidence improvement in preventing people from dying prematurely from cancer in the NHS Outcomes Framework.

The analyses of survival by stage in adults and survival in childhood have been released as Experimental Statistics, to allow us to gather views and opinions on the analysis undertaken.

The geographic patterns of cancer survival series is used in the indicators set for the compendium of population health indicators. The compendium includes 1-year and 5-year survival from bladder, breast, cervical, colorectal, lung, oesophagus, prostate and stomach cancers.

Changes in the coding of cancer occur when a new version of the International Statistical Classification of Diseases (ICD) is implemented. Currently, all malignancies are coded using the 10th Revision of the ICD (ICD-10), which replaced ICD-9 in 1995. Coding cancers with the most recent version of the ICD means that coding represents an accurate and up-to-date picture of cancer, so that cancer-related outputs continue to meet users’ needs.

Accuracy and reliability

(The degree of closeness between an estimate and the true value.)

Cancer survival releases are produced using the most robust methods available for population-based cancer survival estimation. All bulletins published on cancer survival use the same underlying data files, which are prepared for analysis by using the same documented quality assurance procedures; the cancer incidence data are the same as that used in the Cancer registration statistics.

Each year, over 300,000 patients are newly diagnosed with cancer in England. PHE’s National Cancer Registration and Analysis Service (NCRAS) records new cancer registrations covering the entire population of England and holds cancer registration data from the former regional cancer registries, which had registered tumour data since the 1960s.

Since 2001, cancer registrations for each year have been estimated to be between 96% and 99% complete at the time of extraction, with completeness improving over time. However, it is important to note that the cancer registration database is dynamic.

In common with cancer registries in other countries, cancer incidence in England can take up to five years after the end of a given calendar year to reach 100% completeness and stability, because of late registrations, corrections and deletions. The estimate of completeness for a diagnosis year is based on the figures published for the three previous years, compared with the number of late registrations subsequently received for these years.

This means that the previous four years of cancer registrations data are revised on each new release to reflect the changes recorded in the cancer registration database.

The estimate of completeness can be viewed as the difference between the figures published in Cancer registration statistics (and all subsequent ONS cancer incidence publications within that reporting year) and late registrations received after the publication date cut-off. It is not an estimate of the number of cancers that are never recorded.
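As a rough illustration of the idea in this paragraph, completeness at publication can be sketched as the share of eventually known registrations that were already present at the cut-off. The function name, the ratio form and the counts below are assumptions made for illustration only; they are not the published ONS or PHE calculation.

```python
def completeness(published: int, late: int) -> float:
    """Share of eventually known registrations present at the publication cut-off.

    `published` is the count reported at publication; `late` is the number of
    registrations received after the cut-off. Both inputs are hypothetical.
    Cancers that are never recorded are, as in the text, not covered.
    """
    total = published + late
    if total == 0:
        raise ValueError("no registrations supplied")
    return published / total


# Hypothetical counts for one diagnosis year
print(f"{completeness(297_000, 3_000):.1%}")
```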

Previously, data from ONS historic cancer registrations were used, but it has now been agreed that data provided by PHE will act as the only source of cancer registrations for England and the ONS will archive its registry.

Coherence and comparability

(Coherence is the degree to which data that are derived from different sources or methods, but refer to the same topic, are similar. Comparability is the degree to which data can be compared over time and domain, for example, geographic level.)

International comparisons of cancer survival figures are occasionally reported. Care should be taken when interpreting results from different countries, because of known differences in healthcare and cancer registration systems, which are likely to affect results. A discussion of the issues raised by comparison of survival figures from different countries was published as part of the International Cancer Benchmarking Partnership (ICBP) study. Other international sources of survival by stage statistics include those published by National Cancer Institute in the US and the Canadian Cancer Society.

The issue of comparability of cancer survival statistics across the UK has been discussed at the United Kingdom and Ireland Association of Cancer Registries (UKIACR) Executive Board and a consensus has been reached to use the International Cancer Survival Standard (ICSS) weights in cancer survival analysis in England, Scotland, Wales, Northern Ireland and the Republic of Ireland (as well as the same exclusions in data), so that results are comparable across all countries in the UK and Ireland.
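Age-standardisation with a fixed set of weights amounts to a weighted average of age-specific survival estimates. The sketch below shows the arithmetic only; the age groups, weights and survival values are placeholders chosen for illustration and are not the ICSS weights or any published estimates.

```python
from typing import Dict


def age_standardise(survival_by_age: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Weighted average of age-specific survival using standard weights."""
    if survival_by_age.keys() != weights.keys():
        raise ValueError("age groups and weights must match")
    total_weight = sum(weights.values())
    return sum(survival_by_age[g] * weights[g] for g in weights) / total_weight


# Placeholder values (NOT the ICSS weights, NOT published survival estimates)
weights = {"15-44": 0.1, "45-54": 0.2, "55-64": 0.2, "65-74": 0.3, "75-99": 0.2}
survival = {"15-44": 0.90, "45-54": 0.85, "55-64": 0.80, "65-74": 0.70, "75-99": 0.50}

print(round(age_standardise(survival, weights), 3))
```

Because every area and period is averaged with the same weights, differences in the standardised figure reflect differences in age-specific survival rather than differences in the age profile of patients.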

Estimates for England, Cancer Alliances and Sustainability and Transformation Partnerships (STPs) are published in both the Index of cancer survival by Clinical Commissioning Groups in England bulletin and the Geographic patterns of cancer survival in England bulletin. Care should be taken when comparing these estimates for 1-year and 5-year site-specific cancer survival as there are differences in the methodologies applied to calculate net survival and the cancer survival index. Further details of the other methodology can be found in the Index of Cancer Survival Quality and Methodology Information report.

Cancer survival estimates are published at England level by various organisations (for example, cancer charities, academic groups, international collaborations) and they will not all be directly comparable. Raw data may be taken from different sources and differences in quality assurance procedures will influence final estimates.

Confidence intervals, provided in the supporting data tables that accompany each bulletin, reflect the level of uncertainty in each estimate.

Cancer Research UK publishes relative cancer survival estimates by geography, deprivation level, cancer site, age at diagnosis and sex.

The Northern Ireland, Scottish and Welsh registries publish national figures for their respective countries. The period for which most recent data are available may differ between countries.

Accessibility and clarity

(Accessibility is the ease with which users can access the data, also reflecting the format in which the data are available and the availability of supporting information. Clarity refers to the quality and sufficiency of the release details, illustrations and accompanying advice.)

For information regarding conditions of access to outputs, please refer to:

All cancer survival statistical bulletins are web-only releases, available in either HTML or PDF formats; data tables are available in Excel format. For further information about cancer survival bulletins, please contact the Cancer Analysis Team via email at cancer.newport@ons.gov.uk or by telephone on +44 (0)1633 456935.

Timeliness and punctuality

(Timeliness refers to the lapse of time between publication and the period to which the data refer. Punctuality refers to the gap between planned and actual publication dates.)

Historically, the former regional cancer registries in England were obliged by the then Department of Health to provide data on all new cancer diagnoses to the ONS within 18 months of the end of the calendar year; the ONS would then produce a single cancer registration dataset, which formed the basis of all the cancer publications including diagnosis years to 2014 for the first time.

NCRAS now carries out the functions that the ONS performed as cancer cases are registered. NCRAS checking for duplicate registrations is aided by accessing of hospital patient administration systems for confirmation, a facility not available to the ONS.

Although now a single cancer registry using a single system, for the publications covering the diagnosis years 2012 to 2014, NCRAS submitted their data to the ONS as if NCRAS was still operating as regional registries.

Following the implementation of these improved quality controls by NCRAS and without changes being made to NCRAS submissions, the ONS decided to use the NCRAS single dataset for publications including the diagnosis year 2015 and onwards.

The time taken for data ascertainment and for linkage of cancer registrations to death registrations has also been substantially reduced by this ONS and PHE collaboration.

PHE’s Cancer Survival Team are therefore able to produce cancer survival estimates in a shorter timeframe than before (three sets of national results were published at the same time in June 2017, compared with the previous publication pattern of September for adult cancer survival and February of the following year for childhood cancer).

Further, since the registration of diagnosis years 2012 and onwards, the use of the same data collection tools and methodologies across England has enabled a consistent national approach to the collection and recording of cancer staging data. This new approach successfully led to more than 60% of all stageable tumours being staged for the first time in 2012. For 2016 diagnoses, the proportion of tumours with a known, valid stage has further improved to 82%.

For more details on related releases, the GOV.UK release calendar provides up to 12 months’ advance notice of release dates. If there are any changes to the pre-announced release schedule, public attention will be drawn to the change and the reasons for the change will be explained fully at the same time, as set out in the Code of Practice for Statistics.

Concepts and definitions (including list of changes to definitions)

(Concepts and definitions describe the legislation governing the output and a description of the classifications used in the output.)

Cancer

For adults, cancers are coded using the International Statistical Classification of Diseases 10th Revision (ICD-10). ICD-10 coding for cancer is based on the nature and anatomical site of the cancer.

Morphology and behaviour codes used can be found in the International Classification of Diseases for Oncology, Second Edition (ICD-O-2). Morphology codes denote the cell types in the cancer and behaviour codes indicate whether or not the tumour is malignant or invasive.

For the purposes of adult cancer registration, the term “cancer” includes all tumours that are both invasive and malignant (tumours that invade into surrounding tissues), which are conditions listed under site code numbers C00 to C97 in ICD-10. In addition, all “in situ” (malignant but not invasive) tumours (D00 to D09), certain benign (not malignant) tumours (D32 to D33, D35.2 to D35.4) and tumours of uncertain or unknown behaviour (uncertain whether benign or malignant, D37 to D48) are registered.
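The code ranges listed above can be expressed as a small classification function. This is a minimal sketch, assuming codes are supplied as a letter, two digits and an optional single decimal digit (for example "C50.9"); the function name and category labels are hypothetical and this is not the registry's own validation logic.

```python
import re
from typing import Optional


def registration_category(code: str) -> Optional[str]:
    """Return the registration category for an ICD-10 code, or None if not registered."""
    m = re.fullmatch(r"([A-Z])(\d{2})(?:\.(\d))?", code.strip().upper())
    if m is None:
        raise ValueError(f"unrecognised ICD-10 code format: {code!r}")
    letter, number, decimal = m.group(1), int(m.group(2)), m.group(3)

    if letter == "C" and 0 <= number <= 97:
        return "malignant"                       # C00 to C97
    if letter == "D":
        if 0 <= number <= 9:
            return "in situ"                     # D00 to D09
        if number in (32, 33):
            return "benign"                      # D32 to D33
        if number == 35 and decimal in ("2", "3", "4"):
            return "benign"                      # D35.2 to D35.4
        if 37 <= number <= 48:
            return "uncertain or unknown behaviour"  # D37 to D48
    return None


print(registration_category("C50.9"))
print(registration_category("D35.1"))
```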

Childhood cancer registrations include all children (aged 0 to 14 years) diagnosed with a primary malignant tumour of any organ, or a non-malignant tumour of the brain and central nervous system (CNS). Cancers of the skin, other than melanoma, and secondary and unspecified malignant tumours, are excluded. Inclusion in the childhood cancer survival analysis is defined in the third edition of the International Classification of Childhood Cancer and further details of the eligibility and exclusion criteria have been published in Control of data quality for population-based cancer survival analysis.

The numbers of cancers diagnosed each year are published in the Cancer registrations statistics, England series.

Primary cancer

A primary cancer is the tumour that first develops in an identifiable part of the body, for example, the stomach, and usually gives the name to the type of cancer with which a patient is diagnosed.

Metastatic or secondary cancer

A metastatic or secondary cancer is a cancer that has spread from the first primary cancer, which may be located within the same site as the first primary cancer (local metastasis) or spread beyond the site of the first primary cancer (distant metastasis).

The metastatic cancer should have the same underlying cell biology and morphology as the primary cancer. A spread of primary tumour cells within the system of lymph nodes is not usually considered to be metastatic cancer.

In the Adult cancer survival by stage at diagnosis for England publication, cancers diagnosed at a metastatic stage are classified as stage 4.

Cancer stage (at diagnosis)

Many common cancers have a staging system that aims to give an indication of how far the disease has progressed; stage is usually recorded at diagnosis, although the stage of disease in a patient will vary over time. Not all cancers have a staging system; for example, brain cancers do not.

Cancer stage at diagnosis is a measure of how far the primary tumour has grown when the patient first presents in hospital. It is measured and recorded according to internationally-agreed standards, often as agreed by the Union for International Cancer Control. The most common staging standard is sometimes called the TNM staging method and is based on three components:

  • tumour size (the T component)

  • nodal involvement of the lymphatic system (N)

  • metastatic spread (M)

Some gynaecological cancers are staged using an alternative method set out by the International Federation of Gynaecology and Obstetrics (FIGO). For cancers of the ovary and the uterus, FIGO stages can be uniquely matched to TNM stages and this has been used to supplement the TNM staging data.

Although the combinations of tumour size, nodal involvement and metastatic spread change by tumour type, generally there are four broad stages of cancer progression:

  • stage 1: the primary tumour is usually small and is contained within the body organ in which the tumour started growing

  • stage 2: although larger, the primary tumour has not spread to other parts of the body; spread to the lymphatic system may be included depending on the primary tumour site

  • stage 3: the primary tumour is larger and may have spread into neighbouring parts of the body and into the lymphatic system

  • stage 4: the primary tumour has spread to at least one other part of the body, creating a secondary or metastatic tumour

There are several reasons why a tumour cannot be staged, for example, some samples taken do not produce clear results and some patients are too unwell to undergo the surgery required to obtain sufficient tissue sampling for staging. In the Adult cancer survival by stage at diagnosis for England publication, missing stage is treated as a separate category and survival estimates are produced for patients with “unknown” stage alongside the other categories of known stage.

There is also a smaller group of tumours whose morphology has no recognised staging system, even though the primary cancer is found in the same location as other tumours that are stageable. These are denoted as “unstageable” in the Adult cancer survival by stage at diagnosis for England publication.

Multiple myeloma has a separate staging system, the International Staging System, which has three levels of disease progression. In common with some other cancer sites that are presented in the Adult cancer survival in England publication, the staging data are not complete enough to be considered reliable enough for publication by stage.

Follow-up

Follow-up is a measure of the patient’s time at risk of death following diagnosis. For example, it is the time from when a patient is diagnosed with cancer until their date of death, their embarkation (to a country outside of the NHS system) or the censor date, if they are known to be alive on that date.

Censor date

The censor date is the date a patient was last known to be alive, which may be the last time checks against medical and deaths records were undertaken. The publications covered by this partnership have a censor date of 31 December in the year following the most recent diagnosis year included. Where a patient cannot be determined to be alive or dead on the censor date using these checks, a patient is said to be lost to follow-up (or censored) on the last date where they were known to be alive.

Lost to follow-up

If a patient cannot be determined to be alive or dead on the censor date, for example, because they have emigrated or because key identifiers to link datasets (such as NHS number, date of birth) contain an error that prevents automatic linkage, a patient is lost to follow-up on the date where they are last known to be alive that precedes the censor date. If a particular group of patients is lost to follow-up for reasons related to their cancer, then this is said to be informative censoring.
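The definitions of follow-up, censor date and loss to follow-up can be combined into one small calculation. The sketch below is illustrative only: the `Patient` fields and function names are hypothetical, and it assumes follow-up ends at the earliest of death, embarkation, the last date known alive (for patients lost to follow-up) and the censor date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


def censor_date(latest_diagnosis_year: int) -> date:
    """31 December of the year following the most recent diagnosis year."""
    return date(latest_diagnosis_year + 1, 12, 31)


@dataclass
class Patient:
    diagnosed: date
    died: Optional[date] = None
    embarked: Optional[date] = None          # left for a country outside the NHS system
    last_known_alive: Optional[date] = None  # set only for patients lost to follow-up


def follow_up(p: Patient, censor: date) -> Tuple[int, bool]:
    """Return (days of follow-up, True if follow-up ended in death)."""
    events = []
    if p.died is not None:
        events.append((p.died, True))        # listed first so ties resolve to death
    if p.embarked is not None:
        events.append((p.embarked, False))
    if p.last_known_alive is not None:
        events.append((p.last_known_alive, False))
    events.append((censor, False))
    end, is_death = min(events, key=lambda e: e[0])
    return (end - p.diagnosed).days, is_death


censor = censor_date(2016)
print(follow_up(Patient(date(2016, 3, 1), died=date(2016, 9, 1)), censor))
print(follow_up(Patient(date(2017, 1, 1)), censor))
```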

Crude survival

This is the simplest method for calculating cancer survival, by calculating the proportion of a group of cancer patients who are still alive at time(s) of interest following a diagnosis of cancer. This method produces biased estimates of survival because it does not consider:

  • the total amount of time a group of patients are living with cancer before the time of interest or their death (whichever is earlier)

  • how to deal with patients who are lost to follow-up

Overall survival (Kaplan-Meier estimator)

To allow for the total amount of time for which patients are alive and also for those patients who are lost to follow-up, a more sophisticated and unbiased estimator is overall survival (more formally known as a Kaplan-Meier estimator). The Kaplan-Meier estimator is a (non-parametric) method, which calculates the cumulative probability of “all-cause” (any cause) survival.
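A minimal sketch of the Kaplan-Meier product-limit calculation is shown below, to make clear how censored patients contribute time at risk without counting as deaths. The function name and the toy data are assumptions for illustration; the publications use established statistical software rather than code like this.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 died: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, cumulative all-cause survival) at each distinct death time."""
    data = sorted(zip(times, died))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk   # conditional probability of surviving t
            curve.append((t, survival))
        at_risk -= leaving                       # deaths and censored both leave the risk set
    return curve


# Toy data: follow-up in years; False marks a censored patient
times = [1, 2, 2, 3, 4, 5]
died = [True, True, False, True, False, False]
for t, s in kaplan_meier(times, died):
    print(t, round(s, 4))
```

In this toy example the patient censored at 2 years stays in the risk set up to that point and is then removed, which is the step the crude proportion cannot handle.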

Relative survival

Relative survival (PDF, 691KB) is an estimate of the probability of survival from the cancer alone excluding other potential causes of death.

In relative survival, it is assumed that for a group of cancer patients:

total mortality (1) equals mortality from cancer (2) plus mortality from other causes (3)

This is saying that a cancer patient may die because of their cancer or another cause but not from both their cancer and another cause.

The total mortality (1) for a group of cancer patients can be calculated by applying the overall survival method. The mortality in the general population, or from other causes (3), is taken from life tables. The mortality from cancer (2) can then be obtained from (1) and (3).
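Written in terms of hazards, the additive assumption shows why relative survival is commonly computed as a ratio of survival probabilities. The symbols below are notation introduced here for illustration: the subscript O denotes total (observed) mortality (1), C mortality from cancer (2) and P mortality from other causes in the general population (3).

```latex
\begin{align}
\lambda_O(t) &= \lambda_C(t) + \lambda_P(t) \\
S_O(t) &= \exp\!\Bigl(-\int_0^t \lambda_O(u)\,du\Bigr) = S_C(t)\,S_P(t) \\
S_C(t) &= \frac{S_O(t)}{S_P(t)}
\end{align}
```

For example, with hypothetical values of overall 5-year survival of 0.60 and expected survival in the matched general population of 0.90, relative survival would be 0.60 / 0.90, or about 0.67.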

Net survival

Net survival is a variant of relative survival that is preferred as a measure of cancer survival in adults because it is an unbiased estimator. Net survival estimates the survival of cancer patients compared with the background mortality that patients would have experienced if they had not been diagnosed with cancer.

The Pohar-Perme estimator of net survival is an unbiased estimator that accounts for informative censoring bias.
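As a sketch of how this works, the Pohar-Perme estimator is usually presented as an inverse-probability-weighted estimate of the cumulative excess hazard. The notation below is introduced here for illustration and follows the common textbook form rather than anything stated in the publications: for patient i, N_i counts observed deaths, Y_i indicates being at risk, and S_{P,i} and Λ_{P,i} are the expected survival and cumulative hazard taken from the life tables.

```latex
\begin{align}
dN^w(u) &= \sum_i \frac{dN_i(u)}{S_{P,i}(u)}, \qquad
Y^w(u) = \sum_i \frac{Y_i(u)}{S_{P,i}(u)} \\
\hat{\Lambda}_E(t) &= \int_0^t \frac{dN^w(u)}{Y^w(u)}
  - \int_0^t \frac{\sum_i \frac{Y_i(u)}{S_{P,i}(u)}\, d\Lambda_{P,i}(u)}{Y^w(u)} \\
\hat{S}_N(t) &= \exp\bigl(-\hat{\Lambda}_E(t)\bigr)
\end{align}
```

The weights 1 / S_{P,i}(u) give more influence to patients who, because of age or other characteristics, are more likely to have been removed from the risk set by other causes of death, which is how the estimator corrects for that form of informative censoring.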

Life tables

Mortality for the general population is derived from population life tables. The life tables used are produced by the ONS. Using these life tables, the mortality of cancer patients is compared with that of individuals in the general population who belong to the same single year of age (0 to 99 years), sex, population-weighted quintile of the Index of Multiple Deprivation (IMD) and region.
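The life table lookup can be pictured as a dictionary keyed by the matching characteristics. The sketch below computes expected survival for one person over a whole number of years; the key layout, function name, the capping of age at 99 and the mortality values are all assumptions made for illustration, and the sketch ignores changes in the life tables by calendar year.

```python
from typing import Dict, Tuple

# Key: (age in single years, sex, IMD quintile, region)
# Value: annual probability of death in the general population
LifeTable = Dict[Tuple[int, str, int, str], float]


def expected_survival(table: LifeTable, age: int, sex: str,
                      imd_quintile: int, region: str, years: int) -> float:
    """Probability that a matched member of the general population survives `years` years."""
    survival = 1.0
    for k in range(years):
        attained_age = min(age + k, 99)  # assumption: the oldest row is reused above 99
        survival *= 1.0 - table[(attained_age, sex, imd_quintile, region)]
    return survival


# Hypothetical life table rows
toy_table: LifeTable = {
    (60, "F", 3, "Region A"): 0.01,
    (61, "F", 3, "Region A"): 0.01,
}
print(round(expected_survival(toy_table, 60, "F", 3, "Region A", 2), 4))
```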

Survival analysis approaches

In this section, the various different approaches to forming groups of patients for estimating survival are illustrated. These situations cover the scenarios where all outcomes at the estimation time of interest are known and those where outcomes at the estimation time of interest are only partially known.

Tables 1 to 4 are survival approach diagrams, which highlight the diagnosis year for the demonstrated approach and the patient years of follow-up included in that approach. A patient pathway begins with diagnosis in year zero, when there are no years of follow-up, and continues across the diagram, increasing by one for each year of follow-up. For example, patients diagnosed in 2008 with follow-up until 2015 have at least seven years of follow-up. These diagrams focus on 5-year survival, but the principles are also applicable to 10-year survival.

Cohort approach

When follow-up information is available for each patient for at least one year, 1-year survival can be estimated using the (classical) cohort approach. For example, once follow-up information is available for each patient over the entire calendar year following their diagnosis, the cohort approach can be used to estimate 1-year survival by combining the conditional probabilities of survival to the end of each successive sub-period of the analysis.

Table 1 highlights 5-year survival using the cohort approach. This approach requires that at least five complete years of potential follow-up are available for each patient considered. It is the simplest approach as all patients could be diagnosed in the same year and potentially followed up for the same length of time. However, it could also be used for patients diagnosed in different years, for example, to calculate survival for patients diagnosed in 2010 to 2012 if all patients have full follow-up available for the latest year. The restriction with this approach is that 5-year survival cannot be calculated until at least five years have passed since diagnosis.
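The follow-up requirement of the cohort approach can be expressed as a simple rule. The sketch below assumes that follow-up is complete to 31 December of the final follow-up year, so that anyone diagnosed in year Y has at least (final year minus Y) complete years of potential follow-up; the function name is hypothetical.

```python
def cohort_diagnosis_years(first_year, follow_up_end_year, survival_years):
    """Diagnosis years with at least `survival_years` complete years of
    potential follow-up by the end of `follow_up_end_year`."""
    latest_eligible = follow_up_end_year - survival_years
    return list(range(first_year, latest_eligible + 1))


# With follow-up to the end of 2017, which diagnosis years from 2010 onwards
# can contribute to a cohort estimate of 5-year and of 1-year survival?
five_year_cohorts = cohort_diagnosis_years(2010, 2017, 5)
one_year_cohorts = cohort_diagnosis_years(2010, 2017, 1)
```

Under these assumptions, 2012 is the most recent diagnosis year for which a 5-year cohort estimate is possible, whereas a 1-year cohort estimate is possible for diagnoses up to 2016.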

Table 1: Cohort approach for the most recent year with follow-up to 2017

                  Follow-up year
Diagnosis year    2010  2011  2012  2013  2014  2015  2016  2017
2010                 0     1     2     3     4     5     6     7
2011                       0     1     2     3     4     5     6
2012                             0     1     2     3     4     5
2013                                   0     1     2     3     4
2014                                         0     1     2     3
2015                                               0     1     2
2016                                                     0     1
Notes:

† = 5-year survival for the most recent year.

Complete approach

The complete approach to survival analysis, a variant of the classical cohort approach, is used when some patients may have been followed up for less than the full period.

For example, it is viable to use the complete approach to produce estimates for patients diagnosed during 2012 to 2016 with follow-up until 31 December 2017, even though not every patient has had the opportunity to be followed up for the full five years. In this example, the potential follow-up time varies between a single year and five years, depending on the year of diagnosis.

The complete approach for 5-year survival is highlighted in Table 2. This approach uses all potential years of follow-up for patients diagnosed in a five-year period. The advantage of this approach is that it combines timeliness and efficiency by using all the available follow-up. A disadvantage is that this approach cannot be used to give an estimate for a single diagnosis year.
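The varying follow-up in the complete approach can be tabulated directly. This is a minimal sketch that again assumes follow-up is complete to 31 December 2017; the function name is hypothetical.

```python
def potential_follow_up(diagnosis_years, follow_up_end_year):
    """Minimum complete years of potential follow-up for each diagnosis year."""
    return {year: follow_up_end_year - year for year in diagnosis_years}


# Patients diagnosed during 2012 to 2016, followed up to the end of 2017.
complete = potential_follow_up(range(2012, 2017), 2017)
```

Every diagnosis year contributes whatever follow-up it has, which is why the estimate describes the whole five-year group of diagnoses and not any single year.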

Table 2: Complete approach for the most recent years with follow-up to 2017

                  Follow-up year
Diagnosis year    2010  2011  2012  2013  2014  2015  2016  2017
2010                 0     1     2     3     4     5     6     7
2011                       0     1     2     3     4     5     6
2012                             0     1     2     3     4     5
2013                                   0     1     2     3     4
2014                                         0     1     2     3
2015                                               0     1     2
2016                                                     0     1
Notes:

† = 5-year survival for the most recent year.

Period approach

A period estimate of 5-year survival is a short-term prediction of survival for patients diagnosed in that period, on the assumption that they will experience the most recently observed conditional probabilities of survival in each year up to five years since diagnosis. Table 3 shows that each year of potential follow-up included comes from patients diagnosed in a different year.

In this example, patients diagnosed in 2016 potentially have one year of follow-up, then patients diagnosed in 2015 potentially have two years of follow-up given that they survived the first year. This is then true for each successive year until patients diagnosed in 2011 are potentially followed up for the fifth year given they survived the fourth year.
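The way the conditional probabilities are combined can be sketched as a simple product. The probabilities below are invented for illustration and are not published estimates; the function name is hypothetical.

```python
from math import prod


def cumulative_survival(conditional_probabilities):
    """Multiply the conditional probabilities of surviving each successive
    year of follow-up, given survival to the start of that year."""
    return prod(conditional_probabilities)


# Hypothetical conditional probabilities for years 1 to 5 since diagnosis.
# In a period analysis each value is estimated from a different group of
# patients: the most recently observed data for that year of follow-up.
conditional = [0.80, 0.90, 0.95, 0.96, 0.97]
five_year_period_estimate = cumulative_survival(conditional)
```

The arithmetic is the same as in a cohort analysis; what differs is that no single group of patients has actually been followed for all five years.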

Table 3: Period approach for the most recent year with follow-up to 2017

                  Follow-up year
Diagnosis year    2010  2011  2012  2013  2014  2015  2016  2017
2010                 0     1     2     3     4     5     6     7
2011                       0     1     2     3     4     5     6
2012                             0     1     2     3     4     5
2013                                   0     1     2     3     4
2014                                         0     1     2     3
2015                                               0     1     2
2016                                                     0     1
Notes:

† = 5-year survival for the most recent year.

Hybrid approach

The hybrid approach, a variant of the period approach, is used for short-term predictions when the follow-up data are more recent than the incidence data. This short-term delay arises from the quality assurance processes applied in registering a cancer diagnosis.

These estimates assume that the probability of survival from the patients included would remain stable for the following five years. Since survival is generally improving over time, the hybrid estimate of survival will be lower than that which we can expect to observe five years from now, when the full cohort-wise estimates will be available. It has the advantage of being available several years sooner.

Table 4 highlights 5-year survival using the hybrid approach. It is like the period approach, but the first year of survival is based on patients with follow-up from the year before. This is because registries typically only have registration data up to one year behind potential follow-up. This method uses the cohort approach for the first year of follow-up for patients diagnosed in 2016, then uses the period approach for the remaining four years of potential follow-up for patients diagnosed between 2012 and 2015.

Table 4: Hybrid approach for the most recent year with follow-up to 2017

                  Follow-up year
Diagnosis year    2010  2011  2012  2013  2014  2015  2016  2017
2010                 0     1     2     3     4     5     6     7
2011                       0     1     2     3     4     5     6
2012                             0     1     2     3     4     5
2013                                   0     1     2     3     4
2014                                         0     1     2     3
2015                                               0     1     2
2016                                                     0     1
2017
Notes:

† = 5-year survival for the most recent year.

The hybrid approach can be used to predict estimates of 10-year survival, if it can be assumed that the conditional probabilities of surviving for patients diagnosed in the current year are equal to those diagnosed over the full 10-year period of available data.

More information on the differences between survival approaches can be found in the article Estimating and modelling relative survival.

Geography (including list of changes to boundaries)

In all our publications we use the latest geographical health boundaries as published, which are routinely updated in the April of each year.

NHS England Regions

NHS England Regions cover healthcare commissioning and delivery in their area and provide professional leadership on finance, nursing, medical, specialised commissioning, patients and information, human resources, organisational development, assurance and delivery. Regional teams work closely with organisations such as CCGs, local authorities, Health and Well-being Boards as well as General Practitioner (GP) practices.

Cancer Alliances

Cancer Alliances were established in late 2016 (September to December) to bring together local senior clinical and managerial leaders representing the whole cancer patient pathway across a specific geography. Cancer Alliances will lead the local delivery of the Independent Cancer Taskforce’s ambitions for improving services, care and outcomes for everyone with cancer.

Sustainability and Transformation Partnerships

Sustainability and Transformation Partnerships were established in December 2016 as local partnerships between NHS organisations and councils. They set out practical ways to improve health and care services. They are built around the needs of the local population across whole areas, not just those of the individual organisations involved.

Output quality

Cancer incidence data for England are collected by the regional offices of NCRAS, which is part of PHE. Data are submitted to NCRAS from a range of healthcare providers and other services (for example, pathology laboratories). The quality and accuracy of the data submitted by different sources may vary. The regional offices of NCRAS collate all the data for each patient, including checks for internal consistency of the sequence of dates, as well as the cancer site, sex, morphology and duplicate registrations. These checks are closely based on those published by the International Agency for Research on Cancer (IARC) and are reported on by the UKIACR.

If a record fails any critical validation check (for example, if the date of birth is invalid), it is not reported in Cancer registration statistics, nor in any other ONS publication, including survival releases, since it is not possible to send such a record to NHS Digital for verification of the patient’s vital status. If a record passes all critical validation checks, even if it fails one or more minor quality controls, it is sent to NHS Digital for verification of vital status.

Further checks are required for survival analysis; these are carried out in two stages.

The first stage involves checking the eligibility of a record based on its completeness, the patient’s usual residence, tumour behaviour and morphology. Patients with an invasive, primary, malignant tumour are eligible for analysis (see “How we analyse and interpret the data” section, which lists the full criteria). Ineligible patients include those whose tumour is benign (not malignant) or “in situ” (malignant, but not invasive) or of uncertain behaviour, or for which the organ of origin is unknown.

The second stage involves checking the patient’s age (15 to 99 years for adults; 0 to 14 years for children) and vital status, and checking that the patient’s sex is compatible with the cancer site, that the dates are valid and that the patient’s tumour was not registered solely from a death certificate.

Why you can trust our data

The ONS is the UK’s largest independent producer of statistics and its national statistics institute. The Data Policies and Information Charter details how data are collected, secured and used in the publication of statistics. We treat the data that we hold with respect, keeping it secure and confidential, and we use statistical methods that are professional, ethical and transparent.

The Adult cancer survival in England and Geographic patterns of cancer survival publications have National Statistics status, designated by the UK Statistics Authority in accordance with the Statistics and Registration Service Act 2007. This designation signifies compliance with the Code of Practice for Statistics. The remainder of the publications are currently being assessed against the Code of Practice for Statistics.

6. Methods used to produce the Cancer survival statistical bulletins data

Figure 1 sets out the broad steps that the Office for National Statistics (ONS) and Public Health England (PHE) work through to produce cancer survival statistical bulletins. The steps are referred to in the rest of this section.

How we collect the data, main data sources and accuracy

Cancer registration data are collected by the National Cancer Registration and Analysis Service (NCRAS) within PHE, and the extract from the NCRAS Cancer Analysis System contains the same data that are used for the Cancer registration statistics (Series MB1). Please refer to the Quality and Methodology Information (QMI) for details about how cancer registration data are collected and quality assured.

The checks and quality measures undertaken by NCRAS are based on the checks that the ONS Cancer Registry historically applied to the National Cancer Registration database.

NHS Digital routinely updates these individual cancer records with information on each patient’s vital status (alive, emigrated, dead or not traced). Typically, at the time that data are extracted for the most recent statistical bulletins, fewer than 0.3% of patients diagnosed during the relevant period cannot be traced.

Mortality and population estimates are used to produce smoothed subnational life tables (step 4 in Figure 1) used to calculate background mortality.

All data sources are linked by an appropriate identifier, at patient or geographical levels.

How we process the data

The estimates produced for these publications have been produced by NCRAS within PHE. These National Statistics and Experimental Statistics implement the United Kingdom and Ireland Association of Cancer Registries (UKIACR)-ratified standard operating procedure Guidelines on Population Based Cancer Survival Analysis.

Structured Query Language (SQL) is used to extract data from the NCRAS Cancer Analysis System (step 5 in Figure 1) and all statistical analyses are carried out using Stata, a dedicated statistical software package. To promote transparency in the cancer survival estimates, annotated copies of the SQL and Stata code can be provided, free of charge, on request. Please email any requests to NCRASenquiries@phe.gov.uk.

How we analyse and interpret the data

Geographical areas used (step 1 in Figure 1)

In all publications, estimates are presented for England. Users interested in England-level estimates should use the Cancer survival in England: national estimates publication. The England estimates in the other publications are created for robust comparisons with the lower geographies included in those publications.

In the Geographic patterns of cancer survival in England publication, estimates for two subnational geographies are presented: Cancer Alliances (CAs) and Sustainability and Transformation Partnerships (STPs). These are the smallest geographical units for which reliable medium-term (5-year) cancer survival estimates can be published.

In the Index of cancer survival, estimates are published for Clinical Commissioning Groups (CCGs), the smallest health geographies for which 1-year survival can be reliably estimated.

Choice of approach in selecting diagnosis years (step 1 in Figure 1)

Wherever possible, we report survival using the cohort approach; this is possible for all observed 1-year survival estimates and, where time since diagnosis allows, for 5- and 10-year survival in childhood cancer.

Publishing survival estimates only for fully observed patient groups would mean waiting until at least the full survival period (for example, five years for 5-year survival) had passed since diagnosis before an estimate could be reported.

To allow the reporting of medium- and long-term survival estimates using recent diagnosis data, we employ some different approaches:

  • the complete approach for observed data is employed for the Adult cancer survival in England, Adult cancer survival by stage at diagnosis for England and Geographic patterns of cancer survival in England publications’ estimates for 5-year survival

  • the period approach is used in the Cancer survival for children publication for 5- and 10-year survival estimates for which full follow-up does not exist

  • the hybrid approach is used in the Adult cancer survival in England and Cancer survival for children publications for 1-, 5- and 10-year predicted survival estimates

A different methodology is used for the Index of cancer survival, which is discussed in the associated QMI report for the index of cancer survival.

Choice of cancer sites (step 2 in Figure 1)

Cancer sites are chosen where there may be sufficient numbers of diagnoses within the time period of study. They are grouped, where appropriate, by broad physiological category (for example, cancers of the lower digestive tract are combined to form colorectal cancer). The same definitions are applied across all the cancer survival bulletins.

Age standardisation (step 3 in Figure 1)

Survival estimates are age-standardised wherever possible, to improve the comparability between population groups and over time. This is because cancer survival varies with age at diagnosis and the age profile of cancer patients can vary over time and between geographical areas.

From June 2017, age-standardised estimates for adults have been calculated using the International Cancer Survival Standard (ICSS) age-weightings. The impact of adopting the ICSS international cancer patient population for age-standardising survival ratios was detailed in the Impact of updating cancer survival methodologies for national estimates article. In summary, the benefits are:

For childhood cancer, the estimates are age-standardised by giving equal weight to all three age-groups: 0 to 4 years, 5 to 9 years and 10 to 14 years.
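Age standardisation amounts to a weighted average of the age-specific estimates. The sketch below uses the equal weights described for childhood cancer; the survival values are invented for illustration, the function name is hypothetical, and the ICSS weights used for adults are not reproduced here.

```python
def age_standardise(survival_by_age_group, weights):
    """Weighted average of age-specific survival estimates."""
    if set(survival_by_age_group) != set(weights):
        raise ValueError("age groups and weights must match")
    total_weight = sum(weights.values())
    weighted_sum = sum(survival_by_age_group[g] * weights[g] for g in weights)
    return weighted_sum / total_weight


# Equal weight for each of the three childhood age groups.
childhood_weights = {"0 to 4": 1, "5 to 9": 1, "10 to 14": 1}
# Hypothetical age-specific survival estimates (as proportions).
hypothetical_survival = {"0 to 4": 0.80, "5 to 9": 0.85, "10 to 14": 0.90}
standardised = age_standardise(hypothetical_survival, childhood_weights)
```

Because the weights are fixed, two populations with different age profiles are compared as if they had the same age structure.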

Choice of survival methodology (step 3 in Figure 1)

For population-level cancer survival where few, if any, patients will die within the follow-up time from a cause other than the cancer(s) with which they have been diagnosed, the “overall survival” method will produce unbiased estimates of survival.

“Overall survival” is appropriate to use in the Cancer survival for children in England publication because there is an extremely low level of mortality in children (excluding mortality in the first few months after birth) and almost all deaths in children diagnosed with cancer would be caused by their cancer.

For adults, the level of mortality in the general population is significantly higher than in children. Many of these adult cancer patients may die from another cause of death, for example, from dementia, heart attack or stroke. This means reliable and timely death certificates are needed to allow overall survival to produce estimates related only to cancer deaths.

Although the date of death is usually recorded soon after a patient dies, the cause of death can take some months and perhaps years to be reported if an inquest is ordered.

This means it may not be possible to accurately classify patients who died from their cancer and who died from other causes when producing timely, unbiased, population-level cancer survival estimates. Therefore, overall survival is not appropriate for population-level reporting of adult cancer survival.

All the survival analyses for adults use the (Pohar-Perme) net survival estimator as implemented by Isabelle Clerc-Urmès, Michel Grzebyk and Guy Hédelin’s stns programme in Stata.

To illustrate how net survival rates may be interpreted, here are two highly simplified (and artificial) calculations of 1-year net survival:

Calculation 1

A diagnosis of cancer has no effect on mortality. The 1-year mortality rate of the general population is 20%.

Because the mortality of cancer patients is the same as that of the general population, the net survival would be:

net survival = 80% ÷ 80% = 100%

The 100% net survival rate does not mean that all the cancer patients survive for one year after diagnosis; only 80% of the cancer patients survive at least one year, the same as expected in the general population.

Calculation 2

Patients diagnosed with cancer have a mortality rate that is 20% above the level experienced by the general population within one year of diagnosis. The 1-year mortality rate of the general population is 20%.

Because the mortality of cancer patients is now greater than that of the general population, the net survival would be:

net survival = 60% ÷ 80% = 75%

The 75% net survival rate does not mean that 75% of the cancer patients survive for one year after diagnosis; only 60% of the cancer patients survive at least one year, which is 75% of the level expected in the general population (of whom 80% survive at least one year).
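The two calculations above can be reproduced with a one-line ratio. This sketch mirrors only the simplified arithmetic of the illustrations; it is not the Pohar-Perme estimator, which is considerably more involved. The function name is hypothetical.

```python
def simplified_net_survival(patient_survival, expected_survival):
    """Ratio of the survival observed in cancer patients to the survival
    expected in the comparable general population."""
    return patient_survival / expected_survival


# Calculation 1: patients and the general population both have 80% survival.
calculation_1 = simplified_net_survival(0.80, 0.80)

# Calculation 2: patient mortality is 20 percentage points higher (40%),
# so 60% of patients survive compared with 80% of the general population.
calculation_2 = simplified_net_survival(0.60, 0.80)
```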

Post-estimation quality assurance checks for robustness (step 7 in Figure 1)

In the net survival analyses, for age groups where the estimates do not meet the following quality criteria, the result is suppressed for that age group of the specific cancer site:

  • a minimum of 10 patients should be alive at the beginning of the survival period being estimated (for example, first year of follow-up for a 1-year estimate, fifth year of follow-up for a 5-year estimate and 10th year of follow-up for a 10-year estimate)

  • at least two deaths registered in the years before or after the duration(s) being estimated

  • the level of the survival estimates should not increase with duration (for example, the survival estimated at five years following diagnosis should be lower than the survival estimated at one year following diagnosis)

  • the standard error of the survival estimates should be lower than 20%
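The criteria above could be applied programmatically along the following lines. This is a sketch under stated assumptions: the function and argument names are hypothetical, survival estimates and standard errors are taken to be expressed in percentage points, and the count of deaths relevant to the second criterion is assumed to have been derived beforehand.

```python
def passes_quality_criteria(alive_at_start, deaths, survival_by_duration, standard_errors):
    """Return True if an estimate meets all four robustness criteria.

    alive_at_start:       patients alive at the start of the survival period
    deaths:               deaths counted for the second criterion
    survival_by_duration: estimates (%) ordered by increasing duration,
                          for example [1-year, 5-year]
    standard_errors:      standard error (%) of each estimate
    """
    enough_patients = alive_at_start >= 10
    enough_deaths = deaths >= 2
    # Survival must not increase as the duration since diagnosis increases.
    pairs = zip(survival_by_duration, survival_by_duration[1:])
    non_increasing = all(later <= earlier for earlier, later in pairs)
    precise = all(se < 20 for se in standard_errors)
    return enough_patients and enough_deaths and non_increasing and precise


# Hypothetical inputs: the first passes, the second has too few patients.
publishable = passes_quality_criteria(50, 12, [80.0, 60.0], [2.5, 3.1])
suppressed = passes_quality_criteria(8, 12, [80.0, 60.0], [2.5, 3.1])
```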

These post-estimation tests are needed because of the modelling process underlying the comparisons and to determine which cancer sites can be published individually in the Cancer survival in England for adults (stages combined and separately) and Geographic patterns bulletins.

The same checks are not needed for the overall survival estimates for the childhood publication because of the simpler statistical process (overall survival) that is used.

Post-estimation analysis (step 7 in Figure 1)

In the Geographic patterns of cancer survival in England publication, un-standardised survival estimates for single diagnosis years are presented. A trend in survival is estimated by carrying out a variance-weighted least-squares regression analysis of the annual survival estimates for each combination of cancer, follow-up period and geography.

The estimated trend represents the average annual change in net survival over eight consecutive years. Due to the year-on-year variability of the survival estimates in smaller areas (for example, Sustainability and Transformation Partnerships (STPs)), the average annual trend may be increasing over eight years, even though a fall in survival may be observed between two consecutive years.

The annual trend in survival is only reported if at least three annual survival estimates were available and the absolute difference in survival between two consecutive years did not exceed 20%.

The p-value indicates whether the average annual change in survival is statistically significant. A p-value lower than 0.05 indicates that, if there were no underlying trend, a change at least as large as the one observed would be expected to occur by chance less than 5% of the time.
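A variance-weighted least-squares trend can be sketched as follows. This is an illustrative Python version, not the Stata code used for the publication; it assumes each annual estimate is weighted by the inverse of its squared standard error, that the p-value comes from a two-sided normal test, and that the 20% threshold refers to percentage points. The function names and the example data are hypothetical.

```python
from math import erfc, sqrt


def vwls_trend(years, estimates, standard_errors):
    """Return (slope, standard error of slope, two-sided p-value)."""
    weights = [1 / se ** 2 for se in standard_errors]
    total = sum(weights)
    mean_year = sum(w * x for w, x in zip(weights, years)) / total
    mean_est = sum(w * y for w, y in zip(weights, estimates)) / total
    sxx = sum(w * (x - mean_year) ** 2 for w, x in zip(weights, years))
    sxy = sum(w * (x - mean_year) * (y - mean_est)
              for w, x, y in zip(weights, years, estimates))
    slope = sxy / sxx
    slope_se = sqrt(1 / sxx)
    z = slope / slope_se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return slope, slope_se, p_value


def trend_is_reportable(estimates):
    """At least three annual estimates, and no change between consecutive
    years of more than 20 percentage points."""
    enough = len(estimates) >= 3
    stable = all(abs(b - a) <= 20 for a, b in zip(estimates, estimates[1:]))
    return enough and stable


# Hypothetical annual net survival estimates (%) rising by one point a year.
years = list(range(2010, 2018))
estimates = [50.0 + (year - 2010) for year in years]
slope, slope_se, p_value = vwls_trend(years, estimates, [1.0] * len(years))
```

Weighting by the inverse variance means that imprecise annual estimates, which are common in smaller areas, have less influence on the fitted trend.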

These trends are not calculated for Adult cancer survival in England, Adult cancer survival by stage at diagnosis in England, Cancer survival for children in England or the Index of cancer survival publications.

In the Cancer survival for children in England publication, to reduce the volatility of the reported estimates (mainly a result of relatively small numbers of diagnoses each year in children), locally-weighted scatterplot smoothing (PDF, 309KB) is applied to highlight underlying trends over time.

This smoothing technique is not applied to the Adult cancer survival in England, Adult cancer survival by stage at diagnosis in England, Geographic patterns of cancer survival or the Index of cancer survival publications.

Consistency checks (step 8 in Figure 1)

On completion of the previous steps, the ONS conducts a number of consistency checks alongside checks completed by PHE. These can be classified into three categories:

  • raw data checks; these include checking counts between each cancer site and that these are comparable with estimates provided in the Cancer registrations in England release

  • sensitivity checks; these include checking aggregates

  • survival checks; these involve checking percentage of stage completeness, percentage of patients lost to follow up and comparisons of estimates to previous years’ results

Summary tables and bulletin production (step 9 in Figure 1)

Once all the data and estimates have been quality-assured and agreed, the combined ONS and PHE team construct summary tables that contain the estimates produced that pass the robustness criteria; these display the data in themed sections so the users can find the estimates of interest and analyse them with ease.

An accompanying bulletin picks out main features of the estimates and places them in the context of previous publications and health policy considerations.

How we quality assure and validate the data

Following extraction, we apply data inclusion and exclusion criteria. These data quality checks ensure that patients and their cancer(s) are uniquely identified and that the data about the patient and their cancer(s) are self-consistent.

The following criteria are used to identify the patients that are eligible to be included in the analysis (and the final number of eligible patients is provided as part of the publication release):

  • patients should have a unique identifier; this is to make sure cancers for one patient are not assigned to another patient

  • patients should have a complete date of birth, so their age can be calculated at various time points

  • adults should be aged between 15 and 99 years at diagnosis; to match to life tables (see How we analyse and interpret the data)

  • children should be aged between 0 and 14 years at diagnosis

  • patients should have a known sex; this is a data quality check and to match to life tables

  • patients should have a complete date of cancer diagnosis; this is a data quality check, to match to life tables and to calculate survival time

  • patients who have died should have a complete registered date of death; data quality check and to match to life tables

  • patients should have a known date of being recorded as alive or dead; data quality check and to calculate their survival time after diagnosis

  • patients should be resident in England and have a valid postcode for their usual place of residence at the time of diagnosis; match patients to life tables

  • cancers should be (potentially) lethal, newly diagnosed in the studied cohort and a primary cancer (one that hasn’t spread from another part of the body); this is so the date of original diagnosis is known

  • cancers of the blood (for example, lymphomas, leukaemia and myelomas) should not occur in a solid cancer; data quality check

  • patients are included even if they have further new cancer diagnoses later in the period of interest; this ensures that a patient is only included once in each group of patients and survival time is counted from the earliest diagnosis of the cancer of interest in each period of interest

  • patients are excluded if they have had a primary cancer in the same site diagnosed before the period of interest; if a patient has two or more cancers of the same type, it is not clear whether survival time from that type of cancer should be measured from the first or later diagnosis

  • patients are included where the earliest diagnosis of a cancer of interest occurred within the period of interest even if they have a primary cancer of another site diagnosed at any time; this treats the patients where the cancer registry is unaware of previous cancer diagnoses in the same way as where this medical history is known

  • cancers where the only confirmed record of the cancer is on the patient’s death certificate are excluded; as cancer survival attempts to assess the effectiveness of the health system in treating patients with cancer, patients in whom cancer is only found after death cannot contribute to this assessment

  • the sequence of dates should be consistent; data quality check, for example, a patient should not be diagnosed before they are born
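A few of the criteria above can be illustrated as a simple filter. This sketch covers only a subset of the checks for adults; the record structure, the field names and the function names are hypothetical and do not reflect the NCRAS Cancer Analysis System.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Registration:
    patient_id: Optional[str]
    date_of_birth: Optional[date]
    date_of_diagnosis: Optional[date]
    sex: Optional[str]
    resident_in_england: bool
    death_certificate_only: bool


def age_at(birth, on):
    """Completed years of age on a given date."""
    had_birthday = (on.month, on.day) >= (birth.month, birth.day)
    return on.year - birth.year - (0 if had_birthday else 1)


def is_eligible_adult(r):
    """Apply a subset of the eligibility criteria for adult analyses."""
    if r.patient_id is None or r.date_of_birth is None or r.date_of_diagnosis is None:
        return False
    if r.sex is None or not r.resident_in_england or r.death_certificate_only:
        return False
    if r.date_of_diagnosis < r.date_of_birth:
        return False  # inconsistent sequence of dates
    return 15 <= age_at(r.date_of_birth, r.date_of_diagnosis) <= 99


# Hypothetical records.
adult = Registration("A1", date(1950, 6, 1), date(2016, 3, 15), "F", True, False)
child = Registration("B2", date(2005, 6, 1), date(2016, 3, 15), "M", True, False)
```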

Other decisions applied include:

  • where a patient dies on the date of diagnosis and has more records than those on a death certificate, then these patients should be included in the survival analyses but should have one day added to the recorded date of death to prevent Stata’s stset command (PDF, 345KB) from excluding those patients

  • when two or more tumours of the same type are diagnosed on the same day for a patient, the one with the worst prognosis is chosen for inclusion; this ensures that a patient is only included once in each group of patients

  • coding the cancers with reference to International Statistical Classification of Diseases: ICD-10 to select similar groups of cancers; the details of the coding applied are included in each bulletin.
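The first of these decisions, adding one day when death occurs on the date of diagnosis, can be sketched as below. The function name is hypothetical, and the example is written in Python for illustration where the production analyses use Stata.

```python
from datetime import date, timedelta


def adjusted_date_of_death(date_of_diagnosis, date_of_death):
    """Add one day when death is recorded on the date of diagnosis, so the
    patient has a non-zero survival time and is kept in the analysis."""
    if date_of_death is not None and date_of_death == date_of_diagnosis:
        return date_of_death + timedelta(days=1)
    return date_of_death


# Hypothetical patient who died on the day of diagnosis.
same_day = adjusted_date_of_death(date(2016, 5, 10), date(2016, 5, 10))
```

Patients who are still alive (no date of death) and patients who died on a later date are returned unchanged.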

How we disseminate the data

All publications are available on the ONS website. The web pages may be saved as a PDF for offline use. Each publication has accompanying data tables on which the commentary is based. These can also be saved for offline use.

How we review and maintain the data processes

Future revisions of health coding and the evolving understanding of cancer epidemiology may require cancer groupings or methodologies to be updated. If this happens, an impact paper will be published together with a back series of estimates, so users can assess the changes made.

7. Other information

Useful links

We have produced a short document, Cancer statistics explained: different data sources and when they should be used, that summarises the different contents and uses of the cancer bulletins.

More information about cancer survival and registration is available via the UK and Ireland Association of Cancer Registries and the National Cancer Registration and Analysis Service.

Information for patients and carers can be found at NHS online or by searching for charities set up to help cancer patients.

Assessment of user needs and perceptions

(The processes for finding out about uses and users, and their views on the statistical products.)

A stakeholder review of all our cancer publications was conducted in 2010. Stakeholders were asked for their views about how they use the relevant outputs, their importance and their quality. Comments were also sought on any changes respondents would like to see in terms of content and presentation of the outputs and of our cancer web pages. The results of this consultation are available.

A stakeholder consultation of all our cancer publications was undertaken in 2012 to determine future user needs. The results of this consultation are available. One of the important needs identified as part of this consultation was for data on stage at cancer diagnosis, which are collected under the National Cancer Registration Scheme and collated by Public Health England (PHE).

Due to improvements made by PHE in the collection of stage information as part of cancer registration, PHE in partnership with the Office for National Statistics (ONS) are now able to publish estimates of survival by stage at diagnosis of cancer in the form of Experimental Statistics. The ONS will continue working with PHE to ensure that such survival estimates will be published as National Statistics in the future.

To promote ongoing feedback and to accommodate users’ needs, a workshop was held on 20 March 2017 to discuss which aspects users felt would be more useful and what they would like to see in future releases. The workshop included participants from the Department of Health, NHS Digital, Public Health England, Office for National Statistics, Cancer Research UK, Macmillan, RM Partners Cancer Vanguard and UCL Great Ormond Street Institute of Child Health.

We welcome feedback from users on the content, format and relevance of our statistics. Please contact us via email at cancer.newport@ons.gov.uk. A further stakeholder consultation is being planned for summer 2019.

Legislation

The Statistics and Registration Service Act 2007 permits the Registrar General to provide to the UK Statistics Authority, to carry out any of its functions, both information that is kept under the Births and Deaths Registration Act 1953 and any other information received by the Registrar General in relation to any birth or death.

The Health Service (Control of Patient Information) Regulations 2002 Statutory Instrument Number 1438, Regulation 2, permits confidential patient information relating to patients referred for the diagnosis or treatment of cancer to be processed for the following purposes:

  • the surveillance and analysis of health and disease

  • the monitoring and audit of health and health-related care provision, and outcomes where such provision has been made

  • planning and administration of the provision made for health and health-related care

  • research approved by research ethics committees for the provision of information about individuals who have suffered from a particular disease or condition where: that information supports an analysis of the risk of developing that disease or condition, and it is required for the counselling and support of a person who is concerned about the risk of developing that disease or condition

This regulation was made under Section 60 of the Health and Social Care Act 2001 and continues to have effect under Section 251 of the NHS Act 2006.

The Office for National Statistics processes and stores cancer registration data in accordance with the requirements of:

Public Health England processes and stores cancer registration data in accordance with the requirements of:
