What is the ONS Longitudinal Study (LS)?
The LS contains linked census and life events data for a 1% sample of the population of England and Wales. It contains records on over 500,000 people usually resident in England and Wales at each point in time and it is largely representative of the whole population. The LS is the largest longitudinal data resource in England and Wales.
The LS has linked records at each census since the 1971 Census, for people born on one of four selected dates in a calendar year. These four dates were used to update the sample at the 1981, 1991, 2001 and 2011 Censuses. Life events data are also linked for LS members, including births to sample mothers, deaths and cancer registrations. New LS members enter the study through birth and immigration (if they are born on one of the four selected birth dates).
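The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the four birthdays below are hypothetical placeholders (the real LS dates are confidential), and the names `LS_BIRTHDAYS` and `is_ls_member` are made up for this example. It also shows why four dates give a sample of roughly 1%.

```python
from datetime import date

# Hypothetical placeholder birthdays as (month, day). The real LS dates are
# confidential and are NOT these values.
LS_BIRTHDAYS = {(1, 15), (4, 15), (7, 15), (10, 15)}

def is_ls_member(date_of_birth: date) -> bool:
    """Return True if the birthday (in any year) is one of the selected dates."""
    return (date_of_birth.month, date_of_birth.day) in LS_BIRTHDAYS

# Four days out of 365 gives the approximate sampling fraction.
sampling_fraction = len(LS_BIRTHDAYS) / 365

# Made-up census records for illustration.
people = [
    {"id": "A", "dob": date(1950, 1, 15)},
    {"id": "B", "dob": date(1962, 3, 2)},
    {"id": "C", "dob": date(1988, 10, 15)},
]
sample = [p["id"] for p in people if is_ls_member(p["dob"])]

print(sample)                       # ['A', 'C']
print(round(sampling_fraction, 4))  # 0.011
```

Because selection depends only on the day and month of birth, the same rule can be applied to census records, birth registrations and new NHS registrations alike.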
Data on approximately 1 million sample members has been collected over the 40 years of the study. Figure 1 shows LS sample members collected in each census.
Table 1: Number of sample members and traced sample members at time of each census
The LS documents periods of change over a sample member’s life. Research topics include:
- health and mortality
- ageing (age at each census)
- family formation (including marital status, family and household type)
- ethnicity and religion
- migration (country of birth)
- educational and professional activity
- social class
The LS has the advantage of a very large sample size and low levels of attrition. This allows for extensive research into subgroups of the population of England and Wales, producing robust results. There are separate longitudinal studies for Scotland and Northern Ireland.
For more detail on the LS, see the publication Longitudinal Study 1971 to 1991: History, organisation and quality of data (available from The National Archives).
What data does the LS contain?
Census information is collected once every 10 years and mostly relates to people’s circumstances at the time. All information collected on the census forms is included in the LS, for example, age, sex, marital status and many other socio-demographic topics. Census data are also collected on the LS members’ co-residents, as they appear on the same census form. For a full list of data and variables, please access the information within the LS data dictionary (which provides complete information on all variables and tables available to researchers using the LS).
Event data for LS members usually resident in England and Wales, from the civil registration system, NHS registration systems and the cancer registries, has been added to the LS since Census Day in 1971. This includes:
- births (entry events)
- immigration (entry events)
- deaths (exit events)
- emigration (exit events)
Other event data includes:
- live and still births to female LS members
- deaths of LS members’ spouses
- deaths of infants born to female LS members
- cancer registrations
- re-entries to the LS after embarkation or enlistment in the armed forces
What happened to LS members over time – cohort interactive tool
Our interactive tool explains what happens to LS members over time. Exits from the LS occur in two ways: death or emigration. The quality of death data within the LS is extremely high, as death registration is required by law. However, emigration data are less comprehensive because they rely on LS members informing the NHS when they leave the country. It is known that many emigrants do not inform the NHS, which leaves gaps in the coverage of the data. This largely explains the number of individuals categorised as lost to follow-up.
Research and outputs using the LS
CeLSIUS maintains a database of all publications using LS data from the early 1980s onward.
CALLS-Hub has a list of outputs from recent research using the LS, the Scottish LS and the Northern Ireland LS.
Current and completed projects using the LS.
What can the LS be used for?
The LS can be used for several types of analysis, over many different research areas. The studies that make best use of LS data are those that link social, occupational and demographic information to data on life events. Examples include studies of mortality, cancer incidence and survival, and fertility patterns. The individual-level data of the LS means that person-years at risk can be calculated for epidemiological studies.
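As a rough illustration of the person-years calculation mentioned above, the sketch below sums the time each member spends under observation within a follow-up window. The window, the entry and exit dates, and the helper name `person_years` are all assumptions made for this example. They are not LS data or LS tooling.

```python
from datetime import date
from typing import Optional

DAYS_PER_YEAR = 365.25  # common approximation that averages over leap years

def person_years(entry: date, exit_: Optional[date],
                 window_start: date, window_end: date) -> float:
    """Years at risk contributed inside the window; exit_ is None if still present."""
    start = max(entry, window_start)
    end = window_end if exit_ is None else min(exit_, window_end)
    if end <= start:
        return 0.0
    return (end - start).days / DAYS_PER_YEAR

# Illustrative follow-up window and made-up members (entry date, exit date).
window = (date(2001, 1, 1), date(2011, 1, 1))
members = [
    (date(1971, 1, 1), None),              # present throughout the window
    (date(1971, 1, 1), date(2006, 1, 1)),  # exits part-way (for example, death)
    (date(2005, 7, 1), date(2008, 7, 1)),  # enters and exits within the window
]

total = sum(person_years(entry, exit_, *window) for entry, exit_ in members)
deaths = 1  # made-up event count for the same window
rate_per_1000 = 1000 * deaths / total

print(round(total, 1))  # 18.0
```

Dividing the number of events by the total person-years gives a rate that accounts for members who enter late or exit early, which a simple head count would not.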
The ability to combine detailed personal characteristics with area characteristics has proved useful in many studies of health, for example, those looking at environmental effects on health, and those on inequalities in health.
Linked census data for members of the LS allow researchers to examine change between censuses by investigating the same people through two or more censuses.
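A minimal sketch of this kind of between-census comparison follows. It assumes each record carries a person-level key (here a made-up LS number) and uses a hypothetical housing tenure variable. The values are invented for illustration.

```python
from collections import Counter

# Hypothetical linked records: LS number -> value of one variable at each census.
tenure_2001 = {1: "renting", 2: "owner", 3: "renting", 4: "owner"}
tenure_2011 = {1: "owner", 2: "owner", 3: "renting", 5: "renting"}

# Only members present at both censuses can contribute to a transition table.
linked = tenure_2001.keys() & tenure_2011.keys()

# Count each (status in 2001, status in 2011) pair.
transitions = Counter((tenure_2001[i], tenure_2011[i]) for i in linked)

for (before, after), n in sorted(transitions.items()):
    print(f"{before} -> {after}: {n}")
```

Members 4 and 5 appear at only one census, so they drop out of the table. In real analyses, such cases would need to be examined separately, because they may reflect exits, new entries or failures to link.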
Studies of social mobility have examined changing class position by age. Information on co-residents of LS sample members has been used to study intergenerational mobility.
The size of the LS makes it suitable for the study of ageing. Studies have used the information collected on the co-residents and family status of LS sample members to examine changes to household and family arrangements that come with age.
Census forms ask about addresses 1 year ago. The linked census data in the LS have been used to study 10-year migration patterns between censuses. In addition, information on place of enumeration in 1939 has been used to study migration over longer periods.
The addition of 2011 Census data on year of arrival (non-UK born residents) and language has meant that individual-level analysis on first- and second-generation migrants and language proficiency is now feasible.
How to use the LS
We actively promote wide use of the LS while maintaining the confidentiality of individuals in the LS sample. To ensure confidentiality, the data can only be accessed in secure settings at our offices in London, Newport (South Wales) and Titchfield (Hampshire). Only statistical analysis or data tabulations are released to researchers through an output clearance process.
Researchers need to make an application to access the LS for research purposes. A user support service is available to help researchers. This includes:
- advice on sample sizes and the suitability of the LS for particular projects
- advice on data content and linkage issues
- helping you through the application procedure
- identifying the variables and the study population to be included in an extract
- making data extracts
- transforming data and producing the tables or files necessary for your analyses
- advising on clearance procedures and confidentiality rules
Information and support for UK-based users from the academic, statutory and voluntary sectors can be obtained from the Centre for Longitudinal Study Information and User Support (CeLSIUS) by emailing Celsius@ucl.ac.uk. All other users should contact our Longitudinal Study Development Team (LSDT): LongitudinalStudy@ons.gov.uk.
If you are interested in using the LS, please contact CeLSIUS. A step-by-step guide to using the LS is available from the CeLSIUS website.
LS user resources
The data dictionary is a useful resource for LS users. It gives details of the large number of different variables in the LS.
Each census in England and Wales used two types of form: one for private households and another for communal establishments. The form for communal establishments does not contain questions about relationships in a household or questions about household amenities. Copies of the census forms used to collect LS data are available to download from census forms 1971 to 2011.
Census definitions and concepts
The publication Longitudinal Study 1971 to 1991: History, organisation and quality of data (available from the National Archives) is the definitive resource for information about the LS from 1971 to 1991. It provides the best source of information on how events data are linked to the LS and the quality of event information from 1971 to 1991.
Definitions and concepts change at each census. The related documents describe those used at each census, such as "enumerated population", "usual residence", "size of family" or "economic position".
The LS contains linked census and life event information for a 1% sample of the population of England and Wales since 1971.
Data on the extent to which individuals are enumerated, traced and linked correctly at each census are vital for accurate use of the LS. So are data on the demographic characteristics of those previously in the LS who have no valid exit event but were not found at a census, and of those individuals with multiple enumerations at the time of the census.
Detailed reports are available to download on the quality of linkages, and success rates in tracing and linking census records to the LS since 1971. In addition, more detailed data on linkage, tracing and sampling by main demographic characteristics are also available to download.
In addition to census data, the LS includes data on a number of registered events occurring to LS members. A detailed report on event quality is available to download.
There are several ways you can find out more about the LS.
To discuss a potential research project, please contact the Centre for Longitudinal Study Information and User Support (CeLSIUS). CeLSIUS is an Economic and Social Research Council (ESRC) funded support team for UK academic, government, statutory and voluntary sector users of the LS.
To find out more about CeLSIUS, please telephone +44 20 7679 1995 or email: email@example.com
To find out more about the LS history, data and processes, please contact the LS Development Team by telephone +44 1329 444696 or email: firstname.lastname@example.org
The Office for National Statistics (ONS) Longitudinal Study (LS) links data from three main sources: the census, the civil registration service, and NHS registration systems.
Here we detail why we need to link these data, how the data are processed and the uses made of these linked data. We also describe the steps that we take to make sure the information and individual privacy are protected.
Why we use information in this way
It was recognised as long ago as the middle of the 19th century that longitudinal cohort data were needed to fully understand patterns and inequalities in mortality. Early work in this field was led by William Farr, a noted epidemiologist and first "Compiler of Abstracts" at the newly established General Register Office for England and Wales in 1839. Farr was the first to combine information from a national census (1861) and the death registers to look at the occupation of men, their age at death and their cause of death.
The LS was designed to provide an official data source for the production of statistics on this topic. It was estimated that the LS would need to have a sample size of around 500,000 people, or approximately 1% of the population, in order to enable the production of robust mortality statistics. It was established in 1974 by taking a sample of records from the 1971 Census for England and Wales of all those born on one of four dates (the LS birth dates). This original sample has been continuously augmented since 1971 with new members.
In addition to the study of mortality, it was recognised that by linking birth events to mothers in the LS sample, the LS would also enable new analyses into fertility patterns, in particular the spacing of births and the part that social and economic characteristics play in family formation. The LS addresses these needs, and many more, by linking existing census and life event data.
How we process data
The processing required to maintain and update the LS is managed by the LS Development Team (LSDT) at the ONS. In order to link data from various sources, a unique identifier called the LS number is issued to each LS member by the LSDT.
All the data processed by the LSDT need to go through a process that enables data for the same individual from different sources to be linked. The processing service used is provided by NHS Digital (NHSD).
The NHSD team that provides the service was part of the Office of Population Censuses and Surveys between 1971 and 1996, and then became part of the Office for National Statistics when it was formed in 1996. This team moved to become part of NHSD (known as the Health and Social Care Information Centre at the time) on 1 April 2008, as a result of the enactment of the Statistics and Registration Service Act 2007.
To link census data, the LSDT works with ONS colleagues to arrange the creation of a data file to be used for processing by NHSD. This file only includes variables that help with the processing work. This includes personal data such as name, address including post code, date of birth and sex for each census record with one of the LS birth dates. NHSD processes these records and sends the results back to the LSDT. Once this work is complete, NHSD securely deletes all census information sent for processing.
Life events data are processed and linked annually between censuses. Each type of event (new births, births to sample mothers, deaths, widow(er)hoods) is processed separately. In each case, this involves the LSDT creating a processing file of records of people born on one of the four LS birth dates. NHSD processes these records and sends back the relevant LS number for each record. The files sent for processing include personal data such as name, address including post code, date of birth and sex. These files are securely deleted once processing is complete.
NHSD also sends data regarding NHS registrations to the LSDT for inclusion in the LS. This includes records of all new non-birth NHS registrations where the person reports one of the four LS birth dates. The LSDT issues each of these records with a new LS number and notifies NHSD of the LS numbers issued. NHSD also notifies the LSDT of exit events, re-entry events and deaths occurring to people flagged on its system as LS members.
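The exchange described above can be pictured as a lookup that returns an LS number for each processing record, plus a step that issues a new number to a new entrant. The sketch below is a deliberate simplification: the source does not describe the actual matching rules, so exact matching on surname, date of birth and sex is an assumption made for illustration. The single `register` dictionary stands in for systems that in practice are split between the LSDT and NHSD, and all names and values are hypothetical.

```python
from itertools import count

# Hypothetical register: matching key -> LS number (values are made up).
register = {("SMITH", "1950-01-15", "F"): 1001}
next_ls_number = count(1002)  # source of new, unused LS numbers

def match_key(record: dict) -> tuple:
    """Build a normalised key from the identifying fields of a record."""
    return (record["surname"].strip().upper(), record["dob"], record["sex"])

def link(record: dict):
    """Return the existing LS number for a record, or None if it is not traced."""
    return register.get(match_key(record))

def register_new_member(record: dict) -> int:
    """Issue a new LS number to a new entrant and store it in the register."""
    ls_number = next(next_ls_number)
    register[match_key(record)] = ls_number
    return ls_number

death_record = {"surname": " smith", "dob": "1950-01-15", "sex": "F"}
new_registration = {"surname": "Jones", "dob": "1988-10-15", "sex": "M"}

print(link(death_record))                     # 1001
print(link(new_registration))                 # None
print(register_new_member(new_registration))  # 1002
```

Once a record carries an LS number, the identifying fields are no longer needed for analysis, which is what allows them to be deleted after each processing cycle.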
NHSD is the only place where a complete, permanent record of the names and addresses of all LS members is held. The LSDT at the ONS only holds name and address information while processing the data. These data are deleted once each processing cycle is complete.
Files are transferred between the LSDT and NHSD using NHSD's Secure Electronic File Transfer (SEFT) system. The use of SEFT for this purpose has been approved by ONS's Security and Information Management team.
Who can access the information?
The system used to process LS data at the ONS is located on a dedicated part of ONS's technical infrastructure with enhanced security protection in place. It can only be accessed by members of the LSDT. Access to the system can only be made using dedicated desktops with fixed IP addresses. To avoid the risk of screens being overlooked, these desktops are located in a shielded area. Only LSDT staff can work in this area. All staff involved in this work have been cleared at the Security Check (SC) level.
On completion of each cycle of processing, a research version of the LS database is created and placed in ONS's Secure Research Service (SRS). No direct identifiers are included in the SRS version of the data, and it also excludes variables that present the greatest risk of identification of individuals: dates of birth; dates of death of infants; and small area geography codes below the level of local authority.
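The exclusions described above amount to dropping a fixed set of fields from every record before it reaches the research environment. A minimal sketch, with hypothetical field names that are not the actual LS variable names:

```python
# Hypothetical field names chosen for illustration only.
DIRECT_IDENTIFIERS = {"name", "address", "postcode"}
HIGH_RISK_VARIABLES = {"date_of_birth", "infant_date_of_death", "small_area_code"}

def research_version(record: dict) -> dict:
    """Return a copy of the record without identifying or high-risk fields."""
    excluded = DIRECT_IDENTIFIERS | HIGH_RISK_VARIABLES
    return {field: value for field, value in record.items() if field not in excluded}

full_record = {
    "ls_number": 1001,
    "name": "A. Smith",
    "address": "1 Example Street",
    "postcode": "AB1 2CD",
    "date_of_birth": "1950-01-15",
    "small_area_code": "X000001",
    "local_authority": "Example District",
    "sex": "F",
    "age_at_census": 61,
}

print(research_version(full_record))
```

In this sketch, derived variables such as age at census and geography at local authority level remain available, so most analyses are unaffected by the exclusions.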
Access to the full SRS version is restricted to User Support Officers (USOs) from the LSDT and from the Centre for Longitudinal Study Information and User Support (CeLSIUS). The CeLSIUS USOs are employees of University College London, funded by the Economic and Social Research Council. They carry out the user support role on ONS infrastructure using ONS devices. They have all been cleared at the Security Check (SC) level.
Researchers wishing to use LS data need to successfully complete an Accredited Researcher application and also need their project to be approved by the UK Statistics Authority's Research Accreditation Panel. Their project application needs to identify the population and the variables required to carry out the research. If the project is approved, a bespoke data extract that only includes the specified data will be created by a USO and made available to the researcher. These data extracts can only be accessed via the SRS and from within the UK.
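A bespoke extract of this kind combines two restrictions: which people are in the approved study population, and which variables the project was approved to use. The sketch below illustrates the idea with made-up records. The function name `make_extract`, the variable names and the population rule are assumptions for this example, not the tooling used by User Support Officers.

```python
from typing import Callable, Iterable

def make_extract(records: Iterable[dict],
                 approved_variables: list,
                 in_population: Callable[[dict], bool]) -> list:
    """Keep only approved variables, and only records in the study population."""
    return [
        {variable: record[variable] for variable in approved_variables}
        for record in records
        if in_population(record)
    ]

# Made-up research records for illustration.
records = [
    {"ls_number": 1, "sex": "F", "age_2011": 34, "social_class": "II", "religion": "None"},
    {"ls_number": 2, "sex": "M", "age_2011": 71, "social_class": "IV", "religion": "Christian"},
    {"ls_number": 3, "sex": "F", "age_2011": 68, "social_class": "I", "religion": "Christian"},
]

# Hypothetical approved project: social class of people aged 65 and over.
extract = make_extract(
    records,
    approved_variables=["ls_number", "age_2011", "social_class"],
    in_population=lambda r: r["age_2011"] >= 65,
)

print(extract)
```

Using an explicit list of approved variables, rather than a list of exclusions, means that anything not named in the project application is left out by default.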
To ensure safe use of secure data, the SRS applies the Five Safes Framework. This is a set of principles that researchers and their organisations must adhere to. The Five Safes cover people, projects, settings, data and outputs.
How long are personal data kept?
Data protection law requires that personal data are kept for no longer than is needed to fulfil the purposes for which they were originally collected. The LSDT only retains data including direct identifiers such as name and address for as long as is necessary for processing. When each processing cycle is complete, the LSDT permanently deletes the files that include direct identifiers.
The law allows that information held for statistical purposes only may be kept for longer periods. Longitudinal data resources increase in value and utility as time passes and more data are added. The statistical data linked in the LS will be retained in de-identified form for as long as the ONS continues to maintain and update the study.
How the law protects your information
The General Data Protection Regulation and the Data Protection Act 2018 determine how, when and why any organisation can process your personal data. Personal data are any information that can identify a living individual. These laws exist to make sure your data are managed safely and used responsibly. They also give you certain rights about your data and create a responsibility on the ONS, as a user of personal data, to provide you with certain information.
The ONS is a statutory body, meaning it was created by legislation, specifically the Statistics and Registration Service Act 2007. Our objective is to promote and safeguard the production and publication of official statistics that serve the public good. All our collection and use of data comes from powers that can be found in that Act or other UK legislation.
The Census Act 1920 ensures that we treat personal data from the census securely. It is a criminal offence for ONS staff or our suppliers to misuse personal census data.
Legal basis for processing your data
The LS was first established in the 1970s. It was recognised from the beginning that, as data built over time, stringent measures would need to be in place to protect the identities of those whose data were included. One of these measures has been to keep the four dates of birth used to select the sample confidential. Only staff at the ONS and NHSD who have needed to know the dates to carry out their work have been made aware of them.
The ONS has not sought the consent of people to be included in the LS. If we were to try to contact all LS members to seek their consent, or if we made the dates of birth public knowledge, we would significantly increase the risk of individuals becoming identifiable from their data.
The wide range of research conducted using the LS has delivered significant public benefit. The impact on individuals of their data being used in this way has been minimal, as nobody is aware whether their data have been included. On balance, therefore, we believe that the continued protection of the dates of birth is the most ethical approach to take.
This does not deny any individual their rights under the General Data Protection Regulation. Details on how to exercise these rights are provided in the "Further information" section.
Data protection legislation requires that all processing of personal data is undertaken under one or more conditions.
LS data are processed under the condition that "processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller".
Under data protection legislation, an additional condition is needed to process special category personal data. Special category personal data includes information about racial or ethnic origin, religious or philosophical beliefs and sexual orientation. These topics are included in census data linked to the LS. We process this information under the condition "processing is necessary for archiving in the public interest, scientific or historical research purposes or statistical purposes based on UK law".
The data controller
The controller is the person or organisation that decides which personal data will be processed and for what purpose. For the LS, the ONS is the controller and makes those decisions.
You can contact us by:
Telephone: +44 1329 444696
Longitudinal Study Development Team
Office for National Statistics
More information about ONS principles, policies and practices regarding privacy and data protection, including your statutory rights, is available.
You might also be interested in:
- 1991 Census definitions and concepts (3.0 MB pdf)
- 1981 Census definitions and concepts (1.9 MB pdf)
- 1971 Census definitions and concepts (2.4 MB pdf)
- Longitudinal Study 2001 – 2011 Completeness of census linkage (1.5 MB pdf)
- 1971 Census quality check (2.2 MB pdf)
- Quality of tracing at the 2011 Census (148.5 kB xls)
- Quality of tracing at the 2001 Census (115.5 kB pdf)
- Quality of tracing at the 1991 Census (91.8 kB pdf)
- Quality of tracing at the 1981 Census (90.1 kB pdf)
- Quality of tracing at the 1971 Census (75.9 kB pdf)
- 2011 Census sampling fractions (461.3 kB xls)
- 2001 Census sampling fractions (110.5 kB pdf)
- 1991 Census sampling fractions (45.7 kB pdf)
- 1981 Census sampling fractions (46.5 kB pdf)
- 1971 Census sampling fractions (37.7 kB pdf)
- Quality of linkage 2001 to 2011 (497.7 kB xls)
- Quality of linkage between 1991 and 2001 Censuses (68.4 kB pdf)
- Quality of linkage between 1981 and 1991 Censuses (63.6 kB pdf)
- Quality of linkage between 1971 and 1981 Censuses (61.0 kB pdf)
- 1971 - 1991: The quality of event sampling and linkage within the Longitudinal Study (265.2 kB pdf)
- Event quality report (64.4 kB pdf)
- LS infant mortality 1971 to 2016 (76.8 kB xls)
- New birth of LS members 1971 to 2016 (107.5 kB xls)
- LS deaths 1971 to 2016 (229.9 kB xls)
- LS widow(er)hoods 1971 to 2016 (107.0 kB xls)
- Births to sample mothers 1971 to 2016 (103.4 kB xls)