Maternity Hospital Episode Statistics (HES) data for 2007 were linked to birth registration and NHS Numbers for Babies (NN4B) data to bring together some key demographic and clinical data items not otherwise available at a national level. This extended the linkage beyond 2005–06, the period for which data had previously been linked and reported.
This work forms part of the project 'Linkage analysis and dissemination of national birth and maternity data for England and Wales', funded by the Medical Research Council as part of the Joint Wellcome Research Councils Electronic patient data linkage initiative. We would like to thank Northgate Solutions, in particular Jonathan Low, for linking the datasets; Julie Messer at ONS for providing the birth registration – NHS Numbers for Babies linked data to Northgate Solutions for linkage to Maternity HES records, for making the linked data accessible in the VML system and for releasing outputs; and Chris Roebuck and Tony Childs at the NHS Information Centre for their advice and support. Collaborators in the original National Gestational Age project included, in addition to the authors, Lesz Lancucki, formerly of Maternity Hospital Episode Statistics, Community Health Statistics and Surveys, NHS Information Centre, and Tony Couch, formerly Head of Information Products, Health Solutions Wales, whom we would like to thank for their help in the earlier stage of the project. We are grateful to Gwyneth Thomas, Health Statistics and Analysis Unit, Welsh Assembly Government, and Martin Ward Platt, Clinical Director, Regional and Maternity Surveys Office, North East Region, for their help and support in this project.
Birth registration and NN4B records were linked to Maternity HES delivery records and also to Maternity HES baby records using the NHS Number when available. Other direct identifiers were used if the NHS Number was missing.
Data quality and completeness of Maternity HES were assessed in relation to birth registration data wherever possible. For information not collected at registration, NN4B data were used to validate the quality of Maternity HES.
Overall, 93 per cent of Maternity HES delivery records could be linked to the birth registration/NHS Numbers for Babies records and 84 per cent of Maternity HES baby records were linked to these.
Two per cent of Maternity HES records had the mother’s NHS number missing compared with 22 per cent in the NN4B dataset. This did not reflect the extent to which other Maternity HES data items were missing or inconsistent between the two data sets.
Nearly a third of all linked Maternity HES records for singleton babies had one or more of the following data items missing: birth weight, gestational age, birth status, sex and date of birth of the baby. On the other hand for data items where information was stated, such as birth weight, birth status and sex for singleton babies, there was good agreement between Maternity HES and linked birth registration and NN4B data.
Although NN4B records the ethnic category of the baby, as defined by the mother, and Maternity HES records the mother’s ethnic category, 75 per cent of the linked records had the same ethnic group recorded for the mother and her baby.
The linkage rate for 2007 was slightly higher than for the two previous years, but data were more incomplete. To gain maximum benefit from this linkage, improvements are urgently needed in the quality and completeness of the data contained in Maternity HES.
The data recorded at birth registration are mainly socio-demographic, such as names, address of the mother’s and father’s usual place of residence, place of birth, occupations of the parents and dates of birth of the mother and baby (Office for National Statistics publication, DH3). As a result, some key items needed for demographic and clinical purposes are not available at a national level. The opportunity to obtain gestational age and ethnicity data nationally resulted from the introduction of the NHS Numbers for Babies (NN4B) Service in 2002. This service collects a small dataset containing key items that are not recorded at birth registration. Information on gestational age at birth is of key importance, as babies born preterm, before 37 completed weeks of gestation, are at particularly high risk of morbidity and mortality in the early years of life (Brocklehurst P, 1999; ISD Scotland report 2004; Confidential Enquiry into Maternal and Child Health, 2004).
Clinical information on maternity care at delivery could be obtained only from the Maternity Hospital Episode Statistics (HES) dataset for births that occurred in England and from the Community Child Health database (CHD) and Patient Episode Database for Wales (PEDW) for births that occurred in Wales.
Therefore a collaborative project was set up in 2004 between City University London, the Office for National Statistics (ONS) and the then Welsh Assembly Government to link these datasets for all births that occurred in England and Wales from 2005 to 2007. Stage 1 of the project involved linkage of birth registration data with NN4B dataset and assessment of data quality and completeness of the NN4B data. This is reported elsewhere (Hilder et al., 2007; Moser K and Hilder L, 2008).
Stage 2 of the project involved linkage of the linked dataset for the years 2005 and 2006, created in stage 1, to Maternity HES and assessment of data quality and completeness by comparison with birth registration or NN4B, where possible. At the time, 2007 birth registration-NN4B linked data were not available. These data were therefore linked to Maternity HES and corresponding Welsh records at a later date, using the experience gained in linking the first two years’ data. An earlier article describes the method used for linkage to Maternity HES records in detail (Dattani et al., 2011). This article reports on the quality and completeness of the 2007 linked data. The Welsh linkage for all three years, 2005–07, will be reported separately.
Linkage of data for further years and access to the 2005–07 linked data for other projects will involve seeking ethics approval, obtaining permission from the National Information Governance Board to access individual patient identifiable records, and securing new funding.
Several data items are common to all three data sources (Maternity HES, birth registration and NHS Numbers for Babies) as shown in Box 1. In addition, some data items are unique to each data source, and linkage enables new analyses using these linked data. For example, it is now possible to analyse caesarean section rates by the father’s socio-economic classification, compare time of birth with birth outcomes, and report on the outcome of birth by onset of labour, gestational age, time of day and day of the week. Now that the linkage has been completed and checked, the next stage of the project will be to undertake some of these analyses.
Box 1 Availability of selected data items from birth registration, NN4B and Maternity HES
Details of the source data: birth registration, NHS Numbers for Babies and Maternity Hospital Episode Statistics (HES) can be found in the earlier article in Health Statistics Quarterly 49 describing the linkage of data for 2005 and 2006 (Dattani et al., 2011).
Record linkage was carried out by Northgate Solutions, which processes HES records under contract with the NHS Information Centre. The linkage algorithm previously compiled for 2005 and 2006 data was used, but the program was slightly amended to ensure that only one HES record was linked to each registration-NN4B linked record (Dattani et al., 2011).
The linked data provided to ONS by Northgate Solutions consisted of two files. The first contained the previously linked registration and NN4B records linked to the mother’s record in HES, which also included the baby 'tails'. The second was based on linkage of the registration and NN4B linked records to baby records in HES. These were accessed by researchers from City University London in the secure environment of the Virtual Microdata Laboratory (VML) facilities at ONS. Outputs of analyses undertaken in the VML were released by ONS in the form of disclosure controlled tables.
The review of the quality of Maternity HES was focussed on the completeness and consistency of the HES data, in relation to birth registration data where possible. All babies born in England and Wales have to be registered, and information collected at registration is subject to quality checks (Office for National Statistics, series DH3). However, where information was not available from registration, NN4B data were used to validate the quality of Maternity HES. The quality of the NN4B data in comparison to birth registration data is reported elsewhere (Moser K and Hilder L, 2008). The completeness of the main data items in all three sources was measured by identifying the extent to which data were missing.
The linked data for the mother’s file were split into singleton and multiple births, using the multiple birth status field from registration, to facilitate the assessment of data quality. In some instances the results are reported separately.
Data analyses were carried out using SAS version 9 and SPSS version 16 software products.
The Maternity HES record is a mother-based record containing the mother’s details in the core record. A maternity 'tail' and a baby 'tail', which can accommodate up to nine babies born in one maternity, are appended to the core record. In contrast, the registration and NN4B linked data consists of one record per baby. Therefore, the linkage was based on baby to mother records.
Northgate Solutions returned 630,409 records that had linked to the registration and NN4B linked data. These included some multiple records for the same mother for each episode. Records with the most complete information were selected to ensure one-to-one linkage to the registration and NN4B linked dataset. This gave a file of 615,239 records.
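The selection of the most complete record among duplicates can be illustrated with a short sketch. It is only an illustration under stated assumptions: the source does not say which fields were used to judge completeness, so the field list and the identifier `reg_id` below are hypothetical, as is the rule that ties go to the record seen first.

```python
# Hypothetical list of fields used to score completeness; the article does not specify them.
FIELDS = ("birth_weight", "gestational_age", "birth_status", "sex", "baby_dob")


def completeness(record: dict) -> int:
    """Count how many of the chosen fields are filled in."""
    return sum(record.get(field) not in (None, "") for field in FIELDS)


def keep_most_complete(records: list[dict], key: str = "reg_id") -> list[dict]:
    """Keep one record per key value: the one with the most fields filled in.

    Ties are resolved in favour of the record encountered first (an assumption).
    """
    best: dict = {}
    for rec in records:
        current = best.get(rec[key])
        if current is None or completeness(rec) > completeness(current):
            best[rec[key]] = rec
    return list(best.values())


candidates = [
    {"reg_id": "A", "birth_weight": 3200, "sex": None},
    {"reg_id": "A", "birth_weight": 3200, "sex": "F"},
    {"reg_id": "B", "birth_weight": None, "sex": "M"},
]
deduplicated = keep_most_complete(candidates)  # one record each for "A" and "B"
```

Scoring on a fixed list of fields keeps the rule transparent; a different field list or tie-break would select different records.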
In the registration and NN4B linked data file, there were 659,061 records for babies who were either born in England or resident in England. The resident in England category was used for births recorded as occurring at home in the registration and NN4B linked data.
Around 73 per cent of the linked registration and NN4B records were linked to Maternity HES records using the mother’s NHS number and her partial date of birth. A further 20 per cent of the linked registration and NN4B records were matched to Maternity HES using the mother’s postcode and full date of birth. Only 7 per cent of registration and NN4B linked records were not linked to HES. A total of 614,369 Maternity HES records were linked to the registration and NN4B linked records giving a linkage rate of 93.2 per cent.
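The two matching passes described above can be sketched as follows. This is a minimal illustration and not the algorithm used by Northgate Solutions: the field names (`reg_id`, `hes_id`, `nhs_number`, `mother_dob`, `postcode`), the ISO date format, and the reading of "partial date of birth" as year and month are all assumptions made for the example.

```python
def partial_dob(dob: str) -> str:
    """Reduce an ISO date string (YYYY-MM-DD) to year and month (YYYY-MM)."""
    return dob[:7]


def link_to_hes(reg_records: list[dict], hes_records: list[dict]) -> dict:
    """Return {reg_id: (hes_id, pass_name)} for every record that links.

    Assumes every record carries a mother's date of birth.
    """
    by_nhs: dict = {}
    by_postcode: dict = {}
    for hes in hes_records:
        if hes.get("nhs_number"):
            key = (hes["nhs_number"], partial_dob(hes["mother_dob"]))
            by_nhs.setdefault(key, hes["hes_id"])
        if hes.get("postcode"):
            by_postcode.setdefault((hes["postcode"], hes["mother_dob"]), hes["hes_id"])

    links = {}
    for reg in reg_records:
        hes_id, pass_name = None, None
        # Pass 1: NHS number plus partial date of birth.
        if reg.get("nhs_number"):
            hes_id = by_nhs.get((reg["nhs_number"], partial_dob(reg["mother_dob"])))
            pass_name = "nhs_number"
        # Pass 2: postcode plus full date of birth, tried only if pass 1 found nothing.
        if hes_id is None and reg.get("postcode"):
            hes_id = by_postcode.get((reg["postcode"], reg["mother_dob"]))
            pass_name = "postcode"
        if hes_id is not None:
            links[reg["reg_id"]] = (hes_id, pass_name)
    return links
```

Each registration and NN4B record receives at most one HES record, while several babies may link to the same mother record, which is consistent with baby-to-mother linkage.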
The linkage to the baby file was much more straightforward than to the mother file as it involved one to one linkage between baby records in registration and NN4B linked data, and in Maternity HES.
A total of 667,893 HES baby records were linked to registration and NN4B linked data by Northgate Solutions. This included multiple HES birth records for the same baby linked to a registration and NN4B linked record. Again, only records with the fullest information were kept and the others were deleted. After deletion, 552,398 records remained.
In the 2007 registration and NN4B linked data there were 659,061 records for babies who were either born in England or resident in England. Of these, 541,677 registration and NN4B linked records were linked to HES baby records using the NHS number, partial date of birth and sex, and 7,010 were linked using the baby’s date of birth, postcode and sex. Over 16 per cent of registration and NN4B linked records could not be linked to HES baby records. Overall 552,313 of the 659,061 records were linked, giving a linkage rate of 83.8 per cent.
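The linkage rates quoted for the mother and baby files follow directly from the reported counts. The short check below reproduces the arithmetic; the function name `linkage_rate` is only illustrative.

```python
def linkage_rate(linked: int, total: int) -> float:
    """Linked records as a percentage of all records, to one decimal place."""
    return round(100 * linked / total, 1)


REGISTRATION_NN4B_RECORDS = 659_061  # babies born or resident in England, 2007

mother_rate = linkage_rate(614_369, REGISTRATION_NN4B_RECORDS)  # 93.2
baby_rate = linkage_rate(552_313, REGISTRATION_NN4B_RECORDS)    # 83.8
unlinked_baby = round(100 - baby_rate, 1)                       # 16.2
```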
For HES, the extent to which data were missing or discordant was assessed only in the mother’s records, as these included information on the baby and the linkage rate was far better than for the baby records. For multiple births, information was recorded only for the first baby. Data on other babies were either missing or the same as for the first baby, suggesting there were problems in the linkage process in HES. Hence singleton and multiple births were analysed separately and only results for singletons are reported here.
The mother’s NHS number is recorded only on the NN4B record and not recorded at birth registration. For singleton births, 22 per cent of linked registration and NN4B records did not have the mother’s NHS number compared with 2 per cent in the Maternity HES records. In Maternity HES, birth weight and gestational age information was missing for 31 per cent and 47 per cent of singletons respectively. Information about live or still birth status and/or the baby’s date of birth and sex was missing in nearly a third of the records (Table 1).
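| | NN4B: number | NN4B: % | Birth registration: number | Birth registration: % | Maternity HES: number | Maternity HES: % |
|---|---|---|---|---|---|---|
| NHS number of mother | 131,202 | 22.0 | NA | NA | 11,546 | 1.9 |
| Date of birth of mother | 0 | 0.0 | 2,066 | 0.3 | 0 | 0.0 |
| Date of birth of baby | 0 | 0.0 | 0 | 0.0 | 187,931 | 31.6 |
| Sex of baby | 763 | 0.1 | 0 | 0.0 | 196,545 | 33.0 |

Out of 595,371 singletons.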
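The percentages in the table can be reproduced from the counts and the stated denominator of 595,371 singletons. The sketch below does this for the non-zero entries; the dictionary layout and the function name are illustrative choices, not part of the original analysis, which was carried out in SAS and SPSS.

```python
SINGLETONS = 595_371  # denominator given in the table footnote

# Counts of singleton records with the item missing, copied from the table.
missing = {
    ("NHS number of mother", "NN4B"): 131_202,
    ("NHS number of mother", "Maternity HES"): 11_546,
    ("Date of birth of mother", "Birth registration"): 2_066,
    ("Date of birth of baby", "Maternity HES"): 187_931,
    ("Sex of baby", "Maternity HES"): 196_545,
}


def percent_missing(count: int, total: int = SINGLETONS) -> float:
    """Missing records as a percentage of all singletons, to one decimal place."""
    return round(100 * count / total, 1)


percentages = {key: percent_missing(count) for key, count in missing.items()}
```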
Discordance in each of the common data fields in the linked records was assessed using information from birth registration rather than NN4B. Where data items were not recorded at birth registration, NN4B data were used.
There were 14,274 records identified as relating to multiple births in birth registration and Maternity HES. Multiple birth status was discordant between the two data sources in 3,205 records (Table 2).
Discordance in live or still birth status
For the records which had a stated live or still birth status in both data sources, one per cent of the records disagreed on birth status (Table 3). Around 33 per cent of linked Maternity HES records had no information on birth status.
| | Live birth (number) | Still birth (number) | Total (number) | % of all records |
|---|---|---|---|---|
| Still birth: ante-partum | 4,006 | 1,354 | 5,360 | 0.9 |
| Still birth: intra-partum | 6 | 154 | 160 | 0.0 |
| Still birth: indeterminate | 12 | 168 | 180 | 0.0 |
Discordance in baby’s sex
The sex of the baby recorded on birth registration for singleton births was compared with Maternity HES. Where the baby’s sex was recorded in both data sources, an agreement of 98 per cent was observed (Table 4). Sex was indeterminate in 763 NN4B records and in 368 Maternity HES records. In Maternity HES, sex was also coded to unspecified codes in 71 cases (as shown in the footnote to Table 4).
|Male||Female||Total||% of total|
Includes 368 cases with indeterminate sex, 66 cases coded to 4 and 5 cases coded to 5, 7 and 8.
Source: HES and registration
Discordance in birth weight
Where birth weight was recorded, there was good concordance between Maternity HES and birth registration. In Maternity HES, birth weight was missing in a third of the records, however, compared to only 1 per cent in birth registration (Table 5).
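Rows show birth weight (g) in Maternity HES; columns show birth weight (g) at birth registration.

| Maternity HES birth weight (g) | <500 | 500–999 | 1000–1499 | 1500–1999 | 2000–2499 | 2500–2999 | 3000–3499 | 3500–3999 | 4000–4499 | 4500–4999 | 5000–5499 | 5500 and over | Not stated | Total | % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5500 and over | 0 | 24 | 1 | 0 | 4 | 7 | 17 | 14 | 6 | 0 | 1 | 59 | 0 | 133 | 0.0 |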
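Concordance in birth weight is assessed here by 500 g groups. A minimal sketch of that comparison follows, assuming weights are held as whole grams with `None` for a missing value; the function names and the example pairs are hypothetical.

```python
def weight_band(grams):
    """Assign a birth weight in grams to the 500 g group used in the table."""
    if grams is None:
        return "Not stated"
    if grams < 500:
        return "<500"
    if grams >= 5500:
        return "5500 and over"
    lower = (grams // 500) * 500
    return f"{lower}-{lower + 499}"


def band_agreement(pairs):
    """Percentage of pairs in the same 500 g group, among pairs stated in both sources.

    Each pair is (Maternity HES weight, birth registration weight).
    Returns None when no pair has both weights stated.
    """
    stated = [(hes, reg) for hes, reg in pairs if hes is not None and reg is not None]
    if not stated:
        return None
    same = sum(weight_band(hes) == weight_band(reg) for hes, reg in stated)
    return 100 * same / len(stated)


example_pairs = [(3200, 3250), (2490, 2510), (None, 3000), (4000, 4100)]
agreement = band_agreement(example_pairs)  # two of the three stated pairs agree
```

Grouping makes small recording differences invisible unless they cross a boundary, as with 2,490 g and 2,510 g in the example.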
Discordance in gestational age
Information about gestational age for all births was available from the NN4B and Maternity HES. In nearly 90 per cent of the records where it was recorded in both sources, gestational age was the same (see Table A1 in the Appendix). On the other hand, in Maternity HES, almost half of all records had gestational age missing. Gestational age differed by one week in around 6 per cent of the records and two weeks or more in about 9 per cent of the records. There was wide variation in gestational age between the two data sources in the ‘tails’ of the distribution, for babies born before 22 weeks and over 42 weeks, but only 4 per cent of births occurred in these extremes of the gestational age distribution. The difference was 23 per cent for those born before 22 weeks. At 42 weeks, gestational age differed in about a fifth of all records. For records of births at 43 weeks or over, gestational age was missing in 43 per cent of Maternity HES records.
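The comparison of gestational age between the two sources amounts to classifying each linked record by the absolute difference in completed weeks. The sketch below shows one way to do this, assuming gestational age is held as whole weeks with `None` for a missing value; the category labels and function names are illustrative.

```python
from collections import Counter


def classify_difference(hes_weeks, nn4b_weeks):
    """Classify a linked record by the difference in recorded gestational age."""
    if hes_weeks is None or nn4b_weeks is None:
        return "not stated in one or both sources"
    difference = abs(hes_weeks - nn4b_weeks)
    if difference == 0:
        return "same"
    if difference == 1:
        return "one week"
    return "two weeks or more"


def summarise(pairs):
    """Count linked records in each difference category.

    Each pair is (Maternity HES weeks, NN4B weeks).
    """
    return Counter(classify_difference(hes, nn4b) for hes, nn4b in pairs)


example = [(40, 40), (39, 40), (None, 38), (41, 38), (40, 40)]
counts = summarise(example)
```

Percentages among records stated in both sources are then obtained by dividing each count by the total excluding the "not stated" category.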
Discordance in ethnicity
The baby’s ethnicity recorded in the NN4B record and the mother’s ethnicity recorded in Maternity HES were compared (see Table A2 in the Appendix). There was agreement in three-quarters of the records which had a stated ethnic category. Among all the linked records, 13 per cent of records had no ethnicity recorded in Maternity HES and in 9 per cent of records ethnic group was not stated in the NN4B data.
Three-quarters of the registration and NN4B records were linked to the HES mothers’ records using the NHS number and partial date of birth. This was not surprising, as the mother’s NHS number was missing from nearly a quarter of the registration and NN4B linked records, and also from a very small proportion of Maternity HES records. A further fifth of the registration and NN4B linked records were linked using the date of birth, or month and year of birth, and the postcode. There were concerns about using postcodes in the linkage algorithm, as the HES index used for linkage is derived using the mother’s current postcode of residence, whereas the postcode on the registration and NN4B linked data was recorded at the time of registration. The mother could have moved since having the baby, and this variable is also subject to recording and reporting errors. Despite this, an overall linkage rate of over 90 per cent was achieved. This could have been improved further if there had been a shorter delay before linkage was carried out, as HESID would have been less likely to have changed. Alternatively, HESID at birth could be retained as a separate field for linkage. There are, however, about 20 Trusts that fail to submit any maternity data to HES because they have a stand-alone maternity system that is not linked to the Patient Administration System. Hence it would be impossible to obtain a much higher linkage rate until all Trusts in England submit data to HES.
The linkage rate for registration and NN4B linked records to HES baby records was slightly lower than the linkage rate for the mothers’ records. This was not surprising, as a large proportion of baby 'tails' are known to be missing in Maternity HES (HES website 2010).
HES mother records include information about the baby. As the linkage rate for registration and NN4B linked data to HES mother records was higher than for the baby records, the quality of information in HES was assessed using the mothers’ records. There were however issues with multiple births in the HES mothers’ record, as already found in the 2005/06 data. Multiple birth status was also unknown in a fifth of the records. Further work is needed to assess the quality of data on multiple births for all three years of linked data before they could be used for any analyses.
Discrepancy in the recording of live/stillbirth status for singleton babies was found in 1 per cent of the linked records. This shows a deterioration compared with the data for the two previous years where it was 5 in 100,000 records in 2005 and 2 in 1,000 records in 2006. A third of the HES records for 2007 did not have any information on birth status, which is consistent with the 2005 and 2006 data.
Birth weight was missing in nearly a third of all linked Maternity HES records for singleton babies compared with only 0.2 per cent at birth registration. There was, however, good concordance between the two data sources where birth weight was stated, as the majority of the records were in the same 500g birth weight group. Missing birth weights are investigated by ONS by going back to registrars and also to child health departments. Therefore the quality of birth weight information on birth registration is better and more reliable than in Maternity HES.
Gestational age is not recorded at registration for live births but is available from the NN4B data. This records gestational age in weeks ‘calculated from relevant menstrual data held within the maternity system’ whereas Maternity HES specifies ‘time from the first day of the last menstrual period (LMP)’. Where this is not available an estimate is supposed to be recorded. However, it is likely the gestational age assessed by ultrasound is now used because second trimester scans are a routine part of antenatal assessment in the UK. A study of births at 27/28 weeks of gestational age in England, Wales and Northern Ireland between 1998 and 2000 showed that 79 per cent of the mothers had had an ultrasound before 20 weeks gestation, and 85 per cent had had their menstrual history recorded (Confidential Enquiry into Stillbirths and Deaths in Infancy report, 2001).
Gestational age distributions have been shown to differ according to the method used to assess gestational age. Studies have shown that if second trimester ultrasound is used rather than LMP, then the mean gestational age is one week lower, but recorded gestational age differed by one week in only 7 per cent of the linked records. Nearly half of the linked HES records had no information about gestational age, compared with only 1 per cent in the NN4B data. Sub-national analysis of the NN4B data for 2005–08 showed that the majority of Trusts had no or very few records with gestational age missing (Office for National Statistics publication, Quality of ethnicity and gestational age data for 2005–08). Where gestational age was stated in Maternity HES, it was in good agreement with NN4B in the majority of the records.
A past study using maternity HES data for 1990–91 showed that only 52 per cent of the deliveries were recorded on HES compared with the number of registered births and, within regions, the level of completeness varied from district to district (Middle C, Macfarlane A, 1995). There has been a vast improvement in the number of maternities recorded on HES since that time but the level of completeness still varies between NHS Trusts (NHS Information Centre, Maternity HES Statistics bulletin 2007–08).
The NN4B system records information about the ethnic category of the baby as defined by the mother, using the 2001 Census categories (Moser K, Stanfield KM, et al., 2008). On the Maternity HES record, the mother’s ethnicity is self-reported using the 2001 Census categories. It is unclear, however, whether the mother was involved in defining the ethnic category in either of these data sources or whether a health professional decided what to record without asking the mother. In practice it is likely to be a mixture of both. Although the ethnic group of the baby is requested in NN4B, it is not possible to know whose ethnic group was actually recorded, the mother’s or the baby’s.
A further consideration is that people’s identification with an ethnic group is not always straightforward. Individual responses, whether self-reported or not, may vary according to circumstances and over time.
Despite these limitations, in three-quarters of the linked records the mother’s ethnicity recorded was the same as that recorded for her baby. In 3 per cent of records, the mother’s ethnicity was categorised as ‘White British’ and the baby’s ethnicity was categorised as ‘White other’ or vice versa. This suggests that the father’s ethnicity may have been taken into consideration in recording the baby’s ethnic category on the NN4B data and this is more likely to have been defined by the mother. Although recording of ethnicity is better on NN4B than in Maternity HES, the level of completeness varies between Trusts, ranging from zero to 98 per cent (Office for National Statistics publication, Quality of ethnicity and gestational age data for 2005–08).
This study shows that it is possible to link the majority of the Maternity HES records routinely to registration and NN4B linked records, but linkage would be considerably more valuable if there were further improvements in the quality and completeness of Maternity HES. Information about method of delivery and complications in pregnancy can only be obtained at a national level from Maternity HES, so linkage would be needed to access this information together with the data obtained from birth registration and NN4B.
Birth registration and NN4B are more reliable sources of data than Maternity HES. Where data have been recorded in Maternity HES they are in good concordance with birth registration or NN4B, but there is a large proportion of linked records where information was not recorded in Maternity HES.
Brocklehurst P (1999) Infection and preterm delivery. British Medical Journal 318, 548–549.
Confidential Enquiry into Maternal and Child Health. (2004) Stillbirth, neonatal and postneonatal mortality 2000–02, England, Wales and Northern Ireland.
Confidential Enquiry into Stillbirths and Deaths in Infancy (2001) 8th Annual Report, Maternal and Child Health Research Consortium: London.
Dattani N, Datta-Nemdharry P, and Macfarlane A. (2011) Linking maternity data for England, 2005–06: methods and data quality. Health Statistics Quarterly 49. Available on the ONS website at: www.ons.gov.uk/ons/rel/hsq/health-statistics-quarterly/spring-2011/index.html
Hilder L, Moser K, Dattani N and Macfarlane A. (2007) Pilot linkage of NHS Numbers for Babies data with Birth registrations. Health Statistics Quarterly 33, 25–33.
ISD Scotland and Scottish Programme for Clinical Effectiveness in Reproductive Health (2004) Scottish Perinatal and Infant Mortality and Morbidity Report 2003, SPCERH Publication No 21, NHS Scotland: Edinburgh.
“Maternity data in HES” available on the Information Centre website at: www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=925
Moser K and Hilder L. (2008) Assessing quality of NHS Numbers for Babies data and providing gestational age statistics. Health Statistics Quarterly 37, 15–23.
Moser K, Stanfield K M and Leon D A. (2008) Birthweight and gestational age by ethnic group, England and Wales 2005: introducing new data on births. Health Statistics Quarterly 39, 22–31.
NHS Information Centre, Maternity data, 2007–08. Available at: www.hesonline.nhs.uk/Ease/servlet/ContentServer?siteID=1937&categoryID=1060
Office for National Statistics, Mortality Statistics: Childhood, infant and perinatal, England and Wales, 2007. Series DH3 No. 40. Available on the ONS website at: www.ons.gov.uk/ons/rel/vsob1/mortality-statistics--childhood--infant-and-perinatal--england-and-wales--series-dh3-/no--40--2007/index.html
Office for National Statistics, Quality of ethnicity and gestation data subnationally for births and infant deaths in England and Wales, 2005–08. Available on the ONS website at: www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-226528