The aim of this paper is to provide a description of the methodology used to estimate Short-Term International Migration (STIM) for England and Wales. This paper sets out the current methodology and highlights any methodological changes made to the series since it was introduced.
A quality and methodology information report is also published in which STIM estimates are assessed against the six dimensions of statistical quality defined by Eurostat.
STIM estimates were first published as Experimental Statistics in 2007 and were accredited as National Statistics in 2011. STIM statistics estimate visits made to England and Wales by non-UK residents for longer than a month but less than 12 months, on the basis of three definitions (as outlined in section 1.3). STIM statistics were developed as part of the Migration Statistics Improvement Programme (MSIP) and are based on International Passenger Survey (IPS) data.
Since 2007, the Office for National Statistics (ONS) has undertaken a programme of research to improve the estimates. This work has included:
improvements to the measurement of the quality of the estimates
improvements in the timeliness of the estimates
an investigation into the comparability of the estimates with counts taken from administrative data
production of stock estimates
publication of estimates at lower geographical level
A distinction is made between STIM estimates made at England and Wales level and those produced at local authority level. The latter were produced for the first time in October 2009 and are based on methods described in the short-term international immigration estimates for local authorities methodology. These estimates also form part of the Short-Term International Migration Annual Report (STIMAR) and have been published as National Statistics since May 2013, following an assessment by the UK Statistics Authority.
1.3. Definitions and estimates currently available
According to the United Nations (UN) definition, a short-term international migrant is defined as:
"A person who moves to a country other than that of his or her usual residence for a period of at least 3 months but less than a year except in cases where the movement to that country is for purposes of recreation, holiday, visits to friends and relatives, business, medical treatment or religious pilgrimage".
To meet user needs, estimates are currently produced based on three definitions:
moves made for between 3 and 12 months for employment or study (UN definition)
moves made for between 3 and 12 months for any reason
moves made for between 1 and 12 months for any reason
Moves made for between 1 and 12 months for employment or study can be found within the datasheet published on moves made for between 1 and 12 months. Estimates on the basis of all three definitions are available for mid-years (that is the period 1 July to 30 June) from the mid-year ending June 2004 onwards.
STIM estimates are also available as flows (the total number of moves made over a set period) and stocks (the number of short-term international migrants present at a given point in time).
Short-term migrants are interviewed by the IPS at the end of their visit (when a short-term emigrant returns to the UK and when a short-term immigrant leaves the UK). Flow estimates are aggregated based upon the date when the visit started:
inflow estimates refer to the number of non-UK residents (see note 1) estimated to have arrived in the UK to commence a visit to the UK in the relevant time period
outflow estimates refer to the number of England and Wales residents estimated to have left the UK to commence a visit abroad in the relevant time period
flow estimates refer to the number of short-term migrations commenced (migrant moves) as opposed to the number of people who commence short-term migrations (migrants)
This distinction is important when estimating STIM annually because a person could migrate more than once in the same period. For example, a single person migrating twice in a year for three months on each occasion would appear in the flow estimates as two (migrant moves), not as one (migrant). For more information about short-term migration stocks and average length of stay, please refer to section 2.7.
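The difference between counting moves and counting migrants can be illustrated with a minimal sketch. The records and the `person_id` field are hypothetical: the IPS interviews passengers at the end of each visit and does not link repeat visits by the same person, so the second count is shown only to make the distinction concrete.

```python
# Hypothetical records: one tuple per completed short-term move,
# as (person_id, months_stayed). person_id is an illustrative
# assumption and is not an IPS variable.
moves = [("p1", 3), ("p1", 3), ("p2", 6)]


def count_moves(records):
    """Number of migrant moves: every completed move counts once."""
    return len(records)


def count_migrants(records):
    """Number of distinct people who made at least one move."""
    return len({person_id for person_id, _ in records})


flow_moves = count_moves(moves)        # 3 moves
flow_migrants = count_migrants(moves)  # 2 distinct people
```

STIM flow estimates correspond to the first count: the person who migrated twice contributes two moves.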
1.4. Issues with measuring migration
The IPS, weighted by airline and other carrier information, is the primary source used to produce STIM estimates. While the IPS offers the best data currently available, it is not specifically designed to capture information solely on international migration. Therefore, the information does not specifically cater for the need to estimate short-term international migration.
In addition, because it is not possible to produce an accurate figure for the number of people who are in the country illegally, we do not produce estimates of the size of the illegal immigrant population. For more information on estimating the size of the illegal population, please refer to the following Home Office reports:
A later report, Economic impact on the London and UK economy of an earned regularisation of irregular migrants to the UK, written by the London School of Economics (LSE), also estimated that in 2007 the number of “irregular” migrants in the UK was 533,000.
Notes for: Introduction
- To be a resident of the UK, a person must have lived here continuously for 12 months or more.
2.1. Calculating short-term international immigration
STIM flows estimates are produced directly from the IPS (for more information about the IPS, please refer to section 2.3). Short-term migrant contacts sampled by the IPS need to be grossed to represent total estimates. This is done by using a complex weighting system. The method of grossing the interviews to national estimates varies depending on the method of travel. A detailed description of how the IPS raw data are grossed is available in Appendix C, Travel Trends 2014.
Short-term migrants are interviewed at the end of their stay. For short-term immigrants, who must be foreign residents to qualify, this means being interviewed when they leave the UK at the end of their stay. For short-term emigrants, who must be England and Wales residents to qualify, this means being interviewed when they arrive back in the UK. This means estimates are based on actual behaviour and not intended behaviour, as is the case with Long-Term International Migration (LTIM) estimates.
While there is no overlap between STIM and LTIM estimates, there are overlaps between STIM estimates, which count visits by ‘migrants’ of 1 to 12 months, and travel and tourism statistics, which count visits by ‘visitors’ of over 1 day but less than 12 months. Estimates of visitors are considerably higher than those of migrants. Although there is some overlapping sample between the two statistics, they examine different characteristics of the short-term visitors and migrants.
2.2. What about net migration?
Unlike for LTIM, there are no estimates of short-term international net migration flows. Short-term migrants do not stay for more than 12 months; therefore, they do not become ‘usually resident’. This means they are not included in population estimates, so calculating a net migration figure for short-term migrants does not have the value that it does for long-term international migration. In addition, short-term migrants coming to England and Wales are counted in as short-term migrants but not out as short-term migrants, because to be counted out as a short-term migrant, a person needs to have been resident in England and Wales for 12 months or more. Short-term international outflows from England and Wales are higher than short-term international inflows to England and Wales, so the net difference would be negative if it were possible to calculate.
The most appropriate estimates to use in order to estimate the impact of short-term international migration on the overall population are short-term international migration stocks. Stock estimates are more meaningful because they estimate the number of short-term migrants in the country on an average day in a 12-month period. For example, if four migrants each stayed in England and Wales for three months, this would be the equivalent of one person staying for one year, so the ‘stock’ count would be 1.
2.3. International Passenger Survey
The IPS is a sample survey of passengers arriving at, and departing from, UK air and sea ports and the Channel Tunnel. As well as STIM, data from the IPS are a main component in the production of LTIM and Travel and Tourism estimates.
The main IPS sample consists of between 700,000 and 800,000 interviews each year. In 2014, the IPS had an overall response rate of 79%. The IPS sample is stratified to ensure that it is representative by mode of travel, route and time of day. Interviews are conducted throughout the year. Map 1 shows each port at which IPS interviewing currently takes place.
2.4. Issues with the International Passenger Survey
The IPS has some limitations with respect to measuring migration (both long- and short-term). For example, the IPS does not survey many asylum seekers who may be entering or leaving the UK, as they are unlikely to enter the UK via the main departure and arrival terminals where IPS interviewing takes place. In addition, the IPS does not capture those who are crossing the land border between the UK (Northern Ireland) and the Republic of Ireland. Section 2.5 looks at these issues in more detail.
Finally, the IPS is a sample survey and therefore only a sample of migrants to and from the UK is interviewed. As a result, estimates from the IPS are subject to a degree of uncertainty. There is evidence to suggest that an inadequate sampling design and coverage of the IPS meant that a substantial amount of long-term migration, particularly of EU8 citizens, was missed during the years 2004 to 2008, prior to improvements made to the IPS from 2009 (as outlined in section 2.4). This inadequate coverage of some routes will also have caused some short-term migrants to be missed in the mid-years 2004 to 2008. However, due to a lack of comparative data sources, it is not possible to quantify the scale of the difference. For more information, please refer to the Quality of Long-Term International Migration estimates from 2001 to 2011 full report.
For further information about the IPS and international migration statistics, please refer to International Passenger Survey: Quality Information in Relation to Migration Flows.
In January 2009, changes were made to the sample design and data processing of the IPS following the Port Survey Review.
2.5. Potential groups of migrants missed from the IPS
As discussed in section 2.4, the IPS does not interview all asylum seekers entering or leaving the UK. Asylum seekers are not included in STIM estimates as no suitable adjustment can currently be made to include them. In order to produce LTIM estimates, we obtain data from the Home Office on principal applicant asylum seekers and their dependants. However, the STIM methodology does not attempt many of the adjustments made in the long-term estimates as they are considered too complex for the small number of migrants who would be affected.
Estimate of migration to and from Northern Ireland
The IPS does not sample those passengers who cross the land border between the UK (Northern Ireland) and the Republic of Ireland. In addition, no ports in Northern Ireland have historically been surveyed in the IPS, although Belfast International Airport has been included in the sample since 2009. No adjustments are made to England and Wales estimates to allow for this and therefore any short-term international migrants who move from Northern Ireland to England or Wales during their stay are missed from the estimates. However, this number is thought to be relatively low. Family doctor registration data are the most complete source that can be used to estimate international immigration to Northern Ireland. This source is used for LTIM but there is currently no way to count the short-term international migrants who go on to enter England or Wales using this source.
Further information about international migration statistics for Northern Ireland is available from the Northern Ireland Statistics and Research Agency.
2.6. Assumptions made in order to produce STIM
A small number of IPS respondents do not identify which of the constituent countries of the UK they lived in during their stay. For data up to and including the year ending mid-2008, relevant records without location of stay information were randomly allocated to either ‘England and Wales’ or ‘not England and Wales’ according to the proportions implied by all records where location of stay was collected. From the May 2013 release, which affects data for the year ending mid-2009 onwards, the way this imputation is applied has been improved. In the earlier method, a contact was selected to be in England and Wales or not. This did not take account of the weight given to each contact. Now the imputation applies the proportion to each contact, so that the proportion is maintained after weighting. This improves the representativeness of the sample, as demonstrated in Table 1.
Table 1: Demonstration of old and new methods of constituent country of stay imputation (uses dummy data)
|Contact|Weight|England and Wales / Rest of UK proportion|Old method|New method|
|---|---|---|---|---|
|Age group: 16 to 24|1000|90/10|Not randomly selected to be England and Wales, estimate = 0|90% in England and Wales, estimate = 900|
|Age group: 65 and over|100|90/10|Randomly selected to be England and Wales, estimate = 100|90% in England and Wales, estimate = 90|
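The two imputation approaches in Table 1 can be sketched as follows. This is an illustration using the dummy figures from the table, not the production code; the function names are hypothetical, and the random draw in the old method is an assumption about how "randomly allocated" might be implemented.

```python
import random


def old_method(weight, proportion_ew, rng):
    """Old approach: allocate the whole contact to England and Wales
    or to the rest of the UK with a single random draw, so the weight
    is counted either in full or not at all."""
    return weight if rng.random() < proportion_ew else 0.0


def new_method(weight, proportion_ew):
    """New approach: apply the proportion to the contact's weight, so
    the proportion is preserved after weighting."""
    return weight * proportion_ew


# Dummy contacts from Table 1: (label, weight).
contacts = [("16 to 24", 1000), ("65 and over", 100)]
proportion_ew = 0.9

new_total = sum(new_method(w, proportion_ew) for _, w in contacts)

rng = random.Random(0)  # seeded only so the sketch is repeatable
old_total = sum(old_method(w, proportion_ew, rng) for _, w in contacts)
```

With the new method the England and Wales total is always 90% of the combined weight of 1,100. With the old method the total depends on which contacts happen to be selected, and in the Table 1 scenario it is only 100.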
2.7. Estimates of short-term international migration stocks and average length of stay
Short-term international migrants are those who stay for less than a year and cannot be added into the usual resident population (who by definition have been resident for 12 months or more). For example, a migrant arriving on 1 July and leaving on 1 September cannot be added into the usual resident population, despite their two-month stay adding, at least temporarily, to the actual population. It would also be incorrect to simply use the number of short-term international migrants present in the country at the mid-year (30 June) as a predictor for the number of short-term migrants present throughout the year, due to the large seasonal effects observed in short-term migration patterns.
The method developed to produce STIM stock estimates measures the total amount of time, as recorded in the IPS, spent in England and Wales (the ‘in-stock’) or away from England and Wales (the ‘out-stock’) by all short-term international migrants between 1 July and 30 June the next year. In effect they measure the ‘long-term migrant equivalent’ (LTME), the same as one person migrating to/from England and Wales for one year. For example, if four migrants each stayed in England and Wales for three months, this would be the equivalent of one person for one year, and so the stock count would be 1. Likewise two migrants staying for six months would give the equivalent of one person staying for one year. In the second example the number of arrivals is half that of the first example, but results in the same stock estimate. Stocks give an average number of migrants in the country on an average day.
Figure 1 demonstrates how individuals contribute to the LTME. The estimate is made for the period between points (i) and (ii). Any days spent by short-term international migrants in the period contribute to the LTME. Individual B contributes days spent between (i) and the end of their stay (Y). Days before point (i) do not contribute to the estimate; likewise, any days spent after point (ii) do not contribute. The total days spent in the period by all short-term international migrants are aggregated and then divided by 365 to produce an estimate expressed in person years, equivalent to long-term migration stays.
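The LTME calculation described above can be sketched as follows. This is a simplified, unweighted illustration: the real estimates apply IPS weights to each contact, and the convention used here (the arrival day counts and the departure day does not) is an assumption made for the example.

```python
from datetime import date


def days_in_period(stay_start, stay_end, period_start, period_end):
    """Days of a stay that fall inside the reference period.

    Both the stay and the period are treated as half-open intervals
    [start, end): the end date itself is not counted (an assumption).
    """
    overlap_start = max(stay_start, period_start)
    overlap_end = min(stay_end, period_end)
    return max((overlap_end - overlap_start).days, 0)


def ltme(stays, period_start, period_end):
    """Long-term migrant equivalent: total days in the period / 365."""
    total_days = sum(
        days_in_period(start, end, period_start, period_end)
        for start, end in stays
    )
    return total_days / 365


# Reference period: 1 July 2013 up to (not including) 1 July 2014.
period = (date(2013, 7, 1), date(2014, 7, 1))

# A stay that began before point (i): only the days from 1 July count.
straddling_days = days_in_period(date(2013, 6, 1), date(2013, 8, 1), *period)

# Two consecutive stays that together cover the whole period.
stock = ltme(
    [(date(2013, 7, 1), date(2013, 12, 30)),
     (date(2013, 12, 30), date(2014, 7, 1))],
    *period,
)
```

Here the straddling stay contributes 31 days, and the two stays together give a stock of one person year.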
Similar LTME estimates could be produced by a large number of very short stays or a smaller number of longer stays. As a result, the average length of stay is also calculated to give users additional information about the stock estimate. Referring back to Figure 1, the true length of stay for individual B is between X and Y whereas the length of stay contributing to the LTME is from point (i) to Y. Only using stays contributing to the LTME, (i) to (ii), would produce underestimates of the average length of stay. Therefore, the method developed for average length of stay only includes individuals who began their stay in the period referred to by the LTME, regardless of when the short-term international migration was completed. In Figure 1 this means that only individuals C and D would contribute to the average length of stay, as individuals A and B began their stay before point (i). The whole of individual C’s stay contributes to the average. The average length of stay can be expressed in words as the sum of all days spent in the area by people who entered in the year of interest (in months), divided by the number of people who entered in the year of interest.
Mean (average) length of stay estimates are calculated by summing the number of nights in every stay commencing in the relevant time period and dividing by the number of stays commencing in that time period (the 'flow'). The resulting mean (average) is divided by 30 and expressed as the mean (average) length of stay in months. Please note that mean (average) length of stay estimates reflect the whole length of stay of all visits starting in the relevant time period. Unlike stock estimates, visits spanning more than one time period are not split. With regards to geography, immigrants who have spent time in England and Wales and in the rest of the UK are included in the LTME if they spent most of their time in England and Wales. The entire stay in the UK of such migrants is included in the calculation of the mean (average) length of stay, not just the time spent in England and Wales.
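As a minimal sketch of the mean length of stay calculation, the function below sums the nights of every stay that commenced in the period, divides by the flow, and converts to months by dividing by 30. The use of a survey weight for each stay is an assumption about how grossed estimates would enter the calculation; with all weights equal to 1 it reduces to the simple description above.

```python
def mean_length_of_stay_months(stays):
    """Mean length of stay in months.

    `stays` is a list of (nights, weight) pairs, one per stay that
    commenced in the reference period. The whole stay is used, even
    if it ended after the period. The weight is a hypothetical
    grossing weight; use 1 for an unweighted calculation.
    """
    flow = sum(weight for _, weight in stays)
    if flow == 0:
        raise ValueError("no stays commenced in the period")
    total_nights = sum(nights * weight for nights, weight in stays)
    return (total_nights / flow) / 30


# Two stays of 90 and 180 nights: mean of 135 nights, or 4.5 months.
example_mean = mean_length_of_stay_months([(90, 1), (180, 1)])
```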
The IPS is a sample survey and is, therefore, subject to some uncertainty, as discussed in section 2.4. Figures obtained from the IPS are subject to both sampling and non-sampling errors.
3.1. Sampling error
Sampling error arises due to the variability that occurs by chance because a sample, rather than an entire population, is surveyed; that is, sampling error results because not every migrant who enters or leaves the UK is interviewed. Sampling errors are determined both by the sample design and the sample size. Sampling error may sometimes present misleading changes as a result of the random selection of those included in the sample.
Confidence intervals (CI) are provided with IPS based estimates and are a statistical method by which sampling error can be measured. They provide a range within which the true value of an estimate is likely to fall. The confidence intervals used for the IPS are 95% confidence intervals; this means that the range is expected to contain the true value of the number of migrants around 95% of the time.
When estimates are broken down to lower levels of detail, greater care must be taken with their interpretation. This is because these estimates will be based on a smaller number of survey contacts, which increases the uncertainty around the estimate. For example, it is not possible to produce estimates for most individual citizenships or countries of last/next residence, within a single year, because of the small number of survey contacts that comprise each estimate.
Even where the sample size allows individual country estimates to be produced, it is often not possible to say whether a change in the estimate from one year to the next is real. This is because smaller estimates often have proportionately larger confidence intervals than larger estimates. However, in a few instances where the estimates are based on large enough sample sizes, we can be at least 95% certain that the change in the estimate represents a statistically significant change.
3.2. Sampling error for stock estimates
Because estimates of short-term international migration stocks are derived from IPS data rather than estimated directly from the IPS, it is not possible simply to use the IPS confidence interval. A method has been developed to calculate stock standard errors and confidence intervals. The 95% confidence interval for a stock estimate is the estimate plus or minus 1.96 x Standard Error.
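Given a stock estimate and its standard error, the 95% confidence interval can be sketched as below. The figures are illustrative only, and the method used to derive the stock standard error itself is not reproduced here.

```python
def confidence_interval_95(estimate, standard_error):
    """Return (lower, upper) bounds: estimate +/- 1.96 x standard error."""
    half_width = 1.96 * standard_error
    return estimate - half_width, estimate + half_width


# Illustrative figures: a stock of 100,000 with a standard error of 5,000
# gives an interval of roughly 90,200 to 109,800.
lower, upper = confidence_interval_95(100_000, 5_000)
```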
3.3. What does it mean if a change is statistically significant?
As outlined in section 2.3, the IPS interviews a sample of passengers passing through ports within the UK. As with all sample surveys, the estimates produced from these interviews are based upon one of a number of different samples that could have been drawn at that point in time, meaning that there is a degree of variability around the estimates produced. This variability may sometimes present misleading changes as a result of the random selection of those included in the sample. If a change or a difference between estimates is described as 'statistically significant', it means that statistical tests have been carried out to reject the possibility that the change has occurred by chance. Therefore statistically significant changes are very likely to reflect real changes in migration patterns.
3.4. How do you determine if a change is statistically significant?
A quick method of identifying if the difference between two estimates is statistically significant is to determine if there is an overlap of their confidence intervals (for more information about confidence intervals, please refer to section 3.1). If the confidence intervals do not overlap, then the differences can be described as statistically significant. For example, the increase between an estimate of 100,000 with a confidence interval of +/- 10,000 and an estimate of 150,000 with a confidence interval of +/- 15,000 would be statistically significant, because 100,000 plus 10,000 is still lower than 150,000 minus 15,000. However, if the confidence intervals do overlap, a t test should be performed to determine statistical significance.
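The quick overlap check can be written as a short function. This sketch uses the worked figures from the paragraph above; the function name is illustrative.

```python
def intervals_overlap(estimate_a, half_width_a, estimate_b, half_width_b):
    """True if the two confidence intervals share any values."""
    low_a, high_a = estimate_a - half_width_a, estimate_a + half_width_a
    low_b, high_b = estimate_b - half_width_b, estimate_b + half_width_b
    return low_a <= high_b and low_b <= high_a


# 100,000 +/- 10,000 against 150,000 +/- 15,000:
# the upper bound of 110,000 is below the lower bound of 135,000.
overlap = intervals_overlap(100_000, 10_000, 150_000, 15_000)  # False
```

A result of `False` means the difference can be described as statistically significant; a result of `True` means a t test is needed.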
3.5. What is a t test?
A t test ascertains whether the difference between two estimates is statistically significant, that is, whether the difference would be expected to occur 19 times out of 20 if the survey were repeated with different samples. The test divides the difference between the estimates by the square root of the sum of the squared standard errors. The resulting t value needs to be greater than 1.96 to be 95% certain that the estimates are different. The test can also be used to create a confidence interval around the difference, because it calculates the standard error of the difference directly from the two individual standard errors. All main statistical software packages have the functionality required to perform a t test. If you need assistance with identifying whether the difference between two international migration estimates is statistically significant then please contact firstname.lastname@example.org.
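The calculation described above can be sketched as follows, assuming the two estimates are independent. The standard errors in the example are illustrative values, not published figures.

```python
import math


def t_value(estimate_a, se_a, estimate_b, se_b):
    """Difference between the estimates divided by the standard error
    of the difference (square root of the sum of squared standard errors)."""
    se_difference = math.sqrt(se_a ** 2 + se_b ** 2)
    return (estimate_b - estimate_a) / se_difference


def is_significant(estimate_a, se_a, estimate_b, se_b, critical=1.96):
    """True if the absolute t value exceeds the 95% critical value."""
    return abs(t_value(estimate_a, se_a, estimate_b, se_b)) > critical


# Two estimates, each with a standard error of 10,000.
t_small = t_value(100_000, 10_000, 125_000, 10_000)  # about 1.77
t_large = t_value(100_000, 10_000, 130_000, 10_000)  # about 2.12
```

In this example a difference of 25,000 is not significant at the 95% level, while a difference of 30,000 is.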
3.6. Non-sampling error
Non-sampling error is all error that is not sampling error. The challenge with non-sampling error is that it is difficult to directly calculate a numerical measure of its effect. This, therefore, makes it hard to incorporate when analysing results. Non-sampling error is best understood by referring to examples that apply to the IPS.
The first non-sampling error may be due to non-response. Bias will occur when passengers who do not respond to the survey have different characteristics to those who do respond. Possible low levels of response that might be expected due to the respondent not speaking English have been reduced in recent years by the introduction of separate sampling arrangements in certain ports. This improvement is at least partly because interviewers can more easily enlist the help of relatives or interpreters to translate for contacts who do not speak English.
For those contacts identified by the IPS as migrants, the level of non-response is very low for most characteristics. Further information about IPS non-response is available in Appendix D of the report Travel Trends, 2014. In addition, the paper International Passenger Survey: Quality Information in Relation to Migration Flows provides an overview of the quality and reliability of the International Passenger Survey (IPS) in relation to producing estimates of long-term migration flows.
3.7. Validation of estimates
As well as ONS STIM estimates, Home Office data are available on the number of short-term entry clearance visas issued for less than one year. However, visa data provide only partial coverage of short-term migrants since these data normally relate to those non-European Economic Area (EEA) nationals who are subject to immigration control and who require a visa. EEA nationals do not normally require a visa to enter the UK (although a small number of EEA nationals do apply and are issued with visas). Visa data are not the only source of information on short-term residents; the 2011 Census also collected such information. ONS has published a report Examining the differences between the mid-year short-term immigration estimates for Local Authorities and the 2011 Census, which compares national- and local authority-level STIM estimates with 2011 Census data on short-term residents.
Since the first publication of STIM estimates in 2007, we have undertaken a programme of research to further develop the estimates. This work has included:
improvements to the measurement of the quality of the estimates
improvements in the timeliness of the estimates
an investigation into the comparability of the estimates with counts taken from administrative data
production of stock estimates
publication of estimates at lower geographical level
Most of these changes have been incorporated in section 2 as current methodology. For example, standard errors for the estimates have only been published since 2008. However, the changes made to timeliness are explained in this section because they represent an improvement to the method and not an addition to the output or analysis available.
4.1. Improving Timeliness
STIM estimates are based on the start date of each migration. However, because these data are collected by the IPS at the end of each migration, each mid-year STIM estimate is based on relevant records from three calendar years of IPS data. For example, the estimates for immigrations commencing in the mid-year 2014 are based on the following data:
2013 IPS data (migrants who arrived in the UK on or after 1 July 2013 and who were surveyed on departure before 1 January 2014 after staying in the UK for at least 1 month)
2014 IPS data (migrants who arrived in the UK on or after 1 July 2013 and who were surveyed on departure between January 2014 and December 2014 after staying in the UK for between 1 and 12 months)
2015 IPS data (migrants who arrived in the UK before 1 July 2014 and stayed up to 12 months and were surveyed on departure at any time up to the end of June 2015). Fewer than 1% of records fall into this category
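The way a record is assigned to a mid-year by its start date, while being drawn from the IPS calendar year in which the interview took place, can be sketched as follows. The function names are hypothetical and the sketch ignores the length-of-stay conditions in the list above.

```python
from datetime import date


def mid_year_ending(arrival):
    """Year in which the mid-year (1 July to 30 June) containing the
    arrival date ends. An arrival on 1 July 2013 falls in mid-year 2014."""
    return arrival.year + 1 if arrival.month >= 7 else arrival.year


def ips_data_year(departure):
    """Calendar year of IPS data holding the record: immigrants are
    interviewed on departure, so this is the year of the departure date."""
    return departure.year


# One illustrative record from each of the three IPS years that feed
# the mid-year 2014 estimate: (arrival date, departure date).
records = [
    (date(2013, 7, 15), date(2013, 10, 1)),
    (date(2013, 11, 1), date(2014, 5, 1)),
    (date(2014, 6, 20), date(2015, 2, 1)),
]
summary = [(mid_year_ending(a), ips_data_year(d)) for a, d in records]
# summary == [(2014, 2013), (2014, 2014), (2014, 2015)]
```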
Provisional IPS data are supplied each quarter. The quarterly provisional data are superseded by final data, which are issued annually and cover a calendar year. At the time that the STIM estimates are produced, the estimates for the most recent year (in the example above, the estimates for migrations commencing in mid-2014) include some data (the data collected in the period January 2015 to June 2015) that are provisional. These estimates are revised in the following year's publication when the final data are available. The final data differ from the provisional data only in that there are small adjustments to the weights assigned to each interview. The weight adjustments are small and apply to less than 1% of the migrants in the relevant period, so any revisions are extremely small.
This change was introduced in 2009 for the mid-2007 England and Wales STIM estimates. By using provisional IPS data, estimates were published earlier than had previously been possible. A full assessment of the impact of the use of provisional IPS data showed that for the mid-2007 estimates the largest difference between using a full set of final IPS data and provisional IPS data for the final two quarters was 0.25%. For more information about the differences between provisional and final STIM estimates, please refer to the STIM Frequently Asked Questions. Since the publication of the mid-2008 England and Wales estimates, the previous year’s estimates have been updated using the full year’s final IPS data. Provisional data are marked ‘p’ in the estimates.
Since provisional estimates were first published, STIM estimates have been released earlier than was previously possible. However, publication is still 23 months after the reference point of the estimates. By comparison, provisional LTIM estimates are published five months after the reference point and mid-year population estimates are published 12 months after the reference point. Both of these products are published by ONS as National Statistics.
4.2. Improvements from changes to processing systems
In early 2013, when processing began for the mid-2011 STIM estimates, ONS had the opportunity to use different software to calculate STIM estimates. In moving the processing from the old software to the new software, we spotted several opportunities to improve the way STIM estimates are calculated. The impacts of these changes were minimal in most instances:
In data produced prior to 2013, where a stay spanned two mid-years, the part of the stay in the second mid-year was sometimes one day too long. This was corrected for the 2013 release. The new, correct, logic results in very slightly lower stock and mean length of stay estimates than had previously been published.
In the previous methodology, citizens of an overseas territory of a country were not included in the count of citizens of the country. For example, citizens of the Canary Islands had previously not been counted as citizens of Spain. This was corrected for mid-2009 STIM estimates and onwards. This also brings STIM citizenship definitions in line with those used in LTIM estimates.