The WAS draws its sample from the population of private households in Great Britain.
The first wave of the survey commenced in July 2006 and lasted for two years, ending in June 2008. This comprised 30,595 responding households.
The second wave of the survey commenced in July 2008 and ran until the end of June 2010. This comprised 20,170 responding households.
The third wave of the survey commenced in July 2010 and ran until the end of June 2012. This comprised 21,541 responding households. It returned to responding households from wave 2 who gave their permission to be re-interviewed. Households that were eligible at wave 2 but could not be contacted were approached again at wave 3. In addition, a new cohort was introduced at wave 3 (12,000 issued addresses) with the aim of maintaining an achieved sample size of around 20,000 responding households.
Data were collected in the field by Computer Assisted Personal Interviewing (CAPI).
The WAS questionnaire is divided into two parts: a household questionnaire completed by one person in each household, and an individual questionnaire addressed to all adults aged 16 and over (excluding those aged 16 to 18 currently in full-time education or those aged 19 and in a government training scheme).
The longitudinal editing introduced with wave 2 data (using information gathered at wave 1 to validate wave 2 data, but also looking at the wave 1 data in the light of the data given at wave 2) has again been applied at wave 3. However, longitudinal editing is only carried out between waves 2 and 3 – the wave 1 data have not been re-edited.
In any sample survey there will always be missing values for individual questions. However, when constructing estimates of wealth it is necessary that valid responses have been given for all component estimates. Therefore, any missing values are imputed. The imputation methodology has been further refined from that used at wave 2 – details of which are given in Chapter 7: Technical details.
This chapter aims to assist readers in interpreting and utilising estimates from the Wealth and Assets Survey (WAS) by describing technical aspects of the survey. Much of the technical material regarding the survey has already been reported in Chapter 10 of the report ‘Wealth in Great Britain’ published in December 2009 and in the wave 1 User Guide. Readers should consult these documents for more general technical detail.
The WAS is a longitudinal survey of private households and individuals in Great Britain (excluding the Isles of Scilly and Scotland north of the Caledonian Canal). The survey is conducted using face-to-face interviews, administered by ONS interviewers. The first wave of interviews was carried out between July 2006 and June 2008; the second from July 2008 until June 2010; and the third between July 2010 and June 2012. The results reported in this release describe the level of wealth in Great Britain in 2010/12, as well as how the level and distribution of wealth in Great Britain have changed since wave 1 of the survey.
Details of the sampling design, sampling frame, sample structure and field sampling procedures underlying wave 1 of the survey are provided in the wave 1 report. Responding households, as well as non-contacts and ‘soft’ refusals, were included in the sample for the next wave. Any ‘hard’ refusals were not approached again in subsequent waves.
The WAS aims to follow individuals rather than households. In the case that a household splits, with individuals living at different addresses, WAS will interview all of the original sample members (OSMs); as well as any people living with the OSMs in the next wave of the survey. The new people in the sample are referred to as secondary sample members (SSM). OSMs remain eligible for interview until they leave Great Britain, enter an institution (such as a nursing home), or die. SSMs are eligible for interview as long as they live at the same address as an OSM. At waves 2 and 3, interviews were sought from those who had been interviewed previously and those who were previously ineligible (i.e. those aged 16 or under or 16-18 and in full time education or those aged 19 and in a government training scheme) and had become eligible at the follow up wave.
The original sample approached in wave 1 was approximately 63,000 households. However, given refusals to the survey, changes in eligibility and so on, the number of households with whom contact was attempted in wave 2 was approximately 35,000. Of the 35,000 addresses attempted for wave 2, 25,000 addresses were attempted for wave 3. Given the declining sample of eligible addresses over the life of WAS, it was decided to introduce a new panel of respondents to the survey in wave 3. A sample of 12,000 new addresses was issued to supplement the existing panel. The approach to selecting these new addresses was the same as for wave 1 of the survey.
WAS interviews take place two years after the previous wave, and generally within the same calendar month. Interviewers were given an allocation of addresses on a monthly basis and were instructed to make contact and gain an interview at all of these addresses using best practice in terms of varying calling times and days. Where it was not possible to attempt contact within the month, addresses were carried forward for reissue in the following month. Where information was unlikely to have changed, or earlier responses were likely to provide a useful aide memoire, answers from the previous wave were rolled forward and made available to the interviewer in the computer assisted interviewing programme during the interview. For instance, the type of tenure of the household’s accommodation from wave 1 would be available to the interviewer at wave 2. However, value information, such as the value of the property, was not rolled forward.
The wave 2 questionnaire covered the same topics as wave 1; however, as a result of the longitudinal nature of the survey and specifically the experience gained during wave 1, it was slightly longer. The flow of questions was also improved, the types and nomenclature of some assets and debts were changed, and certain new requirements of stakeholders were included. The content of the wave 3 questionnaire was broadly comparable with wave 2. Improvements were made to the conditional routing of some questions, but generally questions were unchanged so as to preserve consistency in data collection over time. Questionnaire changes made between waves were tested both cognitively and via a quantitative pilot. This ensured the new questions were both likely to be understood by respondents and suitable for collecting the information we wanted. The mean interview length varied for each wave of the survey: 79 minutes at wave 1, 85 minutes at wave 2 and 82 minutes at wave 3.
Table 7.1 shows response for completed waves of WAS. An initial sample of 62,800 addresses was selected and sampled at wave 1. Of these, 30,500 took part in the survey, or 55% of the eligible sample. Approximately 10% of sampled addresses were found to be ineligible (e.g. non-residential addresses) and were therefore not interviewed. For wave 2, the cooperating wave 1 households, along with non-contacts and circumstantial refusals from wave 1, were issued for a wave 2 follow-up interview. The eligible sample for wave 2 of the survey was nearly 29,600 households and of these 20,170 either fully or partially responded, giving a household response rate of 68%. This figure is not comparable with the household response rate of 55% achieved in wave 1 since the wave 2 figure is calculated as a proportion of the sample brought forward from wave 1. As a proportion of the original wave 1 sample, the response rate is 36%, which illustrates both the scale of non-response at wave 1 and subsequent attrition between waves 1 and 2.
Table 7.1: Response for completed waves of WAS

                         Wave 1          Wave 2          Wave 3
                         n        %      n        %      n        %
Total eligible sample    55,835   100    29,341   100    32,659   100
Refusal to office        3,805    7      1,262    4      1,692    5
Refusal to interviewer   15,397   28     4,500    15     6,233    19
Other non-response       1,770    3      1,101    4      1,152    4
Thus, of the eligible households in wave 2, an interview was achieved with over two-thirds while no interview took place with just under one-third. The non-contact rate at wave 2 (9%) was slightly above that observed at wave 1 (7%). However, the refusal rate was considerably higher in wave 1 than in wave 2, in part because hard refusals from wave 1 were not followed up for wave 2. For wave 3, cooperating households, non-contacts and circumstantial refusals from wave 2 were followed up. In addition, a new panel of households was selected for wave 3 in order to achieve a target of at least 20,000 household interviews. These new panel cases are included in the total figures for wave 3 in Table 7.1. The wave 3 response rate was 64%; 51% for the new cohort and 72% for the old cohort.
The cross-sectional results presented in the Wealth in Great Britain report are based on all those households which responded in the particular wave in question, while the extent of longitudinal analysis undertaken is clarified in each case.
Cross-sectional editing and validation processes for waves 2 and 3 were similar to those used for wave 1; more details are provided in section 10.4 of the wave 1 report. However, collecting data from the same households over time provides an opportunity to conduct longitudinal edit checks. For example, the recorded property value might be similar in waves 1 and 3 but a very different figure in wave 2, perhaps due to a data entry error. In such circumstances, the wave 2 property value has been retrospectively edited to be more consistent with the values recorded in waves 1 and 3. Generally, only values in waves 2 and 3 were edited; however, there were a small number of edits made to wave 1 data. The latest version of all three waves of data will be disseminated following this report.
Before any longitudinal checks could be carried out on the data, the longitudinally-linked records were checked for accuracy. Adding new household members to households that responded in the previous wave, following Original Sample Members (OSMs) who left a household to be interviewed at their new address, and tracking whole households that moved between waves all added complications to the linking exercise, and these cases deserved particular attention when the linkage checks were carried out. Furthermore, recorded gender and date of birth that differed from the data collected in the previous wave were checked to ensure that sample members were linked accurately. To account for changes of circumstances within households that may affect observed wealth, indicator variables were produced to highlight circumstances such as a change of the Household Reference Person (HRP), additional household members, split households, and movers between waves. Through this process, changes between waves were observed that required further investigation. Thorough checking highlighted that the large majority of observed changes were genuine and could be explained through changes of circumstances for some or all individuals in the household, or showed no evidence to indicate that the collected data were incorrect. However, these longitudinal checks also identified inconsistencies in the longitudinal data which were explained by errors occurring during the interview. These errors were amended where it was possible to establish the correct values.
Outliers exist in the WAS data; they reflect its highly skewed nature. All outliers were checked for supporting evidence from interviewers and, where appropriate, edits were made to ‘correct’ them. In many cases, interviewer notes supported the validity of outliers and these remain in the WAS datasets. Given the skewed nature of wealth data, and the impact that outliers can have on parametric estimates, Wealth in Great Britain 2010/12 does not report any mean values. Means, particularly when exploring change across waves, can suggest spurious change when extreme outliers are included. For this reason, all wealth estimates in Wealth in Great Britain 2010/12 are reported using medians and/or deciles.
As with all social surveys, the Wealth and Assets Survey (WAS) data contain missing values. Typically, missing values are associated with non-response. Non-response can occur at household level, person level, and item level. The WAS imputation strategy was concerned primarily with item non-response. Item non-response occurs when a respondent does not know or refuses to answer a particular survey question. This can impact on estimates derived from WAS data in two ways:
the missing data can lead to a reduction in the precision of the estimates
if the characteristics of the non-respondents differ from the respondents the estimates may be biased
The general aim of the WAS imputation strategy was to counter these risks by estimating accurately the statistical properties of the missing data. To meet this aim, missing values in the WAS data were imputed using Nearest-Neighbour/Minimum-Change methodology implemented in CANCEIS. CANCEIS is a widely recognised software platform containing a range of integrated imputation techniques (Bankier, Lachance & Poirier, 1999; CANCEIS, 2009). The CANCEIS imputation algorithm employs a donor-based strategy designed to identify and replace missing values with observed values drawn from another record. The donor is selected from a small pool of potential donors with characteristics similar to those of the record currently being imputed. Similarity is measured by the sum of statistical distances between record and donor across a set of key demographic and other matching variables (MVs). The distance for each individual MV is weighted according to how well it might serve in predicting a valid and plausible range of imputable values in relation to the characteristics of the record currently being imputed. The MVs and associated weights for each WAS variable were identified through statistical modelling and expert review.
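As an illustration only (CANCEIS's actual interface, distance functions and minimum-change logic differ), the weighted nearest-neighbour donor search described above can be sketched as follows; all names and the simple distance measures are assumptions for this sketch:

```python
# Hedged sketch of donor-based nearest-neighbour imputation in the style
# described above. Records are dicts; `mv_weights` maps matching-variable
# (MV) names to importance weights. Not CANCEIS's real API.

def distance(recipient, donor, mv_weights):
    """Weighted sum of per-variable distances over the matching variables."""
    total = 0.0
    for var, weight in mv_weights.items():
        a, b = recipient.get(var), donor.get(var)
        if a is None or b is None:
            continue  # cannot compare on a missing matching variable
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            total += weight * abs(a - b)          # numeric MV: absolute gap
        else:
            total += weight * (0.0 if a == b else 1.0)  # categorical MV
    return total

def impute(recipient, donors, target, mv_weights):
    """Copy `target` from the nearest donor that has an observed value."""
    pool = [d for d in donors if d.get(target) is not None]
    best = min(pool, key=lambda d: distance(recipient, d, mv_weights))
    patched = dict(recipient)
    patched[target] = best[target]
    return patched
```

In this sketch the MV weights play the role described in the text: a heavily weighted MV (here, `region`) dominates donor selection even when a lightly weighted numeric MV disagrees.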
The general methodological WAS imputation strategy has several advantages:
as a non-parametric approach, it avoids the distributional assumptions associated with other methods, facilitating preservation of important properties of the data such as skew and discrete steps in observed distribution functions
the donor pool also serves as an implicit distributional model of the plausible range of values for each individual imputable record rendering the probability of selecting a particular value proportional to that distribution
These advantages serve to improve precision and reduce bias in point and variance estimates based on the WAS data, contributing to the accuracy of published statistical outputs (Durrent, 2005).
While the general methodological WAS imputation strategy serves to improve the accuracy of estimates based on WAS data, tuning this strategy to the analytical aims of the survey further improves performance. As a panel survey in its third wave, the overarching analytical aims of the survey are fourfold. To provide:
revised cross-sectional estimates based on the wave 2 data
cross-sectional estimates based on the wave 3 data
longitudinal estimates of change over time between waves 2 and 3
longitudinal estimates of change over time for entire survey duration – here waves 1 to 3
To facilitate these aims, the imputation strategy was divided into three imputation groups (iGroups). Figure 7.1 outlines the fundamental structure of a variable’s data within iGroup.
A more specific analytical aim of the WAS survey is to provide estimates of wealth across five key topic areas: Property, Physical, Pensions, Financial, and Income. To facilitate this aim, the imputation strategy was also aligned with the routing structure of the WAS questionnaire. Although there are many variations in wording and focus, Figure 7.2 represents a schematic overview of the typical structure underlying a question group designed to elicit information about a particular facet of one of the topic areas.
In the initial design of the WAS imputation strategy, Wave 2 cross-sectional imputation was not anticipated, as there was no new Wave 3 data on which to condition a revision of previously imputed Wave 2 data. However, due to general improvements in the Wave 3 strategy, such as tighter controls over the routing architecture and donor selection procedures, typically between 1% and 5% of the previously imputed Wave 2 data was reset to missing. This served to promote consistency in the imputed data between waves. In general, the processing strategy for the Wave 2 and Wave 3 cross-sectional data was the same. For every record in the data, entry and mid-level routing variables were imputed sequentially, employing the donor-based strategy outlined in the General Aims and Methodology Section. At any point in the routing where an imputed value indicated that subsequent variables did not require a response, an NCR indicator was rolled forward through to the end of the question group. This maintained the integrity of the routing, excluded records from further processing, and ensured, ultimately, that amounts related to the variable in question were only imputed for an appropriately sized sub-population.
Two additional constraints were imposed on the imputation of amounts at the end of the question group. If a banded estimate was observed, an imputed amount had to fall within that band. Extreme outliers were also excluded from the donor pool. Extreme outliers were generally defined as values greater than two times the threshold of the highest band for the variable in question and typically represented between 0.5% and 2.5% of observed values. These additional constraints ensured that imputed values were selected based on all available information and that estimates based on the WAS data were not inappropriately biased by a few relatively unique observations.
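The two constraints above can be sketched as a simple donor-pool filter; the function name, argument shapes and example band edges are illustrative assumptions, not the production implementation:

```python
# Illustrative sketch of the two donor-pool constraints described above:
# (1) an imputed amount must fall inside a respondent's banded estimate;
# (2) extreme outliers (here, > 2x the top band threshold) are excluded.

def constrain_pool(donor_values, band=None, top_band_threshold=None,
                   multiplier=2.0):
    """Filter candidate donor amounts by the outlier and band rules."""
    pool = list(donor_values)
    if top_band_threshold is not None:
        cutoff = multiplier * top_band_threshold
        pool = [v for v in pool if v <= cutoff]    # drop extreme outliers
    if band is not None:
        lo, hi = band
        pool = [v for v in pool if lo <= v <= hi]  # respect the banded estimate
    return pool
```

For example, with a banded estimate of £100–£200 and a top band threshold of £1,000, a £5,000 donor value is excluded as an outlier and only donor values inside the band survive.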
To facilitate the general aim of improving precision and reducing bias in point and variance estimates based on the WAS data, the pool of potential donors was maximised by including all observed data associated with a particular wave. For example, when imputing missing values in the cross-sectional imputation group for Wave 3, the potential donor pool included valid observations from both the cross sectional and longitudinal Wave 3 imputation groups. Although revision of the WAS Wave 1 data was out of scope, to help maintain continuity across all three waves of the WAS data, observed Wave 1 data was also included in the set of weighted MVs.
Fundamentally, imputation of the Wave 2 and Wave 3 longitudinal data followed the same sequential processing and roll-forward NCR strategy as the cross-sectional data. However, there were several distinct differences, reflecting the change of emphasis in the primary analytical aims of the WAS survey from the provision of cross-sectional estimates to the provision of longitudinal estimates of change over time. To facilitate the longitudinal aims of the survey, the Wave 2 and Wave 3 data were imputed simultaneously. Prior to imputation, in cases where new observed information was available in Wave 3, data previously imputed in Wave 2 was reset to missing, as the new Wave 3 information would be used to revise the imputed data with improved precision. While imputing entry and mid-level routing variables, in cases where Wave 2 was missing and Wave 3 was observed, or vice versa, the observed variable was included in the MV set in addition to any available Wave 1 data and the rest of the MV set. Typically, this variable was given a much higher weight than other MVs, ensuring that donor selection was constrained more by observed longitudinal data than by other MVs. If routing variables were missing in both waves, imputed values were drawn from a single donor, as this implicitly maintains appropriate longitudinal relationships in the data. When imputing Wave 2 and Wave 3 amounts, both waves were constrained by the same principles as in the cross-sectional strategy: values had to be imputed within observed banded estimates, and extreme values were excluded from the donor pool. For the longitudinal data, control over extreme outliers had to be extended. In cases where an extreme value was observed in one wave but missing in the other, the record was excluded and edited manually through expert review.
As the number of potential donors would always be extremely limited in this situation, this strategy removed any possibility of imputing an unjustified financial collapse or windfall for an individual respondent in the domain currently being imputed.
End to end, more than 1,000 variables were treated through the WAS imputation strategy. While the current report provides an overview of the primary design and reasoning behind the principal aspects of that design, it is important to note that almost every variable required some unique adjustment to the micro-parameters of the system. The following list outlines just a few of the most significant but necessary macro-variations and extensions to the base-line strategy:
In many of the WAS entry level routing questions respondents were asked if they were in receipt of one or more of a number of different but interdependent assets displayed on a show-card. To account for this interdependence and avoid the imputation of unreasonable relationships between asset types, multi-tick routing was imputed simultaneously as a binary string set. Potential donors were identified through a user defined distance matrix.
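A hedged sketch of the multi-tick idea: the show-card response is treated as one binary string and the gaps are filled from a single donor string, so interdependencies between asset indicators are preserved. A plain Hamming-style count over the observed positions stands in for the user-defined distance matrix mentioned above:

```python
# Illustrative multi-tick imputation: each position in the string is one
# asset indicator ('1' = held, '0' = not held, '?' = missing). Donor
# selection and the distance measure are simplified assumptions.

def impute_multitick(partial, donors):
    """Pick the donor string that best matches the observed positions of
    `partial`, then fill every '?' from that single donor."""
    observed = [(i, c) for i, c in enumerate(partial) if c != "?"]

    def dist(donor):
        # disagreement count over observed positions only
        return sum(donor[i] != c for i, c in observed)

    best = min(donors, key=dist)
    return "".join(c if c != "?" else best[i] for i, c in enumerate(partial))
```

Because all missing ticks come from one donor, combinations of asset types that never co-occur among respondents cannot be imputed, which is the point of treating the string as a set.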
For some assets, mid-level routing asked respondents to specify how many iterations of a particular asset they owned, for example, how many mortgages or private pension schemes they held. This can be problematic because the subpopulation of respondents becomes smaller as the number of iterations increases, leading to extremely impoverished donor pools and rendering imputation inappropriate. In such cases, observed values from earlier iterations were included in the potential donor pool and the position in the sequence of iterations was included as a weighted MV.
Unlike most assets in the WAS survey, for a few, such as income from earnings, respondents were asked for both a Gross and a Net amount. The imputation of Gross and Net can be quite complex, particularly when applied to longitudinal data, which harbours within-wave relationships nested inside relationships between waves. To account for these relationships, the imputation strategy for Gross and Net differed somewhat from the standard cross-sectional and longitudinal strategies outlined previously.
In the event that the observed data did not provide any information about longitudinal relationship, the ratio-based roll forward/roll backward strategy outlined in the Wave 2 and Wave 3 Longitudinal Imputation Section was implemented in the first instance. The strategy was applied to impute either the Net to Net or the Gross to Gross relationship between waves depending on which had at least one observation in Wave 2 or Wave 3 to work with. When both Net and Gross were available to facilitate the longitudinal component of the imputation, Net was selected due to higher response rates for this variable. Once relationships between waves were resolved, Net to Gross relationships within Wave 2 and Wave 3 were imputed independently, also using the ratio-based roll forward/roll backward strategy outlined in the Wave 2 and Wave 3 Longitudinal Imputation Section.
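The ratio-based roll forward/roll backward can be sketched as below. Donor selection is elided and the between-wave and Gross/Net ratios are passed in directly, which is a simplification for illustration; the function and argument names are assumptions:

```python
# Hedged sketch of the ratio-based roll forward/roll backward: a missing
# wave value is derived from the observed wave value, scaled by a donor's
# between-wave ratio; a missing Gross is then derived from an observed Net
# within each wave using a donor's Gross/Net ratio.

def roll_between_waves(w2, w3, donor_ratio):
    """Fill whichever of (w2, w3) is missing, using donor_ratio = w3 / w2."""
    if w2 is None and w3 is not None:
        w2 = w3 / donor_ratio          # roll backward from Wave 3
    elif w3 is None and w2 is not None:
        w3 = w2 * donor_ratio          # roll forward from Wave 2
    return w2, w3

def net_to_gross(net, donor_gross_net_ratio):
    """Within-wave step: derive a missing Gross from an observed Net."""
    return net * donor_gross_net_ratio
```

In line with the text, the between-wave step would be run first on whichever of Net or Gross has an observation to work with (Net, given its higher response rates), and the within-wave Net-to-Gross step applied afterwards, independently per wave.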
Changes to the questionnaire or structure of a particular question group can create difficulties for longitudinal imputation, potentially leading to an inappropriate MV set that can bias results. Changes to the WAS questionnaire were addressed according to a list of strategic priorities. Where possible, Wave 2 variables were harmonised with those in Wave 3 and imputed according to the base-line strategy. If a Wave 2 variable structure was similar to that in Wave 3 but could not be harmonised, previously imputed Wave 2 data was not revised but, where relevant, was included in the Wave 3 MV set. Where Wave 2 variable structures were completely incompatible with Wave 3, a cross-sectional imputation strategy was applied to the Wave 3 missing data only.
Through previous research and expert review it had been recognised that some of the assets addressed by the WAS questionnaire such as personal pensions, can be extremely sensitive to ongoing temporal changes in market forces. For these variables, the month the interview was conducted was included in the weighted MV set.
Quality assurance and evaluation of the WAS imputation strategy was a three-stage process conducted at different times throughout processing. Typically, assessment was based on analytical results derived through custom software designed in SPSS or SAS and on expert review from domain and topic experts. To ensure thoroughness, three teams were involved in the quality assurance process: Survey Methodology; Collection and Production; and Analysis and Dissemination.
Stage 1: As the efficacy of any imputation method depends on the quality of the input data, prior to imputation the WAS data was examined against a well-defined set of imputation specifications. The specifications included a detailed data dictionary, a comprehensive outline of all routing architecture, approved MV sets, and additional notes on expected exceptions and outliers.
Stage 2: On a variable by variable basis throughout processing, the statistical properties of the imputed data were evaluated and compared to those of the observed data. This served to ensure that the imputation process itself did not introduce unwarranted bias into the cross sectional and longitudinal properties of the variable currently being imputed.
Stage 3: Following imputation, further analyses and review evaluated the impact of the imputed data in the calculation and derivation of estimates based on the WAS data. This served to ensure that the imputation strategy did not introduce unwarranted bias or have unnecessary impact on those estimates and thus, on published outputs.
From wave 3 onwards, three sets of weights were created for use with the datasets from each wave: (i) a longitudinal weight for survivors (Wave 1 – Wave T), (ii) a longitudinal weight for the last two consecutive waves (Wave T-1 – Wave T) and (iii) a pseudo cross-sectional weight (Wave T). It is important to ensure that each set of weights is used for analysis of the relevant subsample of respondents. The weights incorporate adjustments for non-response and differential sampling probabilities (Daffin et al., 2010) and also adjust for loss to follow-up (LTFU) at subsequent waves.
The wave 1 weights were constructed in three stages: first as the reciprocal of the selection probability; secondly adjusted for non-response; and finally calibrated to population totals using an age by sex and regional breakdown (Daffin et al., 2009). ‘Integrative calibration’ was used which ensures that each person in the household has the same weight; this is also the household weight. At each wave T, the Wave T-1 weight is brought forward to use as the basis of the Wave T base weight. The base weight tracks the progress through the survey of all people enumerated in the household, i.e. includes children and young adults who are deliberately not interviewed for the survey. WAS weights are calculated for all people enumerated in the household.
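The calibration step can be illustrated with a minimal raking (iterative proportional fitting) sketch over the age-by-sex and region margins. This is a simplified stand-in: WAS's integrative calibration additionally forces all members of a household to share one weight, which is omitted here, and all names are assumptions:

```python
# Minimal raking sketch: scale unit weights so that weighted class totals
# match known population margins on each dimension in turn, repeating
# until the scaling stabilises. Not the production integrative calibration.

def rake(weights, categories, margins, iterations=50):
    """categories[k][i]: unit i's class on dimension k (e.g. 'sex', 'region');
    margins[k][c]: population total for class c on dimension k."""
    w = list(weights)
    for _ in range(iterations):
        for k, margin in margins.items():
            totals = {}
            for i, wi in enumerate(w):          # weighted total per class
                c = categories[k][i]
                totals[c] = totals.get(c, 0.0) + wi
            for i in range(len(w)):             # scale to hit the margin
                c = categories[k][i]
                w[i] *= margin[c] / totals[c]
    return w
```

With a single dimension the procedure converges in one pass; with crossed age-sex and region dimensions it cycles until both sets of margins are met simultaneously.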
LTFU occurs through two processes. The first is where eligible people from Wave T-1 cannot be traced for their Wave T interview, so their eligibility status at Wave T is unknown. The second is where participants decide not to take part in the survey between waves.
The cases with unknown eligibility will, in reality, have included both eligible and ineligible cases. A weight is constructed to adjust for unknown eligibility using a weighted binomial regression of known/unknown eligibility status on a suite of socio-demographic characteristics measured at Wave T-1. The reciprocal of the propensity for known eligibility (1/p_elig) was used to adjust the Wave T-1 weight, by multiplying the Wave T-1 weight through by this eligibility adjustment. The resulting weight was then used in a binomial regression of response/non-response status on a suite of characteristics to adjust for the second stage of LTFU (response attrition). The reciprocal of the response propensity (1/p_resp) was used to adjust the previous weight further.

In summary, the Wave T longitudinal pre-calibrated weight (w_pre,T) can be written as (1) below for respondents:

w_pre,T = w_(T-1) × (1/p_elig) × (1/p_resp)   (1)

The weight is the product of three quantities: the Wave T-1 weight (w_(T-1)), the adjustment for cases moving into unknown eligibility (1/p_elig), and the adjustment for non-response (1/p_resp) at Wave T. This weight is defined over the set (s_R) of longitudinal respondents at Wave T.

A second group of people included in the construction of the base weight are those who became ineligible at Wave T (the set s_I), described in (2). Typically, this group predominantly comprises people who have left the population through death, migration or institutionalisation.

w_pre,T = w_(T-1) × (1/p_elig), for cases in s_I   (2)

Taking the two sets s_R and s_I together should recover the population prior to LTFU, assuming complete correction for the LTFU processes.

A longitudinal calibration weight (w_cal,T) was constructed from a trimmed version of the longitudinal pre-calibrated weight by calibrating the combined sub-sets of cases (s_R and s_I) to the relevant population totals. For the weights of the survivors of all waves, the relevant calibration population totals are from Wave 1; for the (T-1) to T longitudinal weights, the relevant population totals are from Wave T-1.

The g-weight (g) ensures that the sums of the calibration control variables (age by sex and region) match those of the relevant population.
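The two-stage LTFU adjustment for a responding case can be sketched as follows. In practice the propensities would come from the weighted binomial regressions described above; here they are passed in directly as assumed inputs, and the function name is illustrative:

```python
# Hedged sketch of the two-stage loss-to-follow-up adjustment: the Wave T-1
# weight is multiplied by the reciprocal of the estimated known-eligibility
# propensity, then by the reciprocal of the estimated response propensity.

def ltfu_adjust(w_prev, p_known_elig, p_response):
    """Wave T pre-calibrated longitudinal weight for a responding case."""
    return w_prev * (1.0 / p_known_elig) * (1.0 / p_response)
```

For example, a case with Wave T-1 weight 100, an 80% chance of known eligibility and a 50% chance of responding carries a pre-calibrated weight of 250, representing similar cases lost at both stages.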
A pseudo-cross-sectional weight at Wave T is constructed differently for each subgroup in the sample. First, consider the terminology used to describe the subgroups:
OSM – an Original Sample Member which refers to an individual who responded in the same wave that they were sampled.
EOSM – an Entrant Original Sample Member which refers to an individual who lives at an address which was sampled but the household did not respond until a later wave.
SSM – a Secondary Sample Member which refers to an individual who joined a previously responding household.
There are also new panels added from wave 3 onwards, as well as different combinations of response and non-response of sample members over waves to consider when calculating the cross sectional weights.
Any responder who has been in a previous wave will have their Wave T longitudinal weight as a base weight. The first challenge for the cross-sectional weight is to assign a weight to people entering the sample. SSMs and births receive a cross-sectional weight through a process of weight sharing the base weight of the OSMs. Rather than attempt to work out selection probabilities directly, it is common to use a weight share method to approximate these probabilities (e.g. Huang, 1984; Ernst, 1989; Kalton & Brick, 1995).
A standard approach is to assign weight shares based on Wave T-1 household members to people in target Wave T households. A variety of weight share algorithms exist (e.g. Rendtel & Harms, 2009). Following Kalton & Brick (1995), the weight at time T for household i can be defined as the sum, over the k individuals in households j at time T-1, of the product of the initial weights and a constant:

w_i(T) = alpha_i × Σ_k w_jk(T-1)

The constant (alpha_i) is defined in terms of the number of people in household i at time T who were in the population at time T-1.
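A minimal sketch of this weight share, assuming the constant is the reciprocal of the number of current household members who were in the population at T-1 (an interpretation for illustration, not the exact production formula):

```python
# Illustrative weight share following Kalton & Brick (1995): the Wave T
# household weight is the sum of members' Wave T-1 weights divided by the
# number of members who were in the population at T-1. New entrants who
# were in the population but not sampled (e.g. SSMs) carry weight 0.0.

def weight_share(prev_weights, in_population_prev):
    """prev_weights[i]: member i's Wave T-1 weight (0.0 if not then sampled);
    in_population_prev[i]: whether member i was in the population at T-1."""
    m = sum(in_population_prev)   # denominator of the sharing constant
    return sum(prev_weights) / m
```

For example, a sampled OSM with weight 100 joined by an unsampled SSM (weight 0, but in the population at T-1) and a newborn (not in the population at T-1) yields a shared household weight of 50.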
Each component of the Wave T pre-calibration cross-sectional weight is a constant for all members in the household. This is a consequence of the design for the entrant component sub-sample and of the weight share averaging (which occurred for all households and not just those with entrants) for the longitudinal sub-sample component.
The new panel weights were constructed first as the reciprocal of the selection probability, followed by a non-response adjustment, as with the original panel sample at wave 1.
The pseudo-cross-sectional Wave T weights are created through integrative calibration of the pre-calibration weight to the Wave T population totals (6). This is carried out for each panel separately to allow for analysis of each panel if required.
The final stage is to combine the different panels together; the chosen method combines the panels in proportion to the effective sample size (as proposed by Chu et al 1999, Korn and Graubard 1999). This accounts for the variance within each panel and combines the weights such that the overall variance is minimised. As a result, the newer panel(s) weights will be scaled up whilst the older panel(s) will be scaled down.
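The proportional-to-effective-sample-size combination can be sketched as follows, using the Kish approximation n_eff = (Σw)² / Σw². The panel data here are hypothetical and this is an illustration of the principle, not the production algorithm.

```python
# Combine calibrated panel weights in proportion to effective sample size.
# Assumes each panel's weights have already been calibrated to the same
# population total, so scaling by shares that sum to one preserves it.

def effective_sample_size(weights):
    # Kish approximation: (sum of weights)^2 / (sum of squared weights)
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def combine_panels(panels):
    """panels: list of weight lists, one per panel. Each panel's weights
    are multiplied by its share of the combined effective sample size."""
    neff = [effective_sample_size(p) for p in panels]
    overall = sum(neff)
    return [[w * n / overall for w in panel] for panel, n in zip(panels, neff)]

old_panel = [4.0] * 100   # 100 responders, weights sum to 400
new_panel = [1.0] * 400   # 400 responders, weights also sum to 400
combined = combine_panels([old_panel, new_panel])
# The larger, newer panel carries 400/500 of the combined effective sample
# size, so it contributes proportionally more to the combined estimates.
```

Weighting each panel by its share of the total effective sample size approximately minimises the variance of the combined estimate while keeping the combined weights summing to the population total.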
One measure of sampling variability is the standard error. Standard errors are one of the key measures of survey quality, showing the extent to which the estimates should be expected to vary over repeated random sampling. In order to estimate standard errors correctly, the complexity of the survey design needs to be accounted for, as does the calibration of the weight to population totals. WAS has a complex design that employs a two-stage, stratified sample of addresses with oversampling of the wealthier addresses at the second stage and implicit stratification in the selection of PSUs.
Typically, PSUs are characterised by a positive intra-class correlation coefficient; that is, people within a PSU are more similar to each other than they are to people in the rest of the sample. This acts to increase the standard error of an estimate relative to simple random sampling. Conversely, stratification can act to decrease the standard error if people within a stratum are relatively homogeneous and there is consequently a greater degree of heterogeneity between strata. Both these elements of the design should be accounted for when calculating standard errors.
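The impact of clustering on precision is often summarised by the design-effect approximation deff = 1 + (b - 1)ρ, where b is the average number of sample cases per PSU and ρ is the intra-class correlation. A minimal, purely illustrative sketch (the cluster size, ρ and standard error below are invented numbers, not WAS values):

```python
import math

def design_effect(avg_cluster_size, rho):
    # Approximate variance inflation from clustering: 1 + (b - 1) * rho
    return 1.0 + (avg_cluster_size - 1.0) * rho

def clustered_se(srs_se, avg_cluster_size, rho):
    # Standard error under the clustered design relative to simple
    # random sampling: SE_srs * sqrt(deff)
    return srs_se * math.sqrt(design_effect(avg_cluster_size, rho))

# With 20 sampled households per PSU and rho = 0.05, deff = 1.95, so the
# standard error is inflated by a factor of sqrt(1.95), roughly 1.40.
se = clustered_se(100.0, 20, 0.05)
```

Even a small intra-class correlation inflates the variance noticeably once clusters are of moderate size, which is why ignoring the PSU structure understates standard errors.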
An identifier of the PSU is included on the WAS dataset. The PSUs were selected by ordering the frame. The first ordering principle was geographic (region x district); the second was socio-demographic: within each of the 26 regional districts, further ordering was carried out on the basis of the socio-demographic characteristics of the PSU populace. This ordering fulfils two purposes. Firstly, it spreads out the sample in terms of socio-demographic characteristics, ensuring that people from both the higher and lower ends of the socio-demographic dimensions were included in the sample. Secondly, it enables stratification. The primary stratification variable, the 26 regional districts, was identified on the dataset, but because of the way the sample was selected from the ordered frame the design can be regarded as one selecting a single PSU per stratum. Consequently, it was possible to incorporate a much finer stratification procedure using a ‘collapsed stratum’ approach.
Finally, the calibration to population totals needs to be taken into account. This will have a beneficial effect, both in terms of adjusting for residual bias after non-response weighting and in reducing the variance of estimates. The extent to which the variance was reduced was related to the extent to which the survey variables were related to the variables in the calibration. The calibration variables were household counts of people within each age group by sex and regional category, so it was to be expected that, for example, the total wealth of a household will be associated with these variables.
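The principle of calibrating weights to population totals can be illustrated with a simple post-stratification sketch. The actual WAS calibration (household counts of age group by sex and regional category, via integrative calibration) is more elaborate; the groups and totals below are hypothetical.

```python
# Post-stratification: scale the weights within each calibration group so
# that the weighted count matches the known population total for that group.

def poststratify(weights, groups, pop_totals):
    """weights: design weights; groups: calibration group per case;
    pop_totals: known population count per group."""
    weighted_counts = {}
    for w, g in zip(weights, groups):
        weighted_counts[g] = weighted_counts.get(g, 0.0) + w
    factors = {g: pop_totals[g] / weighted_counts[g] for g in weighted_counts}
    return [w * factors[g] for w, g in zip(weights, groups)]

weights = [10.0, 10.0, 20.0, 20.0]
groups = ["under65", "under65", "65plus", "65plus"]
totals = {"under65": 40.0, "65plus": 30.0}
print(poststratify(weights, groups, totals))  # -> [20.0, 20.0, 15.0, 15.0]
```

Because the adjusted weights reproduce the population totals exactly, any survey variable correlated with the calibration groups is estimated with reduced variance, which is the effect described above.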
The method for taking account of the calibration when calculating standard errors is described in the report ‘Variance estimation for Labour Force Survey Estimates of Level and Change’, GSS Methodology Series no. 21, Holmes and Skinner.
To enable the reader to gain an appreciation of the variability of the results presented in this report, estimates of the standard errors of some key variables have been produced.
The estimates in this report are based on information obtained from a sample of the population and are therefore subject to sampling variability. Sampling error refers to the difference between the results obtained from the sample and the results that would be obtained if the entire population were fully enumerated. The estimates may therefore differ from the figures that would have been produced if information had been collected for all households or individuals in Great Britain.
Additional inaccuracies which are not related to sampling variability may occur for reasons such as errors in response and reporting. Inaccuracies of this kind are collectively referred to as non-sampling errors and may occur in any collection, whether it is a sample survey or a census.
The main sources of non-sampling error are:
response errors such as misleading questions, interviewer bias or respondent misreporting.
bias due to non-response as the characteristics of non-responding persons may differ from responding persons.
data input errors or systematic mistakes in processing the data.
Non-sampling errors are difficult to quantify in any collection; however, every effort was made to minimise their impact through careful design and testing of the questionnaire, training of interviewers, and extensive editing and quality control procedures at all stages of data processing. The ways in which these potential sources of error were minimised in WAS are discussed below.
Response errors generally arise from deficiencies in questionnaire design and methodology or in interviewing technique as well as through inaccurate reporting by the respondent. Errors may be introduced by misleading or ambiguous questions, inadequate or inconsistent definitions or terminology and by poor overall survey design. In order to minimise the impact of these errors the questionnaire, accompanying supporting documentation and processes were thoroughly tested before being finalised for use in the survey.
To improve the comparability of WAS statistics, harmonised concepts and definitions were also used where available. Harmonised questions were designed to provide common wordings and classifications to facilitate the analysis of data from different sources and have been well tested on a variety of collection vehicles.
WAS is a relatively long and complex survey, and reporting errors may also have been introduced through interviewer and/or respondent fatigue. While efforts were made to minimise errors arising from deliberate misreporting by respondents, some instances will inevitably have occurred.
Lack of uniformity in interviewing standards can also result in non-sampling error, as can the impression made upon respondents by personal characteristics of individual interviewers such as age, sex, appearance and manner. ONS uses training programmes, the provision of detailed supporting documentation, and regular supervision and checks of interviewers' work to achieve consistent interviewing practices and maintain a high level of accuracy.
One of the main sources of non-sampling error is non-response, which occurs when people who were selected in the survey cannot or will not provide information or cannot be contacted by interviewers. Non-response can be total or partial and can affect the reliability of results and introduce a bias.
The magnitude of any bias depends upon the level of non-response and the extent of the difference between the characteristics of those people who responded to the survey and those who did not. It is not possible to quantify accurately the nature and extent of the differences between respondents and non-respondents; however, every effort was made to reduce the level of non-response bias through careful survey design and compensation during the weighting process. To further reduce the level and impact of item non-response resulting from missing values for key items in the questionnaire, ONS undertook imputation prior to the release of the datasets for analysis.
Non-sampling errors may also occur between the initial data collection and final compilation of statistics. These may be due to a failure to detect errors during editing or may be introduced in the course of deriving variables, manipulating data or producing the weights. To minimise the likelihood of these errors occurring a number of quality assurance processes were employed.
Unlike other measures of wealth estimated in the Wealth and Assets Survey (WAS), where respondents are asked to estimate the value of their assets, estimating the value of private pension pots is less straightforward.
When wave 1 data were first being processed, the ONS worked closely with the Institute for Fiscal Studies (IFS) to develop the methodology for the calculation of private pension wealth. The basic methodology has remained unchanged and was explained in detail in Wealth in Great Britain 2008/10, Part 2, Chapter 5: Annex on Pension Wealth Methodology, 2008/10. This current annex does not attempt to explain how private pension wealth is calculated but concentrates on changes in some of the assumptions that have been made which have affected the overall estimates.
Following the publication of wave 2 data, where the estimates of pension wealth increased considerably between waves 1 and 2 of the survey, the ONS, in liaison with experts in other government departments, undertook a study to evaluate whether the methodology for calculating private pension wealth could be improved, as the change was thought to be largely unrepresentative of the actual change to pension wealth during this time period. The increase was due primarily to the increase in the modelled estimates of defined benefit pensions, which use some external data: annuity rates and discount factors.
The results of this work made recommendations to change the financial assumptions used, and it was agreed that these changes should be applied to all waves of WAS available to date, so that private pension wealth is calculated on a consistent basis across existing and future waves of the survey.
In addition to the changes to the financial assumptions, the estimates of pension wealth have also changed due to the way in which the selection of individuals eligible for current occupation pensions is carried out; updated imputation of wave 2 data using information collected at wave 3; and the imputation of a small number of non-respondents at wave 1. This paper looks at the relative impact of each of these changes on the estimates of private pension wealth.
The annuity rates and the discount factors used in the calculation of the estimates of DB pension wealth were thought to be the cause of the large changes seen in estimates of DB pension wealth between the first two waves of the survey. The methods originally used at Waves 1 and 2 involved applying a single fixed, age- and gender-specific annuity factor for the whole of a wave. The annuity values used for each of these waves were out of date, but were thought to be the best available at the time, given the inherent difficulty of sourcing historical annuity rates. In the case of Wave 1, rates for December 2009 were applied to data covering July 2006 to June 2008; in the case of Wave 2, rates for December 2011 were applied to data covering July 2008 to June 2010. In addition, the discount rate was set as the AA corporate bond yield rate, again using a single value for the whole wave of data, matched to the date of the annuity rate. During the recession this rate dropped, due to the general fall in stock prices. The discount rate is a particularly important component of the pension wealth calculation, as small changes have a cumulative, and consequently large, effect on the resulting values.
Initial evaluation of the assumptions used was made by DWP Economic Advisers who recommended that:
a) annuity factors and discount rates should be applied on a matched monthly basis (i.e. those applicable at the time of interview); and
b) the Superannuation Contributions Adjusted for Past Experience (SCAPE) discount rate used by Government Actuaries should be used. This is generally less volatile than the AA Corporate bond rate.
These methods were proposed with the aim of reducing the large increase observed between Waves 1 and 2 while retaining a defensible rationale, and providing a basis on which to proceed at Wave 3 and beyond. The broad idea was that this approach would generate a representative market value of the pension wealth at the time of interview, as well as applying a chronological smoothing function.
Following a sensitivity study carried out by ONS looking at the effects on current occupational DB pensions, and consultation with the Pensions Statistics Advisory Group, these recommendations were accepted and applied to all waves of WAS data.
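The core of the valuation being adjusted here can be sketched as: accrued annual pension, multiplied by an age- and sex-specific annuity factor, discounted back from pension age to the interview date. This is a deliberately simplified illustration of the idea, not the full WAS methodology; the function name and all figures are hypothetical.

```python
# Simplified DB pension wealth valuation sketch. Under the matched-monthly
# approach, annuity_factor and discount_rate are those applicable at the
# interview month (with the SCAPE rate as the discount rate) rather than
# a single fixed value for the whole wave.

def db_pension_wealth(annual_pension, annuity_factor,
                      discount_rate, years_to_pension_age):
    # Value of the accrued pension as a lump sum at pension age,
    # discounted back to the interview date.
    lump_sum_at_retirement = annual_pension * annuity_factor
    return lump_sum_at_retirement / (1.0 + discount_rate) ** years_to_pension_age

# An accrued pension of 10,000 a year, an annuity factor of 20 and a
# 3% discount rate over 15 years to pension age:
wealth = db_pension_wealth(10000.0, 20.0, 0.03, 15)
# A small change in the discount rate compounds over the 15 years,
# which is why the choice of rate has such a large effect on the estimates.
```

The exponential discounting term makes clear why the switch from the more volatile AA corporate bond rate to the smoother SCAPE rate changed the valuations so substantially.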
The following table shows the effect of the changes to the financial assumptions alone on the value of individual current DB pension wealth.
|Wave|Assumptions|Quartile 1|Median|Quartile 3|Aggregate value (£ billion)|
|Wave 1 (2006/08)|Original discount and annuity|14,700|56,300|146,700|430|
|Wave 1 (2006/08)|Matched monthly annuity rates and SCAPE factors|6,100|22,100|64,900|189|
|Wave 2 (2008/10)|Original discount and annuity|24,000|61,200|133,100|677|
|Wave 2 (2008/10)|Matched monthly annuity rates and SCAPE factors|8,300|25,100|133,100|322|
The annuity rates and discount factors are also used in the calculation of other pension pots including wealth in retained DB schemes and pensions from a former spouse/partner.
|Wave|Assumptions|Quartile 1|Median|Quartile 3|Aggregate value (£ billion)|
|Wave 1|Original discount and annuity|4,200|32,800|125,700|30.9|
|Wave 1|Matched monthly annuity rates and SCAPE factors|1,600|15,300|58,000|15.1|
|Wave 2|Original discount and annuity|900|14,500|42,200|5.7|
|Wave 2|Matched monthly annuity rates and SCAPE factors|700|7,900|30,500|4.3|
In earlier publications, some pension information provided by respondents was excluded when presenting the data at an individual level, as a selection had been made using employment status. No such selection has been made in the current publication. This particular change will have had no direct effect on the aggregate values of total pension wealth, nor any of the data presented at household level since these were based on all reported pensions regardless of employment status.
The selection was made when originally processing the data at wave 1 because it was concluded, at the time, that the occupational pension information collected from people who were not classified as employees was in error. However, the more detailed analyses now possible with the availability of further waves of data have allowed the editing procedures to filter out true errors, and no filter based on employment status is now deemed necessary.
The data presented at individual level concentrate on the proportion of individuals with the various pension schemes (which has changed as a result of removing this selection) and on the distribution of the values of the various pensions (which has been affected very little).
Details of the policy governing the release of new data are available by visiting www.statisticsauthority.gov.uk/assessment/code-of-practice/index.html or from the Media Relations Office email: firstname.lastname@example.org
These National Statistics are produced to high professional standards and released according to the arrangements approved by the UK Statistics Authority.