Methodological developments

Approaches for handling missing covariate data in cancer survival research

Ula Nur, Milena Falcaro and Bernard Rachet

Incomplete data are unavoidable in most research surveys, however much effort goes into planning and data collection. The problem is more common in population‐based routine data such as those collected by cancer registries, which often have to create new tumour registrations from incomplete information. Restricting the analysis to complete records may yield inferences that differ substantially from those that would have been obtained had no data been missing. The drawbacks of naïve methods for handling missing data, and their impact on conclusions, have been explored extensively over the last 20 or so years. It is now well known that ad hoc approaches such as complete‐case analysis, mean substitution or the use of a separate category for records with missing data can all lead to biased inference. The most appropriate way to handle incomplete data depends on the mechanism by which data items become missing. In population‐based data, for example, information on stage is very often missing for a large proportion of patients. Our main research interest in this area is multiple imputation by chained equations, on which we have already published a tutorial paper (Nur et al., 2010). Further work on the treatment of incomplete covariate data when estimating net survival is under way.
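
After each imputed data set has been analysed separately, multiple imputation combines the results, usually with Rubin's rules: the pooled estimate is the mean of the imputation-specific estimates, and its variance adds the average within-imputation variance to an inflated between-imputation variance. The Python sketch below illustrates that pooling step only. The function name and all numbers are hypothetical and are not taken from any analysis described here.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m imputation-specific estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log excess hazard ratios and variances from m = 5 imputed data sets
estimates = [0.40, 0.44, 0.38, 0.42, 0.41]
variances = [0.010, 0.011, 0.009, 0.010, 0.010]

pooled, total_var = pool_rubin(estimates, variances)
print(f"pooled estimate {pooled:.3f}, standard error {total_var ** 0.5:.3f}")
```

The between-imputation term is what complete-case analysis and single imputation ignore: it carries the extra uncertainty caused by the missing values.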

References and further reading

Little R.J.A. and Rubin D.B. (1987). Statistical Analysis with Missing Data. Second Edition. John Wiley & Sons: New York.
Nur U., Shack L.G., Rachet B., Carpenter J.R. and Coleman M.P. (2010). Modelling relative survival in the presence of incomplete data: a tutorial. Int J Epidemiology 39(1), 118-128.
White I.R., Wood A. and Royston P. (2011). Tutorial in biostatistics: Multiple imputation using chained equations: issues and guidance for practice. Statistics in Medicine 30, 377-399.



Structural equation modelling to identify the origin of inequalities in survival

Oded Horn and Bernard Rachet

Structural equation modelling offers several advantages over conditional regression: it quantifies the effects transmitted along several possible causal pathways (e.g. those mediated by smoking) between a prognostic factor (e.g. deprivation) and the outcome. Deprivation acts on relative survival in several interrelated ways, which may be better assessed by modelling their interrelations explicitly. We have been investigating the use of this framework to explain socioeconomic disparities in relative survival.
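
The idea of separating pathways can be illustrated with the simplest linear path model, in which the total effect of an exposure splits into a direct effect and an indirect effect through a single mediator. The Python sketch below uses simulated data with made-up coefficients and a continuous outcome. It is a toy example under those assumptions, not the model we use for relative survival, and the variable names are purely illustrative.

```python
import random


def cross(u, v):
    """Sum of centred cross-products of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def slope(y, x):
    """Least-squares slope of y on x (with intercept)."""
    return cross(x, y) / cross(x, x)


def slopes_two(y, x1, x2):
    """Least-squares slopes of y on x1 and x2 (with intercept)."""
    s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
    s1y, s2y = cross(x1, y), cross(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated data: deprivation -> smoking -> outcome, plus a direct path.
random.seed(1)
n = 5000
deprivation = [random.gauss(0, 1) for _ in range(n)]
smoking = [0.5 * d + random.gauss(0, 1) for d in deprivation]
outcome = [0.3 * d + 0.4 * s + random.gauss(0, 1)
           for d, s in zip(deprivation, smoking)]

a = slope(smoking, deprivation)                       # deprivation -> smoking
direct, b = slopes_two(outcome, deprivation, smoking)  # direct path, smoking -> outcome
indirect = a * b                                       # effect mediated by smoking
total = slope(outcome, deprivation)                    # total effect

print(f"direct {direct:.3f} + indirect {indirect:.3f} = total {total:.3f}")
```

In a linear model fitted by least squares the decomposition is exact, so the direct and indirect effects sum to the total effect. With survival outcomes and several correlated mediators this no longer holds so simply, which is one reason for modelling the pathways explicitly.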



Flexible models for life tables

Camille Maringe, Bernard Rachet and Libby Ellis

Background mortality varies by age, sex, calendar year, socio-economic group and region. We have produced life tables for England and Wales for each calendar year between 1971 and 2009, for each deprivation group and each Government Office Region. We have also built life tables for Scotland, Northern Ireland and Ireland. All these life tables are freely available on our website. We have shown that geographical variation in life expectancy in England and Wales around 1998 was mainly attributable to deprivation. We plan to update this analysis to examine trends in life expectancy by ethnicity. Constructing life tables for such specific populations requires smoothing techniques that can deal with sparse data. We have been developing a multivariable Poisson model with flexible functions (splines) to model the effects of deprivation, region, ethnicity, etc. on mortality rates. Our results suggest that this approach fits the observed mortality rates better than the traditional approaches and deals more efficiently with sparse data.
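
As a rough illustration of this kind of model, the Python sketch below fits a Poisson regression of deaths on age and deprivation, with log person-years as the offset, by iteratively reweighted least squares. Everything in it is an assumption made for the example: the single-knot linear spline (real applications would use richer spline bases), the two deprivation groups, the coefficients, and the noise-free "expected deaths" used as data so that the fit can be checked against known values.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_poisson(X, deaths, person_years, iterations=25):
    """Poisson regression (log link, log person-years offset) fitted by IRLS."""
    offset = [math.log(t) for t in person_years]
    # Start from the observed log rates, with a small continuity correction.
    eta = [math.log((d + 0.5) / t) for d, t in zip(deaths, person_years)]
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iterations):
        mu = [math.exp(o + e) for o, e in zip(offset, eta)]        # fitted deaths
        z = [e + (d - m) / m for e, d, m in zip(eta, deaths, mu)]  # working response
        xtwx = [[sum(w * r[j] * r[k] for w, r in zip(mu, X)) for k in range(p)]
                for j in range(p)]
        xtwz = [sum(w * r[j] * zi for w, r, zi in zip(mu, X, z)) for j in range(p)]
        beta = solve(xtwx, xtwz)
        eta = [sum(bj * v for bj, v in zip(beta, r)) for r in X]
    return beta


# Hypothetical coefficients: intercept, scaled age, spline term, deprivation.
true_beta = [-4.5, 0.9, 0.2, 0.3]
knot = 0.5  # age 70 on the scaled axis (age - 65) / 10

X, deaths, person_years = [], [], []
for deprived in (0, 1):
    for age in range(40, 90):
        a = (age - 65) / 10
        row = [1.0, a, max(a - knot, 0.0), float(deprived)]
        rate = math.exp(sum(bj * v for bj, v in zip(true_beta, row)))
        X.append(row)
        person_years.append(1000.0)
        deaths.append(rate * 1000.0)  # noise-free expected deaths, not real counts

beta = fit_poisson(X, deaths, person_years)
print([round(bj, 4) for bj in beta])
```

Because the model pools information across ages and groups, the fitted rate for a sparse cell borrows strength from its neighbours, which is the property that makes such models attractive for small populations.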



Methods for evaluating the proportion of cancer patients who are cured

Bernard Rachet and Anjali Shah
Collaboration with P. Lambert (Centre for Biostatistics & Genetic Epidemiology, University of Leicester) and P. Dickman, T. Andersson and S. Eloranta (Karolinska Institute, Sweden)

This project aims to develop statistical methods for estimating and modelling the cure fraction in population-based cancer survival analysis, and to apply them to data from Sweden, Finland, England and the USA, both to evaluate the new methodology and to study temporal trends in cancer patient survival. Traditional approaches to studying temporal trends in cancer patient survival typically involve estimating 5-year relative survival for different periods of diagnosis and attempting to correlate the observed trends with changes in factors thought to affect survival. A common problem is that an observed trend may be consistent with several competing hypotheses. An alternative approach, which we believe provides greater insight, is to estimate simultaneously the proportion of patients cured and the distribution of survival times of the uncured. Studying trends in both the cure fraction and the average survival time of the uncured makes it easier to distinguish between competing explanations for an observed trend in patient survival.
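
The point about competing explanations can be made concrete with a mixture cure model, in which relative survival at time t is the cure fraction plus the uncured fraction multiplied by the survival function of the uncured. The Python sketch below is a minimal illustration under stated assumptions: the Weibull distribution for the uncured is chosen only for convenience, and both scenarios and all parameter values are hypothetical.

```python
import math


def relative_survival(t, cure_fraction, lam, gamma=1.0):
    """Mixture cure model: cure fraction plus Weibull survival of the uncured."""
    uncured = math.exp(-lam * t ** gamma)
    return cure_fraction + (1.0 - cure_fraction) * uncured


def median_uncured(lam, gamma=1.0):
    """Median survival time of the uncured under the Weibull assumption."""
    return (math.log(2.0) / lam) ** (1.0 / gamma)


# Scenario A: 40% cured, short survival among the uncured (hypothetical values).
cure_a, lam_a = 0.40, 0.50
rs5 = relative_survival(5.0, cure_a, lam_a)

# Scenario B: only 30% cured; choose the rate for the uncured so that
# 5-year relative survival is identical to scenario A.
cure_b = 0.30
lam_b = -math.log((rs5 - cure_b) / (1.0 - cure_b)) / 5.0

for label, cure, lam in (("A", cure_a, lam_a), ("B", cure_b, lam_b)):
    print(f"scenario {label}: 5-year RS {relative_survival(5.0, cure, lam):.3f}, "
          f"cured {cure:.2f}, median survival of uncured {median_uncured(lam):.2f} years")
```

The two scenarios are indistinguishable on 5-year relative survival alone, yet one reflects more patients being cured and the other longer survival among those who are not.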

References and further reading

Lambert P.C., Thompson J.R., Weston C.L. and Dickman P.W. (2007). Estimating and modeling the cure fraction in population-based cancer survival analysis. Biostatistics 8, 576-594.
Shah A., Stiller C.A., Kenward M.G., Vincent T., Eden T.O. and Coleman M.P. (2008). Childhood leukaemia: long-term excess mortality and the proportion ‘cured’. Br J Cancer 99(1), 219-223.



Robust indices of survival for NHS management

Manuela Quaresma and Bernard Rachet

Local health authorities such as Primary Care Trusts (PCTs) have a strong interest in cancer survival in their resident populations, to guide local policy. We have shown that annual survival estimates for even the most common cancers are not sufficiently robust for such purposes in small populations. Interpretation is complicated by differences between PCTs in the distribution of prognostic factors, which can rarely be taken into account in analysis. There is nevertheless increasing pressure to make such indices more robust for managerial purposes: we have been commissioned by the National Cancer Intelligence Network to examine trends in the all-cancers survival index for the 151 PCTs in England. We have generated a robust index of one-year survival for all cancers combined in small areas, which provides time trends and geographic patterns across these PCTs. Results for patients diagnosed during 1996-2009 have been published (Quaresma et al., 2011) and have received wide coverage.
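
One generic way to make an all-cancers figure comparable between areas is to combine stratum-specific survival estimates with a fixed set of standard weights, so that the result does not depend on the local mix of cancers. The Python sketch below illustrates that general idea only. It assumes a simple weighted average, and the weights, case counts and survival values are invented for the example rather than taken from the published index.

```python
def weighted_index(survival, weights):
    """Weighted average of stratum-specific survival; weights need not sum to 1."""
    total = sum(weights.values())
    return sum(weights[k] * survival[k] for k in weights) / total


# Hypothetical one-year survival by cancer, identical in both areas.
survival_a = {"breast": 0.95, "colorectal": 0.75, "lung": 0.30}
survival_b = {"breast": 0.95, "colorectal": 0.75, "lung": 0.30}

# Hypothetical numbers of patients: area B has relatively more lung cancer.
cases_a = {"breast": 400, "colorectal": 350, "lung": 250}
cases_b = {"breast": 200, "colorectal": 300, "lung": 500}

# Hypothetical standard weights shared by every area.
standard = {"breast": 0.40, "colorectal": 0.35, "lung": 0.25}

crude_a = weighted_index(survival_a, cases_a)
crude_b = weighted_index(survival_b, cases_b)
index_a = weighted_index(survival_a, standard)
index_b = weighted_index(survival_b, standard)

print(f"crude: A {crude_a:.3f}, B {crude_b:.3f}")
print(f"standardised index: A {index_a:.3f}, B {index_b:.3f}")
```

With the same survival for every cancer, the crude figures differ only because of case mix, whereas the standardised index is the same in both areas.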

References and further reading

Quaresma M., Jakomis N., Gordon E., Carrigan C., Coleman M.P. and Rachet B. (2011). Index of cancer survival for Primary Care Trusts in England: patients diagnosed 1996-2009 and followed up to 2010. Office for National Statistics, 13 Dec 2011, 1-11.