Summary: Machine learning significantly improves the accuracy of predicting premature deaths, from all causes, in a middle-aged population compared with more traditional models.
Source: University of Nottingham
Computers which are capable of teaching themselves to predict premature death could greatly improve preventative healthcare in the future, suggests a new study by experts at the University of Nottingham.
The team of healthcare data scientists and doctors have developed and tested a system of computer-based ‘machine learning’ algorithms to predict the risk of early death due to chronic disease in a large middle-aged population.
They found that this AI system was very accurate in its predictions and performed better than the current standard approach to prediction developed by human experts. The study is published in PLOS ONE as part of a special collection, “Machine Learning in Health and Biomedicine”.
The team used health data from just over half a million people aged between 40 and 69 recruited to the UK Biobank between 2006 and 2010 and followed up until 2016.
Leading the work, Assistant Professor of Epidemiology and Data Science, Dr. Stephen Weng, said: “Preventative healthcare is a growing priority in the fight against serious diseases so we have been working for a number of years to improve the accuracy of computerized health risk assessment in the general population. Most applications focus on a single disease area but predicting death due to several different disease outcomes is highly complex, especially given environmental and individual factors that may affect them.
“We have taken a major step forward in this field by developing a unique and holistic approach to predicting a person’s risk of premature death by machine-learning. This uses computers to build new risk prediction models that take into account a wide range of demographic, biometric, clinical and lifestyle factors for each individual assessed, even their dietary consumption of fruit, vegetables, and meat per day.
“We mapped the resulting predictions to mortality data from the cohort, using Office for National Statistics death records, the UK cancer registry and ‘hospital episodes’ statistics. We found machine-learned algorithms were significantly more accurate in predicting death than the standard prediction models developed by a human expert.”
The machine-learning models used in the new study are known as ‘random forest’ and ‘deep learning’. These were pitched against the traditionally used ‘Cox regression’ prediction model based on age and gender, which was found to be the least accurate at predicting mortality, and also against a multivariate Cox model, which worked better but tended to over-predict risk.
Professor Joe Kai, one of the clinical academics working on the project, said: “There is currently intense interest in the potential to use ‘AI’ or ‘machine-learning’ to better predict health outcomes. In some situations, we may find it helps, in others it may not. In this particular case, we have shown that with careful tuning, these algorithms can usefully improve prediction.
“These techniques can be new to many in health research, and difficult to follow. We believe that by clearly reporting these methods in a transparent way, this could help with scientific verification and future development of this exciting field for health care.”
This new study builds on previous work by the Nottingham team which showed that four different AI algorithms, ‘random forest’, ‘logistic regression’, ‘gradient boosting’ and ‘neural networks’, were significantly better at predicting cardiovascular disease than an established algorithm used in current cardiology guidelines.
The Nottingham researchers predict that AI will play a vital part in the development of future tools capable of delivering personalized medicine, tailoring risk management to individual patients. Further research is needed to verify and validate these AI algorithms in other population groups and to explore ways of implementing these systems in routine healthcare.
Original Research: Open access.
“Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches”
Stephen F. Weng, Luis Vaz, Nadeem Qureshi, Joe Kai.
Published: March 27, 2019 PLOS ONE doi:10.1371/journal.pone.0214365
Prediction of premature all-cause mortality: A prospective general population cohort study comparing machine-learning and standard epidemiological approaches
Prognostic modeling using standard methods is well-established, particularly for predicting risk of single diseases. Machine-learning may offer the potential to explore outcomes of even greater complexity, such as premature death. This study aimed to develop novel prediction algorithms using machine-learning, in addition to standard survival modeling, to predict premature all-cause mortality.
A prospective population cohort of 502,628 participants aged 40–69 years was recruited to the UK Biobank from 2006 to 2010 and followed up until 2016. Participants were assessed on a range of demographic, biometric, clinical and lifestyle factors. Mortality data coded by ICD-10 were obtained through linkage to the Office for National Statistics. Models were developed using deep learning, random forest and Cox regression. Calibration was assessed by comparing observed to predicted risks, and discrimination by the area under the receiver operating characteristic curve (AUC).
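To make the two evaluation measures concrete, the following is a minimal sketch in plain Python rather than the study's own code. The function names and the eight-person toy dataset are assumptions made for illustration. Discrimination is computed as the AUC in its pairwise form: the probability that a randomly chosen person who died was assigned a higher risk than a randomly chosen person who survived. Calibration is summarized here as a single ratio of mean predicted risk to observed death rate, which is a simplification of comparing observed and predicted risks across the full range of risk.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (death, survivor) pairs in which the person
    who died received the higher predicted risk; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_ratio(labels, scores):
    """Mean predicted risk divided by the observed death rate.
    A value above 1 indicates over-prediction, below 1 under-prediction."""
    observed = sum(labels) / len(labels)
    predicted = sum(scores) / len(scores)
    return predicted / observed


# Hypothetical toy data (not from the study): 1 = died during follow-up.
died = [0, 0, 0, 0, 1, 0, 1, 1]
risk = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.80]

print(f"AUC: {auc(died, risk):.3f}")
print(f"Predicted/observed ratio: {calibration_ratio(died, risk):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. The pairwise loop is quadratic in the number of participants, so for a cohort of this size a rank-based implementation from an established statistics library would be the practical choice.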
14,418 deaths (2.9%) occurred over a total follow-up time of 3,508,454 person-years. A simple age and gender Cox model was the least predictive (AUC 0.689, 95% CI 0.681–0.699). A multivariate Cox regression model significantly improved discrimination by 6.2% (AUC 0.751, 95% CI 0.748–0.767). The application of machine-learning algorithms further improved discrimination by 3.2% using random forest (AUC 0.783, 95% CI 0.776–0.791) and 3.9% using deep learning (AUC 0.790, 95% CI 0.783–0.797). These ML algorithms improved discrimination by 9.4% and 10.1% respectively from a simple age and gender Cox regression model. Random forest and deep learning achieved similar levels of discrimination with no significant difference. Machine-learning algorithms were well-calibrated, while Cox regression models consistently over-predicted risk.
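The percentage improvements quoted above are absolute differences in AUC, expressed in points on a 0 to 100 scale, not relative changes. A short sketch that reproduces them from the reported figures is given below. The dictionary keys and the helper name are assumptions chosen for readability, and the deaths per 1,000 person-years figure is derived here from the reported totals and is not stated in the abstract.

```python
# Reported AUC values from the abstract.
AUC = {
    "cox_age_gender": 0.689,
    "cox_multivariate": 0.751,
    "random_forest": 0.783,
    "deep_learning": 0.790,
}


def gain_points(model, baseline):
    """Absolute AUC difference between two models, in percentage points."""
    return round((AUC[model] - AUC[baseline]) * 100, 1)


print(gain_points("cox_multivariate", "cox_age_gender"))  # 6.2
print(gain_points("random_forest", "cox_multivariate"))   # 3.2
print(gain_points("deep_learning", "cox_multivariate"))   # 3.9
print(gain_points("random_forest", "cox_age_gender"))     # 9.4
print(gain_points("deep_learning", "cox_age_gender"))     # 10.1

# Cohort totals reported in the abstract.
deaths = 14_418
participants = 502_628
person_years = 3_508_454

# Share of participants who died during follow-up, in percent.
death_share = round(100 * deaths / participants, 1)
# Derived figure (not in the abstract): deaths per 1,000 person-years.
rate_per_1000_py = round(1000 * deaths / person_years, 1)
print(death_share, rate_per_1000_py)
```

Read as relative changes, the same gains would be larger: for example, moving from 0.689 to 0.751 is a relative increase of about 9%, which is why it helps to state which convention is in use.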
Machine-learning significantly improved the accuracy of prediction of premature all-cause mortality in this middle-aged population, compared to standard methods. This study illustrates the value of machine-learning for risk prediction within a traditional epidemiological study design, and how this approach might be reported to assist scientific verification.