Summary: A new artificial intelligence system is capable of surpassing human experts in breast cancer prediction. The system provided absolute reductions of up to 5.7% in false-positive diagnoses and up to 9.4% in false-negative predictions of the disease.
Source: Cancer Research UK
Right now, the NHS breast cancer screening program saves around 1,300 lives in the UK each year.
But there are severe NHS staff shortages, particularly in the teams that help diagnose cancer, with some reports suggesting that up to 1 in 10 diagnostic posts are currently vacant. Add rising demand to the mix, and the future of these services could be in trouble.
But new technology could help ease the situation. We’ve partnered with Google Health on research to develop artificial intelligence that not only has the potential to change the way we detect breast cancer but could also save the NHS time and money.
Helping to train a computer
Our scientists have created a database of anonymized breast cancer scans (mammograms) that have come from breast screening appointments at a number of NHS breast screening centers around the UK, to be used for research.
Containing over 2.5 million images, this database is the largest and most dynamic of its kind in the world. And it’s available for academic and commercial partners to use if they have a smart and scientifically sound research proposal that will benefit patients. But before they get access, their proposal is scrutinized by a group of experts, including people affected by cancer.
That’s where Google Health comes in. Five years ago, Google and researchers from Imperial College London approached our team with a belief that a fancy computer program could be developed and trained to spot cancer on mammograms.
“Basically, they were trying to teach a machine to read images and it takes an awful lot of images for it to learn so it can get really good at picking up cancer,” says Helen, a member of Independent Cancer Patients’ Voice, a group that brings together patient advocates to help with medical research. She reviewed Google Health’s application to access the database.
Computers with AI capabilities are only as good as the data they’ve been trained on, so for her, our mammogram collection and Google’s technology prowess were a winning combination.
Results from this mighty research collaboration, published in Nature, show that the learning paid off. The AI software was able to correctly identify cancers in screening images with a degree of accuracy similar to that of the experts. The computer program also reduced the number of errors, including cases where cancer is flagged incorrectly and cases where it is missed altogether.
Currently, two experts review each breast screening scan. But the system isn’t perfect, as screening can miss some cancers and pick up ones that wouldn’t have gone on to cause problems.
“It now looks from this research that having the combination of a human eye and a machine eye over the images could actually give more accurate results,” says Helen. She is referring to the study’s finding that AI reduced false-positive results. These are ‘false alarms’ that occur when someone gets an abnormal result but does not have cancer.
“That will reduce loads of anxiety for women,” says Helen, who was diagnosed with breast cancer in 2004 and finished reconstruction surgery in 2014. It will also save the NHS time and money by reducing the number of patients who are called back for further tests.
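To make these terms concrete, here is a minimal Python sketch of how false-positive and false-negative rates, and the absolute reduction between two readers, can be worked out from screening outcomes. The class, the function names and every count are invented for illustration; none of them are figures or code from the study.

```python
from dataclasses import dataclass


@dataclass
class ScreeningCounts:
    """Outcome counts for one reader (hypothetical illustration values)."""
    true_positives: int   # cancer present and flagged
    false_negatives: int  # cancer present but missed
    false_positives: int  # no cancer but flagged (a "false alarm")
    true_negatives: int   # no cancer and not flagged

    def false_positive_rate(self) -> float:
        # Share of cancer-free cases that were flagged anyway.
        return self.false_positives / (self.false_positives + self.true_negatives)

    def false_negative_rate(self) -> float:
        # Share of cancer cases that were missed.
        return self.false_negatives / (self.false_negatives + self.true_positives)


def absolute_reduction(baseline_rate: float, new_rate: float) -> float:
    """Absolute reduction in percentage points (positive means fewer errors)."""
    return (baseline_rate - new_rate) * 100


# Invented counts, NOT figures from the study.
human = ScreeningCounts(true_positives=70, false_negatives=30,
                        false_positives=500, true_negatives=9500)
ai = ScreeningCounts(true_positives=75, false_negatives=25,
                     false_positives=400, true_negatives=9600)

fp_drop = absolute_reduction(human.false_positive_rate(), ai.false_positive_rate())
fn_drop = absolute_reduction(human.false_negative_rate(), ai.false_negative_rate())
print(f"False positives: {fp_drop:.1f} percentage points lower")
print(f"False negatives: {fn_drop:.1f} percentage points lower")
```

An absolute reduction is the simple difference between two rates, so a fall from 5% to 4% is a reduction of 1 percentage point, even though it is a 20% drop in relative terms.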
Artificial intelligence in a real scenario
Professor Ken Young works for the NHS and manages our mammogram collection. He and his colleagues helped Google Health analyze the data and design the trial to make it the most realistic AI study in breast cancer detection to date.
“What I think is most interesting about this study is its realism,” says Young. “What’s unusual is that it compares the algorithm to a totally realistic clinical scenario.”
Past studies have used specially selected mammograms that were analyzed in a somewhat artificial setting. For example, some other programs have been tested on a set of images that have more cancer cases than would be found in the general population.
But in the latest study, researchers compared the AI’s results with real decisions made by radiologists analyzing the scans of people attending the NHS breast screening program.
“We have a sample that is representative of all the women that might come through breast screening,” says Young. “It includes easy cases, difficult cases and everything in between.”
And thanks to this collaboration, the data set is even richer than it was before. Around 100,000 more normal cases have been added to the database, which is now available to other researchers using the scan collection.
Giving the gift of time
NHS staff could also benefit from the partnership. A recent review suggested that this kind of tech will give radiologists ‘the gift of time’, instead of replacing them.
“All the radiologists I know aren’t worried about AI at all,” says Young. “I think they’d be delighted to have some of the quite monotonous work of reading mammograms done for them, so they’re freed up to do other things.”
Keeping patient data safe
The other concern when it comes to developing AI software is data protection, something that Young, Helen, and the team have carefully thought through.
“One of the concerns that come through is patient confidentiality,” says Helen, who has taken part in trials herself. “It’s very important that I sit there on the lay side to make certain that everything is anonymized, and the ethics are checked.”
Before images enter the database, they’re immediately de-identified so there is no way that a researcher can find out who the mammograms belong to. The scans don’t include any personal information, which is “stripped out before we add the image to the database and share it with researchers,” says Young.
And research groups who are granted access to the images also have to agree to certain conditions, like keeping the patient data confidential and not using it for any purpose other than the development of AI screening algorithms.
AI still has a lot to learn
This well-trained algorithm is still in its early stages, but now has a firm foundation of knowledge to build on. Next, the team needs to test it on a wider population and see how radiologists can benefit from using the algorithm in the clinic.
“I genuinely think the potential here is enormous,” says Young.
“Breast cancer screening has a number of problems that could be tackled by the introduction of artificial intelligence.”
“These early studies using AI are the beginning of something quite big that will revolutionize medicine, this is just one of the first examples.”
Press Office – Cancer Research UK
Original Research: Closed access
“International evaluation of an AI system for breast cancer screening”. Scott Mayer McKinney, Marcin Sieniek, Varun Godbole, Jonathan Godwin, Natasha Antropova, Hutan Ashrafian, Trevor Back, Mary Chesus, Greg C. Corrado, Ara Darzi, Mozziyar Etemadi, Florencia Garcia-Vicente, Fiona J. Gilbert, Mark Halling-Brown, Demis Hassabis, Sunny Jansen, Alan Karthikesalingam, Christopher J. Kelly, Dominic King, Joseph R. Ledsam, David Melnick, Hormuz Mostofi, Lily Peng, Joshua Jay Reicher, Bernardino Romera-Paredes, Richard Sidebottom, Mustafa Suleyman, Daniel Tse, Kenneth C. Young, Jeffrey De Fauw & Shravya Shetty.
International evaluation of an AI system for breast cancer screening
Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful [1]. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives [2]. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening.
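Two quantities in the abstract can be illustrated with a short Python sketch. AUC-ROC can be read as the probability that a randomly chosen cancer case receives a higher score than a randomly chosen normal case, and the first function computes it by comparing every such pair. The second function rests on an assumption that the abstract does not spell out: that in the simulated double-reading workflow the second reader is consulted only when the AI and the first reader disagree. The function names and all scores and decisions are invented for illustration and are not data or code from the study.

```python
from typing import Sequence


def auc_roc(cancer_scores: Sequence[float], normal_scores: Sequence[float]) -> float:
    """Pairwise estimate of AUC-ROC.

    Counts how often a cancer case outscores a normal case;
    ties count as half a win.
    """
    wins = 0.0
    for c in cancer_scores:
        for n in normal_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cancer_scores) * len(normal_scores))


def second_reader_cases_avoided(first_reader: Sequence[int],
                                ai: Sequence[int]) -> float:
    """Share of cases that never reach the second reader.

    ASSUMPTION (not stated in the abstract): the second reader is
    consulted only when the AI and the first reader disagree.
    """
    agreements = sum(1 for f, a in zip(first_reader, ai) if f == a)
    return agreements / len(first_reader)


# Invented scores, NOT data from the study (higher = more suspicious).
cancer = [0.9, 0.8, 0.4]
normal = [0.7, 0.3, 0.2, 0.1]
print(f"AUC-ROC: {auc_roc(cancer, normal):.3f}")

# Invented recall decisions (1 = recall for further tests, 0 = no recall).
first_reader = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
ai_decisions = [0, 0, 1, 0, 1, 1, 0, 0, 0, 0]
avoided = second_reader_cases_avoided(first_reader, ai_decisions)
print(f"Second-reader workload avoided: {avoided:.0%}")
```

Under this assumed rule, the more often the AI agrees with the first reader, the fewer cases the second reader has to look at, which is one plausible way a large workload reduction could arise.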