Summary: A new machine learning model combines data from the Covid-19 pandemic with a neural network to determine the efficacy of social distancing measures and better predict viral spread. With current quarantine measures in place, the model predicts a plateau in coronavirus infections between April 15 and 20.
Every day for the past few weeks, charts and graphs plotting the projected apex of Covid-19 infections have been splashed across newspapers and cable news. Many of these models have been built using data from studies on previous outbreaks like SARS or MERS. Now, a team of engineers at MIT has developed a model that uses data from the Covid-19 pandemic in conjunction with a neural network to determine the efficacy of quarantine measures and better predict the spread of the virus.
“Our model is the first which uses data from the coronavirus itself and integrates two fields: machine learning and standard epidemiology,” explains Raj Dandekar, a PhD candidate studying civil and environmental engineering. Together with George Barbastathis, professor of mechanical engineering, Dandekar has spent the past few months developing the model as part of the final project in class 2.168 (Learning Machines).
Most models used to predict the spread of a disease follow what is known as the SEIR model, which groups people into “susceptible,” “exposed,” “infected,” and “recovered.” Dandekar and Barbastathis enhanced the SEIR model by training a neural network to capture the number of infected individuals who are under quarantine, and therefore no longer spreading the infection to others.
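The augmented compartment structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's implementation: the rate constants, the forward-Euler time step, and the hand-written `quarantine_strength` function (a stand-in for the trained neural network) are all hypothetical values chosen for readability.

```python
import math


def quarantine_strength(day):
    """Hypothetical stand-in for the learned network.

    Ramps smoothly from about 0 toward 0.4 per day around day 15.
    """
    return 0.4 / (1.0 + math.exp(-(day - 15.0) / 3.0))


def simulate_seir_q(days, population=1_000_000, initial_infected=500,
                    beta=0.4, sigma=0.2, gamma=0.1,
                    q_fn=quarantine_strength, dt=0.1):
    """Forward-Euler integration of SEIR plus a quarantined compartment Q.

    All parameter values are illustrative assumptions.
    Returns one (S, E, I, R, Q) tuple per day, starting with day 0.
    """
    s = float(population - initial_infected)
    e, i, r, q = 0.0, float(initial_infected), 0.0, 0.0
    history = [(s, e, i, r, q)]
    steps_per_day = round(1 / dt)
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            new_exposed = beta * s * i / population   # S -> E
            new_infectious = sigma * e                # E -> I
            recovered = gamma * i                     # I -> R
            quarantined = q_fn(t) * i                 # I -> Q (stops spreading)
            s -= new_exposed * dt
            e += (new_exposed - new_infectious) * dt
            i += (new_infectious - recovered - quarantined) * dt
            r += recovered * dt
            q += quarantined * dt
        history.append((s, e, i, r, q))
    return history


if __name__ == "__main__":
    with_q = simulate_seir_q(60)
    without_q = simulate_seir_q(60, q_fn=lambda t: 0.0)
    print("peak infected with quarantine:   ", round(max(row[2] for row in with_q)))
    print("peak infected without quarantine:", round(max(row[2] for row in without_q)))
```

Moving people from the infected compartment into a quarantined one removes them from the transmission term, which is why the run with quarantine peaks far lower than the run without it.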
The model finds that in places like South Korea, where there was immediate government intervention in implementing strong quarantine measures, the spread of the virus plateaued more quickly. In places that were slower to implement government interventions, like Italy and the United States, the “effective reproduction number” of Covid-19 remains greater than one, meaning the virus has continued to spread exponentially.
The machine learning algorithm shows that with the current quarantine measures in place, the plateau for both Italy and the United States will arrive somewhere between April 15 and 20. This prediction is similar to other projections like that of the Institute for Health Metrics and Evaluation.
“Our model shows that quarantine restrictions are successful in getting the effective reproduction number from larger than one to smaller than one,” says Barbastathis. “That corresponds to the point where we can flatten the curve and start seeing fewer infections.”
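The threshold Barbastathis describes can be made concrete with a small calculation. The sketch below assumes that quarantine acts as an extra removal rate on top of recovery, so that the effective reproduction number is the transmission rate divided by the sum of the recovery and quarantine rates. The article does not state the exact expression the team uses, and the function names, rates, and linear ramp here are hypothetical.

```python
def effective_reproduction_number(beta, gamma, quarantine_rate):
    """R_t under the assumption that quarantine adds to the removal rate."""
    return beta / (gamma + quarantine_rate)


def first_day_below_one(beta, gamma, quarantine_fn, max_days=365):
    """Return the first day on which R_t drops below one, or None if it never does."""
    for day in range(max_days + 1):
        if effective_reproduction_number(beta, gamma, quarantine_fn(day)) < 1.0:
            return day
    return None


def linear_ramp(day):
    """Hypothetical quarantine rate that tightens by 0.04 per day."""
    return 0.04 * day


if __name__ == "__main__":
    # Illustrative rates: beta = 0.4 per day, gamma = 0.1 per day.
    print(effective_reproduction_number(0.4, 0.1, 0.0))   # no quarantine
    print(first_day_below_one(0.4, 0.1, linear_ramp))     # 8
```

With these illustrative values, the reproduction number falls below one once the quarantine rate exceeds the gap between the transmission and recovery rates (0.3 per day), which the ramp reaches on day 8.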
Quantifying the impact of quarantine
In early February, as news of the virus’ troubling infection rate started dominating headlines, Barbastathis proposed a project to students in class 2.168. At the end of each semester, students in the class are tasked with developing a physical model for a problem in the real world and developing a machine learning algorithm to address it. He proposed that a team of students work on mapping the spread of what was then simply known as “the coronavirus.”
“Students jumped at the opportunity to work on the coronavirus, immediately wanting to tackle a topical problem in typical MIT fashion,” adds Barbastathis.
One of those students was Dandekar. “The project really interested me because I got to apply this new field of scientific machine learning to a very pressing problem,” he says.
As Covid-19 started to spread across the globe, the scope of the project expanded. What had originally started as a project looking just at spread within Wuhan, China, grew to also include the spread in Italy, South Korea, and the United States.
The duo started modeling the spread of the virus in each of these four regions after the 500th case was recorded. That milestone marked a clear delineation in how different governments implemented quarantine orders.
Armed with precise data from each of these regions, the research team took the standard SEIR model and augmented it with a neural network that learns how infected individuals under quarantine impact the rate of infection. They trained the neural network through 500 iterations so it could then teach itself how to predict patterns in the infection spread.
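The idea of fitting the quarantine effect over 500 training iterations can be illustrated with a much simpler stand-in. The sketch below replaces the neural network with a single constant quarantine rate and fits it by gradient descent to synthetic data generated from the same toy model. The model, the parameter values, the learning rate, and the finite-difference gradient are all assumptions made for this example and are not taken from the team's code.

```python
import math


def simulate_log_infected(q, days=20, population=10_000_000, infected0=500.0,
                          beta=0.3, gamma=0.1):
    """Daily log infected counts from a toy model with constant quarantine rate q."""
    s, i = population - infected0, infected0
    out = []
    for _ in range(days):
        new_infections = beta * s * i / population
        removed = (gamma + q) * i        # recovery plus quarantine
        s -= new_infections
        i += new_infections - removed
        out.append(math.log(i))
    return out


def loss(q, observed):
    """Mean squared error between simulated and observed log infected counts."""
    predicted = simulate_log_infected(q, days=len(observed))
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def fit_quarantine_rate(observed, iterations=500, lr=0.002, eps=1e-6):
    """Gradient descent on q with a central finite-difference gradient.

    A step is kept only if it lowers the loss; otherwise the step size is halved.
    """
    q = 0.0
    current = loss(q, observed)
    for _ in range(iterations):
        grad = (loss(q + eps, observed) - loss(q - eps, observed)) / (2 * eps)
        candidate = min(max(q - lr * grad, 0.0), 0.8)   # keep q in a safe range
        candidate_loss = loss(candidate, observed)
        if candidate_loss < current:
            q, current = candidate, candidate_loss
        else:
            lr *= 0.5
    return q, current


if __name__ == "__main__":
    # Synthetic "observations" produced with a known rate of 0.15 per day.
    observed = simulate_log_infected(0.15)
    q_fit, final_loss = fit_quarantine_rate(observed)
    print("fitted quarantine rate:", round(q_fit, 4))
```

A neural network replaces the single number `q` with a flexible function of time, but the loop has the same shape: simulate, compare with the recorded case counts, and adjust the parameters to shrink the mismatch.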
Using this model, the research team was able to draw a direct correlation between quarantine measures and a reduction in the effective reproduction number of the virus.
“The neural network is learning what we are calling the ‘quarantine control strength function,’” explains Dandekar. In South Korea, where strong measures were implemented quickly, the quarantine control strength function has been effective in reducing the number of new infections. In the United States, where quarantine measures have been slowly rolled out since mid-March, it has been more difficult to stop the spread of the virus.
Predicting the “plateau”
As the growth in cases in a particular country slows, the forecasting model transitions from an exponential regime to a linear one. Italy began entering this linear regime in early April, with the U.S. not far behind it.
The machine learning algorithm Dandekar and Barbastathis have developed predicted that the United States would start to shift from an exponential regime to a linear regime in the first week of April, with a stagnation in the infected case count likely between April 15 and April 20. It also suggests that the infection count will reach 600,000 in the United States before the rate of infection starts to stagnate.
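The difference between the two regimes shows up in the day-to-day increments of the cumulative case count: under exponential growth each day's increase is larger than the last by a roughly constant factor, while under linear growth the daily increase stays roughly constant. The following sketch classifies a series on that basis. The function name, the 5 percent tolerance, and the example series are hypothetical, and this is not the method the team uses to detect the transition.

```python
def classify_regime(cumulative_cases, tolerance=0.05):
    """Label a cumulative case series as 'exponential', 'linear', or 'slowing'.

    Compares the average ratio of successive daily increments with one.
    """
    if len(cumulative_cases) < 3:
        raise ValueError("need at least three data points")
    increments = [b - a for a, b in zip(cumulative_cases, cumulative_cases[1:])]
    ratios = [later / earlier
              for earlier, later in zip(increments, increments[1:])
              if earlier > 0]
    if not ratios:
        return "slowing"
    mean_ratio = sum(ratios) / len(ratios)
    if mean_ratio > 1.0 + tolerance:
        return "exponential"
    if mean_ratio >= 1.0 - tolerance:
        return "linear"
    return "slowing"


if __name__ == "__main__":
    exponential_series = [100 * 1.2 ** day for day in range(10)]   # 20% daily growth
    linear_series = [100 + 50 * day for day in range(10)]          # 50 new cases a day
    print(classify_regime(exponential_series))   # exponential
    print(classify_regime(linear_series))        # linear
```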
“This is a really crucial moment of time. If we relax quarantine measures, it could lead to disaster,” says Barbastathis.
According to Barbastathis, one only has to look to Singapore to see the dangers that could stem from relaxing quarantine measures too quickly. While the team didn’t study Singapore’s Covid-19 cases in their research, the second wave of infection that country is currently experiencing reflects their model’s finding about the correlation between quarantine measures and infection rate.
“If the U.S. were to follow the same policy of relaxing quarantine measures too soon, we have predicted that the consequences would be far more catastrophic,” Barbastathis adds.
The team plans to share the model with other researchers in the hopes that it can help inform Covid-19 quarantine strategies that can successfully slow the rate of infection.
Quantifying the effect of quarantine control in Covid-19 infectious spread using machine learning
Since the first recording of what we now call Covid-19 infection in Wuhan, Hubei province, China on Dec 31, 2019, the disease has spread worldwide and met with a wide variety of social distancing and quarantine policies. The effectiveness of these responses is notoriously difficult to quantify as individuals travel, violate policies deliberately or inadvertently, and infect others without themselves being detected. Moreover, the publicly available data on infection rates are themselves unreliable due to limited testing and possibly even under-reporting. In this paper, we attempt to interpret and extrapolate from publicly available data using a model that combines first-principles epidemiological equations with a data-driven neural network. Leveraging our neural network augmented model, we focus our analysis on four locales: Wuhan, Italy, South Korea and the United States of America, and compare the role played by the quarantine and isolation measures in each of these locales in controlling the effective reproduction number R_t of the virus. Our results unequivocally indicate that the countries in which rapid government interventions and strict public health measures for quarantine and isolation were implemented were successful in halting the spread of infection and preventing it from exploding exponentially. In the case of Wuhan especially, where data were available earliest, we have been able to test the predictive ability of our model by training it on data from the January 24 to March 3 window, and then matching the predictions up to April 1. Even for Italy and South Korea, we have a buffer window of one week (25 March – 1 April) to validate the predictions of our model. In the case of the US, our model captures well the current infected curve growth and predicts a halting of infection spread by 20 April 2020.
We further demonstrate that relaxing or reversing quarantine measures right now will lead to an exponential explosion in the infected case count, thus nullifying the role played by all measures implemented in the US since mid-March 2020.