Summary: A new AI tool identified long COVID in 22.8% of patients, a much higher rate than previously diagnosed. By analyzing extensive health records from nearly 300,000 patients, the algorithm identifies long COVID by distinguishing symptoms linked specifically to SARS-CoV-2 infection rather than pre-existing conditions.
This AI approach, known as “precision phenotyping,” helps clinicians differentiate long COVID symptoms from other health issues and may improve diagnostic accuracy by about 3%.
Key Facts:
- AI-based precision phenotyping: Identifies long COVID only after excluding other causes of symptoms in health records, improving diagnostic accuracy.
- Broader representation: Algorithm diagnoses mirror the Massachusetts demographic profile, addressing biases found in traditional diagnostic codes.
- Research potential: Algorithm may advance future research on the genetic and biochemical factors of long COVID subtypes.
Source: Harvard
While earlier diagnostic studies have suggested that 7 percent of the population suffers from long COVID, a new AI tool developed by Mass General Brigham put the figure much higher, at 22.8 percent, according to the study.
The AI-based tool can sift through electronic health records to help clinicians identify cases of long COVID. The often-mysterious condition can encompass a litany of enduring symptoms, including fatigue, chronic cough, and brain fog, after infection with SARS-CoV-2.
The algorithm was developed using de-identified data drawn from the clinical records of nearly 300,000 patients across 14 hospitals and 20 community health centers in the Mass General Brigham system.
The results, posted on the preprint server medRxiv, could help identify more people who should be receiving care for this potentially debilitating condition.
“Our AI tool could turn a foggy diagnostic process into something sharp and focused, giving clinicians the power to make sense of a challenging condition,” said senior author Hossein Estiri, head of AI Research at the Center for AI and Biomedical Informatics of the Learning Healthcare System (CAIBILS) at MGB and an associate professor of medicine at Harvard Medical School.
“With this work, we may finally be able to see long COVID for what it truly is — and more importantly, how to treat it.”
For the purposes of their study, Estiri and colleagues defined long COVID as a diagnosis of exclusion that is also infection-associated. That means the diagnosis could not be explained by other conditions in the patient’s own medical record but was associated with a COVID infection. In addition, the diagnosis needed to have persisted for two months or longer in a 12-month follow-up window.
The novel method developed by Estiri and colleagues, called “precision phenotyping,” sifts through individual records to identify symptoms and conditions linked to COVID-19 and tracks them over time to differentiate them from other illnesses.
For example, the algorithm can detect if shortness of breath results from pre-existing conditions like heart failure or asthma rather than long COVID. Only when every other possibility has been exhausted does the tool flag the patient as having long COVID.
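The working definition above can be sketched in a few lines of Python. This is a deliberately simplified, rule-based illustration and not the researchers' algorithm, which the abstract describes as using an attention mechanism tuned against chart reviews. The record layout, the function name `flag_long_covid`, the `EXPLAINED_BY` lookup table, and the 60-day and 365-day thresholds are all assumptions chosen to mirror the "two months or longer in a 12-month follow-up window" wording.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

# Hypothetical lookup: prior conditions that could explain a symptom.
# Illustrative only; the entries mirror the example in the article.
EXPLAINED_BY = {
    "shortness of breath": {"heart failure", "asthma"},
}

FOLLOW_UP = timedelta(days=365)        # assumed 12-month follow-up window
MIN_PERSISTENCE = timedelta(days=60)   # assumed "two months or longer"


@dataclass
class Patient:
    infection_date: date
    prior_conditions: set = field(default_factory=set)
    # symptom name -> dates on which the symptom was recorded
    symptom_dates: dict = field(default_factory=dict)


def flag_long_covid(patient: Patient) -> list:
    """Return symptoms that pass the simplified diagnosis-of-exclusion rule."""
    window_end = patient.infection_date + FOLLOW_UP
    flagged = []
    for symptom, dates in patient.symptom_dates.items():
        # Exclusion: skip symptoms that a pre-existing condition could explain.
        if EXPLAINED_BY.get(symptom, set()) & patient.prior_conditions:
            continue
        # Infection-associated: keep only records inside the follow-up window.
        in_window = sorted(
            d for d in dates if patient.infection_date <= d <= window_end
        )
        if len(in_window) < 2:
            continue
        # Persistence: first and last record at least two months apart.
        if in_window[-1] - in_window[0] >= MIN_PERSISTENCE:
            flagged.append(symptom)
    return flagged


# Made-up patient: asthma pre-dates the infection, fatigue does not.
patient = Patient(
    infection_date=date(2021, 1, 10),
    prior_conditions={"asthma"},
    symptom_dates={
        "shortness of breath": [date(2021, 2, 1), date(2021, 5, 1)],
        "fatigue": [date(2021, 2, 1), date(2021, 4, 15)],
    },
)
print(flag_long_covid(patient))  # ['fatigue']
```

In this toy case the shortness of breath is attributed to the patient's asthma and dropped, while the fatigue, recorded 73 days apart after infection, is flagged. A fixed lookup table like this cannot weigh how strongly a prior condition explains a symptom, which is the kind of judgment the real tool is meant to handle.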
“Physicians are often faced with having to wade through a tangled web of symptoms and medical histories, unsure of which threads to pull, while balancing busy caseloads. Having a tool powered by AI that can methodically do it for them could be a game-changer,” said Alaleh Azhir, co-lead author and an internal medicine resident at Brigham and Women’s Hospital, a founding member of the Mass General Brigham healthcare system.
The new tool’s patient-centered diagnoses may also help alleviate biases built into current diagnostics for long COVID, said the researchers, who noted that diagnoses made with the official ICD-10 diagnostic code for long COVID skew toward those with easier access to healthcare.
The researchers said their tool is about 3 percent more accurate than diagnoses captured by ICD-10 codes, while being less biased. Specifically, their study showed that the individuals it identified as having long COVID mirror the broader demographic makeup of Massachusetts. By contrast, long COVID algorithms that rely on a single diagnostic code or on individual clinical encounters skew results toward certain populations, such as those with more access to care.
“This broader scope ensures that marginalized communities, often sidelined in clinical studies, are no longer invisible,” said Estiri.
One limitation of the study and the AI tool is that the health record data the algorithm uses to account for long COVID symptoms may be less complete than the data physicians capture in post-visit clinical notes.
Another limitation was that the algorithm did not capture possible worsening of a prior condition that may have been a long COVID symptom. For example, if a patient had COPD that worsened before they developed COVID-19, the algorithm might have removed the episodes even if they were long COVID indicators.
Declines in COVID-19 testing in recent years also make it difficult to identify when a patient may have first gotten COVID-19.
The study was limited to patients in Massachusetts.
Future studies may explore the algorithm in cohorts of patients with specific conditions, like COPD or diabetes. The researchers also plan to release the algorithm publicly as open access so physicians and healthcare systems globally can use it in their patient populations.
In addition to opening the door to better clinical care, this work may lay the foundation for future research into the genetic and biochemical factors behind long COVID’s various subtypes.
“Questions about the true burden of long COVID — questions that have thus far remained elusive — now seem more within reach,” said Estiri.
Funding: Support was given by the National Institutes of Health, National Institute of Allergy and Infectious Diseases (NIAID) R01AI165535, National Heart, Lung, and Blood Institute (NHLBI) OT2HL161847, and National Center for Advancing Translational Sciences (NCATS) UL1 TR003167, UL1 TR001881, and U24TR004111.
J. Hügel’s work was partially funded by a fellowship within the IFI program of the German Academic Exchange Service (DAAD) and by the Federal Ministry of Education and Research (BMBF), as well as by the German Research Foundation (426671079).
About this AI and long COVID research news
Author: MGB Communications
Contact: MGB Communications – Harvard
Original Research: Open access.
“Precision Phenotyping for Curating Research Cohorts of Patients with Post-Acute Sequelae of COVID-19 (PASC) as a Diagnosis of Exclusion” by Hossein Estiri et al. MedRxiv
Abstract
Scalable identification of patients with the post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms and the suboptimal accuracy, demographic biases, and underestimation of the PASC diagnosis code (ICD-10 U09.9).
In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying research cohorts of PASC patients, defined as a diagnosis of exclusion. We used longitudinal electronic health records (EHR) data from over 295 thousand patients from 14 hospitals and 20 community health centers in Massachusetts.
The algorithm employs an attention mechanism to exclude sequelae that prior conditions can explain. We performed independent chart reviews to tune and validate our precision phenotyping algorithm.
Our PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying Long COVID patients compared to the U09.9 diagnosis code.
Our algorithm identified a PASC research cohort of over 24 thousand patients (compared to about 6 thousand when using the U09.9 diagnosis code), with a 79.9 percent precision (compared to 77.8 percent from the U09.9 diagnosis code).
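To put these figures side by side, the short Python sketch below applies the definition of precision (true positives divided by all flagged cases) to the rounded numbers quoted above. The cohort sizes are approximations ("over 24 thousand", "about 6 thousand"), and applying the chart-review precision to each full cohort is an assumption made only for illustration, so the resulting counts are rough estimates rather than results from the study.

```python
# Rounded figures taken from the abstract; exact cohort sizes are not given here.
cohorts = {
    "precision phenotyping": {"size": 24_000, "precision": 0.799},
    "U09.9 diagnosis code": {"size": 6_000, "precision": 0.778},
}


def expected_true_positives(size: int, precision: float) -> float:
    """Precision = TP / (TP + FP), so TP is approximately size * precision."""
    return size * precision


for name, c in cohorts.items():
    tp = expected_true_positives(c["size"], c["precision"])
    print(f"{name}: roughly {tp:,.0f} of {c['size']:,} flagged patients are true cases")

# Difference between the two precision values, in percentage points.
gap_points = (0.799 - 0.778) * 100
print(f"Precision gap: {gap_points:.1f} percentage points")
```

Under these assumptions the larger cohort would contain roughly four times as many correctly identified patients as the code-based cohort, even though the difference in precision between the two methods is only about two percentage points.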
Our estimated prevalence of PASC was 22.8 percent, which is close to the national estimates for the region. We also provide an in-depth analysis outlining the clinical attributes, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC.
The PASC phenotyping method presented in this study boasts superior precision, accurately gauges the prevalence of PASC without underestimating it, and exhibits less bias in pinpointing Long COVID patients.
The PASC cohort derived from our algorithm will serve as a springboard for delving into Long COVID’s genetic, metabolomic, and clinical intricacies, surmounting the constraints of recent PASC cohort studies, which were hampered by their limited size and available outcome data.