Summary: Blood tests revealed specific epigenetic biomarkers for schizophrenia. Researchers applied machine learning to analyze genomic regions called CoRSIVs and identify the biomarkers. Testing the model on an independent data set showed that it can detect schizophrenia with 80% accuracy.
Source: Baylor College of Medicine
An innovative strategy that analyzes a region of the genome offers the possibility of early diagnosis of schizophrenia, reports a team led by researchers at Baylor College of Medicine. The strategy applied a machine learning algorithm called SPLS-DA to analyze specific regions of the human genome called CoRSIVs, hoping to reveal epigenetic markers for the condition.
In DNA from blood samples, the team identified epigenetic markers, a profile of methyl chemical groups in the DNA, that differ between people diagnosed with schizophrenia and people without the disease. They then developed a model that assesses an individual’s probability of having the condition. Testing the model on an independent dataset revealed that it can identify schizophrenia patients with 80% accuracy.
The study appears in the journal Translational Psychiatry.
“Schizophrenia is a devastating disease that affects about 1% of the world’s population,” said corresponding author Dr. Robert A. Waterland, professor of pediatrics – nutrition at the USDA/ARS Children’s Nutrition Research Center at Baylor and of molecular and human genetics. “Although genetic and environmental components seem to be involved in the condition, current evidence only explains a small portion of cases, suggesting that other factors, such as epigenetic, also could be important.”
Epigenetics is a system for molecular marking of DNA – it tells the different cells in the body which genes to turn on or off in that cell type. Epigenetic markers can therefore vary between different normal tissues within one individual. This makes it challenging to assess whether epigenetic changes contribute to diseases involving the brain, like schizophrenia.
To address this obstacle, Waterland and his colleagues had identified in previous work a set of specific genomic regions in which DNA methylation, a common epigenetic marker, differs between people but is consistent across different tissues in one person. They called these genomic regions CoRSIVs for correlated regions of systemic interindividual variation. They proposed that studying CoRSIVs is a novel way to uncover epigenetic causes of disease.
“Because methylation patterns in CoRSIVs are the same in all the tissues of one individual, we can analyze them in a blood sample to infer epigenetic regulation on other parts of the body that are difficult to assess, such as the brain,” Waterland said.
Many previous studies have analyzed methylation profiles in blood samples with the goal of identifying epigenetic differences between individuals with and without schizophrenia, the researchers explained.
“Our study is innovative in various ways,” said first author Dr. Chathura J. Gunasekara, computer scientist in the Waterland lab.
“We focused on CoRSIVs and also applied for the first time the SPLS-DA machine learning algorithm to analyze DNA methylation. As a scientist interested in applying machine learning to medicine, our findings are very exciting. They not only suggest the possibility of predicting risk of schizophrenia early in life, but also outline a new approach that may be applicable to other diseases.”
The current study also is innovative because it considered major potential confounding factors other studies did not take into account. For instance, methylation patterns in blood can be affected by factors such as smoking and taking antipsychotic medications, both of which are common in schizophrenia patients.
“Here, we took various approaches to evaluate whether the methylation patterns we detected at CoRSIVs were affected by medication use and smoking. We were able to rule that out,” Waterland said. “This, together with the fact that DNA methylation at CoRSIVs is established very early in life, indicates that the epigenetic differences we identified between schizophrenia patients and healthy individuals were there before the disease was diagnosed, suggesting they may contribute to the condition.”
Using this novel approach, the researchers detected much stronger epigenetic signals associated with schizophrenia than any previously reported, said the team.
“We consider our study a proof of principle that focusing on CoRSIVs makes epigenetic epidemiology possible,” Waterland said.
The following authors also contributed to this work: Eilis Hannon and Jonathan Mill at University of Exeter Medical School, Harry MacKay and Cristian Coarfa at Baylor College of Medicine, Andrew McQuillin at University College London and David St. Clair at University of Aberdeen.
Funding: This work was supported by NIH/NIDDK (grant number 1R01DK111522), the Cancer Prevention and Research Institute of Texas (grant number RP170295) and USDA/ARS (CRIS 3092-5-001-059).
About this machine learning and schizophrenia research news
A machine learning case–control classifier for schizophrenia based on DNA methylation in blood
Epigenetic dysregulation is thought to contribute to the etiology of schizophrenia (SZ), but the cell type-specificity of DNA methylation makes population-based epigenetic studies of SZ challenging. To train an SZ case–control classifier based on DNA methylation in blood, therefore, we focused on human genomic regions of systemic interindividual epigenetic variation (CoRSIVs), a subset of which are represented on the Illumina Human Methylation 450K (HM450) array.
HM450 DNA methylation data on whole blood of 414 SZ cases and 433 non-psychiatric controls were used as training data for a classification algorithm with built-in feature selection, sparse partial least squares discriminant analysis (SPLS-DA); application of SPLS-DA to HM450 data has not been previously reported.
Using the first two SPLS-DA dimensions we calculated a “risk distance” to identify individuals with the highest probability of SZ. The model was then evaluated on an independent HM450 data set on 353 SZ cases and 322 non-psychiatric controls.
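The abstract does not say how the “risk distance” is computed, so the following is only a rough illustration of the general idea of scoring individuals by their position in a two-dimensional latent space. It assumes, purely for illustration, that the distance is Euclidean and measured from the centroid of the control samples. The scores, the sample names, and the threshold are all made up and are not taken from the study.

```python
import math

def centroid(points):
    """Return the mean (x, y) of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def risk_distance(point, control_center):
    """ASSUMPTION: Euclidean distance from the control centroid.
    The study's actual definition is not given in the abstract."""
    return math.hypot(point[0] - control_center[0], point[1] - control_center[1])

# Toy scores on two latent dimensions (hypothetical, not study data).
control_scores = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, -1.0)]
test_scores = {"sample_a": (0.5, 0.5), "sample_b": (3.0, 4.0)}

center = centroid(control_scores)
distances = {name: risk_distance(xy, center) for name, xy in test_scores.items()}

# Hypothetical cutoff: only samples far from the controls are called as cases.
THRESHOLD = 2.0
flagged = [name for name, d in distances.items() if d > THRESHOLD]

print(distances)
print(flagged)
```

A cutoff of this kind makes calls only on the individuals who lie farthest from the controls, which is one plausible way a classifier could trade the number of calls it makes for a higher positive predictive value.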
Our CoRSIV-based model classified 303 individuals as cases with a positive predictive value (PPV) of 80%, far surpassing the performance of a model based on polygenic risk score (PRS). Importantly, risk distance (based on CoRSIV methylation) was not associated with medication use, arguing against reverse causality.
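Positive predictive value is the fraction of individuals called as cases who really are cases, which is not the same as overall accuracy. The arithmetic implied by the reported figures can be sketched as below. The true-positive count is an approximation back-calculated from the rounded 80% figure, not a number stated in the abstract.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): the share of positive calls that are correct."""
    return true_positives / (true_positives + false_positives)

# Figures stated in the abstract.
called_cases = 303          # individuals the model classified as cases
reported_ppv = 0.80         # reported positive predictive value
test_set_size = 353 + 322   # SZ cases plus controls in the independent set

# ASSUMPTION: counts back-calculated from the rounded PPV, so approximate.
approx_true_positives = round(called_cases * reported_ppv)
approx_false_positives = called_cases - approx_true_positives

ppv = positive_predictive_value(approx_true_positives, approx_false_positives)
fraction_called = called_cases / test_set_size

print(approx_true_positives, approx_false_positives)
print(round(ppv, 3), round(fraction_called, 3))
```

Under these assumptions, roughly 242 of the 303 positive calls would be true cases, and the model made a positive call on a little under half of the 675 individuals in the independent data set.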
Risk distance and PRS were positively correlated (Pearson r = 0.28, P = 1.28 × 10⁻¹²), and mediational analysis suggested that genetic effects on SZ are partially mediated by altered methylation at CoRSIVs.
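For readers unfamiliar with the statistic, here is a minimal sketch of how a Pearson correlation coefficient is computed. The input values are toy numbers, not study data, and the variable names are hypothetical. The last line squares the reported r to show how much variance the two measures share.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Toy values (hypothetical): polygenic risk scores and risk distances.
toy_prs = [0.1, 0.4, 0.2, 0.9, 0.5]
toy_risk_distance = [1.0, 1.2, 1.5, 1.9, 1.1]
print(pearson_r(toy_prs, toy_risk_distance))

# Squaring the reported r = 0.28 gives the shared variance, about 8%.
shared_variance = 0.28 ** 2
print(round(shared_variance, 4))
```

An r of 0.28 therefore corresponds to only about 8% shared variance, so the two measures are related but far from redundant.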
Our results indicate two innate dimensions of SZ risk: one based on genetic, and the other on systemic epigenetic variants.