Summary: Researchers have developed a faster, more precise method of identifying which of a person’s genes could be associated with a particular disease.
Source: University of Cambridge.
A faster and more accurate method of identifying which of an individual’s genes are associated with particular symptoms has been developed by a team of researchers from the UK and Saudi Arabia. This new approach could enable scientists to take advantage of recent developments in genome sequencing to improve diagnosis and potential treatment options.
Around 80% of rare diseases are thought to have a genetic component, but currently many patients experience long delays in diagnosis or never receive a diagnosis at all. Recent developments in our ability to obtain whole or partial genome sequences cheaply and efficiently now make it feasible for patients to benefit from this technology through cheaper, faster diagnosis of disease and the development of new therapies. Many international projects, such as the UK 100,000 Genomes Project, are seeking to capitalise on these developments by sequencing hundreds of thousands of individuals. However, a major challenge remains: how to associate changes in a patient’s DNA with their disease.
“The challenge for scientists is to identify which of the hundreds of thousands of genetic differences between a patient and an unaffected individual might be responsible for their disease,” says Dr Paul Schofield from the Department of Physiology, Development and Neuroscience at the University of Cambridge. “Given the huge complexity of this problem, it has been described as ‘looking for needles in stacks of needles’.”
Now, Dr Schofield and a team of researchers from the UK and Saudi Arabia have developed an algorithm, published this week in the journal PLOS Computational Biology, that can identify variants that modify the normal function of a gene associated with a particular disease.
A framework developed by the team, called PhenomeNET, matches a patient’s phenotype (symptoms) to a large database of gene-to-phenotype associations, including those from studies involving mice and zebrafish, in order to identify disease-causing genes.
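The matching idea can be illustrated with a minimal Python sketch. It is not PhenomeNET itself: the article does not describe how the framework measures similarity, so plain set overlap (Jaccard similarity) is used here as a stand-in, and the gene names, phenotype labels, and the helper `rank_genes` are all hypothetical.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two phenotype sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical gene-to-phenotype associations (illustrative only, not real data).
GENE_PHENOTYPES = {
    "GENE_A": {"enlarged thyroid", "low thyroid hormone", "growth delay"},
    "GENE_B": {"hearing loss", "growth delay"},
    "GENE_C": {"abnormal heart rhythm"},
}


def rank_genes(patient: set[str], db: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Score every gene by phenotype overlap with the patient, best match first."""
    scored = [(gene, jaccard(patient, phenotypes)) for gene, phenotypes in db.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


patient_phenotypes = {"enlarged thyroid", "low thyroid hormone"}
ranking = rank_genes(patient_phenotypes, GENE_PHENOTYPES)
for gene, score in ranking:
    print(f"{gene}: {score:.2f}")
```

Here the gene whose recorded phenotypes overlap most with the patient's symptoms comes out on top. A real system would also need to recognise that differently worded phenotypes, including those recorded in mice or zebrafish, can describe related features, which simple string overlap cannot do.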
Mice and zebrafish are commonly used when studying the biology underlying human diseases as they have a number of important genetic and biological similarities to us. For many years, data on the consequences of naturally occurring and experimentally induced genetic variants in these animal models have been collected, resulting in huge ‘Big Data’ resources that associate genetic makeup with phenotype, such as the Mouse Genome Database, which contains more than 60,000 of these associations.
By combining PhenomeNET with methods that find harmful variants in a genomic sequence, the team developed the PhenomeNET Variant Predictor (PVP) system, an algorithm that prioritises these variants according to their likelihood of involvement in human disease.
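As a rough illustration of what combining the two signals could look like, the Python sketch below ranks variants by multiplying a per-variant harmfulness score by a per-gene phenotype relevance score. This is an assumption made for illustration only: the article does not say how PVP combines its inputs, and the `Variant` class, the `prioritise` function, and every score shown are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    variant_id: str
    gene: str
    pathogenicity: float  # hypothetical harmfulness score between 0 and 1


def prioritise(variants: list[Variant], gene_relevance: dict[str, float]) -> list[Variant]:
    """Rank variants by harmfulness times phenotype relevance of their gene.

    The product is an assumed combination rule, not the one used by PVP.
    Genes with no phenotype evidence get a relevance of 0.
    """
    def score(v: Variant) -> float:
        return v.pathogenicity * gene_relevance.get(v.gene, 0.0)

    return sorted(variants, key=score, reverse=True)


# Illustrative inputs only: made-up variants and made-up relevance scores.
variants = [
    Variant("var1", "GENE_A", 0.60),
    Variant("var2", "GENE_B", 0.95),
    Variant("var3", "GENE_A", 0.20),
]
gene_relevance = {"GENE_A": 0.9, "GENE_B": 0.1}

ranked = prioritise(variants, gene_relevance)
for v in ranked:
    print(v.variant_id, v.gene)
```

In this toy example the variant predicted to be most harmful in isolation (`var2`) drops to the bottom because its gene has little connection to the patient's symptoms, which is the intuition behind using phenotype data to narrow down a long list of candidate variants.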
“Our algorithm makes use of clinical and experimental data that have been collected for years and uses them to identify the genetic variants underlying the conditions of patients with genetic disorders,” adds Professor Robert Hoehndorf from King Abdullah University of Science and Technology (KAUST) in Saudi Arabia.
Working with Dr Nadia Schoenmakers at the Wellcome Trust-Medical Research Council Institute of Metabolic Science in Cambridge, the team was able to show that the new algorithm can identify genetic changes in patients with congenital thyroid disease, and can reveal candidate genetic changes in ‘Mendelian’ diseases where only a single gene is involved.
“We’ve shown that our algorithm works for simpler diseases and now the real test will be to determine whether a similar approach can be applied to complex diseases, such as diabetes, where multiple genes are involved,” says Professor George Gkoutos from the University of Birmingham.
Source: Craig Brierley – University of Cambridge
Original Research: Full open access research for “Semantic prioritization of novel causative genomic variants” by Imane Boudellioua, Rozaimi B. Mahamad Razali, Maxat Kulmanov, Yasmeen Hashish, Vladimir B. Bajic, Eva Goncalves-Serra, Nadia Schoenmakers, Georgios V. Gkoutos, Paul N. Schofield, Robert Hoehndorf in PLOS Computational Biology. Published online April 17 2017 doi:10.1371/journal.pcbi.1005500
Semantic prioritization of novel causative genomic variants
Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.