Big Data Analysis Identifies New Cancer Risk Genes

Summary: A newly developed statistical method has allowed researchers to identify 13 cancer predisposition risk genes, 10 of which, the scientists say, are new discoveries.

Source: Center for Genomic Regulation.

There are many genetic causes of cancer: while some mutations are inherited from your parents, others are acquired throughout your life due to external factors or mistakes in copying DNA. Large-scale genome sequencing has revolutionised the identification of cancers driven by the latter group of mutations – somatic mutations – but it has been less effective at identifying the inherited genetic variants that predispose to cancer. The main source for identifying these inherited mutations is still family studies.

Now, three researchers at the Centre for Genomic Regulation (CRG) in Barcelona, led by the ICREA Research Professor Ben Lehner, have developed a new statistical method to identify cancer predisposition genes from tumour sequencing data. “Our computational method uses an old idea that cancer genes often require ‘two hits’ before they cause cancer. We developed a method that allows us to systematically identify these genes from existing cancer genome datasets,” explains Solip Park, first author of the study and Juan de la Cierva postdoctoral researcher at the CRG.

The method allows researchers to find risk variants without a control sample, meaning that they do not need to compare cancer patients to groups of healthy people. “Now we have a powerful tool to detect new cancer predisposition genes and, consequently, to contribute to improving cancer diagnosis and prevention in the future,” adds Park.

The work, which is published in Nature Communications, presents their statistical method ALFRED and identifies 13 candidate cancer predisposition genes, of which 10 are new. “We applied our method to the genome sequences of more than 10,000 cancer patients with 30 different tumour types and identified known and new possible cancer predisposition genes that have the potential to contribute substantially to cancer risk,” says Ben Lehner, principal investigator of the study.

Three researchers at the Centre for Genomic Regulation (CRG) identified new cancer risk genes using only publicly available data. Data sharing is key for genomic research to become more open, responsible and efficient. Image is credited to Jonathan Bailey, NHGRI.

“Our results show that the new cancer predisposition genes may have an important role in many types of cancer. For example, they were associated with 14% of ovarian tumours, 7% of breast tumours and about 1 in 50 of all cancers. For example, inherited variants in one of the newly proposed risk genes – NSD1 – may be implicated in at least 3 out of 1,000 cancer patients,” explains Fran Supek, CRG alumnus and currently group leader of the Genome Data Science laboratory at the Institute for Research in Biomedicine (IRB Barcelona).

When sharing is key to advancing knowledge

The researchers worked with genome data from several cancer studies from around the world, including The Cancer Genome Atlas (TCGA) project, as well as from several projects unrelated to cancer research. “We managed to develop and test a new method that hopefully will improve our understanding of cancer genomics and will contribute to cancer research, diagnostics and prevention just by using public data,” states Solip Park.

Ben Lehner adds, “Our work highlights how important it is to share genomic data. It is a success story for how being open is far more efficient and has a multiplier effect. We combined data from many different projects and by applying a new computational method were able to identify important cancer genes that were not identified by the original studies. Many patient groups lobby for better sharing of genomic data because it is only by comparing data across hospitals, countries and diseases that we can obtain a deep understanding of many rare and common diseases. Unfortunately, many researchers still do not share their data and this is something we need to actively change as a society”.

About this neuroscience research article

Funding: This study was funded by the European Research Council, AXA Research Fund, Spanish Ministry of Economy and Competitiveness, Centro de Excelencia Severo Ochoa, and Agència de Gestió d’Ajuts Universitaris i de Recerca.

Source: Laia Cendros – Center for Genomic Regulation
Image Source: Jonathan Bailey, NHGRI.
Original Research: Open access research for “Systematic discovery of germline cancer predisposition genes through the identification of somatic second hits” by Solip Park, Fran Supek & Ben Lehner in Nature Communications. Published July 4 2018.

Cite This Article

MLA: Center for Genomic Regulation. “Big Data Analysis Identifies New Cancer Risk Genes.” NeuroscienceNews. NeuroscienceNews, 12 July 2018.
APA: Center for Genomic Regulation (2018, July 12). Big Data Analysis Identifies New Cancer Risk Genes. NeuroscienceNews. Retrieved July 12, 2018.
Chicago: Center for Genomic Regulation. “Big Data Analysis Identifies New Cancer Risk Genes.” (accessed July 12, 2018).


Systematic discovery of germline cancer predisposition genes through the identification of somatic second hits

The genetic causes of cancer include both somatic mutations and inherited germline variants. Large-scale tumor sequencing has revolutionized the identification of somatic driver alterations but has had limited impact on the identification of cancer predisposition genes (CPGs). Here we present a statistical method, ALFRED, that tests Knudson’s two-hit hypothesis to systematically identify CPGs from cancer genome data. Applied to ~10,000 tumor exomes the approach identifies known and putative CPGs – including the chromatin modifier NSD1 – that contribute to cancer through a combination of rare germline variants and somatic loss-of-heterozygosity (LOH). Rare germline variants in these genes contribute substantially to cancer risk, including to ~14% of ovarian carcinomas, ~7% of breast tumors, ~4% of uterine corpus endometrial carcinomas, and to a median of 2% of tumors across 17 cancer types.
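
To make the two-hit idea concrete, the sketch below asks a simplified version of the question: in a set of tumours, do carriers of a rare germline variant in a gene show somatic LOH at that gene more often than chance would predict? This is not the ALFRED implementation. It substitutes a plain one-sided hypergeometric (Fisher-style) enrichment test, and the function name and all counts are hypothetical values chosen only for illustration.

```python
from math import comb


def two_hit_enrichment_pvalue(carriers_with_loh, n_carriers, n_loh, n_tumours):
    """One-sided hypergeometric tail P(X >= carriers_with_loh).

    X is the number of germline-variant carriers that would also show
    somatic LOH at the gene if LOH were unrelated to carrier status.
    Hypothetical helper for illustration; not the published ALFRED test.
    """
    total_ways = comb(n_tumours, n_loh)
    max_overlap = min(n_carriers, n_loh)
    favourable = sum(
        comb(n_carriers, i) * comb(n_tumours - n_carriers, n_loh - i)
        for i in range(carriers_with_loh, max_overlap + 1)
    )
    return favourable / total_ways


# Hypothetical counts (assumptions, not data from the study):
# 1,000 tumours, 20 carry a rare germline variant in the gene,
# 150 show LOH at the gene, and 12 tumours have both.
p_value = two_hit_enrichment_pvalue(12, 20, 150, 1000)
print(f"one-sided p-value for germline variant + LOH co-occurrence: {p_value:.3g}")
```

With these made-up counts, only about 3 of the 20 carriers would be expected to show LOH by chance, so observing 12 gives a very small p-value. A small p-value of this kind is the sort of signal that would flag a gene as a candidate for the "germline variant plus somatic second hit" pattern described in the abstract.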
