Researchers identify thousands of genetic variants, many of which can be linked to specific diseases

By guest author

An illustration of single-cell RNA sequencing (scRNA-seq), a powerful analysis method that gives researchers detailed insights into levels of gene expression in individual cells. Credit: Tobias Wüstefeld/BlueClay Studios

Induced pluripotent stem cells (iPSCs) are well suited to discovering the genes that underlie both complex and rare genetic diseases. Scientists from the German Cancer Research Center (DKFZ) and the European Molecular Biology Laboratory (EMBL), together with international partners, have studied genotype–phenotype relationships in iPSCs using data from approximately 1,000 donors.

Tens of thousands of tiny genetic variations (SNPs, single-nucleotide polymorphisms) that are associated with specific diseases have been identified in the human genome. Many of these variants are not located in the protein-coding regions of genes but instead affect regulatory regions. Scientists are therefore trying to find out whether, and in which tissues, these variants can be linked to changes in the activity of specific genes.

Typically, such analyses are performed in blood cells or tissue biopsies, depending on the type of disease. “Pluripotent stem cells, however, might be better suited for this purpose in many cases, as they are undifferentiated and therefore reflect the ancestral state of all cells,” says Oliver Stegle, division head at DKFZ and a group leader at EMBL. “Stem cells could be particularly relevant when searching for the cause of diseases that occur early in development.” Pluripotent stem cells can be generated in the culture dish from normal body cells obtained, for example, from a blood sample. Because they are produced in this way rather than occurring naturally, they are referred to as induced pluripotent stem cells, or iPSCs.

Together with scientists from Stanford University and additional international cooperation partners, Oliver Stegle’s team has compiled sequence and transcriptome data on iPSCs from around 1,000 donors. The researchers systematically examined these data to identify correlations between individual genetic variants and altered expression patterns in stem cells. The results have now been published in the journal Nature Genetics.
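The idea of correlating a genetic variant with gene expression can be illustrated with a minimal sketch. It assumes each donor's genotype at one variant is coded as the number of copies of the alternative allele (0, 1 or 2) and fits a simple least-squares line to expression values. The function name and the numbers are made up for illustration, and the statistical model actually used in the study is not described in this article.

```python
from statistics import mean

def association(dosages, expression):
    """Least-squares slope and Pearson correlation between
    allele dosage (0/1/2) and expression of one gene."""
    mx, my = mean(dosages), mean(expression)
    sxx = sum((x - mx) ** 2 for x in dosages)
    syy = sum((y - my) ** 2 for y in expression)
    sxy = sum((x - mx) * (y - my) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage and expression must both vary across donors")
    slope = sxy / sxx                # expression change per allele copy
    r = sxy / (sxx * syy) ** 0.5     # strength of the linear relationship
    return slope, r

# Hypothetical data: six donors, one variant, one gene
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, r = association(dosages, expression)
```

A slope clearly different from zero suggests that expression changes with each additional copy of the allele. A real analysis would also have to account for population structure, technical covariates and testing many variant–gene pairs at once.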

For more than 67% of all genes active in iPSCs, the researchers found differential expression patterns depending on genetic variants. Many of these associations are novel and have not been described in somatic cell types before. For over 4,000 of these associations, it was possible to link the genetic variants responsible for the altered expression patterns to specific diseases. These included, for example, variants associated with coronary heart disease, lipid metabolism disorders, or hereditary cancers.

Stegle and colleagues also investigated whether iPSCs are suitable for identifying the causative genes of rare genetic diseases. They used iPSC lines from 65 patients with various rare diseases whose causal gene variants were already known from previous analyses. In the transcriptome data of these iPSC lines, the scientists searched for particularly conspicuous outliers in the expression pattern. These analyses reliably led them to the genetic basis of the disease. “Such screenings were previously impossible because there were simply no sufficiently large reference collections of iPSC transcriptomes,” explains Marc Jan Bonder, first author of the study. “We were surprised to find such a large number of disease-associated genetic variants that are already visible in the expression pattern at the earliest time point of cell differentiation, represented by the iPSCs.” Until now, the relevance of iPSCs for such biomedical analyses has been significantly underestimated.
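The search for expression outliers against a reference collection can be sketched in a few lines. This assumes a simple per-gene z-score: how many standard deviations the patient's expression lies from the mean of the reference lines. The gene names, values and threshold are hypothetical, and the outlier method actually used by the researchers is not specified here.

```python
from statistics import mean, stdev

def expression_outliers(reference, patient, threshold=3.0):
    """Return {gene: z-score} for genes whose expression in the patient
    line deviates strongly from the reference collection."""
    flagged = {}
    for gene, values in reference.items():
        sd = stdev(values)
        if sd == 0:
            continue  # no variation in the reference, z-score undefined
        z = (patient[gene] - mean(values)) / sd
        if abs(z) > threshold:
            flagged[gene] = z
    return flagged

# Hypothetical expression values for two genes in five reference lines
reference = {
    "GENE_A": [10.0, 11.0, 9.0, 10.5, 9.5],
    "GENE_B": [5.0, 5.2, 4.8, 5.1, 4.9],
}
patient = {"GENE_A": 10.2, "GENE_B": 1.0}
outliers = expression_outliers(reference, patient)
```

In this toy example only GENE_B is flagged, because the patient's value lies far below every reference line. The sketch also shows why a large reference collection matters: the mean and spread for each gene are only reliable when they are estimated from many lines.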

In a companion paper published in the same issue of Nature Genetics, Stegle and colleagues from EMBL’s European Bioinformatics Institute (EMBL-EBI) and the Wellcome Sanger Institute used more than 200 iPSC lines to investigate how genetic variants affect differentiation into neuronal cells.

The scientists performed single-cell RNA sequencing at different time points of neuronal cell differentiation. This allowed them to analyse how genetic variants affect expression patterns in different cellular states, including different neuronal cell types. “The study demonstrates the power of combining single-cell sequencing with iPSC technologies to dissect the effect of genetic variants in cell types that would otherwise be inaccessible,” Stegle explains.

This article was originally published by DKFZ.
