FRIMEDBIO-Fri prosj.st. med.,helse,biol

Evolutionary and functional importance of simple repeats in the genome

Alternative title: Evolusjon av funksjon av enkle repetisjoner i genomet

Awarded: NOK 12.5 mill.

WP1. The study of STR length variation and its effect on gene expression, reported as submitted in the previous report, has now been published in Plant Cell (Reinar et al. 2021). This work has attracted considerable attention. In Arabidopsis, we have also completed the experiment in which plants were exposed to temperature stress over nine generations. Three parallel replicates from each generation have been whole-genome sequenced, and the results show that tandem repeat length variation accumulates to a greater extent at high temperature than at normal temperature. We also see that length variations are more frequent than point mutations. This has implications for understanding how plants respond to climate change. An article is being prepared. In cod, we have investigated the relationship between length variation in coding STRs and environmental variables in the Baltic Sea (low salinity) and the North Sea (high salinity). We find a number of genes that show such variation in their protein-coding sequence, among them genes that can be linked to transcriptional regulation, stress response and circadian rhythms (day length, seasons, light, etc.). Many of these candidate genes, which are being investigated further in WP2, contain trinucleotide repeats that code for the amino acid glutamine.

In WP2, we have studied genes that have tandem repeats in their amino acid coding sequence and shown that these regions of the protein can be linked to regions that lack a defined structure and that can change as a result of length variation. The amino acids glutamine and asparagine occur in such length variations, which in turn are correlated with environmental conditions. Furthermore, we have shown experimentally that a transcription factor (TCP14) is affected both in its interaction with another protein and in its ability to activate transcription. These results are under review for publication (Reinar et al. 2023). In cod, we have examined a number of candidate genes. These genes have been introduced into human cell lines, into plants or into a cod cell line. The results will be published. We have also started experiments using CRISPR-Cas editing of the corresponding genes in medaka.

In WP3, we have studied genomes from a large number of eukaryotes and can show that most of the examined species have a statistically significantly higher frequency of simple tandem repeats around the transcription start site. We plan to publish this in the first half of 2023.

The outcomes described in the original application ("Expected outcome" for WP1, 2 and 3) have all been achieved. I have checked this carefully, and it was remarkable to see that every outcome had been reached. In fact, we have achieved results beyond the outcomes in the original project description: for example, we have been able to functionally test more Arabidopsis genes with an effect on development and gene regulation than we had anticipated. Furthermore, owing to developments in genomics, the survey across the Tree of Life has become far more comprehensive than we could foresee. We did not describe the use of CRISPR-Cas in the application, but it is a reality in the reported project. The main outcome of the project is that it has shown that simple tandem repeats and their length variation, in both non-coding regions (outside genes) and coding regions, affect gene regulation and protein function and are associated with environmental and biotechnology conditions, and that this has also been demonstrated experimentally for selected genes. The impacts are substantial. First, since the presence of simple tandem repeats inside genes and in regulatory regions is universal, they are likely to affect evolution and adaptation in most organisms. Second, since simple tandem repeat variation accumulates substantially faster than single nucleotide polymorphisms, this overlooked type of mutation needs to be investigated in all future genomics-based projects (addressing evolution, behavior, life history traits and adaptations). Third, since short tandem repeat length variation is a type of structural variation in the genome, our results lend support to the growing evidence that structural variation (insertions/deletions, duplications, inversions, recombinations) is crucial for understanding the genotype-phenotype enigma. From an applied perspective, simple tandem repeat length variations will be crucial in the management of local populations and for understanding plant, animal and human disease susceptibility and behavior.

More than 150 years after Darwin published his famous work 'On the Origin of Species', the causal relationship between the genotype (genome) and the phenotype (phenome) is still basically a mystery. In particular, even though the role of natural selection in evolution is widely accepted, we do not understand how changes in the phenotype relate to genetic change and how this may cause adaptation and speciation under natural selection. However, what we do understand better, thanks to recent whole-genome investigations using high-throughput sequencing (HTS), is the dynamic nature of genome architectural changes. These include gene copy number changes, inversions, transposable element dynamics and simple repeat variations. Here we propose to investigate variations in simple trinucleotide repeats residing inside genes (coding) and in their vicinity (or in introns). We will relate such length variations to functional modulation of regulatory mechanisms affecting the phenotype. Specifically, we will test the hypothesis that hypervariable coding/regulatory repeats promote the ability of a species or population to adapt to a changing environment. The project is cross-disciplinary and will use genomic, bioinformatic, statistical and experimental approaches. The goal is to understand how new mechanisms drive genomic architecture and divergence, taking into account fluctuations in the selection regimes. We aim to obtain new fundamental biological insights as well as novel bioinformatic and statistical methodology.

Publications from Cristin

No publications found

Funding scheme:

FRIMEDBIO-Fri prosj.st. med.,helse,biol