Review

The regulatory roles of small nucleolar RNAs within their host locus

Pages 1-11 | Accepted 08 Apr 2024, Published online: 16 Apr 2024

ABSTRACT

Small nucleolar RNAs (snoRNAs) are a class of conserved noncoding RNAs forming complexes with proteins to catalyse site-specific modifications on ribosomal RNA. Besides this canonical role, several snoRNAs are now known to regulate diverse levels of gene expression. While these functions are carried out in trans by mature snoRNAs, evidence has also been emerging of regulatory roles of snoRNAs in cis, either within their genomic locus or as longer transcription intermediates during their maturation. Herein, we review recent findings that snoRNAs can interact in cis with their intron to regulate the expression of their host gene. We also explore the ever-growing diversity of longer host-derived snoRNA extensions and their functional impact across the transcriptome. Finally, we discuss the role of snoRNA duplications in forging these new layers of snoRNA-mediated regulation, as well as their involvement in the genomic imprinting of their host locus.

Introduction

Small nucleolar RNAs (snoRNAs) are a vast group of mid-size noncoding RNAs present in all eukaryotes and most extensively characterized for their role in ribosome biogenesis [Citation1]. They are typically divided into two subclasses, the C/D and H/ACA box snoRNAs, which differ in the motifs they harbour, the secondary structure they adopt and the core binding proteins with which they interact [Citation2]. The naming convention for genes of these two subclasses in human is SNORD# and SNORA# for C/D and H/ACA box snoRNAs, respectively, where # represents a number (sometimes followed by a letter) usually reflecting their order of discovery (e.g. SNORD13, SNORA19, SNORD50A, SNORD50B, etc.) [Citation3]. C/D box snoRNAs are defined by the presence of boxes C (RUGAUGA, where R is a purine) and D (CUGA) located, respectively, at their 5’ and 3’ ends, which interact through non-canonical base pairing to form a characteristic kink-turn structure [Citation4]. Although usually more degenerate than the C and D motifs, additional C’ and D’ boxes are often found near the middle of C/D box snoRNAs [Citation4]. In contrast, H/ACA box snoRNAs are characterized by the presence of two hairpins separated by a hinge or H box (ANANNA, where N is any nucleotide) and terminated by an ACA box located 3 nucleotides upstream of their 3’ end [Citation5]. Through interactions with their respective core proteins and enzymes, snoRNAs form snoRNP (snoRNA ribonucleoprotein) complexes which catalyse specific modifications on target RNAs [Citation6]. In particular, C/D box snoRNAs guide 2’-O-ribose methylation catalysed by the methyltransferase fibrillarin (FBL) and H/ACA box snoRNAs guide pseudouridylation catalysed by the pseudouridine synthase dyskerin (DKC1).
The specificity with which a snoRNP interacts with its target is intrinsically linked to antisense elements (ASEs) located upstream of the D and D’ boxes for C/D box snoRNAs or in internal hairpin bulges for H/ACA box snoRNAs [Citation7,Citation8]. These ASEs are responsible for the base-pairing to a complementary target sequence and typically vary in length between 5 and 20 nucleotides. The best characterized snoRNA targets are ribosomal RNA (rRNA) and small nuclear RNAs (snRNAs), the snoRNA-guided modifications on these targets being crucial regulators of ribosome and spliceosome assembly [Citation9,Citation10]. Notably, most snRNA modifications are guided by a special class of snoRNAs located in the Cajal body (scaRNAs), which can be composed of either or both C/D and H/ACA motifs [Citation9]. However, many snoRNAs still have no known target to this day, earning them the title of ‘orphan’ snoRNAs.
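The box motifs described above can be expressed as simple sequence patterns. The following Python sketch is only an illustration of those consensus definitions, not a validated snoRNA predictor: the function name, the 20-nucleotide search window for the terminal C and D boxes and the toy sequences are all assumptions made for this example, and real boxes are often degenerate and depend on secondary structure, which this sketch ignores.

```python
import re

# Consensus motifs as given in the text (R = purine, N = any nucleotide).
C_BOX = re.compile(r"[AG]UGAUGA")                # box C: RUGAUGA
D_BOX = re.compile(r"CUGA")                      # box D: CUGA
H_BOX = re.compile(r"A[ACGU]A[ACGU][ACGU]A")     # box H: ANANNA


def guess_snorna_class(seq, end_window=20):
    """Return 'C/D', 'H/ACA' or 'unknown' from consensus motifs alone.

    `end_window` (assumed value) is how far from each end the C and D
    boxes are searched for.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input

    # C/D: box C near the 5' end and box D near the 3' end.
    has_cd = (C_BOX.search(seq[:end_window]) is not None
              and D_BOX.search(seq[-end_window:]) is not None)

    # H/ACA: ACA located exactly 3 nt upstream of the 3' end,
    # with an H box somewhere upstream of it.
    has_haca = (len(seq) >= 6
                and seq[-6:-3] == "ACA"
                and H_BOX.search(seq[:-6]) is not None)

    if has_cd and not has_haca:
        return "C/D"
    if has_haca and not has_cd:
        return "H/ACA"
    return "unknown"


# Toy sequences (not real snoRNAs), built only to exercise each branch.
toy_cd = "GG" + "AUGAUGA" + "C" * 30 + "CUGA" + "GG"
toy_haca = "G" * 20 + "AGAGGA" + "G" * 20 + "ACA" + "GGG"
print(guess_snorna_class(toy_cd), guess_snorna_class(toy_haca))
```

Sequences matching both or neither set of motifs are reported as 'unknown', which mirrors the 'unknown type' category used for some annotated snoRNAs.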

Combining computational prediction and experimental validation strategies, several reports in the last decades have led to the expansion of the spectrum of snoRNA interactors beyond their canonical targets and snoRNP core proteins (reviewed in [Citation11–13]). Indeed, several transfer RNAs (tRNAs) were reported to interact with C/D box snoRNAs, oftentimes leading to their 2’-O-methylation which regulates their cellular fate [Citation14,Citation15]. In addition, some snoRNAs were found to interact in trans with pre-messenger RNAs or messenger RNAs (mRNA), most of these examples underlining the capacity of snoRNAs to regulate the splicing, stability and translation of their target by binding to regulatory elements and in some cases through a snoRNA-guided 2’-O-methylation of the target [Citation16–20]. Interestingly, some C/D box snoRNAs were also observed to guide another type of modification, i.e. the acetylation of rRNA in the budding yeast as well as in animals such as the zebrafish and human [Citation21,Citation22].

Remarkably, the genomic location and expression strategies of snoRNAs vary significantly depending on the species and snoRNA type (Figure 1; Supplementary Table S1; see Methods in Supplementary Material) [Citation23,Citation24]. For instance, in human and in several animals, expressed snoRNAs are mainly embedded within the introns of protein-coding host genes whose functions are related to ribosome assembly, translation regulation and RNA processing, as well as in noncoding host genes (i.e. long noncoding RNAs (lncRNAs)) (Figure 1(b), donut charts) [Citation24]. Following the transcription and splicing of the host gene, the intronic lariat containing the snoRNA is linearized by a debranching enzyme such as DBR1 [Citation25]. The intron remnants that flank the snoRNA are then trimmed by exonucleases up to the mature snoRNA 5’ and 3’ ends, which are protected from degradation by bound core snoRNP proteins [Citation26]. It is also worth mentioning that although most intronic snoRNAs are processed via this mechanism, some intronic snoRNAs were shown to be produced through a splicing-independent process in frog oocytes as well as in in vitro assays in human cell lines [Citation27,Citation28]. Although snoRNAs predominantly follow a one-snoRNA-per-intron rule (Figure 1(b), right bar chart for each species), host genes in animal genomes commonly harbour multiple snoRNAs distributed throughout different introns [Citation23,Citation24]. Interestingly, higher eukaryotes such as mammals generally show a greater proportion of intergenic snoRNAs than lower eukaryotic animals, which favour mostly intronic snoRNAs (Figure 1(b), right bar chart for each species; compare the five species furthest to the left with the five furthest to the right). In contrast to this genomic architecture, most fungal snoRNAs exist as mono-intergenic snoRNAs, i.e. as independent transcriptional units with their own promoter [Citation29,Citation30], whereas plant snoRNAs are usually organized in intergenic clusters for which all snoRNAs are transcribed at once from one independent promoter (compare the right bar charts in Figure 1(b) to those in the middle and left panels in Figure 1(c)) [Citation31,Citation32]. Notably, snoRNA annotations in protist species as well as in some fungi are still clearly lacking comprehensiveness compared to other eukaryotic kingdoms, as shown by the low number or even the absence of annotated snoRNA genes in these species (Figure 1(c), right panel). In addition to species-specific organization, snoRNA location also varies according to the snoRNA type. For instance, in human, most H/ACA box snoRNAs are encoded alone in their host gene, whereas host genes of C/D box snoRNAs usually harbour more than one snoRNA across their introns [Citation24]. Interestingly, a greater proportion of H/ACA box snoRNAs is observed in mammals compared to other types of animals and species from other eukaryotic kingdoms, which usually harbour more C/D box snoRNAs (Figure 1, compare the left bar chart for each species). These observations support the hypothesis that H/ACA and C/D box snoRNA genes propagate in genomes using different strategies (which are differentially used depending on the species), with the former favouring retrotransposition and the latter favouring cis-recombination [Citation33–36].
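The organization categories used in this section and in Figure 1(a) (mono-intergenic, intergenic cluster, mono-intronic, intronic cluster, and their rarer exonic counterparts) amount to asking whether a snoRNA shares its transcription unit, intron or exon with another snoRNA. A minimal Python sketch of that grouping logic follows. The input format, the function name and the identifiers are hypothetical and chosen for illustration only; this is not the annotation pipeline described in the Supplementary Methods.

```python
from collections import Counter


def label_organization(snornas):
    """Label each snoRNA as 'mono-<context>' or '<context> cluster'.

    `snornas` is a list of (name, context, unit_id) tuples (a hypothetical
    format), where `context` is 'intergenic', 'intronic' or 'exonic' and
    `unit_id` identifies the shared unit: the independent transcription
    unit for intergenic snoRNAs, or the host intron/exon otherwise.
    """
    # Count how many snoRNAs share each (context, unit) pair.
    unit_sizes = Counter((context, unit_id) for _, context, unit_id in snornas)

    labels = {}
    for name, context, unit_id in snornas:
        if unit_sizes[(context, unit_id)] == 1:
            labels[name] = f"mono-{context}"
        else:
            labels[name] = f"{context} cluster"
    return labels


# Hypothetical example: two snoRNAs sharing one intron, one alone in
# another intron of the same host gene, and one independent unit.
example = [
    ("snoA", "intronic", "geneX:intron2"),
    ("snoB", "intronic", "geneX:intron2"),
    ("snoC", "intronic", "geneX:intron5"),
    ("snoD", "intergenic", "unit1"),
]
print(label_organization(example))
```

Note that in this scheme a host gene carrying several snoRNAs in different introns still yields mono-intronic snoRNAs, consistent with the one-snoRNA-per-intron pattern described above.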

Figure 1. The genomic organization of snoRNAs across eukaryotic model organisms. (a) SnoRNAs can be encoded as independently transcribed units in intergenic regions, either alone as mono-intergenic snoRNAs or as co-transcribed intergenic snoRNA clusters. They can also be embedded within host genes, usually within their introns, relying on their host gene transcription to be expressed. Mono-intronic snoRNAs are encoded alone in their intron whereas some snoRNAs co-exist within the same intron, thereby forming intronic snoRNA clusters. Some snoRNA genes also overlap exon sequences in host genes, either alone as mono-exonic snoRNAs or with others thereby forming exonic snoRNA clusters. The exonic snoRNAs are marked with an asterisk, because they are quite rare and might be artifacts due to genomic annotation errors (e.g. an overlapping exon in the form of a retained intron might not actually exist, therefore making intronic snoRNAs appear as ‘exonic’). (b) The distribution of genomic localization of snoRNAs across animal species is represented as a grouped stacked bar chart. The species are sorted from left to right in decreasing order of the total number of snoRNAs annotated in that species, with the total number being represented in parentheses above the bar charts. The right bar for each species represents the stacked proportion of snoRNAs localized in one of the different organizations displayed in (a). The left bar represents the stacked proportions of snoRNAs of each type (C/D box, H/ACA box or unknown type) that compose each genomic organization represented in the right bar. The distribution of host gene biotype (i.e. protein-coding gene, non-coding gene or intergenic snoRNA) for all snoRNAs in a given species is represented as a donut chart above its respective bars. 
The represented animal species are the platypus (Ornithorhynchus anatinus), human (Homo sapiens), rat (Rattus norvegicus), mouse (Mus musculus), macaque (Macaca mulatta), frog (Xenopus tropicalis), worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), zebrafish (Danio rerio) and chicken (Gallus gallus). (c) Same as in (b), except that the chosen species are from the plant, fungal and protist eukaryotic kingdoms. The species are still ordered by decreasing total number of annotated snoRNAs, but the sorting was applied within individual kingdoms. The represented plant species are the wheat (Triticum aestivum), thale cress (Arabidopsis thaliana) and rice (Oryza sativa); the represented fungal species are the budding yeast (Saccharomyces cerevisiae), Candida albicans, fission yeast (Schizosaccharomyces pombe) and Neurospora crassa; the represented protist species are Dictyostelium discoideum, Tetrahymena thermophila and Giardia lamblia. Of note, no snoRNA is annotated in G. lamblia, resulting in the empty bar and donut charts. The methodology relevant to the results presented in this figure is detailed in the Methods in the Supplementary Material.

Intriguingly, even though many snoRNAs are conserved across eukaryotes, the total number of snoRNA genes greatly differs across the Eukarya domain (Figure 1, compare the numbers above the bar charts). Simple unicellular organisms such as yeasts and protists usually harbour fewer than a hundred snoRNAs in their genome, which contrasts sharply with multicellular organisms [Citation37]. Indeed, plant and animal genomes typically display several hundred snoRNA genes, with mammalian genomes sometimes containing several thousand snoRNA genes [Citation38,Citation39]. These snoRNAs are often grouped into families based on sequence alignment and sequence covariance [Citation36,Citation40], which underlines that snoRNAs exist in multiple copies (sometimes with the exact same sequence) within a given genome. While having multiple exact copies creates a redundancy of snoRNAs targeting the same sites, most snoRNA family members, at least in human, do not display perfect sequence identity [Citation36]. This raises the question as to the biological significance and even more so the functionality of these diverged copies.

To complicate matters, recent studies indicate that many annotated snoRNA genes are not expressed in a mature form (or only at a very low level), ranging from a quarter to more than two-thirds of all annotated snoRNAs depending on the species [Citation24,Citation41–43]. These non-expressed snoRNAs, also referred to as snoRNA pseudogenes, are often defined by the accumulation of mutations in their characteristic motifs which could impact their capacity to bind their protein interactors, form a stable snoRNP and/or interact with their given target [Citation42]. Interestingly, snoRNA pseudogenes have been identified in many multicellular organisms including mammals, amphibians, plants and nematodes [Citation33,Citation38,Citation44,Citation45], which contrasts with observations in unicellular organisms like fungi that tend to turn over entire snoRNA families instead of maintaining these snoRNA remnants [Citation46].

Based on these observations, one would expect snoRNA pseudogenes to be poorly conserved across species due to their lack of expression. Although this is the case for most snoRNA pseudogenes, almost 12% of these non-expressed snoRNAs in human show an intriguingly high level of sequence conservation throughout vertebrates (average phastCons score ≥ 0.5) [Citation36], hinting at potential new regulatory functions of these snoRNAs that lie outside of their typical trans-acting properties and could be carried out from within their host locus. Supporting this idea, reports in recent years have shown that some snoRNAs can act as transient cis-regulators of their own host gene [Citation47–49]. Others have observed that some snoRNAs can also exist in various kinds of host-derived longer hybrids, i.e. transient or sometimes highly stable composite RNAs for which the cellular function is starting to emerge. This review explores the different regulatory roles uncovered for snoRNAs while they are not expressed as independent mature snoRNPs, from the regulation of their host transcript maturation to the myriad of transient or stable longer forms they can adopt from their host locus. We finish by highlighting the role of snoRNA copies in shaping these new layers of gene regulation across species’ genomes, as well as their implication in the induction of genomic imprinting of their host locus.

Transcriptomic regulators in cis of their host gene

It is well understood that the expression of intron-encoded snoRNAs depends on the transcription and splicing of their host gene [Citation23,Citation26]. Yet, a growing body of literature demonstrates that, surprisingly enough, the abundance of most intronic snoRNAs does not correlate (and is sometimes anticorrelated) with that of their host gene [Citation24,Citation41,Citation50,Citation51]. Several hypotheses have been put forward to explain this uncoupling of expression, including the implication of alternative splicing, selective nonsense-mediated decay (NMD) of specific host transcripts and the use of alternative promoters, as well as the differing stability between the snoRNA and host transcript following their common transcription [Citation24,Citation52,Citation53]. However, recent evidence has also started to demonstrate the active role snoRNAs can play in influencing the fate of their host gene.

An emblematic example of this new cis-regulatory function is SNORD86, an orphan C/D box snoRNA which is embedded in the NOP56 gene, i.e. a host gene which codes for the C/D snoRNP core protein of the same name. This snoRNA was shown to modulate the expression level of its host gene in response to the concentration of the NOP56 protein [Citation47]. Indeed, within the intron in its host transcript, SNORD86 was found to adopt two alternate conformations: 1) when NOP56 protein level is low, the snoRNA assumes a non-snoRNP structure which results in the complete splicing of its intron and thereby the formation of a functional transcript that can then be translated into the NOP56 protein to restore its adequate cellular levels; 2) when the NOP56 protein is abundant, it binds to the snoRNA in its intron, favouring a snoRNP conformation which, in turn, promotes the splicing and production of an alternative transcript that is exported to the cytoplasm and cleaved by the NMD machinery. This longer noncoding isoform, called cytosolic 5’-snoRNA-ended and 3’-polyadenylated lncRNA (SPA), contains the snoRNA that is bound by its core proteins and accumulates in the cytoplasm, possibly to sequester away an excess of core proteins from the nucleus (see (i) in Figure 2, and Supplementary Table S2). Intriguingly, the sequence of this cytosolic SPA is conserved across eutherians, suggesting that it might play a similar role in these species. In addition, this lncRNA constitutes the vast majority of the total NOP56 transcripts (and thereby of the SNORD86-containing transcripts), which correlates with the observation that the shorter and mature form of SNORD86 is only detected at very low levels in most human tissues and cell lines [Citation39].
Overall, these results underline the fact that SNORD86 does not play a major role as a mature snoRNP acting on targets in trans but acts rather in cis as a sensor of the output of its host gene, while still embedded in its host transcript, with important consequences for the regulation of snoRNP and ribosome assembly.

Figure 2. The maturation steps of different host-derived snoRNA hybrids. Host gene transcripts can harbour one or multiple (shaded) snoRNAs in their intron (see (a)). In the case of a snoRNA encoded alone in its intron, its canonical maturation requires the transcription and splicing of its host gene, which results in a snoRNA-containing lariat as well as a mature host transcript. The lariat is typically debranched and its ends are then degraded by exonucleases up to the snoRNA ends which are bound by core proteins, thereby protecting the mature snoRNP from further cleavage (see (b)). When two snoRNAs are encoded in the same intron, the same maturation steps can lead to the formation of snoRNA-ended hybrids called sno-lncRNAs including SLERT (see (f) and (g)), an H/ACA sno-lncRNA, as well as the lncRNA LNC-SNO49AB (see (c)). SLERT was shown to interact with DDX21 and regulate cell proliferation, whereas LNC-SNO49AB was observed to bind to ADAR1 and promote its dimerization and activity. Other sno-lncRNAs have also been found to interact with splicing factors. On the other hand, atypical branch points (cytidines instead of adenosines) can hinder lariat debranching, thus leading to the formation of stable lariats bearing a snoRNA (slb-snoRNAs) which can be actively transported to the cytoplasm (see (e)). In addition, transcript readthrough of host genes can lead to the production of nuclear 5’ snoRNA-ended and 3’ polyadenylated lncRNAs (SPAs) that were shown to interact with splicing factors (see (d)). One example of SPA was also shown to be exported to the cytoplasm (SNORD86 cSPA) (see (i)). The retention of introns harbouring an H/ACA box snoRNA leads to the generation of snoRNA retaining transcripts (snoRTs), which are exported to the cytoplasm and vary in length with regard to their 5’ end (see (j)).
Finally, knocking down splicing factors was shown to induce splicing defects including the formation of hybrid mRNA-snoRNA (hmsnoRNAs) transcripts, which were observed to be either degraded or stabilized in the cytoplasm (see (h)). Of note, dotted lines represent the possible path that a host-derived snoRNA extension can take, whereas full lines represent an obligatory step in its maturation pathway. The asterisk that marks the name of certain snoRNA-containing hybrids (i.e. initial host transcript, mature snoRNP, slb-snoRNA and hmsnoRNA) signifies that the snoRNA represented in the hybrid can either be a C/D box or H/ACA box snoRNA, although only C/D box snoRNAs are represented in those cases for clarity.

Interestingly, the advent of high-throughput RNA–RNA interaction datasets in recent years has led to the expansion of the snoRNA interactome, including the discovery of novel cis-interactions between snoRNAs and their host gene [Citation19,Citation48,Citation49,Citation54]. Similar to the SNORD86 case, two studies have recently shown that another C/D box snoRNA, SNORD2, also regulates the splicing of its host gene, EIF4A2 [Citation48,Citation49]. In most healthy human tissues, this snoRNA was shown to form a conventional snoRNP structure which is excised normally from its host intron [Citation39,Citation49]. Yet, SNORD2 was also observed, in specific tissues and cell lines, to interact in cis with its own host intron, more specifically with the intronic region downstream of the snoRNA. This snoRNA-intron interaction gives rise to a structure that sequesters the branch point of the intron and favours the exclusion of the following exon of the host transcript [Citation49]. This alternative splicing generates a premature stop codon, targeting this newly spliced isoform to the NMD machinery for rapid degradation. Because it relies on the formation of an alternative SNORD2-intron structure, this splicing regulation mechanism was hypothesized to be influenced by the elongation rate of RNA polymerase II, which varies across cell types and conditions. Furthermore, the sequence implicated in this SNORD2-intron interaction was shown to be highly conserved across vertebrates, suggesting that this regulation could occur in other species. Notably, the same study reported more than a hundred distinct snoRNA-host transcript interactions, hinting at a potentially widespread cis-regulatory role played by snoRNAs which is yet to be explored.

Additional approaches such as classical structure–function studies present great potential to uncover cis-regulatory roles of snoRNA genes. Recently, CRISPR/Cas9 knockout of C/D box snoRNAs located in the introns of the lncRNA GAS5 suggested that SNORD74 harbours a regulatory region that could modulate the splicing and maturation of its host transcript, possibly through m6A modification of the snoRNA in its host transcript [Citation55]. Thus, while intronic snoRNAs were long viewed as passive passengers of their host gene, at least three and perhaps many other snoRNAs share a complex bidirectional regulatory relationship with their host gene.

Transcriptomic regulators as host-derived extensions

Although the previous examples have shown that snoRNAs can regulate their host gene fate by interacting directly in cis within the whole host transcript, several studies have also brought to light a myriad of diverse snoRNA hybrids that originate from their host locus, but that come in all shapes and sizes [Citation56–62] (Figure 2). Found in a wide variety of eukaryotes, these composite RNAs sometimes exist only transiently but oftentimes persist in cells as stable products with putative functions. One such example is the snoRNA retaining transcripts (snoRTs), which were recently identified in human breast cancer cell lines upon screening for DKC1-bound transcripts [Citation56] (see (j) in Figure 2, and Supplementary Table S2). These snoRTs are defined as host gene transcripts with a retained intron containing an H/ACA box snoRNA, with their length varying from the whole initial host transcript to truncated versions at their 5’ end (i.e. starting with a few exonic nucleotides upstream of the retained intron containing the snoRNA). Intriguingly, these snoRTs accumulate in the cytoplasm and are bound by the H/ACA core proteins GAR1 and NHP2 as well as DKC1, which is reminiscent of the SNORD86 SPA mode of action in which snoRNP core protein levels are regulated through their sequestration in the cytoplasm [Citation47]. Among the most significantly enriched transcripts upon DKC1 immunoprecipitation, more than 40 different snoRTs appear in the top candidates [Citation56]. Prediction of the potential targets of these snoRTs combined with a de novo analysis of pseudouridine-seq datasets showed a significant overlap, suggesting that the role of at least some of these snoRTs would be to guide the pseudouridylation of other mRNAs.

While snoRTs generally comprise a large portion of the host transcript, smaller yet highly stable snoRNA-containing RNAs were recently observed in various vertebrates including frog, chicken, mouse and human [Citation57]. These stable lariats bearing a snoRNA (slb-snoRNAs) are the natural product of the splicing of host genes (see (e) in Figure 2, and Supplementary Table S2). However, they are not linearized as efficiently by the debranching enzyme due to their atypical branchpoint (a cytidine instead of an adenosine), which confers high stability on these lariats in the cell [Citation63]. Most of these lariats are species-specific, with some exceptions of conserved slb-snoRNAs between closely related frog species. While they are mainly found in the nucleus, slb-snoRNAs were also shown to be actively exported to the cytoplasm, thereby competing with the canonical snoRNA maturation process occurring in the nucleus (i.e. lariat debranching followed by exonucleolytic trimming up to the snoRNA ends that are protected by proteins). In addition to this regulatory role, it was also demonstrated that some cytoplasmic slb-snoRNAs can be bound by snoRNP proteins such as DKC1. Since yeast slb-snoRNAs (induced through the depletion of Dbr1) maintain their ability to guide modifications on their target RNA [Citation25], it was proposed that this could also be the case with these vertebrate slb-snoRNAs. Yet, none of the tested slb-snoRNAs have shown the capacity to guide modifications on rRNA, hinting that their potential function could be instead to sequester RNA binding proteins (RBPs) in the cytoplasm, a common pattern seen with snoRNA-containing extensions and lariats exported to the cytoplasm [Citation47,Citation56,Citation57,Citation64].

Interestingly, not all snoRNA hybrids are stable: some exist only transiently in cells and are often the product of snoRNA maturation defects. For instance, it was shown that when splicing factors are depleted in yeast cells, hybrid mRNA-snoRNA (hmsnoRNA) transcripts are generated from host genes of both C/D and H/ACA box snoRNAs [Citation58] (see (h) in Figure 2, and Supplementary Table S2). These hmsnoRNAs consist of the host gene transcript up to the snoRNA 3’ end which is protected from the nuclear RNA exosome by core proteins bound to the snoRNA. After their export to the cytoplasm, most hmsnoRNAs are degraded through 5’-3’ decay pathways; yet some still accumulate to levels varying from 10% to almost 70% of their mature snoRNA counterparts [Citation58]. Based on these observations, an interesting hypothesis was put forward that the splicing defects observed in many human diseases could not only impact protein-coding genes but could also generate similar stable hmsnoRNAs for which the pathogenic potential remains to be determined [Citation58,Citation65].

As emphasized previously, splicing is a key pillar of the effective biogenesis of intronic snoRNAs. Yet, when two snoRNAs are encoded within the same intron, the same splicing process can lead to the production of longer composite RNAs called sno-lncRNAs [Citation59] (see panel (f) of the accompanying figure and Supplementary Table S2). Indeed, after the intronic lariat is spliced out of the host gene and linearized, the 5’ and 3’ ends of the intron are trimmed by exonucleases. When two snoRNAs are embedded in that intron, the core proteins bound to the two snoRNAs act as shields against exonucleolytic cleavage, thereby creating a sno-lncRNA, i.e. an intron remnant flanked by two mature snoRNAs which are most often C/D box snoRNAs. First discovered originating from the 15q11-q13 locus in human [Citation59], an important region hosting the SNORD115 and SNORD116 C/D box snoRNA families and that is deleted in the Prader-Willi Syndrome (PWS) [Citation66], sno-lncRNAs have also been reported in other mammals including some arising from different genomic loci [Citation67]. Interestingly, the known sno-lncRNAs are usually species-specific, except for the PWS sno-lncRNAs which were found to be conserved across several primates [Citation67]. Unexpectedly, sno-lncRNAs from the PWS region were observed to localize to the nucleoplasm, but not to the nucleolus, suggesting an alternative function associated with these RNAs. Indeed, following immunoprecipitation and immunofluorescence assays, these sno-lncRNAs were found to bind to the splicing factor RBFOX2 and alter its availability in specific nuclear foci, thereby modulating splicing events in several neuronal-specific genes. This role of protein trapping is not unlike that of other lncRNAs which are known for their capacity to act as decoys for transcription factors and RBPs [Citation68,Citation69]. Interestingly, sno-lncRNAs can also be produced from host gene transcripts via the skipping of an exon that is flanked by two snoRNA-harbouring introns.
This is the case of SLERT, a sno-lncRNA that enhances pre-rRNA transcription, which was discovered and shown to be specific to human [Citation60] (see panel (g) of the accompanying figure and Supplementary Table S2). In contrast to most sno-lncRNAs, SLERT is constituted of H/ACA box snoRNAs at its ends (SNORA5C and SNORA5A), interacts with DKC1 and localizes to the nucleolus. Following its knockout, a significant reduction in pre-rRNA was observed, indicating its potential function in rRNA biogenesis. SLERT was also identified as an interactor of DDX21, an RNA helicase involved in ribosome synthesis, through its non-snoRNA internal sequence, whereas the flanking snoRNAs were found to be crucial for its nucleolar localization. Its interaction with DDX21 was shown to promote the transcription of ribosomal DNA (rDNA) genes, leading in turn to increased cell proliferation.
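
The exonucleolytic trimming logic described above for intronic snoRNAs can be illustrated with a minimal Python sketch. It is a toy model only and not an implementation taken from the cited studies: the function name `trim_intron`, the half-open coordinate convention and the example positions are hypothetical, and the sketch assumes that every snoRNA in the intron is non-overlapping, fully assembled with its core proteins and an absolute block to exonucleases.

```python
from typing import List, Optional, Tuple

# Half-open [start, end) coordinates along the excised, linearized intron.
# All positions are hypothetical and serve only as illustration.
Interval = Tuple[int, int]


def trim_intron(
    intron_length: int, snornas: List[Interval]
) -> Tuple[str, Optional[Interval]]:
    """Toy model of exonucleolytic trimming of a debranched intron.

    Assumption: 5'-3' trimming stops at the start of the first snoRNP and
    3'-5' trimming stops at the end of the last snoRNP.
    """
    for start, end in snornas:
        if not 0 <= start < end <= intron_length:
            raise ValueError(f"snoRNA interval {(start, end)} lies outside the intron")

    if not snornas:
        # Nothing shields the intron, so it is trimmed away entirely.
        return ("degraded intron", None)

    protected_start = min(start for start, _ in snornas)
    protected_end = max(end for _, end in snornas)

    if len(snornas) == 1:
        # Trimming from both sides stops at the ends of the single snoRNP.
        return ("mature snoRNA", (protected_start, protected_end))

    # Two snoRNPs shield the intronic sequence lying between them.
    return ("sno-lncRNA", (protected_start, protected_end))
```

Under these assumptions, an intron carrying a single snoRNA yields the snoRNA itself, whereas an intron carrying two snoRNAs yields the intron remnant flanked by both, which corresponds to the sno-lncRNA architecture.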

A recently discovered variation on sno-lncRNAs is the lncRNA LNC-SNO49AB, which was characterized in blood samples of leukaemia patients [Citation61] (see panel (c) of the accompanying figure and Supplementary Table S2). This hybrid RNA resembles the structure of sno-lncRNAs, although its 5’ end is protected by an additional m7G cap. LNC-SNO49AB originates from the SNHG29 host gene which harbours two C/D box snoRNAs in its second intron: SNORD49B followed by SNORD49A. The resulting 5’ capped lncRNA consists of the first and second exon of SNHG29 as well as the second intron which is truncated immediately after SNORD49A. This lncRNA was observed to localize to the nucleolus and interact with the core C/D box snoRNP proteins; yet its knockdown did not show any effect on rRNA methylation level, suggesting it plays a different role. RNA pull-down assays coupled to mass spectrometry highlighted that the A-to-I editing enzyme ADAR1 could interact with several sites on LNC-SNO49AB. Further experiments demonstrated that LNC-SNO49AB acts as a scaffold to promote the dimerization and thereby the activity of ADAR1, which was found to be indicative of cancer progression [Citation61,Citation70]. Interestingly, the sequence of LNC-SNO49AB was shown to be conserved across primates, implying that it could exist and play a similar role in those species [Citation61].

Finally, at the interface between sno-lncRNAs and mRNAs lie the SPAs, which, as a reminder, are polyadenylated lncRNAs that are capped by a snoRNA at their 5’ end. As opposed to the previously introduced cytosolic SNORD86 SPA, the two other SPAs known to date, SPA1 and SPA2, are retained in the nucleus and are, respectively, capped by the C/D box snoRNAs SNORD107 and SNORD109A [Citation47,Citation62] (see panel (d) of the accompanying figure and Supplementary Table S2). Importantly, they are encoded in the PWS region downstream of the SNURF-SNRPN gene and of its weak poly(A) signal [Citation62]. Following the transcription of SNURF-SNRPN and the cleavage of the resulting pre-mRNA at its poly(A) site, the fate of the downstream cleavage product remains under the control of RNA polymerase II (which continues to elongate it) and the exonuclease XRN2 (which trims it) according to the torpedo model [Citation71]. Whereas in typical pre-mRNA maturation XRN2 would catch up to the polymerase and trigger transcription termination, it was shown that the presence of SNORD107 and the core proteins bound to it in the downstream cleavage product block the exonuclease from further trimming. This stabilizes the 5’ end of the nascent transcript, which is elongated until the next poly(A) signal where it is cleaved and polyadenylated, resulting in the SPA1 transcript. Moreover, RNA polymerase II continues to transcribe the region downstream of SPA1. This region harbours SNORD109A and a further downstream poly(A) signal, resulting in the formation of SPA2 through the same mechanism described above. Notably, when SNORD107 was substituted with the H/ACA box snoRNA SNORA5C, SPA1 was still produced at levels similar to its endogenous form, implying that both snoRNA types have the capacity to generate SPAs.
To assess the functional role of SPAs, RNA fluorescence in situ hybridization (FISH) assays were undertaken and showed that both SPA1 and SPA2 co-localize with the PWS sno-lncRNAs in specific nuclear subdomains. Following individual-nucleotide resolution UV crosslinking and immunoprecipitation (iCLIP) experiments, all the PWS SPAs and sno-lncRNAs were shown to interact with various splicing factors including TDP43, RBFOX2 and hnRNP M. Through the sequestration of up to 1% of the total cellular level of these proteins, SPAs were found to regulate several splicing events in genes involved in synaptogenesis, hinting at a relationship between the aetiology of PWS and the dysregulation of splicing factor localization. Furthermore, a conserved SPA1 (but not SPA2) was also identified in mouse (mSPA1) hippocampal tissue, suggesting that SPAs’ regulatory function could take place across various species.
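
The sequential formation of SPA1 and SPA2 described above can be summarized as a minimal Python sketch. It is a deliberately simplified illustration and not a model from the cited work: the function name `spa_transcripts` and all coordinates are hypothetical, each snoRNP is assumed to block XRN2 completely, and the kinetic competition between XRN2 and RNA polymerase II in the torpedo model is ignored.

```python
from typing import List, Tuple


def spa_transcripts(
    cleavage_site: int, snorna_starts: List[int], polya_sites: List[int]
) -> List[Tuple[int, int]]:
    """Return (5' end, 3' end) pairs of SPA-like transcripts in a toy model.

    Assumptions: after each cleavage, XRN2 trims 5'-3' until it meets the
    next snoRNP, and RNA polymerase II elongates until the next poly(A)
    site, where cleavage releases a snoRNA-capped, polyadenylated RNA.
    """
    spas = []
    position = cleavage_site
    while True:
        # First snoRNP downstream of the current cleavage site.
        cap = next((s for s in sorted(snorna_starts) if s >= position), None)
        if cap is None:
            break  # nothing stops XRN2: termination, no further SPA
        # First poly(A) site downstream of that snoRNA cap.
        end = next((p for p in sorted(polya_sites) if p > cap), None)
        if end is None:
            break  # no poly(A) site to cleave at: no SPA is released
        spas.append((cap, end))
        position = end  # the next round starts from this new cleavage site
    return spas
```

With two snoRNAs each followed by a poly(A) site, the sketch returns two consecutive transcripts, mirroring the order in which SPA1 and SPA2 are described to arise downstream of SNURF-SNRPN.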

SnoRNA copies as a potential reservoir of new snoRNA forms and functions

The previous examples of transcriptomic regulation carried out by host-derived snoRNA extensions highlight the wide range of shapes and functions that snoRNAs can take on in the cell. Indeed, most of the snoRNAs involved in longer hybrids also coexist with their mature and shorter snoRNP counterpart. However, the ratio of mature snoRNA to the extended snoRNA forms varies considerably depending on the snoRNA, ranging from a dominance of the mature snoRNA species to the opposite state [Citation47,Citation57,Citation59–61]. As snoRNAs are crucial regulators of rRNA maturation and ribosome assembly [Citation10], one can wonder how the cell can afford such fluctuations in mature snoRNA levels, which often depend on the degree of production of the cognate longer form.

One plausible explanation resides in the fact that many snoRNAs exist in multiple copies in higher eukaryotes [Citation23,Citation36]. Owing to countless rounds of recombination and retrotransposition events, the genomes of current multicellular species harbour multiple snoRNA families that expanded in size and diversity throughout evolution [Citation33–36,Citation38]. Although members of the same snoRNA family can vary greatly in abundance, it was recently shown that the total family abundance stays relatively constant across different cell types [Citation36]. This observation supports the viability of having longer snoRNA extensions that compete with the formation of their corresponding mature snoRNA, since other ‘backup’ copies can be produced as mature snoRNPs to ensure the desired target modification level. This also coincides with the fact that all naturally occurring snoRNA extensions presented herein have only been identified in multicellular organisms, which are known to harbour multiple snoRNA copies in their genome [Citation23], and not in simpler unicellular organisms.

Therefore, snoRNA gene duplication represents an interesting evolutionary force which can have an impact both at the family level and at the host gene level. Through these duplication events, snoRNAs can propagate to various genomic locations and sometimes to astonishing numbers (e.g. an H/ACA box snoRNA was copied almost 40,000 times in the platypus genome) [Citation38,Citation72]. It is thus highly probable that snoRNA duplication events in introns enabled the formation of many of the known snoRNA extensions including sno-lncRNAs, snoRTs and slb-snoRNAs. As reported previously, the presence of snoRNAs in introns can dramatically alter the splicing patterns of the host gene [Citation47–49]. Therefore, it is tempting to speculate that the insertion of snoRNAs in host genes could also impact several other steps of host gene processing, including polyadenylation and translation, for instance by disrupting poly(A) and RBP binding sites.

Since snoRNAs can be retrotransposed in antisense into host genes and intergenic regions [Citation38,Citation72], one could also hypothesize that the ASEs of these snoRNAs act as a binding site for their parental copy, thereby opening the possibility of a wide range of snoRNA-mediated regulation of the retrotransposon recipient gene including at the chromatin, transcriptional and post-transcriptional levels. While this type of regulation might seem unlikely at first glance, at least three lines of evidence support it. Firstly, various snoRNA interactors were reported by several groups [Citation14,Citation19,Citation73], indicating that snoRNAs, at least in their canonical mature form, actively interact with a wide range of RNAs (e.g. with other snoRNAs). Secondly, several snoRNAs and sno-lncRNAs were observed to be associated with chromatin and oftentimes regulate its state in Drosophila melanogaster, mouse and human [Citation74–78], indicating that some snoRNAs have at least the capacity to localize at different DNA regions. Lastly, a similar phenomenon is observed with microRNAs (miRNAs) in which miRNA-containing transposable elements can be inserted in antisense of the 3’ untranslated regions (UTR) of genes, thereby creating compatible target sites for the parental miRNA copy [Citation79–81]. Thus, this suggests that, in principle, such regulation could also be functionally relevant for retrotransposed snoRNAs.

Ultimately, having multiple copies of the same snoRNA creates some flexibility for the cell in the same way protein-coding gene duplication allows for new functions to emerge [Citation82]. For instance, different mutations can accumulate in the ASEs of snoRNAs without affecting their parental copy, leading to the loss of complementarity to the initial target and/or the acquisition of complementarity to new targets. Although this process usually takes place over long evolutionary timescales, it was recently demonstrated to occur even between closely related amphibian species [Citation44]. Furthermore, depending on the snoRNA, not all parts of the sequence are subject to the same selective pressure: in the SNORD116 snoRNA family, the two ASEs showed significant differences in conservation level while the C and D boxes remained largely unchanged between homologous snoRNAs [Citation83]. Altogether, snoRNA gene duplications are likely to lead, on the one hand, to the degeneration of snoRNAs into pseudogenes if too many deleterious mutations accumulate, but on the other hand, to the creation of new regulatory roles including in cis and as extensions of their host gene.

The involvement of large tandem repeats of snoRNAs in genomic imprinting

While snoRNA genes generally exist as part of families that are dispersed throughout the genome, some C/D box snoRNA families have been shown to occur locally in large tandem repeats. These unusual genomic organizations can contain hundreds of repeated snoRNA genes, consistent with the propensity of C/D box snoRNA genes to duplicate locally [Citation33,Citation34,Citation36]. These families correspond to recent eutherian-specific innovations that produce orphan C/D box snoRNAs suspected to regulate unconventional targets [Citation16,Citation83,Citation84]. These snoRNA genes are embedded within repeated introns of poorly characterized lncRNA genes and are found at two loci controlled by parental genomic imprinting (PGI): the Dlk1-Dio3 (at human 14q32) and the PWS domains. PGI is an epigenetic mechanism that differentially marks the two parental genomes and causes the specific expression or repression of so-called imprinted genes from a given parent [Citation85,Citation86]. Each of the previously mentioned domains hosts two tandem repeats formed, respectively, by the SNORD113/SNORD114 and SNORD115/SNORD116 families. While these tandem repeats likely arose in a common ancestor of eutherian species, a fifth tandem repeat-containing lncRNA called Bsr is only present in the rat genome [Citation87], suggesting that the acquisition of these genomic structures may be ongoing and that understudied genomes might harbour novel snoRNA tandem repeats.

Interestingly, the organization of C/D box snoRNA genes in tandem repeats presumably had a major influence on their evolution and function. Indeed, it likely favoured a complex genetic interplay between neighbouring gene copies, giving rise to a high rate of gene gains and losses likely due to non-allelic recombination events, e.g. resulting in four to ten times more SNORD115 and SNORD116 gene copies in the mouse than in the rat genome despite limited evolutionary distance [Citation83,Citation88]. In addition, it probably facilitated a high rate of non-allelic conversion events between highly similar gene copies that promoted gene homogenization and spreading of sequence polymorphisms. It is interesting to note, however, that both the SNORD115 and SNORD116 tandem repeats were able to form gene subfamilies in primates. De novo mutations favoured by selection can escape erasure and spread within copies if gene diversity is beneficial [Citation89]. Therefore, the maintenance of snoRNA subfamilies argues in favour of the neofunctionalization of certain gene copies in primates [Citation83,Citation88]. Intriguingly, phylogenetic analyses have revealed a complex association between C/D box snoRNA tandem repeats and PGI, leading to two functional hypotheses. First, cross-species comparisons support the idea that it was the formation of snoRNA repeats that led to the installation of PGI at and around their site of amplification [Citation90]. PGI is thought to derive from mechanisms dedicated to silencing transposable elements [Citation91,Citation92], so a consistent scenario would be that snoRNA tandem repeats are recognized as parasitic structures to be repressed. This possibility is also supported by the fact that a similar association applies to the few known large tandem repeats of microRNA genes [Citation90,Citation93–95]. 
Second, recent work has revealed that the adjacent SNORD115 and SNORD116 repeats evolved in a coordinated manner with respect to copy gains and losses, the emergence of gene subfamilies, and partial tandem duplication events [Citation88,Citation96]. It was also suggested that this coordination was orchestrated by PGI since differential chromatin compaction can favour concerted non-allelic homologous recombination events of closely spaced tandem repeats [Citation88]. In short, it is possible that snoRNA tandem repeats first favoured and were then affected by the installation of PGI, highlighting the complex role of snoRNA genes in regulating their host locus expression.

Conclusion

Although snoRNAs were once perceived as mere guides of rRNA and snRNA modification, recent studies have painted a much more complex picture. From new protein interactors to novel types of targets, the modes of action of snoRNAs have kept expanding in recent years. In this review, we have focused on the extensive regulatory roles that snoRNAs exert not as mature snoRNPs, but rather from within their host locus. More specifically, we have explored the modulatory roles of snoRNAs with regard to the maturation of their host transcript in cis, as well as the profusion of different host-derived forms they can adopt. In addition, we have discussed their varied cellular roles and potential evolutionary origin through gene duplication, as well as the complex interplay between snoRNA repeats and their host locus imprinting. Yet, many open questions remain to be answered. Comprehensive studies are needed to better understand how snoRNAs and snoRNA extensions are co-regulated, and to what extent these extensions occur in eukaryotes other than the few animals in which they were first identified (e.g. in plant species). Since SPAs can theoretically be capped by H/ACA box snoRNAs [Citation62], we also expect that future research will uncover such snoRNA hybrids through the use of snoRNA-adapted high-throughput detection methods (e.g. TGIRT-Seq [Citation97], icSHAPE-MaP [Citation98], RNA structure analysis using nanopore sequencing [Citation99], etc.). Given the myriad of forms they can adopt, it is highly plausible that even more eclectic host-derived snoRNA extensions will be characterized in the future, especially by expanding research to a wider range of species and conditions. Finally, while some functions have been determined for a few of the cis-interacting snoRNAs and of the host-derived hybrids, further studies involving experimental validation will be needed to decipher the precise molecular role of these snoRNAs.

Author contributions

ÉFC, SL and MSS wrote the manuscript. ÉFC assembled the data and made the figures.

Supplemental material

Supplementary_tables_S1_S2_Methods.pdf

Acknowledgments

The authors would like to thank members of their lab for the insightful ideas and discussions.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The authors confirm that the data supporting the findings of this study are available within the article and its supplementary materials.

Supplementary material

Supplemental data for this article can be accessed online at https://doi.org/10.1080/15476286.2024.2342685

Additional information

Funding

This work was supported by an NSERC Discovery grant (RGPIN-2024-04743 to MSS) and a Canada Research Chair in Bioinformatics of Noncoding RNA (MSS). ÉFC holds a Vanier Canada Graduate Scholarship from NSERC.

References

  • Boivin V, Faucher-Giguère L, Scott M, et al. The cellular landscape of mid-size noncoding RNA. Wiley Interdiscip Rev RNA. 2019;10(4):e1530. doi: 10.1002/wrna.1530
  • Kiss T. Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs. EMBO J. 2001;20(14):3617–3622. doi: 10.1093/emboj/20.14.3617
  • Seal RL, Braschi B, Gray K, et al. Genenames.Org: the HGNC resources in 2023. Nucleic Acids Res. 2023;51(D1):D1003–9. doi: 10.1093/nar/gkac888
  • Matera AG, Terns RM, Terns MP. Non-coding RNAs: lessons from the small nuclear and small nucleolar RNAs. Nat Rev Mol Cell Biol. 2007;8(3):209–220. doi: 10.1038/nrm2124
  • Tollervey D, Kiss T. Function and synthesis of small nucleolar RNAs. Curr Opin Cell Biol. 1997;9(3):337–342. doi: 10.1016/S0955-0674(97)80005-1
  • Filipowicz W, Pogačić V. Biogenesis of small nucleolar ribonucleoproteins. Curr Opin Cell Biol. 2002;14:319–327. doi: 10.1016/S0955-0674(02)00334-4
  • Kiss-László Z, Henry Y, Bachellerie JP, et al. Site-specific ribose methylation of preribosomal RNA: a novel function for small nucleolar RNAs. Cell. 1996;85(7):1077–1088. doi: 10.1016/S0092-8674(00)81308-2
  • Ganot P, Caizergues-Ferrer M, Kiss T. The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation. Genes Dev. 1997;11(7):941–956. doi: 10.1101/gad.11.7.941
  • Bohnsack MT, Sloan KE. Modifications in small nuclear RNAs and their roles in spliceosome assembly and function. Biol Chem. 2018;399(11):1265–1276. doi: 10.1515/hsz-2018-0205
  • Sloan KE, Warda AS, Sharma S, et al. Tuning the ribosome: the influence of rRNA modification on eukaryotic ribosome biogenesis and function. RNA Biol. 2017;14(9):1138–1152. doi: 10.1080/15476286.2016.1259781
  • Bratkovič T, Božič J, Rogelj B. Functional diversity of small nucleolar RNAs. Nucleic Acids Res. 2020;48(4):1627–1651. doi: 10.1093/nar/gkz1140
  • Falaleeva M, Welden JR, Duncan MJ, et al. C/D-box snoRNAs form methylating and non-methylating ribonucleoprotein complexes: old dogs show new tricks. BioEssays. 2017;39(6):1–28. doi: 10.1002/bies.201600264
  • Bergeron D, Fafard-Couture É, Scott MS. Small nucleolar RNAs: continuing identification of novel members and increasing diversity of their molecular mechanisms of action. Biochem Soc Trans. 2020;48:645–656. doi: 10.1042/BST20191046
  • Zhang M, Li K, Bai J, et al. A snoRNA–tRNA modification network governs codon-biased cellular states. Proc Natl Acad Sci U S A. 2023;120(41):e2312126120. doi: 10.1073/pnas.2312126120
  • Vitali P, Kiss T. Cooperative 2’-O-methylation of the wobble cytidine of human elongator tRNAMet(CAT) by a nucleolar and a Cajal body-specific box C/D RNP. Genes Dev. 2019;33:741–746. doi: 10.1101/gad.326363.119
  • Kishore S, Stamm S. The snoRNA HBII-52 regulates alternative splicing of the serotonin receptor 2C. Science. 2006;311(5758):230–232. doi: 10.1126/science.1118265
  • Scott MS, Ono M, Yamada K, et al. Human box C/D snoRNA processing conservation across multiple cell types. Nucleic Acids Res. 2012;40(8):3676. doi: 10.1093/nar/gkr1233
  • Falaleeva M, Pages A, Matuszek Z, et al. Dual function of C/D box small nucleolar RNAs in rRNA modification and alternative pre-mRNA splicing. Proc Natl Acad Sci U S A. 2016;113(12):E1625–34. doi: 10.1073/pnas.1519292113
  • Sharma E, Sterne-Weiler T, O’Hanlon D, et al. Global mapping of human RNA-RNA interactions. Mol Cell. 2016;62(4):618–626. doi: 10.1016/j.molcel.2016.04.030
  • Elliott BA, Ho HT, Ranganathan SV, et al. Modification of messenger RNA by 2′-O-methylation regulates gene expression in vivo. Nat Commun. 2019;10(1):10. doi: 10.1038/s41467-019-11375-7
  • Bortolin-Cavaillé ML, Quillien A, Gamage ST, et al. Probing small ribosomal subunit RNA helix 45 acetylation across eukaryotic evolution. Nucleic Acids Res. 2022;50(11):6284. doi: 10.1093/nar/gkac404
  • Sharma S, Yang J, van Nues R, et al. Specialized box C/D snoRNPs act as antisense guides to target RNA base acetylation. PLoS Genet. 2017;13(5):e1006804. doi: 10.1371/journal.pgen.1006804
  • Dieci G, Preti M, Montanini B. Eukaryotic snoRNAs: a paradigm for gene expression flexibility. Genomics. 2009;94(2):83–88. doi: 10.1016/j.ygeno.2009.05.002
  • Fafard-Couture É, Bergeron D, Couture S, et al. Annotation of snoRNA abundance across human tissues reveals complex snoRNA-host gene relationships. Genome Biol. 2021;22(1):22. doi: 10.1186/s13059-021-02391-2
  • Ooi SL, Samarsky DA, Fournier MJ, et al. Intronic snoRNA biosynthesis in Saccharomyces cerevisiae depends on the lariat-debranching enzyme: intron length effects and activity of a precursor snoRNA. RNA. 1998;4(9):1096. doi: 10.1017/S1355838298980785
  • Yang L. Splicing noncoding RNAs from the inside out. Wiley Interdiscip Rev RNA. 2015;6(6):651–660. doi: 10.1002/wrna.1307
  • Caffarelli E, Arese M, Santoro B, et al. In vitro study of processing of the intron-encoded U16 small nucleolar RNA in Xenopus laevis. Mol Cell Biol. 1994;14(5):2966–2974. doi: 10.1128/MCB.14.5.2966
  • Hirose T, Shu MD, Steitz JA. Splicing-dependent and -independent modes of assembly for intron-encoded box C/D snoRNPs in mammalian cells. Mol Cell. 2003;12(1):113–123. doi: 10.1016/S1097-2765(03)00267-3
  • Weinstein LB, Steitz JA. Guided tours: from precursor snoRNA to functional snoRNP. Curr Opin Cell Biol. 1999;11(3):378–384. doi: 10.1016/S0955-0674(99)80053-2
  • Li SG, Zhou H, Luo YP, et al. Identification and functional analysis of 20 box H/ACA small nucleolar RNAs (snoRNAs) from Schizosaccharomyces pombe. J Biol Chem. 2005;280(16):16446–16455. doi: 10.1074/jbc.M500326200
  • Brown JWS, Clark GP, Leader DJ, et al. Multiple snoRNA gene clusters from Arabidopsis. RNA. 2001;7:1817.
  • Leader DJ, Clark GP, Watters J, et al. Clusters of multiple different small nucleolar RNA genes in plants are expressed as and processed from polycistronic pre-snoRNAs. EMBO J. 1997;16(18):5742. doi: 10.1093/emboj/16.18.5742
  • Zemann A, Op de Bekke A, Kiefmann M, et al. Evolution of small nucleolar RNAs in nematodes. Nucleic Acids Res. 2006;34(9):2676–2685. doi: 10.1093/nar/gkl359
  • Shao P, Yang JH, Zhou H, et al. Genome-wide analysis of chicken snoRNAs provides unique implications for the evolution of vertebrate snoRNAs. BMC Genomics. 2009;10(1):10. doi: 10.1186/1471-2164-10-86
  • Weber MJ. Mammalian small nucleolar RNAs are mobile genetic elements. PLoS Genet. 2006;2(12):1984–1997. doi: 10.1371/journal.pgen.0020205
  • Bergeron D, Laforest C, Carpentier S, et al. SnoRNA copy regulation affects family size, genomic location and family abundance levels. BMC Genomics. 2021;22(1):1–18. doi: 10.1186/s12864-021-07757-1
  • Martin FJ, Amode MR, Aneja A, et al. Ensembl 2023. Nucleic Acids Res. 2023;51(D1):D933–41. doi: 10.1093/nar/gkac958
  • Schmitz J, Zemann A, Churakov G, et al. Retroposed SNOfall–a mammalian-wide comparison of platypus snoRNAs. Genome Res. 2008;18:1005–1010. doi: 10.1101/gr.7177908
  • Bergeron D, Paraqindes H, Fafard-Couture É, et al. snoDB 2.0: an enhanced interactive database, specializing in human snoRNAs. Nucleic Acids Res. 2023;51(D1):D291. doi: 10.1093/nar/gkac835
  • Kalvari I, Nawrocki EP, Argasinska J, et al. Non-coding RNA analysis using the Rfam database. Curr Protoc Bioinforma. 2018;62(1):e51. doi: 10.1002/cpbi.51
  • McCann KL, Kavari SL, Burkholder AB, et al. H/ACA snoRNA levels are regulated during stem cell differentiation. Nucleic Acids Res. 2020;48(15):8686–8703. doi: 10.1093/nar/gkaa612
  • Fafard-Couture É, Jacques PÉ, Scott MS. Motif conservation, stability, and host gene expression are the main drivers of snoRNA expression across vertebrates. Genome Res. 2023;33(4):525–540. doi: 10.1101/gr.277483.122
  • Sklias A, Cruciani S, Marchand V, et al. Comprehensive map of ribosomal 2′-O-methylation and C/D box snoRNAs in Drosophila melanogaster. Nucleic Acids Res. 2024. doi: 10.1093/nar/gkae139
  • Deryusheva S, Talhouarne GJS, Gall JG, et al. “Lost and found”: snoRNA annotation in the Xenopus genome and implications for evolutionary studies. Mol Biol Evol. 2020;37(1):149. doi: 10.1093/molbev/msz209
  • Chen CL, Liang D, Zhou H, et al. The high diversity of snoRnas in plants: identification and comparative study of 120 snoRNA genes from Oryza sativa. Nucleic Acids Res. 2003;31(10):2601. doi: 10.1093/nar/gkg373
  • Canzler S, Stadler PF, Schor J. The fungal snoRNAome. RNA. 2018;24(3):342–360. doi: 10.1261/rna.062778.117
  • Lykke-Andersen S, Ardal BK, Hollensen AK, et al. Box C/D snoRNP autoregulation by a cis-acting snoRNA in the NOP56 pre-mRNA. Mol Cell. 2018;72(1):99–111.e5. doi: 10.1016/j.molcel.2018.08.017
  • Dunn-Davies H, Dudnakova T, Langhendries J-L, et al. Systematic mapping of small nucleolar RNA targets in human cells. bioRxiv. 2021.
  • Bergeron D, Faucher-Giguère L, Emmerichs AK, et al. Intronic small nucleolar RNAs regulate host gene splicing through base pairing with their adjacent intronic sequences. Genome Biol. 2023;24(1):24. doi: 10.1186/s13059-023-03002-y
  • Warner WA, Spencer DH, Trissal M, et al. Expression profiling of snoRNAs in normal hematopoiesis and AML. Blood Adv. 2018;2(2):151–163. doi: 10.1182/bloodadvances.2017006668
  • Boivin V, Deschamps-Francoeur G, Couture S, et al. Simultaneous sequencing of coding and noncoding RNA reveals a human transcriptome dominated by a small number of highly expressed noncoding genes. RNA. 2018;24(7):950–965. doi: 10.1261/rna.064493.117
  • Lykke-Andersen S, Chen Y, Ardal BR, et al. Erratum to human nonsense-mediated RNA decay initiates widely by endonucleolysis and targets snoRNA host genes (Genes and Development, (2014), 28, (2498-2517)). Genes Dev. 2016;30(9):1128–1134. doi: 10.1101/gad.281881.116
  • Nepal C, Hadzhiev Y, Balwierz P, et al. Dual-initiation promoters with intertwined canonical and TCT/TOP transcription start sites diversify transcript processing. Nat Commun. 2020;11(1):168. doi: 10.1038/s41467-019-13687-0
  • Dudnakova T, Dunn-Davies H, Peters R, et al. Mapping targets for small nucleolar RNAs in yeast. Wellcome Open Res. 2018;3:3. doi: 10.12688/wellcomeopenres.14735.2
  • Matveeva A, Vinogradov D, Zhuravlev E, et al. Intron editing reveals SNORD-dependent maturation of the small nucleolar RNA host gene GAS5 in human cells. Int J Mol Sci. 2023;24(24):17621. doi: 10.3390/ijms242417621
  • Zacchini F, Venturi G, De Sanctis V, et al. Human dyskerin binds to cytoplasmic H/ACA-box-containing transcripts affecting nuclear hormone receptor dependence. Genome Biol. 2022;23(1):1–27. doi: 10.1186/s13059-022-02746-3
  • Talross GJS, Deryusheva S, Gall JG. Stable lariats bearing a snoRNA (slb-snoRNA) in eukaryotic cells: a level of regulation for guide RNAs. Proc Natl Acad Sci U S A. 2021;118(45):118. doi: 10.1073/pnas.2114156118
  • Liu Y, DeMario S, He K, et al. Splicing inactivation generates hybrid mRNA-snoRNA transcripts targeted by cytoplasmic RNA decay. Proc Natl Acad Sci U S A. 2022;119(31):119. doi: 10.1073/pnas.2202473119
  • Yin Q-F, Yang L, Zhang Y, et al. Long noncoding RNAs with snoRNA ends. Mol Cell. 2012;48(2):219–230. doi: 10.1016/j.molcel.2012.07.033
  • Xing YH, Yao RW, Zhang Y, et al. SLERT regulates DDX21 rings associated with Pol I transcription. Cell. 2017;169(4):664–678.e16. doi: 10.1016/j.cell.2017.04.011
  • Huang W, Sun YM, Pan Q, et al. The snoRNA-like lncRNA LNC-SNO49AB drives leukemia by activating the RNA-editing enzyme ADAR1. Cell Discov. 2022;8(1):117. doi: 10.1038/s41421-022-00460-9
  • Wu H, Yin Q-F, Luo Z, et al. Unusual processing generates SPA lncRNAs that sequester multiple RNA binding proteins. Mol Cell. 2016;64(3):534–548. doi: 10.1016/j.molcel.2016.10.007
  • Talhouarne GJS, Gall JG. Lariat intronic RNAs in the cytoplasm of vertebrate cells. Proc Natl Acad Sci U S A. 2018;115(34):E7970–7. doi: 10.1073/pnas.1808816115
  • Armakola M, Higgins MJ, Figley MD, et al. Inhibition of RNA lariat debranching enzyme suppresses TDP-43 toxicity in ALS disease models. Nat Genet. 2012;44(12):1302. doi: 10.1038/ng.2434
  • Chabot B, Shkreta L. Defective control of pre–messenger RNA splicing in human disease. J Cell Biol. 2016;212(1):13. doi: 10.1083/jcb.201510032
  • Cassidy SB, Schwartz S, Miller JL, et al. Prader-Willi syndrome. Genet Med. 2012;14(1):10–26. doi: 10.1038/gim.0b013e31822bead0
  • Zhang XO, Yin QF, Wang HB, et al. Species-specific alternative splicing leads to unique expression of sno-lncRNAs. BMC Genomics. 2014;15(1):287. doi: 10.1186/1471-2164-15-287
  • Statello L, Guo CJ, Chen LL, et al. Gene regulation by long non-coding RNAs and its biological functions. Nat Rev Mol Cell Biol. 2020;22(2):96–118. doi: 10.1038/s41580-020-00315-9
  • Mattick JS, Amaral PP, Carninci P, et al. Long non-coding RNAs: definitions, functions, challenges and recommendations. Nat Rev Mol Cell Biol. 2023;24(6):430–447. doi: 10.1038/s41580-022-00566-8
  • Jiang Q, Crews LA, Holm F, et al. RNA editing-dependent epitranscriptome diversity in cancer stem cells. Nat Rev Cancer. 2017;17(6):381. doi: 10.1038/nrc.2017.23
  • West S, Gromak N, Proudfoot NJ. Human 5′ → 3′ exonuclease Xrn2 promotes transcription termination at co-transcriptional cleavage sites. Nature. 2004;432(7016):522–525. doi: 10.1038/nature03035
  • Luo Y, Li S. Genome-wide analyses of retrogenes derived from the human box H/ACA snoRNAs. Nucleic Acids Res. 2007;35(2):559. doi: 10.1093/nar/gkl1086
  • Aw JGA, Shen Y, Wilm A, et al. In vivo mapping of eukaryotic RNA interactomes reveals principles of higher-order organization and regulation. Mol Cell. 2016;62(4):603–617. doi: 10.1016/j.molcel.2016.04.028
  • Schubert T, Pusch MC, Diermeier S, et al. Df31 protein and snoRNAs maintain accessible higher-order structures of chromatin. Mol Cell. 2012;48(3):434–444. doi: 10.1016/j.molcel.2012.08.021
  • Bell JC, Jukam D, Teran NA, et al. Chromatin-associated RNA sequencing (ChAR-seq) maps genome-wide RNA-to-DNA contacts. eLife. 2018;7:e27024. doi: 10.7554/eLife.27024
  • Meng Y, Yi X, Li X, et al. The non-coding RNA composition of the mitotic chromosome by 5′-tag sequencing. Nucleic Acids Res. 2016;44(10):4934. doi: 10.1093/nar/gkw195
  • Han C, Sun LY, Luo XQ, et al. Chromatin-associated orphan snoRNA regulates DNA damage-mediated differentiation via a non-canonical complex. Cell Rep. 2022;38(13):110421. doi: 10.1016/j.celrep.2022.110421
  • Sledziowska M, Winczura K, Jones M, et al. Non-coding RNAs associated with Prader–Willi syndrome regulate transcription of neurodevelopmental genes in human induced pluripotent stem cells. Hum Mol Genet. 2023;32(4):608–620. doi: 10.1093/hmg/ddac228
  • Petri R, Brattås PL, Sharma Y, et al. LINE-2 transposable elements are a source of functional human microRNAs and target sites. PLoS Genet. 2019;15(3):e1008036. doi: 10.1371/journal.pgen.1008036
  • Smalheiser NR, Torvik VI. Mammalian microRNAs derived from genomic repeats. Trends Genet. 2005;21(6):322–326. doi: 10.1016/j.tig.2005.04.008
  • Spengler RM, Oakley CK, Davidson BL. Functional microRNAs and target sites are created by lineage-specific transposition. Hum Mol Genet. 2014;23(7):1783–1793. doi: 10.1093/hmg/ddt569
  • Ohno S. Evolution by gene duplication. New York: Springer-Verlag; 1970. p. 160.
  • Baldini L, Robert A, Charpentier B, et al. Phylogenetic and molecular analyses identify SNORD116 targets involved in the Prader–Willi syndrome. Mol Biol Evol. 2022;39(1):msab348. doi: 10.1093/molbev/msab348
  • Kocher MA, Huang FW, Le E, et al. Snord116 post-transcriptionally increases Nhlh2 mRNA stability: implications for human Prader-Willi syndrome. Hum Mol Genet. 2021;30(12):1101–1110. doi: 10.1093/hmg/ddab103
  • Constância M, Kelsey G, Reik W. Resourceful imprinting. Nature. 2004;432(7013):53–57. doi: 10.1038/432053a
  • Tucci V, Isles AR, Kelsey G, et al. Genomic imprinting and physiological processes in mammals. Cell. 2019;176(5):952–965. doi: 10.1016/j.cell.2019.01.043
  • Cavaillé J, Vitali P, Basyuk E, et al. A novel brain-specific box C/D small nucleolar RNA processed from tandemly repeated introns of a noncoding RNA gene in rats. J Biol Chem. 2001;276(28):26374–26383. doi: 10.1074/jbc.M103544200
  • Guibert M, Marty-Capelle H, Robert A, et al. Coordinated evolution of the SNORD115 and SNORD116 tandem repeats at the imprinted Prader–Willi/Angelman locus. Nucleic Acids Res Mol Med. 2023;1(1). doi: 10.1093/narmme/ugad003
  • Ohta T. Gene conversion and evolution of gene families: an overview. Genes (Basel). 2010;1(3):349. doi: 10.3390/genes1030349
  • Labialle S, Cavaillé J. Do repeated arrays of regulatory small-RNA genes elicit genomic imprinting? BioEssays. 2011;33(8):565–573. doi: 10.1002/bies.201100032
  • Ondičová M, Oakey RJ, Walsh CP, et al. Is imprinting the result of “friendly fire” by the host defense system? PLoS Genet. 2020;16(4):e1008599. doi: 10.1371/journal.pgen.1008599
  • Barlow DP. Methylation and imprinting: from host defense to gene regulation? Science. 1993;260(5106):309–310. doi: 10.1126/science.8469984
  • Wang Q, Chow J, Hong J, et al. Recent acquisition of imprinting at the rodent Sfmbt2 locus correlates with insertion of a large block of miRNAs. BMC Genomics. 2011;12(1):1–11. doi: 10.1186/1471-2164-12-204
  • Noguer-Dance M, Abu-Amero S, Al-Khtib M, et al. The primate-specific microRNA gene cluster (C19MC) is imprinted in the placenta. Hum Mol Genet. 2010;19(18):3566–3582. doi: 10.1093/hmg/ddq272
  • Kuzmin A, Han Z, Golding MC, et al. The PcG gene Sfmbt2 is paternally expressed in extraembryonic tissues. Gene Expr Patterns. 2008;8(2):107–116. doi: 10.1016/j.modgep.2007.09.005
  • Keshavarz M, Savriama Y, Refki P, et al. Expression of concern: natural copy number variation of tandemly repeated regulatory SNORD RNAs leads to individual phenotypic differences in mice. Mol Ecol. 2021;30(19):4708–4722. doi: 10.1111/mec.16076
  • Nottingham RM, Wu DC, Qin Y, et al. RNA-seq of human reference RNA samples using a thermostable group II intron reverse transcriptase. RNA. 2016;22(4):597–613. doi: 10.1261/rna.055558.115
  • Luo QJ, Zhang J, Li P, et al. RNA structure probing reveals the structural basis of Dicer binding and cleavage. Nat Commun. 2021;12(1):1–12. doi: 10.1038/s41467-021-23607-w
  • Aw JGA, Lim SW, Wang JX, et al. Determination of isoform-specific RNA structure with nanopore long reads. Nat Biotechnol. 2020;39(3):336–346. doi: 10.1038/s41587-020-0712-z