Elsevier

Methods

Volume 103, 1 July 2016, Pages 99-119
The RNA 3D Motif Atlas: Computational methods for extraction, organization and evaluation of RNA motifs

https://doi.org/10.1016/j.ymeth.2016.04.025

Highlights

  • Up to 40% of nucleotides in structured RNAs are in hairpin, internal and junction loops.

  • Loops are structured by non-Watson-Crick base-pairing and -stacking interactions.

  • RNA 3D motifs found in loops can be extracted and clustered into motif groups.

  • Many RNA 3D motif geometries are modular and recurrent with similar non-WC pairing.

  • Corresponding motifs in homologous structures frequently conserve 3D structure.

Abstract

RNA 3D motifs occupy places in structured RNA molecules that correspond to the hairpin, internal and multi-helix junction “loops” of their secondary structure representations. As many as 40% of the nucleotides of an RNA molecule can belong to these structural elements, which are distinct from the regular double helical regions formed by contiguous AU, GC, and GU Watson-Crick basepairs. With the large number of atomic- or near atomic-resolution 3D structures appearing in a steady stream in the PDB/NDB structure databases, the automated identification, extraction, comparison, clustering and visualization of these structural elements presents an opportunity to enhance RNA science. Three broad applications are: (1) identification of modular, autonomous structural units for RNA nanotechnology, nanobiology and synthetic biology applications; (2) bioinformatic analysis to improve RNA 3D structure prediction from sequence; and (3) creation of searchable databases for exploring the binding specificities, structural flexibility, and dynamics of these RNA elements. In this contribution, we review methods developed for computational extraction of hairpin and internal loop motifs from a non-redundant set of high-quality RNA 3D structures. We provide a statistical summary of the extracted hairpin and internal loop motifs in the most recent version of the RNA 3D Motif Atlas. We also explore the reliability and accuracy of the extraction process by examining its performance in clustering recurrent motifs from homologous ribosomal RNA (rRNA) structures. We conclude with a summary of remaining challenges, especially with regard to extraction of multi-helix junction motifs.

Introduction

This contribution concerns the computational extraction, analysis, and organization of RNA 3D motifs. In this introductory section, we define the different types of 3D motifs we observe in atomic-resolution RNA structures, discuss their properties and functions, and identify those that are amenable to current methods for extraction and clustering. Then we discuss, with reference to the wider goals of RNA bioinformatics, some reasons for systematically analyzing atomic resolution RNA 3D structures to identify, extract, and cluster 3D motifs, including construction of computational tools for RNA structure prediction and analysis. In the Materials and Methods section we discuss the selection of a target set of reliable, non-redundant (NR) RNA 3D structure files for analysis. We also provide computational details of methods currently used to build and maintain the RNA 3D Motif Atlas, see http://rna.bgsu.edu/rna3dhub/motifs [1]. In the Theory section, we provide the conceptual framework used to annotate, classify and cluster RNA motifs into coherent groups intended for downstream bioinformatic analysis. We begin the Results section by reviewing the current content of the RNA 3D Motif Atlas. Then we assess how well the current implementation of the computational pipeline organizes RNA 3D motif instances by tracking the clustering of motif instances from corresponding positions of homologous ribosomal RNA (rRNA) 3D structures from different organisms. We conclude with a summary of outstanding issues in extraction and classification of hairpin loops (HL) and internal loops (IL) and challenges in extending the 3D Motif Atlas to linker regions (defined below) and multi-helix junction (MHJ) loops.

Other workers have developed similar methods to identify, extract, and cluster RNA 3D motifs and websites to make them available in searchable formats [2], [3], [4], [5], [6]. This contribution is not meant as a comprehensive comparison of all the available methods, but as an attempt to provide detailed explanation of our own approach, as well as an extensive discussion of its limitations and the opportunities for future work in the field.

We use the term “RNA 3D motif” to refer to modular arrangements in 3D space of mutually interacting RNA nucleotides localized within the secondary structure, that is, nucleotides delimited by a set of mutually nested AU, GC, or GU cis Watson-Crick (WC) basepairs [7]. For our purposes, the secondary structure separates the nucleotides of the linear sequence into two disjoint classes, those that form the secondary structure, per se, and all the rest. The former comprise the WC-paired helices. The latter constitute the so-called “loops” and “linker segments” of RNA chains. These are the nucleotides that may form 3D motifs. Some ambiguity, however, remains regarding those nucleotides that form “isolated” WC pairs that occur within or between 3D motifs and which are not stacked contiguously on other WC pairs, on at least one side. It is not a simple, easily codified matter to decide which of these isolated pairs should be assigned to the secondary structure and which are best considered elements of the 3D motifs that surround them.

Strictly speaking, the Watson-Crick (WC) paired helices composing the secondary structure are also 3D motifs. While RNA helices are quite rigid, they nonetheless exhibit sequence- and context-specific structural variation. As this has been studied extensively elsewhere [8], [9], [10], [11] we do not further consider RNA helices in this contribution.

The nucleotides forming the secondary structure generally comprise 60–70% of the nucleotides of structured RNAs. For example, just 60% form the secondary structure of 16S rRNA, while the remaining 40%, a significant fraction, constitute the loops and linkers [12]. In 2D representations of structured RNAs, these nucleotides are generally displayed as unstructured “loops,” separated by Watson-Crick paired helical elements, or as single-stranded “linkers,” joining distinct domains of the 2D structure. Such schematic representations seem to imply that these regions of the RNA are loosely structured or devoid of interactions. However, now that we have atomic-resolution structures for many structured RNAs, including the ribosomal RNAs (rRNA), we know that most loop regions are, in fact, structured by networks of non-Watson-Crick (non-WC) base-pairing, base-on-base stacking and base-to-backbone interactions [13]. We also verify that there is, in general, though not always, a one-to-one correspondence between individual “loops” in the 2D structures and modular 3D motifs.

Nucleotides forming “loops” are bordered on all sides by helical elements and so the corresponding 3D motifs can be classified topologically by the number of flanking Watson-Crick pairs. (The reader should note this does not apply to nucleotides in linker segments). The simplest loops are “terminal” or hairpin loops (HL), flanked by just one Watson-Crick (WC) pair, where the RNA chain folds back on itself. Internal loops (IL) are embedded between two helical elements and are flanked on two sides by Watson-Crick pairs. They comprise two distinct segments of the RNA chain.

Multi-helix junction (MHJ) loops are formed from three or more independent, interacting chain segments and are flanked by an equal number of WC pairs. MHJ are further classified, at the 2D level, according to how many chain segments and flanking pairs they comprise. The simplest and most common are three-way junctions (3WJ), followed by four-way junctions (4WJ). In biological RNA molecules, MHJ loops of rank five to ten are observed [14], [15]. Thus, MHJ loops are classified topologically according to the number of helical elements. However, this is a 2D classification. In fact, each topological category of MHJ comprises many different 3D motifs, differing among themselves in the arrangement in 3D space of the helices radiating from the junction [16]. The 3D arrangement of the helices is determined by detailed interactions between nucleotides at the junction, as well as distant tertiary interactions that orient and anchor the helices radiating from MHJ loops in 3D space. The relevance of this fact for RNA function became evident in studies of the mechanism of the self-cleaving hammerhead ribozyme, an RNA enzyme having a three-way junction (3WJ) at its active site. Long-range interactions between two of the helices far from the 3WJ site proved crucial for achieving the correct structure at the active site to make RNA catalysis possible [17], [18]. Many outstanding issues remain regarding extraction and clustering of MHJ (see Sections 4 and 6).
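The topological classification described above can be illustrated with a minimal sketch. It assumes the secondary structure is supplied as a nested dot-bracket string, which is a simplification; the Motif Atlas itself works from basepairs annotated in 3D structures. The function names `pair_table`, `classify`, and `find_loops` are hypothetical and are introduced here only for illustration.

```python
def pair_table(dot_bracket):
    """Map each paired position to its partner (0-based indices)."""
    stack, partner = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def classify(n_flanking_pairs):
    """Topological label from the number of flanking Watson-Crick pairs."""
    if n_flanking_pairs == 1:
        return "HL"   # hairpin loop
    if n_flanking_pairs == 2:
        return "IL"   # internal loop
    return f"{n_flanking_pairs}WJ"  # multi-helix junction, e.g. 3WJ, 4WJ


def find_loops(dot_bracket):
    """Return (closing_i, closing_j, label, n_unpaired) for every loop."""
    partner = pair_table(dot_bracket)
    loops = []
    for i in sorted(partner):
        j = partner[i]
        if j < i:
            continue  # visit each pair once, from its 5' nucleotide
        branches, unpaired, k = 0, 0, i + 1
        while k < j:
            if k in partner:
                branches += 1
                k = partner[k] + 1  # skip the helix that branches off here
            else:
                unpaired += 1
                k += 1
        if branches == 1 and unpaired == 0:
            continue  # pair stacked directly on the next pair: helix interior
        loops.append((i, j, classify(branches + 1), unpaired))
    return loops


# Toy structure: one three-way junction closing two hairpins.
example = "((..((...))..((...))..))"
for loop in find_loops(example):
    print(loop)
```

The sketch counts only nested pairs, so it reproduces the 2D classification and says nothing about the 3D arrangement of the helices, which is the harder problem for MHJ loops.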

For our purposes, RNA 3D motifs are collections of nucleotides that form dense networks of interactions, what mathematicians call connected graphs. As such, they form modular structural units that are distinct and self-contained, in the sense that interactions the motif forms with other molecules or parts of the same RNA depend on, or are contingent upon, the correct formation of interactions internal to the motif. A significant fraction of nucleotide interactions in modular motifs are “internal”, that is, they occur between nucleotides of the motif.
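As a rough illustration of this graph view, the following sketch tests whether a candidate set of nucleotides forms a connected interaction graph and computes the fraction of its interactions that are internal. The nucleotide labels and the interaction list are invented toy data, and the helper names are hypothetical; no particular threshold on the internal fraction is implied by the text.

```python
from collections import defaultdict


def is_connected(nodes, edges):
    """True if the nodes form one connected graph using only internal edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in nodes and b in nodes:
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first traversal from an arbitrary nucleotide
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def internal_fraction(nodes, edges):
    """Fraction of interactions touching the motif that stay inside it."""
    nodes = set(nodes)
    touching = [(a, b) for a, b in edges if a in nodes or b in nodes]
    if not touching:
        return 0.0
    internal = [(a, b) for a, b in touching if a in nodes and b in nodes]
    return len(internal) / len(touching)


# Toy data: four loop nucleotides and one contact to a distant nucleotide.
motif = {"G1", "A2", "A3", "A4"}
interactions = [("G1", "A4"), ("A2", "A3"), ("A3", "A4"), ("A4", "U20")]
print(is_connected(motif, interactions), internal_fraction(motif, interactions))
```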

RNA 3D motifs may also be “recurrent.” These are motifs that are structurally similar, as defined below, and that occur in diverse contexts, not only in corresponding positions of homologous RNA molecules. “Diverse contexts” means different locations within a single RNA molecule as well as occurrences in molecules unrelated by homology. For example, the Sarcin/Ricin motif in loop E of 5S rRNA in H. marismortui occurs five times in the LSU rRNA of the same organism, cf. Section 4.1.2, and in other non-ribosomal RNAs. Recurrent motifs can vary in sequence but conserve the 3D structure and the types of interactions among comprising nucleotides. Many recurrent hairpin loops (HLs) and internal loops (ILs) are known, but far fewer MHJ loops appear to be recurrent.

Continuing with the example of 5S rRNA, we note that all ribosomes except some mitochondrial ones contain this molecule, which is found in the central protuberance of the Large Subunit (LSU) on the side facing the Small Subunit (SSU). A recurrent IL called “loop E” in 5S rRNA interacts with a conserved IL in Helix 38 (H38), the “A-site finger” of the LSU rRNA that extends across the inter-subunit interface to contact the SSU. Within bacteria, loop E is highly conserved (see Fig. 1) but in archaea and eukarya, it is substituted by a distinct but related 3D motif that has the same structure as the Sarcin/Ricin (S/R) motif, first identified in the Factor-binding site of the LSU rRNA [19].

We provide a list of some common recurrent RNA 3D motifs that we have identified in the RNA 3D Motif Atlas in Table 1. Some of these are very familiar to RNA scientists and include GNRA and UNCG “tetraloops,” Anti-codon and TPsiC HL from tRNA, Sarcin/Ricin (S/R) and the related 5S rRNA “loop E” IL, kink turn motifs, C-loops, and the “11-nucleotide” GAAA loop receptor motif. Others do not have established names, but are found to be highly recurrent by the clustering algorithm. These are represented by their characteristic interactions, as noted by names like “tandem sheared pair,” indicating an IL motif comprising oppositely directed and stacked “sheared” (i.e. trans Hoogsteen-Sugar Edge) basepairs. Some recurrent motifs are represented by more than one motif group in the RNA 3D Motif Atlas, due to small structural differences, usually near the flanking basepairs of some motif instances, that are sufficient to trigger generation of new groups by the clustering procedure. For the most recurrent motifs, instances are found in a great diversity of RNA structures. Table 1 provides links to the Motif Atlas as well as schematic diagrams of exemplar instances.

Instances of the same RNA 3D motif can vary in sequence. A major motivation for extracting and organizing motifs by structural similarity is to document the range of sequence variation observed for each motif group, so as to define an empirical “sequence signature” for the motif. The sequence variation observed among 3D instances assigned to the same motif group can be augmented with sequence data from corresponding sites in homologous RNA multiple sequence alignments, although this needs to be done with care to ensure that the 3D structures of the motifs are conserved throughout the alignment. Such data inform probabilistic methods designed to predict 3D structures of RNA motifs from sequence [20]. Fig. 1 shows annotated basepair diagrams for six instances of loop E from 5S rRNA in bacteria, archaea, and eukarya. These show that all 5S loop E motifs are structurally related, with the bacterial ones all forming the same types of basepairs, as indicated by the basepairing symbols, which are explained in Section 3. Therefore, the bacterial loop E motifs all belong to the same motif group. Fig. 1 also shows that the archaeal and eukaryal loop E motifs form the same basepairs, but some of these pairs are different from those in the bacterial motif, showing that loop E motifs fall into two distinct groups. Within each group, Fig. 1 further illustrates that there are base substitutions, but that these preserve the basepairing type. Moreover, these substitutions are isosteric, as discussed below.
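One simple way to summarize the sequence variation within a motif group is to tabulate the bases observed at each position. The sketch below does this under a strong simplifying assumption, namely that all instances in the group have already been aligned to the same length; real motif groups can contain instances with different numbers of nucleotides. The function name `sequence_signature` is hypothetical, and the four sequences are illustrative strings fitting the GNRA pattern, not instances taken from the Atlas.

```python
from collections import Counter


def sequence_signature(instances):
    """Count the bases observed at each aligned position of a motif group."""
    lengths = {len(sequence) for sequence in instances}
    if len(lengths) != 1:
        raise ValueError("instances must be non-empty and aligned to equal length")
    # zip(*instances) walks the alignment column by column.
    return [Counter(column) for column in zip(*instances)]


# Illustrative GNRA-like loop sequences (not taken from the Motif Atlas).
group = ["GAAA", "GCAA", "GUGA", "GAGA"]
for position, counts in enumerate(sequence_signature(group), start=1):
    print(position, dict(counts))
```

Counts of this kind are only a starting point; whether an observed substitution is isosteric has to be judged from the basepair type at that position, not from the counts alone.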

RNA 3D motifs may also be “autonomous,” by which we mean RNA sequences that fold into their functional 3D structures independently of, or prior to, interactions with other structural elements or molecules. There is evidence from MD simulation and biophysical studies that some RNA 3D motifs are highly autonomous [21], [22], [23]. These motifs form sufficient numbers of stabilizing interactions among their nucleotides to assume essentially the same structure regardless of the context in which they are found. A good example in this respect is loop E in helix 4 of 5S rRNA. In bacterial 5S rRNA, this loop is a highly symmetrical loop consisting of seven stacked non-WC basepairs, as shown in Fig. 1. In archaeal and eukaryal 5S rRNA this loop takes the form of a Sarcin/Ricin (S/R) motif, an asymmetric motif in which each base forms at least one non-Watson-Crick basepair and three bases form a base triple (cf. Fig. 1). MD simulations and thermodynamic studies have shown that each of these motifs is unusually stable even in the absence of its interacting partners [22], [23], [24], [25].

Recurrent motifs tend to be autonomous, as apparently is the case for 5S loop E, but this is not always the case, especially for larger motifs, some of which require folding by induced fit to assume their functional forms. Such appears to be the case for the GAAA loop receptor (the so-called “11-nt motif”), which changes structure upon binding its cognate GAAA HL [26].

Other RNA motifs are conformationally flexible; their 3D structures change in response to changes in their environment, as an integral part of their function. The classic example is the IL in helix 44 of 16S rRNA, which functions to “decode” the codon/anti-codon interaction between mRNA and the incoming tRNA. This IL motif comprises two adjacent, unpaired adenosines (T. thermophilus 16S nucleotides A1492 and A1493) that are tucked inside helix 44 in the absence of tRNA but swing out when tRNA is bound to the A-site of the SSU, to interact with the first and second base-pairs of the codon/AC mini-helix. (See http://rna.bgsu.edu/rna3dhub/loops/view/IL_1J5E_056 for the “tucked in” conformation and http://rna.bgsu.edu/rna3dhub/loops/view/IL_1FJG_057 for the “swung out” conformation.) The interactions they form, and therefore the conformation of the decoding loop, depend on whether the bound tRNA is cognate, near-cognate, or non-cognate to the A-site codon presented by the mRNA, as documented by a series of ribosome structures [27], [28], [29]. When the interaction is cognate, the two bulged As form ideal “A-minor” interactions, i.e. Sugar-Edge basepairs with the mRNA/tRNA BPs. In this case, the conformation assumed by the motif is transduced into a signal to the large subunit to stimulate the elongation factor EF-Tu to hydrolyze GTP and release the amino-acyl end of the tRNA into the LSU A-site, leading to peptide bond formation [30].

RNA 3D motifs are the primary loci of functional interactions in structured RNA molecules. By breaking up the linear monotony of the Watson-Crick double helix, they also give RNA a structural complexity that rivals that of proteins. Functions of individual motifs include the following: 1) to specifically bind small molecule ligands, proteins, or other RNAs; 2) to mediate tertiary interactions that allow RNA molecules to fold compactly; 3) to play architectural roles; 4) to provide nucleation sites to guide RNA folding; and 5) to create structural complexity by introducing branching points in the secondary structure. These are not exclusive roles: instances of the same motif can play multiple roles simultaneously, depending on the context in which they occur.

The function of many RNA 3D motifs is to mediate long-range tertiary interactions within the same RNA, with other RNA molecules, or with proteins or small molecules. The GNRA and TPsiC hairpin loops are examples of motifs that form tertiary interactions almost everywhere they are observed. GNRA HL present three stacked bases that interact in the minor grooves of target helices [31]. TPsiC or “T-loop” HL present intercalation sites for purines “bulged out” of other RNA motifs [32].

Some 3D motifs mediate interactions with other RNA molecules or with proteins. For example, several motifs in SSU, including the decoding site IL mentioned above, interact with mRNA and tRNA. Most RNA-protein interactions in 16S involve nucleotides found in loops: 60% of nucleotide-amino acid interactions in E. coli 16S rRNA involve loop nucleotides, even though these constitute just 42% of all 16S nucleotides [12].

Other RNA 3D motifs appear to primarily play architectural roles. These include the C-loops, which increase the helical twist of the RNA helix in which they are embedded [33], [34], [35], and the Kink-turns, which introduce a sharp bend or kink into helices in which they are found [36]. There is evidence from structure comparisons and MD simulations that kink-turns also function as hinges [37], [38].

Finally, some motifs appear to play primarily stabilizing roles that guide RNA folding. For example, the very common UNCG hairpin loops appear to serve as nucleation sites for forming hairpin loop-stems because of their unusual thermodynamic stability [39], [40]. This stability has been factored into structure prediction algorithms, such as the mFOLD program [41], [42], [43], to improve computational folding and structure predictability.

An important question that motivates the study of RNA 3D motifs is to determine which motifs can structurally or functionally substitute for each other, and are therefore functionally interchangeable. Such motifs constitute alternative, functionally equivalent, and modular building blocks for RNA nanotechnology [44]. An important source of data is provided by 3D structures of homologous molecules. Geometries and interactions of corresponding 3D motifs from homologous molecules can be compared to identify interchangeable motifs. In this way, for example, it is found that at least two different 3D motifs correspond to the IL called “loop E” in 5S rRNA [25], [45], [46], [47]. The motif in bacterial and chloroplast 5S is distinct from the one found in archaeal and eukaryal 5S, which is identical in most cases to the Sarcin/Ricin of 23S rRNA (see Fig. 1). The ribosome structures show that in all cases, archaeal, bacterial and eukaryal, 5S rRNA loop E interacts with a conserved IL in the “A-site Finger,” helix 38 of LSU rRNA.

The motivation for this work in the wider context of understanding the role of RNA in living cells is the following: High throughput transcriptomic studies have shown that most of the DNA in eukaryal genomes (including human) is transcribed into RNA at some point in the life cycle of the organism, even though less than 2% actually codes for protein [48], [49]. Large numbers of new RNA molecules have been identified in these studies. However, the biological characterization of RNA continues to lag far behind genomic and transcriptomic identification of new RNA molecules. Evidence that many of these RNAs are likely to be functional is provided by the high temporal and spatial specificity of their transcription, especially in the brain [50], [51] and by sequence and structural conservation within or across phylogenetic groups. Moreover, given that the numbers, types and even sequences of proteins are highly conserved among mammals, and even among animals of all kinds, evidence is accumulating that evolutionary processes producing new animal species, for example the emergence of humans from the great ape lineage, may be driven in part by rapid RNA evolution [52], [53], [54]. Understanding the functions of new RNAs can be aided by predictions of their 2D and 3D structures. Methods for predicting 2D structures of RNAs are highly developed, although there is still room for improvement, but RNA 3D structure prediction, even starting from a reliable 2D structure, is still very challenging, as documented by the results of recent blind “RNA Puzzles” prediction competitions and reviews of the field [55], [56], [57]. We believe that careful study and comparison of the RNA 3D structures we already have, with each other and with aligned homologous sequences, can contribute to improving the methods of 3D RNA structure prediction.

Historically, experimental determination of RNA 3D structure has been time consuming and highly contingent on obtaining suitable crystals for diffraction. This is changing rapidly with the advent of high-resolution cryo-EM, which now achieves atomic resolution for the same large RNA-containing complexes in multiple functional states with distinct conformations [58], [59]. These advances promise a wealth of new structures to be analyzed and organized into accessible and useful formats.

Motifs commonly found in internal loops, hairpins, and some junctions are closed by Watson-Crick pairs and therefore are readily amenable to extraction and grouping, with only a few exceptions, as we describe in Section 3. However, not all interesting motifs are closed by Watson-Crick pairs. For example, in many large RNA molecules, structural domains are connected by single-stranded segments of the RNA chain. We call such segments “linkers.” For example the body domain of the SSU rRNA is connected to the head by a linker. Likewise, helix 44, which contains the decoding IL, is connected to the head domain by a linker. Moreover, the 3D structure reveals a large number of base-pairing and base-stacking interactions involving nucleotides in the linker and nearby helices, to form a highly structured “neck,” made of RNA, connecting these domains. The 3D motifs that constitute the neck in 16S are not identified or extracted by methods designed to extract HL, IL, or conventional MHJ [60]. New methods are needed to treat such motifs.

Long-range interactions represent another type of recurrent RNA 3D motif that is not extracted by methods targeting HL, IL, and MHJ loops. For example, GNRA HL, which form tertiary interactions almost everywhere that they occur, form recurrent motifs with their target receptors.

There do not appear to be many recurrent MHJ having a clearly defined consensus set of core nts and conserved inter-nucleotide interactions. Recent efforts to classify RNA 3WJ produced a small number of fairly broad classes characterized largely by differences in co-axial helical stacking at the junction [16]. Similar results were obtained for larger junctions [14]. New approaches will be needed to systematically identify and extract recurrent motifs formed by linkers, tertiary interactions, and higher-order MHJ.

Any new clustering procedure needs to be assessed. The ideal clustering procedure groups together instances that are sufficiently similar and separates those that differ sufficiently to require distinct groups. If too many groups are generated, this leads to a plethora of singletons (groups with only one instance), some of which belong with other instances. With too few groups, heterogeneous instances are included in some groups making it difficult to derive sequence signatures for motifs or to make meaningful statements about the geometric variability of the instances.
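A first, coarse diagnostic for over-splitting is the share of groups that are singletons. The sketch below computes it from a mapping of group identifiers to member instances. The identifiers are invented placeholders, the function name `singleton_stats` is hypothetical, and the text does not specify what singleton fraction should be considered too high.

```python
def singleton_stats(groups):
    """Return (number of singleton groups, fraction of groups that are singletons)."""
    sizes = [len(members) for members in groups.values()]
    singletons = sum(1 for size in sizes if size == 1)
    fraction = singletons / len(sizes) if sizes else 0.0
    return singletons, fraction


# Placeholder group and instance identifiers, for illustration only.
clustering = {
    "group_A": ["loop_1", "loop_2", "loop_3"],
    "group_B": ["loop_4"],
    "group_C": ["loop_5", "loop_6"],
    "group_D": ["loop_7"],
}
print(singleton_stats(clustering))
```

A high singleton fraction is not by itself proof of over-splitting, since some motifs are genuinely unique; it only flags groups worth inspecting manually.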

An excellent source of RNA 3D motif instances for assessing current clustering procedures is the set of ribosomal RNA (rRNA) structures. The rRNAs represent an ideal test case because they contain a large number of IL and HL and have been solved from a variety of organisms representing all major phylogenetic domains [61], [62], [63], [64], [65], [66]. In addition, the function of the ribosome has been extensively studied, and detailed knowledge is available regarding the functional roles of each of the helical elements of the SSU and LSU rRNAs, including protein, tRNA, and mRNA binding sites and loci of functional conformational flexibility. Finally, a large number of aligned sequences are available for both the SSU and LSU rRNA of all major phylogenetic domains, including chloroplast and mitochondria. As a whole, these data provide good indications regarding which IL and HL in the SSU and LSU rRNAs are likely to be conserved in 3D structure and for which phylogenetic domains. In the Results section we will illustrate the approach using the hairpin loops of bacterial SSU rRNA. A complete analysis for HL and IL of SSU and LSU across all phylogenetic domains will be presented elsewhere.

In previous work, we compared the structures of corresponding HL and IL in rRNA using R3D Align, an online web application we constructed to locally align the 3D structures of homologous RNA molecules [67]. We found a high degree of structure conservation of motifs at corresponding locations. Here we ask into which motif group of the RNA 3D Motif Atlas corresponding loop instances from the representative rRNA structures of the NR set (see below) have been placed. If corresponding instances are placed in the same motif group, that is a sign that their geometries are strongly conserved. Where they are placed in different groups, this is a sign of variability in the geometry, including the internal base-pairing of the motif instance. Sometimes, corresponding motifs are placed in different, but structurally related groups. By comparing the clustering results with the expected variation in structure we can assess the reliability of the clustering approach. We define a successful clustering as one that reproduces the known similarities and differences among corresponding motifs in homologous RNA molecules.
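The bookkeeping behind this assessment can be sketched as follows, assuming that for each corresponding loop position we already know the motif group assigned to the instance from each structure. The position labels, structure labels, and group names are invented placeholders, and the function name `assess_positions` is hypothetical. The sketch distinguishes only "same group" from "different groups"; recognizing that two different groups are structurally related would need additional information.

```python
def assess_positions(assignments):
    """Label each corresponding loop position by how its instances were clustered.

    assignments maps position -> {structure: motif_group}.
    """
    report = {}
    for position, by_structure in assignments.items():
        groups = set(by_structure.values())
        report[position] = "same group" if len(groups) == 1 else "different groups"
    return report


# Placeholder positions, structures, and group names, for illustration only.
assignments = {
    "hairpin_1": {"structure_X": "group_A", "structure_Y": "group_A"},
    "hairpin_2": {"structure_X": "group_B", "structure_Y": "group_C"},
}
for position, verdict in assess_positions(assignments).items():
    print(position, verdict)
```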

The PDB is now providing structure quality information at the nucleotide level to indicate how well the modeled residues of a macromolecular structure fit the experimental electron density. One measure is the Real-Space Refinement statistic (RSR) calculated from the difference between the experimental electron density and that calculated from the 3D model [68]. Values range from 0 to 1, with smaller values indicating a better match to the data. RSR values are provided as part of PDB’s validation pipeline for new structures and for older structures deposited with structure factors [69]. PDB also computes percentile rank scores which facilitate comparison of RSR values between different structures. Using these data to filter out poorly modeled loop instances will improve the quality of data included in any collection of RNA 3D motifs.
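A filter of the kind suggested here might look like the sketch below, which keeps a loop instance only if every one of its nucleotides has an RSR value at or below a cutoff. The cutoff of 0.3, the choice of the worst nucleotide as the criterion, the loop identifiers, and the RSR values are all assumptions made for illustration; the text does not prescribe a specific rule, and percentile ranks could be used instead of raw values.

```python
def filter_loops_by_rsr(loop_rsr, cutoff=0.3):
    """Keep loops whose worst per-nucleotide RSR does not exceed the cutoff.

    loop_rsr maps loop identifier -> list of per-nucleotide RSR values (0 to 1,
    smaller is better). Loops with no RSR data are dropped. The default cutoff
    is an illustrative assumption, not a recommended value.
    """
    return {
        loop_id: values
        for loop_id, values in loop_rsr.items()
        if values and max(values) <= cutoff
    }


# Invented loop identifiers and RSR values.
loops = {
    "loop_1": [0.08, 0.11, 0.10, 0.09],
    "loop_2": [0.12, 0.45, 0.20],   # one poorly fitting nucleotide
    "loop_3": [],                    # no validation data available
}
print(sorted(filter_loops_by_rsr(loops)))
```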

Section snippets

Sources and nature of atomic-resolution structural data

The Protein Data Bank (PDB) is the international, archival repository of experimental 3D structures of macromolecules of biological interest, including structures of RNA, DNA, and polysaccharide molecules in addition to proteins. As such, it contains entries for all author-deposited structures that meet its criteria for scientific originality and accuracy. The PDB therefore contains a large amount of information, much of which is redundant for purposes of identifying, classifying, and searching

Theory: Principles of RNA 3D motif analysis and classification

The RNA 3D Motif Atlas is designed to classify RNA 3D motifs according to 3D structural similarity. Consequently, some motif instances identical in sequence are assigned to different motif groups while other instances differing in sequence, or even in total number of nucleotides, are assigned to the same group. This is intentional and is based on many comparative observations of 3D structures of homologous RNA molecules. We see that loops with the same sequence can form different geometries in

Overview of RNA 3D Motif Atlas version 1.18

In this section we give brief summary statistics for version 1.18 of the RNA 3D Motif Atlas and we discuss some coherent, correctly-separated motif groups and a notable success dealing with instances containing “flipped” (i.e. anti-syn) bases. However, a careful inspection of the Motif Atlas shows some limitations in our clustering methodology. Notably, some groups contain individual instances which are geometrical outliers, relative to all other instances in the group. These instances should

“Isolated” or “embedded” cWW basepairs in HL and IL

Determining which “isolated” cis Watson-Crick (cWW) pairs should be considered integral parts of 3D motifs and which should be considered part of the secondary structure and thus used to split large motifs is a major problem for MHJ motifs as discussed in the next section, and also for some HL and IL motifs. For example, the large 15-nt HL in 5S rRNA called “loop C” is not correctly extracted by the current algorithm populating the RNA 3D Motif Atlas. The 3D structure is conserved in all 5S

Conclusions

Generally, the RNA 3D Motif Atlas performs a very robust clustering of motif instances. Its automated analysis produces results that agree with manual analysis, yet can be done in a fraction of the time and thus can be carried out on a regular basis. In addition, the Motif Atlas outputs various visualization pages and files which aid manual analysis, such as windows to view and superimpose the 3D coordinates of motif instances, alignments of motif instances with lists of interactions, lists of

Acknowledgements

Funding provided by the National Institutes of Health [2R01GM085328-05 to N.B.L. and C.L.Z.]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
