Contrary To Conventional Wisdom, Some Virusoids Do Produce Proteins

(Posted on Saturday, April 22, 2023)

It’s axiomatic in reviews of virusoids that they are small circular RNAs that do not encode any proteins. Studies with the virusoid known as small circular satellite of rice yellow mottle virus (scRYMV) challenge this notion: not only does the virusoid produce proteins from its positive (+) strand RNA, it may also produce proteins from its negative (-) strand RNA, which is used as a template during replication. This opens the much broader possibility that other virusoids produce peptides and proteins of varying descriptions. As we shall soon see, that is certainly true of the virusoid most directly involved in human disease, the hepatitis delta virusoid.

Viroids & Virusoids: What are they?

Viroids and virusoids are some of the smallest transmissible pathogens discovered thus far. They are tiny strands of circular RNA roughly between 220 and 450 nucleotides long — for reference, SARS-CoV-2 clocks in at 30,000 nucleotides. What sets the two apart is the fact that virusoids require a helper virus to infect and replicate, whereas viroids do not. Helper viruses also provide virusoids with a protein coat, which helps keep them safe and facilitates entry into host cells. Some viroids and most virusoids have the unusual feature that their RNA includes a section that has catalytic activity, known as a ribozyme. Essentially, it is RNA that, instead of encoding a protein to do a job, does that job itself. In the case of viroids and virusoids, the ribozymes help cleave and occasionally rejoin the circular RNA genomes during replication.

Discovery of Open Reading Frames

When Abouhaidar et al. first discovered and described the virusoid associated with rice yellow mottle virus, they noticed that it contained various possible open reading frames (ORFs) in its genetic sequence. Open reading frames are segments of RNA that have the potential to be translated into proteins. They are flanked on one side by a “start codon” and on the other end by a “stop codon”. These codons, each made up of three nucleotides, help tell the ribosome —the protein-producing machinery in cells— where to begin and where to end. The RNA between the two codons is turned into a protein.

Virusoids replicate via rolling-circle replication (Figure 1). Replication of this kind involves the formation of a long chain of negative (-) strand RNAs, known as a concatemer. Each negative-strand RNA is a mirror image of the original, positive-strand genome. The long chain of negative-strand RNAs is then cut up into unit-length segments, either by the ribozymes contained within the virusoid RNA or by host and viral polymerases. Finally, the negative strands are then used as templates for the production of new positive (+) strands, completing replication.

Abouhaidar and his colleagues found that the (-) strand contains three small open reading frames. The (+) strand also contains two or even three, totally overlapping open reading frames. Curiously, the start codon in the positive strand, made up of the three nucleotides AUG, is embedded right between two stop codons: UGAUGA. Changing reading frames switches the sequence from a start codon to back-to-back stop codons: UGA UGA.

Another peculiarity of virusoids is that they lack clear directionality. Most protein-encoding RNA is linear, with two distinct sides: a five-prime end and a three-prime end. Ribosomes orient themselves by binding to the five-prime end and scanning the RNA until they hit a start codon, at which point they begin translation. Translation comes to an end once they hit a stop codon. Since virusoids are circular, their two ends are fused together, making the usual approach impossible for ribosomes. Even if they occasionally take on a linear form during replication, they still do not have a five-prime and three-prime end. Instead, the small circular satellite RNA of rice yellow mottle virus likely has an internal ribosome entry site (IRES) — think of this as an artificial docking station that grants the ribosome access to the circular genome. This is not without precedent, as many RNA viruses make use of internal ribosome entry sites (IRES) to initiate translation independently of a five-prime cap, skipping over the scanning step (Figure 2).

FIGURE 1. “Mechanism proposed for replication of viroids, viroid-like satellite rNAs and HDv. The asymmetric pathway with one rolling-circle (lower) is used by viroids of the family Pospiviroidae and by some viroid-like satellite rNAs, while the symmetric pathway with two rolling-circles (upper) is used by the other viroid-like satellite rNAs, viroids of the family Avsunviroidae and HDv. Solid and open lines refer to plus (+) and minus (-) polarities, respectively and processing sites are marked by arrowheads. Cleavage is alternatively mediated by ribozymes (rZ) or host enzymes (He), generating linear monomeric rNAs with characteristic termini that are subsequently ligated by host enzymes or autocatalytically.” SOURCE: FLORES ET AL. 2011, https://doi.org/10.4161/rna.8.2.14238

FIGURE 2. Schematic diagram of cap-dependent initiation of translation (A) versus IRES-dependent initiation of translation (B). SOURCE: ACCESS Health International

Are the Reading Frames Functional?

Simply because a sequence contains a potential reading frame does not necessarily mean it actively encodes any proteins. To find out, Abouhaidar et al. generated a linear copy of the putative (+) strand open reading frame by joining three strands of the unit-length RNA head-to-tail, thus mimicking the overlapping reading frames present in the virusoid.

The RNA construct was then inserted into an Escherichia coli (E. Coli) cell. If the reading frame were operational, you would expect the ribosomes in the E. Coli cell to translate the RNA sequence and produce a corresponding protein. Indeed, a 19-kilodalton (kDA) protein was made. The amino-acid sequence of this 19-kDA protein mostly matched the predicted amino-acid sequence of the virusoid RNA.

Next, the 19-kDa protein was used to raise specific antibodies that would recognize and bind to the protein whenever it was present. The researchers then in vitro translated the RNA construct and used the antibodies to double check whether the protein had been synthesized. The antibodies revealed two distinct proteins, one weighing 19 kDa and the other weighing 16 kDa. Based on the amino-acid sequence of the 16-kDa protein, translation must have started at the initiation-termination site of the first virusoid RNA monomer and must have ended at the initiation-termination site of the last virusoid RNA monomer (Figure 3). Removing the last virusoid RNA monomer led to the loss of the 16-kDA protein but did not interfere with the synthesis of the 19-kDa protein, which began initiation outside of the first virusoid RNA monomer.

FIGURE 3. Simplified diagram of the RNA construct; scRYMV = Small Circular Satellite RNA of Rice Yellow Mottle Virus. SOURCE: ACCESS Health International (Adapted from “Novel coding, translation, and gene expression of a replicating covalently closed circular RNA of 220 nt”, Abouhaidar et al. 2014)

The next step was to determine whether the 16-kDa protein was made in plants infected with the virusoid. To this end, total viral RNA was extracted from the plants, translated, and then exposed to the virusoid-specific antibodies raised in earlier experiments. The same 16-kDa protein was present (Figure 4). Supplementing total viral RNA with additional purified circular RNA generated a much stronger signal of the 16-kDa protein. In contrast, total RNA extracted from healthy plants did not generate any such proteins.

FIGURE 4. In vitro translation reaction depicting the 16-kDa protein from the scRYMV circular RNA (purified from denaturing polyacrylamide gels) is shown in lane 3, whereas that of the reaction from total RNA of RYMV-infected rice is shown in lane 2. Lane 5 demonstrates the 16-kDa product from reaction using total viral RNA extracted from RYMV virus particles, and lane 4 shows the enhanced 16-kDa signal from reaction using RYMV total viral RNA but supplemented with the same amount of gel-purified circular RNA as that used in the reaction of lane 3. Lanes 1 and 6 represent negative control reactions using healthy rice and the endogenous empty lysate, respectively. Arrows on right depict the position of molecular weight markers 25, 16, and 14 kDa from top to bottom, respectively. SOURCE: Abouhaidar et al. 2014

The 16-kDa protein produced by the small circular satellite RNA of rice yellow mottle virus requires a ribosome to loop around its RNA twice. The reason the ribosome doesn’t stop after the first loop is because the virusoid RNA is 220 nucleotides long, which is not a multiple of three. Since each codon is made up of three nucleotides, after one full loop, instead of hitting the stop codon, the ribosome slips into a new reading frame and continues on, now off by one nucleotide compared to the first loop. This one nucleotide difference in reading means new amino acids are synthesized during the second loop (Figure 5). It also means the ribosome now hits one of the stop codons in the initiation-termination sequence, which it missed the first time around (Figure 6).

FIGURE 5. Example of the impact a ribosomal frameshift can have on a nucleotide sequence. SOURCE: WIKIPEDIA

Along with the 16-kDa protein, translation of total viral RNA from infected plants also yielded larger proteins: 18 kDa, 23 kDa, 32 kDa, and even 39 kDa. The researchers suspect these are “read-through” proteins, formed when the ribosome does not stop at the initiation-termination codon after the second loop, but instead keeps going around the circular RNA of the virusoid, shifts into a different reading frame, and eventually hits another stop codon. Sometimes even these secondary or tertiary stop codons may prove to be “leaky”, in which case the ribosome continues on until it hits the next reading frame and continues translation. For example, the 32-kDa protein seemed to be a perfect dimer of the 16-kDa protein.

FIGURE 6. Schematic representation of small circular satellite RNA of rice yellow mottle virus (scRYMV) translation. Two loops (red spiral) marks the production of the 16-kDa protein. Two loops plus 18 additional amino acids marks the production of the 18-kDa read-through protein (red + green). Three full loops marks the production of the 23-kDa protein (red + green +blue). Full amino-acid sequence: MSQEELGGTQEFHPGRPGRLGAVHRASLWSEPGLQGAWRRNRSVGTTRTISHRAPAASSCAGGGDFVSSLTDTDE – RGTWRHPGISPGSTWAARSRAQGVAVERAWPPRGLEAKPVCWDHSDHQSSCSGSFQLRRGRRFCFEPYRH – AKRNLEARPNFTRVDLGG – EPCTGRRCGASLASKGPGGETGLLGPLGPSVIVLRQLPAAQGAEILFRALPT SOURCE: ACCESS Health International (Adapted from Abouhaidar et al. 2014)

As mentioned above, the negative (-) strand of the virusoid RNA also contains open reading frames. Whereas the positive (+) strand only has one clear initiation codon across all three reading frames, the negative strand contains two possible start codons, one in the first reading frame and one in the third reading frame. Although the researchers did not address the negative strand, there is no clear reason why it shouldn’t also produce functional proteins. All the necessary components are in place. Barring experimental data, we cannot draw any firm conclusions, but it is possible that the negative strand encodes a long, 72-amino-acid sequence. This sequence may begin in the third reading frame before frameshifting into the second sequence and eventually hitting a downstream stop codon (Figure 7).

Possible open reading frame in negative strand of scRYMV genome

FIGURE 7. Schematic representation of possible open reading frame contained in the negative strand of the small circular satellite RNA of rice yellow mottle virus (scRYMV). The ‘X’ would be E, A V, or G depending on the nucleotide that is added during the frameshift. SOURCE: ACCESS Health International.

Implications

Here, we described the proteins and extraordinary coding capacity of a virusoid known as small circular satellite RNA of ring yellow mottle virus (scRYMV). This opens the much broader possibility that other virusoids produce peptides and proteins of varying descriptions. As we shall soon see, that is certainly true of the virusoid most directly involved in human disease, the hepatitis delta virusoid.

Back to Writings

Read Dr. Haseltine's latest piece with