In addition to splicing, the final functional sequence and structure of RNA molecules in the cell may depend on some combination of the following processing reactions:
Deliberate and specific changes that are made to the sequence of mRNA after it is transcribed were discovered in the 1980s. These changes are quite widespread in some organelle transcripts (mitochondria of protozoans; plant chloroplasts) but are fairly rare in nuclear mRNA.
Changes that have been observed are:
- C changed to U (and vice versa)
- U inserted (can be multiple copies)
- U deleted (can be multiple copies)
The editing reactions probably require the active participation of special small RNA molecules called guide RNA (gRNA).
Some examples of RNA editing are:
Trypanosomes have a single large mitochondrion called a kinetoplast. The mRNAs synthesized in the kinetoplast can be extensively edited -- usually through the addition of U -- to increase their size by as much as 50%.
The mitochondrially encoded cytochrome oxidase II (coxII), cytochrome oxidase III (coxIII), NADH dehydrogenase (nadI) and cytochrome b (cytb) transcripts are edited in several species of "higher" plants.
Mammalian apolipoprotein B
Two forms of apolipoprotein B are found in mammalian serum. One (Apo-B100) is synthesized in hepatocytes (liver cells), the other (Apo-B48) is synthesized in intestinal epithelial cells. Both are transcribed from the same gene.
The gene consists of 29 exons. Exon 26 contains a CAA codon which is the target of RNA editing.
In hepatocytes, the full-length mRNA is translated and a 500 kDa protein is synthesized. In intestinal epithelial cells, however, the CAA codon in the mRNA is edited by a cytidine deaminase, which changes the codon from CAA to UAA, a stop codon. Translation halts at this stop codon and the resulting protein is only 240 kDa in size.
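The effect of this single C-to-U edit can be illustrated with a minimal sketch. The transcript below is a hypothetical toy sequence, not the real apolipoprotein B mRNA, and the codon table contains only the codons the toy sequence uses. The function names are illustrative assumptions.

```python
# Minimal codon table: only the codons used in the toy transcript below.
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "CAA": "Q", "GAU": "D", "UUU": "F",
    "UAA": "*",  # stop codon
}


def translate(mrna):
    """Translate an mRNA from its first codon until a stop codon or the end."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def deaminate_c_to_u(mrna, position):
    """Return a copy of mrna with the C at `position` (0-based) changed to U."""
    if mrna[position] != "C":
        raise ValueError("editing target must be a C")
    return mrna[:position] + "U" + mrna[position + 1:]


# Hypothetical toy transcript: AUG GCU CAA GAU UUU UAA
unedited = "AUGGCUCAAGAUUUUUAA"
# Edit the C of the CAA codon (index 6), as the cytidine deaminase would.
edited = deaminate_c_to_u(unedited, 6)

print(translate(unedited))  # full-length product (hepatocyte-like case)
print(translate(edited))    # truncated product (intestine-like case)
```

In the toy example the unedited transcript is read through to its natural stop codon, while the edited transcript stops at the new UAA, mirroring the Apo-B100 versus Apo-B48 outcome.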
Addition of a 5' Cap
Eukaryotic mRNAs and some other RNAs are modified through the addition of a 5' Cap structure. The mRNA cap is a 7-methyl-guanosine in a 5'-5' linkage with the 5'-terminal nucleotide of the transcribed RNA.
In essence, the addition of the cap occurs through the condensation of GTP with the 5' triphosphate of the mRNA. The reaction occurs in three steps:
Further methylation may also occur. The extent of this methylation varies in different organisms. The original 5' nucleotide (which is really the first nucleotide of the mRNA) is 2'-O-methylated in higher organisms but not in yeast. The second nucleotide is also frequently 2'-O-methylated in vertebrates.
The 5' cap structure serves two roles. Its principal role is to serve as a recognition feature during protein synthesis. Unlike prokaryotic mRNAs, which carry a ribosome binding site (the Shine-Dalgarno sequence), eukaryotic mRNAs do not contain ribosome binding sites. Instead, eukaryotic ribosomes are positioned at the 5' end of the mRNA through the action of CBPI, an initiation factor that recognizes and binds to the 5' end. The 5' cap may also help protect the 5' end of the mRNA from digestion by exonucleases.
The snRNAs that participate in splicing are also capped (except for U6). The genes for the U1, U2, U4 and U5 snRNAs are transcribed by RNA polymerase II and, consequently, the resulting RNAs are capped. However, they are methylated differently from mRNAs. The guanine base is methylated at position N7 as normal but, in addition, it is dimethylated at position 2. Thus most snRNAs have a characteristic 2,2,7-trimethyl-guanosine (m3G) cap. This structure is essential for assembly of the spliceosome and for transport of the spliceosome back into the nucleus.
3' end of Eukaryotic mRNA:
(1) Termination of Transcription
The termination of transcription in eukaryotes is not as well-defined as it is in prokaryotes, particularly the termination of mRNA synthesis. There are three different mechanisms of transcription termination, one for each of the three types of transcription unit in eukaryotes.
CLASS I TRANSCRIPTION UNITS
Transcription by RNA polymerase I terminates at a discrete 18 nt terminator site located approximately 1000 nt downstream of the end of the coding sequence. Recognition of this site appears to involve an ancillary protein factor.
CLASS II TRANSCRIPTION UNITS
Transcription by RNA polymerase II terminates over a terminator region but it is not known what features define this region nor how it effects termination. Much of the difficulty in resolving this issue is due to the fact that class II transcripts are processed at the 3' end with the addition of a poly(A) tail which replaces the true 3' end of the transcript.
CLASS III TRANSCRIPTION UNITS
Transcription by RNA polymerase III is terminated in a manner reminiscent of that in prokaryotes. A small run of U's in a GC-rich region is required as the termination signal. However, the run of U's is shorter than in prokaryotes -- 4 U's are sufficient -- and the GC-rich region need not adopt any kind of hairpin structure.
3' end of Eukaryotic mRNA:
(2) Addition of a poly(A) Tail
Most eukaryotic mRNAs have a special "tail" added to the 3' end. This tail is added by a special processing event that recognizes sequence elements in the 3' untranslated region (3' UTR) of eukaryotic mRNA and replaces the true 3' end of the transcript with a tail consisting of 20-250 adenine nucleotides.
The important signal in the 3' untranslated region (3' UTR) of eukaryotic mRNA is the sequence AAUAAA. A complex consisting of 4 proteins assembles at this sequence to carry out the processing:
- A specificity factor -- called Cleavage and Polyadenylation Specificity Factor (CPSF). CPSF consists of three subunits, each of which is an RNA-binding protein. Each of the individual subunits will bind RNA non-specifically. However, collectively, they recognize and bind specifically to the AAUAAA polyadenylation sequence.
- An endonuclease, which consists of two components (CFI and CFII) and cleaves the RNA.
- The poly(A) polymerase which catalyses the addition of up to 250 adenine nucleotides to the 3' end of the cleaved mRNA. This enzyme, which is a monomeric protein, can act either distributively or processively.
- An additional protein, called Cleavage Stimulation Factor (CStF), which is required to stabilize the complex. CStF is an RNA-binding protein.
Once these proteins have assembled, the mRNA is cleaved 10-35 nt downstream of the AAUAAA recognition sequence by the endonuclease and approximately 20 adenine nucleotides are added by the poly(A) polymerase acting in a distributive mode of synthesis.
These nascent poly(A) tails are then bound by poly(A) binding protein, also an RNA-binding protein, which causes the poly(A) polymerase to shift to a processive mode of synthesis and results in the addition of up to 250 adenine nucleotides.
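The cleavage and tailing steps described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the 10-35 nt distance is measured here from the last nucleotide of AAUAAA (the text does not say which end it is measured from), the function names are hypothetical, and the two modes of the poly(A) polymerase are collapsed into a single tail of a chosen length.

```python
def find_cleavage_windows(rna, signal="AAUAAA", min_offset=10, max_offset=35):
    """Return (signal_start, window_start, window_end) for each signal found.

    The window spans min_offset..max_offset nt past the end of the signal
    (an assumption), clipped to the end of the transcript. Indices are
    0-based and window_end is exclusive.
    """
    windows = []
    start = rna.find(signal)
    while start != -1:
        signal_end = start + len(signal)  # index just past the signal
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start < len(rna):
            windows.append((start, window_start, window_end))
        start = rna.find(signal, start + 1)
    return windows


def cleave_and_polyadenylate(rna, cleavage_index, tail_length=200):
    """Discard everything from cleavage_index onward and append a poly(A) tail."""
    return rna[:cleavage_index] + "A" * tail_length


# Hypothetical toy 3' UTR: 4 nt, the signal, then 40 nt of downstream sequence.
utr = "GGCC" + "AAUAAA" + "C" * 40
windows = find_cleavage_windows(utr)
signal_start, window_start, window_end = windows[0]

# Cleave at the first position of the window and add a short 20 nt tail.
processed = cleave_and_polyadenylate(utr, window_start, tail_length=20)
print(windows)
print(processed)
```

The sketch makes the key point of the mechanism explicit: the sequence downstream of the cleavage site, including the true 3' end of the transcript, is discarded and replaced by the tail.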
The function of the 3' tail is likely protective. mRNAs without a 3' tail are rapidly degraded.
Not all mRNAs are polyadenylated. Histone protein mRNAs, for example, are not. Termination of transcription of the histone H3 mRNA requires the formation of a hairpin structure in the 3' untranslated region of the mRNA but this hairpin need not be GC-rich. The U7 snRNA pairs with a sequence in the vicinity of this hairpin to effect termination.
The existence of poly(A) tails on the 3' end of eukaryotic mRNAs provides a relatively easy means for cloning eukaryotic coding sequences.
As shown above, eukaryotic mRNAs can be converted into a double-stranded DNA molecule, which can be cloned, in what is essentially a two-step process:
- First, a short oligonucleotide consisting of approximately 20 thymine deoxyribonucleotides is hybridized with the mRNA. The oligonucleotide will hybridize readily to the poly(A) tail.
This now serves as a template and a primer for DNA synthesis. Reverse transcriptase is an RNA-directed DNA polymerase which catalyses the synthesis of a DNA strand from an RNA template. The mRNA is the template; the thymine oligonucleotide is the primer.
The DNA strand that is synthesized is a copy of the original mRNA.
Finally, the mRNA can be removed by treatment with alkali.
- In the second step, a DNA strand is synthesized to replace the original mRNA strand.
This synthesis is usually carried out using the DNA polymerase I (Klenow fragment). The newly-synthesized DNA strand serves as a template. It also serves to provide a primer because it generally also forms a hairpin loop at the 3' end. Synthesis from this "internal" primer results in the formation of a double stranded DNA molecule with a hairpin loop at one end and an oligo(dA):oligo(dT) region at the other.
The products of this reaction are treated with the enzyme S1 nuclease which cleaves single-stranded regions within double-stranded DNA molecules. The result (if all goes right!) is a blunt-ended DNA fragment which can be cloned into a suitably-digested plasmid vector.
This is the cDNA (complementary DNA) that corresponds to an original mRNA.
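The base-pairing logic of the two synthesis steps can be sketched as follows. This is a simplified illustration with hypothetical function names and a toy mRNA. It assumes the oligo(dT) primer anneals at the very 3' end of the poly(A) tail, and it does not model the hairpin self-priming or the S1 nuclease trimming described above.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna, primer_length=20):
    """Return the first-strand cDNA, written 5'->3'.

    The oligo(dT) primer is assumed to anneal to the 3' end of the poly(A)
    tail, so the product begins with a run of T and is the reverse
    complement of the whole mRNA.
    """
    tail_length = len(mrna) - len(mrna.rstrip("A"))
    if tail_length < primer_length:
        raise ValueError("poly(A) tail is too short for the oligo(dT) primer")
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


def second_strand_cdna(first_strand):
    """Return the strand complementary to the first strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(first_strand))


# Hypothetical toy mRNA with a 25 nt poly(A) tail.
mrna = "AUGGCUCAAGAU" + "A" * 25
first = first_strand_cdna(mrna)
second = second_strand_cdna(first)

print(first)
print(second)
```

The second strand has the same sequence as the original mRNA with T in place of U, which is why the cloned cDNA can stand in for the coding sequence.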
Control of Gene Expression through Alternative mRNA Processing Events
The fact that most eukaryotes process mRNA means that they have an elaborate set of tools to vary the pattern and/or type of transcription while limiting the number of genes that are strictly necessary.
We can divide eukaryotic transcripts into two broad classes:
- Transcripts that give rise to a single mRNA, from which a single protein is made.
- Transcripts that give rise to more than one mRNA and, as a result, can give rise to more than one protein.
In some cases, different transcripts can be generated by alternative splicing. The following cartoon shows how a single transcript can be spliced in two different ways to generate two different mRNAs.
An example of alternative splicing occurs with the virus SV40. When it infects a cell, it directs the synthesis of two proteins, T antigen (big T antigen) and t antigen (little t antigen). Both are expressed from the same pre-mRNA. Which is expressed depends on differential splicing between two 5' splice sites and a common 3' splice site.
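The choice between two 5' splice sites and a common 3' splice site can be sketched with a toy pre-mRNA. The sequence, the coordinates and the function name are hypothetical and are not taken from SV40; the sketch only shows that the same transcript yields two different mRNAs depending on which 5' site is joined to the shared 3' site.

```python
def splice(pre_mrna, donor, acceptor):
    """Remove pre_mrna[donor:acceptor] (the intron) and join the flanks.

    `donor` is the index of the first intron nucleotide (5' splice site)
    and `acceptor` is the index of the first nucleotide after the intron
    (3' splice site).
    """
    if not 0 <= donor < acceptor <= len(pre_mrna):
        raise ValueError("splice sites must satisfy 0 <= donor < acceptor <= length")
    return pre_mrna[:donor] + pre_mrna[acceptor:]


# Hypothetical toy pre-mRNA built from four segments.
segment_a = "AUGGCC"  # always retained
segment_b = "GUAAGC"  # retained only when the downstream 5' site is used
segment_c = "GUUUAG"  # always removed
segment_d = "GGAAUU"  # always retained
pre_mrna = segment_a + segment_b + segment_c + segment_d

upstream_donor = len(segment_a)                      # first 5' splice site
downstream_donor = len(segment_a) + len(segment_b)   # second 5' splice site
common_acceptor = downstream_donor + len(segment_c)  # shared 3' splice site

mrna_1 = splice(pre_mrna, upstream_donor, common_acceptor)
mrna_2 = splice(pre_mrna, downstream_donor, common_acceptor)

print(mrna_1)
print(mrna_2)
```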
Some other examples are described below in the section on Sex determination in Drosophila melanogaster.
Different mRNA transcripts can also arise from a given pre-mRNA by altering locations of poly(A) tail addition as shown in the following cartoon.
In this example the selection of polyadenylation signal A1 or A2 will determine whether the RNA sequence coloured in green will be included in the mature transcript and therefore translated into functional protein.
An example of the use of alternative polyadenylation signals occurs in the rat. The same primary transcript gives rise to CALCITONIN if it is expressed in the thyroid, and to CALCITONIN GENE-RELATED PEPTIDE if expressed in brain cells. The difference is due primarily to a choice between two different poly(A) sites. As a consequence of this choice, different exons will be spliced together and this will result in the different protein expression in the two tissues.
ALTERNATIVE SPLICING & TAILING
It is also possible for multiple mRNA transcripts to be generated from a single pre-mRNA transcript through a combination of both alternative splicing reactions and the addition of poly(A) tails at many different sites.
We have already encountered one example of this when we looked at the molecular biology of antibody gene expression.
Some of the diversity in generating functionally different antibodies of a given antigenic specificity also depends on the use of alternative poly(A) sites and alternative splicing mechanisms. The switch from the synthesis of an IgM antibody with a C-terminal membrane attachment segment to synthesis of one with a C-terminal soluble segment depends on a choice of poly(A) sites.
Effect of Mutations
Mutations that change existing 5' or 3' splice sequences or that change the polyadenylation cleavage signal will result in altered gene expression. When an existing site is destroyed by mutation, an alternative cryptic site nearby is often unmasked. Similarly, new sites can be created by mutation. In many cases, the results are severe.
For example, one form of β-thalassemia is caused by a single base mutation that creates a new 3' splice acceptor site in the human β-globin gene.
Sex Determination in Drosophila melanogaster
The determination of male or female sex in Drosophila melanogaster depends upon the expression of a series of genes which regulate the splicing of a cascade of genes in a male-specific manner or in a female-specific manner:
- The sex-lethal gene is transcribed in early female embryos but not in male embryos.
- The Sex-lethal protein is an RNA-binding protein.
- In late male embryos as well as in late female embryos, the sex-lethal gene is transcribed.
The Sex-lethal protein in developing female embryos blocks a splice acceptor site when it binds to the pre-mRNA. The resulting late Sex-lethal protein is functional.
In male embryos, the transcript is spliced differently. However, the spliced transcript contains an in-frame stop codon. As a result, no functional protein is synthesized.
- Next, the transformer gene is expressed. Once again, splicing of the transformer pre-mRNA depends on the presence of the Sex-lethal protein.
In male embryos, once again, the spliced transcript contains an in-frame stop codon so no Transformer protein is synthesized.
In female embryos, the late Sex-lethal protein binds to the pre-mRNA and results in an alternative splicing that removes the exon containing the stop codon. A functional Transformer protein can be synthesized.
- Finally, the double-sex gene is transcribed. Its pattern of splicing is affected by the presence of the Transformer protein which functions in association with the Transformer-2 protein (another RNA-binding protein).
In male embryos, a male-specific Double-sex protein is then synthesized.
In female embryos, a female-specific Double-sex protein is synthesized.
Ultimately, the Double-sex protein negatively regulates the expression of genes required for differentiation of the opposite sex.
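The cascade can be summarized as a chain of dependencies, sketched below. This is a deliberately simplified model of the description above: each step is reduced to a true/false value, Transformer-2 is assumed to be available in both sexes, and the function name and dictionary keys are hypothetical.

```python
def sex_determination_cascade(early_sxl_transcribed):
    """Follow the splicing cascade from early Sex-lethal to Double-sex.

    early_sxl_transcribed: True for female embryos, False for male embryos,
    following the description in the text.
    """
    # Early Sex-lethal protein blocks a splice acceptor site in the late
    # Sex-lethal pre-mRNA, so a functional late protein is made only if the
    # early protein is present.
    late_sxl_functional = early_sxl_transcribed

    # Without Sex-lethal protein, the spliced transformer transcript keeps
    # an in-frame stop codon and no Transformer protein is made.
    transformer_functional = late_sxl_functional

    # Transformer (acting with Transformer-2, assumed present) selects the
    # female-specific splicing pattern of the double-sex pre-mRNA.
    if transformer_functional:
        double_sex = "female-specific"
    else:
        double_sex = "male-specific"

    return {
        "late_sex_lethal": late_sxl_functional,
        "transformer": transformer_functional,
        "double_sex": double_sex,
    }


female = sex_determination_cascade(early_sxl_transcribed=True)
male = sex_determination_cascade(early_sxl_transcribed=False)
print(female)
print(male)
```

Written this way, the sketch shows that the only input that differs between the sexes is the early transcription of Sex-lethal; every later difference follows from alternative splicing.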