Transcription in bacteria

© Anders Skovly 2024

Both transcription and translation are performed somewhat differently in bacteria compared to eukaryotes. This article focuses on transcription in bacteria.


The transcription process

Transcription is a kind of copying of a base sequence from DNA to RNA. It is not an exact copying mechanism: while adenines, cytosines and guanines are copied "one-to-one", the thymines of DNA are copied to similar but not identical uracils in RNA. The main component of the transcription process is a protein called RNA polymerase (RNApol). Located inside RNApol are a Product-site and an Addition-site (P-site and A-site), each of which can bind a ribose-nucleotide.

Transcription of a gene is initiated by the binding of RNApol to a DNA sequence called a promoter, which lies adjacent to the start of the gene. A cluster of base pairs inside the promoter is broken as the two DNAs of the double helix separate into a transcription bubble. One of the separated DNAs will be used directly in the synthesis of RNA; this DNA is called the template. The other DNA is called the nontemplate. Which of the two DNAs becomes the template varies between genes and is determined by the promoter.

RNApol binds DNA, and the two DNAs separate.
Figure 1: ① and ② Transcription initiates with the binding of RNApol to DNA. ③ The binding is followed by the opening of the double helix into a transcription bubble.

The template is positioned in such a way that one template base will be in close proximity to the P-site, and an adjacent template base will be in close proximity to the A-site. These bases will be named the first and second base of the template. A ribose-nucleotide enters the P-site, and if this nucleotide is complementary to the first base of the template the two will form a base pair. Similarly, a ribose-nucleotide will enter the A-site, and if it is complementary to the second base of the template the two will also form a base pair.

Two nucleotides bind the catalytic center of RNApol.
Figure 2: ① A close look at the two separated DNAs of the transcription bubble. The upper DNA is the template and is positioned near the P-site and A-site. The lower DNA is the nontemplate. Adenines are colored red, thymines green, guanines yellow and cytosines purple. ② Two ribose-nucleotides enter the P- and A-sites and form base pairs with the first and second base of the template.

When both the P-site and the A-site contain a nucleotide base paired to the template, a covalent bond will be created between the two nucleotides. This bond goes between the 3'-oxygen of the nucleotide in the P-site and the phosphate of the nucleotide in the A-site. The template is now base paired to an RNA-chain consisting of two nucleotides (see Figure 3). (The term "RNA-chain" is not fully appropriate when only two nucleotides have been incorporated into the chain, but I use it here anyway.)

A detail of RNA synthesis is that while the nucleotides in the RNA-chain contain a single phosphate each, the RNA is built from nucleotides containing three phosphates each. In these so-called triphosphates the 5'-carbon of ribose is covalently bonded to a first phosphate, which is covalently bonded to a second phosphate, which is covalently bonded to a third. The covalent bond that is created between two nucleotides is always created between the 3'-oxygen of the nucleotide in the P-site and the first phosphate of the nucleotide in the A-site. Simultaneously with the formation of this new bond, the old bond between the first and second phosphates is broken. The second and third phosphates then exist in the form of a pyrophosphate, which dissociates from RNApol.

A covalent bond forms between the nucleotide in the P-site and the nucleotide in the A-site. Pyrophosphate dissociates.
Figure 3: ① A covalent bond is created between the 3'-oxygen of the nucleotide in the P-site and the phosphate of the nucleotide in the A-site. Simultaneously the bond between the first and second phosphates is broken. ② The pyrophosphate dissociates from RNApol.

After bonding of the two nucleotides, RNApol (including the A-site and the P-site) moves a distance of one nucleotide along the template, in the template's 3'-to-5' direction. This causes the ribose-nucleotide in the A-site to be shifted over to the P-site, while the ribose-nucleotide in the P-site is shifted out of the P-site. The A-site is now empty and is in close proximity to the third base of the template. A new ribose-nucleotide enters the A-site and forms a base pair with the third base of the template, and again a covalent bond is created between the 3'-oxygen of the P-site-nucleotide and the first phosphate of the A-site-nucleotide. The template is now base paired to an RNA-chain consisting of three nucleotides.

RNApol moves along DNA, a new nucleotide enters the A-site, and another covalent bond is formed between two nucleotides.
Figure 4: ① RNApol (including the A-site and P-site) moves one nucleotide-length along the template in the template's 3'-to-5' direction. The A-site becomes available for binding a new nucleotide. ② A new ribose-nucleotide enters the A-site. In this case the nucleotide contains a uracil-base, which is colored dark green. ③ A covalent bond is created between the 3'-oxygen of the P-site-nucleotide and the first phosphate of the A-site-nucleotide. Simultaneously the bond between the first and second phosphates is broken. ④ Pyrophosphate dissociates from RNApol.

The events in the previous paragraph are repeated over and over again: RNApol moves along the template DNA, a new ribose-nucleotide is incorporated into the RNA-chain, RNApol moves, yet another nucleotide is added, and the RNA-chain grows longer and longer (see Figure 5).
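The repeated cycle can be sketched as a small Python loop. This is a toy model under stated assumptions, not a description of the real enzyme: the function name `transcribe_template` and the six-base template are invented for illustration, and the template is given in the order RNApol meets its bases (the template's 3'-to-5' direction).

```python
# Which ribose-nucleotide base pairs with which DNA template base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template_3_to_5):
    """Toy elongation loop (hypothetical helper, for illustration only).

    template_3_to_5: template bases in the order RNApol meets them.
    Returns the RNA in the order it was synthesized and the number of
    pyrophosphates released along the way.
    """
    rna = ""
    pyrophosphates = 0
    for position, base in enumerate(template_3_to_5):
        incoming = PAIRING[base]  # the nucleotide that base pairs with this template base
        if position > 0:
            # Every bond to the previous nucleotide releases one pyrophosphate.
            pyrophosphates += 1
        rna += incoming           # the chain grows by one nucleotide per cycle
    return rna, pyrophosphates

rna, released = transcribe_template("TACGGT")
print(rna, released)
```

In this sketch a chain of n nucleotides always releases n - 1 pyrophosphates, because the first nucleotide is not bonded to anything before it.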

As RNApol moves along the template, the two DNA chains of the double helix separate in front of RNApol. Behind the moving RNApol the template separates from RNA and once again base pairs with the nontemplate, causing the DNA double helix to reform. In effect, the transcription bubble moves along the DNA together with RNApol. This means that as soon as one RNApol has moved away from the promoter of a gene, another RNApol can bind the same promoter and initiate transcription. A gene can therefore be transcribed by multiple RNApols at the same time.

RNA grows longer and longer.
Figure 5: ① Ribose-nucleotides (yellow dots) enter RNApol and are added to the RNA-chain. ② RNApol moves along the DNA while adding more nucleotides, and the RNA-chain grows longer. Note: RNA is not shaped as a helix, so the figure is a bit misleading.

Eventually, RNApol reaches a terminator sequence in the DNA. This base sequence causes RNApol to halt. Next, the RNA dissociates from both RNApol and DNA, followed by dissociation of RNApol from DNA. The two separated DNAs of the transcription bubble bind together again, causing the transcription bubble to disappear, and with that the transcription process has come to an end.

Note that transcription does not initiate exactly at the start of a gene, nor does it end exactly at the end of a gene. Rather, transcription begins somewhere before the gene and ends somewhere after it. An RNA therefore contains a base sequence at each end that will not be translated to the amino acids of a protein. These parts of the RNA are called the 5'- and 3'-untranslated regions.
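As a rough illustration, an RNA can be treated as a string that is cut into three pieces. The sketch below assumes that the boundaries of the translated part are already known; the function name `split_transcript`, the example sequence and the two indices are all made up for the example.

```python
def split_transcript(rna, translated_start, translated_end):
    """Split an RNA (written 5'-to-3') into its three regions.

    translated_start and translated_end are 0-based, end-exclusive indices
    of the translated part. They are assumed to be known in advance.
    """
    five_prime_utr = rna[:translated_start]
    translated = rna[translated_start:translated_end]
    three_prime_utr = rna[translated_end:]
    return five_prime_utr, translated, three_prime_utr

# Invented 22-base RNA with invented boundaries.
rna = "GGAGGUAACAUGAAAUAAGCUU"
five_utr, translated, three_utr = split_transcript(rna, 9, 18)
print(five_utr, translated, three_utr)
```

Joining the three pieces back together always gives the original RNA, which is a quick way to check that no base was lost at the boundaries.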

How base sequences are presented

At the start of transcription, nucleotides enter both the P-site and the A-site. Once RNApol begins to move, however, new nucleotides enter only the A-site, as the P-site is from that point always occupied by a nucleotide of the RNA-chain. The covalent bond is created between the 3'-oxygen of the P-site-nucleotide and the first phosphate of the A-site-nucleotide, which means that new nucleotides are added to the 3'-end of the growing RNA-chain.

Earlier it was stated that transcription is a kind of copying of a base sequence from DNA to RNA. The base sequence in RNA is synthesized antiparallel and complementary to the sequence in the template, which in turn is antiparallel and complementary to the sequence in the nontemplate. Therefore, the base sequence of RNA is a copy of the nontemplate sequence, with uracils in place of thymines.
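The "complement of a complement" argument can be checked with a few lines of Python. This is only a sketch: the helper name `antiparallel_complement` and the short example sequence are invented, and every sequence is written in its own 5'-to-3' direction.

```python
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # DNA base pairs
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template base -> RNA base

def antiparallel_complement(seq, pairing):
    """Complement every base and reverse the order, so that the result
    is again written in its own 5'-to-3' direction."""
    return "".join(pairing[base] for base in reversed(seq))

nontemplate = "ATGCCA"                                      # invented example
template = antiparallel_complement(nontemplate, DNA_TO_DNA)
rna = antiparallel_complement(template, DNA_TO_RNA)

print(template)  # template, written 5'-to-3'
print(rna)       # RNA, written 5'-to-3'
```

Applying the operation twice brings back the nontemplate sequence, except that the second step uses RNA bases, so every thymine shows up as a uracil.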

The sequence in RNA can be written in two different ways: in the 3'-to-5' direction, or in the 5'-to-3' direction. As RNA is synthesized in the 5'-to-3' direction, with new ribose-nucleotides added to the 3'-end of the existing RNA-chain, it is common to write the sequences in this direction (i.e. the left-to-right writing direction equals the 5'-to-3' sequence direction). During DNA replication, the DNA is also synthesized in the 5'-to-3' direction, so it is common to write DNA sequences in this direction too.

When the base sequence of a gene is to be presented, one must choose between using the sequence of the template, the sequence of the nontemplate, or both. As the base sequence of RNA is a copy of the nontemplate sequence, it is common to present gene sequences only as their nontemplate sequences. A promoter sequence is also presented simply as the sequence of nontemplate DNA, even though promoter DNA functions in double helical form. The template sequences can be dropped as they are given implicitly by the nontemplate sequence, since the two are complements. For example, consider this sequence from double helical DNA:

5'-TAATGTGAGTTAGCTCACTCAT-3'
3'-ATTACACTCAATCGAGTGAGTA-5'

The two sequences can be written simply as 5'-TAATGTGAGTTAGCTCACTCAT-3' (nontemplate). This is the method that will be used on this website.
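To show that nothing is lost by dropping the template, the lower sequence can be regenerated from the upper one. The following is a minimal sketch; the function name `template_from_nontemplate` is made up for this example.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def template_from_nontemplate(nontemplate_5_to_3):
    """Return the template written 3'-to-5', so that each base lines up
    directly under its partner in the nontemplate."""
    return "".join(COMPLEMENT[base] for base in nontemplate_5_to_3)

nontemplate = "TAATGTGAGTTAGCTCACTCAT"
template_3_to_5 = template_from_nontemplate(nontemplate)

print("5'-" + nontemplate + "-3'")
print("3'-" + template_3_to_5 + "-5'")
```

The second printed line matches the lower sequence of the double helix shown above.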

As RNA and DNA are synthesized in the 5'-to-3' direction, this direction is commonly referred to as "downstream". The 3'-to-5' direction, being the opposite direction, is commonly referred to as "upstream". For example, if one says that "On the nontemplate DNA, the promoter is upstream of the gene", this means that the promoter is located in the 3'-to-5' direction relative to the gene. (The terms upstream and downstream will be used a fair bit in some of my later articles, so they're worth remembering if you intend to keep reading.)
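If positions are counted along the nontemplate DNA written 5'-to-3', the two terms reduce to a comparison of numbers. The sketch below assumes such a numbering; the function name `relative_position` and the example positions are invented.

```python
def relative_position(feature_start, reference_start):
    """Compare two positions counted along the nontemplate DNA,
    where a larger number means further in the 5'-to-3' direction."""
    if feature_start < reference_start:
        return "upstream"
    if feature_start > reference_start:
        return "downstream"
    return "same position"

# Invented positions: a promoter starting at 10 and a gene starting at 45.
print(relative_position(10, 45))
print(relative_position(45, 10))
```

With these numbers the promoter is reported as upstream of the gene, and the gene as downstream of the promoter.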

Sigma-factors

One important component of transcription was left out of the section on the transcription process, to make it an easier read. The component in question is a protein called a sigma-factor, or just sigma. Before transcription can begin, RNApol has to bind together with sigma. The sigma can then bind to a promoter, and RNApol will be bound to the promoter indirectly through sigma. There are different types of sigma that allow RNApol to bind to different types of promoters. E. coli, for example, has seven different types. Once RNApol has begun to move along DNA, the sigma dissociates from RNApol and the promoter, and does not participate further in transcription.

RNApol binds to DNA indirectly through sigma.
Figure 6: ① Sigma and RNApol bind together. ② RNApol binds DNA indirectly through sigma. ③ Once RNApol has begun to move, the sigma dissociates.

Summary

Transcription is a kind of copying of a base sequence from nontemplate DNA to RNA. Adenines, cytosines and guanines are copied exactly, while thymines in DNA are copied to uracils in RNA.

The first events of transcription are the binding of RNApol to sigma, followed by the binding of sigma to a promoter sequence in DNA and the opening of the double helix into a transcription bubble. Two ribose-nucleotides enter RNApol and form base pairs with the template. A covalent bond is then created between the 3'-oxygen of one nucleotide and the first phosphate of the other nucleotide.

RNApol then begins to move along the DNA. New ribose-nucleotides continue to form base pairs with the template, and new covalent bonds are created between the nucleotides, causing the RNA-chain to grow longer. Eventually, RNApol reaches a terminator where it halts. RNApol, RNA and DNA then dissociate from each other, and the transcription process is complete.


