DNA and RNA Structure: The Molecular Blueprints of Life
Nucleotide Building Blocks
Both DNA and RNA are polymers built from nucleotide monomers. Each nucleotide has three components: a nitrogenous base, a five-carbon (pentose) sugar, and one or more phosphate groups. The bases fall into two chemical families. Purines (adenine and guanine) have a two-ring structure. Pyrimidines (cytosine, thymine in DNA, and uracil in RNA) have a single-ring structure. The base is attached to the 1' carbon of the sugar through a glycosidic bond.
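The two base families can be captured in a small lookup. The following is a minimal sketch; the function name `base_family` and the use of one-letter base codes are illustrative choices, not part of any library.

```python
# One-letter codes for the bases named in the text.
PURINES = {"A", "G"}            # two-ring bases: adenine, guanine
PYRIMIDINES = {"C", "T", "U"}   # single-ring bases: cytosine, thymine, uracil


def base_family(base: str) -> str:
    """Return 'purine' or 'pyrimidine' for a one-letter base code."""
    code = base.upper()
    if code in PURINES:
        return "purine"
    if code in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"unknown base: {base!r}")


for symbol in "AGCTU":
    print(symbol, base_family(symbol))
```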
The sugar in DNA is 2'-deoxyribose, which lacks a hydroxyl group on the 2' carbon. The sugar in RNA is ribose, which retains the 2' hydroxyl group. This seemingly small difference has major consequences: the extra hydroxyl makes RNA more chemically reactive and susceptible to hydrolysis, which is one reason why DNA, not RNA, serves as the long-term information storage molecule in most organisms.
Nucleotides are linked together by phosphodiester bonds. These bonds connect the 5' phosphate group of one nucleotide to the 3' hydroxyl group of the next, creating a sugar-phosphate backbone with alternating sugar and phosphate groups. The bases extend from the backbone like rungs on a ladder. Because each strand has a 5' end (with a free phosphate) and a 3' end (with a free hydroxyl), the strand has directionality, a feature that is critical for replication and transcription.
DNA's Double Helix
James Watson and Francis Crick proposed the double helix model of DNA in 1953, building on X-ray diffraction data from Rosalind Franklin and Maurice Wilkins. Their model revealed that DNA consists of two polynucleotide strands wound around a common axis in a right-handed helix. The sugar-phosphate backbones run along the outside of the helix, while the bases point inward and pair with bases on the opposing strand.
The two strands are antiparallel, meaning one runs 5' to 3' while the other runs 3' to 5'. This arrangement matters for the polymerases that replicate and transcribe DNA, which can extend a new strand only in the 5' to 3' direction.
Base pairing follows strict rules dictated by hydrogen bonding and geometry. Adenine pairs with thymine through two hydrogen bonds (A-T). Guanine pairs with cytosine through three hydrogen bonds (G-C). This complementary base pairing means that the sequence of one strand automatically determines the sequence of the other, a feature that is the molecular basis of DNA replication and genetic fidelity.
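Because the strands are antiparallel and pair by fixed rules, the partner of any strand can be computed from its sequence. The sketch below assumes sequences are written as plain strings of A, T, G, and C in the 5' to 3' direction; the function names are illustrative.

```python
# Watson-Crick pairing rules for DNA.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Pair each base with its partner, keeping the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in its own 5' to 3' direction."""
    return complement(strand)[::-1]


top = "ATGCCG"                  # given 5' to 3'
print(complement(top))          # partner read 3' to 5': TACGGC
print(reverse_complement(top))  # partner read 5' to 3': CGGCAT
```

Reversing the complemented string is what accounts for antiparallel orientation: the base paired with the 5' end of one strand sits at the 3' end of the other.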
The double helix has a diameter of approximately 2 nanometers and completes one full turn about every 10 base pairs (closer to 10.5 in solution), spanning roughly 3.4 nanometers per turn. Two grooves of unequal width, called the major groove and minor groove, wind along the helix. Many proteins that regulate gene expression recognize specific DNA sequences by reading the pattern of hydrogen bond donors and acceptors exposed in the major groove, without needing to unwind the helix.
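The helix dimensions allow a quick estimate of how long a stretch of DNA is. The sketch below uses the idealized figures of 10 base pairs and 3.4 nanometers per turn, which give a rise of 0.34 nanometers per base pair; the constant and function names are illustrative.

```python
BP_PER_TURN = 10                    # idealized base pairs per helical turn
PITCH_NM = 3.4                      # length of one full turn, in nanometers
RISE_NM = PITCH_NM / BP_PER_TURN    # about 0.34 nm of length per base pair


def helix_length_nm(n_base_pairs: int) -> float:
    """Approximate end-to-end length of a straight double helix."""
    return n_base_pairs * RISE_NM


def helix_turns(n_base_pairs: int) -> float:
    """Approximate number of full helical turns."""
    return n_base_pairs / BP_PER_TURN


n = 1000
print(f"{n} bp is about {helix_length_nm(n):.1f} nm long")  # about 340.0 nm
print(f"{n} bp makes about {helix_turns(n):.0f} turns")     # about 100 turns
```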
The stability of the double helix comes from two main sources: hydrogen bonds between complementary bases and stacking interactions between the flat, aromatic ring systems of adjacent bases. Stacking interactions, which arise largely from van der Waals contacts and the hydrophobic effect, actually contribute more to the overall stability than hydrogen bonds. DNA regions rich in G-C base pairs (with three hydrogen bonds each) are more thermally stable than regions rich in A-T pairs (with two hydrogen bonds each).
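The link between G-C content and thermal stability can be illustrated with a rough calculation. The sketch below computes the G-C fraction of a sequence and applies the Wallace rule, a rule of thumb for short oligonucleotides that is not taken from this text: about 2 degrees Celsius per A-T pair plus 4 degrees Celsius per G-C pair. It is a crude estimate only, and the function names are illustrative.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    s = seq.upper()
    if not s:
        raise ValueError("sequence is empty")
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm_celsius(seq: str) -> int:
    """Rough melting temperature of a short oligo (Wallace rule of thumb)."""
    s = seq.upper()
    at_pairs = s.count("A") + s.count("T")
    gc_pairs = s.count("G") + s.count("C")
    return 2 * at_pairs + 4 * gc_pairs


at_rich = "ATATATATATATAT"   # 14 bases, no G-C pairs
gc_rich = "GCGCGCGCGCGCGC"   # 14 bases, all G-C pairs
print(gc_fraction(at_rich), wallace_tm_celsius(at_rich))  # 0.0 28
print(gc_fraction(gc_rich), wallace_tm_celsius(gc_rich))  # 1.0 56
```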
DNA Replication and Information Storage
The complementary base pairing of DNA immediately suggests a mechanism for replication. When the two strands are separated, each serves as a template for building a new complementary strand. The enzyme DNA polymerase reads the template strand in the 3' to 5' direction and synthesizes the new strand in the 5' to 3' direction, adding nucleotides that pair with the template bases according to the A-T and G-C rules.
DNA replication is semiconservative: each daughter DNA molecule contains one original strand and one newly synthesized strand. This was demonstrated by Matthew Meselson and Franklin Stahl in their elegant 1958 experiment using isotopically labeled DNA. The semiconservative mechanism ensures that genetic information is faithfully transmitted from one cell generation to the next.
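The logic of the Meselson-Stahl result can be reproduced with a toy simulation. The sketch below is a simplification, not a model of the actual experiment: each DNA molecule is a pair of strand labels, "H" for a strand made with the heavy isotope and "L" for a strand made with the light isotope, and every round of replication takes place in light medium. All names are illustrative.

```python
from collections import Counter


def replicate(molecules):
    """One round of semiconservative replication in light medium.

    Each parental strand is kept and paired with a newly made light strand.
    """
    daughters = []
    for strand_a, strand_b in molecules:
        daughters.append((strand_a, "L"))
        daughters.append((strand_b, "L"))
    return daughters


def density_classes(molecules):
    """Count molecules by density: heavy, hybrid, or light."""
    labels = {2: "heavy", 1: "hybrid", 0: "light"}
    counts = Counter()
    for strand_a, strand_b in molecules:
        heavy_strands = (strand_a == "H") + (strand_b == "H")
        counts[labels[heavy_strands]] += 1
    return dict(counts)


population = [("H", "H")]   # start with fully heavy DNA
for generation in range(1, 4):
    population = replicate(population)
    print(generation, density_classes(population))
```

After one generation every molecule is hybrid. In later generations the two original heavy strands persist in exactly two hybrid molecules while the number of fully light molecules grows, which is the pattern that distinguishes semiconservative replication from the alternatives.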
Replication errors that escape correction become mutations. In human cells they arise at a rate of approximately one per billion base pairs per cell division, a figure kept low by the proofreading ability of DNA polymerase and by post-replication repair mechanisms. When mutations do occur, they can alter protein structure and function, sometimes leading to disease but also providing the raw material for evolution.
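To get a feel for what this rate means, it can be multiplied by the size of a genome. The sketch below is a back-of-the-envelope estimate only. The error rate is the figure quoted above; the genome size of roughly 3.2 billion base pairs for one haploid human genome is an assumed value that does not come from this text.

```python
ERROR_RATE_PER_BP = 1e-9            # about one error per billion base pairs
ASSUMED_HAPLOID_GENOME_BP = 3.2e9   # assumption: approximate human haploid genome


def expected_mutations(genome_bp: float, rate: float = ERROR_RATE_PER_BP) -> float:
    """Expected number of new mutations per cell division (rate times size)."""
    return genome_bp * rate


per_haploid = expected_mutations(ASSUMED_HAPLOID_GENOME_BP)
per_diploid = expected_mutations(2 * ASSUMED_HAPLOID_GENOME_BP)
print(f"about {per_haploid:.1f} per haploid genome copy")  # about 3.2
print(f"about {per_diploid:.1f} per diploid cell")         # about 6.4
```

Under these assumptions, even a very accurate copying process introduces a handful of changes each time a human cell divides.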
RNA Structure and Types
RNA differs from DNA in three key chemical features: it uses ribose instead of deoxyribose, it contains uracil instead of thymine, and it is typically single-stranded rather than double-stranded. These differences give RNA distinct properties that suit its diverse cellular roles.
Although RNA is generally single-stranded, it can fold back on itself to form local regions of double-stranded structure through intramolecular base pairing. These secondary structures, including hairpins (also called stem-loops) and pseudoknots, are critical for RNA function. The cloverleaf structure of transfer RNA (tRNA), for example, is formed by internal base pairing and is essential for tRNA's role in protein synthesis.
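Intramolecular base pairing can be illustrated with the simplest case, a single hairpin. The sketch below pairs bases from the two ends of an RNA string inward and reports the stem in dot-bracket notation, a common text format in which matched parentheses mark paired bases and dots mark unpaired ones. It is a toy check, not a folding algorithm: it considers only A-U and G-C pairs, ignores thermodynamics, and assumes a minimum loop of three unpaired bases. The function names are illustrative.

```python
RNA_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def hairpin_stem_length(rna: str, min_loop: int = 3) -> int:
    """Count base pairs formed by zipping the two ends of the sequence inward."""
    s = rna.upper()
    i, j = 0, len(s) - 1
    stem = 0
    # Stop when the next pair would leave fewer than min_loop unpaired bases.
    while j - i - 1 >= min_loop and (s[i], s[j]) in RNA_PAIRS:
        stem += 1
        i += 1
        j -= 1
    return stem


def hairpin_dot_bracket(rna: str, min_loop: int = 3) -> str:
    """Dot-bracket string for the end-to-end hairpin found above."""
    stem = hairpin_stem_length(rna, min_loop)
    loop = len(rna) - 2 * stem
    return "(" * stem + "." * loop + ")" * stem


seq = "GGGCAAAAGCCC"
print(seq)
print(hairpin_dot_bracket(seq))  # ((((....))))
```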
Cells produce several types of RNA, each with a distinct function. Messenger RNA (mRNA) carries protein-coding information from DNA to the ribosomes; in eukaryotes this means from the nucleus to the cytoplasm. It is produced by transcription, the process in which RNA polymerase reads a DNA template and synthesizes a complementary RNA strand. In eukaryotes, the initial transcript (pre-mRNA) undergoes processing, including the addition of a 5' cap, a 3' poly(A) tail, and the removal of non-coding sequences (introns) by splicing, before the mature mRNA is exported to the cytoplasm.
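Transcription follows the same pairing logic as replication, with uracil taking the place of thymine in the product. The sketch below assumes the DNA template strand is supplied as a string written 5' to 3' and returns the RNA written 5' to 3'. It ignores promoters, capping, polyadenylation, and splicing, and the function name is illustrative.

```python
# Pairing of each DNA template base with the RNA base placed opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3: str) -> str:
    """Return the RNA (5' to 3') complementary to a DNA template strand.

    The template is read from its 3' end, so the string is walked in reverse.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(template_5to3.upper()))


template = "GGCCAT"          # DNA template strand, 5' to 3'
print(transcribe(template))  # AUGGCC
```

The result matches the other (coding) DNA strand, ATGGCC, with every T replaced by U.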
Transfer RNA (tRNA) acts as an adapter during translation. Each tRNA molecule carries a specific amino acid on its 3' end and has an anticodon loop that base-pairs with the corresponding codon on the mRNA. This matching ensures that amino acids are added to the growing polypeptide chain in the correct order.
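Codon-anticodon matching is another instance of antiparallel base pairing. The sketch below derives the anticodon that pairs with a given codon, with both written 5' to 3'. It assumes strict A-U and G-C pairing and ignores wobble pairing and modified bases, so it is a simplification; the function name is illustrative.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Anticodon (5' to 3') that pairs antiparallel with an mRNA codon (5' to 3')."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three bases")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon.upper()))


print(anticodon_for("AUG"))  # CAU
```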
Ribosomal RNA (rRNA) is the most abundant type of RNA in the cell and forms the structural and catalytic core of the ribosome. The ribosome itself is a ribonucleoprotein complex (made of both RNA and protein), and it is the rRNA, not the protein, that catalyzes peptide bond formation. This catalytic activity makes the ribosome a ribozyme, an RNA molecule with enzymatic function.
Additional types of RNA include small nuclear RNA (snRNA), which participates in mRNA splicing; microRNA (miRNA) and small interfering RNA (siRNA), which regulate gene expression by targeting mRNAs for degradation or translational silencing; and long non-coding RNA (lncRNA), which plays diverse roles in chromosome organization, gene regulation, and other processes that are still being characterized.
The Central Dogma
The relationship between DNA, RNA, and protein is summarized by the central dogma of molecular biology, first articulated by Francis Crick in 1958. In its simplest form: DNA is transcribed into RNA, and RNA is translated into protein. Information flows from nucleic acid sequence to protein sequence, not in the reverse direction.
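The two steps of the central dogma can be chained in a few lines. The sketch below is a minimal illustration: it starts from the coding DNA strand (so transcription reduces to replacing T with U), begins translation at the first AUG, and uses only a small excerpt of the standard genetic code rather than the full 64-codon table. The function names and the example sequence are illustrative.

```python
# Excerpt of the standard genetic code; a full table has 64 codons.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GCC": "Ala", "GGC": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe_coding_strand(coding_dna: str) -> str:
    """mRNA has the coding strand's sequence with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list:
    """Read codons from the first AUG until a stop codon or the end."""
    s = mrna.upper()
    start = s.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(s) - 2, 3):
        codon = s[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this excerpt of the table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "Stop":
            break
        peptide.append(amino_acid)
    return peptide


def express(coding_dna: str) -> list:
    """DNA to RNA to protein, as a list of amino acid abbreviations."""
    return translate(transcribe_coding_strand(coding_dna))


print(express("ATGGCCTTTTAA"))  # ['Met', 'Ala', 'Phe']
```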
There are important exceptions and refinements to this scheme. Retroviruses like HIV use reverse transcriptase to copy RNA back into DNA, reversing the DNA-to-RNA step of the simple scheme, though not the rule that information never flows from protein back to nucleic acid. Some RNA molecules are themselves functional end products, never translated into protein (tRNA, rRNA, miRNA, etc.). Nevertheless, the central dogma remains a useful framework for understanding how genetic information is stored, accessed, and expressed in all living cells.
DNA and RNA in Biotechnology
Understanding nucleic acid structure has enabled powerful technologies. The polymerase chain reaction (PCR) exploits the base-pairing rules to amplify specific DNA sequences from trace amounts of starting material. DNA sequencing, pioneered by Frederick Sanger and now performed using high-throughput next-generation methods, reads the base sequence of DNA molecules, enabling genomics research and clinical genetic testing. CRISPR-Cas9 gene editing uses a guide RNA to direct a nuclease to a specific DNA sequence, allowing precise modifications to the genome.
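The power of PCR comes from repeated doubling. The sketch below estimates the number of copies after a given number of cycles, assuming each cycle multiplies the target by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. Real reactions fall short of this and eventually plateau, so the result is an idealized upper bound; the function name and parameters are illustrative.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a number of PCR cycles."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting molecules, thirty cycles of perfect doubling.
print(f"{pcr_copies(10, 30):,.0f}")                   # 10,737,418,240
# The same reaction at an assumed 90 percent efficiency per cycle.
print(f"{pcr_copies(10, 30, efficiency=0.9):.3e}")
```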
RNA-based technologies have also advanced dramatically. mRNA vaccines, such as those developed against COVID-19 by Pfizer-BioNTech and Moderna, deliver synthetic mRNA encoding a viral protein into cells, which then produce the protein and trigger an immune response. RNA interference (RNAi) therapies use siRNA molecules to silence disease-causing genes. These applications demonstrate how understanding the fundamental chemistry of nucleic acids translates into practical tools that improve human health.
DNA stores genetic information in a stable double helix held together by complementary base pairing, while RNA puts that information to work through diverse, mostly single-stranded molecules that serve as messengers, adapters, and catalysts in protein synthesis and gene regulation.