Amino Acids and Proteins: The Chemistry of Life's Molecular Machinery
Amino Acid Structure and Classification
Each amino acid has a central (alpha) carbon bonded to four groups: an amino group (-NH2), a carboxyl group (-COOH), a hydrogen atom, and a variable side chain (R group) that distinguishes the 20 standard amino acids. Because the alpha carbon is bonded to four different groups (except in glycine, where R = H), it is a stereocenter. All chiral standard amino acids in proteins have the L-configuration, which corresponds to the S configuration in the Cahn-Ingold-Prelog system. The exception is cysteine, which is R because the sulfur atom in its side chain raises the priority of the side chain above that of the carboxyl group.
The 20 standard amino acids are classified by their side chain properties. Nonpolar, aliphatic side chains (glycine, alanine, valine, leucine, isoleucine, proline, methionine) are hydrophobic and tend to cluster in the interior of folded proteins. Aromatic side chains (phenylalanine, tyrosine, tryptophan) are also largely hydrophobic but can participate in pi-stacking interactions. Polar, uncharged side chains (serine, threonine, cysteine, asparagine, glutamine) form hydrogen bonds with water and with other parts of the protein.
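The grouping can be written down as a small lookup table. The sketch below is only a convenience for checking the classification; the dictionary and function names are illustrative, and the two charged classes are the ones the text describes in the next paragraph.

```python
# Side-chain classes as grouped in this section (names are illustrative).
SIDE_CHAIN_CLASSES = {
    "nonpolar aliphatic": ["glycine", "alanine", "valine", "leucine",
                           "isoleucine", "proline", "methionine"],
    "aromatic": ["phenylalanine", "tyrosine", "tryptophan"],
    "polar uncharged": ["serine", "threonine", "cysteine",
                        "asparagine", "glutamine"],
    "positively charged": ["lysine", "arginine", "histidine"],
    "negatively charged": ["aspartate", "glutamate"],
}


def classify(amino_acid: str) -> str:
    """Return the side-chain class of a standard amino acid (by full name)."""
    name = amino_acid.lower()
    for side_chain_class, members in SIDE_CHAIN_CLASSES.items():
        if name in members:
            return side_chain_class
    raise ValueError(f"not one of the 20 standard amino acids: {amino_acid}")


# Sanity check: the five classes together cover all 20 amino acids.
total = sum(len(members) for members in SIDE_CHAIN_CLASSES.values())
print(total, classify("Tryptophan"))
```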
Positively charged side chains at physiological pH (lysine, arginine, and, only partially, histidine) and negatively charged side chains (aspartate, glutamate) are hydrophilic and typically found on the protein surface. Histidine is unique because its imidazole side chain has a pKa of about 6.0, close to physiological pH. It is therefore only partly protonated in the cell and can act as either a proton donor or a proton acceptor in enzyme active sites. Cysteine is notable for its thiol (-SH) group: two cysteine residues can be oxidized to form a disulfide bond (a cystine linkage), a covalent cross-link that stabilizes protein structure.
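How much of the histidine side chain carries a charge at a given pH can be estimated with the Henderson-Hasselbalch relation. The sketch below assumes the free side-chain pKa of 6.0 quoted above; in a real protein the local environment can shift this value considerably, so the numbers are illustrative only.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a titratable group in its protonated (acid) form.

    Follows from Henderson-Hasselbalch: pH = pKa + log10([base]/[acid]).
    """
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


HISTIDINE_PKA = 6.0  # approximate value quoted in the text

for pH in (5.0, 6.0, 7.4):
    frac = fraction_protonated(pH, HISTIDINE_PKA)
    print(f"pH {pH}: {frac:.1%} of imidazole groups protonated")
```

At pH equal to the pKa the group is exactly half protonated, which is why a pKa near the working pH lets histidine switch easily between donating and accepting a proton.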
The Peptide Bond
The peptide bond forms through a condensation reaction between the carboxyl group of one amino acid and the amino group of another, releasing water. The resulting amide bond (C-N) has partial double bond character because resonance delocalizes the nitrogen lone pair into the carbonyl, making the peptide bond planar and restricting rotation. The six atoms of the peptide unit (C-alpha, C=O, N-H, C-alpha) all lie in the same plane.
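Because each peptide bond releases one water molecule, the mass of a peptide is the sum of the free amino acid masses minus one water per bond. A minimal sketch of that bookkeeping follows; the function name is illustrative, and the table holds approximate average molar masses for only two amino acids.

```python
WATER = 18.015  # g/mol, released once per peptide bond formed

# Approximate average molar masses of the free amino acids (g/mol).
FREE_AMINO_ACID_MASS = {"glycine": 75.07, "alanine": 89.09}


def peptide_mass(amino_acids):
    """Molar mass of a linear peptide built from the named free amino acids."""
    if not amino_acids:
        return 0.0
    total = sum(FREE_AMINO_ACID_MASS[name] for name in amino_acids)
    peptide_bonds = len(amino_acids) - 1  # n residues are joined by n - 1 bonds
    return total - peptide_bonds * WATER


print(round(peptide_mass(["glycine", "alanine"]), 2))  # dipeptide Gly-Ala
```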
This planarity has profound consequences for protein structure. The backbone of a polypeptide chain can be described as a series of rigid planar peptide units connected by rotatable bonds at each alpha carbon. The two dihedral angles at each alpha carbon (phi and psi) determine the overall fold of the backbone. Certain combinations of phi and psi are sterically forbidden because they bring atoms too close together, while others correspond to regular secondary structures.
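A dihedral angle is computed from four consecutive atom positions. For phi these are C of the previous residue, N, C-alpha, and C; for psi they are N, C-alpha, C, and N of the next residue. The sketch below is one common way to compute such an angle from Cartesian coordinates; the function names are illustrative and the test points are made-up coordinates, not atoms from a real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up points: the first and last atoms eclipse each other (cis), then
# the last atom is rotated a quarter turn about the central bond.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using atan2 rather than acos keeps the sign of the angle, which matters because phi and psi range over the full -180 to +180 degrees.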
Peptide bond formation in living cells is catalyzed by the ribosome. The energy for the reaction comes from ATP, which is spent when each amino acid is attached to its tRNA, while GTP hydrolysis by elongation factors drives the delivery of aminoacyl-tRNAs and the translocation of the ribosome along the mRNA. The sequence of amino acids in a protein (the primary structure) is determined by the sequence of codons in the messenger RNA, which is in turn determined by the DNA sequence of the gene. A typical protein contains 100-500 amino acid residues, though some (like titin, with over 34,000 residues) are much larger.
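The mapping from codons to amino acids can be sketched in a few lines. The example below builds the standard genetic code from a compact 64-letter string (one-letter amino acid codes, with * marking stop codons) and reads an mRNA from its first base. The function name and the short test sequence are illustrative, and real translation also involves start-site selection, which is ignored here.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; * marks a stop.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon ends the chain
            break
        protein.append(aa)
    return "".join(protein)


# AUG (Met), GCU (Ala), UGG (Trp), UAA (stop)
print(translate("AUGGCUUGGUAA"))
```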
Protein Folding and Structure
Protein structure is organized in four hierarchical levels. Primary structure is the linear amino acid sequence. Secondary structure refers to local folding patterns stabilized by backbone hydrogen bonds: alpha helices (right-handed coils where the C=O of residue n hydrogen bonds to the N-H of residue n+4) and beta sheets (extended strands running side by side, connected by hydrogen bonds between strands). Turns and loops connect helices and strands and often occur at the protein surface.
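The n to n+4 pattern of the alpha helix is easy to enumerate. The sketch below lists the backbone hydrogen bond pairs for an idealized helix of a given length; the function name is illustrative and residues are numbered from 1.

```python
def helix_hydrogen_bonds(n_residues: int):
    """List (C=O residue, N-H residue) pairs in an idealized alpha helix."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


pairs = helix_hydrogen_bonds(10)
print(pairs)
```

In this idealized picture a helix of N residues has N - 4 backbone hydrogen bonds, so the first four N-H groups and the last four C=O groups have no partner within the helix.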
Tertiary structure is the overall three-dimensional fold of a single polypeptide chain, stabilized by interactions between side chains: hydrophobic packing in the protein core, hydrogen bonds, ionic interactions (salt bridges) between oppositely charged residues, and disulfide bonds between cysteines. Quaternary structure describes the arrangement of multiple polypeptide chains (subunits) into a functional complex. Hemoglobin, for example, is a tetramer of two alpha and two beta subunits.
The main driving force for protein folding is the hydrophobic effect: nonpolar side chains are buried in the protein interior to minimize their contact with water, while polar and charged side chains are displayed on the surface. A folded protein is only marginally stable (about 20-60 kJ/mol more stable than the unfolded state), so small changes in temperature, pH, or chemical environment can cause denaturation (unfolding).
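To get a feel for what "marginally stable" means, the stability can be converted into a folded fraction. The sketch below assumes a simple two-state model (folded and unfolded only, with no intermediates), which is an idealization, and a temperature of 298.15 K; the function name is illustrative.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)


def fraction_folded(stability_kj_per_mol: float, temperature_k: float = 298.15) -> float:
    """Folded fraction in a two-state model.

    stability_kj_per_mol is how much lower the free energy of the folded
    state is than that of the unfolded state (positive means folded is favored).
    """
    k_fold = math.exp(stability_kj_per_mol / (R * temperature_k))  # [folded]/[unfolded]
    return k_fold / (1.0 + k_fold)


for stability in (0.0, 5.0, 20.0, 60.0):
    unfolded = 1.0 - fraction_folded(stability)
    print(f"{stability:5.1f} kJ/mol -> unfolded fraction {unfolded:.2e}")
```

A stability of 20 kJ/mol is comparable to the energy of a handful of hydrogen bonds, which is why losing only a few interactions can tip the balance toward the unfolded state.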
Enzymes: Protein Catalysts
Enzymes are proteins that catalyze biological reactions with extraordinary specificity and efficiency. They accelerate reactions by factors of 10^6 to 10^17 by stabilizing the transition state more than the ground state, lowering the activation energy. The active site, a three-dimensional pocket formed by specific amino acid residues, binds the substrate with complementary shape, charge, and hydrogen bonding interactions.
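The quoted rate enhancements can be translated into the corresponding drop in activation free energy. The sketch below assumes simple transition state theory, in which the rate constant scales as exp(-activation energy / RT) with the same prefactor for the catalyzed and uncatalyzed reactions, and a temperature of 298.15 K; both are simplifying assumptions, and the function name is illustrative.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)


def barrier_reduction_kj(rate_enhancement: float, temperature_k: float = 298.15) -> float:
    """Drop in activation free energy (kJ/mol) implied by a rate enhancement."""
    return R * temperature_k * math.log(rate_enhancement)


for factor in (1e6, 1e17):
    print(f"{factor:.0e}-fold faster -> barrier lowered by "
          f"{barrier_reduction_kj(factor):.0f} kJ/mol")
```

Because the dependence is logarithmic, each further tenfold gain in rate costs the same amount of extra transition state stabilization, roughly 5.7 kJ/mol at this temperature.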
Enzyme mechanisms use the same organic chemistry principles as laboratory reactions, but with a precision that synthetic chemistry cannot yet match. Serine proteases (like trypsin and chymotrypsin) use a catalytic triad of serine, histidine, and aspartate to cleave peptide bonds through a covalent acyl-enzyme intermediate. Lysozyme cleaves bacterial cell wall polysaccharides by stabilizing the oxocarbenium-ion-like transition state of glycosidic bond cleavage. Understanding enzyme mechanisms at the molecular level requires fluency in organic chemistry, stereochemistry, and acid-base chemistry.
The 20 standard amino acids, each with a unique side chain, polymerize through peptide bonds into polypeptide chains that fold into precise three-dimensional structures. Protein function depends on this folded structure, which is stabilized by hydrophobic packing, hydrogen bonds, ionic interactions, and disulfide bonds. Enzymes, the catalytic proteins, accelerate reactions using the same organic chemistry mechanisms found in laboratory reactions.