Proteins play crucial roles in almost every biological process. They are responsible in one form or another for a variety of physiological functions including:
Like most biological macromolecules, proteins are made up of simple building blocks; in the case of proteins, these building blocks are called amino acids. As shown below, the amino and carboxyl moieties in an amino acid are both attached to the same carbon (the alpha carbon); also located on the alpha carbon is an "R" group. The nature of this R-group (called the side chain) determines the identity of a particular amino acid. There are a total of 20 amino acids which are used to make up proteins (some modified or otherwise unusual amino acids exist that we will discuss later in the course). In solution at physiological pH (7.4), amino acids undergo an acid-base reaction to form zwitterions, in which the amino group is protonated (positive) and the carboxyl group is deprotonated (negative). In a zwitterion, the + and - charges cancel to give a molecule with a net charge of zero. This charge state depends on pH, because the pKa values for a typical amino acid (glycine, for example) are 9.6 and 2.3 for the amino and carboxyl groups, respectively. If the pH of an amino acid solution is lowered significantly from 7.4 (below the carboxyl pKa), a species results in which the amine group has a positive charge, while the carboxyl is neutral. Likewise, if the pH is raised significantly from 7.4 (above the amino pKa), a species results in which the amine group is neutral, while the carboxyl has a negative charge. Thus, the ionization state of amino acids is pH dependent.
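The pH dependence described above can be sketched numerically. The short Python example below is only an illustration: it uses the glycine pKa values quoted in the text (2.3 and 9.6) and the Henderson-Hasselbalch relationship to estimate the average net charge at a given pH. The function names are made up for this sketch.

```python
def protonated_fraction(pH, pKa):
    """Henderson-Hasselbalch: fraction of a group that still holds its proton."""
    return 1.0 / (1.0 + 10 ** (pH - pKa))


def glycine_net_charge(pH, pKa_carboxyl=2.3, pKa_amino=9.6):
    """Average net charge of glycine, using the pKa values quoted in the text."""
    # A protonated amine carries +1; a deprotonated carboxyl carries -1.
    positive = protonated_fraction(pH, pKa_amino)
    negative = 1.0 - protonated_fraction(pH, pKa_carboxyl)
    return positive - negative


for pH in (1.0, 7.4, 12.0):
    print(f"pH {pH:>4}: net charge {glycine_net_charge(pH):+.2f}")
```

At low pH the result approaches +1, near pH 7.4 it is close to zero (the zwitterion), and at high pH it approaches -1.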
All amino acids except glycine (R = H) are chiral. Every amino acid in mammalian systems exists in the L-configuration, where "L" signifies that the amino acid in Fischer projection is similar to L-glyceraldehyde. This description of stereochemistry is outdated, and is seldom used except in trivial names. Under the Cahn-Ingold-Prelog priority rules, nearly all of the natural L-amino acids are also in the S-configuration; the exception is cysteine, which is designated R because the sulfur atom raises the priority of its sidechain.
As was mentioned above, there are 20 amino acids which are used to make up proteins in mammalian biological systems. The amino acids are amphipathic molecules, meaning that they contain both polar and non-polar functional groups, and thus have a tendency to form interfaces between hydrophilic and hydrophobic molecules. The properties of each amino acid are dictated by the sidechain, which can vary in size, shape, charge, reactivity and ability to hydrogen bond. The amino acids are grouped according to the properties of their sidechains, as shown in the figure below. Each amino acid has a standard three-letter abbreviation which is used in lieu of a full structure, as seen in the figure.
The first six amino acids, glycine (GLY), alanine (ALA), leucine (LEU), isoleucine (ILE), proline (PRO) and valine (VAL) are aliphatic in nature. Glycine and alanine are too small to have a hydrophobic effect in proteins, but they are considered aliphatic amino acids. Proline is also aliphatic, and because of its cyclic structure, it can often be found in the bend portion of a protein chain. Valine, leucine and isoleucine are hydrophobic aliphatic, and although they can be found anywhere in the chain, they prefer to cluster in the inside region of a protein, away from water. This effect causes a significant stabilization of the protein structure.
There are three aromatic amino acids, phenylalanine (PHE), tyrosine (TYR) and tryptophan (TRP). These amino acids have sidechains which contain delocalized pi electrons that can interact with other pi systems in biomolecules. In addition, the phenolic hydroxyl of TYR can ionize under physiological conditions, and thus increase water solubility. Two of the amino acids are sulfur-containing, namely cysteine (CYS) and methionine (MET). These amino acids have special properties that will be covered at a later time. Finally, there are two hydroxyl-containing amino acids, serine (SER) and threonine (THR). These two amino acids have sidechains which can hydrogen bond to water or to other groups on neighboring macromolecules.
Five of the 20 amino acids are considered hydrophilic, in that they are able to ionize at physiological pH. The amino acids lysine (LYS), arginine (ARG) and histidine (HIS) are considered basic hydrophilic, since they contain basic sidechain groups that will have a positive charge at pH 7.4. The amino acids aspartic acid (ASP) and glutamic acid (GLU) are considered acidic hydrophilic, since they contain acidic sidechain groups that will have a negative charge at pH 7.4. These two amino acids also have amide counterparts, asparagine (ASN) and glutamine (GLN).
Note that 8 of the 20 amino acids have ionizable sidechains. Arginine, lysine and histidine can have a positive charge, while aspartic acid and glutamic acid can possess a negative charge under physiological conditions. It is also possible for serine, tyrosine and cysteine to ionize to a negatively charged species during certain biological processes.
Protein chains are held together by peptide bonds, which are simply amide linkages between neighboring amino acids. When two amino acids interact, an equilibrium is set up between unbound amino acids and a species in which two amino acids are linked, called a dipeptide. Since this equilibrium favors the unlinked forms of the amino acids, it is clear that formation of a peptide bond requires energy. When a few amino acids become linked, the species is called an oligopeptide, and when many are linked, the species is called a polypeptide. Polypeptides are generally between 50 and 2000 amino acids long. Their molecular weights are expressed in Daltons, where 1 Dalton is equal to 1 atomic mass unit (approximately the mass of one hydrogen atom). 1000 Daltons is called a kilodalton (kDa). Most proteins have masses between 5500 and 220,000 Daltons.
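The two ranges quoted above (50 to 2000 amino acids, and 5500 to 220,000 Daltons) imply an average of about 110 Daltons per amino acid residue. The following Python sketch uses that implied average as an assumption to give a rough mass estimate from chain length; it is an approximation only, since real residues differ in mass.

```python
# Assumption: average residue mass implied by the ranges in the text
# (5500 Da / 50 residues = 220,000 Da / 2000 residues = 110 Da).
AVERAGE_RESIDUE_MASS_DA = 110


def estimate_mass_da(n_residues):
    """Rough polypeptide mass in Daltons from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


def to_kilodaltons(mass_da):
    """Convert Daltons to kilodaltons (1 kilodalton = 1000 Daltons)."""
    return mass_da / 1000.0


for n in (50, 300, 2000):
    mass = estimate_mass_da(n)
    print(f"{n} residues: about {mass} Da ({to_kilodaltons(mass)} kilodaltons)")
```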
Each peptide chain has two free ends: the amino terminus, which is always drawn on the left by convention, and the carboxyl terminus, which is always drawn on the right. This convention extends to peptide chains expressed using three-letter abbreviations. Thus, the oligopeptide ALA-GLY-TRP-SER-GLU has an alanine at the amino terminus, and a glutamic acid at the carboxyl terminus.
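The left-to-right convention can be expressed in a few lines of Python. This sketch assumes the peptide is written as hyphen-separated three-letter abbreviations, as in the example above; the function name is hypothetical.

```python
def termini(peptide):
    """Return (amino-terminal residue, carboxyl-terminal residue).

    By convention the first abbreviation is the amino terminus
    and the last abbreviation is the carboxyl terminus.
    """
    residues = peptide.split("-")
    return residues[0], residues[-1]


n_term, c_term = termini("ALA-GLY-TRP-SER-GLU")
print("amino terminus:", n_term)
print("carboxyl terminus:", c_term)
```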
The amino acids in a protein are determined by the genetic code, wherein a three-base sequence of nucleotides called a codon calls for a specific amino acid to be added to the growing chain. The process of converting the sequence of codons into a sequence of amino acids entails transcription (the conversion of a segment of DNA into complementary mRNA) and translation (the conversion of the mRNA code into protein). You will learn a great deal more about protein synthesis later in the semester.
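As a rough illustration of these two steps, the Python sketch below transcribes a short DNA template into mRNA and then reads the mRNA one codon at a time. The template sequence is invented for this example, the codon table contains only the handful of standard-code codons needed here, and the template is assumed to be written 3' to 5' so that the mRNA is simply its base-by-base complement.

```python
# DNA template base -> complementary mRNA base
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# Partial codon table (standard genetic code); None marks a stop codon.
CODON_TABLE = {
    "AUG": "MET", "GCU": "ALA", "GGU": "GLY", "UGG": "TRP",
    "UCU": "SER", "GAA": "GLU", "UAA": None,
}


def transcribe(template_3_to_5):
    """Build the mRNA (5' to 3') complementary to a template written 3' to 5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon: chain is finished
            break
        peptide.append(amino_acid)
    return "-".join(peptide)


template = "TACCGACCAACCAGACTTATT"  # hypothetical template strand
mrna = transcribe(template)
print(mrna)
print(translate(mrna))
```

The peptide is printed amino terminus first, matching the drawing convention used for peptide chains.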
As shown below, amino acids can participate in reactions that occur after they are positioned in a peptide chain. These reactions are called post-translational modifications, and can be of enormous biological significance. One example of a post-translational modification is the crosslinking of two cysteines to form a new amino acid, called cystine. This modification most often occurs in extracellular proteins, and can contribute to their three-dimensional structure.
There are other post-translational modifications of biological significance, three of which are shown below. In some proteins, acetylation of the amino terminus occurs. This modification greatly decreases protein degradation, since many proteases require a free amino terminus to act. In structural proteins such as collagen, hydroxylation of proline occurs to afford hydroxyproline (HPRO). Since hydroxyproline has a hydrogen-bonding sidechain, it lends additional strength to the collagen structure, and hence to tendons and other like tissues. Finally, the amino acids serine, threonine and tyrosine can be phosphorylated within a protein chain. This modification is often used by the cell to turn on or off a critical biological process.
In addition to the post-translational modifications mentioned above, some proteins are synthesized in inactive forms called pro forms. For example, some enzymes are synthesized as inactive proenzymes, and are trimmed by a peptidase to form the active enzyme. The portion of the enzyme chain that is cleaved is then hydrolyzed, and the amino acids are reused.
In 1953, Sanger performed a critical series of experiments in which he demonstrated several facets of protein structure. His experiments showed that proteins have a unique amino acid sequence; all molecules of a given protein are identical, and the sequence of each different protein is unique. He also showed for the first time that all amino acids in mammalian proteins are in the L-configuration, that the peptide bond is an amide bond, and that amino acids have alpha amino groups and alpha carboxyl groups. We now know that proteins are made when a section of DNA is read (a process called transcription) and a complementary molecule of RNA is formed. This RNA is then used to specify the structure of a given protein through a process called translation. Thus, the sequence of a protein is encoded in DNA.
The sequence of a peptide is important for other reasons including these:
The peptide bond has unique characteristics which contribute to the overall structure of proteins. The peptide bond itself is rigid, and thus is not free to rotate. This rigidity arises because resonance delocalization of the nitrogen lone pair into the carbonyl group gives the amide C-N bond considerable double bond character. The other bonds in a peptide are not rigid and can rotate freely, giving the protein chain many degrees of rotational freedom. The amide bond, together with the bonds on either side of it that connect to the alpha carbons, makes up the backbone of the protein chain.
Proteins have a total of four levels of structure, as defined below:
Proteins can be associated with membranes, and in fact carry out almost every membrane function. Interestingly, membrane proteins have special characteristics that allow them to exist in this lipid environment. Proteins that sit on the inner or outer surface of the membrane are called extrinsic or peripheral, and have a large percentage of hydrophobic amino acids in the portion of the molecule that is close to the hydrophobic membrane structure. The amino acids on the outer portion of the protein (facing the aqueous environment of the cytoplasm or extracellular fluid) are mostly hydrophilic, allowing the protein to be compatible with water. Proteins can also traverse the membrane, and in this case they are called intrinsic or integral. The portion of the protein that passes through the membrane is composed of hydrophobic amino acid residues, while the inner and outer portions exposed to water are largely hydrophilic. Transmembrane proteins can move laterally in the membrane, but cannot flip-flop from one face of the membrane to the other.
Proteins are a unique class of biomolecules, in that they can recognize and interact with diverse substances. They contain complementary clefts and surfaces which are shaped to bind to specific molecules. Often only a single molecule or even a single stereoisomer can bind to a complementary protein surface. Once this binding takes place, a complex is formed. This induces a conformational change which may act as a signal within the cell, or may serve to activate an enzyme.
There are a number of experimental procedures which may be used to characterize peptides and larger protein molecules. Six of these methods are discussed below:
A second common electrophoresis procedure is known as isoelectric focusing, because proteins migrate until they reach electroneutrality. Consider a protein that has 50 ionizable sidechains, 25 that can be positive and 25 that can be negative. The isoelectric point (pI) is the pH at which the number of positive charges equals the number of negative charges, so that the net charge on the protein is zero. In isoelectric focusing, a polyacrylamide gel is treated with ampholines, which set up a pH gradient across the length of the gel. As shown above, each protein will "focus" at the point on the gel where the pH equals its isoelectric point, at which time it stops moving. Since isoelectric focusing is non-denaturing, it can be used to isolate active proteins in their native form.
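The idea of an isoelectric point can be sketched with a short calculation. The Python example below takes the hypothetical protein described above (25 groups that can be positive, 25 that can be negative) and searches for the pH at which the net charge is zero. The pKa values (10.5 for the basic groups, 4.0 for the acidic groups) are illustrative assumptions, not values from these notes, and a real protein would have a different pKa for each group.

```python
def net_charge(pH, basic_pkas, acidic_pkas):
    """Average net charge from Henderson-Hasselbalch for each ionizable group."""
    # Basic groups are +1 when protonated; acidic groups are -1 when deprotonated.
    positive = sum(1.0 / (1.0 + 10 ** (pH - pka)) for pka in basic_pkas)
    negative = sum(1.0 / (1.0 + 10 ** (pka - pH)) for pka in acidic_pkas)
    return positive - negative


def isoelectric_point(basic_pkas, acidic_pkas, lo=0.0, hi=14.0, tol=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge only decreases as pH rises, so the zero crossing is unique.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(mid, basic_pkas, acidic_pkas) > 0:
            lo = mid  # still net positive: pI lies at higher pH
        else:
            hi = mid  # net negative (or zero): pI lies at lower pH
    return (lo + hi) / 2.0


basic = [10.5] * 25   # assumed pKa for each of the 25 basic sidechains
acidic = [4.0] * 25   # assumed pKa for each of the 25 acidic sidechains
print(f"estimated pI: {isoelectric_point(basic, acidic):.2f}")
```

With equal numbers of groups, the estimate falls midway between the two assumed pKa values; changing the counts or pKa values shifts the pI accordingly.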
Peptides can also be synthesized by an automated process. These peptides are constructed on beads made of polystyrene or some other solid support in a process known as solid phase synthesis. As shown below the bead is reacted with the carboxyl end of an amino acid in which a protecting group such as N-Boc is in place to keep the amine from reacting prematurely. Once the amino acid is attached to the bead, the amino terminus is
The cycle of removal of the protecting group and addition of amino acids is continued until the desired peptide has been formed, and then the peptide is released from the bead using HF.
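Because the first amino acid is attached to the bead through its carboxyl end, the peptide is built from the carboxyl terminus toward the amino terminus. The Python sketch below lists the steps for a target peptide under that reading of the procedure. The step wording and function name are illustrative only, and removal of the final protecting group is not modeled separately.

```python
def solid_phase_steps(target):
    """List the synthesis steps for a peptide written amino terminus first."""
    residues = target.split("-")
    c_to_n = residues[::-1]  # synthesis starts at the carboxyl-terminal residue

    steps = [f"attach N-Boc-{c_to_n[0]} to the bead through its carboxyl group"]
    for residue in c_to_n[1:]:
        # One cycle: expose the amine, then add the next protected amino acid.
        steps.append("remove the N-Boc protecting group")
        steps.append(f"couple N-Boc-{residue}")
    steps.append("release the peptide from the bead with HF")
    return steps


for number, step in enumerate(solid_phase_steps("ALA-GLY-TRP"), start=1):
    print(number, step)
```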
4. Ion Exchange Chromatography. Ion exchange chromatography separates proteins based on their charge, as shown below. There are two methods, known as anion exchange (shown below) and cation exchange. In anion exchange chromatography, a protein is added to a column packed with beads which bear a positively charged group such as diethylaminoethyl. The negative charges on the protein displace the counterion (chloride is shown) and stick to the bead. After washing the column, the protein is eluted using another negative ion. Sodium chloride in a concentration gradient is commonly used; the more negative charges on a protein, the better it sticks, and the more NaCl is needed to displace it. A cation exchange column works the same way, except that the charge on the bead is negative, and proteins stick by their positively charged residues.
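The rule stated above for anion exchange (more negative charges means tighter binding and later elution) can be sketched as a simple ordering. In the Python example below the protein names and charge counts are invented for illustration; real elution behavior also depends on pH, charge distribution and the resin.

```python
def anion_exchange_elution_order(negative_charges):
    """Order proteins from first eluted (low NaCl) to last eluted (high NaCl).

    negative_charges maps a protein name to its number of negative charges.
    Fewer negative charges means weaker binding to the positive beads.
    """
    return sorted(negative_charges, key=negative_charges.get)


# Hypothetical mixture: name -> number of negative charges
mixture = {"protein A": 12, "protein B": 3, "protein C": 7}
print(anion_exchange_elution_order(mixture))
```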
5. Affinity Chromatography. Affinity chromatography is used to isolate one particular protein from a mixture, as shown in the figure below. An epoxysepharose gel is allowed to react with a ligand that has an affinity for the protein of interest, and the protein mixture is then added to the column. Only the protein that binds to the ligand will stick. After washing the column to remove the rest of the protein, the protein of interest is eluted using a salt gradient.
6. Enzyme-Linked Immunosorbent Assay (ELISA). Enzyme-linked immunosorbent assay, or ELISA, depends on the reaction of a predetermined protein with a specific antibody to form a complex. This method is extremely sensitive, and can distinguish between two proteins that differ by only one amino acid. A serum or blood sample is added to the specific antibody, which has been bound to a polymer support, and the first complex forms. A second antibody, specific for the protein of interest but linked to an enzyme, is then added, forming a complex that is bound to an active enzyme. The enzyme carries out the conversion of a non-colored or non-fluorescent substrate to a colored or fluorescent product, which is measured. The more color that is produced, the more of the protein of interest is present. This technique is the basis for many diagnostic tests, including pregnancy tests, in which human chorionic gonadotropin is measured.
1. The Renin-Angiotensin-Aldosterone System. The renin-angiotensin-aldosterone system is used by the body to regulate blood pressure (see the figure below). In response to lowered blood pressure, the kidney releases the protease renin, which cleaves the inactive, 14-amino acid peptide angiotensinogen to another inactive peptide, the decapeptide angiotensin I. A second enzyme, angiotensin-converting enzyme (ACE), converts this decapeptide to its active form, the octapeptide angiotensin II. Angiotensin II is a potent vasoconstrictor that is about 40 times more potent than norepinephrine at raising vascular pressure. In addition, angiotensin II stimulates the release of aldosterone, a steroid hormone that causes the kidney to reabsorb sodium and water, thus raising blood pressure by an osmotic effect. Angiotensin II is ultimately inactivated by a third peptidase called angiotensinase.
The renin-angiotensin-aldosterone system is of great importance in the development of a common disease known as essential hypertension. When the renin-angiotensin-aldosterone system is overactive, the basal blood pressure is elevated, putting increased stress on the cardiovascular system. A group of compounds known as ACE inhibitors has been developed, and these are used quite effectively to treat hypertension. Since they prevent the conversion of angiotensin I to angiotensin II, they prevent the elevation of blood pressure seen in essential hypertension.
2. Oxytocin and Vasopressin. Oxytocin and vasopressin are two peptide hormones with very similar structures, but with very different biological activities. Their primary sequences are shown below. Interestingly, their structures differ by only two amino acid residues (ILE number 3 in oxytocin is replaced by PHE in vasopressin, and the hydrophobic LEU number 8 in oxytocin is replaced by a hydrophilic ARG residue in vasopressin). Oxytocin is a potent stimulator of uterine smooth muscle, and also stimulates lactation. However, vasopressin, also known as antidiuretic hormone (ADH), has no effect on uterine smooth muscle, but causes reabsorption of water by the kidney, thus increasing blood pressure.
3. Insulin and Glucagon. Insulin is an extremely important peptide hormone that is produced by the beta cells of the Islets of Langerhans in the pancreas. It has 51 amino acids and three disulfide crosslinks, and is composed of two separate chains, termed A and B. Insulin has a number of important effects on cells in the body including:
Insulin is not synthesized in active form, but is first made as a single inactive peptide chain called preproinsulin (see the figure below). Preproinsulin has no crosslinks, and in addition to the A and B chain, has two additional portions called the signal sequence and the connecting (C) peptide. The signal sequence informs the cell that insulin is being made, and that the finished preproinsulin should be deposited outside the cell. The C-peptide is necessary to allow preproinsulin to fold in the correct conformation to ultimately produce active insulin. Preproinsulin is processed by a two step procedure; in the first step, the signal sequence is cleaved by a peptidase, and two of the three crosslinks are formed to give a new but still inactive peptide called proinsulin. A second peptidase then cleaves the C-peptide, and an internal disulfide forms to produce insulin.
Glucagon is a peptide hormone that is formed in the alpha cells of the Islets of Langerhans in the pancreas. It is a single chain peptide consisting of 29 amino acid residues, and has effects which oppose insulin, including:
4. Hemoglobin. Hemoglobin A (HbA) is a tetrameric protein which consists of two alpha chains and two beta chains, and comprises about 98% of adult human hemoglobin. There is a heme group and an oxygen binding site on each subunit; therefore, each molecule of HbA can carry 4 molecules of oxygen. There are other forms of adult human hemoglobin, the most common being HbA2, which has two alpha chains and two delta chains, and accounts for about 2% of the total.
Hemoglobin is an example of an allosteric protein, i.e. its function can be altered by the binding of some external substance (called the effector) at a site on the molecule other than the active site (the allosteric site). When an allosteric effector binds to a protein, it induces a conformational change which turns the function of the protein either on (positive allosterism) or off (negative allosterism). In the case of hemoglobin, the allosteric effector is 2,3-diphosphoglycerate (2,3-DPG), which causes hemoglobin to have 1/26th of its normal affinity for oxygen. This is an important issue, since 2,3-DPG in the tissues triggers the release of oxygen at the correct location.
Hemoglobin also exhibits cooperativity, which is a phenomenon wherein the binding of one molecule to a protein with more than one binding site influences the ease of binding of subsequent molecules. Cooperativity can be positive (the second molecule binds more easily), or negative (the second molecule binds less easily). In the case of hemoglobin, the binding of oxygen to the four sites of hemoglobin is an example of positive cooperativity.
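One common way to picture positive cooperativity is the Hill equation, which is not part of these notes but is brought in here as a simple model. The Python sketch below compares a cooperative binding curve with a non-cooperative one. The Hill coefficient (2.8) and half-saturation pressure (26 mmHg) are typical textbook values for hemoglobin, used here only as assumptions for illustration.

```python
def fractional_saturation(pO2, p50=26.0, hill_n=2.8):
    """Hill equation: fraction of oxygen binding sites occupied.

    pO2 and p50 are in mmHg; hill_n > 1 models positive cooperativity,
    hill_n = 1 models independent (non-cooperative) binding.
    """
    return pO2 ** hill_n / (p50 ** hill_n + pO2 ** hill_n)


for pO2 in (10, 26, 40, 100):
    cooperative = fractional_saturation(pO2)
    independent = fractional_saturation(pO2, hill_n=1.0)
    print(f"pO2 {pO2:>3} mmHg: cooperative {cooperative:.2f}, "
          f"non-cooperative {independent:.2f}")
```

In this model the cooperative curve is lower at low oxygen pressure and higher at high oxygen pressure than the non-cooperative curve, which gives the familiar S-shaped binding curve.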
As shown in the figure below, hemoglobin can also exist in a glycosylated form known as HbA1C. HbA1C is formed when the amino terminus of HbA reacts with glucose, first reversibly forming an aldimine (Schiff base), and then undergoing an irreversible Amadori rearrangement to afford the ketoamine form, HbA1C. In normal patients, HbA1C accounts for about 3-5% of HbA, but in diabetics who have elevated blood glucose for extended periods, this number can reach 6 to 15%. Physicians can measure HbA1C, and use it as a reliable way to monitor how well diabetic patients are complying with their insulin therapy.
5. Collagen. Collagen is a connective tissue protein that is found in skin, bone, tendons, cartilage, the cornea, etc. It is quite insoluble in water, and is composed of two types of chain termed alpha-1 and alpha-2. In the amino acid sequence of collagen, about every third amino acid is a GLY residue, and there are many prolines which are hydroxylated to form hydroxyproline (HPRO). LYS residues are also hydroxylated in collagen to form HLYS. These additional sidechain OH groups allow for extra strength due to H-bonding, and the GLY residues allow the protein to coil more tightly, since they fit on the inside of the helix. In a collagen fiber, three of these helices are coiled together to form a rope-like structure called a superhelical coil. It is this structure that gives collagen its great strength. Collagen structure can be disrupted in diseases such as scurvy, which results from a lack of ascorbic acid, a cofactor in the hydroxylation of proline. In addition, collagen structure is disrupted in rheumatoid arthritis.
from www.mun.ca/biology/scarr/Collagen_structure.html