CHEM 440
Biochemistry I

J. D. Cronk

Lecture 11. Protein structure: Part III

Friday 30 September 2011

Fibrous proteins and the collagen triple helix. Residue propensities in proteins. Experimental structure determination methods: X-ray crystallography and NMR. Structural distribution of residues.

Reading: VVP3e - Ch.6, pp.134-146.



Lecture 11 Summary

Having been introduced to the elements of secondary structure and the fundamental features of the structures of the fibrous structural proteins keratin and collagen, we are getting ready to move on to consider tertiary and quaternary structures of globular proteins. First we review and discuss some further features of keratin and collagen, then make a bit of a detour into methods for structure determination, X-ray crystallography and multidimensional NMR. Along the way, we'll also consider some information about residue propensities, both with respect to secondary structure and general features of their higher-order structural distribution.

Keratin is a coiled-coil

The coiled-coil of keratin is based on the α-helix, except that each helix is twisted slightly more, accomplishing one turn every 3.5 residues (instead of 3.6). The helices wind around each other, with the axis of each forming a left-handed supercoil.

The extended coiled-coils of keratin assemble into higher-order structures, which are stabilized in part by cross-linking of cysteine residues via disulfide bonds.

Collagen is a triple helix

The primary structure of the collagen triple helix consists of a repetitive sequence, Gly-X-Y, where X and Y are often Pro or Hyp (hydroxyproline). Each chain forms a left-handed helix with about three residues per turn, and we emphasize that this is not to be confused with the α-helix. Three such left-handed helical chains wind around each other, the axis of each tracing out a right-handed supercoil. The amide group of glycine and acyl oxygen of proline form interchain hydrogen bonds through the middle of the triple helix.

There are many forms of collagen, and the triple helices assemble into higher-order structures (denoted as "microfibrils").

Post-translational modifications are an especially interesting aspect of collagen. Two important modifications:

  • Hydroxyproline - enables the formation of hydrogen bonds between individual helices
  • Allysine residues are modified lysine residues which lead to covalent cross-linking between triple helices.

The hydroxylation of proline and lysine residues of collagen is carried out by hydroxylation enzymes that rely on ascorbate (vitamin C) for their function. Absence of vitamin C from the diet of humans leads to the collagen defects underlying the disease scurvy.

 

Protein structure determination: X-ray crystallography and multidimensional NMR

X-ray crystallography provides a high-resolution view of the structures of molecules. As applied to proteins, it typically reveals the locations of all non-hydrogen atoms that are well-localized in a tertiary structure, at a resolution in the range of 3 to 1 Å.

 

Protein structure determination by multidimensional NMR

Recall that nuclear magnetic resonance (NMR) is based upon the quantum nature of atomic nuclei; in particular, a property called "nuclear spin". In the absence of an applied magnetic field, the two possible spin states of a proton (1H nucleus) are degenerate, i.e. of equal energy. However, an external magnetic field creates an energy gap ΔE that is proportional to the strength H₀ of the applied field:

ΔE = hν₀, where ν₀ = γH₀/2π .

In the above relation, h is Planck's constant and γ is the magnetogyric ratio, a constant characteristic of the nucleus being observed in the magnetic field. The equation expresses the condition for a resonant frequency (of electromagnetic radiation) ν₀, which lies in the radio frequency part of the spectrum. When a photon (quantum of electromagnetic radiation) with the resonant frequency is absorbed by a nucleus in the lower energy state (with the nuclear spin aligned with the direction of the magnetic field, analogous to the alignment of a compass needle to the earth's magnetic north pole), the nucleus is promoted to the higher energy state, in which its spin is opposed to the field.
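To get a feel for the magnitudes involved, the resonance condition can be evaluated numerically. The following is a minimal sketch, assuming the field strength is expressed in tesla and using the CODATA value of the 1H magnetogyric ratio; the function names and the example field of 11.74 T are illustrative choices, not part of the lecture material.

```python
import math

PLANCK_H = 6.62607015e-34   # Planck constant h, in J*s (exact SI value)
GAMMA_1H = 2.6752218744e8   # 1H magnetogyric ratio, in rad s^-1 T^-1 (CODATA 2018)


def resonant_frequency_hz(field_tesla, gamma=GAMMA_1H):
    """Resonant frequency nu_0 = gamma * H_0 / (2*pi), in Hz."""
    return gamma * field_tesla / (2 * math.pi)


def energy_gap_joules(field_tesla, gamma=GAMMA_1H):
    """Energy gap between the two spin states, Delta E = h * nu_0, in J."""
    return PLANCK_H * resonant_frequency_hz(field_tesla, gamma)


# Hypothetical example field strength (an assumption for illustration).
field = 11.74  # tesla
nu0 = resonant_frequency_hz(field)
print(f"nu_0    = {nu0 / 1e6:.1f} MHz")
print(f"Delta E = {energy_gap_joules(field):.3e} J")
```

For a field of this size the frequency comes out near 500 MHz, which is indeed in the radio frequency range, and doubling the field doubles both ν₀ and ΔE.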

In "1-D" 1H-NMR for small organic molecules, the different hydrogens are found in differing chemical (electronic) environments, the result being that the magnetic field strength at a particular hydrogen nucleus varies from the applied field strength by the effects of electrodynamic shielding or deshielding. This gives rise to what is termed chemical shift, δ, which is defined using a reference frequency.
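The definition of δ relative to a reference frequency can be made concrete with a short calculation. This sketch assumes the conventional parts-per-million form, δ = (ν − ν_ref)/ν_ref × 10⁶; the function name and the example frequencies are hypothetical.

```python
def chemical_shift_ppm(nu_sample_hz, nu_ref_hz):
    """Chemical shift in ppm relative to a reference resonance frequency."""
    return (nu_sample_hz - nu_ref_hz) / nu_ref_hz * 1e6


# Hypothetical numbers: a reference resonating at exactly 500 MHz and a
# proton resonating 3650 Hz downfield (at higher frequency) from it.
nu_ref = 500.0e6
nu_obs = nu_ref + 3650.0
delta = chemical_shift_ppm(nu_obs, nu_ref)
print(f"delta = {delta:.2f} ppm")

# On a spectrometer with half the field, the same proton is offset by half
# as many Hz, so delta is unchanged.
delta_half_field = chemical_shift_ppm(250.0e6 + 1825.0, 250.0e6)
print(f"delta at half the field = {delta_half_field:.2f} ppm")
```

Expressing the shift as a ratio is what makes δ values comparable between instruments operating at different field strengths.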

The 1H-NMR spectra for small organic molecules reveal other features besides the chemical shift values for different types of hydrogens in the molecules. They also display the effects of signal "splitting" occurring between hydrogen atoms attached to adjacent covalently-bonded carbon atoms. The splitting occurs because the spin states of the hydrogen nuclei are coupled through the three bonds by which they are linked. This through-bond spin-spin coupling, or J-coupling, is responsible for the doublets, triplets, quartets, etc., of peaks for a given type of proton in the molecule, depending on the number of neighboring hydrogens.
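The dependence of the splitting pattern on the number of neighboring hydrogens can be sketched with the familiar first-order "n + 1" rule. The code below assumes that all n neighbors are equivalent and couple with the same J, in which case the relative peak intensities follow the binomial coefficients; the function and table names are illustrative.

```python
from math import comb

MULTIPLET_NAMES = {1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet"}


def multiplet(n_neighbors):
    """Return (name, relative intensities) for n equivalent neighboring H.

    First-order approximation: n neighbors give n + 1 peaks whose relative
    intensities are the binomial coefficients C(n, k).
    """
    n_peaks = n_neighbors + 1
    intensities = [comb(n_neighbors, k) for k in range(n_peaks)]
    name = MULTIPLET_NAMES.get(n_peaks, f"{n_peaks}-line multiplet")
    return name, intensities


for n in range(4):
    name, intensities = multiplet(n)
    print(f"{n} neighbor(s): {name}, intensities {intensities}")
```

Real spectra deviate from this idealized picture when neighbors are inequivalent or when coupling constants are comparable to the chemical shift differences.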

Once we go beyond small organic molecules and acquire 1H-NMR spectra for polypeptides of the size of a small protein, we find that they are quite crowded and uninterpretable in detail. One can attribute peaks in different regions to the characteristic types of hydrogen nuclei in polypeptides: amide hydrogens generally lie furthest downfield in the spectra (δ lies in the range 7 - 8.3), then aromatic hydrogens (δ 5.7 - 7.5), followed by hydrogens attached to Cα (δ 3.3 - 5.3), and finally other aliphatic hydrogens the furthest upfield (hydrogens attached to secondary Cβ, Cγ, or Cδ, δ 1.5 - 3.3; methyl group hydrogens, δ up to 1.5). Yet it is impossible to resolve all the peaks, let alone assign each specific hydrogen type in all the peptide's residues to a peak in the spectrum.
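The approximate regions quoted above can be written as a simple lookup, which also makes plain why the region alone does not settle an assignment. This is only a sketch using the ranges given in the text; the lower bound of 0 for methyl hydrogens and all names in the code are assumptions made for illustration.

```python
# (label, lower delta, upper delta) in ppm, taken from the ranges in the text.
SHIFT_REGIONS = [
    ("amide H", 7.0, 8.3),
    ("aromatic H", 5.7, 7.5),
    ("H on C-alpha", 3.3, 5.3),
    ("other aliphatic H (C-beta, C-gamma, C-delta)", 1.5, 3.3),
    ("methyl H", 0.0, 1.5),  # lower bound of 0.0 is an assumption
]


def candidate_types(delta_ppm):
    """List every hydrogen type whose quoted range contains delta_ppm."""
    return [name for name, low, high in SHIFT_REGIONS if low <= delta_ppm <= high]


for delta in (7.2, 4.5, 0.9):
    print(f"delta {delta}: {candidate_types(delta)}")
```

A peak at δ 7.2 falls in both the amide and the aromatic range, and even an unambiguous region says nothing about which residue the hydrogen belongs to.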

One approach to improve resolution for larger molecules is to employ a stronger magnetic field. Certainly the use of more powerful magnets has helped make NMR of polypeptides and small proteins useful. The implementation of multidimensional methods was also crucial. An apt analogy is perhaps the ability of two-dimensional electrophoresis to resolve the hundreds or even thousands of individual proteins in a complex sample such as a cell extract. A two-dimensional COSY (COrrelation SpectroscopY) spectrum probes the chemical shifts of, and the J-coupling among, the hydrogen nuclei, and makes it possible to discern patterns of off-diagonal "cross-peaks" that are characteristic of the through-bond coupling within different types of amino acid side chains.

Another type of interaction becomes relevant in the application of NMR to macromolecules with well-defined conformational ensembles. This is the effect that nearby nuclei can have on each other's resonance signals when they are simply juxtaposed "through space", and not linked by a small number of chemical bonds. It is called the nuclear Overhauser effect (nOe), and it is effective only over very short ranges (the nuclei must be within about 5 Å). Multidimensional NMR is able to incorporate the effects of such through-space nuclear coupling, and nOe signals can similarly be identified with cross-peaks in a 2-D spectrum.

Still, the "assignment problem" (i.e. answering the question of which peak belongs to which proton) in multidimensional NMR is the biggest challenge in applying it to the problem of protein structure determination. Many methods have been developed to meet this challenge, including the use of three- and even four-dimensional spectra and the labelling of the proteins being investigated (incorporation of isotopes) with 15N and 13C, so that the NMR spectra for these nuclei can be used as additional dimensions. The "editing" of polypeptide 1H-NMR spectra is also performed by incubation in solvent containing D2O, allowing for the exchange of labile hydrogens with deuterium (D).

In principle, solving the assignment problem with a sufficient quantity of nOe data allows the application of a set of distance constraints derived from the latter to calculate a family of closely related structures consistent with those constraints. Such a family of structures, called an "ensemble", is the form in which the results of structure determination by NMR are reported, although it is also common for a "most representative structure" to be derived from the ensemble.
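The idea of testing a candidate structure against nOe-derived distance constraints can be sketched in a few lines. This is a toy illustration, not an actual structure calculation: it assumes that each observed nOe simply imposes an upper bound of about 5 Å on the distance between two assigned protons, and the proton labels, coordinates, and restraint list are all hypothetical.

```python
import math

NOE_UPPER_BOUND = 5.0  # Å, the approximate nOe range quoted in the text


def violations(coords, restraints, upper=NOE_UPPER_BOUND):
    """Return (proton_i, proton_j, distance) for each restraint the model breaks.

    coords maps a proton label to its (x, y, z) position in Å;
    restraints is a list of (label_i, label_j) pairs with an observed nOe.
    """
    broken = []
    for i, j in restraints:
        d = math.dist(coords[i], coords[j])
        if d > upper:
            broken.append((i, j, d))
    return broken


# Hypothetical model coordinates and observed nOe pairs.
model = {
    "HA_5": (0.0, 0.0, 0.0),
    "HN_6": (2.2, 0.0, 0.0),
    "HB_40": (3.0, 4.0, 0.0),
    "HG_41": (9.0, 0.0, 0.0),
}
observed_noes = [("HA_5", "HN_6"), ("HA_5", "HB_40"), ("HN_6", "HG_41")]

for i, j, d in violations(model, observed_noes):
    print(f"violated: {i} to {j} is {d:.1f} Å apart in the model")
```

In this picture, a member of the ensemble is a set of coordinates for which such a list of violations is empty, or nearly so; note that the second restraint pairs protons from residues far apart in sequence, which is the kind of information that defines the fold.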

Limitations of NMR

NMR and crystallography can be viewed as complementary methods in structure determination. Structures determined by these methods are most often in close agreement. While one cannot directly assign a resolution to an NMR protein structure, NMR does provide evidence of the dynamic features of proteins that is largely lacking in crystal structures. Some of the limitations of NMR as a protein structure determination method include the following:

  • Upper limit on size is about 40 kDa
  • NMR structures are not “high-resolution”
  • “Dynamic” features may be hard to distinguish from uncertainty in data
  • The protein must be soluble to high concentration

Evaluation of NMR structures

  • How “well-determined” is the structure? (The best NMR structures have 15 – 25 nOe's per residue.)
  • How closely does the model agree with the data? (e.g. does the model predict observed nOe's?)
  • Stereochemical quality of the model (e.g. are there φ, ψ outliers? Unfavorable bond angles, bad contacts?)
  • Is the structure supported by biochemical evidence?
 

 

 


[ E-mail: cronk@gonzaga.edu ]