to determine 3D structure of molecules at atomic resolution
to determine where the protein is located and what the function will be
Workflow: 1. Protein crystallisation 2. X-ray diffraction 3. Electron density maps 4. Atomic model
Protein crystallisation = starts from a pure protein solution (>95%), produced recombinantly. Proteins can be expressed in bacteria, mammalian cells or insect cells
X-ray crystallography needs crystals
Protein + specific solution = crystals
Crystal
Ordered 3D array of molecules
Asymmetric unit
smallest repeating object within crystal
Unit cell
smallest volume element completely representative of the whole crystal
Not all diffracted rays are useful: X-rays can bounce off the crystal in multiple ways = many planes. Only consider the ones that go through the two vertices
how x-rays are diffracted = diffraction pattern of isolated protein
Constructive interference: intensifies the reflection. Occurs when two or more waves combine in phase to produce a resultant wave with an amplitude greater than that of either individual wave (equal to the sum of their amplitudes when fully in phase)
Destructive interference
waves cancel each other out
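A minimal sketch of the two cases above, assuming two waves of the same wavelength that differ only in amplitude and phase. The function name is hypothetical and the amplitudes are arbitrary illustration values.

```python
import math

def resultant_amplitude(a1, a2, phase_diff):
    """Amplitude of the sum of two waves with the same wavelength.

    a1, a2: amplitudes of the individual waves
    phase_diff: phase difference between them, in radians
    """
    # Phasor addition: |a1 + a2 * exp(i * phase_diff)|
    squared = a1**2 + a2**2 + 2 * a1 * a2 * math.cos(phase_diff)
    return math.sqrt(max(0.0, squared))  # guard against tiny negative rounding

in_phase = resultant_amplitude(1.0, 1.0, 0.0)          # constructive
out_of_phase = resultant_amplitude(1.0, 1.0, math.pi)  # destructive
print(in_phase, out_of_phase)
```

In phase, the amplitudes add; half a wavelength out of phase, two equal waves cancel.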
Bragg's law (n λ = 2 d sin θ; λ = wavelength, d = spacing between planes, θ = angle between the X-ray beam and the plane) predicts when there is constructive interference
the greater the angle the higher the resolution
3 Å is low res, 1 Å is high res
Extract theta (angles of diffraction): gives the spacing d between planes, i.e. the smallest distance at which two atoms can be told apart
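A small sketch of how Bragg's law turns an angle into a spacing, writing the plane spacing as d. The wavelength is an assumption (a typical copper X-ray source, about 1.54 Å), and the angles and function name are illustrative only.

```python
import math

def bragg_spacing(wavelength, theta_deg, n=1):
    """Plane spacing d from n * wavelength = 2 * d * sin(theta).

    d comes out in the same units as the wavelength.
    """
    if not 0 < theta_deg <= 90:
        raise ValueError("theta must be between 0 and 90 degrees")
    return n * wavelength / (2 * math.sin(math.radians(theta_deg)))

wavelength = 1.5418  # Å, assumed copper source (illustration only)
for theta in (5, 15, 30, 50):
    # Larger angle -> smaller d -> higher resolution
    print(theta, round(bragg_spacing(wavelength, theta), 2))
```

The spacing shrinks as the angle grows, which is why reflections far from the centre of the detector carry the high-resolution information.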
Reflections allowed by Bragg's law = planes (reciprocal lattice points) that intersect the Ewald sphere; on the detector you will only see those reflections
How X-rays are formed: when an electron transitions to a lower energy level, it loses energy equal to the difference between the energy levels of the two shells; this energy is emitted as an X-ray photon
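The energy lost in the transition fixes the wavelength of the emitted photon (λ = hc / E). A short sketch, assuming the energy gap is given in keV; the 8.05 keV value is the approximate copper K-alpha energy, used here only as an example, and the function name is hypothetical.

```python
HC_KEV_ANGSTROM = 12.39842  # Planck constant * speed of light, in keV·Å

def photon_wavelength_angstrom(energy_gap_kev):
    """Wavelength (Å) of the photon emitted for a given energy gap (keV)."""
    if energy_gap_kev <= 0:
        raise ValueError("energy gap must be positive")
    return HC_KEV_ANGSTROM / energy_gap_kev

# Approximate copper K-alpha transition energy (example value)
print(round(photon_wavelength_angstrom(8.05), 2))
```

A gap of a few keV gives a wavelength of about 1 Å, the same scale as the distances between atoms.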
Mathematical relationship between an object and its diffraction pattern.
Fourier transform
To convert the diffraction pattern to an ELECTRON DENSITY map, need to know the POSITION, INTENSITY and PHASE of as many reflections as possible
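A toy sketch of why the phase is needed, assuming a made-up one-dimensional "unit cell" with two atoms on an 8-point grid. This is not how real crystallographic software is organised; the function names and density values are hypothetical. The forward transform gives structure factors, the detector would only record their amplitudes, and the inverse transform recovers the density only when the phases are supplied.

```python
import cmath
import math

def structure_factors(density):
    """Forward Fourier transform: complex structure factor F(h) for each index h."""
    n = len(density)
    return [
        sum(density[x] * cmath.exp(2j * math.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]

def density_from(amplitudes, phases):
    """Inverse Fourier transform: density from amplitudes |F(h)| and phases."""
    n = len(amplitudes)
    return [
        sum(
            amplitudes[h] * cmath.exp(1j * phases[h])
            * cmath.exp(-2j * math.pi * h * x / n)
            for h in range(n)
        ).real / n
        for x in range(n)
    ]

# Made-up 1D density: a heavier atom at position 2, a lighter one at position 6
true_density = [0, 0, 5, 0, 0, 0, 2, 0]
F = structure_factors(true_density)
amplitudes = [abs(f) for f in F]       # measurable: square root of intensity
phases = [cmath.phase(f) for f in F]   # not measured in the experiment

with_phases = density_from(amplitudes, phases)
without_phases = density_from(amplitudes, [0.0] * len(F))  # phases set to zero

print([round(v, 2) for v in with_phases])
print([round(v, 2) for v in without_phases])
```

With the correct phases the two atoms come back at the right positions; with the phases thrown away the map no longer matches the original density.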
3 ways to derive phases
Multiple isomorphous replacement (MIR)
Multi-wavelength anomalous dispersion (MAD)
Molecular replacement (MR)
Heavy atoms scatter and absorb X-rays strongly: solve the heavy-atom structure first, then use it to estimate the phases of the protein reflections
Isomorphous replacement: Position of heavy atom in the unit cell determined directly from intensity differences. Only possible for one or two heavy atoms per asymmetric unit.
Multi-wavelength anomalous dispersion: needs a heavy (anomalously scattering) atom in the protein. Either already present, or introduced by soaking the crystal in metal, or by molecular biology tools: replace methionine with selenomethionine. Diffraction patterns are collected at more than one wavelength
In MAD, 3 data sets are measured at different wavelengths from one crystal. Intensity differences are used to calculate the positions of the anomalous atoms in the unit cell; the effect of the anomalous atoms can then be used to calculate phases
Molecular replacement uses the structure of a similar protein as a structural model: the known structure is superimposed in the unit cell in the same orientation as the unknown protein. If a protein's sequence is >30% identical: high probability of the same fold.
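A rough sketch of checking the >30% identity rule of thumb, assuming the two sequences have already been aligned (gaps written as "-"). The sequences and function name are made up, and identity is counted over non-gap positions only; other definitions use a different denominator.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over aligned positions with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

# Hypothetical aligned fragments: unknown protein vs. candidate search model
unknown = "MKTAYIAK-R"
model = "MKSAYLAKQR"
identity = percent_identity(unknown, model)
print(round(identity, 1), "usable as search model?", identity > 30)
```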
R factor: measure of the agreement between the observed and calculated structure factors in X-ray crystallography. A lower R-factor indicates a better fit between the observed and calculated data, suggesting a more accurate atomic model.
R-free factor
The R-free factor is an additional measure used to validate crystallographic models.
In the refinement process, a subset of the data (usually 5-10%) is randomly selected and excluded from the refinement. This excluded subset is referred to as the "test set."
A large discrepancy between the R-factor and R-free factor may indicate overfitting, and it is desirable to have both values similar.
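A minimal sketch of the bookkeeping behind R and R-free, assuming the usual definition R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|. All amplitudes are made-up numbers and the function names are hypothetical. In a real refinement the calculated values come from a model refined against the working set only, which is what makes R-free an independent check; here the calculated values are simply the observed ones with random noise.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("need one calculated value per observed reflection")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)

def split_test_set(n_reflections, fraction=0.05, seed=0):
    """Randomly set aside a fraction of reflection indices as the test set."""
    rng = random.Random(seed)
    n_test = max(1, round(n_reflections * fraction))
    test = set(rng.sample(range(n_reflections), n_test))
    work = [i for i in range(n_reflections) if i not in test]
    return work, sorted(test)

# Made-up observed amplitudes and a "model" that is off by up to 10%
noise = random.Random(1)
f_obs = [float(10 + 5 * i) for i in range(20)]
f_calc = [o * (1 + noise.uniform(-0.1, 0.1)) for o in f_obs]

work, test = split_test_set(len(f_obs), fraction=0.10)
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in test], [f_calc[i] for i in test])
print(round(r_work, 3), round(r_free, 3))
```

The test-set reflections are never used to adjust the model, so an R-free much higher than R suggests the model has been fitted to noise in the working set.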
Difference map shows where atoms are ‘missing’ from the model (maxima; positive electron density; green)
Also shows if the atom was modelled in the wrong place/not where it should be (minima; negative electron density)
6 Å: can define secondary structure
3 Å: can define main chain of polypeptide
2.5 Å: can identify sidechains
1.5 Å: can resolve individual atoms
<1.0 Å: some hydrogen atoms visible
Assess quality of model: does the structural model pass the quality checks against the diffraction data?
Validate model: is the structure right in the context of the cell and its function?