Protein structure characterization

Comparative modeling:

Proteins are chains of amino acid residues joined in a specific order, as determined by the corresponding genetic code in the DNA; the chains vary in length from about 20 residues in small peptides to thousands of residues in large proteins.
The function of a protein in a living organism depends on its 3-D conformation.
The 3-D structure is dictated mainly by the backbone dihedral angles φ (phi) and ψ (psi), which are defined by atoms of adjacent amino acid residues.
Predicting the native 3-D form of a protein is very challenging because, in principle, it requires scanning every possible φ and ψ value for all amino acids and comparing the energetics of each resulting conformation.
Although determining the correct values of φ and ψ (i.e., the minimum-energy conformation) by such an exhaustive search is practically impossible, it is worth noting that only 20 standard amino acids are common to all proteins.
Studies have shown that there are sequence-dependent patterns among protein structures; for example, each amino acid prefers certain φ/ψ values.
Because φ and ψ involve adjacent amino acids, there are also preferred values of these angles as a function of amino acid pairs.
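To make the angle definition concrete, the following is a minimal sketch of how a dihedral angle can be computed from four atomic positions. By the usual convention, φ of residue i is the dihedral C(i-1), N(i), Cα(i), C(i), and ψ is N(i), Cα(i), C(i), N(i+1), which is why both angles involve a neighboring residue. The function name `dihedral` and the toy coordinates are illustrative assumptions, not part of the method described on this page.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3-D points.

    Hypothetical helper for illustration; each point is an (x, y, z) tuple.
    """
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)          # points from the central bond back to p0
    b1 = sub(p2, p1)          # central bond p1 -> p2
    b2 = sub(p3, p2)          # points from the central bond on to p3

    norm = math.sqrt(dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)

    # Remove the components along the central bond, leaving the two
    # vectors whose in-plane angle is the dihedral.
    k0 = dot(b0, b1)
    k2 = dot(b2, b1)
    v = (b0[0] - k0 * b1[0], b0[1] - k0 * b1[1], b0[2] - k0 * b1[2])
    w = (b2[0] - k2 * b1[0], b2[1] - k2 * b1[1], b2[2] - k2 * b1[2])

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (not real atoms): the first and last points lie on
# opposite sides of the central bond, i.e. a trans arrangement.
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
```

In practice the four points would be backbone atom coordinates read from a PDB entry.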
- Our method scans the Protein Data Bank (PDB), which hosts 3-D coordinates of over 40,000 distinct proteins (based on 90% non-redundancy), and extracts the ψ/φ and φ/ψ distributions for each amino acid (20 distributions) and each amino acid pair (20 × 20 = 400 distributions), respectively.
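The bookkeeping behind this step might look like the sketch below, which groups angle observations by amino acid and by adjacent amino acid pair. It is an illustration under stated assumptions, not the actual implementation: the input format (one list of `(residue_name, phi, psi)` tuples per chain, with `None` for angles undefined at chain ends), the function name `collect_distributions`, and the choice to pair ψ of one residue with φ of the next are all assumptions made for the example.

```python
from collections import defaultdict

def collect_distributions(chains):
    """Group backbone angles by amino acid and by adjacent amino acid pair.

    `chains` is an assumed input format: a list of chains, each a list of
    (residue_name, phi, psi) tuples, with None where an angle is undefined.
    """
    single = defaultdict(list)   # residue name -> list of (phi, psi)
    pair = defaultdict(list)     # (name_i, name_i+1) -> list of (psi_i, phi_i+1)

    for chain in chains:
        for i, (aa, phi, psi) in enumerate(chain):
            if phi is not None and psi is not None:
                single[aa].append((phi, psi))
            if i + 1 < len(chain):
                next_aa, next_phi, _ = chain[i + 1]
                # Assumption: the pair distribution uses the two angles
                # that flank the peptide bond between the residues.
                if psi is not None and next_phi is not None:
                    pair[(aa, next_aa)].append((psi, next_phi))
    return single, pair

# Toy chain with made-up angles, for illustration only.
toy = [[("ALA", None, 150.0), ("GLY", -60.0, -45.0), ("ALA", -120.0, None)]]
single, pair = collect_distributions(toy)
```

With the 20 standard amino acids, `single` holds at most 20 keys and `pair` at most 400, matching the counts quoted above.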

- It then converts these distributions into a score: the distance from the median, expressed as a number of standard deviations.
- For the predicted structure of Staphylococcal nuclease (inset of the D2 score above), an enzyme with a significant role in digesting DNA, our method finds that it is less than one standard deviation away from the median of all the protein structures in the PDB.
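The idea of scoring a value by its distance from the median in units of standard deviation can be illustrated with a single angle. The sketch below is a simplified, one-dimensional illustration and not the D2 definition itself; the names `angular_diff` and `deviation_score` are hypothetical, and the median and standard deviation are computed naively, so the reference angles are assumed not to straddle the ±180° wrap-around.

```python
import statistics

def angular_diff(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def deviation_score(value, reference):
    """Distance of `value` from the reference median, in standard deviations.

    Hypothetical helper: `reference` is a list of angles (degrees) observed
    for one amino acid; statistics are computed without circular corrections.
    """
    median = statistics.median(reference)
    spread = statistics.pstdev(reference)
    if spread == 0:
        raise ValueError("reference distribution has zero spread")
    return abs(angular_diff(value, median)) / spread

# Made-up reference angles, for illustration only.
reference_phi = [-70.0, -65.0, -60.0, -55.0, -50.0]
score = deviation_score(-50.0, reference_phi)
```

A score below 1 means the value lies within one standard deviation of the median of the reference set.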

- Although D2 is useful for evaluating the likelihood of a given protein structure as a whole, it does not provide amino-acid-level information.
- Therefore, we have developed an amino-acid-level description of D2 that shows the likelihood of each amino acid in the test protein.

- This is a visual description in the form of a color-coded strip that identifies the likely amino acids (green) and the unlikely amino acids (red and blue).
- Such a color strip indicates the areas of the protein structure that need further refinement, which can make it a valuable tool in protein structure prediction.
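A color strip of this kind could be produced by thresholding a per-residue score, as in the sketch below. The text does not say what separates red from blue or where the cutoffs lie, so the example assumes a signed deviation (red above the threshold, blue below its negative) and a hypothetical cutoff of one standard deviation; `strip_colors` is an illustrative name.

```python
def strip_colors(signed_scores, threshold=1.0):
    """Map signed per-residue deviations to strip colors.

    Assumptions for illustration: scores are in standard deviations,
    green marks likely residues, and red/blue mark unlikely residues on
    the high/low side of the median respectively.
    """
    colors = []
    for score in signed_scores:
        if score > threshold:
            colors.append("red")
        elif score < -threshold:
            colors.append("blue")
        else:
            colors.append("green")
    return colors

# Made-up per-residue scores, for illustration only.
colors = strip_colors([0.2, -1.5, 2.0, 1.0])
```

Runs of non-green entries in the returned list would then point to the regions most in need of refinement.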
