Distortion Scaling

Overview

When comparing the shapes of molecules, we are generally more interested in substructures that are geometrically close to one another, as these are more likely to interact chemically. This principle underlies the template-modelling (TM) score and TM-align.

To implement this idea in GWProt, we apply a scaling function to every entry of each intra-protein distance matrix, so that the GW computation gives greater weight to nearby residues. This generally improves alignment accuracy. The scaling itself is a single elementwise pass over each matrix, so it has no noticeable effect on runtime.
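To see where the scaling enters, here is a sketch of the objective, assuming the standard square-loss form of GW. The symbols \(d_X\), \(d_Y\), \(T\), and \(f\) are notation introduced here for illustration and are not taken from GWProt's code. For proteins \(X\) and \(Y\) with intra-protein distances \(d_X\) and \(d_Y\), and a coupling \(T\) between their residues, the scaled problem minimizes

\[
\sum_{i,k} \sum_{j,l} \bigl| f(d_X(x_i, x_k)) - f(d_Y(y_j, y_l)) \bigr|^2 \, T_{ij} \, T_{kl}
\]

over all couplings \(T\), where \(f\) is the scaling function. If \(f\) grows more slowly at large distances, a mismatch of a given size between two long distances contributes less to this sum than the same mismatch between two short distances.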

Scaling Function

We choose a scaling function \(f\) such that \(f(0) = 0\), \(f\) is strictly increasing, and \(f\) is concave. The square root function, \(f(x) = \sqrt{x}\), works well in practice and is the default. Before running GW, we apply \(f\) to every entry of each protein's intra-protein distance matrix. Proteins that are aligned with one another must all be scaled with the same \(f\).
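The scaling step can be sketched in a few lines of plain Python. This is a minimal illustration and not GWProt's implementation: the helper names `pairwise_distances` and `scale_matrix`, the coordinates, and the choice of angstroms as the unit are all assumptions made for the example.

```python
import math


def pairwise_distances(coords):
    """Return the Euclidean distance matrix for a list of 3D points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def scale_matrix(dist, f=math.sqrt):
    """Apply the scaling function f to every entry of a distance matrix."""
    return [[f(d) for d in row] for row in dist]


# Hypothetical coordinates (assumed to be in angstroms) for three residues.
coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (16.0, 0.0, 0.0)]

raw = pairwise_distances(coords)   # off-diagonal distances: 4, 12, 16
scaled = scale_matrix(raw)         # square root of each entry

# A concave f compresses mismatches between long distances more than
# mismatches between short ones.
short_gap = math.sqrt(5.0) - math.sqrt(4.0)      # about 0.236
long_gap = math.sqrt(101.0) - math.sqrt(100.0)   # about 0.050
```

The scaled matrix stays symmetric with a zero diagonal because \(f(0) = 0\). The same one-unit mismatch counts for roughly five times less at a distance of 100 than at a distance of 4.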

Note

Distortion scaling can be used with FGW in the same way as with GW. In that case, we recommend a larger value of alpha than you would use with unscaled distance matrices.
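One plausible reading of this recommendation can be sketched as follows. It assumes the FGW convention used by the POT library, in which alpha weights the structural term. GWProt's own convention is not stated here and should be checked. The symbols \(M\), \(T\), \(d_X\), \(d_Y\), and \(f\) are illustrative notation. With a feature cost matrix \(M\), a coupling \(T\), and intra-protein distances \(d_X\) and \(d_Y\) scaled by \(f\), the FGW objective is

\[
(1 - \alpha) \sum_{i,j} M_{ij} \, T_{ij}
\; + \;
\alpha \sum_{i,k} \sum_{j,l} \bigl| f(d_X(x_i, x_k)) - f(d_Y(y_j, y_l)) \bigr|^2 \, T_{ij} \, T_{kl}
\]

With square-root scaling, every distance greater than 1 shrinks, so the second sum is typically smaller than it would be with raw distances, while the feature term is unchanged. Under this convention, raising \(\alpha\) restores some of the weight of the structural term relative to the feature term.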

References

  • Zhang, Y., & Skolnick, J. (2004). Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics, 57(4), 702-710. (TM-score paper)

  • TM-align: https://zhanggroup.org/TM-align/