FGW_matrices

This module provides methods that generate the difference dictionaries and data lists consumed by the fused Gromov-Wasserstein (FGW) methods GWProt.GW_protein.run_FGW_dict and GWProt.GW_protein.run_FGW_data_lists. The resulting difference matrices incorporate biochemical features into FGW alignments and into the computation of local geometric distortion (LGD) at the residue level.


Hydrophobicity-Based Differences

These methods generate difference matrices from the hydrophobicity values of Eisenberg, Schwarz, Komaromy, and Wall (1984). Used as the feature differences in FGW, they allow LGD to reflect hydrophobicity differences between residues.
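As an illustration of the idea, the sketch below builds a hydrophobicity difference dictionary and matrix by taking the absolute difference of per-residue values. It is a minimal sketch, not GWProt's implementation: the function names and the dictionary layout (keys are residue pairs) are assumptions, and the five scale values are a subset of the Eisenberg consensus scale that should be checked against the original paper before use.

```python
import numpy as np

# Assumed subset of the Eisenberg et al. (1984) consensus hydrophobicity scale.
# Verify against the published table before relying on these numbers.
EISENBERG_SUBSET = {"I": 1.38, "L": 1.06, "A": 0.62, "K": -1.50, "R": -2.53}


def hydrophobicity_difference_dict(scale):
    """Map every ordered residue pair (a, b) to |h_a - h_b| (hypothetical layout)."""
    return {
        (a, b): abs(h_a - h_b)
        for a, h_a in scale.items()
        for b, h_b in scale.items()
    }


def hydrophobicity_difference_matrix(scale):
    """Return the residue order and the matching square difference matrix."""
    residues = list(scale)
    values = np.array([scale[r] for r in residues])
    # Broadcasting a column against a row yields all pairwise differences.
    return residues, np.abs(values[:, None] - values[None, :])


diff_dict = hydrophobicity_difference_dict(EISENBERG_SUBSET)
residues, diff_matrix = hydrophobicity_difference_matrix(EISENBERG_SUBSET)
```

Because each entry is the absolute difference of two scalars, the result is symmetric, has a zero diagonal, and satisfies the triangle inequality without any further normalization.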


BLOSUM-Based Differences

This method generates difference matrices based on the BLOSUM substitution matrices. For a pair of amino acids with BLOSUM entry \(b\), it computes \(e^{-b}\), then normalizes the result to ensure that the distances satisfy the triangle inequality. These matrices can be used in FGW to compute LGD based on sequence similarity.
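The following sketch shows why a normalization step is needed after computing \(e^{-b}\). The description above does not specify how GWProt normalizes, so the repair used here, a shortest-path metric closure (Floyd-Warshall), is a swapped-in technique chosen only for illustration. The function names are hypothetical, and the four-residue block is assumed to match BLOSUM62; check it against your own copy of the matrix.

```python
import numpy as np

RESIDUES = ["A", "R", "N", "D"]
# Assumed BLOSUM62 entries for four residues; verify against the full table.
B = np.array([
    [ 4, -1, -2, -2],
    [-1,  5,  0, -2],
    [-2,  0,  6,  1],
    [-2, -2,  1,  6],
])


def raw_differences(blosum):
    """Compute e^{-b} for each pair and force a zero diagonal."""
    d = np.exp(-blosum.astype(float))
    np.fill_diagonal(d, 0.0)
    return d


def satisfies_triangle(d, tol=1e-12):
    """Check d[i, j] <= d[i, k] + d[k, j] for every triple (i, k, j)."""
    return bool(np.all(d[:, None, :] <= d[:, :, None] + d[None, :, :] + tol))


def metric_closure(d):
    """Replace each entry by the shortest-path distance (Floyd-Warshall).

    This is one possible repair, not necessarily the one GWProt applies.
    """
    out = d.copy()
    for k in range(len(out)):
        out = np.minimum(out, out[:, k:k + 1] + out[k:k + 1, :])
    return out


raw = raw_differences(B)
closed = metric_closure(raw)
```

On this block the raw values violate the triangle inequality (the direct A to N difference exceeds the sum of A to R and R to N), while the closed matrix satisfies it.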


Grantham-Based Differences

This method generates difference matrices using the Grantham difference scores, which reflect physicochemical differences between amino acids. These can be used in FGW to compute LGD based on these properties.
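To show how a pairwise score table becomes a difference matrix between two proteins, the sketch below looks up a difference for every residue pair drawn from two sequences. It is illustrative only: the helper names and the sorted-tuple key layout are assumptions rather than GWProt's data format, and the three scores are assumed Grantham entries that should be verified against the published table.

```python
import numpy as np

# Assumed entries from Grantham (1974); verify before use.
# Keys are alphabetically sorted residue pairs (a hypothetical layout).
GRANTHAM_SUBSET = {("I", "L"): 5, ("I", "V"): 29, ("L", "V"): 32}


def pair_difference(a, b, table):
    """Look up a symmetric difference score; identical residues differ by 0."""
    if a == b:
        return 0.0
    return float(table[tuple(sorted((a, b)))])


def feature_cost_matrix(seq1, seq2, table):
    """Entry (i, j) is the difference between seq1[i] and seq2[j]."""
    cost = np.empty((len(seq1), len(seq2)))
    for i, a in enumerate(seq1):
        for j, b in enumerate(seq2):
            cost[i, j] = pair_difference(a, b, table)
    return cost


cost = feature_cost_matrix("LIV", "VL", GRANTHAM_SUBSET)
```

The matrix has one row per residue of the first sequence and one column per residue of the second, which is the shape a feature cost takes in a fused Gromov-Wasserstein problem between the two proteins.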


Isoelectric Point Differences

These methods generate difference matrices using the isoelectric point values from Solomons, Fryhle, and Snyder (2016). For more advanced handling of isoelectric points, use the GWProt.GW_pI module, which offers greater functionality for FGW and LGD computations.

References

  • Solomons, T.W.G.; Fryhle, C.B.; and Snyder, S.A. (2016) Organic Chemistry, 12th Ed. Wiley.

  • Grantham, R. (1974) Amino Acid Difference Formula to Help Explain Protein Evolution. Science, New Series, Vol. 185, No. 4154. 862-864.

  • Eisenberg, D.; Schwarz, E.; Komaromy, M.; and Wall, R. (1984) Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J Mol Biol. Oct 15;179(1):125-42.