Unbalanced Gromov-Wasserstein Correspondences
Overview
Unbalanced Gromov-Wasserstein (UGW) [1] is a variant of the Gromov-Wasserstein distance that permits some of the mass to be discarded rather than transported between two proteins. This makes it more robust when comparing proteins of different lengths or proteins that differ by indel mutations. However, it does not define a distance metric.
Mathematical Formulation
The unbalanced Gromov-Wasserstein distance between proteins \(X\) and \(Y\) is defined as:
Here \(KL\) denotes the Kullback-Leibler divergence, \(KL^\otimes(\mu|\nu)\) denotes \(KL(\mu \otimes \mu|\nu \otimes \nu)\), \(\mu_1, \mu_2\) are the probability distributions on \(X\) and \(Y\) (in this case uniform), and \(\pi_1(T), \pi_2(T)\) are the marginals of \(T\), i.e. its projections onto \(X\) and \(Y\). Note that \(T\) is non-negative but is not subject to the strict marginal constraints of GW and FGW; instead, the marginal constraints are enforced only weakly, through the Kullback-Leibler penalty terms. The parameter \(\rho\) weights these penalties: a higher \(\rho\) means more mass is preserved, and the best value may vary depending on the proteins. In line with the Python OT package, \(\alpha\) has different conventions.
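To make the roles of \(T\), \(\rho\) and the two penalty terms concrete, the sketch below evaluates a UGW-style objective for a given coupling \(T\). It assumes a squared loss on the difference of intra-protein distances, so that the objective reads \(\sum_{i,j,k,l} |d_X(x_i, x_k) - d_Y(y_j, y_l)|^2 T_{ij} T_{kl} + \rho KL^\otimes(\pi_1(T)|\mu_1) + \rho KL^\otimes(\pi_2(T)|\mu_2)\), and it uses the generalized Kullback-Leibler divergence for non-normalized measures. The notation \(d_X, d_Y\) and the function names are illustrative assumptions, not part of any package API. The code only evaluates the objective; it does not solve for the optimal \(T\).

```python
import math


def generalized_kl(p, q):
    """KL(p|q) = sum p*log(p/q) - sum p + sum q, with the convention 0*log(0) = 0.

    p may have any non-negative total mass; q must be strictly positive.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            total += pi * math.log(pi / qi)
    return total - sum(p) + sum(q)


def kl_tensor(p, q):
    """KL of the product measures: KL(p (x) p | q (x) q)."""
    pp = [a * b for a in p for b in p]
    qq = [a * b for a in q for b in q]
    return generalized_kl(pp, qq)


def ugw_objective(dx, dy, T, mu1, mu2, rho):
    """Evaluate the assumed UGW objective for a fixed non-negative coupling T.

    dx, dy : square distance matrices (lists of lists) for the two proteins
    T      : len(mu1) x len(mu2) coupling, not required to match the marginals
    rho    : weight of the two KL marginal penalties
    """
    n, m = len(mu1), len(mu2)
    distortion = 0.0
    for i in range(n):
        for j in range(m):
            if T[i][j] == 0:
                continue  # pairs carrying no mass contribute nothing
            for k in range(n):
                for l in range(m):
                    diff = dx[i][k] - dy[j][l]
                    distortion += diff * diff * T[i][j] * T[k][l]
    row_marginal = [sum(T[i]) for i in range(n)]                       # pi_1(T)
    col_marginal = [sum(T[i][j] for i in range(n)) for j in range(m)]  # pi_2(T)
    penalty = kl_tensor(row_marginal, mu1) + kl_tensor(col_marginal, mu2)
    return distortion + rho * penalty


# Hypothetical toy data: a 3-residue and a 2-residue "protein" with uniform weights.
dx = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
dy = [[0.0, 1.0], [1.0, 0.0]]
mu1 = [1.0 / 3] * 3
mu2 = [1.0 / 2] * 2
# Match residues 0->0 and 1->1 and leave residue 2 unmatched (its mass is discarded).
T = [[1.0 / 3, 0.0], [0.0, 1.0 / 3], [0.0, 0.0]]
low_rho = ugw_objective(dx, dy, T, mu1, mu2, rho=0.1)
high_rho = ugw_objective(dx, dy, T, mu1, mu2, rho=10.0)
```

In this toy coupling the matched residues have identical pairwise distances, so the whole cost comes from the marginal penalties, and the same partial matching becomes more expensive as \(\rho\) grows. This is the sense in which a higher \(\rho\) pushes the optimum toward preserving mass.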
We similarly have fused unbalanced Gromov-Wasserstein distance defined as:
Where \(\delta(x, y)\) and \(\alpha\) are as in FGW.
References
[1] Séjourné, T., Vialard, F.-X., and Peyré, G. (2021). The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation. Neural Information Processing Systems, 35.