-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Gromov-Wasserstein with asymmetric cost matrices #399
[WIP] Gromov-Wasserstein with asymmetric cost matrices #399
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #399 +/- ##
==========================================
- Coverage 93.97% 93.94% -0.04%
==========================================
Files 22 22
Lines 5927 5930 +3
==========================================
+ Hits 5570 5571 +1
- Misses 357 359 +2 |
Hello decarpentierg. Best regards, |
Hello Cédric, |
Thanks again @decarpentierg for the PR. As discussed above the PR #431 of @cedricvincentcuaz is advancing and contains a working imlementtaion with the non symmetric marices. I'm closing this PR now. |
Types of changes
Motivation and context / Related issue
In [12] "Gromov-Wasserstein Averaging of Kernel and Distance Matrices", the authors make no explicit assumption about the symmetry of the cost mastrices C and C'. However, their formula (9) for the computation of the gradient misses at least a 2 factor in the case of symmetric matrices, and even a second term in the case of asymmetric matrices, as explained in subsection 5.1 of "Multilingual Alignment of Word Embedding Spaces", Dan Meller and Gonzague de Carpentier, 2021 (https://www.researchgate.net/publication/357505092_Multilingual_alignment_of_word_embedding_spaces). In the code of ot/gromov.py, the 2 factor has been corrected, but it is not sufficient to support asymmetric cost matrices, which can be useful in the case of directed graphs for example. This contribution corrects the expression of the gradient in ot.gromov.gwggrad to support asymmetric distance matrices, by adding the appropriate term instead of just multiplying by 2.
How has this been tested (if it applies)
PR checklist