Sparse Learning

Efficient Approximation of Gromov-Wasserstein Distance Using Importance Sparsification

Pages 1512-1523 | Received 28 Sep 2022, Accepted 29 Dec 2022, Published online: 15 Feb 2023

References

  • Alaux, J., Grave, E., Cuturi, M., and Joulin, A. (2019), “Unsupervised Hyper-Alignment for Multilingual Word Embeddings,” in 7th International Conference on Learning Representations, New Orleans, LA, USA.
  • Alvarez-Melis, D., and Jaakkola, T. (2018), “Gromov-Wasserstein Alignment of Word Embedding Spaces,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 1881–1890, Brussels, Belgium. ACL.
  • Benamou, J.-D., Brenier, Y., and Guittet, K. (2002), “The Monge–Kantorovitch Mass Transfer and its Computational Fluid Mechanics Formulation,” International Journal for Numerical Methods in Fluids, 40, 21–30.
  • Blumberg, A. J., Carriere, M., Mandell, M. A., Rabadan, R., and Villar, S. (2020), “MREC: A Fast and Versatile Framework for Aligning and Matching Point Clouds with Applications to Single Cell Molecular Data,” arXiv preprint arXiv:2001.01666.
  • Bonneel, N., Rabin, J., Peyré, G., and Pfister, H. (2015), “Sliced and Radon Wasserstein Barycenters of Measures,” Journal of Mathematical Imaging and Vision, 51, 22–45.
  • Bonneel, N., van de Panne, M., Paris, S., and Heidrich, W. (2011), “Displacement Interpolation Using Lagrangian Mass Transport,” ACM Transactions on Graphics, 30, 1–12.
  • Brenier, Y. (1997), “A Homogenized Model for Vortex Sheets,” Archive for Rational Mechanics and Analysis, 138, 319–353.
  • Brogat-Motte, L., Flamary, R., Brouard, C., Rousu, J., and d’Alché Buc, F. (2022), “Learning to Predict Graphs with Fused Gromov-Wasserstein Barycenters,” in International Conference on Machine Learning, pp. 2321–2335. PMLR.
  • Bunne, C., Alvarez-Melis, D., Krause, A., and Jegelka, S. (2019), “Learning Generative Models Across Incomparable Spaces,” in International Conference on Machine Learning, pp. 851–861. PMLR.
  • Chapel, L., Alaya, M. Z., and Gasso, G. (2020), “Partial Optimal Transport with Applications on Positive-Unlabeled Learning,” in Advances in Neural Information Processing Systems (Vol. 33), pp. 2903–2913.
  • Chizat, L., Peyré, G., Schmitzer, B., and Vialard, F.-X. (2018a), “An Interpolating Distance between Optimal Transport and Fisher-Rao Metrics,” Foundations of Computational Mathematics, 18, 1–44.
  • Chizat, L., Peyré, G., Schmitzer, B., and Vialard, F.-X. (2018b), “Scaling Algorithms for Unbalanced Optimal Transport Problems,” Mathematics of Computation, 87, 2563–2609.
  • Chizat, L., Peyré, G., Schmitzer, B., and Vialard, F.-X. (2018c), “Unbalanced Optimal Transport: Dynamic and Kantorovich Formulations,” Journal of Functional Analysis, 274, 3090–3123.
  • Chowdhury, S., and Mémoli, F. (2019), “The Gromov-Wasserstein Distance between Networks and Stable Network Invariants,” Information and Inference: A Journal of the IMA, 8, 757–787.
  • Chowdhury, S., Miller, D., and Needham, T. (2021), “Quantized Gromov-Wasserstein,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 811–827, Springer.
  • Chowdhury, S., and Needham, T. (2021), “Generalized Spectral Clustering via Gromov-Wasserstein Learning,” in Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (Vol. 130), pp. 712–720. PMLR.
  • Cuturi, M. (2013), “Sinkhorn Distances: Lightspeed Computation of Optimal Transport,” in Advances in Neural Information Processing Systems (Vol. 26), pp. 2292–2300.
  • Deshpande, I., Hu, Y.-T., Sun, R., Pyrros, A., Siddiqui, N., Koyejo, S., Zhao, Z., Forsyth, D., and Schwing, A. G. (2019), “Max-Sliced Wasserstein Distance and its Use for GANs,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10648–10656. IEEE.
  • Ezuz, D., Solomon, J., Kim, V. G., and Ben-Chen, M. (2017), “GWCNN: A Metric Alignment Layer for Deep Shape Analysis,” Computer Graphics Forum, 36, 49–57.
  • Feragen, A., Kasenburg, N., Petersen, J., de Bruijne, M., and Borgwardt, K. (2013), “Scalable Kernels for Graphs with Continuous Attributes,” in Advances in Neural Information Processing Systems (Vol. 26), pp. 216–224.
  • Fey, M., and Lenssen, J. E. (2019), “Fast Graph Representation Learning with PyTorch Geometric,” in ICLR Workshop on Representation Learning on Graphs and Manifolds.
  • Genevay, A., Chizat, L., Bach, F., Cuturi, M., and Peyré, G. (2019), “Sample Complexity of Sinkhorn Divergences,” in 22nd International Conference on Artificial Intelligence and Statistics, pp. 1574–1583. PMLR.
  • Gong, F., Nie, Y., and Xu, H. (2022), “Gromov-Wasserstein Multi-Modal Alignment and Clustering,” in Proceedings of the 31st ACM International Conference on Information & Knowledge Management, pp. 603–613.
  • Hagberg, A. A., Schult, D. A., and Swart, P. J. (2008), “Exploring Network Structure, Dynamics, and Function Using NetworkX,” in Proceedings of the 7th Python in Science Conference, pp. 11–15, Pasadena, CA, USA.
  • Kantorovich, L. (1942), “On the Transfer of Masses,” Doklady Akademii Nauk, 37, 227–229 (in Russian).
  • Kawano, S., and Mason, J. K. (2021), “Classification of Atomic Environments via the Gromov-Wasserstein Distance,” Computational Materials Science, 188, 110144.
  • Kerdoncuff, T., Emonet, R., and Sebban, M. (2021), “Sampled Gromov Wasserstein,” Machine Learning, 110, 2151–2186.
  • Kriege, N. M., Fey, M., Fisseler, D., Mutzel, P., and Weichert, F. (2018), “Recognizing Cuneiform Signs Using Graph-based Methods,” in International Workshop on Cost-Sensitive Learning, pp. 31–44. PMLR.
  • Le, T., Ho, N., and Yamada, M. (2021), “Flow-based Alignment Approaches for Probability Measures in Different Spaces,” in International Conference on Artificial Intelligence and Statistics, pp. 3934–3942. PMLR.
  • Li, T., Meng, C., Yu, J., and Xu, H. (2022), “Hilbert Curve Projection Distance for Distribution Comparison,” arXiv preprint arXiv:2205.15059.
  • Liao, Q., Chen, J., Wang, Z., Bai, B., Jin, S., and Wu, H. (2022a), “Fast Sinkhorn I: An O(N) Algorithm for the Wasserstein-1 Metric,” arXiv preprint arXiv:2202.10042.
  • Liao, Q., Wang, Z., Chen, J., Bai, B., Jin, S., and Wu, H. (2022b), “Fast Sinkhorn II: Collinear Triangular Matrix and Linear Time Accurate Computation of Optimal Transport,” arXiv preprint arXiv:2206.09049.
  • Liero, M., Mielke, A., and Savaré, G. (2016), “Optimal Transport in Competition with Reaction: The Hellinger-Kantorovich Distance and Geodesic Curves,” SIAM Journal on Mathematical Analysis, 48, 2869–2911.
  • Lin, T., Ho, N., and Jordan, M. I. (2019), “On the Acceleration of the Sinkhorn and Greenkhorn Algorithms for Optimal Transport,” arXiv preprint arXiv:1906.01437.
  • Liu, J. S. (1996), “Metropolized Independent Sampling with Comparisons to Rejection Sampling and Importance Sampling,” Statistics and Computing, 6, 113–119.
  • Liu, J. S. (2008), Monte Carlo Strategies in Scientific Computing, New York: Springer.
  • Luo, D., Wang, Y., Yue, A., and Xu, H. (2022), “Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance,” in Proceedings of the 30th ACM International Conference on Multimedia, pp. 728–739.
  • Ma, P., Mahoney, M. W., and Yu, B. (2015), “A Statistical Perspective on Algorithmic Leveraging,” The Journal of Machine Learning Research, 16, 861–911.
  • Mémoli, F. (2011), “Gromov-Wasserstein Distances and the Metric Approach to Object Matching,” Foundations of Computational Mathematics, 11, 417–487.
  • Meng, C., Ke, Y., Zhang, J., Zhang, M., Zhong, W., and Ma, P. (2019), “Large-Scale Optimal Transport Map Estimation Using Projection Pursuit,” in Advances in Neural Information Processing Systems (Vol. 32), pp. 8118–8129.
  • Muzellec, B., Josse, J., Boyer, C., and Cuturi, M. (2020), “Missing Data Imputation Using Optimal Transport,” in International Conference on Machine Learning, pp. 7130–7140. PMLR.
  • Nadjahi, K. (2021), “Sliced-Wasserstein Distance for Large-Scale Machine Learning: Theory, Methodology and Extensions,” PhD thesis, Institut polytechnique de Paris.
  • Neumann, M., Moreno, P., Antanas, L., Garnett, R., and Kersting, K. (2013), “Graph Kernels for Object Category Prediction in Task-Dependent Robot Grasping,” in Online Proceedings of the Eleventh Workshop on Mining and Learning with Graphs, pp. 10–6, Chicago, Illinois, USA. ACM.
  • Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. (2011), “Scikit-Learn: Machine Learning in Python,” The Journal of Machine Learning Research, 12, 2825–2830.
  • Peyré, G., and Cuturi, M. (2019), “Computational Optimal Transport: With Applications to Data Science,” Foundations and Trends® in Machine Learning, 11, 355–607.
  • Peyré, G., Cuturi, M., and Solomon, J. (2016), “Gromov-Wasserstein Averaging of Kernel and Distance Matrices,” in International Conference on Machine Learning, pp. 2664–2672. PMLR.
  • Pham, K., Le, K., Ho, N., Pham, T., and Bui, H. (2020), “On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm,” in International Conference on Machine Learning, pp. 7673–7682. PMLR.
  • Rand, W. M. (1971), “Objective Criteria for the Evaluation of Clustering Methods,” Journal of the American Statistical Association, 66, 846–850.
  • Reddi, S. J., Sra, S., Póczos, B., and Smola, A. (2016), “Stochastic Frank-Wolfe Methods for Nonconvex Optimization,” in 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 1244–1251, IEEE.
  • Sato, R., Cuturi, M., Yamada, M., and Kashima, H. (2020), “Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces,” arXiv preprint arXiv:2002.01615.
  • Scetbon, M., and Cuturi, M. (2020), “Linear Time Sinkhorn Divergences Using Positive Features,” in Advances in Neural Information Processing Systems (Vol. 33), pp. 13468–13480.
  • Scetbon, M., Peyré, G., and Cuturi, M. (2022), “Linear-Time Gromov Wasserstein Distances Using Low Rank Couplings and Costs,” in International Conference on Machine Learning, pp. 19347–19365. PMLR.
  • Séjourné, T., Vialard, F.-X., and Peyré, G. (2021), “The Unbalanced Gromov Wasserstein Distance: Conic Formulation and Relaxation,” in Advances in Neural Information Processing Systems (Vol. 34), pp. 8766–8779.
  • Sinkhorn, R., and Knopp, P. (1967), “Concerning Nonnegative Matrices and Doubly Stochastic Matrices,” Pacific Journal of Mathematics, 21, 343–348.
  • Solomon, J., Peyré, G., Kim, V. G., and Sra, S. (2016), “Entropic Metric Alignment for Correspondence Problems,” ACM Transactions on Graphics, 35, 1–13.
  • Sturm, K.-T. (2006), “On the Geometry of Metric Measure Spaces,” Acta Mathematica, 196, 65–131.
  • Sutherland, J. J., O’Brien, L. A., and Weaver, D. F. (2003), “Spline-fitting with a Genetic Algorithm: A Method for Developing Classification Structure-activity Relationships,” Journal of Chemical Information and Computer Sciences, 43, 1906–1915. DOI: 10.1021/ci034143r.
  • Titouan, V., Courty, N., Tavenard, R., and Flamary, R. (2019a), “Optimal Transport for Structured Data with Application on Graphs,” in International Conference on Machine Learning, pp. 6275–6284. PMLR.
  • Titouan, V., Flamary, R., Courty, N., Tavenard, R., and Chapel, L. (2019b), “Sliced Gromov-Wasserstein,” in Advances in Neural Information Processing Systems (Vol. 32), pp. 14753–14763.
  • Vayer, T., Chapel, L., Flamary, R., Tavenard, R., and Courty, N. (2020), “Fused Gromov-Wasserstein Distance for Structured Objects,” Algorithms, 13, 212–244.
  • Villani, C. (2009), Optimal Transport: Old and New (Vol. 338), Berlin: Springer.
  • Vincent-Cuaz, C., Flamary, R., Corneli, M., Vayer, T., and Courty, N. (2022), “Semi-relaxed Gromov-Wasserstein Divergence with Applications on Graphs,” in 10th International Conference on Learning Representations.
  • Xie, Y., Wang, X., Wang, R., and Zha, H. (2020), “A Fast Proximal Point Method for Computing Exact Wasserstein Distance,” in Uncertainty in Artificial Intelligence, pp. 433–453. PMLR.
  • Xu, H., Liu, J., Luo, D., and Carin, L. (2023), “Representing Graphs via Gromov-Wasserstein Factorization,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, 999–1016.
  • Xu, H., Luo, D., and Carin, L. (2019), “Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching,” in Advances in Neural Information Processing Systems (Vol. 32), pp. 3052–3062.
  • Xu, H., Luo, D., Zha, H., and Carin, L. (2019), “Gromov-Wasserstein Learning for Graph Matching and Node Embedding,” in International Conference on Machine Learning, pp. 6932–6941. PMLR.
  • Yan, Y., Li, W., Wu, H., Min, H., Tan, M., and Wu, Q. (2018), “Semi-supervised Optimal Transport for Heterogeneous Domain Adaptation,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, pp. 2969–2975.
  • Yanardag, P., and Vishwanathan, S. (2015), “Deep Graph Kernels,” in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1365–1374.
  • Yu, J., Wang, H., Ai, M., and Zhang, H. (2022), “Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators with Massive Data,” Journal of the American Statistical Association, 117, 265–276.
  • Zhang, J., Ma, P., Zhong, W., and Meng, C. (2022), “Projection-Based Techniques for High-Dimensional Optimal Transport Problems,” Wiley Interdisciplinary Reviews: Computational Statistics, e1587.
