Optimization
A Journal of Mathematical Programming and Operations Research
Volume 73, 2024 - Issue 4
Research Article

Generalized left-localized Cayley parametrization for optimization with orthogonality constraints

Pages 1113-1159 | Received 14 Oct 2021, Accepted 18 Oct 2022, Published online: 15 Nov 2022

References

  • Absil PA, Mahony R, Sepulchre R. Optimization algorithms on matrix manifolds. Princeton (NJ): Princeton University Press; 2008.
  • Manton JH. Geometry, manifolds, and nonconvex optimization: how geometry can help optimization. IEEE Signal Process Mag. 2020;37(5):109–119.
  • Sato H. Riemannian optimization and its applications. Switzerland: Springer International Publishing; 2021.
  • Pietersz R, Groenen PJF. Rank reduction of correlation matrices by majorization. Quant Finance. 2004;4(6):649–662.
  • Grubišić I, Pietersz R. Efficient rank reduction of correlation matrices. Linear Algebra Appl. 2007;422(2–3):629–653.
  • Zhu X. A feasible filter method for the nearest low-rank correlation matrix problem. Numer Algorithms. 2015;69(4):763–784.
  • Bai Z, Sleijpen G, van der Vorst H. Nonlinear eigenvalue problems. In: Bai Z, Demmel J, Dongarra J, Ruhe A, van der Vorst H, editors, Templates for the solution of algebraic eigenvalue problems. Philadelphia (PA): SIAM; 2000. p. 281–314.
  • Yang C, Meza JC, Wang LW. A constrained optimization algorithm for total energy minimization in electronic structure calculations. J Comput Phys. 2006;217(2):709–721.
  • Zhao Z, Bai Z, Jin X. A Riemannian Newton algorithm for nonlinear eigenvalue problems. SIAM J Matrix Anal Appl. 2015;36(2):752–774.
  • Zou H, Hastie T, Tibshirani R. Sparse principal component analysis. J Comput Graph Stat. 2006;15(2):265–286.
  • Journée M, Nesterov Y, Richtárik P, et al. Generalized power method for sparse principal component analysis. J Mach Learn Res. 2010;11(15):517–553.
  • Lu Z, Zhang Y. An augmented Lagrangian approach for sparse principal component analysis. Math Program. 2012;135(1–2):149–193.
  • Boufounos PT, Baraniuk RG. 1-bit compressive sensing. In: Annual Conference on Information Sciences and Systems. IEEE; 2008. p. 16–21.
  • Laska JN, Wen Z, Yin W, et al. Trust, but verify: fast and accurate signal recovery from 1-bit compressive measurements. IEEE Trans Signal Process. 2011;59(11):5289–5301.
  • Joho M, Mathis H. Joint diagonalization of correlation matrices by using gradient methods with application to blind signal separation. In: Sensor Array and Multichannel Signal Processing Workshop Proceedings. IEEE; 2002. p. 273–277.
  • Theis FJ, Cason TP, Absil PA. Soft dimension reduction for ICA by joint diagonalization on the Stiefel manifold. In: International Symposium on Independent Component Analysis and Blind Signal Separation. Springer; 2009. p. 354–361.
  • Sato H. Riemannian Newton-type methods for joint diagonalization on the Stiefel manifold with application to independent component analysis. Optimization. 2017;66(12):2211–2231.
  • Helfrich K, Willmott D, Ye Q. Orthogonal recurrent neural networks with scaled Cayley transform. In: International Conference on Machine Learning. PMLR; Vol. 80; 2018. p. 1969–1978.
  • Bansal N, Chen X, Wang Z. Can we gain more from orthogonality regularizations in training deep networks? In: Advances in neural information processing systems. Curran Associates Inc.; 2018. p. 4266–4276.
  • Yamada I, Ezaki T. An orthogonal matrix optimization by dual Cayley parametrization technique. In: 4th International Symposium on Independent Component Analysis and Blind Signal Separation; 2003. p. 35–40.
  • Hori G, Tanaka T. Pivoting in Cayley transform-based optimization on orthogonal groups. In: Proceedings of the Second APSIPA Annual Summit and Conference; 2010. p. 181–184.
  • Wen Z, Yin W. A feasible method for optimization with orthogonality constraints. Math Program. 2013;142(1–2):397–434.
  • Gao B, Liu X, Chen X, et al. A new first-order algorithmic framework for optimization problems with orthogonality constraints. SIAM J Optim. 2018;28(1):302–332.
  • Zhu X. A Riemannian conjugate gradient method for optimization on the Stiefel manifold. Comput Optim Appl. 2017;67(1):73–110.
  • Zhu X, Sato H. Riemannian conjugate gradient methods with inverse retraction. Comput Optim Appl. 2020;77(3):779–810.
  • Fraikin C, Hüper K, Van Dooren P. Optimization over the Stiefel manifold. In: Proceedings in Applied Mathematics and Mechanics. Wiley; Vol. 7; 2007.
  • Reddi SJ, Hefny A, Sra S, et al. Stochastic variance reduction for nonconvex optimization. In: International Conference on Machine Learning. PMLR; Vol. 48; 2016. p. 314–323.
  • Ghadimi S, Lan G. Accelerated gradient methods for nonconvex nonlinear and stochastic programming. Math Program. 2016;156:59–99.
  • Allen-Zhu Z. Natasha 2: faster non-convex optimization than SGD. In: Advances in neural information processing systems. Curran Associates Inc.; 2018. p. 2680–2691.
  • Ward R, Wu X, Bottou L. AdaGrad stepsizes: sharp convergence over nonconvex landscapes. In: International Conference on Machine Learning. PMLR; Vol. 97; 2019. p. 6677–6686.
  • Chen X, Liu S, Sun R, et al. On the convergence of a class of Adam-type algorithms for non-convex optimization. arXiv preprint arXiv:1808.02941; 2018.
  • Tatarenko T, Touri B. Non-convex distributed optimization. IEEE Trans Automat Control. 2017;62(8):3744–3757.
  • Nocedal J, Wright S. Numerical optimization. 2nd ed. New York (NY): Springer; 2006.
  • Boumal N, Absil PA. Low-rank matrix completion via preconditioned optimization on the Grassmann manifold. Linear Algebra Appl. 2015;475:200–239.
  • Pitaval RA, Dai W, Tirkkonen O. Convergence of gradient descent for low-rank matrix approximation. IEEE Trans Inf Theory. 2015;61(8):4451–4457.
  • Sato H, Iwai T. Optimization algorithms on the Grassmann manifold with application to matrix eigenvalue problems. Jpn J Ind Appl Math. 2014;31(2):355–400.
  • Xu Y, Zeng T. Fast optimal H2 model reduction algorithms based on Grassmann manifold optimization. Int J Numer Anal Model. 2013;10(4):972–991.
  • Kume K, Yamada I. Adaptive localized Cayley parametrization technique for smooth optimization over the Stiefel manifold. In: European Signal Processing Conference. EURASIP; 2019. p. 500–504.
  • Kume K, Yamada I. A Nesterov-type acceleration with adaptive localized Cayley parametrization for optimization over the Stiefel manifold. In: European Signal Processing Conference. EURASIP; 2020. p. 2105–2109.
  • Kume K, Yamada I. A global Cayley parametrization of Stiefel manifold for direct utilization of optimization mechanisms over vector spaces. In: International Conference on Acoustics, Speech, and Signal Processing. IEEE; 2021. p. 5554–5558.
  • Edelman A, Arias TA, Smith ST. The geometry of algorithms with orthogonality constraints. SIAM J Matrix Anal Appl. 1998;20(2):303–353.
  • Nikpour M, Manton JH, Hori G. Algorithms on the Stiefel manifold for joint diagonalisation. In: International Conference on Acoustics, Speech, and Signal Processing. IEEE; Vol. 2; 2002. p. 1481–1484.
  • Nishimori Y, Akaho S. Learning algorithms utilizing quasi-geodesic flows on the Stiefel manifold. Neurocomputing. 2005;67:106–135.
  • Absil PA, Baker CG, Gallivan KA. Trust-region methods on Riemannian manifolds. Found Comput Math. 2007;7(3):303–330.
  • Abrudan TE, Eriksson J, Koivunen V. Steepest descent algorithms for optimization under unitary matrix constraint. IEEE Trans Signal Process. 2008;56(3):1134–1147.
  • Absil PA, Malick J. Projection-like retractions on matrix manifolds. SIAM J Optim. 2012;22(1):135–158.
  • Ring W, Wirth B. Optimization methods on Riemannian manifolds and their application to shape space. SIAM J Optim. 2012;22(2):596–627.
  • Huang W, Gallivan KA, Absil PA. A Broyden class of quasi-Newton methods for Riemannian optimization. SIAM J Optim. 2015;25(3):1660–1685.
  • Jiang B, Dai YH. A framework of constraint preserving update schemes for optimization on Stiefel manifold. Math Program. 2015;153(2):535–575.
  • Manton JH. A framework for generalising the Newton method and other iterative methods from Euclidean space to manifolds. Numer Math. 2015;129:91–125.
  • Sato H, Iwai T. A new, globally convergent Riemannian conjugate gradient method. Optimization. 2015;64(4):1011–1031.
  • Kasai H, Mishra B. Inexact trust-region algorithms on Riemannian manifolds. In: Advances in neural information processing systems. Curran Associates Inc.; 2018. p. 4254–4265.
  • Nesterov Y. A method for solving the convex programming problem with convergence rate O(1/k²). Dokl Akad Nauk SSSR. 1983;269:543–547.
  • Siegel JW. Accelerated optimization with orthogonality constraints. J Comput Math. 2020;39(2):207–226.
  • Boumal N, Mishra B, Absil PA, et al. Manopt, a Matlab toolbox for optimization on manifolds. J Mach Learn Res. 2014;15:1455–1459.
  • Satake I. Linear algebra. New York (NY): Marcel Dekker Inc.; 1975.
  • Van den Bos A. Parameter estimation for scientists and engineers. New York (NY): Wiley; 2007.
  • Horn RA, Johnson CR. Matrix analysis. 2nd ed. Cambridge: Cambridge University Press; 2012.