Abstract
We present a reformulation of optimization problems over the Stiefel manifold by using a Cayley-type transform, named the generalized left-localized Cayley transform, for the Stiefel manifold. The reformulated optimization problem is defined over a vector space, so powerful computational techniques designed for optimization over a vector space can be applied to it directly. The proposed Cayley-type transform enjoys several key properties that are useful for (i) studying relations between the original problem and the proposed problem and (ii) checking the conditions that guarantee the global convergence of optimization algorithms. Numerical experiments demonstrate that the proposed algorithm outperforms the standard algorithms designed with a retraction on the Stiefel manifold.
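For intuition only, the sketch below illustrates the general idea of parametrizing orthonormal matrices by points of a vector space. It uses the classical Cayley transform of a skew-symmetric matrix, which is an assumption made here for illustration and is not the generalized left-localized Cayley transform proposed in the paper. The function names `classical_cayley` and `stiefel_point_from_matrix` are hypothetical.

```python
import numpy as np


def classical_cayley(V):
    """Classical Cayley transform Q = (I - V)(I + V)^{-1} of a skew-symmetric V.

    I + V is always invertible because the eigenvalues of a real
    skew-symmetric matrix are purely imaginary, and Q is orthogonal.
    """
    n = V.shape[0]
    I = np.eye(n)
    # (I - V) and (I + V)^{-1} commute, so solving (I + V) Q = (I - V) gives Q.
    return np.linalg.solve(I + V, I - V)


def stiefel_point_from_matrix(A, p):
    """Map an arbitrary square matrix A to an N-by-p matrix with orthonormal columns.

    Illustrative assumption: A is first made skew-symmetric, then the first
    p columns of its Cayley transform are kept.
    """
    V = A - A.T  # skew-symmetric part (up to a factor of 2)
    Q = classical_cayley(V)
    return Q[:, :p]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    U = stiefel_point_from_matrix(A, 2)
    # The columns of U should be orthonormal up to rounding error.
    print(np.linalg.norm(U.T @ U - np.eye(2)))
```

Because `A` ranges over an ordinary vector space, any unconstrained optimizer could in principle be run on `A` while `U` stays feasible. This is the kind of mechanism the abstract refers to, under the simplifications stated above.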
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 is well-defined over because all eigenvalues of are purely imaginary. For the second expression in (4), see the beginning of Appendix 3.
2 The closure of is equal to . For every , we can approximate it by some sequence of with any accuracy, i.e. .
3 The domain of with is a subset of .
4 As in (9), is the common set for every . However, we distinguish for each as a parametrization of the particular subset of (see also Remark 1.3(b)).
5 Algorithm 1 can serve as a central building block in our further advanced Cayley parametrization strategies, reported partially in [38–40].
6 The local diffeomorphism of around can be verified with the inverse function theorem and the condition (ii) in Definition B.1.
7 Let be the eigenvalue decomposition with and a nonnegative-valued diagonal matrix . From (I2) in Appendix 9, we have . Thus, we have .
8 From the relation in Lemma 2.6, is also a global minimizer of f over .
9 We note that this early stopping of GDM+CP-retraction can be caused by the instability [22] of the Sherman-Morrison-Woodbury formula used in and .
10 The subspace is an orthogonal complement to the subspace with the inner product . The tangent space can be decomposed as with the direct sum ⊕. In view of the orthogonal decomposition, the first term and the second term on the right-hand side of (A1) can be regarded respectively as the orthogonal projection of onto and .
11 The exponential mapping at is defined as a mapping that assigns a given direction to a point on the geodesic of with the initial velocity . The exponential mapping is also a special instance of a retraction of . However, due to its high computational complexity, computationally simpler retractions have been used extensively for Problem 1.1 [1].
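As a hedged illustration of what a computationally simpler retraction can look like, the sketch below implements the widely used QR-based retraction on the Stiefel manifold. It is offered as a generic example and is not claimed to be the retraction used in the paper's experiments. The helper names `project_to_tangent` and `qr_retraction` are hypothetical, and the tangent projection assumes the Euclidean inner product.

```python
import numpy as np


def project_to_tangent(U, G):
    """Project G onto the tangent space of the Stiefel manifold at U.

    Assumes the Euclidean inner product: P_U(G) = G - U sym(U^T G).
    """
    UtG = U.T @ G
    return G - U @ ((UtG + UtG.T) / 2.0)


def qr_retraction(U, D):
    """QR-based retraction: the Q factor of the thin QR decomposition of U + D.

    Column signs are fixed so that the R factor has a positive diagonal,
    which makes the Q factor unique and gives qr_retraction(U, 0) == U.
    """
    Q, R = np.linalg.qr(U + D)  # reduced (thin) QR by default
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs  # scales column j of Q by signs[j]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A random starting point with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((6, 3)))
    D = project_to_tangent(U, rng.standard_normal((6, 3)))
    U_next = qr_retraction(U, 0.1 * D)
    # Feasibility error of the retracted point, expected to be near zero.
    print(np.linalg.norm(U_next.T @ U_next - np.eye(3)))
```

The retraction costs one thin QR factorization per step, whereas evaluating the geodesic requires a matrix exponential, which is the usual reason such retractions are preferred in practice.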