Publication Cover
Optimization
A Journal of Mathematical Programming and Operations Research
Volume 73, 2024 - Issue 4
184
Views
1
CrossRef citations to date
0
Altmetric
Research Article

Generalized left-localized Cayley parametrization for optimization with orthogonality constraints

&
Pages 1113-1159 | Received 14 Oct 2021, Accepted 18 Oct 2022, Published online: 15 Nov 2022
 

Abstract

We present a reformulation of optimization problems over the Stiefel manifold by using a Cayley-type transform, named the generalized left-localized Cayley transform, for the Stiefel manifold. The reformulated optimization problem is defined over a vector space, whereby we can apply directly powerful computational arts designed for optimization over a vector space. The proposed Cayley-type transform enjoys several key properties which are useful to (i) study relations between the original problem and the proposed problem; (ii) check the conditions to guarantee the global convergence of optimization algorithms. Numerical experiments demonstrate that the proposed algorithm outperforms the standard algorithms designed with a retraction on the Stiefel manifold.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 φ1 is well-defined over QN,N because all eigenvalues of VQN,N are pure imaginary. For the second expression in (Equation4), see the beginning of Appendix 3.

2 The closure of SO(N)EN,N is equal to SO(N). For every USO(N), we can approximate it by some sequence (Un)n=1 of SO(N)EN,N with any accuracy, i.e. limnUn=U.

3 The domain of φS with SSO(N) is a subset O(N)EN,N(S)=SO(N)EN,N(S) of SO(N).

4 As in (Equation9), QN,p(S) is the common set QN,p for every SO(N). However, we distinguish QN,p(S) for each SO(N) as a parametrization of the particular subset St(p,N)EN,p(S) of St(p,N) (see also Remark 1.3(b)).

5 Algorithm 1 can serve as a central building block in our further advanced Cayley parametrization strategies, reported partially in [Citation38–40].

6 The local diffeomorphism of RU around 0TUSt(p,N) can be verified with the inverse function theorem and the condition (ii) in Definition B.1.

7 Let Ip+[[V]]21T[[V]]21=Q(Ip+Σ)QT be the eigenvalue decomposition with QO(N) and a nonnegative-valued diagonal matrix ΣRp×p. From (I2) in Appendix 9, we have M12(Ip+Σ)1F=(1+σmin2([[V]]21))11. Thus, we have κ(M)M21+[[V]]112+[[V]]2122.

8 From the relation minUSt(p,N)f(U)=infVQN,p(S)fΦS1(V) in Lemma 2.6, ΦS1(V)St(p,N) is also a global minimizer of f over St(p,N).

9 We note that this early stopping of GDM+CP-retraction can be caused by the instability [Citation22] of the Sherman-Morrison-Woodbury formula used in RU0Cay and (fRU0Cay).

10 The subspace W1:={UΩRN×pΩT=ΩRp×p}RN×p is an orthogonal complement to the subspace W2:={UKRN×pKR(Np)×p}RN×p with the inner product X,Y=Tr(XTY)(X,YRN×p). The tangent space TUSt(p,N) can be decomposed as W1W2 with the direct sum ⊕. In view of the orthogonal decomposition, the first term and the second term in the right-hand side of (EquationA1) can be regarded respectively as the orthogonal projection of X onto W1 and W2.

11 The exponential mapping ExpU:TUSt(p,N)St(p,N) at USt(p,N) is defined as a mapping that assigns a given direction DTUSt(p,N) to a point on the geodesic of St(p,N) with the initial velocity D. The exponential mapping is also a special instance of retractions of St(p,N). However, due to its high computational complexity, computationally simpler retractions have been used extensively for Problem 1.1 [Citation1].

Additional information

Funding

This work was supported by JSPS [grants-in-aid19H04134] partially, by JSPS [grants-in-aid 21J21353] and by JST SICORP [grant number JPMJSC20C6].