
Autoregressive moving average model for matrix time series

Pages 318-335 | Received 08 Dec 2022, Accepted 18 Sep 2023, Published online: 04 Oct 2023

Abstract

In this paper, the autoregressive moving average model for matrix time series (MARMA) is investigated. Using conditional least squares estimation, conditional maximum likelihood estimation, the projection theorem in Hilbert space and the decomposition technique of time series, we study the properties of the MARMA model, including necessary and sufficient conditions for stationarity and invertibility, parameter estimation, model testing and model forecasting.

1. Introduction

Matrix time series are time series whose cross-sectional data are matrices; they arise in a variety of fields such as economics, business, ecology, psychology, meteorology, biology and fMRI (Samadi, 2014). For example, consider two stocks, $A_1$ and $A_2$, as potential investment products, whose prices and volumes are selected as two analysis factors. Denote the price and volume of stock $A_k$ at time $t$ by $P_k(t)$ and $V_k(t)$, $k=1,2$; then a $2\times 2$-dimensional matrix time series can be constructed as follows:
\[
\Big\{ X_t \triangleq \begin{bmatrix} P_1(t) & P_2(t) \\ V_1(t) & V_2(t) \end{bmatrix},\ t=1,2,\dots \Big\}.
\]
Matrix time series have attracted scholars' attention since the beginning of the century. Walden and Serroukh (2002) studied the construction of matrix-valued filters for multi-resolution analysis of matrix time series. Samadi (2014) brought forward and investigated a $p$-order autoregressive model for matrix time series, which is essentially a VAR($p$) model in matrix form. D. Wang et al. (2019) proposed a novel factor model
\[
X_t = R F_t C + \epsilon_t, \quad t=1,2,\dots,
\]
where $X_t$ and $F_t$ are matrix time series. Chen et al. (2021) first proposed a first-order autoregressive model for matrix time series in the bilinear form, denoted by MAR(1),
\[
X_t = A X_{t-1} B + \epsilon_t, \quad t=1,2,\dots, \tag{1}
\]
and investigated its stationarity, causality, parameter estimation methods and the asymptotic properties of the estimators. Wu and Hua (2022) independently proposed the $p$-order autoregressive model for matrix time series in the bilinear form, denoted by MAR($p$),
\[
X_t = \sum_{k=1}^{p} A_k X_{t-k} B_k + \epsilon_t, \quad t=1,2,\dots, \tag{2}
\]
and presented parameter estimation, a model identification criterion and model checking. For more literature on matrix time series, one can refer to H. Wang and West (2009), Zhou et al. (2018), Getmanov et al. (2021) and their references.

It is widely known that the autoregressive moving average (ARMA) model plays a very important role in the theory and applications of one-dimensional time series, and we will show later that a bilinear model has unique advantages for matrix time series. In this paper, the autoregressive moving average model for matrix time series (MARMA) is first proposed and investigated. Necessary and sufficient conditions for the stationarity of MARMA are provided, and parameter estimation is considered via the conditional least squares method and the conditional maximum likelihood method. Finally, an example is presented to show an application of the MARMA model.

2. Preliminaries

Let $(\Omega,\mathcal{F},P)$ be a probability space with a $\sigma$-filtration $\{\mathcal{F}_t, t\in\mathbb{N}\}$ on which the second moment of each variable exists, where $\mathbb{N}=\{1,2,3,\dots\}$.

Definition 2.1

For any given positive integers $m$ and $n$, an $m\times n$-dimensional matrix time series refers to
\[
X=\{(X_{ij}(t))_{m\times n},\ t\in\mathbb{N}\}, \tag{3}
\]
where $\{X_{ij}(t),t\in\mathbb{N}\}$ is a one-dimensional time series on a probability space $(\Omega,\mathcal{F},P)$ for each $i=1,2,\dots,m$ and $j=1,2,\dots,n$.

Definition 2.2

Let $X=\{X(t),t\in\mathbb{N}\}$ be an $m\times n$-dimensional matrix time series defined by (3); then its mean function is
\[
\mu_X(t)\triangleq E[X(t)]=(E[X_{ij}(t)])_{m\times n},\quad t\in\mathbb{N}. \tag{4}
\]
Additionally, its autocovariance function is
\[
\Gamma_X(t,s)\triangleq \Gamma_{\mathrm{vec}(X)}(t,s)=(\sigma_{ij,k\ell}(t,s))_{mn\times mn}, \tag{5}
\]
where $\sigma_{ij,k\ell}(t,s)=\mathrm{cov}(X_{ij}(t),X_{k\ell}(s))$, $i,k=1,2,\dots,m$ and $j,\ell=1,2,\dots,n$; $t,s\in\mathbb{N}$, and $\mathrm{vec}(X(t))$ is the vectorization of $X(t)$ by columns, that is,
\[
\mathrm{vec}(X(t))=[X_{11}(t),X_{21}(t),\dots,X_{m1}(t),X_{12}(t),X_{22}(t),\dots,X_{mn}(t)]'. \tag{6}
\]
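To make Definition 2.2 concrete, here is a minimal numerical sketch (our own illustration, not part of the paper; array shapes and names are assumptions) that forms the column-wise vectorization of a simulated m×n matrix time series and estimates its mean function and lag-zero autocovariance with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, T = 2, 3, 500
X = rng.standard_normal((T, m, n))              # X[t] is the m x n observation at time t

# Column-wise vectorization: vec(X(t)) stacks the columns of X(t), as in (6).
vecX = X.transpose(0, 2, 1).reshape(T, m * n)

mu_hat = X.mean(axis=0)                         # estimate of the mean function (m x n)
Gamma0_hat = np.cov(vecX, rowvar=False)         # lag-0 sample autocovariance of vec(X(t)), mn x mn

print(mu_hat.shape, Gamma0_hat.shape)           # (2, 3) (6, 6)
```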

Stationarity and matrix white noise play a very important role in time series analysis. Thus, we introduce the concepts of a stationary matrix time series and matrix white noise in the following.

Definition 2.3

Let $\{X(t),t\in\mathbb{N}\}$ be a matrix time series defined by (3) and $\mathrm{vec}(X(t))$ be the vectorization of $X(t)$ defined by (6). Then $\{X(t),t\in\mathbb{N}\}$ is a stationary matrix time series if and only if $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$ is stationary.

Definition 2.4

For any given positive integers $m$ and $n$, let $\epsilon=\{(\epsilon_{ij}(t))_{m\times n},\ t\in\mathbb{N}\}$ be an $m\times n$-dimensional matrix time series; then $\epsilon$ is called an $m\times n$-dimensional matrix white noise if it satisfies the following conditions.

  1. Its mean function $\mu_\epsilon(t)=O_{m\times n}$ for all $t\in\mathbb{N}$, where $O_{m\times n}$ is the $m\times n$-dimensional zero matrix.

  2. Its autocovariance function $\Gamma_\epsilon(t,s)$ defined by Definition 2.2 satisfies
\[
\Gamma_\epsilon(t,s)=\begin{cases} O_{mn}, & t\neq s,\\ \Sigma_{mn}, & t=s,\end{cases}\qquad t,s\in\mathbb{N},
\]
where $O_{mn}$ is the $mn\times mn$-dimensional zero matrix, and
\[
\Sigma_{mn}=\mathrm{diag}(\sigma_{11}^2,\sigma_{21}^2,\dots,\sigma_{m1}^2,\sigma_{12}^2,\sigma_{22}^2,\dots,\sigma_{(m-1)n}^2,\sigma_{mn}^2) \tag{7}
\]
is the $mn\times mn$-dimensional diagonal matrix with diagonal entries $\sigma_{11}^2,\sigma_{21}^2,\dots,\sigma_{m1}^2,\sigma_{12}^2,\sigma_{22}^2,\dots,\sigma_{(m-1)n}^2,\sigma_{mn}^2$.

For any matrix white noise $\{\epsilon(t),t\in\mathbb{N}\}$, if its vectorization by columns, $\{\mathrm{vec}(\epsilon(t)),t\in\mathbb{N}\}$, is Gaussian, then $\{\epsilon(t),t\in\mathbb{N}\}$ is called a matrix Gaussian white noise.

Property 2.1

For any $m\times n$-dimensional matrix time series $\{\epsilon(t),t\in\mathbb{N}\}$, it is an $m\times n$-dimensional matrix white noise if and only if $\{\mathrm{vec}(\epsilon(t)),t\in\mathbb{N}\}$ is an $mn$-dimensional vector white noise.

The proof of Property 2.1 is not difficult, so we omit it.

When investigating the autoregressive moving average model for matrix time series, we will use the Kronecker product, the matrix reshape operation and the derivative of a matrix with respect to a matrix. Thus, we introduce them in the following.

Definition 2.5

(Graham, 2018)

Assume matrices $A=(a_{ij})_{m\times n}$ and $C=(c_{ij})_{p\times q}$; then the $m\times n$ block matrix $(a_{ij}C)_{m\times n}$ is called the Kronecker product of $A$ and $C$, denoted by $A\otimes C$, that is,
\[
A\otimes C=\begin{bmatrix} a_{11}C & a_{12}C & \cdots & a_{1n}C\\ a_{21}C & a_{22}C & \cdots & a_{2n}C\\ \vdots & \vdots & & \vdots\\ a_{m1}C & a_{m2}C & \cdots & a_{mn}C \end{bmatrix}.
\]
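As a quick numerical illustration of Definition 2.5 (a toy example of ours; numpy's kron implements exactly this block construction):

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
C = np.array([[0, 5],
              [6, 7]])

K = np.kron(A, C)                      # 4 x 4 block matrix (a_ij * C)
# The (1,2) block of K equals a_12 * C = 2 * C.
assert np.array_equal(K[0:2, 2:4], 2 * C)
print(K)
```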

Definition 2.6

For any $A=(a_{ij})_{m\times n}$ and positive integers $p$, $q$ satisfying $pq=mn$, the $(p,q)$-order reshaped matrix of $A$, denoted by $\mathrm{Res}(A,p,q)$, is defined by
\[
\mathrm{Res}(A,p,q)=\begin{bmatrix} a_1 & a_{p+1} & \cdots & a_{(q-1)p+1}\\ a_2 & a_{p+2} & \cdots & a_{(q-1)p+2}\\ \vdots & \vdots & & \vdots\\ a_p & a_{2p} & \cdots & a_{pq} \end{bmatrix},
\]
where $a_k=a_{ij}$ for all $k=1,2,\dots,pq$, with $i=k-m[(k-1)/m]$ and $j=[(k-1)/m]+1$, and $[\cdot]$ is the operator of taking the integer part.
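Definition 2.6 amounts to enumerating A by columns and refilling a p×q matrix by columns, i.e., a Fortran-order reshape. A minimal sketch (the helper name res is ours):

```python
import numpy as np

def res(A, p, q):
    """(p, q)-order reshape of A: enumerate A by columns, then refill by columns."""
    a = A.reshape(-1, order='F')           # a_1, ..., a_{mn}: column-major enumeration
    return a.reshape((p, q), order='F')    # refill column by column into a p x q matrix

A = np.arange(1, 7).reshape(2, 3, order='F')   # 2 x 3 matrix with entries 1..6 by columns
print(res(A, 3, 2))
# [[1 4]
#  [2 5]
#  [3 6]]
```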

Definition 2.7

(Graham, 2018)

Let $F=(F_{ij})_{m\times n}$ and $X=(X_{ij})_{p\times q}$ be two matrices, where $m$, $n$, $p$ and $q$ are natural numbers. The derivative of matrix $F$ with respect to matrix $X$ is defined by
\[
\frac{\partial F}{\partial X}=\begin{bmatrix} \frac{\partial F}{\partial X_{11}} & \frac{\partial F}{\partial X_{12}} & \cdots & \frac{\partial F}{\partial X_{1q}}\\ \frac{\partial F}{\partial X_{21}} & \frac{\partial F}{\partial X_{22}} & \cdots & \frac{\partial F}{\partial X_{2q}}\\ \vdots & \vdots & & \vdots\\ \frac{\partial F}{\partial X_{p1}} & \frac{\partial F}{\partial X_{p2}} & \cdots & \frac{\partial F}{\partial X_{pq}} \end{bmatrix},
\]
where the derivative of matrix $F$ with respect to the scalar $X_{ij}$ is defined by
\[
\frac{\partial F}{\partial X_{ij}}=\begin{bmatrix} \frac{\partial F_{11}}{\partial X_{ij}} & \frac{\partial F_{12}}{\partial X_{ij}} & \cdots & \frac{\partial F_{1n}}{\partial X_{ij}}\\ \frac{\partial F_{21}}{\partial X_{ij}} & \frac{\partial F_{22}}{\partial X_{ij}} & \cdots & \frac{\partial F_{2n}}{\partial X_{ij}}\\ \vdots & \vdots & & \vdots\\ \frac{\partial F_{m1}}{\partial X_{ij}} & \frac{\partial F_{m2}}{\partial X_{ij}} & \cdots & \frac{\partial F_{mn}}{\partial X_{ij}} \end{bmatrix},\quad i=1,2,\dots,p,\ j=1,2,\dots,q.
\]

For the derivative of a matrix with respect to a matrix, its product rule and two common formulas are given in Properties 2.2 and 2.3.

Property 2.2

(Graham, 2018)

For any $X=(x_{ij})_{m\times n}$, $Y=(y_{ij})_{n\times u}$ and $Z=(z_{ij})_{p\times q}$, it follows that
\[
\frac{\partial (XY)}{\partial Z}=\frac{\partial X}{\partial Z}(I_q\otimes Y)+(I_p\otimes X)\frac{\partial Y}{\partial Z},
\]
where $I_q$ is the $q\times q$-dimensional identity matrix.

Taking $Y=(y_{ij})_{n\times 1}$ and $X=Y'$ in Property 2.2, we obtain Corollary 2.1.

Corollary 2.1

For any $Y=(y_{ij})_{n\times 1}$ and $Z=(z_{ij})_{p\times q}$, it follows that
\[
\frac{\partial (Y'Y)}{\partial Z}=2\,\frac{\partial Y'}{\partial Z}\,(I_q\otimes Y).
\]

Property 2.3

(Graham, 2018)

For any $A=(a_{ij})_{m\times n}$, $B=(b_{ij})_{n\times u}$ and invertible $X=(X_{ij})_{n\times n}$, it follows that
\[
\frac{\partial\,\mathrm{vec}(AX^{-1}B)}{\partial\,\mathrm{vec}(X)}=-(X^{-1}B)\otimes(AX^{-1})'
\quad\text{and}\quad
\frac{\partial \ln(|X|)}{\partial X}=(X^{-1})'.
\]

Taking $B=(b_{ij})_{n\times 1}$ and $A=B'$ in Property 2.3, we obtain Corollary 2.2.

Corollary 2.2

For any $B=(b_{ij})_{n\times 1}$ and invertible $X=(X_{ij})_{n\times n}$, it follows that
\[
\frac{\partial (B'X^{-1}B)}{\partial X}=-\mathrm{Res}\big((X^{-1}B)\otimes(B'X^{-1})',\,n,\,n\big).
\]

3. Autoregressive moving average model for matrix time series

The autoregressive moving average model for matrix time series is an extension of the vector autoregressive moving average (VARMA) model to matrix time series. However, we cannot build the autoregressive moving average model for matrix time series in the VARMA form
\[
X(t)=\Phi_0+\sum_{i=1}^{p}\Phi_i X(t-i)+\epsilon(t)-\sum_{j=1}^{q}\Psi_j\epsilon(t-j). \tag{8}
\]
The reason is that the form of (8) cannot describe the dependence between the different columns of $X(t)$ according to the rule of matrix multiplication. That is, the $\ell$th column of $X(t)$ is not affected by the $s$th column of $X(t-1),X(t-2),\dots,X(t-p)$ when $s\neq\ell$.

3.1. MARMA (p,q) model

In this section, an autoregressive moving average model for matrix time series (MARMA) is first brought forward, whose degenerate case, the autoregressive model for matrix time series (MAR), is exactly model (2) proposed by Wu and Hua (2022) and an extension of model (1) proposed by Chen et al. (2021).

Definition 3.1

Let $\{X(t),t\in\mathbb{N}\}$ be an $m\times n$-dimensional matrix time series. If $X$ is stationary and for each $t\in\mathbb{N}$ it follows that
\[
X(t)=C+\sum_{k=1}^{p}\Phi_k X(t-k)\Psi_k+\epsilon(t)-\sum_{j=1}^{q}\Theta_j\epsilon(t-j)\Xi_j, \tag{9}
\]
where $C$ is an $m\times n$-dimensional matrix; $\Phi_k$ and $\Theta_j$ are $m\times m$-dimensional matrices, and $\Psi_k$ and $\Xi_j$ are $n\times n$-dimensional matrices for each $k=1,2,\dots,p$ and $j=1,2,\dots,q$, where $p$ and $q$ are two nonnegative integers; and $\{\epsilon(t),t\in\mathbb{N}\}$ is an $m\times n$-dimensional matrix white noise such that $\mathrm{vec}(\epsilon(t))$ is independent of $\mathrm{vec}(X(s))$ for all $s<t$, then $\{X(t),t\in\mathbb{N}\}$ is said to follow a $(p,q)$-order autoregressive moving average model for matrix time series, denoted by MARMA($p,q$).
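For illustration, a centered MARMA(1,1) path can be simulated directly from (9). The following sketch is our own toy example; the coefficient matrices are arbitrary and chosen small so that the stationarity condition of Section 3.3 holds:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, T = 2, 2, 1000

Phi = np.array([[0.4, 0.1], [0.0, 0.3]])     # m x m
Psi = np.array([[1.0, 0.2], [0.1, 0.5]])     # n x n
Theta = np.array([[0.3, 0.0], [0.1, 0.2]])   # m x m
Xi = np.array([[1.0, 0.0], [0.2, 0.4]])      # n x n

eps = rng.standard_normal((T, m, n)) * 0.1
X = np.zeros((T, m, n))
for t in range(1, T):
    # X(t) = Phi X(t-1) Psi + eps(t) - Theta eps(t-1) Xi   (C = O, p = q = 1)
    X[t] = Phi @ X[t - 1] @ Psi + eps[t] - Theta @ eps[t - 1] @ Xi

print(X.mean(axis=0))   # close to the zero matrix for a stationary, centered model
```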

When $q=0$, the MARMA($p$,0) model (9) degenerates into the form
\[
X(t)=C+\sum_{k=1}^{p}\Phi_k X(t-k)\Psi_k+\epsilon(t), \tag{10}
\]
which is a $p$-order autoregressive model for matrix time series, MAR($p$).

When $p=0$, the MARMA(0,$q$) model (9) degenerates into the form
\[
X(t)=C+\epsilon(t)-\sum_{j=1}^{q}\Theta_j\epsilon(t-j)\Xi_j, \tag{11}
\]
which is called a $q$-order moving average model for matrix time series, denoted by MMA($q$).

If $X=\{X(t),t\in\mathbb{N}\}$ is an $m\times n$-dimensional matrix time series defined by (3) and $X$ is stationary, denote $\mu=E[X(t)]$, $t\in\mathbb{N}$; then it follows from the MARMA($p,q$) model (9) that
\[
\mu=C+\Phi_1\mu\Psi_1+\Phi_2\mu\Psi_2+\cdots+\Phi_p\mu\Psi_p. \tag{12}
\]
Denote $Y(t)=X(t)-\mu$. It yields from (12) and the MARMA($p,q$) model (9) that
\[
Y(t)=\sum_{k=1}^{p}\Phi_k Y(t-k)\Psi_k+\epsilon(t)-\sum_{j=1}^{q}\Theta_j\epsilon(t-j)\Xi_j \tag{13}
\]
holds for all $t\in\mathbb{N}$, and then $Y=\{Y(t),t\in\mathbb{N}\}$ is said to follow a $(p,q)$-order centralized MARMA($p,q$) model.

Every MARMA($p,q$) model can be transformed into a centralized MARMA($p,q$) model, and the two models have the same coefficient parameters. Thus, when estimating the coefficient parameters of the MARMA($p,q$) model (9) we will mainly study the centralized MARMA($p,q$) model (13).

For any MARMA($p,q$) model (9), and for any $c_k\neq 0$ and $d_j\neq 0$, $k=1,2,\dots,p$ and $j=1,2,\dots,q$, it follows that
\[
X(t)=C+\sum_{k=1}^{p}(c_k\Phi_k)X(t-k)\Big(\tfrac{1}{c_k}\Psi_k\Big)+\epsilon(t)-\sum_{j=1}^{q}(d_j\Theta_j)\epsilon(t-j)\Big(\tfrac{1}{d_j}\Xi_j\Big),
\]
that is, the coefficient parameters of the MARMA($p,q$) model (9) are not unique. Thus, we impose the constraint conditions that
\[
\Psi_k=(\psi_{uv})_{n\times n}\ \text{satisfies}\ \max_{1\le u,v\le n}|\psi_{uv}|=1 \tag{14}
\]
and
\[
\Xi_j=(\xi_{uv})_{n\times n}\ \text{satisfies}\ \max_{1\le u,v\le n}|\xi_{uv}|=1 \tag{15}
\]
for all $k=1,2,\dots,p$ and $j=1,2,\dots,q$.

3.2. Relationship between MARMA model and VARMA model

When the column number of the matrix $X(t)$ equals one, i.e., $n=1$, the MARMA($p,q$) model (9) degenerates into a $(p,q)$-order vector autoregressive moving average model, VARMA($p,q$), as follows:
\[
X(t)=C+\sum_{k=1}^{p}\Phi_k X(t-k)+\epsilon(t)-\sum_{j=1}^{q}\Theta_j\epsilon(t-j), \tag{16}
\]
where $\{X(t),t\in\mathbb{N}\}$ is an $m$-dimensional vector time series, $C$ is an $m$-dimensional vector, $\Phi_k$ and $\Theta_j$ are $m\times m$-dimensional matrices for all $k=1,2,\dots,p$ and $j=1,2,\dots,q$, and $\{\epsilon(t),t\in\mathbb{N}\}$ is an $m$-dimensional vector white noise such that $\epsilon(t)$ is independent of $X(s)$ for all $s<t$. Obviously, the VARMA model (16) is a special case of the MARMA model (9).

On the other hand, for any $m\times n$-dimensional matrix time series $\{X(t),t\in\mathbb{N}\}$, its vectorization $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$ is an $mn\times 1$-dimensional time series, and the $(p,q)$-order vector autoregressive moving average model VARMA($p,q$) for $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$ follows as
\[
\mathrm{vec}(X(t))=A_0+\sum_{k=1}^{p}A_k\,\mathrm{vec}(X(t-k))+\epsilon(t)-\sum_{j=1}^{q}B_j\,\epsilon(t-j), \tag{17}
\]
where $A_0$ is an $mn\times 1$-dimensional vector; $A_k$ and $B_j$ are $mn\times mn$-dimensional matrices for $k=1,2,\dots,p$ and $j=1,2,\dots,q$; and $\{\epsilon(t),t\in\mathbb{N}\}$ is an $mn\times 1$-dimensional white noise such that $\epsilon(t)$ is independent of $\mathrm{vec}(X(s))$ for all $s<t$.

A natural question is why the authors still bring forward the MARMA($p,q$) model (9) for $\{X(t),t\in\mathbb{N}\}$ rather than directly using the VARMA($p,q$) model (17) for $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$.

In fact, there are two important reasons why the authors propose the MARMA($p,q$) model (9) for $\{X(t),t\in\mathbb{N}\}$. Firstly, the MARMA($p,q$) model (9) reveals the information structure of $\{X(t),t\in\mathbb{N}\}$ very clearly. Secondly, the MARMA($p,q$) model (9) reduces the number of model parameters far more than the VARMA($p,q$) model (17) for $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$. In fact, the parameter number of the MARMA($p,q$) model (9) is $2mn+(p+q)(m^2+n^2)$, whereas the parameter number of the VARMA($p,q$) model (17) is $2mn+(p+q)m^2n^2$. Generally,
\[
2mn+(p+q)(m^2+n^2)\ll 2mn+(p+q)m^2n^2.
\]
For example, if $p=q=1$ and $m=n=10$, then
\[
2mn+(p+q)(m^2+n^2)=600\ll 2mn+(p+q)m^2n^2=20200.
\]
In today's big data era, $m$ and $n$ are often very large; taking $m=n=100$ and $p=q=1$ as an example,
\[
2mn+(p+q)(m^2+n^2)=60000\ll 2mn+(p+q)m^2n^2=200020000.
\]
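The parameter counts above can be checked with a couple of lines (our own arithmetic sketch):

```python
def marma_params(m, n, p, q):
    return 2 * m * n + (p + q) * (m * m + n * n)

def varma_params(m, n, p, q):
    return 2 * m * n + (p + q) * (m * n) ** 2

print(marma_params(10, 10, 1, 1), varma_params(10, 10, 1, 1))       # 600 20200
print(marma_params(100, 100, 1, 1), varma_params(100, 100, 1, 1))   # 60000 200020000
```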

Remark 3.1

The MARMA($p,q$) model (9) greatly reduces the number of model parameters compared with the VARMA($p,q$) model (17).

Although it is not a good idea to replace the MARMA($p,q$) model (9) with the VARMA($p,q$) model (17), in the following we show that there exists a special VARMA($p,q$) model equivalent to the MARMA($p,q$) model, which plays a very important role in the theoretical analysis of the MARMA($p,q$) model (9).

Theorem 3.1

The MARMA($p,q$) model (9) for $\{X(t),t\in\mathbb{N}\}$ is equivalent to the following VARMA($p,q$) model (18) for $\{\mathrm{vec}(X(t)),t\in\mathbb{N}\}$:
\[
\mathrm{vec}(X(t))=\mathrm{vec}(C)+\sum_{k=1}^{p}(\Psi_k'\otimes\Phi_k)\,\mathrm{vec}(X(t-k))+\mathrm{vec}(\epsilon(t))-\sum_{j=1}^{q}(\Xi_j'\otimes\Theta_j)\,\mathrm{vec}(\epsilon(t-j)), \tag{18}
\]
where $\mathrm{vec}(X(t))$ and $\mathrm{vec}(\epsilon(t))$ denote the vectorization of the matrices $X(t)$ and $\epsilon(t)$ by columns, and $\otimes$ is the Kronecker product.

Theorem 3.1 can be proved by the following equivalence relation: for any matrices $Y_{m\times n}$, $A_{m\times m}$, $B_{m\times n}$ and $C_{n\times n}$, it follows that
\[
Y_{m\times n}=A_{m\times m}B_{m\times n}C_{n\times n}\iff \mathrm{vec}(Y_{m\times n})=(C_{n\times n}'\otimes A_{m\times m})\,\mathrm{vec}(B_{m\times n}).
\]
The equivalence relation is not difficult to prove, so we omit its proof and that of Theorem 3.1.
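The vectorization identity behind Theorem 3.1 is easy to verify numerically; the sketch below (ours, with column-wise vec implemented via a Fortran-order reshape) checks that vec(ABC) = (C' ⊗ A) vec(B):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 2
A = rng.standard_normal((m, m))
B = rng.standard_normal((m, n))
C = rng.standard_normal((n, n))

vec = lambda M: M.reshape(-1, order='F')   # column-wise vectorization

lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)             # vec(ABC) = (C' kron A) vec(B)
assert np.allclose(lhs, rhs)
```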

3.3. Stationarity and invertibility conditions for the MARMA model

According to Theorem 3.1, any MARMA($p,q$) model (9) can be converted into its corresponding VARMA($p,q$) model (18). Furthermore, the VARMA($p,q$) model (18) can be rewritten as
\[
P(B)\,\mathrm{vec}(X(t))=\mathrm{vec}(C)+Q(B)\,\mathrm{vec}(\epsilon(t)),\quad t\in\mathbb{N}, \tag{19}
\]
where
\[
P(B)=I_{mn}-\sum_{k=1}^{p}(\Psi_k'\otimes\Phi_k)B^k, \tag{20}
\]
\[
Q(B)=I_{mn}-\sum_{j=1}^{q}(\Xi_j'\otimes\Theta_j)B^j \tag{21}
\]
and $B$ is the delay operator, i.e., $BX(t)=X(t-1)$ holds for all $t\in\mathbb{N}$.

Theorem 3.2

For the MARMA($p,q$) model (9), the necessary and sufficient condition for stationarity is that every root $\lambda$ of (22) lies inside the unit circle, where
\[
\big|\lambda^p I_{mn}-\lambda^{p-1}(\Psi_1'\otimes\Phi_1)-\lambda^{p-2}(\Psi_2'\otimes\Phi_2)-\cdots-\lambda(\Psi_{p-1}'\otimes\Phi_{p-1})-(\Psi_p'\otimes\Phi_p)\big|=0. \tag{22}
\]
The necessary and sufficient condition for invertibility is that every root $\lambda$ of (23) lies inside the unit circle, where
\[
\big|\lambda^q I_{mn}-\lambda^{q-1}(\Xi_1'\otimes\Theta_1)-\lambda^{q-2}(\Xi_2'\otimes\Theta_2)-\cdots-\lambda(\Xi_{q-1}'\otimes\Theta_{q-1})-(\Xi_q'\otimes\Theta_q)\big|=0. \tag{23}
\]

The proof of Theorem 3.2 is presented in Appendix 1.
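In practice, the root condition (22) can be checked by computing the eigenvalues of the block companion matrix of the AR polynomial, whose eigenvalues are exactly the roots of (22). A hedged sketch (function name and structure are ours; it assumes the vectorized AR coefficients Ψ_k'⊗Φ_k as in (20)):

```python
import numpy as np

def is_stationary(Phis, Psis):
    """Check the root condition of Theorem 3.2 for the AR part of a MARMA model.

    Phis, Psis: lists of the m x m and n x n AR coefficient matrices.
    """
    A = [np.kron(Psi.T, Phi) for Phi, Psi in zip(Phis, Psis)]   # vectorized AR coefficients
    mn, p = A[0].shape[0], len(A)
    companion = np.zeros((p * mn, p * mn))
    companion[:mn, :] = np.hstack(A)                 # first block row: A_1 ... A_p
    companion[mn:, :-mn] = np.eye((p - 1) * mn)      # sub-diagonal identity blocks
    return np.max(np.abs(np.linalg.eigvals(companion))) < 1

Phi = np.array([[0.4, 0.1], [0.0, 0.3]])
Psi = np.array([[1.0, 0.2], [0.1, 0.5]])
print(is_stationary([Phi], [Psi]))   # True for these small coefficients
```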

Corollary 3.1

For the MAR($p$) model (10), the necessary and sufficient condition for stationarity is that every root $\lambda$ of (22) lies inside the unit circle.

Remark 3.2

Corollary 3.1 extends Proposition 1 in Chen et al. (2021).

Corollary 3.2

For the MMA($q$) model (11), the necessary and sufficient condition for invertibility is that every root $\lambda$ of (23) lies inside the unit circle.

3.4. Parameter estimation for MARMA model

In this section, we present the conditional least squares method and the conditional maximum likelihood estimation method for the MARMA($p,q$) model (9).

Let $x_1,x_2,\dots,x_N$ be a series of samples of the centralized matrix time series $X=\{X(t),t\in\mathbb{N}\}$ defined by (3) with $C=O_{m\times n}$, where
\[
x_t=\begin{bmatrix} x_{11}(t) & x_{12}(t) & \cdots & x_{1n}(t)\\ x_{21}(t) & x_{22}(t) & \cdots & x_{2n}(t)\\ \vdots & \vdots & & \vdots\\ x_{m1}(t) & x_{m2}(t) & \cdots & x_{mn}(t) \end{bmatrix},\quad t=1,2,\dots,N, \tag{24}
\]
and the integer $N$ is the sample length.

Once the coefficient parameters of the MARMA($p,q$) model (9) have been obtained, it follows from (12) that
\[
C=\mu-\Phi_1\mu\Psi_1-\Phi_2\mu\Psi_2-\cdots-\Phi_p\mu\Psi_p,
\]
and then the constant matrix $C$ of the MARMA($p,q$) model (9) can be estimated by
\[
\hat{C}=\bar{X}-\hat\Phi_1\bar{X}\hat\Psi_1-\hat\Phi_2\bar{X}\hat\Psi_2-\cdots-\hat\Phi_p\bar{X}\hat\Psi_p,\quad\text{where}\quad \bar{X}=\frac{1}{N}\sum_{t=1}^{N}x_t.
\]
Thus, in the following we always assume that the samples come from a centralized MARMA($p,q$) model (9), i.e., $C=O_{m\times n}$.

We use the VARMA($p,q$) model (19) with $C=O_{m\times n}$, which is equivalent to the centralized MARMA($p,q$) model (9), to estimate the coefficient parameters of the MARMA($p,q$) model (9) by the conditional least squares method.

It yields from (19) with $C=O_{m\times n}$ that
\[
\mathrm{vec}(\epsilon(t))=Q^{-1}(B)P(B)\,\mathrm{vec}(X(t)),\quad t\in\mathbb{N}, \tag{25}
\]
where $Q^{-1}(B)$ is the inverse operator of $Q(B)$, and
\[
P(B)=I_{mn}-\sum_{k=1}^{p}(\Psi_k'\otimes\Phi_k)B^k,\qquad Q(B)=I_{mn}-\sum_{j=1}^{q}(\Xi_j'\otimes\Theta_j)B^j.
\]
For the sake of brevity, denote
\[
G(B)=\sum_{k=0}^{+\infty}G_kB^k=Q^{-1}(B)P(B) \tag{26}
\]
and $P(B)=\sum_{i=0}^{+\infty}P_iB^i$, $Q(B)=\sum_{j=0}^{+\infty}Q_jB^j$, where we stipulate that
\[
P_i=\begin{cases} I_{mn}, & i=0,\\ -\Psi_i'\otimes\Phi_i, & 1\le i\le p,\\ O_{mn}, & i\ge p+1,\end{cases}\qquad
Q_j=\begin{cases} I_{mn}, & j=0,\\ -\Xi_j'\otimes\Theta_j, & 1\le j\le q,\\ O_{mn}, & j\ge q+1.\end{cases} \tag{27}
\]
It follows from (26) that $Q(B)G(B)=P(B)$, which means that
\[
\sum_{i=0}^{k}Q_iG_{k-i}=P_k,\quad k=0,1,2,\dots. \tag{28}
\]
It yields from (28) and (27) that
\[
G_k=\begin{cases} I_{mn}, & k=0,\\[2pt] -\Psi_k'\otimes\Phi_k+\sum_{i=1}^{k\wedge q}(\Xi_i'\otimes\Theta_i)G_{k-i}, & 1\le k\le p,\\[2pt] \sum_{i=1}^{k\wedge q}(\Xi_i'\otimes\Theta_i)G_{k-i}, & k\ge p+1,\end{cases} \tag{29}
\]
where $k\wedge q=\min\{k,q\}$.

In summary, the centralized MARMA($p,q$) model (9) (i.e., with $C=O_{m\times n}$) is equivalent to the representation
\[
\mathrm{vec}(\epsilon(t))=\sum_{k=0}^{+\infty}G_k\,\mathrm{vec}(X(t-k)),\quad t\in\mathbb{N}, \tag{30}
\]
where $G_k$, $k=0,1,2,\dots$, are given by (29).
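The recursion (29) translates directly into code. The sketch below (helper name ours) returns G_0,…,G_K from the coefficient matrices, assuming the vectorized coefficients Ψ_k'⊗Φ_k and Ξ_j'⊗Θ_j of (27):

```python
import numpy as np

def g_coefficients(Phis, Psis, Thetas, Xis, K):
    """Return G_0, ..., G_K from the recursion (29)."""
    A = [np.kron(Psi.T, Phi) for Phi, Psi in zip(Phis, Psis)]      # AR part
    Bm = [np.kron(Xi.T, Theta) for Theta, Xi in zip(Thetas, Xis)]  # MA part
    mn = A[0].shape[0] if A else Bm[0].shape[0]
    p, q = len(A), len(Bm)
    G = [np.eye(mn)]
    for k in range(1, K + 1):
        Gk = -A[k - 1].copy() if k <= p else np.zeros((mn, mn))
        for i in range(1, min(k, q) + 1):
            Gk = Gk + Bm[i - 1] @ G[k - i]
        G.append(Gk)
    # Residual vectors can then be approximated by vec(eps_t) = sum_k G_k vec(x_{t-k}), cf. (30).
    return G
```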

Theorem 3.3

According to the conditional least squares method, the parameters of the MARMA($p,q$) model (9) satisfy the following matrix derivative equations:
\[
\begin{cases}
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Phi_i}\Big(I_m\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Psi_i}\Big(I_n\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Theta_j}\Big(I_m\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_m, & j=1,2,\dots,q,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Xi_j}\Big(I_n\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_n, & j=1,2,\dots,q,
\end{cases}
\]
where $\tilde{x}_t=\mathrm{vec}(x_t)$ and $G_k$ is given by (29).

The proof of Theorem 3.3 is presented in Appendix 2.

Corollary 3.3

According to the conditional least squares method, the parameters of the MAR($p$) model (10) satisfy the following matrix derivative equations:
\[
\begin{cases}
\displaystyle\sum_{t=p+1}^{N}\frac{\partial \big((\Psi_i'\otimes\Phi_i)\tilde{x}_{t-i}\big)'}{\partial \Phi_i}\Big(I_m\otimes\big(\tilde{x}_t-\sum_{\ell=1}^{p}(\Psi_\ell'\otimes\Phi_\ell)\tilde{x}_{t-\ell}\big)\Big)=O_m,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\frac{\partial \big((\Psi_i'\otimes\Phi_i)\tilde{x}_{t-i}\big)'}{\partial \Psi_i}\Big(I_n\otimes\big(\tilde{x}_t-\sum_{\ell=1}^{p}(\Psi_\ell'\otimes\Phi_\ell)\tilde{x}_{t-\ell}\big)\Big)=O_n,
\end{cases}\qquad i=1,2,\dots,p,
\]
where $\tilde{x}_t=\mathrm{vec}(x_t)$.

Theorem 3.4

Assume the innovations are Gaussian with mean $O_{m\times n}$ and covariance matrix $\Sigma_{mn}$. According to the conditional maximum likelihood estimation method, the parameters of the centralized MARMA($p,q$) model (9) satisfy the following matrix derivative equations:
\[
\begin{cases}
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Phi_i}\big(I_m\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Psi_i}\big(I_n\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Theta_j}\big(I_m\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_m, & j=1,2,\dots,q,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Xi_j}\big(I_n\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_n, & j=1,2,\dots,q,\\[4pt]
\displaystyle\frac{1}{N}\sum_{t=1}^{N}\mathrm{Res}\big([\Sigma_{mn}^{-1}H(t,G)]\otimes[\Sigma_{mn}^{-1}H(t,G)],\,mn,\,mn\big)=\Sigma_{mn}^{-1},
\end{cases}
\]
where $H(t,G)=\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}$ and $G_k$ is given by (29).

The proof of Theorem 3.4 is presented in Appendix 3.

Corollary 3.4

Assume the innovations are Gaussian with mean $O_{m\times n}$ and covariance matrix $\Sigma_{mn}$. According to the conditional maximum likelihood estimation method, the parameters of the centralized MAR($p$) model (10) satisfy the following matrix derivative equations:
\[
\begin{cases}
\displaystyle\sum_{t=i+1}^{N}\frac{\partial \big((\Psi_i'\otimes\Phi_i)\tilde{x}_{t-i}\big)'}{\partial \Phi_i}\big(I_m\otimes[\Sigma_{mn}^{-1}H(t,\Phi,\Psi)]\big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=i+1}^{N}\frac{\partial \big((\Psi_i'\otimes\Phi_i)\tilde{x}_{t-i}\big)'}{\partial \Psi_i}\big(I_n\otimes[\Sigma_{mn}^{-1}H(t,\Phi,\Psi)]\big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\frac{1}{N}\sum_{t=1}^{N}\mathrm{Res}\big([\Sigma_{mn}^{-1}H(t,\Phi,\Psi)]\otimes[\Sigma_{mn}^{-1}H(t,\Phi,\Psi)],\,mn,\,mn\big)=\Sigma_{mn}^{-1},
\end{cases}
\]
where $H(t,\Phi,\Psi)=\tilde{x}_t-\sum_{s=1}^{(t-1)\wedge p}(\Psi_s'\otimes\Phi_s)\tilde{x}_{t-s}$ and $(t-1)\wedge p=\min\{t-1,p\}$.

Remark 3.3

The matrix derivative equations in Theorems 3.3 and 3.4 are very complex. In particular, the coefficients $G_k$ in (29), $k=1,2,\dots$, are defined by a recursion whose implied parameters are themselves to be estimated. Thus, it is difficult to obtain closed-form solutions, but approximate solutions can be obtained by numerical computation.

3.5. Hypothesis testing for the MARMA model

Let $x_1,x_2,\dots,x_N$ be a series of samples of the centralized matrix time series $X=\{X(t),t\in\mathbb{N}\}$ defined by (3) with $C=O_{m\times n}$ and $x_t=(x_{ij}(t))_{m\times n}$ for all $t=1,2,\dots,N$. Additionally, assume $x_1,x_2,\dots,x_N$ are Gaussian. In this section, we test whether $x_1,x_2,\dots,x_N$ follow the MARMA($p,q$) model (9).

The null hypothesis and the alternative hypothesis are as follows:

  • H0: $X=\{X(t),t\in\mathbb{N}\}$ follows the MARMA($p,q$) model (9);

  • H1: $X=\{X(t),t\in\mathbb{N}\}$ does not follow the MARMA($p,q$) model (9).

When H0 holds, denote
\[
\tilde{\epsilon}(t)=\sum_{k=0}^{t-1}G_k\tilde{x}_{t-k},\quad t=1,2,\dots,N, \tag{31}
\]
where $\tilde{(\cdot)}=\mathrm{vec}(\cdot)$. It follows from Corollary 5.3 of Karl and Simar (2015) that
\[
T^2=N\,\bar{\tilde{\epsilon}}'\,(S_{\tilde{\epsilon}}^2)^{-1}\,\bar{\tilde{\epsilon}}\sim T^2(mn,N-1),
\]
where
\[
\bar{\tilde{\epsilon}}=\frac{1}{N}\sum_{t=1}^{N}\tilde{\epsilon}_t\quad\text{and}\quad S_{\tilde{\epsilon}}^2=\frac{1}{N-1}\sum_{t=1}^{N}\tilde{\epsilon}_t\tilde{\epsilon}_t'. \tag{32}
\]
It follows from Theorem 5.9 of Karl and Simar (2015) that
\[
\frac{N-mn}{(N-1)mn}\,T^2(mn,N-1)\sim F(mn,N-mn),
\]
that is,
\[
F=\frac{N(N-mn)}{(N-1)mn}\,\bar{\tilde{\epsilon}}'\,(S_{\tilde{\epsilon}}^2)^{-1}\,\bar{\tilde{\epsilon}}\sim F(mn,N-mn).
\]
Summarizing the above deduction, we obtain Theorem 3.5 for hypothesis testing on the MARMA($p,q$) model (9).

Theorem 3.5

For any given significance level $\alpha\in(0,1)$, if $F<F_{\alpha/2}(mn,N-mn)$ or $F>F_{1-\alpha/2}(mn,N-mn)$, then reject the hypothesis that $\{X(t),t\in\mathbb{N}\}$ follows the MARMA($p,q$) model (9); otherwise, accept that $\{X(t),t\in\mathbb{N}\}$ follows the MARMA($p,q$) model (9), where
\[
F=\frac{N(N-mn)}{(N-1)mn}\,\bar{\tilde{\epsilon}}'\,(S_{\tilde{\epsilon}}^2)^{-1}\,\bar{\tilde{\epsilon}},
\]
and $\bar{\tilde{\epsilon}}$, $S_{\tilde{\epsilon}}^2$ and $\tilde{\epsilon}_t$ are given by (32) and (31).
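A sketch of the test in Theorem 3.5 (ours; it assumes the vectorized residuals of (31) have already been collected into an N×mn array eps_tilde):

```python
import numpy as np
from scipy import stats

def marma_f_test(eps_tilde, alpha=0.05):
    """Two-sided F test of Theorem 3.5 on the vectorized residuals."""
    N, mn = eps_tilde.shape
    eps_bar = eps_tilde.mean(axis=0)
    S2 = (eps_tilde.T @ eps_tilde) / (N - 1)                       # as in (32)
    F = N * (N - mn) / ((N - 1) * mn) * eps_bar @ np.linalg.solve(S2, eps_bar)
    lo = stats.f.ppf(alpha / 2, mn, N - mn)
    hi = stats.f.ppf(1 - alpha / 2, mn, N - mn)
    return F, (lo, hi), not (F < lo or F > hi)                     # True means "do not reject"
```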

3.6. Forecasting for the MARMA model

Let $\{X_t,t\in\mathbb{N}\}$ be an $m\times n$-dimensional matrix time series defined by (3) that follows the MARMA($p,q$) model (9), or equivalently, let $\{\mathrm{vec}(X_t),t\in\mathbb{N}\}$ follow the VARMA($p,q$) model (18), that is,
\[
\mathrm{vec}(X_t)=\mathrm{vec}(C)+\sum_{k=1}^{p}(\Psi_k'\otimes\Phi_k)\,\mathrm{vec}(X_{t-k})+\mathrm{vec}(\epsilon_t)-\sum_{j=1}^{q}(\Xi_j'\otimes\Theta_j)\,\mathrm{vec}(\epsilon_{t-j}), \tag{33}
\]
where $\{\epsilon_t,t\in\mathbb{N}\}$ is an $m\times n$-dimensional matrix white noise.

Denote by $\hat{X}_t(\ell)$ the forecast of $X_{t+\ell}$ given that $X_1,X_2,\dots,X_t$ are known, which is referred to as the $\ell$-step forecast. It follows from (33) and the projection theorem in Hilbert space that
\[
\mathrm{vec}(\hat{X}_t(\ell))=\mathrm{vec}(C)+\sum_{k=1}^{p}(\Psi_k'\otimes\Phi_k)\,\mathrm{vec}(\tilde{X}_t(\ell-k))-\sum_{j=1}^{q}(\Xi_j'\otimes\Theta_j)\,\mathrm{vec}(\tilde{\epsilon}_t(\ell-j)), \tag{34}
\]
where
\[
\mathrm{vec}(\tilde{X}_t(k))=\begin{cases}\mathrm{vec}(X_{t+k}), & k\le 0,\\ \mathrm{vec}(\hat{X}_t(k)), & k\ge 1,\end{cases}\qquad
\mathrm{vec}(\tilde{\epsilon}_t(k))=\begin{cases}\mathrm{vec}(\epsilon_{t+k}), & k\le 0,\\ \mathrm{vec}(O_{m\times n}), & k\ge 1.\end{cases}
\]
It yields from the equivalence of the MARMA($p,q$) model (9) and the VARMA($p,q$) model (18) that
\[
\hat{X}_t(\ell)=C+\sum_{k=1}^{p}\Phi_k\tilde{X}_t(\ell-k)\Psi_k-\sum_{j=1}^{q}\Theta_j\tilde{\epsilon}_t(\ell-j)\Xi_j, \tag{35}
\]
where
\[
\tilde{X}_t(k)=\begin{cases}X_{t+k}, & k\le 0,\\ \hat{X}_t(k), & k\ge 1,\end{cases}\qquad
\tilde{\epsilon}_t(k)=\begin{cases}\epsilon_{t+k}, & k\le 0,\\ O_{m\times n}, & k\ge 1.\end{cases}
\]
In the following, we study the interval estimation for the MARMA($p,q$) model (9) and assume the innovations are Gaussian. Equivalently, $\{\mathrm{vec}(X_t),t\in\mathbb{N}\}$ follows the VARMA($p,q$) model (19), that is, $P(B)\,\mathrm{vec}(X_t)=\mathrm{vec}(C)+Q(B)\,\mathrm{vec}(\epsilon_t)$, where $P(B)$ and $Q(B)$ are defined by (20) and (21), and $\{\mathrm{vec}(\epsilon_t),t\in\mathbb{N}\}$ is a vector white noise.

Denote
\[
\Pi(B)=\sum_{k=0}^{+\infty}\Pi_kB^k=P^{-1}(B)Q(B),
\]
and then
\[
\mathrm{vec}(X_t)=P^{-1}(B)\,\mathrm{vec}(C)+\sum_{k=0}^{+\infty}\Pi_k\,\mathrm{vec}(\epsilon_{t-k}), \tag{36}
\]
where
\[
\Pi_k=\begin{cases} I_{mn}, & k=0,\\[2pt] -\Xi_k'\otimes\Theta_k+\sum_{i=1}^{k\wedge p}(\Psi_i'\otimes\Phi_i)\Pi_{k-i}, & 1\le k\le q,\\[2pt] \sum_{i=1}^{k\wedge p}(\Psi_i'\otimes\Phi_i)\Pi_{k-i}, & k\ge q+1,\end{cases} \tag{37}
\]
with $k\wedge p=\min\{k,p\}$.

For any $\ell>0$, it follows from (36) and the construction of $\mathrm{vec}(\hat{X}_t(\ell))$ that
\[
\mathrm{vec}(X_{t+\ell})-\mathrm{vec}(\hat{X}_t(\ell))=\sum_{k=0}^{\ell-1}\Pi_k\,\mathrm{vec}(\epsilon_{t+\ell-k}), \tag{38}
\]
and then
\[
\mathrm{vec}(X_{t+\ell})-\mathrm{vec}(\hat{X}_t(\ell))\sim N\Big(O_{mn\times 1},\ \sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big). \tag{39}
\]
For any given $\alpha\in(0,1)$, it yields from (39) that the confidence interval of $\mathrm{vec}(X_{t+\ell})$ with confidence level $1-\alpha$ is
\[
\Big(\mathrm{vec}(\hat{X}_t(\ell))-U_{1-\frac{\alpha}{2}}\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)},\ \ \mathrm{vec}(\hat{X}_t(\ell))+U_{1-\frac{\alpha}{2}}\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)}\Big),
\]
where $\mathrm{diag}(\cdot)$ denotes the vector composed of the main diagonal elements and $\sqrt{\cdot}$ takes the square root of every element. It yields from the equivalence of the MARMA($p,q$) model (9) and the VARMA($p,q$) model (19) that the confidence interval of $X_{t+\ell}$ with confidence level $1-\alpha$ is
\[
\Big(\hat{X}_t(\ell)-U_{1-\frac{\alpha}{2}}\,\mathrm{Res}\Big(\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)},m,n\Big),\ \ \hat{X}_t(\ell)+U_{1-\frac{\alpha}{2}}\,\mathrm{Res}\Big(\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)},m,n\Big)\Big).
\]
In summary, we obtain the following results.

Theorem 3.6

Assume $\{X_t,t\in\mathbb{N}\}$ follows the MARMA($p,q$) model (9).

(1) For any $\ell>0$, the $\ell$-step point forecast is
\[
\hat{X}_t(\ell)=C+\sum_{k=1}^{p}\Phi_k\tilde{X}_t(\ell-k)\Psi_k-\sum_{j=1}^{q}\Theta_j\tilde{\epsilon}_t(\ell-j)\Xi_j,
\]
where
\[
\tilde{X}_t(k)=\begin{cases}X_{t+k}, & k\le 0,\\ \hat{X}_t(k), & k\ge 1,\end{cases}\qquad
\tilde{\epsilon}_t(k)=\begin{cases}\epsilon_{t+k}, & k\le 0,\\ O_{m\times n}, & k\ge 1.\end{cases}
\]
(2) For any $\ell>0$ and $\alpha\in(0,1)$, the $\ell$-step interval forecast with confidence level $1-\alpha$ is
\[
\Big(\hat{X}_t(\ell)-U_{1-\frac{\alpha}{2}}\,\mathrm{Res}\Big(\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)},m,n\Big),\ \ \hat{X}_t(\ell)+U_{1-\frac{\alpha}{2}}\,\mathrm{Res}\Big(\sqrt{\mathrm{diag}\Big(\sum_{k=0}^{\ell-1}\Pi_k\Sigma_{mn}\Pi_k'\Big)},m,n\Big)\Big),
\]
where $U_{1-\frac{\alpha}{2}}$ is the lower $1-\frac{\alpha}{2}$ quantile of the standard normal distribution, $\mathrm{Res}(\cdot)$ is the reshape function of Definition 2.6, $\mathrm{diag}(\cdot)$ is the vector composed of the main diagonal elements, $\sqrt{\cdot}$ takes the square root of every element, and
\[
\Pi_k=\begin{cases} I_{mn}, & k=0,\\[2pt] -\Xi_k'\otimes\Theta_k+\sum_{i=1}^{k\wedge p}(\Psi_i'\otimes\Phi_i)\Pi_{k-i}, & 1\le k\le q,\\[2pt] \sum_{i=1}^{k\wedge p}(\Psi_i'\otimes\Phi_i)\Pi_{k-i}, & k\ge q+1.\end{cases}
\]
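The interval forecast of Theorem 3.6(2) can be sketched as follows (our own helpers; pi_coefficients implements the recursion for Π_k, forecast_interval assembles the element-wise interval, and Sigma_mn together with the ℓ-step point forecast X_hat are assumed to be available):

```python
import numpy as np
from scipy import stats

def pi_coefficients(Phis, Psis, Thetas, Xis, K):
    """Pi_0, ..., Pi_K from the recursion in Theorem 3.6(2)."""
    A = [np.kron(Psi.T, Phi) for Phi, Psi in zip(Phis, Psis)]
    Bm = [np.kron(Xi.T, Theta) for Theta, Xi in zip(Thetas, Xis)]
    mn = (A or Bm)[0].shape[0]
    p, q = len(A), len(Bm)
    Pi = [np.eye(mn)]
    for k in range(1, K + 1):
        Pik = -Bm[k - 1].copy() if k <= q else np.zeros((mn, mn))
        for i in range(1, min(k, p) + 1):
            Pik = Pik + A[i - 1] @ Pi[k - i]
        Pi.append(Pik)
    return Pi

def forecast_interval(X_hat, Pis, Sigma_mn, ell, alpha=0.05):
    """Element-wise (1 - alpha) interval around the ell-step point forecast X_hat (m x n)."""
    m, n = X_hat.shape
    V = sum(Pis[k] @ Sigma_mn @ Pis[k].T for k in range(ell))   # forecast error covariance
    half = stats.norm.ppf(1 - alpha / 2) * np.sqrt(np.diag(V)).reshape((m, n), order='F')
    return X_hat - half, X_hat + half
```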

3.7. Supplementary notes for the MARMA model

3.7.1. Model identification for the MARMA model

According to Theorem 3.1, the MARMA($p,q$) model (9) is equivalent to the VARMA($p,q$) model (18). Thus, we can use the model identification methods for the VARMA model to identify the order of the MARMA model, such as
\[
\mathrm{AIC}(p,q)=\ln(|\Sigma_{mn}(p,q)|)+\frac{2}{N}(p+q)(m^2+n^2),\qquad
\mathrm{BIC}(p,q)=\ln(|\Sigma_{mn}(p,q)|)+\frac{\ln(N)}{N}(p+q)(m^2+n^2),
\]
or alternatively,
\[
\mathrm{AIC}(p,q)=-2\ln(L)+2(p+q)(m^2+n^2),\qquad
\mathrm{BIC}(p,q)=-2\ln(L)+\ln(N)(p+q)(m^2+n^2),
\]
where $N$ is the length of the observation sequence and $\ln(L)$ is the logarithm of the likelihood function.
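A small sketch of the first pair of criteria (our own helper; Sigma_hat denotes the residual covariance matrix of a candidate MARMA(p,q) fit and N the sample length):

```python
import numpy as np

def aic_bic(Sigma_hat, N, m, n, p, q):
    k = (p + q) * (m * m + n * n)               # number of AR/MA coefficient parameters
    logdet = np.linalg.slogdet(Sigma_hat)[1]    # ln |Sigma_hat|
    return logdet + 2.0 * k / N, logdet + np.log(N) * k / N
```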

3.7.2. MARIMA model

For any matrix time series $\{X(t)=(X_{ij}(t))_{m\times n},t\in\mathbb{N}\}$ defined by (3), the difference operator $\Delta$ for matrix time series is defined by
\[
\Delta X(t)=X(t)-X(t-1),\qquad \Delta^kX(t)=\Delta^{k-1}X(t)-\Delta^{k-1}X(t-1),\quad k=2,3,\dots, \tag{40}
\]
and $\Delta$ defined by (40) has the same effect as the difference operator for vector time series. That is, if $\{X(t)=(X_{ij}(t))_{m\times n},t\in\mathbb{N}\}$ is nonstationary, then we can try to eliminate the nonstationarity by applying $\Delta$ defined by (40). If there exists a positive integer $d$ such that $\{\Delta^dX(t),t\in\mathbb{N}\}$ is stationary while $\{\Delta^{d-1}X(t),t\in\mathbb{N}\}$ is nonstationary, and $\{\Delta^dX(t),t\in\mathbb{N}\}$ follows a MARMA($p,q$) model (9), then $\{X(t)=(X_{ij}(t))_{m\times n},t\in\mathbb{N}\}$ is said to follow a $(p,d,q)$-order autoregressive integrated moving average model for matrix time series, denoted by MARIMA($p,d,q$).

4. An application of the MARMA model

In this section, we model the time series of daily closing prices and daily volumes of Haitong Securities Company Limited (abbreviated as Haitong Securities; stock code: 600837) and Ping An Insurance (Group) Company of China, Ltd. (abbreviated as Ping An; stock code: 601318). The data are downloaded from the China Stock Market & Accounting Research Database (CSMAR), and the time window is from January 2, 2018 to December 31, 2021, which includes 973 records for each stock.

For the sake of clarity, we denote the time series by
\[
\Big\{\begin{bmatrix} P_1(t) & P_2(t)\\ V_1(t) & V_2(t)\end{bmatrix},\ t=1,2,3,\dots\Big\},
\]
where $P_1(t)$ and $V_1(t)$ are the daily closing price and daily volume of Haitong Securities, and $P_2(t)$ and $V_2(t)$ are the daily closing price and daily volume of Ping An.

4.1. Data preprocessing

We first conduct the Kwiatkowski, Phillips, Schmidt and Shin (KPSS) test, i.e., ‘kpsstest’ function in the software MATLAB R2020b, to test the stationarity of the daily closing prices and daily volumes of Haitong Securities and Ping An, and the results show that the daily closing prices and daily volumes of Haitong Securities and Ping An are nonstationary.

In the following we consider the logarithmic rates (log rates) of the daily closing prices and daily volumes of Haitong Securities and Ping An. Denote
\[
\Big\{R(t)=\begin{bmatrix} R_{11}(t) & R_{12}(t)\\ R_{21}(t) & R_{22}(t)\end{bmatrix},\ t=2,3,4,\dots\Big\}, \tag{41}
\]
where
\[
R_{1k}(t)=\ln\Big(\frac{P_k(t)}{P_k(t-1)}\Big)\quad\text{and}\quad R_{2k}(t)=\ln\Big(\frac{V_k(t)}{V_k(t-1)}\Big),\quad k=1,2.
\]
That is, $R_{11}(t)$ is the logarithmic rate of the daily closing price of Haitong Securities, $R_{21}(t)$ the logarithmic rate of the daily volume of Haitong Securities, $R_{12}(t)$ the logarithmic rate of the daily closing price of Ping An, and $R_{22}(t)$ the logarithmic rate of the daily volume of Ping An.
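Constructing the log-rate series (41) from raw prices and volumes is straightforward; the sketch below uses placeholder arrays (variable names and data are ours, not the CSMAR data):

```python
import numpy as np

# P1, V1: Haitong Securities closing prices and volumes; P2, V2: Ping An.
# Each is a 1-D numpy array of length T; the example values here are placeholders.
P1, V1 = np.array([10.0, 10.2, 10.1]), np.array([1.0e6, 1.2e6, 0.9e6])
P2, V2 = np.array([60.0, 59.5, 60.4]), np.array([2.0e6, 1.8e6, 2.1e6])

lograte = lambda x: np.diff(np.log(x))        # ln(x_t / x_{t-1})

# R[t] is the 2 x 2 matrix [[R11, R12], [R21, R22]] at each time point.
R = np.stack([np.stack([lograte(P1), lograte(P2)], axis=1),
              np.stack([lograte(V1), lograte(V2)], axis=1)], axis=1)
print(R.shape)   # (T - 1, 2, 2)
```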

We conduct the Kwiatkowski, Phillips, Schmidt and Shin (KPSS) test, i.e., ‘kpsstest’ function in the software MATLAB R2020b, to test the stationarity of the logarithmic rates of daily closing prices and daily volumes of Haitong Securities and Ping An, and the results show that the logarithmic rates of daily closing prices and daily volumes of Haitong Securities and Ping An are stationary.

Additionally, we conduct a Ljung-Box Q test, i.e., ‘lbqtest’ function in the software MATLAB R2020b, to test the pure randomness of the logarithmic rates of daily closing prices and daily volumes of Haitong Securities and Ping An, and the results show that the logarithmic rates of daily closing prices or daily volumes of Haitong Securities and Ping An are not purely random.

In conclusion, for the stocks of Haitong Securities and Ping An, their daily closing prices and daily volumes are nonstationary, but their logarithmic rates of daily closing prices and daily volumes are stationary, and their logarithmic rates of daily closing prices or daily volumes are not purely random.

4.2. Modelling of MARMA (p,q)

We use the Bayesian information criterion (BIC) to select the model, and the results show that MARMA(4,0) is the best. Using the conditional least squares method and a MATLAB R2020b program, we establish the following MARMA(4,0) model for $\{R(t),t=1,2,3,\dots\}$ defined by (41):
\[
R(t)=\hat{C}+\hat{\Phi}_1R(t-1)\hat{\Psi}_1+\hat{\Phi}_2R(t-2)\hat{\Psi}_2+\hat{\Phi}_3R(t-3)\hat{\Psi}_3+\hat{\Phi}_4R(t-4)\hat{\Psi}_4+\epsilon(t), \tag{42}
\]
where
\[
\hat{\Phi}_1=\begin{bmatrix}2.11\times10^{-2} & 1.32\times10^{-3}\\ 3.4371 & 0.3527\end{bmatrix},\quad
\hat{\Psi}_1=\begin{bmatrix}1 & 0.3708\\ 0.4014 & 0.0354\end{bmatrix},\quad
\hat{\Phi}_2=\begin{bmatrix}0.0339 & 2.21\times10^{-3}\\ 1.1721 & 0.1271\end{bmatrix},\quad
\hat{\Psi}_2=\begin{bmatrix}0.4132 & 0.4468\\ 1 & 0.2409\end{bmatrix},
\]
\[
\hat{\Phi}_3=\begin{bmatrix}0.0101 & 9.82\times10^{-4}\\ 0.3575 & 0.1739\end{bmatrix},\quad
\hat{\Psi}_3=\begin{bmatrix}1 & 0.4715\\ 0.1625 & 0.2516\end{bmatrix},\quad
\hat{\Phi}_4=\begin{bmatrix}0.0440 & 8.75\times10^{-4}\\ 0.4732 & 0.1012\end{bmatrix},\quad
\hat{\Psi}_4=\begin{bmatrix}0.9537 & 0.6853\\ 0.4385 & 1\end{bmatrix},
\]
\[
\hat{\mu}=\begin{bmatrix}6.66\times10^{-5} & 3.75\times10^{-4}\\ 4.86\times10^{-4} & 1.64\times10^{-3}\end{bmatrix},\qquad
\hat{C}=\begin{bmatrix}3.27\times10^{-5} & 3.48\times10^{-4}\\ 1.27\times10^{-3} & 3.40\times10^{-3}\end{bmatrix},
\]
and the covariance matrix of the residuals $\{\epsilon(t),t\in\mathbb{N}\}$ is
\[
\Sigma_\epsilon=\begin{bmatrix}4.40\times10^{-4} & 2.74\times10^{-3} & 2.17\times10^{-4} & 1.30\times10^{-3}\\ 2.74\times10^{-3} & 0.1386 & 1.31\times10^{-3} & 5.60\times10^{-2}\\ 2.17\times10^{-4} & 1.31\times10^{-3} & 3.15\times10^{-4} & 1.31\times10^{-3}\\ 1.30\times10^{-3} & 5.60\times10^{-2} & 1.31\times10^{-3} & 0.1077\end{bmatrix}. \tag{43}
\]

4.3. Evaluation on MARMA (p,q)

To save space, we do not show the model test, model optimization or forecasting for the MARMA(4,0) model (42), but instead present a comparison of the MARMA model and the ARMA model in this subsection. We first establish an ARMA($p,q$) model for each of $R_{11}(t)$, $R_{21}(t)$, $R_{12}(t)$ and $R_{22}(t)$, and obtain the following models:
\[
\begin{aligned}
R_{11}(t)&=1.53\times10^{-8}+0.0204R_{11}(t-1)+0.0058R_{11}(t-2)+0.0625R_{11}(t-3)-0.0393R_{11}(t-4)+e_{11}(t),\\
R_{21}(t)&=2.98\times10^{-4}-0.3994R_{21}(t-1)-0.2598R_{21}(t-2)-0.1918R_{21}(t-3)-0.1096R_{21}(t-4)+e_{21}(t),\\
R_{12}(t)&=5.17\times10^{-7}-0.0002R_{12}(t-1)-0.0041R_{12}(t-2)+0.0476R_{12}(t-3)-0.0699R_{12}(t-4)+e_{12}(t),\\
R_{22}(t)&=2.52\times10^{-4}-0.4842R_{22}(t-1)-0.3860R_{22}(t-2)-0.2660R_{22}(t-3)-0.1562R_{22}(t-4)+e_{22}(t),
\end{aligned} \tag{44}
\]
where the covariance matrix of the residuals $\{e(t)=(e_{11}(t),e_{21}(t),e_{12}(t),e_{22}(t))',t\in\mathbb{N}\}$ is
\[
\Sigma_e=\begin{bmatrix}4.49\times10^{-4} & 2.69\times10^{-3} & 2.22\times10^{-4} & 1.24\times10^{-3}\\ 2.69\times10^{-3} & 0.1692 & 1.40\times10^{-3} & 7.22\times10^{-2}\\ 2.22\times10^{-4} & 1.40\times10^{-3} & 3.22\times10^{-4} & 1.32\times10^{-3}\\ 1.24\times10^{-3} & 7.22\times10^{-2} & 1.32\times10^{-3} & 0.1369\end{bmatrix}. \tag{45}
\]
It follows from (43) and (45) that the residual variances and covariances of the MARMA(4,0) model (42) are almost consistently smaller than those of the ARMA(4,0) models (44).

In practice, we are more concerned about the residual variance, i.e., the variance of each element of the residual. Using (43) and (45), we compute the relative change of the residual variance of the MARMA(4,0) model (42) with respect to the residual variance of the ARMA(4,0) models (44) as follows:
\[
\begin{bmatrix}
\dfrac{\mathrm{var}(\epsilon_{11}(t))-\mathrm{var}(e_{11}(t))}{\mathrm{var}(e_{11}(t))} & \dfrac{\mathrm{var}(\epsilon_{12}(t))-\mathrm{var}(e_{12}(t))}{\mathrm{var}(e_{12}(t))}\\[8pt]
\dfrac{\mathrm{var}(\epsilon_{21}(t))-\mathrm{var}(e_{21}(t))}{\mathrm{var}(e_{21}(t))} & \dfrac{\mathrm{var}(\epsilon_{22}(t))-\mathrm{var}(e_{22}(t))}{\mathrm{var}(e_{22}(t))}
\end{bmatrix}
=\begin{bmatrix}-1.93\% & -2.24\%\\ -18.10\% & -21.31\%\end{bmatrix}.
\]
That is, the MARMA(4,0) model (42) reduces every residual variance relative to the ARMA(4,0) models (44). In particular, the relative reduction of the volumes' residual variances exceeds 10%, which suggests that the MARMA model can substantially improve prediction accuracy.

5. Conclusion

We proposed an autoregressive moving average model for matrix time series (MARMA), which is an extension of the autoregressive model for matrix time series (MAR). Like the MAR model, the MARMA model retains the original matrix structure and provides a much more parsimonious model than the approach of vectorizing the matrix into a long vector and fitting a vector autoregressive model. Compared with the MAR model, the MARMA model is capable of modelling an unknown process with fewer parameters.

For the MARMA model, the necessary and sufficient conditions for stationarity and invertibility are established. Parameter estimation is investigated via the conditional least squares method and the conditional maximum likelihood estimation method. Point forecasting and interval forecasting are presented by using the projection theorem in Hilbert space and the decomposition technique of time series. Additionally, model identification, model testing and possible extensions are discussed.

There are many directions in which to extend the scope of the MARMA model. A random environment, such as a Markov environment, might be imposed on the MARMA model to depict the impact of environmental change. Additionally, sparsity or group sparsity might be imposed on the coefficient matrices to achieve further dimension reduction. Furthermore, the idea of MARMA can be applied to yield modelling, volatility modelling, weather forecast modelling and animal migration modelling.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This paper is partially supported by the basic scientific research business expenses of Universities in Xinjiang, China [Grant Number XQZX20230057] and the National Natural Science Foundation of China [Grant Number 11671142].

References

  • Chen, R., Xiao, H., & Yang, D. (2021). Autoregressive models for matrix-valued time series. Journal of Econometrics, 222(1), 539–560. https://doi.org/10.1016/j.jeconom.2020.07.015
  • Getmanov, V. G., Chinkin, V. E., Sidorov, R. V., Gvishiani, A. D., Dobrovolsky, M. N., Soloviev, A. A., Dmitrieva, A. N., Kovylyaeva, A. A., & Yashin, I. I. (2021). Methods for recognition of local anisotropy in muon fluxes in the URAGAN hodoscope matrix data time series. Physics of Atomic Nuclei, 84(6), 1080–1086. https://doi.org/10.1134/S106377882113010X
  • Graham, A. (2018). Kronecker products and matrix calculus with applications. Dover Publications Inc.
  • Karl, W., & Simar, L. (2015). Applied multivariate statistical analysis (4th ed.). Springer-Verlag.
  • Samadi, S. (2014). Matrix time series analysis [PhD Dissertation]. The University of Georgia.
  • Walden, A., & Serroukh, A. (2002). Wavelet analysis of matrix-valued time series. Proceedings: Mathematical, Physical and Engineering Sciences, 458(2017), 157–179.
  • Wang, D., Liu, X., & Chen, R. (2019). Factor models for matrix-valued high-dimensional time series. Journal of Econometrics, 208(1), 231–248. https://doi.org/10.1016/j.jeconom.2018.09.013
  • Wang, H., & West, M. (2009). Bayesian analysis of matrix normal graphical models. Biometrika, 96(4), 821–834. https://doi.org/10.1093/biomet/asp049
  • Wu, S. J., & Hua, N. (2022). Autoregressive model of time series with matrix cross-section data (in Chinese). Acta Mathematica Sinica, 65(6), 1093–1104.
  • Zhou, L. H., Du, G. W., Tao, D. P., Chen, H. M., Cheng, J., & Gong, L. B. (2018). Clustering multivariate time series data via multi-nonnegative matrix factorization in multi-relational networks. IEEE Access, 2018(6), 74747–74761. https://doi.org/10.1109/Access.6287639.

Appendices

Appendix 1. Proof of Theorem 3.2

In order to obtain the stationarity and invertibility conditions for the MARMA($p,q$) model (9), we first give the following lemma.

Lemma A.1

For any square matrices $A_1,A_2,\dots,A_k$, the operator
\[
G(B)=I-A_1B-\cdots-A_{k-1}B^{k-1}-A_kB^k
\]
is invertible if and only if every root $\lambda$ of (A1) satisfies $|\lambda|<1$, where $k$ is a natural number and $B$ is the delay operator:
\[
\big|\lambda^kI-\lambda^{k-1}A_1-\cdots-\lambda A_{k-1}-A_k\big|=0. \tag{A1}
\]

Proof.

The degree-$k$ polynomial with matrix coefficients
\[
G(z)=I-A_1z-\cdots-A_{k-1}z^{k-1}-A_kz^k
\]
can be factorized into $k$ linear factors with matrix coefficients over the complex field as follows:
\[
G(z)=(I-C_1z)(I-C_2z)\cdots(I-C_kz),
\]
where $C_1,C_2,\dots,C_k$ are determined by
\[
\sum_{1\le j_1<j_2<\cdots<j_u\le k}C_{j_1}C_{j_2}\cdots C_{j_u}=(-1)^{u-1}A_u,\quad u=1,2,\dots,k. \tag{A2}
\]
Thus,
\[
G(B)=(I-C_1B)(I-C_2B)\cdots(I-C_kB). \tag{A3}
\]
For any $i=1,2,\dots,k$, it is easy to prove that $I-C_iB$ is invertible if and only if $\rho(C_i)<1$, that is, all roots of $|\lambda I-C_i|=0$ are in the unit circle. It follows from (A3) that $G(B)$ is invertible if and only if all $I-C_iB$, $i=1,2,\dots,k$, are invertible. Thus, $G(B)$ is invertible if and only if all roots of $|\lambda I-C_i|=0$ are in the unit circle for all $i=1,2,\dots,k$. According to the properties of determinants, $G(B)$ is invertible if and only if all roots of
\[
\big|(\lambda I-C_1)(\lambda I-C_2)\cdots(\lambda I-C_k)\big|=0 \tag{A4}
\]
are in the unit circle. It yields from (A2) that
\[
(\lambda I-C_1)(\lambda I-C_2)\cdots(\lambda I-C_k)=\lambda^kI-\lambda^{k-1}A_1-\cdots-\lambda A_{k-1}-A_k.
\]
Thus, $G(B)$ is invertible if and only if all roots of
\[
\big|\lambda^kI-\lambda^{k-1}A_1-\cdots-\lambda A_{k-1}-A_k\big|=0
\]
are in the unit circle.

Proof of Theorem 3.2.

For the VARMA($p,q$) model (19), $P(B)\,\mathrm{vec}(X(t))=\mathrm{vec}(C)+Q(B)\,\mathrm{vec}(\epsilon(t))$, $t\in\mathbb{N}$. It follows from the concept of stationarity that the necessary and sufficient condition for stationarity is that the operator $P(B)$ is invertible. According to Lemma A.1, the operator $P(B)$ is invertible if and only if every root $\lambda$ of (22) satisfies $|\lambda|<1$. Thus, the VARMA($p,q$) model (19) is stationary if and only if every root $\lambda$ of (22) satisfies $|\lambda|<1$. Note that the VARMA($p,q$) model (19) is equivalent to the MARMA($p,q$) model (9), so the MARMA($p,q$) model (9) is stationary if and only if every root $\lambda$ of (22) satisfies $|\lambda|<1$.

The necessary and sufficient condition for invertibility can be obtained by an argument similar to that for stationarity, so we omit it.

Appendix 2. Proof of Theorem 3.3

Noting that $\{\mathrm{vec}(\epsilon(t)),t\in\mathbb{N}\}$ is an $mn\times 1$-dimensional white noise, the objective function of the conditional least squares method for representation (30) is
\[
J(\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)
=\sum_{t=p+1}^{N}\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big)'\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big), \tag{A5}
\]
where we take $x_t=O_{m\times n}$ for all $t\le 0$.

Lemma A.2

$J(\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)$ defined by (A5) attains its minimum value with respect to $\Phi_k$, $\Psi_k$, $\Theta_j$ and $\Xi_j$ for all $k=1,2,\dots,p$ and $j=1,2,\dots,q$.

Proof.

It yields from analysing (29) that $J(\Phi_1,\dots,\Xi_q)$ defined by (A5) is a multivariate polynomial in $\Phi_k$, $\Psi_k$, $\Theta_j$ and $\Xi_j$ for all $k=1,2,\dots,p$ and $j=1,2,\dots,q$. Moreover, $J$ defined by (A5) is obviously greater than or equal to zero, so it is bounded below. Thus, $J$ defined by (A5) attains its minimum value with respect to $\Phi_k$, $\Psi_k$, $\Theta_j$ and $\Xi_j$ for all $k=1,2,\dots,p$ and $j=1,2,\dots,q$.

Proof of Theorem 3.3.

It follows from Lemma A.2 that, according to the conditional least squares method, the parameters of the MARMA($p,q$) model (9) satisfy the following matrix derivative equations:
\[
\begin{cases}
\displaystyle\frac{\partial}{\partial\Phi_i}\sum_{t=p+1}^{N}\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big)'\Big(\mathrm{vec}(x_t)+\sum_{\ell=1}^{t-1}G_\ell\,\mathrm{vec}(x_{t-\ell})\Big)=O_m, & i=1,2,\dots,p,\\[6pt]
\displaystyle\frac{\partial}{\partial\Psi_i}\sum_{t=p+1}^{N}\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big)'\Big(\mathrm{vec}(x_t)+\sum_{\ell=1}^{t-1}G_\ell\,\mathrm{vec}(x_{t-\ell})\Big)=O_n, & i=1,2,\dots,p,\\[6pt]
\displaystyle\frac{\partial}{\partial\Theta_j}\sum_{t=p+1}^{N}\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big)'\Big(\mathrm{vec}(x_t)+\sum_{\ell=1}^{t-1}G_\ell\,\mathrm{vec}(x_{t-\ell})\Big)=O_m, & j=1,2,\dots,q,\\[6pt]
\displaystyle\frac{\partial}{\partial\Xi_j}\sum_{t=p+1}^{N}\Big(\mathrm{vec}(x_t)+\sum_{k=1}^{t-1}G_k\,\mathrm{vec}(x_{t-k})\Big)'\Big(\mathrm{vec}(x_t)+\sum_{\ell=1}^{t-1}G_\ell\,\mathrm{vec}(x_{t-\ell})\Big)=O_n, & j=1,2,\dots,q.
\end{cases}
\]
Using the derivative of a scalar with respect to a matrix, it yields from Corollary 2.1 that
\[
\begin{cases}
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Phi_i}\Big(I_m\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Psi_i}\Big(I_n\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Theta_j}\Big(I_m\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_m, & j=1,2,\dots,q,\\[4pt]
\displaystyle\sum_{t=p+1}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Xi_j}\Big(I_n\otimes\big(\tilde{x}_t+\sum_{\ell=1}^{t-1}G_\ell\tilde{x}_{t-\ell}\big)\Big)=O_n, & j=1,2,\dots,q.
\end{cases}
\]

Appendix 3. Proof of Theorem 3.4

It yields from (30) that
\[
\mathrm{vec}(X(t))=-\sum_{k=1}^{+\infty}G_k\,\mathrm{vec}(X(t-k))+\mathrm{vec}(\epsilon(t)),\quad t\in\mathbb{N}. \tag{A6}
\]
For the sake of brevity, we denote $\tilde{X}_t=\mathrm{vec}(X(t))$, $t\in\mathbb{N}$, and $\tilde{x}_k=\mathrm{vec}(x_k)$, $k=1,2,\dots,N$. It yields from (A6) that
\[
\tilde{X}_t\mid\{\tilde{X}_{t-1},\tilde{X}_{t-2},\dots\}\sim N\Big(-\sum_{k=1}^{+\infty}G_k\tilde{X}_{t-k},\ \Sigma_{mn}\Big),\quad t\in\mathbb{N}, \tag{A7}
\]
where $\Sigma_{mn}$ is defined by (7).

Let $X(t)=O_{m\times n}$ for all $t\le 0$. It follows from (A7) that
\[
\tilde{X}_1\sim N(O_{mn\times 1},\Sigma_{mn}) \tag{A8}
\]
and
\[
\tilde{X}_t\mid\{\tilde{X}_{t-1},\tilde{X}_{t-2},\dots,\tilde{X}_1\}\sim N\Big(-\sum_{k=1}^{t-1}G_k\tilde{X}_{t-k},\ \Sigma_{mn}\Big),\quad t\in\mathbb{N}. \tag{A9}
\]
Thus, the likelihood function of $x_1,x_2,\dots,x_N$ is
\[
\begin{aligned}
&L(x_1,x_2,\dots,x_N;\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)\\
&\quad=L(\tilde{x}_1,\tilde{x}_2,\dots,\tilde{x}_N;\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)\\
&\quad=f(\tilde{x}_1)f(\tilde{x}_2\mid\{\tilde{x}_1\})f(\tilde{x}_3\mid\{\tilde{x}_2,\tilde{x}_1\})\cdots f(\tilde{x}_N\mid\{\tilde{x}_{N-1},\tilde{x}_{N-2},\dots,\tilde{x}_1\})\\
&\quad=(2\pi)^{-\frac{Nmn}{2}}|\Sigma_{mn}|^{-\frac{N}{2}}\exp\Big\{-\frac{1}{2}\sum_{t=1}^{N}\Big(\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}\Big)'\Sigma_{mn}^{-1}\Big(\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}\Big)\Big\},
\end{aligned}
\]
where $f(\cdot)$ denotes the probability density function, and we stipulate that $\sum_{k=1}^{0}(\cdot)$ equals the zero vector or zero matrix as needed. Therefore, the log-likelihood function of $x_1,x_2,\dots,x_N$ is
\[
\begin{aligned}
&\ell(x_1,x_2,\dots,x_N;\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)\\
&\quad=\ln\big(L(x_1,x_2,\dots,x_N;\Phi_1,\dots,\Phi_p,\Psi_1,\dots,\Psi_p,\Theta_1,\dots,\Theta_q,\Xi_1,\dots,\Xi_q)\big)\\
&\quad=-\frac{Nmn}{2}\ln(2\pi)-\frac{N}{2}\ln(|\Sigma_{mn}|)-\frac{1}{2}\sum_{t=1}^{N}\Big(\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}\Big)'\Sigma_{mn}^{-1}\Big(\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}\Big). \tag{A10}
\end{aligned}
\]
Using the derivative of a scalar with respect to a matrix, it yields from (A10) that
\[
\begin{cases}
\displaystyle\sum_{t=1}^{N}\frac{\partial}{\partial\Phi_i}\big(H(t,G)'\Sigma_{mn}^{-1}H(t,G)\big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=1}^{N}\frac{\partial}{\partial\Psi_i}\big(H(t,G)'\Sigma_{mn}^{-1}H(t,G)\big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=1}^{N}\frac{\partial}{\partial\Theta_j}\big(H(t,G)'\Sigma_{mn}^{-1}H(t,G)\big)=O_m, & j=1,2,\dots,q,\\[4pt]
\displaystyle\sum_{t=1}^{N}\frac{\partial}{\partial\Xi_j}\big(H(t,G)'\Sigma_{mn}^{-1}H(t,G)\big)=O_n, & j=1,2,\dots,q,\\[4pt]
\displaystyle N\frac{\partial\ln(|\Sigma_{mn}|)}{\partial\Sigma_{mn}}+\sum_{t=1}^{N}\frac{\partial}{\partial\Sigma_{mn}}\big(H(t,G)'\Sigma_{mn}^{-1}H(t,G)\big)=O_{mn},
\end{cases} \tag{A11}
\]
where $H(t,G)=\tilde{x}_t+\sum_{k=1}^{t-1}G_k\tilde{x}_{t-k}$. It yields from Corollary 2.2 and Property 2.3 that
\[
\begin{cases}
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Phi_i}\big(I_m\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_m, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Psi_i}\big(I_n\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_n, & i=1,2,\dots,p,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Theta_j}\big(I_m\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_m, & j=1,2,\dots,q,\\[4pt]
\displaystyle\sum_{t=2}^{N}\sum_{k=1}^{t-1}\frac{\partial (G_k\tilde{x}_{t-k})'}{\partial \Xi_j}\big(I_n\otimes[\Sigma_{mn}^{-1}H(t,G)]\big)=O_n, & j=1,2,\dots,q,\\[4pt]
\displaystyle\frac{1}{N}\sum_{t=1}^{N}\mathrm{Res}\big([\Sigma_{mn}^{-1}H(t,G)]\otimes[(\Sigma_{mn}^{-1})'H(t,G)],\,mn,\,mn\big)=(\Sigma_{mn}^{-1})'.
\end{cases}
\]