Sequential Analysis
Design Methods and Applications
Volume 43, 2024 - Issue 1
Research Articles

Control charts for high-dimensional time series with estimated in-control parameters

Pages 103-129 | Received 14 Jun 2023, Accepted 10 Nov 2023, Published online: 05 Jan 2024

Abstract

In this article, we study the effect of misspecification caused by fitting the target process in the Phase I analysis of the monitoring procedure on the behavior of several types of multivariate exponentially weighted moving average (MEWMA) control charts in the high-dimensional setting. In particular, the classical MEWMA control charts, whose control statistics are based on the exact and asymptotic Mahalanobis distance, are considered together with the novel approaches where the Euclidean distance and the diagonalized Euclidean distance are employed in the construction of control statistics. The high-dimensional distributions of the control statistics are deduced at each time. These results are later used to assess the performance of the considered control charts under misspecification. Both theoretical and empirical findings lead to the conclusion that the control charts based on the Euclidean distance and the diagonalized Euclidean distance are robust to misspecification for moderate dimensions of the data-generating model, whereas they tend to overestimate the in-control average run lengths (ARLs) in the case of larger dimensions. On the other hand, the control schemes based on the Mahalanobis distance are considerably affected by the estimation of the parameters of the target process, and their application results in drastically smaller values of the ARLs, especially when the dimension of the data-generating model is large.

1. INTRODUCTION

Statistical process control (SPC) plays a special role in the monitoring of production processes. The methods of SPC are also widely used in other fields, such as engineering, economics, medicine, chemistry, biology, and finance (see, e.g., Frisén 1992; Schipper and Schmid 2001; Sonesson and Bock 2003; Andersson, Bock, and Frisén 2004; Schmid and Tzotchev 2004; Lawson and Kleinman 2005; O. Bodnar 2007, 2009b; Messaoud, Weihs, and Hering 2008; Golosnoy et al. 2011; O. Bodnar and Schmid 2017).

In the setup of the monitoring procedure, the relationship between the observed and target processes should be specified. Let $\{Y_t\}$ denote the $p$-dimensional target process and let $\{X_t\}$ be the $p$-dimensional observed process. The target process is understood as a process that fulfills the quality requirements, whereas the observed process is the actual process. In the following, the relationship between the target and observed processes is described by the change point model

$$X_t=\begin{cases} Y_t & \text{for } t<\tau,\\ Y_t+a & \text{for } t\ge\tau,\end{cases}\qquad t\in\mathbb{Z}, \tag{1.1}$$

where $a\neq 0$ and $\tau\in\mathbb{N}\cup\{\infty\}$. If $\tau=\infty$, then the observed process is called an in-control process. Otherwise, it is called an out-of-control process. The symbols $E(\cdot)$, $\mathrm{Var}(\cdot)$, and $\mathrm{Cov}(\cdot)$ denote the mean, variance, and covariance matrix, respectively, computed under the assumption of the in-control state. We assume that the target process $\{Y_t\}$ is a weakly stationary process with mean vector $\mu$ and autocovariance matrix at lag $h$ denoted by $\Gamma(h)$.

Control charts are the most widespread tool of SPC (see Montgomery 2020). While the first control charts were designed for detecting changes in the location behavior based on univariate independent observations (cf. Shewhart 1926; Page 1954; Roberts 1959), they were extended to time series by Alwan and Roberts (1988), Schmid (1995, 1997a), Schmid and Schöne (1997), and Knoth and Schmid (2002), among others. Another line of research led to the surveillance of the parameters in multivariate models.

The first multivariate control chart was proposed in Hotelling (1947), who introduced a control scheme based on the Mahalanobis distance to monitor the mean vector of independent observations coming from the multivariate normal distribution. This approach was later extended by Crosier (1988), Pignatiello and Runger (1990), Lowry et al. (1992), and Ngai and Zhang (2001), who proposed several multivariate control charts based on the multivariate exponentially weighted moving average (MEWMA) recursion and the cumulative sum approach. Multivariate control charts for monitoring the parameters of multivariate time series have recently become an active topic of research. Control charts for the mean behavior were discussed in Theodossiou (1993), Kramer and Schmid (1997), O. Bodnar and Schmid (2007, 2011), and O. Bodnar (2009a), and control charts for monitoring the covariance matrix were introduced in Śliwa and Schmid (2005) and O. Bodnar and Schmid (2017).

Due to the rapid development of computer technology, monitoring the parameters of complex high-dimensional processes has become possible and has attracted many researchers to this challenging field. In the high-dimensional setting, it is assumed that the dimension of the data-generating model grows at the same rate as the sample size when the latter tends to infinity (see, e.g., Bai and Silverstein 2010; T. Bodnar, Dette, and Parolya 2019). K. Wang and Jiang (2009) considered a variable-selection-based multivariate SPC procedure under the high-dimensional setting, and a high-dimensional control chart for profile monitoring was suggested by Chen and Nembhard (2011). The latter control scheme is based on the adaptive Neyman test statistic for the coefficients of the discrete Fourier transform of profiles. Li et al. (2014) suggested a new control chart that starts monitoring with the second observation regardless of the dimensionality and reduces the average run length (ARL) in detecting early shifts in high-dimensional measurements. Z. Wang, Li, and Zhou (2017) constructed a hybrid control chart in the case of independent multivariate Poisson data, and R. Bodnar, Bodnar, and Schmid (2023) introduced several MEWMA-type control charts for high-dimensional time series where the Euclidean distance and the diagonalized Euclidean distance are employed in the construction of the control statistics instead of the Mahalanobis distance.

All of the abovementioned control charts were designed under the assumption that the parameters of the target process $\{Y_t\}$ are known. However, this assumption appears to be very restrictive in many practical situations (see, e.g., Kramer and Schmid 2000; Albers and Kallenberg 2004; Jensen et al. 2006; Saleh et al. 2015; Jardim, Chakraborti, and Epprecht 2020; Sarmiento et al. 2022). In practice, the target process has to be fitted in the Phase I analysis of the monitoring procedure, whereas the control statistics for Phase II are constructed by replacing the unknown true parameters with their corresponding estimators. This approach leads to misspecified control charts, and the effect of the misspecification has to be studied before the monitoring scheme is applied in practice. In this article, we contribute to the literature by developing new theoretical results that allow assessment of the effect of the misspecification in the high-dimensional setting. In particular, we show that the MEWMA control charts based on the Euclidean distance and the diagonalized Euclidean distance are quite robust to the misspecification caused by the estimation of the parameters of the target process in Phase I, whereas the control schemes based on the Mahalanobis distance can be strongly affected by the misspecification effect in high dimensions.

The derived results can be applied in several fields of science. One direction of possible applications leads to economics and finance, with a special emphasis on optimal portfolio theory. Several control charts for monitoring the structure of optimal portfolios were suggested in O. Bodnar and Schmid (2007), Golosnoy and Schmid (2007), O. Bodnar (2009b), and Golosnoy et al. (2011), among others. Whereas these procedures were developed for small dimensions of the data-generating model, the introduced approaches extend the existing methods to the high-dimensional case. Moreover, because the parameters of the data-generating model in Phase I are not known in financial applications, the obtained findings potentially introduce new methods for monitoring the structure of optimal portfolios, where the Euclidean distance and the diagonalized Euclidean distance are employed in the construction of the test statistics. Another line of possible applications is in environmental science, where problems of monitoring the parameters of high-dimensional spatial processes may be present (see, e.g., Otto and Schmid 2023).

The rest of the article is organized as follows. In Section 2, the MEWMA recursion is discussed and its basic properties are presented. Section 3 introduces the MEWMA control charts for high-dimensional time series under model misspecification, and the distributional properties of the considered statistics are derived in Section 4. The results of the simulation study are given in Section 5, and final remarks are summarized in Section 6.

2. CONTROL CHARTS BASED ON MEWMA RECURSIONS

The MEWMA recursion is defined by

$$Z_t=(I-R)Z_{t-1}+RX_t,\qquad t\ge 1, \tag{2.1}$$

with $Z_0=\mu$. In the following, we set $R=\mathrm{diag}(r_1,\ldots,r_p)$ with $r_i\in(0,1]$ for $i=1,\ldots,p$ being known and deterministic (see, e.g., Qiu 2013; Montgomery 2020).
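As an illustration, recursion (2.1) can be sketched in a few lines of NumPy. The function name `mewma_path`, the row-wise data layout, and the simulated standard normal sample are assumptions made for this example only.

```python
import numpy as np

def mewma_path(x, mu, r):
    """Return Z_1, ..., Z_n from Z_t = (I - R) Z_{t-1} + R X_t with Z_0 = mu.

    x  : array of shape (n, p); row k holds X_{k+1}
    mu : in-control mean vector, used as the starting value Z_0
    r  : smoothing weights r_1, ..., r_p in (0, 1], i.e. the diagonal of R
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    z = np.empty_like(x)
    prev = np.asarray(mu, dtype=float)
    for k in range(x.shape[0]):
        # R is diagonal, so the matrix products reduce to elementwise products
        prev = (1.0 - r) * prev + r * x[k]
        z[k] = prev
    return z

rng = np.random.default_rng(1)
x = rng.standard_normal((50, 4))   # hypothetical in-control sample with mu = 0
z = mewma_path(x, mu=np.zeros(4), r=np.full(4, 0.2))
```

With $r_i=1$ for all $i$ the recursion returns the observations themselves, which is a quick sanity check of an implementation.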

Following Kramer and Schmid (1997), it holds for $t\in\mathbb{N}$ and $p$ fixed that

$$E(Z_t)=\mu+a_{t-\tau}\xrightarrow[t\to\infty]{}\mu+a\,I_{\mathbb{N}}(\tau)\quad\text{with}\quad a_{t-\tau}=\big(I-(I-R)^{t-\tau+1}\big)\,a\,I_{\{0,1,\ldots\}}(t-\tau)$$

and

$$\Sigma_t=\mathrm{Cov}(Z_t)=R\sum_{i,j=0}^{t-1}(I-R)^i\,\Gamma(j-i)\,(I-R)^jR\xrightarrow[t\to\infty]{}\Sigma_l=R\sum_{i,j=0}^{\infty}(I-R)^i\,\Gamma(j-i)\,(I-R)^jR,$$

provided that $\{\Gamma(v)\}$ is absolutely summable; that is, $\sum_{v=0}^{\infty}\|\Gamma(v)\|<\infty$, where $\|\cdot\|$ denotes the Euclidean norm (see, e.g., Kramer and Schmid 1997). Furthermore, to ensure that $\Sigma_t$ is positive definite, it is assumed that the covariance matrix of $(Y_1^\top,\ldots,Y_t^\top)^\top$, defined as a block matrix with the $(j,i)$th block equal to $\Gamma(j-i)$, is positive definite for any $t$. Alternatively, the positive definiteness of $\Sigma_t$ can be ensured by assuming that the process $\{Y_t\}$ has a positive definite spectral density matrix denoted by $f(\lambda)$, because for any vector $u$ it holds that

$$\begin{aligned}
u^\top\Sigma_tu&=u^\top R\sum_{v,j=0}^{t-1}(I-R)^v\left(\int_{-\pi}^{\pi}e^{i(j-v)\lambda}f(\lambda)\,d\lambda\right)(I-R)^jRu\\
&=\int_{-\pi}^{\pi}\left(\sum_{v=0}^{t-1}(I-R)^ve^{-iv\lambda}Ru\right)^{\top}f(\lambda)\left(\sum_{j=0}^{t-1}(I-R)^je^{ij\lambda}Ru\right)d\lambda\ \ge\ 0
\end{aligned}$$

with equality if and only if

$$\sum_{j=0}^{t-1}(I-R)^je^{ij\lambda}Ru=\big(I-(I-R)e^{i\lambda}\big)^{-1}\big(I-(I-R)^te^{it\lambda}\big)Ru=0,$$

which is equivalent to $u=0$ due to the definition of $R$ at the beginning of Section 2. It is noted that the spectral density matrix $f(\lambda)$ is positive definite for many stationary multivariate time series, for example, for a stationary vector autoregressive moving average (VARMA) process. Similar results in the univariate case can be found in Schmid (1997b). Whereas the mean vector of $Z_t$ depends on whether the observed process is in the in-control state or in the out-of-control state, its covariance matrix is the same in both states. In the following, we will always assume that $\mathrm{rk}(\Sigma_t)=p=\mathrm{rk}(\Sigma_l)$.
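The double sum defining $\Sigma_t$ can be evaluated directly when $R$ is diagonal. The following is a minimal sketch; the function name `mewma_cov` and the independent-observation example (with $\Gamma(h)=0$ for $h\neq 0$) are assumptions chosen because the result then has a simple closed form, $\Sigma_t = r\,(1-(1-r)^{2t})/(2-r)\,\Gamma(0)$ for a common weight $r$.

```python
import numpy as np

def mewma_cov(gamma, r, t):
    """Sigma_t = R sum_{i,j=0}^{t-1} (I-R)^i Gamma(j-i) (I-R)^j R for diagonal R.

    gamma : callable h -> Gamma(h) as a (p, p) array, also for negative h
    r     : diagonal of R
    t     : time index (number of terms in each sum)
    """
    r = np.asarray(r, dtype=float)
    p = r.size
    s = np.zeros((p, p))
    for i in range(t):
        wi = (1.0 - r) ** i
        for j in range(t):
            wj = (1.0 - r) ** j
            # diag(wi) @ G @ diag(wj) equals G scaled entrywise by outer(wi, wj)
            s += np.outer(wi, wj) * gamma(j - i)
    return np.outer(r, r) * s

# Hypothetical example: independent observations with covariance sigma0
sigma0 = np.array([[1.0, 0.3], [0.3, 2.0]])
gamma_iid = lambda h: sigma0 if h == 0 else np.zeros((2, 2))
r = np.array([0.1, 0.1])
sigma_5 = mewma_cov(gamma_iid, r, t=5)       # exact covariance at t = 5
sigma_200 = mewma_cov(gamma_iid, r, t=200)   # numerically close to the limit
```

For a dependent process such as a VAR(1), the same function applies with `gamma` returning the corresponding autocovariance matrices; the double loop costs $O(t^2)$ matrix operations, so it is meant for illustration rather than for large $t$.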

The first control chart based on the MEWMA recursion was suggested by Lowry et al. (1992) for independent multivariate observations, and Kramer and Schmid (1997) extended this chart to monitor changes in the mean vector of a multivariate time series. This control chart is based on the Mahalanobis distance $(Z_t-\mu)^\top\Sigma_t^{-1}(Z_t-\mu)$. Because for this scheme the covariance matrix has to be determined at each time point, practitioners prefer to work with the statistic $(Z_t-\mu)^\top\Sigma_l^{-1}(Z_t-\mu)$, where the exact covariance matrix of $Z_t$ is replaced by its limit as $t$ tends to infinity.

Recently, R. Bodnar, Bodnar, and Schmid (2023) proposed two further versions of control statistics based on the MEWMA recursion. The first one uses the Euclidean distance to compute the control statistic at each time $t$ and is given by $(Z_t-\mu)^\top(Z_t-\mu)$. Let $\Sigma_{d;t}$ be a diagonal matrix that consists of the diagonal elements of $\Sigma_t$. The second control statistic proposed in R. Bodnar, Bodnar, and Schmid (2023) is based on the diagonalized Euclidean distance expressed as $(Z_t-\mu)^\top\Sigma_{d;t}^{-1}(Z_t-\mu)$.

Using these approaches, R. Bodnar, Bodnar, and Schmid (2023) developed suitable control statistics by normalizing these quantities. They distinguished between several possibilities. The control statistics are centered by subtracting the exact in-control mean, the asymptotic in-control mean as $t$ goes to infinity, or the asymptotic in-control mean as $p$ goes to infinity. Further, they normalized these expressions by dividing by the square root of the exact variance, the asymptotic variance as $t$ goes to infinity, or the asymptotic variance as $p$ goes to infinity. Altogether, 12 control schemes were considered and compared with each other.

3. MEWMA CONTROL CHARTS WITH ESTIMATED PARAMETERS

The control charts in the previous section depend on certain parameters, for example $\mu$ and $\Gamma(h)$, $h\ge 0$, which were assumed to be known in previous studies. In practice, however, these parameters have to be estimated. This is usually done by a prerun in engineering or by using a historical sample in, for example, finance. In the Phase I analysis, the unknown parameters of the target process are estimated from a previous sample, and these parameter estimators are then used within the Phase II analysis, the monitoring phase.

We will assume in the following that a sample of the underlying target process (that is, for $\tau=\infty$) is available to estimate the parameters in Phase I. This sample is assumed to be independent of the observation vectors in Phase II, which are used to construct the MEWMA recursion and the corresponding control charts to monitor the mean behavior of the underlying high-dimensional time series.

To run a control chart, one should estimate the expectation and the autocovariance matrices of the underlying stationary process using the sample from Phase I by a certain estimation procedure. As such, the estimators of $\mu$ and $\Gamma(h)$, $h\ge 0$, are obtained and are used instead of the true population counterparts in the definition of the control statistics. In particular, this means that the estimators of $\mu$ and $\Gamma(h)$, denoted by $\mu^*$ and $\Gamma(h)^*$, are assumed to be deterministic in practice. Of course, we cannot estimate $\Gamma(h)$ for all values of $h$ for an arbitrary stationary process. However, it is usually assumed that the underlying target process $\{Y_t\}$ follows a vector autoregressive (VAR) or VARMA process, and then the autocovariance matrices can be estimated for all $h$ using the estimators of the coefficient matrices and the covariance matrix of the white noise process.

If $\{Y_t\}$ follows a VAR(1) process given by

$$Y_t=\mu+\Phi(Y_{t-1}-\mu)+\varepsilon_t, \tag{3.1}$$

where $\{\varepsilon_t\}$ are independent and normally distributed with $E(\varepsilon_t)=0$ and $\mathrm{Cov}(\varepsilon_t,\varepsilon_t)=\Sigma$, then it holds that $E(Y_t)=\mu=E(X_t)$ and (see, e.g., Brockwell and Davis 1991; Reinsel 1993; Lütkepohl 2005)

$$\Gamma(h)=\Phi^h\Gamma(0)\quad\text{and}\quad\Gamma(-h)=\Gamma(h)^\top\quad\text{for } h=1,2,\ldots, \tag{3.2}$$

where $\Gamma(0)$ is the solution of the following matrix equation:

$$\Gamma(0)=\Phi\,\Gamma(0)\,\Phi^\top+\Sigma. \tag{3.3}$$
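Equations (3.2) and (3.3) translate into a short computation. The sketch below solves (3.3) through the identity $\mathrm{vec}(\Phi G\Phi^\top)=(\Phi\otimes\Phi)\,\mathrm{vec}(G)$ for the column-stacking vec operator; the function names and the numerical values of $\Phi$ and $\Sigma$ are hypothetical.

```python
import numpy as np

def var1_gamma0(phi, sigma):
    """Solve Gamma(0) = Phi Gamma(0) Phi^T + Sigma for a stationary VAR(1)."""
    p = phi.shape[0]
    # vec(Gamma(0)) = (I - Phi kron Phi)^{-1} vec(Sigma), column-major vec
    lhs = np.eye(p * p) - np.kron(phi, phi)
    vec_g = np.linalg.solve(lhs, sigma.reshape(-1, order="F"))
    return vec_g.reshape((p, p), order="F")

def var1_gamma(phi, gamma0, h):
    """Gamma(h) = Phi^h Gamma(0) for h >= 0 and Gamma(-h) = Gamma(h)^T."""
    if h >= 0:
        return np.linalg.matrix_power(phi, h) @ gamma0
    return var1_gamma(phi, gamma0, -h).T

phi = np.array([[0.5, 0.1], [0.0, 0.3]])   # hypothetical coefficient matrix
sigma = np.eye(2)                          # hypothetical white noise covariance
gamma0 = var1_gamma0(phi, sigma)
gamma_minus2 = var1_gamma(phi, gamma0, -2)
```

The Kronecker system has dimension $p^2$, so this direct approach is only practical for small $p$; for larger dimensions a dedicated discrete Lyapunov solver would be the more natural choice.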

For estimating the autocovariance matrices $\Gamma(h)$ for $h\ge 0$, the coefficient matrix $\Phi$, and the covariance matrix $\Sigma$ of the white noise process, various procedures have been proposed in the statistical literature (e.g., Brockwell and Davis 1991; Hamilton 1994). In the following, three estimation methods are considered.

Let $X_{1:N_0}=(X_1,\ldots,X_{N_0})$ denote the $p\times N_0$ observation matrix consisting of $N_0$ observation vectors taken in Phase I. In what follows, it is assumed that $N_0>p$. This assumption ensures that the sample covariance matrix is positive definite with probability 1 when an independent sample is taken from the normal distribution (see, e.g., theorem 3.1.4 in Muirhead 1982). In the first and second approaches, the mean vector $\mu$ is estimated by its sample counterpart given by (Brockwell and Davis 1991)

$$\bar{X}=\frac{1}{N_0}\sum_{t=1}^{N_0}X_t.$$

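The first method is based on the nonparametric estimation of the autocovariance matrices $\Gamma(h)$ expressed as

$$\hat{\Gamma}_{non}(h)=\begin{cases}\dfrac{1}{N_0}\displaystyle\sum_{t=1}^{N_0-h}(X_{t+h}-\bar{X})(X_t-\bar{X})^\top & \text{for } 0\le h\le N_0-1,\\[2ex] \dfrac{1}{N_0}\displaystyle\sum_{t=-h+1}^{N_0}(X_{t+h}-\bar{X})(X_t-\bar{X})^\top & \text{for } -N_0+1\le h<0.\end{cases}$$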

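The second approach uses the assumption of the VAR(1) model as given in (3.1). In this case, $\Gamma(0)$ and $\Gamma(1)$ are estimated nonparametrically by $\hat{\Gamma}_{non}(0)$ and $\hat{\Gamma}_{non}(1)$, respectively, which are then used to estimate $\Phi$ and $\Sigma$ by

$$\hat{\Phi}_V=\hat{\Gamma}_{non}(1)\,\hat{\Gamma}_{non}(0)^{-1}\quad\text{and}\quad\hat{\Sigma}_V=\hat{\Gamma}_{non}(0)-\hat{\Phi}_V\,\hat{\Gamma}_{non}(0)\,\hat{\Phi}_V^\top.$$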

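Finally, $\Gamma(h)$ and $\Gamma(-h)$ for $h\ge 1$ are estimated by

$$\hat{\Gamma}_V(h)=\hat{\Phi}_V^h\,\hat{\Gamma}_{non}(0)\quad\text{and}\quad\hat{\Gamma}_V(-h)=\hat{\Gamma}_V(h)^\top.$$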
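The first two estimation approaches can be sketched as follows. The function names, the simulated VAR(1) sample (started at zero rather than from the stationary distribution), and the chosen $\Phi$ are assumptions for illustration only.

```python
import numpy as np

def gamma_non(x, h):
    """Nonparametric autocovariance estimator; x has shape (p, N0)."""
    if h < 0:
        # the sum over t = -h+1..N0 is the transpose of the estimator at lag -h
        return gamma_non(x, -h).T
    n0 = x.shape[1]
    xc = x - x.mean(axis=1, keepdims=True)
    # sum over t = 1..N0-h of (X_{t+h} - Xbar)(X_t - Xbar)^T, divided by N0
    return xc[:, h:] @ xc[:, :n0 - h].T / n0

def var1_moment_estimates(x):
    """Phi_V = Gamma_non(1) Gamma_non(0)^{-1}, Sigma_V = G0 - Phi_V G0 Phi_V^T."""
    g0, g1 = gamma_non(x, 0), gamma_non(x, 1)
    phi_hat = np.linalg.solve(g0, g1.T).T   # g0 is symmetric, so this is g1 @ inv(g0)
    sigma_hat = g0 - phi_hat @ g0 @ phi_hat.T
    return phi_hat, sigma_hat

# Hypothetical Phase I sample from a bivariate VAR(1) with mu = 0
rng = np.random.default_rng(2)
phi = np.array([[0.5, 0.1], [0.0, 0.3]])
n0 = 5000
x = np.zeros((2, n0))
for t in range(1, n0):
    x[:, t] = phi @ x[:, t - 1] + rng.standard_normal(2)
phi_hat, sigma_hat = var1_moment_estimates(x)
```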

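The third approach is based on the maximum likelihood estimation of $\nu=(I-\Phi)\mu$, $\Phi$, and $\Sigma$, where $I$ denotes the identity matrix. Following section 11.1 of Hamilton (1994), the maximum likelihood estimator of $\Pi=[\nu,\Phi]$ is given by

$$\hat{\Pi}_{ML}=[\hat{\nu}_{ML},\hat{\Phi}_{ML}]=\left[\sum_{t=1}^{N_0}X_tV_t^\top\right]\left[\sum_{t=1}^{N_0}V_tV_t^\top\right]^{-1}\quad\text{with}\quad V_t=\begin{bmatrix}1\\X_{t-1}\end{bmatrix}.$$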

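Furthermore, the maximum likelihood estimator for $\Sigma$ is expressed as

$$\hat{\Sigma}_{ML}=\frac{1}{T}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t^\top\quad\text{with}\quad\hat{\varepsilon}_t=Y_t-\hat{\Pi}V_t.$$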

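Finally, using (3.2) and (3.3), we get the maximum likelihood estimators of $\Gamma(h)$ given by

$$\mathrm{vec}(\hat{\Gamma}_{ML}(0))=(I-\hat{\Phi}_{ML}\otimes\hat{\Phi}_{ML})^{-1}\,\mathrm{vec}(\hat{\Sigma}_{ML}),\qquad\hat{\Gamma}_{ML}(h)=\hat{\Phi}_{ML}^h\,\hat{\Gamma}_{ML}(0),\qquad\hat{\Gamma}_{ML}(-h)=\hat{\Gamma}_{ML}(h)^\top,$$

where $\mathrm{vec}(\cdot)$ denotes the vec operator and the symbol $\otimes$ stands for the Kronecker product (see, e.g., Harville 1997).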
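A minimal sketch of the third approach is given below. It assumes the conditional version of the Gaussian likelihood, so the sums run over $t=2,\ldots,N_0$ with $X_1$ treated as the fixed initial value, and the residual covariance is divided by the number of residuals; these conventions, the function names, and the simulated data are assumptions of the example.

```python
import numpy as np

def var1_ml(x):
    """Least squares (conditional Gaussian ML) fit of X_t = nu + Phi X_{t-1} + eps_t.

    x has shape (p, N0); the fit conditions on the first column X_1.
    """
    n0 = x.shape[1]
    resp = x[:, 1:]                                     # X_2, ..., X_N0
    v = np.vstack([np.ones((1, n0 - 1)), x[:, :-1]])    # V_t = (1, X_{t-1}^T)^T
    # Pi = [sum X_t V_t^T][sum V_t V_t^T]^{-1}, computed without explicit inverse
    pi_hat = np.linalg.solve(v @ v.T, v @ resp.T).T
    resid = resp - pi_hat @ v
    sigma_hat = resid @ resid.T / (n0 - 1)
    return pi_hat[:, 0], pi_hat[:, 1:], sigma_hat

def var1_gamma0(phi, sigma):
    """vec(Gamma(0)) = (I - Phi kron Phi)^{-1} vec(Sigma), column-major vec."""
    p = phi.shape[0]
    vec_g = np.linalg.solve(np.eye(p * p) - np.kron(phi, phi),
                            sigma.reshape(-1, order="F"))
    return vec_g.reshape((p, p), order="F")

# Hypothetical Phase I sample from a bivariate VAR(1)
rng = np.random.default_rng(4)
phi = np.array([[0.5, 0.1], [0.0, 0.3]])
mu = np.array([1.0, -2.0])
n0 = 4000
x = np.empty((2, n0))
x[:, 0] = mu
for t in range(1, n0):
    x[:, t] = mu + phi @ (x[:, t - 1] - mu) + rng.standard_normal(2)

nu_hat, phi_hat, sigma_hat = var1_ml(x)
gamma0_hat = var1_gamma0(phi_hat, sigma_hat)
```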

In what follows, we use the symbol $*$ to denote the estimators of the population quantities that are used in the construction of the control statistics based on the MEWMA recursion. This leads to $\mu^*$, $\Sigma_t^*=R\sum_{i,j=0}^{t-1}(I-R)^i\,\Gamma(j-i)^*\,(I-R)^jR$, and $\Sigma_{d;t}^*$ for the matrix consisting of the diagonal elements of $\Sigma_t^*$. By analogy, $\Sigma_l^*$ is used, which is well defined if $\sum_{v=0}^{\infty}\|\Gamma(v)^*\|<\infty$. Further, we use $\Sigma_{d;l}^*$ for the limit of $\Sigma_{d;t}^*$ as $t$ tends to infinity.

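Replacing $\mu$, $\Sigma_t$, $\Sigma_l$, and $\Sigma_{d;t}$ by the misspecified values $\mu^*$, $\Sigma_t^*$, $\Sigma_l^*$, and $\Sigma_{d;t}^*$ in the quadratic forms discussed in Section 2, we get the misspecified Mahalanobis quantities

$$(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*),\qquad(Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)$$

and the misspecified quantities based on the Euclidean distance

$$(Z_t-\mu^*)^\top(Z_t-\mu^*),\qquad(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*).$$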

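R. Bodnar, Bodnar, and Schmid (2023) introduced several control charts based on the Euclidean distance. However, the effect of misspecification was not taken into account. In this article, we analyze the behavior of these control charts under misspecification. Following R. Bodnar, Bodnar, and Schmid (2023) and their notation, we consider the MEWMA control charts given by

$$\begin{aligned}
T_{1,t}^*&=\frac{(Z_t-\mu^*)^\top(Z_t-\mu^*)-\mathrm{tr}(\Sigma_t^*)}{\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}}, &
T_{2,t}^*&=\frac{(Z_t-\mu^*)^\top(Z_t-\mu^*)-\mathrm{tr}(\Sigma_l^*)}{\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}},\\
T_{3,t}^*&=\frac{(Z_t-\mu^*)^\top(Z_t-\mu^*)-\mathrm{tr}(\Sigma_t^*)}{\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}}, &
T_{4,t}^*&=\frac{(Z_t-\mu^*)^\top(Z_t-\mu^*)-\mathrm{tr}(\Sigma_l^*)}{\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}},\\
T_{6,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)-\mathrm{tr}(\Sigma_{d;t}^{*-1}\Sigma_t^*)}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}}, &
T_{7,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)-\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*)}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}},\\
T_{8,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)-\mathrm{tr}(\Sigma_{d;t}^{*-1}\Sigma_t^*)}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}}, &
T_{9,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)-\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*)}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}},\\
T_{Mah,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)-p}{\sqrt{2p}}, &
T_{MahInf,t}^*&=\frac{(Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)-\mathrm{tr}(\Sigma_l^{*-1}\Sigma_t^*)}{\sqrt{2\,\mathrm{tr}((\Sigma_l^{*-1}\Sigma_t^*)^2)}}.
\end{aligned}$$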
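To make the standardization concrete, the sketch below evaluates three of these statistics, $T_{1,t}^*$, $T_{6,t}^*$, and $T_{Mah,t}^*$, from plug-in quantities. The function name and the toy inputs are assumptions; the remaining statistics follow the same pattern with $\Sigma_l^*$ in place of $\Sigma_t^*$ where indicated.

```python
import numpy as np

def control_statistics(z, mu_star, sigma_t_star):
    """Return (T_1, T_6, T_Mah) at one time point from plug-in quantities."""
    d = np.asarray(z, dtype=float) - mu_star
    p = d.size
    # Euclidean distance, centred by tr(Sigma*) and scaled by sqrt(2 tr(Sigma*^2))
    t1 = (d @ d - np.trace(sigma_t_star)) / np.sqrt(
        2.0 * np.trace(sigma_t_star @ sigma_t_star))
    # Diagonalized Euclidean distance with m = Sigma_d*^{-1} Sigma*
    dinv = 1.0 / np.diag(sigma_t_star)
    m = dinv[:, None] * sigma_t_star
    t6 = (np.sum(d ** 2 * dinv) - np.trace(m)) / np.sqrt(2.0 * np.trace(m @ m))
    # Mahalanobis distance, centred by p and scaled by sqrt(2p)
    t_mah = (d @ np.linalg.solve(sigma_t_star, d) - p) / np.sqrt(2.0 * p)
    return t1, t6, t_mah

# Toy input: Sigma* = 2 I in dimension 4 and Z_t - mu* = (2, 0, 0, 0)
stats = control_statistics(np.array([2.0, 0.0, 0.0, 0.0]),
                           mu_star=np.zeros(4),
                           sigma_t_star=2.0 * np.eye(4))
```

In this toy case all three statistics coincide, because a scalar multiple of the identity makes the three distances proportional to each other.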

These statistics were studied in R. Bodnar, Bodnar, and Schmid (2023); however, here we additionally take misspecification into account. The statistics $T_{5,t}$ and $T_{10,t}$ of R. Bodnar, Bodnar, and Schmid (2023) are not considered in this article, because they make use of the limit of the first and second moments as $p$ tends to infinity, which in most cases is difficult to determine.

4. BEHAVIOR OF THE CONTROL STATISTICS UNDER MISSPECIFICATION

In this section, we analyze the distributional properties of the control statistics defined under misspecification in detail. In particular, the exact distributions of the quadratic forms present in the definition of the control statistics are derived and their asymptotic approximations are provided in two cases, when t tends to infinity and when p tends to infinity. These results shed light on the effect of misspecification on the performance of the considered MEWMA control charts, especially in the high-dimensional case.

The following notation will be used throughout the article:

$$a_{t-\tau}=\big(I-(I-R)^{t-\tau+1}\big)\,a\,I_{\{0,1,2,\ldots\}}(t-\tau). \tag{4.1}$$

In Theorems 4.1 and 4.2, the results are presented in the case of the control schemes based on the Mahalanobis distance, and Theorems 4.3 and 4.4 provide the results of the control chart based on the Euclidean distance and the diagonalized Euclidean distance, respectively.

Theorem 4.1.

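Let $\{Y_t\}$ be a stationary Gaussian process with $E(Y_t)=\mu$ and $\mathrm{Cov}(Y_{t+h},Y_t)=\Gamma(h)$. Let $\tau$ be fixed.

  (a) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$. Let $U_t$ be an orthogonal matrix such that
    $$U_t^\top\Sigma_t^{1/2}\Sigma_t^{*-1}\Sigma_t^{1/2}U_t=\mathrm{diag}(\lambda_{Mah,1,t},\ldots,\lambda_{Mah,p,t})$$
    and let $\delta_{Mah,t}=U_t^\top\Sigma_t^{-1/2}(\mu+a_{t-\tau}-\mu^*)=(\delta_{Mah,i,t})_{i=1,\ldots,p}$. Further, suppose that $\zeta_1,\ldots,\zeta_p$ are independent and standard normally distributed random variables.

    Then
    $$(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)\stackrel{d}{=}\sum_{i=1}^p\lambda_{Mah,i,t}(\zeta_i+\delta_{Mah,i,t})^2.$$
    Moreover,
    $$E\big((Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)\big)=\mathrm{tr}(\Sigma_t^{*-1}\Sigma_t)+(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^{*-1}(\mu+a_{t-\tau}-\mu^*)$$
    and
    $$\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)\big)=2\,\mathrm{tr}((\Sigma_t^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^{*-1}\Sigma_t\Sigma_t^{*-1}(\mu+a_{t-\tau}-\mu^*).$$

  (b) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_l)=\mathrm{rk}(\Sigma_t^*)=\mathrm{rk}(\Sigma_l^*)=p$ and that $\{\Gamma(v)\}$ and $\{\Gamma(v)^*\}$ are absolutely summable. Let $p$ be fixed and let $U_l$ be an orthogonal matrix such that
    $$U_l^\top\Sigma_l^{1/2}\Sigma_l^{*-1}\Sigma_l^{1/2}U_l=\mathrm{diag}(\lambda_{Mah,1},\ldots,\lambda_{Mah,p})$$
    and let $\delta_{Mah}=U_l^\top\Sigma_l^{-1/2}(\mu+a\,I_{\mathbb{N}}(\tau)-\mu^*)=(\delta_{Mah,i})_{i=1,\ldots,p}$.

    If, further,
    $$\lim_{t\to\infty}U_t=U_l\quad\text{and}\quad\lim_{t\to\infty}\Sigma_t^{1/2}=\Sigma_l^{1/2}, \tag{4.2}$$
    then the asymptotic distribution of $(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)$ as $t$ tends to infinity is equal to the distribution of $\sum_{i=1}^p\lambda_{Mah,i}(\zeta_i+\delta_{Mah,i})^2$ with $\zeta_i$ as in part (a).

  (c) Let $t$ be fixed. Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$ and that
    $$\lim_{p\to\infty}\frac{\max_{1\le i\le p}\lambda_{Mah,i,t}^2(1+2\delta_{Mah,i,t}^2)}{\sum_{i=1}^p\lambda_{Mah,i,t}^2(1+2\delta_{Mah,i,t}^2)}=0. \tag{4.3}$$
    Then
    $$\frac{(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)-E\big((Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)\big)}{\sqrt{\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)\big)}}\xrightarrow[p\to\infty]{d}N(0,1).$$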

Proof.

It holds that $Z_t-\mu\sim N_p(a_{t-\tau},\Sigma_t)$ and thus $\Sigma_t^{-1/2}(Z_t-\mu^*)\sim N\big(\Sigma_t^{-1/2}(\mu+a_{t-\tau}-\mu^*),I\big)$. Consequently,

$$(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)=\big(\Sigma_t^{-1/2}(Z_t-\mu^*)\big)^\top\Sigma_t^{1/2}\Sigma_t^{*-1}\Sigma_t^{1/2}\big(\Sigma_t^{-1/2}(Z_t-\mu^*)\big).$$

Thus, the proof of part (a) follows immediately using chapter 3.1a, corollary 3.2b.1, and theorem 3.2b.2 of Mathai and Provost (1992).

To prove part (b), we make use of (a) and the fact that $\Sigma_t\to\Sigma_l$ as $t\to\infty$. Further, we use that the eigenvalues of a matrix are continuous functions of the elements of the matrix (cf. theorem 9.6 in Lax 2007). Consequently, $\lambda_{Mah,i,t}\to\lambda_{Mah,i}$ as $t\to\infty$. Because of (4.2), part (b) follows.

To prove part (c), we apply lemma 7.1 of R. Bodnar, Bodnar, and Schmid (2023). □
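The stochastic representation and the moment formulas of Theorem 4.1(a) can be checked numerically. The sketch below uses hypothetical matrices for $\Sigma_t$ and $\Sigma_t^*$ and a hypothetical vector `m` standing for $\mu+a_{t-\tau}-\mu^*$; it compares the closed-form mean and variance with the moments implied by the eigen-representation and with a Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(3)
p, n_rep = 5, 100_000
b = rng.standard_normal((p, p))
sigma_t = b @ b.T + p * np.eye(p)        # hypothetical true Cov(Z_t)
sigma_star = np.diag(np.diag(sigma_t))   # hypothetical misspecified Sigma_t*
m = 0.5 * np.ones(p)                     # stands for mu + a_{t-tau} - mu*

a_inv = np.linalg.inv(sigma_star)
mean_formula = np.trace(a_inv @ sigma_t) + m @ a_inv @ m
var_formula = (2.0 * np.trace(a_inv @ sigma_t @ a_inv @ sigma_t)
               + 4.0 * m @ a_inv @ sigma_t @ a_inv @ m)

# Eigen-representation: sum_i lambda_i (zeta_i + delta_i)^2
w, v = np.linalg.eigh(sigma_t)
root = v @ np.diag(np.sqrt(w)) @ v.T            # Sigma_t^{1/2}
inv_root = v @ np.diag(1.0 / np.sqrt(w)) @ v.T  # Sigma_t^{-1/2}
lam, u = np.linalg.eigh(root @ a_inv @ root)
delta = u.T @ inv_root @ m
mean_repr = np.sum(lam * (1.0 + delta ** 2))
var_repr = np.sum(lam ** 2 * (2.0 + 4.0 * delta ** 2))

# Direct simulation of (Z_t - mu*)^T Sigma_t*^{-1} (Z_t - mu*)
d = rng.multivariate_normal(m, sigma_t, size=n_rep)
q = np.einsum("ij,jk,ik->i", d, a_inv, d)
```

The quantities `mean_repr` and `var_repr` agree with the formulas up to rounding, while `q.mean()` and `q.var()` agree only up to simulation error.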

Condition (4.2) is needed to prove part (b). In principle, we need the eigenvectors of a symmetric matrix to be continuous functions of the elements of the matrix. This is in general not fulfilled, and therefore we have to assume (4.2). A detailed discussion of this problem is given in, for example, chapter 9 of Lax (2007). A sufficient condition for (4.2) to hold is that all eigenvalues of $\Sigma_t^{1/2}\Sigma_t^{*-1}\Sigma_t^{1/2}$ are simple. Condition (4.3) is a technical one and is needed to apply a central limit theorem for a nonidentically distributed random sequence. In particular, this condition ensures that there is no dominating summand (with considerably larger variance) in the infinite sum of the random variables. Namely, the quadratic form $(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)$ can be presented as a weighted sum of independent random variables that are all noncentral $\chi^2$-distributed with one degree of freedom. Furthermore, the denominator in condition (4.3) is equal to the variance of the quadratic form up to the factor 2, with the summands corresponding to the variance contribution of each random variable in the stochastic representation of $(Z_t-\mu^*)^\top\Sigma_t^{*-1}(Z_t-\mu^*)$ mentioned above. As such, condition (4.3) requires that no summand in the variance decomposition be dominating.

Now we study the statistic based on the limit covariance matrix.

Theorem 4.2.

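Let $\{Y_t\}$ be a stationary Gaussian process with $E(Y_t)=\mu$ and $\mathrm{Cov}(Y_{t+h},Y_t)=\Gamma(h)$. Let $\tau$ be fixed.

  (a) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$. Let $U_{l;t}$ be an orthogonal matrix such that
    $$U_{l;t}^\top\Sigma_t^{1/2}\Sigma_l^{*-1}\Sigma_t^{1/2}U_{l;t}=\mathrm{diag}(\lambda_{MahInf,1,t},\ldots,\lambda_{MahInf,p,t})$$
    and let $\delta_{MahInf,t}=U_{l;t}^\top\Sigma_t^{-1/2}(\mu+a_{t-\tau}-\mu^*)=(\delta_{MahInf,i,t})_{i=1,\ldots,p}$. Further, suppose that $\zeta_1,\ldots,\zeta_p$ are independent and standard normally distributed random variables.

    Then
    $$(Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)\stackrel{d}{=}\sum_{i=1}^p\lambda_{MahInf,i,t}(\zeta_i+\delta_{MahInf,i,t})^2.$$
    Moreover,
    $$E\big((Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)\big)=\mathrm{tr}(\Sigma_l^{*-1}\Sigma_t)+(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_l^{*-1}(\mu+a_{t-\tau}-\mu^*)$$
    and
    $$\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)\big)=2\,\mathrm{tr}((\Sigma_l^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_l^{*-1}\Sigma_t\Sigma_l^{*-1}(\mu+a_{t-\tau}-\mu^*).$$

  (b) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_l)=\mathrm{rk}(\Sigma_t^*)=\mathrm{rk}(\Sigma_l^*)=p$ and that $\{\Gamma(v)\}$ and $\{\Gamma(v)^*\}$ are absolutely summable. Let $p$ be fixed and let $U_l$ be the orthogonal matrix defined in Theorem 4.1(b). If, further,
    $$\lim_{t\to\infty}U_{l;t}=U_l\quad\text{and}\quad\lim_{t\to\infty}\Sigma_t^{1/2}=\Sigma_l^{1/2}, \tag{4.4}$$
    then the asymptotic distribution of $(Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)$ as $t$ tends to infinity is equal to the distribution of $\sum_{i=1}^p\lambda_{Mah,i}(\zeta_i+\delta_{Mah,i})^2$ with $\zeta_i$ as in part (a).

  (c) Let $t$ be fixed. Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$ and that
    $$\lim_{p\to\infty}\frac{\max_{1\le i\le p}\lambda_{MahInf,i,t}^2(1+2\delta_{MahInf,i,t}^2)}{\sum_{i=1}^p\lambda_{MahInf,i,t}^2(1+2\delta_{MahInf,i,t}^2)}=0.$$
    Then
    $$\frac{(Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)-E\big((Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)\big)}{\sqrt{\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_l^{*-1}(Z_t-\mu^*)\big)}}\xrightarrow[p\to\infty]{d}N(0,1).$$

Of course, the limit distribution in Theorem 4.2(b) is the same as that in Theorem 4.1(b).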

Next we analyze the statistics based on the Euclidean distance and the diagonalized Euclidean distance.

Theorem 4.3.

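Let $\{Y_t\}$ be a stationary Gaussian process with $E(Y_t)=\mu$ and $\mathrm{Cov}(Y_{t+h},Y_t)=\Gamma(h)$. Let $\tau$ be fixed.

  (a) Suppose that $\mathrm{rk}(\Sigma_t)=p$. Let $\tilde{U}_t$ be an orthogonal matrix such that
    $$\tilde{U}_t^\top\Sigma_t\tilde{U}_t=\mathrm{diag}(\lambda_{Eu,1,t},\ldots,\lambda_{Eu,p,t})$$
    and let $\delta_{Eu,t}=\tilde{U}_t^\top\Sigma_t^{-1/2}(\mu+a_{t-\tau}-\mu^*)=(\delta_{Eu,i,t})_{i=1,\ldots,p}$. Further, suppose that $\zeta_1,\ldots,\zeta_p$ are independent and standard normally distributed random variables.

    Then
    $$(Z_t-\mu^*)^\top(Z_t-\mu^*)\stackrel{d}{=}\sum_{i=1}^p\lambda_{Eu,i,t}(\zeta_i+\delta_{Eu,i,t})^2.$$
    Moreover,
    $$E\big((Z_t-\mu^*)^\top(Z_t-\mu^*)\big)=\mathrm{tr}(\Sigma_t)+(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t(\mu+a_{t-\tau}-\mu^*)$$
    and
    $$\mathrm{Var}\big((Z_t-\mu^*)^\top(Z_t-\mu^*)\big)=2\,\mathrm{tr}(\Sigma_t^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^2(\mu+a_{t-\tau}-\mu^*).$$

  (b) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_l)=p$ and that $\{\Gamma(v)\}$ is absolutely summable. Let $p$ be fixed and let $\tilde{U}_l$ be an orthogonal matrix such that
    $$\tilde{U}_l^\top\Sigma_l\tilde{U}_l=\mathrm{diag}(\lambda_{Eu,1},\ldots,\lambda_{Eu,p})$$
    and let $\delta_{Eu}=\tilde{U}_l^\top\Sigma_l^{-1/2}(\mu+a\,I_{\mathbb{N}}(\tau)-\mu^*)=(\delta_{Eu,i})_{i=1,\ldots,p}$. If, further,
    $$\lim_{t\to\infty}\tilde{U}_t=\tilde{U}_l\quad\text{and}\quad\lim_{t\to\infty}\Sigma_t^{1/2}=\Sigma_l^{1/2}, \tag{4.5}$$
    then the asymptotic distribution of $(Z_t-\mu^*)^\top(Z_t-\mu^*)$ as $t$ tends to infinity is equal to the distribution of $\sum_{i=1}^p\lambda_{Eu,i}(\zeta_i+\delta_{Eu,i})^2$ with $\zeta_i$ as in part (a).

  (c) Let $t$ be fixed. Suppose that $\mathrm{rk}(\Sigma_t)=p$ and that
    $$\lim_{p\to\infty}\frac{\max_{1\le i\le p}\lambda_{Eu,i,t}^2(1+2\delta_{Eu,i,t}^2)}{\sum_{i=1}^p\lambda_{Eu,i,t}^2(1+2\delta_{Eu,i,t}^2)}=0.$$
    Then
    $$\frac{(Z_t-\mu^*)^\top(Z_t-\mu^*)-E\big((Z_t-\mu^*)^\top(Z_t-\mu^*)\big)}{\sqrt{\mathrm{Var}\big((Z_t-\mu^*)^\top(Z_t-\mu^*)\big)}}\xrightarrow[p\to\infty]{d}N(0,1).$$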

Theorem 4.4.

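Let $\{Y_t\}$ be a stationary Gaussian process with $E(Y_t)=\mu$ and $\mathrm{Cov}(Y_{t+h},Y_t)=\Gamma(h)$. Let $\tau$ be fixed.

  (a) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$. Let $U_{d;t}$ be an orthogonal matrix such that
    $$U_{d;t}^\top\Sigma_t^{1/2}\Sigma_{d;t}^{*-1}\Sigma_t^{1/2}U_{d;t}=\mathrm{diag}(\lambda_{Eu,d;1,t},\ldots,\lambda_{Eu,d;p,t})$$
    and let $\delta_{Eu,d;t}=U_{d;t}^\top\Sigma_t^{-1/2}(\mu+a_{t-\tau}-\mu^*)=(\delta_{Eu,d;i,t})_{i=1,\ldots,p}$. Further, suppose that $\zeta_1,\ldots,\zeta_p$ are independent and standard normally distributed random variables.

    Then
    $$(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)\stackrel{d}{=}\sum_{i=1}^p\lambda_{Eu,d;i,t}(\zeta_i+\delta_{Eu,d;i,t})^2.$$
    Moreover,
    $$E\big((Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)\big)=\mathrm{tr}(\Sigma_{d;t}^{*-1}\Sigma_t)+(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)$$
    and
    $$\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)\big)=2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}\Sigma_t\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*).$$

  (b) Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_l)=\mathrm{rk}(\Sigma_t^*)=\mathrm{rk}(\Sigma_l^*)=p$ and that $\{\Gamma(v)\}$ and $\{\Gamma(v)^*\}$ are absolutely summable. Let $p$ be fixed and let $U_d$ be an orthogonal matrix such that
    $$U_d^\top\Sigma_l^{1/2}\Sigma_{d;t}^{*-1}\Sigma_l^{1/2}U_d=\mathrm{diag}(\lambda_{Eu,d;1},\ldots,\lambda_{Eu,d;p})$$
    and let $\delta_{Eu,d}=U_d^\top\Sigma_l^{-1/2}(\mu+a\,I_{\mathbb{N}}(\tau)-\mu^*)=(\delta_{Eu,d;i})_{i=1,\ldots,p}$.

    If, further,
    $$\lim_{t\to\infty}U_{d;t}=U_d\quad\text{and}\quad\lim_{t\to\infty}\Sigma_t^{1/2}=\Sigma_l^{1/2}, \tag{4.6}$$
    then the asymptotic distribution of $(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)$ as $t$ tends to infinity is equal to the distribution of $\sum_{i=1}^p\lambda_{Eu,d;i}(\zeta_i+\delta_{Eu,d;i})^2$ with $\zeta_i$ as in part (a).

  (c) Let $t$ be fixed. Suppose that $\mathrm{rk}(\Sigma_t)=\mathrm{rk}(\Sigma_t^*)=p$ and that
    $$\lim_{p\to\infty}\frac{\max_{1\le i\le p}\lambda_{Eu,d;i,t}^2(1+2\delta_{Eu,d;i,t}^2)}{\sum_{i=1}^p\lambda_{Eu,d;i,t}^2(1+2\delta_{Eu,d;i,t}^2)}=0.$$
    Then
    $$\frac{(Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)-E\big((Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)\big)}{\sqrt{\mathrm{Var}\big((Z_t-\mu^*)^\top\Sigma_{d;t}^{*-1}(Z_t-\mu^*)\big)}}\xrightarrow[p\to\infty]{d}N(0,1).$$

Note that in practice the in-control mean is frequently known and need not be estimated. This case is obtained by setting $\mu^*=\mu$ in the above formulas.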

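Using the results of the theorems, we get under the stated conditions that, as $t\to\infty$ with $p$ fixed,

$$\begin{aligned}
P(T_{1,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,i}(\zeta_i+\delta_{Eu,i})^2\le\mathrm{tr}(\Sigma_t^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}\Big),\\
P(T_{2,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,i}(\zeta_i+\delta_{Eu,i})^2\le\mathrm{tr}(\Sigma_l^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}\Big),\\
P(T_{3,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,i}(\zeta_i+\delta_{Eu,i})^2\le\mathrm{tr}(\Sigma_t^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}\Big),\\
P(T_{4,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,i}(\zeta_i+\delta_{Eu,i})^2\le\mathrm{tr}(\Sigma_l^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}\Big),\\
P(T_{6,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,d;i}(\zeta_i+\delta_{Eu,d;i})^2\le\mathrm{tr}(\Sigma_{d;t}^{*-1}\Sigma_t^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}\Big),\\
P(T_{7,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,d;i}(\zeta_i+\delta_{Eu,d;i})^2\le\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}\Big),\\
P(T_{8,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,d;i}(\zeta_i+\delta_{Eu,d;i})^2\le\mathrm{tr}(\Sigma_{d;t}^{*-1}\Sigma_t^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}\Big),\\
P(T_{9,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Eu,d;i}(\zeta_i+\delta_{Eu,d;i})^2\le\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}\Big),\\
P(T_{Mah,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Mah,i}(\zeta_i+\delta_{Mah,i})^2\le p+x\sqrt{2p}\Big),\\
P(T_{MahInf,t}^*\le x)&\approx P\Big(\sum_{i=1}^p\lambda_{Mah,i}(\zeta_i+\delta_{Mah,i})^2\le\mathrm{tr}(\Sigma_l^{*-1}\Sigma_t^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_l^{*-1}\Sigma_t^*)^2)}\Big).
\end{aligned}$$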

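When $t$ is fixed and $p$ tends to infinity, it holds under the conditions given in Section 4 that the control statistics are asymptotically normally distributed in the high-dimensional setting. Namely, with $\Phi(\cdot)$ denoting here the standard normal distribution function, we get that

$$\begin{aligned}
P(T_{1,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_t^*)-\mathrm{tr}(\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}}{\sqrt{2\,\mathrm{tr}((\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^2(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{2,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_l^*)-\mathrm{tr}(\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_t^{*2})}}{\sqrt{2\,\mathrm{tr}((\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^2(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{3,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_t^*)-\mathrm{tr}(\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}}{\sqrt{2\,\mathrm{tr}((\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^2(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{4,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_l^*)-\mathrm{tr}(\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}(\Sigma_l^{*2})}}{\sqrt{2\,\mathrm{tr}((\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^2(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{6,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_{d;t}^{*-1}(\Sigma_t^*-\Sigma_t))-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}\Sigma_t\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{7,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*-\Sigma_{d;t}^{*-1}\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t^*)^2)}}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}\Sigma_t\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{8,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_{d;t}^{*-1}(\Sigma_t^*-\Sigma_t))-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}\Sigma_t\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{9,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_{d;l}^{*-1}\Sigma_l^*-\Sigma_{d;t}^{*-1}\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_{d;l}^{*-1}\Sigma_l^*)^2)}}{\sqrt{2\,\mathrm{tr}((\Sigma_{d;t}^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_{d;t}^{*-1}\Sigma_t\Sigma_{d;t}^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{Mah,t}^*\le x)&\approx\Phi\left(\frac{p-\mathrm{tr}(\Sigma_t^{*-1}\Sigma_t)-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2p}}{\sqrt{2\,\mathrm{tr}((\Sigma_t^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_t^{*-1}\Sigma_t\Sigma_t^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right),\\
P(T_{MahInf,t}^*\le x)&\approx\Phi\left(\frac{\mathrm{tr}(\Sigma_l^{*-1}(\Sigma_t^*-\Sigma_t))-(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_l^{*-1}(\mu+a_{t-\tau}-\mu^*)+x\sqrt{2\,\mathrm{tr}((\Sigma_l^{*-1}\Sigma_t^*)^2)}}{\sqrt{2\,\mathrm{tr}((\Sigma_l^{*-1}\Sigma_t)^2)+4(\mu+a_{t-\tau}-\mu^*)^\top\Sigma_l^{*-1}\Sigma_t\Sigma_l^{*-1}(\mu+a_{t-\tau}-\mu^*)}}\right).
\end{aligned}$$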
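As a worked instance of the high-dimensional approximation, the sketch below evaluates the expression for $P(T_{Mah,t}^*\le x)$. The function names are assumptions, `m` stands for $\mu+a_{t-\tau}-\mu^*$, and the example uses a hypothetical misspecification $\Sigma_t^*=c\,\Sigma_t$ with known mean, for which the argument of the normal distribution function simplifies to $c\,x+(c-1)\sqrt{p/2}$.

```python
import math
import numpy as np

def normal_cdf(v):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def approx_prob_mah(x, sigma_t, sigma_t_star, m):
    """Normal approximation of P(T_Mah* <= x) for large p."""
    p = sigma_t.shape[0]
    b = np.linalg.solve(sigma_t_star, sigma_t)   # Sigma_t*^{-1} Sigma_t
    am = np.linalg.solve(sigma_t_star, m)        # Sigma_t*^{-1} m
    mean_q = np.trace(b) + m @ am
    # am @ sigma_t @ am equals m^T Sigma*^{-1} Sigma Sigma*^{-1} m by symmetry
    var_q = 2.0 * np.trace(b @ b) + 4.0 * am @ sigma_t @ am
    return normal_cdf((p + x * math.sqrt(2.0 * p) - mean_q) / math.sqrt(var_q))

p = 200
sigma_t = np.diag(np.linspace(1.0, 3.0, p))      # hypothetical true covariance
no_missp = approx_prob_mah(2.0, sigma_t, sigma_t, np.zeros(p))
overest = approx_prob_mah(2.0, sigma_t, 1.1 * sigma_t, np.zeros(p))
```

Without misspecification the approximation reduces to the standard normal distribution function at $x$. In the scaled example with $c=1.1$ and $p=200$ the argument becomes $3.2$ instead of $2$, which shows how a modest proportional error in the covariance matrix is amplified by the factor $\sqrt{p/2}$.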

We conclude this section with the analysis of the above asymptotic distributions obtained under misspecification, which are compared with the corresponding high-dimensional distribution obtained without the effect of misspecification; that is, when $\mu^*$ and $\Gamma(h)^*$ coincide with the true population values $\mu$ and $\Gamma(h)$. These results provide initial intuition about the effect of misspecification. A more detailed analysis is obtained via a Monte Carlo study and is presented in Section 5.

In the setup of the simulation study, we use the same data-generating model as described in Section 5.1. Moreover, to investigate the effects of the sample size, we set $N_0\in\{100,250,500\}$ when $p\in\{20,50\}$ and $N_0\in\{2000,5000,10000\}$ when $p\in\{500,1000\}$. The results for the four variants of the MEWMA control chart based on the Euclidean distance and the four variants based on the diagonalized Euclidean distance are depicted in Figures 1 through 4, and the findings obtained for the MEWMA approaches based on the exact and asymptotic Mahalanobis distance are summarized separately.

In the figures, we observe that the misspecification has only a minor effect on the in-control performance of the MEWMA control charts that are based on the Euclidean distance and diagonalized Euclidean distance in the case of moderate dimensions of the data-generating model (see Figures 1 and 2). All four curves in each plot are located very close to the density of the standard normal distribution, which is the limiting distribution of all considered control statistics when no misspecification is present; that is, when the parameters of the in-control process are known. Some minor departures are present only when the sample size in Phase I is equal to 100. Interestingly, when $p=20$, minor shifts of the densities to the right take place, whereas the asymptotic distributions under misspecification appear to have smaller variances when $p=50$.

Figure 1. Probabilities $P(T_{1,t}^*\le x)$, $P(T_{2,t}^*\le x)$, $P(T_{3,t}^*\le x)$, $P(T_{4,t}^*\le x)$, $P(T_{6,t}^*\le x)$, $P(T_{7,t}^*\le x)$, $P(T_{8,t}^*\le x)$, and $P(T_{9,t}^*\le x)$ as functions of $x$ for $t=5$, $p=20$, and $N_0\in\{100,250,500\}$. The red plot corresponds to the density of the distribution in the case without misspecification; that is, the standard normal distribution.

Figure 2. Probabilities P(T1,t*≤x),P(T2,t*≤x),P(T3,t*≤x),P(T4,t*≤x),P(T6,t*≤x),P(T7,t*≤x),P(T8,t*≤x), and P(T9,t*≤x) as functions of x for t = 5, p = 50, and N0∈{100,250,500}. The red plot corresponds to the density of the distribution in the case without misspecification; that is, the standard normal distribution.

Figures 3 and 4 present the results for large dimensions of the data-generating model. In this case, the empirical densities are shifted to the left, indicating a more conservative behavior of the MEWMA control charts based on the Euclidean distance and diagonalized Euclidean distance. As such, the effect of misspecification is expected to result in larger values of the in-control ARLs for these control schemes.

Figure 3. Probabilities P(T1,t*≤x),P(T2,t*≤x),P(T3,t*≤x),P(T4,t*≤x),P(T6,t*≤x),P(T7,t*≤x),P(T8,t*≤x), and P(T9,t*≤x) as functions of x for t = 5, p = 500, and N0∈{2000,5000,10000}. The red plot corresponds to the density of the distribution in the case without misspecification; that is, the standard normal distribution.

Figure 4. Probabilities P(T1,t*≤x),P(T2,t*≤x),P(T3,t*≤x),P(T4,t*≤x),P(T6,t*≤x),P(T7,t*≤x),P(T8,t*≤x), and P(T9,t*≤x) as functions of x for t = 5, p = 1,000, and N0∈{2000,5000,10000}. The red plot corresponds to the density of the distribution in the case without misspecification; that is, the standard normal distribution.

The situation is completely different in Figure 5, where the results for the control charts based on the Mahalanobis distance are provided. The impact of the misspecification is considerable for this type of MEWMA control scheme, even for moderate dimensions of the data-generating model. The blue curves in the plots, which correspond to the case of N0 = 100, are drastically shifted to the right. Moreover, an increase in the sample size does not clearly lead to the desired behavior of the asymptotic distributions. Even in the case of N0 = 500 and p = 50, the asymptotic densities deviate significantly from the distribution corresponding to the case without misspecification. For larger dimensions of the data-generating model, the empirical densities are completely shifted to the right. This behavior would result in high probabilities of false alarms when the MEWMA control schemes based on the Mahalanobis distance are used. Such an effect can be explained by the large amount of noise that is present in the nondiagonal elements of the inverse estimated covariance matrices Σt* and Σl*. In contrast to the control schemes based on the Mahalanobis distance, the statistics of the MEWMA control charts based on the Euclidean distance and the diagonalized Euclidean distance do not use the nondiagonal elements of Σt* and Σl* in their definitions, and, as such, they appear to be robust against misspecification.
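The noise in an inverted covariance estimate can be illustrated with a small experiment. The sketch below is a simplification and not the setting of this article: it assumes independent N_p(0, I) observations with known zero mean (rather than a VAR(1) process), and the function name `inverse_inflation` is introduced here only for illustration. It compares the average diagonal entry of the inverted sample covariance matrix, which in this simplified setting has expectation N0/(N0 − p − 1), with the average of the inverted diagonal entries alone, which has expectation N0/(N0 − 2).

```python
import numpy as np

def inverse_inflation(p, n0, reps=20, seed=0):
    """Return (full, diag): the average of trace(S^{-1})/p and of mean(1/s_ii).

    S = X'X / n0 is the sample covariance with known zero mean and the true
    covariance is the identity, so a value of 1 would mean no inflation.
    """
    rng = np.random.default_rng(seed)
    full, diag = [], []
    for _ in range(reps):
        x = rng.standard_normal((n0, p))       # i.i.d. rows (illustrative assumption)
        s = x.T @ x / n0                       # estimated covariance matrix
        full.append(np.trace(np.linalg.inv(s)) / p)   # uses the full inverse
        diag.append(np.mean(1.0 / np.diag(s)))        # uses diagonal entries only
    return float(np.mean(full)), float(np.mean(diag))

for p, n0 in [(20, 100), (50, 100), (50, 500)]:
    full, diag = inverse_inflation(p, n0)
    # Last column: theoretical inflation factor n0 / (n0 - p - 1) of the full inverse
    print(p, n0, round(full, 3), round(diag, 3), round(n0 / (n0 - p - 1), 3))
```

In this toy setting the full inverse is inflated by roughly a factor of two when p = 50 and N0 = 100, whereas the diagonal-only quantity stays close to one, which is consistent with the qualitative explanation given above.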

Figure 5. Probabilities P(TMah,t*≤x) (left column) and P(TMahInf,t*≤x) (right column) as functions of x for t = 5. We set p∈{20,50} with N0∈{100,250,500} and p∈{500,1000} with N0∈{2000,5000,10000}. The red plot corresponds to the density of the distribution in the case without misspecification; that is, the standard normal distribution.

5. COMPARISON STUDY

The above results characterize the behavior of the considered MEWMA charts if the parameters of the process are misspecified. Thus, we can compare the charts and see how the charts react to deviations from the true parameters. This point will be illustrated in Section 5.2.

In practice, we estimate the process parameters within the Phase I analysis. Usually a preliminary run or historical data is used. Several approaches have been proposed in the literature, which we discussed in Section 3. If we apply the control statistics Ti,t = Ti,t(θ) proposed above, then we have to estimate the parameter θ from the prerun and obtain T̂i,t = Ti,t(θ̂). Here we will assume that the estimators obtained in the Phase I analysis are independent of the control statistics used in the Phase II analysis. With the law of total probability, we get

\[
P\bigl(T_{i,t}(\hat{\theta}) \le x\bigr)
= \int P\bigl(T_{i,t}(\hat{\theta}) \le x \mid \hat{\theta} = \theta^*\bigr)\, f_{\hat{\theta}}(\theta^*)\, d\theta^*
= \int P\bigl(T_{i,t}(\theta^*) \le x\bigr)\, f_{\hat{\theta}}(\theta^*)\, d\theta^*. \tag{4.7}
\]

We have exclusively analyzed P(Ti,t(θ*) ≤ x) in the previous sections. To study Equation (4.7), we have to know the distribution of the estimator θ̂. This distribution is unknown; we only know that these quantities are usually asymptotically normally distributed. This fact could be used to get an approximation to Equation (4.7), but then the problem is to determine the resulting integral. Here we choose another procedure to evaluate Equation (4.7) and make use of simulations. Our procedure and the results are provided in Section 5.2.
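As a sketch of how simulations can replace the integral, the mixture probability can be approximated by an empirical frequency. The number of replications B and the index b are notation introduced here for illustration only; θ*_b denotes the estimate obtained from the b-th simulated Phase I sample and the statistic is computed from an independent Phase II sample.

```latex
\[
P\bigl(T_{i,t}(\hat{\theta}) \le x\bigr)
\;\approx\; \frac{1}{B} \sum_{b=1}^{B}
\mathbf{1}\bigl\{ T_{i,t}^{(b)}(\theta^*_b) \le x \bigr\},
\qquad \theta^*_1, \dots, \theta^*_B \ \text{independent draws of } \hat{\theta}.
\]
```

By the law of large numbers, the right-hand side converges to the left-hand side as B grows, so the density of θ̂ never has to be written down explicitly.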

5.1. Design of the Comparison Study

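In this section, we describe the design of our simulation study. We have chosen the in-control process to be a VAR(1) process; that is, $Y_t = \Phi Y_{t-1} + \varepsilon_t$ with independent white noise process $\{\varepsilon_t\}$ and

  1. $\Phi = \varphi I$ with $\varphi = 0.5$,

  2. $\varepsilon_t \sim N_p(0, \Sigma)$ with $\Sigma = DAD$,

where $D = \mathrm{diag}(d_1, \ldots, d_p)$ is a diagonal matrix consisting of the standard deviations $d_1, \ldots, d_p$ and

\[
A = \begin{pmatrix}
1 & \alpha & \alpha^2 & \cdots & \alpha^{p-1} \\
\alpha & 1 & \alpha & \cdots & \alpha^{p-2} \\
\alpha^2 & \alpha & 1 & \cdots & \alpha^{p-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\alpha^{p-1} & \alpha^{p-2} & \alpha^{p-3} & \cdots & 1
\end{pmatrix}
\]

is a correlation matrix with $\alpha = 0.5$. The values of $d_1, \ldots, d_p$ are drawn randomly from the uniform distribution on the interval $[0.5, 2]$.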
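The data-generating model above can be sketched in a few lines of Python. This is an illustrative sketch and not the code used for the study: the function names `make_sigma` and `simulate_var1` and the burn-in period (used to approximate a stationary start) are assumptions introduced here.

```python
import numpy as np

def make_sigma(p, alpha=0.5, rng=None):
    """Build Sigma = D A D with A_ij = alpha^|i-j| and d_i ~ U[0.5, 2]."""
    rng = np.random.default_rng() if rng is None else rng
    d = rng.uniform(0.5, 2.0, size=p)                  # standard deviations d_1, ..., d_p
    idx = np.arange(p)
    a = alpha ** np.abs(idx[:, None] - idx[None, :])   # Toeplitz correlation matrix A
    return d[:, None] * a * d[None, :]                 # elementwise form of D A D

def simulate_var1(n, sigma, phi=0.5, burn_in=200, rng=None):
    """Simulate n observations of Y_t = phi * Y_{t-1} + eps_t, eps_t ~ N_p(0, sigma)."""
    rng = np.random.default_rng() if rng is None else rng
    p = sigma.shape[0]
    chol = np.linalg.cholesky(sigma)                   # sigma = chol @ chol.T
    eps = rng.standard_normal((n + burn_in, p)) @ chol.T
    y = np.zeros(p)
    out = np.empty((n + burn_in, p))
    for t in range(n + burn_in):
        y = phi * y + eps[t]                           # Phi = phi * I acts as a scalar
        out[t] = y
    return out[burn_in:]                               # drop the burn-in (assumption)

rng = np.random.default_rng(42)
sigma = make_sigma(20, rng=rng)       # drawn once and then kept fixed
sample = simulate_var1(500, sigma, rng=rng)
print(sample.shape)
```

Because Φ = φI, the stationary covariance matrix of this process is Σ/(1 − φ²), which gives a simple check on the simulated sample.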

Additionally, we set R = rI with r ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1} and the dimension of the observed process and the target process p to be equal to 20. Note that at the beginning we determine the matrix Σ and keep it constant for all of our simulations. Furthermore, we will always assume in this article that μ = 0 is known and thus we do not estimate μ; that is, μ = μ* = 0.

5.2. Behavior of the Control Statistics under Misspecification

The control procedures will be compared using the ARL as a performance criterion (see, e.g., Montgomery 2020). This is the most popular performance measure in SPC. Whereas the run length is equal to the first time point at which the control chart gives a signal, the ARL is equal to the expected run length assuming that the change has occurred at the first position; that is, τ = 1.

Because the control statistics are dependent over time with a complicated correlation structure, no explicit expressions for the ARL are known. Note that even for univariate time series, explicit expressions are only known for special cases (see, e.g., Schmid 1995). In the case of an independent sequence of normally distributed variables, the ARL is obtained by solving an integral equation (e.g., Knoth 2021). This is why we use simulations to estimate the ARL.

In a first step, we generate a preliminary sample of the above VAR(1) process of size 500 and use it to estimate the parameters of the VAR(1) process following the three estimators presented in Section 3. It is assumed that the preliminary sample is in control. Then, in the Phase II analysis, we generate a realization of the Phase II process. Here we restrict ourselves to the case where the Phase II process is in control as well. The Phase I process and the Phase II process are assumed to be independent of each other. Now we apply one of the above-discussed control charts, fix a control limit, and determine the run length for the given data. This procedure is repeated 1,000 times and the average of the observed run lengths is used as an estimator of the ARL. Consequently, we get an estimator of the in-control ARL for a given control limit.
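The simulation loop described above can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the article: the innovations have identity covariance, a new Phase I sample is drawn in every repetition, run lengths are capped at `max_len`, and the control statistic is a generic stand-in (the squared norm of the MEWMA vector scaled by the estimated variances) rather than one of the statistics Ti,t. All function names are hypothetical.

```python
import numpy as np

def simulate_var1(n, p, phi, rng):
    """In-control path Y_t = phi * Y_{t-1} + eps_t, eps_t ~ N_p(0, I), stationary start."""
    y = rng.standard_normal(p) / np.sqrt(1 - phi**2)
    out = np.empty((n, p))
    for t in range(n):
        y = phi * y + rng.standard_normal(p)
        out[t] = y
    return out

def run_length(limit, scale, p, phi, r, rng, max_len=5000):
    """First t at which the stand-in statistic exceeds `limit` (censored at max_len)."""
    y = rng.standard_normal(p) / np.sqrt(1 - phi**2)   # independent Phase II process
    z = np.zeros(p)                                    # MEWMA vector, started at mu = 0
    for t in range(1, max_len + 1):
        y = phi * y + rng.standard_normal(p)
        z = (1 - r) * z + r * y                        # MEWMA recursion with R = r * I
        if np.sum(z**2 / scale) > limit:               # stand-in statistic, not T_{i,t}
            return t
    return max_len

def estimate_arl(limit, p=5, phi=0.5, r=0.2, n0=500, reps=200, seed=0):
    """Average run length over `reps` independent Phase I / Phase II pairs."""
    rng = np.random.default_rng(seed)
    lengths = []
    for _ in range(reps):
        phase1 = simulate_var1(n0, p, phi, rng)        # preliminary in-control sample
        scale = np.mean(phase1**2, axis=0)             # variance estimates, mean known to be 0
        lengths.append(run_length(limit, scale, p, phi, r, rng))
    return float(np.mean(lengths))

print(estimate_arl(limit=3.0, reps=100))
```

The estimate increases with the control limit, which is the property that the calibration step relies on.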

To compare the control charts, they have to be calibrated. This means that the control limit for each chart is determined in such a way that the corresponding in-control ARL is equal to a fixed value ξ. In this article, we chose ξ = 200. This means that, on average, after 200 observations a wrong decision is made; that is, it is concluded that the process is out of control but in reality it is in control. The choice of ξ depends on the data frequency. In engineering, frequently the value 500 is chosen, whereas in finance smaller values are taken; for example, ξ = 60. A discussion on the choice of ξ can be found in Severin and Schmid (1999).

Now we have to determine the control limits for each chart. To this end, the regula falsi method is applied to the estimated in-control ARL; that is, the estimated in-control ARL is calculated for various values of the control limit based on 10^4 independent observations. This procedure leads to estimators for the control limits. R. Bodnar, Bodnar, and Schmid (2023) noticed that the control charts T6,t and T7,t have the same behavior as T8,t and T9,t, respectively. Thus, in the following we will consider only T6,t and T7,t. Moreover, we will omit the chart T2,t because even for the true process this control scheme performs very poorly in both the in-control and out-of-control states.
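The calibration step can be sketched with a plain regula falsi iteration. To keep the example deterministic, the simulated ARL is replaced here by a case with a known closed form: a two-sided Shewhart chart for independent standard normal observations, whose in-control ARL is 1/(2(1 − Φ(c))). This substitution, the function names, and the choice to solve on the logarithmic scale (which makes the ARL curve nearly linear and speeds up convergence) are assumptions of this sketch, not details from the article.

```python
from math import erfc, log, sqrt

def shewhart_arl(c):
    """Exact in-control ARL of a two-sided Shewhart chart with limit c, i.i.d. N(0,1) data."""
    tail = 0.5 * erfc(c / sqrt(2.0))      # P(X > c) for X ~ N(0, 1)
    return 1.0 / (2.0 * tail)

def regula_falsi(f, lo, hi, tol=1e-10, max_iter=200):
    """Find a root of f in [lo, hi] by the false position method."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    mid = lo
    for _ in range(max_iter):
        # Intersection of the secant through (lo, f_lo) and (hi, f_hi) with zero
        mid = hi - f_hi * (hi - lo) / (f_hi - f_lo)
        f_mid = f(mid)
        if abs(f_mid) < tol:
            break
        if f_lo * f_mid < 0:              # sign change in [lo, mid]
            hi, f_hi = mid, f_mid
        else:                             # sign change in [mid, hi]
            lo, f_lo = mid, f_mid
    return mid

target = 200.0                            # xi, the desired in-control ARL
limit = regula_falsi(lambda c: log(shewhart_arl(c)) - log(target), 2.0, 4.0)
print(round(limit, 4), round(shewhart_arl(limit), 4))
```

In a simulation-based calibration, `shewhart_arl` would be replaced by the Monte Carlo estimate of the in-control ARL, and the tolerance would have to be chosen in line with the simulation error.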

Tables 1 and 2 depict the in-control ARLs computed for the misspecified T1,t*, T3,t*, T4,t*, T6,t*, T7,t*, TMah,t*, and TMahInf,t* MEWMA control charts for various values of r ∈ {0.1, 0.2, …, 1.0} when the in-control process is the 20-dimensional VAR(1) model in Table 1 and the 100-dimensional VAR(1) model in Table 2, as defined in Section 5.1. We observe that the estimation method has only a minor impact on the estimated ARLs, with the exception of both MEWMA control schemes based on the Mahalanobis distance when r is small. In general, quite robust behavior toward the parameter misspecification is present when the control statistics are defined by using the Euclidean norm and the diagonalized Euclidean norm. In all of these cases, the computed ARLs depart from the target ARL by not more than 15% when p = 20. In the case of a larger dimension of the data-generating model, the control charts based on the Euclidean norm and the diagonalized Euclidean norm become more conservative, with the deviation of the empirical ARLs from the target value of 200 bounded by around 65%. The results obtained for the control schemes based on the Mahalanobis distance are completely different. The ARLs are considerably smaller than the target value of 200, especially when r is small. If p = 100, then the computed ARLs for the MEWMA control charts based on the Mahalanobis distance are very small, indicating high false alarm probabilities in this case due to the misspecification. These results are in line with our previous findings depicted in Figure 5, where it was noted that the high-dimensional asymptotic distributions of the control statistics in the case of the Mahalanobis distance are drastically moved to the right, which results in small values of the estimated ARLs.

Because the control statistics that are based on the Euclidean distance and the diagonalized Euclidean distance do not depend on the inverse of Σt* and Σl*, these control charts are robust to the large estimation errors that are present when the inverse of a covariance matrix is estimated. Finally, a similar performance of the MEWMA control charts based on the Euclidean distance and the diagonalized Euclidean distance is displayed in the tables, which is again in line with the results depicted in the figures for these charts.

Table 1. ARLs of the T1,t*, T3,t*, T4,t*, T6,t*, T7,t*, TMah,t*, and TMahInf,t* MEWMA control charts for r ∈ {0.1, 0.2, …, 1.0} when the in-control process is the 20-dimensional VAR(1) process.

Table 2. ARLs of the T1,t*, T3,t*, T4,t*, T6,t*, T7,t*, TMah,t*, and TMahInf,t* MEWMA control charts for r ∈ {0.1, 0.2, …, 1.0} when the in-control process is the 100-dimensional VAR(1) process.

6. CONCLUSION

The commonly used multivariate control charts are derived under the assumption that the parameters of the target process are known before the control procedure starts. However, this assumption is not fulfilled in many situations of practical interest; for example, in economics and finance. In such situations, a detailed analysis is performed in Phase I of the surveillance procedure with the aim of fitting the target process by estimating its parameters. The quality of the estimator is expected to have a considerable impact on the control procedure, especially when the data-generating model is of large dimension.

In this article, we analyze the in-control properties of the MEWMA control charts whose control statistics are based on the Mahalanobis distance, the Euclidean distance, and the diagonalized Euclidean distance. Misspecified MEWMA control schemes are proposed in which the unknown parameters of the in-control process are replaced by their estimators. The distributional properties of the control statistics of the misspecified control charts are investigated and their high-dimensional asymptotic distributions are derived. The established theoretical results are used to study the effect of the misspecification, and the finite-sample behavior of the control statistics is assessed via an intensive simulation study. Based on our findings, it is concluded that the MEWMA control schemes based on the Mahalanobis distance can be considerably affected by the misspecification that arises because the parameters of the target process are estimated in the Phase I analysis of the monitoring process, whereas the control charts whose test statistics are defined by using the Euclidean distance and the diagonalized Euclidean distance are quite robust to it.

DISCLOSURE

The authors have no conflicts of interest to report.

ACKNOWLEDGEMENT

The authors thank the Editor, the Associate Editor, and two anonymous reviewers for their constructive comments that improved the quality of this article.

Additional information

Funding

The first and third authors acknowledge financial support from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Project No. 428472210.

REFERENCES

  • Albers, W., and W. C. Kallenberg. 2004. “Are Estimated Control Charts in Control?” Statistics 38 (1): 67–79. https://doi.org/10.1080/02669760310001619369
  • Alwan, L. C., and H. V. Roberts. 1988. “Time-Series Modeling for Statistical Process Control.” Journal of Business & Economic Statistics 6 (1): 87–95.
  • Andersson, E., D. Bock, and M. Frisén. 2004. “Detection of Turning Points in Business Cycles.” Journal of Business Cycle Measurement and Analysis 2004 (1): 93–108. https://doi.org/10.1787/jbcma-v2004-art6-en
  • Bai, Z., and J. W. Silverstein. 2010. Spectral Analysis of Large Dimensional Random Matrices, Vol. 20. New York: Springer.
  • Bodnar, O. 2007. “Sequential Procedures for Monitoring Covariances of Asset Returns.” In Advances in Risk Management, edited by G. N. Gregoriou, 241–64. New York: Palgrave.
  • Bodnar, O. 2009a. “Application of the Generalized Likelihood Ratio Test for Detecting Changes in the Mean of Multivariate GARCH Processes.” Communications in Statistics - Simulation and Computation 38 (5): 919–938. https://doi.org/10.1080/03610910802691861
  • Bodnar, O. 2009b. “Sequential Surveillance of the Tangency Portfolio Weights.” International Journal of Theoretical and Applied Finance 12 (06): 797–810. https://doi.org/10.1142/S0219024909005464
  • Bodnar, O., and W. Schmid. 2007. “Surveillance of the Mean Behavior of Multivariate Time Series.” Statistica Neerlandica 61 (4): 383–406. https://doi.org/10.1111/j.1467-9574.2007.00365.x
  • Bodnar, O., and W. Schmid. 2011. “CUSUM Charts for Monitoring the Mean of a Multivariate Gaussian Process.” Journal of Statistical Planning and Inference 141 (6): 2055–2070. https://doi.org/10.1016/j.jspi.2010.12.020
  • Bodnar, O., and W. Schmid. 2017. “CUSUM Control Schemes for Monitoring the Covariance Matrix of Multivariate Time Series.” Statistics 51 (4): 722–744. https://doi.org/10.1080/02331888.2016.1268616
  • Bodnar, R., T. Bodnar, and W. Schmid. 2023. “Sequential Monitoring of High-Dimensional Time Series.” Scandinavian Journal of Statistics 50 (3): 962–992. https://doi.org/10.1111/sjos.12607
  • Bodnar, T., H. Dette, and N. Parolya. 2019. “Testing for Independence of Large Dimensional Vectors.” The Annals of Statistics 47 (5): 2977–3008. https://doi.org/10.1214/18-AOS1771
  • Brockwell, P. J., and R. A. Davis. 1991. Time Series: Theory and Methods. New York: Springer Science & Business Media.
  • Chen, S., and H. B. Nembhard. 2011. “A High-Dimensional Control Chart for Profile Monitoring.” Quality and Reliability Engineering International 27 (4): 451–464. https://doi.org/10.1002/qre.1140
  • Crosier, R. 1988. “Multivariate Generalizations of Cumulative Sum Quality-Control Schemes.” Technometrics 30 (3): 291–303. https://doi.org/10.1080/00401706.1988.10488402
  • Frisén, M. 1992. “Evaluations of Methods for Statistical Surveillance.” Statistics in Medicine 11 (11): 1489–1502. https://doi.org/10.1002/sim.4780111107
  • Golosnoy, V., and W. Schmid. 2007. “EWMA Control Charts for Monitoring Optimal Portfolio Weights.” Sequential Analysis 26 (2): 195–224. https://doi.org/10.1080/07474940701247099
  • Golosnoy, V., I. Okhrin, S. Ragulin, and W. Schmid. 2011. “On the Application of SPC in Finance.” Frontiers in Statistical Quality Control 9: 119–32.
  • Hamilton, J. D. 1994. Time Series Analysis. New Jersey: Princeton University Press.
  • Harville, D. A. 1997. Matrix Algebra from a Statistician’s Perspective. New York: Springer.
  • Hotelling, H. 1947. “Multivariate Quality Control—Illustrated by the Air Testing of Sample Bombsights.” In Techniques of Statistical Analysis, edited by C. Eisenhart, M. W. Hastay, and W. Wallis, 111–184. New York: McGraw Hill.
  • Jardim, F. S., S. Chakraborti, and E. K. Epprecht. 2020. “Two Perspectives for Designing a Phase II Control Chart with Estimated Parameters: The Case of the Shewhart X Chart.” Journal of Quality Technology 52 (2): 198–217. https://doi.org/10.1080/00224065.2019.1571345
  • Jensen, W. A., L. A. Jones-Farmer, C. W. Champ, and W. H. Woodall. 2006. “Effects of Parameter Estimation on Control Chart Properties: A Literature Review.” Journal of Quality Technology 38 (4): 349–364. https://doi.org/10.1080/00224065.2006.11918623
  • Knoth, S. 2021. “Steady-State Average Run Length (s): Methodology, Formulas, and Numerics.” Sequential Analysis 40 (3): 405–426. https://doi.org/10.1080/07474946.2021.1940501
  • Knoth, S., and W. Schmid. 2002. “Monitoring the Mean and the Variance of a Stationary Process.” Statistica Neerlandica 56 (1): 77–100. https://doi.org/10.1111/1467-9574.03000
  • Kramer, H., and W. Schmid. 1997. “EWMA Charts for Multivariate Time Series.” Sequential Analysis 16 (2): 131–154. https://doi.org/10.1080/07474949708836378
  • Kramer, H., and W. Schmid. 2000. “The Influence of Parameter Estimation on the ARL of Shewhart Type Charts for Time Series.” Statistical Papers 41 (2): 173–196. https://doi.org/10.1007/BF02926102
  • Lawson, A., and K. Kleinman. 2005. Spatial & Syndromic Surveillance. New York: Wiley.
  • Lax, P. D. 2007. Linear Algebra and Its Applications, Vol. 78. New Jersey: John Wiley & Sons.
  • Li, Y., Y. Liu, C. Zou, and W. Jiang. 2014. “A Self-Starting Control Chart for High-Dimensional Short-Run Processes.” International Journal of Production Research 52 (2): 445–461. https://doi.org/10.1080/00207543.2013.832001
  • Lowry, C., W. Woodall, C. Champ, and S. Rigdon. 1992. “A Multivariate Exponentially Weighted Moving Average Control Chart.” Technometrics 34 (1): 46–53. https://doi.org/10.2307/1269551
  • Lütkepohl, H. 2005. New Introduction to Multiple Time Series Analysis. Berlin: Springer Science & Business Media.
  • Mathai, A. M., and S. B. Provost. 1992. Quadratic Forms in Random Variables: Theory and Applications. New York: Dekker.
  • Messaoud, A., C. Weihs, and F. Hering. 2008. “Detection of Chatter Vibration in a Drilling Process Using Multivariate Control Charts.” Computational Statistics & Data Analysis 52 (6): 3208–3219. https://doi.org/10.1016/j.csda.2007.09.029
  • Montgomery, D. C. 2020. Introduction to Statistical Quality Control. Hoboken: John Wiley & Sons.
  • Muirhead, R. J. 1982. Aspects of Multivariate Statistical Theory. New York: Wiley.
  • Ngai, H., and J. Zhang. 2001. “Multivariate Cumulative Sum Control Charts Based on Projection Pursuit.” Statistica Sinica 11: 747–766.
  • Otto, P., and W. Schmid. 2023. “A General Framework for Spatial GARCH Models.” Statistical Papers 64 (5): 1721–1747. https://doi.org/10.1007/s00362-022-01357-1
  • Page, E. S. 1954. “Continuous Inspection Schemes.” Biometrika 41 (1-2): 100–115. https://doi.org/10.1093/biomet/41.1-2.100
  • Pignatiello, J., and G. Runger. 1990. “Comparisons of Multivariate CUSUM Charts.” Journal of Quality Technology 22 (3): 173–186. https://doi.org/10.1080/00224065.1990.11979237
  • Qiu, P. 2013. Introduction to Statistical Process Control. Boca Raton, FL: CRC Press.
  • Reinsel, G. 1993. Multivariate Time Series Analysis. New York: John Wiley & Sons.
  • Roberts, S. 1959. “Control Chart Tests Based on Geometric Moving Averages.” Technometrics 1 (3): 239–250. https://doi.org/10.1080/00401706.1959.10489860
  • Saleh, N. A., M. A. Mahmoud, L. A. Jones-Farmer, I. Zwetsloot, and W. H. Woodall. 2015. “Another Look at the EWMA Control Chart with Estimated Parameters.” Journal of Quality Technology 47 (4): 363–382. https://doi.org/10.1080/00224065.2015.11918140
  • Sarmiento, M. G., F. S. Jardim, S. Chakraborti, and E. K. Epprecht. 2022. “Design of Variance Control Charts with Estimated Parameters: A Head to Head Comparison between Two Perspectives.” Journal of Quality Technology 54 (3): 249–268. https://doi.org/10.1080/00224065.2020.1834892
  • Schipper, S., and W. Schmid. 2001. “Sequential Methods for Detecting Changes in the Variance of Economic Time Series.” Sequential Analysis 20 (4): 235–262. https://doi.org/10.1081/SQA-100107647
  • Schmid, W. 1995. “On the Run Length of a Shewhart Chart for Correlated Data.” Statistical Papers 36 (1): 111–130. https://doi.org/10.1007/BF02926025
  • Schmid, W. 1997a. “CUSUM Control Schemes for Gaussian Processes.” Statistical Papers 38 (2): 191–217. https://doi.org/10.1007/BF02925223
  • Schmid, W. 1997b. “On EWMA Charts for Time Series.” In Frontiers in Statistical Quality Control, edited by Hans-Joachim Lenz, Peter-Theodor Wilrich, 115–137. Berlin, Heidelberg: Springer.
  • Schmid, W., and A. Schöne. 1997. “Some Properties of the EWMA Control Chart in the Presence of Autocorrelation.” The Annals of Statistics 25 (3): 1277–1283. https://doi.org/10.1214/aos/1069362748
  • Schmid, W., and D. Tzotchev. 2004. “Statistical Surveillance of the Parameters of a One-Factor Cox-Ingersoll-Ross Model.” Sequential Analysis 23 (3): 379–412. https://doi.org/10.1081/SQA-200027052
  • Severin, T., and W. Schmid. 1999. “Monitoring Changes in GARCH Models.” Allgemeines Statistisches Archiv (AStA) 83: 281–307.
  • Shewhart, W. A. 1926. “Quality Control Charts.” Bell System Technical Journal 5 (4): 593–603. https://doi.org/10.1002/j.1538-7305.1926.tb00125.x
  • Śliwa, P., and W. Schmid. 2005. “Monitoring the Cross-Covariances of a Multivariate Time Series.” Metrika 61 (1): 89–115. https://doi.org/10.1007/s001840400326
  • Sonesson, C., and D. Bock. 2003. “A Review and Discussion of Prospective Statistical Surveillance in Public Health.” Journal of the Royal Statistical Society Series A: Statistics in Society 166 (1): 5–21. https://doi.org/10.1111/1467-985X.00256
  • Theodossiou, P. T. 1993. “Predicting Shifts in the Mean of a Multivariate Time Series Process: An Application in Predicting Business Failures.” Journal of the American Statistical Association 88 (422): 441–449. https://doi.org/10.1080/01621459.1993.10476294
  • Wang, K., and W. Jiang. 2009. “High-Dimensional Process Monitoring and Fault Isolation via Variable Selection.” Journal of Quality Technology 41 (3): 247–258. https://doi.org/10.1080/00224065.2009.11917780
  • Wang, Z., Y. Li, and X. Zhou. 2017. “A Statistical Control Chart for Monitoring High-Dimensional Poisson Data Streams.” Quality and Reliability Engineering International 33 (2): 307–321. https://doi.org/10.1002/qre.2005