
Matching Kollo measures

Pages 1279-1293 | Received 16 Feb 2023, Accepted 02 Jul 2023, Published online: 09 Aug 2023

Abstract

We motivate the advantages of using the Kollo measures, relative to other types of third and fourth moments of multivariate systems, and explore their Monte Carlo simulation and bootstrapping errors. Then we derive necessary and sufficient conditions for simultaneously matching any given mean vector, covariance matrix, Kollo skewness, and Kollo kurtosis. The specification of a suitable orthonormal basis greatly simplifies these moment-matching conditions. We offer semi-closed-form solutions to increase computational efficiency. In this respect, we compare our approach with two competing methods, which in any case can match only Kollo skewness, not Kollo kurtosis as well. Ours is the first method that exactly matches all four multivariate moments simultaneously.

1. Introduction

Consider an n-dimensional random variable $X_n=[X_1,\dots,X_n]'$ with a mean vector $\mu_n=E[X_n]$ and a covariance matrix $\Sigma_n=V[X_n]$. Mitchell (1989) introduced the third and fourth moments as the skewness tensor $E[Y_n\otimes Y_n\otimes Y_n]$ and kurtosis tensor $E[Y_n\otimes Y_n\otimes Y_n\otimes Y_n]$, where $Y_n$ is the standardized version of $X_n$, i.e., $Y_n=\Sigma_n^{-1/2}(X_n-\mu_n)$, and $\otimes$ denotes the Kronecker product. In other words, the elements of the skewness tensor take the form $E[Y_iY_jY_k]$ and those of the kurtosis tensor take the form $E[Y_iY_jY_kY_t]$, for $i,j,k,t=1,\dots,n$. The tensors contain the third- and fourth-order central moments and all the co-moments of the marginal variables, comprising $n(n+1)(n+2)/6$ and $n(n+1)(n+2)(n+3)/24$ distinct elements, respectively. The number of elements increases rapidly with n. For that reason, these full tensor moments are impractical for use in large dimensions and several more parsimonious representations of multivariate skewness and kurtosis have been proposed.
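To make the scale of the full tensors concrete, the following Python sketch (our illustration, not code from the paper's repository; the function names are ours) computes their sample analogues via Kronecker products, exactly as in the elementwise definition above:

```python
import numpy as np

def standardize(X):
    """Return Y = (X - mu) Sigma^{-1/2} for an (m x n) data matrix X."""
    mu = X.mean(axis=0)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Sigma_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T   # symmetric inverse square root
    return (X - mu) @ Sigma_inv_sqrt

def tensor_moments(X):
    """Sample skewness and kurtosis tensors, stacked as vectors of length n^3 and n^4."""
    Y = standardize(X)
    m, n = Y.shape
    skew = np.zeros(n ** 3)
    kurt = np.zeros(n ** 4)
    for y in Y:                  # accumulate y (x) y (x) y and y (x) y (x) y (x) y
        yy = np.kron(y, y)
        skew += np.kron(yy, y)
        kurt += np.kron(yy, yy)
    return skew / m, kurt / m
```

Even at n = 11, the dimension used in Section 2, the stacked kurtosis tensor already holds $11^4 = 14{,}641$ entries, which illustrates why more parsimonious measures are needed.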

Many authors advocate the need for models to match the information contained in measures of skewness and kurtosis, for instance in finance (Ryoo, 2007), biotech (Kuruvilla & Gunavathi, 2014), and engineering (Love et al., 2013). Indeed, moment-matching techniques have been applied in a variety of disciplines: Heath et al. (2018) use moment-matching simulation for probabilistic sensitivity analysis in economics; Pellegrino and Sabino (2014) apply moment matching to reduce the dimension of the pricing problem in finance; and Ching et al. (2009) use moment matching to improve estimation accuracy in engineering. Yet none of these techniques can match even the mean vector and covariance matrix exactly, let alone the higher moments.

Some moment-matching techniques are specifically designed to reduce the sampling error in multivariate moments. In this strand of the literature, Bardsley (2017), Høyland et al. (2003), and Looney (1983) each propose methods to generate data based on moments; Foldnes and Olsson (2016) introduce a simple copula structure to target the mean vector, the covariance matrix, and the skewness and kurtosis of each marginal; and Qu et al. (2020) propose a method to target the Mardia (1970) skewness and kurtosis. However, none of these use the tensor measures defined above, because they are intractable for practical applications except when n is small.

Many different summary measures have been advocated to reduce the size of the skewness and kurtosis tensors, including those proposed by Koizumi et al. (2014), Kollo (2008), Mardia (1970), and Móri et al. (1994). Several other measures are not even directly related to the full tensors (Note 1). All these measures are listed in chronological order in Table 1, where each column refers to a property that is not fulfilled by the measure.

Table 1. Weaknesses of popular skewness and kurtosis measures for multivariate variables.

First note that most of the measures that are not based on the full tensors require computation of the multivariate probability density function, which is rather complex, and cannot be used to establish solvable systems of moment-matching constraints for simulations targeting skewness and kurtosis (Note 2). Secondly, of the four methods that are derived from the full tensor moments, all but one ignore information by excluding some of the cross moments. This includes the popular Mardia (1970) skewness and kurtosis, which are easy to use but, like several other measures, are only scalar quantities. As such, they are unable to carry as much information about the full multivariate density as vector or matrix measures.

From Table 1 it is clear that the only measures that suffer from none of these drawbacks are those introduced by Kollo (2008). In the form of an $n\times 1$ vector and an $n\times n$ symmetric matrix, respectively, they may be written:

$$\tau(X_n)=\left[\sum_{i,k}^{n}E[Y_iY_1Y_k],\;\dots,\;\sum_{i,k}^{n}E[Y_iY_nY_k]\right]', \tag{1a}$$

$$K(X_n)=\begin{bmatrix}\sum_{i,k}^{n}E[Y_iY_1Y_1Y_k] & \cdots & \sum_{i,k}^{n}E[Y_iY_1Y_nY_k]\\ \vdots & \ddots & \vdots\\ \sum_{i,k}^{n}E[Y_iY_nY_1Y_k] & \cdots & \sum_{i,k}^{n}E[Y_iY_nY_nY_k]\end{bmatrix}. \tag{1b}$$
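Because $\sum_{i,k}^{n}Y_iY_jY_k=(1_n'Y_n)^2Y_j$, the Kollo measures can be computed in a single pass over the data. A minimal self-contained Python sketch of the sample versions of Equation (1) (our own illustration; the function name is ours):

```python
import numpy as np

def kollo_moments(X):
    """Sample Kollo skewness vector (1a) and kurtosis matrix (1b) of an (m x n) sample."""
    mu = X.mean(axis=0)
    w, V = np.linalg.eigh(np.cov(X, rowvar=False))
    Y = (X - mu) @ V @ np.diag(w ** -0.5) @ V.T     # standardized observations
    m = len(Y)
    s2 = Y.sum(axis=1) ** 2                         # (1_n' y_i)^2 for each observation i
    tau = s2 @ Y / m                                # Equation (1a)
    K = (Y * s2[:, None]).T @ Y / m                 # Equation (1b)
    return tau, K
```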

Unlike tensor moments, the Kollo measures are easy to compute in applications. But so are other measures, such as those of Mardia and Móri et al. (Note 3). In Appendix A, we provide some real-world examples that motivate the use of Kollo skewness and kurtosis in a practical setting, rather than the measures introduced by Mardia. Kollo measures also have many theoretical advantages. By retaining so many co-moments, the Kollo measures are the most tractable attempt to retain as much information as possible from the full skewness and kurtosis tensors (Note 4). The Mardia and Koizumi et al. measures suffer from a substantial loss of information because they are scalars. For example, Mardia skewness is non-negative, so it cannot indicate whether the data are negatively or positively skewed. A similar loss of information affects the Móri et al. measures, which are a multivariate extension of the Mardia measures but still omit many third- and fourth-order co-moments. Moreover, like the tensors themselves and several other related measures, the Kollo measures are invariant under affine transformations. Later we shall rely on this property to generate repeated simulations that exactly match some target values for Kollo skewness and kurtosis. This invariance property also enables concatenation of repeated samples; that is, the concatenated samples have the same moments as the sub-samples.

Our exact moment-matching methods are achieved by creating an orthonormal basis, the span of which includes a particular subspace containing all samples that have the exact Kollo skewness vector and Kollo kurtosis matrix. The benefit of using such a set of orthonormal vectors is huge: mathematically, matching the sample moments exactly means solving a large system of non-linear equations in which the number of unknowns greatly exceeds the number of constraints. The orthonormal basis also underpins a significant reduction in the size of the system to be solved, and we are able to derive analytic solutions that greatly enhance the computational efficiency of our method. It is also, to our knowledge, the only method for exactly matching both Kollo skewness and kurtosis in every simulation of the distribution.

Common techniques, such as Monte Carlo (MC) simulation (Hertz, 1964) or the statistical bootstrap (Efron, 1979), fail to ensure that each sample has the same moments as the target distribution. There is a large literature on their simulation errors in covariance matrices (see Jobson & Korkie, 1980; Ledoit & Wolf, 2004; and many others), but no previous study examines MC or bootstrap simulation errors in the Kollo measures. Section 2 motivates the need to target higher moments exactly by demonstrating that the Kollo moment sampling errors from standard Monte Carlo and the statistical bootstrap can be very significant. This well-known sampling error is unlikely to be resolved completely by increasing the number of simulations alone (Ferraz & Moura, 2012; Oberkampf et al., 2002), and variance-reduction techniques (Saliby & Paul, 2009) come at the expense of model complexity.

Then Section 3 introduces the Ledermann et al. (2011) Random Orthogonal Matrix (ROM) simulation method, which encompasses MC and the bootstrap (as special cases of the L matrices which underpin this approach) and was originally developed as a simulation method for matching exact covariance matrices. Two different extensions of the ROM algorithm, by Hanke et al. (2017) and Alexander et al. (2022), can target exact Kollo skewness by generating arbitrary values to pad the sample so that the Kollo skewness matching conditions are solvable numerically. But a trial-and-error process is needed to find suitable arbitrary values and, even when such values are found, there is no analytical solution and a numerical solution cannot be guaranteed, so the algorithms can fail. Moreover, as Alexander et al. (2022) demonstrate, these methods risk generating simulated samples with undesirable marginal densities, depending on the arbitrary values that are found.

Having derived, in Section 3, the necessary and sufficient conditions for a generated sample to match a target Kollo skewness and Kollo kurtosis, Section 4 uses these conditions to derive theoretical results which first define an orthonormal basis, and then find analytic solutions for the coefficients which generate the required subspace of the span of this orthonormal basis. Here we also prove invariance under sample concatenation. We conclude the paper with a discussion of our results in Section 5, in the context of the previous literature.

Our algorithms are written in Python and available on GitHub at https://github.com/vivid1005/GKROM-algorithm.

2. Sampling errors in Kollo skewness and kurtosis

This section presents two experiments that demonstrate the need to use a simulation method that can target Kollo moments exactly. Let $\mu_i$, $\sigma_{ij}$, $\tau_i$, and $\kappa_{ij}$ denote the elements of the target mean $\mu_n$, covariance $\Sigma_n$, Kollo skewness $\tau_n$, and Kollo kurtosis $K_n$, respectively, and let $\hat\mu_i$, $\hat\sigma_{ij}$, $\hat\tau_i$, and $\hat\kappa_{ij}$ denote the corresponding elements of the simulated sample. Then the RMSEs for each moment are given by:

$$\sqrt{\sum_{i}^{n}(\mu_i-\hat\mu_i)^2},\quad \sqrt{\sum_{i,j}^{n}(\sigma_{ij}-\hat\sigma_{ij})^2},\quad \sqrt{\sum_{i}^{n}(\tau_i-\hat\tau_i)^2},\quad \sqrt{\sum_{i,j}^{n}(\kappa_{ij}-\hat\kappa_{ij})^2}, \tag{2}$$

for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, respectively. Now, for a given dimension n of the system of random variables and a given size m for the simulations, we generate a sample, compute the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, and record the RMSEs using Equation (2). We repeat this many times and then record the average RMSE and its standard error.
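A direct Python transcription of Equation (2), reading each statistic as the root of the sum of squared errors (our own helper, reused in the sketch of the bootstrap experiment below):

```python
import numpy as np

def moment_rmses(target, simulated):
    """RMSEs of Equation (2); each argument is a tuple (mu, Sigma, tau, K)."""
    return tuple(np.sqrt(np.sum((t - s) ** 2)) for t, s in zip(target, simulated))
```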

Tables 2 and 3 tabulate the results of the first, bootstrapping experiment using two real data sets, viz. the U.S. stock sector returns data described in Appendix A and a temperature data sample from the “Global and City Yearly Average Temperatures 1820–2015” data set available on Data.world. The latter includes the yearly average temperatures between 1849 and 2013 of 11 cities in the United States (viz. Austin, Boston, Chicago, Kansas City, Las Vegas, Los Angeles, New Orleans, New York, Philadelphia, San Francisco, and Washington). In both cases n = 11, with the target mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix derived from (1) the first 1000 observations in the sample, from 1 December 2018 to 31 December 2021, for the sector returns and (2) the 166 observations from 1849 to 2013 for the temperature data. The moments are reported in the upper part of the tables. Then we apply the statistical bootstrap to these samples, to generate random samples of size m, for m = 100, 500, and 1000. For any given m, we generate 100,000 samples of this size. For each sample, we record the RMSEs calculated using Equation (2), then compute the average and standard error of these RMSEs over all 100,000 repetitions and report the results in the lower part of the tables.
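The bootstrapping experiment itself reduces to a short resampling loop. The sketch below (ours, with an illustrative number of repetitions) reuses kollo_moments() and moment_rmses() from the earlier sketches:

```python
import numpy as np

def all_moments(X):
    tau, K = kollo_moments(X)
    return X.mean(axis=0), np.cov(X, rowvar=False), tau, K

def bootstrap_experiment(X, m, n_rep=10_000, seed=42):
    rng = np.random.default_rng(seed)
    target = all_moments(X)                          # moments of the original sample
    errs = np.array([moment_rmses(target,
                                  all_moments(X[rng.integers(0, len(X), size=m)]))
                     for _ in range(n_rep)])         # resample m rows with replacement
    return errs.mean(axis=0), errs.std(axis=0)       # average RMSE and its standard error
```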

Table 2. Sampling errors in multivariate moments for statistical bootstrapping financial data.

Table 3. Sampling errors in multivariate moments for statistical bootstrapping temperature data.

As expected, both the average RMSE and its standard deviation decrease with m. However, their ratio (shown in the right-hand panel) does not. The errors in moments that arise when using the bootstrap are particularly apparent when only 100 observations are randomly selected. When m = 100 the ratio of the average RMSE on the Kollo skewness to its standard deviation is 2.632 for the stock returns data and 3.856 for the temperature data, and for the Kollo kurtosis the corresponding ratios are 7.099 and 2.702, respectively. But even when m = 1000, so we bootstrap 100,000 samples having the same size as the original one, the errors relative to the target moments derived from the original sample remain substantial (with ratios of 2.416 and 2.469 for the stock returns data and 3.288 and 1.782 for the temperature data, on the Kollo skewness and kurtosis, respectively). It is worth noting that much of the research on sampling errors in simulation focuses on errors in the covariance matrix, yet the ratios in the right-hand panel suggest that the covariance matrix is the least inaccurate of the four moments.

Table 4 reports the results of the second experiment, which assumes an n-dimensional $N(0_n,I_n)$ distribution for n = 5 or 10 and applies standard Monte Carlo simulation to generate samples of size m, for $m=10^3$ or $10^6$. As expected, both the average RMSE and its standard error decrease as m increases. However, again their ratio (shown in the right panel) does not. It is clear that all simulations yield inaccurate values for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix. It also becomes more difficult to simulate accurate values as the dimension of the system increases, especially for the covariance matrix. This observation motivated the introduction of moment-matching methods (such as Ledermann et al., 2011) and demonstrates that similar inaccuracies occur when using Monte Carlo simulation in large systems for which accurate third and fourth moments are also important.

Table 4. RMSE of multivariate moments in Monte Carlo simulation.

3. Conditions for matching Kollo skewness and kurtosis

Consider the simple case of generating a sample $X_{mn}$ of size m from an n-dimensional distribution that has the exact $n\times 1$ mean vector $\mu_n$ and covariance matrix $\Sigma_n$. Let $A_n$ be a fixed square matrix such that $A_n'A_n=\Sigma_n$. Following Ledermann et al. (2011), we can select any “scaled L-matrix” $M_{mn}$ that is defined by the properties:

$$1_m'M_{mn}=0_n',\qquad M_{mn}'M_{mn}=mI_n, \tag{3}$$

where $1_m$ denotes an $m\times 1$ vector of ones, $0_n$ an $n\times 1$ vector of zeros, and $I_n$ the $n\times n$ identity matrix. The matrix $M_{mn}$ is just an L matrix scaled by $m^{1/2}$. See Ledermann and Alexander (2012) for more details about different types of L matrices that can be used to target the Mardia measures exactly, either skewness or kurtosis, but not both. The original concept of ROM simulation selects any random $n\times n$ rotation matrix $R_n$, and possibly also a random $m\times m$ permutation matrix $Q_m$, so that by Equation (3) every sample of the form:

$$X_{mn}=1_m\mu_n'+Q_mM_{mn}R_nA_n, \tag{4}$$

has mean $\mu_n$ and covariance matrix $\Sigma_n$. The problem with such a construction is that these ROM simulations can have quite different Kollo skewness and kurtosis, because these depend on the choice of $R_n$. For this reason, we set $R_n=I_n$ in Equation (4) and introduce another source of randomness.
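For intuition, here is one simple way (our sketch, not the paper's published code) to draw a scaled L-matrix satisfying Equation (3) and to assemble a sample via Equation (4) with $Q_m=I_m$ and $R_n=I_n$: center the columns of a random matrix, orthonormalize them by QR, and rescale.

```python
import numpy as np

def scaled_L(m, n, rng):
    A = rng.standard_normal((m, n))
    A -= A.mean(axis=0)            # every column now sums to zero
    Q, _ = np.linalg.qr(A)         # orthonormal columns, still orthogonal to 1_m
    return np.sqrt(m) * Q          # 1_m' M = 0_n' and M'M = m I_n

rng = np.random.default_rng(0)
m, n = 1000, 5
mu, Sigma = np.zeros(n), np.eye(n)
A_n = np.linalg.cholesky(Sigma).T  # any A_n with A_n' A_n = Sigma
M = scaled_L(m, n, rng)
X = mu + M @ A_n                   # Equation (4) with Q_m = R_n = I: exact mean and covariance
```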

First, we note that $M_{mn}$ must satisfy more than just Equation (3) for the simulation $X_{mn}$ given by:

$$X_{mn}=1_m\mu_n'+Q_mM_{mn}A_n, \tag{5}$$

to have exact Kollo skewness $\tau_n$ and Kollo kurtosis $K_n$ as well as the exact mean vector $\mu_n$ and covariance matrix $\Sigma_n$. It is easiest to write these necessary and sufficient conditions in summation notation, letting $m_i^r$ denote the $1\times n$ vector that is the ith row of $M_{mn}$, as follows:

$$\sum_{i=1}^{m}m_i^r=0_n',\qquad \sum_{i=1}^{m}m_i^{r\prime}m_i^r=mI_n,\qquad \text{(for matching }\mu_n\text{ and }\Sigma_n\text{)} \tag{6a}$$

$$\sum_{i=1}^{m}(m_i^r1_n)^2\,m_i^r=m\tau_n',\qquad \text{(for matching }\tau_n\text{)} \tag{6b}$$

$$\sum_{i=1}^{m}(m_i^r1_n)^2\,m_i^{r\prime}m_i^r=mK_n.\qquad \text{(for matching }K_n\text{)} \tag{6c}$$

Unfortunately, Equations (6b) and (6c) are difficult to solve directly. Instead, we follow Hanke et al. (2017) and introduce an $n\times n$ rotation matrix $\Omega_n$ which satisfies $1_n'\Omega_n=\sqrt{n}\,[1,0,\dots,0]$ in addition to the usual conditions $\Omega_n'\Omega_n=I_n$ and $\det(\Omega_n)=1$ (Note 5). One example of such a matrix is:

$$\Omega_n=\begin{pmatrix}
\frac{1}{\sqrt{n}} & (-1)^{n-1}\sqrt{\frac{n-1}{n}} & 0 & 0 & \cdots & 0 & 0\\
\frac{1}{\sqrt{n}} & \frac{(-1)^{n}}{\sqrt{n(n-1)}} & (-1)^{n-1}\sqrt{\frac{n-2}{n-1}} & 0 & \cdots & 0 & 0\\
\frac{1}{\sqrt{n}} & \frac{(-1)^{n}}{\sqrt{n(n-1)}} & \frac{(-1)^{n}}{\sqrt{(n-1)(n-2)}} & (-1)^{n-1}\sqrt{\frac{n-3}{n-2}} & \cdots & 0 & 0\\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots\\
\frac{1}{\sqrt{n}} & \frac{(-1)^{n}}{\sqrt{n(n-1)}} & \frac{(-1)^{n}}{\sqrt{(n-1)(n-2)}} & \frac{(-1)^{n}}{\sqrt{(n-2)(n-3)}} & \cdots & (-1)^{n-1}\sqrt{\frac{2}{3}} & 0\\
\frac{1}{\sqrt{n}} & \frac{(-1)^{n}}{\sqrt{n(n-1)}} & \frac{(-1)^{n}}{\sqrt{(n-1)(n-2)}} & \frac{(-1)^{n}}{\sqrt{(n-2)(n-3)}} & \cdots & \frac{(-1)^{n}}{\sqrt{3\times 2}} & (-1)^{n-1}\frac{1}{\sqrt{2}}\\
\frac{1}{\sqrt{n}} & \frac{(-1)^{n}}{\sqrt{n(n-1)}} & \frac{(-1)^{n}}{\sqrt{(n-1)(n-2)}} & \frac{(-1)^{n}}{\sqrt{(n-2)(n-3)}} & \cdots & \frac{(-1)^{n}}{\sqrt{3\times 2}} & \frac{(-1)^{n}}{\sqrt{2}}
\end{pmatrix}.$$
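The explicit matrix above is one choice; any orthogonal matrix whose first column equals $n^{-1/2}1_n$ satisfies $1_n'\Omega_n=\sqrt{n}\,[1,0,\dots,0]$, since the remaining columns are then orthogonal to $1_n$. A hedged Python sketch of this alternative construction via QR completion (our own code, not the paper's):

```python
import numpy as np

def make_omega(n):
    A = np.eye(n)
    A[:, 0] = 1.0 / np.sqrt(n)      # force the first column to n^{-1/2} 1_n
    Q, R = np.linalg.qr(A)          # complete it to an orthonormal basis
    Q = Q * np.sign(np.diag(R))     # fix signs so that Q[:, 0] = n^{-1/2} 1_n
    if np.linalg.det(Q) < 0:        # enforce det = +1, i.e., a rotation
        Q[:, -1] = -Q[:, -1]
    return Q

Omega = make_omega(5)
assert np.allclose(np.ones(5) @ Omega, np.sqrt(5) * np.eye(5)[0])
```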

The rotation matrix is used to convert $M_{mn}$ to a new matrix $S_{mn}$, i.e., $S_{mn}=M_{mn}\Omega_n$, and so $M_{mn}=S_{mn}\Omega_n'$. The sample construction (5) may now be rewritten:

$$X_{mn}=1_m\mu_n'+Q_mS_{mn}\Omega_n'A_n, \tag{7}$$

and the necessary and sufficient conditions (6) become:

$$1_m'S_{mn}=0_n',\qquad S_{mn}'S_{mn}=mI_n, \tag{8a}$$

$$\sum_{i=1}^{m}s_{i1}^2\,s_i^r=\frac{m}{n}\,(\Omega_n'\tau_n)':=m\tilde\tau_n', \tag{8b}$$

$$\sum_{i=1}^{m}s_{i1}^2\,s_i^{r\prime}s_i^r=\frac{m}{n}\,\Omega_n'K_n\Omega_n:=m\tilde K_n. \tag{8c}$$

The above formulation explains why we require $1_n'\Omega_n=\sqrt{n}\,[1,0,\dots,0]$. The most complex terms, $(m_i^r1_n)^2$ in Equations (6b) and (6c), are reduced to terms of the form $ns_{i1}^2$ in Equations (8b) and (8c), with $s_{i1}$ denoting the first element in the ith row of $S_{mn}$. This way, the focus moves from simulating Equation (5) using different $M_{mn}$ to simulating Equation (7) using different $S_{mn}$ satisfying Equation (8).

4. Theoretical results

For convenience, we now rewrite the conditions (8) as functions of each column of $S_{mn}$, using the Hadamard product $\circ$. Let $s_j^c$ denote the $m\times 1$ vector that is the jth column of $S_{mn}$. The following formulation of the necessary and sufficient conditions for the simulated sample to match the exact mean, covariance, Kollo skewness, and Kollo kurtosis will be useful in this section:

$$1_m's_k^c=0, \tag{9a}$$
$$1_m'(s_j^c\circ s_k^c)=0,\quad j<k, \tag{9b}$$
$$1_m'(s_k^c\circ s_k^c)=m, \tag{9c}$$
$$1_m'(s_1^c\circ s_1^c\circ s_k^c)=m\tilde\tau_k, \tag{9d}$$
$$1_m'(s_1^c\circ s_1^c\circ s_i^c\circ s_k^c)=m\tilde\kappa_{ik},\quad i\le k, \tag{9e}$$

where $\tilde\tau_k$ is the kth element of the scaled and rotated Kollo skewness vector $\tilde\tau_n$ and $\tilde\kappa_{ik}$ is the element in the ith row and kth column of the scaled and rotated Kollo kurtosis matrix $\tilde K_n$. Matching the exact mean, covariance, and Kollo skewness relies on solving conditions (9a)–(9d) for $S_{mn}$. The additional constraints (9e) set out the k restrictions on the kth column of $S_{mn}$ that must hold in order that each simulation $X_{mn}$ matches the target Kollo kurtosis.

The system (9) is not easy to solve. When k = 1 we have to solve a system of linear equations for the mean constraint (9a), a system of quadratic equations for the variance constraint (9c), a system of cubic equations for the Kollo skewness constraint (9d), and a system of quartic equations for the Kollo kurtosis constraint (9e). When k > 1 we have four different systems of linear equations, for the mean constraint (9a), the covariance constraint (9b), the Kollo skewness constraint (9d), and the Kollo kurtosis constraints (9e) when i < k, and two different systems of quadratic equations, for the variance constraint (9c) and the Kollo kurtosis constraint (9e) when i = k. In total, there are $n(n+5)/2$ equations to solve for matching the first three moments, or $n(n+3)$ equations for matching the first four moments. For example, in a 10-dimensional system, we need to solve 75 equations to match the first three moments, or 130 equations to match the first four moments.

In this section we show how to solve (9) for $S_{mn}$ column by column, by (a) finding a particular type of orthonormal basis for the sample space of the n random variables; and then (b) solving for the coefficients which generate a subspace of its span that contains samples having exactly the target Kollo moments. That is, for each column k and for some column dimension $l_k$, with $1\le l_k\le m$, we first find a specific $m\times l_k$ rectangular orthonormal matrix $\Theta^k_{m,l_k}$, i.e., a matrix having the properties $1_m'\Theta^k_{m,l_k}=0_{l_k}'$ and $(\Theta^k_{m,l_k})'\Theta^k_{m,l_k}=I_{l_k}$.

Then we set

$$s_k^c=\Theta^k_{m,l_k}c_k, \tag{10}$$

where $c_k$ is an $l_k\times 1$ column vector with at least one non-zero element. The decomposition (10) allows one to select the orthonormal matrix $\Theta^k_{m,l_k}$ in such a way that $s_k^c$ always satisfies Equations (9a) and (9b). That is, instead of solving (9) for $s_k^c$ directly, we may solve for $\Theta^k_{m,l_k}$ and $c_k$ iteratively, for $k=1,\dots,n$, using the following result:

Lemma 1.

Suppose the first k − 1 columns in the matrix $S_{mn}$ satisfy conditions (9a) and (9b) and the kth column $s_k^c$ is constructed using (10). Then the conditions (9a) and (9b) will be satisfied if the column vectors in $\Theta^k_{m,l_k}$ are also orthogonal to the vectors $1_m$ and $s_1^c,\dots,s_{k-1}^c$.

Generating Smn column by column this way considerably reduces the number of constraints that must be satisfied. Later we shall show that, in addition to the mean and covariance, we can match the Kollo skewness and kurtosis by placing only k + 2 constraints on the kth column of Smn. And if we only wish to target the first three moments, there are just two constraints on each column of Smn. In total, instead of n(n+5)/2 equations to solve for matching the first three moments, or n(n+3) equations for the first four moments, now we only need to solve 2n and n(n+5)/2 equations, respectively. For instance, in a 10-dimensional system, we need to solve 20 equations to match the mean, covariance, and Kollo skewness or 75 equations to additionally match the Kollo kurtosis. Of course, the dimension reduction increases with n; e.g., when n = 20 the first three moments constraints reduce from 250 to 40, and the first four moments constraints reduce from 460 to 250.

Our method of solution requires that we fix the first k − 1 columns of $S_{mn}$ before we generate the matrix $\Theta^k_{m,l_k}$. We then generate a rectangular orthonormal matrix with dimensions $m\times N$ whose first k columns are equal to $m^{-1/2}1_m$ and $m^{-1/2}s_1^c,\dots,m^{-1/2}s_{k-1}^c$. The columns of the matrix $\Theta^k_{m,l_k}$ are then randomly selected from the other $N-k$ columns of this matrix. That is, we take a random sample (without replacement) of columns j with $j>k$, and the selected columns $\theta_j$, for $j=1,\dots,l_k$, are then used to form the orthonormal matrix $\Theta^k_{m,l_k}$. Note that we require $N\le m$, because there are m elements in each column and so no more than m columns can be mutually orthogonal. This places an upper bound on $l_k$, because $l_k\le N-k\le m-k$.

There is a close relationship between $\Theta^k_{m,l_k}$ and the L matrices (Ledermann et al., 2011). In fact, we can immediately obtain $\Theta^k_{m,l_k}$ by randomly extracting $l_k$ columns from the last $N-(k-1)$ columns of an L matrix whose first k − 1 columns are $m^{-1/2}s_1^c,\dots,m^{-1/2}s_{k-1}^c$.
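In code, a matrix $\Theta^k_{m,l_k}$ satisfying Lemma 1 can be drawn by projecting random vectors away from $1_m$ and the columns already fixed, then orthonormalizing. A sketch (our own illustration; the function name is ours):

```python
import numpy as np

def make_theta(m, l_k, prev_cols, rng):
    """Draw an (m x l_k) orthonormal matrix orthogonal to 1_m and to prev_cols."""
    basis = np.column_stack([np.ones(m)] + list(prev_cols))
    Q, _ = np.linalg.qr(basis)          # orthonormal span of the constraint vectors
    T = rng.standard_normal((m, l_k))
    T -= Q @ (Q.T @ T)                  # remove components along 1_m, s_1, ..., s_{k-1}
    Theta, _ = np.linalg.qr(T)          # now 1_m' Theta = 0 and Theta' Theta = I
    return Theta
```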

Now we turn to the actual solution of (9) using the construction (10), starting with the simpler case of matching the first three moments only. Recall that $s_1^c$ is different from the other columns because it involves a cubic equation, which is easy to see by rewriting the conditions (9a)–(9d) for $s_1^c$ as:

$$\sum_{i=1}^{m}s_{i1}=0,\qquad \sum_{i=1}^{m}s_{i1}^2=m,\qquad \sum_{i=1}^{m}s_{i1}^3=m\tilde\tau_1. \tag{11}$$

The following Theorem explains how to solve for $s_1^c$, the first column of $S_{mn}$, from which the other columns are generated in an iterative manner:

Theorem 1.

Suppose the kth column of $S_{mn}$ satisfies (10) with $l_k\ge 2$, that $\Theta^k_{m,l_k}=[\theta_1^k,\dots,\theta_{l_k}^k]$ satisfies the conditions of Lemma 1, and that $s_1^c,\dots,s_{k-1}^c$ satisfy conditions (9a) to (9d). Then: by Lemma 1, $s_k^c$ satisfies conditions (9a) and (9b); it also satisfies condition (9c) if

$$c_k'c_k=m\quad\text{for }k=1,\dots,n; \tag{12}$$

and the condition (9d) also holds if

$$\text{for }k=1:\quad 1_m'\left[(\Theta^1_{m,l_1}c_1)\circ(\Theta^1_{m,l_1}c_1)\circ(\Theta^1_{m,l_1}c_1)\right]=m\tilde\tau_1, \tag{13a}$$

$$\text{for }k=2,\dots,n:\quad 1_m'B^k_{m,l_k}c_k=m\tilde\tau_k, \tag{13b}$$

where $B^k_{m,l_k}=[b_1^k,\dots,b_{l_k}^k]=[s_1^c\circ s_1^c\circ\theta_1^k,\dots,s_1^c\circ s_1^c\circ\theta_{l_k}^k]$.

The pseudocode in Algorithm 1 summarizes how to generate a sample $X_{mn}$ that matches any given mean vector, covariance matrix, and Kollo skewness vector.

Algorithm 1.

Matching the First Three Moments

1: function ROM3($m,\mu,\Sigma,\tau$)      ⊳ m: number of observations; $\mu,\Sigma,\tau$: targeted first three moments
2:   $n=\mathrm{length}(\tau)$
3:   Obtain $A_n$ where $A_n'A_n=\Sigma$
4:   Generate $\Omega_n$            ⊳ $\Omega_n$ is an $n\times n$ rotation matrix satisfying $1_n'\Omega_n=\sqrt{n}\,[1,0,\dots,0]$
5:   $\tilde\tau=n^{-1}\Omega_n^{-1}\tau$                      ⊳ Obtain the rotated Kollo skewness
6:   while Theorem 1 conditions (12) and (13a) for $c_1$ do not hold do    ⊳ Solve for the first column
7:     Generate $l_1$ orthonormal vectors to form $\Theta^1_{m,l_1}$ satisfying Lemma 1
8:     Obtain real solutions for the coefficients $c_1$ restricted by Equations (12) and (13a)
9:   Obtain $s_1^c=\Theta^1_{m,l_1}c_1$
10:  for $k=2,\dots,n$ do
11:    while Theorem 1 conditions (12) and (13b) for $c_k$ do not hold do
12:      Generate $l_k$ orthonormal vectors to form $\Theta^k_{m,l_k}$ satisfying Lemma 1
13:      Obtain real solutions for the coefficients $c_k$ restricted by Equations (12) and (13b)
14:    Obtain $s_k^c=\Theta^k_{m,l_k}c_k$
15:  Generate permutation matrix $Q_m$ if required
16:  $X_{mn}=1_m\mu'+Q_mS_{mn}\Omega_n'A_n$        ⊳ Generate a sample with exact first three moments
17:  return $X_{mn}$

The algorithm solves for $S_{mn}$ iteratively, column by column, while building the columns of an orthonormal matrix $\Theta^k_{m,l_k}$ that yields a real-valued solution for $c_k$. This means that, to obtain an $s_k^c$ satisfying conditions (9a)–(9d), the first k − 1 columns of $S_{mn}$ must already be known and satisfy conditions (9a)–(9d). So we first obtain an orthonormal matrix $\Theta^1_{m,l_1}$ satisfying Lemma 1 and then obtain the coefficients $c_1$ by solving the quadratic Equation (12) and the cubic Equation (13a) simultaneously. Next, we generate $s_k^c$ for $k=2,\dots,n$. Now $c_k$ should satisfy the quadratic system (12) and the linear system (13b) simultaneously.

We require $l_k\ge 2$ because the elements of $c_k$ are always restricted by two equations: for k = 1 the systems (12) and (13a), and for k > 1 the systems (12) and (13b), are both exactly determined when $l_k=2$; but when $l_k>2$ the systems are under-determined, which makes it easier to find a real-valued solution for $c_k$.
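For the first column, conditions (12) and (13a) amount to two non-linear equations in the $l_1$ elements of $c_1$; with $l_1=2$ they can be handed to a standard root finder. A hedged sketch of this first-column step (ours, using SciPy rather than any routine from the paper's code):

```python
import numpy as np
from scipy.optimize import fsolve

def solve_c1(Theta, m, tau1_tilde, rng):
    """Try to solve (12) and (13a) for c1 given Theta (m x 2); returns None on failure."""
    def eqs(c):
        v = Theta @ c
        return [c @ c - m,                        # Equation (12)
                np.sum(v ** 3) - m * tau1_tilde]  # Equation (13a)
    c1, _, ier, _ = fsolve(eqs, rng.standard_normal(2) * np.sqrt(m / 2),
                           full_output=True)
    return c1 if ier == 1 else None               # regenerate Theta and retry if None
```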

While Algorithm 1 is both general and flexible, finding a real-valued solution involves an element of trial and error. The following Corollary 1 considers the special case of Theorem 1 with $l_k=2$ and $k\ge 2$. It is useful because it derives a single criterion for the existence of a real-valued solution to the system of Equations (12) and (13b), and an analytical formula for each $2\times 1$ vector $c_k=[c_1^k,c_2^k]'$ when such a solution exists.

Corollary 1.

When $l_k=2$ and $k\ge 2$, the system of Equations (12) and (13b) has a real solution if

$$1_m'B^k_{m,2}(B^k_{m,2})'1_m\ge m\tilde\tau_k^2, \tag{14}$$

and when condition (14) holds the solution is:

$$c_1^k=\frac{\tilde b_1^k m\tilde\tau_k\pm\tilde b_2^k\sqrt{\left[(\tilde b_1^k)^2+(\tilde b_2^k)^2\right]m-m^2\tilde\tau_k^2}}{(\tilde b_1^k)^2+(\tilde b_2^k)^2},\qquad c_2^k=\frac{\tilde b_2^k m\tilde\tau_k\mp\tilde b_1^k\sqrt{\left[(\tilde b_1^k)^2+(\tilde b_2^k)^2\right]m-m^2\tilde\tau_k^2}}{(\tilde b_1^k)^2+(\tilde b_2^k)^2}, \tag{15}$$

where $\tilde b_1^k$ and $\tilde b_2^k$ are the first and second elements of $1_m'B^k_{m,2}$.

Corollary 1 greatly improves the computational efficiency of the new method. Setting $l_k=2$ for k > 1 means that the matrix has only two columns, which are simple to orthogonalize. Moreover, it is very quick to repeatedly generate orthonormal matrices $\Theta^k_{m,2}$ for k > 1 until we find those which satisfy the criterion (14), and then we can immediately obtain values for $c_k$ using the analytical formula (15). In this way, we do not need to search for a numerical solution of the simultaneous Equations (12) and (13b).
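Corollary 1 turns the column step into pure arithmetic. A direct transcription in Python (our own coding of Equations (14) and (15), taking the "+" branch of the $\pm$ sign):

```python
import numpy as np

def solve_ck_analytic(B, m, tau_k_tilde):
    """Solve (12) and (13b) for c_k when l_k = 2; B is the (m x 2) matrix B_{m,2}^k."""
    b1, b2 = np.ones(B.shape[0]) @ B            # elements of 1_m' B
    denom = b1 ** 2 + b2 ** 2
    disc = denom * m - (m * tau_k_tilde) ** 2   # negative iff condition (14) fails
    if disc < 0:
        return None                             # regenerate Theta and try again
    root = np.sqrt(disc)
    return np.array([(b1 * m * tau_k_tilde + b2 * root) / denom,
                     (b2 * m * tau_k_tilde - b1 * root) / denom])
```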

Next, we consider simulations matching the mean, covariance, Kollo skewness, and Kollo kurtosis. When matching only the first three moments, there is no reason why we should not always set $l_k=2$ and use Corollary 1. Then the moment-matching algorithm is very fast, being fully analytic. However, the next Theorem, which explains how to match the first four moments, requires $l_k\ge k+2$. Again we begin by generating the first column of $S_{mn}$, because it requires higher-order conditions than the subsequent columns. To target the first four moments, $s_1^c$ should satisfy not only conditions (11) but also a quartic condition for matching Kollo kurtosis:

$$\sum_{i=1}^{m}s_{i1}^4=m\tilde\kappa_{11}, \tag{16}$$

where $\tilde\kappa_{11}$ is the first element in the first row of $\tilde K_n$. Theorem 2 provides the necessary and sufficient conditions on $c_k$ for $S_{mn}$ to satisfy conditions (9).

Theorem 2.

Suppose the kth column of $S_{mn}$ satisfies (10) with $l_k\ge k+2$, that $\Theta^k_{m,l_k}=[\theta_1^k,\dots,\theta_{l_k}^k]$ satisfies the conditions of Lemma 1, and that $s_1^c,\dots,s_{k-1}^c$ satisfy conditions (9). Then $s_k^c$ satisfies:

  1. the conditions (9a) and (9b), by Lemma 1;

  2. the condition (9c) if condition (12) holds;

  3. the condition (9d) if condition (13) holds;

  4. the condition (9e) if

for k = 1:
$$1_m'\left[(\Theta^1_{m,l_1}c_1)\circ(\Theta^1_{m,l_1}c_1)\circ(\Theta^1_{m,l_1}c_1)\circ(\Theta^1_{m,l_1}c_1)\right]=m\tilde\kappa_{11}, \tag{17a}$$

for $k=2,\dots,n$:
$$c_k'(D^k_{m,l_k})'D^k_{m,l_k}c_k=m\tilde\kappa_{kk}, \tag{17b}$$
$$\text{and}\quad (H^k_{m,k-1})'\Theta^k_{m,l_k}c_k=m\tilde\kappa_k^{(k-1)}, \tag{17c}$$

where the $(k-1)\times 1$ column vector $\tilde\kappa_k^{(k-1)}$ contains the first k − 1 elements of the kth column of $\tilde K_n$, $D^k_{m,l_k}=[s_1^c\circ\theta_1^k,\dots,s_1^c\circ\theta_{l_k}^k]$ and $H^k_{m,k-1}=[s_1^c\circ s_1^c\circ s_1^c,\dots,s_1^c\circ s_1^c\circ s_{k-1}^c]$.

Algorithm 2 is the pseudocode for implementing the above results. It yields a realisation $X_{mn}$ of m observations on n random variables which matches any given target mean, covariance, Kollo skewness, and Kollo kurtosis. To obtain an $s_k^c$ satisfying all the conditions (9), the first k − 1 columns of $S_{mn}$ must already be known and satisfy conditions (9). For this, we first obtain an orthonormal matrix $\Theta^1_{m,l_1}$ satisfying Lemma 1 and then obtain the coefficients $c_1$ by solving the quadratic Equation (12), the cubic Equation (13a), and the quartic Equation (17a) simultaneously. Next, we generate $s_k^c$ in an iterative manner, from k = 2 to k = n. Now the vector $c_k$ should simultaneously satisfy k + 2 equations, comprising the two quadratic Equations (12) and (17b) and the k linear Equations (13b) and (17c).

Algorithm 2.

Matching the First Four Moments

1: function ROM4($m,\mu,\Sigma,\tau,K$)    ⊳ m: number of observations; $\mu,\Sigma,\tau,K$: targeted first four moments
2:   $n=\mathrm{length}(\mu)$
3:   Obtain $A_n$ where $A_n'A_n=\Sigma$
4:   Generate $\Omega_n$           ⊳ $\Omega_n$ is an $n\times n$ rotation matrix satisfying $1_n'\Omega_n=\sqrt{n}\,[1,0,\dots,0]$
5:   $\tilde\tau=n^{-1}\Omega_n^{-1}\tau$                      ⊳ Obtain the rotated Kollo skewness
6:   $\tilde K=n^{-1}\Omega_n^{-1}K\Omega_n$                     ⊳ Obtain the rotated Kollo kurtosis
7:   while Theorem 2 does not hold do                 ⊳ Solve for the first column
8:     Generate $l_1$ orthonormal vectors to form $\Theta^1_{m,l_1}$ satisfying Lemma 1
9:     Obtain real solutions for the coefficients $c_1$ restricted by Equations (12), (13a), and (17a)
10:  Obtain $s_1^c=\Theta^1_{m,l_1}c_1$
11:  for $k=2,\dots,n$ do ⊳ Solve for the other columns
12:    while Theorem 2 does not hold do
13:      Generate $l_k$ orthonormal vectors to form $\Theta^k_{m,l_k}$ satisfying Lemma 1
14:      Obtain real solutions for the coefficients $c_k$ restricted by Equations (12), (13b), (17b), and (17c)
15:    Obtain $s_k^c=\Theta^k_{m,l_k}c_k$
16:  Generate permutation matrix $Q_m$ if required
17:  $X_{mn}=1_m\mu'+Q_mS_{mn}\Omega_n'A_n$ ⊳ Generate a sample with exact first four moments
18:  return $X_{mn}$

We require $l_k\ge k+2$ because the elements of $c_k$ are always restricted by k + 2 equations: for k = 1 the systems (12), (13a), and (17a), and for k > 1 the systems (12), (13b), (17b), and (17c), are exactly determined when $l_k=k+2$; but when $l_k>k+2$ the systems are under-determined, which makes it easier to find a real-valued solution for $c_k$.
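For k ≥ 2, the column step of Algorithm 2 solves the k + 2 conditions (12), (13b), (17b), and (17c) for the $l_k=k+2$ elements of $c_k$. A hedged numerical sketch of this step (our own illustration using SciPy; the argument kappa_col stacks $\tilde\kappa_{1k},\dots,\tilde\kappa_{k-1,k},\tilde\kappa_{kk}$):

```python
import numpy as np
from scipy.optimize import fsolve

def solve_ck_kurtosis(Theta, s1, prev_cols, m, tau_k, kappa_col, rng):
    B = (s1 ** 2)[:, None] * Theta                           # columns s1 o s1 o theta_j
    D = s1[:, None] * Theta                                  # columns s1 o theta_j
    H = np.column_stack([(s1 ** 2) * s for s in prev_cols])  # columns s1 o s1 o s_i
    ones = np.ones(m)
    def eqs(c):
        Dc = D @ c
        return np.concatenate((
            [c @ c - m,                                 # Equation (12)
             ones @ (B @ c) - m * tau_k,                # Equation (13b)
             Dc @ Dc - m * kappa_col[-1]],              # Equation (17b)
            H.T @ (Theta @ c) - m * kappa_col[:-1]))    # Equation (17c)
    ck, _, ier, _ = fsolve(eqs, rng.standard_normal(Theta.shape[1]), full_output=True)
    return ck if ier == 1 else None                     # regenerate Theta and retry if None
```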

Finally, we note the following property of the Kollo kurtosis (Note 6). Let $M^{(k)}$ be a matrix of dimension $m_k\times n$, with $m_k\ge n+2$, satisfying conditions (3), and denote the ith row of $M^{(k)}$ by $m_i^{(k)r}$, for $i=1,\dots,m_k$ and $k=1,\dots,N$, where N is the number of concatenated samples. Then the Kollo kurtosis of the concatenated sample is the weighted sum of their Kollo kurtoses:

$$K\!\left(\left[\begin{matrix}M^{(1)}\\ \vdots\\ M^{(N)}\end{matrix}\right]\right)=\frac{\sum_{i=1}^{m_1}(m_i^{(1)r}1_n)^2\,m_i^{(1)r\prime}m_i^{(1)r}+\cdots+\sum_{i=1}^{m_N}(m_i^{(N)r}1_n)^2\,m_i^{(N)r\prime}m_i^{(N)r}}{m_1+\cdots+m_N}=\frac{m_1K(M^{(1)})+\cdots+m_NK(M^{(N)})}{m_1+\cdots+m_N}. \tag{18}$$

Hence, Kollo kurtosis is invariant under sample concatenation if each sub-matrix M(k) has identical Kollo kurtosis.
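Since Equation (18) is just a splitting of the defining sum, the weighted-average property is easy to verify numerically. A short check (our own; it uses raw unrestandardized sums, which is all the identity requires):

```python
import numpy as np

def kollo_kurtosis_raw(M):
    s2 = M.sum(axis=1) ** 2                      # (m_i^r 1_n)^2 for each row
    return (M * s2[:, None]).T @ M / len(M)

rng = np.random.default_rng(0)
M1, M2 = rng.standard_normal((200, 4)), rng.standard_normal((300, 4))
lhs = kollo_kurtosis_raw(np.vstack([M1, M2]))    # kurtosis of the concatenated sample
rhs = (200 * kollo_kurtosis_raw(M1) + 300 * kollo_kurtosis_raw(M2)) / 500
assert np.allclose(lhs, rhs)                     # Equation (18)
```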

5. Discussion

This paper derives necessary and sufficient analytic conditions for exactly matching the mean, covariance, Kollo skewness, and Kollo kurtosis of multivariate random variables. The necessary and sufficient conditions (9) form a highly under-determined system. Even without the condition (9e) for matching Kollo kurtosis, it is already a complex problem to find columns of $S_{mn}$ which satisfy (9a)–(9d), because of the skewness condition (9d). This explains why Kollo skewness matching methods, such as the algorithm proposed by Alexander et al. (2022), are much slower than those matching only the first two moments exactly, such as Ledermann et al. (2011). Adding the conditions (9e) creates even more of a challenge: there are already k + 2 constraints on the kth column for Kollo skewness matching and an extra set of k conditions for Kollo kurtosis matching, so that in total, for each column $k=1,\dots,n$, there are 2k + 2 constraints on m observations, making a total of $\sum_{k=1}^{n}(2k+2)=n(n+1)+2n=n(n+3)$ conditions.

In response to this issue, we propose the novel idea of decomposing $S_{mn}$ in Equation (7) into the product of an orthonormal matrix and a vector of coefficients. We use the necessary and sufficient conditions to solve for the coefficients which generate the columns of $S_{mn}$ as linear combinations of the orthonormal column vectors, in such a way that the moment-matching conditions are satisfied. The orthonormal column vectors are restricted by the conditions provided in Lemma 1, which helps to reduce the number of equations in the non-linear system. With this construction, simulations matching Kollo skewness become highly efficient; moreover, simulations matching Kollo kurtosis become practically possible, although in high-dimensional systems the kurtosis matrix contains many elements to match, which greatly increases the difficulty of exact simulation (Note 7).

Our method differs from Alexander et al. (2022) and Hanke et al. (2017), which solve conditions (9) for simulations matching up to Kollo skewness. Hanke et al. (2017) set the undetermined elements in the equation system equal to arbitrary values, so that the equation system becomes a determined system that is potentially solvable. Their method thus requires a trial-and-error process to find arbitrary values, which can be very slow. Moreover, it can fail completely, and even when it works the simulations may have marginal densities that are undesirable for most real-world applications. Noting these issues, Alexander et al. (2022) propose a Random Orthogonal Matrix simulation method with exact Kollo skewness which draws the arbitrary values either from a statistical bootstrap of real-world data or from parametric multivariate distributions. They also provide the necessary and sufficient conditions on the arbitrary values for Kollo skewness matching to be possible (Note 8). Such modifications significantly improve the theoretical and computational properties of the simulated samples, but their algorithms still require arbitrary values, which often produce marginals with strange characteristics. This problem arises because the choice of distribution used to generate the random values has a significant impact on the success of solving for the remaining elements. It was discussed in depth by Alexander et al. (2022), who show that reducing the variance of the chosen distribution will ensure a low failure rate. But then, by definition, the remaining elements will have more extreme values, so that the first few columns, viewed as samples from the marginal distributions of the first few random variables, could be overly leptokurtic. In this case the sample characteristics may not be ideal for many applications, especially when the target values for skewness or kurtosis are large. Our method bypasses this problem completely. With the use of an orthonormal basis, we remove the need to allocate arbitrary values to elements of $S_{mn}$. Instead, we can obtain the orthonormal matrices $\Theta^k_{m,l_k}$ in such a way that $S_{mn}$ has consistent characteristics, which provides better quality data, as will be demonstrated below.

Furthermore, direct simulations matching kurtosis become practically impossible when n is large. For instance, when n = 30 the Alexander et al. algorithm requires the solution of 30 different systems of non-linear equations, the smallest being a system of 3 equations (k = 1, i.e., for the first column) and the largest being a system of 32 equations (k = 30, i.e., the last column). In total, one needs to solve $\sum_{k=1}^{n}(k+2)=525$ non-linear equations to target Kollo skewness. But if we were to extend the system to target Kollo kurtosis directly, when n = 30 there would be $\sum_{k=1}^{n}(2k+2)=990$ non-linear equations to solve, which may even be beyond the reach of the high-performance computers in use today. Our method provides a practical solution to this problem because the orthonormal basis greatly reduces the number of equations that need to be solved. When matching Kollo skewness, our algorithm only needs to solve one quadratic equation and one cubic (k = 1) or linear ($k=2,\dots,n$) equation. In addition, Corollary 1 gives an analytical solution for each column of $S_{mn}$ once the first column is available. By contrast, the algorithm proposed by Alexander et al. requires a numerical solution for one quadratic equation and k + 1 linear equations. When matching both Kollo skewness and Kollo kurtosis, our algorithm is much slower because it involves solving two quadratic equations and k linear equations for each column of $S_{mn}$, and it requires numerical methods to solve a higher-dimensional, non-linear system. Nevertheless, it is the only algorithm that can target the first four multivariate moments exactly.

We finish with a brief demonstration of the flexibility and efficiency of our moment-matching results. Not all matrices $\Theta^1_{m,l_1}$ admit a solution to the conditions of Theorem 1 or 2, so the speed of the general Algorithms 1 and 2 depends on the time it takes to find matrices $\Theta^k_{m,l_k}$ for which real-valued vectors $c_k$ satisfying Theorem 1 or 2 exist. Similarly, for $k=2,\dots,n$ each $\Theta^k_{m,l_k}$ may need to be regenerated several times until we obtain a real-valued $c_k$ that satisfies the conditions of Theorem 1 or 2. Although $l_k$ need only be at least 2 when matching the first three moments, and at least k + 2 when matching the first four moments, if computational speed becomes an issue we prefer to use a greater value for $l_k$, because it is easier to find a real-valued solution for an under-determined system than for an exactly determined one. But for matching the first three moments, Corollary 1 provides a simple set of conditions to check if we set $l_k=2$, as well as providing the analytical expressions for the elements of $c_k$. The computation times reported in Figure 1 use $l_k=2$ for matching the first three moments and $l_k=k+4$ for matching the first four moments.

Figure 1. Empirical cumulative distribution functions of simulations matching cryptocurrency data. Reported in the left panels, the empirical distribution functions are based on simulations using the algorithm of Alexander et al. (2022) (KROM) and our algorithms (Algorithm 1 for targeting the first three moments and Algorithm 2 for targeting all four moments) with 1000 observations on three random variables. The target moments are derived from a sample of cryptocurrency data with 720 observations, from 08:00 4 April to 08:00 4 May 2021, UTC time, with Kollo skewness $\tau=[-1.02,-0.03,0.93]'$ and Kollo kurtosis $K=[7.79,3.97,2.75;3.97,9.30,2.44;2.75,2.44,13.08]$. For simplicity, we standardize all data before setting the targets for simulation and therefore set $\mu_n=0_n$, $\Sigma_n=I_n$ and $Q_m=I_m$, so that $X_{mn}=S_{mn}\Omega_n'$. We also employ the rotation matrix shown in Section 3 for all simulations. The black lines in the CDF plots are the empirical distribution functions of the real data on the corresponding variable, fitted using the Gaussian kernel. We treat these lines as the target marginal densities for the KS tests. The example simulations are a typical case of simulation trials that pass the KS test at the significance level of 10%. For simulations using our algorithms, the orthonormal basis is generated using the bootstrapped sample. For simulations using the Alexander et al. algorithm, the arbitrary values in each column are also randomly drawn from the real data, and we set the parameter $\sigma^2=0.8$ or 0.6. In the right panel, the key characteristics table reports the KS statistics and p-values for the marginals depicted on the left. The table also reports the mean and standard deviation (in brackets) of the average computation time (ACT, in seconds per simulation), computed using the total computational time over 100 trials and depicted in the computational time figure below the table; the red line depicts the computation time of KROM 0.8 and is set against the left axis; the blue and green lines depict the computation times of KROM 0.6 and of our algorithm matching the first three moments, and are set against the right axis. The machine used is an Intel Core i5-8250U CPU, 3.41 GHz, with 8 GB RAM, running Python under Windows.

Figure 1 presents an empirical comparison of the cumulative distribution functions (CDFs) generated by our algorithms, matching either the Kollo skewness only or both the Kollo skewness and kurtosis, against those generated by the algorithm of Alexander et al. (2022) (Note 9). The Kollo skewness and kurtosis targets are derived empirically from the hourly cryptocurrency returns data. These moments have moderately large elements everywhere, which makes them suitable for illustrating the two problems discussed above. The figure also reports the results of the Kolmogorov–Smirnov (KS) test, which quantifies the distance between the simulated and targeted samples, and the computation time for simulations matching the Kollo skewness only, since we have no benchmark comparison for matching Kollo kurtosis.

First, the results replicate Alexander et al.'s finding that a more constrained selection of arbitrary values (i.e., $\sigma^2=0.6$, in blue, relative to $\sigma^2=0.8$, in red) generates much more leptokurtic samples but is much faster for sample generation. So users must choose between a better-fitted sample and a faster simulation. Note that the example distribution functions reported in Figure 1 are selected from “qualified” trials, i.e., trials that pass the KS test at the significance level of 10%. Yet the distribution functions generated using Alexander et al.'s algorithm with both $\sigma^2=0.6$ and $\sigma^2=0.8$ still appear leptokurtic. By comparison, samples generated by our algorithms for matching Kollo skewness and kurtosis do not suffer from this leptokurtic problem. In addition, our algorithm for matching Kollo skewness can be a hundred times faster when targeting suitably distributed samples: it takes 0.08 s per simulation for our algorithm versus 8.04 s per simulation for Alexander et al.'s algorithm with $\sigma^2=0.8$. The computation time for matching all four moments is not reported because it is always much slower than the other algorithms. This is in line with expectations, because our algorithm requires at least k + 2 degrees of freedom when solving the kth column of $S_{mn}$.

Finally, we emphasize that the more multivariate moments are matched exactly, the more information is preserved by ROM simulation. Our research thus opens an interesting avenue for future work on matching Kollo kurtosis as well as Kollo skewness, namely to look for an analytical solution for each column of $S_{mn}$ once its first column is available. Similar to the impact of Corollary 1, which significantly speeds up our algorithm for matching Kollo skewness, such a solution should further enhance our algorithm for matching all four moments, so that such simulations become not just practically possible, as this paper demonstrates, but much faster to use in practice.

Correction Statement

This article has been corrected with minor changes. These changes do not impact the academic content of the article.


Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 This category of measures includes the Malkovich and Afifi (1973) measures, which were developed specifically for normality tests and were extended by Balakrishnan et al. (2007) and Srivastava (1984); the Isogai (1982) measure, which is an extension of the Pearson (1895) mode skewness; and the Song (2001) kurtosis measure, which is based on Rényi entropy.

2 An exception is the Srivastava (1984) measures, which are much easier to compute, but they too cannot be applied to moment-matching simulations. This is because they involve a principal component transformation, which is easy to obtain from a given sample, but it is difficult to generate a sample when we only know the Srivastava moments.

3 The Koizumi et al. measures are defined as the sum of the Mardia skewness and kurtosis of all sub-vectors of $X_n$, but they are complex to compute because the number of sub-vectors increases rapidly with the dimension of $X_n$.

4 To contrast the number of elements: in a 5-dimensional distribution we have 35 and 70 elements in the full skewness and kurtosis tensors, but only 5 and 15 elements in Kollo’s measures.

5 See Ledermann et al. (2011) for discussion of the usual conditions for rotation matrices. Ledermann and Alexander (2012) further investigate the sampling properties of different types of rotation matrices.

6 Alexander et al. (2022) note a similar property for Kollo skewness.

7 In both cases our method may be regarded as a generalization of the original ROM simulation to higher-moment matching, because using only one orthonormal vector to generate each column of $S_{mn}$ reduces our method to a ROM simulation which only targets the mean vector and covariance matrix.

8 If the constraints are not satisfied, the arbitrary values require re-generation until the equation system becomes solvable with these values.

9 We do not include the algorithm of Hanke et al. (2017) because Alexander et al. (2022) have already demonstrated that the modifications provided by their algorithm produce marginals with superior characteristics and that it is much more computationally efficient. The Hanke et al. method may also suffer from a high failure rate.

References

  • Alexander, C., Meng, X., & Wei, W. (2022). Targeting Kollo skewness with random orthogonal matrix simulation. European Journal of Operational Research, 299(1), 362–376. https://doi.org/10.1016/j.ejor.2021.09.003
  • Balakrishnan, N., Brito, M., & Quiroz, A. (2007). A vectorial notion of skewness and its use in testing for multivariate symmetry. Communications in Statistics-Theory and Methods, 36(9), 1757–1767. https://doi.org/10.1080/03610920601126225
  • Bardsley, E. (2017). A finite mixture approach to univariate data simulation with moment matching. Environmental Modelling & Software, 90, 27–33. https://doi.org/10.1016/j.envsoft.2016.11.019
  • Ching, J., Porter, K. A., & Beck, J. L. (2009). Propagating uncertainties for loss estimation in performance-based earthquake engineering using moment matching. Structure and Infrastructure Engineering, 5(3), 245–262. https://doi.org/10.1080/15732470701298323
  • Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1), 1–26. https://doi.org/10.1214/aos/1176344552
  • Ferraz, V., & Moura, F. (2012). Small area estimation using skew normal models. Computational Statistics & Data Analysis, 56(10), 2864–2874. https://doi.org/10.1016/j.csda.2011.07.005
  • Foldnes, N., & Olsson, U. H. (2016). A simple simulation technique for nonnormal data with prespecified skewness, kurtosis, and covariance matrix. Multivariate Behavioral Research, 51(2–3), 207–219. https://doi.org/10.1080/00273171.2015.1133274
  • Hanke, M., Penev, S., Schief, W., & Weissensteiner, A. (2017). Random orthogonal matrix simulation with exact means, covariances, and multivariate skewness. European Journal of Operational Research, 263(2), 510–523. https://doi.org/10.1016/j.ejor.2017.05.023
  • Heath, A., Manolopoulou, I., & Baio, G. (2018). Efficient Monte Carlo estimation of the expected value of sample information using moment matching. Medical Decision Making, 38(2), 163–173. https://doi.org/10.1177/0272989X17738515
  • Hertz, D. B. (1964). Risk analysis in capital investment. Harvard Business Review, 42, 95–106.
  • Høyland, K., Kaut, M., & Wallace, S. W. (2003). A heuristic for moment-matching scenario generation. Computational Optimization and Applications, 24(2/3), 169–185. https://doi.org/10.1023/A:1021853807313
  • Isogai, T. (1982). On a measure of multivariate skewness and a test for multivariate normality. Annals of the Institute of Statistical Mathematics, 34(3), 531–541. https://doi.org/10.1007/BF02481051
  • Jobson, J. D., & Korkie, B. (1980). Estimation for Markowitz efficient portfolios. Journal of the American Statistical Association, 75(371), 544–554. https://doi.org/10.1080/01621459.1980.10477507
  • Koizumi, K., Sumikawa, T., & Pavlenko, T. (2014). Measures of multivariate skewness and kurtosis in high-dimensional framework. Journal of Mathematics, 50(2), 483–511.
  • Kollo, T. (2008). Multivariate skewness and kurtosis measures with an application in ICA. Journal of Multivariate Analysis, 99(10), 2328–2338. https://doi.org/10.1016/j.jmva.2008.02.033
  • Kuruvilla, J., & Gunavathi, K. (2014). Lung cancer classification using neural networks for CT images. Computer Methods and Programs in Biomedicine, 113(1), 202–209. https://doi.org/10.1016/j.cmpb.2013.10.011
  • Ledermann, D., & Alexander, C. (2012). Further properties of random orthogonal matrix simulation. Mathematics and Computers in Simulation, 83, 56–79. https://doi.org/10.1016/j.matcom.2012.07.013
  • Ledermann, W., Alexander, C., & Ledermann, D. (2011). Random orthogonal matrix simulation. Linear Algebra and Its Applications, 434(6), 1444–1467. https://doi.org/10.1016/j.laa.2010.10.023
  • Ledoit, O., & Wolf, M. (2004). Honey, I shrunk the sample covariance matrix. The Journal of Portfolio Management, 30(4), 110–119. https://doi.org/10.3905/jpm.2004.110
  • Looney, C. G. (1983). Monte Carlo simulation with moment matching samples. Mathematics and Computers in Simulation, 25(3), 237–240. https://doi.org/10.1016/0378-4754(83)90099-X
  • Love, P. E., Sing, C. P., Wang, X., Edwards, D. J., & Odeyinka, H. (2013). Probability distribution fitting of schedule overruns in construction projects. Journal of the Operational Research Society, 64(8), 1231–1247. https://doi.org/10.1057/jors.2013.29
  • Malkovich, J. F., & Afifi, A. (1973). On tests for multivariate normality. Journal of the American Statistical Association, 68(341), 176–179. https://doi.org/10.1080/01621459.1973.10481358
  • Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519–530. https://doi.org/10.1093/biomet/57.3.519
  • Mitchell, A. E. (1989). The information matrix, skewness tensor and α-connections for the general multivariate elliptic distribution. Annals of the Institute of Statistical Mathematics, 41(2), 289–304. https://doi.org/10.1007/BF00049397
  • Móri, T. F., Rohatgi, V. K., & Székely, G. (1994). On multivariate skewness and kurtosis. Theory of Probability & Its Applications, 38(3), 547–551. https://doi.org/10.1137/1138055
  • Oberkampf, W. L., DeLand, S. M., Rutherford, B. M., Diegert, K. V., & Alvin, K. F. (2002). Error and uncertainty in modeling and simulation. Reliability Engineering & System Safety, 75(3), 333–357. https://doi.org/10.1016/S0951-8320(01)00120-X
  • Pearson, K. (1895). X. Contributions to the mathematical theory of evolution.—II. Skew variation in homogeneous material. Philosophical Transactions of the Royal Society of London A, 186, 343–414.
  • Pellegrino, T., & Sabino, P. (2014). On the use of the moment-matching technique for pricing and hedging multi-asset spread options. Energy Economics, 45, 172–185. https://doi.org/10.1016/j.eneco.2014.06.014
  • Qu, W., Liu, H., & Zhang, Z. (2020). A method of generating multivariate non-normal random numbers with desired multivariate skewness and kurtosis. Behavior Research Methods, 52(3), 939–946. https://doi.org/10.3758/s13428-019-01291-5
  • Ryoo, H. S. (2007). A compact mean-variance-skewness model for large-scale portfolio optimization and its application to the NYSE market. Journal of the Operational Research Society, 58(4), 505–515. https://doi.org/10.1057/palgrave.jors.2602168
  • Saliby, E., & Paul, R. (2009). A farewell to the use of antithetic variates in Monte Carlo simulation. Journal of the Operational Research Society, 60(7), 1026–1035. https://doi.org/10.1057/palgrave.jors.2602645
  • Song, K. (2001). Rényi information, loglikelihood and an intrinsic distribution measure. Journal of Statistical Planning and Inference, 93(1-2), 51–69. https://doi.org/10.1016/S0378-3758(00)00169-5
  • Srivastava, M. S. (1984). A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Statistics & Probability Letters, 2(5), 263–267. https://doi.org/10.1016/0167-7152(84)90062-2

Appendix A.

Real-world examples

We illustrate the Kollo skewness and kurtosis, and compare them with the popular Mardia measures, using a set of three hourly cryptocurrency returns (bitcoin, ether, and litecoin) and the eleven U.S. stock sector index daily returns, of which we show just three sectors (energy, finance, and real estate), although the Kollo measures are computed from the entire eleven-dimensional distribution. The returns are presented in Figure A1.

Figure A1. Time series of financial returns. Panel (a) exhibits the hourly returns on three cryptocurrencies: bitcoin (BTC), ether (ETH), and litecoin (LTC), from 1 January 2017 to 31 December 2021; panel (b) exhibits the daily returns on three U.S. stock sector indices, viz. Energy, Finance, and Real Estate, from 16 October 2001 to 31 December 2021.

Figure A1. Time series of financial returns. (a) Exhibits the hourly returns on three cryptocurrencies: bitcoin (BTC), ether (ETH), and litecoin (LTC) from 1 January 2017 to 31 December 2021; and (b) exhibits the daily returns for three U.S. stock sector indices, viz. Energy, Finance, and Real Estate, from 16 October 2001 to 31 December 2021.

Given that each series is clearly far from i.i.d., we use a fixed-size rolling window to estimate the sample Kollo skewness and kurtosis. The window size is 720 observations (∼1 month) for the hourly cryptocurrency data and 250 observations (∼1 year) for the daily stock index data. The relevant elements of the Kollo skewness vector and kurtosis matrix are depicted as time series in Figure A2.
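The following minimal sketch illustrates this rolling-window computation. It assumes the sample Kollo skewness vector and kurtosis matrix take the forms $\hat{\tau}=m^{-1}\sum_t(\mathbf{1}_n'y_t)^2\,y_t$ and $\hat{K}=m^{-1}\sum_t(\mathbf{1}_n'y_t)^2\,y_t y_t'$, where $y_t$ is the standardized observation vector, in line with Kollo (2008); the function names and the one-step rolling scheme are illustrative only, not code from this paper.

```python
import numpy as np

def kollo_measures(X):
    """Sample Kollo skewness vector and kurtosis matrix of an (m, n) array X."""
    Y = X - X.mean(axis=0)
    # Standardize with the inverse symmetric square root of the sample covariance.
    vals, vecs = np.linalg.eigh(np.cov(Y, rowvar=False, bias=True))
    Y = Y @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)
    w = Y.sum(axis=1) ** 2                       # (1'y_t)^2, one value per observation
    tau = (w[:, None] * Y).mean(axis=0)          # Kollo skewness vector
    kappa = (w[:, None, None] * Y[:, :, None] * Y[:, None, :]).mean(axis=0)
    return tau, kappa                            # kappa is the Kollo kurtosis matrix

def rolling_kollo(X, window):
    """Kollo measures on every fixed-size window, stepping one observation at a time."""
    return [kollo_measures(X[t:t + window]) for t in range(len(X) - window + 1)]
```

For the hourly cryptocurrency data, `rolling_kollo(X, 720)` would produce series of the kind plotted in Figure A2(a).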

Figure A2. Kollo skewness and kurtosis in financial data in a rolling-window framework. We plot all elements in the Kollo skewness vector and only plot the unique elements in the Kollo kurtosis matrix. (a) Exhibits the Kollo skewness and kurtosis using a rolling window of 720 hourly returns for cryptocurrency prices of bitcoin (BTC), ether (ETH), and litecoin (LTC) from 1 January 2017 to 31 December 2021. (b) Exhibits the Kollo skewness and kurtosis using a rolling window of 250 daily returns for three U.S. stock sector indices, viz. Energy, Finance, and Real Estate, from 16 October 2001 to 31 December 2021. The grey dotted line depicts the rolling Mardia skewness and kurtosis, using the right-hand scale.

The spikes in skewness and kurtosis clearly correspond to significant events in each sector. For example, during the financial crisis of 2008 we see a spike in both $\tau_7$ and $\kappa_{7,7}$ following the collapse of Lehman Brothers in September 2008. The extremely low values of $\tau_1$ and the spike in $\kappa_{1,1}$ during the first half of 2018 were caused by an upward surge in oil prices. Similarly, the upward spike in $\tau_7$ and in most elements of the Kollo kurtosis on 16 March 2020 resulted from extreme negative returns across all financial markets following the outbreak of Covid-19.

Almost all elements of the Kollo skewness vector lie in the range $[-10, 10]$, with the most sampling variability in $\tau_1$ and $\tau_3$, and the Kollo kurtosis ranges from 0 to 200. Spikes in the Kollo skewness and Kollo kurtosis occur simultaneously. In particular, the global crash in risky assets after the outbreak of Covid-19 in January 2020 is evident from the dip in Kollo skewness down to almost −10 in both data sets. The Kollo kurtosis takes a narrower range in the stock sector data than in the cryptocurrency data.

Compared with the Mardia measures, the Kollo measures are more sensitive to the extremes of the distribution. Unlike the Mardia skewness, the Kollo skewness also captures the direction of asymmetry in the marginals. The Kollo measures draw more information from the co-moments, not only from the marginals, and they can exhibit substantial variability even when the Mardia measures remain relatively stable over time. Furthermore, the Kollo measures indicate which variable contributes most to an extreme value of skewness or kurtosis, whereas each Mardia measure summarizes the overall information in a single scalar.
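For reference, the scalar Mardia (1970) benchmark measures can be computed as follows; this sketch is a direct transcription of the classical formulas rather than code from this paper.

```python
import numpy as np

def mardia_measures(X):
    """Mardia (1970) skewness b1 and kurtosis b2 of an (m, n) array X."""
    m = X.shape[0]
    Y = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Y, rowvar=False, bias=True))
    G = Y @ S_inv @ Y.T                # g_ij = (x_i - xbar)' S^{-1} (x_j - xbar)
    b1 = (G ** 3).sum() / m ** 2       # skewness: average of g_ij^3 over all pairs
    b2 = (np.diag(G) ** 2).mean()      # kurtosis: average of g_ii^2
    return b1, b2
```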

Appendix B.

Mathematical proofs

B.1 Proof of Lemma 1

Write $\Theta_{m,l_k}^k=[\theta_1^k,\ldots,\theta_{l_k}^k]$ and $c^k=[c_1,\ldots,c_{l_k}]'$, where at least one element of $c^k$ is non-zero. Now if the column vectors in $\Theta_{m,l_k}^k$ are also orthogonal to the vectors $s_1^c,\ldots,s_{k-1}^c$ we have
$$\mathbf{1}_m'\,(s_h^c\circ\theta_j^k)=0 \quad\text{for } j=1,\ldots,l_k \text{ and } h<k.$$

Then, since $\Theta_{m,l_k}^k$ is orthogonal to $\mathbf{1}_m$ we have $\mathbf{1}_m'\theta_i^k=0$ and hence
$$\mathbf{1}_m'\,s_k^c=\mathbf{1}_m'\big(\Theta_{m,l_k}^k c^k\big)=\mathbf{1}_m'\sum_{i=1}^{l_k}c_i\theta_i^k=\sum_{i=1}^{l_k}c_i\,\mathbf{1}_m'\theta_i^k=0,$$
and, for $j<k$,
$$\mathbf{1}_m'\,(s_j^c\circ s_k^c)=\mathbf{1}_m'\big[s_j^c\circ\big(\Theta_{m,l_k}^k c^k\big)\big]=\mathbf{1}_m'\Big(s_j^c\circ\sum_{i=1}^{l_k}c_i\theta_i^k\Big)=\sum_{i=1}^{l_k}c_i\,\mathbf{1}_m'\,(s_j^c\circ\theta_i^k)=0.$$
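The construction in Lemma 1 is easy to verify numerically. The sketch below uses arbitrary dimensions and a null-space routine as one convenient way to build $\Theta_{m,l_k}^k$ (not necessarily the construction used in the paper), and confirms that any $s_k^c$ in the span of $\Theta_{m,l_k}^k$ satisfies both orthogonality conditions.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
m = 50
S_prev = rng.standard_normal((m, 2))         # stand-ins for s_1^c, ..., s_{k-1}^c
A = np.column_stack([np.ones(m), S_prev])    # vectors the basis must be orthogonal to
Theta = null_space(A.T)                      # orthonormal columns, here l_k = m - 3

c = rng.standard_normal(Theta.shape[1])      # any non-zero coefficient vector
s_k = Theta @ c
print(np.isclose(np.ones(m) @ s_k, 0.0))     # 1_m' s_k^c = 0
print(np.allclose(S_prev.T @ s_k, 0.0))      # 1_m'(s_j^c o s_k^c) = s_j^c' s_k^c = 0
```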

B.2 Proof of Theorem 1

When $k=1$ the conditions for $s_1^c$ are set out in Equation (11) and by Lemma 1 we have $s_1^c=\Theta_{m,l_1}^1 c^1$, so we only need to solve the remaining two equations in (11). The quadratic equation can be written as
$$m=\mathbf{1}_m'\,(s_1^c\circ s_1^c)=s_1^{c\prime}s_1^c=\big(\Theta_{m,l_1}^1 c^1\big)'\big(\Theta_{m,l_1}^1 c^1\big)=c^{1\prime}\,\Theta_{m,l_1}^{1\prime}\Theta_{m,l_1}^1\,c^1=c^{1\prime}c^1,$$
because $\Theta_{m,l_1}^{1\prime}\Theta_{m,l_1}^1=I_{l_1}$. And the cubic equation becomes
$$m\tilde{\tau}_1=\mathbf{1}_m'\,(s_1^c\circ s_1^c\circ s_1^c)=\mathbf{1}_m'\big[\big(\Theta_{m,l_1}^1 c^1\big)\circ\big(\Theta_{m,l_1}^1 c^1\big)\circ\big(\Theta_{m,l_1}^1 c^1\big)\big],$$
which proves Equation (13a). When $k=2,\ldots,n$ we need to solve the quadratic Equation (9c) and the linear Equation (9d). The quadratic equation may be written as
$$m=\mathbf{1}_m'\,(s_k^c\circ s_k^c)=s_k^{c\prime}s_k^c=c^{k\prime}\,\Theta_{m,l_k}^{k\prime}\Theta_{m,l_k}^k\,c^k=c^{k\prime}c^k,$$
as before. The linear equation becomes
$$m\tilde{\tau}_k=\mathbf{1}_m'\,(s_1^c\circ s_1^c\circ s_k^c)=(s_1^c\circ s_1^c)'\,s_k^c=(s_1^c\circ s_1^c)'\big(\Theta_{m,l_k}^k c^k\big)=\big[(s_1^c\circ s_1^c)'\theta_1^k,\ldots,(s_1^c\circ s_1^c)'\theta_{l_k}^k\big]c^k$$
$$=\mathbf{1}_m'\big[s_1^c\circ s_1^c\circ\theta_1^k,\ldots,s_1^c\circ s_1^c\circ\theta_{l_k}^k\big]c^k=\mathbf{1}_m'\,B_{m,l_k}^k\,c^k,$$
which proves Equation (13b).
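The essential point is that, once the basis is fixed, the Kollo skewness constraint on $s_k^c$ is linear in $c^k$. A self-contained numerical check of the identity $\mathbf{1}_m'(s_1^c\circ s_1^c\circ s_k^c)=\big(\mathbf{1}_m' B_{m,l_k}^k\big)c^k$, a sketch with illustrative dimensions only:

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(1)
m = 50
s_1 = rng.standard_normal(m)                      # stand-in for s_1^c
Theta = null_space(np.vstack([np.ones(m), s_1]))  # orthonormal, orthogonal to 1_m and s_1
c = rng.standard_normal(Theta.shape[1])
s_k = Theta @ c

B = (s_1 ** 2)[:, None] * Theta                   # columns s_1 o s_1 o theta_j
lhs = np.ones(m) @ (s_1 * s_1 * s_k)              # 1'(s_1 o s_1 o s_k)
rhs = (np.ones(m) @ B) @ c                        # (1' B) c
print(np.isclose(lhs, rhs))                       # True
```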

B.3 Proof of Corollary 1

Equation (12) defines a circle and Equation (13b) defines a straight line in the two-dimensional space of the coefficients $(c_1^k,c_2^k)$. Hence the question of whether the system of equations has real roots can be converted into the question of whether the circle intersects the line. A necessary and sufficient condition for this is that the distance from the centre of the circle to the line is no greater than the radius $\sqrt{m}$, i.e.,
$$\frac{|m\tilde{\tau}_k|}{\sqrt{\mathbf{1}_m' B_{m,2}^k B_{m,2}^{k\prime}\mathbf{1}_m}}\le\sqrt{m},$$
which proves Equation (14). The circle and the line may be written explicitly as
$$(c_1^k)^2+(c_2^k)^2=m,\qquad (s_1^c\circ s_1^c)'\theta_1^k\,c_1^k+(s_1^c\circ s_1^c)'\theta_2^k\,c_2^k=m\tilde{\tau}_k.$$

Hence we may set
$$c_2^k=\frac{m\tilde{\tau}_k-(s_1^c\circ s_1^c)'\theta_1^k\,c_1^k}{(s_1^c\circ s_1^c)'\theta_2^k},\tag{B1}$$
and so the circle equation becomes
$$(c_1^k)^2+\left(\frac{m\tilde{\tau}_k-(s_1^c\circ s_1^c)'\theta_1^k\,c_1^k}{(s_1^c\circ s_1^c)'\theta_2^k}\right)^2=m.$$

Let $\tilde{b}_1^k$ and $\tilde{b}_2^k$ denote the first and second elements of $\mathbf{1}_m' B_{m,2}^k$, i.e., $\tilde{b}_1^k=(s_1^c\circ s_1^c)'\theta_1^k$ and $\tilde{b}_2^k=(s_1^c\circ s_1^c)'\theta_2^k$. Then the above equation becomes
$$\big[(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2\big](c_1^k)^2-2\tilde{b}_1^k m\tilde{\tau}_k\,c_1^k+m^2\tilde{\tau}_k^{\,2}-m(\tilde{b}_2^k)^2=0.$$

Using the standard formula for the roots of this quadratic, the solutions are
$$c_1^k=\frac{2\tilde{b}_1^k m\tilde{\tau}_k\pm\sqrt{4(\tilde{b}_1^k)^2m^2\tilde{\tau}_k^{\,2}-4\big[(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2\big]\big(m^2\tilde{\tau}_k^{\,2}-m(\tilde{b}_2^k)^2\big)}}{2\big[(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2\big]}=\frac{\tilde{b}_1^k m\tilde{\tau}_k\pm\tilde{b}_2^k\sqrt{\big[(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2\big]m-m^2\tilde{\tau}_k^{\,2}}}{(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2}.$$

Finally, we substitute these values for $c_1^k$ into Equation (B1) to obtain
$$c_2^k=\frac{m\tilde{\tau}_k-\tilde{b}_1^k\,c_1^k}{\tilde{b}_2^k}=\frac{\tilde{b}_2^k m\tilde{\tau}_k\mp\tilde{b}_1^k\sqrt{\big[(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2\big]m-m^2\tilde{\tau}_k^{\,2}}}{(\tilde{b}_1^k)^2+(\tilde{b}_2^k)^2},$$
where the $\mp$ sign is paired with the $\pm$ sign in the expression for $c_1^k$.
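These semi-closed-form roots are straightforward to implement. The sketch below, with made-up inputs for $\tilde{b}_1^k$, $\tilde{b}_2^k$, $m$, and $\tilde{\tau}_k$, computes one root pair and verifies that it lies on both the circle and the line:

```python
import numpy as np

def solve_c(b1, b2, m, tau):
    """One root (c1, c2) of c1^2 + c2^2 = m and b1*c1 + b2*c2 = m*tau."""
    denom = b1 ** 2 + b2 ** 2
    disc = denom * m - (m * tau) ** 2        # real roots iff disc >= 0, i.e. Equation (14)
    if disc < 0:
        raise ValueError("no real roots: condition (14) is violated")
    root = np.sqrt(disc)
    c1 = (b1 * m * tau + b2 * root) / denom  # '+' branch; the '-' branch is the other root
    c2 = (b2 * m * tau - b1 * root) / denom  # opposite sign, paired with the branch above
    return c1, c2

c1, c2 = solve_c(b1=1.3, b2=-0.7, m=100, tau=0.05)
print(np.isclose(c1 ** 2 + c2 ** 2, 100))           # on the circle
print(np.isclose(1.3 * c1 - 0.7 * c2, 100 * 0.05))  # on the line
```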

B.4 Proof of Theorem 2

To match the Kollo kurtosis, each column of $S_m^n$ needs to satisfy Equation (9e) in addition to the conditions (9a)–(9d); parts (i)–(iii) were proven in Theorem 1, so the following focuses on (iv). When $k=1$, $s_1^c$ needs to satisfy the quartic Equation (16), which is easily rewritten as (17a), viz.
$$m\tilde{\kappa}_{11}=\mathbf{1}_m'\,(s_1^c\circ s_1^c\circ s_1^c\circ s_1^c)=\mathbf{1}_m'\big[\big(\Theta_{m,l_1}^1c^1\big)\circ\big(\Theta_{m,l_1}^1c^1\big)\circ\big(\Theta_{m,l_1}^1c^1\big)\circ\big(\Theta_{m,l_1}^1c^1\big)\big].$$

When $k=2,\ldots,n$, $s_k^c$ should satisfy condition (9e), which contains one quadratic equation when $i=k$ and $k-1$ linear equations for $i=1,\ldots,k-1$. First, we focus on the quadratic equation:
$$m\tilde{\kappa}_{kk}=\mathbf{1}_m'\,(s_1^c\circ s_1^c\circ s_k^c\circ s_k^c)=(s_1^c\circ s_k^c)'(s_1^c\circ s_k^c)=\big(s_1^c\circ(\Theta_{m,l_k}^kc^k)\big)'\big(s_1^c\circ(\Theta_{m,l_k}^kc^k)\big)$$
$$=c^{k\prime}\big[s_1^c\circ\theta_1^k,\ldots,s_1^c\circ\theta_{l_k}^k\big]'\big[s_1^c\circ\theta_1^k,\ldots,s_1^c\circ\theta_{l_k}^k\big]c^k=c^{k\prime}\,D_{m,l_k}^{k\prime}D_{m,l_k}^k\,c^k,$$
which proves condition (17b). Next, consider the $k-1$ linear equations: for each $i<k$,
$$m\tilde{\kappa}_{ik}=\mathbf{1}_m'\,(s_1^c\circ s_1^c\circ s_i^c\circ s_k^c)=(s_1^c\circ s_1^c\circ s_i^c)'\,s_k^c=(s_1^c\circ s_1^c\circ s_i^c)'\big(\Theta_{m,l_k}^kc^k\big).$$
Stacking these equations in matrix form yields Equation (17c).
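As with the skewness conditions, the quadratic identity behind condition (17b) can be checked numerically. In the sketch below (illustrative dimensions again) the equality $\mathbf{1}_m'(s_1^c\circ s_1^c\circ s_k^c\circ s_k^c)=c^{k\prime}D_{m,l_k}^{k\prime}D_{m,l_k}^k c^k$ holds for any coefficient vector $c^k$.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(2)
m = 50
s_1 = rng.standard_normal(m)                   # stand-in for s_1^c
Theta = null_space(np.vstack([np.ones(m), s_1]))
c = rng.standard_normal(Theta.shape[1])
s_k = Theta @ c

D = s_1[:, None] * Theta                       # columns s_1 o theta_j
lhs = np.ones(m) @ (s_1 * s_1 * s_k * s_k)     # 1'(s_1 o s_1 o s_k o s_k)
rhs = c @ (D.T @ D) @ c                        # c' D'D c
print(np.isclose(lhs, rhs))                    # True
```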