Abstract
We motivate the advantages of using the Kollo measures, relative to other types of third and fourth moments of multivariate systems, and explore their Monte Carlo simulation and bootstrapping errors. We then derive necessary and sufficient conditions for simultaneously matching any given mean vector, covariance matrix, Kollo skewness, and Kollo kurtosis. The specification of a suitable orthonormal basis greatly simplifies these moment-matching conditions, and we offer semi-closed-form solutions that increase computational efficiency. We compare our approach with two competing methods, both of which can match only Kollo skewness, not the kurtosis at the same time. Ours is the first method for exactly matching all four multivariate moments simultaneously.
1. Introduction
Consider an n-dimensional random variable X with mean vector μ and covariance matrix Σ. Mitchell (1989) introduced the third and fourth moments as the skewness tensor and the kurtosis tensor of z, where z = Σ^{-1/2}(X − μ) is the standardized version of X, and ⊗ denotes the Kronecker product. In other words, the elements of the skewness tensor take the form E[z_i z_j z_k] and those of the kurtosis tensor take the form E[z_i z_j z_k z_l], for i, j, k, l = 1, …, n. The tensors contain the third- and fourth-order central moments and all the co-moments of the marginal variables, which comprise n(n + 1)(n + 2)/6 and n(n + 1)(n + 2)(n + 3)/24 distinct elements, respectively. The number of elements increases rapidly with n. For that reason, these full tensor moments are impractical for use in large dimensions, and several more parsimonious representations of multivariate skewness and kurtosis have been proposed.
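For concreteness, the tensor definitions above can be computed directly in a few lines of numpy. This is a sketch with our own function names, assuming the 1/m convention for sample moments and the symmetric inverse square root for standardization:

```python
import numpy as np

def standardize(X):
    """Return z_t = S^{-1/2}(x_t - xbar) for each observation (row) of X,
    using the symmetric inverse square root of the sample covariance
    (1/m normalization)."""
    X = np.asarray(X, dtype=float)
    m, _ = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / m
    w, V = np.linalg.eigh(S)
    return Xc @ (V @ np.diag(w ** -0.5) @ V.T)

def tensor_moments(X):
    """Sample skewness tensor E[z_i z_j z_k] and kurtosis tensor
    E[z_i z_j z_k z_l] as dense n^3 and n^4 arrays; feasible only
    for small n, which is exactly the point made in the text."""
    Z = standardize(X)
    m = Z.shape[0]
    skew = np.einsum('ti,tj,tk->ijk', Z, Z, Z) / m
    kurt = np.einsum('ti,tj,tk,tl->ijkl', Z, Z, Z, Z) / m
    return skew, kurt
```

By symmetry the tensors have only n(n + 1)(n + 2)/6 and n(n + 1)(n + 2)(n + 3)/24 distinct elements, but even these counts grow as n³ and n⁴, which is why the parsimonious summary measures below are needed.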
Many authors advocate the need for models to match the information contained in measures of skewness and kurtosis, for instance in finance (Ryoo, 2007), biotech (Kuruvilla & Gunavathi, 2014), and engineering (Love et al., 2013). Indeed, moment-matching techniques have been applied in a variety of disciplines: Heath et al. (2018) use moment-matching simulation for probabilistic sensitivity analysis in economics; Pellegrino and Sabino (2014) apply moment matching to reduce the dimension of the pricing problem in finance; and Ching et al. (2009) use moment matching to improve estimation accuracy in engineering. Yet none of these techniques can match even the mean vector and covariance matrix exactly, let alone the higher moments.
Some moment-matching techniques are specifically designed to reduce the sampling error in multivariate moments. In this strand of the literature, Bardsley (2017), Høyland et al. (2003), and Looney (1983) each propose methods to generate data based on moments; Foldnes and Olsson (2016) introduce a simple copula structure to target the mean vector, the covariance matrix, and both the skewness and the kurtosis of each marginal; and Qu et al. (2020) propose a method to target the Mardia (1970) skewness and kurtosis. However, none of these use the tensor measures defined above, because they are intractable for practical applications except when n is small.
Many different summary measures have been advocated to reduce the size of the skewness and kurtosis tensors, including those proposed by Koizumi et al. (2014), Kollo (2008), Mardia (1970), and Móri et al. (1994). Several other measures are not even directly related to the full tensors.[1] All these measures are listed in chronological order in the table, in which each column refers to a property that is not fulfilled by the measure.
First, note that most of the measures that are not based on the full tensors require computation of the multivariate probability density function, which is rather complex, and cannot be used to establish solvable systems of moment-matching constraints for simulations targeting skewness and kurtosis.[2] Second, of the four methods that are derived from the full tensor moments, all but one ignore information by excluding some of the cross moments. This includes the popular Mardia (1970) skewness and kurtosis, which are easy to use but, like several other measures, are only scalar quantities. As such, they are unable to carry as much information about the full multivariate density as vector or matrix measures.
From this comparison it is clear that the only measures that suffer from none of these drawbacks are those introduced by Kollo (2008). In the form of an n × 1 vector and an n × n symmetric matrix, respectively, they may be written:
τ(X) = E[(1′z)² z], (1a)
K(X) = E[(1′z)² zz′], (1b)
where z is the standardized version of X and 1 denotes the n × 1 vector of 1s.
Unlike the tensor moments, the Kollo measures are easy to compute for applications. But so are other measures, such as those of Mardia and Móri et al.[3] In Appendix A, we provide some real-world examples that motivate the use of Kollo skewness and kurtosis in a practical setting, rather than the measures introduced by Mardia. The Kollo measures also have many theoretical advantages. By retaining so many co-moments, the Kollo measures are the most tractable attempt to retain as much information as possible from the full skewness and kurtosis tensors.[4] The Mardia and Koizumi et al. measures suffer from a substantial loss of information because they are scalars. For example, Mardia skewness can only be positive, so it cannot indicate whether the data are negatively or positively skewed. A similar loss of information affects the Móri et al. measures, which are a multivariate extension of the Mardia measures but still omit many third- and fourth-order co-moments. Moreover, like the tensors themselves and several other related measures, the Kollo measures are invariant under affine transformations. Later we shall rely on this property to generate repeated simulations that exactly match some target values for Kollo skewness and kurtosis. This invariance property also enables concatenation of repeated samples; that is, the concatenated samples have the same moments as the sub-samples.
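In sample terms, writing 1′z for the sum of the elements of the standardized observation z, the Kollo measures (1a) and (1b) reduce to simple averages. The sketch below uses our own function names and the 1/m moment convention; for n = 1 the Kollo skewness and kurtosis collapse to the familiar univariate moments, which gives a quick sanity check:

```python
import numpy as np

def kollo_moments(X):
    """Sample Kollo skewness vector tau = E[(1'z)^2 z] (n x 1) and
    Kollo kurtosis matrix K = E[(1'z)^2 z z'] (n x n), with z the
    standardized observation, following Kollo (2008)."""
    X = np.asarray(X, dtype=float)
    m, _ = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / m
    w, V = np.linalg.eigh(S)
    Z = Xc @ (V @ np.diag(w ** -0.5) @ V.T)   # standardize
    s2 = Z.sum(axis=1) ** 2                   # (1'z)^2 per observation
    tau = s2 @ Z / m
    K = (Z * s2[:, None]).T @ Z / m
    return tau, K
```

In the univariate case tau equals the usual (population-style) sample skewness E[z³] and K equals the sample kurtosis E[z⁴].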
Our exact moment-matching methods are achieved by creating an orthonormal basis, the span of which includes a particular subspace containing all samples that have the exact Kollo skewness vector and Kollo kurtosis matrix. The benefit of using such a set of orthonormal vectors is huge: mathematically, matching the sample moments exactly means solving a large system of non-linear equations in which the number of unknowns greatly exceeds the number of constraints. The orthonormal basis also underpins a significant reduction in the size of the system to be solved, and we are able to derive analytic solutions that greatly enhance the computational efficiency of our method. It is also, to our knowledge, the only method for exactly matching both Kollo skewness and kurtosis in every simulation of the distribution.
Common techniques, such as Monte Carlo (MC) simulation (Hertz, 1964) or the statistical bootstrap (Efron, 1979), fail to ensure that each sample has the same moments as the target distribution. There is a large literature on their simulation errors in covariance matrices—see Jobson and Korkie (1980), Ledoit and Wolf (2004), and many others—but no previous study on MC or bootstrap simulation errors in the Kollo measures. Section 2 motivates the need to target higher moments exactly by demonstrating that the Kollo moment sampling errors from standard Monte Carlo and the statistical bootstrap can be very significant. This well-known property of sampling error is unlikely to be resolved completely by increasing the number of simulations alone (Ferraz & Moura, 2012; Oberkampf et al., 2002), and variance-reduction techniques (Saliby & Paul, 2009) come at the expense of model complexity.
Section 3 then introduces the Ledermann et al. (2011) Random Orthogonal Matrix (ROM) simulation method, which encompasses MC and the bootstrap (as special cases of the L matrices that underpin this approach) and was originally developed as a simulation method for matching exact covariance matrices. Two extensions of the ROM algorithm, by Hanke et al. (2017) and Alexander et al. (2022), can target exact Kollo skewness by generating arbitrary values to pad the sample so that the Kollo skewness matching conditions become numerically solvable. But a trial-and-error process is needed to find suitable arbitrary values and, even when such values are found, there is no analytical solution and a numerical solution cannot be guaranteed, so the algorithms can fail. Moreover, as Alexander et al. (2022) demonstrate, these methods risk generating simulated samples with undesirable marginal densities, depending on the arbitrary values that are found.
Section 3 also derives the necessary and sufficient conditions for a generated sample to match a target Kollo skewness and Kollo kurtosis. Section 4 uses these conditions to derive theoretical results which first define an orthonormal basis, and then find analytic solutions for the coefficients which generate the required subspace of the span of this orthonormal basis. Here we also prove invariance under sample concatenation. We conclude the paper with a discussion of our results in Section 5, in the context of the previous literature.
Our algorithms are written in Python and available on GitHub: https://github.com/vivid1005/GKROM-algorithm.
2. Sampling errors in Kollo skewness and kurtosis
This section presents two experiments that demonstrate the need for a simulation method that can target the Kollo moments exactly. Let μ_i, σ_ij, τ_i, and κ_ij denote the elements of the target mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, respectively, and let μ̂_i, σ̂_ij, τ̂_i, and κ̂_ij denote the corresponding elements of the simulated sample. Then the RMSEs for each moment are the square roots of the average squared differences between the simulated and target elements: (2) for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, respectively. Now, for a given dimension n of the system of random variables and a given size m for the simulations, we generate a sample, compute its mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, and record the RMSEs using Equation (2). We repeat this many times and then record the average RMSE and its standard error.
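Equation (2) amounts to an elementwise root-mean-square error per moment; a minimal sketch (function name ours):

```python
import numpy as np

def moment_rmse(target, estimate):
    """RMSE of Equation (2): square root of the average squared
    difference between the simulated and target moment elements.
    Works for the mean/skewness vectors and covariance/kurtosis
    matrices alike, since both are compared elementwise."""
    d = np.asarray(estimate, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))
```

For example, `moment_rmse(Sigma_target, Sigma_hat)` gives the covariance RMSE, and similarly for the other three moments.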
The results of the first, bootstrapping experiment are tabulated for two real data sets, viz. the U.S. stock sector returns data described above and a temperature data sample from the “Global and City Yearly Average Temperatures 1820–2015” data set available on Data.world. The latter includes the yearly average temperatures between 1849 and 2013 of 11 cities in the United States (viz. Austin, Boston, Chicago, Kansas City, Las Vegas, Los Angeles, New Orleans, New York, Philadelphia, San Francisco, and Washington). In both cases n = 11, with the target mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix derived from (1) the first 1000 observations in the sample, from 1 December 2018 to 31 December 2021, for the sector returns and (2) 166 observations from 1849 to 2013 for the temperature data, respectively. The moments are reported in the upper part of the tables. Then we apply the statistical bootstrap to these samples to generate random samples of size m, for values of m ranging from 100 to 1000. For any given m, we generate 100,000 samples of this size. For each sample, we record the RMSEs calculated using Equation (2); we then compute the average and standard error of these RMSEs over all 100,000 repetitions and report the results in the lower part of the tables.
As expected, both the average RMSE and its standard deviation decrease with m. However, their ratio (shown in the right-hand panel) does not. The errors in moments that arise when using the bootstrap are particularly apparent when only 100 observations are randomly selected. When m = 100, the ratio of the average RMSE on the Kollo skewness to its standard deviation is 2.632 for the stock returns data and 3.856 for the temperature data; for the Kollo kurtosis, the corresponding ratios are 7.099 and 2.702, respectively. But even when m = 1000, so that we bootstrap 100,000 samples having the same size as the original one, the errors relative to the target moments derived from the original sample remain substantial (the ratios are 2.416 and 2.469 for the stock returns data and 3.288 and 1.782 for the temperature data, on the Kollo skewness and kurtosis, respectively). It is worth noting that much of the research on sampling errors in simulation focuses on errors in the covariance matrix, yet the ratios in the right-hand panel suggest that the covariance matrix is the least inaccurate of the four moments.
The second experiment assumes an n-dimensional distribution for n = 5 or 10 and applies standard Monte Carlo simulation to generate samples of size m, for values of m up to 10⁶. As expected, both the average RMSE and its standard error decrease as m increases. However, again their ratio (shown in the right-hand panel) does not. It is clear that all simulations yield inaccurate values for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix. It also becomes more difficult to simulate accurate values as the dimension of the system increases, especially for the covariance matrix. This observation motivated the introduction of moment-matching methods (such as Ledermann et al., 2011), and it demonstrates that similar inaccuracies occur when using Monte Carlo simulation in large systems for which accurate third and fourth moments are also important.
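The qualitative pattern in this experiment is easy to reproduce in miniature. The sketch below (our own code, not the paper's experiment) Monte Carlo-simulates standard normal samples and measures the error in the n = 1 Kollo skewness, whose target is zero: the average RMSE falls with m, but remains strictly positive at every sample size.

```python
import numpy as np

def avg_skew_rmse(m, reps, rng):
    """Average RMSE of the sample skewness (the n = 1 Kollo skewness)
    over `reps` Monte Carlo samples of size m from N(0,1); the target
    skewness is 0, so the RMSE per sample is just |mean(z^3)|."""
    errs = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=m)
        z = (x - x.mean()) / x.std()
        errs[r] = abs(np.mean(z ** 3))
    return errs.mean()

rng = np.random.default_rng(42)
rmse_small = avg_skew_rmse(100, 200, rng)
rmse_large = avg_skew_rmse(10_000, 200, rng)
# rmse_large < rmse_small, but neither is zero: MC never matches the
# moment exactly, which is what motivates exact moment matching.
```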
3. Conditions for matching Kollo skewness and kurtosis
Consider the simple case of generating a sample of size m from an n-dimensional distribution that has the exact mean vector μ and covariance matrix Σ. Let A be a fixed square matrix such that A′A = Σ. Following Ledermann et al. (2011) we can select any “scaled L-matrix” L, an m × n matrix defined by the properties: (3) L′1 = 0, L′L = I_n, where 1 denotes an m × 1 vector of 1s, 0 is an n × 1 vector of 0s, and I_n is the n × n identity matrix. The matrix L is just an L matrix scaled by m^{−1/2}. See Ledermann and Alexander (2012) for more details about the different types of L matrices that can be used to target the Mardia measures exactly, either skewness or kurtosis, but not both. The original concept of ROM simulation selects a random n × n rotation matrix R, and possibly also a random m × m permutation matrix P, so that by Equation (3) every sample of the form: (4) X = 1μ′ + √m P L R A has mean μ and covariance matrix Σ. The problem with such a construction is that these ROM simulations can have quite different Kollo skewness and kurtosis, because these depend on the choice of L, R, and P. For this reason, we set R = I_n in Equation (4) and introduce another source of randomness through the choice of L.
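The first two moments can thus be matched exactly in every sample. The sketch below (our own construction, assuming the scaled L-matrix properties in (3) and reading Equation (4) with the rotation and permutation set to the identity) builds a scaled L-matrix by QR-orthogonalizing a column-centered Gaussian matrix:

```python
import numpy as np

def scaled_L_matrix(m, n, rng):
    """Random m x n matrix with 1'L = 0 and L'L = I, i.e. the
    properties (3) of a scaled L-matrix. Centering the columns makes
    them orthogonal to the ones vector; QR then orthonormalizes them."""
    G = rng.normal(size=(m, n))
    G -= G.mean(axis=0)
    Q, _ = np.linalg.qr(G)
    return Q[:, :n]

def rom_sample(m, mu, Sigma, rng):
    """Sample of size m with EXACT mean mu and EXACT covariance Sigma
    (1/m normalization): X = 1 mu' + sqrt(m) L A with A'A = Sigma."""
    A = np.linalg.cholesky(Sigma).T     # A'A = Sigma
    L = scaled_L_matrix(m, len(mu), rng)
    return mu + np.sqrt(m) * (L @ A)
```

Because 1′L = 0 and L′L = I, the sample mean equals mu and the (1/m) sample covariance equals Sigma exactly for every draw of L; this is the ROM property that Monte Carlo and the bootstrap lack.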
First, we note that the scaled L-matrix must satisfy more than just Equation (3) for the simulation given by: (5) to have exact Kollo skewness and Kollo kurtosis as well as the exact mean vector and covariance matrix. It is easiest to write these necessary and sufficient conditions in summation notation, letting ℓ_i denote the n × 1 vector that is the ith row of the scaled L-matrix, as follows: (6a) (6b) (6c)
Unfortunately, Equations (6b) and (6c) are difficult to solve directly. Instead, we follow Hanke et al. (2017) and introduce an n × n rotation matrix whose first column is proportional to the vector of 1s (with constant of proportionality n^{−1/2}), in addition to the usual orthogonality and unit-determinant conditions.[5] One example of such a matrix is:
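One convenient construction of such a rotation, which is an assumption on our part and not necessarily the paper's example, starts from the Householder reflection that maps e₁ to the normalized ones vector and flips one column sign to restore determinant +1:

```python
import numpy as np

def ones_first_column_rotation(n):
    """n x n orthogonal matrix with first column 1/sqrt(n) * ones and
    determinant +1. Householder reflections have det -1, so for n > 1
    we flip the sign of the last column, which leaves the first column
    and orthogonality intact."""
    if n == 1:
        return np.eye(1)
    u = np.ones(n) / np.sqrt(n)
    e1 = np.zeros(n); e1[0] = 1.0
    v = e1 - u
    H = np.eye(n) - 2.0 * np.outer(v, v) / (v @ v)   # H e1 = u
    H[:, -1] *= -1.0
    return H
```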
The rotation matrix is used to convert the scaled L-matrix to a new matrix, and the sample construction Equation (5) may now be rewritten: (7) and the necessary and sufficient conditions (6) become: (8a) (8b) (8c)
The above formulation explains why we require the first column of the rotation matrix to be proportional to the vector of 1s. We can see that the most complex terms in Equations (6b) and (6c) are reduced to terms involving only the first element in each row of the new matrix in Equations (8b) and (8c). In this way, the focus moves from simulating Equation (5) using different scaled L-matrices to simulating Equation (7) using different matrices that satisfy Equation (8).
4. Theoretical results
For convenience, we now rewrite the conditions (8) as functions of each column of the new matrix, using the Hadamard product ∘. Let a_j denote the m × 1 vector that is the jth column of this matrix. The following formulation of the necessary and sufficient conditions for the simulated sample to match the exact mean, covariance, Kollo skewness, and Kollo kurtosis will be useful in this section: (9a) (9b) (9c) (9d) (9e) where the right-hand side of (9d) involves the kth element of the scaled and rotated Kollo skewness vector, and that of (9e) involves the element in the ith row and kth column of the scaled and rotated Kollo kurtosis matrix. Matching the exact mean, covariance, and Kollo skewness relies on solving conditions (9a)–(9d) for the columns. The additional constraints (9e) set out the k restrictions on the kth column that must hold in order that each simulation matches the target Kollo kurtosis.
The system (9) is not easy to solve. When k = 1 we have to solve a system of linear equations for the mean constraint (9a), a system of quadratic equations for the variance constraint (9c), a system of cubic equations for the Kollo skewness constraint (9d), and a system of quartic equations for the Kollo kurtosis constraint (9e). When k > 1 we have four different systems of linear equations, namely for the mean constraint (9a), the covariance constraint (9b), the Kollo skewness constraint (9d), and the Kollo kurtosis constraints (9e) when i < k; and two different systems of quadratic equations, for the variance constraint (9c) and the Kollo kurtosis constraint (9e) when i = k. In total, there are n(n + 5)/2 equations to solve for matching the first three moments, or n(n + 3) equations for matching the first four moments. For example, in a 10-dimensional system, we need to solve 75 equations for matching the first three moments, or 130 equations to match the first four moments.
In this section we show how to solve (9) column by column, by (a) finding a particular type of orthonormal basis for the sample space of the n random variables; and then (b) solving for the coefficients which generate a subspace of its span that contains samples having exactly the target Kollo moments. That is, for each column k and for some column dimension lk, we first find a specific rectangular m × lk orthonormal matrix, i.e., a matrix whose columns are mutually orthogonal unit vectors.
Then we set: (10) a_k = Q_k x_k, where a_k is the kth column being constructed, Q_k is the orthonormal matrix just described, and x_k is an lk × 1 column vector with at least one non-zero element. The decomposition (10) allows one to select the orthonormal matrix in such a way that the kth column always satisfies Equations (9a) and (9b). That is, instead of solving (9) for the columns directly, we may solve for the orthonormal matrix and the coefficients iteratively, for k = 1, …, n, using the following result:
Lemma 1.
Suppose the first k − 1 columns of the matrix satisfy conditions (9a) and (9b), and the kth column is constructed using Equation (10). Then the conditions (9a) and (9b) will be satisfied if the column vectors in the orthonormal matrix are also orthogonal to the vector of 1s and to each of the first k − 1 columns.
Generating the matrix column by column in this way considerably reduces the number of constraints that must be satisfied. Later we shall show that, in addition to the mean and covariance, we can match the Kollo skewness and kurtosis by placing only k + 2 constraints on the kth column. And if we only wish to target the first three moments, there are just two constraints on each column. In total, instead of n(n + 5)/2 equations to solve for matching the first three moments, or n(n + 3) equations for the first four moments, now we only need to solve 2n and n(n + 5)/2 equations, respectively. For instance, in a 10-dimensional system, we need to solve 20 equations to match the mean, covariance, and Kollo skewness, or 75 equations to additionally match the Kollo kurtosis. Of course, the dimension reduction increases with n; e.g., when n = 20 the first-three-moment constraints reduce from 250 to 40, and the first-four-moment constraints reduce from 460 to 250.
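The counting in the last two paragraphs can be checked mechanically. The short script below (ours) reproduces the equation counts quoted for n = 10 and n = 20, assuming the per-column constraint counts stated in the text: 2 per column for three moments and k + 2 on column k for four moments after the reduction.

```python
def direct_counts(n):
    """Equations when solving system (9) directly: n (mean) +
    n(n+1)/2 (covariance) + n (Kollo skewness), plus another
    n(n+1)/2 for the Kollo kurtosis matrix."""
    three = n + n * (n + 1) // 2 + n          # = n(n + 5)/2
    return three, three + n * (n + 1) // 2    # second count = n(n + 3)

def reduced_counts(n):
    """Equations after the orthonormal-basis reduction: 2 constraints
    per column for three moments; k + 2 on column k for four moments."""
    return 2 * n, sum(k + 2 for k in range(1, n + 1))  # second = n(n + 5)/2
```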
Our method of solution requires that we fix the first k − 1 columns before we generate the orthonormal matrix for the kth column. We then generate a rectangular orthonormal matrix with dimensions m × N whose first k columns are equal to the normalized vector of 1s and the k − 1 fixed columns. The columns of the orthonormal matrix for the kth column are then randomly selected from the other N − k columns of this matrix. That is, we take a random sample (without replacement) of lk of these columns, and the selected columns are then used to form the orthonormal matrix. Note that we require N ≤ m, because there are m elements in each column and so no more than m columns can be mutually orthogonal. This places an upper bound on lk, because lk ≤ N − k ≤ m − k.
There is a close relationship between this construction and the L matrices (Ledermann et al., 2011). In fact, we can immediately obtain the orthonormal matrix by randomly extracting lk columns from the remaining columns of an L matrix whose first k − 1 columns are the fixed columns.
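A direct way to generate such a matrix, without storing a full m × N orthonormal matrix, is to project a random block onto the orthogonal complement of the ones vector and the fixed columns and then orthonormalize; a sketch with our own names, one of several constructions consistent with Lemma 1:

```python
import numpy as np

def lemma1_basis(fixed_cols, l_k, rng):
    """Return an m x l_k matrix with orthonormal columns, each
    orthogonal to the ones vector and to every column of fixed_cols,
    as Lemma 1 requires of the orthonormal matrix for column k."""
    m = fixed_cols.shape[0]
    C = np.column_stack([np.ones(m), fixed_cols])    # constraint vectors
    G = rng.normal(size=(m, l_k))
    G -= C @ np.linalg.lstsq(C, G, rcond=None)[0]    # project out span(C)
    Q, _ = np.linalg.qr(G)
    return Q[:, :l_k]
```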
Now we turn to the actual solution of (9) using the construction (10), starting with the simpler case of matching the first three moments only. Recall that the first column is different from the other columns because it involves a cubic equation, which is easy to see by rewriting the conditions (9a)–(9d) for k = 1 as: (11)
The following Theorem explains how to solve for the first column, from which the other columns are generated in an iterative manner:
Theorem 1.
Suppose the kth column satisfies Equation (10), that the orthonormal matrix satisfies the conditions of Lemma 1, and that the first k − 1 columns satisfy conditions (9a) to (9d). Then: by Lemma 1, the kth column satisfies conditions (9a) and (9b); it also satisfies condition (9c) if (12) holds; and the condition (9d) also holds if (13a) (for k = 1) or (13b) (for k > 1) holds.
The pseudocode in Algorithm 1 summarizes how to generate a sample that matches any given mean vector, covariance matrix, and Kollo skewness vector.
Algorithm 1.
Matching the First Three Moments
1: function ROM3(m, μ, Σ, τ) ⊳ m: number of observations; μ, Σ, τ: targeted first three moments
2:
3: Obtain A where A′A = Σ
4: Generate the rotation matrix ⊳ an n × n rotation matrix satisfying the condition of Section 3
5: ⊳ Obtain the rotated Kollo skewness
6: while Theorem 1 conditions (12) and (13a) for k = 1 do not hold do ⊳ Solve for the first column
7: Generate l1 orthonormal vectors to form the orthonormal matrix satisfying Lemma 1
8: Obtain real solutions for the coefficients restricted by Equations (12) and (13a)
9: Obtain
10: for k = 2, …, n do
11: while Theorem 1 conditions (12) and (13b) for column k do not hold do
12: Generate lk orthonormal vectors to form the orthonormal matrix satisfying Lemma 1
13: Obtain real solutions for the coefficients restricted by Equations (12) and (13b)
14: Obtain
15: Generate permutation matrix if required
16: ⊳ Generate a sample with exact first three moments
17: return
Algorithm 1 solves for the columns iteratively, column by column, while building the columns of an orthonormal matrix that yields a real-valued solution for the coefficients. This means that, to obtain the kth column satisfying conditions (9a)–(9d), the first k − 1 columns must already be known and satisfy conditions (9a)–(9d). So we first obtain an orthonormal matrix satisfying Lemma 1 and then obtain the coefficients by solving the system of quadratic Equations (12) and the system of cubic Equations (13a) simultaneously. Next, we generate the remaining columns for k = 2, …, n. Now the coefficients should satisfy the quadratic system (12) and the linear system (13b) simultaneously.
We require lk ≥ 2 because the elements of the coefficient vector are always restricted by 2 equations: for k = 1 the systems (12) and (13a), and for k > 1 the systems (12) and (13b), are both exactly determined when lk = 2; but when lk > 2 the systems are under-determined, which makes it easier to find a real-valued solution for the coefficients.
While both general and flexible, Algorithm 1 involves an element of trial and error in finding a real-valued solution. The following Corollary 1 considers a special case of Theorem 1 with lk = 2 and k > 1. It is useful because it derives a single criterion for the existence of a real-valued solution to the system of (12) and (13b), and an analytical formula for each 2 × 1 coefficient vector when such a solution exists.
Corollary 1.
When lk = 2 and k > 1, the system of Equations (12) and (13b) has a real solution if condition (14) holds, and in that case the solution is given by (15), whose two components are the first and second elements of the coefficient vector.
Corollary 1 greatly improves the computational efficiency of the new method. Setting lk = 2 for k > 1 means that the orthonormal matrix has only two columns, which are simple to orthogonalize. Moreover, it is very quick to repeatedly generate orthonormal matrices for k > 1 until we find those which satisfy the criterion (14), and then we can immediately obtain values for the coefficients using the analytical formula (15). In this way, we do not need to search for a numerical solution to the simultaneous Equations (12) and (13b).
Next, we consider simulations matching the mean, covariance, Kollo skewness, and Kollo kurtosis. When matching only the first three moments, there is no reason why we should not always set lk = 2 and use Corollary 1; then the moment-matching algorithm is very fast, being fully analytic. However, the next Theorem, which explains how to match the first four moments, requires lk ≥ k + 2. Again we begin by generating the first column, because it requires higher-order conditions than the subsequent columns. To target the first four moments, the first column should satisfy not only conditions (11) but also a quartic condition for matching the Kollo kurtosis: (16) which involves the element in the first row and first column of the scaled and rotated Kollo kurtosis matrix. Theorem 2 provides the necessary and sufficient conditions for the kth column to satisfy conditions (9).
Theorem 2.
Suppose the kth column satisfies Equation (10), that the orthonormal matrix satisfies the conditions of Lemma 1, and that the first k − 1 columns satisfy conditions (9). Then the kth column satisfies:
the conditions (9a) and (9b), by Lemma 1;
the condition (9c), if condition (12) holds;
the condition (9d), if condition (13) holds;
the condition (9e), if conditions (17) hold.
Algorithm 2 is the pseudocode for implementing the above results. It yields a realisation of m observations on n random variables which matches any given target mean, covariance, Kollo skewness, and Kollo kurtosis. To obtain the kth column satisfying all the conditions (9), the first k − 1 columns must be known and already satisfy conditions (9). For the first column, we obtain an orthonormal matrix satisfying Lemma 1 and then obtain the coefficients by solving a quadratic Equation (12), a cubic Equation (13a), and a quartic Equation (17a) simultaneously. Next, we generate the remaining columns in an iterative manner from k = 2 to k = n. Now the coefficient vector should satisfy k + 2 equations simultaneously: the two quadratic Equations (12) and (17b), and the k linear equations from (13b) and (17c).
Algorithm 2.
Matching the First Four Moments
1: function ROM4(m, μ, Σ, τ, K) ⊳ m: number of observations; μ, Σ, τ, K: targeted first four moments
2:
3: Obtain A where A′A = Σ
4: Generate the rotation matrix ⊳ an n × n rotation matrix satisfying the condition of Section 3
5: ⊳ Obtain the rotated Kollo skewness
6: ⊳ Obtain the rotated Kollo kurtosis
7: while the conditions of Theorem 2 do not hold do ⊳ Solve for the first column
8: Generate l1 orthonormal vectors to form the orthonormal matrix satisfying Lemma 1
9: Obtain real solutions for the coefficients restricted by Equations (12), (13a), and (17a)
10: Obtain
11: for k = 2, …, n do ⊳ Solve for the other columns
12: while Theorem 2 does not hold do
13: Generate lk orthonormal vectors to form the orthonormal matrix satisfying Lemma 1
14: Obtain real solutions for the coefficients restricted by Equations (12), (13b), (17b), and (17c)
15: Obtain
16: Generate permutation matrix if required
17: ⊳ Generate sample with exact first four moments
18: return
We require lk ≥ k + 2 because the elements of the coefficient vector are always restricted by k + 2 equations: for k = 1 the systems (12), (13a), and (17a), and for k > 1 the systems (12), (13b), (17b), and (17c), are exactly determined when lk = k + 2; but when lk > k + 2 the systems are under-determined, which makes it easier to find a real-valued solution for the coefficients.
Finally, we note the following property of the Kollo kurtosis:[6] Let X be the m × n matrix formed by concatenating N sub-matrices X_1, …, X_N, each satisfying conditions (3), where N is the number of concatenations, X_i has m_i rows, and m = m_1 + ⋯ + m_N. Then the Kollo kurtosis of the concatenated sample is the weighted sum of the sub-samples' Kollo kurtoses: (18) K(X) = Σ_{i=1}^{N} (m_i / m) K(X_i).
Hence, Kollo kurtosis is invariant under sample concatenation if each sub-matrix has identical Kollo kurtosis.
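The identity (18) is easy to verify numerically, using the Kollo kurtosis statistic implied by (1b) computed on sub-samples that already satisfy the zero-mean, identity-covariance conditions of type (3); the helper names below are ours:

```python
import numpy as np

def kollo_kurtosis_stat(Z):
    """(1/m) sum_t (1'z_t)^2 z_t z_t' for an already-standardized
    sample Z, i.e. the sample analogue of the Kollo kurtosis (1b)."""
    m = Z.shape[0]
    s2 = Z.sum(axis=1) ** 2
    return (Z * s2[:, None]).T @ Z / m

def whiten(X):
    """Force exactly zero mean and identity covariance (1/m convention)."""
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / Xc.shape[0]
    w, V = np.linalg.eigh(S)
    return Xc @ (V @ np.diag(w ** -0.5) @ V.T)

rng = np.random.default_rng(7)
Z1 = whiten(rng.normal(size=(200, 3)) ** 3)   # two standardized sub-samples
Z2 = whiten(rng.gamma(2.0, size=(300, 3)))
K_concat = kollo_kurtosis_stat(np.vstack([Z1, Z2]))
K_weighted = (200 * kollo_kurtosis_stat(Z1) + 300 * kollo_kurtosis_stat(Z2)) / 500
# K_concat equals K_weighted, as Equation (18) states.
```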
5. Discussion
This paper derives necessary and sufficient analytic conditions for exactly matching the mean, covariance, Kollo skewness, and Kollo kurtosis of multivariate random variables. The necessary and sufficient conditions (9) form a highly under-determined system. Even without the condition (9e) for matching Kollo kurtosis, it is already a complex problem to find columns which satisfy (9a)–(9d), because of the skewness condition (9d). This explains why Kollo skewness matching methods, such as the algorithm proposed by Alexander et al. (2022), are much slower than those matching only the first two moments exactly, such as Ledermann et al. (2011). Adding the conditions (9e) creates even more of a challenge: there are already k + 2 constraints on the kth column for Kollo skewness matching, and there is an extra set of k conditions for Kollo kurtosis matching, so that in total there are 2k + 2 constraints on the kth column, for k = 1, …, n, making a total of n(n + 3) conditions.
In response to this issue, we propose the novel idea of decomposing each column of the matrix in Equation (7) into the product of an orthonormal matrix and a vector of coefficients. We use the necessary and sufficient conditions to solve for the coefficients which generate the columns as linear combinations of the orthonormal column vectors, in such a way that the moment-matching conditions are satisfied. The orthonormal column vectors are restricted by the conditions provided in Lemma 1, which helps to reduce the number of equations in the non-linear system. With this construction, simulations matching Kollo skewness become highly efficient; moreover, simulations matching Kollo kurtosis become practically possible, although in high-dimensional systems the kurtosis matrix contains many elements to match, which greatly increases the difficulty of exact simulation.[7]
Our method differs from those of Alexander et al. (2022) and Hanke et al. (2017), which solve conditions (9) for simulations matching up to Kollo skewness. Hanke et al. (2017) set the undetermined elements in the equation system equal to arbitrary values, so the system becomes a determined one that is potentially solvable. Their method thus requires a trial-and-error process to find the arbitrary values, which can be very slow. Moreover, it can fail completely, and even when it works the simulations may have marginal densities that are undesirable for most real-world applications. Noting these issues, Alexander et al. (2022) propose a Random Orthogonal Matrix simulation method with exact Kollo skewness which uses either a statistical bootstrap from real-world data or parametric multivariate distributions to supply the arbitrary values. They also provide the necessary and sufficient conditions on the arbitrary values for Kollo skewness matching to be possible.[8] Such modifications significantly improve the theoretical and computational properties of the simulated samples, but their algorithms still require arbitrary values, which often produce marginals with strange characteristics. This problem arises because the choice of distribution used to generate the random values has a significant impact on the success of solving for the remaining elements. It was discussed in depth by Alexander et al. (2022), who show that reducing the variance of the chosen distribution will ensure a low failure rate. But then, by definition, the remaining elements will have more extreme values, so that the first few columns, viewed as samples from the marginal distributions of the first few random variables, could be overly leptokurtic. In this case the sample characteristics may not be ideal for many applications, especially when the target values for skewness or kurtosis are large. Our method bypasses this problem completely.
With the use of an orthonormal basis, we remove the need to allocate arbitrary values to elements in Instead, we can obtain the orthonormal matrices in such a way that has consistent characteristics, which provides better-quality data, as will be demonstrated below.
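The orthonormal matrices themselves are cheap to generate. One standard construction, given here as an illustrative sketch rather than the exact generator used in our algorithms, draws a random orthonormal basis from the QR decomposition of a Gaussian matrix:

```python
import numpy as np

def random_orthonormal_basis(m, l, seed=None):
    """Return an m x l matrix with orthonormal columns, obtained from
    the QR decomposition of a Gaussian random matrix."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((m, l))
    q, r = np.linalg.qr(g)
    # Flip column signs so the draw does not depend on QR sign conventions
    return q * np.sign(np.diag(r))

Q = random_orthonormal_basis(100, 3, seed=42)
```

The sign correction makes the draw invariant to the arbitrary sign choices inside the QR routine, a common refinement when sampling random orthonormal bases.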
Furthermore, direct simulations matching kurtosis become practically impossible when n is large. For instance, when n = 30 the Alexander et al. algorithm requires the solution of 30 different systems of non-linear equations, the smallest being a system of 3 equations (k = 1, i.e., for the first column) and the largest being a system of 32 equations (k = 30, i.e., the last column). In total one needs to solve non-linear equations to target Kollo skewness. But if we were to extend the system to target Kollo kurtosis directly, when n = 30 there would be non-linear equations to solve, which may even be beyond the reach of the high-performance computers in use today. Our method provides a practical solution to this problem because the orthonormal basis greatly reduces the number of equations that need to be solved. When matching Kollo skewness, our algorithm only needs to solve one quadratic equation and one cubic (k = 1) or linear () equation. In addition, we have an analytical solution in Corollary 1 for solving each column of once the first column of is available. By contrast, the algorithm proposed by Alexander et al. requires a numerical solution for one quadratic equation and k + 1 linear equations. When matching both Kollo skewness and Kollo kurtosis, our algorithm is much slower because it involves solving two quadratic equations and k linear equations for each column of Also, it requires numerical methods to solve a higher-dimensional, non-linear system. Nevertheless, it is the only algorithm that can target the first four multivariate moments exactly.
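The growth in the number of equations can be tallied directly from the system sizes quoted above: the two stated counts (3 equations for k = 1 and 32 for k = 30) imply k + 2 equations per column, an inference we make here for illustration:

```python
# Per-column system sizes for Kollo skewness matching when n = 30.
# The k + 2 per-column count is inferred from the two sizes stated in
# the text: 3 equations when k = 1, 32 equations when k = 30.
n = 30
sizes = [k + 2 for k in range(1, n + 1)]
total = sum(sizes)   # arithmetic series 3 + 4 + ... + 32 = 525
```

Even before any kurtosis conditions are added, the skewness system alone already requires several hundred non-linear equations at n = 30.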
We finish with a brief demonstration of the flexibility and efficiency of our moment-matching results. Not all matrices admit a solution for the conditions of Theorem 1 or 2, so the speed of the general Algorithms 1 and 2 depends on the time it takes to find the matrices for which real-valued vectors satisfying Theorem 1 or 2 exist. Similarly, for each may need to be regenerated several times until we obtain a real-valued solution that satisfies the conditions of Theorem 1 or 2. Although lk is at least 2 when matching the first three moments, and at least k + 2 when matching the first four moments, if computational speed becomes an issue we prefer to use a greater value for lk because it is easier to find a real-valued solution for an under-determined system than for an exactly determined one. But for matching the first three moments, Corollary 1 provides a simple set of conditions to check if we set lk = 2, as well as providing the analytical expressions for the elements of The computation times provided in use lk = 2 for matching the first three moments and for matching the first four moments.
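The regeneration step described above amounts to simple rejection sampling: draw a candidate, test the solvability condition, and redraw on failure. The sketch below is schematic; the solvability test shown is a toy stand-in, not the actual conditions of Theorem 1 or 2:

```python
import numpy as np

def regenerate_until_solvable(draw, is_solvable, max_tries=10_000):
    """Rejection loop: redraw a candidate until the moment-matching
    system admits a real-valued solution for it."""
    for tries in range(1, max_tries + 1):
        candidate = draw()
        if is_solvable(candidate):
            return candidate, tries
    raise RuntimeError("no solvable candidate within max_tries")

# Toy stand-in condition: 'solvable' when the first component is positive.
rng = np.random.default_rng(0)
candidate, tries = regenerate_until_solvable(
    draw=lambda: rng.standard_normal(3),
    is_solvable=lambda v: v[0] > 0,
)
```

The expected number of redraws is the reciprocal of the acceptance probability, which is why an under-determined system, with its larger solvable region, is faster in practice than an exactly determined one.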
presents the empirical comparison of the cumulative distribution function (CDF) generated by our algorithm for matching the Kollo skewness only, or both the Kollo skewness and kurtosis, against that generated by the algorithm of Alexander et al. (Citation2022).Footnote9 The Kollo skewness and kurtosis targets are derived empirically from the hourly cryptocurrency returns data. These moments have moderately large elements throughout, which makes them suitable for illustrating the two problems discussed above. The figure also reports the results of the Kolmogorov–Smirnov (KS) test, which quantifies the distance between the simulated and target samples, and the computation time for simulations matching the Kollo skewness only, since we have no benchmark comparison for matching Kollo kurtosis.
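The KS comparison can be reproduced marginal by marginal with a standard two-sample test. The snippet below is a minimal sketch: the Student-t samples are placeholders for the target and simulated marginals, not the data used in the figure:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
target = rng.standard_t(df=5, size=5000)      # stand-in target marginal
simulated = rng.standard_t(df=5, size=5000)   # stand-in simulated marginal

# Two-sample KS statistic measures the largest CDF discrepancy
stat, p_value = ks_2samp(target, simulated)
qualifies = p_value > 0.10   # a "qualified" trial at the 10% level
```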
First, the results replicate Alexander et al.’s finding that a more constrained selection of arbitrary values (i.e., in blue, relative to in red) generates much more leptokurtic samples but is much faster for sample generation. So users must choose between a better-fitted sample and a faster simulation. Note that the example distribution functions reported in are selected from “qualified” trials, i.e., trials that pass the KS test at the 10% significance level. Yet the distribution functions generated using Alexander et al.’s algorithm with both and still appear leptokurtic. By comparison, samples generated by our algorithms for matching Kollo skewness and kurtosis do not suffer from this leptokurtic problem. In addition, our algorithm for matching Kollo skewness can be a hundred times faster when targeting suitably distributed samples: it takes 0.08 s per simulation for our algorithm versus 8.04 s per simulation for Alexander et al.’s algorithm with The computation time for matching all four moments is not reported because it is always much slower than the other algorithms. This is in line with expectations because our algorithm requires at least k + 2 degrees of freedom when solving the kth column of
Finally, we emphasize that the more multivariate moments are matched exactly, the more information is preserved by ROM simulation. Our results thus suggest an avenue for future research on matching Kollo kurtosis as well as Kollo skewness, namely to look for an analytical solution for each column of once its first column is available. Similar to the impact of Corollary 1, which significantly speeds up our algorithm for matching Kollo skewness, such a solution should further enhance our algorithm for matching all four moments, so that such simulations become not just practically possible, as this paper demonstrates, but much faster to use in practice.
Correction Statement
This article has been corrected with minor changes. These changes do not impact the academic content of the article.
Supplemental Material
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 This category of measures includes the Malkovich and Afifi (Citation1973) measures, which were developed specifically for normality tests and extended by Balakrishnan et al. (Citation2007) and Srivastava (Citation1984); the Isogai (Citation1982) measure, an extension of the Pearson (Citation1895) mode skewness; and the Song (Citation2001) kurtosis measure, which is based on Rényi entropy.
2 An exception is the Srivastava (Citation1984) measures which are much easier to compute, but they too cannot be applied to moment-matching simulations. This is because they involve a principal component transformation, which is easy to obtain from a given sample, but it is difficult to generate a sample when we only know the Srivastava moments.
3 The Koizumi et al. measures are defined as the sum of the Mardia skewness and kurtosis of all sub-vectors of Xn, but they are complex to compute because the number of sub-vectors increases rapidly with the dimension of Xn.
4 To contrast the number of elements: in a 5-dimensional distribution we have 35 and 70 elements in the full skewness and kurtosis tensors, but only 5 and 15 elements in Kollo’s measures.
5 See Ledermann et al. (Citation2011) for discussions on the usual conditions for rotation matrices. Ledermann and Alexander (Citation2012) further investigate the sampling properties of different types of rotation matrices.
6 Alexander et al. (Citation2022) note a similar property for Kollo skewness.
7 In both cases our method may be regarded as a generalization of the original ROM simulation to higher-moment matching, because using only one orthonormal vector to generate each column of reduces our method to a ROM simulation which only targets the mean vector and covariance matrix.
8 If the constraints are not satisfied, the arbitrary values require re-generation until the equation system becomes solvable with these values.
9 We do not include the algorithm of Hanke et al. (Citation2017) because Alexander et al. (Citation2022) have already demonstrated that the modifications provided by their algorithm produce marginals with superior characteristics and that it is much more computationally efficient. The Hanke et al. method may also suffer from a high failure rate.
References
- Alexander, C., Meng, X., & Wei, W. (2022). Targeting Kollo skewness with random orthogonal matrix simulation. European Journal of Operational Research, 299(1), 362–376. https://doi.org/10.1016/j.ejor.2021.09.003
- Balakrishnan, N., Brito, M., & Quiroz, A. (2007). A vectorial notion of skewness and its use in testing for multivariate symmetry. Communications in Statistics-Theory and Methods, 36(9), 1757–1767. https://doi.org/10.1080/03610920601126225
- Bardsley, E. (2017). A finite mixture approach to univariate data simulation with moment matching. Environmental Modelling & Software, 90, 27–33. https://doi.org/10.1016/j.envsoft.2016.11.019
- Ching, J., Porter, K. A., & Beck, J. L. (2009). Propagating uncertainties for loss estimation in performance-based earthquake engineering using moment matching. Structure and Infrastructure Engineering, 5(3), 245–262. https://doi.org/10.1080/15732470701298323
- Efron, B. (1979). Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7(1), 1–26. https://doi.org/10.1214/aos/1176344552
- Ferraz, V., & Moura, F. (2012). Small area estimation using skew normal models. Computational Statistics & Data Analysis, 56(10), 2864–2874. https://doi.org/10.1016/j.csda.2011.07.005
- Foldnes, N., & Olsson, U. H. (2016). A simple simulation technique for nonnormal data with prespecified skewness, kurtosis, and covariance matrix. Multivariate Behavioral Research, 51(2–3), 207–219. https://doi.org/10.1080/00273171.2015.1133274
- Hanke, M., Penev, S., Schief, W., & Weissensteiner, A. (2017). Random orthogonal matrix simulation with exact means, covariances, and multivariate skewness. European Journal of Operational Research, 263(2), 510–523. https://doi.org/10.1016/j.ejor.2017.05.023
- Heath, A., Manolopoulou, I., & Baio, G. (2018). Efficient Monte Carlo estimation of the expected value of sample information using moment matching. Medical Decision Making, 38(2), 163–173. https://doi.org/10.1177/0272989X17738515
- Hertz, D. B. (1964). Risk analysis in capital investment. Harvard Business Review, 42, 95–106.
- Høyland, K., Kaut, M., & Wallace, S. W. (2003). A heuristic for moment-matching scenario generation. Computational Optimization and Applications, 24(2/3), 169–185. https://doi.org/10.1023/A:1021853807313
- Isogai, T. (1982). On a measure of multivariate skewness and a test for multivariate normality. Annals of the Institute of Statistical Mathematics, 34(3), 531–541. https://doi.org/10.1007/BF02481051
- Jobson, J. D., & Korkie, B. (1980). Estimation for Markowitz efficient portfolios. Journal of the American Statistical Association, 75(371), 544–554. https://doi.org/10.1080/01621459.1980.10477507
- Koizumi, K., Sumikawa, T., & Pavlenko, T. (2014). Measures of multivariate skewness and kurtosis in high-dimensional framework. SUT Journal of Mathematics, 50(2), 483–511.
- Kollo, T. (2008). Multivariate skewness and kurtosis measures with an application in ICA. Journal of Multivariate Analysis, 99(10), 2328–2338. https://doi.org/10.1016/j.jmva.2008.02.033
- Kuruvilla, J., & Gunavathi, K. (2014). Lung cancer classification using neural networks for CT images. Computer Methods and Programs in Biomedicine, 113(1), 202–209. https://doi.org/10.1016/j.cmpb.2013.10.011
- Ledermann, D., & Alexander, C. (2012). Further properties of random orthogonal matrix simulation. Mathematics and Computers in Simulation, 83, 56–79. https://doi.org/10.1016/j.matcom.2012.07.013
- Ledermann, W., Alexander, C., & Ledermann, D. (2011). Random orthogonal matrix simulation. Linear Algebra and Its Applications, 434(6), 1444–1467. https://doi.org/10.1016/j.laa.2010.10.023
- Ledoit, O., & Wolf, M. (2004). Honey, I shrunk the sample covariance matrix. The Journal of Portfolio Management, 30(4), 110–119. https://doi.org/10.3905/jpm.2004.110
- Looney, C. G. (1983). Monte Carlo simulation with moment matching samples. Mathematics and Computers in Simulation, 25(3), 237–240. https://doi.org/10.1016/0378-4754(83)90099-X
- Love, P. E., Sing, C. P., Wang, X., Edwards, D. J., & Odeyinka, H. (2013). Probability distribution fitting of schedule overruns in construction projects. Journal of the Operational Research Society, 64(8), 1231–1247. https://doi.org/10.1057/jors.2013.29
- Malkovich, J. F., & Afifi, A. (1973). On tests for multivariate normality. Journal of the American Statistical Association, 68(341), 176–179. https://doi.org/10.1080/01621459.1973.10481358
- Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519–530. https://doi.org/10.1093/biomet/57.3.519
- Mitchell, A. E. (1989). The information matrix, skewness tensor and α-connections for the general multivariate elliptic distribution. Annals of the Institute of Statistical Mathematics, 41(2), 289–304. https://doi.org/10.1007/BF00049397
- Móri, T. F., Rohatgi, V. K., & Székely, G. (1994). On multivariate skewness and kurtosis. Theory of Probability & Its Applications, 38(3), 547–551. https://doi.org/10.1137/1138055
- Oberkampf, W. L., DeLand, S. M., Rutherford, B. M., Diegert, K. V., & Alvin, K. F. (2002). Error and uncertainty in modeling and simulation. Reliability Engineering & System Safety, 75(3), 333–357. https://doi.org/10.1016/S0951-8320(01)00120-X
- Pearson, K. (1895). X. Contributions to the mathematical theory of evolution.—II. Skew variation in homogeneous material. Philosophical Transactions of the Royal Society of London A, 186, 343–414.
- Pellegrino, T., & Sabino, P. (2014). On the use of the moment-matching technique for pricing and hedging multi-asset spread options. Energy Economics, 45, 172–185. https://doi.org/10.1016/j.eneco.2014.06.014
- Qu, W., Liu, H., & Zhang, Z. (2020). A method of generating multivariate non-normal random numbers with desired multivariate skewness and kurtosis. Behavior Research Methods, 52(3), 939–946. https://doi.org/10.3758/s13428-019-01291-5
- Ryoo, H. S. (2007). A compact mean-variance-skewness model for large-scale portfolio optimization and its application to the NYSE market. Journal of the Operational Research Society, 58(4), 505–515. https://doi.org/10.1057/palgrave.jors.2602168
- Saliby, E., & Paul, R. (2009). A farewell to the use of antithetic variates in Monte Carlo simulation. Journal of the Operational Research Society, 60(7), 1026–1035. https://doi.org/10.1057/palgrave.jors.2602645
- Song, K. (2001). Rényi information, loglikelihood and an intrinsic distribution measure. Journal of Statistical Planning and Inference, 93(1-2), 51–69. https://doi.org/10.1016/S0378-3758(00)00169-5
- Srivastava, M. S. (1984). A measure of skewness and kurtosis and a graphical method for assessing multivariate normality. Statistics & Probability Letters, 2(5), 263–267. https://doi.org/10.1016/0167-7152(84)90062-2
Appendix A.
Real-world examples
We illustrate the Kollo skewness and kurtosis and compare them with the popular Mardia measures using a set of three hourly cryptocurrency returns (bitcoin, ether, and litecoin) and daily returns on eleven U.S. stock sector indices, of which we show just three sectors (energy, finance, and real estate), although the Kollo measures are computed from the entire eleven-dimensional distribution. The returns are presented in .
Given that each series is clearly far from i.i.d., we use a fixed-size rolling window to estimate the sample Kollo skewness and kurtosis. The sample size is 720 (∼1 month) for the hourly cryptocurrency data and 250 (∼1 year) for the daily stock index data. The relevant elements of the Kollo skewness vector and kurtosis matrix are depicted as time series in .
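For reference, the sample Kollo measures on each window can be computed directly from standardized observations. The sketch below assumes the Kollo (Citation2008) definitions, namely the skewness vector E[(1'z)^2 z] and the n x n kurtosis matrix E[(1'z)^2 z z'], where z is the standardized variable; the Cholesky whitening step and the function name are illustrative choices, not the exact estimator used here:

```python
import numpy as np

def kollo_moments(X):
    """Sample Kollo skewness vector and kurtosis matrix, assuming
    tau = E[(1'z)^2 z] and K = E[(1'z)^2 z z'] for standardized z."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    # Whiten with the Cholesky factor of the sample covariance matrix
    L = np.linalg.cholesky(np.cov(Xc, rowvar=False))
    Z = np.linalg.solve(L, Xc.T).T           # standardized observations
    w = Z.sum(axis=1) ** 2                   # (1'z)^2 for each observation
    tau = (w[:, None] * Z).mean(axis=0)      # Kollo skewness vector (n,)
    K = (w[:, None, None] * (Z[:, :, None] * Z[:, None, :])).mean(axis=0)
    return tau, K

rng = np.random.default_rng(7)
window = rng.standard_normal((720, 3))       # stand-in for one rolling window
tau, K = kollo_moments(window)
```

Note that K is symmetric by construction, consistent with the n(n + 1)/2 distinct elements of the Kollo kurtosis.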
The spikes in skewness and kurtosis clearly represent significant events in each sector. For example, during the financial crisis of 2008, we see a spike in both τ7 and following the collapse of Lehman Brothers in September 2008. The extremely low values of τ1 and the spike in during the first half of 2018 were caused by an upward surge in oil prices. Similarly, the upward spike in τ7 and in most elements in Kollo kurtosis on 16 March 2020 resulted from extreme negative returns across all financial markets following the outbreak of Covid-19.
Almost all elements of the Kollo skewness vector lie in the range with the most sampling variability in τ1 and τ3. The Kollo kurtosis ranges from 0 to 200. The spikes in Kollo skewness and Kollo kurtosis occur simultaneously. In particular, the global crash in risky assets after the outbreak of Covid in January 2020 is evident from the dip in Kollo skewness down to almost −10 in both data sets. The Kollo kurtosis takes a narrower range in the stock sector data than for the cryptocurrency data.
Compared with the Mardia measures, the Kollo measures are more sensitive to the extremes of the distribution. Unlike the Mardia skewness, the Kollo measure also captures the direction of asymmetry in the marginals. The Kollo measures also capture more information from the co-moments, not only from the marginals, and they can exhibit substantial variability even when the Mardia measures remain relatively stable over time. Furthermore, the Kollo measures can indicate which variable contributes most to an extreme value of skewness or kurtosis, whereas the Mardia measures only summarize the overall information in a scalar value.
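The scalar Mardia measures used in this comparison can be computed from the same standardized sample; a common formulation (Mardia, Citation1970) is b1 = N^-2 sum_ij (z_i' z_j)^3 and b2 = N^-1 sum_i (z_i' z_i)^2, sketched below with synthetic data in place of the returns:

```python
import numpy as np

def mardia_moments(z):
    """Mardia's scalar skewness b1 and kurtosis b2 for standardized rows z."""
    g = z @ z.T                      # inner products z_i' z_j
    b1 = (g ** 3).mean()             # (1/N^2) sum_ij (z_i' z_j)^3, always >= 0
    b2 = (np.diag(g) ** 2).mean()    # (1/N)  sum_i  (z_i' z_i)^2
    return b1, b2

rng = np.random.default_rng(3)
z = rng.standard_normal((2000, 3))   # approximately standardized sample
b1, b2 = mardia_moments(z)
# For a 3-dimensional normal, b2 concentrates near n(n + 2) = 15
```

Because b1 collapses all third-order co-moments into a single non-negative number, it cannot indicate the direction of asymmetry, which is exactly the information the Kollo skewness vector retains.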
Appendix B.
Mathematical proofs
B.1 Proof of Lemma 1
Write and where at least one element is non-zero. Now if the column vectors in are also orthogonal to the vectors we have:
Then, since is orthogonal to we have and hence: and for
B.2 Proof of Theorem 1
When k = 1 the conditions for are set out in Equation (11) and by Lemma 1 we have so we only need to solve the second two equations in Equation (11). The quadratic equation can be written as: because And the cubic equation becomes: which proves Equation (13a). When we need to solve a quadratic Equation (9c) and a linear Equation (9d). The quadratic equation may be written as: as before. The linear equation becomes: which proves Equation (13b).
B.3 Proof of Corollary 1
Equation (12) defines a 2-dimensional sphere and Equation (13b) defines a 2-dimensional plane. Hence the question of whether the system of equations has real roots can be converted into the question of whether the sphere intersects the plane. We know that a necessary and sufficient condition for this is that the shortest distance between the centre of the sphere and the plane is no more than the radius of the sphere, i.e., which proves Equation (14). The sphere and the linear equation for the plane may be written as follows,
Hence we may set: (B1) and so the sphere equation becomes:
Let and denote the first and second elements of i.e., and Then the above equation becomes:
Using the standard formula for the roots of this equation, the solutions are:
Finally, we substitute these values for into Equation (B1) to obtain:
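The geometric criterion underlying this proof is easy to verify numerically: a plane a'x = b intersects a sphere of radius r centred at c exactly when the distance |a'c - b| / ||a|| is at most r. The check below uses placeholder coefficients, since the actual centre, radius, and plane coefficients involve quantities omitted here:

```python
import numpy as np

def sphere_meets_plane(centre, radius, a, b):
    """True when the plane a'x = b intersects the sphere |x - centre| = radius,
    i.e., when the centre-to-plane distance is at most the radius."""
    distance = abs(a @ centre - b) / np.linalg.norm(a)
    return distance <= radius

# Placeholder coefficients: unit sphere at the origin, plane x + y + z = 1.
hits = sphere_meets_plane(np.zeros(3), 1.0, np.ones(3), 1.0)    # 1/sqrt(3) <= 1
misses = sphere_meets_plane(np.zeros(3), 0.5, np.ones(3), 1.0)  # 1/sqrt(3) > 0.5
```

In the algorithm this test serves as the real-root condition of Equation (14): when it fails, the candidate basis must be regenerated.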
B.4 Proof of Theorem 2
To match Kollo kurtosis each column of needs to satisfy Equation (9e) in addition to the conditions (9a)–(9d); thus (i)–(iii) have been proven in Theorem 1, and the following focuses on (iv). When k = 1, needs to satisfy a quartic Equation (16), which is easily rewritten as Equation (17a), viz.
When should satisfy condition Equation (9e), which contains a quadratic equation when i = k and k − 1 linear equations for First, we focus on the quadratic equation: which proves condition Equation (17b). Next, consider the k − 1 linear equations, for each i < k,
Then Equation (17c) follows simply by rewriting these in matrix form.