Full article: Monte Carlo Simulation for Trading Under a Lévy-Driven Mean-Reverting Framework

Formulae display: $MathJax Logo$ ?Mathematical formulae have been encoded as MathML and are displayed in this HTML version using MathJax in order to improve their display. Uncheck the box to turn MathJax off. This feature requires Javascript. Click on a formula to zoom.

Abstract

We present a Monte Carlo approach to pairs trading on mean-reverting spreads modelled by Lévy-driven Ornstein-Uhlenbeck processes. Specifically, we focus on using a variance gamma driving process, an infinite activity pure jump process to allow for more flexible models of the price spread than is available in the classical model. However, this generalization comes at the cost of not having analytic formulas, so we apply Monte Carlo methods to determine optimal trading levels and develop a variance reduction technique using control variates. Within this framework, we numerically examine how the optimal trading strategies are affected by the parameters of the model. In addition, we extend our method to bivariate spreads modelled using a weak variance alpha-gamma driving process, and explore the effect of correlation on these trades.

Keywords:

1. Introduction

Many empirical studies have demonstrated examples of mean-reverting spreads in different markets. They can be generated by taking long and short positions in stocks and ETFs (Avellaneda and Lee Citation2010; Gatev, Goetzmann, and Rouwenhorst Citation2006; Huck and Afawubo Citation2015), futures contracts (CitationBrennan and Schwartz; CitationDai, Zhong, and Kwok), physical commodity and commodity ETFs (Kanamura, Rachev, and Fabozzi Citation2010; Leung and Li Citation2016), as well as cryptocurrencies (Leung and Nguyen Citation2019).

Another popular approach, the stochastic spread method captures the path behaviour of the spread through a stochastic process with mean reversion. Given a mean-reverting spread process, we consider the problem of pairs trading, that is, extracting trading signals from the model based on the idea that a trade should be entered when the spread is far from the mean, and exited when it returns to the mean, so that the trader profits based on the tendency of the process to mean revert. This is a form of statistical arbitrage. The construction of spreads and extraction of trading signals are derived from the analysis of the underlying model. For instance, Elliott, Van Der Hoek, and Malcolm (Citation2005) propose a mean-reverting Gaussian Markov chain model to describe spread dynamics. The model's estimates are compared with observations of the spread to determine appropriate trading decisions. They solve an optimal double-stopping problem to analyse the timing of entry and exit subject to transaction costs.

Most prominently, Ornstein-Uhlenbeck (OU) processes have been used to model spreads for pairs trading. For classical OU processes, which are driven by Brownian motion, closed-form solutions to the pairs trading problem were obtained in Leung and Li (Citation2015) and Lipton and López de Prado (Citation2020).

Lévy processes have been widely applied in finance to model price processes with jumps (Cont and Tankov Citation2004; Schoutens Citation2003), which leads us to model price spreads with jumps. These Lévy models often provide a better fit than models based on Brownian motion (Madan, Carr, and Chang Citation1998; Michaelsen and Szimayer Citation2018; Schoutens Citation2003), where the first two references focus on variance gamma-based models. Jumps can also arise from hard-to-borrow stocks (Avellaneda and Lipkin Citation2009) or illiquid stocks (Gardini, Sabino, and Sasso Citation2022). This motivates going beyond the classical OU processes to consider Lévy-driven Ornstein-Uhlenbeck processes (LDOUP) to model mean-reverting spreads with jumps in pairs trading.

LDOUPs have been applied to other areas of mathematical finance such as modelling stochastic volatility (Barndorff-Nielsen and Shephard Citation2001), energy prices and energy derivatives (Benth and Schmeck Citation2014; Cummins, Kiely, and Murphy Citation2018; Sabino Citation2020), credit risk (Cariboni and Schoutens Citation2009), among others. LDOUPs also have increased flexibility for capturing skewness, kurtosis, and dependence in the multivariate setting. Parametric estimation of LDOUPs using maximum likelihood has been considered in the univariate and multivariate cases by Valdivieso, Schoutens, and Tuerlinckx (Citation2009) and Lu (Citation2022), respectively, and these methods can be applied to fit LDOUPs to spreads. However, in the context of pairs trading, LDOUPs have been applied restricting the driving process to compound Poisson processes with double exponential jumps. This setup is used since Borovkov and Novikov (Citation2008) derived an analytic formula for the expected exit time which relies on the memorylessness property of the exponential jumps to deal with overshoots. Without this assumption, overshoots present a difficult technical hurdle to finding analytic formulas for the passage times of these processes in general, which requires Monte Carlo methods to overcome. Based on Borovkov and Novikov (Citation2008) and Wu, Zang, and Zhao (Citation2020) provide an analytic value function for pairs trading, while Endres and Stübinger (Citation2019) use the analytic formula of Borovkov and Novikov (Citation2008), while ignoring overshoots, to conduct an empirical study of optimal pairs trading.

In this paper, we discuss a framework for pairs trading using Lévy-driven Ornstein-Uhlenbeck processes to model the mean-reverting spread, which works with any LDOUP that can be simulated. Specifically, we focus on a variance gamma driving process, which is a time-changed Brownian motion and a pure-jump, infinite activity Lévy process with skewness and kurtosis parameters. By using this driving process, these properties extend to the LDOUP, allowing for more flexible models of the price spread than is available in the classical OU model, and in contrast to the finite activity processes considered previously. However, the generalization comes at the cost of not having analytic formulas. Hence, we develop a Monte Carlo method to evaluate the trading strategies and determine the optimal trading levels. In particular, we introduce an array of control variates to reduce the variance of the Monte Carlo estimator of the value function. We derive a formula for the variance reduction ratio when the control variate is used to estimate the mean and variance of the profit simultaneously, which extends the usual approach where only one parameter is estimated by control variates. Furthermore, in our numerical implementation, we examine how the parameters of the model affect the optimal trading levels and optimal expected profit in various scenarios. Specifically, we demonstrate numerically how skewness in the distribution of the spread leads to asymmetry in the optimal trading levels, how the optimal exit level can differ from the stationary mean when the process starts sufficiently far away from it and the discount rate is large, and it is otherwise difficult to find other situations where this occurs. And lastly, we demonstrate how the jumps can increase the optimal entry level and optimal expected profit.

We note that all the studies mentioned above focus on the trading performance of a single spread. The recent paper by Lee, Leung, and Ning (Citation2023) introduces a framework for trading multiple mean-reverting spreads simultaneously, where capital is dynamically allocated among different spreads based on their statistical characteristics. In this paper, we also consider the trading problem involving two mean-reverting spreads modelled by an LDOUP driven by a weak variance alpha-gamma process, which is a multivariate generalization of the variance gamma process introduced in Buchmann, Lu, and Madan (Citation2019). For bivariate pairs trading, we estimate the expected profit via Monte Carlo simulation and examine the effect of the correlation between spreads on the optimal trading levels and optimal expected profit. We demonstrate an example where the optimal strategy is to effectively pick the best spread mostly ignoring correlation, and an example where having low absolute correlation is more profitable when monitoring multiple spreads with marginal components that follow the same law.

The rest of the paper is structured as follows. In Section 2, we review Lévy-driven Ornstein-Uhlenbeck processes and required properties used in this study. The trading problem is formulated in Section 3. We discuss our Monte Carlo estimation framework in Section 4 and present the numerical results in Section 5. In Section 6, we examine the trading problem involving two mean-reverting spreads. Concluding remarks are provided in Section 7.

2. Lévy-Driven Ornstein-Uhlenbeck Process

Let $Z \sim L^{n} (μ, Σ, Z)$ denote an n-dimensional Lévy process with characteristic triplet $(μ, Σ, Z)$ , where $μ \in R^{n}$ , $Σ \in R^{n \times n}$ is a covariance matrix and $Z$ is an n-dimensional Lévy measure. See Bertoin (Citation1996), Sato (Citation1999), and Schoutens (Citation2003) for references on Lévy processes.

A Lévy-driven Ornstein-Uhlenbeck process (LDOUP) $X$ is defined by the stochastic differential equation (1) $d X (t) = - λ X (t) d t + d Z (λt), X (0) = X_{0}, t \geq 0,$ (1) where $Z \sim L^{n} (μ, Σ, Z)$ is the background driving Lévy process (BDLP), $λ > 0$ , and $X_{0}$ is a random vector independent of $Z$ . Note that there is no loss in generality in using the drift term $- λ X (t) d t$ instead of $λ (\bar{μ} - X (t)) d t$ for some $\bar{μ} \in R^{n}$ , since the stationary mean of $X$ is controlled by $E [Z (1)]$ . Also, there is no loss in generality in using the BDLP $Z \circ (λI)$ , where I is the identity function, instead of $Z$ , since any Lévy process $\tilde{Z} \sim L^{n} (\tilde{μ}, \tilde{Σ}, \tilde{Z})$ can be written in the form $\tilde{Z} \overset{D}{=} Z \circ (λI)$ , where $Z \sim L^{n} (\tilde{μ} / λ, \tilde{Σ} / λ, \tilde{Z} / λ)$ . However, the former is more convenient since it leads to a stationary distribution that does not depend on λ. The exposition on LDOUP here follows Lu (Citation2022), while some additional references for LDOUPs include Masuda (Citation2004), Sato and Yamazato (Citation1983), and Sato and Yamazato (Citation1985).

The solution of (Equation1(1) $d X (t) = - λ X (t) d t + d Z (λt), X (0) = X_{0}, t \geq 0,$ (1) ) is (2) $X (t) = e^{- λt} X (0) + e^{- λt} \int_{0}^{t} e^{λs} d Z (λs), t \geq 0.$ (2) For $t_{0} = 0, t_{1} = Δ, \dots, t_{q} = q Δ$ , at the observation time $t_{i}$ , we have $X (t_{i}) = e^{- λ Δ} (X (t_{i - 1}) + \int_{t_{i - 1}}^{t_{i}} e^{λs} d Z (λs)), i = 1, \dots, q .$ The stochastic integral term is iid, and we let $Z^{*} (Δ)$ be the random vector with this distribution, so it is interpreted as the innovation term, up to a factor $e^{- λ Δ}$ , of an AR(1) process. Specifically, we define $Z^{*} (Δ) = \int_{0}^{Δ} e^{λs} d Z (λs) \overset{D}{=} \int_{t_{i - 1}}^{t_{i}} e^{λs} d Z (λs), Δ > 0.$ The random vector $Z^{*} (Δ)$ is infinitely divisible, so it is determined by its characteristic exponent, which is given by (3) $Ψ_{Z^{*} (Δ)} (θ) = \int_{0}^{λ Δ} Ψ_{Z} (e^{t} θ) d t, θ \in R^{n},$ (3) where $Ψ_{Z}$ is the characteristic exponent of the BLDP $Z$ . Combining these results, the LDOUP can be written as (4) $X (t_{i}) = e^{- λ Δ} (X (t_{i - 1}) + Z^{*} (Δ)^{(i)}), i = 1, \dots, q,$ (4) where $Z^{*} (Δ)^{(i)}$ is iid with the distribution of $Z^{*} (Δ)$ .

Next, we consider specific examples of LDOUP processes where the BDLP is a tractable pure-jump infinite activity Lévy process to model price spreads for pairs trading. We consider both the univariate and multivariate cases, where the BDLP is a variance gamma and a weak variance alpha-gamma process, respectively.

2.1. Univariate LDOUP Driven by a Variance Gamma Process

The variance gamma process was introduced in Madan and Seneta (Citation1990) and additional details can be found in Madan, Carr, and Chang (Citation1998). Let $Γ_{S} (a, b)$ denote a gamma subordinator with shape a>0 and rate b>0, and $B M^{n} (μ, Σ)$ denote a Brownian motion with drift $μ = (μ_{1}, \dots, μ_{n}) \in R^{n}$ and covariance matrix $Σ = (Σ_{kl}) \in R^{n \times n}$ . A Lévy process $V \sim V G^{n} (b, μ, Σ)$ is a variance gamma (VG) process if $V \overset{D}{=} ηI + B \circ (G, \dots, G),$ where $B \sim B M^{n} (μ, Σ)$ and $G \sim Γ_{S} (b, b)$ are independent, $η \in R^{n}$ , and I is the identity function. The parameter $η$ adds a drift.

We call the univariate LDOUP $X \sim OU - VG (λ, b, μ, σ^{2}, η)$ an $OU - VG$ process if its BDLP is $Z \sim V G^{1} (b, μ, σ^{2}, η)$ .

2.2. Multivariate LDOUP Driven by a Weak Variance Alpha-Gamma Process

The weak variance alpha-gamma (WVAG) process was introduced in Buchmann, Lu, and Madan (Citation2019). Let $n \geq 2$ , a>0, $α = (α_{1}, \dots, α_{n}) \in (0, 1 / a)^{n}$ , $β_{k} := (1 - a α_{k}) / α_{k}$ , $k = 1, \dots, n$ and $η \in R^{n}$ . Let $α ⋄ μ := (α_{1} μ_{1}, \dots, α_{n} μ_{n}) \in R^{n}$ and $α ⋄ Σ := (Σ_{kl} (α_{k} \land α_{l})) \in R^{n \times n}$ . We define $Z \sim WVA G^{n} (a, α, μ, Σ, η)$ as a weak variance alpha-gamma (WVAG) process if (5) $Z \overset{D}{=} ηI + V_{0} + (V_{1}, \dots, V_{n}),$ (5) where $V_{0} \sim V G^{n} (a, a α ⋄ μ, a α ⋄ Σ)$ and $V_{k} \sim V G^{1} (β_{k}, α_{k} β_{k} μ_{k}, α_{k} β_{k} Σ_{kk})$ , $k = 1, \dots, n$ , are independent. This formulation gives a direct method of simulating the WVAG process in terms of VG processes, and seeing that the process has common and idiosyncratic jumps.

The original definition of the WVAG process (with $η = 0$ ), (6) $Z \overset{D}{=} B ⊙ T,$ (6) where $B \sim B M^{n} (μ, Σ)$ , $T$ an n-dimensional alpha-gamma subordinator, and ⊙ is the weak subordination operation, gives a multivariate time-change interpretation to the process, generalizing the univariate VG process. Indeed, the marginal components of the WVAG process are general univariate VG processes with $Z_{k} \sim V G^{1} (1 / α_{k}, μ_{k}, Σ_{kk}, η_{k})$ .

We call a multivariate LDOUP $X \sim OU - WVA G^{n} (λ, a, α, μ, Σ, η)$ an OU-WVAG process if its BDLP is $Z \sim WVA G^{n} (a, α, μ, Σ, η)$ .

Based on the moment formulas (see Michaelsen and Szimayer Citation2018), the parameters of the WVAG process can be interpreted as follows. The marginal parameters $α_{k}, μ_{k}, Σ_{kk}, η_{k}$ are kurtosis, skewness, variance, and location parameters, respectively, that affect the marginal component distribution, while $a, Σ_{kl}, k \neq l$ , are joint parameters. In particular, as $α_{k} \to 0$ (equivalently, the VG parameter $b_{k} = 1 / α_{k} \to \infty$ in the kth component), the kth marginal component of the WVAG process converges in law to Brownian motion so the sample paths become closer to continuous.

2.3. Moment Formulas

For a random vector $U = (U_{1}, \dots, U_{n})$ , denote the mean by $m_{1} (U) := E [U]$ , and the kth central moment of $U$ by $m_{k} (U) := (E [(U_{1} - E [U_{1}])^{k}], \dots, E [(U_{n} - E [U_{n}])^{k}])$ , $k \geq 2$ .

For a general BDLP $Z$ , Lu (Citation2022, Lemma 1) gives formulas for the moments of $Z^{*} (Δ)$ in terms of the moments of $Z$ . Combining this with (Equation2(2) $X (t) = e^{- λt} X (0) + e^{- λt} \int_{0}^{t} e^{λs} d Z (λs), t \geq 0.$ (2) ) and conditional on $X_{0}$ (or assuming it is constant), the moments of the LDOUP $X$ are given by $\begin{aligned} m_{1} (X (t)) & = e^{- λt} X_{0} + (1 - e^{- λt}) m_{1} (Z (1)), \\ m_{2} (X (t)) & = (\frac{1 - e^{- 2 λt}}{2}) m_{2} (Z (1)), \\ m_{3} (X (t)) & = (\frac{1 - e^{- 3 λt}}{3}) m_{3} (Z (1)), \\ Cov (X_{k} (t), X_{l} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) Cov (Z_{k} (1), Z_{l} (1)), k \neq l . \end{aligned}$ The stationary distribution $Y$ of $X$ is the distribution such that if $X_{0} \overset{D}{=} Y$ , then $X (t) \overset{D}{=} Y$ for all $t \geq 0$ . Thus, the stationary mean, which is the mean of the stationary distribution, is $\bar{μ} = E [Z (1)]$ . This is the level that we expect the process to return to in pairs trading.

Now specializing to the case where $X \sim OU - WVA G^{n} (λ, a, α, μ, Σ, η)$ , and using the moment formulas for the WVAG process in Michaelsen and Szimayer (Citation2018, Remark 4 and Appdendix A.1), we have (7) $\begin{aligned} E [X_{1} (t)] & = e^{- λt} X_{0} + (1 - e^{- λt}) (η_{1} + μ_{1}), \end{aligned}$ (7) (8) $\begin{aligned} Var (X_{1} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + α_{1} μ_{1}^{2}), \end{aligned}$ (8) (9) $\begin{aligned} Skew (X_{1} (t)) & = \frac{(\frac{1 - e^{- 3 λt}}{3}) (3 Σ_{11} μ_{1} α_{1} + 2 μ_{1}^{3} α_{1}^{2})}{{((\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + μ_{1}^{2} α_{1}))}^{3 / 2}} \end{aligned}$ (9) (10) $\begin{aligned} Cov (X_{1} (t), X_{2} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) a ((α_{1} \land α_{2}) Σ_{12} + α_{1} α_{2} μ_{1} μ_{2}) . \end{aligned}$ (10) Furthermore, $\bar{μ} = μ + η$ . These formulas will be used for control variates in the Monte Carlo method. Note that (Equation7(7) $\begin{aligned} E [X_{1} (t)] & = e^{- λt} X_{0} + (1 - e^{- λt}) (η_{1} + μ_{1}), \end{aligned}$ (7) ) to (Equation9(9) $\begin{aligned} Skew (X_{1} (t)) & = \frac{(\frac{1 - e^{- 3 λt}}{3}) (3 Σ_{11} μ_{1} α_{1} + 2 μ_{1}^{3} α_{1}^{2})}{{((\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + μ_{1}^{2} α_{1}))}^{3 / 2}} \end{aligned}$ (9) ) apply to OU-VG processes with $b_{1} = 1 / α_{1}$ .

3. Pairs Trading Problem

Suppose the spread of a pair of assets or the price of a portfolio of assets is mean-reverting and follows a univariate LDOUP X.

Now we proceed to determine the levels at which trades should be entered or exited. Consider the following pairs trading strategy on the spread X: Enter a short position in X when it is above $\bar{μ} + d$ , and close the position by buying when it is below $\bar{μ} + c$ , where $0 \leq c < d$ . Enter a long position when it is below $\bar{μ} - d$ , and close the position by selling when it is above $\bar{μ} - c$ . Fix a terminal time T>0, where the trade is closed by time T if it has not been closed already. Thus, d and c are the entry and exit levels relative to the mean reversion level $\bar{μ}$ .

This trading strategy is depicted for a sample path of X in Figure . Note that the price at which the trade enters and exits is not the same as the entry and exit levels due to the presence of jumps. In this model when X is an OU-VG process, there is always some degree of overshoot.

Figure 1. For the spread process X (black line), a trade is entered when the price passes $\pm d$ , whichever happens first (red lines), and exited when it passes $\pm c = 0$ (blue line). The value of the spread at the entry (red point) and exit (blue point) times are shown. Here, $X \sim OU - VG (1, 5, 0, 0.015, 0)$ .

Figure 1. For the spread process X (black line), a trade is entered when the price passes ±d, whichever happens first (red lines), and exited when it passes ±c=0 (blue line). The value of the spread at the entry (red point) and exit (blue point) times are shown. Here, X∼OU-VG(1,5,0,0.015,0).

More generally, we can consider possibly different entry levels $d_{+}$ , $d_{-}$ from above and below $\bar{μ}$ , and their respective exit levels $c_{+}, c_{-}$ . These satisfy $0 \leq c_{\pm} < d_{\pm}$ .

The relevant passage times are $\begin{aligned} τ_{d_{+}} & = inf {t \geq 0 : X (t) > \bar{μ} + d_{+}} \land T, \\ τ_{c_{+}} & = inf {t \geq τ_{d_{+}} : X (t) < \bar{μ} + c_{+}} \land T, \\ τ_{d_{-}} & = inf {t \geq 0 : X (t) < \bar{μ} - d_{-}} \land T, \\ τ_{c_{-}} & = inf {t \geq τ_{d_{-}} : X (t) > \bar{μ} - c_{-}} \land T . \end{aligned}$ Let $r \in R$ be a discount rate, then P is the discounted profit over one trade cycle given by (11) $\begin{aligned} P := P (c_{+}, c_{-}, d_{+}, d_{-}) & = 1_{{τ_{d_{+}} \leq τ_{d_{-}}}} e^{- r τ_{c_{+}}} (X (τ_{d_{+}}) - X (τ_{c_{+}})) \\ + 1_{{τ_{d_{+}} > τ_{d_{-}}}} e^{- r τ_{c_{-}}} (X (τ_{c_{-}}) - X (τ_{d_{-}})) . \end{aligned}$ (11) For convenience, we will refer to this simply as the profit. The value function is (12) $\begin{aligned} V := V (c_{+}, c_{-}, d_{+}, d_{-}) & = E [P] - γVar (P) \\ = E [P] - γ E [P^{2}] + γ E [P]^{2}, \end{aligned}$ (12) where $γ \geq 0$ is a parameter which penalizes trading strategies with high variance. If $γ = 0$ , the value function is the expected profit.

We want to find the optimal levels $c_{\pm}, d_{\pm}$ for the pairs trading strategy to maximize the value function, that is, $max_{c_{+}, c_{-}, d_{+}, d_{-}} V (c_{+}, c_{-}, d_{+}, d_{-}) .$ We develop a Monte Carlo approach to simulate sample paths of the LDOUP X, to estimate $V (c_{+}, c_{-}, d_{+}, d_{-})$ on $0 \leq c_{\pm} < d_{\pm}$ , and in turn numerically solve the maximization problem for the optimal trading levels. Given the use of simulation, we also discuss control variates for variance reduction.

4. Monte Carlo Method

Let m be the number of Monte Carlo simulations. For a spread process following a univariate LDOUP X, suppose we have m simulations of $X (t_{0}), \dots, X (t_{q})$ with a small stepsize $Δ > 0$ . Then we can compute the random variable P, and hence obtain the Monte Carlo estimate of $E [P]$ and $E [P^{2}]$ to estimate the value function V in (Equation12(12) $\begin{aligned} V := V (c_{+}, c_{-}, d_{+}, d_{-}) & = E [P] - γVar (P) \\ = E [P] - γ E [P^{2}] + γ E [P]^{2}, \end{aligned}$ (12) ). The same sample paths are used for all evaluations of the same value function. While the Monte Carlo method presented here works for any LDOUP that can be fitted and simulated, we specifically focus on the OU-VG process.

4.1. Simulating the Spread

Let the spread process be $X \sim OU - VG (λ, b, μ, σ^{2}, η)$ . We assume that the parameters are known. For example, they may be estimated using maximum likelihood as in Valdivieso, Schoutens, and Tuerlinckx (Citation2009) and Lu (Citation2022).

It is possible to exactly simulate $Z^{*} (Δ)$ for the OU-VG process, and hence $X (t_{0}),, \dots, X (t_{q})$ by (Equation4(4) $X (t_{i}) = e^{- λ Δ} (X (t_{i - 1}) + Z^{*} (Δ)^{(i)}), i = 1, \dots, q,$ (4) ), using the method of Sabino (Citation2020) and Qu, Dassios, and Zhao (Citation2021). For $G \sim Γ_{S} (a, b)$ , define the random variable $G^{*} (Δ) \overset{D}{=} e^{λ Δ} (\tilde{G} + \tilde{C})$ , where $\tilde{G} \sim Γ (aλ Δ, b e^{λ Δ})$ is gamma where the parameters are the shape and rate, respectively, and $\tilde{C} \sim CP (a λ^{2} Δ^{2} / 2, P_{J})$ is compound Poisson where the parameters are the rate and jump distribution, respectively. Here, $P_{J}$ is the probability law of J such that $J | U \sim Exponential (b e^{λ Δ \sqrt{U}})$ where the parameter is the rate, and $U \sim Uniform (0, 1)$ . Then $Z^{*} (Δ) \overset{D}{=} η (e^{λ Δ} - 1) + G_{+}^{*} (Δ) - G_{-}^{*} (Δ),$ where $G_{+} \sim Γ_{S} (b, b_{+})$ , $G_{-} \sim Γ_{S} (b, b_{-})$ , and $b_{\pm} := \frac{2 b}{\sqrt{μ^{2} + 2 σ^{2} b} \pm μ} .$

4.2. Control Variates

Next, we use control variates to reduce the variance of the Monte Carlo estimators of the value function. We refer to Glasserman (Citation2003, Chapter 4) for the details of the method. Specifically, to estimate $E [P]$ , the method of control variates defines the random variable $P_{C} = P - β (C - E [C])$ , where C is the control variate for which $E [C]$ has an analytic representation, and β is chosen to minimize $Var (P_{C})$ , resulting in an unbiased estimator with reduced variance. Since the true value of β is unknown, the control variate estimator ${\hat{P}}_{C}$ is the sample mean of $P_{C}$ with β estimated by $\hat{β}$ , the least squares estimator (LSE) of the slope in a linear regression where C is the regressor and P is the response variable. Note that $\hat{β}$ is also the sample estimate of the true β.

This method can be extended to multiple control variates, and the control variate estimator is equivalent to the prediction of a linear regression. See Appendix for details.

We apply the control variate method to estimate both $E [P]$ and $E [P^{2}]$ in the value function (Equation12(12) $\begin{aligned} V := V (c_{+}, c_{-}, d_{+}, d_{-}) & = E [P] - γVar (P) \\ = E [P] - γ E [P^{2}] + γ E [P]^{2}, \end{aligned}$ (12) ). Consider the control variates $X (t_{1}) - \bar{μ}, \dots, X (t_{p}) - \bar{μ}$ , where $0, t_{1}, \dots, t_{p}$ are equally spaced points on $[0, T]$ . When estimating $E [P^{2}]$ , we include the additional control variates $(X (t_{1}) - \bar{μ})^{2}, \dots, (X (t_{p}) - \bar{μ})^{2}$ . Note that $E [X (t) - \bar{μ}]$ and $E [(X (t) - \bar{μ})^{2}]$ can be analytically computed using (Equation7(7) $\begin{aligned} E [X_{1} (t)] & = e^{- λt} X_{0} + (1 - e^{- λt}) (η_{1} + μ_{1}), \end{aligned}$ (7) ) and (Equation8(8) $\begin{aligned} Var (X_{1} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + α_{1} μ_{1}^{2}), \end{aligned}$ (8) ).

Further, we may want to consider the event of entering and exiting a trade, and the event of entering and not exiting over the observations $X (t_{1}), \dots, X (t_{p})$ as additional control variates to approximate the indicator variable appearing in the profit function (Equation11(11) $\begin{aligned} P := P (c_{+}, c_{-}, d_{+}, d_{-}) & = 1_{{τ_{d_{+}} \leq τ_{d_{-}}}} e^{- r τ_{c_{+}}} (X (τ_{d_{+}}) - X (τ_{c_{+}})) \\ + 1_{{τ_{d_{+}} > τ_{d_{-}}}} e^{- r τ_{c_{-}}} (X (τ_{c_{-}}) - X (τ_{d_{-}})) . \end{aligned}$ (11) ). But the event that $X (t_{1}), \dots, X (t_{p})$ passes some level cannot be computed analytically for use as a control variate. Instead, the probability of a related event using the iid sequence of innovations $Z^{*} (Δ)^{(i)}$ , $i = 1, \dots, p$ can be. Specifically, this motivates the inclusion of two additional control variates $1_{A}$ and $1_{B}$ . Define A as the event that $Z^{*} (Δ)^{(1)}, \dots, Z^{*} (Δ)^{(p)}$ “enters a trade” (even though there is no actual trading on this sequence) by passing the level $e^{λ Δ} (\bar{μ} + d) - \bar{μ}$ or $e^{λ Δ} (\bar{μ} - d) - \bar{μ}$ and then “exits the trade” by passing the level $e^{λ Δ} (\bar{μ} + c) - \bar{μ}$ or $e^{λ Δ} (\bar{μ} - c) - \bar{μ}$ , respectively.

The modified levels for $Z^{*} (Δ)^{(i)}$ are the levels such that $X (t_{i}) = e^{- λ Δ} (X (t_{i - 1}) + Z^{*} (Δ)^{(i)})$ passes the entry levels $μ \pm d$ when $X (t_{i - 1}) = \bar{μ}$ , and similarly for the exit levels. Further, B is the same as A except without exit. Thus, A and B are the events that $Z^{*} (Δ)^{(i)}$ , $i = 1, \dots, p$ , enters and exits these modified levels, and enters and does not exit, respectively. Due to collinearity, the complementary event that the sequence does not enter is not included, and if p = 1, the event A, which has probability 0, is also not included. We have expressed this for symmetric levels $c = c_{+} = c_{-}$ and $d = d^{+} = d_{-}$ , but this can be modified in an obvious way for the asymmetric case.

Since $Z^{*} (Δ)^{(i)}$ , $i = 1, \dots, p$ , are iid, we can compute the probabilities of A and B since we can compute $P (Z^{*} (Δ) \leq x)$ by applying the Fourier inversion to the characteristic function of $Z^{*} (Δ)$ determined by (Equation3(3) $Ψ_{Z^{*} (Δ)} (θ) = \int_{0}^{λ Δ} Ψ_{Z} (e^{t} θ) d t, θ \in R^{n},$ (3) ). The Fourier inversion methodology is described in Michaelsen and Szimayer (Citation2018, Section 4.1).

4.3. Variance Reduction

Consider the two separate linear regression models, (13) $Y_{1} = X_{1} β_{1} + ϵ_{1}, Y_{2} = X_{2} β_{2} + ϵ_{2},$ (13) where $Y_{1}$ and $Y_{2}$ are the vector of m simulated values of P and $P^{2}$ , respectively. In this section only, we let $X_{1} \in R^{(p_{1} + 1) \times m}$ and $X_{2} \in R^{(p_{2} + 1) \times m}$ be the design matrices, including an intercept. As outlined in Section 4.2, $X_{1}$ includes the control variates $X (t_{i}) - \bar{μ}$ , $i = 1, \dots, p$ , $1_{A}$ , $1_{B}$ , and $X_{2}$ includes the control variates in $X_{1}$ in addition to $(X (t_{i}) - \bar{μ})^{2}$ , $i = 1, \dots, p$ . Since each simulated sample path of the LDOUP is independent, we have $Cov (ϵ_{i}) = Σ_{ϵ, ii} I$ , i = 1, 2 and $Cov (ϵ_{1}, ϵ_{2}) = Σ_{ϵ, 12} I$ for some $2 \times 2$ covariance matrix $Σ_{ϵ} = (Σ_{ϵ, ij})_{i, j = 1}^{2}$ , where I is the identity matrix.

For each of the linear regression models, i = 1, 2, let $μ_{C, i}$ be the mean vector of the control variates computed as outlined in Section 4.2, then the predictions $x_{i} = (1, μ_{C, i})$ are ${\hat{Y}}_{i} = x_{i}^{'} {\hat{β}}_{i}$ . As explained in Appendix, the control variate estimators of $E [P]$ and $E [P^{2}]$ are the predictions ${\hat{Y}}_{1}$ and ${\hat{Y}}_{2}$ , respectively. Thus, for each argument, the control variate estimator of the value function in (Equation12(12) $\begin{aligned} V := V (c_{+}, c_{-}, d_{+}, d_{-}) & = E [P] - γVar (P) \\ = E [P] - γ E [P^{2}] + γ E [P]^{2}, \end{aligned}$ (12) ) is ${\hat{V}}_{C} := {\hat{Y}}_{1} - γ {\hat{Y}}_{2} + γ {\hat{Y}}_{1}^{2} = b^{'} \hat{Y} + {\hat{Y}}^{'} A \hat{Y},$ where $A = (\begin{matrix} γ & 0 \\ 0 & 0 \end{matrix}), b = (\begin{matrix} 1 \\ - γ \end{matrix}), \hat{Y} = (\begin{matrix} {\hat{Y}}_{1} \\ {\hat{Y}}_{2} \end{matrix}) .$ Let ${\bar{Y}}_{1}, {\bar{Y}}_{2}$ be the sample mean of $Y_{1}, Y_{2}$ respectively, then ${\hat{V}}_{M} := {\bar{Y}}_{1} - γ {\bar{Y}}_{2} + γ {\bar{Y}}_{1}^{2}$ is the corresponding Monte Carlo estimator of the value function.

Given that our approach involves parameters that are estimated by a function of two control variate estimators, rather than the usual situation of only one, we now derive the variance reduction factor in this context. For a large number of Monte Carlo simulations m, the asymptotic normality of the LSE gives $\hat{Y} \sim N (μ_{C}, Σ_{C})$ approximately for some $μ_{C} \in R^{2}$ and $Σ_{C} \in R^{2 \times 2}$ . Using results on multivariate linear regression, and the law of total variance, noting that $X_{1}$ and $X_{2}$ are random, we have $\begin{aligned} μ_{C} & = (\begin{matrix} x_{1}^{'} β_{1} \\ x_{2}^{'} β_{2} \end{matrix}), \\ Σ_{C} & = (\begin{matrix} Σ_{ϵ, 11} x_{1}^{'} E [(X_{1}^{'} X_{1})^{- 1}] x_{1} & Σ_{ϵ, 12} x_{1}^{'} E [(X_{1}^{'} X_{1})^{- 1} X_{1}^{'} X_{2} (X_{2}^{'} X_{2})^{- 1}] x_{2} \\ * & Σ_{ϵ, 22} x_{2}^{'} E [(X_{2}^{'} X_{2})^{- 1}] x_{2} \end{matrix}), \end{aligned}$ where $*$ denotes the rest of the matrix is completed by symmetry. By the plug-in principle, these parameters are estimated using $\begin{aligned} {\hat{μ}}_{C} & = (\begin{matrix} x_{1}^{'} {\hat{β}}_{1} \\ x_{2}^{'} {\hat{β}}_{2} \end{matrix}), \\ {\hat{Σ}}_{C} & = (\begin{matrix} {\hat{Σ}}_{ϵ, 11} x_{1}^{'} (X_{1}^{'} X_{1})^{- 1} x_{1} & {\hat{Σ}}_{ϵ, 12} x_{1}^{'} (X_{1}^{'} X_{1})^{- 1} X_{1}^{'} X_{2} (X_{2}^{'} X_{2})^{- 1} x_{2} \\ * & {\hat{Σ}}_{ϵ, 22} x_{2}^{'} (X_{2}^{'} X_{2})^{- 1} x_{2} \end{matrix}), \end{aligned}$ where $Σ_{ϵ, ij}$ is estimated using ${\hat{Σ}}_{ϵ, ij} = \frac{{\hat{ϵ}}_{i}^{'} {\hat{ϵ}}_{j}}{m - r - 1},$ and ${\hat{ϵ}}_{1}, {\hat{ϵ}}_{2}$ are the residuals from the linear regressions in (Equation13(13) $Y_{1} = X_{1} β_{1} + ϵ_{1}, Y_{2} = X_{2} β_{2} + ϵ_{2},$ (13) ), $r = p_{1}$ for i = j = 1, and $r = p_{2}$ otherwise.

Now since ${\hat{V}}_{C}$ is a quadratic form in $\hat{Y}$ , and $\hat{Y} \sim N (μ_{C}, Σ_{C})$ holds approximately for large m, by Rencher and Schaalje (Citation2008, Theorem 5.2c–d), the variance of the control variate estimator is (14) $\begin{aligned} Var ({\hat{V}}_{C}) & \approx b^{'} Σ_{C} b + 2 tr ((A Σ_{C})^{2}) + 4 μ^{'} A Σ_{C} A μ_{C} + 4 b^{'} Σ_{C} A μ_{C} \\ = (1 + 4 μ_{C, 1} γ + 4 (μ_{C, 1})^{2} γ^{2}) Σ_{C, 11} + 2 γ^{2} (Σ_{C, 11})^{2} \\ - 2 γ (1 + 2 μ_{C, 1} γ) Σ_{C, 12} + γ^{2} Σ_{C, 22} . \end{aligned}$ (14) Then estimated variance $\hat{Var} ({\hat{V}}_{C})$ is obtained by replacing $μ_{C}$ and $Σ_{C}$ with ${\hat{μ}}_{C}$ and ${\hat{Σ}}_{C}$ , respectively. By the asymptotic normality of ${\hat{V}}_{M}$ , (Equation14(14) $\begin{aligned} Var ({\hat{V}}_{C}) & \approx b^{'} Σ_{C} b + 2 tr ((A Σ_{C})^{2}) + 4 μ^{'} A Σ_{C} A μ_{C} + 4 b^{'} Σ_{C} A μ_{C} \\ = (1 + 4 μ_{C, 1} γ + 4 (μ_{C, 1})^{2} γ^{2}) Σ_{C, 11} + 2 γ^{2} (Σ_{C, 11})^{2} \\ - 2 γ (1 + 2 μ_{C, 1} γ) Σ_{C, 12} + γ^{2} Σ_{C, 22} . \end{aligned}$ (14) ) is also used to compute $\hat{Var} ({\hat{V}}_{M})$ by replacing ${\hat{μ}}_{C}$ and ${\hat{Σ}}_{C}$ with ${\hat{μ}}_{M} = ({\bar{Y}}_{1}, {\bar{Y}}_{2})$ and ${\hat{Σ}}_{M} = {\hat{Σ}}_{Y} / m$ , respectively, where ${\hat{Σ}}_{Y} \in R^{2 \times 2}$ with the sample covariance $\hat{Cov} (Y_{i}, Y_{j})$ as its $(i, j)$ -entry.

Thus, the estimated variance reduction factor of using control variates relative to Monte Carlo without control variates is $R = \frac{\hat{Var} ({\hat{V}}_{C})}{\hat{Var} ({\hat{V}}_{M})} .$ In the context of a single parameter estimated using control variates, while the true variance reduction factor using the true value of $β_{i}$ is no greater than 1, when using LSE ${\hat{β}}_{i}$ instead, R>1 is possible. This topic is discussed in Glasserman (Citation2003, Section 4.1.3).

5. Numerical Implementation and Analysis

In this section, we implement the Monte Carlo method outlined above for various scenarios. The spread $X \sim OU - VG (λ, b, μ, σ^{2}, η)$ is simulated with stepsize $Δ = 0.01$ and terminal time T = 50. In all cases, $μ = - η$ , and any changes to μ adjusts η accordingly to ensure that the stationary mean is $\bar{μ} = 0$ . Unless otherwise stated, the initial value is $X_{0} = 0$ , $γ = 0$ , the discount rate is r = 0.01, and it is assumed for simplicity that the entry level $d = d_{+} = d_{-}$ is symmetric and the exit level is fixed at $c = c_{+} = c_{-} = 0$ . Thus, we optimize for the entry level d. All results use m = 10, 000 Monte Carlo simulations.

5.1. Effect of Control Variates

Consider two examples using control variates.

Example 5.1

Suppose the spread X has parameters $(λ, b, μ, σ^{2}, η) = (1, 1, - 0.5, 0.015, 0.5)$ , and $γ = 0.1$ . The number of time points p used in the control variate method can be chosen to obtain the optimal variance reduction factor, however, this also depends on the entry level d. We optimize this iteratively by using no control variates to optimize d, then use this value of d to optimize p with the resulting optimal value denoted $p^{*}$ . Doing this, Figure shows that for d = 1.086, the optimal number of time points for the control variates is $p^{*} = 130$ when searching over $p = 10, 20, \dots, 200$ . This gives a moderate optimal variance reduction factor of $R^{*} = 0.735$ .

Figure 2. Estimated variance reduction factor R as a function of the number of time points for the control variates p. The parameters are $(λ, b, μ, σ^{2}, η) = (1, 1, - 0.5, 0.015, 0.5)$ , $X_{0} = 0$ , $γ = 0.1$ , r = 0.01.

Using $p^{*} = 130$ , Figure shows the estimate of the value function $V (d)$ , which has optimal entry level $d^{*} = 1.086$ . The control variate reduces $Var ({\hat{V}}_{C})$ , however, it does not change the optimal entry level $d^{*}$ in this example because the variability is already quite low, in fact, the coefficient of variation of ${\hat{V}}_{C} (d^{*})$ , the estimate of $V (d^{*})$ conditional on $d^{*}$ , is 0.0038 after control variates.

Figure 3. The estimate ${\hat{V}}_{C} (d)$ of the value function $V (d)$ and the optimal entry level $d^{*} = 1.086$ . The parameters are $(λ, b, μ, σ^{2}, η) = (1, 1, - 0.5, 0.015, 0.5)$ , $X_{0} = 0$ , $γ = 0.1$ , r = 0.01.

Figure 3. The estimate V^C(d) of the value function V(d) and the optimal entry level d∗=1.086. The parameters are (λ,b,μ,σ2,η)=(1,1,−0.5,0.015,0.5), X0=0, γ=0.1, r = 0.01.

Different choices in the parameters result in different variance reductions. In Table , results are shown varying $b, μ$ (which also affects η), and for the parameters, decreasing b>0 increases $R^{*}$ , while decreasing $μ < 0$ reduces and then increases $R^{*}$ .

Table 1. For various values of $(b, μ)$ , the optimal number of time points for the control variates $p^{}$ (searching over $p = 10, 20 \dots, 200$ ) and the optimal variance reduction factor $R^{}$ are shown.

Display Table

Consider the case shown in the second row of Table where $(b, μ) = (2, - 0.05)$ and $p^{*} = 120$ . Surprisingly, for large γ, such as $γ = 1.5$ , it is possible that the control variate method reduces the variance of the estimator of $E [P]$ with an estimated variance reduction factor 0.910, and $E [P^{2}]$ with estimated variance reduction factor 0.816, but increases the variance of the estimator of $V (d^{*}) = E [P] - γ E [P^{2}] + γ E [P]^{2}$ with variance reduction factor 1.139. In comparison, when $γ = 0.1$ , this does not occur, and those estimated variance reduction factors are 0.892, 0.793, 0.904, respectively.

Example 5.2

We use the same setup as Example 5.1, except X now has parameters $(λ, b, μ, σ^{2}, η) = (0.01, 50, 0.5, 4, - 0.5)$ . Since $λ \approx 0$ , this process has little mean reversion, and is close in law to a variance gamma process, so the value function is estimated with a relatively large amount of simulation error as shown in Figure . To address this, a locally estimated scatterplot smoothing (loess) smoother is applied to the estimated value function before maximizing it. Then the optimal entry level is $d^{*} = 0.506$ without using control variates, and $d^{*} = 0.456$ using the optimal number of control variates, which is $p^{*} = 1$ . The optimal variance reduction factor is $R^{*} = 0.817$ and the coefficient of variation of ${\hat{V}}_{C} (d^{*})$ is 0.0281, which is consistent with the much larger variability of the estimated value function compared to Example 5.1, Figure . Unlike in Figure , using control variates here has an effect on the estimate of $d^{*}$ , the size of this effect varies with the simulation. Note that $R^{*}$ is the variance reduction of estimating $V (d)$ for fixed d using control variates, and the method is not designed to reduce the standard error of $d^{*}$ , although reducing the former should be expected to reduce the latter. For these optimal values, only 63.91% of simulations resulted in a trade that is entered and exited before the terminal time due to the slow mean reversion, compared to 90.88% in Figure .

Figure 4. The estimate of the value function $V (d)$ (solid lines) and the corresponding loess smooth (dashed lines) with (red lines) and without (black lines) control variates. The optimal entry level is $d^{*} = 0.506$ without using control variates, and $d^{*} = 0.456$ using the optimal number of control variates $p^{*} = 1$ . The parameters are $(λ, b, μ, σ^{2}, η) = (0.01, 50, 0.5, 4, - 0.5)$ , $X_{0} = 0$ , $γ = 0.1$ , r = 0.01.

5.2. Asymmetry of Optimal Trading Levels

From now on, we set $γ = 0$ and do not include control variates. In the previous examples, we fixed $d = d_{+} = d_{-}$ for simplicity. The parameter μ controls the asymmetry of the stationary distribution Y of the LDOUP X. If $μ = 0$ , then the stationary distribution is symmetric and hence the optimal levels are equal, $d_{+}^{*} = d_{-}^{*}$ . However, under asymmetry, we may want to optimize both $d_{+}$ and $d_{-}$ separately.

Here, we consider the optimal level in four cases: from a large negative skew to no skew. The other parameters of X are $(λ, b, σ^{2}) = (1, 5, 0.015)$ . The results are shown in Table , where $Skew (Y)$ is the skewness of the stationary distribution, which by (Equation9(9) $\begin{aligned} Skew (X_{1} (t)) & = \frac{(\frac{1 - e^{- 3 λt}}{3}) (3 Σ_{11} μ_{1} α_{1} + 2 μ_{1}^{3} α_{1}^{2})}{{((\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + μ_{1}^{2} α_{1}))}^{3 / 2}} \end{aligned}$ (9) ), is $Skew (Y) = \frac{2^{3 / 2}}{3} (\frac{3 σ^{2} μ / b + 2 μ^{3} / b^{2}}{(σ^{2} + μ^{2} / b)^{3 / 2}}) .$ The results in Table show how $d_{+}^{*}$ and $d_{-}^{*}$ may differ due to skewness. For negatively skewed spreads, the trade is more likely to enter from the negative level, and so the positive level is slightly lower to compensate. Figures and show the sample paths of the spread when $μ = - 0.5$ (large negative skew) and $μ = 0$ (no skew), respectively.

Figure 5. A sample path of the spread $X \sim OU - VG (1, 5, - 0.5, 0.015, 0.5)$ using the optimal entry levels in the first row of Table .

Figure 5. A sample path of the spread X∼OU-VG(1,5,−0.5,0.015,0.5) using the optimal entry levels in the first row of Table 2.

Table 2. For various value of μ, the optimal entry levels $d_{+}^{}$ and $d_{-}^{}$ are shown.

Display Table

In the classical OU spread model, which has no skew, Leung and Li (Citation2015) derive the optimal entry and exit levels with transaction costs, where the two levels can also be asymmetric.

5.3. Exit Level Different From the Mean

Recall that we set $c = c_{+} = c_{-}$ for simplicity. So far, we have set c = 0. However, it is possible to find examples where $c^{*}$ is much greater than 0. Such a situation can occur when $X_{0} > \bar{μ}$ and the discount rate r is very large. Let $X_{0} = 0.25$ and r = 1 while the other parameters are as before with $(λ, b, μ, σ^{2}, η) = (1, 5, 0, 0.015, 0)$ . We optimize c, d for 0<c<d, and find that the optimal exit level is $c^{*} = 0.105$ , while the optimal entry level $d^{*}$ is any value $c^{*} < d^{*} < 0.25$ as shown in Figure , which plots the estimate of the value function $V (c, d)$ .

Figure 6. The estimate of the value function $V (c, d)$ with entry level d and exit level c. The parameters are $(λ, b, μ, σ^{2}, η) = (1, 5, 0, 0.015, 0)$ , $X_{0} = 0.25$ , $γ = 0$ , r = 1.

Thus, the pairs trading strategy is to enter the trade immediately, and because the large discount rate greatly reduces the profit if the trade is not exited quickly, the trade is exited at $c^{*} = 0.105$ above the stationary mean $\bar{μ}$ rather than waiting for it to fall to $\bar{μ}$ .

Figure shows how the optimal exit level $c^{*}$ changes for different values of the discount rate r. For the plotted values, the point at which $c^{*} > 0.000$ (to 3 decimal places) is r = 0.3. For all these plotted values of r, if the initial value is instead set to $X_{0} = \bar{μ} = 0$ , then the optimal exit level would be $c^{*} = 0.000$ . This demonstrates how having an initial value different from the stationary mean and a large discount rate can make the optimal exit level $c^{*}$ different from 0. It is difficult to find other situations where this occurs, which suggests optimizing c may have little benefit beyond this, and fixing c = 0, which corresponds to exiting the trade when the spread passes $\bar{μ}$ , simplifies the optimization.

Figure 7. The optimal exit level $c^{*}$ for various value of discount rate $r = 0.01, 0.1, 0.2, \dots, 1.5$ . The other parameters are $(λ, b, μ, σ^{2}, η) = (1, 5, 0, 0.015, 0)$ , $X_{0} = 0.25$ , $γ = 0$ .

Figure 7. The optimal exit level c∗ for various value of discount rate r=0.01,0.1,0.2,…,1.5. The other parameters are (λ,b,μ,σ2,η)=(1,5,0,0.015,0), X0=0.25, γ=0.

5.4. Effect of Jumps

Now we consider the effect of the parameter b, which affects the jumps of the LDOUP X, on the optimal entry level $d^{*}$ . As before, let $(λ, μ, σ^{2}, η) = (1, 0, 0.015, 0)$ . As $b \to \infty$ , the OU-VG process for the spread converges in law to a classical OU process driven by Brownian motion, and the sample paths are closer to continuous. As $b \to 0$ , roughly speaking, the sample paths become more jumpy (specifically, for each t, the BDLP $Z (t)$ converges in distribution to a constant while its kurtosis diverges). Define the overshoot for the entry level d as the random variable $O = {\begin{cases} | X (τ_{d}) - (\bar{μ} + d) | & if τ_{d} < τ_{- d}, \\ | X (τ_{- d}) - (\bar{μ} - d) | & if τ_{d} \geq τ_{- d} . \end{cases}$ Table shows the effect of the sample paths of the spread, roughly speaking, being very jumpy (b = 1), moderately jumpy (b = 5), and approximately continuous $(b = 100)$ . A plot of a sample path in each of these cases is given in Figures (a), and (b), respectively. The results show that the more jumpy the sample path, the higher the level at which the trade is entered to compensate for overshooting. When b = 1, the mean overshoot is relatively large (0.0640) compared to the optimal entry level $d^{*}$ (0.246). This demonstrates the importance of accounting for jumps by using LDOUPs compared to the classical OU process. Both Figure and Table show that as the sample path becomes closer to continuous, the mean of the overshoots becomes smaller and the optimal expected profit ${\hat{V}}_{M} (d^{*})$ becomes close to $d^{*}$ as expected, although they do not converge since r is nonzero. In this example, ${\hat{V}}_{M} (d^{*})$ increases as b decreases. Note that the stationary variance remains constant, so the increased profitability is not due to changes in the variance.

Figure 8. A sample path of the spread $X \sim OU - VG (1, b, 0, 0.015, 0)$ , where (a) b = 1 and (b) b = 100, using the optimal entry levels in the first and third row of Table .

Figure 8. A sample path of the spread X∼OU-VG(1,b,0,0.015,0), where (a) b = 1 and (b) b = 100, using the optimal entry levels in the first and third row of Table 3.

Table 3. For various values of b, the optimal entry level d, the estimate ${\hat{V}}_{M} (d^{*})$ of the optimal expected profit, and overshoot statistics are shown.

Display Table

6. Bivariate Pairs Trading

We now consider pairs trading in a bivariate setting, where the spread $X = (X_{1}, X_{2})$ is a bivariate LDOUP with stationary mean $\bar{μ} = ({\bar{μ}}_{1}, {\bar{μ}}_{2})$ . While we focus on the 2-dimensional case, the concepts here can in principle be extended to the n-dimensional case.

6.1. Paris Trading Problem

While there are many ways to set up a pairs trading strategy on multiple spreads, we use the following setup to generalize the notion of one trade cycle used in the univariate setting. A trade is entered for the first time that either spread $X_{k}$ is above ${\bar{μ}}_{k} + d_{k}$ or below ${\bar{μ}}_{k} - d_{k}$ for k = 1, 2. If a trade on spread $X_{k}$ is entered, then exit occurs when the spread passes ${\bar{μ}}_{k} + c_{k}$ or ${\bar{μ}}_{k} - c_{k}$ , respectively. When a trade is entered, no further trades can be initiated, even if the other spread later passes the entry level.

More generally, it is possible to have asymmetric entry and exit levels for both spreads, where $d_{+, 1}$ , $d_{-, 1}$ are the levels for entering a trade on spread $X_{1}$ from the positive and negative sides, respectively, and similarly for the other entry and exit levels. For each spread k = 1, 2, there are 8 relevant passage times defined by $\begin{aligned} τ_{d_{+, k}} & = inf {t \geq 0 : X_{k} (t) > {\bar{μ}}_{k} + d_{+, k}} \land T, \\ τ_{c_{+, k}} & = inf {t \geq τ_{d_{+, k}} : X_{k} (t) < {\bar{μ}}_{k} + c_{+, k}} \land T, \\ τ_{d_{-, k}} & = inf {t \geq 0 : X_{k} (t) < {\bar{μ}}_{k} - d_{-, k}} \land T, \\ τ_{c_{-, k}} & = inf {t \geq τ_{d_{-, k}} : X_{k} (t) > {\bar{μ}}_{k} - c_{-, k}} \land T, \end{aligned}$ with $0 \leq c_{\pm, k} < d_{\pm, k}$ . However, we will not need to use this much generality in the examples below.

The nonzero probability of common jumps for the WVAG process used as the BDLP, and hence for $X$ , means that the event ${τ_{d_{+, 1}} = τ_{d_{+, 2}}}$ also has nonzero probability, that is, both spreads can pass their corresponding entry levels simultaneously. In this case, we invest with 50% weight in each of the spreads. An alternative approach would be to invest fully in the spread furthest from the mean-reverting level. However, in the numerical examples we consider below, both approaches produce virtually the same results, since it is rare that both spreads pass the optimal entry levels simultaneously. Therefore, we will only consider the former approach.

Let $τ = min {τ_{d_{+, 1}}, τ_{d_{-, 1}}, τ_{d_{+, 2}}, τ_{d_{-, 2}}}$ , $c = (c_{+, 1}, c_{-, 1}, c_{+, 2}, c_{-, 2})$ and $d = (d_{+, 1}, d_{-, 1}, d_{+, 2}, d_{-, 2})$ . The profit over one trade cycle is given by $\begin{aligned} P := P (c, d) & = 1_{{τ \in {d_{+, 1}, d_{+, 2}}}} \sum_{k = 1, 2} e^{- r τ_{c_{+, k}}} W_{k} (X_{k} (τ_{d_{+, k}}) - X_{k} (τ_{c_{+, k}})) \\ + 1_{{τ \in {d_{-, 1}, d_{-, 2}}}} \sum_{k = 1, 2} e^{- r τ_{c_{-, k}}} W_{k} (X_{k} (τ_{c_{-, k}}) - X_{k} (τ_{d_{-, k}})), \end{aligned}$ where the weights are $(W_{1}, W_{2}) = {\begin{cases} (1, 0) & if τ_{d_{+, 1}} < τ_{d_{+, 2}} or τ_{d_{-, 1}} < τ_{d_{-, 2}}, \\ (0, 1) & if τ_{d_{+, 1}} > τ_{d_{+, 2}} or τ_{d_{-, 1}} > τ_{d_{-, 2}}, \\ (0.5, 0.5) & otherwise, \end{cases}$ and r is the discount rate.

The objective function is the value function $V (c, d) = E [P] - γVar (P), γ \geq 0.$ We want to find the optimal level $c, d$ for the pairs trading strategy to maximize the value function, that is $max_{c, d} V (c, d) .$

6.2. Simulating the Spread

Now we specifically let the spread be $X \sim OU - WVA G^{2} (a, α, μ, Σ, η)$ . To simulate $X (t_{0}), \dots, X (t_{q})$ , we use the Euler scheme approximation in Lu (Citation2022, Sections 4.2 and 5.1), which is based on taking a stochastic integral representation of $Z^{*} (Δ)$ , which can be approximately simulated whenever $Z (t)$ can be simulated, in this case using (Equation5(5) $Z \overset{D}{=} ηI + V_{0} + (V_{1}, \dots, V_{n}),$ (5) ). This is in contrast to the univariate OU-VG process which was simulated exactly in Section 4 with a method that has no multivariate generalization. We now determine the optimal trading levels using Monte Carlo methods.

6.3. Numerical Implementation and Analysis

We consider two examples of bivariate pairs trading on the spread $X \sim OU - WVA G^{2} (λ, a, α, μ, Σ, η)$ simulated using the approximate method above with stepsizes $Δ = 0.01$ and $\tilde{Δ} = 0.001$ , and terminal time T = 50. In all cases, the initial value is $X_{0} = 0$ , $γ = 0$ , we do not consider control variates, and $μ = - η$ so that $\bar{μ} = 0$ . Unless stated otherwise, we assume for simplicity that the entry levels are symmetric with $d_{k} = d_{+, k} = d_{-, k}$ , k = 1, 2, and the exit levels are $c = 0$ , so we optimize the value function over $d_{1}, d_{2}$ . We use m = 10,000 Monte Carlo simulations.

In our numerical examples, we explore the effect of correlation on the optimal entry level and the optimal expected profit. In the first example, the correlation has little effect, whereas in the second example, it has a larger effect. The correlation of the components of $X$ is in part controlled by the correlation parameter $ρ = Σ_{12} / \sqrt{Σ_{11} Σ_{22}}$ . However, note that ρ is the correlation of the Brownian motion subordinate in (Equation6(6) $Z \overset{D}{=} B ⊙ T,$ (6) ), and is related to the correlation of the components of $X$ by (15) $\begin{aligned} Corr (X_{1} (t), X_{2} (t)) & = Corr (Z_{1} (t), Z_{2} (t)) \\ = \frac{a ((α_{1} \land α_{2}) Σ_{12} + α_{1} α_{2} μ_{1} μ_{2})}{\sqrt{Σ_{11} + α_{1} μ_{1}^{2}} \sqrt{Σ_{22} + α_{2} μ_{2}^{2}}} \end{aligned}$ (15) for all t>0, due to (Equation10(10) $\begin{aligned} Cov (X_{1} (t), X_{2} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) a ((α_{1} \land α_{2}) Σ_{12} + α_{1} α_{2} μ_{1} μ_{2}) . \end{aligned}$ (10) ) and (Equation8(8) $\begin{aligned} Var (X_{1} (t)) & = (\frac{1 - e^{- 2 λt}}{2}) (Σ_{11} + α_{1} μ_{1}^{2}), \end{aligned}$ (8) ).

Example 6.1

Let (16) $\begin{aligned} λ & = 1, a = 2.5, α = (\begin{matrix} 0.2 \\ 0.3 \end{matrix}), μ = (\begin{matrix} 0 \\ - 0.2 \end{matrix}), Σ = (\begin{matrix} 0.015 & ρ \sqrt{Σ_{11} Σ_{22}} \\ * & 0.02 \end{matrix}), \end{aligned}$ (16) and r = 0.01.

Table shows the optimal entry levels $d_{1}^{*}$ and $d_{2}^{*}$ , the optimal expected profit ${\hat{V}}_{M} (d_{1}^{*}, d_{2}^{*})$ , and the estimated probability of which spread is traded, and each of these quantities is approximately equal across the values of ρ considered.

Table 4. For various values of ρ, the optimal entry levels $d_{1}^{}, d_{2}^{}$ , the estimate ${\hat{V}}_{M} (d_{1}^{}, d_{2}^{})$ of the optimal expected profit, and the estimated probability of which spread $X_{1}, X_{2}$ is traded are shown.

Display Table

When pairs trading on the univariate spreads $X_{1}$ and $X_{2}$ , the optimal expected profit is 0.227 (see the second row of Table ) and 0.331 respectively, while it is 0.334–0.336 in this bivariate pairs trading example. Bivariate pairs trading is more profitable than univariate pairs trading on either spread, although it is approximately as profitable as trading only $X_{2}$ since this accounts for 79–84% of trades. The optimal expected profit of the bivariate pairs trade must be at least that of the best univariate pairs trade, as the latter strategy can be implemented within bivariate pairs trading by setting the other spread to have an arbitrarily large optimal entry level. Indeed, this is happening to some extent here as the optimal entry level of the spread $X_{1}$ has increased from 0.220 in the univariate case to 0.305–0.318 in the bivariate case. Thus, this decreases the probability of only $X_{1}$ entering a trade from 96.79% to 11–16% in Table (the latter accounts for the probability that a trade on $X_{1}$ fails to be entered because a trade on $X_{2}$ is entered first, while the former is univariate pairs trading and does not include this). So in this example, the method has essentially identified that $X_{2}$ is the more profitable spread, regardless of correlation, and has accordingly increased the optimal entry level of $X_{1}$ to reduce the instances where it is traded.

Example 6.2

Now we consider an example where different values of the correlation parameter ρ can have a larger effect on the optimal entry level and optimal expected profit. Suppose the parameters are now (17) $λ = 1, a = 6.65, α = (\begin{matrix} 0.15 \\ 0.15 \end{matrix}), μ = 0, Σ = (\begin{matrix} 0.015 & ρ \sqrt{Σ_{11} Σ_{22}} \\ * & 0.015 \end{matrix}),$ (17) and r = 1.

Recall the parameter boundary $a < \frac{1}{α_{1}} \land \frac{1}{α_{2}}$ . In Example 6.1, the upper bound of $Corr (X_{1} (t), X_{2} (t))$ as a function of $ρ \in (- 1, 1)$ is 0.395 by (Equation15(15) $\begin{aligned} Corr (X_{1} (t), X_{2} (t)) & = Corr (Z_{1} (t), Z_{2} (t)) \\ = \frac{a ((α_{1} \land α_{2}) Σ_{12} + α_{1} α_{2} μ_{1} μ_{2})}{\sqrt{Σ_{11} + α_{1} μ_{1}^{2}} \sqrt{Σ_{22} + α_{2} μ_{2}^{2}}} \end{aligned}$ (15) ). The upper bound converges to 1 as $a \to \frac{1}{α_{k}}$ while keeping $α_{k}, μ_{k}, Σ_{kk}, η_{k}$ constant for all k. Since $a \approx \frac{1}{α_{1}} = \frac{1}{α_{2}}$ here, this example represents monitoring two spreads $X_{1} \overset{D}{=} X_{2}$ that are equal in law for each fixed ρ, with $X_{1}$ and $X_{2}$ converging pathwisely as $ρ \to 1$ , since in the limit, $Corr (X_{1} (t), X_{2} (t)) = 1$ for all t with $X_{0} = 0$ . Also, appealing to the symmetry of $X (t)$ for each fixed t, all components of the optimal entry levels $d^{*}$ are equal to $d^{*}$ , say. Thus, we optimize over d, the symmetric entry level for all components.

Table shows the results for this example over different values of ρ. Here, the optimal expected profit ${\hat{V}}_{M} (d^{*})$ increases from 0.032 to 0.040 as ρ decreases from 0.99 to 0, this represents a 26% increase in the optimal expected profit. The case where $ρ = 0.99$ is approximately the case of univariate pairs trading, which occurs in the limit as $ρ \to 1$ . Since ${\hat{V}}_{M} (d^{*})$ is larger when ρ is much less than 1, the results demonstrate a situation where bivariate pairs trading is much more profitable than what is effectively pairs trading on a single spread, and where correlation has a larger effect.

Table 5. For various value of ρ, the optimal entry level $d^{}$ and the estimate ${\hat{V}}_{M} (d^{})$ of the optimal expected profit.

Display Table

These results are consistent with intuition. Monitoring multiple spreads of the same law presents additional opportunities to enter into the trade sooner, which increases the optimal expected profit relative to monitoring only one spread, while having a larger discount rate r substantially rewards entering the trade sooner. Thus, in this example, using bivariate pairs trading by choosing pairs that are uncorrelated (as represented by the $ρ = 0$ case) is much more profitable than either univariate pairs trading or choosing highly correlated spreads (both represented by the $ρ \to 1$ case).

7. Conclusions

We have presented a Monte Carlo framework for evaluating pairs trading strategies where the mean-reverting spread follows an LDOUP. We numerically demonstrated how different parameters affect the optimal trading level and value function and capture various aspects of the trading strategy. The framework is flexible to accommodate a wide class of LDOUPs, provided that they can be fitted and simulated, as well as different ways of trading the spread. For instance, methods for fitting and simulating univariate LDOUPs with tempered stable and normal inverse Gaussian stationary distributions are provided in Valdivieso, Schoutens, and Tuerlinckx (Citation2009). Our proposed control variates for variance reduction, can also be applied to similar problems.

For future research, one useful direction is to incorporate transaction costs into the trading problems. To that end, Do and Faff (Citation2012) examine the profitability of pairs trading accounting for transaction costs, and Leung and Li (Citation2015) analyse an optimal stopping approach to pairs trading with transaction costs and stop loss, and this work could be extended to spreads following LDOUPs. Another possibility is to consider large jumps in stock prices that trigger immediate liquidation due to the need to return the stocks. This can also be incorporated into the stochastic model and value function. Additional extensions of our framework include portfolio optimization problems arising from trading multiple spreads and spread selection based on statistical and other characteristics. Theoretical results on pairs trading under LDOUP models are very limited, and additional work in this direction would be useful. However with LDOUPs, these problems are complicated and unlikely to admit analytic solutions, so simulation-based approaches can be both mathematically interesting and practically useful.

Acknowledgements

Kevin Lu thanks the Department of Applied Mathematics at the University of Washington, where the substantial majority of his work on this article was completed. We thank an anonymous referee for the useful comments which have helped improve the paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

Avellaneda, M., and J.-H. Lee. 2010. “Statistical Arbitrage in the US Equities Market.” Quantitative Finance 10 (7): 761–782. https://doi.org/10.1080/14697680903124632.
Web of Science ®Google Scholar
Avellaneda, M., and M. Lipkin. 2009. “A Dynamic Model for Hard-to-Borrow Stocks.” Risk 22 (6): 92–97.
Google Scholar
Barndorff-Nielsen, O. E., and N. Shephard. 2001. “Non-Gaussian Ornstein-Uhlenbeck-Based Models and Some of Their Uses in Financial Economics.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63 (2): 167–241. https://doi.org/10.1111/1467-9868.00282.
Web of Science ®Google Scholar
Benth, F. E., and M. D. Schmeck. 2014. “Pricing Futures and Options in Electricity Markets.” In The Interrelationship Between Financial and Energy Markets, edited by S. Ramos and H. Veiga, 233–260. Berlin-Verlag: Springer.
Google Scholar
Bertoin, J. 1996. Lévy Processes. Cambridge: Cambridge University Press.
Google Scholar
Borovkov, K., and A. Novikov. 2008. “On Exit Times of Lévy-Driven Ornstein-Uhlenbeck Processes.” Statistics and Probability Letters 78 (12): 1517–1525. https://doi.org/10.1016/j.spl.2008.01.017.
Web of Science ®Google Scholar
Brennan, M. J., and E. S. Schwartz. 1990. “Arbitrage in Stock Index Futures.” Journal of Business 63 (1): S7–S31. https://doi.org/10.1086/jb.1990.63.issue-S1.
Google Scholar
Buchmann, B., K. W. Lu, and D. B. Madan. 2019. “Weak Subordination of Multivariate Lévy Processes and Variance Generalised Gamma Convolutions.” Bernoulli 25 (1): 742–770. https://doi.org/10.3150/17-BEJ1004.
Web of Science ®Google Scholar
Cariboni, J., and W. Schoutens. 2009. “Jumps in Intensity Models: Investigating the Performance of Ornstein-Uhlenbeck Processes in Credit Risk Modeling.” Metrika 69:73–198. https://doi.org/10.1007/s00184-008-0213-4.
Web of Science ®Google Scholar
Cont, R., and P. Tankov. 2004. Financial Modelling with Jump Processes. Boca Raton: Chapman & Hall/CRC.
Google Scholar
Cummins, M., G. Kiely, and B. Murphy. 2018. “Gas Storage Valuation Under Multifactor Lévy Processes.” Journal of Banking and Finance 95:167–184. https://doi.org/10.1016/j.jbankfin.2018.02.012.
Web of Science ®Google Scholar
Dai, M., Y. Zhong, and Y. K. Kwok. 2011. “Optimal Arbitrage Strategies on Stock Index Futures Under Position Limits.” Journal of Futures Markets 31 (4): 394–406. https://doi.org/10.1002/fut.v31.4.
Web of Science ®Google Scholar
Do, B., and R. Faff. 2012. “Are Pairs Trading Profits Robust to Trading Costs?” Journal of Financial Research 35 (2): 261–287. https://doi.org/10.1111/jfir.2012.35.issue-2.
Google Scholar
Elliott, R., Van Der Hoek J., and W. Malcolm. 2005. “Pairs Trading.” Quantitative Finance 5 (3): 271–276. https://doi.org/10.1080/14697680500149370.
Web of Science ®Google Scholar
Endres, S., and Stübinger J.. 2019. “Optimal Trading Strategies for Lévy-Driven Ornstein-Uhlenbeck Processes.” Applied Economics 51 (29): 3153–3169. https://doi.org/10.1080/00036846.2019.1566688.
Web of Science ®Google Scholar
Gardini, M., P. Sabino, and E. Sasso. 2022. “The Variance Gamma++ Process and Applications to Energy Markets.” Applied Stochastic Models in Business and Industry 38 (2): 391–418. https://doi.org/10.1002/asmb.v38.2.
Web of Science ®Google Scholar
Gatev, E., W. N. Goetzmann, and K. G. Rouwenhorst. 2006. “Pairs Trading: Performance of a Relative-Value Arbitrage Rule.” Review of Financial Studies 19 (3): 797–827. https://doi.org/10.1093/rfs/hhj020.
Web of Science ®Google Scholar
Glasserman, P. 2003. Monte Carlo Methods in Financial Engineering. New York: Springer.
Google Scholar
Huck, N., and K. Afawubo. 2015. “Pairs Trading and Selection Methods: Is Cointegration Superior?” Applied Economics 47 (6): 599–613. https://doi.org/10.1080/00036846.2014.975417.
Web of Science ®Google Scholar
Kanamura, T., S. Rachev, and F. Fabozzi. 2010. “A Profit Model for Spread Trading with an Application to Energy Futures.” The Journal of Trading 5 (1): 48–62. https://doi.org/10.3905/JOT.2010.5.1.048.
Google Scholar
Lee, K., T. Leung, and B. Ning. 2023. “A Diversification Framework for Multiple Pairs Trading Strategies.” Risks 11 (5). https://doi.org/10.3390/risks11050093.
Web of Science ®Google Scholar
Leung, T., and X. Li. 2015. “Optimal Mean Reversion Trading with Transaction Costs and Stop-Loss Exit.” International Journal of Theoretical and Applied Finance 18 (03): 1550020. https://doi.org/10.1142/S021902491550020X.
Google Scholar
Leung, T., and X. Li 2016. Optimal Mean Reversion Trading: Mathematical Analysis and Practical Applications. Modern Trends in Financial Engineering, World Scientific Publishing Company.
Google Scholar
Leung, T., and H. Nguyen. 2019. “Constructing Cointegrated Cryptocurrency Portfolios for Statistical Arbitrage.” Studies in Economics and Finance 36 (3): 581–599. https://doi.org/10.1108/SEF-08-2018-0264.
Google Scholar
Lipton, A., and López de Prado M.. 2020. “A Closed-Form Solution for Optimal Ornstein-Uhlenbeck Driven Trading Strategies.” International Journal of Theoretical and Applied Finance 23 (8): 2050056. https://doi.org/10.1142/S0219024920500569.
Web of Science ®Google Scholar
Lu, K. W. 2022. “Calibration for Multivariate Lévy-Driven Ornstein-Uhlenbeck Processes with Applications to Weak Subordination.” Statistical Inference for Stochastic Processes 25 (2): 365–396. https://doi.org/10.1007/s11203-021-09254-4.
Google Scholar
Madan, D. B., P. P. Carr, and E. C. Chang. 1998. “The Variance Gamma Process and Option Pricing.” European Finance Review 2 (1): 79–105. https://doi.org/10.1023/A:1009703431535.
Google Scholar
Madan, D. B., and E. Seneta. 1990. “The Variance Gamma (v.g.) Model for Share Market Returns.” The Journal of Business 63 (4): 511–524. https://doi.org/10.1086/jb.1990.63.issue-4.
Web of Science ®Google Scholar
Masuda, H. 2004. “On Multidimensional Ornstein-Uhlenbeck Processes Driven by a General Lévy Process.” Bernoulli 310 (1): 97–120.
Google Scholar
Michaelsen, M., and A. Szimayer. 2018. “Marginal Consistent Dependence Modeling Using Weak Subordination for Brownian Motions.” Quantitative Finance 18 (11): 1909–1925. https://doi.org/10.1080/14697688.2018.1439182.
Web of Science ®Google Scholar
Nelson, B. L. 1990. “Control Variate Remedies.” Operations Research 38 (6): 974–992. https://doi.org/10.1287/opre.38.6.974.
Web of Science ®Google Scholar
Qu, Y., A. Dassios, and H. Zhao. 2021. “Exact Simulation of Gamma-Driven Ornstein–Uhlenbeck Processes with Finite and Infinite Activity Jumps.” Journal of the Operational Research Society 72 (2): 471–484. https://doi.org/10.1080/01605682.2019.1657368.
Web of Science ®Google Scholar
Rencher, A. C., and G. B. Schaalje. 2008. Linear Models in Statistics. Hoboken: John Wiley & Sons, Inc.
Google Scholar
Sabino, P. 2020. “Exact Simulation of Variance Gamma-Related OU Processes: Application to the Pricing of Energy Derivatives.” Applied Mathematical Finance 27 (3): 207–227. https://doi.org/10.1080/1350486X.2020.1813040.
Google Scholar
Sato, K. 1999. Lévy Processes and Infinitely Divisible Distributions. Cambridge: Cambridge University Press.
Google Scholar
Sato, K., and M. Yamazato. 1983. “Stationary Processes of Ornstein-Uhlenbeck Type.” In Probability Theory and Mathematical Statistics, edited by K. Itô and J. V. Prohorov, 541–551. Berlin: Springer.
Google Scholar
Sato, K., and M. Yamazato. 1985. “Completely Operator-Selfdecomposable Distributions and Operator-Stable Distributions.” Nagoya Mathematical Journal 97:71–94. https://doi.org/10.1017/S0027763000021267.
Web of Science ®Google Scholar
Schoutens, W. 2003. Lévy Processes in Finance: Pricing Financial Derivatives. East Sussex: John Wiley & Sons Ltd.
Google Scholar
Valdivieso, L., W. Schoutens, and F. Tuerlinckx. 2009. “Maximum Likelihood Estimation in Processes of Ornstein-Uhlenbeck Type.” Statistical Inference for Stochastic Processes 12 (1): 1–19. https://doi.org/10.1007/s11203-008-9021-8.
Google Scholar
Wu, L., X. Zang, and H. Zhao. 2020. “Analytic Value Function for a Pairs Trading Strategy with a Lévy-Driven Ornstein-Uhlenbeck Process.” Quantitative Finance 20 (8): 285–1306. https://doi.org/10.1080/14697688.2020.1736613.
Web of Science ®Google Scholar

Appendix. Equivalence of control variates and linear regression

Suppose we have multiple control variate

C = (C_{1}, \dots, C_{p})

to estimate

E [P]

, where

(C, P)

is jointly simulated as we want the control variates to be correlated with P. As explained in Glasserman (Citation2003, Section 4.1.2), the control variate estimator is

{\hat{P}}_{C} = \bar{Y} - {\hat{β}}_{C}^{'} (\bar{X} - μ_{C}),

where

Y

is the vector of m simulated values of P,

\bar{Y}

is its sample mean,

X_{1}, \dots, X_{p}

are the vectors of m simulated values of

C_{1}, \dots, C_{p}

, respectively,

\bar{X}

is the vector of the sample means of

X_{1}, \dots, X_{p}

μ_{C} = E [C]

{\hat{β}}_{C} = S_{XX}^{- 1} S_{XY}

S_{XX} \in R^{p \times p}

with the sample covariance

\hat{Cov} (X_{i}, X_{j})

as its

(i, j)

-entry, and

S_{XY} \in R^{p}

with the sample covariance

\hat{Cov} (X_{i}, Y)

as its ith entry. This generalizes the method of Section 4.2.

Consider the linear regression model $Y = X β + ϵ$ , where $X \in R^{(p + 1) \times m}$ is the design matrix, which includes an intercept and the control variates $C_{1}, \dots, C_{p}$ , and $β \in R^{p + 1}$ with LSE $\hat{β}$ . The prediction of this linear regression at the point $x = (1, μ_{C})$ is $\hat{Y} = {\hat{β}}^{'} x = \bar{Y} + {\hat{β}}_{C}^{'} (μ_{C} - \bar{X}) = {\hat{P}}_{C},$ where the second equality follows from the centred form of linear regression by Rencher and Schaalje (Citation2008, pg 156, Equations (7.36), (7.46)). This reference also provides that ${\hat{β}}_{1} = {\hat{β}}_{C}$ , where $\hat{β} = ({\hat{β}}_{0}, {\hat{β}}_{1})$ .

Thus, the control variate estimator ${\hat{P}}_{C}$ is the prediction of this linear regression. This fact is well-known and mentioned in Glasserman (Citation2003, Section 4.1.1) for a single control variate, and it is unsurprising that it is also true for multiple control variates. Nelson (Citation1990) gives a different but equivalent statement where the regressors are $C - μ_{C}$ .

Monte Carlo Simulation for Trading Under a Lévy-Driven Mean-Reverting Framework

Abstract

1. Introduction

2. Lévy-Driven Ornstein-Uhlenbeck Process

2.1. Univariate LDOUP Driven by a Variance Gamma Process