Abstract
We propose a deep learning framework for impulse control problems involving multivariate stochastic processes, which can be controllable or uncontrollable. We use this framework to estimate central bank interventions on the (controllable) interest rate to stabilize the (uncontrollable) inflation rate, where the two rates are correlated and cointegrated. This method is useful for small banks or insurance companies with high exposure to Treasury securities to predict and stress-test their potential losses from central bank interventions. We also study the mathematical properties of the proposed framework.
1. Introduction
Monetary policy is one of the most important ways in which the government can affect the speed and direction of economic growth through central banking (Friedman 2000). For example, the United States Congress directs the Federal Reserve (Fed) to pursue several economic goals, including maximum employment, stable prices, and moderate long-term interest rates. The Fed uses monetary policy, consisting of actions and communications, to achieve these goals. Central banks have multiple ways of controlling economic activity, and recent research shows that one of the direct instruments available to the central bank is the choice between a short-term interest rate and a reserve quantity (Friedman 2000). In this paper, we focus on interest rate interventions, as they are closely linked to the 2023 small bank failures in the United States, such as Silicon Valley Bank (SVB). In an interview with Douglas Diamond, winner of the 2022 Nobel Prize for his research on bank runs, the reporter writes,
Douglas Diamond argues that the Fed's choice to signal long-term low interest rates, and then suddenly reverse course by raising interest rates in response to inflation, is a major reason for the collapse at Silicon Valley Bank….
Diamond pointed out that even in the Fed's 2022 stress tests, banks were not tested at treasury yield rates above 2%. Although SVB was not subject to a stress test, it likely would have passed under those parameters.
However, due to rapid inflation beginning in the middle of 2021, the Fed began raising interest rates quickly in 2022. Today, the effective Fed funds rate is 4.57%. This sudden reversal, which according to Diamond was not well-telegraphed by the Fed, is the reason that the market value of SVB's securities began to plummet.
This motivates us to use contemporary technology to predict interest rate interventions to control inflation and set a realistic stress test target for financial institutions.
There is a rich body of literature on interest rate dynamics in an open market, such as instantaneous interest rate models (Vasicek 1977, Hull and White 1990, Black and Karasinski 1991, Cox et al. 2005) and corresponding empirical studies (Chan et al. 1992, Chapman and Pearson 2000). Guttentag (1969) highlights the importance of central bank intervention in addition to the effect of market forces on interest rates. In 2022, the Fed launched seven large-scale interventions on the federal funds rate, causing substantial changes in the interest rate market after each intervention. Some studies explore central bank interventions on interest rates, such as interest rate targeting behavior (Rudebusch 1995), the Fed's target rate and interest rate dynamics (Balduzzi et al. 1997), and its effect on the yield curve (Piazzesi 2005). We specifically discuss studies of optimal interest rate interventions (Cadenillas and Zapatero 1999, 2000, Feng and Muthuraman 2010, Mitchell et al. 2014). All of these studies formulate optimal interventions as impulse control problems, and the optimal interventions derived are based on tractable models and interest rate targets. Like these studies, we consider optimal impulse control. However, unlike them, we develop a deep learning approach to estimate optimal interest rate interventions that control inflation under a general class of models. In this way, our proposed framework can accommodate a wider range of stochastic models and a more flexible economic target, such as an inflation rate target.
In this paper, we consider the situation in which the central bank is able to intervene on several interest rates, such as short-term and long-term rates, to affect a targeted economic variable (e.g. the inflation rate) that is correlated and cointegrated with the interest rates (Booth and Ciner 2001). Thus, the cost function of the central bank is related to the targeted value of the economic variable and the cost of intervention. The central bank then aims to find a control policy that minimizes the cost of deviation from the target level and the cost of intervention. In reality, the central bank determines the best time to announce a change in fund rates. Decisions about the timing and extent of intervention constitute the impulse control policy (Constantinides and Richard 1978, Sulem 1986). Under a Brownian filtration, there are computational methods for impulse control problems (Harrison et al. 1983, Feng and Muthuraman 2010). Some studies examine impulse control of exchange and interest rates with Ornstein-Uhlenbeck (OU) processes (Cadenillas and Zapatero 1999, 2000, Mitchell et al. 2014). However, high-dimensional impulse control problems pose tremendous challenges to analytical tractability. Conventional approaches struggle to formulate quasi-variational inequalities for high-dimensional impulse control problems, and the inherently uncontrollable nature of the inflation rate further exacerbates the complexity. As a consequence, prior studies often (if not always) assume a pre-specified target interest rate and derive analytical solutions with one-dimensional impulse control techniques. In reality, the central bank intervenes on the interest rate knowing only the target inflation rate, not a target interest rate.
While previous impulse control research concentrates on infinite-horizon problems for mathematical tractability, the central bank may want to complete its task within a finite horizon. Machine learning techniques, especially deep learning, offer computational feasibility for high-dimensional quantitative finance problems (Tsang and Wong 2020, Lo and Singh 2023, Mikkilä and Kanniainen 2023, Na and Wan 2023, Yin and Wong 2023). Our deep learning framework applies to a more general model setup, accommodates higher dimensional problems, and handles finite-horizon problems. Some of our numerical comparisons are made against the literature to demonstrate the accuracy of our computation.
Our major contribution to the literature is the development of a deep learning framework for impulse control problems, which builds on the deep optimal stopping framework of Becker et al. (2019) and Jia et al. (2023) with additional layers for learning intervention actions. When we apply our framework to real data, our approach predicts Fed funds rate hikes consistent with the Fed's real interventions in terms of aggregate increase, although our predictions are more volatile than the real interventions. We suggest that this discrepancy is associated with the Fed's intention to avoid volatile adjustments; the aggregate increase in fund rates is similar when the inflation target is reached. Therefore, our framework also contributes to setting practical stress test targets for interest rate products. In addition, the application of impulse control goes beyond central bank intervention problems, to areas such as inventory control (Cadenillas et al. 2010) and reinsurance strategy (Yan et al. 2022b).
The remainder of this paper is organized as follows. Section 2 presents the problem formulation and the cost function. Section 3 introduces our machine learning framework, including the neural network (NN) architecture, the approximation method, and the training procedure. The numerical study in section 4 examines the accuracy of our proposed method and the effectiveness of preventing the class vanishing problem, in which the neural networks are easily trapped in local optima, in a one-dimensional setup. We then perform a real case study of taming inflation by intervening on interest rates. We also demonstrate the use of our proposed framework when the central bank applies impulse control to two interest rates. Section 5 discusses the relationship with deep reinforcement learning and the delay effect of interventions. Finally, section 6 concludes the paper.
2. Problem formulation
Let (Ω, F, {F_t}, P) denote a filtered probability space and T > 0 a deterministic fixed future time, where {F_t} is the usual filtration. N denotes the set of positive integers and R+ the set of positive real numbers.
Following Mitchell et al. (2014), we postulate that the interest rate R follows an Itô process. There are several differences between our model and previous research (Cadenillas et al. 2010, Mitchell et al. 2014). First, R is a multi-dimensional rather than a one-dimensional stochastic process with a general form on which impulse control can be applied. Second, we add a multi-dimensional stochastic process, I, which is cointegrated with R. However, impulse control cannot be applied to I. For example, when I indicates the inflation rate for which the central bank has a target range, the central bank can only intervene on the interest rate to monitor the inflation rate. Third, to mimic real-world settings, we assume that impulse control can only be applied at a set of pre-fixed time points with a fixed, finite action set. This is a reasonable assumption for controlling the interest rate because the central bank only changes its monetary policy at regular meetings, and interest rate changes are usually drawn from a fixed set of actions rather than arbitrary real amounts. Fourth, we consider a fixed termination time T > 0 rather than an infinite horizon because the members of the Federal Open Market Committee change every year. Therefore, it is reasonable to assume that the objective function will change accordingly.
Under the physical measure, the uncontrolled stochastic process X = (R, I) is defined as

dX_t = μ(t, X_t) dt + σ(t, X_t) dW_t. (1)

This general formulation can easily incorporate several commonly used stochastic processes, including the Cox–Ingersoll–Ross process (Cox et al. 2005) and the OU process (Vasicek 1977).
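As a concrete illustration, the uncontrolled dynamics can be simulated with a simple Euler–Maruyama scheme. The sketch below uses a one-dimensional OU process with illustrative parameter values of our own choosing, not the paper's calibration:

```python
import numpy as np

def simulate_ou(r0, a, b, sigma, T, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of the OU process dR_t = a (b - R_t) dt + sigma dW_t."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    paths = np.empty((n_paths, n_steps + 1))
    paths[:, 0] = r0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        paths[:, k + 1] = paths[:, k] + a * (b - paths[:, k]) * dt + sigma * dW
    return paths

# The terminal mean should be close to b + (r0 - b) * exp(-a * T).
paths = simulate_ou(r0=1.0, a=0.2, b=2.0, sigma=0.5, T=5.0, n_steps=500, n_paths=4000)
```

The same scheme extends componentwise to the multi-dimensional case, with correlated increments where needed.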
The impulse control applied to R is a sequence of immediate (upward or downward) changes in R by a certain nonzero amount. We define impulse control v as a sequence of pairs (τ_m, ξ_m) for m = 1, …, z, where z is the number of impulse controls applied, the τ_m are non-decreasing stopping times belonging to the set of decision times, and ξ_m is the corresponding control amount at τ_m. Given a specific impulse control v, the stochastic process in (1) becomes the controlled process (2). After modeling the stochastic process, we define the cost and objective functions for the impulse control problem. We assume that the central bank can quantify its preferences for the targeted variable and its aversion to intervening. This indicates that the central bank can evaluate its preferences through a running cost function and its aversion through an action cost function. The running cost and action cost functions measure the central bank's preferences and aversion rather than the real money it pays, which is similar to the settings in Lohmann (1992) and Cadenillas and Zapatero (1999). We adopt a general function ϕ to describe the running cost rate at time t. The action cost of an intervention of size ξ at time t is represented by the function G. In this paper, we consider a finite time impulse control problem with maturity T and a discount factor β for future costs. Combining these two costs, the total cost to the central bank of impulse control v with initial value x is

J(x; v) = E[ ∫_0^T e^(−βt) ϕ(t, X_t) dt + Σ_{m=1}^{z} e^(−βτ_m) G(τ_m, ξ_m) ]. (3)

The ultimate goal of the central bank is to find an optimal policy to minimize the objective function J. Assuming the problem is well posed, we are interested in an impulse control v* attaining the infimum in (4). The collection of all impulse controls is defined as the admissible control set A. The value function V is then given by

V(x) = inf_{v ∈ A} J(x; v). (5)

The general forms of ϕ and G are consistent with previous research on the impulse control of interest rates (Mitchell et al. 2014). The key assumption in our method is that the stochastic processes are Markovian.
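To make the cost functional concrete, a minimal evaluation of the discounted running cost plus discounted action costs along one discretized path might look as follows. The quadratic running cost and affine action cost below are placeholder choices for illustration, not the paper's specification:

```python
import numpy as np

def total_cost(path, dt, beta, phi, interventions, G):
    """Discounted running cost plus discounted action costs along one path.

    path:          discretized state values on a time grid of spacing dt
    interventions: list of (time, size) pairs; size 0 means no action
    """
    t_grid = np.arange(len(path)) * dt
    running = np.sum(np.exp(-beta * t_grid) * phi(path) * dt)
    action = sum(np.exp(-beta * t) * G(xi) for t, xi in interventions if xi != 0)
    return running + action

phi = lambda r: (r - 2.0) ** 2   # placeholder running cost: deviation from target
G = lambda xi: 1.0 + abs(xi)     # placeholder action cost: fixed plus proportional
cost = total_cost(np.full(11, 2.0), dt=0.1, beta=0.0, phi=phi,
                  interventions=[(0.5, 0.25)], G=G)
```

With the path held at the target, only the single action cost contributes.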
In many applications, this assumption is not restrictive because Markovian processes can be realized by including past information. There are also restrictions on ϕ so that the solutions are non-trivial. Our method does not rely on these restrictions, and interested readers may refer to Constantinides and Richard (1978), Feng and Muthuraman (2010), and Mitchell et al. (2014).
3. Deep impulse control framework
Studies of impulse control are generally designed for one-dimensional cases, and the cointegrated process has yet to be considered (Cadenillas et al. 2010, Mitchell et al. 2014). Traditional methods encounter difficulties in higher dimensional problems when impulse control can be applied to the multi-dimensional interest rate vector R, and the incorporation of the cointegrated vector I makes the problem even more difficult. For example, the central bank can control both short-term and long-term rates to control inflation. To handle the general impulse control problem defined in the previous section, we develop a novel NN, inspired by Becker et al. (2019), to identify the optimal impulse control policy.
3.1. Express impulse control using neural networks
For impulse control v, two decisions must be made at each decision time point: whether an intervention is needed at time t and, if so, its magnitude. We consider a larger set of actions to combine these two decisions: the non-intervention decision is regarded as an impulse control of magnitude 0. In this approach, the number of impulse controls z corresponds to the cardinality of the set of decision times, denoted N in the rest of the paper.
In general, the optimal impulse control at a decision time should be based on the whole path of the process from time 0 up to that time. However, the current state alone is sufficient to make the decision because of the Markov property of the process. Let us assume that the optimal impulse control decision at each decision time is given by a measurable function whose output is a one-hot vector taking a value of 1 in the ith coordinate and 0 otherwise, where M is the cardinality of the action set. The optimal control policy can be represented by a sequence of such functions, one per decision time. The idea is to approximate each function through an NN whose input is the pair of time and state and whose output is a one-hot vector. Then, we can obtain the control policy.
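A toy numpy sketch of such a decision function — an MLP whose softmax output is collapsed to a one-hot vector by taking the (first) argmax — could look like the following. The architecture, layer sizes, and initialization are ours and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class PolicyNet:
    """Toy MLP mapping a state (t, R, I) to a one-hot action choice over M actions."""
    def __init__(self, d_in, d_hidden, n_actions):
        self.W1 = rng.normal(0.0, 0.5, (d_in, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.W2 = rng.normal(0.0, 0.5, (d_hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def probs(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return softmax(h @ self.W2 + self.b2)

    def one_hot(self, x):
        p = self.probs(x)
        f = np.zeros_like(p)
        f[np.arange(len(p)), p.argmax(axis=1)] = 1.0  # first maximum, as pmax does
        return f

net = PolicyNet(d_in=3, d_hidden=16, n_actions=5)
x = rng.normal(size=(4, 3))  # four (t, R, I) states
f = net.one_hot(x)
```

Note that `np.argmax` returns the position of the first maximum, matching the tie-breaking convention used later for pmax.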
3.2. Neural network approximation
The numerical method approximates each decision function with an NN through iterative backward training. Before presenting the approximation procedure, we define some notation for better illustration. Consider an auxiliary problem (6), where the admissible controls coincide with fixed decision functions after the current decision time. We define the expected future cost conditional on the current state as in (7). Let the magnitude of the m-th intervention on R be ξ_m. For a fixed decision time, a given control is defined as a sequence of measurable functions. The following proposition supports the iterative backward training method.
Proposition 3.1
For a specific , let be an impulse control policy generated by measurable functions . Then, there exists a measurable function such that the impulse control policy given by satisfies where and are defined as in (6).
Proof.
Let with as the impulse control policy. Then, we have where the cost from time 0 to time is the same and is therefore subsumed. Let and , with being the indicator function. Then As is arbitrary, it is also satisfied in the optimal case:
Proposition 3.2
Let and impulse control . For any depth of the NN with and , there exist positive integers such that where is the set of all measurable one-hot functions.
Proof.
For a fixed , the integrability condition ensures that there exists a measurable function such that and there exists a Borel set such that . As the output of is a one-hot function, is disjoint and . Based on the integrability condition and (4), which defines a finite Borel measure on . Let such that . Then, defines a sequence of continuous functions that converge pointwise to for each m. By the dominated convergence theorem, there exists such that We require to satisfy at least one for . This can be done because the union of is . By Leshno's theorem, can be approximated uniformly on compact sets by functions of the form (8). Hence, there exists a function of the form (8) such that can be expressed as an NN. We can then combine multiple functions to form a larger prediction network.
Based on the above two propositions, we can construct an NN to approximate the optimal policy. We introduce the NN architecture in the next section.
3.3. Neural network architecture
Based on Proposition 3.1, it is theoretically possible to find a fully connected deep neural network (DNN) to approximate the decision function. However, the decision function outputs one-hot vectors, to which gradient descent cannot be applied. Therefore, we include an NN with a continuous output as a transition step for optimization purposes. This NN is continuous and almost everywhere differentiable, and our objective is to minimize the induced expected cost. After training this NN, the decision function is obtained through pmax, the function that finds the position of the maximum value of the input. The softmax and pmax functions restrict the output of the NN, resulting in a smaller value than the indicator functions of Proposition 3.2. When there is more than one maximum value, we assume that pmax finds the position of the first maximum value by convention.
During the training process, we find that the general fully connected NN has a high probability of falling into local minima. Furthermore, the iterative backward training method worsens the situation because the accuracy of each decision function depends on an accurate estimate of the subsequent ones.
This phenomenon is due to the complex nature of our task. Our problem is neither a traditional regression problem nor a traditional classification problem. It differs from classical regression problems because the outputs of the NN are one-hot vectors. It also differs from classification problems because the objective function is not related to the misclassification error and there is no ground truth available for the training process. Our task is a classification task with a regression-type objective function, and this type of combination results in a high probability of being trapped in local minima during gradient descent. To the best of our knowledge, there is no research describing this phenomenon, which we refer to as ‘class vanishing’ in the rest of the paper. When class vanishing occurs, the output of the DNN is a proper subset of the action set, which means that the DNN solves a simpler problem with a smaller set of actions than the original. We further illustrate the class vanishing problem in the numerical study section for the one-dimensional case, as a benchmark is available for this case.
To solve the class vanishing problem, we propose a new NN architecture that proves useful in the one-dimensional case. Because of the lack of benchmarks for high-dimensional cases, we leave it to future research to test the effectiveness of this NN architecture in high-dimensional situations.
takes the following form: (9) where the depth of the NN and positive integers give the number of nodes in the hidden layers, the final layer is the standard softmax function, and the hidden layers consist of affine functions composed with activation functions. The NN consists of M different NNs, and each NN is only responsible for predicting one impulse control choice. The outputs of the small NNs are combined through a softmax function, which is commonly used for multi-class classification. An overview of the NN architecture is shown in figure . The idea here is very similar to ensemble learning techniques such as XGBoost or random forest. However, our aim is to prevent the occurrence of the class vanishing problem rather than to make a bias–variance tradeoff. We refer to this type of NN as the ensemble NN in the rest of the paper. The ensemble NN can be regarded as a kind of regularization method similar to the DropConnect method proposed in Wan et al. (2013). The difference is that DropConnect randomly sets weights to 0, whereas we do it deterministically.
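A hedged numpy sketch of the ensemble idea — M sub-networks with no shared weights, each scoring a single action, combined only through the final softmax — is given below. Layer sizes and initialization are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class SubNet:
    """A small MLP responsible for the score of a single action."""
    def __init__(self, d_in, d_hidden):
        self.W1 = rng.normal(0.0, 0.5, (d_in, d_hidden))
        self.b1 = np.zeros(d_hidden)
        self.w2 = rng.normal(0.0, 0.5, d_hidden)
        self.b2 = 0.0

    def score(self, x):
        return np.tanh(x @ self.W1 + self.b1) @ self.w2 + self.b2

class EnsembleNet:
    """M disjoint sub-networks, one per action, combined by a softmax."""
    def __init__(self, d_in, d_hidden, n_actions):
        self.subs = [SubNet(d_in, d_hidden) for _ in range(n_actions)]

    def probs(self, x):
        scores = np.stack([s.score(x) for s in self.subs], axis=-1)
        return softmax(scores)

ens = EnsembleNet(d_in=3, d_hidden=8, n_actions=7)
p = ens.probs(rng.normal(size=(5, 3)))
```

Because no weights are shared across sub-networks, no single action's score can be silenced by updates that benefit the other actions, which is the deterministic analogue of DropConnect described above.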
The training of depends on the sample of , and may not minimize for in Proposition 3.1. We can train different functions for different distributions of to minimize . In our numerical study below, one function is enough to obtain a satisfactory control policy.
3.4. Parameter optimization
We determine the depth and the number of nodes in the hidden layers for all k and train the NN described in section 3.3. To obtain the parameters numerically, we simulate H paths of the process and minimize the sample average, which can be regarded as an approximation of the expectation of the cost function.
For a given decision time, we assume that the parameters of the subsequent networks are determined and generate an impulse control policy. The cost along the hth path from the current decision time to time T is estimated in the discrete case by a sum over time intervals. In this paper, the integral is measured by ϕ at the end time point of each interval. We choose the end time point here because it is the most difficult to predict. The input of the network and the resulting process values contain no randomness. It is easier for the NN to find the optimal control policy if we include them in the cost calculation.
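The right-endpoint discretization of the discounted running cost described above can be sketched as follows, with ϕ left as a placeholder cost function:

```python
import numpy as np

def discrete_running_cost(path, dt, beta, phi):
    """Approximate the integral of e^(-beta t) phi(X_t) dt, evaluating phi
    at the right endpoint of each interval."""
    t = np.arange(1, len(path)) * dt
    return float(np.sum(np.exp(-beta * t) * phi(path[1:]) * dt))

# A constant path with phi = 1 should recover (1 - e^(-beta T)) / beta.
c = discrete_running_cost(np.ones(1001), dt=0.001, beta=0.06,
                          phi=lambda r: np.ones_like(r))
```

For a decreasing discount factor, the right-endpoint rule slightly underestimates the integral, with an error that vanishes as the grid is refined.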
Suppose that we can apply impulse control according to at time and make a decision according to ; then, the cost for the hth simulated path is given by where is the simulated path with applied at time and is the corresponding cost from time to time T.
For a large H, the sample mean approximates the expectation, so it is used to find the optimal parameters through gradient descent.
We divide the training process into two phases, the pre-training phase and the fine-training phase, to prevent class vanishing. In the pre-training phase, the first part of the objective function is the expectation that we want to minimize, and the second part is a penalty term for the occurrence of small class probabilities. λ is a pre-defined parameter that balances these two objectives, and ε is a small number that prevents the logarithm from exploding during training. By adding the penalty term, we avoid the class vanishing problem but may introduce an additional bias into our predictions. After the pre-training phase, we remove the penalty term and continue to train the model in the fine-training phase to remove this additional bias.
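A sketch of one plausible form of the pre-training objective — sample-average cost plus a log-barrier penalty on the average probability mass of each action class — is given below. The exact penalty form is our hedged reading of the description above, and the values of lam and eps are illustrative tuning choices:

```python
import numpy as np

def pretrain_loss(costs, probs, lam=0.1, eps=1e-6):
    """Sample-average cost plus a penalty that blows up when any action
    class receives vanishing average probability mass."""
    class_mass = probs.mean(axis=0)                # average probability per action
    penalty = -np.sum(np.log(class_mass + eps))    # large if some class_mass ~ 0
    return float(np.mean(costs) + lam * penalty)

uniform = np.full((100, 4), 0.25)
collapsed = np.column_stack([np.full(100, 0.997)] + [np.full(100, 0.001)] * 3)
zero_costs = np.zeros(100)
# A collapsed policy is penalized more heavily than a uniform one.
```

Setting lam to 0 recovers the fine-training objective, which is the plain sample-average cost.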
4. Numerical study
In this section, we examine our method in three scenarios. First, we do not consider the uncontrollable stochastic process I and test our method in the one-dimensional case. Because impulse control in the one-dimensional case is well studied, we can use the results of Cadenillas et al. (2010) as a benchmark to examine our method. Second, we illustrate our method using data on real inflation and effective federal funds rates (EFFR) in the United States. We compare our results with the Fed's decision on the EFFR in 2022. Next, we examine the effect of our model with a multi-dimensional R and a one-dimensional I through a simulation study.
4.1. One-dimensional benchmark
The one-dimensional stochastic impulse control problem is studied in Cadenillas et al. (2010), and we follow their settings for our simulation study. Assume that the process follows an OU process when there is no impulse control, where a = 0.2 is the speed of reversion, b = 2 represents the long-term mean level, σ is the instantaneous volatility of the model, and W is a one-dimensional Brownian motion. The discount rate β is set to 0.06.
There are several differences between Cadenillas et al. (2010) and our study, leading to a deviation between our optimal policy and theirs. In Cadenillas et al. (2010), T is set to ∞ and the intervention magnitudes are non-negative real numbers. For our numerical study, T is set to 5. We use the same running cost function and action cost function as defined in Cadenillas et al. (2010). The cost should follow from the two cost functions above. However, 0 in our action set represents the choice of no impulse control rather than an impulse control of magnitude 0. Therefore, we replace 5 with 0, and the cost is modified accordingly in our paper.
We train backward from the last decision point with the NN architecture defined in section 3.3. The process in (4.1) can be easily simulated, and the initial value is sampled from a uniform distribution to allow the NN to learn different scenarios. For each decision time, we simulate 8,192 sample paths for the training process, and we conduct 1,500 training steps in the pre-training phase and 3,000 training steps in the fine-training phase.
The resulting decision functions are shown in figure for different inputs. Most are similar, and we show a representative one here. We also show the last decision function because it differs from the others. The reason may be that this is the last decision point and the NN is more conservative in impulse control.
We also show a representative class vanishing problem in figure , using a fully connected NN with a depth of 3 and 40 nodes in each hidden layer. The training process is the same with the same parameter settings, and we can see from the figure that the three classes −3, 3 and 6 are missing. We increase the number of nodes in the hidden layers to 50, but the class vanishing problem still occurs.
4.2. Fed's interest rate illustration
In this section, we illustrate our method with real data from the United States. We collect EFFR and Consumer Price Index (CPI) data from the U.S. Bureau of Labor Statistics (2023) and the Federal Reserve Bank of New York (2023). We define the month-over-month (MoM) inflation rate from the CPI. Assume that the EFFR and monthly inflation follow the stochastic processes (10) when no impulse control is applied. The EFFR follows an OU process with a speed of reversion and a long-term mean level, and the inflation rate follows a stochastic process that is cointegrated with the EFFR. The instantaneous volatilities are constant, and the driving one-dimensional Brownian motions are independent. We estimate the parameters using EFFR and CPI data from August 1, 2019 to November 2. As the Fed launched several interventions on the EFFR during this period, we remove the corresponding days during the estimation process.
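Since the estimated parameter values are not reproduced here, the sketch below simulates one plausible cointegrated specification of (10) with made-up parameters: the rate follows an OU process and the inflation drift is pulled toward a level proportional to the rate. The paper's exact functional form and estimates may differ:

```python
import numpy as np

def simulate_effr_inflation(r0, i0, a, b, kappa, gamma, sig_r, sig_i,
                            T, n_steps, n_paths, seed=0):
    """Euler-Maruyama for an OU rate R and an inflation rate I whose drift
    is pulled toward gamma * R (one plausible cointegrated form)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    R = np.full(n_paths, r0)
    I = np.full(n_paths, i0)
    for _ in range(n_steps):
        dWr = rng.normal(0.0, np.sqrt(dt), n_paths)
        dWi = rng.normal(0.0, np.sqrt(dt), n_paths)  # independent Brownian motions
        R_new = R + a * (b - R) * dt + sig_r * dWr
        I = I + kappa * (gamma * R - I) * dt + sig_i * dWi
        R = R_new
    return R, I

R, I = simulate_effr_inflation(r0=2.0, i0=0.0, a=0.5, b=2.0, kappa=1.0,
                               gamma=0.5, sig_r=0.1, sig_i=0.1,
                               T=10.0, n_steps=1000, n_paths=4000)
```

Over a long horizon, the inflation sample mean settles near gamma times the long-term rate level, which is the cointegration effect the model relies on.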
After determining the parameters of the stochastic processes, we start training our NN for a one-year interest rate impulse control policy for 2022. The maturity T is set to 1. The Fed holds eight regular meetings each year to determine monetary policy, and the time interval between meetings is usually one and a half months. Therefore, we set to . The Fed could only intervene on the EFFR, so the action set is set to , which contains all of the actions applied by the Fed to the EFFR in 2022. We also add −0.25 to the action set to test whether the NN chooses the right direction of action. Inflation was extremely high in 2022, so we assume that the main goal of the Fed was to control the inflation rate. Because the Fed is aiming for 2% inflation over the long run, we set our MoM inflation rate target to 0.2%. The running cost function and the action cost function are defined as follows: where the discount factor . The depth is set to 3, and the number of hidden nodes for . We simulate 30,000 sample paths to train the NN. The training step in the pre-training phase is set to 2000 with . The training step in the fine-training phase is set to 1500. After training the NN, there is one last problem to solve before comparing the output of the NN with the actual interventions of the Fed in 2022. In reality, there is a delay in the publication of the CPI, so we use the predicted inflation rate as the input. We compare our prediction results with the Fed's actions from January to September 2022, and the results are shown in table . Here, we consider two methods for determining the magnitude of the Fed's actions. First, we follow the general rule to choose the action with the highest probability (I) estimated by the NN. Second, we construct a method that is similar to the dot plot in the Fed's Summary of Economic Projections. We calculate for the decision at time and choose the action in that is closest to it (II).
Table shows that our impulse control policy is more volatile than the Fed's policy. We speculate that this may be related to the Fed's intention to smooth its intervention levels and its focus on the year-on-year (YoY) inflation rate rather than the MoM inflation rate. As the YoY inflation rate is less volatile than its MoM counterpart, the induced intervention magnitudes are expected to be less volatile too. Despite the intervention volatility, the overall level of interest rate intervention by the Fed is similar to our NN predictions. The predicted aggregate intervention level is useful for setting stress-testing targets for fixed income securities. For instance, around July 26, 2022, the MoM inflation rate was predicted to be 0.2%, and our NN model suggests stopping the interest rate intervention. By that point, the Fed should have increased the EFFR by 3.25%, but it had only been increased by 1.5% in reality. Therefore, the Fed still needed an increase of at least 0.75%. A similar phenomenon occurs on September 20. Therefore, our model offers an early warning signal to financial institutions of potential future interventions. This signal sets a more realistic stress test target for interest rate products.
4.3. Two-dimensional illustration
We demonstrate the application of our model to short-term and long-term interest rate interventions to monitor inflation. In other words, the central bank can apply impulse control to both the EFFR and long-term interest rates. Similar to the previous section, we define the stochastic processes of the EFFR, the long-term interest rate, and the MoM inflation rate without intervention, where the EFFR and inflation dynamics are the same as in the previous section. We assume that the long-term rate follows an OU process with its own reversion speed and long-term mean level. We define an enlarged action set over the two rates. If the central bank wants to monitor the monthly inflation rate by controlling the EFFR and long-term interest rates, we use the same running cost function as in the previous section, but the action cost function is replaced by one defined over both rates.
The depth of the NN is set to 3, with a fixed number of hidden nodes in each layer. We simulate 30,000 paths to train the NN backward. After training the NN, we simulate another set of 30,000 paths to illustrate its performance. Because of the lack of a benchmark, we calculate the sample mean of the overall cost when no impulse control is applied and compare it with the overall cost when we intervene based on the NN. The cost drops from 1.299 to 0.980, indicating that our NN is useful for minimizing the cost. However, it is technically more difficult for the Fed to intervene simultaneously on short-term and long-term rates in reality. Therefore, we do not report the details here, but we note this potential alternative use of our framework.
5. Discussion and areas for future study
5.1. Relationship with deep reinforcement learning
The primary goal of deep reinforcement learning (DRL) is to optimize a policy over discrete actions that maximizes the expected cumulative reward. This objective can be aligned with that in (3), which offers the possibility of linking our problem to a discrete-time and discrete-space reinforcement learning problem. We believe that using DRL techniques such as deep Q-learning (Mnih et al. 2013) to solve the interest rate intervention problem is a promising future direction. An advantage of our method is that we train the DNN backward to obtain the optimal impulse control policy, for which we can find an upper bound on the errors. DRL requires exploration, such as the ϵ-greedy strategy, to learn the environment. Further investigation is needed to derive error bounds (or regret bounds) for DRL on the same problem and to compare empirical results between the two approaches. Incorporating our ensemble networks into DRL and comparing the empirical performance of the two approaches is another potential future direction.
5.2. Delay effect of interest rate intervention
Although the implementation of impulse control results in an immediate intervention on the interest rate, there is an inherent time delay in the inflation response. In the context of inflation, any adjustments are anticipated to manifest as a gradual shift rather than an abrupt leap. Our proposed model (10) partially captures this characteristic by incorporating the impact of the EFFR in the drift term of the inflation rate, so that the inflation rate takes time to revert to the intervened equilibrium level through the cointegration dynamics. However, it is still interesting to examine and incorporate alternative delayed mechanisms. An example is the delayed stochastic system in Yan et al. (Citation2022a) and Yan and Wong (Citation2022), in which the drift depends on the state with a specific time lag Δ; in its discrete version, the lagged state enters as an additional input to the DNN. The DNN training is otherwise the same. This delayed mechanism changes the Markovian nature of the original problem and complicates the training of the DNN. We leave this for future research.
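A discretized version of such a delayed mechanism can be sketched as an Euler scheme in which the inflation drift reads the interest rate `delta_lag` steps in the past. The drift form and all parameter values below are illustrative assumptions standing in for the delayed cointegration dynamics of the cited papers, not their exact specification.

```python
import numpy as np

def simulate_delayed_ou(n_steps, delta_lag, dt=1 / 12, kappa=1.5,
                        a=1.0, sigma=0.02, x0=0.05, r_hist=0.02, seed=0):
    """Euler scheme for an inflation process x whose drift depends on the
    interest rate lagged by `delta_lag` steps (the Delta time lag).
    The drift kappa * (a * r_{t-Delta} - x_t) is an illustrative
    stand-in; pre-sample history of r is held flat at r_hist."""
    rng = np.random.default_rng(seed)
    r = np.full(n_steps + 1, r_hist)      # interest-rate path (flat here)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        # Lagged rate; indices before t = 0 fall back to the flat history.
        r_lagged = r[max(t - delta_lag, 0)]
        x[t + 1] = x[t] + kappa * (a * r_lagged - x[t]) * dt \
                   + sigma * np.sqrt(dt) * rng.standard_normal()
    return x
```

Because the increment at time t depends on the path segment over [t − Δ, t] rather than the current state alone, a DNN policy for this system would need the recent rate history (or a summary of it) as input, which is what breaks the Markov property noted above.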
6. Conclusion
We generalize the impulse control problem and propose a novel deep-learning framework to solve it. The generalized impulse control problem is formulated with a finite time horizon, finite decision points, and a finite set of actions, which aligns more closely with realistic situations than previous research. In addition, we consider both controllable and uncontrollable processes to replicate the real-world scenarios encountered by the central bank. Our framework can handle high-dimensional cases with cointegrated processes that cannot be directly controlled by impulse control. We also propose a new NN architecture to solve the class vanishing problem, which arises from the complex nature of our task. The ensemble NN can easily be extended to other areas where the problem is a classification task with a regression-type objective function. The accuracy of our method is examined in one-dimensional cases. Our numerical results show reasonable congruence between the predictions of our NN model and the Fed's interventions on the EFFR in 2022. We suggest that our deep impulse control framework is useful for financial institutions and regulatory agencies in developing stress-test interest rate scenarios for risk management purposes.
Acknowledgements
We are grateful for comments from participants of the 26th International Congress on Insurance: Mathematics and Economics at Heriot-Watt University, Recent Advances on Quantitative Finance 2023 at Hong Kong Polytechnic University, and the 3rd Yushan Conference at National Yang Ming Chiao Tung University.
Disclosure statement
No potential conflict of interest was reported by the author(s).
Notes
1 Source: the monetary policy web page of the Board of Governors of the Federal Reserve System. https://www.federalreserve.gov/monetarypolicy.htm.
3 Our NN architecture has three layers at each decision time point so that the whole NN collecting all decision points has a total of three times the number of decision points of layers. When the whole NN is trained at once, it is essentially a DNN training (Becker et al. Citation2021). In addition, our framework is an extension of the deep optimal stopping (Becker et al. Citation2019), which also uses a three-layer NN at each time point.
4 This is a distinctive period in which to examine our DNN because the last interest rate hikes to control inflation date back to 1979–1982, when the interest rates were fully controlled by the Fed to remain in a specified interval. In other words, the interest rates were not sufficiently random to empirically estimate the interaction between interest rates and the inflation rate.
5 Interested readers may e-mail [email protected] for detailed results.
6 We thank an anonymous referee for pointing out the connection. As our paper aims to stimulate the use of contemporary machine learning methods to support stress testing, this suggestion further strengthens our purpose.
References
- Balduzzi, P., Bertola, G. and Foresi, S., A model of target changes and the term structure of interest rates. J. Monet. Econ., 1997, 39, 223–249.
- Becker, S., Cheridito, P. and Jentzen, A., Deep optimal stopping. J. Mach. Learn. Res., 2019, 20, 2712–2736.
- Becker, S., Cheridito, P., Jentzen, A. and Welti, T., Solving high-dimensional optimal stopping problems using deep learning. Eur. J. Appl. Math., 2021, 32, 470–514.
- Black, F. and Karasinski, P., Bond and option pricing when short rates are lognormal. Financ. Anal. J., 1991, 47, 52–59.
- Booth, G.G. and Ciner, C., The relationship between nominal interest rates and inflation: International evidence. J. Multinatl. Financ. Manag., 2001, 11, 269–280.
- Cadenillas, A. and Zapatero, F., Optimal central bank intervention in the foreign exchange market. J. Econ. Theory, 1999, 87, 218–242.
- Cadenillas, A. and Zapatero, F., Classical and impulse stochastic control of the exchange rate using interest rates and reserves. Math. Finance, 2000, 10, 141–156.
- Cadenillas, A., Lakner, P. and Pinedo, M., Optimal control of a mean-reverting inventory. Oper. Res., 2010, 58, 1697–1710.
- Chan, K.C., Karolyi, G.A., Longstaff, F.A. and Sanders, A.B., An empirical comparison of alternative models of the short-term interest rate. J. Finance, 1992, 47, 1209–1227.
- Chapman, D.A. and Pearson, N.D., Is the short rate drift actually nonlinear? J. Finance, 2000, 55, 355–388.
- Constantinides, G.M. and Richard, S.F., Existence of optimal simple policies for discounted-cost inventory and cash management in continuous time. Oper. Res., 1978, 26, 620–636.
- Cox, J.C., Ingersoll Jr, J.E. and Ross, S.A., A theory of the term structure of interest rates. In Theory of Valuation, pp. 129–164, 2005 (World Scientific).
- F.R.B. of New York, Effective federal funds rate [EFFR], retrieved from FRED, Federal Reserve Bank of St. Louis, 2023. Available online at: https://fred.stlouisfed.org/series/EFFR (accessed 26 February 2023).
- Feng, H. and Muthuraman, K., A computational method for stochastic impulse control problems. Math. Oper. Res., 2010, 35, 830–850.
- Friedman, B.M., Monetary policy, 2000.
- Guttentag, J.M., Defensive and dynamic open market operations, discounting, and the federal reserve system's crisis-prevention responsibilities. J. Finance, 1969, 24, 249–262.
- Harrison, J.M., Sellke, T.M. and Taylor, A.J., Impulse control of Brownian motion. Math. Oper. Res., 1983, 8, 454–466.
- Hull, J. and White, A., Pricing interest-rate-derivative securities. Rev. Financ. Stud., 1990, 3, 573–592.
- Jia, B., Wang, L. and Wong, H.Y., Machine learning of surrender: Optimality and humanity. J. Risk Insur., 2023. https://doi.org/10.1111/jori.12428
- Lo, A.W. and Singh, M., Deep-learning models for forecasting financial risk premia and their interpretations. Quant. Finance, 2023, 23, 917–929.
- Lohmann, S., Optimal commitment in monetary policy: Credibility versus flexibility. Am. Econ. Rev., 1992, 82, 273–286.
- Mikkilä, O. and Kanniainen, J., Empirical deep hedging. Quant. Finance, 2023, 23, 111–122.
- Mitchell, D., Feng, H. and Muthuraman, K., Impulse control of interest rates. Oper. Res., 2014, 62, 602–615.
- Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D. and Riedmiller, M., Playing Atari with deep reinforcement learning, 2013, arXiv preprint arXiv:1312.5602.
- Na, A.S. and Wan, J.W., Efficient pricing and hedging of high-dimensional American options using deep recurrent networks. Quant. Finance, 2023, 23, 631–651.
- Piazzesi, M., Bond yields and the federal reserve. J. Polit. Econ., 2005, 113, 311–344.
- Rudebusch, G.D., Federal reserve interest rate targeting, rational expectations, and the term structure. J. Monet. Econ., 1995, 35, 245–274.
- Sulem, A., A solvable one-dimensional model of a diffusion inventory system. Math. Oper. Res., 1986, 11, 125–133.
- Tsang, K.H. and Wong, H.Y., Deep-learning solution to portfolio selection with serially dependent returns. SIAM J. Financ. Math., 2020, 11, 593–619.
- U.S. Bureau of Labor Statistics, Consumer price index for all urban consumers: All items in U.S. city average [CPIAUCSL], retrieved from FRED, Federal Reserve Bank of St. Louis, 2023. Available online at: https://fred.stlouisfed.org/series/CPIAUCSL (accessed 26 February 2023).
- Vasicek, O., An equilibrium characterization of the term structure. J. Financ. Econ., 1977, 5, 177–188.
- Wan, L., Zeiler, M., Zhang, S., Le Cun, Y. and Fergus, R., Regularization of neural networks using dropconnect. In International Conference on Machine Learning, pp. 1058–1066, 2013 (PMLR).
- Yan, T. and Wong, H.Y., Equilibrium pairs-trading under delayed cointegration. Automatica, 2022, 144, 110498.
- Yan, T., Chiu, M.C. and Wong, H.Y., Pairs trading under delayed cointegration. Quant. Finance, 2022a, 22, 1627–1648.
- Yan, T., Park, K. and Wong, H.Y., Irreversible reinsurance: A singular control approach. Insur. Math. Econ., 2022b, 107, 326–348.
- Yin, J. and Wong, H.Y., Deep LOB trading: Half a second please! Expert Syst. Appl., 2023, 213, 118899.