Research Article

A hybrid nonparametric multivariate density estimator with applications to risk management

Pages 301-318 | Published online: 17 Apr 2024
 

Abstract.

Multivariate density estimation is plagued by the curse of dimensionality in both theory and practice. We propose a hybrid estimator of a multivariate density f that combines the strengths of the kernel estimator and the exponential series estimator. The estimator refines a preliminary kernel estimate f̂0 with a multiplicative correction, which estimates the ratio r=f/f̂0 using an exponential series estimator. Because the pilot estimate is consistent, the coefficients of the series expansion tend to zero asymptotically. Accordingly, we design a thresholding method for basis function selection. A major obstacle to using the multivariate exponential series estimator is the calculation of its normalization factor. We resolve this difficulty with Monte Carlo integration, using the pilot kernel estimate as the trial density for importance sampling. This approach greatly enhances the practicality of the hybrid estimator. Numerical simulations demonstrate the good finite-sample performance of the hybrid estimator. We present one empirical application in financial risk management.
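The Monte Carlo step described in the abstract can be illustrated with a minimal sketch. It assumes, based only on the abstract's description, that the hybrid estimate has the form f̂(x) = f̂0(x) exp(θ'φ(x)) / c, so that the normalization factor is c = E[exp(θ'φ(X))] with X drawn from the pilot estimate f̂0. The Gaussian product kernel, the basis functions, the coefficient values, and all function names below are hypothetical choices made for illustration, not the authors' implementation.

```python
import math
import random

def sample_from_gaussian_kde(data, bandwidths, rng):
    """Draw one point from a product-Gaussian KDE (assumed pilot form):
    pick a data point uniformly, then add kernel noise in each coordinate."""
    center = rng.choice(data)
    return [c + h * rng.gauss(0.0, 1.0) for c, h in zip(center, bandwidths)]

def estimate_normalizer(data, bandwidths, theta, basis, n_draws=20000, seed=0):
    """Monte Carlo estimate of c = integral of f0(x) * exp(theta' phi(x)) dx.
    Because the draws come from f0 itself, the importance weights reduce to
    exp(theta' phi(x)), and c is their sample average."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = sample_from_gaussian_kde(data, bandwidths, rng)
        exponent = sum(t * phi(x) for t, phi in zip(theta, basis))
        total += math.exp(exponent)
    return total / n_draws

# Hypothetical bounded basis functions on a two-dimensional sample.
toy_basis = [
    lambda x: math.tanh(x[0]),
    lambda x: math.tanh(x[0]) * math.tanh(x[1]),
]
toy_data = [[0.1, -0.3], [0.8, 0.5], [-1.2, 0.4], [0.3, 1.1]]
c_hat = estimate_normalizer(toy_data, [0.5, 0.5], [0.2, -0.1], toy_basis)
```

When all coefficients are zero the correction is identically one and the estimate of c equals 1, which matches the abstract's observation that the coefficients shrink toward zero as the pilot estimate improves.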

Notes

1. Note that, for a given degree of smoothness, the number of cross-partial derivatives increases with the dimension d; this is a regularity condition commonly assumed in multivariate series estimation. This smoothness condition bounds the best possible rate of convergence of series estimation. Although sometimes not explicitly stated, a similar condition is also needed for kernel-based estimation in high dimensions.

2. We use the npudensbw function from the R package np to implement the estimation. The “normal-reference” method computes the bandwidth using the standard formula $h_j = 1.06\,\sigma_j\, n^{-1/(2P+d)}$, where $h_j$ is the bandwidth for the j-th continuous variable and $\sigma_j$ is the minimum of the standard deviation, the mean absolute deviation divided by 1.4826, and the interquartile range divided by 1.349. Here n denotes the number of observations, P the order of the kernel, and d the number of continuous variables.
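As a rough illustration of the normal-reference rule in this note, the sketch below transcribes the formula literally, reading the exponent as −1/(2P+d). It is not the np package's code: the function names are hypothetical, and the sample standard deviation (n − 1 divisor) and the quartile interpolation rule are assumptions that may differ from what npudensbw does internally.

```python
import math
import statistics

def spread(x):
    """Adaptive spread as stated in the note: the minimum of the standard
    deviation, mean absolute deviation / 1.4826, and IQR / 1.349."""
    sd = statistics.stdev(x)  # assumption: sample sd with n - 1 divisor
    mean = statistics.fmean(x)
    mean_abs_dev = statistics.fmean([abs(v - mean) for v in x])
    q1, _, q3 = statistics.quantiles(x, n=4)  # assumption: default quartile rule
    return min(sd, mean_abs_dev / 1.4826, (q3 - q1) / 1.349)

def normal_reference_bandwidths(columns, kernel_order=2):
    """Return h_j = 1.06 * sigma_j * n^(-1/(2P + d)) for each variable.
    `columns` holds one list of observations per continuous variable."""
    d = len(columns)
    n = len(columns[0])
    rate = n ** (-1.0 / (2 * kernel_order + d))
    return [1.06 * spread(col) * rate for col in columns]

# Two hypothetical variables with 100 observations each.
x1 = [float(i) for i in range(1, 101)]
x2 = [math.sin(i) for i in range(1, 101)]
bandwidths = normal_reference_bandwidths([x1, x2], kernel_order=2)
```

With a second-order kernel (P = 2) and d = 2, the rate factor is n^(−1/6), so the bandwidths shrink more slowly in n as more continuous variables are added.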

3. All estimation methods were implemented in R on a PC with an Intel Xeon 3.90 GHz processor and 32 GB RAM. Take, for example, the Clayton-Survival Clayton DGP with parameters τ = 0.6 and w = 0.5. For n = 200, the computation times (in seconds) for the KDE, the hybrid estimate, and the smoothing spline estimate (Gu, 1993) are: (i) d = 2: 0.0059, 0.4116, and 0.2082; (ii) d = 3: 0.0073, 3.203, and 0.3545; (iii) d = 4: 0.0096, 7.8872, and 0.7261. For n = 400, the corresponding computation times are: (i) d = 2: 0.0162, 0.4695, and 0.4255; (ii) d = 3: 0.0218, 5.1458, and 0.6936; (iii) d = 4: 0.03, 15.3898, and 1.4395. As expected, the computation time for our estimator increases with both sample size and dimension as more basis functions are incorporated. Nonetheless, it remains manageable.

4. All estimation methods were implemented in R on a Windows PC with an Intel Xeon 3.90 GHz processor and 32 GB RAM. Take, for example, the Clayton-Survival Clayton DGP with parameters τ = 0.6 and w = 0.5. For n = 200, the computation times (in seconds) for the KDE, the hybrid estimate, and the smoothing spline estimate (Gu, 1993) are: (i) d = 2: 0.0059, 0.4116, and 0.2082; (ii) d = 3: 0.0073, 3.203, and 0.3545; (iii) d = 4: 0.0096, 7.8872, and 0.7261. For n = 400, the corresponding computation times are: (i) d = 2: 0.0162, 0.4695, and 0.4255; (ii) d = 3: 0.0218, 5.1458, and 0.6936; (iii) d = 4: 0.03, 15.3898, and 1.4395. As expected, the computation time for our estimator increases with both sample size and dimension as more basis functions are incorporated. Nonetheless, it remains manageable.

5. Escanciano and Olmo (2010) study the effects of estimation risk on VaR backtesting. They focus on parametric VaR models for a single financial asset or portfolio. In contrast, our study primarily concerns a portfolio consisting of multiple underlying assets and explicitly models their joint distribution. Because the joint distribution is modeled nonparametrically, it essentially introduces infinite-dimensional parameters into the estimation. As a result, the method proposed by Escanciano and Olmo (2010) is not directly applicable. Nonetheless, accounting for estimation risk is important in VaR backtesting, and we may explore this issue for the proposed method in future work.

Additional information

Funding

Portions of this research were conducted with the advanced computing resources provided by Texas A&M High Performance Research Computing. Juan Lin acknowledges financial support from the National Natural Science Foundation of China (grant nos. 72173107 and 72273112).
