Research Article

Random variate generation by fast numerical inversion in the varying parameter case

Article: 2279060 | Received 06 Jun 2023, Accepted 30 Oct 2023, Published online: 15 Nov 2023

Abstract

There are various general techniques to produce random variates of a probability distribution, such as the rejection method or numerical approaches to invert the cumulative distribution function (CDF). Some of these methods work in a black-box fashion (i.e., a single piece of code can sample from a relatively large class of distributions), which makes it easy to create generators even for nonstandard distributions. Numerical inversion has some desirable properties that make its application appealing. However, a setup step is typically required to compute an approximation of the inverse of the CDF. Hence, if the distribution depends on a shape parameter and small samples are required for many different parameters (varying parameter case), the cost of the setup step typically outweighs the marginal generation times of the small samples, rendering the inversion method very slow. This article presents a new approach that allows for the use of inversion in the varying parameter case, provided that a suitable transformation of the density can be found to avoid running the setup for every parameter. The method is applied to two distributions (the ARGUS and alpha distributions) to demonstrate that the performance is very good in the varying parameter case.

1 Introduction

For continuous distributions, various general methods have been developed, such as numerical inversion of the cumulative distribution function (Hörmann and Leydold Citation2003; Derflinger, Hörmann, and Leydold Citation2010), Transformed Density Rejection (Wild and Gilks Citation1993; Hörmann Citation1995), the Ratio-of-Uniforms (RoU) method (Kinderman and Monahan Citation1977) and variants thereof (Wakefield, Gelfand, and Smith Citation1991; Leydold Citation2000), as well as table methods (Ahrens Citation1995). Excellent comprehensive resources are Devroye (Citation1986), Dagpunar (Citation1988), and Hörmann, Leydold, and Derflinger (Citation2004). Some of these methods work in a black-box fashion (i.e., a single piece of code can sample from a relatively large class of distributions, e.g., based on the density or the CDF).

The inversion method is attractive for simulations. A few of its strengths (Derflinger, Hörmann, and Leydold Citation2010) are:

  • It is a very general method that works for any distribution whose CDF can be computed.

  • It preserves the structural properties of the uniform random numbers.

  • It allows efficient sampling from truncated distributions.

  • It is well suited for quasi-Monte Carlo methods, and it is essential for copula methods.

However, except for a few cases, the inverse of the CDF cannot be expressed using elementary functions, and special functions like the inverse of the Gamma function are expensive to evaluate. In such situations, the numerical inversion approach called PINV presented in Derflinger, Hörmann, and Leydold (Citation2010) is a suitable choice. A setup step is typically required to compute an approximation of the inverse of the CDF if a black-box method is applied. Hence, if the distribution depends on a shape parameter and small samples are required for many different parameters (varying parameter case), the cost of the setup step typically outweighs the marginal generation times of the small samples, rendering the inversion method very slow.Footnote1 For example, even if the setup takes only one millisecond, it would take more than 16 minutes to run the setup for one million different parameters, which is obviously orders of magnitude longer than generating one million random variates with an efficient sampler for a fixed value of the parameter. For the varying parameter case, no fast automatic method is known, and for most distributions with shape parameters, the varying parameter case results in much slower generation methods than the fixed parameter case. Being able to handle the varying parameter case is important in some applications such as Gibbs sampling in the context of Bayesian statistics (see (Hörmann, Leydold, and Derflinger Citation2004, Section 15.2) for examples and further references). The importance of the tradeoff between the speed of sampling and the setup is also discussed in (Devroye Citation1986, Section I.3): “The new ingredient for multi-parameter families is the set-up time, that is, the time spent computing constants that depend only upon the parameters of the distribution.” The cost of the setup is always considered in this reference.
For example, for the Gamma distribution, “small or nonexistant setup times” (Devroye Citation1986, Section IX.3.2) are considered a desirable feature, and various approaches to cover the fixed and varying parameter case are discussed in the cited section. For the generalized Gaussian distribution, algorithms with a fast setup are studied in Hörmann and Leydold (Citation2014) and Devroye (Citation2014). The sampling approach of Hörmann and Leydold (Citation2014) is used in Kastner, Frühwirth-Schnatter, and Lopes (Citation2017) in the context of Bayesian inference, where MCMC sampling is applied to simulate from distributions with time-varying parameters in stochastic volatility models. Another application where the parameters of a distribution are varied is sensitivity analysis; see Lu et al. (Citation2008) for an example.

This article presents a new approach that allows for the use of inversion in the varying parameter case if a suitable transformation of the density can be found. In particular, the transformed density must depend on the shape parameter only via its domain, which makes it possible to sample from the transformed distribution by conditioning on a parameter-dependent interval and then mapping the random variates back to the original distribution. Hence, one can benefit from the advantages of the inversion method even in the varying parameter case by avoiding running the expensive setup step many times. The method is applied to two distributions (the ARGUS and alpha distributions).

The remainder of this article is organized as follows: Section 2 presents the main idea to apply numerical inversion in the varying parameter case. Section 3 shows how to apply the method to two distributions: the ARGUS distribution in Section 3.1 and the alpha distribution in Section 3.2. The performance of the algorithm is analyzed in Section 4. We conclude with a brief discussion of the results in Section 5.

2 Inversion in the varying parameter case

Given the density f of a distribution, PINV(f) returns a function that is a numerical approximation of the inverse of the CDF. The algorithm is based on polynomial interpolation of the inverse CDF and Gauss-Lobatto integration. Its main strengths are that only the density is needed as an input and that the algorithm allows one to control the approximation error of the numerical inversion. According to Derflinger, Hörmann, and Leydold (Citation2010), it is “by far the fastest inversion method known”. If F is a CDF and F_a^{−1} an approximation of its inverse, PINV allows one to control the u-error, which is defined as
(1) sup_{0<u<1} ϵ_u(u) = sup_{0<u<1} |u − F(F_a^{−1}(u))|.

Another measure of the error is the x-error given by
(2) sup_{0<u<1} ϵ_x(u) = sup_{0<u<1} |F^{−1}(u) − F_a^{−1}(u)|.

One can show that
(3) ϵ_x(u) = ϵ_u(u)/f(F^{−1}(u)) + O(ϵ_u(u)²).

The setup step of PINV that computes the polynomial approximation to achieve the specified numerical precision (10^{−10} by default) is too slow to rely on this method in the varying parameter case where f_p depends on a shape parameter p. For example, even if the setup takes only 1 ms, more than 16 minutes would be required to perform the setup step for one million different parameters, which is obviously orders of magnitude longer than generating one million random variates with an efficient sampler for a fixed value of p.

The main idea is therefore to avoid a separate setup step for each parameter. As a first step, we apply a transformation T = T(·, p) to X: Y = T(X, p), where X has a density f_p on an interval (v_1, v_2) with CDF F_p. Let g_p/G_p denote the PDF/CDF of Y. Assuming the transformation is increasing in x (the case that it is decreasing works analogously) and differentiable, the relationships of the CDFs F_p and G_p and the densities f_p and g_p are
(4) G_p(x) = F_p(T^{−1}(x, p)), g_p(x) = (T^{−1})′(x, p)·f_p(T^{−1}(x, p)), x ∈ (T(v_1, p), T(v_2, p)),
where T^{−1} denotes the inverse of T w.r.t. x. The goal will be to find a transformation such that g_p(x) = c_p·g(x) on (T(v_1, p), T(v_2, p)), where g is a PDF that does not depend on p. Hence, the only dependence on p is through the domain (and the normalizing constant c_p, which is not an input to PINV). Thus, if G denotes the CDF corresponding to the density g, Y is distributed according to g conditional on Y ∈ (T(v_1, p), T(v_2, p)), and the inverse of its CDF is y ↦ G^{−1}(M_p·y + G(T(v_1, p))), where M_p = P[Y ∈ (T(v_1, p), T(v_2, p))] = G(T(v_2, p)) − G(T(v_1, p)). Let H denote the approximation of G^{−1} computed with PINV. Thus, in order to generate random variates for different parameters p_1, …, p_k, one needs to calculate the constants of the polynomial approximation H of only one inverse CDF in a setup step and to compute M_{p_i} for each parameter to generate the conditional variates. These can then be transformed back to the target distribution by applying the inverse of x ↦ T(x, p).

We summarize this approach in the following algorithm:

Algorithm 1

Inversion (varying parameter)

Require: n ∈ ℕ, p_1, …, p_n, density g, T and T^{−1}, v_1, v_2 (interval boundaries), tol (u-error tolerance).

Output: n random variates distributed according to f_{p_1}, …, f_{p_n}.

1: # Setup step

2: H = PINV(g) with maximal u-error of tol

3: # Generation of random variates

4: for i = 1 to n do

5: M = G(T(v_2, p_i)) − G(T(v_1, p_i))

6: Generate U uniformly distributed on [0, 1]

7: Y = H(M·U + G(T(v_1, p_i)))

8: X_i = T^{−1}(Y, p_i)

9: end for

10: return X_1, …, X_n
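Algorithm 1 is straightforward to express in code. The sketch below replaces H = PINV(g) by a closed-form inverse CDF (footnote 2 notes that any inversion method can be substituted) and uses a hypothetical example family: the transformation T(x, p) = p·x² on (0, 1) maps the density f_p(x) = 2px·exp(−px²)/(1 − exp(−p)) to a standard exponential conditioned on (0, p), so g does not depend on p:

```python
import math
import random

def varying_parameter_inversion(params, G, G_inv, T, T_inv, v1, v2, rng):
    """Sketch of Algorithm 1: a single inverse CDF G_inv plays the role of
    H = PINV(g); per parameter, only M_p and one CDF offset are needed."""
    out = []
    for p in params:
        a = G(T(v1, p))
        m = G(T(v2, p)) - a        # M_p = P[Y in (T(v1,p), T(v2,p))]
        u = rng.random()           # line 6: U ~ Uniform[0, 1]
        y = G_inv(m * u + a)       # line 7: conditional inversion
        out.append(T_inv(y, p))    # line 8: map back to the target scale
    return out

# Hypothetical example family: T(x, p) = p*x**2 on (0, 1) turns
# f_p(x) = 2*p*x*exp(-p*x**2)/(1 - exp(-p)) into a standard exponential
# conditioned on (0, p), so g(y) = exp(-y) does not depend on p.
G = lambda y: -math.expm1(-y)       # exponential CDF (plays the role of G)
G_inv = lambda u: -math.log1p(-u)   # its exact inverse (stands in for PINV(g))
T = lambda x, p: p * x * x
T_inv = lambda y, p: math.sqrt(y / p)

xs = varying_parameter_inversion([0.5, 1.0, 4.0], G, G_inv, T, T_inv,
                                 0.0, 1.0, random.Random(42))
```

A convenient correctness check is the round trip F_p(X) = G(T(X, p))/G(T(v_2, p)) = U, which holds exactly for this construction.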

Note that the algorithm is very easy to implement if an implementation of PINV is available.Footnote2 An open-source implementation of PINV can be found in the C library UNU.RAN (http://statmath.wu.ac.at/unuran/), in SciPy (release 1.8.0, Baumgarten and Patel Citation2022) and in the R package Runuran (Tirler and Leydold Citation2003).

The u-error tolerance is an input to Algorithm 1 which allows one to control the numerical precision when computing H = PINV(g) in line 2. H approximates the inverse of the CDF G of the untruncated distribution instead of the CDF G_p restricted to (T(v_1, p), T(v_2, p)), so the u-error is inflated by a factor 1/M_p:
(5) sup_{0<u<1} |u − G_p(H(M_p·u + G(T(v_1, p))))|
(6) = M_p^{−1}·sup_{0<u<1} |M_p·u + G(T(v_1, p)) − G(H(M_p·u + G(T(v_1, p))))|
(7) = M_p^{−1}·sup_{G(T(v_1, p)) < u < G(T(v_2, p))} |u − G(H(u))|

If the factor is not very large, one can still achieve the desired error tolerance ϵ by specifying a tolerance of M_p·ϵ when the approximation of the inverse CDF is computed (e.g., ϵ = 10^{−10} and 1/M_p = 100). However, if M_p becomes very small for certain values of p, the tolerance M_p·ϵ is no longer achievable since the numerical error cannot be made smaller than machine precision (e.g., ϵ = 10^{−13} and 1/M_p = 10⁴ with 64-bit precision). Hence, a refined approach is required to control the error in the varying parameter case. This needs to be investigated on a case-by-case basis, as we will see in the next section when the approach is applied to the ARGUS and alpha distributions.

3 Application of the approach

In this section, two continuous distributions are presented where the approach outlined in Section 2 can be applied. We first consider the ARGUS distribution in Section 3.1 before turning to the alpha distribution in Section 3.2.

3.1 The ARGUS distribution

The ARGUS distribution is a continuous probability distribution on the interval [0, 1] with probability density function (PDF) given by
(8) f(x, χ) = (χ³/(√(2π)·Ψ(χ)))·x·√(1−x²)·exp(−0.5χ²(1−x²)), x ∈ [0, 1], χ > 0,
where Ψ(χ) = Φ(χ) − χ·ϕ(χ) − 0.5, Φ is the cumulative distribution function (CDF) of the standard normal distribution and ϕ = Φ′ is the normal PDF. The CDF is F(x, χ) = 1 − Ψ(χ·√(1−x²))/Ψ(χ) for x ∈ [0, 1]. Note that the density becomes more concentrated around the mode of the distribution as χ increases, which makes accurate numerical inversion more difficult. The distribution is relevant in particle physics: it was introduced in Albrecht et al. (Citation1994), see also Pedlar et al. (Citation2011) and Lees et al. (2010) for examples of how the distribution is used in this context. The ARGUS distribution is implemented in the statistics module of the well-known open-source software SciPy (Virtanen et al. Citation2020) and in ROOT (https://root.cern.ch). While no sampling method is available in ROOT, the default method to generate random variates in SciPy relies on a root-finding procedure to invert the CDF of a distribution: generating a sample of just 1000 data points takes a few seconds, which is too slow for practical purposes. This motivates the derivation of a fast sampling approach for ARGUS random variates. However, the author is not aware of a specific sampling algorithm for this distribution in the literature (neither for the fixed nor for the varying parameter case).

A key observation is that the ARGUS distribution is related to the Gamma distribution: if X has an ARGUS distribution with parameter χ, let T(x, χ) = χ²(1−x²)/2. Then the density of Y = T(X, χ) ∈ [0, χ²/2] is given by
(9) √u·exp(−u)/(√π·Ψ(χ)), u ∈ [0, χ²/2].

One recognizes the density x^{p−1}·exp(−x)/Γ(p), x ≥ 0, of the Gamma distribution Γ(p) with parameter p = 1.5. Thus, Y is Γ(1.5) conditioned on [0, χ²/2]. The CDF of a Gamma distribution can be expressed in terms of the lower incomplete Gamma function. If p = 1.5 and erf denotes the error function, the CDF G simplifies to
(10) G(x) = erf(√x) − 2√x·exp(−x)/√π, x ≥ 0.

The inverse of the CDF of the Gamma distribution is implemented in software packages such as SciPy (scipy.special.gammaincinv). However, using it for sampling is very slow compared to the methods explained in this note, see the comment at the end of Section 4. Note that a comparison of the normalizing constants of the PDF in Equation (9) and the PDF of the Gamma distribution implies that G(χ²/2) = 2Ψ(χ) for all χ ≥ 0. Finally, a useful observation that one can verify using l’Hôpital’s rule or a Taylor expansion of Ψ is that
(11) lim_{χ→0} χ³/Ψ(χ) = 3√(2π).
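Both the identity G(χ²/2) = 2Ψ(χ) and Equation (10) are easy to check numerically; a minimal sketch using only the Python standard library (`math.erf`):

```python
import math

def Psi(chi):
    """Psi(chi) = Phi(chi) - chi*phi(chi) - 0.5 for the standard normal Phi, phi."""
    Phi = 0.5 * (1.0 + math.erf(chi / math.sqrt(2.0)))
    phi = math.exp(-0.5 * chi * chi) / math.sqrt(2.0 * math.pi)
    return Phi - chi * phi - 0.5

def G(x):
    """Gamma(1.5) CDF from Equation (10): erf(sqrt(x)) - 2*sqrt(x)*exp(-x)/sqrt(pi)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)
```

Evaluating G(χ²/2) − 2Ψ(χ) for a few values of χ confirms the identity to machine precision.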

In particular, Equation (11) implies that the density of the ARGUS distribution converges to g(x) = 3x·√(1−x²) as χ → 0. The distribution function of the limiting distribution is 1 − (1−x²)^{3/2}. It is easy to see that random variates of the limiting distribution can be generated as X = √(1 − V^{2/3}), where V is uniformly distributed on [0, 1].
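The closed-form sampler for the limiting distribution can be verified empirically; a small sketch comparing the empirical CDF of X = √(1 − V^{2/3}) with 1 − (1−x²)^{3/2}:

```python
import math
import random

def argus_limit_rvs(n, rng):
    """Sample the chi -> 0 limit of the ARGUS distribution
    (density 3*x*sqrt(1 - x**2)) via X = sqrt(1 - V**(2/3))."""
    return [math.sqrt(1.0 - rng.random() ** (2.0 / 3.0)) for _ in range(n)]

def limit_cdf(x):
    """CDF of the limiting distribution: 1 - (1 - x**2)**(3/2)."""
    return 1.0 - (1.0 - x * x) ** 1.5

sample = argus_limit_rvs(20000, random.Random(123))
```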

3.1.1 Numerical inversion for small parameters

If G denotes the CDF of the Gamma distribution with parameter p = 1.5 defined in Equation (10) and Y has a Gamma distribution conditional on Y ≤ χ²/2, then the inverse of its CDF is u ↦ G^{−1}(G(χ²/2)·u). Note that the error might be inflated by a factor 1/M_χ with M_χ = G(χ²/2). Recall that G(χ²/2) = 2Ψ(χ) and therefore, Equation (11) implies that M_χ ≈ χ³·√2/(3√π) as χ → 0. Thus, for small χ, the u-error will exceed the tolerance chosen for PINV.

To achieve high accuracy for small values of χ, note that one can approximate the Gamma(1.5) density conditioned on [0, χ²/2] by
(12) l(x) = 3√(2x)/χ³, x ∈ [0, χ²/2],
with CDF L(x) = 2√2·x^{3/2}/χ³ and inverse L^{−1}(u) = χ²·u^{2/3}/2 (for simplicity of notation, we write l, L and L^{−1} without stating the dependence on χ explicitly). If we use this approximation, one can show that the u-error is bounded by 3χ²/50 for χ small enough, see Appendix A. Using Equation (3) with l and L^{−1} as approximations of the conditional Gamma density and its inverse CDF, it follows that ϵ_x(u) ≈ 0.1χ⁴·u^{2/3}(1−u^{2/3}) as χ → 0. Thus, taking the supremum over u ∈ (0, 1), the x-error behaves like χ⁴/40 as χ → 0. The accuracy of the approximation can be improved substantially if one applies a single iteration of Newton’s method as follows:
(13) x_0 = L^{−1}(u) = χ²·u^{2/3}/2, x_1 = x_0 − (G_χ(x_0) − u)/G_χ′(x_0),
where G_χ(x) = G(x)/G(χ²/2) denotes the CDF of the conditional distribution.

To avoid expensive evaluations of G involving the error function, one can use the following approximation of x_1:
(14) x_1 ≈ −x_0²/2 − x_0³/6 + Σ_{k=0}^{3} (x_0^{k+1}/k!)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³).

Note that there is no need to approximate G(χ²/2) using Equation (A6) as it is computed anyway (line 9 in Algorithm 2). Horner’s scheme can be used to evaluate x_1 numerically. The derivation of Equation (14) can be found in Appendix A. Table 1 presents the u- and x-error estimated numerically for different values of χ.
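The following sketch illustrates the approximation x_0 and the Newton refinement for a small parameter; for clarity, it evaluates the exact G and G′ via `math.erf` instead of the series in Equation (14), so the refined error shown here excludes the series truncation error:

```python
import math

def G(x):
    """Gamma(1.5) CDF, Equation (10)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)

def dG(x):
    """G'(x): the unnormalized Gamma(1.5) density 2*sqrt(x)*exp(-x)/sqrt(pi)."""
    return 2.0 * math.sqrt(x) * math.exp(-x) / math.sqrt(math.pi)

def inv_conditional(u, chi, newton=True):
    """Approximate inverse of the Gamma(1.5) CDF conditioned on [0, chi**2/2]:
    x0 = L^{-1}(u) from Equation (13), optionally refined by one Newton step
    on G_chi = G/c (computed exactly here, not via the series in (14))."""
    c = G(chi * chi / 2.0)
    x0 = chi * chi * u ** (2.0 / 3.0) / 2.0
    if not newton or x0 == 0.0:
        return x0
    return x0 - (G(x0) / c - u) / (dG(x0) / c)

chi = 0.01
c = G(chi * chi / 2.0)
us = (0.1, 0.3, 0.5, 0.7, 0.9)
u_err_x0 = max(abs(u - G(inv_conditional(u, chi, newton=False)) / c) for u in us)
u_err_x1 = max(abs(u - G(inv_conditional(u, chi)) / c) for u in us)
```

For χ = 0.01, the u-error of x_0 stays below the bound 3χ²/50 from Appendix A, and a single Newton step reduces it by several orders of magnitude.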

Table 1 u-error and x-error (see (1) and (2)) for the approximations of the inverse CDF given by Equation (13) (x_0) and Equation (14) (x_1), estimated by taking the maximum over 100,000 randomly selected points in the interval (0, 1).

3.1.2 The algorithm for the varying parameter case

One can now combine the observations of Sections 2 and 3.1.1 to derive the final algorithm. If one aims to achieve a u-error of approximately 1010, one can proceed as follows:

  • For χ > 1, use PINV (applied to the Γ(1.5) density) with a bound on the u-error of 10^{−10}.

  • Let I_k = (10^{−(k+1)}, 10^{−k}] for k = 0, 1. On I_k, invert G_χ with χ = 10^{−k}, using a bound on the u-error of 10^{−13} (k = 0, 1).

  • For χ ≤ 10^{−2}, use x_0 from (13) and add one iteration of Newton’s method using (14) if χ > 10^{−5}.

For χ ∈ (1, ∞), the u-error is bounded by 10^{−10}/G(0.5) < 5.1·10^{−10}. Note that on the intervals I_0 and I_1, the u-error is below 10^{−10} in view of the following argument: for χ_1 > 0, if one denotes the approximation of the inverse of G_{χ_1} by G_a^{−1}, we can approximate the inverse CDF of G_{χ_0} by u ↦ G_a^{−1}(G_{χ_1}(χ_0²/2)·u) for χ_0 < χ_1. Following the same arguments as above, it is easy to show that the u-error is inflated at most by a factor of G(χ_1²/2)/G(χ_0²/2) ≤ (χ_1/χ_0)³. The latter ratio is below 1000 by the definition of the intervals I_k, which proves the claim since inversion with a u-error of 10^{−13} is used. Finally, for χ ≤ 0.01, the results in Table 1 indicate that the error is below 10^{−10}.

The complete algorithm for the varying parameter case is summarized in Algorithm 2.

Algorithm 2

Generating ARGUS random variates

Require: χ_1, …, χ_n > 0, n ∈ ℕ.

Output: n ARGUS random variates with parameters χ_1, …, χ_n.

1: # Setup step

2: H0 = PINV(density of Γ(1.5)) with tol = 1e-10

3: H1 = PINV(density of Γ(1.5) restricted to [0, 0.5]) with tol = 1e-13

4: H2 = PINV(density of Γ(1.5) restricted to [0, 0.005]) with tol = 1e-13

5: C1 = G(0.5) ⊳ G is defined in (10)

6: C2 = G(0.005)

7: # Generation of random variates

8: for i = 1 to n do

9: C = G(χ_i²/2)

10: Generate U uniformly distributed on [0,1]

11: if χ_i ≤ 0.01 then

12: Y = L^{−1}(U) ⊳ L^{−1} is defined in (13)

13: if χ_i > 10^{−5} then

14: perform one Newton iteration using (14)

15: end if

16: else if χ_i ≤ 0.1 then

17: Y = H2(C· U/C2)

18: else if χ_i ≤ 1 then

19: Y = H1(C· U/C1)

20: else

21: Y = H0(C· U)

22: end if

23: X_i = √(1 − 2Y/χ_i²)

24: end for

25: return X_1, …, X_n

Overall, the approach requires three setup steps for PINV. If a higher accuracy is desired, the approach can easily be adapted by dividing the interval (10^{−2}, 1] into more sub-intervals (at the expense of calculating additional approximations of inverse CDFs) and by improving the approximation for χ ≤ 10^{−2} (and/or by lowering that boundary).
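A compact sketch of the core path of Algorithm 2 may be helpful; here the PINV approximations H0–H2 are replaced by plain bisection on [0, χ²/2] (any inversion method may be substituted, cf. footnote 2), which demonstrates the logic but not the speed:

```python
import math
import random

def G(x):
    """Gamma(1.5) CDF, Equation (10)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)

def argus_rvs(chis, rng):
    """Core path of Algorithm 2 with PINV replaced by bisection
    (a slow but simple stand-in for the numerical inversion)."""
    out = []
    for chi in chis:
        c = G(chi * chi / 2.0)        # line 9: C = G(chi_i**2 / 2)
        target = c * rng.random()     # C * U
        lo, hi = 0.0, chi * chi / 2.0
        for _ in range(80):           # bisection for Y = G^{-1}(C * U)
            mid = 0.5 * (lo + hi)
            if G(mid) < target:
                lo = mid
            else:
                hi = mid
        y = 0.5 * (lo + hi)
        out.append(math.sqrt(1.0 - 2.0 * y / (chi * chi)))  # line 23
    return out

xs = argus_rvs([0.5, 1.0, 3.0], random.Random(7))
```

Since the generator is an inversion method, each variate X satisfies G(χ²(1−X²)/2)/G(χ²/2) = U, which gives a direct correctness check.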

3.2 The alpha distribution

The alpha distribution is a continuous probability distribution on [0, ∞) with density f_p(x) = x^{−2}·exp(−0.5(p − 1/x)²)/(Φ(p)·√(2π)) for x > 0 and p > 0, where Φ denotes the CDF of the standard normal distribution. Its CDF can be written as F_p(x) = Φ(p − 1/x)/Φ(p). It has applications in reliability analysis, see Sherif (Citation1983) and Salvia (Citation1985).

One can directly apply the approach presented in Section 2 with T(x, p) = p − 1/x for x ∈ (0, ∞), which transforms the alpha distribution to a standard normal distribution truncated to (−∞, p). Note that M_p = P[Y ∈ (−∞, p]] ∈ [0.5, 1] for all p > 0. Hence, the u-error is increased at most by a factor of 2, and no separate treatment for small values of p is required.
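Since Φ and Φ^{−1} are available in the Python standard library (`statistics.NormalDist`), the whole sampler fits in a few lines; a sketch:

```python
import random
from statistics import NormalDist

def alpha_rvs(ps, rng):
    """Sample the alpha distribution via the transformation of Section 2:
    T(x, p) = p - 1/x maps it to a standard normal truncated to (-inf, p),
    so M_p = Phi(p), Y = Phi^{-1}(Phi(p) * U) and X = 1/(p - Y)."""
    nd = NormalDist()
    out = []
    for p in ps:
        u = rng.random()
        y = nd.inv_cdf(nd.cdf(p) * u)   # conditional inversion of the normal CDF
        out.append(1.0 / (p - y))       # invert T
    return out

xs = alpha_rvs([0.5, 1.0, 2.0], random.Random(11))
```

Each variate X satisfies F_p(X) = Φ(p − 1/X)/Φ(p) = U, so the sampler can be verified by a round trip through the CDF.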

If PINV is applied to the normal density, a moderate speedup compared to using the inverse based on the error function can be achieved in Python and R: PINV is about 20% faster than the inverse CDF ndtri in scipy.special and 65% faster than qnorm in R.

4 Implementation

We test the speed of the algorithm presented in Section 3.1.2. The implementation of PINV relies on UNU.RAN which can be accessed from Python using Cython (Behnel et al. Citation2011) and from the programming language R via Runuran. The code is submitted as supplementary material to this article, and it is also available at https://github.com/chrisb83/ARGUS/.

Since the main objective is to demonstrate that Algorithm 2 has a very good performance in the varying parameter case, the analysis considers the following two cases:

  • For different values of χ, random parameter values are drawn uniformly from [0.99χ, 1.01χ]. As Algorithm 2 distinguishes the cases χ ≤ 10^{−5}, 10^{−5} < χ ≤ 0.01, 0.01 < χ ≤ 0.1, 0.1 < χ ≤ 1 and χ > 1, the values of χ are selected to ensure that each region is covered. Hence, this approach allows one to compare the performance of Algorithm 2 for a set of representative parameters.

  • To test the speed for larger ranges of parameters, the parameters are randomly selected from the interval (0, 10). While values for parameters larger than 10 can be generated easily, it should be noted that the ARGUS distribution becomes highly concentrated in a region close to 1 for large values of χ (this is evident from line 23 in Algorithm 2). Choosing a larger upper bound would therefore focus on parameters of limited practical relevance.Footnote3 Hence, the test focusses on parameters in (0, 10) to assess whether the performance is robust over a wide range of parameters.

The corresponding results are summarized in Table 2, showing that Algorithm 2 leads to very fast generation times in the varying parameter case both in Cython and in R. It leads to substantially faster generation times compared to relying on the inverse CDF of the gamma distribution that can be expressed as the inverse of the incomplete gamma function, reducing the runtime by more than 90% in Cython and 80% in R even if a fast implementation of the inverse CDF is used.Footnote4 If the default method to generate random variates in SciPy that relies on Brent’s root-finding algorithm to invert the CDF is used, generating a sample of 1000 data points already takes a few seconds. While this could be optimized, it is clear that the presented approach is substantially faster. The performance of Algorithm 2 is also shown to be robust with respect to the parameter χ.

Table 2 Time in milliseconds (ms) required to generate 1 million random variates of 1 million parameters randomly chosen in the interval [0.99χ,1.01χ] using a Cython and R implementation of Algorithm 2.

As both the Cython code and the R code use the same implementation of PINV from the C library UNU.RAN, one can expect that the results in Python and R are similar, though differences can arise because of a) the pseudo-uniform generators, b) the implementations of the incomplete Gamma function and c) overhead from interacting with the C functions in UNU.RAN. The latter point can be considered the main reason for the slower runtime in R as a strength of Cython is the fast integration with C.

Based on the numerical experiments, the u-error does not exceed the tolerance of order 10^{−10} stated in Section 3.1.2. The following code that implements the algorithms in the paper is attached as supplementary material:

  • the original Python and Cython code that was used to generate the results stated in the article,

  • a pure Python code relying on the implementation of PINV in SciPy 1.8.0.

In addition, the algorithm is implemented in the R package rargus.Footnote5

All tests were performed on Linux Ubuntu 20.04 running on a virtual machine using a single core of an Intel Core i7 1.8GHz processor and 8 GB RAM, Python 3.9, Cython 0.29, NumPy 1.25 and GCC 5.5, R version 4.3. The Cython implementation relies on Generator instead of the legacy RandomState in numpy.random.

5 Discussion

The trade-off between the setup and sampling speed is an important consideration when developing random variate generation methods (Devroye Citation1986, Section I.3). Numerical inversion of the CDF can be considered a suitable choice for a fixed parameter: the algorithm PINV introduced in Derflinger, Hörmann, and Leydold (Citation2010) leads to a very fast sampling method for large classes of distributions at the cost of an expensive setup. A challenging problem is the derivation of a fast algorithm that works in the varying parameter case where the step to aid the numerical inversion (e.g., by precomputing large tables) generally makes this method unattractive if the setup has to be rerun for many different parameters.

The main contributions of this article in this regard can be summarized as follows:

  • In Section 2, a new idea is presented to derive a generator for the varying parameter case via numerical inversion if a suitable transformation of the density can be found. In particular, the transformed density must depend on the shape parameter only via its domain, which makes it possible to sample from the transformed distribution by conditioning on a parameter-dependent interval and then mapping the random variates back to the original distribution.

  • We demonstrate how to apply the approach to the ARGUS and alpha distributions in Sections 3.1 and 3.2. Algorithm 2 for the ARGUS distribution can be used for all values of χ with little impact on performance. It relies on a detailed analysis of the numerical accuracy for small parameters. To the best of the author’s knowledge, no other specific algorithm is available for this distribution in the literature. The results in Section 4 demonstrate that the derived method is very fast in the varying parameter case.

The approach in Section 2 obviously hinges on finding a suitable transformation, which certainly is not possible for all distributions. Nevertheless, the insight that the inversion method can lead to a fast sampling algorithm in the varying parameter case will potentially be useful to tackle this problem for other distributions. For example, distributions like the Weibull (PDF x ↦ a·x^{a−1}·exp(−x^a) on [0, ∞) for a > 0) and the log-logistic (PDF x ↦ a·x^{a−1}/(1+x^a)² on [0, ∞) for a > 0) can also be transformed to simpler distributions that do not depend on a parameter (T(x, a) = x^a in both cases). However, the inverse CDF of the resulting distributions is very simple, so the application of PINV is not required.
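For completeness, a sketch of these two closed-form cases (the exponential inverse CDF −log(1−u) for the Weibull, and y = u/(1−u) for the transformed log-logistic):

```python
import math
import random

def weibull_rvs(a, n, rng):
    """Weibull(a) via T(x, a) = x**a: Y = X**a is standard exponential,
    so X = (-log(1 - U))**(1/a)."""
    return [(-math.log1p(-rng.random())) ** (1.0 / a) for _ in range(n)]

def loglogistic_rvs(a, n, rng):
    """Log-logistic via the same transformation: Y = X**a has CDF y/(1 + y),
    inverted in closed form by y = u/(1 - u)."""
    return [(u / (1.0 - u)) ** (1.0 / a)
            for u in (rng.random() for _ in range(n))]
```

In both cases the inverse CDF of the transformed distribution is elementary, so, as noted above, no numerical inversion is needed even in the varying parameter case.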

Another way to apply the approach is to start from a distribution g restricted to an interval and to specify a transformation. Using the notation of Section 2, the equivalent of (4) is
(15) F_p(x) = G_p(T(x, p)), f_p(x) = T′(x, p)·g_p(T(x, p)), x ∈ (v_1, v_2),
and the approach can be readily applied to the family of distributions F_p. For example, considering the Gamma(k) distribution with density g(x) = x^{k−1}·exp(−x)/Γ(k) on (0, ∞) instead of Gamma(1.5) restricted to [0, χ²/2] and applying the transformation T(x, χ) = χ²(1−x²)/2 from Section 3.1, the density of the resulting distribution is
f(x, χ, k) = C(χ, k)·x·(1−x²)^{k−1}·exp(−0.5χ²(1−x²)), x ∈ [0, 1], k, χ > 0,
a generalized version of the ARGUS distribution in (8). By construction, this distribution family has the advantage that it allows for random variate generation by inversion, even in the varying parameter case.

Disclosure statement

The author has no competing interests to declare that are relevant to the content of this article.

Data availability statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study. The code used for the performance/accuracy simulations is shared as supplementary material.

Additional information

Funding

No funding was received to assist with the preparation of this manuscript.

Notes

1 For example, “The approximation [of the inverse CDF] is valid for a given F: to use it when F changes frequently during the simulation experiment would probably require extraordinary set-up times.” (Devroye Citation1986, Section II.2.3)

2 In fact, PINV can be replaced by any numerical inversion method in the algorithm.

3 For example, extending the interval to (0, 100) would mean that on average, 90% of the sampled parameter values would be in the range (10, 100). If χ = 10, the mean of the distribution is 0.985 and the standard deviation is 0.0126. Hence, for values larger than 10, one would essentially generate almost identical values close to 1.

4 Generating ARGUS random variates by applying PINV to the Gamma density in the fixed parameter case takes approximately 20 ms for a sample size of one million, essentially independent of the value of χ, which underlines the strength of PINV.

References

  • Ahrens JH. 1995. A one-table method for sampling from continuous and discrete distributions. Computing 54(2):127–146.
  • Albrecht H, Hamacher T, Hofmann RP, Kirchhoff T, Mankel R, Nau A, Nowak S, Reßing D, Schröder H, Schulz HD, Walter M, et al. 1994. Measurement of the polarization in the decay B → J/ψ K*. Phys Lett B. 340(3):217–220.
  • Baumgarten C, Patel T. 2022. Automatic random variate generation in Python. In: Agarwal M, Calloway C, Niederhut D, Shupe D, editors. Proceedings of the 21st Python in Science Conference. p. 46–51.
  • Behnel S, Bradshaw R, Citro C, Dalcin L, Seljebotn DS, Smith K. 2011. Cython: the best of both worlds. Comput Sci Eng. 13(2):31–39.
  • Dagpunar J. 1988. Principles of random variate generation. Oxford: Oxford University Press.
  • Derflinger G, Hörmann W, Leydold J. 2010. Random variate generation by numerical inversion when only the density is known. ACM Trans Model Comput Simul (TOMACS) 20(4):1–25.
  • Devroye L. 1986. Non-uniform random variate generation. New York: Springer-Verlag.
  • Devroye L. 2014. Random variate generation for the generalized inverse Gaussian distribution. Stat Comput. 24(2):239–246.
  • Hörmann W. 1995. A rejection technique for sampling from T-concave distributions. ACM Trans Math Softw (TOMS) 21(2):182–193.
  • Hörmann W, Leydold J. 2003. Continuous random variate generation by fast numerical inversion. ACM Trans Modeling Comput Simul (TOMACS) 13(4):347–362.
  • Hörmann W, Leydold J. 2014. Generating generalized inverse Gaussian random variates. Stat Comput. 24(4):547–557.
  • Hörmann W, Leydold J, Derflinger G. 2004. Automatic nonuniform random variate generation. Berlin, Heidelberg: Springer.
  • Kastner G, Frühwirth-Schnatter S, Lopes HF. 2017. Efficient Bayesian inference for multivariate factor stochastic volatility models. J Comput Graph Stat 26(4):905–917.
  • Kinderman AJ, Monahan JF. 1977. Computer generation of random variables using the ratio of uniform deviates. ACM Trans Math Soft (TOMS) 3(3):257–260.
  • Lees J, Poireau V, Prencipe E, Tisserand V, Garra Tico J, Grauges E, Martinelli M, Palano A, Pappagallo M, Eigen G, et al. 2010. Search for charged lepton flavor violation in narrow Υ decays. Phys Rev Lett. 104(15):151802.
  • Leydold J. 2000. Automatic sampling with the ratio-of-uniforms method. ACM Trans Math Softw (TOMS) 26(1):78–98.
  • Lu Z, Song S, Yue Z, Wang J. 2008. Reliability sensitivity method by line sampling. Struct Saf. 30(6):517–532.
  • Pedlar TK, Cronin-Hennessy D, Hietala J, Dobbs S, Metreveli Z, Seth KK, Tomaradze A, Xiao T, Martin L, Powell A, Wilkinson G, et al. 2011. Observation of the h_c(1P) using e+e− collisions above the DD̄ threshold. Phys Rev Lett. 107:041803.
  • Salvia AA. 1985. Reliability application of the alpha distribution. IEEE Trans Reliab. 34(3):251–252.
  • Sherif Y. 1983. Models for accelerated life-testing. In: 1983 Proceedings of Annual Reliability and Maintainability Symposium.
  • Tirler G, Leydold J. 2003. Automatic non-uniform random variate generation in R. In: Proceedings of DSC. p. 2.
  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, et al. 2020. Scipy 1.0: fundamental algorithms for scientific computing in python. Nat Methods 17:261–272.
  • Wakefield J, Gelfand A, Smith A. 1991. Efficient generation of random variates via the ratio-of-uniforms method. Stat Comput 1(2):129–133.
  • Wild P, Gilks W. 1993. Algorithm AS 287: Adaptive rejection sampling from log-concave density functions. J R Stat Soc Ser C Appl Stat. 42(4):701–709.

Appendix A:

Auxiliary calculations for the ARGUS distribution

We present the technical details for the estimation of the u-error in Section 3.1.1 and the derivation of Equation (14).

One can estimate the u-error as follows:
(A1) |u − G(χ²u^{2/3}/2)/c_χ| ≈ |u − (a_0χ³u − a_1χ⁵u^{5/3})/c_χ|
(A2) = (u/c_χ)·|c_χ − a_0χ³ + a_1χ⁵u^{2/3}|
(A3) ≈ (u/c_χ)·(a_1χ⁵ − a_1χ⁵u^{2/3})
(A4) = a_1·(χ⁵/c_χ)·u(1 − u^{2/3})
(A5) ≈ (a_1/a_0)·χ²·u(1 − u^{2/3}) = 0.3χ²·u(1 − u^{2/3}),
where c_χ = G(χ²/2) and we have used repeatedly that
(A6) G(x²/2) = √2·x³/(3√π) − √2·x⁵/(10√π) + O(x⁷) = a_0x³ − a_1x⁵ + O(x⁷).

Since u(1 − u^{2/3}) ≤ 0.4·(3/5)^{3/2} = 0.1859… ≤ 0.2 for all u ∈ (0, 1), this shows that the u-error is bounded by 3χ²/50 for small χ.

To derive Equation (14), note that
(A7) x_1 = x_0 − (G(x_0) − u·G(χ²/2))/G′(x_0)
(A8) = x_0 − (G(x_0) − ((2x_0)^{3/2}/χ³)·G(χ²/2))/(2√(x_0)·exp(−x_0)/√π)
(A9) = x_0 − erf(√x_0)/(2√(x_0)·exp(−x_0)/√π) + 1 + ((2x_0)^{3/2}/χ³)·G(χ²/2)/(2√(x_0)·exp(−x_0)/√π)
(A10) ≈ x_0 − ((2√x_0 − 2x_0^{3/2}/3 + x_0^{5/2}/5)/√π)/(2√(x_0)·exp(−x_0)/√π) + 1 + √(2π)·x_0·G(χ²/2)·exp(x_0)/χ³
(A11) = 1 + x_0 − exp(x_0) + x_0·exp(x_0)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³)
(A12) ≈ −x_0²/2 − x_0³/6 + x_0·Σ_{k=0}^{3}(x_0^k/k!)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³),
where we used the definition of x_0 (i.e., u = L(x_0) = (2x_0)^{3/2}/χ³) in the second equality, an expansion of the error function in the first approximation, and an expansion of the exponential in the second.