Research Article

Random variate generation by fast numerical inversion in the varying parameter case

Article: 2279060 | Received 06 Jun 2023, Accepted 30 Oct 2023, Published online: 15 Nov 2023

Abstract

There are various general techniques to produce random variates of a probability distribution, such as the rejection method or numerical approaches to invert the cumulative distribution function (CDF). Some of these methods work in a black-box fashion (i.e., a single piece of code can sample from a relatively large class of distributions), which makes it easy to create generators even for nonstandard distributions. Numerical inversion has some desirable properties that make its application appealing. However, a setup step is typically required to compute an approximation of the inverse of the CDF. Hence, if the distribution depends on a shape parameter and small samples are required for many different parameters (varying parameter case), the cost of the setup step typically outweighs the marginal generation times of the small samples, rendering the inversion method very slow. This article presents a new approach that allows for the use of inversion in the varying parameter case, provided that a suitable transformation of the density can be found to avoid running the setup for every parameter. The method is applied to two distributions (the ARGUS and alpha distributions) to demonstrate that the performance is very good in the varying parameter case.

1 Introduction

For continuous distributions, various general methods have been developed, such as numerical inversion of the cumulative distribution function (Hörmann and Leydold Citation2003; Derflinger, Hörmann, and Leydold Citation2010), Transformed Density Rejection (Wild and Gilks Citation1993; Hörmann Citation1995), the Ratio-of-Uniforms (RoU) method (Kinderman and Monahan Citation1977) and variants thereof (Wakefield, Gelfand, and Smith Citation1991; Leydold Citation2000), as well as table methods (Ahrens Citation1995). Excellent comprehensive resources are Devroye (Citation1986), Dagpunar (Citation1988), and Hörmann, Leydold, and Derflinger (Citation2004). Some of these methods work in a black-box fashion (i.e., a single piece of code can sample from a relatively large class of distributions, e.g., based on the density or the CDF).

The inversion method is attractive for simulations. A few of its strengths (Derflinger, Hörmann, and Leydold Citation2010) are:

  • It is a very general method that works for any distribution whose CDF can be computed.

  • It preserves the structural properties of the uniform random numbers.

  • It allows efficient sampling from truncated distributions.

  • It is well suited for quasi-Monte Carlo methods, and it is essential for copula methods.

However, except for a few cases, the inverse of the CDF cannot be expressed using elementary functions, and special functions like the inverse of the Gamma function are expensive to evaluate. In such situations, the numerical inversion approach called PINV presented in Derflinger, Hörmann, and Leydold (Citation2010) is a suitable choice. A setup step is typically required to compute an approximation of the inverse of the CDF if a black-box method is applied. Hence, if the distribution depends on a shape parameter and small samples are required for many different parameters (varying parameter case), the cost of the setup step typically outweighs the marginal generation times of the small samples, rendering the inversion method very slow.Footnote1 For example, even if the setup takes only one millisecond, it would take more than 16 minutes to run the setup for one million different parameters, which is obviously orders of magnitude longer than generating one million random variates with an efficient sampler for a fixed value of the parameter. For the varying parameter case, no fast automatic method is known, and for most distributions with shape parameters, the varying parameter case results in much slower generation methods than the fixed parameter case. Being able to handle the varying parameter case is important in some applications such as Gibbs sampling in the context of Bayesian statistics (see (Hörmann, Leydold, and Derflinger Citation2004, Section 15.2) for examples and further references). The importance of the tradeoff between the speed of sampling and the setup is also discussed in (Devroye Citation1986, Section I.3): “The new ingredient for multi-parameter families is the set-up time, that is, the time spent computing constants that depend only upon the parameters of the distribution.” The cost of the setup is always considered in this reference.
For example, for the Gamma distribution, “small or nonexistant setup times” (Devroye Citation1986, Section IX.3.2) are considered a desirable feature, and various approaches to cover the fixed and varying parameter case are discussed in the cited section. For the generalized Gaussian distribution, algorithms with a fast setup are studied in Hörmann and Leydold (Citation2014) and Devroye (Citation2014). The sampling approach of Hörmann and Leydold (Citation2014) is used in Kastner, Frühwirth-Schnatter, and Lopes (Citation2017) in the context of Bayesian inference, where MCMC sampling is applied to simulate from distributions with time-varying parameters in stochastic volatility models. Another application where the parameters of a distribution are varied is sensitivity analysis; see Lu et al. (Citation2008) for an example.

This article presents a new approach that allows for the use of inversion in the varying parameter case if a suitable transformation of the density can be found. In particular, the transformed density must depend on the shape parameter only via its domain, which makes it possible to sample from the transformed distribution by conditioning on a parameter-dependent interval and then mapping the random variates back to the original distribution. Hence, one can benefit from the advantages of the inversion method even in the varying parameter case by avoiding running the expensive setup step many times. The method is applied to two distributions (the ARGUS and alpha distributions).

The remainder of this article is organized as follows: Section 2 presents the main idea to apply numerical inversion in the varying parameter case. Section 3 shows how to apply the method to two distributions: the ARGUS distribution in Section 3.1 and the alpha distribution in Section 3.2. The performance of the algorithm is analyzed in Section 4. We conclude with a brief discussion of the results in Section 5.

2 Inversion in the varying parameter case

Given the density f of a distribution, PINV(f) returns a function that is a numerical approximation of the inverse of the CDF. The algorithm is based on polynomial interpolation of the inverse CDF and Gauss-Lobatto integration. Its main strengths are that only the density is needed as an input and that the algorithm allows one to control the approximation error of the numerical inversion. According to Derflinger, Hörmann, and Leydold (Citation2010), it is “by far the fastest inversion method known”. If F is a CDF and F_a^{−1} an approximation of its inverse, PINV allows one to control the u-error, which is defined as
(1) sup_{0<u<1} ϵ_u(u) = sup_{0<u<1} |u − F(F_a^{−1}(u))|.

Another measure of the error is the x-error given by
(2) sup_{0<u<1} ϵ_x(u) = sup_{0<u<1} |F^{−1}(u) − F_a^{−1}(u)|.

One can show that
(3) ϵ_x(u) = ϵ_u(u)/f(F^{−1}(u)) + O(ϵ_u(u)²).

The setup step of PINV that computes the polynomial approximation to achieve the specified numerical precision (10^{−10} by default) is too slow to rely on this method in the varying parameter case where f_p depends on a shape parameter p. For example, even if the setup takes only 1 ms, more than 16 minutes would be required to perform the setup step for one million different parameters, which is obviously orders of magnitude longer than generating one million random variates with an efficient sampler for a fixed value of p.

The main idea is therefore to avoid a separate setup step for each parameter. As a first step, we apply a transformation T = T(·, p) to X: Y = T(X, p), where X has a density f_p on an interval (v_1, v_2) with CDF F_p. Let g_p/G_p denote the PDF/CDF of Y. Assuming the transformation is increasing in x (the case that it is decreasing works analogously) and differentiable, the relationships of the CDFs F_p and G_p and the densities f_p and g_p are
(4) G_p(x) = F_p(T^{−1}(x, p)), g_p(x) = (T^{−1})′(x, p)·f_p(T^{−1}(x, p)), x ∈ (T(v_1, p), T(v_2, p)),
where T^{−1} denotes the inverse of T w.r.t. x. The goal will be to find a transformation such that g_p(x) = c_p·g(x) on (T(v_1, p), T(v_2, p)), where g is a PDF that does not depend on p. Hence, the only dependence on p is through the domain (and the normalizing constant c_p, which is not an input to PINV). Thus, if G denotes the CDF corresponding to the density g, Y is distributed according to g conditional on Y ∈ (T(v_1, p), T(v_2, p)), and the inverse of its CDF is y ↦ G^{−1}(M_p·y + G(T(v_1, p))), where M_p = P[Y ∈ (T(v_1, p), T(v_2, p))] = G(T(v_2, p)) − G(T(v_1, p)). Let H denote the approximation of G^{−1} computed with PINV. Thus, in order to generate random variates for different parameters p_1, …, p_k, one needs to calculate the constants of the polynomial approximation H of only one inverse CDF in a setup step and to compute M_{p_i} for each parameter to generate the conditional variates. These can then be transformed back to the target distribution by applying the inverse of x ↦ T(x, p).

We summarize this approach in the following algorithm:

Algorithm 1

Inversion (varying parameter)

Require: n ∈ ℕ, p_1, …, p_n, density g, T and T^{−1}, v_1, v_2 (interval boundaries), tol (u-error tolerance).

Output: n random variates distributed according to f_{p_1}, …, f_{p_n}.

1: # Setup step

2: H = PINV(g) with maximal u-error of tol

3: # Generation of random variates

4: for i = 1 to n do

5: M = G(T(v_2, p_i)) − G(T(v_1, p_i))

6: Generate U uniformly distributed on [0, 1]

7: Y = H(M·U + G(T(v_1, p_i)))

8: X_i = T^{−1}(Y, p_i)

9: end for

10: return X_1, …, X_n
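Algorithm 1 is straightforward to express in code. The sketch below replaces H = PINV(g) by a closed-form inverse CDF (footnote 2 notes that any inversion method can be substituted) and uses a hypothetical example family: the transformation T(x, p) = p·x² on (0, 1) maps the density f_p(x) = 2px·exp(−px²)/(1 − exp(−p)) to a standard exponential conditioned on (0, p), so g does not depend on p:

```python
import math
import random

def varying_parameter_inversion(params, G, G_inv, T, T_inv, v1, v2, rng):
    """Sketch of Algorithm 1: a single inverse CDF G_inv plays the role of
    H = PINV(g); per parameter, only M_p and one CDF offset are needed."""
    out = []
    for p in params:
        a = G(T(v1, p))
        m = G(T(v2, p)) - a        # M_p = P[Y in (T(v1,p), T(v2,p))]
        u = rng.random()           # line 6: U ~ Uniform[0, 1]
        y = G_inv(m * u + a)       # line 7: conditional inversion
        out.append(T_inv(y, p))    # line 8: map back to the target scale
    return out

# Hypothetical example family: T(x, p) = p*x**2 on (0, 1) turns
# f_p(x) = 2*p*x*exp(-p*x**2)/(1 - exp(-p)) into a standard exponential
# conditioned on (0, p), so g(y) = exp(-y) does not depend on p.
G = lambda y: -math.expm1(-y)       # exponential CDF (plays the role of G)
G_inv = lambda u: -math.log1p(-u)   # its exact inverse (stands in for PINV(g))
T = lambda x, p: p * x * x
T_inv = lambda y, p: math.sqrt(y / p)

xs = varying_parameter_inversion([0.5, 1.0, 4.0], G, G_inv, T, T_inv,
                                 0.0, 1.0, random.Random(42))
```

A convenient correctness check is the round trip F_p(X) = G(T(X, p))/G(T(v_2, p)) = U, which holds exactly for this construction.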

Note that the algorithm is very easy to implement if an implementation of PINV is available.Footnote2 An open-source implementation of PINV can be found in the C library UNU.RAN (http://statmath.wu.ac.at/unuran/), in SciPy (release 1.8.0, Baumgarten and Patel Citation2022) and in the R package Runuran (Tirler and Leydold Citation2003).

The u-error tolerance is an input to Algorithm 1 which allows one to control the numerical precision when computing H = PINV(g) in line 2. H approximates the inverse of the CDF G of the untruncated distribution instead of the CDF G_p restricted to (T(v_1, p), T(v_2, p)), so the u-error is inflated by a factor 1/M_p:
(5) sup_{0<u<1} |u − G_p(H(M_p·u + G(T(v_1, p))))|
(6) = M_p^{−1}·sup_{0<u<1} |M_p·u + G(T(v_1, p)) − G(H(M_p·u + G(T(v_1, p))))|
(7) = M_p^{−1}·sup_{G(T(v_1, p)) < u < G(T(v_2, p))} |u − G(H(u))|

If the factor is not very large, one can still achieve the desired error tolerance ϵ by specifying a tolerance of M_p·ϵ when the approximation of the inverse CDF is computed (e.g., ϵ = 10^{−10} and 1/M_p = 100). However, if M_p becomes very small for certain values of p, the tolerance M_p·ϵ is no longer achievable since the numerical error cannot be made smaller than machine precision (e.g., ϵ = 10^{−13} and 1/M_p = 10⁴ with 64-bit precision). Hence, a refined approach is required to control the error in the varying parameter case. This needs to be investigated on a case-by-case basis, as we will see in the next section when the approach is applied to the ARGUS and alpha distributions.

3 Application of the approach

In this section, two continuous distributions are presented where the approach outlined in Section 2 can be applied. We first consider the ARGUS distribution in Section 3.1 before turning to the alpha distribution in Section 3.2.

3.1 The ARGUS distribution

The ARGUS distribution is a continuous probability distribution on the interval [0, 1] with probability density function (PDF) given by
(8) f(x, χ) = (χ³/(√(2π)·Ψ(χ)))·x·√(1−x²)·exp(−0.5χ²(1−x²)), x ∈ [0, 1], χ > 0,
where Ψ(χ) = Φ(χ) − χ·ϕ(χ) − 0.5, Φ is the cumulative distribution function (CDF) of the standard normal distribution and ϕ = Φ′ is the normal PDF. The CDF is F(x, χ) = 1 − Ψ(χ·√(1−x²))/Ψ(χ) for x ∈ [0, 1]. Note that the density becomes more concentrated around the mode of the distribution as χ increases, which makes accurate numerical inversion more difficult. The distribution is relevant in particle physics: it was introduced in Albrecht et al. (Citation1994), see also Pedlar et al. (Citation2011) and Lees et al. (2010) for examples of how the distribution is used in this context. The ARGUS distribution is implemented in the statistics module of the well-known open-source software SciPy (Virtanen et al. Citation2020) and in ROOT (https://root.cern.ch). While no sampling method is available in ROOT, the default method to generate random variates in SciPy relies on a root-finding procedure to invert the CDF of a distribution: generating a sample of just 1000 data points takes a few seconds, which is too slow for practical purposes. This motivates the derivation of a fast sampling approach for ARGUS random variates. However, the author is not aware of a specific sampling algorithm for this distribution in the literature (neither for the fixed nor for the varying parameter case).

A key observation is that the ARGUS distribution is related to the Gamma distribution: if X has an ARGUS distribution with parameter χ, let T(x, χ) = χ²(1−x²)/2. Then the density of Y = T(X, χ) ∈ [0, χ²/2] is given by
(9) √u·exp(−u)/(√π·Ψ(χ)), u ∈ [0, χ²/2].

One recognizes the density x^{p−1}·exp(−x)/Γ(p), x ≥ 0, of the Gamma distribution Γ(p) with parameter p = 1.5. Thus, Y is Γ(1.5) conditioned on [0, χ²/2]. The CDF of a Gamma distribution can be expressed in terms of the lower incomplete Gamma function. If p = 1.5 and erf denotes the error function, the CDF G simplifies to
(10) G(x) = erf(√x) − 2√x·exp(−x)/√π, x ≥ 0.

The inverse of the CDF of the Gamma distribution is implemented in software packages such as SciPy (scipy.special.gammaincinv). However, using it for sampling is very slow compared to the methods explained in this note, see the comment at the end of Section 4. Note that a comparison of the normalizing constants of the PDF in Equation (9) and the PDF of the Gamma distribution implies that G(χ²/2) = 2Ψ(χ) for all χ ≥ 0. Finally, a useful observation that one can verify using l’Hôpital’s rule or a Taylor expansion of Ψ is that
(11) lim_{χ→0} χ³/Ψ(χ) = 3√(2π).
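Both the identity G(χ²/2) = 2Ψ(χ) and Equation (10) are easy to check numerically; a minimal sketch using only the Python standard library (`math.erf`):

```python
import math

def Psi(chi):
    """Psi(chi) = Phi(chi) - chi*phi(chi) - 0.5 for the standard normal Phi, phi."""
    Phi = 0.5 * (1.0 + math.erf(chi / math.sqrt(2.0)))
    phi = math.exp(-0.5 * chi * chi) / math.sqrt(2.0 * math.pi)
    return Phi - chi * phi - 0.5

def G(x):
    """Gamma(1.5) CDF from Equation (10): erf(sqrt(x)) - 2*sqrt(x)*exp(-x)/sqrt(pi)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)
```

Evaluating G(χ²/2) − 2Ψ(χ) for a few values of χ confirms the identity to machine precision.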

In particular, Equation (11) implies that the density of the ARGUS distribution converges to g(x) = 3x·√(1−x²) as χ → 0. The distribution function of the limiting distribution is 1 − (1−x²)^{3/2}. It is easy to see that random variates of the limiting distribution can be generated as X = √(1 − V^{2/3}), where V is uniformly distributed on [0, 1].
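The closed-form sampler for the limiting distribution can be verified empirically; a small sketch comparing the empirical CDF of X = √(1 − V^{2/3}) with 1 − (1−x²)^{3/2}:

```python
import math
import random

def argus_limit_rvs(n, rng):
    """Sample the chi -> 0 limit of the ARGUS distribution
    (density 3*x*sqrt(1 - x**2)) via X = sqrt(1 - V**(2/3))."""
    return [math.sqrt(1.0 - rng.random() ** (2.0 / 3.0)) for _ in range(n)]

def limit_cdf(x):
    """CDF of the limiting distribution: 1 - (1 - x**2)**(3/2)."""
    return 1.0 - (1.0 - x * x) ** 1.5

sample = argus_limit_rvs(20000, random.Random(123))
```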

3.1.1 Numerical inversion for small parameters

If G denotes the CDF of the Gamma distribution with parameter p = 1.5 defined in Equation (10) and Y has a Gamma distribution conditional on Y ≤ χ²/2, then the inverse of its CDF is u ↦ G^{−1}(G(χ²/2)·u). Note that the error might be inflated by a factor 1/M_χ with M_χ = G(χ²/2). Recall that G(χ²/2) = 2Ψ(χ) and therefore, Equation (11) implies that M_χ ≈ χ³·√2/(3√π) as χ → 0. Thus, for small χ, the u-error will exceed the tolerance chosen for PINV.

To achieve high accuracy for small values of χ, note that one can approximate the Gamma(1.5) density conditioned on [0, χ²/2] by
(12) l(x) = 3√(2x)/χ³, x ∈ [0, χ²/2],
with CDF L(x) = 2√2·x^{3/2}/χ³ and inverse L^{−1}(u) = χ²·u^{2/3}/2 (for simplicity of notation, we write l, L and L^{−1} without stating the dependence on χ explicitly). If we use this approximation, one can show that the u-error is bounded by 3χ²/50 for χ small enough, see Appendix A. Using Equation (3) with l and L^{−1} as approximations of the conditional Gamma density and its inverse CDF, it follows that ϵ_x(u) ≈ 0.1χ⁴·u^{2/3}(1−u^{2/3}) as χ → 0. Thus, taking the supremum over u ∈ (0, 1), the x-error behaves like χ⁴/40 as χ → 0. The accuracy of the approximation can be improved substantially if one applies a single iteration of Newton’s method as follows:
(13) x_0 = L^{−1}(u) = χ²·u^{2/3}/2, x_1 = x_0 − (G_χ(x_0) − u)/G_χ′(x_0),
where G_χ(x) = G(x)/G(χ²/2) denotes the CDF of the conditional distribution.

To avoid expensive evaluations of G involving the error function, one can use the following approximation of x_1:
(14) x_1 ≈ −x_0²/2 − x_0³/6 + Σ_{k=0}^{3} (x_0^{k+1}/k!)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³).

Note that there is no need to approximate G(χ²/2) using Equation (A6) as it is computed anyway (line 9 in Algorithm 2). Horner’s scheme can be used to evaluate x_1 numerically. The derivation of Equation (14) can be found in Appendix A. Table 1 presents the u- and x-error estimated numerically for different values of χ.
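The following sketch illustrates the approximation x_0 and the Newton refinement for a small parameter; for clarity, it evaluates the exact G and G′ via `math.erf` instead of the series in Equation (14), so the refined error shown here excludes the series truncation error:

```python
import math

def G(x):
    """Gamma(1.5) CDF, Equation (10)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)

def dG(x):
    """G'(x): the unnormalized Gamma(1.5) density 2*sqrt(x)*exp(-x)/sqrt(pi)."""
    return 2.0 * math.sqrt(x) * math.exp(-x) / math.sqrt(math.pi)

def inv_conditional(u, chi, newton=True):
    """Approximate inverse of the Gamma(1.5) CDF conditioned on [0, chi**2/2]:
    x0 = L^{-1}(u) from Equation (13), optionally refined by one Newton step
    on G_chi = G/c (computed exactly here, not via the series in (14))."""
    c = G(chi * chi / 2.0)
    x0 = chi * chi * u ** (2.0 / 3.0) / 2.0
    if not newton or x0 == 0.0:
        return x0
    return x0 - (G(x0) / c - u) / (dG(x0) / c)

chi = 0.01
c = G(chi * chi / 2.0)
us = (0.1, 0.3, 0.5, 0.7, 0.9)
u_err_x0 = max(abs(u - G(inv_conditional(u, chi, newton=False)) / c) for u in us)
u_err_x1 = max(abs(u - G(inv_conditional(u, chi)) / c) for u in us)
```

For χ = 0.01, the u-error of x_0 stays below the bound 3χ²/50 from Appendix A, and a single Newton step reduces it by several orders of magnitude.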

Table 1 u-error and x-error (see (1) and (2)) for the approximations of the inverse CDF given by Equation (13) (x_0) and Equation (14) (x_1), estimated by taking the maximum over 100,000 randomly selected points in the interval (0, 1).

3.1.2 The algorithm for the varying parameter case

One can now combine the observations of Sections 2 and 3.1.1 to derive the final algorithm. If one aims to achieve a u-error of approximately 1010, one can proceed as follows:

  • For χ > 1, use PINV (applied to the Γ(1.5) density) with a bound on the u-error of 10^{−10}.

  • Let I_k = (10^{−(k+1)}, 10^{−k}] for k = 0, 1. On I_k, invert G_χ with χ = 10^{−k}, using a bound on the u-error of 10^{−13} (k = 0, 1).

  • For χ ≤ 10^{−2}, use x_0 from (13) and add one iteration of Newton’s method using (14) if χ > 10^{−5}.

For χ ∈ (1, ∞), the u-error is bounded by 10^{−10}/G(0.5) < 5.1·10^{−10}. Note that on the intervals I_0 and I_1, the u-error is below 10^{−10} in view of the following argument: for χ_1 > 0, if one denotes the approximation of the inverse of G_{χ_1} by G_a^{−1}, we can approximate the inverse CDF of G_{χ_0} by u ↦ G_a^{−1}(G_{χ_1}(χ_0²/2)·u) for χ_0 < χ_1. Following the same arguments as above, it is easy to show that the u-error is inflated at most by a factor of G(χ_1²/2)/G(χ_0²/2) ≤ (χ_1/χ_0)³. The latter ratio is below 1000 by the definition of the intervals I_k, which proves the claim since inversion with a u-error of 10^{−13} is used. Finally, for χ ≤ 0.01, the results in Table 1 indicate that the error is below 10^{−10}.

The complete algorithm for the varying parameter case is summarized in Algorithm 2.

Algorithm 2

Generating ARGUS random variates

Require: χ_1, …, χ_n > 0, n ∈ ℕ.

Output: n ARGUS random variates with parameters χ_1, …, χ_n.

1: # Setup step

2: H0 = PINV(density of Γ(1.5)) with tol = 1e-10

3: H1 = PINV(density of Γ(1.5) restricted to [0, 0.5]) with tol = 1e-13

4: H2 = PINV(density of Γ(1.5) restricted to [0, 0.005]) with tol = 1e-13

5: C1 = G(0.5) ⊳ G is defined in (10)

6: C2 = G(0.005)

7: # Generation of random variates

8: for i = 1 to n do

9: C = G(χ_i²/2)

10: Generate U uniformly distributed on [0,1]

11: if χ_i ≤ 0.01 then

12: Y = L^{−1}(U) ⊳ L^{−1} is defined in (13)

13: if χ_i > 10^{−5} then

14: perform one Newton iteration using (14)

15: end if

16: else if χ_i ≤ 0.1 then

17: Y = H2(C· U/C2)

18: else if χ_i ≤ 1 then

19: Y = H1(C· U/C1)

20: else

21: Y = H0(C· U)

22: end if

23: X_i = √(1 − 2Y/χ_i²)

24: end for

25: return X_1, …, X_n

Overall, the approach requires three setup steps for PINV. If a higher accuracy is desired, the approach can easily be adapted by dividing the interval (10^{−2}, 1] into more sub-intervals (at the expense of calculating additional approximations of inverse CDFs) and by improving the approximation for χ ≤ 10^{−2} (and/or by lowering that boundary).
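A compact sketch of the core path of Algorithm 2 may be helpful; here the PINV approximations H0–H2 are replaced by plain bisection on [0, χ²/2] (any inversion method may be substituted, cf. footnote 2), which demonstrates the logic but not the speed:

```python
import math
import random

def G(x):
    """Gamma(1.5) CDF, Equation (10)."""
    s = math.sqrt(x)
    return math.erf(s) - 2.0 * s * math.exp(-x) / math.sqrt(math.pi)

def argus_rvs(chis, rng):
    """Core path of Algorithm 2 with PINV replaced by bisection
    (a slow but simple stand-in for the numerical inversion)."""
    out = []
    for chi in chis:
        c = G(chi * chi / 2.0)        # line 9: C = G(chi_i**2 / 2)
        target = c * rng.random()     # C * U
        lo, hi = 0.0, chi * chi / 2.0
        for _ in range(80):           # bisection for Y = G^{-1}(C * U)
            mid = 0.5 * (lo + hi)
            if G(mid) < target:
                lo = mid
            else:
                hi = mid
        y = 0.5 * (lo + hi)
        out.append(math.sqrt(1.0 - 2.0 * y / (chi * chi)))  # line 23
    return out

xs = argus_rvs([0.5, 1.0, 3.0], random.Random(7))
```

Since the generator is an inversion method, each variate X satisfies G(χ²(1−X²)/2)/G(χ²/2) = U, which gives a direct correctness check.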

3.2 The alpha distribution

The alpha distribution is a continuous probability distribution on [0, ∞) with density f_p(x) = x^{−2}·exp(−0.5(p − 1/x)²)/(Φ(p)·√(2π)) for x > 0 and p > 0, where Φ denotes the CDF of the standard normal distribution. Its CDF can be written as F_p(x) = Φ(p − 1/x)/Φ(p). It has applications in reliability analysis, see Sherif (Citation1983) and Salvia (Citation1985).

One can directly apply the approach presented in Section 2 with T(x, p) = p − 1/x for x ∈ (0, ∞), which transforms the alpha distribution to a standard normal distribution truncated to (−∞, p). Note that M_p = P[Y ∈ (−∞, p]] ∈ [0.5, 1] for all p > 0. Hence, the u-error is increased at most by a factor of 2, and no separate treatment for small values of p is required.
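Since Φ and Φ^{−1} are available in the Python standard library (`statistics.NormalDist`), the whole sampler fits in a few lines; a sketch:

```python
import random
from statistics import NormalDist

def alpha_rvs(ps, rng):
    """Sample the alpha distribution via the transformation of Section 2:
    T(x, p) = p - 1/x maps it to a standard normal truncated to (-inf, p),
    so M_p = Phi(p), Y = Phi^{-1}(Phi(p) * U) and X = 1/(p - Y)."""
    nd = NormalDist()
    out = []
    for p in ps:
        u = rng.random()
        y = nd.inv_cdf(nd.cdf(p) * u)   # conditional inversion of the normal CDF
        out.append(1.0 / (p - y))       # invert T
    return out

xs = alpha_rvs([0.5, 1.0, 2.0], random.Random(11))
```

Each variate X satisfies F_p(X) = Φ(p − 1/X)/Φ(p) = U, so the sampler can be verified by a round trip through the CDF.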

If PINV is applied to the normal density, a moderate speedup compared to using the inverse based on the error function can be achieved in Python and R: PINV is about 20% faster than the inverse CDF ndtri in scipy.special and 65% faster than qnorm in R.

4 Implementation

We test the speed of the algorithm presented in Section 3.1.2. The implementation of PINV relies on UNU.RAN which can be accessed from Python using Cython (Behnel et al. Citation2011) and from the programming language R via Runuran. The code is submitted as supplementary material to this article, and it is also available at https://github.com/chrisb83/ARGUS/.

Since the main objective is to demonstrate that Algorithm 2 has a very good performance in the varying parameter case, the analysis considers the following two cases:

  • For different values of χ, random parameter values are drawn uniformly from [0.99χ, 1.01χ]. As Algorithm 2 distinguishes the cases χ ≤ 10^{−5}, 10^{−5} < χ ≤ 0.01, 0.01 < χ ≤ 0.1, 0.1 < χ ≤ 1 and χ > 1, the values of χ are selected to ensure that each region is covered. Hence, this approach allows one to compare the performance of Algorithm 2 for a set of representative parameters.

  • To test the speed for larger ranges of parameters, the parameters are randomly selected from the interval (0, 10). While values for parameters larger than 10 can be generated easily, it should be noted that the ARGUS distribution becomes highly concentrated in a region close to 1 for large values of χ (this is evident from line 23 in Algorithm 2). Choosing a larger upper bound would therefore focus on parameters of limited practical relevance.Footnote3 Hence, the test focusses on parameters in (0, 10) to assess whether the performance is robust over a wide range of parameters.

The corresponding results are summarized in Table 2, showing that Algorithm 2 leads to very fast generation times in the varying parameter case both in Cython and in R. It leads to substantially faster generation times compared to relying on the inverse CDF of the gamma distribution that can be expressed as the inverse of the incomplete gamma function, reducing the runtime by more than 90% in Cython and 80% in R even if a fast implementation of the inverse CDF is used.Footnote4 If the default method to generate random variates in SciPy that relies on Brent’s root-finding algorithm to invert the CDF is used, generating a sample of 1000 data points already takes a few seconds. While this could be optimized, it is clear that the presented approach is substantially faster. The performance of Algorithm 2 is also shown to be robust with respect to the parameter χ.

Table 2 Time in milliseconds (ms) required to generate 1 million random variates of 1 million parameters randomly chosen in the interval [0.99χ,1.01χ] using a Cython and R implementation of Algorithm 2.

As both the Cython code and the R code use the same implementation of PINV from the C library UNU.RAN, one can expect that the results in Python and R are similar, though differences can arise because of a) the pseudo-uniform generators, b) the implementations of the incomplete Gamma function and c) overhead from interacting with the C functions in UNU.RAN. The latter point can be considered the main reason for the slower runtime in R as a strength of Cython is the fast integration with C.

Based on the numerical experiments, the u-error does not exceed the tolerance of order 10^{−10} stated in Section 3.1.2. The following code that implements the algorithms in the paper is attached as supplementary material:

  • the original Python and Cython code that was used to generate the results stated in the article,

  • a pure Python code relying on the implementation of PINV in SciPy 1.8.0.

In addition, the algorithm is implemented in the R package rargus.Footnote5

All tests were performed on Linux Ubuntu 20.04 running on a virtual machine using a single core of an Intel Core i7 1.8GHz processor and 8 GB RAM, Python 3.9, Cython 0.29, NumPy 1.25 and GCC 5.5, R version 4.3. The Cython implementation relies on Generator instead of the legacy RandomState in numpy.random.

5 Discussion

The trade-off between the setup and sampling speed is an important consideration when developing random variate generation methods (Devroye Citation1986, Section I.3). Numerical inversion of the CDF can be considered a suitable choice for a fixed parameter: the algorithm PINV introduced in Derflinger, Hörmann, and Leydold (Citation2010) leads to a very fast sampling method for large classes of distributions at the cost of an expensive setup. A challenging problem is the derivation of a fast algorithm that works in the varying parameter case where the step to aid the numerical inversion (e.g., by precomputing large tables) generally makes this method unattractive if the setup has to be rerun for many different parameters.

The main contributions of this article in this regard can be summarized as follows:

  • In Section 2, a new idea is presented to derive a generator for the varying parameter case via numerical inversion if a suitable transformation of the density can be found. In particular, the transformed density must depend on the shape parameter only via its domain, which makes it possible to sample from the transformed distribution by conditioning on a parameter-dependent interval and then mapping the random variates back to the original distribution.

  • We demonstrate how to apply the approach to the ARGUS and alpha distributions in Sections 3.1 and 3.2. Algorithm 2 for the ARGUS distribution can be used for all values of χ with little impact on performance. It relies on a detailed analysis of the numerical accuracy for small parameters. To the best of the author’s knowledge, no other specific algorithm is available for this distribution in the literature. The results in Section 4 demonstrate that the derived method is very fast in the varying parameter case.

The approach in Section 2 obviously hinges on finding a suitable transformation, which certainly is not possible for all distributions. Nevertheless, the insight that the inversion method can lead to a fast sampling algorithm in the varying parameter case will potentially be useful to tackle this problem for other distributions. For example, distributions like the Weibull (PDF x ↦ a·x^{a−1}·exp(−x^a) on [0, ∞) for a > 0) and the log-logistic (PDF x ↦ a·x^{a−1}/(1+x^a)² on [0, ∞) for a > 0) can also be transformed to simpler distributions that do not depend on a parameter (T(x, a) = x^a in both cases). However, the inverse CDF of the resulting distributions is very simple, so the application of PINV is not required.
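For completeness, a sketch of these two closed-form cases (the exponential inverse CDF −log(1−u) for the Weibull, and y = u/(1−u) for the transformed log-logistic):

```python
import math
import random

def weibull_rvs(a, n, rng):
    """Weibull(a) via T(x, a) = x**a: Y = X**a is standard exponential,
    so X = (-log(1 - U))**(1/a)."""
    return [(-math.log1p(-rng.random())) ** (1.0 / a) for _ in range(n)]

def loglogistic_rvs(a, n, rng):
    """Log-logistic via the same transformation: Y = X**a has CDF y/(1 + y),
    inverted in closed form by y = u/(1 - u)."""
    return [(u / (1.0 - u)) ** (1.0 / a)
            for u in (rng.random() for _ in range(n))]
```

In both cases the inverse CDF of the transformed distribution is elementary, so, as noted above, no numerical inversion is needed even in the varying parameter case.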

Another way to apply the approach is to start from a distribution g restricted to an interval and to specify a transformation. Using the notation of Section 2, the equivalent of (4) is
(15) F_p(x) = G_p(T(x, p)), f_p(x) = T′(x, p)·g_p(T(x, p)), x ∈ (v_1, v_2),
and the approach can be readily applied to the family of distributions F_p. For example, considering the Gamma(k) distribution with density g(x) = x^{k−1}·exp(−x)/Γ(k) on (0, ∞) instead of Gamma(1.5) restricted to [0, χ²/2] and applying the transformation T(x, χ) = χ²(1−x²)/2 from Section 3.1, the density of the resulting distribution is
f(x, χ, k) = C(χ, k)·x·(1−x²)^{k−1}·exp(−0.5χ²(1−x²)), x ∈ [0, 1], k, χ > 0,
a generalized version of the ARGUS distribution in (8). By construction, this distribution family has the advantage that it allows for random variate generation by inversion, even in the varying parameter case.

Disclosure statement

The author has no competing interests to declare that are relevant to the content of this article.

Data availability statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study. The code used for the performance/accuracy simulations is shared as supplementary material.

Additional information

Funding

No funding was received to assist with the preparation of this manuscript.

Notes

1 For example, “The approximation [of the inverse CDF] is valid for a given F: to use it when F changes frequently during the simulation experiment would probably require extraordinary set-up times.” (Devroye Citation1986, Section II.2.3)

2 In fact, PINV can be replaced by any numerical inversion method in the algorithm.

3 For example, extending the interval to (0, 100) would mean that on average, 90% of the sampled parameter values would be in the range (10, 100). If χ = 10, the mean of the distribution is 0.985 and the standard deviation is 0.0126. Hence, for values larger than 10, one would essentially generate almost identical values close to 1.

4 Generating ARGUS random variates by applying PINV to the Gamma density in the fixed parameter case takes approximately 20 ms for a sample size of one million, essentially independent of the value of χ, which underlines the strength of PINV.

References

  • Ahrens JH. 1995. A one-table method for sampling from continuous and discrete distributions. Computing 54(2):127–146.
  • Albrecht H, Hamacher T, Hofmann RP, Kirchhoff T, Mankel R, Nau A, Nowak S, Reßing D, Schröder H, Schulz HD, Walter M, et al. 1994. Measurement of the polarization in the decay B → J/ψ K*. Phys Lett B. 340(3):217–220.
  • Baumgarten C, Patel T. 2022. Automatic random variate generation in Python. In: Agarwal M, Calloway C, Niederhut D, Shupe D, editors. Proceedings of the 21st Python in Science Conference. p. 46–51.
  • Behnel S, Bradshaw R, Citro C, Dalcin L, Seljebotn DS, Smith K. 2011. Cython: the best of both worlds. Comput Sci Eng. 13(2):31–39.
  • Dagpunar J. 1988. Principles of random variate generation. Oxford: Oxford University Press.
  • Derflinger G, Hörmann W, Leydold J. 2010. Random variate generation by numerical inversion when only the density is known. ACM Trans Model Comput Simul (TOMACS) 20(4):1–25.
  • Devroye L. 1986. Non-uniform random variate generation. New York: Springer-Verlag.
  • Devroye L. 2014. Random variate generation for the generalized inverse Gaussian distribution. Stat Comput. 24(2):239–246.
  • Hörmann W. 1995. A rejection technique for sampling from T-concave distributions. ACM Trans Math Softw (TOMS) 21(2):182–193.
  • Hörmann W, Leydold J. 2003. Continuous random variate generation by fast numerical inversion. ACM Trans Modeling Comput Simul (TOMACS) 13(4):347–362.
  • Hörmann W, Leydold J. 2014. Generating generalized inverse Gaussian random variates. Stat Comput. 24(4):547–557.
  • Hörmann W, Leydold J, Derflinger G. 2004. Automatic nonuniform random variate generation. Berlin, Heidelberg: Springer.
  • Kastner G, Frühwirth-Schnatter S, Lopes HF. 2017. Efficient Bayesian inference for multivariate factor stochastic volatility models. J Comput Graph Stat 26(4):905–917.
  • Kinderman AJ, Monahan JF. 1977. Computer generation of random variables using the ratio of uniform deviates. ACM Trans Math Soft (TOMS) 3(3):257–260.
  • Lees J, Poireau V, Prencipe E, Tisserand V, Garra Tico J, Grauges E, Martinelli M, Palano A, Pappagallo M, Eigen G, et al. 2010. Search for charged lepton flavor violation in narrow Υ decays. Phys Rev Lett. 104(15):151802.
  • Leydold J. 2000. Automatic sampling with the ratio-of-uniforms method. ACM Trans Math Softw (TOMS) 26(1):78–98.
  • Lu Z, Song S, Yue Z, Wang J. 2008. Reliability sensitivity method by line sampling. Struct Saf. 30(6):517–532.
  • Pedlar TK, Cronin-Hennessy D, Hietala J, Dobbs S, Metreveli Z, Seth KK, Tomaradze A, Xiao T, Martin L, Powell A, Wilkinson G, et al. 2011. Observation of the h_c(1P) using e+e− collisions above the DD̄ threshold. Phys Rev Lett. 107:041803.
  • Salvia AA. 1985. Reliability application of the alpha distribution. IEEE Trans Reliab. 34(3):251–252.
  • Sherif Y. 1983. Models for accelerated life-testing. In: 1983 Proceedings of Annual Reliability and Maintainability Symposium.
  • Tirler G, Leydold J. 2003. Automatic non-uniform random variate generation in R. In: Proceedings of DSC. p. 2.
  • Virtanen P, Gommers R, Oliphant TE, Haberland M, Reddy T, Cournapeau D, Burovski E, Peterson P, Weckesser W, Bright J, et al. 2020. Scipy 1.0: fundamental algorithms for scientific computing in python. Nat Methods 17:261–272.
  • Wakefield J, Gelfand A, Smith A. 1991. Efficient generation of random variates via the ratio-of-uniforms method. Stat Comput 1(2):129–133.
  • Wild P, Gilks W. 1993. Algorithm AS 287: Adaptive rejection sampling from log-concave density functions. J R Stat Soc Ser C Appl Stat. 42(4):701–709.

Appendix A:

Auxiliary calculations for the ARGUS distribution

We present the technical details for the estimation of the u-error in Section 3.1.1 and the derivation of Equation (14).

One can estimate the u-error as follows:
(A1) |u − G(χ²u^{2/3}/2)/c_χ| ≈ |u − (a_0χ³u − a_1χ⁵u^{5/3})/c_χ|
(A2) = (u/c_χ)·|c_χ − a_0χ³ + a_1χ⁵u^{2/3}|
(A3) ≈ (u/c_χ)·(a_1χ⁵ − a_1χ⁵u^{2/3})
(A4) = a_1·(χ⁵/c_χ)·u(1 − u^{2/3})
(A5) ≈ (a_1/a_0)·χ²·u(1 − u^{2/3}) = 0.3χ²·u(1 − u^{2/3}),
where c_χ = G(χ²/2) and we have used repeatedly that
(A6) G(x²/2) = √2·x³/(3√π) − √2·x⁵/(10√π) + O(x⁷) = a_0x³ − a_1x⁵ + O(x⁷).

Since u(1 − u^{2/3}) ≤ 0.4·(3/5)^{3/2} = 0.1859… ≤ 0.2 for all u ∈ (0, 1), this shows that the u-error is bounded by 3χ²/50 for small χ.

To derive Equation (14), note that
(A7) x_1 = x_0 − (G(x_0) − u·G(χ²/2))/G′(x_0)
(A8) = x_0 − (G(x_0) − ((2x_0)^{3/2}/χ³)·G(χ²/2))/(2√(x_0)·exp(−x_0)/√π)
(A9) = x_0 − erf(√x_0)/(2√(x_0)·exp(−x_0)/√π) + 1 + ((2x_0)^{3/2}/χ³)·G(χ²/2)/(2√(x_0)·exp(−x_0)/√π)
(A10) ≈ x_0 − ((2√x_0 − 2x_0^{3/2}/3 + x_0^{5/2}/5)/√π)/(2√(x_0)·exp(−x_0)/√π) + 1 + √(2π)·x_0·G(χ²/2)·exp(x_0)/χ³
(A11) = 1 + x_0 − exp(x_0) + x_0·exp(x_0)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³)
(A12) ≈ −x_0²/2 − x_0³/6 + x_0·Σ_{k=0}^{3}(x_0^k/k!)·(1/3 − x_0/10 + √(2π)·G(χ²/2)/χ³),
where we used the definition of x_0 (i.e., u = L(x_0) = (2x_0)^{3/2}/χ³) in the second equality, an expansion of the error function in the first approximation, and an expansion of the exponential in the second.