ABSTRACT
This paper proposes a functional coefficient quantile regression model with heterogeneous and time-varying regression coefficients and factor loadings. Estimation of the model coefficients is done in two stages. First, we estimate the unobserved common factors from a linear factor model with exogenous covariates. Second, we plug in an affine transformation of the estimated common factors to obtain the functional coefficient quantile regression model. The quantile parameter estimators are consistent and asymptotically normal. The application of this model to the quantile process of a cross-section of U.S. firms’ excess returns confirms the predictive ability of firm-specific covariates and the good performance of the local estimator of the heterogeneous and time-varying quantile coefficients.
1. Introduction
In a series of influential papers, Bai and Ng (2002) and Bai (2003, 2009) developed a general methodology for explaining economic and financial variables by a few common factors. Factor models allow for a drastic reduction of the cross-sectional dimension of a panel while providing a flexible way to summarize information from large data sets; see Pesaran (2006). In the literature on factor models it is common to assume a vector of constant factor loadings. This assumption is, however, rather restrictive. To the best of our knowledge, Eichler et al. (2011) is the first study to use time-varying loadings in a dynamic model with non-stationary time series. Bates et al. (2013) is another influential analysis that contributes to the idea of smooth changes in factor loadings. Su and Wang (2017) propose a local version of the principal component method using smoothly changing loadings, while Pelger and Xiong (2019) allow them to be state-dependent. In this setting the unobserved factor structure is thus allowed to vary over time.
Another area of major interest in recent years is the study of the quantile process. Quantile regression (QR) has been studied extensively in both theoretical and empirical studies; see Koenker and Bassett (1978), Portnoy (1991), Chaudhuri et al. (1997), Koenker and Machado (1999), He and Zhu (2003), Koenker and Xiao (2006). This work has been recently extended to accommodate the presence of dynamics in the quantile coefficients; see Wei and He (2006) and Kim (2007). A more general approach that also allows for dynamics in the quantile parameters is based on nonparametric and semiparametric estimation methods for dynamic smooth coefficient models; see De Gooijer and Zerom (2003), Yu and Lu (2004), Horowitz and Lee (2005), and more recently, Cai and Xu (2008) and Cai and Xiao (2012). Building on this work, recent contributions by Ando and Bai (2020), Chen et al. (2021) and Ma et al. (2021) have extended quantile regression models to incorporate unobserved common factors. These models consider heterogeneous quantile effects that introduce much flexibility to the specification of factor models by capturing the presence of heterogeneity in the effect of observable covariates and unobserved factors at different quantiles.
The current paper combines both approaches by considering a factor model with a time-varying factor loadings structure in a quantile heterogeneity framework with varying coefficients. The idea is to propose a flexible panel data model that is general enough to encompass unobserved heterogeneity arising from unobserved factors and quantile-indexed responses together in a dynamic setting. This is done in two stages. First, we propose a factor model for the mean process that includes observable regressors and unobservable factors. This model allows for heterogeneity across individuals and dynamics in the regression coefficients. By doing so, we extend standard factor model specifications that assume slope homogeneity in the observable regressors as in Bai (2003, 2009) and slope heterogeneity as in Song (2013) and Ando and Bai (2015). As a salient feature, the model also entertains dynamics in the factor loadings. Second, we extend the model to describe the quantile process. The slope coefficients associated with the observable regressors in the quantile model face three different types of variation: heterogeneity across quantiles, individuals, and over time. The factor loadings accommodate heterogeneity across individuals and over time. Estimation of the model coefficients (quantile factors, quantile regression coefficients and factor loadings) is done in two stages. In the first stage, we estimate the unobservable common factors from a linear factor model with exogenous covariates. We adapt the principal component analysis introduced in Bai (2009) to a local setting using kernel estimation methods (see also Su and Wang (2017)) to simultaneously estimate the local common factors, factor loadings and slope coefficients associated with the observable regressors. In contrast to Su and Wang (2017), our model also accommodates the presence of observable regressors.
In order to estimate the quantile common factors, a fundamental assumption in our modelling framework is that these quantities are quantile-specific affine transformations of the factors obtained from the mean process in the first stage. In this regard, our model specification lies between the approximate factor models that only consider mean-shifting factors to describe quantile effects and the idiosyncratic quantile factor models in which the factors are estimated separately for each quantile using an iterative procedure; see Ando and Bai (2020), Chen et al. (2021) and Ma et al. (2021). By doing so, our quantile factors become observable covariates in the quantile process studied in the second stage.
The estimation of the parameters in our model relies on the nonparametric quantile estimation method for dynamic smooth coefficients introduced in Cai and Xu (2008) and the semiparametric approach proposed in Cai and Xiao (2012) for models with partially varying coefficients. Our proposed methodology is also framed within the recent literature on QR models with an unobserved factor structure. Harding and Lamarche (2014) propose a quantile common correlated effects estimator for homogeneous panel data with endogenous regressors. The authors assume a parametric approach and time-invariant factor loadings, where the way of recovering the latent factors is different from ours.
Inclusion of estimated quantities in regression models may affect the asymptotic distribution of the parameter estimates; see Pagan (1984). This observation is essential in our context, characterized by a quantile factor model with estimated factors. In principle, the inclusion of such covariates into the quantile model has effects on the asymptotic distribution of the quantile parameter estimates. We show that this is not the case under standard panel data assumptions, that is, if both and diverge to infinity such that , with a bandwidth parameter. We derive the asymptotic distribution of the regression parameter estimates associated with the observable covariates for the mean and quantile models, and of the estimated factors and quantile factor loadings.
A Monte Carlo simulation exercise studies the finite-sample performance (bias and mean square error) of two estimators of the slope coefficients that are based on our two-stage procedure. The first estimator considers time-varying factor loadings using the local estimation procedure developed in this paper. In this case we estimate individual-specific coefficients for all . The second estimator considers a model with time-invariant loadings. In this case we do not impose the time-varying local estimation procedure and estimate, instead, a unique set of parameters for all . This global factor estimator uses the iterative process of Ando and Bai (2015). The simulation exercise confirms the consistency of our local two-stage estimation procedure and provides empirical support to our methodology for estimating heterogeneous and time-varying quantile regression coefficients and factor loadings.
This novel quantile factor model is applied to explain the distributional risk premia for a cross-section of excess returns. To do this, we fit the model to different quantiles of the distribution for a cross-section of annual U.S. firms’ asset returns. We consider firm-specific covariates as pricing factors and allow for the presence of two unobserved factors.
The remainder of the paper proceeds as follows. Section 2 introduces the time-varying quantile factor model and describes the estimation procedure based on local principal components and QR. Section 3 presents the asymptotic properties of the parameter estimators. Section 4 presents a Monte Carlo simulation exercise that evaluates the finite-sample performance of our estimation procedure, focusing on bias and mean square error. Section 5 illustrates the suitability of the quantile factor model with exogenous covariates in an empirical asset pricing framework. The final section provides concluding remarks. An Appendix contains the mathematical proofs of the main results of the study. Tables and figures are collected in a second Appendix.
Notation. Let and be the sets of time periods and individual indices, respectively. The Frobenius norm is defined as $\|A\| = [\operatorname{tr}(A^{\top}A)]^{1/2}$, with $\operatorname{tr}(\cdot)$ denoting the trace of a matrix and $A^{\top}$ the transpose of $A$.
2. Time-varying quantile factor models
2.1. Identification of the quantile factors and factor loadings
Let be an outcome variable of interest and be a vector of observable covariates, including a constant. Similarly, is the vector of unobservable common quantile factors indexed by where, for simplicity, is assumed to be equal across . We consider the following quantile process conditional on and , given by
for a given , where , with , is the vector of quantile slope coefficients associated with the observable regressors. Similarly, , with , are the loadings associated with the quantile factors . Here the factors are assumed to be -specific. Both and are assumed to be continuously differentiable smooth functions; see Cai (2007) for similar assumptions in a model with observable covariates.
We impose the following assumption for the identification of the quantile factors.
Assumption A.1
The conditional mean model satisfies
with the slope coefficients for the conditional mean process; the vector of common factors affecting the conditional mean, and the associated factor loadings.
The quantile common factors satisfy
with for all .
Assumption A.1 ii) implies that the quantile factors are location shifts of the vector of factors for the mean process. Under A.1, we can identify the quantile factors and the quantile factor loadings from the following quantile regression model:
with . Identification of the quantile parameters is possible if we condition on the vector and . The additional component determines that the constant in (1) cannot be identified unless additional assumptions are imposed. In particular, identification of is possible if there is no constant in the quantile regression models indexed by . Alternatively, we may impose in assumption A.1. This additional constraint allows for the identification of the constant in model (4) from the parameter vector . Note however that this is not required for the estimation of the other parameters which is the main interest of the paper.
The next section discusses a suitable estimation strategy for obtaining consistent estimates of the model parameters. The parameters of interest are for the mean regression equation in A.1, and for the QR model (4).
2.2. Estimation
In this section we consider local versions of principal components analysis to devise an iterative procedure for estimating the model parameters of the mean process (2). To do this, we adopt the estimation procedures in Bai (2009), Song (2013) and Ando and Bai (2015) for the estimation of , and . The parameters and of the quantile factor model with observable regressors are estimated using QR methods applied to a local kernel version of model (18) in which the unknown common factors have been replaced by consistent estimates.
2.2.1. Estimation of slope coefficients and common factors
In order to estimate the parameters of model (2), we apply local principal components as in Su and Wang (2017). In contrast to these authors, we consider a factor model that also includes observable regressors.
In order to estimate the slope coefficients and we need a panel data structure with large and that guarantee the consistency of the common factors and factor loadings, respectively. To do this, we extend the iterative estimation procedure in Song (2013) and Ando and Bai (2015) to accommodate dynamics in the and coefficients, until we reach convergence. For fixed, we consider the Taylor expansion of the vector about for close to such that
with high-order derivatives of the functional parameter evaluated at . For simplicity, we consider the local approximation of order zero given by such that the remaining terms in the approximation are in the error term. Similarly, we replace by such that we estimate the model
with an error term that includes the high-order approximation terms of the model parameters. The parameters of model (6) are estimated from minimizing the following local weighted least squares problem:
where is a kernel smoothing function. The solution to this problem can be obtained applying local principal component analysis (LPCA). To do this, we multiply both sides of expression (6) by , with ; see Su and Wang (2017) for a similar estimation strategy. We obtain
Now, define such that is a vector and is a matrix. Similarly, let such that , for , with . Similarly, such that is a vector. Let such that is a matrix and be a matrix. For each individual in the cross section, Equation (8) in vector form is
In this setting, for a fixed , the minimization problem (7) becomes
with tr denoting the trace of the matrix and . For parameter identification, we impose restrictions and diagonal matrix, with a matrix. This objective function is a locally weighted version of the least squares estimator in Bai (2009).
Applying the procedure developed by these authors, we can estimate and using an iterative estimation procedure. This approach decomposes the original estimation problem into two steps: the estimation of the individual coefficients given common factors, and the estimation of the common factors given individual coefficients. We maintain their assumption that the number of factors is known. The extension to an unknown number of factors under heterogeneous regression coefficients is cumbersome and beyond the scope of this paper. Thus, when the number of unobserved factors is known, Bai (2009) proposes a tractable solution to the estimation problem by concentrating out the factor loadings from the objective function (10). Following this procedure, we assume that the factor loadings satisfy a relationship of the form , with and an estimate of the vector of slope coefficients for fixed . Then, replacing this expression into (10), the objective function is
Therefore, the problem of interest becomes
The estimators should simultaneously solve a system of nonlinear equations
with , and
where is a diagonal matrix with the largest eigenvalues of , and the estimated transformed factors are interpreted as the times eigenvectors corresponding to the R largest eigenvalues of the matrix , arranged in descending order.
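As a minimal illustration of the eigenvector step described above, and not the implementation used in the paper, the following Python sketch extracts a single factor (the case of one factor only) by power iteration and rescales the leading eigenvector by the square root of the number of periods. It assumes that the input matrix already holds the kernel-weighted data with the regressor part subtracted; the function name `leading_factor`, the toy data and all parameter values are hypothetical.

```python
import math
import random


def leading_factor(Z, n_iter=200):
    """Single-factor principal components for a T x N matrix Z.

    Z is assumed to hold kernel-weighted data net of the regressor
    part (an assumption of this sketch, not taken from the paper).
    """
    T, N = len(Z), len(Z[0])
    v = [1.0 + 0.01 * t for t in range(T)]  # arbitrary starting vector
    for _ in range(n_iter):
        # Power iteration on Z Z', computed as Z (Z' v).
        zt_v = [sum(Z[t][i] * v[t] for t in range(T)) for i in range(N)]
        v = [sum(Z[t][i] * zt_v[i] for i in range(N)) for t in range(T)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    factor = [math.sqrt(T) * x for x in v]  # normalisation F'F / T = 1
    # Loadings follow by projecting each unit's series on the factor.
    loadings = [sum(Z[t][i] * factor[t] for t in range(T)) / T
                for i in range(N)]
    return factor, loadings


# Toy data: one factor, N = 20 units, T = 30 periods (illustrative values).
rng = random.Random(0)
T, N = 30, 20
f_true = [rng.gauss(0.0, 1.0) for _ in range(T)]
lam_true = [1.0 + 0.5 * rng.gauss(0.0, 1.0) for _ in range(N)]
Z = [[f_true[t] * lam_true[i] + 0.1 * rng.gauss(0.0, 1.0) for i in range(N)]
     for t in range(T)]
f_hat, lam_hat = leading_factor(Z)
```

The estimated factor is identified only up to sign, which mirrors the rotation indeterminacy discussed in the text. In the full procedure this step would alternate with the update of the slope coefficients until convergence.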
The actual estimation procedure can be implemented by iterating each of the two steps in (13) and (14) until convergence. The unknown factor loadings are obtained as
The estimation above involves only local data points, i.e., locally weighted in a neighbourhood of , and hence, the local estimates of and converge to the true parameters at rate. In contrast, the methodology developed in Ando and Bai (2015) obtains global estimators that converge under slope heterogeneity at for each . Under the assumption of slope homogeneity, Bai (2009) obtains estimators of the true slope parameters that converge at . The next step is to derive a consistent estimator of the common factors . We propose an estimator of the common factors from the minimization of the following least squares problem:
with , where is obtained from the above iterative estimation procedure for each . The solution to this problem is
2.2.2. Estimation of time-varying quantile factor loadings
In what follows, we propose a procedure to estimate the parameters of the quantile process (18). The unobserved quantile common factors are replaced by estimates of obtained from the conditional mean regression model, such that the regression of interest is
with and a rotation matrix characterizing the common factors; , with . More compactly, consider the following regression model. Let
be the feasible counterpart of , with . Here we are using the notation (note that already contains a constant) and , and also .
Estimation of the model parameters follows by adapting the nonparametric approach for dynamic quantile processes in Cai and Xu (2008). These authors consider a polynomial approximation of the parameters about given by and defined as
with the local approximation of the rotated factor loadings . Note that , and are the derivatives of order of the respective functional coefficients. As in Cai and Xu (2008), we disregard in the following derivations the approximation error from using a polynomial Taylor expansion of order ; see Fan and Gijbels (1996) for the suitability of this method and, in particular, the advantages of the local linear approximation.
The parameters of model (19) can be estimated from the following local objective function:
where is the check function of Koenker and Bassett (1978) and is an indicator function that takes a value of one if the argument is true and zero otherwise; is a suitable bandwidth parameter for the quantile estimation problem.
Estimation of the quantile parameters is obtained from the first-order conditions of the optimization problem (20). Estimation of the common factors for the quantile process is also possible in a quantile model (1) without intercept. In this case, by invoking Assumption A.1, we plug in the factors estimated from the mean regression in Equation (6) and estimate the quantile factors as
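To make the role of the check function concrete, the following Python sketch evaluates the standard Koenker and Bassett check loss and solves a deliberately simplified version of the local problem: an intercept-only, order-zero (local constant) fit with given kernel weights. This is an illustration under those simplifying assumptions, not the local polynomial estimator with covariates and estimated factors used in the paper; the function names are hypothetical.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def local_constant_quantile(y, weights, tau):
    """Minimise sum_t w_t * rho_tau(y_t - q) over the scalar q.

    With nonnegative weights a minimiser is a weighted tau-quantile of y,
    so it is enough to search over observations with positive weight.
    """
    candidates = [v for v, w in zip(y, weights) if w > 0]

    def objective(q):
        return sum(w * check_loss(v - q, tau) for v, w in zip(y, weights))

    return min(candidates, key=objective)


y = [1.0, 2.0, 3.0, 4.0, 5.0]
median_all = local_constant_quantile(y, [1, 1, 1, 1, 1], tau=0.5)    # 3.0
median_local = local_constant_quantile(y, [0, 0, 1, 1, 1], tau=0.5)  # 4.0
```

Setting some weights to zero mimics a compact-support kernel that discards observations far from the evaluation point, which is why the second call returns the median of the last three observations only.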
with , where is obtained from (20) and is a generalized inverse matrix of the matrix obtained from the elements , with denoting a Taylor approximation of order . The matrix satisfies that .
2.3. Determining the number of factors
In the previous analysis, we assume that the number of factors, , is known. In the simulations and the empirical application we fix the number of factors to , following the framework in Galvao et al. (2018) and Galvao et al. (2019). In practice, however, it is an important question to determine from the data.
Several information-criterion procedures have been applied to select the number of factors, although not for our specific panel data model, with and dimensions, that combines both mean- and quantile-based model specifications. The former determines the type of objective function that will be used in the information criterion. The latter determines how the penalty factor is constructed as a function of , and . Following Su and Wang (2017) or Casas et al. (2021), AIC or BIC can be applied to the mean-based factor model, using the minimized value of the objective function that delivers the parameter estimates, including the factors and the factor loadings. Ando and Bai (2020) propose a procedure for selecting the number of factors in which the check objective function from QR is used in an AIC or BIC framework and which also combines both dimensions in the criterion.
3. Asymptotic properties of the estimators
This section presents the asymptotic properties of the proposed estimators for the model parameters – including the common factors – for processes (6) and (19). There are three unique features of the current problem that pose challenges to the econometric theory. First, the proposed estimators of the common factors and beta coefficients do not have a closed-form expression. These quantities are obtained by solving a set of equations to be satisfied simultaneously by and . Second, the unobserved common factors are treated as parameters to be estimated, and thus the number of parameters grows with . Finally, each pair , with and , has its own slope coefficient and factor loading such that the number of parameters grows with and .
Our goal in the remainder of the section is to derive the asymptotic distribution of the quantile parameter estimates of model (19). Our results build on the nonparametric quantile estimation methodology for dynamic smooth coefficient models introduced in Cai and Xu (2008). Our model is also closely related to the recent contribution of Ando and Bai (2020). The salient feature of our model compared to Ando and Bai (2020) is that the quantile common factors are treated as estimated regressors that are obtained from the mean model (2).
3.1. Assumptions
We first state the following notation and assumptions. Let be the error of the mean regression model in Assumption A.1. Then, we denote , , , and . Define , and . Let denote a positive constant that may vary from case to case.
Assumption A.2.
(Error terms and common factors). The error terms and common factors satisfy
and for all and in ;
and for some matrix .
for , where denotes the element of .
and for and .
and .
and for each .
Assumption A.3.
(Factor Loadings). The factor loading matrix satisfies that
as , where is an diagonal matrix.
is the diagonal matrix consisting of the eigenvalues of and satisfies that for all diagonal elements .
for each , where.
, where
Assumption A.4.
(Explanatory Variables). The vector of observable covariates satisfies
The matrix is positive definite.
Let , , . For each , let be the collection of such that . Then, we assume that
with , where and is the generalized inverse of .
(iv) , for . ( is a rotation matrix characterizing the factors defined above.)
Assumption A.5.
(i) The kernel function is a symmetric continuously differentiable probability density function with compact support , (ii) As , , , , , and .
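As an example of a kernel meeting the conditions in part (i), the biweight (quartic) kernel is a symmetric, continuously differentiable probability density with compact support on [-1, 1]. The Python sketch below computes kernel weights around a fixed period; the scaling of the argument by the product of the sample length and the bandwidth, and the division by the bandwidth, follow a common convention in the local principal components literature and are assumptions here, as are the function names and numerical values.

```python
def biweight_kernel(u):
    """Biweight kernel K(u) = 15/16 * (1 - u^2)^2 on [-1, 1], zero elsewhere."""
    if abs(u) > 1.0:
        return 0.0
    return 15.0 / 16.0 * (1.0 - u * u) ** 2


def kernel_weights(r, T, h):
    """Weights K((t - r) / (T * h)) / h for t = 1, ..., T around period r."""
    return [biweight_kernel((t - r) / (T * h)) / h for t in range(1, T + 1)]


# Illustrative values: T = 100 periods, bandwidth h = 0.1, evaluation at r = 50.
weights = kernel_weights(r=50, T=100, h=0.1)
effective_mass = sum(weights) / 100  # Riemann sum of the density, close to 1
```

Only periods within a distance of T times h from the evaluation point receive positive weight, which is what makes the resulting estimators local.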
Assumption A.6.
(Central Limit). As , , and
with .
These assumptions are standard in factor models. A.2 and A.3 mainly impose moment conditions in the error terms, factors, factor loadings, and their interactions; see, e.g., Bai and Ng (2002), Bai (2003, 2009). The main difference, and in line with Su and Wang (2017), is that we require in A.2(ii) and in A.3(i). Assumptions A.2(iii)-(v) restrict the time and cross-sectional dependence for the idiosyncratic errors and the weak dependence between factors and errors, which are in the same spirit as Bai (2003, 2009) and Su and Wang (2017). A.2(vi) is a kernel-weighted version of Assumptions F.1-F.2 in Bai (2003). Following the recent literature on factor models, we assume that is homogeneous over . This assumption is made for convenience to facilitate the asymptotic results. Assumption A.3(iii) allows for factor loadings to be time-varying and Assumption A.3(iv) is a kernel-weighted version of Assumption F in Bai (2003). Both parts are used to establish the asymptotic normality of our local principal components estimators. We extend the assumptions in Su and Wang (2017) by incorporating a set of assumptions in A.4 specific to the observable regressors. Assumption A.4 (i)-(iii) impose the boundedness of moments and the regressors are assumed to exhibit sufficient variation such that the coefficients are identifiable. Identification also requires that the observed regressors do not exhibit multicollinearity with the unobservable common factors . Condition (iii) in the assumption guarantees the unique minimizer of the estimation objective function. The notation is used to emphasize that the entire term is a function of F. Assumption A.5 states conditions on the rates of convergence that guarantee the consistency and asymptotic normality of the kernel estimators. A.6 simplifies the proofs and is imposed, for example, in Ando and Bai (2015).
More primitive conditions to obtain the asymptotic properties of these objects can be found in Song (2013) for a global factor model.
We consider now each cross-sectional observation separately, such that denotes for each . Let be the conditional density of given . Let and , and define and . The relevant bandwidth parameter for the quantile problem is such that .
Assumption B.1.
, , and are th order continuously differentiable in a neighbourhood of for any . Further, is bounded and satisfies the Lipschitz condition.
Assumption B.2.
For each , for some , where . Furthermore, and are positive definite and continuous in a neighbourhood of . These functions and their inverse functions are uniformly bounded.
Assumption B.3.
For each , the process is strictly stationary mixing, with mixing coefficients satisfying such that with .
Assumption B.4.
The bandwidth parameter satisfies , , , and , for .
This set of assumptions is found in Cai (2007) and Cai and Xu (2008). The main difference with respect to the latter authors is the assumption that allows us to remove the effect of estimating the common factors from the asymptotic distribution of the quantile parameter estimates. A similar assumption is also found in A.4 for the mean process. Under this set of additional assumptions, we obtain the asymptotic distribution of the quantile parameter estimates of and , for and . This result shows that the estimation of the common factors does not have an effect on the asymptotic distribution of the quantile parameter estimates.
3.2. Propositions
With these assumptions in place we are ready to derive the asymptotic results. We first derive the uniform consistency of the parameter estimators associated with the observable regressors.
Proposition 1.
Under Assumptions A.2-A.6 and B.1, it follows that
The proof of this result, in the Appendix, follows from extending the results in Song (2013) and Ando and Bai (2015) to the presence of time-varying slope coefficients. The uniform consistency of these coefficients allows us to extend the results in Su and Wang (2017) from a pure factor model specification to our setting. The following result shows the asymptotic normality of to a rotation of the true factors .
Proposition 2.
Under Assumptions A.2-A.6 and B.1, for each , we have
where ; denotes the diagonal matrix of the first largest eigenvalues of , is the diagonal matrix consisting of the eigenvalues of in descending order; is the corresponding normalized eigenvector matrix such that , and .
In particular, the consistency of the local factors to allows us to derive the asymptotic distribution of the slope parameter estimators associated to the observable regressors.
Proposition 3.
Under Assumptions A.2-A.6 and B.1, for any fixed pair with and , the vector obtained from expression (13) satisfies
with , where and are matrices defined in the Appendix.
The proof of this result follows from extending the results in Song (2013) and Ando and Bai (2015) to the presence of time-varying slope coefficients. Similarly, we show that the asymptotic distribution of the factor loading estimates is unaffected by including a set of observable covariates with time-varying parameters that vary smoothly over time. More formally,
Proposition 4.
Under Assumptions A.2-A.6 and B.1, for each , we have
with .
These results allow us to show the consistency of the common factors estimated in (17).
Proposition 5.
Under Assumptions A.2-A.6 and B.1, as the estimator (17) of the common factors satisfies
with , where .
Proposition 6.
Under Assumptions A.1-A.6 and B.1-B.4, as the estimator of obtained from the minimization problem (20) satisfies that
with and
This result shows that the bias of the estimator of the quantile parameters decreases as one takes higher order local polynomial expansions of the functional coefficients in (19).
Inference for this model is based on a bootstrap implementation for panel data models with time-dependent data. Standard errors are estimated by bootstrap, resampling only cross-sectional units with replacement, as in Kapetanios (2008) and Galvao and Montes-Rojas (2015). See also Galvao et al. (2021) for a recent study that discusses the assumptions for asymptotic validity of the bootstrap in a similar framework.
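The cross-sectional resampling scheme can be sketched in a few lines of Python. The sketch draws whole units with replacement, so each unit keeps its complete time series, and reports the standard deviation of the re-estimated statistic across bootstrap draws. The estimator passed in (`pooled_mean`) is a hypothetical placeholder standing in for the two-stage estimator, and the number of bootstrap draws and the toy panel are illustrative assumptions.

```python
import random
import statistics


def cross_section_bootstrap_se(panel, estimator, n_boot=200, seed=0):
    """Bootstrap standard error from resampling cross-sectional units.

    panel: list of N units, each holding that unit's full time series.
    estimator: callable mapping a panel to a scalar estimate.
    """
    rng = random.Random(seed)
    n_units = len(panel)
    draws = []
    for _ in range(n_boot):
        # Draw N units with replacement; the time dimension is left intact.
        resampled = [panel[rng.randrange(n_units)] for _ in range(n_units)]
        draws.append(estimator(resampled))
    return statistics.stdev(draws)


def pooled_mean(panel):
    """Toy estimator used only to make the sketch runnable."""
    values = [y for unit in panel for y in unit]
    return sum(values) / len(values)


toy_panel = [[0.1, 0.3, 0.2], [1.0, 1.2, 0.9], [-0.5, -0.4, -0.6], [0.4, 0.6, 0.5]]
se = cross_section_bootstrap_se(toy_panel, pooled_mean)
```

Resampling units rather than individual observations leaves the serial dependence within each unit untouched, which is the motivation for this scheme with time-dependent data.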
The following section explores the finite-sample performance of our two-stage estimation procedure.
4. Monte Carlo study
Our Monte Carlo design is a variation of the Monte Carlo exercises proposed in Bai (2009), Harding and Lamarche (2014), and Su and Wang (2017). We are interested in showing the consistency of the parameter estimators under the presence of time-varying factor loadings.
Consider the following data generating process with unknown factors:
In this model as well as in the empirical application below we assume a set of common factors that is constant across quantiles. For this exercise the parameter of interest is the marginal effect on the conditional quantile, which corresponds to . The parameter thus determines if there is heterogeneity across quantiles. For we have a location-shift model while for we have a location-scale shift model. The parameters and determine whether the factors also have an effect on the scale that may potentially contaminate the estimators of the quantile marginal effects. We consider two distributions for the error term , Gaussian and standardized chi-squared with 1 degree of freedom. For all models we fix and , and we consider different scenarios with and . For simplicity, we consider .
We generate the factors, , with the following model
where we assume for all cases that are standard Gaussian independent random variables for , and . The common parameters are assumed as in Harding and Lamarche (2014).
The time-varying factor loadings models for the common factors are DGP 1: for ; and DGP 2: for . DGP 1 thus has factor loadings that vary across and while DGP 2 only varies across individuals.
We study the finite-sample performance of two estimators of the slope parameters . First, an estimator that considers time-varying factor loadings using the local estimation procedure developed in this paper, and denoted as . In this case we are in fact estimating individual-specific coefficients (, and for ) for all . This estimator is thus the most demanding one. We will refer to this model as the local factor estimator. Second, we consider a model with time-invariant loadings, that is denoted as . Here, we do not impose the time-varying local estimation procedure and, instead, we estimate a unique set of parameters (, and for ) for all . The latter estimator will be referred to as the global factor estimator. In all cases we consider a fixed bandwidth of .
In order to evaluate the performance of our estimators and for comparability purposes, we study bias and mean squared error (MSE) by comparing the estimates with the parameter defined above. For the local factor estimator we compute the sample average across and of for every simulation. For the global factor estimator we compute the sample average across .
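The evaluation step can be summarised with a short Python sketch: local estimates are first averaged across units and periods within each replication, and bias and MSE are then computed across replications relative to the true parameter value. The function names and the toy numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def replication_average(estimates_it):
    """Average an N x T grid of local estimates within one replication."""
    flat = [b for row in estimates_it for b in row]
    return sum(flat) / len(flat)


def bias_and_mse(estimates, true_value):
    """Bias and mean squared error of a list of estimates."""
    errors = [e - true_value for e in estimates]
    bias = sum(errors) / len(errors)
    mse = sum(err * err for err in errors) / len(errors)
    return bias, mse


# Three hypothetical replications, each a 2 x 2 grid of local estimates.
replications = [
    [[1.1, 1.1], [1.1, 1.1]],
    [[0.9, 0.9], [0.9, 0.9]],
    [[1.0, 1.0], [1.0, 1.0]],
]
averages = [replication_average(rep) for rep in replications]
bias, mse = bias_and_mse(averages, true_value=1.0)
```

For the global factor estimator the within-replication average would run over units only, since a single set of coefficients is estimated for all periods.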
The sample size of the different simulation experiments comprises all possible combinations of . The number of Monte Carlo experiments is in every case. report the simulation exercise results for the case with for DGP1 and DGP2, respectively. In this case all coefficients should be estimating the same value of for all quantiles. report the simulation exercise results for the case with for DGP1 and DGP2, respectively; study the case given by for DGP1 and DGP2, respectively. Importantly, the last two cases generate heterogeneity across quantiles such that the coefficient estimates are different across quantiles.
First, note that there is no clear pattern for bias reduction when N or T increases while the other dimension is held constant. However, bias decreases monotonically when both N and T increase. There is, in addition, an MSE reduction when either N or T increases. These results provide empirical evidence on the consistency of the parameter estimators above as N and T increase. Second, the time-varying local estimator exhibits a larger MSE value than the global factor estimator. This result is expected as the local estimator is more demanding and uses fewer observations to estimate the parameters. In return, the local estimator offers additional flexibility as we can estimate time-varying coefficients. The ratios of the MSE performance of the two estimators are similar across specifications. Third, the simulation scenarios in which the error term follows a chi-squared distribution show differences across quantiles for both estimators. One unexpected feature is that the MSE performance of is worse than that of for the local estimator. This may be the result of the estimated factors absorbing a more substantial portion of the variance in the quantile location with more probability mass.
5. Empirical application
This section applies the above model to an empirical asset pricing context. In contrast to standard asset pricing models, we explore the distributional risk premia by fitting the above models to different quantiles of the distribution of excess returns. We are interested in assessing the effect of including unobserved local factors with time-varying factor loadings in standard asset pricing specifications. The methodology developed above also allows us to obtain dynamic parameter estimates measuring the sensitivity of the quantile process of excess returns to a set of idiosyncratic firm-specific factors that are combined with the Fama and French (Citation1993) three-factor model.
5.1. Data
The set of firm-specific covariates is obtained from a panel of U.S. firms in the Compustat Industrial dataset. The sample consists of annual CRSP/Compustat data from the years 1970 through 2011. Following standard practice, we exclude financial firms (SIC codes 6000–6999), regulated utilities (SIC codes 4900–4999), and non-profit organizations (SIC codes greater than or equal to 9000). We omit firm-years with a missing or negative value for fixed assets and sales, with a missing or less than ten million 1983 dollar book value of total assets, and with growth rates of fixed assets, sales, and the book value of total assets greater than .Footnote2
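The sample filters above can be written as a single predicate. The following is only a sketch: the dictionary keys are hypothetical names rather than Compustat mnemonics, and the growth-rate cutoff is left as a required argument because its value is not reproduced here.

```python
def keep_firm_year(row, max_growth):
    """Return True if a firm-year passes the sample filters described above.

    `row` is a dict with hypothetical keys: 'sic', 'fixed_assets', 'sales',
    'total_assets_1983m' (book assets in millions of 1983 dollars) and
    'growth_rates' (growth of fixed assets, sales and total assets).
    `max_growth` is the growth-rate cutoff, which the caller must supply.
    """
    sic = row["sic"]
    if 6000 <= sic <= 6999 or 4900 <= sic <= 4999 or sic >= 9000:
        return False  # financial firms, regulated utilities, non-profits
    for key in ("fixed_assets", "sales"):
        if row[key] is None or row[key] < 0:
            return False  # missing or negative value
    assets = row["total_assets_1983m"]
    if assets is None or assets < 10.0:
        return False  # missing or below ten million 1983 dollars
    # Drop firm-years where any of the three growth rates exceeds the cutoff.
    return all(g <= max_growth for g in row["growth_rates"])
```

Applying the predicate row by row, for example `[r for r in rows if keep_firm_year(r, max_growth=cutoff)]`, yields the filtered firm-years; building the balanced panel would be a separate step.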
We consider the following list of firm characteristics: MB denotes firms’ market-to-book ratio; LNTA denotes the log of the firm’s asset size; EBITTA denotes earnings before interest and taxes as a proportion of total assets; MDR denotes the market debt ratio, defined as the book value of debt over the market value of assets; and DEPTA denotes depreciation as a proportion of total assets. The set of covariates is completed by the following observable pricing factors taken from Kenneth French's website. The common pricing factors are MKTRF, SMB and HML. The factor MKTRF is defined as a value-weighted average market portfolio return net of the risk-free asset. The risk-free rate is proxied by daily returns on the U.S. three-month Treasury bill. The factor SMB is a small-minus-big portfolio constructed as the difference between the returns on diversified portfolios of small and large asset size. The factor HML is a high-minus-low portfolio constructed as the difference between the returns on diversified portfolios of high and low book-to-market equity. The firms’ excess returns are the annual excess return on assets computed over the annual interest rate offered by one-month U.S. Treasury bills.
The final sample includes a balanced panel of 297 firms with 42 years of data.
5.2. Empirical models
In a similar spirit to Giovannetti (Citation2013), Galvao et al. (Citation2018) and Galvao et al. (Citation2019), we propose a quantile process for modelling the distribution of excess returns. The objective of this study is to assess whether an empirical pricing strategy based on firm-specific variables coupled with unobserved quantile factors with time-varying loadings is able to explain the cross-section of excess returns on a set of U.S. firms. As a byproduct, we also study whether this model adds predictive ability to the standard Fama-French three-factor model. The pricing factors of our baseline model are firm-specific financial ratios; see Kogan and Papanikolaou (Citation2013) for a discussion of empirical asset pricing models using firm-specific variables. This approach has recently gained support due to the strong evidence of co-movement in stock returns of firms with similar characteristics that is unrelated to their exposures to the market portfolio.
Our baseline model is
with and . We assume that the unobserved common factors for the quantile model are location shift transformations of the estimates of the mean factors and . The shifts defining the quantile factors are captured by the values of the dynamic intercepts of the different quantile models. We estimate two versions of this model for . A first version considers global factors and uses the methodology proposed in Ando and Bai (Citation2015) to estimate the factors, , which are then used to estimate the set of parameters . The second version considers local factors and uses the methodology developed above to estimate the time-varying parameters . Note that the loadings associated to the observable covariates do not only vary over time but also across individuals. We consider two models. Model 1 uses only firm-specific covariates, . Model 2 augments the above model by MKTRF, SMB, and HML. Standard errors are estimated using bootstrap by resampling only from cross-sectional units with replacement as in Kapetanios (Citation2008) and Galvao and Montes-Rojas (Citation2015) using 100 replications. In all cases the bandwidth parameter is set to . The results are reported in .
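The bootstrap scheme described above resamples whole cross-sectional units with replacement and keeps each unit's time series intact. A minimal sketch follows. It assumes the panel is stored as a list of per-unit series and uses a toy estimator (a pooled median) in place of the quantile factor model. The function names, the default seed and the data layout are illustrative assumptions.

```python
import random
import statistics


def cross_sectional_bootstrap_se(panel, estimator, n_reps=100, seed=0):
    """Bootstrap standard error, resampling whole units with replacement.

    `panel` is a list with one entry per cross-sectional unit; each entry
    holds that unit's full time series, which is never broken up.
    `estimator` maps a panel to a single number.
    """
    rng = random.Random(seed)
    n_units = len(panel)
    draws = []
    for _ in range(n_reps):
        # Draw n_units unit indices with replacement; time order is preserved.
        resampled = [panel[rng.randrange(n_units)] for _ in range(n_units)]
        draws.append(estimator(resampled))
    return statistics.stdev(draws)


def pooled_median(panel):
    """Toy estimator: median of all observations in the panel."""
    return statistics.median(x for unit in panel for x in unit)
```

Because only the cross-sectional index is resampled, any serial dependence within a unit is carried over unchanged into every bootstrap sample.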
The results are an extension of the findings in Galvao et al. (Citation2018). In this case, we incorporate the presence of unobserved common factors. Firm-specific covariates are statistically significant in all models, and the model parameter estimates are similar across the different specifications of the empirical asset pricing model reported in . The estimates reported for the model with local factors are averages across time and individuals of the parameter estimates of for and .
Our empirical asset pricing model uncovers a positive exposure of firms’ excess returns to the market-to-book ratio (MB) and the log of asset size (LNTA) and negative exposure to the market debt ratio (MDR) and depreciation as a proportion of total assets (DEPTA). Earnings before interest and taxes as a proportion of total assets (EBITTA) have a positive effect on low quantiles and turn negative for and beyond. The quantile parameter estimates are monotonically increasing in for LNTA and monotonically decreasing for DEPTA. All the coefficients are statistically significant at significance levels. report the baseline case in expression (30) given by firm-specific covariates, report the pricing model augmented with the Fama-French three-factor model. The results are also similar across specifications and estimation methods. However, the magnitude of the model parameters changes significantly between the global and local factor estimation methods.
The pricing model with local factors provides similar insights to the model with unobserved global factors but has the additional advantage of allowing us to study the dynamics of the loadings associated with each observable covariate. These dynamics are reported in , corresponding to the local factor model with the augmented set of covariates in . The model also allows us to study the dynamics of the unobserved common factor loadings . We do not report these values, however, because the common factor estimates lack a direct interpretation. Each panel reports five lines that reflect the dynamics of the parameters over time. These estimates are constructed as the cross-sectional average of for each and the standard errors are calculated by bootstrap. The results show how the exposure of the excess returns to some covariates and factor models has evolved over time. The figures show that there was little variation in the average effects, and they are all within the 95% confidence interval of each other. One limitation in the analysis is that the time dimension () does not allow us to obtain a finer set of local estimates.
6. Conclusion
This paper proposes a functional coefficient quantile regression model with time-varying factor loadings. Estimation of the quantile factors and factor loadings is done in two stages. First, we estimate the unobserved common factors from a linear factor mean-based model with exogenous covariates. In the second stage, we plug in an affine transformation of the estimates of the common factors to obtain the quantile version of the factor model. This model requires both the number of individuals and the number of periods to grow to infinity. The number of individuals needs to diverge for the consistent estimation of the common factors in the first stage. The number of time periods needs to diverge as well in order to consistently estimate the quantile factor loadings. As a byproduct, our model can capture dynamics and heterogeneity across individuals in both the quantile slope coefficients and the quantile factor loadings. The introduction of time-varying coefficients adds flexibility to standard factor model specifications that assume slope homogeneity as in Bai (Citation2003, Citation2009) and slope heterogeneity as in Ando and Bai (Citation2015). The model also extends the recent partial linear model of Su and Wang (Citation2017) by considering the quantile process and including the presence of exogenous regressors.
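To give a concrete sense of the first stage, the sketch below extracts a single common factor from a panel by principal components. It is a deliberately simplified illustration: it omits the exogenous covariates and the kernel weighting used in the local procedure, it uses power iteration instead of a full eigendecomposition, and the normalization F'F/T = 1 is an assumed convention. The function names and the simulated design are hypothetical.

```python
import math
import random


def leading_factor(Y, n_iter=200):
    """Estimate one common factor from a T x N panel by principal components.

    Power iteration finds the leading eigenvector of Y Y' / (N T); the factor
    is that eigenvector scaled so that F'F / T = 1. The sign is not identified.
    """
    T, N = len(Y), len(Y[0])
    S = [[sum(Y[s][i] * Y[t][i] for i in range(N)) / (N * T) for t in range(T)]
         for s in range(T)]
    v = [1.0 + 0.01 * s for s in range(T)]  # arbitrary starting vector
    for _ in range(n_iter):
        w = [sum(S[s][t] * v[t] for t in range(T)) for s in range(T)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [math.sqrt(T) * x for x in v]


def simulate_one_factor_panel(T, N, seed=0, noise=0.1):
    """Simulate Y[t][i] = f[t] * lam[i] + noise * e[t][i] (illustrative design)."""
    rng = random.Random(seed)
    f = [rng.gauss(0.0, 1.0) for _ in range(T)]
    lam = [1.0 + 0.5 * rng.gauss(0.0, 1.0) for _ in range(N)]
    Y = [[f[t] * lam[i] + noise * rng.gauss(0.0, 1.0) for i in range(N)]
         for t in range(T)]
    return Y, f
```

In the second stage the estimated factor would enter a quantile regression as a generated regressor. The simulated panel is only meant for checking that the extracted factor is close to the true one up to sign and scale.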
This model specification is applied in an empirical application to explain the distribution of the excess returns for a cross-section of asset returns in the U.S. In contrast to standard asset pricing formulations, we consider firm-specific covariates as pricing factors and allow for the presence of two unobserved factors. The model provides satisfactory estimates of the sensitivity of the excess return to the pricing variables under both global (Ando & Bai, Citation2015) and local factor models. The main contribution of our methodology is to be able to estimate the dynamics of the slope coefficients (betas) for each asset and over time. By doing so, we can track the dynamic exposure of assets’ excess returns to the different financial ratios acting as pricing variables.
Disclosure statement
No potential conflict of interest was reported by the authors.
Additional information
Notes on contributors
Alev Atak
Alev Atak has a PhD in Economics from Queen Mary, University of London. She works on econometrics and finance.
Gabriel Montes-Rojas
Gabriel Montes-Rojas has a PhD in Economics from the University of Illinois at Urbana-Champaign. He specializes in econometric theory, with work on quantile regression, panel data, and multivariate models.
Jose Olmo
Jose Olmo has a PhD in Economics from Universidad Carlos III de Madrid. His research interests are in Financial and Applied Econometrics, and Financial Economics. Jose has also served on the editorial boards of several academic journals.
Notes
1 It is prevalent in this literature to fix the number of unobserved common factors, see Bai (Citation2009), Song (Citation2013), and Ando and Bai (Citation2015). Alternatively, information criteria and rank minimization are used in Ando and Bai (Citation2020) and Chen et al. (Citation2021), to determine the number of factors at each quantile while uncovering the quantile factors individually.
2 Although there is no consensus in the literature on the length of the time dimension, we acknowledge that the time dimension selection criteria might favor larger and more mature companies, which may lead to the results being valid only for large and mature companies. However, the average estimated effects from our sample are in line with the consensus in the literature, and thus, the results could be applied to all companies. The log of total assets is the only variable that is not a ratio, and is deflated to 1983 dollars with the consumer price index obtained from the Bureau of Labor Statistics.
References
- Ando, T., & Bai, J. (2015). Asset pricing with a general multifactor structure. Journal of Financial Econometrics, 13(3), 556–604. https://doi.org/10.1093/jjfinec/nbu026
- Ando, T., & Bai, J. (2020). Quantile co-movement in financial markets: A panel quantile model with unobserved heterogeneity. Journal of the American Statistical Association, 115(529), 266–279. https://doi.org/10.1080/01621459.2018.1543598
- Bai, J. (2003). Inferential theory for factor models of large dimensions. Econometrica, 71(1), 135–171. https://doi.org/10.1111/1468-0262.00392
- Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77, 1229–1279.
- Bai, J., & Ng, S. (2002). Determining the number of factors in approximate factor models. Econometrica, 70(1), 191–221. https://doi.org/10.1111/1468-0262.00273
- Bates, B. J., Plagborg-Møller, M., Stock, J. H., & Watson, M. W. (2013). Consistent factor estimation in dynamic factor models with structural instability. Journal of Econometrics, 177(2), 289–304. https://doi.org/10.1016/j.jeconom.2013.04.014
- Cai, Z. (2007). Trending time-varying coefficient time series models with serially correlated errors. Journal of Econometrics, 136(1), 163–188. https://doi.org/10.1016/j.jeconom.2005.08.004
- Cai, Z., & Xiao, Z. (2012). Semiparametric quantile regression estimation in dynamic models with partially varying coefficients. Journal of Econometrics, 167(2), 413–425. https://doi.org/10.1016/j.jeconom.2011.09.025
- Cai, Z., & Xu, X. (2008). Nonparametric quantile estimation for dynamic smooth coefficient models. Journal of the American Statistical Association, 103(484), 1595–1608. https://doi.org/10.1198/016214508000000977
- Casas, I., Gao, J., Peng, B., & Xie, S. (2021). Time-varying income elasticities of healthcare expenditure for the OECD and Eurozone. Journal of Applied Econometrics, 36(3), 328–345. https://doi.org/10.1002/jae.2809
- Chaudhuri, P., Doksum, K., & Samarov, A. (1997). On average derivative quantile regression. Annals of Statistics, 25(2), 715–744. https://doi.org/10.1214/aos/1031833670
- Chen, L., Dolado, J., & Gonzalo, J. (2021). Quantile factor models. Econometrica, 89(2), 875–910. https://doi.org/10.3982/ECTA15746
- De Gooijer, J. G., & Zerom, D. (2003). On conditional density estimation. Statistica Neerlandica, 57(2), 159–176. https://doi.org/10.1111/1467-9574.00226
- Eichler, M., Motta, G., & von Sachs, R. (2011). Fitting dynamic factor models to non-stationary time series. Journal of Econometrics, 163(1), 51–70. https://doi.org/10.1016/j.jeconom.2010.11.007
- Fama, E. F., & French, K. R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56. https://doi.org/10.1016/0304-405X(93)90023-5
- Fan, J., & Gijbels, I. (1996). Local polynomial modelling and its applications. Chapman & Hall.
- Galvao, A., Juhl, T., Montes-Rojas, G., & Olmo, J. (2018). Testing slope homogeneity in quantile regression panel data with an application to the cross-section of stock returns. Journal of Financial Econometrics, 16(2), 211–243. https://doi.org/10.1093/jjfinec/nbx016
- Galvao, A., & Montes-Rojas, G. (2015). On bootstrap inference for quantile regression panel data: A Monte Carlo study. Econometrics, 3(3), 654–666. https://doi.org/10.3390/econometrics3030654
- Galvao, A., Montes-Rojas, G., & Olmo, J. (2019). Tests of asset pricing with time-varying factor loads. Journal of Applied Econometrics, 34(5), 762–778. https://doi.org/10.1002/jae.2687
- Galvao, A., Parker, T., & Xiao, Z. (2021). Bootstrap inference for panel data quantile regression. https://arxiv.org/abs/2111.03626
- Giovannetti, B. C. (2013). Asset pricing under quantile utility maximization. Review of Financial Economics, 22(4), 169–179. https://doi.org/10.1016/j.rfe.2013.05.008
- Harding, M., & Lamarche, C. (2014). Estimating and testing a quantile regression model with interactive effects. Journal of Econometrics, 178, 101–113. https://doi.org/10.1016/j.jeconom.2013.08.010
- He, X., & Zhu, L. (2003). A lack-of-fit test for quantile regression. Journal of the American Statistical Association, 98(464), 1013–1022. https://doi.org/10.1198/016214503000000963
- Horowitz, J. L., & Lee, S. (2005). Nonparametric estimation of an additive quantile regression model. Journal of the American Statistical Association, 100(472), 1238–1249. https://doi.org/10.1198/016214505000000583
- Kapetanios, G. (2008). A bootstrap procedure for panel datasets with many cross-sectional units. The Econometrics Journal, 11(2), 377–395. https://doi.org/10.1111/j.1368-423X.2008.00243.x
- Kim, M. O. (2007). Quantile regression with varying coefficients. Annals of Statistics, 35(1), 92–108. https://doi.org/10.1214/009053606000000966
- Koenker, R., & Bassett, G. S. (1978). Regression quantiles. Econometrica, 46(1), 33–50. https://doi.org/10.2307/1913643
- Koenker, R., & Machado, J. A. F. (1999). Goodness of fit and related inference processes for quantile regression. Journal of the American Statistical Association, 94(448), 1296–1310. https://doi.org/10.1080/01621459.1999.10473882
- Koenker, R., & Xiao, Z. (2006). Quantile autoregression. Journal of the American Statistical Association, 101(475), 980–990. https://doi.org/10.1198/016214506000000672
- Kogan, L., & Papanikolaou, D. (2013). Firm characteristics and stock returns: The role of investment-specific shocks. The Review of Financial Studies, 26(11), 2718–2759. https://doi.org/10.1093/rfs/hht026
- Ma, S., Linton, O., & Gao, J. (2021). Estimation and inference in semiparametric quantile factor models. Journal of Econometrics, 222(1, Part B), 295–323. https://doi.org/10.1016/j.jeconom.2020.07.003
- Pagan, A. (1984). Econometric issues in the analysis of regressions with generated regressors. International Economic Review, 25(1), 221–247. https://doi.org/10.2307/2648877
- Pelger, M., & Xiong, R. (2019). State-varying factor models of large dimensions. Papers 1807.02248v2, arXiv.org.
- Pesaran, M. H. (2006). Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica, 74(4), 967–1012. https://doi.org/10.1111/j.1468-0262.2006.00692.x
- Portnoy, S. (1991). Asymptotic behavior of regression quantiles in nonstationary, dependent cases. Journal of Multivariate Analysis, 38(1), 100–113. https://doi.org/10.1016/0047-259X(91)90034-Y
- Song, M. (2013). Essays on large panel data analysis. Ph.D. thesis, Columbia University.
- Su, L., & Wang, X. (2017). On time-varying factor models: Estimation and testing. Journal of Econometrics, 198(1), 84–101. https://doi.org/10.1016/j.jeconom.2016.12.004
- Wei, Y., & He, X. (2006). Conditional growth charts (with discussion). Annals of Statistics, 34(5), 2069–2097. https://doi.org/10.1214/009053606000000623
- Yu, K., & Lu, Z. (2004). Local linear additive quantile regression. Scandinavian Journal of Statistics, 31(3), 333–346. https://doi.org/10.1111/j.1467-9469.2004.03_035.x
Appendix
Proof of Proposition 1.
The proof of this proposition follows from an application of the results in Song (Citation2013) and Ando and Bai (Citation2015) to local principal components. The main difference is that we are considering local approximations using the kernels. Define such that is a vector and is a matrix. Let such that and and such that is a vector. Similarly, such that is a vector. Let such that is a matrix and be a matrix.
For each individual in the cross section, Equation (6) in vector form is
and the OLS estimator of is
such that
Then, under assumptions A.2 and A.4, it follows that is positive definite. Now, using a similar decomposition to Proposition 1 of Song (Citation2013), we have
where and . Thus,
with . Then,
such that
Now, the quantities and satisfy that
and
such that as .
Furthermore, note that , with the errors of the mean regression model in assumption A.1, and , for any fixed . Therefore,
Now, taking the maximum over and , we obtain
Finally, noting that and as , the result in the proposition follows.
Proof of Proposition 2.
Let and be defined as in the text and define also . It follows from (14) that . Note also that , with a vector.
Then,
This expression can be decomposed as
Theorem 3.1 in Su and Wang (Citation2017) shows that expression (A.4) multiplied by converges in distribution to , where ; is the diagonal matrix consisting of the eigenvalues of in descending order; is the corresponding normalized eigenvector matrix such that , and .
To complete the proof we need to show that the remaining terms multiplied by are as , with . First, we show that as . To do this, we decompose the elements of the matrix given by for . More formally,
From Proposition 1, it follows that , as . Then, , for , as , such that , with as defined in the text below Equation (14). Then, it follows that . Therefore, using Assumption A.3 (ii) we have . Then, we need to prove that
Note also that , where , for any fixed . Then, the expression on the left hand side of (A.8) satisfies that
Now, noting that , for , and applying the law of large numbers with , we obtain condition (A.8).
Applying the same arguments to expressions (A.6) and (A.7), we obtain the consistency of the local factors to rotated versions of given by .
Proof of Proposition 3.
The proof of this proposition follows from the proof of Proposition 1 and the application of the results in Song (Citation2013) and Ando and Bai (Citation2015) to local principal components. For each individual in the cross section, Equation (6) in vector form is
and the OLS estimator of is
such that
Applying the results in the proof of Proposition 1, we have
We are interested in the asymptotic distribution of the entire vector . The above equation implies, stacking over
with and block-diagonal matrices with elements and . Then,
such that
given that . Furthermore, from Proposition 2, we have that . Then, , with an orthogonal rotation matrix and . Therefore,
Now, using Assumption A.6,
with .
Furthermore, each block and satisfies that and . Then, stacking over all the individuals, we define and block-diagonal matrices, such that it follows that
with . □
Proof of Proposition 4.
The proof of this result follows closely the proof of Theorem 3.2 in Su and Wang (Citation2017). It follows from (15) that . Then, replacing in this expression, we obtain
Operating with this expression, we obtain
with . Under assumption A.3 iii), , with
. Then,
It remains to see that as . Using expression (A.11), and multiplying by , this expression can be rearranged as
Therefore, the right hand side of the expression is equal to
Under assumption A.4 iv), . This implies that . Furthermore, . Now we need to show that . To show this, from A.6, it follows that , with a zero-mean normal random variable with variance . Then, applying the law of large numbers and the law of iterated expectations to , it follows that . Finally, by assumption A.2 i), this quantity converges to zero in probability.
Proof of Proposition 5.
For convenience, we reproduce the analytical expression of the estimators:
where . Then, replacing in the expression, we obtain
The first term has been analyzed in Su and Wang (Citation2017) and satisfies that
Under assumption A.3 i) as , where is an diagonal matrix. Under assumption A.3 ii) it holds that for each , where . Then, converges in distribution to , with . Now, it remains to see that as . To show this, note that
By the law of large numbers, we have . Then, applying the law of iterated expectations, under assumption A.2 (i), it follows that as . Furthermore, noting that as and , we obtain the desired result. □
Proof of Proposition 6.
This proof is based on Theorem 1 of Cai and Xu (Citation2008). The main difference is that we replace the observable covariates by estimated common factors such that the quantile factor model of interest is
with .
Following Cai and Xu (Citation2008), we consider a local polynomial expansion of the quantile parameters by . To simplify the proof, we consider a local linear approximation such that , that can be reparametrized as , and minimize the following local objective function:
Let and , for some as ; is an indicator function and is the prediction of the quantile model evaluated at . These sample covariance matrices are consistent estimators of and defined above. Furthermore, let , , , and , with as the identity matrix of dimension , and let
The above minimization problem can be rewritten as
Using the same steps as in Cai and Xu (Citation2008), we derive a local Bahadur representation of such that
with . Now, after simple algebra, we decompose this expression in four terms as
Under assumptions B.1-B.4, Cai and Xu (Citation2008) show that expression (A.20) converges in distribution to , with . In particular, to compute the asymptotic variance we rely on the mixing condition B.3 that limits the amount of serial dependence. More specifically,
The last term can be expressed as
Now, noting that
the above expression is
Furthermore, applying the Cauchy-Schwarz inequality to the first term, we have
Finally, using the mixing condition on in B.3, we obtain
and . Therefore,
The same derivations apply to such that expression (A.20) converges to .
For expression (A.21), we note that
with denoting a vector. Now, using Proposition 5, , as . Define . Then,
that converges to zero in probability as . To show this, consider the element
Under the law of large numbers, it follows that . Then, the above expression converges to zero if .
Now, the consistency of to , as , implies that and . Then, expressions (A.22) and (A.23) converge to zero in probability, and the asymptotic result in Proposition 6 follows. □