Price Impact Without Averaging

Pages 175-206 | Received 03 Jan 2024, Accepted 04 Jan 2024, Published online: 25 Jan 2024

ABSTRACT

We present a method to estimate price impact in order-driven markets that does not require averaging over executions or scenarios. Given order book data associated with one single execution of a sell metaorder, we estimate its contribution to price decrease during the trade. We do so by modelling the limit order book using a state-dependent Hawkes process, and by defining the price impact profile of the execution as a function of the compensator of the state-dependent Hawkes process. We apply our method to a dataset from NASDAQ, and we conclude that the scheduling of sell child orders has a bigger impact on price than their sizes.

1. Introduction

Price impact is the phenomenon whereby trade executions affect the price of the asset being traded. The price is affected in a way that is unfavourable to the trade, i.e., it decreases as a consequence of sell orders and it increases as a consequence of buy orders. Hence, price impact is often regarded as a hidden transaction cost, and price impact detection is considered a branch of transaction cost analysis.

The literature on price impact is extensive; it includes Torre (1997), Almgren et al. (2005), Moro et al. (2009), Tóth et al. (2011), Engle, Ferstenberg, and Russell (2012), Bacry et al. (2015), Brokmann et al. (2015), Zarinelli et al. (2015), Tóth, Eisler, and Bouchaud (2016) and Patzelt and Bouchaud (2018). An overview of the research done on price impact is provided by Bouchaud (2017).

Recently, however, Capponi and Cont (2019) have suggested that the empirical findings on price impact laws may be indistinguishable from the ubiquitous scaling of price variation, i.e., volatility, as a square root of time. Thus, they venture to question the existence of price impact altogether.

However, as observed in Bouchaud (2017), price impact can be regarded as an embodiment of the basic economic law of supply and demand, and it must play a role in price formation. This is to say that, despite the acknowledged entanglement between execution-induced price moves and volatility, the phenomenon of price impact must exist in a market with a price mechanism. The consensus among these authors is that, in order to detect price impact, one has to average across several executions and scenarios to attenuate the impact of volatility (see Bucci et al. 2019).

In this paper, we present a measurement of price impact in order-driven markets that does not require averaging over executions and scenarios. Given the order book data associated with one single execution of a sell metaorder (subdivided into smaller child orders distributed in time), we measure its contribution to price decrease during the trade.

Our motivation is twofold. On the one hand, the economic argument by which price impact must exist applies scenario-by-scenario and to every execution, other things being equal. Therefore, it makes sense to ask whether price impact is detectable without averaging. In other words, is averaging the only practical way to disentangle price impact from volatility? On the other hand, and from a more practical point of view, investors and traders might in fact be more interested in a trade-by-trade assessment of price impact than in statistics obtained by extensive averaging. This was pointed out in the aforementioned article by Capponi and Cont (2019), who argue that for institutional investors who execute infrequent but large trades, ‘the observed impact may deviate significantly from such an average’.

Our method to measure price impact is based on a granular model of order book dynamics. Our model extends that of Bacry and Muzy (2014), where a four-dimensional Hawkes process was used to describe arrivals of market orders into a limit order book and price changes. We increase the granularity of the model by describing price changes via a state variable that evolves in time, and by tracking not only market orders but also limit orders. The Hawkes process and state variable are modelled using a class of hybrid marked point processes introduced in Morariu-Patrichi and Pakkanen (2018), more specifically, state-dependent Hawkes processes.

The so-increased granularity allows us to: (i) have a snapshot-by-snapshot proxy for outstanding volume on every level of the order book, and (ii) assess the impact that a particular trader has on the market without assuming that their orders walk the book with the same frequency as the orders from other market participants, as assumed in Bacry and Muzy (2014).

For an averaged measure of price impact, one fixes a trade direction for order executions and assesses price movements across several scenarios. Since in these scenarios the signs of all other market variables on average cancel out, an average price movement emerging during these executions represents the impact sought. If we instead eschew averaged measures of price impact, we cannot count on this average cancellation of all market fluctuation other than the trade direction of the considered execution. This poses the counterfactual of what would have happened if the trade under consideration had not been executed. In other words, the execution-induced price movement during the single observed scenario has to be disentangled from the price movement induced by all the surrounding market noise. We do so by formalizing our notion of price impact without averaging around the concept of compensator of a counting process. This allows us to focus on a single realization of the process and yet factor out its volatility. The idea of using compensators of Hawkes processes is reminiscent of existing approaches in the literature, for example in Jusselin and Rosenbaum (2020) and Rosenbaum and Tomas (2021).

We demonstrate our approach by assessing price impact on NASDAQ data for the ticker INTC. Since the identities of market participants behind the observed market events are not disclosed, we cannot directly measure the impact of a chosen execution among those in the raw data. Instead, we calibrate our market model on the provided data, and then we simulate a liquidation in the calibrated model. By so doing, we illustrate the usefulness of the class of market simulators to which our model belongs.

More precisely, we fix the size of the metaorder to be executed, and we examine how the price impact of such an execution varies with (i) the size of the child orders (whether they are likely to consume all available liquidity on the first price level of the limit order book or not), and (ii) the scheduling of the child orders (whether they cluster around particular times or are evenly distributed during the execution). Empirically, we find that the clustering of child orders has a bigger impact on the price than their sizes.

The paper is organized as follows. Section 2 gives some background about order-driven markets and limit order books. Section 3 recalls basic notions from the theory of counting processes and establishes the notation used in the paper. Section 4 describes our model for order-driven markets based on state-dependent Hawkes processes. Section 5 defines price impact in our model, and describes how we quantify it. Section 6 presents applications of our modelling framework to a dataset from NASDAQ, and provides some insights on the main drivers of price impact in this dataset. Section 7 concludes.

2. Background on Order-Driven Markets and Limit Order Books

Order-driven markets are trading venues organized around limit orders. A limit order is the fundamental action that market participants in order-driven markets can perform. It is represented by a 4-tuple $(t,q,p,d)$, where $t$ denotes time, $q$ denotes size, $p$ denotes price and $d$ denotes direction. We say that a market participant posts a limit order $(t,q,p,d)$ if at time $t$ they submit to the exchange their commitment to buy ($d=1$) or sell ($d=-1$) the amount $q$ at the price $p$. The price $p$ is interpreted as the highest price at which they are committed to buy if $d=1$, or the lowest price at which they are committed to sell if $d=-1$. By regulation, the price $p$ must be an integer multiple of some fixed $\psi>0$ known as the tick size. Market participants also have the possibility to withdraw their commitments to trade, by cancelling a previously submitted limit order.

A trading epoch in an order-driven market is the collection of all the limit orders submitted by market participants within some time interval. Limit orders cannot be submitted simultaneously, so that a limit order is identified by its time stamp $t$. We express this mathematically by saying that a trading epoch is the graph of a function from a subset of the positive half-line to the space $\{(q,p,d): q\geq 0,\ p\geq 0,\ d=-1,1\}$.

When a market participant submits a limit order (t,q,p,d), a matching algorithm run by the trading platform looks for possible counterparts to their order: if other participants' commitments to trade exist to (partially) fulfil the order (at a price not worse than p from the submitter's point of view), then their order is (partially) cleared. The fraction of their order that is cleared disappears from the market, and it is said to be executed; the fraction that could not be fulfilled is recorded in the limit order book, waiting for counterparts to trade with.

A limit order $(t,q,p,d)$ in a trading epoch is said to be active (or outstanding) at time $u$ if $t\leq u$ and by time $u$ the order has neither been executed nor been cancelled.

The search for possible trading counterparts for the matching of the incoming order $(t,q,p,d)$ happens among the active orders with time stamp $s<t$ and opposite direction $-d$. We say that the active order $(s,\rho,\pi,-d)$ matches the incoming order $(t,q,p,d)$ if $\pi d\leq p d$; in this case, the two orders are executed one against the other, and $\rho\wedge q$ shares are transacted at the price $\pi$ per share. After the match, the matched orders are replaced with the orders $(s,\tilde\rho,\pi,-d)$ and $(t,\tilde q,p,d)$, where $\tilde\rho=\rho-\rho\wedge q$ and $\tilde q=q-\rho\wedge q$. At least one of them has null size. If $\tilde\rho=0$, i.e., $q\geq\rho$, the order with time stamp $s$ ceases to be active and it is deleted from its queue. If $\tilde q=0$, i.e., $q\leq\rho$, then the order with time stamp $t$ (i.e., the incoming order) is fully executed, and the search for possible matching counterparts stops. If instead $\tilde q>0$, i.e., $q>\rho$, then the search continues among active orders with opposite direction $-d$.
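The matching rule above can be sketched in a few lines of Python. This is only an illustration: the names `Order` and `match_order` are hypothetical, and the order in which candidate counterparts are visited (best price first, then earliest time stamp) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Order:
    t: float  # time stamp
    q: float  # remaining size
    p: float  # limit price
    d: int    # direction: +1 buy, -1 sell


def match_order(incoming: Order, active: List[Order]) -> List[Tuple[float, float]]:
    """Match `incoming` against active orders of opposite direction.

    Returns the executed (price, size) pairs. Sizes are updated in place and
    fully executed resting orders are removed from `active`.
    Assumption: candidates are visited by best price, then earliest time.
    """
    d = incoming.d
    candidates = sorted(
        (o for o in active if o.d == -d and o.t < incoming.t),
        key=lambda o: (o.p * d, o.t),  # lowest asks first for a buy, highest bids first for a sell
    )
    trades = []
    for resting in candidates:
        if incoming.q <= 0:
            break  # incoming order fully executed: the search stops
        if resting.p * d > incoming.p * d:
            break  # matching condition pi*d <= p*d fails from here on
        traded = min(resting.q, incoming.q)  # rho ^ q units are transacted
        trades.append((resting.p, traded))
        resting.q -= traded
        incoming.q -= traded
        if resting.q == 0:
            active.remove(resting)  # resting order ceases to be active
    return trades
```

Whatever remains in `incoming.q` after the call is the part of the order that would be recorded in the book.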

Definition 2.1

Let $E$ be a trading epoch in an order-driven market. The ask order queue $A_t$ at time $t$ is defined as $A_t:=\{(s,q,p,-1)\in E: (s,q,p,-1)\text{ is active at time }t\}$. Similarly, the bid order queue $B_t$ at time $t$ is defined as $B_t:=\{(s,q,p,+1)\in E: (s,q,p,+1)\text{ is active at time }t\}$.

A limit order book (LOB) is a grid of equally spaced prices at which active limit orders sit. The space between consecutive grid nodes is called the tick size of the LOB, which we denote by $\psi$. All orders submitted to the exchange must have prices that are integer multiples of the tick size. Prices are increasing from left to right. At every node of the grid, outstanding limit orders to buy or sell at the corresponding price are collected; these limit orders are also said to be queuing there. Spontaneously, i.e., by market forces, such orders organize in a way that buy offers are placed on the left (the so-called bid side) and sell offers on the right (the so-called ask side). Indeed, if there were a buy (resp., sell) limit order to the right (resp., left) of some sell (resp., buy) limit orders, or at the same node, then the matching algorithm would have matched them, clearing them out from the order book. Therefore, the whole configuration of a limit order book at time $t$ is given when the following variables are specified:

  1. the best ask price $P_t^a$, i.e., the lowest price at which one can find sell offers active at time $t$;

  2. the best bid price $P_t^b$, i.e., the highest price at which one can find buy offers active at time $t$;

  3. the volume $V_t^{a,i}$ of ask offers at price $P_t^{a,i}=P_t^a+(i-1)\psi$, for $i=1,2,\dots$, i.e., the quantity $V_t^{a,i}=\sum_{\{(s,q,P_t^{a,i},-1)\in A_t:\, s\leq t\}}q$, where the sum is over all the active sell limit orders submitted by time $t$ for a price of $P_t^a+(i-1)\psi$;

  4. the volume $V_t^{b,i}$ of bid offers at price $P_t^{b,i}=P_t^b-(i-1)\psi$, for $i=1,2,\dots$, i.e., the quantity $V_t^{b,i}=\sum_{\{(s,q,P_t^{b,i},+1)\in B_t:\, s\leq t\}}q$, where the sum is over all the active buy limit orders submitted by time $t$ for a price of $P_t^b-(i-1)\psi$.

The whole configuration of a limit order book at time $t$ is described by the variables $(P_t^a,P_t^b,\{(V_t^{a,i},V_t^{b,i}): i=1,2,\dots\})$. Apart from these variables, other derived quantities are useful to assess the properties of an order book. These are the spread, the mid-price and the volume imbalance (or queue imbalance).

The spread at time $t$ is the distance $|P_t^a-P_t^b|$ between best ask price and best bid price, which we denote by $\phi_t$. The mid-price at time $t$ is the mid point between $P_t^b$ and $P_t^a$, which we denote by $P_t^m$, and is given by $P_t^m=(P_t^a+P_t^b)/2$. Finally, the $n$-levels volume imbalance (or queue imbalance) at time $t$, denoted $I_t^n$, is the normalized excess of limit orders on the first $n$ levels of the bid side compared to the limit orders on the first $n$ levels of the ask side, namely $$I_t^n=\frac{\sum_{i\leq n}V_t^{b,i}-\sum_{i\leq n}V_t^{a,i}}{\sum_{i\leq n}V_t^{b,i}+\sum_{i\leq n}V_t^{a,i}},\tag{1}$$ where $\sum_{i\leq n}V_t^{b,i}$ (resp., $\sum_{i\leq n}V_t^{a,i}$) is the cumulative volume on the first $n$ bid (resp., ask) levels. The queue imbalance $I^n$ is widely accepted as a reliable signal for the next mid-price move (see Cartea, Donnelly, and Jaimungal 2018): when it is close to $-1$ the mid-price will likely decrease, and when it is close to $+1$ it will likely increase.
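As a small illustration of these derived quantities, the sketch below computes the spread, the mid-price and the $n$-levels queue imbalance of Equation (1) from a book snapshot. The function names and the list-based representation of the volumes (level 1 first) are assumptions made for the example.

```python
def spread(best_ask, best_bid):
    """Distance between best ask and best bid."""
    return abs(best_ask - best_bid)


def mid_price(best_ask, best_bid):
    """Mid point between best bid and best ask."""
    return (best_ask + best_bid) / 2


def queue_imbalance(bid_volumes, ask_volumes, n):
    """n-levels queue imbalance: (bid - ask) / (bid + ask) over the first n levels.

    bid_volumes[i-1] and ask_volumes[i-1] hold the volumes at level i.
    """
    vb = sum(bid_volumes[:n])
    va = sum(ask_volumes[:n])
    return (vb - va) / (vb + va)


# Example snapshot: more volume on the bid side gives a positive imbalance.
print(queue_imbalance([300, 100], [100, 100], n=1))
```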

The arrival of a limit order to the market triggers two events: one is the consumption of the liquidity capable of (partially) servicing the limit order, the other is the addition of the non-executed part of the limit order to the appropriate queue in the order book. Proposition 2.2 states the terms of this decomposition.

Proposition 2.2

Given the configuration $(P_t^a,P_t^b,\{(V_t^{a,i},V_t^{b,i}): i=1,2,\dots\})$ of the limit order book immediately before time $t$, processing the sell limit order $(t,q,p,-1)$ is equivalent to processing the pair of orders $[(t,q_M^-,0,-1),(t,q-q_M^-,p,-1)]$, with the former having priority over the latter, and where $q_M^-:=\min\big(q,\sum_{i\geq 1}V_t^{b,i}\,1\{P_t^{b,i}\geq p\}\big)$. Similarly, processing the buy limit order $(t,q,p,+1)$ is equivalent to processing the pair of orders $[(t,q_M^+,\infty,+1),(t,q-q_M^+,p,+1)]$, where $q_M^+:=\min\big(q,\sum_{i\geq 1}V_t^{a,i}\,1\{P_t^{a,i}\leq p\}\big)$.

From this point forward, given a limit order $(t,q,p,d)$, we let $q_M=q_M^+$ if $d=+1$, and $q_M=q_M^-$ if $d=-1$. The first component $(t,q_M,0,-1)$ in the decomposition $[(t,q_M,0,-1),(t,q-q_M,p,-1)]$ of the sell limit order $(t,q,p,-1)$ is called a sell market order. Notice that in the 4-tuple $(t,q_M,0,-1)$, the price $p$ is set to zero, and this guarantees immediate execution: the sale of the amount $q_M$ is instantaneously matched with outstanding buy limit orders on the bid side, and no fraction of $q_M$ is put into the queue. On the contrary, the second component $(t,q-q_M,p,-1)$ specifies exactly that part of $(t,q,p,-1)$ which will be queued. Similarly, $(t,q_M,\infty,+1)$ represents the market order component of the buy limit order $(t,q,p,+1)$, and it is referred to as a buy market order. The price $p=\infty$ is a way to express the fact that a counterpart for the purchase of the amount $q_M$ will instantaneously be found on the ask side of the order book. In the following, the term ‘market order’ (either buy or sell) will refer to the first component in the decomposition of a limit order with non-zero market order size $q_M>0$.
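The decomposition of Proposition 2.2 can be illustrated with a short sketch. The function name `decompose` and the representation of each book side as a list of `(price, volume)` pairs are hypothetical choices for this example.

```python
import math


def decompose(t, q, p, d, bid_levels, ask_levels):
    """Split the limit order (t, q, p, d) into its market and queued parts.

    bid_levels / ask_levels: lists of (price, volume), level 1 first,
    describing the book immediately before time t.
    Returns ((t, q_M, market_price, d), (t, q - q_M, p, d)).
    """
    if d == -1:
        # Sell: liquidity on bid levels priced at or above the limit price p.
        marketable = sum(v for price, v in bid_levels if price >= p)
        market_price = 0.0
    else:
        # Buy: liquidity on ask levels priced at or below the limit price p.
        marketable = sum(v for price, v in ask_levels if price <= p)
        market_price = math.inf
    q_m = min(q, marketable)
    return (t, q_m, market_price, d), (t, q - q_m, p, d)


bids = [(10.0, 50), (9.9, 80)]
asks = [(10.1, 40), (10.2, 60)]
market_part, queued_part = decompose(1.0, 100, 9.9, -1, bids, asks)
# The market part walks the book when its size exceeds the first bid level.
walks_the_book = market_part[1] > bids[0][1]
```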

A sell market order $(t,q_M,0,-1)$ is said to ‘walk the book’ if $q_M>V_t^{b,1}$. Similarly, a buy market order walks the book if its size is larger than the volume of offers on the first ask level.

Given a time window $[0,T]$, the evolution in time of the limit order book $\{(P_t^a,P_t^b,\{(V_t^{a,i},V_t^{b,i}): i=1,2,\dots\}): 0\leq t\leq T\}$ results from the history of all limit order submissions, cancellations and executions that happened in $[0,T]$. If every limit order in $\{(t,q,p,d): 0\leq t\leq T\}$ is decomposed as in Proposition 2.2, then the seller-initiated trades that happened in $[0,T]$ are $\{(t,q_M,0,-1): q_M>0,\ 0\leq t\leq T\}$, and the buyer-initiated trades that happened in $[0,T]$ are $\{(t,q_M,\infty,+1): q_M>0,\ 0\leq t\leq T\}$. Hence, in the following we will identify trades with market orders of non-zero size $q_M>0$.

3. Background on Counting Processes and Hawkes Processes

3.1. Counting Processes

In this section we introduce our notation for counting processes, and we review basic concepts from the theory of such processes. Our main reference is Daley and Vere-Jones (2008, Chapter 14).

Let $d_E$ be a positive integer. For each $e$ ranging from 1 to $d_E$, let $T_j^e$, $j=1,2,\dots$, be a strictly increasing sequence of positive random times, and assume that $T_j^e\neq T_{j'}^{e'}$ if $(e,j)\neq(e',j')$. Then, $N_e(t):=\sum_j 1\{T_j^e\leq t\}$, $t\geq 0$, is a non-decreasing right-continuous process; we call $N_e$ the counting process associated with the sequence $(T_j^e)_j$. Notice that $(T_j^e)_j$ can be retrieved from $N_e$ by $T_j^e=\inf\{t>0: N_e(t)\geq j\}$; hence, there is a one-to-one correspondence between $N_e$ and $(T_j^e)_j$.

For $t>0$ we define $\delta N_e(t):=\lim_{h\downarrow 0}(N_e(t)-N_e(t-h))$, and we notice that $\delta N_e(t)=1$ if and only if $t=T_j^e$ for some $j$, otherwise $\delta N_e(t)=0$.

The $d_E$-dimensional vector $N(t)=(N_1(t),\dots,N_{d_E}(t))$ is referred to as the multivariate counting process associated with the $d_E$ sequences $(T_j^e)_j$, $e=1,\dots,d_E$. Let $N_g(t):=N_1(t)+\dots+N_{d_E}(t)$ be the ground process of $N$, and let $T_n:=\inf\{t>0: N_g(t)\geq n\}$, $n=1,2,\dots$, be the ordered sequence of random times stemming from the union $\{T_j^e: j=1,2,\dots;\ e=1,\dots,d_E\}$. By defining, for $n=1,2,\dots$, $E_n:=\sum_{e=1}^{d_E}e\,1\{\delta N_g(T_n)=\delta N_e(T_n)\}$, we have that the pair $(T_n,E_n)$ equivalently characterizes the multivariate counting process, because $$N_e(t)=\sum_n 1\{T_n\leq t,\ E_n=e\},\tag{2}$$ for all $t>0$ and all $e=1,\dots,d_E$.

We can interpret this construction by saying that the index $e$ labels $d_E$ types of events that occur in time, and $N_e(t)$ counts the number of events of type $e$ that have occurred by time $t$.

Example 3.1

Poisson process

Let $\tau_j^e$, $j=1,2,\dots$, $e=1,\dots,d_E$, be independent random variables such that $\tau_j^e$ is exponentially distributed with parameter $\lambda_e>0$, $e=1,\dots,d_E$. Let $T_j^e:=\sum_{k\leq j}\tau_k^e$, and notice that $T_j^e$ has probability density function $f_{e,j}(t)=\frac{\lambda_e^j}{(j-1)!}t^{j-1}e^{-\lambda_e t}1\{t>0\}$. Then, the multivariate counting process $N$ associated with the arrival times $T_j^e$ is called the $d_E$-dimensional Poisson process of rates $\lambda_1,\dots,\lambda_{d_E}$. This name is justified as follows. Since $\{N_e(t)\geq j\}=\{T_j^e\leq t\}$, we have that $\frac{d}{dt}P(N_e(t)\geq j)=f_{e,j}(t)$. On the other hand, if we define $S_{e,j}(t):=\sum_{k\geq j}\frac{(\lambda_e t)^k}{k!}e^{-\lambda_e t}$, we also have that $\frac{d}{dt}S_{e,j}(t)=f_{e,j}(t)$, by telescopic sum. Since $S_{e,j}(0)=0=P(N_e(0)\geq j)$ for $j\geq 1$, we deduce that $P(N_e(t)\geq j)=S_{e,j}(t)$, and that $P(N_e(t)=j)=\frac{(\lambda_e t)^j}{j!}\exp(-\lambda_e t)$. Therefore for every $t$, $N_e(t)$ is a Poisson random variable of parameter $\lambda_e t$, and the ground process $N_g$ of $N$ is such that for every $t$, $N_g(t)\sim\mathrm{Pois}(\lambda_1 t+\dots+\lambda_{d_E}t)$.
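For completeness, the telescoping step in the example can be written out. Differentiating the series term by term, for $t>0$ and $j\geq 1$, each term contributes a positive part that cancels the negative part of the preceding term, so only the first positive part survives:

```latex
\begin{aligned}
\frac{d}{dt}S_{e,j}(t)
&=\sum_{k\geq j}\left[\lambda_e\frac{(\lambda_e t)^{k-1}}{(k-1)!}
  -\lambda_e\frac{(\lambda_e t)^{k}}{k!}\right]e^{-\lambda_e t}\\
&=\lambda_e\frac{(\lambda_e t)^{j-1}}{(j-1)!}\,e^{-\lambda_e t}
 =\frac{\lambda_e^{j}}{(j-1)!}\,t^{j-1}e^{-\lambda_e t}
 =f_{e,j}(t).
\end{aligned}
```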

The minimal filtration to which a multivariate counting process N is adapted – and such that it satisfies the usual conditions of completeness and right-continuity – is called the internal history of N. Any other filtration to which N is adapted is called a history of N, and it must be a superset of the internal history.

Definition 3.2

Let $(\Omega,\mathbb{F}=(\mathcal{F}_t)_t,P)$ be a filtered probability space where the multivariate counting process $N$ is defined, and assume that $\mathbb{F}$ is a history of $N$. We say that the $d_E$-dimensional stochastic process $\Lambda=(\Lambda_1,\dots,\Lambda_{d_E})$ is an $\mathbb{F}$-compensator for $N$ if: (i) $\Lambda(0)=0$ and $\Lambda$ is of finite variation; (ii) $\Lambda$ is $\mathbb{F}$-predictable; (iii) $\Lambda$ is right-continuous; (iv) $N-\Lambda$ is a local martingale.

Given the counting process $N$ and a history $\mathbb{F}$, the $\mathbb{F}$-compensator is unique up to an evanescent set, and it is equivalently characterized as the $\mathbb{F}$-predictable projection of $N$, namely as the $\mathbb{F}$-predictable non-decreasing process $\Lambda$ such that $$E\left[\int_{\mathbb{R}_+}Y\,dN\right]=E\left[\int_{\mathbb{R}_+}Y\,d\Lambda\right]\tag{3}$$ for all non-negative $\mathbb{F}$-predictable processes $Y$ (see Daley and Vere-Jones 2008, Proposition 14.2.II).

If $\Lambda$ is absolutely continuous, we write $\Lambda(t)=\int_0^t\lambda(s)\,ds$, for some $\mathbb{F}$-predictable process $\lambda=(\lambda_1,\dots,\lambda_{d_E})$, which is called the intensity of the counting process $N$. Combining this with Equation (3), one obtains the formula $E[N_e(t)-N_e(s)\,|\,\mathcal{F}_s]=E\left[\int_s^t\lambda_e(u)\,du\,\Big|\,\mathcal{F}_s\right]$, $s\leq t$, which allows us to interpret $\lambda_e(t)$ as a measure of the ‘instantaneous risk’ of a jump at time $t$ in the $e$th component of the counting process $N$. Notice that this ‘risk’ evolves in time and it varies depending on the information available up to time $s$.

Compensators are crucial in the following time-change result, which will be used to perform goodness-of-fit diagnostics (see Section 6.2).

Theorem 3.3

(Meyer 1971)

Let $N$ be a $d_E$-dimensional counting process with arrival times $T_j^e$. Assume that $N$ has continuous compensator $\Lambda$ such that $\Lambda_e(t)\to\infty$ as $t\to\infty$ for all $e=1,\dots,d_E$. Then, the random sequences $\{\Lambda_e(T_j^e): j=1,2,\dots\}$, $e=1,\dots,d_E$, are the arrival times of a $d_E$-dimensional unit-rate Poisson process, namely the time-changed inter-arrival times $$\tau_j^e:=\Lambda_e(T_j^e)-\Lambda_e(T_{j-1}^e)\tag{4}$$ (with $T_0^e:=0$) are all independent unit-rate exponentially distributed random variables for $j=1,2,\dots$ and $e=1,\dots,d_E$.

For a proof of Theorem 3.3, see Brown and Nair (1988).
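To make the use of Theorem 3.3 concrete, the sketch below computes the time-changed inter-arrival times of Equation (4) in the simplest non-trivial case. It assumes a univariate Hawkes process with exponential kernel $\kappa(u)=\alpha e^{-\beta u}$, for which the compensator has the closed form $\Lambda(t)=\nu t+\sum_{T_i<t}\frac{\alpha}{\beta}\big(1-e^{-\beta(t-T_i)}\big)$; the function names and the parametrisation are illustrative choices, not the calibration used in the paper.

```python
import math


def hawkes_compensator(t, event_times, nu, alpha, beta):
    """Compensator at time t of a univariate Hawkes process with
    base rate nu and kernel alpha * exp(-beta * u) (an assumed kernel)."""
    return nu * t + sum(
        (alpha / beta) * (1.0 - math.exp(-beta * (t - s)))
        for s in event_times
        if s < t
    )


def time_changed_residuals(event_times, nu, alpha, beta):
    """Differences Lambda(T_j) - Lambda(T_{j-1}), with T_0 = 0."""
    lam = [hawkes_compensator(t, event_times, nu, alpha, beta) for t in event_times]
    return [b - a for a, b in zip([0.0] + lam[:-1], lam)]


residuals = time_changed_residuals([0.5, 1.5, 3.0], nu=2.0, alpha=0.8, beta=1.6)
```

If the model is well specified, such residuals should resemble independent unit-rate exponential draws, which is what a goodness-of-fit diagnostic would then test.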

3.2. Multidimensional Hawkes Processes

In this section, we recall the basics of the theory of state-dependent Hawkes processes from Morariu-Patrichi and Pakkanen (2018, 2022).

Definition 3.4

A $d_E$-dimensional counting process $N$ is called a Hawkes process if it admits an absolutely continuous compensator $\Lambda$ with intensities $$\lambda_e(t)=\nu_e+\sum_{e'=1}^{d_E}\int_0^t\kappa_{e',e}(t-s)\,dN_{e'}(s),\quad e=1,\dots,d_E,\tag{5}$$ for some non-negative base rates $\nu_e\geq 0$, and some non-negative locally integrable functions $\kappa_{e',e}\geq 0$ that are supported on the non-negative half line.

The matrix-valued function $t\mapsto[\kappa_{e',e}(t)]_{e',e=1,\dots,d_E}$ is referred to as the kernel of the Hawkes process $N$. If all the kernel functions are integrable, the spectral radius $\rho$ of the $d_E\times d_E$ matrix of $L^1$ norms $\|\kappa_{e',e}\|_1$ is called the radius of the Hawkes kernel; if some of the kernel functions are not integrable, the spectral radius is set to $+\infty$.

A $d_E$-dimensional Hawkes process is asymptotically stationary if the radius of its kernel is smaller than 1; in this case the intensity process $\lambda$ is asymptotically stationary.
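The stationarity condition can be checked numerically once a kernel is chosen. The sketch below assumes exponential kernels $\kappa_{e',e}(t)=\alpha_{e',e}e^{-\beta_{e',e}t}$, whose $L^1$ norms are $\alpha_{e',e}/\beta_{e',e}$, and estimates the spectral radius by power iteration. Both the kernel family and the helper names are assumptions for illustration, and the power iteration as written is only reliable for matrices with strictly positive entries.

```python
def kernel_norms(alpha, beta):
    """L1 norms alpha/beta of exponential kernels alpha * exp(-beta * t)."""
    return [[a / b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(alpha, beta)]


def spectral_radius(m, iterations=500):
    """Power iteration for the spectral radius of a positive square matrix."""
    n = len(m)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)  # growth factor, since v is scaled to have maximum 1
        if rho == 0.0:
            return 0.0
        v = [x / rho for x in w]
    return rho


alpha = [[1.0, 0.4], [0.2, 0.6]]
beta = [[2.0, 2.0], [2.0, 2.0]]
radius = spectral_radius(kernel_norms(alpha, beta))
is_asymptotically_stationary = radius < 1.0
```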

Let $S$ be a finite state space. We can label its elements as $x=1,\dots,d_S$, where $d_S$ is the number of possible states of the system. A state-dependent counting process is a pair $(N,X)$, where for all $t$, $N(t)$ records the number of events occurred by time $t$ as per formula (2), and $X(t)$ records the state of the system at time $t$. More specifically, we have:

Definition 3.5

(Morariu-Patrichi and Pakkanen 2022, Definition 2.1)

Let $N$ be a $d_E$-dimensional counting process. Let $X$ be a continuous-time piecewise-constant process in the finite state space $S$ of cardinality $d_S$. Let $\mathbb{F}$ be the minimal complete right-continuous filtration generated by the pair $(N,X)$. Then, we say that $(N,X)$ is a state-dependent Hawkes process if

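  1. $N$ admits an absolutely continuous $\mathbb{F}$-compensator with intensities $$\lambda_e(t)=\nu_e+\sum_{e'=1}^{d_E}\int_{[0,t)}\kappa_{e',e}(t-s,X(s))\,dN_{e'}(s),\quad e=1,\dots,d_E,\tag{6}$$ for some $d_E$ non-negative base rates $\nu_e\geq 0$, $e=1,\dots,d_E$, and some $d_E^2$ measurable functions $\kappa_{e',e}:\mathbb{R}_+\times S\to\mathbb{R}_+$, $e',e=1,\dots,d_E$, such that $\kappa_{e',e}(\cdot,x)$ is locally integrable for all $x$ in $S$;

  2. $X$ jumps only at arrival times $T_n$ of $N$, and there exist $d_E$ transition matrices $\phi_e(\cdot,\cdot)$, $e=1,\dots,d_E$, defined on $S$ such that for all $n$ $$P\big(X(T_n)=x\,\big|\,E_n,\mathcal{F}_{T_n-}\big)=\phi_{E_n}(X(T_n-),x),\quad x=1,\dots,d_S,\tag{7}$$ where $X(T_n-)=\lim_{t\uparrow T_n}X(t)$ is the state of the system immediately before the $n$th event $E_n$, and $\mathcal{F}_{T_n-}=\bigvee_{\epsilon>0}\mathcal{F}_{T_n-\epsilon}$ represents the information available immediately before this event.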

Given a state-dependent Hawkes process $(N,X)$, let $T_n$ and $E_n$ be the sequences of arrival times and events that equivalently describe the counting process component $N$ of the pair $(N,X)$, as per Equation (2). Let $X_n$ be the sequence of states $X(T_n)$, for $n=1,2,\dots$. Then, the $d_E d_S$-dimensional counting process $$\tilde N_{e,x}(t):=\sum_n 1\{T_n\leq t,\ E_n=e,\ X_n=x\}\tag{8}$$ is called the hybrid-MPP counterpart of $(N,X)$. We have that the $j$th jump time $T_j^e$ of the $e$th component of $N$ is the $j$th order statistic of $\{T_k^{e,x}: k=1,2,\dots;\ x=1,\dots,d_S\}$, where $(T_k^{e,x})_k$ are the jump times of the $(e,x)$th component of $\tilde N$. Similarly, $T_n$ is the $n$th order statistic of $\{T_k^{e,x}: k=1,2,\dots;\ e=1,\dots,d_E;\ x=1,\dots,d_S\}$. The $(e,x)$th component $\tilde N_{e,x}$ of the hybrid-MPP counterpart of $(N,X)$ admits a continuous compensator with density given by $$\tilde\lambda_{e,x}(t)=\phi_e(X(t),x)\left(\nu_e+\sum_{e',x'}\int_{[0,t)}\kappa_{e',e}(t-s,x')\,d\tilde N_{e',x'}(s)\right),\tag{9}$$ where $\phi_e$ is the transition matrix associated with event type $e$, and $\kappa_{e',e}$, for $e',e=1,\dots,d_E$, are the Hawkes kernels of $N$.

Let $\bar\lambda=\lambda_1+\dots+\lambda_{d_E}$ be the sum of the intensities. If $\bar\lambda$ is decreasing in time, then a state-dependent Hawkes process $(T_n,E_n,X_n)$ can be simulated as detailed in Algorithm 3.1.
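The following is a minimal simulation sketch based on the standard thinning idea, and is not a transcription of Algorithm 3.1. It assumes exponential kernels $\kappa_{e',e}(t,x)=\alpha_{e',e}(x)e^{-\beta t}$ with a single decay rate $\beta$, so that the total intensity decreases between consecutive events and its current value is a valid upper bound. Event types and states are 0-based in the code, and all names are hypothetical.

```python
import math
import random


def simulate_sd_hawkes(nu, alpha, beta, phi, x0, horizon, rng):
    """Simulate (T_n, E_n, X_n) on [0, horizon] by thinning.

    nu[e]           : base rate of event type e
    alpha[e1][e][x] : jump in the intensity of type e caused by an event of
                      type e1 that leaves the system in state x
    beta            : common exponential decay rate (an assumption)
    phi[e][x][y]    : probability of moving from state x to y at an event of type e
    """
    d_e = len(nu)
    t, x = 0.0, x0
    excite = [0.0] * d_e  # self/cross-excitation part of each intensity at time t
    events = []
    while True:
        lam_bar = sum(nu) + sum(excite)  # bound: intensities only decay until the next event
        if lam_bar <= 0.0:
            break
        w = rng.expovariate(lam_bar)  # waiting time to the next candidate
        if t + w > horizon:
            break
        t += w
        decay = math.exp(-beta * w)
        excite = [a * decay for a in excite]
        lam = [nu[e] + excite[e] for e in range(d_e)]
        u = rng.random() * lam_bar
        if u > sum(lam):
            continue  # candidate rejected
        acc, e = 0.0, 0
        for e in range(d_e):  # pick the event type proportionally to the intensities
            acc += lam[e]
            if u <= acc:
                break
        # State update as in Equation (7): draw the new state from phi_e(x, .)
        x = rng.choices(range(len(phi[e][x])), weights=phi[e][x])[0]
        # Kernel in Equation (6) is evaluated at the state X(T_n) after the event
        for e2 in range(d_e):
            excite[e2] += alpha[e][e2][x]
        events.append((t, e, x))
    return events
```

A call such as `simulate_sd_hawkes(nu, alpha, 1.0, phi, 0, 50.0, random.Random(0))` returns the list of simulated triples; fixing the seed makes the run reproducible.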

4. State-Dependent Hawkes Model

We consider four streams of random times: the stream $(T_j^1)_j$ of times when limit orders are executed on the bid side (equivalently identified with the arrival times of sell market orders); the stream $(T_j^2)_j$ of times when limit orders are executed on the ask side (equivalently identified with arrival times of buy market orders); the stream $(T_j^3)_j$ of times when either an ask limit order is inserted inside the spread, or the cancellation of a bid limit order depletes the liquidity available at the first bid level; the stream $(T_j^4)_j$ of times when either a bid limit order is inserted inside the spread, or the cancellation of an ask limit order depletes the liquidity available at the first ask level.

The four sequences of random times give rise to a four-dimensional counting process $N=(N_1,\dots,N_4)$ with the following interpretation of its components:

  1. $N_1(t)$ denotes the number of seller-initiated trades that happened before or at time $t$ (identified with the number of market orders arrived on the bid side of the order book by time $t$);

  2. $N_2(t)$ denotes the number of buyer-initiated trades that happened before or at time $t$ (identified with the number of market orders arrived on the ask side of the order book by time $t$);

  3. $N_3(t)$ denotes the number of decreases in the mid-price caused by a limit order insertion or cancellation that happened before or at time $t$;

  4. $N_4(t)$ denotes the number of increases in the mid-price caused by a limit order insertion or cancellation that happened before or at time $t$.

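The counting process $N$ is paired with the state variable $X$. At time $t$, the state variable $X(t)$ summarizes the configuration $(P_t^b,P_t^a,\{(V_t^{b,i},V_t^{a,i}): i=1,2,\dots\})$ of the limit order book at time $t$, by recording a proxy for the $n$-levels volume imbalance, and the variation of the mid-price at the most recent event time. More precisely, $$X(t)=\begin{pmatrix}X_1(t)\\ X_2(t)\end{pmatrix}:=\begin{pmatrix}1\{\delta P^m(\hat T(t))>0\}-1\{\delta P^m(\hat T(t))<0\}\\ \frac{1}{2}\sum_{k=0}^{K-1}(2k-K+1)\,1\left\{\frac{2k-K}{K}\leq I_t^n<\frac{2(k+1)-K}{K}\right\}\end{pmatrix},\tag{10}$$ where $\delta P^m(t)=\lim_{\epsilon\downarrow 0}(P^m(t)-P^m(t-\epsilon))$, $\hat T(t)=\sup\{T_j^e\leq t: e=1,\dots,4;\ j=1,2,\dots\}$, and $I_t^n$ is defined in (1).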

The first component $X_1$ of the state variable $X$ can take the values $-1$, $0$, $+1$, respectively denoting downward jump in the mid-price, unchanged mid-price and upward jump in the mid-price. The second component $X_2$ of the state variable $X$ is a discretisation of the $n$-levels queue imbalance $I_t^n$, and, assuming that $K$ is odd, it takes integer values from $-(K-1)/2$ to $(K-1)/2$, spanning the full range of possible values of $I^n$ from $-1$ to $+1$.

It follows from the definition of $X_2$ that if at time $t$ we have that $X_2(t)=x_2$, then the $n$-levels queue imbalance $I_t^n$ at time $t$ must be in the half-open interval $[(2x_2-1)/K,(2x_2+1)/K)$. Notice that $X_2$ depends on the two additional parameters $n$ and $K$: the former is the number $n$ of levels of the limit order book taken into account in the computation of the queue imbalance $I^n$; the latter is the number $K$ of points in the partition of the interval $[-1,1]$ used for the discretisation of $I^n$.
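The discretisation can be sketched as follows, assuming that the $K$ bins are the equal-width half-open intervals $[(2k-K)/K,(2(k+1)-K)/K)$, $k=0,\dots,K-1$, that partition $[-1,1)$, and that the boundary value $I_t^n=1$ is assigned to the top bin. The function names are hypothetical.

```python
import math


def price_state(delta_mid):
    """X_1: sign of the latest mid-price jump (-1, 0 or +1)."""
    return (delta_mid > 0) - (delta_mid < 0)


def imbalance_state(imbalance, K):
    """X_2: integer label in {-(K-1)/2, ..., (K-1)/2} of the bin containing
    the queue imbalance, for an odd number K of equal-width bins."""
    if K % 2 == 0:
        raise ValueError("K is assumed to be odd")
    k = math.floor((imbalance + 1.0) * K / 2.0)
    k = min(K - 1, max(0, k))  # clip so that imbalance = 1 falls in the top bin
    return (2 * k - K + 1) // 2


# With K = 5, an imbalance of 0.3 lies in [0.2, 0.6), i.e. in the bin x_2 = 1.
state = (price_state(-0.005), imbalance_state(0.3, 5))
```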

The pair $(N,X)$ is modelled as a state-dependent Hawkes process, hence we assume that there are base rates $\nu_e$, Hawkes kernels $\kappa_{e',e}=\kappa_{e',e}(t,x)$ and transition matrices $\phi_e$ such that Definition 3.5 is satisfied. The number of event types is $d_E=4$ and the number of states is $d_S=3K$.

When a new event occurs, i.e., when one of the components $N_e$ of $N$ jumps, the state variable $X$ is updated as per Equation (7). The update models the mechanism whereby trades on either side of the limit order book can trigger changes in the mid-price and in the queue imbalance. Indeed, assume that a sell (resp., buy) market order arrives at time $T_j^1$ (resp., $T_j^2$), and that $X(T_j^1-)=(x_1,x_2)$ (resp., $X(T_j^2-)=(x_1,x_2)$) for some $x_1$ in $\{-1,0,+1\}$ and some $x_2$ in $\{(1-K)/2,(3-K)/2,\dots,(K-1)/2\}$. Then, the mid-price jumps downward (resp., upward) with probability $p_-:=\sum_{y_2=(1-K)/2}^{(K-1)/2}\phi_1((x_1,x_2),(-1,y_2))$ (resp., $p_+:=\sum_{y_2=(1-K)/2}^{(K-1)/2}\phi_2((x_1,x_2),(+1,y_2))$), and it remains unchanged with probability $p_0:=1-p_-=\sum_{y_2=(1-K)/2}^{(K-1)/2}\phi_1((x_1,x_2),(0,y_2))$ (resp., $p_0:=1-p_+=\sum_{y_2=(1-K)/2}^{(K-1)/2}\phi_2((x_1,x_2),(0,y_2))$). This jump of the state variable happens exactly at the arrival time $T_j^1$ (resp., $T_j^2$) of the sell (resp., buy) market order, and it naturally captures the mechanism responsible for the market-order-induced price change: $p_-$ (resp., $p_+$) represents the probability that a sell (resp., buy) market order walks the book given its submission, and $p_0$ represents the probability that it does not. Notice that $p_-$ (resp., $p_+$) and $p_0$ depend on the state of the limit order book immediately before the arrival of the sell (resp., buy) market order, and in particular they depend on $x_2$. This is a granular description of the order book mechanism, and it accounts for the fact that it is less likely that a sell (resp., buy) market order walks the book when the volumes on the bid (resp., ask) side are high, namely $p_-(x_1,x_2)\leq p_-(x_1,\tilde x_2)$ if $x_2\geq\tilde x_2$ (resp., $p_+(x_1,x_2)\leq p_+(x_1,\tilde x_2)$ if $x_2\leq\tilde x_2$).

The first component $X_1$ of the state variable $X$ enables us to write the following proxy for the mid-price: $$P_0^m+\frac{\psi}{2}\int_0^t X_1(s)\,dN_g(s),\tag{11}$$ where $\psi$ is the tick size of the limit order book and $N_g=N_1+\dots+N_4$ is the ground process of $N$.
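Since $N_g$ jumps by one at each event time $T_n$, the integral in (11) reduces to the sum of $X_1(T_n)$ over the events with $T_n\leq t$. A minimal sketch, with a hypothetical function name and plain lists as inputs:

```python
def mid_price_proxy(p0, tick, event_times, x1_states, t):
    """Mid-price proxy: p0 + (tick / 2) * sum of X_1(T_n) over events with T_n <= t.

    event_times[n] is the n-th event time and x1_states[n] the value of X_1
    right after that event (-1, 0 or +1).
    """
    signed_jumps = sum(x for s, x in zip(event_times, x1_states) if s <= t)
    return p0 + 0.5 * tick * signed_jumps


# One up-move, one event without price change, then two down-moves.
path_end = mid_price_proxy(10.0, 0.01, [1.0, 2.0, 3.0, 4.0], [1, 0, -1, -1], t=4.0)
```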

Remark 4.1

Our model can be compared to that of Bacry and Muzy (2014). In their model, four streams of random times are considered: the stream $(T_j^1)_j$ of times when limit orders are executed on the bid side (equivalently identified with the arrival times of sell market orders); the stream $(T_j^2)_j$ of times when limit orders are executed on the ask side (equivalently identified with arrival times of buy market orders); the stream $(T_j^3)_j$ of times when the mid-price decreases; the stream $(T_j^4)_j$ of times when the mid-price increases. $T^1$ and $T^2$ are as in our model, whereas $T^3$ and $T^4$ represent what in our model we represent through the state variable $X_1$. In Bacry and Muzy (2014), the four-dimensional counting process $N=N(t)$ associated with $T^1$, $T^2$, $T^3$ and $T^4$ is assumed to be a four-dimensional ordinary Hawkes process. In their model, a buy (resp., sell) market order coming into the exchange and walking the book at time $t$ would be represented by the equation $\delta N_2(t)=\delta N_4(t)=1$ (resp., $\delta N_1(t)=\delta N_3(t)=1$). However, the components of a multidimensional Hawkes process jump simultaneously with probability zero. In other words, if $\{(T_j^e)_j: e=1,\dots,4\}$ are the arrival times associated with a 4-dimensional Hawkes process, it holds that $P(T_j^e=T_{j'}^{e'}\text{ for some }j,j'\geq 1\text{ and }e\neq e')=0$. Since the direction of the causality is unambiguous (a market order originates first and as a result of its execution the mid-price jumps), Bacry and Muzy (2014) propose to add to the Hawkes kernels $\kappa_{1,3}$ and $\kappa_{2,4}$ an atomic component. This is the defining feature of the ‘impulsive impact kernel’ (see Bacry and Muzy 2014, Section 2.1.3). In our paper, the usage of the state variable $X_1$ circumvents the need for these atomic components and naturally accommodates mid-price changes triggered by market orders walking the book.

The second component $X_2$ of the state variable $X$ reproduces the state variable of the queue-imbalance model in Morariu-Patrichi and Pakkanen (2022). It is conceived as the main indicator of the regime in which limit and market orders will arrive to the exchange: in high-frequency markets trading algorithms send their orders in response to observable quantities of the limit order book configuration, and a prominent one is indeed the queue imbalance. It is therefore expected that when $X_2$ is positive (resp., negative), the intensities of events of type $e=2$ (resp., $e=1$) will be higher, because market participants following the queue imbalance signal will expect the price to increase (resp., decrease). After the price change, the volumes of deeper queues on the ask (resp., bid) side enter the computation of the queue imbalance, and this will likely reset the signal. As noted in Morariu-Patrichi and Pakkanen (2022), this interaction can be deemed responsible for the mean-reverting behaviour of price dynamics in high-frequency markets.

Moreover, we use $X_2$ to reproduce the update of the limit order book configuration that happens when a labelled agent submits their market orders. Indeed, we consider normalized volumes up to level $n$, namely we assume that $\sum_{i=1}^{n} (V^{b,i}_t + V^{a,i}_t) \equiv 1$, and we assume that the $2n$-tuple $(V^{a,1}_t, V^{b,1}_t, \dots, V^{a,n}_t, V^{b,n}_t)$ is distributed as a Dirichlet random variable with $2n$-dimensional parameter $\gamma = \gamma(X(t)) \in \mathbb{R}^{2n}_{+}$ that depends on the state variable at time $t$.

Given the time evolution of the limit order book in the time window $[0,T]$, an estimator for $\gamma(x)$, with $x$ ranging from $1$ to $3K$, can be obtained by maximum likelihood estimation. Once $\gamma$ is known, the order book mechanics can be reproduced by drawing from the conditional distribution $\mathrm{Dir}_{\gamma(X(t))}(\,\cdot \mid X_2(t))$. This is the Dirichlet distribution of the $2n$-tuple $(V^{a,1}_t, V^{b,1}_t, \dots, V^{a,n}_t, V^{b,n}_t)$ with parameter $\gamma(X(t))$, conditioned on the queue imbalance $I^n_t = \sum_{i=1}^{n} (V^{b,i}_t - V^{a,i}_t)$ lying in the bin labelled by $X_2(t)$, with $I^n_t < (2X_2(t)+1)/K$ as the upper constraint. Algorithm 4.1 describes how to reproduce the order book update in the case of the arrival of a sell market order $(t, q_M, 0, 1)$. The case of buy market orders is analogous.
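The conditional draw described above can be illustrated with a minimal sketch. It uses plain rejection sampling, which is our choice for illustration and not necessarily the sampling scheme used in the paper; the function name `sample_volumes`, the bin endpoints `lo` and `hi`, and the numerical value of `gamma` are all hypothetical.

```python
import numpy as np

def sample_volumes(gamma, lo, hi, rng, max_tries=100_000):
    """Draw (V^{a,1}, V^{b,1}, ..., V^{a,n}, V^{b,n}) ~ Dirichlet(gamma),
    conditioned on lo <= imbalance < hi, by rejection sampling.

    The bin endpoints lo and hi are passed in explicitly (assumption)."""
    gamma = np.asarray(gamma, dtype=float)
    for _ in range(max_tries):
        v = rng.dirichlet(gamma)           # normalized volumes, sum to 1
        ask, bid = v[0::2], v[1::2]        # ordering (a1, b1, ..., an, bn)
        imbalance = bid.sum() - ask.sum()  # queue imbalance of the draw
        if lo <= imbalance < hi:
            return v
    raise RuntimeError("no sample accepted; the bin may carry negligible mass")

# Hypothetical example with n = 2 levels and a central imbalance bin.
rng = np.random.default_rng(0)
v = sample_volumes([2.0, 2.0, 1.0, 1.0], lo=-1 / 3, hi=1 / 3, rng=rng)
```

Rejection sampling is simple but can be slow when the conditioning bin has small probability under the unconditional Dirichlet law.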

Line 4 in Algorithm 4.1 says that the bid price (and consequently the mid-price) decreases if the size of the sell market order is larger than the available liquidity sitting on the first bid level. Lines 6:10 cancel (from the bid queues) the orders whose execution has been triggered by the arrival of $(t, q_M, 0, 1)$. On line 7 we used the notation $\min^{+}(a,b) = \max(0, \min(a,b))$ for $a$ and $b$ real numbers.
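The queue update described in this paragraph can be sketched as follows. This is only an illustration of the two steps mentioned in the text (the walking-the-book test and the $\min^{+}$ cancellation of executed volume), not a transcription of Algorithm 4.1; the function name and the example volumes are hypothetical, and the sketch does not re-index the levels after a queue is emptied.

```python
def execute_sell_market_order(bid_volumes, size):
    """Consume bid queues, best level first, for a sell market order.

    Returns the updated bid volumes and a flag telling whether the order
    walks the book, i.e. whether its size exceeds the first bid level."""
    walks_book = size > bid_volumes[0]
    remaining = size
    updated = []
    for volume in bid_volumes:
        executed = max(0.0, min(remaining, volume))  # min^+(remaining, volume)
        updated.append(volume - executed)
        remaining -= executed
    return updated, walks_book

# Hypothetical normalized bid queues on three levels.
book, moved = execute_sell_market_order([0.30, 0.20, 0.10], size=0.40)
small_book, small_moved = execute_sell_market_order([0.30, 0.20, 0.10], size=0.10)
```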

5. Price Impact Profiles

Measuring price impact requires two things. The first is to modify the model $(N,X)$ of Section 4 so as to account for a labelled agent, whose impact we wish to measure. The second is to determine the extent to which the labelled agent is responsible for the evolution of the price dynamics that emerge from the state process $(X(t))_t$. Section 5.1 describes the former; Section 5.2 describes the latter.

5.1. Labelled Agent

We account for a labelled agent in the market, and we aim to measure their impact on the dynamics of the order book. We take the perspective of a liquidation, namely we consider our agent (also referred to as liquidator) to be selling the amount Q0 of asset. The case of acquisition is mutatis mutandis the same.

We let $[0,T]$ represent the time window of the liquidation. The quantity $Q_0$ is referred to as the size of the liquidator's metaorder, or their initial inventory, and we normalize it with respect to the overall volume $\sum_{i=1}^{n} (V^{a,i}_0 + V^{b,i}_0)$ of offers sitting on the first $n$ levels of the order book at the start of the liquidation window.

We assume that the liquidator intervenes in the market only by sending sell market orders; they will never place a limit order to queue on the ask side, but they will initiate trades with existing offers on the bid side.

Hence, the liquidation is described by the sequence $\{(T^0_j, q_{M,j}, 0, 1) : j = 1, 2, \dots\}$ of sell market orders sent by the liquidator. For every $j$, $T^0_j$ is the time stamp of the liquidator's $j$th child market order, and $q_{M,j}$ is its size.

We suppose that the stream of random times $T^0_1 < T^0_2 < \cdots$ is restricted to $[0,T]$. We assume non-explosiveness, so that the number of liquidator's market orders is finite if the time horizon $T$ of the execution window is not $+\infty$. Moreover, we let $t_0 = T^0_1$ represent the time at which the liquidator begins their intervention in the market, and we let $\tau := \sup\{T^0_j \leq T : j = 1, 2, \dots\}$ be the time at which the liquidation stops.

Assumption 5.1

Let $Q_0$ be the size of the liquidator's metaorder, and for $j$ in $\mathbb{N}$, let $z_j = q_{M,1} + \cdots + q_{M,j}$ be the sum of all liquidity-normalized sizes of the first $j$ child market orders sent by the liquidator. Then, the termination time $\tau$ of the liquidation is assumed to coincide with the smallest time stamp $T^0_j$ among the liquidator's market orders such that $z_j \geq Q_0$, namely $\tau = \inf\{T^0_j \geq t_0 : \sum_{k=1}^{j} q_{M,k} \geq Q_0\}$.
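As a small illustration of Assumption 5.1, the termination time can be read off the cumulative child-order sizes. The sketch below assumes non-negative sizes (so that the cumulative sums are non-decreasing); the function name and the sample numbers are hypothetical.

```python
import numpy as np

def termination_time(times, sizes, q0):
    """Return the first time stamp T_j^0 whose cumulative size z_j reaches q0,
    or None if the metaorder is never completed."""
    z = np.cumsum(sizes)                        # z_j = q_{M,1} + ... + q_{M,j}
    idx = np.searchsorted(z, q0, side="left")   # first index with z[idx] >= q0
    return None if idx == len(z) else float(times[idx])

# Hypothetical schedule: four child orders of size 3 against Q_0 = 8.
tau = termination_time([1.0, 2.5, 4.0, 7.0], [3.0, 3.0, 3.0, 3.0], q0=8.0)
never = termination_time([1.0, 2.5, 4.0, 7.0], [3.0, 3.0, 3.0, 3.0], q0=20.0)
```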

We introduce the liquidator's presence in the model described in Section 4 by expanding the dimension of the counting process $N$: we let the zero-th component $N_0(t)$ count the liquidator's market orders sent to the exchange by time $t$. In other words, from the overall sequence $(T^1_j)_j$ of arrival times of market orders described in Section 4, we extract those sent by the liquidator and we label them as $(T^0_j)_j$; we then let $N_0(t) := \sum_{j \geq 1} \mathbb{1}\{T^0_j \leq t\}$ count the number of trades initiated by the liquidator that happened before or at time $t$. Notice that the map $t \mapsto N_0(t)$ represents how the liquidator is splitting in time the execution of their metaorder. In other words, this is the liquidator's execution schedule.

The pair $(N,X)$ is a state-dependent Hawkes process where the counting process component $N$ is five-dimensional, and the state process $X$ is as in Equation (10). The event types will be labelled $e = 0, 1, \dots, 4$ and the states will be labelled $x = 1, \dots, 3K$ or $x = (x_1, x_2)$ with $x_1 = -1, 0, +1$ and $x_2 = -(K-1)/2, \dots, +(K-1)/2$, so that there are $3K$ states in total. The following assumption is in place on the intensities.

Assumption 5.2

For all $e = 1, \dots, 4$, the Hawkes kernel $\kappa_{0,e}$ coincides with $\kappa_{1,e}$.

Assumption 5.2 guarantees consistency in the effect that trades have on the order book dynamics. It says that the rates of arrival of market orders to the exchange are modified by the liquidator's interventions in the same way as they are by other participants' sell market orders. More precisely, for $e = 1, \dots, 4$ it holds
$$\lambda_e(t) = \nu_e + \sum_{e'=0}^{4} \int_{[0,t)} \kappa_{e',e}(t-s, X(s))\, dN_{e'}(s) = \nu_e + \sum_{e'=1}^{4} \int_{[0,t)} \kappa_{e',e}(t-s, X(s))\, dN_{e'}(s) + \int_{[0,t)} \kappa_{1,e}(t-s, X(s))\, dN_0(s). \tag{12}$$
The liquidator's execution schedule admits an absolutely continuous compensator $\Lambda_0$ with density
$$\lambda_0(t) = \nu_0 \mathbb{1}_{[0,\tau)}(t) + \sum_{e',x'} \mathbb{1}_{[0,\tau)}(t) \int_{[0,t)} \kappa_{e',0}(t-s, x')\, d\tilde{N}_{e',x'}(s). \tag{13}$$
For $j = 1, 2, \dots$ let $(T^0_j, q_{M,j}, 0, 1)$ be the liquidator's child market orders, as denoted above. The liquidator's order scheduling depends on the Hawkes parameters $\nu_0$ and $\kappa_{e',0}$, which modulate the sequence of arrival times $T^0_j$. Additionally, the liquidation depends on the size $q_{M,j}$ of the $j$th child market order, for $j = 1, 2, \dots$ (Footnote 3). The evolution of the limit order book is simulated by combining Algorithms 3.1 and 4.1, as detailed in Algorithm 5.1.

Remark 5.3

In Bacry and Muzy (2014) (see Remark 4.1), a labelled agent is accounted for by considering the following intensities of the four-dimensional counting process $N$. For $e = 1, \dots, 4$ and $t > 0$ they set
$$\lambda_e(t) = \nu_e + \sum_{e'=1}^{4} \int_{[0,t)} \kappa_{e',e}(t-s)\, dN_{e'}(s) + \int_{[0,t)} \theta_e(s)\, dA(s), \tag{14}$$
where $t \mapsto A(t)$ is the liquidator's execution schedule, $t \mapsto \theta_1(t)$ (resp., $t \mapsto \theta_2(t)$) represents the impact of the liquidator's market orders on the arrival of other participants' sell (resp., buy) market orders, and $t \mapsto \theta_3(t)$ (resp., $t \mapsto \theta_4(t)$) represents the impact of the liquidator's market orders on downward (resp., upward) jumps of the mid-price. In their model, to have consistency between the liquidator and other market participants one needs to impose $\theta_3(t) = \kappa_{1,3}(t)$ and $\theta_4(t) = \kappa_{1,4}(t)$ for $t \geq 0$. Practically, this implies that the atomic components in the Hawkes kernel are passed to the integrands $\theta_3$ and $\theta_4$, which means that the liquidator walks the book at an average rate equal to the overall proportion of market orders walking the book. Forcing the liquidator to walk the book at this rate is a potentially undesirable feature (Footnote 4). In our model, there is no need for this to be assumed. We are able to test executions without this assumption, and still maintain consistency between the liquidator and other market participants.

Remark 5.4

The liquidator's interventions in the market have been modelled by expanding one component of the Hawkes process introduced in Section 4. The justification for this modelling choice is twofold. First, this guarantees consistency between executions of sell market orders sent by the liquidator and executions of sell market orders sent by other market participants. Given our interest in understanding the liquidator's impact, any other stochastic model would raise the question of whether the liquidator is being granted a privileged order scheduling. Second, expanding the dimensions of $N$ allows us to give a natural justification to the formula of Bacry and Muzy (2014) for the intensity in Equation (14): $\kappa_{0,e}$ takes the role of $\theta_e$ and $dN_0$ takes the role of $dA$. Hence, our modelling choice resonates with existing models in the literature, and it is grounded in our phenomenological point of view. Future work could adopt the point of view of optimal execution, and optimize the liquidator's scheduling over a set of admissible liquidation strategies so as to minimize their price impact.

5.2. Definition of Price Impact

We partition the state space $S$ according to the values of the first component $X_1$ of the state variable $X = (X_1, X_2)$. We define
$$S_{x_1} := \{y = (y_1, y_2) \in S : y_1 = x_1\}. \tag{15}$$
We refer to states $x$ in $S_{+}$ (resp., in $S_{-}$) as inflationary (resp., deflationary) states.

The jump times for the mid-price consequently give rise to the counting processes
$$N_{x_1}(t) := \sum_{n} \mathbb{1}\{T_n \leq t,\ X_1(T_n) = x_1\}, \qquad x_1 \in \{-1, 0, +1\}, \tag{16}$$
where $T_n$ is the $n$th jump time of the ground process. The difference $N_{+}(t) - N_{-}(t)$ is a proxy for the mid-price in the order book. Indeed, we can rewrite the integral quantity in Equation (11) as $\int_0^t X_1(s)\, dN_g(s) = N_{+}(t) - N_{-}(t)$.
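The mid-price proxy can be computed directly from the jump times of the ground process and the value of the first state component after each jump. The sketch below is illustrative only; the function name and the sample event sequence are hypothetical.

```python
import numpy as np

def price_proxy(jump_times, x1_after_jump, t):
    """Return N_+(t) - N_-(t), the net number of upward mid-price moves
    among the ground-process jumps that occurred up to and including t."""
    jump_times = np.asarray(jump_times, dtype=float)
    x1 = np.asarray(x1_after_jump, dtype=int)
    up_to_t = jump_times <= t
    n_plus = np.count_nonzero(x1[up_to_t] == 1)    # inflationary jumps
    n_minus = np.count_nonzero(x1[up_to_t] == -1)  # deflationary jumps
    return n_plus - n_minus

# Hypothetical sequence: one up-move, one neutral event, two down-moves.
times = [0.5, 1.0, 1.5, 2.0]
x1 = [1, 0, -1, -1]
proxy_mid = price_proxy(times, x1, t=1.6)
proxy_end = price_proxy(times, x1, t=2.0)
```

Because $X_1$ takes values in $\{-1, 0, +1\}$, the same number is obtained by summing the values of $X_1$ over the jumps up to $t$, which is the discrete form of the integral against the ground process.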

Definition 5.5

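The state-dependent Hawkes model $(N,X)$ is said to be price-symmetric if for all $t \geq 0$
$$\Big(\sum_{x \in S_{+}} - \sum_{x \in S_{-}}\Big) \sum_{e=1}^{4} \phi_e(X(t), x)\, \ell_e(t) = 0,$$
where $\ell_e(t) = \nu_e + \sum_{e'=1}^{4} \int_{[0,t)} \kappa_{e',e}(t-s, X(s))\, dN_{e'}(s)$.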

Proposition 5.6

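Assume that there exist a permutation $\sigma_E$ of $\{1, \dots, 4\}$ and a bijective map $\sigma_S : S_{+} \to S_{-}$ such that

  (i) $\phi_e(y, x) = \phi_{\sigma_E(e)}(y, \sigma_S(x))$ for all $y$ in $S$, all $x$ in $S_{+}$ and all $e = 1, \dots, 4$;

  (ii) $\nu_e = \nu_{\sigma_E(e)}$ for all $e = 1, \dots, 4$;

  (iii) $\kappa_{e',e}(t, x) = \kappa_{e',\sigma_E(e)}(t, x)$ for all $x$ in $S$, all $e, e' = 1, \dots, 4$ and all $t \geq 0$.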

Then, (N,X) is price-symmetric.

Remark 5.7

The condition in Proposition 5.6(i) captures the idea that, given the current state $y$, transitions to inflationary states and transitions to deflationary states are equally likely. The conditions in Proposition 5.6(iii) capture the idea that, given the current state $y$, every event-state pair $(e', x')$ excites an event-state pair $(e, x)$ with inflationary state $x \in S_{+}$ the same way as it excites an event-state pair $(\sigma_E(e), \sigma_S(x))$ with deflationary state $\sigma_S(x) \in S_{-}$; in other words, the offspring from every event-state pair $(e', x')$ are equally likely to be associated with inflationary states or with deflationary states.

Definition 5.8

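Let $t_0$ be the time when the liquidator becomes active in the market. Then, the price impact profile of the execution schedule $N_0$ is the primitive of $t \mapsto \mathrm{Dir}(t) + \mathrm{Indir}(t)$ pinned at $0$ in $t_0$, where
$$\mathrm{Dir}(t) = \sum_{x \in S_{-}} \phi_0(X(t), x) \Big(\nu_0 + \sum_{e'=1}^{4} \sum_{x'=1}^{3K} \int_{[0,t)} \kappa_{e',0}(t-s, x')\, d\tilde{N}_{e',x'}(s)\Big) \mathbb{1}_{[0,\tau)}(t),$$
where $\tau$ is the termination time of the liquidation,
$$\phi_0(x, x') = \frac{\sum_{j} \mathbb{1}\{X(T^0_j-) = x,\ X(T^0_j) = x'\}}{\sum_{j} \mathbb{1}\{X(T^0_j-) = x\}}, \tag{17}$$
and
$$\mathrm{Indir}(t) = \sum_{e=1}^{4} \sum_{x'=1}^{3K} \int_{[0,t)} \kappa_{1,e}(t-s, x')\, d\tilde{N}_{0,x'}(s) \Big(\sum_{x \in S_{-}} - \sum_{x \in S_{+}}\Big) \phi_e(X(t), x).$$
The map $t \mapsto \mathrm{Dir}(t) + \mathrm{Indir}(t)$ is referred to as intensity of the price impact profile.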
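The empirical transition matrix of Equation (17) is a ratio of counts, which can be sketched as follows. The states are encoded here as integers $0, \dots, 3K-1$, which is a convention of this sketch only; the function name and the sample transitions are hypothetical.

```python
import numpy as np

def estimate_phi0(states_before, states_after, n_states):
    """Empirical transition matrix of the state at the liquidator's orders.

    states_before[j] is the state just before the j-th liquidator order and
    states_after[j] the state right after it. Rows of states that were never
    visited before an order are left at zero."""
    counts = np.zeros((n_states, n_states))
    for x, x_next in zip(states_before, states_after):
        counts[x, x_next] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)

# Hypothetical transitions observed at four liquidator orders.
phi0 = estimate_phi0([0, 0, 1, 0], [1, 1, 2, 0], n_states=3)
```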

Remark 5.9

In Definition 5.8 we defined price impact as the time integral of intensities of counting processes. These counting processes count the increases and the decreases of the mid-price. The tick size is set by the trading venue to 0.01 USD, and so the minimum change in the mid-price is 0.005 USD. Therefore, in the examples below, price impact is expressed in units of 0.005 USD.

Remark 5.10

The intensity of the price impact profile is decomposed in two components, namely $\mathrm{Dir}(t)$ and $\mathrm{Indir}(t)$. Both are null if $N_0(t) \equiv 0$. The former is referred to as 'direct' impact and stems from those summands of the execution schedule's intensity $\lambda_0(t) = \sum_{x=1}^{3K} \tilde{\lambda}_{0,x}(t)$ that are associated with deflationary states, namely $\mathrm{Dir}(t) = \sum_{x \in S_{-}} \tilde{\lambda}_{0,x}(t)$. Notice that $\mathrm{Dir}(t) \geq 0$ for all $t > 0$ and $\mathrm{Dir}(t) = 0$ for all $t > \tau$. On the contrary, the second term $\mathrm{Indir}(t)$ stems from events originated by participants other than the liquidator but in response to the liquidator's interventions, hence the name of 'indirect' impact. It can have either sign and it is in general non-zero even beyond the termination time; for this reason it is linked to the transient impact. More precisely, for $t > \tau$ it holds
$$\mathrm{Indir}(t) = \sum_{e=1}^{4} \sum_{j} \kappa_{1,e}(t - T^0_j, X(T^0_j)) \Big(\sum_{x \in S_{-}} - \sum_{x \in S_{+}}\Big) \phi_e(X(t), x),$$
and the transient price impact profile is the map $t \mapsto \int_0^t \mathrm{Indir}(s)\, ds$, restricted to the interval $t \geq \tau$.

In a price-symmetric state-dependent Hawkes model, if $N_0 \equiv 0$, then $N_{-} - N_{+}$ is a martingale, and its compensator is identically null. Instead, when the liquidator is active in the market, the symmetry is disrupted, and we map this disruption to our measure of the price impact.

Hence, Definition 5.8 is vindicated by the following proposition.

Proposition 5.11

If $(N,X)$ is price-symmetric, then the price impact profile of $N_0$ is the $\mathbb{F}$-compensator of $N_{-} - N_{+}$, where $\mathbb{F}$ is the minimal complete right-continuous filtration to which $(N,X)$ is adapted.

The direct impact component $\mathrm{Dir}(t)$ of the intensity $(\lambda_{-} - \lambda_{+})(t)$ encompasses the transition matrix $\phi_0$ associated with the state update that occurs when liquidator's orders are executed. For $x$ and $x'$ in $S$, $\phi_0(x, x')$ is estimated according to Equation (17); hence it summarizes the state transitions that stem from Algorithm 4.1 during the simulation of the execution. This disentangles the effects of liquidator's orders (whose sizes $q_{M,j}$ are set by the liquidator) from the effects of other market orders, i.e., $\phi_0 \neq \phi_1$ in general, which allows us to investigate the impact of different execution strategies.

In particular, the liquidator might choose to send market orders with sizes that never exceed the available liquidity on the first bid level; this would cause $\phi_0(x, x') = 0$ for all $x$ in $S$ and all deflationary $x'$ in $S_{-}$, and thus $\mathrm{Dir}(t) \equiv 0$. Nonetheless, the overall impact would not be null, because of the indirect term $\mathrm{Indir}(t)$. Indeed, even without ever walking the book, the liquidator's orders would modify (i) the arrival of orders submitted by other market participants who react to the liquidator's executions; (ii) the volumes in the order book.

As far as (i) is concerned, if the dynamics of order submission is such that deflationary events trigger other events with deflationary effects on the price, then the price may plunge as an indirect consequence of the liquidation.

As far as (ii) is concerned, despite the fact that they do not walk the book, liquidator's executions consume liquidity on the bid side, pushing the state trajectory $t \mapsto X(t)$ to dwell in states $\{y = (y_1, y_2) \in S : y_2 < 0\}$ for longer. The probability of transitioning from these states to deflationary states is higher than the probability of transitioning to inflationary states, hence making the term $\big(\sum_{x \in S_{-}} - \sum_{x \in S_{+}}\big) \phi_e(X(t), x)$ positive, and contributing to the impact via the indirect term $\mathrm{Indir}(t)$. Notice that this form of impact would not be captured by a less granular model where the update of the volumes in the limit order book is not reproduced as we do in Algorithm 4.1, and where the liquidator's child orders are assumed to walk the book at an average rate equal to the overall proportion of market orders that do so (see Remark 5.3).

In Section 6, we will see that, after calibrating our model on empirical data from NASDAQ, the indirect component of the price impact is actually the main driver of price impact during liquidation.

6. Applications

6.1. Description of the Dataset and Model Specifications

We study order book data provided by LOBSTER (Footnote 5). LOBSTER is a provider of high-quality limit order book data that is reconstructed from NASDAQ's Historical TotalView-ITCH (Footnote 6) files with detailed event information. The reconstruction methodology is described in Huang and Polak (2011).

For every NASDAQ ticker and every active trading day, LOBSTER provides two files in .csv format: a ‘message file’ and an ‘orderbook’ file. The former is an event-by-event history of messages sent to the exchange that provoked an update in the configuration of the order book. The latter is an event-by-event snapshot of the order book, where the nth row corresponds to the configuration resulting from the nth message reported in the message file.

Prices are reported in $10^{-4}$ USD; hence the tick size, imposed by regulation (Footnote 7) and equal for all shares with price above 1 USD, is set to 100. Time stamps are reported in seconds after midnight with resolution at the nanosecond scale. In the plots that follow prices are always reported in $10^{-4}$ USD and times are reported in seconds. Events happening in the trading venue are labelled according to Table 1 (Footnote 8).

Table 1. LOBSTER labels of order book events.

Table 2 shows how we map LOBSTER order book labels to the sequences of arrival times described in Section 4.

Table 2. Mapping of LOBSTER labels to event types.

In the analysis that follows, we study order book data for the ticker INTC trading on 25 January 2019. First, we calibrate our state-dependent Hawkes model on the dataset of 25 January 2019; then, we simulate liquidations of a large number of shares using Algorithm 5.1; and finally we assess the price impact of such simulated liquidations as per Definition 5.8. At the end of the section we make remarks about the sensitivity of our results with respect to calibrated parameters, and we provide insights on the implementation for other dates and tickers.

6.2. Calibration

After filtering for the arrival times $(T^e_j)_j$, $e = 1, \dots, 4$, and defining the state variable $X = (X_1, X_2)$ as per Equation (10) with $n = 2$ and $K = 3$, the data sets of message file and order book for INTC on 25 January 2019 are as in Table 3.

Table 3. Ten time stamps from the filtered message file and order book file.

Starting from these data sets we perform maximum likelihood estimation of our state-dependent Hawkes model.

Transition probabilities are straightforwardly estimated from empirical frequencies. For every event $e = 1, \dots, 4$, we estimate a $9 \times 9$ transition matrix $\phi_e$ that describes the law of the state-update in Equation (7). In Table 4, we show the result of this estimation focussing on events of type either 1 or 2, i.e., execution on either the bid or the ask side.

Table 4. Transition probabilities ϕe calibrated on INTC as of 25 January 2019.

Assumption 6.1

Hawkes kernels are assumed in the parametric form
$$\kappa_{e',e}(t, x) = \alpha_{e',x,e}\, (t+1)^{-\beta_{e',x,e}}, \tag{18}$$
for some coefficients $\alpha_{e',x,e} \geq 0$ and $\beta_{e',x,e} > 1$.
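To make the parametric form concrete, the following sketch evaluates a power-law kernel, its $L^1$-norm $\alpha/(\beta - 1)$ (obtained by integrating $\alpha (t+1)^{-\beta}$ over $[0, \infty)$ for $\beta > 1$), and the resulting state-dependent intensity of one event type. The array layout `alpha[e_src, x, e]`, the zero-based indices, the function names and all numerical values are assumptions made for illustration.

```python
import numpy as np

def power_law_kernel(t, alpha, beta):
    """kappa(t) = alpha * (t + 1) ** (-beta)."""
    return alpha * (t + 1.0) ** (-beta)

def kernel_l1_norm(alpha, beta):
    """Integral of the kernel over [0, inf); finite only for beta > 1."""
    if beta <= 1.0:
        raise ValueError("beta must exceed 1 for the kernel to be integrable")
    return alpha / (beta - 1.0)

def intensity(t, e, nu, alpha, beta, events):
    """lambda_e(t) = nu_e + sum over past events (s, e_src, x_s) with s < t
    of kappa_{e_src, e}(t - s, x_s), using the power-law form."""
    lam = nu[e]
    for s, e_src, x_s in events:
        if s < t:  # integration over [0, t) excludes events at time t
            lam += power_law_kernel(t - s, alpha[e_src, x_s, e],
                                    beta[e_src, x_s, e])
    return lam

# Hypothetical parameters: two event types, one state.
nu = np.array([0.1, 0.2])
alpha = np.full((2, 1, 2), 0.5)
beta = np.full((2, 1, 2), 2.0)
lam0 = intensity(1.0, 0, nu, alpha, beta, events=[(0.0, 0, 0), (1.0, 1, 0)])
```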

Remark 6.2

Assumption 6.1 is grounded in the stylized fact that power-law kernels fit real-world data better than exponential kernels do, albeit at a higher computational cost. This assumption also builds on the findings of Bacry and Muzy (2014). In the aforementioned paper, the authors devise a non-parametric estimation for Hawkes kernels. Once this non-parametric estimation has converged, they compare the estimated kernels with parametric ones, and confirm that indeed the decay of the kernels is of power-law type.

We estimate the parameters $\nu_e$, $\alpha_{e',x,e}$, and $\beta_{e',x,e}$ using a gradient-descent algorithm. In Table 5, we present the result of this estimation by reporting the four-dimensional vector $\nu$, and both $\alpha_{e',x,e}$ and $\beta_{e',x,e}$ for the state with $X_1 = 0$ and $X_2 = 0$; the other cases are not presented in the interest of space.

Table 5. Hawkes parameters $\nu_e$, $\alpha_{e',x,e}$, and $\beta_{e',x,e}$ calibrated on INTC as of 25 January 2019.

Figure 1 shows QQ plots for goodness-of-fit diagnostics; we note that the fit is adequate for our purposes although there is some deviation in the tail part of the plots. Perfecting the fit tends to be difficult with the amount of data we employ (e.g., for INTC on 25 January 2019 we employ 1,563,582 datapoints).
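The points of such a QQ plot can be produced as sketched below, assuming the compensator has already been evaluated at the event times. For a well-specified model the increments of the compensator between consecutive events are i.i.d. unit-rate exponential. The example uses a homogeneous Poisson process, whose compensator is simply rate times time, as a hypothetical stand-in for the calibrated model; the function name and the plotting positions are our choices.

```python
import numpy as np

def qq_points(compensator_at_events):
    """Return (theoretical, empirical) quantiles for a unit-rate exponential
    QQ plot of the time-changed inter-arrival times."""
    values = np.asarray(compensator_at_events, dtype=float)
    residuals = np.sort(np.diff(values))         # time-changed inter-arrivals
    n = residuals.size
    probs = (np.arange(1, n + 1) - 0.5) / n      # plotting positions
    theoretical = -np.log1p(-probs)              # Exp(1) quantile function
    return theoretical, residuals

# Hypothetical check on a Poisson process with rate 2: Lambda(t) = 2 t.
rng = np.random.default_rng(1)
rate = 2.0
event_times = np.cumsum(rng.exponential(scale=1.0 / rate, size=5000))
theoretical, empirical = qq_points(rate * event_times)
```

Plotting `empirical` against `theoretical` should then give points close to the diagonal, with the usual scatter in the upper tail.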

Figure 1. Goodness-of-fit diagnostics for the model calibrated on INTC as of 25 January 2019. We test that the time-changed inter-arrival times $\tau^e_j := \Lambda(T^e_j) - \Lambda(T^e_{j-1})$ of Equation (4) are i.i.d. samples from a unit rate exponential distribution. Empirical quantiles of the time-changed inter-arrival times on the y-axis and theoretical quantiles of unit-rate exponential distribution on the x-axis.

Finally, Figure 2 compares the trajectory of the mid-price as reported in LOBSTER, as reconstructed from Equation (11), and as simulated in the calibrated model (one sample); we plot two different time-scales.

Figure 2. Mid-price trajectories on two time scales. Origin of time is set at 9.55am 25 January 2019. Time is measured in seconds. Prices are in $10^{-4}$ USD.

6.3. Price Impact Assessment

We simulate liquidations in our state-dependent Hawkes model for an order book calibrated on LOBSTER data for the ticker INTC on 25 January 2019, and we assess the price impact of such liquidations using Definition 5.8.

We investigate two aspects of the liquidation schedule: the rate with which liquidator's orders walk the book (captured by the transition matrix $\phi_0$), and the clustering of the liquidator's orders in response to events happening in the limit order book. We modulate these two aspects through the parameters reported in Table 6.

Table 6. Parameters of the liquidation schedule.

We run simulations for different values of these parameters and we find that the clustering of liquidator's orders has a bigger price impact than the rate with which they walk the book. This suggests that the dynamic evolution of the order book plays a bigger role in price formation than the instantaneous states of the queues.

Figures 3, 6 and 7 present three simulations representative of our findings.

Figure 3. Price impact with low rate of walking the book and no clustering. Initial inventory Q0=10, base rate ν0=0.03, clustering rate a = 0, order size c = 0.075, start time t0=0, termination time τ=9250.2, and price impact score is 0.0346.


More precisely, Figure 3 shows a liquidation in which there is no clustering of market orders (i.e., $\alpha_{e',x,0} = 0$), and the execution schedule follows a Poisson process with rate $\nu_0 = 0.03$, namely approximately 75% of the base rate of all other sell market orders.

All liquidator's orders have size equal to 7.5% of the volumes of bid offers queuing on the first $n$ levels of the bid side at the moment of the order submission. This entails that liquidator's orders will walk the book only when the aggregate size of level 1 is less than 7.5% of the overall bid volume. As a result, the estimated transition matrix $\phi_0 = \phi_0(x, x')$ concentrates the mass on those states $x' = (x'_1, x'_2)$ such that $x'_2 = 0$. The more positive the queue imbalance at the moment of the order submission is, the more this concentration holds.

The liquidation takes approximately 9250 s to complete. The price impact score, defined as the maximum of the price impact profile divided by the duration, is 3.46%.
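The score just defined can be sketched numerically as follows, assuming the intensity of the price impact profile has been sampled on a time grid that spans the liquidation window. The trapezoidal rule, the function names and the constant test intensity are choices of this sketch, not of the paper.

```python
import numpy as np

def impact_profile(times, intensity):
    """Primitive of the impact intensity on the grid, pinned at 0 at times[0]
    (trapezoidal rule)."""
    times = np.asarray(times, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    increments = 0.5 * (intensity[1:] + intensity[:-1]) * np.diff(times)
    return np.concatenate(([0.0], np.cumsum(increments)))

def price_impact_score(times, intensity):
    """Maximum of the impact profile divided by the duration of the grid."""
    times = np.asarray(times, dtype=float)
    profile = impact_profile(times, intensity)
    return profile.max() / (times[-1] - times[0])

# Hypothetical constant intensity of 0.04 over 100 seconds.
grid = np.linspace(0.0, 100.0, 1001)
score = price_impact_score(grid, np.full_like(grid, 0.04))
```

With a constant intensity the profile grows linearly, so the score equals that constant; a profile that peaks early and then decays would give a score below its peak intensity.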

The line charts in Figure 3 (top left plot) present visualizations of the liquidation schedule (the red dots represent executions on the bid side triggered by one of liquidator's sell market orders), and its intensity (see Equation (13)), which in this case is simply $\lambda_0(t) = \nu_0 \mathbb{1}_{[0,\tau)}(t)$. This panel also depicts the impact profile trajectory (green line in top left plot) that we used to compute the price impact score. Moreover, the trajectories of liquidator's inventory and of the mid-price during execution are plotted (see bottom left plot). The latter can be compared to the mid-price simulated when the liquidator was not present in the market (see Figure 2), and it provides a graphical representation of price impact.

We remark that the stylized features one observes in the impact profile trajectory are consistent across simulations. In Figure 4 we show the impact profile trajectories for one hundred simulations.

Figure 4. Simulated trajectories of the intensity of the price impact profile, i.e., the map $t \mapsto \mathrm{Dir}(t) + \mathrm{Indir}(t)$ (see Definition 5.8) for the scenario considered in Figure 3. The black solid line is the median trajectory of the impact profile across simulations and the red dotted lines are the 25% and 75% quantile trajectories.

Remark 6.3

Taking the mean across simulations provides a bridge from our scenario-dependent impact profiles to the average impact profiles studied in the literature. Since our model is rich enough to comprise all market variables appearing in the stylized facts of price impact, one can hold all model parameters fixed except for one variable of interest, and study average price impact scores as a function of the variable of interest. This opens the door to investigating whether a specific model adheres to the stylized facts in the literature, such as those reviewed in Zarinelli et al. (2015). In Figure 5 below, we study two experiments where we stress the order size parameter $c$ and the base rate parameter $\nu_0$.

Figure 5. Left panel: box plots of the distribution of Dir(τ)+Indir(τ) when stressing the order size parameter c. Right panel: box plots of the distribution of the price impact score as a function of the base rate ν0.


Figure 6. Price impact with high rate of walking the book and no clustering. Initial inventory Q0=10.0, base rate ν0=0.03, clustering rate a = 0, order size c = 0.5, start time t0=0, termination time τ=1060.7, and price impact score is 0.0426.


Figure 7. Price impact with clustering of liquidator's orders and low rate of walking the book. We show the first 300 s. Initial inventory Q0=10.0, base rate ν0=0, clustering rate a = 0.25, order size c = 0.015, start time t0=0, termination time τ=722.3, and price impact score is 0.999.


The left panel of Figure 5 shows the average increase in the price impact profile at time $\tau$ as a function of the order size parameter $c$; the right panel shows the price impact score as a function of the base rate parameter $\nu_0$ (see Table 6 for the definition of $c$ and $\nu_0$). The remaining model parameters are those in Figure 3.

Figure 6 shows a liquidation in which the execution schedule is as in Figure 3, namely no clustering and same base rate. However, in this case liquidator's orders have a much bigger size. They have size equal to 50% of the volumes of bid offers queuing on the first $n$ levels of the bid side at the moment of the order submission. This entails that liquidator's orders will likely walk the book. Indeed, they walk the book whenever the aggregate size of level 1 is less than 50% of the overall bid volume. As a result, the estimated transition matrix $\phi_0 = \phi_0(x, x')$ concentrates the mass on those states $x' = (x'_1, x'_2)$ such that $x'_2 = -1$. The more negative the queue imbalance at the moment of the order submission is, the more this concentration holds.

The liquidation takes approximately 1061 s to complete. This is considerably shorter than in the previous simulation because every order executes more of the liquidator's inventory. The price impact score is 4.26%. The plotted mid-price trajectory shows a sharper plunge during execution.

Hence, Figures 3 and 6 show how our model captures the consequence that order sizing has on price impact. Order sizing however is not the main driver of price impact. We demonstrate this in Figure 7.

Figure 7 shows a liquidation in which the intensity of the execution schedule has zero base rate, namely there is no exogenous cause for the liquidator to submit their orders. Instead, they submit their orders in response to orders submitted by other market participants. We set the liquidator's response to be proportional to the response that other market participants make when scheduling their sell market orders. That is, we set $\alpha_{e',x,0} = a\, \alpha_{e',x,1}$, for some coefficient $a \geq 0$. In the simulation of Figure 7 this coefficient is $a = 0.25$.

Recall that $\alpha_{e',x,1}$ was estimated by maximum likelihood estimation from the LOBSTER data set (see Table 5, where the $L^1$-norms $\alpha_{e',x,e}/(\beta_{e',x,e} - 1)$ are reported). The estimation revealed that seller-initiated executions are more likely to excite other events with deflationary pressure on the mid-price, i.e., events of type $e = 1$ or $e = 3$, whereas buyer-initiated executions are more likely to excite other events with inflationary pressure on the mid-price, i.e., events of type $e = 2$ or $e = 4$. As a consequence, when $\alpha_{e',x,0}$ is proportional to $\alpha_{e',x,1}$ deflationary events will tend to cluster, potentially triggering abrupt price changes. The line charts in Figure 7 give visual representations of this. We deliberately focussed on a short time horizon of 300 s, so that the phenomenon described is more apparent to the eye.

We set the order sizes to be very small, at 1.5% of the volumes available on the first $n$ levels of the bid side. The purpose is to isolate the phenomenon of indirectly induced price changes from price changes induced by walking the book. Indeed, the estimated transition probabilities $\phi_0$ on this simulation assign negligible probabilities to transitioning to a state $X(T)$ with $X_1(T) = -1$ when an event of type 0 occurs at time $T$.

Nonetheless, because of the clustering effect of deflationary events, the price impact score is the highest, at 99.9%.

The results presented in this section are robust to misspecification of model parameters. When we stress the base rates $\{\nu_e\}_e$, together with the coefficients $\{\alpha_{e',x,e}\}_{e',x,e}$ and $\{\beta_{e',x,e}\}_{e',x,e}$ with shocks between $-5\%$ and $+5\%$, we observe that the average price impact scores across one hundred simulations change by less than 10% of the average unstressed values. For example, if we consider the experiment in Figure 3, and we repeat the profiling with the modified parameters $\tilde{\nu}_e = 1.05\,\nu_e$, $\tilde{\alpha}_{e',x,e} = 1.05\,\alpha_{e',x,e}$ and $\tilde{\beta}_{e',x,e} = 1.05\,\beta_{e',x,e}$ for all $\{e', x, e\}$, we find that the average price impact score changes from 0.037 (with 0.006 standard deviation) to 0.041 (with 0.006 standard deviation), which represents an average increase of 8.2%. Similarly, in that same scenario of Figure 3, when we consider the modified parameters $\tilde{\nu}_e = 0.95\,\nu_e$, $\tilde{\alpha}_{e',x,e} = 0.95\,\alpha_{e',x,e}$ and $\tilde{\beta}_{e',x,e} = 0.95\,\beta_{e',x,e}$ for all $\{e', x, e\}$, we find that the average price impact score decreases to 0.036 (with 0.005 standard deviation), which represents an average decrease of 3.7%. The other two scenarios we consider in Figures 6 and 7 show a similar behaviour when stressing the calibrated parameters. This means that the results shown are indeed robust to calibration errors.

Finally, we repeat the above analysis employing the tickers AAPL and INTC for a range of dates during January 2019. We report that no further insights are derived from considering other dates or other tickers. For example, it was always the case that the highest price impact scores were achieved under the scenario we consider in Figure 7, i.e., where the clustering effect plays a key role in the price impact profile. Thus, we conclude that the intuition derived from the analysis in Figures 3, 6, and 7 remains valid for other dates and similar tickers.

In limit order books of small tick-size stocks (e.g., GOOG or TSLA), orders are posted and cancelled in adjacent price queues more frequently than for medium or large tick-size stocks. In other words, the mid-price of a small tick-size stock stays constant over far shorter stretches of time. This is because, relative to the stock price, a one-tick change in the best bid or best ask is not as significant as for larger tick-size stocks. When studying small tick-size stocks with our model, this reduced importance of single-tick changes can be accounted for by letting $N_3$ and $N_4$ jump only when the mid-price changes by two or more ticks, de facto renormalizing the parameters of the limit order book (doubling or tripling the tick size and merging adjacent queues of orders). In our experiments, this led to a more robust calibration and prevented overfitting.
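The renormalization just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the preprocessing used for the experiments: the function names are hypothetical, prices are expressed in integer ticks, bid prices are rounded down and ask prices rounded up to the coarser grid, and a mid-price move is flagged relative to the last flagged level rather than to the previous observation.

```python
import math
from collections import defaultdict


def coarsen_side(levels, factor, side):
    """Merge adjacent price queues onto a grid `factor` times coarser.

    levels: list of (price_in_ticks, volume) pairs for one side of the book.
    side:   "bid" (prices rounded down) or "ask" (prices rounded up), so the
            coarsened book never shows a tighter spread than the original.
    Returns a list of (coarse_price_in_ticks, volume), best price first.
    """
    merged = defaultdict(int)
    for price, volume in levels:
        if side == "bid":
            coarse = (price // factor) * factor
        else:
            coarse = math.ceil(price / factor) * factor
        merged[coarse] += volume
    return sorted(merged.items(), reverse=(side == "bid"))


def significant_mid_moves(mids, min_ticks=2):
    """Flag mid-price changes of at least `min_ticks` ticks.

    mids: mid-prices in tick units (multiples of 0.5).
    Returns one flag per observation after the first: +1 (up), -1 (down)
    or 0 (no flagged move). The reference level is updated only when a
    move is flagged (an assumption of this sketch).
    """
    flags = []
    reference = mids[0]
    for mid in mids[1:]:
        move = mid - reference
        if abs(move) >= min_ticks:
            flags.append(1 if move > 0 else -1)
            reference = mid
        else:
            flags.append(0)
    return flags
```

For instance, with `factor=2` the bid queues at 1001 and 1000 ticks are merged into a single queue at 1000, and the flagged moves would be the only events counted as deflationary or inflationary limit order events.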

7. Discussion

Modelling price impact, direct and indirect, is an important challenge in market microstructure. In this paper we proposed a methodology for modelling both components (direct and indirect) separately on a path-by-path basis. Our definition of price impact assumes that price movements are driven by state-dependent Hawkes processes. Specifically, (i) the seller/buyer-initiated trades and (ii) the decreases/increases in mid-price caused by limit order insertions or cancellations follow a state-dependent Hawkes process. We conducted a goodness-of-fit analysis and concluded that the model fits the data adequately. We studied the average behaviour of the price impact score as a function of the base rate of the execution and found evidence of a concave relationship between these variables. Future work will explore alternative calibration or modelling assumptions that fit the data better, together with further studies of the average behaviour of the price impact profiles we define. Additional avenues for future work include exploring the behaviour of the price impact profile after the executions are completed and using our model to solve control problems of optimal execution of large orders and liquidity provision.

Disclosure statement

This work reflects the analysis and views of the authors Claudio Bellani, Damiano Brigo, Leandro Sánchez-Betancourt, and Mikko Pakkanen. No reader should interpret this work to present the views of any third party. Assumptions, opinions, views and estimates constitute the authors' judgment as of the date given and are subject to change without notice and without duty to update.

Notes

1 In some trading venues, traders can actually submit market orders, i.e., buy or sell orders that are executed without price constraints, at least as long as offers in the opposite direction exist. Even in such cases, we keep our convention of referring to the fraction of a limit order that is executed upon submission as a market order.

2 There is no chance that a sell (buy) market order can cause an increase (decrease, respectively) in the mid-price.

3 Every $q^{M,j}$ satisfies the measurability constraint $q^{M,j} \in \mathcal{F}_{T^0_j-} = \bigvee_{\epsilon>0} \mathcal{F}_{T^0_j-\epsilon}$, where $\mathcal{F}$ is a history of $(N,X)$.

4 To see why this can be a potentially undesirable feature, consider a liquidator that trades with child orders whose sizes are significantly different from the average market orders arriving in the market.

References

  • Almgren R., C. Thum, E. Hauptmann, and H. Li. 2005. “Direct Estimation of Equity Market Impact.” Risk 18 (7): 58–62.
  • Bacry E., A. Iuga, M. Lasnier, and C.-A. Lehalle. 2015. “Market Impacts and the Life Cycle of Investors Orders.” Market Microstructure and Liquidity 1 (2): 1550009. https://doi.org/10.1142/S2382626615500094.
  • Bacry E., and J.-F. Muzy. 2014. “Hawkes Model for Price and Trades High-frequency Dynamics.” Quantitative Finance 14 (7): 1147–1166. https://doi.org/10.1080/14697688.2014.897000.
  • Bouchaud J.-P. 2017. Market Impact: A Review. Imperial-CFM Seminar. https://www.imperial.ac.uk/events/99125/jean-philippe-bouchaud-cfm-market-impact-a-review/.
  • Brokmann X., E. Serie, J. Kockelkoren, and J.-P. Bouchaud. 2015. “Slow Decay of Impact in Equity Markets.” Market Microstructure and Liquidity 1 (2): 1550007. https://doi.org/10.1142/S2382626615500070.
  • Brown T. C., and M. G. Nair. 1988. “A Simple Proof of the Multivariate Random Time Change Theorem for Point Processes.” Journal of Applied Probability 25 (1): 210–214. https://doi.org/10.2307/3214247.
  • Bucci F., I. Mastromatteo, M. Benzaquen, and J.-P. Bouchaud. 2019. “Impact Is Not Just Volatility.” Quantitative Finance 19 (11): 1763–1766. https://doi.org/10.1080/14697688.2019.1622768.
  • Capponi F., and R. Cont. 2019. “Trade Duration, Volatility and Market Impact.” SSRN. https://ssrn.com/abstract=3351736.
  • Cartea A., R. Donnelly, and S. Jaimungal. 2018. “Enhancing Trading Strategies with Order Book Signals.” Applied Mathematical Finance 25 (1): 1–35. https://doi.org/10.1080/1350486X.2018.1434009.
  • Daley D., and D. Vere-Jones. 2008. An Introduction to the Theory of Point Processes Volume II: General Theory and Structure. Probability and Its Applications. 2nd ed. New York: Springer.
  • Engle R., R. Ferstenberg, and J. Russell. 2012. “Measuring and Modeling Execution Cost and Risk.” The Journal of Portfolio Management 38 (2): 14–28. https://doi.org/10.3905/jpm.2012.38.2.014.
  • Huang R., and T. Polak. 2011. “Lobster: Limit Order Book Reconstruction System.” Lobsterdata Docs.
  • Jusselin P., and M. Rosenbaum. 2020. “No-arbitrage Implies Power-law Market Impact and Rough Volatility.” Mathematical Finance 30 (4): 1309–1336. https://doi.org/10.1111/mafi.v30.4.
  • Meyer P.-A. 1971. “Démonstration Simplifiée D'un Théorème De Knight.” Séminaire de Probabilités de Strasbourg 5:191–195.
  • Morariu-Patrichi M., and M. S. Pakkanen. 2018. “Hybrid Marked Point Processes: Characterization, Existence and Uniqueness.” Market Microstructure and Liquidity 4 (3n04): 1950007. https://doi.org/10.1142/S2382626619500072.
  • Morariu-Patrichi M., and M. S. Pakkanen. 2022. “State-Dependent Hawkes Processes and Their Application to Limit Order Book Modelling.” Quantitative Finance 22 (3): 563–583. https://doi.org/10.1080/14697688.2021.1983199.
  • Moro E., J. Vicente, L. G. Moyano, A. Gerig, J. D. Farmer, G. Vaglica, F. Lillo, and R. N. Mantegna. 2009. “Market Impact and Trading Profile of Hidden Orders in Stock Markets.” Physical Review E 80 (6 Pt 2): 066102. https://doi.org/10.1103/PhysRevE.80.066102.
  • Patzelt F., and J.-P. Bouchaud. 2018. “Universal Scaling and Nonlinearity of Aggregate Price Impact in Financial Markets.” Physical Review E 97:012304. https://doi.org/10.1103/PhysRevE.97.012304.
  • Rosenbaum M., and M. Tomas. 2021. “A Characterisation of Cross-Impact Kernels.” Preprint, arXiv:2107.08684.
  • Torre N. 1997. Barra Market Impact Model Handbook. Vol. 208. Berkeley: BARRA Inc.
  • Tóth B., Z. Eisler, and J.-P. Bouchaud. 2016. “The Square-root Impact Law Also Holds for Option Markets.” Wilmott 2016 (85): 70–73. https://doi.org/10.1002/wilm.2016.2016.issue-85.
  • Tóth B., Y. Lempérière, C. Deremble, J. de Lataillade, J. Kockelkoren, and J.-P. Bouchaud. 2011. “Anomalous Price Impact and the Critical Nature of Liquidity in Financial Markets.” Physical Review X 1 (2): 021006. https://doi.org/10.1103/PhysRevX.1.021006.
  • Zarinelli E., M. Treccani, J. D. Farmer, and F. Lillo. 2015. “Beyond the Square Root: Evidence for Logarithmic Dependence of Market Impact on Size and Participation Rate.” Market Microstructure and Liquidity 1 (2): 1550004. https://doi.org/10.1142/S2382626615500045.

Appendix. Proofs

Proof of Proposition 2.2

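Let $(t,q,p,1)$ be a sell limit order. Let $N_v := \inf\{n \ge 0 : q < \sum_{i=1}^{n} V_t^{b,i}\}$. Let $N_p := \inf\{n \ge 1 : P_t^{b,n} < p\}$. Let
$$q_i := \max\Big(0,\; q - \sum_{k=1}^{i \wedge N_v \wedge (N_p-1)} V_t^{b,k}\Big) = \max\Big(0,\; q - \sum_{1 \le k \le i} V_t^{b,k}\, \mathbf{1}\{P_t^{b,k} \ge p\}\Big).$$
Let $N_v^M$, $N_p^M$ and $q^M_i$ be the corresponding quantities for the order $(t,q^M,0,1)$. Notice that $N_v^M = N_v^M \wedge N_p^M$. Processing $(t,q^M,0,1)$ does not affect the ask side because $q^M_\infty = 0$; moreover the effects on the bid side of processing $(t,q^M,0,1)$ are the same as those of $(t,q,p,1)$, because $\inf\{n \ge 0 : q^M < \sum_{i=1}^{n} V_t^{b,i}\} = N_v \wedge N_p$ and $q_{k+N-1} - q_{k+N} = q^M_{k+N_v^M-1} - q^M_{k+N_v^M}$. Therefore, after $(t,q^M,0,1)$ has been processed, the bid side is the same as the bid side after the processing of $(t,q,p,1)$.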

Processing $(t,q-q^M,p,1)$ after $(t,q^M,0,1)$ does not alter the bid side because either $P_t^{b,1} < p$ or $q - q^M = 0$. Moreover, $(t,q-q^M,p,1)$ has the same effects on the ask side as those of $(t,q,p,1)$ because $q - q^M = q_\infty$.

The proof of the decomposition of a buy limit order is mutatis mutandis the same.
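The decomposition of a sell limit order into a market order and a residual limit order can be checked numerically on a toy book. The following Python sketch is only an illustration of the statement: the representation of the book (a list of bid levels sorted by decreasing price and a dictionary of ask queues) and the function names are assumptions made here, not the paper's notation.

```python
def process_sell(bids, asks, q, p):
    """Process a sell limit order of size q with limit price p.

    bids: list of (price, volume), best (highest) price first.
    asks: dict mapping price to queued volume.
    The order trades against bid levels priced at or above p; any
    remainder is posted on the ask side at price p.
    Returns the updated (bids, asks); the inputs are not modified.
    """
    new_bids = []
    remaining = q
    for price, volume in bids:
        if remaining > 0 and price >= p:
            traded = min(remaining, volume)
            remaining -= traded
            volume -= traded
        if volume > 0:
            new_bids.append((price, volume))
    new_asks = dict(asks)
    if remaining > 0:
        new_asks[p] = new_asks.get(p, 0) + remaining
    return new_bids, new_asks


def market_part(bids, q, p):
    """Quantity of the sell order (q, p) that executes upon submission."""
    executable = sum(volume for price, volume in bids if price >= p)
    return min(q, executable)


def process_decomposed(bids, asks, q, p):
    """Process the market order (q_M, 0) first, then the residual (q - q_M, p)."""
    q_market = market_part(bids, q, p)
    bids_after, asks_after = process_sell(bids, asks, q_market, 0)
    return process_sell(bids_after, asks_after, q - q_market, p)
```

On the book `bids = [(100, 30), (99, 20), (98, 50)]`, `asks = {101: 40}`, the order with `q = 60` and `p = 99` has a market part of 50; processing it directly or through `process_decomposed` leaves the same bid and ask sides.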

Proof of Proposition 5.6

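Let $(T_n,E_n,X_n)$ be the sequence of arrival times, events and states. Then, we can compute
$$\begin{aligned}
\sum_{x\in S_+}\sum_{e=1}^{4}\phi_e(X(t),x)\sum_{e'=1}^{4}\int_{[0,t)}\kappa_{e',e}(t-s,X(s))\,dN_{e'}(s)
&= \sum_{n:\,T_n<t}\;\sum_{x\in S_+}\sum_{e=1}^{4}\phi_e(X(t),x)\,\kappa_{E_n,e}(t-T_n,X_n)\\
&= \sum_{n:\,T_n<t}\;\sum_{x\in S_+}\sum_{e=1}^{4}\phi_{\sigma_E(e)}(X(t),\sigma_S(x))\,\kappa_{E_n,\sigma_E(e)}(t-T_n,X_n)\\
&= \sum_{n:\,T_n<t}\;\sum_{x\in S_-}\sum_{e=1}^{4}\phi_e(X(t),x)\,\kappa_{E_n,e}(t-T_n,X_n)\\
&= \sum_{x\in S_-}\sum_{e=1}^{4}\phi_e(X(t),x)\sum_{e'=1}^{4}\int_{[0,t)}\kappa_{e',e}(t-s,X(s))\,dN_{e'}(s).
\end{aligned}$$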

Proof of Proposition 5.11

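We need to show that
$$(\lambda_- - \lambda_+)(t) = \mathrm{Dir}(t) + \mathrm{Indir}(t). \tag{A1}$$
To this purpose, we compute $\lambda_-$ (resp., $\lambda_+$) as the sum of the intensities of $\tilde N_{e,x}$ for $e=0,\dots,4$ and $x$ in $S_-$ (resp., in $S_+$). From Equations (9) and (12) it follows that
$$\lambda_-(t) = \sum_{x\in S_-}\Bigg\{\sum_{x'=1}^{3K}\phi_0(X(t),x)\,\lambda_0(t) + \sum_{e=1}^{4}\phi_e(X(t),x)\Bigg(\nu_e + \underbrace{\sum_{e'=1}^{4}\sum_{x'=1}^{3K}\int_{[0,t)}\kappa_{e',e}(t-s,x')\,d\tilde N_{e',x'}(s)}_{=\,\ell_e(t)} + \sum_{x'=1}^{3K}\int_{[0,t)}\kappa_{1,e}(t-s,x')\,d\tilde N_{0,x'}(s)\Bigg)\Bigg\},$$
where $\lambda_0(t)$ is as in (13), and
$$\lambda_+(t) = \sum_{x\in S_+}\sum_{e=1}^{4}\phi_e(X(t),x)\Bigg(\nu_e + \underbrace{\sum_{e'=1}^{4}\sum_{x'=1}^{3K}\int_{[0,t)}\kappa_{e',e}(t-s,x')\,d\tilde N_{e',x'}(s)}_{=\,\ell_e(t)} + \sum_{x'=1}^{3K}\int_{[0,t)}\kappa_{1,e}(t-s,x')\,d\tilde N_{0,x'}(s)\Bigg).$$
By price-symmetry, the terms $\sum_{x\in S_-}\sum_{e=1}^{4}\phi_e(X(t),x)\,\ell_e(t)$ and $\sum_{x\in S_+}\sum_{e=1}^{4}\phi_e(X(t),x)\,\ell_e(t)$ will cancel out from the difference $\lambda_-(t)-\lambda_+(t)$.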