Optimal inventory management and order book modeling

(1)

HAL Id: hal-01710301

https://hal.archives-ouvertes.fr/hal-01710301v2

Submitted on 6 Nov 2018

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Optimal inventory management and order book

modeling

Nicolas Baradel, Bruno Bouchard, David Evangelista, Othmane Mounjid

To cite this version:

Nicolas Baradel, Bruno Bouchard, David Evangelista, Othmane Mounjid. Optimal inventory manage-ment and order book modeling. ESAIM: Proceedings and Surveys, EDP Sciences, 2019, 65, pp.145-181. �hal-01710301v2�

(2)

Optimal inventory management and order book modeling

∗

Nicolas Baradel† Bruno Bouchard‡ David Evangelista § Othmane Mounjid ¶

November 7, 2018

Abstract

We model the behavior of three agent classes acting dynamically in a limit order book of a financial asset. Namely, we consider market makers (MM), high-frequency trading (HFT) firms, and institutional brokers (IB). Given a prior dynamic of the order book, similar to the one considered in the Queue-Reactive models [12, 18, 19], the MM and the HFT define their trading strategy by optimizing the expected utility of terminal wealth, while the IB has a prescheduled task to sell or buy many shares of the considered asset. We derive the variational partial differential equations that characterize the value functions of the MM and HFT and explain how almost optimal control can be deduced from them. We then provide a first illustration of the interactions that can take place between these different market participants by simulating the dynamic of an order book in which each of them plays his own (optimal) strategy.

Key words: Optimal trading, market impact, optimal control.

MSC 2010: 49L20, 49L25.

1 Introduction

The comprehension of the order book dynamic has become a fundamental issue for all market par-ticipants and for regulators that try to increase the market transparency and efficiency. A deep understanding of the order book dynamic and agents behaviors enables: market makers to en-sure liquidity provision at cheaper prices, high-frequency traders to reduce arbitrage opportunities, investors to reduce their transaction costs, policy makers to design relevant rules, to strengthen mar-ket transparency and to reduce marmar-ket manipulation. Moreover, modeling the order book provides insights on the behavior of the price at larger time scales since the price formation process starts at the order book level, see e.g. [11] for Brownian diffusion asymptotic of rescaled price processes. Recently, the widespread market electronification has facilitated the access of high quality data de-scribing market participants decisions and interactions at the finest time scale, on which statistics

∗

This research is part of a Cemracs 2017 project and benefited from the support of Initiative de Recherche “Stratégies de Trading et d’Investissement Quantitatif”, Kepler-Chevreux and Collège de France. We are very gratefull for the support of P. Besson and his team.

†

Université Paris-Dauphine, PSL Research University, CNRS, CEREMADE, Paris. ba-radel@ceremade.dauphine.fr.

‡

Université Paris-Dauphine, PSL Research University, CNRS, CEREMADE, Paris. bouchard@ceremade.dauphine.fr. Research of B. Bouchard partially supported by ANR CAESARS (ANR-15-CE05-0024).

§

King Abdullah University of Science and Technology (KAUST), CEMSE Division, Thuwal 23955-6900. Saudi Arabia. david.evangelista@kaust.edu.sa. D. Evangelista was partially supported by KAUST baseline funds and KAUST OSR-CRG2017-3452.

¶

(3)

can be based. The availability of the order book data certainly allows a better understanding of the market activity. On the other hand, the recent market fragmentation and the increase of trading frequency rise the complexity of agents actions and interactions.

The main objective of the present paper is to propose a flexible order book dynamics model close to the one of [12, 18, 19, 23], construct a first building block towards a realistic order book modeling, and try to better understand the various regimes related to the presence of different market participants. Instead of considering a pure statistical dynamics as in e.g. [12], we construct a structural endogenous dynamics, see e.g. [22, 23], based on the optimal behavior of agents that are assumed to be rational. For numerical tractability, we simplify the market in three classes of (most significant) participants: the market makers (MM), the high-frequency trading (HFT) firms, and the institutional brokers (IB). Each of them decides of his policy in an optimal way, given prior statistics, and then interact with the others given the endogenous realizations of the market.

More precisely, we postulate that they assume an order book model similar to the one suggested by the Queue-Reactive model [18], see also [1, 12, 17], in which we restrict to the best bid and ask

queues1_{: limit and aggressive orders arrive with certain intensities, when a queue is depleted it is}

regenerated according to a certain law and possibly after a price move, the spread can take two

different values2. Importantly, we also take the order book’s imbalance into account in the modeling

of the different transition probabilities, see e.g. Besson et al. [9].

The market participants can either put limit or aggressive orders. The aim of a market maker is to gain the spread. He should therefore essentially put limit orders, aggressive orders being used when his inventory is too unbalanced. In our model, he can only acts on the given order book. The high-frequency trader is assumed to play on the correlation between the order book dynamics, viewed as the stock price, and the price of another asset, called futures hereafter. Indeed, he believes that the difference between the stock and the futures prices is mean-reverting. Whenever he buys/sells one unit of the stock, he sells/buys back one unit of the futures. We do not handle the order book associated to the futures but simply model the price of the futures as the mid-price of the stock to which a mean-reverting process is added. Still, we introduce a (possibly equal to 0) transaction cost proportional to the size of the transaction. As the market-makers, he seeks for a zero inventory at the end of the trading period. Finally, institutional brokers are simply assumed to play VWAP (Volume Weighted Average Price)- or Volume-based strategies (trading algorithms). Again, they essentially use limit orders and become aggressive when they are too late in their schedule.

We focus on the derivation of the optimal strategies of the market makers and the high-frequency traders, and on how they can be computed numerically by solving the associated variational partial differential equations. Note that it is important to consider their strategies within a dynamical model as current actions impact the order book and therefore may modify its futures dynamics. We will actually see that, in certain situations, participants can place aggressive orders or limit orders in the spread just to try to manipulate the order book’s dynamics in a favorable way. Note that the dimension of our control space is higher than in [1, 17, 22, 23] since more complex decisions are required to tackle the market making problem.

However, the ultimate goal of this work is to provide a market simulator. In the last section, we already present simulations of the market behavior given the pre-computed optimal strategies of the different actors. More precisely, we will only simulate the evolution of the mean-reverting process (driving the difference between the stock and the futures price) together with the reconstruction of the queues when prices move, and let the participants play their optimal strategies given the

1_{This limitation is for numerical tractability. It is already enough for most markets.} 2

(4)

evolution of the order book due to their different actions. This should allow us to study how these different market participants may interact among each other if each of them is playing his optimal policy. In particular, one should observe different market regimes depending on the proportion of the different participants in the total population, on their risk aversion, etc. First simulations are provided in this paper, a more throughout study is left for future research.

Our approach therefore lies in between two current streams of literature. The first one is based on “general equilibrium models”, including economic models, where the market activity is generated by interactions between rational agents who take optimal decisions that interact through the market netting process, see e.g. [14, 24, 25]. The second stream of literature considers purely statistical models where the order book is seen as a random process, see e.g. [2, 3, 8, 12, 13, 16, 18, 19, 20, 21, 26]. The statistical models focus on reproducing many salient features of a real market rather than agents behaviors and interactions. In our approach, we take into account that the agent’s behavior are essentially based on statistical approaches, but that they eventually interact with each other.

We end this introduction with an outline of our paper. In Section 2, we present the general order book dynamics. The marker maker control problem is studied in details in Section 3. There, we present the equations satisfied by their optimal strategy and propose a numerical solution for this problem, together with numerical illustrations. In Section 4, we formulate the high-frequency trader control problem and perform a similar analysis. The institutional broker strategy is described in Section 5, where we restrict to VWAP and Volume liquidation problems. Finally, in Section 6, we simulate a realistic market using the three agent’s optimal trading strategies.

2 General order book presentation and priors of the market

par-ticipants

As mentioned above, we focus on a single order book and only model the best bid and ask prices, in a similar way to [12]. In this section, we describe the general market mechanisms as well as the priors on which the optimal strategies of the agents are based. We fix a terminal time horizon T

and consider a probability space (Ω, P). Here, Ω := Ω1× Ω2 and P := P1⊗ P2, where Ω1 is the

space of R11-valued cadlag paths on [0, T ] endowed with a probability measure P1 with full support

on Ω1, and Ω2 is the one dimensional Wiener space endowed with the Wiener measure P2.

We denote respectively by (P_tb)t≥0and (Pta)t≥0the best bid offer and the best ask offer processes

on the market. They are valued in dZ where d > 0 is the tick size. We denote by (Qbt)t≥0 and

(Qa_t)t≥0 the sizes of the corresponding queues valued in N∗. To simplify the notation, we introduce

P := (Pb_{, P}a_{), Q := (Q}b_{, Q}a_{) and define the spread process as δP := P}a_{− P}b_{. Moreover, we}

assume3 that δPt∈ {d, 2d} for all t ≥ 0.

We denote by (τi)i≥1 the times at which orders are sent to the market. We assume that this

sequence is increasing and that #{i ∈ N : τi < T } < ∞ a.s. The market participants can send

different types of orders at each time τ_i:

• Aggressive orders of size αb

i ∈ N ∩ [0, Qb] at the bid or of size αai ∈ N ∩ [0, Qa] at the ask: the

size of the corresponding queue, Qb or Qa, decreases by the size of the aggressive order, αb_i or

αa_i.

3

The extension to more possible spread values is straightforward. We stick to this setting for notational and computational simplicity. Note that this limit is also justified by empirical evidences for many stocks, see [12].

(5)

• Limit orders of size Lb

i ∈ N at the bid or of size Lai ∈ N at the ask: the size of the corresponding

queue, Qb or Qa, increases by the size of the limit order, Lb_i or La_i.

• When δP = 2d: Limit orders of size Lb,

1 2

i ∈ N at the bid or of size L

a,1 2

i ∈ N at the ask: the

order is placed inside the spread, at the price Pb+ d = Pa− d, this generates a new queue at

the bid or at the ask, of size Lb,

1 2

i or L

a,1 2

i , and a price move.

• Cancellations: Cancellations of Mb

i ∈ N ∩ [0, Qb] orders at the bid or of Mia ∈ N ∩ [0, Qa]

orders at the ask. The difference between cancellations and aggressive orders is that aggressive orders consume the bottom of the limit while we see cancellations as only consuming the top of the limit first.

We assume that the sequence (αb

i, αai, Lbi, Lai, L b,1₂ i , L

a,1₂

i , Mib, Mia)i≥1 is made of random

vari-ables leaving, with probability one, on the state space C◦ defined as the collection of elements

(ab, aa, `b, à, `b,12, à, 1 2, mb, ma) ∈ N8 such that                  abaa= 0 `b = à= 0 if max{ab, aa} ≥ 1 `b,12 = 0 if max{ab, aa, `b} ≥ 1 à,12 = 0 if max{ab, aa, à} ≥ 1 mb= 0 if max{ab, aa, `b, `b,12} ≥ 1 ma_{= 0 if max{a}b_{, a}a_{, `}a_{, `}a,1₂_{} ≥ 1} . (2.1)

We interpret the above expression as follows. First, aggressive orders can not be sent simultaneously at the bid and at the ask. Next, limit orders (at the current bid/ask prices) can not be placed at the same time that aggressive orders are sent. Finally, one can not place limit orders within the spread if limit orders at the current bid/ask prices are placed. Because we only consider the first limits, these conditions are natural whenever one presumes that orders of different market participants do not arrive exactly at the same time.

Depending on the arrival of orders, queues can be depleted. In this case, new queues can be re-generated, at the same prices or at different prices, and possibly with a change of the spread

value. To model this, we introduces a sequence of random variables (_i, b_i, a_i)i≥1 with values in

{0, 1} × (N∗)2. The sequence (i)i≥1 will describe possible jumps of the bid/ask prices, while the

sequence (b_i, a_i)i≥1 will describe the new sizes of the queues when they are re-generated, after one

of them is depleted. More precisely, we postulate the dynamics

P_τb_i = P_τb_i−1 + d 1{i=1} −1_{Qb τi−1= ˆαbi}+ 1{δPτi−1=2d}1{Q a τi−1= ˆαai} + 1 {Lb, 12 i >0} P_τa_i = P_τa_i−1 + d 1{i=1} 1{Qa

τi−1= ˆαai}− 1{δPτi−1=2d}1{Qb_τi−1= ˆαb i} − 1 {La, 12 i >0} Qb_τ_i = Qb_τ_i−1+ Lb_i + (Lb, 1 2 i − Qbτi−1)1_{Lb, 1₂ i >0} − ˆαb_i1{∆Pb τi=0}+ ( b

i − Qbτi−1)1{∆P_τib6=0}∪{ ˆαb_i=Qb_τi−1}

Qa_τ_i = Qa_τ_i−1+ La_i + (La, 1 2 i − Qaτi−1)1_{La, 12 i >0} − ˆα_ia1{∆Pa τi=0}+ ( a

i − Qaτi−1)1{∆P_τia6=0}∪{ ˆαai=Qaτi−1}

(2.2) for i ≥ 1, where

ˆ

(6)

with

(P₀b, P₀a, Qb₀, Qa₀) ∈ DP,Q:= {(pb, pa, q) ∈ (dZ)2× (N∗)2 : pa− pb∈ {d, 2d}},

and the convention τ0= 0−. We refer to (2.1) to see that this dynamics is consistent. In particular,

prices can move only if one of the queues is depleted because of the arrival of aggressive orders or if a new limit order is inserted within the spread. These two situations can not occur simultaneously. The ask price can move by d when the ask queue is depleted. If the spread was already 2d, then the bid price moves up as well. The other way around if the bid queue is depleted. In the following, we extend the dynamics of (P, Q) by considering it as a step constant right-continuous process on [0, T ].

We now denote by E the N12-valued step constant right-continuous process defined by

∆Eτi := Eτi− Eτi−1 := (α b i, αai, Lbi, Lai, L b,1 2 i , L a,1 2 i , M b i, Mia,i, bi, ai, 1), i ≥ 1,

with Eτ0 := E0 := 0. Later on, we shall only write

(Pτi, Qτi) = TP,Q(Pτi−1, Qτi−1, ∆Eτi) (2.3)

in which the map TP,Q is defined explicitly by (2.2).

The process E models the flow of all the orders on the market. From the viewpoint of a market participant, it corresponds to its own orders and to the other participants’ orders that we denote

by ˜E. The process ˜E has jump times (˜τi)i≥1⊂ (τi)i≥1of sizes

∆ ˜E˜τi =

X

j≥1

1{˜τi=τj}∆Eτj, i ≥ 1. (2.4)

It induces a counting measure ˜ν(dt, de). For a market participant, a prior on this measure is given

by the compensator ˜µ of ˜ν under P. We assume that it is state dependent. More precisely, we

consider a Borel kernel (p, q) ∈ DP,Q 7→ ˜µ(·|p, q) ∈ M([0, T ] × N8), where M([0, T ] × N8) denotes

the collection of non-negative measures on [0, T ] × N8. In particular, it can depend on the order

book’s imbalance, as observed in e.g. [9]. To be consistent with the constraints imposed above, it satisfies:

˜

µ(·|p, q) is supported by C◦, for all (p, q) ∈ DP,Q. (2.5)

It should also be such that aggressive orders and cancellations are never bigger than the correspond-ing queue size, which will be made more explicit in our numerical example sections, see Section 3.4.

Next, we shall denote by Eφ the flows corresponding to the trading strategy of either a market

maker, a high-frequency trader, or an institutional broker. Thus, she will assume facing a global

flow E = ˜E + Eφ_.

Moreover, for simplicity, we shall assume that ˜µ is of the form

d˜µ(c, ε, εb, εa, dt|p, q) = dλ(ε, εb, εa|p, q, c)dβ(c|p, q)dt (2.6)

in which λ and β are bounded Borel non-negative kernels and (without loss of generality) Z

dλ(ε, εb, εa|p, q, c) = 1, for all (p, q, c) ∈ DP,Q× C◦. (2.7)

Later on, when an order (αb_i, αa_i, Lb_i, La_i, Lb,

1 2 i , L

a,1₂

i , Mib, Mia) is sent by a MM, an HFT or an IB, we

(7)

Remark 2.1. Let γ(p, q) :=R dβ(c|p, q). Then, γ is uniformly bounded by the above assumption. Let τ be a stopping time and fix h > 0. Since λ integrates to one, it follows that P[#{t ∈ [τ, τ + h] :

∆ ˜Et 6= 0} = 1|Fτ] = hγ(Pτ, Qτ) + o(h) and P[#{t ∈ [τ, τ + h] : ∆ ˜Et 6= 0} > 1|Fτ] = o(h).

Moreover, the process counting the number of jumps of ˜E is dominated by a Poisson process with

intensity ¯_{γ := sup γ < ∞. Hence, P[#{t ≤ T : ∆ ˜}E_t 6= 0} ≥ k] ≤ ¯γT /k, for k ≥ 1, by Markov’s

inequality. Similarly, if g is a non-decreasing Borel map, then E[g(#{t ∈ [0, T ] : ∆ ˜Et 6= 0})] ≤

P

k≥1g(k) (¯γT )k

k! e −¯γT_.

3 Market maker’s optimal control problem

In this section, we describe the optimal control problem of the market maker, the key tools to characterize the solution and how to numerically approximate the optimal control.

3.1 Market maker’s strategy and state dynamics

The market maker typically places limit orders in order to make profit of the spread but can turn aggressive when his inventory is too important. At the end the trading period [0, T ], the later should be zero. In the following, we denote by G his gain process and by I his inventory. We also

need to keep track of the sizes of his orders already placed at the bid queue, Nb, and at the ask

queue, Na. For simplicity, we impose that new orders can not be placed at the bid (respectively at

the ask) if he already has previously taken a position at the bid (respectively at the ask). Then,

his position at the bid (resp. at the ask) is completely described by Nb (resp. Na) and the number

of units Bb _{before him in the bid-queue (resp. B}a _{before him in the ask-queue). Later on, we only}

write N = (Nb, Na) and B = (Bb, Ba).

To define the market maker’s control, we assume that he faces the exogenous process ˜E described

in Section 2.

For him, a control is a sequence of random variables φ = (τ_iφ, cφ_i)i≥1 where (τ_iφ)i≥1 is an

increasing sequence of times and each cφ_i = (αb,φ_i , α_ia,φ, Lb,φ_i , La,φ_i , Lb,

1 2,φ i , L a,1₂,φ i , M b,φ i , M a,φ i ) is C◦

-valued, see below for more implicit restrictions. The times (τ_iφ)i≥1 are the times at which he sends

orders: the action done at τ_iφ is cφ_i, whose components have the same meanings as in Section 2.

Given his own orders and the other participants’ orders, the sequence of times at which orders are

sent to the market is (τ_i)i≥1where τ0 := 0− and τi+1= min{˜τj > τi, j ≥ 1} ∧ min{τjφ> τi, j ≥ 1},

see (2.4) and above.

We denote by Eφthe càdlàg process that jumps only at the times τ_iφs with jump size

∆Eφ τ_iφ = (c φ i, X j≥1 (j, bj, aj)1{τ_iφ=τj}), i ≥ 1, so that4 E = Eφ+ ˜E

from his point of view. As usual, we impose that Eφ is predictable for the (completed) filtration

Fφ= (Ftφ)t≥0 generated5 by E . We will also keep in mind the number of actions

J :=X

i≥1

1_{τφ

i≤·}

4

We keep in mind that E also depends of φ but dot not make this explicit for ease of notations.

(8)

from time 0 on, as it may induce a cost.

We now impose a minimum and maximum inventory size, denoted by (−I∗, I∗) ∈ (−N) × N,

and that

#{τ_iφ≤ T, i ≥ 1} ≤ k_φ∧ J◦ a.s., for some kφ∈ N,

for some J◦ ∈ N ∪ {∞}. The constraint on the inventory is classic. The constraint on the number

of operations can be justified by operational constraints. In the case J◦ = ∞, it just means that

each control should be of essentially bounded activity, but the bound is not uniform on the set of controls and can be as large as needed.

To be admissible, a control φ should therefore be such that each cφ_i is C(Z_τφ

i−)-valued, where

Z := (P, Q, X) with X := (G, I, N, B, J ),

and, for z = (pb, pa, qb, qa, g, i, nb, na, bb, ba, j), C(z) is the collection of elements c := (ab, aa, `b,

à_{, `}b,1₂_{, `}a,1₂_{, m}b_{, m}a_{) ∈ C} ◦\ {0} such that : ab≤ minni + I∗− ba; qb o , aa ≤ minnI∗− i − bb; qa o , `b ≤ (I∗− i)1_{nb_=0}, à ≤ (i + I∗)1_{na_=0}, `b,12 ≤ (I∗− i)1_{pa_−pb_=2d}1_{nb_=0}, à, 1 2 ≤ (i + I∗)1_{pa_−pb_=2d}1_{na_=0} mb ≤ nb, ma ≤ na c = 0 if j = J◦.

Note that the constraints on the first three lines correspond to the fact that we do not want to

take a position that could lead to an inventory out of the limits −I∗ and I∗ if it was suddenly

executed. The indicator functions correspond to additional constraints on the controls, imposed for numerical tractability: no new limit order can be sent on a side if one has not been executed or has not cancelled the position on the same side before, no limit order can be sent in the spread if

it is not equal to two ticks. In this case, we write φ ∈ C(0, Z0−).

The dynamics of X is given by

Xτi = TX(Pτi−1, Qτi−1, Xτi−1, ∆E

φ

τi)1{∆E_τiφ6=0}+ ˜TX(Pτi−1, Qτi−1, Xτi−1, ∆ ˜Eτi)1{∆E_τiφ=0}, (3.1)

in which TX, ˜TX : R227→ dZ × N × N4× N. More precisely, consider the map

exe(a, n, b) := min{(a − b)+, n}, a, b, n ∈ N.

It represents the number of stocks set at a limit that are executed when an aggressive order of size a arrives, that the position in the queue is b, and the size of the posted block at this position is n.

Then, having in mind the constraints encoded in C(·) above, see also (2.1), we can write

TX = (TG, TI, TNb, T_Na, T_Bb, T_Ba, T_J) , ˜T_X = ( ˜T_G, ˜T_I, ˜T_Nb, ˜T_Na, ˜T_Bb, ˜T_Ba, ˜T_J)

where

TG(p, q, x, δ) = g + (ab− exe(ab, nb, bb))pb− (aa− exe(aa, na, ba))pa

TI(p, q, x, δ) = i − (ab− exe(ab, nb, bb)) + (aa− exe(aa, na, ba))

T_Nb/a(p, q, x, δ) = nb/a+ [`b/a− nb/a]++ `b/a,

1

2 − mb/a− exe(ab/a, nb/a, bb/a)

T_Bb/a(p, q, x, δ) = bb/a+ (qb/a− bb/a)1_{`b/a_6=0}− bb/a1_{mb/a_=nb/a_}− (bb/a∧ ab/a)1_{ab/a_6=0}− bb/a1

{`b/a, 1₂_6=0}

(9)

and ˜ TG(p, q, x, δ) = g − exe(ab, nb, bb)pb+ exe(aa, na, ba)pa ˜ TI(p, q, x, δ) = i + exe(ab, nb, bb) − exe(aa, na, ba) ˜

T_Nb/a(p, q, x, δ) = nb/a− exe(ab/a, nb/a, bb/a)

˜

T_Bb/a(p, q, x, δ) = [bb/a− ab/a]+1_{mb/a_=0}+ (bb/a− [mb/a− (qb/a− bb/a− nb/a)]+)+1_{mb/a_6=0}

˜ T_J(p, q, x, δ) = 0, for x = (g, i, nb, na, bb, ba, j), δ = (ab, aa, `b, `a, `b,12, `a, 1 2, mb, ma, ε, εb, εa), p = (pb, pa) and q = (qb_{, q}a_{) .}

Remark 3.1. It follows from (2.6) and the constraint that Eφis predictable that the probability that

Eφ _{and ˜}_{E jump at the same time on [0, T ] is zero. This justifies the formulation (3.1).}

For later use, note that it follows from (2.2) and (3.1) that

Zτi = T (Zτi−1, ∆E

φ

τi)1{∆E_τiφ6=0}+ ˜T (Zτi−1, ∆ ˜Eτi)1{∆E_τiφ=0}, (3.2)

in which

T = (T_P,Q, TX) and ˜T = (TP,Q, ˜TX).

3.2 The optimal control problem

The aim of the market maker is to maximize her expected utility

E[U (ZT)] in which U (z) := − exp −η{g + i+pb− i−pa− κ([i+− qb]++ [i−− qa]+) − %j} (3.3)

for z = (pb, pa, qb, qa, g, i, nb, na, bb, ba, j). In the above, η > 0 is the absolute risk aversion parameter,

and κ > 0 is a penalty term taking into account that liquidating the current inventory may lead to a

worse price than the one corresponding to the best bid or ask: the quantity i+pb− i−pa corresponds

to the liquidation value of the inventory if the bid and ask queues are big enough to absorb it, the expression starting from κ takes into account the number of shares that will not be liquidated at the best limit. The coefficient % ≥ 0 penalizes the number of actions taken by the market maker.

To define the corresponding value function, we now extend the definition of our state processes by writing

Zt,z,φ= (Pt,z,φ, Qt,z,φ, Xt,z,φ)

for the process satisfying (2.2)-(3.1) for the control φ and with initial condition Z_t−t,z,φ = z ∈ DZ

where D_Zis the collection of elements (pb, pa, qb, qa, g, i, nb, na, bb, ba, j) ∈ DP,Q×dZ×{−I∗, . . . , I∗}×

N4× {0, . . . , J◦} such that

nb+ i ≤ I∗ , i − na≥ −I∗

(10)

The corresponding set of admissible controls is C(t, z), and the filtration associated to φ ∈ C(t, z) is Ft,z,φ. We then set v(t, z) := sup φ∈C(t,z) J (t, z; φ) for (t, z) ∈ [0, T ] × DZ, where J (t, z; φ) := E[U (ZTt,z,φ)].

Remark 3.2. For later use, observe that

v(t, z) = e−ηgv(t, pb, pa, qb, qa, 0, i, nb, na, bb, ba, j)

for all t ≤ T and z = (pb, pa, qb, qa, g, i, nb, na, bb, ba, j) ∈ DZ. Moreover, if J◦ = ∞, we also have

v(t, z) = e−η(g−%j)v(t, z) := e¯ −η(g−%j)v(t, pb, pa, qb, qa, 0, i, nb, na, bb, ba, 0)

Remark 3.3. Note that v is bounded from above by 0 by definition. On the other hand, for all

z = (pb, pa, qb, qa, g, i, nb, na, bb, ba, j) ∈ DZ, v(t, z) ≥ min i∈[−I∗_,I∗_]E[U (P t,z,0 T , 0, 0, g, i, 0, 0, 0, 0, j)] = e−η(g−%j) min i∈[−I∗_,I∗_]E[U (P t,z,0 T , 0, 0, 0, i, 0, 0, 0, 0, 0)],

where Pt,z,0 corresponds to the dynamics in the case that the MM does not act on the order book

up to T . Moreover, it follows from (3.3) that

by Remark 2.1 and the fact that the price can jump only by d when a market event occurs. Thus, v

belongs to the class Lexp∞ of functions ϕ such that ϕ/L is bounded, in which

L(pb, g, j) := e−η(g−I∗|pb|−%j)

for (pb_{, g, j) ∈ dZ × dZ × {0, . . . , J}◦}.

3.3 The dynamic programming equation

The derivation of the dynamic programming equation is standard, and is based on the dynamic

programming principle. We state below the weak version of Bouchard and Touzi [10], we let v∗ and

v∗ denote the lower- and upper-semicontinuous envelopes of v.

Proposition 3.1. Fix (t, z) ∈ [0, T ] × D_Z and a family {θφ_{, φ ∈ C(t, z)} such that each θ}φ _{is a}

[t, T ]-valued Ft,z,φ-stopping time and kZt,z,φ

θφ kL∞ < ∞. Then, sup φ∈C(t,z) E h v∗(θφ, Z_θt,z,φφ ) i ≤ v(t, z) ≤ sup φ∈C(t,z) E h v∗(θφ, Z_θt,z,φφ ) i .

(11)

Proof. The right-hand side inequality follows from a conditioning argument, see [10]. The left-hand side is more delicate because the set of admissible controls depends on the initial data. However, it

can be easily proved along the lines of Baradel et al. [6] when {θφ, φ ∈ C(t, z)} is [t, T ] ∩ (N ∪ {t, T

})-valued. Then, the general case is obtained by approximating [t, T ]-valued stopping times from the

right (recall that Z is right-continuous).

One can then derive the corresponding dynamic programming equation. For z = (p, q, x) ∈ D_Z,

c ∈ C(z), t ≤ T , and a continuous and bounded function ϕ, we set Iϕ(t, z) :=

Z

(Kcϕ(t, z) − ϕ(t, z))dβ(c|p, q) and Kϕ(t, z) := sup

c∈C(z)

Kcϕ(t, z)

where we use the convention that K0 = sup{∅} = −∞, and

Kcϕ(t, z) :=

Z

ϕ(t, T (z, c, ε, εb, εa))dλ(ε, εb, εa|p, q, c),

recall (2.6), (2.7) and (3.2).

The partial differential equation characterization of v is then at least formally given by

min {−∂tϕ − Iϕ, ϕ − Kϕ} = 0 on [0, T ) × DZ

min {ϕ − U, ϕ − Kϕ} = 0 on {T } × DZ.

(3.4) In order to ensure that the above is correct, we need two additional conditions.

Assumption 3.1. For all upper-semicontinuous (resp. lower-semicontinuous) ϕ ∈ Lexp∞ , the map

(t, z) ∈ [0, T ] × DZ 7→ (I, K)ϕ(t, z) is upper-semicontinuous (resp. lower-semicontinuous) and

belongs to Lexp∞ .

Assumption 3.2. There exists a Borel function ψ that is continuously differentiable in time and such that

(i) 0 ≥ ∂tψ + Iψ on [0, T ) × DZ,

(ii) ψ − Kψ ≥ ι on [0, T ] × DZ for some ι > 0,

(iii) ψ ≥ U on {T } × D_Z,

(iv) lim inf

n→∞ (ψ/L)(tn, zn) = ∞ if |zn| → ∞ as n → ∞, for all (tn, zn)n≥1⊂ [0, T ] × DZ.

Then, one can actually prove that v is the unique solution of (3.4) in the class Lexp∞ defined in

Remark 3.3.

Theorem 3.1. Let Assumption 3.1 hold. Then, v∗ (resp. v∗) is a viscosity supersolution (resp.

sub-solution) of (3.4). If moreover Assumption 3.2 holds, then v is continuous on [0, T ) × DZ and is

the unique viscosity solution of (3.4), in the class of (discontinuous) solutions in Lexp∞ .

Proof. In view of Proposition 3.1, the derivation of the viscosity super- and subsolution properties is very standard under Assumption 3.1, see e.g. [6, 10]. As for uniqueness, let us assume that v and w

are respectively a super- and a subsolution. Let ψ be as in Assumption 3.2. Then, (v −w−ψ)(t_n, zn)

converges to −∞ if |zn| → ∞ as n → ∞, for any sequence (tn, zn)n≥1 ⊂ [0, T ] × DZ, and showing

that v ≥ w on [0, T ] × D_Z can be done by, e.g., following the line of arguments of [6, Proposition

(12)

Remark 3.4. If J◦ < ∞ and the supports of λ(·|p, q, c) and γ(·|p, q) are bounded, uniformly in

(p, q, c) ∈ (dZ)2× N2_{× C}

◦, then it is not difficult to see that the function defined by

ψ(t, z) := e2η(1+I∗)|z|e−r(j+t), for z = (p, q, g, i, n, b, j) ∈ DZ and t ≤ T,

satisfies the requirements of Assumption 3.2, for r large enough. Verifying Assumption 3.2 in

the case J◦ = ∞ seems much more difficult. On the other hand, the sequence of value functions

associated to a sequence (J_◦n)n≥1 increases to the value function associated to J◦ = ∞ as J◦n→ ∞.

which provides a natural way to construct a convergent numerical scheme for the computation of v and the optimal control policy, see Sections 3.4 and 3.5 below. Standard arguments based on this

approximation would also imply that v∗ = v and that v is the smallest supersolution of (3.4), in the

class of (discontinuous) solutions in Lexp∞ .

3.4 Dimension reduction, symmetries and numerical resolution

Before to provide a converging numerical scheme for (3.4), let us first recall that the variables g

(and j if J◦ = ∞) can be omitted, see Remark 3.2. If moreover, the transition kernels depend

on prices only through the spread (which is a natural assumption at least on a rather short time horizon), then one more dimension can be eliminated.

Assumption 3.3. The kernel (p, q, c) ∈ D_P,Q× C 7→ (λ(·|p, q, c), β(·|p, q)) depend on p = (pb_{, p}a₎

only through the value of the mid-spread δp := (pa_{− p}b_)/2.

Indeed, if Assumption 3.3 holds, then one easily checks that

e−ηip◦¯v(t, −δp, δp, qb, qa, 0, i, ·) = ¯v(t, p◦− δp, p◦+ δp, qb, qa, 0, i, ·)

= ¯v(t, pb, pa, qb, qa, 0, i, ·)

with p◦ := (pa+ pb)/2, so that

eηi(pb+δp)¯v(t, pb, pb+ 2δp, qb, qa, 0, i, ·)

does not depend on pb but only on δp.

The resolution of the equation can also be simplified by using potential symmetries, in the sense of the following assumption.

Assumption 3.4. For all (p, q) ∈ D_P,Q, c := (ab, aa, `b, `a, `b,12, `a,

1

2, mb, ma) ∈ C and all Borel

sets O ⊂ {0, 1}, Ob, Oa⊂ N, Z O×Ob_×Oa dλ(ε, εb, εa|p, q, c) = Z O×Oa_×Ob dλ(ε, εa, εb|¯p, ¯q, ¯c) where ¯p = (−pa, −pb), ¯q = (qa, qb) and ¯c = (aa, ab, `a, `b, `a,12, `b, 1

2, ma, mb). Moreover, for all Borel

sets O = O_ab× Oa a× O`b× Oa` × Ob `12 × Oa `12 × Ob m× Oam⊂ N8, Z O dβ(c|p, q) = Z ¯ O dβ(c|¯p, ¯q) where ¯O := Oa_a× Ob a× Oa`× O`b× O_`a1 2 × Ob `12 × Oa m× Obm.

(13)

The above assumption implies that the transition probabilities of the order book are symmetric at the bid and at the ask, whenever the configurations are. Then, v admits a symmetry which can be exploited to reduce the complexity of the numerical resolution of (3.4). Namely, under Assumption 3.4, we have

¯

v(t, pb, pa, qb, qa, 0, i, nb, na, bb, ba, j) = ¯v(t, −pa, −pb, qa, qb, 0, −i, na, nb, ba, bb, j).

Let us now turn to the definition of a numerical scheme for (3.4). We now make the additional assumption that the supports of λ and β are bounded (not that they are already discrete, by nature).

Assumption 3.5. J◦ < ∞ and there exists finite Borel sets O1 ⊂ N8 and O2 ⊂ N3 such that

β(·|p, q) is supported by O1 and λ(·|p, q, c) is supported by O2, for all (p, q, c) ∈ DP,Q× C.

Then, the operators I and K are explicit. Hence, the only required discretization is in time.

For a time step T /n > 0, we define a time grid π_n := {tn

i, i ≤ n} where tni = iT /n for i ≤ n. We

next consider the sequence of space domains D_Zk := DZ∩ [−k, k]11for k ≥ 1, and we let vkn be the

solution of vk_n(tn_i, ·) = max v_nk(tn_i+1, ·) +T nIv k n(tni+1, ·), max c∈C(·) Kc,n_vk n(tni+1, ·) = 0 on Dk_Z, i ≤ n − 1 vk_n− U = 0 on ({T } × Dk_Z) ∪ (πn× (DZ\ DkZ)), (3.5) where Kc,n_{= K}c₊T nI ◦ K c_.

This fully explicit scheme is convergent.

Proposition 3.2. Let Assumption 3.5 hold, then the sequence (vk_n)k,n≥1 converges pointwise to v

on [0, T ) × D_Z as k, n → ∞.

Proof. First note that (vk_n/L)k,n≥1 is uniformly bounded, where L is defined in Remark 3.3. This

follows from Assumption 3.5 and a simple induction argument, compare with Remark 3.3. Then,

standard stability results, see e.g. [7] and [5, Section 3.2], imply that the relaxed upper-limit v∞

and lower-limit of v∞ of (vkn)k,n≥1 are respectively sub- and supersolution of (3.4) and belong to

the class Lexp∞ . The comparison result mentioned in the proof of Theorem 3.1, see Remark 3.4, thus

implies that v∞≥ v ≥ v∞ while v∞≤ v∞ by definition.

3.5 Approximate optimal controls

In the following, we estimate the optimal control in a classical way. For each i < n, we choose a

measurable map ˆck_n(tn_i, ·) such that

ˆ

ck_n(tn_i, ·) ∈ arg max{Kc,nv_nk(th_i+1, ·), c ∈ C(·)} on Dk_Z

ˆ

ck_n(tn_i, ·) = 0 on DZ\ DkZ,

and define the sequence of stopping times ˆ τ_j+1n,k := min{tn_i : i ≥ 0, tn_i > ˆτ_jn,k, (vk_n− Kˆckn_vk n(tni+1, ·))(tni, ˆZ n,k tn i−) = 0}, j ≥ 0,

with ˆτ₀n,k:= 0−, and in which ˆZn,k= ( ˆPn,k, ˆQn,k, ˆXn,k) is defined as in (3.2) for the initial condition

Z0− at 0 and the control associated to ˆφkn := (ˆτ

n,k i , ˆckn(ˆτ n,k i , ˆZ n,k ˆ

τ_in,k−))i≥1 in a Markovian way. This

(14)

Proposition 3.3. Let the conditions of Proposition 3.2 hold. Then, lim

n,k→∞E[U (

ˆ

Z_Tn,k)] = v(0, Z0−).

Proof. Let γ(p, q) := R dβ(c|p, q) and recall that γ is uniformly bounded by assumption, as well

as the sequence (v_nk)k,n≥1. Then, it follows from Remark 2.1 that

vk_n(tn_i+1, ˆZ_tn,kn i ) + T nIv k n(tni+1, ˆZ n,k tn i ) =v k n(tni+1, ˆZ n,k tn i )(1 − T nγ( ˆP n,k tn i , ˆQ n,k tn i )) +T n Z Kcv_nk(tn_i+1, ˆZ_tn,kn i )dβ(c| ˆP n,k tn i , ˆQ n,k tn i ) =E[vkn(tni+1, ˜T ( ˆZ n,k tn i , ∆ ˜Et n i+1))|F ˆ φk n tn i ] + o(n −1 ) and, similarly, Kˆc k n(ˆτ n,k i , ˆZ n,k ˆ τn,k i − ),n vk_n(ˆτ_in,k+T n, ˆZ n,k ˆ τ_in,k−) =E[v k n(ˆτ n,k i + T n, ˜T ( ˆZ n,k ˆ τ_in,k+T n− , ∆ ˜E ˆ τ_in,k+T_n))|F ˆ φk n ˆ τ_in,k−] + o(n −1₎

with the convention that ˜T (·, 0) is the identity. Let θn

k be the first time when ˆZn,k exists the

domain Dk_Z. In view of (3.5), Assumption 3.5 (in particular that J0 < ∞) and Remark 2.1 (see the

arguments at the end of Remark 3.3), an induction implies that

v_nk(0, Z0−) = E[U ( ˆZ_θn,kn

k )] + on(1) = E[U ( ˆZ

n,k

T )] + on(1) + ok(1)

in which o_n(1) and ok(1) go to 0 as n → ∞ and k → ∞. It remains to appeal to Proposition 3.2.

3.6 Numerical experiments

We now turn to a numerical experiment. We compute an approximation of the optimal control as described in Section 3.5, using the simplifications detailed in Section 3.4.

Let us first describe a realistic prior distribution for the evolution of ˜E. The coefficients we use

are inspired from the behavior of the stock Société Générale (CLE FP)6 and from [9].

As for the prior on the dynamics of the market. We simply consider that both market and limit orders arrive according to a Poisson process. Both limit and aggressive orders arrive with an intensity of 0.6 per second. When a limit order arrives, it is assigned to the bid or the ask with probability 1/2. When a market order arrives, we assign it to the bid or the ask according to the statistics described for big caps in [9, Chart 8 p.10]. Namely, if

Imb˜τi := Qb ˜ τi− Q a ˜ τi Qb ˜ τi+ Q a ˜ τi

is the order book imbalance at the time ˜τi at which the market order arrives, then it arrives at the

ask with probability 0.5 + 0.35 ∗ Imbτ˜i.

To describe the size of the orders and of the inventory, we take 1/2 of the ATS (mediAn7 Trade

Size) as the unit.

6

We thank Chevreux-Kepler for providing us these data.

7

(15)

The size of the market orders is also assumed to be dependent on the order book imbalance. We use [9, Chart 16 p.10] to estimate that the size of the trade arriving at the ask (if it arrives at the ask), represents a percentage of the queue (that we round to an integer number, by above).

Namely, we set ˆf_τa_˜_i := 0.7 + 0.3 ∗ Imbτ˜i, and assign a (conditional) probability of 60% that the order

is of size round[ˆf_τ_˜a_i∗ Qa

˜

τi] and the same probability that the executed volume deviates from the latter

by one unit (with equal probability to be by one more and one less unit - if quantities are negative or bigger than the size of the queue, we obviously set them to 0 or to the size of the queue).

We use a simpler modeling approach for the limit orders. With 55% probability a limit order (if it arrives) is of size of 2 (recall that the unit is 1/2 of an ATS). It is of size 3 with probability 10% and of size 1 with probability 35%. Again, it is based on [9].

When a queue is depleted, the probability of a price move is set to P[i= 1|Fτ˜i] = 75%. If the

bid is depleted but the bid price does not go down, the size of the new bid queue is set to 2 units with probability 60%, 1 unit with probability 25% and 3 units with probability 15%, otherwise it is set to 10 units with probability 60%, 5 units with probability 25% and 12 units with probability 15%. The same applies to the ask price if this is the ask queue that is depleted.

If both ask and bid prices move, recall (2.3), we take the same distribution for both queues and consider them as being (conditionally) independent. The distribution corresponds to the one of a price move, as described above.

When the spread is equal to two ticks, the next limit order arrives in the spread with probability 90%. This models the fact that a spread of two ticks is not common, see e.g. [12]. To be consistent with the probability of arrival of market orders at the bid or at the ask, we assume that a new

bid limit is created in the spread with probability 0.5 + 0.35 ∗ Imb_τ_˜_i (otherwise, this is a new ask

limit). The size of this new limit is 2 with 55%, of size 3 with probability 10% and of size 1 with probability 35%. This corresponds to the behavior of the stock Société Générale (CLE FP). We do not change the dynamic of aggressive orders with the size of the spread.

Let us now describe the other parameters of the Market Maker’s optimal control problem. We set the bid price to 10 at time 0−, the spread is one tick, and the tick equal to 0.01, recall from Section 3.4 that only the spread size matters (because we have here the required symmetry) and observe that the latter can be rescaled together with the level of risk aversion, which is here fixed

to η = 1. We take κ = 0.02 and % = 10−20. The time horizon is T = 59 seconds, and the time step

is 1 second. We keep a small time horizon for a better visibility of the evolution of the order book. We do not consider cancellations from the rest of the market for simplicity. Moreover, in order to reduce the computation time, we add the additional constraint that the MM can not cancel only part of position in a queue, he can only cancel the whole position. We also fix a maximal queue

size of 12 and fix the maximal absolute value of the inventory to I∗ = 7. The size of the limit and

market orders sent by the MM are constrained to be less than 3. This corresponds to adding an additional constraint in the definition of C(·) which does not change the above analysis.

In Figure 2, we provide a simulated path of the optimal strategy, starting from a symmetric configuration of the order book, with queue lengths equal to 6. In this simulation, the MM always

play before the other (random) players8. The top left graphic describes the control played by

the MM. Triangles pointing outward (with respect to the zero line) correspond to limit orders, the number of triangles giving their size. Arrows with triangles pointing inward are cancellations, again the number of triangles gives the size. Aggressive orders are symbolized by lines with squares, while limit orders within the spread correspond to the lines with dots. The top right graphic gives the state of the order book just after the MM has played, and before the nature (i.e. the other players)

8

(16)

plays. The size of the lines gives the total length of each queues, while the dots symbolize the position of the MM in the queue. The middle left graphic describes the state of the order book after the nature plays. The bottom left graphic gives the value of the portfolio of the MM if he had to liquidate (by sending aggressive orders) his stock holding at the best bid/ask price. The final value gives the “true” liquidation value of the book. In the case that the final inventory can not be liquidated at the best bid/ask price, we liquidate the remaining part at the best bid/ask price minus/plus one tick. The middle right graphic is the inventory just before he plays and the bottom right one is the bid and ask prices just after he plays. Figure 1 provides the distribution of the gain

made by the MM, it uses 105 simulated paths.

His strategy can seem difficult to interpret at first sight. But, we have to remember that he believes that the imbalance plays an important role in the book order dynamics, and that he not only should take care of it but that he can actually use it: some limit orders are sent not to be executed but to influence the evolution of the price in a favorable direction. Also note that he should avoid the price to go down/up if his inventory is positive/negative. Having this in mind, it is not surprising that he can be sometimes at the limit of a price manipulation strategy, if he is big enough with respect to the market, which is the case in these simulations. Finally, we have to keep in mind that his strategy is constrained. He sometimes cancels positions to be free to react more quickly to a more favorable market configuration later on.

Not surprisingly, the MM first sends limit orders of equal sizes on both sides of the order book. Nothing happens until time t = 9. At this time is inventory is 2 and his position at the ask is still far from the beginning of the queue. In order to avoid increasing again his inventory, he cancels his remaining position of one unit at the bid. He puts a new position at the bid at time 10 after the queue has slightly increased, because of an exogenous limit order. Unfortunately for him, this new position is immediately executed, and his inventory jumps to 5. After the queue is regenerated, he immediately puts a limit order of size 2 at the bid. His reasoning is the following: 1. this position has little chance to be executed immediately, he does not take a risk of again increasing his inventory; 2. by doing this he increases the imbalance and therefore the probability of being executed at the ask so has to gain the spread and reduce his inventory. This strategy is successful since immediately 2 units of his positions at the ask are executed. The imbalance is still good even if he cancels his last unit at the bid, so as to be free to play the control he wants once his last unit at the ask will be executed. In fact, it does not work and he decides to refresh his global position at time 16 by canceling his final unit at the ask and putting limit orders again at time 17 in a symmetric way. This is a limit spread order at the bid. This makes sense since he has to avoid the stock price to go down, because he has a positive inventory. The fact that just after he alternates between putting and canceling limit orders at the ask is probably a numerical artifact: on the one hand, he wants to have a limit position at the ask because his inventory is large, on the other hand he does not want to increase the imbalance to avoid having a too big probability of being executed at the bid and of seeing the price go down. Similarly, he wants to keep a position at the bid to avoid a downward price move, while he really does not want to be executed on this side. The aggressive order at the bid at time 19 is just a partial cancellation, that is completed at time 20. He just does not want to be executed. By doing so, he unfortunately causes a downward jump of the bid, which is not good for him. Just after he sends a limit spread order at the bid to fight against this downward pressure, and then cancels it once the bid is back at a distance one tick of the ask. This strategy is successful. The rest of his orders can be interpreted in a similar manner. Just note that he starts to be aggressive at the bid side at time 49 because his inventory is too big and the maturity starts to be quite close, in particular the first market order of size 3 compensates the execution of a position of size 3 at the bid just before (so that the graphic of the inventory does not move, although the

(17)

inventory temporary jumps to the upper limit 7).

Histogram of Market Maker return

Return Density −0.4 −0.2 0.0 0.2 0.4 0.6 0.8 0 1 2 3 4 5

(18)

0 10 20 30 40 50 60 −3 −2 −1 0 1 2 3 Controls Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ●● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −10 −5 0 5 10

Queues after Control

Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −10 −5 0 5 10

Queues after Nature

Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −5 0 5 Inventory Time 0 10 20 30 40 50 60 −0.05 0.00 0.05 0.10 Portfolio Value Time 0 10 20 30 40 50 60 9.980 9.985 9.990 9.995 10.000 10.010

Bid and Ask Price

Time

Bid | Ask

(19)

4 High frequency trader’s pair trading problem

We now turn to the HFT strategy. We consider a pair trading strategy where the trader invests in the difference of two highly correlated assets. Here, we choose the futures price of the stock to be the second asset. The order book defined in Section 2 represents the dynamics of the stock. Whenever the trader buys/sells n units of stocks (being by an aggressive order or by the execution of a limit order), she sells/buys immediately n units of the futures. Her inventory should be fully liquidated at T .

4.1 The optimal control problem

We assume that the reference price F of the futures is given by

F = P

b_{+ P}a

2 + S (4.1)

where S is a mean-reverting process

S = S0+ Z · 0 ρ(ˆs − St)dt + Z · 0 σ(St)dWt. (4.2)

Here, ρ is the strength of mean reversion, ˆ_{s is the average of mean reversion and σ : R 7→ R is}

a Lipschitz bounded function representing the volatility of the process. In the following, we shall assume that

the support of σ is bounded. (4.3)

Remark 4.1. The above implies in particular that S lies in a certain compact set DS as soon as

S0 does. This could clearly be relaxed to the price of a finer analysis.

The strategy of the HFT is described by the same quantities as the one of the MM in Section 3. The only difference lies in the fact that she constantly holds a number equal to −I units of the futures F . We assume that buying/selling the futures leads to the payment of a proportional cost κ ≥ 0. Then, the dynamics of the corresponding state process X is given by

Xτi = TX(Sτi, Pτi−1, Qτi−1, Xτi−1, ∆E

φ

τi)1{∆E_τiφ6=0}+ ˜TX(Sτi, Pτi−1, Qτi−1, Xτi−1, ∆ ˜Eτi)1{∆E_τiφ=0},

(4.4)

with TX and ˜TX now defined with respect to

TG(s, p, q, x, δ) = g + (ab− exe(ab, nb, bb))∆−b − (aa− exe(aa, na, ba))∆a+

TI(s, p, q, x, δ) = i − (ab− exe(ab, nb, bb)) + (aa− exe(aa, na, ba))

T_Nb/a(s, p, q, x, δ) = nb/a+ [`b/a− nb/a]++ `b/a,

1

2 − mb/a− exe(ab/a, nb/a, bb/a)

T_Bb/a(p, q, x, δ) = bb/a+ (qb/a− bb/a)1_{`b/a_6=0}− bb/a1_{mb/a_=nb/a_}− (bb/a∧ ab/a)1_{ab/a_6=0}− bb/a1_{`b/a_6=0}

TJ(s, p, q, x, δ) = j + 1 and ˜ TG(s, p, q, x, δ) = g − exe(ab, nb, bb)∆+b + exe(aa, na, ba)∆a− ˜ T_I(s, p, q, x, δ) = i + exe(ab, nb, bb) − exe(aa, na, ba) ˜

T_Nb/a(s, p, q, x, δ) = nb/a− exe(ab/a, nb/a, bb/a)

˜

T_Bb/a(p, q, x, δ) = [bb/a− ab/a]+1_{mb/a_=0}+ (bb/a− [mb/a− (qb/a− bb/a− nb/a)]+)+1_{mb/a_6=0}

˜

(20)

in which ∆b±:= pb− pb+ pa 2 − s ± κ , ∆ a ±:= pa− pb+ pa 2 − s ± κ.

In the above, we used the notations x = (g, i, nb, na, bb, ba, j), δ = (ab, aa, `b, `a, `b,12, `a,

1

2, mb, ma,

ε, εb_{, ε}a_{), p = (p}b_{, p}a_{) and q = (q}b_{, q}a_).

The set of admissible controls C(0, S0, Z0−) is defined as in Section 3 but with respect to the

(completed) filtration Fφ= (F_tφ)t≥0 generated by (S, E ).

We also assume that she has an exponential type utility function, with risk aversion parameter

η > 0. Then, she wants to maximize over φ ∈ C(0, S0, Z0−) the expected utility

E[U (ST, Z_Tφ)] where U (s, z) := − exp −η{g + i+∆b−− i−∆a+− κ([i+− qb]++ [i −_{− q}a_]+_{) − %j}} _(4.5)

for z = (pb, pa_{, q}b_{, q}a_{, g, i, n}b_{, n}a_{, b}b_{, b}a_{, j), and where ∆}b

± and ∆a± are defined as above.

As in Section 3, we next extend the definition of our state processes by writing (St,s, Zt,s,z,φ) = (St,s, Pt,s,z,φ, Qt,s,z,φ, Xt,s,z,φ)

for the process satisfying (2.2)-(3.1)-(4.2) for the control φ and the initial condition (S_tt,s, Z_t−t,s,z,φ)

= (s, z) ∈ DS× DZ. The corresponding set of admissible controls is C(t, s, z), and the filtration

associated to φ ∈ C(t, s, z) is Ft,s,z,φ= (Fst,s,z,φ)s∈[t,T ]. We finally define the value function

v(t, s, z) := sup φ∈C(t,s,z) J (t, s, z; φ) for (t, s, z) ∈ [0, T ] × DS× DZ, where J (t, s, z; φ) := E[U (STt,s, Z t,s,z,φ T )].

We close this section with Remarks that are the counterparts of Remarks 3.2 and 3.3. Remark 4.2. For later use, observe that

v(t, s, z) = e−η(g−%j)v(t, z) := e¯ −η(g−%j)v(t, s, pb, pa, qb, qa, 0, i, nb, na, bb, ba, 0)

for all t ≤ T , s ∈ DS and z = (pb, pa, qb, qa, g, i, nb, na, bb, ba, j) ∈ DZ.

Remark 4.3. Note that v is bounded from above by 0 by definition. It also follows from (4.3) that

St,s takes values in the compact set DS so that v ∈ Lexp∞ by the same reasoning as in Remark 3.3.

4.2 The dynamic programming equation

As in Section 3.3, we first provide a dynamic programming principle. Again, we let v∗ and v∗

denote the lower- and upper-semicontinuous envelopes of v.

Proposition 4.1. Fix (t, s, z) ∈ [0, T ] × D_S× D_Z and a family {θφ, φ ∈ C(t, s, z)} such that each

θφ is a [t, T ]-valued Ft,s,z,φ-stopping time and k(St,s

θφ, Z t,s,z,φ θφ )k∞< ∞. Then, sup E h v∗(θφ, S_θt,sφ, Z t,s,z,φ θφ ) i ≤ v(t, s, z) ≤ sup E h v∗(θφ, S_θt,sφ, Z t,s,z,φ θφ ) i .

(21)

Proof. Let C_k(t, x, z) be the set of controls φ satisfying the additional constraint #{τ_iφ, i ≥ 1} ≤ k

a.s., and let v_k be the corresponding value function, for k ≥ 1. Then, it is not difficult to see that

vk is continuous, and the arguments of [10] imply that

sup φ∈Ck(t,x,z) E h vk(θφ, S_θt,sφ, Z t,s,z,φ θφ ) i ≤ v_k(t, s, z) ≤ sup φ∈Ck(t,x,z) E h vk(θφ, S_θt,sφ, Z t,s,z,φ θφ ) i ≤ sup φ∈C(t,x,z) E h v(θφ, S_θt,sφ, Z t,s,z,φ θφ ) i .

Since by definition v = limk→∞ ↑ vk and C(t, x, z) = ∪k≥1Ck(t, x, z), sending k → ∞ in the above

leads to the required result, recall Remark 4.3.

The partial differential equation associated to v is then at least formally given by

min {−Lϕ − Iϕ, ϕ − Kϕ} = 0 on [0, T ) × DS× DZ

min {ϕ − U, ϕ − Kϕ} = 0 on {T } × DS× DZ,

(4.6)

in which L is the Dynkin operator associated to (4.2):

Lϕ = ∂tϕ + ρ(ˆs − s)∂sϕ +

1

2σ

2_∂2 ssϕ.

To fully characterize the value function, we need the additional assumption, similar to Assump-tion 3.2.

Assumption 4.1. There exists a Borel function ψ ∈ C1,2_{([0, T ] × D}

S× DZ) such that

(i) 0 ≥ Lψ + Iψ on [0, T ) × D_S× D_Z,

(ii) ψ − Kψ ≥ ι on [0, T ] × D_S× D_Z for some ι > 0,

(iii) ψ ≥ U on {T } × DS× DZ,

(iv) lim inf

n→∞ (ψ/L)(tn, sn, zn) = ∞ if |zn| → ∞ as n → ∞, for all (tn, sn, zn)n≥1⊂ [0, T ] × DS× DZ.

Theorem 4.1. Let Assumption 3.1 hold. Then, v∗ (resp. v∗) is a viscosity supersolution (resp.

sub-solution) of (4.6). If moreover Assumption 4.1 holds, then v is continuous on [0, T ) × DZ and is

the unique viscosity solution of (4.6), in the class Lexp∞ .

Proof. In view of Proposition 4.1, the derivation of the viscosity super- and subsolution properties is very standard under Assumption 3.1, see e.g. [10]. As for uniqueness, this follows from a comparison

principle that can be proved in the class Lexp∞ by arguing as in the proof of Theorem 3.1.

As for the MM problem, Assumption 4.1 is easily checked when J◦ < ∞.

Remark 4.4. If J◦ < ∞ and the supports of λ(·|p, q, c) and γ(·|p, q) are bounded, uniformly in

(p, q, c) ∈ (dZ)2× N2_{× C}

◦, then the function ψ defined in Remark 4.4 also satisfies the requirements

(22)

4.3 Dimension reduction, symmetries and numerical resolution

As in Section 3.4, one can use specificities of the value function to reduce the complexity of the res-olution of (4.6). First, the variable (g, j) can be omitted, see Remark 4.2. Moreover, if Assumption 3.3 holds, then one easily checks that

¯

v(t, s, pb, pb+ 2δp, qb, qa, 0, i, nb, na, bb, ba, 0)

does not depend on pb. The difference with the relation obtained in Section 3.4 is du to (4.1) and

the fact that the HFT always hold a symmetric position in the stock and the futures (he is protected against evolutions of the mid-price). The other symmetry relations described in Section 3.4 do not hold because of the dependence on the process S.

Let us now turn to the definition of a numerical scheme for (4.6). Recall that, under Assumption 3.5, the operators I and K are explicit. Hence, the only required discretization is in time and in the variable s. We shall consider separately the diffusion part and the obstacle part of the PDE. More

precisely, we fix a time and a space grid π_tn:= {tn_i, i ≤ nt} and πsn:= {sni, i ≤ ns} where tni = iT /nt

for i ≤ n_t and sn_i = s + i(s − s)/ns, i ≤ ns. Here, s and s are such that DS= [s, s], recall Remark

4.1, and n := (n_t, ns) ∈ N2. We next define the sequence of space domains DkZ := DZ∩ [−k, k]11

for k ≥ 1, and we let vk_nbe defined by

vk_n= maxnvˇk_n, Kˇv_nko= 0 on πn_t × π_sn× Dk_Z,

v_nk− U = 0 on ({T } × π_sn× Dk_Z) ∪ (π_tn× πn_s × (D_Z\ Dk_Z)).

(4.7)

Here, for i ≤ nt− 1 and i0 ≤ ns,

ˇ vk_n(tn_i, sn_i0, ·) = E[vk_n(tn_i+1, p_n(S tn i,sni0 tn i+1 ), ·)] + T nIv k n(tni+1, sni0, ·), on Dk_Z, (4.8)

where pn is the left-hand side projection operator on πsn.

Note that the above numerical scheme is not fully explicit as it requires to compute conditional expectations. This can however be easily performed either by a finite difference scheme or by Monte-Carlo techniques in a very classical way. Note in particular that the randomness in these conditional expectations only comes from a one dimensional factor.

Proposition 4.2. Let Assumptions 4.1 and 3.5 hold, then the sequence (vk_n)n≥1converges pointwise

to v as nt, ns, k → ∞.

Proof. Note that Assumption 4.1 ensures that comparison holds for (4.6) in the class Lexp∞ , see

e.g., [6, Proposition 5.1]. Thus, as for Proposition 3.2, the result is an easy consequence of the

stability result of [7].

4.4 Approximate optimal controls

The optimal control can be numerically estimated as in Section 3.4. We first extend (vk_n, ˇv_nk) to

π_tn× D_S× D_Z by setting (vk_n, ˇvk_n)(·, s, ·) := (v_nk, ˇvk_n)(·, pn(s), ·). Then, we choose a measurable map

ˆ ck_n(tn_i, ·) such that ˆ ck_n(tn_i, ·) ∈ arg max{Kcˇvk_n(th_i, s, ·), c ∈ C(·)} , on DS× DkZ ˆ ck_n(tn_i, ·) = 0 on DS× (DZ\ DZk),

(23)

and define the sequence of stopping times ˆ

τ_j+1n,k := min{tn_i : i ≥ 0, tn_i > ˆτ_jn,k, (vk_n− Kcˆkn_ˇ_vk

n(tni, ·))(tni, Stn

i, ˆZtni−) = 0}, j ≥ 0,

with ˆτ₀n,k := 0−, and in which ˆZn,k _{= ( ˆ}_Pn,k_{, ˆ}_Qn,k_{, ˆ}_Xn,k_{) is defined as in (3.2)-(4.2) for the initial}

condition Z0− and the control associated to ˆφkn := (ˆτ

n,k i , ˆckn(ˆτ n,k i , Sτˆ_in,k, ˆZ n,k ˆ τ_in,k−))i≥1 in a Markovian

way. Again, this provides a sequence of controls that is asymptotically optimal. Proposition 4.3. Let the conditions of Proposition 4.2 hold. Then,

lim

k→∞nt,nlims→∞E[U (ST

, ˆZ_Tn,k)] = v(0, S0, Z0−),

in which the limit is taken along sequences n such that n2_tn−1_s → 0.

Proof. Let γ(p, q) := R dβ(c|p, q) and recall that γ is uniformly bounded by assumption. The

family {(vk_n, ˇv_nk)/L}k,n≥1 is uniformly bounded, compare with Remark 4.3. Also note that (s, z) ∈

DS× DkZ 7→ U is Ck-Lipschitz, for some Ck> 0 that only depends on k. By induction (recall that

the component s is projected on π_sn),

|ˇvk_n(·, s, ·) − ˇv_nk(·, s0, ·)| ≤ C_k0[|s − s0| + ntO(n−1s )], for all s, s

0 _{∈ D}

S, on DkZ,

for some C_k0 _{> 0 that does not depend on n∈ N}2. Since S is 1/2-Hölder in time in L2, together

with Assumption 3.5, this implies that

KcI ˇvk_n(tn_i+1, s_in0, z) = Kc_{E[I ˇ}vk_n(tn_i+1, S

tn i,sni0 tn i+1 , z)] + Ok(n −1 2 t ) + ntOk(n−1s ), = E[ˇvkn(tni+1, S tn i,sni0 tn i+1 , Z tn i,z,c tn i+1 )] + Ok(n −1 2 t ) + ntOk(n−1s ) for z ∈ DkZ, c ∈ C(z),

in which the exponent c in Ztni,z,c means that the impulse c is given at tn

i, and we use Remark

2.1 again. In the above, O_k_{(ξ) is a function, that may depend on k, but such that ξ ∈ R \ {0} 7→}

|O_k(ξ)|/ξ is bounded in a neighborhood of 0. By the arguments already used in the proof of

Proposition 3.3, it follows that

v_nk(tn_i, s_in0, z) =E[vk_n(tn_i+1, S tn i,sni0 tn i+1 , Z tn i,z,0 tn

i+1 )] ∨ max_c∈C(z)E[v

k n(tni+1, S tn i,sni0 tn i+1 , Z tn i,z,c tn i+1 )] + ok(n−1t ) + ntOk(n−1s ), and therefore vk_n(0, S0, Z0−) = E[U (Sθn k, ˆZ n,k θn k )] + o k n(1) + Ok(n2tn−1s ) = E[U (ST, ˆZTn,k)] + o k n(1) + Ok(n2tn−1s ) + ok(1)

by induction, in which ok_n(1) goes to 0 as n → ∞, ok(1) goes to 0 as k → ∞, and θnk is as in the

proof of Proposition 3.3. It remains to appeal to Proposition 4.2.

4.5 Numerical experiments

We use the same model as the one described in Section 3.6. As for the new parameters, we take ˆ

(24)

0 10 20 30 40 50 60 −3 −2 −1 0 1 2 3 Controls Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −10 −5 0 5 10

Queues after Control

Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −10 −5 0 5 10

Queues after Nature

Time Bid | Ask ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 10 20 30 40 50 60 −5 0 5 Inventory Time 0 10 20 30 40 50 60 0.00 0.05 0.10 0.15 0.20 0.25 0.30 Portfolio Value Time 0 10 20 30 40 50 60 9.975 9.985 9.995 10.005

Bid and Ask Price / Futures Price

Time

Bid | Ask

Figure 3: Simulated path of the optimal strategy of the High Frequency Trader.

We approximate the behavior of the spread process S by a trinomial tree based on the transition probabilities associated to the diffusion (4.2), so that the expectation in (4.8) can be computed explicitly. More precisely, we consider a centered 6 points grid with mesh equal to half a tick

(25)

(i.e. 0.005).

The graphics in Figure 3 have the same interpretation as in Section 3.6 except that we now also provide the evolution of the Futures process F , this is the dashed line in the bottom right graphic. Remember that the HFT does not gain from the evolution of the mid-price, as her position in stocks in always covered by a symmetric position in the futures. She only gains from the evolution of the spread process S or from the bid-ask spread of the stock if S does not move. Not surprisingly, her behavior is quite different from the one of the MM described in Section 3.4.

As the MM would do, she first positions herself on the limits in a symmetric ways, because the spread with the future is 0. She then tries selling at time 4 by sending a limit sell order in the spread to buy the futures whose price (compared to the mid-price of the stock) is gone low. She is starting to play on the stock-futures spread. She follows this strategy until time 14. She plays in a more symmetric way after this until time 35, with a slight tendency to resume her inventory. At time 35, she decides to clearly sell stocks again and buy the futures whose price is again very low. From time 40 on, she inverts her position on the book to try buying the stock and thus sells the futures whose price has gown up. She finally inverts her position again at time 50 when the futures price goes back to the mid price, to liquidate her position on the pair.

Compared to the positions of the MM in Figure 2, her positions in one direction are much more clear, and are clearly driven by the stock-futures spread. The HFT also does not seem to try to control the stock’s mid-price as the MM did, again this is because it does not matter for her.

In Figure 4, we provide an estimation of the density of the gain of the HFT based on 105

simulated paths. Histogram of HFT return Return Density −0.1 0.0 0.1 0.2 0.3 0.4 0.5 0 1 2 3 4 5 6

Figure 4: Density estimation of the gain made by the High Frequency Trader.

5 Institutional broker strategies for portfolio liquidation

We now turn to the Institutional Broker problem. We consider in the following the two mostly used strategies for buying/selling a block of stocks. We focus on the buying side, selling being performed in a symmetrical way.