Markoff–Lagrange spectrum and extremal numbers

(1)

c

Markoff–Lagrange spectrum and extremal numbers

by

Damien Roy

Universit´e d’Ottawa Ottawa, Ontario, Canada

1. Introduction

The purpose of this paper is to present a link between two relatively distant topics of Diophantine approximation. The first one concerns theLagrange constant ν(ξ) of a real numberξdefined as the infimum of all real numbers c>0 for which the inequality

ξ−p q 6 c

q²

admits infinitely many solutions (p, q)∈Z²withq>1. This constant, which vanishes when ξ∈Q, provides a measure of approximation to ξby rational numbers. It is also given by

ν(ξ) = lim inf

q!∞ qkqξk,

wherekxkstands for the distance from a real numberxto a closest integer. TheLagrange spectrumis the setν(R) of values ofν. It is a subset of the interval [0,1/√

5 ]. Due to work of Markoff, the portion of the spectrum in the subinterval ¹₃,1/√

5

is well understood (see [3, Chapter II, §6]). It forms a countable discrete subset of this subinterval with

1

3 as its only accumulation point. Moreover the real numbers ξ for which ν(ξ)>¹₃ are all quadratic. As a consequence, any transcendental real number ξ has ν(ξ)6¹3. In the range

0,¹₃

, the situation becomes more complicated. Although, with respect to Lebesgue measure, almost all real numbers ξ have ν(ξ)=0, we know in particular that there are uncountably manyξ∈Rwithν(ξ)=¹₃.

The second topic is the problem of simultaneous rational approximations to a real number and its square, from a uniform perspective. Letγ=¹₂(1+√

5 ) denote the golden

Work partially supported by NSERC and CRM.

(2)

ratio. In 1969, Davenport and Schmidt showed [6, Theorem 1a] that, for each non- quadratic irrational real number ξ, there exists a constantc>0 with the property that, for arbitrarily large values ofX, the inequalities

|x0|6X, |x0ξ−x1|6cX^−1/γ and |x0ξ²−x2|6cX^−1/γ

admit no non-zero solution (x0, x1, x2)∈Z³. Recently, it was established [14, Theorem 1.1]

that their result is best possible in the sense that, conversely, there are countably many non-quadratic irrational real numbersξwhich we henceforth callextremal such that, for a larger value ofc, the same inequalities admit a non-zero integer solution for eachX>1.

Our objective here is to show the existence of extremal numbersξ withν(ξ)=¹₃ and to show how this set is intimately linked with Markoff’s theory.

In the next section, we present the main results of Markoff’s theory from a point of view pertaining to the study of extremal numbers. Then, in§3, we construct a family of extremal numbersξmparameterized by all solutions in positive integersm=(m, m1, m2) of the Markoff equation

m²+m²₁+m²₂= 3mm1m2, (1)

up to permutation, exceptm=(1,1,1). Our main result is that these numbersξm constitute a system of representatives of the equivalence classes of extremal numbersξwith ν(ξ)=¹₃, under the action of GL₂(Z) onR\Q by linear fractional transformations. To prove this, we develop further the properties of approximation to extremal numbers by quadratic real numbers obtained in [14,§8]. Each extremal number ξ comes with a sequence of best quadratic approximations{αi}i>1 which is uniquely determined byξ up to its first terms. In§4, we show that the sequence of their conjugates {α_i}_i_>₁ admits exactly two accumulation pointsξ⁰ and ξ⁰⁰ which are also extremal numbers and which we call theconjugates ofξ. Then, in§5, we show thatν(ξ)=ν(ξ⁰)=ν(ξ⁰⁰) and that these Lagrange constants can be computed as the infima of the absolute values of the binary real quadratic forms

|ξ−ξ⁰|⁻¹(T−ξU)(T−ξ⁰U) and |ξ−ξ⁰⁰|⁻¹(T−ξU)(T−ξ⁰⁰U)

on Z²\{(0,0)}. The latter quantities admit handy representations in terms of doubly infinite words attached to the continued fraction expansions of ξ and ξ⁰ on one hand, and of ξand ξ⁰⁰ on the other hand. This is at the basis of Markoff’s original approach.

However, it requires that 0<ξ <1 and max{ξ⁰, ξ⁰⁰}<−1. In§6, we show that each extremal number is GL₂(Z)-equivalent to exactly one extremal number ξ satisfying 0<ξ <1 and having conjugates ξ⁰ and ξ⁰⁰ of unequal negative integral parts. We say that such an extremal number isbalanced. We also provide a characterization of the numbers ξ_m in

(3)

terms of their continued fraction expansions. Finally, we conclude in§7 with the proof of our main result by showing that any balanced extremal number ξ with ν(ξ)=¹₃ is equivalent to some ξm on the basis of the strong combinatorial properties shared by the two doubly infinite words attached toξ. As a corollary, we obtain that an extremal numberξhasν(ξ)=¹₃if and only if its sequence of best quadratic approximations{αi}i>1

satisfiesν(αi)>¹₃ for infinitely many indicesi.

2. Markoff ’s theory

A general reference for this section is the exposition given by Cassels in [3, Chapter II]. In the presentation below, we reinterpret his constructions in [3,§II.3], from a point of view closer to the approach of Cohn in [4], to align them with similar constructions arising from the study of extremal numbers.

We first recall that the group GL2(Q) acts on the setR\Qof irrational numbers by

g·ξ=aξ+b

cξ+d, ifg= a b

c d

∈GL₂(Q), (2)

and that we have ν(g·ξ)=ν(ξ) for any g∈GL2(Z) and anyξ∈R\Q[3, §I.3, Corollary].

Consequently, the Lagrange spectrum can be described as the set of values taken by ν on a set of representatives of the equivalence classes ofR\Qunder GL2(Z).

A real binary quadratic formF(U, T)=rU²+qU T+sT²∈R[U, T] is said to beindef- inite if itsdiscriminant disc(F)=q²−4rs is positive. For such a form, one is interested in the quantity

µ(F) := inf{|F(x, y)|: (x, y)∈Z²\{(0,0)}}.

Keeping the same notation as in (2), the group R^∗×GL₂(Z) acts on the set of real indefinite binary quadratic forms by

(λ, g)·F(U, T) =λF((U, T)g) =λF(aU+cT, bU+dT), and this action fixes the ratioµ(F)/p

disc(F). TheMarkoff spectrum is the set of values of these quotients µ(F)/p

disc(F), whereF runs through the set of all real indefinite binary quadratic forms or equivalently through a system of representatives of the equivalence classes of these forms under the above action of R^∗×GL₂(Z). Although this spectrum contains strictly the Lagrange spectrum [5, Chapter 3, Theorem 1], a remark- able feature of Markoff’s theory is that the intersections of the two spectra with the interval ¹₃,1/γ

are equal (recall thatγ=¹₂(1+√ 5 )).

(4)

The theory provides explicit sets of representatives both for the equivalence classes of real numbersξ with ν(ξ)>¹₃ and for the equivalence classes of real indefinite binary quadratic formsF withµ(F)/p

disc(F)>¹₃. They are parameterized by the solutions in positive integers m=(m, m1, m2) of Markoff’s equation (1) upon identifying two solutions when one is a permutation of the other. Setting aside the “degenerate solutions”

(1,1,1) and (2,1,1) which have at least two equal entries, all other solutions in positive integers appear once and only once in the rooted binary tree

(5,1,2)

(13,1,5) (29,5,2)

(34,1,13) (194,13,5) (433,5,29) (169,29,2)

... ... ... ... ... ... ... ...

(3)

where each node (m, m1, m2) has successors given by (3mm1−m2, m1, m) on the left and by (3mm2−m1, m, m2) on the right. Moreover [3, §II.2] all nodes (m, m1, m2) satisfym>max{m1, m₂}.

The same construction starting with (2,1,1) as a root provides a tree which contains exactly once each triple (m, m₁, m₂) satisfying (1) andm>max{m₁, m₂}>0. In this new tree, each non-degenerate solution occurs twice, with the tree (3) appearing as its left half. This suggests to extend the latter by adding (2,1,1) as a right ancestor of (5,1,2):

(2,1,1) (5,1,2)

(13,1,5) (29,5,2)

... ... ... ...

(4)

In this extended tree, a node m=(m, m₁, m₂) has m₁>m₂ if and only if m has a left ancestor. In the sequel, we denote by Σ^∗ the set of all nodes of the tree (4), and by Σ=Σ^∗∪{(1,1,1)} the set of all positive solutions of the Markoff equation (1).

The next proposition lifts (3) to a tree whose nodes are triples of symmetric matrices in SL₂(Z) (compare with [3,§II.3] and [4,§5]).

(5)

Proposition 2.1. Put

M=

3 1

−1 0

and consider the binary rooted tree 5 3

3 2

,

1 1 1 2

,

2 1 1 1

13 8 8 5

,

1 1 1 2

,

5 3 3 2

29 17 17 10

,

5 3 3 2

,

2 1 1 1

... ... ... ...

(5) where each node (x,x₁,x₂) has successors (x₁Mx,x₁,x) on the left and (xMx₂,x,x₂) on the right. Then each node (x,x₁,x₂)of this tree is a triple of symmetric matrices in SL₂(Z)with positive entries of the form

x=

m k k l

, x₁=

m1 k1

k1 l1

and x₂=

m2 k2

k2 l2

(6) satisfying both x=x1Mx2 and max{k, l}6m62k. Moreover, the tree formed by replacing each of these triples of matrices (x,x1,x2) by the triple of their upper left entries (m, m1, m2)is exactly the tree (3) of non-degenerate solutions of the Markoff equation.

Proof. We first note that the triple of upper left entries of the root of this tree is the root (5,1,2) of the Markoff tree (3). Now, suppose that a node (x,x₁,x₂) of the tree consists of symmetric matrices in SL₂(Z) satisfying x=x₁Mx₂, and that the corresponding triple (m, m₁, m₂) is a node of the Markoff tree. Using Cayley–Hamilton’s theorem, we find that

x1Mx= (x1M)²x2= (tr(x1M)x1M−det(x1M)I)x2= 3m1x−x2, xMx2=x1(Mx2)²=x1(tr(Mx2)Mx2−det(Mx2)I) = 3m2x−x1.

(7) Sincex,x1andx2are symmetric matrices in SL2(Z) and sinceM∈SL2(Z), we conclude that these products are also symmetric matrices in SL2(Z). Moreover, if we write

x₁Mx=

m⁰₂ k⁰₂ k₂⁰ l⁰₂

and xMx₂=

m⁰₁ k₁⁰ k₁⁰ l⁰₁

,

then we getm⁰₂=3mm₁−m₂ andm⁰₁=3mm₂−m₁, showing that the triples (m⁰₂, m₁, m) and (m⁰₁, m, m₂) associated with the left and right successors of (x,x₁,x₂) are respectively

(6)

the left and right successors of (m, m1, m2) in the Markoff tree. By recurrence, this proves all the assertions of the proposition besides the constraints on the coefficients of the matrices. To prove the latter, suppose that the node (x,x1,x2) satisfies conditions of the form

ϕ(x)>ϕ(xi)>c fori= 1,2, (8) for some constant c>0 and some linear form ϕ on the space of 2×2 matrices. Then, using the fact thatm1 andm2are positive (because (m, m1, m2)∈Σ^∗), the relations (7) lead to

ϕ(x1Mx) = 3m1ϕ(x)−ϕ(x2)>ϕ(x)>ϕ(x1)>c, ϕ(xMx2) = 3m2ϕ(x)−ϕ(x1)>ϕ(x)>ϕ(x2)>c,

showing by induction on the level that (8) holds for each node of the tree (5) as soon as it holds for its root. Since the latter satisfiesm>mi>1,k>ki>1 andl>li>1 fori=1,2, we conclude that each node of the tree meets these conditions and so consists of matrices with positive entries. Moreover, since the root also satisfies m−k>mi−ki>0 and 2k−m>2ki−mi>0 fori=1,2, each node meets these additional conditions and in particular satisfiesk6m62k. Finally, since (m, m1, m2)∈Σ^∗, we havem>max{m1, m2}>1.

Thusm>2 and, from 1=det(x)=ml−k², we deduce thatl=(k²+1)/m6m+1/m<m+1, and thereforel6m.

For each nodem=(m, m1, m2) of (3), we denote by xm=

m k k l

(9) the first component of the corresponding node (6) of the tree (5), and we extend this definition to all of Σ by putting

x(1,1,1)= 1 1

1 2

and x(2,1,1)= 2 1

1 1

. (10)

Then, for eachm∈Σ, we define Fm(U, T) = (T −U)xmM

U T

=mT²+(3m−2k)T U+(l−3k)U², (11) using the notation (9). Since det(x_m)=ml−k²=1, we find that disc(F_m)=9m²−4. As disc(F_m)≡2 mod 3, the formF_m is irreducible overQ. Therefore it factors as a product

F_m(U, T) =m(T−α_mU)(T−α_mU),

(7)

where

αm=2k−3m+√ 9m²−4

2m and αm=2k−3m−√

9m²−4

2m (12)

are conjugate quadratic real numbers.

In his presentation of Markoff’s theory, Cassels also defines quadratic forms indexed by solutionsm of Markoff’s equation, except that, assuming the uniqueness conjecture, he denotes them simplyF_m, wheremis the largest entry ofm, the conjecture being that this entry determines uniquely the solution (see [3, p. 33] or [1, Appendix B]). In view of the discussion in [3, §II.4], the corollary below shows that the above forms F_m are equivalent to the corresponding forms defined by Cassels.

Corollary 2.2. For each m=(m, m₁, m₂)∈Σthe off-diagonal entry kof x_m satisfies

k≡m1

m₂≡ −m2

m₁ modm and 0< k6m. (13) Note that condition (13) makes sense since each triple of Σ has pairwise relatively prime components [3,§II.3, Lemma 5]. It also determinesk uniquely.

Proof. This is readily checked whenmis (1,1,1) or (2,1,1). Now, assume that m is non-degenerate and write the corresponding triple of symmetric matrices (x,x1,x2) in the form (6). Sincex=xm, this notation is consistent with (9). Then, by Proposition 2.1, we have 0<k6m. Asx, x₁ and x₂ are symmetric, taking the transpose of both sides of the equalityx=x₁Mx₂ givesx=x₂^tMx₁, and so we obtainxx⁻¹₂ =x₁M andxx⁻¹₁ = x₂^tM. Comparing the upper right entries in the latter matrix equalities, we find that km₂−mk2=m₁andkm₁−mk1=−m2, from which the requested congruences follow.

Combining Theorems II and III in [3, Chapter II], we then recover the following main results of Markoff [11], [12].

Theorem2.3. (Markoff, 1879/80) The real numbers αm with m∈Σform a system of representatives of the equivalence classes of real numbers ξ with ν(ξ)>¹₃, while the forms Fm with m∈Σconstitute a system of representatives of the equivalence classes of real indefinite binary quadratic forms F with µ(F)/p

disc(F)>¹₃. Moreover, for each m=(m, m1, m2)∈Σ,the numbers αm and αm are equivalent and we have

ν(α_m) =ν(α_m) = µ(F_m)

pdisc(Fm)= 1

√9−4m⁻².

(8)

3. Extremal numbers

LetP denote the set of 2×2 matrices with relatively prime integer coefficients and non- zero determinant. It is a group for the product ∗ given by y1∗y2=c⁻¹y1y2, where c is the greatest positive common divisor of the coefficients ofy1y2. This group contains GL2(Z) as a subgroup, and its quotientP/{±I} is isomorphic to PGL2(Q). With this notation, we state the following characterization of extremal numbers reproduced from [17, Lemma 3.1], which collects results from [14] and [15].

Proposition 3.1. Let ξ be an extremal real number. Then, there exists an unbounded sequence of symmetric matrices {xi}i>1 in P such that,for each i>1,we have kxi+1k kxik^γ, k(ξ,−1)xik kxik⁻¹ and |detx_i| 1, (14) with implied constants that are independent of i. Such a sequence {xi}i>1 is uniquely determined by ξup to its first terms and up to multiplication of each of its terms by ±1.

Moreover, for any such sequence,there exists a non-symmetric and non-skew-symmetric matrix M∈P such that

xi+2=±

xi+1∗M∗xi, if iis odd,

xi+1∗^tM∗xi, if iis even, (15) for any sufficiently large index i. Conversely, if {x_i}_i_>₁ is an unbounded sequence of symmetric matrices in P which satisfies a recurrence relation of the type (15)for some non-symmetric matrix M∈P,and if for each i>1 we have

kxi+2k kxi+1k kxik and |detx_i| 1, (16) then {xi}i>1 also satisfies the estimates (14)for some extremal real number ξ.

In the above statement, the choice of a norm for matrices is secondary, since it only affects the implied constants in all estimates. However, for definiteness, we choose the norm kxk of a matrixx with real coefficients to be the largest absolute value of its coefficients. Then, for an extremal numberξwith a corresponding unbounded sequence of symmetric matrices{xi}i>1 inP satisfying (14), we find that

k(ξ,−1)xik= max{|xi,0ξ−xi,1|,|xi,1ξ−xi,2|} upon writing xi=

xi,0 xi,1

xi,1 xi,2

,

and thereforeξ=lim_i_!_∞x_i,1/x_i,0=lim_i_!_∞x_i,2/x_i,1.

(9)

It can be shown directly from the definition that the set of extremal numbers is stable under the action of GL2(Q) by linear fractional transformations on R\Q[17,§2].

In particular, it is stable under the action of the subgroup GL2(Z). The next corollary shows how the latter action affects the corresponding sequences of symmetric matrices {xi}i>1 and the corresponding matricesM.

Corollary3.2. Letξbe an extremal number,let{xi}i>1be an unbounded sequence of symmetric matrices in P satisfying (14) and let M∈P be such that (15) holds. For any

g= a b

c d

∈SL2(Z),

the number ξ⁰:=g·ξis also extremal with corresponding sequence {x⁰_i}i>1and matrix M⁰ given by

x⁰_i=^t(g⁰)⁻¹x_i(g⁰)⁻¹ and M⁰=g⁰M^tg⁰, where g⁰=

a −b

−c d

. (17)

Proof. It is clear that the above matrices x⁰_i and M⁰ belong to P and satisfy the recurrence relation (15) instead ofxi andM. Moreover, the matricesx⁰_i are symmetric whileM⁰ is both non-symmetric and non-skew-symmetric. We also find thatkx⁰_ikkxik,

k(ξ⁰,−1)x⁰_ik=|cξ+d|⁻¹k(ξ,−1)xi(g⁰)⁻¹k k(ξ,−1)xik,

and det(x⁰_i)=det(xi). Therefore{x⁰_i}i>1andξ⁰also satisfy (14) instead of{xi}i>1andξ.

In particular,{x⁰_i}i>1 satisfies (16) and so, by the last part of Proposition 3.1, it obeys (14) for some extremal number ξ⁰⁰ instead ofξ. This forces ξ⁰=ξ⁰⁰, and therefore ξ⁰ is extremal.

It follows from Proposition 3.1 that the matrixM∈P attached to an extremal num- berξis uniquely determined byξwithin the set{M,−M,^tM ,−^tM}. When the sequence of symmetric matrices attached toξ is contained in SL₂(Z), the matrixM also belongs to SL₂(Z) and the recurrence relation (15) can be put in a simpler form. Then, applying an identity of Fricke like Cohn in [4], we obtain the following result.

Lemma3.3. Let ξbe an extremal number with a corresponding sequence of symmetric matrices {xi}i>1 inSL2(Z). Choose M∈SL2(Z)and the above sequence so that,for each i>1,we have

x_i+2=x_i+1M_i+1x_i, where M_i=

M, if i is even,

tM, if i is odd.

Then, for each i>1,the traces qi:=tr(xiMi)∈Zsatisfy

q²_i+2+q²_i+1+q²_i=q_i+2q_i+1q_i+tr(^tM M⁻¹)+2. (18)

(10)

Proof. In [9] Fricke shows that for anyA, B∈SL2(R) we have

tr(A)²+tr(B)²+tr(AB)²= tr(A) tr(B) tr(AB)+tr(ABA⁻¹B⁻¹)+2.

PuttingA=x_i+1M_i+1andB=x_iM_i, the recurrence relation gives AB=x_i+2M_i=x_i+2M_i+2,

and so tr(AB)=qi+2. Sincexi+2 is symmetric, we also find that AB=^txi+2Mi=xiMixi+1Mi=BAM_i+1⁻¹Mi, and so tr(ABA⁻¹B⁻¹)=tr(M_i+1⁻¹Mi). The conclusion follows since tr(M_i+1⁻¹M_i) = tr(^tM⁻¹_i+2^tM_i+1) = tr(M_i+2⁻¹M_i+1) is independent ofi.

We observed in [13] that the arithmetic of extremal numbers is particularly simple when the corresponding sequence of symmetric matrices{xi}i>1is contained in GL2(Z) and the lower right entry of the corresponding matrix M is 0. When all these matrices belong to SL2(Z), the preceding result applies and we find the following result.

Lemma 3.4. Let ube a non-zero integer and let E_u⁺ denote the set of all extremal numbers with a corresponding sequence of symmetric matrices

x_i=

xi,0 xi,1

xi,1 xi,2

∈SL₂(Z) satisfying, for each i>1,

x_i+2=x_i+1M_i+1x_i, where M_i=

u (−1)ⁱ (−1)ⁱ⁺¹ 0

. (19)

Then,the set E_u⁺=E_−u⁺ is empty if u6=±3. Moreover, if ξ∈E₃⁺, then, upon choosing the matrices xi as above, each triple (xi+2,0, xi+1,0, xi,0) is a solution of Markoff ’s equation (1).

Proof. Let ξ∈E_u⁺. Using the notation of the lemma, a simple computation shows that the matrix M:=M2 satisfies tr(^tM M⁻¹)=−2 and that, for each i>1, we have tr(xiMi)=uxi,0. Therefore, Lemma 3.3 gives

x²_i+2,0+x²_i+1,0+x²_i,0=u xi+2,0xi+1,0xi,0 (20)

for eachi>1. Since−1 is not a square modulo 3 and since 1=det(xi)≡−x²_i,1 modx_i,0, we also note that x_i,0 is prime to 3 for each i>1. Then, looking at the equation (20) modulo 3, we deduce that uis divisible by 3 and so, each triple ¹₃u(x_i+2,0, x_i+1,0, x_i,0) provides a solution of Markoff’s equation in integers not all zero. Since each such solution has relatively prime entries, this is possible only ifu=±3.

(11)

Lemma 3.5. Two elements ξ and ξ⁰ of E₃⁺ are equivalent (under GL2(Z)) if and only if ξ⁰=±ξ+b for some b∈Z. Each element of E₃⁺ is equivalent to one and only one element of E₃⁺ in the open interval ¹₂,1

.

Proof. The second assertion follows from the first one, since, for eachξ∈R\Q, there is a unique integerb and a unique choice of sign such that±ξ+b∈ ¹₂,1

. To prove the first assertion, suppose thatξ∈E₃⁺and let

g= a b

c d

∈GL2(Z).

By Corollary 3.2, we haveg·ξ∈E₃⁺if and only if a −b

−c d

3 1

−1 0

a −c

−b d

=ε1

3 ε2

−ε2 0

for some choices of ε1, ε2∈{1,−1}. Equating coefficients, this translates into the conditions 3a²=3ε₁, 3c²=0 and det(g)±3ac=ε₁ε₂, which mean that ε₁=1, a=±1, c=0, ad=ε₂and impose no restriction on b. For such a,candd, we findg·ξ=ε2(ξ±b).

Azigzag in the tree (4) is a sequence of nodesm⁽¹⁾,m⁽²⁾,m⁽³⁾, ... of that tree such that, for each i>1, the nodem⁽ⁱ⁺¹⁾ is a successor of m⁽ⁱ⁾ on some side (left or right) and m⁽ⁱ⁺²⁾ is a successor of m⁽ⁱ⁺¹⁾ on the other side. A maximal zigzag is a zigzag m⁽¹⁾,m⁽²⁾,m⁽³⁾, ... which cannot be extended by inserting an ancestor of m⁽¹⁾ as the first element. With the convention that the root (2,1,1) has no ancestor in (4), it follows that eachm∈Σ^∗ is the first element of a unique maximal zigzag. Examples of maximal zigzags in (4) are

(2,1,1) (5,1,2)

(29,5,2) (433,5,29)

...

and

(5,1,2) (13,1,5)

(194,13,5) ...

Recall that, in §2, we attached a symmetric matrixx_m∈SL₂(Z) to eachm∈Σ. Thus, each maximal zigzag in (4) leads to a sequence of symmetric matrices in SL₂(Z). We can now state and prove the main result of this section.

(12)

Theorem 3.6. Given m∈Σ^∗,consider the maximal zigzag m=m⁽¹⁾,m⁽²⁾,m⁽³⁾, ...

in the tree (4)originating from m. Then{x_m(i)}i>1 is a sequence of symmetric matrices in SL2(Z)corresponding to an extremal number ξm in E₃⁺∩ ¹₂,1

,and we have ξ_m= lim

i!∞α_m(i)= lim

i!∞(α_m(i)+3) (21)

in terms of the quadratic numbers given by (12). Each element of E₃⁺ is equivalent to ξm for one and only one m∈Σ^∗.

Proof. Let

M=

3 1

−1 0

be as in Proposition 2.1 and let {m⁽ⁱ⁾}_i_>₁ be a maximal zigzag in (4) originating from a point m=m⁽¹⁾ in Σ^∗. For simplicity, we write x_i to denote the matrix x_m(i). If, for some indexi, the pointm⁽ⁱ⁺¹⁾is the left successor ofm⁽ⁱ⁾, then the node of the tree (5) corresponding tom⁽ⁱ⁺¹⁾takes the form (xi+1,∗,xi) and, asm⁽ⁱ⁺²⁾is the right successor ofm⁽ⁱ⁺¹⁾, we find thatxi+2=xi+1Mxi. Similarly, ifm⁽ⁱ⁺¹⁾is the right successor ofm⁽ⁱ⁾, then the node of (5) corresponding tom⁽ⁱ⁺¹⁾ takes the form (xi+1,xi,∗) and m⁽ⁱ⁺²⁾ is the left successor ofm⁽ⁱ⁺¹⁾. Thusxi+2=xiMxi+1=xi+1tMxi. As the parity ofidecides which alternative holds, we deduce that condition (15) of Proposition 3.1 is satisfied for each i>1 with the present choice of M or with M replaced by its transpose ^tM. The above considerations also show that, for each i>1, the node of (5) corresponding to m⁽ⁱ⁺²⁾ is either (xi+2,xi+1,xi) or (xi+2,xi,xi+1) and som⁽ⁱ⁺²⁾can be described as the node of the Markoff tree (4) formed by the upper left entries ofxi+2, xi+1 andxi.

To verify conditions (16) of Proposition 3.1, we write x_i=

m_i k_i k_i l_i

.

With this notation, Proposition 2.1 giveskxik=mi andki6mi62kifor eachi>1. Thus, ifxi+2=xi+1Mxi, we find that

mi+2= (3mi+1−ki+1)mi+mi+1ki>⁵2mi+1mi.

Otherwise, we havex_i+2=x_iMx_i+1and the same computation applies with the indicesi andi+1 permuted. This means thatkx_i+2k>⁵₂kx_i+1k kx_ikfor eachi>1. As det(x_i)=1 for each i, conditions (16) of Proposition 3.1 are fulfilled and therefore {x_i}_i_>₁ satisfies the conditions (14) of the same proposition for some extremal number ξ=ξ_m. We have ξ_m∈E₃⁺ by definition, and moreover ξ_m=lim_i_!_∞k_i/m_i∈1

2,1

. Then (21) follow

(13)

from formulas (12) and, asξm is irrational, we conclude that ξm∈E₃⁺∩ ¹₂,1

. The first assertion of the theorem is proved.

Now assume that ξm=ξn for some n∈Σ^∗, and let {n⁽ⁱ⁾}i>1 denote the maximal zigzag starting with n⁽¹⁾=n. Then, {x_m(i)}i>1 and {x_n(i)}i>1 are two sequences of symmetric matrices with positive entries corresponding to the same extremal number.

By Proposition 3.1, this is possible if and only if there exists an integer s such that x_m(i)=x_n(i+s) for each sufficiently largei. However, we observed that, for eachi>1, the triplem⁽ⁱ⁺²⁾ is the node of (4) formed by the upper left entries ofx_m(i+2), x_m(i+1) and x_m(i). Similarly, n⁽ⁱ⁺²⁾ is formed by the upper left entries of x_n(i+2), x_n(i+1) and x_n(i). This forces m⁽ⁱ⁾=n^(i+s) for each sufficiently largei, and therefore m=n because each zigzag in (4) is contained in a unique maximal zigzag.

Lemma 3.5 together with the preceding observation reduce the last assertion of the theorem to proving that each element ofE₃⁺∩ ¹₂,1

is equal to ξm for somem∈Σ^∗. To this end, we fix a pointξ∈E₃⁺∩ ¹₂,1

and a corresponding sequence{xi}i>1of symmetric matrices in SL2(Z) obeying the recurrence relation (19) of Lemma 3.4 with u=3. Using the notation of that lemma for the entries ofxi, we have ξ=limi!∞xi,1/xi,0. Since ξ belongs to ¹₂,1

, the ratioxi,1/xi,0must also belong to that interval for each sufficiently large integer i. Without loss of generality, we may assume that this already holds for each i>1. Upon multiplying x1 and x2 by ±1 and adjusting the following xi so that (19) continues to hold, we may also assume that x1,0 and x2,0 are positive. Then a simple recurrence argument based on (19) shows that xi,0>max{x_i−1,0, x_i−2,0}>0 for each i>3. By Lemma 3.4, this means that, for each i>3, exactly one of the points (xi,0, x_i−1,0, x_i−2,0) and (xi,0, x_i−2,0, x_i−1,0) is a nodem⁽ⁱ⁾ of the tree (4). In particular, the integersxi,0,x_i−1,0andx_i−2,0 are pairwise relatively prime.

We claim thatxi=x_m(i) for eachi>3. Since the symmetric matricesxi and x_m(i)

have the same upper left entries and the same determinant, this reduces to showing that the off-diagonal entryk of x_m(i) is x_i,1. In the notation of Lemma 3.4 (withu=3), we havex_ix⁻¹_i−2=x_i−1M_i−1which, by comparing the upper right entries of the matrices on both sides (as in the proof of Corollary 2.2), givesx_i,1x_i−2,0−xi,0x_i−2,1=(−1)ⁱ⁻¹x_i−1,0, and therefore

xi,1≡(−1)ⁱ⁻¹x_i−1,0

x_i−2,0 modxi,0. (22)

By comparison with the conditions that Corollary 2.2 imposes onk, this leads tok≡±x_i,1 modx_i,0. As Proposition 2.1 gives ¹₂x_i,06k6x_i,0 and as we know that ¹₂x_i,0<x_i,1<x_i,0, we conclude thatk=x_i,1, and the claim is proved.

(14)

Comparing the congruence (22) with those of (13) shows moreover that, fori>3, m⁽ⁱ⁾=

(xi,0, x_i−1,0, x_i−2,0), ifiis odd, (xi,0, x_i−2,0, x_i−1,0), ifiis even.

Sincem⁽ⁱ⁺¹⁾has two coordinates in common withm⁽ⁱ⁾and a larger first coordinate, this implies that, in the Markoff tree (4),m⁽ⁱ⁺¹⁾is the left successor ofm⁽ⁱ⁾ ifiis odd, and its right successor if i is even (see [3,§II.3]). Thus, the sequence {m⁽ⁱ⁾}i>3 is a zigzag in (4) and{x_m(i)}_i_>₃is a sequence of symmetric matrices associated with the extremal number ξ. We conclude thatξ=ξ_m, where mis the first element of the maximal zigzag containing{m⁽ⁱ⁾}_i_>₃.

The main goal of this paper is to show that the set {ξm:m∈Σ^∗} constitutes a system of representatives of the equivalence classes of extremal numbersξwithν(ξ)=¹₃. By Lemma 3.5, we know that they belong to distinct equivalence classes. The next step is to show thatν(ξm)=¹₃ for eachm∈Σ^∗. This will be achieved in§5.

4. Conjugates of an extremal number

This section deals with approximation to extremal numbers by quadratic real numbers, and introduces the notion of conjugates of an extremal number, a concept which will play an important role in the sequel. With respect to notation, we define thenormkFk of a polynomialF overRto be the largest absolute value of its coefficients, and we define theheight H(α) of an algebraic number αto be the norm of its minimal polynomial in Z[T].

Throughout the section, we fix an arbitrary extremal number ξ, a corresponding unbounded sequence of symmetric matrices{xi}i>1inP satisfying the condition (14) of Proposition 3.1, and a matrixM∈P which is assumed to satisfy (15) for eachi>1 (this condition on the range oficarries no loss of generality). For eachi>1, we write

J=

0 1

−1 0

, M= a b

c d

, xi=

x_i,0 x_i,1 x_i,1 x_i,2

and Xi=kxik.

We also define new matrices

Wi=xi∗Mi, where Mi=

M, ifiis even,

tM, ifiis odd, and real quadratic forms

F_i(U, T) =−(U T)J W_i U

T

and G_i(U, T) =−(U T)J 1 ξ

ξ ξ²

M_i U

T

.

(15)

It is clear from the above definition thatGidepends only on the parity ofi. A short computation gives the following formulas.

Lemma4.1. For each integer i>1,we have Gi(U, T) =

G⁰(U, T) := (c+dξ)(T−ξU)(T−ξ⁰U), if iis odd,

G⁰⁰(U, T) := (b+dξ)(T−ξU)(T−ξ⁰⁰U), if iis even, (23) where

ξ⁰=−a+bξ

c+dξ and ξ⁰⁰=−a+cξ

b+dξ. (24)

The sets {ξ⁰, ξ⁰⁰} and {±G⁰,±G⁰⁰} depend only on ξ. Moreover, ξ, ξ⁰ and ξ⁰⁰ are three distinct extremal numbers.

Proof. The second assertion of the lemma follows from the facts thatM is uniquely determined by ξ within the set {±M,±^tM} (see §3), and that replacing M by ±M or by±^tM just permutes the elements of{ξ⁰, ξ⁰⁰} and {±G⁰,±G⁰⁰}. The real numbers ξ⁰ and ξ⁰⁰ are extremal because they belong to the GL2(Q)-orbit of ξ (see [17, §2]).

Finally, the numbersξ,ξ⁰ andξ⁰⁰are distinct because ξis not quadratic overQand, by Proposition 3.1,M is neither symmetric nor skew-symmetric.

Definition 4.2. The extremal numbersξ⁰ andξ⁰⁰given by (24) are called theconjugates ofξ, while the polynomialsG⁰ andG⁰⁰ given by (23) are called thereal quadratic forms associated withξ.

For example, the extremal numbersξmconstructed by Theorem 3.6 have associated matrix

M=

3 1

−1 0

,

and so a short computation gives the following result.

Lemma 4.3. For each m∈Σ^∗, the conjugates of ξm are ξm−3 and ξm+3 and its associated quadratic forms are, up to sign,

Gm(U, T) := (T−ξmU)(T−(ξm−3)U) and Gm(U, T−3U).

In the computations below, we use the fact that, for any A, B∈P, the integer c determined byA∗B=c⁻¹ABis a common divisor of det(A) and det(B). We also use the estimateXi+1X_i^γ coming from (14). The next lemma relates the formsFi andGi.

Lemma4.4. For each i>1,there exists a non-zero rational number ri with |ri|Xi

such that Fi=riGi+O(X_i⁻¹).

(16)

Proof. AsWi=xi∗Mi, we haveWi=c⁻¹_i xiMi for some divisorci of det(M). Thus, forilarge enough, the rational numberri=xi,0/ci is non-zero and satisfies

|ri| |xi,0| X_i

as well as

kF_i−r_iG_ik

x_i−x_i,0 1 ξ

ξ ξ²

k(ξ,−1)x_ik X_i⁻¹.

The next result provides an alternative formula for the formsFi, showing that they are essentially homogenous versions of the quadratic polynomials of [14,§8].

Lemma 4.5. For each i>1,we have

F_i(U, T) = 1 di

U² U T T² x_i+1,0 x_i+1,1 x_i+1,2 xi+2,0 xi+2,1 xi+2,2

, (25)

where di is a divisor of det(xi+1). Moreover, the content of Fi as a polynomial in Z[U, T]is bounded above independently of i.

Proof. Due to the formulas of [14,§2], the determinant in the right-hand side of (25) can be rewritten as

tr

U² U T U T T²

Jxi+2Jxi+1J

.

Asx_i+2=W_i∗x_i+1=i⁻¹W_ix_i+1for some divisoriof det(x_i+1) and sincex_i+1Jx_i+1J=

−det(xi+1)I, this expression becomes

−det(xi+1) i

tr

U² U T U T T²

J Wi

=det(xi+1) i

Fi(U, T).

This proves the first assertion. Identifying any symmetric matrix m k

k l

with the triple (m, k, l), formula (25) implies that the content ofFi divides det(xi,xi+1,xi+2).

The second assertion follows as, by [14, Theorem 5.1], the absolute value of this determinant is bounded above independently ofi.

(17)

Combining the above lemma with the results of [14, §8], we obtain the following result.

Proposition 4.6. There exists an integer i0>1such that, for each i>i0,the polynomial Fi(U, T)is irreducible over Q and the root αi of Fi(1, T) which is closest to ξ is algebraic over Qof degree 2 with

H(α_i) kFik X_i and |ξ−αi| H(α_i)^−2γ−2.

Moreover, for every algebraic number α∈C of degree 62 over Q with α6=αi for each i>i0, we have |ξ−α|H(α)⁻⁴.

Proof. According to [14, Theorem 8.2], the polynomialQi+1(T):=diFi(1, T) is irreducible overQfor each sufficiently largei. So, for thosei, the quadratic formFi(U, T) is irreducible overQandαiis algebraic overQof degree 2. Since by Lemma 4.5 the integer diand the content of Fi are bounded, we deduce thatH(αi)kFikkQi+1k. According to [14, Proposition 8.1], we also havekQi+1kXi. The remaining estimates follow from [14, Theorem 8.2].

Definition 4.7. In view of the above proposition, the sequence{α_i}_i_>_i₀ is uniquely determined by the extremal numberξup to its first terms. We refer to it as a sequence ofbest quadratic approximations toξ.

The next lemma provides such sequences for the extremal numbersξ_m defined in Theorem 3.6, in terms of the quadratic numbersα_m given by (12).

Lemma 4.8. Let m∈Σ^∗ and let {m⁽ⁱ⁾}i>1 denote the maximal zigzag in the tree (4) starting with m⁽¹⁾=m. Put r=1 if m⁽²⁾ is the right successor of m⁽¹⁾ and r=0 otherwise. Then a sequence {αi}i>1 of best quadratic approximations to ξm is given by

αi=

α_m(i), if i≡rmod 2,

α_m(i)+3, if i6≡rmod 2. (26)

Proof. Define xi=x_m(i) for each i>1 so that {xi}i>1 is a sequence of symmetric matrices in SL2(Z) corresponding toξm(see Theorem 3.6). By virtue of the choice of r, the triplem⁽ⁱ⁺¹⁾ is a right successor ofm⁽ⁱ⁾ in (4) if and only ifi≡rmod 2. From this we deduce that

x_i+2=x_i+1

3 (−1)^i−r+1

(−1)^i−r 0

x_i

for each i>1 (same argument as in the first paragraph of the proof of Theorem 3.6).

Thus, in view of Proposition 4.6, it remains simply to show that, for each sufficiently

(18)

largei, the real number defined by (26) is the root of the polynomial

−(1 T)Jxi

3 (−1)^i−r

(−1)^i−r−1 0

1 T

,

which is closest toξ_m. Ifi≡rmod 2, this polynomial is simplyF_m(i)(1, T) (with the notation of (11)). Ifi6≡rmod 2, a short computation shows that it is equal to−F_m(i)(1, T−3).

The conclusion follows since the roots ofF_m(i)(1, T) areα_m(i)andα_m(i) which, according to (21), converge respectively toξ_m andξ_m−3, asi!∞.

The next result justifies the terminology of Definition 4.2.

Proposition 4.9. Let {αi}i>i₀ be as in Proposition 4.6. Then,as i!∞, we have

|ξ⁰−α_2i−1| H(α_2i−1)⁻² and |ξ⁰⁰−α2i| H(α2i)⁻². (27) Therefore, the sequence of conjugates of a sequence of best quadratic approximations to ξ admits exactly two accumulation points,namely the conjugates ξ⁰ and ξ⁰⁰ of ξ.

Proof. We simply prove (27), since the second assertion follows from it. For each i>i0, letpi:=Fi(0,1) denote the coefficient ofT²inFi(U, T). Ifi>i0is odd, Lemma 4.4 gives

(T−αiU)(T−αiU) =p⁻¹_i Fi(U, T) = (T−ξU)(T−ξ⁰U)+O(X_i⁻²),

and thereforeαi+αi=ξ+ξ⁰+O(X_i⁻²), by comparing the coefficients ofU T. Since Propo- sition 4.6 gives kFikXi and|αi−ξ|X_i^−2γ−2, we deduce that

|pi| X_i and |α_i−ξ⁰| X_i⁻².

To bound|αi−ξ⁰|from below, we first note that, sinceξ⁰6=ξ, the above estimates imply

X_i²|α_i−ξ⁰|+O(X_i+1⁻²).

Ifiis large enough this resultant is a non-zero integer. Its absolute value is then bounded below by 1, and the above estimate leads to |αi−ξ⁰|X_i⁻². Thus

|αi−ξ⁰| X_i⁻²H(αi)⁻².

The proof forieven is similar: it suffices to replace everywhereξ⁰ byξ⁰⁰.

(19)

Corollary4.10. For each A∈GL2(Q),the conjugates of A·ξare A·ξ⁰ and A·ξ⁰⁰. Proof. Fix A∈GL2(Q) and a sequence {αi}i>1 of best quadratic approximations toξ. Since

|A·ξ−A·α_i| |ξ−α_i| H(α_i)^−2γ−2H(A·α_i)^−2γ−2,

we deduce that{A·αi}i>1is a sequence of best quadratic approximations to the extremal number A·ξ. Thus the conjugates of A·ξ are the accumulation points of the sequence {A·αi}i>1, namelyA·ξ⁰ andA·ξ⁰⁰.

Based on this proposition, a simple computation gives the following result.

Corollary 4.11. Let

N=

b a

−d −c

.

Then we have ξ⁰=N·ξand ξ⁰⁰=N⁻¹·ξ. Moreover,for each i∈Z,the conjugates of Nⁱ·ξ are Nⁱ⁻¹·ξ and Nⁱ⁺¹·ξ.

In particular, this shows that ξ is one of the two conjugates of ξ⁰ and also one of the two conjugates of ξ⁰⁰. Although we will not need the next result in the sequel, we decided to include it, as it provides an attractive complement to Proposition 4.6.

Theorem 4.12. Let {αi}i>i₀ be as in Proposition 4.6. For each i>i0,define α⁰_i=

αi, if iis odd, N·αi, if iis even, where N is the integral matrix of Corollary 4.11,then

|ξ−α⁰_i| |ξ⁰−α⁰_i| H(α⁰_i)^−2γ−4. (28) For each quadratic or rational number α∈C not belonging to the sequence {α⁰_i}i>i₀, we have instead

|ξ−α| |ξ⁰−α| H(α)⁻⁶, (29) where α denotes the conjugate of αover Q.

Proof. If i is odd, the estimate (28) follows from Propositions 4.6 and 4.9, since α⁰_i=αi andα⁰_i=αi. Ifiis even, we find that

|ξ−α⁰_i| |ξ⁰−α⁰_i|=|ξ−N·αi| |ξ⁰−N·αi| |N⁻¹·ξ−αi| |N⁻¹·ξ⁰−αi|=|ξ⁰⁰−αi| |ξ−αi|, and (28) again follows from Propositions 4.6 and 4.9, becauseH(α_i)H(α⁰_i).

(20)

To prove the second part of the theorem, we first note that, if α=αi for some even integer i, then Proposition 4.6 provides |ξ−α|H(α)^−2γ−2, while the estimates of Proposition 4.9 lead to |ξ⁰−α|1 since ξ⁰6=ξ⁰⁰. Similarly, if α=N·αi for some odd integer i, we find that

|ξ⁰−α| =|N·ξ−N·αi| |ξ−αi| H(α)^−2γ−2 and |ξ−α| |N⁻¹·ξ−αi|=|ξ⁰⁰−αi| 1.

In both cases, this leads to

|ξ−α| |ξ⁰−α| H(α)^−2γ−2H(α)⁻⁶.

If α=αi for an integer i>i0, then we find instead|ξ−α||ξ⁰−α|1 and so (29) holds again. This estimate also holds if α∈Q, because in that case we have|ξ−α|H(α)⁻³ and|ξ⁰−α|H(α)⁻³by [14, Theorem 1.3]. We may therefore assume thatαis irrational and different fromαi,αi andN·αi for eachi>i0. In this case, Proposition 4.6 gives

|ξ−α| H(α)⁻⁴ and |ξ⁰−α| |ξ−N ⁻¹·α| H(α)⁻⁴. (30) Letpdenote the positive integer for which the polynomial

F(U, T) :=p(T−αU)(T−αU)

has relatively prime integer coefficients. Then, F is an irreducible polynomial of Z[T] and, for eachi>i0, we have

16|Res(F, Fi)|=p²p²_i|α−αi| |α−αi| |α−α i| |α− αi|, wherepi=Fi(0,1). Since

p|pi| |α−αi| |α−α i|6p|pi|(2 max{1,|α|}max{1,|αi|})(2 max{1,|α|} max{1,|αi|})

= 4(pmax{1,|α|}max{1,|α|})(|p i|max{1,|αi|}max{1,|α_i|}) H(α)H(α_i),

we deduce that

1H(α)²H(α_i)²|α−αi| |α− α_i|.

Ifiis odd, Propositions 4.6 and 4.9 also give

H(α_i)X_i, |α−α_i|6|ξ−α|+O(X_i^−2γ−2) and |α− α_i|6|ξ⁰−α|+O(X _i⁻²).

Combining these estimates, we deduce the existence of a constantc>0 such that c6H(α)²X_i²(|ξ−α|+X_i^−2γ−2)(|ξ⁰−α|+X _i⁻²)

(21)

for each odd integer i. If |ξ−α|>¹4cH(α)⁻² or |ξ⁰−α| >¹4cH(α)⁻², then the required estimate (29) follows from (30) and we are done. Otherwise, we obtain

1

2c6H(α)²X_i^−2γ−2+H(α)²X_i²|ξ−α| |ξ⁰−α|.

Chooseito be the smallest positive odd integer such thatH(α)²X_i^−2γ−26¹4c. Then we haveXiH(α)^1/γ and we obtain

1

4c6H(α)²X_i²|ξ−α| |ξ⁰−α| H(α)^2γ|ξ−α| |ξ⁰−α|, which is stronger than (29).

Remark. A similar argument shows that Theorem 4.12 holds withξ⁰ replaced byξ⁰⁰ andα⁰_i replaced byα⁰⁰_i, where

α⁰⁰_i =

αi, ifiis even, N⁻¹·α_i, ifiis odd.

5. Minima of the associated real quadratic forms

We keep the notation of the preceding section. In particular we deal with a fixed arbitrary extremal numberξwith conjugatesξ⁰andξ⁰⁰and associated quadratic formsG⁰ andG⁰⁰. The main result of this section is that

ν(ξ) = µ(G⁰)

pdisc(G⁰)= µ(G⁰⁰) pdisc(G⁰⁰).

We will deduce from this that the extremal numbersξm,m∈Σ^∗, constructed by Theo- rem 3.6 have Lagrange constantν(ξm)=¹₃. The proof goes through a series of lemmas.

Lemma 5.1. Let d denote the least common multiple of all integers det(Wi) with i>1. Suppose that Wi≡Wj mod 4dfor some indices i, j>1. Then, WjW_i⁻¹∈SL2(Z).

Proof. This follows from the formula

WjW_i⁻¹= det(Wi)⁻¹WjAdj(Wi), where Adj(Wi) denotes the adjoint ofWi. Since

WjAdj(Wi)≡WiAdj(Wi)≡det(Wi)I mod 4d,

and since det(Wi) divides d, the matrix WjW_i⁻¹ has integer coefficients. Moreover, as det(Wi) and det(Wj) dividedand are congruent modulo 4d, they must be equal, and so det(WjW_i⁻¹)=1.