Diophantine approximation on algebraic varieties

Michael Nakamaye, Diophantine approximation on algebraic varieties, Journal de Théorie des Nombres de Bordeaux, tome 11, no 2 (1999), p. 439-502

<http://www.numdam.org/item?id=JTNB_1999__11_2_439_0>

Access to the archives of the journal « Journal de Théorie des Nombres de Bordeaux » (http://jtnb.cedram.org/) implies agreement with the general terms of use (http://www.numdam.org/conditions). Any commercial use or systematic printing constitutes a criminal offence. Any copy or printout of this file must contain this copyright notice.

Article digitized as part of the programme Numérisation de documents anciens mathématiques


by Michael Nakamaye

RÉSUMÉ. We give an overview of recent progress in the theory of diophantine approximation. Taking Roth’s theorem as the starting point, we first consider the Mordell conjecture and then analogous results in higher dimension, due to Faltings-Wüstholz and to Faltings.

ABSTRACT. We present an overview of recent advances in diophantine approximation. Beginning with Roth’s theorem, we discuss the Mordell conjecture and then pass on to recent higher dimensional results due to Faltings-Wüstholz and to Faltings respectively.

0. INTRODUCTION

The theory of diophantine approximation often begins with a finite collection of polynomials $f_1(X), \ldots, f_n(X) \in \mathbb{Q}[X_1, \ldots, X_m]$ with rational coefficients. There are then two distinct types of questions commonly asked. First, one can look for rational points $(p_1/q_1, \ldots, p_m/q_m)$ which lie on the common set of zeroes of the polynomials, i.e. $f_i(p_1/q_1, \ldots, p_m/q_m) = 0$ for all $i$. This line of inquiry leads to the Mordell conjecture (which deals with the case of a single polynomial in two variables) and higher dimensional generalizations due to Faltings and Vojta. Alternatively, one can look for rational points $(p_1/q_1, \ldots, p_m/q_m)$ which are ‘approximate solutions’ or in other words lie very close to the set of common zeroes of the $f_i$’s. This question is perhaps older than the first and is aptly called diophantine approximation because one is looking not for solutions to diophantine equations but for approximate solutions. In this direction, the most striking results are Roth’s theorem, dealing with the case of a single irreducible polynomial in one variable, and the Schmidt subspace theorem which allows for several variables but deals with very specific (linear) polynomials.

One of the simplest results in diophantine approximation is Liouville’s theorem. For this, we consider a single irreducible polynomial $f(X) \in \mathbb{Q}[X]$ of degree $d \geq 2$ with a root $\alpha \in \mathbb{R}$. Since $f$ is irreducible, $\alpha \notin \mathbb{Q}$ and it makes sense to approximate this zero of $f(X)$ by rational numbers. Liouville’s theorem states that for any rational $p/q$ one always has an inequality of the form

$$\left| \alpha - \frac{p}{q} \right| \geq \frac{c(\alpha)}{q^{d}} \qquad (0.1)$$

where $c(\alpha)$ is an explicitly determined constant depending only on $\alpha$. It turns out, however, that the exponent $d$ in the denominator is not best possible except for quadratic $f(X)$. In fact, Roth proved that 0.1 can be improved by replacing $d$ with $2 + \epsilon$ for any $\epsilon > 0$. The disadvantage to Roth’s theorem is that the constant $c(\alpha)$ is no longer explicitly determined.

Roth’s theorem was generalized fifteen years later by Schmidt who deals with the case of $n$ independent linear forms $L_i(X) \in k[X_1, \ldots, X_n]$ where $\mathbb{Q}$ has now been replaced by a finite extension $k/\mathbb{Q}$. The Schmidt subspace theorem (see §5 for an explicit statement) states that the set of rational points $(p_1/q_1, \ldots, p_n/q_n)$ which are close to the $n$ linear subspaces $L_i(X) = 0$ is degenerate; here degenerate means contained in a finite union of proper linear subspaces. In the one variable case, choosing $L(X) = X - \alpha$, one recovers Roth’s theorem since a degenerate linear subspace of $\mathbb{Q}$ is just a point. Schmidt’s subspace theorem has recently been extended by Faltings and Wüstholz to deal with the case when the $L_i$ are no longer linear.

Returning to the first question posed in diophantine approximation, we take the subvariety $X \subset \mathbb{P}^n$ defined over $\mathbb{Q}$ as the common zero locus of our collection of polynomials. As noted above, in this case one is not ‘approximating’ a geometric object with rational points but rather looking for actual rational solutions to the system of equations defining $X$. The connection between the problem of finding actual solutions to polynomial equations and that of finding approximate solutions was developed by Vojta in his proof of the Mordell conjecture [V2]. The Mordell conjecture states that if $X$ is a smooth projective curve of genus at least 2, defined over a number field $k$, then $X$ has only finitely many rational points. Like Roth’s theorem, the result is ineffective in that it does not bound the size of the solutions. Building upon Vojta’s ideas, Faltings [F1, F2] was able to obtain similar results for higher dimensional varieties under certain restrictions.

The goal of these notes is to give an introduction to the ideas and arguments occurring in diophantine approximation, beginning with the simplest result, Liouville’s theorem, and culminating with the difficult results of Faltings and Vojta on the one hand and Faltings-Wüstholz on the other. The rough structure of these notes is as follows: the first section begins with Liouville’s theorem and proceeds to analyse the difficulties encountered when trying to sharpen Liouville’s result to obtain Roth’s theorem. The second section then deals with these difficulties, concentrating on the geometry behind Roth’s theorem, specifically the product theorem and Dyson’s lemma. The third lecture moves from Roth’s theorem to Vojta’s proof of the Mordell conjecture, developing the necessary theory of height functions and the corresponding metrics on line bundles along the way. We will once more emphasize the geometric aspects of the proof and give a rough idea of the difficult arithmetic machinery involved. Lectures four and five will be devoted to higher dimensional problems, one to higher dimensional Mordell type theorems and one to the Schmidt subspace theorem respectively. The proofs presented here will not, in general, be complete but we hope to emphasize all of the important points so that the interested reader can then consult the literature for rigorous proofs, and hopefully the account here will facilitate this process.

There are several expositions of the material covered in these lectures. For Roth’s theorem and the Schmidt subspace theorem one can consult Schmidt’s two collections of lecture notes [S1, S2]. For Dyson’s lemma, in addition to Dyson’s original paper [D], one can consult [B1, EV, N1, N3]. Vojta has two proofs of the Mordell conjecture, [V2, V3], the second of which is closer to the exposition given here. He also has a beautiful paper [V4] giving a full proof of the higher dimensional results of Faltings [F1, F2]. For surveys of the material behind Faltings’ work, one can consult [H] for a beautiful overview or [EE] for a more thorough treatment.

These notes form the basis of a series of five lectures delivered at the Isaac Newton Institute from March 23 through March 27, 1998. It is a pleasure to thank the organizers of the workshop, Christophe Soulé, Jean-Louis Colliot-Thélène, and Jan Nekovář, for inviting me to present this material. I would also like to thank all of those who attended the lectures and whose comments have helped to remove many errors from these notes as well as to add better and more complete explanations. I also thank those who attended courses I gave on part of this material at Harvard in the spring of 1995 and at Bayreuth in the winter of 1996: it is only after trying to teach the material several times that I have reached a full appreciation of the subtleties involved. The spirit of these notes is hopefully the same as that of the lectures, namely informal but aiming to give a reasonably complete picture of the difficult techniques involved in these questions of arithmetic geometry.

1. FROM LIOUVILLE TO ROTH

We begin by recalling the quick and elegant proof of Liouville’s theorem.

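Theorem 1.1 (Liouville). Suppose $\alpha \in \mathbb{R}$ is an algebraic irrational number of degree $d$. Then there is an explicit constant $c(\alpha) > 0$, depending only on $\alpha$, such that for every rational number $p/q$

$$\left| \alpha - \frac{p}{q} \right| \geq \frac{c(\alpha)}{q^{d}}.$$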

The proof can be formally divided into three stages which will reappear in all of our arguments but which are particularly transparent in this case. We begin with the irreducible polynomial $f(X) \in \mathbb{Q}[X]$ for $\alpha$ over $\mathbb{Q}$. To specify $f$ uniquely one can take $f \in \mathbb{Z}[X]$ with relatively prime coefficients. The outline of the argument is as follows:

Step 1: $f(p/q) \neq 0$ for any $p/q \in \mathbb{Q}$ since otherwise $f$ would not be irreducible over $\mathbb{Q}$.

Step 2: $|f(p/q)| \geq 1/q^{d}$ since $q^{d}$ is a common denominator for the terms in $f(p/q)$.

Step 3: $|f(p/q)| \leq b(\alpha)\,|p/q - \alpha|$ for an explicit constant $b(\alpha)$.

Liouville’s theorem follows, with $c(\alpha) = 1/(2b(\alpha))$, by comparing the bounds in Steps 2 and 3. Only the upper bound in Step 3 requires further comment.

Suppose we take the Taylor series expansion of $f(X)$ about $\alpha$. Since $f(\alpha) = 0$ the first term is zero:

$$f(X) = \sum_{i=1}^{d} a_i (X - \alpha)^{i}.$$

Thus

$$|f(p/q)| \leq \sum_{i=1}^{d} |a_i|\,|p/q - \alpha|^{i}.$$

This establishes Step 3, provided $|p/q - \alpha| \leq 1$, with $b(\alpha) = \sum_{i=1}^{d} |a_i|$, and proves Liouville’s theorem (taking $c(\alpha) = 1/(2b(\alpha))$). As for effectivity, the numbers $a_i$ depend only on $\alpha$ as they are the coefficients of the Taylor series expansion of $f(X)$ about $\alpha$. Consequently, $b(\alpha)$ and $c(\alpha)$ depend only on $\alpha$.
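As an illustration, the three steps can be carried out by hand for $\alpha = \sqrt{2}$. The choice of $\alpha$ and the resulting numerical constant are ours, not the author's; the sketch uses the constant $1/b(\alpha)$ that Steps 2 and 3 give directly, and any smaller constant would do equally well.

```latex
% Worked instance of Steps 1-3 for \alpha = \sqrt{2}, f(X) = X^2 - 2, d = 2.
\begin{align*}
\text{Step 1:}\quad & f(p/q) = \frac{p^2 - 2q^2}{q^2} \neq 0
   \quad\text{since } \sqrt{2} \notin \mathbb{Q}. \\
\text{Step 2:}\quad & |f(p/q)| = \frac{|p^2 - 2q^2|}{q^2} \geq \frac{1}{q^2}
   \quad\text{since } p^2 - 2q^2 \text{ is a non-zero integer.} \\
\text{Step 3:}\quad & f(X) = 2\sqrt{2}\,(X - \sqrt{2}) + (X - \sqrt{2})^2,
   \quad a_1 = 2\sqrt{2},\ a_2 = 1,\ b(\sqrt{2}) = 2\sqrt{2} + 1, \\
 & |f(p/q)| \leq (2\sqrt{2} + 1)\,\bigl| p/q - \sqrt{2} \bigr|
   \quad\text{whenever } \bigl| p/q - \sqrt{2} \bigr| \leq 1. \\
\text{Hence:}\quad & \left| \frac{p}{q} - \sqrt{2} \right|
   \geq \frac{1}{(2\sqrt{2} + 1)\, q^2}
   \quad\text{for all } p/q \in \mathbb{Q},\ q \geq 1.
\end{align*}
```

When $|p/q - \sqrt{2}| > 1$ the final inequality holds trivially because its right hand side is less than 1. For $p/q = 17/12$ the left side is about $0.00245$ and the right side about $0.00181$.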

Suppose now that one tries to improve the exponent $d$ in Liouville’s theorem. One idea would be to run through the same argument with a different auxiliary polynomial $f(X)$. Of course this risks losing effectivity because $f(X)$ was determined uniquely by $\alpha$. Nonetheless, one could try. A review of the argument reveals that the exponent $d$ in the theorem comes by taking the quotient of $\deg f(X)$ by $\mathrm{mult}_\alpha(f(X))$. Thus if we replace $f(X)$ with a different polynomial $g(X)$ which also vanishes at $\alpha$ we will get the exponent $\deg g(X)/\mathrm{mult}_\alpha(g(X))$ in place of $d$. But for Step 2 to work, we need to have $g(X) \in \mathbb{Q}[X]$ and thus $g(X)$ will be divisible by $f(X)$. Thus we always will have $\deg g(X)/\mathrm{mult}_\alpha(g(X)) \geq d$ and there is no improvement.
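The claim that no one-variable polynomial beats the ratio $d$ can be spelled out in a few lines. The following is a short sketch of the standard argument, written in our own notation ($k$ for the multiplicity of $g$ at $\alpha$), not a quotation from the text.

```latex
% Why deg(g)/mult_alpha(g) >= d for every g in Q[X] vanishing at alpha.
\begin{align*}
& g \in \mathbb{Q}[X],\ \operatorname{mult}_\alpha(g) = k \geq 1
  \;\Longrightarrow\; f \mid g \text{ in } \mathbb{Q}[X]
  \quad (f \text{ is the minimal polynomial of } \alpha), \\
& f \text{ has only simple roots}
  \;\Longrightarrow\; \operatorname{mult}_\alpha(g/f) = k - 1, \\
& \text{so by induction } f^{k} \mid g, \text{ hence } \deg g \geq k d
  \text{ and } \frac{\deg g}{\operatorname{mult}_\alpha(g)} \geq d .
\end{align*}
```

Equality holds exactly when $g$ is a constant multiple of $f^{k}$, so replacing $f$ by a power of itself changes nothing.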

Thus some new idea is needed in order to improve the exponent $d$ of Liouville’s theorem. At the beginning of the century, Thue had the idea of running the same argument with an auxiliary polynomial in two variables. Thus in order to bound how close a single rational number can be to $\alpha$, Thue considers a pair of rational numbers both of which are close to $\alpha$. The reason why an extra approximating rational point helps is that now the auxiliary polynomial will be in two variables and this gives additional degrees of freedom allowing for an improvement in the important ratio which occurs as the exponent in Liouville’s theorem. There are of course immediate complications in the argument, particularly with Step 1 as with two variables there will not be a well-defined choice of auxiliary polynomial with big multiplicity at $(\alpha, \alpha)$. But without a canonical choice of $f(X, Y)$ there is no longer any reason why $f(p_1/q_1, p_2/q_2)$ should be non-zero, without which the argument runs aground. Thus the simplest step in the proof of Liouville’s theorem, Step 1, becomes the most difficult when one considers an auxiliary function in more than one variable.

Of course, once one is willing to consider $f(X, Y)$ there is no reason to stop there and Roth was finally able to obtain the best result possible by considering an auxiliary polynomial with an arbitrary number of variables:

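Theorem 1.2 (Roth’s Theorem). Suppose $\alpha \in \mathbb{R}$ is algebraic and irrational. Then for any $\epsilon > 0$ there are only finitely many solutions to

$$\left| \frac{p}{q} - \alpha \right| < \frac{1}{q^{2+\epsilon}}. \qquad (1.3)$$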

Note that the formulation of Roth’s theorem is slightly different from that of Liouville’s theorem. It follows from the fact that 1.3 has only finitely many solutions that there exists a constant $c(\alpha)$ such that

$$\left| \frac{p}{q} - \alpha \right| \geq \frac{c(\alpha)}{q^{2+\epsilon}} \quad \text{for all } p/q \in \mathbb{Q}.$$

The problem is that the constant $c(\alpha)$ is not effective and this is why Roth’s theorem is formulated in this alternative style.
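The passage from finiteness to a constant can be made explicit. The sketch below assumes that inequality 1.3 reads $|p/q - \alpha| < q^{-(2+\epsilon)}$ and writes $S$ (our notation) for its finite solution set.

```latex
% From "finitely many solutions of 1.3" to a constant c(alpha).
\begin{align*}
S &= \Bigl\{\, p/q \in \mathbb{Q} \;:\;
      \bigl| p/q - \alpha \bigr| < q^{-(2+\epsilon)} \,\Bigr\}
      \quad\text{(finite by Roth's theorem)}, \\
c(\alpha) &= \min\Bigl( 1,\ \min_{p/q \in S}
      q^{\,2+\epsilon}\, \bigl| p/q - \alpha \bigr| \Bigr) > 0
      \quad\text{(positive since } \alpha \notin \mathbb{Q}\text{)}, \\
p/q \notin S &\;\Longrightarrow\;
      \bigl| p/q - \alpha \bigr| \geq q^{-(2+\epsilon)}
      \geq c(\alpha)\, q^{-(2+\epsilon)}, \\
p/q \in S &\;\Longrightarrow\;
      \bigl| p/q - \alpha \bigr| \geq c(\alpha)\, q^{-(2+\epsilon)}
      \quad\text{by the definition of } c(\alpha).
\end{align*}
```

The constant is ineffective precisely because the proof gives no way to list the elements of $S$; note also that it depends on $\epsilon$ as well as on $\alpha$.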

The strategy for proving Roth’s theorem is identical to that of Liouville’s theorem except that we will use several good approximating points. So we assume that Roth’s theorem is false and this gives a sequence of good approximating points

$$\left| \frac{p_i}{q_i} - \alpha \right| < \frac{1}{q_i^{2+\epsilon}}, \quad i = 1, 2, \ldots \qquad (1.4)$$

The strategy for the proof will be to choose $m$ of these good approximating points, where $m \to \infty$ as $\epsilon \to 0$, and construct an auxiliary polynomial in $m$ variables with large order vanishing at $(\alpha, \ldots, \alpha)$. Steps 2 and 3 of Liouville’s theorem will then tell us that the approximations are too good, giving a contradiction. In order for Steps 2 and 3 to yield a contradiction, one needs to choose the appropriate notion for the order of vanishing of our polynomial $f(X_1, \ldots, X_m)$ at $(\alpha, \ldots, \alpha)$. We will see as we go through with the proof that the following, though at first awkward, is the apt definition:

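Definition 1.5. Let $0 \neq f(X_1, \ldots, X_m) \in \mathbb{C}[X_1, \ldots, X_m]$ and let $(\alpha_1, \ldots, \alpha_m) \in \mathbb{C}^m$. Consider the Taylor series expansion of $f$ about $(\alpha_1, \ldots, \alpha_m)$:

$$f(X_1, \ldots, X_m) = \sum_{J} a_J (X_1 - \alpha_1)^{j_1} \cdots (X_m - \alpha_m)^{j_m}, \quad J = (j_1, \ldots, j_m).$$

Suppose $f$ has multi-degree $(d_1, \ldots, d_m)$. The index of $f$ at $(\alpha_1, \ldots, \alpha_m)$ is defined as follows:

$$\mathrm{ind}_{(\alpha_1, \ldots, \alpha_m)}(f) = \min\left\{ \sum_{i=1}^{m} \frac{j_i}{d_i} \;:\; a_J \neq 0 \right\}.$$

The index of $f$ at a point $x \in \mathbb{C}^m$ is a weighted multiplicity. For example, if $d_1 = \cdots = d_m = d$ then

$$\mathrm{ind}_x(f) = \frac{\mathrm{mult}_x(f)}{d}.$$

In general, each variable is weighted so that it can contribute a maximum of 1 to the index.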
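Two small computations may help fix the definition. They assume the usual form of the index, namely the minimum of $\sum_i j_i/d_i$ over the non-zero Taylor coefficients $a_J$; the polynomials are our own illustrative choices.

```latex
% Index at the origin of two sample polynomials in two variables.
\begin{align*}
f &= X_1^2 X_2 + X_1 X_2^2, \quad (d_1, d_2) = (2, 2): \\
  &\quad X_1^2 X_2 \mapsto \tfrac{2}{2} + \tfrac{1}{2} = \tfrac{3}{2},
   \qquad X_1 X_2^2 \mapsto \tfrac{1}{2} + \tfrac{2}{2} = \tfrac{3}{2},
   \qquad \operatorname{ind}_{(0,0)}(f) = \tfrac{3}{2}
   = \frac{\operatorname{mult}_{(0,0)}(f)}{d} = \frac{3}{2}. \\[4pt]
g &= X_1 X_2 + X_2^4, \quad (d_1, d_2) = (1, 4): \\
  &\quad X_1 X_2 \mapsto \tfrac{1}{1} + \tfrac{1}{4} = \tfrac{5}{4},
   \qquad X_2^4 \mapsto \tfrac{4}{4} = 1,
   \qquad \operatorname{ind}_{(0,0)}(g) = 1
   \quad\text{while } \operatorname{mult}_{(0,0)}(g) = 2 .
\end{align*}
```

In the second example the monomial of lowest ordinary degree, $X_1 X_2$, is not the one that realizes the index: the weights $1/d_i$ penalize the low-degree variable $X_1$ more heavily.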

To see the relevance of the notion of index, we run through the three step proof of Roth’s theorem which we would like to model after Liouville’s theorem: we construct an auxiliary polynomial $f(X_1, \ldots, X_m)$ of multi-degree $(d_1, \ldots, d_m)$ with large index at the point $(\alpha, \ldots, \alpha)$. Then the proof should proceed as follows:

Step 1: Show that $f(p_1/q_1, \ldots, p_m/q_m) \neq 0$.

Step 2: $|f(p_1/q_1, \ldots, p_m/q_m)| \geq 1/(q_1^{d_1} \cdots q_m^{d_m})$ as in Liouville’s theorem.

Step 3: $|f(p_1/q_1, \ldots, p_m/q_m)|$ is small since $|p_i/q_i - \alpha|$ is small and $f$ has large index at $(\alpha, \ldots, \alpha)$.

As it stands, much work is needed in order to make this into a proof. Only Step 2 requires no further justification. We begin by explaining the upper bound for $|f(p_1/q_1, \ldots, p_m/q_m)|$ in Step 3 as this is what motivates the notion of index. As we did in Liouville’s theorem, consider the Taylor series expansion of $f$ about $(\alpha, \ldots, \alpha)$:

$$f(X_1, \ldots, X_m) = \sum_{J} a_J (X_1 - \alpha)^{j_1} \cdots (X_m - \alpha)^{j_m}. \qquad (1.6)$$

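Using 1.4 will give the following upper bound

$$|f(p_1/q_1, \ldots, p_m/q_m)| \leq C \max_{a_J \neq 0} \prod_{i=1}^{m} q_i^{-(2+\epsilon) j_i} \qquad (1.7)$$

for some constant $C$ which depends on the number of terms in 1.6 and the size of the coefficients of $f$. We see, ignoring the constant $C$ for the moment, that 1.7 contradicts the lower bound of Step 2 provided $a_J = 0$ whenever

$$\prod_{i=1}^{m} q_i^{(2+\epsilon) j_i} \leq \prod_{i=1}^{m} q_i^{d_i}. \qquad (1.8)$$

Since we do not know the relative sizes of the $q_i$, the only reasonable way to guarantee inequality 1.8 is by choosing $d_i$ so that the $q_i^{d_i}$ are all roughly proportional, i.e. we need to choose

$$q_1^{d_1} \approx q_2^{d_2} \approx \cdots \approx q_m^{d_m}. \qquad (1.9)$$

Moreover, with this choice of $d_i$ inequality 1.8 translates into the following: we want $a_J = 0$ whenever

$$\sum_{i=1}^{m} \frac{j_i}{d_i} \leq \frac{m}{2+\epsilon}. \qquad (1.10)$$

And here the index has finally reappeared as a natural consequence of the argument: indeed, 1.10 says that we want

$$\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) > \frac{m}{2+\epsilon}.$$

To summarize, the lower and upper bounds for $|f(p_1/q_1, \ldots, p_m/q_m)|$ given in Steps 2 and 3 respectively contradict one another, assuming we can suitably bound the constant $C$ in 1.7, provided that $\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) > m/(2+\epsilon)$.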
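The comparison of the two bounds can be checked in an idealized situation. The sketch below assumes that all the $q_i^{d_i}$ are exactly equal to a common value $Q$ (our simplification of the rough proportionality required above) and writes $\theta$ for the index of $f$ at $(\alpha, \ldots, \alpha)$.

```latex
% Idealized comparison of Steps 2 and 3, assuming q_i^{d_i} = Q for all i
% and |p_i/q_i - alpha| < q_i^{-(2+epsilon)}.
\begin{align*}
\text{Step 2:}\quad & |f(p_1/q_1, \ldots, p_m/q_m)|
   \geq \prod_{i=1}^{m} q_i^{-d_i} = Q^{-m}. \\
\text{Step 3:}\quad & \prod_{i=1}^{m} \bigl| p_i/q_i - \alpha \bigr|^{j_i}
   < \prod_{i=1}^{m} q_i^{-(2+\epsilon) j_i}
   = Q^{-(2+\epsilon) \sum_i j_i/d_i}
   \leq Q^{-(2+\epsilon)\theta}
   \quad\text{whenever } a_J \neq 0, \\
 & \text{so } |f(p_1/q_1, \ldots, p_m/q_m)| \leq C\, Q^{-(2+\epsilon)\theta}. \\
\text{Together:}\quad & Q^{(2+\epsilon)\theta - m} \leq C,
   \quad\text{which fails for large } Q
   \text{ as soon as } \theta > \frac{m}{2+\epsilon}.
\end{align*}
```

This makes visible why $C$ must be controlled: the contradiction needs $C$ to grow more slowly than $Q^{(2+\epsilon)\theta - m}$.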

At this point there are three technical issues to deal with in order to complete the proof:

Problem 1: Show that $f(p_1/q_1, \ldots, p_m/q_m) \neq 0$.

Problem 2: Bound the constant $C$ occurring in 1.7.

Problem 3: Show that one can find $0 \neq f$ with index $> m/(2 + \epsilon)$ at $(\alpha, \ldots, \alpha)$.

We will deal with the three problems in reverse order. Problem 3 is essentially a counting problem. We want to kill leading terms $a_J$ in the Taylor series expansion 1.6 provided $J$ satisfies 1.10. In order to guarantee that one can force all of these $a_J$ to vanish, it suffices to show that the number of $m$-tuples $(j_1, \ldots, j_m)$ with $0 \leq j_i \leq d_i$ satisfying 1.10 is a small fraction of the total number of monomials; for if this is true then a dimension count shows that there are lots of polynomials, with coefficients in a finite extension of $\mathbb{Q}$, having index at least $m/(2 + \epsilon)$ at $(\alpha, \ldots, \alpha)$ and all of its Galois conjugates. Taking the norm over $\mathbb{Q}$ of such a polynomial and clearing denominators gives the desired auxiliary polynomial. To show that all of this works in our particular setting is a special case of what Faltings and Wüstholz [FW2] refer to as the ‘law of large numbers’. Roughly speaking, this says the following: for each $i$, the "average" value of $j_i/d_i$ is $1/2$ so if one randomly chooses a tuple $(j_1, \ldots, j_m)$ the probability that it satisfies 1.10 approaches zero as $m$ approaches infinity. For a rigorous statement and proof one can consult [FW2] Proposition 5.1. Alternatively, for the case in which we are interested, one can make an explicit computation as in [L1] pp. 170-171.
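A crude version of this ‘law of large numbers’ already follows from Chebyshev's inequality. The sketch below is a heuristic under an assumption of ours: the ratios $\xi_i = j_i/d_i$ are modelled as independent random variables uniformly distributed on $[0,1]$ (the continuous limit as $d_i \to \infty$), and the condition in question is taken to be $\sum_i j_i/d_i \leq m/(2+\epsilon)$.

```latex
% Chebyshev estimate for the proportion of exponent vectors with small weighted sum.
\begin{align*}
& \mathbb{E}[\xi_i] = \tfrac12, \qquad \operatorname{Var}(\xi_i) = \tfrac{1}{12},
  \qquad S = \sum_{i=1}^{m} \xi_i, \qquad
  \mathbb{E}[S] = \tfrac{m}{2}, \qquad \operatorname{Var}(S) = \tfrac{m}{12}, \\
& \frac{m}{2} - \frac{m}{2+\epsilon} = \frac{m\,\epsilon}{2(2+\epsilon)}, \\
& \mathbb{P}\Bigl( S \leq \frac{m}{2+\epsilon} \Bigr)
  \leq \mathbb{P}\Bigl( \bigl| S - \tfrac{m}{2} \bigr|
       \geq \frac{m\,\epsilon}{2(2+\epsilon)} \Bigr)
  \leq \frac{m}{12} \cdot \frac{4 (2+\epsilon)^2}{m^2 \epsilon^2}
  = \frac{(2+\epsilon)^2}{3\, m\, \epsilon^2}
  \;\xrightarrow[m \to \infty]{}\; 0 .
\end{align*}
```

For fixed $\epsilon$ the proportion of conditions is therefore as small as desired once $m$ is large, which is consistent with the need to let $m \to \infty$ as $\epsilon \to 0$. The cited references give much sharper estimates.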

Next we deal with Problem 2. There are two separate issues here, first counting the number of terms $a_J$ and second bounding the size of the $|a_J|$. The first of these is simple as the number of $a_J$ is $\prod_{i=1}^{m} (d_i + 1)$. Bounding the size of the $a_J$ is equivalent to bounding the size of the coefficients of the auxiliary polynomial $f(X)$ which is potentially a serious matter as $f$ was constructed abstractly to satisfy certain vanishing conditions. The fact that the size of the coefficients of $f$ can be controlled is a consequence of the famous Siegel lemma:

Lemma 1.11 (Siegel’s Lemma). Consider a system of $m$ linear equations in $n$ unknowns, with $m < n$:

$$\sum_{j=1}^{n} a_{ij} x_j = 0, \quad 1 \leq i \leq m.$$

Suppose $a_{ij} \in \mathbb{Z}$ for all $i, j$ and that $|a_{ij}| \leq A$ for all $i, j$. Then there exists a non-zero solution $(x_1, \ldots, x_n) \in \mathbb{Z}^n$ to the system of equations with

$$|x_i| \leq (nA)^{m/(n-m)}.$$
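The smallest possible case shows what the lemma asserts. The sketch assumes the classical form of the bound, $|x_i| \leq (nA)^{m/(n-m)}$, and takes one equation in two unknowns with $A \geq 1$; the example is ours.

```latex
% Siegel's lemma for m = 1 equation in n = 2 unknowns.
\begin{align*}
& a_1 x_1 + a_2 x_2 = 0, \qquad a_1, a_2 \in \mathbb{Z}, \quad |a_1|, |a_2| \leq A, \\
& (x_1, x_2) =
  \begin{cases}
    (a_2, -a_1) & \text{if } (a_1, a_2) \neq (0, 0), \\
    (1, 0)      & \text{if } a_1 = a_2 = 0,
  \end{cases} \\
& \text{is a non-zero integer solution with }
  \max(|x_1|, |x_2|) \leq A \leq 2A = (nA)^{m/(n-m)} .
\end{align*}
```

In general no explicit solution of this kind is available, and the proof of the lemma proceeds by the pigeonhole principle rather than by construction.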

We apply Siegel’s lemma by viewing the condition $\mathrm{ind}_{(\alpha, \ldots, \alpha)}(f) > m/(2 + \epsilon)$ as a system of linear equations in the coefficients of $f$, i.e. we view the coefficients of $f$ as variables. Siegel’s lemma cannot be applied directly in this situation because our system of linear equations will have coefficients in the number field $K = \mathbb{Q}(\alpha)$ since we are trying to impose large index at $\alpha$. Choosing a basis for $K$ as a vector space over $\mathbb{Q}$ allows one to make the reduction to Lemma 1.11 (for details, see [S2] Lemma 9A). One then obtains a polynomial $g \in \mathcal{O}_K[X_1, \ldots, X_m]$ with large index along $(\alpha, \ldots, \alpha)$ and its conjugates such that the coefficients have bounded size; taking the norm of $g$ then gives the desired $f$.

The crucial observation to make about Siegel’s lemma is that the exponent $m/(n - m)$ is small provided that the number of unknowns is a fixed multiple of the number of equations.
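In symbols, writing $n = \lambda m$ with a fixed ratio $\lambda > 1$ (the letter $\lambda$ is our notation), the exponent no longer depends on the size of the system:

```latex
% The Siegel exponent when unknowns are a fixed multiple of equations.
\begin{align*}
n = \lambda m \quad (\lambda > 1)
  \;\Longrightarrow\;
  \frac{m}{n - m} = \frac{1}{\lambda - 1},
  \qquad\text{e.g. } \lambda = 2 \mapsto 1, \quad \lambda = 11 \mapsto \tfrac{1}{10}.
\end{align*}
```

So if the index conditions use up only a small fraction of the available monomials, $\lambda$ is large and the solution is barely larger than the coefficients of the system.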

At this point, one needs to do a lot of accounting to check that Siegel’s lemma gives an auxiliary polynomial $f \in \mathbb{Z}[X_1, \ldots, X_m]$ such that the constant $C$ in 1.7 is small enough to obtain a contradiction when comparing the lower and upper bounds on $|f(p_1/q_1, \ldots, p_m/q_m)|$. In practice, to get the numbers to work out one chooses $m$ big enough to impose index $m/(2 + \epsilon/2)$ at $(\alpha, \ldots, \alpha)$ and the extra $\epsilon/2$ allows one to absorb the bound for $C$ in 1.7 and still obtain a contradiction when comparing the numbers in Steps 2 and 3.

2. INTERLUDE: DYSON’S LEMMA

We still have not faced the most difficult of the obstructions to proving Roth’s theorem, namely the issue arising at the beginning of the argument in Step 1: how can we guarantee that $f(p_1/q_1, \ldots, p_m/q_m) \neq 0$? Without this information, all of our computations to obtain a contradiction from comparing upper and lower bounds of $|f(p_1/q_1, \ldots, p_m/q_m)|$ are in vain. This is by far the most difficult part of Roth’s theorem as the rest of the argument essentially consists in counting. In general, there is of course no way to guarantee that $f(p_1/q_1, \ldots, p_m/q_m) \neq 0$ because $f$ has not been explicitly constructed: we simply know from dimension counting that such an $f$ exists but of course there may be several and most of them will vanish at the approximating point $(p_1/q_1, \ldots, p_m/q_m)$.

Roth originally dealt with this problem in an essentially arithmetic manner. He proved, in what is now called Roth’s lemma, that provided $d_1 \gg d_2 \gg \cdots \gg d_m$ (or, equivalently, given 1.9, $q_1 \ll q_2 \ll \cdots \ll q_m$) the polynomial $f(X)$ constructed using Siegel’s lemma cannot have large order of vanishing at $(p_1/q_1, \ldots, p_m/q_m)$. Taking the appropriate derivatives of $f$ then yields a contradiction (since the number of derivatives is small, it does not affect the contradiction arrived at in Steps 2 and 3 of the argument). Roth’s lemma is essentially arithmetic in nature, using the facts that $f$ has integer coefficients and that we are interested in its order of vanishing at a rational point. For a proof, one can consult [L1] pp. 179-181.

An alternative geometric argument to establish non-vanishing of $f(p_1/q_1, \ldots, p_m/q_m)$ was developed thirty years later by Esnault and Viehweg [EV] building upon Dyson [D], Bombieri [B1], and Viola [Vi]. Esnault and Viehweg approach the problem from the following point of view. Suppose that whichever $f(X)$ we choose with large index at $(\alpha, \ldots, \alpha)$ we always find that $f(X)$ also has large index at $(p_1/q_1, \ldots, p_m/q_m)$. This says that certain linear conditions on the space of all polynomials of degree $(d_1, \ldots, d_m)$ fail to be independent. Thus if one could establish this independence then this would also yield a contradiction. The advantage to this viewpoint is that it is entirely geometric; the fact that the points in which we are interested are all algebraic becomes unimportant.

To state the powerful result which Esnault and Viehweg were able to prove, we introduce some new notation. Let

$$P = \mathbb{P}^1 \times \cdots \times \mathbb{P}^1$$

be a product of $m$ projective lines defined over $k$, an algebraically closed field of characteristic zero. We will assume for simplicity that $k = \mathbb{C}$, the field of complex numbers, although the argument remains valid in the general case. Let $\pi_i : P \to \mathbb{P}^1$ be the projection to the $i$th factor. For positive integers $d_1, \ldots, d_m$ write $d = (d_1, \ldots, d_m)$ and

$$\mathcal{O}_P(d) = \pi_1^* \mathcal{O}_{\mathbb{P}^1}(d_1) \otimes \cdots \otimes \pi_m^* \mathcal{O}_{\mathbb{P}^1}(d_m).$$

For fixed $d$ let $0 \neq s \in H^0(P, \mathcal{O}_P(d))$. The index of $s$ at a closed point $\zeta$ of $P$ is defined by locally identifying $s$ with a polynomial and applying Definition 1.5. We need to define certain volumes as in [EV] Definition 0.2:

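Definition 2.1. Let $I^m = \{\xi = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^m : 0 \leq \xi_i \leq 1 \text{ for all } i\}$ and let $\mathrm{Vol}(t)$ denote the volume of

$$\left\{ \xi \in I^m \;:\; \sum_{i=1}^{m} \xi_i \leq t \right\}.$$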
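For small $m$ these volumes are elementary. The formulas below assume that $\mathrm{Vol}(t)$ is the volume of the part of the unit cube $[0,1]^m$ on which $\xi_1 + \cdots + \xi_m \leq t$, which is our reading of the definition.

```latex
% Vol(t) for m = 1, 2 and the general small-t formula.
\begin{align*}
m = 1:\quad & \operatorname{Vol}(t) = t \quad (0 \leq t \leq 1). \\
m = 2:\quad & \operatorname{Vol}(t) =
  \begin{cases}
    t^2/2           & 0 \leq t \leq 1, \\
    1 - (2 - t)^2/2 & 1 \leq t \leq 2.
  \end{cases} \\
\text{general } m:\quad & \operatorname{Vol}(t) = \frac{t^m}{m!} \quad (0 \leq t \leq 1),
  \qquad \operatorname{Vol}(t) = 0 \ (t \leq 0),
  \qquad \operatorname{Vol}(t) = 1 \ (t \geq m),
  \qquad \operatorname{Vol}(m/2) = \tfrac12 .
\end{align*}
```

The last identity follows from the symmetry $\xi \mapsto (1 - \xi_1, \ldots, 1 - \xi_m)$ of the cube, and it matches the heuristic that the "average" value of the weighted sum is $m/2$.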

Of course $1 - \mathrm{Vol}(t)$ measures, asymptotically in $d$, the proportion of sections of $H^0(P, \mathcal{O}_P(d))$ with index $\geq t$ at a point $\zeta$. Dyson’s lemma gives conditions on a set of points $\{\zeta_i\} \subset P$ and a line bundle $\mathcal{O}_P(d)$ so that requiring index $t_i$ at $\zeta_i$ imposes almost independent conditions on global sections of $\mathcal{O}_P(d)$. The following is an alternative, slightly more transparent formulation of Esnault and Viehweg’s Dyson lemma:

Theorem 2.2. Suppose $0 \neq s \in H^0(P, \mathcal{O}_P(d))$ and $\{\zeta_1, \ldots, \zeta_M\} \subset P$ so

Note that if $t \leq 0$ then $\mathrm{Vol}(t) = 0$. Theorem 2.2 says that the linear conditions for imposing index $t_i - m\delta$ are independent inside $H^0(P, \mathcal{O}_P(d))$. The key observation about Theorem 2.2 is that if $d_1 \gg d_2 \gg \cdots \gg d_m$ then $m\delta$ is small so in this case Theorem 2.2 says that the conditions to impose index $t_i$ at $\zeta_i$ are nearly independent inside $H^0(P, \mathcal{O}_P(d))$. This will then allow us to derive the desired contradiction and prove Roth’s theorem, provided we can arrange for $d_1 \gg d_2 \gg \cdots \gg d_m$. But by 1.9 this is equivalent to choosing rational approximations with denominators satisfying $q_1 \ll q_2 \ll \cdots \ll q_m$ and we can achieve this since we have assumed that there are infinitely many good rational approximating points. It should be noted that there is still some arguing left because Dyson’s lemma only shows that there exists $g(X) \in \mathbb{C}[X_1, \ldots, X_m]$ with large index at $(\alpha, \ldots, \alpha)$ and its conjugates which has small order of vanishing at $(p_1/q_1, \ldots, p_m/q_m)$. From here one can construct $f(X) \in \mathbb{Z}[X_1, \ldots, X_m]$ with large index at $(\alpha, \ldots, \alpha)$ and small index at $(p_1/q_1, \ldots, p_m/q_m)$ but a priori we have no control on the size of the coefficients of $f$. Thus more calculations of dimensions and of Siegel type are necessary (see [EV] §10 for details).

We now turn to the proof of Theorem 2.2. In the statement of Theorem 2.2, the volumes which occur have been perturbed by $m\delta$. The reason for this is that suitable derivatives of the non-zero section $s \in H^0(P, \mathcal{O}_P(d))$ with index $t_i$ at $\zeta_i$ still have index $\geq t_i - m\delta$ at $\zeta_i$. The product theorem will tell us that the common zero locus of these sections is contained in a finite union of proper product subvarieties. Let $P' \subset P$ be one of these product subvarieties and let $\pi : P \to P'$ denote the natural projection. Taking $\zeta_i' = \pi(\zeta_i)$ one then shows, except in certain degenerate cases, that on $P'$ one can produce a new non-zero section $s' \in H^0(P', \mathcal{O}_{P'}(d))$ with suitable indices at the $\zeta_i'$. This procedure is then iterated to produce a zero cycle representing $c_1(\mathcal{O}_P(d))^m$ and the part of the class associated to $\zeta_i$ measures the cost of imposing index $t_i - m\delta$ at $\zeta_i$. These classes have disjoint supports and this implies the desired conclusion: the cost of imposing index $t_i - m\delta$ at $\zeta_i$ for all $i$ is the sum of the individual costs, i.e. the conditions are independent.

The details of the proof of Theorem 2.2 are as follows. We are given a non-zero section $s \in H^0(P, \mathcal{O}_P(d))$ with $t_i = \mathrm{ind}_{\zeta_i}(s)$. Given a point $\zeta \in P$, an affine open subset $U = \mathrm{Spec}\, A \subset P$, and a local trivialization of $\mathcal{O}_P(d)$ on $U$, the sections $\sigma \in H^0(P, \mathcal{O}_P(d))$ such that $\mathrm{ind}_\zeta(\sigma) \geq t$ generate an ideal of $A$. This ideal does not depend on the trivialization and as $U$ varies over an affine cover of $P$ this defines an ideal sheaf $\mathcal{I}_{\zeta, d, t} \subset \mathcal{O}_P$.

Let

Since we are trying to show that the various index conditions are independent, as a first approximation one should expect the supports of these ideal sheaves to be disjoint. This is not difficult to show because one can write down explicit monomial generators for $\mathcal{I}_{\zeta, d, t}$ and hence determine its support (see [EV] Lemma 2.5). It then follows readily ([EV] Lemma 2.8), as desired, that the $Z_i$ are mutually disjoint (2.3).

As discussed

_above,

we would like to be able to

_produce

sections

sat-isfying

the relevant index conditions on _proper

_product

subvarieties. To this

_end,

let

_{Id (s) =}

_IÇj,d,tj

and for P’ C P a

_product

_subvariety

let

_Op’

_(d)

=

Op(d)[P’.

If _oneis

unlucky

and P’ _C

Zi

for _somei then

H°

= 0 for all k _> 0 so there will be no

hope

of

producing

a new section. On the other

hand,

[EV]

2.9

(ii)

_implies

that

whenever

_{P’ 0}

_Zi

for

_{any i}

then for k

_sufficiently

divisible

There is one more crucial ingredient in the proof of Theorem 2.2 as it allows us to use the sections guaranteed by 2.4 to construct an intersection class representing $c_1(\mathcal{O}_P(d))^m$. This is the so-called product theorem which establishes a certain degeneracy in the locus where the non-zero section $s \in H^0(P, \mathcal{O}_P(d))$ can have large index. Initially inspired by the product theorem of Faltings [F1], the result presented here strengthens Theorem 3.1 of [N1].

Theorem 2.5

_{(Product Theorem).}

Let

_f

E

_{C[Xl, ... , Xm]}

be

_of

multi-degree d 1, ... ,

d,~

and

_for

an m -

1--tupl e

of

non-negative

integers

a =

1

₁

Let W C

_{X ( f )}

be an irreducible

_component.

Then W is contained in a proper

product subvariety.

Proof of Theorem 2.5. We will show that either is a

_point

or


in a _proper

_product

subvariety

and we are done. In the latter _case,choose

a

_general

A E C and

_{replace f}

with

Since

_{a/axl, ... ,}

commute with evaluation at A we have

"""""- --

-B ut

’9011

...

irm-2

2

f

(

X 1, ... ,

X m )

vanishes

along

W’

x C for aZ But

5X " ... "!-2 f

(Xi,... X.,,,)

vanishes

_along

W’ x C

for

_{ai :5}

a1 ,gxm 2

assumption.

Hence

_by

2.5.1

Thus, using

the notation in the statement of Theorem

_2.5,

W’

C

and we can conclude

by

induction on m that

W’,

and hence

W ,

must be contained in a _proper

_{product subvariety.}

We now

proceed

to establish that if is not a

_point

then W =

W’ x

_{C, concluding}

the

_proof

of Theorem 2.5. Choose a

general

point

’11 E W

so that

Let denote the

_tangent

space to W at 17. Since we are

assum-ing

that W - C is

_surjective

it

_{follows, choosing q}

still more

gen-eral if _necessary,that we _mayassume the induced _mapon

_tangent

_spaces

is

_{surjective. Consequently, viewing}

_Tl1(W),

after

translation to the origin, as a subspace

We claim that for some 1 i m - 1

Observe first that if DO. is _anydifferential

_operator

of order

_{tll(f) - 1}

and if

_8/8X

E

_{T,7 (W)}

then = = 0 since

D°‘ ( f )

vanishes on W and E Thus

Suppose,

contrary

to

_2.5.5,

that for have


By

2.5.4 and 2.5.6 we also have But this

would

_imply

that +

_1,

a contradiction. Thus 2.5.5

holds.

_Applying

the same

_argument

inductively

and

using

2.5.3 shows that

for some

_non-negative

_{il, ... ,}

But

_by

the definition of

_{X ( f ),}

= 0 for _any

0 ~ Qi ~ dj.

Thus

+ 1. Since

_f

has

_degree

_dj

in this is

_{only possible}

if W is of the form W’ x C. This concludes the

_proof

of the

_product

theorem.

Observe that if we

_replace

d with Nd for N

_sufficiently

divisible then we can assume without loss of

generality

that if

Z 0

Zi

for

_{any i}

then there

exists

Indeed, applying

the

_product

theorem to the section s of Theorem 2.2

gives

sections of

H°

_(P,

_{Op (d)}

o which

_generate

off a finite

union of _proper

_product

subvarieties. If P’ is one of these _proper

product

subvarieties,

not contained in

_Zi

for

_{any i,}

then 2.4

_gives

a non-zero section

for some

_positive

_integer

n. Since 2.4

only

needs to be

applied

finitely

_many

times in order to obtain sections on _any

subvariety Z 0 Zi,

we can choose

N

_sufficiently

divisible to

_give

all of the sections sz at once.

We will use 2.6 in order to construct an effective cycle representing

This is done inductively as follows. With $s_P$ as in 2.6, $Z(s_P)$ represents $c_1(\mathcal{O}_P(d))$.

Let $Z$ be an irreducible component of $Z(s_P)$. If $Z \subset Z_i$ for some $i$, then choose some $0 \ne s_Z \in H^0(Z, \mathcal{O}_P(d))$ and $Z(s_Z)$ is an effective representative for $c_1(\mathcal{O}_P(d)) \cap Z$. Suppose next that $Z \not\subset Z_i$ for any $i$. Then

by

2.6 there exists E

H°

and

$Z(s_Z)$ represents $c_1(\mathcal{O}_P(d)) \cap Z$. Repeating this procedure inductively gives an effective representative for

Moreover, since by 2.3 the $Z_i$ are mutually disjoint, we can write

where is the

_part

of the intersection

_supported

on

_{Z; .}

Since R is

_effective,

We have

_deg

= m!

die

Thus to conclude the proof of Theorem 2.2 it suffices to prove that for $1 \le j \le M$ we have

Combining 2.8 with 2.7 yields Theorem 2.2.

A rigorous proof of 2.8 involves blowing up the ideal sheaf but intuitively it has an easy explanation.

Suppose $C_1, C_2 \subset \mathbb{P}^2$ are two distinct, irreducible curves meeting in a point $P$. Suppose moreover that both $C_1$ and $C_2$ have multiplicity $m$ at $P$. Let $(C_1 \cdot C_2)_P$ denote the part of the intersection class $C_1 \cdot C_2$ supported at $P$. Then one has

But of course $m^2$ measures precisely the cost of imposing multiplicity $m$ at $P$. In the situation of 2.8, things are more complicated because the condition we are imposing will force vanishing not only at points but also along higher dimensional subvarieties. The fundamental principle, however, is the same: the part of the intersection class we have produced supported on $Z_j$ measures by definition the cost of imposing index $t_j - m\delta$ at $\zeta_j$. The right hand side of 2.8 is the cost of imposing $t_j - m\delta$ at $\zeta_j$, whence the inequality.
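The plane-curve heuristic above can be checked on a concrete pair of curves. The following is a minimal worked example that is not taken from the source: the two cuspidal cubics and the parametrization are chosen here purely for illustration, and $i_P(C_1, C_2)$ is assumed notation for the local intersection multiplicity at $P$.

```latex
% Illustrative example (curves chosen here, not from the source).
% C_1 : y^2 = x^3 and C_2 : x^2 = y^3 both have multiplicity m = 2 at P = (0,0).
\[
  C_1 : y^2 - x^3 = 0, \qquad C_2 : x^2 - y^3 = 0 .
\]
% Parametrize C_1 by (x, y) = (t^2, t^3) and substitute into the equation of C_2:
\[
  x^2 - y^3 = t^4 - t^9 = t^4 (1 - t^5) .
\]
% The order of vanishing at t = 0 is the local intersection multiplicity at P:
\[
  i_P(C_1, C_2) = 4 = m^2 ,
\]
% while the five roots of t^5 = 1 give five further simple intersection points,
% in agreement with Bezout: 4 + 5 = 9 = (\deg C_1)(\deg C_2).
```

Equality holds in this example because the tangent cones $y^2 = 0$ and $x^2 = 0$ share no line; in general one only has the inequality $i_P(C_1, C_2) \ge m^2$.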

More precisely, suppose $\pi : Y \to P$ is the blow-up of $P$ along this ideal sheaf. The ideal sheaf has been defined so that it is generated locally by global sections of $H^0(P, \mathcal{O}_P(d))$. Hence if $E$ denotes the exceptional divisor of $\pi$, it follows that $\pi^*\mathcal{O}_P(d)(-E)$ is numerically effective.

_Lifting

the construction of to Y one verifies that

Since

_x*Op(d)(-E)

is

_numerically

effective is

determined by the

_growth

of for

_large

n.

Pushing back down to $P$, a direct computation (see [N3]) then derives 2.8 from 2.9.

We now pass to Vojta’s version of Dyson’s lemma on products of curves of any genus. This is a simple and beautiful theorem which initially inspired Vojta’s diophantine proof of the Mordell conjecture [V2] and serves as a key ingredient in that proof.

We begin by fixing some notation. Let $C_1$ and $C_2$ be smooth projective curves over an algebraically closed field of characteristic zero. Suppose $L$ is a line bundle on $C_1 \times C_2$. Let $F_1 = P_1 \times C_2$ be a fibre of the first projection and $F_2 = C_1 \times P_2$ a fibre of the second projection for arbitrary points $P_1 \in C_1$ and $P_2 \in C_2$. Let $d_1 = L \cdot F_2$ and $d_2 = L \cdot F_1$ so $d_1$ and $d_2$ measure the ’degrees’ of the line bundle $L$ in analogy to the multi-degree of our auxiliary polynomial in Roth’s theorem.

isomorphic to the projective line. If $s$ is a non-zero section of $L$ it makes sense to talk about the index of $s$ at a point $x \in C_1 \times C_2$: Definition 1.5 applies in this situation, with weights $d_1, d_2$, as it is a local definition. We let the genus of $C_1$ be $g_1$ and the genus of $C_2$ will be denoted $g_2$.

Theorem 2.10 (Vojta). Suppose there is a non-zero section $s \in H^0(C_1 \times C_2, L)$. Suppose moreover that $\zeta_1, \ldots, \zeta_M$ are points in $C_1 \times C_2$ such that no two are contained in a proper product subvariety and let $t_i$ denote the index of $s$ at $\zeta_i$. Write $Z(s) = \sum_i a_i Z_i$ where $Z_i$ is irreducible and let

Several remarks are in order to explain the strength of Theorem 2.10. If, as will be the case in the Mordell conjecture, the genus of both $C_1$ and $C_2$ is at least 2, then $2g_1 - 2 + M > 0$ and one does not need to take the maximum. In these circumstances, inequality 2.11 reads

Moreover,

we know that e

d2 }

since any curve

Zi

which is not a

fibre of either

_projection

satisfies 1 and 1. If we assume

that $d_1 \gg d_2$, as we did in Roth’s theorem, this will kill the second term of 2.12 leaving us with

In the case we dealt with previously where both $C_1$ and $C_2$ had genus zero, we have $c_1(L)^2 = 2d_1d_2$ and this gives the 1 on the right hand side of our previous Dyson lemma. On higher genus curves, however, there are more line bundles because the diagonal inside $C \times C$ is only linearly equivalent to fibres of the two projections when the genus of $C$ is zero. Thus if one can choose a line bundle with $c_1(L)^2 \ll d_1d_2$ then Vojta’s version of Dyson’s lemma says that no section of $L$ can have big index at any point of $C \times C$. It is exactly in this fashion that Dyson’s lemma will be applied to prove the Mordell conjecture.
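The genus zero identity $c_1(L)^2 = 2d_1d_2$ quoted above follows from a short intersection computation. The sketch below assumes the standard facts that on $\mathbb{P}^1 \times \mathbb{P}^1$ every line bundle is numerically a combination of the fibre classes $F_1, F_2$, with $F_1^2 = F_2^2 = 0$ and $F_1 \cdot F_2 = 1$, and it reads the degrees as $d_1 = L \cdot F_2$, $d_2 = L \cdot F_1$; the coefficients $a, b$ are introduced here only for the derivation.

```latex
% Assumption: L \equiv a F_1 + b F_2 on P^1 x P^1 (a, b are auxiliary symbols).
\[
  d_1 = L \cdot F_2 = a\,(F_1 \cdot F_2) + b\,F_2^2 = a, \qquad
  d_2 = L \cdot F_1 = a\,F_1^2 + b\,(F_1 \cdot F_2) = b .
\]
% Squaring the class of L then gives the self-intersection number:
\[
  c_1(L)^2 = (d_1 F_1 + d_2 F_2)^2
           = d_1^2 F_1^2 + 2 d_1 d_2\,(F_1 \cdot F_2) + d_2^2 F_2^2
           = 2 d_1 d_2 .
\]
```

On a product of curves of positive genus the diagonal contributes an additional class, so $c_1(L)^2$ is no longer forced to equal $2d_1d_2$.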

The proof of Theorem 2.10 is essentially identical to that of Theorem 2.2, the only difference being that taking derivatives is slightly more complicated on a curve of larger genus: this, in fact, is what creates the $2g_1 - 2$ occurring on the right hand side of 2.11. To see the similarity between Theorem 2.2 and Theorem 2.10 we consider first the special case of Theorem 2.10 where both $C_1$ and $C_2$ are projective lines. In this case, the theorems

appears on the left hand side of Theorem 2.2 while it has been moved to the right hand side in Theorem 2.10.

To see how our proof of Theorem 2.2 can be adapted to this new situation, suppose we make sure, when constructing our intersection product, that the sections which we intersect all have index $t_i$ at $\zeta_i$ rather than settling for index $t_i - 2\delta$ as we did. In other words, after identifying $s$ with a polynomial $f(X, Y)$ on an affine open subset and then differentiating $f(X, Y)$, we increase the index by multiplying by some other fixed polynomial. Of course, since the goal of the proof is to induct on the dimension of product subvarieties, the polynomial one multiplies by should if possible have zeroes in a proper product subvariety, i.e. one should just add fibres of one of the two projections. Thus the simplest solution is to replace the derivatives by

Indeed, the polynomials in 2.13 still have index at least $t_i$ at $\zeta_i$ for all $i$ and the possible decrease in index incurred by differentiating has been remedied. The only problem is that the degree of the polynomial has been increased. To minimize the increase, note that one of the points, say $\zeta_1$, can be taken to be $(\infty, \infty)$ with respect to the affine open patch where we differentiate.

Then the derivatives do not affect the index of the projectivization of $f$ at $\zeta_1$. Also, since each derivative decreases the degree of $f$, we have some additional room to twist in order to get a section of

In particular, taking $\zeta_2 = (0, 0)$, we note that $X_1 \frac{\partial}{\partial X_1}(f)$ has degree $(d_1, d_2)$ and index $\ge t_2$ at $\zeta_2$. So with this modification 2.13 gives us polynomials of bi-degree $(d_1 + (M - 2)i, d_2)$.

The number of derivatives which need to be taken in order for the common zeroes to be contained in a proper product subvariety is by definition the number $e$ because the derivation is transverse to all non-fibral curves and so these partial derivatives always decrease the order of vanishing along such components of $Z(s)$. Thus the conclusion, if we run through the argument used to prove Theorem 2.2, is that

Expanding 2.14 yields exactly 2.11 up to a factor of two on the second term: for a discussion of how to remove this factor of 2, see [EV] §10.
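The effect of replacing a plain derivative by $X \,\partial/\partial X$ can be seen on a single monomial. The following sketch assumes that the weighted index of Definition 1.5 assigns the monomial $X^aY^b$ the value $a/d_1 + b/d_2$ at the origin; the exponents $a, b$ are illustrative and not taken from the source.

```latex
% Illustrative monomial X^a Y^b with 1 <= a <= d_1 and 0 <= b <= d_2.
\[
  \frac{\partial}{\partial X}\bigl(X^a Y^b\bigr) = a\,X^{a-1} Y^b ,
  \qquad
  X \frac{\partial}{\partial X}\bigl(X^a Y^b\bigr) = a\,X^{a} Y^b .
\]
% Weighted order at the origin (assumed form of the index):
\[
  \operatorname{ind}_{(0,0)}\bigl(X^{a-1} Y^b\bigr) = \frac{a-1}{d_1} + \frac{b}{d_2},
  \qquad
  \operatorname{ind}_{(0,0)}\bigl(X^{a} Y^b\bigr) = \frac{a}{d_1} + \frac{b}{d_2} .
\]
% So X d/dX preserves both the bi-degree bound (d_1, d_2) and the index at (0,0),
% whereas d/dX alone lowers the index by 1/d_1.
```

Applied term by term to a polynomial $f$, the operator $X \,\partial/\partial X$ only rescales or kills monomials already present in $f$, which is why the index at the origin cannot drop.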

We now turn to proving Vojta’s version of Dyson’s lemma in full generality. In order to take derivatives of a section of $L$ on $C_1 \times C_2$ we recall the method used by Faltings ([F1] p. 560 and p. 566) in a more general

derivatives of a section $s \in H^0(C_1 \times C_2, L)$ is first that there are no global vector fields in genus $\ge 2$ and second that after identifying $s$ locally with some polynomial and differentiating this polynomial, the derivatives no longer patch together to give a section of $L$. To remedy this we consider the following construction.

Suppose we are given a dominant map $p_1 : C_1 \to \mathbb{P}^1$. Then one can pull back the derivation on $\mathbb{P}^1$ via $p_1$ with poles along the ramification divisor $R$ of $p_1$ and so obtain:

Definition 2.15. Let $s \in H^0(C_1 \times C_2, L)$ and suppose $Z \subset Z(s)$.

Then

is the

_global

section of +

_L(1riKcl)IZ

obtained

locally by differentiating $s$ via the pull-back of the derivation on $\mathcal{O}_{\mathbb{P}^1}$ by the composition $p_1 \circ \pi_1$. Higher order derivatives are defined analogously so for $\alpha > 0$

provided all partial derivatives $D^\beta(s)$ with $\beta < \alpha$ vanish identically along $Z$.

Using Definition 2.15, one can derive Theorem 2.10 in exactly the same manner in which we proved Theorem 2.2. The only difference is that when we take $e$ derivatives of the section $s$ we have to twist by $e\pi_1^* K_{C_1}$. Since we also have to add one fibre for each of the points $\zeta_i$ in order to preserve the index of the derivative of $s$ at $\zeta_i$ this means that the total twist we need in order to take the necessary $e$ derivatives is

and, since $\deg(K_{C_1}) = 2g_1 - 2$ this is exactly what occurs on the right hand side in 2.11. Thus arguing exactly as before one finds, for $J = \max\{2g_1 - 2 + M, 0\}$

which is 2.11 up to a factor of 2 on the second term. We already saw this factor of 2 appear after 2.14 and to avoid it one makes a direct intersection theoretic construction as in [V1] or [EV] §10.
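The bookkeeping behind the quantity $2g_1 - 2 + M$ can be made explicit. The display below is a sketch under the assumption that each of the $e$ derivatives costs one copy of $\pi_1^*K_{C_1}$ together with one fibre $F_1$ for each of the $M$ points; the symbol $T$ for the total twist is introduced here and is not the source’s notation.

```latex
% T := total twist needed for e derivatives (notation assumed for this sketch).
\[
  T \equiv e\,\pi_1^* K_{C_1} + e\,M\,F_1 .
\]
% \pi_1^* K_{C_1} is numerically (2g_1 - 2) F_1, since \deg K_{C_1} = 2g_1 - 2:
\[
  T \equiv e\,(2g_1 - 2 + M)\,F_1 ,
  \qquad
  T \cdot F_2 = e\,(2g_1 - 2 + M), \qquad T \cdot F_1 = 0 .
\]
```

Under these assumptions the twist changes only the degree $d_1 = L \cdot F_2$ and leaves $d_2 = L \cdot F_1$ untouched; for $g_1 = 0$ it specializes to the shift by $(M - 2)$ per derivative seen in the case of projective lines.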

Two further remarks are necessary in order to turn this sketch into a proof. First, one needs to be careful about twisting the line bundle $L$ because this can make it difficult to compute $\deg R_j$ in 2.8; indeed the proof is cohomological in nature and requires that the sections constructed all be sections of the same line bundle. This problem is not serious, however, as we are merely twisting by fibres of the first projection and so one can twist all bundles involved without adversely affecting the construction. The second possible problem is that on higher genus curves, the derivative of

not as a section of a line bundle on $C_1 \times C_2$. This too, however, plays no crucial role in the proof of 2.14 (and its analogue on $C_1 \times C_2$). Again the proof is cohomological in nature and only depends on numerical effectivity of a line bundle on a blow up of $C_1 \times C_2$ (see [N3] for details) and for this one does not need global sections but just non-zero sections on any curve in $C_1 \times C_2$.

To conclude this section, it is worth pointing out that Theorem 2.10 does not hold, as stated, on a product of three or more curves of genus $> 1$: in other words, the data of the degrees $d_i$ of $L$ and the top intersection number $c_1(L)^{\mathrm{top}}$ are not sufficient to bound the sum of the relevant volumes. The reason for this, thinking of the proof of Theorem 2.10, is that the only proper product subvarieties $X$ of $C_1 \times C_2$ are points or fibres of the two projections. In both cases, the degrees $d_1, d_2$ determine, at least up to numerical equivalence, the line bundle $L|_X$. In higher dimension this is no longer the case and we take advantage of precisely this fact in producing a higher dimensional example where the direct generalization of Theorem 2.10 fails.

Let $C$ be a smooth projective curve of genus $g > 1$ and let $Y = C \times C$. Let $F_1 \subset Y$ be a fibre of the first projection, $F_2 \subset Y$ a fibre of the second projection, and $\Delta' = \Delta - F_1 - F_2$. Finally, let $n = (n_1, n_2, n_3)$ be a 3-tuple of integers. We consider divisors of the form

Since $F_1 + F_2$ is ample and $V_n \cdot (F_1 + F_2) = n_1 + n_2$, some multiple of $V_n$ is effective provided $n_1 + n_2 > 0$ and $V_n^2 > 0$. We have

Hence given any $\epsilon > 0$ there exist $n_1, n_2, n_3$ such that $n_1 \gg n_2 \gg 0$, $V_n$ is effective, and

Let $X = C \times C \times C$ and fix a point $P \in C$. Let $p : X \to Y$ denote the projection to the first two factors and let $\pi_3 : X \to C$ denote the projection to the third factor. Define

Choose $0 \ne s \in H^0(Y, \mathcal{O}_Y(V_n))$ and let $t \in H^0(C, \mathcal{O}_C(P))$

be the section whose divisor of zeroes is P. Then 1 for _any

_{point C}

E X

Thus if Theorem 2.10 held for G one would find (up to an $o(1)$ depending on the constants implied in $n_1 \gg n_2 \gg 0$) that

contradicting 2.17.

It should be noted that the particular divisor $V_n$ which we chose above will play a central role in Vojta’s proof of the Mordell conjecture to be given in the next section. As was noted above, Vojta’s Dyson lemma is extremely powerful when applied to a divisor like $V_n$ as it says that no section of $H^0(C \times C, V_n)$ can have large index at any point.
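The intersection numbers of $V_n$ can be verified directly. The computation below is a sketch under two assumptions that are not visible in the text as preserved: that $\Delta$ denotes the diagonal of $C \times C$, and that $V_n = n_1F_1 + n_2F_2 + n_3\Delta'$ with $\Delta' = \Delta - F_1 - F_2$. It uses the standard values $F_1^2 = F_2^2 = 0$, $F_1 \cdot F_2 = 1$, $\Delta \cdot F_i = 1$ and $\Delta^2 = 2 - 2g$.

```latex
% Assumed: V_n = n_1 F_1 + n_2 F_2 + n_3 \Delta',  \Delta' = \Delta - F_1 - F_2.
\[
  \Delta' \cdot F_1 = \Delta \cdot F_1 - F_1^2 - F_2 \cdot F_1 = 1 - 0 - 1 = 0,
  \qquad \Delta' \cdot F_2 = 0 ,
\]
\[
  \Delta'^2 = \Delta^2 - 2\,\Delta \cdot F_1 - 2\,\Delta \cdot F_2 + 2\,F_1 \cdot F_2
            = (2 - 2g) - 2 - 2 + 2 = -2g .
\]
% Since \Delta' is orthogonal to both fibre classes, the cross terms vanish:
\[
  V_n \cdot (F_1 + F_2) = n_1 + n_2, \qquad V_n^2 = 2 n_1 n_2 - 2 g\, n_3^2 .
\]
```

Under these assumptions the degrees are $d_1 = V_n \cdot F_2 = n_1$ and $d_2 = V_n \cdot F_1 = n_2$, so $V_n^2$ can be kept positive yet small compared with $d_1d_2$ by taking $g\,n_3^2$ close to $n_1n_2$.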

3. THE MORDELL CONJECTURE

In this section we will give a sketch of Vojta’s proof [V2, V3] of the famous Mordell conjecture:

_(Faltings).

_Suppose

C is a smooth

_projective

curve

of

_genus

> 2

_defined

over a number

field

k. Then C has at most

finitely

_many

k-rational

_points.

Of course, Mordell originally conjectured Theorem 3.1 only for the case where $k = \mathbb{Q}$. This is in fact the case which we will ultimately consider, not because it is intrinsically easier than the general case but because there is less notational complication as $\mathbb{Q}$ admits only one embedding in $\mathbb{C}$. How does one use Vojta’s Theorem 2.10 to prove the Mordell conjecture? Vojta’s beautiful idea is to take advantage of the freedom to choose the line bundle $L$ in Theorem 2.10. If $C$ is a curve of genus at least one then there are more line bundles on $C \times C$ than the pull-backs of bundles from the two projections because the diagonal is not linearly equivalent to a sum of fibres. This additional parameter of freedom allows for the choice of a line bundle $L$ (given in fact by the divisor $V_n$ at the end of the last section) such that

,

As was noted after the statement of Theorem 2.10, if we also arrange for $d_1 \gg d_2$ then 2.12 tells us that no section $s \in H^0(C \times C, L)$ can have large index at any point. Thus all of the hard work we spent in Roth’s theorem trying to show that the auxiliary polynomial $P(X_1, \ldots, X_m)$ has small index at $(p_1/q_1, \ldots, p_m/q_m)$ is accomplished automatically, regardless of our choice of section $s$.

This sounds too good to be true and indeed it is. It will turn out that what was easy in the case of Roth’s theorem (namely to produce an arithmetic reason why $P$ must vanish at the rational approximating point) will